community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx Designer Knowledge Base

Definitive answers from Designer experts.

Create a ‘Random’ sample using In-Database tools

Alteryx
Alteryx
Created on

To create a pseudo random sample of data using the In-Database (InDB) tools, this could be a potential solution.

 

(This example uses Microsoft SQL 2008)

 

A set of 10 records are used in the workflow and we want find a ‘random’ 50% of the records.

 

1_original data.png

 

If we were to use the InDB Sample tool and choose to sample 50% of the records we’ll just get the first five records, not very random.

 

Instead, using the InDB Formula tool we can create a new field, in this case it’s being called RandomID, and use the MS SQL expression NEWID() which will generate a GUID (this expression will probably vary depending on which database is being used).

 

2_formula tool config.png

 

Resulting in:

 

3_data w new ID.png

 

Now we can feed our data to the InDB Sample tool.  Configure the InDB Sample tool to sort by the RandomID field and select the number or percentage of rows to return.

 

4_sample tool config.png

 

Running the workflow our results look like this, notice the BankerIDs that are returned as compared to the original data

 

5_new data random.png

 

The workflow would look like this:

 

6_tools on canvas.png

 

(A workflow is not included with this article as it requires a specific database connection.)