Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

Create a ‘Random’ sample using In-Database tools

PeterS
Alteryx
Alteryx
Created

To create a pseudo random sample of data using the In-Database (InDB) tools, this could be a potential solution.

(This example uses Microsoft SQL 2008)

A set of 10 records are used in the workflow and we want find a ‘random’ 50% of the records.

1_original data.png

If we were to use the InDB Sample tool and choose to sample 50% of the records we’ll just get the first five records, not very random.

Instead, using the InDB Formula tool we can create a new field, in this case it’s being called RandomID, and use the MS SQL expression NEWID() which will generate a GUID (this expression will probably vary depending on which database is being used).

2_formula tool config.png

Resulting in:

3_data w new ID.png

Now we can feed our data to the InDB Sample tool. Configure the InDB Sample tool to sort by the RandomID field and select the number or percentage of rows to return.

4_sample tool config.png

Running the workflow our results look like this, notice the BankerIDs that are returned as compared to the original data

5_new data random.png

The workflow would look like this:

6_tools on canvas.png

(A workflow is not included with this article as it requires a specific database connection.)