
Alteryx Designer Desktop Discussions


Random % Sample Tool with AMP Engine

BaileyCallander
6 - Meteoroid

I currently have a workflow that generates a random sample using the deterministic output of a random seed. Since the workflow processes a large amount of data, it is more efficient to run it using the AMP engine.

According to the Alteryx website, the Random % Sample Tool in deterministic output mode selects different records with AMP (see below).

[Image: Random Sample Tool AMP.png, "Tool Use with AMP"]

My question is: will this always change the output of the sample, even if the order of the records input into the tool remains the same? In summary, I need the generated sample to remain consistent each time it is run, unless there are changes to the workflow or the input file.

I was considering sorting the data before the Random Sample Tool to ensure that, even if the AMP engine alters the order of the output from some tools, the sort would maintain a consistent order for the Random Tool.
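To illustrate why sorting first should help, here is a minimal Python sketch. It does not reproduce Alteryx's actual sampling algorithm; `seeded_sample` and the record IDs are hypothetical stand-ins. It assumes only that a seeded sampler draws its random numbers in the order the records arrive, so the same seed picks the same positions but not necessarily the same records.

```python
import random

def seeded_sample(records, percent, seed):
    """Keep roughly `percent`% of records, drawing one seeded random number per record in input order."""
    rng = random.Random(seed)
    return [r for r in records if rng.random() * 100 < percent]

records = list(range(1, 1001))  # hypothetical unique record IDs

# Same records in two different arrival orders, as an upstream multi-threaded step might produce.
order_a = records[:]
order_b = records[::-1]

# Same seed, different input order: the picked positions match, the picked records usually do not.
sample_a = seeded_sample(order_a, 10, seed=42)
sample_b = seeded_sample(order_b, 10, seed=42)

# Sorting on a unique key first gives the sampler an identical input order every time.
sorted_a = seeded_sample(sorted(order_a), 10, seed=42)
sorted_b = seeded_sample(sorted(order_b), 10, seed=42)

print(sorted_a == sorted_b)
```

The sort key needs to be unique per record. If ties are possible, the tied records could still arrive in a different order and shift the sample.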

Please let me know your thoughts on whether this approach would work or if there is no way to rely on the AMP engine for generating a consistent sample.

3 REPLIES
alexnajm
18 - Pollux

@BaileyCallander I think it's saying if you select AMP vs non-AMP, the results between the two are different. I just ran the Random % Sample one tool example in Alteryx with deterministic output on and it seemed to be the same. I confirmed it with an Expect Equal tool.

You can check on your end too to see if it matches your expectation!
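If the two runs happen on different machines, one way to compare them outside a single workflow is to export each output and compare an order-sensitive fingerprint. This is only a sketch in Python, not an Alteryx feature; `fingerprint` and the sample rows are made up for illustration.

```python
import hashlib

def fingerprint(rows):
    """Order-sensitive SHA-256 digest of an iterable of rows (each row a tuple of values)."""
    h = hashlib.sha256()
    for row in rows:
        h.update(repr(row).encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()

# Hypothetical outputs from three runs of the same workflow.
run_1 = [(1, "a"), (2, "b"), (3, "c")]
run_2 = [(1, "a"), (2, "b"), (3, "c")]
run_3 = [(2, "b"), (1, "a"), (3, "c")]  # same rows, different order

same = fingerprint(run_1) == fingerprint(run_2)
reordered = fingerprint(run_1) == fingerprint(run_3)

print(same, reordered)
```

Matching digests mean the rows and their order are identical. A mismatch only tells you something differs, so you would still need a row-level comparison to find where.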

BaileyCallander
6 - Meteoroid

Thank you for your response! I appreciate the clarification regarding the distinction between AMP and Non-AMP processing. It’s reassuring to know that it doesn’t imply that the output could change each time we run it on AMP.

Our main challenge has been that, because of the large dataset and multi-threaded processing, the workflow can yield different results on different computers when run with the AMP engine. Previously, we did not include a Sort tool before the Random % Sample tool. I suspect the output order from earlier tools in the workflow varied because each computer processed the data with a different number of threads, and that this changed which records were sampled.

alexnajm
18 - Pollux

No worries. Just to confirm, you compared the same exact workflow being run on different machines (with the same sorting) and the output came out differently?

If so, you might try turning on Engine Compatibility Mode.
