Alteryx Designer Desktop Discussions

AMP Engine Changes Order of Rows Input

BaileyCallander
6 - Meteoroid

Hey everyone,

I have a large dataset that I use to generate samples using a random seed. For efficiency, I prefer to run the workflow using the AMP engine. However, I've encountered an issue: with multi-threaded processing, the sample often changes each time the workflow is executed because the order of the records is altered.

I considered adding a Record ID tool at the start of the workflow, but I believe this would be ineffective if the input order changes when the files are brought into the workflow. Another idea I've had is to create two workflows: the first would input the data and add a Record ID without using the AMP engine, then output the result for use in the second workflow, which would use the AMP engine.

I wanted to get your thoughts on whether there might be a more efficient solution that would still ensure the sample remains consistent across runs.

KGT
13 - Pulsar

The order changes because AMP (Alteryx Multi-threaded Processing) is multi-threaded. In certain tools, the data is split into chunks, sent to different cores for processing, and then put back together in groups. For example, the Multi-Row Formula splits the data into chunks based on the grouping fields selected. (Disclaimer: this is my own view, not something official. I find I don't need to sort unless I use a grouping field.) A Sort before the Multi-Row Formula solves the issue.

AFAIK, the Input Tool should always read the data in as is, so a Record ID straight after it would give you something to sort on. I'd consider it an issue if the order were changed on input, unless another selection, an SQL query, etc. was involved.
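
To illustrate the Record ID plus Sort idea outside of Alteryx, here is a minimal Python sketch. It is not how the Alteryx sampling tools work internally; it only assumes a seeded sampler whose result depends on row order, and the names `seeded_sample`, `stable_sample`, and `record_id` are made up for the example.

```python
import random

def seeded_sample(rows, k, seed):
    """Pick k rows with a fixed seed. The picks depend on the order of rows."""
    rng = random.Random(seed)
    return rng.sample(rows, k)

def stable_sample(rows, k, seed, key="record_id"):
    """Sort on a stable key first so the seeded sample ignores arrival order."""
    ordered = sorted(rows, key=lambda r: r[key])
    return seeded_sample(ordered, k, seed)

# Hypothetical data: the record_id was assigned right after input.
rows = [{"record_id": i, "value": i * 10} for i in range(1, 101)]

# Simulate a multi-threaded engine handing the rows back in a different order.
reordered = rows[:]
random.Random(99).shuffle(reordered)

a = stable_sample(rows, 5, seed=42)
b = stable_sample(reordered, 5, seed=42)
print(a == b)
```

Because both calls sort on `record_id` before sampling, the seeded sampler sees the same sequence each time, which is the same effect as placing a Sort on the Record ID ahead of the sampling step.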

nbondarchuk
Alteryx
BaileyCallander
6 - Meteoroid

Hi @nbondarchuk,

I appreciate your reply. While the Engine Compatibility Mode works well for smaller datasets, I noticed that it may not always produce the exact same output with larger datasets.
