Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
Garabujo7
Alteryx

Credit to giphy.com

AMP, Alteryx Multi-Threaded Processing Engine 

 

The Alteryx Engine is the component that processes the data in Alteryx. We don't normally ask ourselves how Alteryx processes data or what is under the hood (beyond the RAM and the hardware), but the software does all of its work through a processing engine.

 

This new version, the AMP Engine, includes parallel processing ...

 

 

 

Credit to giphy.com

 

 

 

Parallel processing? Here is a brief explanation:

 

 

 

 

AMP Animation 2.gif

 

 

 

To make it easier, think of a service office that has only one window, so only one person can be served at a time. This is a sequential processing model: the next person has to wait for the first one to finish. In this example, if each person takes 5 minutes to resolve their issue, serving four people takes 20 minutes.

 

 

 

Created with piktochart.com

 

 

 

In parallel processing, it is as if we had four service windows that could serve four people at the same time, so in the same twenty minutes they could serve 16 people, four times as many.
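The service-window arithmetic can be checked with a few lines of Python. This is only a sketch of the analogy, assuming every person takes exactly 5 minutes; the function names are made up for illustration and have nothing to do with Alteryx itself.

```python
import math

MINUTES_PER_PERSON = 5  # from the example: each person takes 5 minutes


def minutes_to_serve(people: int, windows: int) -> int:
    """Total minutes needed when up to `windows` people are served at once."""
    rounds = math.ceil(people / windows)  # each round serves up to `windows` people
    return rounds * MINUTES_PER_PERSON


def people_served(minutes: int, windows: int) -> int:
    """How many people can be fully served within `minutes`."""
    return (minutes // MINUTES_PER_PERSON) * windows


print(minutes_to_serve(4, windows=1))  # 20 minutes with one window
print(minutes_to_serve(4, windows=4))  # 5 minutes with four windows
print(people_served(20, windows=1))    # 4 people in twenty minutes
print(people_served(20, windows=4))    # 16 people in twenty minutes
```

Four windows serve the same four people in a quarter of the time, or four times as many people in the same twenty minutes.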

 

 

 

Created with piktochart.com

 

 

 

What the AMP Engine does is break the data into packets that are processed in parallel, for faster execution. In other words, the Engine will use all your processing cores and RAM when you run the workflow.
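To picture the "packets processed in parallel" idea, here is a minimal Python sketch. It is not how AMP is implemented; the packet size, the worker count, and the `process_packet` function are all assumptions chosen for illustration. It only shows the general pattern: split the records into packets, hand the packets to several workers, and stitch the results back together in order.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_packets(records, packet_size):
    """Yield consecutive slices of `records`, each at most `packet_size` long."""
    for start in range(0, len(records), packet_size):
        yield records[start:start + packet_size]


def process_packet(packet):
    """Hypothetical per-packet work: double every value."""
    return [value * 2 for value in packet]


def process_in_parallel(records, packet_size=3, workers=4):
    """Process packets on a pool of workers and return results in input order."""
    packets = list(split_into_packets(records, packet_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input packets
        results = list(pool.map(process_packet, packets))
    return [value for packet in results for value in packet]


print(process_in_parallel(list(range(10))))
# [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

The sketch demonstrates the structure only. In Python, threads do not speed up CPU-bound work like this, so treat it as a picture of the idea rather than a benchmark.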

 

How do I use AMP in my workflows? First, you need Alteryx Designer version 2020.2 or newer. To check which version of Alteryx you have installed, go to Help -> About.

 

 

 

Garabujo7_4-1594997609405.png

 

Garabujo7_5-1594997609409.png

 

 

 

If you do not have this version yet, you can update it at any time to start using AMP. Go to Help -> Alteryx Downloads.

 

 

 

Garabujo7_6-1594997609411.png

 

 

Choose version 2021.2 (or newer):

 

 

 

Garabujo7_0-1628265833727.png

 

If you have questions about which version to download or the installation process, you can consult this article with a quick guide to get you started with Alteryx Designer.

 

AMP is available for all workflows, but in this version you have to enable the AMP Engine for each workflow individually. You can also choose to use the AMP Engine for all new workflows in the User Settings:

 

 

 

CristonS_0-1629824435931.png

 

 

 

To enable AMP for an individual workflow, left-click on any blank part of the canvas.

 

 

 

Garabujo7_9-1594997609437.png

 

 

Then, in the Workflow - Configuration window on the left side of the screen, select Runtime. The last option at the bottom is Use AMP Engine.

 

 

 

 

Garabujo7_10-1594997609445.png

 

 

Apply AMP to all workflows

 

As of version 2021.1.4, it is possible to enable the application of AMP to all workflows.

 

 

Garabujo7_0-1630100452686.png

 

Garabujo7_1-1630100456483.png

 

 

With this global setting, you no longer need to enable AMP for each workflow individually.

 

Performance profile with AMP

 

Starting with version 2021.3, you can analyze the performance of specific tools (analytic blocks) when using AMP.

 

To enable it, open the workflow configuration and turn on the performance profile option.

 

 

Garabujo7_2-1630100945981.png

 

 

 

Here is an excerpt from the explanation of AMP performance profiling according to this Alteryx help article: “The original engine returns the time in milliseconds that each tool took to run, measured to 0.01 ms precision. AMP can also have multiple workers for each tool, but the total time will be combined by the tool.

 

Highlights on AMP Profiling

 

The performance profiling results between the original engine and AMP shouldn't be compared as they have different nature. AMP uses many threads to execute tasks, but the total time will be summed up by counting every thread used for the tool.

 

Overall time per tool may be more than the total workflow time due to the multithreaded nature of AMP.

 

When there is not enough memory to execute a workflow, AMP performs additional memory management that will be reported as a separate message "Nms have been spent on memory management. M% of the total workflow execution time."”
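The point about per-tool time exceeding total workflow time is easy to see in a small simulation. The Python sketch below is an illustration only, not Alteryx code: it assumes a hypothetical "tool" whose work is spread over four threads, each simulated with a short sleep, and compares the summed per-thread time with the wall-clock time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def worker(seconds):
    """Simulate one thread's share of a tool's work; return its own elapsed time."""
    start = time.perf_counter()
    time.sleep(seconds)
    return time.perf_counter() - start


def profile_tool(threads=4, seconds=0.05):
    """Return (time summed over all threads, wall-clock time) for one simulated tool."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        per_thread = list(pool.map(worker, [seconds] * threads))
    wall_time = time.perf_counter() - wall_start
    return sum(per_thread), wall_time


summed, wall = profile_tool()
print(f"summed per-thread time: {summed * 1000:.1f} ms")
print(f"wall-clock time:        {wall * 1000:.1f} ms")
```

Because the four threads run at the same time, the summed figure comes out at roughly four times the wall-clock figure. That is why a summed per-tool time can be larger than the time the whole workflow took to run.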

 

Now you can run your workflow and feel the power of Alteryx's parallel processing engine, AMP.

 

How to check if you are using AMP?

 

To verify, check the Results window for the following message: This is AMP Engine.

 

 

 

Garabujo7_11-1594997609452.png

 

 

 

I ran the following workflow on my local computer. It connects to a SQL Server database, reads 10.4 million records, and blends them with three files (Excel and CSV): one with 99K records, one with 21K, and a CSV with 2.4K.

 

 

Garabujo7_12-1594997609459.png

 

 

The blend is made with a Find Replace tool. The process takes one minute and ten seconds.

 

 

 

111.png

 

 

 

It is a big difference for a relatively large volume of data, although the best test is the one you carry out on your computer and with your data to validate it.

 

 

 

Credit to Giphy.com

 

 

Have you tried it yet? Share your experience in the comments!

 

 

 

Credit to Giphy.com

Considerations:

 

Like everything in life, results can vary and depend on many factors, such as the complexity of the workflow, which analytical blocks (tools) you use, the size of the data, and the hardware you have available.

 

Requirements for AMP:

 

The AMP engine must have at least 400 MB to process a thread from a workflow. For example, with 8 threads, there must be at least 3.2 GB of memory available to AMP at run time.
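The memory requirement above is a simple multiplication, sketched here in Python. The 400 MB per thread figure comes from the text; the function name is made up for illustration, and the conversion assumes 1 GB = 1000 MB, which matches the 3.2 GB in the example.

```python
MB_PER_THREAD = 400  # minimum memory per thread, as stated above


def min_memory_gb(threads: int) -> float:
    """Minimum memory AMP needs at run time, in GB (assuming 1 GB = 1000 MB)."""
    return threads * MB_PER_THREAD / 1000


for threads in (4, 8, 16):
    print(f"{threads} threads -> at least {min_memory_gb(threads)} GB")
```

With 8 threads this gives 3.2 GB, matching the example. Doubling the thread count doubles the minimum.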

Comments
Ken_Black
9 - Comet

Thanks for this article!

 

I love the new checkbox to use AMP for all new workflows!  Finding that made reading this article well worth it!  Why you ask? Because I crush data with Alteryx. 

 

AMP Benchmark Testing (be sure to read all three pages!):  Crushing data with the new AMP engine 

 

Alteryx Usage: 282 billion records crushed and counting.... Rocking and Rolling in Alteryx 

 

Thanks,

 

Ken

 

biggdawg320
6 - Meteoroid

Does AMP significantly increase performance on the spatial matching tool?

and

Is a multi-CPU system significantly better using AMP vs. Multi-Core?

 

I'm doing a lot of work with spatial matching and my process is taking a considerable amount of time and I'm thinking adding raw power may be the best solution, but I want to know before setting up a business case that this is likely to produce a significantly better process.

 

Thanks.

NeilR
Alteryx Alumni (Retired)

I created 1,000,000 random points and 1,000 random polygons and matched them. Ran it a few times. With AMP it took about 6 seconds. Without AMP it took about 1:10. I'm on a Dell laptop. Happy to share the workflow upon request.

NeilR_0-1659025635853.png

NeilR_1-1659025669675.png

 

biggdawg320
6 - Meteoroid

Thanks @NeilR

 

So I'm only joining ~216000 points to 1 giant polygon and it took 4 days for me.

 

Can you share the workflow and your full laptop specs so I can see where my bottleneck may be?

 

Thanks again,

NeilR
Alteryx Alumni (Retired)

NeilR_0-1659034282488.png

Will DM you the workflow since I can't attach it here.