Credit to giphy.com
AMP, Alteryx Multi-Threaded Processing Engine
The Alteryx Engine is the component that processes data in Alteryx. We rarely ask ourselves how Alteryx processes data or what is under the hood (beyond the RAM and the hardware), but the software does all of its work through a processing engine.
This new version, the AMP Engine, includes parallel processing ...
Parallel processing? Here is a brief explanation:

To make it easier, think of a service office that has only one window. Only one person can be served at a time. This is a sequential processing model: the next person has to wait until the previous one has finished. In this example, if each person takes 5 minutes to resolve their issue, serving four people takes 20 minutes.
Created with piktochart.com
Parallel processing, in contrast, is as if we had four service windows. They can serve four people at the same time, so the same four people are served in 5 minutes, and in the same twenty minutes the office can serve 16 people, four times as many.
Created with piktochart.com
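The arithmetic of the service-window analogy can be checked with a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and it assumes every person takes exactly 5 minutes and every window is always busy.

```python
import math


def minutes_to_serve(people: int, windows: int, minutes_per_person: int = 5) -> int:
    """Minutes needed to serve everyone when each window serves one person at a time."""
    # People are served in rounds; each round handles up to `windows` people at once.
    rounds = math.ceil(people / windows)
    return rounds * minutes_per_person


def people_served(total_minutes: int, windows: int, minutes_per_person: int = 5) -> int:
    """People served in a fixed amount of time, assuming all windows stay busy."""
    return windows * (total_minutes // minutes_per_person)


print(minutes_to_serve(4, windows=1))  # 20 (sequential: one window)
print(minutes_to_serve(4, windows=4))  # 5  (parallel: four windows)
print(people_served(20, windows=1))    # 4
print(people_served(20, windows=4))    # 16
```

With one window, four people need 20 minutes. With four windows, the same four people need 5 minutes, and 20 minutes is enough for 16 people.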
What the AMP Engine does is break the data into packets that are processed in parallel, for faster execution. In other words, the Engine will use all your processing cores and RAM when you run the workflow.
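To picture the idea of breaking data into packets that are processed in parallel, here is a minimal Python sketch. It is not how AMP is implemented internally: the packet size, the worker count, and the `process_packet` function are assumptions chosen only to show the general pattern.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_packets(records, packet_size):
    """Cut the list of records into consecutive packets of at most packet_size items."""
    return [records[i:i + packet_size] for i in range(0, len(records), packet_size)]


def process_packet(packet):
    """Hypothetical stand-in for the work a tool does on one packet of records."""
    return [value * 2 for value in packet]


def run_parallel(records, packet_size=4, workers=4):
    """Process every packet on a pool of workers and stitch the results back together."""
    packets = split_into_packets(records, packet_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input packets.
        results = pool.map(process_packet, packets)
        return [value for packet in results for value in packet]


records = list(range(10))
print(split_into_packets(records, 4))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(run_parallel(records))           # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

The result is the same as processing the records one by one; only the way the work is distributed changes. In standard Python, threads mainly illustrate the structure here, since CPU-heavy work would need processes to actually run on several cores.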
How do I use AMP in my workflows? First, you need Alteryx Designer version 2020.2 or newer. To check which version of Alteryx you have installed, go to Help -> About.


If you do not have this version yet, you can update it at any time to start using AMP. Go to Help -> Alteryx Downloads.

Choose version 2021.2 (or newer):

If you have questions about which version to download or the installation process, you can consult this article with a quick guide to get you started with Alteryx Designer.
AMP is available for all workflows, but in this version you have to enable the AMP Engine for each workflow individually. You can also choose to use the AMP Engine for all new workflows in the User Settings:

To enable it for a single workflow, left-click any empty area of the canvas.

Then, in the Workflow - Configuration window on the left side of the screen, select Runtime. The last option at the bottom is Use AMP Engine.

Apply AMP to all workflows
As of version 2021.1.4, you can enable AMP for all workflows at once.


With this global setting, you no longer need to enable AMP for each workflow individually.
Performance profile with AMP
Starting with version 2021.3, you can analyze the performance of specific analytical blocks (tools) when using AMP.
To enable it, open the workflow settings and turn on the performance profiling option.

Here is an excerpt from the explanation of AMP performance profiling in this Alteryx help article: “The original engine returns the time in milliseconds that each tool took to run, measured to 0.01 ms precision. AMP can also have multiple workers for each tool, but the total time will be combined by the tool.”
Highlights on AMP Profiling
The performance profiling results of the original engine and AMP shouldn't be compared, because they are different in nature. AMP uses many threads to execute tasks, and the total time for a tool is the sum of the time spent by every thread used for that tool.
Overall time per tool may be more than the total workflow time due to the multithreaded nature of AMP.
When there is not enough memory to execute a workflow, AMP performs additional memory management, which is reported in a separate message: "Nms have been spent on memory management. M% of the total workflow execution time."
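Why can the time reported for a tool be larger than the total run time? The following Python sketch reproduces the effect with plain threads. It is not Alteryx code: `timed_task` and `profile_tool` are hypothetical names, and `time.sleep` merely stands in for real work.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_task(seconds):
    """Simulate one thread's share of a tool's work and return how long it took."""
    start = time.perf_counter()
    time.sleep(seconds)  # stand-in for real processing
    return time.perf_counter() - start


def profile_tool(task_seconds, workers):
    """Return (time summed over all threads, wall-clock time) for one simulated tool."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_thread = list(pool.map(timed_task, task_seconds))
    wall = time.perf_counter() - wall_start
    return sum(per_thread), wall


summed, wall = profile_tool([0.05] * 4, workers=4)
print(f"time summed over threads: {summed * 1000:.0f} ms")
print(f"wall-clock time:          {wall * 1000:.0f} ms")
```

Four threads that each work for about 50 ms add up to roughly 200 ms of summed time, while the wall clock advances only a little more than 50 ms because the threads overlap. The same reasoning explains why a per-tool total under AMP can exceed the duration of the whole workflow.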
Now you can run your workflow and feel the power of Alteryx's parallel processing engine, AMP.
How to check if you are using AMP?
To verify, check whether the following message appears in the Results window: This is AMP Engine.

I ran the following workflow on my local computer. It connects to a SQL Server database, reads 10.4 million records, and blends them with three files: two Excel files with 99K and 21K records, and one CSV file with 2.4K records.

The blend is made with a Find Replace tool. The process takes one minute and ten seconds.

It is a big difference for a relatively large volume of data, although the best test is the one you run on your own computer, with your own data.
Have you tried it yet? Share your experience in the comments!
Considerations:
Like everything in life, results vary and depend on many factors, such as the complexity of the workflow, which analytical blocks (tools) you use, the size of the data, and the hardware you have available.
Requirements for AMP:
The AMP engine must have at least 400 MB to process a thread from a workflow. For example, with 8 threads, there must be at least 3.2 GB of memory available to AMP at run time.
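The memory requirement is a simple multiplication, sketched below in Python. The 400 MB per thread figure comes from the requirement above; the function names are hypothetical, and 1 GB is counted as 1000 MB to match the 3.2 GB example.

```python
MB_PER_THREAD = 400  # minimum memory AMP needs per thread, as stated above


def min_memory_mb(threads: int) -> int:
    """Minimum memory, in MB, that must be available to AMP for this many threads."""
    return threads * MB_PER_THREAD


def max_threads_for(memory_mb: int) -> int:
    """Largest thread count that a given amount of available memory can cover."""
    return memory_mb // MB_PER_THREAD


print(min_memory_mb(8))         # 3200 (MB)
print(min_memory_mb(8) / 1000)  # 3.2 (GB)
print(max_threads_for(2000))    # 5
```

Read in reverse, the same rule gives a rough idea of how many threads a machine can support: with 2000 MB available to AMP, the minimum is met for at most 5 threads.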