Alteryx Designer Desktop Knowledge Base

The Alteryx Engine 101 - The Basics

MichaelF
Alteryx Alumni (Retired)

To make good use of a machine’s resources when running Alteryx, it helps to understand how the Alteryx Engine works. To clear up any haziness surrounding the term “Alteryx Engine”, this article covers what happens when you click the Run button, in either Alteryx Designer or Alteryx Server.

Specifically, it covers how the Engine processes each record and how it uses the machine’s core(s) and RAM (memory) when a workflow is run.

Disk vs. RAM

The first thing to note is the difference between Disk and RAM. Processing is slower on Disk than in RAM. With that in mind, the Engine tries to do its processing in RAM after reading data from Disk one record at a time. See below for the general flow of records through a workflow (Blue is on Disk, Red is in RAM).

Example_1.gif

Record 1 is read from Disk into RAM, then moved through the Formula tool before being released back to Disk in the Browse tool. Once this is complete, the Engine moves on to Record 2, and so on.
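The record-at-a-time flow described above can be sketched with Python generators. This is only an illustration of the general streaming idea, not the Engine's actual implementation; the function names, the `qty`/`price` fields, and the in-memory `StringIO` objects standing in for Disk are all assumptions made for the example.

```python
import csv
import io

def read_records(source):
    """Yield one record at a time (stands in for the Input tool)."""
    for row in csv.DictReader(source):
        yield row

def formula(records):
    """Apply a per-record calculation (stands in for the Formula tool)."""
    for record in records:
        record["total"] = str(int(record["qty"]) * int(record["price"]))
        yield record

def write_records(records, sink):
    """Write each record out as soon as it arrives, then ask for the next one."""
    writer = None
    count = 0
    for record in records:
        if writer is None:
            writer = csv.DictWriter(sink, fieldnames=list(record.keys()))
            writer.writeheader()
        writer.writerow(record)
        count += 1
    return count

# Hypothetical data; StringIO stands in for files on Disk.
source = io.StringIO("qty,price\n2,5\n3,7\n")
sink = io.StringIO()
written = write_records(formula(read_records(source)), sink)
```

Because each stage is a generator, only one record is in flight at any moment: the second record is not read until the first has been written out.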

Workflows with Multiple Streams

The next scenario to consider is a workflow with multiple streams. If there are two outgoing connections from the Input tool, will each record have to be read twice? See below for the flow of data in this scenario.

Example_2.gif

Once Record 1 is read by the Input tool, the record is held in RAM while the Engine processes it through the top stream and releases the result to Disk. The Engine then resumes from the stored record to process the second stream. Holding the record in RAM means the Input tool does not have to read it twice.
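A minimal sketch of the same idea: each record is read once, kept in memory, and handed to both streams in turn. The function `run_two_streams` and the two lambda "streams" are hypothetical names for illustration, not Alteryx APIs.

```python
def run_two_streams(records, top_stream, bottom_stream):
    """Feed every record to two downstream streams while reading it only once."""
    reads = 0
    top_out, bottom_out = [], []
    for record in records:
        reads += 1                                # read from the source exactly once
        top_out.append(top_stream(record))        # finish the top stream first
        bottom_out.append(bottom_stream(record))  # then reuse the in-memory copy
    return reads, top_out, bottom_out

reads, top_out, bottom_out = run_two_streams(
    iter([1, 2, 3]),
    top_stream=lambda r: r * 10,
    bottom_stream=lambda r: r + 1,
)
```

Here `reads` ends up equal to the number of input records, even though two streams consumed them.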

Sort and Join Tools

The final scenario this article covers is what happens when the Engine hits tools like the Sort tool and the Join tool. To sort all the records, the Sort tool needs ALL of them at once, not one at a time as in the previous scenarios. Take the use case of the Join tool:

Example_3a.gif

Each record is read from Disk and then stored in RAM before the Join tool.

Example_3b.gif

Next, the records are split up so they can be sorted efficiently. Sorting them first is what allows the Join itself to be processed efficiently.

Example_3c.gif

After the Left Input is stored and sorted, the same process will be done for the Right Input.

Example_3d.gif

After all the records are sorted in RAM, the Engine can go back to processing one record at a time, as the Join tool's function requires. Each record is processed and released to Disk before the Engine proceeds to the next.
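The store, sort, then step-through pattern described here resembles a classic sort-merge join. The sketch below shows that general technique for matching records only; it is not the Engine's code, it ignores the Join tool's unmatched Left and Right outputs, and the function name and sample data are assumptions.

```python
from itertools import groupby
from operator import itemgetter

def sort_merge_join(left, right, key):
    """Inner join of two lists of dicts on `key` using sort-then-merge."""
    get_key = itemgetter(key)
    # Blocking step: both inputs must be fully stored and sorted first.
    left_groups = groupby(sorted(left, key=get_key), key=get_key)
    right_groups = groupby(sorted(right, key=get_key), key=get_key)

    joined = []
    l = next(left_groups, None)
    r = next(right_groups, None)
    # Merge step: walk both sorted inputs in step, one key at a time.
    while l is not None and r is not None:
        if l[0] < r[0]:
            l = next(left_groups, None)
        elif l[0] > r[0]:
            r = next(right_groups, None)
        else:
            right_rows = list(r[1])
            for left_row in l[1]:
                for right_row in right_rows:
                    joined.append({**left_row, **right_row})
            l = next(left_groups, None)
            r = next(right_groups, None)
    return joined

# Hypothetical inputs, deliberately unsorted.
left = [{"id": 2, "name": "B"}, {"id": 1, "name": "A"}, {"id": 3, "name": "C"}]
right = [{"id": 3, "city": "X"}, {"id": 1, "city": "Y"}]
result = sort_merge_join(left, right, "id")
```

Once both sides are sorted, the merge loop never has to look backward, which is why processing can return to one record at a time.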

In this last example, the Engine is forced to use more RAM than normal because it needs to store records before processing them. Settings in Designer and Server reference what is known as the Sort/Join Memory, which is used for these processes.

Sort_Join_Designer.png

As seen on Designer under Options > User Settings > Edit User Settings

Sort_Join_Server.png

As seen on the System Settings on the Server under Engine > General

The Sort/Join processes are used by the Sort and Join tools, as well as by other tools known as Blocking tools. Blocking tools require that all records be read into the tool before any further processing can be done. You can see which tools are Blocking tools by referring to Alteryx’s Periodic Table of Tools.
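The difference between a streaming step and a blocking step can be seen in a small sketch. The names below are hypothetical and the code only illustrates the general concept: the streaming step can emit its first result after reading one record, while the blocking sort must read everything first.

```python
def tracked_source(values, log):
    """Yield values one at a time, logging each read."""
    for value in values:
        log.append(f"read {value}")
        yield value

def streaming_double(records):
    """Non-blocking: each record is emitted as soon as it is processed."""
    for record in records:
        yield record * 2

def blocking_sort(records):
    """Blocking: every record must be buffered before any can be emitted."""
    buffered = list(records)
    buffered.sort()
    yield from buffered

stream_log = []
first_streamed = next(streaming_double(tracked_source([3, 1, 2], stream_log)))

block_log = []
first_sorted = next(blocking_sort(tracked_source([3, 1, 2], block_log)))
```

After asking for just the first output, `stream_log` holds a single read, while `block_log` holds a read for every input record.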

With this base knowledge of how the Engine works, you can better leverage your machine’s resources and understand why certain processes might be taking longer than expected.

Better specs = better Alteryx-ing = better data solving!

Comments
papalow
8 - Asteroid

@MichaelF Great post.  The animations were very helpful.  Sometimes it is helpful to take a step back and think about how things are done in the background.  

ephij
9 - Comet

Animations were extremely helpful, any thought about doing a similar post for the AMP engine?

adansantos
6 - Meteoroid

Hi @MichaelF 

 

Good explanation, congratulation!!!