There are multiple strategies for improving the speed of a workflow, from using a Select tool to reduce field sizes to adjusting the default Sort/Join memory. However, the first fundamental step is benchmarking.
The recommended process is to run the workflow three times so that the data is cached. Uncached data can cause slower run times, so repeating the run ensures the comparison is a fair test.
If you want to run without cached data you will have to reboot your machine between runs.
Once you know the total time the workflow takes to run, you can start looking at optimizing.
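The benchmarking routine described above can be sketched in Python. This is only an illustration under stated assumptions: `run_workflow` is a hypothetical callable standing in for however you launch your workflow, and `benchmark` and `summarize` are names invented for this sketch, not part of Alteryx.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(run_workflow: Callable[[], None], runs: int = 3) -> List[float]:
    """Time `runs` consecutive executions and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        run_workflow()  # hypothetical: launches the workflow and waits for it
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Separate the first (possibly uncached) run from the later, warm runs."""
    warm = durations[1:] if len(durations) > 1 else durations
    return {
        "first_run": durations[0],
        "warm_mean": statistics.mean(warm),
        "fastest": min(durations),
    }


if __name__ == "__main__":
    # Stand-in workload so the sketch runs on its own (assumption, not a real workflow).
    def fake_workflow() -> None:
        sum(range(100_000))

    print(summarize(benchmark(fake_workflow, runs=3)))
```

Comparing `first_run` against `warm_mean` gives a rough idea of how much of the run time is due to uncached data, and `warm_mean` is the figure to compare before and after each optimization.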
Optimizing your workflow!
Alteryx is designed to use all of the resources it possibly can. In order to make Alteryx run as fast as possible, it tries to balance the use of as much CPU, memory, and disk I/O as possible.
Set your Dedicated Sort/Join Memory Usage lower or higher on a per-Workflow basis depending on the use of your computer.
Sort work refers to the Sort tool and similar tools that re-order your data. Join work refers to any of the join processes.
If you are doing memory-intensive non-sort work (e.g. large drive-times), then lower it!
If you are doing memory-intensive sort work, then raise it.
Go to the Workflow Configuration > Runtime tab > Dedicated Sort/Join Memory Usage > Use Specific Amount
The Sort/Join memory setting is not a maximum memory usage setting; it's more like a minimum. The allocated memory is split between all the tools that sort in your workflow, but other tools still use memory outside that Sort/Join block. Some of them (e.g. drive times with a long maximum time) can use a lot.
Where do I find the Sort/Join memory options?
To set a user level default dedicated Sort/Join Memory Usage, go to Options > User Settings > Edit User Settings > Defaults tab.
The global Default Dedicated Sort/Join Memory Usage at System level can be found at Alteryx > Options > Advanced Options > System Settings > Engine > Default sort/join memory usage (MB).
For machine bit version memory considerations, please see here.
Lean for more speed!
A best practice to optimize the performance of your workflows is to remove data that won’t be needed for downstream processing as quickly as possible. You can always bring that data back into the workflow later if necessary.
Another good way to optimize workflow performance is using the filter tool to remove unnecessary data.
The Filter tool queries your file for records that meet specified criteria, such as ZIP = 01001. You may choose to handle records that come from the True output differently from those that come from the False output by connecting additional tools to either side. This allows smaller amounts of data to be passed downstream.
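As a rough illustration of the True/False split described above, here is a minimal Python sketch. It is not how the Filter tool is implemented; `filter_records` and the sample rows are hypothetical. Note that ZIP is kept as a string so the leading zero in 01001 survives.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, str]


def filter_records(
    records: Iterable[Record], predicate: Callable[[Record], bool]
) -> Tuple[List[Record], List[Record]]:
    """Split records into (true_output, false_output) based on the predicate."""
    true_out: List[Record] = []
    false_out: List[Record] = []
    for record in records:
        (true_out if predicate(record) else false_out).append(record)
    return true_out, false_out


# Made-up sample rows, for illustration only.
sample = [
    {"ZIP": "01001", "Customer": "A"},
    {"ZIP": "02134", "Customer": "B"},
    {"ZIP": "01001", "Customer": "C"},
]

matches, others = filter_records(sample, lambda r: r["ZIP"] == "01001")
print(len(matches), len(others))
```

Only `matches` needs to flow into the expensive downstream steps; `others` can be dropped or handled separately.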
Auto Field Tool
Optimize your workflow for speed by setting the field type to the most efficient type and smallest possible size.
Use the Auto Field tool right after your Input Data tool to assign the most efficient type and size to your fields.
Another benefit of using the Auto Field tool is that it will reduce the size of your output file.
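To make the idea of "smallest possible type and size" concrete, the sketch below picks the narrowest integer type that can hold a column's values and the shortest string size that fits its longest value. This is an assumption-laden illustration, not the Auto Field tool's actual algorithm; the function names are hypothetical, and the type names simply mirror the Byte, Int16, Int32 and Int64 field types.

```python
from typing import Iterable, List


def smallest_int_type(values: Iterable[int]) -> str:
    """Return the narrowest integer type name able to hold every value."""
    vals: List[int] = list(values)
    lo, hi = min(vals), max(vals)
    if lo >= 0 and hi <= 255:
        return "Byte"  # unsigned 8-bit
    if lo >= -2**15 and hi <= 2**15 - 1:
        return "Int16"
    if lo >= -2**31 and hi <= 2**31 - 1:
        return "Int32"
    if lo >= -2**63 and hi <= 2**63 - 1:
        return "Int64"
    raise ValueError("values do not fit in a 64-bit integer")


def smallest_string_size(values: Iterable[str]) -> int:
    """Return the length of the longest value, i.e. the minimum field size."""
    return max((len(v) for v in values), default=0)


print(smallest_int_type([0, 17, 200]))
print(smallest_string_size(["MA", "Colorado"]))
```

A column of small counts stored as Int64 takes eight bytes per value, while a Byte takes one, which is where the speed and file-size savings come from.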
Enable Performance Profiling
This option lets you see a per-tool breakdown of your workflow's run time, in milliseconds and as a percentage.
Having this breakdown will allow you to pinpoint the slower tools/processes in your workflow and use the methods suggested in this article to improve that tool/process.
Performance profiling can be found at Workflow > Runtime > Enable Performance Profiling.
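If you copy the per-tool timings out of the profiling results, a few lines of Python can rank the tools by cost. This is a sketch only: `profile_breakdown` is a hypothetical helper, and the tool names and millisecond figures are invented for illustration.

```python
from typing import Dict, List, Tuple


def profile_breakdown(tool_ms: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Return (tool, milliseconds, percent of total) rows, slowest tool first."""
    total = sum(tool_ms.values())
    rows = [
        (name, ms, 100.0 * ms / total if total else 0.0)
        for name, ms in tool_ms.items()
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Invented timings, for illustration only.
timings = {"Input Data": 500.0, "Join": 1500.0, "Browse": 0.0}

for name, ms, pct in profile_breakdown(timings):
    print(f"{name:<12} {ms:>8.0f} ms {pct:>6.1f}%")
```

The tools at the top of the list are the ones worth optimizing first.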
Disable All Browse tools
The Browse tool quickly becomes a data artisan's best friend. It lets you review the entire data set at any given step in the workflow building process. However, each Browse tool creates a temporary yxdb file, and writing these files takes time and slows down processing.
There is an option to disable them all at once, so they can easily be re-enabled if needed. This setting can be found at Workflow > Runtime > Disable All Browse Tools.
Set your limits: Record Limit for the Inputs
When developing your Workflow, there is no need to bring in all your data during testing.
Use the Record Limit option in the Properties for the Input to bring in just enough records for testing.
If you want to set limits for all input tools in your workflow, you can also do this on the Runtime tab of the Workflow Configuration.
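The effect of a record limit can be sketched in Python with `itertools.islice`, which stops reading once the limit is reached instead of loading everything and discarding the rest. `limit_records` is a hypothetical helper written for this illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def limit_records(records: Iterable[T], record_limit: Optional[int] = None) -> Iterator[T]:
    """Yield at most `record_limit` records; None means no limit."""
    if record_limit is None:
        return iter(records)
    return islice(records, record_limit)


# A large lazy source: only the first five items are ever produced.
print(list(limit_records(range(1_000_000), 5)))
```

Because the source is consumed lazily, the cost of a test run depends on the limit rather than on the full size of the input.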
The tool container allows the user to organize a workflow better by combining tools in logical groups.
Tool Containers can be disabled to run only certain portions of the workflow, effectively bypassing tools for a quicker run.
Designer now has the ability to cache data from relational databases through the input tool.
When the caching option is checked, data is stored in a yxdb file on disk so that data sources are not hit repeatedly during workflow development.
Data can only be cached when running a workflow in an Alteryx Designer session. The setting is ignored when the workflow is run in the scheduler, in the gallery, or from the command line.
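The caching idea above can be sketched as follows: the first call hits the slow source and writes the result to disk, and later calls read the local file instead. This is an illustration only. It uses a JSON file rather than a yxdb file, and `cached_fetch` and `slow_query` are hypothetical names.

```python
import json
import os
import tempfile
from typing import Callable, List


def cached_fetch(cache_path: str, fetch: Callable[[], List[dict]]) -> List[dict]:
    """Return rows from the cache file if it exists; otherwise fetch and cache them."""
    if os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as handle:
            return json.load(handle)
    rows = fetch()  # the slow call, e.g. a database query
    with open(cache_path, "w", encoding="utf-8") as handle:
        json.dump(rows, handle)
    return rows


if __name__ == "__main__":
    def slow_query() -> List[dict]:
        # Stand-in for a real database query (assumption).
        print("hitting the data source")
        return [{"id": 1}, {"id": 2}]

    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "cache.json")
        cached_fetch(path, slow_query)  # hits the source and writes the cache
        cached_fetch(path, slow_query)  # served from the cache file
```

As with the Designer setting, a cache like this suits development only: the cached copy goes stale as soon as the underlying data changes.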
Connection Progress is a great way to keep track of the number of records and the size of the data going from one tool to another. In addition, the thickness of the connection itself varies depending on the size of the data passing through (great for troubleshooting).
The default setting for Connection Progress is “Show Only When Running”; however, setting it to “Show” lets you inspect the size of the data at certain points permanently (Properties for the Canvas > Connection Progress).
If you want more detail on any of the points mentioned above make sure to check out the great Tips and Tricks articles from Margarita Wilshire et al!