on 08-23-2016 07:01 PM - edited on 06-06-2019 06:09 AM by CristianoJ
The below is taken from the Tips & Tricks series presented at Inspire 2016. Special thanks to Margarita Wilshire and the Customer Support team for compiling these useful tips!
A best practice for optimizing workflow performance is to remove data that won't be needed for downstream processing as early as possible; you can always bring the additional data back in later if needed. The Select tool removes fields (columns) from your data. Other tools such as Join, Join Multiple, Spatial Match, Find Nearest, and to a certain degree the Transform and Reporting tools also have some Select functionality.
Useful tips when using the Select Tool:
Another good way to optimize workflow performance is to use the Filter tool to remove unnecessary data. The Filter tool tests each record against criteria you specify, such as ZIP = 01001, and splits the data into records that meet the criteria (the True output) and records that don't (the False output). You can handle the two outputs differently by connecting additional tools to either side. This way, less data is passed downstream.
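To make the True/False split concrete, here is a minimal sketch in plain Python. It is not how Alteryx implements the Filter tool; the `split_records` helper and the sample records are hypothetical and only illustrate the idea of routing each record to one of two outputs.

```python
from typing import Callable

Record = dict[str, str]


def split_records(
    records: list[Record], predicate: Callable[[Record], bool]
) -> tuple[list[Record], list[Record]]:
    """Route each record to a True list or a False list, like a two-output filter."""
    true_out: list[Record] = []
    false_out: list[Record] = []
    for record in records:
        if predicate(record):
            true_out.append(record)
        else:
            false_out.append(record)
    return true_out, false_out


# Hypothetical sample data; ZIP is kept as text so the leading zero survives.
records = [
    {"ZIP": "01001", "Customer": "A"},
    {"ZIP": "80301", "Customer": "B"},
    {"ZIP": "01001", "Customer": "C"},
]

matched, unmatched = split_records(records, lambda r: r["ZIP"] == "01001")
print(len(matched), "records on the True side,", len(unmatched), "on the False side")
```

If only the matching records matter downstream, only the True side needs to be carried forward, which is where the performance gain comes from.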
Optimize your workflow for speed by setting each field to the most efficient type and the smallest possible size. Large string fields can be costly, and carrying them through your workflow will slow it down. Use the AutoField tool right after your Input Data tool to assign the most efficient type and size to your fields.
Below are the data types before and after the AutoField tool.
Another benefit of using the AutoField tool is that it will reduce the size of your output file.
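The idea of picking the smallest type that fits the data can be sketched in plain Python. This is only a rough approximation under stated assumptions: the actual rules the AutoField tool applies are not documented here, the `suggest_type` function is hypothetical, and it considers only whole numbers and variable-length strings.

```python
# Inclusive value ranges for the integer field types, smallest first.
INTEGER_TYPES = [
    ("Byte", 0, 255),
    ("Int16", -2**15, 2**15 - 1),
    ("Int32", -2**31, 2**31 - 1),
    ("Int64", -2**63, 2**63 - 1),
]


def suggest_type(values: list[str]) -> str:
    """Suggest the narrowest type that can hold every value in a text column."""
    longest = max((len(v) for v in values), default=0)
    string_type = f"V_String({longest})"
    if not values:
        return string_type
    try:
        numbers = [int(v) for v in values]
    except ValueError:
        # At least one value is not a whole number, so keep the column as text.
        return string_type
    low, high = min(numbers), max(numbers)
    for name, type_min, type_max in INTEGER_TYPES:
        if type_min <= low and high <= type_max:
            return name
    # Too large for any integer type listed above; fall back to text.
    return string_type


print(suggest_type(["1", "200"]))
print(suggest_type(["-5", "300"]))
print(suggest_type(["abc", "de"]))
```

Note that a sketch like this would happily turn a column of numeric-looking codes into integers, so review the suggested types before accepting them.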
The Browse tool quickly becomes a data artisan's best friend: it lets you review the entire data set at any step of the workflow-building process. However, each Browse tool creates a temporary yxdb file, and writing these files takes time and slows down processing. When the workflow is ready for production, it is better to remove them. There is also an option to disable them so they can easily be re-enabled if needed. This setting can be found at Workflow > Runtime > Disable All Browse Tools.
Several settings in the User Settings > Advanced tab can also improve performance.
1- Undo Levels. By default, you can undo (CTRL+Z) 25 times. To undo that many times, the data needs to be stored in memory.
You can decrease the Undo Levels if you need to save memory and improve performance.
2- Disable Auto Configure. This option stops the metadata from being loaded every time you add a new tool while developing a workflow; press F5 to load the metadata only when needed.
3- Autosave interval in Minutes. By default, Designer saves a version of the workflow every 10 minutes. If you think you have lost your work, this very handy option can save your skin. However, it can also use processing power when you do not expect it. You may want to increase the autosave interval to improve performance.
4- Tool Results Settings. This concerns the little anchor next to most tools that shows results just like a Browse tool, but with limited results.
In this setting you can limit the memory reserved for displaying the results, saving memory and improving performance. Add a Browse tool when you really need to see all results.
Have you ever wondered why exactly your workflow is taking so long? Is it the input or a join that seems to take forever? Performance profiling can answer those questions for you. It will tell you how long each tool took to process and how much of the overall processing time was allocated to that specific tool. Simply check the box in the Runtime tab under Workflow – Configuration and then analyze the Results - Workflow - Messages.
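Reading a profile comes down to simple arithmetic: each tool's time divided by the total. The sketch below shows that calculation in plain Python. The tool names and timings are made up for illustration, and `profile_shares` is a hypothetical helper, not part of Alteryx.

```python
def profile_shares(tool_seconds: dict[str, float]) -> list[tuple[str, float, float]]:
    """Return (tool, seconds, percent of total) rows, slowest tool first."""
    total = sum(tool_seconds.values())
    rows = [
        (tool, seconds, 100.0 * seconds / total if total else 0.0)
        for tool, seconds in tool_seconds.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical timings, in seconds, copied by hand from a profiling run.
timings = {"Join": 3.0, "Input Data": 6.0, "Browse": 1.0}

rows = profile_shares(timings)
for tool, seconds, percent in rows:
    print(f"{tool:<12} {seconds:6.2f}s {percent:5.1f}%")
```

Sorting by time puts the most expensive tool at the top, which is usually the first place to look when optimizing.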
Very useful. Thank you
Thanks for sharing this.
One point about using the most efficient data types -- the comment "String fields with a big size can be costly", that's only if you are talking about fixed length strings, types String and WString. For the types V_String and V_WString your values take up no more space than is necessary (they have a couple of bytes extra to store the length, that's all). The specified size is just an upper limit, to make sure something crazy hasn't happened. For a fixed size string you use exactly the number of characters you specify, so that's where you would want to be careful. In the example shown, however, all the strings were variable sized. Where the example really saved space was changing things to integers.
Be careful not to let AutoField change a zip code to an integer. "00234" is a perfectly fine zip code; you don't want it to turn into 234.
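The leading-zero problem is easy to reproduce in plain Python. The check below is a hypothetical guard, not an Alteryx feature: it flags columns where a conversion to integer would change how the values look when written back out.

```python
def keeps_text_when_converted(values: list[str]) -> bool:
    """True only if every value survives a round trip through int() unchanged."""
    for value in values:
        if not value.isdecimal():
            return False
        if str(int(value)) != value:
            # A leading zero (or similar padding) would be lost.
            return False
    return True


zip_codes = ["00234", "01001", "80301"]
counts = ["12", "7", "305"]

print(int("00234"))                          # the integer has no leading zeros
print(keeps_text_when_converted(zip_codes))  # zip codes should stay as text
print(keeps_text_when_converted(counts))     # plain counts are safe to convert
```

Fields that act as identifiers (zip codes, account numbers, product codes) are generally better left as strings even when every character is a digit.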
Is there a way to enter a code in the Events section of the workflow so that, when I configure an email to be sent after a successful run, it adds a date or time stamp to the email?
There's a timestamp in the first line of Output Log, which is the last code there ( %OutputLog% ).
Otherwise, you could place an Email tool at the end of your workflow (it would only activate if there are no errors upstream). You can use the Date Time Now tool to get a timestamp for that option.
Auto Field tool tip is great. Thank you!!
I would like to share a few more things in the User Settings > Advanced tab on how to improve performance: Undo Levels, Disable Auto Configure, Autosave interval in Minutes, and Tool Results Settings, all described in the article above.
Have a great day!
Hi all, new user here. Came across this thread and tried following along, but the current release is much different now. For example, "Alteryx > Options > Advanced Options > System Settings > Engine > Default sort/join memory usage (MB)"
If any of the OP are still around, would be cool to get an article edit!
I am new here. The Advanced options have changed. I also find that workflows run slowly in the latest version.