Workflow optimization - general question

Katie_K
8 - Asteroid

Hi folks,

 

I have a workflow that needs optimizing, and I'm trying to think of smarter ways to use the tools, starting with the simplest things that could be changed.

 

I have a large data set (370M records and growing every day) where there are a number of 'Group By's being used in the Summarize tool.  It just occurred to me that if 1-2 of these levels are potentially removed (because I could get those levels of information another way outside of this flow), it might speed things up a bit.
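To make the idea concrete, here is a minimal Python sketch of a grouped sum. It is only an illustration of the general reasoning, not how the Summarize tool is implemented; the `summarize` helper and the `region`/`store`/`day`/`sales` fields are hypothetical. It shows that dropping Group By levels shrinks the number of groups the step has to track and emit, even though every input record is still read once.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def summarize(records: List[dict], group_by: Sequence[str], value_field: str) -> Dict[Tuple, int]:
    """Sum value_field for every distinct combination of the group_by fields."""
    totals: Dict[Tuple, int] = defaultdict(int)
    for rec in records:  # every record is read, whatever the grouping
        key = tuple(rec[field] for field in group_by)
        totals[key] += rec[value_field]
    return dict(totals)


# Hypothetical toy data: 2 regions x 3 stores x 4 days, one sale each.
records = [
    {"region": region, "store": store, "day": day, "sales": 1}
    for region in ("N", "S")
    for store in range(3)
    for day in range(4)
]

fine = summarize(records, ["region", "store", "day"], "sales")
coarse = summarize(records, ["region"], "sales")

print(len(fine), "groups with three Group By levels")
print(len(coarse), "groups with one Group By level")
```

Fewer groups means less state to hold during the aggregation and fewer rows for everything downstream, which is where most of the saving would be expected. How much that translates to on 370M records is something only a timed run can show.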

 

Similarly I noticed a Unique tool in use in multiple places that I can probably turn into just 1 if it's done at the start rather than the end.

 

My guess is that both of these changes could improve performance (even 10 minutes shorter would be a welcome change!), but I was curious whether anyone knows specifically if these two tools can be a drain on performance when they're processing this many records.

 

The entire flow takes nearly 3 hours to run, and I'm not quite ready to test it yet so wanted to post here in the meantime in case anyone has any relevant/useful comments to share on this topic.

 

Thanks, Katie

JagdeeshN
12 - Quasar

Hi @Katie_K ,

 

I think Performance Profiling would be a good place to start.

 

You turn it on in the Runtime tab (Enable Performance Profiling). The results show up in the Results window once the workflow has finished. You can go to the Messages in the Results window and right-click to copy them out and store them.

 

Hope this helps.

 

Best,

Jagdeesh

Katie_K
8 - Asteroid

Thanks Jagdeesh, that's a great tip! I actually heard about this in one of the presentations at the conference, but I completely forgot about it... thanks again.

JagdeeshN
12 - Quasar

@Katie_K  Do let me know if this gives you a start.

 

If not, please feel free to share a sample of the workflow so that we can dive in deeper.

Katie_K
8 - Asteroid

Thanks again 😀

Luke_C
17 - Castor

Hi @Katie_K 

 

Your thoughts on Group Bys and Unique tools impacting performance are correct. These are called 'blocking tools': instead of records passing through one by one as they're processed, all of the data must be read into the tool before anything moves on downstream. Other examples of tools like this are Join, Sort, Append Fields, Auto Field, etc. I like to refer to the periodic table of Alteryx tools (link below); the red outlines mark the tools that behave like this. Using them strategically will certainly help your performance when dealing with millions of records.

 

https://community.alteryx.com/t5/Engine-Works/The-Periodic-Table-of-Alteryx-tools/ba-p/64120
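As a rough illustration of the streaming vs. blocking distinction described above, here is a small Python sketch using generators. It is an analogy only, not Alteryx's engine; `source`, `streaming_filter`, `blocking_sort` and `reads_before_first_output` are made-up names for this example. It counts how many input records have to be read before a step can hand its first record downstream.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, int]


def source(n: int, reads: List[int]) -> Iterator[Record]:
    """Produce n records one at a time, noting each read in `reads`."""
    for i in range(n):
        reads.append(i)
        yield {"id": i, "amount": (i * 7) % 5 - 1}


def streaming_filter(records: Iterable[Record]) -> Iterator[Record]:
    """Streaming step: each record is passed on as soon as it is seen."""
    for rec in records:
        if rec["amount"] > 0:
            yield rec


def blocking_sort(records: Iterable[Record]) -> Iterator[Record]:
    """Blocking step: nothing can be emitted until all input is buffered."""
    buffered = list(records)  # the whole input is held here
    yield from sorted(buffered, key=lambda rec: rec["amount"])


def reads_before_first_output(step: Callable[[Iterable[Record]], Iterator[Record]], n: int) -> int:
    """Count how many records were read before the step emitted its first one."""
    reads: List[int] = []
    pipeline = step(source(n, reads))
    next(pipeline)  # pull just the first output record
    return len(reads)


print("filter:", reads_before_first_output(streaming_filter, 10))
print("sort:  ", reads_before_first_output(blocking_sort, 10))
```

The filter can emit after reading only a couple of records, while the sort has to read all of them first. With hundreds of millions of records, each blocking step means the full data set is buffered again, so reducing the volume before the first blocking step, or cutting the number of blocking steps, is usually where the time goes.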

 

Katie_K
8 - Asteroid

OK thanks for this Luke!  I've seen this periodic table before but didn't think about it in this context... thanks so much, this is helpful.
