
Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
DylanB
Alteryx Alumni (Retired)

Much of Alteryx's appeal, it seems, comes from how far workflows remove the programming part of data analysis. That makes it easier to focus on the data while trusting that whatever method is being used to analyze it is effective and efficient. The visual clarity of workflows also helps anyone opening an unfamiliar workflow grasp what's going on.

 

These two aspects of Alteryx make it very easy to get "in the zone": fully focused on getting to the desired result, but not necessarily concerned with the procedure for getting there.

 

The issue is that this can lead to minor (or major) inefficiencies: sorting data and then filtering it, leaving extra columns sitting around long after their use instead of selecting them out, etc.

I set out to deal with what was, in my opinion, the simplest of these (sorting immediately before filtering) in the form of an app. It would take in an Alteryx workflow, app, or macro and spit back out a new and improved version.


A simple example:


Let's input a very simple workflow that sorts, then filters a dataset:
simple_example.png

Now let's apply this Efficiency Boost App:

 

Efficiency_Boost_App.png

 

To get this result:

simple_example_edit.png

 

As you can see, it simply redirects the wires of the workflow so that the filter runs before the sort, letting data flow downstream more efficiently.
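
To make the rewiring idea concrete, here is a minimal Python sketch of how a Sort-then-Filter pair could be swapped in a workflow's XML. It is not the app's actual implementation. The element and attribute names (Nodes, Node, ToolID, GuiSettings, Plugin, Connections, Origin, Destination), the plugin identifiers, and the anchor names ("Output", "Input", "True") are assumptions about the file layout and should be checked against a real workflow file. The sample document and the helper names (edges, swap_sort_before_filter) are made up for illustration. The sketch is deliberately conservative: it only rewires when the Sort feeds nothing but the Filter and only one Filter output is in use.

```python
import xml.etree.ElementTree as ET

# Assumed plugin identifiers; verify against a real workflow file.
SORT_PLUGIN = "AlteryxBasePluginsGui.Sort.Sort"
FILTER_PLUGIN = "AlteryxBasePluginsGui.Filter.Filter"

# Hypothetical, heavily trimmed workflow: Input -> Sort -> Filter(True) -> Browse
SAMPLE = """<AlteryxDocument>
  <Nodes>
    <Node ToolID="1"><GuiSettings Plugin="AlteryxBasePluginsGui.DbFileInput.DbFileInput" /></Node>
    <Node ToolID="2"><GuiSettings Plugin="AlteryxBasePluginsGui.Sort.Sort" /></Node>
    <Node ToolID="3"><GuiSettings Plugin="AlteryxBasePluginsGui.Filter.Filter" /></Node>
    <Node ToolID="4"><GuiSettings Plugin="AlteryxBasePluginsGui.BrowseV2.BrowseV2" /></Node>
  </Nodes>
  <Connections>
    <Connection><Origin ToolID="1" Connection="Output" /><Destination ToolID="2" Connection="Input" /></Connection>
    <Connection><Origin ToolID="2" Connection="Output" /><Destination ToolID="3" Connection="Input" /></Connection>
    <Connection><Origin ToolID="3" Connection="True" /><Destination ToolID="4" Connection="Input" /></Connection>
  </Connections>
</AlteryxDocument>"""


def edges(root):
    """Return (origin tool, origin anchor, destination tool) per connection."""
    return [
        (c.find("Origin").get("ToolID"),
         c.find("Origin").get("Connection"),
         c.find("Destination").get("ToolID"))
        for c in root.iter("Connection")
    ]


def swap_sort_before_filter(root):
    """Rewire each Sort -> Filter pair to Filter -> Sort; return the swap count."""
    plugins = {}
    for node in root.iter("Node"):
        gui = node.find("GuiSettings")
        if gui is not None:
            plugins[node.get("ToolID")] = gui.get("Plugin", "")

    conns = list(root.iter("Connection"))
    swapped = 0
    for link in conns:
        sort_id = link.find("Origin").get("ToolID")
        filter_id = link.find("Destination").get("ToolID")
        if plugins.get(sort_id) != SORT_PLUGIN or plugins.get(filter_id) != FILTER_PLUGIN:
            continue

        into_sort = [c for c in conns if c.find("Destination").get("ToolID") == sort_id]
        out_of_sort = [c for c in conns if c.find("Origin").get("ToolID") == sort_id]
        out_of_filter = [c for c in conns if c.find("Origin").get("ToolID") == filter_id]
        anchors = {c.find("Origin").get("Connection") for c in out_of_filter}

        # Skip anything that is not the simple case: one input to the Sort,
        # the Sort feeding only this Filter, and exactly one Filter output used.
        if len(into_sort) != 1 or len(out_of_sort) != 1 or len(anchors) != 1:
            continue
        anchor = anchors.pop()

        # upstream -> Filter (instead of upstream -> Sort)
        into_sort[0].find("Destination").set("ToolID", filter_id)
        # Filter(anchor) -> Sort (the old Sort -> Filter wire, reversed)
        link.find("Origin").set("ToolID", filter_id)
        link.find("Origin").set("Connection", anchor)
        link.find("Destination").set("ToolID", sort_id)
        # Sort -> everything that used to hang off the Filter
        for c in out_of_filter:
            c.find("Origin").set("ToolID", sort_id)
            c.find("Origin").set("Connection", "Output")
        swapped += 1
    return swapped


if __name__ == "__main__":
    doc = ET.fromstring(SAMPLE)
    print(swap_sort_before_filter(doc), edges(doc))
```

The single-output restriction matters: if both Filter outputs were in use, moving the Sort behind just one of them would leave the other stream unsorted and change what the workflow produces.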

 

The results:

 

To figure out how much of a boost this had the potential to give, I set up an iterative macro that takes a dataset size and reports back the time it took to randomly generate, then sort, then filter that many records. Then, I ran the Efficiency Boost App on it.
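
The same kind of comparison can be sketched outside Alteryx. The Python below is an analogy rather than a reproduction of the macro: it times only the sort and filter steps on random numbers, and the function names, dataset sizes, filter threshold, and repeat count are all arbitrary choices made for illustration. Its timings say nothing about the Alteryx engine's actual performance.

```python
import random
import time


def sort_then_filter(values, threshold):
    # Sort everything first, then keep the values below the threshold.
    return [v for v in sorted(values) if v < threshold]


def filter_then_sort(values, threshold):
    # Keep the values below the threshold first, then sort only those.
    return sorted(v for v in values if v < threshold)


def average_seconds(fn, *args, repeats=3):
    """Average wall-clock time of fn(*args) over a few runs."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        total += time.perf_counter() - start
    return total / repeats


def compare(sizes, threshold=0.1, seed=0):
    """Return (size, sort-then-filter seconds, filter-then-sort seconds) rows."""
    rng = random.Random(seed)
    rows = []
    for n in sizes:
        data = [rng.random() for _ in range(n)]
        # Both orderings must produce the same output, or the swap is not safe.
        assert sort_then_filter(data, threshold) == filter_then_sort(data, threshold)
        rows.append((
            n,
            average_seconds(sort_then_filter, data, threshold),
            average_seconds(filter_then_sort, data, threshold),
        ))
    return rows


if __name__ == "__main__":
    for n, before, after in compare([1_000, 10_000, 100_000]):
        print(f"{n:>8} rows  sort->filter {before:.4f}s  filter->sort {after:.4f}s")
```

The equality check inside compare is the important design choice: reordering is only an optimization if both orders give identical output, which is true for a plain row filter followed by a sort.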

 

Here is the macro that sorts, then filters:

 

sort_filter.png


And here is the result after the Efficiency Boost App:

 

filter_sort.png

 

I threw these into a workflow to run each of them multiple times on datasets of different sizes and average the timings, and it spit out these results (all in seconds):

time_results.png

 

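It should come as no surprise that as datasets become larger, the Efficiency Boost App's benefits grow significantly: sorting is the more expensive step, and after the swap it only has to handle the records that make it through the filter.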

 

Closing Remarks

 

As of now, the app is in its infancy and has only been tested on relatively simple workflows and macros. Now that my groundwork on parsing the Alteryx XML is mostly done, I hope it will soon be possible to add more functionality and find much more room for time savings.
