Alteryx Designer Ideas

Share your Designer product ideas - we're listening!
Submitting an Idea?

Be sure to review our Idea Submission Guidelines for more information!

Submission Guidelines

Multiple iterating data sets

There are many circumstances when you have to build an interative macro where it's not just the iterating data set that needs to change every iteration, but also a second data set.

Think about this like a loop where two different variables are updated on every iteration, not just the control variable in the For xxx control variable.

 

The way that users work around this is to use a temporary yxdb file where instead of a macro input you input from the yxdb, and then write back to the same yxdb.    This allows you to pretend that you can adjust 2 different data sets on every cycle of the loop.    there are 4 downsides to this:
a) User complexity - this breaks the conceptual simplicity of macro inputs since now the users have to understand that in situation X I use macro inputs; and in situation Y I have to use some other type of tool.

b) Speed penalty - writing to disk is between 1000x slower and 1 000 000x slower than working with data in memory (especially if it's in cache) - so by forcing this to go through a yxdb file, you do incur a speed penalty which is just not needed

c) blocking penalty - Because of the fact that you can't write to a file that you're still reading from, you need to pepper this with Block until done tools - and you need to initialize the macro using a first write to the yxdb file outside the macro - which further hurts speed.    Given the nuanced behaviour of block-until-done, this also introduces user complexity issues

d) Self-contained - because you have to initialize these files outside the macro - the macro is no-longer self-contained and portable (which breaks the principle of Information Hiding which is a key pillar of good modular decoupled software design.

 

The other way that users work around this - is to serialize their entire second recordset into a field which then gets tacked onto the iterating data set using an iterative macro.   This is HIGHLY wasteful becuase then you have to build a serialize & deserialize process for this second recordset.    It fixes the speed and blocking penalites from above, but introduces a computational overhead which is generating no value; and makes this even more complex for users - and a further blocker to using macros.

 

 

Recommendation:

We could make this simpler by allowing users to create multiple pairs of macro input / macro outputs so that 2 or 3 or n different data sets can be updated with every iteration.

 

 

Below is a screenshot demonstrating this, from an Advent of code challenge - the details of the problem are not important - the issue at hand is that there are 2 record sets which both need to be updated on every iteration.

 

cc: @NicoleJohnson @Samanthaj_hughes @SteveA 

 

SeanAdams_0-1641333684845.png

 

2 Comments
Samanthaj_hughes
12 - Quasar
12 - Quasar

I have a whole talk about figure 8 macros which is about how to get around this too. I actually join all the data back together and use a filter to push it back around and then filter it back out.

Useful when you are iterating two datasets specifically top group and bottom group hence figure 8.

SteveA
Alteryx
Alteryx

@SeanAdams Agreed we need this, AoC drove it home for me this year.  I feel like this is mostly a UI/UX problem because let's be honest, the current UI for iterative macro setup is both hard to find and obtuse.  Refining the UI to specify multiple input/loop connections seems straightforward, I would have to look at the macro back end to see how difficult it is to implement in the engine(s).