Batch macros are a new feature in Alteryx 5.0. They add some fairly simple capabilities that let you solve problems in ways that would otherwise be very difficult, if not impossible.
Batch macros are just like other Alteryx Macros, but with the addition of a special input we call the control input. Records that go in the control input don't become streams inside the macro like regular inputs. Instead, for each record, the entire macro will be reconfigured and run. You can use the fields in the control input records for two things: 1) to reconfigure the macro at run time, just as if they were the answers to questions in the macro GUI; 2) to group the records going in the other macro inputs into batches. Frequently, you're going to do both of these, but today we'll take them one at a time and look at a simple example that uses each of them.
For each record that comes in the control input, the batch macro will get reconfigured. You can use the fields in the record as variables to update the configuration.
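If it helps to think about it in code, here is a minimal Python sketch of that idea. It is only a mental model, not how Alteryx is implemented: `run_batch_macro`, `configure`, `run`, and the `path` field are all hypothetical names made up for illustration.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


def run_batch_macro(
    control_records: Iterable[Record],
    configure: Callable[[Record], dict],
    run: Callable[[dict], List[Record]],
) -> List[Record]:
    """Reconfigure and run the whole macro once per control record."""
    results: List[Record] = []
    for control in control_records:
        config = configure(control)   # fields of the control record act as variables
        results.extend(run(config))   # one complete run of the macro per record
    return results


# Hypothetical macro: the control field "path" sets which file the macro reads.
def configure(control: Record) -> dict:
    return {"input_file": control["path"]}


def run(config: dict) -> List[Record]:
    # Stand-in for the real work; it just reports which file it was configured for.
    return [{"source": config["input_file"]}]


out = run_batch_macro([{"path": "a.log"}, {"path": "b.log"}], configure, run)
```

Two control records go in, so the macro is configured and run twice, and the outputs of both runs end up in one result.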
Here is a blog post Ned wrote about a module that parses web server logs. Since their format is pretty arcane, it can only parse one log file at a time. Typically though, you'd like to parse a whole directory full of them at once. So in order to do that, we're going to turn his module into a batch macro. Starting with the original module, we'll go to the module properties, change the Module Type to Macro, and hit Create. This brings us to the Macro Design window, and the first tab there is "Batch Macro":
Starting from the top, we're going to check the "Enable as Batch Macro" box. Next after that is the output mode:
If your macro has multiple outputs, they will all need to use the same output mode. For this example, we're going to choose Auto Configure by Name. Each time we parse a log file, we'll figure out which fields it has in it, but that can change if you configure the web server to log different things, so we want to be able to handle that.
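To make "by name" concrete, here is a rough Python sketch of combining the output of several batches when their fields differ. It assumes that fields are matched by name and that a field missing from a batch comes through as null; that is my reading of the option rather than a statement of the exact Alteryx behavior, and `union_by_name` and the sample field names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]


def union_by_name(batches: List[List[Row]]) -> Tuple[List[str], List[Row]]:
    """Combine batch outputs, matching fields by name (assumed behavior)."""
    columns: List[str] = []
    for batch in batches:
        for row in batch:
            for name in row:
                if name not in columns:
                    columns.append(name)  # keep first-seen field order
    # Every output row gets every column; fields a batch lacked become None.
    rows = [
        {name: row.get(name) for name in columns}
        for batch in batches
        for row in batch
    ]
    return columns, rows


# Two log files, where the second was written after "status" logging was turned on.
first_log = [{"date": "2009-01-01", "uri": "/index.html"}]
second_log = [{"date": "2009-01-02", "uri": "/about.html", "status": "200"}]

columns, rows = union_by_name([first_log, second_log])
```

The record from the first log still comes out, just with an empty `status`, which is what we want when the web server's logging configuration changes between files.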
The Control Parameter we added shows up as a variable to update from, just like the answer to a question on the questions tab would.
So that's all there is to it. Now we can feed a Directory tool into the control input of our macro and parse all our log files:
Being able to reconfigure Alteryx processes based on data at run time is very powerful, but it does come with a cost. Usually Alteryx has all its configuration information up front, so all the overhead of configuration and initialization only happens once at the beginning. With a batch macro, it has to happen again at the start of each batch, which slows things down. If you're running a few dozen batches for a directory full of server logs, that will be fine. But you want to be careful you don't set things up to run a few million batches.
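A quick back-of-the-envelope calculation shows why the batch count matters. The 50 ms per-batch overhead below is a made-up number purely for illustration, not a measurement of Alteryx, and `overhead_ms` is a hypothetical helper.

```python
def overhead_ms(batches: int, per_batch_ms: int = 50) -> int:
    """Total configuration/initialization overhead across all batches.

    per_batch_ms is a hypothetical figure, not a measured Alteryx number.
    """
    return batches * per_batch_ms


few_dozen = overhead_ms(36)             # 1,800 ms: under two seconds
few_million = overhead_ms(2_000_000)    # 100,000,000 ms

hours = few_million / 1000 / 3600       # roughly 28 hours of pure overhead
```

The overhead grows linearly with the number of batches, so something that is unnoticeable for a directory of log files becomes the whole run time when every record is its own batch.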
The second thing batch macros can do is run records in batches. Our previous example didn't have any regular, non-control inputs, but what would happen if it had? For each record that comes in the control input, the macro will be reconfigured and it will run on the complete set of records that come in the regular inputs. Or, it can run on a subset of those records that match a field from the control input.
The example we'll look at here is a user request we got a long time ago, one we haven't had a really good answer for until now. The user has a bunch of store locations, and wants to make trade areas using non-overlapping radii. So far, no problem. But these stores are divided up into separate regional divisions that have defined boundaries, and the trade areas need to conform to these. For some example data, I'm going to use county centroids as store locations, and New England states as regions:
The radii need to be non-overlapping, but only compared to other stores in the same region, because we're going to cut the trade area boundaries to the regional borders. Before batch macros, there really wasn't a good answer here. The user basically needed to run his module separately for each sales region, then combine the results. That's exactly what our batch macro will do, but now it will be easy.
The actual macro here is trivially simple:
Just an input, a Trade Area tool with "Eliminate Overlap" checked, and an output. The macro has one question and an associated action to update the maximum radius, but if you look at the Batch Macro Tab:
you see that while "Enable as Batch Macro" is checked, there aren't even any Control Parameters. We don't need any, because we don't actually need to configure anything differently for each batch in this case, just run them separately from the other batches.
So how do we do that? Let's take a look at the module that uses this Macro:
Up at the top we read in the Locations file, which has the store location points, and a RegionId to tell us what region the store belongs to. We put this through a Summarize grouping on RegionId just to get a unique list of Ids for the control input. That way, we'll run one iteration for each Region. Looking at the configuration for our macro, we see that now that we have a regular input as well as a control, we get a new tab: Group By.
Here we can choose one field from our control input, and one field from each other input. If you pick a field for a particular input, each run of the macro will only use the records from that input where the field matches the value from the control input. For any inputs that you don't pick a field for, you'll get the whole stream again for each iteration. Note that the terminology can be a little confusing: "Group by" here doesn't mean quite the same thing as it would in the Summarize tool; we're not combining records, just arranging them in batches.
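Here is a small Python sketch of that batching rule, using the `RegionId` field from our example. It is an illustration of the behavior described above, not Alteryx's implementation; `batches_for`, the `Settings` input, and the sample records are hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Record = Dict[str, object]


def batches_for(
    control_records: List[Record],
    control_field: str,
    inputs: Dict[str, List[Record]],
    group_fields: Dict[str, Optional[str]],
) -> Iterator[Tuple[object, Dict[str, List[Record]]]]:
    """Yield (control value, records per input) for each run of the macro."""
    for control in control_records:
        key = control[control_field]
        batch: Dict[str, List[Record]] = {}
        for name, records in inputs.items():
            field = group_fields.get(name)
            if field is None:
                # No Group By field picked: this input gets the whole stream again.
                batch[name] = list(records)
            else:
                # Field picked: only the records matching the control value.
                batch[name] = [r for r in records if r[field] == key]
        yield key, batch


locations = [
    {"Store": 1, "RegionId": "ME"},
    {"Store": 2, "RegionId": "VT"},
    {"Store": 3, "RegionId": "ME"},
]
settings = [{"MaxRadius": 10}]  # hypothetical second input with no Group By field

# Like the Summarize step: one control record per unique RegionId.
control = [{"RegionId": rid} for rid in sorted({r["RegionId"] for r in locations})]

runs = dict(
    batches_for(
        control,
        "RegionId",
        {"Locations": locations, "Settings": settings},
        {"Locations": "RegionId"},
    )
)
```

Notice that nothing is aggregated: the Maine run sees stores 1 and 3 as separate records, the Vermont run sees store 2, and both runs see the full `Settings` stream.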
So that's it. Now we can run our module, and the batch macro will generate radii for each location, but make them non-overlapping only compared to the other locations in their region. Then we can trim the results to the region boundaries:
So you can see that batch macros do a couple of things that are actually fairly simple on their own. But when you combine them with each other, and everything macros and Alteryx as a whole can do, they can be extremely powerful.
Download this example here.