on 06-21-2016 11:35 AM - edited on 07-27-2021 11:39 PM by APIUserOpsDM
Objective: A dataset has 31 fields, seven of which represent categories of data and 24 of which represent each hour of a day (Figure 1). The objective is to multiply the values in each of the Category fields by the values in each of the Hour fields, effectively creating 24 fields for each of the seven categories. The output should contain 199 fields*: the seven original Category fields, the 24 original Hour fields, and the 168 fields generated in the workflow. The 168 fields should be named using both the Category name and the Hour: Category_1_1, Category_1_2 … Category_7_24.
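To make the target output concrete, here is a minimal Python sketch of the same calculation. The field names (Category_1, Hour_1, etc.) are assumptions based on Figure 1, and the sample record uses two categories and three hours instead of seven and 24, purely for brevity:

```python
# Hypothetical sample record: 2 categories and 3 hours for brevity
# (the real data has 7 categories and 24 hours, giving 7 * 24 = 168 new fields).
record = {"Category_1": 2.0, "Category_2": 3.0,
          "Hour_1": 10.0, "Hour_2": 20.0, "Hour_3": 30.0}

categories = {k: v for k, v in record.items() if k.startswith("Category")}
hours = {k: v for k, v in record.items() if k.startswith("Hour")}

# Each new field is named <Category>_<HourNumber> and holds the product.
combined = {}
for cat_name, cat_val in categories.items():
    for hour_name, hour_val in hours.items():
        hour_num = hour_name.split("_")[1]
        combined[f"{cat_name}_{hour_num}"] = cat_val * hour_val
```

With the full dataset, `combined` would hold the 168 new fields that the workflow below builds with a batch macro.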
Method: Conceptually, the workflow will take each of the seven categories of data and multiply each categorical field by the values stored in each of the 24 hourly fields. The repetitive nature of this process lends itself nicely to a batch macro configuration to automate the calculations. One major obstacle, however, throws a bit of a wrench in our plans: batch macro outputs "Union" the data, stacking it vertically. That means the data associated with Categories 1-7 would be included in the output as new records, not as new fields. Because the goal is to see these data "Joined", or appended laterally, we essentially need to defy the laws of gravity (or, batch macros, as it were) by leveraging the powers of the Crosstab and Transpose tools. To successfully arrange our data laterally, some key steps need to happen at three points in the workflow: (I) before the data enters the batch macro, (II) inside the macro, and (III) after leaving the macro. Follow along with the attached workflow to visualize the process at each step.**
I. Before the data enters the Macro….
The very first step after bringing in data is to add a Record ID to assign each of your records a unique identifier. Not only do we recommend this step whenever the input data will be undergoing significant transformations, but, ultimately, this Record ID is the glue that holds the whole process together. Without it, it would be a pretty tough, if not impossible, task to get this data pieced back together.
The next step is to split the data into two streams, one for the categorical data fields and another for the hourly data fields, using Select tools. Then the data is Transposed, flipping it vertically, using the Record ID as the Key Field and the Category or Hour fields as the Data Fields. This step prepares the data for the mathematical processes that will be applied in the batch macro. The transposition of the Category data fields also creates an opportunity to use a Summarize tool to list the values (each of the Categories) that will be fed to the macro's control parameter.
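If you prefer to see this wide-to-long flip as code, here is a rough Python equivalent of what the Transpose tool does (field and column names are illustrative, not Alteryx's exact output schema):

```python
def transpose(records, key_field, data_fields):
    """Mimic Alteryx's Transpose tool: emit one row per (key, data field),
    with the original field name in "Name" and its value in "Value"."""
    out = []
    for rec in records:
        for f in data_fields:
            out.append({key_field: rec[key_field], "Name": f, "Value": rec[f]})
    return out

# One wide record becomes one vertical row per Category field.
rows = [{"RecordID": 1, "Category_1": 2.0, "Category_2": 3.0}]
long_rows = transpose(rows, "RecordID", ["Category_1", "Category_2"])
```

The same function applied to the Hour fields produces the second vertical stream that feeds the macro.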
II. Inside the Batch Macro...
Once the data is brought into the batch macro, it is Joined together (by Record ID) so that the values stored in the Category fields sit alongside their corresponding data from the Hour fields. These values are then multiplied together in the Formula tool. Additionally, a new field, called "Placeholder", is created that contains the name of the Category used for that batch of the macro. Note that this field is connected to the Control Parameter, which means its value will be updated with each batch to reflect the changing Category field names. At first glance, the "Placeholder" field may seem unnecessary, as it simply duplicates the Category field. However, its importance isn't fully realized until we start to re-compile the data using Crosstab tools a bit further downstream.
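In plain Python, one batch of this Join-then-multiply step might look like the following sketch. The values and the `current_category` variable (standing in for the Control Parameter) are illustrative assumptions:

```python
# Stand-in for the Control Parameter: updated with each batch of the macro.
current_category = "Category_1"

# Transposed streams entering the macro (illustrative values).
cat_rows = [{"RecordID": 1, "Name": "Category_1", "Value": 2.0}]
hour_rows = [{"RecordID": 1, "Name": "Hour_1", "Value": 10.0},
             {"RecordID": 1, "Name": "Hour_2", "Value": 20.0}]

# Join by Record ID, multiply, and tag each row with the batch's category.
cat_by_id = {r["RecordID"]: r for r in cat_rows}
joined = []
for h in hour_rows:
    c = cat_by_id[h["RecordID"]]
    joined.append({"RecordID": h["RecordID"],
                   "Hour": h["Name"],
                   "Multiplied": c["Value"] * h["Value"],
                   "Placeholder": current_category})
```

Each batch reruns this logic with a different `current_category`, which is exactly why the "Placeholder" column matters later on.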
At this point, the data is split again into the Category data and the Hourly data to begin the process of horizontally arranging the data and updating the naming schema of the fields. Using Crosstab tools, the Category data is grouped by Record ID and Placeholder, which ensures that this newly created field carries through the macro. The Hourly data fields are split into two separate streams, one for the newly created data in the field "Multiplied" and one for the original data. Both data streams are updated using a Dynamic Rename tool to add prefixes to the field names. The new Hourly data fields, however, are connected to the Control Parameter. This means that, like the value of the "Placeholder" field, the prefix of the field will be updated with every batch of the macro. Before the data leaves the batch macro, it is joined together with a Join Multiple tool. This is the first instance in which you can see that Record ID come in handy with re-assembling our data!
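The Crosstab-then-rename pattern can be sketched in Python as well. This is a simplified stand-in for the tools' behavior, with an assumed prefix of "Category_1_" for the current batch:

```python
def crosstab(rows, group_field, header_field, value_field):
    """Mimic Alteryx's Crosstab tool: one output row per group,
    one column per distinct value of the header field."""
    out = {}
    for r in rows:
        out.setdefault(r[group_field], {group_field: r[group_field]})
        out[r[group_field]][r[header_field]] = r[value_field]
    return list(out.values())

# Output of one batch's Formula tool (illustrative values).
joined = [{"RecordID": 1, "Hour": "Hour_1", "Multiplied": 20.0},
          {"RecordID": 1, "Hour": "Hour_2", "Multiplied": 40.0}]

wide = crosstab(joined, "RecordID", "Hour", "Multiplied")

# Dynamic Rename equivalent: prefix the new columns with this batch's category.
prefix = "Category_1_"
renamed = [{(prefix + k if k.startswith("Hour") else k): v for k, v in row.items()}
           for row in wide]
```

Grouping on Record ID here is what makes the later re-assembly possible.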
III. After the Batch Macro...
This third and final step of the workflow was developed to overcome the obstacle of the Unioned output of the batch macro process. Clicking on the Browse tool following the Batch Macro will show that all the records for each of the batches are now stacked vertically and that the field names only reflect the first batch of data. All the data we want is technically there; we just want to visualize it horizontally for a side-by-side comparison. To re-configure the data so that each batch of data is appended laterally, we can use another round of Transpose and Crosstab tools.
First, we'll transpose the data so that, once again, it is arranged vertically. You might be thinking that this is a step backwards from our end goal. Well, in this case, it's one step back, then two steps forward! By arranging the data vertically, we not only get access to the field names we need to fix, we also prepare to create our 168 new field names according to our naming schema. The fields we're primarily concerned about re-naming are those with data that was calculated in the batch macro, which can be filtered out of the data. Then, using RegEx and a Formula tool, we begin the process of creating our 168 new field names. Once again, that seemingly unnecessary "Placeholder" field comes in handy! Since all of the field names contain "Category 1", we can compare our RegEx output (what we don't want) with the Category contained in the "Placeholder" field (what we do want) in a Formula tool to re-construct every combination of Category value and Hour that we need. A final Crosstab aligns the data horizontally so that it can be Joined and visualized the way we intended. After all that data manipulation, how do we know we got it all back together correctly? Luckily for us, that Record ID we created at the beginning of the workflow ensures that both our Crosstab and Join tools connect everything correctly.
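The RegEx-plus-Formula rename step boils down to swapping the stale category prefix for the one stored in "Placeholder". Here is a hedged Python sketch of that idea (the field names and values are made up for illustration):

```python
import re

# After the batch macro's union, every combined field name reflects the first
# batch ("Category_1_..."), while Placeholder holds each row's true category.
rows = [{"Name": "Category_1_5", "Placeholder": "Category_3", "Value": 60.0},
        {"Name": "Category_1_24", "Placeholder": "Category_1", "Value": 10.0}]

for r in rows:
    hour = re.search(r"_(\d+)$", r["Name"]).group(1)  # pull the hour suffix
    r["Name"] = f'{r["Placeholder"]}_{hour}'          # rebuild with the true category
```

Once every row carries its corrected name, the final Crosstab can safely spread the rows back out into the 168 properly named columns.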
*Normally, we would not recommend creating 199 fields, but sometimes visualization or side-by-side comparison is the goal. Inspired by a true story.