This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
I have a data set that contain tow columns: the first column has duplicate values and the second has unique values.
Below is the desired output I want:
So basically, out system doesn't allow duplicate values in column A. So what I want is to separate the records into different groups(tables, files, whatever). In this case, the maximum count of duplicate records are 4. So we need to separate the records in 4 groups.
The output could be like this as well
So I guess the steps are: identify the maximum count of duplicate values - create groups based on the result and then separate those values in different group. I want it to be dynamic so it can detect how many groups needs to be generated.
I want to find a way to separate those records automatically.
The first step is to build a macro. Iterative Macros have four parts:
The input - This is like any input you use. You can select a file with the same schema as your actual data to build the maco
A workflow that does something - The key feature that the workflow needs is point where data that is "good", meaning that you have the final result you want, goes in one branch and the data that still needs to be processed goes in the other.
Output for the Good data
Output for the data that needs to be rerun
How it works
Macros take data, perform a process on part of that data. The portion of the data that is processed is put off to the side. This is one iteration.
The macro starts the process again with the remaining data, part of which is processed and part of which is and the rest which will be rerun in future iterations.
The process repeats or loops until all the data is processed, an error occurs, or a maximum number of iterations is reached.
Since this is a macro we don't use the standard input and output tools. Rather we use the Macro Input and Macro Outputs. This workflow does what you said in words. It takes a list of inputs and pull the unique values from column A and the associated value from column B and that goes to the "Good" side. Next using the formula tool we add the third column for the group. The non-unique values are sent to the R output so they can be put through the macro again.
Using the iteration number to assign the group
Alteryx keeps track of how many times it runs though an iterative macro and this is a variable we can access using the Engine constants. By default the counter starts at iteration 0, so I have added one so the first group is labeled 1 rather than 0.
Now we just need to assign the macro inputs/outputs to their roles. The input is assigned to the Input and the Macro Output that comes off of the Duplicate side of the Unique tool is assigned to the Iteration Output.
After saving the macro it can be added to a standard workflow. The macro will take any input as long as there are two columns and will run as many times as it needs to create the groups.