Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Separate records based on duplicate values in a column


Hi everyone,


Hope you guys are having a wonderful day!


Here is the question I have:



I have a data set that contain tow columns: the first column has duplicate values and the second has unique values.


Below is the desired output I want:




So basically, out system doesn't allow duplicate values in column A. So what I want is to separate the records into different groups(tables, files, whatever). In this case, the maximum count of duplicate records are 4. So we need to separate the records in 4 groups.


The output could be like this as well



So I guess the steps are: identify the maximum count of duplicate values - create groups based on the result and then separate those values in different group. I want it to be dynamic so it can detect how many groups needs to be generated.


I want to find a way to separate those records automatically.


Any idea will be appreciated.



Alteryx Certified Partner
Alteryx Certified Partner

Hi @CodingMan


Here's the simple solution that I found.



The first proposed idea is not possible with the same fields (A and B), since Alteryx does not allow fields with repeated names.


WF appended.


It is a great way to practice an iterative macro, all it takes is a unique tool and a formula tool.  Honestly, the hardest part about this workflow is learning how to think is loops.  There is a really good live training that shows hows to make the configuration And some documentation as well


The first step is to build a macro.  Iterative Macros have four parts:

  1. The input - This is like any input you use.  You can select a file with the same schema as your actual data to build the maco
  2. A workflow that does something - The key feature that the workflow needs is point where data that is "good", meaning that you have the final result you want, goes in one branch and the data that still needs to be processed goes in the other.
  3. Output for the Good data
  4. Output for the data that needs to be rerun

How it works

Macros take data, perform a process on part of that data.  The portion of the data that is processed is put off to the side.  This is one iteration.

The macro starts the process again with the remaining data, part of which is processed and part of which is and the rest which will be rerun in future iterations.

The process repeats or loops until all the data is processed, an error occurs, or a maximum number of iterations is reached.



Since this is a macro we don't use the standard input and output tools.  Rather we use the Macro Input and Macro Outputs.  This workflow does what you said in words.  It takes a list of inputs and pull the unique values from column A and the associated value from column B and that goes to the "Good" side.  Next using the formula tool we add the third column for the group. The non-unique values are sent to the R output so they can be put through the macro again.



Using the iteration number to assign the group

Alteryx keeps track of how many times  it runs though an iterative macro and this is a variable we can access using the Engine constants.  By default the counter starts at iteration 0, so I have added one so the first group is labeled 1 rather than 0.

formula config.JPG

Interface designer

Now we just need to assign the macro inputs/outputs to their roles. The input is assigned to the Input and the Macro Output that comes off of the Duplicate side of the Unique tool is assigned to the Iteration Output.




After saving the macro it can be added to a standard workflow.  The macro will take any input as long as there are two columns and will run as many times as it needs to create the groups.



Attached is a one tool solution, using the Multi-Row Formula tool, set to group on A, and the expression will increment the Group ID.




Thank you so much!