Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Community is experiencing an influx of spam. As we work toward a solution, please use the 'Notify Moderator' option on the ellipsis menu to flag inappropriate posts.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Separate records based on duplicate values in a column

CodingMan
7 - Meteor

Hi everyone,

 

Hope you guys are having a wonderful day!

 

Here is the question I have:

data.PNG

 

I have a data set that contain tow columns: the first column has duplicate values and the second has unique values.

 

Below is the desired output I want:

output.PNG

 

 

So basically, out system doesn't allow duplicate values in column A. So what I want is to separate the records into different groups(tables, files, whatever). In this case, the maximum count of duplicate records are 4. So we need to separate the records in 4 groups.

 

The output could be like this as well

output2.PNG

 

So I guess the steps are: identify the maximum count of duplicate values - create groups based on the result and then separate those values in different group. I want it to be dynamic so it can detect how many groups needs to be generated.

 

I want to find a way to separate those records automatically.

 

Any idea will be appreciated.

 

 

4 REPLIES 4
Thableaus
17 - Castor
17 - Castor

Hi @CodingMan

 

Here's the simple solution that I found.

 

Capturar.PNG

The first proposed idea is not possible with the same fields (A and B), since Alteryx does not allow fields with repeated names.

 

WF appended.

gregpdx
5 - Atom

It is a great way to practice an iterative macro, all it takes is a unique tool and a formula tool.  Honestly, the hardest part about this workflow is learning how to think is loops.  There is a really good live training that shows hows to make the configuration https://community.alteryx.com/t5/Live-Training/Live-Training-Build-Your-First-Iterative-Macro/td-p/5... And some documentation as well https://help.alteryx.com/2018.2/IterativeMacro.htm.

 

The first step is to build a macro.  Iterative Macros have four parts:

  1. The input - This is like any input you use.  You can select a file with the same schema as your actual data to build the maco
  2. A workflow that does something - The key feature that the workflow needs is point where data that is "good", meaning that you have the final result you want, goes in one branch and the data that still needs to be processed goes in the other.
  3. Output for the Good data
  4. Output for the data that needs to be rerun

How it works

Macros take data, perform a process on part of that data.  The portion of the data that is processed is put off to the side.  This is one iteration.

The macro starts the process again with the remaining data, part of which is processed and part of which is and the rest which will be rerun in future iterations.

The process repeats or loops until all the data is processed, an error occurs, or a maximum number of iterations is reached.

 

Configuration

Since this is a macro we don't use the standard input and output tools.  Rather we use the Macro Input and Macro Outputs.  This workflow does what you said in words.  It takes a list of inputs and pull the unique values from column A and the associated value from column B and that goes to the "Good" side.  Next using the formula tool we add the third column for the group. The non-unique values are sent to the R output so they can be put through the macro again.

Macro.JPG

 

Using the iteration number to assign the group

Alteryx keeps track of how many times  it runs though an iterative macro and this is a variable we can access using the Engine constants.  By default the counter starts at iteration 0, so I have added one so the first group is labeled 1 rather than 0.

formula config.JPG

Interface designer

Now we just need to assign the macro inputs/outputs to their roles. The input is assigned to the Input and the Macro Output that comes off of the Duplicate side of the Unique tool is assigned to the Iteration Output.

  

macroconfig.JPG

 

After saving the macro it can be added to a standard workflow.  The macro will take any input as long as there are two columns and will run as many times as it needs to create the groups.

Workflow.JPG

Joe_Mako
12 - Quasar

Attached is a one tool solution, using the Multi-Row Formula tool, set to group on A, and the expression will increment the Group ID.

group.png

 

CodingMan
7 - Meteor

Thank you so much!

Labels