Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

See All records that are duplicates, including the original unique record

InfoManager
7 - Meteor

I'd like to see all rows that have duplicate values in the same column. But, I don't want to just see the duplicates (usually using the Unique function) but see the the original unique one, and the duplicates thereafter. Afterwards I would have two data sets, one with content that is never duplicated in the data, and a data set of all content that is duplicated, including the original.

 

How would I do this? Thanks!

8 REPLIES 8
cmcclellan
13 - Pulsar

I would still recommend using the Unique tool.  It gives you the first unique on one output and the rest of the duplicates on the other output.  You can then join together to get all the records ... or just add tools from before the Unique tool

DavidP
17 - Castor
17 - Castor

I like @cmcclellan's idea. Here's how I'd do it

 

dups.png

DavidP
17 - Castor
17 - Castor

I knew there was a macro out there that does exactly what you want!

 

http://www.chaosreignswithin.com/2014/08/only-unique-macro.html

 

 

InfoManager
7 - Meteor

Thanks, DavidP. I think I might still be missing something. The last filter that gets me all the dupes, doesn't get me the original unique row that is later duplicated since only the duplicated rows have the "y" in the last column. I wonder if I just remove the initial Unique tool and just start with the Group by function, if that would solve the problem...I think I would then be able to get ride of the join and union...

InfoManager
7 - Meteor

Nope! That didn't work. I'll keep playing with it. I'm definitely missing something...

DavidP
17 - Castor
17 - Castor

Can you please send me some test data?

 

If you look at the J output of the Join tool, it should show all the Dup records, including the Original unique row - these records all have "y" flag. The records at the L output are the ones for which there are no dups. The union tool brings then together in one dataset again.

 

Come to think of it, the L and J outputs of the Join tool gives you the same output as the filter tool, so you don't need the filter and union tools.

 

Did you try the Crew (Only unique) Macro?

 

 

estherb47
15 - Aurora
15 - Aurora

Hi @InfoManager 

I'd use the Crew Macro's Only Unique tool, if you're able to download. See @DavidP 's post for the link.

But, if you need you can approximate that tool using some Unique, Joins, Summarizes. By Joining the U output back to the D output of Unique, you'll get all of the truly unique (no dupes) out of the L output. Then, easily repeat the first instance of each duplicate using the Summarize (you could also sample, grouping on all of the fields and setting N to 1). Added a field to qualify as original (value of 1) or copy (value of 2). Unioned and sorted.

The Only Unique macro does all of this for you. It's fantastic!

image.png

Tommy1
5 - Atom

I had this issue and simply used the Unique tool, a formula, and Union

 

 

Labels