I'd like to see all rows that have duplicate values in the same column. But, I don't want to just see the duplicates (usually using the Unique function) but see the the original unique one, and the duplicates thereafter. Afterwards I would have two data sets, one with content that is never duplicated in the data, and a data set of all content that is duplicated, including the original.
How would I do this? Thanks!
Solved! Go to Solution.
I would still recommend using the Unique tool. It gives you the first unique on one output and the rest of the duplicates on the other output. You can then join together to get all the records ... or just add tools from before the Unique tool
I knew there was a macro out there that does exactly what you want!
http://www.chaosreignswithin.com/2014/08/only-unique-macro.html
Thanks, DavidP. I think I might still be missing something. The last filter that gets me all the dupes, doesn't get me the original unique row that is later duplicated since only the duplicated rows have the "y" in the last column. I wonder if I just remove the initial Unique tool and just start with the Group by function, if that would solve the problem...I think I would then be able to get ride of the join and union...
Nope! That didn't work. I'll keep playing with it. I'm definitely missing something...
Can you please send me some test data?
If you look at the J output of the Join tool, it should show all the Dup records, including the Original unique row - these records all have "y" flag. The records at the L output are the ones for which there are no dups. The union tool brings then together in one dataset again.
Come to think of it, the L and J outputs of the Join tool gives you the same output as the filter tool, so you don't need the filter and union tools.
Did you try the Crew (Only unique) Macro?
Hi @InfoManager
I'd use the Crew Macro's Only Unique tool, if you're able to download. See @DavidP 's post for the link.
But, if you need you can approximate that tool using some Unique, Joins, Summarizes. By Joining the U output back to the D output of Unique, you'll get all of the truly unique (no dupes) out of the L output. Then, easily repeat the first instance of each duplicate using the Summarize (you could also sample, grouping on all of the fields and setting N to 1). Added a field to qualify as original (value of 1) or copy (value of 2). Unioned and sorted.
The Only Unique macro does all of this for you. It's fantastic!