
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

If there is a duplicate, how do I separate all of the duplicates?

njkm
5 - Atom
I have a list that has duplicates. For example, my list is (A, B, C, A, A, D, E).
I want to be able to separate all of the A's from the list to get (B, C, D, E) and (A, A, A).

Using the Unique tool, I get (A, B, C, D, E) and (A, A).

Any ideas?
5 REPLIES
TonyM
Alteryx Alumni (Retired)
Hi Nick,

Using the Filter tool will allow you to separate specified data from a field. In the case presented above, you will want to set your filter logic to [Field] != "A". The results will be (B, C, D, E) in the True output and (A, A, A) in the False output.
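
As a rough illustration of that True/False split outside of Designer, here is a minimal Python sketch. The helper name `filter_split` is hypothetical and is not an Alteryx function; it only mimics the idea of routing each record to one of two outputs based on a condition.

```python
def filter_split(values, predicate):
    """Split values into (true_branch, false_branch), preserving order."""
    true_branch, false_branch = [], []
    for value in values:
        # Route each record to exactly one of the two outputs.
        (true_branch if predicate(value) else false_branch).append(value)
    return true_branch, false_branch


records = ["A", "B", "C", "A", "A", "D", "E"]

# Equivalent in spirit to the filter expression [Field] != "A".
kept, removed = filter_split(records, lambda v: v != "A")

print(kept)     # ['B', 'C', 'D', 'E']
print(removed)  # ['A', 'A', 'A']
```

Note that the condition names the value "A" explicitly, so this approach only scales if the values to exclude are known in advance.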

Regards,
njkm
5 - Atom

I appreciate the response, tmoses. I would prefer not to manually write a filter for each duplicate, as sometimes there will be thousands of duplicates. Is there a way to use a tool to remove a record from the master list if it matches a criterion on a sub list?


This is closer to the real-life example (here, the matching criterion on the sub list is a recordid that I created before fuzzy matching):

I have 500,000 people in my contact directory, and each has a recordid that I created. In that directory there are 3 different people named John Smith who work for company "xyz", 2 people named Michael Jones from company "psq", and 5 people named Tom White from "mno".

The conference organizer has told me that someone named John Smith from company "xyz", someone named Michael Jones from company "psq", and someone named Tom White from "mno" registered, along with about 10,000 other people.

I have used fuzzy matching to discover that the registered John Smith from company "xyz", Michael Jones from company "psq", and Tom White from "mno" match equally to all 3 John Smiths from company "xyz", both Michael Joneses from company "psq", and all 5 Tom Whites from "mno" in the contact directory.

Since I can't tell which of these John Smiths, Michael Joneses, and Tom Whites is the correct one, I need to remove them all from the master list.

I can figure out that there is more than one of a name by using the Unique tool. (This will separate out 2 of the 3 John Smiths, 1 of the 2 Michael Joneses, and 4 of the 5 Tom Whites.)
At this point, I could use the Filter tool and manually type in each name on that duplicate list from the original matched contacts. But if there are thousands of duplicates, this would take a long time.

Is there a way to remove all of the duplicates from the master list without having to manually type them in?
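
To make the request concrete, here is a minimal Python sketch of the logic being asked for: find every registrant that matched more than one directory record, collect all of those directory recordids, and drop them from the master list. The data, the identifiers (`reg1`, `101`, and so on), and the shape of the fuzzy-match output are all made up for illustration; the actual fields in the workflow may differ.

```python
from collections import Counter

# Hypothetical fuzzy-match output: (registrant_id, directory_recordid) pairs.
matches = [
    ("reg1", 101), ("reg1", 102), ("reg1", 103),  # 3 John Smiths at "xyz"
    ("reg2", 201), ("reg2", 202),                 # 2 Michael Joneses at "psq"
    ("reg3", 301),                                # an unambiguous match
]

# Count how many directory records each registrant matched.
match_counts = Counter(registrant for registrant, _ in matches)

# Every directory recordid tied to an ambiguous registrant (more than one match).
ambiguous_ids = {
    recordid for registrant, recordid in matches if match_counts[registrant] > 1
}

# Hypothetical master list of directory recordids.
master = [101, 102, 103, 201, 202, 301, 401]

# Keep only the records that are not ambiguous.
cleaned = [recordid for recordid in master if recordid not in ambiguous_ids]

print(cleaned)  # [301, 401]
```

The key point is that the exclusion list is derived from the data by counting, so nothing has to be typed in by hand regardless of how many duplicates there are.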

 

jgreene
8 - Asteroid
This is my standard method for that case:
kane_glendenning
10 - Fireball
Hi Nick,

I believe what you are after here is a variant of the Unique tool where every record that is duplicated, or is a duplicate, goes out to one output, and you're in luck. Adam Riley features this in a blog post on his site (Chaos Reigns Within), and it is available in the CReW Macro Pack, which contains useful macros by Adam, Chris Love & Ned Harding. I highly suggest that you check it out. The "Only Unique" macro is very handy for de-duping lists where information has to be merged between duplicate records and the like.
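
For readers who want to see the behaviour described above in code, here is a minimal Python sketch. It is not the CReW macro's implementation, and the function name `split_only_unique` is hypothetical; it simply sends values that occur exactly once to one output and every copy of a repeated value to the other.

```python
from collections import Counter


def split_only_unique(values):
    """Return (unique, duplicated), preserving the input order.

    unique:     values that occur exactly once
    duplicated: every occurrence of any value that occurs more than once
    """
    counts = Counter(values)
    unique = [v for v in values if counts[v] == 1]
    duplicated = [v for v in values if counts[v] > 1]
    return unique, duplicated


unique, duplicated = split_only_unique(["A", "B", "C", "A", "A", "D", "E"])

print(unique)      # ['B', 'C', 'D', 'E']
print(duplicated)  # ['A', 'A', 'A']
```

This differs from the standard Unique tool behaviour described in the question, where the first "A" stays on the unique side and only the later copies are split off.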

Hope this helps.

Kane.
DanWhalen
7 - Meteor

jgreene, this is what I was searching for. It's kinda messy looking, but it works. If the Unique tool had an option to do this, just to separate out all the dups entirely, I think it would be a lot more useful.
