Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Remove All Duplicate Records Based on All Fields

rreissig001
5 - Atom

Hello all,

 

I have a two Excel documents each of which contain running detailed time capture.  I would like to remove anything in Document 1 (which runs from March 16 through April 3) from Document 2 (which runs from March 16 through April 10).  See below data sample set:

 

Field 1Field 2Field 3Field 4Field 5
abcde
abcde
aaaaa
aaaaa
zyxwv
acedb

 

I would like to remove ALL items that are duplicates in ALL fields.  Therefore, the output would remove the first 4 records as lines 1 and 2 and lines 3 and 4 are exact duplicates.  The sample output would be:

 

Field 1Field 2Field 3Field 4Field 5
zyxwv
acedb

 

I have searched the community and tried everything I can think of but cannot seem to do this.  The unique tool does not work exactly.  I currently unioned the two datasets, used the unique filter on ALL fields and then joined them back together, whereas I thought the Left output would give me records unique to the first dataset but I currently have more records than I would expect.

 
 
1 REPLY 1
ggruccio
ACE Emeritus
ACE Emeritus

Hi @rreissig001,

 

After you Union the files, try summarizing using GROUP BY on all fields and put a COUNT on at least one field.

 

Then filter for only the records where the count =1.  I've posted a simplified view of it.

 

ggruccio_2-1586787243796.png

 

ggruccio_3-1586787295375.png

 

 

 

ggruccio_1-1586787217128.png

 

ggruccio_0-1586787179386.png

 

...thinking about it a bit more...instead of unioning you could also join file 1 and file 2 on all 5 fields...  Assuming file 2 is coming in the right side, then anything that doesn't join on all 5 fields on the right output would also do this.

 

Labels