
Alteryx Designer Desktop Discussions

SOLVED

Duplicates

GdeH
7 - Meteor

Hi,

I've loaded 2 files into the tool. Some lines are duplicated between the two files, and I would like to delete those.

However, I would like to keep the duplicates that occur within a single file.

For example, here below I would like to keep lines 1, 2 and 4 and remove line 3.
FILE 1 - XXX
FILE 2 - XXX
FILE 2 - XXX
FILE 1 - YYY

Here below I would like to keep lines 1, 2 and 5 and remove lines 3 and 4.

FILE 1 - XXX
FILE 1 - XXX
FILE 2 - XXX
FILE 2 - XXX
FILE 1 - YYY

10 REPLIES
phottovy
13 - Pulsar

I'm not completely sure I understand the difference between the two scenarios, but I attached a couple of possible solutions.

 

In the first one, I assign a unique RecordID to all the rows in File 1 and then use the Unique tool to keep all of File 1 but remove any duplicates from File 2.

 

In the second one, I use the Multi-Row tool to identify duplicates.

 

Hopefully one of these helps!

GdeH
7 - Meteor

Hi, thank you for your reply.

Actually, I extracted the 2 files from a system, and the two extracts overlap: some lines that are included in file 1 are also included in file 2. I would like to remove the overlapping items.

 

If I have 5 identical lines in file 1 and the exact same line appears 3 times in file 2, I would like to remove the 3 lines in file 2 and keep the 5 lines in file 1.

If I have 4 different lines in file 1 and those 4 different lines are also included in file 2, I would like to remove the 4 lines in file 2.  

 

I hope this is clearer.

Emil_Kos
17 - Castor

Hi @GdeH,

 

Can you test if this solution works for you?

 

[Attached screenshot: Emil_Kos_1-1613498320554.png]

 

echuong1
Alteryx Alumni (Retired)

If I understand your requirements correctly, you should be able to create a flag that you can filter on.

 

I started by sorting the records by file name and value so that identical records are grouped sequentially. From there, I used a Multi-Row Formula with two checks: first, if file = 1, keep the row (flag of 1); second, if the value is the same as the value in the row above, exclude the row (flag of 0). You can then use a Filter on flag = 1.
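
For readers who want to trace this flag logic outside Designer, here is a minimal Python sketch. It assumes the sort is by file first and then by value, and that each record is a hypothetical (file, value) pair; `flag_rows` is an illustrative name, not something taken from the attached workflow.

```python
def flag_rows(rows):
    """Return (file, value, flag) tuples; flag 1 means keep, 0 means exclude.

    rows is a list of (file, value) pairs. Assumption: sorted by file, then value.
    """
    ordered = sorted(rows)  # tuples sort by file first, then by value
    flagged = []
    prev_value = None
    for file, value in ordered:
        if file == 1:
            flag = 1  # first check: every File 1 row is kept
        elif value == prev_value:
            flag = 0  # second check: same value as the row above, so exclude
        else:
            flag = 1
        flagged.append((file, value, flag))
        prev_value = value
    return flagged


# First example from the question: FILE 1 XXX, FILE 2 XXX, FILE 2 XXX, FILE 1 YYY
rows = [(1, "XXX"), (2, "XXX"), (2, "XXX"), (1, "YYY")]
kept = [(f, v) for f, v, flag in flag_rows(rows) if flag == 1]
print(kept)
```

With this sort order, the sketch keeps both File 1 rows and one of the two File 2 XXX rows for that first example.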

 

Hope this helps!

 

[Attached screenshot: echuong1_0-1613498417523.png]

 

GdeH
7 - Meteor

Unfortunately, this is not working.

echuong1
Alteryx Alumni (Retired)

The workflow that I provided works for the examples you gave previously. Can you expand upon what isn't working specifically, and provide additional examples?

GdeH
7 - Meteor

In the workflow you built, the input is:

FILE 1 XXX
FILE 1 XXX
FILE 2 XXX
FILE 2 XXX
FILE 1 YYY
FILE 2 YYY
FILE 2 YYY
FILE 2 YYY

I would like to obtain the output with the red items removed (both FILE 2 XXX lines and one of the FILE 2 YYY lines), and your workflow does not produce this.

 

In file 1, I would like to keep all the lines. I would like to add file 2 to file 1 without the items that are already included in file 1. In the example above, the 2 XXX lines in file 2 are already included in file 1, and so is 1 of the YYY lines, so I would like to remove those.

echuong1
Alteryx Alumni (Retired)

I'm not quite sure I understand your logic. Why are you also keeping the File 2 YYY rows (bolded)? There is already a YYY value in File 1, and, if I understand your logic, that is the same reason both File 2 XXX rows are excluded.

 

FILE 1 XXX
FILE 1 XXX
FILE 2 XXX
FILE 2 XXX
FILE 1 YYY
FILE 2 YYY
FILE 2 YYY
FILE 2 YYY

GdeH
7 - Meteor

We should keep the bold items because file 1 contains only 1 YYY line, so we can only remove YYY once from file 2.

For the XXX lines in file 2, we can remove both, since there are 2 XXX lines in file 1.
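
The rule described here amounts to cancelling File 2 lines one-for-one against File 1 lines. Below is a rough Python sketch of that counting logic, assuming each line can be compared as a whole string; `merge_without_overlap` is a hypothetical helper name, not an Alteryx tool, and this is not the workflow that was accepted as the solution.

```python
from collections import Counter


def merge_without_overlap(file1, file2):
    """Keep every File 1 line, plus the File 2 lines not already covered by File 1.

    Each File 1 line can cancel at most one identical File 2 line.
    """
    remaining = Counter(file1)  # copies of each line that File 1 can still cancel
    kept_from_file2 = []
    for line in file2:
        if remaining[line] > 0:
            remaining[line] -= 1  # this File 2 copy is already present in File 1
        else:
            kept_from_file2.append(line)
    return list(file1) + kept_from_file2


# Input from the discussion above: File 1 has 2 XXX and 1 YYY,
# File 2 has 2 XXX and 3 YYY.
file1 = ["XXX", "XXX", "YYY"]
file2 = ["XXX", "XXX", "YYY", "YYY", "YYY"]
result = merge_without_overlap(file1, file2)
print(result)
```

For this input, both File 2 XXX lines and one File 2 YYY line are dropped, leaving the three File 1 lines and two File 2 YYY lines.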
