Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Duplicates

GdeH
7 - Meteor

Hi,

I've uploaded two files into the tool. Some records are duplicated across the two files, and I would like to delete those.

However, I would like to keep any duplicates that occur within a single file.

For example, below I would like to keep lines 1, 2 and 4 and remove line 3.
FILE 1 - XXX
FILE 2 - XXX
FILE 2 - XXX
FILE 1 - YYY

In the example below, I would like to keep lines 1, 2 and 5 and remove lines 3 and 4.
FILE 1 - XXX
FILE 1 - XXX
FILE 2 - XXX
FILE 2 - XXX
FILE 1 - YYY

phottovy
13 - Pulsar

I'm not completely sure I understand the difference between the two scenarios, but I've attached a couple of possible solutions.

In the first one, I assign a unique RecordID to every row in File 1 and then use the Unique tool to keep all of File 1 while removing any duplicates from File 2.

In the second one, I use the Multi-Row Formula tool to identify duplicates.

Hopefully one of these helps!

GdeH
7 - Meteor

Hi, thank you for your reply.

Actually, I extracted the two files from a system, and the extracts overlap.

That is, some lines included in file 1 are also included in file 2. I would like to remove the overlapping items.

If I have 5 identical lines in file 1 and the exact same line appears 3 times in file 2, I would like to remove the 3 lines in file 2 and keep the 5 lines in file 1.

If I have 4 different lines in file 1 and those 4 lines are also included in file 2, I would like to remove the 4 lines in file 2.

I hope this is clearer.

Emil_Kos
17 - Castor

Hi @GdeH,

Can you test if this solution works for you?

Emil_Kos_1-1613498320554.png

echuong1
Alteryx Alumni (Retired)

If I understand your requirements correctly, you should be able to create a flag that you can filter on.

I started by sorting the records by file name and value so that identical records are grouped sequentially. From there, I used a Multi-Row Formula with two checks: if file = 1, keep everything; otherwise, if the value is the same as the value in the row above, exclude the row (flag value of 0). You can then filter on flag = 1.

Hope this helps!

echuong1_0-1613498417523.png
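
For anyone who wants to check the flag logic described here against their own examples outside Alteryx, it can be roughly sketched in Python. This is only an illustration and not the attached workflow: the function name `flag_rows`, the `(file, value)` tuple layout, and the exact sort order (value first, then file) are all assumptions.

```python
def flag_rows(rows):
    """rows: list of (file, value) tuples. Returns (file, value, flag) tuples."""
    # Assumed sort order: by value, then by file, so that rows sharing a
    # value end up next to each other with file 1 rows first.
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    flagged = []
    previous_value = None
    for file, value in ordered:
        if file == 1:
            flag = 1  # first check: keep everything from file 1
        elif value == previous_value:
            flag = 0  # second check: same value as the row above, exclude
        else:
            flag = 1
        flagged.append((file, value, flag))
        previous_value = value
    return flagged

# First example from the original question
rows = [(1, "XXX"), (2, "XXX"), (2, "XXX"), (1, "YYY")]
kept = [(f, v) for f, v, flag in flag_rows(rows) if flag == 1]
print(kept)
```

Under these assumptions, a file 2 row is dropped whenever its value matches the row above it, regardless of how many times that value occurs in file 1.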

 

GdeH
7 - Meteor

Unfortunately, this is not working.

echuong1
Alteryx Alumni (Retired)

The workflow that I provided works for the examples you gave previously. Can you expand upon what isn't working specifically, and provide additional examples?

GdeH
7 - Meteor

In the workflow you've made, the input is:

FILE 1 XXX
FILE 1 XXX
FILE 2 XXX
FILE 2 XXX
FILE 1 YYY
FILE 2 YYY
FILE 2 YYY
FILE 2 YYY

I would like the output to exclude the items in red, and your workflow does not produce this.

In file 1, I would like to keep all the lines. I would like to add file 2 to file 1 without the items that are already included in file 1. In the example above, both XXX lines in file 2 are already included in file 1, and so is one of the YYY lines, so I would like to remove those.

echuong1
Alteryx Alumni (Retired)

I'm not quite sure I understand your logic. Why are you keeping the File 2 YYY rows as well (bolded)? There is already a YYY value in File 1, which, if I understand your logic, is the same reason both File 2 XXX rows are excluded.

FILE 1 XXX
FILE 1 XXX
FILE 2 XXX
FILE 2 XXX
FILE 1 YYY
FILE 2 YYY
FILE 2 YYY
FILE 2 YYY

GdeH
7 - Meteor

We should keep the bold items because file 1 has only one YYY line, so we can remove YYY from file 2 only once.

Both XXX lines in file 2 can be removed, since there are two XXX lines in file 1.
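
The rule stated in this reply (each line in file 1 cancels at most one identical line in file 2) is a count-based comparison, and it can be sketched in Python to test expected outputs before building the workflow. This is a minimal sketch under assumptions, not an Alteryx solution from the thread: the function name `remove_overlap` and the `("FILE n", value)` output layout are hypothetical.

```python
from collections import Counter

def remove_overlap(file1_values, file2_values):
    # Count the copies of each value in file 1; each copy may cancel
    # at most one identical line in file 2.
    remaining = Counter(file1_values)
    kept = [("FILE 1", v) for v in file1_values]  # file 1 is kept in full
    for v in file2_values:
        if remaining[v] > 0:
            remaining[v] -= 1  # overlap with file 1: drop this file 2 line
        else:
            kept.append(("FILE 2", v))  # no file 1 copy left to cancel it
    return kept

# The eight-row example discussed above
result = remove_overlap(["XXX", "XXX", "YYY"],
                        ["XXX", "XXX", "YYY", "YYY", "YYY"])
print(result)
```

For the eight-row example, this keeps the three file 1 lines plus two of the FILE 2 YYY lines, and drops both FILE 2 XXX lines and one FILE 2 YYY line.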
