Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Identify duplicates in a specific column, and combining back to main excel file

driz
6 - Meteoroid

Hi All, 

 

I have a file with numerous columns and rows of data. I am trying to do the following:

 

1. In two columns "Reference ID" and "Amount", i would like to identify duplicates (i.e. as long as the transaction has the same ID and same Amount, it is to be considered as a duplicate, the same ID but with different Amount will not be a duplicate)

     ------- I tried using Unique tool, and it is not ideal because the D Anchor does not return all rows that are duplicates, the first identified duplicate will enter the U Anchor)

2. After identifying the duplicates, i need a new column to indicate that it is a duplicate

3. All the identified duplicates with the new column needs to be combined to the original source file with all the other transactions. 

 

A simple illustration is as follows:

 

Main source

Reference IDAmountCountryCompany
10000011,000New YorkA
10000021,000New YorkB
10000021,000New YorkB
10000021,500New YorkB
10000031,600New YorkC
10000042,000New YorkE
10000042,100New YorkE

 

Ideal output:

Reference IDAmountCountryCompanyDuplicate?
10000011,000New YorkA[can be left blank]
10000021,000New YorkBDuplicate
10000021,000New YorkBDuplicate
10000021,500New YorkB[can be left blank]
10000031,600New YorkC[can be left blank]
10000042,000New YorkE[can be left blank]
10000042,100New YorkE[can be left blank]

 

What I have tried:

- I tried using the summarise tool, followed by filter tool (to identify duplicates), and then a join tool to combine with source file. And then i am stuck....

- I am unable to download the macro Only Unique because of certain restriction on my work desktop  - so this solution will not work for me

 

Thanks for your help everyone!

 

 

 

6 REPLIES 6
Qiu
21 - Polaris
21 - Polaris

From what see, the multi-row Formula tool is the most efficient way.

1227-driz.png

flying008
15 - Aurora

Hi, @driz 

 

FYI.

 

录制_2024_12_27_13_14_11_949.gif

 

If duplicates rows more than 2, then get the max :

 

录制_2024_12_27_13_23_42_23.gif

driz
6 - Meteoroid

Hi Qiu, thank you. 

 

However, the duplicates can be anywhere in the column, not just row-1 or row+1, my illustration was a simple one, that is why it was only showing duplicates in the rows above and below. 

 

My current excel file has over 10,000 lines, and the duplicates can be anywhere. Is there a fix for this? 

binuacs
21 - Polaris

@driz another way of doing this

image.png

driz
6 - Meteoroid

Hi flying008, 

 

This is an interesting approach, and not too complicated... thank you!

flying008
15 - Aurora

Hi, @driz 

 

If you can , please mark it as a solution and give a like for more share.

Labels
Top Solution Authors