Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Identifying duplicate record id's to override a timestamp

cschwartz
5 - Atom

I'm trying to identify duplicate values in my dataset in order to pull out the most recent date in a separate field. My goal is to keep the entry with the earliest date stamp so that I have a final file of unique records.

 

In excel I would typically conditionally format the column for duplicates and then compare the two date fields. I'm struggling to come up with a solution to achieve this with Alteryx.

 

Any ideas are greatly appreciated!

12 REPLIES 12
al_sweets
8 - Asteroid

Hi @tom_montpool ...

 

I am using your method you stated below. I see as part of the Unique tool though, it sorts on the fields you select in the tool's configuration.

 

So when does the Unique tool know to sort on the selected Unique Fields, or keep the order that is inputted in?

 

tom_montpool
12 - Quasar

@al_sweets -- The Unique tool uses the sort order of the incoming data file, then, when you select specific fields in the Unique tool, it takes the first occurrence of the values, as sorted in the input, of the fields you have selected.

 

I'd honestly suggest that you take a small sample dataset and play around with the tool.

 

If you go to Help>Tutorials>Input and View Data>Viewing Data there's a sample dataset that you can sample from that would be good. On my machine, the data file is here:

 

C:\Program Files\Alteryx\bin\..\Samples\en\SampleData\TutorialData.yxdb

 

If you sample the top 10 records and choose the Country field in the Unique tool, you would get UserID=108 as the duplicate.

If you sample the top 10 records, sort on Birth Date, and choose the Country field in the Unique tool, you would get UserID=107 as the duplicate.

 

This is because both are from Country='CH' and in the first case, the input file is sorted on UserID so 107 comes before 108, but in the second case, 1950 comes before 1967, so UserID=108 is the 'first' record from Country='CH'.

al_sweets
8 - Asteroid

Hi @tom_montpool ,

 

Thanks for your response. I understand what you mean, and will definitely check out some sample data.

 

I understand the Unique tool takes the first occurrence of the unique fields selected, using the sort order that was inputted into the tool. However, reading the online documentation for the Unique tool here, it says "The data is sorted based on the Unique fields." -- hence my confusion! This sounds like no matter the sort order inputted into the tool, there is an implicit sort as part of the tool which sorts on unique fields selected as part of the tool's configuration.

Labels
Top Solution Authors