Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Identifying duplicate record id's to override a timestamp

cschwartz
5 - Atom

I'm trying to identify duplicate values in my dataset in order to pull out the most recent date in a separate field. My goal is to keep the entry with the earliest date stamp so that I have a final file of unique records.

 

In excel I would typically conditionally format the column for duplicates and then compare the two date fields. I'm struggling to come up with a solution to achieve this with Alteryx.

 

Any ideas are greatly appreciated!

12 REPLIES 12
jdunkerley79
ACE Emeritus
ACE Emeritus

I would use a summarize tool to group on the ID and select the maximum date.

 

You can then use a join tool to join back to the input data on ID and date to max date.

 

The J(oin) output of the Join tool will then be a unique set of the newest records.

MarqueeCrew
20 - Arcturus
20 - Arcturus

@jdunkerley79,

 

I think that the minimum date (earliest) record is the way to go on this one.  You say tomato and i say caprese.

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
patrick_digan
17 - Castor
17 - Castor

I've included a sample workflow like @jdunkerley79 described. I went with Min instead of max on the summarize tool like @MarqueeCrew. I added a summarize tool after my join to account for duplicate dates. 

jdunkerley79
ACE Emeritus
ACE Emeritus

I went with Max as it said:  to pull out the most recent date but @patrick_digan's sample is exactly how to do it

MarqueeCrew
20 - Arcturus
20 - Arcturus

@jdunkerley79 & @patrick_digan

 

let's examine the requirements:

 

I'm trying to identify duplicate values in my dataset in order to pull out the most recent date in a separate field. My goal is to keep the entry with the earliest date stamp so that I have a final file of unique records.

Step 1:  pull out the most recent date.

Step 2: keep the entry with the earliest date stamp.

 

I propose that we put this to a vote.  At a minimum we could play rock-paper-scissors and settle this like men.

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
tom_montpool
12 - Quasar

You could also sort on the date field and then use the Unique tool.

 

The Unique record would be the most recent, the Duplicate(s) would be the older one(s).

 

The nice thing with Unique is that is keeps the data for the entire record together.

tom_montpool
12 - Quasar

You could also sort on the date field and then use the Unique tool.

 

The Unique record would be the most recent, the Duplicates would be the older ones.

 

The nice thing with Unique is that is keeps the data for the entire record together.

cschwartz
5 - Atom

Thank you! This worked perfectly. I really appreciate everyone's suggestions!

MarqueeCrew
20 - Arcturus
20 - Arcturus

i guess that solves it.  @cschwartz has broken the tie and @jdunkerley79 is the winner.  I've starred his post.  

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Labels
Top Solution Authors