Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Removing duplicate values sorted by the date

acerdell
8 - Asteroid

Hello Everybody,

 

I have a set of data with load dates. I would like to get unique "pass" and "code" with the latest "date". Please see example below

 

PassCodeDate
TTR2342/9/2016
TTR45672/9/2016
TTR2342/10/2016
BRAK112/11/2016
BRAK112/12/2016
KAJS2342/12/2016
TTR2342/13/2016

 

 

This is how outcome should look like

 

PassCodeDate
TTR2342/13/2016
TTR45672/9/2016
BRAK112/12/2016
KAJS2342/12/2016

 

Thanks

6 REPLIES 6
patrick_digan
17 - Castor
17 - Castor

@acerdell You can use a summarize tool, group by Pass and Code and use MAX for Date. You could also sort by pass and code and date (descending on date) and then use the unique tool with pass and code selected as groupbys. You'll need to use the date time tool to convert it to an alteryx date (yyyy-mm-dd) if it's not already.

 

Let me know if that doesn't help!

acerdell
8 - Asteroid

I have other columns though...so I only want to filter by those 3 fields then expand my selection like we do in excel.

patrick_digan
17 - Castor
17 - Castor

@acerdell I think the unique option would work. It will group by Pass and Code, find first unique record it finds (this is why we put the sort in front of it so that the first record it grabbed would have the most recent date), and include all the columns of data from that row. I'm attaching a sample workflow.

 

Am I understanding your question correctly?

alex
11 - Bolide

Using summarize and then join - convert date to date format for max, but change it back to string for join

Edit:  This method would allow for all records for the pass and code that were on the max date  and had different data in the other columns - just in case that could happen :>

max data.JPGmax data2.JPG

Shurque
5 - Atom

I've used the solution that Alex suggested many times, however, in one case I had to work with millions of rows and the joins really slowed things down.

 

I found an alternative solution by using the Sort and Sample tools.

 

In the example, I sorted the Date field in Descending order and then use the Sample tool, choosing the First N Records (N = 1) and Group by Pass and Code.

 

Depending on the size of the dataset and the tools you feel more comfortable with using, you can consider this option as well.

 

 

patrick_digan
17 - Castor
17 - Castor

This reminds me of @chris_love's post from a few years ago where @MarqueeCrew and @Ned provided various responses to a similar hypothetical question which could also be quickly modified/applied to @acerdell's original question.

Labels