Alteryx Designer Desktop Discussions

AbhijeetChib · ‎03-30-2023

Good Day Everyone !

I have a dataset where I need to identify duplicates based on some conditions and use the oldest date for Tie breakers. Here is the Data

From a set of invoice numbers, I need to identify only AB/*. For all records with invoice numbers starting from AB we need to eliminate the duplicates while not touching all other formats of invoice numbers.

If there are multiple duplicates identified then i need to select the record that has the last modified date.

Here is the sample data

Invoice number	Modified Date
CA/12001234	23.03.2023
CA/12001234	23.03.2023
BD/120001234	23.03.2023
AP/120001235	23.02.2023
AB/120001235	23.03.2023
AB/12345678	23.03.2023
AB/12345678	23.02.2023
AB/12345678	23.03.2023
AB/12345678	23.03.2023
AB/7080000756	23.02.2023
AB/7080000756	23.03.2023

In the above dataset, invoices starting with CA,BD,AP do not need to be touched. For the AP ones, we need to identify the duplicates and for multiple duplicates the record with the oldest date needs to come through. In the above example, the below records need to come through.

AB/12345678	23.02.2023
AB/7080000756	23.02.2023

Thanks for your suggestions.

ShankerV · ‎03-30-2023

Hi @AbhijeetChib

One way of doing this.

Many thanks

Shanker V

DataNath · ‎03-30-2023

Hey @AbhijeetChib, here's how I'd tackle this:

- Isolate AB records

- Parse the date into ISO format for Alteryx to be able to work with it

- Summarize to Group By [Invoice Number] and pull out the minimum (oldest) date

- Reformat the date into your original display

- Union the handled AB data back to the rest of your data

Raj · ‎03-30-2023

sol

Alteryx Designer Desktop Discussions

Duplicate identification based on conditions

Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: How do I remove the tool descriptor?

Re: Single App that filters user selections dynami...