Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Removing Duplicates Based On Varying Criteria

FilipO
5 - Atom

A database containing sales info needs to be cleaned where:

- exact duplicates need to be removed;

- duplicates where a sales type is different but sale price, year, and quarter are the same;

- duplicates where a sales type is different and quarter is different, but price is the same; 

- duplicates where year, quarter, sale type are the same, but sale price increases. The entry with the higher value should remain.

 

Properties that are sold in a different year at a different price should remain. Properties that are sold in the same year at different quarters and are both S11 should remain. S11 sale type will be kept over other sale types where a duplicate exists. There are many adjoining columns containing additional info that need to kept intact. It would also be handy to view which entries have been removed. 

 

Current Data

Sale DateSale YearSale QuarterSale TypeSale PriceAddressAdditional Info
01/12/20212021Q4S11$3030 Andrew Avenuex
01/12/20212021Q4S11$3030 Andrew Avenuex
01/03/20142014Q1S13$70123 Stella Drivex
01/03/20142014Q1S11$70123 Stella Drivex
01/05/20102010Q2S13$5599 Problems Roadx
01/08/20102010Q3S11$5599 Problems Roadx
01/02/20052005Q1S12$3299 Problems Roadx
01/11/20192019Q4S11$641 Big Crescentx
01/11/20192019Q4S11$681 Big Crescentx

 

Desired Output

Sale DateSale YearSale QuarterSale TypeSale Price FinalAddressAdditional Info
01/12/20212021Q4S11$3030 Andrew Avenuex
01/03/2014 2014Q1S11$70123 Stella Drivex
01/05/20102010Q3S11$5599 Problems Roadx
01/02/20052005Q1S12$3299 Problems Roadx
01/11/20192019Q4S11$681 Big Crescentx

 

Would really appreciate some help with this - even a partial solution / method of action for some of the cleaning would be great! Many thanks. 

3 REPLIES 3
Luke_C
17 - Castor

Hi @FilipO 

 

Here's one option but I'm curious if anyone can come up with a more clever solution. Since the unique tool will keep the first unique record it finds, you can play around with sorting to apply your logic (i.e. sort the numbers high to low so the highest price is kept).

 

Luke_C_0-1656376839804.png

 

digitalmemo
8 - Asteroid

@FilipO I made an additional feature of bucketing the output and I put only unique tool so you could see records that are being compared before deduping. There seem to be more conditional cases that should be considered. Please provide the whole data set to anticipate the conditions. For now, feel pre to alter the workflow.

digitalmemo_0-1656392127251.pngdigitalmemo_1-1656392143247.png

 

 

grazitti_sapna
17 - Castor

Hi @FilipO, here is my solution to your problem not sure how this will act on a bigger dataset but able to resolve the problem for the provided sample data and there are many conditions applicable on the dataset.

grazitti_sapna_0-1656394752709.png

 

I hope this helps!

Thanks!

Sapna Gupta
Labels