Alteryx Designer Discussions


Removing 'duplicate' rows based on values in 3 columns

7 - Meteor

Hi guys,

 

What is the best way to remove 'duplicate' rows from my dataset, please? I am working with a large dataset, and where several rows have the same values in 3 particular columns I want to keep only one of them and remove the rest. Please note that the rows will not be identical across ALL columns; I am judging duplicates only by the values in those 3 columns.

 

Example data set below:

 

Reference number  Data Source  Visitors  Transport  Season
1                 Library      58        Bike       Winter
20                Online       69        Car        Summer
3                 Library      71        Walk       Autumn
20                Online       5         Car        Spring
20                Online       11        Train      Unknown

 

I want to remove the duplicate rows where the values in the 'Reference number', 'Data Source' and 'Transport' columns are the same, keeping only one row to represent each combination. In the table below, the second and fourth data rows (both 20 / Online / Car) are identical in terms of the three columns I am concerned with:

 

Reference number  Data Source  Visitors  Transport  Season
1                 Library      58        Bike       Winter
20                Online       69        Car        Summer
3                 Library      71        Walk       Autumn
20                Online       5         Car        Spring
20                Online       11        Train      Unknown

 

I therefore want to remove one of the duplicate rows (I don't mind which one, as I am not concerned with the info in the 'Visitors' and 'Season' columns). For example, after removing the duplicate I would expect the table to look like this:

 

Reference number  Data Source  Visitors  Transport  Season
1                 Library      58        Bike       Winter
20                Online       69        Car        Summer
3                 Library      71        Walk       Autumn
20                Online       11        Train      Unknown

 

How could I do this please?

 

Thank you!

Alteryx Certified Partner

Hello @ccostello,

 

How about this?:

 

[Attached image: Untitled.png]

 

If this post helps, then please consider accepting it as the solution to help other members find it more quickly.

Regards

14 - Magnetar

Hi @ccostello 

 

I would recommend checking out the Unique tool. It lets you select which fields to check for duplicates and keeps only one record for each combination of values in those fields.
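
For anyone who wants to see the logic outside of Designer, here is a rough Python sketch of the "keep one record per combination of the selected fields" idea. It is only an illustration of the behaviour described above, not how the Unique tool is implemented; the function name `keep_first_per_key` and the `KEY_FIELDS` constant are made up for this example, and the rows are copied from the question.

```python
# Sample rows copied from the example table in the question.
rows = [
    {"Reference number": 1, "Data Source": "Library", "Visitors": 58, "Transport": "Bike", "Season": "Winter"},
    {"Reference number": 20, "Data Source": "Online", "Visitors": 69, "Transport": "Car", "Season": "Summer"},
    {"Reference number": 3, "Data Source": "Library", "Visitors": 71, "Transport": "Walk", "Season": "Autumn"},
    {"Reference number": 20, "Data Source": "Online", "Visitors": 5, "Transport": "Car", "Season": "Spring"},
    {"Reference number": 20, "Data Source": "Online", "Visitors": 11, "Transport": "Train", "Season": "Unknown"},
]

# The three fields that define a duplicate (hypothetical constant name).
KEY_FIELDS = ("Reference number", "Data Source", "Transport")


def keep_first_per_key(records, key_fields):
    """Split records into (unique, duplicates).

    The first record seen for each key combination goes to `unique`;
    every later record with the same key goes to `duplicates`.
    """
    seen = set()
    unique, duplicates = [], []
    for record in records:
        key = tuple(record[field] for field in key_fields)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            unique.append(record)
    return unique, duplicates


unique_rows, duplicate_rows = keep_first_per_key(rows, KEY_FIELDS)
for row in unique_rows:
    print(row)
```

With the sample data, the 20 / Online / Car row with 5 visitors ends up in `duplicate_rows`, and the other four rows are kept, which matches the expected table in the question.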

 

Otherwise, if you want to remove ALL records that are duplicates, I always use the CReW macro Only Unique. You can find the download for this and other awesome tools here: http://www.chaosreignswithin.com/
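
To make the difference between the two options concrete, here is a small Python sketch of the "remove ALL records that are duplicates" behaviour. This is just my illustration of the idea, assuming it means dropping every record whose key combination occurs more than once; it is not the macro's actual implementation, and `drop_all_duplicated` and `KEY_FIELDS` are made-up names.

```python
from collections import Counter

# Sample rows copied from the example table in the question.
rows = [
    {"Reference number": 1, "Data Source": "Library", "Visitors": 58, "Transport": "Bike", "Season": "Winter"},
    {"Reference number": 20, "Data Source": "Online", "Visitors": 69, "Transport": "Car", "Season": "Summer"},
    {"Reference number": 3, "Data Source": "Library", "Visitors": 71, "Transport": "Walk", "Season": "Autumn"},
    {"Reference number": 20, "Data Source": "Online", "Visitors": 5, "Transport": "Car", "Season": "Spring"},
    {"Reference number": 20, "Data Source": "Online", "Visitors": 11, "Transport": "Train", "Season": "Unknown"},
]

# The three fields that define a duplicate (hypothetical constant name).
KEY_FIELDS = ("Reference number", "Data Source", "Transport")


def drop_all_duplicated(records, key_fields):
    """Keep only records whose key combination appears exactly once."""
    def key_of(record):
        return tuple(record[field] for field in key_fields)

    # Count how many records share each key combination.
    counts = Counter(key_of(record) for record in records)
    return [record for record in records if counts[key_of(record)] == 1]


only_unique_rows = drop_all_duplicated(rows, KEY_FIELDS)
for row in only_unique_rows:
    print(row)
```

With the sample data, both 20 / Online / Car rows are dropped, so only three rows remain. That is different from what the question asks for, which is to keep one of the two.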
