
Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

How to remove some 'duplicate' rows

Hi All,

 

I am new to Alteryx.

I have a table containing over 100K rows. Among these rows I found some 'duplicate' rows; please see the example below.

In the table below, row 1 and row 2 contain similar data, but in row 2 the fields type b and type c are blank, so I define row 2 as a 'duplicate' row.

By the same logic, row 4 is also a 'duplicate' row.

product id  user id  type a  type b  type c
123         5566     AA      BB      CC
123         5566     AA
456         7788     XX      qq1     qq2
456         7788     XX
999         8767     ZZ

 

I need to remove this kind of 'duplicate' row from my data set.

 

For this example, the rows below should be retained:

product id  user id  type a  type b  type c
123         5566     AA      BB      CC
456         7788     XX      qq1     qq2
999         8767     ZZ

 

How do I set up a workflow to remove these 'duplicate' rows? How can I identify such duplicate rows in Alteryx?

 

Many thanks in advance,

 

Peter

Alteryx Certified Partner

Hey @Peter2007 

 

There are a couple of options here for you.

 

You could use the Unique tool, but it only returns the first row of each set of duplicates. That might not work if the row that actually holds your data is the second one.

 

Or you could use the Summarize tool (my preferred choice). Its settings let you choose First, Last, Max, Min, etc., which gives you more control over which values are returned.
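
To make the Summarize idea concrete, here is a rough plain-Python analogy (not Alteryx configuration): group on product id, user id and type a, then take the maximum of type b and type c. The column split of the sample values (for example 123 / 5566 / AA / BB / CC), the use of empty strings for blank cells, and the name summarize_max are all assumptions made for illustration.

```python
from collections import defaultdict

# Sample rows from the question; blank cells are represented as empty strings
# (an assumption; real data may hold nulls instead).
rows = [
    {"product id": "123", "user id": "5566", "type a": "AA", "type b": "BB", "type c": "CC"},
    {"product id": "123", "user id": "5566", "type a": "AA", "type b": "", "type c": ""},
    {"product id": "456", "user id": "7788", "type a": "XX", "type b": "qq1", "type c": "qq2"},
    {"product id": "456", "user id": "7788", "type a": "XX", "type b": "", "type c": ""},
    {"product id": "999", "user id": "8767", "type a": "ZZ", "type b": "", "type c": ""},
]

KEY_FIELDS = ("product id", "user id", "type a")   # fields to group by
VALUE_FIELDS = ("type b", "type c")                # fields to aggregate


def summarize_max(rows):
    """Group rows on KEY_FIELDS and keep the maximum of each value field."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[f] for f in KEY_FIELDS)].append(row)

    result = []
    for key, members in groups.items():
        merged = dict(zip(KEY_FIELDS, key))
        for field in VALUE_FIELDS:
            # "" compares lower than any non-empty string,
            # so max() prefers a populated value over a blank one.
            merged[field] = max(m[field] for m in members)
        result.append(merged)
    return result


deduped = summarize_max(rows)
for row in deduped:
    print(row)
```

One caveat with this approach: if a group contains two different non-blank values for the same field, Max picks one per field independently, so the output row may combine values from different input rows.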

 

Give that a go and come back if you get stuck

 

Neil

 

Atom

Hi

 

Use the Sort tool first and select the "user id" and "type a" fields in it, then use the Unique tool and select the same two fields there.

 

This will give you the required results.
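
The sort-then-unique logic can be sketched in plain Python as below. This is only an analogy for the two tools, under some assumptions: blanks are empty strings, the function name sort_then_unique is made up, and the extra tiebreak on the number of blank fields is an addition that the reply does not mention. Without some such tiebreak, keeping the first row per key only works if the populated row already comes first; in Alteryx a similar effect could possibly be had by also sorting type b in descending order.

```python
def sort_then_unique(rows, key_fields, value_fields):
    """Sort rows by key, most complete row first, then keep the first row per key."""

    def blanks(row):
        # Number of empty value fields; fewer blanks sorts earlier.
        return sum(1 for f in value_fields if row[f] == "")

    ordered = sorted(
        rows,
        key=lambda r: tuple(r[f] for f in key_fields) + (blanks(r),),
    )

    seen = set()
    unique = []
    for row in ordered:
        key = tuple(row[f] for f in key_fields)
        if key not in seen:          # first row for this key wins
            seen.add(key)
            unique.append(row)
    return unique


# Sample data with the blank rows deliberately placed first.
sample = [
    {"product id": "123", "user id": "5566", "type a": "AA", "type b": "", "type c": ""},
    {"product id": "123", "user id": "5566", "type a": "AA", "type b": "BB", "type c": "CC"},
    {"product id": "456", "user id": "7788", "type a": "XX", "type b": "", "type c": ""},
    {"product id": "456", "user id": "7788", "type a": "XX", "type b": "qq1", "type c": "qq2"},
    {"product id": "999", "user id": "8767", "type a": "ZZ", "type b": "", "type c": ""},
]

kept = sort_then_unique(sample, ("user id", "type a"), ("type b", "type c"))
for row in kept:
    print(row)
```

Unlike a Max aggregation, this keeps one whole input row per key, so values from different rows are never mixed.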
