I am new to Alteryx.
I have a table containing over 100K rows. In these rows, I found some 'duplicate' rows; please see an example.
In the table below, row 1 and row 2 contain similar data, but in row 2 the fields type b and type c are blank, so I define row 2 as a 'duplicate' row.
By the same logic, row 4 is also a 'duplicate' row.
I need to remove this kind of 'duplicate' row from my data set.
For this example, the rows below should be retained.
How do I set up a workflow to remove these 'duplicate' rows? How do I identify such a duplicate row in Alteryx?
Many thanks in advance.
There are a couple of options here for you.
You could use the Unique tool, but this only returns the first row of each set of duplicates. That might not work if the row you want to keep is the second one.
Or you could use the Summarize tool (my preferred choice): group by the key fields, then choose First, Last, Max, Min, etc. for the remaining fields, which gives you more control over which values are returned.
Give that a go and come back if you get stuck.
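In case it helps to see the Summarize idea outside Alteryx, here is a rough Python sketch of "group by the key fields, take Max of the rest". It is only an illustration, not how the Summarize tool is implemented. The sample values and the function name `summarize_max` are made up; the field names come from the question, and I am assuming "userid" and "type a" together identify a record.

```python
# Hypothetical sample rows; the values are invented for illustration.
rows = [
    {"userid": "u1", "type a": "A1", "type b": "B1", "type c": "C1"},
    {"userid": "u1", "type a": "A1", "type b": "", "type c": ""},
    {"userid": "u2", "type a": "A2", "type b": "B2", "type c": "C2"},
    {"userid": "u2", "type a": "A2", "type b": "", "type c": ""},
]


def summarize_max(rows, group_fields, max_fields):
    """Group rows by group_fields and keep the maximum of each max_field.

    A blank string sorts below any non-blank string, so the maximum
    picks the filled-in value whenever one exists in the group.
    """
    groups = {}
    for row in rows:
        key = tuple(row[f] for f in group_fields)
        if key not in groups:
            # Start a new group with the key values and blank aggregates.
            groups[key] = {f: row[f] for f in group_fields}
            groups[key].update({f: "" for f in max_fields})
        for f in max_fields:
            groups[key][f] = max(groups[key][f], row[f] or "")
    return list(groups.values())


result = summarize_max(rows, ["userid", "type a"], ["type b", "type c"])
for row in result:
    print(row)
```

One thing to watch: each field is aggregated on its own, so if a group has several non-blank rows with different values, the output can mix values from different rows.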
Use the Sort tool first: select the "userid" and "type a" fields in the Sort tool, then use the Unique tool and select the same two fields there.
This will give you the required results.
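For anyone who wants to check the sort-then-unique logic, here is a small Python sketch of the same idea. It is an illustration only, with invented sample values and a made-up function name `sort_then_unique`. Note one deliberate difference from the steps above: the sketch also sorts by the number of blank fields, on the assumption that the most complete row in each group is the one to keep. Sorting on "userid" and "type a" alone does not decide which row in a group comes first.

```python
# Hypothetical sample rows; the values are invented for illustration.
# The blank row for u1 deliberately comes first to show why the sort matters.
rows = [
    {"userid": "u1", "type a": "A1", "type b": "", "type c": ""},
    {"userid": "u1", "type a": "A1", "type b": "B1", "type c": "C1"},
    {"userid": "u2", "type a": "A2", "type b": "B2", "type c": "C2"},
    {"userid": "u2", "type a": "A2", "type b": "", "type c": ""},
]


def sort_then_unique(rows, key_fields, other_fields):
    """Sort by the key fields, most complete row first, then keep the
    first row seen for each key (what a "unique on key fields" step does)."""

    def sort_key(row):
        blanks = sum(1 for f in other_fields if not row[f])
        return (tuple(row[f] for f in key_fields), blanks)

    seen = set()
    kept = []
    for row in sorted(rows, key=sort_key):
        key = tuple(row[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


unique_rows = sort_then_unique(rows, ["userid", "type a"], ["type b", "type c"])
for row in unique_rows:
    print(row)
```

In the workflow itself, the equivalent would be to add the fields that can be blank to the sort so the filled-in row lands on top before the Unique step; the exact sort direction needed depends on how blanks are ordered in your data, so it is worth checking on a small sample.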