Separate unique records and duplicates based on a particular condition

Question

Hi guys,

I have a difficult but really interesting scenario and would love to hear your advice on it.

The goal: find all the duplicates and remove them based on their rating score.

The data set looks kinda like this:

ClientID    Rate
00001       8
00002       5
00003       4
00004       8
00005       6
00006       7
00007       7
00008       7
00009       0
00010       2
00011       5

After using the Fuzzy Match and Unique tools to filter out the duplicates, we got this matching result:

ClientID1    ClientID2    Rate ClientID1    Rate ClientID2
00001        00002        8                 5
00001        00003        8                 4
00002        00003        5                 4
00004        00007        8                 7
00005        00006        6                 7
00008        00009        7                 0
00008        00010        7                 2
00008        00011        7                 5
00009        00010        0                 2
00009        00011        0                 5
00010        00011        2                 5

So now we know all the potential matching pairs. In each group, the client with the highest rate will be the survivor, and the rest will be deleted.

E.g. in the group of 00001, 00002 and 00003, 00001 has the highest rate, so it will be the survivor, and 00002 and 00003 will be deleted.

=> Expected outcome

Survival    Duplicates
00001       00002, 00003
00004       00007
00006       00005
00008       00009, 00010, 00011

Any idea on how to achieve this?

Thank you so so much in advance!!!

DustinNg_ · Answer

I realized that it can be solved quite effectively by bringing Python into the workflow and doing all the coding there. That gives us a lot of flexibility.

However, I would still love to hear your advice on how we can solve it using Alteryx alone.
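
For anyone following along, here is a minimal sketch of what the Python route could look like. It is not the code from the actual workflow. The function name `find_survivors`, the union-find grouping, and the tie-break rule (equal rates go to the smallest ClientID) are all assumptions made for illustration, since the thread does not say how ties should be handled. The idea is to chain the matched pairs into groups (00001-00002 and 00002-00003 end up in the same group), then keep the highest-rated client in each group.

```python
from collections import defaultdict


def find_survivors(rates, pairs):
    """Chain matched pairs into groups and keep the highest-rated client per group.

    rates: dict mapping ClientID -> Rate
    pairs: iterable of (ClientID1, ClientID2) matches
    Returns a dict mapping survivor ClientID -> sorted list of its duplicates.
    Clients that never appear in a pair are not included in the result.
    """
    parent = {}

    def find(x):
        # Return the representative of x's group, registering x if it is new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the chain as we walk up
            x = parent[x]
        return x

    # Merge the two groups of every matched pair.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of each group under its representative.
    groups = defaultdict(list)
    for client in list(parent):
        groups[find(client)].append(client)

    result = {}
    for members in groups.values():
        # Highest rate wins; ASSUMPTION: ties go to the smallest ClientID.
        survivor = min(members, key=lambda c: (-rates[c], c))
        result[survivor] = sorted(c for c in members if c != survivor)
    return result


# Sample data copied from the question.
rates = {
    "00001": 8, "00002": 5, "00003": 4, "00004": 8, "00005": 6, "00006": 7,
    "00007": 7, "00008": 7, "00009": 0, "00010": 2, "00011": 5,
}
pairs = [
    ("00001", "00002"), ("00001", "00003"), ("00002", "00003"),
    ("00004", "00007"), ("00005", "00006"),
    ("00008", "00009"), ("00008", "00010"), ("00008", "00011"),
    ("00009", "00010"), ("00009", "00011"), ("00010", "00011"),
]

survivors = find_survivors(rates, pairs)
for survivor, duplicates in survivors.items():
    print(survivor, ", ".join(duplicates))
```

With the sample data from the question, this gives the same survivor and duplicate groups as the expected outcome. Grouping by chained pairs rather than by the first ClientID column matters here, because a client such as 00009 appears on both sides of the pair list.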

Qiu · Answer

@DustinNg_ 
I would like to, but I am having a busy day. Maybe on the weekend.
Before that, I am sure someone else will offer a better one.

DustinNg_ · Answer

Would you like to give it a try? 😉