Bring your best ideas to the AI Use Case Contest! Enter to win 40 hours of expert engineering support and bring your vision to life using the powerful combination of Alteryx + AI. Learn more now, or go straight to the submission form.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Separate unique and duplicates based on particular condition

DustinNg_
6 - Meteoroid

Hi guys, 

 

I got this difficult but also really interesting scenario and would love to hear your advise on it. 

 

The goal: Finding out all the duplicates and remove them based on their rating score. 

 

The data set looks kinda like this: 

 

ClientID     Rate
000018
000025
000034
000048
000056
000067
000077
000087
000090
000102
000115

 

After using the Fuzzy match and Unique tool to filter out the duplicates, we got this matching result: 

 

ClientID 1ClientID2Rate ClientID 1Rate ClientID2
000010000285
000010000384
000020000354
000040000787
000050000667
000080000970
000080001072
000080001175
000090001002
000090001105
000100001125

 

So now we know all the potential matching pairs. In each group, client with highest score will be survival, the rest will be deleted.

Eg: in group of 00001, 00002, 00003, since 00001 has highest rate, it will be the survival and 00002, 00003 will be deleted.

 

=> Expected outcome

 

SurvivalDuplicates
0000100002, 0003
0000400007
0000600005
0000800009, 00010, 00011

 

Any idea on how to achieve this? 

 

Thank you so so much in advance!!!

4 REPLIES 4
Qiu
21 - Polaris
21 - Polaris

@DustinNg_ 
Looks like an interesting topic.
I think it can be done with an iterative macro

DustinNg_
6 - Meteoroid

Would you like to give it a try? 😉 

Qiu
21 - Polaris
21 - Polaris

@DustinNg_ 
I would like to, just having a busy day. maybe in the weekend.
before that, I am sure someelse would offer a better one.

DustinNg_
6 - Meteoroid

I realized that it can be solved quite effectively by importing Python into the workflow and do all the coding there. That gives us lots of flexibility.

 

However, still would love to hear you guys advise on how we can solve it by using Alteryx purely. 

 

 

Labels
Top Solution Authors