
SOLVED

Fuzzy Match - 95%

Sarath27
8 - Asteroid

Hi All,

 

I am trying to find duplicates at three levels: exact match, fuzzy match at 95%, and fuzzy match at 90%. I have completed the exact matches by sorting on the required fields and using the Multi-Row Formula tool.

 

But I am stuck on the fuzzy match because it produces more duplicates, and I find it difficult to append the original columns after the Fuzzy Match tool. It also raises an error: "You have found a bug, pls replicates let us know we shall fix it".

 

The end goal should be in the format below:

 

1. Each group of similar rows should get a Cluster_ID. For example, if 10 rows are similar after the 95% fuzzy match, they should share one unique ID.

2. A group with 2 or more rows therefore consists of duplicates. A unique row that doesn't match any other row should have a count of 1, and I will ignore those rows in my analysis.
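
For readers who want to see the clustering goal outside Alteryx, here is a minimal Python sketch of the idea: given pairs of record IDs that a fuzzy match has already flagged as similar, link them with a union-find structure so every record ends up with a Cluster_ID. The function name `assign_cluster_ids` and the sample IDs and pairs are hypothetical, made up for illustration; this is not how any Alteryx tool is implemented.

```python
from collections import Counter


def assign_cluster_ids(record_ids, matched_pairs):
    """Give every record a Cluster_ID; records linked by a match share one."""
    # Each record starts out as the root of its own one-row cluster.
    parent = {rid: rid for rid in record_ids}

    def find(rid):
        # Walk up to the root, shortening the path as we go.
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]
            rid = parent[rid]
        return rid

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller ID as the root so cluster IDs are deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return {rid: find(rid) for rid in record_ids}


# Hypothetical input: five records, where 1~2 and 2~3 were flagged as similar.
record_ids = [1, 2, 3, 4, 5]
matched_pairs = [(1, 2), (2, 3)]

cluster_of = assign_cluster_ids(record_ids, matched_pairs)
cluster_size = Counter(cluster_of.values())  # rows per Cluster_ID
```

Note that linking is transitive: record 3 joins record 1's cluster through record 2 even though 1 and 3 were never compared directly. Records 4 and 5 stay in clusters of size 1, which are the rows to ignore.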

 

Fuzzy Match = 95%

Sarath27_1-1661126446454.png
Sarath27_2-1661126469112.png

 

 

 

Below is a screenshot of my workflow, along with the end results for your reference.

 

 

Sarath27_0-1661126373180.png

 

Please help me with this tricky scenario; I am stuck on it.

 

I have 6,462,515 rows of data. With a 95% fuzzy match, I want to find the duplicates and the non-duplicates.

 

For example, if there are 6,000,000 duplicates, there should be 462,515 non-duplicates. In the end I need to showcase the duplicates and non-duplicates, and the total should match exactly.
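
The reconciliation described here (6,000,000 + 462,515 = 6,462,515) holds automatically if every row carries exactly one cluster ID and the split is made on cluster size. A small Python sketch of that check follows, assuming a mapping from record ID to cluster ID already exists; `split_duplicates` and the sample mapping are hypothetical names and data used only for illustration.

```python
from collections import Counter


def split_duplicates(cluster_of):
    """Split record IDs into duplicates (cluster size > 1) and non-duplicates."""
    sizes = Counter(cluster_of.values())
    duplicates = [rid for rid, cid in cluster_of.items() if sizes[cid] > 1]
    non_duplicates = [rid for rid, cid in cluster_of.items() if sizes[cid] == 1]
    return duplicates, non_duplicates


# Hypothetical mapping of record ID -> cluster ID.
cluster_of = {1: 1, 2: 1, 3: 1, 4: 4, 5: 5}

duplicates, non_duplicates = split_duplicates(cluster_of)

# Every row lands in exactly one bucket, so the two counts add up to the total.
assert len(duplicates) + len(non_duplicates) == len(cluster_of)
```

Because a cluster size is either 1 or greater than 1, no row can be counted twice or dropped, which is what makes the totals match.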

 

If you think I should change the Match Style, please advise. The data has only Name and Address fields.

 

 

 

 

1 REPLY
KilianL
Alteryx Alumni (Retired)

Hi @Sarath27 ,

 

I can't provide a complete solution, but the Make Group tool might help you get there. It works well together with the Fuzzy Match tool. You can find an example attached.

 

Fuzzy Matching - deduping.png
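
To illustrate the kind of pair list a fuzzy match step hands to a grouping step, here is a rough Python sketch that scores record pairs and keeps those at or above 95%. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure, which is an assumption: Alteryx's match styles score differently. The function name `similar_pairs` and the sample records are invented for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(records, threshold=0.95):
    """Return (id_a, id_b, score) for each pair scoring at or above threshold."""
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(records.items(), 2):
        # Compare case-insensitively; ratio() is a similarity in [0, 1].
        score = SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return pairs


# Hypothetical records: name and address concatenated into one string.
records = {
    1: "John Smith, 12 Main Street, Springfield",
    2: "John Smith, 12 Main Street, Springfeld",
    3: "Mary Jones, 4 Oak Avenue, Shelbyville",
}

pairs = similar_pairs(records)
for id_a, id_b, score in pairs:
    print(id_a, id_b, round(score, 3))
```

Comparing every pair like this grows quadratically with the row count, so it only suits small samples; with millions of rows the comparisons would need to be restricted to candidate groups first.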
