Hi All,
I am trying to find duplicates at three levels: EXACT MATCH, FUZZY MATCH at 95%, and FUZZY MATCH at 90%. I have completed the EXACT MATCH step by sorting on the required fields and using a Multi-Row Formula tool.
But I am stuck on the fuzzy match, because it produces many more duplicate pairs and I find it difficult to append the original columns after the Fuzzy Match tool. It also throws the error "You have found a bug, pls replicates let us know we shall fix it".
The end goal should be in the format below:
1. Each group of similar rows should get its own Cluster_ID. For example, if 10 rows are similar to each other after the 95% fuzzy match, they should all share one unique ID.
2. Any group with more than one row is a set of duplicates. A unique row that does not match any other row should have a count = 1, and I will ignore those rows in my analysis.
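To make the grouping I have in mind concrete, here is a rough sketch in Python. It is not my actual workflow. It assumes the fuzzy match step gives me pairs of record IDs that scored at or above the threshold, and the names `assign_clusters`, `record_ids` and `matched_pairs` are made up for illustration. Records linked by a matched pair, directly or through a chain of pairs, end up with the same Cluster_ID.

```python
from collections import Counter


def assign_clusters(record_ids, matched_pairs):
    """Give every record a Cluster_ID using union-find over matched pairs."""
    # Every record starts as its own cluster root.
    parent = {rid: rid for rid in record_ids}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the two clusters of every matched pair.
    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    # Number the clusters 1, 2, 3, ... in order of first appearance.
    cluster_of_root = {}
    result = {}
    for rid in record_ids:
        root = find(rid)
        if root not in cluster_of_root:
            cluster_of_root[root] = len(cluster_of_root) + 1
        result[rid] = cluster_of_root[root]
    return result


# Hypothetical data: 6 records, 3 pairs that passed the fuzzy threshold.
record_ids = [1, 2, 3, 4, 5, 6]
matched_pairs = [(1, 2), (2, 3), (5, 6)]

clusters = assign_clusters(record_ids, matched_pairs)
cluster_sizes = Counter(clusters.values())
print(clusters)       # {1: 1, 2: 1, 3: 1, 4: 2, 5: 3, 6: 3}
print(cluster_sizes)  # cluster 1 has 3 rows, cluster 2 has 1, cluster 3 has 2
```

One thing to note with this kind of grouping: if A matches B and B matches C, then A and C land in the same cluster even if A and C on their own score below the threshold. Since each record keeps its ID, the original columns can be joined back on that ID afterwards.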
Fuzzy Match = 95%


Below is a screenshot of my workflow, along with the expected END RESULTS for your reference.

Please help me with this tricky scenario, as I am stuck on it.
I have 6,462,515 rows of data. With the fuzzy match at 95%, I want to find the duplicates and the non-duplicates.
For example, if there are 6,000,000 duplicates, there should be 462,515 non-duplicates. In the end I need to showcase both the duplicates and the non-duplicates, and their total should exactly match the input row count.
If you think I should change the Match Style, please advise. The data contains only Name and Address fields.
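The reconciliation I am after can be sketched like this in Python. Again this is only an illustration, assuming each record already carries a Cluster_ID; the function name `split_counts` and the sample data are hypothetical. Rows in clusters of size 2 or more count as duplicates, rows in clusters of size 1 count as non-duplicates, and the two numbers must add up to the total row count.

```python
from collections import Counter


def split_counts(cluster_by_record):
    """Return (duplicate_rows, non_duplicate_rows) from a record -> Cluster_ID map."""
    sizes = Counter(cluster_by_record.values())
    # Every row in a cluster with more than one member is a duplicate row.
    duplicates = sum(n for n in sizes.values() if n > 1)
    # A cluster of size 1 is a single row that matched nothing else.
    non_duplicates = sum(1 for n in sizes.values() if n == 1)
    # The two counts must reconcile exactly with the input row count.
    assert duplicates + non_duplicates == len(cluster_by_record)
    return duplicates, non_duplicates


# Hypothetical record ID -> Cluster_ID assignments for 6 rows.
example = {1: 1, 2: 1, 3: 1, 4: 2, 5: 3, 6: 3}
dup_rows, non_dup_rows = split_counts(example)
print(dup_rows, non_dup_rows)  # 5 1
```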