In my dataset, I have a field containing company names, and I would like to check the similarity between records within this single field. I thought fuzzy matching would be a good start. However, after running the fuzzy match, the number of records went down. Could anyone help me solve this problem, or suggest another way to check the similarity?
Example: Company Name
X1 Ltd
X2 Ltd
X1
Checking the similarity should identify X1 and X1 Ltd as the same company name.
Thanks in advance for any help.
Hi @Bpan, you are likely losing records from the fuzzy match because there are no other records which meet the defined match threshold (default is 80%).
I took your three sample records and added a fourth (ZZZZZ Ltd). This fourth record gets dropped by the fuzzy match tool because it is not similar enough to any of the other three records. To avoid losing it, you can union it back with your other results.
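To make the mechanism concrete outside of Alteryx, here is a rough Python sketch of the same idea: score every pair, keep only pairs above a threshold, then append the records that had no partner. It uses difflib's SequenceMatcher from the standard library as a stand-in scorer, which is an assumption on my part and not the algorithm the fuzzy match tool uses, so the scores and the pairs that clear 80% will differ from what you see in your workflow. The variable and function names are made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# The three sample names plus the extra fourth record from the reply.
names = ["X1 Ltd", "X2 Ltd", "X1", "ZZZZZ Ltd"]
THRESHOLD = 0.80  # mirrors the 80% default mentioned above


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (difflib, not Alteryx's scorer)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Only pairs that clear the threshold appear in the match output.
matched_pairs = [
    (a, b) for a, b in combinations(names, 2) if similarity(a, b) >= THRESHOLD
]

# Names with no partner above the threshold are missing from that output...
matched_names = {name for pair in matched_pairs for name in pair}
unmatched = [name for name in names if name not in matched_names]

# ...so append them back (the "union" step) to keep the full population.
all_names = [name for name in names if name in matched_names] + unmatched

print(matched_pairs)
print(unmatched)
print(len(all_names) == len(names))
```

With this particular scorer, "X1" and "X1 Ltd" only score 0.5 because of the length difference, so "X1" also ends up in the unmatched list. That is a property of the stand-in scorer, but it does show why the threshold and match style are worth tuning for short company names.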
I would also recommend exploring the 'Make Group' tool which would work well with the fuzzy match tool in this scenario.
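As a rough picture of what grouping matched pairs means, the sketch below takes pairs of matched names and merges them into groups, so that records linked through a chain of matches share one group even if they were never paired directly. This is my own illustration using a simple union-find, and I am assuming it resembles what the grouping step does rather than describing the tool's internals. The pairs are hypothetical input, and "ZZZZZ Limited" is an invented name added only to show a second group.

```python
from collections import defaultdict

# Hypothetical matched pairs, as a fuzzy match step might emit them.
pairs = [
    ("X1 Ltd", "X1"),
    ("X2 Ltd", "X1 Ltd"),
    ("ZZZZZ Ltd", "ZZZZZ Limited"),  # invented name, for illustration only
]

parent = {}


def find(x):
    """Return the representative of x's group, registering x if it is new."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps lookups short
        x = parent[x]
    return x


def union(a, b):
    """Merge the groups containing a and b."""
    parent[find(a)] = find(b)


for a, b in pairs:
    union(a, b)

# Collect every name under its group representative.
groups = defaultdict(list)
for name in list(parent):
    groups[find(name)].append(name)

grouped = sorted(sorted(members) for members in groups.values())
print(grouped)
```

Here "X1" and "X2 Ltd" are never paired directly, but both are linked to "X1 Ltd", so all three land in the same group.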
Thanks so much, that solved my issues.
Thanks for your suggestion.