Hello,
I was wondering what tool I should use for obtaining similarity between two fields. I know that you can use an IF conditional, but this will only put out T or F if it is an exact match. I want to be able to compare the first record from one column with the first record from another column and see what the resulting match percentage is.
Example:
ABC | ABC, 123 | |
DEF | DEF | |
GHI | GIH |
Result:
ABC | ABC, 123 | 80% |
DEF | DEF | 100% |
GHI | GIH | 90% |
I know that Fuzzy Match allows you to compare records from one column. I was wondering if there was something similar with a different tool, or if you can achieve this with Fuzzy Match.
Solved! Go to Solution.
Hi @JR1! Fuzzy matching can compare data from 2 columns too! Merge Mode in the tool allows you to compare records from different sources. The attached workflow accomplishes your question using Fuzzy Matching, giving all 3 rows a score! Because Fuzzy Matching is meant to exclude pairs that don't match very well, I had to configure the Fuzzy Match tool in a very particular way that opens up the tool so all rows are returned. This introduces a few extra tools that may not be intuitive, so I added a lot of comments. Let me know if you would like some more explanation on how Fuzzy Matching in Alteryx works and why the work-arounds in the workflow were necessary. I hope this solved your question!
Hi @DultonM,
The solution you provided works great, and is very well explained! Thank you very much!
Hello DultonM,
Thank you so much for this self-explanatory workflow. This is almost what I was looking for.
The only problem I am having it to properly configure the Fuzzy tool for strings comparison in my case.
I have attached the sample file and the scores. As you can see that the scores do not reflect the correct matching in a few cases.
Is there any better way to make the Fuzzy tool work on this sort of data?
Thanks & Regards,
Ashish
Thanks @DultonM your workflow is really helpful. I'm just trying to modify it so that I can compare all the data fields between two columns. Would that just be flipping it to purge?