I'm trying to perform a fuzzy match between two input sources, but the tool returns incorrect results. I have tried different "Generate Keys" settings and Match functions, but none of them returns the expected results. I have attached the workflow with sample data.
Below are the different scenarios I ran through the Fuzzy Match tool with a Match Threshold of 80%.
Could you please help me configure the fuzzy match so that it returns only the correct matches?
Have you watched the Fuzzy Match videos under Academy? https://community.alteryx.com/t5/Videos/bd-p/live-training
If the Keys don't match, you won't get a match from the tool.
In many cases, the keys seem to be highly dependent on the first letter (or first few letters) of each word.
When you say "missing correct matches", what are the Keys for the missing matches?
It can be very helpful to select the option for Output Generated Keys.
Chris
Hi @urfriendumesh, when you see a comparison that is not correct (even though they have the same key) this just means that the comparison failed on that case. You then have three options:
I encourage you to take a look at the academy video mentioned by @ChrisTX.
I have watched the Fuzzy Match videos and designed the workflow using several Generate Keys and Match function settings. Below is one of the scenarios that results in a wrong match.
Input-1:
SCOTCOVE PTY LTD
CHALLENGER FUND

Input-2:
SOUTH HOOKE PTY LTD
CHALLENGER
CHALLENGER CAPGUARD

Fuzzy Match
Generate Keys = Soundex w/Digits
Match function = Words & Digits: Jaro Distance
With these settings, the Fuzzy Match tool returns the output below.
MatchScore | MatchKey | Input-1 | Input-2 |
82 | S321 | SCOTCOVE PTY LTD | SOUTH HOOKE PTY LTD |
90 | C452 | CHALLENGER FUND | CHALLENGER |
84 | C452 | CHALLENGER FUND | CHALLENGER CAPGUARD |
In practice, the matches for the client "CHALLENGER FUND" are correct, but the match that "Soundex" returns for the client "SCOTCOVE PTY LTD" is not. However, when I run the match using "Metaphone", it does not return any matches at all.
Please check the attached workflow for more examples. Please correct me if I'm doing anything wrong, and help me choose the right fuzzy match keys and function.
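To illustrate why the two unrelated names above end up with the same key, here is a minimal sketch of classic American Soundex applied to the whole string, ignoring spaces and other non-letters. This is an assumption about how the key is built, not Alteryx's actual implementation (in particular, it does not model the "w/Digits" handling), but it does reproduce the S321 and C452 keys shown in the output above.

```python
# Classic American Soundex, applied to the full string (non-letters ignored).
# This is an illustrative sketch, not Alteryx's implementation.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(text, length=4):
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    key = letters[0]                      # the first letter is kept as-is
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                      # H and W do not separate equal codes
        code = CODES.get(ch, "")          # vowels (and Y) have no code
        if code and code != prev:
            key += code
            if len(key) == length:
                break
        prev = code                       # a vowel resets the "previous code"
    return key.ljust(length, "0")


for name in ("SCOTCOVE PTY LTD", "SOUTH HOOKE PTY LTD",
             "CHALLENGER FUND", "CHALLENGER", "CHALLENGER CAPGUARD"):
    print(soundex(name), name)
# S321 SCOTCOVE PTY LTD
# S321 SOUTH HOOKE PTY LTD
# C452 CHALLENGER FUND
# C452 CHALLENGER
# C452 CHALLENGER CAPGUARD
```

Because the key is only four characters long, it is exhausted after the first few consonants, so very different names that start with similar sounds are grouped together and passed on to the match function.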
Hi @urfriendumesh, there won't be a single mode that will work for all your data. You can find one that will work for almost all your data and go from there.
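As a rough illustration of why two records that share a key can still score above the threshold, here is a minimal sketch of the textbook character-level Jaro similarity. It is an assumption that this resembles what the tool does internally; the "Words & Digits: Jaro Distance" option most likely compares word by word, so the numbers printed here are not expected to match the 82/90/84 scores above.

```python
def jaro(s1, s2):
    """Textbook Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


# Compare the full names, then the names without the shared "PTY LTD" suffix.
print(round(jaro("SCOTCOVE PTY LTD", "SOUTH HOOKE PTY LTD"), 3))
print(round(jaro("SCOTCOVE", "SOUTH HOOKE"), 3))
```

Comparing the two printed scores shows how much a common suffix such as "PTY LTD" contributes to the similarity of otherwise unrelated names.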
Thanks @gabrielvilella. Can you please help me modify my workflow so that it returns only the matched records and avoids the wrong matches?
The default value for Generate Keys is Double Metaphone w/Digits. Have you tried that one?
If you lower your Match Threshold to 80, you should get matches for the examples you posted.
With Fuzzy Matching, you may never get a perfect result for all of your data. It's not like other tools, where there is a definitive "correct" output.
Chris