Alteryx Designer Desktop Discussions


Fuzzy Match returning incorrect matches/results

urfriendumesh
7 - Meteor

I'm trying to perform a fuzzy match between two input sources, but it returns incorrect results. I tried different "Generate Keys" and Match Function settings, but none of them returns the expected results. I've attached the workflow with sample data.

 

Below are the scenarios I ran in the Fuzzy Match tool, each with a Match Threshold of 80%.

 

  • Generate Keys=Soundex. Match Function=Jaro Distance ==> Returns both correct and incorrect matches
  • Generate Keys=Soundex. Match Function=Levenshtein Distance ==> Returns no results, even though there are a few matches
  • Generate Keys=Double Metaphone. Match Function=Jaro Distance ==> Returns both correct and incorrect matches, and also misses some correct matches
  • Generate Keys=Double Metaphone. Match Function=Levenshtein Distance ==> Returns no results, even though there are a few matches
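
One possible reason the Levenshtein runs come back empty while the Jaro runs do not is that the two functions score the same pair very differently. The sketch below is a plain textbook implementation of both measures, not Alteryx's own code; the normalization of Levenshtein distance by the longer string's length is an assumption, and the tool's "Words" variants may score differently. The pair used is one of the sample names from this thread.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def levenshtein(s1: str, s2: str) -> int:
    """Edit distance: insertions, deletions and substitutions."""
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, 1):
        curr = [i]
        for j, c2 in enumerate(s2, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (c1 != c2)))  # substitution
        prev = curr
    return prev[-1]


def levenshtein_similarity(s1: str, s2: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(s1), len(s2))
    return 1.0 if longest == 0 else 1 - levenshtein(s1, s2) / longest


a, b = "CHALLENGER FUND", "CHALLENGER"
print(round(jaro(a, b), 3))                    # 0.889
print(round(levenshtein_similarity(a, b), 3))  # 0.667
```

Under these assumptions the pair clears an 80% threshold with Jaro but not with Levenshtein, because the five extra characters in " FUND" are a third of the longer string.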

 

Please help me configure the fuzzy match so that it returns only correct matches.

 

ChrisTX
16 - Nebula

Have you watched the Fuzzy Match videos under Academy? https://community.alteryx.com/t5/Videos/bd-p/live-training

 

If the Keys don't match, you won't get a match from the tool.

 

In many cases, the keys seem to be highly dependent on the first letter (or first few letters) of each word.

 

When you say "missing correct matches", what are the Keys for the missing matches?

 

It can be very helpful to select the option for Output Generated Keys.

 

[Screenshot: ChrisTX_0-1643654278490.png]

 

Chris

 

gabrielvilella
14 - Magnetar

Hi @urfriendumesh, when you see a pair that was matched incorrectly (even though both records have the same key), it just means the comparison failed in that case. You then have three options:

  • Try a new method
  • Discard that specific result
  • Discard all results with a match score equal to or lower than that one

I encourage you to take a look at the academy video mentioned by @ChrisTX
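
The third option in the list above can be sketched outside the tool as a simple score cutoff. This is only an illustration: the rows are the three results posted in this thread, and `discard_at_or_below` is a hypothetical helper name, not an Alteryx function. Inside a workflow, the same idea would be a filter on the match score column.

```python
# (MatchScore, MatchKey, Input-1, Input-2) rows as posted in this thread.
results = [
    (82, "S321", "SCOTCOVE PTY LTD", "SOUTH HOOKE PTY LTD"),
    (90, "C452", "CHALLENGER FUND", "CHALLENGER"),
    (84, "C452", "CHALLENGER FUND", "CHALLENGER CAPGUARD"),
]


def discard_at_or_below(rows, bad_score):
    """Keep only rows scoring strictly above a known-bad match score."""
    return [row for row in rows if row[0] > bad_score]


# The SCOTCOVE / SOUTH HOOKE pair (score 82) is the known wrong match.
kept = discard_at_or_below(results, 82)
for score, key, left, right in kept:
    print(score, key, left, "<->", right)
```

The cutoff removes the known wrong pair, but it would also remove any correct pair that happens to score 82 or lower, so it is worth reviewing what falls below the line before settling on a value.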

urfriendumesh
7 - Meteor

I have watched the Fuzzy Match videos and designed the workflow using several Generate Keys and Match Function settings. Below is one of the scenarios that results in a wrong match.

 

Input-1
SCOTCOVE PTY LTD
CHALLENGER FUND

 

 

Input-2
SOUTH HOOKE PTY LTD
CHALLENGER
CHALLENGER CAPGUARD

 

Fuzzy Match

Generate Keys = Soundex w/Digits

Match function = Words & Digits: Jaro Distance

 

The Fuzzy Match configured as above returns the output below.

MatchScore | MatchKey | Input-1          | Input-2
82         | S321     | SCOTCOVE PTY LTD | SOUTH HOOKE PTY LTD
90         | C452     | CHALLENGER FUND  | CHALLENGER
84         | C452     | CHALLENGER FUND  | CHALLENGER CAPGUARD

 

In practice, the client "CHALLENGER FUND" is a correct match, but the client "SCOTCOVE PTY LTD" is matched incorrectly when using "Soundex". When I run the match using "Metaphone" instead, it returns no matches at all.
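
To see why two unrelated names can land on the same key, here is a minimal sketch of standard American Soundex applied to the whole string with spaces ignored. Whether the Fuzzy Match tool generates its keys exactly this way is an assumption, but this version does reproduce the S321 and C452 keys shown in the output.

```python
# Letter-to-digit table for standard American Soundex.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(text: str) -> str:
    """First letter plus three digits, computed over letters only."""
    letters = [c for c in text.upper() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels have no code and reset `prev`
        if code and code != prev:
            key += code
            if len(key) == 4:
                break
        prev = code
    return key.ljust(4, "0")  # pad short keys with zeros


for name in ("SCOTCOVE PTY LTD", "SOUTH HOOKE PTY LTD",
             "CHALLENGER FUND", "CHALLENGER CAPGUARD"):
    print(name, "->", soundex(name))
# SCOTCOVE PTY LTD -> S321
# SOUTH HOOKE PTY LTD -> S321
# CHALLENGER FUND -> C452
# CHALLENGER CAPGUARD -> C452
```

Only the first letter and the next three coded consonants matter, so anything after roughly the first word has no effect on the key. Pairs that share a key are then passed to the match function, which is where the 82 score for the SCOTCOVE pair comes from.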

 

Please check the attached workflow for more examples. Correct me if I'm doing anything wrong, and help me choose the right fuzzy match keys and function.

 

gabrielvilella
14 - Magnetar

Hi @urfriendumesh, there won't be a single mode that will work for all your data. You can find one that will work for almost all your data and go from there. 

urfriendumesh
7 - Meteor

Thanks @gabrielvilella. Can you please help me modify my workflow so that it returns only the matched records and avoids the wrong matches?

ChrisTX
16 - Nebula

The default value for Generate Keys is Double Metaphone w/Digits.  Have you tried that one?

 

If you lower your Match Threshold to 80, you should get matches for the examples you posted.

 

With Fuzzy Matching, you may never get a perfect result for all of your data.  It's not like other tools, where there is a definitive "correct" output.

 

Chris
