
SOLVED

Fuzzy Match on ID combined with Char & Digits

FrankLu
5 - Atom

Input:

clipboard_image_0.png

 

 

I'm trying to find matches using the Merge mode of the Fuzzy Match tool.

I tried different "Generate Keys" options:

1. Double Metaphone w/ Digits -- Result: the key is based on the digits only, so it gives incorrect matches because every Keyword contains the digits 12345.

2. Whole Field (Case Insensitive) -- Result: no match

3. Alphanumeric Only (Case Insensitive) -- Result: no match


Alphanumeric Only MatchKeys:

clipboard_image_1.png

 

I thought Fuzzy Match would generate keys of different lengths so I could pick possible matches from the output.

I need some help here, as I can't figure out which "Generate Keys" / "Match Function" option to use in this case.


3 REPLIES
ChrisTX
15 - Aurora

Take a look at the Match Function "Words & Digits: Levenshtein Distance"

 

https://blogs.mathworks.com/cleve/2017/08/14/levenshtein-edit-distance-between-strings/

 

Since your Keywords seem to be alphanumeric codes, the "edit distance" represented by Levenshtein should be a good option.
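
To make the "edit distance" idea concrete, here is a minimal Python sketch of the textbook Levenshtein algorithm. It is not Alteryx's implementation of the match function, and the sample IDs are hypothetical stand-ins for the Keywords in the screenshot; it only illustrates why IDs that differ by one or two characters score as close.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


# Hypothetical IDs, not taken from the original data set.
print(levenshtein("AB12345", "AC12345"))  # 1 (one substitution)
print(levenshtein("A12345", "AB12345"))   # 1 (one insertion)
```

A small distance relative to the ID length suggests a likely match; where to set the cutoff depends on the data.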

BrandonB
Alteryx
What is the expected output? Could you give us some more information about the values you are looking to match and why they wouldn't be exact matches? I ask because if, for example, the first 5 characters of the matching strings are the same, it might be easier not to use Fuzzy Match at all. You could append the list of keywords to itself and then run it through a Filter tool with left([keyword],5) = left([right keyword],5). That would show you all of the keyword pairs where the first 5 characters are the same, without needing a fuzzy match.
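
Outside of Alteryx, the same append-and-filter idea can be sketched in a few lines of Python. This is only an illustration of the logic: the function name `prefix_matches` and the sample keywords are hypothetical, and the self-join is written as a plain Cartesian product rather than any Alteryx tool.

```python
from itertools import product


def prefix_matches(keywords, n=5):
    """Pair every keyword with every other keyword (a self-join) and
    keep the pairs whose first `n` characters are identical."""
    return [
        (left, right)
        for left, right in product(keywords, repeat=2)
        if left != right and left[:n] == right[:n]
    ]


# Hypothetical keywords, not taken from the original data set.
sample = ["AB12345X", "AB12399", "AB12345Y", "ZZ12345"]
for left, right in prefix_matches(sample):
    print(left, "<-->", right)
```

Each matching pair appears in both directions, as it would after appending a list to itself. The self-join grows with the square of the list size, so on a large list it may be worth grouping by the prefix first.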
FrankLu
5 - Atom

Thanks.

The results matched my expectations for these sample IDs:

1<-->1001

2<-->1002

3<-->1003

I will run it against the live data set tomorrow.


clipboard_image_0.png
