Hi
I have a customer listing across a number of countries, on which I want to do a fuzzy match. At the moment I am treating each country separately.
I'm OK doing it for English alphanumeric characters. However, is there a way to do it for Eastern European characters with accents (e.g. é)? Does the fuzzy match treat these characters the same or differently?
I've not yet tried Chinese customers, but are they treated differently because they are symbols? Is there a way to match across Chinese characters too?
Any help would be appreciated.
Richard
While the Fuzzy Match tool will deal with languages other than English, it does need a Latin-based character set. You will want to use Double Metaphone to match.
From Wikipedia (https://en.wikipedia.org/wiki/Metaphone):
“Double Metaphone tries to account for myriad irregularities in English of Slavic, Germanic, Celtic, Greek, French, Italian, Spanish, Chinese, and other origin. Thus it uses a much more complex ruleset for coding than its predecessor; for example, it tests for approximately 100 different contexts of the use of the letter C alone.”
Chinese inherently doesn't work for this kind of fuzzy matching because each character stands for a whole word rather than a spelled-out sound. Phonetic fuzzy matching such as Double Metaphone only works with phonetically written languages.
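On the accented-character part of the question: one common workaround is to fold accents to their base letters before any fuzzy comparison, so that "é" and "e" compare as equal. The following is a minimal Python sketch of that idea, run outside the Fuzzy Match tool. It is an assumption on my part that this kind of pre-cleaning fits your workflow; the function names `fold_accents` and `similarity` are made up for illustration, and `difflib.SequenceMatcher` is a generic standard-library similarity measure, not the algorithm Alteryx uses.

```python
import unicodedata
from difflib import SequenceMatcher


def fold_accents(text: str) -> str:
    """Decompose characters (NFKD) and drop combining marks, e.g. 'é' -> 'e'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1], computed after accent and case folding.

    Illustrative helper only; not the scoring used by any particular tool.
    """
    left = fold_accents(a).casefold()
    right = fold_accents(b).casefold()
    return SequenceMatcher(None, left, right).ratio()


if __name__ == "__main__":
    print(fold_accents("Café Müller"))
    print(similarity("Société Générale", "Societe Generale"))
```

Note that this only handles letters that Unicode decomposes into a base letter plus an accent. Letters such as "ł" or "ø" have no such decomposition and would need an explicit replacement table.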
Can I ask whether Alteryx has solved the issue of fuzzy matching Chinese characters? I need to match company names written in either Simplified or Traditional Chinese characters. I have tried many times, and it seems the Alteryx Fuzzy Match tool is not even able to match two identical Traditional Chinese strings after data cleaning. Please advise.
No, not as far as I'm aware. It can't treat these like normal alphanumerics (but I may be mistaken). We have to bypass fuzzy matching for Chinese characters and use an exact match instead (which is a less-than-perfect solution).
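If working outside the Fuzzy Match tool is an option, a character-level similarity score can still be computed on Chinese strings, because it compares characters rather than sounds. Below is a rough Python sketch, offered as an assumption rather than a tested Alteryx solution. The `TRAD_TO_SIMP` table is a deliberately tiny, hand-written illustration covering three characters; a real job would need a full Traditional-to-Simplified conversion table or a dedicated conversion library. The function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny Traditional -> Simplified table.
# Only three characters are covered; this is for illustration only.
TRAD_TO_SIMP = {"國": "国", "銀": "银", "華": "华"}


def to_simplified(text: str) -> str:
    """Map each character through the table, leaving unknown characters as-is."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] after script normalization."""
    return SequenceMatcher(None, to_simplified(a), to_simplified(b)).ratio()


if __name__ == "__main__":
    # Same company name in Traditional and Simplified script.
    print(char_similarity("中國銀行", "中国银行"))
    # Two different company names that share some characters.
    print(char_similarity("中国银行", "中国农业银行"))
```

The first pair scores as identical once both names are in the same script, while the second pair gets a partial score. Whether such a score is meaningful for short company names is a judgment call, since a single differing character can change the meaning entirely.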