Alteryx Designer Desktop Discussions

SOLVED

Fuzzy matching - Customer listing across multiple countries & languages

rmwillis1973
8 - Asteroid

Hi

I have a customer listing that spans a number of countries, and I want to run a fuzzy match on it. At the moment I am treating each country separately.

I'm OK doing it for English alphanumeric characters. However, is there a way to do it for Eastern European characters (e.g. é)? Does the fuzzy match treat these characters the same as their unaccented equivalents, or differently?

I've not yet tried Chinese customers, but are they treated differently because the characters are symbols? Is there a way to fuzzy match Chinese characters too?

 

Any help would be appreciated.

Richard

3 REPLIES
TaraM
Alteryx Alumni (Retired)

While the Fuzzy Match tool will deal with languages other than English, the text does need to be in a Latin-based character set. You will want to use Double Metaphone to match.
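
For accented Latin characters such as é, one common preparation step is to fold the accents away before comparing. The sketch below is plain Python run outside Alteryx, not a description of how the Fuzzy Match tool works internally. The helper names `fold_accents` and `similarity` are hypothetical, and `difflib.SequenceMatcher` is used as a simple stand-in similarity measure, not as Double Metaphone.

```python
import unicodedata
from difflib import SequenceMatcher


def fold_accents(text: str) -> str:
    """Decompose accented letters and drop the combining marks (é -> e)."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() gives a case-insensitive form for comparison
    return stripped.casefold()


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], computed on accent-folded strings."""
    return SequenceMatcher(None, fold_accents(a), fold_accents(b)).ratio()
```

With this folding, "Société Générale" and "Societe Generale" compare as identical. Letters that have no Unicode decomposition, such as ł or ø, pass through unchanged and would need an explicit replacement table.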

From Wikipedia (https://en.wikipedia.org/wiki/Metaphone):

“Double Metaphone tries to account for myriad irregularities in English of Slavic, Germanic, Celtic, Greek, French, Italian, Spanish, Chinese, and other origin. Thus it uses a much more complex ruleset for coding than its predecessor; for example, it tests for approximately 100 different contexts of the use of the letter C alone.”

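Chinese inherently doesn't work with this kind of fuzzy matching because each character represents a whole word or syllable rather than a single sound. Phonetic fuzzy matching such as Double Metaphone only works with phonetically spelled (alphabetic) scripts.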

Tara McCoy
lilyyangadsk
9 - Comet

Can I ask whether Alteryx has solved the issue of fuzzy matching Chinese characters? I need to match company names written in either Simplified or Traditional Chinese characters. I have tried many times, and it seems the Alteryx Fuzzy Match tool cannot even match two identical Traditional Chinese strings after data cleaning. Please advise.
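
One possible reason that visually identical Chinese strings fail to match is that they differ at the code-point level, for example full-width versus half-width letters or an ideographic space. This is an assumption about the cause, not a confirmed diagnosis of the Fuzzy Match tool. The sketch below is plain Python run outside Alteryx, and `match_key` is a hypothetical helper name.

```python
import unicodedata


def match_key(name: str) -> str:
    """Build a comparison key: NFKC-normalize, then remove all whitespace."""
    # NFKC maps compatibility forms (e.g. full-width "ＡＢＣ") to their standard forms
    normalized = unicodedata.normalize("NFKC", name)
    # split() with no arguments also treats the ideographic space U+3000 as whitespace
    return "".join(normalized.split())
```

Two names can then be compared with `match_key(a) == match_key(b)`. This does not convert between Simplified and Traditional characters; that would need a separate mapping table, which is not shown here.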

rmwillis1973
8 - Asteroid

No, not as far as I'm aware. It can't treat these like normal alphanumerics (but I may be mistaken). We have to bypass fuzzy matching for Chinese characters and use an exact match instead (which is a less-than-perfect solution).
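
As a middle ground between exact matching and the Fuzzy Match tool, a character-level score can be computed outside Alteryx, since it does not depend on phonetics. The sketch below is a hedged illustration in plain Python using a Dice coefficient over character bigrams. The function names are hypothetical, and it is not something the Fuzzy Match tool is known to provide.

```python
def bigrams(text: str) -> set:
    """Return the set of adjacent character pairs in the string."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def dice_similarity(a: str, b: str) -> float:
    """Dice coefficient over character bigrams, in [0, 1]."""
    if a == b:
        return 1.0
    pairs_a, pairs_b = bigrams(a), bigrams(b)
    if not pairs_a or not pairs_b:
        # Strings shorter than two characters have no bigrams to compare
        return 0.0
    shared = len(pairs_a & pairs_b)
    return 2 * shared / (len(pairs_a) + len(pairs_b))
```

For example, `dice_similarity("北京大学", "北京大學")` scores two of three bigrams as shared, because the Simplified 学 and Traditional 學 are different code points. Any threshold for accepting a match is a judgment call that depends on the data.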
