Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

compare 2 fields where in 1 are german umlaute (äöü) and in the other not (ae,oe,ue)

chvizda
8 - Asteroid

hi all

 

I have 2 datasets where in one the data are with german special characters like äöü.

In the other dataset the same word (name) with the international "translation" 

 

Examples:

Müller vs Mueller

Küng vs Kueng

König vs Koenig

Käse vs Kaese

 

Is there a way to (for example) join the 2 datasets even I have these differences ?

Or in my case I need to verify the 2 name fields for inconsistences and this should not come up with an error.

 

I dont want to find/replace the ü with ue ....

 

Thanks

 

Steffen

 

2 REPLIES 2
Thableaus
17 - Castor
17 - Castor

Hi @chvizda 

 

I think you can use Fuzzy Match in this one:

 

Fuzy1.PNG

 

- Add a Formula Tool to identify Source

- Union both datasets

- Use Merge Mode by Source in Fuzzy Match

- Use "Name" option

 

- After all you can join your datasets with the FuzzyMatch result

 

If the Threshold is not getting the results you are looking for, maybe you will need to make some adjustments on your Fuzzy Match Style

 

Cheers,

danilang
19 - Altair
19 - Altair

Hi @chvizda 

 

@Thableaus presented you with a great solution, as always

 

Now, I realize that you specifically said that "I dont want to find/replace the ü with ue ....", but is this because you want to use the German spelling in the rest of the workflow and replacing the characters would lose that?  If that's the case, consider this work flow.

 

WF.png

 

 

The characters to be converted are stored in the Lookup Characters list along with their equivalent "International" replacements.  The bottom half of the workflow builds an international equivalent of the accented name and adds it as a new field.  The last join uses this "international" to match with the international list.   From this point one, both the accented and international names are available for each record.  You can drop the international name completely from this point on

 

Results.png

 

Note that the 2 "...Value" fields are data I added into the input tables to show how their now both available on the records.  

 

The beauty of this technique is that you can expand the look-up list as needed, i.e. é=e, ç=c as required without having to manually change a Switch or IF in a formula tool

 

Dan

Labels