Announcement | We'll be doing maintenance between 2-3 hours, which may impact your experience. Thanks for your patience as we work on improving the community!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

How to match fields that have a typo

bh1789
8 - Asteroid

I have 2 sources of data that I am joining on matching Error ID field.  Some are not matching because of typos in the Error ID, but I still need to capture them as a match in the join.

 

Error ID from Source 1:  YRTK8dZnBTxCiGjGdx9GGx4j1Go=

Error ID from Source 2:  YRTK8dZnBTxCiGjGdx9GGx4j1Go (missing the equal sign)

 

ANOTHER EXAMPLE:

 

Error ID from Source 1:  4eNIe1joVHQCXUN0Z4Y+eGZI7eM=

Error ID from Source 2:  4eNIe1joVHQCXUN0Z4T+eGZI7eM=

 

How would I set this up so they would both result to the J output of the join tool?  Any assistance would be much appreciated.  Thank you.

3 REPLIES 3
Felipe_Ribeir0
16 - Nebula

Hi @bh1789 

 

If i understood well, you dont know exactly how the ID`s can be different from source 1 to source 2 right? You just know that they are very similar. So i guess that you gonna need to use the fuzzy match.

 

Here it is solving your exactly example, using this topic as reference: https://community.alteryx.com/t5/Alteryx-Use-Cases/Fuzzy-Match-Check-Row-by-Row/ta-p/270809

 

Felipe_Ribeir0_0-1670451962600.png

 

 

PanPP
Alteryx Alumni (Retired)

Hi @bh1789 

 

You can try to use the Append Fields tool which will add columns from a source input to every row in a target input. 

 

You can also leverage the Fuzzy match tool,

 

Are there any other columns within this data source that we can use to filter out or use an expression to gather all possible merges?

 

Hope this helps.

 

 

ChrisTX
15 - Aurora

Try the Fuzzy match tool.  There's a video under Learn > Academy > Videos, search for Fuzzy. 

With Fuzzy Match, try the Match Function = Words & Digits: Levenshtein Distance

 

In the two examples you posted, the Levenshtein Distance for each match would be 1, because it would require 1 keystroke to get the strings to match exactly.

 

ChrisTX_0-1670452171250.png

 

Chris

Labels