Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Fuzzy Match (Exact Match Style) not equal to Join tool

trevorwightman
8 - Asteroid

Hello,

 

I have two data files A and B where I am trying to bring one field from B over to A using common fields (address 1, addres 2, city, state, and zip). I was going to do a waterfall match where I bring over all of the exact matches and then slowly use fuzzy matching to bring over addition records with each pass and looser criteria on matching.

 

For my first pass I was hoping to use the fuzzy tool but some I am getting unexpected results:

 

When I use the join tool on those fields for the first pass it produces 29,332 unique matches (I verified this is correct by using the vlookup formula in Excel).

 

When I use the Fuzzy Match tool on those same fields using the Match Style of "Exact" I would expect to receive the same results, however I am only picking up 16,563 records.

 

Is it a wrong assumption that the Exact Match Style on the Fuzzy Matching Tool is synonymous with an exact match like with the Join tool?

1 REPLY 1
lmorrell
11 - Bolide

Hi @trevorwightman 

 

Example workflow is attached.

 

Your last line is spot on - the fuzzy match tool with an "Exact" match style is different to a join. However, you may be happy to know that the configurabilty of the tool allows the two to become very similar.

 

If you are using the in-built "Exact" match style - then the configuration will be as below.

Exact Match.png

 

The key point to note is that the Matching Keys are generated via a formula looking at "Double Metaphones with Digits" - which to my understanding is based more on the sound components/phonetics of the word than the letters used. To make the fuzzy match closer to a join tool we can switch the key generation formula to "Alphanumeric Only (Case Insensitive)" which will result in a case insensitive match based on the letters contained within each of the cells. While not the same as a join, it should be really close. 

 

However, from personal experience - I would recommend using your idea of a waterfall approach. The initial Join Tool will take care of all exact matches, netting some easy wins, and reducing the time used when matching across your two datasets with the Fuzzy Match Tool.  

 

The attached workflow attempts to show the difference between an Exact Fuzzy Match Tool vs. a Join Tool. 

 

Hope this helps! 

Labels