Alteryx Designer Desktop Discussions

SOLVED

Selections for Fuzzy Matching on Parsed Address?

trevorwightman
8 - Asteroid

Hello,

 

I am using the below selections for fuzzy matching on a parsed address. Is there a better route to take? Does anyone recommend a better setup? Should I concatenate the address to one field and proceed that way? Any advice would be great, thank you!

 

[Screenshot of the Fuzzy Match selections: trevorwightman_0-1580739379451.png]

fmvizcaino
17 - Castor

Hi @trevorwightman ,

 

Your configuration seems ok! 

One detail about your 'exact' method columns: you need to make sure the values in those columns are written the same way on both sides, because an exact comparison will not tolerate any difference.

I would also suggest thinking about the weight of the address part field. That information may not be as important as the rest of the address, and its threshold should be lower, since there is a high chance that differently written values represent the same address. So for that field, use a low match threshold and a weight based on how important you think it is.
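To make the weight and threshold idea concrete, here is a minimal Python sketch. It is not how the Fuzzy Match tool scores records internally; the field names, the weights, and the 0.8 threshold are all assumptions chosen for illustration, and `difflib.SequenceMatcher` simply stands in for whichever match style you select.

```python
from difflib import SequenceMatcher

# Hypothetical weights: the street carries most of the signal,
# the address part (unit, suite, etc.) the least.
WEIGHTS = {"street": 0.6, "city": 0.3, "address_part": 0.1}

def similarity(a, b):
    """Character similarity in [0, 1] after trimming and lower-casing."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def weighted_score(left, right, weights=WEIGHTS):
    """Weighted average of the per-field similarities."""
    total = sum(weights.values())
    return sum(w * similarity(left[f], right[f]) for f, w in weights.items()) / total

def is_match(left, right, threshold=0.8):
    """Treat two records as a match when the weighted score clears the threshold."""
    return weighted_score(left, right) >= threshold

a = {"street": "11107 Manchester Blvd", "city": "Springfield", "address_part": "Apt 4"}
b = {"street": "11107 Manchester Blvd", "city": "Springfield", "address_part": "Unit 4B"}
print(is_match(a, b))  # the low-weight field differs, the record still matches
```

Because the address part only contributes 10% of the score here, a differently written unit cannot sink an otherwise identical address.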

 

Best,

Fernando Vizcaino

TomWelgemoed
12 - Quasar

Hi @trevorwightman ,

 

I would suggest starting with fewer fields and adding more only when the matching isn't accurate enough or performance is suffering. Personally, I found that matching didn't work as well when I added too many fields (probably for the same reason @fmvizcaino mentioned).

 

A neat trick could be to create a "match key", e.g. the first 3 digits of the postcode, the first few consonants of the street name, and the house number. That effectively creates a grouping that you can work with, and you can then apply more specific checks in a Formula tool afterwards.
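As a rough Python sketch of such a match key (the function name, the `|` separator, and the choice of four consonants are my assumptions, not a fixed recipe):

```python
import re

def match_key(postcode, street, house_number, n_consonants=4):
    """Build a coarse grouping key from three parsed address fields."""
    # First 3 characters of the postcode, ignoring whitespace.
    post = re.sub(r"\s+", "", postcode).upper()[:3]
    # First few consonants of the street name (Y is treated as a consonant).
    letters = re.sub(r"[^A-Za-z]", "", street).upper()
    consonants = "".join(ch for ch in letters if ch not in "AEIOU")[:n_consonants]
    # Digits of the house number only.
    number = re.sub(r"\D", "", house_number)
    return f"{post}|{consonants}|{number}"

print(match_key("90304", "Manchester Blvd", "11107"))            # 903|MNCH|11107
print(match_key("90304-1234", "manchester boulevard", " 11107 "))  # 903|MNCH|11107
```

Records that share a key land in the same group, so the more expensive comparisons only need to run within each group.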

MarqueeCrew
20 - Arcturus

@trevorwightman ,

 

My friend @chris_love and I debated the use of fuzzy matching at Inspire. What you're asking about is a case where I would recommend NOT using fuzzy matching. There are good alternatives to fuzzy matching with GOOGLE and HERE. HERE is a bit less expensive (225K queries/month for free). You can send the "un-parsed" address through the API and get PARSED and cleansed data back. This will STANDARDIZE/NORMALIZE your addresses, and you can then match via a JOIN.
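The standardize-then-join idea can be sketched in Python as follows. The API call itself is not shown: `standardize` is a hypothetical local stand-in for whatever the geocoding service returns, and its small abbreviation table is an assumption for illustration only.

```python
import re

# Hypothetical abbreviation table; a real geocoder returns far richer output.
ABBREVIATIONS = {"street": "st", "road": "rd", "court": "ct",
                 "boulevard": "blvd", "west": "w"}

def standardize(address):
    """Lower-case, strip punctuation, and abbreviate common street words."""
    tokens = re.sub(r"[^\w\s]", "", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

def exact_join(left, right):
    """Join two address lists on their standardized form (an exact match)."""
    index = {}
    for r in right:
        index.setdefault(standardize(r), []).append(r)
    return [(l, r) for l in left for r in index.get(standardize(l), [])]

pairs = exact_join(["10 West 100th St"], ["10 W. 100th Street", "100 W 10th St"])
print(pairs)  # [('10 West 100th St', '10 W. 100th Street')]
```

Once both sides are written the same way, the join is exact, so "100 W 10th St" is never pulled in as a near miss.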

 

When I first saw fuzzy matching, I was so impressed with 10 W 100th St. matching to 100 W 10th St. But if both are real and different, do you really want to join on them? When you're matching only on address, I'd imagine that 11107 Manchester Blvd will match to many different addresses. The longer the street name (and the more matching fields), the more the individual house numbers will merge. You'd have to break the house number element into its own field. Then you'll find the units merging.
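Here is a small Python sketch of that over-matching problem and of splitting out the house number. It uses `difflib.SequenceMatcher` as a generic character similarity, not the Fuzzy Match tool's own algorithm; the helper names and the 0.85 threshold are assumptions.

```python
from difflib import SequenceMatcher

def ratio(a, b):
    """Character similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def split_house_number(address):
    """Split '10 W 100th St' into ('10', 'W 100th St')."""
    number, _, rest = address.strip().partition(" ")
    return number, rest

def same_address(a, b, threshold=0.85):
    """Require an exact house number, then a fuzzy match on the remainder."""
    num_a, street_a = split_house_number(a)
    num_b, street_b = split_house_number(b)
    return num_a == num_b and ratio(street_a, street_b) >= threshold

# As whole strings, the two different addresses look very similar.
print(ratio("10 W 100th St", "100 W 10th St") > 0.85)   # True
# With the house number in its own exact field, they no longer match.
print(same_address("10 W 100th St", "100 W 10th St"))   # False
print(same_address("10 W 100th St", "10 W 100th St."))  # True
```

The same treatment would be needed for units, and street names that differ only in their suffix would still need their own rule.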

 

I used to live on Bajio Rd.  I lived on the corner of Bajio Rd and Bajio Ct.  These are more fuzzy nightmares for you.

 

Either use the "free" data available with "effort" via the API or consider buying the address data bundle through Alteryx.

 

My two cents.

 

Cheers,


Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
TomWelgemoed
12 - Quasar

Hi @MarqueeCrew ,

 

I fully understand your point - I'd been through the same **bleep** - but I suppose that's just the nature of the type of work that you're trying to do. It's not clear-cut and your job is to prevent over-matching, like in the cases you mentioned. This means you might get fewer matches than you'd like for the sake of quality. And I think the fuzzy matching tool can achieve this well. Bear in mind that matching is not always for addresses - it can be for names, companies etc.

 

So maybe I'm siding with @chris_love on this one 🙂

 

Best,

Tom

MarqueeCrew
20 - Arcturus

@TomWelgemoed ,

 

Don't get me wrong. I like a good fuzzy match. When you've got names and addresses, it's great. I prefer names and addresses that have been equally cleansed. I take my ice cold matches through a join and then go do a variety of fuzzy matches on what's left.
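A rough Python sketch of that join-first, fuzzy-second order is below. It assumes each record is a single, already cleansed "name|address" string; the function name and the 0.9 threshold are hypothetical, and `difflib` again stands in for the fuzzy step.

```python
from difflib import SequenceMatcher

def cascade_match(left, right, threshold=0.9):
    """Exact matches first, then fuzzy matching on the leftovers only."""
    right_set = set(right)
    exact = [(l, l) for l in left if l in right_set]
    matched = {l for l, _ in exact}
    left_rest = [l for l in left if l not in matched]
    right_rest = [r for r in right if r not in matched]

    def score(a, b):
        return SequenceMatcher(None, a, b).ratio()

    fuzzy = []
    for l in left_rest:
        # Best remaining candidate on the right, if any clears the threshold.
        best = max(right_rest, key=lambda r: score(l, r), default=None)
        if best is not None and score(l, best) >= threshold:
            fuzzy.append((l, best))
    return exact, fuzzy

left = ["mark smith|11107 manchester blvd", "tom jones|12 bajio rd"]
right = ["mark smith|11107 manchester blvd", "tom jones|12 bajio road"]
exact, fuzzy = cascade_match(left, right)
print(exact)  # the identical record
print(fuzzy)  # the 'rd' / 'road' pair
```

Taking the exact matches out first keeps the fuzzy step small and stops already settled records from being re-matched to a near neighbour.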

If you're only matching on address, then caution is needed.

cheers,

 

mark
