Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Fuzzy Match Against a Lookup - Help

fiorano
8 - Asteroid

Hi,

 

I am looking to use Fuzzy Match on a lookup Field.   For example, I have a column with RAW ID codes which I need to map to a master Lookup and create a new column.

 

Eg :

codemaptest.PNG

 

Is it possible to fuzzy match against the RAW and the LOOKUP field and create a new FINALCODE column?

 

I've attached the xlsx of the test data if anyone would like to provide a basic workflow.

 

Many thanks for any help!

 

7 REPLIES 7
jrgo
14 - Magnetar

Hi @fiorano,

 

Not sure if it's a use case for fuzzy matching. If my assumption is correct, you're trying to identify if the exact string of characters in your lookup column exist in your RAW field. The reason why I don't think FM is needed is because you need it to find EXACTLY that string.

 

The "Find Replace" tool (Join tool category) would seem to be the more appropriate tool to use.

image.png 

I have the tool configured to look for the "Find Value" in any part of the RAW field. If found, it'll append the "lookup" value instead of replacing.

 

Hope this helps!

 

Jimmy

fiorano
8 - Asteroid

Hi Jimmy,

 

Looks Like I was trying to over complicate things - thanks for this!!!

 

Out of interest, if for example my lookup has a rogue item in it, eg an extra 0 on the end :

 

RAWlookup
RHGFY950RHGFY950
RHGFY950FGGY178
RHGFY950WSDHHT87
_RHGFY950YFHHS412
RHGFY950HFFUYYR
RHGFY950RJJOODS
RHGFY950LMMJD873
RHGFY950XUHHDO93
RHGFY950IIHOOS03
RHGFY950THISONEJJOJASS03
RHGFY95020JSNKFFP67
RHGFY950HSGHL
RHGFY950IIIEDJJOS64
RHGFY950HIHISPAH78
RHGFY950_1UHUGE892
RHGFY950RHGFY9500

 

Could an extra step be added to your workflow?

 

Thanks again!

jrgo
14 - Magnetar

Thanks @fiorano and happy to help!

 

Generally speaking, you can add whatever steps you need to address issues. For what you mentioned specifically, I'm not sure what action/step you'd like it to take or how you would validate your 'lookup' table to identify "rogue" items.

fiorano
8 - Asteroid

Hi @jrgo,

 

Of course - I can see where your coming from.

 

One last scenario we may come across in our RAW column is has a value very similar (eg a typo where RHGFY950 has a 'Y' missing from it  and is RHGF950).  Would FM be used prior to the find and replace steps ?  I think this would be one of the main issues we could come up against....

 

RAWlookupFINALCODE
RHGFY950RHGFY950RHGFY950
RHGFY950FGGY178RHGFY950
RHGFY950WSDHHT87RHGFY950
_RHGFY950YFHHS412RHGFY950
RHGFY950HFFUYYRRHGFY950
RHGFY950RJJOODSRHGFY950
RHGFY950LMMJD873RHGFY950
RHGFY950XUHHDO93RHGFY950
RHGFY950IIHOOS03RHGFY950
RHGFY950THISONEJJOJASS03RHGFY950
RHGFY95020JSNKFFP67RHGFY950
RHGFY950HSGHLRHGFY950
RHGFY950IIIEDJJOS64RHGFY950
RHGFY950HIHISPAH78RHGFY950
RHGFY950_1UHUGE892RHGFY950
RHGF950HSGHLRHGFY950
jrgo
14 - Magnetar

@fiorano,

 

Yes, this is where FM may come into play. However, I'm not sure how effective your matches will be since ID codes aren't really grammatical values, which is what this tool is designed to assist with. Using the sample data set you provided, it seems like it would work, but you'd have to thoroughly validate the results when it's ran against your entire table.

 

It's first using the find replace tool to find the matches and then filtering to split out the records that did not get matched which would be the subset of records that would run through the FM tool.

image.png 

 

If you aren't familiar on how to use the FM tool, I'd strongly suggest looking at the KB and the trainings available in the "Academy" section. Unlike most of the other tools, FM will require a lot of trial and error to configure the tool to generate matches within your acceptable threshold range.

fiorano
8 - Asteroid

Thansks @jrgo !

 

This is more along the lines of what I'm after.  We have a lot of data coming from multiple sources and with 'dirty' data.  The solution you have posted will form the basis of a larger cleansing workflow!

 

Many thanks for your time and expertise!

 

fiorano
8 - Asteroid

HI @jrgo

 

Sorry to reply to this thread but I have posted another query which is based on this thread here :  https://community.alteryx.com/t5/Data-Preparation-Blending/Complex-Cleansing-and-Mapping-anyone-brav... .  Just wondering if you could have a quick look and offer any advice if possible?

 

Many thanks for your time once again.

 

 

Labels