Alteryx Designer Desktop Discussions

SOLVED

Data match company name

emil
7 - Meteor

Dear all,

 

I have a question about which tool to use in Alteryx for matching large datasets. I have a source file of 4M+ records and a target file of 2K records. I need to pull information from the source data to update the target. The only way to match the two is by company name. As you may imagine, the company names can differ in punctuation ...

 

When I use the Join tool, only about 10% of the target data gets updated.

 

When I use the Fuzzy Match tool, it never finishes.

 

Your input is much appreciated.

 

Thanks,

Andy

2 REPLIES
afv2688
16 - Nebula

I would recommend using the Find Replace tool, configured to append fields to the record.

 

This should help you get more matches.

 

I would also add a Data Cleansing tool to remove all punctuation, preceded by another Find Replace to standardize misspellings and abbreviations.
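For anyone who wants to see the cleaning idea outside of Designer, here is a minimal Python sketch of the same steps: strip punctuation, unify abbreviations, then do an exact join on the cleaned key. The abbreviation map and the function names (`normalize_name`, `exact_join`) are made up for illustration and are not part of any Alteryx tool.

```python
import string

# Hypothetical abbreviation map (an assumption); extend it for your own data.
ABBREVIATIONS = {
    "incorporated": "inc",
    "corporation": "corp",
    "limited": "ltd",
    "company": "co",
}

# Replace every ASCII punctuation character with a space.
_PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace, unify abbreviations."""
    cleaned = name.lower().translate(_PUNCT_TABLE)
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in cleaned.split()]
    return " ".join(tokens)


def exact_join(source_names, target_names):
    """Map each target name to the first source name sharing its cleaned key."""
    index = {}
    for s in source_names:
        index.setdefault(normalize_name(s), s)
    result = {}
    for t in target_names:
        key = normalize_name(t)
        if key in index:
            result[t] = index[key]
    return result


if __name__ == "__main__":
    sources = ["ACME Incorporated", "Globex Corporation"]
    targets = ["Acme, Inc.", "Initech Ltd"]
    print(exact_join(sources, targets))
```

Joining on the cleaned key rather than the raw name is what should lift the match rate above what a plain join gives; whatever is still unmatched afterwards can go to fuzzy matching.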

 

Cheers

ThizViz
11 - Bolide

Fuzzy Match is very likely to never finish unless you use a "waterfall" method.

 

Set your match criteria (either high or low thresholds, depending on your methodology), do the fuzzy match and set aside records that have a match.

 

Then take the unmatched records, change the match thresholds, and run the fuzzy match again.

 

Keep incrementally changing the thresholds. Once you've got a satisfactory match percentage, you can union all the outputs from prior fuzzy matches.

 

I hope that makes sense. I got the waterfall technique from this training video: https://community.alteryx.com/t5/Live-Training/Live-Training-Fuzzy-Matching-Intermediate-Users/td-p/...

 

The suggestion to start with a low threshold came from another solutions engineer, who recommended it so that you don't go through successive iterations with a cutoff of 65% only to find that 64% is really the magic number.
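To make the waterfall bookkeeping concrete, here is a rough Python sketch of the loop described above: match at one threshold, set the matches aside, rerun the leftovers at the next threshold, and union everything at the end. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity score, which is not what the Fuzzy Match tool uses internally, and the threshold values and function names are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def best_match(name, candidates):
    """Return (candidate, score) with the highest similarity, or (None, 0.0)."""
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, name, cand).ratio()
        if score > best_score:
            best, best_score = cand, score
    return best, best_score


def waterfall_match(targets, sources, thresholds=(0.95, 0.85, 0.75)):
    """Run successive passes; each pass only sees records still unmatched.

    The thresholds are illustrative assumptions; the order is the caller's choice.
    """
    matched = {}              # union of the matches from every pass
    remaining = list(targets)
    for threshold in thresholds:
        still_unmatched = []
        for name in remaining:
            cand, score = best_match(name, sources)
            if cand is not None and score >= threshold:
                # Set the match aside, tagged with the pass that produced it.
                matched[name] = (cand, score, threshold)
            else:
                still_unmatched.append(name)
        remaining = still_unmatched
    return matched, remaining


if __name__ == "__main__":
    targets = ["acme inc", "globex corp", "zzz"]
    sources = ["acme inc", "globex corporation", "initech"]
    matched, unmatched = waterfall_match(targets, sources)
    print(matched)
    print(unmatched)
```

In this toy version every pass rescans all sources, so it would be far too slow for 4M records as written; it is only meant to show the set-aside and union steps. Recording the threshold with each match makes it easy to review the lower-confidence tiers separately.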

@thizviz aka cbridges, Bolide
http://community.alteryx.com/t5/user/viewprofilepage/user-id/2328