Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Fuzzy between two excel files for anamoly detection

sriniprad08
11 - Bolide

Hi Team,

 

I have two excel. I would like to compare them and see if there is any mismatch. 

 

  1. Compare the two files and find
    a - same vendor number, same vendor name but different email id.
    b - same vendor number, same email id but different vendor name

The final data will be exported to new excel file,

Please let me know how to go about this and attached the excel file,

 

 

 

8 REPLIES 8
jvansistine
9 - Comet

Hi @sriniprad08 ,

Here is one way you could do it.  Since the Number always has to match, you could just use a join tool on that field, and then calculate the other matches.

 

This will duplicate some rows if the number exists multiple times in the master, but that will allow you to flag the correct one to keep or drop.

 

-Jim

jvansistine
9 - Comet

Depending on the size of your actual files, it might also be a good candidate for a batch macro.  You could loop through the values in the Invoice list one at a time and compare them to the master individually that way.

sriniprad08
11 - Bolide

hi @jvansistine 

 

Thank you so much very helpful. But is it possible to get the match score. The one's i marked in Yellow and Red are almost same. So we need to identify those as well. For eg. "Noal lime limited" and "Noal lime limited companies " are almost same except one extra word "companies" . Can we capture these?

 

Also you mentioned about batch macro. Can you please share some link or thoughts how to go about this as well?

 

jvansistine
9 - Comet

In that case, you could use the Fuzzy Match tool, which will give you the match score.  There are a few options for setting that tool up, so definitely check out the help page.  But that tool will allow you to find an acceptable threshold for "close enough" matches.

sriniprad08
11 - Bolide

Thank you @jvansistine  Very helpful. Works like a magic.

 

Cheers,

Sri

sriniprad08
11 - Bolide

Hi @jvansistine ,

 

Thank you. Is it possible to check multiple scenarios between two excel files. 

  1. Compare the two files and find
    a - same vendor number, same vendor name but different email id.
    b - same vendor number, same email id but different vendor name
jvansistine
9 - Comet

I think I would do something like this.

 

Do the fuzzy matches in parallel, and then use find/replace to substitute the standardized value from the master back on your invoices list.

 

I'm not sure how you can do them in a single step, since fuzzy match only looks down one column for its matches.

sriniprad08
11 - Bolide

Hi @jvansistine ,

 

Thank you so much. Really appreciate your support. I will go through it.

 

Cheers,

Srinivas

Labels