Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Fuzzy Join

bighead
7 - Meteor

Is there a way to do a fuzzy join on data from two separate spreadsheets? I joined expenses and payments data spreadsheets on employee ID and amount to identify which expenses have been paid but realized some of the joins werent showing up in the J output because the amounts are slightly off; I verified that the payments from the payments spreadsheet are actually for the expense listed in the expense spreadsheet. 

 

Ive attached some sample data that is similar to what Im working with, note that there are 2 tabs (my real data is 2 separate sheets). If I join both sheets everything but Tom will show up in my J output but I want Tom to show up as well.

4 REPLIES 4
rzdodson
12 - Quasar

Hey, @bighead, I noticed that in the Expenses sheet, Tom had trailing white space in his line, which was the root cause for him falling out of the L anchor in the workflow you are building. There are a number of approaches you can take here - kind of dealer's choice:

 

1. Use the TRIM function within Excel to clean your data to remove any leading and trailing white space.

2. Alteryx Options

   2a. Formula tool after you bring in your Expense data. Use the TRIM function in Alteryx (reserved if you are worried about altering the source data).

   2b. Use a Data Cleansing tool following bringing in your Expense data.

   2c. Use a Multi-Row Formula tool to include the TRIM function to cleanse your data. 

   2d. Use a Multi-Field Formula tool to include the TRIM function to cleanse all text fields

 

If you go with Formula/Multi-Row Formula options, your syntax will be:

 

Trim([Employee Name])

 

 

If you go with the Multi-Field Formula tool approach, your syntax will be:

 

Trim([_CurrentField_])

 

 

In the Configuration settings of your Multi-Field Formula tool, you'll select "Text" fields, uncheck "Copy Output Fields and Add", and then enter the expression from above.

 

Edit: I used Employee Name (Expenses sheet) and Employee (Payments) fields to do the join.

bighead
7 - Meteor

Hey @rzdodson, thank you for the reply! That was an error in the sample data, major apologies. My actual data has been cleaned and he shows up in the L anchor, problem is I cant get him to show up in my J anchor when I join on ID and amount because his expense/payment amounts are slightly different.

rzdodson
12 - Quasar

No worries, @bighead. Should be a pretty easy correction. Just one question for you, when you get a moment. Are we assuming that the ExpenseAmount (Expense sheet) is what needs to be what is reported out in the final output, or the Extended Transaction Amount (Payments sheet)?

 

I am attaching a workflow based on the assumption that the Extended Transaction Amount needs to be what we are reporting out in the Expenses sheet. Let me know if that is what you are looking for. :)

 

bighead
7 - Meteor

@rzdodson, I would like to show.

Labels