Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Merge two files based on columns that almost match

PMurphy4
6 - Meteoroid

Hello,

I am attempting to use Alteryx to merge a customer file with an internal file such that I have one line per product shipment that has both the customer information and our internal information. 

 

I've found that the only way to join the files is to create a Concat of the Purchase Order (PO) and PO Position/Item in each file and then join off this new column "Concat PO".  Unfortunately, the customer file leaves the purchase order (PO) prefix off the PO number but our ERP shipment history file includes the prefix.  When I do a normal join off the Concat PO the result is no matches.  I've tried to utilize the fuzzy match tool but that requires just one file as input.  Attached are made up examples of the two files I'm trying to merge, is it possible to join files based specific columns such that 628000227083000001 is matched with 227083000001?  Maybe a formula that says if the last 12 numbers in the string match then join the two files?  Or do I need to prep the customer file and create a new column that adds the prefix, which wouldn't be ideal because determining the prefix each time adds time 🙂

 

Thank you!

Patrick

3 REPLIES 3
AngelosPachis
16 - Nebula

Hi @PMurphy4 

 

From what I can see in the provided dataset, the Customer PO# (from Shipment History) contains then Po Number (from Customer Claims), is that correct?

 

If yes, then you can use the find & replace tool, to look within the field Customer PO# and try to find the value contained in the Po Number field.

 

AngelosPachis_0-1610738025447.png

 

The you can select to append any number of fields to the records that matched.

 

Hope that helps, let me know if that worked for you.

 

Regards,

 

Angelos

PMurphy4
6 - Meteoroid

Thank you for your solution @AngelosPachis.  I realized I missed one key point in my example data and that is when there are various PO positions associated with the same PO number.  As a result I had to modify the solution slightly so that I search the Concat PO field within the Shipment History file....if I just looked at the PO Number field Alteryx wouldn't know which PO position to append to.

 

Correct me if I'm wrong -  I learned two things that the "any part of field" option within the find section will start matching characters in a string but will only match if all the characters to the right of the starting point match and if it can't find a match it will still show the Find file information but no data will append from the Replace file.  I won't get duplicates in my output for each time Find finds characters in the middle of the string that match.  (does that make any sense :))

 

Example if Find input file has

700060001

700060002

and the Replace file has

700060002

700060003

and I append data (rather than replace) from the Replace file I will see the 700060001 line item in my output file but it will not append any data from the Replace file as there isn't a full 700060001 match in the Replace file.  I also will not get duplicates that could happen if Find was only reading "70006".

 

This allows me to see where Shipments data doesn't have an associated Claim but it won't highlight to me when I have a claim that doesn't have a corresponding shipment, so unlike my attached example it is probably better to have my Claim file connected to the Find node and Shipments to the Replace. 

My rambling here is so I remember what I learned  🙂

 

Thank you again,

Patrick

AngelosPachis
16 - Nebula

Hi Patrick @PMurphy4 ,

 

Your rambling absolutely makes sense. All the points you mentioned are correct and I'm very happy you managed to take the proposed solution, tweak it to whatever works for you and then apply it successfully to your workflow. That means you learnt something new and I'm glad you managed to get something useful out of this post. Kudos to you for asking the question!

 

Cheers,

 

Angelos

Labels