I have two data sources, one containing 26000 rows and the other containing 6410 rows.
The first dataset contains the following columns:
Customer name, Base_Code, Idenity_Number, Financials
The second dataset holds the following:
Customer name, Base_Code, Idenity_Number, Financials, Lapse
Both sets of data have identical formatting.
My goal is to join the Lapse column in the second dataset to the first dataset. The issue I have is that the numeric value in Financials does not match between the two datasets and I only want the closest match in source 1 to have the value in the Lapse column in source 2 against it.
There will be examples where there are multiple entries for the same customer ID and Base Code in each dataset, so I need to join the two based on Idenity_Number and Base_Code (which is exact) and then match against the nearest financial numeric match for each entry only.
There will never be more entries in dataset 2 then held within dataset 1 for each Customer and Base_Code.
Pasted below is example data of what dataset 1 looks like:
Here is an example of dataset 2:
And here is what I want to end up with:
Any help would be greatly appreciated.
Solved! Go to Solution.
Thanks Nick, that looks like it just might work. So simple too, don't know why my head couldn't think of it. I'll report back in a bit, just going to run my real dataset through the process.
Thanks once again Nick, with a slight adjustment to account for the real dataset this worked a dream.
No worries.
Happy Alteryxing.