Creating pairs
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi All,
Could you please help with the below query.
I am looking to match the attached data set based on Currency and Amount to create pairs (of 2 each). Amounts are to be matched within a tolerance of 1 (within the same currency).
Kindly help with a solution for this.
Expected output attached
Thanks
Solved! Go to Solution.
- Labels:
- Fuzzy Match
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @griseldagomes,
If I deciphered your pairing correctly, you are not only required to create pairs based on amount value with sensitivity limit of 1, but also based on absolute value.
In most likely not the most advanced way, I would suggest:
 
- Calculate Abs Amount rounded to no decimal numbers using formula tool: Round(Abs(Amount),1)
- Calculate Abs Amount - 1 using formula tool (I used the same tool): Round([Abs Amount]-1,1)
- Calculate Abs Amount +1 using formula tool (I used the same tool): Round([Abs Amount]+1,1)
- Join data set with new fields on itself using 3 join tools. In the first join tool match on Currency and Abs Amount, unselecting all right fields (except for Unique Ref) and calculated fields.
- In the second join tool match on Currency and Abs Amount - 1, unselecting all right fields (except for Unique Ref) and calculated fields.
- In the first join tool match on Currency and Abs Amount + 1, unselecting all right fields (except for Unique Ref) and calculated fields.
- Then Union all J anchor results.
- Finally, using filter tool, filter out all results where Unique_Ref field matches Right_Unique_Ref field. That should give you only pairs you are after.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
validating
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @griseldagomes,
 
To eliminate such cases as with RUB, you can add a variance check:
- Add Sumarize tool to the T anchor of the filter. Configure it to Group By Currency and Sum Amount
- Then add Formula tool and calculate ABS(Sum_Amount)
- Then add Filter tool and filter for 0 <= Abs(Sum_Amount) <= 1
- Finally join the T anchor of 2nd filter tool to T anchor of 1st filter tool based on Currency. It will eliminate cases where both Amount values in pairs are negative or positive as sum of their abs amounts will be greater than 1.
 
 
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Your solution is working fine until the summarize tool.
In my current data, one currency is appearing only once as a pair in the input for the summarize tool.
However, assuming one particular currency appearing 4 times (as shown in the snapshot attached) then the summarize tool is grouping it up at a currency level and is unable to show exact pairs.
How can I tackle that scenario please?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Also I'd like to do a vlookup using the unique ref in the final output of the join tool to add the unique id's to the initial input file. I tried the find and replace tool, however unique id field is not appearing there. How can I get this done please?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
In the scenario when you can have multiple pairs for the same currency, I suggest to not untick Abs Amount (Left) field in three join tools.
Then, we adjust summarize tool to group by currency and group by abs amount. That allows to check each pair variance individually.
Then, we need to add second join condition in the join that joins back to our filtered data to join on Abs Amounts as well.
Not too sure what you are trying to do using find and replace tool, but unique ref fields do not appear in options, because they have data type of byte, and the match can only be done in find and replace tool using string fields. I added two options in bringing in paired amounts to original data set in attached solution.
 
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
thanks for your help with this @AnnaMikhaylova
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@griseldagomes , apologies, attached the wrong file accidentally.
