Hi, When combining 2 ledgers with join tool by matching Transaction id and Account number. Because those 2 are the only matching columns in datasets. Join tool is creating some additional duplicates. Are there any other ways to handle the problem? See attached example file.
Solved! Go to Solution.
Hi,
This is a great question and a very common scenario! The extra records are appearing because the Join tool creates a match for every possible combination of your join keys. If a key exists twice in the left file and once in the right, you'll get two records in the output.
The best practice is to remove these duplicates before the join. You can easily do this with the Unique tool.
Solution: Use the Unique Tool
Place a Unique tool onto the data stream for each of your ledgers before they connect to the Join tool.
In the configuration for each Unique tool, check the boxes for both Transaction id and Account number.
Connect the "U" (Unique) output anchor from each Unique tool to the "L" and "R" inputs of your Join tool.
This ensures only the first record for each unique combination is passed to the join, preventing any duplicates from being created.
If this post solves your problem, please help our community by marking it as an accepted solution. This will make it easier for others to find the answer!
Thanks for your reply. I tried that before in order to decrease possible combinations but Unique tool itself is leaving essential rows to complete full result.
User | Count |
---|---|
91 | |
79 | |
62 | |
36 | |
36 |