Hello,
I am trying to amend some data sets and when I join them together it seems to be dropping 342 records all together. I have attached a screenshot of my workflow with comments on how many records are in each tool (this is all I can show as it is sensitive data).
If anyone has any suggestions/solutions to where those records are going and how to get them back, I would really appreciate it!
Solved! Go to Solution.
Hi @kfish
This is somewhat expected behavior when joining two sets of data together. It all depends on what fields you're joining on. If you're joining on ID for instance, the J anchor will show all records that had that ID in both data sets, while the L and R anchors will show any IDs that didn't match. See below for more information. Typically people will use a union tool after the join tool to get records that dropped back into the dataset.
https://help.alteryx.com/20212/designer/join-tool
https://help.alteryx.com/20212/designer/union-tool
Hello @kfish ,
I think what you mean is that the total number of records from the input is not the same as the total number of records from the output.
The reason is that the middle part, Join, is containing records from both input streams.
See below:
Input: 1478 records
Output L anchor: 34
Output R anchor: 404
Output J: 694 (694 from Left, 346 from right)
-> It looks like 346 records is "Disappearing", but these records are the ones that are joined between the two streams.
The columns are added from one stream to another, but there are no additional rows as these completely match the first stream.
So if you are wondering where these 346 records have gone they are included in the 694 records in the J-anchor and you are not losing anything that you need to get back.
(If you would need to "Get back" anything it would be from the output of the L and R tool, and that you can do with a Union tool as @Luke_C mentioned, but I do not think this was what you meant with "Losing records", as the number 346 is the difference between the number of records in input vs output.)
//Regards
Elias
does the "union tool" need to have all three of the "join" outputs used as the "union tool" input?
Hi @JFarah it will depend on your fact pattern. Sometimes you want those records in your dataset, sometimes you don't
The strange thing is - I would think that the "missing number" would be the "join" output divided by 2 (since you only take one of the records from either the Left or Right input).