Hello,
How would I be able to add a field that tells me what caused the join to fail? I'm doing a join on 4 fields (CompanyID, DeptID, JobID, GroupID), and from the left results, I want to be able to identify which of the 4 fields (or if it was all 4 fields) did not match. For example record 1 did not match on CompanyID and DeptID, record 2 did not match on GroupID, record 3 did not match on any field, etc. My goal is to do a union with the left and middle results to have the full population again but to be able to identify which records all matched and which ones did not match and why they didn't match. Thanks.
If you have a null() in your matched field then it's from the left data set. Joins will have a full set of data on matched fields.
But how would I be able to identify which field didn't match? I know the Joins will have a full data set, but I'm trying to identify for the left data set, on which criteria did it not match. For example, was it due to the CompanyID not matching, or the DeptID not matching, or both not matching.
So in a join you have fields matching your left and fields matching your right. Fields from your right input will not be in your left anchor as there is no right match to populate that field. Your left field did not match on whatever field you selected it to match on. If you are matching on multiple fields - you will not have information on if there is a partial match. You may want to look at a find/replace in lieu of a join if you need that information. Likewise - if you are looking for multiple occurences of certain datapoints which can indicate a partial match you can summarize and group on that field and see which values have repeat occurences (ie two primary keys would tell you that a match occured.). If you need additional information - post some data and what you have.
Here is a sample workflow with data. Essentially I want a way to identify (even if it's not with the join function) that says for the Left results, these are the reasons it didn't match. Here are two examples of what I would like the final output to look like, doesn't have to match these exact formats but something along those lines.
Name | CompanyID | DeptID | JobID | GroupID | GroupName | Fields Not Matched |
JANE SMITH | 100 | 1000 | 5000 | 8200 | Manager | GroupID |
JILL DOE | 200 | 1100 | 7000 | 8200 | Manager | JobID, GroupID |
JACK DOE | 300 | 1000 | 7000 | 8300 | Director | CompanyID, JobID, GroupID |
JILL SMITH | 300 | 2000 | 7000 | 8300 | Director | CompanyID, DeptID, JobID, GroupID |
JANE DOE | 100 | 1000 | 5000 | 8000 | Analyst | Full Match |
JOHN DOE | 100 | 1000 | 5000 | 8100 | Senior | Full Match |
JOHN SMITH | 200 | 1100 | 6000 | 8000 | Analyst | Full Match |
OR
Name | CompanyID | DeptID | JobID | GroupID | GroupName | Does CompanyID Match | Does DeptID Match | Does JobIDMatch | Does GroupID Match |
JANE SMITH | 100 | 1000 | 5000 | 8200 | Manager | YES | YES | YES | NO |
JILL DOE | 200 | 1100 | 7000 | 8200 | Manager | YES | YES | NO | NO |
JACK DOE | 300 | 1000 | 7000 | 8300 | Director | NO | YES | NO | NO |
JILL SMITH | 300 | 2000 | 7000 | 8300 | Director | NO | NO | NO | NO |
JANE DOE | 100 | 1000 | 5000 | 8000 | Analyst | YES | YES | YES | YES |
JOHN DOE | 100 | 1000 | 5000 | 8100 | Senior | YES | YES | YES | YES |
JOHN SMITH | 200 | 1100 | 6000 | 8000 | Analyst | YES | YES | YES | YES |
Hi @elclark
One commonly used method is to put a UNION after the JOIN to union all the L, J and R outputs to get the “Combination” (imagine a Venn Diagram that shows you the combined area covered by at least one of the 2 circles).
in this way, those fields which did not match will have [null] values
dawn