Hi friends,
I have two simple excel files (unable to share due to confidential data) each with 700 (File 1 ) and 664 (File 2) records/rows. File 2 has two additional fields than File 1. Basically, I am trying to do a Vlook up from file 2 to pull those two fields into File 1. I am using the Join tool. When I Join by Record Position and Union the J and L output , the total output records count I am expecting is correct but some of the data positionings do not align correctly. However, when I use the Join by Specific Fields (both files have a common field), my J output itself is 20,297 records. I am wondering what is causing this increase when joining my specific fields. Any kind of ideas/help is so appreciated.
Thanks
Suman
Solved! Go to Solution.
This is because a traditional vlookup just finds the first instance of a match and then links up the corresponding data while a Join will return all matches found. The "duplicates" are just multiple rows matching from each data set. You will either want to dedupe your second dataset maybe using a unique tool, or you can use the Find Replace tool (also in the join category) as an alternative and use the option at the bottom to "Append Fields to Record" for the columns that you want. The Find Replace tool works more like your traditional definition of a vlookup in a situation like this.
Hi @itssumanb,
If I understand you correctly you are creating duplicates.
The file that you are using has duplicates on the fields that you are joining. Look at the example below if you would join those two columns together the final output would have more rows because the identifier isn't unique.
Identifier L First table | Identifier R Second table |
1 | 1 |
1 | 1 |
1 | 2 |
2 | 2 |
Is that makes sense?
If this was helpful please mark my post as a solution!
Good luck!
Hi @Emil_Kos
You are right. My identifier isn't unique. So below is an example of what the table looks like and the end output I am looking for. Can you help?
Table 1
Num | Date |
1 | 1/1/2020 |
1 | 2/1/2020 |
1 | 3/1/2020 |
2 | 12/1/2019 |
2 | 1/1/2020 |
Table 2 (the Notes is unique for each Num)
Num | Notes |
1 | abcd |
1 | abcd |
2 | xyz |
2 | xyz |
2 | xyz |
Expected output table (everything from Table 1 with matching 'Notes' from Table 2)
Num | Date | Notes |
1 | 1/1/2020 | abcd |
1 | 2/1/2020 | abcd |
1 | 3/1/2020 | abcd |
2 | 12/1/2019 | xyz |
2 | 1/1/2020 | xyz |
Hi @itssumanb
Check the attached workflow, most likely you want this. Please let us know otherwise.
Regards
Arundhuti
Hi @itssumanb,
I believe @Tyro_abc provided you with good working solution.
Please keep in mind it is very important to pay attention if you are creating duplicates using join tool as this might cause many data errors.
What I usually do is to observe amount of lines pre join and after join. If they don't match I know something is incorrect.
Quite often you don't need any additional tools to achieve this result you can just check the row count that is visible in the results window.