I have a problem joining two of my dataset. One main problem is there is no primary key for the two dataset since there are duplicate records with both sets. I'm trying to get supplemental data from the Table 2 to the Table 1.
Table 1:
Record | Details |
1 | |
1 | |
2 | |
2 |
Table 2:
Record | Details |
1 | Detail 1 |
2 | Detail 2 |
2 | Detail 3 |
2 | Detail 4 |
3 | Detail 5 |
Result should be like this:
Record | Details |
1 | Detail 1 |
1 | |
2 | Detail 2 |
2 | Detail 3 |
Thanks for the help.
Solved! Go to Solution.
@apathetichell, here is the data that I can share.
Current Day data:
UniqueKey | RecordID | Detail 1 | Detail 2 | Transaction Date | Findings |
1 | Account 1 | Transaction 1 | 4/8/2022 | ||
2 | Account 1 | Transaction 1 | 4/8/2022 | ||
3 | Account 2 | Transaction 1 | 4/11/2022 | ||
4 | Account 2 | Transaction 2 | 4/11/2022 | ||
5 | Account 2 | Transaction 1 | 4/8/2022 | ||
6 | Account 3 | Transaction 1 | 4/11/2022 | ||
7 | Account 3 | Transaction 2 | 4/11/2022 | ||
8 | Account 4 | Transaction 1 | 4/8/2022 | ||
9 | Account 5 | Transaction 1 | 4/8/2022 | ||
10 | Account 6 | Transaction 1 | 4/8/2022 |
Previous Day Data:
UniqueKey | RecordID | Detail 1 | Detail 2 | Transaction Date | Findings |
0408221 | 1 | Account 1 | Transaction 1 | 4/8/2022 | Findings 1 |
0408222 | 2 | Account 1 | Transaction 1 | 4/8/2022 | Findings 2 |
0408223 | 3 | Account 2 | Transaction 2 | 4/8/2022 | Findings 3 |
0408224 | 4 | Account 2 | Transaction 1 | 4/8/2022 | Findings 4 |
0408225 | 5 | Account 3 | Transaction 1 | 4/8/2022 | Findings 5 |
0408226 | 6 | Account 3 | Transaction 2 | 4/8/2022 | Findings 6 |
0408227 | 7 | Account 3 | Transaction 1 | 4/8/2022 | Findings 7 |
0408228 | 8 | Account 3 | Transaction 2 | 4/8/2022 | Findings 8 |
0408229 | 9 | Account 4 | Transaction 1 | 4/8/2022 | Findings 9 |
04082210 | 10 | Account 4 | Transaction 1 | 4/8/2022 | Findings 10 |
04082211 | 11 | Account 5 | Transaction 1 | 4/8/2022 | Findings 11 |
04082212 | 12 | Account 5 | Transaction 1 | 4/8/2022 | Findings 12 |
04082213 | 13 | Account 6 | Transaction 1 | 4/8/2022 | Findings 13 |
04082214 | 14 | Account 6 | Transaction 2 | 4/8/2022 | Findings 14 |
04082215 | 15 | Account 6 | Transaction 1 | 4/8/2022 | Findings 15 |
Desired Result:
UniqueKey | RecordID | Detail 1 | Detail 2 | Transaction Date | Findings |
0408221 | 1 | Account 1 | Transaction 1 | 4/8/2022 | Findings 1 |
0408222 | 2 | Account 1 | Transaction 1 | 4/8/2022 | Findings 2 |
3 | Account 2 | Transaction 1 | 4/11/2022 | ||
4 | Account 2 | Transaction 2 | 4/11/2022 | ||
0408223 | 5 | Account 2 | Transaction 1 | 4/8/2022 | Findings 3 |
6 | Account 3 | Transaction 1 | 4/11/2022 | ||
7 | Account 3 | Transaction 2 | 4/11/2022 | ||
04082210 | 8 | Account 4 | Transaction 1 | 4/8/2022 | Findings 9 |
04082211 | 9 | Account 5 | Transaction 1 | 4/8/2022 | Findings 11 |
04082213 | 10 | Account 6 | Transaction 1 | 4/8/2022 | Findings 13 |
From here, as you can see, for the current day, we can identify the transactions that is generated only this day. Thus, we also need to create a UniqueKey for the Transactions that has no UniqueKey.
Hope this will be helpful.
without a uniquekey in the current day dataset this won't work. You'll get hit with dimensional issues and get multiple entries for the same identifier.
let me explain - say you are trying to join account 1/transaction 1 from 4/8/2022 from current day - without a unique id it will not know which entry to match vs previous day - solution it will match both. This is standard SQL. Excel uses fake matching via vlookup (a match first strategy) but that is just an artificial excel strategy.
If you have unique transaction id in both this becomes straight forward.
@Rhael012 I agree with @Qiu and @apathetichell that Tile is a helpful tool on this type of problem. I did a test and you can just expand your unique and groupby selection to the keys that you want to join two tables on, i.e. Detail 1, Detail 2, and Transaction Date. See attached workflow.
One thing that I'm not sure is why from your desired result that you mapped Findings 3 not Findings 4. We can change the workflow based the logic why you want "Findings 3" assign to this record.
0408223 | 5 | Account 2 | Transaction 1 | 4/8/2022 | Findings 3 |
Hi @cathyshi525. This is working perfectly. Thank you so much for answering my concern!