Hello,
I have multiple datasets coming from different sources, and I need to check that all the 39 fields match for each ID in all the data sources, and fields that have a different value should be flagged. Initially, ii thought I would have to join on the unique ID and 'amount' column. however in my real data sets, there are 39 fields, do I need to do this 39 times?
If anyone could help me with a more efficient way to do this. That would be amazing. I have attached two datasets that are similar to my real datasets.
Thank you.
DATA SET 1 :
| ID | AMOUNT | Rating | Type | flag |
| 1 | 1000 | A+ | Coupon | No |
| 2 | 1000 | AA++ | Coupon | Yes |
| 3 | 3000 | BB | Coupon | yes |
| 4 | 6000 | CC | Coupon | yes |
| 5 | 6000 | BB | Coupon | yes |
| 6 | 7000 | A- | Coupon | No |
| 7 | 9000 | n/A | Coupon | No |
| 8 | 9000 | AAA | Coupon | No |
| 9 | 9000 | Bu | Coupon | No |
| 10 | 9000 | Bu | Credit | No |
DATA SET 2:
| ID | AMOUNT | Rating | Type | flag |
| 1 | 1000 | A+ | Coupon | No |
| 2 | 1000 | B | Coupon | Yes |
| 3 | 3000 | BB | Credit | yes |
| 4 | 6000 | CC | Coupon | yes |
| 5 | 6000 | BB | Coupon | yes |
| 6 | 7000 | A- | Credit | No |
| 7 | 9000 | n/A | Coupon | No |
| 8 | 9000 | AAA | Coupon | No |
| 9 | 9000 | Bu | Coupon | No |
| 10 | 9000 | Bu | Credit | No |