I have a specific use case involving vehicle sales in a data set. Does anyone have thoughts on a good way to identify the dealers that have vehicles (VINs) in common? My data set has millions of vehicle records, and I want to find the most common dealer combinations, regardless of whether Dealer A is selling to Dealer B or vice versa. Here are the basic data points I would have:
seller | buyer | date | amount | vin |
Dealer A | Dealer B | 2/11/2019 | 10000 | 123456 |
Dealer C | Dealer D | 3/15/2019 | 11000 | 123456 |
Dealer C | Dealer D | 2/1/2019 | 5000 | 567890 |
Dealer B | Dealer A | 3/1/2019 | 15000 | 567890 |
Dealer B | Dealer A | 1/11/2019 | 4000 | 456789 |
Dealer A | Dealer C | 2/15/2019 | 6500 | 456789 |
Dealer A | Dealer B | 1/16/2019 | 8800 | 654321 |
Dealer D | Dealer C | 3/1/2019 | 10000 | 654321 |
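
To make the goal concrete, here is a rough Python sketch of one possible reading of the question: treat each seller/buyer combination as an unordered pair (so A selling to B and B selling to A count as the same pair) and count the distinct VINs traded between them. The function name `count_dealer_pairs` and the tuple layout are only illustrative assumptions, not part of any existing tool, and the rows are the sample data above.

```python
from collections import defaultdict

# Sample rows from the table above: (seller, buyer, date, amount, vin)
sales = [
    ("Dealer A", "Dealer B", "2/11/2019", 10000, "123456"),
    ("Dealer C", "Dealer D", "3/15/2019", 11000, "123456"),
    ("Dealer C", "Dealer D", "2/1/2019", 5000, "567890"),
    ("Dealer B", "Dealer A", "3/1/2019", 15000, "567890"),
    ("Dealer B", "Dealer A", "1/11/2019", 4000, "456789"),
    ("Dealer A", "Dealer C", "2/15/2019", 6500, "456789"),
    ("Dealer A", "Dealer B", "1/16/2019", 8800, "654321"),
    ("Dealer D", "Dealer C", "3/1/2019", 10000, "654321"),
]


def count_dealer_pairs(rows):
    """Count distinct VINs per unordered dealer pair (hypothetical helper)."""
    vins_by_pair = defaultdict(set)
    for seller, buyer, _date, _amount, vin in rows:
        # Sorting the two names makes (A, B) and (B, A) the same key.
        pair = tuple(sorted((seller, buyer)))
        vins_by_pair[pair].add(vin)
    counts = {pair: len(vins) for pair, vins in vins_by_pair.items()}
    # Most common pairs first; ties broken alphabetically by pair.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


pair_counts = count_dealer_pairs(sales)
for pair, vin_count in pair_counts:
    print(pair, vin_count)
```

On the sample rows this ranks Dealer A / Dealer B first with 4 VINs, then Dealer C / Dealer D with 3, then Dealer A / Dealer C with 1. For millions of rows the same idea (normalize the pair, then group and count distinct VINs) would probably be better done in a database or a dataframe tool than in a plain loop.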