Hi all,
Requesting some help for a workflow im stuck In.
My data looks like this:
| TrainNo | Distance | OriginCode | Count |
| 15905 | 4273 | Awy | 1 |
| 15905 | 4273 | Cape | 28 |
| 15905 | 4273 | Ndls | 5 |
| 15906 | 4273 | Apdj | 5 |
| 15906 | 4273 | Asn | 2 |
| 15906 | 4273 | Dbrg | 28 |
| 15905 | 4231 | Awy | 1 |
| 15905 | 4231 | Bza | 1 |
| 15905 | 4231 | Cape | 23 |
Each train runs through multiple origin stations but I want to capture the highest distance each train runs through. So for train 15905, it covers distance of 4273 and 4231 but I only require the highest one. There are various origin codes coming for 4273 like Awy, cape and ndls, but I want to capture the origin station with the highest count so Cape in this this. Main idea being, for each train which has various distances, I want to capture the highest distance and a station with the highest count within the max distance.
My output needs to look like this. Each train should only have one row data. Would appreciate any help, thanks
| TrainNo | Distance | OriginCode | Count |
| 15905 | 4273 | Cape | 28 |
| 15906 | 4273 | Dbrg | 28 |