Hi all,
Requesting some help for a workflow im stuck In.
My data looks like this:
TrainNo | Distance | OriginCode | Count |
15905 | 4273 | Awy | 1 |
15905 | 4273 | Cape | 28 |
15905 | 4273 | Ndls | 5 |
15906 | 4273 | Apdj | 5 |
15906 | 4273 | Asn | 2 |
15906 | 4273 | Dbrg | 28 |
15905 | 4231 | Awy | 1 |
15905 | 4231 | Bza | 1 |
15905 | 4231 | Cape | 23 |
Each train runs through multiple origin stations but I want to capture the highest distance each train runs through. So for train 15905, it covers distance of 4273 and 4231 but I only require the highest one. There are various origin codes coming for 4273 like Awy, cape and ndls, but I want to capture the origin station with the highest count so Cape in this this. Main idea being, for each train which has various distances, I want to capture the highest distance and a station with the highest count within the max distance.
My output needs to look like this. Each train should only have one row data. Would appreciate any help, thanks
TrainNo | Distance | OriginCode | Count |
15905 | 4273 | Cape | 28 |
15906 | 4273 | Dbrg | 28 |
Solved! Go to Solution.
Hi @Suvasini the tool that will allow you to attack this problem is the Summerize tool, which will allow you to aggregate your data for each 'TrainNo'.
The configuration:
GroupBy: TrainNo
Max: Distance
Max: Count
Will return you your two row table given your sample, the only value missing would be your OriginCode which matches the max count, this can be bought through by joining your summerize output with your original data, joining on 'TrainNo', 'Distance=MaxDistance' and 'Count='MaxCount'.
Hope this is useful
Ben
@Suvasini
I think Sort Tool is a very good fit for this situation.
Thankyou!
@Qiu Much better!
Which tool is beneficial to compute the distance of top 3 closest stations from lets assume station A?