I want to deduplicate my rows grouped by one column, keep all the other columns, but choose which row is kept based on another column. How do I do this?
For example, I want to aggregate the rows based on ID and keep the row with min(Criteria_Column).
Before:
ID | Criteria_Column | Other_Column_1 | Other_Column_2 |
123 | 9 | a | s |
123 | 10 | d | f |
456 | 14 | j | k |
After:
ID | Criteria_Column | Other_Column_1 | Other_Column_2 |
123 | 9 | a | s |
456 | 14 | j | k |
It seems similar to the Unique tool except that I can't find any setting on the Unique tool where I can choose a criteria by which to decide which row to keep.
Solved! Go to Solution.
@tinapai
I believe Sort Tool is a good fit for this case.
The summarize tool would give you this functionality. Group by ID and then use min for the other fields. Alternatively, you can sort your data and use the unique which will keep the first value.
@cgoodman3 I think that would give me the min of each group for each of the columns, which is different from what I wanted-- I want the values of the row which has min(Criteria_Column) regardless of what is in the other columns.