Here is some dummy data from a dataset I am working with:
Row | Col1 | Col2 | Col3 | Col4 | Amt |
1 | 11234 | 12 | 687 | 11 | 0 |
2 | 14567 | 16 | 5333 | 12 | 0 |
3 | 54321 | 31 | 205 | 01 | 250 |
4 | 17890 | 16 | 123 | 01 | 100 |
5 | 54321 | 31 | 207 | 02 | 250 |
6 | 99876 | 31 | 209 | 12 | 0 |
In the above example, I am needing to compare rows that match based on Col1.
The logic I am trying to build is:
if the values in Col1 match AND their corresponding values in Col3 are = 205 or 207, then I need to keep the row with the '207' in Col3 and delete the other row with '205' in Col3.
In the above sample data, rows 3 & 5 have the same value in Col1 and their corresponding values in Col3 are 205 and 207. As a result, I need to delete row 3 from the dataset and keep row 5, along with the rest of the dataset.
Having some issues getting this to work, and would love for some ideas about how to solve this.
Thanks!
Solved! Go to Solution.
Here's a quick workflow I've created to do this but there are probably better ways to go around it.
In the summarise tool, I've used a group by on Col 1 and a count on Col 1. This basically figures out all the Col 1 IDs that are appearing more than once. In the filter tool then, I've separated out the unique Col 1 rows and used the join tool to combine the other columns back to these rows.
From the rows that are left out (output from the right output anchor of the join tool), I've used a filter tool on Col 3 to equate this to 207. Lastly I've used Union to combine this row back with the rest of the dataset. This is a little too specific to the user case now. But can be amended by replacing filter tool with other tools to customise to more conditions on which row to pick.
Let me know if that helps!
Thank you for the quick reply! This is a great way to solve this - let me recreate into my workflow and see if this will work for the dataset I'm working with.
Thank you!
I think this will work! Again, thank you for the help! 😁