I am trying to remove duplicates from an output that I have. Based on two columns I want to keep unique tax amounts. For example I know in this data set the number 14413855 has 4 tax amounts that total 1.97. There are 4 sets of this amount though and I only need one. Each number has a different amount of tax amounts. Is there a way that I can keep 1 single grouping for each number? Please see attached.
@gduffy could you use the Summarise tool, grouping by number and tax rate, and summing tax amount?
also look at the unique tool
@aatalai I need the separate tax amounts unfortunately so cannot summarize. Also I considered using the the unique tool but there are some of the same amounts in one string that I need and will be removed if I use the unique tool
@gduffy You don't have to sum the tax amounts to use the summarize tool. If you group by on all 3 fields, it will consolidate the records that are duplicated.
Bacon
I don't want my number column to be grouped though as it would summarizes that tax amount column where I want the detail. Is there a way to do that?
@gduffy Can you create a mock up of your desired output?
@gduffy ---> are there columns you have omitted for data privacy? otherwise --- can you take a step back and explain how you identify which duplicates to drop and which to keep? This should be a straight forward group by columns x,y,z,a,b,c ---> where those are your key columns. That doesn't tie between your two files so either you have not delineated your logic, the second upload is wrong, or there are missing columns.