I have a workbook that has numerous columns. I am identifying the duplicate rows (duplicate in column "Project"), and of these duplicate rows, I need to remove the one with the oldest date:
Project | name | date | $ | NOTE IF NEED TO REMOVE |
abc | JP | 10/1/2022 | 49 | Duplicate (Keep) |
abc | JP | 9/3/2022 | 36 | Duplicate (Remove) |
deb | Alf | 8/7/2022 | 82 | |
xyz | Fred | 6/12/2022 | 27 | Duplicate (Remove) |
xyz | Fred | 12/30/2022 | 30 | Duplicate (Keep) |
How do I remove the rows with the oldest date from this error report (want to throw error only on the most recent date)? I do not yet have the rightmost column in existence; if I did a simple filter would work :D.
Hi @JoshuaElphee ,
My solution.
If you want to keep the original order, you may add Record ID in the beginning.
Workflow
Expression in the 2nd Multi-Row Formula
IF [SeqNo] > 1
THEN "Duplicate (Keep)"
ELSEIF [Project] != [Row+1:Project]
THEN ""
ELSE "Duplicate (Remove)"
ENDIF
@JoshuaElphee - I typically use the unique tool to remove the duplicates if I'm able to use a sort. From my experience, the unique tool maintains the first record of a dup. In this example, if you sort by project and date (descending), then unique tool on [Project], it will keep the record with most recent date. If a Project has more then 2 records, it will still only retain the most recent date. Hope that helps!