Remove duplicates based on two columns
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi,
I'd like to remove duplicate values of ID based on the Number1 and Number2 columns in the table. So for e.g. if ID#1 has a value in either Number1 or Number2, id like to keep ID#1 and remove the second row (duplicate ID#1) which has no values in Number1 or Number2. However, there are some rows that are already unique but don't have any values for Number1 or Number2 (ID#2) and I would NOT like to get rid of these.
Thank you!!!
ID | FirstName | LastName | Number1 | Number2 |
1 | Bob | Smith | 555 | |
1 | Bob | Smith | ||
2 | James | Bond | ||
3 | Kevin | Bacon | 333 | 777 |
4 | Tyler | Perry | ||
4 | Tyler | Perry | ||
5 | Steven | Maria | 444 |
Solved! Go to Solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
HI @Jairacha,
I would create a Formula to evaluate the values in both fields and if both are null or empty then assign a value of 99999 otherwise a value of 1. The formula below checks for IsEmpty (if your values are null you would just change to IsNull.)
Once that is complete your data will look like this:
Make sure to sort by ID and eval then attach a Unique tool configured for ID.
Unique only takes the first match for ID. So you get the rows where there are data filled in for [Number1] or [Number2] and when an ID has neither filled it also keeps those. You could then use a Select tool to remove the eval field if you want.
Final dataset:
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks @ggruccio !!
