Remove not duplicates
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hello guys I want to delete entire row with single entry
Input:
Name | ID | Date | Grade |
Alyssa | 1 | 2021-01-22 | 4 |
Alyssa | 26 | 2021-02-05 | 5 |
Alyssa | 45 | 2021-01-08 | 4 |
Dy | 6 | 2021-01-22 | 4 |
Edrose | 33 | 2021-02-05 | 5 |
Fatima | 9 | 2021-01-22 | 5 |
Fatima | 37 | 2021-02-05 | 5 |
Desired Output:
Name | ID | Date | Grade |
Alyssa | 1 | 2021-01-22 | 4 |
Alyssa | 26 | 2021-02-05 | 5 |
Alyssa | 45 | 2021-01-08 | 4 |
Fatima | 9 | 2021-01-22 | 5 |
Fatima | 37 | 2021-02-05 | 5 |
Solved! Go to Solution.
- Labels:
- Transformation
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Ultralightbeam,
I have prepared a workflow for you:
I am checking how many times the name is occurring in the data and we removed everything that occurred only once.
If this workflow is helpful please mark my post as a solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Summarzie tool (Group by Name, Count Name). Join the results to your original database with Count as a new column. Filter by Count > 1. That should give you what you need.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Emil_Kos, I forgot the main issue why I posted this.
Input:
Name | Provider | Date | Grade |
Alyssa | Nico | 2021-01-22 | 4 |
Alyssa | Shaira | 2021-01-22 | 5 |
Alyssa | Shaira | 2021-02-05 | 5 |
Alyssa | Nico | 2021-01-08 | 4 |
Dy | Nico | 2021-01-22 | 4 |
Edrose | Nico | 2021-02-05 | 5 |
Fatima | Shaira | 2021-01-22 | 5 |
Fatima | Nico | 2021-02-05 | 5 |
Desired Output:
Name | Provider | Date | Grade |
Alyssa | Shaira | 2021-01-22 | 5 |
Alyssa | Shaira | 2021-02-05 | 5 |
Alyssa | Nico | 2021-01-08 | 4 |
Fatima | Shaira | 2021-01-22 | 5 |
Fatima | Nico | 2021-02-05 | 5 |
so if the name has two provider, it should get the max grade.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Ultralightbeam,
Sure give me a second.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Ultralightbeam,
I am not sure why you remove this line from the data:
Alyssa | Nico | 2021-01-08 |
4 |
As this combination isn't unique.
Please let me know if this workflow works for you
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
hi @Emil_Kos
Alyssa | Nico | 2021-01-08 | 4 |
is still included on the desired output
I think based on your output its still the same
Alyssa | Nico | 2021-01-22 | 4 |
Alyssa | Shaira | 2021-01-22 | 5 |
both of this has been output but since both weeks are the same - we should get the max grade. Do you think it's possible? I've been doing this for two hours already
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Ultralightbeam,
Maybe I am missing something but I don't see any changes between your input and the desired output.
Can you highlight to me where do you see any changes between the two data sets?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I have elaborated the criteria on the comment column, thank you very much
Input:
Name | Provider | Date | Grade | Comment |
Alyssa | Nico | 2021-01-22 | 4 | This should be removed since it is the same date with the next row and grade is smaller. |
Alyssa | Shaira | 2021-01-22 | 5 | Retain since higher than duplicate date |
Alyssa | Shaira | 2021-02-05 | 5 | Retain since date is unique for alyssa |
Alyssa | Nico | 2021-01-08 | 4 | Retain since date is unique |
Dy | Nico | 2021-01-22 | 4 | This will be removed since it is only one - I am only including rows that has two or more grades. |
Edrose | Nico | 2021-02-05 | 5 | Same from row above |
Fatima | Shaira | 2021-01-22 | 5 | Retain |
Fatima | Nico | 2021-02-05 | 5 | Retain |
Desired Output:
Name | Provider | Date | Grade |
Alyssa | Shaira | 2021-01-22 | 5 |
Alyssa | Shaira | 2021-02-05 | 5 |
Alyssa | Nico | 2021-01-08 | 4 |
Fatima | Shaira | 2021-01-22 | 5 |
Fatima | Nico | 2021-02-05 | 5 |