Hello everyone,
I have a problem combining two datasets and removing duplicates.
1. I have Dataset 1, which I want to keep in full in the combined list, including the duplicates within it.
ID | Name | Attribut |
12287 | Carbroker | AB |
263258 | ASI | AB |
222872 | ASI | AC |
77195 | HR-Services | AA |
258228 | ASO Europe | AB |
291141 | Asip Switzerland | AB |
256275 | Northman ltd. | AA |
256276 | Northman ltd. | AB |
2. I have a second dataset, which looks like this:
ID | Name | Attribut |
192782 | Bungee | AB |
123213 | Dental Services | AB |
258228 | ASO Europe | AA |
258228 | ASO Europe | AB |
258228 | ASO Europe | AC |
13123 | Ganza ldt | AB |
136275 | Pickermann | AA |
256276 | Northman ltd. | AC |
If a Name is already contained in Dataset 1, I want to remove all rows with that Name from Dataset 2. Keep in mind that the other columns can hold different values for the same Name, and I need the rows from Dataset 1.
The result should look like this:
ID | Name | Attribut |
12287 | Carbroker | AB |
263258 | ASI | AB |
222872 | ASI | AC |
77195 | HR-Services | AA |
258228 | ASO Europe | AB |
291141 | Asip Switzerland | AB |
256275 | Northman ltd. | AA |
256276 | Northman ltd. | AB |
192782 | Bungee | AB |
123213 | Dental Services | AB |
13123 | Ganza ldt | AB |
136275 | Pickermann | AA |
I tried the Join tool and the Union tool, but I can't figure it out.
Thank you in advance
Alfred
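For anyone reading along outside the tool, the rule described above (keep every row of Dataset 1, then append only those Dataset 2 rows whose Name does not occur in Dataset 1) can be sketched in plain Python. This is only an illustration of the logic, not the accepted solution from this thread; the function name `union_keep_first` is made up for the example, and the rows are copied from the post.

```python
# Rows copied from the post: (ID, Name, Attribut)
dataset1 = [
    (12287, "Carbroker", "AB"),
    (263258, "ASI", "AB"),
    (222872, "ASI", "AC"),
    (77195, "HR-Services", "AA"),
    (258228, "ASO Europe", "AB"),
    (291141, "Asip Switzerland", "AB"),
    (256275, "Northman ltd.", "AA"),
    (256276, "Northman ltd.", "AB"),
]
dataset2 = [
    (192782, "Bungee", "AB"),
    (123213, "Dental Services", "AB"),
    (258228, "ASO Europe", "AA"),
    (258228, "ASO Europe", "AB"),
    (258228, "ASO Europe", "AC"),
    (13123, "Ganza ldt", "AB"),
    (136275, "Pickermann", "AA"),
    (256276, "Northman ltd.", "AC"),
]


def union_keep_first(first, second):
    """Hypothetical helper: all rows of `first`, then the rows of
    `second` whose Name does not appear anywhere in `first`."""
    names_in_first = {name for _id, name, _attr in first}
    unmatched = [row for row in second if row[1] not in names_in_first]
    # Dataset 1 is passed through untouched, so its own duplicates survive.
    return list(first) + unmatched


result = union_keep_first(dataset1, dataset2)
for row in result:
    print(" | ".join(str(value) for value in row))
```

The comparison uses the Name column only, so the ID and Attribut values in Dataset 2 never influence which rows are dropped.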
Hi, thank you very much. It worked with the small dataset. In my original use case I just noticed that the ID can differ between the two sets.
For example, Dataset 1 has one row for "ASO Europe" with ID = 2:
ID | Name | Attribut |
12287 | Carbroker | AB |
263258 | ASI | AB |
222872 | ASI | AC |
77195 | HR-Services | AA |
2 | ASO Europe | AB |
291141 | Asip Switzerland | AB |
256275 | Northman ltd. | AA |
256276 | Northman ltd. | AB |
and Dataset 2 has three rows for "ASO Europe" with the IDs 1 and 2:
ID | Name | Attribut |
192782 | Bungee | AB |
123213 | Dental Services | AB |
1 | ASO Europe | AA |
1 | ASO Europe | AB |
2 | ASO Europe | AC |
13123 | Ganza ldt | AB |
136275 | Pickermann | AA |
256276 | Northman ltd. | AC |
The result should look like this:
ID | Name | Attribut |
12287 | Carbroker | AB |
263258 | ASI | AB |
222872 | ASI | AC |
77195 | HR-Services | AA |
2 | ASO Europe | AB |
291141 | Asip Switzerland | AB |
256275 | Northman ltd. | AA |
256276 | Northman ltd. | AB |
192782 | Bungee | AB |
123213 | Dental Services | AB |
13123 | Ganza ldt | AB |
136275 | Pickermann | AA |
In other words, if a Name from Dataset 1 matches a Name in Dataset 2, I want to remove the matching rows from Dataset 2 and keep the rest.
Sorry for changing the use case. I didn't know this dataset would cause an error.
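One possible explanation for a result that works on the first sample but not on this one is a match that includes the ID as well as the Name. That is an assumption on my part, since the earlier workflow is not shown here. The small Python sketch below contrasts the two match keys on a few rows taken from the revised data; `drop_matches` is a made-up helper name.

```python
# A few rows from the revised sample: (ID, Name, Attribut)
dataset1 = [
    (2, "ASO Europe", "AB"),
    (256276, "Northman ltd.", "AB"),
]
dataset2 = [
    (192782, "Bungee", "AB"),
    (1, "ASO Europe", "AA"),
    (1, "ASO Europe", "AB"),
    (2, "ASO Europe", "AC"),
    (256276, "Northman ltd.", "AC"),
]


def drop_matches(first, second, key):
    """Hypothetical helper: rows of `second` whose key is absent from `first`."""
    seen = {key(row) for row in first}
    return [row for row in second if key(row) not in seen]


# Matching on ID and Name: the "ASO Europe" rows with ID 1 slip through.
by_id_and_name = drop_matches(dataset1, dataset2, key=lambda row: (row[0], row[1]))

# Matching on Name only: every "ASO Europe" row is removed, whatever its ID.
by_name = drop_matches(dataset1, dataset2, key=lambda row: row[1])

print(by_id_and_name)
print(by_name)
```

With Name as the only key, the differing IDs no longer matter, which is the behaviour asked for above.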
You guys are awesome. Will try it after the meeting!
It doesn't work with the huge datasets. I noticed you were using the first datasets I posted.
I just need to remove the rows in Dataset 2 that have the same Name as a row in Dataset 1. The ID and the Attribut are a mess and can't be used as an identifier.
@noob_noodle It should work on a huge dataset as well. Can you provide some more sample data and the expected result?
It worked. I just combined it with a Record ID and that fixed it for me. Thank you.
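The thread does not say exactly how the Record ID was used, so the following is only a guess at the idea, written in Python: tag each Dataset 2 row with a sequential record ID, collect the record IDs of the rows whose Name occurs in Dataset 1, and drop exactly those rows. The helper names `add_record_id` and `remove_by_name` are invented for this sketch.

```python
# A few rows from the revised sample: (ID, Name, Attribut)
dataset1 = [
    (2, "ASO Europe", "AB"),
    (256275, "Northman ltd.", "AA"),
    (256276, "Northman ltd.", "AB"),
]
dataset2 = [
    (192782, "Bungee", "AB"),
    (123213, "Dental Services", "AB"),
    (1, "ASO Europe", "AA"),
    (1, "ASO Europe", "AB"),
    (2, "ASO Europe", "AC"),
    (13123, "Ganza ldt", "AB"),
    (136275, "Pickermann", "AA"),
    (256276, "Northman ltd.", "AC"),
]


def add_record_id(rows):
    """Prefix each row with a 1-based record ID: (RecordID, ID, Name, Attribut)."""
    return [(record_id, *row) for record_id, row in enumerate(rows, start=1)]


def remove_by_name(first, second):
    """Hypothetical helper: drop rows of `second` whose Name occurs in `first`,
    identifying the rows to drop by record ID rather than by ID or Attribut."""
    names_in_first = {row[1] for row in first}
    tagged = add_record_id(second)
    # In a tagged row the Name sits at index 2, after RecordID and ID.
    matched_ids = {rec[0] for rec in tagged if rec[2] in names_in_first}
    kept = [rec for rec in tagged if rec[0] not in matched_ids]
    kept.sort(key=lambda rec: rec[0])  # restore the original row order
    return [rec[1:] for rec in kept]   # strip the record ID again


remaining = remove_by_name(dataset1, dataset2)
combined = dataset1 + remaining
print(combined)
```

The record ID gives every row a unique handle, so rows can be removed and re-ordered reliably even when ID and Attribut are not usable as identifiers.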