Remove duplicates from 2 different datasets

I have 2 datasets with the same column headers. I want to do a lookup and remove the rows that have common values in one column (say, Column A). What would be the best approach? I would appreciate it if you could show this with some dummy examples.


My suggestion is to load the 2 datasets with separate Input Data tools. Add an identifier column to each, with the value "dataset 1" or "dataset 2". Then Union the 2 together, use a Unique tool to remove duplicates (make sure the identifier column is not one of the fields used to test for uniqueness), and split them back into 2 datasets with a Filter tool on the identifier column.


This method keeps the first instance found in each set of duplicates and removes the rest.
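
Since you asked for a dummy example, here is a rough sketch of the same logic in plain Python rather than Alteryx tools, just to show what happens to the rows. The sample rows, the "source" identifier column, and the function names are all made up for illustration; Column A is assumed to be the lookup column.

```python
# Hypothetical dummy data; "A" is the column checked for duplicates.
dataset_1 = [{"A": 1, "B": "x"}, {"A": 2, "B": "y"}, {"A": 3, "B": "z"}]
dataset_2 = [{"A": 3, "B": "p"}, {"A": 4, "B": "q"}, {"A": 1, "B": "r"}]


def union_with_identifier(d1, d2):
    """Tag each row with its source dataset, then stack the two (the Union step)."""
    tagged_1 = [dict(row, source="dataset 1") for row in d1]
    tagged_2 = [dict(row, source="dataset 2") for row in d2]
    return tagged_1 + tagged_2


def unique_on(rows, key):
    """Mimic the Unique step: first occurrence goes to `unique`, later ones to `dups`."""
    seen = set()
    unique, dups = [], []
    for row in rows:
        if row[key] in seen:
            dups.append(row)
        else:
            seen.add(row[key])
            unique.append(row)
    return unique, dups


def split_by_source(rows, label):
    """Mimic the Filter step: keep rows from one source and drop the identifier."""
    return [
        {k: v for k, v in row.items() if k != "source"}
        for row in rows
        if row["source"] == label
    ]


combined = union_with_identifier(dataset_1, dataset_2)
unique_rows, duplicate_rows = unique_on(combined, "A")  # identifier is not part of the key
result_1 = split_by_source(unique_rows, "dataset 1")
result_2 = split_by_source(unique_rows, "dataset 2")
```

With this dummy data, dataset 1 keeps all three of its rows because it comes first in the union, while dataset 2 keeps only the row with A = 4.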


If instead you want to remove the duplicates from both datasets, you can take the D output of the Unique tool, join it back to the original datasets 1 and 2, and take the left (or right) output of each join, i.e. the records from the original dataset that did not match a duplicate.
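
As a rough illustration of that variant, here is a plain Python sketch (not Alteryx itself) that drops every row whose Column A value is duplicated, from both datasets. The sample rows and helper names are hypothetical, and Column A is assumed to be the join field.

```python
# Hypothetical dummy data; "A" is the column checked for duplicates.
dataset_1 = [{"A": 1, "B": "x"}, {"A": 2, "B": "y"}, {"A": 3, "B": "z"}]
dataset_2 = [{"A": 3, "B": "p"}, {"A": 4, "B": "q"}, {"A": 1, "B": "r"}]


def duplicated_keys(rows, key):
    """Collect the key values that occur more than once (what the D output exposes)."""
    seen, dups = set(), set()
    for row in rows:
        value = row[key]
        if value in seen:
            dups.add(value)
        else:
            seen.add(value)
    return dups


def unmatched_rows(rows, keys, key):
    """Mimic the unjoined output of a join: rows whose key is NOT in `keys`."""
    return [row for row in rows if row[key] not in keys]


dup_keys = duplicated_keys(dataset_1 + dataset_2, "A")
clean_1 = unmatched_rows(dataset_1, dup_keys, "A")
clean_2 = unmatched_rows(dataset_2, dup_keys, "A")
```

With this dummy data, only A = 2 survives in dataset 1 and only A = 4 in dataset 2. Note that a value repeated within a single dataset would also be treated as a duplicate here, just as it would after the Union.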

Hi @Rajuguide, see below for a way to accomplish what you're looking for.