Hi,
I have some data like:
In 1st file:
123456 closed status
123456 closed status
and in another file:
123456 closed status
123456 closed status
There are 100s of rows more up and down of this with different IDs (first column) etc.
When I am applying join on 1st column (ID), this particular value is being duplicated and I am receiving 4 of the same rows in Joined output.
123456 closed status
123456 closed status
123456 closed status
123456 closed status
Like this.
Now many other rows are also common in both files, but the joined output gives only 1 row for them.
Solved! Go to Solution.
@ShantanuDagar
When you are joining the tables together, it's matching them both to eachother like this:
The first row matches to the other table twice, the second row matches to the other table twice. It's called a many to many join.
Hopefully that makes sense.
So Join tool does many to many join always?
Is there any way we can change this "many to many" to "one to one" or other types in the tool configuration itself? or we have to apply Unique tool only after join tool to filter duplicates...
@ShantanuDagar If there is a duplicate records the join tool follows the many-many join and if you remove the duplicates then you will get one-one join.
@ShantanuDagar No, it doesn't do that by default.
Think about this, you have many values (that are the same) in the column of your first table, that match to many values (that are the same) in the column of your second table.
You can try to select another column to join on aswell, this can help make the join unique in terms of creating that one-to-one or one-to-many relationship:
If the tables just have duplicate records, you could use the unique tool before the join on either/both of the tables. But I doubt this is the case. I think you need to select more columns to join on in the Join Tool's configuration.