
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Join Problem - More records returning

sriniprad08
11 - Bolide

Hi Team,

Hope you are well.

I have a problem. I am trying to join:

1) City_1 with 2) Name in the country_city file.

Ideally the output should have 2,641 records, the same as City_1, but the join is returning double that.

 

can you please help?

thanks,

 

7 REPLIES
ShankerV
17 - Castor

Hi @sriniprad08 

 

The problem is in the data set itself.

 

In the City file, take Amsterdam as an example.

In the world cities file, Amsterdam appears twice: once with country Netherlands and once with United States.

So a single row in the city file joins to 2 rows in the world cities file, doubling the record count coming from the cities file.
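Outside Alteryx, the same one-to-many behaviour can be sketched in pandas (a minimal illustration with made-up rows matching the Amsterdam example):

```python
import pandas as pd

# One row for Amsterdam in the city file
city = pd.DataFrame({"City_1": ["Amsterdam"]})

# Two rows for Amsterdam in the world cities file
world = pd.DataFrame({
    "Name": ["Amsterdam", "Amsterdam"],
    "Country": ["Netherlands", "United States"],
})

# An inner join matches the single city row against both world rows
joined = city.merge(world, left_on="City_1", right_on="Name")
print(len(joined))  # 2 -- the single input row comes out twice
```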

 

ShankerV_0-1678876659415.png

 

Many thanks

Shanker V

 

sriniprad08
11 - Bolide

Hi @ShankerV ,

 

Thank you. But even if I select only City_chk and Name and exclude all other fields, the output is still doubled.

 

 

ShankerV
17 - Castor

Hi @sriniprad08 

 

Taking a sample use case to explain the problem.

 

ShankerV_0-1678877022038.png

Dataset 1:

 

ShankerV_1-1678877064634.png

Dataset 2:

ShankerV_2-1678877084991.png

 

So if dataset 2 has unique values for cities, we can expect the same record count in the output after joining.

 

Many thanks

Shanker V

alexnajm
17 - Castor

Selecting certain fields doesn’t change the number of rows that are causing the duplication - I’d use a Unique tool on the given fields (or perhaps some Group Bys in a Summarize) to get a unique list of values. 
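In pandas terms (a sketch of the same idea, not Alteryx itself; the city rows are hypothetical), both suggestions boil down to de-duplicating the lookup file on its key fields before joining:

```python
import pandas as pd

world = pd.DataFrame({
    "Name": ["London", "London", "Paris"],
    "Country": ["Canada", "United Kingdom", "France"],
})

# Unique-tool style: keep one row per city name
unique_first = world.drop_duplicates(subset="Name")

# Summarize-tool style: group by the key field
grouped = world.groupby("Name", as_index=False).first()

print(len(unique_first), len(grouped))  # 2 2
```

Either way, the lookup side ends up with one row per key, so the join can no longer multiply records.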

ShankerV
17 - Castor

Hi @sriniprad08 

 

Yes, that's true, because the join criterion (like a VLOOKUP) is City.

 

In the screenshot below, the Select tool settings are only there to deselect a few columns that are not needed in the output.

 

ShankerV_0-1678877314270.png

 

ShankerV
17 - Castor

Hi @sriniprad08 

 

As @alexnajm suggested, you can use a Unique tool, which keeps the first value. In the case of London, it will keep the first record and take the country as Canada.

So, depending on the business requirement, skip the duplicated rows in the dataset 2 world cities file.
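The caveat above can be sketched in pandas (hypothetical rows): keeping the first occurrence silently decides which country survives, so make sure that choice matches the business requirement.

```python
import pandas as pd

world = pd.DataFrame({
    "Name": ["London", "London"],
    "Country": ["Canada", "United Kingdom"],
})

# keep="first" mirrors the Unique tool: the Canada row wins
deduped = world.drop_duplicates(subset="Name", keep="first")
print(deduped["Country"].tolist())  # ['Canada']
```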

 

 

ShankerV_0-1678877874826.png

 

The root cause is the 1:2 record ratio between dataset 1 and dataset 2.

 

You can expect the same record count in the output if the data are in a 1:1 ratio.
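If you want to verify the 1:1 assumption programmatically rather than by eye, pandas offers a `validate` option on `merge` that fails loudly when a side has duplicate keys (again a sketch with made-up rows, not the Alteryx workflow itself):

```python
import pandas as pd

city = pd.DataFrame({"City_1": ["Amsterdam", "Berlin"]})
world = pd.DataFrame({
    "Name": ["Amsterdam", "Amsterdam", "Berlin"],
    "Country": ["Netherlands", "United States", "Germany"],
})

# validate="one_to_one" raises MergeError if either side has
# duplicate join keys, instead of silently multiplying rows
try:
    city.merge(world, left_on="City_1", right_on="Name", validate="one_to_one")
    print("join is 1:1")
except pd.errors.MergeError:
    print("join is not 1:1 - deduplicate the world cities file first")
```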

 

Many thanks

Shanker V

 

alexnajm
17 - Castor

Good call @ShankerV!
