Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Right Join ( whey Right join has More records )

NiranjanK1
8 - Asteroid

Hello experts,

 

My Left table has 77MB and Right Table has 27MB, i am trying to pull the maching records from the Left table(Right Join).
when i checked output its coming more than 6GB.

 

Doubt on below points please help me

 

  1. how to get maching records from Left table.
  2. Do i need to unselect left table fields to reduce the size.
  3. if i unselect the left table fields, How can i get the matching records from the Left table

Joins.png

Regards,
Niranjan

9 REPLIES 9
Emil_Kos
17 - Castor
17 - Castor

Hi @NiranjanK1,

 

This post explains really well why you see duplicates here:

 

https://community.alteryx.com/t5/Alteryx-Designer-Discussions/How-do-Alteryx-generate-duplicate-rows...

 

You are joining the data on the columns that have duplicate positions in both data bases. The easiest way (it isn't best practice) is to use the unique tool in order to remove the positions that are duplicated and keep only one record for the position that you are using in join tool.

 

In short, you need to clean your data. Remove the duplicate positions and the workflow will work for you.

 

Please mark my post as a solution if this was helpful for you. If you have more questions please let me know!

NiranjanK1
8 - Asteroid

@Emil_Kos  Thanks for your reply

Both the data sources have same set of fields, if im using Right outer join do i need to un select left table fields. im not clear on below points, please suggest me

 

  1. how to get maching records from Left table.
  2. Do i need to unselect left table fields to reduce the size.
  3. if i unselect the left table fields, How can i get the matching records from the Left table

Regards,

Niranjan

Emil_Kos
17 - Castor
17 - Castor

Hi,

 

I think you don't need to unselect anything and the problem is something else. 

 

For example, in both files, you are using ID column 

 

ID Column Left TableID Column Right Table
11
11
11
21
22
32

 

You would expect to join the data and have 6 records but because you have duplicate values in the ID column in both sets you will multiply the results 

 

If you will use unique column and remove the duplicated ID values the workflow will start working for you. 

Emil_Kos
17 - Castor
17 - Castor

Hi @NiranjanK1,

 

I have created a workflow for you so it will be easier for me to explain.

 

Emil_Kos_0-1608631547497.png

 

In the example provided in a previous post, there have been duplicates in the ID column. That is the reason that we have 16 row of the data generated just from 6 lines:

 

Emil_Kos_1-1608631605774.png

 

If I have used the unique column I got only 6 output lines as there are duplicates only in one of the data sources. I achieved it by using a unique tool. 

 

Emil_Kos_2-1608631647480.png

I hope it is easier to understand.

NiranjanK1
8 - Asteroid

@Emil_Kos thank you, Now I got better under standing on the concept .

But i have both the set of data source fields in my output. Do i need to un-select the Left table fields. Kindly suggest me. please look have a look at below images. 

 

Joins.png

Emil_Kos
17 - Castor
17 - Castor

Hi @NiranjanK1,

 

Happy to hear that my post helped you to clarify what is happening in a workflow. 


Regarding your second question, it is a Really good question. Please click option and click deselect duplicated fields:

 

Emil_Kos_0-1608632901294.png

 

Thanks to that you will keep only one column from each data sets and you will not have plenty of duplicated columns.

 

NiranjanK1
8 - Asteroid

@Emil_Kos sorry to bother you again, If i use options and click deselect duplicated fields.

It is removing the data from right table. But im using Right OuterJoin. Do i need to remove the the fields from the left table.

Joins.png

Emil_Kos
17 - Castor
17 - Castor

Hi @NiranjanK1,

 

It will not matter the output will be exactly the same. Unless the column names differ and you want to keep the column names from the right table.

 

If that is the case I would suggest switching left and right connection:

 

Emil_Kos_0-1608633748803.png

 

NiranjanK1
8 - Asteroid

@Emil_Kos thanks a lot, Yes make sense i tested as well, got the expected output. 

Labels