Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Rows reduces after inner join

KBH
6 - Meteoroid

Hello 

 

The input file is a hierarchal  structure, I have 176765 records 

My output result with self-join ( L, J, R) =166441

 

 my client wants the same input number = output number which is impossible to do ?

 

How can I explain to him the lines are not reduced but they have been put as right fields?

 

Thank you in advance 

Khaoula 

 

 

KBH_0-1634224614300.png

 

7 REPLIES 7
DawnDuong
13 - Pulsar
13 - Pulsar

hi @KBH 

 

I will try to show a Venn diagram and use small numbers of instances to illustrate. But more importantly to check that the Join actually performs correctly.

 

Assuming that the Join is done based on Unique conditions, then the following equations hold:

L output + J output = L input numbers of records

R output + J output = R input numbers of records

 

with this, you can mathematically prove the following inequation:

L output + R output + J output < L input + R input

 

Alternatively, in statistics, the formula is: 
A union B = A + B - A intersects B.

So naturally A union B < A + B...

 

Hope this helps.

Dawn.

KBH
6 - Meteoroid

hello 

 

thank you for the explanation but my 

 

L output + J output = L input numbers of records correct I find the same number  input in L 

R output + J output = R input numbers of records not correct I don't find the same number of input 

 

how i can slove this ?

atcodedog05
22 - Nova
22 - Nova

Hi @KBH 

 

There is a possibility of R output + J output > R in this case there might one-to-many joins happening i.e 1 row of R can be joined with more than 1 row of L this way the number of rows might increase. Refer below.

 

atcodedog05_0-1634638409909.png

Hope this helps : )

 

KBH
6 - Meteoroid

thank you so much that explains 

atcodedog05
22 - Nova
22 - Nova

Happy to help : ) @KBH 

Cheers and have a nice day!

DawnDuong
13 - Pulsar
13 - Pulsar

hi @KBH 

Re. your question, the condition for the equation to hold is if the Join condition is unique i.e. always one to one.

If you have one to many in either of L or R then the equation does not hold and this is something that i always check whenever I use Join Tool.

For the more critical step, i usually include a Recon Checking container just to check the join logic, especially when we do not 100% know the data structure and how each record is uniquely identified (or not).

Cheers,

Dawn.

KBH
6 - Meteoroid

hi @DawnDuong 

Thank you for clarifying  it 

 

What is a reckon checking container ?

 

 

Thank you in advance 

Khaoula 

 

Labels