Alteryx Designer Desktop Discussions

faiqz · ‎04-11-2022

Hi,

I would like to know why the number of records increases after I join 2 datasets.

For example, below is the workflow:

The Select tool outputs 1,972,634 records.

The dataset postcode_states has 53,441 records.

But after I join the two datasets, I get 250,812,652 records.

Does anyone know why this happens and how to solve it? From my understanding, the record count should stay the same or decrease, depending on how many records are joined.

Thank you for your time.

atcodedog05 · ‎04-11-2022

Hi @faiqz

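The likely issue is that your join key has duplicates in both datasets, which causes a many-to-many join: every record with a given key on one side is paired with every record with the same key on the other side.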

Example

You would need to make the key unique in at least one dataset to prevent the data explosion.
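To make the arithmetic concrete, here is a small Python sketch (not Alteryx itself, just an illustration). For each key value, an inner join produces (occurrences on the left) x (occurrences on the right) rows. The postcode values and the helper name `inner_join_row_count` are made up for the example. For scale, the counts in the question work out to roughly 127 joined rows per record coming out of the Select tool, which fits a key that repeats many times on the postcode_states side.

```python
from collections import Counter

def inner_join_row_count(left_keys, right_keys):
    """Number of rows an inner join on a single key would produce."""
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    # Each key contributes (occurrences on left) x (occurrences on right) rows.
    # Counter returns 0 for keys missing on the right, so unmatched keys add nothing.
    return sum(n * right_counts[key] for key, n in left_counts.items())

# Hypothetical data: postcode 50000 appears 3 times on the left and twice on the right.
left = [50000, 50000, 50000, 40000]
right = [50000, 50000, 40000]

many_to_many = inner_join_row_count(left, right)  # 3*2 + 1*1 = 7 rows from 4 and 3 inputs

# Make the key unique on one side (keep the first occurrence of each key).
deduped_right = list(dict.fromkeys(right))
many_to_one = inner_join_row_count(left, deduped_right)  # 3*1 + 1*1 = 4 rows

print(many_to_many, many_to_one)
```

Once the key is unique on one side, the joined output can never have more rows than the other side. In Alteryx, the Unique tool placed before the Join is one way to get there, assuming it does not matter which of the duplicate rows is kept.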

This is a video I could find on the topic:

https://www.youtube.com/watch?v=qLMwRxKDhxQ

Hopefully, someone can pitch in and help you understand this in more depth.

Hope this helps : )

Join 2 datasets