Hi,
I was trying to solve one question where I have 2 data sets with customers went to movie theater A & B and wanted to find out the combination of unique customers who went to only one movie theater, not both. Also, the average age of the guests.
I used both union and join tool to find out the the unique combinations of customers but both the tools gave me different results.
Can anyone please advice which tool be using and when?
Thanks,
Archit
Solved! Go to Solution.
Hi Archit, the Join tool is more applicable in this case.
The J output of the Join tool gives you the common universe between the two datasets (which in your case is folks who went to both theaters), and L & R outputs give out those that are not mutually inclusive (i.e. people who went to one theater and not the other. Refer to the image below to visually understand the univ. of data coming out of the Join tool.
I have put together a sample implementation of the Join tool on your dataset to help further.
Hi @architgharat ,
1. join tool will join the data sets and will provide you 3 outputs.
a. L - where customers are only available in Left table
b. J - where customers are in both tables
c. R - where customers are only available in Right table.
2. union tool - it will simply join all the tables
based on your requirements you should be using both.
1. use Join to find out customers who went to only 1 cinema
2. use union to find out average if you need to evaluate average (you can use unique app to filter out client to unique)
Hope this information helps. 🙂
I think the easiest way to differentiate between the two is to think about what your end goal is - are you trying to expand your dataset horizontally or vertically?
When I say vertically, I mean similar to appending two datasets together. Let's say I have a sales transaction file for the North, South, East, and West regions and want to get a complete dataset. In excel, I would simply copy and paste all four together. This is a Union. It's simply stacking two or more datasets together.
Conversely, let's say I want to expand a dataset horizontally. I have a dataset and I want to add additional pieces of information to it. Traditionally, this would be accomplished using a vlookup in excel. This is a Join. It adds additional information based on common fields.
https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Join-vs-Union/ta-p/92694
Hope this helps!
Thank you, great example both in terms of understanding data as vertical / horizonal and with the excel reference