Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Combine Keyword in the problem statement

rohit782192
11 - Bolide

Hello,

 

When a problem statement uses the keyword "combine", e.g. "Combine two data sets", what does combine mean:

 

Join or Union?

 

Can anyone help with this?

1 REPLY
danilang
19 - Altair

Hi @rohit782192 

 

"Combine" can mean either join or union, depending on the context and your purpose for combining them.  

 

In general, if you have very dissimilar datasets that are related by a relatively small number of key fields, you will usually join them.  An example is a list of employee details (name, employee number, age, location, etc.) and a list of their monthly sales.  You join the two datasets on employee number, so that you can "relate" the sales figures to a specific employee.  In very specific and relatively rare cases, you might need to union them by column position, grouped by employee.  In this format, with the employee data mixed in with the sales data, it becomes almost impossible to do further analysis, but it makes it very easy to output to a report.
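To make the employee example concrete outside of Alteryx, here is a minimal Python sketch of an inner join on employee number. The field names (emp_no, name, location, month, sales), the sample rows, and the inner_join helper are all made up for illustration; in Designer you would do the same thing with the Join tool.

```python
# Hypothetical sample data; the field names and values are assumptions.
employees = [
    {"emp_no": 1, "name": "Ana", "location": "Toronto"},
    {"emp_no": 2, "name": "Ben", "location": "Calgary"},
]
monthly_sales = [
    {"emp_no": 1, "month": "2021-01", "sales": 1200},
    {"emp_no": 1, "month": "2021-02", "sales": 900},
    {"emp_no": 2, "month": "2021-01", "sales": 1500},
]


def inner_join(left, right, key):
    """Pair each right row with every left row that has the same key value."""
    # Index the left rows by key so each lookup is a dictionary access.
    index = {}
    for left_row in left:
        index.setdefault(left_row[key], []).append(left_row)

    joined = []
    for right_row in right:
        for left_row in index.get(right_row[key], []):
            # Merge the fields of both rows into one wider row.
            joined.append({**left_row, **right_row})
    return joined


joined = inner_join(employees, monthly_sales, "emp_no")
for row in joined:
    print(row)
```

Each output row carries both the employee details and one month of sales, which is what lets you relate a sales figure to a specific employee.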

 

In contrast, if you have two very similar datasets with nearly identical schemas, the method that you use to combine them can be either join or union, depending on what you're doing with the combined dataset.  Let's say that you have a dataset with monthly sales data for 2020 and another with the same fields for 2021.  If you want to forecast data for 2022, you use a union to create a longer dataset for the forecast algorithm to use.  If you want to compare year over year, you'll join them on month, department and item to create a wider dataset with the monthly data for each year on the same row.  Then you can compare Jan 2020 to Jan 2021 using a simple formula.
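The 2020/2021 case can be sketched the same way in plain Python. The sample rows, the field names, and the yoy_change column are assumptions for illustration only; in Designer the two halves correspond roughly to a Union tool and a Join tool followed by a Formula tool.

```python
# Hypothetical sample data with identical fields for each year.
sales_2020 = [
    {"year": 2020, "month": 1, "department": "Toys", "item": "Kite", "sales": 100},
    {"year": 2020, "month": 2, "department": "Toys", "item": "Kite", "sales": 120},
]
sales_2021 = [
    {"year": 2021, "month": 1, "department": "Toys", "item": "Kite", "sales": 130},
    {"year": 2021, "month": 2, "department": "Toys", "item": "Kite", "sales": 90},
]

# Union: stack the rows to get a longer dataset (e.g. as forecast input).
unioned = sales_2020 + sales_2021

# Join: match rows on month, department and item to get a wider dataset
# with both years on the same row.
KEYS = ("month", "department", "item")


def key_of(row):
    """Build the composite join key for a row."""
    return tuple(row[k] for k in KEYS)


# Assumes each key appears at most once per year.
by_key_2021 = {key_of(row): row for row in sales_2021}

wide = []
for row in sales_2020:
    match = by_key_2021.get(key_of(row))
    if match is not None:
        wide.append({
            **{k: row[k] for k in KEYS},
            "sales_2020": row["sales"],
            "sales_2021": match["sales"],
            # The "simple formula": year-over-year difference.
            "yoy_change": match["sales"] - row["sales"],
        })

print(len(unioned), "rows after union")
for row in wide:
    print(row)
```

The union keeps the same columns and adds rows; the join keeps one row per month, department and item and adds columns.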

 

So the method that you use to combine is determined by what you want to do with the result.

 

Dan
