Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Consolidate data using join

Miki2021
8 - Asteroid

Hello there,

 

Is there a faster way to consolidate two data sets into one based on sales order number and sales order batch number (also identical to invoice number line description) 

 

Dataset A:

Sales order number Sales order batch number customer name delivery 
P1000101ABCDenmark 
P1000102ABCUK
P1000201XYZFinland 

 

Dataset B:

Invoice number invoice number line description sales order number Invoice amount 
90000101P10001EUR100
90000202P10001EUR200
90000301P10002EUR300
8 REPLIES 8
atcodedog05
22 - Nova
22 - Nova

Hi @Miki2021 

 

You can use join tool for this and join on keys.

 

Workflow:

atcodedog05_0-1635771509670.png

 

Interactive lesson on join tool : https://community.alteryx.com/t5/Interactive-Lessons/Joining-Data/ta-p/76634 

 

Hope this helps : )

HomesickSurfer
12 - Quasar

Hi @Miki2021 

 

EDIT:  @atcodedog05 beat me to it...I concur with his solution.

 

Config the Join tool as follows:

 

Capture.PNGCapture2.PNG

MarqueeCrew
20 - Arcturus
20 - Arcturus

@Miki2021 ,

 

The join tool is the "fastest",  but how you intend to use the data may lead to the discovery of ways to improve the speed of the join operation.  In your sample data you have a relatively slim set of files.  If you were not going to make use of a field (e.g. customer name), the JOIN tool allows you to de-select the field from output.  To make the JOIN faster, you could add a SELECT tool BEFORE the JOIN tool.  Remove any unwanted fields there and you will run faster.  You wouldn't carry the groceries up 10 flights of stairs to decide which items were going to be left in your car.

 

Now that you are only carrying essential data with you, if there are data RECORDS that are unnecessary or can be summarized, you should take that action before the JOIN.  Suppose you don't care about UK data, if you FILTER the data out ahead of the JOIN then it will run faster.  If you have data that you can SUMMARIZE first, it is faster to summarize before the JOIN and then join after the summary.

 

Lastly, if you want to try using AMP then in your configuration you can use AMP and see if it speeds things up for you.

 

Cheers,


Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Miki2021
8 - Asteroid

@atcodedog05 

Thanks! 

 

I also watched that video 🙂 

atcodedog05
22 - Nova
22 - Nova

_

Miki2021
8 - Asteroid

@HomesickSurfer 

thanks! The join tool will disregard those data that are not matched based on specific fields.  

 

Is it better to use Replace/Find tool to append the data?

atcodedog05
22 - Nova
22 - Nova

Hi @Miki2021 

 

You can use join multiple tool which would give you full outer join and not leave out data.

 

Workflow:

atcodedog05_0-1635772633310.png

 

Hope this helps : )

 

HomesickSurfer
12 - Quasar

Hi @Miki2021 

 

The FindReplace will only match a single field and not meet your requirements.

The solution offered by @atcodedog05 @ https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Consolidate-data-using-join/m-p/841240... is optimal.

Labels