Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Data Join Manipulation Issue

rev
7 - Meteor

Hello everyone,

I am a little lost on how to do the following join in alteryx.

 

Dataset 1:

Col ACol BCol C
id1ABC123
id2DEF456
id3GHI567
id4JKL789
id5MNO322

 

Dataset 2

Column AColumn B
id1U
id1V
id2W
id2X
id2Y
id3Z

 

 

Final dataset:

Col ACol BCol CColumn B
id1ABC123U,V
id2DEF456W,X,Y
id3GHI567Z
id4JKL789 
id5MNO322 

 

I want to be able to join the two files with the final column data being appended into the same column and row separated by columns as shown in the final dataset. 
Any help would be highly appreciated. Thanks!!

 

 

 

4 REPLIES 4
Matt_Curtis
9 - Comet

Hi Rev,

 

Please see the attached workflow as one way to solve for your join.  You need to summarize Dataset 2 (group by Column A, concatenate Column B) first, then perform a left join using a combination of the Join and Union tools.

 

Screenshot of the solution:

 

join_example.JPG

rev
7 - Meteor

Hello Matt,

Thank you for taking the time to provide a solution. 

I had a follow up question; with the following workflow, I can see that there are duplicate items present in the final dataset, after the concatenate step in summarize tool. Is there a way to check if value from column B is already present in final dataset and if it isn't then append it to the column??

 

Thanks so much!

Rev

 

Matt_Curtis
9 - Comet

The nice thing about the way we have it set up here is that Alteryx is reading the entirety of Dataset 2 each time you run the workflow, so you won't end up with any duplicates in the end result; it will do the whole process each time and never append a duplicate by mistake.

 

If this were a more complicated example, with a database that updates nightly for instance, and you wanted to only process incremental changes, the workflow would look quite different and would likely be broken into 2 or more workflows.

 

I would suggest persisting your data into a database, excel file, or .yxdb (whatever makes sense for your application), then read that file at the start of a second workflow.  You could build in some logic, likely through a Join, to say if this combination (id2 and X, for example) already exists, then no need to update.

 

Make sense?

rev
7 - Meteor

Thanks Matt, this has been very helpful.

Labels
Top Solution Authors