Alteryx Designer Desktop Discussions

rev · ‎08-02-2017

Hello everyone,

I am a little lost on how to do the following join in alteryx.

Dataset 1:

Col A	Col B	Col C
id1	ABC	123
id2	DEF	456
id3	GHI	567
id4	JKL	789
id5	MNO	322

Dataset 2

Column A	Column B
id1	U
id1	V
id2	W
id2	X
id2	Y
id3	Z

Final dataset:

Col A	Col B	Col C	Column B
id1	ABC	123	U,V
id2	DEF	456	W,X,Y
id3	GHI	567	Z
id4	JKL	789
id5	MNO	322

I want to be able to join the two files with the final column data being appended into the same column and row separated by columns as shown in the final dataset.
Any help would be highly appreciated. Thanks!!

Matt_Curtis · ‎08-02-2017

Hi Rev,

Please see the attached workflow as one way to solve for your join. You need to summarize Dataset 2 (group by Column A, concatenate Column B) first, then perform a left join using a combination of the Join and Union tools.

Screenshot of the solution:

rev · ‎08-03-2017

Hello Matt,

Thank you for taking the time to provide a solution.

I had a follow up question; with the following workflow, I can see that there are duplicate items present in the final dataset, after the concatenate step in summarize tool. Is there a way to check if value from column B is already present in final dataset and if it isn't then append it to the column??

Thanks so much!

Rev

Matt_Curtis · ‎08-03-2017

The nice thing about the way we have it set up here is that Alteryx is reading the entirety of Dataset 2 each time you run the workflow, so you won't end up with any duplicates in the end result; it will do the whole process each time and never append a duplicate by mistake.

If this were a more complicated example, with a database that updates nightly for instance, and you wanted to only process incremental changes, the workflow would look quite different and would likely be broken into 2 or more workflows.

I would suggest persisting your data into a database, excel file, or .yxdb (whatever makes sense for your application), then read that file at the start of a second workflow. You could build in some logic, likely through a Join, to say if this combination (id2 and X, for example) already exists, then no need to update.

Make sense?

rev · ‎08-04-2017

Thanks Matt, this has been very helpful.

Alteryx Designer Desktop Discussions

Data Join Manipulation Issue

Re: How to select columns dynamically using number...

Re: Issue when using Block Until Done and Power BI...

Example workflow for setting up a custom list to u...

Re: Firm names parse

Re: Help with Multi-Row formula