Hello everyone,
I am a little lost on how to do the following join in alteryx.
Dataset 1:
Col A | Col B | Col C |
id1 | ABC | 123 |
id2 | DEF | 456 |
id3 | GHI | 567 |
id4 | JKL | 789 |
id5 | MNO | 322 |
Dataset 2
Column A | Column B |
id1 | U |
id1 | V |
id2 | W |
id2 | X |
id2 | Y |
id3 | Z |
Final dataset:
Col A | Col B | Col C | Column B |
id1 | ABC | 123 | U,V |
id2 | DEF | 456 | W,X,Y |
id3 | GHI | 567 | Z |
id4 | JKL | 789 | |
id5 | MNO | 322 |
I want to be able to join the two files with the final column data being appended into the same column and row separated by columns as shown in the final dataset.
Any help would be highly appreciated. Thanks!!
Solved! Go to Solution.
Hello Matt,
Thank you for taking the time to provide a solution.
I had a follow up question; with the following workflow, I can see that there are duplicate items present in the final dataset, after the concatenate step in summarize tool. Is there a way to check if value from column B is already present in final dataset and if it isn't then append it to the column??
Thanks so much!
Rev
The nice thing about the way we have it set up here is that Alteryx is reading the entirety of Dataset 2 each time you run the workflow, so you won't end up with any duplicates in the end result; it will do the whole process each time and never append a duplicate by mistake.
If this were a more complicated example, with a database that updates nightly for instance, and you wanted to only process incremental changes, the workflow would look quite different and would likely be broken into 2 or more workflows.
I would suggest persisting your data into a database, excel file, or .yxdb (whatever makes sense for your application), then read that file at the start of a second workflow. You could build in some logic, likely through a Join, to say if this combination (id2 and X, for example) already exists, then no need to update.
Make sense?
Thanks Matt, this has been very helpful.
User | Count |
---|---|
19 | |
14 | |
13 | |
9 | |
8 |