I am combining two data sources and individuals are listed more than once on each data source resulting in the output duplicating several time. Example: individual listed on one data source 14 times and another 24 times, resulting in output appearing 336 times (14 x 24). How do I change my workflow so that the data only appears 14 times for the one date source column and 24 times for the other source column?
Also, please disregard my previous question sent on 6.10.2021.
Lori Schwengler
Below is a copy of my workflow:
Hi @Lkschwe
This is likely happening in at least one of your join tools. Are there additional fields you can join on to prevent this? Or are you able to summarize the data before the joins? Without seeing some sample data it's tough to say exactly what the solution would be.
On one file it lists EMPLID, Name & pathway and on the other file it lists EMPLID, Name & Content, so in addition to EMPLID, the name would be the only other field to join which is listed as many time as as the EMPLID. If I were to use the Summarize tool and group by the EMPLID, what option do I select for Content to show the content titles and not sum or count of total?
Hi @Lkschwe, it depends on what your output needs to be. Right now your join is producing every combination of an employee's pathway and content values. Would it maybe be appropriate to concatenate the different pathways or content values into one row versus being broken out?
If possible, post a sample of the 2 data sets along with your expected output and we can try to see what will work best for your fact pattern.
Any sample data you can provide would be helpful, the headers alone don't allow me to see where the issue lies.
Is it possible to call me directly at (612) 671-4213 and walk through the workflow?
Hi Luke - I have attached sample data (Pathway & Views) and expected output. In reviewing each file, you will notice that the individual can be listed on several roles (both sample data reports). This results in each individual appearing several times (example: on Pathway report 20 times/rows and Views report 24 times/rows, resulting in 480 rows in the output). I have also attached a copy of the Alteryx workflow so you an see that there are other inputs (in addition to files attached) and circled where the files attached are located in the workflow.
Hi @Lkschwe
Is there a table that designates which content goes to which pathway? I think that's where the issue is coming. If we had a way to lookup which course goes to which pathway I think the join would go much more smoothly.
Hi Luke - currently, there is not a table, but if I create a table, where in the workflow would I add the table?
User | Count |
---|---|
19 | |
15 | |
13 | |
9 | |
8 |