Hello,
I am stuck on a current use case and am unsure on the best way to approach this.
I have a large data set that contains unique identifiers. From this I am determining some separate output files however I need to try and combine the data as best as possible at the very beginning.
Currently I join column 1 with column 2 to get column 3. This works for the most part but when there are two unique identifiers that match and that have the same "type" in column 1 I need to combine them together. For below the desired output would be in green:
EXAMPLE:
UNIQUE IDENTIFIER | COLUMN 1 | COLUMN 2 | COLUMN 3 |
1234 | PM | 12 | PM 12 |
1234 | PM | 12 | PM 12 |
1234 | PM 6 |
Column 2 is always a 12 month data type.
However column 2 will not always be 12 - there could be a scenario where column 2 contains "6" (6 months) and "12" (12 months) and the desired output would be "4" to stand for 4 months as the event "PM" happens 3 times throughout the year.
UNIQUE IDENTIFIER | COLUMN 1 | COLUMN 2 | COLUMN 3 | DESIRED OUTCOME |
1234 | PM | 12 | PM 12 | |
1234 | PM | 12 | PM 12 | PM 6 |
1234 | CAL | 6 | CAL 6 | CAL 6 |
Any suggestions on how the best way is to do this would be greatly appreciated!!
Solved! Go to Solution.
@Lauren_Holmes
Honestly, I am a bit confused.
Can you just give a simple input and an output so we can figure out how?
Hello -
EXAMPLE
INPUT
1234 | PM | 12 |
1234 | PM | 12 |
1234 | CAL | 12 |
OUTPUT
1234 | PM 6 |
1234 | CAL 12 |
As 1234 are the same on each line and PM is shown twice these need to be combined. I did have an idea of concatenating so the output would be:
1234 | PM,PM |
1234 | CAL |
and from here I can use a text input to say if concatenated column = PM, PM then output is PM 6? I know how to concatenate all eg. group by unique identifier and concatenate the Column 1 however issue with this is I don't want to concatenate all - PM,PM,CAL i only want to concatenate those that have identical events in column 1.
@Lauren_Holmes
This is based on my interpretion. 😁
@Lauren_Holmes wrote:
and from here I can use a text input to say if concatenated column = PM, PM then output is PM 6? I know how to concatenate all eg. group by unique identifier and concatenate the Column 1 however issue with this is I don't want to concatenate all - PM,PM,CAL i only want to concatenate those that have identical events in column 1.
Based on what i could understand you want to concat identical events. So what you need to do is groupby identical events & conact identical events
Workflow:
If you refer my workflow i am doing groubpy ID, Identical event, concat of Identical event, count of Identical event & group by value.
In the formula tool i am diving values/count and getting 6.
Hope this helps 🙂
This is great, thanks so much 🙂
Exactly what i am after, thank you so much 🙂