Hello,
I am stuck on a current use case and am unsure on the best way to approach this.
I have a large data set that contains unique identifiers. From this I am determining some separate output files however I need to try and combine the data as best as possible at the very beginning.
Currently I join column 1 with column 2 to get column 3. This works for the most part but when there are two unique identifiers that match and that have the same "type" in column 1 I need to combine them together. For below the desired output would be in green:
EXAMPLE:
| UNIQUE IDENTIFIER | COLUMN 1 | COLUMN 2 | COLUMN 3 |
| 1234 | PM | 12 | PM 12 |
| 1234 | PM | 12 | PM 12 |
| 1234 | | | PM 6 |
Column 2 is always a 12 month data type.
However column 2 will not always be 12 - there could be a scenario where column 2 contains "6" (6 months) and "12" (12 months) and the desired output would be "4" to stand for 4 months as the event "PM" happens 3 times throughout the year.
| UNIQUE IDENTIFIER | COLUMN 1 | COLUMN 2 | COLUMN 3 | DESIRED OUTCOME |
| 1234 | PM | 12 | PM 12 | |
| 1234 | PM | 12 | PM 12 | PM 6 |
| 1234 | CAL | 6 | CAL 6 | CAL 6 |
Any suggestions on how the best way is to do this would be greatly appreciated!!