Hi alteryx community,
I am looking for some help with data aggregation and I wasn't able to find the answer searching online. I've created a mock up input and output data set below to go with the narrative of what I am trying to achieve.
Input:
| Group | Product Code | Group Linkage Number | Name of Product | Value |
| AA20 | AA20_1 | 0 | Main Game | 1000 |
| AA20 | AA20_2 | 1 | DLC | 200 |
| AA20 | AA20_3 | 1 | Expansion | 600 |
| AA20 | AA20_4 | 3 | DLC | 50 |
| BC30 | BC30_1 | 0 | Main Game | 1600 |
| BC30 | BC30_2 | 1 | Expansion | 700 |
| BC30 | BC30_3 | 1 | DLC | 250 |
| BC30 | BC30_4 | 3 | MOD | 350 |
As you can see in the above input we are looking at "video games". These are grouped using the "Group" column, which identifies a certain title. All formulae work should be done within each individual group.
A single group can contain multiple entries, which are further identified in the "Product Code" column, which is simply the "Group" with an underscore and a digit, starting from 1 and counting upwards. Games can be "Main Game", "Expansion", "DLC", "MOD", etc. and they all have a certain "Value" associated with them.
Now, the titles do have a Parent-Child dependency between them. This is determined in the "Group Linkage Number" column. A "Main Game" product is the top parent, not linked to anything, so its linkage number is "0". For any other products, these will be linked to another product, and the "Linkage number" is the "underscore number" of the parent's "Product Code". So, looking at group "AA20", there's a DLC and Expansion, both linked to the "Main Game". Since the main game is product code "AA20_1" then the linked DLC and Expansion will have a linkage number of 1. As another example, DLC product code "AA20_4" has a linkage number 3, because its immediate parent is the expansion product "AA20_3". I hope this makes sense.
I need to aggregate data so that it ONLY shows Main Games and Expansions with the value being a sum of all other child entries, summarised within the main parent.
In the case of group AA20, that's easy enough. I would use a formula to substitute the Linkage Number into the Product Code and then perform a summarise on the Product Code.
In case of group "BC30" that's more problematic as there is a two step process. The MOD "BC30_4" has a linkage number 3, whereas the product "BC30_3" (the immediate parent) is itself a DLC, which is linked to the Main Game, "BC30_1". When I apply the above approach to this input, I get the following output:
| Group | Product Code | Group Linkage Number | Name of Product | Value |
| AA20 | AA20_1 | 0 | Main Game | 1200 |
| AA20 | AA20_3 | 1 | Expansion | 650 |
| BC30 | BC30_1 | 0 | Main Game | 1850 |
| BC30 | BC30_2 | 1 | Expansion | 700 |
| BC30 | BC30_3 | 3 | MOD | 350 |
AA20 is all fine, but there is an issue in BC30. The DLC which used to be BC30_3 was correctly changed into BC30_1 and summarised into the main game, but the mod, which used to be BC30_4 has been renamed to BC30_3 and is now orphaned.
A logical step would be for the MOD to first be "summarised" into the DLC, which would then be summarised into the Main Game. I don't know how to do this, as there can potentially be many steps in this parent-child relationship, but we still need to aggregate all "secondary" products (MODs, DLCs) and report them with their ultimate parents, the Main Games and Expansions. I assume that multi-row formula would probably need to be involved, but I leave this problem to your bright minds.
Thank you very much and apologies for the wall of text.