Hello community, I need your help.
I have a hierarchical dataset where each code represents a node in a hierarchy. The codes follow a specific format where segments where hypen ("-") should be ignored when counting node. Each segment represents a level in the hierarchy. Here's an example to illustrate the structure:
P-B8
L-E01
P-B503
P-A211
P-B714
P-B207
A-D1
L-E02
P-DE
L-H01
A-D01
A-E013
A-C030
L-H04
P-K6
P-F
L-A021
P-B6
P-A48
P-B9
A-E010
A-B032
A-E1
P-B707
A-H093
P-A45
A-A026
L-A010
The codes are hierarchical where each part of the code can be broken down to represent child-parent relationships. For instance:
- L-A010 can be broken down into:
- L (root node)
- L-A (child of L)
- L-A0 (child of L-A)
- L-A01 (child of L-A0)
- L-A010 (child of L-A01)
- Intermediate nodes may not exist. For instance, L-A010 might not have L-A01 but could have L-A0 or just L-A.
- Hyphens (-) should be ignored in the matching process. This means L-A010 is considered as LA010.
I want to extract the leaf nodes from this dataset. A leaf node is defined as a node that does not have any children. For example:
- If L-A0 has children L-A01 or L-A010, then L-A0 should not be in the list of leaf nodes.
I think in SQL, to get the list of leaf node, it would be something like below, but I can't seem to extract leaf node from the workflow:
SELECT n1.code
FROM nodes n1
LEFT JOIN nodes n2 ON n2.code LIKE CONCAT(n1.code, '%')
AND n2.code <> n1.code
WHERE n2.code IS NULL;Any help would be appreciated!