Finding the Leaf Node

Question

Hello community, I need your help.

I have a hierarchical dataset where each code represents a node in a hierarchy. The codes follow a specific format where segments where hypen ("-") should be ignored when counting node. Each segment represents a level in the hierarchy. Here's an example to illustrate the structure:

P-B8
L-E01
P-B503
P-A211
P-B714
P-B207
A-D1
L-E02
P-DE
L-H01
A-D01
A-E013
A-C030
L-H04
P-K6
P-F
L-A021
P-B6
P-A48
P-B9
A-E010
A-B032
A-E1
P-B707
A-H093
P-A45
A-A026
L-A010

The codes are hierarchical where each part of the code can be broken down to represent child-parent relationships. For instance:

* L-A010 can be broken down into:* L (root node)
* L-A (child of L)
* L-A0 (child of L-A)
* L-A01 (child of L-A0)
* L-A010 (child of L-A01)

* Intermediate nodes may not exist. For instance, L-A010 might not have L-A01 but could have L-A0 or just L-A.
* Hyphens (-) should be ignored in the matching process. This means L-A010 is considered as LA010.

I want to extract the leaf nodes from this dataset. A leaf node is defined as a node that does not have any children. For example:

* If L-A0 has children L-A01 or L-A010, then L-A0 should not be in the list of leaf nodes.

I think in SQL, to get the list of leaf node, it would be something like below, but I can't seem to extract leaf node from the workflow:

SELECT n1.code
FROM nodes n1
LEFT JOIN nodes n2 ON n2.code LIKE CONCAT(n1.code, '%')
    AND n2.code <> n1.code
WHERE n2.code IS NULL;

Any help would be appreciated!

KGT · Answer

Am I misunderstanding something here? Does your sample list consist of entirely leaf nodes? I added L-A01 so that I had one that wasn't a leaf node.

Either I've missed something and this doesn't help, or this gets you where you want to go...

I removed the "-" but preserved the original field, then sort and a multi-row to look at the next row and see if it is a child of the current.

Multi-row formula is:

IF Left([Row+1:Field1],Length([Field1]))=[Field1]
THEN 1
ELSE 0
ENDIF

Finding the leaf node.yxmd

ChrisTX · Answer

The problem is Alteryx can't do a Join with the LIKE operator

Try 1)  Do a cartesian join (use the Append tool or create a dummy field with a value 1 and join on the dummy field, Left input and Right input would be the same data stream).

Then 2) use a Filter tool with a formula like

[Value 1] != [Value 2] and

Substring([Value 2], 0, Length([Value 1])) = [Value 1]

Hope you don't have a huge data set, which could be a problem with a cartesian join.

If your data set is too large, you may need a batch or iterative macro.

Chris