
Alteryx Designer Desktop Discussions

SOLVED

Iterate through child nodes in XML

RodProtheroe
5 - Atom

Hi

I am processing a 25 MB XML file with a dozen child nodes. With the Input Data tool I can either let it Auto Detect a single Child Node or set a single Specific Child Name, but I can't loop through all Child Nodes.

To get round this I'm using 12 Input Data tools, one per Child Node, and Union'ing the outputs. However, that means when a new Child Node turns up I have a broken workflow. (I can alert for new Child Nodes using a fixed list and RegEx, but this is after the event). 

 

So, in the XML example below, how do I capture the new incoming data when my salesperson starts in the West Region, without rewriting the workflow?

<Sales Date="2021-06-17">
<Regions>
<South StaffID='SmithJ' SalesCode='88c' DateFrom='2020-12-01' DateTo='2020-12-31' />
<South StaffID='JonesA' SalesCode='88c' DateFrom='2020-12-01' DateTo='2020-12-31' />
<South StaffID='HansonJ' SalesCode='18c' DateFrom='2020-12-01' DateTo='2020-12-31' />
<North StaffID='DerryT' SalesCode='18c' DateFrom='2020-12-01' DateTo='2020-12-31' />
<North StaffID='ProtonR' SalesCode='18c' DateFrom='2020-12-01' DateTo='2020-12-31' />
<East StaffID='MurphyJ' SalesCode='18c' DateFrom='2020-12-01' DateTo='2020-12-31' />
<East StaffID='McIntoshJ' SalesCode='89c' DateFrom='2020-12-01' DateTo='2020-12-31' />
</Regions>
</Sales>

 

Thanks

 

Rod

2 REPLIES
kelly_gilbert
13 - Pulsar

Hmm, this looks like oddly-structured XML... If I'm looking at this correctly, StaffID is an attribute of the region and not a child of the region? I think that's where the XML parser is getting hung up and not recognizing all of the children. Also, is Sales the root element? 

 

If this is truly the structure of the file, I can't think of an easy way to auto-detect when a new region shows up. If you know the list of possible regions, you could use a batch macro to cycle through the list of regions, and attempt to parse each one individually.
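For comparison, a general-purpose XML library does not need the child names up front. Below is a minimal Python sketch using the standard library's xml.etree.ElementTree, assuming the structure from the question (a Sales root, a Regions wrapper, and one self-closing element per staff record). The function name region_rows and the West record are made up for illustration, and running this inside Designer (for example via the Python tool) is an assumption that is not shown here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample from the question, plus a hypothetical
# West record (the StaffID 'NewHire' is invented for illustration).
SAMPLE = """<Sales Date="2021-06-17">
<Regions>
<South StaffID='SmithJ' SalesCode='88c' DateFrom='2020-12-01' DateTo='2020-12-31' />
<North StaffID='DerryT' SalesCode='18c' DateFrom='2020-12-01' DateTo='2020-12-31' />
<West StaffID='NewHire' SalesCode='18c' DateFrom='2020-12-01' DateTo='2020-12-31' />
</Regions>
</Sales>"""


def region_rows(xml_text):
    """Return one dict per child of <Regions>, whatever the child is called."""
    root = ET.fromstring(xml_text)        # the <Sales> element
    sales_date = root.get("Date")
    rows = []
    for child in root.find("Regions"):    # every child element, any tag name
        row = {"SalesDate": sales_date, "Region": child.tag}
        row.update(child.attrib)          # StaffID, SalesCode, DateFrom, DateTo
        rows.append(row)
    return rows


rows = region_rows(SAMPLE)
for row in rows:
    print(row)
```

The region name comes from the element's tag rather than from a fixed list, so an unseen region simply becomes another value in the Region column.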

 

The way I'd handle this myself would be parsing the file as text rather than using the XML parser. The drawbacks would be 1) a little more initial setup, and 2) if any new attributes were added to the region (other than StaffID, SalesCode, DateFrom, DateTo), you'd have to update your workflow accordingly.

 

You can read the XML file in as text by changing a few settings in the Input tool:

[Screenshot: Input Data tool settings]

 

Then, you can use a Formula or Regex tool to parse the region attributes (StaffID, SalesCode, etc.) into columns...

 

Finally, use a Multi-Row Formula tool to copy the Sales Date down to all of the rows.
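To make the three steps above concrete, here is the same logic as a rough Python sketch rather than as Alteryx tools: read the file as text, pull each region element apart with regular expressions, and copy the Sales Date onto every row. The regex patterns and the name parse_as_text are my own assumptions based on the sample in the question, not something taken from the workflow itself.

```python
import re

# Self-closing element such as: <South StaffID='SmithJ' ... />
ELEMENT_RE = re.compile(r"<(\w+)\s+([^<>]*?)\s*/>")
# name='value' or name="value" pairs inside one element
ATTR_RE = re.compile(r"""(\w+)=(['"])(.*?)\2""")
# The Date attribute on the <Sales> root element
DATE_RE = re.compile(r"""<Sales\s+Date=(['"])(.*?)\1""")

SAMPLE = """<Sales Date="2021-06-17">
<Regions>
<South StaffID='SmithJ' SalesCode='88c' DateFrom='2020-12-01' DateTo='2020-12-31' />
<East StaffID='MurphyJ' SalesCode='18c' DateFrom='2020-12-01' DateTo='2020-12-31' />
</Regions>
</Sales>"""


def parse_as_text(text):
    """Parse region elements with regex and copy the Sales Date to each row."""
    match = DATE_RE.search(text)
    sales_date = match.group(2) if match else None
    rows = []
    for tag, attr_text in ELEMENT_RE.findall(text):
        row = {"SalesDate": sales_date, "Region": tag}
        for name, _quote, value in ATTR_RE.findall(attr_text):
            row[name] = value
        rows.append(row)
    return rows


rows = parse_as_text(SAMPLE)
for row in rows:
    print(row)
```

Because the attribute pattern matches any name and value pair, this variant does not hard-code StaffID, SalesCode, DateFrom, or DateTo, although anything downstream that expects those columns still would.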

RodProtheroe
5 - Atom

Thanks Kelly_Gilbert! I'm sure this would work great, except that my sample was a simplified version: the actual data is 30 MB with no carriage returns, which breaks the 16 KB field header limit.
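For a file with no line breaks, a streaming XML parser is one option, since it reads elements rather than lines. The following is only a sketch in Python with the standard library's xml.etree.ElementTree.iterparse, assuming the same Sales / Regions / record nesting as the simplified sample. The function name stream_region_rows and the inline one-line sample are hypothetical, and I have not tried this against the real 30 MB file.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical single-line input with no carriage returns.
ONE_LINE = (
    b"<Sales Date='2021-06-17'><Regions>"
    b"<South StaffID='SmithJ' SalesCode='88c'/>"
    b"<East StaffID='MurphyJ' SalesCode='18c'/>"
    b"</Regions></Sales>"
)


def stream_region_rows(source):
    """Yield one dict per element nested three levels deep (the records)."""
    sales_date = None
    depth = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            depth += 1
            if depth == 1:                 # root element: <Sales>
                sales_date = elem.get("Date")
        else:
            if depth == 3:                 # a child of <Regions>
                row = {"SalesDate": sales_date, "Region": elem.tag}
                row.update(elem.attrib)
                yield row
                elem.clear()               # drop attributes already copied
            depth -= 1


# source can be a file path or a binary file object.
rows = list(stream_region_rows(io.BytesIO(ONE_LINE)))
for row in rows:
    print(row)
```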

 

Another problem is my 'regions' actually each have a different set of attributes, so if a new one was added with different attributes I would need to rewrite the workflow, as you point out in your answer.
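One way to picture the different-attributes case is as a table whose columns are the union of every attribute seen, with blanks where a region lacks one. The sketch below does this in Python with csv.DictWriter. The two input rows, the Territory attribute, and the name to_csv are invented for illustration. It only shows that the data can be captured without knowing the fields in advance; anything downstream that has to interpret a new field would still need changing.

```python
import csv
import io

# Hypothetical parsed rows: the second region carries a different attribute set.
rows = [
    {"Region": "South", "StaffID": "SmithJ", "SalesCode": "88c"},
    {"Region": "West", "StaffID": "NewHire", "Territory": "W1"},
]


def to_csv(rows):
    """Write rows to CSV text using the union of all keys as the header."""
    fieldnames = []
    for row in rows:
        for name in row:
            if name not in fieldnames:     # keep first-seen column order
                fieldnames.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)                 # missing keys are written as ""
    return buffer.getvalue()


output = to_csv(rows)
print(output)
```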

 

So I guess the bottom line is that I was hoping there would be some magic way for Alteryx to handle changes to the incoming data, and that is not reasonable. If this were a set of CSV files rather than XML, there would be no way to automate handling the arrival of a new file with a new set of fields. The same applies to XML.

 

Thanks for your clear explanation which helped me come to this conclusion!

 

Rod
