Alteryx Designer Desktop Discussions


XML parse doesn't work

gregx
8 - Asteroid

After years of using Alteryx, I just wonder: has anyone ever parsed real XML using only the 'XML Parse' tool?

Once again I had to parse a proper XML document, and it seems Alteryx can't parse even the simplest (real) examples of XML using the 'XML Parse' tool alone.

Every time, you have to combine multiple tools, deal with Regex, etc.

So what's the point of having the XML Parse tool?

Even the example provided by Alteryx doesn't work, because it's fake: they've prepared 3 rows of XML, whereas a normal XML document is one long text with no such thing as a 'row'. If you merge those 3 rows into one (as in real life), Alteryx raises an error.

 

 

JamesCharnley
13 - Pulsar

@gregx I didn't find XML parsing that intuitive as someone without a computing background, but there are some materials out there worth a look. I agree that the example workflows aren't that helpful in this case, but a couple of weekly challenges helped me get my head around it a bit better, for example these two:

 

https://community.alteryx.com/t5/Weekly-Challenge/Challenge-195-XML-Parsing/td-p/505892

https://community.alteryx.com/t5/Weekly-Challenge/Challenge-116-A-Symphony-of-Parsing-Tools/td-p/162...

 

I found this blog post helpful too, though it's not an official Alteryx blog:

https://www.theinformationlab.co.uk/2016/01/04/xml_parsing_with_alteryxp7172/ 

gregx
8 - Asteroid

@JamesCharnley thanks, but all of these are intentionally prepared XMLs. They fit Alteryx's poor logic.

The problem is that in real life XMLs are organized in different ways, some simpler and some more complex, and Alteryx can't deal with that.

I use Alteryx at work (dealing with live examples), not for theoretical exercises.

JoshKushner
12 - Quasar

I agree and sympathize with this issue. I'd recommend either passing the XML only to the Python tool or reading the XML file directly with the Python tool, then using the xml.etree.ElementTree library to parse it.

The reason you might need to read the XML directly into the Python tool is that most sufficiently complex XML will overflow the size of a single cell in Alteryx and can't be fully parsed. If you try to get around this by reading the XML onto different rows in Alteryx, the XML can't be parsed normally anyway; it just becomes an awkward data munging exercise.

TLDR: Read the XML directly into the Python tool and use the ElementTree library to parse it.
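
A minimal sketch of that approach, in case it helps. The function name `xml_to_rows`, the `record_tag` parameter, and the sample `<orders>` document are all made up for illustration; your real XML will have a different shape. It reads the XML from a file path or file-like object, so the text never has to fit in an Alteryx cell, and flattens each repeating element into one row.

```python
import io
import xml.etree.ElementTree as ET


def xml_to_rows(source, record_tag):
    """Return one dict per <record_tag> element found anywhere in the document.

    `source` is a file path or a file-like object (hypothetical helper,
    not part of Alteryx or ElementTree).
    """
    tree = ET.parse(source)
    rows = []
    for record in tree.getroot().iter(record_tag):
        row = dict(record.attrib)  # attributes become columns
        for child in record:
            # Keep only simple leaf children; nested structures need their own pass.
            if len(child) == 0:
                row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows


# Made-up sample: one long line of XML, with no "rows" in the text itself.
SAMPLE = (
    b'<orders><order id="1"><customer>Ann</customer><total>9.50</total></order>'
    b'<order id="2"><customer>Bob</customer><total>12.00</total></order></orders>'
)

rows = xml_to_rows(io.BytesIO(SAMPLE), "order")
for row in rows:
    print(row)
```

In practice you would pass the file path instead of `io.BytesIO(SAMPLE)`. Inside the Python tool, the resulting list can presumably be turned into a pandas DataFrame and sent to an output anchor with `Alteryx.write` from the `ayx` package. Note that if the document uses XML namespaces, ElementTree reports tags as `{namespace-uri}order`, so `record_tag` has to include that prefix.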
