I am trying to parse this XML file but am having a hard time doing so. It seems that multiple tables (country, subdivision, territories, categories) are all in one block. I tried the XML Parse, RegEx, and straight Input tools, but my limited knowledge of Alteryx (2 months) seems to be a problem.
The goal here is to extract the data from the XML into relational tables that link the different elements (countries, subdivisions, etc.).
That helps me a bit. I guess that with your suggestion I need to repeat the Input tool once for each section I need to "carve out": one for country, one for subdivision, and so on. At least that is what I am doing now.
My problem now is that some children don't hold their parent ID. This is the case for subdivisions and their respective countries. It is also true for the subdivision name, which seems to be in a different child than the subdivision itself!
To link the child and parent, the only option I can see is to use "Return Outer XML" and then the XML Parse tool. Not sure if that makes sense.
Using a single Input tool followed by XML Parse tools is the way to proceed. The attached workflow uses one Input tool to import the file, then a separate XML Parse tool for each of the lists in the different nodes.
Using a single Input tool keeps the parent_id at every stage. For example, at the subdivision languages level we have the IDs of the country, the subdivision, and the language.
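If it helps to see the idea outside of Alteryx, here is a rough Python sketch of the same pattern: walk down from the parent and copy each parent ID onto every child row. The element and attribute names (data, country, subdivision, language, id) are assumptions made up for illustration, since the real file's layout isn't shown in this thread, so adjust them to match your XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real file's tags and attributes will differ.
SAMPLE = """
<data>
  <country id="US">
    <subdivision id="US-CA">
      <language id="en"/>
      <language id="es"/>
    </subdivision>
    <subdivision id="US-TX">
      <language id="en"/>
    </subdivision>
  </country>
  <country id="CA">
    <subdivision id="CA-QC">
      <language id="fr"/>
    </subdivision>
  </country>
</data>
"""


def flatten(xml_text):
    """Return three row lists; each child row carries its parent IDs."""
    root = ET.fromstring(xml_text.strip())
    countries, subdivisions, languages = [], [], []
    for country in root.findall("country"):
        country_id = country.get("id")
        countries.append({"country_id": country_id})
        for sub in country.findall("subdivision"):
            sub_id = sub.get("id")
            # The subdivision itself has no country reference,
            # so the ID is taken from the enclosing element.
            subdivisions.append(
                {"country_id": country_id, "subdivision_id": sub_id}
            )
            for lang in sub.findall("language"):
                languages.append(
                    {
                        "country_id": country_id,
                        "subdivision_id": sub_id,
                        "language_id": lang.get("id"),
                    }
                )
    return countries, subdivisions, languages
```

Calling flatten(SAMPLE) gives one list per table, and the language rows hold all three IDs, which is the same result the chained XML Parse tools produce in the workflow.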
An important thing to note here is the strategic use of the Select tools. As soon as you've extracted the child_outer_xml chunks, deselect the parent_outer_xml field. The parent_outer_xml value is repeated for every record in your child record set, so memory use can balloon very quickly.