Alert: There is a planned Community maintenance outage October 16th from approximately 10 - 11 PM PST. During this time the Alteryx Community will be inaccessible. Thank you for your understanding!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Parse XML independent of nesting level

JasonFriedman
6 - Meteoroid

I have some XML, pasted below and also attached.

 

<suite>
  <testcase id="001" kind="bvt">
    <inputs>
      <arg1>4</arg1>
      <arg2>7</arg2>
    </inputs>
    <expected>11.00</expected>
  </testcase>
  <testcase id="002" kind="drt">
    <inputs>
      <arg1>9</arg1>
      <arg2>6</arg2>
    </inputs>
    <expected>15.00</expected>
  </testcase>
</suite>

 

I'd like Alteryx to ingest this as:

 

id kind arg1 arg2 expected
1  bvt  4    7    11
2  drt  9    6    15

 

I have a solution attached and pictured below:

 

First stepFirst stepSubsequent stepSubsequent step

 

That subsequent step would need to be repeated, I think, if the XML nesting was deeper.

 

Is there a solution that will work no matter how deep the nesting?

19 REPLIES 19
JasonFriedman
6 - Meteoroid

Thank you @MikeN. @DavidP, here's my workflow (with your macro and my original example.xml file renamed to example.csv):

 

JasonFriedman_1-1582754232613.png

 

And here's the result:

 

JasonFriedman_0-1582754136839.png

 

DavidP
17 - Castor
17 - Castor

Hi @MikeN,

 

The Parse XML macro only flattens the file and extracts the data, which means that regardless of how many layers of xml, you only need 1 xml parsing tool.

 

You then have to tidy up the data.

 

@JasonFriedman, glad that you got the macro working, now try the attached workflow.

 

DavidP_0-1582754454099.png

 

 

 

JasonFriedman
6 - Meteoroid

Ok @DavidP, looks like I'm getting closer. The second tool in the picture below is your macro. I notice that your image has a mark on the macro, my image does not.

 

JasonFriedman_0-1582756727504.png

 

DavidP
17 - Castor
17 - Castor

Don't worry about the + in the Parse XML macro - it's just a setting in my user config that tells me which tools are macros.

 

Can you share the output from the CrossTab tool - I'd like to understand the error in the select tool.

JasonFriedman
6 - Meteoroid

No output in the CrossTab tool, the only tool with output is the macro:

 

JasonFriedman_0-1582758491028.png

 

In your workflow the "example.xml" file is the original file I attached? These are my settings for that file:

 

JasonFriedman_1-1582758545437.png

 

 

 

DavidP
17 - Castor
17 - Castor

ok, I see the problem.

 

If you're loading the file as an xml file in the input data tool, it gets parsed as an xml file. We don't want this.

 

As I said in my earlier posts, if you're using the parse xml macro, you should load the file as a csv with delimeter set as \0 and the box unchecked that says First Row Contains Field Names (as per the example workflow that I posted).

 

This step is really important. Without it, nothing else works.

 

The parse XML macro expects a field [Field_1] that contains all the XML. If you do this bit right, the rest should fall into place.

 

DavidP_0-1582759179732.png

 

JasonFriedman
6 - Meteoroid

Thank you DavidP, works great now.

 

Follow-up ... I added the five tools in your final workflow to the macro you provided. I now have:

 

JasonFriedman_0-1582812672090.png

 

What threw me was, in your image, your input file was named example.xml. I see now that I can select from the configuration window the type of input file ... I'm not constrained by the file extension. My comment to the good people at Alteryx corporation is the File Format option value should not have a grey background, as grey implies it cannot be edited.

 

file_format.png

 

JoeL
Alteryx Alumni (Retired)

Hi Jason,

 

Please take a look at the alternative I have created below. It gets you the same answer using tools already in Alteryx:

 

XML example.PNG

 

I also added a third level to the data to show this works 3 levels deep, potentially more, as well.

JoeL_0-1582825091601.png

JasonFriedman
6 - Meteoroid

Thank you @JoeL , that solutions errors out with my actual data file:

	Select (6)	The field "Field_1" is not contained in the record.
	Summarize (5)	The field "Field_1" is not contained in the record {Action=Concat).
	Summarize (5)	RecordInfo::CreateRecord:  A record was created with no fields.
	Input Data (1)	617460 records were read from "C:\Users\jason\Downloads\myfile.xml"
	Designer x64	Finished running XML Flat Parse.yxmd in 1.0 seconds with 2 errors and 1 warning

 

Actually, @DavidP 's solution also does not work with my actual data file ... it does not error out, but returns only one record, with each field containing thousands of values separated by commas.

 

	Designer x64	Started running C:\Users\jason\Downloads\parse xml file 2.yxmd at 03/03/2020 10:22:00
	Input Data (12)	617460 records were read from "C:\Users\jason\Downloads\myfile.xml"
	Parse XML (11)	Tool #328: accountid: "IFT000347_24,IFT000349_24,IFT000350_24,IFT000351_24,IFT000352_24,IFT000358_24,IFT000372_24,IFT000..." was truncated
...
	Parse XML (11)	Tool #328: type: "D,D,P,D,D,D,D,D,D,D,D,P,D,D,P,P,D,D,P,P,D,D,D,D,D,D,D,D,P,D,D,D,P,D,D,P,D,P,P,D,D,P,P,D,D,P,D,D,D..." was truncated
	Browse (10)	1 records
	Output Data (14)	1 records were written to "C:\Users\jason\Downloads\output.csv"
	Designer x64	Finished running parse xml file 2.yxmd in 31.5 seconds with 46 field conversion errors and 1 warning

  

JoeL
Alteryx Alumni (Retired)

@JasonFriedman, the "Field_1" error can be fixed in the input tool. You need to deselect "First row contains field names" to get the XML to come out in Field_1. you can also rename your XML column to Field_1 using a select tool.

JoeL_0-1583949752195.png

 

Labels