Alteryx Designer Desktop Discussions

JamesFo · ‎01-16-2020

Hi Community,

I am trying to parse information on a few hundred thousand clinical trials that are stored in individual XML files. I've loaded these into Alteryx (one row per XML file) and am mostly getting on well using the XML parse tool with specific child names to grab the data I need into new columns.

I am however struggling where there are multiple fields on the same level with the same name. For example, a snip of the xml for one trial:

 <condition>Retinitis Pigmentosa</condition>
  <intervention>
    <intervention_type>Drug</intervention_type>
    <intervention_name>Vitamin A</intervention_name>
  </intervention>
  <intervention>
    <intervention_type>Drug</intervention_type>
    <intervention_name>Nutritional Supplement</intervention_name>
  </intervention>
  <eligibility>

Contains two different interventions that I would like to parse into multiple columns (or rows). There can be multiple interventions (or 0, highest i've seen is 5).

Does anyone have an elegant solution to parse out the intervention type & name?

I'm faffing around with the xml parse & regex but my knowledge of either is too poor to make it work!

Sample (shorter) dataset attached.

Many thanks for your help!

James

Jean-Balteryx · ‎01-16-2020

Hi @JamesFo ,

I attached a workflow doing what I think you want.

Tell me if it's not exactly what you want or if you have questions.

Cheers,

Jean-Baptiste

JamesFo · ‎01-16-2020

Hi @Jean-Balteryx,

Amazing - I hadn't realised that the XML parse would just create multiple rows where it found more than one. For some reason I thought it was only returning the first values found! Maybe I should have stuck a record ID on at the start...

Thank you!

James

Jean-Balteryx · ‎01-16-2020

@JamesFo ,

Yes, the XML tool will return everything that matches the value you specify, it won't stop to the first one.

You're welcome !

Cheers,

Jean-Baptiste

Alteryx Designer Desktop Discussions

Parsing multiple similarly named children in XML

Zero to Advanced in 20 days

Re: How to separate IDs from the text

Re: Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: Adding Total Row to Table