Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Trying to parse a specific child element, but there are other children named the same

Fireball

This is an extract from a publicly available government XML budget document that I am trying to parse.

 

I have run into this problem before, but I am now deeper into the data and the prior solution is not available.

 

I have attached both a snippet from the XML and a simplified workflow that depicts my challenge. 


The master element is ModificationItems. I need to extract data located on three lines of the XML snippet:

 

  1. Title on line 4,
  2. Manufacturer's Name on line 139 (which I dropped from the simplified workflow, since it is not a problem to pull in the real workflow), and
  3. Total Cost and its children on line 169.

But there is a problem. Total Cost is used multiple times throughout. I specifically want only the Total Cost that appears directly under another Total Cost element, and I want to ignore the other Total Cost items. While the others look the same, I need specifically this one.

 

I tried to walk down from Procurement to Total Cost to Total Cost, in order to ignore the Total Cost items under other children of Procurement. It is not working. As you can see, I am getting multiple rows of output when it should be one row.

 

Help and thank you

 

 

 

Data Scientist

Hi @hellyars

 

To get only the TotalCost that occurs under another TotalCost (line 169), I modified the configuration of your last XML Parse Tool and added a Filter Tool.

 

For your final XML Parse Tool, I set Field with XML Data to TotalCost_OuterXML2, which is the output of your previous XML Parse Tool, where you extract the outer TotalCost layer. I set the tool to parse a Specific Child Name, TotalCost, and to Return Child Values.

 

2018-02-26_8-36-22.png

 

With this configuration, the XML Parse Tool looks for TotalCost inside the already parsed outer TotalCost. As a result, only one row in the data stream gets data values from this step. The rest get nulls because their XML has no inner TotalCost.
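
For anyone who wants to see the same idea outside Alteryx, here is a minimal Python sketch of that behavior. It is not the workflow itself: the sample XML is a made-up, heavily simplified stand-in for the budget file, and the names Kit, Amount, and Quantity are hypothetical. Only ModificationItem-style nesting, Procurement, Title, and TotalCost come from the thread. Each TotalCost element plays the role of one row, and the lookup for a direct child named TotalCost plays the role of the second parse.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: Kit, Amount, and Quantity are invented for illustration.
SAMPLE_XML = """
<ModificationItem>
  <Title>Example modification</Title>
  <Procurement>
    <Kit>
      <TotalCost><Amount>1</Amount></TotalCost>
    </Kit>
    <TotalCost>
      <TotalCost><Amount>10</Amount><Quantity>2</Quantity></TotalCost>
    </TotalCost>
  </Procurement>
</ModificationItem>
"""


def inner_total_cost_rows(xml_text):
    """Return one entry per TotalCost element, in document order.

    An entry is a dict of the inner TotalCost's child values when the
    element has a direct child named TotalCost, otherwise None (the
    equivalent of a row of nulls).
    """
    root = ET.fromstring(xml_text)
    rows = []
    for outer in root.iter("TotalCost"):
        # find() with a bare tag name only looks at direct children.
        inner = outer.find("TotalCost")
        if inner is None:
            rows.append(None)
        else:
            rows.append({child.tag: child.text for child in inner})
    return rows


rows = inner_total_cost_rows(SAMPLE_XML)
```

In this sketch three TotalCost elements exist, so three rows come back, and only the row for the outer TotalCost carries values.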

 

2018-02-26_8-41-22.png

 

I then added and configured a Filter Tool to keep only the rows where the children of the inner TotalCost are not null. The result is a single row, with the parsed data that starts at line 169 of the XML.
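
If you ever need to check the result outside Alteryx, the parse-then-filter combination amounts to selecting only those TotalCost elements whose parent is also a TotalCost. The Python sketch below uses a path expression for that instead of a separate filter step, so it is a different technique from the workflow, shown only for comparison. The sample XML is again a made-up stand-in, and Kit, Amount, and Quantity are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: Kit, Amount, and Quantity are invented for illustration.
NESTED_SAMPLE_XML = """
<ModificationItem>
  <Title>Example modification</Title>
  <Procurement>
    <Kit>
      <TotalCost><Amount>1</Amount></TotalCost>
    </Kit>
    <TotalCost>
      <TotalCost><Amount>10</Amount><Quantity>2</Quantity></TotalCost>
    </TotalCost>
  </Procurement>
</ModificationItem>
"""


def nested_total_costs(xml_text):
    """Return the child values of every TotalCost nested directly in a TotalCost."""
    root = ET.fromstring(xml_text)
    # ".//TotalCost/TotalCost" matches a TotalCost whose parent is a TotalCost,
    # so TotalCost elements under other parents (such as Kit) are skipped.
    matches = root.findall(".//TotalCost/TotalCost")
    return [{child.tag: child.text for child in match} for match in matches]


nested_rows = nested_total_costs(NESTED_SAMPLE_XML)
```

With the sample above this yields a single entry, which mirrors the single row left after the Filter Tool.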

 

2018-02-26_8-41-52.png

 

I've attached your workflow with the modifications I made to get the inner Total Cost Values. Please let me know if this solution does not work for you.

 

Thanks!
