Alteryx Designer Desktop Discussions

Different options to import XML

joesparty
8 - Asteroid

Hello,

 

I'm a new user and looking forward to learning Alteryx.  I'm looking for suggestions on how to import the attached XML file:

  • Capture the device names found in rows 5-48, such as:
    • 111801005495
    • 111801009217
  • Capture the users and their attributes found in rows 51-130, such as:
    • Admin superreader=yes
    • user1 superreader=yes

I've tried the Input Data tool with the file read as XML and as CSV, but I'm stuck.  I have other XML file types to define as well, so I thought I'd post to see what the community thinks.

 

Thanks,

-Joe

danilang
19 - Altair

Hi @joesparty 

 

There isn't a "one size fits all" answer to parsing XML files, since the internal structure of any element can vary.  For lots of different ways of parsing XML, check out Weekly Challenge 161: A Symphony of Parsing tools.

 

Dan  

joesparty
8 - Asteroid

Hi @danilang,

 

Thanks for the link; there are some great solutions posted.  I think the issue I'm having with the XML Parse tool is that the file isn't well-formed XML.  In the attached:

  • Option 1: the XML Parse tool gives the error "invalid document structure at Line3 and Column7".
  • Option 2: the XML Parse tool gives the error "invalid document structure at Line4 and Column2".
  • Option 3: I can see all of the XML data, but I can't figure out how to capture row 4 as a device without including row 37, which is the first user.
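
One way to check outside Alteryx whether a file is well-formed, and where parsing first fails, is to run it through a standard XML parser. The sketch below uses Python's xml.etree.ElementTree. The two-root fragment is a made-up example, not the attached file, and the helper name first_xml_error is hypothetical.

```python
import xml.etree.ElementTree as ET

def first_xml_error(text):
    """Return (line, column) of the first parse error, or None if well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple
        return err.position
    return None

# Hypothetical fragment with two top-level elements, which XML does not allow
fragment = "<devices></devices>\n<users></users>"
error_at = first_xml_error(fragment)
print(error_at)
```

If the reported position lines up with the row where the XML Parse tool fails, the file probably needs to be wrapped or split before it can be parsed as XML.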
apathetichell
19 - Altair

You can separate your two record types by setting a Filter with REGEX_Match([name],"\d+").  I'm not sure about your member embedded XML field, though; I can't get Alteryx to choose that one dynamically.
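
The same split can be sketched in Python to show what the filter does. This assumes, based on the examples in the original post, that device names are all digits and user names are not. It uses re.fullmatch so the whole value must be digits; if REGEX_Match in your version does not test the whole value, anchoring the pattern as ^\d+$ should give the same effect. The helper name is_device_name is hypothetical.

```python
import re

def is_device_name(name: str) -> bool:
    """True when the whole name consists of digits only."""
    return re.fullmatch(r"\d+", name) is not None

# Sample values taken from the original post
names = ["111801005495", "111801009217", "Admin", "user1"]

devices = [n for n in names if is_device_name(n)]    # all-digit names
users = [n for n in names if not is_device_name(n)]  # everything else

print(devices)
print(users)
```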

joesparty
8 - Asteroid

Yes, getting Alteryx to dynamically define the file is the issue.  Does anyone have a preferred regex converter website?

joesparty
8 - Asteroid

I've been able to make some progress and am looking to get this across the line.  Reading the file as a CSV, I've been able to get a RegEx to work.  The goal is to extract the Device fields up to the first TypeEnd=</devices>.  I tried the Summarize tool with the First option, but it didn't work.

 

So the answer should be RecordID 4 - 48, then filter for !IsNull(Devices).
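
The "stop at the first closing tag" logic can be sketched in Python as below. The attachment is not shown in the thread, so the line layout (one element per line with a name="..." attribute) and the sample rows are assumptions for illustration only, and devices_before_end is a hypothetical helper.

```python
import re

def devices_before_end(lines, end_tag="</devices>"):
    """Collect (record_id, device_name) pairs until the first end_tag line."""
    devices = []
    for record_id, line in enumerate(lines, start=1):
        if end_tag in line:
            break  # stop at the first closing tag, ignore everything after it
        # Assumed layout: the device name sits in a name="..." attribute
        match = re.search(r'name="([^"]+)"', line)
        if match:  # equivalent of filtering for !IsNull(Devices)
            devices.append((record_id, match.group(1)))
    return devices

# Hypothetical sample rows, not the real file
sample = [
    '<devices>',
    '<device name="111801005495"/>',
    '<device name="111801009217"/>',
    '</devices>',
    '<user name="Admin" superreader="yes"/>',
]
result = devices_before_end(sample)
print(result)
```

In Alteryx terms this corresponds roughly to adding a Record ID, finding the lowest Record ID where TypeEnd is </devices>, and keeping only the rows below it.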
