Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Extract data from cell in spreadsheet containing large amount of XML

jonvesguerra
7 - Meteor

Good morning everyone,

 

There's a column in my spreadsheet containing a large chunk of XML code. Is there any way to extract the data from it? I know the XML Parse tool exists, however I noticed the formatting for my data is different. It is my first time running into any XML so I am a bit confused. The workflow attached has the data in a text input, so no need to download the excel unless you want to see the entirety of the data.

 

Thanks!

5 REPLIES 5
ed_hayter
12 - Quasar

Had a look at the sample text field in VScode.

 

image.png

I manually split it into separate rows based on my best guess of hierarchy. i've highlighted a row that makes little sense to me. The color value never receives a closing tag. We then get two closing tags for tags that were never opened. It's also worth noting that there is a lot of information in this cell.

 

I would suggest the next steps are working out what information you want out of this cell along with what the structure of your desired output looks like. In my experience that is better than parsing for parsing sake without a plan of what you want.

 

If you realize you only need key parts of this - i would consider using REGEX to parse the specific pieces of information you want. E.g. You might want vehicle color and price. Regex can help you get that specific information from the string

 

 

PhilipMannering
16 - Nebula
16 - Nebula

Try this...

 

jrgo
14 - Magnetar

Lots of issues with the XML string... Start tags missing end tags and vice versa. Did you by chance alter the XML in preparation for posting here or is that string as is from the source and maybe scrubbed out more than intended?

 

Problem I'm seeing is on the "Color" (missing "type=" and end tag) and "FrameSize" (missing start tag) attribute.

 

If this is direct for the source, then yeah... you won't be able to use the XML Parse tool. To answer your question though, yes. you can extract the data from it, but will have to all be manual, in that you may need to create an expression to extract the value for each attribute. Likely some heavy REGEX, which could impact the integrity of your data should something change in the XML that your expression/design isn't expecting.

 

You may want notify the owner/vendor for the system the XML came from and let them know to see if they can correct.

PhilipMannering
16 - Nebula
16 - Nebula

Hi @jonvesguerra. Did my workflow solve your issue?

jonvesguerra
7 - Meteor

Hello! I think your solution did very well extracting the important information considering the poor formatting of the XML. I appreciate your work!

Labels