Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

XML Parsing for Beginners

TaraM
Alteryx Alumni (Retired)
Created

Reading XML in Alteryx is a little different than other input types and the reason for this is simple - XML is not a tabular data format and the data structure can vary from one file to the next. To effectively read and parse XML data in Alteryx, you need to use a combination of tools, namely the Input tool and the XML Parse tools.

 

Emily Chen from the Information Lab wrote a great article that describes the steps necessary to read and parse XML in Alteryx:

 

"XML files are different from flat columnar tables (the ones we’re used to!) because instead of headers and rows, the data is nested within tags, where the field headers are identified for every record with <> brackets." See the full article here.

 

 

 

 

Comments
serendipitytech
8 - Asteroid

 I've been trying to work with a variety of similar, but of course just different enough XML files with Alteryx. I've had some successes using the RegEx option to find pieces that match a specific pattern. But I keep running into some roadblocks along the way. 

 

In general, I don't know the structure of the XML files, so I'm having a hard time knowing where to start. I know that the file contains some specific information that I want, but knowing what child it is in isn't possible. So, I'm not sure how best to use Alteryx to initially determine the structure of the file. Through another poster, I was able to create a workflow that will show a list of all tags in the file, and that helps to know where to start looking. 

 

However, in a more specific question - I have a file (a sample is included below) that contains information about an item. My task is to generate a list that associates the item identifier to the specific attributes, for example from this sample, I'd be looking for something along this sort of output table:

item_identifier | grade_level | subject | Depth of Knowledge | Bloom's Taxonomy | etc etc

So, basically a table that has all of the  <source><langstring> as the title and the <taxon><entry> as the value. Then to make things a little more complex, when you get to the <source><langstring>Standard, there are multiples with the same name but different values, and I need all the values. This is the piece that I solved using a RegEx search to pull out all GUIDs

 

So, basically, I'm struggling a little with finding how to best parse out the data from a file like this when the tags don't contain the data I want, but all the titles and values are in the values of the tags (does that even make sense?). 

 

 

 

 

<?xml version="1.0" encoding="utf-8"?>
<resource identifier="600109" type="imsqti_item_xmlv2p1" href="600109.xml">
	<metadata>
		<qtiMetadata>
			<timeDependent>false</timeDependent>
			<interactionType>choiceInteraction</interactionType>
			<feedbackType>none</feedbackType>
			<solutionAvailable>true</solutionAvailable>
		</qtiMetadata>
		<lom>
			<classification>
				<taxonpath>
					<source>
						<langstring>grade level</langstring>
					</source>
					<taxon>
						<entry>
							<langstring>05</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>subject</langstring>
					</source>
					<taxon>
						<entry>
							<langstring>ela</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>ParentID</langstring>
					</source>
					<taxon>
						<entry>
							<langstring>600109</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Depth Of Knowledge</langstring>
					</source>
					<taxon>
						<entry>
							<langstring>1</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Bloom's Taxonomy</langstring>
					</source>
					<taxon>
						<entry>
							<langstring>3</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Item Type</langstring>
					</source>
					<taxon>
						<entry>
							<langstring>3</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Consortium ID</langstring>
					</source>
					<taxon>
						<entry>
							<langstring>1</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Media Type</langstring>
					</source>
					<taxon>
						<entry>
							<langstring>1</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>7FF4ECC2-7440-11DF-93FA-01FD9CFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>2F5F8D60-29E9-11D8-9067-8AE8A870519B</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>E4682D76-B8C9-11E0-9889-17519DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>9E994ACE-DA6D-11E2-BA53-9B72D51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>2DB98482-DA71-11E2-B055-E63F9DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>3736E626-DA67-11E2-B1AA-803E9DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>F7F67036-004B-11E1-A57E-F8449DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>BCA88310-B349-11E0-8CA3-85249DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>24002860-D9AE-11E2-8F9A-BB82D51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>D1647ED2-D9EC-11E2-A4CC-EF239DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>1E0D912E-ACC2-11E0-BB03-D6D7D51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>8EA42518-D9B3-11E2-BE28-04169DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>3C90E978-BA88-11E1-B2D2-09269DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>24B9A222-D9D1-11E2-A302-B4A5D51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>5550C688-DA6F-11E2-85CC-52EDD51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>ef950e9a-c184-42e9-af99-4d7fd3b6836a</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>392d881c-19f4-45d9-b7ee-646b71e2a2f5</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>38D39AEA-D9A6-11E2-8171-BA739DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>CDA6BB92-93DB-11E0-8527-63159DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>E42FDDB4-C237-11E0-875F-EA079DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>E98AA652-D9BB-11E2-BACF-F789D51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>4C54E392-7D11-11E0-9276-F5539DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>C742D620-D9B4-11E2-9DF4-B4A5D51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>2BBECD12-DA72-11E2-9C75-52EDD51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>61910C1C-D9E0-11E2-B767-92BED51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>3F2FCF9E-DA78-11E2-A988-824A9DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>AA9470EA-EC09-11E1-B53C-5B739DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>DB271DDE-DE81-11E0-831A-85489DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>108fa35f-d729-4a7e-984c-ffbb1edc7823</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>E934BA26-AF87-11E1-913D-05249DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>B75746E0-D9D4-11E2-99C6-671D9DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>A0B94972-279E-11E1-814D-E1469DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>A28E6FB6-D9D6-11E2-8B79-45B7D51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>23E1B210-3A33-11E2-B7CD-2BFEBCAA0334</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>23E388F6-3A33-11E2-B7CD-2BFEBCAA0334</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>C8393DB2-D9E6-11E2-BBB0-00249DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>43B78F2A-D9E2-11E2-96B6-A0BED51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>52C65876-D90F-11E2-9D4F-28629DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>BE21482E-B800-11E0-8FB8-5B1F9DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>36CEC8C6-D90D-11E2-8CB5-43E6D51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>391B987A-D9A8-11E2-8171-BA739DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>1CC5AA8E-9892-11E0-8388-5F489DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>616E4706-BFDC-11E0-8BD1-2C4B9DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>7b8c05a4-fc2b-4fe5-b146-db0dba595310</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>D5B546B4-DA79-11E2-BE58-6FEDD51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>afba90d1-5436-4a67-b7ae-b471f5175baf</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>0506A89C-D9BA-11E2-9295-FE079DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>3E69F77E-D9D3-11E2-A820-F2109DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>50A947E0-D9D9-11E2-BD13-D3A5D51F4EFC</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>EDE77E32-AF87-11E1-913D-05249DFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>E3D182D0-835A-11E0-A099-8AF99CFF4B22</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>Standard</langstring>
					</source>
					<taxon>
						<id>
							<langstring>1</langstring>
						</id>
						<entry>
							<langstring>a9afb396-4286-468e-92ca-8c46adde9e14</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>TeacherRead</langstring>
					</source>
					<taxon>
						<entry>
							<langstring>false</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>StudentRead</langstring>
					</source>
					<taxon>
						<entry>
							<langstring>true</langstring>
						</entry>
					</taxon>
				</taxonpath>
				<taxonpath>
					<source>
						<langstring>LanguageID</langstring>
					</source>
					<taxon>
						<entry>
							<langstring>1</langstring>
						</entry>
					</taxon>
				</taxonpath>
			</classification>
			<lifecycle>
				<version>2.1</version>
				<status>
					<source>LOMv1.0</source>
					<value>Final</value>
				</status>
			</lifecycle>
			<technical>
				<format>text/x-imsqti-item-xml</format>
			</technical>
		</lom>
	</metadata>
	<file href="600109.xml"/>
	<file href="style/LiveInspect.css"/>
	<file href="style/600109.css"/>
</resource>