This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
One of the greatest strengths of modern web APIs is their flexible, developer-friendly nature, which provides numerous options for both the provider and the user. However, this flexibility can make it more intimidating for business users to deal with the various data formats that these APIs provide. The purpose of this article is to familiarize you with the main data formats used by the vast majority of web APIs, and provide the basic knowledge that will allow you to confidently process the data they return into a typical tabular format.
The RegEx tool is kind of like the Swiss Army Knife of parsing in Alteryx; there are a whole lot of ways you can use it to do things faster or more effectively, but even if you just use the blade it's still immensely useful. Sometimes that's all you need, but if you do take the time to figure out how to use a few other tools in that knife, you'll start to see that there isn't much you can't do with it.
RegEx….it can be tough, but extremely useful when looking for ways of extracting information from a string. Regular Expressions are basically a code you can write to match a specific set of characters (it's a pattern matching syntax). It could be something as simple as finding the three-digit number in the string “Eample123Eample” or something more complicated like using hexadecimal to select a certain range of characters. RegEx is something that can be used in Alteryx via the RegEx tool in the Parse toolset and some of the main uses for RegEx are replacing, matching, and parsing specific characters within a string.
When importing a delimited file, have you ever run across an error saying ‘Too many fields in record…’ and it’s something like record number 1478254 and you think to yourself, “how the heck am I going to find that record and correct it?”
Well, let us help you with that.
This message is essentially telling us that some particular row of data, when parsed, has more fields that the first row of data (usually caused by an extra delimiter in that row).
The first step in identifying the row is to change the delimiter in the Input Data configuration window. Change the delimiter from the current delimiter to \0 (that is a backslash zero, which stands for no delimiter). This will read in your file but instead of parsing, it’ll just read in the data as is.
Now just add a RecordID tool and Filter tool so we can filter on RecordID 2 (or 1478254) to find the row that is holding us up.
Now that you’ve identified the row that is causing issues, you could go back to your original data, correct it and try importing it again.
If you were looking closely at the example data, you may have noticed that record 4 also had a different number of fields as row 1. After correcting row 2 and importing , again, we would get an error for row 4. Just follow the same procedure to correct additional errors. This could probably be done through an automated/iterative process, but that will have to wait for another time.
(attached is an example workflow created in Alteryx 10.0)
These methods should work in most versions of Alteryx.
One of the most common issues we see from clients trying to read in a .csv file is that they are receiving an error starting “Too many fields in row x”, and because of this Alteryx can’t read in the file.
There are a couple of different ways you can resolve this error.
First, in the Input Data tool, you can tell Alteryx to treat read errors as warnings to allow the file to be read in.
This will change the Error to a Warning so that Alteryx can read in the file so you can investigate.
The other method is to read the file in with no delimiter. You can do this by changing the delimiter in the Input Data tool to \0.
Once you have your data parsed back out into its fields, you can use the Dynamic Rename tool to correct your field names, a select tool to remove the original field, and a simple Trim() function to remove the extra delimiter from your data.
This process is illustrated in the attached workflow, created in version 10.1.
Between the RegEx , Text To Columns , and XML Parse Tools , the Alteryx data artisan already has an exceptionally robust selection of tools to help parse uniquely delimited data. However, there are still some data sets so entangled in formatting that it’s labor intensive to parse even for them. Enter the Find and Replace Tool , which captures the ability to find your nightmarish parsing workflows and replace them with sweet color by number pictures. Just kidding, it finds bad jokes and replaces them with good ones. Seriously, though, you could do both if you wanted to because this tool has the capability to look up a table of any number of specified targets to find in your data and will replace them with a table of specified sources. With the help of a few quick configuration steps, this tool can simplify some parsing use cases significantly.
Reading XML in Alteryx is a little different than other input types and the reason for this is simple - XML is not a tabular data format and the data structure can vary from one file to the next. To effectively read and parse XML data in Alteryx, you need to use a combination of tools, namely the Input tool and the XML Parse tools .
Easily the most used tool in the parsing category, the Text To Columns Tool makes for an extremely quick dicing of delimited fields. To use it you only need to specify a delimited field, delimiter(s), whether you’re parsing to rows or columns (you’ll need to specify a number of columns to parse into with this selection) and you’re off. Any way you slice it, this tool has you covered:
Dynamic Input threw error: ParseError: Invalid document structure at line 1 and column 1 while processing "AliasFileXml" on the workflow with "Root element is missing." in a popup. Upon clicking OK, a different pop-up comes up saying "Internal error in SRC_GetCosmeticName_Raw"
Date/Time data can appear in your data in string formats (text fields) or date formats. The DateTime Tool standardizes and formats such data so that it can be used in expressions and functions from the Formula or Filter Tools (e.g. calculating the number of days that have elapsed since a start date). It can also be used to convert dates in datetime format to strings to use for reporting purposes.