Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

Parsing Data with Unknown Number of Fields

PeterS
Alteryx
Alteryx
Created

You have a delimited file that you need to read into Alteryx but there are an unknown number of data fields. What do you do? (Aside from opening the file and counting out the delimiters by hand)

 

In this example all the data is coming through in one column of data and it needs to be parsed out to individual columns.

 

1 starting data.png

 

Pretending we didn’t know there are 4 fields we can follow simple workflow to dynamically parse the data into the appropriate number of fields.

Start by adding a Record ID tool to the workflow, then add a Text to Columns tool.  Rather than configuring the tool to parse to columns, we’re going split to rows using the delimiter, in this case the ‘|’.

 

2 workflow.png

3 configuration.png

 

This will take all the data from the one row and put it in individual rows, notice that the RecordID groups everything from the same row together, which will come in handy a little later.

 

4 progress.png

 

Add Formula tool and create a field call count and set it equal to 1.  Now a Running Total tool and configure it to group by RecordID and Create Running Total on the Count Field. This will essentially give us the number of columns in our data.

 

5 workflow.png

6 progress.png

 

Two more tools and we are good to go.

 

Now add a Cross Tab tool, this is where the ‘magic’ happens.  Configure to the tool to group by RecordID, set the Head Field to RunTot_Count, Data Field should be set to F1 and then select First under Methodologies.

 

7 workflow.png

8 configuration.png

 

The Final step is to add a Dynamic Rename tool and configure it to ‘Take Field Names from First Row of Data’. Run the workflow and you’re done!

 

9 workflow.png

 

10 final data.png

 

Happy Parsing! (this example workflow - created in v. 10.0 - is attached)

 

 

Attachments
Comments
GarthM
Alteryx Alumni (Retired)

The "run total" trick! Heavily used in Content Engineering. Great post!!

Edu_XGomes
7 - Meteor
Spoiler
 

Great post, I have to deliver a job really similar on this but the content in my column "F1" has many delimiters, some time there is "," or "//" or "--" and I am lost on this, can you guys please help me on this?
Thanks.

PeterS
Alteryx
Alteryx

Hi @Edu_XGomes

 

One way to to deal with multiple delimeters would be to use a Formula Tool to replace all the other delimeters with one delimeter so we only need to deal with one particular delimeter character.

 

In the Formula tool use the below function to update your 'F1' field.

 

Replace([F1], '//', '|')

 

In the example above wherever // is found it will be replaced with a |  You would just need to set up one formula for each of the delimeters you would need to replace. 

 

Here is example Input  data.

input.png

 

Insert a Formula tool after the Input.

Formula tool.png

 

Configure the Formula tool to account for all the varried delimeters.

formula.png

 

How the data looks coming out of the Formula tool. Everything has the same delimeter.

after formula.png

 

The bottom workflow is how the updated workflow would look.

workflow.png

Hope this Helps.

 

 

 

pradeephc
6 - Meteoroid

This is amazing... alteryx has really good features..

Thanks peter for writing this nice article..

Guruprem
6 - Meteoroid

Fantastic! Learned about Running Total Tool

mszpot89
9 - Comet

Create Running Total doesn't see Count field from previous step. How do you setup Record ID part? Should starting value be defined or remain as 1?

PeterS
Alteryx
Alteryx

Hi @mszpot89,

When configuring your Formula tool to create the count field, be sure to set the Data type to a number type like a Double.  When creating a new formula the default data type is a V_WString which will not be recognized by the Running Total tool.

 

Be sure to download the example workflow to see how it is configured.

iambhartesh
6 - Meteoroid

Thanks for sharing. I've been struggling with same kind of source. Now figured it out.

Gikafra
6 - Meteoroid

Beautiful explanation @PeterS. Learning a lot from this post.

Frachiu
7 - Meteor

Alteryx:

This article has great information but it needs quite a few edits.

A few examples:

there are an unknown ( ?)

we’re going split to rows using the delimiter, in this case the ‘|’. (The comma after the word delimiter creates confusion, although it is a pause, it is also the default delimiter as well as the commas surrounding the pipe) we’re going to split.

Pretending we didn’t know there are 4 fields (there is actually only one field that needs to split into four)

Add Formula tool and create a field call count and set it equal to 1 (Add a Formula tool/create a field called Count)

Configure to the tool to group by RecordID, set the Head Field (Remove the word “to” and change Head to Header)

select First under Methodologies (Under Methodologies select First).

I would also remove the commas from the word magic.

 

mwong
8 - Asteroid

Thank you so much!  This was EXTREMELY helpful and the post was well written, detailed, and easy to follow.