community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
SOLVED

Dynamic Fixed Width Text to Columns

Pulsar

Alright, I've got a doozy...

 

I receive fixed width text files that I currently parse out one by one.  I want to create a macro that auto-parses them.  The issue is that the number of columns and the width of individual columns changes from report to report.  The one I'm currently looking at comes like so:

 

image.png

 

The key to knowing the column widths is the second row with the dashes.  Two blank spaces separate each field.  Is there a way to take the format of a single row and parse the rest of the data accordingly?  I've poked at it with Regex, but no dice yet.

 

The only idea I've had thus far (and it ain't a good one) is to parse out every character onto a separate row, use a Multi-Row to figure out where the breaks are, and Concatenate them back together.  But, the size of the file is quickly going to become an issue.  200 characters across 200 records would give 40,000 rows --- this gets out of hand very quickly, not exactly a viable solution.

 

Assume no absolutes with the column naming conventions.  Could have letters, numbers, punctuation, etc.  I haven't built iterative marcros and while I want to learn, I'm not convinced that I want to learn today.  Looking to do this in a single pass if possible.

 

Thanks all!

Alteryx Certified Partner
Alteryx Certified Partner

I did it!  Yes, it can be done.

 

Cheers,

Mark

 

 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.

This is a gem of a problem!  I'm onto something here, but I want to know if I can assume that the first row containing the field names (as well as the whole document) will always have two white space characters at a minimum between columns, and at a maximum one white space in between words?

 

I'm working on something that creates a regex parse statement based on the number of dashes between white space characters, then injects that formula into a parse tool contained in a macro (since you can only update a tool based on workflow data if you're in the layer above it).

 

Give me a minute to run with things and i'll see if i can't get you a small workflow.

Alteryx Certified Partner
Alteryx Certified Partner

Here's a link to my WebEx.  I'll show you how it is done.

 

https://meetings.webex.com/collabs/meetings/join?uuid=MD551XJD3K750MZ56FWSURXHOP-UCSL&epwd=dcd80e514...

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.

Ok, there's a bunch wrapped up in this macro.  Two key problems: 1) building a regex that knows how many characters need to be in each column, and 2) building the XML code that specifies the right number of columns so that the inner regex parser creates the right number.  See if you can work through it with the comments I added.  I went through and included some extra tools to trim whitespace from the fields.

 

I've tested this on a variable number of columns using the dummy text and it seems to work if I remove a column.  Let me know what happens if you add to it.

 

The key to inputting your text file is to set it as delimited using the "\0" delimiter, which means no delimiter at all, and disabling field names.  Then be sure to set the max field width wide enough to hold the entire row (i used 10000).

 

EDIT: @MarqueeCrew hosting a WebEx and beat me to the punch.  You make it hard to earn stars :)

Alteryx Certified Partner
Alteryx Certified Partner
My solution uses no regex and doesn't have any limits. Did you test my macro?
Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.

@MarqueeCrew Can I ask how you approached this problem without using Regex?  Your macro's encrypted but i'm curious how you handle the parsing from a tool standpoint.

Alteryx Certified Partner
Alteryx Certified Partner

@david_fetters,

 

I encrypted the macro because I'm sending the post To @JoeM as a potential weekly challenge. Hopefully @danrh can verify that it works for his data varieties. I can YouTube the macro. It was fun to create. 

 

Essentially i use the second row to calculate the maximum field lengths. Then I dynamically substring each record into fields and finally rename and auto field the types. 

 

Substring was my friend. Feel free to Pm me and I can send you the macro or review it together. Maybe tomorrow I can create a video. 

 

Cheers,

 

mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
Pulsar

And like magic, two solutions!  Thanks guys!

Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

For those interested in knowing how to solve the challenge, I've created 3 videos to watch.

 

8:10 - How to

 

 

 

29:30 - Build the workflow

 

 

7:58 - Finish it up

 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
Labels