Alteryx Designer Desktop Discussions

danrh · ‎04-24-2018

Alright, I've got a doozy...

I receive fixed width text files that I currently parse out one by one. I want to create a macro that auto-parses them. The issue is that the number of columns and the width of individual columns changes from report to report. The one I'm currently looking at comes like so:

The key to knowing the column widths is the second row with the dashes. Two blank spaces separate each field. Is there a way to take the format of a single row and parse the rest of the data accordingly? I've poked at it with Regex, but no dice yet.

The only idea I've had thus far (and it ain't a good one) is to parse out every character onto a separate row, use a Multi-Row to figure out where the breaks are, and Concatenate them back together. But, the size of the file is quickly going to become an issue. 200 characters across 200 records would give 40,000 rows --- this gets out of hand very quickly, not exactly a viable solution.

Assume no absolutes with the column naming conventions. Could have letters, numbers, punctuation, etc. I haven't built iterative marcros and while I want to learn, I'm not convinced that I want to learn today. Looking to do this in a single pass if possible.

Thanks all!

MarqueeCrew · ‎04-24-2018

I did it! Yes, it can be done.

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.

david_fetters · ‎04-24-2018

This is a gem of a problem! I'm onto something here, but I want to know if I can assume that the first row containing the field names (as well as the whole document) will always have two white space characters at a minimum between columns, and at a maximum one white space in between words?

I'm working on something that creates a regex parse statement based on the number of dashes between white space characters, then injects that formula into a parse tool contained in a macro (since you can only update a tool based on workflow data if you're in the layer above it).

Give me a minute to run with things and i'll see if i can't get you a small workflow.

MarqueeCrew · ‎04-24-2018

Here's a link to my WebEx. I'll show you how it is done.

https://meetings.webex.com/collabs/meetings/join?uuid=MD551XJD3K750MZ56FWSURXHOP-UCSL&epwd=dcd80e514...

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.

david_fetters · ‎04-24-2018

Ok, there's a bunch wrapped up in this macro. Two key problems: 1) building a regex that knows how many characters need to be in each column, and 2) building the XML code that specifies the right number of columns so that the inner regex parser creates the right number. See if you can work through it with the comments I added. I went through and included some extra tools to trim whitespace from the fields.

I've tested this on a variable number of columns using the dummy text and it seems to work if I remove a column. Let me know what happens if you add to it.

The key to inputting your text file is to set it as delimited using the "\0" delimiter, which means no delimiter at all, and disabling field names. Then be sure to set the max field width wide enough to hold the entire row (i used 10000).

EDIT: @MarqueeCrew hosting a WebEx and beat me to the punch. You make it hard to earn stars :)

MarqueeCrew · ‎04-24-2018

My solution uses no regex and doesn't have any limits. Did you test my macro?

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.

david_fetters · ‎04-24-2018

@MarqueeCrew Can I ask how you approached this problem without using Regex? Your macro's encrypted but i'm curious how you handle the parsing from a tool standpoint.

MarqueeCrew · ‎04-24-2018

@david_fetters,

I encrypted the macro because I'm sending the post To @JoeM as a potential weekly challenge. Hopefully @danrh can verify that it works for his data varieties. I can YouTube the macro. It was fun to create.

Essentially i use the second row to calculate the maximum field lengths. Then I dynamically substring each record into fields and finally rename and auto field the types.

Substring was my friend. Feel free to Pm me and I can send you the macro or review it together. Maybe tomorrow I can create a video.

Cheers,

mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.

danrh · ‎04-25-2018

And like magic, two solutions! Thanks guys!

MarqueeCrew · ‎04-25-2018

For those interested in knowing how to solve the challenge, I've created 3 videos to watch.

8:10 - How to

29:30 - Build the workflow

7:58 - Finish it up

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.

Alteryx Designer Desktop Discussions

Dynamic Fixed Width Text to Columns