Let’s talk Alteryx Copilot. Join the live AMA event to connect with the Alteryx team, ask questions, and hear how others are exploring what Copilot can do. Have Copilot questions? Ask here!
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Parse help

dorchin
5 - Atom

I have this text file which have more than 7,000 rows.  How can I parse this to match the below desire output? 

 

dorchin_0-1754422140110.png

8 REPLIES 8
CoG
14 - Magnetar

RegEx will get you there: there appears to be sufficient structure in the Text Input that a proper Regular Expression in the RegEx Tool will be able to Parse the data.

 

Screenshot.png

 

Expression: (\d+)(.*?)\s{2,}(.*?)\s{2,}(\w{2})(.*?)\s{2,}(\w)\s(\w)(.*$)

 

One of the key items is splitting sections on the \s{2,}, which looks for multiple connected spaces.

 

Hope this helps and Happy Solving!

dorchin
5 - Atom

CoG, thank you your help! The RegEx worked for the most part but there was roughly 900 rows where it was [Null].  At first I thought maybe the data had special characters, so I did a data cleansing to remove punctuation and leading/trailing whitespace.  That didn't help.  Not sure why the RegEx parse didn't work on all. 

 

dorchin_0-1754426942079.png

 

CoG
14 - Magnetar

Not sure how to handle the first two cases, since the Short Code and Full Name are not separated by multiple spaces.

The last example ends in YSY20181214, but all other examples involve a space. I'm glad the expression worked for most of the values, but it seems like there are some inconsistencies (from what I had initially observed). Without well-defined rules, RegEx does not work. I clearly do not know all of the rules, so if you have more information about what the structure of the string is, that will be critical in refining the RegEx.

PangHC
13 - Pulsar

@dorchin @CoG for these case, mostly i will goto open in notepad first. to check the monospace view (where each letter is same size, where ease to see the pattern).

Screenshot 2025-08-06 102236.png 
if it structure like this. mean we can direct split by num of char.
(this why i raise idea to have a monospace in the result window.)

so here the sample workflow to split by char, and use macro that i built before. 

Screenshot 2025-08-06 102200.png

Yoshiro_Fujimori
15 - Aurora
15 - Aurora

@dorchin 

 

You may also use .flat file to read a data file which contain fixed-length fields.

help.alteryx.com : Flat (ascii) Files

 

You can define the metadata as described in the above help page with the extension as ".flat".

For your sample data, it would look like below;

<flatfile version="1">
  <file
    path="sample2025.txt"
    eoltype="crlf"
    trimWhiteSpace="t"
    allowShortLines="t"
  />
  <fields>
    <field name="ABA" type="String" length="9" />
    <field name="Bank Short Name" type="String" length="18" />
    <field name="Bank Full Name" type="String" length="36" />
    <field name="State" type="String" length="2" />
    <field name="Town" type="String" length="25" />
    <field name="Random1" type="String" length="2" />
    <field name="Random2" type="String" length="1" />
    <field name="Date" type="String" length="8" />
  </fields>
</flatfile>

 

Then you can read the .flat file with Input Data Tool.

readFlatFile_workflow.png

 

You can download and unzip the attached zip file and see how it works.

If you save the data file and the .flat file in separate folders, you need to set the path for the text data according to the actual location of the data file.

 

If you want to change the data type of any field, you can change "type" in the .flat file accordingly.

 

I hope this helps.

Gaurav_Dhama_
12 - Quasar

Check this one out, it is independent of length, so in future if your items length changes it will still work as expected.

What I am doing here is identifying converting all two consecutive spaces with "|".

Then I generate a list of possible combinations of "|" using generate rows.

Then I replace all the combinations of "|" with single "|".

Added a recordID, now I can split the whole data into individual rows.

Once done, I use Tile to group on recordID and assign a value to each unique record.

Cross tab the data group by recordID and header as tile sequence and values as Field_1.

Clean it, sort it and you have your data ready for you to work with.

dorchin
5 - Atom

Your workflow worked perfectly.  Silly question 1: where can I find the "Char Columnizer" tool?  I tried typing in the search bar and it didn't come up.  Silly question 2: how do you know how many characters to split into column (9,18,36 etc). Thanks again!!

PangHC
13 - Pulsar

It is Macro, where is built by me. not a standard tool. It should be save along workflow.


To use it, you need to:
1. move the macro to same folder of your workflow.
2. right click -> insert -> macro... to browse the macro

3. then can move as normal tool.
Screenshot 2025-08-07 104032.png

To share the workflow that use macro, you need to export it and make sure the macro is included:

Screenshot 2025-08-07 104228.png

 

To identify the number of char for each column, either count manually or using notepad. it show the # of character in bottom left.
Screenshot 2025-08-07 104513.png

Labels
Top Solution Authors