Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
Alteryx is here to help you solve your biggest data challenges. Read about the new Virtual Solution Center here.

Character Truncation when parsing flat file by length

Highlighted
6 - Meteoroid

Hi, 

 

I am trying to parse a flat file by different, known-in-advance lengths (ie, characters 1-25, 25-50, 50-65, etc.) with each entry ~600 characters long. I am getting Null outputs and the input field cells show that they have truncated characters. This is shown below.

Alteryx String Parse by Length.PNG

Can anyone help me figure out what is going wrong here?

 

Thanks!

Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

Hi @kmshannon 

 

Have you tried to put a ".*" at the end of the RegEx, after your parsing groups?

 

Cheers,

Highlighted
6 - Meteoroid

Hi Thableaus

 

 

Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

Hi @kmshannon 

 

When I read a .txt fixed file, I like to import a file layout.  Here's what a flat file layout looks like:

 

<flatfile version="1">
   <file
      path="C:\Users\MFrisch\Desktop\temp.asc"
      eoltype="crlf"
      />
   <fields>
      <field name="Field1" type="String" length="1"/>
      <field name="Field2" type="Byte" length="3"/>
      <field name="Field3" type="Date" length="10"/>
   </fields>
</flatfile>

I'll use a tool like Excel or Notepad++ to help me.  If I construct the <Fields> values and paste them into a document named MyLayout.FLAT I can import that as part of the INPUT tool and create the right NAMES and filetypes.

 

Using excel's concatenate function, I can read an input list of fields and sizes and quickly construct the xml.  Yes, i could do that in Alteryx. too.

 

Cheers,


Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

@kmshannon 

 

Each row probably has a newline character, which is not covered by (.) itself. Dot means any character, but if you want to include "\n" (newline), you need to add a "?s" modifier (DOTALL mode).

 

I recommend this website  and also the book "Mastering Regular Expressions" that covers the entire engine behind it.

 

Try to put ?s before your Regex and see what happens.

 

Cheers,

 

 

Highlighted
6 - Meteoroid

I converted the .dat file I was using to .xlsx, but am still getting the character truncation error. Can anyone explain what is happening here? I am trying to read in string records with whitespaces that are up to ~600 characters in length.

Alteryx String Parse by Length.PNG

Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

Hi @kmshannon 

 

Could you please provide 1 row of your data?

 

It looks like your Regex is wrong.

 

it is trying to match .(any character) and then multiple spaces "\s"{n} in a sequence, which does not seem to be the case.

 

Cheers,

Highlighted
6 - Meteoroid

Thableaus,

 

A cleaned sample file is attached with two rows of data. The file has 24 columns of different, but known lengths. The lengths of each field (including empties) that I am trying to extract is as follows:

25
25
15
100
15
22
240
35
8
5
4
7
6
6
5
8
5
6
2
3
16
6
20
22

 

Let me know what you think the best approach is. I can't believe that Alteryx doesn't have a simpler way to parse .dat files based on field lengths.

Thanks

Labels