Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Fixing issues in file import with Flat File

dpakapd
8 - Asteroid

Hi,

For data processing I import data using a Fixed width flat file which when the data in the input file (usually in .txt format) is correct will import it as shown below:

 

dpakapd_0-1643922138742.png

 

However let's say the input file has some issue (happens sometimes when a vendor sends it with wrong formatting) and there is an additional tab or space before the first character (or subsequent characters) on any row then the input gets skewed for that row alone as shown below (notice the first row and first column only has 5 numbers. First row and 2nd column has 5 numbers instead of 4 and everything is shifted by one character to the right)

 

dpakapd_1-1643922277543.png

 

If this was one file, I can manually fix it. But I would like to see if there is some way I can import the data from the text file and then make Alteryx fix these issues so that the output is per the 1st image shown above.

 

Below is the (correct) format in which data usually occurs in the text file.

 

dpakapd_2-1643922370514.png

 

 

And below is when I shifted just the 1st row by 1 space to the right. How can I make Alteryx import the data regardless and fix it in the format (as shown in the very first image)?

 

dpakapd_3-1643922420336.png

 

Each column has a fixed width and the spacing is also important. So I cannot just strip all spacing and then use substring extract (tried it and it didnt work because it was pulling the number as "520101" whereas the correct format is "5201  " (2 spaces after 5201)).

 

Thanks

7 REPLIES 7
Luke_C
17 - Castor
17 - Castor

There may be a more elegant way to do this, but you could try:

 

  1. Reading it in as a non-delimited text file so everything is in one field
  2. Trim leading/trailing spaces
  3. Use a regex tool to parse out the fields based on the fixed width definitions (see below screenshot)
  4. Luke_C_1-1643923662882.png

     

 

dpakapd
8 - Asteroid

Hi Luke.. Thank you! Is there a way I can make Regex ignore the space before the 012200 (I am assuming what if the vendor had introduced an extra space in the middle rather than at the beginning) as shown below (first row). Using regex is there a way I can make it ignore this space before 0122

 

dpakapd_0-1643925063370.png

 

Without ignoring that extra space before 0122, my output becomes as shown below:

 

dpakapd_1-1643925134905.png

 

Thanks

binuacs
21 - Polaris

@dpakapd please provide some sample data and width of each fields

Luke_C
17 - Castor
17 - Castor

Hi @dpakapd I would just use a formula tool right after the input that is TRIM([Field_1]). That will get rid of the leading space (if there is one)

Luke_C
17 - Castor
17 - Castor

@dpakapd  missed your point about the extra space in the middle of the fields. That is trickier but should be able to get parsed out if you give some sample data (not screenshots).

 

Personally, if one of my vendors had agreed to a file format  I would push back and get them to correct it :) 

dpakapd
8 - Asteroid

@binuacs : Pl see attached a sample data. The width of the fields is as shown in the screenshot below from the flat file.

 

dpakapd_0-1644252955901.png

 

dpakapd
8 - Asteroid

Hi Luke.. I have attached some sample data.. 

As of now we are pushing back on vendors to fix the file format. But it becomes a repetitive exercise with multiple vendors each month. That's why I want to try and see if there is someway I can take in the data they give and spit it out in the correct format. 

 

Below is the screenshot from the flat file (not able to upload that).

 

dpakapd_0-1644253105232.png

 

Labels
Top Solution Authors