Alteryx Designer Desktop Discussions

mustufa2019 · ‎10-26-2019

I am trying to normalize a text file into a cleaner excel output file. There are headers and multiple rows here and there I do not need. I am trying to get all the users under the columns fields "ID", "Initials", and "Name". Please see screenshot below:

I usually do clean up in excel but wanted to create a workflow to make this a quicker process.

This is the output I am looking for

ID Initial Name

AU AUI Choi

TonyA · ‎10-27-2019

There are different techniques that can be used depending on the data.

The general steps to follow are:

1. Open the file as a txt file. This will open a "Resolve File Type" window.

2. Select Read as Delimited file and select OK.

3. If the data is in columns separated by spaces, you can define the columns in the next window by clicking on the character positions at the top of the window. Don't worry about unneeded line of characters before or after the data, just split the file based on the column boundaries of the data you care about. If the data is in columns separated by tabs, then you don't need to worry about column boundaries at this point. Take a look at this link for more info on how to split raw text data into columns: https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Text-txt-file-Non-Delimited-Flat-File-...

4. Click OK and the you'll get a preview of the data.

5. Use a Sample tool to skip the rows you don't care about. After this is done, you should have the column headers as the top row of data.

6. If you had split the columns in Step 3, then skip this step. If you hadn't split the columns, and the column data are separated by tabs, then set a row ID, use a Text to Columns tool to split the data to rows, then use a crosstab to put the data into separate columns.

7. You should now have the data in columns with the column names in the first row. Use the Dynamic Rename Tool to promote the first row to be the column headers.

8. At this point you can use Select, Cleansing, and/or Autofield to clean up the columns, select the ones you want, and set the data types.

mustufa2019 · ‎10-27-2019

This is the below setting i put while importing the file and the second screenshot shows the output. Not sure if i am doing this right.

When i select none as my delimiter, i get the below report output

RolandSchubert · ‎10-27-2019

Hi @mustufa2019 ,

your file looks like a report file create by a program running on a mainframe computer. I suppose, it is a fixed width text file, so you should select "Read in as a fixed width text file".

The next step will prompt you to (visually) define fields:

Of course you will have to remove the report header rows (e.g. using Sample tool).

Hope this is helpful.

Best regards

Roland

danilang · ‎10-27-2019

Hi @mustufa2019

@TonyA has a good solution for working on text files based on column position. Another technique you can use is to import the entire file with no delimiters and use a technique similar to this to parse out the data

Start with your input tool set to import the file with no delimiters (\0) and data in the 1st row

Then use a Multirow tool to find the rows that contain the information. In this case it's all the rows from the ID header row to the next null row. Use a filter to remove all the other rows, add a record ID and use use this split the "ID INITIALS NAME PRINTER" header row from the other data rows. The data you're interested in is everything from the start of the line to character position just before the word "PRINTER", so find the PrinterPosition in the Header row and append that to data rows. Trim the data rows so they contain everything up to but not including this position. Then use a Regex tool in Parse mode to split this on space(s) to give you

mustufa2019 · ‎10-30-2019

@TonyA

Thank you so much! I actually went back and tried your approach and it worked!

mustufa2019 · ‎10-30-2019

@danilang

Thank you so much for the help! I followed @TonyA solution and it worked!

Abhishek_0331 · ‎01-27-2021

Hi Roland,

This was really helpful, I would like to know if the same formatting can be done for a CSV or a XLS file coming from a mainframe computer. If Yes, could you please suggest a method ?

Best

Abhishek

Alteryx Designer Desktop Discussions

Delimited Text File Normalization

Re: Date Time Function - Prioritization Base on Du...

Re: Running multiple alteryx workflows within alte...

Re: Selecting the columns coming after a specific ...

Re: Regex(?) formula to remove values matching the...

Re: Python ECC SAP Extract into Alteryx Workflow