Hi,
I am trying to create a workflow to properly import txt data. My data looks like this and does not contain field names:
These are the options I select in the input tool:
This is the output I get:
Is there any way to build a workflow to properly import the data, or will I need to fix these errors manually in the txt file? Example data file attached.
Appreciate any guidance. Thank you.
You're going to have to read it in with no delimiter and use RegEx afterwards to parse it, in my opinion. Having quotes to group data together when there are also quotes inside the values makes this a challenge to read in normally.
Please provide what the results should look like because it looks like some rows have more columns than the others.
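To illustrate the "no delimiter, then RegEx" idea, here is a minimal Python sketch, assuming each record arrives as one raw string (the sample record below is invented for illustration). The lookahead splits only on commas that are followed by an even number of quote characters, i.e. commas sitting inside a balanced quoted value are left alone:

```python
import re

# Split on commas followed by an even number of quotes to the end of the
# line, i.e. only commas that sit outside any balanced quoted value.
SPLIT_OUTSIDE_QUOTES = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')

def parse_record(line: str) -> list[str]:
    """Split one raw record, then strip surrounding whitespace and quotes."""
    return [f.strip().strip('"') for f in SPLIT_OUTSIDE_QUOTES.split(line)]

# Hypothetical record in the style of the sample data
print(parse_record('1001,"PIPE, 4",STEEL'))  # → ['1001', 'PIPE, 4', 'STEEL']
```

Note this assumes balanced quotes per record; a stray quote like the one in the third record of the sample would still throw the grouping off and need separate handling.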
Hi @MasoB
If you read it in with no delimiter as @alexnajm suggested you can use a Data Cleansing tool to remove leading & trailing whitespace, plus tabs, line breaks and duplicate whitespace. Then using a Text to Columns tool you can split on comma, and set it to ignore delimiters in quotes.
This works for the first and second rows, but not the third, as there is either a missing comma or one comma too many around the "PIPE, 4 " part. If you remove the quote after the 4, it works in line with the other rows.
Of course if that is the exact file layout then it makes it more tricky, and Regex parsing will have to be the way forward.
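For anyone wanting to check the "ignore delimiters in quotes" behaviour outside Alteryx, Python's stdlib csv module does the equivalent quote-aware split (the row below is made up): a comma inside a balanced quoted field is treated as data, which is exactly why the unbalanced quote in the third record breaks the grouping.

```python
import csv
import io

# csv.reader honours quoting: the comma inside "PIPE, 4" is data,
# not a delimiter, so the row splits into three fields.
clean_row = next(csv.reader(io.StringIO('123,"PIPE, 4",456')))
print(clean_row)  # → ['123', 'PIPE, 4', '456']
```

With an unbalanced quote, the opening quote swallows everything up to the next quote (or the end of the line), so the field count no longer matches the other rows.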
This is what it should look like ideally.
You can use the following workflow to create the output that you're looking for.
Use the python tool and update the file_path.
from ayx import Alteryx
import re
import pandas as pd
import csv

# Update this to point at your copy of the sample file
file_path = r"C:\Users\Warcry\Downloads\EXAMPLE UPDATED.txt"
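The body of the script wasn't captured above, so here is only a sketch of what the parsing step might look like, assuming one record per line and the split-on-commas-outside-quotes approach (the sample records are invented, and `Alteryx.write` is only available inside the Python tool itself):

```python
import re
import pandas as pd

# Commas outside balanced quotes: followed by an even number of quotes.
SPLIT = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')

def parse(text: str) -> pd.DataFrame:
    """Parse raw text, one record per line, into a DataFrame."""
    rows = [[f.strip().strip('"') for f in SPLIT.split(ln)]
            for ln in text.splitlines() if ln.strip()]
    return pd.DataFrame(rows)

# Inside the Python tool you would then read file_path and write out:
#   with open(file_path) as fh:
#       Alteryx.write(parse(fh.read()), 1)
df = parse('1001,"PIPE, 4",STEEL\n1002,VALVE,BRASS')
print(df.shape)  # → (2, 3)
```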
@MasoB if one of these solutions works for you, please mark it as accepted so that others can find and use it.
Thank you for the simple and easy to follow solution.
Your file has consistent spacing and so it's just a matter of working out how many characters per field.
This is the character numbers for where commas occur in your sample data.
Record 1 has an erroneous comma at position 35 and Record 3 has one at position 58.
So, your fields are defined by their size, and these formulas will get you there:
You can wrap them all in Trim(formula, '"') to trim the quotes off the start/end, but that will also remove any ending quotes that belong in the data. Depending on what the full data looks like, you may choose to just change the substrings to 1,5 instead of 0,7, for instance.
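As a quick illustration of the substring-plus-trim idea, here is a Python sketch (the field widths are invented, since they depend on your exact file): slice each field at its fixed positions, then strip whitespace and the surrounding quotes, which is the Substring/Trim combination described above.

```python
# Hypothetical fixed-width layout: chars 0-6, 7-20, 21-30 per record.
WIDTHS = [(0, 7), (7, 21), (21, 31)]

def split_fixed(line: str) -> list[str]:
    # Equivalent of Substring(...) per field plus Trim(field, '"')
    return [line[a:b].strip().strip('"') for a, b in WIDTHS]

record = '"1001" "PIPE, 4"     STEEL     '
print(split_fixed(record))  # → ['1001', 'PIPE, 4', 'STEEL']
```

Narrowing a slice (e.g. 1,5 instead of 0,7) instead of trimming is the safer option when a quote at the end of a field is real data rather than a delimiter.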
I've attached a workflow that has more than I intended but shows my investigation as well. The Dynamic Formula is not essential, as you can copy/paste each of the formulas into a formula tool if you like.