Join the Alteryx Community’s Maveryx Summer Cup event! Compete, network with others, and earn your gold through a series of challenges from July 24th to August 11th. Learn more about the event here.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
RESUELTAS

CSV with extra columns sometimes causing errors

drudd75077
Meteoroide

I work with csvs that sometimes have errors in them. For example, they may have 3 columns, but some rows happen to have more columns in the data. 

 

For example, my file could look something like this:

 

Col1, Col2, Col3

a,       b,      c

d,       e,      f

g,       h,      i,    j

 

So my file will error on the 3rd row because it looks like it has 4 columns but the csv only has 3 columns. I don't have control over how the csv is created so I can't change this.

 

I come from a Python background but am transitioning to a team that uses Alteryx so I'm trying to figure out how to handle this situation in Alteryx. In Python, I currently have my program just create more columns than I need so it imports everything I can then filter to the problem rows and fix them.

 

My files have millions of rows (normally around 10 - 30 rows will be like I described and there can sometimes be several extra commas, not just one extra.

 

 

6 RESPUESTAS 6
flying008
Magnetar

Hi, @drudd75077 

 

FYI. You can use tokenize method of regex by most columns to contain your data.

 

录制_2023_12_28_16_23_10_383.gif

AndrewDMerrill
Púlsar

There are several solutions, but what may be the best for you is to select the following Input Tool Setting:

Screenshot.png

This will import the error rows as Null's, which you can then filter out.

drudd75077
Meteoroide

@flying008

I don't think this will solve my problem. My problem is that I can't import the CSV data into Alteryx to begin with because the problem rows error out. I know how to resolve them once I can get them in the tool.

drudd75077
Meteoroide

@AndrewDMerrill

But then I have null rows, I want to get the data into Alteryx.

TimN
Púlsar

Hi,

This approach keeps 3 columns and leaves the extra data in the third column.

drudd75077
Meteoroide

@TimN 

 

Perfect exactly what I was looking for. 

Etiquetas