We are celebrating the 10-year anniversary of the Alteryx Community! Learn more and join in on the fun here.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Ignore wrong records

Saraabdi955
8 - Asteroid

Hi everyone,

I have so many .txt and .csv files. In some of them, there are records that they have more than specific delimiters and I got this message:

"Record #1219: Tool #2: Too many fields in record #57"

 

How to ignore the wrong records and read all the file to end?

11 REPLIES 11
binuacs
21 - Polaris

@Saraabdi955 Change the Field length to 999999 and try again

 

binuacs_0-1677052615012.png

 

Saraabdi955
8 - Asteroid

@binuacs 
No consider:

 

Col1,Col2
22@22.com,qwerre
44@234.com,loplopl
11@11.com,123$,%()
78@78.com,98/*-+

 

there is a delimiter , and col2 consists of another ,

While running the file, I got that message.

Saraabdi955
8 - Asteroid

No consider attached  file:

there is a delimiter , and col2 consists of another ,

While running the file, I got that message.

binuacs
21 - Polaris

@Saraabdi955 use \0 as the delimiter instead of , also uncheck first row contains field names (6th option in input tool configuration). Then use text-column tool to split the records 

Saraabdi955
8 - Asteroid

it's not correct.

please watch the attached dataset.

what should we do when we have more than delimiters determined?

davidskaife
14 - Magnetar

Hi @Saraabdi955 

 

Is it because you have a comma in the password section? Using @binuacs solution you can get it to work - just restrict the Text to Columns tool to 2 columns only, leaving the extra in last column:

 

DavidSkaife_0-1677068523742.png

 

Your output is this:

 

DavidSkaife_1-1677068544010.png

 

 

If this isn't right i'd provide detail on how you expect the output to look

Saraabdi955
8 - Asteroid

Thanks @davidskaife 
it's ok but not exactly.

If there was another column after password, we couldn't do it.

davidskaife
14 - Magnetar

Hi @Saraabdi955 

 

Try this Regex Parse, i've tested it with an additional column after the password:

 

DavidSkaife_0-1677081282536.png

 

Here is the setting for the Regex:

 

DavidSkaife_1-1677081303065.png

 

DavidSkaife_2-1677081391192.png

 

danilang
19 - Altair
19 - Altair

Hi @Saraabdi955 

 

What you have here is classic example of "Dirty Data".  It's something for which there is no easy or simple fix for in any analytics platform, Alteryx, DataBricks, PowerBI, informatica, etc.  The process of cleaning data can take up to 25% of a data scientist's time. 

 

There's a reason that the Alteryx input tools throw errors and warnings if the schema of multiple files doesn't match or it a row in a csv file doesn't have the same number of delimiters as the header row.  The tools were developed with the principle that it is better to exclude suspect data than to include it and have it produce erroneous results.

 

There are techniques that you can use to clean data, but they need to be applied on a case by case basis, by someone who has access to the entire input data.  @binuacs has suggests loading all the files with no delimiters and then parsing the data.  @davidskaife suggest using regex to parse some problematic data.  What you need to do now is use these methods, and work your way through all the data files applying these and other methods until until you can load and parse all the data.  

 

If you have specific questions and you can provide specific examples, please post more questions as you work your way through the data.

 

Dan  

Labels
Top Solution Authors