We are celebrating the 10-year anniversary of the Alteryx Community! Learn more and join in on the fun here.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Reading CSV with line breaks

Andy_Hines
5 - Atom

I have a large .csv file that I am trying to read in through the Input Data tool.  There are carriage breaks in random rows throughout the file and the input tool is treating each of those as new rows.  I have tried selecting "AMP Only: Allow Newlines in Quoted Fields" but it is still creating new records for each break. 

Here are my input settings:

Capture.PNG

 

I've added a sample of the data and you can see in rows 4 - 8, those should all be on the same line. 

 

Any help is much appreciated. 

 

4 REPLIES 4
danilang
19 - Altair
19 - Altair

Hi @Andy_Hines 

 

This is a tough one.  Is the text with the line breaks always something always something like 

 

"Automated refund
Reversed by ctrl# xxxxxx
update refund"

 

Dan

Andy_Hines
5 - Atom

It does seem to always say something like "Automated refund" - yes.  

danilang
19 - Altair
19 - Altair

Hi @Andy_Hines 

 

If you can figure out the pattern to these messages, specifically how they start i.e. "Automated refund" and how they end "update refund" you can concatenate the entire file into a single string by replacing the crlf with "|" and then use a regex replace to get rid of pipes within the start and end sequences.  Then split the resulting string to rows on the "|" and then split to columns on ","  you should be able to get the file into a reasonable format.

 

Dan

K_Vinogradov
6 - Meteoroid

Hey @Andy_Hines!

As Dan correctly stated, the task would be much easier if we can identify specific starting and ending patterns. However, I found a solution that worked on the sample and may be applicable to the whole dataset if the schema is the same.

The attached workflow ensures that the last three symbols of the cell must be a dot followed by two numbers. If this ending pattern persists throughout the whole dataset, we can combine all the rows together and "fix" the csv file.

I understand that it is not the ideal solution, and more pertains to cleaning the dataset rather than setting up the input tool, but I hope you can find this useful.

Labels
Top Solution Authors