Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

CSVFile: A field was missing a closing quote in record xxx

mvidas
6 - Meteoroid

Hi All,

 

I've got a problem trying to import a CSV file through Input Data. I'm importing a file created by one of our systems, and it seems that Alteryx doesn't like an exclamation point after a closing quote in the file.

 

Here is a sample csv file to show what I mean:

field1,field2,field3,field4,field5
alteryx,is,the,best,ever
alteryx,is,the best,ever,
alteryx,is,the best,ever,
alteryx,is,the best,"ever"!!,
alteryx,is,the best,"ever!!",

It doesn't like row 5, the "ever"!!, part. It has no issue with the line after it, where the exclamation marks are inside the quotes.
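For anyone curious why this trips the parser: a strict CSV reader expects a delimiter immediately after a closing quote, so anything else there is a malformed record. Python's csv module reproduces the same complaint when you turn on strict mode (a minimal illustration, not Alteryx itself):

```python
import csv

rows = [
    'field1,field2,field3,field4,field5',
    'alteryx,is,the best,"ever!!",',   # fine: the !! is inside the quotes
    'alteryx,is,the best,"ever"!!,',   # bad: the !! follows the closing quote
]

# Lenient parsing silently swallows the stray characters after the quote...
print(list(csv.reader(rows)))

# ...while strict parsing fails on the bad row, much like Alteryx does.
try:
    list(csv.reader(rows, strict=True))
except csv.Error as e:
    print("parse error:", e)
```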

 

I can't use option 9, Ignore Delimiters in Quotes, because my data often (but not always) has data I need inside quotes there, e.g. ,12345,"Last, First, M", so I need those commas escaped with quotes.

 

The system that makes this file has a "comment" field, which is where the "ever"!! portion is written. I don't even need that field, but I can't control the output, and I also cannot change it in the system itself.

 

The comment field in question is linked to a regular customer, so it shows up in the daily reports regularly. As far as I know, though, only one customer causes this.

 

I also can't use option 10, "Treat Read Errors as Warnings", because it turns the entire record into [Null] values instead of just the comment field.

 

I thought about inputting the data with a dummy delimiter and using Replace to change '"!!' to '!!"', but I don't know how to get the workflow to wait for the Output Data to finish before running the second Input Data. Block Until Done didn't work, as Input Data doesn't have an input. An example: (workflow screenshot: Untitled.png)

Are there any options to solving this that I did not consider? Thanks!

 

8 REPLIES
BenMoss
ACE Emeritus

I've bumped into a similar issue before and resolved it in a similar fashion: read it in with no delimiter. The part you're missing, though, is that there's no need to write your data out and then read it in again; you can just use the Text to Columns tool to parse your data immediately.

 

Ben

mvidas
6 - Meteoroid

Ok, so I did figure out a workaround. I'll explain it here in case it could help someone else, but I'm still wondering if there's a better way of doing this so I'll leave it marked unsolved for now.

(workflow screenshot: mvidas_1-1617739550575.png)

The top input has a record limit of 1 and "First Row Contains Field Names" unchecked, which pulls the headers into row 1. The Dynamic Rename promotes row 1 to field names, giving an empty data set with the correct headers.

 

The lower branch imports the data with a delimiter of \0 and "First Row Contains Field Names" unchecked, then uses Replace() to fix the data. I then use Text to Columns with an excess number of columns listed and "Ignore delimiters in quotes" checked, another Dynamic Rename to promote row 1 to field names, and a Dynamic Select to get rid of the original undelimited field (FieldNumber > 1). Finally I join the two branches and output only the common subset of fields.
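Outside Alteryx, the core of this fix-then-split idea can be sketched in a few lines of Python. The '"!!' to '!!"' replacement is the one described in the workflow; the csv.reader call stands in for Text to Columns with "Ignore delimiters in quotes" checked (the sample data here is made up to match the thread):

```python
import csv

raw = '''field1,field2,field3,field4,field5
alteryx,is,the best,"ever"!!,
alteryx,is,the best,"Last, First, M",done
'''

# Step 1: read each record as a single undelimited string
# (the \0 delimiter trick in Alteryx).
lines = raw.splitlines()

# Step 2: Replace() -- move the stray !! inside the quotes.
fixed = [line.replace('"!!', '!!"') for line in lines]

# Step 3: split into columns while honoring quoted delimiters.
records = list(csv.reader(fixed))
header, data = records[0], records[1:]
```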

mvidas
6 - Meteoroid

I realized that as you were writing it and posted my method (the number of columns could change, so I wanted to code for that possibility). I thought there might be a different way around it, but I guess there isn't.

 

Thanks for the sanity check!

BenMoss
ACE Emeritus

@mvidas I'm not sure I get the two branches, but the bottom half is exactly how I'd approach the task!

 

Ben

mvidas
6 - Meteoroid

@BenMoss The Text to Columns tool made me choose the number of columns to split into. I tried putting 0, but it didn't work. For this output file I know how many fields there are, but this system exports a number of CSV files, and I wanted to write it for a dynamic number of columns.

 

I'm attaching the workflow itself in case that helps. Is there a way to use Text to Columns with an unspecified number of columns?

BenMoss
ACE Emeritus

In that scenario I would use the "split to rows" option followed by a Cross Tab tool, which gives you a "dynamic" Text to Columns.
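The split-to-rows-then-crosstab pattern adapts to any column count because each value carries its own record and position. A rough Python sketch of the same idea (a stand-in for the Alteryx tools, not their implementation):

```python
import csv

lines = [
    'field1,field2,field3',
    'a,b,c',
    'd,"e,e",f',   # quoted delimiter survives the split
]

# "Split to rows": each record becomes (record_id, position, value) triples,
# still honoring quotes around embedded delimiters.
long_form = [
    (rec_id, pos, value)
    for rec_id, row in enumerate(csv.reader(lines))
    for pos, value in enumerate(row)
]

# "Cross tab": pivot position back into columns -- the table width
# is derived from the data, not hard-coded.
width = max(pos for _, pos, _ in long_form) + 1
table = [[''] * width for _ in range(len(lines))]
for rec_id, pos, value in long_form:
    table[rec_id][pos] = value
```

Ragged records simply leave empty cells on the right, which mirrors how a Cross Tab pads missing combinations.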

 

Ben

mvidas
6 - Meteoroid

Interesting. I just tried it quickly and it put my whole file into one field, but I'll give it another try in the morning. Thanks for the idea!

Shalz
8 - Asteroid

I would do it like this:

1. Replace , with tab 

2. Cleanse - Check Punctuation

3. Now use Dynamic rename to get first row of data as headers
