We are celebrating the 10-year anniversary of the Alteryx Community! Learn more and join in on the fun here.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Parse csv field that has quotes and commas within a field

vpanwar
5 - Atom

I have attached a sample file and the workflow I currently have. In the Test to column parse, I am trying to select both Ignore delimiters within Quotes and Ignore delimiters within parenthesis, but it won't work for all use cases. The problematic String is - "Post-corrections (type "X, Y, Z")" which parses into three columns instead of one. 

 

I tried to also select the "Ignore Delimiter within Parenthesis" and it only works on its own. not in conjunction with 'Ignore Delimiters within quotes'.

 

Please let me know how I can parse these fields accurately. 

4 REPLIES 4
Yoshiro_Fujimori
15 - Aurora
15 - Aurora

@vpanwar ,

It seems the Text To Columns tool does its work right.

It is difficult to logically know that the string "Post-corrections (type "X, Y, Z")" is actually one column.

 

As a workaround, you may want to change the parameter "Number of Columns" to '6' with "Leave extra in last colunn".

(However, this solution works only when the colun in problem in the last column.)

binuacs
21 - Polaris

@vpanwar as @Yoshiro_Fujimori suggested, you can leave the extra columns as the  last column  in the text-to-column

image.png

vpanwar
5 - Atom

So, the example file is just an example. The number of fields as well the text within will be different as I plan to use this as a macro. So having a fixed number of fields is not an option. I was wondering if there is Regex formula that can be used to parse? 

Yoshiro_Fujimori
15 - Aurora
15 - Aurora

@vpanwar ,

 

As I mentioned earlier, there is no logical way to differentiate which quotation mark is 'inside', and which is the outer-most of a column.

 

For example, if you are told to separate a string "A"B,C,D"E" with comma, ignoring commas withing quotes:

when you come to an open quote (right before A), you will search for the closing quote, and find it after A.

Then the delmiter comes after B. So you would get the first column as underlined. There are no commas to be ignored.

 

So I think  there is not a simple solution as long as you use CSV delimited with commas.

 

If you have the control over the data source system, you may want to;

change the delimiter to a charactor which is never used within the text of the value in the columns (for example, pipe "|", at mark "@" etc.), or 
connect the database directly with Alteryx (if the data source is a database).

 

Labels
Top Solution Authors