I have attached a sample file and the workflow I currently have. In the Test to column parse, I am trying to select both Ignore delimiters within Quotes and Ignore delimiters within parenthesis, but it won't work for all use cases. The problematic String is - "Post-corrections (type "X, Y, Z")" which parses into three columns instead of one.
I tried to also select the "Ignore Delimiter within Parenthesis" and it only works on its own. not in conjunction with 'Ignore Delimiters within quotes'.
Please let me know how I can parse these fields accurately.
@vpanwar ,
It seems the Text To Columns tool does its work right.
It is difficult to logically know that the string "Post-corrections (type "X, Y, Z")" is actually one column.
As a workaround, you may want to change the parameter "Number of Columns" to '6' with "Leave extra in last colunn".
(However, this solution works only when the colun in problem in the last column.)
@vpanwar as @Yoshiro_Fujimori suggested, you can leave the extra columns as the last column in the text-to-column
So, the example file is just an example. The number of fields as well the text within will be different as I plan to use this as a macro. So having a fixed number of fields is not an option. I was wondering if there is Regex formula that can be used to parse?
@vpanwar ,
As I mentioned earlier, there is no logical way to differentiate which quotation mark is 'inside', and which is the outer-most of a column.
For example, if you are told to separate a string "A"B,C,D"E" with comma, ignoring commas withing quotes:
when you come to an open quote (right before A), you will search for the closing quote, and find it after A.
Then the delmiter comes after B. So you would get the first column as underlined. There are no commas to be ignored.
So I think there is not a simple solution as long as you use CSV delimited with commas.
If you have the control over the data source system, you may want to;
change the delimiter to a charactor which is never used within the text of the value in the columns (for example, pipe "|", at mark "@" etc.), or
connect the database directly with Alteryx (if the data source is a database).