Hi,
I'm trying to read a .csv file that follows RFC 4180: http://www.ietf.org/rfc/rfc4180.txt Essentially, this means that if a value contains a quotation mark or a comma, it gets wrapped in additional quotation marks. On top of that, the whole line (first and last value) gets wrapped in extra quotation marks. When I try to input these files in Alteryx, I end up either with all values in one field or with a parsing error. I found an Input tool setup that lets me ignore delimiters in quotes or single quotes, but a value with a comma inside is still treated as two separate values. I can't build a RegEx flexible enough to handle both special characters, so I'd appreciate any advice :)
Attached is a file with dummy data, so feel free to try it on your own.
Are you sure that this is a valid file? It seems strange to me: some fields have lots of ", while others have only one.
@Felipe_Ribeir0
I also thought something was wrong, so I asked the users to provide a file that had not been opened by anyone, and this is what you see. I only changed the letters; the quotes and commas are in exactly the same order. I'm still trying to confirm that and see the original file, but I wanted to make sure the workflow is bulletproof.
@gautiergodard
Makes perfect sense, thank you! I was focused on parsing the whole file at once and missed the idea of separating the rows and processing them in two streams. I'll test it on more cases :)
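For anyone who wants to check the two-stream idea outside Alteryx, here is a rough Python sketch. It is not the accepted workflow, and it rests on an assumption about the file that is not confirmed in this thread: a row wrapped in outer quotes is an ordinary RFC 4180 record that was quoted a second time as if it were a single field (so its inner quotes are doubled again). The function name `parse_line` is made up for illustration.

```python
import csv


def parse_line(line: str) -> list[str]:
    """Split one line into fields, unwrapping an outer quote layer if present.

    ASSUMPTION: wrapped rows are RFC 4180 records quoted a second time as a
    single field. Adjust if the real file turns out to be built differently.
    """
    line = line.rstrip("\r\n")

    # Stream 1: rows that start and end with a quotation mark.
    if len(line) >= 2 and line.startswith('"') and line.endswith('"'):
        outer = next(csv.reader([line]))
        # Only treat the row as wrapped if the whole line decodes to ONE field;
        # otherwise it is a normal row whose first and last values are quoted.
        if len(outer) == 1:
            return next(csv.reader([outer[0]]))

    # Stream 2: rows without the extra wrapping are parsed once.
    return next(csv.reader([line]))


if __name__ == "__main__":
    samples = [
        '"a,""b,c"",""d""""e"""',  # wrapped row (under the assumption above)
        'x,y,z',                   # plain row
        '"b,c",d',                 # normal row, only the first value quoted
    ]
    for s in samples:
        print(parse_line(s))
```

Note that a line such as `"a,b"` is ambiguous: this sketch treats it as a wrapped row with two values, not as one value containing a comma, so it is worth checking a few real rows by hand.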