This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
Being new to Alteryx I have noticed that during the input process of CSV files I get warnings such as "An unescaped quote was found" with the record number identified. In my data this occurs because in some fields there are entries such as 24" (24 inches). I use the Formula tool and a Replace function to deal with the rouge quote during the workflow. My question is on the initial error, is there a way to resolve the error so that I know that the data moving through the workflow is getting cleaned? For instance, is there a way to create like a pre-sub-routine and stage the two workflows so that I get Results message for each routine? The reason I'm thinking of this approach is that in this sceanario, and many others, you do not get verification that your formula resolved all the warnings especially if the problem occurs in multiple columns and because the warning is only reported during the input process.
A similar problem occurs with rows of all Null values. The warning tells me that there are not enough fields in the record. I know why it occurs and I deal with it in the workflow but I would like to deal with the error during the input process and know it is resolved before the data hits the main part of the workflow. Thanks.
My data comes with everything in quotes, so I have to treat the input carefully otherwise I have more issues to deal with on the inside of the workflow. Both responses deal with the issue after the input warning occurs, and I've already done that. What I'm trying to do is eliminate the warnings that occur at the beginning so that I know the data is clean - ignoring the warnings can cause problems if my solutions don't catch everything. I don't see any other solution but to do a prescrub workflow, generate a new output that I use as input to the final workflow. The input messages on the final workflow then provides me with confirmation that I do not have data issues. Thanks to the both of you for your help..
Putting in a filter that will remove data all together if it isn't to spec would work instead. Get creative and use something like with (startswith([field_name],'"') OR right([field_name],2)='""'
Those in combination would give you any value that starts with a double quote or ends with 2 double quotes, that last piece should address for your column that has the legitimate double quote in it. Don't limit yourself to only checking the data on the input, other tools can address this issue for you along the way.
I have some nasty text files that were giving me problems too (not even csv, but prn, even though they can be interpreted as csv).
Here are the steps that work for me...
In the input tool, selected delimited and set the delimiter to none, and uncheck the box that says "First row contains headers".
Use the text to columns tool to split out your fields and set the delimiter to , (comma) or what I neede in my case is ,\t (comma and tab). Click to ignore delimiters in quotes and set the number of columns to whatever you need. I typically set this too high and then change the number of columns after everything's parsed.
Next, if your headers are not in the first row, use a sample tool to skip down to the header row.
Use a multi-field tool to strip the quotes "StripQuotes([_CurrentField_])".
Use the dynamic rename tool to get the field names from the first row.
Lastly, use a select tool to get rid of the first column because that column will be a concatenation of all your columns.
Is there an easy way to dynamically update the # of columns value in the 'text to columns' tool? I know I can do it with an action tool and update value with a formula, but struggling to figure out how to get the number passed into the action tool. I'm basically trying to package this up as a macro which reads the first line of data (which will contain field headers and I know will not contain extra quotes or special characters), counts the delimiters, and passes the output to the action tool. I'm sure there is an easy way to create a variable or something that can be referenced at run-time, but can't quite get there...