I am trying to learn Alteryx by taking a course at Udacity.
The course requires me to cleanse a data set.
In the csv file the delimiters are commas and there are some rows where the cells contain quotation marks.
The problem is that when I input the data, quotation marks are added around the first cell (marked yellow). They enclose the comma, which prevents the parsing. However, in the original file there are no quotation marks in the first cells.
I have attached my workflow and the input data file.
Below you can see my results.
Although I appreciate more complex solutions, I guess there is something wrong with my settings or the way I insert the data and I hope someone can give a hint, as the same problem may affect me in the future if I can't handle it now the right way.
I don't fully understand your question. But as far as I can tell, if your problem is to split the data on "," even when the commas are enclosed within quotes, then in your Text to Columns tool, uncheck "Ignore delimiters in quotes", which is currently checked.
Once you do this, the tool will split into columns even though the "," is enclosed within quotes.
P.S: In case this solves your query, kindly mark it as a solution.
The problem is that if I uncheck "Ignore delimiters in quotes", then the population numbers will also be split (see attached file "Unbenannt").
To solve this I need to know why quotation marks are added when I browse the data in Alteryx. The quotation marks around the City|County part are not there in the original file (see attached CSV file).
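To illustrate why unchecking the option also breaks the population column, here is a minimal Python sketch outside Alteryx. The sample row is made up for illustration (it is not taken from the attached file), and it assumes the population figures contain a comma as a thousands separator and are quoted for that reason.

```python
import csv
import io

# Hypothetical sample row: the population figure carries a thousands
# separator and is therefore wrapped in quotes in the file.
sample = 'Springfield|Greene,"159,498"\n'

# Naive split: every comma counts as a delimiter, so the number is torn apart.
naive = sample.strip().split(",")

# Quote-aware parse: commas inside quotes stay part of the field.
quote_aware = next(csv.reader(io.StringIO(sample)))

print(naive)        # three pieces, the number is split in two
print(quote_aware)  # two fields, the number stays whole
```

The naive split roughly corresponds to unchecking "Ignore delimiters in quotes", while the quote-aware parse corresponds to leaving it checked.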
I do not know if this is because of my Alteryx settings, other settings or if I am making another mistake.
Well, if you adjust the width of the first column in the csv file, you'll notice that the data split is inconsistent. Some line-items are combined within a single cell while some are split between 2 or more cells. See below:
You'll also notice that the quotes are added automatically only in those cases where the data is not split into 2 columns, that is, where the whole row sits in a single cell.
After some research online, I stumbled upon this answer, which addresses this issue.
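The effect can be reproduced with a small Python sketch. It assumes that the file was re-saved by a program that follows the usual CSV quoting convention (a field containing a comma or a quote is wrapped in quotes, and inner quotes are doubled). The cell contents are a made-up example, not a row from the attached file.

```python
import csv
import io

# Hypothetical cell contents: an entire row that ended up in one cell.
whole_row_in_one_cell = 'Springfield|Greene,"159,498"'

# Write that single cell back out as a CSV line with default quoting.
buffer = io.StringIO()
csv.writer(buffer).writerow([whole_row_in_one_cell])
saved_line = buffer.getvalue().rstrip("\r\n")

# The whole line is now wrapped in quotes and the inner quotes are doubled.
print(saved_line)
```

A line saved this way starts with a quotation mark that was never in the original data, which would match what shows up in front of the City|County part.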
I tried it with Notepad and it looked indeed like in Alteryx.
In the next step I downloaded the data as a ZIP folder, and that data is displayed correctly. However, if I download the file separately it does not work, so I suppose the cause is the way I save the data or something similar.
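If downloading a clean copy is not possible, the extra layer of quotes could in principle be undone before the file is read. The following is only a rough Python sketch under the assumption that damaged lines were wrapped in quotes as a whole, with inner quotes doubled. The function name `unwrap_line` and the sample lines are hypothetical.

```python
import csv
import io


def unwrap_line(line: str) -> list[str]:
    """Parse one CSV line, undoing one extra layer of quoting if present."""
    fields = next(csv.reader(io.StringIO(line)), [])
    # If the whole line collapsed into a single field that still contains
    # commas, treat it as a re-quoted row and parse the inner text again.
    if len(fields) == 1 and "," in fields[0]:
        fields = next(csv.reader(io.StringIO(fields[0])), [])
    return fields


damaged = '"Springfield|Greene,""159,498"""'  # hypothetical re-quoted line
intact = 'Springfield|Greene,"159,498"'       # hypothetical untouched line

print(unwrap_line(damaged))
print(unwrap_line(intact))
```

Both sample lines come out as the same two fields. A file with only one real column whose values contain commas would be misread by this heuristic, so it only fits data shaped like the example.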
Thanks a lot to both of you for guiding me to a solution for my problem.