Alteryx Designer

Find answers, ask questions, and share expertise about Alteryx Designer.
Register for the upcoming Live Community Q&A Session - and don't forget to submit your questions for @DeanS regarding the future role of analytics here.

Problem with data input

Highlighted
5 - Atom

Hello,

 

I am trying to learn Alteryx by taking a course at Udacity.

It is required to cleanse a data set.

 

In the csv file the delimiters are commas and there are some rows where the cells contain quotation marks.

The problem I have is that when I want to input the data, there are quotation marks added to the first cell (marked yellow) which incude the comma so that the parsing is prevented. However, in the original file there are no quotation marks in the first cells.

 

I have attached my work flow and the input data file.

Below you can see my results.

 

Unbenannt3.PNGUnbenannt4.PNG

 

 

Although I appreciate more complex solutions, I guess there is something wrong with my settings or the way I insert the data and I hope someone can give a hint, as the same problem may affect me in the future if I can't handle it now the right way.

 

Best Regards

 

 

Highlighted
8 - Asteroid

Hi @Rapha8 ,

I'm unable to exactly understand your query. But from whatever I could figure, if your problem is to parse data basis "," although they're enclosed within quotes, then in that case, in your Text to columns tool, uncheck the "Ignore delimiters in quotes" which is currently checked.

 

Once you do this, the tool will split into columns even though the "," is enclosed within quotes.

 

Happy learning!

 

Regards

Benn

 

P.S: In case this solves your query, kindly mark it as a solution. 

Highlighted
5 - Atom

Thank you @Bennel_Wilson for your response.

 

The problem is that if I uncheck "ignore delimiters in quotes" then the numbers of the population will also be splitted (see attached file "Unbenannt").

 

To solve this I need to know why there are qoutation marks added when I browse the data in Alteryx. The quotation marks arount the City|County part are not there in the original file (see attached CSV file).

 

I do not know if this is because of my Alteryx settings, other settings or if I am making another mistake.

Highlighted
8 - Asteroid

Hi @Rapha8 ,

 

Well, if you adjust the width of the first column in the csv file, you'll notice that the data split is inconsistent. Some line-items are combined within a single cell while some are split between 2 or more cells. See below:

 

Bennel_Wilson_0-1590070643739.png

And, you'll also notice that the quotes are automatically input in only those cases where the data is not split into 2 columns, basically in a single column

 

So after some research online, I stumbled upon this answer, which is guiding us with this query.

https://community.alteryx.com/t5/Alteryx-Designer-Discussions/txt-file-with-inconsistent-delimiters/...

 

 

And that is when, I noticed the input parameter in alteryx being "\0"

Bennel_Wilson_1-1590070707311.png

 

So "\0" essentially means, because of the inconsistent data in the csv file, alteryx is combining all the columns into a single column so that the data analysis we perform is consistent.

 

A couple ways to go forward from this point is mentioned in the answer that I've shared above. I hope this helps.

 

Regards

Benn

 

P.S In case this solves your query, kindly mark this as a solution.

 

 

Highlighted
12 - Quasar

I don't think quotes are being added to your data.  When you open a CSV file using Excel, the matched quotes are not displayed.

 

Open your CSV using a text editor like Notepad++ to see what the data really looks like.

 

Chris

Highlighted
5 - Atom

Hello you two,

 

I tried it with Notepad and it looked indeed like in Alteryx.

 

In the next step I downloaded the data in a ZIP folder and this data is displayed in the right way. However, if I download the data separately it does not work, so I suppose it may be the way I save the data or something like that.

 

Thanks a lot to both of you for guiding me to a solution for my problem.

I really appreciated your help.

 

Best Regards

Raphael

Highlighted
12 - Quasar

Here is an example where Excel and Notepad++ display CSV data differently.  In the first data row, Excel does not display the first double-quote character, but Notepad++ does.

 

With text files, it's usually helpful to use a text editor.

 

Annotation 2020-05-22 060856.png

 

Highlighted
5 - Atom

And learn something again.

 

Thank you.

Labels