Challenge #69: Web Stock Data

AndrewHoData
8 - Asteroid

I got stuck at the delimiter stage and looked at the solution to help solve it. Everything was okay after that!

 

Can anyone help me understand how we go about identifying which delimiter to use?

 

A png of my workflow below:

 

Spoiler

AndrewHoData_1-1665439905218.png

 

Kind regards,

 

Andrew

BenoitC
Alteryx

Done 

 

BenoitC_0-1665477886823.png

 

Benoit Conley

Sales Engineer
Alteryx, Inc.

TungThanhHo
8 - Asteroid

My solution

Kinga
8 - Asteroid

Hi,

 

Please find my solution below. I had to use the link mentioned earlier in the topic, https://datahub.io/machine-learning/mushroom/r/mushroom.csv, instead of the one in the starting workflow.

 

Spoiler
Kinga_0-1667653602038.png

 

grazitti_sapna
17 - Castor

My solution:

Sapna Gupta
Hiblet
10 - Fireball

This is in reply to @AndrewHoData ...

 

In this case, the data returned was a CSV file, and it came back as a single string in one cell. A CSV is a text file, and text files use the non-printing characters Line Feed (LF) and Carriage Return (CR) to break lines.

 

Depending on the system that generated the text file, it might use one or both of these characters (Windows typically writes CR followed by LF, while Unix-like systems write LF alone). LF has ASCII code 10 and CR has ASCII code 13. The challenge assumes you already know this, which is maybe a bit unfair, but once you know it, it is true forever. Knowing that LF or CR marks the end of each line, you can set the TextToColumns tool to split on "\n", which is the tool's way of referring to NewLine, i.e. LF, or on "\r", which is its way of referring to Return, i.e. CR.
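For anyone who wants to see the same idea outside Alteryx, here is a minimal Python sketch of splitting on line endings. The function name split_lines and the sample string are made up for illustration; this is not what the TextToColumns tool does internally.

```python
def split_lines(text: str) -> list[str]:
    """Split text into lines on CR LF, a lone LF, or a lone CR."""
    # Normalise Windows (CR LF) and lone-CR endings to LF first,
    # so that a single split on "\n" handles every case.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalised.split("\n")


# Hypothetical sample: three CSV lines separated by CR LF.
sample = "a,b\r\n1,2\r\n3,4"

# Splitting on "\n" alone leaves a stray "\r" at the end of each line.
naive = sample.split("\n")

# Normalising first gives clean lines, which can then be split on commas.
rows = [line.split(",") for line in split_lines(sample)]
```

The naive split still works for breaking the string into lines, but the leftover CR is worth trimming before the last field is parsed.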

 

If you want to examine a string to see the ASCII or Unicode values of its characters, what I do is save the data to a file and open it in Textpad. Textpad is a free text editor that can open files in binary mode, and it shows me the value of each character used. This is useful when processing HTML data, which can contain odd HTML-specific characters, such as the non-breaking space, that do not crop up in normal text files.
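If you do not have a binary-capable editor to hand, a rough equivalent can be sketched in Python. The helper name unusual_characters and the sample string are my own assumptions; the idea is simply to list every character outside the printable ASCII range, together with its position and code.

```python
import unicodedata


def unusual_characters(text: str) -> list[tuple[int, int, str]]:
    """Return (index, code point, name) for characters outside printable ASCII."""
    found = []
    for index, char in enumerate(text):
        code = ord(char)
        # Printable ASCII runs from 32 (space) to 126 (~).
        if code < 32 or code > 126:
            # Control characters have no Unicode name, so supply a default.
            found.append((index, code, unicodedata.name(char, "UNNAMED")))
    return found


# Hypothetical sample containing a non-breaking space, then CR LF.
sample = "price:\u00a010\r\n"
report = unusual_characters(sample)
```

Here the non-breaking space shows up with code 160, and the CR and LF show up with codes 13 and 10.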

 

Alternatively, the "challenge_69.yxzp" solution I have attached shows a way to do this in Alteryx. I truncate the string (to speed up processing), then split it to one char per row with one of my own macros. Then I convert each character to an Int value, which is its ASCII code. In the attached example, once the data is broken down to one char per record, records 301 and 302 show values 13 and 10, i.e. CR then LF. This tells me I can break the source string into lines with either \r or \n in the TextToColumns tool.
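The same steps (truncate, one char per row, convert to a code) can be sketched in Python for comparison. This is not the attached macro; the function name char_codes, the limit of 500 characters, and the sample string are all assumptions for illustration.

```python
def char_codes(text: str, limit: int = 500) -> list[tuple[int, str, int]]:
    """Return (record number, character, code) for the first `limit` characters."""
    # Truncate first to keep the output small, then number records from 1.
    return [(pos, ch, ord(ch)) for pos, ch in enumerate(text[:limit], start=1)]


# Hypothetical sample: two short CSV lines, each ending in CR LF.
sample = "x,y\r\n1,2\r\n"
rows = char_codes(sample)

# Keep only the records whose code is LF (10) or CR (13).
line_breaks = [(pos, code) for pos, _, code in rows if code in (10, 13)]
```

A 13 immediately followed by a 10 in consecutive records is the CR LF pair, the same pattern described for records 301 and 302 above.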

willstom24
7 - Meteor

Attached is my solution using the latest link.

 

Not a very elegant solution, and looking at some of the other solutions posted, it seems I went around the houses a bit... but it works!

DanielG
12 - Quasar

I used https://datahub.io/machine-learning/mushroom/r/mushroom.csv as well, because the link in the start file no longer works, and I was able to get it done with that.

 

ahsanaali
11 - Bolide
Spoiler
ahsanaali_0-1670867841226.png

 

SkomantasTamulaitis
8 - Asteroid

I also downloaded the data from here: https://datahub.io/machine-learning/mushroom/r/mushroom.csv

Spoiler
Screenshot 2022-12-13 170725.png