
Alteryx Designer Desktop Discussions

SOLVED

Web Scraping

Gaurav3
5 - Atom

Hello,

 

I am trying to retrieve article headlines and their associated tags from this website: https://www.cnbc.com/metals/

How can I scrape the data?

4 REPLIES
WilliamR
Alteryx

Hi @Gaurav3 ,

Use the Download tool for that purpose. You will then need to parse the returned HTML content to extract the desired data.
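For readers who want to see the same download-then-parse idea outside Designer, here is a minimal Python sketch using only the standard library. It is not the workflow in the screenshot. The class marker `Card-title` and the helper names `extract_headlines` and `download` are assumptions made for illustration; inspect the page source and adjust the marker to whatever the real markup uses.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class HeadlineParser(HTMLParser):
    """Collect the text of every <a> tag whose class contains a marker string."""

    def __init__(self, class_marker):
        super().__init__()
        self.class_marker = class_marker  # assumed class name, adjust to the real page
        self.headlines = []
        self._buffer = None  # text pieces of the matching link currently open

    def handle_starttag(self, tag, attrs):
        # Start buffering when a link carries the marker in its class attribute.
        if tag == "a" and self.class_marker in (dict(attrs).get("class") or ""):
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            # Collapse runs of whitespace so each headline is a single clean line.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.headlines.append(text)
            self._buffer = None


def extract_headlines(html, class_marker="Card-title"):
    """Return the headline texts found in an HTML string."""
    parser = HeadlineParser(class_marker)
    parser.feed(html)
    parser.close()
    return parser.headlines


def download(url):
    """Fetch a page and return its body as text (the Download tool step)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `extract_headlines(download("https://www.cnbc.com/metals/"))` would return a list of headline strings, provided the assumed class marker matches the page. If the site builds its content with JavaScript, the raw HTML may not contain the headlines at all.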

 

WilliamR_1-1577369703855.png

 

(If this post helps, then please consider it as the solution to help the other members find it more quickly).

GiuseppeC
Alteryx

Hi @Gaurav3,

 

In addition to what @WilliamR suggested, I noticed that the underlying HTML of the webpage you posted comes in an unusual format, so I added some logic as an example of how to parse it and identify the headlines.

 

See below and attached:

 

GiuseppeC_0-1577375471139.png

 

Hope this helps!

 

Giuseppe

fmvizcaino
17 - Castor

Hi @Gaurav3 ,

 

One suggestion, unrelated to the workflow itself, is to get the data from the RSS feed page. It will be easier to get all the headlines that way.

https://www.cnbc.com/rss-feeds/
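To illustrate why a feed is easier than raw HTML, here is a rough Python sketch that reads a standard RSS 2.0 document with the standard library. The helper names `parse_rss_items` and `fetch_feed` are made up for this example, and the feed URL has to be picked from the page above. Whether a given feed includes `<category>` elements (the closest RSS equivalent to tags) is an assumption; the code returns an empty list when they are absent.

```python
import xml.etree.ElementTree as ET
from urllib.request import Request, urlopen


def parse_rss_items(xml_data):
    """Return one dict per <item> in an RSS 2.0 document (str or bytes)."""
    root = ET.fromstring(xml_data)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            # <category> is optional in RSS 2.0; the list is empty if the feed omits it.
            "tags": [(c.text or "").strip() for c in item.findall("category")],
        })
    return items


def fetch_feed(feed_url):
    """Download a feed and return the raw bytes, ready for parse_rss_items."""
    request = Request(feed_url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return response.read()
```

With a feed URL chosen from the RSS page, `parse_rss_items(fetch_feed(feed_url))` gives a list of records with `title`, `link`, and `tags` fields, which maps naturally onto rows in a table.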

 

Best,

Fernando Vizcaino

 

Gaurav3
5 - Atom

Thank you! 🙂
