Alteryx Designer Desktop Discussions

SOLVED

Web Scraping

Gaurav3
5 - Atom

Hello,

 

I am trying to retrieve article headlines and their associated tags from this website: https://www.cnbc.com/metals/

How can I scrape the data?

WilliamR
Alteryx

Hi @Gaurav3 ,

Use the Download tool for that purpose. You then need to parse the HTML content it returns to extract the desired data.
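For readers who want to try the same download-then-parse idea outside the Download tool (for example in a Python tool), here is a minimal sketch using only the Python standard library. It is not the workflow from the screenshot. The names `HeadlineParser`, `extract_headlines` and `fetch_html` are made up for illustration, and the `Card-title` class marker is only an assumption about how the page marks headline links; check the actual HTML and adjust it.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class HeadlineParser(HTMLParser):
    """Collect the text of <a> tags whose class attribute contains a marker."""

    def __init__(self, class_marker):
        super().__init__()
        self.class_marker = class_marker  # assumed CSS class fragment
        self.headlines = []
        self._buffer = None  # text pieces of the link being read, else None

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class") or ""
        if tag == "a" and self.class_marker in css_class:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            # Collapse runs of whitespace inside the headline text.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.headlines.append(text)
            self._buffer = None


def extract_headlines(html, class_marker="Card-title"):
    """Return headline strings found in an HTML document."""
    parser = HeadlineParser(class_marker)
    parser.feed(html)
    parser.close()
    return parser.headlines


def fetch_html(url):
    """Download a page as text (the step the Download tool performs)."""
    # Some sites reject requests that lack a browser-like User-Agent.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")
```

Usage would be something like `extract_headlines(fetch_html("https://www.cnbc.com/metals/"))`. This sketch only pulls headline text; the tags would need a second rule once you know how the page marks them up.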

 

WilliamR_1-1577369703855.png

 

(If this post helps, then please consider it as the solution to help the other members find it more quickly).

GiuseppeC
Alteryx

Hi @Gaurav3,

 

In addition to what @WilliamR suggested: I noticed that the underlying HTML of the webpage you posted comes in an unusual format, so I added some logic as an example of how to parse it and identify the headlines.

 

See below and attached:

 

GiuseppeC_0-1577375471139.png

 

Hope this helps!

 

Giuseppe

fmvizcaino
17 - Castor

Hi @Gaurav3 ,

 

One suggestion, unrelated to the workflow itself, is to get the data from the RSS feed page. That will make it easier to get all the headlines.

https://www.cnbc.com/rss-feeds/
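To illustrate why a feed is easier, here is a rough Python sketch that reads headlines from an RSS 2.0 document with the standard library. It assumes the usual RSS layout (`item` elements with `title`, `link` and optional `category` children); whether a given CNBC feed fills in `category` is something to check. `parse_rss_items` and `fetch_feed` are hypothetical helper names, and the feed URL has to be picked from the page above.

```python
import xml.etree.ElementTree as ET
from urllib.request import Request, urlopen


def parse_rss_items(xml_text):
    """Return one dict per <item> with its title, link and categories."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            # <category> is optional in RSS 2.0, so this list may be empty.
            "categories": [
                c.text.strip() for c in item.findall("category") if c.text
            ],
        })
    return items


def fetch_feed(feed_url):
    """Download the raw feed bytes; feed_url is chosen by the caller."""
    request = Request(feed_url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return response.read()
```

Calling `parse_rss_items(fetch_feed(feed_url))` gives a list you can load straight into a table, with no HTML parsing needed.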

 

Best,

Fernando Vizcaino

 

Gaurav3
5 - Atom

Thank You! 🙂
