
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Web Scraping

Gaurav3
5 - Atom

Hello,

 

I am trying to retrieve article headlines and their associated tags from this website: https://www.cnbc.com/metals/

How can I scrape the data?

WilliamR
Alteryx

Hi @Gaurav3 ,

Use the Download tool for that purpose. You then need to parse the returned HTML content to extract the desired data.
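For anyone who wants to try the same download-then-parse idea outside of Designer, here is a minimal sketch using only the Python standard library. It is not the workflow from the screenshot. The `headline` class token and the sample markup are assumptions for illustration only; the real CNBC page uses its own markup, so inspect the page source and adjust the token before relying on it.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class HeadlineParser(HTMLParser):
    """Collects the text of <a> tags whose class attribute contains a token."""

    def __init__(self, class_token):
        super().__init__()
        self.class_token = class_token
        self.headlines = []
        self._buffer = None  # text fragments while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self._buffer is None:
            classes = dict(attrs).get("class") or ""
            if self.class_token in classes:
                self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            # Collapse runs of whitespace and line breaks into single spaces.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.headlines.append(text)
            self._buffer = None


def extract_headlines(html, class_token="headline"):
    """Return headline strings found in html (class_token is hypothetical)."""
    parser = HeadlineParser(class_token)
    parser.feed(html)
    parser.close()
    return parser.headlines


def fetch_html(url):
    """Download a page, roughly what the Download tool does in a workflow."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


# Made-up sample markup, not copied from the CNBC page.
SAMPLE = """
<div><a class="Card-title headline" href="/a">Gold  climbs
 as dollar slips</a></div>
<div><a class="nav" href="/b">Markets</a></div>
"""
print(extract_headlines(SAMPLE))
```

To run it against the live page, pass the result of `fetch_html("https://www.cnbc.com/metals/")` to `extract_headlines` with whatever class token you find in the page source.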

 

[Screenshot: WilliamR_1-1577369703855.png]

 

(If this post helps, then please consider it as the solution to help the other members find it more quickly).

GiuseppeC
Alteryx

Hi @Gaurav3,

 

In addition to what @WilliamR suggested, I noticed that the underlying HTML of the page you posted comes in an unusual format, so I added some logic as an example of how to parse it and identify the headlines.

 

See below and attached:

 

[Screenshot: GiuseppeC_0-1577375471139.png]

 

Hope this helps!

 

Giuseppe

fmvizcaino
17 - Castor

Hi @Gaurav3 ,

 

One suggestion, unrelated to the workflow itself, is to get the data from the RSS feed page instead. It will be easier to get all the headlines that way.

https://www.cnbc.com/rss-feeds/
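
To illustrate why a feed is easier to work with, here is a small Python sketch that reads headlines from RSS 2.0 XML with the standard library. The sample feed below is invented for the example, and whether a given CNBC feed carries `<category>` elements (which could serve as tags) is an assumption you would need to check against the actual feed you pick from the page above.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text):
    """Return one dict per <item> with its title, link and category list."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            # <category> is optional in RSS 2.0, so this list may be empty.
            "categories": [c.text.strip() for c in item.findall("category") if c.text],
        })
    return items


# Invented sample feed, not real CNBC data.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Gold climbs as dollar slips</title>
      <link>https://example.com/gold</link>
      <category>Metals</category>
    </item>
    <item>
      <title>Copper steady</title>
      <link>https://example.com/copper</link>
    </item>
  </channel>
</rss>"""

for entry in parse_rss_items(SAMPLE_FEED):
    print(entry["title"], entry["categories"])
```

Because every headline sits in its own `<item>` element, there is no need for the HTML pattern matching that the page scrape requires.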

 

Best,

Fernando Vizcaino

 

Gaurav3
5 - Atom

Thank You! 🙂
