

Use Alteryx to scrape a news site for stories on specific companies?

Henry_Gunn
6 - Meteoroid

Does anyone know if it's possible to have Alteryx scrape a news site (provided that I have the logins) and return the URL(s) of every article on the page that has a specific company name in the article or its title?

 

I am trying to automate a competitive analysis setup and would like all applicable news stories (based on the title or specific words in the body of the article) to be returned in a CSV file or, even better, in Excel.

 

I am new to Alteryx, please let me know if anyone can help.

pedrodrfaria
13 - Pulsar

Hi @Henry_Gunn 

 

You can scrape the site by using the Download tool to download the page's HTML. The information you want is embedded in that HTML, so you would then have to parse it out. It can become really tricky, but this is how you would need to do it.

 

Pedro.

cgoodman
ACE Emeritus

As @pedrodrfaria mentions, web scraping can be achieved by using the Download tool to return the underlying HTML. It is also possible to web scrape using Python code.
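For the Python route, here is a minimal sketch of the parsing step only, using just the standard library. It assumes the page's HTML has already been downloaded (for example by the Download tool) and that article titles appear as link text; `SAMPLE_HTML`, the `example.com` URLs, the company name "acme", and the helper names are all made up for illustration. A real news site will likely need different matching logic, and this does not handle logins or searching the article body.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (absolute URL, link text) pairs for every <a href=...>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []     # finished (url, text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Join fragments and collapse runs of whitespace.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def find_company_links(html, base_url, company):
    """Return links whose text mentions the company (case-insensitive)."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    needle = company.lower()
    return [(url, text) for url, text in parser.links if needle in text.lower()]


def to_csv(rows):
    """Render the matches as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["url", "title"])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical page content standing in for the downloaded HTML.
SAMPLE_HTML = """
<ul>
  <li><a href="/news/acme-q3">Acme Corp posts record Q3</a></li>
  <li><a href="/news/weather">Storm expected this weekend</a></li>
  <li><a href="https://example.com/news/acme-deal">Rival bids for <b>ACME</b> unit</a></li>
</ul>
"""

matches = find_company_links(SAMPLE_HTML, "https://example.com", "acme")
print(to_csv(matches))
```

The CSV text could be written to a file and opened in Excel, or the same rows could be passed back into an Alteryx workflow.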

 

While it is possible, some points to consider:

1) What are the T&Cs of the site you are scraping? Many sites have restrictions on scraping.
2) Is there an API available? If so, this would be the preferred option.
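
As one partial, automated check related to point 1, a site's robots.txt can be read with Python's standard `urllib.robotparser`. This is only a sketch: the `ROBOTS_TXT` rules, the `my-research-bot` user agent, and the URLs are invented for illustration, and robots.txt is just one signal. It does not replace reading the site's actual terms and conditions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real one would be fetched from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
Allow: /news/
"""


def allowed_urls(robots_txt, user_agent, urls):
    """Keep only the URLs that the given robots.txt rules permit."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [u for u in urls if rules.can_fetch(user_agent, u)]


candidates = [
    "https://example.com/news/acme-q3",
    "https://example.com/members/profile",
]
print(allowed_urls(ROBOTS_TXT, "my-research-bot", candidates))
```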

 

Chris