Alteryx Designer Desktop Discussions

Webscraping an Academic Calendar

bryanmac_92
8 - Asteroid

Hi,

 

How can I web scrape an academic calendar from a website? I am trying to output the bold text as a usable data table.

 

Here is the website: https://www.umassglobal.edu/news-and-events/academic-calendar 

 

Here is what I want to extract from the website, in this format:

Summer Session I – 2024 – April 29, 2024 – June 23, 2024
Summer Session II – 2024 – June 24, 2024 – August 18, 2024
Fall Session I – 2024 – August 26, 2024 – October 20, 2024
Fall Session II – 2024 – October 21, 2024 – December 15, 2024
Spring Session I – 2025 – January 6, 2025 – March 2, 2025
Spring Session II – 2025 – March 3, 2025 – April 27, 2025
Summer Session I – 2025 – April 28, 2025 – June 22, 2025
Summer Session II – 2025 – June 23, 2025 – August 17, 2025

 

 

Best,

Bryan

3 REPLIES
cmcclellan
13 - Pulsar

There's always a "what's it worth" question that you should think about when you do any work like this.

As a consultant I would say: you've got the table already (because you posted it) and it won't change (unless you want to pick up 2026 dates), so the manual effort here is minimal, while the effort to get a workflow running and perfect is a lot more. Usually it's the other way around, and building a workflow that you can re-run to get different results pays off far more than doing the work manually.

As a learning exercise (as in, who cares, I want to write the workflow anyway) I'd use the Download tool and then process the HTML. You want to extract all the H3-formatted text and then filter that down to only the H3s that you want.
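
For anyone who wants to prototype that "grab the H3s, then filter" logic outside Designer first, here is a minimal Python sketch. It is only an illustration under assumptions: the sample HTML is made up (the real page's markup may differ), the names `H3Collector`, `parse_sessions` and `SAMPLE_HTML` are hypothetical, and the year column is simply taken from the start date, which happens to match the table in the question.

```python
import re
from html.parser import HTMLParser


class H3Collector(HTMLParser):
    """Collect the text content of every <h3> element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._buffer = None  # None means we are currently outside an <h3>

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self._buffer is not None:
            # Join the text pieces and collapse runs of whitespace.
            self.headings.append(" ".join("".join(self._buffer).split()))
            self._buffer = None


# A date such as "April 29, 2024".
DATE = r"[A-Z][a-z]+ \d{1,2}, \d{4}"

# Assumption: the headings we want start with a session name and contain
# two dates (start and end). Other H3s simply fail to match.
SESSION = re.compile(
    r"^(?P<session>(?:Fall|Spring|Summer) Session I{1,2})\b.*?"
    r"(?P<start>" + DATE + r").*?(?P<end>" + DATE + r")"
)

# Hypothetical markup, NOT copied from the real page.
SAMPLE_HTML = """
<h3>Academic Calendar</h3>
<h3><strong>Summer Session I</strong> 2024: April 29, 2024 - June 23, 2024</h3>
<p>Registration details.</p>
<h3>Summer Session II 2024: June 24, 2024 - August 18, 2024</h3>
"""


def parse_sessions(html):
    """Return one dict per H3 heading that looks like a session."""
    collector = H3Collector()
    collector.feed(html)
    rows = []
    for heading in collector.headings:
        match = SESSION.search(heading)
        if match:
            start = match.group("start")
            rows.append({
                "session": match.group("session"),
                "year": start[-4:],  # assumption: year taken from the start date
                "start": start,
                "end": match.group("end"),
            })
    return rows


for row in parse_sessions(SAMPLE_HTML):
    print(" – ".join([row["session"], row["year"], row["start"], row["end"]]))
```

In Designer the same two steps would roughly correspond to the Download tool followed by a RegEx tool and a Filter, with the pattern adjusted to whatever the downloaded HTML actually contains.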

aatalai
14 - Magnetar

@bryanmac_92 take a look at the workflow attached and let me know how you get on

Raj
15 - Aurora

@bryanmac_92 
Find the workflow attached.
Mark as done if solved.
