Web Scrape Branch Details
Hi all -
I'd like to scrape branch details from the ukps.com/stores.html website.
Down the right-hand side of the page is a list of branches, each with a city, phone number, website, email and address. I'd like to scrape that info into a table with one column for each of those elements.
Is this possible? I fear this is way beyond my understanding!
Thanks for any assistance you might be able to provide.
RDF
Yes, it is possible.
The addresses you want are in the generated HTML.
Use the Download tool to fetch the page, then write a regex to extract the fields you need.
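For anyone who wants to see what the regex step looks like outside a workflow, here is a rough sketch in Python. It assumes the page HTML has already been downloaded into a string. The patterns are loose guesses (a generic email pattern and a UK-style phone number starting with 0 or +44), and the sample HTML is made up, so expect to adjust both against the real page source.

```python
import re

# Generic email pattern; an assumption, not tuned to the real page.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Loose UK-style phone pattern (starts with 0 or +44); also an assumption.
PHONE_RE = re.compile(r"(?:\+44\s?\d|0\d)[\d\s]{8,12}\d")


def unique(items):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


def extract_emails(html):
    """Return every distinct email address found in the HTML text."""
    return unique(EMAIL_RE.findall(html))


def extract_phones(html):
    """Return every distinct phone-like number found in the HTML text."""
    return unique(PHONE_RE.findall(html))


# Made-up sample standing in for the downloaded page.
sample = (
    '<p>Leeds: 0113 496 0000 '
    '<a href="mailto:leeds@example.com">leeds@example.com</a></p>'
)
print(extract_emails(sample))
print(extract_phones(sample))
```

Regex on raw HTML is quick to set up but brittle: if the page layout changes, or a number is formatted differently, the patterns will need revisiting.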
Thanks for the response, but as mentioned, I think this is a bit beyond my current skill level. A bit more guidance on how to build the flow would be appreciated.
RDF
Looking at the HTML, I don't think the data is actually in a table.
Any help?
Hey, @RDF25087.
To simplify this process, I would use Python. In my opinion, it will be easier to solve and easier to understand.
I suggest looking into requests (to send the HTTP GET request) and Beautiful Soup (to parse the HTML).
I believe these are the main libraries you need. It looks like the data you want is in the HTML itself and not loaded dynamically with JavaScript, so something like Selenium won't be necessary.
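To give an idea of the shape of it, here is a minimal sketch using requests and Beautiful Soup. The class names `branch`, `city`, `phone` and `address` are hypothetical placeholders, because the page's real markup isn't shown in this thread; view the page source and swap in whatever tags or classes actually wrap each branch. The `https://` scheme on the URL is also an assumption.

```python
import requests
from bs4 import BeautifulSoup

# URL from the original question; the https:// scheme is an assumption.
URL = "https://ukps.com/stores.html"


def fetch_html(url):
    """Send an HTTP GET request and return the page body as text."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # raise if the server returned an error status
    return response.text


def _text(block, selector):
    """Return the stripped text of the first match, or '' if there is none."""
    node = block.select_one(selector)
    return node.get_text(" ", strip=True) if node else ""


def parse_branches(html):
    """Turn the page HTML into a list of one dict per branch."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # "div.branch" and the class names below are hypothetical placeholders.
    for block in soup.select("div.branch"):
        email_link = block.select_one('a[href^="mailto:"]')
        site_link = block.select_one('a[href^="http"]')
        rows.append({
            "city": _text(block, ".city"),
            "phone": _text(block, ".phone"),
            "address": _text(block, ".address"),
            "email": email_link["href"][len("mailto:"):] if email_link else "",
            "website": site_link["href"] if site_link else "",
        })
    return rows


# Usage (needs network access):
#   for row in parse_branches(fetch_html(URL)):
#       print(row)
```

Each dict has the same five keys, so the result can be written straight to a CSV with `csv.DictWriter` or loaded into a table from there.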
@geraldo @acarter881 - thank you both for your support - it's much appreciated.
@geraldo - when I try to run your workflow above I get the attached error on the download.
Thanks
RDF
This error message looks related to your internet access.
For me it runs perfectly, and I don't have a proxy or firewall enabled.
Can you open the URL in a browser?
Thanks for the reply. I work for a large business that does everything it can to make downloading from or accessing the internet as difficult as possible. When I get home I'll try running it on my home network.
