Web Scraping using the Download Tool to extract info from Inspect Element
Hello,
From my understanding, this data is being pulled in from a database using jQuery. I am not aware of a way to scrape it without getting access to the database itself.
Thanks,
Nick
So there's a fun learning curve to effective web scraping, and you need to be careful: scraping a company's data can violate its terms of service. You can also get your IP permanently banned from their website (so you and anyone else in your company can never go there again). That said, if data is displaying on a webpage, then it is being sent to your browser somehow, and you just have to figure out where it's coming from.
If you are using Chrome, you can go to the options menu >> More tools >> Developer tools to open the developer tools window. Go to the page you want to investigate, click the Network tab in developer tools, and hit "Record". Refresh the page and wait for the Network tab to fill up, then click "Stop Recording". The Network tab will show you all of the individual documents/files the website sends to your browser. Then all you have to do is find the request that returns the JSON content. See the image below for an example from the website you linked.
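If clicking through every request by hand gets tedious, one option is to export the recording from the Network tab as a HAR file and filter it with a short script. This is only a rough sketch: the function names are mine, and it assumes the server labels its JSON responses with a MIME type containing "json", which is common but not guaranteed.

```python
import json


def load_har(path):
    """Read a HAR file exported from the browser's Network tab."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def find_json_responses(har):
    """Return (url, size) pairs for requests whose response looks like JSON.

    Assumption: JSON responses carry a MIME type containing "json".
    """
    hits = []
    for entry in har.get("log", {}).get("entries", []):
        content = entry.get("response", {}).get("content", {})
        mime = content.get("mimeType", "")
        if "json" in mime.lower():
            hits.append((entry["request"]["url"], content.get("size", 0)))
    return hits
```

Calling `find_json_responses(load_har("recording.har"))` (the file name is just a placeholder) would give you a short list of candidate URLs to inspect instead of the full request log.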
On this page, you can click through each of the files to examine its contents. It looks like the file named 1238 is the JSON response containing the data you're after. If you double-click it, you'll see it links to a webpage that contains all of the data. You can download that file using the Download tool pointed at that URL (it looks like it just requires the mall ID number to switch between malls), or just do a Save As and parse it later. I've searched through that JSON document and it contains the store names and square footage you're after.
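Outside of the Download tool, the same idea (build the URL from an ID, fetch the JSON, pull out the fields you want) can be sketched in a few lines of Python. Everything specific here is a placeholder: the URL template, the `stores` list key, and the `name` / `square_feet` field names are hypothetical and would need to match whatever the real JSON response contains. Only point it at a site whose terms allow this.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical URL pattern; the real one is whatever the Network tab shows.
URL_TEMPLATE = "https://example.com/api/malls/{mall_id}"


def build_url(mall_id, template=URL_TEMPLATE):
    """Swap the mall ID into the URL template."""
    return template.format(mall_id=mall_id)


def fetch_json(url, timeout=30):
    """Download a URL and parse the response body as JSON."""
    req = Request(url, headers={"Accept": "application/json"})
    with urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def extract_stores(payload, list_key="stores",
                   name_key="name", area_key="square_feet"):
    """Return (store name, square footage) pairs from a parsed payload.

    The key names are assumptions for illustration only.
    """
    return [
        (store.get(name_key), store.get(area_key))
        for store in payload.get(list_key, [])
    ]
```

Keeping the parsing in its own function means you can also run `extract_stores` on a file you saved with Save As (after `json.load`), without making any network request.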
Again, buyer beware. Most companies don't want you scraping their websites and explicitly make it a violation of their TOS.
EDIT: just to close the loop on this, if you look at the terms and conditions located at: https://centrecorp.net/terms-conditions/ you'll see it explicitly prohibits scraping (see below), so use this as a lesson on how scraping works but don't violate their terms and conditions. Under Acceptable Uses, it states the user must not:
This website was just a random example of the concept I was trying to figure out. We all, 100%, need to check a site's policies before we extract any info.
It was a fun exercise to learn from, thanks for your response!
