SOLVED

Web Scraping using the Download Tool to extract info from Inspect Element

morioren
6 - Meteoroid
 
3 REPLIES
NickC
Alteryx Alumni (Retired)

Hello,

 

From my understanding, this data is being pulled in from a database using jQuery. I am not aware of a way to scrape this data without getting access to the database itself.

 

Thanks,

Nick

david_fetters
11 - Bolide

So there's a fun learning curve to effective web scraping, and you need to be careful, because scraping a company's data can be a violation of its terms of service. You can also get your IP address permanently banned from their website (so you and anyone else in your company sharing that address can no longer reach it). That said, if data is displayed on a webpage, then it is being sent to your browser somehow, and you just have to figure out where it's coming from.

 

If you are using Chrome, open the menu and go to More tools >> Developer tools to open the developer tools window. Go to the page you want to investigate, click the Network tab in developer tools, and hit "Record". Refresh the page and wait for the Network tab to fill up, then click "Stop recording". The Network tab will show you all of the individual documents/files being sent to your browser by the website. Then all you have to do is find the entry that returns the JSON content. See the attached screenshot (Capture.PNG) for an example from the website you linked.
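If the Network tab has a lot of entries, one way to narrow it down is to export the recording as a HAR file from the Network tab and filter it for JSON responses. Here's a minimal Python sketch of that idea. It only assumes the standard HAR layout (log >> entries >> request/response); the helper names, the file name in the usage note, and the example.com URLs in the sample are made up for illustration.

```python
import json


def load_har(path: str) -> dict:
    """Read a HAR file (it is plain JSON) exported from the browser."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def json_responses(har: dict) -> list:
    """Return the URLs of all recorded requests whose response was JSON."""
    urls = []
    for entry in har.get("log", {}).get("entries", []):
        mime = entry.get("response", {}).get("content", {}).get("mimeType", "")
        if "json" in mime.lower():
            urls.append(entry["request"]["url"])
    return urls


# Hand-made HAR fragment standing in for a real export (made-up URLs).
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.com/index.html"},
                "response": {"content": {"mimeType": "text/html"}},
            },
            {
                "request": {"url": "https://example.com/api/items/1"},
                "response": {
                    "content": {"mimeType": "application/json; charset=utf-8"}
                },
            },
        ]
    }
}

print(json_responses(sample_har))
```

With a real export you would call something like json_responses(load_har("network.har")) instead of using the sample dictionary. Some sites return JSON with a different content type, so treat the filter as a first pass, not a guarantee.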

On this page, you can click through each of the entries to examine its contents. It looks like the entry named 1238 is the JSON response containing the data you're after. If you double-click it, you'll see it opens a URL that returns all of the data. You can download that file using the Download tool pointed at that URL (it looks like it just requires the mall ID number to switch between malls), or just save it with Save As and parse it later. I've searched through that JSON document, and it contains the store names and square footage you're after.
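Outside of Designer, the same download-then-parse idea looks roughly like the Python sketch below (in a workflow this would be the Download tool followed by the JSON Parse tool). Everything specific here is an assumption: the endpoint template is a placeholder, not the real URL, and the "name" and "sqft" keys are made-up field names, since the actual JSON layout isn't reproduced in this thread. Only run something like this against a site whose terms allow it.

```python
import json
import urllib.request

# Hypothetical endpoint; the real URL is whatever the Network tab shows.
ENDPOINT_TEMPLATE = "https://example.com/api/malls/{mall_id}"


def mall_url(mall_id: int) -> str:
    """Build the request URL for one mall by swapping in its ID."""
    return ENDPOINT_TEMPLATE.format(mall_id=mall_id)


def fetch_json(url: str, timeout: float = 30.0):
    """Download a URL and decode the body as JSON (not called in this sketch)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def find_records(node, required_keys):
    """Recursively collect every dict that contains all of required_keys."""
    found = []
    if isinstance(node, dict):
        if all(key in node for key in required_keys):
            found.append(node)
        for value in node.values():
            found.extend(find_records(value, required_keys))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_records(item, required_keys))
    return found


# Made-up document standing in for the downloaded JSON (hypothetical keys).
sample = {
    "mall": {
        "id": 1,
        "stores": [
            {"name": "Store A", "sqft": 1200},
            {"name": "Store B", "sqft": 800, "extra": {"floor": 2}},
        ],
    }
}

rows = [(rec["name"], rec["sqft"]) for rec in find_records(sample, ("name", "sqft"))]
print(rows)
```

The recursive search is just a convenience for when you don't know how deeply the records are nested. Once you've looked at the real document, indexing straight into the right list is simpler.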

 

Again, buyer beware.  Most companies don't want you scraping their websites and explicitly make it a violation of their TOS.

 

EDIT: Just to close the loop on this, if you look at the terms and conditions located at https://centrecorp.net/terms-conditions/ you'll see they explicitly prohibit scraping (see the attached screenshot, Capture.PNG), so use this as a lesson on how scraping works but don't violate their terms and conditions. The relevant language is under Acceptable Uses, in the list of things the user must not do.

morioren
6 - Meteoroid
Thanks so much for your reply.
This website was just a random example of the concept I was trying to figure out. 100% we all need to check policies before we extract any info.
It was a fun exercise to learn from, thanks for your response!