Alteryx Designer Desktop Discussions

SOLVED

Download Tool to scrape multiple pages where URL doesn't change

AcevedoYo
8 - Asteroid

Team - I'm using the Download tool to scrape information off of this website: https://www.aacsb.edu/accreditation/accredited-schools

 

I'm nearly there: I've managed to parse all of the information I'm looking for and join everything together for the first page. I'm now ready to expand my solution to all the pages of schools. The only problem is that when I select "View All" at the bottom of the page, the URL doesn't update, so I'm not sure how to download the entire list of available schools.

 

Workflow attached (TEST). What am I missing here? I feel like it's something simple, but I'm a bit of a newbie at this.

 

_____________________________________________

EDIT: 2021-04-27

Thanks to @TheOC and @mceleavey for helping out. I did have to redesign the parsing sections a bit, but was able to make quick work of it. Final workflow attached below (v2).

 

Output looks like this:

YomaraA_0-1619547857366.png

 

9 REPLIES
Qiu
20 - Arcturus

@AcevedoYo 
This is what I get when I click your URL:

Qiu_0-1619493613345.png

 

AcevedoYo
8 - Asteroid

Oh, really? That's odd... Link is working for me. Any difference when you try the home page? https://www.aacsb.edu/

mceleavey
17 - Castor

Hi @AcevedoYo ,

 

This requires some wizardry in the Payload/Header section.

I've attached a workflow which will return all results, courtesy of @TheOC . He named the workflow, but I thought it was accurate.
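
For readers without the attached workflow, here is a minimal Python sketch of the general idea: when "View All" does not change the URL, the page is usually sending a POST request whose body carries the paging parameters, and those values are what would go into the Download tool's Payload and Headers tabs. The endpoint and the field names `page` and `pageSize` below are placeholders assumed for illustration; they are not taken from the AACSB site or from the attached workflow. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint: NOT the real AACSB endpoint. The real one would be
# read from the browser's network tab when "View All" is clicked.
ENDPOINT = "https://example.com/search"


def build_view_all_request(page_size: int) -> Request:
    """Build (but do not send) a POST request asking for page_size rows."""
    # Assumed parameter names, for illustration only.
    payload = {"page": 1, "pageSize": page_size}
    body = urlencode(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "User-Agent": "Mozilla/5.0",
    }
    return Request(ENDPOINT, data=body, headers=headers, method="POST")


req = build_view_all_request(1000)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

The URL stays the same for every page; only the request body changes, which is why editing the URL alone cannot reach the full list.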

 

Hope this helps,

 

M.




AcevedoYo
8 - Asteroid

I could cry. Thank you so much @mceleavey and @TheOC !!! The people on the internet never fail me.

 

Just goes to show there is always so much to learn!

 

you_are_a_wizard.gif

 

mceleavey
17 - Castor

no problem.gif




AbhilashR
15 - Aurora

@mceleavey - really cool work! Would you be able to walk us through the steps you took to construct the HTTP POST parameters? I am a novice when it comes to this and would love to learn.

mceleavey
17 - Castor

Hi @AbhilashR ,

 

No problem, but it's a bit complicated so @TheOC is going to write it up in a blog post. Watch this space.
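
Until that write-up exists, one common way to arrive at POST parameters (an assumption about general practice, not a description of how this workflow was built) is to copy the form body the browser sends from its network tab, split it into name/value pairs, and change only the paging value. The captured string and the `pageSize` field below are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode

# Made-up example of a form body copied from the browser's network tab.
captured = "page=1&pageSize=25&sort=name"


def with_overrides(raw_body: str, **overrides: str) -> str:
    """Decode a form-encoded body, replace selected fields, re-encode it."""
    # keep_blank_values=True preserves fields the site sends empty.
    fields = dict(parse_qsl(raw_body, keep_blank_values=True))
    fields.update(overrides)
    return urlencode(fields)


new_body = with_overrides(captured, pageSize="500")
print(new_body)
```

The resulting name/value pairs are what would be entered as payload fields; any headers the browser sent alongside them may also need to be copied.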

 

If you need it earlier, DM me and we'll arrange a session to go over it.

 

M.




TheOC
15 - Aurora

Hey @AcevedoYo,
It's a pleasure, glad that sorted it for you!

There's definitely always something to learn with Alteryx, you're 100% right. If you're interested in how I tackled this one, as @mceleavey said, we're looking to write a blog post on it sometime, as I think it's something that could be used in many use cases, and I've not seen it documented much. If you have any questions, or want to know about it sooner, I'm happy to chat on here or on a call!

Cheers!


FaeDalton
5 - Atom

I am attempting this same thing and I cannot figure out how you did this.

I am trying to pull Google Scholar Profile information. 

 

This is the website I am attempting to pull from: Profiles (google.com)
