Parse

Simon1187
9 - Comet

Hi @Qiu,

 

I hope you are doing well. 

 

I am trying to parse the HTML file. Could you please help me with that? 

 

Thanks, 

2 REPLIES
danilang
19 - Altair

Hi @Simon1187 

 

What are you trying to find in this file? Viewed in Chrome, it contains only script functions and no data. It looks like the output of an initial web call, which then uses the URLs embedded in it to fetch the actual data. The key URL here seems to be the call to HTTP://api.autopilothq.com/anywhere/... that contains the expired token. This is probably the next call in the chain, and the data may come from it.

 

Edit: I just realized that you are the same person I answered yesterday, who wanted to scrape data from 4000 pages. Judging from this first sample, you'll need a Selenium-based solution to prerender the complete pages so that you can scrape the data from the final rendered page.
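
For illustration only, here is a minimal sketch of the prerendering idea in Python. It assumes the `selenium` package and a local Chrome install; the function names `render_page` and `render_all` are hypothetical and not part of any Alteryx tool. Waiting for `document.readyState` is only a rough signal, and pages that load data later may need a wait on a specific element instead.

```python
from typing import Callable, Dict, Iterable


def render_page(url: str, timeout: float = 20.0) -> str:
    """Load url in headless Chrome and return the HTML after scripts have run."""
    # Imported lazily so the rest of this module works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the browser reports the document as fully loaded.
        WebDriverWait(driver, timeout).until(
            lambda d: d.execute_script("return document.readyState") == "complete"
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the page failed to load.
        driver.quit()


def render_all(
    urls: Iterable[str], render: Callable[[str], str] = render_page
) -> Dict[str, str]:
    """Render each URL; a failed page is stored as an empty string."""
    pages: Dict[str, str] = {}
    for url in urls:
        try:
            pages[url] = render(url)
        except Exception as exc:
            print(f"failed to render {url}: {exc}")
            pages[url] = ""
    return pages
```

With 4000 pages, storing failures instead of stopping lets the run finish, so the empty entries can be retried afterwards.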

 

The alternative is to add Download tools that call the embedded URLs, one after another, until you reach the final data you need. At each step, look at the resulting HTML in Chrome to help you find the possible links. That's how I found the autopilot link. Since all the pages that you're searching are created by the same team, the format of the pages and the number of embedded calls will probably be the same, so a single workflow will probably be able to handle them all.
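
As a rough sketch of the "find the embedded links" step, the Python below pulls absolute URLs out of a downloaded HTML or script body. The function name `find_embedded_urls` and the regular expression are assumptions made for this example; a simple pattern like this will miss URLs that the page builds at runtime.

```python
import re
from typing import List

# Matches http:// or https:// (any case) up to a quote, bracket, backslash or whitespace.
URL_PATTERN = re.compile(r"""https?://[^\s"'<>\\)]+""", re.IGNORECASE)


def find_embedded_urls(html: str) -> List[str]:
    """Return the distinct absolute URLs found in html, in order of first appearance."""
    found: List[str] = []
    for match in URL_PATTERN.findall(html):
        url = match.rstrip(".,;")  # drop trailing punctuation picked up from code or prose
        if url not in found:
            found.append(url)
    return found


# Hypothetical page body, standing in for the output of a first download.
sample = (
    '<script>fetch("HTTP://api.example.com/anywhere/abc?token=1");'
    "var u='https://cdn.example.com/app.js';"
    'fetch("HTTP://api.example.com/anywhere/abc?token=1");</script>'
)
for url in find_embedded_urls(sample):
    print(url)
```

Each URL found this way is a candidate for the next download in the chain.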

 

Dan

danilang
19 - Altair

Hello @Simon1187 

 

As you mentioned privately, you are trying to connect to a Cube.js site. Your best option is probably to connect via their API. This avoids having to perform multiple web downloads and scraping. Once you get a key to access the site, you'll be able to access the data directly. You can check this article for the basics on connecting to a REST API and this one for an introduction to authentication.
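
To give a feel for what such a call might look like, here is a hedged Python sketch that builds (but does not send) a request to a Cube REST endpoint. It assumes the site exposes Cube's default `/cubejs-api/v1/load` path and accepts the key in the `Authorization` header, which a given deployment may change. The host, token, and the `Orders.count` measure are placeholders, and `build_cube_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request


def build_cube_request(base_url: str, token: str, query: dict) -> urllib.request.Request:
    """Build a GET request for the (assumed) default Cube REST load endpoint."""
    # The query is sent as a JSON string in the "query" URL parameter.
    params = urllib.parse.urlencode({"query": json.dumps(query)})
    url = f"{base_url.rstrip('/')}/cubejs-api/v1/load?{params}"
    return urllib.request.Request(url, headers={"Authorization": token})


# Placeholder values; replace with the real host, key and cube members.
request = build_cube_request(
    "https://cube.example.com",
    "YOUR_API_TOKEN",
    {"measures": ["Orders.count"]},
)
print(request.full_url)

# Sending it would look like this (not executed here):
# with urllib.request.urlopen(request) as response:
#     rows = json.load(response)["data"]
```

In Alteryx, the same URL and header could be passed to a Download tool rather than sent from Python.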

 

Dan
