Alteryx Designer Desktop Discussions

SOLVED

Web Scraping Javascript in Alteryx

JDG0711
5 - Atom

Good morning, I'm trying to scrape a table with changing data in Alteryx from the following site: https://www.adr.com/drprofile/66987V109

 

I just want the table with the following data:

 

DRs Outstanding 201,978,216

DR Market Cap 21.23B USD

Underlying Shares Outstanding 2,068,264,000

Company Market Cap 217.44B USD

DR% of Company Market Cap 9.77%

 

When I try using the Download tool, the output is: "<!doctype html><html lang="en"><head><meta charset="utf-8"/><meta name="viewport" content="width=device-width,initial-scale=1,shrink-to-fit=no"/><meta http-equiv="X-UA-Compatible" content="IE=edge"/><meta name="theme-color" content="#000000"/><link rel="manifest" href="/manifest.json"><link rel="shortcut icon" href="/favicon.ico"><title>J.P. Morgan's adr.com | The premier site for the global investor</title><script defer="defer" src="/static/js/main.8b0acd94.js"></script></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="app"></div><script src="/globals.js"></script></body></html>"

 

Would anyone be able to tell me if it's possible to use the download tool to get this portion of the table without error?
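For anyone hitting the same output: the response quoted above is only the empty shell of a JavaScript app (an empty <div id="app"> plus script tags), so the table is simply not in it. A minimal Python sketch of how one might check for that situation is below. The names ShellDetector and looks_like_js_shell are made up for illustration, and the sample string is an abridged copy of the output above.

```python
from html.parser import HTMLParser


class ShellDetector(HTMLParser):
    """Counts <table> tags and collects visible body text,
    ignoring anything inside <script> or <noscript>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.text = []
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "noscript"):
            self.skip_depth += 1
        elif tag == "table":
            self.tables += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "noscript") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and not self.skip_depth and data.strip():
            self.text.append(data.strip())


def looks_like_js_shell(html):
    """True when the body has no table and no visible text,
    i.e. the content is filled in later by JavaScript."""
    parser = ShellDetector()
    parser.feed(html)
    parser.close()
    return parser.tables == 0 and not parser.text


# Abridged copy of the Download tool output quoted in this post.
SHELL = (
    '<!doctype html><html lang="en"><head>'
    "<title>J.P. Morgan's adr.com | The premier site for the global investor</title>"
    '<script defer="defer" src="/static/js/main.8b0acd94.js"></script></head>'
    "<body><noscript>You need to enable JavaScript to run this app.</noscript>"
    '<div id="app"></div><script src="/globals.js"></script></body></html>'
)

print(looks_like_js_shell(SHELL))
```

If this kind of check comes back true, no amount of parsing the downloaded HTML will yield the table; the data has to come from whatever request the page's JavaScript makes.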

 

ArnaldoSandoval
12 - Quasar

Hi @JDG0711 

 

I tried Alteryx's Download tool to no avail: the JP Morgan page is rendered dynamically, and the Download tool doesn't retrieve the rendered content. I then tried Python, and it failed because the data to download (screenshot below) is a table without a header 😲, and Pandas does not handle header-less tables, or my knowledge is limited.
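For what it's worth, a table without a header row can still be read as plain label/value rows once the rendered HTML is in hand. Below is a minimal sketch using only Python's standard library. The markup in SAMPLE_TABLE is an assumption (the real page's markup is not shown in this thread), and RowCollector and parse_headerless_table are hypothetical names; the values are the ones from the original question.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collects every <tr> as a list of cell strings; no header row is expected."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_headerless_table(html):
    """Treat each two-cell row as (label, value) and return a dict."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return {row[0]: row[1] for row in parser.rows if len(row) == 2}


# Assumed markup, for illustration only.
SAMPLE_TABLE = """
<table>
  <tr><td>DRs Outstanding</td><td>201,978,216</td></tr>
  <tr><td>DR Market Cap</td><td>21.23B USD</td></tr>
  <tr><td>DR% of Company Market Cap</td><td>9.77%</td></tr>
</table>
"""

for label, value in parse_headerless_table(SAMPLE_TABLE).items():
    print(label, "=", value)
```

This only helps once the rendered table is available; it does not solve the dynamic-rendering problem itself.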

 

JPMorgan-01.png

 

Arnaldo

sparksun
11 - Bolide

I've got all the data from the URL; you just need to do some further data manipulation to get it into the format you like.

Ben_H
11 - Bolide

Hi @JDG0711,

 

Sparksun has the right solution here, but I thought I would add a bit more information.

 

They have captured the API request from the page itself to get at the data you're looking for. You can do this yourself by using your browser's inspect function and looking at the network requests.
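As a rough illustration of the idea outside Alteryx, here is a minimal Python sketch of calling a captured endpoint directly and listing what the JSON contains. The URL is a placeholder, not the real endpoint (the actual request is in the attached workflow and is not reproduced here), and fetch_json and flatten are hypothetical helper names.

```python
import json
import urllib.request

# Placeholder only: substitute the request URL you capture in the browser.
HYPOTHETICAL_API_URL = "https://example.com/api/drprofile/66987V109"


def fetch_json(url, timeout=30):
    """GET a URL and decode the response body as JSON."""
    request = urllib.request.Request(
        url,
        headers={"Accept": "application/json", "User-Agent": "Mozilla/5.0"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def flatten(obj, prefix=""):
    """Yield (path, value) pairs for every leaf in nested JSON,
    which makes it easy to spot the fields you want."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            yield from flatten(value, f"{prefix}[{index}]")
    else:
        yield prefix, obj


# Usage (needs network access and a real endpoint):
# payload = fetch_json(HYPOTHETICAL_API_URL)
# for path, value in flatten(payload):
#     print(path, "=", value)
```

In Alteryx the equivalent would be pointing the Download tool at the captured URL and then parsing the JSON response.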

 

Regards,

 

Ben

ArnaldoSandoval
12 - Quasar

Hi @JDG0711 

 

Indeed, the workflow posted by @sparksun bypasses the constraints I commented on earlier by submitting your request via a third-party service. I googled that service without finding any comments, reviews, or feedback. You should be aware that what I believe is your account number is exposed to a third-party service.

 

Arnaldo

JDG0711
5 - Atom

Thanks Sparksun, this worked!

 

 

JDG0711
5 - Atom

Thanks Ben, I'm familiar with the "inspect" function, but regarding the network requests, where do you see the actual API request when you go in there?

ArnaldoSandoval
12 - Quasar

@JDG0711 

 

These screenshots may help:

 

  • First click on the Network tab.
  • Then click on each Path on that screen until you see the information you are after in the response pane.
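If clicking through every request by hand gets tedious, the browser's Network tab can usually export everything as a HAR file, which is plain JSON. The sketch below searches such an export for a value you can already see on the page. The function name find_requests, the file name in the usage comment, and the URLs in SAMPLE_HAR are made up for illustration; a real export will contain the real request URLs.

```python
import base64
import json


def find_requests(har, needle):
    """Return (method, url) for every HAR entry whose response body contains needle."""
    hits = []
    for entry in har.get("log", {}).get("entries", []):
        content = entry.get("response", {}).get("content", {})
        text = content.get("text") or ""
        # HAR may store bodies base64-encoded; decode before searching.
        if content.get("encoding") == "base64":
            try:
                text = base64.b64decode(text).decode("utf-8", errors="replace")
            except ValueError:
                continue
        if needle in text:
            request = entry.get("request", {})
            hits.append((request.get("method"), request.get("url")))
    return hits


# Tiny in-memory stand-in for an exported HAR file (URLs are invented).
SAMPLE_HAR = {
    "log": {
        "entries": [
            {
                "request": {"method": "GET", "url": "https://example.com/static/js/main.js"},
                "response": {"content": {"mimeType": "application/javascript", "text": "console.log(1)"}},
            },
            {
                "request": {"method": "GET", "url": "https://example.com/api/profile"},
                "response": {"content": {"mimeType": "application/json", "text": '{"drsOutstanding": "201,978,216"}'}},
            },
        ]
    }
}

print(find_requests(SAMPLE_HAR, "201,978,216"))

# With a real export (hypothetical file name):
# with open("adr.har", encoding="utf-8") as fh:
#     print(find_requests(json.load(fh), "201,978,216"))
```

Searching for a figure that is visible on the page, such as the DRs Outstanding value, should narrow the list down to the request that actually carries the data.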

Thanks to @sparksun and @Ben_H we learned something new today! My apologies for the panic attack earlier regarding the source of the URL to use.

 

First click on the Network tab

JPMorgan-02.png

 

click on each Path on that screen

JPMorgan-03.png

 

Arnaldo
