Downloading HTML tables, Alteryx not finding tables?
I'm trying to download and parse the tables from this site: Corporate tax rates table. I've worked out where the data is coming from: Data source.
When I download the data, Alteryx returns only the "Location" and "Footnotes" column headers, with no rows underneath and no countries.
Could somebody help me with parsing the tables? I've tried a lot, to no avail.
This is what Alteryx returns, by the way; as you can see, only the two headers come through:
<!doctype html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>KPMG</title>
<script language="JavaScript">
var selectedTax = "Corporate"
</script>
<script language="JavaScript" src="js/jquery-1.6.2.min.js"></script>
<script language="JavaScript" src="js/js-GM-PRGM-009.js"></script>
<link href="css/style-GM-PRGM-009.css" rel="stylesheet" type="text/css">
</head>
<body>
<div class="GMPRGM009ResponsiveTable">
<table width="100%" >
<tbody>
<tr>
<th class="GMPRGM009xslTHLocation">Location</th>
<th class="GMPRGM009xslTHFootnotes">Footnotes</th>
</tr>
</tbody>
</table>
</div>
</body>
</html>
The actual data is coming from https://s3.amazonaws.com/kpmg-global/tax-rates-tool/js/taxRateTool-data.js
(I found this URL by looking at the Network tab in Chrome.)
It provides the data in JSON format, so you can download it and then parse it.
A sample workflow is attached.
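For anyone who wants to check the payload outside Alteryx, here is a minimal Python sketch of the same download-then-parse idea. It makes some assumptions: the URL above may no longer be reachable, and the file may be either plain JSON or JSON wrapped in a JavaScript assignment (`var x = {...};`), so the hypothetical helper `extract_json` tolerates both. If the file uses unquoted keys or single quotes, it is not valid JSON and this will fail.

```python
import json
import urllib.request

# URL quoted in the post above; it may have moved or been removed since.
DATA_URL = "https://s3.amazonaws.com/kpmg-global/tax-rates-tool/js/taxRateTool-data.js"


def extract_json(payload: str):
    """Parse the first JSON object or array found in `payload`.

    Handles plain JSON as well as JSON wrapped in a JavaScript
    assignment, e.g. `var data = {...};` (an assumption about the file).
    """
    starts = [i for i in (payload.find("{"), payload.find("[")) if i != -1]
    if not starts:
        raise ValueError("no JSON object or array found in payload")
    start = min(starts)
    # raw_decode parses one JSON value beginning at `start` and
    # ignores whatever follows it (such as a trailing semicolon).
    value, _end = json.JSONDecoder().raw_decode(payload, start)
    return value


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Download a URL and return its body decoded as UTF-8."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    data = extract_json(fetch_text(DATA_URL))
    # The structure of the data is not known here, so only report its type and size.
    print(type(data).__name__, len(data))
```

In Alteryx terms this corresponds roughly to a Download tool followed by a JSON Parse tool; the sketch is only useful for confirming what the endpoint actually returns.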
The website uses a script to render the content, i.e. the tables are not built from static HTML that Alteryx can download and parse. You can use the method described by @DavidM here to scrape websites with dynamically generated content.
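A quick way to confirm that a page is script-rendered is to count the table cells in the raw HTML and list the scripts it loads. The following is a rough Python sketch using only the standard library; the class `TableProbe` and function `probe` are hypothetical names for illustration, and the sample markup is a trimmed copy of the output posted in the question.

```python
from html.parser import HTMLParser


class TableProbe(HTMLParser):
    """Count table-related tags and collect external script sources."""

    def __init__(self):
        super().__init__()
        self.counts = {"table": 0, "th": 0, "td": 0}
        self.script_sources = []

    def handle_starttag(self, tag, attrs):
        if tag in self.counts:
            self.counts[tag] += 1
        elif tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.script_sources.append(src)


def probe(html: str):
    """Return (tag counts, script src list) for a raw HTML string."""
    parser = TableProbe()
    parser.feed(html)
    parser.close()
    return parser.counts, parser.script_sources


# Trimmed copy of the HTML shown in the original question.
SAMPLE = """
<head>
<script language="JavaScript" src="js/jquery-1.6.2.min.js"></script>
<script language="JavaScript" src="js/js-GM-PRGM-009.js"></script>
</head>
<body>
<table width="100%"><tbody><tr>
<th class="GMPRGM009xslTHLocation">Location</th>
<th class="GMPRGM009xslTHFootnotes">Footnotes</th>
</tr></tbody></table>
</body>
"""

if __name__ == "__main__":
    counts, scripts = probe(SAMPLE)
    print(counts)
    print(scripts)
```

Header cells with zero data cells, plus external scripts, suggests the rows are filled in by JavaScript after the page loads, so the data has to be fetched from wherever those scripts get it.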
Exactly what I was looking for, @jdunkerley79! Thanks very much :).
This is a great help! Any pointers on where to find a version of this table that includes the 2021 rates?
I downloaded your Alteryx file and ran it without changing anything in the workflow, but I did not get any results.
The Download tool outputs 0 records.
Do you have any insight into how to fix this?
