I am looking to import "The Full Data Set" table from the following webpage:
https://www.transparency.org/cpi2018
They provide an embed link, or alternatively I would like to import the zip file with the full data. Any suggestions?
It looks like the table values are held in a JavaScript object rather than in the page's HTML, so they won't come through in the Download tool.
Tools like https://www.parsehub.com/ can scrape this type of data. There was a good post on this in the Community recently, if you search for it.
Hi @_collin_lowney, if you would like to get the zip file and the link is going to stay the same, you can try this workflow:
- Download the page and select line 1133, which contains the link to the zip file
- Change the filename so that it includes the correct folder (I used the temporary folder where the workflow is cached, [Engine.TempFilePath])
- Unzip the file into a subfolder /extract. This is really nasty, but it works :)
- Pick the right file with a simple filter and look at the data
If this works let me know.
EDIT: this requires 7-Zip (https://www.7-zip.org) to be installed, with 7z.exe either in the folder "C:\Program Files\7-Zip\" or at a path that you set accordingly in Formula(18).
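For anyone who would rather script this outside Alteryx, the same steps could be sketched in Python roughly as below. This is not the attached workflow, only an illustration under a few assumptions: the zip link appears on the page as an ordinary href ending in .zip (the sketch searches for it instead of relying on line 1133), and the helper names find_zip_links, extract_zip and fetch_and_extract are made up for this example. It uses Python's zipfile module, so 7-Zip is not needed here.

```python
import io
import re
import zipfile
from pathlib import Path
from urllib.parse import urljoin
from urllib.request import urlopen

PAGE_URL = "https://www.transparency.org/cpi2018"  # from the thread; may have moved since


def find_zip_links(html, base_url):
    """Return absolute URLs of every href ending in .zip, in page order."""
    hrefs = re.findall(r'href=["\']([^"\']+\.zip)["\']', html, flags=re.IGNORECASE)
    return [urljoin(base_url, h) for h in hrefs]


def extract_zip(zip_bytes, target_dir):
    """Unpack an in-memory zip archive and return the extracted file paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(target)
        # Skip directory entries so only real files are returned.
        return [target / name for name in archive.namelist() if not name.endswith("/")]


def fetch_and_extract(page_url=PAGE_URL, target_dir="extract"):
    """Hypothetical end-to-end helper: page -> first zip link -> extracted files."""
    with urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    links = find_zip_links(html, page_url)
    if not links:
        raise RuntimeError("no .zip link found on the page")
    with urlopen(links[0]) as response:
        return extract_zip(response.read(), target_dir)
```

Calling fetch_and_extract() would leave the files in ./extract; the final filter step is then just a matter of picking the path with the extension you need from the returned list.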
... and why not just grab the JSON document at this link? https://www.transparency.org/assets/data/cpi2018/table-data.json :)
You can find this link in embed.js, and I suppose table-data.json might not change often.
A workflow that downloads the JSON and reads the data is attached.
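Outside Alteryx, the JSON route might look something like the Python sketch below. I have not verified the layout of table-data.json, so the code assumes it is either a list of records or an object with one list of records among its top-level values; if the real file is shaped differently, rows_from_json will raise and needs adjusting. The function names are hypothetical, chosen for this example only.

```python
import csv
import io
import json
from urllib.request import urlopen

JSON_URL = "https://www.transparency.org/assets/data/cpi2018/table-data.json"  # from the thread


def rows_from_json(text):
    """Parse JSON text and return a list of record dicts.

    Assumption: the payload is a list of objects, or an object holding
    one non-empty list of objects among its top-level values.
    """
    data = json.loads(text)
    if isinstance(data, dict):
        for value in data.values():
            if isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
                data = value
                break
    if not isinstance(data, list) or not all(isinstance(r, dict) for r in data):
        raise ValueError("unexpected JSON layout; inspect the file and adjust")
    return data


def rows_to_csv(rows):
    """Serialise records to CSV text, using the union of all keys as columns."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def fetch_table(url=JSON_URL):
    """Download the JSON document and return its records."""
    with urlopen(url) as response:
        return rows_from_json(response.read().decode("utf-8"))
```

rows_to_csv(fetch_table()) would then give CSV text that any tool can read. Records with missing keys get empty cells instead of causing an error.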
Great, got it to work. Thanks!