I am looking to import "The Full Data Set" table from the following webpage.
They provide an embed link; alternatively, I would like to import the zip file with the full data. Any suggestions?
You can use a tool like https://www.parsehub.com/ to scrape this type of data. There was a good post on this in the Community recently; a search should turn it up.
EDIT: here is the post I mentioned, thanks to @OliverW
Hi @_collin_lowney , if you would like to get the zip file and the link will remain the same, you can try this workflow:
- Download the page and select line 1133, which contains the link to the zip file
- Change the filename so that it includes the correct folder (I used the temporary folder where the workflow is cached, [Engine.TempFilePath])
- Unzip the file into a subfolder /extract (this is really nasty, but it works 🙂)
- Pick the right file with a simple filter and look at the data
If this works, let me know.
EDIT: this requires 7-Zip (https://www.7-zip.org) to be installed, with 7z.exe either in the folder "C:\Program Files\7-Zip\" or at a different path that you set accordingly in Formula(18).
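For anyone who wants to try the same steps outside the workflow, here is a rough Python sketch. It is not the attached workflow: it searches the page for any href ending in .zip rather than relying on line 1133, and it uses Python's built-in zipfile module instead of 7-Zip. The function names are made up for illustration, and the page URL is deliberately left as a parameter because it depends on your case.

```python
import io
import re
import tempfile
import zipfile
from pathlib import Path
from urllib.parse import urljoin
from urllib.request import urlopen


def find_zip_links(html: str, base_url: str) -> list[str]:
    """Return absolute URLs for every href in the page that ends in .zip."""
    hrefs = re.findall(r'href=["\']([^"\']+\.zip)["\']', html, flags=re.IGNORECASE)
    return [urljoin(base_url, href) for href in hrefs]


def extract_zip(data: bytes, target_dir: Path) -> list[str]:
    """Unpack zip bytes into target_dir and return the archive member names."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


def download(url: str) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def fetch_and_extract(page_url: str) -> list[str]:
    """Hypothetical end-to-end helper: page -> first zip link -> ./extract."""
    html = download(page_url).decode("utf-8", errors="replace")
    links = find_zip_links(html, page_url)
    if not links:
        raise ValueError("no .zip link found on the page")
    # Assumption: a temporary folder plays the role of [Engine.TempFilePath].
    target = Path(tempfile.mkdtemp()) / "extract"
    return extract_zip(download(links[0]), target)
```

Calling fetch_and_extract with the page URL returns the names of the extracted files, which you can then filter for the one you need. It takes the first .zip link it finds, so check the result of find_zip_links yourself if the page links to several archives.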
... and why not just grab the JSON document at this link? https://www.transparency.org/assets/data/cpi2018/table-data.json 🙂
You can find this link in the embed.js, and I suppose the table-data.json might not change often.
A workflow that downloads the JSON and reads the data is attached.
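If you want to check the JSON outside the workflow, a minimal Python sketch could look like the one below. The layout of table-data.json is not described in this thread, so the code assumes it is either a list of records or an object wrapping such a list; adjust to_rows if the real file looks different. The function names are hypothetical, and the link may no longer resolve.

```python
import json
from urllib.request import urlopen

JSON_URL = "https://www.transparency.org/assets/data/cpi2018/table-data.json"


def fetch_json(url: str = JSON_URL):
    """Download the document and parse it as JSON."""
    with urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def to_rows(payload) -> list[dict]:
    """Normalise the payload to a list of dict records.

    Assumption: the data is a top-level list, or a dict whose first
    list-valued entry holds the records. Anything else yields no rows.
    """
    if isinstance(payload, list):
        candidates = payload
    elif isinstance(payload, dict):
        candidates = next((v for v in payload.values() if isinstance(v, list)), [])
    else:
        candidates = []
    return [row for row in candidates if isinstance(row, dict)]


def column_names(rows: list[dict]) -> list[str]:
    """Collect every key across the records, in first-seen order."""
    seen: dict[str, None] = {}
    for row in rows:
        for key in row:
            seen.setdefault(key, None)
    return list(seen)
```

Running to_rows(fetch_json()) and then column_names on the result gives a quick view of which fields the table carries before you build anything on top of it.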
Great, got it to work. Thanks!