Parse JSON
Hi All,
I am trying to create a database of UK companies using publicly available data from UK Companies House (https://download.companieshouse.gov.uk/en_pscdata.html).
The file is in JSON format, and the Download and JSON Parse tools do not seem to work on it. The download step in particular does not seem to work, as the file is in zip format and it is one of many.
The database is pretty large (over 9 GB) so I have uploaded a small sample.
Any ideas?
@ste_demi if the first issue at hand is the zip format, then the first step is working out how to unzip these files before processing them. If the process is repeatable, you could use a batch macro (useful if you're worried about processing limits on the machine) and write out a .yxdb file at the end of each run. Afterwards you could union all the .yxdb files together and do one bulk upload to the database, or you could just load them incrementally on each batch run.
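For anyone who wants to try the same idea outside Alteryx, here is a rough Python sketch of the batch approach: each zip archive is extracted on its own, and the extracted line-based files are then combined into one. The function names, the folder layout, and the use of a plain text file in place of .yxdb are all assumptions made for illustration, not part of the workflow described here.

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def extract_batch(zip_path: Path, out_dir: Path) -> list[Path]:
    """Extract every file in one archive into out_dir and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            extracted.append(Path(archive.extract(info, out_dir)))
    return extracted


def union_files(parts: list[Path], combined: Path) -> int:
    """Append the non-empty lines of each part to one file; return the line count."""
    count = 0
    with combined.open("w", encoding="utf-8") as out:
        for part in parts:
            with part.open("r", encoding="utf-8") as src:
                for line in src:
                    line = line.rstrip("\n")
                    if line:
                        out.write(line + "\n")
                        count += 1
    return count


def process_all(zip_dir: Path, work_dir: Path, combined: Path) -> int:
    """Handle the archives one at a time (one 'batch' each), then union the results."""
    parts: list[Path] = []
    for zip_path in sorted(zip_dir.glob("*.zip")):
        # Each archive gets its own working folder (hypothetical layout).
        parts.extend(extract_batch(zip_path, work_dir / zip_path.stem))
    return union_files(parts, combined)
```

Something like `process_all(Path("downloads"), Path("work"), Path("all_records.txt"))` would run it, where the three paths are placeholders. Files are read line by line, so a large extract does not need to fit in memory.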
All the best,
BS
@ste_demi
I've put something together for you:
I ran the container at the top to unzip the files using Python; it's up to you if you'd rather do this manually.
I then added the bottom data stream to actually unpack the .txt file. I used "\n" as the delimiter because each row in the text file is a JSON object, so I split the text file into individual rows and then used the JSON Parse tool.
Seemed to get the job done. I'll attach my solution. Have a play around with it.
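In case it helps to see the split-then-parse step in plain code, here is a minimal Python sketch. It splits the text on "\n", parses each row as JSON, and flattens nested keys into dotted names, which is only roughly what the JSON Parse tool produces. The function names and the sample record are made up for illustration and are not taken from the real Companies House file.

```python
from __future__ import annotations

import json
from typing import Any, Iterator


def flatten(value: Any, prefix: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (dotted_name, leaf_value) pairs for a parsed JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}.{index}" if prefix else str(index))
    else:
        yield prefix, value


def parse_lines(text: str) -> list[dict[str, Any]]:
    """Split on newlines and parse each non-empty row as one JSON object."""
    rows = []
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue
        rows.append(dict(flatten(json.loads(line))))
    return rows


# Made-up sample rows, one JSON object per line.
sample = (
    '{"company_number": "00000001", "data": {"kind": "example", "tags": ["a", "b"]}}\n'
    '{"company_number": "00000002", "data": {"kind": "example", "tags": []}}\n'
)
rows = parse_lines(sample)
print(rows[0])
```

For the full 9 GB file you would want to read it line by line from disk rather than loading it all into one string, but the per-row logic would stay the same.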
All the best,
BS
That was brilliant! Many thanks.