Hi everyone,
I'm trying to input a JSON file which is a union (I think) of a number of different scraped webpages. My issue is that only the first of these pages seems to come through, and I also receive a warning message:
Input Data (1) Error message: The document root must not follow by other values. at character position: 143856
I have a CSV file which contains the same data, and I can see that it has all of the pages. The reason I'm using the JSON file rather than the CSV is so that I can keep the column headers from each page.
Separately, this has all come about because I was unable to correctly parse the page directly through Alteryx. This is also included in the sample workbook in case anybody would like to look at that as well!
Thanks
DB
Hi @DataBlender,
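Your JSON file is indeed a whole lot of JSON files unioned, which is why the parser stops after the first document, but that is easy enough to deal with. Bring it in and configure the Input Data tool as CSV with no delimiter (\0), so that each line is read as a single field.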
Then add a Record ID and put it through the JSON Parse tool. Note that your schema changes after record 406, so you might want to split on that before cross-tabbing your results.
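For anyone who wants to see the same idea outside Alteryx, here is a rough Python sketch of the steps above: read one JSON document per line, give each a record ID, and flatten it into name/value rows. It assumes each scraped page sits on its own line; the sample data and the function names `flatten` and `parse_unioned_json` are made up for illustration and are not part of any Alteryx tool.

```python
import json

def flatten(value, prefix=""):
    """Yield (dotted_name, leaf_value) pairs for a nested JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}.{index}" if prefix else str(index))
    else:
        yield prefix, value

def parse_unioned_json(lines):
    """Parse one JSON document per line and tag each with a record ID."""
    rows = []
    record_id = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between documents
        record_id += 1
        for name, val in flatten(json.loads(line)):
            rows.append((record_id, name, val))
    return rows

# Hypothetical sample: two documents with different schemas, one per line.
sample = [
    '{"page": "a", "attributes": {"colour": "red"}}',
    '{"page": "b", "attributes": {"size": 3, "tags": ["x", "y"]}}',
]
rows = parse_unioned_json(sample)
for row in rows:
    print(row)
```

Because every row carries its record ID, documents with different schemas can be filtered apart before pivoting, which is the same reason for splitting before the cross-tab.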
Kane
Thanks @KaneG!
Had the exact same issue as DataBlender - great solution, thanks!
You'll have to set the Input Data tool's "Field Length" option to a value high enough that it does not truncate the values. Also deselect the "First Row Contains Field Names" checkbox, otherwise the first JSON document is treated as a header.
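To show why the field length matters, here is a small Python sketch, not Alteryx itself: a JSON line that gets cut short no longer parses. The helper names and the sample line are hypothetical, and lengths here are counted in characters, whereas a real tool may count bytes.

```python
import json

def read_with_field_length(line, field_length):
    """Mimic a reader that keeps only the first `field_length` characters."""
    return line[:field_length]

def is_valid_json(text):
    """Return True if `text` parses as a single JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

def required_field_length(lines):
    """Smallest field length that fits the longest line without truncation."""
    return max((len(line.rstrip("\n")) for line in lines), default=0)

# Hypothetical single-line document.
line = '{"page": "a", "attributes": {"colour": "red", "size": 3}}'

truncated_ok = is_valid_json(read_with_field_length(line, 20))
full_ok = is_valid_json(read_with_field_length(line, required_field_length([line])))
print(truncated_ok, full_ok)  # False True
```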
With this data, I would also advise playing around with the Dynamic Select tool in order to separate the attribute fields.
Thank you so much!!! Looks like I overthought this problem way too much haha
Will see what I can do with Dynamic Select! Really appreciate your help!