Hi,
I've been trying to parse some nested JSON but I cannot get my head around it. I can do it if it's only one level (nodes?) but if there are multiple levels I cannot do it.
The desired output is like this, where name, start_time and end_time are unique
name | start_time | end_time | in | out | abandon |
StoreEntrance.Door1.Left | 2017-10-10T00:00:00.000+01:00 | 2017-10-10T00:04:59.999+01:00 | 1 | 0 | 0 |
StoreEntrance.Door1.Left | 2017-10-10T00:05:00.000+01:00 | 2017-10-10T00:09:59.999+01:00 | 0 | 1 | 0 |
StoreEntrance.Door1.Right | 2017-10-10T00:00:00.000+01:00 | 2017-10-10T00:04:59.999+01:00 | 0 | 3 | 0 |
StoreEntrance.Door1.Right | 2017-10-10T00:05:00.000+01:00 | 2017-10-10T00:09:59.999+01:00 | 2 | 0 | 0 |
I attached the JSON file that I try to parse this way (I had to rename it as .txt because .json is not a supported attachment file type)
Any help would be appreciated.
Alex
Solved! Go to Solution.
Elegant solution with the Regex, thank you very much for your help!
Hello,
I'm trying to parse a complicated JSON file that contains information about computer vulnerabilities from the National Vulnerability Database operated by NIST. The main page for the files is here https://nvd.nist.gov/vuln/data-feeds.
The smallest file they have is still almost 3 MB when unzipped. The 2002 file is the one giving me the most trouble and the smallest one with which I can demonstrate my issues. Download it here https://static.nvd.nist.gov/feeds/json/cve/1.0/nvdcve-1.0-2002.json.zip.
The Workflow is pretty much the same as the one provided, just a few tweaks due to the naming conventions.
The problem comes with the nested fields "vendor_name".
You can see how they appear in this excerpt. This isn't the entire section, just a snippet.
"cve" : { "data_type" : "CVE", "data_format" : "MITRE", "data_version" : "4.0", "CVE_data_meta" : { "ID" : "CVE-1999-0590", "ASSIGNER" : "cve@mitre.org" }, "affects" : { "vendor" : { "vendor_data" : [ { "vendor_name" : "apple", "product" : { "product_data" : [ { "product_name" : "mac_os", "version" : { "version_data" : [ { "version_value" : "*" } ] } } ] } }, { "vendor_name" : "linux", "product" : { "product_data" : [ { "product_name" : "linux_kernel", "version" : { "version_data" : [ { "version_value" : "2.6.20.1" } ] } } ]
Each vendor_name has its own associated product_name.
But this is how it looks in the final report.
Record_Name publishedDate_Fixed lastModifiedDate_Fixed cvssV2_baseScore cvssV3_baseScore vendor_name product_name value versionEndIncluding version_value CVE-1999-0590 2000-01-01 04:00:00 2008-01-09 12:35:00 10 apple,linux,microsoft mac_os,linux_kernel,windows_2000,windows_95,windows_98,windows_nt NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it. *,2.6.20.1,*,*,*,3.5.1,4.0
The vendor names are concatenated in a single field and the product names and version_value are concatenated in their respective fields.
I'd rather break it down like this.
Record_Name publishedDate_Fixed lastModifiedDate_Fixed cvssV2_baseScore cvssV3_baseScore vendor_name product_name value versionEndIncluding version_value CVE-1999-0590 1/1/2000 4:00 1/9/2008 12:35 10 apple mac_os NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it. * CVE-1999-0590 1/1/2000 4:00 1/9/2008 12:35 10 linux linux_kernel NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it. 2.6.20.1 CVE-1999-0590 1/1/2000 4:00 1/9/2008 12:35 10 microsoft windows_2000 NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it. * CVE-1999-0590 1/1/2000 4:00 1/9/2008 12:35 10 microsoft windows_95 NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it. * CVE-1999-0590 1/1/2000 4:00 1/9/2008 12:35 10 microsoft windows_98 NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it. * CVE-1999-0590 1/1/2000 4:00 1/9/2008 12:35 10 microsoft windows_nt NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it. 3.5.1 CVE-1999-0590 1/1/2000 4:00 1/9/2008 12:35 10 microsoft windows_nt NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it. 4.0
Grouped by Vendor, then Product, then Version.
But the parsing is beyond my skill level.
I have a Workflow that should do exactly what I want. It produces the desired results with the sample data. But, on real data, the files that range from 5 MB to 200 MB, I've let it run for 72 hours on one of the smaller files and it still didn't finish. I tried breaking it into smaller workflows and dumping incremental results into Altertyx data files. The file created by step 2 of 4 was 6 GB.
Any advice would be great. See attached monstrosity.
I attached the wrong version the first time. Sorry.
This is what the attached one produces with the demo data.
CVE published_Date lastmodified_Date cvssV2_baseScore Vendor_Name Product_Name Version_Value CVE-1999-0590 2000-01-01 04:00:00 2008-01-09 12:35:00 10 apple mac_os * CVE-1999-0590 2000-01-01 04:00:00 2008-01-09 12:35:00 10 linux linux_kernel 2.6.20.1 CVE-1999-0590 2000-01-01 04:00:00 2008-01-09 12:35:00 10 microsoft windows_2000 * CVE-1999-0590 2000-01-01 04:00:00 2008-01-09 12:35:00 10 microsoft windows_95 * CVE-1999-0590 2000-01-01 04:00:00 2008-01-09 12:35:00 10 microsoft windows_98 * CVE-1999-0590 2000-01-01 04:00:00 2008-01-09 12:35:00 10 microsoft windows_nt 3.5.1 CVE-1999-0590 2000-01-01 04:00:00 2008-01-09 12:35:00 10 microsoft windows_nt 4.0