Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Parsing nested JSON

Alex-Andronic
6 - Meteoroid

Hi, 

 

I've been trying to parse some nested JSON but I cannot get my head around it. I can do it if it's only one level (nodes?) but if there are multiple levels I cannot do it.

 

The desired output is like this, where name, start_time and end_time are unique

namestart_timeend_timeinoutabandon
StoreEntrance.Door1.Left2017-10-10T00:00:00.000+01:002017-10-10T00:04:59.999+01:00100
StoreEntrance.Door1.Left2017-10-10T00:05:00.000+01:002017-10-10T00:09:59.999+01:00010
StoreEntrance.Door1.Right2017-10-10T00:00:00.000+01:002017-10-10T00:04:59.999+01:00030
StoreEntrance.Door1.Right2017-10-10T00:05:00.000+01:002017-10-10T00:09:59.999+01:00200

 

I attached the JSON file that I try to parse this way (I had to rename it as .txt because .json is not a supported attachment file type)

 

Any help would be appreciated.

Alex

5 REPLIES 5
jdunkerley79
ACE Emeritus
ACE Emeritus

Try the attached.

 

I think should be close enough to get you started.

 

2017-10-11_18-54-22.jpg

 Basically uses Regex and join to get records into the needed format

Alex-Andronic
6 - Meteoroid

Elegant solution with the Regex, thank you very much for your help!

jrdepriest
7 - Meteor

Hello,

 

I'm trying to parse a complicated JSON file that contains information about computer vulnerabilities from the National Vulnerability Database operated by NIST. The main page for the files is here https://nvd.nist.gov/vuln/data-feeds.

The smallest file they have is still almost 3 MB when unzipped. The 2002 file is the one giving me the most trouble and the smallest one with which I can demonstrate my issues. Download it here https://static.nvd.nist.gov/feeds/json/cve/1.0/nvdcve-1.0-2002.json.zip.

2018-03-06_9-51-52.png

The Workflow is pretty much the same as the one provided, just a few tweaks due to the naming conventions.

The problem comes with the nested fields "vendor_name".

You can see how they appear in this excerpt. This isn't the entire section, just a snippet.

 

    "cve" : {
      "data_type" : "CVE",
      "data_format" : "MITRE",
      "data_version" : "4.0",
      "CVE_data_meta" : {
        "ID" : "CVE-1999-0590",
        "ASSIGNER" : "cve@mitre.org"
      },
      "affects" : {
        "vendor" : {
          "vendor_data" : [ {
            "vendor_name" : "apple",
            "product" : {
              "product_data" : [ {
                "product_name" : "mac_os",
                "version" : {
                  "version_data" : [ {
                    "version_value" : "*"
                  } ]
                }
              } ]
            }
          }, {
            "vendor_name" : "linux",
            "product" : {
              "product_data" : [ {
                "product_name" : "linux_kernel",
                "version" : {
                  "version_data" : [ {
                    "version_value" : "2.6.20.1"
                  } ]
                }
              } ]

Each vendor_name has its own associated product_name.

 

But this is how it looks in the final report.

 

Record_Name	publishedDate_Fixed	lastModifiedDate_Fixed	cvssV2_baseScore	cvssV3_baseScore	vendor_name	product_name	value	versionEndIncluding	version_value
CVE-1999-0590	2000-01-01 04:00:00	2008-01-09 12:35:00	10		apple,linux,microsoft	mac_os,linux_kernel,windows_2000,windows_95,windows_98,windows_nt	NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it.		*,2.6.20.1,*,*,*,3.5.1,4.0

The vendor names are concatenated in a single field and the product names and version_value are concatenated in their respective fields.

 

I'd rather break it down like this.

 

Record_Name	publishedDate_Fixed	lastModifiedDate_Fixed	cvssV2_baseScore	cvssV3_baseScore	vendor_name	product_name	value	versionEndIncluding	version_value
CVE-1999-0590	1/1/2000 4:00	1/9/2008 12:35	10		apple	mac_os	NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it.		*
CVE-1999-0590	1/1/2000 4:00	1/9/2008 12:35	10		linux	linux_kernel	NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it.		2.6.20.1
CVE-1999-0590	1/1/2000 4:00	1/9/2008 12:35	10		microsoft	windows_2000	NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it.		*
CVE-1999-0590	1/1/2000 4:00	1/9/2008 12:35	10		microsoft	windows_95	NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it.		*
CVE-1999-0590	1/1/2000 4:00	1/9/2008 12:35	10		microsoft	windows_98	NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it.		*
CVE-1999-0590	1/1/2000 4:00	1/9/2008 12:35	10		microsoft	windows_nt	NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it.		3.5.1
CVE-1999-0590	1/1/2000 4:00	1/9/2008 12:35	10		microsoft	windows_nt	NVD-CWE-Other,A system does not present an appropriate legal message or warning to a user who is accessing it.		4.0

Grouped by Vendor, then Product, then Version.

But the parsing is beyond my skill level.

 

jrdepriest
7 - Meteor

I created a demo data set based on the excerpt I posted earlier. This set is representative of the whole.

 

I feel like this is going to take a SubSubField and a SubSubSubField and that is getting too complicated for my brain to parse.

jrdepriest
7 - Meteor

I have a Workflow that should do exactly what I want. It produces the desired results with the sample data. But, on real data, the files that range from 5 MB to 200 MB, I've let it run for 72 hours on one of the smaller files and it still didn't finish. I tried breaking it into smaller workflows and dumping incremental results into Altertyx data files. The file created by step 2 of 4 was 6 GB.

Any advice would be great. See attached monstrosity.

I attached the wrong version the first time. Sorry.

This is what the attached one produces with the demo data.

CVE	published_Date	lastmodified_Date	cvssV2_baseScore	Vendor_Name	Product_Name	Version_Value
CVE-1999-0590	2000-01-01 04:00:00	2008-01-09 12:35:00	10	apple	mac_os	*
CVE-1999-0590	2000-01-01 04:00:00	2008-01-09 12:35:00	10	linux	linux_kernel	2.6.20.1
CVE-1999-0590	2000-01-01 04:00:00	2008-01-09 12:35:00	10	microsoft	windows_2000	*
CVE-1999-0590	2000-01-01 04:00:00	2008-01-09 12:35:00	10	microsoft	windows_95	*
CVE-1999-0590	2000-01-01 04:00:00	2008-01-09 12:35:00	10	microsoft	windows_98	*
CVE-1999-0590	2000-01-01 04:00:00	2008-01-09 12:35:00	10	microsoft	windows_nt	3.5.1
CVE-1999-0590	2000-01-01 04:00:00	2008-01-09 12:35:00	10	microsoft	windows_nt	4.0
Labels