Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

JSON parsing with a mix of single and double quote as delimiters

francoiscortezon
6 - Meteoroid

Dear community,

 

I am receiving a JSON that has single quotes as delimiters. The JSON parse tool doesn't seem to like it. When replacing the single quotes by double quotes before parsing, it works almost perfectly...

 

Except that this JSON contains specificities (don't know if it's normal or not?). For example: when a text contains one or more single quotes, the delimiters are changed to doubles quotes. And when there are both single and double quotes inside the text, the delimiters are set to double quotes and the double quotes are escaped (replaced by \"). 

 

Here I am trying to explain what I have in the JSON (field "From") and what would be needed for the JSON parser to work properly (field "To"):

 

FromTo
'tag': 'some "text'"tag": "some text"
'tag': 'some text'"tag": "some text"
'tag': "some ' text"

"tag": "some text"

'tag': "some 'text ' "

"tag": "some text  "

'tag': "some ' \" text "

"tag": "some   text "

 

Basically, we would just transform the delimiters into double quotes and get rid of single and double quotes inside which are not delimiters.

 

I think the solution is either:

- a custom JSON parser already made to handle such cases (looked in the community without success)

- a magical regular expression: I tried several without luck, like: (.*'.*)(\"*)(.*'.*) or (.*'.*)("*)(.*'.*) (and many others) in a RegEx tool in replace mode, with $1$3 as replacement text. If anyone can tell me why this one doesn't work, would appreciate a lot!

 

Any help from your RegEx or JSON wizards would be appreciated! 😉

 

Thanks a lot in advance for your help.

 

François

8 REPLIES 8

Hi @francoiscortezon 

 

I did it using the formula tool. I tried basically to remove all punctuations except : then I used Regex replace to have the double quotes around both tag and text. I added after a Cleansing tool to get rid of double whitespace.

christine_assaad_0-1658436792534.png

 

Hope that helps. Cheers!

francoiscortezon
6 - Meteoroid

Thanks a lot Christine. Your solution is going in the right direction.

 

It needs to work on full JSON content, for example: {'tag1': "some ' text", 'tag2': 'some text', 'tag3': null}

Which should translate to: {"tag1": "some ' text", "tag2": "some text", "tag3": null} (or {"tag1": "some text" .... -> The single quote inside the double quotes can be left or removed, it's not really important.

 

I used your solution after splitting to rows with the comma as a delimiter. It logically adds the double quotes in the incorrect places. Do you know if there is a way to adapt your RegExReplace formula to identify the special characters ( { } [ ] mainly I think) as well as to let the terms without quotes as they are (null for example)?

 

 

francoiscortezon_0-1658476517519.png

 

Thanks again a lot for your help.

 

François

 

grazitti_sapna
17 - Castor

@francoiscortezon, give this a try.

grazitti_sapna_0-1658478611835.png

 

Thanks!

Sapna Gupta
francoiscortezon
6 - Meteoroid

Thanks a lot Grazitti. Your solution is unfortunately not working if there is a space at the beginning of a text. I told myself it was not so important and tried to make your solution work by removing space around characters : and '. 

 

But finally, I tried again the Split to Rows method and I think I got it. Seems to work. For sure, there is a way to make the big formula in the middle a bit more elegant, but for now, it works.

 

I will try soon with the full dataset and see if I didn't miss more special cases 😁.

 

Thanks again a lot to you both for your help!

 

Your ideas definitely helped me to think about the various approaches, and I learned a lot about RegEx.

 

Have a good day.

 

François

 

 

Hi @francoiscortezon 

 

Sorry I did not see your previous email as I wasn't tagged. I'm happy we are close.

 

See below, I basically removed the Regex Replace I used, used a Json build to take care of where double quotes should be placed.

christine_assaad_0-1658499760192.png

Hope this helps. If yes, please mark it as solution.

francoiscortezon
6 - Meteoroid

Hi @christine_assaad ,

 

My first post, I will tag people from now on, thanks. 😀


As you can see from my latest post and my answer to @grazitti_sapna , I updated my sample data to reflect better my case: each line is already a complete JSON. The problem is about the single / double quotes. And I think I finally got it, but will confirm on Monday.

 

Thanks again and have a great weelkend.

 

François

grazitti_sapna
17 - Castor

@francoiscortezon, could you please try my updated workflow seems to work for me.

grazitti_sapna_0-1658727481268.png

 

Thanks!

Sapna Gupta
francoiscortezon
6 - Meteoroid

Hi @grazitti_sapna,

 

Thanks for your work and for making me discover "non-capturing groups" in RegEx, interesting! (?:)

 

Unfortunately, it doesn't work when single quotes don't have space around them.

 

Ex: {'tag': "some'text"} becomes {"tag":"some"text"}, which is incorrect.

 

Due to holidays of my colleagues, I might not be able to fully validate my workflow before few weeks. So in case you have ideas more elegant than mine, do not hesitate ;-)

 

Thanks a lot and have a great day.

 

François

 

Prime Analytics loves Alteryx - Copy.png

 

 

 

Labels