Dear community,
I am receiving a JSON that has single quotes as delimiters. The JSON parse tool doesn't seem to like it. When replacing the single quotes by double quotes before parsing, it works almost perfectly...
Except that this JSON contains specificities (don't know if it's normal or not?). For example: when a text contains one or more single quotes, the delimiters are changed to doubles quotes. And when there are both single and double quotes inside the text, the delimiters are set to double quotes and the double quotes are escaped (replaced by \").
Here I am trying to explain what I have in the JSON (field "From") and what would be needed for the JSON parser to work properly (field "To"):
From | To |
'tag': 'some "text' | "tag": "some text" |
'tag': 'some text' | "tag": "some text" |
'tag': "some ' text" | "tag": "some text" |
'tag': "some 'text ' " | "tag": "some text " |
'tag': "some ' \" text " | "tag": "some text " |
Basically, we would just transform the delimiters into double quotes and get rid of single and double quotes inside which are not delimiters.
I think the solution is either:
- a custom JSON parser already made to handle such cases (looked in the community without success)
- a magical regular expression: I tried several without luck, like: (.*'.*)(\"*)(.*'.*) or (.*'.*)("*)(.*'.*) (and many others) in a RegEx tool in replace mode, with $1$3 as replacement text. If anyone can tell me why this one doesn't work, would appreciate a lot!
Any help from your RegEx or JSON wizards would be appreciated! 😉
Thanks a lot in advance for your help.
François
I did it using the formula tool. I tried basically to remove all punctuations except : then I used Regex replace to have the double quotes around both tag and text. I added after a Cleansing tool to get rid of double whitespace.
Hope that helps. Cheers!
Thanks a lot Christine. Your solution is going in the right direction.
It needs to work on full JSON content, for example: {'tag1': "some ' text", 'tag2': 'some text', 'tag3': null}
Which should translate to: {"tag1": "some ' text", "tag2": "some text", "tag3": null} (or {"tag1": "some text" .... -> The single quote inside the double quotes can be left or removed, it's not really important.
I used your solution after splitting to rows with the comma as a delimiter. It logically adds the double quotes in the incorrect places. Do you know if there is a way to adapt your RegExReplace formula to identify the special characters ( { } [ ] mainly I think) as well as to let the terms without quotes as they are (null for example)?
Thanks again a lot for your help.
François
Thanks a lot Grazitti. Your solution is unfortunately not working if there is a space at the beginning of a text. I told myself it was not so important and tried to make your solution work by removing space around characters : and '.
But finally, I tried again the Split to Rows method and I think I got it. Seems to work. For sure, there is a way to make the big formula in the middle a bit more elegant, but for now, it works.
I will try soon with the full dataset and see if I didn't miss more special cases 😁.
Thanks again a lot to you both for your help!
Your ideas definitely helped me to think about the various approaches, and I learned a lot about RegEx.
Have a good day.
François
Sorry I did not see your previous email as I wasn't tagged. I'm happy we are close.
See below, I basically removed the Regex Replace I used, used a Json build to take care of where double quotes should be placed.
Hope this helps. If yes, please mark it as solution.
Hi @christine_assaad ,
My first post, I will tag people from now on, thanks. 😀
As you can see from my latest post and my answer to @grazitti_sapna , I updated my sample data to reflect better my case: each line is already a complete JSON. The problem is about the single / double quotes. And I think I finally got it, but will confirm on Monday.
Thanks again and have a great weelkend.
François
@francoiscortezon, could you please try my updated workflow seems to work for me.
Thanks!
Hi @grazitti_sapna,
Thanks for your work and for making me discover "non-capturing groups" in RegEx, interesting! (?:)
Unfortunately, it doesn't work when single quotes don't have space around them.
Ex: {'tag': "some'text"} becomes {"tag":"some"text"}, which is incorrect.
Due to holidays of my colleagues, I might not be able to fully validate my workflow before few weeks. So in case you have ideas more elegant than mine, do not hesitate ;-)
Thanks a lot and have a great day.
François
User | Count |
---|---|
19 | |
15 | |
14 | |
9 | |
8 |