Regex - Tokenize
Why is my Tokenize parse not working correctly? I am expecting each "<juice>.*</juice>" combo to be in its own column.
Text Data-
Field 1 |
the beginning of my questions<juice>apple</juice>alphabet soup is good<juice>orange</juice>the earth is round<juice>cranberry</juice>go lakers |
Output-
Field 1 | Field 11 |
the beginning of my questions<juice>apple</juice>alphabet soup is good<juice>orange</juice>the earth is round<juice>cranberry</juice>go lakers | <juice>apple</juice>alphabet soup is good<juice>orange</juice>the earth is round<juice>cranberry</juice> |
It's really bad when I just watch this board for REGEXs to solve 😂.
I think yours is fairly easy: you are just missing the parentheses to capture the juice name. I also changed it to word characters (\w) instead of the period, which matches any character, including whitespace and the tags themselves, and will gum up your results.
\<juice\>(\w*)\<\/juice\>
I like to escape all the symbol characters (with a backslash) just in case. Let me know if it works!
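To see what the capture group changes, here is a minimal Python sketch of the same idea, run against the sample row from the question. It is only an illustration using Python's `re` module, not the tool from the thread, so behavior there may differ. The names `text`, `pattern`, and `juices` are made up for the example. The angle brackets are left unescaped here because in some regex flavors `\<` and `\>` are treated as word-boundary anchors rather than literal brackets.

```python
import re

# Sample row from the question (assumed to be a single line of text).
text = (
    "the beginning of my questions<juice>apple</juice>alphabet soup is good"
    "<juice>orange</juice>the earth is round<juice>cranberry</juice>go lakers"
)

# The parentheses capture only the word characters between the tags.
pattern = re.compile(r"<juice>(\w*)</juice>")

# findall returns the captured group for every match, left to right.
juices = pattern.findall(text)
print(juices)  # ['apple', 'orange', 'cranberry']
```

Each element of the list corresponds to one token, which is roughly what a tokenize-style parse would place in its own column.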
Here is my solution.
Thank you for the response. This did work, however I would prefer to use (.*) instead of (<\w+\>) to make it more dynamic in case a number or special character were to come through as an input. Any ideas?
No luck, now no data is parsing. Also would prefer to stray away from the (\w*) in order to make it more dynamic, in case a number or special character were to come through in input.
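One common way to keep the pattern dynamic without falling back to `\w` is a lazy quantifier, `(.*?)`, which stops at the first closing tag instead of the last. The sketch below is a Python illustration of that difference, assuming the data sits on one line; it is not taken from the thread, and the names `greedy`, `lazy`, and `special` are hypothetical.

```python
import re

text = (
    "the beginning of my questions<juice>apple</juice>alphabet soup is good"
    "<juice>orange</juice>the earth is round<juice>cranberry</juice>go lakers"
)

# Greedy: .* runs to the LAST </juice>, so everything lands in one token.
greedy = re.findall(r"<juice>(.*)</juice>", text)

# Lazy: .*? stops at the FIRST </juice> after each opening tag.
lazy = re.findall(r"<juice>(.*?)</juice>", text)

# Digits, spaces, and special characters are still captured by the lazy form.
special = re.findall(r"<juice>(.*?)</juice>", "x<juice>apple-2 & pear</juice>y")

print(len(greedy))  # 1
print(lazy)         # ['apple', 'orange', 'cranberry']
print(special)      # ['apple-2 & pear']
```

The single greedy token matches the symptom in the original output, where one column held everything from the first opening tag to the last closing tag.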
I think in order to answer your question better, we need to know more about the data surrounding the information. Will there always be three <juice> tags? If so, that makes all the difference:
\<juice\>(.*)\<\/juice\>[\s\S]*\<juice\>(.*)\<\/juice\>[\s\S]*\<juice\>(.*)\<\/juice\>
That will work as long as each record always contains exactly three juice tags.
Cheers!
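As a rough check of the three-group pattern, here is a Python sketch that applies it to the sample row from the question. It assumes exactly three tags per record and uses unescaped angle brackets; `columns` is a hypothetical name for the parsed values, and the regex engine in the original tool may behave differently from Python's `re`.

```python
import re

text = (
    "the beginning of my questions<juice>apple</juice>alphabet soup is good"
    "<juice>orange</juice>the earth is round<juice>cranberry</juice>go lakers"
)

# Three capture groups, one per expected tag; [\s\S]* skips the text between.
pattern = re.compile(
    r"<juice>(.*)</juice>[\s\S]*<juice>(.*)</juice>[\s\S]*<juice>(.*)</juice>"
)

match = pattern.search(text)
# Fall back to an empty tuple when a record has fewer than three tags.
columns = match.groups() if match else ()
print(columns)  # ('apple', 'orange', 'cranberry')
```

With exactly three tags, backtracking forces each greedy group down to a single juice name. If a fourth tag appears, the first group can swallow two tags' worth of text, so this approach is only safe when the count is fixed.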
