Alteryx Designer Desktop Discussions

Regex- Tokenize

garretwalters12
8 - Asteroid

Why is my Tokenize parse not working correctly? I am expecting each "<juice>.*</juice>" combo to be in its own column.

 

Text Data- 

Field 1
the beginning of my questions<juice>apple</juice>alphabet soup is good<juice>orange</juice>the earth is round<juice>cranberry</juice>go lakers

 

[Screenshot: garretwalters12_0-1611026556071.png]

 

Output- 

Field 1 | Field 11
the beginning of my questions<juice>apple</juice>alphabet soup is good<juice>orange</juice>the earth is round<juice>cranberry</juice>go lakers | <juice>apple</juice>alphabet soup is good<juice>orange</juice>the earth is round<juice>cranberry</juice>
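
The single extra column above is what a greedy `.*` would produce: it runs from the first <juice> to the last </juice>, so the whole span counts as one token. A minimal sketch of that behavior, using Python's `re` module purely for illustration (it is not the engine behind the RegEx tool, and the variable names are made up for this example):

```python
import re

# Sample text from the question.
text = (
    "the beginning of my questions<juice>apple</juice>alphabet soup is good"
    "<juice>orange</juice>the earth is round<juice>cranberry</juice>go lakers"
)

# Greedy .* extends to the LAST </juice>, so only one match is found.
greedy_tokens = re.findall(r"<juice>.*</juice>", text)

print(len(greedy_tokens))
print(greedy_tokens[0])
```

The one match starts at <juice>apple and ends at cranberry</juice>, which lines up with the Field 11 value shown in the output.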
5 REPLIES
BretCarr
10 - Fireball

It's really bad when I just watch this board for REGEXs to solve 😂.

 

I think yours is fairly easy: you are just missing the parentheses to capture the juice name. I also changed it to word characters instead of the period, which matches any character (whitespace included) and will gum up your results.

 

\<juice\>(\w*)\<\/juice\>

 

I like to escape all the symbol characters (with a backslash) just in case. Let me know if it works!
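
For reference, a rough sketch of the capture-group idea in Python's `re` (an illustration only, not the Alteryx tool itself). The angle brackets are left unescaped here on purpose: Python treats `\<` as a literal `<`, but some flavors, Boost's Perl syntax for example, read `\<` and `\>` as word-boundary assertions, so escaping them "just in case" can change the meaning of the pattern.

```python
import re

# Sample text from the question.
text = (
    "the beginning of my questions<juice>apple</juice>alphabet soup is good"
    "<juice>orange</juice>the earth is round<juice>cranberry</juice>go lakers"
)

# With one capture group, findall returns only the captured text per match.
# \w* cannot cross a '<', so each match stays inside a single tag pair.
juices = re.findall(r"<juice>(\w*)</juice>", text)

print(juices)  # ['apple', 'orange', 'cranberry']
```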

sparksun
11 - Bolide

Here is my solution.

 

[Screenshot: sparksun_0-1611049645985.png]

 

garretwalters12
8 - Asteroid

Thank you for the response. This did work; however, I would prefer to use (.*) instead of (<\w+\>) to make it more dynamic, in case a number or special character comes through in the input. Any ideas?

garretwalters12
8 - Asteroid

No luck, now no data is parsing. I would also prefer to stay away from (\w*) to make it more dynamic, in case a number or special character comes through in the input.
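
One common way to keep the period and still get one token per tag is a lazy quantifier, `(.*?)`, which stops at the nearest </juice> instead of the last one. A minimal sketch in Python's `re`, assuming the RegEx tool's engine treats `*?` the same way (not confirmed in this thread); the sample values with digits, spaces, and punctuation are made up for illustration:

```python
import re

# Hypothetical input with digits, spaces, and special characters in the tags.
messy_text = (
    "a<juice>apple-2</juice>b<juice>100% orange</juice>"
    "c<juice>cran berry!</juice>d"
)

# Lazy .*? stops at the first </juice> it can, giving one capture per tag.
lazy_juices = re.findall(r"<juice>(.*?)</juice>", messy_text)

# \w* only allows letters, digits, and underscore, so none of these tags match.
word_juices = re.findall(r"<juice>(\w*)</juice>", messy_text)

print(lazy_juices)  # ['apple-2', '100% orange', 'cran berry!']
print(word_juices)  # []
```

Note that `.` does not match a newline by default in Python, so a tag whose content spans lines would need `[\s\S]*?` or the DOTALL flag in this sketch.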

BretCarr
10 - Fireball

I think in order to answer your question better, we need to know more about the data surrounding the information. Will there always be three <juice> tags? If so, that makes all the difference:

 

\<juice\>(.*)\<\/juice\>[\s\S]*\<juice\>(.*)\<\/juice\>[\s\S]*\<juice\>(.*)\<\/juice\>

 

That will work every time no matter what as long as there are always three juice tags.
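
A quick way to sanity-check the three-group idea outside Alteryx is sketched below in Python's `re` (an illustration only; the angle brackets are written unescaped, and `two_tag_text` is a made-up input). Backtracking forces each greedy `(.*)` to settle on a single tag when exactly three tags are present, and the pattern finds nothing when fewer than three are present.

```python
import re

# Sample text from the question (three juice tags).
text = (
    "the beginning of my questions<juice>apple</juice>alphabet soup is good"
    "<juice>orange</juice>the earth is round<juice>cranberry</juice>go lakers"
)

three_tags = re.compile(
    r"<juice>(.*)</juice>[\s\S]*<juice>(.*)</juice>[\s\S]*<juice>(.*)</juice>"
)

# Exactly three tags: one capture group per tag.
three_match = three_tags.search(text)
print(three_match.groups())  # ('apple', 'orange', 'cranberry')

# Hypothetical input with only two tags: the pattern needs three, so no match.
two_tag_text = "x<juice>apple</juice>y<juice>orange</juice>z"
two_match = three_tags.search(two_tag_text)
print(two_match)  # None
```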

 

Cheers!
