Regex - How to capture everything but certain characters, repeated once or more
Hello Alteryx Community,
I am still fairly new to Alteryx, and to Regex by extension. I have a column that I am trying to parse: I want to capture each piece separately while excluding certain characters, and the pattern may repeat one or more times. Examples are below:
01234 · Word Word
01234 · Word Word:01234 · Word Word
01234 · Word Word:01234 · Word Word:01234 · Word word
01234 · Word Word:01234 · Word Word:01234 · Word word:01234 · Word word
(Please note the "·" and ":" in the data)
My current expression is as follows: ((?:[^·:])+)
This expression matches everything separately and repeats itself, but it places all the matches in Group 1 instead of Group 0, which Alteryx does not seem to like. Any ideas on how I can change this so that everything can be in Group 0? Or some help with writing a different expression would be much appreciated.
Edit: I updated the expression to match everything to Group 0: (?:[^·:])+
However, Alteryx returns an error saying there is Nothing to Parse. I'm getting closer but still just need help getting Alteryx to recognize this.
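For readers following along, here is a minimal Python sketch of the difference between the two expressions above. It assumes the separator in the data is a middle dot (·), and it uses Python's `re` module rather than Alteryx's regex engine, so it only illustrates how groups behave in general. The variable names are made up for the example. A likely reading of the "Nothing to Parse" message is that the Parse method builds its output columns from capture groups and the second expression has none, but that is an assumption about Alteryx, not something confirmed in this thread.

```python
import re

# One of the sample rows, assuming the separator is a middle dot.
sample = "01234 · Word Word:01234 · Word Word"

# First expression from the post: one capture group around the whole run.
with_group = re.compile(r"((?:[^·:])+)")
# Edited expression from the post: the same run, but no capture group at all.
without_group = re.compile(r"(?:[^·:])+")

first_with = with_group.search(sample)
first_without = without_group.search(sample)

# Group 0 is always the whole match, whether or not capture groups exist.
print(repr(first_with.group(0)))
print(repr(first_without.group(0)))

# Only the first expression has a numbered capture group to hand back.
print(first_with.groups())     # a one-element tuple
print(first_without.groups())  # an empty tuple
```

Both expressions match the same text; the only difference is whether a numbered group exists for a group-based output mode to pick up.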
- Labels:
- Regex
So is the · or the : a delimiter for what represents a word, or is it just any space? It would be good to know which parts of the sample rows above you consider a match.
I did end up figuring it out: I just changed the output method from Parse to Tokenize, and it captured everything. Thank you, though.
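As a rough illustration of what tokenizing does with this kind of pattern, the Python sketch below returns every non-overlapping match as its own token. It is not Alteryx code: the `tokenize` helper is a hypothetical name, the middle dot separator is assumed, and the whitespace trimming is an extra step added here for readability that the thread does not mention.

```python
import re

# Runs of characters that are neither a middle dot nor a colon.
TOKEN = re.compile(r"[^·:]+")

def tokenize(value):
    """Return each match as a separate token, trimmed of surrounding spaces."""
    return [piece.strip() for piece in TOKEN.findall(value)]

rows = [
    "01234 · Word Word",
    "01234 · Word Word:01234 · Word Word",
]

for row in rows:
    print(tokenize(row))
```

Each row yields one token per code or name, however many times the code and name pair repeats, so the pattern never has to know the repeat count in advance.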
