Using RegEx to parse numbers out of square [ ] brackets
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi all,
I've had no luck in using this tool and resorted to using a set of other tools (text to columns etc.) to get the job done.
I basically have a description column as follows:
Apple [1234]
Barking Dog [2134]
Grumpy Cat [981901]
I simply need to parse out the numbers in those brackets to generate a new column/field with ID.
Solved! Go to Solution.
- Labels:
- Regex
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Learning Regular Expressions can be tough to get at first. I believe there is a lesson in the Academy you might want to check out.
This is what I used and it worked nicely:
.+\s\[(\d+)\]
Let me explain each piece.
When I use the RegEx tool for parsing, I like to outline the entire pattern of the data and then put parentheses around the parts I want to parse out.
So, I start with any character 1 or more times .+
then i noticed the words always have space between the last word and the square bracket. A space is represented by \s
Then I add the square bracket...but because the square bracket is a reserved character (it is used for other things), you need to put an escape character \ in front of it: \[
Then I put an open paren and close paren around the piece I want to parse out. to parse numbers, you put \d+ which stands or any digit one or more times.
And then I put the closing square bracket but remembering to put the escape character in front of it: \]
I have attached the workflow. Hope that makes sense.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks, those descriptions are very useful.
If I have a combination of numbers and text within the square brackets, using the \d+ didn't parse, obviously.. a combination of numbers and text (e..g. AB101, 101AB) is proving to be tricky.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
The following should solve this for you too:
regex_replace([Field1],".*\[(.*)\].*",'$1')
Anything ... up to the LAST " [ " bracket found is SKIPPED.
This begins a GROUP #1.
Group 1 is everything up to the LAST " ] " bracket.
Everything else (if any) is SKIPPED.
Cheers,
Mark
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Since the workflow I sent uses the RegEx tool, you can use Mark's advice below and just change the digits (\d) to any character (.)
So this will work: .*\s\[(.*)\].*
the \s is not necessary since it is actually captured in the .*
Maureen
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thankyou, works like a charm!
