Extract String Between Last 2 Parantheses
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I use the formula below to pull out all the data between parentheses:
REGEX_Replace([Course Title], ".*\((.*)\).*", "$1")
Now I need to get only the data between the last two parentheses. See below, some of my original data has more than 2 parentheses, but the data I need will always be between the last two:
Original Data | Data I Need |
J (Q.) Smith ( QWERTY ) | QWERTY |
B (D.) Jones (ASDFGH) | ASDFGH |
R Johnson ( ZXCVBN45 ) | ZXCVBN45 |
P COLLINS ( LKJHGF ) | LKJHGF |
Z Brownfox ( DFGH234 ) | DFGH234 |
Solved! Go to Solution.
- Labels:
- Preparation
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hey @Rob48, there's a few expressions you could use for this - I've gone with something like this (using the Formula tool so you can also trim very easily, but have also left the RegEx tool option for you to see):
Trim(REGEX_Replace([Original Data], '.*\(([^)]+).*', '$1'))
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hey @Rob48 you can use this RegEx
REGEX_Replace([Original Data], '.*\((.*?)\).*', '$1')
Then use a data cleanse to remove any extra whitespaces
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
DataNath with it very fast - GOAT contributer.
I did it with a Regex-Parse and the expression:
.*\((.*)\)$
Mine requires a separate formula to trim so I like it less than the replace.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks DataNath the formula worked.
One more question if you don't mind. I have a hard time following RegEx logic, can you point out where in the formula the two parentheses are ID'd and the data is extracted? I'd like to keep this formula in my bag of tricks if I need it for other special characters or strings.
Trim(REGEX_Replace([Original Data], '.*\(([^)]+).*', '$1'))
Thanks!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Glad to hear that @Rob48 - the breakdown is as follows:
.*\( - Takes as many characters as necessary to get to the last occurrence of '(',
( - Starts the capture group
[^)]+ - Square brackets create a character lookup group, adding ^ to the start denotes 'is not a member of this group' and this part therefore spans one or more characters that aren't ')' i.e. everything until we hit ')'
) - End the capture group
.* - Rest of string
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
trying to do this again with different special characters but i'm lost - I can't figure out where my two parentheses are the represent the second to last and the last iteration. Can you recopy the formula but replace the target characters with "###"?
so that with data that looks like this:
ABC###XYZ###1234###QWERTY###THE ANSWER###ASDFASDF
will return:
THE ANSWER
(I'm assuming the formula would be the same whether it's a single # or a string of 3)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
hey @Rob48 you can use this expression to extract 'The ANSWER':
REGEX_Replace([Input], "^(?:[^#]+###){4}([^#]+)###.*$|.*#([^#]+)#.*", "$1$2")
![](/skins/images/1AD6617C767659D8A9F9801BEDFCFB20/responsive_peak/images/icon_anonymous_message.png)