I'm new to regex and have tried to get this to work, but to no avail. I was hoping someone here could help.
I need to find items enclosed in single quotes in a string that fall into the following categories:
LL-LLLL-NNN
LLLL-NNNNN
LLLLL-NNNNN
LLLLLL-NNNNN
LLLLLLL-NNNNN
LLL-NNNNN
where L is a letter and N is a number.
Can someone help?
Thanks!
\d is a digit (0-9)
[[:alpha:]] is a letter
* means 0 or more of them
+ means 1 or more of them
. means any single character (letter, number, dash, whatever)
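As a rough illustration of the tokens above, here is a small Python sketch. Python's re module does not support the POSIX class [[:alpha:]], so [A-Za-z] is used as a stand-in; the helper name matches is made up for this example and is not part of Alteryx.

```python
import re

# Assumption: [A-Za-z] stands in for [[:alpha:]], which Python's re lacks.
LETTER = r"[A-Za-z]"


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


print(matches(r"\d", "7"))        # True: a single digit
print(matches(LETTER, "Q"))       # True: a single letter
print(matches(LETTER + "*", ""))  # True: * accepts zero letters
print(matches(LETTER + "+", ""))  # False: + needs at least one letter
print(matches(r".", "-"))         # True: . accepts any single character
```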
I think you can get what you want with a regular formula (though you could also use the RegEx tool):
regex_match([Field1],'[[:alpha:]]+.*[[:alpha:]]*-\d+')
It looks like you can have multiple groups of letters before your numbers, which is why I allowed for the extra .* and [[:alpha:]]* again. That way it works whether or not the extra groups of letters are there.
A result of -1 means TRUE (the string matched); 0 means FALSE.
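To see how the formula's pattern behaves, here is a hedged Python approximation. It assumes the pattern must cover the whole field (hence re.fullmatch) and swaps [[:alpha:]] for [A-Za-z]; the sample values are invented to fit the layouts listed in the question.

```python
import re

# Python approximation of the formula's pattern ([A-Za-z] replaces [[:alpha:]]).
PATTERN = re.compile(r"[A-Za-z]+.*[A-Za-z]*-\d+")

# Invented sample values, one per layout from the question.
samples = [
    "AB-CDEF-123",
    "ABCD-12345",
    "ABCDE-12345",
    "ABCDEF-12345",
    "ABCDEFG-12345",
    "ABC-12345",
]

for value in samples:
    # fullmatch requires the entire value to fit the pattern.
    print(value, PATTERN.fullmatch(value) is not None)

# A value with trailing text does not match as a whole field.
print(PATTERN.fullmatch("Item 'ABCD-12345' shipped") is not None)  # False
```

Every sample prints True, while the last line prints False because the text after the digits is not covered by the pattern.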
The string may have other information before and after, so I would only want what matches. How do I do that?
I did copy that same code into the REGEX Parse Tool and it worked great. Thanks!
Sorry, just saw this now. Looks like you got it to work in the RegEx tool. If you chose Parse mode, which is somewhat similar to Tokenize, you can split a string into as many substrings as you want, of any length. You tell it how by putting each "output field" (called a marked group) in parentheses.
So if you could have a bunch of stuff after it, I think the Parse mode expression would just be:
([[:alpha:]]+.*[[:alpha:]]*-\d+).*
That extra .* on the end, outside your "marked group", is NOT marked, which means it won't be output as a field. It allows anything else to come after our string and ignores it, keeping only what's within the ( ).
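The marked-group idea can be sketched in Python, where group(1) plays the role of the output field. This assumes the expression is searched for anywhere in the input (re.search) and again uses [A-Za-z] in place of [[:alpha:]]; the function name parse and the sample strings are hypothetical.

```python
import re

# The parentheses mark the group to keep; the trailing .* is matched but discarded.
PARSE = re.compile(r"([A-Za-z]+.*[A-Za-z]*-\d+).*")


def parse(text: str):
    """Return the marked group, or None when nothing matches."""
    found = PARSE.search(text)
    return found.group(1) if found else None


print(parse("ABCD-12345 and more text"))   # ABCD-12345
print(parse("Item 'ABCD-12345' shipped"))  # Item 'ABCD-12345
```

The second call shows a limitation: trailing text is dropped, but because the group starts at the first letter and .* accepts anything, text in front of the code is kept.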
It looks like I get all the extraneous information before the single quotes, as well as items outside single quotes, which I don't want. I've attached a few lines of the data that may make this clearer, showing the "Desired Output" alongside the "RegEx Output".
Thanks for your help.
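One possible way to keep only the quoted items is to anchor the expression on the single quotes and spell out the letter and digit counts. The Python sketch below is untested against the attached data; it assumes the layouts are exactly two letters, four letters and three digits, or three to seven letters followed by five digits, and the name quoted_codes and the sample string are made up for illustration.

```python
import re

# Keep only codes wrapped in single quotes; the quotes themselves are not captured.
QUOTED_CODE = re.compile(
    r"'("
    r"[A-Za-z]{2}-[A-Za-z]{4}-\d{3}"  # LL-LLLL-NNN
    r"|[A-Za-z]{3,7}-\d{5}"           # LLL-NNNNN up to LLLLLLL-NNNNN
    r")'"
)


def quoted_codes(text: str) -> list:
    """Return every quoted code in text, in order of appearance."""
    return QUOTED_CODE.findall(text)


line = "Ship 'ABCD-12345' and 'XY-ABCD-123', not ABCDE-54321 or 'hello'"
print(quoted_codes(line))  # ['ABCD-12345', 'XY-ABCD-123']
```

In a tool that supports POSIX classes, [A-Za-z] could presumably be swapped back to [[:alpha:]].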