
Alteryx Designer Desktop Discussions

SOLVED

Using REGEX to find specific style of strings

NeilFisk
9 - Comet

I'm new to REGEX and have tried to get this to work, but to no avail.  Was hoping someone here could help.

 

I need to find items placed in single quotes in a string that fall into the following categories:

  LL-LLLL-NNN

  LLLL-NNNNN

  LLLLL-NNNNN

  LLLLLL-NNNNN

  LLLLLLL-55555

  LLL-NNNNN

where L is a letter and N is a number.

 

Can someone help?

 

Thanks!

5 REPLIES
mst3k
11 - Bolide

\d is a single digit (0-9)

[[:alpha:]] is a single letter

* means 0 or more of whatever comes right before it

+ means 1 or more of whatever comes right before it

. means any single character (letter, number, dash, whatever)

i think you can get what you want with a regular formula (though you could also use the Regex Tool)

 

regex_match([Field1],'[[:alpha:]]+.*[[:alpha:]]*-\d+')

 

it looks like you can have multiple groups of letters before your numbers, which is why i allowed for the extra .* and [[:alpha:]]* again. then it will work whether extra groups of letters are there or not.
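For readers who want to try the pattern outside Alteryx, here is a minimal Python sketch of the same idea. It rests on two assumptions not stated in the thread: that the match has to cover the whole field value (modeled with `fullmatch`), and that `[A-Za-z]` is an acceptable stand-in for `[[:alpha:]]`, which Python's `re` module does not support. The function name `matches` and the sample strings are made up for illustration.

```python
import re

# Assumption: [A-Za-z] stands in for the POSIX class [[:alpha:]],
# which Python's re module does not support.
PATTERN = re.compile(r"[A-Za-z]+.*[A-Za-z]*-\d+")


def matches(value: str) -> bool:
    """Return True when the whole value fits the pattern.

    fullmatch is used on the assumption that the formula above
    has to match the entire field, not just part of it.
    """
    return PATTERN.fullmatch(value) is not None


# Hypothetical sample values, not taken from the original data.
for sample in ["AB-CDEF-123", "ABC-12345", "12345", "ABC-12345 extra", "A!!!-1"]:
    print(f"{sample!r}: {matches(sample)}")
```

The last sample shows how loose the pattern is: because `.*` accepts any characters, a value like `A!!!-1` also passes, and the pattern does not check how many letters or digits there are.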

 

in the output, -1 means TRUE (the string matched) and 0 means FALSE

 


 

NeilFisk
9 - Comet

The string may have other information before and after, so I would only want what matches.  How do I do that?

NeilFisk
9 - Comet

I did copy that same code into the REGEX Parse Tool and it worked great.  Thanks!

 


mst3k
11 - Bolide

sorry just saw this now. looks like you got it to work in the regex tool. if you chose Parse mode which is kind of similar to tokenize, you can split up a string into however many substrings you want, of any length, and you tell it that by putting each "output field" (called a marked group) in parentheses.

so if you could have a bunch of stuff after it, i think the Parse mode formula would just be:

 

([[:alpha:]]+.*[[:alpha:]]*-\d+).*

 

 

that extra .* on the end, outside of your "marked group" means it's NOT marked, which basically means it won't be "output" as a field. but it allows for anything else to come after our string, and it will ignore it, and only "keep" what's within the ( )
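A rough Python sketch of the marked-group idea, for anyone who wants to experiment outside the RegEx tool. It assumes `[A-Za-z]` in place of `[[:alpha:]]` (Python's `re` has no POSIX classes) and uses `search`, since the thread does not say whether Parse mode anchors at the start of the string. The helper name `parse` and the sample strings are hypothetical.

```python
import re
from typing import Optional

# Same pattern as above, with [A-Za-z] assumed in place of [[:alpha:]].
# Group 1 is the "marked group"; the trailing .* is matched but not kept.
PARSE = re.compile(r"([A-Za-z]+.*[A-Za-z]*-\d+).*")


def parse(value: str) -> Optional[str]:
    """Return the marked group, or None when the pattern does not match."""
    m = PARSE.search(value)
    return m.group(1) if m else None


# Hypothetical sample values, not taken from the original data.
print(parse("ABC-12345 trailing text"))   # trailing text is dropped
print(parse("see 'ABC-12345' and more"))  # leading text is NOT dropped
print(parse("no code here"))              # no match
```

Text after the code is discarded as described, but anything before it, starting at the first letter, ends up inside the marked group because the leading `[A-Za-z]+.*` is greedy. So when the field has other text in front of the code, this pattern will return more than the code itself.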

 

 

NeilFisk
9 - Comet

It looks like I get all the extraneous information prior to the single quotes, as well as items outside single quotes, which I don't want.  I've attached a few lines of the data that may make more sense, where I have the "Desired Output" and what the "RegEx Output" looks like.

 

Thanks for your help.
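One possible way to tighten the pattern, sketched in Python rather than Alteryx and not taken from any reply in this thread: require the surrounding single quotes and spell out the letter and digit counts from the original list. This assumes the six formats reduce to either 2 letters, 4 letters and 3 digits, or 3 to 7 letters followed by 5 digits (reading `LLLLLLL-55555` as seven letters plus five digits). The attached data is not reproduced here, so the sample line and the name `extract_codes` are invented for illustration.

```python
import re
from typing import List

# Assumed reading of the six formats in the question:
#   LL-LLLL-NNN              -> [A-Za-z]{2}-[A-Za-z]{4}-\d{3}
#   LLL..LLLLLLL-NNNNN       -> [A-Za-z]{3,7}-\d{5}
# The quotes are required but sit outside the capture group,
# so only the code itself is returned.
QUOTED_CODE = re.compile(r"'([A-Za-z]{2}-[A-Za-z]{4}-\d{3}|[A-Za-z]{3,7}-\d{5})'")


def extract_codes(text: str) -> List[str]:
    """Return every single-quoted code in text that fits one of the formats."""
    return QUOTED_CODE.findall(text)


# Hypothetical sample line, not from the attached data.
line = "Order 'ABCD-12345' ships with 'XY-ABCD-123'; ignore ZZZ-99999 and 'hello'."
print(extract_codes(line))
```

Unquoted codes and quoted text in other shapes are skipped, and text before or after the quotes never enters the result. Whether the same pattern carries over unchanged to the RegEx tool has not been verified here.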
