Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Intro to Regex video - the question mark

Meteoroid

The August 2018 Intro to Regex video uses a tokenize Regex containing a question mark, which seems to modify what comes after it rather than what comes before it, as the documentation states. The Regex is:

 

\d+:.+?PMC\d{7}

 

which scans a block of text to break it into separate lines, each beginning with a number and a colon and ending with the phrase PMC1234567 (or any seven digits). In the block of text to be parsed, each PMC1234567 is preceded by a PMC followed by some letters rather than seven digits. Why doesn't the Regex PMC\d{7} match PMC1234567 regardless of whether there's a random solo PMC somewhere earlier in the string? PMC by itself is not a match for PMC\d{7}.

 

The video talks about the question mark making the entire Regex phrase "greedy" or "lazy", but shouldn't PMC\d{7} be sufficient by itself to find PMC1234567? Does it need to be enclosed in some kind of special character to make this happen?

 

Thanks!

 

Alteryx Certified Partner

Hi, @Newt,

 

You're correct that to match PMC1234567 you only need PMC\d{7}.
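To check this outside Alteryx, here is a minimal Python sketch. The sample text is hypothetical (it is not the text from the video) and only assumes a PMC followed by letters appearing before the real ID, as described in the question.

```python
import re

# Hypothetical sample: "PMCID" is a PMC followed by letters, then the real ID.
text = "12: Smith J. Some title. PMCID: PMC1234567"

# PMC\d{7} requires exactly seven digits right after "PMC",
# so the "PMC" inside "PMCID" is skipped.
ids = re.findall(r"PMC\d{7}", text)
print(ids)  # ['PMC1234567']
```

Python's re module is not the same engine as the Alteryx RegEx tool, but this basic behavior should be the same in both.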

 

The lazy character (?) is usually placed after a .* or .+, so it modifies the quantifier that precedes it, and the idea is to get the smallest string possible. Without it, .* tends to grab the largest combination of characters possible. See the example below:

 

1Q23456789123456789Q

If you use .*Q, the match will be 1Q23456789123456789Q

If you use .*?Q, the match will be 1Q
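The same comparison can be reproduced in Python as a rough sketch. It uses re.search, which returns only the first match; other engines should agree on this simple case.

```python
import re

s = "1Q23456789123456789Q"

# Greedy: .* takes as much as it can, then backtracks to the last Q.
greedy = re.search(r".*Q", s).group()

# Lazy: .*? takes as little as it can, so it stops at the first Q.
lazy = re.search(r".*?Q", s).group()

print(greedy)  # 1Q23456789123456789Q
print(lazy)    # 1Q
```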

 

Best,

Fernando Vizcaino

Meteoroid

When I type .*?Q into Rubular.com, it returns no matches at all.

Alteryx Certified Partner

Hi @Newt ,

 

I've just tested it in Rubular, and it returns 2 different groups as expected.

[Screenshot: fmvizcaino_0-1581958313421.png]
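For anyone who wants to check this without a web tester, here is a hedged Python sketch. When all matches are collected (re.findall here, which is roughly what a tokenize step does), .*?Q yields two pieces. The second half applies the same idea to the pattern from the original question; the two-record block is made up for illustration and is not the text from the video.

```python
import re

s = "1Q23456789123456789Q"

# Collecting every match: the lazy pattern stops at each Q in turn.
pieces = re.findall(r".*?Q", s)
print(pieces)  # ['1Q', '23456789123456789Q']

# Hypothetical block with two records on one line.
block = "1: First title. PMCID: PMC1234567 2: Second title. PMCID: PMC7654321"

# Lazy .+? stops at the first PMC followed by seven digits, giving one token per record.
lazy_tokens = re.findall(r"\d+:.+?PMC\d{7}", block)

# Greedy .+ runs on to the last PMC followed by seven digits, giving one big token.
greedy_tokens = re.findall(r"\d+:.+PMC\d{7}", block)

print(len(lazy_tokens))    # 2
print(len(greedy_tokens))  # 1
```

So the question mark does not change what PMC\d{7} matches; it only changes how far the .+ before it is allowed to run.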

 

I suggest using Regex101; you can learn a lot more from that platform.

Best,

Fernando Vizcaino

 
