Alteryx Designer Desktop Discussions

SOLVED

RegEx Parse help - RegEx not picking from the beginning of a line

Carolyn
8 - Asteroid

I'm trying to do a RegEx Parse but I'm struggling to get RegEx to split it where I want. 

 

I have a string of letters, numbers, and underscores. Because of upstream manipulations, I don't have any punctuation besides the underscore (_).

 

Each line begins with 2-3 letters (in theory there could be 1 or 4+, and numbers could show up even though I don't expect them), then an underscore, then a single letter, then another underscore, followed by other stuff of varying length.

 

All I want to do is separate the prefix (2-3 letters, underscore, letter, underscore) into one group and everything else into another.

 

Examples:

  1. CC_M_Word_MORE_WORDS_January_2024_P1124 would become (CC_M_) and (Word_MORE_WORDS_January_2024_P1124)
  2. CCC_Q_DiffWord_Q4_Stuff_Things_Dated_C_R_NC_Other_Things_P1224 would become (CCC_Q_) and (DiffWord_Q4_Stuff_Things_Dated_C_R_NC_Other_Things_P1224)

 

I've come up with 10 different ways to do it for #1 and they all work beautifully. No matter what I try, I can't get #2 to behave. I thought the ^ at the beginning would do the trick, but no luck.

 

Help??

 


4 REPLIES
BRRLL99
11 - Bolide

Please try this:

 

^([A-Za-z]{2,3}_[A-Za-z]_)(.*)
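
For anyone who wants to check this pattern outside Alteryx, here is a minimal Python sketch run against the two examples from the question. It assumes Python's `re` module treats this pattern the same way the RegEx tool does, which should hold for plain character classes like these. The helper name `split_prefix` and the four-letter test string are made up for illustration.

```python
import re

# Pattern from the reply above: 2-3 letters, underscore, one letter, underscore, then the rest.
PREFIX_PATTERN = re.compile(r"^([A-Za-z]{2,3}_[A-Za-z]_)(.*)")

def split_prefix(text):
    """Return (prefix, remainder), or None if the line does not start with the expected prefix."""
    match = PREFIX_PATTERN.match(text)
    return match.groups() if match else None

print(split_prefix("CC_M_Word_MORE_WORDS_January_2024_P1124"))
print(split_prefix("CCC_Q_DiffWord_Q4_Stuff_Things_Dated_C_R_NC_Other_Things_P1224"))

# Hypothetical 4-letter prefix: {2,3} rejects it, so the line is not parsed at all.
print(split_prefix("ABCD_X_Rest"))
```

Note that `{2,3}` is strict: a line with a 1-letter or 4+ letter prefix will not match. If those cases need to be handled, `[A-Za-z]+` is the looser choice.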

Carolyn
8 - Asteroid

@BRRLL99 That worked - thank you!

 

The root of the issue was that my first \w wasn't working. Once I had yours, I played with it until I figured out which part was breaking mine. 

 

Any idea why this works

 

^([A-Za-z]+_[A-Za-z]_)(.*)

 

but this doesn't

 

^(\w+_[A-Za-z]_)(.*)

 

??

 

I'm so glad it was such an easy fix but also a bit irritated since I fought it for so long 😆

BRRLL99
11 - Bolide

The reason why ^([A-Za-z]+_[A-Za-z]_)(.*) works while ^(\w+_[A-Za-z]_)(.*) doesn't might be due to the difference in what \w matches compared to [A-Za-z].

In most regex implementations, \w matches any word character, which includes letters, digits, and underscores. However, [A-Za-z] specifically matches only uppercase and lowercase letters.

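So in your strings, \w+ can match straight through the underscores (and any digits), while [A-Za-z]+ has to stop at the first underscore. Because + is greedy, \w+ grabs as much as it can and only backs off far enough for _[A-Za-z]_ to fit, so the first group ends at the last "underscore, single letter, underscore" in the line instead of the first. In #1 the only such spot is _M_, so both patterns give the same result. In #2 the last one is _R_ (in ..._Dated_C_R_NC_...), so the first group swallows everything up to and including it.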
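
The difference is easy to reproduce with a short Python sketch. This assumes Python's `re` module handles `\w` and greedy matching the same way as the RegEx tool for these patterns. The helper name `first_group` is made up for illustration, and the lazy `\w+?` variant is an extra suggestion that was not part of the thread.

```python
import re

LETTERS_ONLY = re.compile(r"^([A-Za-z]+_[A-Za-z]_)(.*)")
WORD_CHARS = re.compile(r"^(\w+_[A-Za-z]_)(.*)")        # \w also matches digits and "_"
LAZY_WORD_CHARS = re.compile(r"^(\w+?_[A-Za-z]_)(.*)")  # +? stops at the first fit (not from the thread)

EXAMPLE_1 = "CC_M_Word_MORE_WORDS_January_2024_P1124"
EXAMPLE_2 = "CCC_Q_DiffWord_Q4_Stuff_Things_Dated_C_R_NC_Other_Things_P1224"

def first_group(pattern, text):
    """Return the first capture group, or None if the pattern does not match."""
    match = pattern.match(text)
    return match.group(1) if match else None

for text in (EXAMPLE_1, EXAMPLE_2):
    print(first_group(LETTERS_ONLY, text))     # stops at the first underscore
    print(first_group(WORD_CHARS, text))       # greedy: runs to the last _letter_
    print(first_group(LAZY_WORD_CHARS, text))  # lazy: stops at the first _letter_
```

With the greedy `\w+`, example #2 ends its first group at `_R_` rather than `_Q_`, which is the behavior described in the question. The lazy version would also let digits into the prefix, so it only makes sense if that is acceptable.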

Carolyn
8 - Asteroid

Thank you!! I didn't realize \w included the underscore. I was reading it as [a-zA-Z0-9] but completely reading over the underscore which is very clearly there! ACK! Thanks again!

 

2024-02-22_16-59-26.png
