
Alteryx Designer Desktop Discussions

SOLVED

RegEx Parse help - RegEx not picking from the beginning of a line

Carolyn
12 - Quasar

I'm trying to do a RegEx Parse but I'm struggling to get RegEx to split it where I want. 

 

I have a string of letters, numbers, and underscores. Because of upstream manipulations, the only punctuation left is the underscore (_).

 

My lines will begin with 2-3 letters, then an underscore, then a single letter, then another underscore, followed by other stuff of varying length. (In theory the leading part could be 1 or 4+ characters, and it could contain numbers, but 2-3 letters is what I'm expecting.)

 

All I want to do is separate the leading part (2-3 letters, underscore, letter, underscore) into one group and everything else into another.

 

Examples:

  1. CC_M_Word_MORE_WORDS_January_2024_P1124 would become (CC_M_) and (Word_MORE_WORDS_January_2024_P1124)
  2. CCC_Q_DiffWord_Q4_Stuff_Things_Dated_C_R_NC_Other_Things_P1224 would become (CCC_Q_) and (DiffWord_Q4_Stuff_Things_Dated_C_R_NC_Other_Things_P1224)

 

I've come up with 10 different ways to do it for #1 and they all work beautifully. No matter what I try, I can't get #2 to behave. I thought the ^ at the beginning would do the trick, but no luck.

 

Help??

 

Output.png

4 REPLIES
BRRLL99
11 - Bolide

Please try this:

 

^([A-Za-z]{2,3}_[A-Za-z]_)(.*)
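
For anyone who wants to check this pattern outside of Designer, here is a minimal sketch in Python. It assumes Python's re module handles these simple character classes the same way the RegEx tool does, and the helper name split_prefix is made up for illustration.

```python
import re

# Pattern from the reply above: 2-3 letters, underscore, one letter,
# underscore, then everything else in a second group.
PATTERN = re.compile(r"^([A-Za-z]{2,3}_[A-Za-z]_)(.*)")


def split_prefix(text):
    """Return (prefix, rest), or None when the text does not match.

    Hypothetical helper, not part of Alteryx.
    """
    match = PATTERN.match(text)
    return match.groups() if match else None


examples = [
    "CC_M_Word_MORE_WORDS_January_2024_P1124",
    "CCC_Q_DiffWord_Q4_Stuff_Things_Dated_C_R_NC_Other_Things_P1224",
]

for line in examples:
    print(split_prefix(line))
# ('CC_M_', 'Word_MORE_WORDS_January_2024_P1124')
# ('CCC_Q_', 'DiffWord_Q4_Stuff_Things_Dated_C_R_NC_Other_Things_P1224')
```

Note that {2,3} rejects a line whose leading part is 1 or 4+ letters; swapping it for + would allow any length while still stopping at the first underscore.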

Carolyn
12 - Quasar

@BRRLL99 That worked - thank you!

 

The root of the issue was that my first \w wasn't working. Once I had yours, I played with it until I figured out which part was breaking mine. 

 

Any idea why this works

 

^([A-Za-z]+_[A-Za-z]_)(.*)

 

but this doesn't

 

^(\w+_[A-Za-z]_)(.*)

 

??

 

I'm so glad it was such an easy fix but also a bit irritated since I fought it for so long 😆

BRRLL99
11 - Bolide

The reason why ^([A-Za-z]+_[A-Za-z]_)(.*) works while ^(\w+_[A-Za-z]_)(.*) doesn't might be due to the difference in what \w matches compared to [A-Za-z].

In most regex implementations, \w matches any word character, which includes letters, digits, and underscores. However, [A-Za-z] specifically matches only uppercase and lowercase letters.

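So the greedy \w+ doesn't stop at the first underscore: it runs to the end of the line and then backs off only as far as the last place where _letter_ appears. In your second example that is _R_, so the first group ends up as CCC_Q_DiffWord_Q4_Stuff_Things_Dated_C_R_ and the second as NC_Other_Things_P1224. Your first example only has one single letter between underscores (_M_), which is why it looked fine there. [A-Za-z]+ can't match an underscore, so it has to stop after the leading letters.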
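
One way to see what is going on is to run both patterns side by side. This is a small Python sketch, assuming Python's re module treats \w and the greedy + the same way the RegEx tool does for these patterns. Because \w includes the underscore, \w+ can cross underscores and ends up anchored on the last _letter_ in the line, while [A-Za-z]+ has to stop at the first underscore.

```python
import re

# Same two patterns as in the question above.
LETTERS = re.compile(r"^([A-Za-z]+_[A-Za-z]_)(.*)")
WORD = re.compile(r"^(\w+_[A-Za-z]_)(.*)")

line1 = "CC_M_Word_MORE_WORDS_January_2024_P1124"
line2 = "CCC_Q_DiffWord_Q4_Stuff_Things_Dated_C_R_NC_Other_Things_P1224"

# line1 has only one single letter between underscores (_M_),
# so both patterns happen to agree.
print(LETTERS.match(line1).groups())
print(WORD.match(line1).groups())
# ('CC_M_', 'Word_MORE_WORDS_January_2024_P1124')  (both)

# line2 also contains _C_ and _R_ later on, so the greedy \w+
# stretches the first group all the way to the last one (_R_).
print(LETTERS.match(line2).groups())
# ('CCC_Q_', 'DiffWord_Q4_Stuff_Things_Dated_C_R_NC_Other_Things_P1224')
print(WORD.match(line2).groups())
# ('CCC_Q_DiffWord_Q4_Stuff_Things_Dated_C_R_', 'NC_Other_Things_P1224')
```

If digits need to be allowed in the leading part, [A-Za-z0-9]+ keeps the same behavior as [A-Za-z]+ here, since it still excludes the underscore.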

Carolyn
12 - Quasar

Thank you!! I didn't realize \w included the underscore. I was reading it as [a-zA-Z0-9] but completely reading over the underscore which is very clearly there! ACK! Thanks again!

 

2024-02-22_16-59-26.png
