Alteryx Designer Desktop Discussions

Capturing Uppercase in Strings with Regex

reignM
7 - Meteor

Hi! I have this combination of data. 

 

ABC Video

DEF

GHI

Video of JKL

(MNO)

PQR123 Video

 

What I wish to do is extract the parts of each string that are all in uppercase. Hence, the output should be:

ABC

DEF

GHI

JKL

MNO

PQR

 

I tried the expression ([A-Z\s]*) using RegEx_Match; however, it only matches the strings that consist entirely of uppercase letters. In this case,

DEF 

GHI

 

How do I ensure it captures all the uppercase strings, without the lowercase letters, special characters, or digits?

 

I would greatly appreciate your help.

 

 

6 REPLIES
clmc9601
13 - Pulsar

Hi @reignM,

 

Your regex pattern is on the right track! There are a variety of patterns you could use. If you use the regex string functions in the formula tool, here are two of my favorite ways to solve this use case.

 

The first regex pattern instructs Alteryx to replace every non-uppercase character (written as \U or [^A-Z]) with an empty string. In effect, this keeps only the uppercase characters. For "Video of JKL", it would keep "VJKL", so it depends on how you want it to behave. The last argument, 0, makes the pattern case sensitive. By default, most of the regex functions in Alteryx are case insensitive.

regex_replace([StringColumn], '\U','',0)
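
For anyone who wants to try the same idea outside Alteryx, here is a minimal Python sketch of the "delete everything that is not uppercase" approach. It is only an illustration, not the Alteryx formula itself: Python's re module has no \U class, so [^A-Z] is used instead, and the function name keep_uppercase is made up for this example. Python's re is case sensitive by default, so no extra flag is needed.

```python
import re

def keep_uppercase(text: str) -> str:
    """Delete every character that is not an ASCII uppercase letter."""
    # [^A-Z] stands in for Alteryx's \U (assumption: ASCII-only data).
    return re.sub(r"[^A-Z]", "", text)

samples = ["ABC Video", "DEF", "GHI", "Video of JKL", "(MNO)", "PQR123 Video"]
for s in samples:
    print(repr(s), "->", repr(keep_uppercase(s)))
```

Note that the capital "V" in "Video" survives, so "ABC Video" becomes "ABCV" and "PQR123 Video" becomes "PQRV", which is not quite the output asked for in the question.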

 

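The second pattern captures and returns the first run of two or more consecutive uppercase characters. The leading .*? is lazy so that the capture group starts at the beginning of that run; a greedy .* would leave only the last two letters of the last uppercase run for the group. This pattern can break more easily or may not give the right results if you have multiple matching groups in your string.

regex_replace([StringColumn], '.*?(\u{2,}).*','$1',0)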
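
The capture-group idea can be sketched in Python as well. This is an illustration under assumptions, not the Alteryx function: [A-Z] replaces \u because Python's re has no such class, \1 replaces $1, and the function names are hypothetical. The sketch also shows how a greedy leading .* differs from a lazy .*? in front of the group.

```python
import re

def first_upper_run_lazy(text: str) -> str:
    """Keep the first run of 2+ uppercase letters (lazy leading .*?)."""
    return re.sub(r".*?([A-Z]{2,}).*", r"\1", text)

def first_upper_run_greedy(text: str) -> str:
    """Same pattern with a greedy leading .*, shown for comparison."""
    return re.sub(r".*([A-Z]{2,}).*", r"\1", text)

samples = ["ABC Video", "DEF", "GHI", "Video of JKL", "(MNO)", "PQR123 Video"]
for s in samples:
    print(repr(s), "->", repr(first_upper_run_lazy(s)),
          "| greedy:", repr(first_upper_run_greedy(s)))
```

With the lazy version the six sample values come out as ABC, DEF, GHI, JKL, MNO and PQR. The greedy version gives "BC" for "ABC Video", because the leading .* consumes as much as it can and leaves the group only the minimum two letters.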

 

Like I mentioned before, there are a variety of ways to get a functional pattern in regex. I'd encourage you to experiment until you find one that works for you! Rubular is my favorite site for testing regex patterns. Heads up that regex functions are computationally expensive, so if you find another way to extract your desired strings, that will be easier on your computer with a large data set.

 

I hope this helps!

Qiu
20 - Arcturus

@reignM 
I tried another approach, maybe not as elegant as the one from @clmc9601.
First the string is split on spaces, then any non-letter characters (anything other than a-zA-Z) are removed, followed by a case-sensitive filter.

[Image: 0412-reignM.PNG]
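
The workflow in the screenshot is not reproduced here, so the following is only a rough Python sketch of the three steps as described in the text (split on spaces, strip non-letters, keep all-uppercase pieces). The function name uppercase_tokens and the choice to return a list are assumptions made for illustration; the actual Alteryx tools used may differ.

```python
import re
from typing import List

def uppercase_tokens(text: str) -> List[str]:
    """Split on spaces, strip non-letters, keep all-uppercase tokens."""
    # Step 1: break the string apart at each space.
    tokens = text.split(" ")
    # Step 2: remove anything that is not a letter (a-z or A-Z).
    cleaned = [re.sub(r"[^A-Za-z]", "", token) for token in tokens]
    # Step 3: case-sensitive filter, keep tokens made only of A-Z.
    return [token for token in cleaned if re.fullmatch(r"[A-Z]+", token)]

samples = ["ABC Video", "DEF", "GHI", "Video of JKL", "(MNO)", "PQR123 Video"]
for s in samples:
    print(repr(s), "->", uppercase_tokens(s))
```

Because "Video" contains lowercase letters, it is dropped as a whole token, so this approach avoids the stray "V" that a plain character-by-character replace leaves behind.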

JarekSkudrzyk
11 - Bolide

@Qiu @clmc9601 really nice solutions!
@reignM I would only add a link to another regex website that I find really useful: https://regex101.com/

 

reignM
7 - Meteor

@clmc9601

Thank you so much for the help and even going above and beyond to provide explanation and resources! I appreciate it very much. I hope you have a great day 😊

reignM
7 - Meteor

@Qiu 

Thank you for the different approach! 

reignM
7 - Meteor

@JarekSkudrzyk 

Appreciate the resources! What an amazing community!
