Alteryx Designer Knowledge Base

Definitive answers from Designer experts.

How to Trim Numbers: Regex to the Rescue

HenrietteH
Alteryx
Alteryx
Created

Let's say you have data like this and you'd like to remove the numbers at the end:

Blue4509

Yellow2345

Orange2315

Blue6754

You could useLeft(x, len), like this:left([Data], (length([Data]) - 4))

(This calculates the length of the field, subtracts 4 and then takes that many characters from the left. E.g.: length(Blue4509) = 8, 8-4 = 4, left(Blue4509,4) = Blue)

Ora combination ofReverseString(Str) andSubstring(x, start, length):ReverseString(substring(ReverseString([Data]),4,100))

(This reverses the string, uses a substring to start at the 5th position [positions are zero based], takes the next 100 characters, and then reverses it back. E.g.: reversestring(Blue4509) = 9054eulB, 5th position = e, next 100 characters = eulB, reversing it back results in Blue)

BUT what if your data looks like this:

Blue4509
Yellow2345
Orange231
Blue6754
Green596828

The above won't work anymore, but ReplaceChar(x, y, z) will:replacechar([Data], '0123456789','')

(ReplaceChar(x,y,z)returns the string [x] after replacing each occurrence of the character [y] with the character [z].)

OrRegex_Replace(string, pattern, replace,icase):regex_replace([Data],"\d","")

(Regex_Replace(string, pattern, replace,icase) returns the string after replacing the pattern with the replace. In this case, it replace all digits.)

Or the Regex Tool:

regx1.PNG

\d in regular expression identifies digits (aka numbers)

What if your data looks like this:

1Blue4509
2Yellow2345
3Orange231
4Blue6754
5Green596828

and you want to preserve the number at the beginning of the string but remove the ones at the end?

The $ character in regex denotes the end of the string, * identifies one or more instances of the preceding character,so \d*$ identifies one or more digits at the end of the string:regex_replace([Data], "\d*$","") or:

regx2.PNG

Also, check outall the otherfunctions on the Alteryx help page,POSIX Extended Regular Expression Syntax, more info on the Regex Tool, and theRegex Cheat Sheeton the community.

Attachments
Comments
KAFord
8 - Asteroid

Hi there --- this is so close to what I am trying to do but I'm struggling with getting the regex syntax correct. Here is what I am trying to convert: 

 

COURS-8651-2

COURS-6651-20

COURS-7786F-1

 

What I need is the following: 

COURS8651

COURS6651

COURS7786F

 

I can get the last number removed but I cannot seem to get the "-" removed in either position. How should I tweak this formula {regex_replace([Data], "\d*$","")} to acheive my results? 

 

Thank you! 

KAFord
8 - Asteroid

Hi there --- I actually resolved the message abovt but have another problem. I have data that looks like this: 

 

CRJS-9000-36/PPPA-9000A-36-Dissertation2016 Fall Qtr 08/29-11/20-PT27
HUMN-9001-48/COUN-8560-48/SOCW-9000-48-Dissertation2016 Fall Qtr 08/29-11/20-PT27

 

I need to remove everything that follows the section number (highlighted in red). As you can see, some will have two courses and others will have three courses so I cannot use LEN syntax. Since the information is words, numbers and a date along with the leading "-", I am really struggling with the regex syntax to remove that information. 

 

Any help you might toss my way would certainly be appreciated. 

 

Thanks, 

--Kim

AndyM
Alteryx
Alteryx

@KAFord Kim,  is it just a matter of finding the section number preceeded and followed by a dash e.g. -\d\d-  ?   

 

This will parse the two parts of the larger string

 

2017-01-06_17-25-40.png

 

 

KAFord
8 - Asteroid

I will try that particularly since not all of the courses will have the word "dissertation" in them. In another file that is like this, the course name follows so while I found a way to remove the "dissertation" for the next file I'm working with I cannot. I actually used this using a filter: 

 

TRIM(Left([Course Name1], FindString([Course Name1], "-Dissertation")))

 

However, as you can see, it will only remove courses with that word so your method should take care of all of them.  I just started using Alteryx this week so I'm going through a steep learning curve particularly with creating the syntax for Regex. 

 

Thanks, Andy!!! 

 

 

brad_j_crep
8 - Asteroid

Great work with regex!  I've run into an issue where I've parsed out an address and in the StreetName there are stings like '5th Avenue Apt Apt Apt.'  I want to do a Replace(<String>, "Apt", "") but there are a few that have '6th Ave Apt G' or 'Apt610' or 'AptA.'  How do I move those with the 'Apt' and a letter or number over the the SuiteName and SuiteNumber fields?  Can I use regex? Or is it just a can of worms?

 

Thanks,

Brad