Alteryx Designer Knowledge Base

Definitive answers from Designer experts.
Don't forget to submit your entry for the Excellence Awards by October 30! | Need more information about the program? Check out the blog here

How to Trim Numbers: Regex to the Rescue


Let's say you have data like this and you'd like to remove the numbers at the end: 







You could use Left(x, len), like this: left([Data], (length([Data]) - 4))

(This calculates the length of the field, subtracts 4 and then takes that many characters from the left. E.g.: length(Blue4509) = 8, 8-4 = 4, left(Blue4509,4) = Blue)


Or a combination of ReverseString(Str) and Substring(x, start, length)ReverseString(substring(ReverseString([Data]),4,100))

(This reverses the string, uses a substring to start at the 5th position [positions are zero based], takes the next 100 characters, and then reverses it back. E.g.: reversestring(Blue4509) = 9054eulB, 5th position = e, next 100 characters = eulB, reversing it back results in Blue)



BUT what if your data looks like this: 



The above won't work anymore, but ReplaceChar(x, y, z) will: replacechar([Data], '0123456789','')

(ReplaceChar(x,y,z) returns the string [x] after replacing each occurrence of the character [y] with the character [z].)


Or Regex_Replace(string, pattern, replace,icase): regex_replace([Data],"\d","")

(Regex_Replace(string, pattern, replace,icase) returns the string after replacing the pattern with the replace. In this case, it replace all digits.)


Or the Regex Tool



\d in regular expression identifies digits (aka numbers)


 What if your data looks like this: 



and you want to preserve the number at the beginning of the string but remove the ones at the end? 


The $ character in regex denotes the end of the string, * identifies one or more instances of the preceding character, so \d*$ identifies one or more digits at the end of the string: regex_replace([Data], "\d*$","") or: 




 Also, check out all the other functions on the Alteryx help page, POSIX Extended Regular Expression Syntax, more info on the Regex Tool, and the Regex Cheat Sheet on the community. 

8 - Asteroid

Hi there --- this is so close to what I am trying to do but I'm struggling with getting the regex syntax correct. Here is what I am trying to convert: 






What I need is the following: 





I can get the last number removed but I cannot seem to get the "-" removed in either position. How should I tweak this formula {regex_replace([Data], "\d*$","")} to acheive my results? 


Thank you! 

8 - Asteroid

Hi there --- I actually resolved the message abovt but have another problem. I have data that looks like this: 


CRJS-9000-36/PPPA-9000A-36-Dissertation2016 Fall Qtr 08/29-11/20-PT27
HUMN-9001-48/COUN-8560-48/SOCW-9000-48-Dissertation2016 Fall Qtr 08/29-11/20-PT27


I need to remove everything that follows the section number (highlighted in red). As you can see, some will have two courses and others will have three courses so I cannot use LEN syntax. Since the information is words, numbers and a date along with the leading "-", I am really struggling with the regex syntax to remove that information. 


Any help you might toss my way would certainly be appreciated. 





@KAFord Kim,  is it just a matter of finding the section number preceeded and followed by a dash e.g. -\d\d-  ?   


This will parse the two parts of the larger string





8 - Asteroid

I will try that particularly since not all of the courses will have the word "dissertation" in them. In another file that is like this, the course name follows so while I found a way to remove the "dissertation" for the next file I'm working with I cannot. I actually used this using a filter: 


TRIM(Left([Course Name1], FindString([Course Name1], "-Dissertation")))


However, as you can see, it will only remove courses with that word so your method should take care of all of them.  I just started using Alteryx this week so I'm going through a steep learning curve particularly with creating the syntax for Regex. 


Thanks, Andy!!! 



8 - Asteroid

Great work with regex!  I've run into an issue where I've parsed out an address and in the StreetName there are stings like '5th Avenue Apt Apt Apt.'  I want to do a Replace(<String>, "Apt", "") but there are a few that have '6th Ave Apt G' or 'Apt610' or 'AptA.'  How do I move those with the 'Apt' and a letter or number over the the SuiteName and SuiteNumber fields?  Can I use regex? Or is it just a can of worms?