Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Gather all 9 clues to complete the final Weekly Challenge on Dec 16!

Learn More

RegEx - Extract a Variable Dollar Value N Characters from a Given Word


I tried this once before.  I took it down by "solving it", but it was not solved.  It was a hot mess.  I now better understand what I need.


I need to extract the dollar value of the string value that is immediately N characters from the word "awarded."  The string value will be >$1,000,000.  The text may include other values, but I only want to extract the first string value immediately N characters from "awarded."  


I do not know how to adjust for the variability in the number of characters or in the value of the award nor how to convert it to a dollar value from a string. 


I know it has to start. ^.*\bawarded\s. ...




Lorem ipsum dolor sit amet, consectetur adipiscing awarded an $8,069,336 elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Augue eget arcu dictum varius duis at consectetur lorem donec. Nunc scelerisque viverra mauris $28,069,336 in aliquam sem fringilla. 

 OR something like this

Lorem ipsum dolor sit amet, consectetur adipiscing awarded to my best friend $9,998,069,336 elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Augue eget arcu dictum varius duis at consectetur lorem donec. Nunc scelerisque viverra mauris $28,069,336 in aliquam sem fringilla.



A quick follow up.   This is the current expression.  \bawarded\s.*?([0-9]{7,})


I use a formula tool to remove string value's $ sign and commas.  The expression looks for a series of numbers >7 N characters after awarded.  


BUT, I am stuck on one small detail.  Some text may have a second occurrence of awarded.  How do I stop the expression after it finds the first awarded and value pair? 


AND while it works on, it actually does not work at all in Alteryx.  UGH. 


Hello @hellyars 


Are you open to a non-RegEx solution? Thought of this approach quickly, since you just want to pull numbers.


You can remove all text, punctuation, and extra spaces with the data cleansing. Then a few manipulations. A formula tool gives you only the first number in each row (and a filter tool makes sure the numbers are >=1000000







Please let me know if this works for you, because it's a single tool solution!


You can use a multi-field formula tool to get just the first set of numbers (and change the data type to a number(. For whatever reason, the Replace stops after the first number.


I suggest:

ToNumber(Replace(REGEX_Replace([Input], "^.*?\bawarded\b.*?\$([0-9,]{6,}).*?$|^.*()$", "$1"),",",""))

Breaking it down the core regex:


Matches the awarded amount. Using non-greedy ways to match means it should extract whole number correctly into $1


This part:



is a trick to mean that if the string does not match then $1 will be empty.


After this a traditional REPLACE is used to remove commas and then ToNumber to convert to a number


Sample attached.




Thank you.  This works, but I need it to key off the word awarded.   An awarded value will always exist.  If I can get this to work, I can apply it to other keyword value pairs that are less frequently used.  




Brilliant.  That works in one fell swoop.  


I tested it by adding multiple award + value pairs to the text.  It always matches the first pair.   It looks like ^.*? up front is the key to matching the first occurrence.  It also worked when I added it to the expression I had been toiling with earlier.   Why do you use the second \b in" \bawarded\b" ?



Alteryx Certified Partner
Alteryx Certified Partner

Hi @hellyars 


This is a shorter solution:

ToNumber(Replace(REGEX_Replace([Field1], ".*(?<!awarded)awarded.*?\$([\d,]+).*", "$1"), ",", ""))


Using negative lookbehind, you assure that you hit your first awarded word.

Then you use non-greedy till the first "$", and group what you want to capture (digits or commas - full number).

Replace commas by nothing and convert the expression to a number.




@jdunkerley79 got to this before I did, and his solution is great. Glad we were able to solve this for you!!!


You can put the ToNumber and Replace functions in the same formula tool, after the Regex_Replace. Although you can string both together, I find it easier to separate them.




I used the second \b to allow for something like 'awarded: blah' to match but ' awardeds ' would not.


Depends a lot on the input exactly what regex would be best but glad solution worked for you.