community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
Community v19.6

Looks aren't everything... But the latest Community refresh looks darn good!

Learn More
SOLVED

Extracting numbers from a string

Asteroid

Hello Alteryx Experts!

I know you will be able to help me find a solution to my issue. 

I have very dirty data, very dirty data! But there is hope at the end of the tunnel.

I have a text field where I need to extract 3 numbers which occur together. Th first number will always be preceded with an "S", the next number by "O" and the last by "R".

The typical different formats will be:

There is text before the needed fields S:2 O:2 O:3 and there is text after the needed fields

There is text before the needed fields S: 2 O: 2 O: 3 and there is text after the needed fields

There is text before the needed fields S:2, O:2, O:3 and there is text after the needed fields

There is text before the needed fields S: 2, O: 2, O: 3, and there is text after the needed fields

There is text before the needed fields (S:2 O:2 O:3) and there is text after the needed fields

There is text before the needed fields (S: 2 O: 2 O: 3) and there is text after the needed fields

 

But there will always be a letter than a number. 

 

Somebody please help me.

 

Thanks in advance :)

 

Alteryx Certified Partner

I suggest using the power of RegEx in the Formula tool to eliminate everything that isn't a numeric character. Try this formula:

 

REGEX_Replace([Field],"[^\d]","")

 

This expression will replace everything that isn't a numeric [0-9] character with nothing ("")

Alteryx Partner

Hello @ZoeM ,

 

If this are the only numbers in the filed you can use the regex tool with this expression:

 

(.*)(\d)(.*)(\d)(.*)(\d)(.*)

 

and for the replace the following

 

$2$4$6

 

This will give you the 3 numbers toghether in a field

 

Cheers

Alteryx Certified Partner
Alteryx Certified Partner

Hi @ZoeM 

@CharlieS  nailed it. This works like a charm.

This would also work: REGEX_Replace([Field],"[^0-9]","")

 

Cheers,

Asteroid

Excellent recommendation.

That was my first stab and my formula was:

 

REGEX_Replace([Summary], "[\D]", "")

 

The problem is that there are also instances like:

 

There is text 123 before the necessary fields S: 2 S:2 O:3 and the rest 345 is also there...

 

So there would be numbers too in the text but only want the ones focused on S, O, and R

 

Alteryx Partner

Maybe this is a little too much, but it works on the regex tool:

 

Reg expression:

(.*\:)(\s)?(\d)(.*\:)(\s)?(\d)(.*\:)(\s)?(\d)(.*)

Replacement text

$3$6$9

 

PD:

If there are more letters with the same parameters you can also use:

(.*S\:)(\s)?(\d)(.*O\:)(\s)?(\d)(.*R\:)(\s)?(\d)(.*)

 

Cheers

Asteroid

Tried that in the RegEx Tool and dint get the desired result.

Is there a way to specify the below:

 

Search for "S:" and get subsequent numerical value, "O:" and numerical value and "R:" and subsequent numerical value?

Highlighted
Aurora

Hi @ZoeM 

 

I think this gives what you're looking for

 

WF.png

 

The regex is set to parse with this expression

 

.*(S(:|: )(\d+)).*(O(:|: )(\d+)).*(R(:|: )(\d+)).*

 

Giving each result in it's own field

 

Results.png

 

Dan

Alteryx Certified Partner

I added one step to @danilang's solution: replacing any occurrence of "S:", "O:", or "R:" with a common "#:" so the sequence of "S","O", or "R" doesn't matter. This appears to work in all scenarios.

 

20190517-RegExParse.png

Asteroid

You must be from Louisiana because this is straight up magic!!!!!

i so so appreciate all of everyone's input!

 

The community succeeds again...

Labels