This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
Hello Alteryx Experts!
I know you will be able to help me find a solution to my issue.
I have very dirty data, very dirty data! But there is hope at the end of the tunnel.
I have a text field where I need to extract 3 numbers which occur together. Th first number will always be preceded with an "S", the next number by "O" and the last by "R".
The typical different formats will be:
There is text before the needed fields S:2 O:2 O:3 and there is text after the needed fields
There is text before the needed fields S: 2 O: 2 O: 3 and there is text after the needed fields
There is text before the needed fields S:2, O:2, O:3 and there is text after the needed fields
There is text before the needed fields S: 2, O: 2, O: 3, and there is text after the needed fields
There is text before the needed fields (S:2 O:2 O:3) and there is text after the needed fields
There is text before the needed fields (S: 2 O: 2 O: 3) and there is text after the needed fields
But there will always be a letter than a number.
Somebody please help me.
Thanks in advance :)
Go to Solution.
Go to Solution.
I suggest using the power of RegEx in the Formula tool to eliminate everything that isn't a numeric character. Try this formula:
This expression will replace everything that isn't a numeric [0-9] character with nothing ("")
Hello @ZoeM ,
If this are the only numbers in the filed you can use the regex tool with this expression:
and for the replace the following
This will give you the 3 numbers toghether in a field
@CharlieS nailed it. This works like a charm.
This would also work: REGEX_Replace([Field],"[^0-9]","")
That was my first stab and my formula was:
REGEX_Replace([Summary], "[\D]", "")
The problem is that there are also instances like:
There is text 123 before the necessary fields S: 2 S:2 O:3 and the rest 345 is also there...
So there would be numbers too in the text but only want the ones focused on S, O, and R
Maybe this is a little too much, but it works on the regex tool:
If there are more letters with the same parameters you can also use:
Tried that in the RegEx Tool and dint get the desired result.
Is there a way to specify the below:
Search for "S:" and get subsequent numerical value, "O:" and numerical value and "R:" and subsequent numerical value?
I think this gives what you're looking for
The regex is set to parse with this expression
.*(S(:|: )(\d+)).*(O(:|: )(\d+)).*(R(:|: )(\d+)).*
Giving each result in it's own field
I added one step to @danilang's solution: replacing any occurrence of "S:", "O:", or "R:" with a common "#:" so the sequence of "S","O", or "R" doesn't matter. This appears to work in all scenarios.
You must be from Louisiana because this is straight up magic!!!!!
i so so appreciate all of everyone's input!
The community succeeds again...