Extracting numbers from a string
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hello Alteryx Experts!
I know you will be able to help me find a solution to my issue.
I have very dirty data, very dirty data! But there is hope at the end of the tunnel.
I have a text field where I need to extract 3 numbers which occur together. Th first number will always be preceded with an "S", the next number by "O" and the last by "R".
The typical different formats will be:
There is text before the needed fields S:2 O:2 O:3 and there is text after the needed fields
There is text before the needed fields S: 2 O: 2 O: 3 and there is text after the needed fields
There is text before the needed fields S:2, O:2, O:3 and there is text after the needed fields
There is text before the needed fields S: 2, O: 2, O: 3, and there is text after the needed fields
There is text before the needed fields (S:2 O:2 O:3) and there is text after the needed fields
There is text before the needed fields (S: 2 O: 2 O: 3) and there is text after the needed fields
But there will always be a letter than a number.
Somebody please help me.
Thanks in advance :)
Solved! Go to Solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I suggest using the power of RegEx in the Formula tool to eliminate everything that isn't a numeric character. Try this formula:
REGEX_Replace([Field],"[^\d]","")
This expression will replace everything that isn't a numeric [0-9] character with nothing ("")
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hello @ZoeM ,
If this are the only numbers in the filed you can use the regex tool with this expression:
(.*)(\d)(.*)(\d)(.*)(\d)(.*)
and for the replace the following
$2$4$6
This will give you the 3 numbers toghether in a field
Cheers
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Excellent recommendation.
That was my first stab and my formula was:
REGEX_Replace([Summary], "[\D]", "")
The problem is that there are also instances like:
There is text 123 before the necessary fields S: 2 S:2 O:3 and the rest 345 is also there...
So there would be numbers too in the text but only want the ones focused on S, O, and R
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Maybe this is a little too much, but it works on the regex tool:
Reg expression:
(.*\:)(\s)?(\d)(.*\:)(\s)?(\d)(.*\:)(\s)?(\d)(.*)
Replacement text
$3$6$9
PD:
If there are more letters with the same parameters you can also use:
(.*S\:)(\s)?(\d)(.*O\:)(\s)?(\d)(.*R\:)(\s)?(\d)(.*)
Cheers
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Tried that in the RegEx Tool and dint get the desired result.
Is there a way to specify the below:
Search for "S:" and get subsequent numerical value, "O:" and numerical value and "R:" and subsequent numerical value?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @ZoeM
I think this gives what you're looking for
The regex is set to parse with this expression
.*(S(:|: )(\d+)).*(O(:|: )(\d+)).*(R(:|: )(\d+)).*
Giving each result in it's own field
Dan
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I added one step to @danilang's solution: replacing any occurrence of "S:", "O:", or "R:" with a common "#:" so the sequence of "S","O", or "R" doesn't matter. This appears to work in all scenarios.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
You must be from Louisiana because this is straight up magic!!!!!
i so so appreciate all of everyone's input!
The community succeeds again...
