Hi experts,
New to Alteryx and have been tasked with cleaning up some data that is more complicated than I expected.
We have Type recorded in a string of user details and I need to pull out the type value (4 or 5 digits) into a new column. The problem is that Type has been entered with absolutely no consistency in the field.
As you can see, sometimes it is Type with no space, Type:, Type# or even my favorite, Type #= .
I've been trying to go about it with RegEx but I'm not getting anywhere :)
thanks in advance!
Solved! Go to Solution.
Hey @CraigF
Use a normal Formula tool and create a new Int field with this expression:
TONUMBER(iif(regex_match([Field1], ".*?TYPE.*(\d\d\d\d\d).*"),REGEX_Replace([Field1], ".*?TYPE.*?(\d\d\d\d\d).*", "$1"),REGEX_Replace([Field1], ".*?TYPE.*?(\d\d\d\d).*", "$1")))
Basically, it is saying if there are 5 digits in a row after the word TYPE, use those 5 digits, otherwise use the 4 digits after the word TYPE.
Hope this helps!
hi @CraigF
Great solution from @Kenda.
Mine's a little different in that it looks for "Type" followed non-digits followed by digits
(Type[^\d]+[\d]+)
I then use another Regex to pull the digits from this. I know that there is a way to search within the previous results, and combine the 2 regex tools into one, but my regex-FU is not strong enough. Maybe @Thableaus can help here.
Dan
Hi @danilang - my help here:
I think REGEX tool with Parse Method would work with this single expression:
.*?(?:\btype\W*)(\d+).*
This would bring any sequence of digits after the word type (uppercased or not since we are considering case insensitivity here) and a non-word character 0 or more times (that would be all of the symbols or spaces that the author of the topic mentioned).
Cheers,
quick and dirty, I like it, thank you).
Going to play with everyone elses solution later when I have more time as well to continue my learning!