hi everyone,
I need to clean up a phone number column which contains phone numbers but not only! :)
for the following cases, I need some help
- the name of the contact is preceeding the phone number. The length of the name is always different. the only similar case is that the phone number is preceeded with a blank space.
ex: Ms Marple 0176/123456789
- two phone numbers in the same cell separated with a dash or a single letter
thank you very much for your help
Myriam
Solved! Go to Solution.
Sorry, typo in the regex. It should look like this:
(.+\s)(\d+)[/\s](\d+)
One more correction - I misread the original post, if the numbers are separated by a '/' or a letter (not a space), the regex should look like this:
(.+\s)(\d+)[/[:alpha:]](\d+)
Hi David,
there is a little mistake in your regex: a ( or a ) is missing but I cannot see where :(
I wished I could write regex!!
If you have any tutorial to advise, please do so! :)
thank you
Myriam
Hi Myriam,
Attached is the Alteryx workflow. With the Regex parse tool, using the parse Output method I created 3 columns (each set of brackets represent a column). The logic works like this:
(.+\s) .+ means one or more of any character and \s is a space, so if you put them together it's a string of characters that ends in a space.
(\d+) means a sequence of one or more numbers
From your description, we want to exclude the separator character between the 2 numbers which can be a / or a letter, so you write that like this:
[/[:alpha:]] - the outer [] represents a set which is either a / or a letter which in this case is represented by [:alpha:]. Since there are no (), we're telling the parse tool to find this but not to parse it to a column.
Finally, we want the 2nd number parsed, which is done by (\d+)
There are some great tutorials and training material on the Alteryx community regarding regex (just type in regular expressions in the community search bar), and the down arrow on the right hand side of the regular expression input window expands to a list of regex commands.
The best thing to do is to play around with this yourself and you'll figure it out in no time.
Hi David,
thanks a lot for this regex.
This regex is splitting the filed into 3 parts
(i) it splits the name of the contact
(ii) then it splits the digits before the /
(iii) and copies the ending digits into the last field.
So that is fine when the content is "Ms Marple 0176/123456789"
but when the cell contains 2 phone numbers separated with a letter, then the regex is putting the first phone number + the letter into RegexOut1.
How can I manage to have the regex working only when the cell contains a name?
thanks a lot
myriam
Hi Myriam,
I don't get the same problem where the letter is part of RegExOut1 - please see below. Could your data contain something else in between the 2 numbers? In the formula tool you can use Regex_Match to test if the string starts with a name and based on the result, go one of 2 ways.
Please find workflow attached.
Hi David,
thank you so much for this recipe. It works for almost all the use cases I have in my file.
the strings that are still annoying me are:
0162/2870461 o. 0173/6456482
09071-54111 oder 0172/2470400
0151/12139881 - 0170/2257314
in these 3 cases I only retreive the second phone number, since the first one is considered as a name.
thank you
myriam