Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Extract String Before "." Character

rheepa
7 - Meteor

Hi All,

I'm trying to extract everything in a string up to the first period. For example:

"My string to extract."

I want to Parse "My string to extract" and output to a new column. In the RegEx Tool, I entered "^(.+?)..*" as my Regular Expression. I am only getting "M". From what I understand, ^ means start from the beginning, (.+?). means get everything up to the first period, and .* ends the statement. Can someone show me how to fix this? 

Thanks!

5 REPLIES 5
binuacs
20 - Arcturus

@rheepa  your regex expression ^(.+?).. your understanding is correct except for (.+?)

 

st Capturing Group (.+)
. matches any character (except for line terminators)
+? matches the previous token between one and unlimited times, as few times as possible, giving back as needed (lazy)
\. matches the character .

 

also when you use the special character like . you need to use the escape character '\.'

 

 

 

you can test your regex in the  https://regex101.com/

 

 

binuacs_0-1649884836676.png

 

Luke_C
17 - Castor

Hi @rheepa 

 

I wouldn't use regex. A simple string function should work fine. The findstring returns the location of the first period and then that location is used to dynamically feed into the left function.

 

left([Field1],findstring([Field1],'.'))

Luke_C_0-1649884963051.png

 

 

jbichachi003
8 - Asteroid

You're close...

 

Here is the expression you should use (but a better one is listed below):

^(.+?)\..*

 

The difference is the backslash I added (in red). Your solution isn't working because the dot is a metacharacter, not a literal character. Metacharacters serve specific functions within RegEx (the ^ and . and quantifiers such as +, ?, and *, are examples of metacharacters). In RegEx, the dot is a metacharacter that means any single character (in your example, your regular expression is returning "M" because the dot character is being used as the any character metacharacter instead of a literal dot, and is therefore matching the first character, the "M"). In order to convert a metacharacter into a literal character (which is what you want for the dot), you need to "escape it" with a backslash (another metacharacter).

 

Note, if you have a string that begins with a dot, the RegEx solution above won't work. Instead, you should use this solution:

^(.*?)\..*

 

The asterisk means we are matching 0 or more of the preceding character, and the ? means up to the first instance of the subsequent character (which for you is a literal dot as referenced by the backslash dot). In the example above (based on your original solution), the + is a quantifier that means match 1 or more of the preceding character, and the ? means up to the first instance of the subsequent character (which is a literal dot as referenced by the backslash dot).

 

In addition to this working on "My string to extract.", it will also work on ".Dot at the beginning and end." (which will result in an empty output). Not knowing your data, I don't know if there's even the possibility that your string will ever start with a dot, but at least you now have a solution that will work both ways!

 

RegEx tool configured to the Parse output methodRegEx tool configured to the Parse output method

 

I've attached the solution (created in Alteryx Designer, version 2020.4).

 

Hope this helps!

binuacs
20 - Arcturus

@jbichachi003 well explained !!!

rheepa
7 - Meteor

Thanks for the detailed reply. That helps a lot!

Labels