Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Regex

rohit782192
11 - Bolide

Hello Team, What the Regex for 29-December-1987. I am trying (\d{2})-(\w{.*})-(\d{4}) but getting False when do a Match. Mostly i am going wrong in Month.

9 REPLIES 9
binuacs
21 - Polaris

@rohit782192  If you want to specifically mention the character "-" use "\-" so it will treat as character.

(\d{2})\-(\w+)\-(\d{4})

binuacs_0-1649161394276.png

 

rohit782192
11 - Bolide

I want to know before the "-" sign also we need to put a "\"

MarqueeCrew
20 - Arcturus
20 - Arcturus

@rohit782192 ,

 

I don't know why you want to use RegEx to parse the date.  RegEx is an expensive and complex way to try to solve your need.  To separate the text to fields, you should use the - as a delimiter in a text to columns tool.  I think that @binuacs was too focused on the act of getting a solve to understand your challenge or to give you assistance to learn Alteryx.

 

The reading of dates in formats other than ISO 8601 standard (YYYY-MM-DD) requires the use of a DateTimeParse() function.  Date specifiers are used to explain the contents to the function.

 

https://help.alteryx.com/20214/designer/datetime-functions

 

%Y is a 4-digit year

%B is a complete month (%b is an abbreviated month)

%d is a day# of the month

 

 

DateTimeParse[YourDateField],"%d-%B-%Y")

 

 

If the date doesn't work, you'll see conversion errors in the log and the output will be Null()

 

Cheers,

 

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
rohit782192
11 - Bolide

I echo with you but we have to be mastered in everything.

 

First come only in mind was regex and then come the other tool.

MarqueeCrew
20 - Arcturus
20 - Arcturus

Try:

 

29-February-2022

 

 cheers,

 

 mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
rohit782192
11 - Bolide
DateFormatDateMonthYear
29-Dec-8729December1987
29-February-202229February2022
Aguisande
15 - Aurora
15 - Aurora

IMHO, "mastering everything" involves not only knowing how to use a tool. It also is about forward thinking, taking the best of the tools you have at your disposal.

This post is exactly the representation of it.

BTW, a real master, teaches others how to get to that master state.

rohit782192
11 - Bolide

That Awesome, I will try to use most of the Tool available in Alteryx.

 

Thanks for Support.

MarqueeCrew
20 - Arcturus
20 - Arcturus

@rohit782192 ,

 

Let's take a close look at your expression to find the cause of the problem:

 

 

Regex_Match([Date Field],"(\d{2})-(\w{.*})-(\d{4})")

 

This should return 0/False on your input string.  But if written properly, will return TRUE.

 

\w{.*} means something entirely different than you think.

 

Here is what it means (regex101.com):

\w matches any word character (equivalent to [a-zA-Z0-9_])
{ matches the character { with index 12310 (7B16 or 1738) literally (case sensitive)
. matches any character (except for line terminators)
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
} matches the character } with index 12510 (7D16 or 1758) literally (case sensitive)

 

if you wanted to match multiple word characters (using PERL form of RegEx) you can still use \w, but that includes numbers.  I think that you only want letters.  If you're case-insensitive, you'll be fine with \l or \u (lower or uppercase letters).  Or you can define the set of characters as an uppercase followed by lowercase letters (you'll turn case-sensitivity on).

 

Essentially you want to find:

 

A group of 2 numbers followed by a dash followed by a series of letters (letters only)  and let's assume that you only match if it is 3 or more total letters followed by a dash followed by exactly 4 numbers.

 

Regex_Match([Date Field],"(\d{2})-([a-z]{3,})-(\d{4})")

 

This is interpreted as (regex101.com):

1st Capturing Group (\d{2})
     \d matches a digit (equivalent to [0-9])
     {2} matches the previous token exactly 2 times
     - matches the character - literally
2nd Capturing Group ([a-z]{3,})
     Match a single character present in the list below [a-z]
     {3,} matches the previous token between 3 and unlimited times, as many times as 
     possible, giving back as needed (greedy)
     a-z matches a single character in the range between a and z
     - matches the character - literally
3rd Capturing Group (\d{4})
     \d matches a digit (equivalent to [0-9])
    {4} matches the previous token exactly 4 times

 

That is a more complete RegEx explanation for you.

 

Cheers,

 

Mark

 

@jarrod 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Labels