Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
Check out our powerful new search update! You can read more here. Please let us know if you have any feedback by creating a topic here.
SOLVED

Regex for Extraction

Highlighted

Hi Guys,

I have huge lines of data as below and I'm looking to extract the date, the immediate number following the date and the first letter after the number (C/P), into three separate columns. Can anyone suggest on how to go with this?

Thanks

 

 

E 2015-09-20 110.000 C A OPA

AAA 2013-02-15 22.500 P A OPA

BBBB 2017-04-18 52.000 C A OPA BBBBP1C181303300.U

CC 2020-07-19 195.000 C A OPA

DDDD 2011-03-15 70.000 P A OPA

Highlighted
Quasar

can you post the sample data as text, instead of a PNG attachment?

Highlighted

Hi @ChrisTX , I posted it to my query above.

Thanks

Asteroid

.*(\d{4}-\d{2}-\d{2})\s([\d\.]*).*([CP]).*

 

A RegEx explanation since @vissamsettyvivek requested -- posting to the thread in case others have a similar question:

 

.  -- Any Character

* -- Repeat 0 or many

     Find any character and repeat until next argument in RegEx String

 

\d Any digit

{4} - Exactly 4 repeating digits

{2} - Exactly 2 repeating digits

- means -

\s space

[\d\.*] -- find any digit or . and as many repeating digits or .  -- brackets mean [] any of the inside characters

\. means . since RegEx counts a . as any character you have to escape special character with a preceding /

[CP] -- find either a C or a P

() -- Capturing Group -- this tells the RegEx tool which character groups to extract into a column -- each () group goes into a separate column

 

You'll notice several instances of .* not in () -- you have to replicate the text using RegEx Syntax in order for it to work.  For example, the space between the date and number must be handled -- you can use \s for a space or a (.*)--wildcard   I prefer using a more specific RegEx Identifier if possible like the space to avoid unintended matching.

 

I use https://regexr.com/ to test more complicated situations, and save time of running through a workflow, then paste it into the tool.

 

Happy to help, since I had a similar question answered on the community many years ago which helped me learn RegEx.

 

Thanks.

 

 

 

 

Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

Hi, @vissamsettyvivek 

 

This should work:

 

REGEX_Replace([Field1], "^.*(\d{4}-\d{2}-\d{2}).*$", "$1")

 

Cheers,

Highlighted
Alteryx
Alteryx

@vissamsettyvivek 

 

At first look, a basic text-to-columns might be the easiest way to do this just using space as a delimiter, but could also RegEx it.

 

Made a quick sample that does both ways.

 

 

Highlighted

Thanks guys! It worked.

Labels