Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

20 Token RegEx Expressions For The Newbie

MarqueeCrew
20 - Arcturus
20 - Arcturus
Created
PRODUCT: Alteryx Designer
VERSION:2018.1
LAST UPDATE:05/09/2018

ATTACHMENT:Regex Introduction.yxmd


Your RegEx Journey Starts Here

 

My journey with Regular Expressions (RegEx) started because I was lazy. I realized that substring, findstring and other searches were difficult to configure when my data wasn't well structured. As I began working with more and more data I found that the time spent in learning about RegEx was saving me loads of time and was getting me cleaner data. I didn't learn it all overnight. Sometimes I needed help and I did ask for it.

 

I see many questions in the Alteryx Community regarding patterns. Some of them I answer with RegEx and some without. Just the other day I used both the contains()function as well as the RegEx replace in the same expression for a post. The pattern used was not for a beginner and I apologized for not being able to adequately explain it. I am self-taught and sometimes I just know how to write something (learned from trial and error) and explaining the expression was difficult for me. In the following video I show you ways to identify basic target data (Tokens) and use a variety of formula replacements to demonstrate how the token finds matches throughout the string.

 

The following 20 minute video walks you through some simple replacements and hopefully provides you with the courage to explore your own data with RegEx at your side. If you want a quick reference, you can use RegEx101 to help you. I explain in the video that I prefer to use Alteryx designer, but the choice is yours to make. There are training opportunities here in community and at Inspire for you as well.

 

 

Cheers,

Mark

Attachments
Comments
pushkardps
7 - Meteor

Very helpful. Thanks for creating this MarqueeCrew.

MarqueeCrew
20 - Arcturus
20 - Arcturus

What would you like to see next?  

pushkardps
7 - Meteor

I am fresh out of questions at the moment.

I am pretty sure I will run into some snags anytime soon.

I will keep you posted.

Thanks again.

 

Regards,

Pushkar 

 

P.S.  On second thought, do you blog? 

MarqueeCrew
20 - Arcturus
20 - Arcturus
saurabh79
7 - Meteor

This is so helpful. Thank you immensely Mark. 

pushkardps
7 - Meteor

Hello @MarqueeCrew,

 

I am trying to use RegEx_Replace function in Formula tool to extract the month from a string.

But this function works more like - let us remove whatever we don't need.

 

I looked for a different function but did not find any.

I guess I should spend more time on writing better logic.

Any thoughts?

image.png

MarqueeCrew
20 - Arcturus
20 - Arcturus

@pushkardps,

 

Does this video help?  

 

 

You can also subscribe and follow me on YouTube.  If you post your sample data and desired output, I or someone will quickly reply.  It might be that there is a datetimeformat() solution too.

 

Cheers,

Mark

pushkardps
7 - Meteor

Aaah I understand where I went wrong @MarqueeCrew, the problem I am working on is a part of the weekly challenge. 

I have already seen your solution to that challenge but wanted to come up with something by myself. This helped a lot.

Thanks a ton. I will follow the youtube channel and post queries there going forward.

Spartucus46
7 - Meteor

Hello,

 

I'm trying to utilize a regex formula to help me extrapolate a CUSIP from transactional data. The data structure and type won't always be similar to this data set, but it should be relatively close. The CUSIP should be no more than 9 characters (usually 9 sometimes less, but never more than 9) either numeric or alphanumeric.

 

I have attached some sample data transactions in column A and the CUSIP I would need from column A in column B.

 

TRANSACTION DATACUSIP NEEDED
NO DATACASH RECEIPTTRAN # CASH DIVIDEND-LADDER CORP-505743104EX DATE:12/7/18, PAY DATE:1/24/19RATE: 0.422942505743104
NO DATACASH RECEIPTTRAN # STOCK DIVIDEND-505743104-LADDER CORPORATIONEX DATE:12/7/18;PAYDATE:1/24/19RATE:0.147058505743104
NO DATACASH RECEIPTTRAN # $0.0237/SHARE ON 8 SHARES DUE 1/25/1900776M517: ADVISORS DISCIPLINED TR00776M517
NO DATACASH RECEIPTTRAN # $0.166/SHARE ON 65 SHARES DUE 04/01/19115637209BROWN FORMAN CORP CL B115637209
NO DATACASH RECEIPTTRAN # $0.22/SHARE ON 70 SHARES DUE 04/01/19654106103NIKE INC CLASS B654106103
NO DATACASH RECEIPTTRAN # $0.47/SHARE ON 89 SHARES DUE 01/16/19857477103STATE STREET CORP857477103
NO DATACASH RECEIPTTRAN # $0.19/SHARE ON 956 SHARES DUE 01/23/1920030N101COMCAST CORP CL A20030N101
NO DATACASH RECEIPTTRAN # $0.8675/SHARE ON 65 SHARES DUE 12/31/19755111507RAYTHEON CO755111507
NO DATACASH RECEIPTTRAN # $1.1/SHARE ON 59 SHARES DUE 02/11/19009158106AIR PRODS & CHEMS INC009158106
NO DATACASH RECEIPTTRAN # $0.9/SHARE ON 100 SHARES DUE 03/12/19478160104JOHNSON & JOHNSON478160104
NO DATACASH RECEIPTTRAN # $0.4/SHARE ON 10 SHARES DUE 03/29/19737630103POTLATCHDELTIC CORPORATION737630103
NO DATACASH RECEIPTTRAN # OPTIONAL DIVIDEND STOCK PAYMENTRETAIL-76133Q10276133Q102
NO DATACASH RECEIPTTRAN # $0.7925/SHARE ON 59 SHARES DUE 01/14/1992276F100VENTAS INC92276F100
NO DATACASH RECEIPTTRAN # $0.43/SHARE ON 445 SHARES DUE 01/15/1921871N101CORECIVIC INC21871N101

 

Any assistance would be greatly appreciated.

MarqueeCrew
20 - Arcturus
20 - Arcturus

@Spartucus46,

 

Since you asked for "any" assistance, here is a suggestion:

 

REGEX_Replace([TRANSACTION DATA], ".*(\d\w{7}\d).*", '$1')

 

It works with your 9 digit CUSIP values as long as they begin and end with a number.  Sometimes you need to setup your data first by identifying multiple patterns.  You can use a regex match function to look for one pattern and see if it matches, then use one formula there and another formula for say 10 digit values.  Of course, for that example, I would check for the larger first.

 

Cheers,

 

Mark 

Spartucus46
7 - Meteor

Hello,

 

This was very helpful and worked flawlessly. However, I've been trying to tweak your original REGEX for use on the sample below, but I'm not getting the desired results. This kind of data the CUSIP does not have to be 9 characters long and could begin with either a letter or number, but should end with a number.

 

 

TRANSACTION DATACUSIP NEEDED
DIVIDEND @ 0.0568293 USD/AUD FX RATE 0.7077EX DATE: 12/28/2018 PAY DATE: 03/04/2019BY7QXS7 : VICINITY CENTRESBY7QXS7
DIVIDEND @ 0.001658 USD/AUD FX RATE 0.7138EX DATE: 12/28/2018 PAY DATE: 02/28/20196161978 : MIRVAC GROUP6161978
DIVIDEND @ 0.2521019 USD/AUD FX RATE 0.7138EX DATE: 12/28/2018 PAY DATE: 02/28/2019B033YN6 : DEXUSB033YN6
DIVIDEND @ 0.0628857 USD/AUD FX RATE 0.7138EX DATE: 12/28/2018 PAY DATE: 02/28/20196365866 : G P T GROUP6365866
DIVIDEND @ 73 USD/JPY FX RATE 108.17EX DATE: 11/13/2018 PAY DATE: 01/10/2019B0MKZN5 : TSURUHA HOLDINGS INCB0MKZN5
DIVIDEND @ 7.767109 USD/50 SHSEX DATE: 12/28/2018 PAY DATE: 04/26/20194942818 : SAMSUNG ELECTRONICS CO LTD -GDR 144A4942818
CASH DIVIDEND PAID IN GBP-BD8QVH4-INTERCONTINENTALEX DATE:1/14/19;PAYDATE:1/29/19RATE:2.038BD8QVH4
DIVIDEND REINVESTMENT PAID IN AUDEX DATE:12/28/18; PAY DATE:2/15/19DIVIDEN RATE=0.046TRANSURBAN-62008826200882
DIVIDEND @ 0.8 USD/EUR FX RATE 1.116EX DATE: 04/26/2019 PAY DATE: 05/20/2019BYQP136 : A B N AMRO GROUP NV-CVABYQP136
DIVIDEND @ 50 USD/JPY FX RATE 110.62EX DATE: 12/26/2018 PAY DATE: 03/29/2019B5LTM93 : OTSUKA HOLDINGS CO LTDB5LTM93

 

Any assistance would be greatly appreciated.

csollest_11
5 - Atom

Very helpful. Thank you very much.

Snehasish789
5 - Atom

Thanks for the wonderful post.  One think I personally recommend is fixing the case of the txt fields into upper or  sentence case or lower case and then run the regex tokens on the fields to get more accurate answers.

Cndro_Consulting
8 - Asteroid

Helpful, Thanks.

brwright
8 - Asteroid

I need a cheat sheet in text version. I cant extrapolate from the video what I'm trying to do :(