Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Help with a regex expression

klambert
7 - Meteor

Having a little moment here!

 

I am trying to extract everything before the comma, and I need to match on the below expressions and pull out the blue sections:

 

[Standard][Monthly][2][Post]

[OCS][Annual][6][Pre],[Standard][Monthly][2][Post]

 

I have written the below regex expression, however in the second pattern it is not just pulling out the first bit but returning the whole input.

 

What am I missing here?

Thank you!

7 REPLIES 7
ShankerV
17 - Castor

Hi @klambert 

 

One more easy way to do this as below.

 

ShankerV_1-1679408779304.png

 

(.+),(.+)

ShankerV_0-1679408828110.png

 

 

Input was:

ShankerV_1-1679408922943.png

 

Many thanks

Shanker V

 

klambert
7 - Meteor

Well that is beautifully simple!

 

Thank you!

BS_THE_ANALYST
14 - Magnetar
14 - Magnetar

@klambert if you want a non-regex alternative:

BS_THE_ANALYST_0-1679409276682.png

Basically, if it contains a comma, take the information from the start up until that point, if not, take all the information. 

 

I prefer RegEx, but it's nice to know the more manual alternatives. 

 

Formulas used: 
FindString: used to locate the position of the comma

Substring: used to take a portion of the string. You just need to provide the starting position, and how many characters to take from the start. In this instance, we use the Findstring to tell us how long the length needs to be.

 

Last thing, you'll notice I have !=-1 in the if statement, this is because of how the FindString function behaves. If it doesn't find the comma in this case, it returns -1 to indicate the comma wasn't found, and if it does find the comma, it'll return the position in the string:

BS_THE_ANALYST_1-1679409939092.png

 

Hope that's useful. 

 

All the best,
BS

LinkedIN

Bulien
jduracka
Alteryx
Alteryx

Hi @klambert, do you know that there's only one comma in a string? I'm asking because the suggested expression will match up to the last comma and in case of multiple commas per string, you'll have to decide whether you want to extract the part before the first one, the last one or something in between. If you choose the first one, an expression like this should work for you:

([^,]+).*

 

klambert
7 - Meteor

This is perfect, would you mind explaining the breakdown?

 

I read this as "starts with at least one comma" aka ^,+ followed by any combination of characters .*

 

But that is obviously not what its doing!

jduracka
Alteryx
Alteryx

Sure, no problem. The brackets are marking a capturing group - that is what @ShankerV  is referencing as $1 in the replacement text. The "[^,]" is a negated (because it starts with a "^") character set, meaning that it will match any character that is not a comma. The "+" means that we want one or more such characters. The ".*" means zero or more of any character.

So, this could be read as give me one or more characters (as many as possible) that are not a comma, followed by anything. The trailing part can be empty.

klambert
7 - Meteor

Ah perfect, that is a very helpful explanation. Thank you!

Labels