Alteryx Designer Desktop Discussions

I don't really understand how regex works

terrellchong
8 - Asteroid

I'm using the regex parse function. I got my result.

 

But I don't understand how it works.

 

I have a set of data that looks more or less like this.

 

Company Name: Ali Abu Adik Transport & Brokerage (S) Pte Ltd (1012342-H)
RegexExOut1: Ali Abu Adik Transport & Brokerage (S) Pte Ltd

 

In most cases the result comes out fine, but the Johnhot row has gone haywire:

Company Name: Johnhot YTM Pte Ltd (Ula YTM Trans's Specialist (Singapore) Pte Ltd
RegexExOut1: Johnhot YTM Pte Ltd (Ula YTM Trans's Specialist

 

Below is my setting for Regex

terrellchong_0-1639660348929.png

Bear in mind I do not understand what I am doing or how it works. Can someone explain to me what is wrong and how I hit the jackpot?

4 REPLIES
Christina_H
14 - Magnetar

What output are you looking for?  I think your RegEx is giving you everything up to the final opening bracket.

 

In RegEx, brackets mark out groups.  (.*) at the start is the marked group that will be returned by your parse function.  Within that, . represents any character and * indicates 0 or more of them.  On its own, that would return everything from the input.
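
To see the capture group on its own, here is a minimal sketch in Python. Python's `re` module is only a stand-in here (an assumption; Alteryx's own engine may differ in small details), and the sample string is the company name from the question.

```python
import re

text = "Ali Abu Adik Transport & Brokerage (S) Pte Ltd (1012342-H)"

# (.*) is a single capture group: "." is any character (except a newline)
# and "*" repeats it zero or more times, taking as much as it can.
match = re.match(r"(.*)", text)
whole = match.group(1)

print(whole)  # the entire input comes back unchanged
```

With nothing after the group to hold it back, `.*` swallows the whole line, which is why the rest of the expression matters.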

 

\ is an escape character, so \( and \) indicate that you're looking for actual brackets in the input and not marking out another group.  \s represents white space characters.
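
A small sketch of the difference between an escaped and an unescaped bracket, again using Python's `re` as a stand-in (an assumption, not Alteryx itself). The variable names are just for illustration.

```python
import re

text = "Pte Ltd (1012342-H)"

# \s\( means: one whitespace character followed by a literal "(".
literal = re.search(r"\s\(", text)
found = literal.group(0)   # the matched text: a space and "("
start = literal.start()    # index of that space in the input

# Without the backslash, "(" opens a group that is never closed,
# so the pattern does not even compile.
try:
    re.compile(r"\s(")
    unescaped_ok = True
except re.error:
    unescaped_ok = False

print(repr(found), start, unescaped_ok)
```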

 

Square brackets are used to contain a set of characters you are looking for, e.g. [a-z] would match a single lowercase letter.  You have [^)]*, where [^)] matches any single character except ) and * repeats it 0 or more times.
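
As a quick illustration of both kinds of character set, assuming Python's `re` behaves like Alteryx for these basics (the sample strings are made up for the demo):

```python
import re

# [a-z] matches one lowercase letter at a time.
lower = re.findall(r"[a-z]", "Pte Ltd")

# [^)]* matches a run of characters that are NOT ")", so it stops
# just before the first closing bracket.
not_close = re.match(r"[^)]*", "1012342-H) tail").group(0)

print(lower)      # ['t', 'e', 't', 'd']
print(not_close)  # 1012342-H
```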

 

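So your whole expression (.*)\s\([^)]*\) will match a white space character followed by a pair of brackets with anything other than ) between them, then return everything from before that.  Because .* takes as much as it can, the match uses the last such pair of brackets in the string.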
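
Running the whole expression against the two company names from the question shows this in practice. This is a sketch in Python's `re`, used as a stand-in for the Alteryx RegEx tool (an assumption), so treat it as a way to reason about the pattern rather than a copy of the tool's behaviour.

```python
import re

PATTERN = re.compile(r"(.*)\s\([^)]*\)")

names = [
    "Ali Abu Adik Transport & Brokerage (S) Pte Ltd (1012342-H)",
    "Johnhot YTM Pte Ltd (Ula YTM Trans's Specialist (Singapore) Pte Ltd",
]

# group(1) is the (.*) part, i.e. what the parse step would return.
results = [PATTERN.search(name).group(1) for name in names]

for name, out in zip(names, results):
    print(f"{name!r} -> {out!r}")
```

In the Johnhot row the last complete pair of brackets is "(Singapore)", so everything before it is returned, including the earlier unclosed "(Ula ...".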

csmith11
11 - Bolide

Not strictly needed here but since things are breaking I would add a ".*" at the end of your expression to pick up the rest of the string.

 

(.*)\s\([^)]*\).*

 

@Christina_H did a wonderful job breaking things down for you. 👏 Since I imagine you may be newer to Regex let me point you to a page that I love using if you haven't already found it for yourself.

 

https://regex101.com/ 

 

This site will let you test Regex and on the right-hand side it explains each part of it.

 

See below how your expression doesn't match the full string for JohnHot. The highlighted portion is what's matched and the white portion is unmatched. The Green is the part you are parsing out and the Blue is everything else that matched.

csmith11_0-1639674038340.png

 

Adding the .* at the end now matches the whole string: (This may fix your issue in Alteryx.)

csmith11_1-1639674202874.png
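
The same comparison can be sketched in Python, assuming `re` as a stand-in for regex101 and Alteryx. `fullmatch` is used here only to check whether a pattern covers the entire string; it is not something the Alteryx tool exposes.

```python
import re

name = "Johnhot YTM Pte Ltd (Ula YTM Trans's Specialist (Singapore) Pte Ltd"

original = re.compile(r"(.*)\s\([^)]*\)")
extended = re.compile(r"(.*)\s\([^)]*\).*")  # same, with .* on the end

# Does each pattern cover the string from start to finish?
covers_all_original = original.fullmatch(name) is not None
covers_all_extended = extended.fullmatch(name) is not None

# The parsed group itself is identical either way.
same_group = original.search(name).group(1) == extended.search(name).group(1)

print(covers_all_original, covers_all_extended, same_group)
```

So the trailing .* changes how much of the string is matched, not what the capture group returns.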

 

I wish I could help more, but even with your original expression it's not broken for me in Alteryx. (Should work on your end as well) 

csmith11_2-1639674264000.png

 

 

 

PhilipMannering
16 - Nebula

The expression 

 

(.*) \(

 

does the same thing in these cases: it captures everything up to the last space followed by an open bracket.
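
A quick check of that claim, sketched in Python's `re` (a stand-in for Alteryx, which is an assumption). The first two names come from the question; the "Acme" string is a made-up example to show where the two expressions can differ.

```python
import re

long_form = re.compile(r"(.*)\s\([^)]*\)")
short_form = re.compile(r"(.*) \(")

names = [
    "Ali Abu Adik Transport & Brokerage (S) Pte Ltd (1012342-H)",
    "Johnhot YTM Pte Ltd (Ula YTM Trans's Specialist (Singapore) Pte Ltd",
]

# Each pair is (long form result, short form result) for one name.
pairs = [
    (long_form.search(n).group(1), short_form.search(n).group(1))
    for n in names
]
print(all(a == b for a, b in pairs))

# Hypothetical input whose last bracket is never closed: the long form
# needs a complete (...) pair, the short form only needs " (".
unclosed = "Acme (S) Pte Ltd (unclosed"
long_out = long_form.search(unclosed).group(1)
short_out = short_form.search(unclosed).group(1)
print(repr(long_out), repr(short_out))
```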

csmith11
11 - Bolide

In case it's useful, see the attached example. Just copy this version of the tool into your workflow and see if the issue resolves itself.
