Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

RegEX Help

elclark
8 - Asteroid

I have a cell that has vendor names in the following format, Name (Alternate Name-Vendor Number). I want to split that cell into 3 columns (name, atl name, vendor #). The vendor number is not standard, just has a minimum of 3 numbers. I know it can be done with RegEx but not sure of the proper formula since some names have extra characters, like a parantheses in both the name and around the alternate name and vendor number.

 

What it looks like:

 

STAPLES BUSINESS ADVANTAGE (STAPLES CONTRACT & CMRCL INC-654321)
APPLIANCE PARTS CO (NV) INC (APPLIANCE PARTS CO (NV) INC-78910)
AIRGROUP EXPRESS (AIRGROUP EXPRESS-111)

 

How I want it split:

STAPLES BUSINESS ADVANTAGESTAPLES CONTRACT & CMRCL INC654321
APPLIANCE PARTS CO (NV) INCAPPLIANCE PARTS CO (NV) INC78910
AIRGROUP EXPRESSAIRGROUP EXPRESS111
6 REPLIES 6
mceleavey
17 - Castor
17 - Castor

Hi @elclark ,

 

Unless you have a list of what I assume are company names then there's no logic at all that would dictate which parts of the text are which, so the answer is no.

It's easy to isolate the number at the end, but not the first part.

To get the number simply use the regex tool as follows:

mceleavey_0-1620241607558.png

 

mceleavey_1-1620241622987.png

 

 

 

M.



Bulien

apathetichell
18 - Pollux

2021-05-05 (9).png

2021-05-05 (10).png

2021-05-05 (11).png

 

There's a cheat in here to swap out (NV) to ",NV," beforehand. having two sets of parenthetical entities in one record would trigger an error.

   

phottovy
13 - Pulsar
13 - Pulsar

@elclark I got close using this post on stackoverflow: https://stackoverflow.com/questions/23839826/regex-split-by-parentheses-ignore-nested-parentheses-in... 

 

My attached solution uses a similar approach to @apathetichell by removing the "(NV)" then adding it back afterword. This approach wouldn't work with other nested parentheses though.

apathetichell
18 - Pollux

Thanks @phottovy  - and may I suggest the prologue combo approach....

 

replace([field 1],"("+regex_replace([field 1],"(.*)\((\u{2,})\)(.*)","$2")+")","|"+regex_replace([field 1],"(.*)\((\u{2,})\)(.*)","$2")+"|")

 

This assumes any offending parenthetical entity will be only uppercase letters and at least two in number.

 

the epilogue of regex_replace([regexout1],"(.*)|(.*)|(.*)","$1($2)$3") or some such looks less like the terror of coding drivel.

 

 

phottovy
13 - Pulsar
13 - Pulsar

@apathetichell People with better RegEx skills will probably cringe at this solution, but you could identify the nested parentheses first, then take the same approach we both suggested. Revised version attached.

Maskell_Rascal
13 - Pulsar

Hi @elclark 

 

Something like this should work for you. If there is not a consistent space before the Alt Name, then remove the \s from the regex pattern.  

(.*)\s\((\w{3,}.*)\-(.*)\)

Maskell_Rascal_0-1620250685907.png

 

I attached a sample workflow for you to try out. 

 

Thanks!

Phil

 

 

Labels