I have a cell that has vendor names in the following format, Name (Alternate Name-Vendor Number). I want to split that cell into 3 columns (name, atl name, vendor #). The vendor number is not standard, just has a minimum of 3 numbers. I know it can be done with RegEx but not sure of the proper formula since some names have extra characters, like a parantheses in both the name and around the alternate name and vendor number.
What it looks like:
STAPLES BUSINESS ADVANTAGE (STAPLES CONTRACT & CMRCL INC-654321) |
APPLIANCE PARTS CO (NV) INC (APPLIANCE PARTS CO (NV) INC-78910) |
AIRGROUP EXPRESS (AIRGROUP EXPRESS-111) |
How I want it split:
STAPLES BUSINESS ADVANTAGE | STAPLES CONTRACT & CMRCL INC | 654321 |
APPLIANCE PARTS CO (NV) INC | APPLIANCE PARTS CO (NV) INC | 78910 |
AIRGROUP EXPRESS | AIRGROUP EXPRESS | 111 |
Hi @elclark ,
Unless you have a list of what I assume are company names then there's no logic at all that would dictate which parts of the text are which, so the answer is no.
It's easy to isolate the number at the end, but not the first part.
To get the number simply use the regex tool as follows:
M.
@elclark I got close using this post on stackoverflow: https://stackoverflow.com/questions/23839826/regex-split-by-parentheses-ignore-nested-parentheses-in...
My attached solution uses a similar approach to @apathetichell by removing the "(NV)" then adding it back afterword. This approach wouldn't work with other nested parentheses though.
Thanks @phottovy - and may I suggest the prologue combo approach....
replace([field 1],"("+regex_replace([field 1],"(.*)\((\u{2,})\)(.*)","$2")+")","|"+regex_replace([field 1],"(.*)\((\u{2,})\)(.*)","$2")+"|")
This assumes any offending parenthetical entity will be only uppercase letters and at least two in number.
the epilogue of regex_replace([regexout1],"(.*)|(.*)|(.*)","$1($2)$3") or some such looks less like the terror of coding drivel.
@apathetichell People with better RegEx skills will probably cringe at this solution, but you could identify the nested parentheses first, then take the same approach we both suggested. Revised version attached.
Hi @elclark
Something like this should work for you. If there is not a consistent space before the Alt Name, then remove the \s from the regex pattern.
(.*)\s\((\w{3,}.*)\-(.*)\)
I attached a sample workflow for you to try out.
Thanks!
Phil