This is similar to a post I made yesterday; however, I found more bad data and different syntaxes in the fields that nullifies the solution to it. I have a comma delimited field that follows this syntax: ID, Name, ID, Name. Yes, I have lambasted the person who thought about creating a field like this. But, I can't get to the original data, so I have to break it out to 1 per row where ID and Name are separate fields, but correctly aligned.
The only reliable way to identify the ID from the name is that the comma after the ID is preceded by a digit. I created the attached workflow in an attempt to use REGEX replaces the pattern "#," with a pipe character so I could then split to rows and then cross tab it to the results I'm trying to get, but the split to rows part isn't working.
Input
PO_ID | Vendor |
2222 | 222, Ferrari, INC, 333,McLaren LLC, 444,Toyota |
5555 | 123, RedBull, INC, 456, Aston LLC |
Needed Results
PO_ID | VendorID | VendorName |
2222 | 222 | Ferrari,INC |
2222 | 333 | McLaren LLC |
2222 | 444 | Toyota |
5555 | 123 | RedBull, Inc |
5555 | 456 | Aston LLC |
Thanks for any help!
Solved! Go to Solution.
Hi @csh8428
You were close, I modified your regex replace a bit, otherwise used most of what you had built:
Hi @csh8428
I'd suggest you use the split to rows, then a multi row to identify your Vendor ID and summarise to present your data.
@Luke_C Thanks! It worked for my sample data, but the regex isn't working for all the production data.
Here's an example using anonymized production data
PO_ID | Vendor |
2222 | ABC123456, AMAZON, INC., DEF67890, Best Buy, INC. |
With production data it's also putting a pipe in between an alpha character and a numeric digit.
In this scenario the REGEX is producing this: "ABC|123456|, AMAZON, INC., DEF|67890|, Best Buy INC."; which messes up the downstream tools
Hi @csh8428
Try updating the regex to REGEX_Replace([Vendor], '(\b\d+\b|\b[A-Z]+\d+\b)', '|$1|')
\b\d+\b
→ Matches purely numeric vendor IDs (e.g., 222
, 333
, 444
).\b[A-Z]+\d+\b
→ Matches alphanumeric vendor IDs (e.g., ABC123456
).