I am trying to use Regex to parse the name which has number, commas in between and characters
eg:
Input:
Field 1 |
8 solid rock Studio, LLC, 1, 8, 12, JPName, 3 |
Output:
Field1 | Field2 | Field3 | Field4 | Field5 | Field6 |
Solid 8rock Studio, LLC | 1 | 8 | 12 | JPName | 3 |
I used :
REGEX_Replace([Field1], "(.*?)(\d{1,}.*)",'$2') - data starting from a digit
REGEX_Replace([Field1], "(.*?)(\d{1,}.*)",'$1') - Name
then Text to Column to parse rest of the data
But this did not solve my problem because this would result in separate the data whenever it finds a digit, pushing the letters too to next column.
How would I obtain the result like above OUTPUT? I'd very much appreciate any help.
Solved! Go to Solution.
Hi @rjbrsh ,
Will the output always have 6 new fields and a digit leading to the first split? It's not very clean, but this works using the regex parse tool.
(.+),\s(\d+),\s(.+),\s(.+),\s(.+),\s(.+)
Thanks for quick response.
No. That's the problem. The name could come as "8rock studio, LLC" or "MB2K" or "JR Furniture LLC". But, what I figured is, it always ends with an alphabet. So, I need to get all the characters (digits, alphabets and commas) until the end of alphabet.
I think I'm still a little confused on the desired outcome. If I use the other example names, I still am to split the 6 columns while keeping the name in the first column.
Can you provide a sample workflow with a few more examples? That might help me understand the problem better.
My data comes from a CSV as :
ID,Name,Out1, Out2, Out3, Out4, Out5, ....., OutN,
12, David & Sons,LLC, 21PA,21,6,N/A,N/A, 20,1.....,12458,
I have about 200+ columns.
I needs to extract just the ones in red (Name) in 1 column and the rest (Out) in the other column.
ID | Name | Out |
12 | David & Sons,LLC, 21PA | 6,N/A,N/A, 20,1.....,12458, |
I can keep the name column as-is and then use text to column to parse the remaining OUT data.
The final output I need is :
ID | Name | Out1 | Out 2 | Out3 | Out 4 | Out 5 | Out N |
12 | David & Sons, LLC, 21PA | 21 | 6 | N/A | N/A | 20 | 12458 |
19 | 8K Studio LLC | 34 | 2 | 0 | 0 | 10 | 25886 |
That certainly helps. They certainly didn't do you any favors with that csv file. Try the attached workflow to identify the ID and name. My regex would need to be tweaked if there isn't always a space after the comma. Here is what I tried:
(\d+),\s(.+[[:alpha:]]),\s(\d+,.+)
(\d+),\s = ID Column (I assumed only digits followed by ,\s)
(.+[[:alpha:]]),\s = Name Column (this will find all characters but has to end with a letter followed by ,\s)
(\d+,.+) = Remaining (finds at least one digit followed by a comma then everything else in the string)
User | Count |
---|---|
19 | |
14 | |
13 | |
9 | |
8 |