Alteryx Designer Desktop Discussions

rjbrsh · ‎09-16-2021

I am trying to use Regex to parse the name which has number, commas in between and characters

eg:

Input:

Field 1

8 solid rock Studio, LLC, 1, 8, 12, JPName, 3

Output:

Field1	Field2	Field3	Field4	Field5	Field6
Solid 8rock Studio, LLC	1	8	12	JPName	3

I used :

REGEX_Replace([Field1], "(.*?)(\d{1,}.*)",'$2') - data starting from a digit

REGEX_Replace([Field1], "(.*?)(\d{1,}.*)",'$1') - Name

then Text to Column to parse rest of the data

But this did not solve my problem because this would result in separate the data whenever it finds a digit, pushing the letters too to next column.

How would I obtain the result like above OUTPUT? I'd very much appreciate any help.

phottovy · ‎09-16-2021

Hi @rjbrsh ,

Will the output always have 6 new fields and a digit leading to the first split? It's not very clean, but this works using the regex parse tool.

(.+),\s(\d+),\s(.+),\s(.+),\s(.+),\s(.+)

rjbrsh · ‎09-16-2021

Thanks for quick response.

No. That's the problem. The name could come as "8rock studio, LLC" or "MB2K" or "JR Furniture LLC". But, what I figured is, it always ends with an alphabet. So, I need to get all the characters (digits, alphabets and commas) until the end of alphabet.

phottovy · ‎09-16-2021

I think I'm still a little confused on the desired outcome. If I use the other example names, I still am to split the 6 columns while keeping the name in the first column.

Can you provide a sample workflow with a few more examples? That might help me understand the problem better.

rjbrsh · ‎09-16-2021

My data comes from a CSV as :

ID,Name,Out1, Out2, Out3, Out4, Out5, ....., OutN,

12, David & Sons,LLC, 21PA,21,6,N/A,N/A, 20,1.....,12458,

I have about 200+ columns.

I needs to extract just the ones in red (Name) in 1 column and the rest (Out) in the other column.

ID	Name	Out
12	David & Sons,LLC, 21PA	6,N/A,N/A, 20,1.....,12458,

I can keep the name column as-is and then use text to column to parse the remaining OUT data.

The final output I need is :

ID	Name	Out1	Out 2	Out3	Out 4	Out 5	Out N
12	David & Sons, LLC, 21PA	21	6	N/A	N/A	20	12458
19	8K Studio LLC	34	2	0	0	10	25886

phottovy · ‎09-16-2021

That certainly helps. They certainly didn't do you any favors with that csv file. Try the attached workflow to identify the ID and name. My regex would need to be tweaked if there isn't always a space after the comma. Here is what I tried:

(\d+),\s(.+[[:alpha:]]),\s(\d+,.+)

(\d+),\s = ID Column (I assumed only digits followed by ,\s)

(.+[[:alpha:]]),\s = Name Column (this will find all characters but has to end with a letter followed by ,\s)

(\d+,.+) = Remaining (finds at least one digit followed by a comma then everything else in the string)

Alteryx Designer Desktop Discussions

Parsing string with numbers and letters

Re: Unable to get an output

Re: Extracting the list of sheet names across mult...

Example workflow for setting up a custom list to u...

Re: Firm names parse

Re: Help with Multi-Row formula