Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Parsing string with numbers and letters

rjbrsh
7 - Meteor

I am trying to use Regex to parse the name which has number, commas in between and characters 

eg:

Input:

Field 1
8 solid rock Studio, LLC, 1, 8, 12, JPName, 3

 

Output:

Field1Field2Field3Field4Field5Field6
Solid 8rock Studio, LLC1812JPName3


I used :

REGEX_Replace([Field1], "(.*?)(\d{1,}.*)",'$2') - data starting from a digit

REGEX_Replace([Field1], "(.*?)(\d{1,}.*)",'$1') - Name 

then Text to Column to parse rest of the data

 

But this did not solve my problem because this would result in separate the data whenever it finds a digit, pushing the letters too to next column. 

How would I obtain the result like above OUTPUT? I'd very much appreciate any help. 

 

5 REPLIES 5
phottovy
13 - Pulsar
13 - Pulsar

Hi @rjbrsh ,

 

Will the output always have 6 new fields and a digit leading to the first split? It's not very clean, but this works using the regex parse tool.

(.+),\s(\d+),\s(.+),\s(.+),\s(.+),\s(.+)

phottovy_0-1631818932897.png

phottovy_1-1631818948337.png

 

 

rjbrsh
7 - Meteor

Thanks for quick response. 

 

No. That's the problem. The name could come as "8rock studio, LLC" or "MB2K" or "JR Furniture LLC". But, what I figured is, it always ends with an alphabet. So, I need to get all the characters (digits, alphabets and commas) until the end of alphabet.

phottovy
13 - Pulsar
13 - Pulsar

I think I'm still a little confused on the desired outcome. If I use the other example names, I still am to split the 6 columns while keeping the name in the first column.

 

phottovy_0-1631823692113.png

 

Can you provide a sample workflow with a few more examples? That might help me understand the problem better.

 

rjbrsh
7 - Meteor

My data comes from a CSV as : 

ID,Name,Out1, Out2, Out3, Out4, Out5, ....., OutN,

12, David & Sons,LLC, 21PA,21,6,N/A,N/A, 20,1.....,12458,

I have about 200+ columns. 

I needs to extract just the ones in red (Name) in 1 column and the rest (Out)  in the other column.

IDNameOut
12David & Sons,LLC, 21PA6,N/A,N/A, 20,1.....,12458,

I can keep the name column as-is and then use text to column to parse the remaining OUT data.

The final output I need is :

IDNameOut1 Out 2 Out3Out 4 Out 5Out N
12David & Sons, LLC, 21PA216N/AN/A2012458
198K Studio LLC342001025886
phottovy
13 - Pulsar
13 - Pulsar

That certainly helps. They certainly didn't do you any favors with that csv file. Try the attached workflow to identify the ID and name. My regex would need to be tweaked if there isn't always a space after the comma. Here is what I tried:

(\d+),\s(.+[[:alpha:]]),\s(\d+,.+)

 

(\d+),\s = ID Column (I assumed only digits followed by ,\s)

(.+[[:alpha:]]),\s = Name Column (this will find all characters but has to end with a letter followed by ,\s)

(\d+,.+) = Remaining (finds at least one digit followed by a comma then everything else in the string)

 

Labels