Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Use Regex to split string

emil
7 - Meteor

Dear all,

 

I run into a problem when splitting a string and regrouping it based on different criteria. I tried various formulas and workflows and I am still stuck to find a structured workflow. 

 

The initial data looks like :

001ABC 123 M @#
001ABC 798 M @#
001ABC 7 @#
001ABC @
002F 421 #
003GHJK 433 @
003GHJK @

 

What output should be is :

 

001ABC123 M@#
001ABC 798 M @#
001ABC7@#
001ABC @
002F421#
003GHJK 433@
003GHJK  @

 

Many thanks for your input.

5 REPLIES 5
jdunkerley79
ACE Emeritus
ACE Emeritus
^([^ ]+) (.*?) ?([^ ]+)$

should do what you need.

 

The first column will be up to the first space. The last column will be from the last space.

The middle column will only be filled if there is a middle block

 

Sample attached

emil
7 - Meteor

Thank you ! This looks great!

ak2018
8 - Asteroid

Thank you for that example. However, I have a different problem. Say I have a field Description. It also has a recordID field. it is a freeflow text and may contain information that is in certain format, say 5 digit numbers like this

RecordID    Description

00001          LOC1 = 23456 LOC = 34856 OLD PLACE = 67933 {new line}JUNK = 53436 (10 HEMMING WAY)

 

I need this to be 

 

RecordID  Field1      Field2

00001       LOC1       23456

00001       LOC         34856

00001       OLD PLACE     67933

00001       JUNK       53436

 

Any thoughts on how to go about. I could do Tokenize but all I get is Field1, no RecordID or Field2. I know i am missing something.

EN6924
10 - Fireball

My solution -

 

EN6924_0-1647893210871.png

 

ak2018
8 - Asteroid

Thanks for the solution. However, I decided I did not want to have multiple columns as the field to be split may contain even 10 combinations at times. Instead I passed the string through the Formula Tool and replacing all ' = ' with ' =', removing the space before and/or after the equals sign. That would give you a consistent string. Now as shown in my example, the "Name" was assigned a "value" moving on the the next set. 

So passing that through the first Regex Tool

The settings were:

Formula to Convert: ([A-Z]+\s?[A-Z]+[=]\d{5})  -- Where the Text could have a space in between, for example OLD PLACE and the number is always 5 digits.

Chose CASE INSENSITIVE in case there was any text that was in lower case.

For the Output method, I chose Tokenize and Split to Rows.

 

So from the original record combo,

RecordID    Description

00001          LOC1 = 23456 LOC = 34856 OLD PLACE = 67933 {new line}JUNK = 53436 (10 HEMMING WAY)

 

the output comes out as

 

RecordID    Description

0001            LOC1=23456

0001            LOC=34856

0001           OLD PLACE=67933

0001            JUNK=53436

 

Passing this further to another Regex with the formula, 

([a-z]+\s?[A-Z]+)[=](\d{6})

 

The output method is Parse and will give two columns from the two (). I named the first one as TypeName and second as TypeValue. 

I therefore got the recordID as associated with that combination where the TypeName is specified and the value is along with it. Worked perfectly as expected.

 

Labels
Top Solution Authors