

How to parse data using RegEx with my data?

khadijahneddy
6 - Meteoroid

[Attached image: khadijahneddy_0-1604367353853.png]

Based on the picture attached above, how can I parse these values using RegEx when they are not all the same length? I want to split the cluster name into the country and the technology name, each in a new column.

 

E.g.:

    Melbourne | Monash University
    Guangzhou South China | Univ. of Technology
    Hartford, CT | United Technologies

 

Thank you in advance for your attention and help!

 

 

DavidP
17 - Castor

Hi @khadijahneddy 

 

To parse the data with RegEx, you need a pattern in the values that rules can be built on, but I don't see one here that a single generic rule could cover.

 

The next option is to build a pattern that catches and parses SOME records, filter those out, build another rule for the next set, and so on until you've accounted for every possibility.

 

This is obviously not ideal, as your rules will need to be maintained and you'll need checks to ensure nothing falls through the cracks.

 

For instance, one rule could cover records that start with a word followed by a comma, a space and 2 capital letters: that part becomes parsed field 1 and everything after it becomes parsed field 2.

 

Another rule could cover records that have only 2 words separated by a space: parse each word into its own field.

 

I'm attaching an example with these 2 rules to show how you would build it up.

 

At the end you can then use a Union tool to bring them all back together.
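
For readers who want to try the idea outside Alteryx, here is a minimal Python sketch of the same rule cascade, not the attached workflow itself. The two patterns are one possible reading of the rules described above, and the rule names, the function name `parse_clusters` and the sample strings are all hypothetical (the real input values are only visible in the screenshot). Each record is tried against the rules in order, the first match wins, and anything no rule catches is collected separately so it can be checked.

```python
import re

# Rule 1 (assumed reading): a word, a comma, a space and 2 capital letters
# form field 1; everything after the following whitespace is field 2.
RULE_CITY_STATE = re.compile(r"^(\w+, [A-Z]{2})\s+(.+)$")

# Rule 2 (assumed reading): exactly two words separated by a single space.
RULE_TWO_WORDS = re.compile(r"^(\S+) (\S+)$")

# Rules are tried in this order; the names are illustrative only.
RULES = [("city_state", RULE_CITY_STATE), ("two_words", RULE_TWO_WORDS)]


def parse_clusters(records):
    """Return (parsed, unmatched): parsed rows plus records no rule caught."""
    parsed, unmatched = [], []
    for record in records:
        text = record.strip()
        for name, pattern in RULES:
            match = pattern.match(text)
            if match:
                # Keep the rule name so each row can be traced to its rule.
                parsed.append((text, name, match.group(1), match.group(2)))
                break
        else:
            # No rule matched: keep the record for review or a new rule.
            unmatched.append(text)
    return parsed, unmatched


# Hypothetical sample values, made up for illustration.
sample = [
    "Hartford, CT United Technologies",
    "Melbourne Monash",
    "Guangzhou South China Univ. of Technology",
]
parsed, unmatched = parse_clusters(sample)
```

The `parsed` list plays the role of the Union output, and `unmatched` is the check that nothing falls through the cracks: a non-empty list means another rule is needed.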

 

[Attached image: DavidP_0-1604412642118.png]