How to parse data using RegEx with my data?
Based on the picture attached above, how can I parse these sentences using RegEx when they are not all the same length? I want to separate the country and the technology name in the cluster name into a new column.
Eg: Melbourne | Monash University
: Guangzhou South China | Univ. of Technology
: Hartford, CT | United Technologies
Thank you in advance for your attention and help!
- Labels:
- API
- Custom Formula Function
- Developer
What you need is a pattern from which you can build rules to parse the data, but I don't see a pattern here on which a single generic rule can be built.
The next option is to build a pattern that catches and parses SOME records, filter those out, build another rule for the next set, and so on until you've accounted for every possibility.
This is obviously not ideal, as your rules will need to be maintained and you'll need checks to ensure nothing falls through the cracks.
For instance, one rule could cover records that start with a word followed by a comma, a space, and 2 capital letters: that part becomes parsed field 1, and everything after it becomes parsed field 2.
Another rule could cover records that contain only 2 words separated by a space: parse each word into its own field.
I'm attaching an example with these 2 rules to show how you would build it up.
At the end you can then use a Union tool to bring them all back together.
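For anyone who wants to see the same filter-and-union idea outside the workflow, here is a minimal Python sketch of the two rules described above. It is not the attached example: the regular expressions, the function name `parse_records`, the rule labels, and the sample record "Tokyo Sony" are all assumptions made for illustration, and it assumes the raw cluster names contain no delimiter between the two parts.

```python
import re

# Rule 1 (assumed regex): one word, a comma and a space, two capital letters,
# then everything that follows.
RULE_CITY_STATE = re.compile(r"^(\w+, [A-Z]{2})\s+(.+)$")

# Rule 2 (assumed regex): exactly two words separated by a single space.
RULE_TWO_WORDS = re.compile(r"^(\w+) (\w+)$")

# Rules are tried in order; the first one that matches wins.
RULES = [("city_state", RULE_CITY_STATE), ("two_words", RULE_TWO_WORDS)]


def parse_records(records):
    """Split each record into two fields using the first matching rule.

    Returns (parsed, unmatched). `parsed` plays the role of the union of
    all rule outputs; `unmatched` is the check that nothing falls through
    the cracks.
    """
    parsed, unmatched = [], []
    for record in records:
        text = record.strip()
        for name, pattern in RULES:
            match = pattern.match(text)
            if match:
                parsed.append((name, match.group(1), match.group(2)))
                break
        else:
            # No rule matched: keep the record so a new rule can be written.
            unmatched.append(record)
    return parsed, unmatched


if __name__ == "__main__":
    sample = [
        "Hartford, CT United Technologies",
        "Tokyo Sony",  # hypothetical two-word record
        "Melbourne Monash University",
    ]
    parsed, unmatched = parse_records(sample)
    print(parsed)
    print(unmatched)
```

With only these two rules, a record like "Melbourne Monash University" ends up in the unmatched list, which is exactly the set you would inspect to decide what the next rule should be.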