More Regex for parsing names! Hooray!
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I have a love-hate with RegEx. I acknowledge the power it has, but can't seem to crack this particular task. I am given a variety of formats for names and need to parse out and rejoin first and last only.
Here's my expression: ^(\w+)\W?((?<=\W).+(?=\W))?\W?(\w+)?$
The only things I can't figure out are hyphenated last names and apostrophes.
Here's my data: The "Full Name" column is my input and the rest are the output of my RegEx parse.
edit: My "Data" is just filler in a variety of first and last names. The actual names are all over the place and my company wouldn't allow be to share that data. I created 9 variations that I've seen in the actual data.
Full Name | First Name | Middle Name | Last Name |
First Last | First | Last | |
First Middle Last | First | Middle | Last |
First Last-Last | First | Last | Last |
First O'Last | First | O | Last |
First Middle Middle Last | First | Middle Middle | Last |
First Middle Last-Last | First | Middle Last | Last |
First O'last-Last | First | O'last | Last |
First Middle O'Last | First | Middle O | Last |
First M Last | First | M | Last |
I know that the answer is likely in one of the hundred posts on this topic, but I just can't find it! Is anyone able to help?
Thanks!
Solved! Go to Solution.
- Labels:
- Regex
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@JFurda ,
I'm not going to give you a full-answer. Instead, please consider creating a FILTER to have apostrophe (hint: contains()) data going to separate logic.
after you separate apostrophe stuff, you can do things with normal string functions.
CountWords() = 3 gets handled as GetWord(0) and GetWord(2)
cheers,
Mark
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @JFurda
Infact i couldn't figure out the regex pretty quick but there is a quick solution using formula tool.
Output:
Workflow:
Hope this helps 🙂
If this post helps you please mark it as solution. And give a like if you dont mind 😀👍
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @JFurda
Hooray found the regex
(\w+) (.*[\s\-'])(\w+)
Regex Tool config:
Hope this helps 🙂
If this post helps you please mark it as solution. And give a like if you dont mind 😀👍
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Ah, I should have been clearer. My "Data" is just filler in a variety of first and last names. The actual names are all over the place and my company wouldn't allow be to share that data. I created 9 variations that I've seen in the actual data. Thanks, though!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hmm. So perhaps I could get the "words" assigned out. The first one will be the first name. I could sort descending and pull that as the last name.
I'll give that a go! Thanks!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @JFurda
Did you give above Regex a try is it failing somewhere ?
If yes can you give me the scenario so that I can modify accordingly
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
The expression (\w+) (.*[\s\-'])(\w+) didn't quite work out. The output looks like this:
Full Name | First Name | Middle Name | Last Name |
First Last | |||
First Middle Last | First | Middle | Last |
First Last-Last | First | Last- | Last |
First O'Last | First | O' | Last |
First Middle Middle Last | First | Middle Middle | Last |
First Middle Last-Last | First | Middle Last- | Last |
First O'last-Last | First | O'last- | Last |
First Middle O'Last | First | Middle O' | Last |
First M Last | First | M | Last |
The issue(s) are:
- The first condition returned nulls.
- The third condition should retain the hyphen in the last name field.
- The fourth condition separated the O' from the Last.
- The sixth condition did not keep the hyphenated name in the last name field (similar to #3)
- The seventh condition (however rare) should keep O'last-Last as the 'Last Name'
- The eighth condition separated the O' from the Last (similar to #4)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @JFurda
Thank you for the consideration and detailed issue report.
I will look into it. If am able to calibrate an output I will reply back on this post.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @JFurda ,
I've attached my solution. This should give you what you want.
I've also attached my Grouped RecordID tool which I used in the solution.
Let me know if this fixes the issue.
M.
