Parsing First and Last Name when there is a hyphen
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi All,
I have a column of data that contains both the first and last name. I can use the example in the RegEx sample to easily parse most of it - except when it comes to names with hyphens.
I'm using the brackets to illustrate how I want the cells to appear:
[Joe-Schmo Johnson] needs to be [Joe-Schmo][Johnson]
I also get a NULL value if there is no first or last name; how can I replace a missing name with "N/A" or something to indicate it isn't there?
[Tom] becomes [NULL]
Thanks!!
Solved! Go to Solution.
- Labels:
- Regex
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
What if you use a text to columns tool with \s as your delimiter (space)?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Also, if you want a value of N/A afterwards you can use a formula tool that updates the first name column that has been created and says IIF(IsNull([Field]), "N/A", [Field])
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Good call- this works most of the time...until I come across a customer with a middle initial.
Example: John J Smith
It's transforming it to [John] [J Smith]
I tried to put in a second text to columns tool and that works - but only when there is a middle initial! A customer without a middle initial will return a NULL value after the 2nd text to columns tool.
Is there a way to use a formula tool to trim off the first character that is followed by a white space? Something like that might do the trick...
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Tessa,
Could you use a formula tool with Regex_Replace([Field1],'\s\u\s',' ')
This would replace a middle initial with just one space and then do the text to columns on the \s delimiter?
If this solves your issue please mark the answer as correct, if not let me know!
Regards,
Jonathan
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I took a step back- used the text to columns and selected 3 as my columns to split.
I then used a formula tool for a case statement:
If IsNull([buyer-name3]) THEN ([buyer-name2]) ELSE ([buyer-name3]) ENDIf
I used a select tool to rename the columns.
This skinnys down my data into 2 columns, leaving out the middle initial.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
But what if you have someone with more than one middle initial? For example John M J Smith
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
(people should not be allowed to include middle initials!)
I used your solution to try to keep the middle initial, but that only works when a customer has a period with it!
So John S. Smith returns as 3 columns: [John][S.][Smith]
where
Bob B White returns as this: [Bob][White][NULL]
Thoughts?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Tessa,
I think I've got the complete solution for you now!
The workflow itself looks like this:
Input:
Output:
I've removed the "." periods in the names such as in rows 3 & 4 , however if you want to keep them then you will simply need to change the regex function to \u(?:[A-Za-z'\-\.]+)?
If this solves your issue please mark the answer as correct, if not let me know! I've attached my workbook for you to download if needed.
Regards,
Jonathan
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Nice! I dig it!
I'm going to take a look at this later when I have some more time to finesse what I am doing - what I have works okay -- for now 🙂 Thank you!
