Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

More Regex for parsing names! Hooray!

JFurda
8 - Asteroid

I have a love-hate with RegEx. I acknowledge the power it has, but can't seem to crack this particular task. I am given a variety of formats for names and need to parse out and rejoin first and last only. 

 

Here's my expression: ^(\w+)\W?((?<=\W).+(?=\W))?\W?(\w+)?$
The only things I can't figure out are hyphenated last names and apostrophes.  

Here's my data: The "Full Name" column is my input and the rest are the output of my RegEx parse.

edit: My "Data" is just filler in a variety of first and last names. The actual names are all over the place and my company wouldn't allow be to share that data. I created 9 variations that I've seen in the actual data.

 

Full NameFirst NameMiddle NameLast Name
First LastFirst Last
First Middle LastFirstMiddleLast
First Last-LastFirstLastLast
First O'LastFirstOLast
First Middle Middle LastFirstMiddle MiddleLast
First Middle Last-LastFirstMiddle LastLast
First O'last-LastFirstO'lastLast
First Middle O'LastFirstMiddle OLast
First M LastFirstMLast

 

I know that the answer is likely in one of the hundred posts on this topic, but I just can't find it! Is anyone able to help? 

Thanks! 

12 REPLIES 12
MarqueeCrew
20 - Arcturus
20 - Arcturus

@JFurda ,

 

I'm not going to give you a full-answer.  Instead, please consider creating a FILTER to have apostrophe (hint:  contains()) data going to separate logic.

 

after you separate apostrophe stuff, you can do things with normal string functions.

 

CountWords() = 3 gets handled as GetWord(0) and GetWord(2)

 

cheers,


Mark

 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
atcodedog05
22 - Nova
22 - Nova

Hi @JFurda 

 

Infact i couldn't figure out the regex pretty quick but there is a quick solution using formula tool.

 

Output:

atcodedog05_0-1603824911207.png

Workflow:

atcodedog05_1-1603824925055.png

Hope this helps 🙂


If this post helps you please mark it as solution. And give a like if you dont mind 😀👍

atcodedog05
22 - Nova
22 - Nova

Hi @JFurda 

 

Hooray found the regex

 

 

(\w+) (.*[\s\-'])(\w+)

 

Regex Tool config:

atcodedog05_0-1603825142100.png

Hope this helps 🙂


If this post helps you please mark it as solution. And give a like if you dont mind 😀👍

JFurda
8 - Asteroid

Ah, I should have been clearer. My "Data" is just filler in a variety of first and last names. The actual names are all over the place and my company wouldn't allow be to share that data. I created 9 variations that I've seen in the actual data. Thanks, though! 

JFurda
8 - Asteroid

Hmm. So perhaps I could get the "words" assigned out. The first one will be the first name. I could sort descending and pull that as the last name.

 

I'll give that a go! Thanks! 

atcodedog05
22 - Nova
22 - Nova

Hi @JFurda 

 

Did you give above Regex a try is it failing somewhere ?

 

If yes can you give me the scenario so that I can modify accordingly

JFurda
8 - Asteroid

@atcodedog05 

 

The expression (\w+) (.*[\s\-'])(\w+) didn't quite work out. The output looks like this: 

 

Full NameFirst NameMiddle NameLast Name
First Last   
First Middle LastFirstMiddle Last
First Last-LastFirstLast-Last
First O'LastFirstO'Last
First Middle Middle LastFirstMiddle Middle Last
First Middle Last-LastFirstMiddle Last-Last
First O'last-LastFirstO'last-Last
First Middle O'LastFirstMiddle O'Last
First M LastFirstLast

 

The issue(s) are: 

  • The first condition returned nulls.
  • The third condition should retain the hyphen in the last name field. 
  • The fourth condition separated the O' from the Last.
  • The sixth condition did not keep the hyphenated name in the last name field (similar to #3)
  • The seventh condition (however rare) should keep O'last-Last as the 'Last Name'
  • The eighth condition separated the O' from the Last (similar to #4)
atcodedog05
22 - Nova
22 - Nova

Hi @JFurda 

 

Thank you for the consideration and detailed issue report.

 

I will look into it. If am able to calibrate an output I will reply back on this post.

mceleavey
17 - Castor
17 - Castor

Hi @JFurda ,

 

I've attached my solution. This should give you what you want.

I've also attached my Grouped RecordID tool which I used in the solution.

 

Let me know if this fixes the issue.

 

M.



Bulien

Labels