Alteryx Designer Desktop Discussions

JFurda · ‎10-27-2020

I have a love-hate with RegEx. I acknowledge the power it has, but can't seem to crack this particular task. I am given a variety of formats for names and need to parse out and rejoin first and last only.

Here's my expression: ^(\w+)\W?((?<=\W).+(?=\W))?\W?(\w+)?$
The only things I can't figure out are hyphenated last names and apostrophes.

Here's my data: The "Full Name" column is my input and the rest are the output of my RegEx parse.

edit: My "Data" is just filler in a variety of first and last names. The actual names are all over the place and my company wouldn't allow be to share that data. I created 9 variations that I've seen in the actual data.

Full Name	First Name	Middle Name	Last Name
First Last	First		Last
First Middle Last	First	Middle	Last
First Last-Last	First	Last	Last
First O'Last	First	O	Last
First Middle Middle Last	First	Middle Middle	Last
First Middle Last-Last	First	Middle Last	Last
First O'last-Last	First	O'last	Last
First Middle O'Last	First	Middle O	Last
First M Last	First	M	Last

I know that the answer is likely in one of the hundred posts on this topic, but I just can't find it! Is anyone able to help?

Thanks!

MarqueeCrew · ‎10-27-2020

@JFurda ,

I'm not going to give you a full-answer. Instead, please consider creating a FILTER to have apostrophe (hint: contains()) data going to separate logic.

after you separate apostrophe stuff, you can do things with normal string functions.

CountWords() = 3 gets handled as GetWord(0) and GetWord(2)

cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.

atcodedog05 · ‎10-27-2020

Hi @JFurda

Infact i couldn't figure out the regex pretty quick but there is a quick solution using formula tool.

Output:

Workflow:

Hope this helps 🙂

If this post helps you please mark it as solution. And give a like if you dont mind 😀👍

atcodedog05 · ‎10-27-2020

Hi @JFurda

Hooray found the regex

(\w+) (.*[\s\-'])(\w+)

Regex Tool config:

Hope this helps 🙂

If this post helps you please mark it as solution. And give a like if you dont mind 😀👍

JFurda · ‎10-27-2020

Ah, I should have been clearer. My "Data" is just filler in a variety of first and last names. The actual names are all over the place and my company wouldn't allow be to share that data. I created 9 variations that I've seen in the actual data. Thanks, though!

JFurda · ‎10-27-2020

Hmm. So perhaps I could get the "words" assigned out. The first one will be the first name. I could sort descending and pull that as the last name.

I'll give that a go! Thanks!

atcodedog05 · ‎10-27-2020

Hi @JFurda

Did you give above Regex a try is it failing somewhere ?

If yes can you give me the scenario so that I can modify accordingly

JFurda · ‎10-27-2020

@atcodedog05

The expression (\w+) (.*[\s\-'])(\w+) didn't quite work out. The output looks like this:

Full Name	First Name	Middle Name	Last Name
First Last
First Middle Last	First	Middle	Last
First Last-Last	First	Last-	Last
First O'Last	First	O'	Last
First Middle Middle Last	First	Middle Middle	Last
First Middle Last-Last	First	Middle Last-	Last
First O'last-Last	First	O'last-	Last
First Middle O'Last	First	Middle O'	Last
First M Last	First	M	Last

The issue(s) are:

The first condition returned nulls.
The third condition should retain the hyphen in the last name field.
The fourth condition separated the O' from the Last.
The sixth condition did not keep the hyphenated name in the last name field (similar to #3)
The seventh condition (however rare) should keep O'last-Last as the 'Last Name'
The eighth condition separated the O' from the Last (similar to #4)

atcodedog05 · ‎10-27-2020

Hi @JFurda

Thank you for the consideration and detailed issue report.

I will look into it. If am able to calibrate an output I will reply back on this post.

mceleavey · ‎10-27-2020

Hi @JFurda ,

I've attached my solution. This should give you what you want.

I've also attached my Grouped RecordID tool which I used in the solution.

Let me know if this fixes the issue.

M.

Alteryx Designer Desktop Discussions

More Regex for parsing names! Hooray!

Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: Unmerge Excel Rows and duplicate data

Re: How separate channel ID