Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Help with REGEX to remove middle name in LAST, FIRST MIDDLE name format

kanemku
7 - Meteor

Hello,

 

I have a column with people's full name in a "LAST NAME, FIRST NAME MIDDLE NAME" format, and I need to remove the middle name.

 

Original: LAST, FIRST MIDDLE

Result:   LAST, FIRST

 

Thank you to anyone who can help me!

7 REPLIES 7
caltang
17 - Castor
17 - Castor

Can you provide some sample data? How do you differentiate a First and Middle name correctly? Is it always in that order?

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
DataNath
17 - Castor

This is fairly straightforward if you always have single names but I'm not sure how you'd differentiate between someone having 2 middle names etc. i.e. where you'd draw the line on this. If you're getting data from a system or something that only allows single entries for each name i.e. one single word as a first, middle and last name then you could even just use an expression like:

 

GetWord([Name], 0) + ' ' + GetWord([Name], 1)

 

Rana_Kareem
9 - Comet

Hi @kanemku ..

 

Also, you can use this expression:

 

REGEX_Replace([Field Name], "\s\w+$", "")
kanemku
7 - Meteor

Thank both of you @Rana_Kareem and @DataNath for your solutions. Both of those suggestions work perfectly for people with one word last names such as SMITH, JOHN MICHAEL (results in SMITH, JOHN as intended), but for people with two or more word last names, it doesn't work. Example of that would be OSPINA ALVARADO, JUAN, it results in "OSPINA ALVARADO,". I want to keep everything before the comma as the last name, and the first name is what's after the comma up to the first space, 

 

JOHNSON SMITH, JAMEEL BRYAN

LAST = JOHNSON SMITH

FIRST = JAMEEL

MIDDLE = BRYAN

RESULT = JOHNSON SMITH, JAMEEL

 

DOE, JANE ANN MARI

LAST = DOE

FIRST = JANE

MIDDLE = ANN MARI

RESULT = DOE, JANE

 

@caltang The last name is separated by the comma, and the delimiter for FIRST and MIDDLE is a space. 

Examples of names:

DOE, JOHN

DOE, JANE ANN MARI

LAWSON, TALIYAH JANE

SMITH DOE, JOHN

NAVARRO-HERNANDEZ, JOSHUA

DE MELO CORREIA PINTO, MANUEL

 

I hope this helps. Please let me know if any of you need more info or clarity. 

DataNath
17 - Castor

@kanemku a few ways to do this if you just want everything up until the comma and then the first word afterwards. Using this expression in a Formula tool is one - just replace [Name] with whatever your field is:

 

REGEX_Replace([Name], '(.+\,\s[^\s]+).*', '$1')
kanemku
7 - Meteor

@DataNath Thank you so much for this! This gave me the exact result that I needed. Do you have any recommended resources for learning REGEX? 

DataNath
17 - Castor

Great to hear @kanemku! In terms of learning, I'd go over the interactive lessons here - https://community.alteryx.com/t5/Interactive-Lessons/tkb-p/interactive-lessons/label-name/Parsing%20...

 

A quick YouTube search will bring you to a load of videos with examples etc too! It's very useful to have resources like a cheat sheet (https://www.rexegg.com/regex-quickstart.html) and Regex101 (https://regex101.com/) handy as well for refreshers and testing out your expressions before implementing them.

Labels