Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Cleaning up Last Names

JPSeagull
8 - Asteroid

Hi. I am using a formula to clean up Title Case from cleansing tool of last names. It works great except for names like McDonald, MacDonald, DeLeon, etc.  I have this formula found in the community:

 

if contains([Last Name], "-") then Replace(TitleCase(ReplaceChar([Last Name], "-", '$')), '$',"-")
elseif contains([Last Name], "'") then Replace(TitleCase(ReplaceChar([Last Name], "'", '$')), '$',"'")
elseif contains([Last Name], "Mc") then Replace(TitleCase(ReplaceChar([Last Name], "Mc", '$')), '$', "Mc")
else TitleCase([Last Name]) endif

 

The results from:

 

JOHN WHITE-JONES
TIMOTHY O'LEARY
JACK MCCLUNG

 

Along with Cleanse to Title Case only on first name gets me this:

 

John White-Jones (correct)

Timothy O'Leary (correct)

Jack McCclung (incorrect - should be Jack McClung)

 

Any suggestions? I know there are Van Horns, DeLeone, MacDonald, De Leone, etc. to consider, but most of my names are hyphens, apostrophes, and Mc. 

 

Also, used McClung because of the challenge of getting to that second C to be upper case.

 

Thanks.

4 REPLIES 4
KGT
11 - Bolide

That is random chance that you didn't have more errors that would have helped to troubleshoot.

 

The issue is that you're using ReplaceChar rather than replace on the last If condition. So and Uppercase M or Lowercase c are being replaced by $. 

 

So, theoretically, the formula should change to this

if contains([Last Name], "-") then
Replace(TitleCase(ReplaceChar([Last Name], "-", '$')), '$',"-")
elseif contains([Last Name], "'") then
Replace(TitleCase(ReplaceChar([Last Name], "'", '$')), '$',"'")
elseif contains([Last Name], "Mc") then
Replace(TitleCase(Replace([Last Name], "Mc", '$')), '$', "Mc")
else TitleCase([Last Name]) endif

flying008
15 - Aurora

Hi, @JPSeagull 

 

FYI.

 

 

 

REGEX_Replace(TitleCase(LowerCase([Last Name])), "(-|'|\<Mc|\<Mac|\<De)([a-z])", '$1\u$2')

 

 

 

录制_2024_07_23_10_12_41_423.gif

JPSeagull
8 - Asteroid

@KGT - well, the randomness of the names was intentionally. I have thousands, but just wanted to tackle the top 3 styles of using the hyphen, the apostrophes and our Irish names. Next are the Scottish names (Mac) and those who have accents like Joséphine. 

 

Your tip solved my Irish names - thank you! 

 

 

JPSeagull
8 - Asteroid

@flying008 - Thank you! Even cleaner solution. It is straightforward and less coding. Only thing I had to do was remove the Mac and the De because I have regular last names like Mackey and Delaney. So, just have to keep a list of exceptions for the Mac and the De which are not many. Probably will just add a column with a flag if the name is exempted from specialized casing.

 

Thank you again!

Labels