
Alteryx Designer Desktop Discussions

SOLVED

String data containing names

Justin53Q
6 - Meteoroid

Hi Community members from a wet and dull Oxfordshire 😀

I have a column containing a lot of free text, and I'm trying to extract the names from it. I was really hoping to use the new Named Entity Recognition tool, which I think may have helped, but it's not available yet.

Anyway, I have created some dummy text by way of example. What would be the best approach when the number of names and their position change throughout the text, as does the other text around those names? Do I need to get to grips with Regex? 😯

Thank you 

Justin 

 

6 REPLIES
binuacs
20 - Arcturus
TheOC
15 - Aurora

hey @Justin53Q 

Greetings from a very wet and dull Newcastle!

Will your data be coming in from a docx file? Or is this just for example's sake?

Reading a docx into Alteryx may require some Python/macro ability. Also, you're totally right, named entity recognition might be useful here, but I'm sure we can find a solution before the release of that tool!

Cheers,
TheOC


mceleavey
17 - Castor

Hi @Justin53Q ,

 

With regex you are relying on a certain consistency within the string. That's not to say it can't have an element of dynamism. For example, in the text string you provided, every name is a single uppercase letter followed by a space followed by a string of letters. That is why it's probably not a very good example, as it's unlikely to be representative of your dataset.

 

If it is, the regex string would simply be something like: 

(\u\s\u.*?\>)

Used in a tokenise function:

mceleavey_0-1647444245232.png

 

This will split to rows on each Name:

mceleavey_1-1647444267361.png
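
For anyone who wants to try the same idea outside the RegEx tool (for example in a Python tool), here is a rough sketch of the tokenise step. It assumes that \u means "one uppercase letter" and \> means "end of word", as described above. Python's re module has neither escape, so [A-Z] and \b are swapped in. The sample sentence and the function name tokenise_names are made up for illustration and are not from the original attachment.

```python
import re

# Python's re module has no \u (uppercase) or \> (end of word) escapes,
# so [A-Z] and \b stand in for them here (an assumed translation).
NAME_PATTERN = re.compile(r"([A-Z]\s[A-Z].*?\b)")


def tokenise_names(text: str) -> list[str]:
    """Return every 'initial + surname' match, one list item per name."""
    return NAME_PATTERN.findall(text)


# Hypothetical dummy text, not taken from the original post.
sample = "Meeting attended by J Smith and later joined by A Patel, then R Jones left."

for name in tokenise_names(sample):
    print(name)  # one name per line, similar to tokenising to rows
```

As with the original pattern, this only works while every name really is "initial, space, surname". It will also pick up any other capital letter followed by a space and a capitalised word, so treat it as a starting point rather than a finished solution.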

 

However, this is unlikely to be the case.

 

Hope this gets you off and running.

 

M.

 

 

 




Justin53Q
6 - Meteoroid

Thank you @theOC

The string data is contained in one excel sheet. Apologies for not making that clear.

Justin 

Justin53Q
6 - Meteoroid

Thank you binuacs

 

 

Justin53Q
6 - Meteoroid

Thank you mceleavey

 

 
