Alteryx Designer Desktop Discussions

SOLVED

String data containing names

Justin53Q
6 - Meteoroid

Hi Community members from a wet and dull Oxfordshire 😀

I have a column containing plenty of string text, and I'm trying to extract the names from it. I was really hoping to use the new Named Entity Recognition tool, which I think may have helped, but it's not available yet.

Anyway, I have created some dummy text by way of example. I wondered what the best approach would be when the number of names, their position, and the text around them all change throughout. Do I need to get to grips with Regex? 😯

Thank you 

Justin 

 

6 REPLIES
binuacs
20 - Arcturus
TheOC
15 - Aurora

hey @Justin53Q 

Greetings from a very wet and dull Newcastle!

Will your data be coming in from a docx file, or is this just for example's sake?

Reading a docx into Alteryx may require some Python/macro ability. Also, you're totally right, named entity recognition might be useful here. However, I'm sure we can find a solution before the release of that tool!

Cheers,
TheOC


mceleavey
17 - Castor

Hi @Justin53Q ,

 

With regex you are relying on a certain consistency within the string. That's not to say it can't have an element of dynamism. For example, in the text string you provided, every name is a single uppercase letter followed by a space and then a capitalised string of letters. That is also why it's probably not a very good example: it's unlikely to be representative of your dataset.

 

If it is, the regex string would simply be something like: 

(\u\s\u.*?\>)

Used in a RegEx tool set to tokenise:

[Screenshot: mceleavey_0-1647444245232.png]

 

This will split to rows on each Name:

[Screenshot: mceleavey_1-1647444267361.png]

 

However, this is unlikely to be the case.
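
If it helps to test the pattern outside Alteryx, here is a minimal Python sketch of the same idea. It is only an approximation: Python's `re` module has no `\u` or `\>`, so `[A-Z]` and `\b` stand in for them, and both the sample sentence and the `extract_names` helper are made up for illustration rather than taken from your data.

```python
import re

# Rough Python translation of (\u\s\u.*?\>):
# [A-Z] stands in for \u (uppercase letter), \b for \> (end of word).
NAME_PATTERN = re.compile(r"[A-Z]\s[A-Z].*?\b")


def extract_names(text):
    """Return every 'initial + surname' match, one per list element."""
    return NAME_PATTERN.findall(text)


# Hypothetical sample text, not the original poster's data.
sample = "Meeting attended by J Smith and A Patel, with notes from R Jones."
names = extract_names(sample)

# One name per line, similar in spirit to tokenising to rows.
for name in names:
    print(name)
```

The same weakness applies here as in Alteryx: anything shaped like "uppercase letter, space, capitalised word" will match, whether or not it is actually a name.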

 

Hope this gets you off and running.

 

M.

 

 

 




Justin53Q
6 - Meteoroid

Thank you @TheOC

The string data is contained in one Excel sheet. Apologies for not making that clear.

Justin 

Justin53Q
6 - Meteoroid

Thank you binuacs

 

 

Justin53Q
6 - Meteoroid

Thank you mceleavey

 

 
