Alteryx Designer Desktop Discussions

Justin53Q · ‎03-16-2022

Hi Community members from a wet and dull Oxfordshire 😀

I have a column containing plenty of string text, and I'm trying to extract the names from it. I was really hoping to use the new Named Entity Recognition tool, which I think might have helped, but it's not available yet.

Anyway, I have created some dummy text by way of example, and I wondered what the best approach would be when the number of names, their position, and the surrounding text all vary throughout. Do I need to get to grips with Regex? 😯

Thank you

Justin

binu_acs · ‎03-16-2022

@Justin53Q

TheOC · ‎03-16-2022

hey @Justin53Q

Greetings from a very wet and dull Newcastle!

Will your data be coming in from a docx file? Or is this just for example's sake?

Reading a docx into Alteryx may require some Python/macro ability. Also, you're totally right, named entity recognition might be useful here; however, I'm sure we can find a solution before the release of that tool!

Cheers,
TheOC


mceleavey · ‎03-16-2022

Hi @Justin53Q ,

With regex you are relying on a certain consistency within the string. That's not to say it can't have an element of dynamism. For example, in the text string you provided, every name is a single uppercase letter, followed by a space, followed by a string of letters. That is also why it's probably not a very good example: it's unlikely to be representative of your dataset.

If it is, the regex string would simply be something like:

(\u\s\u.*?\>)

Used in a tokenise function, this will split to rows on each name.
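For anyone who wants to try the idea outside Alteryx, here is a minimal Python sketch of the same tokenise-to-rows approach. It assumes that `\u` in the pattern above means "any uppercase letter" and `\>` means "end of word", which are translated here to `[A-Z]` and `\b`. The sample rows and the `tokenize_names` helper are hypothetical and only for illustration; they are not the dummy text from the original post.

```python
import re

# Assumed Python translation of the pattern (\u\s\u.*?\>):
#   \u  (uppercase letter) -> [A-Z]
#   \>  (end of word)      -> \b
NAME_PATTERN = re.compile(r"[A-Z]\s[A-Z].*?\b")


def tokenize_names(rows):
    """Return one (row_id, name) pair per regex match, similar to splitting to rows."""
    out = []
    for row_id, text in rows:
        for name in NAME_PATTERN.findall(text):
            out.append((row_id, name))
    return out


# Hypothetical sample data: (row id, free text)
sample = [
    (1, "Meeting attended by J Smith and A Patel, chaired by R Jones."),
    (2, "No initials in this row."),
]

for row in tokenize_names(sample):
    print(row)
```

With this sample, row 1 yields `J Smith`, `A Patel` and `R Jones`, and row 2 yields nothing. As with the original pattern, it only works while every name is an initial followed by a capitalised word, so full first names or lowercase surnames would be missed.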

However, this is unlikely to be the case.

Hope this gets you off and running.

M.

Justin53Q · ‎03-16-2022

Thank you @theOC

The string data is contained in one excel sheet. Apologies for not making that clear.

Justin

Justin53Q · ‎03-16-2022

Thank you binuacs

Justin53Q · ‎03-16-2022

Thank you mceleavey
