Alteryx Designer Desktop Discussions

SOLVED

How to separate Latin and non-Latin characters?

SialB
5 - Atom

Hi everybody,

First of all, sorry for my English level...

I have a huge list of people with many different names, written like "Arthur", or sometimes like "آرثر" or "亞瑟"...

How can I separate these different versions?

I've already tried using a Filter with this formula:

REGEX_Match([FirstName], "\w[\w+|\s|-]+")

But I don't know whether Alteryx runs regex expressions with "/u" at the end to match full Unicode or not...

If somebody has an idea that could help me..

Thanks ☺
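
The question about "/u" matters because, in a Unicode-aware engine, `\w` matches letters from every script, so the pattern above cannot tell the three spellings apart. The sketch below shows this with Python's `re` module, not with Alteryx's own regex engine, so treat it as an illustration of the principle only. The `matches` helper and the sample dictionary are hypothetical names, and `fullmatch` is used on the assumption that REGEX_Match has to match the whole field.

```python
import re

# The pattern from the post, unchanged.
PATTERN = r"\w[\w+|\s|-]+"

# Sample names written with escapes so the code points are unambiguous.
NAMES = {
    "latin": "Arthur",
    "arabic": "\u0622\u0631\u062b\u0631",
    "cjk": "\u4e9e\u745f",
}


def matches(name: str, ascii_only: bool = False) -> bool:
    """Return True if the whole name matches PATTERN.

    ascii_only=True restricts \\w and \\s to ASCII (Python's re.ASCII flag),
    which mimics an engine that is not Unicode-aware.
    """
    flags = re.ASCII if ascii_only else 0
    return re.fullmatch(PATTERN, name, flags) is not None


# Unicode-aware matching: every script matches, so nothing is separated.
unicode_hits = {key: matches(value) for key, value in NAMES.items()}

# ASCII-only matching: only the Latin spelling matches.
ascii_hits = {key: matches(value, ascii_only=True) for key, value in NAMES.items()}
```

Either way, `\w` is the wrong tool for this job: separating scripts needs an explicit character range or a script property rather than a generic "word character" class.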

jdunkerley79
ACE Emeritus

I think this:

REGEX_MATCH([FirstName],"^[A-Z -]+$")

should do what you need.

SialB
5 - Atom

Thank you for your answer.

It works for separating Latin and non-Latin characters, but unfortunately this expression doesn't match accented characters... :(

jdunkerley79
ACE Emeritus

This should also match the extended Latin Unicode range (U+0080 to U+024F, i.e. Latin-1 Supplement plus Latin Extended-A and Extended-B):

REGEX_MATCH([FirstName],"^[A-Z \x{80}-\x{24F}-]+$")
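
For checking the same range outside Alteryx, here is a minimal Python sketch of the idea. It is an analogue, not the Alteryx implementation: Python's `re` writes the code points as `\u0080` and `\u024F` instead of `\x{80}` and `\x{24F}`, and it is case-sensitive by default, so `a-z` is listed explicitly. The names `is_latin_name` and `split_names` are hypothetical.

```python
import re

# ASCII letters, space, hyphen, plus U+0080..U+024F
# (Latin-1 Supplement, Latin Extended-A, Latin Extended-B).
LATIN_NAME = re.compile(r"[A-Za-z \u0080-\u024F-]+")


def is_latin_name(name: str) -> bool:
    """Return True if every character of name falls in the allowed set."""
    return LATIN_NAME.fullmatch(name) is not None


def split_names(names):
    """Split names into (latin, other) lists, preserving input order."""
    latin = [n for n in names if is_latin_name(n)]
    other = [n for n in names if not is_latin_name(n)]
    return latin, other


sample = [
    "Arthur",
    "\u00c9lodie",                # Elodie with an acute accent on the E
    "Jean-Luc",
    "\u0622\u0631\u062b\u0631",   # Arabic spelling
    "\u4e9e\u745f",               # Chinese spelling
]
latin, other = split_names(sample)
```

Note that the range is a block range, not a letters-only class: it also admits non-letters such as the C1 control characters (U+0080 to U+009F) and the multiplication sign (U+00D7). For a list of first names that is usually harmless.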
SialB
5 - Atom

This is exactly what I'm looking for!

Thank you so much

dnbShiner15
7 - Meteor

BrainTrust... I am getting educated on RegEx_Match and using Alteryx. For some reason I get a match using the RegEx101.com tester, but when I place the exact same formula into Alteryx I get no match. Can someone please help me figure out what I am doing wrong?

Here is the RegEx formula:

^[\p{Latin}\s[:punct:]]+$

Test text: CreationVijay

Here is the Alteryx formula I am using:

REGEX_MATCH([companyname],"^[\p{Latin}\s[:punct:]]+$")

Again, RegEx101.com says it is a match, but the result is the opposite when run in Alteryx.

Please let me know!

Thanks!
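
As a cross-check that does not depend on any regex engine's support for `\p{Latin}`, the same intent can be approximated in plain Python. This is only a sketch under stated assumptions: it treats a character as Latin when its Unicode character name starts with "LATIN" (an approximation of the script property), it uses ASCII punctuation for `[:punct:]`, and `is_latin_text` is a hypothetical helper. It does not diagnose why Alteryx returns no match.

```python
import string
import unicodedata


def is_latin_text(text: str) -> bool:
    """Approximate the intent of the pattern: Latin letters, whitespace, punctuation.

    Returns False for an empty string, mirroring the "+" quantifier.
    """
    if not text:
        return False
    for ch in text:
        # Whitespace and ASCII punctuation are always allowed.
        if ch.isspace() or ch in string.punctuation:
            continue
        # Latin-script letters have names such as "LATIN SMALL LETTER A".
        if unicodedata.name(ch, "").startswith("LATIN"):
            continue
        return False
    return True


result = is_latin_text("CreationVijay")
```

Under this approximation the test text counts as Latin, which agrees with the RegEx101.com result described above.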
