
Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Regex non-english characters

Meteor

Hello,

 

I have fields that contain both English and non-English characters like ü, ö, ß. What regex would match the non-English characters?

Alteryx Certified Partner

Characters not in a-z & A-Z would be:

 

[^a-zA-Z]

 

So you may use something like:

 

Regex_CountMatches([String_Field],"[^a-zA-Z]")

Because this function has a case option (the default value of 1 means case-insensitive), just searching for [^a-z] may work too.
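For readers who want to try the pattern outside Alteryx, here is a minimal Python sketch of the same idea. The helper name `count_matches` and its `icase` parameter are hypothetical and only loosely imitate the case option described above. Python's `re` module is a different regex engine from the one Alteryx uses, so results inside Alteryx may differ.

```python
import re

def count_matches(text: str, pattern: str, icase: bool = True) -> int:
    """Count non-overlapping matches of pattern in text.

    Hypothetical helper: icase=True imitates a case-insensitive default.
    """
    flags = re.IGNORECASE if icase else 0
    return len(re.findall(pattern, text, flags))

sample = "Müller Straße"

# Explicit upper- and lower-case ranges: ü, the space, and ß are counted.
explicit = count_matches(sample, "[^a-zA-Z]")

# Lower-case range only, relying on case-insensitive matching.
insensitive = count_matches(sample, "[^a-z]")

# Lower-case range with case sensitivity on: M and S are counted too.
sensitive = count_matches(sample, "[^a-z]", icase=False)

print(explicit, insensitive, sensitive)
```

Note that the space is counted as well, since it is not a letter either.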

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
Alteryx Certified Partner

You might want to add other characters to the set....

 

space is one that you might want along with numbers...

 

 

"[^a-zA-Z0-9\s]"

 

Oh, yes... \W matches non-word characters (anything other than letters, digits, and the underscore), and that includes spaces.

 

lots of options...
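To see how these options differ, here is a small Python sketch run against one sample string. It uses Python's `re` module, which is an assumption for illustration only. Whether \W treats accented letters as word characters depends on the engine and its Unicode settings, so it is worth checking against real data in Alteryx.

```python
import re

sample = "Grüße 123"

# Negated set with letters, digits, and whitespace excluded:
# only the accented characters remain.
custom_set = re.findall(r"[^a-zA-Z0-9\s]", sample)

# \W in Python's default (Unicode) mode: ü and ß count as word
# characters, so only the space is a non-word character.
non_word_unicode = re.findall(r"\W", sample)

# \W restricted to ASCII word characters: ü, ß, and the space all match.
non_word_ascii = re.findall(r"\W", sample, flags=re.ASCII)

print(custom_set)
print(non_word_unicode)
print(non_word_ascii)
```

The explicit set is the most predictable of the three, because it does not depend on how the engine defines a word character.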

Alteryx Alumni (Retired)

An interesting webpage on regex and unicode characters...

http://www.regular-expressions.info/unicode.html

Moderator

Based on the ASCII table (http://www.asciitable.com), I would suggest the following regex:

 

[^\x00-\x7F]

 

 

Basically, this matches every character whose code is outside the ASCII range, that is, not between 0x00 and 0x7F.

 

Remark:

\xdd

A hexadecimal escape sequence - matches the single character whose code point is 0xdd.

 

Source: http://www.boost.org/doc/libs/1_62_0/libs/regex/doc/html/boost_regex/syntax/basic_extended.html
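As a quick check of the non-ASCII pattern, here is a minimal Python sketch. The function names are hypothetical, and Python's `re` module stands in for the Boost-based engine referenced above, so treat it as an illustration rather than a statement of how Alteryx behaves.

```python
import re

# Any character whose code point is above 0x7F, i.e. outside ASCII.
NON_ASCII = re.compile(r"[^\x00-\x7F]")

def non_ascii_chars(text: str) -> list[str]:
    """Return every non-ASCII character in text, in order of appearance."""
    return NON_ASCII.findall(text)

def strip_non_ascii(text: str) -> str:
    """Return text with all non-ASCII characters removed."""
    return NON_ASCII.sub("", text)

sample = "Köln ist groß"
print(non_ascii_chars(sample))
print(strip_non_ascii(sample))
```

Unlike the negated letter sets, this pattern leaves spaces, digits, and ASCII punctuation alone, so it flags only the characters outside the ASCII table.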

 

 

Paul NOIREL

Customer Support Engineer
