
Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
SOLVED

RegEx using Unicode Blocks Inquiry

While trying to use Unicode blocks in either the RegEx tool or the REGEX_CountMatches formula, I am not able to get correct results.

The ultimate goal is to get a character count using the \p{IsLatin} block or variations of other blocks.

 

Could anybody be so kind as to help or share experience in this area, please? Or confirm that this character class syntax is not supported in Alteryx?

 

Main resources I have used so far:

 

Issues using the RegEx tool (using only InLatin_Basic for simplicity):

  1. [[:InLatin_Basic:]] - error: RegEx (#): RegEx: An invalid character class name was specified in a [[:name:]] block at character 3
  2. [:InLatin_Basic:] - no error, but an incorrect result: for example, 'e' and 'm' were not matched

As a workaround, using the Unicode range seems to work OK ([\U+0000-\U+007F]), but this is a bit cumbersome, especially when trying to work with multiple blocks.
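
For comparison outside Alteryx, the range-based workaround can be sketched in Python. This is only an illustration of counting characters by explicit code point range; the function name `count_basic_latin` is made up for the example, and the `\uXXXX` escape is Python's syntax, not necessarily what the Alteryx RegEx tool accepts.

```python
import re

# Basic Latin block (U+0000 to U+007F) written as an explicit range.
# Assumption: Python's re module, not the Boost engine used by Alteryx.
BASIC_LATIN = re.compile(r"[\u0000-\u007F]")

def count_basic_latin(text: str) -> int:
    """Return how many characters of `text` fall in U+0000..U+007F."""
    return len(BASIC_LATIN.findall(text))

if __name__ == "__main__":
    # 'ě' (U+011B) and 'í' (U+00ED) are outside the range, so only m, s, c count.
    print(count_basic_latin("měsíc"))
```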

 

Testing workflow is attached.

Many thanks in advance.


Hi @IvanaF

 

Not sure if I fully understand what your end result should be, but would this expression work within the RegEx tool?

[^[:unicode:]]
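
As far as I understand the Boost documentation, [[:unicode:]] matches any character with a code point above 255, so the negated form above would match code points 0 to 255 (Basic Latin plus Latin-1 Supplement). A rough Python equivalent, assuming that reading is right, might look like this; `count_up_to_255` is a hypothetical helper name.

```python
import re

# Assumed equivalent of Boost's [^[:unicode:]]: code points U+0000..U+00FF.
UP_TO_255 = re.compile(r"[\u0000-\u00FF]")

def count_up_to_255(text: str) -> int:
    """Count characters whose code point is 255 or lower."""
    return len(UP_TO_255.findall(text))

if __name__ == "__main__":
    # 'í' (U+00ED) is counted here, 'ě' (U+011B) is not.
    print(count_up_to_255("měsíc"))
```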


I don't believe the Boost library (which Alteryx uses for its Regex functionality) supports Unicode blocks at present.

 

The recommended workaround I know of is to do what you were already doing with Unicode ranges.
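
To make ranges less cumbersome across several blocks, one option is to keep the block boundaries in a small lookup table and generate the character class from it. The sketch below is Python rather than an Alteryx expression; the `BLOCKS` table and the helper names are assumptions for illustration, with boundaries taken from the Unicode block list.

```python
import re

# Hypothetical lookup table: block name -> (first, last) code point.
BLOCKS = {
    "Basic Latin": (0x0000, 0x007F),
    "Latin-1 Supplement": (0x0080, 0x00FF),
    "Latin Extended-A": (0x0100, 0x017F),
    "Latin Extended-B": (0x0180, 0x024F),
}

def block_class(*names: str) -> str:
    """Build a character class such as [\\u0000-\\u007F] from block names."""
    parts = []
    for name in names:
        first, last = BLOCKS[name]
        parts.append(f"\\u{first:04X}-\\u{last:04X}")
    return "[" + "".join(parts) + "]"

def count_in_blocks(text: str, *names: str) -> int:
    """Count characters of `text` that belong to any of the named blocks."""
    return len(re.findall(block_class(*names), text))

if __name__ == "__main__":
    print(block_class("Basic Latin", "Latin-1 Supplement"))
    print(count_in_blocks("měsíc", "Basic Latin", "Latin Extended-A"))
```

The generated pattern string could then be pasted into whichever regex engine is in use, adjusting the escape syntax as that engine requires.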


Hi @jrgo, many thanks for the suggestion; it helped me find additional resources in the Alteryx help and can definitely be used in certain use cases. Thanks again for your time on this.


Hi @jdunkerley79, many thanks too for your time and advice. Cheers!
