When I try to use Unicode blocks in either the RegEx tool or the REGEX_CountMatches formula, I cannot get correct results.
The ultimate goal is to get a character count using the \p{IsLatin} block or variations of other blocks.
Could anybody be so kind as to help or share experience in this area, please? Or confirm whether this character-class syntax is simply not supported in Alteryx?
Main resources I have used so far:
Issues using RegEx tool (using only InLatin_Basic for simplicity):
As a workaround, using the Unicode range seems to work OK ([\U+0000-\U+007F]), but this is a bit cumbersome, especially when trying to work with multiple blocks.
Testing workflow is attached.
Many thanks in advance.
Hi @IvanaF
Not sure if I fully understand what your end result should be, but would this expression work within the RegEx tool?
[^[:unicode:]]
I don't believe the Boost library (which Alteryx uses for its Regex functionality) supports Unicode blocks at present.
The recommended workaround I know of is to keep doing what you were already doing with Unicode ranges.
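To make the range approach less cumbersome with several blocks, one option is to keep the block boundaries in a lookup table and build the character class from it. Below is a minimal sketch of that idea in Python, not Alteryx: the `BLOCKS` table and the `block_class` and `count_in_blocks` helpers are hypothetical names of my own, only a handful of blocks are listed (boundaries taken from the Unicode standard), and the generated pattern uses Python's `\UXXXXXXXX` escape syntax, so it would need adapting before being pasted into the RegEx tool.

```python
import re

# Hypothetical lookup table: block name -> (first, last) code point.
# Boundaries are from the Unicode standard; extend as needed.
BLOCKS = {
    "Basic_Latin": (0x0000, 0x007F),
    "Latin-1_Supplement": (0x0080, 0x00FF),
    "Latin_Extended-A": (0x0100, 0x017F),
    "Cyrillic": (0x0400, 0x04FF),
}

def block_class(*names):
    """Build a character class covering the named blocks, e.g. [\\U00000000-\\U0000007F]."""
    parts = []
    for name in names:
        lo, hi = BLOCKS[name]
        # Python's re module accepts \UXXXXXXXX escapes inside a character class.
        parts.append("\\U%08X-\\U%08X" % (lo, hi))
    return "[" + "".join(parts) + "]"

def count_in_blocks(text, *names):
    """Count the characters of text that fall inside any of the named blocks."""
    return len(re.findall(block_class(*names), text))
```

For example, `count_in_blocks("A\u00f1b", "Basic_Latin")` counts only `A` and `b`, while adding `"Latin-1_Supplement"` to the call also counts the `ñ`.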
Hi @jrgo, many thanks for the suggestion; it helped me find additional resources in the Alteryx help and can definitely be used in certain use cases. Thanks again for your time on this.
Hi @jdunkerley79, many thanks too for your time and advice. Cheers!