
SOLVED

Regex to remove unicode for fuzzy match

maleryx
7 - Meteor

I am currently using REGEX_Replace([_CurrentField_],'[^\w]', '') on all text fields. Is there any way to prevent the "Some Unicode characters were not convertable" error?

 

ConvError: Fuzzy Match (29): Comments: Some Unicode characters were not convertable ("therrror ÃÂÃâšÂÃâÅthisisjustasamplesentencebecausethisisasamplesentecetoreplicable the error")

4 REPLIES
afv2688
16 - Nebula

Hello @maleryx,

 

You could try using this formula:

 

DecomposeUnicodeForMatch([Field1])

 

This removes accents from your characters and converts the text to a narrow (non-Unicode) string.
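To make the idea concrete, here is a rough Python analogue of what a decompose-for-match step does. It is only a sketch: the helper name `decompose_for_match` is made up for illustration, and the lowercasing step is an assumption based on my reading of the function's documentation, not a statement of how Alteryx implements it.

```python
import unicodedata


def decompose_for_match(text: str) -> str:
    """Hypothetical analogue: strip accents and lowercase the text."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so only the base letters remain.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercasing is assumed here to mirror the documented behaviour.
    return stripped.lower()


print(decompose_for_match("Café Señor"))  # cafe senor
```

Characters that have no Unicode decomposition pass through this step unchanged, so they still need separate handling.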

 

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

Regards

maleryx
7 - Meteor

ConvError: Multi-Field Formula (42): DecomposeUnicodeForMatch: Some Unicode characters were not convertable ("xxxÃââœ199xx1")

ConvError: Multi-Field Formula (42): DecomposeUnicodeForMatch: Some Unicode characters were not convertable ("xαxxxxxx_x48xxx22142")

afv2688
16 - Nebula

Hello @maleryx,

 

The function I showed you handles most of the characters. There are some which cannot be converted, like

 

ƒ

OR

œ

 

You can either remove those with a RegEx replace or, as far as I know, there is little else you can do about them.
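As an illustration of the regex route, the Python sketch below removes everything outside the ASCII range. It also shows why a `[^\w]` pattern may not be enough: in a Unicode-aware regex engine, `\w` matches letters such as ƒ and œ, so they survive the replace. The helper name `strip_non_ascii` is hypothetical. The same character class, `[^\x00-\x7F]`, may also work in REGEX_Replace, but that is an assumption and has not been confirmed in this thread.

```python
import re

# Matches any character outside the 7-bit ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def strip_non_ascii(text: str) -> str:
    """Hypothetical helper: delete every non-ASCII character."""
    return NON_ASCII.sub("", text)


sample = "aƒbœc"

# \w is Unicode-aware in Python 3, so ƒ and œ count as word characters.
print(re.sub(r"[^\w]", "", sample))  # aƒbœc
# The explicit ASCII range removes them.
print(strip_non_ascii(sample))  # abc
```

Note that this deletes the characters rather than converting them, so it is best applied after any accent-removal step, otherwise letters like é would be dropped instead of becoming e.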

 

Regards

maleryx
7 - Meteor

Hello @afv2688

 

I am trying ConvertToCodePage([_CurrentField_], 65001). I will let you know the outcome.
