Hello,
Would it be possible to replace punctuation with blanks, except for specific characters?
I understand that we can use:
REGEX_Replace([Field1],"[[:punct:]]", '')
to replace punctuation with blanks.
However, I would like to keep the following characters:
()-;,.'&
I also noticed that the regex formula didn't remove "⊥", although I would like it to be replaced with a blank as well.
I've attached a sample workflow.
Thank you for your help in advance.
Sincerely,
knozawa
Hi @Thableaus,
Thank you for your quick reply! It worked well.
FYI: I added a space to the formula after '&' so that the spaces between words are not removed.
REGEX_Replace([Field1],"[^-a-zA-Z0-9();,.'& ]", '')
Sincerely,
knozawa
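For readers who want to try the whitelist idea outside Alteryx, here is a minimal sketch in Python. It assumes Python's `re` module treats this negated character class the same way Alteryx's REGEX_Replace does; the function name `strip_unlisted` is made up for illustration.

```python
import re

# Same negated character class as the formula above: every character that is
# NOT a hyphen, ASCII letter, digit, one of ( ) ; , . ' & or a space is removed.
KEEP_ASCII = re.compile(r"[^-a-zA-Z0-9();,.'& ]")

def strip_unlisted(text: str) -> str:
    """Remove every character that is not on the ASCII whitelist."""
    return KEEP_ASCII.sub("", text)

print(strip_unlisted("Smith & Co. (UK) #1!"))  # Smith & Co. (UK) 1
print(strip_unlisted("Clínico"))               # Clnico
```

Because the class only lists ASCII letters, any accented letter is treated like unwanted punctuation and removed.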
Hi @Thableaus ,
I actually ran into another issue.
It seems like all accented (non-ASCII) characters were also removed by the formula:
Clínico --> Clnico
São --> So
Sørlandet --> Srlandet
Linköping --> Linkping
There are lots of Unicode characters that I don't want to remove. Do you think I should just list them within the regex replace formula? Or is there another way to keep those Unicode characters?
I checked this link
We probably cannot use this expression in Alteryx:
[^[:unicode:]]
But maybe I could use a Unicode range instead of listing out all the Unicode characters.
In that case, do you know how to add Unicode ranges to the same regex replace formula?
REGEX_Replace([Field1],"[^-a-zA-Z0-9();,.'& \U+00C0-\U+00D1]", '')
This didn't work.
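A hedged sketch of the range idea in Python, where code points inside a character class are written as `\uXXXX` rather than `\U+XXXX`. Whether Alteryx accepts an equivalent escape (for example `\x{00C0}-\x{00FF}`) is an assumption that has not been tested here. Note also that the range À to Ñ (U+00C0 to U+00D1) would not cover í, ã, ø or ö, so this sketch widens it to U+00C0 through U+00FF. The function name `strip_with_range` is hypothetical.

```python
import re

# ASCII whitelist plus the Latin-1 letter range U+00C0..U+00FF.
# This range also contains the symbols × (U+00D7) and ÷ (U+00F7),
# so those two are kept as well.
KEEP_WITH_RANGE = re.compile(r"[^-a-zA-Z0-9();,.'& \u00C0-\u00FF]")

def strip_with_range(text: str) -> str:
    """Remove characters outside the whitelist and the U+00C0..U+00FF range."""
    return KEEP_WITH_RANGE.sub("", text)

for word in ["Clínico", "São", "Sørlandet", "Linköping #1!"]:
    print(strip_with_range(word))
```

A fixed range only protects the letters it spans; names with characters outside Latin-1 (for example ł or ő) would still lose them.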
Sincerely,
knozawa
Try this:
REGEX_Replace([Field1],"[^-\w();,.'&\s]", '')
I think \w matches any letter or digit in Unicode (and also the underscore).
Cheers,
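As a rough cross-check of the \w suggestion, here is the same pattern in Python, where `\w` is Unicode-aware for text strings. This is only a sketch: it assumes Alteryx's engine handles `\w` in a comparable way, and the function name `strip_keep_word_chars` is made up for illustration.

```python
import re

# Keep hyphen, word characters (\w), ( ) ; , . ' & and whitespace (\s);
# remove everything else. In Python, \w covers Unicode letters and digits
# and also the underscore.
KEEP_WORD_CHARS = re.compile(r"[^-\w();,.'&\s]")

def strip_keep_word_chars(text: str) -> str:
    """Remove characters that are neither word characters nor whitelisted."""
    return KEEP_WORD_CHARS.sub("", text)

for word in ["Clínico", "São", "Sørlandet", "Linköping #1!", "A ⊥ B"]:
    print(strip_keep_word_chars(word))
```

One side effect worth knowing: since `\w` includes the underscore, `_` survives this formula even though it is not in the original list of characters to keep.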