Hi,
I have already tried several ideas, but my knowledge of Regex_Replace is limited.
I have a dataset with various columns. The "Name" column contains, for example, the full name of a bank (e.g. Branch of the Bank of Russia). I want to remove all two-character words (in this case "of"). Additionally, my data contains a lot of Cyrillic entries, which makes it even more complicated. I don't know how to create the Regex_Replace formula. Or is there another way to solve this problem?
Thanks in advance
@taxhead for your additional requirement, can you provide some sample data and the expected output?
Hi @taxhead ,
I've built a bit of Regex that handles this by matching two characters from the beginning of a word up to the following space, or up to the end of the line.
The Regex is as follows:
(\<\u{2})\s|(\<\u{2})$
The results are:
I hope this helps,
M
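For anyone who wants to try the same idea outside Alteryx, here is a rough Python sketch of the "two characters from the start of a word, then a space or the end of the line" approach. It is not the Alteryx expression itself: Python's re module has no \< or \u, so \b and [A-Za-z] are used as stand-ins (my assumption about the intent), and the function name is made up for illustration.

```python
import re

# Two ASCII letters at the start of a word, followed by whitespace
# or by the end of the string. [A-Za-z] is an assumed stand-in and
# does not cover Cyrillic.
TWO_CHARS = re.compile(r"\b[A-Za-z]{2}(?:\s|$)")


def remove_two_char_words(name: str) -> str:
    """Delete every match of TWO_CHARS from name (hypothetical helper)."""
    return TWO_CHARS.sub("", name)


print(remove_two_char_words("Branch of the Bank of Russia"))
# prints: Branch the Bank Russia
```

A two-letter word at the very end, as in "Bank of", is caught by the $ branch, which leaves the preceding space behind, so a trim afterwards may be useful.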
I would use the RegEx tool itself, with "Replace" as the function, although the REGEX_Replace formula is perfectly valid as well.
Either way, your example is very specific, so I will answer very specifically. If you want to remove any "word" of the form "letter,letter,space", then a RegEx for this in Alteryx is \b[[:alpha:]][[:alpha:]]\s\b
Starting from the innards, [[:alpha:]] represents any alpha character (letter), and there are two of them right next to each other (for 2 adjacent letters). Then there's the \s, which represents a space after the word (so you're not left with a double space once the word is removed). The \b at each end just tells it that there must be a word boundary at that position. So you're saying "any 2-letter word with a space after it", and you replace it with nothing.
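The same breakdown can be written out piece by piece. This is only a Python sketch, not Alteryx syntax: Python's re module does not support [[:alpha:]], so [A-Za-z] is used as an ASCII-only stand-in (an assumption), and the helper name is hypothetical.

```python
import re

# The pattern from the explanation above, one piece per line.
TWO_LETTER_WORD = re.compile(
    r"""
    \b          # word boundary: the match starts at the beginning of a word
    [A-Za-z]    # first letter (ASCII stand-in for the alpha class)
    [A-Za-z]    # second letter
    \s          # exactly one whitespace character after the word
    \b          # word boundary: the next word must start right here
    """,
    re.VERBOSE,
)


def strip_two_letter_words(name: str) -> str:
    """Replace every 'letter, letter, space' word with nothing."""
    return TWO_LETTER_WORD.sub("", name)


print(strip_two_letter_words("Branch of the Bank of Russia"))
# prints: Branch the Bank Russia
```

Because the pattern needs a space and then the start of another word, this sketch leaves a two-letter word alone when it is the last word ("Bank of") or when punctuation follows the space ("Bank of (Moscow)").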
Thanks for the quick responses!
Hmm....I honestly do not know. I have zero experience with non-English characters.
I tried it, but it doesn't work with all Cyrillic characters. I attach some samples below: AK is removed, but e.g. ЕБ is still there. Do I first have to remove all special characters (like "" or ())?
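As a point of comparison, here is a minimal Python sketch in which Cyrillic words and surrounding quotes or brackets are handled without any pre-cleaning. It says nothing about how Alteryx treats [[:alpha:]]; it only shows the idea of a Unicode-aware letter class. The sample strings and the helper name are made up, and [^\W\d_] is used as an approximation of "any letter" in Python 3, where \w and \b are Unicode-aware for str patterns.

```python
import re

# Approximation of "any Unicode letter": a word character that is
# neither a digit nor an underscore.
LETTER = r"[^\W\d_]"

# A whole two-letter word, plus any whitespace that follows it.
TWO_LETTER_WORD = re.compile(rf"\b{LETTER}{{2}}\b\s*")


def drop_two_letter_words(name: str) -> str:
    """Remove two-letter words, Latin or Cyrillic (hypothetical helper)."""
    return TWO_LETTER_WORD.sub("", name).strip()


print(drop_two_letter_words("Branch of the Bank of Russia"))
# prints: Branch the Bank Russia
print(drop_two_letter_words("Банк ЕБ России"))
# prints: Банк России
print(drop_two_letter_words("Банк (ЕБ) России"))
# prints: Банк () России
```

In this sketch the quotes and brackets do not need to be removed first, because \b treats them as non-word characters. The empty "()" that remains would need a separate clean-up step if it is unwanted.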