Alteryx Designer Desktop Discussions

SOLVED

Remove all 2 character "words" in a cell

taxhead
6 - Meteoroid

Hi,

 

I have already tried several ideas, but my knowledge of Regex_Replace is limited.

 

I have a dataset with various columns. The "Name" column contains, for example, the full name of a bank (e.g. Branch of the Bank of Russia). I want to remove all 2-character words (in this case "of"). Additionally, my data contains a lot of Cyrillic entries, which makes it even more complicated. I don't know how to create the Regex_Replace formula. Or is there another way to solve this problem?

 

Thanks in advance 

11 REPLIES
binuacs
21 - Polaris

@taxhead here is one way to remove all the 2-character words:

binuacs_0-1647865901281.png

 

binuacs
21 - Polaris

@taxhead for your additional requirement, can you provide some sample data and the expected output?

mceleavey
17 - Castor

Hi @taxhead ,

 

I've built a bit of Regex that handles this by matching, from the beginning of a word, two characters followed by a space or by the end of the line.

The Regex is as follows:

 

(\<\u{2})\s|(\<\u{2})$

 

The results are:

 

mceleavey_0-1647866030618.png

mceleavey_1-1647866060450.png
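
For anyone who wants to experiment outside Alteryx, here is a rough Python approximation of the pattern above. It assumes that `\<` marks the start of a word and that `\u` matches an uppercase letter, as in the Boost regex syntax; `\b` and `[A-Z]` are ASCII-only stand-ins, and the function name `strip_upper_pairs` is hypothetical.

```python
import re

# \b approximates \< (start of word) and [A-Z]{2} approximates \u{2}.
# The two alternatives mirror the original pattern:
# a two-letter word followed by whitespace, or a two-letter word at end of line.
PATTERN = re.compile(r"(\b[A-Z]{2})\s|(\b[A-Z]{2})$", re.MULTILINE)

def strip_upper_pairs(text: str) -> str:
    """Remove two-letter uppercase words matched by the pattern."""
    return PATTERN.sub("", text)

print(strip_upper_pairs("AK Bars Bank"))
print(strip_upper_pairs("Bank of Russia"))
```

Under these assumptions, lowercase words such as "of" are left alone, because the pattern only looks for uppercase letters.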

 

I hope this helps,

 

M

 

 




mbarone
16 - Nebula

I would use the RegEx tool itself, with "Replace" as the function.  Although the REGEX_Replace formula is perfectly valid as well.

 

Either way, your example is very specific, so I will answer very specifically.  If you want to remove any "word" of the form "letter,letter,space", then a RegEx for this in Alteryx is \b[[:alpha:]][[:alpha:]]\s\b

 

Starting from the innards, [[:alpha:]] represents any alpha character (letter). There are 2 of them right next to each other (for 2 letters next to each other). Then there's the \s, which represents a space after the word (so you're not left with a double space after the word is removed). The \b on each side just tells it that there's a word boundary. So you're saying "any 2-letter word with a space after it", and you replace it with nothing.
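
To try the same idea outside Alteryx, here is a minimal Python sketch. Python's `re` module does not support POSIX classes such as `[[:alpha:]]`, so the sketch swaps in `[A-Za-z]`, which is an assumption and only covers ASCII letters. The function name `remove_two_letter_words` is hypothetical.

```python
import re

# Two ASCII letters starting at a word boundary, followed by one whitespace
# character and then the start of the next word.
# [A-Za-z] stands in for [[:alpha:]], which Python's re does not support.
TWO_LETTER_WORD = re.compile(r"\b[A-Za-z][A-Za-z]\s\b")

def remove_two_letter_words(text: str) -> str:
    """Remove every two-letter word together with the space after it."""
    return TWO_LETTER_WORD.sub("", text)

print(remove_two_letter_words("Branch of the Bank of Russia"))
# prints: Branch the Bank Russia
```

Because the pattern requires a space after the word, a two-letter word at the very end of the cell is left in place.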

taxhead
6 - Meteoroid

Thanks for the quick responses!

 

mbarone
16 - Nebula

Hmm....I honestly do not know.  I have zero experience with non-English characters.

taxhead
6 - Meteoroid

Thanks for the quick responses!

 

I tried it, but it doesn't work with all Cyrillic characters. I attach some samples below: AK is removed, but e.g. ЕБ is still there. Do I first have to remove all special characters (like "" or ())?
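
One way to test this question outside Alteryx is the Python sketch below. It is not a diagnosis of the Alteryx behaviour. It assumes a Unicode-aware letter class (`[^\W\d_]`, meaning a word character that is neither a digit nor an underscore), so Latin and Cyrillic letters are treated alike. The sample strings and the function name `remove_two_letter_words` are hypothetical and are not taken from the attached data.

```python
import re

# [^\W\d_] matches any Unicode letter: a word character that is neither
# a digit nor an underscore. Python 3 str patterns are Unicode-aware.
LETTER = r"[^\W\d_]"
# A standalone two-letter word plus any whitespace that follows it.
TWO_LETTER_WORD = re.compile(rf"\b{LETTER}{{2}}\b\s*")

def remove_two_letter_words(text: str) -> str:
    """Drop two-letter words (Latin or Cyrillic) and tidy the spacing."""
    cleaned = TWO_LETTER_WORD.sub("", text)
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s{2,}", " ", cleaned).strip()

print(remove_two_letter_words("Branch of the Bank of Russia"))
print(remove_two_letter_words('АК "Банк" ЕБ (Москва)'))
```

In this sketch the quotes and brackets do not need to be removed first, because `\b` already treats them as word boundaries.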

 

taxhead_0-1647867812004.png

 

 

mceleavey
17 - Castor

@taxhead ,

 

can you provide the data rather than an image?

 

M.




taxhead
6 - Meteoroid

I have attached a sample of the data.
