community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
SOLVED

Help with Regex Replace to remove characters

Atom

Hi everyone, 

 

I have a field called "State|City" in a spreadsheet, which I need to cleanse.

 

I need to make sure that other than the following symbols all other non alpha numberic symbols are removed.

 

The following are allowed:

" " space

"-" dash

"|" pipe

 

I think I need to use the regex replace tool but I'm unsure how to write the code.

 

The characters to be removed do not need replacing with any alternative character.

 

Many thanks,

Atom

Though trial and error I have got the following to work ...

[^"|"|"-"|" "|\w]

 

Could anyone tell me if that's all I need or are there any risks to using that code?

 

 

Alteryx Certified Partner

Try using the regex tool with the following setup,

 

Capture.PNG

 

Thanks,

Philip

Alteryx
Alteryx

Hey @TSP , 

 

The most efficient way to do this might depend on how flexible / how many potential characters there are that need to be removed.  Are you able to post a sample of what a couple records might look like?

Alteryx Certified Partner

Looks good @TSP . You just don't need to include all the quotation marks.

Asteroid

This works :

 

[^a-zA-Z0-9 \-|]

 

Note that the '-' needs escaping with the '\' character, but you can add any other characters in the set that you don't want to replace.

Highlighted
Pulsar

Hi @TSP 

 

Your logic is great, to use the RegEx Replace formula, specifying everything that you don't want to replace.

Try this function in a Formula tool. It's just a simplified version of what you've already done.

REGEX_Replace([State|City], "[^\w\|\-\s]", "")

 

It says replace everything that isn't alpha-numeric, a pipe, a hyphen, or a space, with nothing.

 

Let me know if that helps.

 

Cheers!

Esther

Alteryx Certified Partner

Not if you put the dash at the end.

Asteroid

Very true! 

 

Thought it would be helpful to point out the idea of escaping characters though, as you can only have one a the end. 

Labels