Hi everyone,
I have a field called "State|City" in a spreadsheet, which I need to cleanse.
I need to make sure that other than the following symbols all other non alpha numberic symbols are removed.
The following are allowed:
" " space
"-" dash
"|" pipe
I think I need to use the regex replace tool but I'm unsure how to write the code.
The characters to be removed do not need replacing with any alternative character.
Many thanks,
Solved! Go to Solution.
Though trial and error I have got the following to work ...
[^"|"|"-"|" "|\w]
Could anyone tell me if that's all I need or are there any risks to using that code?
Try using the regex tool with the following setup,
Thanks,
Philip
Hey @TSP ,
The most efficient way to do this might depend on how flexible / how many potential characters there are that need to be removed. Are you able to post a sample of what a couple records might look like?
Looks good @TSP . You just don't need to include all the quotation marks.
This works :
[^a-zA-Z0-9 \-|]
Note that the '-' needs escaping with the '\' character, but you can add any other characters in the set that you don't want to replace.
Hi @TSP
Your logic is great, to use the RegEx Replace formula, specifying everything that you don't want to replace.
Try this function in a Formula tool. It's just a simplified version of what you've already done.
REGEX_Replace([State|City], "[^\w\|\-\s]", "")
It says replace everything that isn't alpha-numeric, a pipe, a hyphen, or a space, with nothing.
Let me know if that helps.
Cheers!
Esther
Not if you put the dash at the end.
Very true!
Thought it would be helpful to point out the idea of escaping characters though, as you can only have one a the end. 🙂