nav[aria-label="Primary Navigation"] { padding: 0; & ul { list-style: none; width: 100%; display: flex; flex-direction: row; justify-content: start; align-items: start; gap: 30px; padding: 0; & li { margin: 0; } & ul li { list-style: none; } } }

Remove invalid Excel characters using RegEX

BGeregach

How do I remove invalid characters from an Excel filename using RegEx.

I.E. I don't understand how to populate the Regular Expression field.

Thank you.

Brian

Regex

Accepted answers

All comments

jrlindem

Can you show the example that you're dealing with and what specific character's you're trying to remove?

-Jay

alexnajm

Can you give us an example? Not sure I understand the need - why would there be invalid characters?

alexnajm

Same idea @jrlindem ! 😂

jrlindem

Great minds think alike @alexnajm 😁

ChrisTX

If you know the specific characters you want to remove, try the ReplaceChar function.

Example

ReplaceChar("abcdefb", "b", "") returns "acdef".

https://help.alteryx.com/current/en/designer/functions/string-functions.html

Chris

Pilsner

Hello @BGeregach

That's a great question, as it is important to get file names correct. When you say "invalid characters" do you mean characters such as a ":" appearing in the middle of your string, or are you referring to characters that aren't recognised, i.e appear as a blank box or another unusual character?

If you are referring to the first option, a Replace() function can often be more useful than regex in this case, @ChrisTX has provided a good example of its behaviour above.

If by invalid characters you mean unrecognised characters, then you can use the following Regex:

REGEX_Replace([Field], "[^ -~ ]", "")

I learnt this from @binu_acs' reply to the following thread: https://community.alteryx.com/t5/Alteryx-Designer-Desktop-Discussions/Working-with-non-Ascii-character/td-p/1022508

What this function does is it defines what's called a character set. In Alteryx, Regex is based on the ASCII alphabet. Each character in the ASCII alphabet has an ID, so by defining a range (using the "-"), you can tell Alteryx to only accept characters between the space " " and the tilde "~". (see ASCII alphabet below)

Screenshot 2026-02-20 105700.png

The regex expression above includes the "^" symbol, which means NOT in. So the regex looks for any characters NOT in between the " " and "~", then replaces these unknown characters with nothing. @binu_acs included an example of this in his original post linked above.

One final note, if you are trying to figure out a regex expression, I would highly recommend a site like Regex 101. It allows you to see which part of your regular expression, are matching with which parts of your string. Here is an example where I used Regex to remove ":" from the middle of my file path:

Screenshot 2026-02-20 110323.png

Once you have your expression in regex 101, you can copy it into the regex or formula tool in Alteryx. Here is the before and after the regex tool in Alteryx:
Before:

Screenshot 2026-02-20 110506.png

After: (the $1 and $2 refers to the "groups", defined by the sections of the regex expressions found within brackets)

Screenshot 2026-02-20 110529.png

I hope this all makes sense & helps you solve your problem, but please let me know if you have any further questions.

Regards - Pilsner

BGeregach

To all that have replied, I thank you.

I'm hesitant to provide examples with real world data. So I'm going to do my best to explain what I am working on.

I have a master file of data. In this file is a column of Report Names. My process goes through and creates an excel file with multiple tabs for each pivot table the client wants. The name of the output file is the name of the report. The client has used characters like: : / \ < > * " (this is not an exhaustive list).

I was hoping to use one RegEx tool to handle these characters (I'm replacing each one with a space) so that i don't need to create multiple formula's with Replace or Replacechar. I am using Replace or Replacechar, I just wanted to exchange a bunch of those for one RegEx tool.

I hope this helps.

Thank you,

Brian

jrlindem

Ah, so you have the report names in a columnar arrangement in a report. That means you can use a lot of the techniques from others in this thread to clean those field values up, but also... the Cleanse Tool can pull out the special characters from your data as well and then you can pass those values downsteam.

And of course for machine characters or non-traditional symbols, REGEX or Replace() functions would be great to use.

-Jay

Quick Links

Popular Tags

This months top contributors

atcodedog05 19598

Qiu 15908

binu_acs 15783

MarqueeCrew 13710

apathetichell 13703