Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Cleanup file names

Nidhin
8 - Asteroid

Hi

I am trying to clean up the file name before data comparison.
The file names samples are as below

What I am trying to do is to remove all the numbers and special characters and month except if the month and number does not have anything between it as in MAR007.
I tried to do this by RegEx Replace().But I am not able to get the expected output for the last item.

And i also want to remove underscore only if there is more than one (FileName__01) or if it the last character in File Name (FileName__)Is there a way to do this using regex replace. ?

 

FileNameExpected CleanNameActualOutput
FileName_MAR_20_09_18FileNameFileName
FileName_MAR_20 09 19FileNameFileName
FileName_MAR-20-09-20FileNameFileName
FileName_MAR-200921FileNameFileName
FileName_MAR007_20_09_21FileName_MAR007FileName

 

Thanks

Nidhin

3 REPLIES 3
DavidP
17 - Castor
17 - Castor

I managed to get your expected result with the following regex formula, but I'm not quite sure what you want with the double underscores - can you please post an example.

 

REGEX_Replace(REGEX_Replace([FileName], '([\_\-][A-Z]{3}[\_\-]\d+.*)', ''),'([\_\-]\d+.*)','')

 

regex formula.png

Nidhin
8 - Asteroid

Thanks a lot @DavidP .

 

This is exactly what i was looking for.

I was doing this step by step.So when the numerical are removed i will end up with multiple _.So i was looking for a way to remove _ only if there are more than one.Since i wanted to retain the ones between the FileName.(EgFileName_Mar i want to keep it but FileName__ i want to remove)

But i guess the one you provided is doing all this in a single formula,

 

Nidhin

megha1311
6 - Meteoroid

You may try this expression as well :). This uses a formula tool.

 

REGEX_Replace([FileName],"(_\w\w\w(_|-|\s)\d+(_|-|\s)\d+(_|-|\s)\d+)|(_\w\w\w(_|-|\s)\d+)|((_|-|\s)\d+(_|-|\s)\d+(_|-|\s)\d+)","")

Labels