We are celebrating the 10-year anniversary of the Alteryx Community! Learn more and join in on the fun here.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Extracting Specific Text in a Filename

KamenRider
11 - Bolide

Hi,

 

My sample below are:

 

BPOPHASE_14PL1H1
BPOPHASE_14PL1P2
BPOPHASE_14PL1X

 

Results should be

 

H1

P2

X

 

As you can, we are to extract the text after "_14PL1". I have use the regex Regex_Replace([Field],".*_(.*)","$1") but it is not giving me the desired output

 

Hope to hear any ideas from anyone.

 

Kamen

6 REPLIES 6
apathetichell
20 - Arcturus

Regex_Replace([Field],".*_(.*)","$1") from  BPOPHASE_14PL1H1 should be "14PL1H1" - to extract just the last part you'd use

 

regex_replace([Field],".*_14PL1(.*)$","$1") - this would give you the one or two characters after _14PL1.

 

assuming it's only 1 or 2 characters you could swap .* for .{1,2}

KamenRider
11 - Bolide

Hi @apathetichell 

 

Regex_Replace([Field],".*_14PL1(.*)$","$1") is  giving me a solution of the entire filename. It doesn't give me the solution I wanted. Aside from that, it does not only "_14PL1, it can be _15XC2 so it varies. Most important is to get the last one or two characters after a numeric.

 

Please advise.

Kamen

apathetichell
20 - Arcturus

Cool - thanks for the clarification 14PL1 won't work if that specific string isn't in the original field... if you are looking at numeric and then character try:

Regex_Replace([Field],".*\d(.{1,2})$","$1")

 

This would be the last one or two anything following a number.

 

dwstada
11 - Bolide

Regex_Replace([Field],".*?_.{5}(.+)","$1")

 

this formula will give you whatever is coming 5 characters after a "_"

 

extracting specific text in a filename.PNG

 

the problem with your original expression was the first ".*" being greedy and therefore being used for the whole file name and ignoring all the other characters in your expression - making it lazy ".*?" will allow the other parts of your expression to be used as well

 

try using regex101.com to test your expressions and get explanations what the different parts do, helps me a lot :)

Felipe_Ribeir0
16 - Nebula

Hi @KamenRider 

 

One way of doing this:

 

regex_replace([Sample], '(.*\_\d+.*?\d+)(.*)', '$2')

 

Screenshot 2023-08-28 140225.png

KamenRider
11 - Bolide

Thank you guys. All of your ideas work.

 

Kamen

Labels
Top Solution Authors