Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

REGEX TO EXTRACT DATA INSIDE BRACKETS WHICH PRECEEDS CERTAIN STRING

Shahas
8 - Asteroid

Hi All,

 

I have this situation, where in data needs to be extracted from inside a bracket. There can be multiple brackets as well.

 

For eg : [abc-242] jsydg [xuys-285] jyagd [iayb-248] links to [brrwfej-342]

 

in this case, only data inside the bracket before "links to" needs to be taken. so output should be [iayb-248]

 

but, some data might be [eqb-231] uajyuvdb [fejo-758] kjahb. in this case, the last bracket needs to taken. ie, [fejo-758].

 

i tried with .*\[(.*)\] links to.* but it doesn't give desired the output if "Links to" is not present.

 

I also tried 

(?<=\[)([\w-]+?)\]

 

but it doesn't satisfy the criteria when "Links to" is present.

 

if anyone could provide a solution which satisfy all the condition, it would be really helpful.

 

 

Thanks in advance

14 REPLIES 14
binuacs
20 - Arcturus

@Shahas Can you try the below regex formula?

 

 .*\[(.*)\][ links to ].*

 

 

binuacs_0-1676979295472.png

 

 

Shahas
8 - Asteroid

Hi,

 

Thanks for the reply.

 

Sadly that doesn't seem to work for me..

 

For eg, relates to [uqnb-975] relates to [nmuand-835] relates to relates from [ajnska-984] relates from [aksdnb-464], the output which i'm getting is "[nmuand-835]" when it should have been [aksdnb-464].

 

Thanks

mceleavey
17 - Castor
17 - Castor

Hi @Shahas ,

 

I've taken the approach of working backwards from the end of the string, splitting out the records that have a set preceding the "links" tag. I've used the lowest base string of this to be "link".

 

mceleavey_0-1676980997678.png

Effectively, reversing the string allows me to look for "link" first, this will include any combination of Link, Links, Links to etc.

I've then determined the set for each one and reversed the string back it's original orientation. This gives the following:

mceleavey_1-1676981070495.png

If you have any other connotation of "link" then you can simply amend the reversed string in the regex:

.knil\s[]](.*?)[[]

However, this should work providing "link" is in the string. If not, it will use the second Regex command:

[]](.*?)[[]

 

I hope this helps,

 

M

 

 



Bulien

Christina_H
14 - Magnetar

Try this in a formula tool.  It cuts the string short at "links to" if it's there, then takes the last remaining square brackets.

REGEX_Replace(Left([Field1],FindString([Field1],"links to")),".*\[(.*)\].*","$1")

Christina_H_0-1676982269649.png

 

OllieClarke
15 - Aurora
15 - Aurora

Hey @Shahas 

How about this:

.*(\[.+?\])(?: links to)?

This will pull out the final text in square brackets before " links to" (if it's there)

OllieClarke_0-1676984076217.png

 

Does that help?

 

Ollie

 

Shahas
8 - Asteroid

Heyy @OllieClarke @Christina_H @mceleavey the solution provided by u guys works and thanks alot :). Really appreciated for the help.

 

 

I do have an updated requirement now.. from all the data i have, similar to the one's above, i just need to extract the ID that contains either "Links to" or "Risks to", with preference given to "Risks to" over "Links to" if at all they come in the same line..

 

Thanks in advance

Christina_H
14 - Magnetar

I've updated my previous formula to cut off at "risks to" if it's there, "links to" otherwise, and take the last remaining square brackets.  Is that what you mean?

REGEX_Replace(Left([Field1],iif(Contains([Field1],"risks to"),FindString([Field1],"risks to"),FindString([Field1],"links to"))),".*\[(.*)\].*","$1")

Shahas
8 - Asteroid

@Christina_H My requirement is similar to earlier post...if we have "links to", the square bracket prior to the last "Links to" should be the output.

 

Additionally, if there is "Risks to", the square bracket prior should be the output..

 

If the data doesn't contain "Links to" or "Risks to", output need to be blank

Christina_H
14 - Magnetar

Like this then:

if Contains([Field1],"links to") or Contains([Field1],"risks to") then REGEX_Replace(Left([Field1],iif(Contains([Field1],"risks to"),FindString([Field1],"risks to"),FindString([Field1],"links to"))),".*\[(.*)\].*","$1")
else Null() endif

Christina_H_0-1677068545568.png

 

Labels