
SOLVED

How to extract SRC Value from Image Tag

tjamal1
8 - Asteroid

Hello 

 

Is there any way to extract the src attribute value from an image tag using Alteryx Designer?

Example 1

 <img src="https://mydomain.com/bdecourse/module1/traffic-control-devices/directional.jpg" width="100%" alt="Directional guidance; indicates movement is permitted" title="Directional guidance; indicates movement is permitted" />

 

Output 

https://mydomain.com/bdecourse/module1/traffic-control-devices/directional.jpg

 

Thanks

 

5 REPLIES
Elias_Nordlinder
11 - Bolide

Hello @tjamal1,

 

The best way to extract a specific string here is to use the RegEx tool.

You can specify what you want to extract dynamically with different expressions.

 

I created two different ways of extracting the src value from the image tag, depending on the use case:

 

1. Extract everything between https and jpg:

 

 

(https.*jpg|https.*jpeg|https.*png)

 

-> https matches the literal characters https

-> .* matches zero or more of any character (it is greedy, so it runs to the last jpg in the text)

-> jpg matches the literal characters jpg

-> () marks the group that we want to capture

-> I used $1 as the replacement text, which keeps only what the parentheses captured

-> I unchecked "Copy unmatched text to output" so that nothing but the match is kept.

 

-> Edit: I added the jpeg and png format as well, by changing the Regex to

(https.*jpg|https.*jpeg|https.*png)

-> The | sign is an OR-sign and means that it can be one of the three different formats above
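
For anyone who wants to try the expression outside Designer, here is a minimal Python sketch of the same pattern. Python's re module stands in for the RegEx tool (an assumption; the tool's Replace settings and "Copy unmatched text to output" are not modeled), and extract_url is a hypothetical helper name.

```python
import re

# Pattern from the post: greedy match from "https" to the last jpg/jpeg/png.
PATTERN = re.compile(r"(https.*jpg|https.*jpeg|https.*png)")

# The example tag from the original question.
tag = (
    '<img src="https://mydomain.com/bdecourse/module1/traffic-control-devices/directional.jpg" '
    'width="100%" alt="Directional guidance; indicates movement is permitted" '
    'title="Directional guidance; indicates movement is permitted" />'
)


def extract_url(text):
    """Return the text captured by group 1, or None if nothing matches."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(extract_url(tag))
# prints https://mydomain.com/bdecourse/module1/traffic-control-devices/directional.jpg
```

Because .* is greedy, a later attribute that happens to contain jpg or png would extend the match past the URL, so it is worth checking a few real rows before relying on it.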

 


 

 

2. Extract everything between src=" and " width:

 

src="(.*)"\swidth

 

This is a second way to parse the same tag with RegEx:

 

-> src=" searches for the literal text src=" (it appears right before the image URL)

-> (.*) captures everything that follows, up to a specific expression

-> "\swidth ends the group where " width appears in the text
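
A minimal Python sketch of this second pattern, again using Python's re module as a stand-in for the RegEx tool (an assumption). The helper name extract_src and the second tag without a width attribute are hypothetical and only there to show what the pattern depends on.

```python
import re

# Pattern from the post: capture everything between src=" and " width.
PATTERN = re.compile(r'src="(.*)"\swidth')

# The example tag from the original question.
with_width = (
    '<img src="https://mydomain.com/bdecourse/module1/traffic-control-devices/directional.jpg" '
    'width="100%" alt="Directional guidance; indicates movement is permitted" '
    'title="Directional guidance; indicates movement is permitted" />'
)

# Hypothetical variant of the same tag with the width attribute left out.
without_width = (
    '<img src="https://mydomain.com/bdecourse/module1/traffic-control-devices/directional.jpg" '
    'alt="Directional guidance; indicates movement is permitted" />'
)


def extract_src(text):
    """Return the captured src value, or None when the pattern does not match."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(extract_src(with_width))
# prints https://mydomain.com/bdecourse/module1/traffic-control-devices/directional.jpg
print(extract_src(without_width))
# prints None
```

The second call returns None because this expression only matches when width directly follows the src attribute; the first expression does not have that dependency.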

 


 

 

 

 


 

Let me know if this helps or if you have any questions.

 

//Regards
Elias

 

tjamal1
8 - Asteroid

Thank you @Elias_Nordlinder, that helped.

Some of my images are in PNG format and some are JPEG. How can I cover all of them?

Also, not all images have the width attribute.

Elias_Nordlinder
11 - Bolide

Hello @tjamal1 ,

 

I have changed the Expression to:

(https.*jpg|https.*jpeg|https.*png)


That should also match images in jpeg or png format.

You can add more formats in the same way if other image formats come up later.

Example below:

(https.*jpg|https.*jpeg|https.*png|https.*newformat)
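
If the list of formats keeps growing, the expression can also be generated instead of typed by hand. A small Python sketch under stated assumptions: build_pattern is a hypothetical helper, gif is just an example of an extra format, and the resulting string would still need to be pasted into the RegEx tool.

```python
import re


def build_pattern(extensions):
    """Build (https.*ext1|https.*ext2|...) from a list of file extensions."""
    alternatives = "|".join("https.*" + re.escape(ext) for ext in extensions)
    return re.compile("(" + alternatives + ")")


pattern = build_pattern(["jpg", "jpeg", "png", "gif"])
print(pattern.pattern)
# prints (https.*jpg|https.*jpeg|https.*png|https.*gif)

# Hypothetical tag using the extra format.
sample = '<img src="https://mydomain.com/x/sign.gif" alt="a" />'
print(pattern.search(sample).group(1))
# prints https://mydomain.com/x/sign.gif
```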

 

Let me know if that solved the problem 🙂
//Elias

tjamal1
8 - Asteroid

That worked perfectly for me.

Thank you 🙂

 

Have a good day!

Elias_Nordlinder
11 - Bolide

Great!

I am happy to help 🙂

Thank you, have a great day as well!
