Alteryx Designer Desktop Discussions

SOLVED

How to extract SRC Value from Image Tag

tjamal1
8 - Asteroid

Hello 

 

Is there any way to extract the src attribute value from an image tag using Alteryx Designer?

Example 1

 <img src="https://mydomain.com/bdecourse/module1/traffic-control-devices/directional.jpg" width="100%" alt="Directional guidance; indicates movement is permitted" title="Directional guidance; indicates movement is permitted" />

 

Output 

https://mydomain.com/bdecourse/module1/traffic-control-devices/directional.jpg

 

Thanks

 

Elias_Nordlinder
11 - Bolide

Hello @tjamal1,

 

The best way to extract a specific string here is with the RegEx tool.

You can specify what you want to extract dynamically with different expressions.

 

I created two different ways of extracting the image URL here, depending on the use case:

 

1. Extract everything from https through jpg:

 

 

(https.*jpg|https.*jpeg|https.*png)

 

-> https matches the literal text https

-> .* matches zero or more of any character (it is greedy, so it matches as much as it can)

-> jpg matches the literal text jpg

-> () marks the capture group, i.e. the part of the match we want to keep

-> I used $1 as the replacement text, which keeps only what was captured inside the parentheses

-> I unchecked "Copy unmatched text to output" so that nothing but the match is kept.

 

-> Edit: I added the jpeg and png format as well, by changing the Regex to

(https.*jpg|https.*jpeg|https.*png)

-> The | sign is an OR-sign and means that it can be one of the three different formats above
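
If you want to check the expression outside Designer, here is a minimal Python sketch of the same idea. It assumes Python's re module treats this simple pattern the same way the RegEx tool does; the function name extract_src is only illustrative and is not part of the workflow.

```python
import re

# Same expression as above: capture from "https" through an image extension.
PATTERN = re.compile(r'(https.*jpg|https.*jpeg|https.*png)')

def extract_src(tag):
    """Return the captured URL, or None if the tag has no match (hypothetical helper)."""
    match = PATTERN.search(tag)
    return match.group(1) if match else None

# The example tag from the question.
tag = ('<img src="https://mydomain.com/bdecourse/module1/traffic-control-devices/directional.jpg" '
       'width="100%" alt="Directional guidance; indicates movement is permitted" '
       'title="Directional guidance; indicates movement is permitted" />')

print(extract_src(tag))
```

Because .* is greedy, the match runs to the last jpg, jpeg or png in the field, so this works best when each record holds a single image tag.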

 


 

 

2. Extract everything between src=" and " width:

 

src="(.*)"\swidth

 

I created a second use case as well for another way to parse this with RegEx

 

-> src=": Search for the literal text src=" (this always comes right before the image URL)

-> (.*): Capture everything after that, up to the next part of the expression

-> "\swidth: Stop the group where a closing quote, a whitespace character and the word width appear
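
The second expression can be tried the same way. This is only a sketch in Python, assuming the re module behaves like the RegEx tool for this pattern; extract_src_before_width and the second sample tag are made up for illustration.

```python
import re

# Same expression as above: capture what sits between src=" and " width.
WIDTH_PATTERN = re.compile(r'src="(.*)"\swidth')

def extract_src_before_width(tag):
    """Return the src value, or None when the tag has no width attribute (hypothetical helper)."""
    match = WIDTH_PATTERN.search(tag)
    return match.group(1) if match else None

# The example tag from the question.
with_width = ('<img src="https://mydomain.com/bdecourse/module1/traffic-control-devices/directional.jpg" '
              'width="100%" alt="Directional guidance; indicates movement is permitted" />')

# A made-up tag without a width attribute.
without_width = '<img src="https://mydomain.com/example.png" alt="Example" />'

print(extract_src_before_width(with_width))
print(extract_src_before_width(without_width))
```

Note that this version depends on width following src directly, so a tag without that attribute gives no match.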

 


 

 

 

 


 

Let me know if this helps or if you have any questions:

 

//Regards
Elias

 

tjamal1
8 - Asteroid

Thank you @Elias_Nordlinder that helped,

Some of my images are in PNG format and some are JPEG. How can I cover all of them?

Also, the Width attribute is not in all images.

Elias_Nordlinder
11 - Bolide

Hello @tjamal1 ,

 

I have changed the Expression to:

(https.*jpg|https.*jpeg|https.*png)


That should also match images in jpeg or png format.

You can add more formats in the same way if you have different image formats later:

Example below:

(https.*jpg|https.*jpeg|https.*png|https.*newformat)
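
As a rough illustration of how the list of formats can grow, here is a small Python sketch that builds the same kind of expression from a list of extensions. The helper build_pattern, the gif extension and the sample tag are assumptions for the example only, and it assumes Python's re module matches this pattern like the RegEx tool does.

```python
import re

def build_pattern(extensions):
    """Build (https.*ext1|https.*ext2|...) from a list of extensions (hypothetical helper)."""
    alternatives = "|".join("https.*" + re.escape(ext) for ext in extensions)
    return re.compile("(" + alternatives + ")")

# gif stands in for "newformat" here.
pattern = build_pattern(["jpg", "jpeg", "png", "gif"])

# A made-up tag that has no width attribute.
tag = '<img src="https://mydomain.com/icons/stop.gif" alt="Stop" />'

match = pattern.search(tag)
print(pattern.pattern)
print(match.group(1) if match else None)
```

This expression does not look at the width attribute at all, so it also covers the images that are missing it.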

 

Let me know if that solved the problem 🙂
//Elias

tjamal1
8 - Asteroid

That worked perfectly for me.

Thank you 🙂

 

Have a good day!

Elias_Nordlinder
11 - Bolide

Great!

I am happy to help 🙂

Thank you, have a great day as well!
