Alteryx Designer Desktop Discussions

SOLVED

Extract URL's from string

craigja
11 - Bolide

Hi, I have a long string that contains URLs, and I want to use the RegEx tool to extract them into separate columns. I'm trying to cobble together an expression that looks for strings starting with "http and ending with ", where the http can be either lower or upper case. Man, I hate regular expressions!

 

<p>On Feb. 5, <a href="https://www.esma.europa.eu/press-news/esma-news/esma-sets-out-use-uk-data-in-esma-databases-under-no..." target="_blank">ESMA</a> issued <a href="https://www.esma.europa.eu/sites/default/files/library/esma_70-155-7026_use_of_uk_data_in_esma_datab..." target="_blank">statement</a> on UK data in no-deal Brexit.</p>
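For anyone who wants to test the pattern outside Designer first, here is a minimal Python sketch of the idea in the question: match a double quote, then http in any letter case, then everything up to the next double quote. The sample string and its example.com URLs are placeholders I made up to mirror the structure above, not the poster's data.

```python
import re

# Hypothetical sample with the same shape as the string in the question.
html = (
    '<p>On Feb. 5, <a href="https://example.com/news" target="_blank">ESMA</a> '
    'issued <a href="HTTP://example.com/statement" target="_blank">statement</a>.</p>'
)

# A double quote, "http" in any case, then every character up to the next quote.
URL_IN_QUOTES = re.compile(r'"(http[^"]*)"', re.IGNORECASE)

urls = URL_IN_QUOTES.findall(html)
print(urls)  # ['https://example.com/news', 'HTTP://example.com/statement']
```

The same expression should be a reasonable starting point for the RegEx tool with the case-insensitive option ticked, though I have only checked it in Python.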
15 REPLIES
craigja
11 - Bolide

2018.3.5.52487

yalmar_m
11 - Bolide

Try this one!

craigja
11 - Bolide

Hi, that's not really what I need. I want to get one extra column, with the two URLs separated by a comma. Some of the data has up to nine URLs in that one section.
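To illustrate the requirement (one extra column, any number of URLs, comma separated), here is a small Python sketch. The function name urls_as_csv and the sample rows are hypothetical, and it assumes each URL sits inside double quotes and starts with http.

```python
import re

# Quoted value starting with "http" in any letter case.
URL_IN_QUOTES = re.compile(r'"(http[^"]*)"', re.IGNORECASE)

def urls_as_csv(text: str) -> str:
    """Join every quoted http(s) URL found in text with commas."""
    return ",".join(URL_IN_QUOTES.findall(text))

# Hypothetical rows with different numbers of links.
rows = [
    '<a href="https://example.com/a" target="_blank">A</a>',
    '<a href="https://example.com/a">A</a> <a href="https://example.com/b">B</a>',
    'no links here',
]
for row in rows:
    print(repr(urls_as_csv(row)))
```

Because the pattern is applied repeatedly instead of being written once per URL, a row with nine links needs no change to the expression, and a row with none simply yields an empty string.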

yalmar_m
11 - Bolide

If you want to concatenate the URLs, the following workflow might be a solution.

afv2688
16 - Nebula

You can do a regex replace with this expression: href="(.+)" .+ href="(.+)" .+ and then join the results with a Formula tool:

[regex_replace1] + " ," + [regex_replace2]

If there are more URLs, keep adding more href="(.+)" groups to the expression.
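A rough Python sketch of this two-group replace, for comparison. Note one change that is mine and not from the reply: the greedy (.+) is swapped for [^"]+ so a group cannot run past its closing quote. The sample string is hypothetical.

```python
import re

# Hypothetical sample with exactly two links.
html = (
    '<p><a href="https://example.com/a" target="_blank">A</a> and '
    '<a href="https://example.com/b" target="_blank">B</a></p>'
)

# One capture group per expected link; everything else is consumed and discarded.
TWO_LINKS = re.compile(r'.*?href="([^"]+)".*?href="([^"]+)".*', re.DOTALL)

# Replace the whole string with the two captured URLs, comma separated.
combined = TWO_LINKS.sub(r"\1,\2", html)
print(combined)  # https://example.com/a,https://example.com/b
```

The limitation is the one already mentioned: the expression is tied to a fixed number of links, so a string with a third link would need a third group, and a string with only one link would not match at all.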

craigja
11 - Bolide

Will give the above a try in a minute, but just now I have a very dirty solution!

"(.*?)" in the RegEx tool, then a regex_replace to remove "_blank" (replacing it with nothing), then another regex_replace to remove the " characters.

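The quick-and-dirty route above can be sketched in Python like this, assuming the only quoted values in the string are the href URLs and the "_blank" targets. The sample string is hypothetical.

```python
import re

# Hypothetical sample with the same shape as the string in the question.
html = (
    '<p>On Feb. 5, <a href="https://example.com/news" target="_blank">ESMA</a> '
    'issued <a href="https://example.com/statement" target="_blank">statement</a>.</p>'
)

# Step 1: grab every double-quoted value (lazy match, quotes excluded by the group).
quoted = re.findall(r'"(.*?)"', html)

# Step 2: drop the target values, comparing case-insensitively.
urls = [value for value in quoted if value.lower() != "_blank"]
print(urls)  # ['https://example.com/news', 'https://example.com/statement']
```

It works for this shape of data, but any other quoted attribute (a class or title, say) would slip through, which is why matching on the http prefix is usually the safer filter.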