

Extract web site from a string of Characters

lbolin
8 - Asteroid

I am trying to extract the website from strings of characters like this one. I just need the www.uncletimsfoodtruck.com part.

 

.  /biz_redir?url=http%3A%2F%2Fwww.uncletimsfoodtruck.com&cachebuster=1661896748&website_link_type=website&src_bizid=_3tyFt-G2ppSVIcRS86PCA&s=39567495988d1cd59560a14cf9607bb632eb94197fc1968b9d85565929d2880a

act 

 

I have several like this and they are not all the same. 

 

Is there a way to extract everything starting at www. and ending at .com?

IraWatt
17 - Castor

Hey @lbolin,

One way of doing this is with regex which is used for text matching.

IraWatt_0-1662652674689.png

(www\..*\.com)

This regex matches www, then a literal dot (\.), then any number of characters (.*), and the match has to end in .com (\.com).
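For anyone who wants to try the pattern outside Alteryx, here is a minimal Python sketch of the same idea. It is only an illustration, not the workflow from the screenshot: the variable names are made up, and the sample string is a shortened copy of the value from the question.

```python
import re

# Sample value shortened from the question above (assumption: the rest of
# the query string does not matter for this illustration).
text = (
    "/biz_redir?url=http%3A%2F%2Fwww.uncletimsfoodtruck.com"
    "&cachebuster=1661896748&website_link_type=website"
    "&src_bizid=_3tyFt-G2ppSVIcRS86PCA"
)

# Same pattern as in the reply: "www.", any characters, then ".com".
pattern = re.compile(r"(www\..*\.com)")

match = pattern.search(text)
site = match.group(1) if match else None
print(site)
```

Note that .* is greedy, so if a string contained .com more than once the match would run to the last one. Using .*? instead would stop at the first .com, which may be closer to what is wanted on messier data.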

 

If you want to learn more about regex, the community has some quick interactive lessons on getting to grips with it here: https://community.alteryx.com/t5/Interactive-Lessons/tkb-p/interactive-lessons/label-name/Parsing%20...

 

Any questions or issues please ask

Ira Watt
Technical Consultant
Watt@Bulien.com 

 

 

 

Emmanuel_G
13 - Pulsar

Hi @lbolin ,

 

Find attached one way of doing this with the RegEx tool.

 

Emmanuel_G_0-1662652903855.png

 

lbolin
8 - Asteroid

It worked, but now I am finding that a lot of my URLs do not include www., for example https://yelp.com/biz/firehouse-subs

IraWatt
17 - Castor

@lbolin can you give more example data so that we can see how https://yelp.com/biz/firehouse-subs is placed in the text?

Emmanuel_G
13 - Pulsar

@lbolin Can you please provide a sample of every format you have?

 

This will allow us to build a regex pattern that can take everything into account.

lbolin
8 - Asteroid
Types of data
https://www.yelp.com/biz/firehouse-subs-mobile-4
https://yelp.com/biz/als-hotdogs-and-other-fine-foods-mobile-2?osq=Restaurants

.  /biz_redir?url=http%3A%2F%2Fwww.uncletimsfoodtruck.com&cachebuster=1661896748&website_link_type=website&src_bizid=_3tyFt-G2ppSVIcRS86PCA&s=39567495988d1cd59560a14cf9607bb632eb94197fc1968b9d85565929d2880a

act 

 

It would be ideal if there were a way to ignore the others and only touch the third one.
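One possible way to express "only touch the third one", sketched in Python rather than as an Alteryx workflow. It assumes the redirect rows can be recognised by the /biz_redir? marker and that the real site sits in the url= parameter, as in the sample above. The function name extract_redirect_site is hypothetical, and this is not what the attached workflows do.

```python
from urllib.parse import parse_qs, urlsplit

# Assumption: redirect rows always contain this marker before the query string.
MARKER = "/biz_redir?"


def extract_redirect_site(value: str) -> str:
    """Return the host from a biz_redir link's url= parameter.

    Values that are not redirect links are returned unchanged.
    """
    start = value.find(MARKER)
    if start == -1:
        return value  # plain URL: leave it untouched

    query = value[start + len(MARKER):]
    params = parse_qs(query)  # parse_qs also percent-decodes the values
    targets = params.get("url")
    if not targets:
        return value  # redirect link without a url= parameter

    host = urlsplit(targets[0]).hostname
    return host if host else value


samples = [
    "https://www.yelp.com/biz/firehouse-subs-mobile-4",
    "https://yelp.com/biz/als-hotdogs-and-other-fine-foods-mobile-2?osq=Restaurants",
    ".  /biz_redir?url=http%3A%2F%2Fwww.uncletimsfoodtruck.com"
    "&cachebuster=1661896748&website_link_type=website",
]

results = [extract_redirect_site(s) for s in samples]
for original, result in zip(samples, results):
    print(result)
```

Decoding the url= parameter instead of matching www...com means the site is still found when it has no www. prefix or does not end in .com.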

dougperez
12 - Quasar

Hello @lbolin 

 

See the workflow attached

Alteryx_AR
12 - Quasar

@lbolin Please see the attached workflow

ayzal00
5 - Atom

Good discussion. I also need help extracting my website what are the types and features of printers. I hope the discussion works for me also.
