I am trying to extract the website from strings like this one. I just need the www.uncletimsfoodtruck.com
. /biz_redir?url=http%3A%2F%2Fwww.uncletimsfoodtruck.com&cachebuster=1661896748&website_link_type=website&src_bizid=_3tyFt-G2ppSVIcRS86PCA&s=39567495988d1cd59560a14cf9607bb632eb94197fc1968b9d85565929d2880a act
I have several like this and they are not all the same.
Is there a way to extract everything starting at www. and ending at .com?
Hey @lbolin,
One way of doing this is with regex, which is used for text matching.
(www\..*\.com)
This regex matches www, then a literal dot (\.), then any number of characters (.*), and the match has to end in .com (\.com).
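For anyone who wants to try the pattern outside Alteryx, here is a minimal sketch in Python's re module. The helper name extract_www_com is made up for illustration, and the sample string is the one from the question. Note that .* is greedy, so if a string contained a second ".com" further along, the match would run up to the last one.

```python
import re

# Same pattern as above: "www", a literal dot, any characters, then ".com".
PATTERN = re.compile(r"(www\..*\.com)")

sample = (
    ". /biz_redir?url=http%3A%2F%2Fwww.uncletimsfoodtruck.com"
    "&cachebuster=1661896748&website_link_type=website"
    "&src_bizid=_3tyFt-G2ppSVIcRS86PCA"
    "&s=39567495988d1cd59560a14cf9607bb632eb94197fc1968b9d85565929d2880a act"
)

def extract_www_com(text):
    """Return the first www...com match, or None if there is none.

    Hypothetical helper for illustration only.
    """
    match = PATTERN.search(text)
    return match.group(1) if match else None

print(extract_www_com(sample))
```

A string with no "www." in it, such as https://yelp.com/biz/firehouse-subs, gives None with this pattern.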
If you want to learn more about regex, the community has some really quick interactive videos for getting to grips with it here: https://community.alteryx.com/t5/Interactive-Lessons/tkb-p/interactive-lessons/label-name/Parsing%20...
Any questions or issues, please ask.
Ira Watt
Technical Consultant
Watt@Bulien.com
It worked, but now I am finding that a lot of my URLs do not include www., for example https://yelp.com/biz/firehouse-subs
@lbolin can you give more example data so that we can see how https://yelp.com/biz/firehouse-subs is placed in the text?
@lbolin Can you please provide a sample of every format you have?
This will allow us to build a regex pattern that can take everything into account.
Types of data:
https://www.yelp.com/biz/firehouse-subs-mobile-4
https://yelp.com/biz/als-hotdogs-and-other-fine-foods-mobile-2?osq=Restaurants
. /biz_redir?url=http%3A%2F%2Fwww.uncletimsfoodtruck.com&cachebuster=1661896748&website_link_type=website&src_bizid=_3tyFt-G2ppSVIcRS86PCA&s=39567495988d1cd59560a14cf9607bb632eb94197fc1968b9d85565929d2880a act
It would be ideal if there is a way to ignore the others and only touch the third one.
@lbolin Please see the attached workflow
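The attached workflow is not reproduced in this thread. As a separate illustration (not the workflow's method), here is one possible way to touch only the third format, sketched in Python. It assumes the target site always sits in the url= parameter of a /biz_redir link, as in the sample above; the function name redirect_site is made up for this sketch.

```python
import re
from urllib.parse import unquote, urlsplit

# Assumption: only redirect rows contain "/biz_redir?url=<encoded URL>".
REDIRECT = re.compile(r"/biz_redir\?url=([^&\s]+)")

def redirect_site(text):
    """Return the host of the redirect target, or None for other rows.

    Hypothetical helper for illustration only.
    """
    match = REDIRECT.search(text)
    if match is None:
        return None  # plain yelp.com links are left alone
    target = unquote(match.group(1))  # decodes %3A%2F%2F to ://
    return urlsplit(target).hostname

rows = [
    "https://www.yelp.com/biz/firehouse-subs-mobile-4",
    "https://yelp.com/biz/als-hotdogs-and-other-fine-foods-mobile-2?osq=Restaurants",
    ". /biz_redir?url=http%3A%2F%2Fwww.uncletimsfoodtruck.com"
    "&cachebuster=1661896748&website_link_type=website act",
]

for row in rows:
    print(redirect_site(row))
```

Because the match is keyed on /biz_redir?url= rather than on www. and .com, this sketch should also cope with redirect targets that have no www. prefix or that end in something other than .com.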
Good discussion. I also need help extracting my website what are the types and features of printers. I hope the discussion works for me also.