We are celebrating the 10-year anniversary of the Alteryx Community! Learn more and join in on the fun here.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Excel hyperlink contains number sign

erinmichelle
5 - Atom

Hello! I am trying to extract URLs that are hyperlinked within excel. Thanks to a couple of resources (here, here, here) I have been able to replicate extracting URLs from excel hyperlinks. HOWEVER, the specific URLs I am trying to extract contain the number sign "#" in the middle of the URL. Thus when reading in Alteryx, the everything starting at the # is left off. Does anyone know of a solve or workaround for this issue? Workflow and input file attached that replicates the issue. Thanks!

4 REPLIES 4
AZuc
Alteryx Alumni (Retired)

@erinmichelle ,

 

Could you provide and URL example?

 

Usually when there is a # in the middle of the URL it refers to a "row" into a larger page. If it is the case, you may ignore what's after # (including #), read the full page and parse. 

 

But I'm not sure this is the case. An example would help.

 

André Zuccatti

Sales Engineer - LATAM -
Alteryx, Inc.


HomesickSurfer
12 - Quasar

Hi @erinmichelle 

 

The URL string after the # does appear.  The # is dropped as it is invalid and irrelevant.

 

Capture.PNG

erinmichelle
5 - Atom

Thanks for the replies!

 

@AZuc Unfortunately I can't provide the exact hyperlinks as it is a proprietary site developed by the company I work for. I know for certain I cannot ignore the # and what comes after because if I try to access the URL that way I receive an error. Sorry I wish there was a better way to troubleshoot this.

 

@HomesickSurfer The URL string after the # does not appear and the # is relevant for my use case. If you look at the hyperlinks in the underlying excel file, one has the # before /test and one does not. I am trying to get Alteryx to return "http://www.fake.com/#/test" and "http://www.fake.com/test"

danilang
19 - Altair
19 - Altair

Hi @erinmichelle 

 

The HyperLink is broken up into 2 pieces in the Excel file.  The Site address(everything before the #) is stored in xl\worksheets\_rels\sheet1.xml.rels. 

danilang_0-1636201787649.png

 

The fragment(everything after the #) is stored in xl\worksheets\sheet1.xml in the location attribute

danilang_1-1636201912492.png

You'll need another Input tool to grab the contents of Sheet1.xml, parse out the id from the r:id attrib and join this with the sheet1.xml.rels data.  

 

Note that your sample URL http://www.fake.com/#/test/ isn't a valid URL.  The # can only occur after all the "/" as in http://www.fake.com/test#Fragment.  Excel can't properly parse this and puts /test in both the rel and the location

 

Dan

 

 

Labels
Top Solution Authors