Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
Alteryx is here to help you solve your biggest data challenges. Read about the new Virtual Solution Center here.

Extracting hyperlink from multiple excel files in a folder

Highlighted
5 - Atom

I am looking for a solution where I can extract hyperlinks from multiple excel files which are in 1 folder. So that I can get the list of the file and the list of hyperlinks in that file.

Thank you

Highlighted
11 - Bolide

Hi @RRK ,

I think the Directory tool is what you're looking for - you just select a folder and it will output the file names and paths, as well as several other data points, for every file in that folder. Does that get you what you need?

Highlighted
5 - Atom

Hi Kayers, 

Thank you for your reply. As you mentioned Directory tool will give me the file name but that is not only thing which I need.

Actually we have many excel files and inside each excel files there are text with hyperlinks. So I want to have a list of files in one column and hyperlinks in another column. 

 

 

Highlighted
Alteryx
Alteryx

Do the files all have the same layout? The process will need to read each file in and have the option checked to bring in file name as a field, and then filter down to rows with hyperlinks. You may be able to use a dynamic input to stack all of the files if they have the same format. Otherwise, you will need to use a macro in order to get around the schema differences. 

Highlighted
5 - Atom

Thank you Brandon for your reply. Yes all my files have same format. I am beginner to Alteryx and I am not sure how to do that. if you can help  then it will be great.

 

Highlighted
Alteryx
Alteryx

I would take a look at this link first for stacking your data:

 

https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/The-Ultimate-Input-Data-Flowchart/t...

 

After successfully stacking the data, it will just be a matter of filtering out the records that you need.

Highlighted
Alteryx Partner

Hey there @RRK,

 

yesterday I already wanted to answer your question because a nice solution popped to my mind, but sadly my license expired exactly at that moment, such that I had to wait for my partner manager to assign me a fresh one.

 

Now, if you're dealing with xlsx-files, we can make use of the fact that they are an "open" format, I checked the structure which is just zipped xml files, and quickly spotted the place where the hyperlinks are referenced.

 

Therefore, this approach will fetch you all links that are clickable - even if they appear as text in the xlsx-file.

Filtering out inter-document links is left for you to do.

 

The nested batch-macro becomes actually quite simple, with some regex-fu:

AlteryxGui_2020-02-09_00-19-51.png

 

AlteryxGui_2020-02-09_00-39-58.png

 

 

The macros:

Spoiler
AlteryxGui_2020-02-09_00-12-10.png

 

AlteryxGui_2020-02-09_00-38-54.png

 

If this is of any use for you, let me know!

 

Best,

 

Johannes

(Blue Reply)

Highlighted
5 - Atom

Thank you for your reply. 

 

My files are all in .xlsm format and links are attached to the text. if I change in extension in directory still it is not working.

Is there any other solution ?

 

Thank you

Alteryx Partner

Hello @RRK ,

 

I just tried to use my macro on the same files, saved with an ".xlsm"-Extension, I only had to change the filepattern in the directory tool, but apart from that, all hyperlinks were detected. See the output below.

 

Do you want to provide a minimal example file, such that I can check whether there are any complications?

 

Again: This approach will not work on Excel 97-2003 .xls-fileformats.

 

I hope this helped.

 

reply_mueller_0-1581622218484.png

 

Best,

 

Johannes

(Blue Reply)

Labels