This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
Is your input source an xlsx file? .xlsx files are actually a collection of different files which Excel puts together upon opening. In contrast, tools such as python, R and Alteryx will just fetch the data layer from the file.
My understanding is that when you input an .xlsx file into Alteryx it removes any formatting contained in the doc, such as conditional formatting, hyperlinks, text size and so on, as it is a visual layer.
However, according to the R script below (which performs this task), if we take our xlsx file and convert the extension to .zip, we can then extract the hyperlinks from the xml file for the sheet that exists within that.
Alteryx allows you to import .zip files and then you can select the sheet element, as per the script below, before performing some parsing which should pull out the element you are looking for.
The reason this is much easier in VBA is because its developed by microsoft of course and they know how there tool works and have built commands to deal with this.
# rename file to .zip
my.zip.file <- sub("xlsx", "zip", my.excel.file)
file.copy(from = my.excel.file, to = my.zip.file)
# unzip the file
# unzipping produces a bunch of files which we can read using the XML package
# assume sheet1 has our data
xml <- xmlParse("xl/worksheets/sheet1.xml")
# finally grab the hyperlinks
hyperlinks <- xpathApply(xml, "//x:hyperlink/@display", namespaces="x")