Alteryx Designer Desktop Discussions

cmiller9115 · ‎09-05-2019

I'm newer to Alteryx, so any responses are appreciated. I'm trying to parse an xlsx file from a webpage and am having issues getting to the actual file when using the Download tool. Below is the link to the webpage, along with a screenshot of where the xlsx file is embedded. I have also attached an example workflow. Any suggestions on how to use the Download tool, or other means within Alteryx, to download the xlsx file and get the data into a usable format?

https://www.fedex.com/en-us/service-alerts.html

Embedded XLSX file from within HTML

FedEx_Service_Update_2019_09_04_PM_Hurricane_Dorian_vf_955560470.xlsx

SamDesk · ‎09-05-2019

Hello @cmiller9115,

Firstly, your input URL was missing an "h" from "https".

Secondly, choosing to download the file to your Filename field means you can then call this same field in your Dynamic Input tool. You will, however, have to append a sheet name to your filename so Alteryx knows which sheet of the spreadsheet to load, e.g.:

[Filename]+"|||FedEx Custom Critical$"

Sam 🙂

geraldo · ‎09-05-2019

Hi,

Below is your workflow, reconfigured to download the file in a simpler way.

cmiller9115 · ‎09-05-2019

Thank you both for your responses; they worked great.

cmiller9115 · ‎09-06-2019

Any idea on the best way to handle the file name changing from day to day, to ensure I'm pulling in the latest updated data from the webpage? For example, yesterday's file naming convention looks like https://www.fedex.com/content/dam/fedex/us-united-states/Service-Alerts/images/2020/Q2/FedEx_Service... compared with the original from the post.

https://www.fedex.com/content/dam/fedex/us-united-states/Service-Alerts/images/2020/Q2/FedEx_Service...

mceleavey · ‎09-06-2019

Hi @cmiller9115 ,

Rather than hard-coding the URL, if the actual URL is going to be dynamic, you can scrape the raw HTML from the webpage that contains the link, then parse out the URL using the RegEx tool. The parsed value then becomes the URL you feed into your Download tool, so the workflow determines the current URL each time it runs.
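To illustrate the scrape-then-parse idea outside of Alteryx, here is a minimal Python sketch. It assumes the spreadsheet link appears in the page HTML as an `href` attribute ending in `.xlsx`, which I have not verified against the live page. The function names `find_xlsx_urls` and `fetch_html` and the `User-Agent` header are my own choices for illustration, not anything from the workflow.

```python
import re
from urllib.parse import urljoin
from urllib.request import Request, urlopen

PAGE_URL = "https://www.fedex.com/en-us/service-alerts.html"

# Assumption: the file is linked as href="...xlsx" (single or double quotes).
XLSX_HREF = re.compile(r'href=["\']([^"\']+?\.xlsx)["\']', re.IGNORECASE)


def find_xlsx_urls(html, base_url=PAGE_URL):
    """Return absolute URLs for every .xlsx link in the HTML, in page order."""
    # urljoin turns site-relative paths (e.g. /content/...) into full URLs
    # and leaves already-absolute URLs unchanged.
    return [urljoin(base_url, href) for href in XLSX_HREF.findall(html)]


def fetch_html(url=PAGE_URL):
    """Download the raw HTML of the page as text."""
    # Assumption: a browser-like User-Agent may be needed; some sites
    # reject requests that do not send one.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `find_xlsx_urls(fetch_html())` would return whatever `.xlsx` links are on the page at that moment, and the first entry could then be passed on for download. A similar pattern could presumably be adapted for the RegEx tool, although the exact expression will depend on how the link is written in the page source.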

M.
