So we’re now downloading all the network-shared documents we want thanks to instructions posted on our Knowledge Base, and we’re on our way to masteringFTP in Alteryx. But what if we want to take it a step further? A lot of our users rely on FTP as a drop zone for datasets that are generated periodically (e.g. weekly, monthly, or quarterly datasets). We should then be able to schedule a workflow to coincide with those updates, automatically select the most recent dataset, crank out all the sweet data blending and analytics we have in our scheduled workflow, and proceed with the rest of our lives, right? Right. We can do just that, and with a little work up front, you can automate your FTP download and analysis to run while you’re enjoying the finer things in life. Here’s how in v10.1:
Say the file you want to download is ftp://ftp.url.here/FTP Example File.xlsx :
Each time period, this file is listed at the URL ftp://ftp.url.here/ :
If we use this URL in our Download Tool and specify in the Basic tab to output to a string field (a), otherwise keeping our configuration the same, most servers will return all the files listed at that URL (b).
(a) Basic Tab Configuration
(b) Response from Download Tool
While the most recent version of the file will usually be recognized by the latest date and/or time in the first two fields (we’ll parse those out, too), we’re going to go ahead and use the date stamps in the file names since that’s more difficult and we’re hardcore. Depending on the formats here the approach will vary, but we’ll use regex in this case. First, let’s split the single response cell we have into multiple rows using the Text To Columns Tool:
The file list will now look like the following:
Now let’s unleash the power of regex:
After the parsing above, the file list will look like the below:
Just apply one more regular expression to get the date stamps out of the file names:
Which looks like:
Thenwe’ll throw in a little logic to use either a posted date or a file name date stamp (both converted to date types for sorting later), if provided:
Now that we’ve isolated the dates we’ll leverage to determine the file to open, we can simply sortdescending by the Date field, sample the first record, and create a new URL for our FTP request using the file name corresponding to our date of choice:
Once that’s done, all we need to do is follow the FTP instructions we had before and schedule the workflow to run just after the files are dropped!