community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx Designer Knowledge Base

Definitive answers from Designer experts.
Upgrade Alteryx Designer in 10 Steps

Debating whether or not to upgrade to the latest version of Alteryx Designer?

LEARN MORE

Automating File Transfer Protocol (FTP) Downloads

Community Data Engineer
Community Data Engineer
Created on

So we’re now downloading all the network-shared documents we want thanks to instructions posted on our Knowledge Base, and we’re on our way to mastering FTP in Alteryx. But what if we want to take it a step further? A lot of our users rely on FTP as a drop zone for datasets that are generated periodically (e.g. weekly, monthly, or quarterly datasets). We should then be able to schedule a workflow to coincide with those updates, automatically select the most recent dataset, crank out all the sweet data blending and analytics we have in our scheduled workflow, and proceed with the rest of our lives, right? Right. We can do just that, and with a little work up front, you can automate your FTP download and analysis to run while you’re enjoying the finer things in life. Here’s how in v10.1:


Say the file you want to download is ftp://ftp.url.here/FTP Example File.xlsx :


1.png


Each time period, this file is listed at the URL ftp://ftp.url.here/ :


2.png


If we use this URL in our Download Tool and specify in the Basic tab to output to a string field (a), otherwise keeping our configuration the same, most servers will return all the files listed at that URL (b).


(a) Basic Tab Configuration


3.png


(b) Response from Download Tool


4.png


While the most recent version of the file will usually be recognized by the latest date and/or time in the first two fields (we’ll parse those out, too), we’re going to go ahead and use the date stamps in the file names since that’s more difficult and we’re hardcore. Depending on the formats here the approach will vary, but we’ll use regex in this case. First, let’s split the single response cell we have into multiple rows using the Text To Columns Tool:


texttocolumns.png

 

The file list will now look like the following:


6.png


Now let’s unleash the power of regex:


regex1.PNG


After the parsing above, the file list will look like the below:


8.png

 

Just apply one more regular expression to get the date stamps out of the file names:

 

regex2.PNG


Which looks like:


10.png


Then we’ll throw in a little logic to use either a posted date or a file name date stamp (both converted to date types for sorting later), if provided:


formula1.PNG


Now that we’ve isolated the dates we’ll leverage to determine the file to open, we can simply sort descending by the Date field, sample the first record, and create a new URL for our FTP request using the file name corresponding to our date of choice:


formula2.png


Once that’s done, all we need to do is follow the FTP instructions we had before and schedule the workflow to run just after the files are dropped!

Comments
Fireball

Awesome, thanks for sharing this is exactly what I was looking for now if I can just get the external vendors SFTP and our firewall and the Altyerx software all working together I'll be in business!

Meteor

Hi,

i was following the process (more or less) and it works, but what i really want is to create like a macro that updates the input parameters (host, user, pass, folderPath) of an FTP transfer connector, given before in different columns, and outputs the list of files in those ftps. The idea is to create like a mapping for the input parameters, where the user, later, on the macro itself, can choose from which columns the parameters are fed.
This works when i'm passing the URL (host+folderPath) before the macro and updating only the user and password, but not working when taking host+folderPath dinamically. 
Some help would be very much appreciated, as i'm unable to attach a workflow, so i post the macro image.
All the control parameters actions are updating the expression value only (default).
The formula tool is to create the URL string with the updated inputs, (host+folderPath) before inserting into the FTP connector, but i guess is this part which is not working as a batch macro.

auto_ftp_macro_.jpg

 

 

Asteroid

Hi @MattD

 

This concept looks cool. I have a use case that is little different, would like to know some benchmarks or best practices that could help. 

 

1. What if my input files vary 20 - 40 Gigs in size which are at different share drive aka networks (not hosted FTP locations) ?  

2. The process is slow to read such huge file which cannot be broken down. 

3. This usually happens when I pick files from a distant server. (overseas - physically servers are at a different country within organization limits)

4. What would be an alternate solution to handle such files as we read such universe files frequently which gets updated? So moving every time to local is again slow.   

 

So ,

1. Does download tool save time while doing such operations? 

2. Do they allow such huge files that resides in an FTP sever to be read fast? 

3. Do you suggest to move such files from shared network to FTP so this can ease the process? 

 

eagerly waiting for some answers and advice.

 

Thanks in advance. 

 

Asteroid

hi Guys,

 

Thank you for the information. Its really useful. I have another situation where i download file from FTP locations, and i have sorted the list of files based on date. Now i want to upload the sorted files on the shared drive. How should I achieve it? I tried using download tool, but invain. Any quick help?

 

Thanks

Harsh

Hi,

 

I am having an issue with the download tool. I have followed the instructions, and even tested the URL to ensure that it works, but when the tool goes to actually download the files from the FTP I get the following:

 

Any idea how to fix this issue?

 

FTP 302 Error.jpg

Meteor

Hi ,

In the text input I inserted a second column where I indicated the path to save the data; after which I entered the "filename from a field" in the basic third banner window by selecting the column with the specified path. But when I send the workflow to run, I get the following error message:
  "Error: Download (1): Error Opening file: C: \ Users \ Documents \ download: Access denied. (5)".

I modified the path and inserted the name of the files that should download but the zip file that is generated is empty, and then reading the result in the browser tool known this:

"DownloadHeaders"

"HTTP / 1.1 401 Unauthorized"


Thank you in advance for availability

Meteor

how do you get the FTP download to show the full file upload date? Mine never shows the year only the month, day, and time. I have tried with multiple FTP servers, and all of them show the year in the filezilla but no year in alteryx.

Meteor

@MattD any thoughts on my comment above? In your screenshot, it shows your date as "02-08-16". Notice in my screenshot below it shows the date like so:

 

Capture.PNG

 

How did you get your date to show the year in the download tool? I have tried multiple FTP servers for clients (ours and theirs) and they all appear like this in alteryx.

Alteryx Partner

Hi @MattD , thanks for your answers, it helps a lot. Then I got a question, once I get the full path like this: 

sftp://sftpxxxx.com/xxx/123xxxxabcdefg.XML, I got a file looked like this, all the data were crowded into one column names "DownloadData",  then how can I download this file to the alteryx memory? or how can I download this xml file wtih the same format?

Thanks.

 

Capture.PNG

Meteor

If you want it in alteryx memory, you need to parse the data somehow. Typical tools to do this include tokenize, text to column, etc.

If you prefer to output the file and then use it in a different workflow you can output using this configuration in the download tool:

Capture.PNG

Alteryx Partner

Great, thanks a lot!