on 11-03-2016 01:39 PM - edited on 07-27-2021 11:35 PM by APIUserOpsDM
Can I read in an Excel file located ina zipped archive file from Amazon S3?
Unfortunately, this is not an option withinthe Amazon S3 Download Tool, as it only allows you to choose between CSV, DBF and YXDB files. However, this is possible within Alteryx with the use of a simple workflow utilizing a three line batch file,the Run Command Tool (master it here), and the AWS Command Line Interface (CLI).
In order to use the CLI, you must first download it and configure its settings. Please visit this page for information on how to do that. Once that is setup, you simply need to setup the batch file and configure the Run Command Tool.
Step 1
In the first step, you will use a Text Input Tool to write the batch file code. This code will use the CLI to copy the ZIP file from the S3 bucket to a locally accessibledrive. Configure the Text Input Tool as follows:
Important:
Step 2
In the second step, you will use the Run Command Tool to do the following:
Important:
When entering in the "Read Results" section, your ZIP file will not exist at this point so you cannot simply navigate to and select the file. So, you have two options:
C:\Users\dchapman\ExcelTest.zip|ExcelZIP.xlsx[sheet1]
This same workflow can be used to read other archived files as well. However, you will have to make slight adjustments to the "Read Results" section of the Run Command tool. For example, if reading in a CSV file, you would simply include the archived file name. Since a CSV file does not have "sheets", thebracketed sheet name is not needed.
I plan to create a simple macro with a user interfacethat will do the same thing. Once complete, I will post it in the reply section.
Thanks for reading!
It would be good to offer this functionality natively in the product as most users will have zipped files etc that they wish to pull from s3
Hi @adrianloong,
Thanks for your comment. I agree 100%. If you wouldn't mind, you could always add that suggestion to the Product Ideas section of the Community. If your idea gets a lot of attention, then it will get the attention of our developers.
Thanks!
Hello,
It would be interesting to increase the range of file types that the connector can read, for example the Excel file itself.
I am trying to use the workaround explained in the post. I am getting the following error about .bat not being valid format.
Error: Run Command (5): The file "C:\Users\XXXX\Desktop\alteryx_cp_gz_file.bat" is not a known format. (2)
Is it possible that my my admins have disabled the use of .bat files?
Thank you so much for this! Using your post and the post to which you reference, I'm attempting something similar: downloading many individually zipped .csv files from a bucket, each with the file format " *.csv.gz "
I have a few questions, and would be very grateful if someone knows the answer to any of them as I'm getting nowhere slowly :)
Many thanks in advance again for any help at all.
Hi @TeePee!
Thanks for checking out my post! Please see my answers below. Thanks!
I have a few questions, and would be very grateful if someone knows the answer to any of them as I'm getting nowhere slowly
Many thanks in advance again for any help at all.
You're welcome! Hope this helps!
Thank you so much @DanC for the speedy reply. I will attempt to download a single .csv.gz and then try batching! I will let you know if when I get there.
Awesome, @TeePee. Good luck with the rest!
Below is what I get when I run the workflow
@Danc Below is what I get when I run the workflow