Hello, I am trying to build a workflow in Alteryx and need to figure out how to unzip zip files. The zip files are saved automatically to a specific location, and I need one .csv file from each of the prior day's and current day's zip files. My workflow already identifies the files I need, but I am not sure how to get Alteryx to read the .csv file inside each zip file. I could unzip them outside of Alteryx, but I was hoping there is a simple way to unzip them within the flow and pull in only that specific .csv file. I have read about using PowerShell or the Run Command tool, but that is all new to me and is honestly confusing. :) I have attached the start of my workflow.
I just had a quick play and the Dynamic Input tool will work, but only if you fully qualify the name as FullPath|||FileInsideZip, for example C:\myfile.zip|||filetoreference.csv. It's hard to share a working example, as I would need to create a separate zip file and so on. If you do not know the name of the file inside the zip, it becomes a little more difficult but not impossible; it may take a bit of playing around, and I imagine a Batch Macro.
The text input below is specifying the filename inside the zip, but I imagine you will create this in the flow.
I believe the below use similar methods:
You can use the Run Command tool to unzip files. If you know which folder the zips are saved in, you can use a Directory tool to get the full path to each zip file.
In Run Command you can define the location the files are unzipped to. You can make that a dynamic location and then later use that location to find the csv files.
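As a rough sketch of the kind of script a Run Command tool could call for this, here is a small Python function that extracts only the matching csv from each zip in a folder. The function name, the default pattern `*ACCOUNTING*.csv`, and the one-subfolder-per-zip layout are assumptions for illustration, not something Alteryx provides.

```python
import fnmatch
import zipfile
from pathlib import Path


def extract_matching_csv(zip_dir, out_dir, pattern="*ACCOUNTING*.csv"):
    """Extract members matching `pattern` from every .zip in `zip_dir`.

    Hypothetical helper: returns the paths of the extracted files.
    """
    out_dir = Path(out_dir)
    extracted = []
    for zip_path in sorted(Path(zip_dir).glob("*.zip")):
        # One subfolder per archive so same-named members do not collide.
        target = out_dir / zip_path.stem
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.namelist():
                # Compare in upper case so the match is case-insensitive.
                if fnmatch.fnmatchcase(member.upper(), pattern.upper()):
                    extracted.append(Path(archive.extract(member, path=target)))
    return extracted
```

After it runs, a Directory tool pointed at the output folder should be able to pick up the extracted csv files.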
This helps and makes sense. Follow-up question: do you know if I can still use the Dynamic Input if the csv files in the zip files aren't all named the same? They are all named PROD_CW_CLEARWATER_ACCOUNTING_yyyymmdd_1.csv, so could I use a % in the name or something to indicate "use the file that has Accounting in the name", or would I have to put the exact full path? Thank you for the reply as this is helpful.
You can use Run Command or Python to do that for you.
Once the files are out of the zip, you can use the Directory tool to get all the file names and then read them with Dynamic Input.
Something like this -> https://github.com/apathetichell/2024_AlteryxMacros/blob/main/zipfile_extractor.yxmc (.yxmc)
and the core python -> https://github.com/apathetichell/2024_AlteryxMacros/blob/main/zipfile_extractor.py
would help. The short version is that you need Python (or an equivalent) to get the names of the files in your archive. Alteryx can extract a file from an archive when its name is supplied dynamically, but it cannot list all of the files in an archive.
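To illustrate the idea (this is not the linked script, just a minimal sketch using Python's standard `zipfile` module), the snippet below lists the members of an archive and returns a FullPath|||FileInsideZip string for each name that matches a pattern. The function name and the default pattern are assumptions.

```python
import fnmatch
import zipfile


def dynamic_input_paths(zip_path, pattern="*ACCOUNTING*.csv"):
    """Return 'zip_path|||member' strings for members matching `pattern`.

    Hypothetical helper; the match is case-insensitive.
    """
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    return [
        f"{zip_path}|||{name}"
        for name in names
        if fnmatch.fnmatchcase(name.upper(), pattern.upper())
    ]
```

The returned strings have the same shape as the fully qualified name used with Dynamic Input earlier in the thread.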
I'm not sure how wildcards will go, I imagine it would be a '*' rather than a '%', but you'll have to give it a try. I don't think I remember constructing a path in a formula tool with a wildcard, so can't say for sure. Ideally you would be able to construct the full name for the file inside the zip using the dates you construct earlier in the workflow to get the zipfile name.
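For what it's worth, here is a small Python sketch of that string construction. In the workflow itself this would more likely be a Formula tool expression. It assumes the yyyymmdd in the csv name is the same date you already use to pick the zip file, and the zip path shown is purely hypothetical.

```python
from datetime import date, timedelta


def inner_csv_name(day):
    """Build the csv name inside the zip for a given date (assumed naming rule)."""
    return f"PROD_CW_CLEARWATER_ACCOUNTING_{day:%Y%m%d}_1.csv"


def dynamic_input_path(zip_path, day):
    """Join the zip path and inner file name with the ||| separator."""
    return f"{zip_path}|||{inner_csv_name(day)}"


if __name__ == "__main__":
    today = date.today()
    # Prior day and current day, as in the original question.
    for day in (today - timedelta(days=1), today):
        # Hypothetical zip location; substitute the real path from the Directory tool.
        print(dynamic_input_path(r"C:\data\example.zip", day))
```

If the date in the csv name can differ from the zip's date, this approach would not work and the archive contents would need to be listed instead.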