
SOLVED

Download ZIP file error zip.txt empty

MinhTa
7 - Meteor

Hello,

 

I am trying to download ZIP files and write their contents to a folder. I added the DownloadAndExtractZips workflow (a solution from another community post on how to download ZIP files) to my workflow, but the process fails with an error. My zip.txt file, which should list the names of all the files inside the ZIP archive, is empty, so the workflow is not fetching the names properly. The ZIP file is password-protected, which I believe may be causing the issue. What commands do I need to add to get the file names?

 

Thank you.

 

MinhTa_1-1635972537220.png

MinhTa_0-1635973469603.png

 

MinhTa_2-1635972818727.png

 

 

23 REPLIES
apathetichell
19 - Altair

There's a slew of things that could be off. For starters, add a Summarize tool to concatenate the strings that you want in your .bat file/command line.
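To make the suggestion concrete: the goal is to collapse several rows of command text into one string that can be written out as a .bat file. Below is a rough Python sketch of that concatenation. The function name `build_batch_script`, the folder, and the archive name are all hypothetical, and using 7-Zip's `l` (list) command is an assumption, since the thread does not say which tool the original workflow calls.

```python
def build_batch_script(commands, separator="\r\n"):
    """Join command fragments into one script string, skipping blank rows.

    This mirrors what concatenating rows with a separator would produce.
    """
    cleaned = [c.strip() for c in commands if c and c.strip()]
    return separator.join(cleaned)


# Hypothetical rows; the 7-Zip call is an assumption, not taken from the thread.
rows = [
    'cd /d "C:\\Temp\\extract"',
    "",
    '7z l "archive.zip" > zip.txt',
]
script = build_batch_script(rows)
print(script)
```

If zip.txt comes back empty, checking the concatenated string first is a cheap way to confirm that the command being run is the one intended.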

Maskell_Rascal
13 - Pulsar

Hi @MinhTa 

 

I might be the outlier here, but I prefer to use the Python tool to extract zip files into my workflows. 

 

You can first use the Download tool to fetch the file from the URL and then let Python do all the rest of the work.

 

In this example, I have header set to "None" because the zip file I downloaded is inconsistent about including a header row in the data. You can change this to "0" if your data is more consistent.

 

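from ayx import Alteryx
import pandas as pd
import zipfile
import glob

# Read the incoming record; the 'File' column holds the path to the downloaded zip.
filepath = Alteryx.read("#1")
file = filepath['File'].iloc[0]

# glob.glob returns only paths that exist; if nothing matches, the loop body
# never runs and df is never assigned.
for zip_file in glob.glob(file):
    zf = zipfile.ZipFile(zip_file)
    # Read every member as headerless CSV and stack them into one DataFrame.
    dfs = [pd.read_csv(zf.open(f), header=None, sep=",") for f in zf.namelist()]
    df = pd.concat(dfs, ignore_index=True)
    print(df)

# Send the combined data to output anchor 1.
Alteryx.write(df, 1)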

 

The zip file contains 64 .txt files and is downloaded as a temporary file. 
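Since the original question was about getting the file names out of a password-protected archive, here is a rough sketch of how the member names (the list zip.txt was meant to hold) could be read with Python's `zipfile` module. The helper names `list_zip_members` and `read_member` are hypothetical. Two caveats, as far as I know: with standard ZIP encryption the names are usually readable without the password, and `zipfile` can decrypt only the legacy ZIP encryption, not AES-encrypted archives. The demo archive is unencrypted because `zipfile` cannot create encrypted files.

```python
import os
import tempfile
import zipfile


def list_zip_members(zip_path):
    """Return (name, is_encrypted) for every member of the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        # Bit 0 of the general-purpose flags marks an encrypted member.
        return [(info.filename, bool(info.flag_bits & 0x1)) for info in zf.infolist()]


def read_member(zip_path, name, password=None):
    """Read one member's bytes, passing a password only if one is given."""
    pwd = password.encode("utf-8") if password else None
    with zipfile.ZipFile(zip_path) as zf:
        return zf.read(name, pwd=pwd)


# Demo on a small unencrypted archive created in a temp folder.
names_demo_dir = tempfile.mkdtemp()
names_demo_zip = os.path.join(names_demo_dir, "demo.zip")
with zipfile.ZipFile(names_demo_zip, "w") as zf:
    zf.writestr("a.txt", "1,2,3\n")

members = list_zip_members(names_demo_zip)
print(members)
```

For an encrypted archive, the same `read_member` call would take the password as its third argument.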

Maskell_Rascal_0-1635976803098.png

 

Attached is a sample workflow for you to try.

 

Let me know if this works for you. 

 

Cheers!

Phil

 

MinhTa
7 - Meteor

@apathetichell Can you give me some suggestions on how to do this? Honestly, I don't know what I want in my .bat command line; I'm not too familiar with this.

MinhTa
7 - Meteor

@Maskell_Rascal Thank you for the wonderful suggestion. However, I was not able to make it work. I kept getting a "NameError". Do you have any suggestions on how to solve this?

 

Thank you.

 

MinhTa_0-1635992911541.png

 

Maskell_Rascal
13 - Pulsar

@MinhTa - can you post a screenshot of the code in your Python tool? The error you’re receiving shows that you didn’t define “df”. If you look at the solution I provided, you’ll see that “df” is defined at the end to concatenate the files together. 
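One way this NameError can arise with the code above unchanged: `glob.glob` returns an empty list when the path does not match an existing file, so the loop body never runs and "df" is never assigned. The sketch below shows a guard that fails with a clearer message instead. It uses only the standard library (plain `csv` rows rather than pandas) so it runs outside Alteryx, and the function name `read_rows_from_zips` is hypothetical.

```python
import csv
import glob
import io
import os
import tempfile
import zipfile


def read_rows_from_zips(pattern):
    """Collect CSV rows from every member of every zip matching the pattern."""
    zip_paths = glob.glob(pattern)
    if not zip_paths:
        # Fail early with the real cause instead of a later NameError.
        raise FileNotFoundError(f"No file matches {pattern!r}")
    rows = []
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as zf:
            for name in zf.namelist():
                with zf.open(name) as member:
                    text = io.TextIOWrapper(member, encoding="utf-8", newline="")
                    rows.extend(csv.reader(text))
    return rows


# Demo: a temp archive holding two small CSV members.
guard_demo_dir = tempfile.mkdtemp()
guard_demo_zip = os.path.join(guard_demo_dir, "demo.zip")
with zipfile.ZipFile(guard_demo_zip, "w") as zf:
    zf.writestr("a.csv", "1,2\n")
    zf.writestr("b.csv", "3,4\n")

guard_rows = read_rows_from_zips(guard_demo_zip)
print(guard_rows)
```

Printing the value of `file` before the loop is another quick check that the path coming into the Python tool is the one expected.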

MinhTa
7 - Meteor

@Maskell_Rascal Hello, I did not change your code at all. I kept it the same and plugged it into my workflow. Here is the Python code:

MinhTa_0-1636031530898.png

 

 

Maskell_Rascal
13 - Pulsar

What is the input tool labeled "Output.csv"? Is that the actual zip file? In your initial post you were using a download tool to pull the file to a temp folder and then working on extracting it. If you already downloaded the zip file and just want to input the contents, that's a different workflow. 

MinhTa
7 - Meteor

@Maskell_Rascal Output.csv is the temp file. When I download the zip file from the db, it comes base64-encoded. I had to decode it and then save it to a temp file. Afterward, I tried to convert it to a zip file using your tool. I am trying to replicate your process as closely as possible, so Output.csv is basically the step after your Select tool. Output.csv contains the path to the temp file.

 

MinhTa_0-1636034441378.png

 

Maskell_Rascal
13 - Pulsar

@MinhTa - I don't have a base64-encoded zip download to test this on, but the updated code below should work for you with the attached workflow.

 

Maskell_Rascal_0-1636038544064.png

 

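from ayx import Alteryx
import pandas as pd
import zipfile
import glob
import base64

# 'File' is the path to the base64-encoded download;
# 'FileDecoded' is the path the decoded zip is written to.
filepath = Alteryx.read("#1")
file = filepath['File'].iloc[0]
file2 = filepath['FileDecoded'].iloc[0]

# Decode the base64 text into the binary zip file.
with open(file, 'rb') as file_input, open(file2, 'wb') as file_output:
    base64.decode(file_input, file_output)

# glob.glob returns only paths that exist; if nothing matches, the loop body
# never runs and df is never assigned.
for zip_file in glob.glob(file2):
    zf = zipfile.ZipFile(zip_file)
    # Read every member as headerless CSV and stack them into one DataFrame.
    dfs = [pd.read_csv(zf.open(f), header=None, sep=",") for f in zf.namelist()]
    df = pd.concat(dfs, ignore_index=True)
    print(df)

# Send the combined data to output anchor 1.
Alteryx.write(df, 1)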

 

I've added a base64 decode to the script, and a Formula tool before the Python tool to give the base64 decode a file to write to. All of this still happens in the temp files.
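To check the decode step on its own, a minimal sketch along these lines may help. It decodes a base64 file to a second path and uses `zipfile.is_zipfile` to confirm the result really is a zip archive before anything tries to read it. The function name `decode_base64_file` and the demo file names are hypothetical.

```python
import base64
import io
import os
import tempfile
import zipfile


def decode_base64_file(src_path, dst_path):
    """Decode a base64 file to dst_path; return True if the result is a zip."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        base64.decode(src, dst)
    return zipfile.is_zipfile(dst_path)


# Demo: build a tiny zip in memory, store it base64-encoded, then decode it.
b64_demo_dir = tempfile.mkdtemp()
encoded_path = os.path.join(b64_demo_dir, "download.b64")
decoded_path = os.path.join(b64_demo_dir, "decoded.zip")

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("a.txt", "hello")
with open(encoded_path, "wb") as f:
    f.write(base64.b64encode(buffer.getvalue()))

decoded_is_zip = decode_base64_file(encoded_path, decoded_path)
with zipfile.ZipFile(decoded_path) as zf:
    decoded_names = zf.namelist()
print(decoded_is_zip, decoded_names)
```

If the check returns False, the problem is in the download or decode step rather than in the extraction loop.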

 

You should be able to update the URL in the text input on the attached workflow and get it to run successfully. 

 

Let me know if this works for you. 

 

Phil

 
