Alteryx Designer Discussions

SOLVED

Check if an input file is corrupted

FilipR
7 - Meteor

I am downloading a big batch of zipped files through an API. Unfortunately, almost always a few of the zips are corrupted and can't be opened. I have a macro (attached) that reads the CSVs inside those zipped files. The macro gives an error for each file it wasn't able to open.

 

What I want to do is to create a similar macro that would check if a file is corrupted or not, but without giving me an error.

 

On a related note, checking the "Treat Read Errors as Warnings" option in the Input Data tool doesn't suppress the read errors.

3 REPLIES
MatthewO
Alteryx

Hello @FilipR :

 

Have you explored using the Dynamic Input Tool? This tool requires that all file schemas are the same as the configured template file. If a file path passed into the tool has a different set of columns, it will skip the file with a warning. In the case of a corrupted file, this may have the intended outcome. I've attached an example with 3 files. Files 1 and 3 are good, but file 2 is corrupted. The example workflow will read files 1 and 3 but skip 2 with a warning message.

FilipR
7 - Meteor

Hi @MatthewO.

 

Unfortunately I can't open the workflow you attached (we use an older version of Alteryx at my company). I tried figuring it out on my own, but I can't make it work with zip files (I get an "Unable to open archive" error).

 

In the meantime, I figured out a solution in Python. The input is a list of paths to the zip files, in a column called [DownloadPath]. The Python tool checks whether each file is a valid zip file and writes 1 (valid) or 0 (invalid) to a new [Valid] column.

 

 

#################################
# List all non-standard packages to be imported by your
# script here (only missing packages will be installed)
from ayx import Package
#Package.installPackages(['pandas','numpy'])


#################################
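from ayx import Alteryx
from zipfile import ZipFile, BadZipFile


#################################
# read in data from input anchor as a pandas dataframe
# (after running the workflow)
df = Alteryx.read("#1")


#################################
# create a new column; every file counts as invalid until it opens cleanly
df['Valid'] = 0


#################################
# loop through the rows and validate the zip files
for ind in df.index:

   # normalize Windows backslashes to forward slashes
   path = df.loc[ind, 'DownloadPath'].replace('\\', '/')

   try:
      # opening the archive parses its central directory;
      # the with-block closes the file handle again
      with ZipFile(path):
         df.loc[ind, 'Valid'] = 1

   except (BadZipFile, OSError):
      # not a valid zip archive, or the file could not be read
      df.loc[ind, 'Valid'] = 0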


#################################
# and then send it to one of the output anchors
Alteryx.write(df, 1)
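
One caveat with the check above: opening a ZipFile only parses the archive's central directory, so an archive whose directory is intact but whose member data is damaged would still get Valid = 1. If that matters, a stricter check could also call testzip(), which reads every member and verifies its CRC. The following is only a sketch under that assumption; is_valid_zip and its deep parameter are made-up names for illustration, and the set of caught exceptions is a guess at what a damaged archive typically raises, not an exhaustive list.

```python
import zipfile
import zlib


def is_valid_zip(path, deep=True):
    """Return True if `path` opens as a zip archive, False otherwise.

    Hypothetical helper, not part of the original macro.
    With deep=True, every member is also read and CRC-checked.
    """
    try:
        with zipfile.ZipFile(path) as archive:
            if deep:
                # testzip() returns the name of the first bad member,
                # or None if all members pass the CRC/header check.
                return archive.testzip() is None
            return True
    except (zipfile.BadZipFile, OSError, EOFError, zlib.error):
        # Not a zip file, truncated or damaged data, or unreadable path.
        return False
```

Inside the Python tool this could replace the loop with something like df['Valid'] = df['DownloadPath'].map(is_valid_zip).astype(int). The deep check reads every archive in full, so it will be noticeably slower on a big batch; deep=False keeps the original behavior.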

MatthewO
Alteryx

@FilipR, glad to hear you found a solution. For future reference, you can adjust the version recorded in a workflow file so that it opens in an older version of Alteryx. The following article explains how this can be done: https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Adjusting-Alteryx-Files-for-Differe...
