
SOLVED

Check if an input file is corrupted

FilipR
11 - Bolide

I am downloading a big batch of zipped files through an API. Unfortunately, a few of the zips are almost always corrupted and can't be opened. I have a macro (attached) that reads the CSVs inside those zipped files. The macro gives an error for each file it wasn't able to open.

 

What I want to do is to create a similar macro that would check if a file is corrupted or not, but without giving me an error.

 

On a related note, checking the "Treat Read Errors as Warnings" option in the Input Data tool doesn't suppress the read errors.

MatthewO
Alteryx

Hello @FilipR :

 

Have you explored using the Dynamic Input Tool? This tool requires that all file schemas are the same as the configured template file. If a file path passed into the tool has a different set of columns, it will skip the file with a warning. In the case of a corrupted file, this may have the intended outcome. I've attached an example with 3 files. Files 1 and 3 are good, but file 2 is corrupted. The example workflow will read files 1 and 3 but skip 2 with a warning message.

FilipR
11 - Bolide

Hi @MatthewO.

 

Unfortunately I can't open the workflow you attached (we use an older version of Alteryx at my company). I tried figuring it out on my own, but I can't make it work with zip files (I get an "Unable to open archive" error).

 

In the meantime, I figured out a solution in Python. The input is a list of paths to the zip files in a column called [DownloadPath]. The Python tool checks whether each file is a valid zip file and writes 1 or 0 to a new [Valid] column.

 

 

#################################
# List all non-standard packages to be imported by your
# script here (only missing packages will be installed)
from ayx import Package
#Package.installPackages(['pandas','numpy'])


#################################
from ayx import Alteryx
from zipfile import ZipFile

 

#################################
# read in data from input anchor as a pandas dataframe
# (after running the workflow)
df = Alteryx.read("#1")


#################################
# create a new column
df['Valid'] = 0


#################################
# loop through the rows and validate the zip files
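for ind in df.index:

   file = df.loc[ind, 'DownloadPath'].replace('\\','/')

   try:
      # opening the archive fails if the zip is corrupted or missing;
      # the with-block closes the file handle again
      with ZipFile(file):
         df.loc[ind, 'Valid'] = 1

   except Exception:
      df.loc[ind, 'Valid'] = 0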


#################################
# and then send it to one of the output anchors
Alteryx.write(df, 1)
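
One caveat with the check above: opening a ZipFile only reads the archive's directory, so a zip whose directory is intact but whose member data is damaged would still be marked valid. Below is a minimal sketch of a stricter check, assuming the standard-library `ZipFile.testzip()` method, which reads every member and verifies its CRC. The helper name `zip_is_intact` is hypothetical and not part of the workflow above.

```python
from zipfile import ZipFile


def zip_is_intact(path):
    """Return 1 if the archive opens and every member passes its CRC check, else 0."""
    try:
        with ZipFile(path) as archive:
            # testzip() returns the name of the first bad member, or None if all are fine
            return 1 if archive.testzip() is None else 0
    except Exception:
        # not a zip file, truncated archive, missing file, unreadable member, etc.
        return 0
```

In the Python tool, this could stand in for the loop with something like `df['Valid'] = df['DownloadPath'].apply(zip_is_intact)`. It is slower than just opening the archive, since every member is read in full.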

MatthewO
Alteryx

@FilipR glad to hear you found a solution. For future reference, you can adjust the version of a workflow file to open it in an older version of Alteryx. The following article explains how this can be done: https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Adjusting-Alteryx-Files-for-Differe...

Robbobu1
7 - Meteor

@MatthewO Thanks for this idea. I was trying to use a batch macro, and it failed without continuing. The Dynamic Input tool worked great.
