
Alteryx Designer Desktop Discussions

SOLVED

Identify Corrupt PDFs

HunterH
6 - Meteoroid

Hi everyone. I am trying to figure out a way to feed Alteryx a folder path full of PDFs and have it return a list of which PDFs are corrupt. I don't have Intelligence Suite. I've seen some possible solutions using the Python tool, but none of them were for PDFs, and I am not proficient enough in Python to adapt those solutions to PDFs. Has anyone been able to accomplish something like this?

5 REPLIES
Qiu
21 - Polaris

@gawa 

Maybe you can help? 😂

caltang
17 - Castor

Perhaps you can build on this code: https://stackoverflow.com/questions/58807673/best-way-to-check-the-pdf-file-is-corrupt-using-python

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
gawa
16 - Nebula

hi @HunterH 

Here is a workflow that detects corrupt PDFs using the Python tool. It relies on the PyPDF2 Python library, so please run the workflow as administrator the first time so that the library can be installed on your computer. (From the second run onward, you no longer need to run it as administrator.)

[Attached image: image.png]

@Qiu Thank you for tagging me.
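
The workflow above was shared as an attachment, so its Python code is not visible in the thread. As a rough sketch of what a PyPDF2-based check could look like (not necessarily what gawa's workflow does), the snippet below tries to open each PDF in a folder and lists the ones that fail. The function names `opens_with_pypdf2` and `find_corrupt_pdfs` are hypothetical, and the code assumes a PyPDF2 version that provides `PdfReader` (older releases used `PdfFileReader` instead).

```python
from pathlib import Path


def opens_with_pypdf2(path):
    """Return True if PyPDF2 can parse the file and count its pages."""
    # Assumption: PyPDF2 is installed and new enough to expose PdfReader.
    from PyPDF2 import PdfReader

    try:
        reader = PdfReader(str(path))
        len(reader.pages)  # asks PyPDF2 to resolve the page list
        return True
    except Exception:
        # Any failure is treated as "corrupt" here, which is deliberately broad.
        return False


def find_corrupt_pdfs(folder, can_open=opens_with_pypdf2):
    """Return the sorted names of .pdf files in `folder` that fail `can_open`."""
    pdfs = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    return [p.name for p in pdfs if not can_open(p)]
```

Calling `find_corrupt_pdfs` with the folder path would return a list of file names. Inside the Alteryx Python tool, that list would typically be placed in a pandas DataFrame and sent to an output anchor with `Alteryx.write(df, 1)`. The `can_open` parameter is only there so the check can be swapped out or tested without PyPDF2.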

Qiu
21 - Polaris

@gawa 
You are awesome as usual 🤣

HunterH
6 - Meteoroid

Thank you! This works great. One of my test files has redacted info, and the workflow flags that one as corrupt. Is that because of the redactions? It seems to run as expected on all the other files I throw at it. Thanks again.
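
One way to investigate a case like this is to record why a file fails rather than only that it fails. The sketch below is an illustration under stated assumptions, not part of the shared workflow: `diagnose_pdf` is a hypothetical helper, and it assumes a PyPDF2 version with `PdfReader` and its `is_encrypted` property. It separates password- or permission-protected files from files that raise an error while being parsed. Whether the redacted file falls into either group is not known from the thread.

```python
def diagnose_pdf(path, open_pdf=None):
    """Return (status, detail); status is 'ok', 'encrypted', or 'error'."""
    if open_pdf is None:
        # Assumption: PyPDF2 is installed and exposes PdfReader.
        from PyPDF2 import PdfReader
        open_pdf = PdfReader

    try:
        reader = open_pdf(str(path))
        if reader.is_encrypted:
            # Protected files are reported separately from damaged ones.
            return "encrypted", "file is password- or permission-protected"
        return "ok", f"{len(reader.pages)} page(s)"
    except Exception as exc:
        # Keep the exception type and message so the cause can be inspected.
        return "error", f"{type(exc).__name__}: {exc}"
```

Running `diagnose_pdf` on the flagged file and reading the returned detail should show whether the file is protected or whether parsing failed, and with which message.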
