Community Gallery

This tool lets you read PDF files into Alteryx as plain text. You get one cell per PDF page.

Use it with the Directory tool, passing the FullPath field.

This version of the 'PDF Input' macro by Ollie Clark has been optimized for processing large volumes of PDFs. It is not a batch macro; instead, the loop over the files runs inside the R tool. This change substantially reduces processing time when working with large volumes of PDFs.

* This tool requires the R package pdftools to be installed on your computer.
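
The idea of looping over the files inside the R tool, rather than running a batch macro once per file, can be sketched roughly as follows. This is a minimal illustration and not the macro's actual code: the function name read_pdfs, its extract parameter, and the output column names Page and Text are assumptions made for the example. It relies on pdftools::pdf_text returning one string per page.

```r
# Sketch: extract text from many PDFs in a single pass, one row per page.
# `read_pdfs` and the `extract` parameter are hypothetical names; `extract`
# defaults to pdftools::pdf_text, which returns one string per page.
read_pdfs <- function(paths, extract = pdftools::pdf_text) {
  per_file <- lapply(paths, function(p) {
    pages <- extract(p)
    data.frame(
      FullPath = rep(p, length(pages)),
      Page     = seq_along(pages),
      Text     = pages,
      stringsAsFactors = FALSE
    )
  })
  # Stack the per-file tables into one table (NULL if `paths` is empty).
  do.call(rbind, per_file)
}

# Inside an Alteryx R tool this might be wired up as (not run here):
# files <- read.Alteryx("#1", mode = "data.frame")
# write.Alteryx(read_pdfs(as.character(files$FullPath)), 1)
```

Keeping the loop in R means the R session and the package are loaded once for the whole file list instead of once per file, which is presumably where most of the time saving comes from.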

Adarsh_R3 · 05-08-2022

Hi @alberto_herni, thanks for sharing the macro.

I'm getting the following error while running the flow. Any idea how to solve this?

Thanks in advance for your help.

alberto_herni · 05-09-2022

Hi @Adarsh_R3,

This tool requires the R package 'pdftools' to be installed on your computer. Please use this R installer tool to install the package, then try again.

Install R Packages - Alteryx Community

Regards,

Alberto

alberto_herni · 07-27-2022

I have published a new version of the macro that includes an R script to install the R package 'pdftools' on your computer if it is not already installed.
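
An install-if-missing check of this kind might look roughly like the sketch below. It is not the script shipped with the macro: the helper name ensure_package and its installer parameter are hypothetical, and a real run may also need a CRAN mirror configured and write access to the R library folder.

```r
# Sketch: install a package only when it is not already available.
# `ensure_package` and `installer` are hypothetical names for illustration.
ensure_package <- function(pkg, installer = utils::install.packages) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    installer(pkg)
  }
  # Report (invisibly) whether the package can be loaded afterwards.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Typical call before using the PDF functions (not run here):
# ensure_package("pdftools")
```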

Gualigee · 10-29-2022

Hi @alberto_herni, I encountered this problem. What should I do? Thanks.

AdrianSanchez · 02-16-2023

Hi Alberto,

Thank you very much for uploading this version of the inputpdf macro. It is exactly what I was looking for, for a project in which I have to scan a huge number of documents.

Would it be feasible to modify the code so that it scans only the first page of each document?

Best regards, and many thanks for sharing it.
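
One possible way to keep only the first page of each document, assuming the macro's R code gets its page texts from pdftools::pdf_text (the macro's code is not shown in this thread), is to take the first element of the returned vector. The helper name first_page_text and its extract parameter are hypothetical.

```r
# Sketch: return only the first page's text for one PDF.
# `first_page_text` is a hypothetical helper; `extract` defaults to
# pdftools::pdf_text, which returns one string per page.
first_page_text <- function(path, extract = pdftools::pdf_text) {
  pages <- extract(path)
  # Guard against documents that yield no pages at all.
  if (length(pages) == 0) NA_character_ else pages[[1]]
}
```

Note that this still extracts every page and then discards the rest, so it reduces the output volume but not necessarily the extraction time.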

AdrianSanchez · 03-16-2023

Hi @alberto_herni,

I am trying to use your macro to load about 2,500 documents. The problem is that some of them may not be scannable, and when I run it I get the following error:

Could it be because some of the PDF documents are images without readable text?

Is there any way to make the macro run to the end even if some documents are not scannable, as the original PDF Input macro does?

Thanks in advance for your help.
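
A common R pattern for carrying on past a failing file is to wrap the per-file call in tryCatch and record the error instead of stopping. The sketch below assumes the failure is an R error raised while reading a single file; the helper name safe_extract and its extract parameter are hypothetical, and this is not code from the macro.

```r
# Sketch: read one PDF, capturing any error instead of aborting the run.
# `safe_extract` is a hypothetical helper; `extract` defaults to
# pdftools::pdf_text.
safe_extract <- function(path, extract = pdftools::pdf_text) {
  tryCatch(
    list(path = path, text = extract(path), error = NA_character_),
    error = function(e) {
      # Keep the path and message so failed files can be reviewed later.
      list(path = path, text = character(0), error = conditionMessage(e))
    }
  )
}

# Possible use over a vector of file paths (not run here):
# results <- lapply(paths, safe_extract)
```

Each result keeps the file path together with either its page texts or the error message, so the files that failed can be listed after the run finishes.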
