Description
This tool reads PDF files into Alteryx as plain text, returning one cell per PDF page.
Use it with the Directory tool, passing in the FullPath field.
This version of the 'PDF Input' macro by Ollie Clark has been optimized for processing large volumes of PDFs. It is not a batch macro; instead, the loop over the files runs inside the R tool. This change gave a massive reduction in processing time when working with large volumes of PDFs.
* This tool requires the following R library to be installed on your computer: pdftools
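For readers curious what "looping inside the R tool" can look like, here is a minimal sketch. It is not the macro's actual code: the function name `read_pdf_pages`, the `reader` argument, and the output column names `Page` and `Text` are assumptions made for illustration. It relies on `pdftools::pdf_text()` returning a character vector with one element per page.

```r
# Sketch only: read_pdf_pages() and its column names are hypothetical,
# not taken from the macro itself.
read_pdf_pages <- function(paths, reader = pdftools::pdf_text) {
  per_file <- lapply(paths, function(path) {
    pages <- reader(path)  # character vector, one element per page
    data.frame(
      FullPath = rep(path, length(pages)),
      Page = seq_along(pages),
      Text = pages,
      stringsAsFactors = FALSE
    )
  })
  # Stack the per-file results into one table, one row per page.
  do.call(rbind, per_file)
}
```

Inside an Alteryx R tool, the `paths` vector would typically come from the FullPath field of the incoming data (for example via `read.Alteryx("#1", mode = "data.frame")`), and the result would be passed to `write.Alteryx()`. Because every file is handled in a single R session, the per-file cost of starting the R tool is paid only once, which is presumably where the time saving over a batch macro comes from.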
Hi @alberto_herni, thanks for sharing the macro.
I'm getting the following error while running the flow. Any idea how to solve this?
Thanks in advance for your help.
Hi @Adarsh_R3,
This tool requires the R package 'pdftools' to be installed on your computer. Please use this R installer tool to install the package, then try again.
Install R Packages - Alteryx Community
Regards,
Alberto
I have published a new version of the macro that includes an R script to install the R package 'pdftools' on your computer if it is not already installed.
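A check of this kind can be sketched roughly as follows. This is an assumption about the general approach, not the script shipped with the macro; `ensure_package` and its `installer` argument are hypothetical names.

```r
# Sketch only: ensure_package() is a hypothetical helper, not the macro's script.
ensure_package <- function(pkg, installer = utils::install.packages) {
  # Install only when the package cannot already be loaded.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    installer(pkg)  # assumes a CRAN mirror is configured and the library is writable
  }
  # TRUE if the package is available after the attempt, FALSE otherwise.
  requireNamespace(pkg, quietly = TRUE)
}
```

Calling `ensure_package("pdftools")` at the top of the R tool's script would then install the package on first use. On machines where the R library folder is not writable, the install step may still need administrator rights.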
Hi @alberto_herni, I encountered this problem. What should I do? Thanks.
Hi Alberto,
Thank you very much for uploading this version of the inputpdf macro. It is exactly what I was looking for, for a project in which I have to scan a huge number of documents.
Would it be feasible to modify the code so that it scans only the first page of each document?
Best regards, and many thanks for sharing it.
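One possible way to keep only the first page, sketched under the assumption that the macro's R code calls `pdftools::pdf_text()` (which returns one element per page), is to index the result. The helper name `first_page_text` and its `reader` argument are hypothetical.

```r
# Sketch only: first_page_text() is a hypothetical helper, not part of the macro.
first_page_text <- function(path, reader = pdftools::pdf_text) {
  pages <- reader(path)  # character vector, one element per page
  # Return NA for a document with no pages instead of failing on the index.
  if (length(pages) == 0) NA_character_ else pages[[1]]
}
```

Note that this still extracts the text of the whole document before discarding the other pages, so it reduces the output size but not necessarily the processing time.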
Hi @alberto_herni,
I am trying to use your macro to load about 2500 documents. The problem is that there might be non-scannable documents among them. When I run it I get the following error:
Could it be because some of the PDF documents are images without readable text?
Is there any way to make the macro continue to the end even if some documents are not scannable, as the original PDF Input macro does?
Thanks in advance for your help.
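A common R pattern for letting a loop continue past a file that cannot be read is to wrap the read in `tryCatch()`. The sketch below is an assumption about how that could be applied here, not the macro's current behavior; `safe_pdf_text` and its `reader` argument are hypothetical names.

```r
# Sketch only: safe_pdf_text() is a hypothetical wrapper, not part of the macro.
safe_pdf_text <- function(path, reader = pdftools::pdf_text) {
  tryCatch(
    reader(path),
    error = function(e) {
      # Report the problem file and carry on with an empty result.
      message(sprintf("Skipping %s: %s", path, conditionMessage(e)))
      character(0)
    }
  )
}
```

Using a wrapper like this in place of a direct `pdf_text()` call inside the file loop would mean a failing document contributes no pages instead of stopping the whole run. Whether the error in this case actually comes from image-only PDFs cannot be determined from the thread.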