PDF to Tabular

fajar_wimar
6 - Meteoroid

Hi All,

I am sharing a PDF to Tabular tool that extracts tabular data from multiple PDF files.
This tool is based on the following post:
https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Extracting-Tabular-Data-from-PDF-Docum...

On the back end, I use the Camelot package for Python, which is documented here:
https://camelot-py.readthedocs.io/en/master/

The tool takes two inputs: a Row Tolerance value and the path of a folder containing the PDF files. The output anchor returns the extracted tabular data along with the Table Number, File Name, and PDF path for each table.
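
For anyone who wants to adapt the Python step, the flow described above might look roughly like the sketch below. This is not the macro's actual code. The function names (`list_pdfs`, `extract_tables`), the `read_pdf` parameter, the record layout, and the choice of Camelot's `stream` flavor (the flavor that accepts `row_tol`) are all assumptions made for illustration.

```python
from pathlib import Path


def list_pdfs(folder):
    """Return the PDF files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def extract_tables(folder, row_tol=2, read_pdf=None):
    """Collect every table row from every PDF in `folder`.

    `read_pdf` is a hypothetical hook so the reader can be swapped out;
    by default it falls back to camelot.read_pdf.
    """
    if read_pdf is None:
        import camelot  # third-party package, imported only when needed
        read_pdf = camelot.read_pdf

    records = []
    for pdf in list_pdfs(folder):
        # Assumption: the stream flavor is used, since row_tol belongs to it.
        tables = read_pdf(str(pdf), pages="all", flavor="stream", row_tol=row_tol)
        for number, table in enumerate(tables, start=1):
            # table.df is the pandas DataFrame Camelot builds for each table.
            for row in table.df.values.tolist():
                records.append({
                    "Table Number": number,
                    "File Name": pdf.name,
                    "PDF Path": str(pdf),
                    "Cells": row,
                })
    return records
```

Calling `extract_tables(folder, row_tol=10)` on a real folder requires Camelot and its dependencies to be installed. A larger `row_tol` groups text that sits at slightly different heights into the same row, so it is worth trying a few values when rows come out split.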

PDF to Tabular.jpg

Feel free to edit the Python code and add new features to the macro. Thank you!

MB25
6 - Meteoroid

great work!

sriniprad08
8 - Asteroid

Hi @fajar_wimar,

Thank you for creating this great content. I tried running the tool but am getting the error below. Can you please let me know what I am missing? Thanks.

sriniprad08_0-1614606524483.png

sriniprad08_0-1614607017104.png