How to extract tables from a PDF to Excel

Pankhudri20
8 - Asteroid

Hello,

I have a recurring invoice PDF from which I need to extract only the tables into an Excel file.

Each page of the PDF contains two adjacent tables, which need to be transposed to get one table.

I was able to achieve this for one page, but I am unable to parse the full PDF.

Can someone please help me achieve that?

Due to restrictions, I can only share sample data and the expected output. The expected output contains multiple sheets, but that is not necessary; all tables can go in one sheet.

My current workflow is also attached for reference.

Thank you,

Pankhudri

 

BrandonB
Alteryx

You may have more success with the Intelligence Suite, which has PDF tools and supports templates that export specific regions of a PDF, so you can pull exactly what you want: https://www.alteryx.com/products/alteryx-platform/intelligence-suite

Otherwise, tools like the one you have linked pull the entire PDF as text, and you need a series of parsing and regular expression tools to isolate exactly what you want. That is still possible, but much more involved.
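
As a rough illustration of that second route, here is a minimal sketch in Python of the page-by-page parsing logic. It is not taken from the attached workflow, and the layout is assumed: the PDF text is taken to be already extracted, pages are separated by a form feed character, and each line holds a key/value pair from the left table followed by a key/value pair from the right table, with columns separated by two or more spaces. The names `parse_page`, `parse_document`, and the sample data are hypothetical.

```python
import re

# Assumed layout: columns are separated by two or more spaces,
# so a single space inside a label such as "Invoice No" is preserved.
CELL_SPLIT = re.compile(r"\s{2,}")


def parse_page(page_text):
    """Turn one page of side-by-side key/value tables into a single record."""
    record = {}
    for line in page_text.splitlines():
        cells = [c for c in CELL_SPLIT.split(line.strip()) if c]
        # Walk the cells two at a time: (key, value) from the left table,
        # then (key, value) from the right table.
        for key, value in zip(cells[0::2], cells[1::2]):
            record[key] = value
    return record


def parse_document(full_text):
    """Apply the same page logic to every page, skipping empty pages."""
    pages = full_text.split("\f")  # assumed page separator
    return [parse_page(p) for p in pages if p.strip()]


# Hypothetical sample: two pages, each with two adjacent tables.
sample = (
    "Invoice No  1001    Customer  Acme\n"
    "Date  2024-01-31    Total  250.00\n"
    "\f"
    "Invoice No  1002    Customer  Globex\n"
    "Date  2024-02-29    Total  125.50\n"
)
rows = parse_document(sample)
```

Each page becomes one row whose keys are the column headers, which is the transposed, single-table shape described in the question. The key point is that the logic that works for one page is wrapped in a function and applied to every page, so the number of pages in a given invoice does not matter.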

Pankhudri20
8 - Asteroid

Hello @BrandonB,

Thank you for your quick response.

Unfortunately, I do not have access to the Intelligence Suite in Alteryx.

Can you tell me how I can solve this using the available tools?

I have been able to do it for one page but am unable to extract all pages. Since it is a recurring invoice, I need an automated process to extract the tables.

It would be a great help, since I am fairly new to Alteryx and still learning my way through.

 

Regards,

Pankhudri
