Alteryx Designer Desktop Discussions

SOLVED

Read Table from PDF using Image Input

shakir_juolay
8 - Asteroid

This is the first time I am trying to read a Table from a PDF in Alteryx Designer 2021.2.

From what I understand, I need to use the Image Input and Image to Text tools from the Computer Vision tab.

Attached is my workflow and a dummy PDF. Kindly point Image Input to the folder you save the PDF to.

 

With this dummy PDF I want to end up with a table like the one below in my Alteryx workflow.

Name         Price
Commodity 1  123
Commodity 2  321

 

What do I need to change to use the T anchor and get the above?

 

Also, since the former PDF Input tool from Text Mining was replaced by Image Input in 2021.2, I feel the link below should mention that.

PDF Input | Alteryx Help

Especially because the replacement happened in May 2021 and the page says Version: 2021.2, Last modified: July 15, 2021.

3 REPLIES
Jean-Balteryx
16 - Nebula

Hi @shakir_juolay ,

 

Is there a rule for the Name column? Can the commodity number be 2 or more digits?

TheOC
15 - Aurora

Hi @shakir_juolay 

I managed to do this with the Image Template tool:

[Image: TheOC_0-1627995554135.png]



Here I can specify which part of the PDF I want to export as a table. I've attached a workflow which should do this for you; you may just need to change the input location.

Cheers,
TheOC


Bulien
shakir_juolay
8 - Asteroid

From this I understand that I have to do some text parsing on the output of Image to Text, and that is fine.
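
For anyone following along, here is a rough sketch of what that parsing step could look like, written as plain Python (for example inside a Python tool). It assumes the OCR text arrives as one string with one table row per line, and that the price is the last run of digits on each line. The function name `parse_ocr_lines` and the pattern are hypothetical and only for illustration; they are not part of the attached workflow.

```python
import re

# Assumed row shape: "<name> <price>", where the price is the trailing number.
# The lazy name group lets names that contain digits (e.g. "Commodity 1") survive.
ROW_PATTERN = re.compile(r"^(?P<name>.+?)\s+(?P<price>\d+(?:\.\d+)?)$")


def parse_ocr_lines(text):
    """Split OCR text into (name, price) tuples, skipping lines that do not fit."""
    rows = []
    for line in text.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match:
            rows.append((match.group("name"), match.group("price")))
    return rows


# Sample text mirroring the dummy PDF; the header line is skipped
# because it does not end in a number.
sample = "Name Price\nCommodity 1 123\nCommodity 2 321"
print(parse_ocr_lines(sample))
```

The same idea should carry over to a RegEx tool in parse mode with two capture groups, though I have not tested that against the real PDF.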

 

Thank You @TheOC 

 

For some weird reason, with my actual PDF the Name column is split across two rows for some entries: the price appears only on the first row, and the second row gets nulls.
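
One possible way to repair this, sketched in plain Python under the assumption that a row with a null price is always the continuation of the Name on the row directly above it. The helper `merge_split_rows` is a hypothetical name, and the sample rows are made up to show the shape of the problem, not taken from the actual PDF.

```python
def merge_split_rows(rows):
    """Fold rows with a missing price into the name of the preceding row.

    rows is a list of (name, price) tuples; price is None where OCR left a null.
    """
    merged = []
    for name, price in rows:
        if price is None and merged:
            # Assumed continuation line: append its text to the previous name.
            prev_name, prev_price = merged[-1]
            merged[-1] = (f"{prev_name} {name}".strip(), prev_price)
        else:
            merged.append((name, price))
    return merged


# Made-up example: the first entry's name wrapped onto a second row.
rows = [("Long commodity", "123"), ("name part", None), ("Commodity 2", "321")]
print(merge_split_rows(rows))
```

If a null price can also mean a genuinely missing value rather than a wrapped name, this rule would merge rows it should not, so it is worth checking against a few known entries first.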
