Alteryx Designer Desktop Discussions

SOLVED

Extracting values from a specific PDF page

NeethaMalik
5 - Atom

Hi all,

 

I am new to Alteryx and I am trying to read PDF/image files. The data in these files is scattered. I have the Alteryx Intelligence Suite and have used it to convert the data to text. The files have 14+ pages, but I am interested in just one page and the data on it. Does anyone have any tips to help me?

6 REPLIES
Raj
16 - Nebula

Use the Image Template tool.

NeethaMalik
5 - Atom

I tried the Image Template tool. It pulls the data correctly for one PDF file, but the moment I run the workflow for multiple files, it returns gibberish or data from elements adjacent to the highlighted ones for the other PDF files.

alexnajm
19 - Altair

You can use the Image Input tool to read in the list of pages from that PDF, then use a Filter to limit to just the page you need. Then using the Image Template tool should work well!

NeethaMalik
5 - Atom

Thank you Alex,

 

It helped me narrow my search to just one page as opposed to all pages, this is great!! Now the problem I am dealing with is that the data output is not necessarily from the fields I highlighted in the Image Template tool. It's working fine for one row but not all the rows.

 

NeethaMalik_1-1678971797916.png

 

BS_THE_ANALYST
15 - Aurora

@NeethaMalik The approach I take is with the PDF to Text tool:

BS_THE_ANALYST_0-1678972278440.png

Then you can use some filtering logic like page = blah, and columns contain blah. It's certainly a lot more involved in terms of parsing, but it'll bring in every piece of data without missing things.

 

All the best,
BS
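
For readers who want to see the "page = blah, and columns contain blah" filter spelled out, here is a minimal Python sketch of the same logic outside Alteryx. The record layout (`file`, `page`, `text`), the sample rows, and the `filter_rows` helper are all hypothetical and only illustrate the idea; they are not the actual output schema of the PDF to Text tool.

```python
# Hypothetical records: one row per line of extracted text, tagged with its
# source file and page number. Field names and values are assumptions made
# for illustration, not the real output of any Alteryx tool.
rows = [
    {"file": "a.pdf", "page": 1, "text": "Cover sheet"},
    {"file": "a.pdf", "page": 7, "text": "Invoice Total: 1,250.00"},
    {"file": "a.pdf", "page": 7, "text": "Notes: none"},
    {"file": "b.pdf", "page": 7, "text": "Invoice Total: 980.50"},
    {"file": "b.pdf", "page": 8, "text": "Invoice Total: 0.00"},
]


def filter_rows(rows, page, keyword):
    """Keep rows on the given page whose text contains the keyword.

    The keyword match is case-insensitive.
    """
    needle = keyword.lower()
    return [
        r for r in rows
        if r["page"] == page and needle in r["text"].lower()
    ]


# Same page filter applied across every file, then a "contains" filter.
matches = filter_rows(rows, page=7, keyword="invoice total")
for r in matches:
    print(r["file"], r["text"])
```

Because the filter works on extracted text rather than on fixed regions of the page, it does not depend on every file sharing exactly the same layout, which may be why this approach held up across multiple PDFs where the template did not.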

NeethaMalik
5 - Atom

Thank you, this indeed worked.
