Extracting values from a specific PDF page
Hi all,
I am new to Alteryx and I am trying to read PDF/image files. The data in these files is scattered. I have the Alteryx Intelligence Suite and have used it to convert the data to text. The files have 14+ pages, but I am only interested in one page and the data on it. Does anyone have any tips to help me?
- Labels:
- Computer Vision
Use the Image Template tool.
I tried the Image Template tool. It pulls the data correctly for one PDF file, but as soon as I run the workflow on multiple files, it returns gibberish or data elements adjacent to the ones I highlighted for the other PDF files.
You can use the Image Input tool to read in the list of pages from that PDF, then use a Filter to limit to just the page you need. Then using the Image Template tool should work well!
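For anyone who thinks better in code, the page filter described above amounts to something like the following. This is only a rough Python sketch of the logic, not what Alteryx runs internally; the `PageRecord` type, its `file` and `page` fields, and the page number 7 are all made up for illustration.

```python
from typing import List, NamedTuple


class PageRecord(NamedTuple):
    """Hypothetical stand-in for one row per PDF page."""
    file: str
    page: int


def keep_page(records: List[PageRecord], target_page: int) -> List[PageRecord]:
    """Keep only the rows whose page number matches the one we care about."""
    return [record for record in records if record.page == target_page]


# Two made-up 14-page files, one row per page.
records = [
    PageRecord(name, page)
    for name in ("a.pdf", "b.pdf")
    for page in range(1, 15)
]

# Only page 7 of each file survives the filter.
selected = keep_page(records, target_page=7)
```

Filtering before templating means every file contributes exactly one page to the downstream step.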
Thank you Alex,
It helped me narrow my search to just one page instead of all pages, which is great! The problem I am dealing with now is that the output does not necessarily come from the fields I highlighted in the Image Template. It's working fine for one row but not for all the rows.
@NeethaMalik The approach I take is to use the PDF to Text tool.
Then you can apply some filtering logic, such as page = blah and column contains blah. It is certainly a lot more involved in terms of parsing, but it will bring in every piece of data without missing anything.
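To make the "page = blah, column contains blah" idea concrete, here is a minimal Python sketch of the same filtering and parsing logic. It is an illustration only, assuming the extracted text arrives as (page number, text) pairs; the function name `find_values`, the sample rows, and the "Invoice Total" label are hypothetical and not taken from the original files.

```python
import re
from typing import Iterable, List, Tuple


def find_values(rows: Iterable[Tuple[int, str]], page: int, keyword: str) -> List[str]:
    """Return the text that follows `keyword` on each matching line of one page."""
    # Match the keyword, an optional colon, then capture the rest of the line.
    pattern = re.compile(re.escape(keyword) + r"\s*:?\s*(.+)", re.IGNORECASE)
    results = []
    for page_number, text in rows:
        if page_number != page:
            continue  # the "page = ..." filter
        for line in text.splitlines():
            match = pattern.search(line)  # the "contains ..." filter
            if match:
                results.append(match.group(1).strip())
    return results


# Made-up extracted text for two pages of one file.
rows = [
    (1, "Cover page\nInvoice Total: 0.00"),
    (3, "Customer: Acme Ltd\nInvoice Total: 1,250.00\nNotes: none"),
]

values = find_values(rows, page=3, keyword="Invoice Total")
print(values)
```

Because the match is driven by the label text rather than by a position on the page, it tends to hold up better when the layout shifts between files, at the cost of writing the parsing rules yourself.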
Thank you, this indeed worked.
