PDF Data Extraction
I have a couple of PDFs in a directory. I want to extract specific data from all of them and then compare the results. The problem I am facing is that when I use the PDF to Text tool inside a macro, it extracts all the data from every PDF, but I only want to extract the required data from each PDF.
- Labels:
- Computer Vision
Hi @Anasalter, you can create a batch macro, which can then be applied to all the files.
Hi @Manoj_k, I have already created a batch macro, and I am getting all the text out of the PDFs. What I want to know is how I can get specific data from those PDFs.
You'd need to know which area of the page holds the data, or what format or anchors exist to search on. Then, using the Filter tool, you can isolate those text blocks and concatenate them as necessary. More information would help us provide more specific guidance.
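To illustrate the anchor idea outside of Alteryx, here is a minimal Python sketch that pulls values out of already-extracted text by searching for label text that precedes each value. The field names and patterns (`invoice_number`, `total`) are hypothetical; real anchors depend on what your PDFs actually contain.

```python
import re

# Hypothetical anchors: label text that is assumed to precede each wanted value.
ANCHORS = {
    "invoice_number": r"Invoice\s*(?:No\.?|Number)\s*[:#]?\s*(\S+)",
    "total": r"Total\s*(?:Due)?\s*:?\s*\$?([\d,]+\.\d{2})",
}

def extract_fields(text, anchors=ANCHORS):
    """Return {field: value}, with None for any anchor not found in the text."""
    result = {}
    for field, pattern in anchors.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        result[field] = match.group(1) if match else None
    return result

# Example text standing in for the output of a PDF-to-text step.
sample = "ACME Corp\nInvoice No: INV-1042\nDate: 2021-03-01\nTotal Due: $1,250.00\n"
print(extract_fields(sample))
```

Inside a workflow, the same patterns could be applied with a regular-expression step on the extracted text column, so only the matched values are carried forward for comparison.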
@Anasalter, use a combination of the Image Input, Image Template, and Image to Text tools to extract specific data from the PDF files instead of all the data.
Provide the path to the PDF files in the Image Input tool, then add a template and define the annotations in the Image Template tool.
@nagakavyasri, I have tried that method as well, but the problem is that some of the PDFs have 2 pages and some have 4, so the annotations do not work properly. Also, the PDFs do not all share the same format.
@Anasalter, you can use two such sets of tools in the same workflow: one to read the 2-page PDFs with one format, and another to read the 4-page PDFs with the other format.
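As a rough sketch of that routing step, the Python below groups files by page count so each group can be sent to its own template. It assumes page count is a reliable proxy for format, which may not hold for every file; the template names and the `unmatched` bucket are hypothetical.

```python
def route_files(page_counts, templates):
    """Group filenames by the template assigned to their page count.

    page_counts: {filename: number_of_pages}
    templates:   {number_of_pages: template_name}
    Files whose page count has no template go under "unmatched".
    """
    routed = {}
    for filename, pages in page_counts.items():
        name = templates.get(pages, "unmatched")
        routed.setdefault(name, []).append(filename)
    return routed

# Hypothetical inputs: page counts gathered in an earlier step.
counts = {"a.pdf": 2, "b.pdf": 4, "c.pdf": 3}
print(route_files(counts, {2: "format_a", 4: "format_b"}))
```

Anything that lands in the unmatched group would need its own template or a manual check.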
Does anyone have a batch macro for this? If so, please share. Thanks.
