Alteryx Designer Desktop Discussions


Image to Text slow performance

Goddenra
8 - Asteroid

Hi, I'm not sure if this is something anyone can help with, but I recently moved our PDF extraction from the R Tool (pdftools) to the Intelligence Suite tools.

 

However, I am seeing a major difference in processing time between the two. Previously it took 1 minute to process 13 PDF files (all pages, approx. 120), but now it takes 5 minutes to process the same files (even after cutting down to the 18 pages that are absolutely necessary).
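
To put those figures on a per-page basis, here is a rough back-of-the-envelope calculation. It is only a sketch: the function name is illustrative, and it assumes processing time scales roughly linearly with page count.

```python
def seconds_per_page(total_minutes: float, pages: int) -> float:
    """Average processing time per page, in seconds."""
    return total_minutes * 60 / pages


# Figures from the runs described above
r_tool = seconds_per_page(1, 120)        # R Tool (pdftools): 1 minute, ~120 pages
image_to_text = seconds_per_page(5, 18)  # Intelligence Suite: 5 minutes, 18 pages

# How many times slower each page is with the new approach
slowdown = image_to_text / r_tool

print(f"R Tool:        {r_tool:.1f} s/page")
print(f"Image to Text: {image_to_text:.1f} s/page")
print(f"Slowdown:      ~{slowdown:.0f}x per page")
```

So the gap is closer to 33x per page than the 5x the headline times suggest.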

 

I have tried enabling the AMP Engine, but saw no noticeable improvement in performance.

Any ideas on how I can improve performance? Ideally we want to stick with the Intelligence Suite options.

 

Thanks in advance!

3 REPLIES
alexnajm
16 - Nebula

What version do you have? If you have 2022.3 or greater, I highly recommend the PDF to Text tool instead!

Hammad_Rashid
11 - Bolide

Here are some suggestions:

 

  1. Use the PDF to Text tool: The PDF to Text tool enables you to extract data directly from the PDF bin...1. You can try this tool instead of the Image to Text tool to see if it improves performance.

  2. Use the Image Template tool: The Image Template tool allows you to specify the exact information you...2. This tool can be used to extract data from PDFs as well.

  3. Use the PDF Parsing Macro: You can use the PDF Parsing Macro to automate the entire PDF ingestion process. The macro allows you to specify the location of your PDF documents and the location of your extracted templates created in the Image Template Tool. This will then loop through all of these using the power of batch and iterative macros to match your PDF documents against all known templates. This completely removes the need for duplicate tools on a canvas or navigating between multiple temp...3.

I hope this helps! 

Goddenra
8 - Asteroid

Still on 2021 at the moment. Sounds like I need to change that. Thanks both!
