
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Image to Text tool

Anasalter
8 - Asteroid

Hi Community,

Data
    emp1
        expense_report1
            bill1
            bill2
            bill3
        expense_report2
            bill34
            23
    emp2
        expensereport_344
        expensereport_454
    emp3
        expensereport_345

The above is the structure in which bills and invoices are stored in a folder for each employee.
I have to extract text from the images and PDFs, and then compare all bills for a particular employee with each other to find duplicates.

The problem I am facing is that when I use the Image Input and Image to Text tools, I get a memory error and the text is not extracted (there are around 2800 bills).

What approach should I use to build this workflow?


Karen763Purvis
5 - Atom

Hello!

To process 2800 bills efficiently, run OCR in batches with a tool such as Tesseract or Google Vision, and avoid memory overload by streaming files rather than loading them all at once, parallelizing where possible. Preprocess the images for better accuracy, store the extracted text with metadata, and compare bills per employee using fuzzy matching or hashing to detect duplicates.
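
For the comparison step, here is a minimal Python sketch of the "hashing plus fuzzy matching" idea. It is not an Alteryx feature, just an illustration that could be adapted (for example in a Python tool) once the text has been extracted. The function names, the 0.9 threshold, and the sample bill texts are all assumptions made for this example.

```python
import hashlib
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial OCR differences disappear."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def fingerprint(text: str) -> str:
    """SHA-256 of the normalized text; equal fingerprints mean identical text."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def find_duplicates(bills: dict[str, str], threshold: float = 0.9):
    """Compare every pair of bills belonging to ONE employee.

    `bills` maps a file name to its extracted text (hypothetical input shape).
    Returns (file_a, file_b, score) for each pair scoring at or above `threshold`.
    """
    results = []
    for (name_a, text_a), (name_b, text_b) in combinations(sorted(bills.items()), 2):
        if fingerprint(text_a) == fingerprint(text_b):
            score = 1.0  # exact match after normalization
        else:
            # Fuzzy similarity in [0, 1] for near-duplicates (OCR noise, rescans)
            score = SequenceMatcher(None, normalize(text_a), normalize(text_b)).ratio()
        if score >= threshold:
            results.append((name_a, name_b, score))
    return results
```

Calling `find_duplicates` once per employee keeps the number of pairwise comparisons small, since bills are only compared within the same employee's folder. The threshold would need tuning against real OCR output.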

OllieClarke
16 - Nebula

Hi @Anasalter 

 

Are you currently loading all 2800 files through the tool in one go? If so, can you try batching them, so that you're only working on one employee at a time?
If you take your current workflow and use a control parameter to update your directory input (which I'm assuming is there), you can turn it into a batch macro, which should limit the amount of memory the tool uses in one go.
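
To illustrate the batching idea outside Alteryx, here is a rough Python sketch of what "one employee at a time" looks like: each employee folder plays the role of the control parameter value, and only that folder's files are handed over per iteration. The function name `employee_batches`, the extension list, and the `Data` root folder layout are assumptions for this example, not part of the batch macro itself.

```python
from pathlib import Path
from typing import Iterator

# Assumed set of bill file types; adjust to whatever the folders really contain.
EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}


def employee_batches(root: Path) -> Iterator[tuple[str, list[Path]]]:
    """Yield (employee_name, files) one employee folder at a time.

    Because this is a generator, only one employee's file list is built
    per iteration instead of all files up front.
    """
    for employee_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        files = sorted(
            f
            for f in employee_dir.rglob("*")  # walks the expense report subfolders
            if f.is_file() and f.suffix.lower() in EXTENSIONS
        )
        yield employee_dir.name, files
```

A loop such as `for employee, files in employee_batches(Path("Data")):` would then process each batch in turn, which is roughly what the batch macro does when it runs once per control parameter record.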

 

There's more info on batch macros here: https://knowledge.alteryx.com/index/s/article/Getting-Started-with-Batch-Macros-1583461640393

 

Hope that helps,

 

Ollie

Anasalter
8 - Asteroid

Hi @OllieClarke 

Yes, earlier I was trying to load all of the files in one go, but I have now used this approach and am able to extract text from the images.

OllieClarke
16 - Nebula

Happy to hear it :)
