
Image to Text tool

Anasalter

Hi Community,

Data
    emp1
        expense_report1
            bill1
            bill2
            bill3
        expense_report2
            bill34
            23
    emp2
        expensereport_344
        expensereport_454
    emp3
        expensereport_345

The structure above shows how bills and invoices are stored in a folder for each employee.
I have to extract text from the images and PDFs, and then compare all bills for a particular employee with each other to find duplicates.

The problem I am facing is that when I use the Image Input and Image to Text tools, I get a memory error and the text cannot be extracted (there are around 2800 bills).

What approach should I use to build this workflow?

Computer Vision

Accepted answers

OllieClarke

Hi @Anasalter

Are you currently loading all 2800 files through the tool in one go? If so, can you try batching them, so that you're only working on one employee at a time?
If you take your current workflow and use a control parameter to update your directory input (which I'm assuming is there), you can turn it into a batch macro, which should limit the amount of memory the tool uses in one go.

There's more info on batch macros here: https://knowledge.alteryx.com/index/s/article/Getting-Started-with-Batch-Macros-1583461640393
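
For readers who want to see the batching idea outside Alteryx, here is a minimal Python sketch of the same pattern: walk the folder one employee at a time so only that employee's bills are handled in a single pass. It is not an Alteryx batch macro. The function names, the `BILL_SUFFIXES` set, and the `extract_text` callable (a stand-in for whatever OCR step you use) are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable, Dict, Iterator, List, Tuple

# File types treated as bills. This set is an assumption; adjust it to the real data.
BILL_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".pdf"}


def iter_employee_batches(root: Path) -> Iterator[Tuple[str, List[Path]]]:
    """Yield (employee_name, bill_paths) for one employee folder at a time."""
    for emp_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        bills = sorted(
            f for f in emp_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in BILL_SUFFIXES
        )
        yield emp_dir.name, bills


def extract_by_employee(
    root: Path, extract_text: Callable[[Path], str]
) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Run the (hypothetical) extract_text step on one employee's bills per batch."""
    for employee, bills in iter_employee_batches(root):
        # Only this employee's files are processed before the next batch starts.
        yield employee, {
            b.relative_to(root).as_posix(): extract_text(b) for b in bills
        }
```

Because `extract_by_employee` is a generator, each employee's results can be written out or compared before the next folder is read, which is what keeps the working set small.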

Hope that helps,

Ollie

All comments

Karen763Purvis

Hello!

To process 2800 bills efficiently, use a batch OCR workflow with tools like Tesseract or Google Vision, avoiding memory overload by streaming files and parallelizing tasks. Preprocess images for better accuracy, store extracted text with metadata, and compare bills per employee using fuzzy matching or hashing to detect duplicates.
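
As a rough illustration of the "fuzzy matching or hashing" step, the sketch below compares the extracted text of one employee's bills using only the Python standard library. It assumes the text has already been extracted into a dict keyed by bill name. The function names and the 0.9 threshold are illustrative assumptions, not values taken from this thread.

```python
import hashlib
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, List, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial OCR differences do not matter."""
    return " ".join(text.lower().split())


def find_duplicates(
    texts: Dict[str, str], threshold: float = 0.9
) -> List[Tuple[str, str, float]]:
    """Return (bill_a, bill_b, similarity) for pairs scoring at or above threshold."""
    norm = {name: normalize(t) for name, t in texts.items()}
    # Skip bills with no extracted text; otherwise all empty bills would match.
    norm = {name: t for name, t in norm.items() if t}
    digests = {
        name: hashlib.sha256(t.encode("utf-8")).hexdigest()
        for name, t in norm.items()
    }
    pairs = []
    for a, b in combinations(sorted(norm), 2):
        if digests[a] == digests[b]:
            # Identical normalized text: exact duplicate, no fuzzy check needed.
            pairs.append((a, b, 1.0))
            continue
        score = SequenceMatcher(None, norm[a], norm[b]).ratio()
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs
```

The hash check catches exact copies cheaply, while `SequenceMatcher` catches near copies where OCR differs by a few characters. The pairwise loop is quadratic, which is one more reason to compare bills within a single employee rather than across all 2800 at once.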

Anasalter

Hi @OllieClarke

Yes, earlier I was trying to load all of the files in one go. I have now used this approach and I am able to extract text from the images.
