I am building a GST invoice extraction workflow using the LLM Prompt Tool with GPT on Alteryx AIMS. My setup processes 300+ scanned, multi-page PDFs (10-20 pages, ~1-3 MB each) from different vendors in varying formats and extracts structured fields into Excel.
Workflow Setup
Directory Tool → Blob Input → LLM Prompt Tool (Attach Non-Text Columns) → JSON Parse → Excel Output
Why Blob Input and not PDF to Text?
I initially tried the PDF to Text Tool but abandoned it because:
- The text output could not be passed cleanly to the LLM Prompt Tool due to a bytes/string type mismatch error
- OCR on scanned invoices is slow and loses layout context critical for accurate extraction
The Blob → Attach Non-Text Columns approach is more accurate for scanned PDFs because GPT reads the document visually, so it is our preferred route.
The Problem
When running all 300+ files in one go, the workflow consistently fails mid-run with:
"Error occurred in LLM response generation: 502 Server Error: Bad Gateway for url: https://eu1.alteryxcloud.com/aims/v1/generatedContent"
The same files process successfully in smaller batches of 25-30, which confirms the issue is volume/concurrency related rather than file-specific. Splitting into multiple manual runs is not acceptable, as it defeats the purpose of automation.
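If we end up orchestrating the API calls ourselves (e.g. from a Python Tool) rather than relying on the LLM Prompt Tool's internal handling, one mitigation for transient 502s is retry with exponential backoff. A minimal sketch, assuming a caller-supplied `send()` function that performs one request and returns `(status_code, body)` (the helper name, signature, and defaults are illustrative, not part of any Alteryx or AIMS API):

```python
import random
import time

# Gateway/overload statuses worth retrying; 502 is what we see mid-run.
TRANSIENT = {502, 503, 504}

def call_with_backoff(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send() once per attempt, retrying transient HTTP statuses.

    send() must return (status_code, body). Delay doubles each attempt
    (1 s, 2 s, 4 s, ...), is capped at 60 s, and gets +/-50% jitter so
    parallel workers don't retry in lockstep. Returns the last response
    if all retries are exhausted.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in TRANSIENT or attempt == max_retries:
            return status, body
        delay = min(base_delay * 2 ** attempt, 60.0)
        sleep(delay * (0.5 + random.random()))
```

The `sleep` parameter is injectable only so the behaviour can be exercised without real waiting; in a workflow it would default to `time.sleep`.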
What we are looking for
- Is there a recommended architecture for processing 300+ scanned PDFs (at up to 20 pages each, i.e. roughly 300 × 20 = 6,000 pages) through the LLM Prompt Tool in a single unattended run?
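Absent a recommended pattern, our fallback idea is to throttle within a single run: walk the file list in fixed-size batches (25, the size that currently succeeds) with a pause between batches. The chunking itself is trivial; the batch size, the pause, and the `process_batch` callable are all assumptions to be tuned, not anything prescribed by Alteryx:

```python
import time

def batches(items, size=25):
    """Yield fixed-size slices of items; the last batch may be shorter."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def run_throttled(files, process_batch, size=25, pause_s=30, sleep=time.sleep):
    """Process files batch by batch, pausing between batches.

    process_batch receives one list of files and does the actual LLM
    calls; pause_s gives the gateway time to recover between bursts.
    """
    chunks = list(batches(files, size))
    for n, chunk in enumerate(chunks):
        process_batch(chunk)
        if n < len(chunks) - 1:  # no pause after the final batch
            sleep(pause_s)
```

With 300+ files and size 25 this yields 12+ batches per run, which matches the manual split that works today but keeps it inside one unattended execution.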