Understanding the LLM Extraction Pipeline
This document provides a high-level overview of Zudello's internal LLM extraction pipeline for context. Configuration and detailed management are handled by Zudello staff.
Zudello employs a sophisticated pipeline leveraging Large Language Models (LLMs) to automatically extract structured data from unstructured documents like invoices and receipts. This approach offers greater flexibility and accuracy compared to traditional template-based methods.
Key Concepts
- LLM Workflows: Instead of a single large prompt, Zudello uses workflows composed of multiple, smaller, targeted prompts. Each workflow is specific to a document type (Module/Submodule). See Configuring LLM Workflows and Prompts.
- Prompts: Each prompt asks the LLM a specific question to extract a particular piece of data (e.g., "What is the invoice number?", "List the line items in JSON format.").
- Model Selection: Prompts are classified by complexity (Low, Medium, High), allowing Zudello to route each prompt to the most appropriate and cost-effective LLM (e.g., Claude Haiku for simple tasks, Claude Opus for complex tables). Zudello integrates with models such as Claude, Llama, and GPT via OpenRouter.
- Context Injection: Known information (like the client's team name or ABN) is injected into prompts to help the LLM differentiate between parties on the document.
- Validation: The LLM's response for each prompt is validated against the expected data type (Text, Number, JSON) using Pydantic models. Retries occur if validation fails.
- JSON Output: The results from all successful prompts in a workflow are aggregated into a final JSON object representing the extracted document data.
- Pipeline Mapping: This structured JSON output is then processed by the standard Zudello Pipeline Mapper to populate the relevant fields in the Zudello transaction record.
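The validation-and-retry behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not Zudello's actual implementation: the field names, the `LineItemsResponse` schema, and the `send_to_llm` callable are all hypothetical.

```python
from pydantic import BaseModel, ValidationError

# Hypothetical response schema for a "List the line items" prompt;
# the field names are illustrative, not Zudello's real schema.
class LineItem(BaseModel):
    description: str
    quantity: float
    unit_price: float

class LineItemsResponse(BaseModel):
    items: list[LineItem]

def run_prompt_with_validation(send_to_llm, prompt, model_cls, max_retries=2):
    """Send a prompt, validate the LLM's JSON reply against a Pydantic
    model, and retry when validation fails."""
    for attempt in range(max_retries + 1):
        raw = send_to_llm(prompt)  # assumed to return a JSON string
        try:
            return model_cls.model_validate_json(raw)
        except ValidationError:
            if attempt == max_retries:
                raise  # give up after the final retry
```

Validating against a typed schema (rather than trusting the raw text) is what lets the pipeline detect a malformed answer and re-ask the question before the data reaches the final JSON output.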
Simplified Pipeline Flow
- Ingestion & OCR: Document is uploaded, and Optical Character Recognition (OCR) extracts the raw text.
- Document Classification: Zudello determines the document type (e.g., PURCHASING/INVOICE).
- Workflow Selection: The appropriate LLM Workflow (team-specific or global default) for that document type is selected.
- Prompt Execution (Iterative):
  - The workflow executes each defined prompt in sequence.
  - The prompt (question + context + optional examples) is sent to the LLM selected for its complexity tier.
  - The LLM returns a response.
  - The response is validated against the expected data type; retries occur on failure.
  - The validated data is added to the final JSON output.
- Final JSON Aggregation: All extracted data points are combined.
- Pipeline Mapping: The final JSON is mapped to Zudello's internal transaction model fields.
- Enrichment: Standard Zudello Enrichment processes run (linking suppliers/items, applying defaults, etc.).
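The workflow loop at the heart of this flow (steps 4-5) can be sketched as follows. The complexity tiers, model names, and prompt dictionaries are illustrative assumptions, not Zudello's actual configuration; validation and retries are omitted for brevity.

```python
# Hypothetical complexity-to-model routing table.
MODEL_BY_COMPLEXITY = {
    "low": "claude-haiku",      # simple single-field questions
    "medium": "claude-sonnet",  # moderately complex extraction
    "high": "claude-opus",      # complex tables and relationships
}

def run_workflow(prompts, call_model):
    """Execute each prompt in sequence, route it to a model by
    complexity, and aggregate the answers into one dict (the
    final JSON output)."""
    extracted = {}
    for p in prompts:
        model = MODEL_BY_COMPLEXITY[p["complexity"]]
        answer = call_model(model, p["question"])
        extracted[p["field"]] = answer  # validation step omitted here
    return extracted
```

Aggregating per-prompt answers into a single dictionary mirrors the "Final JSON Aggregation" step: each prompt contributes one key, and the combined object is what the Pipeline Mapper consumes.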
Benefits
- Flexibility: Adapts better to variations in document layouts than rigid templates.
- Accuracy: Leverages the contextual understanding of LLMs for improved extraction, especially for complex fields or relationships.
- Extensibility: Can be configured by staff to extract standard and custom fields.
- Efficiency: Reduces the need for manual mapping (via the AI Assistant) and data-entry corrections.
Need help?
While users don't directly configure the LLM pipeline, understanding the basics can be helpful when troubleshooting extraction issues. If you encounter persistent problems with data extraction, contact Zudello support for assistance. Staff can review the underlying workflows and prompts.