Deep Dive: The Enrichment Engine
Introduction
The Enrichment Engine is a core component of Zudello, responsible for taking the raw data extracted from documents (via OCR or LLM) and transforming it into structured, validated, and contextually relevant information ready for processing, approval, and integration. It automates many tedious tasks like linking suppliers, applying default coding, and performing calculations, significantly reducing manual effort.
Understanding the sequence and logic of the Enrichment Engine is crucial for configuring Zudello effectively and troubleshooting issues where automatic coding or linking doesn't behave as expected.
The Enrichment Sequence
Enrichment runs as a series of ordered steps whenever the enrich flag is triggered (e.g., after extraction, when "Apply Trained Rules" is used, or during manual transaction creation). The sequence is generally as follows:
- Document Coding (Keyword Rules): Initial rules based on keywords found in extracted text fields are applied. This can set preliminary coding or flags. See How do Keyword Coding rules work?.
- Determine Origin & Initial Defaults: The system identifies whether the document arrived by email or by manual upload/creation, then applies the initial Location/Subsidiary defaults from the Inbox or User profile accordingly.
- Clean Transaction Data: Basic data cleaning occurs (e.g., setting default quantities if missing).
- Link Company (Supplier/Customer): This is a critical step.
- If a company is already linked, its tax_inclusive setting is checked/applied.
- If not linked, the engine attempts to match based on extracted data (Tax Number, Email, Phone, Name) against existing Supplier/Customer records.
- If no direct match, it checks learned Supplier/Customer Alternatives. See Why didn't my supplier/item match automatically?.
- Once linked, the company's default document_type might be applied if the transaction's type isn't set.
- Calculate Initial Totals & Exclusive Values: Based on the linked company's tax_inclusive setting (or calculation if no setting), initial line tax_amount, total_exclusive, unit_price_exclusive, and header total_exclusive values are calculated. See Understanding Document Total Calculations and Tax Calculation Errors.
- Apply Company Settings:
- Consolidate Lines: If the linked company has consolidate_lines enabled, all existing lines are replaced with a single summary line.
- Overwrite Due Date: If the document type extension is enabled, the due date is recalculated based on the linked company's payment terms.
- Link Items: Each transaction line attempts to link to an Item record.
- Matching priority: Exact SKU -> Cleaned SKU -> Item Alternatives -> Barcode -> Keywords in Description. See Why didn't my supplier/item match automatically?.
- Apply Item Settings:
- Treat as Freight: If a linked item has treat_as_freight enabled, the line type changes to LANDED_COST.
- Remove Lines: If a linked item has remove_lines enabled, the line is deleted.
- (Future) Unit of Measure Conversion: If applicable, quantities/prices are converted based on UOM rules.
- Link Transactions (Allocations): The engine attempts to automatically match the current transaction to related transactions (e.g., Invoice to PO/Receipt) based on configured rules (PO number, supplier, line matching). See Automatic Matching Process and Deep Dive: Allocations and Matching.
- Apply Coding (Defaults): This multi-stage process applies default dimension coding. Crucially, defaults generally only apply if the target field is currently empty. The priority order is:
- Inbox/User Defaults: Location/Subsidiary applied to lines based on origin (see the Determine Origin & Initial Defaults step above).
- Allocation Autofill: If line-level matching occurred and the Allocation extension is configured to autofill specific fields, coding from the matched PO/Receipt line is applied (potentially overwriting existing values).
- Item Defaults: Default coding from the linked Item record is applied to the line.
- Company Defaults: Default coding from the linked Supplier/Customer record is applied to the header and lines.
- Currency: If still not set, the Team's default currency is applied.
- See Supplier Default Coding.
- Apply Address Learning: If address fields are present, the system checks learned associations based on the extracted address text block (Address Key) and populates the address fields if a match is found.
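The priority order in the Apply Coding step can be illustrated with a minimal Python sketch. This is not Zudello's implementation: the function names, the dictionary-based line, and the field names (location, account, department, tax_code, currency) are all hypothetical, and the currency is placed on the same dictionary purely for brevity. The sketch assumes that every source fills only empty fields, except Allocation Autofill, which may overwrite.

```python
def fill_if_empty(target: dict, defaults: dict) -> None:
    """Copy each default into target only where the target field is empty."""
    for field, value in defaults.items():
        if target.get(field) in (None, ""):
            target[field] = value


def apply_coding(line, inbox_defaults, autofill, item_defaults,
                 company_defaults, team_currency):
    """Illustrative priority order; all names here are hypothetical."""
    coded = dict(line)                        # work on a copy of the line
    fill_if_empty(coded, inbox_defaults)      # 1. Inbox/User defaults
    coded.update(autofill)                    # 2. Allocation autofill may overwrite
    fill_if_empty(coded, item_defaults)       # 3. Item defaults
    fill_if_empty(coded, company_defaults)    # 4. Company defaults
    fill_if_empty(coded, {"currency": team_currency})  # 5. Team currency
    return coded


example = apply_coding(
    line={"location": None, "account": "6100", "department": ""},
    inbox_defaults={"location": "SYD"},
    autofill={"account": "5000"},
    item_defaults={"account": "4000", "department": "OPS"},
    company_defaults={"department": "FIN", "tax_code": "GST"},
    team_currency="AUD",
)
print(example)
```

In this example the autofilled account replaces the extracted one, the Item default for account is skipped because the field is already populated, and the Company default for department is skipped because the Item default got there first.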
Alternatives and Learning
A key part of Enrichment is its ability to learn and adapt.
- Supplier/Customer Alternatives: When a user manually links a supplier/customer to a document where automatic matching failed, the system stores the extracted details (name, tax number, email, phone) from that document as an "alternative" linked to the chosen supplier/customer. The next time a document arrives with those exact same details, Enrichment will use the learned alternative to link the correct record automatically.
- Item Alternatives: These are configured more explicitly, allowing users to map specific supplier/customer SKUs or descriptions to internal Item records. Enrichment uses these mappings during the "Link Items" step.
- Address Learning: Similar to supplier alternatives, when a user manually selects an address (e.g., a Team Address) for a document, the system learns the association between the text block extracted from the document image (the "Address Key") and the selected address record. Future documents with the identical Address Key will have the address auto-populated.
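The exact-match behaviour of learned alternatives can be sketched as a simple keyed lookup. This is an illustration under stated assumptions, not Zudello's data model: the AlternativeStore class, the alternative_key helper, the field names, and the choice to key on all four extracted details together are hypothetical.

```python
def alternative_key(details: dict) -> tuple:
    """Build a lookup key from the extracted details, exactly as extracted."""
    return tuple(details.get(f) for f in ("name", "tax_number", "email", "phone"))


class AlternativeStore:
    """Hypothetical store of learned Supplier/Customer alternatives."""

    def __init__(self):
        self._by_key = {}

    def learn(self, details: dict, company_id: str) -> None:
        # Called when a user manually links a company after matching failed.
        self._by_key[alternative_key(details)] = company_id

    def lookup(self, details: dict):
        # Returns the learned company id, or None if the details differ at all.
        return self._by_key.get(alternative_key(details))


store = AlternativeStore()
extracted = {"name": "ACME Pty Ltd", "tax_number": "12345",
             "email": "ap@acme.example", "phone": None}
store.learn(extracted, "SUP-001")

print(store.lookup(extracted))                              # same details
print(store.lookup({**extracted, "name": "ACME Pty Ltd."}))  # trailing full stop
```

The second lookup returns no match because the name differs by one character, which is why a slightly different document can fail to use an alternative that was learned earlier.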
Dependencies
Data Dependencies configured in Settings (Configure Data Dependencies) primarily affect the user interface by filtering dropdown options. However, Enrichment does consider dependencies during the Apply Coding step.
- When applying defaults from Items or Suppliers/Customers, if a default value (e.g., default Location) conflicts with an already populated controlling field (e.g., Subsidiary) due to a dependency rule, the default will not be applied.
- This ensures Enrichment doesn't automatically create invalid coding combinations.
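As a rough illustration of the dependency check, the sketch below skips a default when it conflicts with an already populated controlling field. The rule table, function name, and field names are hypothetical, and the sketch assumes that a controlling value with no configured rule imposes no restriction.

```python
# Hypothetical dependency rule: each Subsidiary permits a set of Locations.
ALLOWED_LOCATIONS = {"AU": {"SYD", "MEL"}, "NZ": {"AKL"}}


def apply_default_with_dependency(line, field, value, controlling_field, allowed):
    """Apply a default unless the field is populated or the rule forbids it."""
    if line.get(field) not in (None, ""):
        return False                      # already populated: never overwrite
    controller = line.get(controlling_field)
    permitted = allowed.get(controller)   # None means no rule for this value
    if permitted is not None and value not in permitted:
        return False                      # would create an invalid combination
    line[field] = value
    return True


line = {"subsidiary": "NZ", "location": None}
print(apply_default_with_dependency(
    line, "location", "SYD", "subsidiary", ALLOWED_LOCATIONS))  # conflict
print(apply_default_with_dependency(
    line, "location", "AKL", "subsidiary", ALLOWED_LOCATIONS))  # valid
print(line)
```

Here the Sydney default is rejected for a New Zealand subsidiary and the field stays empty, so a later default that is valid for that subsidiary can still fill it.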
Troubleshooting Enrichment
When automatic linking or coding doesn't work as expected, consider the Enrichment sequence:
- Check Linked Records: Was the correct Supplier/Customer linked? Was the correct Item linked on the line? Incorrect linking is the most common cause of incorrect defaults. Use the Enrichment JSON (Staff/Advanced) to see why a specific record was (or wasn't) linked.
- Check Defaults Configuration: Are defaults correctly configured on the Supplier, Item, User Profile, or Inbox? Remember defaults usually only apply to empty fields.
- Check Field Values: Was the field you expected to be defaulted already populated (e.g., by extraction, keyword coding, or allocation autofill)? Enrichment won't overwrite it.
- Check Alternatives: Did an incorrect alternative get learned previously? (Requires Staff intervention to clear). Is the text on the current document slightly different, preventing the alternative from matching?
- Check Settings: Did a setting like "Consolidate Lines" or "Remove Lines" interfere with expected line data?
- Use "Apply Trained Rules": If you've corrected configuration (defaults, alternatives, settings) after the document was processed, use "Apply Trained Rules" to re-run the entire Enrichment sequence. See Apply Trained Rules. Note that this generally won't overwrite manually entered data but will re-apply defaults to empty fields and re-evaluate settings and links.
Enrichment JSON (Staff/Advanced)
For advanced troubleshooting, Zudello Staff can access the Enrichment JSON log for a document.
- Access: Via the Staff Menu (...) on the document viewer.
- Content: Provides a detailed log of each Enrichment run, showing:
- Which functions were executed (e.g., Link Supplier, Link Item for line #X, Find matching PURCHASING ORDER transactions).
- The filter steps/criteria used for lookups (e.g., company_tax=..., sku__iexact=...).
- The result of the lookup (e.g., linked record UUID or null/[] if no match).
- The dependency state considered during the lookup.
- Use: By examining the filter steps and results, staff can pinpoint exactly why a specific record was matched or why no match was found, aiding in diagnosing configuration issues or unexpected behaviour. See Troubleshooting Enrichment Decisions.
Understanding the Enrichment Engine's flow and logic empowers users and administrators to configure Zudello for optimal automation and efficiently resolve issues when they arise.