From Inbox to ERP: A Hands-On Tour of the Agentic B2B Order Intake Prototype

Jun 29 2026 - By Alliance


By Scott Trafford – Engineer, MACH Alliance Agent Ecosystem and Principal, Trafford Consulting

Explore the reference implementation: B2B Order Intake GitHub repo

Purchase orders arrive through every channel a buyer has access to: email with a plain text body, email with a PDF or spreadsheet attached, web portal submissions, EDI transactions over VAN or AS2, API calls from buyer procurement systems, and fax for operations still running on it.

The most mature B2B trading relationships have automated portions of this flow. EDI-connected partners can exchange X12 850s with minimal human involvement when the data is clean and the systems agree. But automation across the full range of buyers, formats, and channels remains the exception rather than the rule. Esker's 2025 benchmark across hundreds of customer service departments puts the average touchless order rate at 67%, meaning one in three orders still requires manual intervention even at established operations. For companies earlier in their automation journey, the proportion is considerably higher.

That remaining manual workload is the problem this prototype addresses.

Each order that touches a human requires someone to read it, look up the customer, validate the products, check stock, and re-key the data into an ERP. If something is missing or wrong, that person sends a clarification email and waits. For many companies, even orders that arrive through structured channels still require manual intervention because the systems on either end are not connected to each other. Research from Conexiom puts the average manual entry error rate at 1–3% per order. Emporix documents average manual processing times of 8–15 minutes per order in B2B distribution operations.

This is the daily reality of B2B order intake at most manufacturers and distributors, not an edge case.

The first article in this Agent Solution Studio series, the Product Insights Agent, addressed a moment on the buyer side: the unanswered question on the product detail page. That article noted B2B order intake as a future prototype in the series. This is that prototype, and the rest of this piece is written for the people who will actually run it: the engineers, architects, and technical leaders deciding whether the pattern survives contact with their own systems.

What we built

The B2B Order Intake Agent is an agentic pipeline that receives purchase orders across multiple channels and formats, extracts and validates the order content, resolves customer identity and product catalog matches, and routes each order to one of four outcomes automatically, with a plain-language explanation attached to every decision. The demo below shows the full path, from inbound order to ERP submission.

The pipeline handles two inbound channels out of the box: an email channel that polls a configured inbox, and an EDI channel for X12 850 files. In the prototype, EDI files are delivered through a watched folder, a practical stand-in for the automated ingestion a production EDI connection would provide. Within those channels, five content parsers handle the actual formats: plain text email bodies, PDF attachments, CSV files, XLSX spreadsheets, and X12 850 EDI documents. Machine-readable PDFs are parsed for text directly. Scanned or image-based PDFs produce minimal extractable text, which yields low confidence and routes the order to Human Review rather than failing silently.
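To make the EDI format concrete, here is a minimal sketch of pulling line items out of the PO1 segments of an X12 850. It is not the repo's parser. The `EdiLineItem` shape and the `parsePo1Segments` function are hypothetical names, and the sketch assumes `~` as the segment terminator and `*` as the element separator, whereas a real interchange declares its delimiters in the ISA header.

```typescript
// Hypothetical shape for one parsed line item; not taken from the repo.
interface EdiLineItem {
  lineNumber: string;
  quantity: number;
  unit: string;
  unitPrice: number | null;
  productId: string | null;
}

// Assumes "~" ends a segment and "*" separates elements.
// Element positions follow the PO1 layout: PO101 line number, PO102 quantity,
// PO103 unit of measure, PO104 unit price, PO106 qualifier, PO107 product ID.
function parsePo1Segments(x12: string): EdiLineItem[] {
  return x12
    .split("~")
    .map((segment) => segment.trim().split("*"))
    .filter((elements) => elements[0] === "PO1")
    .map((e) => ({
      lineNumber: e[1] ?? "",
      quantity: Number(e[2] ?? "0"),
      unit: e[3] ?? "",
      unitPrice: e[4] ? Number(e[4]) : null,
      productId: e[7] ? e[7] : null,
    }));
}

// A trimmed, illustrative transaction set (envelope segments omitted).
const sample850 =
  [
    "ST*850*0001",
    "BEG*00*SA*PO-1001**20260629",
    "PO1*1*10*EA*12.50**VP*WIDGET-A",
    "PO1*2*4*CA*80.00**VP*WIDGET-B",
    "CTT*2",
    "SE*6*0001",
  ].join("~") + "~";

const ediItems = parsePo1Segments(sample850);
console.log(ediItems);
```

Because EDI is already structured, a parser like this mostly has to locate fields rather than interpret them, which is why the harder extraction work sits with the free-text and PDF formats.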

Every order, regardless of channel or format, passes through the same four pipeline stages. Intake normalizes the raw inbound message into a consistent structure. Extraction uses Claude to read the order content and pull structured fields: customer identity signals, line items, quantities, shipping address, requested delivery date. Validation runs the extracted data against business rules, the customer master, and inventory availability. Routing produces a final outcome with a confidence score and a natural-language explanation of the decision.

The pipeline is built on LangGraph. The entire flow runs as a single stateful graph, where each order is a graph invocation with nodes for intake, extraction, validation, routing, and outcomes. That design makes the flow explicit, inspectable, and resumable, and it is what powers the Human Review queue: the graph pauses at review using LangGraph's interruptBefore, persists its state to a PostgreSQL checkpointer, and resumes through the remaining nodes when the operator acts. State survives a restart. All AI work is handled by Claude via the Anthropic API, and the model is configurable through a single environment variable.
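The pause-and-resume behaviour is easier to reason about with a toy model. The sketch below is deliberately not LangGraph code: it is a hand-rolled, linear stand-in that pauses before a `review` node, saves state to an in-memory map in place of the PostgreSQL checkpointer, and resumes when an operator decision arrives. Every name in it (`runGraph`, `resumeGraph`, `OrderState`, `Checkpoint`) is hypothetical, and it sends every order through review, which the real graph does not.

```typescript
type NodeName = "intake" | "extraction" | "validation" | "routing" | "review" | "submit";

interface OrderState {
  orderId: string;
  completed: NodeName[];
  operatorDecision?: "approve" | "reject";
}

interface Checkpoint {
  state: OrderState;
  nextIndex: number;
}

const NODE_ORDER: NodeName[] = ["intake", "extraction", "validation", "routing", "review", "submit"];
const INTERRUPT_BEFORE: NodeName[] = ["review"];

// In-memory stand-in for a durable checkpoint store.
const checkpoints = new Map<string, Checkpoint>();

function runGraph(state: OrderState, startIndex = 0, resuming = false): "paused" | "finished" {
  for (let i = startIndex; i < NODE_ORDER.length; i++) {
    const node = NODE_ORDER[i];
    if (node === undefined) break;
    // When resuming, the node we paused in front of is allowed to run.
    const isResumePoint = resuming && i === startIndex;
    if (INTERRUPT_BEFORE.includes(node) && !isResumePoint) {
      checkpoints.set(state.orderId, { state, nextIndex: i });
      return "paused";
    }
    state.completed.push(node); // a real node would do its work here
  }
  checkpoints.delete(state.orderId);
  return "finished";
}

function resumeGraph(orderId: string, decision: "approve" | "reject"): "paused" | "finished" {
  const checkpoint = checkpoints.get(orderId);
  if (!checkpoint) throw new Error(`No checkpoint for order ${orderId}`);
  checkpoint.state.operatorDecision = decision;
  return runGraph(checkpoint.state, checkpoint.nextIndex, true);
}

const reviewOrder: OrderState = { orderId: "edi-07", completed: [] };
const firstRun = runGraph(reviewOrder); // stops in front of "review"
const secondRun = resumeGraph("edi-07", "approve"); // runs "review" and "submit"
console.log(firstRun, secondRun, reviewOrder.completed);
```

The point of the sketch is the contract, not the mechanics: the state saved at the pause is all that is needed to continue, so nothing has to stay in memory while an order waits for an operator.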

The system is aligned to the MACH Alliance Open Data Model (ODM), with order, customer, inventory, and address entities following the ODM schema. Every integration point (ERP, email provider, AI extraction, customer lookup, logging, EDI outbound) is implemented as a swappable adapter behind a defined interface. Changing providers means changing a configuration value, not the pipeline.

How it works

Every inbound order moves through the pipeline in sequence. The intake stage receives the raw message, identifies the content type, and routes it to the appropriate parser. A plain text email goes to the text parser, an email with a PDF attachment to the PDF parser, an EDI file drop to the EDI parser. The parser extracts the raw order content and hands it to extraction as structured text.
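The routing step at intake can be sketched as a small dispatch function. This is an illustration under stated assumptions, not the repo's code: `InboundMessage`, `attachmentName`, and `selectParser` are hypothetical names, and choosing a parser by file extension is a simplification of whatever content-type detection the prototype actually performs.

```typescript
// The five content parsers named in the article.
type ParserKind = "text" | "pdf" | "csv" | "xlsx" | "edi-850";

// Hypothetical normalized message shape.
interface InboundMessage {
  channel: "email" | "edi";
  attachmentName?: string;
}

function selectParser(message: InboundMessage): ParserKind {
  // Anything arriving on the EDI channel is treated as an X12 850.
  if (message.channel === "edi") return "edi-850";

  const fileName = message.attachmentName?.toLowerCase() ?? "";
  if (fileName.endsWith(".pdf")) return "pdf";
  if (fileName.endsWith(".csv")) return "csv";
  if (fileName.endsWith(".xlsx")) return "xlsx";

  // No recognised attachment: fall back to the plain text email body.
  return "text";
}

console.log(selectParser({ channel: "email", attachmentName: "PO-77.PDF" }));
console.log(selectParser({ channel: "email" }));
console.log(selectParser({ channel: "edi", attachmentName: "order.edi" }));
```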

Extraction passes that content to Claude with a structured prompt that asks the model to identify the order fields, note confidence levels for each, and flag anything missing or ambiguous. The result is a structured order object with per-field confidence scores.
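One plausible shape for that structured order object is sketched below. The field names, the 0-to-1 confidence scale, and the `fieldsNeedingAttention` helper are all assumptions for illustration; the repo's ODM-aligned schema is the authority on the real structure.

```typescript
// A value paired with the model's confidence in it (assumed scale: 0 to 1).
interface ExtractedField<T> {
  value: T | null;
  confidence: number;
}

// Hypothetical, trimmed-down order shape.
interface ExtractedOrder {
  customerName: ExtractedField<string>;
  shippingAddress: ExtractedField<string>;
  requestedDeliveryDate: ExtractedField<string>;
  lineItems: Array<{
    sku: ExtractedField<string>;
    quantity: ExtractedField<number>;
  }>;
}

// Lists every field that is missing or below the given confidence floor.
function fieldsNeedingAttention(order: ExtractedOrder, minConfidence: number): string[] {
  const flagged: string[] = [];
  const check = (label: string, field: ExtractedField<unknown>): void => {
    if (field.value === null || field.confidence < minConfidence) flagged.push(label);
  };

  check("customerName", order.customerName);
  check("shippingAddress", order.shippingAddress);
  check("requestedDeliveryDate", order.requestedDeliveryDate);
  order.lineItems.forEach((item, index) => {
    check(`lineItems[${index}].sku`, item.sku);
    check(`lineItems[${index}].quantity`, item.quantity);
  });
  return flagged;
}

const sampleOrder: ExtractedOrder = {
  customerName: { value: "Acme Industrial", confidence: 0.97 },
  shippingAddress: { value: null, confidence: 0 },
  requestedDeliveryDate: { value: "2026-07-15", confidence: 0.9 },
  lineItems: [
    { sku: { value: "WIDGET-A", confidence: 0.95 }, quantity: { value: 10, confidence: 0.99 } },
    { sku: { value: "WDGT-B", confidence: 0.55 }, quantity: { value: 4, confidence: 0.98 } },
  ],
};

const needsAttention = fieldsNeedingAttention(sampleOrder, 0.8);
console.log(needsAttention);
```

Keeping confidence per field rather than per order is what lets later stages say which part of an order is in doubt instead of only that something is.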

Validation then runs five checks in sequence: required fields are present, the customer can be identified, the SKUs can be resolved against the product catalog, stock is available, and business rules pass. Customer resolution uses a cascade of progressively looser matches: account ID, then email domain, phone, fuzzy company name, and address. The cascade stops at the first match with sufficient confidence.
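The customer resolution cascade can be sketched as an ordered list of resolvers tried until one clears a confidence floor. This is a simplified illustration, not the repo's implementation: only three of the five steps are shown (phone and address would slot into the same list), the confidence values are made up, and the substring comparison is a crude stand-in for real fuzzy matching. All names here are hypothetical.

```typescript
interface CustomerSignals {
  accountId?: string;
  emailDomain?: string;
  companyName?: string;
}

interface CustomerMatch {
  customerId: string;
  confidence: number; // assumed scale: 0 to 1
  method: string;
}

interface Resolver {
  method: string;
  resolve(signals: CustomerSignals): { customerId: string; confidence: number } | null;
}

// Toy customer master; a real one would sit behind the customer lookup adapter.
const customerMaster = [
  { id: "C-100", accountId: "ACME-01", emailDomain: "acme.example", name: "Acme Industrial Supply" },
  { id: "C-200", accountId: "BOLT-07", emailDomain: "boltworks.example", name: "Boltworks Ltd" },
];

const normalize = (text: string): string => text.toLowerCase().replace(/[^a-z0-9]/g, "");

// Order matters: strongest signal first.
const resolvers: Resolver[] = [
  {
    method: "account-id",
    resolve: (s) => {
      const hit = customerMaster.find((c) => c.accountId === s.accountId);
      return hit ? { customerId: hit.id, confidence: 1.0 } : null;
    },
  },
  {
    method: "email-domain",
    resolve: (s) => {
      const hit = customerMaster.find((c) => c.emailDomain === s.emailDomain);
      return hit ? { customerId: hit.id, confidence: 0.9 } : null;
    },
  },
  {
    method: "company-name",
    resolve: (s) => {
      if (!s.companyName) return null;
      const wanted = normalize(s.companyName);
      if (wanted.length === 0) return null;
      const hit = customerMaster.find((c) => normalize(c.name).includes(wanted));
      return hit ? { customerId: hit.id, confidence: 0.7 } : null;
    },
  },
];

// Returns the first match that clears the floor, or null if none does.
function resolveCustomer(signals: CustomerSignals, minConfidence: number): CustomerMatch | null {
  for (const resolver of resolvers) {
    const match = resolver.resolve(signals);
    if (match && match.confidence >= minConfidence) {
      return { ...match, method: resolver.method };
    }
  }
  return null;
}

console.log(resolveCustomer({ emailDomain: "acme.example" }, 0.8));
console.log(resolveCustomer({ companyName: "ACME Industrial" }, 0.8));
```

Recording which method produced the match matters as much as the match itself, because it gives the explanation attached to the routing decision something specific to say.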

Routing evaluates the overall confidence score and the validation results against configurable thresholds and produces one of four outcomes.

Submit to ERP. All fields present, customer and products resolved, confidence above the submit threshold. The order is submitted to the ERP as a draft without human intervention.

Seek Clarification. Required fields are missing or a product code cannot be resolved. A reply is sent to the buyer listing exactly what is needed, with catalog suggestions where applicable.

Human Review. The agent has resolved most of the order but cannot commit with sufficient confidence on a specific field. The order lands in the Human Review queue with the agent's reasoning and the candidates it considered. An operator makes the final call.

Reject. The message is spam, a duplicate, or cannot be matched to a known buyer after the full resolution cascade. The order is logged to the audit trail. For EDI trading partners, a formal X12 855 Purchase Order Acknowledgement can be sent. For unknown senders, a silent reject leaves no outbound message.
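The four outcomes above can be sketched as a single decision function. This is one reading of the rules as described, not the repo's routing node: the `ValidationSummary` shape and `routeOrder` are hypothetical, stock and business-rule checks are left out for brevity, and the exact order in which conditions are tested is an assumption.

```typescript
type Outcome = "submit_to_erp" | "seek_clarification" | "human_review" | "reject";

// Hypothetical summary of what validation found.
interface ValidationSummary {
  isSpamOrDuplicate: boolean;
  customerResolved: boolean;
  missingRequiredFields: string[];
  unresolvedSkus: string[];
  overallConfidence: number; // assumed scale: 0 to 1
}

interface RoutingDecision {
  outcome: Outcome;
  explanation: string;
}

function routeOrder(v: ValidationSummary, submitThreshold: number): RoutingDecision {
  if (v.isSpamOrDuplicate || !v.customerResolved) {
    return {
      outcome: "reject",
      explanation: "Spam, duplicate, or no known buyer after the resolution cascade.",
    };
  }
  if (v.missingRequiredFields.length > 0 || v.unresolvedSkus.length > 0) {
    const needed = [
      ...v.missingRequiredFields,
      ...v.unresolvedSkus.map((sku) => `product code ${sku}`),
    ];
    return {
      outcome: "seek_clarification",
      explanation: `Buyer needs to confirm: ${needed.join(", ")}.`,
    };
  }
  if (v.overallConfidence > submitThreshold) {
    return {
      outcome: "submit_to_erp",
      explanation: "All fields present, customer and products resolved, confidence above threshold.",
    };
  }
  return {
    outcome: "human_review",
    explanation: `Confidence ${v.overallConfidence} is not above the submit threshold ${submitThreshold}.`,
  };
}

const cleanOrder: ValidationSummary = {
  isSpamOrDuplicate: false,
  customerResolved: true,
  missingRequiredFields: [],
  unresolvedSkus: [],
  overallConfidence: 0.96,
};

console.log(routeOrder(cleanOrder, 0.9));
console.log(routeOrder({ ...cleanOrder, unresolvedSkus: ["WDGT-B"] }, 0.9));
```

Returning the explanation alongside the outcome, rather than deriving it later, keeps the reason and the decision from drifting apart.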

The operator experience for Human Review is deliberate. The Agent Control Interface shows the routing reason, extracted fields, and AI reasoning for every queued order. Operators observe and decide; they do not re-key. For EDI orders requiring clarification, the interface generates a pre-addressed X12 864 Text Message with the clarification content already composed, and the operator can add context before sending.


The Agent Control Interface, Human Review queue. Seven EDI orders are waiting for an operator. Each card states, in plain language, why the agent could not commit the order on its own.

Every event in the pipeline (intake, extraction, each validation step, routing decision, operator action, ERP submission) is written to an audit log. Nothing is a black box.

The four routing outcomes as an operator sees them: the Test Corpus runner, the Human Review queue, the X12 864 clarification composer, and the reject dialog that offers a formal 855 acknowledgement or a silent reject.

Under the hood

For anyone planning to read the code, here is the shape of the system. The pipeline API is Node.js and Express; the Agent Control Interface is React built with Vite and is fully headless, consuming the pipeline only through REST endpoints under /api/v1/. Orders, the audit log, and the LangGraph checkpoints all live in PostgreSQL 16. In local development, email runs through Mailpit; in production it runs through SendGrid Inbound Parse. EDI uses ANSI X12 (850 inbound, 864 and 855 outbound). Everything ships as a Docker Compose stack.

The adapter layer is the part worth studying, because it is where the principle of composability is made concrete. Each integration point is a class behind a documented interface, selected by an environment variable.

Set INBOUND_MAIL_PROVIDER=sendgrid and the pipeline receives email by webhook instead of polling, with no graph changes. The confidence thresholds that decide auto-submit versus review are environment variables too (CONFIDENCE_SUBMIT_THRESHOLD, SKU_AUTO_THRESHOLD, CUSTOMER_AUTO_THRESHOLD, and others), so the routing policy is configuration, not code. Treat them as policy you own, not defaults you inherit.
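A minimal sketch of that pattern follows: a factory that picks an adapter from an environment value, plus a helper that reads a threshold with validation. The variable names `INBOUND_MAIL_PROVIDER` and `CONFIDENCE_SUBMIT_THRESHOLD` come from the article; everything else is an assumption, including the class names, the `mailpit` default value, the 0-to-1 range, and the fallback number. The environment is passed in as a plain object so the sketch runs without Node-specific globals.

```typescript
// Hypothetical interface for the inbound email integration point.
interface InboundMailAdapter {
  readonly mode: "polling" | "webhook";
  describe(): string;
}

class MailpitAdapter implements InboundMailAdapter {
  readonly mode = "polling" as const;
  describe(): string {
    return "Polls a local Mailpit inbox";
  }
}

class SendGridAdapter implements InboundMailAdapter {
  readonly mode = "webhook" as const;
  describe(): string {
    return "Receives SendGrid Inbound Parse webhooks";
  }
}

type Env = Record<string, string | undefined>;

// Selects the adapter by configuration; the pipeline only sees the interface.
function createInboundMailAdapter(env: Env): InboundMailAdapter {
  const provider = env["INBOUND_MAIL_PROVIDER"] ?? "mailpit"; // default is an assumption
  switch (provider) {
    case "mailpit":
      return new MailpitAdapter();
    case "sendgrid":
      return new SendGridAdapter();
    default:
      throw new Error(`Unknown INBOUND_MAIL_PROVIDER: ${provider}`);
  }
}

// Reads a threshold, failing loudly on values outside the assumed 0..1 range.
function readThreshold(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value) || value < 0 || value > 1) {
    throw new Error(`${key} must be a number between 0 and 1, got "${raw}"`);
  }
  return value;
}

const exampleEnv: Env = { INBOUND_MAIL_PROVIDER: "sendgrid", CONFIDENCE_SUBMIT_THRESHOLD: "0.9" };
const mailAdapter = createInboundMailAdapter(exampleEnv);
const submitThreshold = readThreshold(exampleEnv, "CONFIDENCE_SUBMIT_THRESHOLD", 0.85);
console.log(mailAdapter.describe(), submitThreshold);
```

Validating thresholds at startup is a small design choice with a large payoff: a typo in a policy value fails the deploy instead of silently changing which orders are auto-submitted.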

What the repo gives your team

This is a reference implementation: a complete, end-to-end agentic pipeline built to show what is possible and to be up and running on a laptop in under an hour. The repo's docs/ folder is a toolkit for taking the pattern further:

production-considerations.md walks through what a real deployment must resolve: secrets management, authentication and TLS, PII handling, retention and deletion under GDPR and CCPA, tamper-evident audit logging and SIEM export, observability, AI governance, cost controls, and rollback.
implementation-guide.md turns that into a sequenced prototype-to-pilot backlog: four phases of scoped, prioritized tasks (Required versus Recommended, sized from a day to multiple weeks) with dependencies noted inline.
setup-instructions.md is the step-by-step local setup.
business-case-template.md, together with an Excel sheet that includes sensitivity analysis and sourced assumptions, is a business-case model that lets you put numbers against your own order volume before committing.

The architecture is built to support that path. The adapter factory, configurable confidence thresholds, the audit log, and Human Review as a first-class outcome all carry directly into production. Moving from prototype to pilot is real work, but it is well-defined work, and the repo hands you the map.

Why it matters

Return to Esker's 2025 benchmark from the opening: the average touchless rate is 67%, and top-performing organizations exceed 90%. Most B2B operations are nowhere near either number yet.

The gap between current practice and that benchmark is less a technology problem than an integration and trust problem. Organizations need a system that connects to their real data: the actual customer master, product catalog, and ERP. And they need operators who can see every decision before trusting the system with live orders.

That is what this prototype is built to demonstrate, and why it ships as something you can run rather than a slide. It includes a corpus of 49 test cases covering every channel, every content format, and every routing outcome. Every decision is explained, every operator action is logged, and the confidence thresholds are configurable policy rather than fixed defaults.

The architecture makes that extensibility concrete. Swapping the ERP stub for a NetSuite or Dynamics 365 adapter means implementing one interface method and registering it. Adding a channel such as a web form or a REST endpoint means adding one adapter module. The graph does not change. The same goes for the obvious next features: order acknowledgement, price validation, minimum order quantity checks, partial fulfillment, and multi-language support are all new adapters on the same pipeline, not rewrites.

The Product Insights Agent showed what a grounded, observable agent looks like on the buyer side of a commerce interaction. This prototype shows what it looks like on the seller side, where the volume is higher, the formats are messier, and the cost of errors flows directly into fulfillment.

Explore the code

The prototype is MIT licensed, ODM aligned, and can be up and running locally in under an hour. The fastest way to understand it is to run a few orders through it and watch the audit log.

B2B Order Intake — GitHub repository

You need Docker (Engine plus Compose, or Docker Desktop on macOS/Windows), Git, and an Anthropic API key. The only required configuration change is the key.

# 1. Clone and enter the repo
git clone https://github.com/machalliance/solution-studio-b2b-order-intake.git
cd solution-studio-b2b-order-intake

# 2. Configure: copy the example file, then set ANTHROPIC_API_KEY=sk-ant-... in .env
cp .env.example .env

# 3. Start the stack
docker compose up -d

Give it about 30 seconds, then open the Agent Control Interface at http://localhost:3000 (the pipeline health check lives at http://localhost:3002/health). Every other .env value has a working local default; no database setup or external accounts are needed beyond Anthropic.

To see the pipeline work, open Test Corpus and run email-01, a clean order from a known buyer. Switch to Live Feed and watch it appear, then open it in the Audit Log to follow extraction, validation, routing, and ERP submission step by step.