Agent Harness Worker Model

What it is

The Worker Model is an alternative architecture for agent harnesses that decomposes the monolithic framework into independent, composable workers communicating over a shared bus. Each worker handles a single concern (state machine, provider routing, credential vault, policy engine, approval gate, budget tracking, hook dispatch, context compression, session tree, telemetry). Replacing one worker does not require changes to others.

The iii engine implements this through one primitive: a worker that connects over WebSocket and registers functions and triggers on a shared engine bus. The harness becomes a stack of installable workers, and "build your own" stops meaning "fork a framework" and starts meaning "swap a few workers."
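
The primitive can be illustrated in miniature. The sketch below assumes a hypothetical in-process `Bus` class in place of the real WebSocket engine; `Bus`, `registerFunction`, `call`, and the model names are illustrative assumptions, not the iii SDK. It only shows why swapping a worker is invisible to callers: they address a function id, never the worker behind it.

```typescript
// Hypothetical in-process stand-in for the engine bus. The real iii engine
// speaks WebSocket and is asynchronous; this sketch is synchronous for brevity.
type Handler = (payload: unknown) => unknown;

class Bus {
  private handlers = new Map<string, Handler>();

  // A worker claims a function id. Re-registering replaces the previous owner.
  registerFunction(id: string, handler: Handler): void {
    this.handlers.set(id, handler);
  }

  // Callers know only the function id, never which worker serves it.
  call(id: string, payload: unknown): unknown {
    const handler = this.handlers.get(id);
    if (!handler) throw new Error(`no worker registered for ${id}`);
    return handler(payload);
  }
}

const bus = new Bus();

// A static catalog worker (model names are made up for the example).
bus.registerFunction("models::list", () => ["model-a", "model-b"]);

// Swapping in a different worker is a re-registration of the same id.
function installDynamicCatalog(target: Bus): void {
  target.registerFunction("models::list", () => ["model-a", "model-b", "model-c"]);
}
```

Calling `installDynamicCatalog(bus)` changes what `bus.call("models::list", null)` returns without touching any caller.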

Why it matters

Current mainstream agent frameworks (LangChain, LangGraph, OpenAI Agents SDK, Anthropic SDK, CrewAI, AutoGen) bundle fifteen distinct concerns into a single import. Teams that work with agents long enough eventually outgrow these bundled choices and rewrite their harness from scratch. The Worker Model keeps each choice in the builder's hands.

Mike Piccolo's core argument: a harness is not a thing you install. It is a set of jobs your system has to do for an agent to run durably, safely, and observably. The framework era bundled those jobs together because nothing underneath gave you a way to compose them. The Worker Model gives you that substrate.

The 15 jobs an agent harness has to do

#	Job	iii Worker	What it does
1	Accept turn request	harness::trigger	Receives client POST, persists request, seeds OTel trace
2	Resolve credentials	auth-credentials	Pulls provider API keys via auth::get_token
3	Model capability lookup	models-catalog	Registers models::list, models::get, models::supports
4	Drive per-turn state machine	turn-orchestrator	Provisions, streams assistant, runs tools, steers, tears down
5	Load skill bodies	iii-directory	directory::skills::download / directory::skills::get
6	Assemble system prompt	turn-orchestrator	Mode paragraph + identity preamble + default skills appendix
7	Stream tokens to client	assistant-streaming	provider::<name>::stream via SSE → iii channel → UI fanout
8	Policy check before tool call	policy engine	consultBefore calls policy::check_permissions (allow/deny/needs_approval)
9	Pause for human approval	approval-gate	Parks calls in awaiting_approval; wakes via turn::on_approval trigger
10	Track LLM spend	llm-budget	budget::record per call; enforces per-workspace/per-agent caps
11	Run hooks before/after tool calls	hook-fanout	publish_collect; short-circuits when no subscriber registered
12	Persist session as branching tree	session	External state storage for forks and resumes
13	Compact session history	context-compaction	Subscribes to agent::turn_end; compacts per-turn instead of per-event
14	Emit event stream for UI	harness meta-worker	agent::events fanout for UI subscription
15	OpenTelemetry trace across steps	automatic instrumentation	src/runtime/worker.ts wraps every registerFunction in Proxy
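
Row 15 says tracing is applied by wrapping every registerFunction call in a Proxy. A minimal sketch of that idea follows. The `Worker` interface, `createWorker`, `instrument`, and the `spans` array are assumptions made for illustration; they stand in for the real runtime in src/runtime/worker.ts and for an OpenTelemetry exporter, neither of which is shown in the source.

```typescript
// Assumed minimal worker shape, not the real iii runtime.
type Handler = (payload: unknown) => unknown;
interface Worker {
  registerFunction(id: string, handler: Handler): void;
  invoke(id: string, payload: unknown): unknown;
}

// Stand-in for a trace exporter: one entry per handler invocation.
interface Span { name: string; ok: boolean; }
const spans: Span[] = [];

function createWorker(): Worker {
  const handlers = new Map<string, Handler>();
  return {
    registerFunction(id, handler) { handlers.set(id, handler); },
    invoke(id, payload) {
      const handler = handlers.get(id);
      if (!handler) throw new Error(`unknown function ${id}`);
      return handler(payload);
    },
  };
}

// Intercept registerFunction and substitute a traced handler, so worker
// authors never add tracing by hand. Other properties pass through untouched.
function instrument(worker: Worker): Worker {
  return new Proxy(worker, {
    get(target, prop, receiver) {
      if (prop !== "registerFunction") return Reflect.get(target, prop, receiver);
      return (id: string, handler: Handler): void => {
        target.registerFunction(id, (payload) => {
          try {
            const result = handler(payload);
            spans.push({ name: id, ok: true });
            return result;
          } catch (err) {
            spans.push({ name: id, ok: false });
            throw err;
          }
        });
      };
    },
  });
}

const worker = instrument(createWorker());
worker.registerFunction("budget::record", (payload) => payload);
worker.invoke("budget::record", { cost: 1 });
```

After the last line, `spans` holds one successful entry named budget::record, even though the handler itself contains no tracing code.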

How the loop runs

Client POST → harness::trigger → run::start → turn-orchestrator
                                               ↓
                              provisioning → boot sandbox + download skills + assemble prompt
                                               ↓
                         assistant_streaming → provider::<name>::stream
                                               ↓
                               tool calls → function_execute → dispatchWithHook
                                               ↓
                                    consultBefore → policy::check_permissions
                                               ↓
                                    allow / deny / needs_approval
                                               ↓
                               batch completes → steering_check → continue / stop / max_turns
                                               ↓
                                    finishSession() → teardown inline

Fail-closed by construction: if the policy worker is unreachable or its 5-second timeout fires, consultBefore denies with gate_unavailable. If iii::durable::publish errors, the hook fanout returns publish_failed: true and the orchestrator treats the result as a deny.
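
The fail-closed gate can be sketched as a race between the policy call and a timer, where every failure path resolves to a deny. This is a sketch under assumptions: the decision values, the 5-second default, and the gate_unavailable reason come from the text, while the `PolicyCheck` signature, the `GateResult` shape, and the `ToolCall` type are hypothetical.

```typescript
type Decision = "allow" | "deny" | "needs_approval";
interface GateResult { decision: Decision; reason?: string; }
interface ToolCall { name: string; }
// Assumed stand-in for a bus call to policy::check_permissions.
type PolicyCheck = (toolCall: ToolCall) => Promise<Decision>;

const UNAVAILABLE: GateResult = { decision: "deny", reason: "gate_unavailable" };

async function consultBefore(
  check: PolicyCheck,
  toolCall: ToolCall,
  timeoutMs = 5000,
): Promise<GateResult> {
  let cancelTimer = (): void => {};
  // The timer resolves (never rejects) with a deny, so a slow gate fails closed.
  const timeout = new Promise<GateResult>((resolve) => {
    const timer = setTimeout(() => resolve(UNAVAILABLE), timeoutMs);
    cancelTimer = () => clearTimeout(timer);
  });
  // Any rejection or synchronous throw from the policy call also becomes a deny.
  const attempt = Promise.resolve()
    .then(() => check(toolCall))
    .then(
      (decision): GateResult => ({ decision }),
      (): GateResult => UNAVAILABLE,
    );
  try {
    return await Promise.race([attempt, timeout]);
  } finally {
    cancelTimer();
  }
}
```

The design point is that no code path yields allow unless the policy worker actually answered allow in time.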

Approval wake is reactive and shared: a single turn::on_approval state trigger on the approvals scope covers every session. There are no per-call resume functions and no startup re-scan.
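
A rough sketch of the shared wake follows. It assumes a hypothetical in-memory `ApprovalScope` class standing in for the engine's state store; only the trigger name turn::on_approval and the awaiting_approval status come from the text. One handler is registered once and serves parked calls from any session.

```typescript
// Hypothetical state store for the "approvals" scope.
type Status = "awaiting_approval" | "approved" | "denied";
interface ParkedCall { sessionId: string; callId: string; status: Status; }
type Trigger = (call: ParkedCall) => void;

class ApprovalScope {
  private calls = new Map<string, ParkedCall>();
  private triggers: Trigger[] = [];

  // Registered once at worker startup, not once per parked call.
  onChange(trigger: Trigger): void {
    this.triggers.push(trigger);
  }

  park(sessionId: string, callId: string): void {
    this.calls.set(callId, { sessionId, callId, status: "awaiting_approval" });
  }

  // Writing a decision into the scope is what fires the trigger.
  // Unknown or already-resolved calls are ignored.
  resolve(callId: string, approved: boolean): void {
    const parked = this.calls.get(callId);
    if (!parked || parked.status !== "awaiting_approval") return;
    const updated: ParkedCall = { ...parked, status: approved ? "approved" : "denied" };
    this.calls.set(callId, updated);
    for (const trigger of this.triggers) trigger(updated);
  }
}

const approvals = new ApprovalScope();
const woken: string[] = [];

// The single shared handler, standing in for turn::on_approval.
approvals.onChange((call) => {
  woken.push(`${call.sessionId}:${call.callId}:${call.status}`);
});

approvals.park("session-1", "call-a");
approvals.park("session-2", "call-b");
approvals.resolve("call-b", true);
approvals.resolve("call-a", false);
```

Both sessions are woken by the same handler, in the order their decisions arrive.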

Key design properties

Thin vs thick is a slider, not a fork

Thin harness: turn-orchestrator + provider + auth + minimal meta-worker. No approvals, no budgets, no policy. Run anything. Trust the model.
Thick harness: all 13+ workers + custom policy + custom approval + Slack integration + budget caps.
The distance between them is a config change, not a rewrite.
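
As a rough illustration of the slider, the two ends can be written as worker lists. The real configuration format is not shown in the source, so the list shape and the `workersToAdd` helper are assumptions; worker names follow the jobs table where one exists, and "provider-example", "policy", "harness", and "slack-approvals" are placeholders invented for this sketch.

```typescript
// Thin end of the slider: orchestrator, one provider, auth, minimal meta-worker.
const thin: string[] = ["turn-orchestrator", "provider-example", "auth-credentials", "harness"];

// Thick end: everything in thin plus the optional concerns.
const thick: string[] = [
  ...thin,
  "models-catalog", "iii-directory", "assistant-streaming", "policy",
  "approval-gate", "llm-budget", "hook-fanout", "session",
  "context-compaction", "slack-approvals",
];

// Moving along the slider means installing the difference, nothing more.
function workersToAdd(current: string[], target: string[]): string[] {
  const installed = new Set(current);
  return target.filter((name) => !installed.has(name));
}

const upgrade = workersToAdd(thin, thick);
```

Here `upgrade` lists only the added workers; nothing already installed has to change.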

Composition gives you local rewrite freedom

The turn-orchestrator's FSM was recently refactored from 11 states to 7, its per-call approval_resume mechanisms were deleted, and tearing_down was inlined into finishSession(). Every other worker stayed unchanged. The approval::resolve wire shape didn't move. This is the property composition gives you: a major internal rewrite of one worker is self-contained because every neighbor talks through bus-level function ids.

Replacing a layer is writing a worker

Five concrete examples from iii:

Replace model catalog with live API: write a worker registering models::list/get/supports, fetch from provider endpoint, cache. iii worker add your-org/dynamic-models-catalog. Stop static worker. Orchestrator never knows.
Add a new provider: one folder, one iii.worker.yaml, one register.ts. Publish or keep local. Turn-orchestrator picks provider by run's provider field.
Serve skills from private artifact store: write directory::skills::get/list backed by internal docs or S3. Disconnect default iii-directory. Orchestrator's bootstrap keeps working.
Override system prompt entirely: pass system_prompt on run::start. Orchestrator uses verbatim string, skips assembly. Skill download still runs.
Replace approval UI surface: default approval-gate registers approval::resolve. Want Slack approvals? Write a Slack worker listening for /approve and /deny slash commands, calling approval::resolve. Orchestrator never knows. Existing approval-gate stays untouched.
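
The Slack example can be sketched as a small command handler. The source names the approval::resolve function id and the /approve and /deny commands but not the wire shape, so `ResolvePayload`, `BusCall`, and `handleSlashCommand` below are hypothetical, and no real Slack SDK calls are used.

```typescript
// Assumed payload for approval::resolve; the real wire shape is not given in the source.
interface ResolvePayload { call_id: string; decision: "allow" | "deny"; }
// Assumed stand-in for invoking a function on the engine bus.
type BusCall = (functionId: string, payload: ResolvePayload) => void;

// Maps a slash command to an approval::resolve call.
// Returns true when the command was recognized and forwarded.
function handleSlashCommand(command: string, text: string, call: BusCall): boolean {
  const callId = text.trim();
  if (callId === "") return false; // nothing to resolve without a call id
  if (command === "/approve") {
    call("approval::resolve", { call_id: callId, decision: "allow" });
    return true;
  }
  if (command === "/deny") {
    call("approval::resolve", { call_id: callId, decision: "deny" });
    return true;
  }
  return false; // unrelated commands are ignored
}

// Usage with a recording stub in place of the bus.
const sent: Array<{ id: string; payload: ResolvePayload }> = [];
const record: BusCall = (id, payload) => { sent.push({ id, payload }); };

handleSlashCommand("/approve", " call-42 ", record);
handleSlashCommand("/deny", "call-43", record);
```

The handler only translates between Slack's surface and the existing function id, which is why the orchestrator and the default approval-gate need no changes.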

Evidence across sources

Source	Key Claim	Relevance
Mike Piccolo — How to Build Your Own Agent Harness	Full 15-job decomposition with production stack from iii-hq/workers	Primary source; detailed architecture and replacement examples
AI Briefing 2026-05-29 Morning	Worker model replaces monolithic harness with composable workers on shared bus	Initial architectural framing
Nick Nisi — Case Harness	Gate design, SHA-256 verification, evidence-driven review	Shows same concerns (policy, verification, approval) implemented at application layer
Cole Medin — Harness Engineering	Skills + hooks + mindset shift from context to control	Methodology layer that runs on top of any harness architecture

Counterpoints & Gaps

Performance overhead: bus communication via WebSocket vs direct function calls. Not benchmarked in source.
Operational complexity: running 10-15 independent workers vs one monolithic process. DevOps surface increases.
Adoption friction: iii is new (2026-05). No large production deployments publicly documented yet.
LangChain/LangGraph response: both frameworks are actively adding modularity (LangSmith Engine, SmithDB, Context Hub). The gap may narrow.

Open questions

Has any production system outside iii implemented the full fifteen-job decomposition as independent workers?
What is the latency overhead of bus communication vs in-process calls under high throughput?
How does cross-cutting auth/logging work when every worker is independent?
Can the Worker Model apply to personal/small-team harnesses, or is it only for platform-scale infra?

Related

harness-engineering/overview — MOC for harness concepts
harness-engineering/what-is-agent-harness — Definition of harness as model-external shell
harness-engineering/lightweight-vs-orchestration-harness — Thin vs thick harness tradeoffs
harness-engineering/multi-agent-coordination-patterns — Multi-agent orchestration patterns
harness-engineering/zero-trust-ai-agents — Security and verification in agent systems
product-trends/agent-native-architecture — Agent-native product architecture trends
