Agent Harness Worker Model
What it is
The Worker Model is an alternative architecture for agent harnesses that decomposes the monolithic framework into independent, composable workers communicating over a shared bus. Each worker handles a single concern (state machine, provider routing, credential vault, policy engine, approval gate, budget tracking, hook dispatch, context compression, session tree, telemetry). Replacing one worker does not require changes to others.
The iii engine implements this through one primitive: a worker that connects over WebSocket and registers functions and triggers on a shared engine bus. The harness becomes a stack of installable workers, and "build your own" stops meaning "fork a framework" and starts meaning "swap a few workers."
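As a rough illustration of the primitive described above, the sketch below uses an in-memory map in place of the engine bus. The `Bus` class and its `register` and `call` methods are hypothetical names chosen for this example, not the iii API. The real engine connects workers over WebSocket and also supports triggers, neither of which is modeled here.

```typescript
// Hypothetical in-memory stand-in for the shared engine bus.
type Handler = (payload: unknown) => unknown;

class Bus {
  private functions = new Map<string, Handler>();

  // Registering under an existing id replaces the previous handler,
  // which is all that "swapping a worker" amounts to in this sketch.
  register(id: string, handler: Handler): void {
    this.functions.set(id, handler);
  }

  call(id: string, payload: unknown): unknown {
    const handler = this.functions.get(id);
    if (!handler) throw new Error(`no worker registered for ${id}`);
    return handler(payload);
  }
}

const bus = new Bus();

// A static catalog worker (model names are placeholders).
bus.register("models::list", () => ["model-a", "model-b"]);

// The caller knows only the function id, never the worker behind it.
const before = bus.call("models::list", {}) as string[];

// Swap in a different worker under the same id; the caller is unchanged.
bus.register("models::list", () => ["model-a", "model-b", "model-c"]);
const after = bus.call("models::list", {}) as string[];
```

The point of the sketch is that the call site never changes when the implementation behind `models::list` does.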
Why it matters
Current mainstream agent frameworks (LangChain, LangGraph, OpenAI Agents SDK, Anthropic SDK, CrewAI, AutoGen) bundle fifteen distinct concerns into a single import. Long-running agent teams eventually outgrow these bundled choices and rewrite their harness from scratch. The Worker Model keeps the choice in the builder's hands.
Mike Piccolo's core argument: a harness is not a thing you install. It is a set of jobs your system has to do for an agent to run durably, safely, and observably. The framework era bundled those jobs together because nothing underneath gave you a way to compose them. The Worker Model gives you that substrate.
The 15 jobs an agent harness has to do
| # | Job | iii Worker | What it does |
|---|---|---|---|
| 1 | Accept turn request | harness::trigger | Receives client POST, persists request, seeds OTel trace |
| 2 | Resolve credentials | auth-credentials | Pulls provider API keys via auth::get_token |
| 3 | Model capability lookup | models-catalog | Registers models::list, models::get, models::supports |
| 4 | Drive per-turn state machine | turn-orchestrator | Provisions, streams assistant, runs tools, steers, tears down |
| 5 | Load skill bodies | iii-directory | directory::skills::download / directory::skills::get |
| 6 | Assemble system prompt | turn-orchestrator | Mode paragraph + identity preamble + default skills appendix |
| 7 | Stream tokens to client | assistant-streaming | provider:: |
| 8 | Policy check before tool call | policy engine | consultBefore calls policy::check_permissions (allow/deny/needs_approval) |
| 9 | Pause for human approval | approval-gate | Parks calls in awaiting_approval; wakes via turn::on_approval trigger |
| 10 | Track LLM spend | llm-budget | budget::record per call; enforces per-workspace/per-agent caps |
| 11 | Run hooks before/after tool calls | hook-fanout | publish_collect; short-circuits when no subscriber registered |
| 12 | Persist session as branching tree | session | External state storage for forks and resumes |
| 13 | Compact session history | context-compaction | Subscribes to agent::turn_end; compacts per-turn instead of per-event |
| 14 | Emit event stream for UI | harness meta-worker | agent::events fanout for UI subscription |
| 15 | OpenTelemetry trace across steps | automatic instrumentation | src/runtime/worker.ts wraps every registerFunction in Proxy |
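Row 15 says instrumentation comes from wrapping every registerFunction in a Proxy. A minimal sketch of that idea follows, assuming a worker object with `registerFunction` and `invoke` methods. `HarnessWorker`, `createWorker`, `instrument`, and the `spans` array are hypothetical; real instrumentation would emit OpenTelemetry spans rather than push to an array.

```typescript
type Handler = (payload: unknown) => unknown;

// Hypothetical worker surface, not the iii SDK.
interface HarnessWorker {
  registerFunction(id: string, handler: Handler): void;
  invoke(id: string, payload: unknown): unknown;
}

interface RecordedSpan {
  functionId: string;
  ok: boolean;
}

const spans: RecordedSpan[] = [];

function createWorker(): HarnessWorker {
  const handlers = new Map<string, Handler>();
  return {
    registerFunction(id, handler) {
      handlers.set(id, handler);
    },
    invoke(id, payload) {
      const handler = handlers.get(id);
      if (!handler) throw new Error(`unknown function ${id}`);
      return handler(payload);
    },
  };
}

// Intercept registerFunction so every handler is wrapped before it is stored.
function instrument(worker: HarnessWorker): HarnessWorker {
  return new Proxy(worker, {
    get(target, prop, receiver) {
      if (prop !== "registerFunction") return Reflect.get(target, prop, receiver);
      return (id: string, handler: Handler): void => {
        const traced: Handler = (payload) => {
          try {
            const result = handler(payload);
            spans.push({ functionId: id, ok: true });
            return result;
          } catch (error) {
            spans.push({ functionId: id, ok: false });
            throw error;
          }
        };
        target.registerFunction(id, traced);
      };
    },
  });
}

const worker = instrument(createWorker());
worker.registerFunction("budget::record", (payload) => payload);
worker.invoke("budget::record", { tokens: 10 });
```

Worker authors write plain handlers; the wrapper is applied at registration time, so no handler needs tracing code of its own.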
How the loop runs
Client POST → harness::trigger → run::start → turn-orchestrator
↓
provisioning → boot sandbox + download skills + assemble prompt
↓
assistant_streaming → provider::<name>::stream
↓
tool calls → function_execute → dispatchWithHook
↓
consultBefore → policy::check_permissions
↓
allow / deny / needs_approval
↓
batch completes → steering_check → continue / stop / max_turns
↓
finishSession() → teardown inline
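The flow above can be sketched as a small loop. This is an illustration only: the stage names are taken from the diagram, but `runTurnLoop`, its parameters, and the rule that a step with no tool calls means "stop" are assumptions made for the example. Policy checks and hooks are left out.

```typescript
type Outcome = "stop" | "max_turns";

interface AssistantStep {
  text: string;
  toolCalls: string[];
}

function runTurnLoop(
  stream: (turn: number) => AssistantStep, // stands in for provider::<name>::stream
  runTool: (name: string) => string,       // stands in for function_execute
  maxTurns: number,
): { outcome: Outcome; trace: string[] } {
  const trace: string[] = ["provisioning"];
  for (let turn = 0; turn < maxTurns; turn++) {
    trace.push("assistant_streaming");
    const step = stream(turn);
    if (step.toolCalls.length > 0) {
      trace.push("function_execute");
      for (const name of step.toolCalls) runTool(name);
    }
    trace.push("steering_check");
    // Assumed steering rule: no tool calls means the assistant is done.
    if (step.toolCalls.length === 0) {
      trace.push("finishSession");
      return { outcome: "stop", trace };
    }
  }
  trace.push("finishSession");
  return { outcome: "max_turns", trace };
}

// One tool call on the first turn, then a plain answer.
const stopped = runTurnLoop(
  (turn) => (turn === 0 ? { text: "", toolCalls: ["search"] } : { text: "done", toolCalls: [] }),
  () => "ok",
  5,
);

// A model that never stops calling tools hits the turn cap.
const capped = runTurnLoop(() => ({ text: "", toolCalls: ["search"] }), () => "ok", 2);
```

Both exits pass through the same `finishSession` step, mirroring the inline teardown in the diagram.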
Fail-closed by construction: if the policy worker is unreachable or the 5-second timeout fires, consultBefore denies with gate_unavailable. If iii::durable::publish errors, hook fanout returns publish_failed: true and the orchestrator treats it as a deny.
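A minimal sketch of the fail-closed policy path, assuming the policy call is an async function that resolves to one of the three decisions. Only `consultBefore`, the 5-second default, and the `gate_unavailable` reason come from the text; `withTimeout`, `PolicyCheck`, and `GateResult` are hypothetical helpers, and the hook fanout path is not shown.

```typescript
type Decision = "allow" | "deny" | "needs_approval";

interface GateResult {
  decision: Decision;
  reason?: string;
}

// Hypothetical stand-in for a bus call to policy::check_permissions.
type PolicyCheck = (toolName: string) => Promise<Decision>;

// Reject if the work does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("timeout")), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

async function consultBefore(
  check: PolicyCheck,
  toolName: string,
  timeoutMs = 5000,
): Promise<GateResult> {
  try {
    const decision = await withTimeout(check(toolName), timeoutMs);
    return { decision };
  } catch {
    // Any failure (unreachable worker, thrown error, timeout) becomes a deny.
    return { decision: "deny", reason: "gate_unavailable" };
  }
}
```

The design choice worth noting is that there is no code path where an error yields "allow": the only way to run a tool is for the policy call to succeed and say so.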
Approval wake is reactive and shared: one turn::on_approval state trigger on scope approvals covers every session. No per-call resume functions. No startup re-scan.
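To make the "one shared trigger" idea concrete, here is a sketch under stated assumptions: state is a scoped key-value store that fires registered triggers on every write, and parked calls are tracked in a map. `ScopedState`, `onSet`, `parked`, and the `Approval` record shape are all hypothetical; only the `approvals` scope and the single-trigger design come from the text.

```typescript
type ApprovalDecision = "allow" | "deny";

interface Approval {
  sessionId: string;
  callId: string;
  decision: ApprovalDecision;
}

type StateTrigger = (key: string, value: Approval) => void;

// Hypothetical scoped state store that fires triggers on write.
class ScopedState {
  private triggers = new Map<string, StateTrigger[]>();

  onSet(scope: string, trigger: StateTrigger): void {
    const list = this.triggers.get(scope) ?? [];
    list.push(trigger);
    this.triggers.set(scope, list);
  }

  set(scope: string, key: string, value: Approval): void {
    for (const trigger of this.triggers.get(scope) ?? []) trigger(key, value);
  }
}

const state = new ScopedState();
const parked = new Map<string, string>(); // callId -> sessionId, awaiting approval
const woken: string[] = [];

// One trigger on the approvals scope serves every session.
state.onSet("approvals", (_key, approval) => {
  if (!parked.delete(approval.callId)) return; // not parked, nothing to wake
  woken.push(`${approval.sessionId}:${approval.callId}:${approval.decision}`);
});

// Two different sessions park a call each.
parked.set("call-1", "session-a");
parked.set("call-2", "session-b");

// Resolutions arrive as writes to the shared scope.
state.set("approvals", "call-1", { sessionId: "session-a", callId: "call-1", decision: "allow" });
state.set("approvals", "call-2", { sessionId: "session-b", callId: "call-2", decision: "deny" });
```

Because the wake-up is driven by the write itself, nothing has to poll or re-scan for pending approvals.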
Key design properties
Thin vs thick is a slider, not a fork
- Thin harness: turn-orchestrator + provider + auth + minimal meta-worker. No approvals, no budgets, no policy. Run anything. Trust the model.
- Thick harness: all 13+ workers + custom policy + custom approval + Slack integration + budget caps.
- The distance between them is a config change, not a rewrite.
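One way to picture the slider is as a list of installed workers. The sketch below is an assumption about shape, not the iii configuration format: `HarnessConfig` and `enabledGates` are hypothetical, and the worker names are taken from the table and bullets in this note.

```typescript
// Hypothetical config shape: a harness is the list of workers it runs.
interface HarnessConfig {
  workers: string[];
}

const thin: HarnessConfig = {
  workers: ["turn-orchestrator", "provider", "auth-credentials", "harness"],
};

// The thick stack extends the thin one instead of replacing it.
const thick: HarnessConfig = {
  workers: [
    ...thin.workers,
    "policy",
    "approval-gate",
    "llm-budget",
    "hook-fanout",
    "session",
    "context-compaction",
  ],
};

// Which optional safety gates does a given stack include?
function enabledGates(config: HarnessConfig): string[] {
  const gates = ["policy", "approval-gate", "llm-budget"];
  return gates.filter((gate) => config.workers.indexOf(gate) !== -1);
}

const thinGates = enabledGates(thin);
const thickGates = enabledGates(thick);
```

Moving from thin to thick adds entries to the list; the four core workers are untouched.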
Composition gives you local rewrite freedom
The turn-orchestrator's FSM was recently refactored from 11 states to 7, its per-call approval_resume mechanisms were deleted, and tearing_down was inlined into finishSession(). Every other worker stayed unchanged, and the approval::resolve wire shape didn't move. This is the property composition gives you: a major internal rewrite of one worker is self-contained because every neighbor talks to it through bus-level function ids.
Replacing a layer is writing a worker
Five concrete examples from iii:
- Replace model catalog with live API: write a worker registering models::list/get/supports, fetch from provider endpoint, cache. iii worker add your-org/dynamic-models-catalog. Stop static worker. Orchestrator never knows.
- Add a new provider: one folder, one iii.worker.yaml, one register.ts. Publish or keep local. Turn-orchestrator picks provider by run's provider field.
- Serve skills from private artifact store: write directory::skills::get/list backed by internal docs or S3. Disconnect default iii-directory. Orchestrator's bootstrap keeps working.
- Override system prompt entirely: pass system_prompt on run::start. Orchestrator uses verbatim string, skips assembly. Skill download still runs.
- Replace approval UI surface: default approval-gate registers approval::resolve. Want Slack approvals? Write a Slack worker listening for /approve and /deny slash commands, calling approval::resolve. Orchestrator never knows. Existing approval-gate stays untouched.
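The Slack example from the list above might look roughly like this. The sketch leaves out Slack's HTTP and signing details and focuses on the mapping from slash command to bus call. The `ResolvePayload` fields, the `BusCall` type, and `handleSlashCommand` are assumptions for illustration; the source does not document the actual approval::resolve wire shape.

```typescript
type ResolveDecision = "allow" | "deny";

// Hypothetical payload; the real approval::resolve shape is not given in the source.
interface ResolvePayload {
  call_id: string;
  decision: ResolveDecision;
}

// Stand-in for invoking a function on the engine bus.
type BusCall = (functionId: string, payload: ResolvePayload) => void;

// Map a slash command to an approval::resolve call.
// Returns false when the command is not an approval command or has no call id.
function handleSlashCommand(command: string, text: string, call: BusCall): boolean {
  const callId = text.trim();
  if (callId === "") return false;

  let decision: ResolveDecision;
  if (command === "/approve") decision = "allow";
  else if (command === "/deny") decision = "deny";
  else return false;

  call("approval::resolve", { call_id: callId, decision });
  return true;
}

// Record bus calls in memory so the behavior is easy to inspect.
const sent: Array<{ functionId: string; payload: ResolvePayload }> = [];
const record: BusCall = (functionId, payload) => {
  sent.push({ functionId, payload });
};

const handledApprove = handleSlashCommand("/approve", " call-42 ", record);
const handledOther = handleSlashCommand("/status", "call-42", record);
```

The worker depends only on the function id it calls, so it can run alongside the default approval-gate without either knowing about the other.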
Evidence across sources
| Source | Key Claim | Relevance |
|---|---|---|
| Mike Piccolo — How to Build Your Own Agent Harness | Full 15-job decomposition with production stack from iii-hq/workers | Primary source; detailed architecture and replacement examples |
| AI Briefing 2026-05-29 Morning | Worker model replaces monolithic harness with composable workers on shared bus | Initial architectural framing |
| Nick Nisi — Case Harness | Gate design, SHA-256 verification, evidence-driven review | Shows same concerns (policy, verification, approval) implemented at application layer |
| Cole Medin — Harness Engineering | Skills + hooks + mindset shift from context to control | Methodology layer that runs on top of any harness architecture |
Counterpoints & Gaps
- Performance overhead: bus communication via WebSocket vs direct function calls. Not benchmarked in source.
- Operational complexity: running 10-15 independent workers vs one monolithic process. DevOps surface increases.
- Adoption friction: iii is new (2026-05). No large production deployments publicly documented yet.
- LangChain/LangGraph response: both frameworks are actively adding modularity (LangSmith Engine, SmithDB, Context Hub). The gap may narrow.
Open questions
- Has any production system outside iii implemented the full fifteen-job decomposition as independent workers?
- What is the latency overhead of bus communication vs in-process calls under high throughput?
- How does cross-cutting auth/logging work when every worker is independent?
- Can the Worker Model apply to personal/small-team harnesses, or is it only for platform-scale infra?
Related
- harness-engineering/overview — MOC for harness concepts
- harness-engineering/what-is-agent-harness — Definition of harness as model-external shell
- harness-engineering/lightweight-vs-orchestration-harness — Thin vs thick harness tradeoffs
- harness-engineering/multi-agent-coordination-patterns — Multi-agent orchestration patterns
- harness-engineering/zero-trust-ai-agents — Security and verification in agent systems
- product-trends/agent-native-architecture — Agent-native product architecture trends