Agent Harness Engineering

Source: Addy Osmani, 2026-04-19

Core Thesis

A decent model with a great harness beats a great model with a bad harness.

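For the past two years the industry debate has centered on the model itself: which one is smarter, which writes cleaner React, which hallucinates less. That covers only half the system. The model is just one input to an agent. The rest is the harness: prompts, tools, context policies, hooks, sandboxes, subagents, feedback loops, and recovery paths.

Vivek Trivedy coined the term harness engineering, and his "Anatomy of an Agent Harness" is the clearest derivation of it. Dex Horthy has traced how the pattern emerged. HumanLayer frames most agent failures as a "configuration problem" rather than a model-weights problem. The Anthropic engineering team has published the best public analysis of harness design for long-running work.

What Is a Harness

Viv captures the essence in one line: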

Agent = Model + Harness. If you're not the model, you're the harness.

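The harness is every piece of code, configuration, and execution logic that is not the model itself. A raw model is not an agent; the harness gives it state, tool execution, feedback loops, and enforceable constraints.

Concretely, this includes:

System prompts, CLAUDE.md, AGENTS.md, skill files, subagent prompts
Tools, skills, MCP servers, and their descriptions
Bundled infrastructure (filesystem, sandbox, browser)
Orchestration logic (subagent spawning, handoffs, model routing)
Hooks and middleware (compaction, continuation, lint checks)
Observability (logs, traces, cost and latency metering)

Simon Willison boils the loop down to its essence: an agent is a system that "runs tools in a loop to achieve a goal." The skill lies in designing the tools and the loop.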
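
To make "tools in a loop" concrete, here is a minimal sketch of such a loop. It is an illustration only: `Action`, `run_agent`, and `scripted_model` are hypothetical names, and the scripted function stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    tool: Optional[str]  # None means the model considers the goal achieved
    argument: str = ""


def run_agent(
    model: Callable[[List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    goal: str,
    max_steps: int = 10,
) -> List[str]:
    """Ask the model for an action, run the tool, feed the result back."""
    history = [f"goal: {goal}"]
    for _ in range(max_steps):  # the step cap is part of the harness, not the model
        action = model(history)
        if action.tool is None:
            break
        result = tools[action.tool](action.argument)
        history.append(f"{action.tool}({action.argument!r}) -> {result}")
    return history


def scripted_model(history: List[str]) -> Action:
    # Hypothetical stand-in for an LLM: call one tool, then stop.
    if len(history) == 1:
        return Action("upper", "hello")
    return Action(None)


transcript = run_agent(scripted_model, {"upper": str.upper}, "shout hello")
```

Everything outside the `model` callable here (the tool table, the step cap, the history format) is harness.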

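Claude Code, Cursor, Codex, Aider, and Cline are all harnesses. The underlying model is sometimes the same, but the behavior you experience is determined mostly by the harness.

Reframing the "Skill Issue"

The pattern engineers often fall into: the agent does something dumb → the engineer blames the model → the issue gets filed under "wait for the next version."

Harness engineering rejects that default. Failures are usually legible:

The agent doesn't know a convention → add it to AGENTS.md
The agent ran a destructive command → add a hook that blocks it
The agent gets lost in a 40-step task → split it into a planner and an executor
The agent keeps "finishing" broken code → wire typecheck back-pressure into the loop

HumanLayer: "it's not a model problem. It's a configuration problem."

A Terminal Bench 2.0 data point: Claude Opus 4.6 scores far lower inside Claude Code than the same model does in a custom harness. Viv's team moved a coding agent from Top 30 to Top 5 by changing only the harness.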

The gap between what today's models can do and what you see them doing is largely a harness gap.

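The Ratchet: Every Mistake Becomes a Rule

The most important habit in harness engineering is treating agent mistakes as permanent signals, not as one-off anecdotes or "bad runs."

Say an agent ships a PR with commented-out tests and you merge it by accident:

The next version of AGENTS.md says "never comment out tests; delete them or fix them"
The next version of the pre-commit hook greps for .skip( and xit(
The next version of the reviewer subagent flags commented-out tests as a blocker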
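
The pre-commit rule could look roughly like the sketch below. The function name and sample input are hypothetical; only the two patterns, `.skip(` and `xit(`, come from the text. A real hook would run this over staged files and exit non-zero on any hit.

```python
import re
from typing import List

# The two markers named above. The word boundary before `xit(` keeps
# ordinary calls such as `exit(` from matching.
DISABLED_TEST = re.compile(r"\.skip\(|\bxit\(")


def find_disabled_tests(source: str) -> List[int]:
    """Return the 1-based line numbers that contain a disabled-test marker."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if DISABLED_TEST.search(line)
    ]


sample = """it('adds', () => {})
it.skip('subtracts', () => {})
xit('divides', () => {})
process.exit(1)
"""

hits = find_disabled_tests(sample)
```

On this sample, `hits` contains lines 2 and 3 and ignores the `process.exit(1)` call.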

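Every line of a good AGENTS.md should trace back to a specific failure.

Harness engineering is a discipline, not a framework. The right harness for your codebase is shaped by your own failure history; it cannot be downloaded.

Working Backwards from Behavior

Viv's most useful design framework: start from the behavior you want and derive the harness components from it.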

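Desired behavior	Harness component
Use real, persistent data	filesystem + git
Write and execute code	bash + code execution
Safe execution + defaults	sandboxed environments + tooling
Remember new knowledge	memory files + web search + MCPs
Long-context performance	compaction + tool offloading + skills
Long-horizon work	Ralph loops + planning + verification

If you cannot name the specific behavior a component exists to support, it probably should not exist.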

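Key Component Patterns

Filesystem and Git: the most basic primitive. Without a filesystem you are copy-pasting into a chat window. Git provides version control, rollback, and branch-based experiments.

Bash and code execution: the general-purpose tool strategy. Agents are already good at shell commands, and most tasks reduce to a few well-chosen CLI calls.

Sandboxes: isolated execution environments. Allow-listed commands, network isolation, and environments created and destroyed on demand.

Memory and search: a crude but effective form of continual learning. AGENTS.md acts as persistent memory; web search and MCP address the knowledge cutoff.

Battling context rot:

Compaction: when the window is nearly full, intelligently summarize and offload older context
Tool-call offloading: keep the head and tail of large tool outputs and offload the full text to the filesystem
Skills with progressive disclosure: reveal instructions and tools only when the task actually needs them
Full context resets: Anthropic's strategy for very long tasks, which tears down the session and rebuilds it from a compact hand-off file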
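
Tool-call offloading is the easiest of these to show in code. The following is a sketch under assumptions: the function name, the `keep` parameter, the hash-based file name, and the wording of the placeholder line are all invented for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def offload_output(output: str, directory: Path, keep: int = 3) -> str:
    """Keep the head and tail of a large output; save the full text to disk."""
    lines = output.splitlines()
    if len(lines) <= 2 * keep:
        return output  # small enough to stay in context unchanged
    # Name the file after a content hash so repeated calls do not collide.
    digest = hashlib.sha1(output.encode("utf-8")).hexdigest()[:12]
    path = directory / f"tool-output-{digest}.txt"
    path.write_text(output, encoding="utf-8")
    omitted = len(lines) - 2 * keep
    marker = f"[{omitted} lines omitted; full output at {path}]"
    return "\n".join(lines[:keep] + [marker] + lines[-keep:])


with tempfile.TemporaryDirectory() as tmp:
    big = "\n".join(f"line {i}" for i in range(100))
    summary_lines = offload_output(big, Path(tmp), keep=2).splitlines()
    saved_files = len(list(Path(tmp).iterdir()))
```

The model sees five lines instead of a hundred, and the marker line tells it where to look if it needs the rest.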

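Long-horizon execution:

Ralph Loop: a hook intercepts the model's attempt to exit and re-injects the original prompt into a fresh context window, forcing it to continue
Planner / generator / evaluator splits: separate generation from evaluation into different agents, because agents reliably skew positive when grading their own work
Sprint contract: the generator and evaluator agree on what "done" means before any code is written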
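
A simplified sketch of the Ralph Loop idea follows. It assumes progress lives outside the context window (here a plain dict standing in for the filesystem) and that an external `is_done` check decides completion instead of the model's own judgment. `ralph_loop`, `fake_session`, and `workspace` are hypothetical names, and the real pattern hooks the agent's exit rather than wrapping it in a `for` loop.

```python
from typing import Callable, Dict, List


def ralph_loop(
    prompt: str,
    run_session: Callable[[str], None],
    is_done: Callable[[], bool],
    max_iterations: int = 5,
) -> int:
    """Re-run the same prompt in a fresh session until the check passes."""
    for iteration in range(1, max_iterations + 1):
        run_session(prompt)  # each call starts from an empty context
        if is_done():        # the harness decides when work is finished
            return iteration
    raise RuntimeError(f"not done after {max_iterations} sessions")


# Persistent state that survives across sessions.
workspace: Dict[str, List[str]] = {"todo": ["parse", "validate", "emit"], "done": []}


def fake_session(prompt: str) -> None:
    # Hypothetical agent: completes one item, then tries to exit early.
    if workspace["todo"]:
        workspace["done"].append(workspace["todo"].pop(0))


iterations = ralph_loop(
    "Finish every item in todo", fake_session, lambda: not workspace["todo"]
)
```

The fake agent quits after one item each time, yet the loop still drives all three items to completion because the exit condition belongs to the harness.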

Hooks: the enforcement layer

Run before a tool call, after a file edit, before a commit, and at session start
Run typecheck/lint/test and surface failures
Block destructive bash (rm -rf, git push --force)
Auto-format on write

The HumanLayer principle: success is silent, failures are verbose.
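
The two hook behaviors above can be sketched as follows. This is illustrative only: `pre_tool_hook` and `run_check` are hypothetical names, and the deny patterns are deliberately crude (for example, they also catch `--force-with-lease`).

```python
import re
import shlex
import subprocess
import sys
from typing import List, Optional

# Crude deny-list covering the two examples from the text.
DESTRUCTIVE = [
    re.compile(r"\brm\s+-(rf|fr)\b"),
    re.compile(r"\bgit\s+push\b.*--force"),
]


def pre_tool_hook(command: str) -> Optional[str]:
    """Return a refusal message for destructive commands, else None."""
    for pattern in DESTRUCTIVE:
        if pattern.search(command):
            return f"blocked: {command!r} matches {pattern.pattern!r}"
    return None


def run_check(argv: List[str]) -> str:
    """Run a check; return nothing on success and full output on failure."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode == 0:
        return ""  # success is silent
    return (
        f"$ {shlex.join(argv)} failed (exit {proc.returncode})\n"
        f"{proc.stdout}{proc.stderr}"
    )  # failures are verbose


blocked = pre_tool_hook("rm -rf build/")
allowed = pre_tool_hook("ls -la")
quiet = run_check([sys.executable, "-c", "pass"])
loud = run_check([sys.executable, "-c", "import sys; print('boom'); sys.exit(2)"])
```

Returning an empty string on success keeps passing checks out of the context window, so only failures cost tokens.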

AGENTS.md:

Keep it short (HumanLayer keeps theirs under 60 lines)
Earn each line: every rule should trace back to a past failure or a hard external constraint

Production Example: The Claude Code Architecture

Fareed Khan's teardown of the Claude Code architecture shows every layer of a mature harness:

Input layer: UI, session manager, permission gate
Knowledge layer: skill registry, context compressor, task graph, memory store
Integration layer: MCP runtime, external servers
Execution layer: tool dispatch, streaming runtime, prompt cache
Output layer: verified task results
Observability layer: event bus, background executor
Multi-agent layer: subagent spawning, teammate mailboxes, FSM protocol, autonomous board, worktree isolator

Nearly every concept above appears as a named component in this diagram.

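Harnesses Don't Shrink, They Move

The naive story: better models make harnesses obsolete. If the model can plan, you don't need a planner. If the model stays coherent over long horizons, you don't need context resets.

The reality: Opus 4.6 did kill the context-anxiety failure mode (Sonnet 4.5 would wrap up prematurely as it approached the context limit), which means the anxiety-mitigation scaffolding written six months ago is now dead code.

But the ceiling moves with the model. Tasks that used to be unreachable are now in play, and they bring their own failure modes. The anxiety scaffolding disappears, and in its place comes a multi-day memory policy, a harness that coordinates three specialized agents, or an evaluator that grades the design quality of generated UI.

Anthropic: "every component in a harness encodes an assumption about what the model can't do on its own." When the model gets better at something, that component is no longer load-bearing and should be removed. When the model unlocks something new, new scaffolding is needed to reach the new ceiling.

The Model-Harness Training Loop

Today's agent products are post-trained in a loop together with their harness. The model is trained specifically on the actions the harness designers think it should be good at: filesystem operations, bash, planning, subagent dispatch. This is why Opus 4.6 feels different inside Claude Code than it does in other harnesses.

The practical implication is twofold: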

A harness is a living system, not a config file you set up once.
The "best" harness isn't necessarily the one the model was trained inside; it's the one designed for your task.

Harness-as-a-Service

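Vivek Trivedy's HaaS framing: we are moving from building on LLM APIs (which give you a completion) to building on harness APIs (which give you a runtime). The Claude Agent SDK, Codex SDK, and OpenAI Agents SDK all point in the same direction.

The default path used to be: build your own loop, wire up tool-calling, handle conversation state, invent an approval flow. The default path now is: pick a harness framework, configure it along four pillars (system prompt, tools, context, subagents), and spend the remaining effort on domain-specific prompt and tool design.

FredKSchott's Flue is a concrete harness framework implementation, built after being inspired by an earlier version of this article; it validates the path of harness engineering from theory to tooling.

Future Directions

Coordinating multiple agents working in parallel on a shared codebase
Agents analyzing their own traces to identify and fix harness-level failure modes
Harnesses dynamically assembling the right tools and context (just-in-time rather than pre-configured)

The last one feels like the point where harnesses stop being static config and start becoming something closer to a compiler.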
