Multi-Agent Coordination Patterns

Source: Anthropic official blog, translated by 宝玉xp

Core principle: Start with the simplest pattern that works, observe where it bottlenecks, then evolve.

Pattern 1: Generator-Verifier

Best for: Output quality is critical and you can write explicit evaluation criteria.

How it works:

Generator produces a draft
Verifier checks against defined standards
If failed, feedback is sent back to generator
Loop until verifier approves or max iterations reached
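
The loop above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `generate_verify_loop`, `Verdict`, and the toy `toy_generate` / `toy_verify` functions are hypothetical names standing in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    approved: bool
    feedback: str = ""


def generate_verify_loop(
    generate: Callable[[str, Optional[str]], str],
    verify: Callable[[str], Verdict],
    task: str,
    max_iterations: int = 3,
) -> tuple[str, bool, int]:
    """Return (last draft, approved?, iterations used)."""
    feedback: Optional[str] = None
    draft = ""
    for attempt in range(1, max_iterations + 1):
        draft = generate(task, feedback)   # generator produces a draft
        verdict = verify(draft)            # verifier checks it against the rubric
        if verdict.approved:
            return draft, True, attempt
        feedback = verdict.feedback        # failed: feedback goes back to the generator
    return draft, False, max_iterations    # hard cap so the loop always terminates


# Toy stand-ins for model calls (hypothetical).
def toy_generate(task: str, feedback: Optional[str]) -> str:
    return task.upper() if feedback else task


def toy_verify(draft: str) -> Verdict:
    return Verdict(draft.isupper(), "use upper case")


result = generate_verify_loop(toy_generate, toy_verify, "hello")
```

Returning the approval flag alongside the draft lets the caller distinguish an approved result from one that merely ran out of iterations.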

Use cases: Code generation (one writes, one runs tests), fact-checking, compliance review, customer support email drafts.

Limitations:

Quality floor depends entirely on how detailed the verifier's rubric is
Assumes generation and verification are separable skills (not true for creative breakthroughs)
Can get stuck in infinite loops if generator cannot resolve verifier's objection

Pattern 2: Orchestrator-Subagent

Best for: Tasks decompose cleanly with clear subtask boundaries.

How it works:

A central orchestrator plans work, delegates to subagents, and synthesizes final results
Each subagent works in its own context window, returning distilled results
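
As a rough sketch of the plan / delegate / synthesize flow, assuming each subagent can be modeled as a plain function: `orchestrate`, `run_subagent`, and the two toy checks are hypothetical, and the thread pool is one way to add explicit parallelism rather than anything the source prescribes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Check = Callable[[str], str]


def run_subagent(name: str, check: Check, diff: str) -> str:
    # A subagent sees only its own inputs and returns a short, distilled result.
    return f"{name}: {check(diff)}"


def orchestrate(diff: str, checks: dict[str, Check]) -> str:
    # Plan: one subtask per review dimension. Delegate: run them in parallel.
    with ThreadPoolExecutor(max_workers=len(checks)) as pool:
        futures = [pool.submit(run_subagent, n, c, diff) for n, c in checks.items()]
        findings = [f.result() for f in futures]   # keeps submission order
    # Synthesize: the orchestrator only handles the distilled findings.
    return "\n".join(findings)


def security_check(diff: str) -> str:
    return "flagged eval()" if "eval(" in diff else "ok"


def style_check(diff: str) -> str:
    return "line too long" if any(len(line) > 79 for line in diff.splitlines()) else "ok"


report = orchestrate(
    "x = eval(user_input)",
    {"security": security_check, "style": style_check},
)
```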

Real example: Claude Code uses this pattern. The main agent writes code and edits files, but spawns background subagents for large-scale searches or independent investigations.

Use cases: Automated code review (security, coverage, style, architecture checks), complex analysis with distinct dimensions.

Limitations:

Orchestrator becomes an information bottleneck
Subagents run sequentially by default unless explicit parallelism is added
Key details can be lost through repeated summarization

Pattern 3: Agent Teams

Best for: Parallelizable, independent subtasks that need long-running, multi-step execution.

How it works:

A coordinator spawns teammates as independent processes
Teammates pull tasks from a shared queue, complete multi-step work, and report back
Key difference from Orchestrator-Subagent: teammates persist across tasks and accumulate domain context
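
A minimal sketch of teammates pulling from a shared queue while keeping their own accumulated context. Threads stand in for the independent processes described above, and `teammate`, `run_team`, and the task names are hypothetical.

```python
import queue
import threading


def teammate(name: str, tasks: queue.Queue, reports: list, lock: threading.Lock) -> None:
    context: list[str] = []              # persists across tasks, unlike a one-shot subagent
    while True:
        try:
            task = tasks.get_nowait()    # pull the next task from the shared queue
        except queue.Empty:
            return                       # nothing left to do
        context.append(task)
        with lock:
            # Report back: who did what, and how much context they hold so far.
            reports.append((name, task, len(context)))


def run_team(task_list: list[str], team_size: int = 2) -> list:
    tasks: queue.Queue = queue.Queue()
    for t in task_list:
        tasks.put(t)
    reports: list = []
    lock = threading.Lock()
    members = [
        threading.Thread(target=teammate, args=(f"mate-{i}", tasks, reports, lock))
        for i in range(team_size)
    ]
    for m in members:
        m.start()
    for m in members:
        m.join()
    return reports


reports = run_team(["migrate auth", "migrate billing", "migrate search"])
```

Which teammate picks up which task is not deterministic, so only the set of completed tasks should be relied on.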

Use cases: Large codebase migration (each member owns one service module), long-running data pipelines.

Limitations:

Independence is both a strength and weakness — teammates don't easily share intermediate progress
Progress management is hard because completion times vary
Resource contention (multiple agents editing same files/databases) requires conflict resolution

Pattern 4: Message Bus

Best for: Event-driven pipelines and systems that will keep growing new agents.

How it works:

Agents communicate purely through publish and subscribe
A router pushes relevant messages to agents based on topics
New agents can join by subscribing to relevant topics without rewiring the system
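
A toy in-process version of topic-based publish and subscribe, assuming exact-match topics. `MessageBus`, the `dead_letters` list, and the topic names are hypothetical; a real deployment would more likely sit on a message broker.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class MessageBus:
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)
        self.dead_letters: list[tuple[str, dict]] = []

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        handlers = self._subscribers.get(topic, [])
        if not handlers:
            # Keep unrouted messages visible instead of dropping them silently.
            self.dead_letters.append((topic, message))
        for handler in handlers:
            handler(message)


bus = MessageBus()
log: list[str] = []


def triage(msg: dict) -> None:
    log.append(f"triage:{msg['id']}")
    bus.publish("alert.investigate", msg)    # hand off by publishing, not by calling


def investigate(msg: dict) -> None:
    log.append(f"investigate:{msg['id']}")


bus.subscribe("alert.new", triage)
bus.subscribe("alert.investigate", investigate)
bus.publish("alert.new", {"id": 7})
bus.publish("alert.unknown", {"id": 8})      # no subscriber: lands in dead_letters
```

Recording unrouted messages is one way to soften the silent-failure risk; it does not help when a message is routed to the wrong topic.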

Use cases: Security operations (alerts flow from triage → investigation → response), extensible automation pipelines.

Limitations:

Debugging is painful — tracing a cascade across five agents requires careful log correlation
Router accuracy is critical; a misclassified or dropped message causes silent failure
LLM-based routers add their own failure modes (hallucination, misinterpretation)

Pattern 5: Shared State

Best for: Highly collaborative tasks where agents need to build on each other's discoveries in real time.

How it works:

No central commander. Agents read from and write to a persistent shared store (database, filesystem, document)
Like a shared blackboard: agents pick up clues, work on them, and write findings back
System stops when time runs out, convergence threshold is reached, or a designated arbiter declares the answer good enough
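
The blackboard loop and two of its stop conditions (time budget and convergence) can be sketched as follows. This assumes agents are plain functions over a set of findings; `run_blackboard`, `max_idle_rounds`, and the toy agents are hypothetical.

```python
import time
from typing import Callable

Agent = Callable[[frozenset], set]   # reads the board, returns new findings


def run_blackboard(
    agents: list[Agent],
    seed: set,
    time_budget_s: float = 1.0,
    max_idle_rounds: int = 1,
) -> tuple[set, str]:
    board = set(seed)                          # the shared store
    deadline = time.monotonic() + time_budget_s
    idle_rounds = 0
    while True:
        if time.monotonic() >= deadline:       # hard stop: time budget
            return board, "time budget exhausted"
        before = len(board)
        for agent in agents:
            board |= agent(frozenset(board))   # each agent builds on what others wrote
        idle_rounds = idle_rounds + 1 if len(board) == before else 0
        if idle_rounds >= max_idle_rounds:     # hard stop: nothing new was written
            return board, "converged"


def paper_agent(board: frozenset) -> set:
    return {"patent:123"} if "paper:A" in board else set()


def patent_agent(board: frozenset) -> set:
    return {"news:X"} if "patent:123" in board else set()


board, reason = run_blackboard([paper_agent, patent_agent], {"paper:A"})
```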

Use cases: Cross-domain research (papers, patents, news, reports — each agent's finding becomes another's lead).

Limitations:

Without central control, agents may duplicate work or diverge
Reactive loops: Agent A writes, Agent B responds, Agent A responds back... burning tokens without conclusion
Must design hard stop conditions (time budget, convergence threshold, arbiter agent)

Emerging direction: Latent-space coordination (RecursiveMAS)

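Source: AI 简报 2026-05-02 Morning, AINews 2026-05-02

All five patterns above assume that agents exchange intermediate results as natural-language text. That assumption carries a fundamental overhead: at every handoff, model A's internal computation (numeric vectors) must be decoded into tokens and then re-encoded into numeric vectors by model B. This repeated translate-and-reread cycle consumes substantial compute and adds latency.

RecursiveMAS (a paper relayed by @vista8 / @omarsar0) tries to bypass this translation layer:

Mechanism: Agents pass the model's internal latent-space vectors directly instead of natural-language text; only the final round is decoded into text output
Training cost: The underlying models stay untouched; only the connector modules are trained (lightweight, lighter than LoRA)
Reported gains (average over 9 benchmarks):
- Accuracy: +8.3%
- End-to-end speed: 1.2x–2.4x
- Token reduction: 34.6%–75.6%
- AIME math competition: 13-18 percentage points above the strongest baseline, with the advantage growing as recursion rounds increase

Why it matters for multi-agent design:

If inter-agent communication cost (token economics, serial latency) becomes the dominant bottleneck, as the Anthropic Long-Running and Cursor planners/workers/judges practices both indicate it will, latent-space coordination may be the research direction most worth tracking here
There is an obvious cost, though: observability. All five existing patterns can be debugged from transcripts, whereas passing latent vectors is a black box and needs a new observability layer
It may land first inside the Generator-Verifier inner loop (the most frequent small interactions, with the lowest observability requirements), then extend to Orchestrator-Subagent

Tension: RecursiveMAS points in the opposite direction from the Agent Memory vs Context Substrate approach of "externalizing work artifacts onto a substrate so agents can collaborate". The former pulls all collaboration back inside the model; the latter pushes all collaboration out to an external substrate. Each has its place, and the end state is most likely a hybrid architecture.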

How to choose

Question	Choose
Need explicit quality gate?	Generator-Verifier
Short, well-defined subtasks?	Orchestrator-Subagent
Long-running independent tasks?	Agent Teams
Event-driven, ever-growing pipeline?	Message Bus
Need real-time collaboration on shared discoveries?	Shared State
Must eliminate single point of failure?	Shared State

Orchestrator-Subagent vs Agent Teams

Ask: Do workers need to remember state across multiple tasks?

No → Orchestrator-Subagent
Yes → Agent Teams

Orchestrator-Subagent vs Message Bus

Ask: Is the workflow predictable?

Yes → Orchestrator-Subagent
No (event-driven, may change direction) → Message Bus

Agent Teams vs Shared State

Ask: Do agents need to reference each other's work in progress?

No → Agent Teams
Yes → Shared State

Message Bus vs Shared State

Ask: Is the goal processing a stream of events, or accumulating a knowledge base?

Stream of events → Message Bus
Accumulating knowledge → Shared State

Real-world architectures

These patterns are building blocks, not mutually exclusive. Common hybrids:

Orchestrator-Subagent for the main workflow, with a Shared State subtask for collaborative research
Message Bus for event distribution, with Agent Teams handling each event type

Real-world case: Cursor planners / workers / judges (2026-05-02)

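Source: Addy Osmani, Long-running Agents

In "Scaling long-running autonomous coding", Cursor disclosed three architecture iterations:

Iteration	Design	Problem
v1	Equal-status agents writing to shared files, with locks	Bottleneck, risk-averse agents, churn
v2	Optimistic concurrency control replacing locks	Removed the bottleneck but left the coordination problem unsolved
v3	Planners + Workers + Judges	Current production architecture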
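
For readers unfamiliar with the v2 idea, here is a generic sketch of optimistic concurrency control: writers do not hold a lock while working, and a write succeeds only if the version they read is still current. This is a textbook illustration, not Cursor's implementation; `VersionedFile`, `VersionConflict`, and `append_line` are hypothetical names.

```python
import threading


class VersionConflict(Exception):
    pass


class VersionedFile:
    """In-memory stand-in for a shared file with a version counter."""

    def __init__(self, content: str = "") -> None:
        self.content = content
        self.version = 0
        self._lock = threading.Lock()    # guards only the brief compare-and-swap

    def read(self) -> tuple[str, int]:
        with self._lock:
            return self.content, self.version

    def write(self, new_content: str, expected_version: int) -> int:
        with self._lock:
            if self.version != expected_version:
                raise VersionConflict(
                    f"expected v{expected_version}, found v{self.version}"
                )
            self.content = new_content
            self.version += 1
            return self.version


def append_line(f: VersionedFile, line: str, max_retries: int = 5) -> int:
    for _ in range(max_retries):
        content, version = f.read()      # no lock is held while the agent works
        try:
            return f.write(content + line + "\n", version)
        except VersionConflict:
            continue                     # another writer won: re-read and retry
    raise VersionConflict("gave up after retries")


shared = VersionedFile()
append_line(shared, "agent A: task 1")
_, stale_version = shared.read()         # agent C reads here...
append_line(shared, "agent B: task 2")   # ...but agent B commits first
```

A write that still carries `stale_version` now fails with `VersionConflict`, which removes the lock bottleneck but, as the table notes, does nothing to decide which agent should have been doing the work.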

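v3 role definitions:

Planners: continuously explore the codebase, emit tasks, and can recursively spawn sub-planners
Workers: focus on execution, do not coordinate with each other, and do not track the big picture
Judges: decide when an iteration is complete and when to restart

Key findings:

"a surprising amount of the system's behavior comes down to how we prompt the agents", which matters more than the harness or the model itself
Different models suit different roles: GPT models are better suited than Opus to extended autonomous work, because Opus tends to stop early and take shortcuts

Cursor 3 companion features:

Composer 2 (in-house frontier coding model)
Background cloud agents: 8-hour refactors and whole-codebase migrations that keep running with the laptop lid closed
Each agent runs in an isolated git worktree and merges back to the main branch through a PR
Local/cloud handoff is a product surface most teams have not yet solved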

Market demand signal

The Cursor 3 user feedback (431 replies) strongly validates these patterns as real market demand, not academic abstractions. Users explicitly asked for role-based agent teams (planner → implementer → reviewer → QA), task-tree views, and user-as-orchestrator workflows. See Cursor 3 User Feedback for the full breakdown.

Real-world case: Gas City and the 100-agent software factory (2026-05-26)

Source: Inside the 100-agent Software Factory

Gas City is a software-factory experiment for coordinating many coding agents. Its current implementation is not yet a clean product surface, but it adds three useful coordination ideas:

Dark factory / light factory: let many agents run in the background, then expose only the small set of outputs that need human attention.
One pet, many cattle: keep one high-context coordinating agent close to the human while treating worker agents as disposable execution capacity.
Multi-model review: use different models for creation, critique, and review rather than assuming one model should own every role.

Why it matters: Gas City strengthens the distinction between adding more agents and building a control plane. The useful pattern is not “100 agents” by itself; it is task routing, review, conflict management, and deciding what should surface to humans.

Counterpoint: The source explicitly treats Gas City as a glimpse of the future rather than a ready workflow. It should enrich Factory Missions and coordination pages, not become a standalone product recommendation.

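Real-world case: Attention governance and the six multi-agent boundaries (2026-05-28)

Source: 多 Agent 的本质不是分工，而是注意力治理 (The essence of multi-agent is not division of labor but attention governance)

@ZeroZ_JQ argues that the core of a multi-agent system is not division of labor but attention governance: when multiple agents run at the same time, what really needs managing is not who does what, but how attention is allocated and switched across tasks and agents.

Six attention boundaries:

Boundary	Question	Governance strategy
Task boundary	How many tasks does one agent handle at once?	Single-task focus vs multi-task parallelism, with explicit switching costs
Time boundary	How long does an agent run before it needs a rest or an audit?	Checkpoint and compaction cadence for long-running tasks
Context boundary	How much can an agent remember, and how much does it forget?	Explicit memory layer vs implicit context, with periodic consolidation
Tool boundary	Which tools can an agent call, and with what permissions?	Tool allowlists, dynamic permission escalation and de-escalation
Collaboration boundary	How do multiple agents share information without conflicts?	Shared state vs message bus, conflict detection and resolution
Human-agent boundary	Which decisions require a human in the loop?	Approval thresholds, exception escalation paths, rollback mechanisms

Key insights:

Traditional multi-agent design defines roles first (planner, worker, reviewer), but roles are static and attention is dynamic
When agents can self-evolve, preset role divisions go stale quickly, but attention boundaries (do not modify the same file concurrently, do not run indefinitely without an audit) persist
This explains why Cursor's Planner/Worker/Judge architecture works: not because it assigns roles, but because it governs where each agent's attention is focused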

Related

harness-engineering/overview — Harness engineering overview
harness-engineering/openai-frontier-symphony — OpenAI Frontier multi-agent architecture
harness-engineering/agent-architecture-unsentimental — Unsentimental lessons on agent architecture
harness-engineering/notion-ai-agents — Notion's 4-agent setup (Orchestrator-Subagent example)
product-trends/cursor-3-user-feedback-431 — Cursor 3 user feedback validates multi-agent demand

Real-world case: TMA1 v2 — Cross-Agent Context Injection (2026-05-25)

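Source: TMA1 v2: 让 Coding Agent Loop 真的转起来 (Making the Coding Agent Loop Actually Run)

TMA1 v2 is a local observability tool for coding agents. Its core innovation is not the observation itself, but closing the loop by injecting what it observes into the agent's next prompt.

Cross-Agent Context Sharing via `/tma1-peer`

TMA1 v2 implements direct context sharing between Claude Code and Codex:

Running /tma1-peer cc 1 in Codex reads Claude Code's most recent session context (tool calls, modified files, build results, anomaly reports)
This lets the review → code loop run without a human copy-pasting review comments

This supports the view that, in multi-agent collaboration, cross-agent context sharing should not depend on a human relay; agents should read each other's session state directly through a structured interface.

`<tma1-context>` auto-injection

TMA1 automatically injects <tma1-context> information at several hook points (UserPromptSubmit, PostToolUse, SessionStart, Stop, PreCompact):

Current session information (duration, tool calls, tokens)
Build command status and the last error
External file changes (made by a human or another agent)
Rule-based anomaly information (for example, suggesting compaction when context exceeds 100k)

Key design point: this information is not there for the agent to glance at; it directly alerts the agent to context changes and recommends an action. For example, human_modified_during_session prompts the agent to re-read the file rather than assume its in-memory copy is still valid.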
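
To make the idea concrete, here is a sketch of how such a block might be assembled from session state. The source does not show TMA1's real payload, so everything except the `<tma1-context>` tag, the `human_modified_during_session` name, and the 100k threshold is an assumption: `SessionState`, `build_context_block`, and the line format are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    duration_s: int
    tool_calls: int
    context_tokens: int
    last_build_error: str = ""
    human_modified_files: list[str] = field(default_factory=list)


def build_context_block(s: SessionState, compaction_threshold: int = 100_000) -> str:
    lines = [
        f"session: {s.duration_s}s, {s.tool_calls} tool calls, {s.context_tokens} tokens"
    ]
    if s.last_build_error:
        lines.append(f"build: FAILED ({s.last_build_error}); fix before continuing")
    for path in s.human_modified_files:
        # Pair every observation with an action, not just a fact.
        lines.append(f"human_modified_during_session: {path}; re-read before editing")
    if s.context_tokens > compaction_threshold:
        lines.append("anomaly: context above threshold; consider compaction")
    return "<tma1-context>\n" + "\n".join(lines) + "\n</tma1-context>"


block = build_context_block(
    SessionState(
        duration_s=600,
        tool_calls=42,
        context_tokens=120_000,
        last_build_error="exit 1",
        human_modified_files=["src/app.py"],
    )
)
```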

Build State Attribution: Human vs Agent

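When fsnotify sees a file change, how do you tell whether a human or an agent made it?

TMA1's attribution strategy:

Check hook events within a ±5 second window: match Edit/Write/MultiEdit events by file_path
Check Bash command input: does it contain the file's basename (this catches operations such as mkdir, rm, and git checkout that have no file_path field)
If neither pass matches → attribute to human

Principle: "Better to wrongly accuse yourself than to take the blame on the agent's behalf."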
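
The two-pass attribution can be sketched as below. This follows the steps as described, not TMA1's actual code: `HookEvent`, `attribute_change`, and the field layout are hypothetical, and timestamps are assumed to be seconds on a shared clock.

```python
import os
from dataclasses import dataclass


@dataclass
class HookEvent:
    timestamp: float
    tool: str               # e.g. "Edit", "Write", "MultiEdit", "Bash"
    file_path: str = ""     # set by file-editing tools
    command: str = ""       # set by Bash


EDIT_TOOLS = {"Edit", "Write", "MultiEdit"}


def attribute_change(
    path: str, changed_at: float, events: list[HookEvent], window_s: float = 5.0
) -> str:
    nearby = [e for e in events if abs(e.timestamp - changed_at) <= window_s]
    # Pass 1: an edit tool touched exactly this path inside the window.
    if any(e.tool in EDIT_TOOLS and e.file_path == path for e in nearby):
        return "agent"
    # Pass 2: a Bash command mentions the file's basename (rm, git checkout, ...).
    base = os.path.basename(path)
    if base and any(e.tool == "Bash" and base in e.command for e in nearby):
        return "agent"
    # Neither pass matched: attribute the change to the human.
    return "human"


events = [
    HookEvent(100.0, "Edit", file_path="src/app.py"),
    HookEvent(200.0, "Bash", command="git checkout -- README.md"),
]
```

The basename match is deliberately loose, so a Bash command that merely mentions a file name will claim the change for the agent; that trade-off is inherent in the heuristic as described.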

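This illustrates the attribution problem in agent environment awareness: when multiple agents and humans work in the same codebase, accurate change attribution is a prerequisite for avoiding conflicts and duplicated work.

Pattern Classification

TMA1 v2's cross-agent sharing is a hybrid of Shared State and Message Bus:

Shared state: query project build information, environment information, and other agents' session information through an MCP Server
Message injection: push context updates to agents at specific moments through hooks

Difference from Pattern 5 (Shared State): rather than leaving agents to passively read a shared store, TMA1 actively injects structured context at hook time, which lowers the risk that an agent forgets to check the shared state.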

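Real-world case: Vox's shared brain practice (2026-05-13)

Source: Vox, The First Step in Building My AI Native Team

Vox runs two agent platforms, openclaw and hermes, and found that the same decision was being recorded on both platforms while remaining invisible to the other. She introduced gbrain as a shared brain and proposed the principle "build the shared study first, then add agents".

Key design decisions:

The shared brain holds only team facts and long-term decisions, not sensitive data, raw chat logs, or credentials
New agents are read-only by default; only an explicitly designated owner agent can write
The control plane (health monitoring) is separated from the knowledge layer to avoid a single point of failure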
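
A small sketch of the first two rules: read-only by default, writes only from explicitly granted owners, and only for approved kinds of content. This is not gbrain's API, which the source does not describe; `SharedBrain`, `grant_write`, and `ALLOWED_KINDS` are hypothetical.

```python
from typing import Optional

ALLOWED_KINDS = {"fact", "decision"}     # no credentials, chat logs, or sensitive data


class SharedBrain:
    def __init__(self) -> None:
        self._entries: dict[str, tuple[str, str]] = {}   # key -> (kind, value)
        self._writers: set[str] = set()                  # explicitly granted owners

    def grant_write(self, agent: str) -> None:
        self._writers.add(agent)

    def read(self, agent: str, key: str) -> Optional[str]:
        entry = self._entries.get(key)                   # every agent may read
        return entry[1] if entry else None

    def write(self, agent: str, key: str, kind: str, value: str) -> None:
        if agent not in self._writers:
            raise PermissionError(f"{agent} is read-only")
        if kind not in ALLOWED_KINDS:
            raise ValueError(f"kind {kind!r} does not belong in the shared brain")
        self._entries[key] = (kind, value)


brain = SharedBrain()
brain.grant_write("owner-agent")
brain.write("owner-agent", "db-choice", "decision", "use Postgres")
seen_by_newcomer = brain.read("new-agent", "db-choice")
```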

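Comparison with Pattern 5 (Shared State): Vox's practice is stricter than generic Shared State. Shared State lets any agent read and write any data, whereas Vox puts boundary rules ahead of connectivity. This addresses a problem not discussed in the original Anthropic pattern: a shared store without governance quickly degrades into a junk heap.

See Shared Brain First, Boundaries Second for details.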

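Real-world case: Slock.ai's organization of 7 humans + 40 agents (2026-05-03)

Source: Agent动力学 (Agent Dynamics), Slock's design philosophy

Slock's agent organization evolved from natural needs rather than upfront design. RC started with 1 human + 1 agent, gradually found that multiple agents were needed to handle different tasks in parallel, and eventually arrived at a mix of 40 agents.

Role differentiation: a large number of engineers (not split into frontend and backend; "an engineer is an engineer"), one engineering lead (who tracks the other engineers' progress and produces summary reports), plus roles such as designer, growth, and strategy.

Task claiming: RC posts tasks in the engineers' channel and agents claim them on their own. Agents that have done a certain kind of work become more inclined to keep doing it, a form of emergent specialization.

On a single all-purpose agent vs a multi-agent division of labor:

In theory, a single all-purpose agent is simpler
In practice, when the sub-agent team spawned by an all-purpose agent goes off track, people want to correct a specific sub-agent directly
This need to micromanage is the user-psychology basis for multi-agent platforms

Design principles for the CLI as an agent interface

Slock's design starts from first principles:

The CLI is for agents, not for humans, so the design logic is entirely different
Input should be concise and unambiguous, and help text should give clear examples
Output should clearly indicate whether the operation succeeded and what data was returned
Prefer deterministic, static, information-dense output
Every SaaS product should be presented to agents in CLI form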
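
A sketch of what these principles could look like for a task-claiming command. The `tasks` CLI, its `claim` subcommand, and the JSON fields are hypothetical and are not Slock's actual interface; the point is the shape: an example in the help text, an explicit success flag, and deterministic output.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="tasks",
        description="Claim an open task by id.",
        epilog="example: tasks claim 42 --agent eng-3",   # help text shows a concrete call
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    sub = parser.add_subparsers(dest="command", required=True)
    claim = sub.add_parser("claim", help="claim an open task")
    claim.add_argument("task_id", type=int)
    claim.add_argument("--agent", required=True)
    return parser


def run(argv: list[str], open_tasks: set[int]) -> str:
    args = build_parser().parse_args(argv)
    if args.task_id in open_tasks:
        open_tasks.remove(args.task_id)
        result = {"ok": True, "task_id": args.task_id, "claimed_by": args.agent}
    else:
        result = {"ok": False, "error": "task_not_open", "task_id": args.task_id}
    # sort_keys keeps the output byte-for-byte stable across runs.
    return json.dumps(result, sort_keys=True)


open_tasks = {42}
first = run(["claim", "42", "--agent", "eng-3"], open_tasks)
second = run(["claim", "42", "--agent", "eng-5"], open_tasks)
```

The second call fails with a machine-readable error code rather than a prose message, so the calling agent can branch on `ok` without parsing free text.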

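Real-world case: OpenClaw inbox onboarding (2026-05-17)

Source: AI 简报 2026-05-17 Morning

The OpenClaw team sends each new agent a structured Day 1 email, which serves as an onboarding primitive for autonomous multi-agent operations:

Day 1 email contents: role, target, sources, first task, reply format
Three-layer information architecture: docs describe "what exists", memory records "what happened", and the inbox defines "what matters today"
Division of labor: the inbox carries today's tasks, while the brain (knowledge base) carries long-term decisions
Evolution: after a few weeks the team of AI employees can see work "beyond today" on its own, and humans step back from coordinator to rule-maker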
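
One way to picture the Day 1 email is as a fixed five-field record. The field names come from the list above; the `Day1Email` class, its `render` format, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Day1Email:
    role: str
    target: str
    sources: tuple[str, ...]
    first_task: str
    reply_format: str

    def render(self) -> str:
        # Declarative and static: the agent reads this instead of asking a human.
        return "\n".join([
            f"Role: {self.role}",
            f"Target: {self.target}",
            "Sources: " + ", ".join(self.sources),
            f"First task: {self.first_task}",
            f"Reply format: {self.reply_format}",
        ])


email = Day1Email(
    role="release-notes writer",
    target="publish weekly release notes",
    sources=("docs/", "memory/decisions.md"),
    first_task="draft notes for this week's merged PRs",
    reply_format="markdown, one bullet per change",
)
body = email.render()
```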

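Comparison with existing patterns:

Unlike the real-time delegation of Orchestrator-Subagent, the inbox pattern is asynchronous, declarative coordination
New agents get context by reading static onboarding documents rather than by talking to a human, which lowers coordination cost
This offers a lesson for both Agent Teams and Shared State: clear boundary rules (separating docs, memory, and inbox) matter more than connectivity itself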
