
Claude Cowork — Felix Rieseberg MAD Podcast Deep Dive

Updated 2026-04-13

Interview: The MAD Podcast with Matt Turck
Guest: Felix Rieseberg, Engineering Lead @ Anthropic (Claude Cowork technical lead)
Date: April 2026

This is the most detailed public account of Claude Cowork's origins, design philosophy, and Felix's broader views on the agentic future.

The Mythos lunch-break story

During a Mythos Preview safety test, researchers gave the model a standard jailbreak task and went to lunch.

When they returned, the model had:

  1. Completed the jailbreak
  2. Written an email to the researchers: "I have successfully completed the jailbreak. Please find the detailed report in the attachment."

This was not designed behavior. No one told it to report back. It inferred this as the most reasonable next step.

Felix calls this a step-function change — not a 20–30% improvement, but a capability that goes from "almost unusable" to "amazingly good" overnight.

Mythos capabilities that surprised Felix

Code security auditing

Mythos appears to be the first system that truly combines the speed of static analysis tools with the accuracy of human auditors. It can find complex vulnerabilities requiring cross-file context, business-logic intent, and attack-surface judgment.

Autonomous action

Once reasoning crosses a threshold, capabilities become qualitative, not quantitative. The jailbreak-email example shows:

  • Understanding implicit goals (not just completing the jailbreak, but helping the researchers understand it)
  • Selecting optimal communication channel (email)
  • Generating structured output (detailed report)

What is Claude Cowork?

Not a simple extension of Claude Code.

Claude Cowork is designed for scenarios where AI acts as a "colleague" participating in sustained work:

  • Remembers project state
  • Tracks prior decisions and their rationales
  • Understands codebase-specific constraints

Key distinction: an ordinary conversation is question-and-answer; Cowork is a "project partner" mode.

Core engineering insights from building Cowork

1. State management > model capability

The hardest problem was not generating better code, but state management:

  • Where are we in the project?
  • What decisions were made before?
  • Why were they made?

This drove massive investment in context engineering and state tracking.
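
The three questions above can be read as a data model. A minimal sketch of what such state tracking might look like (the class and field names here are hypothetical illustrations, not Cowork's actual internals):

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    """A recorded project decision together with its rationale."""
    summary: str
    rationale: str

@dataclass
class ProjectState:
    """Tracks where the project is, what was decided, and why."""
    phase: str = "planning"
    decisions: list[Decision] = field(default_factory=list)

    def record(self, summary: str, rationale: str) -> None:
        self.decisions.append(Decision(summary, rationale))

    def context_summary(self) -> str:
        """Render the state as text an agent can carry in its context."""
        lines = [f"Current phase: {self.phase}"]
        for d in self.decisions:
            lines.append(f"- {d.summary} (because: {d.rationale})")
        return "\n".join(lines)

state = ProjectState()
state.record("Use PostgreSQL", "Team already runs it in production")
print(state.context_summary())
```

The point of the sketch is that rationales travel with decisions: when the agent later revisits a choice, the "why" is still attached to the "what."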

2. Controllability > raw intelligence

When users first experience Cowork, their top concern is not "is it smart enough?" but "I don't know what it's doing."

The stronger the model, the more users need to see its reasoning process. Otherwise, wrong decisions are neither understandable nor correctable.

This shaped UX priorities: transparency and intervenability are first-class.

3. Error recovery > error avoidance

Perfect AI is unrealistic. Cowork's design philosophy: assume AI will make mistakes, but make them easy to discover and fix.

Architecture supports:

  • Rollback
  • Branch experiments
  • Compare alternative approaches

A Git-like workflow, but for AI collaboration.
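
A toy illustration of the checkpoint-and-rollback idea (a sketch of the pattern, not Cowork's implementation; the `Workspace` class is invented for this example):

```python
import copy

class Workspace:
    """Toy checkpoint/rollback store for agent edits."""
    def __init__(self) -> None:
        self.files: dict[str, str] = {}
        self._checkpoints: list[dict[str, str]] = []

    def checkpoint(self) -> int:
        """Snapshot the current files; returns a checkpoint id."""
        self._checkpoints.append(copy.deepcopy(self.files))
        return len(self._checkpoints) - 1

    def rollback(self, cp: int) -> None:
        """Restore the files to a previous snapshot."""
        self.files = copy.deepcopy(self._checkpoints[cp])

ws = Workspace()
ws.files["app.py"] = "v1"
cp = ws.checkpoint()
ws.files["app.py"] = "v2-experimental"  # an AI edit we may discard
ws.rollback(cp)
print(ws.files["app.py"])  # → v1
```

Branch experiments fall out of the same mechanism: snapshot once, apply two alternative edit sequences from the same checkpoint, and compare the results before committing to either.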

Execution is Free

Felix's widely quoted phrase: "Execution is free."

Traditional constraints

Software entrepreneurship has always been constrained by execution cost:

  • Idea validation needs prototypes → engineers + designers + weeks
  • Product development needs more engineers + testing + iteration
  • Market validation loops are expensive

The new arithmetic

When a natural language description becomes runnable code in minutes, the constraint shifts from "can we build it?" to "what should we build?"

Clarifications:

  • Fast execution ≠ correct execution. Strategy and judgment rise in value.
  • Prototype ≠ product. Production-grade software still requires engineering judgment.
  • Code ≠ business. Finding the right idea and understanding users remain unchanged.

New entrepreneurial barriers

  • Non-technical founders: May no longer be blocked by technical execution if they can express requirements clearly.
  • Technical founders: Must shift from "I can write code" to "I can judge what's worth writing."
  • Investors: May shift from "can this team execute?" to "does this team know what to execute?"

The end of SaaS

Felix argues that traditional SaaS is ending.

SaaS = humans buying and operating software themselves

A decision-maker logs into Salesforce, fills forms, configures permissions, learns the UI, and recommends the tool to the team.

Agent-first = agents buying headless APIs on behalf of humans

No UI. No configuration wizard. One agent calls another agent's API to complete a task.

Example: sales data analysis

  • SaaS mode: log into BI tool → upload/connect data → configure report → wait for charts → export
  • Agent-first mode: AI assistant calls data analysis API → sends data + requirements → receives structured result → integrates into workflow

Headless ≠ Brainless

Headless APIs may be more complex than SaaS:

  • Stronger error handling (no human sees error popups)
  • More precise parameter definitions (no UI guidance)
  • Better state tracking (no visual progress bars)

"Designing an interface that an agent can understand and use is harder than designing a UI for humans."
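
One way to read the three requirements above is as a response contract: every call must return machine-readable errors, validate its parameters strictly, and report state explicitly. A hypothetical sketch (the endpoint and its fields are invented for illustration):

```python
import json

def analyze_sales(payload: str) -> str:
    """Hypothetical headless endpoint: strict input, structured output."""
    try:
        req = json.loads(payload)
        region = req["region"]   # no UI to guide the caller:
        metric = req["metric"]   # missing keys are hard errors
    except (json.JSONDecodeError, KeyError) as e:
        # machine-readable error: no human sees an error popup
        return json.dumps({"status": "error", "code": "bad_request",
                           "detail": str(e)})
    # explicit state field in place of a visual progress bar
    return json.dumps({"status": "done", "progress": 1.0,
                       "result": {"region": region, "metric": metric,
                                  "value": 42.0}})

print(analyze_sales('{"region": "EMEA"}'))  # error path: "metric" is missing
```

Note that the failure branch does as much design work as the success branch: the calling agent must be able to distinguish "you sent me garbage" from "the task failed" without a human reading an error message.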

Timeline

Felix predicts 3–5 years for headless/API-first to move from edge to mainstream.

Accelerating factors:

  • Agent capability leaps (Mythos-level models)
  • Cost pressure (headless APIs are typically cheaper)
  • Integration demand (agent-to-agent calls are more efficient than a human operating multiple SaaS platforms)

Skill migration: from computer language to human language

Future successful software builders will shift from "knowing computer languages" to "knowing human language."

"Human language" capability means:

  1. Requirements decomposition — turning fuzzy ideas into executable steps
  2. Boundary definition — clearly stating what's in, what's out, what's negotiable
  3. Validation criteria — defining "done" and quality standards
  4. Context provision — who are the users, what's the scenario, what are the constraints

It's not either/or

  • AI can write code, but doesn't know what code to write
  • AI can debug, but doesn't know correct behavior
  • AI can optimize, but doesn't know what's worth optimizing

The ideal builder combines:

  • Clear human-language problem description
  • Enough systems knowledge to judge AI-generated solutions

Implications

  • New entrants: programming syntax matters less; communication and analysis matter more
  • Experienced developers: architecture, systems thinking, and quality judgment appreciate; raw typing speed depreciates
  • Team structure: "technical translators" (business needs → technical needs) become more scarce and valuable

UX is harness, not model

Felix's answer to "what is good AI UX?":

"The successful AI agent is not a more intelligent model, but a better harness — one that helps users organize work and build trust."

The paradox

Model capabilities grow exponentially, but human ability to manage complexity grows linearly. A smarter model without complexity management worsens UX.

Example: "Optimize this database query"

  • Bad UX: AI silently changes the code → user feels anxious
  • Good UX: AI shows its thinking — "I found a missing index, considered A (faster, more memory) and B (balanced), recommend B" → user feels in control

Three harness functions

  1. Organize work — project stage, past decisions, next steps
  2. Build trust — predictable behavior, visible reasoning, intervention points, rollback
  3. Manage cognitive load — filter noise, highlight key decisions, hide irrelevant details

Cowork UX implementations

  • /plan mode: AI shows plan before executing. User can review, modify, approve.
  • Project status view: always-visible "where we are, what's done, what's pending"
  • Progressive disclosure: complex tasks unfold step by step; user can drill in or accept defaults
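
The /plan pattern reduces to a simple control-flow shape: propose, wait for approval, only then execute. A minimal sketch (the function and callback are invented for illustration, not Cowork's API):

```python
from typing import Callable

def run_with_plan(task: str, approve: Callable[[list[str]], bool]) -> list[str]:
    """Plan-then-execute loop: show steps, act only after approval."""
    plan = [f"analyze: {task}", f"implement: {task}", f"verify: {task}"]
    if not approve(plan):  # the user can reject before anything runs
        return []
    done = []
    for step in plan:
        done.append(step)  # execute the step (stubbed out here)
    return done

# auto-approve for the demo; a real harness would show the plan and prompt
result = run_with_plan("add index", lambda plan: True)
print(len(result))  # → 3
```

The trust-building property lives in the gap between the two phases: nothing in the execute loop can run until the user has seen, and had the chance to edit or veto, the full plan.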

Historical analogy: the printing press

Felix places today's AI at "Gutenberg just built the first printing press."

  • The technology exists but is still primitive and expensive
  • Only a small fraction has access
  • No one can foresee the consequences: mass literacy, newspapers, scientific revolution, modern democracy

Near-term certainties (2–3 years)

  • AI becomes standard developer infrastructure, not optional plugin
  • Role boundaries blur: designers generate code, engineers generate design, PMs validate ideas directly
  • New skill hierarchy: problem definition > AI guidance > direct execution (execution layer shrinks)

Long-term: humility

"Whenever someone tells me they can predict AI 10 years out, I'm skeptical. We don't even know the capability boundary 2 years out."

The Mythos email example was something Felix would have placed "at least 5 years away" a year ago.

Advice for builders

  1. Stay adaptable — willingness to rethink "what is programming / software / work" is a massive advantage
  2. Invest in judgment, not just skills — skills obsolete faster; knowing what to do appreciates
  3. Stay alert, don't over-predict — you don't need to forecast perfectly, but you need to react quickly to step-function changes
  4. Focus on harness, not just model — model capabilities commoditize; harness design differentiates
