Claude Code Context Window Optimization

Updated 2026-04-13

Guide by @aakashgupta: Context is the scarcest resource in Claude Code. Before the first user message is sent, 10%–16% of the context window is already consumed by system prompts, MCP servers, and custom agents.

Overhead breakdown

Component              Approximate overhead
System prompt          ~2%
Each MCP server        ~8%+
Custom agent           ~4%
Conversation history   grows continuously
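
To make the arithmetic concrete, here is a minimal sketch that adds up the approximate figures from the breakdown above. The function names and the 200,000-token window size are assumptions for illustration only, and the ~8% per MCP server is a lower bound, so real overhead may be higher.

```python
# Approximate overhead per component, in percentage points of the context
# window, taken from the breakdown above. These are rough estimates.
SYSTEM_PROMPT_PCT = 2
PER_MCP_SERVER_PCT = 8   # listed as "~8%+", so treat this as a floor
PER_CUSTOM_AGENT_PCT = 4


def baseline_overhead_pct(mcp_servers: int, custom_agents: int) -> int:
    """Estimate the context consumed before the first user message."""
    return (
        SYSTEM_PROMPT_PCT
        + mcp_servers * PER_MCP_SERVER_PCT
        + custom_agents * PER_CUSTOM_AGENT_PCT
    )


def overhead_tokens(overhead_pct: int, window_tokens: int) -> int:
    """Convert a percentage overhead into tokens for a given window size."""
    return window_tokens * overhead_pct // 100


if __name__ == "__main__":
    pct = baseline_overhead_pct(mcp_servers=1, custom_agents=1)
    print(pct)                            # 14
    # 200_000 is a hypothetical window size, not a documented value.
    print(overhead_tokens(pct, 200_000))  # 28000
```

With one MCP server and one custom agent the estimate is 14%, which falls inside the 10%–16% range quoted above. Each additional MCP server adds at least another 8 points.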

Cost hierarchy (lowest → highest overhead)

  1. CLI — zero overhead (e.g., GitHub CLI, Vercel CLI, Firecrawl CLI)
  2. API — moderate overhead
  3. MCP — highest overhead

Andrej Karpathy independently confirmed this order of preference: CLI first, then API, then MCP.

Practical tactics

Monitor usage

  • Run /statusline to set up a status line that shows real-time context consumption.
  • Set color-coded thresholds:
    • Green: < 50%
    • Orange: 50–80%
    • Red: > 80%
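
The thresholds above can be sketched as a small classifier. The function name `context_color` is hypothetical and is not part of Claude Code. Treating exactly 50% and exactly 80% as orange is an assumption drawn from the "50–80%" band.

```python
def context_color(used_pct: float) -> str:
    """Map context usage (0-100) to the color bands listed above."""
    if used_pct < 0 or used_pct > 100:
        raise ValueError("used_pct must be between 0 and 100")
    if used_pct < 50:
        return "green"
    if used_pct <= 80:   # assumption: both band edges count as orange
        return "orange"
    return "red"


if __name__ == "__main__":
    for pct in (14, 50, 80, 92):
        print(pct, context_color(pct))
```

A status line script could call a function like this to decide how to color the usage figure it displays.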

Prefer CLI over MCP

  • Before adding an MCP server, ask whether a CLI tool can achieve the same result.
  • Periodically audit existing MCP integrations for CLI replacements.

Escape double-tap

  • If Claude starts drifting off-topic, press Escape twice.
  • This rolls the conversation back to the point before the erroneous prompt and removes everything after it from the context.

Counterpoints & Gaps

  • The exact percentages (2%, 8%, 4%) are approximate and may vary by model version and prompt length.
  • Some workflows genuinely require MCP-level integration; CLI substitution is not always feature-complete.
  • There is no documented guarantee that Escape double-tap permanently erases tokens from the billing context.
