Claude Mythos — Safety-Gated Frontier Model
Claude Mythos Preview is an Anthropic frontier model distributed through restricted access rather than a broad public launch. Its durable value for this wiki lies not only in the model's benchmark scores but in the release pattern: very strong cyber and autonomous-action capabilities can push frontier models into safety-gated deployment.
What it is
This is a model/entity page for Mythos. It tracks reported capabilities, restricted deployment through Project Glasswing, security-audit evidence, and the policy and economic tension around private frontier access.
Why it matters
Mythos marks a release-pattern shift: a model can be valuable enough for trusted partners and defensive security work while being considered too risky for public access. That pattern connects capability evaluation, cybersecurity, government access, partner lock-in, and AI governance.
Evidence across sources
| Source | Key Claim | Relevance |
|---|---|---|
| The Rundown 2026-03-30 | Mythos reportedly exceeded Claude Opus 4.6 on agentic coding, cyber, and terminal benchmarks | Initial capability signal |
| AINews 2026-04-08 | Security reports describe discovery of long-lived vulnerabilities in OpenBSD, FFmpeg, Linux kernel, and FreeBSD-class exploit chains | Security audit evidence |
| Ben's Bites 2026-04-09/12 | Project Glasswing provided restricted access to selected partners, making defensive deployment part of the release strategy | Release pattern |
| The Rundown 2026-05-01 | White House/Pentagon tension over access expansion and whether competitors would catch up within months | Policy tension |
| The Rundown 2026-05-25 | Glasswing partners found 10,000+ high/critical vulnerabilities in one month; Cloudflare found 2,000 bugs; Mozilla fixed 271 Firefox vulnerabilities; 62% of 6,202 flagged OSS issues were confirmed; one bank blocked a $1.5M fraudulent wire | Scale validation |
| The Rundown 2026-05-26 | Anthropic technical staff member Sholto Douglas said Mythos also solved Erdős Problem #90, which OpenAI had cracked, reaching the same result with a simpler proof | Capability signal |
| AI Briefing 2026-04-08 | Constitutional AI discussion adds a governance tension: transparent principles and restricted deployment can conflict when capability is high enough | Governance tension |
| Felix Rieseberg MAD Podcast | Dario Amodei on AI internal states and external moral oversight | Safety discourse |
Fable 5: Public Mythos-Class Model (2026-06-10)
On 2026-06-10 Anthropic released Claude Fable 5, the first generally available Mythos-class model. It shares the same underlying weights as Mythos 5 but adds safety guardrails and transparent fallback routing.
Key facts
- Pricing: $10 / million input tokens, $50 / million output tokens; cache writes $12.50 / million, reads $1 / million
- Context window: 1M tokens retained
- Included in Pro / Max / Team / Enterprise until 2026-06-22, then credit-based due to capacity constraints
- Benchmark lead: CursorBench SOTA 72.9% (+8 pts), FrontierCode Diamond 29.3% (was 13.4%), SWE-Bench Pro 80.3% vs GPT-5.5 58.6%, Humanity's Last Exam 53% (+7 pts)
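The per-token prices listed above can be turned into a rough session cost estimate. The following is a minimal sketch, assuming the four quoted list prices apply uniformly; the `Usage` container and `estimate_cost` helper are hypothetical names introduced for illustration, not part of any Anthropic SDK.

```python
from dataclasses import dataclass

# Per-million-token list prices quoted in the Key facts above (USD).
PRICE_PER_MILLION = {
    "input": 10.00,
    "output": 50.00,
    "cache_write": 12.50,
    "cache_read": 1.00,
}


@dataclass(frozen=True)
class Usage:
    """Token counts for one request or session (hypothetical container)."""
    input: int = 0
    output: int = 0
    cache_write: int = 0
    cache_read: int = 0


def estimate_cost(usage: Usage) -> float:
    """Return the estimated USD cost of `usage` at the quoted list prices."""
    total = 0.0
    for kind, price in PRICE_PER_MILLION.items():
        total += getattr(usage, kind) * price / 1_000_000
    return total


if __name__ == "__main__":
    # 200k input + 20k output + 1M cached-read tokens: 2.00 + 1.00 + 1.00
    session = Usage(input=200_000, output=20_000, cache_read=1_000_000)
    print(f"${estimate_cost(session):.2f}")  # prints "$4.00"
```

Because output tokens cost five times as much as input tokens, long agentic runs that generate a lot of text are dominated by the output term, which is consistent with the per-task costs builders report elsewhere on this page.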
Safety architecture
- Transparent fallback: cyber / bio / chemistry / distillation requests route to Opus 4.8; affects <5% of sessions
- Silent intervention for frontier-LLM-development tasks (pretraining pipelines, distributed training infrastructure, ML accelerator design); estimated ~0.03% of traffic
- No Zero Data Retention (ZDR): 30-day retention required on all traffic for Mythos-class models; Anthropic states it will not use the data to train new Claude models
Community reaction
- Capability-first camp (Felix Rieseberg, Alex Albert, Karpathy, bcherny) describes a step-function improvement and a shift from "giving AI tasks" to "giving it responsibilities"
- Trust-and-openness camp (Natolambert, Dean Ball, Jeremy Howard, Eric Zelikman, rasdani_, bayeslord) argues silent throttling undermines reproducibility, research integrity, and enterprise predictability
- Neutral/mixed (Karpathy, finbarrtimbers): model quality is exceptional but launch safeguards are over-sensitive; Anthropic likely sincerely believes the interventions are necessary
Enterprise implications
- Predictability concern: if a provider can silently degrade outputs based on inferred task category, users cannot know whether failures come from the model, the prompt, or hidden intervention
- Supply-chain risk: pushes some organizations toward open weights or in-house models
- Research reproducibility: hidden intervention damages scientific attribution and benchmarking validity
Responses to the Fable 5 access revocation
- Model sovereignty risk: Non-US researchers and organizations explicitly framed the revocation as a sovereignty risk: the concern that closed frontier APIs can disappear overnight due to export controls. This validates the "safety-gated release" pattern as a geopolitical access control mechanism, not just a technical safeguard.
- Builder pivot: Dan Shipper shared a before/after image showing his Claude app usage vs Codex app usage following the Fable ban, illustrating how builders shift tools when access is disrupted. Aaron Levie described the ban as a preview of model-layer regulation that would slow AI progress dramatically. Greg Isenberg released a 25-minute episode and guide on getting good at local models as insurance against government-controlled AI access.
- Skill catalog hedging: shadcn proposed treating intelligence as borrowed: drain frontier models while they are available, build a catalog of plans today, and implement later with cheaper or open-source models you control. This "skill catalog" pattern gained 3,300+ likes and formalizes a hedging strategy against vendor revocation.
- Workaround timing: Anthropic attempted to soften impact by resetting 5-hour and weekly rate limits, but core access remained revoked. Anthropic believes the jailbreak identified by the government is narrow rather than universal, surfacing only minor vulnerabilities that other public models are already susceptible to. The government reportedly provided only "verbal evidence of a potential narrow, non-universal jailbreak."
Fable 5 US government suspension and revocation (2026-06-13 AINews)
- On 2026-06-13, Anthropic announced that a US government directive had forced the suspension of Claude Fable 5 and Mythos 5 access for all customers, including foreign nationals. The stated reason: "possible jailbreak being a national cybersecurity risk."
- Anthropic said the order was based on a "capability report it disputes" and that similar capabilities are "widely available" in other models including GPT-5.5. The government reportedly provided only "verbal evidence of a potential narrow, non-universal jailbreak."
- Immediate downstream impact: Cognition/Devin and Agent Arena removed Fable/Mythos support; Artificial Analysis noted "the first time our Intelligence Frontier chart has moved backward."
- Anthropic attempted to soften impact by resetting 5-hour and weekly rate limits, but core access remained revoked.
- Community response: Open-source AI advocacy trended; non-US researchers and organizations explicitly framed this as "model sovereignty risk", the concern that closed frontier APIs can disappear overnight due to export controls.
- Reddit activity: An r/Singularity megathread (Activity 1387) consolidated discussion; users who had just purchased $250 "Max 20x Usage" plans specifically for Fable 5 reported immediate disruption. One technical speculation: the government may fear Fable 5 could help identify or patch zero-day vulnerabilities that US agencies prefer remain undisclosed.
- Fable 5 global ban / access restriction (2026-06-13): Anthropic imposed a global ban on Fable 5 access for foreign nationals. Multiple researchers reported losing access within days of release. Yuchen Jin described 3 days with Fable 5 as "transformative" and "ASI-level," and suggested this may be the last time foreign nationals are allowed to touch such an intelligent model, with the cutoff made for non-technical reasons. The community response has been a pivot toward open-source alternatives, with the expectation that an open model will surpass Mythos within 6 months. This validates the "safety-gated release" pattern as a geopolitical access control mechanism, not just a technical safeguard. (AI Briefing 2026-06-13 evening)
- Fable 5 rapid community adoption (2026-06-13): Claude's official account showcased projects users built with Fable 5 in just a couple of days, receiving 44K+ likes. Projects include: Quinn Leng's 3D galaxy visualization of 33,324 real company objects (threads, messages, files, members) where busier threads grow bigger crystals. This signals Fable 5 may be one of the most impactful model releases in recent memory despite subsequent access restrictions. (AI Briefing 2026-06-13 evening)
- Ilya Sutskever predictions validated (2026-06-13): Logan Kilpatrick stated that Ilya Sutskever was right and predicted much of the current AI access restriction and safety debate landscape, validating early warnings about frontier model governance. This strengthens the position of safety-first advocates and may push toward proactive safety frameworks and international coordination. (AI Briefing 2026-06-13 evening)
Silent degradation backlash and partial reversal (2026-06-12)
- With the Fable 5 rollout, Anthropic introduced a policy under which the model would "covertly degrade" outputs for AI/ML-related use cases. This drew significant public backlash from researchers and developers.
- Simon Willison welcomed the rollback; MTS live summarized Anthropic reversing the policy; Kim Monismus framed it as a retreat after researcher criticism.
- Code Star argued that safeguards are normal but "obfuscation without warning" violates the user/provider contract. Clement Delangue said avoiding AI manipulation is important.
- Ryan Greenblatt said blocking frontier AI R&D may be reasonable in principle, but silent sandbagging is not; he advocated for access programs with KYC/monitoring for safety/security researchers rather than broad capability denial.
- Nathan Lambert (natolambert) gave the most detailed critique: the main error was uneven safety implementation that misled users, undermined trust, and reinforced concentration of power over who gets to do frontier research.
- Gergely Orosz turned this into an engineering recommendation: put models behind provider-agnostic routers/harnesses so teams can switch vendors quickly when terms or behavior become unacceptable.
- Anthropic partially walked back the "secretly" part of the policy within roughly a day.
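The router recommendation attributed to Gergely Orosz above can be sketched in a few lines. This is an illustrative sketch only, assuming each vendor is wrapped in a plain callable that raises on failure or refusal; `Router`, `ProviderError`, `revoked_frontier`, and `local_model` are hypothetical names, and no real vendor SDK is called.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Raised by a provider adapter when a call fails or is refused."""


# A provider adapter is any callable taking a prompt and returning text.
# Real adapters would wrap a vendor SDK; here they are plain functions.
Provider = Callable[[str], str]


class Router:
    """Try providers in priority order and return the first success."""

    def __init__(self, providers: Sequence[tuple[str, Provider]]):
        if not providers:
            raise ValueError("at least one provider is required")
        self._providers = list(providers)

    def complete(self, prompt: str) -> tuple[str, str]:
        """Return (provider_name, response) from the first working provider."""
        errors = []
        for name, call in self._providers:
            try:
                return name, call(prompt)
            except ProviderError as exc:
                errors.append(f"{name}: {exc}")
        raise ProviderError("all providers failed: " + "; ".join(errors))


def revoked_frontier(prompt: str) -> str:
    """Stand-in for a frontier API whose access has been withdrawn."""
    raise ProviderError("access revoked")


def local_model(prompt: str) -> str:
    """Stand-in for a locally hosted open-weights model."""
    return f"[local] {prompt}"


if __name__ == "__main__":
    router = Router([("frontier", revoked_frontier), ("local", local_model)])
    print(router.complete("summarize the diff"))
```

Returning the provider name alongside the response keeps the switch visible to the caller, which matters here: the criticism in this section is aimed at hidden changes in behavior, so a harness that falls back silently would reproduce the same problem one layer up.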
Fable 5 capabilities and product behavior (2026-06-12)
- htihle reported 87.8% on WeirdML, making it the first model to average above 70% on every task there.
- ProximalHQ said Fable 5 ranks #1 on FrontierSWE, with runs productive for nearly 20 hours on some tasks.
- threepointone spent about $250 on a ~10k LOC PR and did not find it worth it; Cline said cheaper models plus adversarial review loops often match or beat it on cost/perf.
- tamaybes described Fable inventing internal "codenames" during coding, leaking its own "neuralese" into outputs.
- scaling01 pointed to 200/200 refusals on ProgramBench, while thoughtfullab and karinanguyen highlighted unusually strong post-training/AI-improves-AI behavior.
- Benchmarks suggested sharp asymmetries depending on task framing.
Real-world usage patterns (2026-06-11 briefing)
- Targeted optimization: Mitchell Hashimoto used Fable 5 to optimize a SwiftUI-layout resolver in Go from microsecond to nanosecond scale — a result he could not reach himself — but it took 2 hours and $40, with some overfitting to Apple Silicon that had to be clawed back. Fable 5 vs GPT-5.5 vs GLM-5.1 head-to-head on "implement this feature" tasks: all produced equally acceptable final results, but GPT-5.5/GLM took minutes while Fable churned for 40 minutes; costs were GLM <$1, GPT-5.5 ~$1.50, Fable $9. His verdict: reserve Fable for targeted, surgical analysis and optimization, not daily driving.
- Multi-agent competition: enzo_gte demonstrated Fable's power by spawning N agents on different work trees to solve the same hard problem, with one reviewer agent picking the best answer. The insight: 50+ "150 IQ people" trying the same problem means even a 1/50 success rate yields a novel breakthrough the prompter was not privy to.
- Autonomous video production: Thariq (Claude Code team) had Fable edit its own launch video end-to-end — transcription, shot selection, ffmpeg rough cut, custom .cube LUT generation, Remotion motion graphics, Figma MCP handoff — without opening a video editor. The entire edit is files (JSON, LUTs, React components), making it reviewable, rerunnable, and reusable.
- One-shot app builds: Riley Brown one-shot built a Replit-like mobile app (build, preview, browser, edit) using Daytona sandboxing and Convex DB. Todd Saunders transcribed a customer call in the background and had Fable build the requested features in real time, showing a working product by the end of the call.
- Cost-speed-capability trade-off: Multiple builders (Mitchellh, enzo, clairevo, Alex Finn) converged on the same pattern: Fable is too slow and expensive for interactive daily tasks, but unmatched for "put it in a back room and let it churn for hours" problems. The factory metaphor: Fable is the staff engineer you isolate for deep refactoring, not the pair programmer for feature iteration.
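The arithmetic behind the multi-agent competition pattern above can be made explicit. The sketch below assumes each agent succeeds independently with the same probability, which is an idealization (agents sharing one model and one prompt are likely correlated); the function names are hypothetical.

```python
import math


def p_at_least_one(p_single: float, n_agents: int) -> float:
    """Probability that at least one of n independent agents succeeds."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be in [0, 1]")
    if n_agents < 0:
        raise ValueError("n_agents must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_agents


def agents_needed(p_single: float, target: float) -> int:
    """Smallest n such that p_at_least_one(p_single, n) >= target."""
    if not 0.0 < p_single < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p_single and target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))


if __name__ == "__main__":
    # 50 agents, each with a 1-in-50 chance of solving the problem.
    print(f"{p_at_least_one(1 / 50, 50):.3f}")  # prints "0.636"
    # Agents required for a 95% chance of at least one success.
    print(agents_needed(1 / 50, 0.95))  # prints "149"
```

Under these assumptions, 50 agents at a 1/50 success rate give roughly a 64% chance of at least one solution rather than a guarantee, and the reviewer agent still has to recognize the correct answer among the failures.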
Security Audit Signal
The old security-audit page has been merged here because it is evidence for the broader Mythos entity and safety-gated release concept. Key searchable terms: FFmpeg H.264, OpenBSD 27-year vulnerability, Linux kernel memory corruption, FreeBSD RCE, ROP chain, AI security auditing, Project Glasswing.
May 2026 Scale Update
Project Glasswing released its first-month results (The Rundown, 2026-05-25):
- 10,000+ high or critical severity vulnerabilities found in essential software within one month.
- Cloudflare alone found 2,000 bugs, with a false positive rate lower than that of human testers.
- Mozilla found and fixed 271 vulnerabilities in Firefox 150.
- 1,000+ open-source projects scanned; Mythos flagged 6,202 issues as high/critical. After independent triage, 62% (nearly 3,900) held up.
- One partner bank used Mythos to detect and block a $1.5M fraudulent wire transfer.
- Glasswing will expand to additional partners, including U.S. and allied governments, with a general release of Mythos-class models to follow.
Anthropic says Mythos remains gated because no company — including itself — has safeguards strong enough to prevent misuse. The defensive value is proven; the offensive risk is the gating factor.
Main Tensions
- Safety vs transparency: Constitutional AI emphasizes visible principles, while Mythos access is restricted and Fable 5 introduces silent interventions.
- Defense vs offense: Cyber capability can help defenders find vulnerabilities, but the same capability changes attacker economics.
- Private frontier vs ecosystem fairness: Partner-only access can improve safety and create lock-in at the same time.
- Capability lead vs catch-up window: If open or competing models reach similar cyber capability within 6-12 months, restricted release buys limited time.
- Silent intervention vs reproducibility: Hidden capability steering undermines research validity and enterprise trust even if it affects only 0.03% of traffic.
Open questions
- Should “safety-gated model release” become a standalone concept page once more examples appear?
- How should the wiki separate confirmed Mythos facts from reported benchmark claims and strategic interpretation?
- What governance model decides which organizations receive restricted frontier access?
- Will silent intervention become a norm for frontier models, and how will the open-source ecosystem respond?