Anthropic Offers Mythos Upgrade for Cyber Partners and a 'Safe' Version for the Rest of You

Overview

Anthropic released its newest frontier AI model in two forms on June 9, 2026. The public gets Claude Fable 5, a version the company says cannot be used to carry out cyberattacks because built-in safety classifiers block risky requests. A small group of vetted cyber-defense and biology partners gets Claude Mythos 5, the same underlying model with those safety limits removed in specific areas.

Key Takeaways

Fable 5 and Mythos 5 share one base model, so they have the same core skills in software engineering, knowledge work, vision, and scientific research. The only difference is the guardrails.
Fable 5 runs separate classifier systems that watch for misuse and hand risky prompts off to the older, more restricted Claude Opus 4.8 model instead of answering them.
Mythos 5, the unrestricted version, stays limited to approved organizations through Project Glasswing, an initiative Anthropic runs with the US government and major tech and infrastructure firms.
Anthropic tuned the safeguards so fewer than 5 percent of Fable sessions trigger the fallback, meaning the large majority of public use runs on the full new model.
External testing, including a bug bounty program with more than 1,000 hours of testing, found no universal jailbreaks of Fable's protections.
All traffic on these models is kept for 30 days, is used only for safety and abuse defense, and is not used to train the model.

Stats & Key Facts

›Fewer than 5 percent of Fable 5 sessions trigger the safety fallback to the older Opus 4.8 model.
›More than 1,000 hours of external bug bounty testing found no universal jailbreaks of Fable 5's safeguards.
›Zero harmful single-turn cyberattack requests were answered across 30 public jailbreak techniques in one partner's testing.
›Fable 5 became the first Anthropic model to break 90 percent on core analytics benchmarks, a roughly 10-point jump over Opus.
›Usage traffic is retained for 30 days for safety purposes, then deleted in nearly all cases.
›One partner reported analytical runs finishing 25 to 30 percent faster than on Opus 4.8.

Two versions of one frontier model, split by who can be trusted

Anthropic took a single new model and shipped it as two separate products with different risk levels.

Claude Fable 5 is the public version, available to all users starting June 9, 2026. Claude Mythos 5 is the same base model with safety limits lifted in certain areas, and access stays restricted to approved partners.

Because both share the same foundation, they perform the same on most tasks. Anthropic says the model reaches state-of-the-art results across software engineering, knowledge work, vision, and scientific research. The split is about safety controls, not raw ability. Fable carries the full set of guardrails; Mythos has them lifted in specific areas.

How Fable 5's classifiers block cyberattacks, bioweapons, and copycats

Fable 5 runs separate watchdog systems that sit alongside the main model and screen what users ask.

›Cybersecurity: blocks help with exploitation and offensive cyber tasks, then routes the request to the older Opus 4.8 model.
›Biology and chemistry: broadly restricts answers that might aid misuse of dual-use research.
›Distillation: stops attempts to extract the model's knowledge to train a competing system.
›Jailbreak watch: the classifiers also catch attempts to trick the model into ignoring its rules.

When a classifier flags a request in one of these areas, Fable hands the request off to Claude Opus 4.8, an older and more limited model, instead of answering it. Anthropic tuned this trigger so it fires in fewer than 5 percent of sessions, meaning most public sessions still run on the full new model.
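The hand-off described above can be pictured as a small routing step. The sketch below is illustrative only and is not Anthropic's implementation: the model labels, the area names, the `route_request` function, and the keyword-based `toy_classifier` are all hypothetical stand-ins, and a real classifier would be a trained system rather than a keyword match.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical labels for illustration; these are not real API identifiers.
PRIMARY_MODEL = "fable-5"
FALLBACK_MODEL = "opus-4.8"

# The four screened areas named in the article (labels are assumptions).
SCREENED_AREAS = ("cybersecurity", "biology_chemistry", "distillation", "jailbreak")


@dataclass
class RoutingDecision:
    model: str                   # which model should answer
    flagged_area: Optional[str]  # screened area that was hit, if any


def route_request(prompt: str, classify: Callable[[str], Optional[str]]) -> RoutingDecision:
    """Pick the model that answers, given a classifier's verdict.

    `classify` stands in for the separate classifier systems: it returns
    the name of a screened area, or None when nothing is flagged.
    """
    area = classify(prompt)
    if area in SCREENED_AREAS:
        # Flagged request: hand off to the older, more restricted model.
        return RoutingDecision(model=FALLBACK_MODEL, flagged_area=area)
    # Unflagged request: the full new model answers.
    return RoutingDecision(model=PRIMARY_MODEL, flagged_area=None)


def toy_classifier(prompt: str) -> Optional[str]:
    """Keyword toy for demonstration only."""
    if "exploit" in prompt.lower():
        return "cybersecurity"
    return None
```

Under these assumptions, `route_request("Write an exploit for this server", toy_classifier)` selects the fallback model, while an ordinary request such as summarizing a report stays on the primary model.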

Project Glasswing keeps the unrestricted Mythos 5 inside a vetted circle

Mythos 5 stays limited to a defined group of organizations rather than being open to the public.

Access to the unrestricted Mythos 5 runs through Project Glasswing, an initiative Anthropic operates with the US government to secure critical software. Early access goes to cyber-defense teams, critical infrastructure operators, government partners, and selected life-sciences researchers. Organizations that already had the earlier Mythos Preview move up to Mythos 5 automatically.

Glasswing launched with a roster of major firms, including Amazon Web Services, Apple, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Anthropic has committed up to 100 million dollars in usage credits and 4 million dollars in donations to open-source security groups to support the work, and has expanded the program to around 150 more organizations in power, water, healthcare, and telecommunications.

Outside testing found no universal way around the safeguards

Anthropic points to external red-teaming rather than its own word to back the safety claims.

›An external bug bounty ran more than 1,000 hours of testing and found no universal jailbreaks.
›External red-teaming groups found no universal jailbreaks on long-form agentic tasks.
›The UK AI Security Institute (formerly the AI Safety Institute) made limited early progress toward a single jailbreak.
›One partner reported Fable answered zero harmful single-turn cyberattack requests across 30 public jailbreak techniques.

Benchmark scores and real-world coding wins

Anthropic backs the capability claims with benchmark results and customer reports.

›Fable 5 became the first Anthropic model to break 90 percent on core analytics benchmarks, about 10 points above Opus.
›Cognition's FrontierCode test gave Fable the highest score among frontier models at medium effort.
›Hebbia's finance benchmark rated Fable highest for senior-level reasoning.
›At Stripe, Fable compressed months of engineering into days on a 50-million-line codebase migration.

In scientific work, Anthropic says Fable outperformed specialized protein models on gene therapy design tasks and produced novel molecular biology hypotheses that human reviewers preferred about 80 percent of the time in blind comparisons. One partner reported analytical runs finishing 25 to 30 percent faster than on Opus 4.8.

What the safe-versus-unrestricted split means for everyday business readers

The two-tier release is Anthropic's answer to a hard trade-off between capability and safety.

The core idea is plain. A model strong enough to find and fix software flaws is also strong enough to write attacks. Rather than hold the model back for everyone or release it with no limits, Anthropic gave the public a guarded version and reserved the unguarded one for organizations it screens.

For most business users, the practical effect is small. The fewer-than-5-percent fallback rate means typical work, such as coding, analysis, and writing, runs on the full model. The guardrails mostly matter at the edges, where requests touch cyberattacks, dangerous biology, or attempts to copy the model. Anthropic also keeps 30 days of usage data to defend against abuse and reduce false alarms, then deletes it in nearly all cases.
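For readers who think in terms of data policy, the 30-day retention window amounts to a simple age check. This is a minimal sketch under stated assumptions: the `is_past_retention` function and its arguments are hypothetical, and the unspecified exceptions behind "nearly all cases" are not modeled.

```python
from datetime import datetime, timedelta, timezone

# The 30-day figure comes from the article; everything else is illustrative.
RETENTION_WINDOW = timedelta(days=30)


def is_past_retention(recorded_at: datetime, now: datetime) -> bool:
    """Return True once a record is older than the 30-day window."""
    return now - recorded_at > RETENTION_WINDOW


# Example: traffic recorded on launch day, checked 29 and 31 days later.
recorded = datetime(2026, 6, 9, tzinfo=timezone.utc)
day_29 = recorded + timedelta(days=29)
day_31 = recorded + timedelta(days=31)
```

With these values, the record is still inside the window at `day_29` and past it at `day_31`.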

Frequently Asked Questions

What is the difference between Claude Fable 5 and Claude Mythos 5?

They are the same underlying model. Fable 5 is the public version with safety classifiers that block misuse involving cyberattacks, biology, chemistry, and attempts to copy the model. Mythos 5 is the same model with those limits removed in certain areas, restricted to approved partners.

Why does Anthropic say Fable 5 cannot be used for cyberattacks?

Fable 5 runs separate classifier systems that screen requests. When one detects an offensive cyber task, it stops the new model from answering and hands the request to the older, more restricted Claude Opus 4.8 instead.

Who gets access to the unrestricted Mythos 5?

Access runs through Project Glasswing and goes to vetted cyber-defense teams, critical infrastructure operators, government partners, and selected life-sciences researchers. Existing Mythos Preview users upgrade automatically.

How often do the safeguards interfere with normal use?

Anthropic tuned the fallback to trigger in fewer than 5 percent of Fable sessions, so the large majority of ordinary work runs on the full new model rather than the older one.

How long does Anthropic keep usage data on these models?

Traffic is retained for 30 days, used only to defend against attacks and cut false positives, and not used for model training. It is deleted after 30 days in nearly all cases.

Anthropic's two-tier release shows how a leading AI lab is trying to ship its most capable model widely while keeping its most dangerous uses behind a vetted door. For most business users, the safe Fable 5 behaves like the full model, with guardrails that only bite at the edges.