Anthropic's smartest Claude comes with a leash

Anthropic just released the most powerful AI model it's ever built. It also told the model not to answer a lot of your questions.

On Monday, Anthropic launched Claude Fable 5 and Mythos 5, the first models in its new Mythos tier and the most capable Claudes to date. Fable 5 is the version regular customers get to use. Mythos 5 is the internal model Anthropic keeps for its own research and a small set of partners. Pricing comes in at $10 per million input tokens and $50 per million output tokens, which the company is framing as a discount compared to its earlier Mythos Preview.
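For a rough sense of what those list prices mean per request, here is a minimal sketch in Python. It assumes billing is a simple linear function of token counts at the two quoted rates. The function name and the example token counts are illustrative, not part of any Anthropic API, and real invoices may differ (caching, batching, or tiered discounts are not modeled).

```python
# Quoted list prices from the launch (USD per one million tokens).
INPUT_USD_PER_MILLION = 10.0
OUTPUT_USD_PER_MILLION = 50.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming purely linear per-token pricing."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Hypothetical request: 200,000 tokens of code in, 20,000 tokens of patch out.
print(estimate_cost_usd(200_000, 20_000))  # 3.0
```

Output tokens cost five times as much as input tokens, so long generated answers drive the bill faster than long prompts do.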

The capability jump is real:

On SWE-bench Pro, the benchmark for whether an AI can actually fix real software bugs, Fable 5 scored 80.3%, compared to GPT-5.5 at 58.6% and Gemini 3.1 Pro at 54.2%.
Stripe ran it on a 50-million-line Ruby codebase migration and said Fable 5 finished in a single day what would have taken its engineering team more than two months by hand.
It also beat Pokémon FireRed using nothing but screenshots, which Anthropic flagged as proof the model can handle messy, generalized tasks without an elaborate setup.

That's the marketing. Then there are the restrictions.

Tucked into the launch is a quieter set of rules on what Fable 5 will actually answer. According to Business Insider, the model is trained to refuse questions in biology and cybersecurity that could plausibly help someone build a weapon or a serious exploit. That ends up including some questions about cancer, because oncology touches the same underlying biology Anthropic doesn't want explained. The company estimates the restrictions affect around 0.03% of traffic, concentrated in fewer than 0.1% of organizations.

Anthropic also told Fable 5 not to be especially helpful with AI research itself, Business Insider reported separately. The reasoning is that the company doesn't want its smartest model accelerating the development of even smarter ones, including its own future systems. Which makes the testimonial from Sean Ward, an AI research CEO, slightly awkward. He told Anthropic that Fable 5 "works at senior research scientist grade." It just isn't allowed to.

The safeguards do have teeth. Anthropic's cyber classifiers caught 407 out of 410 attempted misuse cases in testing, a 99.3% catch rate. They need to. The underlying Mythos 5 produced working exploits for 88% of trial Firefox vulnerabilities, versus 8.8% for the previous Opus model. The leash exists because the model genuinely needs one.
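The arithmetic behind those figures is easy to check. The sketch below uses only the numbers quoted above; the variable and function names are illustrative.

```python
def rate(successes: int, total: int) -> float:
    """Return successes as a fraction of total."""
    return successes / total


# Classifier results quoted above: 407 of 410 misuse attempts caught.
caught, attempts = 407, 410
catch_rate = rate(caught, attempts)
missed = attempts - caught

# Exploit success rates quoted above, as fractions.
mythos_exploit_rate = 0.88
opus_exploit_rate = 0.088
exploit_ratio = mythos_exploit_rate / opus_exploit_rate

print(f"catch rate: {catch_rate:.1%}, missed: {missed}")
print(f"Mythos 5 vs. previous Opus: {exploit_ratio:.0f}x")
```

Run as written, this shows a 99.3% catch rate with 3 attempts slipping through, and a tenfold jump in exploit success over the previous Opus model.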

All of this lands eight days after Anthropic confidentially filed its S-1 with the SEC. The pitch to public markets has been that Anthropic is the safety-first lab enterprises can trust. Fable 5 is that pitch turned into a product.

The bet is that customers will pay a premium for a model that won't get them in trouble, even when it occasionally won't help them either. That works fine for Stripe migrating codebases and law firms doing redlines. It works less well if you're a cancer researcher, or anyone whose real work happens to share territory with the things Anthropic has decided are too risky to touch. Nathan Lambert at Interconnects called the restrictions a likely "cautionary fable" about how narrow definitions of safety rarely hold up. The real test isn't whether the leash works. It's whether paying customers stay comfortable with a model that quietly decides what they shouldn't ask.
