Anthropic accidentally leaked their best model. Here's what I actually think about it.

On March 26, Fortune reported that Anthropic had left roughly 3,000 internal documents in a publicly searchable data store. Among them: a draft blog post announcing Claude Mythos — a new model tier they’re calling “by far the most powerful AI model we’ve ever developed.”

Anthropic confirmed it. Called it a “step change.” Said early-access customers are already using it.

I’ve been thinking about this for a couple of days. Not the leak itself (that’s just an embarrassing misconfigured bucket) but what it actually means if you’re picking models and building with them.

What Anthropic actually wrote

The leaked draft is unusually direct. Some key lines:

“Mythos is a new name for a new tier of model: larger and more intelligent than our Opus models — which were, until now, our most powerful. We chose the name to evoke the deep connective tissue that links together knowledge and ideas.”

“Mythos is currently far ahead of any other AI model in cyber capabilities and heralds an imminent wave of models that can exploit vulnerabilities in ways that far exceed the efforts of defenders.”

“It’s very expensive for us to serve, and will be very expensive for our customers to use.”

That last one matters. They’re not hiding it.

The new hierarchy

Mythos introduces a fourth tier that sits above Opus. The lineup:

Mythos / Capybara: above Opus, most capable, not yet public
Opus: currently the top tier you can actually use
Sonnet: the balanced workhorse
Haiku: fast and cheap

Capybara is the internal name for the product tier, and Mythos is the first model in it. Anthropic doesn’t usually create new tiers. When they do, it signals a real capability jump, not just a few extra benchmark points.

The cybersecurity-first release

The rollout plan is unusual. Rather than a standard API release, Anthropic is explicitly targeting cyber defenders first:

“We’re releasing it in early access to organizations, giving them a head start in improving the robustness of their codebases against the impending wave of AI-driven exploits.”

This isn’t marketing framing. Anthropic is essentially saying: this model is so capable at finding and exploiting vulnerabilities that we want the defense side to see it first. That’s a meaningful admission about what the model can actually do.

What I actually care about

I pick models every day. Henry (our AI agent) runs on Sonnet by default and bumps to Opus when something’s genuinely hard. The question for any new model is always the same: what does it unlock that the cheaper model can’t do?

The honest answer with Mythos: I don’t fully know yet. It’s not public. But based on where current models break, my guess is that Mythos pushes the ceiling on long-horizon tasks. The stuff where you give an agent a complex multi-step problem and it starts losing coherence halfway through. Contradicting earlier decisions. Missing constraints it set for itself.

That failure mode is the main reason humans still need to babysit agents. If Mythos genuinely raises that ceiling, it’s not a marginal improvement.

The cost problem

Opus is already painful at scale. Anthropic is explicitly flagging that Mythos is more expensive to serve than anything they’ve released before, and they’re working on efficiency before any general release.

That means the discipline of task routing matters more than ever. The teams that benefit most will be the ones who already know exactly which tasks justify the expensive model. Everyone else will route everything through it and wonder why the ROI isn’t obvious.
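To make the routing point concrete, here is a minimal sketch of the idea, not how Henry is actually wired up. Everything in it is an assumption for illustration: the Task fields, the 30-step threshold, and especially the cost numbers, which are made-up relative units and not real pricing. The only thing taken from the lineup above is the ordering of the tiers.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Hypothetical relative cost per task. These are NOT real prices;
# only the ordering (haiku < sonnet < opus < mythos) reflects the lineup.
RELATIVE_COST = {"haiku": 1, "sonnet": 4, "opus": 20, "mythos": 100}


@dataclass
class Task:
    name: str
    steps: int                       # rough count of dependent steps
    failed_on_cheaper: bool = False  # did a cheaper tier already fail on it?


def route(task: Task, top_tier_available: bool = False) -> str:
    """Pick the cheapest tier that plausibly handles the task."""
    if task.steps <= 1:
        return "haiku"               # trivial one-shot work
    if not task.failed_on_cheaper:
        return "sonnet"              # default workhorse
    # Escalate only after a cheaper tier has actually failed.
    if task.steps >= 30 and top_tier_available:
        return "mythos"              # long-horizon work, top tier reachable
    return "opus"


def total_cost(tasks: Iterable[Task], router: Callable[[Task], str]) -> int:
    """Sum the relative cost of running every task through the router."""
    return sum(RELATIVE_COST[router(t)] for t in tasks)


if __name__ == "__main__":
    tasks = [
        Task("classify ticket", steps=1),
        Task("refactor module", steps=5),
        Task("multi-repo migration", steps=40, failed_on_cheaper=True),
    ]
    print(total_cost(tasks, route))                # 25
    print(total_cost(tasks, lambda t: "mythos"))   # 300
```

With these invented numbers, routing everything through the top tier costs twelve times as much as routing selectively. The real ratio will depend on actual pricing, which isn’t public.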

What doesn’t change

The model isn’t the bottleneck.

The teams seeing the most value from AI right now aren’t the ones with the biggest models. They’re the ones who’ve built discipline around when to use which model, how to structure context, and how to design workflows that degrade gracefully when the model gets it wrong.
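One way to read “degrade gracefully” is: check the output, escalate to the next tier if it fails the check, and hand off to a human when nothing passes. The sketch below assumes each model is just a callable that takes a prompt and returns text. The function names and the JSON validity check are hypothetical stand-ins, not any vendor’s API.

```python
import json
from typing import Callable, Optional, Sequence


def is_json(text: str) -> bool:
    """Example validity check: does the output parse as JSON?"""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


def run_with_fallback(
    prompt: str,
    models: Sequence[Callable[[str], str]],
    is_valid: Callable[[str], bool],
) -> Optional[str]:
    """Try each model in order (cheapest first) and return the first valid output.

    Returns None when every tier fails, so the caller can route the task
    to a human instead of shipping a bad answer.
    """
    for call_model in models:
        try:
            output = call_model(prompt)
        except Exception:
            continue  # treat a failed call the same as an invalid answer
        if is_valid(output):
            return output
    return None


if __name__ == "__main__":
    # Stub models standing in for real API calls.
    cheap = lambda prompt: "sorry, here is some prose instead"
    strong = lambda prompt: '{"status": "ok"}'

    print(run_with_fallback("summarize as JSON", [cheap, strong], is_json))
    print(run_with_fallback("summarize as JSON", [cheap], is_json))  # None
```

The design choice that matters is the explicit None: the workflow knows when it has run out of models, rather than passing along whatever the last one produced.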

Capybara will be expensive. The people who benefit most will be the ones who can precisely identify the handful of tasks where it matters.

The meta-point

There’s something interesting about the timing. The AI model landscape has been relatively stable for a few months. Sonnet, Opus, GPT-5, Gemini 3 Pro, take your pick. Everyone knows roughly what they’re working with.

Mythos resets that. And with Anthropic reportedly heading toward an IPO this year, the timing of this buzz isn’t accidental, even if the leak itself was.

I’ll update this once I can actually get my hands on it.