Anthropic's Most Powerful Model Just Leaked — And It Can Find Zero-Days in Every Major OS
On March 26, 2026, a CMS misconfiguration at Anthropic accidentally published internal documentation describing a model called Mythos. By the time the pages were pulled down, screenshots were everywhere. On April 7, Anthropic stopped pretending the leak hadn't happened and officially previewed the model.
Here's what they've built — and why you can't have it.
The Benchmarks Are Absurd
Mythos Preview posted numbers that made the AI research community collectively stop what they were doing:
- SWE-bench Verified: 93.9% — resolving nearly every real-world software engineering task thrown at it
- SWE-bench Pro: 77.8%
- Terminal-Bench 2.0: 82%
- USAMO 2026: 97.6% — a math olympiad that breaks most humans
These aren't marginal improvements over Claude Opus 4.6. These are double-digit leads on every benchmark where the models can be compared. The closest comparisons Anthropic offered were GPT-5.4 and Opus 4.6, and Mythos cleared both by margins that suggest it's operating in a different category.
The Cybersecurity Angle Is the Real Story
The headline capability — and the reason this model isn't going on the API — is what Anthropic calls Project Glasswing: a cybersecurity initiative where Mythos Preview was used to identify thousands of previously unknown zero-day vulnerabilities across every major operating system and browser.
Read that again. Every major OS and browser.
Axios reported that policy circles are alarmed. The concern isn't hypothetical: a model capable of discovering that many unknown vulnerabilities is also capable of exploiting them. Anthropic has been explicit that it won't release Mythos publicly, citing the offensive cyber risk.
Who Can Use It?
Forty organizations have preview access through Project Glasswing. Anthropic hasn't published the full list, but the program appears to include major software vendors, government contractors, and security firms. Access is structured so that Glasswing partners use Mythos to patch the vulnerabilities it finds, not to hunt for new ones on their own.
The stated long-term goal is figuring out how to deploy Mythos-class models safely at scale. That framing suggests Anthropic is treating this as infrastructure R&D rather than a product launch.
What This Means for the Rest of the AI Industry
For the past two years, the AI race has been about public benchmarks and developer adoption. Mythos changes the conversation. If the most capable model is also the one that can't be safely released, a new dynamic emerges: frontier capability and commercial availability are no longer the same thing.
That's new. And it's going to force everyone — competitors, regulators, enterprises — to think differently about what "having the best AI" even means.
Our Take
The security implications are real and the caution is warranted. But the more interesting story is what Mythos's existence tells us about where the technology is heading. A model that can autonomously discover software vulnerabilities at scale isn't just an AI — it's a force multiplier for anyone who controls it.
The question of who gets access to tools like this, and under what conditions, is the defining policy question of the next decade.
In the meantime, the models available today are transformative for the businesses that actually use them. We help you figure out which ones and how.