Anthropic shipped the model it couldn't ship
Project Glasswing is what happens when your model is too dangerous to release but too useful to shelve.
Anthropic built a model that can find zero-days faster than almost any human security researcher. Then they decided not to release it.
Claude Mythos 2 Preview has already found thousands of high-severity vulnerabilities across every major operating system and web browser. That's the capability threshold where a general release becomes a weapon. So instead of shipping it to the public, Anthropic invented Project Glasswing: a framework for deploying the model to 40+ vetted organizations that build or maintain critical software infrastructure.
Distribution itself became the safety feature.
The model you can't download
Mythos Preview isn't on the API playground. You can't fine-tune it. There's no waitlist form. Anthropic's position: AI models now surpass all but the most skilled humans at finding and exploiting software vulnerabilities. That's not marketing. That's the eval result that triggered Glasswing.
OpenAI launched GPT-5.4-Cyber last week with KYC gates and identity verification. Anthropic went the opposite direction: no consumer access at all. Instead, they extended Mythos Preview access to launch partners who scan banking systems, medical record systems, logistics networks, and power grids, the organizations that would be targets if this capability proliferated.
The commitment: share findings across the industry. Defenders get the advantage before attackers even know what's possible.
Governance before regulation
Anthropic saw an internal eval, recognized the dual-use risk, and invented a deployment model that doesn't exist in any framework yet. Not a red team. Not a bug bounty. A pre-release partnership structure where distribution is the product.
This is what AI labs do when they move faster than policy. They design their own constraints. Glasswing names the infrastructure categories it's protecting. It names the 40+ organizations. It commits to open sharing of vulnerabilities found. Those aren't technical features. Those are governance decisions baked into the product architecture.
Compare that with the traditional approach: build the model, release it, watch what happens, patch the exploits later. Anthropic flipped the sequence. They're patching the exploits now, with the model that found them, before the model is available to anyone who might use it offensively.
What this means for builders
If you're designing AI products right now, Glasswing is a case study in how capability and access decouple. The model exists. The capability is real. But the interface isn't an API endpoint. It's a vetting process, a legal agreement, a shared commitment to deploy defensively first.
When your product is powerful enough to be dangerous, distribution becomes a feature you design as carefully as the model itself. Anthropic didn't solve this with better alignment or interpretability research. They solved it by deciding who gets access and under what terms.
Mythos Preview is out there right now, scanning codebases and finding bugs that would take human researchers months. It's deployed. It's working. You just can't use it unless you're one of the 40+ organizations Anthropic vetted. That's not a beta. That's a new category of release: the model too dangerous to ship, shipped only to those who need it to defend what matters.
The lesson isn't that every model needs this treatment. It's that when you build something this capable, the product decision isn't just what it does. It's who gets to use it, and in what order.