Anthropic Just Showed the World How Dangerous AI Can Be. That's the Point.

A 27-year-old vulnerability. Sitting in one of the most security-hardened operating systems in the world. Survived decades of human review. Found by an AI model in a matter of hours.

That's one of three examples Anthropic disclosed this week as part of Project Glasswing – a new initiative that brought together AWS, Apple, Google, Microsoft, Nvidia, JPMorganChase, Cisco, CrowdStrike, Broadcom, Palo Alto Networks, and the Linux Foundation around a single, uncomfortable fact: AI models can now find and exploit software vulnerabilities better than almost any human on earth.

The model doing it is called Claude Mythos Preview. It's not publicly available. And the way Anthropic is handling that tells you a lot about what this moment actually is.

What Mythos found

Over a few weeks, Mythos Preview identified thousands of zero-day vulnerabilities – flaws previously unknown to the developers of the software – across every major operating system and web browser. Many of them critical. Some of them decades old.

27years

OpenBSD: Remote crash vulnerability

A flaw that allowed an attacker to remotely crash any machine running the OS just by connecting to it. OpenBSD is used for firewalls and critical infrastructure and has a reputation as one of the most security-hardened systems in existence.

16years

FFmpeg: Encoding library flaw

Hidden in a single line of code that automated testing tools had hit five million times without catching it. FFmpeg is embedded in countless pieces of software to handle video encoding and decoding.

Linux kernel: Privilege escalation chain

The model autonomously found and chained together several vulnerabilities to escalate from ordinary user access to complete control of the machine. The Linux kernel runs most of the world's servers.

All three have been patched. Many others are being disclosed after fixes are in place. Anthropic is being deliberate about timing – publishing cryptographic hashes of the details now, and revealing specifics only after each fix is deployed.

The uncomfortable part: Mythos found nearly all of these vulnerabilities entirely autonomously, without any human steering. This is not AI-assisted security research. It is AI-driven security research with humans reviewing the output.

The numbers behind it

To understand the gap, look at where Mythos Preview sits relative to the previous generation of models on the benchmarks that matter for this kind of work.

Benchmark Mythos Preview Opus 4.6

CyberGym (vulnerability reproduction) 83.1% 66.6%

SWE-bench Verified (coding) 93.9% 80.8%

SWE-bench Pro 77.8% 53.4%

Terminal-Bench 2.0 82.0% 65.4%

Humanity's Last Exam (with tools) 64.7% 53.1%

These are not marginal improvements. The jump from Opus 4.6 to Mythos on SWE-bench Pro – from 53% to 78% – represents a meaningful capability shift in autonomous coding. The CyberGym number is the one with the clearest security implications: the model reproduces known vulnerabilities at a rate that was not achievable with the previous generation.

Why are they announcing this at all

This is the part I find most interesting. Anthropic didn't have to announce that their model can find critical vulnerabilities in every major OS and browser. They could have quietly used it internally, patched what they found, and said nothing.

Instead, they built a coalition of 12 of the most important technology and security companies in the world, committed $100M in usage credits, donated $4M to open-source security organizations, and made the whole thing public.

The reason is in the announcement itself: these capabilities will proliferate. The only question is whether defenders get to them first.

The asymmetry problem: Once a vulnerability is exploited, the damage is done. A defender needs to find and patch every flaw. An attacker only needs to find one. AI that can autonomously discover vulnerabilities at scale changes that equation dramatically -- but it changes it for both sides.

Project Glasswing is an effort to ensure defenders run faster. The partners aren't just getting access to a powerful model. They're getting early access specifically to use it on their own foundational systems – the software that collectively represents an enormous share of the world's shared attack surface.

Amazon Web Services Apple Broadcom Cisco CrowdStrike Google JPMorganChase Linux Foundation Microsoft NVIDIA Palo Alto Networks Anthropic

What this means for everyone else

Project Glasswing will not be available to most organizations. The model is restricted, the partners are a very specific group, and the access criteria are tight. But the implications are real for any team building or maintaining software right now.

The vulnerability window is collapsing

CrowdStrike's CTO put it plainly in the announcement: "The window between a vulnerability being discovered and being exploited by an adversary has collapsed -- what once took months now happens in minutes with AI." That's not a prediction about the future. That's an assessment of where things are today with current-generation models. Mythos is the next step.

For teams maintaining production software, this means the old approach of periodic security audits and reactive patching won't hold up. Continuous, automated security scanning is becoming a baseline requirement, not a nice-to-have.

Open source is the biggest exposure

The Linux Foundation's involvement is pointed. Open-source software powers most of the world's infrastructure, and the organizations that maintain it have historically operated without the security resources available to large enterprises. The announcement includes $2.5M in donations to Alpha-Omega and OpenSSF, as well as a separate program that gives open source maintainers access to the model.

If you're building on open source dependencies -- and every software team is -- the security posture of those dependencies is now your problem in a more immediate way than it was two years ago.

The software development lifecycle needs to catch up

Part of what Project Glasswing commits to producing is a set of practical recommendations for how security practices should evolve in the AI era -- covering vulnerability disclosure, software development lifecycle, secure-by-design practices, and patching automation. That work is worth watching. The teams that adapt their development practices ahead of those recommendations will be better positioned than those that wait for them.

$100M

Anthropic's commitment in model usage credits for Project Glasswing participants

$4M

Direct donations to open-source security organizations including Apache and OpenSSF

40+

Additional organizations with access to scan and secure critical open-source software

Days until Anthropic's first public report on vulnerabilities fixed and lessons learned

The part that matters for software teams right now

I've spent the last decade building software for US clients. Healthcare systems, HR platforms, fintech tools. The security conversation in those verticals has always been serious, but it has rarely been urgent in the way this announcement implies it should be.

A 16-year-old vulnerability hiding in a line of code that automated tools hit five million times without catching – and an AI model finds it autonomously in hours – that's a different threat environment than anything we were designing against two years ago.

The practical question for any engineering team is: what is your current exposure, and how fast can you respond when the vulnerability window collapses from months to minutes?

Project Glasswing doesn't solve that question for most organizations. It's a starting gun, not a finish line. The work it produces – the recommendations, the patched vulnerabilities, the public reporting – will raise the baseline for what responsible software security looks like. Teams that are ahead of that baseline when it shifts will be fine. Teams that are behind it will have a lot of catching up to do under pressure.

The honest read: This is not a reason to panic. It is a reason to move. The same capabilities that make Mythos dangerous in the wrong hands make it powerful for defenders. The window to get ahead of this is open right now. It won't stay open indefinitely.

Our take

At Alluxi, we build for clients where a security failure isn't just a technical problem - it's a patient record, a financial transaction, a compliance violation. Project Glasswing sets a new baseline for what due diligence looks like in those verticals.

We're watching this closely and factoring it into how we design systems, what we recommend to clients about their security posture, and how we think about the development lifecycle on every project we ship. If you're building something where security matters—and in 2026, that's everything—this announcement is worth understanding.