TL;DR: Anthropic’s Claude Mythos autonomously found thousands of zero-day vulnerabilities, including bugs that survived 27 years of human review and millions of automated tests, according to Oasis Security. Exploitability-based prioritisation is no longer a safe assumption when AI can compress discovery and exploitation into hours.
NHIMG editorial, based on content published by Oasis Security: Claude Mythos and the End of 'Hard to Exploit'
By the numbers:
- In a Firefox JavaScript engine benchmark, Mythos converted known vulnerabilities into working shell exploits 72.4% of the time.
- One vulnerability persisted in OpenBSD for 27 years.
- Anthropic committed $100 million in usage credits and $4 million to open-source security organizations.
Questions worth separating out
A: The assumption that exploitability stays low long enough for normal remediation breaks first.
Q: Why do low-severity or long-standing bugs become more dangerous in AI-assisted attack scenarios?
A: Because age and prior testing no longer imply safety.
Q: How should teams prioritise patching when exploitability assumptions are no longer stable?
A: Prioritise by exposure, privilege, and business reach, not only by exploitability score.
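As a rough illustration of that last answer, the sketch below ranks findings by exposure, privilege, and business reach, and lets the exploitability score only nudge the result. The `Finding` fields, the scales, and the weights are all assumptions made for this example; neither Oasis Security nor any scoring standard prescribes them.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # All fields and scales are hypothetical, chosen only for illustration.
    name: str
    exploitability: float    # 0.0-1.0, e.g. a normalised scanner score
    internet_exposed: bool
    privilege: int           # 0 = none, 1 = user, 2 = admin/root
    business_reach: int      # 0 = isolated, 1 = one team, 2 = organisation-wide


def priority(f: Finding) -> float:
    """Rank mainly by context; exploitability only scales the result."""
    exposure = 2.0 if f.internet_exposed else 1.0
    context = exposure + f.privilege + f.business_reach   # ranges 1.0 to 6.0
    # Assumed weighting: exploitability contributes a 0.5-1.0 multiplier,
    # so a "hard to exploit" label can at most halve the context score.
    return context * (0.5 + 0.5 * f.exploitability)


findings = [
    Finding("internal-lib-overflow", 0.9, False, 0, 0),
    Finding("edge-gateway-parser-bug", 0.2, True, 2, 2),
]

ranked = sorted(findings, key=priority, reverse=True)
for f in ranked:
    print(f"{f.name}: {priority(f):.2f}")
```

With these assumed weights, the exposed, privileged gateway bug outranks the isolated library bug even though its exploitability score is far lower. The exact numbers matter less than the ordering of concerns.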
Practitioner guidance
- Re-score low-exploitability findings against AI-assisted abuse scenarios: Review exception queues for vulnerabilities that were downgraded because they were considered hard to exploit.
- Shorten patch approval paths for exposed systems: Pre-approve compensating actions such as blocking public access, disabling vulnerable endpoints, or isolating affected services so teams are not waiting for the next change window before acting.
- Reduce reachable privilege on internet-facing assets: Remove unnecessary secrets, administrative interfaces, and elevated permissions from systems that can be reached externally, because AI-assisted exploitation turns exposure into a faster path to impact.
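The first item in the list above, reviewing the exception queue, could be started with something like the following minimal sketch. It assumes exceptions are recorded with a free-text downgrade reason. The `ExceptionEntry` shape, the phrase list, and the finding IDs are hypothetical, and a real queue would need matching tuned to its own wording.

```python
from dataclasses import dataclass


@dataclass
class ExceptionEntry:
    # Hypothetical record shape for a risk-acceptance or exception queue.
    finding_id: str
    downgrade_reason: str    # free text recorded when the exception was granted
    internet_exposed: bool


# Assumed phrases suggesting the downgrade relied on exploit difficulty.
DIFFICULTY_PHRASES = ("hard to exploit", "low exploitability", "no known exploit")


def needs_rescore(entry: ExceptionEntry) -> bool:
    """True if the recorded reason leans on exploit difficulty."""
    reason = entry.downgrade_reason.lower()
    return any(phrase in reason for phrase in DIFFICULTY_PHRASES)


def review_order(queue: list[ExceptionEntry]) -> list[ExceptionEntry]:
    """Flagged entries, internet-exposed ones first."""
    flagged = [e for e in queue if needs_rescore(e)]
    # sorted() is stable, so queue order is preserved within each group.
    return sorted(flagged, key=lambda e: not e.internet_exposed)


queue = [
    ExceptionEntry("VULN-1", "Hard to exploit; requires precise heap layout", False),
    ExceptionEntry("VULN-2", "Vendor patch pending", True),
    ExceptionEntry("VULN-3", "Low exploitability per scanner", True),
]

for entry in review_order(queue):
    print(entry.finding_id)
```

Entries whose exception rests on other grounds, such as a pending vendor patch, are left out because the guidance targets downgrades justified by exploit difficulty specifically.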
What's in the full article
Oasis Security's full blog post covers the operational detail this post intentionally leaves for the source:
- The benchmark-by-benchmark exploit conversion results, including the Firefox JavaScript engine test methodology.
- The cost and run-count context behind the OpenBSD and FFmpeg findings, useful for evaluating attacker economics.
- The project coalition structure, access limits, and how Glasswing participation is being constrained.
- The article's step-by-step guidance on how Oasis expects teams to recalibrate vulnerability scoring and patch cadence.
👉 Read Oasis Security's analysis of Claude Mythos and exploitability collapse →