Mozilla discloses 271 Firefox vulnerabilities discovered using Anthropic's Mythos AI model
The Firefox developer released detailed vulnerability reports and technical specifics of its AI-assisted bug-hunting process, addressing prior skepticism about hallucination and false positives in automated security discovery.
- Mozilla engineers found 271 Firefox security flaws using Anthropic's Mythos AI model over a two-month period: 180 classified as sec-high, 80 as sec-moderate, and 11 as sec-low.
- The key to reducing false positives was developing a custom 'agent harness'—code that guides the LLM through structured tasks, gives it access to Mozilla's existing tooling and testing infrastructure, and uses a second LLM to verify results.
- Mozilla published full Bugzilla reports for 12 of the 271 vulnerabilities as evidence, including test cases that trigger memory safety issues, and stated that the discovered bugs meet the same security criteria as traditionally discovered flaws.
- The team tied verification to Firefox's sanitizer builds, so the agent could craft and run test cases autonomously until one triggered a crash or otherwise confirmed a memory safety issue.
Mozilla published detailed findings from its two-month trial of Anthropic's Mythos AI model, which it used to hunt for security vulnerabilities in Firefox. Over that period, the tool discovered 271 security flaws: 180 rated sec-high (exploitable through normal user actions such as web browsing), 80 rated sec-moderate, and 11 rated sec-low. Mozilla Distinguished Engineer Brian Grinstead said the main factor in reducing false positives was a custom agent harness, a wrapper around the LLM that structures its work and gives it direct access to Firefox's existing test infrastructure.
The harness gives Mythos a specific goal (e.g., 'find a bug in this file'), access to tools such as the file system and Firefox builds with memory sanitizers enabled, and instructions to keep looping until the task is complete. When analyzing code for memory safety issues, the harness points the agent at source files, and the model iterates by crafting HTML test cases, running them against Firefox's sanitizer build, and validating any crashes. A second LLM then grades the first model's output, and findings that receive high confidence scores are passed to human developers for final review.
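The loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Mozilla's harness: the names `run_harness`, `Finding`, `propose`, `run_sanitizer`, and `grade`, the attempt limit, and the 0.8 threshold are all hypothetical, and the model, the sanitizer build, and the grader are replaced by stub functions so the control flow can be run on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Finding:
    source_file: str
    test_case: str      # HTML test case proposed by the first model
    crash_log: str      # sanitizer output captured when the test case ran
    grade: float = 0.0  # confidence assigned by the second (grader) model


def run_harness(
    source_file: str,
    propose: Callable[[str, str], str],           # stand-in for the first LLM
    run_sanitizer: Callable[[str], Optional[str]],  # stand-in for a sanitizer build
    grade: Callable[[Finding], float],            # stand-in for the second LLM
    max_attempts: int = 5,                        # hypothetical limit
    threshold: float = 0.8,                       # hypothetical cutoff
) -> Optional[Finding]:
    """Loop until a test case crashes the build and the grader accepts it."""
    feedback = ""
    for _ in range(max_attempts):
        test_case = propose(source_file, feedback)
        crash_log = run_sanitizer(test_case)
        if crash_log is None:
            # No crash: tell the model and let it try again.
            feedback = "previous test case did not crash"
            continue
        finding = Finding(source_file, test_case, crash_log)
        finding.grade = grade(finding)
        if finding.grade >= threshold:
            return finding  # would be handed to a human reviewer
        feedback = "grader rejected the previous crash"
    return None


# Stubs for illustration only; none of these reflect real model or Firefox output.
def fake_propose(source_file: str, feedback: str) -> str:
    return "<div id=crash></div>" if feedback else "<p>hello</p>"


def fake_sanitizer(test_case: str) -> Optional[str]:
    if "crash" in test_case:
        return "ERROR: AddressSanitizer: heap-use-after-free"
    return None


def fake_grade(finding: Finding) -> float:
    return 0.9 if "AddressSanitizer" in finding.crash_log else 0.1


result = run_harness("dom/example.cpp", fake_propose, fake_sanitizer, fake_grade)
```

With these stubs, the first attempt produces no crash, the second produces a stubbed sanitizer report, and the grader's score clears the threshold, so `result` holds the finding that a human would then review. Passing the three roles in as functions is a design choice of this sketch; it keeps the loop testable without a real model or browser build.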
Mozilla disclosed full Bugzilla reports for 12 of the 271 bugs, including test cases and reproduction steps, to demonstrate that they meet Mozilla's standard criteria for security vulnerabilities. According to Grinstead, prior attempts at AI vulnerability discovery produced widespread hallucinations that required significant manual cleanup. The Mythos results, by contrast, yielded reports that developers could act on with the same confidence as traditionally discovered vulnerabilities, giving them clear signals for iterating on fixes and preventing regressions.
The disclosure comes amid ongoing skepticism about AI-assisted security tooling and concerns about hype around AI company valuations. Grinstead said Mozilla's motivation was to demonstrate the technique broadly rather than to promote any specific vendor, noting that the team had 'completely bought in' to the method based on the results it observed.