Project Glasswing

Project Glasswing is a cybersecurity consortium led by the AI safety and research company Anthropic. Announced on April 7, 2026, the project brings together major technology and finance companies to use a powerful, unreleased AI model, Claude Mythos Preview, for defensive cybersecurity purposes. The initiative's goal is to proactively discover and remediate security vulnerabilities in critical open-source software and digital infrastructure before similar AI capabilities can be used by malicious actors. ^[1]

Overview

Project Glasswing was established as a direct response to the powerful, dual-use capabilities observed in Anthropic's frontier AI model, Claude Mythos Preview. After internal testing demonstrated the model's ability to autonomously find and exploit software flaws, Anthropic decided against a public release, deeming the model too potent for general availability. ^[2] ^[3]

The core mission of the consortium is to leverage Claude Mythos Preview in a controlled environment to fortify digital defenses. By providing private, early access to the model, Project Glasswing allows partners to discover and fix vulnerabilities in their own systems and in widely used open-source software. The project operates on the principle of coordinated vulnerability disclosure, aiming to give cybersecurity professionals a critical head start and a "durable advantage" against the anticipated proliferation of AI-driven cyber threats. ^[3]

The initiative addresses the concern that AI models with advanced cyber-offense capabilities will soon become widespread, potentially empowering state and non-state actors to conduct more frequent and sophisticated attacks. By harnessing the same technology for defense, the project seeks to rebalance the security landscape and establish new best practices for vulnerability management in an era of AI-accelerated threats. ^[1]

History and Background

Motivation and Rationale

The impetus for Project Glasswing arose from Anthropic's internal development and red-teaming of its frontier AI models. The company discovered that its latest model, Claude Mythos Preview, possessed emergent capabilities for autonomously identifying software vulnerabilities and creating functional exploits with minimal human guidance. This discovery prompted what the company described as an "urgent attempt" to prioritize a defensive application for this powerful technology. ^[1]

The project's rationale is rooted in managing the dual-use nature of advanced AI. While the model is a powerful tool for defense, its capabilities make it an equally potent instrument for malicious cyberattacks. Anthropic executives stated that they anticipate adversaries will develop AI with similar capabilities in "months, not years," framing the project as a "critical race to secure infrastructure." This sense of urgency was echoed by Logan Graham, Anthropic's Frontier Red Team Lead, who stated, "We need to prepare now for a world where these capabilities are broadly available in 6, 12, 24 months. Many of the assumptions that we’ve built the modern security paradigms on might break." ^[3] ^[2]

Launch and Initial Commitments

Project Glasswing was officially announced on April 7, 2026. ^[3] Along with the launch, Anthropic publicly committed to reporting on the project's progress, including specific findings and fixed vulnerabilities, within 90 days. The project was framed not only as a technical initiative but also as a collaborative effort to engage with government and security organizations to inform national security strategies and evolve industry-wide security practices. ^[1]

Core Technology: Claude Mythos Preview

The central technology enabling Project Glasswing is Claude Mythos Preview, a proprietary frontier AI model developed by Anthropic. It is not slated for public release due to its potent and potentially dangerous dual-use capabilities in cybersecurity. ^[1]

The model's advanced cybersecurity skills are an emergent property of its general advanced coding and reasoning abilities, rather than the result of specific training for cyber tasks. Anthropic has assessed the model's capabilities as comparable to a "senior security researcher." ^[2]

Demonstrated Capabilities

Internal testing and initial use within the consortium have shown that Claude Mythos Preview is capable of a range of advanced security tasks, including:

Autonomous Vulnerability Discovery: The model can autonomously search for and identify software vulnerabilities in codebases.
Exploit Development: It can generate complete attack chains and proofs of concept for the vulnerabilities it finds.
Offensive Security Simulation: It can be used for penetration testing and endpoint security assessments.
Binary Analysis: The model can hunt for system misconfigurations and evaluate software binaries even without access to the original source code.

These abilities represent a significant advancement over previous publicly available models and underscore the decision to restrict access to a controlled environment. ^[2] ^[3]

Notable Discoveries

In its pre-launch and initial phases, the model has been used to identify thousands of previously unknown (zero-day), high-severity vulnerabilities across major software projects. ^[4] Notable examples of its discoveries include:

A 27-year-old vulnerability in the OpenBSD operating system that could allow a remote attacker to crash a targeted machine. ^[1] ^[3]
A 16-year-old vulnerability in the widely used FFmpeg multimedia library. This flaw was particularly significant as it had been missed by automated scanning tools that had run on the code an estimated five million times. ^[1] ^[4]
An autonomously generated exploit chain that combined multiple vulnerabilities in the Linux kernel to achieve privilege escalation, allowing an attacker to move from a standard user account to full system control (root access). ^[1]

All vulnerabilities identified during the initial testing phase were reportedly patched in coordination with the respective software maintainers before the project's public announcement. ^[4]

Benchmark Performance

Claude Mythos Preview demonstrates a notable performance increase in cybersecurity, coding, and reasoning benchmarks compared to Anthropic's next-best publicly available model at the time of launch, Claude Opus 4.6. ^[1] ^[3]

Benchmark	Claude Mythos Preview	Claude Opus 4.6	Description
CyberGym	83.1%	66.6%	Measures performance in cybersecurity tasks.
SWE-bench Pro	77.8%	53.4%	An advanced benchmark for fixing real-world bugs in GitHub repositories.
SWE-bench Verified	93.9%	80.8%	A benchmark for fixing real-world bugs in GitHub repositories.
Terminal-Bench 2.0	82.0%	65.4%	Measures performance in agentic, terminal-based tasks.
GPQA Diamond	94.6%	91.3%	A benchmark measuring advanced reasoning capabilities.
OSWorld-Verified	79.6%	72.7%	Measures performance in agentic tasks within an operating system environment.

The data from these benchmarks illustrates the increase in capability that prompted the creation of Project Glasswing. ^[1] ^[3]
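The size of the gap on each benchmark can be read off the table with simple subtraction. The following is a minimal sketch, not part of the original reporting: the scores are copied from the table above, while the names `SCORES` and `point_gaps` are illustrative choices.

```python
# Scores copied from the benchmark table above, in percent:
# (Claude Mythos Preview, Claude Opus 4.6).
SCORES = {
    "CyberGym": (83.1, 66.6),
    "SWE-bench Pro": (77.8, 53.4),
    "SWE-bench Verified": (93.9, 80.8),
    "Terminal-Bench 2.0": (82.0, 65.4),
    "GPQA Diamond": (94.6, 91.3),
    "OSWorld-Verified": (79.6, 72.7),
}


def point_gaps(scores):
    """Return {benchmark: Mythos score minus Opus 4.6 score} in percentage points."""
    # Rounding to one decimal removes floating-point noise from the subtraction.
    return {name: round(mythos - opus, 1) for name, (mythos, opus) in scores.items()}


if __name__ == "__main__":
    gaps = point_gaps(SCORES)
    # List the benchmarks from the widest gap to the narrowest.
    for name, gap in sorted(gaps.items(), key=lambda item: -item[1]):
        print(f"{name:<20} +{gap:.1f} points")
```

On these figures the widest gap is on SWE-bench Pro and the narrowest on GPQA Diamond, which suggests the improvement is concentrated in coding and agentic tasks rather than in general reasoning.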

Project Goals and Methodology

Primary Objectives

Project Glasswing operates with several key goals:

Vulnerability Remediation: The core activity involves partners using Claude Mythos Preview for security tasks such as local vulnerability detection, black-box testing, and penetration testing on their own foundational software and systems.
Information Sharing: Anthropic coordinates the sharing of project learnings with the broader industry. Partners are contractually obligated to share their findings to improve collective security.
Public Reporting: The project is committed to transparency, with a public report on findings and progress scheduled for release within 90 days of launch.
Evolving Security Practices: The consortium aims to collaborate with security organizations to generate new recommendations for vulnerability disclosure processes, software update mechanisms, and secure-by-design principles suited for an AI-accelerated environment.
Government Engagement: Anthropic maintains an ongoing dialogue with U.S. government officials regarding the model's capabilities to help inform national security strategies.

These goals collectively aim to create a more resilient digital ecosystem prepared for the next generation of cyber threats. ^[1] ^[4]

Vulnerability Disclosure Process

To handle the model's findings responsibly, Project Glasswing follows a structured, multi-step process for vulnerability disclosure:

Internal Triage: All potential bugs discovered by the AI model are first triaged internally by Anthropic's team.
Human Validation: High-severity vulnerabilities are passed to professional human security experts for manual validation to confirm authenticity and assess their potential impact.
Coordinated Reporting: Before submitting bug reports, Anthropic contacts the relevant software maintainers to establish a manageable reporting cadence, preventing them from being overwhelmed by a high volume of reports.
Patch Assistance: For open-source projects, Anthropic endeavors to provide a candidate patch alongside the vulnerability report. These patches are explicitly labeled as either AI-generated or AI-assisted and human-reviewed.
Coordinated Disclosure Timeline: The project adheres to a standard coordinated vulnerability disclosure (CVD) framework, typically waiting 45 days after a patch has been made available before publishing any technical details about the vulnerability.

This methodology is designed to maximize the defensive benefits of the AI model while minimizing the risk of discovered flaws being exploited. ^[3]
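The steps above can be sketched as a small data model. This is an illustration under stated assumptions, not Anthropic's actual tooling: the `Finding` record, its field names, and both helper functions are hypothetical. Only the 45-day delay and the rule that high-severity findings receive human validation come from the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# From the text: details are typically published 45 days after a patch is available.
DISCLOSURE_DELAY_DAYS = 45


@dataclass
class Finding:
    """Hypothetical record for one model-reported bug; field names are illustrative."""
    title: str
    high_severity: bool
    human_validated: bool = False
    patch_available_on: Optional[date] = None


def needs_human_validation(finding: Finding) -> bool:
    """High-severity findings go to human experts before they are reported."""
    return finding.high_severity and not finding.human_validated


def earliest_disclosure(finding: Finding) -> Optional[date]:
    """Earliest date technical details could be published, or None if no patch exists."""
    if finding.patch_available_on is None:
        return None
    return finding.patch_available_on + timedelta(days=DISCLOSURE_DELAY_DAYS)
```

For example, a finding whose patch became available on an arbitrary date such as 1 January 2026 would have `earliest_disclosure` return 15 February 2026. The text says "typically", so the 45-day figure is best treated as a default rather than a hard rule.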

Partnerships and Collaboration

Project Glasswing was launched as a consortium of over 45 organizations, led by Anthropic. ^[2]

Founding Partners

The founding members of the consortium include major companies from the technology, finance, and cybersecurity sectors:

Amazon Web Services (AWS)
Anthropic
Apple
Broadcom
Cisco
CrowdStrike
Google
JPMorganChase
The Linux Foundation
Microsoft
NVIDIA
Palo Alto Networks

In addition to the core founding group, access to Claude Mythos Preview has been granted to over 40 other organizations responsible for maintaining critical software infrastructure. ^[1] ^[3] ^[4]

Partner Perspectives

Representatives from partner companies have publicly supported the initiative. Heather Adkins, Google's Vice President of Security Engineering, stated, "Google is pleased to see this cross-industry cybersecurity initiative coming together. We have long believed that AI poses new challenges and opens new opportunities in cyber defense." Similarly, Igor Tsyganskiy, Microsoft's Global CISO, noted, "Joining Project Glasswing, with access to Claude Mythos Preview, allows us to identify and mitigate risk early and augment our security and development solutions so we can better protect customers and Microsoft." ^[2]

Funding, Resources, and Access

Financial Commitments

Anthropic has committed significant resources to support the project and the broader open-source security ecosystem:

Model Access Credits: A commitment of $100 million in free usage credits for Claude Mythos Preview is being distributed to project partners and other organizations maintaining critical infrastructure. ^[1]
Direct Donations: $2.5 million to the Alpha-Omega Project and the Open Source Security Foundation (OpenSSF) via The Linux Foundation, and $1.5 million to the Apache Software Foundation, for a combined $4 million. ^[1] ^[3]

Access and Pricing

During the initial research preview phase, participants can use the model with costs largely covered by Anthropic's commitment of usage credits. Following this phase, approved participants can access the model at a rate of $125 per million output tokens. The model is available to participants through multiple platforms, including the Claude API, Amazon Bedrock, Google Cloud's Vertex AI, and Microsoft Foundry. ^[3]
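As a rough illustration of the quoted output rate, the sketch below estimates the charge for a given number of output tokens. It assumes the rate is in US dollars and covers output tokens only; input-token pricing is not modeled, and the function name `output_cost` is hypothetical.

```python
from decimal import Decimal

# From the text: 125 per million output tokens (assumed here to be US dollars).
OUTPUT_RATE_PER_MILLION = Decimal("125")


def output_cost(output_tokens: int) -> Decimal:
    """Estimated charge for output tokens only; input-token pricing is not modeled."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    # Decimal arithmetic avoids binary floating-point rounding in currency values.
    return OUTPUT_RATE_PER_MILLION * output_tokens / Decimal(1_000_000)
```

Under these assumptions, a run that produces 40,000 output tokens would cost 5 dollars in output charges.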

Public Attention and Pre-Launch Events

The official announcement of Project Glasswing on April 7, 2026, was preceded by two unrelated security incidents at Anthropic in late March 2026 that drew public attention.

CMS Data Leak: In late March 2026, a misconfiguration in a publicly searchable Content Management System (CMS) resulted in the unintended exposure of internal company assets, including a draft blog post that detailed Project Glasswing.
NPM Package Leak: On March 31, 2026, the full source code for an internal development tool named "Claude Code" was accidentally published to the public npm package registry, where it remained accessible for approximately three hours before being removed.

Anthropic officially responded to these incidents, characterizing them as "human errors in publishing tooling, not breaches of our security architecture." The company stated that it had implemented improved processes to prevent such errors in the future. ^[3]

Etymology

The project is named after the glasswing butterfly (Greta oto). The name serves as a dual metaphor for the project's mission:

Hidden Vulnerabilities: The butterfly's transparent wings allow it to blend into its environment, symbolizing the subtle and difficult-to-detect software vulnerabilities that the project aims to uncover.
Transparency for Safety: The butterfly's transparency is also a defense mechanism. This reflects the project's goal of using a transparent, collaborative approach to information sharing to improve cybersecurity for everyone.

This name was chosen to reflect both the challenge of finding hidden flaws and the collaborative nature of the solution. ^[1]

