Claude Mythos: Anthropic’s Leaked Model That Wiped Billions From Cyber Markets | AI Without Limits

Anthropic Leaked Its Most Dangerous Model.
The Irony Is the Story.

Claude Mythos was not announced. It was found. And what was found should change how we think about the next wave of AI, security, and who actually gets hurt first.


On the morning of March 27, 2026, cybersecurity stocks collapsed.

CrowdStrike dropped around 7%. Palo Alto Networks fell around 6%. SentinelOne, Zscaler, Fortinet, all down sharply in a single session. The iShares Expanded Tech-Software ETF shed close to 3%. The broader market fell on the day. Bitcoin slid back to $66,000.

Billions of dollars in market value evaporated in hours, not because of an earnings miss or a macro event, but because two security researchers found a draft blog post sitting in an unsecured data store that Anthropic had accidentally left open to the public.

March 27, 2026: Market Reaction to the Mythos Leak
[Charts: CrowdStrike (CRWD), down about 7%; Palo Alto Networks (PANW), down about 6%; SentinelOne and Zscaler; Fortinet and the IGV ETF, down about 3%]

All four charts from March 27, 2026. Single session drops following the Mythos data leak.

That post described a model called Claude Mythos. It said the model was “currently far ahead of any other AI model in cyber capabilities.” It warned that it “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”

Investors read that and immediately understood what it meant for companies whose entire business is built on defending against threats. If AI can generate novel attacks faster than any conventional system can detect them, the traditional cybersecurity playbook has a serious problem. They voted with their portfolios before trading closed.

The irony is the part that shouldn’t get lost.

Anthropic, whose entire public identity rests on being the safety-first alternative in AI development, accidentally exposed the details of its most dangerous model through a misconfigured CMS. A company warning the world about AI-powered cyberattacks couldn’t secure its own content management system. Two researchers didn’t need to hack anything. They just looked.

That’s not a gotcha. It’s a precise illustration of the actual problem. The organizations that understand these risks best aren’t, in practice, operating at a level of security discipline that matches the threats they describe.

The Mythos leak didn’t introduce new information so much as it gave the trajectory a name and a market reaction that made the stakes concrete.

New tier: Mythos sits above Opus, Anthropic’s previous top model, in a completely new model category called Capybara
Far ahead: “Currently far ahead of any other AI model in cyber capabilities” (the draft’s own words, not a marketing claim)
Restricted: Early access limited to cybersecurity defense organizations only, Anthropic’s own signal about how dangerous it is

“It presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”

Anthropic’s own leaked draft blog post on Claude Mythos, March 2026

The Leak That Wrote Itself

Roy Paz of LayerX Security and Alexandre Pauwels of the University of Cambridge found the exposed assets on March 26. There were two draft versions of the same blog post. The first used “Claude Mythos” as the model name throughout. The second swapped it for the internal codename “Capybara,” suggesting Anthropic was still deciding what to call it publicly.

A researcher going by the handle M1Astra archived both versions before Anthropic pulled them down, and that archive is still accessible. One detail: the Capybara version’s subtitle still read “We have finished training a new AI model: Claude Mythos.” Whoever ran the find-and-replace missed the subtitle.

How the leak unfolded
Mar 26 (Discovery): Security researchers find ~3,000 Anthropic assets in a publicly searchable data store
Mar 26 (Archived): M1Astra preserves both draft versions before Anthropic corrects the CMS configuration
Mar 27 (Published): Fortune breaks the story. Anthropic confirms: “We’re developing a general purpose model…”
Mar 27 (Markets): Cybersecurity stocks collapse. CrowdStrike down ~7%. Palo Alto down ~6%. Billions wiped in hours

Fortune’s Beatrice Nolan broke the story after contacting Anthropic, which confirmed the model exists. The spokesperson’s statement was careful: “We’re developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity.”

Notice the word “developing.” The draft said “We have finished training.” Those aren’t the same thing.

Anthropic clearly isn’t ready to ship it, and the draft says why: Mythos is “very expensive for us to serve, and will be very expensive for our customers to use.” They’re working to make it more efficient before any general release. The parallel to GPT-4.5 is hard to miss. Similarly expensive, similarly capable in narrow ways, and ultimately never widely adopted before something cheaper superseded it. Whether Mythos ships under that name, as Capybara, or something else entirely, nobody outside Anthropic knows.


The archived leak at m1astra-mythos.pages.dev preserved both versions before Anthropic corrected the CMS error

What the draft claims about Mythos’s capabilities is significant. It sits in a new tier above Opus, Anthropic’s current highest. The name was chosen to evoke “the deep connective tissue that links together knowledge and ideas.”

According to the draft, Mythos gets “dramatically higher scores” than Claude Opus 4.6 on software coding, academic reasoning, and cybersecurity. Opus 4.6 already held the top score on SWE-bench Verified at 80.8% when it launched in February. Dramatically higher than that isn’t a small claim.

And early access being restricted specifically to cybersecurity defense organizations tells you everything about Anthropic’s own read on the risk profile.

What Anthropic said publicly

“We’re developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity.”

What the internal draft actually said

“We have finished training a new AI model: Claude Mythos. It’s by far the most powerful AI model we’ve ever developed.”

Common reading
This is just hype around another incremental model update.
The more accurate picture
Anthropic’s own internal language says otherwise. The company classifies Mythos as “currently far ahead of any other AI model in cyber capabilities” and the rollout is restricted to defensive security organizations only. That’s not language you use for incremental releases. The draft also describes it as a forerunner of a broader wave, not a one-time jump.

The Day AI Ran a Cyberattack

None of this is hypothetical.

In November 2025, Anthropic disclosed that a Chinese state-sponsored group had weaponized Claude Code to conduct espionage against roughly thirty organizations globally: large tech companies, financial institutions, government agencies. The model performed eighty to ninety percent of the campaign on its own.

Human operators stepped in at only a handful of strategic decision points: approving progression from reconnaissance to exploitation, authorizing use of harvested credentials, making final calls on data exfiltration scope. Everything else was the AI. At peak, it was making thousands of requests, often several per second. Anthropic’s own report called that speed “simply impossible” for human attackers to match.

The attackers got past Claude’s safeguards by breaking the operation into small, individually harmless-looking subtasks and framing it as legitimate security testing. The model occasionally hallucinated credentials or claimed to have extracted secret data that was in fact already public, which Anthropic noted remains “an obstacle to fully autonomous cyberattacks.”

Read that caveat carefully. It’s not saying the approach doesn’t work. It’s saying a current limitation slows it down. That limitation shrinks with every new model generation. Mythos, by its own description, is the next generation.

The documented campaign: 6 phases, almost no humans required
Target Selection (Human): Operators chose which organizations to infiltrate
Reconnaissance (AI): Claude mapped network topology, identified internal services autonomously
Vulnerability Discovery & Exploit Writing (AI): AI researched and wrote its own exploit code with no human guidance
Credential Harvesting & Lateral Movement (AI): Extracted usernames, passwords, and moved deeper into target systems
Data Exfiltration & Intelligence Categorisation (AI): Parsed and categorised stolen data by intelligence value, created attack reports
Final Approval (Human): Operators made final calls on exfiltration scope, one of the few remaining intervention points

Most security coverage focuses on AI writing phishing emails or generating malware. That’s real. But it’s not the part that should worry you most. The part that should worry you is what happens to the defensive side.


The documented campaign used AI for 80–90% of the operation, with human oversight at only 4–6 decision points

OpenAI hit the same threshold with GPT-5.3-Codex in February, the first model to receive a “High” cybersecurity classification under its Preparedness Framework. “High” means a model that can “automate end-to-end cyber operations against reasonably hardened targets, or automate the discovery and exploitation of operationally relevant vulnerabilities.”

In testing, GPT-5.3-Codex independently identified and exploited a complex binary vulnerability. No prompts, no hints. Sam Altman said publicly it was “our first model that hits ‘high’ for cybersecurity on our preparedness framework.”

The classification exists. The model shipped. Both things are true.

What the numbers look like on the ground

CrowdStrike’s 2026 Global Threat Report documented an 89% increase in attacks by AI-enabled adversaries versus 2024. Average breakout time, from initial compromise to lateral movement, is now 29 minutes. The fastest observed intrusion completed in 27 seconds. AI-generated phishing has surged over 1,200% since 2023. One research framework found AI phishing 24% more effective than human attackers. A security researcher independently used OpenAI’s o3 model to discover a remote zero-day in the Linux kernel’s SMB implementation.

The threat has already materialized. It’s not a future scenario.

The Chinese espionage campaign, the GPT-5.3-Codex binary exploit, the 27-second intrusion. None of these are simulations. The UK’s National Cyber Security Centre assessed that “there is a realistic possibility of critical systems becoming more vulnerable to advanced threat actors by 2027.” Fully autonomous end-to-end attacks aren’t likely before then, they said, but they’re on a clear trajectory.

Mythos is the next step on that trajectory, and we haven’t finished dealing with the last one.

Why AI attacks break the defensive model

Security systems are trained on historical attack patterns. When a new threat arrives, defenders retrain, update signatures, and catch up. It works because the drift is usually passive. The threat landscape changes; your model falls behind; you update it.

AI-generated attacks destroy that loop. The attack model actively searches for inputs your defensive model has never seen — out-of-distribution by design. You can’t retrain fast enough, because the attacks are being invented faster than defensive datasets can be labelled. You’re always defending against the previous generation. By the time your model knows what to look for, the attack has already moved on.
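To make that loop concrete, here is a deliberately tiny sketch of it. Everything in it is an assumption made for illustration: the `SignatureDetector` class, the `search_for_evasion` helper, and the payload strings are hypothetical, and a real detector is a statistical model rather than an exact-match set of hashes. The point is only the ordering: the defender retrains on last round’s attack while the attacker searches for an input the detector has never seen.

```python
import hashlib
import random


def signature(payload: str) -> str:
    """Toy stand-in for a detection signature: a hash of the payload."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SignatureDetector:
    """Hypothetical detector that only knows attacks it has already seen."""

    def __init__(self, historical_attacks):
        self.known = {signature(p) for p in historical_attacks}

    def flags(self, payload: str) -> bool:
        return signature(payload) in self.known

    def retrain(self, newly_labelled):
        # The defender catches up, but only on attacks that were already observed.
        self.known.update(signature(p) for p in newly_labelled)


def search_for_evasion(detector, seed_payload, rng, max_tries=1000):
    """Attacker side: mutate a known payload until the detector misses it."""
    for _ in range(max_tries):
        candidate = f"{seed_payload}#{rng.randrange(10**9)}"
        if not detector.flags(candidate):
            return candidate
    return None


def simulate(rounds, seed=0):
    """Each round the defender retrains on the previous attack, then a new one arrives."""
    rng = random.Random(seed)
    detector = SignatureDetector(["old-attack-1", "old-attack-2"])
    missed = 0
    previous = None
    for _ in range(rounds):
        if previous is not None:
            detector.retrain([previous])
        attack = search_for_evasion(detector, "old-attack-1", rng)
        if attack is not None and not detector.flags(attack):
            missed += 1
        previous = attack
    return missed, detector


if __name__ == "__main__":
    missed, detector = simulate(rounds=5)
    print(f"attacks missed: {missed} of 5")
    print(f"signatures known afterwards: {len(detector.known)}")
```

In this toy version the detector’s knowledge grows every round and it still misses every new attack, because retraining only ever covers what has already happened. Real defenses generalize better than an exact-match lookup, so treat this as a picture of the dynamic, not a measurement of it.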

The Shrinking Ladder

Job displacement has a specific shape that most coverage avoids naming directly. It’s not evenly distributed. It concentrates at the bottom of the skill ladder, hitting people at the exact moment in their careers when they have the least cushion to absorb it.

The ladder isn’t disappearing. It’s losing its bottom rungs.

Takes the hit first
Junior Data Scientist: Entry-level roles gone before careers start
Tier-1 SOC Analyst: Alert triage now automated wholesale
Paralegal: Document review and contract analysis
Junior Financial Analyst: DCF models generated in 30 seconds by AI
Junior Copywriter: Clients quietly stopped renewing

Mostly protected, for now
Senior ML Engineer: Directs AI tools rather than competes with them
Threat Hunter: Cross-system judgment that can’t be automated
Security Architect: Organisational and contextual value
Law Firm Partner: Supervises what the AI produces
Senior Banker: Client relationships, not task execution

Seniority and judgment-intensity are protective. Routine execution isn’t. The traditional path — two to three years of entry-level work to build judgment, then a promotion — is being compressed away before the generation that needed it could travel it.


Entry-level positions are being eliminated across sectors even as senior roles remain in demand

53% drop in security analyst job postings since 2022
25% fall in entry-level tech hiring at top 15 firms, 2023 to 2024
44% of U.S. CFOs planning AI-related job cuts in 2026
23% of organisations offered any basic AI training in 2025

Block eliminated 40% of its workforce in February, explicitly citing AI. Atlassian cut 1,600 people in March, five months after its CEO said the company would hire more engineers.

A lot of what gets called AI displacement is theater, though. Harvard Business Review found that most AI layoffs are in anticipation of capabilities, not proven performance. Fifty-nine percent of hiring managers admit exaggerating AI’s role because it plays better with investors.

Klarna cut 700 customer service staff, loudly claimed AI was doing all their work, then quietly started rehiring when satisfaction collapsed. Their CEO’s admission: “We went too far.”

But the theater doesn’t cancel the underlying reality. Real displacement is happening, concentrated at entry level. The people most affected are receiving the least support to navigate it.

“If Mythos is a significant step above current Opus… honestly not sure what people will be charging for when it comes to app development.”

Developer comment on Threads, March 27, 2026, responding to the leak

The Clock That Nobody Watches

My honest read on the governance situation: it’s not going to close the gap.

Not at this pace, not with the political conditions that currently exist, and not while the dominant framing is a US-China technology race that treats every safety requirement as a competitive handicap. The frameworks being built aren’t designed for the world that actually exists. They’re designed for the world that existed when the drafting started. That was a different world.

The EU AI Act is the most serious regulatory attempt on the table. Most of its obligations don’t apply until August 2026, and those for high-risk AI built into regulated products are delayed to 2027. Only three of twenty-seven member states have designated their oversight authorities.

In the United States, there’s no federal AI law. The Trump administration’s December 2025 executive order walked back Biden-era safety testing requirements and directed preemption of state laws. Twelve frontier models launched in a single week in mid-March 2026.

Regulation runs at an institutional pace. The technology runs at a competitive-market pace. Those aren’t comparable speeds, and the gap between them is where the actual risk lives.

3 of 27 EU member states have designated oversight authorities under the AI Act
12 frontier AI models launched in a single week in mid-March 2026
18 min to midnight — the AI Safety Clock, down from 29 minutes in September 2024

The “safety lab” framing deserves scrutiny. Anthropic does position itself as the responsible actor, and by the standards of other labs, there’s something to that claim. But look at the documented facts: Anthropic’s model was weaponized in a state-sponsored espionage campaign against roughly thirty organisations. Their own draft described their next model as presaging a wave of models that can exploit vulnerabilities in ways that “far outpace the efforts of defenders.” And they left that draft sitting in an unsecured public data store.

I’m not saying the safety work isn’t real. I’m saying that being the most careful actor in a field moving this fast isn’t the same as being careful enough.

The geopolitical framing makes meaningful governance nearly impossible to build. Jensen Huang said China is “nanoseconds behind” America in AI. Seventy percent of all AI patents now originate from China. Every serious safety requirement gets met with the objection that it’ll slow American competitiveness.

That objection isn’t always made in bad faith. The concern is genuine. But when it becomes the primary counterargument to every safety measure, there’s no version of the risk serious enough to override it. That’s not a governance framework. It’s a race with no finish line and no rules about where you can’t drive.

Geoffrey Hinton, Nobel laureate, says safeguards to keep humans dominant are “not going to work.” Yann LeCun raised a billion dollars on the view that current LLMs can’t really reason and the existential risk fears are overblown. They are two of the founding figures of deep learning, and they disagree sharply on where this is heading.

Geoffrey Hinton (Nobel Prize in Physics, 2024): “Safeguards to keep humans dominant over AI are not going to work. They’re going to be much smarter than us.”
VS
Yann LeCun (Founder, Advanced Machine Intelligence): “Current LLMs can’t really reason, can’t really plan. The existential risk fears are overblown.”

A February 2026 survey of fifty-nine AI safety leaders put the median estimate of existential risk at twenty to twenty-nine percent, with fifteen percent putting it above seventy. These aren’t fringe voices. They built the systems.

The sharp disagreement between Hinton and LeCun isn’t a reason to dismiss the concern. It’s a reason to take it more seriously, because it means even the people closest to this don’t have enough certainty to be confident either way.


Capability advances weekly. The most comprehensive regulatory framework in the world won’t fully take effect until August 2026.

•   •   •

The Mythos leak gave us something genuinely useful: a direct quote from the people who built the thing, written before they had time to shape it for public consumption.

“It presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”

They didn’t write that for a press release. They wrote it internally, assuming nobody outside would read it. That’s the version of Anthropic’s assessment I find most credible.

And if they’re right, the question isn’t whether the institutions around us are adequate to manage what’s coming. We already know they’re not. The question is what we’re actually going to do about it, and who gets asked that question after the next wave lands.


2 Comments

  1. I liked the content here. As a beginner substacker who uses AI in the writing process, I think you might consider being more upfront about your use of it as a tool. This is interesting stuff. It reads to me as too artificial. Somewhat ironic given the importance of the topic.

    1. Fair point and I appreciate you saying it directly. I do use AI as part of the research and writing process. The research, the structure, the angle, the opinions are mine but AI helped put it together faster than I could alone. Will be more transparent about that going forward.
