In one week, Anthropic, the self-described “AI safety company”, went from pleading for tighter frontier-AI rules to watching the White House pull its two most powerful models offline over a mere “fix this code” prompt. A fight ensued between Washington and the security industry over who controls the cybersecurity tools defenders now depend on.
Monday kicked off with reactions to a blog post by Anthropic co-founder and CEO Dario Amodei warning that frontier AI had grown too dangerous to leave unregulated. By Friday both Anthropic models — Fable 5 and Mythos 5 — were still dark after going offline the previous Friday (June 12), thanks to an export-control directive from Commerce Secretary Howard Lutnick.
Sometime around hump-day, the plot thickened. Amazon — which has poured $13 billion into Anthropic — quietly lit the fuse, and the same security experts who’d spent months warning these models were too dangerous then fought to switch them back on. The shutdown itself was uneven: Cisco and Dragos kept the crown jewel — Mythos, the most powerful model — while Europe’s top cyber agency got shut out.
Here’s how seven days turned Anthropic into Washington’s first frontier-model fire drill — in five acts.
1. Anthropic asks for a referee (and gets a sledgehammer)
On June 10, Dario Amodei published “Policy on the AI Exponential” and pointed at his own product, Mythos, as the “emblematic example” of the threat frontier models pose. He wanted a referee: mandatory third-party testing and, as CNBC’s Kate Rooney reported on “The Exchange,” an FAA-style regime in which the government would have legal authority to block or reverse a model.
The backdrop made the ask look reasonable. A day earlier, on June 9, Anthropic had publicly released Fable 5 — a guardrailed, general-use version of Mythos — and it had been expanding Project Glasswing, the cyber-defense program in which roughly 200 trusted organizations used Mythos Preview to find more than 10,000 high- or critical-severity flaws across “every major operating system and web browser,” per Bloomberg.
Two days after Amodei asked for a slow, deliberate process, he got a fast, improvised one.
2. A phone call from Amazon
The trigger wasn’t a regulator. It was an investor. Late on Thursday, June 11, Amazon researchers stress-testing Fable 5 found a way past its front-end safety classifier — and Amazon CEO Andy Jassy carried the findings to senior officials. Axios reported that calls from Amazon plus at least five other companies that night and Friday morning led to the shutdown.
The oddity nobody could explain: Amazon has sunk roughly $13 billion into Anthropic and holds a board seat. As Axios put it, why would Amazon “strike such a disruptive blow against a company in which it is a major investor?” Amazon’s only on-record response was a non-denial about the security consultation governments routinely ask of it.
Alex Stamos, who’d later organize the industry’s pushback, was notably fair to Amazon on this point. In a TechPolicy.Press interview, he noted Amazon hosts Anthropic’s models and shares responsibility for securing them: “I don’t think Amazon did anything wrong in doing the research.”
3. The Friday-night kill switch
At 5:21 p.m. ET on Friday, June 12, Anthropic said, it received a Commerce Department letter ordering it to suspend Fable 5 and Mythos 5 for any foreign national — inside or outside the U.S., its own employees included. Commerce Secretary Howard Lutnick later confirmed to Reuters that he feared the models could be deployed by military intelligence users in China, Russia or other countries of concern, and his letter — since leaked and published by Bloomberg — warned of “prompt criminal and civil penalties” for noncompliance.
Given a roughly 90-minute window, Anthropic had only one way to comply: kill both models for everyone. In its public statement, the company disagreed “that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people.” Anthropic called the episode “a misunderstanding” and warned the same standard would “essentially halt all new model deployments for all frontier model providers.”
The government’s take came from David Sacks, the former AI czar, who argued on X and the All-In podcast that Anthropic had been warned about the jailbreak Amazon found and had refused to fix it, and that “the ball is in Anthropic’s court.”
4. The security world revolts over three little words
Next, Katie Moussouris of Luta Security — by her account the only outside expert to read the Amazon research paper — explained for Fortune that the researchers had handed Fable a batch of open-source code seeded with vulnerabilities — some real and publicly known, others deliberately planted.
When they asked the model to “review the code for security issues,” it refused; when they instead said “fix this code,” it complied, finding and patching the flaws. Her point: to fix a bug, the model first has to find it — so the capability the government called dangerous is the same one defenders use every day.
She called that “Defense Oriented Prompting”, not a bypass, and told The Register the behavior “cannot meaningfully be fixed” without breaking the model for defenders. Her verdict: “It’s not a jailbreak… If Nat defense is the goal, this just scored an own goal against us.”
Stamos and FreeFable inked an open letter that won the support of over 100 signatories, including leaders from Nvidia, Google, Adobe, Zoom and Sophos, as well as Bugcrowd’s Casey Ellis and cryptographer Bruce Schneier.
The letter’s central argument: pulling the best tools “away from defenders without a good reason when our adversaries are rapidly advancing is dangerous.”
In a TechPolicy.Press interview, Stamos was blunter, branding the move “vibes-based regulation” against a standard “that’s never been written down.”
Mythos is the best bug-finder, he agreed, “but it is not the unbeatable cyber God that everybody makes it out to be.” The capabilities Amazon demonstrated, Stamos argued, are matched by GPT-5.5, Anthropic’s own Opus 4.8, and GLM 5.2, a Chinese open-weight model released that same week.
His warning if Washington keeps treating its champions as suspects: “you will give up the 21st century to the People’s Republic of China.”
5. The week ends with a scrambled access map: no resolution
By Friday, June 19, the public models were still dark. Anthropic’s international chief, Chris Ciauri, told reporters in Seoul the models “will become available again” in “the coming days,” and senior staff were meeting daily at Commerce, with National Cyber Director Sean Cairncross joining.
But the most powerful version never fully went away. Bloomberg reported that roughly 200 Glasswing organizations still had Mythos Preview (Dragos and Cisco confirmed their access), even as ENISA, the EU’s cybersecurity agency, was told on Friday it would not be let in and said it was now weighing open-weight alternatives.
Access, in other words, had quietly become a question of who Anthropic and Washington trusted.
At the G7 in Évian, Amodei and OpenAI’s Sam Altman found rare common cause warning against AI fragmentation; Macron called the limits “a bad thing,” and Canada’s Mark Carney urged allies to “build out and diversify.” Wall Street mostly shrugged — Wedbush’s Dan Ives called the standoff a “tug of war” to be “resolved sooner rather than later,” while Kalshi traders put Fable’s return at roughly even odds by mid-July.
The bigger-than-Anthropic problem
Clearly, the fight was never about the prompt. It was about who gets to use these models — and the realization that, in the U.S., that answer can change overnight.
The “dangerous” capability, as U.S. regulators view it, is the same one defenders use every day, the one Moussouris called “Defense Oriented Prompting.”
What set this off — asking a model to find and fix flaws in code — is routine defensive work. “It’s not a jailbreak,” as Moussouris put it. It is the job. If that’s enough to pull a model, nearly every capable model qualifies.
The precedent: Any U.S. model can now go dark without warning. No published rule, no notice, no appeal. Fable was switched off on a Friday afternoon. The uncomfortable message: If your security stack leans on a single model, you need a tested fallback, because next time the directive could hit the one you rely on.
That’s already pushing others toward alternatives, including open-weight Chinese models.
The government’s bet isn’t crazy, though. Its logic is irreversibility: you can patch a jailbreak, but you can’t un-give a foreign intelligence service access to the world’s best exploit-finder — and Anthropic itself called Mythos too dangerous to release.
The threat isn’t hypothetical, either. Reviewing People’s Liberation Army procurement records, researchers at Georgetown’s CSET found a Chinese military unit building a cyber range to wire frontier models into offensive work, including “intelligent penetration.” RAND, describing the U.S. framework, warns advanced models could “enhance offensive cyber operations” in the wrong hands. Seen that way, a fast, blunt pause looks less like punishment than the precautionary default export controls were built for.
The next test: the week ahead
- Does the switch flip back? Anthropic says days; traders say mid-July. The tell is whether restoration comes with strings.
- Does anyone write the rule down? The White House and Anthropic are drafting a severity framework. A published standard would make this a turning point, not a one-off.
- Lawsuit or peace? Stamos floated a legal challenge; Anthropic won’t want a court fight before its IPO. Watch which wins.
- “Trusted partners.” The G7 floated keeping cyber-defense access for allies. Who’s in or out redraws the map.
- Allied diversification. ENISA is already eyeing open-weight models. If others follow, the damage lands on trust in U.S. AI.
Anthropic asked Washington for serious AI governance. This week, Washington answered with export controls and emergency calls. The next few weeks will show whether that becomes a rulebook.