News

Anthropic Wants a Global AI Pause While Claude Writes Its Own Code

Anthropic revealed Claude now writes over 80% of its production code while calling for a verifiable global pause on AI development that no coordination mechanism yet supports.

Published

1 month ago

June 6, 2026

Henry Fox

In May 2026, Claude wrote more than 80% of the code merged into Anthropic’s own systems, including the code Anthropic uses to train future versions of Claude. Before the company launched its Claude Code tool in February 2025, that figure sat in the low single digits. Now Anthropic says this feedback loop is moving faster than most institutions are prepared for, and it wants the world to start building a brake.

The paper making that argument, “When AI Builds Itself,” was published June 4 by the Anthropic Institute, the company’s in-house research organization. Marina Favaro, who leads the institute, and Jack Clark, Anthropic’s co-founder and head of policy, co-authored it. The authors describe a process already running inside Anthropic’s own labs: AI models helping to train their own successors, with human involvement narrowing at each step. The paper arrives one week after Anthropic confidentially filed a draft registration statement with U.S. regulators, the first formal step toward a public stock listing.

Four Months to Double

METR, a nonprofit dedicated to evaluating AI capabilities and safety, tracks what it calls the task-completion time-horizon benchmark: the length of a software or machine-learning task that a given AI agent can complete successfully about half the time. Since 2024, that horizon has been doubling roughly every four months, down from a doubling time of about seven months over the 2019-to-2025 period. The Anthropic Institute paper links the acceleration partly to AI models beginning to contribute to each other’s development pipelines. METR’s task suite spans software engineering, machine learning, and cybersecurity. By May 2026, the organization noted its 16-hour measurement ceiling was becoming a constraint, with the most capable models pushing against what the test can currently resolve.

Anthropic runs a complementary internal code-optimization benchmark. It hands a model a working training script and asks it to make the same computation run faster without changing the output. Results shifted dramatically from 2025 to 2026, with the company’s research models hitting speedups far beyond what a skilled human engineer achieves in four to eight hours of focused work. CORE-Bench, which tests whether AI can reproduce published scientific research results, went from about 20% success in 2024 to near-saturation in roughly 15 months.

4 minutes: the task length Claude Opus 3 could reliably complete, March 2024
90 minutes: Claude Sonnet 3.7’s task horizon, roughly one year later
12 hours: Claude Opus 4.6’s horizon by 2026, per the Anthropic Institute paper
52x: code speedup Mythos Preview, Anthropic’s internal research model, achieved by April 2026; skilled human engineers typically reach 4x with four to eight hours of work

Anthropic Claude recursive self-improvement global AI pause proposal

Who Writes the Code That Trains the Code

Engineering Output

By May 2026, Claude was writing more than 80% of all code merged into Anthropic’s production systems. Many of the company’s researchers haven’t hand-written code in months, the paper says. The typical Anthropic engineer now merges eight times as much code per day as in 2024.

One example the paper provides: a routine upgrade began crashing tens of thousands of Anthropic training jobs. An engineer pointed Claude at the live incident with cluster access. Working through running jobs and testing one environment setting at a time, Claude isolated the obscure debugging flag triggering the crash, reproduced it reliably, and confirmed a fix. Two hours of Claude work delivered what Anthropic estimated as two to three days of human engineering.

A March 2026 poll of 130 Anthropic research staff found the median respondent estimated roughly a 4x gain in personal research output when working with Mythos Preview. On the most open-ended engineering tasks, tracked by the SWE-bench Verified benchmark (a real-world measure of software engineering ability), Claude’s success rate climbed 50 percentage points in six months, reaching 76% in May 2026.

Research Judgment

The “When AI Builds Itself” paper also tested something harder to measure than lines of code. Anthropic researchers reviewed real Claude Code sessions from January through March 2026, specifically moments where a human researcher had taken a wrong turn in an open-ended investigation, pursuing a direction that sent the session sideways before it eventually got back on track. They showed different Claude models only the work from before things went off-course and asked what each model would do next. A separate Claude judge then evaluated which proposed step was better.

Opus 4.5, tested in November 2025, beat the researcher’s decision 51% of the time across 129 of these scenarios. By April 2026, Mythos Preview was beating it 64% of the time. The paper notes these sessions were deliberately selected because the human’s direction “had room for improvement,” making the comparison something short of a clean head-to-head test. Anthropic also uses these sessions to generate training data for future models, meaning the model judging which step is better has a direct influence on which reasoning patterns its successor learns.

Humans Still Set the Research Agenda

Choosing which research problems to pursue remains a human decision. The paper calls this “research taste”: the ability to recognize when an approach has hit a dead end before months of compute are spent on it, and to spot which directions are worth investigating in the first place. Claude models now propose better next steps than a researcher would choose more than half the time in Anthropic’s internal navigation tests, but they are still working within sessions that humans framed and initiated.

Clark told Interesting Engineering that as organizations “and eventually probably as societies, we need to figure out the tools to validate and verify” AI-generated work. He told Axios the goal of publishing the paper was to “socialize the concept and basically give people a sense of what’s coming.”

“The evidence suggests that the human role is narrowing at each step in the AI development process,” the authors wrote. The paper also draws a distinction between AI as engineering tool and AI as scientific reasoner. Current measurable gains are concentrated on the engineering side: writing code, running experiments, automating debugging. An AI that reliably identifies which research directions are worth pursuing without human framing would mark a different category of capability. “By its nature, a world driven by fast recursive self-improvement could become dominated by the self-improving model as its capabilities fully eclipse those of humans,” the paper states. The authors acknowledge that predicting what that world looks like economically is, at this point, beyond confident forecasting.

The Brake Nobody Has Built Yet

The Coordination Problem

Anthropic would slow its own development only under one condition: other frontier labs doing so simultaneously and verifiably, under the same rules, with tools capable of confirming compliance. Those tools don’t currently exist.

A meaningful slowdown or pause would require multiple well-resourced labs at or near the frontier, in multiple countries, agreeing to stop under the same conditions.

The passage, from the paper’s governance section, goes on to state that each participating lab would also need to verify the others had actually stopped. A credible pause would require:

Multiple frontier labs in multiple countries agreeing simultaneously to the same stopping conditions
A verification system letting each party confirm others have genuinely paused
Defined triggers: what conditions initiate a pause and who decides when to lift it
Technical infrastructure including compute audits, AI-provenance tracking, and training-run attestation

The Verification Gap

The paper draws an analogy to Cold War-era nuclear arms control treaties, then acknowledges why that comparison understates the difficulty. “Training runs are far easier to conceal than missile silos, their inputs are general-purpose,” the paper states. Nuclear inspections can target physical infrastructure that leaves unambiguous signatures. AI training runs use chips available on global commodity markets, run on distributed private data centers, and can in principle be split across jurisdictions to evade any single regulator. The incentive to defect quietly while competitors pause is large for any actor that continues unimpeded.

Anthropic says The Anthropic Institute will research three specific verification tools: provenance tracking for model weights, compute attestation standards letting regulators confirm whether a training run occurred, and mechanisms for frontier labs to signal compliance in a manner others can independently audit. The institute will organize conversations with policymakers, other AI companies, and civil society in the coming months. OpenAI addressed the same risk in a December 2025 blog post, describing recursive self-improvement as potentially dangerous if researchers don’t share information about it. Anthropic says it also plans to engage lawmakers directly about these risks over the same period.

The Argument Anthropic Has to Make

Anthropic filed a confidential draft registration statement with the Securities and Exchange Commission approximately one week before the paper’s publication. Revenue run rate was tracking toward $50 billion in annualized terms by the end of June 2026, up from $9 billion at the end of 2025. Claude Code’s run-rate revenue passed $2.5 billion in April 2026, more than doubling since the start of the year. The Anthropic Institute itself was founded only in March 2026, three months before this paper.

The company is simultaneously marketing the productivity gains Claude produces, pitching those gains to prospective public investors via its confidential IPO registration, and calling for a globally coordinated brake on the system generating those gains. By Anthropic’s own reasoning in the paper, the two arguments are inseparable: precisely because the system is accelerating, the coordination infrastructure cannot wait.

Critics read it differently. David Sacks, a venture capital investor who has served as an informal adviser to President Trump on technology policy, called Anthropic’s policy push a “regulatory capture agenda.” A technology analyst cited by SiliconANGLE said the productivity data was likely legitimate but described promoting progress toward recursive self-improvement as “a more calculated move.”

The paper addresses the competitive logic directly. “Without coordination, commercial and geopolitical rivalries are drowning out the larger existential-to-the-species aspects of the technology being built,” the authors wrote. A unilateral slowdown, they acknowledge, would primarily hand the lead to less cautious actors. The paper is explicit about why: a pause that only one lab observes leaves every other actor better positioned to keep accelerating, which is why the multi-party, multi-country structure is the only arrangement Anthropic says is worth building.

Clark told BBC Newsnight that Claude’s share of code authorship reaching 100% “is possible within two years.” The verification machinery his paper calls for doesn’t exist yet.