anthropic.com
Anthropic published a large-scale study of real agent behavior drawn from millions of interactions across Claude Code and its public API. The standout finding: Claude stops to ask for human input more than twice as often as users actually interrupt it, which flips the usual framing around AI oversight. Experienced users auto-approve in over 40% of sessions, compared to 20% among newcomers, and the longest autonomous runs nearly doubled to 45-plus minutes in just three months. Only 0.8% of tool calls involved irreversible actions like sending customer emails. The research makes a quiet but pointed case that pre-deployment safety testing alone misses the dynamics that matter, a position that puts Anthropic ahead of OpenAI and Google in building measurement infrastructure for agentic deployment at scale.
