Claude Sonnet 5 Release Notes
Claude Sonnet 5: High-Performance Agentic Capabilities
Claude Sonnet 5 is designed as the most agentic model in the Sonnet family, enabling autonomous planning, tool use (including browsers and terminals), and multi-step execution. It narrows the performance gap between the mid-tier Sonnet and high-tier Opus models, delivering capabilities close to Opus 4.8 while maintaining lower operational costs.
Key Performance Improvements
Sonnet 5 provides a strict improvement over Sonnet 4.6 in reasoning, tool use, coding, and general knowledge work. It is specifically optimized for agentic search (evaluated via BrowseComp) and computer use (evaluated via OSWorld-Verified).
Early access partners report significant gains in autonomous execution, including:
- Software Engineering: Handling sustained coding, debugging, and multi-step software engineering work in complex technical contexts.
- End-to-End Automation: Completing multi-part tasks (e.g., updating CRM tiers and sending announcements) without stalling.
- Bug Investigation: Autonomously writing reproducing tests, implementing fixes, and verifying the results in a single pass.
- Brownfield Code: Tracing failures to root causes in legacy codebases rather than applying superficial patches.
- Legal and Data Analysis: Improving legal research and providing faster time-to-insight for live data exploration.
Safety and Cybersecurity Guardrails
Sonnet 5 is generally safer than Sonnet 4.6, with lower rates of hallucination, sycophancy, and other undesirable behaviors. It is also more resistant to prompt injection attacks and better at refusing malicious requests.
Regarding cybersecurity:
- Capability Limits: Sonnet 5 was not specifically trained for cybersecurity tasks and performs substantially worse than Opus 4.8 and Mythos 5 in developing software exploits.
- Exploit Success: In tests involving Firefox 147 vulnerabilities, Sonnet 5 failed to develop any full working exploits, though it showed a slightly higher partial success rate than Sonnet 4.6 due to general intelligence gains.
- Default Safeguards: Real-time cyber safeguards are enabled by default to detect and block dangerous usage.
Pricing and Availability
Claude Sonnet 5 is available across all plans (Free, Pro, Max, Team, and Enterprise) and is integrated into Claude Code and the Claude Platform.
Pricing Structure:
- Introductory (through August 31, 2026): $2 per million input tokens / $10 per million output tokens.
- Standard (after August 31, 2026): $3 per million input tokens / $15 per million output tokens.
Note: Sonnet 5 uses an updated tokenizer that may produce roughly 1.0–1.35× as many tokens as previous versions; the introductory pricing is designed to offset this increase.
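The pricing arithmetic above can be sketched in Python. This is a minimal illustration, not an official calculator: the names Rate, INTRO, STANDARD, and cost_usd are hypothetical, the token counts in the usage example are made up, and the sketch assumes the tokenizer multiplier applies equally to input and output tokens, which the notes do not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """Price in USD per million tokens (hypothetical helper type)."""
    input_per_mtok: float
    output_per_mtok: float


# Rates taken from the pricing list above.
INTRO = Rate(input_per_mtok=2.0, output_per_mtok=10.0)      # through August 31, 2026
STANDARD = Rate(input_per_mtok=3.0, output_per_mtok=15.0)   # after August 31, 2026


def cost_usd(input_tokens: int, output_tokens: int, rate: Rate,
             tokenizer_multiplier: float = 1.0) -> float:
    """Estimate the cost of a workload in USD.

    input_tokens and output_tokens are counts under the previous tokenizer.
    tokenizer_multiplier scales them to the updated tokenizer; the notes
    give a range of 1.0 to 1.35. Applying the same multiplier to both
    input and output is an assumption of this sketch.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if tokenizer_multiplier <= 0:
        raise ValueError("tokenizer_multiplier must be positive")
    scaled_in = input_tokens * tokenizer_multiplier
    scaled_out = output_tokens * tokenizer_multiplier
    return (scaled_in * rate.input_per_mtok
            + scaled_out * rate.output_per_mtok) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens, 200k output tokens.
    for label, rate, mult in [
        ("intro, worst-case tokenizer", INTRO, 1.35),
        ("standard, worst-case tokenizer", STANDARD, 1.35),
        ("standard, no token inflation", STANDARD, 1.0),
    ]:
        print(f"{label}: ${cost_usd(1_000_000, 200_000, rate, mult):.2f}")
```

For this hypothetical workload, the worst-case 1.35× multiplier gives about $5.40 at introductory rates and about $8.10 at standard rates. With no token inflation the same workload costs about $6.00 at standard rates. Under these assumptions the introductory discount more than covers the worst-case increase, since 1.35 × $2 = $2.70 per million input tokens is still below $3.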
Community Insights and Counterpoints
While Anthropic highlights the agentic gains, some developers and analysts have raised concerns regarding the model's efficiency and utility:
- Cost-Performance Trade-off: Some users suggest that running Sonnet 5 at high effort levels may be less cost-efficient than running Opus 4.8 at a lower effort level, because Sonnet 5's cost per task can rise above that of Opus.
- Knowledge Gaps: Independent testing suggests potential weaknesses in trivia and built-in knowledge, as well as occasional failures in combined tool-calling tasks and complex puzzle solving.
- Agentic vs. Assisted Development: Some developers argue that optimizing for fully autonomous agents can degrade the model's performance in "assisted development," where the model may overstep strict instructions.
Representative community comments:
"The cost per task chart is telling me that I should never use Sonnet 5 above medium effort level - Opus always performs better for a given cost."
"I have found that the more models are optimized for fully agentic development, the worse they get at assisted development and often start doing too much despite very strict/specific instructions."