Opus 4.6 released: Agent Teams now available
Opus 4.6 gets 1M context + Finance Agent #1 ranking
Software stocks down 20%+ as 'vibe working' era begins
1M context, Dev Team mode, $3/$15 pricing confirmed
The "Opus Killer" Has Arrived
On February 3, a Vertex AI error log revealed what developers had been waiting for: Claude Sonnet 5, codenamed "Fennec." Two days later, Anthropic officially released Opus 4.6 with Agent Teams. But the real story is what testing reveals about both models.
Hands-on testing from TestingCatalog and other sources confirms what the leaked benchmarks suggested: Sonnet 5 delivers Opus-tier coding performance at Sonnet-tier pricing. For developers building with Claude, this changes the cost-capability equation entirely.
Background Reading
Claude Sonnet 5 Is Here. Opus 4.6 Just Dropped. What You Need to Know. → Full timeline from the Vertex AI leak to today's official releases.
What Testers Actually Found
Beyond the benchmark numbers, early access testing has revealed specific capabilities that matter for production development:
1. Superior UI Code Generation
Testers report Sonnet 5 produces "the most complete, detailed ASCII world map ever seen" from an AI model, an indicator of its improved spatial reasoning and structured output generation. For frontend developers, this translates to better component generation and more accurate UI implementations.
2. 1M Context Window (Confirmed)
Official sources confirm a 1 million token context window with "near-zero latency," a 5x increase from the 200K context of Sonnet 4.5. This enables processing entire codebases, lengthy documents, or extended conversation histories in a single context.
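To get a feel for what "an entire codebase in a single context" means in practice, here is a minimal sketch that estimates whether a set of files would fit in a 1M-token window. The 4-characters-per-token ratio is an assumed rule of thumb, not a real tokenizer, and `reserve_for_output` is a hypothetical parameter that leaves headroom for the model's reply.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # window size reported in the article
CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not an actual tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(texts: list[str], reserve_for_output: int = 50_000) -> bool:
    """Return True if the combined texts likely fit, leaving room for a reply."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserve_for_output <= CONTEXT_WINDOW_TOKENS
```

For real budgeting, swap the heuristic for the token counts reported by whichever API you call; the estimate here is only good enough for a first sanity check.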
3. Stronger Coding Than Opus 4.5
The most significant finding: in coding workflows, Sonnet 5 outperforms Opus 4.5. Testers note "stronger coding output" particularly in structured generation tasks, API implementations, and complex refactoring operations.
The Benchmark Trajectory
Claude's SWE-Bench scores have risen from 49.0% (Claude 3.5 Sonnet, June 2024) to 82.1% in about 20 months, a relative gain of roughly 67%. Sonnet 5's 82.1% breaks the 80% "human parity" threshold that many considered the ceiling for AI coding assistants:
SWE-Bench Verified scores for Claude models. Feb 2026 data is projected.
The trajectory tells the story: each Sonnet generation narrows the gap with Opus, and Sonnet 5 finally surpasses it.
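The trajectory can be checked with a few lines of arithmetic. This sketch uses the Sonnet scores quoted in this article (the Sonnet 5 figure is the leaked one) and computes the relative gain between generations; the short labels and function names are illustrative only.

```python
# SWE-Bench scores for the Sonnet line, as quoted in this article.
SONNET_SCORES = [
    ("Sonnet 3.5", 49.0),
    ("Sonnet 3.7", 62.3),
    ("Sonnet 4", 70.0),
    ("Sonnet 5", 82.1),
]


def relative_gain(old: float, new: float) -> float:
    """Relative improvement of new over old, as a fraction of old."""
    return (new - old) / old


def generation_gains(scores: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Relative gain of each generation over the one before it."""
    return [
        (name, relative_gain(prev, cur))
        for (_, prev), (name, cur) in zip(scores, scores[1:])
    ]


overall = relative_gain(SONNET_SCORES[0][1], SONNET_SCORES[-1][1])
print(f"Overall relative gain: {overall:.1%}")
for name, gain in generation_gains(SONNET_SCORES):
    print(f"{name}: +{gain:.1%} over previous generation")
```

Note that this is a relative gain over the starting score; in absolute terms the jump is 33.1 percentage points.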
The Pricing Advantage
This is where it gets interesting for production workloads. Based on leaked pricing, here's the cost comparison:
| Model | Release | SWE-Bench Verified | Price per 1M tokens (input / output) |
|---|---|---|---|
| Claude 3.5 Sonnet | June 2024 | 49.0% | $3 / $15 |
| Claude 3.7 Sonnet | Feb 2025 | 62.3% | $3 / $15 |
| Claude Sonnet 4 | May 2025 | 70.0% | $3 / $15 |
| Claude Opus 4.5 | Nov 2025 | 80.9% | $15 / $75 |
| Claude Opus 4.6 | Feb 5, 2026 | TBD | $15 / $75 |
| Claude Sonnet 5 | Feb 2026 | 82.1% | $3 / $15 |
At $3/$15 per million input/output tokens versus Opus 4.5's $15/$75, Sonnet 5 costs 80% less on both input and output (one-fifth the price) while posting a higher SWE-Bench score (82.1% vs. 80.9%).
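To see what that difference means for a real bill, here is a small sketch that prices a workload at the per-million-token rates quoted in this article. The dictionary keys are illustrative labels, not official API model IDs, and the rates are the leaked figures rather than a published price list.

```python
# (input, output) price in USD per 1M tokens, as quoted in this article.
PRICES_PER_MTOK = {
    "sonnet-5": (3.00, 15.00),
    "opus-4.5": (15.00, 75.00),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of a workload at the quoted per-million-token rates."""
    in_price, out_price = PRICES_PER_MTOK[model]
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price


def savings_fraction(cheaper: str, pricier: str,
                     input_tokens: int, output_tokens: int) -> float:
    """Fraction saved by running the same workload on the cheaper model."""
    low = workload_cost(cheaper, input_tokens, output_tokens)
    high = workload_cost(pricier, input_tokens, output_tokens)
    return 1 - low / high
```

For example, a month of 10M input and 2M output tokens comes to $60 at the Sonnet 5 rates and $300 at the Opus 4.5 rates. Because both rates differ by the same 5x factor, the saving stays at 80% regardless of the input/output mix.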
When to Use Sonnet 5 vs. Opus 4.6
With Opus 4.6 now officially released (Feb 5, 2026), here's the updated decision framework for developers:
Choose Sonnet 5 For:
- Code generation and refactoring tasks
- High-volume API calls where cost matters
- UI component generation and frontend work
- Structured output generation (JSON, YAML, configs)
- Production workloads currently on Sonnet 4.5
Choose Opus 4.6 For:
- Complex multi-agent workflows using Agent Teams (parallel task coordination)
- Financial analysis, where it tops the Finance Agent benchmark
- Research and analysis requiring maximum capability with 1M context
- Enterprise workflows involving PowerPoint, document review, and cross-tool integration
- Mission-critical applications where cost is secondary to quality
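The two lists above can be turned into a simple routing rule. This is only a sketch of one way to encode them: the `Task` fields, the task-kind strings, and the model labels are all hypothetical names chosen for illustration, not part of any Anthropic API.

```python
from dataclasses import dataclass

SONNET = "sonnet-5"   # illustrative label, not an official model ID
OPUS = "opus-4.6"     # illustrative label, not an official model ID

# Task kinds the article steers toward Opus 4.6 (names are hypothetical).
OPUS_KINDS = {"financial_analysis", "research", "enterprise_documents"}


@dataclass
class Task:
    kind: str                      # e.g. "code_generation", "ui_component"
    needs_agent_teams: bool = False
    mission_critical: bool = False
    cost_sensitive: bool = True


def choose_model(task: Task) -> str:
    """Route a task to a model following the article's decision lists."""
    if task.needs_agent_teams:
        return OPUS
    if task.kind in OPUS_KINDS:
        return OPUS
    if task.mission_critical and not task.cost_sensitive:
        return OPUS
    # Default: coding, UI, structured output, and high-volume work.
    return SONNET
```

Defaulting to the cheaper model and escalating only on explicit signals keeps high-volume traffic on the lower rate, which is the main point of the framework.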
New Capabilities
Beyond the core benchmarks, both models ship with significant new features:
"Dev Team" Mode (Confirmed)
The "Dev Team" mode is now confirmed. A Manager Agent (Sonnet 5) analyzes your high-level goal and spawns specialized sub-agents (Backend, QA, Infrastructure) that work simultaneously on different files. If agents propose conflicting changes, the Manager reconciles the logic before presenting the final PR. This is separate from Opus 4.6's Agent Teams, which take a more general approach to parallel task coordination.
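The manager-and-sub-agents pattern described here can be sketched in plain Python. Everything in this example is a stand-in: `run_sub_agent` returns canned changes instead of calling a model, and the "earlier role wins" rule in `reconcile` is an assumed placeholder, since the reports do not say how the Manager actually resolves conflicts.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    agent: str
    path: str
    content: str


async def run_sub_agent(role: str, goal: str) -> list[Change]:
    """Hypothetical stand-in for a model call; returns canned changes."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    plans = {
        "backend": [Change("backend", "api.py", "def handler(): ...")],
        "qa": [
            Change("qa", "test_api.py", "def test_handler(): ..."),
            Change("qa", "api.py", "def handler(): return 200"),  # conflicts
        ],
        "infrastructure": [Change("infrastructure", "deploy.yaml", "replicas: 2")],
    }
    return plans[role]


def reconcile(changes: list[Change], priority: list[str]) -> dict[str, Change]:
    """Keep one change per file; on conflict, the earlier role in priority wins."""
    chosen: dict[str, Change] = {}
    for change in changes:
        current = chosen.get(change.path)
        if current is None or priority.index(change.agent) < priority.index(current.agent):
            chosen[change.path] = change
    return chosen


async def manager(goal: str, roles: list[str]) -> dict[str, Change]:
    """Run all sub-agents concurrently, then merge their proposals."""
    batches = await asyncio.gather(*(run_sub_agent(r, goal) for r in roles))
    all_changes = [c for batch in batches for c in batch]
    return reconcile(all_changes, priority=roles)


def run_dev_team(goal: str) -> dict[str, Change]:
    return asyncio.run(manager(goal, ["backend", "qa", "infrastructure"]))
```

Calling `run_dev_team("add a health endpoint")` yields one change per file, with the backend agent's version of `api.py` kept over the QA agent's conflicting edit. A real manager would presumably merge the logic rather than pick a winner; the priority rule just keeps the sketch short.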
TPU Optimization (Confirmed)
Sonnet 5 is optimized for Google's "Antigravity TPU" infrastructure, explaining its speed improvements while maintaining quality. This enables higher throughput and reduced latency for production workloads.
Current Status
The landscape has shifted significantly since the original leak:
- Opus 4.6 officially released: Launched Feb 5 with Agent Teams, 1M context, PowerPoint integration, and GitHub Copilot support
- Sonnet 5 live but no formal announcement: Available across Anthropic API, Amazon Bedrock, and Vertex AI per multiple sources
- Pricing widely confirmed: $3/$15 per million tokens reported by multiple independent sources
- 1M context confirmed: Both Sonnet 5 and Opus 4.6 now offer 1M token context windows
- Dev Team mode confirmed: Multi-agent spawning with Manager Agent orchestration is real and working

