“`html
- Claude Sonnet 4.6: a much more than incremental update
- 1 million tokens in context: what really changes
- Coding performance that shakes up the model hierarchy
- Computer use: a capability reaching maturity
- Claude for researchers: scientific reasoning at scale
- How to use Claude Sonnet 4.6: practical guide
- Claude in the ecosystem: much more than a simple chatbot
- Safety and ethics: the Anthropic difference
- Conclusion: Claude Sonnet 4.6 redefines mid-range standards
Anthropic has just taken a major step forward in the artificial intelligence race. Claude Sonnet 4.6, launched on February 17, 2026, establishes itself as the most powerful mid-range model ever produced by the American laboratory. A context window of one million tokens, coding performance rivaling superior class models, increased resistance to prompt injection attacks: this model is redefining expectations for developers, researchers, and enterprises. Whether you’re a Claude user on the public interface or a developer integrating Anthropic’s API, here’s everything you need to know about this major update.
Claude Sonnet 4.6: a much more than incremental update
Anthropic follows an approximately four-month update cycle for its Sonnet lineup. But Claude Sonnet 4.6 breaks with the tradition of marginal improvements. The model is described by Anthropic itself as “a complete upgrade of the model’s capabilities in the areas of coding, computer use, long-context reasoning, agent planning, knowledge work, and design”.
The release comes just two weeks after the launch of Claude Opus 4.6, the flagship model of the lineup, released on February 5, 2026 with autonomous agent team features. An updated Haiku model should follow in the coming weeks, completing the new generation of the Claude 4.6 family.
Claude Sonnet 4.6 becomes the default model for all Free and Pro plan users on claude.ai. Pricing remains identical to Sonnet 4.5, at $3 / $15 per million tokens for input/output via the API — an aggressive pricing position given the announced performance.
1 million tokens in context: what really changes
One of the most striking announcements for this version is the one million token context window in beta, double what was available for previous Sonnet versions. To give a concrete idea: this corresponds to the entirety of a medium-sized codebase, several dozen scientific research articles, or voluminous legal contracts — in a single request.
But what distinguishes Claude Sonnet 4.6 from its competitors on this point is not just the amount of manageable context, it’s the quality of reasoning over this extended context. The model was tested on the Vending-Bench Arena benchmark, which evaluates a model’s ability to manage a simulated business over time in competition against other AIs. Sonnet 4.6 developed a novel strategy: massively invest in capacity during the first ten simulated months, then pivot abruptly toward profitability in the final sprint — an approach that competing models failed to adopt.
This capacity for long-horizon planning is directly related to the extended context window: the more a model can “remember” information within a session, the more complex and coherent strategies it can develop.
Context compaction, also available in beta on the developer platform, complements this functionality: it automatically summarizes old exchanges as the conversation approaches the limits, thus increasing the effective length of the context seamlessly.
Evolution of Sonnet model scores on OSWorld — Source: Anthropic
Coding performance that shakes up the model hierarchy
For developers, Claude Sonnet 4.6 is arguably the most anticipated update of the year. In tests conducted within Claude Code — Anthropic’s command-line coding tool — early access users preferred Sonnet 4.6 to Sonnet 4.5 in 70% of cases. More surprisingly: they preferred Sonnet 4.6 to Opus 4.5 (the flagship model from November 2025) in 59% of cases.
Qualitative feedback from developers is explicit:
- Less overengineering: the model no longer generates unnecessarily complex code.
- Less “laziness”: it follows instructions through to completion, without shortcuts.
- Fewer hallucinations: fictitious success statements, frequent in older models, are substantially reduced.
- Better long-session handling: the model reads the context before modifying code, and consolidates common logic rather than duplicating it.
Leading companies confirm these results. GitHub (Joe Binder, VP Product) highlights Sonnet 4.6’s excellence on complex code fixes involving large codebases. Cursor (Michael Truell, co-founder and CEO) qualifies it as a “notable improvement on long tasks and difficult problems”. Replit (Michele Catasta, President) evokes an “extraordinary performance/cost ratio”, with superior performance on the most complex agentic workloads.
On the SWE-Bench Verified benchmark, the industry reference for software engineering, Claude Sonnet 4.6 sets a new record. With a prompt modification, the score even reaches 80.2% — an unprecedented level for a mid-range model.
Computer use: a capability reaching maturity
Since Anthropic introduced, in October 2024, the first general-purpose model capable of using a computer like a human, progress has been steady. Claude Sonnet 4.6 marks a decisive new step on the OSWorld benchmark, which tests hundreds of tasks in simulated environments including Chrome, LibreOffice, VS Code, and other common software — without dedicated APIs, using only virtual mouse clicks and keyboard input.
Early users of Sonnet 4.6 report human-level capabilities on tasks such as navigating complex spreadsheets or filling out multi-step forms spread across multiple browser tabs.
A major security issue accompanies this increase in power: prompt injection attacks, where malicious actors hide instructions in web pages to divert the behavior of the model. Anthropic indicates that Sonnet 4.6 is a significant improvement over Sonnet 4.5 in terms of resistance to these attacks, with performance similar to Opus 4.6 on this point.
The insurance company Pace achieved a score of 94% on its specialized benchmark for computer use for insurance workflows (submission intake, first notice of loss) — a level it qualifies as “critical for its operations”.
Claude for researchers: scientific reasoning at scale
Beyond coding, Claude Sonnet 4.6 asserts itself as a powerful tool for scientific and analytical research. The ability to ingest dozens of scientific publications in a single request opens unprecedented possibilities for researchers: bibliographic synthesis, methodology comparison, structured data extraction from large corpora.
The score of 60.4% on ARC-AGI-2 deserves particular attention. This benchmark, designed to measure specifically human skills — such as analogical reasoning and abstraction — is one of the most difficult in the field. While Sonnet 4.6 is surpassed by Opus 4.6, Gemini 3 Deep Think, and a refined version of GPT-5.2, it positions itself above the majority of models comparable to the same price segment.
On Humanity’s Last Exam (HLE), an extremely difficult benchmark composed of expert questions from dozens of academic disciplines, Claude Sonnet 4.6 also exceeds Sonnet 4.5. Tests were conducted with web search, code execution and extended thinking tools enabled, reflecting real-world usage conditions for a researcher.
Box reports a 15 percentage point improvement over Sonnet 4.5 on complex Q&A reasoning tasks applied to real business documents. Hebbia, specialized in financial document analysis, observes “a significant jump in answer matching rate” on its financial services benchmark.
Claude Sonnet 4.6 performance on major benchmarks — Source: Anthropic
How to use Claude Sonnet 4.6: practical guide
Via claude.ai (public users)
Claude Sonnet 4.6 is now the default model on claude.ai for Free and Pro plans, as well as on Claude Cowork. No manipulation is necessary: the change is automatic. The free plan has also been upgraded and now includes file creation, connectors, skills, and compaction.
Via the API (developers)
For developers integrating Claude via the API, the model identifier is: claude-sonnet-4-6
Claude Sonnet 4.6 is available on:
- The Claude developer platform (platform.claude.com)
- Amazon Bedrock
- Google Cloud Vertex AI
- Claude Code (CLI tool)
The model supports adaptive thinking and extended thinking, as well as context compaction in beta. Anthropic recommends exploring the spectrum of thinking modes to find the optimal balance between speed and performance based on your use case.
New API features now available
With this version, Anthropic also makes generally available several features that were in beta:
- Code execution
- Memory
- Programmatic tool calling
- Tool search
- Tool use examples
The web search and fetch tools have also been improved: they now automatically execute code to filter and process search results, retaining only relevant content in context — improving both answer quality and token efficiency.
Claude in the ecosystem: much more than a simple chatbot
Claude is today much more than a conversation interface. The Anthropic ecosystem has grown considerably around the model:
- Claude Code: a CLI tool for agentic coding, now powered by default by Sonnet 4.6.
- Claude in Chrome: a web navigation agent integrated directly into the browser.
- Claude in Excel: a spreadsheet agent that now supports MCP connectors to interface with external data (S&P Global, PitchBook, Moody’s, FactSet, etc.) without leaving Excel.
- Claude in PowerPoint: an agent for presentation creation.
- Claude Cowork: a desktop tool for non-developers enabling file and task management automation.
This ecosystem expansion reflects Anthropic’s strategy: to make Claude a universal assistant capable of integrating into all professional workflows, whether software development, financial analysis, academic research, or operational management.
Safety and ethics: the Anthropic difference
As with each new model, Anthropic has conducted thorough safety evaluations on Sonnet 4.6. The conclusion is reassuring: the model exhibits “an overall warm, honest, prosocial and sometimes humorous character, with very solid safety behaviors and no signs of major concern regarding high-stakes forms of misalignment”.
This particular attention to safety distinguishes Anthropic from several of its competitors. The approach known as Constitutional AI, which guides the training of all Claude models, aims to align the behavior of the model with explicit human values, beyond simple compliance with standard moderation rules.
Conclusion: Claude Sonnet 4.6 redefines mid-range standards
Claude Sonnet 4.6 is not merely an update: it’s a strategic repositioning. By bringing Sonnet’s performance closer to Opus’s at an unchanged price, Anthropic is putting pressure on the entire industry. For developers, the combination of a one million token context, record coding capabilities, and improved resistance to prompt injections makes it a top choice. For researchers, the ability to reason over large corpora opens new perspectives. For enterprises, the performance/cost ratio is hard to match.
Key takeaways:
- Default model on all Free and Pro plans starting February 17, 2026
- 1 million token context window in beta
- ARC-AGI-2 score of 60.4%, above most comparable models
- Record on SWE-Bench Verified (80.2% with prompt modification)
- API: identifier
claude-sonnet-4-6, pricing starting at $3/million tokens - Available on Amazon Bedrock, Google Vertex AI, and direct access via claude.ai
To test Claude Sonnet 4.6 right now, visit claude.ai or check out Anthropic’s developer documentation.
Sources: Anthropic
“`
