Claude Sonnet 5 makes agentic AI cheap enough to actually use

Anthropic released Claude Sonnet 5 on Tuesday, and the headline is simple: it is a smaller, cheaper model that can do what big models did a few months ago. The company is positioning it as the workhorse for running AI agents without burning through your API budget.

Sonnet 5 costs $2 per million input tokens and $10 per million output tokens through the end of August. After that the input price rises to $3, while the output price stays at $10. That undercuts Opus 4.8, the larger model whose performance Anthropic says Sonnet 5 approaches. Sonnet 5 is also cheaper than OpenAI's GPT-5.5 and Google's Gemini 3.1 Pro. Only Gemini 3.5 Flash undercuts it on price.

The price point matters because the entire foundation model market has shifted. OpenAI's GPT-5.6 Sol, released in preview last week, is its most agentic model yet, splitting work across subagents for longer tasks. Google's Gemini 3.5 Flash, launched in May, was pitched as a shift from chatbot to agentic tool. Anthropic is making the same bet: agentic capability is no longer a differentiator; it is the baseline. What separates vendors now is how cheaply they can deliver that capability and how reliably the model behaves without a human watching over it.

Performance that cuts close to Opus

Anthropic published benchmark numbers that show Sonnet 5 making up ground on its bigger sibling. On agentic coding, Sonnet 5 scores 63.2 percent and Opus 4.8 scores 69.2 percent. Sonnet 4.6, the previous version released in February, scored 58.1 percent. That is a gain of roughly five points over Sonnet 4.6, leaving Sonnet 5 six points behind Opus 4.8. It is a meaningful step for a model that costs significantly less.
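To make that arithmetic concrete, here is a small sketch that works only from the three agentic coding scores quoted above. The "share of the gap closed" figure is derived here for illustration and is not a number Anthropic published.

```python
# Agentic coding scores (percent) as quoted in this article.
scores = {"Sonnet 4.6": 58.1, "Sonnet 5": 63.2, "Opus 4.8": 69.2}

gain = scores["Sonnet 5"] - scores["Sonnet 4.6"]          # improvement over the previous Sonnet
remaining_gap = scores["Opus 4.8"] - scores["Sonnet 5"]   # distance still left to Opus 4.8
previous_gap = scores["Opus 4.8"] - scores["Sonnet 4.6"]  # distance before this release
share_closed = gain / previous_gap                        # derived figure, not from Anthropic

print(f"gain over Sonnet 4.6: {gain:.1f} points")
print(f"remaining gap to Opus 4.8: {remaining_gap:.1f} points")
print(f"share of the previous gap closed: {share_closed:.0%}")
```

On these numbers, Sonnet 5 closes a little under half of the distance that separated Sonnet 4.6 from Opus 4.8 on this one benchmark.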

On knowledge work benchmarks, Sonnet 5 slightly outperforms Opus 4.8. That is surprising because Opus is known for handling the hardest problems, such as making subtle judgment calls and conducting deep research. Anthropic acknowledges that Opus 4.8 remains the choice for tasks requiring the highest accuracy, but the gap has shrunk enough that developers can make economic tradeoffs.

The model also finishes complex tasks that earlier versions would abandon partway through. Daniel Shepard, a senior engineer at Zapier, gave an example: he handed Sonnet 5 a two-part job that involved updating Salesforce account tiers and sending a launch announcement to enterprise contacts. The model ran the entire workflow end to end. Shepard said the previous model stalled halfway through that same job. That kind of reliability matters more than a point or two on a benchmark.

Safety improvements that matter for agents

Running an agent means giving a model access to tools like a browser, a terminal, or an API. If the model is easily tricked or prone to hallucinating, the agent becomes a liability. Anthropic says Sonnet 5 shows a lower rate of undesirable behaviors compared to Sonnet 4.6. It refuses malicious requests more reliably and does a better job resisting prompt injection attacks. It also hallucinates and engages in sycophantic behavior less often.

Fabian Hedin, co-founder of Lovable, said in a statement that Sonnet 5 refuses unsafe requests cleanly and consistently. He pointed out that models that know when to say no are as important as models that know how to build. That is a practical observation. If you are putting an agent in front of millions of users, the safety guardrails have to be baked into the model itself, not bolted on afterward.

That said, Sonnet 5 is not as safe as Opus 4.8 or Claude Mythos Preview when it comes to misaligned behavior. Anthropic also notes that Sonnet 5 has a much lower ability to perform dangerous cybersecurity tasks than the Opus models. That is good for safety, but it also means that if you are building an agent that needs to handle security-sensitive operations, you probably still want the larger model.

What this means for developers

The practical takeaway is that Anthropic is offering a tiered approach where developers can tune effort level against cost and performance. You use Sonnet 5 for the bulk of agentic work and escalate to Opus 4.8 only when you need the extra accuracy. That is the same pattern found in most tiered software pricing. You do not pay for the most expensive option for every task. You reserve it for the hard ones.

This is also a signal that the race in foundation models is moving from raw capability to operational efficiency. OpenAI, Google, and Anthropic all have models that can act as agents. The question is which one can do it at the lowest cost with the fewest failures. Sonnet 5 is Anthropic's answer. It is not the absolute best model, but it is good enough for a wide range of tasks at a price that lets you run agents at scale.

For teams building automated workflows, the math is straightforward. If your task requires deep reasoning or high accuracy, you pay for Opus. If your task is routine, you use Sonnet. The savings add up. A developer running millions of calls a month could see a significant reduction in API spend by routing most traffic to Sonnet and only sending edge cases to the larger model.
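A rough sketch of that routing logic might look like the following. Everything named here is an assumption made for illustration: the model labels are placeholders rather than confirmed API identifiers, the needs_high_accuracy flag is a hypothetical signal your own pipeline would have to supply, and the per-call costs are inputs you would fill in from your own billing data, since this article does not quote Opus 4.8 prices.

```python
from dataclasses import dataclass

# Placeholder labels for illustration; not confirmed API model identifiers.
SONNET = "sonnet-5"
OPUS = "opus-4.8"


@dataclass
class Task:
    description: str
    needs_high_accuracy: bool  # hypothetical flag set by the caller or an upstream check


def choose_model(task: Task) -> str:
    """Send routine work to the cheaper model and escalate only the hard tasks."""
    return OPUS if task.needs_high_accuracy else SONNET


def blended_cost(total_calls: int, escalation_rate: float,
                 sonnet_cost_per_call: float, opus_cost_per_call: float) -> float:
    """Estimate total spend when a fraction of calls is escalated to the larger model."""
    escalated = total_calls * escalation_rate
    routine = total_calls - escalated
    return routine * sonnet_cost_per_call + escalated * opus_cost_per_call


tasks = [
    Task("update account tiers in a CRM", needs_high_accuracy=False),
    Task("make a subtle judgment call on a research question", needs_high_accuracy=True),
]
for task in tasks:
    print(task.description, "->", choose_model(task))
```

The design choice worth noting is that the escalation decision lives outside the model call. That keeps the escalation rate visible as a number you can measure and tune, which is what drives the savings in blended_cost.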

One caveat on the pricing

The $2 per million input tokens is an introductory rate. It rises to $3 after August 31. That is still cheaper than Opus, but it narrows the gap. Developers who are evaluating Sonnet 5 now should plan for the price increase. The output price stays at $10, unchanged from the introductory period, so the increase applies only to input tokens. Output tokens remain the more expensive side of an API call, as they typically are.
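To see what the increase means for a budget, here is a minimal sketch using the prices quoted in this article. The monthly token volumes are made up for illustration, so substitute your own.

```python
# Prices in USD per million tokens, as quoted in this article.
INTRO_INPUT_PRICE = 2.00     # through the end of August
STANDARD_INPUT_PRICE = 3.00  # after August 31
OUTPUT_PRICE = 10.00         # the same in both periods


def sonnet5_cost(input_tokens: int, output_tokens: int, intro_rate: bool) -> float:
    """Return the USD cost of a workload at the introductory or standard rate."""
    input_price = INTRO_INPUT_PRICE if intro_rate else STANDARD_INPUT_PRICE
    return (input_tokens * input_price + output_tokens * OUTPUT_PRICE) / 1_000_000


# Hypothetical monthly workload: 500M input tokens and 100M output tokens.
before = sonnet5_cost(500_000_000, 100_000_000, intro_rate=True)
after = sonnet5_cost(500_000_000, 100_000_000, intro_rate=False)
print(f"intro: ${before:,.2f}  standard: ${after:,.2f}  increase: {after / before - 1:.0%}")
# prints "intro: $2,000.00  standard: $2,500.00  increase: 25%"
```

For this particular mix the bill goes up by a quarter. A workload that is heavier on output tokens would see a smaller relative increase, because only the input price changes.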

Anthropic also made Sonnet 5 the default model for free and Pro plans starting Tuesday. That means users on those plans will get the new model without doing anything. If you have workflows that depend on specific behavior from the previous Sonnet, you should test them before the switch takes full effect.

The bigger picture

Agentic AI is becoming commoditized. The models are good enough. The vendors are competing on price and reliability. Sonnet 5 is a rational product. It is not flashy. It does not claim to solve AGI. It claims to do the things that matter to developers, like finishing a two-part job without stalling, refusing bad requests, and costing less than the alternative.

If you are building agents today, the choice is increasingly about economics and safety, not about which model can write a better poem. Sonnet 5 makes the economic case. The question is how long the gap lasts before the next price drop.
