Claude 3.5 Sonnet
Anthropic · June 2024
● activeCloseddense transformermultimodalAPI Available
Parameters~175B
Context Window200K tokens
Why It Matters
Redefined expectations for mid-tier models by outperforming its own flagship (Opus) at a fraction of the cost. Became the most widely used Claude model and a developer favorite for coding tasks.
Description
A mid-tier model that punched well above its weight — matching or exceeding the more expensive Claude 3 Opus on most benchmarks while running faster and costing less. Quickly became the most popular model among developers for coding and complex tasks. Also introduced 'computer use' — the ability to interact with a computer screen like a human user.
Notable Milestones
- ▸Became the default model in Cursor and other AI coding tools
- ▸First model to offer 'computer use' — controlling a desktop like a human
- ▸Top-ranked on coding benchmarks like SWE-bench
Benchmark Scores
MMLUMassive Multitask Language Understanding — 57 subjects
88.7%HumanEvalCode generation pass@1 — Python problems
92.0%GPQAGraduate-level science QA
59.4%Key Innovations
Code Gen
Code GenAbility to write, debug, and understand programming code across multiple languages.
Agentic
AgenticModels that can autonomously plan, execute multi-step tasks, use tools, and self-correct without human intervention.
Tool Use
Tool UseAbility to call external tools, APIs, and functions — enabling web browsing, code execution, and real-world actions.
Family Tree
Built On
Lineage
Successors (1)
Related Research (1)
Constitutional AIAlignment
2022 · Anthropic
Introduced RL from AI Feedback using "constitutions" (rule sets) for self-supervision, reducing reliance on human labels for harmlessness training.