Command A

Cohere · March 2025

Active · Open Weight · Mixture of Experts · Text · API Available
Parameters: 111B (23B active)
Context Window: 256K tokens

Why It Matters

Demonstrated that an enterprise-focused model could be both highly capable and efficient: it is deployable on just two GPUs and delivers 150% of the throughput of its predecessor.

Description

Next-generation Cohere model that succeeded the Command R series. It uses a mixture-of-experts architecture in which only 23B of the model's 111B parameters are active at once, which lowers the compute needed for each input. It has a 256K-token context window and natively supports 23 languages.
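To make the parameter figures concrete, the short sketch below computes what share of the model is active at once, taking the 111B and 23B numbers on this card at face value. The function and constant names are illustrative and do not come from any Cohere tooling.

```python
# Figures as stated on this card, in billions of parameters.
TOTAL_PARAMS_B = 111
ACTIVE_PARAMS_B = 23


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that are active at once."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"{fraction:.1%} of parameters active at once")  # prints "20.7% of parameters active at once"
```

By these figures, roughly one parameter in five takes part in any single forward pass.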

Notable Milestones

  • Agentic enterprise workflows
  • Multilingual RAG across 23 languages
  • Deployable on just 2 GPUs

Key Innovations

  • Tool Use: Ability to call external tools, APIs, and functions, enabling web browsing, code execution, and real-world actions.
  • Agentic: Models that can autonomously plan, execute multi-step tasks, use tools, and self-correct without human intervention.
  • MoE: Architecture where only a fraction of the model's parameters are active for each input, allowing massive scale with lower compute.
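The MoE idea above can be illustrated with a generic top-k routing sketch. This is a toy example of the general technique, not Cohere's implementation: the number of experts, the value of k, the router scores, and the scalar "experts" are all hypothetical and chosen only to keep the example small.

```python
import math
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[float], float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over the router scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and renormalize their weights to sum to 1."""
    if not 1 <= k <= len(router_scores):
        raise ValueError("k must be between 1 and the number of experts")
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return [(i, probs[i] / mass) for i in top]


def moe_forward(x: float, router_scores: Sequence[float],
                experts: Sequence[Expert], k: int = 2) -> float:
    """Run only the selected experts and combine their outputs by routing weight."""
    chosen = route_top_k(router_scores, k)
    return sum(weight * experts[i](x) for i, weight in chosen)


# Four hypothetical experts; only two of them run for this input.
experts: List[Expert] = [
    lambda x: x + 1.0,
    lambda x: 2.0 * x,
    lambda x: x * x,
    lambda x: -x,
]
scores = [0.1, 2.0, 1.5, -1.0]  # hypothetical router scores for one input

chosen = route_top_k(scores, k=2)
output = moe_forward(3.0, scores, experts, k=2)
print([i for i, _ in chosen], round(output, 4))
```

Here experts 1 and 2 are selected and the other two are never evaluated, which is where the compute saving comes from: the full set of experts contributes capacity, but each input pays only for the k that the router picks.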

Family Tree

Lineage

Command → Command R → Command R+ → Command A
