Gemini 2.0
Google DeepMind · December 2024
Active · Closed · Mixture of experts · Multimodal · API available
Context Window: 1M tokens
Variants: Flash, Pro
Why It Matters
Shifted Google's AI strategy from answering questions to taking actions. Introduced native tool use, allowing the model to autonomously search the web, execute code, and interact with external services.
Description
Google's model family built for the "agentic era": designed not just to answer questions but to take actions using tools such as Google Search, Maps, and code execution. It features native multimodal output (text, images, and audio). The Flash variant runs at twice the speed of Gemini 1.5 Pro while outperforming it on key benchmarks.
Notable Milestones
- Powers Google's Deep Research feature for autonomous research
- Introduced the Multimodal Live API for real-time audio/video streaming
- Flash variant became the default in the Gemini app
Benchmark Scores
MATH (competition-level problems): 89.7%
GPQA (graduate-level science QA): 62.1%
Key Innovations
Agentic: Models that can autonomously plan, execute multi-step tasks, use tools, and self-correct without human intervention.
Tool Use: Ability to call external tools, APIs, and functions, enabling web browsing, code execution, and real-world actions.
Multimodal: Processing multiple types of input (text, images, audio, video) in a single model.
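To make the tool-use idea concrete, here is a minimal sketch of the dispatch step on the application side: the model emits a structured request naming a tool and its arguments, and the host program runs the matching function and returns the result. Everything here is an assumption for illustration, including the `{"tool": ..., "args": {...}}` request shape and the `tool`, `add`, `word_count`, and `dispatch` names; it is not the Gemini API.

```python
import json
from typing import Callable, Dict

# Hypothetical tool registry; names and signatures are illustrative only.
TOOLS: Dict[str, Callable[..., object]] = {}


def tool(fn):
    """Register a plain function as a callable tool under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: float, b: float) -> float:
    """Toy stand-in for a code-execution style tool."""
    return a + b


@tool
def word_count(text: str) -> int:
    """Toy stand-in for a text-processing tool."""
    return len(text.split())


def dispatch(request_json: str) -> str:
    """Run one tool request of the assumed form {"tool": ..., "args": {...}}."""
    request = json.loads(request_json)
    name = request.get("tool")
    if name not in TOOLS:
        # Unknown tools are reported back instead of raising.
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        result = TOOLS[name](**request.get("args", {}))
    except TypeError as exc:
        # Missing or unexpected arguments are reported the same way.
        return json.dumps({"error": str(exc)})
    return json.dumps({"tool": name, "result": result})
```

In an agentic loop, the JSON string returned by `dispatch` would be fed back to the model as the next turn, so it can decide whether to call another tool or answer.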
Related Research (1)
Gemini · Scaling
2023 · Google DeepMind
Introduced the Gemini family, trained to be natively multimodal from the ground up, and reported state-of-the-art results on 30 of the 32 benchmarks evaluated.