Gemma

Google DeepMind · February 2024

● activeOpen Sourcedecoder onlytext

Parameters2B / 7B

Context Window8K tokens

Variants2B, 7B

Why It Matters

Google's first serious open-weight competitor to Meta's Llama. Demonstrated that a major lab could release small, efficient models that outperformed many larger open alternatives.

Description

Google's first open-weight model family, built using the same research and technology behind the proprietary Gemini models but released for anyone to download, modify, and deploy. Available in 2B and 7B parameter sizes — small enough to run on a laptop or single GPU. Designed to make frontier AI research accessible to the broader developer community.

Notable Milestones

▸Widely adopted on Hugging Face for fine-tuning and research
▸Ran efficiently on consumer hardware including laptops

Key Innovations

Open Weight

Open WeightModel weights are publicly released but training data/code may not be. Enables fine-tuning but not full reproduction.

Distillation

DistillationTraining a smaller 'student' model to mimic a larger 'teacher' model, preserving capability at lower cost.

Family Tree

Successors (2)

Gemma 2 CodeGemma

Related Research (1)

Grouped-Query AttentionArchitecture

2023 · Google Research

Introduced grouped-query attention as a middle ground between multi-head and multi-query attention, reducing KV cache memory while maintaining quality…

External Links

Research Paper Announcement

More from Google Gemma

Gemma 22024-06 · 2B / 9B / 27B

Gemma 32025-03 · 1B / 4B / 12B / 27B

Gemma 42026-04 · 31B (A4B / E4B / E2B)

CodeGemma2024-04 · 2B / 7B

NextCodeGemma