Phi-1

Microsoft Research · June 2023

● Active · Open Source · Decoder-only · Code

Parameters: 1.3B

Context Window: 2K tokens

Why It Matters

Launched Microsoft's 'small but mighty' research program and showed that training data quality can matter more than model size, a finding that shaped later work on small language models, starting with its successor Phi-2.

Description

A 1.3-billion-parameter model focused on Python code generation, trained on 'textbook-quality' data: roughly 6B tokens of filtered code from the web plus about 1B tokens of synthetic textbooks and exercises generated with GPT-3.5, all selected to teach programming concepts clearly. Despite being far smaller than frontier models such as GPT-4, it reached 50.6% pass@1 on HumanEval and 55.5% on MBPP, suggesting that what a model learns from matters as much as its size.

Notable Milestones

▸ Proved the textbook-quality data approach for code generation
▸ Inspired the 'small language model' movement

Key Innovations

Distillation

Training a smaller 'student' model to mimic a larger 'teacher' model, preserving capability at lower cost.
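To make the student/teacher idea concrete, here is a minimal sketch of the classic logit-matching form of distillation, in which the student is penalized for diverging from the teacher's temperature-softened output distribution. It is a generic illustration, not Phi-1's training recipe (Phi-1 learned from data generated by a larger model rather than from teacher logits); the function names, the example logits, and the default temperature are assumptions chosen for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Scaling by temperature**2 is the usual convention for keeping
    gradient magnitudes comparable across temperatures.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0.0
    )
    return temperature ** 2 * kl


# Hypothetical logits over a three-token vocabulary.
teacher = [4.0, 1.0, 0.5]
far_student = [0.5, 1.0, 4.0]    # disagrees with the teacher
near_student = [3.5, 1.2, 0.6]   # roughly agrees with the teacher

print(distillation_loss(far_student, teacher))
print(distillation_loss(near_student, teacher))
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, so minimizing it pulls the student toward the teacher's behavior.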

Family Tree

Successors (1)

Phi-2

External Links

Research Paper

More from Microsoft Phi

Phi-2 · 2023-12 · 2.7B

Phi-3 · 2024-04 · 3.8B - 14B

Phi-4 · 2025-02 · 14B

MAI-1 · 2024-05 · ~500B

Phi-4 Mini · 2025-02 · 3.8B

Phi-4 Multimodal · 2025-02 · 5.6B
