GPT-3
OpenAI · June 2020
Why It Matters
The model that launched the modern AI era. Its 175 billion parameters showed that scaling up language models dramatically improves capabilities, enabling surprisingly human-like text generation and few-shot learning.
Description
With 175 billion parameters — over 100× larger than GPT-2 — this model demonstrated 'few-shot learning': the ability to perform new tasks from just a few examples in the prompt, without any retraining. This was the model that launched the modern AI era and proved that scale alone could unlock remarkable capabilities.
Notable Milestones
- ▸Powered the first wave of AI writing assistants
- ▸Spawned hundreds of startups built on the GPT-3 API
- ▸Demonstrated few-shot learning — performing tasks from a handful of examples
Key Innovations
Related Research (3)
Introduced the Transformer architecture using self-attention mechanisms, replacing RNNs entirely. Enabled parallel training and superior long-range de…
175B-parameter GPT. Pioneered few-shot and in-context learning, dramatically reducing the need for fine-tuning.
Found that model performance follows power laws in compute, parameters, and data. Provided the mathematical framework for scaling decisions.