GPT-4
OpenAI · March 2023
Why It Matters
OpenAI's first frontier model that could process both text and images as input. Scored around the 90th percentile on a simulated Uniform Bar Exam and near the 90th percentile on the SAT, demonstrating strong reasoning across many professional and academic domains.
Description
OpenAI's first multimodal model, able to accept both text and images as input. Believed to use a mixture-of-experts architecture (where a router sends each token to a small subset of specialized sub-networks instead of running the whole model), with an unconfirmed estimate of about 1.7 trillion parameters; OpenAI has not published the architecture or model size. Represented a large leap in reasoning, coding ability, and factual accuracy over GPT-3.5.
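To make the mixture-of-experts idea above concrete, here is a minimal sketch of top-k expert routing for a single token. It is an illustration only: GPT-4's actual architecture is unpublished, and every name here (`top_k_route`, `moe_forward`, the toy experts and gate vectors) is hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    top = order[:k]
    # Renormalize over the selected experts only, so the weights sum to 1.
    weights = softmax([gate_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate_vectors, k=2):
    """Apply a toy MoE layer to one token vector x."""
    # One gate logit per expert: dot product of x with that expert's gate vector.
    logits = [sum(g * xi for g, xi in zip(gv, x)) for gv in gate_vectors]
    routes = top_k_route(logits, k)
    out = [0.0] * len(x)
    for idx, weight in routes:
        y = experts[idx](x)  # only the selected experts are evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, routes


# Hypothetical toy experts: each one just scales its input by a fixed factor.
experts = [lambda v, s=s: [s * vi for vi in v] for s in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical gate vectors, chosen by hand for the example.
gate_vectors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

output, routes = moe_forward([0.5, 2.0], experts, gate_vectors, k=2)
print(routes)
print(output)
```

With k=2 and four experts, only half of the expert computations run for this token, which is how sparse routing grows total parameter count without a proportional increase in per-token compute.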
Notable Milestones
- Scored around the 90th percentile on a simulated Uniform Bar Exam
- Scored around the 90th percentile on the SAT
- Powered Bing Chat, Duolingo Max, and Khan Academy's AI tutor Khanmigo
- Among the first models widely adopted for professional legal, medical, and financial tasks
Related Research (5)
Found that model performance follows power laws in compute, parameters, and data. Provided the mathematical framework for scaling decisions.
Introduced sparsely-gated Mixture-of-Experts layers for scaling model capacity without proportional compute increase.
Simplified MoE routing to scale to trillions of parameters efficiently. Influenced Mixtral and, reportedly, the MoE architectures attributed to GPT-4 and GPT-5.
Combined chain-of-thought reasoning with external tool use (APIs, search), improving QA and decision-making through interleaved reasoning and action.
Described GPT-4's multimodal capabilities and performance across professional/academic benchmarks, setting new SOTA on bar exam, MMLU, and many others…
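The power-law relationship described in the first entry above can be sketched as a straight-line fit in log-log space. This is a minimal illustration under stated assumptions: the functional form `loss = a * compute**(-b)` omits any irreducible-loss term, the data points are synthetic, and the constants 5.0 and 0.05 are made up for the example, not taken from any paper.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**(-b) by least squares on (log x, log y); return (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    cov = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    var = sum((u - mean_x) ** 2 for u in lx)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    # log y = log a - b * log x, so a = exp(intercept) and b = -slope.
    return math.exp(intercept), -slope


# Synthetic (invented) compute budgets and losses that follow an exact power law.
compute = [1e18, 1e19, 1e20, 1e21]
loss = [5.0 * c ** -0.05 for c in compute]

a, b = fit_power_law(compute, loss)

# Extrapolate one order of magnitude beyond the fitted range.
predicted = a * (1e22) ** (-b)
print(a, b, predicted)
```

Fitting on small runs and extrapolating to a larger budget is the kind of scaling decision such a framework is meant to support; real data is noisy, so an actual fit would also need error estimates.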