PaLM 2
Google DeepMind · May 2023
Why It Matters
Made the case that training efficiency matters as much as raw model size. Powered more than 25 Google products and features at launch and served as the base for specialized variants such as Med-PaLM 2 (medical) and Sec-PaLM (cybersecurity).
Description
Google's successor to PaLM, designed to be more capable without simply being larger. It used 'compute-optimal' training, a strategy in line with DeepMind's Chinchilla research: for a fixed compute budget, train a smaller model on more data. It powered Google's Bard chatbot (now Gemini) and came in four sizes: Gecko (small enough for mobile devices), Otter, Bison, and Unicorn (largest). Compared with PaLM, it improved markedly on multilingual, reasoning, and coding tasks.
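The compute-optimal idea can be illustrated with a small calculation. This is a minimal sketch and not PaLM 2's actual recipe: it assumes the common approximation that training cost is about 6 FLOPs per parameter per token, and the roughly 20-tokens-per-parameter rule of thumb often associated with Chinchilla. The function name `compute_optimal_split` and both constants are illustrative assumptions.

```python
import math

# Assumption: training cost C is approximately 6 * N * D FLOPs,
# where N is the parameter count and D is the number of training tokens.
FLOPS_PER_PARAM_TOKEN = 6.0
# Assumption: rule-of-thumb ratio of training tokens to parameters.
TOKENS_PER_PARAM = 20.0


def compute_optimal_split(flops_budget: float) -> tuple[float, float]:
    """Return (params, tokens) that spend flops_budget with D = 20 * N.

    Hypothetical helper for illustration only.
    """
    if flops_budget <= 0:
        raise ValueError("flops_budget must be positive")
    # C = 6 * N * D and D = r * N  =>  N = sqrt(C / (6 * r))
    params = math.sqrt(flops_budget / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    tokens = TOKENS_PER_PARAM * params
    return params, tokens


if __name__ == "__main__":
    params, tokens = compute_optimal_split(5.88e23)
    print(f"params ~ {params:.3g}, tokens ~ {tokens:.3g}")
```

Under these assumptions, a budget of 5.88e23 FLOPs works out to about 70 billion parameters and 1.4 trillion tokens, which matches the published Chinchilla configuration. Google has not disclosed the corresponding figures for PaLM 2, so the numbers illustrate the trade-off only.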
Notable Milestones
- Powered Google's Bard chatbot
- Med-PaLM 2 achieved expert-level medical question answering
- Available in four sizes down to mobile-friendly Gecko
Key Innovations
Related Research
Showed that prompting models to "think step-by-step" improves arithmetic, logic, and commonsense reasoning in large models like PaLM.
Showed that using SwiGLU (a Gated Linear Unit with the Swish activation) in the Transformer feed-forward network (FFN) improves model quality at comparable compute.
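To make the SwiGLU description concrete, here is a minimal pure-Python sketch of a bias-free SwiGLU feed-forward layer, computing (Swish(xW) ⊙ xV)W2. The function names and the weight names `w_gate`, `w_up`, and `w_down` are illustrative assumptions, not identifiers from any PaLM codebase, and a real implementation would use a tensor library rather than nested lists.

```python
import math


def swish(z: float) -> float:
    """Swish (SiLU) with beta = 1: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def matvec(x, w):
    """Multiply row vector x (length n) by matrix w (n rows, m columns)."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """Bias-free SwiGLU feed-forward layer (illustrative sketch).

    gate   = Swish(x @ w_gate)  # nonlinear branch
    up     = x @ w_up           # linear branch
    hidden = gate * up          # element-wise gating
    output = hidden @ w_down    # project back to model width
    """
    gate = [swish(g) for g in matvec(x, w_gate)]
    up = matvec(x, w_up)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(hidden, w_down)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(swiglu_ffn([1.0, 2.0], identity, identity, [[1.0], [1.0]]))
```

The gated layer has three weight matrices where a standard FFN has two. To keep parameter count and compute roughly constant, the hidden width is typically reduced to about two thirds of what the standard FFN would use.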