BLOOM

BigScience · July 2022

Active · Open Source · Decoder-only · Text
Parameters: 176B
Context Window: 2K tokens

Why It Matters

The first truly open, community-built large language model, trained by over 1,000 researchers across 60 countries. It showed that frontier-scale AI did not require Big Tech resources.

Description

BLOOM is a 176B-parameter, decoder-only language model built by the BigScience project, an open collaboration of over 1,000 researchers across 60 countries. It supports 46 natural languages and 13 programming languages, and it showed that frontier AI research could be done through organized collaboration rather than Big Tech resources.

Notable Milestones

  • First 100B+ parameter model with fully open training data and code
  • Supported 46 natural languages and 13 programming languages
  • Trained on the Jean Zay supercomputer in France

Key Innovations

Open Weight
Model weights are publicly released, but the training data and code may not be. This enables fine-tuning but not full reproduction.
Transformer
A neural network architecture that uses self-attention to process entire sequences in parallel. It largely replaced RNNs for language modeling and enabled massive scaling.
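
As an illustration of the self-attention idea described above, here is a minimal sketch of single-head causal (decoder-style) attention in plain Python. It is not BLOOM's implementation: the function names are hypothetical, and it assumes identity projections (queries, keys, and values are the input vectors themselves), where a real model uses learned weight matrices, many heads, and batched matrix operations.

```python
import math


def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(x):
    """Single-head scaled dot-product self-attention with a causal mask.

    x is a list of token vectors of equal length. Assumption of this sketch:
    queries, keys, and values are the inputs themselves (no learned projections).
    """
    d = len(x[0])
    out = []
    for i, q in enumerate(x):
        # Decoder-only: position i may attend to positions 0..i only.
        visible = x[: i + 1]
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in visible
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the visible value vectors.
        out.append(
            [sum(w * v[j] for w, v in zip(weights, visible)) for j in range(d)]
        )
    return out


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = causal_self_attention(tokens)
```

Each output position depends only on the inputs, not on other outputs, so all positions can be computed at once. The loop here is written sequentially for readability; production code expresses the same computation as matrix multiplications.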

External Links