Qwen 2
Alibaba Cloud · June 2024
● active · Open Weight · decoder only · text
Parameters: 0.5B - 72B
Context Window: 128K tokens
Variants: 0.5B, 1.5B, 7B, 72B
Why It Matters
Established the Qwen series as a genuine rival to Meta's LLaMA for the open-source LLM crown, with best-in-class multilingual support across 29 languages.
Description
Qwen 2 is a ground-up architecture redesign that improved performance across the board. It is available in sizes from 0.5B to 72B parameters, supports 29 languages, and offers a context window of up to 128K tokens (roughly 96,000 words). On many benchmarks it emerged as a leading open-weight alternative to Meta's LLaMA.
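The "roughly 96,000 words" figure follows from a common rule of thumb of about 0.75 English words per token. The sketch below reproduces that arithmetic. The ratio and the helper name `tokens_to_words` are assumptions for illustration, not measurements from the Qwen 2 tokenizer, and the real ratio varies by language and text type.

```python
# Rule-of-thumb ratio for English text. This is an assumption,
# not a value measured with the Qwen 2 tokenizer.
WORDS_PER_TOKEN = 0.75


def tokens_to_words(num_tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many words fit in a given token budget."""
    return round(num_tokens * words_per_token)


# Reading "128K" as 128,000 tokens gives the figure quoted in the description.
print(tokens_to_words(128_000))

# Reading "128K" as 128 * 1024 tokens gives a slightly larger estimate.
print(tokens_to_words(128 * 1024))
```

Either reading lands close to the quoted figure, so the word count is best treated as an order-of-magnitude guide rather than a hard limit.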
Notable Milestones
- Most widely used Chinese-English bilingual open model
- Base model for numerous community fine-tunes
Key Innovations
Open Weight: Model weights are publicly released, but the training data and code may not be. This enables fine-tuning but not full reproduction.
Long Context: Ability to process very long inputs (100K+ tokens), enabling analysis of entire codebases or books.
Related Research (1)
RoPE · Architecture
2021 · Zhuiyi Technology
Introduced rotary position embeddings that encode position via rotation matrices, enabling better length generalization. Used by virtually every moder…
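To make the rotation idea concrete, here is a minimal pure-Python sketch of rotary position embeddings. It rotates each adjacent pair of vector components by an angle proportional to the token position, then checks that the query-key dot product depends only on the relative offset between positions. The function names, the adjacent-pair layout, and the base of 10000 are illustrative assumptions; they are not taken from Qwen 2's configuration or code.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Rotate each adjacent pair (vec[i], vec[i+1]) by a position-dependent angle.

    `base` is an assumed default for illustration, not a Qwen 2 setting.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, dim, 2):
        # Earlier pairs rotate quickly, later pairs rotate slowly.
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by angle theta.
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.4]

# Both cases have the same relative offset (3) at different absolute positions.
score_near = dot(rope_rotate(q, 5), rope_rotate(k, 2))
score_far = dot(rope_rotate(q, 105), rope_rotate(k, 102))
print(math.isclose(score_near, score_far, abs_tol=1e-9))
```

Because rotations preserve vector length and compose by adding angles, the attention score between two tokens ends up as a function of their distance rather than their absolute positions.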