StarCoder
BigCode / Hugging Face · May 2023
active · Open Source · decoder only · code
Parameters: 15.5B
Context Window: 8K tokens
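As a rough guide to what the 15.5B parameter count means in practice, the sketch below estimates how much memory the weights alone would occupy at a few common numeric precisions. It assumes memory is simply parameter count times bytes per parameter; activations, the attention cache for the 8K-token context, and framework overhead are not included. The listed precisions and the helper name `weight_memory_gb` are illustrative choices, not anything specified by BigCode.

```python
# Rough weight-memory estimate for a 15.5B-parameter model.
# Assumption: memory = parameter count x bytes per parameter.
# Activations, attention cache, and framework overhead are NOT included.

PARAMS = 15.5e9  # parameter count listed on this card

# Bytes per parameter for common storage formats (illustrative choices).
BYTES_PER_PARAM = {
    "float32": 4.0,
    "float16/bfloat16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Return the approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{fmt:>18}: ~{weight_memory_gb(PARAMS, size):.1f} GB")
```

Under these assumptions, half-precision weights come to about 31 GB, so actual hardware requirements will be somewhat higher once runtime overhead is added.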
Why It Matters
One of the first major open-source code models trained transparently on a curated dataset of permissively licensed code (The Stack), showing that code AI could be built responsibly.
Description
StarCoder was trained on The Stack, a carefully curated dataset of permissively licensed code. Built by BigCode, a collaboration between Hugging Face and ServiceNow, it set new standards for responsible AI development by letting developers check whether their code was in the training data and opt out.
Key Innovations
Code Gen: Ability to write, debug, and understand programming code across multiple languages.
Open Weight: Model weights are publicly released but training data/code may not be. Enables fine-tuning but not full reproduction.