Nous Hermes 2

Nous Research · March 2024

Active · Open Source · Mixture of Experts · Text
Parameters: 46.7B (12.9B active)
Context Window: 32K tokens

Why It Matters

Demonstrated that community-built models could match or exceed commercial offerings for specific use cases like code generation and tool use.

Description

An instruction-tuned model from Nous Research, built on Mixtral's Mixture-of-Experts architecture, which uses multiple specialized sub-networks and routes each token to the most relevant ones. It is known for strong function calling (the ability to call external tools and APIs) and for structured output generation without excessive safety filtering.
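To make function calling concrete, here is a minimal sketch of the host-side half of the loop: the model emits a structured tool call, and the application parses, validates, and dispatches it. The JSON shape (`name` plus `arguments`), the `get_weather` tool, and the `TOOLS` registry are all hypothetical and chosen for illustration; they are not taken from Nous Research documentation, and the model's actual prompt and output format may differ.

```python
import json

# Hypothetical tool: a stand-in for a real API call, returning canned data.
def get_weather(city: str) -> dict:
    return {"city": city, "forecast": "sunny"}

# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(model_output: str) -> dict:
    """Parse a JSON tool call emitted by a model and run the matching tool.

    Assumes the model was prompted to answer with an object of the form
    {"name": <tool name>, "arguments": {...}}. Returns either
    {"result": ...} or {"error": ...} so the caller can feed it back.
    """
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError as exc:
        return {"error": f"invalid JSON: {exc.msg}"}
    if not isinstance(call, dict):
        return {"error": "tool call must be a JSON object"}

    name = call.get("name")
    args = call.get("arguments", {})
    if not isinstance(name, str) or name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    if not isinstance(args, dict):
        return {"error": "arguments must be a JSON object"}

    try:
        return {"result": TOOLS[name](**args)}
    except TypeError as exc:
        # Raised when the argument names do not match the tool's signature.
        return {"error": f"bad arguments: {exc}"}

# Assumed example of a well-formed model response.
sample_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
print(dispatch_tool_call(sample_output))
```

Returning errors as data rather than raising lets the application pass the failure back to the model for another attempt, which is a common pattern when structured output is occasionally malformed.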

Key Innovations

Instruction Tuning: Fine-tuning a model on instruction-response pairs so it follows user commands more reliably.
Tool Use: Ability to call external tools, APIs, and functions, enabling web browsing, code execution, and real-world actions.
MoE: Architecture where only a fraction of the model's parameters are active for each input, allowing massive scale with lower compute.
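The MoE idea can be sketched in a few lines of plain Python. The toy below routes one input to the top 2 of 8 experts and mixes their outputs, then uses the figures on this page (46.7B total, 12.9B active) to estimate how the parameters split. The expert functions, router logits, and the assumption that every non-expert parameter is shared and always active are illustrative simplifications, not the model's actual implementation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(router_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs; other experts never run."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))

def split_params(total_b, active_b, n_experts=8, k=2):
    """Estimate (shared, per_expert) parameters in billions.

    Assumes total = shared + n_experts * per_expert
    and    active = shared + k * per_expert.
    """
    per_expert = (total_b - active_b) / (n_experts - k)
    shared = active_b - k * per_expert
    return shared, per_expert

# Toy experts: expert i simply scales its input by (i + 1).
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores for a single input.
router_logits = [0.1, 2.0, -1.0, 0.5, 2.0, 0.0, -0.3, 1.0]

print(top_k_route(router_logits))
print(moe_forward(1.0, experts, router_logits))
print(split_params(46.7, 12.9))
```

Under these assumptions the estimate comes out to roughly 5.6B parameters per expert and about 1.6B shared, which illustrates why only about 12.9B of the 46.7B parameters are used for any single input.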

Family Tree

Lineage

Mistral 7B → Mixtral 8x7B → Nous Hermes 2
