Bark

Suno AI · April 2023

active · Open Weight · decoder only · audio

Why It Matters

First open-source model capable of generating speech with natural emotions, non-verbal sounds, and music, demonstrating that text-to-speech could be far more expressive than robotic narration.

Description

An open-source text-to-speech model from Suno AI that goes beyond plain speech synthesis. It can generate highly realistic speech with emotional inflection, laughter, and sighs, and can also produce background music or sound effects. It supports multiple languages and ships with preset speaker voices, making it one of the most versatile open-source voice generation tools.
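As a rough illustration of how these features are reached in practice, the sketch below composes a text prompt with non-verbal cues and passes it to Bark with a preset voice. It assumes the `bark` Python package from Suno's repository and `scipy` are installed. The bracketed cues, the `♪` song markers, and the `v2/en_speaker_6` preset follow the conventions in the Bark README. `build_prompt` and `synthesize` are hypothetical helpers written for this example, not part of Bark.

```python
# Non-verbal cues as listed in the Bark README. The model treats them as
# hints, so a cue may not always be rendered in the generated audio.
NONVERBAL_CUES = {
    "laughter": "[laughter]",
    "sigh": "[sighs]",
    "music": "[music]",
}


def build_prompt(text, cues=(), sing=False):
    """Compose a Bark text prompt (hypothetical helper, not part of Bark)."""
    # Wrapping text in music notes nudges the model toward singing.
    body = f"♪ {text} ♪" if sing else text
    tags = [NONVERBAL_CUES[c] for c in cues]
    return " ".join([body, *tags])


def synthesize(prompt, speaker="v2/en_speaker_6", out_path="bark_out.wav"):
    """Generate audio with Bark and save it as a WAV file.

    Imports are deferred so the prompt helper works without Bark installed.
    The first call downloads the model weights, which are several gigabytes.
    """
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()
    # history_prompt selects one of the preset speaker voices.
    audio = generate_audio(prompt, history_prompt=speaker)
    write_wav(out_path, SAMPLE_RATE, audio)
    return out_path


if __name__ == "__main__":
    prompt = build_prompt("Well, that went better than expected.", cues=["laughter"])
    print(prompt)
    # Uncomment to run the model (slow on CPU):
    # synthesize(prompt)
```

Keeping prompt construction separate from synthesis makes it cheap to experiment with cue placement before paying for a full generation run.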

Key Innovations

Text-to-Audio
Generating speech, music, or sound effects from text descriptions.
Open Weight
Model weights are publicly released, but training data or code may not be. Enables fine-tuning but not full reproduction.
Transformer
Neural network architecture that uses self-attention to process entire sequences in parallel. Replaced RNNs and enabled massive scaling.
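The self-attention idea in the Transformer entry can be sketched in a few lines. This is a minimal, standard-library illustration under simplifying assumptions, not Bark's implementation: a single head, no learned projection matrices (queries, keys, and values are the input vectors themselves), and a causal mask of the kind a decoder-only model uses. Each output row depends only on the input, so all positions could be computed in parallel. The loop is kept for readability.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, causal=True):
    """Scaled dot-product self-attention over a sequence of vectors.

    Simplification: queries, keys, and values are all the input itself.
    """
    d = len(x[0])
    out = []
    for i, q in enumerate(x):
        # With a causal mask, position i may attend only to positions <= i.
        visible = x[: i + 1] if causal else x
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in visible
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the visible vectors.
        out.append(
            [sum(w * v[j] for w, v in zip(weights, visible)) for j in range(d)]
        )
    return out


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in self_attention(tokens):
        print([round(v, 3) for v in row])
```

With the causal mask on, the first position can attend only to itself, so its output equals its input.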
