DALL·E
OpenAI · January 2021
Why It Matters
Proved that a single neural network could learn the relationship between text and images well enough to create novel images from descriptions, launching the text-to-image revolution.
Description
OpenAI's first text-to-image model. It combines two techniques: a discrete variational autoencoder, which learns to compress an image into a grid of tokens and reconstruct it, and a transformer (the same architecture behind GPT), which generates those image tokens from a text description. At 12 billion parameters, it showed that language-model techniques could be adapted to generate images.
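The two-stage design described above can be sketched in a few lines of Python. This is a toy illustration, not OpenAI's implementation: the sizes, the `encode_text` tokenizer, and the `next_image_token` function are hypothetical stand-ins, and the "transformer" here simply samples at random where a trained model would predict a distribution conditioned on everything generated so far. The sketch only shows the data flow: text tokens come first, image tokens are appended one at a time, and the finished grid is what the autoencoder's decoder would turn back into pixels.

```python
import random

# Toy sizes chosen for readability (assumptions, far smaller than the real model).
TEXT_LEN = 4      # fixed number of text-token slots
IMAGE_GRID = 2    # image is represented as a 2x2 grid of tokens
IMAGE_VOCAB = 8   # number of distinct image-token ids
TEXT_VOCAB = 16   # number of distinct text-token ids (0 is padding)


def encode_text(prompt):
    """Hypothetical tokenizer: map each word to an id in 1..TEXT_VOCAB-1, pad with 0."""
    ids = [sum(map(ord, word)) % (TEXT_VOCAB - 1) + 1 for word in prompt.lower().split()]
    ids = ids[:TEXT_LEN]
    return ids + [0] * (TEXT_LEN - len(ids))


def next_image_token(sequence, rng):
    """Stand-in for the transformer: a trained model would condition on `sequence`."""
    return rng.randrange(IMAGE_VOCAB)


def generate_image_tokens(prompt, seed=0):
    """Autoregressively extend the text tokens with image tokens, then reshape to a grid."""
    rng = random.Random(seed)
    sequence = encode_text(prompt)
    for _ in range(IMAGE_GRID * IMAGE_GRID):
        sequence.append(next_image_token(sequence, rng))
    image_tokens = sequence[TEXT_LEN:]
    # The autoencoder's decoder would map this grid back to pixels (omitted here).
    return [image_tokens[r * IMAGE_GRID:(r + 1) * IMAGE_GRID] for r in range(IMAGE_GRID)]


grid = generate_image_tokens("an armchair in the shape of an avocado")
```

Because text and image tokens share one sequence, the same next-token objective used for language modeling applies unchanged, which is the sense in which a language-model architecture was adapted to produce images.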
Notable Milestones
- First demonstration of creative AI image generation from text
- Inspired a wave of text-to-image research across the industry
Related Research (2)
Demonstrated that a single model could generate diverse, creative images from arbitrary text descriptions, combining language understanding with image…
Trained a model to understand both images and text by learning which image-text pairs go together from 400 million internet examples. This created a s…