DALL·E 3
OpenAI · October 2023
legacy · Closed · diffusion · image
Why It Matters
First major image generator deeply integrated into a conversational AI, making image creation as easy as describing what you want in a chat.
Description
Built directly into ChatGPT, allowing users to generate and refine images through natural conversation. Compared with earlier image generators, it follows complex prompts far more accurately and renders readable text within images, previously a major weakness.
Notable Milestones
- Integrated into ChatGPT for conversational image creation
- First image model to reliably render text in images
Key Innovations
Diffusion
Generates outputs by gradually denoising random noise into coherent images or audio. The backbone of Stable Diffusion and recent DALL·E models.
Text-to-Image
Generating images from text descriptions: the technology behind DALL·E, Midjourney, and Stable Diffusion.
Multimodal
Processing multiple types of input (text, images, audio, video) in a single model.
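To make the "gradually denoising random noise" idea behind diffusion concrete, here is a minimal toy sketch in Python. It is not DALL·E 3's architecture or code. It assumes a four-value "image", a linear noise schedule, and a hypothetical `oracle_noise` function that stands in for the trained neural network a real model would use to predict noise. The loop is a deterministic, DDIM-style reverse process; all names and parameters are illustrative.

```python
import math
import random


def make_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fractions (alpha-bar) for a linear noise schedule."""
    alpha_bars, running = [], 1.0
    for i in range(steps):
        beta = beta_start + (beta_end - beta_start) * i / max(steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def oracle_noise(x_t, alpha_bar, target):
    """Hypothetical stand-in for a trained network.

    A real diffusion model learns to predict the noise from data; this toy
    version cheats by computing it exactly from a known target.
    """
    return [
        (x - math.sqrt(alpha_bar) * t) / math.sqrt(1.0 - alpha_bar)
        for x, t in zip(x_t, target)
    ]


def sample(target, steps=1000, seed=0):
    """Run the reverse (denoising) process from Gaussian noise."""
    rng = random.Random(seed)
    alpha_bars = make_schedule(steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from Gaussian noise
    for i in reversed(range(steps)):
        a_t = alpha_bars[i]
        a_prev = alpha_bars[i - 1] if i > 0 else 1.0  # no noise left at the end
        eps = oracle_noise(x, a_t, target)
        # Estimate the clean signal implied by the predicted noise.
        x0_hat = [
            (xi - math.sqrt(1.0 - a_t) * e) / math.sqrt(a_t)
            for xi, e in zip(x, eps)
        ]
        # Step to the slightly less noisy level (deterministic, DDIM-style).
        x = [
            math.sqrt(a_prev) * x0 + math.sqrt(1.0 - a_prev) * e
            for x0, e in zip(x0_hat, eps)
        ]
    return x


if __name__ == "__main__":
    pixels = [0.2, 0.8, 0.5, 0.1]  # a made-up four-pixel "image"
    print(sample(pixels))
```

Because the oracle knows the target, the loop recovers it almost exactly. In a real text-to-image system the noise predictor is a learned network conditioned on the text prompt, which is what steers the denoising toward an image matching the description.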
Related Research (1)
DALL·E · Diffusion
2021 · OpenAI
Demonstrated that a single model could generate diverse, creative images from arbitrary text descriptions, combining language understanding with image…