DALL·E 3

OpenAI · October 2023

◌ legacy · Closed · diffusion · image

Why It Matters

First major image generator deeply integrated into a conversational AI, making image creation as easy as describing what you want in a chat.

Description

Built directly into ChatGPT, allowing users to generate and refine images through natural conversation. It follows complex prompts much more accurately than its predecessors and renders readable text within images, a major weakness of earlier image generators.

Notable Milestones

▸ Integrated into ChatGPT for conversational image creation
▸ Among the first image models to reliably render text in images

Key Innovations

Diffusion

Generates outputs by gradually denoising random noise into coherent images or audio. The backbone of Stable Diffusion and of DALL·E 2 and later.
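The denoising idea described above can be illustrated with a toy Python sketch. It is not DALL·E 3's implementation, only a minimal deterministic sampling loop under stated assumptions: the "image" is a short list of numbers, the noise schedule is a simple linear one chosen for illustration, and the trained neural network is replaced by an oracle (`oracle_noise_estimate`) that already knows the target. All function names here are hypothetical.

```python
import math
import random


def alpha_bar(t, steps):
    """Share of clean signal kept at step t: 1.0 at t=0, 0.01 at t=steps.

    A linear schedule picked for illustration only (an assumption).
    """
    return 1.0 - 0.99 * t / steps


def oracle_noise_estimate(x_t, target, ab):
    """Stand-in for the trained network (an assumption of this sketch).

    A real model predicts the noise from x_t (and the text prompt) alone;
    this oracle cheats by looking at the target directly.
    """
    return [(x - math.sqrt(ab) * c) / math.sqrt(1.0 - ab)
            for x, c in zip(x_t, target)]


def sample(target, steps=50, seed=0):
    """Start from random noise and denoise it step by step."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # starting point: pure noise
    for t in range(steps, 0, -1):
        ab_t = alpha_bar(t, steps)
        ab_prev = alpha_bar(t - 1, steps)
        eps = oracle_noise_estimate(x, target, ab_t)
        # Estimate the clean signal implied by the predicted noise.
        x0_hat = [(xi - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t)
                  for xi, e in zip(x, eps)]
        # Re-mix signal and noise at the next, less noisy level.
        x = [math.sqrt(ab_prev) * c + math.sqrt(1.0 - ab_prev) * e
             for c, e in zip(x0_hat, eps)]
    return x
```

Calling `sample([0.0, 0.5, 1.0, 0.5])` should return values very close to the target list, because the oracle's noise estimate is exact. In a real system the estimate comes from a learned network conditioned on the prompt, so the result is a plausible new image rather than a copy of a known one.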

Text-to-Image

Generating images from text descriptions: the technology behind DALL·E, Midjourney, and Stable Diffusion.

Multimodal

Processing multiple types of input (text, images, audio, video) in a single model.

Related Research (1)

DALL·E · Diffusion

2021 · OpenAI

Demonstrated that a single model could generate diverse, creative images from arbitrary text descriptions, combining language understanding with image…