Imagen 2

Google DeepMind · December 2023

legacy · Closed · diffusion · image

Description

Google DeepMind's text-to-image model with significantly improved photorealism and the ability to render readable text within images. Integrated into Google's Gemini chatbot and available to developers through Google Cloud's Vertex AI platform.

Key Innovations

Diffusion

Generates outputs by gradually denoising random noise into coherent images or audio. The backbone of Stable Diffusion and DALL·E 2.
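To make "gradually denoising random noise" concrete, here is a minimal sketch of a DDPM-style reverse sampling loop in plain Python. It is not Imagen 2's sampler, whose details are not public. The linear noise schedule, the step count, and the function names (`make_schedule`, `toy_noise_predictor`, `sample`) are illustrative assumptions. The noise predictor is an analytic stand-in that is only correct when every training example equals `target`; a real model replaces it with a trained neural network.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (an assumed, commonly used default)."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a  # cumulative product of alphas
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def toy_noise_predictor(x_t, alpha_bar_t, target):
    """Hypothetical stand-in for the learned network.

    If the data distribution is a single point `target`, the forward process
    gives x_t = sqrt(alpha_bar_t) * target + sqrt(1 - alpha_bar_t) * eps,
    so the noise can be recovered exactly by solving for eps.
    """
    scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [(x - scale * c) / noise_scale for x, c in zip(x_t, target)]


def sample(target, num_steps=50, seed=0):
    """Start from Gaussian noise and apply the reverse update step by step."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise
    for t in reversed(range(num_steps)):
        eps_hat = toy_noise_predictor(x, alpha_bars[t], target)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * e) / math.sqrt(alphas[t])
                for xi, e in zip(x, eps_hat)]
        if t > 0:
            # Re-inject a small amount of noise on every step but the last.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x
```

Calling `sample([0.2, 0.8, 0.5, 0.1])` starts from four random values and ends at the four "pixel" values of the target, because the stand-in predictor knows the data exactly. With a trained network the same loop produces new samples from the learned distribution rather than a fixed target.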

Text-to-Image

Generating images from text descriptions, the technology behind DALL·E, Midjourney, and Stable Diffusion.

Family Tree

Successors (1)

Imagen 3

Related Research (1)

Imagen · Diffusion

2022 · Google Brain

Demonstrated that large frozen text encoders (T5-XXL) with cascaded diffusion models produce photorealistic images, outperforming DALL·E 2.
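The structure described for the 2022 Imagen work, a frozen text encoder feeding a cascade of diffusion stages, can be sketched as plain data flow. This is only a structural illustration under loud assumptions: every function below is a hypothetical stand-in (the "encoder" is a character-bucket sum, the "base model" fills a grid deterministically, and "super-resolution" is nearest-neighbour upsampling), and nothing here is claimed about Imagen 2's internals. The default resolutions follow the 64, 256, 1024 pixel cascade reported for the original Imagen.

```python
def frozen_text_encoder(prompt, dim=8):
    """Stand-in for a frozen encoder such as T5-XXL (hypothetical).

    "Frozen" means its weights are not updated while the image models train;
    here it is just a fixed, deterministic function of the prompt.
    """
    vec = [0.0] * dim
    for i, ch in enumerate(prompt):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


def base_stage(text_emb, size):
    """Stand-in for the low-resolution, text-conditioned base diffusion model."""
    bias = sum(text_emb) / len(text_emb)
    return [[(bias + (r * size + c) / (size * size)) % 1.0
             for c in range(size)]
            for r in range(size)]


def super_resolution_stage(image, text_emb, out_size):
    """Stand-in for a text-conditioned super-resolution diffusion model.

    A real stage would denoise at the higher resolution, conditioned on both
    the low-resolution image and the text embedding; this toy version only
    upsamples by nearest neighbour and ignores text_emb.
    """
    scale = out_size // len(image)
    return [[image[r // scale][c // scale] for c in range(out_size)]
            for r in range(out_size)]


def cascade(prompt, resolutions=(64, 256, 1024)):
    """Encode the prompt once, then run base and super-resolution stages."""
    text_emb = frozen_text_encoder(prompt)  # computed once, shared by all stages
    image = base_stage(text_emb, resolutions[0])
    for size in resolutions[1:]:
        image = super_resolution_stage(image, text_emb, size)
    return image
```

The point of the sketch is the shape of the pipeline: the text embedding is computed once and passed to every stage, and each later stage only has to add detail on top of the previous stage's output. For a quick check, `cascade("a corgi", resolutions=(4, 16, 64))` returns a 64 by 64 grid.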

External Links

Announcement

More from Google Gemini

Gemini 1.0 · 2023-12

Gemini 1.5 Pro · 2024-02

Gemini 2.0 · 2024-12

Gemini 2.5 Pro · 2025-03

Gemini 3.1 Pro · 2026-02 · ~1T (MoE)

Gemini 3.5 Flash · 2026-05

Gemini 3.5 Pro · 2026-05

Imagen 3 · 2024-06

Gemini 2.0 Flash · 2024-12

Veo 2 · 2024-12
