Stable Diffusion generates images from text descriptions

Stable Diffusion

Stable Diffusion is a latent diffusion model developed by researchers from the CompVis group at LMU Munich and Runway, with compute support from Stability AI. Its weights were released publicly in 2022, and the model can run on consumer hardware with a modest GPU. That openness sets it apart from proprietary text-to-image systems such as DALL-E and Midjourney, which are available only as hosted services.

Example

A user enters the prompt "a sunset over the mountains" and receives a generated image of a sunset with mountains in the background.

Generating images directly from text prompts can speed up creative work such as concept art, illustration, and design prototyping.

Related concepts

Contrastive Language–Image Pre-training

CLIP embeds images and text into a shared space using contrastive learning
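To make the contrastive objective concrete, here is a minimal pure-Python sketch of a symmetric contrastive loss over a batch of paired image and text embeddings. The function names, the toy 2-D embeddings, and the temperature of 0.07 are illustrative assumptions, not CLIP's actual code or trained values.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy(logits, target_index):
    """Cross-entropy of a softmax over `logits` against one correct index."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; pair i of images matches pair i of texts."""
    images = [normalize(v) for v in image_embs]
    texts = [normalize(v) for v in text_embs]
    n = len(images)
    # Similarity of every image with every text, sharpened by the temperature.
    logits = [[sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
              for img in images]
    # Each image should pick out its own text (rows) and vice versa (columns).
    image_to_text = sum(cross_entropy(logits[i], i) for i in range(n)) / n
    text_to_image = sum(
        cross_entropy([logits[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return (image_to_text + text_to_image) / 2

# Toy embeddings (hypothetical): aligned pairs versus swapped pairs.
image_embs = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = contrastive_loss(image_embs, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = contrastive_loss(image_embs, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is small when each image is most similar to its own caption and large when the pairs are swapped, which is the signal that pulls matching pairs together in the shared space.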

Large language model

LLMs can generate, summarize, translate, and analyze text in many contexts

Langevin dynamics

Langevin dynamics adds noise to gradient descent to sample from a distribution
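As a rough illustration, the sketch below runs the unadjusted Langevin update x ← x − ε∇U(x) + √(2ε)·ξ on a one-dimensional Gaussian target, where U is the negative log-density. The target (mean 2, unit variance), the step size, and all function names are assumptions chosen for the example.

```python
import math
import random

def langevin_samples(grad_neg_log_density, x0, step_size, n_steps, rng):
    """Gradient step on the negative log-density plus Gaussian noise each step."""
    x = x0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - step_size * grad_neg_log_density(x) + math.sqrt(2.0 * step_size) * noise
        samples.append(x)
    return samples

def grad_u(x, mean=2.0):
    """Gradient of U(x) = (x - mean)^2 / 2, i.e. a unit-variance Gaussian target."""
    return x - mean

rng = random.Random(0)
samples = langevin_samples(grad_u, x0=-5.0, step_size=0.01, n_steps=200_000, rng=rng)

# Discard the early steps, which still remember the starting point.
kept = samples[20_000:]
sample_mean = sum(kept) / len(kept)
sample_var = sum((s - sample_mean) ** 2 for s in kept) / len(kept)
```

Without the noise term the iterate would simply converge to the mode at 2; with it, the iterates spread out and their mean and variance approximate those of the target distribution.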

DDIM

DDIM accelerates image generation with a deterministic sampler that visits only a subset of the denoising steps
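A minimal sketch of the deterministic DDIM update on a scalar, assuming a linear beta schedule with 1000 training steps. The `oracle` noise predictor is a hypothetical stand-in for a trained network: it knows that the data is the single point 3.0, so the example only shows the mechanics of the update and the step skipping, not real image generation.

```python
import math

def make_alpha_bars(n_train_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars = []
    running = 1.0
    for i in range(n_train_steps):
        beta = beta_start + (beta_end - beta_start) * i / (n_train_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def ddim_sample(x_T, predict_noise, alpha_bars, n_sample_steps):
    """Deterministic DDIM sampling (no added noise) over a strided timestep subset."""
    n_train = len(alpha_bars)
    stride = n_train // n_sample_steps
    timesteps = list(range(n_train - 1, -1, -stride))
    x = x_T
    for i, t in enumerate(timesteps):
        a_t = alpha_bars[t]
        # After the last visited timestep, jump to the clean sample (alpha_bar = 1).
        a_prev = alpha_bars[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        eps = predict_noise(x, t)
        # Estimate the clean sample, then re-noise it to the earlier timestep.
        x0_pred = (x - math.sqrt(1.0 - a_t) * eps) / math.sqrt(a_t)
        x = math.sqrt(a_prev) * x0_pred + math.sqrt(1.0 - a_prev) * eps
    return x

alpha_bars = make_alpha_bars()
DATA_POINT = 3.0  # hypothetical one-point "dataset"

def oracle(x, t):
    """Exact noise for a dataset containing only DATA_POINT (stands in for a model)."""
    a_t = alpha_bars[t]
    return (x - math.sqrt(a_t) * DATA_POINT) / math.sqrt(1.0 - a_t)

few_steps = ddim_sample(1.5, oracle, alpha_bars, n_sample_steps=20)
many_steps = ddim_sample(1.5, oracle, alpha_bars, n_sample_steps=100)
repeat = ddim_sample(1.5, oracle, alpha_bars, n_sample_steps=20)
```

Because no random noise is injected, the same starting value always yields the same output, and here 20 steps reach the same result as 100. With a real, imperfect network, fewer steps generally trade some quality for speed.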

Gradient accumulation

Gradient accumulation simulates a larger batch size without extra memory by splitting the batch into smaller micro-batches, processing them one at a time, and accumulating their gradients before updating the model weights
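The idea can be checked on a tiny example. The sketch below, which assumes a one-parameter linear model with a mean-squared-error loss and made-up data, accumulates gradients over micro-batches of two samples and compares the result with the gradient of the full batch of eight.

```python
def mse_gradient(w, xs, ys):
    """Gradient of mean((w * x - y)^2) with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_gradient(w, xs, ys, micro_batch_size):
    """Sum micro-batch gradients, each weighted by its share of the full batch."""
    n = len(xs)
    grad = 0.0
    for start in range(0, n, micro_batch_size):
        micro_xs = xs[start:start + micro_batch_size]
        micro_ys = ys[start:start + micro_batch_size]
        # Only this micro-batch needs to be in memory at a time.
        grad += mse_gradient(w, micro_xs, micro_ys) * len(micro_xs) / n
    return grad

# Toy data (hypothetical): y = 2x + 1 sampled at eight points.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w = 0.3

full = mse_gradient(w, xs, ys)
accumulated = accumulated_gradient(w, xs, ys, micro_batch_size=2)

# A single weight update is applied only after all micro-batches are processed.
w_after_update = w - 0.01 * accumulated
```

The two gradients agree up to floating-point rounding, so the update matches what a single large batch would have produced.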

Weight tying

Weight tying reduces the number of parameters in a language model by sharing one matrix between the input embedding and the output projection
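A small sketch of the sharing, assuming a hypothetical `TiedLanguageModelHead` class with a vocabulary of 1000 tokens and a hidden size of 16. The same matrix is used to look up input embeddings and to score output tokens.

```python
import random

class TiedLanguageModelHead:
    """One matrix serves as both the embedding table and the output projection."""

    def __init__(self, vocab_size, hidden_size, rng):
        self.embedding = [
            [rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
            for _ in range(vocab_size)
        ]

    def embed(self, token_id):
        """Input side: look up the row for a token."""
        return self.embedding[token_id]

    def logits(self, hidden_state):
        """Output side: score every token by a dot product with the same rows."""
        return [sum(w * h for w, h in zip(row, hidden_state)) for row in self.embedding]

    def parameter_count(self):
        return len(self.embedding) * len(self.embedding[0])

vocab_size, hidden_size = 1000, 16
model = TiedLanguageModelHead(vocab_size, hidden_size, random.Random(0))

tied_params = model.parameter_count()          # one shared matrix
untied_params = 2 * vocab_size * hidden_size   # separate input and output matrices
scores = model.logits(model.embed(42))
```

With these sizes the tied version stores 16,000 parameters where separate matrices would need 32,000. The saving grows with vocabulary size, which is why tying matters most for models with large vocabularies.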
