
Stable Diffusion generates images from text descriptions
The model was developed by researchers from the CompVis Group at LMU Munich and Runway, with computational support from Stability AI. This collaboration resulted in a publicly accessible model that can run on consumer hardware with modest GPU capabilities. This accessibility marks a significant advancement over previous proprietary models like DALL-E and Midjourney.
Example
A user inputs "a sunset over the mountains" and receives an image of a beautiful sunset scene with mountains in the background.
Generating images directly from text lets artists, designers, and developers produce draft visuals from a written description rather than creating each one by hand.
Contrastive Language–Image Pre-training
CLIP embeds images and text into a shared space using contrastive learning
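The contrastive objective can be illustrated with a small NumPy sketch. It is not CLIP's actual training code: the embeddings here are random vectors standing in for encoder outputs, `clip_style_loss` is a hypothetical helper name, and the temperature of 0.07 is an assumed value chosen for illustration.

```python
import numpy as np

def log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss; row i of each array is a matched pair."""
    # L2-normalise so dot products are cosine similarities.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = image_emb @ text_emb.T / temperature  # shape (batch, batch)
    idx = np.arange(len(logits))
    # Each image should pick its own text (rows) and vice versa (columns).
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return (loss_i2t + loss_t2i) / 2

rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))  # stand-in for text encoder outputs
# "Images" that nearly coincide with their captions in the shared space.
aligned = clip_style_loss(text + 0.01 * rng.normal(size=(4, 8)), text)
# The same embeddings paired with the wrong captions.
shuffled = clip_style_loss(text[::-1].copy(), text)
```

Matched pairs that sit close together in the shared space give a low loss, while mismatched pairs give a high one, which is the signal that pulls the two encoders into alignment during training.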
Large language model
LLMs can generate, summarize, translate, and analyze text in many contexts
Langevin dynamics
Langevin dynamics samples from a distribution by taking gradient steps on its log-density and adding Gaussian noise at each step
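As a minimal sketch of the idea, the unadjusted Langevin update x ← x + ε ∇log p(x) + √(2ε) z can be run against a one-dimensional Gaussian target. The target parameters, step size, and function names below are assumptions made for the example.

```python
import numpy as np

def langevin_sample(score, x0, step=1e-2, n_steps=2000, rng=None):
    """Unadjusted Langevin dynamics; `score` returns grad log p(x)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.normal(size=x.shape)
        # Gradient step on the log-density plus injected Gaussian noise.
        x = x + step * score(x) + np.sqrt(2 * step) * noise
    return x

MU, SIGMA = 2.0, 0.5  # assumed target: N(2, 0.5^2)

def gaussian_score(x):
    # grad log p(x) for a Gaussian with mean MU and std SIGMA.
    return -(x - MU) / SIGMA**2

# Run 5000 independent chains in parallel, all starting at 0.
samples = langevin_sample(
    gaussian_score, x0=np.zeros(5000), rng=np.random.default_rng(0)
)
```

Without the noise term the chains would all collapse onto the mode at `MU`; with it, their spread settles near `SIGMA`, up to a small bias from the finite step size.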
DDIM (Denoising Diffusion Implicit Models)
DDIM speeds up image generation by replacing the random reverse-diffusion steps with a deterministic update, which lets sampling skip most timesteps
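The deterministic update can be sketched as follows. This is a toy illustration under stated assumptions: the linear beta schedule is an assumed choice, and `oracle_eps` is a hypothetical stand-in for a trained noise-prediction network that happens to know the clean target exactly.

```python
import numpy as np

def ddim_sample(eps_model, x_T, alpha_bar, timesteps):
    """Deterministic DDIM sampling (eta = 0) over a decreasing timestep list."""
    x = x_T
    for i, t in enumerate(timesteps):
        a_t = alpha_bar[t]
        # After the last listed timestep, jump to the clean sample (alpha_bar = 1).
        a_prev = alpha_bar[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        eps = eps_model(x, t)
        # Estimate the clean sample implied by the predicted noise.
        x0_pred = (x - np.sqrt(1 - a_t) * eps) / np.sqrt(a_t)
        # Re-noise that estimate to the earlier timestep; no fresh randomness.
        x = np.sqrt(a_prev) * x0_pred + np.sqrt(1 - a_prev) * eps
    return x

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # assumed linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)

target = np.array([1.0, -2.0, 0.5])  # toy "image"

def oracle_eps(x_t, t):
    # Hypothetical perfect noise predictor for the toy target.
    return (x_t - np.sqrt(alpha_bar[t]) * target) / np.sqrt(1 - alpha_bar[t])

x_T = np.random.default_rng(0).normal(size=3)
timesteps = list(range(T - 1, -1, -50))  # 20 of the 1000 timesteps
x_a = ddim_sample(oracle_eps, x_T, alpha_bar, timesteps)
x_b = ddim_sample(oracle_eps, x_T, alpha_bar, timesteps)
```

Because no noise is drawn inside the loop, the same starting latent `x_T` always maps to the same output, and the timestep list can be a sparse subsequence of the training schedule.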
Gradient accumulation
Gradient accumulation simulates a larger batch size without extra memory: it splits a large batch into smaller micro-batches, accumulates their gradients, and updates the model weights once
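A small NumPy sketch, assuming a linear model with mean squared error, shows why this works: the accumulated micro-batch gradients equal the gradient of the full batch. The function names and sizes are illustrative.

```python
import numpy as np

def mse_grad(w, X, y):
    """Gradient of 0.5 * mean((X @ w - y)**2) with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def accumulated_grad(w, X, y, micro_batch_size):
    """Full-batch mean gradient, computed one micro-batch at a time."""
    total = np.zeros_like(w)
    for start in range(0, len(y), micro_batch_size):
        xb = X[start:start + micro_batch_size]
        yb = y[start:start + micro_batch_size]
        # Weight each micro-batch by its share of the full batch so the
        # running sum equals the full-batch mean gradient.
        total += mse_grad(w, xb, yb) * (len(yb) / len(y))
    return total

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
y = rng.normal(size=64)
w = rng.normal(size=3)

full = mse_grad(w, X, y)                               # needs all 64 rows at once
accum = accumulated_grad(w, X, y, micro_batch_size=8)  # needs 8 rows at a time
```

Only one micro-batch has to be held in memory at a time, and a single weight update is applied after the loop using `accum`.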
Weight tying in language models
Weight tying reduces the parameter count by sharing a single matrix between the input embedding and the output projection
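The sharing can be sketched with a toy model that has no layers between the embedding and the output. `TinyTiedLM` and its sizes are hypothetical, chosen only to make the parameter saving visible.

```python
import numpy as np

class TinyTiedLM:
    """Toy model whose output projection reuses the embedding matrix."""

    def __init__(self, vocab_size, d_model, rng):
        # The single shared matrix, shape (vocab_size, d_model).
        self.embedding = rng.normal(scale=0.02, size=(vocab_size, d_model))

    def embed(self, token_ids):
        # Input side: look up one row per token id.
        return self.embedding[token_ids]

    def logits(self, hidden):
        # Output side: project with the transpose of the same matrix.
        return hidden @ self.embedding.T

vocab_size, d_model = 1000, 64
model = TinyTiedLM(vocab_size, d_model, np.random.default_rng(0))

hidden = model.embed(np.array([1, 5, 42]))  # shape (3, 64)
logits = model.logits(hidden)               # shape (3, 1000)

tied_params = model.embedding.size          # one shared matrix
untied_params = 2 * vocab_size * d_model    # separate input and output matrices
```

With these sizes the tied model stores 64,000 embedding-related parameters instead of 128,000.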