What Langevin dynamics does: adds noise to gradient descent to sample from a distribution

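Langevin dynamics adds Gaussian noise to each gradient descent step on an energy function (the negative log-density), so the iterates sample from the target distribution instead of converging to a single minimum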
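A minimal sketch of the idea, using the unadjusted Langevin update x <- x - step * grad U(x) + sqrt(2 * step) * noise. The 1-D Gaussian target (mean 2.0, standard deviation 0.5), the step size, and the function names are assumptions chosen for illustration only; they are not part of the card.

```python
import math
import random

def grad_energy(x, mean=2.0, std=0.5):
    """Gradient of U(x) = (x - mean)^2 / (2 * std^2), the negative log-density
    of a Gaussian target (hypothetical target, chosen for illustration)."""
    return (x - mean) / (std ** 2)

def langevin_samples(n_steps=50000, step_size=0.01, burn_in=2000, seed=0):
    """Run noisy gradient descent on U and collect the iterates after burn-in."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for k in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        # Gradient descent step plus Gaussian noise scaled by sqrt(2 * step_size)
        x = x - step_size * grad_energy(x) + math.sqrt(2.0 * step_size) * noise
        if k >= burn_in:
            samples.append(x)
    return samples

samples = langevin_samples()
sample_mean = sum(samples) / len(samples)
sample_std = math.sqrt(sum((s - sample_mean) ** 2 for s in samples) / len(samples))
print(sample_mean, sample_std)  # should land close to the target's 2.0 and 0.5
```

Without the noise term the loop would simply converge to the minimum at x = 2.0; with it, the iterates spread out to approximate the whole distribution, up to a small bias from the finite step size.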

Related concepts

Diffusion model

The forward process q(x_t|x_{t-1}) adds a small amount of Gaussian noise at each step, gradually turning data into noise
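As a rough illustration, the sketch below assumes the common variance-preserving form q(x_t|x_{t-1}) = N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I) with a linear beta schedule. The card does not state a parameterization, so the schedule values, the 500-dimensional toy input, and the function names are all assumptions.

```python
import math
import random

def forward_step(x_prev, beta_t, rng):
    """One draw from q(x_t | x_{t-1}) under the assumed Gaussian form."""
    scale = math.sqrt(1.0 - beta_t)   # shrinks the previous signal
    sigma = math.sqrt(beta_t)         # standard deviation of the added noise
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x_prev]

def run_forward_process(x0, betas, seed=0):
    """Apply the noising step once per beta and return the final state."""
    rng = random.Random(seed)
    x = list(x0)
    for beta_t in betas:
        x = forward_step(x, beta_t, rng)
    return x

num_steps = 1000
# Linear schedule from 1e-4 to 0.02 (assumed values)
betas = [1e-4 + (0.02 - 1e-4) * t / (num_steps - 1) for t in range(num_steps)]

x0 = [5.0] * 500                      # toy "data": every coordinate equals 5
x_T = run_forward_process(x0, betas)

mean_T = sum(x_T) / len(x_T)
var_T = sum((v - mean_T) ** 2 for v in x_T) / len(x_T)
print(mean_T, var_T)  # the original signal is almost gone; roughly unit-variance noise remains
```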

Gradient accumulation simulates larger batch sizes without more memory

Gradient accumulation reduces memory usage by splitting a large batch into smaller micro-batches and accumulating their gradients before a single weight update, so peak memory depends on the micro-batch size rather than the full batch size
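The equivalence can be checked on a toy problem. This sketch assumes a 1-D linear model y_hat = w * x with mean squared error; the data and function names are hypothetical. It shows that gradients accumulated over micro-batches match the gradient of the full batch.

```python
def mse_gradient(w, xs, ys):
    """Gradient of mean squared error with respect to w for y_hat = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_gradient(w, xs, ys, micro_batch_size):
    """Process the batch in micro-batches and accumulate their gradients."""
    n = len(xs)
    total = 0.0
    for start in range(0, n, micro_batch_size):
        xb = xs[start:start + micro_batch_size]
        yb = ys[start:start + micro_batch_size]
        # Weight each micro-batch mean gradient by its share of the full batch
        total += mse_gradient(w, xb, yb) * len(xb) / n
    return total

xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [3.0 * x for x in xs]   # toy targets generated with a true weight of 3
w = 0.0

full = mse_gradient(w, xs, ys)
accum = accumulated_gradient(w, xs, ys, micro_batch_size=2)
print(full, accum)  # the two values agree up to floating-point rounding
```

Only one micro-batch needs to be held in memory at a time, and the weight update is applied once, after the last micro-batch.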

AdaGrad's learning rate decays toward zero

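AdaGrad adjusts each parameter's learning rate by dividing it by the square root of the accumulated squared gradients; because that sum never decreases, the effective learning rate shrinks monotonically and can decay toward zero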
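A small sketch of the decay, assuming the usual scalar form effective_rate = base_lr / (sqrt(sum of squared gradients) + eps). The constant gradient of 1.0 and the hyperparameter values are illustrative assumptions.

```python
import math

def adagrad_effective_rates(gradients, base_lr=0.1, eps=1e-8):
    """Return the effective learning rate AdaGrad would use at each step."""
    accumulated = 0.0
    rates = []
    for g in gradients:
        accumulated += g * g          # running sum of squared gradients
        rates.append(base_lr / (math.sqrt(accumulated) + eps))
    return rates

# With a constant gradient of 1.0 the sum after t steps is t,
# so the rate falls like base_lr / sqrt(t).
rates = adagrad_effective_rates([1.0] * 100)
print(rates[0], rates[99])
```

Here the denominator grows like the square root of the step count, so the decay is steady rather than abrupt, but it never reverses.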

Lyapunov exponents measure: rate of divergence of nearby trajectories in a dynamical system

Lyapunov exponents measure the exponential rate at which nearby trajectories in a dynamical system diverge (positive exponent) or converge (negative exponent)
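One way to make this concrete is the logistic map x <- r * x * (1 - x), a standard toy system that the card itself does not mention. For a 1-D map the exponent can be estimated as the long-run average of log|f'(x)| along a trajectory; the function name, starting point, and step counts below are assumptions.

```python
import math

def logistic_lyapunov(r, x0=0.2, n_steps=20000, burn_in=1000):
    """Estimate the Lyapunov exponent of x <- r * x * (1 - x)."""
    x = x0
    for _ in range(burn_in):          # discard the transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_steps):
        derivative = abs(r * (1.0 - 2.0 * x))
        # Floor the derivative to avoid log(0) if the orbit lands exactly on x = 0.5
        total += math.log(max(derivative, 1e-12))
        x = r * x * (1.0 - x)
    return total / n_steps

stable = logistic_lyapunov(2.5)    # orbit settles on a fixed point: negative exponent
chaotic = logistic_lyapunov(3.9)   # chaotic regime: positive exponent
print(stable, chaotic)
```

At r = 2.5 the orbit converges to the fixed point x = 0.6, where |f'(x)| = 0.5, so the estimate approaches log(0.5). At r = 3.9 nearby orbits separate on average, which shows up as a positive value.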

When to standardize: when you need zero mean and unit variance for gradient-based optimization

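Standardize (subtract the mean, then divide by the standard deviation) when gradient-based optimization needs features with zero mean and unit variance, for example when features are on very different scales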
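A minimal sketch of the transformation for a single feature, using the population standard deviation. The function name and sample values are hypothetical.

```python
import math

def standardize(values):
    """Rescale values to zero mean and unit variance (population std)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    if std == 0.0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mean) / std for v in values]

heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
z = standardize(heights_cm)

z_mean = sum(z) / len(z)
z_var = sum((v - z_mean) ** 2 for v in z) / len(z)
print(z_mean, z_var)  # approximately 0 and 1
```

In practice the mean and standard deviation would be computed on the training data only and then reused for validation and test data.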

Stable Diffusion

Stable Diffusion is a latent diffusion model that generates images from text descriptions
