Denoising score matching learns the score (the gradient of the log-density) of a noise-perturbed data distribution by training a model to denoise corrupted samples.

Image: Ptrump16, Public domain, via Wikimedia Commons

denoising score matching does: learns to denoise, which is equivalent to learning the score of the noise-perturbed data distribution
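The equivalence between denoising and score estimation can be checked numerically. The sketch below is a toy illustration only, and everything in it is an assumption: 1-D Gaussian data, a linear score model `s(x) = a*x + b`, and the hypothetical helper name `dsm_fit_linear_score`. It regresses the model onto the denoising target `-(x_noisy - x) / sigma**2` and compares the fit with the known score of the noised Gaussian.

```python
import random

def dsm_fit_linear_score(data, sigma, rng):
    """Least-squares fit of s(x) = a*x + b to the denoising score matching target."""
    xs, targets = [], []
    for x in data:
        x_noisy = x + sigma * rng.gauss(0.0, 1.0)
        xs.append(x_noisy)
        # DSM target: score of the Gaussian noise kernel q(x_noisy | x)
        targets.append(-(x_noisy - x) / sigma**2)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(targets) / n
    cov = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, targets)) / n
    var = sum((x - mean_x) ** 2 for x in xs) / n
    a = cov / var
    b = mean_t - a * mean_x
    return a, b

rng = random.Random(0)
mu, s, sigma = 2.0, 1.0, 0.5          # assumed data mean, data std, noise std
data = [rng.gauss(mu, s) for _ in range(20000)]
a, b = dsm_fit_linear_score(data, sigma, rng)

# The noised data is N(mu, s^2 + sigma^2), whose score is -(x - mu) / (s^2 + sigma^2).
true_a = -1.0 / (s**2 + sigma**2)
true_b = mu / (s**2 + sigma**2)
print(a, true_a, b, true_b)
```

The fitted slope and intercept should land close to the analytic values, even though the regression never sees the true score, only clean and noisy pairs.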

Related concepts

score matching does: learns the gradient of the log-density without normalizing

Score matching learns the gradient of the log-density without needing to compute the normalizing constant
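Why the normalizing constant drops out can be seen with a small numerical check. This is a sketch under assumptions: the density is an unnormalized 1-D Gaussian `c * exp(-x**2 / 2)`, and `numerical_score` and `make_log_unnormalized` are illustrative helper names. Since `log c` does not depend on `x`, it vanishes under the gradient.

```python
import math

def numerical_score(log_density, x, h=1e-5):
    """Central finite-difference estimate of d/dx log p(x)."""
    return (log_density(x + h) - log_density(x - h)) / (2.0 * h)

def make_log_unnormalized(c):
    # log of c * exp(-x^2 / 2); c stands in for an unknown constant
    return lambda x: math.log(c) - 0.5 * x * x

x = 1.3
score_a = numerical_score(make_log_unnormalized(1.0), x)
score_b = numerical_score(make_log_unnormalized(7.3), x)
print(score_a, score_b)  # both are approximately -x
```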

the reverse process learns: p_θ(x_{t-1}|x_t)

The reverse process learns p_θ(x_{t-1}|x_t), a distribution over the slightly less noisy sample given the current one, so generation denoises one step at a time
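One common way to parameterize a single reverse step is the DDPM-style Gaussian update sketched below. The card does not specify a parameterization, so this is an assumption: `eps_hat` stands for a hypothetical network's noise prediction, `beta_t` and `alpha_bar_t` are schedule values, and the step variance is taken to be `beta_t`.

```python
import math
import random

def reverse_step(x_t, eps_hat, beta_t, alpha_bar_t, rng, add_noise=True):
    """Draw x_{t-1} from a Gaussian p(x_{t-1} | x_t) with a noise-prediction mean."""
    alpha_t = 1.0 - beta_t
    # Mean: remove the predicted noise, then undo the forward-process scaling.
    mean = (x_t - beta_t / math.sqrt(1.0 - alpha_bar_t) * eps_hat) / math.sqrt(alpha_t)
    if not add_noise:            # typically the final step returns the mean only
        return mean
    return mean + math.sqrt(beta_t) * rng.gauss(0.0, 1.0)

rng = random.Random(0)
mean_only = reverse_step(1.0, 0.0, 0.19, 0.5, rng, add_noise=False)
sampled = reverse_step(1.0, 1.0, 0.19, 0.75, rng)
print(mean_only, sampled)
```

Sampling runs this step repeatedly from the largest `t` down to 1, feeding each output back in as the next `x_t`.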

Langevin dynamics does: adds noise to gradient descent to sample from a distribution

Langevin dynamics adds Gaussian noise to gradient steps on the log-density (the score), so the iterates sample from the distribution instead of converging to a single mode
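A minimal sketch of the unadjusted Langevin update, assuming a 1-D Gaussian target with mean `mu` and unit variance so that the score is known in closed form. The step size, chain length, and the name `langevin_sample` are illustrative choices.

```python
import math
import random

def langevin_sample(score, x0, step, n_steps, rng):
    """Unadjusted Langevin: x <- x + step * score(x) + sqrt(2 * step) * noise."""
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * score(x) + math.sqrt(2.0 * step) * noise
    return x

mu = 3.0
score = lambda x: -(x - mu)      # score of N(mu, 1)
rng = random.Random(0)
samples = [langevin_sample(score, 0.0, 0.05, 400, rng) for _ in range(1000)]

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

Without the noise term the update is plain gradient ascent on the log-density and every chain would collapse onto `mu`. With it, the chains spread out to roughly the target variance, up to a small bias from the finite step size.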

Brier score

The Brier score measures the mean squared error between predicted probabilities and the observed outcomes (lower is better)
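For binary outcomes the definition fits in a few lines. The function name `brier_score` and the example numbers are illustrative.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Squared errors are 0.01, 0.04 and 0.16, so the score is about 0.07.
score = brier_score([0.9, 0.2, 0.6], [1, 0, 1])
print(score)
```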

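AdaGrad's effective learning rate decays toward zero

AdaGrad divides the learning rate by the square root of the accumulated squared gradients. Because that sum never decreases, the effective learning rate shrinks monotonically and can decay toward zero over long training runs.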
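The decay can be traced directly. The sketch below assumes a single parameter and a constant gradient of 1.0, and `adagrad_effective_lrs` is an illustrative helper, not a library function.

```python
import math

def adagrad_effective_lrs(grads, lr=0.1, eps=1e-8):
    """Return the effective step size lr / (sqrt(sum of g^2) + eps) after each gradient."""
    accum = 0.0
    rates = []
    for g in grads:
        accum += g * g                       # accumulator only ever grows
        rates.append(lr / (math.sqrt(accum) + eps))
    return rates

rates = adagrad_effective_lrs([1.0] * 100)
print(rates[0], rates[-1])                   # roughly 0.1 and 0.01
```

With a constant gradient the accumulator grows linearly in the step count `t`, so the effective rate falls off like `1 / sqrt(t)`.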

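classifier-free guidance does: combines conditional and unconditional predictions to steer generation toward the condition

Classifier-free guidance uses no classifier. A single model produces both a conditional and an unconditional prediction, and a guidance scale blends them: scales between 0 and 1 interpolate, while scales above 1 extrapolate past the conditional prediction to follow the condition more strongly.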

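A minimal sketch of the guidance rule, assuming the convention `uncond + w * (cond - uncond)`. Some papers write the same idea with `1 + w` in place of `w`, so the scale here may be offset by one relative to other sources. The name `cfg_combine` and the example vectors are illustrative.

```python
def cfg_combine(eps_uncond, eps_cond, w):
    """Blend unconditional and conditional predictions with guidance scale w."""
    return [u + w * (c - u) for u, c in zip(eps_uncond, eps_cond)]

eps_uncond = [0.0, 1.0]
eps_cond = [1.0, 3.0]

unguided = cfg_combine(eps_uncond, eps_cond, 0.0)     # [0.0, 1.0], purely unconditional
conditional = cfg_combine(eps_uncond, eps_cond, 1.0)  # [1.0, 3.0], purely conditional
guided = cfg_combine(eps_uncond, eps_cond, 2.0)       # [2.0, 5.0], extrapolated
print(unguided, conditional, guided)
```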