Top-k sampling fixes candidate count; top-p sampling fixes cumulative probability mass

Image: MadriCR, CC BY-SA 3.0, via Wikimedia Commons

Top-k vs top-p sampling: top-k fixes candidate count, top-p fixes cumulative probability mass
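
The contrast can be sketched in a few lines of plain Python. This is a minimal illustration, not any particular library's implementation; the function names `top_k_filter` and `top_p_filter` and the two example distributions are assumptions made up for this sketch.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = ranked[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = [], 0.0
    for i in ranked:
        keep.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return {i: probs[i] / mass for i in keep}


# Two hypothetical next-token distributions over a 4-token vocabulary.
peaked = [0.90, 0.05, 0.03, 0.02]
flat = [0.25, 0.25, 0.25, 0.25]

# Top-k keeps exactly k candidates regardless of the distribution's shape.
print(len(top_k_filter(peaked, 2)), len(top_k_filter(flat, 2)))

# Top-p keeps as many candidates as needed to cover the requested mass.
print(len(top_p_filter(peaked, 0.8)), len(top_p_filter(flat, 0.8)))
```

With the peaked distribution, top-p keeps a single token because it alone covers the requested mass; with the flat one it keeps all four, while top-k keeps two in both cases.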


Related concepts

What importance sampling does: reweights samples from a proposal distribution to estimate a target expectation

Importance sampling reweights samples drawn from a proposal distribution q by the ratio p(x)/q(x) to estimate an expectation under a target distribution p
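
As a rough sketch of the reweighting idea: draw samples from the proposal, weight each by target density over proposal density, and average. The setup below is an assumption chosen for illustration (a standard normal target, a wider normal proposal, and f(x) = x², whose true expectation under the target is 1); the helper names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_estimate(f, target_pdf, proposal_pdf, proposal_sample, n, rng):
    """Estimate E_target[f(x)] using n samples drawn from the proposal."""
    total = 0.0
    for _ in range(n):
        x = proposal_sample(rng)
        weight = target_pdf(x) / proposal_pdf(x)  # importance weight
        total += weight * f(x)
    return total / n


rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = importance_estimate(
    f=lambda x: x * x,
    target_pdf=lambda x: normal_pdf(x, 0.0, 1.0),    # target: N(0, 1)
    proposal_pdf=lambda x: normal_pdf(x, 0.0, 2.0),  # proposal: N(0, 2^2)
    proposal_sample=lambda r: r.gauss(0.0, 2.0),
    n=50_000,
    rng=rng,
)
print(estimate)  # should land close to the true value 1.0
```

The proposal here is deliberately wider than the target, which keeps the weights bounded; a proposal with lighter tails than the target can make the estimate's variance blow up.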

Markov chain Monte Carlo

MCMC draws samples from complex posterior distributions by simulating a Markov chain whose stationary distribution is the target

Boosting (machine learning)

Boosting reduces bias by sequentially combining weak learners, each trained to correct the errors of the ones before it

What GraphSAGE does: samples and aggregates a fixed-size neighborhood

GraphSAGE samples and aggregates a fixed-size neighborhood
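
A simplified sketch of the sample-then-aggregate step, assuming a mean aggregator and omitting the learned weight matrix and nonlinearity that a trained layer would apply afterwards. The graph, the features, and the function names are hypothetical, and every node is assumed to have at least one neighbor.

```python
import random


def sample_neighbors(adj, node, num_samples, rng):
    """Return exactly num_samples neighbors of node.

    Samples without replacement when there are enough neighbors,
    otherwise with replacement (a choice made for this sketch).
    """
    neighbors = adj[node]
    if len(neighbors) >= num_samples:
        return rng.sample(neighbors, num_samples)
    return [rng.choice(neighbors) for _ in range(num_samples)]


def mean_aggregate(features, nodes):
    """Element-wise mean of the feature vectors of the given nodes."""
    dim = len(features[nodes[0]])
    return [sum(features[n][d] for n in nodes) / len(nodes) for d in range(dim)]


def sage_layer(adj, features, node, num_samples, rng):
    """Concatenate a node's own features with its aggregated neighborhood."""
    sampled = sample_neighbors(adj, node, num_samples, rng)
    return features[node] + mean_aggregate(features, sampled)


# Hypothetical star graph: node 0 is connected to nodes 1..4.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
features = {
    0: [1.0, 0.0],
    1: [0.0, 1.0],
    2: [0.0, 2.0],
    3: [0.0, 3.0],
    4: [0.0, 4.0],
}

rng = random.Random(0)
h0 = sage_layer(adj, features, 0, 2, rng)  # uses 2 of node 0's 4 neighbors
h1 = sage_layer(adj, features, 1, 2, rng)  # node 1 has one neighbor, sampled twice
print(h0)
print(h1)
```

Because the sample size is fixed, the cost per node does not grow with its degree, which is the point of sampling rather than aggregating over the full neighborhood.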

Why log-probabilities are used instead of probabilities: avoids numerical underflow

Log-probabilities convert multiplications into additions, preventing numerical underflow
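
A small demonstration of the underflow problem, using a made-up sequence of 100 tokens that each have probability 1e-5. The true product, 1e-500, is far below the smallest positive double, while the sum of logs stays in a comfortable range.

```python
import math

# Hypothetical per-token probabilities for a 100-token sequence.
probs = [1e-5] * 100

# Multiplying raw probabilities underflows to zero in double precision.
product = 1.0
for p in probs:
    product *= p

# Summing log-probabilities represents the same quantity without underflow.
log_sum = sum(math.log(p) for p in probs)

print(product)  # 0.0
print(log_sum)  # roughly -1151.3
```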

Greedy vs beam search decoding: greedy picks best token, beam maintains k candidates

Greedy decoding picks the single most probable token at each step; beam search keeps the k highest-scoring partial sequences
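
The difference shows up on a toy example where the locally best first token leads to a worse overall sequence. The lookup table `TOY_MODEL` below is a hypothetical stand-in for a language model, and the sketch decodes a fixed number of steps with no end-of-sequence handling or length normalization.

```python
import math

# Hypothetical toy model: next-token probabilities keyed by the prefix so far.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def greedy_decode(model, steps):
    """At each step, commit to the single most probable next token."""
    seq, score = (), 0.0
    for _ in range(steps):
        token, prob = max(model[seq].items(), key=lambda kv: kv[1])
        seq, score = seq + (token,), score + math.log(prob)
    return seq, score


def beam_decode(model, steps, k):
    """Keep the k partial sequences with the highest total log-probability."""
    beams = [((), 0.0)]
    for _ in range(steps):
        candidates = [
            (seq + (token,), score + math.log(prob))
            for seq, score in beams
            for token, prob in model[seq].items()
        ]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    return beams[0]


greedy_seq, greedy_score = greedy_decode(TOY_MODEL, steps=2)
beam_seq, beam_score = beam_decode(TOY_MODEL, steps=2, k=2)

print(greedy_seq, math.exp(greedy_score))  # ('a', 'x'), probability about 0.33
print(beam_seq, math.exp(beam_score))      # ('b', 'x'), probability about 0.36
```

Greedy commits to "a" because it is the most probable first token, while a beam of width 2 keeps "b" alive long enough to find the higher-probability sequence. With k=1, beam search reduces to greedy decoding.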
