Bloom filters test whether an element is possibly in a set: they may report false positives but never false negatives
Image: Nandanupadhyay, CC BY-SA 3.0, via Wikimedia Commons
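A minimal sketch of a Bloom filter, to make the "possibly in the set" behaviour concrete. The class name `BloomFilter`, the sizes `num_bits` and `num_hashes`, and the use of salted SHA-256 as the hash family are all illustrative assumptions, not something the card specifies.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; num_bits and num_hashes are illustrative choices."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for word in ["apple", "banana", "cherry"]:
    bf.add(word)

print(bf.might_contain("banana"))  # True: added items are never reported absent
print(bf.might_contain("durian"))  # usually False, but a false positive is possible
```

Because `add` only ever sets bits, every added item keeps passing `might_contain`, which is why false negatives cannot occur. False positives happen when other items have already set all of an absent item's positions.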
Why log-probabilities are used instead of probabilities: to avoid numerical underflow
Log-probabilities turn products of probabilities into sums, which prevents numerical underflow when many small values are multiplied
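A small illustration of the underflow problem, assuming 100 independent events that each have probability 1e-5 (made-up numbers chosen only to push the product below the float64 range):

```python
import math

probs = [1e-5] * 100  # 100 independent events, each with probability 1e-5

product = 1.0
for p in probs:
    product *= p  # the true value, 1e-500, is smaller than any float64

log_sum = sum(math.log(p) for p in probs)  # the same quantity in log space

print(product)  # 0.0 (underflow)
print(log_sum)  # about -1151.29
```

The direct product collapses to zero and loses all information, while the sum of logs stays a perfectly ordinary number that can still be compared against other scores.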
What structured pruning removes: entire filters or attention heads, not individual weights
Structured pruning removes entire filters or attention heads, not individual weights
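A rough sketch of filter-level pruning. The selection rule used here (keep the filters with the largest L1 norm) is one common heuristic and an assumption of this example. The function `prune_filters` and the flat-list weight layout are hypothetical.

```python
def prune_filters(filters, keep):
    """Keep the `keep` filters with the largest L1 norm; drop the rest whole.

    `filters` is a list of filters, each a flat list of weights (hypothetical layout).
    """
    norms = [sum(abs(w) for w in f) for f in filters]
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:keep])  # restore the original filter order
    return [filters[i] for i in kept], kept


conv_filters = [
    [0.9, -0.8, 0.7, 0.6],      # L1 norm 3.0
    [0.01, 0.02, -0.01, 0.0],   # L1 norm 0.04
    [0.5, 0.5, -0.5, 0.5],      # L1 norm 2.0
    [0.0, 0.1, 0.0, -0.1],      # L1 norm 0.2
]
pruned, kept_indices = prune_filters(conv_filters, keep=2)
print(kept_indices)  # [0, 2]
print(len(pruned))   # 2: the layer shrinks from 4 filters to 2
```

The point of removing whole filters is that the weight tensor actually gets smaller. Zeroing individual weights (unstructured pruning) leaves the shape unchanged and needs sparse kernels to pay off.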
What importance sampling does: reweights samples from a proposal to estimate a target expectation
Importance sampling reweights samples from a proposal distribution to estimate the expectation under a target distribution
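A minimal sketch of the reweighting, assuming a standard normal target p and a wider normal proposal q with standard deviation 2 (both chosen purely for illustration). Each sample drawn from q is weighted by p(x) / q(x) before averaging.

```python
import math
import random


def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_estimate(f, n, rng):
    """Estimate E_p[f(X)] for target p = N(0, 1) with proposal q of std 2."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 2.0)  # sample from the proposal q
        weight = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)  # p(x) / q(x)
        total += weight * f(x)
    return total / n


rng = random.Random(0)
estimate = importance_estimate(lambda x: x * x, 100_000, rng)
print(estimate)  # close to 1.0, the second moment of N(0, 1)
```

A proposal with heavier tails than the target keeps the weights bounded. A proposal that is too narrow can produce a few huge weights and a very noisy estimate.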
Top-k vs top-p sampling: top-k fixes candidate count, top-p fixes cumulative probability mass
Top-k sampling fixes candidate count; top-p sampling fixes cumulative probability mass
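The two truncation rules can be sketched as filters over a next-token distribution. The token probabilities below are made up, and the function names `top_k_filter` and `top_p_filter` are hypothetical.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    kept = sorted(probs, key=probs.get, reverse=True)[:k]
    total = sum(probs[t] for t in kept)
    return {t: probs[t] / total for t in kept}


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    kept, mass = [], 0.0
    for token in sorted(probs, key=probs.get, reverse=True):
        kept.append(token)
        mass += probs[token]
        if mass >= p:
            break
    return {t: probs[t] / mass for t in kept}


probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "dog": 0.05}
print(sorted(top_k_filter(probs, k=2)))    # ['a', 'the']
print(sorted(top_p_filter(probs, p=0.9)))  # ['a', 'cat', 'the']
```

With top-k the number of surviving candidates is always k. With top-p it varies with how peaked the distribution is, so a confident model keeps few tokens and an uncertain one keeps many.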
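Greedy vs beam search decoding: greedy picks the best token at each step, beam search maintains k candidates
Greedy decoding picks the single most probable token at each step; beam search maintains the k highest-scoring partial sequences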
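A toy comparison of the two decoders. The `MODEL` table is a hypothetical next-token distribution, built so that the locally best first token leads to a worse overall sequence.

```python
import math

# Hypothetical toy model: next-token probabilities conditioned on the prefix.
MODEL = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.5, "y": 0.5},
    ("B",): {"x": 0.9, "y": 0.1},
}


def greedy_decode(steps):
    seq, logp = (), 0.0
    for _ in range(steps):
        token, p = max(MODEL[seq].items(), key=lambda kv: kv[1])
        seq, logp = seq + (token,), logp + math.log(p)
    return seq, logp


def beam_search(steps, k):
    beams = [((), 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = [
            (seq + (token,), logp + math.log(p))
            for seq, logp in beams
            for token, p in MODEL[seq].items()
        ]
        # Keep only the k best partial sequences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    return beams[0]


print(greedy_decode(2)[0])     # ('A', 'x'), probability 0.30
print(beam_search(2, k=2)[0])  # ('B', 'x'), probability 0.36
```

Greedy commits to "A" because 0.6 beats 0.4, then can do no better than 0.30 overall. A beam of width 2 keeps "B" alive long enough to find the 0.36 sequence. With k = 1, beam search reduces to greedy decoding.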
Randomized algorithm
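Randomized algorithms use random bits during execution, typically to achieve a good expected running time (for example, expected polynomial time) or a small error probability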
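As one concrete instance, here is a sketch of randomized quickselect, picked for this example because its result is always correct and only its running time depends on the random pivots (expected linear time). The card itself does not name a specific algorithm.

```python
import random


def quickselect(items, k, rng=random):
    """Return the k-th smallest element (0-indexed) using a random pivot.

    The answer is always correct; only the running time is random.
    """
    items = list(items)
    while True:
        pivot = rng.choice(items)
        lower = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k < len(lower):
            items = lower
        elif k < len(lower) + len(equal):
            return pivot
        else:
            k -= len(lower) + len(equal)
            items = [x for x in items if x > pivot]


data = [7, 2, 9, 4, 1, 8]
print(quickselect(data, 2))  # 4, the third-smallest element
```

Choosing the pivot at random means no fixed input can reliably trigger the worst case, which is the usual motivation for randomizing an otherwise deterministic procedure.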