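What ALiBi does: biases attention scores by query-key distance instead of adding position embeddings
ALiBi adds a fixed, non-learned penalty to each attention score that grows linearly with the distance between query and key, so a model trained on short sequences can extrapolate to longer ones at inference without retraining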
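As a minimal sketch of the bias ALiBi adds to attention scores, assuming a causal model and a power-of-two number of heads (the helper names `alibi_slopes` and `alibi_bias` are illustrative, not from any library):

```python
def alibi_slopes(num_heads):
    # Geometric sequence of per-head slopes: 2^(-8/n), 2^(-16/n), ..., 2^(-8).
    # Assumes num_heads is a power of two.
    return [2 ** (-8 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(seq_len, slope):
    # bias[i][j] = -slope * (i - j) for keys at or before the query position;
    # future keys get -inf, which acts as the causal mask after softmax.
    return [
        [-slope * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
        for i in range(seq_len)
    ]


# The bias is added to the raw query-key scores before softmax.
# It depends only on distance, so it can be built for any sequence length.
bias_for_first_head = alibi_bias(seq_len=4, slope=alibi_slopes(8)[0])
```

Because nothing here is learned or tied to a maximum length, the same function can produce the bias for sequences longer than those seen in training.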
What weight tying does in language models: shares embedding and output projection matrices
Weight tying reuses the input embedding matrix as the output projection, so one matrix serves both roles and the parameter count drops
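A small sketch of the idea in plain Python, assuming a toy model with no hidden layers (the class `TinyTiedLM` and its methods are hypothetical names used only for illustration):

```python
import random


class TinyTiedLM:
    def __init__(self, vocab_size, dim, seed=0):
        rng = random.Random(seed)
        # One (vocab_size x dim) matrix, used for both input and output.
        self.E = [[rng.gauss(0, 0.02) for _ in range(dim)]
                  for _ in range(vocab_size)]

    def embed(self, token_id):
        # Input side: look up the row for this token.
        return self.E[token_id]

    def logits(self, hidden):
        # Output side: score every token by a dot product with the same rows.
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.E]

    def num_params(self):
        # An untied model would need a second matrix of the same size.
        return len(self.E) * len(self.E[0])


model = TinyTiedLM(vocab_size=10, dim=4)
scores = model.logits(model.embed(3))
```

With a separate output projection the same toy model would hold 80 parameters instead of 40.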
Why the curse of dimensionality makes nearest neighbor search unreliable
In high-dimensional spaces, pairwise distances concentrate: the nearest and farthest points end up almost equally far from a query, so the "nearest" neighbor carries little information
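The effect can be observed with a quick experiment. This is a sketch assuming points drawn uniformly from the unit hypercube; `relative_contrast` is an illustrative helper, and the exact values depend on the seed and sample size:

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    # (farthest - nearest) / nearest distance from one random query point
    # to n_points random points in the unit hypercube.
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

The contrast shrinks as the dimension grows, which is why a small amount of noise can change which point counts as the nearest neighbor.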
What AWQ does differently — activation-aware weight quantization preserves important weights
AWQ quantizes only the weights, using activation statistics to find the small set of weight channels that matter most and scaling them so they lose less precision
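A toy illustration of the scaling idea, not the full method: the weights, activations, and the scale `s = 2.0` below are hand-picked assumptions, whereas AWQ itself derives per-channel scales from calibration data and uses group-wise quantization.

```python
def quantize(weights, bits=4):
    # Symmetric round-to-nearest quantization with a single step size.
    qmax = 2 ** (bits - 1) - 1
    step = max(abs(w) for w in weights) / qmax
    return [max(-qmax, min(qmax, round(w / step))) * step for w in weights]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


weights = [1.0, 0.24, -0.6]
acts = [0.1, 10.0, 0.1]  # channel 1 sees large activations, so it is salient
exact = dot(weights, acts)

# Baseline: quantize the weights as they are.
plain_err = abs(dot(quantize(weights), acts) - exact)

# Activation-aware: scale the salient weight up by s before quantizing and
# scale its activation down by s, which leaves the exact product unchanged.
s = 2.0  # hand-picked for this toy example
scaled_w = [w * s if i == 1 else w for i, w in enumerate(weights)]
scaled_x = [x / s if i == 1 else x for i, x in enumerate(acts)]
awq_err = abs(dot(quantize(scaled_w), scaled_x) - exact)

print(plain_err, awq_err)
```

In this example the scaled version has the smaller output error, because the rounding error on the salient weight is divided by `s` when the activation is rescaled.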
What score matching does: learns the gradient of the log-density without normalizing
Score matching fits a model to the score, the gradient of the log-density with respect to the data, which never requires computing the normalizing constant
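The reason the normalizer can be ignored is that it is constant in the data, so it vanishes under the gradient. The sketch below checks this numerically for a 1-D Gaussian shape multiplied by an arbitrary constant `c`; the function names are illustrative, and this shows only the invariance, not a training procedure:

```python
import math


def log_unnormalized(x, c, mu=0.0, sigma=1.0):
    # log of c * exp(-(x - mu)^2 / (2 sigma^2)); c stands in for any
    # unknown normalizing factor.
    return math.log(c) - (x - mu) ** 2 / (2 * sigma ** 2)


def score(x, c, h=1e-5):
    # Central finite difference of the log-density with respect to x.
    return (log_unnormalized(x + h, c) - log_unnormalized(x - h, c)) / (2 * h)


# Both calls approximate -(x - mu) / sigma^2, whatever the value of c.
print(score(2.0, c=1.0), score(2.0, c=100.0))
```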
Greedy vs beam search decoding: greedy picks best token, beam maintains k candidates
Greedy decoding commits to the single highest-probability token at each step, while beam search keeps the k highest-scoring partial sequences and extends them all
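The difference shows up when the locally best token leads to a worse sequence overall. The sketch below uses a made-up next-token table `NEXT` (an assumption for illustration, standing in for a real model) in which greedy and beam search disagree:

```python
import math

# Hypothetical next-token probabilities, keyed by the prefix so far.
NEXT = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.55, "y": 0.45},
    ("B",): {"x": 0.9, "y": 0.1},
}


def greedy(steps):
    seq = ()
    for _ in range(steps):
        dist = NEXT[seq]
        seq += (max(dist, key=dist.get),)  # take the best token right now
    return seq


def beam_search(steps, k):
    beams = [((), 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = [
            (seq + (tok,), score + math.log(p))
            for seq, score in beams
            for tok, p in NEXT[seq].items()
        ]
        # Keep only the k best partial sequences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    return beams[0][0]


print(greedy(2))             # ('A', 'x'), probability 0.6 * 0.55 = 0.33
print(beam_search(2, k=2))   # ('B', 'x'), probability 0.4 * 0.9 = 0.36
```

With `k=1` the beam search reduces to greedy decoding.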
What consistent hashing does: minimizes remapping when nodes join/leave
Consistent hashing places nodes and keys on the same hash ring, so adding or removing a node moves only the keys adjacent to it rather than reshuffling everything
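A minimal hash-ring sketch, assuming MD5 as the hash function and 100 virtual nodes per physical node (both are illustrative choices, as is the `HashRing` class itself):

```python
import bisect
import hashlib


def _hash(text):
    return int(hashlib.md5(text.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted list of (hash, node) positions on the ring
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns several points to spread load more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._points = [p for p in self._points if p[1] != node]

    def lookup(self, key):
        # First point clockwise from the key's hash, wrapping around.
        i = bisect.bisect_right(self._points, (_hash(key), ""))
        return self._points[i % len(self._points)][1]


keys = [f"key{i}" for i in range(1000)]
ring = HashRing(["a", "b", "c"])
before = {k: ring.lookup(k) for k in keys}
ring.add("d")
moved = [k for k in keys if ring.lookup(k) != before[k]]
print(len(moved), "of", len(keys), "keys moved")
```

Every key that moves lands on the new node `d`; keys owned by the other nodes stay where they were. A plain `hash(key) % num_nodes` scheme would instead remap most keys when the node count changes.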