Product quantization compresses vectors by splitting them into subvectors and quantizing each subvector independently
Image: Comixboy at English Wikipedia, CC BY 2.5, via Wikimedia Commons
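The split-and-quantize idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the helper names (`kmeans`, `pq_train`, `pq_encode`, `pq_decode`), the random data, and the choices of 4 subvectors and 16 centroids per codebook are all hypothetical, and plain Lloyd's k-means is assumed as the per-subvector quantizer.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return centroids

def pq_train(x, m, k):
    """Learn one codebook per subspace; returns shape (m, k, d // m)."""
    subvectors = np.split(x, m, axis=1)
    return np.stack([kmeans(s, k, seed=i) for i, s in enumerate(subvectors)])

def pq_encode(x, codebooks):
    """Replace each subvector by the index of its nearest centroid."""
    m = codebooks.shape[0]
    codes = []
    for sub, book in zip(np.split(x, m, axis=1), codebooks):
        dists = ((sub[:, None, :] - book[None, :, :]) ** 2).sum(axis=2)
        codes.append(dists.argmin(axis=1))
    return np.stack(codes, axis=1).astype(np.uint8)  # valid while k <= 256

def pq_decode(codes, codebooks):
    """Rebuild approximate vectors by concatenating the chosen centroids."""
    parts = [book[codes[:, i]] for i, book in enumerate(codebooks)]
    return np.concatenate(parts, axis=1)

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 8)).astype(np.float32)  # illustrative data
codebooks = pq_train(data, m=4, k=16)
codes = pq_encode(data, codebooks)
approx = pq_decode(codes, codebooks)
mse = float(np.mean((data - approx) ** 2))
```

In this sketch each 8-dimensional float32 vector (32 bytes) is stored as 4 one-byte codes, and the reconstruction is only approximate; the codebooks themselves are a fixed overhead shared by all vectors.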
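What GPTQ quantization does
Post-training quantization that uses approximate second-order (Hessian) information to quantize a trained model's weights layer by layer, compressing the model without retraining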
autoencoders learn the data manifold
Autoencoders learn a compact representation of the data manifold by forcing information through a narrow bottleneck layer and training the network to reconstruct its input from that bottleneck
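The bottleneck effect can be illustrated without training a network. The sketch below swaps the trained nonlinear autoencoder for a linear one, whose squared-error optimum is known in closed form (the top principal directions from an SVD). The data, the names `encode` and `decode`, and the dimensions are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 points in 5-D that lie on a 2-D plane.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
x = latent @ mixing

# Closed-form linear autoencoder: the encoder projects onto the top-k
# right singular vectors of the centered data, the decoder maps back.
mean = x.mean(axis=0)
_, _, vt = np.linalg.svd(x - mean, full_matrices=False)

def encode(points, k):
    """Squeeze points through a k-dimensional bottleneck."""
    return (points - mean) @ vt[:k].T

def decode(codes, k):
    """Reconstruct points from their k-dimensional codes."""
    return codes @ vt[:k] + mean

# A bottleneck as wide as the manifold reconstructs almost perfectly;
# a narrower one has to throw information away.
err_k2 = float(np.mean((x - decode(encode(x, 2), 2)) ** 2))
err_k1 = float(np.mean((x - decode(encode(x, 1), 1)) ** 2))
```

A real autoencoder replaces the two linear maps with neural networks trained by gradient descent, which lets it follow curved manifolds that a linear projection cannot.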
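quantization to INT8 can double throughput
Quantization to INT8 can roughly double throughput relative to FP16 on GPUs whose tensor cores have about 2x the peak INT8 rate; INT8 values also take half the memory and bandwidth of FP16. Measured speedups depend on the hardware and on how much of the workload is actually quantized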
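What "quantizing to INT8" means numerically can be sketched as follows. This is a simplified illustration, assuming symmetric per-tensor scaling and an int32 accumulator; the function names are hypothetical, and the code runs on the CPU, so it shows the arithmetic and the storage saving, not any throughput gain.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization: x ~= scale * q, q in [-127, 127]."""
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(a_q, a_scale, b_q, b_scale):
    """Multiply in integers, accumulate in int32, then rescale to float."""
    acc = a_q.astype(np.int32) @ b_q.astype(np.int32)
    return acc.astype(np.float32) * (a_scale * b_scale)

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 64)).astype(np.float32)  # illustrative activations
b = rng.normal(size=(64, 3)).astype(np.float32)  # illustrative weights

a_q, a_s = quantize_int8(a)
b_q, b_s = quantize_int8(b)

approx = int8_matmul(a_q, a_s, b_q, b_s)
exact = a @ b

# Rounding error per element is at most half a quantization step.
max_round_err = float(np.max(np.abs(a - a_q.astype(np.float32) * a_s)))
max_matmul_err = float(np.max(np.abs(approx - exact)))
```

Production schemes usually refine this with per-channel scales and calibration data, which the sketch leaves out.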
Shannon's source coding theorem: you can't compress below entropy
Shannon's source coding theorem: no lossless code can use fewer bits per symbol, on average, than the entropy of the source
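The entropy bound can be checked on a small example. The sketch below computes the entropy of a made-up four-symbol source and compares it with the average length of a Huffman code built for the same probabilities. The distributions and function names are assumptions for illustration, and Huffman coding stands in for "a good lossless code".

```python
import heapq
import math

def entropy_bits(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code."""
    # Heap entries are (probability, tie_breaker, {symbol: depth}).
    heap = [(p, i, {sym: 0}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

def average_length(probs):
    lengths = huffman_code_lengths(probs)
    return sum(probs[s] * lengths[s] for s in probs)

# Dyadic probabilities: the code meets the entropy bound exactly.
dyadic = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
h_dyadic = entropy_bits(dyadic)      # 1.75 bits per symbol
avg_dyadic = average_length(dyadic)  # 1.75 bits per symbol

# Skewed probabilities: the code stays above the bound.
skewed = {"a": 0.9, "b": 0.1}
h_skewed = entropy_bits(skewed)
avg_skewed = average_length(skewed)  # 1.0 bit per symbol
```

In both cases the average code length is at least the entropy; it matches the entropy only when every probability is a power of 1/2.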
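What the Gram-Schmidt process does: orthogonalizes a set of vectors
Turns a set of linearly independent vectors into an orthogonal set spanning the same subspace by subtracting from each vector its projections onto the vectors already processed; normalizing each result gives an orthonormal basis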
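A short NumPy sketch of the procedure follows. It uses the modified Gram-Schmidt variant, which subtracts projections one at a time for better numerical behavior, and it drops vectors that turn out to be linearly dependent. The function name, tolerance, and input vectors are assumptions for the example.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Modified Gram-Schmidt: rows in, orthonormal rows out."""
    basis = []
    for v in np.asarray(vectors, dtype=float):
        w = v.copy()
        for q in basis:
            # Remove the component of w that lies along q.
            w -= (q @ w) * q
        norm = np.linalg.norm(w)
        if norm > tol:  # skip vectors already in the span of the basis
            basis.append(w / norm)
    return np.array(basis)

vectors = [[1.0, 1.0, 0.0],
           [1.0, 0.0, 1.0],
           [0.0, 1.0, 1.0]]
ortho = gram_schmidt(vectors)

# Rows are orthonormal, so ortho @ ortho.T is the identity matrix.
gram = ortho @ ortho.T
```

For production work a QR factorization such as `numpy.linalg.qr` is usually the more robust way to obtain the same orthonormal basis.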
batch size affects generalization: larger batches find sharper minima
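Larger batch sizes tend to converge to sharper minima, which often generalize worse. Their gradient estimates are more accurate, but the extra gradient noise of small batches acts as an implicit regularizer that favors flatter minima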