Image: Comixboy at English Wikipedia, CC BY 2.5, via Wikimedia Commons

Vector quantization

Product quantization compresses a vector by splitting it into subvectors and quantizing each subvector independently against its own codebook, so the vector is stored as a short list of centroid indices.
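A minimal sketch of the encode and decode steps, assuming the per-subvector codebooks already exist (in practice they are usually learned with k-means). The function names `pq_encode` and `pq_decode` and the hand-picked codebooks are hypothetical and only serve the illustration.

```python
from math import dist

def pq_encode(vector, codebooks):
    """Return one centroid index per subvector.

    codebooks[i] holds the centroids for the i-th subvector. The number of
    codebooks is assumed to divide len(vector) evenly.
    """
    sub_len = len(vector) // len(codebooks)
    codes = []
    for i, centroids in enumerate(codebooks):
        sub = vector[i * sub_len:(i + 1) * sub_len]
        # Pick the nearest centroid by Euclidean distance.
        nearest = min(range(len(centroids)), key=lambda k: dist(sub, centroids[k]))
        codes.append(nearest)
    return codes

def pq_decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen centroids."""
    approx = []
    for code, centroids in zip(codes, codebooks):
        approx.extend(centroids[code])
    return approx

# Hypothetical codebooks: 2 subvectors of length 2, 2 centroids each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
vector = [0.9, 1.2, 0.1, 0.8]
codes = pq_encode(vector, codebooks)
approx = pq_decode(codes, codebooks)
```

Here the four floats are stored as two small indices, and decoding returns the concatenated centroids rather than the original values, which is where the compression loss comes from.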

Related concepts

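What GPTQ quantization does

GPTQ is a post-training quantization method that uses approximate second-order (Hessian) information to quantize a trained model's weights layer by layer, typically down to 3 or 4 bits, without retraining.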

How autoencoders learn the data manifold

Autoencoders learn the data manifold by forcing information through a bottleneck layer: to reconstruct the input from so few units, the network has to learn a compact, efficient representation of the data.

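Why quantization to INT8 can double throughput

Quantization to INT8 can roughly double peak throughput relative to FP16 because tensor cores on many NVIDIA GPUs execute INT8 matrix multiplies at about twice the FP16 rate. The realized speedup depends on kernel support and on whether the workload is compute-bound or memory-bound.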
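The sketch below shows only the numeric mapping behind INT8 quantization, not the hardware speedup. It assumes symmetric per-tensor quantization with a single scale, which is one common scheme among several; `quantize_int8` and `dequantize_int8` are hypothetical helper names.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale

def dequantize_int8(quantized, scale):
    """Map the integers back to approximate floating-point values."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy cost paid for the smaller, faster integer format.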

Shannon's source coding theorem: you can't compress below entropy

Shannon's source coding theorem: no lossless code can use fewer bits per symbol, on average, than the entropy of the source.
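A small worked example of the bound, assuming symbols are drawn independently with the frequencies observed in the message. The message and the prefix code are hand-picked for illustration.

```python
from collections import Counter
from math import log2

def entropy_bits_per_symbol(message):
    """Empirical Shannon entropy of the symbol frequencies in `message`."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * log2(c / n) for c in counts.values())

message = "aaaabbcd"
h = entropy_bits_per_symbol(message)
lower_bound_bits = h * len(message)

# A hand-picked prefix code that gives shorter codewords to frequent symbols.
code = {"a": "0", "b": "10", "c": "110", "d": "111"}
encoded_bits = sum(len(code[s]) for s in message)
```

For this message the entropy is 1.75 bits per symbol, so the bound is 14 bits, and the prefix code above uses exactly 14 bits. It meets the bound because every symbol probability is a power of 1/2; in general a code can only approach it.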

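What the Gram-Schmidt process does: orthogonalizes a set of vectors

Gram-Schmidt turns a set of linearly independent vectors into an orthogonal (or, after normalization, orthonormal) set that spans the same subspace, by subtracting from each vector its projections onto the vectors already processed.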
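A minimal sketch in plain Python, using the modified Gram-Schmidt variant (projections are subtracted one at a time from the running vector), which is generally better behaved numerically than the textbook form. The `eps` threshold for dropping dependent vectors is an assumption made for this example.

```python
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, eps=1e-12):
    """Return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # Remove the component of w that lies along q.
            coeff = dot(w, q)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = sqrt(dot(w, w))
        if norm > eps:  # skip vectors that are numerically dependent
            basis.append([wi / norm for wi in w])
    return basis

basis = gram_schmidt([[3.0, 1.0], [2.0, 2.0]])
```

The two returned vectors have unit length and a dot product of zero up to floating-point rounding.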

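How batch size affects generalization: larger batches find sharper minima

Larger batch sizes give more accurate gradient estimates, but empirically they tend to converge to sharper minima, which are associated with worse generalization. The gradient noise of smaller batches tends to steer training toward flatter minima.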