Image: cavebear42, CC BY-SA 4.0, via Wikimedia Commons

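Cosine similarity is preferred over the raw dot product when embedding norms vary

Cosine similarity measures orientation, not magnitude. The raw dot product grows with vector length, so it mixes direction with norm. On unit-normalized embeddings the two are identical, which is why normalizing first lets a plain dot product serve as cosine similarity.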
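
A minimal sketch of the difference, using only the Python standard library. The vectors `a`, `b`, `c` and the helper names are made up for illustration; they stand in for real embeddings.

```python
import math

def dot(a, b):
    # Plain dot product: sensitive to both direction and length.
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine_similarity(a, b):
    # Dot product divided by both lengths: only direction is left.
    return dot(a, b) / (norm(a) * norm(b))

def normalize(a):
    n = norm(a)
    return [x / n for x in a]

a = [3.0, 4.0]
b = [6.0, 8.0]   # same direction as a, twice the length
c = [4.0, 3.0]   # different direction, same length as a

dot_ab = dot(a, b)                  # 50.0, inflated by the length of b
cos_ab = cosine_similarity(a, b)    # same direction, so the cosine is 1
cos_ac = cosine_similarity(a, c)

# After normalizing, the dot product and the cosine are the same number.
dot_unit_ac = dot(normalize(a), normalize(c))
```

In practice this is why many pipelines normalize embeddings once at indexing time and then use the cheaper dot product for search.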

Cosine similarity often works better than Euclidean distance in high dimensions

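Cosine similarity measures orientation, not magnitude, so differences in vector norm do not change the score. Euclidean distance mixes direction and norm, which can hide directional similarity between vectors of very different lengths. For unit-normalized vectors the two rank neighbors identically, because the squared Euclidean distance equals 2 - 2 * cosine.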
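
A small illustration, assuming toy 4-dimensional vectors in place of real high-dimensional embeddings. The names `short_doc`, `long_doc`, and `other_doc` are hypothetical. Euclidean distance treats the scaled copy as far away, while cosine similarity treats it as identical in direction.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def cosine_similarity(a, b):
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

def normalize(a):
    n = norm(a)
    return [x / n for x in a]

short_doc = [1.0, 2.0, 0.0, 1.0]
long_doc = [10.0, 20.0, 0.0, 10.0]   # same direction, 10x the norm
other_doc = [2.0, 1.0, 1.0, 1.0]     # different direction, similar norm

# Euclidean distance ranks other_doc as the nearer neighbor...
euclid_long = euclidean(short_doc, long_doc)
euclid_other = euclidean(short_doc, other_doc)

# ...while cosine similarity ranks long_doc as the nearer neighbor.
cos_long = cosine_similarity(short_doc, long_doc)
cos_other = cosine_similarity(short_doc, other_doc)

# On unit vectors, squared Euclidean distance equals 2 - 2 * cosine.
u, v = normalize(short_doc), normalize(other_doc)
sq_dist_unit = euclidean(u, v) ** 2
```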

ALiBi extrapolates to longer sequences better than learned position embeddings

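ALiBi (Attention with Linear Biases) adds no position embeddings to the input. Instead, it subtracts from each attention score a penalty that grows linearly with the distance between query and key, using a fixed slope per head. Because the bias depends only on relative distance, the same rule applies to sequences longer than those seen in training, whereas a learned position embedding table has a fixed size.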
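
A rough sketch of the bias ALiBi adds to attention scores, in plain Python. The function names are hypothetical. The slope schedule shown is the geometric sequence commonly described for power-of-two head counts, and the causal mask is an assumption of this example.

```python
def alibi_slopes(num_heads):
    # Geometric sequence 2^(-8/n), 2^(-16/n), ...; assumes num_heads is a power of two.
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]

def alibi_bias(seq_len, slope):
    # bias[i][j] = -slope * (i - j) for keys j at or before query i.
    # Future keys (j > i) get -inf, i.e. a causal mask.
    return [
        [-slope * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
        for i in range(seq_len)
    ]

slopes = alibi_slopes(8)                  # 1/2, 1/4, ..., 1/256
bias_short = alibi_bias(4, slopes[0])     # length seen in "training"
bias_long = alibi_bias(8, slopes[0])      # longer length at inference

# The bias is added to the query-key scores before the softmax.
# No table is indexed by absolute position, so nothing runs out at length 8.
```

The longer matrix contains the shorter one as its top-left block, which is the property that makes extrapolation possible: positions beyond the training length just receive a larger penalty from the same rule.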

768-dim BERT embeddings capture bidirectional context

BERT-base outputs a 768-dimensional vector for each token. Masked language modeling trains the model to predict a hidden token from both its left and right neighbors, so each vector encodes bidirectional context.

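Soft targets carry more information than hard labels

Soft targets carry more information than hard labels because they encode class similarities. A hard label only names the correct class, while a soft target is a probability distribution that also shows which incorrect classes resemble the correct one.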
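
A minimal sketch comparing the two kinds of target. The class names and `teacher_logits` are invented for illustration, and the temperature parameter is an assumption borrowed from knowledge distillation, where soft targets usually come from a teacher model.

```python
import math

def softmax(logits, temperature=1.0):
    # Higher temperature flattens the distribution and exposes small probabilities.
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["cat", "dog", "truck"]
teacher_logits = [5.0, 3.0, -2.0]        # hypothetical teacher output for a cat image

hard_label = [1.0, 0.0, 0.0]             # "dog" and "truck" look equally wrong
soft_t1 = softmax(teacher_logits)        # "dog" gets far more mass than "truck"
soft_t4 = softmax(teacher_logits, temperature=4.0)   # similarity is more visible
```

The hard label gives the student the same signal for "dog" and "truck", while both soft distributions say that a cat looks more like a dog than like a truck.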

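Mean pooling often outperforms [CLS] for sentence similarity tasks

Mean pooling averages the embeddings of all tokens, so it often captures overall sentence meaning better than the [CLS] token embedding alone, which is not trained to summarize the sentence unless the model is fine-tuned for that purpose.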
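
The two pooling strategies can be sketched in a few lines. The 3-dimensional `token_embeddings` below are toy values standing in for real encoder outputs, and the helper names are hypothetical. Padding is excluded through an attention mask, which is an assumption about how the batch was built.

```python
def mean_pool(token_embeddings, attention_mask):
    # Average only the real tokens; positions with mask 0 are padding.
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    dim = len(token_embeddings[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]

def cls_pool(token_embeddings):
    # Use the first token's vector as the sentence representation.
    return token_embeddings[0]

token_embeddings = [
    [0.0, 0.0, 1.0],   # [CLS]
    [1.0, 0.0, 0.0],   # first word
    [0.0, 1.0, 0.0],   # second word
    [9.0, 9.0, 9.0],   # [PAD], must not influence the result
]
attention_mask = [1, 1, 1, 0]

sentence_vec = mean_pool(token_embeddings, attention_mask)   # every token contributes
cls_vec = cls_pool(token_embeddings)                         # only position 0 contributes
```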
