Cosine similarity

Cosine similarity formula: cos(θ) = (A · B) / (||A|| ||B||)

Image: Tupperware Ltd., CC BY-SA 3.0, via Wikimedia Commons

Cosine similarity is the cosine of the angle between two vectors, computed as their dot product divided by the product of their magnitudes (Euclidean norms). It ranges from -1 (opposite directions) through 0 (orthogonal) to 1 (same direction). Because it depends only on direction, scaling either vector by a positive factor leaves it unchanged.

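For instance, if A = [1, 2] and B = [2, 4], the dot product is A · B = 1*2 + 2*4 = 10. The magnitudes are ||A|| = √(1² + 2²) = √5 and ||B|| = √(2² + 4²) = √20. Thus the cosine similarity is 10 / (√5 * √20) = 10 / √100 = 10 / 10 = 1. This is the maximum possible value, as expected: B = 2A, so the two vectors point in the same direction.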
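
The calculation above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation; the function name cosine_similarity and the choice to raise an error for zero vectors are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    # Dot product: sum of element-wise products.
    dot = sum(x * y for x, y in zip(a, b))

    # Euclidean norms (magnitudes) of each vector.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))

    # The angle is undefined if either vector has zero magnitude.
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")

    return dot / (norm_a * norm_b)


# A = [1, 2] and B = [2, 4] point in the same direction.
print(cosine_similarity([1, 2], [2, 4]))

# Perpendicular vectors have a dot product of zero.
print(cosine_similarity([1, 0], [0, 1]))
```

Because of floating-point rounding, the first result may differ from 1 in the last decimal places, so comparisons are safer with math.isclose than with ==.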

Cosine similarity is widely used in information retrieval and text mining, where documents are represented as vectors of word occurrences and compared by direction rather than length. A long document and a short one with similar word proportions therefore still score as similar. It underlies the vector space models used in applications such as search engines and recommendation systems.
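
As a rough sketch of the document-comparison use case, the snippet below turns two short texts into word-count vectors and compares them. The helper names word_counts and document_similarity are hypothetical, the tokenization is a plain whitespace split chosen for brevity, and returning 0.0 for an empty document is a convention assumed here rather than part of the definition.

```python
import math
from collections import Counter


def word_counts(text):
    """Count word occurrences using a simple lowercase whitespace split."""
    return Counter(text.lower().split())


def document_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a = word_counts(text_a)
    counts_b = word_counts(text_b)

    # Only words present in both documents contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)

    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # Assumed convention: an empty document is similar to nothing.
    if norm_a == 0 or norm_b == 0:
        return 0.0

    return dot / (norm_a * norm_b)


print(document_similarity("the cat sat on the mat", "the cat sat on the mat"))
print(document_similarity("red apple", "blue sky"))
```

Identical texts score approximately 1, and texts with no words in common score 0. Since word counts are never negative, scores computed this way fall between 0 and 1.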
