Locality-sensitive hashing

Locality-sensitive hashing (LSH) hashes similar items into the same buckets

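Locality-sensitive hashing (LSH) is a technique that hashes similar input items into the same "buckets" with high probability. Unlike conventional hashing, which tries to avoid collisions, LSH deliberately makes collisions between similar items likely. This makes it useful for tasks like data clustering and nearest neighbor search: a query only has to be compared against the items in its own bucket rather than the whole dataset, trading a small loss in accuracy for a large gain in speed.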
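As a rough illustration of how "similar items land in the same bucket with high probability" can work, here is a minimal sketch of one common LSH family, random hyperplane hashing for cosine similarity. The text above does not say which hash family is meant, so this choice is an assumption, and the names make_hyperplanes, lsh_signature and hamming, as well as the toy vectors, are hypothetical.

```python
import random

def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes, each given by a Gaussian normal vector."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector is on its non-negative side, else 0."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, vector))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)

def hamming(a, b):
    """Number of positions where two signatures differ."""
    return sum(x != y for x, y in zip(a, b))

# Toy feature vectors (assumed for illustration, not real image features).
planes = make_hyperplanes(num_bits=16, dim=3)
cat_a = [1.0, 0.9, 0.1]
cat_b = [0.9, 1.0, 0.1]
car = [-1.0, 0.2, -0.8]

sig_a = lsh_signature(cat_a, planes)
sig_b = lsh_signature(cat_b, planes)
sig_car = lsh_signature(car, planes)

# Vectors separated by a small angle tend to disagree on few bits.
print("cat_a vs cat_b:", hamming(sig_a, sig_b))
print("cat_a vs car:  ", hamming(sig_a, sig_car))
```

Each bit disagrees with probability proportional to the angle between the two vectors, so nearby vectors usually share most or all of their signature. The exact counts printed depend on the random seed.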

Example

In a dataset of images, LSH can group similar images (e.g., pictures of cats) into the same bucket, allowing for faster retrieval of similar images when a query is made.

Because similar items tend to share a bucket, LSH makes approximate nearest neighbor search efficient: only the items in the query's bucket need to be examined. Approximate nearest neighbor search is widely used in applications such as recommendation systems and image retrieval.
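The retrieval step described above can be sketched as a small bucket index. This is a minimal sketch under stated assumptions, not a production design: it uses random hyperplane hashing with several independent tables, and the class LSHIndex, its parameters, and the toy "image" vectors are all hypothetical.

```python
import math
import random
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class LSHIndex:
    """Approximate nearest neighbor index built from random hyperplane hash tables."""

    def __init__(self, dim, num_bits=8, num_tables=4, seed=0):
        rng = random.Random(seed)
        self.vectors = {}
        self.tables = []  # list of (hyperplanes, buckets) pairs
        for _ in range(num_tables):
            planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]
            self.tables.append((planes, defaultdict(list)))

    @staticmethod
    def _key(vector, planes):
        # Bucket key: which side of each hyperplane the vector falls on.
        return tuple(sum(p * x for p, x in zip(plane, vector)) >= 0 for plane in planes)

    def add(self, item_id, vector):
        self.vectors[item_id] = vector
        for planes, buckets in self.tables:
            buckets[self._key(vector, planes)].append(item_id)

    def query(self, vector, k=3):
        # Collect candidates only from the query's bucket in each table,
        # then rank that (usually small) candidate set by exact cosine similarity.
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(self._key(vector, planes), []))
        ranked = sorted(candidates, key=lambda i: -cosine(vector, self.vectors[i]))
        return ranked[:k]

# Toy vectors standing in for image features (assumed for illustration).
images = {
    "cat_1": [1.0, 0.9, 0.1],
    "cat_2": [0.9, 1.0, 0.1],
    "car_1": [-1.0, 0.2, -0.8],
}
index = LSHIndex(dim=3)
for name, vec in images.items():
    index.add(name, vec)

print(index.query([0.95, 0.95, 0.1], k=2))
```

Using several tables is a common way to reduce the chance that a true neighbor is missed because a single hyperplane happened to separate it from the query. More bits per table make buckets smaller and faster to scan but raise that miss rate.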

Related concepts

Consistent hashing

Consistent hashing distributes data across nodes and minimizes how many keys must be remapped when nodes (servers) are added or removed.

Curse of dimensionality

In high dimensions, data becomes sparse and the distances between points become similar to one another, so nearest neighbors are less distinct and nearest neighbor search is less reliable.

UMAP vs t-SNE

UMAP is typically faster than t-SNE, in part because it uses approximate nearest neighbor search and optimizes a cross-entropy objective.

Greedy vs beam search decoding

Greedy decoding picks the single most probable token at each step; beam search maintains the k best candidate sequences.

Chebyshev distance

Chebyshev distance is named after Pafnuty Chebyshev
