Locality-sensitive hashing (LSH) hashes similar items into the same buckets
Image: MikeBogosian, CC BY-SA 4.0, via Wikimedia Commons
Locality-sensitive hashing (LSH) is a technique that hashes similar input items into the same "buckets" with high probability. This makes LSH particularly useful for tasks like data clustering and nearest neighbor search, where grouping similar items together sharply reduces the number of comparisons needed, at the cost of occasionally missing a true match.
Example
In a dataset of images, LSH can group similar images (e.g., pictures of cats) into the same bucket, allowing for faster retrieval of similar images when a query is made.
Because similar items tend to share buckets, a query only needs to be compared against the items in its own bucket rather than against the whole dataset. This is what makes LSH a common building block for approximate nearest neighbor search, which is widely used in applications such as recommendation systems and image retrieval.
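As a rough illustration of the bucketing idea, here is a minimal sketch of one common LSH family, random-hyperplane hashing for cosine similarity. The page does not say which LSH scheme it has in mind, so this choice is an assumption. The function names (`make_hyperplanes`, `signature`, `build_index`, `query`) and the three-number "image feature" vectors are made up for the example.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplanes (as normal vectors) with Gaussian components."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(items, planes):
    """Group item names by signature; each distinct signature is a bucket."""
    buckets = defaultdict(list)
    for name, vec in items.items():
        buckets[signature(vec, planes)].append(name)
    return buckets


def query(vec, buckets, planes):
    """Return only the candidates that share the query's bucket."""
    return buckets.get(signature(vec, planes), [])


# Hypothetical feature vectors, invented for illustration only.
items = {
    "cat_1": [1.0, 0.9, 0.1],
    "cat_2": [0.9, 1.0, 0.2],
    "car_1": [-0.8, 0.1, 1.0],
}
planes = make_hyperplanes(num_planes=4, dim=3)
buckets = build_index(items, planes)
candidates = query([1.0, 1.0, 0.1], buckets, planes)
```

Vectors that point in similar directions agree on most hyperplane bits and so tend to share a bucket, but this is not guaranteed. Using more hyperplanes makes buckets smaller and more selective, while practical systems usually build several independent tables to recover neighbors that a single table misses.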
What consistent hashing solves
Consistent hashing minimizes key redistribution when servers are added or removed
What consistent hashing does
Consistent hashing distributes data across nodes, minimizing remapping when nodes join or leave
The curse of dimensionality makes nearest neighbor search unreliable
High dimensionality dilutes data density, making nearest neighbors less distinct and search unreliable
UMAP is faster than t-SNE
UMAP is faster due to approximate nearest neighbors and cross-entropy optimization
Greedy vs beam search decoding
Greedy decoding picks the single best token at each step; beam search maintains k candidate sequences
Chebyshev distance
Chebyshev distance is named after Pafnuty Chebyshev