Embeddings are used for semantic search, retrieval-augmented generation (RAG), and enhancing language models' understanding
Image: Cretep, Public domain, via Wikimedia Commons
[CLS] pooling: uses the first token's embedding as the sentence representation
Mean pooling often outperforms [CLS] pooling on sentence similarity tasks: averaging all token embeddings captures overall sentence meaning better than the single [CLS] token embedding
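
To make the difference between the two pooling strategies concrete, here is a minimal pure-Python sketch. The token vectors, the attention mask, and the helper names `cls_pooling` and `mean_pooling` are illustrative assumptions, not part of any specific library; a real model would produce vectors with hundreds of dimensions.

```python
from typing import List

Vector = List[float]


def cls_pooling(token_embeddings: List[Vector]) -> Vector:
    """Use the first token's embedding ([CLS] in BERT-style models)."""
    return list(token_embeddings[0])


def mean_pooling(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the embeddings of all non-padding tokens."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:  # skip padding positions (mask value 0)
            for i, x in enumerate(vec):
                total[i] += x
            count += 1
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]


# Toy 2-dimensional token embeddings (hypothetical values).
tokens = [
    [1.0, 0.0],  # [CLS]
    [0.0, 2.0],  # first word
    [2.0, 4.0],  # second word
    [9.0, 9.0],  # padding, ignored by mean pooling
]
mask = [1, 1, 1, 0]

cls_vec = cls_pooling(tokens)
mean_vec = mean_pooling(tokens, mask)
print(cls_vec, mean_vec)
```

Note that the mean-pooled vector depends on every real token, while the [CLS] vector depends only on position 0; the mask keeps padding from diluting the average.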
all-MiniLM-L6-v2 (384-dim) is optimized for fast sentence similarity with 6 layers
768-dim BERT embeddings capture bidirectional context from masked language modeling
Retrieval-augmented generation
RAG enables LLMs to access new information without retraining
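
The retrieval step behind this idea can be sketched in a few lines. This is a toy illustration under stated assumptions: the passages, their 3-dimensional vectors, and the helper names `retrieve` and `build_prompt` are all hypothetical, and a real system would obtain the vectors from an embedding model and send the prompt to an LLM.

```python
import math
from typing import List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec: Vector, corpus: List[Tuple[str, Vector]], k: int = 1) -> List[str]:
    """Return the k passages whose embeddings are most similar to the query."""
    ranked = sorted(
        corpus,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, passages: List[str]) -> str:
    """Place retrieved passages ahead of the question so the model can use them."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical passages with hand-written embeddings.
corpus = [
    ("The 2024 release added streaming support.", [0.9, 0.1, 0.0]),
    ("Mean pooling averages token embeddings.", [0.0, 0.2, 0.9]),
]
query_vec = [0.8, 0.2, 0.1]  # assumed embedding of the question below

passages = retrieve(query_vec, corpus, k=1)
prompt = build_prompt("What did the 2024 release add?", passages)
print(prompt)
```

Updating what the model can draw on only requires adding passages to `corpus`; no model weights change, which is the sense in which retraining is avoided.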
Weight tying in language models: reduces the number of parameters by sharing the embedding and output projection matrices
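
A small sketch may help show both the mechanism and the saving. It assumes a vocabulary of 30,000 tokens and a hidden size of 768 purely for illustration, ignores bias terms, and uses hypothetical helper names (`embed`, `tied_logits`, `param_counts`).

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def embed(embedding: Matrix, token_id: int) -> Vector:
    """Input side: look up the row for a token id."""
    return embedding[token_id]


def tied_logits(embedding: Matrix, hidden: Vector) -> List[float]:
    """Output side: reuse the same matrix, scoring each vocabulary row against the hidden state."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in embedding]


def param_counts(vocab_size: int, hidden_dim: int) -> Tuple[int, int]:
    """Embedding + output projection parameters, untied vs. tied (biases ignored)."""
    untied = 2 * vocab_size * hidden_dim
    tied = vocab_size * hidden_dim
    return untied, tied


# Toy shared matrix: 3 vocabulary entries, 2 dimensions (hypothetical values).
E = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]
hidden = embed(E, 2)             # stand-in for the model's final hidden state
logits = tied_logits(E, hidden)  # one score per vocabulary entry

untied, tied = param_counts(vocab_size=30000, hidden_dim=768)
print(logits, untied, tied)
```

With these assumed sizes, tying halves the parameters spent on the two matrices, from 46,080,000 to 23,040,000.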