
Mean pooling: averages all token embeddings to create a sentence embedding
[CLS] pooling: uses the first token's embedding (the [CLS] token) as the sentence representation
Mean pooling vs. [CLS] pooling: mean pooling often outperforms [CLS] pooling on sentence similarity tasks
Averaging over all tokens tends to capture overall sentence meaning better than the single [CLS] token embedding
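The two pooling strategies above can be sketched in a few lines of plain Python. This is a minimal illustration, not any library's API: the function names `mean_pool` and `cls_pool`, the toy 2-dimensional embeddings, and the attention mask are all assumptions made for the example. The mask is included because padding tokens are normally excluded from the average.

```python
from typing import List

Vector = List[float]


def mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the embeddings of non-padding tokens (mask value 1)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, value in enumerate(vec):
                total[i] += value
            count += 1
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return a copy of the first token's embedding, assumed to be [CLS]."""
    return list(token_embeddings[0])


# Toy embeddings (made up): [CLS], two word tokens, one padding token.
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

sentence_mean = mean_pool(tokens, mask)  # [3.0, 2.0]
sentence_cls = cls_pool(tokens)          # [1.0, 0.0]
```

The mean-pooled vector depends on every unmasked token, while the [CLS] vector ignores all positions except the first.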
Graph neural network
Graph pooling (readout): reduces a graph's node embeddings to a single vector for graph-level prediction
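A readout of this kind can be sketched as an element-wise reduction over node embeddings. The function name `graph_readout`, the `mode` argument, and the toy node vectors are hypothetical; real GNN libraries offer their own pooling operators, which this sketch does not try to reproduce.

```python
from typing import List


def graph_readout(node_embeddings: List[List[float]], mode: str = "mean") -> List[float]:
    """Collapse per-node vectors into one graph-level vector."""
    if not node_embeddings:
        raise ValueError("graph has no nodes")
    # zip(*rows) regroups the values by feature dimension.
    columns = list(zip(*node_embeddings))
    if mode == "sum":
        return [sum(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown readout mode: {mode}")


# Three nodes with 2-dimensional embeddings (made-up values).
nodes = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]]

graph_mean = graph_readout(nodes, "mean")  # [3.0, 2.0]
graph_sum = graph_readout(nodes, "sum")    # [9.0, 6.0]
graph_max = graph_readout(nodes, "max")    # [5.0, 4.0]
```

All three reductions are invariant to node ordering, which is why they are common choices for graph-level outputs.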
Weight tying in language models: shares one matrix between the input embedding and the output projection
Tying reduces the number of parameters because the embedding matrix is reused as the output projection instead of learning a second matrix of the same size
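To make the sharing concrete, here is a small sketch in plain Python. The class name `TiedLM`, the helper `parameter_counts`, and the vocabulary and model sizes are assumptions chosen for illustration; biases and all other layers are ignored.

```python
from typing import List, Tuple


class TiedLM:
    """Toy model whose output projection reuses the embedding matrix."""

    def __init__(self, embedding: List[List[float]]):
        self.embedding = embedding          # shape: vocab_size x d_model
        self.output_projection = embedding  # same object, not a copy

    def embed(self, token_id: int) -> List[float]:
        return self.embedding[token_id]

    def logits(self, hidden: List[float]) -> List[float]:
        # One logit per vocabulary entry: dot product of hidden with each row.
        return [sum(h * w for h, w in zip(hidden, row))
                for row in self.output_projection]


def parameter_counts(vocab_size: int, d_model: int) -> Tuple[int, int]:
    """Return (untied, tied) parameter counts for embedding plus projection."""
    untied = 2 * vocab_size * d_model
    tied = vocab_size * d_model
    return untied, tied


model = TiedLM([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
scores = model.logits([2.0, 3.0])           # [2.0, 3.0, 5.0]

# Hypothetical sizes: 50,000-token vocabulary, 768-dimensional model.
untied, tied = parameter_counts(50_000, 768)  # (76_800_000, 38_400_000)
```

Because both attributes point at the same list, an update to the embedding is automatically an update to the output projection.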
Embedding layer: maps discrete token IDs to dense learned vectors
Each token ID selects one row of a trainable matrix, giving the neural network a dense vector to process
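The lookup itself is just row indexing. The sketch below assumes a randomly initialized table; `make_embedding_table`, `embed`, the table sizes, and the initialization scale of 0.02 are illustrative choices, and in a real model the rows would be updated by training.

```python
import random
from typing import List


def make_embedding_table(vocab_size: int, d_model: int, seed: int = 0) -> List[List[float]]:
    """Create a vocab_size x d_model table of small random values."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(d_model)]
            for _ in range(vocab_size)]


def embed(table: List[List[float]], token_ids: List[int]) -> List[List[float]]:
    """Look up one row of the table per token ID."""
    return [table[token_id] for token_id in token_ids]


table = make_embedding_table(vocab_size=10, d_model=4)
vectors = embed(table, [3, 7, 3])  # three vectors of length 4
```

The same token ID always yields the same row, so the first and third vectors here are identical.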
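OpenAI text-embedding-3-large (3072 dimensions by default; 1536 is the default size of text-embedding-3-small): used for semantic search and RAG
Used for semantic search and retrieval-augmented generation (RAG), where retrieved passages give a language model relevant context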
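Semantic search over embeddings usually comes down to ranking documents by vector similarity. The sketch below uses cosine similarity on tiny made-up 3-dimensional vectors; it does not call any embedding API, and `cosine_similarity`, `search`, and the document IDs are hypothetical names for the example.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: List[float], corpus: Dict[str, List[float]], top_k: int = 2) -> List[str]:
    """Return the IDs of the top_k documents most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in corpus.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


# Made-up vectors standing in for real embedding model output.
corpus = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}
results = search([1.0, 0.0, 0.0], corpus)  # ["doc_cats", "doc_dogs"]
```

In a RAG setup, the text of the top-ranked documents would then be placed in the language model's prompt.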