![What [CLS] pooling does: uses the first token's embedding as the sentence representation](https://upload.wikimedia.org/wikipedia/commons/7/79/British.coalfields.19th.century.jpg)
[CLS] pooling: uses the embedding of the first ([CLS]) token as the sentence representation
Image: myself, CC BY-SA 3.0, via Wikimedia Commons
Mean pooling often outperforms [CLS] pooling on sentence similarity tasks
Averaging over all tokens tends to capture overall sentence meaning better than the [CLS] token embedding alone
What mean pooling does: averages all token embeddings to get a sentence embedding
Mean pooling: averages the embeddings of all non-padding tokens to produce one fixed-size sentence embedding
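
The two pooling strategies above can be compared on a toy example. This is a minimal, dependency-free sketch rather than any library's implementation: the helper names `cls_pool` and `mean_pool`, the 2-dimensional embeddings, and the attention mask are all made up for illustration, and it assumes a BERT-style input where [CLS] sits at position 0.

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return the first token's embedding ([CLS] is assumed to be at position 0)."""
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the embeddings of real tokens, skipping padding (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]


# Toy sequence: [CLS], two word tokens, one padding token (illustrative values).
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

cls_vec = cls_pool(tokens)          # only the first row is used
mean_vec = mean_pool(tokens, mask)  # average of the three unmasked rows
```

Masking matters in practice: without it, padding vectors would be averaged in and sentences of different lengths in the same batch would get skewed embeddings.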
What weight tying does in language models: shares the input embedding and output projection matrices
Weight tying reduces the parameter count by reusing the input embedding matrix as the output projection matrix
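
A small sketch can make the sharing concrete. The class `TiedEmbedding` below is hypothetical and written in plain Python for illustration: it stores a single vocabulary-by-dimension matrix and uses it both to look up input embeddings and to project a hidden state onto vocabulary logits. The weights and hidden state are toy values.

```python
from typing import List

Vector = List[float]


class TiedEmbedding:
    """Toy model head that reuses one matrix for input lookup and output logits."""

    def __init__(self, weights: List[Vector]) -> None:
        # Shape: vocab_size x dim. This is the only matrix stored.
        self.weights = weights

    def embed(self, token_id: int) -> Vector:
        """Input side: row lookup in the shared matrix."""
        return self.weights[token_id]

    def logits(self, hidden: Vector) -> Vector:
        """Output side: dot product of the hidden state with every row."""
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.weights]

    def num_parameters(self) -> int:
        return len(self.weights) * len(self.weights[0])


# Vocabulary of 3 tokens, embedding dimension 2 (illustrative values).
model = TiedEmbedding([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
scores = model.logits([2.0, 3.0])

tied_params = model.num_parameters()  # vocab_size * dim
untied_params = 2 * tied_params       # separate input and output matrices
```

With a vocabulary of size V and embedding dimension d, tying saves V × d parameters, which is a large share of the total in small models with big vocabularies.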
OpenAI text-embedding-3-large (3072-dim by default) is used for: semantic search and RAG
Used for semantic search, retrieval-augmented generation (RAG), and other tasks that compare texts by meaning, such as clustering and classification
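
Semantic search over embeddings usually comes down to ranking documents by vector similarity to a query. The sketch below shows that step only, using cosine similarity on made-up 3-dimensional vectors. It does not call any embedding API: the document names, the vectors, and the helpers `cosine_similarity` and `rank_documents` are assumptions for illustration.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query: Vector, docs: Dict[str, Vector]) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy "embeddings"; real ones would come from an embedding model.
docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.0],
}
query = [0.9, 0.1, 0.0]

ranking = rank_documents(query, docs)
top_name = ranking[0][0]
```

In a RAG pipeline, the top-ranked documents would then be passed to the language model as context. At scale, the exhaustive loop is typically replaced by an approximate nearest-neighbor index.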
What the embedding layer does: maps discrete token IDs to dense learned vectors
The embedding layer looks up each token ID in a learned table and returns a dense vector for the rest of the network to process
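
An embedding layer is, at its core, a table lookup. The sketch below illustrates that with a hypothetical `EmbeddingTable` class in plain Python. The random initialization stands in for weights that a real model would learn during training, and the vocabulary size, dimension, and token IDs are arbitrary.

```python
import random
from typing import List

Vector = List[float]


class EmbeddingTable:
    """Toy embedding layer: one dense vector per token ID."""

    def __init__(self, vocab_size: int, dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Small random values stand in for learned weights.
        self.table = [
            [rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(vocab_size)
        ]

    def __call__(self, token_ids: List[int]) -> List[Vector]:
        """Map a sequence of token IDs to their vectors by row lookup."""
        for token_id in token_ids:
            if not 0 <= token_id < len(self.table):
                raise IndexError(f"token ID {token_id} is outside the vocabulary")
        return [self.table[token_id] for token_id in token_ids]


emb = EmbeddingTable(vocab_size=10, dim=4)
vectors = emb([3, 7, 3])  # the same ID always maps to the same row
```

Because the lookup depends only on the ID, repeated tokens get identical vectors at this stage. Position and context are added by later layers.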
Graph neural network
Graph pooling (readout) reduces a graph's node embeddings to a single vector for graph-level prediction
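
The simplest form of graph pooling is a global readout that aggregates all node embeddings element-wise. The function `graph_readout` below is a hypothetical, dependency-free sketch of sum, mean, and max readouts on toy node vectors. It is not the API of any graph library, and it ignores edges, which are assumed to have been used already by earlier message-passing layers.

```python
from typing import List

Vector = List[float]


def graph_readout(node_embeddings: List[Vector], mode: str = "mean") -> Vector:
    """Collapse per-node vectors into one graph-level vector."""
    if not node_embeddings:
        raise ValueError("graph has no nodes")
    # Transpose so each tuple holds one feature across all nodes.
    columns = list(zip(*node_embeddings))
    if mode == "sum":
        return [sum(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown readout mode: {mode}")


# Three nodes with 2-dimensional embeddings (illustrative values).
nodes = [[1.0, 4.0], [2.0, 0.0], [3.0, 2.0]]

graph_sum = graph_readout(nodes, "sum")
graph_mean = graph_readout(nodes, "mean")
graph_max = graph_readout(nodes, "max")
```

All three aggregations give the same result regardless of node order, which is the property a graph-level readout needs since graphs have no canonical node ordering.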