What [CLS] pooling does: uses the first token's embedding as the sentence representation

[CLS] pooling: uses the embedding of the first token, the special [CLS] token, as the sentence representation
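A minimal sketch of [CLS] pooling in plain Python. The token embeddings are made-up numbers, and the sketch assumes a BERT-style encoder that places [CLS] at position 0; `cls_pool` is a hypothetical helper name.

```python
# Hypothetical toy encoder output: one embedding per token.
# Assumption: position 0 holds the [CLS] token, as in BERT-style models.
token_embeddings = [
    [0.9, 0.1, 0.3],  # [CLS]
    [0.2, 0.8, 0.5],  # "hello"
    [0.4, 0.4, 0.7],  # "world"
    [0.0, 0.1, 0.2],  # [SEP]
]


def cls_pool(embeddings):
    """Return a copy of the first token's embedding as the sentence vector."""
    return list(embeddings[0])


sentence_vector = cls_pool(token_embeddings)
```

All other token embeddings are ignored, so the quality of the result depends on how well the model was trained to summarize the sentence in that one position.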

Related concepts

Mean pooling often outperforms [CLS] pooling on sentence similarity tasks

Mean pooling often captures overall sentence meaning better than the [CLS] token embedding alone

What mean pooling does: averages all token embeddings to get a sentence embedding

Mean pooling: averages all token embeddings to create a sentence embedding
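A minimal sketch of mean pooling in plain Python. Skipping padding positions via an attention mask is an assumption added here (it is common practice but not stated on the card); `mean_pool` and the toy values are hypothetical.

```python
def mean_pool(embeddings, attention_mask):
    """Average the embeddings of real (non-padding) tokens, dimension by dimension."""
    kept = [vec for vec, m in zip(embeddings, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy token embeddings; the last row stands in for a padding token.
token_embeddings = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
    [0.0, 0.0],  # padding, excluded by the mask
]
attention_mask = [1, 1, 1, 0]

sentence_vector = mean_pool(token_embeddings, attention_mask)
```

Without the mask, padding rows would pull the average toward zero for short sentences in a padded batch.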

What weight tying does in language models: shares embedding and output projection matrices

Weight tying reduces the parameter count by using a single matrix for both the input embedding and the output projection
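A rough sketch of weight tying in plain Python, assuming a tiny vocabulary and hidden size. The matrix values and the names `shared_weights`, `embed`, and `output_logits` are illustrative only.

```python
vocab_size, hidden_dim = 4, 3

# One shared matrix of shape (vocab_size, hidden_dim); values are made up.
shared_weights = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]


def embed(token_id):
    """Input side: look up the row for a token ID."""
    return shared_weights[token_id]


def output_logits(hidden):
    """Output side: score every vocabulary entry against the same rows."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in shared_weights]


# Parameter counts for the embedding plus output projection (biases ignored).
tied_params = vocab_size * hidden_dim
untied_params = 2 * vocab_size * hidden_dim

logits = output_logits([1.0, 0.0, 0.0])
```

With a realistic vocabulary (tens of thousands of tokens) and hidden size, the saved matrix can account for a large share of a small model's parameters.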

OpenAI text-embedding-3-large (3072 dimensions by default) is used for: semantic search and RAG

Used for semantic search and retrieval-augmented generation (RAG), where retrieved passages give a language model relevant context

What the embedding layer does: maps discrete token IDs to dense learned vectors

Embeddings convert token IDs to dense vectors for neural network processing
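A minimal sketch of an embedding lookup in plain Python. The table is randomly initialised here purely for illustration; in a real model its values are learned during training. `embedding_table` and `embed_tokens` are hypothetical names.

```python
import random

random.seed(0)
vocab_size, embed_dim = 5, 4

# One dense vector per token ID; random values stand in for learned ones.
embedding_table = [
    [random.uniform(-1.0, 1.0) for _ in range(embed_dim)]
    for _ in range(vocab_size)
]


def embed_tokens(token_ids):
    """Map each discrete token ID to its dense vector by row lookup."""
    return [embedding_table[t] for t in token_ids]


vectors = embed_tokens([2, 0, 2])
```

The same token ID always maps to the same row, so the first and third vectors above are identical.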

Graph neural network

Graph pooling reduces graphs to single vectors for graph-level prediction
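A small sketch of a graph-level readout in plain Python, assuming the simplest family of graph pooling (sum, mean, or max over node features). Learned or hierarchical pooling methods are not covered; `readout` and the node features are hypothetical.

```python
def readout(node_features, how="mean"):
    """Collapse per-node feature vectors into one graph-level vector."""
    dim = len(node_features[0])
    # Gather each feature dimension across all nodes.
    columns = [[node[i] for node in node_features] for i in range(dim)]
    if how == "sum":
        return [sum(col) for col in columns]
    if how == "mean":
        return [sum(col) / len(col) for col in columns]
    if how == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown readout: {how}")


# Toy graph with three nodes and two features per node.
node_features = [
    [1.0, 0.0],
    [2.0, 4.0],
    [3.0, 2.0],
]

graph_sum = readout(node_features, "sum")
graph_mean = readout(node_features, "mean")
graph_max = readout(node_features, "max")
```

Each of these readouts gives the same result regardless of node order, which is why they are a common choice for graph-level prediction.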
