384-dim all-MiniLM-L6-v2 optimizes: fast sentence similarity with 6 layers

all-MiniLM-L6-v2 is optimized for fast sentence similarity: a compact 6-layer model that outputs 384-dim sentence embeddings

Related concepts

mean pooling often outperforms [CLS] for sentence similarity tasks

Mean pooling averages every non-padding token embedding, so it often captures overall sentence meaning better than the [CLS] token embedding alone
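
As a rough illustration of the card above, mean pooling can be sketched in plain Python. The toy vectors, the mask, and the `mean_pool` helper are hypothetical stand-ins for a real model's token outputs, not any library's API.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sentence, skipping padding (mask == 0)."""
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

# Toy sentence: 3 token slots, 4-dim embeddings (made-up values).
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],  # padding slot, excluded by the mask
]
mask = [1, 1, 0]

sentence_vec = mean_pool(tokens, mask)
print(sentence_vec)
```

Excluding padding matters: averaging over the padded slot as well would pull the sentence vector toward values that carry no meaning.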

batch size affects generalization: larger batches find sharper minima

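Larger batch sizes give more accurate, less noisy gradient estimates, but that reduced noise tends to steer training toward sharper minima, which are associated with worse generalization; smaller batches add noise that favors flatter minima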
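
The gradient-estimate part of this card can be illustrated with a small simulation. This is a sketch under simplifying assumptions: per-example "gradients" are just Gaussian samples around a true value of 1.0, and `minibatch_gradient_std` is a hypothetical helper. It shows only that mini-batch noise shrinks as batch size grows, not the shape of the minima that training reaches.

```python
import random
import statistics

random.seed(0)

# Toy problem: per-example gradients are noisy samples around a true gradient of 1.0.
population = [random.gauss(1.0, 2.0) for _ in range(10_000)]

def minibatch_gradient_std(batch_size, trials=500):
    """Standard deviation of the mini-batch mean gradient across random batches."""
    estimates = [
        statistics.fmean(random.sample(population, batch_size))
        for _ in range(trials)
    ]
    return statistics.stdev(estimates)

noise_small = minibatch_gradient_std(8)
noise_large = minibatch_gradient_std(512)
print(f"batch=8: {noise_small:.3f}  batch=512: {noise_large:.3f}")
```

The spread of the estimate falls roughly with the square root of the batch size, which is the sense in which large batches are "more accurate" per step.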

the vocabulary size matters: larger vocab = shorter sequences but more parameters

A larger vocabulary encodes the same text in fewer tokens, but the embedding table and output layer grow in proportion to vocabulary size, adding parameters
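
The parameter side of this trade-off is simple arithmetic. The sketch below assumes a hidden size of 768 and two made-up vocabulary sizes; `embedding_parameters` is a hypothetical helper, and the `tied_output` flag models whether the output projection shares weights with the embedding table.

```python
def embedding_parameters(vocab_size, hidden_dim, tied_output=True):
    """Parameters in the token embedding table, plus an untied output projection if any."""
    table = vocab_size * hidden_dim
    return table if tied_output else 2 * table

# Hypothetical vocabulary sizes at a hidden size of 768.
small = embedding_parameters(30_000, 768)
large = embedding_parameters(120_000, 768)

print(small)   # 23040000
print(large)   # 92160000
```

Quadrupling the vocabulary quadruples these parameters, whatever reduction in sequence length it buys.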

OpenAI text-embedding-3-large (3072-dim by default; 1536-dim is the default for text-embedding-3-small) is used for: semantic search and RAG

Used for semantic search and retrieval-augmented generation (RAG), where embeddings retrieve relevant passages that ground a language model's answers
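
A minimal sketch of the retrieval step behind semantic search: rank documents by cosine similarity to a query embedding. The 3-dim vectors and document ids are made up for illustration; in practice the vectors would come from an embedding model, and `top_k` is a hypothetical helper rather than part of any API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Toy "embeddings" standing in for real model output.
docs = {
    "pooling": [0.9, 0.1, 0.0],
    "batching": [0.1, 0.9, 0.1],
    "tokenizers": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

results = top_k(query, docs, k=2)
print(results)
```

In a RAG pipeline, the text of the top-ranked documents would then be placed in the language model's prompt.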

768-dim BERT embeddings capture: bidirectional context from masked language modeling

BERT-base produces 768-dim token embeddings; masked language modeling trains each token to draw on both its left and right context, so the embeddings are bidirectional

[CLS] pooling does: uses the first token's embedding as the sentence representation

[CLS] pooling uses the final embedding of the first token, [CLS], as the representation of the whole sentence
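
For comparison with averaging approaches, [CLS] pooling reduces to taking the first row of the token embedding matrix. The values below are made up, and `cls_pool` is a hypothetical helper name.

```python
def cls_pool(token_embeddings):
    """Return the first token's vector ([CLS] in BERT-style inputs) as the sentence vector."""
    return list(token_embeddings[0])

# Toy sentence: 3 tokens, 3-dim embeddings; the first row plays the role of [CLS].
tokens = [
    [0.5, -1.0, 2.0],   # [CLS]
    [1.0, 1.0, 1.0],
    [3.0, 0.0, -2.0],
]

sentence_vec = cls_pool(tokens)
print(sentence_vec)
```

The other token vectors are ignored entirely, so the quality of the result depends on how well the model was trained to summarize the sentence into that first position.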
