Large language model

LLMs can generate, summarize, translate, and analyze text in many contexts

Large language models (LLMs) are neural networks trained on large amounts of text, which lets them perform a range of natural language processing tasks: generating new text, summarizing existing content, translating between languages, and analyzing text for different purposes. This versatility makes them useful tools across artificial intelligence and natural language processing.

Example

A chatbot powered by an LLM can engage in a conversation with users, providing responses that are coherent and contextually relevant.

Understanding what LLMs can do is the first step toward applying them well, in areas ranging from customer service to content creation.

Related concepts

Masking (behavior)

Causal masking prevents each position in the decoder from attending to future tokens, so a prediction depends only on the tokens that come before it.
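
As a rough illustration, causal masking can be sketched in plain Python as a lower-triangular mask applied to attention scores before the softmax. The function names `causal_mask` and `masked_softmax` and the uniform toy scores are assumptions made for this sketch; real implementations operate on tensors inside an attention layer.

```python
import math

def causal_mask(n):
    """Return an n x n matrix: True where query i may attend to key j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax with disallowed positions set to -inf before normalizing."""
    out = []
    for row, allowed in zip(scores, mask):
        masked = [s if ok else float("-inf") for s, ok in zip(row, allowed)]
        m = max(masked)  # the diagonal is always allowed, so m is finite
        exps = [math.exp(s - m) for s in masked]  # exp(-inf) is 0.0
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# Toy example: uniform scores for a 4-token sequence (illustrative values only).
scores = [[0.0] * 4 for _ in range(4)]
weights = masked_softmax(scores, causal_mask(4))

for row in weights:
    print(row)
```

Each row of `weights` sums to 1, and every entry above the diagonal is zero, meaning no position places attention weight on a later token.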

[CLS] pooling

[CLS] pooling uses the embedding of the first token, the special [CLS] token, as the sentence representation.

Weight tying in language models

Weight tying shares a single matrix between the input embedding and the output projection, which reduces the number of parameters.
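
The parameter saving from weight tying can be worked out with simple arithmetic. The sketch below uses hypothetical sizes (a 50,000-token vocabulary and a 768-dimensional model) and a toy class, `TiedToyLM`, invented here to show one matrix serving both roles; it is not taken from any particular library.

```python
def embedding_and_output_params(vocab_size, d_model, tied):
    """Count parameters in the input embedding and output projection (no biases)."""
    embedding = vocab_size * d_model
    output = 0 if tied else vocab_size * d_model  # tied: the output reuses the embedding
    return embedding + output

class TiedToyLM:
    """Toy model where one matrix is both the embedding and the output projection."""

    def __init__(self, embedding):
        self.embedding = embedding  # vocab_size rows, each of length d_model

    def embed(self, token_id):
        return self.embedding[token_id]

    def logits(self, hidden):
        # The logit for token v is the dot product of embedding row v with the hidden state.
        return [sum(e * h for e, h in zip(row, hidden)) for row in self.embedding]

# Hypothetical sizes chosen only for illustration.
vocab_size, d_model = 50_000, 768
untied = embedding_and_output_params(vocab_size, d_model, tied=False)
tied = embedding_and_output_params(vocab_size, d_model, tied=True)
saved = untied - tied
print(untied, tied, saved)

model = TiedToyLM([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
toy_logits = model.logits([2.0, 3.0])
print(toy_logits)
```

With these assumed sizes, tying removes one full vocabulary-by-dimension matrix, halving the parameters spent on these two layers.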

WordPiece tokenization

WordPiece tokenization splits words into subwords. It is similar to BPE, but it chooses which units to merge based on likelihood rather than raw pair frequency.
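
The difference between the two merge criteria can be seen on a toy example. The sketch below assumes a commonly described formulation of the WordPiece score, the pair count divided by the product of the two units' counts, as a stand-in for the likelihood criterion; the counts are made up for illustration, and production tokenizers differ in detail.

```python
def bpe_best_pair(pair_counts):
    """BPE-style choice: merge the most frequent adjacent pair."""
    return max(pair_counts, key=pair_counts.get)

def wordpiece_best_pair(pair_counts, unit_counts):
    """WordPiece-style choice (assumed scoring): count(ab) / (count(a) * count(b))."""
    def score(pair):
        a, b = pair
        return pair_counts[pair] / (unit_counts[a] * unit_counts[b])
    return max(pair_counts, key=score)

# Made-up counts; "##" marks a unit that continues a word.
unit_counts = {"u": 36, "##g": 20, "##s": 5}
pair_counts = {("u", "##g"): 20, ("##g", "##s"): 5}

bpe_choice = bpe_best_pair(pair_counts)
wordpiece_choice = wordpiece_best_pair(pair_counts, unit_counts)
print(bpe_choice, wordpiece_choice)
```

Here the frequency criterion picks the pair seen 20 times, while the normalized score prefers the rarer pair whose parts almost always occur together.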

Mean pooling vs. [CLS] pooling

Mean pooling often outperforms [CLS] pooling on sentence similarity tasks, because averaging all token embeddings tends to capture overall sentence meaning better than the [CLS] token embedding alone.
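
The two pooling strategies differ only in how they reduce per-token embeddings to one vector. A minimal sketch follows, assuming the token embeddings arrive as a list of vectors with [CLS] first and an attention mask marking padding; the helper names and the toy numbers are illustrative, not from a specific library.

```python
def cls_pool(token_embeddings):
    """Use the first token's embedding ([CLS]) as the sentence vector."""
    return list(token_embeddings[0])

def mean_pool(token_embeddings, attention_mask):
    """Average the embeddings of real tokens, skipping padded positions."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    dim = len(token_embeddings[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]

# Toy 2-dimensional embeddings for a 3-token sentence plus one padding slot.
token_embeddings = [
    [1.0, 0.0],  # [CLS]
    [0.0, 2.0],
    [2.0, 4.0],
    [9.0, 9.0],  # padding, excluded by the mask
]
attention_mask = [1, 1, 1, 0]

cls_vec = cls_pool(token_embeddings)
mean_vec = mean_pool(token_embeddings, attention_mask)
print(cls_vec, mean_vec)
```

The mask matters for mean pooling: without it, padding vectors would be averaged into the sentence representation.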

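OpenAI text-embedding-3-large

text-embedding-3-large produces 3072-dimensional embeddings by default (1536 is the default size of text-embedding-3-small). The embeddings are used for semantic search and for retrieval-augmented generation (RAG), where retrieved passages give a language model relevant context.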
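
Semantic search over embeddings usually comes down to ranking documents by vector similarity to the query. The sketch below uses hand-made 3-dimensional vectors as stand-ins for real embeddings, which would come from an embedding model; the document names and the `search` helper are assumptions for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, doc_vecs, top_k=2):
    """Return the ids of the top_k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:top_k]]

# Hand-made stand-in vectors (real embeddings have hundreds or thousands of dimensions).
doc_vecs = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.0],
    "company history": [0.0, 0.1, 0.9],
}
query_vec = [0.8, 0.2, 0.0]

results = search(query_vec, doc_vecs)
print(results)
```

In a RAG setup, the text of the top-ranked documents would then be placed in the language model's prompt as context.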
