Retrieval-augmented generation

RAG enables LLMs to access new information without retraining

Retrieval-augmented generation (RAG) lets large language models (LLMs) retrieve and incorporate information from external sources at query time, which helps them give up-to-date responses. The technique supplements the model's training data with domain-specific or more recent information, so the model can draw on internal company data or authoritative sources when generating a response.

Example

A chatbot using RAG can access and provide the latest financial reports from a company's database, even if it wasn't trained on that specific data.

RAG's ability to access new information without retraining is crucial for maintaining the relevance and accuracy of LLM-generated content in rapidly changing domains.
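The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration, not a production design: the DOCUMENTS list is a hypothetical stand-in for a company database, word-count cosine similarity stands in for a real embedding model, and the final LLM call is left out, so the sketch stops at the augmented prompt.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a company document store.
DOCUMENTS = [
    "Q3 financial report: revenue grew 12 percent year over year.",
    "Employee handbook: vacation requests need manager approval.",
    "Q2 financial report: revenue was flat compared with Q1.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Augment the user question with retrieved context for the LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


prompt = build_prompt(
    "What does the latest financial report say about revenue?", DOCUMENTS
)
print(prompt)
```

The model never sees the whole database, only the few passages that score highest for the question, which is why new documents become usable without retraining.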

Related concepts

OpenAI text-embedding-3-large (3072-dim by default; 1536-dim when shortened) is used for: semantic search and RAG

Used for semantic search, RAG, and enhancing language models' understanding

Prompt engineering

Prompts are written to steer a GenAI model, for example by including worked examples so the model picks up the task in context

Knowledge distillation

Knowledge distillation transfers knowledge from a large model to a smaller one, aiming to keep most of the larger model's performance
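One common way to express this transfer is a soft-target loss: the smaller (student) model is trained to match the larger (teacher) model's temperature-softened output distribution. The sketch below assumes that formulation (KL divergence scaled by the squared temperature); the function names and the default temperature are illustrative choices, and other distillation objectives exist.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# The loss shrinks as the student's logits approach the teacher's.
print(distillation_loss([4.0, 1.0, 0.5], [0.5, 1.0, 4.0]))
print(distillation_loss([4.0, 1.0, 0.5], [3.5, 1.2, 0.4]))
```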

Large language model

LLMs can generate, summarize, translate, and analyze text in many contexts

Mean pooling often outperforms [CLS] for sentence similarity tasks

Mean pooling captures overall sentence meaning better than [CLS] token embedding
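The difference between the two pooling strategies is easy to see on a small example. The sketch below assumes the encoder's output is already available as a list of per-token vectors plus an attention mask (1 for real tokens, 0 for padding); the numbers are made up for illustration.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the embeddings of non-padding tokens."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, v in enumerate(vec):
                total[i] += v
            count += 1
    if count == 0:
        return total  # no real tokens: return the zero vector
    return [t / count for t in total]


def cls_pool(token_embeddings):
    """Use only the first token's embedding (the [CLS] position)."""
    return list(token_embeddings[0])


# Made-up token embeddings: [CLS], two word tokens, one padding token.
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

print(mean_pool(tokens, mask))  # [3.0, 2.0]
print(cls_pool(tokens))         # [1.0, 0.0]
```

Mean pooling uses every real token and ignores padding, while [CLS] pooling relies on a single position to summarize the sentence.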

Hierarchical navigable small world

HNSW is an efficient graph-based algorithm for approximate nearest neighbor (ANN) search
