RAG enables LLMs to access new information without retraining
Retrieval-augmented generation (RAG) lets a large language model (LLM) retrieve information from external sources at query time and incorporate it into its answer, so responses can stay up to date. The technique supplements the model's training data with domain-specific or more recent information, such as internal company data or authoritative reference sources.
Example
A chatbot using RAG can access and provide the latest financial reports from a company's database, even if it wasn't trained on that specific data.
RAG's ability to access new information without retraining is crucial for maintaining the relevance and accuracy of LLM-generated content in rapidly changing domains.
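The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the bag-of-words `embed` function is a toy stand-in for a real embedding model, `DOCS` is made-up sample data, and the names `retrieve` and `build_prompt` are hypothetical. The resulting prompt would be passed to whatever LLM is in use.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; an assumption standing in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents, k=1):
    """Place the retrieved text in the prompt so the model can use it without retraining."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Made-up sample knowledge base (hypothetical data).
DOCS = [
    "Q3 revenue was 12 million dollars according to the latest financial report",
    "The office cafeteria menu changes every Monday",
    "Employees accrue vacation days monthly",
]

if __name__ == "__main__":
    print(build_prompt("What was the Q3 revenue in the financial report", DOCS))
```

Updating the knowledge only requires changing `DOCS`; the model itself is untouched, which is the point of the technique.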
OpenAI text-embedding-3-large (3072 dimensions by default, reducible to 1536 or fewer) is used for semantic search and RAG
Used for semantic search, RAG, and enhancing language models' understanding
Prompt engineering
The GenAI model learns tasks from examples in the prompt
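Learning a task from examples in the prompt is often called few-shot prompting. The sketch below only assembles such a prompt as a string; the function name `build_few_shot_prompt`, the "Review/Sentiment" layout and the sample reviews are illustrative assumptions, and no particular model API is implied.

```python
def build_few_shot_prompt(examples, query,
                          instruction="Classify the sentiment of each review."):
    """Assemble a prompt with worked examples followed by the new input."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final item is left unanswered for the model to complete.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

if __name__ == "__main__":
    demo = [("Loved it", "positive"), ("Waste of money", "negative")]
    print(build_few_shot_prompt(demo, "Works as described"))
```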
Knowledge distillation
Knowledge distillation transfers knowledge from a large model to a smaller one while aiming to preserve most of the larger model's accuracy
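One common way to transfer knowledge is to train the small (student) model to match the large (teacher) model's softened output distribution. The sketch below computes that matching term as a KL divergence over temperature-scaled softmax outputs, multiplied by the squared temperature as is conventional. The function names and the default temperature of 2.0 are assumptions for illustration; real training setups typically combine this term with the ordinary label loss.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    return kl * temperature ** 2

if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]
    print(distillation_loss([3.5, 1.2, 0.4], teacher))  # small: student is close
    print(distillation_loss([0.5, 1.0, 4.0], teacher))  # larger: student disagrees
```

The loss is zero only when the student reproduces the teacher's distribution exactly, which a smaller model usually cannot do, so some gap tends to remain.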
Large language model
LLMs can generate, summarize, translate, and analyze text in many contexts
Mean pooling often outperforms [CLS] pooling for sentence similarity tasks
Mean pooling averages all token embeddings, so it tends to capture overall sentence meaning better than the [CLS] token embedding alone
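The difference between the two pooling strategies is easy to see on a small example. In this sketch the token embeddings are made-up 2-dimensional vectors (real encoders produce hundreds of dimensions), the first position is assumed to be the [CLS] token, and the attention mask marks padding positions with 0 so they are excluded from the average. The function names are hypothetical.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the embeddings of real tokens, skipping padding (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, v in enumerate(vec):
                total[i] += v
    if count == 0:
        raise ValueError("attention mask excludes every token")
    return [t / count for t in total]

def cls_pool(token_embeddings):
    """Use only the first position, assumed here to be the [CLS] token."""
    return list(token_embeddings[0])

if __name__ == "__main__":
    tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]  # last row is padding
    mask = [1, 1, 1, 0]
    print(mean_pool(tokens, mask))  # [3.0, 2.0]
    print(cls_pool(tokens))         # [1.0, 0.0]
```

Mean pooling draws on every real token, whereas [CLS] pooling depends on how well the model was trained to summarize the sentence in a single position.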
Hierarchical navigable small world
HNSW is an efficient graph-based algorithm for approximate nearest neighbor (ANN) search
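The core move in HNSW is a greedy walk over a proximity graph: start at an entry node and keep hopping to whichever neighbor is closer to the query until no neighbor improves. The sketch below shows only that walk on a single layer, which is a deliberate simplification. Real HNSW stacks several layers (sparse at the top, dense at the bottom), builds the graph incrementally, and tracks a candidate list rather than a single node. The brute-force `build_knn_graph` helper and all names here are assumptions for illustration.

```python
import math
import random

def build_knn_graph(points, k=5):
    """Brute-force k-nearest-neighbor graph; stands in for HNSW's incremental build."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        graph[i] = others[:k]
    return graph

def greedy_search(points, graph, query, entry=0):
    """Hop to the closest neighbor until none is closer to the query than the current node."""
    current = entry
    current_d = math.dist(points[current], query)
    while True:
        best, best_d = current, current_d
        for n in graph[current]:
            d = math.dist(points[n], query)
            if d < best_d:
                best, best_d = n, d
        if best == current:
            # Local minimum: an approximate, not guaranteed exact, nearest neighbor.
            return current, current_d
        current, current_d = best, best_d

if __name__ == "__main__":
    random.seed(0)
    pts = [(random.random(), random.random()) for _ in range(200)]
    g = build_knn_graph(pts, k=5)
    print(greedy_search(pts, g, (0.5, 0.5)))
```

Each query visits only a handful of nodes instead of scanning all points, which is where the speed comes from; the price is that the walk can stop at a local minimum, hence "approximate".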