RAG (Retrieval Augmented Generation)
Techniques
A technique that enhances LLM responses by retrieving relevant documents from an external knowledge base before generating an answer.
Full Explanation
RAG addresses two key LLM problems: knowledge cutoff dates and hallucination. Instead of relying solely on its training data, the model receives current, relevant documents retrieved from a knowledge base (typically via vector search) as additional context, then generates an answer grounded in those retrieved documents. RAG is used by Perplexity AI, enterprise chatbots, and most production AI applications.
A customer service chatbot uses RAG to retrieve the latest product documentation before answering support questions.
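The retrieve-then-generate loop above can be sketched in a few lines of Python. Everything here is illustrative: the document store, the toy bag-of-words "embedding", and the prompt template are assumptions standing in for a real embedding model, vector database, and LLM call.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; real systems use a neural embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical product-documentation knowledge base.
documents = [
    "The X100 battery lasts 12 hours on a full charge.",
    "Returns are accepted within 30 days of purchase.",
    "The X100 supports USB-C fast charging.",
]

def retrieve(query, docs, k=2):
    # Rank documents by similarity to the query and keep the top k.
    scored = sorted(docs, key=lambda d: cosine(embed(query), embed(d)), reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    # Ground the LLM by prepending the retrieved documents as context.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How long does the X100 battery last?", documents))
```

The resulting prompt, rather than the bare question, is what gets sent to the LLM, so the answer is tied to the retrieved documentation instead of stale training data.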
Related Terms
Vector Database: A database optimized for storing and searching embedding vectors — the foundation of RAG and semantic search applications.
Embeddings: Numerical vector representations of text that capture semantic meaning, allowing AI to find conceptually similar content.
Grounding: Connecting AI model responses to verified, real-world information sources to reduce hallucination and improve accuracy.
Hallucination: When an AI model generates confident-sounding but factually incorrect or fabricated information.