Memory and Knowledge

RAG Patterns

Retrieval-Augmented Generation (RAG) patterns describe how agents retrieve relevant information from external knowledge sources at run time and inject it into the model's context before generating a response. The core pattern has four steps: chunk documents into segments, embed each chunk as a vector, store the vectors in a vector database, and, at query time, retrieve the most semantically similar chunks to include in the prompt. Advanced RAG patterns include multi-step retrieval (using an initial retrieval to refine the query), hybrid search (combining semantic and keyword matching), re-ranking (using a second model to score relevance), and agentic RAG (letting the agent decide when and what to retrieve). Understanding these variations matters because the right pattern depends heavily on your knowledge structure, query patterns, and latency budget.
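The core pattern (chunk, embed, store, retrieve, inject) can be sketched in a few lines of plain Python. This is a minimal illustration under loud assumptions, not a production recipe: `embed` is a toy bag-of-words stand-in for a real embedding model, `VectorStore` is a hypothetical in-memory substitute for a vector database, and the chunk sizes and prompt wording are arbitrary choices made for the example.

```python
import math
import re
from collections import Counter


def chunk(text, size=12, overlap=4):
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy embedding (assumption): lowercase term counts.
    A real system would call an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (chunk_text, vector) pairs

    def add(self, chunks):
        self.items.extend((c, embed(c)) for c in chunks)

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(query, store, k=2):
    """Inject the retrieved chunks into the prompt ahead of the question."""
    context = "\n".join(f"- {c}" for c in store.search(query, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Index a few illustrative documents, then retrieve for one query.
docs = [
    "Refunds are issued within 14 days of a return request.",
    "Shipping to Europe takes five to seven business days.",
    "Support is available by email on weekdays.",
]
store = VectorStore()
for doc in docs:
    store.add(chunk(doc))

print(build_prompt("How long do refunds take?", store, k=1))
```

Because the toy embedding only matches exact terms, it behaves more like keyword search than semantic search; swapping `embed` for a real embedding model is what makes the retrieval semantic, and the rest of the flow stays the same.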

subtopics

Naive RAG

Advanced RAG Techniques

connected to

Context Engineering vs Prompting

Retrieval Augmented Generation

Graph RAG vs Vector RAG

resources

Anthropic: RAG Guide (docs.anthropic.com). Anthropic's guide to implementing RAG with Claude.

LangChain: RAG (python.langchain.com). Comprehensive tutorial on building RAG pipelines with LangChain.

Pinecone: RAG Guide (pinecone.io). Deep dive into RAG architecture and optimization from Pinecone.

OpenAI: RAG Best Practices (cookbook.openai.com). OpenAI's cookbook for building production RAG systems.
