Context Engineering

Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) is a pattern in which a system retrieves relevant documents from an external knowledge source and injects them into the model's prompt before it generates a response, grounding answers in specific source material rather than in training data alone. This matters because language models have a knowledge cutoff and can confidently hallucinate facts that fall outside their training data, so an agent that answers questions about your data, your codebase, or recent events generally needs retrieval to stay accurate. Every variation of the pattern, from simple vector retrieval to multi-hop agentic RAG, builds on the same core loop: retrieve relevant context, then generate.
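The retrieve-then-generate loop can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the corpus, the function names (`retrieve`, `build_prompt`), and the prompt wording are all hypothetical, and a simple bag-of-words cosine similarity stands in for the embedding model and vector database a real system would likely use. The final call to a language model is left out; the sketch stops at the grounded prompt that would be sent to it.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; a real system would load and chunk its own documents.
DOCUMENTS = [
    "The billing service retries failed payments three times before alerting.",
    "Deployments run through the staging cluster before reaching production.",
    "The search index is rebuilt nightly from the orders database.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents with non-zero similarity to the query, best first."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(doc))), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Inject the retrieved passages into the prompt ahead of the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Step 1: retrieve relevant context. Step 2: generate from the grounded prompt.
query = "How many times are failed payments retried?"
passages = retrieve(query, DOCUMENTS, k=1)
prompt = build_prompt(query, passages)
print(prompt)  # this string is what would be passed to the language model
```

Swapping the word-count vectors for learned embeddings changes how `retrieve` scores documents but leaves the overall loop unchanged, which is why the more elaborate variants can be read as refinements of the same two steps.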

subtopics

RAG Pipeline Design

Chunking Strategies

connected to

RAG Patterns

Vector Databases

resources

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (arxiv.org): The original RAG paper from Meta AI

Anthropic: RAG Guide (docs.anthropic.com): Anthropic's comprehensive guide to implementing RAG with Claude

Pinecone: What is RAG? (pinecone.io): Accessible introduction to RAG with architectural diagrams

LangChain: RAG Tutorial (python.langchain.com): Step-by-step tutorial for building RAG applications
