Foundations

Model Selection

Model selection is the process of matching a language model to each task in your agent system based on cost, latency, capability, and context window size. Frontier models like Claude Sonnet and GPT-4o handle complex reasoning and multi-step planning well, while smaller models like Claude Haiku and GPT-4o-mini handle high-volume, low-complexity tasks such as classification or extraction at a fraction of the cost. Before deploying, run your top two or three candidate models against your eval suite, because performance rankings can reverse depending on task type. The cheapest model often wins on narrow, well-specified tasks, where a frontier model's general capability adds latency and cost without improving results.
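The advice to run candidates against an eval suite can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `Candidate` type, the stub predict functions, the per-call costs, and the four-case eval suite are all hypothetical stand-ins for real provider calls and a real eval set. It picks the cheapest candidate that clears an accuracy threshold.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    name: str                      # hypothetical label, not a real model ID
    cost_per_call: float           # assumed average cost in USD per eval case
    predict: Callable[[str], str]  # stand-in for a real provider API call


def accuracy(candidate: Candidate, cases: List[Tuple[str, str]]) -> float:
    """Fraction of eval cases where the prediction equals the expected label."""
    correct = sum(1 for prompt, expected in cases
                  if candidate.predict(prompt) == expected)
    return correct / len(cases)


def pick_cheapest_passing(candidates: List[Candidate],
                          cases: List[Tuple[str, str]],
                          min_accuracy: float) -> Optional[Candidate]:
    """Return the lowest-cost candidate meeting the threshold, or None."""
    passing = [c for c in candidates if accuracy(c, cases) >= min_accuracy]
    if not passing:
        return None
    return min(passing, key=lambda c: c.cost_per_call)


# Toy eval suite for a narrow, well-specified task (sentiment labels).
EVAL_CASES = [
    ("I love this product", "positive"),
    ("This is terrible", "negative"),
    ("Great support team", "positive"),
    ("Awful, would not buy again", "negative"),
]


def keyword_stub(prompt: str) -> str:
    """Fake 'model' used by both candidates; replace with real API calls."""
    lowered = prompt.lower()
    if "terrible" in lowered or "awful" in lowered:
        return "negative"
    return "positive"


# Both stubs behave identically here, so only the assumed cost differs.
CANDIDATES = [
    Candidate("frontier-stub", cost_per_call=0.0100, predict=keyword_stub),
    Candidate("small-stub", cost_per_call=0.0005, predict=keyword_stub),
]

winner = pick_cheapest_passing(CANDIDATES, EVAL_CASES, min_accuracy=0.9)
```

In a real comparison you would also record latency per call and use enough eval cases for the accuracy difference between candidates to be meaningful; exact-match scoring as used here only suits tasks with a single correct label.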

subtopics

Model Comparison Criteria

Model Providers

connected to

LLM Fundamentals

Choosing Your Stack

resources

Anthropic Model Comparison (docs.anthropic.com): Detailed comparison of Claude model capabilities, context windows, and pricing

OpenAI Model Overview (platform.openai.com): Complete guide to OpenAI's model lineup with capability breakdowns

Chatbot Arena Leaderboard (lmarena.ai): Crowdsourced model rankings based on human preference across diverse tasks

Artificial Analysis LLM Benchmarks (artificialanalysis.ai): Independent speed, quality, and pricing comparisons across LLM providers

Google AI Models (ai.google.dev): Gemini model specs including context windows and multimodal capabilities