Tool Design and Contracts

Tool Sandboxing

definition

Running agent tool calls in isolated environments limits the blast radius when a call goes wrong: it keeps agents from modifying production data, executing dangerous commands, or reaching resources outside their intended scope. Common approaches include Docker containers, virtual machines, Firecracker microVMs, restricted file system access, and permission-based execution, where certain actions require explicit human approval. Sandboxing is not optional for production agent systems. Without it, a single hallucinated command or prompt injection can cascade into a critical security incident. The architectural challenge is balancing isolation with utility: too restrictive and the agent can't do useful work, too permissive and you're running arbitrary LLM-generated code with full system access. This concept connects to permission models (the authorization layer within sandboxes), least privilege (the access control philosophy), and ephemeral sandboxing (the strictest isolation pattern, which uses disposable execution environments).
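Two of the approaches above, restricted file system access and permission-based execution, can be sketched at the application level. The following is a minimal illustration, not a complete sandbox: the names `APPROVAL_REQUIRED`, `SandboxViolation`, `resolve_in_sandbox`, and `authorize`, and the tool names in the policy set, are all hypothetical and chosen for this example. It assumes the caller supplies an `approve` callback that stands in for the human approval step.

```python
from pathlib import Path
from typing import Callable

# Hypothetical policy: tool names that need explicit human approval.
APPROVAL_REQUIRED = {"delete_file", "run_shell"}


class SandboxViolation(Exception):
    """Raised when a tool call leaves the sandbox or lacks approval."""


def resolve_in_sandbox(root: Path, requested: str) -> Path:
    """Resolve `requested` against `root`, rejecting paths that escape it."""
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check below runs on the real target path.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise SandboxViolation(f"path escapes sandbox: {requested}")
    return candidate


def authorize(tool_name: str, approve: Callable[[str], bool]) -> None:
    """Block approval-gated tools unless the approver says yes."""
    if tool_name in APPROVAL_REQUIRED and not approve(tool_name):
        raise SandboxViolation(f"approval denied for tool: {tool_name}")
```

A tool dispatcher would call `authorize` before running a tool and `resolve_in_sandbox` before touching any path the agent supplies. Checks like these complement OS-level isolation such as containers or microVMs rather than replace it, since they only guard the code paths that remember to call them.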

related concepts

Permission Models, Ephemeral Execution Environments, Blast Radius Containment