We analyzed 14 sandbox solutions for AI coding agents across four isolation tiers: containers, user-space kernels, microVMs, and kernel-enforced capabilities. Every tier has a failure mode, and multiple shipping agents have already been exploited through gaps in their isolation. The right sandbox depends on your threat model: what you're isolating from, and what credentials or access exist inside the sandbox. Isolation technology protects the host from the sandbox; it doesn't protect the sandbox from itself.
Local AI agents process content from sources that can’t be fully verified: npm packages, model weights, code snippets from search results, user-provided repositories. When these agents execute code, they typically do so with user-level privileges: filesystem access, network capabilities, shell execution.
This creates an isolation challenge. The agent needs sufficient access to be useful (installing dependencies, reading project files, making API calls), but that same access becomes a liability when processing untrusted input. The question isn’t whether to grant access, but how to contain the consequences when something goes wrong.
The past six months suggest this isn’t theoretical anymore. Multiple coding agents have shipped vulnerabilities that exposed credentials, bypassed allowlists, and enabled data exfiltration; Cursor alone had multiple critical issues in three months. The pattern: isolation mechanisms that looked reasonable on paper failed against real-world attack scenarios.
The approaches vary significantly, and so do the outcomes.
Claude Code implements dual-isolation using Seatbelt (macOS) and Bubblewrap (Linux), combining filesystem restrictions (CWD only) with network traffic routed through validating proxies.
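A rough sketch of what the Linux half of this might look like. The flags below are real Bubblewrap options, but the specific profile (which paths are bound, the proxy socket path) is an illustrative assumption, not Claude Code's actual configuration; the command is constructed but deliberately not executed.

```python
# Illustrative Bubblewrap invocation approximating "writable CWD only,
# no direct network". Flags are real bwrap options; the profile itself
# is an assumption for illustration.
import os

def bwrap_argv(workdir: str, proxy_sock: str) -> list[str]:
    return [
        "bwrap",
        "--ro-bind", "/usr", "/usr",       # system dirs mounted read-only
        "--bind", workdir, workdir,        # only the project dir is writable
        "--tmpfs", "/tmp",
        "--proc", "/proc",
        "--dev", "/dev",
        "--unshare-net",                   # no direct network access...
        "--bind", proxy_sock, proxy_sock,  # ...traffic exits via a validating
                                           #    proxy's unix socket
        "--die-with-parent",
        "--chdir", workdir,
        "bash",
    ]

argv = bwrap_argv(os.getcwd(), "/run/agent-proxy.sock")
```

Note that `--unshare-net` removes the network namespace entirely; the only egress path left is the unix socket bound in from the host, which is where proxy-side validation happens.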
Cursor has experienced multiple security issues: credential exposure via full filesystem read access (November 2025) and an allowlist bypass via environment variable poisoning (CVE-2026-22708, January 2026). The allowlist bypass is instructive: static allowlists validate commands in isolation but ignore poisoned context, turning “safe” commands into attack vectors.
Gemini CLI offers flexible configuration with five predefined Seatbelt profiles, but suffered a code execution vulnerability via prompt injection (patched July 2025).
We analyzed 14 sandbox solutions designed for coding agents. The landscape divides into four tiers:
Container-based isolation shares a kernel between sandbox and host. Daytona (~90ms cold start) and agent-infra/sandbox use this approach. Fast, but a kernel exploit breaks the boundary.
User-space kernel (gVisor) intercepts syscalls before they reach the host kernel. Modal runs 20,000+ containers this way. The kernel attack surface shrinks, but gVisor’s correctness becomes security-critical.
Hardware-virtualized microVMs give each sandbox its own kernel. E2B boots Firecracker microVMs in ~150ms; Northflank offers Kata or gVisor; Docker Sandboxes (Desktop 4.58+) uses native virtualization. Strongest isolation, but with session limits and operational complexity.
Kernel-enforced capabilities use OS-level restrictions. nono applies Landlock/Seatbelt with no escape hatch. Bubblewrap provides namespace-based isolation. Local-only; multi-tenant platforms can’t rely on them.
MicroVMs provide the strongest isolation but impose operational constraints—session limits, boot latency, memory overhead.
Containers aren’t security boundaries on their own: a shared kernel means a container escape compromises the host.
Kernel-enforced capabilities have no escape hatch but only work locally. nono layers five restriction mechanisms (command blocklisting, syscall blocking, truncation blocking, filesystem sandboxing, network blocking). Defense-in-depth, but not deployable to multi-tenant platforms.
Static allowlists fail when context is poisoned. CVE-2026-22708 demonstrated this: shell built-ins modify environment variables without consent, turning subsequent “allowed” commands into attack vectors.
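A toy demonstration of the failure mode. The allowlist and commands are hypothetical, but the mechanics are the point: each command passes a static check in isolation, while the first command silently changes what the second one resolves to.

```python
# Toy allowlist-bypass via environment poisoning. ALLOWLIST and the
# commands are hypothetical; shutil.which shows how resolution changes.
import os, shutil, stat, tempfile

ALLOWLIST = {"export", "git"}          # naive: validate only the command word

def statically_allowed(command: str) -> bool:
    return command.split()[0] in ALLOWLIST

# Both commands pass the static check on their own:
assert statically_allowed("export PATH=/tmp/evil:/usr/bin")
assert statically_allowed("git status")

# But after the export runs, "git" no longer means /usr/bin/git.
evil_dir = tempfile.mkdtemp()
fake_git = os.path.join(evil_dir, "git")
with open(fake_git, "w") as f:
    f.write("#!/bin/sh\necho pwned\n")
os.chmod(fake_git, os.stat(fake_git).st_mode | stat.S_IXUSR)

poisoned_path = evil_dir + os.pathsep + "/usr/bin"
resolved = shutil.which("git", path=poisoned_path)
assert resolved == fake_git            # the "allowed" command is now attacker code
```

The fix isn't a longer allowlist; it's validating commands against the session's accumulated state (or preventing that state from being mutated at all).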
Full filesystem read access creates credential exposure even with write restrictions. Agents can cat credential files and leak them through STDOUT.
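A minimal sketch of why write restrictions alone don't help; the paths and file contents here are illustrative.

```python
# Sketch: a write-restricted sandbox still leaks any secret it can read.
# Paths and contents are illustrative.
import os, tempfile

home = tempfile.mkdtemp()
cred_path = os.path.join(home, ".aws", "credentials")
os.makedirs(os.path.dirname(cred_path))
with open(cred_path, "w") as f:
    f.write("[default]\naws_secret_access_key = EXAMPLEKEY\n")

# Read access plus stdout is a complete exfiltration channel:
leaked = open(cred_path).read()
print(leaked)  # reaches the model transcript, logs, or an attacker-controlled tool
```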
The sandbox limits blast radius when trust is violated. It doesn’t address the underlying trust problem.
AI agents process content from sources that can’t be fully verified: model weights that execute code on load, package dependencies with install scripts, repositories provided by users. Each is an injection vector. The sandbox is the last layer of defense, not the first.
The sandbox also only isolates what you put in it. Mount SSH keys and they’re accessible inside. Grant full filesystem read and credentials become exfiltrable. The isolation technology protects the host from the sandbox; it doesn’t protect the sandbox from itself.
The right isolation tier depends on two questions: what are you isolating from, and what are you isolating with?
What are you isolating from?
What are you isolating with?
A microVM with mounted AWS credentials isn’t meaningfully more secure than a container with the same credentials.
For platform builders:
For developers using coding agents:
For security teams:
Sandbox selection is architecture, not procurement. The isolation tier matters less than the threat model it’s designed against and the trust boundaries it actually enforces.
Defense-in-depth provides stronger guarantees than any single isolation technology. Dual-isolation (filesystem + network) addresses distinct threat vectors. Violation detection enables response, not just prevention.
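The network half of dual-isolation can be sketched as a validating proxy's egress check. Hostnames here are illustrative, and a production proxy also has to handle redirects, IP literals, and DNS rebinding; the sketch shows two of the points above: exact-match validation and violation logging for detection, not just prevention.

```python
# Minimal egress check for a validating proxy. ALLOWED_HOSTS is an
# illustrative policy; `violations` records blocked attempts so that
# a bypass attempt triggers response, not just a silent drop.
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.anthropic.com", "registry.npmjs.org"}
violations: list[str] = []

def egress_allowed(url: str) -> bool:
    host = urlsplit(url).hostname or ""
    ok = host in ALLOWED_HOSTS        # exact match: suffix checks are bypassable
    if not ok:
        violations.append(host)       # detection feeds alerting/response
    return ok

assert egress_allowed("https://registry.npmjs.org/left-pad")
assert not egress_allowed("https://evil.example/exfil?d=secret")
assert not egress_allowed("https://api.anthropic.com.evil.example/")  # suffix trick
```

Exact-match on the parsed hostname matters: a substring or suffix check would pass `api.anthropic.com.evil.example`, which resolves to an attacker-controlled domain.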
And the sandbox only contains what you put in it.