Namespace: Running Claude Managed Agents on Devboxes

Anthropic just introduced support for running Claude Managed Agents in self-hosted sandboxes. Today we are shipping native support for using Devboxes as a self-hosted sandbox with Claude Managed Agents.

Claude Managed Agents let you delegate tasks to Claude. It plans, reads files, runs shell commands, calls tools, and iterates until the work is done. Each session runs in its own freshly provisioned sandbox.

Anthropic's own sandboxes let you declare pip, npm, or apt packages to install before the agent starts. For many tasks that's enough, but for engineering work it usually isn't. You might need a specific compiler version, a code generator that isn't on any public registry, or a base image that matches your production environment.

Namespace Devboxes solve this by giving you full control over the environment.

How it works

Anthropic handles orchestration: the model, the agent loop, context management, and error recovery. Your Namespace Devbox handles execution: shell commands, file operations, tool calls, builds, test runs, etc.

[Diagram: on the Anthropic side, the agent loop (model, planning, and tool calls) emits tool_use requests. On the Namespace side, the Namespace control plane routes that work to a fresh Devbox, an isolated VM per session.]

The agent's capabilities and performance can now be shaped by the environment you give it.

Why Namespace Devboxes?

Agent quality is bounded by environment quality. If a build takes 3 minutes, the agent waits 3 minutes between iterations. If the toolchain doesn't match what CI expects, the agent's test runs produce different results than your pipeline does. If the agent can't reach your internal services, it works against stubs instead of the real thing.

Devboxes are high-performance environments built for code-related workflows such as builds and tests. They use Docker-based environments with the same toolchains your team runs in CI (Docker, Bazel, Gradle, etc.). Builds are fast because the image is pre-built and the environment is tuned for compute-intensive tasks, not for running a web browser.

Three capabilities make Devboxes a particularly good fit for managed agents:

Custom images via Dockerfile. Pin your compiler version, bundle your internal tools, match your production base image exactly. The agent's environment is defined in code, version-controlled, and consistent across every run.
Private network access via Tailscale. The agent can reach your internal services, registries, and databases the same way any authorized device on your tailnet can.
Ephemeral by default. Each task starts from a clean slate. No state bleeds between runs, and parallel workstreams get fully isolated environments without branch conflicts or port collisions.

The rest of this post covers each of these in more detail.

Custom environments via Dockerfiles

Anthropic's environments let you specify packages to install, such as pip dependencies, apt packages, and npm modules. Those get fetched and installed at the start of every session. For a simple Python script that's fine. For a codebase that depends on a specific compiler version, an internal code generator, or tooling that isn't on any public registry, it falls short.

Devbox images work differently. You define a Dockerfile, build it once, and Namespace converts it into an optimized disk image. When an agent session starts, the Devbox boots from that pre-built image.

Dockerfile

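FROM ubuntu:24.04

# Boilerplate: common tools and a non-root user with passwordless sudo
RUN apt-get update && apt-get install -y git curl build-essential sudo
RUN useradd -m -s /bin/bash devbox && echo "devbox ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
USER devbox

# Your team's specific toolchain
# apt-get needs root, so go through sudo now that the active user is devbox.
# golang-go supplies the `go` command used below (assumes the distro's Go is
# recent enough for this protoc-gen-go release).
RUN sudo apt-get install -y protobuf-compiler golang-go
RUN go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.36
# `go install` places binaries in $HOME/go/bin, which is not on PATH by default
ENV PATH="/home/devbox/go/bin:${PATH}"

# Internal tooling that isn't on any public registry
COPY scripts/setup-internal-tools.sh /tmp/
RUN /tmp/setup-internal-tools.sh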

Build and register the image with the Devbox CLI:

Command Line

devbox image build ./my-image --name=my-team/golang

Namespace optimizes the resulting image for fast Devbox startup. You reference it when configuring your managed agent environment and every subsequent session boots from the same pre-built state.
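To see why building once matters, here is a back-of-the-envelope sketch in Python. Every timing in it is a hypothetical placeholder, not a measurement of either platform, so substitute numbers from your own sessions.

```python
def total_setup_seconds(sessions: int, per_session: float, one_time: float = 0.0) -> float:
    """Total time spent preparing environments across `sessions` agent sessions."""
    return one_time + sessions * per_session


# Hypothetical numbers, for illustration only.
SESSIONS = 200

# Packages fetched and installed at the start of every session.
install_each_time = total_setup_seconds(SESSIONS, per_session=90.0)

# One image build up front, then a short boot per session.
prebuilt_image = total_setup_seconds(SESSIONS, per_session=5.0, one_time=600.0)

print(f"install per session: {install_each_time / 60:.1f} min")
print(f"pre-built image:     {prebuilt_image / 60:.1f} min")
```

The one-time build cost is paid once, while the per-session cost is multiplied by every session the agent runs, so the gap widens as usage grows.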

The Dockerfile can live in your repository alongside your code. The agent's environment is version-controlled, auditable, and consistent across every run. Updating the toolchain means updating the Dockerfile and rebuilding. Existing Devboxes keep their original image until you explicitly move them.
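One way to confirm that a running Devbox actually matches the pins in your Dockerfile is a small check script that the agent (or CI) can run first. The sketch below is an illustration, not part of the Devbox tooling: the `EXPECTED` pins and the helper names are assumptions chosen for this example.

```python
import re
import shutil
import subprocess
from typing import Optional

# Hypothetical pins; in practice these would mirror what your Dockerfile installs.
EXPECTED = {"protoc": "3.21", "go": "1.22"}

# `go` reports its version via `go version` rather than `--version`.
VERSION_FLAGS = {"go": "version"}


def extract_version(output: str) -> Optional[str]:
    """Return the first dotted version number found in a tool's version output."""
    match = re.search(r"\d+(?:\.\d+)+", output)
    return match.group(0) if match else None


def matches_pin(found: Optional[str], pin: str) -> bool:
    """True when `found` equals the pin or extends it (pin 1.22 accepts 1.22.3)."""
    return found is not None and (found == pin or found.startswith(pin + "."))


def installed_version(tool: str) -> Optional[str]:
    """Run the tool's version command, or return None if the tool is missing."""
    if shutil.which(tool) is None:
        return None
    flag = VERSION_FLAGS.get(tool, "--version")
    result = subprocess.run([tool, flag], capture_output=True, text=True, check=False)
    return extract_version(result.stdout + result.stderr)


if __name__ == "__main__":
    for tool, pin in EXPECTED.items():
        found = installed_version(tool)
        status = "ok" if matches_pin(found, pin) else "MISMATCH"
        print(f"{tool}: expected {pin}, found {found} [{status}]")
```

Running the same script in CI and in the Devbox gives a quick signal when the two environments have drifted apart.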

Private network access via Tailscale

When a Devbox joins your tailnet, it becomes a node on your private network with the same access as any other authorized device. The agent can reach your internal services by hostname, subject to your existing ACL rules.

The connection is established automatically at boot using a short-lived OIDC token. No stored credentials, no auth keys, no manual configuration per session. The Devbox appears as a normal node in your Tailscale admin console, inherits whatever tags your workspace configured, and is removed when the session ends.

This means the agent can run migrations against your internal Postgres instance, hit internal APIs, pull from private registries, and run your test suite against the infrastructure it will actually use rather than mocks that drift over time.
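Before a long task starts, it can be worth checking that the services the agent depends on are reachable from inside the Devbox. The following is a minimal sketch using only the Python standard library; the helper name `can_reach` and any hostnames you pass to it are assumptions for illustration.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_reach("db.internal.example", 5432)` (a hypothetical tailnet hostname) would report whether the Postgres port accepts connections. A TCP check only shows that the ACLs and routing allow the connection; it does not validate database credentials.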

Tags and ACLs work the same as they do for any node on your tailnet. Define what tag:agents can reach and every managed agent session gets exactly that access. Workspace admins can set a default Tailscale spec so the integration is automatic without developers needing to configure anything per session.
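As an illustration of scoping `tag:agents`, the sketch below generates a policy fragment in Tailscale's `tagOwners` and `acls` format. The destination tags and ports are hypothetical, and the fragment is meant to be merged into an existing policy file where those destination tags are already defined, not used as a complete policy.

```python
import json
from typing import Dict, List


def agent_policy(tag: str, destinations: List[str]) -> Dict:
    """Build a policy fragment that lets nodes carrying `tag` reach `destinations`."""
    return {
        # Who may assign the tag; autogroup:admin is a built-in Tailscale group.
        "tagOwners": {tag: ["autogroup:admin"]},
        "acls": [
            # Each destination is written as host-or-tag followed by :port.
            {"action": "accept", "src": [tag], "dst": destinations},
        ],
    }


# Hypothetical destinations: a tagged Postgres node and a tagged internal API.
policy = agent_policy("tag:agents", ["tag:postgres:5432", "tag:internal-api:443"])
print(json.dumps(policy, indent=2))
```

Because Tailscale ACLs deny by default, agent sessions tagged this way can reach only the listed destinations unless another rule in your policy grants more.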

Ephemeral compute by default

Each agent task runs in a fresh Devbox, provisioned on demand and torn down when the task completes. No state carried over from previous sessions, no credentials left behind, no filesystem noise from prior work.

Every run starts from the same known point: your Dockerfile-defined image, your checked-out code, a clean environment. Debugging a misbehaving agent task means looking at the task and the code, not whatever state accumulated across prior runs.

When the task ends, the environment is gone. The agent had access to whatever your configuration allowed for the duration of the task. Nothing lingers.

This also makes parallel workstreams practical. Spin up one Devbox for a refactor, another for a failing test suite, a third for a production investigation. Each is completely isolated with no branch conflicts, no port collisions, no shared filesystem.

Getting started

To connect a Namespace Devbox to Claude Managed Agents, see our step-by-step guide.