Devboxes for Coding Agents

A coding agent needs a machine where it can clone a repository, install dependencies, run commands, and return the result. Devboxes provide Linux AMD64 machines with Docker, and the Devbox CLI gives agents a scriptable way to create, lease, access, and delete them.

This page covers the Devbox features that are most useful when an agent is doing the work: the Devbox agent skill, ephemeral Devboxes, the Pool API, command execution, file upload, and egress filtering.

Agent Skills

Namespace publishes Devbox skills in namespacelabs/agent-skills. A skill is an instruction pack for coding agents. It gives the agent a repeatable workflow for creating Devboxes, loading source code, running commands, and cleaning up after the task.

For a worked example, see Testing with Devboxes and agent skills.

Install the skill in a project when you want the repository to carry the same agent instructions for everyone. Install it globally when you want to use the skill across projects from your own machine.

Run this from the project directory:

$ npx skills add namespacelabs/agent-skills

After installation, start the agent from the repository and ask it to use Devboxes for a task. For example:

run the test suite in a devbox

For larger suites, ask the agent to split the work across multiple Devboxes:

run the tests in parallel devboxes

The exact behavior depends on the agent and the project. In a typical run, the agent creates one or more Devboxes, copies the relevant source state into them, runs the requested commands, reports the result, and deletes or releases the Devboxes it used.

Creating and Configuring Devboxes

Agents usually create Devboxes non-interactively. For a simple one-off task, the CLI flags are enough. For repeatable tasks, a spec file gives the agent a concrete configuration to generate, review, and reuse.

Ephemeral Devboxes

Ephemeral Devboxes are useful for one-off agent tasks. An agent can create an ephemeral Devbox, load source code onto it, run a build or test command, collect the result, and stop it.

The ephemeral setting ties the Devbox's instance and storage to a single run. When the Devbox stops, both are deleted, so nothing is left behind after a one-off test or build run.

Create an ephemeral Devbox from the CLI:

$ devbox create --ephemeral

Spec File Configuration

A spec file lets an agent define the Devbox name, image, size, repository, lifecycle, and network policy in one place.

devbox.yaml

name_prefix: agent-run
image: builtin:base
size: m
ephemeral: true
repository:
  url: github.com/your-org/your-repo
  ref: main

Create the Devbox from the spec file:

$ devbox create --from devbox.yaml

Use name_prefix when the agent should not choose an exact Devbox name. Namespace appends a random suffix to the prefix. This lets several agents or shards create Devboxes at the same time without colliding on names.
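As a rough sketch of the parallel case, the shell function below starts several devbox create --from calls in the background and waits for all of them. The function name create_shards and its arguments are illustrative and not part of the Devbox CLI. It assumes the spec file sets name_prefix, so the concurrent creates do not collide on names.

```shell
# Hypothetical helper: create COUNT Devboxes from the same spec in parallel.
# Relies on name_prefix in the spec so each Devbox gets a unique name.
create_shards() {
  spec=$1
  count=$2
  i=0
  pids=
  while [ "$i" -lt "$count" ]; do
    devbox create --from "$spec" &
    pids="$pids $!"
    i=$((i + 1))
  done

  # Wait for every background create and count the ones that failed.
  failed=0
  for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
  done
  [ "$failed" -eq 0 ]
}
```

Calling create_shards devbox.yaml 3 returns a non-zero status if any of the three creates fails, so the agent can stop before dispatching work.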

Dynamic Configuration with Stdin

Agents can also generate a spec dynamically and pipe it to the CLI. Pass - as the --from value to read the spec from stdin.

$ cat devbox.yaml | devbox create --from -

Stdin is parsed as YAML by default. Use --from_format for JSON or TOML.
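A minimal sketch of generating a spec at run time and passing it over stdin might look like the following. The field names are the ones from the devbox.yaml example on this page. The function names render_spec and create_from_stdin, and the choice of which fields to parameterize, are assumptions made for illustration.

```shell
# Hypothetical helper: print a YAML spec for one agent run.
# $1 is the name prefix, $2 is the git ref to check out.
render_spec() {
  prefix=$1
  ref=$2
  cat <<EOF
name_prefix: ${prefix}
image: builtin:base
size: m
ephemeral: true
repository:
  url: github.com/your-org/your-repo
  ref: ${ref}
EOF
}

# Pipe the generated spec to the CLI. "-" tells --from to read stdin,
# which is parsed as YAML by default.
create_from_stdin() {
  render_spec "$1" "$2" | devbox create --from -
}
```

For example, create_from_stdin agent-run main creates a Devbox without writing a spec file to disk.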

Egress Filtering

A Devbox can be restricted to a list of allowed outbound domains. This is useful when an agent should only reach source hosts, package registries, model APIs, or other services needed for the task.

Set network_policy.egress_domains in the Devbox spec. Prefix a domain with *. to include its subdomains. Namespace infrastructure domains are allowed so the Devbox can operate.

network_policy:
  egress_domains:
    - github.com
    - "*.githubusercontent.com"
    - registry.npmjs.org
    - api.openai.com

Omit network_policy to use the workspace default. If a task needs private services, combine domain allowlists with your existing workspace integrations and access controls.
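When an agent builds the allowlist per task, it can emit the network_policy block from a list of domains. The sketch below is one way to do that. The function name egress_policy is hypothetical. Every entry is quoted because an unquoted YAML value that starts with * is read as an alias, which is why the wildcard entry in the example above is quoted.

```shell
# Hypothetical helper: print a network_policy block for the given domains.
# Each domain is double-quoted so wildcard entries such as *.example.com
# are treated as plain strings by the YAML parser.
egress_policy() {
  printf 'network_policy:\n  egress_domains:\n'
  for domain in "$@"; do
    printf '    - "%s"\n' "$domain"
  done
}

# Example (assumes devbox.yaml does not already define network_policy):
#   { cat devbox.yaml; egress_policy github.com registry.npmjs.org; } | devbox create --from -
```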

See Egress filtering for more details, including how to monitor allowed and denied outbound requests.

Handling Secrets

Do not put tokens or passwords in prompts or spec files. Store them in the Namespace vault and reference the secret ID from the Devbox spec.

env:
  - name: GITHUB_TOKEN
    from_secret_id: <secret-id>

The secret value is injected into the Devbox environment without appearing in the spec. Use this for package registry tokens, API keys, and other task credentials.

Pool API

The Pool API lets an agent reuse warm Devboxes across tasks. A pool is identified by a tag. When the agent has work to do, it acquires a Devbox from that pool.

If an unleased Devbox already exists for the tag, devbox acquire leases it and returns its name. If none is available, Namespace creates a new Devbox, adds it to the pool, leases it, and returns the new name. When the task finishes, the agent releases the lease and the Devbox stays available for later work.

$ devbox acquire review-bot

The command prints the Devbox name and a lease ID. Release the lease when the task is done:

$ devbox release --lease <lease-id>

A typical task acquires a Devbox, prepares the repository, runs the command, and releases the lease:

devbox acquire review-bot
# Use the Devbox name and lease ID from the acquire output.

devbox ssh <name> -- git -C /workspaces/<repo> fetch --depth=1 origin "$BASE"
devbox ssh <name> -- git -C /workspaces/<repo> checkout "$BASE"
devbox upload <name> /tmp/changes.patch /tmp/changes.patch
devbox ssh <name> -- git -C /workspaces/<repo> apply /tmp/changes.patch
devbox ssh <name> -- npm test

devbox release --lease <lease-id>

Release the lease on success, failure, and cancellation so the Devbox returns to the pool.
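One way to get this behavior in a shell script is a trap. The sketch below wraps a task command so the lease is released whether the command succeeds, fails, or is interrupted. The function name run_with_lease is hypothetical. It takes the lease ID as an argument because this page does not specify the exact output format of devbox acquire.

```shell
# Hypothetical helper: run a command, then release the lease no matter how it ends.
# Usage: run_with_lease <lease-id> <command> [args...]
run_with_lease() {
  lease_id=$1
  shift
  (
    # EXIT fires when the subshell ends, after success or failure.
    trap 'devbox release --lease "$lease_id"' EXIT
    # Turn interrupts into a normal exit so the EXIT trap also runs on cancellation.
    trap 'exit 130' INT TERM
    "$@"
    exit "$?"
  )
}
```

The function returns the exit status of the wrapped command, so a call such as run_with_lease <lease-id> devbox ssh <name> -- npm test still reports test failures to the caller.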

A leased Devbox may contain state from the previous task. Before using it, reset the repository to the intended base ref and apply the current task's changes. Releasing a lease does not delete the Devbox. Use devbox delete when the pool member should be removed.
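A reset step for a pooled Devbox could be sketched as follows. The function name reset_workspace is hypothetical, and the /workspaces/<repo> path follows the examples on this page. The sketch uses git clean -fd rather than -fdx so ignored files such as dependency caches survive, which is an assumption about what a warm Devbox should keep. Use -fdx if the task needs a fully clean tree.

```shell
# Hypothetical helper: bring a pooled Devbox's checkout back to a known base ref.
# Usage: reset_workspace <devbox-name> <repo-dir-name> <base-ref>
reset_workspace() {
  name=$1
  repo=$2
  base=$3
  dir="/workspaces/$repo"

  # Drop the previous task's edits, remove its untracked files,
  # then fetch and check out the base ref. Stop at the first failure.
  devbox ssh "$name" -- git -C "$dir" reset --hard &&
    devbox ssh "$name" -- git -C "$dir" clean -fd &&
    devbox ssh "$name" -- git -C "$dir" fetch --depth=1 origin "$base" &&
    devbox ssh "$name" -- git -C "$dir" checkout "$base"
}
```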

Useful Commands for Agents

Agents usually drive Devboxes through non-interactive CLI commands. Two commands are especially useful: devbox ssh for running commands inside the Devbox, and devbox upload for copying files into it.

Run Commands with `devbox ssh`

devbox ssh runs a command in a Devbox. This is useful after the agent has acquired a lease or created an ephemeral Devbox.

devbox ssh <name> -- git -C /workspaces/<repo> fetch --depth=1 origin "$BASE"
devbox ssh <name> -- git -C /workspaces/<repo> checkout "$BASE"

The -- separates Devbox CLI arguments from the command that should run inside the Devbox.

Upload Files with `devbox upload`

devbox upload copies a local file into a Devbox. Agents commonly use it to upload a patch, a generated config file, or a test artifact.

$ devbox upload <name> /tmp/changes.patch /tmp/changes.patch

A typical source sync flow checks out a known base commit inside the Devbox, uploads a patch, and applies it there:

BASE=$(git merge-base origin/main HEAD)
git diff "$BASE" > /tmp/changes.patch

devbox ssh <name> -- git -C /workspaces/<repo> fetch --depth=1 origin "$BASE"
devbox ssh <name> -- git -C /workspaces/<repo> checkout "$BASE"
devbox upload <name> /tmp/changes.patch /tmp/changes.patch
devbox ssh <name> -- git -C /workspaces/<repo> apply /tmp/changes.patch

This example covers only tracked files, because git diff does not include untracked files. If the task depends on untracked files, upload them separately or bundle them before applying the patch.
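One possible way to bundle untracked files is to list them with git ls-files and pack them with tar, as sketched below. The function name bundle_untracked and the archive path are illustrative. The sketch assumes git and tar are available locally and that tar is also present in the Devbox image for the extraction step.

```shell
# Hypothetical helper: pack untracked, non-ignored files from a local repo.
# Usage: bundle_untracked <local-repo-dir> <archive-path>
bundle_untracked() {
  repo_dir=$1
  archive=$2
  # NUL-separated names keep paths with spaces or newlines intact.
  git -C "$repo_dir" ls-files --others --exclude-standard -z |
    tar -C "$repo_dir" --null -T - -czf "$archive"
}

# Example follow-up using the commands from this page:
#   bundle_untracked . /tmp/untracked.tgz
#   devbox upload <name> /tmp/untracked.tgz /tmp/untracked.tgz
#   devbox ssh <name> -- tar -xzf /tmp/untracked.tgz -C /workspaces/<repo>
```

Ignored files are left out on purpose, so build output and local caches are not copied into the Devbox.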

Claude Managed Agents

Claude Managed Agents can run on Namespace Devboxes. Each Claude agent session is backed by an ephemeral Devbox that is provisioned when the session starts. See the Claude integration guide for setup steps.

Devin

Devin agents can run on Namespace Devboxes. Each Devin session is backed by a fresh Devbox that is provisioned when the session starts, and the Devboxes used by Devin can run on both Linux and macOS. See the Devin on Devboxes guide for setup steps.

Best Practices

Stop ephemeral Devboxes and release leases on success, failure, and cancellation so work does not keep running after the agent is done.
Treat pooled Devboxes as warm, not clean. Reset the repository, check out the intended ref, and apply the current task's changes before running commands.
Use name_prefix for parallel work to avoid name collisions when several agents or shards create Devboxes at the same time.
Use --no_checkout or repository.disabled: true when the agent needs a sandbox but does not need the default repository.
Use from_secret_id for secrets and network_policy.egress_domains to restrict outbound access to the domains the task needs.