Defense in depth: how we think about security at Namespace

Security is one of the things we think about most carefully at Namespace. CI runners are an attractive target: they often have access to secrets, including production secrets, and through your dependencies they pull and run arbitrary code. Getting runner security wrong has consequences that go well beyond the runner itself.

Our approach is defense in depth. The idea is simple: no single control is enough, because every control has a failure mode. An environment can be misconfigured, or a patched kernel can be bypassed. Defense in depth means layering controls so that when one fails the others hold. The goal isn't to make any single layer perfect, it's to make sure the layers don't all fail in the same way at the same time.

How we build instances

The foundation is a "share nothing" model. Every job runs in its own instance (our compute primitive), and every instance runs in a virtual machine. Two jobs from different workspaces don't share a kernel. Two jobs from the same workspace don't share a kernel. Each job gets its own VM, its own kernel, and a clean environment that is discarded when the job finishes.

We don't use distribution-packaged defaults; we maintain our own base systems, including the kernels. Our kernel configurations keep to the minimal set required for Namespace-supported use cases. We also keep them current. For example, we are rolling out Linux 7.0.1 across our fleet as we write this post.

That last point is worth being specific about. Kernel updates are operationally painful, and a lot of infrastructure falls behind as a result. We've invested in making them routine, because running current kernels is one of the layers. A layer that's perpetually out of date isn't really a layer.

Now let's take a look at how these layers held up in two recent security incidents.

CVE-2026-31431

In late April 2026, a local privilege escalation vulnerability was disclosed in the Linux kernel, affecting virtually every major distribution running kernels since 2017. CVE-2026-31431, nicknamed "Copy Fail," lets an unprivileged user write into the page cache of any file they can read. That is enough to corrupt a setuid binary and get root.

In shared-kernel environments, like a Kubernetes-based runner environment, the blast radius is wide. The page cache is shared across every process on a host, container boundaries included. A compromised job can become a compromised host.

For Namespace, jobs already run as root inside their VM. We don't treat intra-job privilege escalation as an attack vector worth defending against. What we do care about is what's reachable from inside that VM, and the answer is, nothing else. The VM is the boundary. An attacker with full control of a job's kernel can't reach another job or another tenant. The VM is discarded when the job ends.

In a shared-kernel environment, Copy Fail turns a job compromise into a host compromise. On Namespace, a job compromise is just a job compromise. The isolation model doesn't change under a kernel LPE because the kernel was already scoped to that job.

The Axios supply chain attack

In late March 2026, two backdoored versions of Axios were published to npm through a compromised maintainer account. Both versions silently pulled in a malicious dependency whose postinstall hook contacted a remote server and downloaded a platform-specific RAT. The packages were live for about three hours. Any CI pipeline that ran npm install in that window, against a project with a floating version range, was impacted.
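To make the floating-range condition concrete (the version numbers below are illustrative, not the actual compromised releases), a caret range in package.json lets a fresh npm install resolve to the newest matching version on the registry, which is exactly how a briefly-live malicious release gets picked up:

```json
{
  "dependencies": {
    "axios": "^1.7.0"
  }
}
```

A pinned version (`"axios": "1.7.0"`) or a committed lockfile that the install honors keeps the job on a known release instead of whatever happens to be newest in that three-hour window.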

This is a different class of problem. The malicious code ran legitimately inside the job through a standard npm mechanism. There was nothing in the lockfile to flag. VM isolation limits what an attacker can do once a job is compromised, but it doesn't prevent the job from being compromised in the first place.

The layer that stops this is the network. We already had egress filtering at the instance level, but the Axios incident made it clear it needed to be easier to configure directly in GitHub Actions runner profiles. When the incident was disclosed, we also used our job-level telemetry to identify which customers had potentially been impacted and notified them directly. They didn't need to go digging through their own logs.

Runner profiles can now restrict outbound network access to a defined list of domains. The minimum set required to communicate with GitHub is included by default. It's a toggle in the profile editor. A runner with egress filtering on would have let the postinstall script execute and go nowhere.
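As a sketch of the shape of such a policy (field names here are illustrative, not Namespace's actual profile schema), an egress allowlist for a runner profile might look like:

```json
{
  "egress": {
    "mode": "allowlist",
    "allowedDomains": [
      "github.com",
      "api.github.com",
      "registry.npmjs.org"
    ]
  }
}
```

With a policy like this, npm install still resolves packages from the registry, but a postinstall script's callback to an unlisted command-and-control domain goes nowhere.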

The Axios incident is also a good illustration of how defense in depth actually works in practice: it is not a static checklist but something you keep pressure-testing. The isolation model held, but the network layer had a usability gap. We saw it, and we closed it.

Summary

The two incidents are a useful pair because they show different failure modes. Copy Fail was a kernel vulnerability that would have given an attacker root on a shared host, the kind of thing that isolation and current kernels are specifically designed to contain. The Axios attack was a supply chain compromise that executed code legitimately inside a job, the kind of thing that network controls are designed to contain.

Defense in depth means having controls at each of those layers, maintained well enough that they actually hold when tested. It also means being honest when a layer needs work. The goal isn't to claim we've solved security. It's to build the kind of infrastructure where individual failures don't cascade.
