Storage
Namespace offers several types of storage to users:
- Container registry: an integrated private container registry to build and deploy container images to Namespace.
- Cache Volumes: fast local storage designed for caching purposes.
- Artifact Storage: high-performance storage for workflow artifacts.
- Bazel Cache: specialized high-performance caching for Bazel build artifacts (available upon request).
Cache Volumes
Cache Volumes can be attached (i.e., mounted into a user-specified directory) to an instance at creation time.
They act as regular disks with the following properties:
- They're backed by local NVMe storage, so you can expect high performance.
- A Cache Volume is formatted as a regular Linux filesystem (e.g., ext4), so it supports any use case that a Linux filesystem supports.
Namespace adopts a unique strategy to support multi-writer scenarios, which are typical of Continuous Integration (i.e., multiple runners want to read and write to the cache).
When an instance requests a Cache Volume, Namespace attaches a "fork" of the previous Cache Volume version; only the very first request starts with an empty volume. Each compute instance gets its own private copy of the Cache Volume, as it existed at the time of the last cache commit.
When an instance completes, its copy (fork) becomes the new parent for future forks.
Try Cache Volumes from Namespace Runners to significantly speed up your GitHub Actions runs.
Naming
Access to Cache Volumes is governed by unique tags, which any instance can use. Each tag maintains a list of Cache Volume versions. When creating a new compute instance, Namespace attempts to attach the most recent Cache Volume version to that instance.
Lifecycle
At any point in time, multiple versions of the Cache Volume may be used by different compute instances.
The first request creates the first version, used as the parent of subsequent forks until a new parent version is committed.
Version commits follow a "last write" model: whenever a compute instance terminates cleanly (e.g., a job that exits with exit code 0), the final flushed state of each Cache Volume attached to that instance is committed as the new parent version.
Whenever a compute instance fails, Namespace abandons its Cache Volume versions.
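The fork and commit rules above can be sketched as a toy in-memory model. This is only an illustration of the described behavior, not Namespace's implementation: the `CacheTag` class, its `fork` and `finish` methods, and the use of a dictionary to stand in for volume contents are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CacheTag:
    """Toy model of one cache tag: an ordered list of committed versions."""

    versions: list = field(default_factory=list)

    def fork(self) -> dict:
        # A new instance gets a private copy of the latest committed version,
        # or an empty volume if nothing has been committed yet.
        return dict(self.versions[-1]) if self.versions else {}

    def finish(self, volume: dict, exit_code: int) -> bool:
        # Clean exit: the instance's copy becomes the new parent version.
        # Failure: the copy is abandoned and the parent stays unchanged.
        if exit_code == 0:
            self.versions.append(dict(volume))
            return True
        return False


tag = CacheTag()
a = tag.fork()  # both instances start from the same (empty) parent
b = tag.fork()
a["deps"] = "v1"
b["deps"] = "v2"
b["extra"] = "x"
tag.finish(b, exit_code=0)
tag.finish(a, exit_code=0)  # finishes last, so its copy is the new parent

c = tag.fork()
c["broken"] = "y"
tag.finish(c, exit_code=1)  # failed instance: its version is abandoned

latest = tag.fork()
```

In this model, `latest` holds only what instance `a` wrote: the contents written by `b` are superseded because forks are never merged, and the contents written by `c` are dropped because it failed.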
Guarantees and non-guarantees
To ensure that we consistently offer high-performance Cache Volumes with close to zero impact on startup latency, Namespace makes a trade-off on Cache Volume hit ratio.
It maximizes the probability that an instance can obtain a fork of the previous version of a Cache Volume, but it does not guarantee it.
Thus, users should not build applications assuming that the contents of a Cache Volume match exactly the last committed version.
Sizing
When requesting a Cache Volume, you can specify a size.
When requesting x GB, Namespace provides a volume that has at least x GB free.
In the case of a cache hit (most of the time), the actual volume size is:
last used volume size + x
Example:
- you request a Cache Volume of 50GB
- your first instance starts with an empty volume (0 bytes used and 50GB free)
- you store 10GB of data with your first instance
- a new instance requests the same cache tag, again with size 50GB (e.g. same workflow)
- your second instance gets a volume of size 60GB (10GB used and 50GB free)
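The arithmetic in this example can be written as a one-line helper. The function name `provisioned_size_gb` and its parameters are hypothetical and only restate the formula above.

```python
def provisioned_size_gb(requested_gb: int, last_used_gb: int = 0) -> int:
    """Volume size on a cache hit: data already cached plus the requested free space."""
    return last_used_gb + requested_gb


first_instance = provisioned_size_gb(50)  # empty volume, nothing cached yet
second_instance = provisioned_size_gb(50, last_used_gb=10)  # 10GB cached by the first run
```

Here `first_instance` is 50 and `second_instance` is 60, matching the two instances in the example.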
What happens when my Cache Volume fills up?
Cache Volumes are optimized for correctness: Namespace always provides the cache size you ask for. Your Cache Volumes keep growing as the amount of cached data increases. Once the cached content exceeds the requested cache size, that volume version is no longer used, which leads to a cache miss.
Example:
- you request a Cache Volume of size 50GB
- Namespace finds an existing version with 20GB of cache contents
- your instance gets a fork of this volume of size 70GB (20GB used and 50GB free)
- you generate 31GB of additional cache contents (51GB used, 19GB free)
- a new instance requests the same cache tag, again with size 50GB (e.g. same workflow)
- your second instance gets an empty volume of size 50GB (cache miss)
If your Cache Volume size is larger than the amount of data written by any single instance, you will never encounter out-of-disk errors.
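The rule that decides between a fork and an empty volume can be sketched as follows. This is a simplified model under two assumptions: a previous version is skipped only when its contents strictly exceed the requested size, and nothing else affects the outcome (in practice, a hit is not guaranteed even when the contents fit). The function name `is_cache_hit` is hypothetical.

```python
from typing import Optional


def is_cache_hit(requested_gb: int, previous_used_gb: Optional[int]) -> bool:
    """Return True if the previous version would still be forked (toy rule)."""
    if previous_used_gb is None:  # no version has been committed yet
        return False
    # Assumption: a version is skipped only when its contents exceed the request.
    return previous_used_gb <= requested_gb


first_run = is_cache_hit(50, 20)  # 20GB cached, 50GB requested
second_run = is_cache_hit(50, 20 + 31)  # 51GB cached, 50GB requested
```

With the numbers from the example, `first_run` is a hit and `second_run` is a miss, so the second instance starts from an empty 50GB volume.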
Garbage Collection
Namespace employs multiple garbage collection strategies to manage the amount of cached contents and avoid cache misses due to uncontrolled growth.
We already garbage-collect cache contents for:
If you need managed garbage collection for another use case, please contact support@namespace.so.
Artifact Storage
Namespace maintains high-performance artifact storage, ideal for workflow artifacts. It seamlessly integrates with Namespace products and requires zero configuration to start using it.
Using Namespace Artifact Storage from GitHub Actions
Namespace offers seamless access from Namespace runners to Namespace Artifact Storage.
If you haven't yet migrated your workflows to Namespace runners, check out the runner documentation first.
The Artifact Storage lives close to your Namespace runners, offering higher performance and reliability than GitHub's alternative.
Access to the Artifact Storage is workspace-private, meaning that users need to be members of your Namespace workspace to access its data. Having access to the GitHub organization is not sufficient.