Speeding Up Kubernetes Image Pulls with Spegel

A node scales up. Kubernetes schedules a pod onto it. The kubelet starts pulling the container image from its upstream registry. Several gigabytes later, the pod starts.

This is fine when it happens once. It is less fine when the same image was already pulled on three sibling nodes in the same cluster, on the same private network, ten seconds earlier. The local copy on a peer node is faster than anything the registry can serve, and every node is going to pull the same image eventually. That is the gap Spegel fills.

We recently turned Spegel on for our Nebius deployment. This post covers what it does, why we picked it, how we wired it into our Helm operator, and the one containerd-configuration step that bit us before it worked.

What Spegel does

Spegel runs as a DaemonSet on every node. Each instance does three things:

Watches the local containerd content store for images already pulled on this node, and announces them on a peer-to-peer network so other nodes know who has what.
Runs a local HTTP registry on the node, serving layers from containerd’s content store to other nodes that ask for them.
Writes containerd registry-mirror config (under /etc/containerd/certs.d) pointing the configured registries at localhost:5000 (Spegel) first, with the upstream as the fallback.
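
To make the third item concrete, here is a minimal Python sketch that renders a containerd hosts.toml with a local mirror listed ahead of the upstream. The function name render_hosts_toml and the example addresses are assumptions for illustration, and the file Spegel actually writes may carry additional fields.

```python
def render_hosts_toml(upstream: str, mirror: str) -> str:
    """Render a containerd hosts.toml that tries `mirror` before `upstream`.

    `server` is the fallback containerd uses when no [host] entry can
    serve the request. Names and addresses here are illustrative only.
    """
    return (
        f'server = "{upstream}"\n'
        "\n"
        f'[host."{mirror}"]\n'
        '  capabilities = ["pull", "resolve"]\n'
    )


if __name__ == "__main__":
    # One such file would live in a per-registry directory under
    # /etc/containerd/certs.d, for example certs.d/ghcr.io/hosts.toml.
    print(render_hosts_toml("https://ghcr.io", "http://localhost:5000"))
```

The ordering is the whole trick: the mirror is a [host] entry that containerd tries first, and server is where the request lands when the mirror cannot answer.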

The result: when a node needs an image whose layers are already cached on a sibling, containerd fetches those layers from the sibling instead of the upstream registry. If no peer has the image, the request falls through to the upstream as normal. There is no central server, no separate registry to operate, and nothing to garbage-collect. Spegel does not store any images of its own. It only shares what containerd already has.

The peer-to-peer layer uses libp2p Kademlia DHT for discovery and HTTP for transfer. Each layer is identified by its content digest, which is what containerd uses internally too, so there is no ambiguity about whether a peer has the exact bytes the kubelet asked for.
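
Content addressing is easy to demonstrate. The sketch below, assuming the common "sha256:<hex>" digest format, checks that a blob received from a peer is exactly the blob that was requested. The function name matches_digest is hypothetical and this is not Spegel's or containerd's code.

```python
import hashlib


def matches_digest(blob: bytes, digest: str) -> bool:
    """Return True if `blob` hashes to `digest` (format: "sha256:<hex>")."""
    algorithm, _, expected_hex = digest.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm}")
    return hashlib.sha256(blob).hexdigest() == expected_hex


if __name__ == "__main__":
    layer = b"example layer bytes"
    digest = "sha256:" + hashlib.sha256(layer).hexdigest()
    print(matches_digest(layer, digest))         # True
    print(matches_digest(layer + b"x", digest))  # False
```

Because the name of a layer is derived from its bytes, it does not matter which peer served it: a blob either matches the digest or it is rejected.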

Why we picked it

We looked at the usual options:

Option	What it is	Trade-off
Pull-through cache (Harbor, Distribution registry)	A central cache the cluster pulls through	Single bottleneck, separate service to operate, needs persistent storage
Dragonfly	Peer-to-peer image distribution with a supernode	More moving parts, needs a manager and seed peer
Spegel	Stateless P2P, one DaemonSet, no central service	Slightly less aggressive than a true CDN, no cross-cluster sharing

For our use case (one cluster, fast intra-cluster network, autoscaled nodes pulling the same handful of large GPU images), Spegel is the simplest thing that solves the problem. The fact that it has no state and no central component means there is nothing to back up, nothing to restart on failure, and nothing to size. If a Spegel pod dies, containerd transparently falls through to the upstream registry.

What it actually requires from containerd

Spegel cannot configure containerd itself. The relevant containerd changes need a containerd restart to take effect, and Spegel has no way to safely restart the container runtime it is running on top of. The Spegel docs are explicit about this.

You need two settings in /etc/containerd/config.toml:

[plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/etc/containerd/certs.d"

This tells containerd to look in /etc/containerd/certs.d for per-registry mirror configuration. Spegel writes hosts.toml files in there at runtime. Without this line, containerd ignores everything Spegel writes and just keeps pulling from the upstream.

[plugins."io.containerd.grpc.v1.cri".containerd]
  discard_unpacked_layers = false

This keeps the layer blobs in the content store after they are unpacked into snapshots. With discard_unpacked_layers = true (which some distributions ship as the default), containerd throws the blob away as soon as it is unpacked, so Spegel has nothing left to serve to peers. The disk savings are real but they defeat the entire point of running Spegel.

Most managed Kubernetes products do not set config_path for you. GKE and DigitalOcean explicitly do not. Nebius does not. EKS lets you set it via nodeadm. k0s and Talos have their own configuration mechanisms.

In our case, we handle this with a small DaemonSet that runs on every node, patches both settings into the host’s config.toml, and restarts containerd. It is idempotent (it only restarts containerd if the config actually changed), and it ships as part of our Spegel chart so it goes wherever Spegel goes.
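
The idempotency part is the piece worth getting right, because an unconditional containerd restart on every pod start would be disruptive. A minimal Python sketch of the "only act if something changed" pattern follows. It is not our actual DaemonSet script, the function name write_if_changed is hypothetical, and the restart itself is left to the caller.

```python
import tempfile
from pathlib import Path


def write_if_changed(path: Path, desired: str) -> bool:
    """Write `desired` to `path` only if it differs; return True if written."""
    current = path.read_text() if path.exists() else None
    if current == desired:
        return False  # already up to date, so no restart is needed
    path.write_text(desired)
    return True


if __name__ == "__main__":
    # Demo against a temporary file rather than the real config.toml.
    config = Path(tempfile.mkdtemp()) / "config.toml"
    for attempt in range(2):
        if write_if_changed(config, 'config_path = "/etc/containerd/certs.d"\n'):
            print("config changed: the caller would restart containerd here")
        else:
            print("config unchanged: nothing to do")
```

Running the same patch twice therefore restarts containerd at most once, which is what makes it safe to re-run on every node replacement.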

The pause:3.9 mistake

When we first deployed our containerd-prep DaemonSet, we copied a pattern from a Nebius example: the init container runs the patch, then a long-lived pause container keeps the DaemonSet pod scheduled so the patch runs again on node replacement. The original example used:

containers:
  - name: pause
    image: gcr.io/google-containers/pause:3.9

Two days later every node had an ImagePullBackOff on gcr.io/google-containers/pause:3.9. The init container had run fine and containerd was patched, but the pod kept surfacing the failed-pull event because the keepalive container could not start.

The cause is structural: gcr.io/google-containers is the legacy Google Container Registry path, which Google shut down in March 2025. The replacement path is registry.k8s.io/pause. Anyone copying an old example that hardcodes the legacy GCR path will hit this. It is genuinely confusing, because the Spegel pods look healthy, the patch ran, and the cluster is technically working, yet kubectl describe on the prep pod is a wall of failed pulls.
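
A cheap guard against this class of mistake is to scan manifests for the legacy prefix before they ship. The following is a minimal sketch using a line-based regex rather than a full YAML parser. The function name legacy_images is hypothetical, and it only looks for the one prefix discussed here.

```python
import re

LEGACY_PREFIX = "gcr.io/google-containers/"
# Matches "image: <ref>" lines, with or without a leading list dash.
IMAGE_LINE = re.compile(r"^\s*(?:-\s*)?image:\s*(\S+)", re.MULTILINE)


def legacy_images(manifest_text: str) -> list[str]:
    """Return image references in `manifest_text` that use the legacy path."""
    refs = (ref.strip("\"'") for ref in IMAGE_LINE.findall(manifest_text))
    return [ref for ref in refs if ref.startswith(LEGACY_PREFIX)]


MANIFEST = """\
containers:
  - name: pause
    image: gcr.io/google-containers/pause:3.9
  - name: fixed
    image: registry.k8s.io/pause:3.9
"""

if __name__ == "__main__":
    print(legacy_images(MANIFEST))
```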

The fix in our chart was to drop the upstream pause image entirely and use one of our own already-mirrored images for the keepalive. We had saturn-k8s-utils (an ubuntu:jammy image we already build and mirror to ECR Public) sitting right there. It has the shell and the nsenter binary the init container needs anyway, so it does double duty. One image to pull, no dependency on any external registry path.

How we wired it into our operator

Our enterprise install is driven by a single Helm operator (saturn-helm-operator) that reconciles CRs into Helm releases. Each installable component (Atlas, logging, monitoring, etc.) is a CR kind. Adding Spegel meant:

Vendoring the upstream Spegel chart under the operator’s helm-charts/spegel/ directory, the same way we vendor cert-manager.
Registering a Spegel CR kind in the operator’s watches.yaml and adding the CRD.
Adding images.spegel to the operator’s central image map. This is the canonical Saturn convention: every infrastructure image referenced by the operator lives under images: as a flat string, and a single imageMirror setting rewrites those paths through a customer’s mirror registry for air-gapped deployments. Spegel picks up that behavior for free.
Mirroring ghcr.io/spegel-org/spegel:v0.7.1 into our own registry path. Our saturn-mirror repo holds one-line Dockerfiles per third-party image (FROM ghcr.io/spegel-org/spegel:v0.7.1), and our release-images build pipeline builds and pushes them on every release.
Adding saturnComponents.spegel to the operator values, disabled by default, with a configurable mirroredRegistries list. It is enabled by default in our Nebius overrides because that is where it actually helps.

The Spegel CR the operator emits looks like this:

apiVersion: charts.saturncloud.io/v1alpha1
kind: Spegel
metadata:
  name: spegel
  namespace: spegel
spec:
  image:
    repository: <your-mirror>/spegel-org/spegel
    tag: v0.7.1
  containerdPrep:
    enabled: true
    image: <your-mirror>/saturn-k8s-utils:<tag>
  mirroredRegistries:
  - https://<registry-you-pull-from>
  - https://<another-registry-you-pull-from>

If you scope mirroredRegistries (rather than leaving it empty, which would have Spegel try to mirror every registry), Spegel only writes hosts.toml files for the listed registries. Pulls from registries not on the list go straight to the upstream as if Spegel were not installed. Scope the list to the registries your cluster actually pulls images from in volume.
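
The scoping rule can be sketched as a small predicate. This Python example assumes the common Docker convention for spotting the registry part of an image reference and treats an empty list as "mirror everything", as described above. The function names are hypothetical and this is not Spegel's own matching code.

```python
from urllib.parse import urlsplit


def image_registry(image: str) -> str:
    """Return the registry host of an image reference.

    Common Docker convention: the first path component is a registry only
    if it contains "." or ":" or is "localhost"; otherwise docker.io.
    """
    first, sep, _ = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def goes_through_spegel(image: str, mirrored_registries: list[str]) -> bool:
    """True if a pull of `image` would be offered to the mirror first."""
    if not mirrored_registries:
        return True  # an empty list means every registry is mirrored
    hosts = {urlsplit(url).netloc for url in mirrored_registries}
    return image_registry(image) in hosts


if __name__ == "__main__":
    scoped = ["https://ghcr.io"]
    print(goes_through_spegel("ghcr.io/spegel-org/spegel:v0.7.1", scoped))
    print(goes_through_spegel("quay.io/example/app:1.0", scoped))
```

With the scoped list, the ghcr.io pull is eligible for peer serving and the quay.io pull bypasses Spegel entirely.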

What changed in practice

Cold-start pull times on our larger GPU images dropped substantially once two or three nodes had pulled the image. The first node still has to fetch from the upstream, so the first scale-up event is the same as before. Every subsequent node pulls from a peer.

The other change is in failure modes. With a pull-through cache, the cache going down means every pull misses and slows down. With Spegel down, every pull falls through to the upstream registry. This is the same path that existed before Spegel was installed, so the worst case is “no faster than before,” which is a much better failure mode than “everything is slower until the cache is back.”
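
The fall-through behavior can be modeled in a few lines. This is an illustrative Python sketch of "try the mirrors in order, then the upstream", not containerd's actual resolver. The Fetcher type and all function names are assumptions.

```python
from typing import Callable, Iterable, Optional

# A fetcher returns the blob for a digest, or None if it does not have it.
Fetcher = Callable[[str], Optional[bytes]]


def fetch_with_fallback(
    digest: str, mirrors: Iterable[Fetcher], upstream: Fetcher
) -> bytes:
    """Try each mirror in order; fall back to the upstream on miss or error."""
    for fetch in mirrors:
        try:
            blob = fetch(digest)
        except OSError:
            continue  # mirror unreachable: treat it like a miss
        if blob is not None:
            return blob
    blob = upstream(digest)
    if blob is None:
        raise LookupError(f"{digest} not found upstream")
    return blob


def spegel_down(digest: str) -> Optional[bytes]:
    raise ConnectionRefusedError("mirror is not running")


def upstream_registry(digest: str) -> Optional[bytes]:
    return b"layer bytes from upstream"


if __name__ == "__main__":
    # With the mirror down, the pull still succeeds via the upstream.
    print(fetch_with_fallback("sha256:abc", [spegel_down], upstream_registry))
```

The upstream is always the last entry in the chain, so removing or breaking the mirror only removes the shortcut, never the path.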

Enable it yourself

If you run Saturn Cloud Enterprise, see the Spegel docs for the operator values to set and the prerequisites to check.

If you are just looking to run Spegel on your own cluster, the upstream Helm chart is straightforward. The two things to get right are the containerd config_path setting (do not skip this) and scoping mirroredRegistries to registries Spegel can actually help with.
