DevOps | June 21, 2026 | 3 min read

Docker-in-Docker CI Runners: Orchestrating Ephemeral Linux Environments

Docker-in-Docker CI runners are the key to clean, isolated builds. Learn how to orchestrate ephemeral Linux environments using systemd and Docker containers.

Tags: Docker, CI/CD, Linux, Systemd, DevOps, Automation

We’ve all been there: a CI build fails because a previous job left behind a stale file, a global package, or a rogue background process. I spent about two days cleaning up "zombie" containers on a shared build server before deciding enough was enough. If you’re managing your own build infrastructure, you shouldn't be dealing with persistent state.

The solution isn't to just add more memory; it's to force every job to run in a fresh, ephemeral Linux environment. By combining Docker-in-Docker (DinD) with systemd, we can create disposable build runners that vanish the moment the job finishes.

Why Docker-in-Docker for CI Runners?

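When you run a standard Docker runner, the container usually bind-mounts the host’s /var/run/docker.sock so that jobs can launch sibling containers on the host daemon. It’s convenient, but it’s a security and stability nightmare: anything that can talk to that socket effectively has root on the host, and a job that wedges or misconfigures the daemon takes every other container on the machine down with it.

Using Docker-in-Docker, the runner container gets its own isolated Docker daemon, with its own images, containers, and networks. This creates a much clearer boundary. If a build script tries to do something reckless with Docker, it only trashes its own inner daemon, not the one on your host server. One caveat: the runner still has to run with --privileged, so treat this as isolation of Docker state rather than a hardened security sandbox.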

We first tried running these as simple foreground containers inside a screen session, but that failed during a power cycle. We moved to systemd to ensure the runner stays alive, restarts on failure, and cleans up after itself.

Architecting the Ephemeral Runner with Systemd

To keep things simple, I use a lightweight Debian-based image for the runner. The goal is to start the DinD daemon as a systemd service, ensuring it’s ready to accept jobs immediately.

First, create a service file at /etc/systemd/system/ci-runner.service:


INI
[Unit]
Description=Ephemeral CI Runner with DinD
After=docker.service
Requires=docker.service

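[Service]
# Remove any leftover container from a previous run. The leading "-" tells
# systemd to ignore a non-zero exit when there is nothing to stop or remove.
ExecStartPre=-/usr/bin/docker stop ci-runner
ExecStartPre=-/usr/bin/docker rm ci-runner
# Run in the foreground (no -d) so systemd supervises the container process.
ExecStart=/usr/bin/docker run --privileged --name ci-runner \
  -v /opt/ci/data:/var/lib/docker \
  my-custom-runner:latest
# Bring the runner back 10 seconds after any exit.
Restart=always
RestartSec=10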

[Install]
WantedBy=multi-user.target

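The --privileged flag is the secret sauce here. The container still shares the host kernel, but the flag grants it the capabilities and device access the inner Docker daemon needs to mount filesystems, manage cgroups, and set up network namespaces. I’ve mapped /opt/ci/data to /var/lib/docker, which persists the inner daemon’s data directory (images, layers, and build cache) across restarts. That keeps our builds fast, while the runner’s own root filesystem is recreated from the image every time.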

The Reality of Infrastructure Automation

You might wonder if this approach is overkill for a small team. When I first looked at "GitHub Actions Self-Hosted Runners: Scaling Ephemeral Docker Containers", I thought about using GitHub’s built-in ephemeral runner features. However, running your own orchestration gives you granular control over the kernel parameters and networking that managed services often hide.

While I love systemd timers for cleanup tasks (see "Systemd Timers: The Better Way to Handle Linux Automation"), the ExecStartPre lines in the unit file are often enough to handle routine restarts. It’s a bit "brute force," but in a production environment, reliability beats elegance every time.

Lessons Learned in the Trenches

One major trade-off: storage. The inner Docker daemon accumulates images, layers, and build cache quickly, and disk usage balloons if you aren't pruning regularly. I recommend adding a secondary systemd timer that runs docker system prune -f inside the runner once a day.
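As a rough sketch of that daily prune, the pair of units below could sit next to ci-runner.service. The unit names ci-runner-prune.service and ci-runner-prune.timer are made up for illustration, and the sketch assumes the docker CLI is available inside the runner image so that docker exec can reach the inner daemon. The two files are shown in one block, separated by comments.

```ini
# --- /etc/systemd/system/ci-runner-prune.service (hypothetical name) ---
[Unit]
Description=Prune unused Docker data inside the ci-runner container
# Order after the runner; the prune simply fails if the runner is not up.
After=ci-runner.service

[Service]
Type=oneshot
# Assumes the docker CLI exists inside the runner image.
ExecStart=/usr/bin/docker exec ci-runner docker system prune -f

# --- /etc/systemd/system/ci-runner-prune.timer (hypothetical name) ---
[Unit]
Description=Daily prune of the ci-runner inner Docker daemon

[Timer]
# Fire once a day; Persistent=true catches up on a run missed while powered off.
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

After a systemctl daemon-reload, enabling the timer with systemctl enable --now ci-runner-prune.timer should be enough, and systemctl list-timers shows when it will next fire. Plain docker system prune leaves running containers and tagged images alone, so it is fairly safe to run while a build is in progress.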

Also, don't forget that if your host experiences high I/O wait times, the inner Docker daemon will struggle. We saw latency spikes of around 280ms when running multiple concurrent builds on a single spinning disk. Moving to NVMe storage for the /var/lib/docker path solved about 90% of our performance complaints.

Is This Right for You?

If you're building a massive enterprise platform, you probably want to look at "Implementing Ephemeral Environments with vcluster and Loft" to manage Kubernetes-native isolation. But for a solo dev or a small team managing a few VPS instances, this DinD-plus-systemd pattern is hard to beat. It’s predictable, it’s auditable, and it’s entirely within your control.

I’m still tinkering with the networking side of things. Sometimes the internal Docker bridge conflicts with the host bridge, requiring custom iptables rules. It’s a bit of a headache, but it’s a one-time configuration cost for a much cleaner CI/CD pipeline.
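One alternative to hand-written iptables rules, which I haven't battle-tested in this setup, is to give the inner daemon a bridge subnet that cannot collide with the host's. The drop-in below is only a sketch. The file name network.conf and the 10.200.0.1/24 range are arbitrary examples, and it assumes my-custom-runner passes trailing arguments through to dockerd the way the official docker:dind image does. If your image starts the daemon differently, the --bip flag has to go wherever dockerd is launched.

```ini
# /etc/systemd/system/ci-runner.service.d/network.conf (hypothetical drop-in)
[Service]
# An empty ExecStart= clears the command inherited from the main unit,
# so the line after it replaces the original rather than adding a second one.
ExecStart=
# Same command as before, plus --bip to pin the inner bridge address and subnet.
# Assumption: the image's entrypoint forwards this flag to dockerd.
ExecStart=/usr/bin/docker run --privileged --name ci-runner \
  -v /opt/ci/data:/var/lib/docker \
  my-custom-runner:latest --bip=10.200.0.1/24
```

The ExecStartPre cleanup lines from the main unit still apply, since a drop-in only overrides the directives it names. Run systemctl daemon-reload and restart ci-runner.service to pick it up.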

Next time, I’d probably look into using podman instead of standard Docker to avoid the --privileged requirement, but for now, this setup is keeping our builds stable and our hosts pristine.
