DevOps | June 20, 2026 | 4 min read

Zero-downtime deploy with GitHub Actions: A Practical Guide

Achieve a zero-downtime deploy with GitHub Actions using blue-green strategies. Learn how to keep your services running seamlessly during every release.

DevOps, GitHub Actions, Docker, Nginx, CI/CD, Deployment, Linux

I remember the first time I pushed a production update that caused a 40-second outage. It was a simple "stop container, pull image, start container" script. My inbox exploded, and I realized that "good enough" for local development is dangerous for production.

Achieving a zero-downtime deploy with GitHub Actions isn't about buying expensive tools. It's about changing how your load balancer handles traffic while your application restarts. Whether you're running a monolith on a single VPS or a containerized fleet, the principle is the same: never kill the old process until the new one is ready to handle traffic.

Why the "Stop-Pull-Start" Pattern Fails

When you run docker stop followed by docker run, your application is offline for the duration of the image pull and the startup sequence. Even if your app starts in 500ms, the DNS or load balancer might still be pointing at a dead socket.

We tried a simple script approach first, but it resulted in roughly 1.8 seconds of downtime per deploy. That doesn't sound like much, but it's an eternity when you're running a high-traffic API. We needed a strategy that allowed us to swap instances without dropping connections.

The Blue-Green Strategy for Containers

The most reliable way to handle this without a complex service mesh is a Blue-Green deployment. You run two versions of your app side by side, each listening on its own port.

Blue: The current, stable version.
Green: The new version being deployed.

Once the Green container passes health checks, you update your Nginx configuration to point to the new port and reload.

Implementing the Workflow

You'll need a GitHub Actions runner that can reach your server, usually via SSH. Here is how I structure the deployment step in my YAML:


YAML
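deploy:
  runs-on: ubuntu-latest
  steps:
    - name: Deploy to Server
      # Pinned to a release tag instead of the moving master branch
      uses: appleboy/ssh-action@v1.0.3
      with:
        host: ${{ secrets.HOST }}
        # Secret names below are assumptions; use whatever your repository defines
        username: ${{ secrets.USERNAME }}
        key: ${{ secrets.SSH_KEY }}
        script: |
          # Abort on the first failing command so a bad deploy never reaches the Nginx swap
          set -e
          docker pull my-app:latest
          # Remove any leftover green container from a failed run, then start the new version on a different port
          docker rm -f app-green 2>/dev/null || true
          docker run -d --name app-green -p 8081:8080 my-app:latest

          # Wait up to 60 seconds for the health check; fail the step if it never passes
          timeout 60 bash -c 'until curl --output /dev/null --silent --head --fail http://localhost:8081/health; do sleep 2; done'

          # Switch Nginx upstream to port 8081, validating the config before reloading
          # Note: this covers one blue-to-green swap; the next deploy must alternate names and ports
          sed -i 's/8080/8081/' /etc/nginx/conf.d/app.conf
          nginx -t
          nginx -s reload

          # Clean up old container
          docker stop app-blue && docker rm app-blue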

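This ensures that traffic only shifts once the health check passes. If the new container never becomes healthy, Nginx is never reloaded, so the old container keeps serving traffic. As long as the wait loop is bounded, the deployment then fails safely instead of hanging.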
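One way to bound the wait is a small retry helper. This is a minimal sketch, assuming bash on the server; the function name wait_until and the attempt counts in the usage comment are illustrative and not part of Docker, curl, or GitHub Actions.

```shell
#!/usr/bin/env bash
# wait_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Retries COMMAND until it succeeds; returns 1 once MAX_ATTEMPTS is exhausted.
# (Hypothetical helper name, introduced only for this sketch.)
wait_until() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}

# Possible usage in the deploy script (30 attempts, 2 seconds apart):
#   wait_until 30 2 curl --output /dev/null --silent --fail http://localhost:8081/health \
#     || { docker rm -f app-green; exit 1; }
```

Taking the check as a command keeps the helper independent of curl, so the same loop can wrap a TCP probe or a docker inspect call if that suits your app better.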

Refinement: Automating the Swap

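While the script above works, rewriting Nginx configs in place with sed gets messy: the substitution only works in one direction, and it will happily rewrite any other 8080 in the file. I prefer using a symlink approach or a simple Nginx map file.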
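The symlink approach could look roughly like this. It is a sketch under assumptions: the directory layout, the file names blue.conf, green.conf, and active.conf, and the helper names next_color and switch_to are all hypothetical, and your Nginx upstream block would need to include active.conf.

```shell
#!/usr/bin/env bash
# Sketch of a symlink-based upstream swap. All paths and names are assumptions:
#   <dir>/blue.conf   contains e.g. "server 127.0.0.1:8080;"
#   <dir>/green.conf  contains e.g. "server 127.0.0.1:8081;"
#   <dir>/active.conf is a symlink that the Nginx upstream block includes

# Print the color that is NOT currently live, i.e. the one to deploy next.
# Falls back to "blue" when active.conf does not exist yet.
next_color() {
  local dir=$1 current
  current=$(basename "$(readlink "$dir/active.conf")" .conf)
  if [ "$current" = "blue" ]; then echo green; else echo blue; fi
}

# Repoint active.conf at the given color's file.
switch_to() {
  local dir=$1 color=$2
  ln -sfn "$color.conf" "$dir/active.conf"
}

# Possible usage, after the new container passes its health check:
#   color=$(next_color /etc/nginx/upstreams)
#   switch_to /etc/nginx/upstreams "$color" && nginx -t && nginx -s reload
```

Because the live color is read from the symlink rather than hard-coded, the same script works for every deploy, alternating between blue and green.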

If you already run the kind of pipeline described in "CI/CD with GitHub Actions for Scalable Web Apps", adding a blue-green layer will feel like a natural evolution. Just make sure your health check endpoint is robust. Don't only check that the process is running; check that the database connection is alive and that the app can actually serve a request.

Handling Database Migrations

The biggest trap with a zero-downtime deploy with GitHub Actions is the database schema. If you change the database structure while two versions of your app are running, the older "Blue" version might crash as soon as a column or table it expects is renamed or dropped.

Always follow the "Expand and Contract" pattern:

Add new columns/tables in a way that is backward compatible.
Deploy the new code (Green).
Once stable, remove the old code and columns.
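The steps above might translate into two separate migrations, one applied before the Green deploy and one applied after Blue is gone. This is a sketch assuming PostgreSQL-style SQL and a hypothetical users table where a full_name column replaces name; the file names, the helper write_migrations, and the target directory are made up for illustration.

```shell
#!/usr/bin/env bash
# Writes the two phases as separate SQL files so they cannot be applied together by accident.
# (Hypothetical helper and schema, introduced only for this sketch.)
write_migrations() {
  local dir=$1
  mkdir -p "$dir"

  # Phase 1 (expand): run BEFORE deploying Green. Blue simply ignores the new column.
  cat > "$dir/001_expand.sql" <<'SQL'
ALTER TABLE users ADD COLUMN full_name TEXT;  -- nullable, so Blue's inserts still succeed
UPDATE users SET full_name = name WHERE full_name IS NULL;
SQL

  # Phase 2 (contract): run only AFTER Blue is removed and Green is stable.
  cat > "$dir/002_contract.sql" <<'SQL'
-- Catch rows that Blue wrote during the transition before dropping the old column
UPDATE users SET full_name = name WHERE full_name IS NULL;
ALTER TABLE users DROP COLUMN name;
SQL
}
```

Each file can then be applied with your usual database client at the matching point in the release, for example psql -f for PostgreSQL. Keeping the contract phase in its own file makes it harder to drop a column while Blue still depends on it.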

It's tempting to rush, but shipping a breaking schema change in the same step as the deploy is the fastest way to turn a simple release into a midnight incident. If you're looking for more advanced ways to manage these environments, check out the guide "Ephemeral Environments with vcluster and GitHub Actions" to test these migrations before they hit your production database.

A Note on Complexity

Is this overkill for a small project? Maybe. If you're a solo dev, a small window of downtime might be acceptable. But practicing zero-downtime techniques forces you to write better, more resilient code.

I still have days where I choose a simple restart over a full blue-green deploy because the risk of a misconfigured Nginx reload outweighs the benefit of 30 seconds of uptime. Engineering is about trade-offs, not just following the "best" pattern blindly. Start simple, monitor your health endpoints, and only add complexity when the business justifies the cost.

Frequently Asked Questions

Does this increase resource usage on the server? Yes. Two versions of your app run simultaneously during the transition, so the server needs enough RAM and CPU headroom for both containers, plus disk space for both images.

What happens if the health check fails? If the until loop has no upper bound, a container that never becomes healthy will keep it spinning, and the pipeline hangs until the job itself times out. I recommend bounding the loop with a retry counter or the timeout command so the step fails quickly and alerts you.

Can I use this for non-containerized apps? Absolutely. The logic remains: start the new process on a different port, wait for it to be ready, then update your proxy to point to the new port.
