Laravel, PHP | June 22, 2026 | 3 min read

Laravel Horizon Auto-scaling: Custom Prometheus Metrics for KEDA

Learn to build a custom Prometheus exporter for Laravel Horizon to enable precise KEDA auto-scaling on Kubernetes, moving beyond basic resource limits.

Tags: Laravel, Kubernetes, Prometheus, KEDA, DevOps, PHP, Backend

When you're running heavy background tasks, scaling on CPU or RAM alone is a recipe for disaster. I’ve been there: I watched a massive backlog pile up while the Kubernetes Horizontal Pod Autoscaler (HPA) sat idle because the workers were "bored", blocked on I/O and barely touching the CPU. To fix this, you need to scale on queue depth, and that requires exposing your Laravel Horizon data to the cluster.

If you haven't already read "Scaling Laravel Queues on Kubernetes: A KEDA Implementation Guide", that’s your baseline for event-driven infrastructure. But once you move past basic triggers, you'll find that native Redis scalers often lack the granularity of Horizon’s internal metrics.

Why Custom Metrics are Necessary

Standard metrics often miss the nuance of a failing job or a sudden spike in specific queue latency. By building a custom Prometheus exporter directly into your application, you can pull metrics from Horizon’s RedisQueue and Supervisor status.

We initially tried using standard Redis exporters, but they couldn't distinguish between a "pending" job and a "reserved" job effectively. We needed to see exactly how many jobs were waiting per queue. By building a custom exporter, we gained the ability to scale our pods based on the pending job count reported by Horizon.
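To make the pending versus reserved distinction concrete, here is a minimal sketch of counting the three job states separately. It assumes the key layout of Laravel's Redis queue driver: a list at queues:{name} for pending jobs, and sorted sets at queues:{name}:delayed and queues:{name}:reserved. The function name queueStateCounts and the injected callables are hypothetical, introduced so the sketch runs without a Redis server.

```php
<?php

/**
 * Count jobs per state for one queue.
 *
 * $llen and $zcard are injected so the sketch runs without Redis; in a
 * Laravel app they would wrap Redis::connection()->llen() and ->zcard().
 */
function queueStateCounts(string $queue, callable $llen, callable $zcard): array
{
    return [
        'pending'  => (int) $llen("queues:{$queue}"),
        'delayed'  => (int) $zcard("queues:{$queue}:delayed"),
        'reserved' => (int) $zcard("queues:{$queue}:reserved"),
    ];
}

// Stand-in data replacing a real Redis server (values are made up).
$fake = [
    'queues:default'          => 120,
    'queues:default:delayed'  => 4,
    'queues:default:reserved' => 8,
];
$lookup = fn (string $key): int => $fake[$key] ?? 0;

$counts = queueStateCounts('default', $lookup, $lookup);
```

Only the pending figure should drive scaling. Reserved jobs are already being worked on, so counting them would ask for pods to handle work that is in progress.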

Exposing Laravel Horizon Metrics

You don't need a heavy package for this. A simple Artisan command running as a sidecar or a dedicated endpoint in your routes/api.php works perfectly. I prefer a dedicated endpoint to keep the logic isolated from the main application flow.

Here is a simplified version of how we extract the queue count:


PHP
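// In a controller; needs `use Illuminate\Support\Facades\Redis;` at the top of the file
public function export()
{
    // Queues to report. Hard-coded here; keep in sync with config/horizon.php.
    $queues = ['default'];

    $output = "# TYPE laravel_queue_pending_jobs gauge\n";

    foreach ($queues as $queue) {
        // Pending jobs sit in the list "queues:{name}"; reserved and delayed
        // jobs are stored in separate sorted sets and are not counted here.
        // Assumes the queue uses the default Redis connection.
        $count = Redis::connection()->llen("queues:{$queue}");
        $output .= "laravel_queue_pending_jobs{queue=\"{$queue}\"} {$count}\n";
    }

    return response($output)->header('Content-Type', 'text/plain; version=0.0.4');
}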

This simple text output follows the Prometheus exposition format. When Prometheus scrapes this endpoint, it now has a concrete integer representing exactly how many tasks are waiting.
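If queue names ever come from configuration or user input, it is worth building the exposition text through a small helper rather than raw string concatenation. The sketch below is one way to do that, not the code we run in production: escapeLabelValue and renderQueueGauge are hypothetical names, and the sample values are made up. It escapes backslashes, double quotes, and newlines in label values, which is what the text format requires.

```php
<?php

/** Escape a label value for the text format: backslash, double quote, newline. */
function escapeLabelValue(string $value): string
{
    // Backslash is replaced first so the backslashes added later are not doubled.
    return str_replace(["\\", "\"", "\n"], ["\\\\", "\\\"", "\\n"], $value);
}

/** Render one gauge family. $samples maps a queue name to its integer value. */
function renderQueueGauge(string $name, string $help, array $samples): string
{
    $lines = ["# HELP {$name} {$help}", "# TYPE {$name} gauge"];

    foreach ($samples as $queue => $value) {
        $label = escapeLabelValue((string) $queue);
        $lines[] = sprintf('%s{queue="%s"} %d', $name, $label, $value);
    }

    // The format expects every line, including the last, to end with a newline.
    return implode("\n", $lines) . "\n";
}

$body = renderQueueGauge(
    'laravel_queue_pending_jobs',
    'Jobs waiting to be picked up.',
    ['default' => 120, 'emails' => 3]
);
```

The HELP text is passed through unescaped here, so keep it free of backslashes and newlines.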

Implementing KEDA Auto-scaling

Once your metrics are live, you need to tell Kubernetes how to use them. This is where KEDA shines. You define a ScaledObject that points to your Prometheus instance.


YAML
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: horizon-worker-scaler
spec:
  scaleTargetRef:
    name: horizon-worker-deployment
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-server.monitoring.svc.cluster.local
      metricName: laravel_queue_pending_jobs
      query: sum(laravel_queue_pending_jobs{queue="default"})
      threshold: '50'

With this setup, KEDA treats 50 pending jobs as the target load per worker pod: it polls the query (every 30 seconds by default) and adds pods as the backlog grows past that ratio. It’s deterministic, reactive, and significantly more efficient than scaling based on node pressure.
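As a rough model of how the threshold turns into pod counts: the Prometheus trigger's threshold acts as a target value per replica, so the desired replica count is approximately the metric divided by the threshold, rounded up. The sketch below illustrates that arithmetic only. The function name estimateReplicas and the bounds of 1 and 10 replicas are assumptions for illustration, and the real autoscaler also applies a tolerance and stabilization rules that are ignored here.

```php
<?php

/**
 * Approximate the replica count for an average-value target:
 * ceil(metric / threshold), clamped to [min, max].
 * Ignores HPA tolerance, stabilization windows, and KEDA's activation logic.
 */
function estimateReplicas(int $pendingJobs, int $threshold, int $min, int $max): int
{
    $wanted = (int) ceil($pendingJobs / $threshold);

    return max($min, min($max, $wanted));
}

$examples = [
    estimateReplicas(40, 50, 1, 10),   // less than one pod's worth of work
    estimateReplicas(51, 50, 1, 10),   // just over the threshold
    estimateReplicas(420, 50, 1, 10),  // mid-sized backlog
    estimateReplicas(5000, 50, 1, 10), // capped by the maximum
];
```

With these inputs the estimates come out as 1, 2, 9, and 10 pods.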

Lessons Learned the Hard Way

We made a mistake early on by setting the threshold too low. Our cluster spent about two days oscillating, spinning up pods only to kill them three minutes later. You need a cooldown period or a scale-down stabilization window in your KEDA configuration to prevent "flapping."
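For reference, these are the ScaledObject fields that control how quickly KEDA and the HPA it manages react. This fragment is a sketch to merge into the spec shown earlier, not our exact production config. The replica bounds, the 600-second window, and the 50 percent policy are illustrative assumptions to tune against your own job durations.

```yaml
spec:
  pollingInterval: 30     # seconds between metric checks (KEDA default)
  cooldownPeriod: 300     # seconds to wait before scaling to zero (KEDA default)
  minReplicaCount: 1      # assumption: keep one worker warm
  maxReplicaCount: 10     # assumption: illustrative cap
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          # Act on the highest recommendation seen in the last 10 minutes,
          # so a brief dip in the queue does not remove pods immediately.
          stabilizationWindowSeconds: 600
          policies:
          - type: Percent
            value: 50          # remove at most half of the pods...
            periodSeconds: 60  # ...per minute
```

Note that cooldownPeriod only governs the final step down to zero replicas. Flapping between, say, three and eight pods is controlled by the scaleDown behavior.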

Also, don't rely solely on these metrics for production health. A dashboard like the one in "Implementing Laravel Pulse for Real-Time Infrastructure Monitoring" is great for visual debugging, but KEDA needs raw numbers. Keep your monitoring and your scaling logic separate where possible.

The Trade-off

The biggest downside to this approach is maintenance. Every time you upgrade Laravel or change your queuing architecture, you have to verify that your exporter logic still maps to the correct Redis keys. It’s not "set and forget."

If I were starting over, I’d probably look into consolidating these metrics into a shared library so multiple services can use the same exporter logic. But for a single, high-traffic application, the custom exporter approach provides the most control over your Kubernetes resources.

If you find yourself struggling with database bottlenecks during these scale-up events, remember to check your connection pooling. Sometimes the queue workers aren't the problem; the database connection limit is. In those cases, revisit "Laravel Read-Write Splitting: Deterministic Connection Routing Guide" to ensure your infrastructure isn't choking on the primary instance.

We’re still tweaking the scraping interval. Currently, we’re at 15 seconds, which feels like the sweet spot between responsiveness and overhead. Any faster and we start seeing increased load on the Redis instance itself during peak hours.
