Learn to build a custom Prometheus exporter for Laravel Horizon to enable precise KEDA auto-scaling on Kubernetes, moving beyond basic resource limits.
When you're running heavy background tasks, scaling based on CPU or RAM is a recipe for disaster. I’ve been there: I watched a massive backlog pile up while the Kubernetes Horizontal Pod Autoscaler (HPA) sat idle because the workers were "bored" waiting for I/O. To fix this, you need to scale based on queue depth, and that requires exposing your Laravel Horizon data to the cluster.
If you haven't already looked into Scaling Laravel Queues on Kubernetes: A KEDA Implementation Guide, that’s your baseline for event-driven infrastructure. But once you move past basic triggers, you'll find that native Redis scalers often lack the granularity of Horizon’s internal metrics.
Standard metrics often miss the nuance of a failing job or a sudden spike in specific queue latency. By building a custom Prometheus exporter directly into your application, you can pull metrics from Horizon’s RedisQueue and Supervisor status.
We initially tried using standard Redis exporters, but they couldn't distinguish between a "pending" job and a "reserved" job effectively. We needed to see exactly how many jobs were waiting per queue. By building a custom exporter, we gained the ability to scale our pods based on the pending job count reported by Horizon.
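To make the pending/reserved distinction concrete, here is a minimal sketch of counting each state separately. It assumes the key layout of Laravel's Redis queue driver (a list at `queues:{name}` plus `:reserved` and `:delayed` sorted sets); the function name `queueStateCounts` is hypothetical, and the client is passed in so the logic can be exercised without a live Redis.

```php
<?php

/**
 * Count jobs per state for one queue, following the key layout of
 * Laravel's Redis queue driver.
 *
 * @param object $redis Any client exposing llen() and zcard(),
 *                      e.g. Illuminate\Support\Facades\Redis::connection().
 */
function queueStateCounts(object $redis, string $queue): array
{
    $key = "queues:{$queue}";

    return [
        'pending'  => (int) $redis->llen($key),               // list of jobs waiting to be picked up
        'reserved' => (int) $redis->zcard("{$key}:reserved"), // sorted set of jobs a worker has claimed
        'delayed'  => (int) $redis->zcard("{$key}:delayed"),  // sorted set of jobs scheduled for later
    ];
}
```

Inside the application this would be called as `queueStateCounts(Redis::connection(), 'default')`, assuming the default Redis connection is the one your queue uses. Only the `pending` figure is the backlog you want to scale on.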
You don't need a heavy package for this. A simple Artisan command running as a sidecar or a dedicated endpoint in your routes/api.php works perfectly. I prefer a dedicated endpoint to keep the logic isolated from the main application flow.
Here is a simplified version of how we extract the queue count:
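```php
// In a controller (the same logic works in an Artisan command).
// Needs this import at the top of the file:
// use Illuminate\Support\Facades\Redis;

public function export()
{
    // Queues to report. Hard-coded here for brevity; load the list from your own config.
    $queues = ['default'];

    $output = "# TYPE laravel_queue_pending_jobs gauge\n";

    foreach ($queues as $queue) {
        // Pending jobs live in the "queues:{name}" list;
        // reserved and delayed jobs sit in separate sorted sets.
        $count = Redis::connection()->llen("queues:{$queue}");
        $output .= "laravel_queue_pending_jobs{queue=\"{$queue}\"} {$count}\n";
    }

    return response($output)->header('Content-Type', 'text/plain; version=0.0.4');
}
```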
This simple text output follows the Prometheus exposition format. When Prometheus scrapes this endpoint, it now has a concrete integer representing exactly how many tasks are waiting.
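If you end up exporting more than one metric, it may help to pull the formatting into a small helper. The following is a sketch only: `renderGauge` is a hypothetical name, and it assumes every sample carries a single `queue` label. It adds the `# TYPE` line and escapes label values the way the text format expects.

```php
<?php

/**
 * Render one gauge in the Prometheus text exposition format.
 *
 * @param string             $name    Metric name.
 * @param array<string, int> $samples Map of queue name to value.
 */
function renderGauge(string $name, array $samples): string
{
    $lines = ["# TYPE {$name} gauge"];

    foreach ($samples as $queue => $value) {
        // Label values must escape backslash, double quote, and newline.
        $label = str_replace(['\\', '"', "\n"], ['\\\\', '\\"', '\\n'], (string) $queue);
        $lines[] = sprintf('%s{queue="%s"} %d', $name, $label, $value);
    }

    // The format is line-oriented, and the last line must end with a newline.
    return implode("\n", $lines) . "\n";
}

echo renderGauge('laravel_queue_pending_jobs', ['default' => 42, 'emails' => 7]);
```

For the two sample queues this prints a `# TYPE` line followed by `laravel_queue_pending_jobs{queue="default"} 42` and `laravel_queue_pending_jobs{queue="emails"} 7`.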
Once your metrics are live, you need to tell Kubernetes how to use them. This is where KEDA shines. You define a ScaledObject that points to your Prometheus instance.
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: horizon-worker-scaler
spec:
  scaleTargetRef:
    name: horizon-worker-deployment
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.monitoring.svc.cluster.local
        metricName: laravel_queue_pending_jobs
        query: sum(laravel_queue_pending_jobs{queue="default"})
        threshold: '50'
```
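With this setup, KEDA treats 50 as the target number of pending jobs per worker pod: once the queue climbs past 50, the next polling cycle adds replicas, at roughly one pod per 50 waiting jobs. It’s deterministic, reactive, and significantly more efficient than scaling based on node pressure.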
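To get a feel for the numbers: by default the Prometheus trigger hands the threshold to the HPA as a per-replica average target, so the replica count is approximately the query result divided by the threshold, rounded up. The sketch below is a simplification that ignores the HPA's tolerance band and stabilization windows; the function name and the `min`/`max` bounds of 1 and 100 are illustrative assumptions.

```php
<?php

/**
 * Approximate the replica count the HPA aims for with an AverageValue target:
 * total metric value divided by the per-replica threshold, rounded up and clamped.
 */
function desiredReplicas(int $pendingJobs, int $threshold = 50, int $min = 1, int $max = 100): int
{
    $raw = (int) ceil($pendingJobs / $threshold);

    return max($min, min($max, $raw));
}

foreach ([30, 50, 51, 420] as $pending) {
    echo $pending . ' pending -> ' . desiredReplicas($pending) . " replicas\n";
}
```

With a threshold of 50, backlogs of 30 and 50 jobs stay at one replica, 51 jobs asks for two, and 420 jobs asks for nine.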
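We made a mistake early on by setting the threshold too low. Our cluster spent about two days oscillating, spinning up pods only to kill them three minutes later. You need a buffer in your KEDA configuration to prevent "flapping." Note that KEDA's cooldownPeriod only governs the final scale-down to zero; for ordinary scale-downs, the buffer is the HPA's scale-down stabilization window, which you set under advanced.horizontalPodAutoscalerConfig.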
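One way to add that buffer, sketched against the same ScaledObject. The field names are KEDA's and the HPA's; every number here is an illustrative assumption rather than a tuned value, so adjust them to how long your jobs actually run.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: horizon-worker-scaler
spec:
  scaleTargetRef:
    name: horizon-worker-deployment
  pollingInterval: 30   # seconds between metric checks
  cooldownPeriod: 300   # only governs the final scale-down to zero replicas
  minReplicaCount: 1    # assumed floor
  maxReplicaCount: 20   # assumed ceiling
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300  # act on the highest recommendation of the last 5 minutes
          policies:
            - type: Percent
              value: 50                    # remove at most half the pods per period
              periodSeconds: 60
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.monitoring.svc.cluster.local
        query: sum(laravel_queue_pending_jobs{queue="default"})
        threshold: '50'
```

The stabilization window is what stops a brief dip in queue depth from killing pods that will be needed again a minute later. Scale-up stays immediate because only the `scaleDown` behavior is constrained.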
Also, don't rely solely on these metrics for production health. While Implementing Laravel Pulse for Real-Time Infrastructure Monitoring is great for visual debugging, KEDA needs raw numbers. Keep your monitoring and your scaling logic separate where possible.
The biggest downside to this approach is maintenance. Every time you upgrade Laravel or change your queuing architecture, you have to verify that your exporter logic still maps to the correct Redis keys. It’s not "set and forget."
If I were starting over, I’d probably look into consolidating these metrics into a shared library so multiple services can use the same exporter logic. But for a single, high-traffic application, the custom exporter approach provides the most control over your Kubernetes resources.
If you find yourself struggling with database bottlenecks during these scale-up events, remember to check your connection pooling. Sometimes the queue workers aren't the problem—it's the database connection limit. In those cases, revisit Laravel Read-Write Splitting: Deterministic Connection Routing Guide to ensure your infrastructure isn't choking on the primary instance.
We’re still tweaking the scraping interval. Currently, we’re at 15 seconds, which feels like the sweet spot between responsiveness and overhead. Any faster and we start seeing increased load on the Redis instance itself during peak hours.
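For reference, a 15-second interval can be set per job in the Prometheus configuration. This is a sketch only: the job name, target address, and metrics path are hypothetical placeholders for wherever your exporter endpoint actually lives.

```yaml
scrape_configs:
  - job_name: laravel-horizon-exporter   # hypothetical job name
    scrape_interval: 15s
    scrape_timeout: 5s
    metrics_path: /api/metrics           # assumed path for a route defined in routes/api.php
    static_configs:
      - targets:
          - laravel-app.default.svc.cluster.local:80   # hypothetical service address
```

Keep in mind that KEDA polls Prometheus on its own schedule (`pollingInterval` on the ScaledObject), so the end-to-end reaction time is the scrape interval plus the polling interval.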