Laravel · PHP · June 23, 2026 · 3 min read

Laravel Pipelines and Redis Streams for High-Throughput Batch Processing

Master Laravel Pipelines and Redis Streams for robust batch processing. Learn to architect high-throughput systems that handle parallel data loads efficiently.

Tags: Laravel, PHP, Redis, Architecture, Performance, Backend

When you’re tasked with processing 500,000 records an hour without melting your database, the standard queue-and-pray approach usually fails. Last month, I hit this wall while refactoring a legacy reporting engine; the sheer volume of jobs was causing massive lock contention on the jobs table, and the latency was climbing over 4 seconds per batch.

We needed a more deterministic way to handle this kind of throughput. By combining the composability of Laravel Pipelines with the persistence and speed of Redis Streams, we built a system that processes data in predictable, parallel chunks.

Why Standard Queues Aren't Enough

Initially, we tried pushing every record as an individual job. It’s the "Laravel way," but at scale, the overhead of job serialization and database polling becomes a bottleneck. As I detailed in my piece on Laravel Serialization: Architecting Deterministic Payloads for High-Performance Queues, bloated payloads kill performance.

When you have thousands of small jobs, you spend more time managing the queue state than actually executing business logic. We needed a batch-first approach where the unit of work is a chunk, not an individual record.

Implementing Chunk-Based Parallelism

The secret to Laravel Pipelines is their ability to pass a "traveler" object through a series of pipes. Instead of pipes modifying a single record, we use them to process a collection of records.

Here is how we set up the core pipeline structure:


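PHP
use Closure;
use Illuminate\Support\Collection;

class BatchProcessor
{
    // Pipe: receives a whole chunk, transforms it, and passes it on.
    public function handle(Collection $data, Closure $next)
    {
        // $data is a collection of records; transform() is this pipe's
        // per-record logic (not shown here).
        $processed = $data->map(fn ($item) => $this->transform($item));

        return $next($processed);
    }
}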
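
To make the chunk-as-traveler idea concrete, here is a minimal, framework-free sketch of how pipes with the same `($data, $next)` contract compose. It is not the production code: `runChunkPipeline` and the two pipes are hypothetical names made up for illustration, and plain arrays stand in for Laravel collections.

```php
<?php

/**
 * Run $chunk through $pipes in order and return the final result.
 * Each pipe has the same ($data, $next) shape as a Laravel pipe's handle().
 */
function runChunkPipeline(array $chunk, array $pipes): array
{
    // Build the chain from the inside out so the first pipe runs first.
    $chain = array_reduce(
        array_reverse($pipes),
        fn (callable $next, callable $pipe) => fn (array $data) => $pipe($data, $next),
        fn (array $data) => $data // final destination: return the chunk as-is
    );

    return $chain($chunk);
}

// Hypothetical pipes, for illustration only. Each one works on the whole chunk.
$normalizeEmails = fn (array $data, callable $next) => $next(
    array_map(fn (array $row) => ['email' => strtolower(trim($row['email']))] + $row, $data)
);

$dropInvalid = fn (array $data, callable $next) => $next(
    array_values(array_filter($data, fn (array $row) => str_contains($row['email'], '@')))
);

$result = runChunkPipeline(
    [['id' => 1, 'email' => ' Ada@Example.com '], ['id' => 2, 'email' => 'not-an-email']],
    [$normalizeEmails, $dropInvalid]
);
```

Inside a Laravel application the same flow would normally go through `Illuminate\Pipeline\Pipeline`, along the lines of `app(Pipeline::class)->send($chunk)->through([...])->thenReturn()`, with pipe classes like the one above in place of the closures.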

To achieve true parallelism, we don't dispatch this pipeline directly. Instead, we use Redis Streams as our transport layer. Redis Streams act as an append-only log, allowing us to decouple the ingestion of data from the processing logic.

Architecting with Redis Streams

Redis Streams let multiple consumers read from the same stream, and with a consumer group each entry is delivered to only one consumer in the group, which is what makes them a good fit for horizontal scaling. We use a producer-consumer pattern where the producer chunks the incoming data into batches of 500 and pushes them onto a stream.

Producer: Reads from the source (e.g., an S3 CSV or API) and chunks the data.
Stream: Holds each chunk as a single Redis Stream entry.
Consumers: Multiple Horizon workers listen to the stream, each pulling one chunk at a time.
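
The producer side of this list can be sketched roughly as follows. This is an illustration under assumptions, not our exact code: the function names and the `chunk_id`, `count`, and `payload` field names are hypothetical, and the `$append` callable stands in for the Redis `XADD` call so the example runs without a server.

```php
<?php

/**
 * Split records into chunks and yield one stream entry (a field map) per chunk.
 * Field names are assumptions made for this sketch.
 */
function buildStreamEntries(iterable $records, int $chunkSize = 500): Generator
{
    $buffer = [];
    $index = 0;

    foreach ($records as $record) {
        $buffer[] = $record;
        if (count($buffer) === $chunkSize) {
            yield makeEntry($index++, $buffer);
            $buffer = [];
        }
    }

    // Flush the final, possibly smaller, chunk.
    if ($buffer !== []) {
        yield makeEntry($index, $buffer);
    }
}

function makeEntry(int $index, array $records): array
{
    // Stream field values are strings, so the chunk is JSON-encoded.
    return [
        'chunk_id' => (string) $index,
        'count'    => (string) count($records),
        'payload'  => json_encode($records, JSON_THROW_ON_ERROR),
    ];
}

/**
 * Publish every chunk through $append and return the number of entries written.
 */
function publishChunks(iterable $records, callable $append, int $chunkSize = 500): int
{
    $written = 0;
    foreach (buildStreamEntries($records, $chunkSize) as $fields) {
        $append($fields);
        $written++;
    }

    return $written;
}

// Demo with an in-memory array instead of Redis.
$stream = [];
$written = publishChunks(range(1, 1200), function (array $fields) use (&$stream): void {
    $stream[] = $fields;
});
```

With the phpredis extension, `$append` could be something like `fn (array $fields) => $redis->xAdd('batches', '*', $fields)`, where `batches` is a made-up stream key and `*` asks Redis to generate the entry ID.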

This approach is significantly faster than pushing one job per record: each stream entry carries 500 records, so the number of Redis round trips drops by roughly a factor of 500. If you’re struggling with job management, Laravel Queues and Fork-Join Pattern: Parallel Processing Strategies provides a great look at how to decompose these tasks effectively before they ever hit the stream.
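
On the consuming side, a worker's inner loop might look something like the sketch below. It assumes each entry carries a JSON `payload` field (a hypothetical field name) and that replies have the nested shape phpredis returns from `xReadGroup()`. The `$ack` callable stands in for `XACK`, so the example runs without a live connection.

```php
<?php

/**
 * Handle one reply shaped like phpredis' xReadGroup() result:
 *   [streamKey => [entryId => [field => value, ...], ...], ...]
 * $handleChunk receives the decoded records; $ack stands in for XACK.
 * Returns the number of chunks processed.
 */
function consumeReply(array $reply, callable $handleChunk, callable $ack): int
{
    $processed = 0;

    foreach ($reply as $streamKey => $entries) {
        foreach ($entries as $entryId => $fields) {
            $records = json_decode($fields['payload'], true, 512, JSON_THROW_ON_ERROR);

            $handleChunk($records);

            // Acknowledge only after the chunk was handled without throwing,
            // so an unacknowledged entry stays in the group's pending list.
            $ack($streamKey, $entryId);
            $processed++;
        }
    }

    return $processed;
}

// Demo with a hand-built reply instead of a live Redis connection.
$reply = ['batches' => ['1700000000000-0' => ['chunk_id' => '0', 'payload' => '[1,2,3]']]];
$seen = [];
$acked = [];

$count = consumeReply(
    $reply,
    function (array $records) use (&$seen): void { $seen = $records; },
    function (string $stream, string $id) use (&$acked): void { $acked[] = "$stream:$id"; }
);
```

In a real worker, `$reply` would come from a blocking call such as `$redis->xReadGroup('workers', 'worker-1', ['batches' => '>'], 1, 5000)` and `$ack` would wrap `$redis->xAck($stream, 'workers', [$id])`. The group, consumer, and stream names here are placeholders.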

Handling Determinism and Idempotency

One risk with parallel processing is double-processing the same data. Since we are dealing with chunks, we implement a simple hashing strategy on the chunk metadata.

We store the hash in Redis with an expiry so that if a worker crashes and restarts, or a chunk is delivered twice, we don't re-process the same chunk. This is similar to the concepts I discussed in Laravel Horizon Idempotency: Building Deterministic Redis Task Keys, where we use content-addressable keys to get effectively exactly-once processing.
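
A hashing strategy along these lines could be sketched as below. The key prefix, the metadata fields, and the one-day TTL are all assumptions for illustration. The `$claim` callable stands in for an atomic "set if absent, with expiry" call in Redis, and the demo swaps in an in-memory array.

```php
<?php

/**
 * Build a deterministic key from chunk metadata. Sorting the keys first means
 * the same metadata always hashes to the same value, whatever the field order.
 * The 'chunk:done:' prefix is an assumption for this sketch.
 */
function chunkKey(array $metadata): string
{
    ksort($metadata);

    return 'chunk:done:' . hash('sha256', json_encode($metadata, JSON_THROW_ON_ERROR));
}

/**
 * Run $process only if this chunk has not been claimed before.
 * $claim must return true only for the first caller of a given key.
 */
function processOnce(array $metadata, callable $claim, callable $process, int $ttlSeconds = 86400): bool
{
    if (! $claim(chunkKey($metadata), $ttlSeconds)) {
        return false; // already claimed by this or another worker
    }

    $process();

    return true;
}

// Demo: an in-memory stand-in for Redis (it ignores the TTL).
$store = [];
$claim = function (string $key, int $ttl) use (&$store): bool {
    if (isset($store[$key])) {
        return false;
    }
    $store[$key] = true;

    return true;
};

$runs = 0;
$work = function () use (&$runs): void { $runs++; };

// Same metadata in a different field order maps to the same key.
$first  = processOnce(['source' => 'report.csv', 'offset' => 0], $claim, $work);
$second = processOnce(['offset' => 0, 'source' => 'report.csv'], $claim, $work);
```

With phpredis, `$claim` could wrap `$redis->set($key, '1', ['nx', 'ex' => $ttl]) === true`. Whether the marker is written before processing (as here) or only after success is a design choice: claiming first avoids duplicate work, but a chunk whose worker dies mid-run stays marked until the key expires.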

The Trade-offs

This architecture isn't a silver bullet. You lose the granular "retry this specific record" functionality that standard Laravel jobs provide. If one record in a chunk of 500 fails, you have to decide whether to fail the whole chunk or move the failed record to a "dead-letter" stream.

We chose to fail the whole chunk and move it to a specific Redis list for manual inspection. It added about two days of development time to our sprint, but it saved us from the nightmare of partial state updates in our database.
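
The fail-the-whole-chunk policy can be sketched like this. Treat it as an illustration rather than our actual handler: `processOrQuarantine` is a hypothetical name, and `$quarantine` stands in for pushing onto the Redis list. The sketch does not roll back side effects of records handled before the failure. In practice that would need something like a database transaction around the loop.

```php
<?php

/**
 * Process a chunk as one unit. If any record throws, the whole chunk is
 * handed to $quarantine together with the error message, and false is returned.
 * Rolling back earlier writes is left to the caller (e.g. a DB transaction).
 */
function processOrQuarantine(array $records, callable $processRecord, callable $quarantine): bool
{
    try {
        foreach ($records as $record) {
            $processRecord($record);
        }

        return true;
    } catch (Throwable $e) {
        // Keep the full chunk so it can be inspected and replayed by hand.
        $quarantine(json_encode([
            'error'   => $e->getMessage(),
            'records' => $records,
        ], JSON_THROW_ON_ERROR));

        return false;
    }
}

// Demo: the third record fails, so the whole chunk is quarantined.
$failed = [];
$ok = processOrQuarantine(
    [1, 2, 0, 4],
    function (int $n): void {
        if ($n === 0) {
            throw new InvalidArgumentException('zero is not allowed');
        }
    },
    function (string $json) use (&$failed): void { $failed[] = $json; }
);
```

With phpredis, `$quarantine` could be `fn (string $json) => $redis->rPush('batches:failed', $json)`, where `batches:failed` is a made-up list key.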

Final Thoughts

If I were to rebuild this tomorrow, I’d probably look into using Laravel's native job batching more aggressively, but the custom Redis Stream approach gives us a level of visibility that the built-in system sometimes hides. It’s raw, it’s fast, and it’s predictable.

When you're building for high throughput, don't be afraid to bypass the abstractions if they aren't serving your requirements. Sometimes, the most resilient code is the code that stays closest to the infrastructure.
