Master Laravel middleware request collapsing to solve high-concurrency bottlenecks. Learn to implement deterministic memoization and batching for faster APIs.
Last month, we hit a wall where a sudden spike in traffic caused our primary user-profile endpoint to thrash the database. We were processing the same expensive payload calculation hundreds of times per second because dozens of clients were requesting identical resources simultaneously.
Solving this required moving beyond basic caching. We needed a way to implement Laravel middleware that could detect duplicate incoming requests and "collapse" them into a single execution, returning the result to all waiters.
When you have a high-concurrency Laravel API, the standard approach is often just adding Cache::remember(). That helps once the cache is warm, but it does nothing for the first fifty requests that arrive in the same millisecond. They all see a cache miss, and they all hit the database.
I first tried to solve this with a simple Redis lock. It was a disaster. The lock contention became a new bottleneck, and the latency added by the lock overhead was worse than the original database hit. We needed something that didn't just lock, but actually waited for the result of the first ongoing request.
To achieve effective request collapsing, we need a middleware that identifies a unique request signature, checks if an identical process is already running, and if so, waits for that original process to complete.
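As a rough illustration of the "unique request signature" step, here is a minimal sketch in plain PHP with no framework dependencies. The requestSignature() helper and its parameters are assumptions made for illustration. It sorts query parameters so that two URLs differing only in parameter order map to the same key.

```php
<?php

/**
 * Build a deterministic key for a request (hypothetical helper).
 * Query parameters are sorted so their order does not change the key.
 */
function requestSignature(string $method, string $path, array $query = [], string $body = ''): string
{
    ksort($query); // normalize top-level parameter order

    $canonical = implode("\n", [
        strtoupper($method),
        $path,
        http_build_query($query),
        $body,
    ]);

    // sha256 is an illustrative choice; any stable hash works for a cache key.
    return 'collapse:' . hash('sha256', $canonical);
}

$a = requestSignature('GET', '/api/users/42', ['include' => 'teams', 'fields' => 'name']);
$b = requestSignature('get', '/api/users/42', ['fields' => 'name', 'include' => 'teams']);
$c = requestSignature('GET', '/api/users/43', ['fields' => 'name', 'include' => 'teams']);

var_dump($a === $b); // bool(true): same resource, different parameter order
var_dump($a === $c); // bool(false): different resource
```

If the response varies per user or per tenant, the signature probably needs that identity mixed in as well, otherwise one caller could receive another caller's payload.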
Here is a simplified version of what we shipped:
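```php
namespace App\Http\Middleware;

use Closure;
use Illuminate\Support\Facades\Cache;

class CollapseRequests
{
    public function handle($request, Closure $next)
    {
        $key = 'collapse:' . md5($request->fullUrl() . $request->getContent());

        // Use a cache-based wait loop: the first request takes the lock and runs the
        // pipeline; get() releases the lock afterwards and returns false to everyone else.
        $response = Cache::lock($key . ':lock', 5)->get(function () use ($next, $request, $key) {
            $response = $next($request);

            // Publish the body for waiting requests. This TTL is in whole seconds;
            // a sub-second window needs a raw Redis SET with PX instead.
            Cache::put($key, $response->getContent(), 1);

            return $response;
        });

        // If the wait times out, fall back to handling the request normally.
        return $response ?: ($this->waitForResult($key) ?? $next($request));
    }

    private function waitForResult($key)
    {
        // Poll for the result of the original request
        // This is where you'd inject your memoized logic
        for ($attempt = 0; $attempt < 100; $attempt++) {
            if (($body = Cache::get($key)) !== null) {
                return response($body); // simplified: status and headers are not preserved
            }

            usleep(20000); // wait 20 ms between polls
        }

        return null;
    }
}
```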
This approach works best when you combine it with Laravel performance optimization: building content-aware batching pipelines to ensure that even if the requests aren't strictly identical, they can be grouped into a single database query.
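To make the batching idea concrete, here is a minimal sketch in plain PHP. It assumes several pending requests each want one user row, and that a single callable can fetch many rows at once. The resolveBatch() function, the request ids, and the fake data are all hypothetical.

```php
<?php

/**
 * Group pending lookups into one query and fan the rows back out.
 *
 * @param array<string, int> $pending   request id => user id wanted
 * @param callable           $fetchMany runs ONE query for all ids, returns rows keyed by user id
 * @return array<string, array|null>    request id => row (null when not found)
 */
function resolveBatch(array $pending, callable $fetchMany): array
{
    $ids = array_values(array_unique($pending)); // dedupe identical lookups
    $rows = $ids === [] ? [] : $fetchMany($ids);

    $results = [];
    foreach ($pending as $requestId => $userId) {
        $results[$requestId] = $rows[$userId] ?? null;
    }

    return $results;
}

// Stand-in for something like "SELECT ... WHERE id IN (...)"; counts how often it runs.
$queries = 0;
$fakeDb = [7 => ['id' => 7, 'name' => 'Ada'], 9 => ['id' => 9, 'name' => 'Linus']];
$fetchMany = function (array $ids) use (&$queries, $fakeDb): array {
    $queries++;
    return array_intersect_key($fakeDb, array_flip($ids));
};

$results = resolveBatch(['req-a' => 7, 'req-b' => 9, 'req-c' => 7, 'req-d' => 11], $fetchMany);

echo $queries, "\n";                  // 1: four requests, one query
echo $results['req-c']['name'], "\n"; // Ada: req-c shares the row fetched for req-a
var_dump($results['req-d']);          // NULL: id 11 does not exist
```

The point of the sketch is the shape of the pipeline: collect, dedupe, query once, then distribute. How the pending requests are collected across workers is a separate problem that this example does not address.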
We found that PHP memoization within the request lifecycle is only half the battle. If you're running PHP-FPM, memory is isolated per request. You cannot share state between workers unless you use an external store like Redis or APCu.
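A small sketch of what in-process memoization looks like, and why it stops at the process boundary. The memoize() helper and the profile callback are hypothetical. The cache is an ordinary PHP array, so it exists only inside the worker that built it.

```php
<?php

/**
 * Wrap a function so repeated calls with the same arguments reuse the first result.
 * The cache lives in this PHP process only; another PHP-FPM worker starts empty.
 */
function memoize(callable $fn): callable
{
    $cache = [];

    return function (...$args) use (&$cache, $fn) {
        $key = serialize($args);
        if (!array_key_exists($key, $cache)) {
            $cache[$key] = $fn(...$args);
        }

        return $cache[$key];
    };
}

$calls = 0;
$buildProfile = memoize(function (int $userId) use (&$calls): array {
    $calls++; // stands in for the expensive payload calculation
    return ['id' => $userId, 'score' => $userId * 2];
});

$buildProfile(42);
$buildProfile(42);
$buildProfile(7);

echo $calls, "\n"; // 2: one computation per distinct argument
```

Fifty workers handling the same request would each run the callback once, which is exactly the gap an external store such as Redis or APCu has to close.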
We eventually moved our performance optimization strategy to use a combination of:
By using Laravel queues and Redis Lua for atomic job batching, we moved the heavy lifting out of the request cycle entirely for non-time-sensitive data.
In high-concurrency scenarios, every millisecond counts. If your API is doing heavy lifting, you’re effectively limited by your database connection pool. Request collapsing acts as a gate in front of the database, letting one request per key through and preventing the "thundering herd" from taking down your read replicas.
When we deployed this, we saw database CPU usage drop by about 40% during peak hours. It wasn't just about speed; it was about stability.
Does this increase latency for the first request? Yes, slightly. You're adding the overhead of a cache-lock check. However, for the subsequent 99 requests, you're saving the cost of a full database query, which is a massive net win.
Can I use this for POST requests? Be very careful. Request collapsing is typically safe for GET requests. If you collapse POST requests, you can silently merge two writes that were meant to be separate, or repeat a side effect when a waiter falls back to running the request itself, unless the endpoint is strictly idempotent.
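One way to enforce that rule is a guard that runs before any collapsing logic. This is a minimal sketch; isCollapsible() is a hypothetical helper, and limiting it to GET and HEAD is a conservative assumption rather than a requirement.

```php
<?php

/** Only read-only methods are eligible for collapsing (hypothetical helper). */
function isCollapsible(string $method): bool
{
    return in_array(strtoupper($method), ['GET', 'HEAD'], true);
}

foreach (['GET', 'head', 'POST', 'DELETE'] as $method) {
    echo $method, ' => ', isCollapsible($method) ? 'collapse' : 'pass through', "\n";
}
```

Inside a Laravel middleware, the request object inherits Symfony's isMethodSafe(), which offers a similar check but also treats OPTIONS and TRACE as safe.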
What about data staleness? Because we use a short TTL (around 200-500 ms), the data is technically "stale", but only by a fraction of a second. For most dashboard or profile data, this is an acceptable trade-off for the massive throughput gain.
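To show how small that staleness window is, here is a sketch of a millisecond-TTL cache with an injectable clock. ShortTtlCache is a hypothetical in-memory class written for this example; with Redis, the equivalent would be a SET with the PX option.

```php
<?php

/** Tiny TTL cache with an injectable clock (hypothetical, in-memory only). */
final class ShortTtlCache
{
    /** @var array<string, array{value: mixed, expiresAt: float}> */
    private array $items = [];

    /** @var callable returns the current time in milliseconds */
    private $clock;

    public function __construct(?callable $clock = null)
    {
        $this->clock = $clock ?? fn (): float => microtime(true) * 1000;
    }

    public function put(string $key, mixed $value, int $ttlMs): void
    {
        $this->items[$key] = ['value' => $value, 'expiresAt' => ($this->clock)() + $ttlMs];
    }

    public function get(string $key): mixed
    {
        $item = $this->items[$key] ?? null;
        if ($item === null || ($this->clock)() >= $item['expiresAt']) {
            unset($this->items[$key]);
            return null;
        }

        return $item['value'];
    }
}

// A fake clock makes the expiry boundary easy to see.
$now = 0.0;
$cache = new ShortTtlCache(function () use (&$now): float {
    return $now;
});
$cache->put('profile:42', 'v1', 300);

$now = 299.0;
$fresh = $cache->get('profile:42');   // 'v1': still inside the 300 ms window
$now = 300.0;
$expired = $cache->get('profile:42'); // null: window closed, the next request recomputes
```

With a 300 ms TTL, a reader can never see data older than 300 ms plus the time the original computation took.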
Next time, I’d probably look into implementing this at the Nginx level using proxy_cache_lock to stop the traffic before it even hits the PHP-FPM pool. Handling this in Laravel is great for flexibility, but moving it closer to the edge is always the ultimate goal for scaling. Keep your middleware lean, monitor your Redis latency, and don't assume that every request needs to be unique.