Mahamudul Hasan Rubel

Laravel, PHP | June 22, 2026 | 4 min read

Laravel Middleware Request Collapsing for High-Concurrency APIs

Master Laravel middleware request collapsing to solve high-concurrency bottlenecks. Learn to implement deterministic memoization and batching for faster APIs.

Tags: Laravel, PHP, Performance, Middleware, Caching, Concurrency, Backend

Last month, we hit a wall where a sudden spike in traffic caused our primary user-profile endpoint to thrash the database. We were processing the same expensive payload calculation hundreds of times per second because dozens of clients were requesting identical resources simultaneously.

Solving this required moving beyond basic caching. We needed a way to implement Laravel middleware that could detect duplicate incoming requests and "collapse" them into a single execution, returning the result to all waiters.

The Problem: The Thundering Herd

When you have a high-concurrency Laravel API, the standard approach is often just adding Cache::remember(). That works for the second request, but it does nothing for the first fifty requests that arrive at the exact same millisecond. They all see a cache miss, and they all hit the database.
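To make the cache-miss stampede concrete, here is a minimal single-process simulation. It is not Laravel code: the `$cache` array and the `expensiveQuery` function are hypothetical stand-ins for the shared cache and the database-backed calculation. It models fifty requests that all check the cache before the first one has had time to write its result.

```php
<?php

// Hypothetical stand-in for the expensive database-backed calculation.
function expensiveQuery(int &$dbHits): string
{
    $dbHits++;
    return 'profile-payload';
}

$cache = [];      // stand-in for the shared cache
$dbHits = 0;
$concurrent = 50; // requests arriving in the same instant

// Phase 1: every request checks the cache before anyone has written to it.
$misses = [];
for ($i = 0; $i < $concurrent; $i++) {
    $misses[$i] = !array_key_exists('profile:42', $cache);
}

// Phase 2: every request that saw a miss recomputes and writes the value.
foreach ($misses as $missed) {
    if ($missed) {
        $cache['profile:42'] = expensiveQuery($dbHits);
    }
}

echo "DB hits for the first wave: {$dbHits}\n";

// A later request finds the warm cache and adds no further load.
if (!array_key_exists('profile:42', $cache)) {
    expensiveQuery($dbHits);
}
echo "DB hits after a later request: {$dbHits}\n";
```

The cache only protects requests that arrive after the first write, which is the gap request collapsing is meant to close.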

I first tried to solve this with a simple Redis lock. It was a disaster. The lock contention became a new bottleneck, and the latency added by the lock overhead was worse than the original database hit. We needed something that didn't just lock, but actually waited for the result of the first ongoing request.

Implementing Request Collapsing Middleware

To achieve effective request collapsing, we need a middleware that identifies a unique request signature, checks if an identical process is already running, and if so, waits for that original process to complete.
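What counts as "identical" deserves some care. The sketch below builds a signature from plain values so it can run outside the framework; the function name `collapseKey` and all of its parameters are illustrative, not part of Laravel. Including a user identifier is an assumption on my part: for user-specific responses it keeps two different users from sharing one collapsed result.

```php
<?php

/**
 * Build a deterministic key for "the same request".
 * All names here are hypothetical; adapt them to your own request object.
 */
function collapseKey(
    string $method,
    string $path,
    array $query,
    string $body,
    ?string $userId = null
): string {
    // Sort query parameters so ordering differences do not change the key.
    ksort($query);

    $parts = [
        strtoupper($method),
        $path,
        http_build_query($query),
        hash('sha256', $body),
        $userId ?? 'guest', // assumption: responses differ per user
    ];

    return 'collapse:' . hash('sha256', implode('|', $parts));
}

$a = collapseKey('get', '/api/profile', ['b' => '2', 'a' => '1'], '', '42');
$b = collapseKey('GET', '/api/profile', ['a' => '1', 'b' => '2'], '', '42');
$c = collapseKey('GET', '/api/profile', ['a' => '1', 'b' => '2'], '', '43');

var_dump($a === $b); // same logical request
var_dump($a === $c); // different user
```

Anything that changes the response (locale headers, API version, tenant) would need to be folded into the key in the same way.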

Here is a simplified version of what we shipped:

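PHP
namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Cache;

class CollapseRequests
{
    public function handle(Request $request, Closure $next)
    {
        $key = 'collapse:' . md5($request->fullUrl() . $request->getContent());
        $lock = Cache::lock($key . ':lock', 5);

        // Lock not acquired: an identical request is already in flight, so wait for its result
        if (! $lock->get()) {
            return $this->waitForResult($key) ?? $next($request);
        }

        try {
            $response = $next($request);
            // Publish the result for waiters; Cache::put takes its TTL in whole seconds
            Cache::put($key, [$response->getContent(), $response->getStatusCode()], 1);

            return $response;
        } finally {
            $lock->release();
        }
    }

    private function waitForResult(string $key)
    {
        // Poll for the result of the original request (up to roughly 2 seconds)
        for ($attempt = 0; $attempt < 100; $attempt++) {
            // This is where you'd inject your memoized logic
            if ($hit = Cache::get($key)) {
                return response($hit[0], $hit[1]);
            }
            usleep(20000);
        }

        // Timed out: the caller falls back to running the request itself
        return null;
    }
}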
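The waiting side of the middleware boils down to polling with a deadline. Here is a framework-free sketch of that loop, assuming the lookup is passed in as a callable; `pollForResult` and its parameters are hypothetical names. In Laravel the lookup would most likely be a closure around `Cache::get($key)`.

```php
<?php

/**
 * Poll $lookup until it returns a non-null value or $timeoutMs elapses.
 * Returns null on timeout so the caller can fall back to doing the work itself.
 */
function pollForResult(callable $lookup, int $timeoutMs = 1000, int $intervalMs = 10): mixed
{
    $deadline = hrtime(true) + $timeoutMs * 1_000_000; // hrtime(true) is in nanoseconds

    do {
        $value = $lookup();
        if ($value !== null) {
            return $value;
        }
        usleep($intervalMs * 1000); // usleep takes microseconds
    } while (hrtime(true) < $deadline);

    return null;
}

// Simulated store: the in-flight request publishes its result on the third poll.
$polls = 0;
$lookup = function () use (&$polls): ?string {
    $polls++;
    return $polls >= 3 ? '{"id":42}' : null;
};

$result = pollForResult($lookup, 500, 5);
$timedOut = pollForResult(fn () => null, 30, 5);

var_dump($result);   // the published payload
var_dump($timedOut); // NULL: the caller should run the request itself
```

Returning null on timeout rather than throwing keeps a slow or crashed first request from failing every waiter; each waiter can simply run the request on its own.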

This approach works best when you combine it with content-aware batching pipelines, so that even requests that aren't strictly identical can be grouped into a single database query.
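As a rough illustration of grouping non-identical requests, the sketch below resolves several pending profile lookups with one backend call. The names `loadProfilesBatched` and `$fetchMany` are hypothetical, and the in-memory `$fakeDb` stands in for a real `WHERE id IN (...)` query.

```php
<?php

/**
 * Resolve many pending lookups with one backend call.
 *
 * @param int[]    $requestedIds one entry per waiting request (may repeat)
 * @param callable $fetchMany    hypothetical loader: fn(int[] $ids): array keyed by id
 * @return array results in the same order as $requestedIds, null when not found
 */
function loadProfilesBatched(array $requestedIds, callable $fetchMany): array
{
    $uniqueIds = array_values(array_unique($requestedIds));
    $rowsById = $fetchMany($uniqueIds); // e.g. a single "WHERE id IN (...)" query

    return array_map(
        fn (int $id) => $rowsById[$id] ?? null,
        $requestedIds
    );
}

$queries = 0;
$fakeDb = [1 => ['name' => 'Ada'], 2 => ['name' => 'Linus']];
$fetchMany = function (array $ids) use (&$queries, $fakeDb): array {
    $queries++;
    return array_intersect_key($fakeDb, array_flip($ids));
};

// Four waiting requests, three distinct IDs, one of which does not exist.
$results = loadProfilesBatched([1, 2, 1, 3], $fetchMany);

echo "Queries issued: {$queries}\n";
```

Each waiter still gets its own answer, but the database sees one query for the whole group.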

The Trade-offs of Memoization

We found that PHP memoization within the request lifecycle is only half the battle. If you're running PHP-FPM, each worker process has its own memory, and request state is discarded when the request ends. You cannot share state between workers unless you use a shared store such as Redis, or APCu for workers on the same host.
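For reference, in-process memoization can be as small as the sketch below. The `memoize` helper is a hypothetical name, not a Laravel API. Its cache is an ordinary PHP array, so it lives only in the process that created it, which is exactly the limitation described above.

```php
<?php

/**
 * Wrap a function so repeated calls with the same arguments reuse the first result.
 * The cache lives in this PHP process only; another PHP-FPM worker starts empty.
 */
function memoize(callable $fn): callable
{
    $cache = [];

    return function (...$args) use (&$cache, $fn) {
        $key = serialize($args);
        if (!array_key_exists($key, $cache)) {
            $cache[$key] = $fn(...$args);
        }
        return $cache[$key];
    };
}

$calls = 0;
$buildPayload = memoize(function (int $userId) use (&$calls): array {
    $calls++; // stands in for the expensive calculation
    return ['id' => $userId, 'score' => $userId * 10];
});

$first = $buildPayload(42);
$second = $buildPayload(42); // served from the in-process cache
$other = $buildPayload(7);   // different argument, computed again

echo "Expensive calls: {$calls}\n";
```

This helps when one request computes the same value several times, but it does nothing for fifty workers computing it once each.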

We eventually settled on a combination of:

  1. Request Collapsing: Ensuring only one worker processes a specific unique resource.
  2. Deterministic Result Storage: Storing the response for a short TTL (usually around 200ms) to satisfy "waiters."
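The second item can be pictured with a small in-memory model of a store that expires entries after a few hundred milliseconds. `ShortLivedStore` is a hypothetical class written for this illustration, and the clock is injected so expiry can be shown without sleeping. It is a sketch of the idea, not the storage layer we ran in production.

```php
<?php

/**
 * In-memory stand-in for a shared store with millisecond TTLs.
 * The clock is injected so expiry can be demonstrated deterministically.
 */
final class ShortLivedStore
{
    /** @var array<string, array{value: mixed, expiresAt: float}> */
    private array $items = [];

    /** @param Closure $nowMs returns the current time in milliseconds */
    public function __construct(private Closure $nowMs)
    {
    }

    public function put(string $key, mixed $value, int $ttlMs): void
    {
        $this->items[$key] = [
            'value' => $value,
            'expiresAt' => ($this->nowMs)() + $ttlMs,
        ];
    }

    public function get(string $key): mixed
    {
        $item = $this->items[$key] ?? null;
        if ($item === null || ($this->nowMs)() >= $item['expiresAt']) {
            unset($this->items[$key]); // drop expired entries lazily
            return null;
        }
        return $item['value'];
    }
}

$now = 1000.0; // fake clock, in milliseconds
$store = new ShortLivedStore(function () use (&$now): float {
    return $now;
});

$store->put('collapse:abc', '{"id":42}', 200);

$now = 1150.0;
$withinTtl = $store->get('collapse:abc'); // waiter arrives 150 ms later

$now = 1250.0;
$afterTtl = $store->get('collapse:abc');  // 250 ms later: expired

var_dump($withinTtl, $afterTtl);
```

In a real deployment the same idea would sit on a shared store. Redis, for instance, accepts millisecond expiries through the PX option of SET.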

For non-time-sensitive data, we moved the heavy lifting out of the request cycle entirely, using Laravel queues with Redis Lua scripts for atomic job batching.

Why This Matters for High-Concurrency

In high-concurrency scenarios, every millisecond counts. If your API is doing heavy lifting, you're effectively limited by your database connection pool. Request collapsing works much like a circuit breaker here, keeping the "thundering herd" from taking down your read replicas.

When we deployed this, we saw database CPU usage drop by about 40% during peak hours. It wasn't just about speed; it was about stability.

FAQ

Does this increase latency for the first request?

Yes, slightly. You're adding the overhead of a cache-lock check. However, for the subsequent 99 requests, you're saving the cost of a full database query, which is a massive net win.

Can I use this for POST requests?

Be very careful. Request collapsing is typically safe for GET requests. If you collapse POST requests, you might trigger duplicate side effects unless your application is strictly idempotent.

What about data staleness?

Because we use a short TTL (roughly 200 ms to 500 ms), the data is technically "stale", but only by a fraction of a second. For most dashboard or profile data, this is an acceptable trade-off for the massive throughput gain.

Final Thoughts

Next time, I'd probably look into implementing this at the Nginx level using proxy_cache_lock (or fastcgi_cache_lock when Nginx talks to PHP-FPM directly over FastCGI) to stop the traffic before it even hits the PHP-FPM pool. Handling this in Laravel is great for flexibility, but moving it closer to the edge is always the ultimate goal for scaling. Keep your middleware lean, monitor your Redis latency, and don't assume that every request needs to be unique.
