Next.js request hedging is a powerful pattern to mitigate p99 latency. Learn how to implement speculative execution in Server Components to keep your apps fast.
During a recent load test on a high-traffic dashboard, I noticed that while our median response time was solid, our p99 latency was dragging. A few requests were taking over 800ms, likely due to transient network congestion or upstream microservice blips. We were already using request memoization (see "Next.js Request Memoization: Stop Over-Fetching in Server Components") to clean up our data fetching, but that doesn't solve the "slow provider" problem.
That’s when I started looking into request hedging. If you’re not familiar with the term, request hedging is the practice of sending a second, redundant request if the first one doesn't return within a specific time threshold. The first one to resolve wins, and the other is effectively discarded.
In a standard Next.js application, you’re often at the mercy of the slowest upstream service. When you orchestrate multiple data sources within your Server Components, your total response time is constrained by the slowest dependency.
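To put a rough number on that, here is a small sketch of how tail latency compounds when a render waits on several upstream calls in parallel. It assumes the calls are independent, which real services rarely are, and `chanceOfSlowRender` is a hypothetical helper name used only for illustration.

```typescript
// Probability that at least one of `n` independent upstream calls lands in its slow tail,
// where each call is slow with probability `pSlow` (0.01 means "slower than its p99").
// Independence between calls is an assumption made for this sketch.
function chanceOfSlowRender(n: number, pSlow = 0.01): number {
  return 1 - Math.pow(1 - pSlow, n);
}

for (const n of [1, 5, 10]) {
  const percent = (chanceOfSlowRender(n) * 100).toFixed(1);
  console.log(`${n} dependencies: ${percent}% of renders wait on a tail request`);
}
```

With five dependencies, roughly 4.9% of renders wait on at least one p99-slow call, so a "one in a hundred" problem per service becomes closer to one in twenty per page.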
We initially tried wrapping our fetch calls in a simple Promise.race() block. It seemed clean, but it introduced a nasty side effect: we were hammering our internal APIs with duplicate traffic, which caused a spike in CPU usage on our database layer. We had to be more surgical.
Performance engineering isn't just about making things faster; it's about making them predictable. Implementing request hedging requires balancing throughput against latency. If you hedge too aggressively, you’ll degrade your infrastructure. If you don't hedge at all, your users suffer during peak load.
The key to safe hedging is setting a timeout that sits slightly above your p90 latency. If your p90 is 150ms and your p99 is 800ms, triggering the second request at 200ms is a safe bet. Because the threshold sits above p90, fewer than one in ten requests ever triggers a hedge, so you catch most of the "long tail" requests without wasting resources on the majority of your traffic.
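As a rough sketch of how that threshold could be derived from measured latencies rather than picked by hand: the helpers below use a nearest-rank percentile and add a fixed margin. The function names and the 50ms margin are assumptions for illustration, and the sample values are made up to match the numbers above.

```typescript
// Nearest-rank percentile over latency samples in milliseconds.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
}

// Hypothetical helper: place the hedge trigger a fixed margin above p90.
// The 50ms default is an assumption, not a recommendation.
function hedgeThreshold(samples: number[], marginMs = 50): number {
  return percentile(samples, 90) + marginMs;
}

// Illustrative samples: p90 is 150ms, with one 800ms outlier in the tail.
const latencies = [90, 100, 110, 120, 130, 140, 145, 148, 150, 800];
console.log(hedgeThreshold(latencies)); // 200
```

In practice the samples would come from your own metrics, and the margin is worth tuning against how much duplicate load your upstream can absorb.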
Here is a simplified pattern for hedging a fetch request in a Server Component:
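```typescript
async function hedgedFetch<T = unknown>(url: string, timeout = 200): Promise<T> {
  // One AbortController per attempt so the losing request can be cancelled
  const controllers: AbortController[] = [];
  const fetcher = async (): Promise<T> => {
    const controller = new AbortController();
    controllers.push(controller);
    const res = await fetch(url, { signal: controller.signal });
    if (!res.ok) throw new Error('Fetch failed');
    return res.json();
  };

  let timer: ReturnType<typeof setTimeout> | undefined;
  const primary = fetcher();
  // The secondary request only starts if nothing has won after `timeout` ms
  const secondary = new Promise<T>((resolve, reject) => {
    timer = setTimeout(() => fetcher().then(resolve, reject), timeout);
  });

  try {
    // First successful response wins; this rejects only if both attempts fail
    return await Promise.any([primary, secondary]);
  } finally {
    clearTimeout(timer);
    controllers.forEach((c) => c.abort()); // cancel whichever attempt is still in flight
  }
}
```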
This approach is primitive but effective. In a real production environment, you need to be careful with how you handle the AbortController. If you don't correctly signal the first request to stop, you might leave hanging connections, which can lead to memory leaks or port exhaustion in your Next.js server instance.
Before you blanket your codebase with this pattern, consider these risks:
Never hedge a POST or DELETE request unless your API is explicitly idempotent. You don't want to double-process a payment or delete a resource twice.
If you are dealing with complex data dependencies, you might find that batching with DataLoaders (see "Next.js Server Components: Solving N+1 Queries with DataLoaders") is a better first step. Batching is almost always cheaper than hedging.
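The idempotency risk above can be enforced in code rather than left to convention. Here is a minimal sketch of such a guard; `canHedge` and its opt-in flag are hypothetical names, and the conservative default of allowing only read-only methods is my assumption about how you would want it to behave.

```typescript
// Read-only methods that are safe to send twice.
const SAFE_METHODS = new Set(['GET', 'HEAD', 'OPTIONS']);

// Hypothetical guard: hedge read-only methods by default, and anything else only
// when the caller has explicitly confirmed that the endpoint is idempotent.
function canHedge(method: string = 'GET', explicitlyIdempotent = false): boolean {
  return explicitlyIdempotent || SAFE_METHODS.has(method.toUpperCase());
}

console.log(canHedge('GET'));        // true
console.log(canHedge('POST'));       // false
console.log(canHedge('POST', true)); // true
```

A hedging helper could call this before scheduling the secondary request and fall back to a plain fetch when it returns false.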
I wouldn't recommend hedging for every request. If your API is fast and consistent, the overhead of managing timers and abort signals isn't worth it. I typically limit hedging to:
If your architecture is complex, you might also want to read up on "React Server Components vs Client Components in Next.js" to ensure you aren't leaking sensitive data or configuration details while trying to optimize for speed.
We ended up implementing a "Hedging Manager" utility that tracks the health of our upstream services. If a service is consistently fast, the manager disables hedging automatically to save on compute. It’s a dynamic approach that avoids the "static configuration" trap.
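To give a feel for the idea, here is a stripped-down sketch of a health-tracking manager. It is not our production utility: the class shape, the rolling window, and all three default values are assumptions chosen to keep the example short.

```typescript
// Illustrative sketch: tracks recent latencies per service and only enables
// hedging when a meaningful share of recent calls were slow.
class HedgingManager {
  private samples = new Map<string, number[]>();
  private windowSize: number;
  private slowThresholdMs: number;
  private slowRatio: number;

  // All defaults are assumptions: keep 100 samples per service, treat anything over
  // 200ms as slow, and hedge once more than 1% of recent calls were slow.
  constructor(windowSize = 100, slowThresholdMs = 200, slowRatio = 0.01) {
    this.windowSize = windowSize;
    this.slowThresholdMs = slowThresholdMs;
    this.slowRatio = slowRatio;
  }

  // Record one observed latency and drop the oldest sample once the window is full.
  record(service: string, latencyMs: number): void {
    const window = this.samples.get(service) ?? [];
    window.push(latencyMs);
    if (window.length > this.windowSize) window.shift();
    this.samples.set(service, window);
  }

  // With no data yet, hedging stays on; that default is a choice made for this sketch.
  shouldHedge(service: string): boolean {
    const window = this.samples.get(service);
    if (!window || window.length === 0) return true;
    const slow = window.filter((ms) => ms > this.slowThresholdMs).length;
    return slow / window.length > this.slowRatio;
  }
}

const manager = new HedgingManager(10, 200, 0.05);
for (let i = 0; i < 10; i++) manager.record('billing', 120);
console.log(manager.shouldHedge('billing')); // false: consistently fast
manager.record('billing', 800);
console.log(manager.shouldHedge('billing')); // true: 1 of the last 10 calls was slow
```

A caller would check `shouldHedge` before scheduling a secondary request and call `record` with the observed latency once a response arrives.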
I’m still not 100% sure if we should move this logic into a middleware layer or keep it as a utility function. Middleware in Next.js is limited by the edge runtime, so complex logic there can be tricky. For now, keeping it close to the data-fetching layer in our Server Components gives us the most control, even if it makes our component code slightly more verbose.
Performance engineering is a constant game of cat and mouse. You fix one bottleneck, and another appears downstream. Next.js gives us the tools to handle these problems, but it’s up to us to use them without breaking the bank or our infrastructure.