Databases | June 20, 2026 | 4 min read

Redis Caching Patterns That Prevent Stampedes in Production

Redis caching patterns that prevent stampedes are essential for scaling. Learn how to stop the thundering herd effect and keep your backend performance stable.

Tags: Redis, Caching, Backend, Performance, Database, Architecture, PostgreSQL, MySQL

Last month, our primary dashboard API started timing out every time a popular campaign went live. We were using a standard cache-aside pattern, but the moment a key expired, dozens of concurrent requests would see a cache miss and slam the database simultaneously. It wasn't just a slow query issue—it was a classic cache stampede.

If you've read "Killing N+1 queries at the database layer: A practical guide", you know that database load is the enemy of stability. When a hot cache key expires, that load spikes instantly. Here is how I handle these stampedes using robust Redis patterns.

Understanding the Stampede

A cache stampede (or "thundering herd") happens when a highly requested key expires. Multiple application threads check the cache, find it empty, and all decide to recompute the value by hitting the database at the same time.

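If your database query takes 300ms and 50 requests miss at once, the database is suddenly asked for 15 seconds of cumulative query time (50 × 300ms) in a fraction of a second. The extra load makes each query slower, so even more requests arrive and miss before the cache is repopulated. It's a vicious cycle.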
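To make that arithmetic concrete, here is a small self-contained simulation of naive cache-aside under concurrency. It is only a sketch: a plain Hash stands in for Redis, `sleep` stands in for the slow query, and the method name `simulate_naive_cache_aside` is made up for illustration.

```ruby
# Simulates naive cache-aside under concurrency. A Hash stands in for Redis
# and `sleep` stands in for the slow database query (both are assumptions).
def simulate_naive_cache_aside(concurrent_requests:, query_seconds:)
  cache = {}
  db_calls = 0
  counter_lock = Mutex.new

  threads = Array.new(concurrent_requests) do
    Thread.new do
      next cache[:dashboard] if cache.key?(:dashboard) # cache hit

      # Cache miss: every thread that gets here runs the "query" itself.
      counter_lock.synchronize { db_calls += 1 }
      sleep(query_seconds)
      cache[:dashboard] = "report"
    end
  end

  threads.each(&:join)
  db_calls
end

calls = simulate_naive_cache_aside(concurrent_requests: 50, query_seconds: 0.3)
puts "#{calls} simulated queries ran to rebuild a single key"
```

Because all the threads start before the first simulated query finishes, most or all of them see the miss and run the query themselves, which is exactly the stampede.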

The Probabilistic Early Recomputation Pattern

The most elegant fix isn't just setting longer TTLs. It’s "Probabilistic Early Recomputation." Instead of waiting for a hard expiration, you let the application decide to refresh the cache before it expires based on a probability calculation.

You store the "refresh-at" time inside the cached object itself. When a request comes in, you check if the current time is nearing the expiration. If it is, you use a probability function to decide if this specific request should trigger a background update.


RUBY
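require "json"

# Probabilistic refresh sketch. Assumes `redis` is a redis-rb style client,
# entries are JSON such as {"value": ..., "refresh_at": <epoch seconds>}, and
# update_cache(key) recomputes the value, stores it, and returns it.
def get_with_early_recomputation(key, refresh_probability: 0.1)
  raw = redis.get(key)
  return update_cache(key) if raw.nil? # hard miss: nothing stale to serve

  entry = JSON.parse(raw)
  if Time.now.to_f >= entry["refresh_at"] && rand < refresh_probability
    # Refresh in the background; this request still returns the cached value
    Thread.new { update_cache(key) }
  end

  entry["value"]
end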

Because each request triggers a refresh only with some probability, the refreshes are spread out: usually one thread (or a small handful) hits the database, while the rest keep serving slightly stale but still fast cached data.
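The read path relies on the cached entry carrying its own refresh time, so the write side has to store it. The following is a minimal sketch of that write side, not a definitive implementation. It assumes a redis-rb style client that accepts `set(key, value, ex: seconds)`. The names `write_with_refresh_at`, `ramped_refresh_probability`, and `refresh_window` are hypothetical, and the linear ramp is just one simple choice of probability function.

```ruby
require "json"

# Stores the value together with the time at which early refreshes may begin.
# `refresh_window` is the number of seconds before expiry in which requests
# are allowed to trigger a refresh (an illustrative parameter).
def write_with_refresh_at(redis, key, value, ttl:, refresh_window:, now: Time.now.to_f)
  entry = {
    "value" => value,
    "refresh_at" => now + ttl - refresh_window,
    "expires_at" => now + ttl
  }
  redis.set(key, JSON.generate(entry), ex: ttl)
  entry
end

# Probability that a given request should trigger a refresh: 0.0 before the
# window opens, rising linearly to 1.0 at expiry.
def ramped_refresh_probability(entry, now: Time.now.to_f)
  window = entry["expires_at"] - entry["refresh_at"]
  return 0.0 if now < entry["refresh_at"] || window <= 0

  [(now - entry["refresh_at"]) / window, 1.0].min
end
```

A ramp like this could stand in for a fixed threshold: early in the window almost no request refreshes, and close to expiry nearly every request would, so a refresh becomes more and more likely to have happened before the key actually expires.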

Mutex Locking: The "First One Wins" Approach

Sometimes, you can't afford any stale data. In those cases, I prefer a distributed lock built on the Redis SET command with the NX option (set only if the key does not already exist).

When a thread sees a cache miss, it attempts to acquire a lock for that specific key.

1. The first thread gets the lock and proceeds to the database.
2. The other 49 threads check the lock, fail to acquire it, and instead wait or return a default value.
3. The first thread finishes, updates Redis, and releases the lock.

It’s cleaner than it sounds, but be careful with timeouts. If your database query hangs while holding the lock, you’ll block all other requests for that key. Always set a reasonable lock TTL.
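The steps above can be sketched roughly as follows. This is an illustration under assumptions rather than a production lock: `redis` is assumed to be a redis-rb style client (`set` with `nx:`/`px:`/`ex:` options and `eval` with `keys:`/`argv:`), and `fetch_with_lock`, the `lock:` key prefix, and all defaults are hypothetical. The block passed in represents the database query and is expected to return a string.

```ruby
require "securerandom"

# Compare-and-delete, so a caller only releases a lock it still owns.
RELEASE_LOCK_SCRIPT = <<~LUA
  if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
  end
  return 0
LUA

def fetch_with_lock(redis, key, ttl:, lock_ttl_ms: 5_000, wait: 0.05,
                    max_attempts: 40, default: nil)
  lock_key = "lock:#{key}"

  max_attempts.times do
    cached = redis.get(key)
    return cached if cached

    token = SecureRandom.uuid
    # SET <lock_key> <token> NX PX <ms>: only one caller gets `true`, and the
    # lock expires on its own if that caller hangs or crashes.
    if redis.set(lock_key, token, nx: true, px: lock_ttl_ms)
      begin
        # Re-check: someone may have filled the cache between our miss and
        # our lock acquisition.
        value = redis.get(key)
        unless value
          value = yield # the database query
          redis.set(key, value, ex: ttl)
        end
        return value
      ensure
        redis.eval(RELEASE_LOCK_SCRIPT, keys: [lock_key], argv: [token])
      end
    end

    sleep(wait) # lost the race: wait for the winner to fill the cache
  end

  default # gave up waiting; the caller decides what a default means
end
```

Two design choices are worth noting. The random token means a thread whose lock already expired cannot delete a lock that now belongs to someone else. The second cache read after acquiring the lock closes the gap where a late arrival would otherwise repeat the query the winner just finished.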

Don't Forget Your Indexing Strategy

While caching is a great shield, it isn't a substitute for a solid indexing strategy (see "Indexing strategy for app developers: Stop slow queries"). If your database query is fundamentally slow, no amount of caching will save you when the cache does need to be populated.

I’ve seen engineers try to "cache away" performance problems caused by missing indexes. It works until the cache is cleared, and then the site goes down. Always ensure your database can handle the "cold start" before you rely on Redis to mask the latency.

Best Practices for Cache TTLs

One mistake I see often is giving every key the same static, long-lived TTL. If a site-wide event populates thousands of keys at once, you end up with "synchronized expiration," where they all expire at the exact same time.

Jitter is your friend: Always add a random offset to your TTLs (e.g., base_ttl + rand(0..300) seconds). This ensures that keys expire at different times, effectively smoothing out the load on your database.
Warm-up strategies: If you know a key is going to be requested heavily, pre-warm it before the traffic hits.
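Both tips fit in a few lines. This is a minimal sketch that assumes a redis-rb style client accepting `set(key, value, ex: seconds)`; the helper names `jittered_ttl` and `warm_cache` are made up for illustration.

```ruby
# Returns a TTL in seconds: the base plus a random offset in 0..max_jitter.
def jittered_ttl(base_ttl, max_jitter: 300)
  base_ttl + rand(0..max_jitter)
end

# Pre-warms a set of keys before traffic arrives. `entries` is a Hash of
# key => already computed value; each key gets its own jittered TTL so the
# warmed keys do not all expire together later.
def warm_cache(redis, entries, base_ttl:, max_jitter: 300)
  entries.each do |key, value|
    redis.set(key, value, ex: jittered_ttl(base_ttl, max_jitter: max_jitter))
  end
end
```

With a one-hour base and the default jitter, each key lands somewhere between 3600 and 3900 seconds, so expirations are spread over a five-minute window instead of a single instant.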

When to use what?

Pattern	Complexity	Best For
Probabilistic	High	High-traffic keys with tolerable staleness
Mutex (Locking)	Medium	Keys that require strict consistency
Jittered TTL	Low	General purpose caching

I'm still experimenting with using Redis Streams to handle cache invalidation more gracefully for microservices. There's a constant trade-off between how much protection your cache logic provides and how simple the code stays.

What I’ve learned is that you should start with simple jittered TTLs. If you still see spikes in your monitoring tools, move to a Mutex lock. Only reach for Probabilistic Recomputation when you’re dealing with massive scale where even a millisecond of lock contention is too much.

Don't over-engineer your cache until you have the metrics to prove it's the bottleneck.
