Mahamudul Hasan Rubel
HomeAboutProjectsSkillsExperienceBlogPhotosContact
Mahamudul Hasan Rubel

Senior Software Engineer crafting high-performance web applications and SaaS platforms.

Navigation

  • Home
  • About
  • Projects
  • Skills
  • Experience
  • Blog
  • Photos
  • Contact

Get in Touch

Available for senior/lead roles and consulting.

bd.mhrubel@gmail.comHire Me

© 2026 Mahamudul Hasan Rubel. All rights reserved.

Built with using Next.js 16 & Tailwind v4

Back to Blog
DatabasesJune 24, 20264 min read

Database performance: Tuning Buffer Pool Page Eviction Strategies

Database performance starts with your buffer pool. Learn how to tune page eviction strategies to keep hot data in memory and reduce disk I/O under load.

databasespostgresqlperformancememorycachingoptimizationMySQLRedisDatabase

We recently hit a wall with our primary PostgreSQL instance during a peak traffic window. The latency for simple indexed lookups spiked from a steady 12ms to over 450ms, and our disk I/O wait times went through the roof. It wasn't a missing index or a rogue query; it was a classic case of buffer pool thrashing.

When your working set size exceeds the memory allocated to your buffer pool, the database engine starts evicting pages to make room for new ones. If those evicted pages are still "hot," the database is forced to perform expensive disk reads to bring them back. This cycle of constant swapping destroys throughput.

Understanding the Buffer Pool Mechanics

The buffer pool acts as the primary cache for your database. It stores data pages—the actual rows and indexes—in memory so the CPU doesn't have to talk to the slow storage layer. When you request data, the engine checks the buffer pool first. If it's there (a cache hit), you're fast. If it's not (a cache miss), the engine must fetch the page from disk, which is orders of magnitude slower.

In high-concurrency environments, we often focus on database performance: Adaptive Throttling to Prevent Pool Exhaustion to keep our connection counts sane, but that only solves half the problem. Even with a perfect connection pool, if your memory management isn't aligned with your access patterns, your disk will become the bottleneck.

The Wrong Turn: Blindly Increasing RAM

Our first instinct was to simply throw more RAM at the server. We bumped the instance size and increased the shared_buffers in PostgreSQL from 8GB to 24GB. It helped for about two days. Then, the application grew, the working set expanded again, and we were right back where we started.

We learned the hard way that "more memory" is a temporary fix for poor cache locality. We needed to look at how we were evicting pages.

Most databases use a variation of the Least Recently Used (LRU) algorithm or a Clock Sweep algorithm to decide which pages to drop. If your workload is dominated by large table scans, these scans can "pollute" the buffer pool by pushing out highly useful index pages.

Strategies for Better Buffer Pool Tuning

If you're dealing with high-concurrency read workloads, you need to be surgical. Here is how we stabilized our system:

1. Separate Your Workloads

We moved our analytical, scan-heavy queries to a replica. This prevents large sequential scans from flushing the buffer pool of our primary instance, which handles the high-concurrency OLTP traffic. If you can't use a replica, consider Database performance: Asynchronous Materialized Views for High-Load Reads to shift heavy aggregate reads away from the hot path.

2. Tune the Page Eviction Parameters

In PostgreSQL, you don't have a direct "eviction algorithm" switch, but you do have controls over how aggressive the background writer is. By tuning bgwriter_delay and bgwriter_lru_maxpages, we ensured that the background writer cleans up dirty pages more frequently. This keeps the pool "fresh" so that when a new page needs to be loaded, the system isn't scrambling to flush a dirty page to disk first.

3. Monitor the Cache Hit Ratio

You should never tune blindly. Use pg_stat_database to track your cache hits vs. disk reads:

SQL
SELECT 
    sum(heap_blks_read) as disk_read, 
    sum(heap_blks_hit) as buffer_hit, 
    (sum(heap_blks_hit) - sum(heap_blks_read)) / sum(heap_blks_hit) as ratio
FROM pg_stat_io;

If your ratio drops below 95% during peak hours, you’re likely thrashing.

Beyond the Buffer Pool

Sometimes the database engine simply isn't enough. If you’ve optimized your indexes and your buffer pool is tuned, but you're still seeing high latency, you might need a secondary caching layer. We found success with Database Caching Strategies: Mastering Partitioned Keys and Eviction to handle high-frequency lookups before they even hit the database.

It’s worth noting that I’m still experimenting with effective_io_concurrency settings. While it’s meant to help with pre-fetching, setting it too high on certain storage backends can actually increase contention.

Database performance is an iterative game. You'll never get it perfectly "tuned" once and walk away. The data access patterns change, the index usage shifts, and the hardware evolves. My advice? Start by observing your hit ratios, isolate your heavy scans, and prioritize the memory your most critical indexes need to stay resident. Everything else is just noise.

Back to Blog

Similar Posts

Close-up of a smartphone showing Python code on the display, showcasing coding and technology.
DatabasesJune 20, 20264 min read

Indexing Strategy for App Developers: Stop Slow Queries

Master an indexing strategy for app developers to fix slow production queries. Learn how to read EXPLAIN plans, pick the right columns, and avoid overhead.

Read more
DatabasesJune 24, 2026
4 min read

Database performance: Throttling Queries with Row-Level Constraints

Improve database performance by implementing constraint-based throttling. Learn to protect your resources using row-level governance to stop runaway queries.

Read more
DatabasesJune 24, 20264 min read

Database Partitioning Strategies: Optimizing Your Query Execution Plans

Database partitioning is the key to high-performance scale. Learn how to optimize your execution plans and use partition pruning to slash query latency.

Read more