Mahamudul Hasan Rubel
HomeAboutProjectsSkillsExperienceBlogPhotosContact
Mahamudul Hasan Rubel

Senior Software Engineer crafting high-performance web applications and SaaS platforms.

Navigation

  • Home
  • About
  • Projects
  • Skills
  • Experience
  • Blog
  • Photos
  • Contact

Get in Touch

Available for senior/lead roles and consulting.

bd.mhrubel@gmail.comHire Me

© 2026 Mahamudul Hasan Rubel. All rights reserved.

Built with using Next.js 16 & Tailwind v4

Back to Blog
DatabasesJune 21, 20264 min read

Partial indexes for high-cardinality filtering: A deep dive

Partial indexes are the secret weapon for database performance. Learn how to use them to optimize high-cardinality filtering and cut your query latency.

PostgreSQLDatabasePerformanceSQLIndexingOptimizationMySQLRedis
A diver leisurely explores a vibrant coral reef, showcasing stunning underwater photography.

Last month, we pushed a change to our reporting service that caused a spike in CPU usage on our primary Postgres instance. Our dashboard was trying to fetch "active" user logs from a table with over 50 million rows, and the query was taking roughly 280ms to return. It wasn't catastrophic, but it was slowing down our internal tools and making the UI feel sluggish. We were dealing with a classic high-cardinality filtering problem where our index was bloated with data we rarely queried.

When you're looking to speed up your database, Database indexing strategies: Mastering composite indexes for speed are usually the first place you look. But sometimes, a standard B-tree index on a high-cardinality column is just too heavy.

The Problem with Full Indexes

We initially tried adding a standard index on the status column because the queries were almost always WHERE status = 'active'. The problem was that 'active' only accounted for about 3% of the total rows in the table. The remaining 97%—'archived', 'pending', and 'failed'—were effectively dead weight in our index.

Every time we performed an INSERT or UPDATE, the database had to maintain this massive index, even though we were only ever filtering by that tiny subset. This is the hidden cost of a blind indexing strategy. The index was taking up about 1.2GB of disk space, and because it was so large, it frequently fell out of the memory cache, forcing disk I/O on every read.

Why Partial Indexes Work

Magnifying glass emphasizing the index of a book, symbolizing research and focus.

Partial indexes are a powerful form of SQL tuning that allows you to index only a subset of your data based on a WHERE clause. By defining an index that only covers the rows you actually care about, you can drastically reduce index size and improve performance.

Instead of indexing the entire status column, we dropped the existing index and created a partial one:

SQL
CREATE INDEX idx_active_users_partial 
ON user_logs (created_at) 
WHERE status = 'active';

The result was immediate. The new index size dropped from over 1GB to about 40MB. Because the index was now small enough to fit entirely in RAM, our query latency for the dashboard dropped from 280ms to under 15ms.

When to Use This Strategy

You shouldn't just apply partial indexes everywhere. They are a precise tool for specific scenarios:

  1. High-cardinality filtering: When a small percentage of your data is queried frequently.
  2. Deletion/Archive patterns: If you have a "deleted" flag, index everything except the deleted rows.
  3. Soft deletes: Similar to the above, partial indexes keep your primary queries fast by ignoring the noise of soft-deleted records.

Before you jump into implementation, make sure you understand your access patterns. If your application starts querying for 'archived' rows just as often as 'active' ones, your partial index will stop being useful. The query planner will ignore the index entirely if the WHERE clause doesn't match the index definition.

Lessons Learned

I'll admit, we made a mistake during the rollout. We initially tried to include the user_id in the index to speed up joins, but the index size ballooned again. We eventually realized that we didn't need the user_id in the index at all—the created_at column was sufficient for our sorting needs, and the database could fetch the rest from the heap.

We also had to double-check our application code. If you have a query that looks like WHERE status = ?, and you pass a variable that isn't 'active', the database will ignore your partial index and perform a sequential scan. This is a common trap. You have to ensure your query logic matches the index definition exactly.

Next time, I'd probably run a VACUUM ANALYZE immediately after creating the index on a table that large to ensure the statistics were fresh. We spent about two days chasing phantom latency issues before realizing the planner was just using an outdated histogram.

Partial indexing isn't a silver bullet, but it's a critical tool in your kit for database performance when you're working at scale. It forces you to be intentional about what you index and why.

FAQ

Does a partial index work with OR conditions? Usually, no. The query planner is quite strict. If your index is WHERE status = 'active', a query for WHERE status = 'active' OR status = 'pending' will likely bypass the index because it cannot guarantee the result set is contained within the partial index.

How does this affect write performance? It improves it. Since the database only updates the index when a row meets the criteria, you reduce the write overhead on your main tables.

Can I use partial indexes for unique constraints? Yes, and it's a great way to handle things like "only one active record per user." You can create a unique index WHERE status = 'active'.

Back to Blog

Similar Posts

Two people discussing an architectural floor plan with red markings on paper.
DatabasesJune 20, 20264 min read

Reading an EXPLAIN plan without panic: A Backend Engineer’s Guide

Reading an EXPLAIN plan is the most reliable way to optimize slow queries. Learn how to identify full table scans and index misses to save your production.

Read more
Multiple stacks of blue medical face masks arranged on a white background, emphasizing cleanliness and healthcare.
Databases
June 20, 2026
4 min read

Redis Caching Patterns That Prevent Stampedes in Production

Redis caching patterns that prevent stampedes are essential for scaling. Learn how to stop the thundering herd effect and keep your backend performance stable.

Read more
Scrabble tiles spelling 'DATA' on a wooden table with a blurred plant background.
DatabasesJune 20, 20264 min read

When to denormalize your database for production performance

Denormalize your database only when read latency becomes a bottleneck. Learn to evaluate the trade-offs between schema complexity and query speed.

Read more