Foreign key performance depends on smart indexing. Learn how to balance database write latency and read speed while maintaining strict data integrity.
Last month, I spent about two days debugging a production service that started timing out during bulk imports. The logs showed a massive spike in write latency every time we ingested a batch of user activity records. It turned out that our "perfectly normalized" schema had become a bottleneck because we indexed every single foreign key by default, assuming it was the "correct" way to handle relational schema design.
When you’re deep in the weeds of Database schema design: Implementing Shadow Columns for Soft Deletes, it’s easy to treat indexes as a set-and-forget feature. But indexes aren't free. Every time you insert a row, the database has to update the B-tree for every index on that table. If you have five foreign keys on a table, you’re effectively performing six writes—one for the data and five for the indexes—every time you touch that row.
The core of the problem is that developers often confuse "referential integrity" with "query performance." Yes, you need foreign keys to keep your data clean. Yes, you need indexes to make joins fast. But you don't always need both on every column, all the time.
When I looked at our activity_logs table, we had a user_id foreign key. We also had a category_id and a device_id. All were indexed. The user_id was queried constantly for dashboard views, so that index was essential for query optimization. However, the device_id was almost never used in a WHERE clause or a JOIN. It was just there for data integrity.
By dropping the index on device_id, we saw our write latency drop by roughly 1.8x during high-traffic windows.
When you're fine-tuning your Indexing Strategy for App Developers: Stop Slow Queries, remember that the database engine needs to lock the index pages during a write. If you have high-concurrency writes, these locks cause contention.
Here is what happens under the hood when you insert a record into a table with multiple indexed foreign keys:
If your application does frequent bulk updates, these extra index maintenance steps are what kill your throughput.
I’m not suggesting you abandon indexing. That would be reckless. Instead, apply a simple heuristic: If you don't query by it, don't index it.
If you find yourself needing to filter on those non-indexed columns later, consider Partial indexes for high-cardinality filtering: A deep dive. Partial indexes allow you to index only a subset of data, which keeps the index size small and the write overhead manageable.
If you suspect your database write latency is suffering from index bloat, don't just drop everything. Follow these steps:
EXPLAIN ANALYZE (in PostgreSQL) or EXPLAIN (in MySQL) to see if the optimizer is actually using your foreign key indexes. If you see Seq Scan on a table with a foreign key, the index is doing nothing for your reads.pg_stat_user_indexes. If idx_scan is zero or very low over a week, that index is a candidate for removal.Does dropping an index affect foreign key integrity? No. Foreign keys are a constraint enforced by the database engine at the schema level. The index is merely a performance tool that helps the engine verify that constraint faster. Dropping the index makes the verification slower, but the integrity remains intact.
When is a foreign key index absolutely mandatory?
It's mandatory when you perform frequent JOIN operations on that column or when you frequently filter results by that column. Without the index, the database is forced to perform a full table scan for every join or filter, which will eventually crash your application as the table grows.
Is there a middle ground? Yes. If you need the index for the occasional report but don't want it slowing down your high-speed ingestion, consider creating the index concurrently (in Postgres) or using a "deferred" indexing strategy if your database supports it.
I'm still cautious about dropping indexes on large production tables—one wrong move and you’re looking at a site-wide slowdown. Next time, I’d probably try creating a shadow index or running the application in a staging environment with a realistic data load before touching the production schema. It’s a constant game of tug-of-war between read speed and write throughput, and there's no single "correct" answer for every application.
Master database TTL patterns to manage expiring data efficiently. Learn how to implement schema-level TTL and optimize queries for high-performance lifecycles.