Learn to eliminate database deadlocks by mastering transaction ordering and lock granularity. Stop production downtime with these hands-on concurrency strategies.
Last month, my team spent two days chasing a phantom issue where our checkout service would periodically hang, throwing Deadlock found when trying to get lock; try restarting transaction errors. We were running PostgreSQL 15, and the logs were a mess of interleaved queries. It’s a rite of passage for every backend engineer, but it’s one you definitely want to minimize before it hits your critical path.
A deadlock happens when two or more transactions hold locks that the other needs to proceed. It’s a circular dependency. Transaction A locks Row 1 and waits for Row 2, while Transaction B locks Row 2 and waits for Row 1. The database engine eventually detects this and kills one of the transactions to break the loop.
If you’re seeing these errors, it usually means your transaction isolation levels are aggressive, or your application logic is hitting rows in a non-deterministic order. Before you start tweaking isolation levels, look at your access patterns. If you're struggling with complex multi-tenancy scenarios, you might find that multi-tenancy database schema design: Strategies for isolation and speed helps clarify your data boundaries, which often reduces the scope of potential lock conflicts.
The most effective way to prevent database deadlocks is to ensure that all transactions access shared resources in the same order. If every process updates User then Account, you eliminate the possibility of a cycle.
We once tried to fix a deadlock by introducing a random sleep, thinking it would de-sync the processes. It was a terrible idea. It didn't solve the problem; it just made the race condition harder to reproduce. Instead, we refactored our service layer to sort primary keys before executing batch updates:
PYTHONdef update_user_balances(user_ids, amount): # Always sort keys to ensure consistent lock acquisition order sorted_ids = sorted(user_ids) with transaction.atomic(): for uid in sorted_ids: # SELECT FOR UPDATE locks the row user = User.objects.select_for_update().get(pk=uid) user.balance += amount user.save()
By enforcing a deterministic order, you ensure that even if two threads try to update the same set of users, they will wait in line rather than fighting over circular dependencies.
Sometimes, the issue isn't the order; it's the lock contention created by taking locks on too many rows at once. If your transaction is too broad, you’re holding locks for longer than necessary, increasing the probability of a conflict.
We had a process that locked an entire Orders table to calculate a daily summary. When we switched to granular, row-level locks on specific status flags, the deadlock rate dropped by about 60%. If your queries are still sluggish after fixing locks, consider if database performance: Asynchronous Materialized Views for High-Load Reads could offload some of that heavy read-heavy work to a background process.
Here are a few rules of thumb for managing locks:
READ COMMITTED is the default in Postgres for a reason. Don't jump to SERIALIZABLE unless you have a very specific requirement for strict consistency.When you're in the middle of an incident, pg_stat_activity is your best friend. I often run this query to see what's actually waiting:
SQLSELECT pid, wait_event_type, wait_event, query FROM pg_stat_activity WHERE wait_event_type = 'Lock';
If you see a lot of activity here, you're likely dealing with high lock contention. If you’ve optimized your queries but still face N+1 bottlenecks, WordPress performance: Database-level request coalescing for REST API provides a great pattern for reducing the number of round-trips your database handles, which indirectly reduces the time spent holding locks.
Q: Does increasing the number of connections help? A: Usually, no. More connections often lead to higher concurrency control overhead and can actually increase the frequency of deadlocks because more transactions are competing for the same rows simultaneously.
Q: Should I use SELECT FOR UPDATE SKIP LOCKED?
A: It’s a powerful tool, especially for queue-like tables where you want to process available items without waiting for others. It’s not a magic bullet, but it’s excellent for high-throughput systems.
Q: How do I know if my query optimization is enough? A: Monitor your error logs for deadlock exceptions. If they disappear or drop to near zero, you’ve hit the sweet spot. Don't over-optimize for theoretical scenarios; prioritize the patterns that are actually failing in production.
I’m still not 100% convinced that we’ve eliminated every single edge case in our current architecture. As we add more asynchronous workers, new contention points inevitably appear. The best defense isn't a perfect system, but a codebase that fails gracefully and provides enough observability to fix the next bottleneck quickly.
Database queueing with SELECT FOR UPDATE SKIP LOCKED is the secret to high-concurrency task processing. Learn to eliminate race conditions in your workers.
Read moreImprove database performance by implementing functional indexes. Learn how to optimize case-insensitive searches and complex filters without bloating your schema.