Database queueing with SELECT FOR UPDATE SKIP LOCKED is the secret to high-concurrency task processing. Learn to eliminate race conditions in your workers.
When I first started building background workers, I relied on simple status flags. I’d mark a row as processing, grab the ID, and move on. It worked fine until we hit roughly 50 concurrent workers, at which point the system collapsed under the weight of deadlocks and duplicated task executions. That’s when I learned that building reliable database queueing systems requires more than just UPDATE statements; it requires atomic row-level locking.
If you’ve ever dealt with the nightmare of multiple workers picking up the same job, you know exactly what I’m talking about. You don't need a dedicated message broker like RabbitMQ or Redis if your database is already Postgres or MySQL 8.0+. You just need to master SELECT FOR UPDATE SKIP LOCKED.
Before we dive into the solution, let’s look at why standard approaches fail. If you run a query like SELECT * FROM jobs WHERE status = 'pending' LIMIT 1 FOR UPDATE, you’re telling the database to lock that row. If multiple workers execute this simultaneously, the second worker will block, waiting for the first to finish.
This creates a massive bottleneck. Your workers aren't processing tasks; they're waiting for locks to release. If you have 20 workers, they end up queued behind each other, turning your parallel task processing into a serial nightmare. This is the exact scenario we covered when discussing preventing race conditions in distributed transactions, where blocking leads to system-wide latency spikes.
The SKIP LOCKED clause changes the game. Instead of blocking when it encounters a locked row, the database simply skips it and moves to the next available one.
Here is how a typical worker implementation looks in Postgres:
SQLBEGIN; UPDATE jobs SET status = 'processing', locked_at = NOW() WHERE id = ( SELECT id FROM jobs WHERE status = 'pending' ORDER BY created_at ASC FOR UPDATE SKIP LOCKED LIMIT 1 ) RETURNING *; COMMIT;
When a worker executes this, it effectively "claims" the task in a single atomic operation. Because of the SKIP LOCKED directive, no two workers will ever grab the same row, even if they hit the table at the exact same millisecond.
I’ve found that this pattern works exceptionally well when you need to bypass the limitations of standard scheduling systems, similar to the strategies we explored for WordPress background processing. By pushing the queue logic into the storage layer, you eliminate the overhead of external network calls to a broker.
However, it’s not a silver bullet. You must ensure your jobs table has an index on (status, created_at). Without this, Postgres will perform a sequential scan, and your query performance will degrade linearly as your table grows. I once spent about two days debugging a "slow" queue only to realize we had missing index coverage on the status column. Once the index was added, the query execution time dropped from 400ms to under 5ms.
One thing to watch out for is "lock starvation." If you have a massive number of long-running tasks, your SKIP LOCKED query might constantly scan over thousands of locked rows to find the one that's available. If your table grows into the millions, you might need to partition your queue table by status or date.
Also, remember that SKIP LOCKED is a feature of the database engine. If you are using an older version of MySQL (pre-8.0), you simply don't have access to this feature. In those cases, you're often better off using a dedicated Redis-based queue or looking into database proxy strategies to manage the connection load.
Does SKIP LOCKED guarantee order?
Yes, if you include an ORDER BY clause, the database will respect that order for the rows it doesn't skip. However, if row 1 is locked, it will jump to row 2, even if row 2 was created later. Keep this in mind if strict FIFO (First-In, First-Out) is a hard requirement.
Can I use this for non-queue tasks? Technically, yes, but be careful. Using row-level locking for business logic (like updating user balances) is different from queueing. Ensure you understand the isolation levels of your database to avoid phantom reads.
What happens if a worker crashes while processing?
The row remains in the processing state forever. You need a "janitor" process—a separate cron job—that looks for jobs stuck in processing for longer than, say, 30 minutes, and resets their status to pending.
I’m still experimenting with how to handle dead-letter queues effectively within this structure. While the SKIP LOCKED approach is incredibly robust for high-concurrency task processing, you’re still responsible for the lifecycle of the task once it's pulled. Don't assume the database will handle your retry logic for you.
Master an indexing strategy for app developers to fix slow production queries. Learn how to read EXPLAIN plans, pick the right columns, and avoid overhead.