Database schema design matters. Learn how to implement shadow columns for efficient soft deletes and auditing without sacrificing query performance.
Last month, our primary order processing service started crawling during high-traffic windows. We had a massive orders table littered with deleted_at timestamps, causing our index scans to drag as the table grew into the millions of rows.
I realized we were paying a heavy performance tax just to keep "dead" data around. We needed a way to handle soft deletes and auditing that didn't turn every SELECT query into a slow, index-bloated mess. That’s when I turned to shadow columns and partial indexing.
Most developers default to a simple is_deleted boolean or a deleted_at timestamp. It’s easy to implement, but it’s a trap for database schema design. Every query in your application suddenly needs a WHERE deleted_at IS NULL clause.
If you forget that clause once, you get bugs. If you include it everywhere, your database engine has to navigate a bloated index that contains both active and inactive records. When your table size hits the 5-10 million row mark, the overhead of scanning these "ghost" records becomes non-trivial.
We tried adding a standard composite index on (user_id, deleted_at). It helped, but it still forced the engine to keep metadata for records that weren't even relevant to the user's current session.
Instead of cluttering the main table, I moved the deletion metadata to a shadow table (or a set of dedicated shadow columns if the volume is low). The goal is to keep the "hot" table lean.
Think of it this way:
When a user deletes an item, we move the row ID to the deleted_orders table. Now, your main orders query doesn't need a WHERE clause at all. It just hits the clean, active set of data.
If you're on PostgreSQL 14+, you can use a trigger to handle this silently, but I prefer keeping this logic in the service layer to maintain visibility. Here’s how the schema structure looks:
SQL-- The lean main table CREATE TABLE orders ( id UUID PRIMARY KEY, user_id UUID NOT NULL, amount DECIMAL(12, 2), created_at TIMESTAMPTZ DEFAULT NOW() ); -- The shadow audit table CREATE TABLE deleted_orders ( order_id UUID PRIMARY KEY, deleted_at TIMESTAMPTZ DEFAULT NOW(), deleted_by UUID, reason TEXT );
This setup allows for a much cleaner performance optimization approach. You aren't forcing the database to filter out deleted rows during every read operation.
The beauty of this approach is that it enables database indexing strategies that are far more effective. Since your main table is now strictly "active" data, your indexes are smaller and fit better into RAM.
If you still need to query deleted data for reports or admin views, you can join against the shadow table. Because the shadow table only contains the deleted IDs, the join performance remains predictable.
We also found that for high-frequency auditing, implementing a Database TTL Strategy on the shadow table itself helps keep storage costs down. We set a policy to purge the deleted_orders table after 90 days, effectively moving old, deleted data into cold storage (like S3/Parquet files) instead of bloating our primary RDS instance.
This isn't a silver bullet. You’re trading storage complexity for read performance.
DELETE and the INSERT, you could end up with orphaned data. Always wrap these in a single BEGIN...COMMIT block.softDelete() helpers.UNION query. UNION is slower than a simple SELECT, so don't use this pattern if your primary use case is "show me all history" rather than "show me my current active items."We saw our p99 latency for order lookups drop from around 320ms down to about 45ms after cleaning up the index bloat. Was it worth the extra code? Absolutely.
Next time, I might experiment with partitioning the main table by created_at alongside this shadow pattern. I’m still not entirely convinced that moving the data is better than just using a partial index, but for high-churn tables where you delete frequently, the shadow column approach provides a much cleaner separation of concerns.
Does this hurt write performance? Yes, slightly. You’re performing two writes instead of one. However, in most backend systems, read performance is the bottleneck. The trade-off is usually worth it.
Can I use this for non-deleted auditing?
Definitely. You can use shadow columns to track status changes (e.g., order_status_history) without polluting the main orders table.
What about foreign keys? This is the tricky part. You’ll need to decide if you want to enforce hard foreign keys to the shadow table or use application-level logic. I prefer application-level checks to keep the database schema flexible.
Database indexing with composite keys is the best way to speed up complex queries. Learn how to design effective indexes and fix slow performance today.