Architecture · June 23, 2026 · 4 min read

API Security: Preventing Resource Exhaustion with Query Complexity Analysis

API security depends on more than just basic rate limiting. Learn to prevent resource exhaustion by calculating query complexity before execution.

Tags: API Security, GraphQL, System Design, Performance, Reliability, API, Architecture, Backend

Last month, we had a production incident where a single client query brought our database CPU to 98% utilization for ten minutes. The request wasn't technically a "denial of service" attack—it was a perfectly valid, authorized request—but its recursive nature forced an exponential join that our indexer couldn't handle.

Standard API rate limiting with token bucket algorithms wasn't enough because we were counting requests, not the cost of those requests. To fix this, we implemented request-level query complexity analysis to kill expensive operations before they ever hit the database.

Why Rate Limiting Fails Against Deep Queries

If you're building a GraphQL API or a REST API with embedded resource expansion (like ?include=users.posts.comments.likes), you’re vulnerable to "query depth" attacks. A user can send a request that looks small but forces the server to traverse thousands of nodes.
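To make the fan-out concrete, here is a rough sketch that estimates how many rows each level of a dotted include path touches. The helper name `includeFanOut` and the figure of 10 rows per parent are assumptions for illustration, not numbers from our system.

```javascript
// Hypothetical helper: estimate rows touched at each level of a dotted include
// path, assuming every relation returns `perParent` rows for each parent row.
function includeFanOut(includePath, perParent) {
  let rows = 1;
  return includePath.split(".").map((relation) => {
    rows *= perParent; // fan-out compounds at every level
    return { relation, rows };
  });
}

const levels = includeFanOut("users.posts.comments.likes", 10);
const totalRows = levels.reduce((sum, level) => sum + level.rows, 0);

console.log(levels.map((l) => `${l.relation}: ${l.rows}`).join(", "));
// users: 10, posts: 100, comments: 1000, likes: 10000
console.log(totalRows); // 11110
```

A single URL parameter of about forty characters ends up touching more than eleven thousand rows under these assumptions, and a request counter still records it as one request.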

We first tried to solve this by simply limiting the maximum depth of our object graph to 5 levels. It was a naive approach. A shallow query fetching 10,000 items is often more expensive than a deep query fetching 5 items. We needed a system that assigned a "cost" to each field based on its computational weight.
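The gap between depth and cost can be sketched with two made-up query shapes. Everything here (the shapes, `passesDepthLimit`, `estimatedNodes`) is hypothetical; the only value taken from the text is the depth limit of 5.

```javascript
// Hypothetical query shapes: one entry per nesting level, giving the number of
// items that level returns for each parent item.
const deepButNarrow = [1, 1, 1, 1, 5]; // 5 levels, ending in a 5-item list
const shallowButWide = [10000]; // 1 level, 10,000 items

const MAX_DEPTH = 5; // the depth limit described above

function passesDepthLimit(levels) {
  return levels.length <= MAX_DEPTH;
}

// Rough count of nodes the server must resolve: fan-out compounds per level.
function estimatedNodes(levels) {
  let parents = 1;
  let total = 0;
  for (const fanOut of levels) {
    parents *= fanOut;
    total += parents;
  }
  return total;
}

for (const [name, levels] of Object.entries({ deepButNarrow, shallowButWide })) {
  console.log(name, passesDepthLimit(levels), estimatedNodes(levels));
}
// deepButNarrow true 9
// shallowButWide true 10000
```

Both shapes pass the depth check, yet their estimated work differs by three orders of magnitude, which is the signal a depth limit cannot see.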

Implementing Static Query Complexity Analysis

The goal is to calculate a total cost score for a request before you hit your business logic. Every field gets a default cost of 1, heavier fields get a higher cost, and collection fields carry a multiplier that scales the cost of the field and everything nested beneath it.

Here is a simplified schema-based cost map:


JAVASCRIPT
const costMap = {
  user: { cost: 1 },
  // multiplier scales the cost of this field and everything nested under it
  posts: { cost: 2, multiplier: 10 },
  comments: { cost: 5, multiplier: 50 },
};

When a request comes in, we traverse the AST (Abstract Syntax Tree) and sum the costs. If the total exceeds a threshold—say, 500 units—we reject the request with a 429 Too Many Requests status code.
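A minimal sketch of the rejection step, assuming the score has already been computed. The function name and the error body shape are hypothetical; only the 500-unit threshold and the 429 status come from the text above.

```javascript
const COMPLEXITY_LIMIT = 500; // example threshold from the text

// Hypothetical gate: returns null when the request may proceed, or a
// description of the rejection response when the score exceeds the limit.
function rejectIfTooExpensive(score, limit = COMPLEXITY_LIMIT) {
  if (score <= limit) {
    return null;
  }
  return {
    status: 429,
    body: {
      error: "QUERY_TOO_COMPLEX", // made-up error code
      message: `Query cost ${score} exceeds the limit of ${limit}.`,
    },
  };
}

console.log(rejectIfTooExpensive(120)); // null
console.log(rejectIfTooExpensive(2521).status); // 429
```

Returning a plain object keeps the check independent of any web framework; the caller decides how to turn it into an actual HTTP response.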

The Calculation Logic

You don't need a complex parser if you're using a standard library. In our Node.js environment, we use the graphql-validation-complexity package (or a custom visitor pattern for REST) to generate the score.


JAVASCRIPT
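// Returns the total cost of a node plus all of its descendants.
function calculateComplexity(node) {
  let cost = node.cost || 1; // fields without an explicit cost default to 1
  for (const child of node.children || []) {
    // A child's multiplier scales the cost of its entire subtree.
    cost += (child.multiplier || 1) * calculateComplexity(child);
  }
  return cost;
}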

This approach allows us to be generous with simple lookups while being extremely restrictive with nested collection fetches, whose multipliers compound. It’s effectively a cost budget enforced on every individual request.
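A worked example shows how quickly the multipliers compound. The two request trees are hypothetical, built by hand from the cost map values shown earlier; the function uses the same logic as the one above and is repeated so the example runs on its own.

```javascript
// Same logic as the function above, repeated so this example is self-contained.
function calculateComplexity(node) {
  let cost = node.cost || 1;
  for (const child of node.children || []) {
    cost += (child.multiplier || 1) * calculateComplexity(child);
  }
  return cost;
}

// Hypothetical request trees using the cost map values for user, posts, comments.
const comments = { cost: 5, multiplier: 50 };
const userWithPosts = {
  cost: 1,
  children: [{ cost: 2, multiplier: 10 }],
};
const userWithPostsAndComments = {
  cost: 1,
  children: [{ cost: 2, multiplier: 10, children: [comments] }],
};

console.log(calculateComplexity(userWithPosts)); // 1 + 10 * 2 = 21
console.log(calculateComplexity(userWithPostsAndComments)); // 1 + 10 * (2 + 50 * 5) = 2521
```

Against a 500-unit threshold, the first request passes comfortably and the second is rejected, even though it is only one level deeper.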

Integrating Complexity into Your API Strategy

This isn't a replacement for API throttling; it's an enhancement. You still need to manage your overall throughput, but complexity analysis ensures that the requests you do allow are safe to execute.

When implementing this, keep these three things in mind:

Start in "Report Only" mode: Don't start by blocking requests. Log the complexity scores of all incoming traffic for about two weeks. You’ll be surprised at how your "normal" traffic is distributed.
Account for Authentication: Authenticated users should have a higher complexity budget than anonymous users.
Cache the result: If you’re using a static schema, the complexity of a query string can be cached. Don't re-parse the AST on every single request if the query is identical.
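The three points above could be combined roughly as follows. This is a sketch under stated assumptions: the budget numbers, the function names, and the `computeScore` callback are all hypothetical, and the cache is a plain `Map` with no size bound.

```javascript
// Hypothetical per-tier budgets; the numbers are placeholders, not recommendations.
const BUDGETS = { anonymous: 100, authenticated: 500 };

// Scores keyed by exact query string. This is only valid while the schema and
// cost map stay unchanged, and a real cache would need a size bound.
const scoreCache = new Map();

function cachedScore(queryString, computeScore) {
  if (!scoreCache.has(queryString)) {
    scoreCache.set(queryString, computeScore(queryString));
  }
  return scoreCache.get(queryString);
}

// Logs every score. In report-only mode, over-budget requests are still allowed.
function evaluateRequest({ queryString, isAuthenticated, reportOnly, computeScore, log = console.log }) {
  const score = cachedScore(queryString, computeScore);
  const budget = isAuthenticated ? BUDGETS.authenticated : BUDGETS.anonymous;
  const overBudget = score > budget;
  log({ score, budget, overBudget, reportOnly });
  return { score, budget, allowed: reportOnly || !overBudget };
}
```

Running with `reportOnly: true` for the observation period produces the score distribution without rejecting anyone; flipping the flag later turns the same code path into enforcement.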

The Trade-offs

We found that static analysis doesn't account for database lock contention. A query might have a low "complexity score" but still be slow if it hits a locked row. We've considered adding a dynamic component—where the cost is adjusted based on current database load—but that introduces significant latency to the request validation phase.
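For illustration only, one shape the dynamic component could take is to shrink the per-request budget under load rather than re-cost individual fields. This is not something we run: the function, the 0.7 soft limit, and the 25% floor are assumptions, and it presumes the load figure comes from an in-memory gauge refreshed in the background so that validation never waits on the database.

```javascript
// Hypothetical: reduce the per-request budget as database load rises.
// `dbLoad` is a 0..1 utilization figure read from a locally cached gauge.
function effectiveBudget(baseBudget, dbLoad, { softLimit = 0.7, floor = 0.25 } = {}) {
  if (dbLoad <= softLimit) {
    return baseBudget; // below the soft limit the static budget applies unchanged
  }
  // Scale linearly from 100% of the budget at softLimit down to `floor` at full load.
  const pressure = Math.min(1, (dbLoad - softLimit) / (1 - softLimit));
  return Math.round(baseBudget * (1 - pressure * (1 - floor)));
}

console.log(effectiveBudget(500, 0.5)); // 500
console.log(effectiveBudget(500, 1)); // 125
```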

For now, we stick to static estimation. It’s predictable, fast (usually adds around 2–5ms to request processing), and it has successfully prevented the kind of cascading failures we saw last quarter.

Ultimately, preventing resource exhaustion is about visibility. Once you start measuring the cost of your queries, you stop treating every request as equal. You’ll find that a small subset of your traffic is responsible for the vast majority of your load. By capping that, you keep your system stable for everyone else.

I’m still experimenting with how to communicate these limits to our frontend developers. Right now, they only see a 429 error, which is frustrating. Ideally, we’d expose the cost as a header in the response, allowing them to optimize their queries before they hit the limit. We’re not there yet, but it’s the logical next step in our reliability engineering journey.
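A rough sketch of what that could look like. The header names are invented for illustration and are not part of any standard or of our current API.

```javascript
// Hypothetical header names; values are strings because HTTP headers are text.
function costHeaders(score, limit) {
  return {
    "X-Query-Cost": String(score),
    "X-Query-Cost-Limit": String(limit),
    "X-Query-Cost-Remaining": String(Math.max(0, limit - score)),
  };
}

console.log(costHeaders(120, 500));
// { 'X-Query-Cost': '120', 'X-Query-Cost-Limit': '500', 'X-Query-Cost-Remaining': '380' }
```

Sending these on successful responses as well as on 429s would let a frontend developer see how close a query is to the limit before it starts failing. With Node's built-in `http` module, the object can be passed straight to `response.writeHead(status, headers)`.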
