Regular Expression Denial of Service (ReDoS) can crash your Node.js or PHP app. Learn to spot catastrophic backtracking and harden your regex patterns today.
During an on-call rotation last year, I watched a seemingly innocent regex update bring a production service to its knees. The CPU usage on our primary Node.js instance spiked to 100% within seconds of deployment, and the event loop became completely unresponsive. We had inadvertently introduced a catastrophic backtracking scenario, a classic case of a Regular Expression Denial of Service.
If you're building high-traffic web applications, you've likely used regex for email validation, password complexity checks, or parsing logs. When these patterns are poorly constructed, they can turn a simple user input into a computational black hole.
At its core, catastrophic backtracking occurs when a regex engine tries to match a string that almost, but not fully, satisfies the pattern. Many engines (including V8's in Node.js and PCRE in PHP) use a backtracking algorithm. If the pattern contains overlapping groups, often caused by nested quantifiers, the engine will try every possible way of dividing the input among those groups before it can conclude that no match exists.
Consider this common, dangerous pattern: ^([a-zA-Z0-9]+)*$.
If you provide a long string of valid characters that ends with an invalid one, such as 30 "a"s followed by a "!", the engine explores an exponential number of paths: each additional valid character roughly doubles the work. A short input fails almost instantly, a few dozen characters can keep a CPU core busy for seconds or minutes, and by the time you reach 100 characters you've effectively locked up your process.
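To see where the exponential growth comes from, here is a simplified model of what a backtracking engine does with ^([a-zA-Z0-9]+)*$ when n valid characters are followed by one invalid character. It is an illustration only, not how V8 or PCRE are actually implemented, and the function name countFailedPaths is made up for this sketch.

```javascript
// Simplified model of backtracking on ^([a-zA-Z0-9]+)*$ for an input of
// n valid characters followed by one invalid character.
// Illustration only; real engines differ in detail.
function countFailedPaths(n) {
  let paths = 0;

  function tryFrom(pos) {
    if (pos === n) {
      // Every valid character is consumed, but the next one is invalid,
      // so `$` fails here and the engine has to backtrack.
      paths += 1;
      return;
    }
    // The inner `+` is greedy: try the longest chunk first, then shorter ones.
    for (let len = n - pos; len >= 1; len--) {
      tryFrom(pos + len);
    }
  }

  tryFrom(0);
  return paths;
}

for (const n of [5, 10, 15, 20]) {
  console.log(`${n} valid chars -> ${countFailedPaths(n)} failed paths`);
}
```

In this model the number of failed paths is 2^(n-1), so it doubles with every extra character: 16 paths for 5 characters, 524,288 for 20.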
I first realized we had a problem when I saw the event loop lag metrics in our monitoring dashboard shoot from a standard 10ms to around 4,500ms. We had been using a complex regex to parse legacy headers, and one specific payload triggered the nightmare.
To test your own patterns, look for these "red flags":
- Nested quantifiers, such as (a+)+ or (a*)*.
- Alternation with overlapping branches, such as (a|a)+.
- A quantified group wrapped around a quantified character class, such as ([a-zA-Z]+)*.
If your logic requires complex parsing, regex might not be the right tool. When we moved our header parsing logic away from complex regex and toward a simple split() and indexOf() approach, our CPU spikes vanished entirely. Before you dive deep into regex security, it's worth checking if your input validation needs can be handled by standard string methods or dedicated libraries.
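As a rough sketch of what a split() and indexOf() approach can look like, the function below parses a header line without any regex. The header name X-Legacy-Meta, the "key=value; key=value" layout, and the function name parseHeaderLine are assumptions for the example, not the format we actually had in production.

```javascript
// Parse a header line such as "X-Legacy-Meta: region=eu; tier=gold"
// using only indexOf(), slice(), split() and trim().
// The header layout is a hypothetical example.
function parseHeaderLine(line) {
  const colon = line.indexOf(':');
  if (colon === -1) return null; // not a "Name: value" line

  const name = line.slice(0, colon).trim().toLowerCase();
  const params = new Map(); // Map avoids surprises with keys like "__proto__"

  for (const part of line.slice(colon + 1).split(';')) {
    const eq = part.indexOf('=');
    if (eq === -1) continue; // skip fragments that have no "="
    const key = part.slice(0, eq).trim();
    if (key) params.set(key, part.slice(eq + 1).trim());
  }

  return { name, params };
}

const parsed = parseHeaderLine('X-Legacy-Meta: region=eu; tier=gold; retry=3');
console.log(parsed.name, [...parsed.params.entries()]);
```

Each character is examined a small, fixed number of times, so the cost grows linearly with the input length no matter what the payload contains.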
If you must use regex, you need to be defensive. Here is how I secure my patterns in Node.js and PHP:
- Use atomic groups where the engine supports them: (?>...). These tell the engine not to backtrack once a group has matched. PCRE in PHP supports this syntax; JavaScript's built-in RegExp does not.
- Avoid nested quantifiers: if you see a + inside a *, stop. Rewrite the pattern to be explicit about the expected length.
- Don't count on timeouts in Node.js: the built-in RegExp class doesn't support them. If you are dealing with untrusted input, use a library like safe-regex to detect potentially dangerous patterns during development.
Regex is just one piece of the puzzle. If you are hardening your authentication flows, don't forget to look at Preventing Session Fixation: Hardening Authentication Flows in Node.js and Laravel to ensure your session management isn't a weak link. Likewise, if your regex is used to validate file paths or URLs, ensure you aren't opening yourself up to other vulnerabilities, such as those discussed in Preventing Open Redirect Vulnerabilities: A Guide for Developers.
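Returning to the defensive rules above, here is a minimal sketch of the "be explicit about the expected length" advice applied to the dangerous pattern from earlier. The 64-character limit and the name isAlphanumeric are assumptions for the example; pick a limit that fits your own field.

```javascript
// Hypothetical limit; choose one that matches your own field.
const MAX_INPUT_LENGTH = 64;

// Nested quantifier: vulnerable to catastrophic backtracking. Shown for contrast only.
const risky = /^([a-zA-Z0-9]+)*$/;

// One bounded quantifier, no nesting: there is only one way to match,
// so the engine has nothing to re-partition when the match fails.
const flat = new RegExp(`^[a-zA-Z0-9]{0,${MAX_INPUT_LENGTH}}$`);

function isAlphanumeric(input) {
  if (typeof input !== 'string') return false;
  // Reject oversized input before the regex engine ever sees it.
  if (input.length > MAX_INPUT_LENGTH) return false;
  return flat.test(input);
}

console.log(isAlphanumeric('abc123'));               // true
console.log(isAlphanumeric('a'.repeat(30) + '!'));   // false, and returns immediately
```

For inputs within the length limit, the flat pattern accepts the same strings as the risky one (including the empty string), but a failed match costs a single pass over the input.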
How do I know if my regex is "safe"?
Use tools like safe-regex for Node.js. It isn't perfect, but it will flag most patterns that exhibit exponential complexity.
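Why doesn't the engine just "stop" if it takes too long?
Most backtracking engines are designed to keep searching until they either find a match or exhaust every possibility. JavaScript's RegExp has no built-in "time budget." PHP's PCRE does enforce the pcre.backtrack_limit setting, and preg_match() returns false once it is exceeded, but that caps backtracking work rather than wall-clock time. In practice, you have to implement protections at the application level by limiting input length or using non-backtracking engines.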
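One workaround for the missing time budget in Node.js is to run the match inside the built-in vm module, whose timeout option can interrupt a runaway script. The sketch below assumes that behavior; testWithTimeout is a hypothetical helper rather than a Node.js API, and creating a fresh context on every call adds overhead, so treat it as a last line of defense rather than a replacement for fixing the pattern.

```javascript
const vm = require('vm');

// Run pattern.test(input) with a wall-clock budget.
// Returns true/false when the match attempt finishes in time,
// or null when the budget is exceeded.
function testWithTimeout(pattern, input, timeoutMs = 100) {
  // Only plain strings cross into the new context.
  const sandbox = { source: pattern.source, flags: pattern.flags, input };
  try {
    return vm.runInNewContext(
      'new RegExp(source, flags).test(input)',
      sandbox,
      { timeout: timeoutMs }
    );
  } catch (err) {
    if (err && err.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT') return null;
    throw err; // anything else is a real error
  }
}

const hostile = 'a'.repeat(30) + '!';
console.log(testWithTimeout(/^[a-zA-Z0-9]+$/, hostile));      // finishes in time
console.log(testWithTimeout(/^([a-zA-Z0-9]+)*$/, hostile));   // expected to hit the budget
```

A null result means the pattern and input combination is too expensive to evaluate, which is usually a good reason to reject the request and log the payload.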
Is it better to just write a manual parser?
For complex validation, yes. A manual parser using simple loops is almost always faster and safer than a complex regex, and it's significantly easier for the next developer on your team to debug.
We've learned the hard way that regex is a double-edged sword. While it's powerful for pattern matching, it requires a mindset of defensive coding to prevent ReDoS. Next time, I would prioritize using dedicated validation schemas like Joi or Zod in Node.js, which often handle these edge cases under the hood. Don't assume your patterns are safe just because they work for your test cases; always stress-test them with long, invalid strings.
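A stress test of this kind can be as small as the sketch below. It feeds a pattern progressively longer runs of a valid character followed by "!" and reports the first length at which a single failed match exceeds a time budget. The helper name findBreakingLength, the 100 ms budget, and the 30-character ceiling are assumptions for the example. Because a running match can't be interrupted in-process, the input grows in small steps and the loop stops at the first breach.

```javascript
const { performance } = require('perf_hooks');

// Hypothetical helper: returns the first input length at which one failed
// match takes longer than budgetMs, or null if every length stays in budget.
function findBreakingLength(pattern, { maxLength = 30, budgetMs = 100 } = {}) {
  for (let length = 2; length <= maxLength; length += 2) {
    const hostile = 'a'.repeat(length) + '!'; // long valid run, invalid tail
    const start = performance.now();
    pattern.test(hostile);
    const elapsedMs = performance.now() - start;
    if (elapsedMs > budgetMs) return length; // stop before it gets worse
  }
  return null;
}

console.log('flat pattern breaks at:', findBreakingLength(/^[a-zA-Z0-9]+$/));
console.log('nested pattern breaks at:', findBreakingLength(/^([a-zA-Z0-9]+)*$/));
```

A check like this fits naturally into a unit test suite: any pattern that returns a non-null length for such a short input deserves a rewrite before it ships.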