Why is my regex slow? Common causes and fixes
Most slow regex isn't about input size. It's about the pattern doing exponentially more work than it should.
The short answer
Almost all slow regex is caused by one of three things:
- Catastrophic backtracking: nested quantifiers exploring exponentially many paths
- Unbounded greedy matching: .* followed by something rare that forces lots of backtracking
- Excessive alternation: long lists of alternatives where most fail before the last one matches
The fix is almost always restructuring the pattern, not throwing more CPU at it.
The classic ReDoS pattern
Pattern: ^(a+)+$
Input: "aaaaaaaaaaaaaaaaaaaa!" (20 a's + non-matching char)
Time: roughly doubles with every additional a; seconds to minutes by about 30 a's
Why? The nested quantifiers create overlapping ways to assign characters. With 20 a's, the engine tries about a million combinations. With 30 a's, a billion. The failure at the end forces the engine to backtrack through every possibility.
This is called "catastrophic backtracking" or ReDoS (regex denial of service). With untrusted input, it can hang your server.
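To see where those numbers come from, here is a minimal Python sketch that counts the ways (a+)+ can carve up a run of a's. It is a simplified model of the search space, not an instrumented regex engine, and the function name attempts is made up for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def attempts(remaining: int) -> int:
    """Count the distinct ways (a+)+ can consume a prefix of `remaining` a's.

    Simplified model: each way is one path a backtracking engine may have
    to try before the trailing $ fails on the non-matching character.
    """
    total = 0
    for taken in range(1, remaining + 1):
        # The inner a+ takes `taken` characters. The outer + can stop here
        # (1 path) or start another iteration on what is left.
        total += 1 + attempts(remaining - taken)
    return total


for n in (5, 10, 20, 30):
    print(n, attempts(n))
```

Under this model the count works out to 2^n - 1, which is roughly a million paths for 20 a's and roughly a billion for 30.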
Patterns to watch for
- (a+)+, (a*)*: quantifier on a group that can overlap with itself
- (a|a)+, (a|aa)+: alternation with overlapping branches, then quantified
- (.+)+: even subtler version of the same problem
- .*.*: two greedy patterns that can split the same characters multiple ways
Fix #1: Eliminate the overlap
Bad:
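^(\w+\s*)+$
Better:
^\w+(\s+\w+)*$
Because the whitespace inside the group is optional, the original can split a single run of word characters across iterations of the group in exponentially many ways. The fix makes it unambiguous: one word, then zero or more groups of (whitespace + word). Note that the rewrite no longer accepts trailing whitespace; add \s* before the $ if you need it.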
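A quick way to check a rewrite like this is to run both forms against the same inputs. The sketch below uses Python's re module. The ambiguous pattern used here, ^(\w+\s*)+$, the helper name timed, and the input lengths are assumptions picked for illustration.

```python
import re
import time

ambiguous = re.compile(r"^(\w+\s*)+$")      # illustrative ambiguous pattern
rewritten = re.compile(r"^\w+(\s+\w+)*$")   # the unambiguous rewrite


def timed(pattern, text):
    """Return (match_found, seconds) for one match attempt."""
    start = time.perf_counter()
    found = pattern.match(text) is not None
    return found, time.perf_counter() - start


ok_text = "one two  three"
near_miss = "a" * 18 + "!"   # almost matches, fails on the last character

for name, pattern in (("ambiguous", ambiguous), ("rewritten", rewritten)):
    print(name, timed(pattern, ok_text), timed(pattern, near_miss))
```

Both patterns accept the well-formed text and reject the near miss. The difference is how long the rejection takes as the run of word characters grows.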
Fix #2: Use possessive quantifiers
If your flavor supports them (PCRE, Java, Python 3.11+), make quantifiers possessive — they refuse to back off:
Before: (a+)+
After: (a++)+    (each a++ won't backtrack)
This eliminates the exponential explosion. Possessive quantifiers say "if you matched this much, stick with it."
Fix #3: Use atomic groups
Same idea applied to a group (PCRE, Java, Python 3.11+):
Before: (a+)+
After: (?>a+)+    (the engine won't backtrack into the atomic group)
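The possessive form from Fix #2 and the atomic form can both be tried in Python 3.11 or later. The sketch below is a minimal check under that assumption; it anchors the patterns with ^ and $ to force the failing case, and on older Python versions it only prints a notice.

```python
import re
import sys

near_miss = "a" * 30 + "!"

if sys.version_info >= (3, 11):
    possessive = re.compile(r"^(a++)+$")   # possessive quantifier
    atomic = re.compile(r"^(?>a+)+$")      # atomic group
    for name, pattern in (("possessive", possessive), ("atomic", atomic)):
        # Both fail fast: once a+ has taken the a's it never gives them back.
        print(name, pattern.match(near_miss), pattern.match("aaaa") is not None)
else:
    possessive = atomic = None
    print("possessive quantifiers and atomic groups need Python 3.11+")
```

With the plain (a+)+ version, the same 30-character near miss would take the exponential path described earlier.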
Fix #4: Use a non-backtracking engine
Some regex engines guarantee linear-time matching by not supporting features that cause backtracking. They're fast and immune to ReDoS, but they don't support backreferences or lookarounds:
- RE2, and Go's regexp package, which follows the same design
- Rust's regex crate
- Hyperscan (Intel's production-grade engine)
- V8's experimental non-backtracking RegExp engine for JavaScript (opt-in, not enabled by default)
If your patterns don't need backreferences, these are by far the safest choice for handling untrusted input.
Fix #5: Anchor or restructure
Sometimes the fix is just adding anchors. Searching abcxyz123.456 for the unanchored pattern \d+\.\d+ makes the engine retry at each starting position until it finds a match. If a valid match can only begin at the start of the input, anchoring with ^ tells the engine to give up immediately when the start doesn't match.
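As a small illustration in Python (the variable names are arbitrary), the unanchored search walks forward until it reaches the digits, while the anchored pattern rejects the same input at the first character:

```python
import re

text = "abcxyz123.456"

unanchored = re.compile(r"\d+\.\d+")
anchored = re.compile(r"^\d+\.\d+")

found = unanchored.search(text)
print(found.group(), found.start())   # the scan reaches index 6 before matching
print(anchored.search(text))          # no match: the input does not start with a digit
```

Anchoring changes what the pattern accepts, so it only applies when a match elsewhere in the input would be wrong anyway.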
Diagnosing a slow regex
Steps:
- Identify the pattern that's slow. Profilers can help but it's usually obvious.
- Identify the input that triggers it. Usually a string that ALMOST matches but fails at the end.
- Look for nested quantifiers on overlapping patterns.
- Test progressively — increase input length and see if time grows linearly or exponentially.
If doubling the input length roughly doubles the time, growth is linear. If it quadruples the time, or if adding a single character doubles it, you have backtracking blowup.
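The progressive test can be sketched in a few lines of Python. The helper name time_failing_match and the sizes are illustrative assumptions; the sizes are kept small so the script finishes quickly even for an exponential pattern.

```python
import re
import time


def time_failing_match(pattern, sizes, make_input):
    """Time one match attempt per input size and return (size, seconds) pairs."""
    compiled = re.compile(pattern)
    results = []
    for size in sizes:
        text = make_input(size)
        start = time.perf_counter()
        compiled.match(text)
        results.append((size, time.perf_counter() - start))
    return results


# Near-miss inputs for the classic pattern: n a's, then a character that fails.
timings = time_failing_match(r"^(a+)+$", [12, 14, 16, 18], lambda n: "a" * n + "!")

for (size, seconds), (_, previous) in zip(timings[1:], timings):
    ratio = seconds / max(previous, 1e-9)   # guard against a zero reading
    print(f"n={size}: {seconds:.4f}s, {ratio:.1f}x the previous size")
```

Each step here adds only two characters, so a ratio that stays well above 1x from step to step points to exponential growth rather than linear.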
Online tools for ReDoS detection
Several tools statically analyze regex for ReDoS vulnerabilities:
- npm: safe-regex, vuln-regex-detector
- Online: regex101.com shows step count
- Research: recheck
The takeaway
Slow regex is almost always a pattern problem, not an input problem. Look for nested quantifiers, overlapping alternation, and unbounded greedy matching.
For untrusted input, use a non-backtracking engine (Go regexp, RE2) or audit every regex for ReDoS. The performance and security trade-off rarely goes the other way.