Lookahead vs lookbehind — when to use each
Both check context without consuming it. The right one depends on where the context is.
The short answer
- Lookahead (?=...): assert what comes after the current position
- Lookbehind (?<=...): assert what comes before the current position
Both are zero-width assertions — they check a condition but don't consume characters. The match doesn't include them.
Side-by-side examples
Extracting a number after a $ sign
Lookbehind: (?<=\$)\d+
Input: "Price: $100"
Match: "100" (the $ stays where it is)
Extracting a number before "px"
Lookahead: \d+(?=px)
Input: "height: 100px"
Match: "100" (the "px" stays where it is)
Both extract just the number, using the prefix or suffix as context. The lookaround confirms that the context is there without including it in the match, which is useful when you want clean values without stripping prefixes or suffixes afterward.
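As a quick check of the two extractions above, here is a minimal sketch using Python's standard re module (the choice of Python and the variable names are assumptions for illustration; the patterns are the ones shown above):

```python
import re

# Lookbehind: "$" must sit right before the digits, but is not part of the match.
price = re.search(r"(?<=\$)\d+", "Price: $100")

# Lookahead: "px" must sit right after the digits, but is not part of the match.
height = re.search(r"\d+(?=px)", "height: 100px")

print(price.group())   # 100
print(height.group())  # 100
print(price.span())    # (8, 11): the "$" at index 7 is outside the match
```

The span shows that the match starts after the "$", so nothing needs to be trimmed from the result.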
The negative versions
- Negative lookahead (?!...): assert that what comes after does NOT match
- Negative lookbehind (?<!...): assert that what comes before does NOT match
foo(?!bar) matches "foo" only if it is NOT followed by "bar".
(?<!@)\bword\b matches "word" only if it is NOT preceded by "@".
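A small sketch of both negative forms, again assuming Python's re module; the sample strings are made up for illustration:

```python
import re

# Negative lookahead: reject "foo" when "bar" comes right after it.
foo_hits = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz foo")]

# Negative lookbehind: reject "word" when "@" comes right before it.
word_hits = [m.start() for m in re.finditer(r"(?<!@)\bword\b", "@word word")]

print(foo_hits)   # [7, 14]: the "foo" in "foobar" (index 0) is skipped
print(word_hits)  # [6]: the "word" in "@word" (index 1) is skipped
```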
Why use lookarounds at all?
1. Multi-condition validation
The strong-password regex uses multiple lookaheads to assert each requirement:
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^\w\s]).{8,}$
Each lookahead is an independent assertion: "somewhere there's a lowercase letter," "somewhere there's a digit," etc. They're evaluated from the same start position; none of them consume characters. Then .{8,} actually matches the password.
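To see the independent assertions at work, here is a hedged sketch in Python. The helper name is_strong and the sample passwords are hypothetical; only the pattern comes from the text above:

```python
import re

# Four lookaheads, all checked from the start of the string, then .{8,} consumes it.
STRONG = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^\w\s]).{8,}$")

def is_strong(password: str) -> bool:
    """Return True when every lookahead and the length check pass."""
    return STRONG.search(password) is not None

print(is_strong("Passw0rd!"))   # True: lower, upper, digit, symbol, 9 chars
print(is_strong("password1!"))  # False: no uppercase letter
print(is_strong("Password1"))   # False: no symbol
print(is_strong("Pw0!"))        # False: shorter than 8 characters
```

Note that [^\w\s] excludes the underscore, because \w includes it, so an underscore alone does not satisfy the symbol requirement.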
2. Inserting at boundaries
The thousands-separator trick:
Pattern: (\d)(?=(\d{3})+$)
Replace: $1,
Input: 1234567
Output: 1,234,567
This matches every digit that is followed by one or more complete groups of three digits running to the end of the string. Because the lookahead doesn't consume those digits, a global (replace-all) substitution can find every comma-insertion point without losing the trailing digits.
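The same substitution can be sketched in Python, where re.sub replaces all matches by default and the backreference is written \1 rather than $1. The function name add_thousands is a hypothetical helper for illustration:

```python
import re

def add_thousands(digits: str) -> str:
    """Insert a comma after each digit that is followed only by full 3-digit groups."""
    return re.sub(r"(\d)(?=(\d{3})+$)", r"\1,", digits)

print(add_thousands("1234567"))  # 1,234,567
print(add_thousands("1000"))     # 1,000
print(add_thousands("123"))      # 123
```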
3. Negative filtering
(?<!unsubscribed_)email matches "email" but not the "email" inside "unsubscribed_email".
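A minimal sketch of this filter in Python, assuming a made-up list of column names. The lookbehind here has a fixed width, so it also works in engines with that restriction:

```python
import re

pattern = re.compile(r"(?<!unsubscribed_)email")

# Hypothetical column names, for illustration only.
columns = ["email", "unsubscribed_email", "work_email"]
kept = [name for name in columns if pattern.search(name)]

print(kept)  # ['email', 'work_email']
```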
The fixed-width lookbehind problem
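Lookbehinds aren't universally supported the way lookaheads are. Several flavors require fixed-width lookbehinds: the pattern inside must match a known number of characters, so open-ended quantifiers such as + and * are not allowed. Python's re is the strictest and also requires every alternative to have the same length: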
| Flavor | Lookbehind support |
|---|---|
| JavaScript (ES2018+) | ✓ variable-width |
| Python stdlib re | Fixed-width only |
| Python regex (3rd party) | ✓ variable-width |
| PCRE | Mostly fixed-width |
| Java | Bounded-width |
In flavors that require fixed width, (?<=a+) won't compile — the + means variable length.
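One way to observe the restriction is to try compiling the patterns in Python's standard re module, which rejects variable-width lookbehinds with re.error. The helper name compiles is hypothetical:

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(r"(?<=aaa)b"))   # True: always exactly three characters
print(compiles(r"(?<=a+)b"))    # False: + makes the width variable
print(compiles(r"(?<=a|bc)d"))  # False: alternatives differ in length
```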
Workaround for fixed-width restrictions
If you need a variable-width lookbehind and your engine can't do it, restructure as a capture group instead:
Want: (?<=\$+)\d+
Workaround: \$+(\d+)
You consume the prefix and capture just the number you want — same result, different access pattern.
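A sketch of the capture-group workaround in Python, with a made-up input string. The difference from a lookbehind is that the overall match now includes the prefix, so the value is read from group 1 instead:

```python
import re

match = re.search(r"\$+(\d+)", "Total: $$250")

print(match.group(0))  # $$250 (the whole match includes the consumed prefix)
print(match.group(1))  # 250   (the captured number only)
```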
Performance
Lookarounds do extra work: they run a separate sub-match at the current position. This is usually fine, but be careful with patterns inside lookaheads that themselves backtrack heavily. For example, in an unanchored (?=.*\d.*[a-z].*[A-Z]), each .* can scan to the end of the input and backtrack, and the engine repeats that work at every starting position it tries.
The takeaway
Use lookahead when the context is after the match — extracting a number before a unit, validating multiple constraints, finding insertion points. Use lookbehind when the context is before — extracting values after a prefix, filtering by previous character.
If your environment has limited lookbehind support, restructure the pattern to use a regular capture group instead. The result is the same; you just access the value differently.