Concepts May 8, 2026

Lookahead vs lookbehind — when to use each

Both check context without consuming it. The right one depends on where the context is.

The short answer

Lookahead (?=...) — assert what comes after the current position
Lookbehind (?<=...) — assert what comes before the current position

Both are zero-width assertions — they check a condition but don't consume characters. The match doesn't include them.

Side-by-side examples

Extracting a number after a $ sign

Lookbehind:  (?<=\$)\d+
Input:       "Price: $100"
Match:       "100"   (the $ stays where it is)

Extracting a number before "px"

Lookahead:   \d+(?=px)
Input:       "height: 100px"
Match:       "100"   (the "px" stays where it is)

Both patterns extract just the number, using the prefix or suffix as context. The lookaround confirms that the context is present without including it in the match, which is useful when you want clean values without stripping prefixes or suffixes afterwards.
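As a minimal sketch, here are the two patterns above run with Python's standard re module. The variable names are illustrative only; the patterns and inputs are the ones from the examples.

```python
import re

# Lookbehind: the "$" must come right before the digits,
# but it is not part of the matched text.
price = re.search(r"(?<=\$)\d+", "Price: $100")

# Lookahead: "px" must come right after the digits,
# but it is not part of the matched text.
height = re.search(r"\d+(?=px)", "height: 100px")

print(price.group())   # 100
print(height.group())  # 100
```

In both cases group() returns only the digits; the "$" and "px" are checked but left out of the match.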

The negative versions

Negative lookahead (?!...) — assert that what comes after does NOT match
Negative lookbehind (?<!...) — assert that what comes before does NOT match

foo(?!bar)         match "foo" only if NOT followed by "bar"
(?<!@)\bword\b     match "word" only if NOT preceded by "@"

Why use lookarounds at all?

1. Multi-condition validation

The strong-password regex uses multiple lookaheads to assert each requirement:

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^\w\s]).{8,}$

Each lookahead is an independent assertion: "somewhere there's a lowercase letter," "somewhere there's a digit," etc. They're evaluated from the same start position; none of them consume characters. Then .{8,} actually matches the password.
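
A minimal sketch of that pattern wrapped in a Python helper. The function name is_strong and the sample passwords are hypothetical, chosen only to show one failing requirement at a time.

```python
import re

# Four independent lookaheads, then .{8,} does the actual matching.
STRONG = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^\w\s]).{8,}$")

def is_strong(password: str) -> bool:
    """Return True if the password satisfies every lookahead and the length rule."""
    return STRONG.match(password) is not None

print(is_strong("Passw0rd!"))   # True
print(is_strong("password1!"))  # False: no uppercase letter
print(is_strong("Password1"))   # False: no symbol
print(is_strong("Pw0!"))        # False: shorter than 8 characters
```

Note that [^\w\s] treats the underscore as a word character, so "_" alone does not satisfy the symbol requirement.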

2. Inserting at boundaries

The thousands-separator trick:

Pattern: (\d)(?=(\d{3})+$)
Replace: $1,
Input:   1234567
Output:  1,234,567

This matches each digit that is followed by one or more complete groups of three digits running to the end of the string. Because the lookahead doesn't consume those digits, they stay available for later matches, so a global replace finds every comma-insertion point.
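
The same replacement can be sketched in Python. Two details differ from the notation above: Python writes the backreference as \1 rather than $1, and re.sub replaces every match by default. The helper name add_thousands is an assumption for illustration.

```python
import re

def add_thousands(digits: str) -> str:
    """Insert a comma after every digit that is followed by whole groups of three digits."""
    return re.sub(r"(\d)(?=(\d{3})+$)", r"\1,", digits)

print(add_thousands("1234567"))  # 1,234,567
print(add_thousands("1000"))     # 1,000
print(add_thousands("123"))      # 123
```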

3. Negative filtering

(?<!unsubscribed_)email     match "email" but not "unsubscribed_email"
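
A short sketch of that filter in Python, applied to a hypothetical list of field names. The lookbehind here is a fixed 13 characters wide, so it also works in engines that only allow fixed-width lookbehinds.

```python
import re

# Hypothetical field names, used only to illustrate the filter.
fields = ["email", "unsubscribed_email", "work_email"]

pattern = re.compile(r"(?<!unsubscribed_)email")

# Keep the fields where "email" appears without the unwanted prefix.
kept = [name for name in fields if pattern.search(name)]

print(kept)  # ['email', 'work_email']
```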

The fixed-width lookbehind problem

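Lookbehinds aren't as universally supported as lookaheads. Several flavors require fixed-width lookbehinds: the engine must know in advance exactly how many characters the lookbehind matches, which rules out quantifiers like + and *, and in the strictest flavors also requires every alternative to have the same length: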

Flavor	Lookbehind support
JavaScript (ES2018+)	✓ variable-width
Python stdlib re	Fixed-width only
Python regex (3rd party)	✓ variable-width
PCRE	Fixed-width, but top-level alternatives may differ in length
Java	Bounded-width

In flavors that require fixed width, (?<=a+) won't compile — the + means variable length.
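
Python's standard re module is one such flavor, so the restriction is easy to observe. The sketch below uses a hypothetical helper, compiles, that only reports whether a pattern is accepted.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(r"(?<=a)b"))     # True: the lookbehind is exactly one character wide
print(compiles(r"(?<=a+)b"))    # False: "+" makes the width variable
print(compiles(r"(?<=a|bc)d"))  # False: the alternatives differ in length
```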

Workaround for fixed-width restrictions

If you need a variable-width lookbehind and your engine can't do it, restructure as a capture group instead:

Want:        (?<=\$+)\d+
Workaround:  \$+(\d+)

The prefix is now part of the overall match, but group 1 holds just the number. You get the same value; you read it from the capture group instead of from the whole match.
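
A minimal sketch of the workaround in Python, where the variable-width lookbehind would not compile. The input string is made up for illustration.

```python
import re

text = "Total: $$250"

# The "$" run is consumed by the match; the parentheses capture only the digits.
m = re.search(r"\$+(\d+)", text)

print(m.group(0))  # $$250  (whole match, prefix included)
print(m.group(1))  # 250    (the value you actually want)
```

One difference to keep in mind: because the prefix is consumed, it cannot take part in another match later in the same scan, which a lookbehind would have allowed.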

Performance

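Lookarounds do extra work: they run a separate sub-match at the current position. This is usually fine, but be careful with patterns inside lookaheads that themselves backtrack heavily. (?=.*\d.*[a-z].*[A-Z]) chains three .* scans inside one lookahead, so on long input that fails to match, a backtracking engine can retry many combinations of the three, and an unanchored pattern repeats that work at every starting position.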

The takeaway

Use lookahead when the context is after the match — extracting a number before a unit, validating multiple constraints, finding insertion points. Use lookbehind when the context is before — extracting values after a prefix, filtering by previous character.

If your environment has limited lookbehind support, restructure the pattern to use a regular capture group instead. The result is the same; you just access the value differently.