Concepts May 8, 2026

Lookahead vs lookbehind — when to use each

Both check context without consuming it. The right one depends on where the context is.

The short answer

  • Lookahead (?=...) — assert what comes after the current position
  • Lookbehind (?<=...) — assert what comes before the current position

Both are zero-width assertions — they check a condition but don't consume characters. The match doesn't include them.

Side-by-side examples

Extracting a number after a $ sign

Lookbehind:  (?<=\$)\d+
Input:       "Price: $100"
Match:       "100"   (the $ stays where it is)

Extracting a number before "px"

Lookahead:   \d+(?=px)
Input:       "height: 100px"
Match:       "100"   (the "px" stays where it is)

Both extract just the number, using the prefix or suffix as context. The lookaround confirms that the context is there without including it in the match, which is useful when you want clean values without stripping prefixes or suffixes afterwards.
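As a quick check, the two extractions above can be run with Python's standard re module. This is a minimal sketch; the variable names are only illustrative.

```python
import re

# Lookbehind: "$" must sit right before the digits, but is not part of the match.
after_dollar = re.search(r"(?<=\$)\d+", "Price: $100")

# Lookahead: "px" must follow the digits, but is not part of the match.
before_px = re.search(r"\d+(?=px)", "height: 100px")

print(after_dollar.group())  # 100
print(before_px.group())     # 100
```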

The negative versions

  • Negative lookahead (?!...) — assert that what comes after does NOT match
  • Negative lookbehind (?<!...) — assert that what comes before does NOT match

foo(?!bar)         match "foo" only if NOT followed by "bar"
(?<!@)\bword\b     match "word" only if NOT preceded by "@"
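A small sketch of both negative forms in Python, using made-up sample strings. Each list records the start offsets of the matches that survive the assertion.

```python
import re

# Negative lookahead: keep "foo" only when "bar" does not follow it.
foo_hits = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz foo")]

# Negative lookbehind: keep "word" only when "@" does not precede it.
# The \b anchors also reject the "word" inside "sword".
word_hits = [m.start() for m in re.finditer(r"(?<!@)\bword\b", "@word word sword")]

print(foo_hits)   # [7, 14]
print(word_hits)  # [6]
```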

Why use lookarounds at all?

1. Multi-condition validation

The strong-password regex uses multiple lookaheads to assert each requirement:

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^\w\s]).{8,}$

Each lookahead is an independent assertion: "somewhere there's a lowercase letter," "somewhere there's a digit," etc. They're evaluated from the same start position; none of them consume characters. Then .{8,} actually matches the password.
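One way to try the pattern is to wrap it in a small helper. The sketch below assumes Python's re module; the names STRONG and is_strong are hypothetical, chosen for illustration.

```python
import re

# Four independent lookaheads, all evaluated at position 0, then the real match.
STRONG = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^\w\s]).{8,}$")

def is_strong(password: str) -> bool:
    """Return True if the password satisfies every lookahead and is 8+ chars."""
    return STRONG.search(password) is not None

print(is_strong("Abcdef1!"))  # True
print(is_strong("abcdef1!"))  # False: no uppercase letter
print(is_strong("Abcdefg1"))  # False: no symbol
print(is_strong("Abc1!"))     # False: shorter than 8 characters
```

Note that [^\w\s] treats the underscore as a word character, so "_" on its own does not count as a symbol here.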

2. Inserting at boundaries

The thousands-separator trick:

Pattern: (\d)(?=(\d{3})+$)
Replace: $1,
Input:   1234567
Output:  1,234,567

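This matches each digit that is followed by one or more complete groups of three digits running to the end of the string. Applied globally, the replacement puts a comma after each of those digits. The lookahead doesn't consume, so the trailing digits stay available for the next match and no insertion point is lost.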
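The same replacement can be sketched in Python, where the backreference is written \1 rather than $1 and re.sub replaces every match by default. The function name add_thousands is hypothetical.

```python
import re

def add_thousands(digits: str) -> str:
    """Insert a comma after each digit that is followed by whole groups of three."""
    return re.sub(r"(\d)(?=(\d{3})+$)", r"\1,", digits)

print(add_thousands("1234567"))  # 1,234,567
print(add_thousands("1000"))     # 1,000
print(add_thousands("123"))      # 123
```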

3. Negative filtering

(?<!unsubscribed_)email     match "email" but not "unsubscribed_email"
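To see the filter in action, the sketch below upper-cases every "email" that passes the negative lookbehind. The sample text is invented for illustration.

```python
import re

text = "email, unsubscribed_email, work_email"

# Only occurrences NOT preceded by "unsubscribed_" are replaced.
marked = re.sub(r"(?<!unsubscribed_)email", "EMAIL", text)

print(marked)  # EMAIL, unsubscribed_email, work_EMAIL
```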

The fixed-width lookbehind problem

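Lookbehinds aren't as widely supported as lookaheads. Several flavors require fixed-width lookbehinds, meaning the pattern inside must always match the same number of characters. Unbounded quantifiers such as + and * are out, and the strictest engines also require every alternative to have the same length: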

Flavor                      Lookbehind support
JavaScript (ES2018+)        ✓ variable-width
Python stdlib re            Fixed-width only
Python regex (3rd party)    ✓ variable-width
PCRE                        Mostly fixed-width
Java                        Bounded-width

In flavors that require fixed width, (?<=a+) won't compile — the + means variable length.
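Python's standard re module is one of the fixed-width flavors, so the restriction is easy to observe there. In this sketch the helper name compiles is hypothetical; it simply reports whether a pattern is accepted.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if the stdlib re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(r"(?<=a)b"))      # True: lookbehind is exactly one character
print(compiles(r"(?<=ab|cd)e"))  # True: both alternatives are two characters
print(compiles(r"(?<=a+)b"))     # False: "+" makes the width variable
print(compiles(r"(?<=a|bc)d"))   # False: alternatives differ in length
```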

Workaround for fixed-width restrictions

If you need a variable-width lookbehind and your engine can't do it, restructure as a capture group instead:

Want:        (?<=\$+)\d+
Workaround:  \$+(\d+)

You consume the prefix and capture just the number you want — same result, different access pattern.
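A minimal Python sketch of the workaround, with an invented input string. The overall match now includes the dollar signs, so the number is read from group 1 instead of group 0.

```python
import re

m = re.search(r"\$+(\d+)", "Total: $$250")

whole = m.group(0)   # the full match, prefix included
number = m.group(1)  # just the captured digits

print(whole)   # $$250
print(number)  # 250
```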

Performance

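Lookarounds do extra work: each one runs a separate sub-match at the current position. This is usually fine, but be careful with patterns inside lookaheads that themselves backtrack heavily. (?=.*\d.*[a-z].*[A-Z]) chains three .* scans in a single lookahead; on a long input that fails to match, a backtracking engine retries many combinations of split points, and an unanchored pattern repeats that work from every start position.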

The takeaway

Use lookahead when the context is after the match — extracting a number before a unit, validating multiple constraints, finding insertion points. Use lookbehind when the context is before — extracting values after a prefix, filtering by previous character.

If your environment has limited lookbehind support, restructure the pattern to use a regular capture group instead. The result is the same; you just access the value differently.

