How-to May 4, 2026

Regex to match anything between two strings

The pattern is simple, but there are three ways to write it, and which one is best depends on your delimiters.

The short answer

To match text between START and END:

START(.*?)END        lazy quantifier, simple but can backtrack
START(.+?)END        same, requires at least one character
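"([^"]*)"            negated class, for single-character delimiters (here, double quotes)

The captured content is in group 1. Use the lazy version for clarity, and the negated class for performance when the delimiter is a single character. A class like [^E] is not a substitute for a multi-character END: it rejects every E in the content, not just the one that starts the delimiter.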

The three approaches

1. Lazy quantifier — readable

Pattern: BEGIN(.*?)END
Input:   "BEGINfirstENDmiddleBEGINsecondEND"
With g flag: matches "BEGINfirstEND" and "BEGINsecondEND"
Capture group 1: "first" and "second"

This is the most common form. .*? matches as little as possible — it stops at the first END it can find.
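As a minimal sketch of the lazy form in Python, the helper below builds the pattern from two arbitrary delimiter strings. The function name between is hypothetical, chosen for illustration; re.escape is used on the assumption that the delimiters should be treated as literal text rather than regex syntax.

```python
import re

def between(text, start, end):
    """Return every substring found between start and end (non-overlapping)."""
    # re.escape keeps delimiters such as "[" or "(" from being read as regex syntax.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    return re.findall(pattern, text)

print(between("BEGINfirstENDmiddleBEGINsecondEND", "BEGIN", "END"))
# ['first', 'second']
```

Because the pattern has exactly one capture group, re.findall returns the group contents rather than the full matches.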

2. Negated character class — fast

If your delimiters are single characters (like quotes or brackets), use a negated class:

"([^"]*)"            content between double quotes
\[([^\]]*)\]         content between square brackets
\(([^)]*)\)          content between parens (single-level only)

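This is usually faster than lazy .*? because the engine doesn't have to test for the closing delimiter after every character: the class consumes the whole run of characters that cannot be the delimiter in one pass. Unlike ., a negated class also matches newlines.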
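A short Python sketch of the negated-class form, using made-up sample strings. It shows the pattern applied to quotes and to single-level parentheses, including content that spans a line break.

```python
import re

line = 'name="report" type="pdf" note=""'

# [^"]* consumes every character up to the next double quote in one run.
values = re.findall(r'"([^"]*)"', line)
print(values)  # ['report', 'pdf', '']

# A negated class is not affected by the dotall flag: it matches newlines too.
multi = re.findall(r'\(([^)]*)\)', "f(a,\nb) g(c)")
print(multi)  # ['a,\nb', 'c']
```

The * quantifier allows empty content (the last value above); switch to + if empty matches should be skipped.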

3. Lookarounds — clean output

If you want just the inner text without including the delimiters in the match itself:

Pattern: (?<=START).*?(?=END)
Result:  matches just the middle text, no capture group needed

The lookbehind and lookahead don't consume the delimiters. The match itself is just the inside.
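The lookaround form can be tried in Python as sketched below, with an invented sample string. One caveat to keep in mind: Python's re module only accepts fixed-width lookbehind patterns, which a literal delimiter such as START satisfies.

```python
import re

text = "STARTalphaEND and STARTbetaEND"

# The lookbehind and lookahead assert the delimiters without consuming them,
# so each match is only the inner text.
inner = re.findall(r"(?<=START).*?(?=END)", text)
print(inner)  # ['alpha', 'beta']
```

With no capture group in the pattern, re.findall returns the full matches, which here are already the inner text.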

When the delimiter is multi-character

For multi-character delimiters like <!-- and -->:

Lazy:        <!--(.*?)-->
Negated:     <!--((?:[^-]|-(?!->))*)-->    (much harder to read)

The negated-class approach gets complex with multi-char delimiters. Stick with the lazy quantifier unless you specifically need ReDoS resistance.
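To make the trade-off concrete, here is a rough Python comparison on an invented HTML snippet. The explicit pattern is one possible hand-written variant, not the only way to write it: it allows any character that is not a hyphen, or a hyphen that does not begin the closing --> sequence.

```python
import re

html = "<p>x</p><!-- first --><p>y</p><!-- a-b --c -->"

lazy = re.findall(r"<!--(.*?)-->", html)

# Hand-rolled alternative: any character that is not "-", or a "-" that does
# not begin the closing "-->" sequence.
explicit = re.findall(r"<!--((?:[^-]|-(?!->))*)-->", html)

print(lazy)      # [' first ', ' a-b --c ']
print(explicit)  # [' first ', ' a-b --c ']
```

Both patterns return the same content on this input; the difference is in how they get there and how easy they are to read.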

Multiline content

By default, . doesn't match newlines. If your content spans lines, enable the dotall flag:

JavaScript: /START(.*?)END/s (ES2018+)
Python: re.compile(r"START(.*?)END", re.DOTALL)
PCRE: /s modifier

Or use [\s\S] instead of . — that works without any flag:

START([\s\S]*?)END     works across newlines without flags
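A small Python check of the three variants on the same two-line input (the sample text is invented for illustration):

```python
import re

text = "START\nline one\nline two\nEND"

no_flag = re.findall(r"START(.*?)END", text)
dotall = re.findall(r"START(.*?)END", text, re.DOTALL)
any_char = re.findall(r"START([\s\S]*?)END", text)

print(no_flag)   # []
print(dotall)    # ['\nline one\nline two\n']
print(any_char)  # ['\nline one\nline two\n']
```

The captured text includes the newlines next to the delimiters; call .strip() on the result if they are not wanted.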

JavaScript example

const text = "[apple][banana][cherry]";
const matches = [...text.matchAll(/\[([^\]]+)\]/g)].map(m => m[1]);
// ["apple", "banana", "cherry"]

Python example

import re
text = "[apple][banana][cherry]"
matches = re.findall(r"\[([^\]]+)\]", text)
# ['apple', 'banana', 'cherry']

Common mistakes

Forgetting to make it lazy

Pattern: BEGIN(.*)END
Input:   "BEGINfirstENDmiddleBEGINsecondEND"
Match:   the WHOLE thing — captures "firstENDmiddleBEGINsecond"

Greedy .* grabs everything to the last END. Use lazy .*? instead.
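The difference is easy to confirm in Python by running both patterns against the input from the example above:

```python
import re

text = "BEGINfirstENDmiddleBEGINsecondEND"

greedy = re.findall(r"BEGIN(.*)END", text)
lazy = re.findall(r"BEGIN(.*?)END", text)

print(greedy)  # ['firstENDmiddleBEGINsecond']
print(lazy)    # ['first', 'second']
```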

Not handling nested delimiters

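Standard regular expressions can't handle balanced nesting. Some engines offer extensions for it (recursion in PCRE, balancing groups in .NET), but these are not portable. \(.*?\) on input (a(b)c) matches (a(b), not (a(b)c). For nested structures, you need a parser, not a regex.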
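A full parser is not always necessary. For a single pair of bracket characters, a depth counter is often enough. The sketch below is one minimal way to do it in Python; the function name top_level_groups and its parameters are hypothetical, and it simply ignores unbalanced closers rather than reporting them.

```python
def top_level_groups(text, open_ch="(", close_ch=")"):
    """Return the contents of each balanced top-level group."""
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == open_ch:
            if depth == 0:
                start = i + 1  # content begins just after the opener
            depth += 1
        elif ch == close_ch:
            if depth == 0:
                continue  # stray closer: ignored in this sketch
            depth -= 1
            if depth == 0:
                groups.append(text[start:i])
    return groups

print(top_level_groups("(a(b)c) and (d)"))  # ['a(b)c', 'd']
```

A group that is opened but never closed is silently dropped here; a real implementation would probably want to raise an error instead.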

The takeaway

Use the lazy quantifier for readability. Use the negated character class for single-character delimiters when performance matters. Use lookarounds when you want the middle text without delimiters in the match.

For nested or recursive structures, regex isn't the right tool — switch to a parser.