Regex to match anything between two strings
The pattern is simple, but there are three ways to write it, and each one suits a different situation.
The short answer
To match text between START and END:
START(.*?)END lazy quantifier, simple but can backtrack
START(.+?)END same, requires at least one character
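"([^"]*)" negated class, for single-character delimiters such as quotes (faster, little backtracking)
The captured content is in group 1. Use the lazy version for clarity, and the negated class for performance when the closing delimiter is a single character. A class like [^E] excludes every E, not the word END, so it cannot stand in for a multi-character delimiter.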
The three approaches
1. Lazy quantifier — readable
Pattern: BEGIN(.*?)END
Input: "BEGINfirstENDmiddleBEGINsecondEND"
With g flag: matches "BEGINfirstEND" and "BEGINsecondEND"
Capture group 1: "first" and "second"
This is the most common form. .*? matches as little as possible — it stops at the first END it can find.
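A minimal Python sketch of the example above, assuming the standard `re` module; `finditer` stands in for the g flag, and the variable names are only illustrative.

```python
import re

text = "BEGINfirstENDmiddleBEGINsecondEND"
pattern = re.compile(r"BEGIN(.*?)END")

# finditer yields every non-overlapping match, like the g flag in JavaScript.
full_matches = [m.group(0) for m in pattern.finditer(text)]
inner_texts = [m.group(1) for m in pattern.finditer(text)]

print(full_matches)  # ['BEGINfirstEND', 'BEGINsecondEND']
print(inner_texts)   # ['first', 'second']
```

The text "middle" sits between an END and the next BEGIN, so it is never part of any match.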
2. Negated character class — fast
If your delimiters are single characters (like quotes or brackets), use a negated class:
"([^"]*)" content between double quotes
\[([^\]]*)\] content between square brackets
\(([^)]*)\) content between parens (single-level only)
This is usually faster than lazy .*? because the engine does not have to test for the closing delimiter after every character: the class consumes, in one pass, every character that cannot end the match.
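The three patterns above can be tried in Python as follows. This is a sketch with made-up sample inputs; only the patterns come from the text.

```python
import re

quoted = re.compile(r'"([^"]*)"')
bracketed = re.compile(r"\[([^\]]*)\]")
parenthesized = re.compile(r"\(([^)]*)\)")

# findall returns the contents of group 1 for each match.
quoted_result = quoted.findall('say "hello" and "bye"')
bracketed_result = bracketed.findall("[a][b c][]")
paren_result = parenthesized.findall("f(x) + g(y, z)")

print(quoted_result)     # ['hello', 'bye']
print(bracketed_result)  # ['a', 'b c', '']
print(paren_result)      # ['x', 'y, z']
```

Because the quantifier is `*`, an empty pair such as `[]` still matches and yields an empty string; switch to `+` to require at least one character.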
3. Lookarounds — clean output
If you want just the inner text without including the delimiters in the match itself:
Pattern: (?<=START).*?(?=END)
Result: matches just the middle text, no capture group needed
The lookbehind and lookahead don't consume the delimiters. The match itself is just the inside.
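A short Python sketch of the lookaround form, with an invented sample string. Python's `re` module only accepts fixed-width lookbehind, which a literal delimiter like START satisfies; other engines may have different restrictions.

```python
import re

text = "STARTalphaEND and STARTbetaEND"
lookaround = re.compile(r"(?<=START).*?(?=END)")

# group(0) is the whole match; the delimiters are asserted, not consumed.
inner = [m.group(0) for m in lookaround.finditer(text)]

print(inner)  # ['alpha', 'beta']
```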
When the delimiter is multi-character
For multi-character delimiters like <!-- and -->:
Lazy: <!--(.*?)-->
Negated: <!--((?:[^-]|-(?!->))*)--> (much harder to read)
The negated-class approach gets complex with multi-char delimiters. Stick with the lazy quantifier unless you specifically need ReDoS resistance.
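To compare the two forms, here is a hedged Python sketch on an invented HTML snippet. The explicit pattern below uses `-(?!->)`, meaning "a dash that does not begin the closing `-->`", so a bare `--` inside the content is still allowed.

```python
import re

html = "<!-- first --><p>text</p><!-- a -- b -->"

lazy = re.compile(r"<!--(.*?)-->")
# Either a non-dash character, or a dash that does not start "-->".
explicit = re.compile(r"<!--((?:[^-]|-(?!->))*)-->")

lazy_result = lazy.findall(html)
explicit_result = explicit.findall(html)

print(lazy_result)      # [' first ', ' a -- b ']
print(explicit_result)  # [' first ', ' a -- b ']
```

Both patterns capture the same text on this input; the difference is in how the engine gets there, and in how readable the pattern is.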
Multiline content
By default, . doesn't match newlines. If your content spans lines, enable the dotall flag:
- JavaScript: /START(.*?)END/s (ES2018+)
- Python: re.compile(r"START(.*?)END", re.DOTALL)
- PCRE: the /s modifier
Or use [\s\S] instead of . — that works without any flag:
START([\s\S]*?)END works across newlines without flags
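The effect of the flag can be checked in Python with a two-line input. This is a sketch; the sample text is invented.

```python
import re

text = "START line one\nline two END"

# Without DOTALL, "." stops at the newline, so END is never reached.
no_flag = re.search(r"START(.*?)END", text)
with_dotall = re.search(r"START(.*?)END", text, re.DOTALL)
with_class = re.search(r"START([\s\S]*?)END", text)

print(no_flag)                      # None
print(repr(with_dotall.group(1)))   # ' line one\nline two '
print(repr(with_class.group(1)))    # ' line one\nline two '
```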
JavaScript example
const text = "[apple][banana][cherry]";
const matches = [...text.matchAll(/\[([^\]]+)\]/g)].map(m => m[1]);
// ["apple", "banana", "cherry"]
Python example
import re
text = "[apple][banana][cherry]"
matches = re.findall(r"\[([^\]]+)\]", text)
# ["apple", "banana", "cherry"]
Common mistakes
Forgetting to make it lazy
Pattern: BEGIN(.*)END
Input: "BEGINfirstENDmiddleBEGINsecondEND"
Match: the WHOLE thing — captures "firstENDmiddleBEGINsecond"
Greedy .* grabs everything to the last END. Use lazy .*? instead.
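The difference is easy to reproduce in Python on the same input (a minimal sketch):

```python
import re

text = "BEGINfirstENDmiddleBEGINsecondEND"

greedy = re.findall(r"BEGIN(.*)END", text)
lazy = re.findall(r"BEGIN(.*?)END", text)

print(greedy)  # ['firstENDmiddleBEGINsecond']
print(lazy)    # ['first', 'second']
```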
Not handling nested delimiters
Regex can't handle balanced nesting in general. \(.*?\) on input (a(b)c) matches (a(b), not (a(b)c). For nested structures, you need a parser, not a regex.
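As an illustration of what "a parser" can mean here, the sketch below tracks nesting depth with a counter. The function name `outermost_groups` is hypothetical, and the code assumes plain parentheses with no escaping or quoting rules.

```python
import re

def outermost_groups(text):
    """Return the contents of each top-level (...) group, nesting included."""
    groups = []
    depth = 0
    start = 0
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i + 1  # content begins just after the opening paren
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:i])
    return groups

regex_result = re.findall(r"\(.*?\)", "(a(b)c)")
parser_result = outermost_groups("(a(b)c) and (d)")

print(regex_result)   # ['(a(b)']
print(parser_result)  # ['a(b)c', 'd']
```

Unbalanced closing parens are ignored and unclosed groups are dropped; a real parser would likely report those as errors instead.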
The takeaway
Use the lazy quantifier for readability. Use the negated character class for single-character delimiters when performance matters. Use lookarounds when you want the middle text without delimiters in the match.
For nested or recursive structures, regex isn't the right tool — switch to a parser.