Guide

Capture groups — every kind explained

Parentheses do four jobs in regex: grouping for quantifiers, capturing for back-references, naming, and assertions. This guide covers them all with concrete examples.

1. Capturing groups

The simplest form: (pattern). The parens both group the contents for quantifiers and capture the match for back-references.

// "2024-06-15"
const m = "2024-06-15".match(/(\d{4})-(\d{2})-(\d{2})/);
m[0];  // "2024-06-15" — full match
m[1];  // "2024" — group 1
m[2];  // "06"
m[3];  // "15"

Groups are numbered by the position of their opening paren, left to right. Nested groups share the same numbering:

// ((a)(b)) — three groups: "ab", "a", "b"
"ab".match(/((a)(b))/);  // ["ab", "ab", "a", "b"]

2. Non-capturing groups

Sometimes you want to group for quantifiers or alternation, but you don't need to capture the match. Use (?:pattern).

// Match one or more repetitions of "ab"
"ababab".match(/(?:ab)+/);  // ["ababab"] — only one group: the full match

Why bother? Three reasons:

Performance. Capturing has overhead. For groups you don't need to reference, non-capturing is faster.
Readability. When the match has many groups, non-capturing for grouping-only operations keeps the back-reference numbers stable.
Numbering. Adding a new non-capturing group doesn't renumber the other groups.

3. Named capture groups

Capture by name instead of number. Far more readable for complex patterns.

// JavaScript / .NET / Java
(?<year>\d{4})-(?<month>\d{2})

// Python (with the P)
(?P<year>\d{4})-(?P<month>\d{2})

Access by name:

// JavaScript
const m = "2024-06".match(/(?<year>\d{4})-(?<month>\d{2})/);
m.groups.year;   // "2024"
m.groups.month;  // "06"

# Python
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "2024-06")
m.group("year")   # "2024"
m.groupdict()     # {"year": "2024", "month": "06"}

Named groups are still numbered too — group 1 in the example above is "2024", same as if it were unnamed.

4. Back-references inside the pattern

Refer to a previously captured group from within the same pattern:

// Match repeated words: "hello hello", "the the"
\b(\w+)\s+\1\b

// Match an HTML tag and its closing tag
<(\w+)>.*?</\1>

Back-references match the literal text of the earlier capture, not the pattern. So (\w+)\s\1 matches "ab ab" but not "ab cd" even though both groups would individually match \w+.

5. Named back-references

Same idea but by name:

// JavaScript: \k<name>
(?<word>\w+)\s+\k<word>

# Python: (?P=name)
(?P<word>\w+)\s+(?P=word)

6. Atomic groups (advanced)

An atomic group (?>pattern) commits to whatever the inner pattern matched — no backtracking inside. This prevents catastrophic backtracking in cases where regular groups would retry.

// Vulnerable to ReDoS
(\w+)+$

// Safe equivalent
(?>\w+)+$  // PCRE / Java / Ruby / .NET
\w++$        // possessive quantifier — same idea

JavaScript supports atomic groups in ES2025 engines and later. Not available in Python's standard re module (use the third-party regex module).

Lookaround groups (assertions)

Lookarounds use parens but don't actually capture — they assert that a position is followed/preceded by a pattern.

// Positive lookahead — followed by
\d+(?=USD)

// Negative lookahead — NOT followed by
\d+(?!USD)

// Positive lookbehind — preceded by
(?<=\$)\d+

// Negative lookbehind — NOT preceded by
(?<!\$)\d+

See our lookarounds guide for the deep dive.

Practical examples

Reformat a name

// "Last, First" → "First Last"
"Smith, Alice".replace(/(\w+), (\w+)/, "$2 $1")
// "Alice Smith"

Extract key-value pairs

// "key1=value1&key2=value2"
const params = {};
"name=alice&age=30".replace(/(\w+)=(\w+)/g, (_, k, v) => {
  params[k] = v;
});
// { name: "alice", age: "30" }

Find duplicate adjacent words

"the the cat sat on on the mat".match(/\b(\w+)\s+\1\b/g);
// ["the the", "on on"]

Performance note

Each capturing group has some overhead in backtracking engines. For hot loops on large inputs, prefer non-capturing groups when you don't need the values.