Regex substitution — search and replace with back-references
Substitution is where regex starts to feel powerful. You're not just finding text — you're transforming it. This guide covers the syntax, the back-reference rules, and the differences between languages.
The basic form
Every language follows a similar shape: a pattern, a replacement, and the source text.
// JavaScript
"hello world".replace(/world/, "JavaScript")
// "hello JavaScript"
# Python
re.sub(r"world", "Python", "hello world")
# "hello Python"
The defaults differ. In JavaScript, a pattern without the g flag replaces only the first match; add /g to replace all of them. Python's re.sub replaces every match by default; pass count=1 to stop after the first.
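The Python default is easy to check directly. The following is a minimal sketch; the sample sentence is made up for illustration.

```python
import re

text = "one fish two fish red fish"  # illustrative sample input

# Default: every match is replaced.
all_replaced = re.sub(r"fish", "cat", text)

# count=1 limits the substitution to the first match.
first_only = re.sub(r"fish", "cat", text, count=1)

print(all_replaced)  # one cat two cat red cat
print(first_only)    # one cat two fish red fish
```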
Back-references in the replacement
The capture groups from the pattern are available in the replacement. The syntax differs by language:
| Language | Numbered | Named |
|---|---|---|
| JavaScript | $1, $2 | $<name> |
| Python | \1, \2 | \g<name> |
| PHP | $1, $2 or \1, \2 | not supported in preg_replace (use preg_replace_callback) |
| Java/.NET | $1, $2 | ${name} |
| Go | $1, $2 | $name |
| Ruby | \1, \2 | \k<name> |
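To make the Python row concrete, here is a small sketch of numbered and named references in a replacement template. The group names `first` and `last` and the sample name are chosen only for illustration.

```python
import re

# Named groups are declared with (?P<name>...) in Python.
pattern = r"(?P<first>\w+) (?P<last>\w+)"

# Numbered back-references: \1 and \2.
by_number = re.sub(pattern, r"\2, \1", "Ada Lovelace")

# Named back-references: \g<name>.
by_name = re.sub(pattern, r"\g<last>, \g<first>", "Ada Lovelace")

# \g<1> avoids ambiguity when a digit follows the reference:
# r"\10" would mean group 10, while r"\g<1>0" means group 1 followed by "0".
padded = re.sub(r"(\d)", r"\g<1>0", "7")

print(by_number)  # Lovelace, Ada
print(by_name)    # Lovelace, Ada
print(padded)     # 70
```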
Common substitution recipes
Reformat a date
// JS: YYYY-MM-DD → DD/MM/YYYY
"2024-06-15".replace(/(\d{4})-(\d{2})-(\d{2})/, "$3/$2/$1")
// "15/06/2024"
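The same recipe in Python might look like the sketch below. It uses named groups so the replacement reads without counting parentheses; the constant name `ISO_DATE` and the sample sentence are illustrative, not part of any library.

```python
import re

# Compile once so the pattern can be reused across many strings.
ISO_DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

text = "Released 2024-06-15, patched 2024-07-01."  # illustrative input

# Every date in the text is rewritten, since sub() replaces all matches.
result = ISO_DATE.sub(r"\g<day>/\g<month>/\g<year>", text)

print(result)  # Released 15/06/2024, patched 01/07/2024.
```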
Anonymize emails
// Replace local part with asterisks
"alice@example.com".replace(/[^@]+(?=@)/, "****")
// "****@example.com"
Collapse whitespace
" hello world ".replace(/\s+/g, " ").trim()
// "hello world"
Convert markdown links to HTML
"[click here](https://example.com)".replace(
/\[([^\]]+)\]\(([^)]+)\)/g,
'<a href="$2">$1</a>'
)
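For comparison, a Python sketch of the same conversion, using \1 and \2 in place of $1 and $2. Like the JavaScript version, it does no HTML escaping of the link text or URL, so treat it as a demonstration of back-references rather than a safe Markdown converter.

```python
import re

markdown = "[click here](https://example.com)"

# Group 1 is the link text, group 2 is the URL.
html = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r'<a href="\2">\1</a>', markdown)

print(html)  # <a href="https://example.com">click here</a>
```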
Function replacements
The replacement can be a function called for each match. This is where regex substitution becomes really powerful.
// JavaScript: capitalize every word
"hello world".replace(/\b\w/g, c => c.toUpperCase())
// "Hello World"
# Python: same thing
re.sub(r"\b\w", lambda m: m.group().upper(), "hello world")
# "Hello World"
The $& and \0 problem
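To insert the whole match, JavaScript uses $& and Python uses \g<0>. Python does not accept \0 for this: in a replacement template it is parsed as an octal escape and inserts a NUL character. (PHP and Ruby do accept \0.)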
// Wrap every number in brackets
"abc 123 def".replace(/\d+/g, "[$&]")
// "abc [123] def"
To insert a literal $ in a JS replacement string, use $$. To insert a literal \ in a Python replacement string, use \\.
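A short Python sketch of these rules, showing the whole-match reference, what \0 actually produces, and the doubled backslash. The sample strings are illustrative.

```python
import re

# \g<0> inserts the whole match.
wrapped = re.sub(r"\d+", r"[\g<0>]", "abc 123 def")

# \0 is not a back-reference in Python: it is read as an octal escape (NUL).
nul = re.sub(r"\d+", r"[\0]", "abc 123 def")

# A doubled backslash in the template yields one literal backslash.
slashed = re.sub(r"/", r"\\", "a/b")

print(wrapped)        # abc [123] def
print(repr(nul))      # 'abc [\x00] def'
print(slashed)        # a\b
```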
Conditional replacement
If you want to replace only when some condition is met, use a function:
"a 1 b 22 c 333".replace(/\d+/g, (match) => {
const n = parseInt(match, 10); // always pass the radix
return n > 10 ? `[${n}]` : match;
})
// "a 1 b [22] c [333]"
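In Python the same idea works by passing a function to re.sub. This is a sketch; `bracket_if_large` is a hypothetical helper name.

```python
import re

def bracket_if_large(match: re.Match) -> str:
    """Wrap numbers greater than 10 in brackets; leave the rest untouched."""
    n = int(match.group())
    return f"[{n}]" if n > 10 else match.group()

result = re.sub(r"\d+", bracket_if_large, "a 1 b 22 c 333")

print(result)  # a 1 b [22] c [333]
```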
Replace only the Nth occurrence
Most engines have no built-in way to replace just the third match. Use a function with a counter:
let count = 0;
"a a a a".replace(/a/g, m => ++count === 3 ? "X" : m);
// "a a X a"
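A Python sketch of the counter approach, wrapped in a function so the counter does not leak into the surrounding scope. `replace_nth` is a hypothetical helper, not part of the re module.

```python
import re
from itertools import count

def replace_nth(pattern: str, replacement: str, text: str, n: int) -> str:
    """Replace only the n-th (1-based) match of pattern in text."""
    counter = count(1)

    def swap(match: re.Match) -> str:
        # Keep every match as-is except the n-th one.
        return replacement if next(counter) == n else match.group()

    return re.sub(pattern, swap, text)

print(replace_nth(r"a", "X", "a a a a", 3))  # a a X a
```

Because the replacement is returned from a function, it is inserted verbatim, so back-reference syntax inside `replacement` is not expanded in this sketch.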
Preserve case when replacing
Use a function and inspect the original:
"Hello hello HELLO".replace(/hello/gi, m => {
if (m === m.toUpperCase()) return "WORLD";
if (m[0] === m[0].toUpperCase()) return "World";
return "world";
})
// "World world WORLD"
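The Python equivalent can lean on str.isupper. This is a sketch with a hypothetical helper name, `match_case`, and it handles only the same three cases as the JavaScript version (all caps, capitalized, anything else).

```python
import re

def match_case(match: re.Match) -> str:
    """Return 'world' in the same casing style as the matched word."""
    word = match.group()
    if word.isupper():
        return "WORLD"
    if word[0].isupper():
        return "World"
    return "world"

result = re.sub(r"hello", match_case, "Hello hello HELLO", flags=re.IGNORECASE)

print(result)  # World world WORLD
```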
Watch out for special characters in replacement
If the replacement string comes from user input, it may contain $ (JavaScript) or \ (Python), and you don't want those interpreted as back-reference syntax. To insert a literal string safely, escape it first:
// JavaScript
function escapeReplacement(s) {
return s.replace(/\$/g, "$$$$");
}
"abc".replace(/abc/, escapeReplacement(userInput));
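In Python the character to worry about is the backslash. Two common ways to insert untrusted text literally are sketched below; `user_input` is a hypothetical value standing in for whatever your application receives.

```python
import re

user_input = r"C:\new\1"  # hypothetical untrusted replacement text

# Option 1: a function's return value is inserted verbatim,
# so no template escapes are processed at all.
safe_fn = re.sub(r"PATH", lambda m: user_input, "dir=PATH")

# Option 2: double every backslash so the template parser
# treats each one as a literal character.
safe_escaped = re.sub(r"PATH", user_input.replace("\\", "\\\\"), "dir=PATH")

print(safe_fn)       # dir=C:\new\1
print(safe_escaped)  # dir=C:\new\1
```

Passed unescaped, this particular string would be read as a newline escape followed by a reference to group 1, which does not exist in the pattern.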