Converting regex between flavors
"Regex" is a family of related languages, not a single language. A pattern that works perfectly in Python may fail silently in JavaScript, or work but mean something different. This guide is the survival map.
The major flavors
- PCRE: Perl-Compatible Regular Expressions. The reference for "advanced" regex. Used in PHP, nginx, BBEdit, and many other tools.
- Python: the `re` module. Close to PCRE but with some unique syntax.
- JavaScript: the `RegExp` builtin. Common features only, no `\K` or recursion.
- Java: `java.util.regex`. Backtracking engine with most features.
- .NET: `System.Text.RegularExpressions`. Most powerful backtracking engine in widespread use.
- Go: the `regexp` package, which uses RE2 syntax (Russ Cox's design). Linear time, no backtracking, fewer features.
- Ruby: Onigmo engine. Close to Perl, full feature set.
What's identical across all of them
The basic building blocks work the same everywhere:
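- Character classes: `[abc]`, `[a-z]`, `[^abc]`
- Shorthand: `\d`, `\w`, `\s` (identical for ASCII input; whether they also match non-ASCII characters varies by flavor)
- Quantifiers: `*`, `+`, `?`, `{n,m}`
- Anchors: `^`, `$`, `\b`
- Capturing groups: `(...)`
- Non-capturing groups: `(?:...)`
- Alternation: `a|b`
- Lookahead: `(?=...)`, `(?!...)` (every flavor above except Go)
If your pattern uses only these, it almost certainly works everywhere. The one exception is lookahead, which Go rejects.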
Named group syntax — the most common gotcha
| Flavor | Definition | Back-reference |
|---|---|---|
| Python | (?P<name>...) | (?P=name) |
| JS, Java, .NET, PCRE | (?<name>...) | \k<name> |
| .NET (also) | (?'name'...) | \k'name' |
| Ruby | (?<name>...) | \k<name> |
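The P in (?P<name>...) originated in Python. PCRE and Go also accept that definition form, but JavaScript, Java, .NET, and Ruby do not, so converting from Python to those flavors almost always means removing the P, and converting to Python means adding it.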
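As a rough illustration of the add-or-remove-the-P conversion, the sketch below rewrites Python's named-group forms into the (?<name>...) and \k<name> forms from the table. The function name `python_to_js_named_groups` is made up for this example, and the rewrite is purely textual: it does not skip escaped parentheses or character classes, so treat it as a starting point rather than a complete converter.

```python
import re

# Python-only forms: the definition (?P<name>  and the back-reference (?P=name)
_PY_DEF = re.compile(r"\(\?P<([A-Za-z_]\w*)>")
_PY_REF = re.compile(r"\(\?P=([A-Za-z_]\w*)\)")


def python_to_js_named_groups(pattern: str) -> str:
    """Rewrite Python named-group syntax to the form used by JS, Java, and .NET.

    Hypothetical helper: a naive textual rewrite that ignores escaping
    and character-class context.
    """
    pattern = _PY_DEF.sub(r"(?<\1>", pattern)   # drop the P from definitions
    pattern = _PY_REF.sub(r"\\k<\1>", pattern)  # turn (?P=name) into \k<name>
    return pattern


converted = python_to_js_named_groups(r"(?P<year>\d{4})-(?P=year)")
print(converted)  # (?<year>\d{4})-\k<year>
```

Going the other direction is the mirror image: add the P to each definition and replace each \k<name> with (?P=name).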
Lookbehind
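- PCRE, Java, .NET, Ruby: lookbehind supported, with differing limits. .NET accepts arbitrary variable-length patterns; Java needs a bounded maximum length; PCRE and Ruby traditionally need each alternative to be fixed-length, though the alternatives may differ from one another.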
- JavaScript: variable-length lookbehind supported (ES2018+).
- Python: variable-length lookbehind supported in the third-party `regex` module; `re` requires fixed length.
- Go: not supported at all (RE2 is linear-time and rejects lookbehind).
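The fixed-length rule in Python's standard `re` module is easy to see directly. In this small sketch, the sample patterns and input string are made up for illustration: the first lookbehind has a fixed width and compiles, the second uses `\s+` and is refused at compile time.

```python
import re

# Fixed-length lookbehind: accepted by the standard-library re module.
fixed = re.compile(r"(?<=USD )\d+")
print(fixed.search("Total: USD 42").group())  # 42

# Variable-length lookbehind: re refuses to compile it.
try:
    re.compile(r"(?<=USD\s+)\d+")
    compiled = True
except re.error as exc:
    compiled = False
    print("rejected:", exc)
```

When porting such a pattern to `re`, one common workaround is to match the prefix normally and capture the part you want in a group instead of using lookbehind.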
Features Go doesn't have
Go's regexp is RE2, which trades features for linear-time guarantees. If you're porting to Go, watch for:
- No backreferences (`\1`)
- No lookahead or lookbehind
- No atomic groups or possessive quantifiers
- No recursion
- Some Unicode properties differ slightly
If your pattern uses any of these, you need to rewrite it for Go.
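A quick pre-flight check can flag most of the constructs listed above before you try to compile a pattern in Go. The sketch below is a heuristic, not a parser: the name `go_incompatibilities` and the detection patterns are assumptions made for this example, and because it scans the text without tracking escapes or character classes, it can report false positives (for instance on an escaped `\+` followed by `+`).

```python
import re

# Heuristic textual checks for constructs Go's regexp does not support.
# They ignore escaping and character-class context, so expect some false positives.
GO_UNSUPPORTED = {
    "backreference": re.compile(r"\\[1-9]|\\k<|\(\?P="),
    "lookaround": re.compile(r"\(\?<?[=!]"),
    "atomic group": re.compile(r"\(\?>"),
    "possessive quantifier": re.compile(r"[*+?}]\+"),
    "recursion": re.compile(r"\(\?(?:R|\d+|&\w+)\)"),
}


def go_incompatibilities(pattern: str) -> list[str]:
    """Return the names of unsupported constructs that appear in the pattern."""
    return [name for name, check in GO_UNSUPPORTED.items() if check.search(pattern)]


print(go_incompatibilities(r"(\w+) \1"))       # ['backreference']
print(go_incompatibilities(r"(?<!\$)\d++"))    # ['lookaround', 'possessive quantifier']
print(go_incompatibilities(r"[a-z]+\d{2,4}"))  # []
```

An empty result does not guarantee the pattern compiles in Go; it only means none of these particular constructs were spotted.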
Flag syntax
| Behavior | JS | Python | .NET | Go |
|---|---|---|---|---|
| Case-insensitive | i | re.I | IgnoreCase | (?i) |
| Multiline | m | re.M | Multiline | (?m) |
| Dotall | s | re.S | Singleline | (?s) |
| Verbose | — | re.X | IgnorePatternWhitespace | — |
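Go has no separate flag argument: flags go inside the pattern as inline groups such as (?i) and (?m). Python and .NET accept the same inline forms alongside their flag constants.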
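To make the table concrete, here is a minimal sketch that maps JavaScript flag letters to Python `re` flags and then shows the same behavior written with inline flags, the way Go requires. The helper name `js_flags_to_python` is hypothetical, and JS flags with no Python counterpart in the table (such as `g`) are simply ignored.

```python
import re

# JS flag letter -> Python re flag, following the table above.
JS_TO_PYTHON_FLAGS = {"i": re.I, "m": re.M, "s": re.S}


def js_flags_to_python(js_flags: str) -> int:
    """Combine the JS flags that have a Python counterpart; ignore the rest."""
    combined = 0
    for letter in js_flags:
        combined |= JS_TO_PYTHON_FLAGS.get(letter, 0)
    return combined


flags = js_flags_to_python("gim")
print(re.findall(r"^total", "Total\ntotal", flags))  # ['Total', 'total']

# The same behavior with inline flags inside the pattern, Go-style.
print(re.findall(r"(?im)^total", "Total\ntotal"))    # ['Total', 'total']
```

The `g` flag is dropped because Python expresses "find all matches" through the function you call (`findall`, `finditer`) rather than through a flag.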
String literal escaping
This trips up everyone. The regex \d+ as a literal in source code:
| Language | Source |
|---|---|
| JavaScript (literal) | /\d+/ |
| JavaScript (string) | "\\d+" |
| Python (raw) | r"\d+" |
| Python (string) | "\\d+" |
| Java | "\\d+" |
| Go (raw) | `\d+` |
| C# (verbatim) | @"\d+" |
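Raw or verbatim string syntax (Python, Go, C#) and JavaScript's regex literals let you write the pattern exactly as the regex engine sees it. Ordinary string literals, including every Java string, require doubling each backslash.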
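The two Python rows of the table produce the same three-character pattern, which a short check confirms. The sample input string is made up for illustration.

```python
import re

raw = r"\d+"      # raw string: the backslash is kept as typed
escaped = "\\d+"  # ordinary string: the backslash must be doubled

print(raw == escaped)                # True
print(len(raw))                      # 3
print(re.findall(raw, "a1b22c333"))  # ['1', '22', '333']
```

The regex engine never sees the doubled backslash; the doubling exists only to get one backslash past the string-literal parser.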
Use our converter
Our flavor converter handles the syntax differences automatically. Paste a pattern, pick source and target flavors, and it shows the converted pattern plus warnings about features that don't carry over.