Regex in Go
Go's regexp package accepts the same syntax as RE2, Google's regex engine, and shares its guarantee of linear-time matching. The trade-off: a few features have to go.
RE2 in two sentences
RE2 compiles a regex to a state machine and matches in time linear in the input length, regardless of how badly the pattern is shaped, which rules out catastrophic backtracking entirely. The price is that the features usually implemented with backtracking, namely lookaround and back-references, aren't supported.
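A minimal sketch of what the guarantee means in practice. The pattern ^(a+)+$ and the input shape (many a's followed by one b) are illustrative choices, picked because nested quantifiers like this are a classic trigger for catastrophic backtracking in backtracking engines. The helper name matchPathological is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"time"
)

// matchPathological matches a pattern with nested quantifiers against
// n 'a' characters followed by a single 'b'. The trailing 'b' means the
// anchored pattern can never match, which forces a backtracking engine
// to try a huge number of ways to split the run of 'a's.
func matchPathological(n int) bool {
	re := regexp.MustCompile(`^(a+)+$`)
	input := strings.Repeat("a", n) + "b"
	return re.MatchString(input)
}

func main() {
	start := time.Now()
	matched := matchPathological(100000)
	fmt.Println("matched:", matched) // matched: false
	fmt.Println("elapsed:", time.Since(start))
}
```

The elapsed time depends on the machine, so no figure is given here; the point is only that it grows in proportion to the input length rather than exponentially.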
What Go's regexp can't do
- No lookaheads or lookbehinds. (?=...), (?!...), (?<=...), and (?<!...) all fail to compile.
- No back-references in the pattern. (foo)\1 doesn't work: you can't reference a captured group later in the same regex.
- No atomic groups, no possessive quantifiers. They aren't needed, because RE2 is already linear time.
- No conditionals like (?(1)yes|no).
Capture groups, alternation, anchors, named groups, character classes, Unicode all work.
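The list above can be checked directly, since regexp.Compile returns an error instead of panicking. This is a small sketch: the sample patterns and the helper name compiles are illustrative, and the exact error text is left to the runtime rather than quoted here.

```go
package main

import (
	"fmt"
	"regexp"
)

// unsupported holds one sample pattern per construct from the list above.
var unsupported = []string{
	`foo(?=bar)`,   // positive lookahead
	`foo(?!bar)`,   // negative lookahead
	`(?<=foo)bar`,  // positive lookbehind
	`(?<!foo)bar`,  // negative lookbehind
	`(foo)\1`,      // back-reference
	`(?(1)yes|no)`, // conditional
}

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Each of these is rejected at compile time; print the reported error.
	for _, p := range unsupported {
		_, err := regexp.Compile(p)
		fmt.Printf("%-14s -> %v\n", p, err)
	}

	// Named groups, Unicode classes, alternation, and anchors are accepted.
	fmt.Println(compiles(`(?P<word>\p{L}+)|^\d+$`)) // true
}
```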
The API
import "regexp"
// Compile once, use many times
re := regexp.MustCompile(`\d+`)
re.MatchString("abc 123") // true
re.FindString("abc 123") // "123"
re.FindAllString("abc 123 def 456", -1) // ["123", "456"]
re.FindStringSubmatch("a-1-b") // ["1"]; capture groups, if any, follow the full match
re.ReplaceAllString("foo123", "X") // "fooX"
Note: MustCompile panics on a bad pattern; use Compile for runtime patterns that may be invalid.
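For patterns that arrive at runtime (from a config file or a request, say), a sketch along these lines keeps a bad pattern from crashing the process. The function name countMatches is hypothetical; only regexp.Compile and FindAllString come from the standard library.

```go
package main

import (
	"fmt"
	"regexp"
)

// countMatches compiles a pattern supplied at runtime and counts its
// non-overlapping matches in text. An invalid pattern is returned as an
// error rather than causing a panic.
func countMatches(pattern, text string) (int, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return 0, fmt.Errorf("invalid pattern %q: %w", pattern, err)
	}
	return len(re.FindAllString(text, -1)), nil
}

func main() {
	n, err := countMatches(`\d+`, "abc 123 def 456")
	fmt.Println(n, err) // 2 <nil>

	// An unclosed group is a syntax error, reported instead of panicking.
	_, err = countMatches(`(unclosed`, "abc")
	fmt.Println(err != nil) // true
}
```

Patterns known at build time are still better served by MustCompile in a package-level variable, so a typo fails loudly at startup.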
Named captures
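re := regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})`)
match := re.FindStringSubmatch("2024-06") // nil when there is no match
result := make(map[string]string)
for i, name := range re.SubexpNames() {
	// Index 0 is the whole match; unnamed groups have an empty name.
	if i != 0 && name != "" {
		result[name] = match[i]
	}
}
// result: map[month:06 year:2024]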
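Go uses Python-style (?P<name>...); Go 1.22 and later also accept (?<name>...). Collecting every group by name requires the boilerplate above; there's no built-in Match.GroupByName(), although since Go 1.15 re.SubexpIndex("year") returns the index of a single named group.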
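When only one group is needed, the loop can be replaced by a lookup with SubexpIndex (available since Go 1.15). The sketch below wraps that in a hypothetical helper named group, which also guards against the nil slice that FindStringSubmatch returns when nothing matches.

```go
package main

import (
	"fmt"
	"regexp"
)

var dateRe = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})`)

// group returns the text captured by the named group. The bool is false
// when the input does not match or the pattern has no group of that name.
func group(re *regexp.Regexp, s, name string) (string, bool) {
	match := re.FindStringSubmatch(s)
	idx := re.SubexpIndex(name) // -1 when the name is unknown
	if match == nil || idx < 0 {
		return "", false
	}
	return match[idx], true
}

func main() {
	year, ok := group(dateRe, "released 2024-06", "year")
	fmt.Println(year, ok) // 2024 true

	_, ok = group(dateRe, "no date here", "year")
	fmt.Println(ok) // false
}
```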
POSIX vs RE2 syntax
Most people want regexp.MustCompile with the default (RE2) syntax. There's also regexp.CompilePOSIX, which restricts the pattern to POSIX ERE (egrep) syntax and switches to POSIX matching semantics (leftmost-longest instead of leftmost-first). It is useful when porting POSIX regex code.
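The difference between the two semantics shows up with alternations where one branch is a prefix of another. A small sketch, with firstMatch as a hypothetical helper and a|ab as an illustrative pattern:

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns what the default (leftmost-first) and the POSIX
// (leftmost-longest) compilers find for the same pattern and input.
func firstMatch(pattern, s string) (perl, posix string) {
	perl = regexp.MustCompile(pattern).FindString(s)
	posix = regexp.MustCompilePOSIX(pattern).FindString(s)
	return perl, posix
}

func main() {
	// Leftmost-first takes the first alternative that matches ("a");
	// leftmost-longest keeps looking for the longest match ("ab").
	perl, posix := firstMatch(`a|ab`, "ab")
	fmt.Printf("%q %q\n", perl, posix) // "a" "ab"
}
```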
Working around the limitations
Need a lookahead? Rewrite without it. foo(?=bar) can become foo(bar), with the bar dropped in your post-processing. Unlike a lookahead, this consumes bar, which matters when matches can overlap. Or use Go's string operations after the regex finds candidates.
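One way to sketch the lookahead rewrite is a variant of the idea above: put the capture group around foo rather than bar, so the submatch offsets give exactly the span the lookahead version would have reported. The function name foosBeforeBar is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// fooBarRe stands in for foo(?=bar): the trailing "bar" is matched and
// consumed, but only group 1 (the "foo") is reported.
var fooBarRe = regexp.MustCompile(`(foo)bar`)

// foosBeforeBar returns the byte offsets [start, end) of every "foo"
// that is immediately followed by "bar".
func foosBeforeBar(s string) [][2]int {
	var out [][2]int
	for _, loc := range fooBarRe.FindAllStringSubmatchIndex(s, -1) {
		// loc[0:2] is the whole match; loc[2:4] is capture group 1.
		out = append(out, [2]int{loc[2], loc[3]})
	}
	return out
}

func main() {
	fmt.Println(foosBeforeBar("foobar foobaz xfoobar")) // [[0 3] [15 18]]
}
```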
Need a back-reference for duplicate detection? Run the regex without it, then check duplicates in code.
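As an illustration, a doubled-word check that PCRE users might write as \b(\w+)\s+\1\b can be approximated by extracting the words and comparing neighbours in Go. This is a sketch under simplifying assumptions: the name repeatedWords is hypothetical, comparison is case-sensitive, and punctuation between two words is ignored, which the back-reference version would not do.

```go
package main

import (
	"fmt"
	"regexp"
)

var wordRe = regexp.MustCompile(`\w+`)

// repeatedWords returns each word that is immediately followed by an
// identical word. The regex only finds the words; the duplicate check
// that a back-reference would do happens in the loop.
func repeatedWords(s string) []string {
	words := wordRe.FindAllString(s, -1)
	var dups []string
	for i := 1; i < len(words); i++ {
		if words[i] == words[i-1] {
			dups = append(dups, words[i])
		}
	}
	return dups
}

func main() {
	fmt.Println(repeatedWords("this is is a test of the the engine")) // [is the]
}
```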
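Need lookbehind to say "not preceded by"? Often you can match the preceding context explicitly, for example (?:^|[^x])y in place of (?<!x)y, at the cost of consuming that extra character. Otherwise do a two-pass approach: find all matches, then filter in Go.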
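The two-pass approach might look like the sketch below, which finds numbers that are not directly preceded by a dollar sign. The function name numbersWithoutDollar and the sample input are illustrative. It is not byte-for-byte equivalent to (?<!\$)\d+: on "$100" that pattern would still match "00", whereas this version rejects the whole number, which is usually the intent.

```go
package main

import (
	"fmt"
	"regexp"
)

var numRe = regexp.MustCompile(`\d+`)

// numbersWithoutDollar returns the numbers in s that are not directly
// preceded by '$'. Pass one finds every run of digits; pass two filters
// on the byte just before each match.
func numbersWithoutDollar(s string) []string {
	var out []string
	for _, loc := range numRe.FindAllStringIndex(s, -1) {
		start, end := loc[0], loc[1]
		if start > 0 && s[start-1] == '$' {
			continue // preceded by '$': skip, like a negative lookbehind
		}
		out = append(out, s[start:end])
	}
	return out
}

func main() {
	fmt.Println(numbersWithoutDollar("cost $100, qty 3, id 42")) // [3 42]
}
```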
Why this is mostly a good thing
For server code processing user input, the no-backtracking guarantee is a huge security win. ReDoS attacks that rely on catastrophic backtracking to take down PCRE-based services don't work against Go's regexp. If you're writing a web API and considering lookbehind, it's worth a moment to ask whether you really need it or whether you can structure the data differently.
If you absolutely need PCRE-style features in Go, third-party libraries exist (regexp2, bindings to PCRE2), but you give up the linear-time guarantee.