
Regex flags deep dive

Eight letters that change how a regex behaves. Many regex bugs come down to a missing or an extra flag.

g — global

Without g: the regex matches the first occurrence and stops. With g: it can find all occurrences.

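In JavaScript, g is required by String.matchAll and by String.replaceAll when they are given a regex (both throw a TypeError without it). It makes String.replace substitute every match instead of only the first, and it lets repeated RegExp.exec calls walk through the input, because the regex object remembers lastIndex.

In Python there is no g flag: re.findall, re.finditer, and re.sub process every match, while re.search returns only the first.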
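A minimal Python sketch of the first-match versus all-matches split described above. The sample string is made up for illustration.

```python
import re

text = "cat, Cat, cat"

# re.search stops at the first occurrence (like a JS regex without g).
first = re.search(r"cat", text)

# re.findall and re.finditer walk the whole input (like a JS regex with g).
all_matches = re.findall(r"cat", text)
positions = [m.start() for m in re.finditer(r"cat", text)]

# re.sub replaces every match unless count limits it.
replaced_all = re.sub(r"cat", "dog", text)
replaced_one = re.sub(r"cat", "dog", text, count=1)
```

Note that the capitalized Cat is skipped in every call, since no case-insensitive flag is set.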

i — case-insensitive

Letters in the pattern match either case in the input. So /cat/i matches Cat, CAT, cAt.

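Unicode case-folding is trickier than ASCII. With the i and u flags together in JS, or with re.IGNORECASE in Python, ß (German sharp s) does not match SS, because both engines compare characters one-to-one and no flag changes that. For full or locale-aware case folding, normalize the text first (for example with str.casefold() in Python) or use a Unicode-aware library.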
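A short Python sketch of the sharp-s case. It assumes that comparing case-folded strings is acceptable for your use case; the words are illustrative only.

```python
import re

# ASCII letters behave as expected under IGNORECASE.
ascii_match = re.fullmatch(r"cat", "cAt", re.IGNORECASE)

# One-to-one case matching: a single ß cannot match the two letters SS.
sharp_s_match = re.fullmatch(r"straße", "STRASSE", re.IGNORECASE)

# str.casefold() applies full case folding, which expands ß to ss.
folded_equal = "straße".casefold() == "STRASSE".casefold()
```

Here sharp_s_match is None while folded_equal is True, so folding both sides before matching is one way around the limitation.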

m — multiline

Changes the meaning of ^ and $:

  • Without m: ^ matches only at the start of the input, $ only at the end (in Python and PCRE, $ also matches just before a single trailing newline).
  • With m: ^ matches at the start of every line, $ matches at the end of every line.

Crucially, m does NOT change what . matches — that's the s flag's job.
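The anchor behavior can be checked with a small Python sketch, where re.MULTILINE is the m flag. The three-line sample text is an assumption for illustration.

```python
import re

text = "alpha\nbeta\ngamma"

# Without m: ^ only matches at the very start, $ only at the very end.
starts_default = re.findall(r"^\w+", text)
ends_default = re.findall(r"\w+$", text)

# With m: ^ and $ match at every line boundary.
starts_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# The dot still stops at newlines, so each line is matched separately.
whole_lines = re.findall(r"^.+$", text, re.MULTILINE)
```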

s — dotall (single-line)

Changes . to match newlines too.

Confusingly, the flag letter is s for "single-line", but in everyday language we call it "dotall". By default, the dot doesn't match \n, which is what people usually forget.

For HTML or any multiline content where you want to grab everything between two markers, /<style>(.*?)<\/style>/s is the safe choice.
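The same pattern in Python, where the s flag is spelled re.DOTALL (or re.S). The HTML snippet is a made-up sample, and a real HTML parser is usually the better tool for anything beyond quick extraction.

```python
import re

html = "<style>\nbody { color: red; }\n</style>"
pattern = r"<style>(.*?)</style>"

# Without dotall the dot cannot cross the newlines, so nothing matches.
without_s = re.search(pattern, html)

# With dotall the lazy group spans the whole block.
with_s = re.search(pattern, html, re.DOTALL)
css = with_s.group(1)
```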

u — unicode

JavaScript-specific. Enables full Unicode mode:

  • . matches astral plane characters as one unit, not two surrogates
  • \p{...} Unicode property classes work
  • Invalid escapes throw instead of being silently accepted

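Always use u when dealing with non-ASCII input in JavaScript. Python 3 str patterns are Unicode-aware by default. PCRE is not: it needs UTF mode turned on (the PCRE2_UTF option or a leading (*UTF)), plus PCRE2_UCP or (*UCP) if \w, \d, and \b should follow Unicode properties.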

y — sticky

JavaScript-specific. Anchors the match to start exactly at lastIndex, without scanning ahead. Useful for tokenizers and lexers.

With g: the search starts at lastIndex and scans forward until it finds a match. With y: the match must begin exactly at lastIndex; if it does not, the call fails immediately and lastIndex resets to 0.
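Python has no y flag, but Pattern.match(text, pos) gives a comparable "match exactly here or fail" behavior. The sketch below uses it for a tiny tokenizer; the token names (num, op, ws) and the tokenize helper are hypothetical, chosen only for illustration.

```python
import re

# One alternative per token type; every alternative consumes at least one character.
TOKEN = re.compile(r"(?P<num>\d+)|(?P<op>[+\-*/])|(?P<ws>\s+)")

def tokenize(text):
    """Split text into (kind, value) pairs, failing on the first unknown character."""
    pos = 0
    tokens = []
    while pos < len(text):
        # match() is anchored at pos: it never scans ahead, like a sticky regex.
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at index {pos}: {text[pos]!r}")
        if m.lastgroup != "ws":  # drop whitespace tokens
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

tokens = tokenize("12 + 3*4")
```

Using TOKEN.search instead would silently skip over unknown characters, which is the scanning behavior that sticky matching exists to prevent.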

d — has indices

JavaScript ES2022. Adds an indices property to the match result with [start, end] for each capture group. Useful for syntax highlighting, replace-with-position, and analytical tools.
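For comparison, Python needs no equivalent flag: match objects always carry positions, and span() plays the role of the indices property. A minimal sketch with a made-up input string and group names:

```python
import re

m = re.search(r"(?P<key>\w+)=(?P<value>\w+)", "set mode=fast now")

# (start, end) offsets for the whole match and for each named group.
whole_span = m.span()
key_span = m.span("key")
value_span = m.span("value")
```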

x — extended / verbose mode

Not in JavaScript. Available in Python (re.X), PCRE, .NET, Ruby. Allows whitespace and # comments inside the pattern so you can write multi-line, commented regex.

# Python
import re

date_re = re.compile(r"""
    ^                    # start of string
    (?P<year>\d{4})      # 4-digit year
    -
    (?P<month>\d{2})     # 2-digit month
    -
    (?P<day>\d{2})       # 2-digit day
    $                    # end of string
""", re.X)

This is what the explainer's Format button generates.

Common combinations

Case-insensitive global search: gi. Finds every match of a word, in any case.

Multiline with dot-newline: ms. Line-based ^ and $ anchors, while dots can cross newlines.

Full Unicode global: gu. For JavaScript, when the input may contain emoji or other non-ASCII text.
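A rough Python rendering of the first two combinations. Python has no g, so re.findall stands in for it, and flags are combined with |. The log text and the BEGIN/END markers are invented for the example.

```python
import re

# gi equivalent: every occurrence, any case.
text = "Error: disk full\nerror: retry\nok"
errors = re.findall(r"error", text, re.IGNORECASE)

# ms equivalent: ^ and $ work per line, while the dot crosses newlines.
log = "BEGIN\nline one\nline two\nEND\nBEGIN\nx\nEND"
blocks = re.findall(r"^BEGIN$(.*?)^END$", log, re.MULTILINE | re.DOTALL)
```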

Inline mode modifiers

Most flavors let you turn flags on/off inside the pattern with (?im) or (?-i):

(?i)foo(?-i)bar      foo case-insensitive, bar case-sensitive (PCRE)
(?im:foo)            scoped to the group (Java, .NET, PCRE)

JavaScript supports only the scoped form, (?i:...) and (?-i:...), added in ES2025. Only i, m, and s can be toggled this way, and older engines reject the syntax.
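A sketch of inline modifiers in Python's re module, which accepts a whole-pattern (?i) only at the very start and scoped groups such as (?i:...) and (?-i:...) anywhere. The foo/bar patterns mirror the examples above.

```python
import re

# Scoped: only foo is case-insensitive.
scoped = re.compile(r"(?i:foo)bar")
scoped_ok = scoped.fullmatch("FOObar")
scoped_fail = scoped.fullmatch("FOOBAR")

# Whole-pattern flag at the start, switched off again for one group.
mixed = re.compile(r"(?i)foo(?-i:bar)")
mixed_ok = mixed.fullmatch("FoObar")
mixed_fail = mixed.fullmatch("fooBAR")
```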

