Concepts May 10, 2026

What does \b mean in regex?

\b matches a position, not a character — it's the difference between matching cat and matching cat inside catalog.

The short answer

\b is a word boundary. It matches the position between a word character (\w: letter, digit, or underscore) and a non-word character — or the start/end of the string if it's next to a word character.

It doesn't consume any character. It just asserts that the cursor is at a word boundary at that moment.
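One way to see the zero-width behaviour is to look at the span of a match. This is a minimal sketch using Python's re module (picked here purely for illustration; the sample strings are made up):

```python
import re

# \b on its own matches an empty string: start and end offsets are equal.
m = re.search(r"\b", "cat")
boundary_span = m.span()
boundary_text = m.group()

# Inside a larger pattern, the boundaries add nothing to the matched text.
m2 = re.search(r"\bcat\b", "a cat!")
word_span = m2.span()
word_text = m2.group()

print(boundary_span, repr(boundary_text))  # (0, 0) ''
print(word_span, repr(word_text))          # (2, 5) 'cat'
```

The second match is exactly three characters long: the space before c and the ! after t are checked but never become part of the result.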

The classic use case

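You want to find the word "cat", not the "cat" buried in "catalog", "cats", or "concat". (A hyphenated form like "cat-like" still matches, because the hyphen is a non-word character.) Put \b on both sides: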

Pattern: \bcat\b
Input:   "The cat sat on the catalog"
Matches: "cat" (just the standalone one)

The first \b asserts "I'm at the start of a word" before the c. The second \b asserts "I'm at the end of a word" after the t. Inside "catalog", the position after t is between two word characters (t and a) — not a boundary — so the match fails.
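The same comparison can be run in code. The sketch below assumes Python's re module and reuses the input string from the example above; the variable names are illustrative only.

```python
import re

text = "The cat sat on the catalog"

# Without boundaries, the "cat" at the start of "catalog" is found too.
loose = [m.span() for m in re.finditer(r"cat", text)]

# With \b on both sides, only the standalone word survives.
strict = [m.span() for m in re.finditer(r"\bcat\b", text)]

print(loose)   # [(4, 7), (19, 22)]
print(strict)  # [(4, 7)]
```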

Where boundaries actually are

Look at the string hello, world!. The word boundaries are at the positions marked with |:

|h e l l o|, |w o r l d|!
↑         ↑  ↑         ↑
boundary positions

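There are four: the start of the string before h, the position between o and the comma, the position between the space and w, and the position between d and !. The comma and the space have no boundary between them, because both are non-word characters. Every transition between a word character and a non-word character is a boundary.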
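If you'd rather let the engine find the boundaries, something like the following works in Python's re module (an illustrative choice of language; any flavor that can report match offsets would do):

```python
import re

s = "hello, world!"

# Offsets at which the empty pattern \b matches.
positions = [m.start() for m in re.finditer(r"\b", s)]

# Inserting a marker at every boundary draws the boundaries in place.
marked = re.sub(r"\b", "|", s)

print(positions)  # [0, 5, 7, 12]
print(marked)     # |hello|, |world|!
```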

The Unicode gotcha

In many flavors, including JavaScript and PCRE in their default modes, \w matches only ASCII [A-Za-z0-9_]. There, \b doesn't treat accented letters as word characters:

Pattern: \bcafé\b
Input:   "I love café food"
Match:   FAILS  (with default flags)

In ASCII mode, é is not a word character. The first \b succeeds (space to c), and c, a, f, é all match literally. The final \b is where it breaks: it sits between é and the space, both of which are non-word characters, so there is no boundary there and the match fails.

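Fix this by making the boundary Unicode-aware:

  • JavaScript: the u flag does not change \b or \w, which stay ASCII-only. Use Unicode property lookarounds instead → /(?<![\p{L}\p{N}_])café(?![\p{L}\p{N}_])/u
  • Python 3: str patterns are Unicode-aware by default, so this works unless you pass re.ASCII or match against bytes
  • PCRE: start the pattern with the (*UCP) directive (usually alongside UTF mode); in PHP, the /u modifier typically enables both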
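To see both behaviours side by side, the sketch below uses Python's re module, where the re.ASCII flag stands in for an ASCII-only flavor. The lookaround variant is one possible workaround, not the only one, and the variable names are made up for this example.

```python
import re

text = "I love caf\u00e9 food"   # \u00e9 is é as a single code point
pattern = r"\bcaf\u00e9\b"       # re resolves \uXXXX escapes in str patterns

unicode_match = re.search(pattern, text)           # Unicode-aware \w and \b
ascii_match = re.search(pattern, text, re.ASCII)   # ASCII-only \w and \b

# Lookarounds state "no word character on either side" explicitly, so they
# do not depend on the last letter of the word counting as \w.
ascii_lookaround = re.search(r"(?<!\w)caf\u00e9(?!\w)", text, re.ASCII)

print(unicode_match.group())     # café
print(ascii_match)               # None
print(ascii_lookaround.group())  # café
```

The lookaround form is only as Unicode-aware as the class inside it, which is why the JavaScript version spells out \p{L} and \p{N} rather than relying on \w.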

The opposite: \B

\B matches a non-boundary: any position where \b does not match, such as between two word characters. Useful for finding substrings strictly inside words:

Pattern: \Bcat\B
Input:   "concatenate the cat"
Match:   "cat" inside "concatenate"  (the standalone "cat" is excluded)

A word-initial "cat", as in "catalog", is excluded too, because the position before c is a boundary.
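A short check of \B, again assuming Python's re module; the sample strings are illustrative. \Bcat\B needs a word character on both sides of "cat", so a "cat" that begins or ends a word is rejected.

```python
import re

# "cat" sits between n and e, so both \B assertions hold.
inside = [m.span() for m in re.finditer(r"\Bcat\B", "concatenate the cat")]

# Here every "cat" starts at a word boundary, so the leading \B fails.
word_initial = re.findall(r"\Bcat\B", "The catalog is a cat")

print(inside)        # [(3, 6)]
print(word_initial)  # []
```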

Common mistakes

Using \b around non-word characters

\b only matches next to a word character, so what it does beside punctuation depends on the neighbouring character. \b! is a good test case:

Pattern: \b!
Input:   "hello!"
Match:   YES (boundary between word "o" and non-word "!")

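The boundary is between o and !, so \b matches that position and ! then consumes the exclamation mark. But the same pattern finds nothing in "hello !" or "!hello": in both, the position just before ! has no word character on its left. \b is about a transition between word and non-word characters, not about the edge of whatever token comes next.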
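The behaviour is easy to probe with a few inputs. This is a rough sketch in Python's re module; the C++ case is an extra illustration of the same effect and is not taken from the text above.

```python
import re

after_word = re.search(r"\b!", "hello!")     # o -> ! is a word/non-word transition
after_space = re.search(r"\b!", "hello !")   # space and ! are both non-word
at_start = re.search(r"\b!", "!hello")       # no word character before the !

# A literal that ends in punctuation hits the same wall: after the second +
# comes a space, so the trailing \b has no transition to match.
cpp = re.search(r"\bC\+\+\b", "I use C++ daily")

print(after_word is not None)  # True
print(after_space, at_start)   # None None
print(cpp)                     # None
```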

Expecting \b to work without word characters nearby

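\b\b matches a single boundary position twice; it doesn't mean "two boundaries." Boundaries don't consume characters, so two in a row match the same position. And because a boundary needs a word character on one side, a string with no word characters at all, such as "!!!", contains no boundaries.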
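Both points can be checked directly. A small sketch, assuming Python's re module and made-up sample strings:

```python
import re

s = "hello, world!"

single = [m.start() for m in re.finditer(r"\b", s)]
double = [m.start() for m in re.finditer(r"\b\b", s)]

# With no word characters anywhere, there is nothing for \b to sit next to.
no_words = re.search(r"\b", "!!! ...")

print(single == double)  # True
print(no_words)          # None
```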

Practical patterns

Match a whole word only

\bword\b
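When the word comes from user input, it usually needs escaping before the boundaries are added. The helper below is a hypothetical sketch in Python (the name whole_word is invented for this example), and it assumes the word starts and ends with a word character:

```python
import re

def whole_word(word: str) -> re.Pattern:
    """Hypothetical helper: compile a pattern matching `word` as a whole word."""
    # re.escape keeps characters such as "." or "+" literal.
    return re.compile(r"\b" + re.escape(word) + r"\b")

hits = whole_word("cat").findall("cat, concat, catalog, cat-like, Cat")

print(hits)  # ['cat', 'cat']
```

The two hits are the standalone "cat" and the one in "cat-like"; "Cat" is skipped only because the pattern is case-sensitive.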

Find numbers but not inside identifiers

Pattern: \b\d+\b
Input:   "count is 42 but x32 is also there"
Matches: ["42"]  (not "32" inside "x32")
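Run in Python's re module (an illustrative choice), the pattern behaves as described. The decimal input is an added caveat worth knowing about: a dot is a non-word character, so it splits a number in two.

```python
import re

text = "count is 42 but x32 is also there"

standalone = re.findall(r"\b\d+\b", text)   # digits with a boundary on both sides
any_digits = re.findall(r"\d+", text)       # every run of digits

# The "." in a decimal is a non-word character, so each side is its own match.
decimal_parts = re.findall(r"\b\d+\b", "pi is 3.14")

print(standalone)     # ['42']
print(any_digits)     # ['42', '32']
print(decimal_parts)  # ['3', '14']
```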

Match the start of a word

\b[A-Z]   an uppercase ASCII letter at the start of a word
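As a quick illustration in Python's re module (the sentence is invented for this example), only capitals that begin a word are picked up:

```python
import re

sentence = "Alice met Bob at NASA with an iPhone"

# Later capitals in "NASA" and the P in "iPhone" follow a word character,
# so there is no boundary in front of them.
initials = re.findall(r"\b[A-Z]", sentence)

print(initials)  # ['A', 'B', 'N']
```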

The takeaway

\b is a position, not a character. It's the difference between matching a word and matching a word-shaped substring. Many "regex matched too much" bugs in word-search patterns are fixed by adding \b on both sides, provided the word itself starts and ends with a word character.

For Unicode text, check how your flavor defines \w. If it is ASCII-only, \b won't see accented letters as word characters and your matches will be wrong.

