Regular Expressions Cheat Sheet

Last updated: June 2026 · By gitGood Editorial

Anchors, character classes, quantifiers, groups, alternation, lookarounds, backreferences, and flags - plus practical patterns and the gotchas that trip people up in interviews.

How to read regex

A regular expression is a pattern matched against text left to right. Most characters match themselves; a handful are metacharacters with special meaning. To match a metacharacter literally, escape it with a backslash. The tables below use the common Perl-style syntax that JavaScript, Python, Java, PCRE, and most modern engines largely share - note that lookbehind support and named-group syntax vary between flavors.

Anchors and boundaries

These match positions, not characters - they consume nothing.

^
Start of string (or start of line with the multiline flag).
$
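End of string (or end of line with the multiline flag). In some engines, including Python, Java, and PCRE, it also matches just before a single trailing newline at the end of the string.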
\b
Word boundary - the position between a word character and a non-word character.
\B
Not a word boundary - the position inside a word or between two non-word characters.
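
A minimal sketch of the anchors above, assuming Python's built-in re module; the sample text and variable names are illustrative only. It records the start offset of each match so the position-based behavior is visible.

```python
import re

text = "cat concat\ncat"

# ^ without the multiline flag matches only at the start of the string.
starts = [m.start() for m in re.finditer(r"^cat", text)]

# With re.MULTILINE, ^ also matches right after each newline.
line_starts = [m.start() for m in re.finditer(r"^cat", text, re.MULTILINE)]

# \b on both sides keeps whole words only, so the "cat" in "concat" is skipped.
whole_words = [m.start() for m in re.finditer(r"\bcat\b", text)]

# \B demands a non-boundary, so only the "cat" inside "concat" qualifies.
inside_words = [m.start() for m in re.finditer(r"\Bcat", text)]

print(starts)        # [0]
print(line_starts)   # [0, 11]
print(whole_words)   # [0, 11]
print(inside_words)  # [7]
```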

Character classes

Match a single character from a set.

.
Any character except newline (unless the dotall / 's' flag is set).
\d / \D
A digit / a non-digit.
\w / \W
A word character (letters, digits, underscore) / a non-word character.
\s / \S
A whitespace character / a non-whitespace character.
[a-z]
Any one character in the range a to z. Combine ranges, e.g. [A-Za-z0-9].
[^...]
Negated class - any one character NOT in the set. The caret must be first inside the brackets.
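
As a quick illustration of the classes above, here is a sketch using Python's re.findall; the sample strings are made up for the example.

```python
import re

sample = "Order #42: 3 apples, 10 pears!"

digits = re.findall(r"\d+", sample)        # runs of digits
words = re.findall(r"\w+", sample)         # runs of word characters
chunks = re.findall(r"\S+", sample)        # runs of non-whitespace
vowels = re.findall(r"[aeiou]", "regex")   # one character from an explicit set
others = re.findall(r"[^aeiou]", "regex")  # negated class: anything but a vowel

print(digits)  # ['42', '3', '10']
print(words)   # ['Order', '42', '3', 'apples', '10', 'pears']
print(chunks)  # ['Order', '#42:', '3', 'apples,', '10', 'pears!']
print(vowels)  # ['e', 'e']
print(others)  # ['r', 'g', 'x']
```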

Quantifiers

Control how many times the preceding token repeats.

*
Zero or more.
+
One or more.
?
Zero or one (optional).
{n}
Exactly n times.
{n,}
At least n times.
{n,m}
Between n and m times, inclusive.
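
One way to see the quantifiers side by side is to test each against the same candidates. The sketch below assumes Python's re.fullmatch, which requires the whole string to match; the helper name accepted is hypothetical.

```python
import re

candidates = ["", "a", "aa", "aaa", "aaaa"]

def accepted(pattern):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

star = accepted(r"a*")          # zero or more
plus = accepted(r"a+")          # one or more
optional = accepted(r"a?")      # zero or one
exactly = accepted(r"a{2}")     # exactly 2
at_least = accepted(r"a{2,}")   # 2 or more
between = accepted(r"a{2,3}")   # 2 to 3, inclusive

print(star)      # ['', 'a', 'aa', 'aaa', 'aaaa']
print(plus)      # ['a', 'aa', 'aaa', 'aaaa']
print(optional)  # ['', 'a']
print(exactly)   # ['aa']
print(at_least)  # ['aa', 'aaa', 'aaaa']
print(between)   # ['aa', 'aaa']
```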

Greedy vs lazy

By default quantifiers are greedy - they match as much as possible, then backtrack to let the rest of the pattern succeed. Adding a '?' after a quantifier makes it lazy - it matches as little as possible and expands only as needed. The classic example: against the text 'a<b><c>', the pattern '<.*>' matches the whole '<b><c>' (greedy), while '<.*?>' matches just '<b>' (lazy). Lazy quantifiers are the usual fix when a pattern grabs more than you intended, though a negated character class like '<[^>]*>' is often faster and clearer than a lazy match.
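
The example from the paragraph above, run through Python's re module as a small sketch (variable names are illustrative):

```python
import re

text = "a<b><c>"

greedy = re.search(r"<.*>", text).group()      # grabs as much as possible
lazy = re.search(r"<.*?>", text).group()       # grabs as little as possible
negated = re.search(r"<[^>]*>", text).group()  # cannot run past the first '>'
all_tags = re.findall(r"<[^>]*>", text)        # every tag, one at a time

print(greedy)    # <b><c>
print(lazy)      # <b>
print(negated)   # <b>
print(all_tags)  # ['<b>', '<c>']
```

The negated class gets the same result as the lazy version here without relying on backtracking to find the closing bracket.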

Groups and alternation

Group tokens, capture text, and offer choices.

(...)
Capturing group - matches the enclosed pattern and captures the text for backreference and extraction.
(?:...)
Non-capturing group - groups for precedence or repetition without capturing. Use it when you do not need the value.
(?<name>...)
Named capturing group - capture and refer to it by name instead of a number. Syntax varies by engine; Python, for example, writes it as (?P<name>...).
a|b
Alternation - match 'a' or 'b'. Alternation has low precedence, so 'gray|grey' is two whole alternatives, while 'gr(a|e)y' alternates just one letter.
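
A short sketch of capturing, non-capturing, and named groups, assuming Python's re module. Note that Python spells named groups (?P<name>...); the pattern and sample strings are illustrative.

```python
import re

# Named groups, in Python's (?P<name>...) spelling.
iso = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
m = iso.search("released 2024-03-15")

year_by_name = m.group("year")   # access by name
year_by_index = m.group(1)       # named groups are still numbered
parts = m.groupdict()

# With a capturing group, findall returns the group's text...
captured = re.findall(r"gr(a|e)y", "gray grey")
# ...with a non-capturing group, it returns the whole match.
whole = re.findall(r"gr(?:a|e)y", "gray grey")

print(year_by_name, year_by_index)  # 2024 2024
print(parts)     # {'year': '2024', 'month': '03', 'day': '15'}
print(captured)  # ['a', 'e']
print(whole)     # ['gray', 'grey']
```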

Lookarounds

Zero-width assertions - they test for context without consuming characters.

(?=...)
Positive lookahead - succeeds if the pattern matches ahead, e.g. 'foo(?=bar)' matches 'foo' only when followed by 'bar'.
(?!...)
Negative lookahead - succeeds if the pattern does NOT match ahead.
(?<=...)
Positive lookbehind - succeeds if the pattern matches immediately before the current position.
(?<!...)
Negative lookbehind - succeeds if the pattern does NOT match immediately before.
\1, \2
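Backreference - matches the same text a previous capturing group matched, e.g. '(\w+)\s+\1' finds a doubled word. Unlike the lookarounds above, a backreference is not zero-width: it consumes the characters it matches.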
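
The assertions above can be sketched with Python's re module as follows. The sample strings are made up, and the doubled-word pattern adds '\b' on both ends (an addition for this example) so it does not pair the tail of one word with the next.

```python
import re

s = "foobar foobaz"

# Lookahead: match 'foo' based on what follows, without consuming it.
followed = [m.start() for m in re.finditer(r"foo(?=bar)", s)]
not_followed = [m.start() for m in re.finditer(r"foo(?!bar)", s)]

prices = "cost $40, qty 12"

# Lookbehind: match digits based on what precedes them.
after_dollar = re.findall(r"(?<=\$)\d+", prices)
# \b stops the engine from matching the '0' inside '$40'.
no_dollar = re.findall(r"(?<!\$)\b\d+", prices)

# Backreference: \1 must repeat exactly what group 1 captured.
doubled = re.search(r"\b(\w+)\s+\1\b", "this is is a test")

print(followed)         # [0]
print(not_followed)     # [7]
print(after_dollar)     # ['40']
print(no_dollar)        # ['12']
print(doubled.group())  # is is
```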

Flags

Modify how the whole pattern is matched.

g (global)
Find all matches, not just the first. In JavaScript this also affects 'lastIndex' on repeated calls.
i (ignore case)
Case-insensitive matching.
m (multiline)
Makes '^' and '$' match at line boundaries, not just string boundaries.
s (dotall)
Makes '.' match newline characters too.
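
In Python the flags are passed as constants rather than suffix letters, and there is no 'g' flag: re.search returns the first match while re.findall and re.finditer return all of them. A minimal sketch with an illustrative sample string:

```python
import re

text = "Alpha\nbeta\nALPHA"

# i: case-insensitive matching
ci = re.findall(r"alpha", text, re.IGNORECASE)

# m: without it, ^ and $ only anchor to the string as a whole
single = re.findall(r"^\w+$", text)
multi = re.findall(r"^\w+$", text, re.MULTILINE)

# s: without it, '.' refuses to cross the newline
no_dotall = re.search(r"Alpha.beta", text)
with_dotall = re.search(r"Alpha.beta", text, re.DOTALL)

print(ci)      # ['Alpha', 'ALPHA']
print(single)  # []
print(multi)   # ['Alpha', 'beta', 'ALPHA']
print(no_dotall is None, with_dotall is not None)  # True True
```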

Practical patterns

Useful starting points - tighten them for real validation.

Email-ish
Roughly '[\w.+-]+@[\w-]+\.[\w.-]+' - fine for a quick filter, but full RFC-correct email validation is famously hard, so do not promise it.
IPv4 address (loose)
'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}' matches the shape; it does NOT reject 999, so range-check the octets separately.
ISO date
'\d{4}-\d{2}-\d{2}' matches a YYYY-MM-DD shape (it does not validate that the date is real).
key=value
'(\w+)=(\S+)' captures the key in group 1 and the value in group 2.
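
One way to tighten these starting points is to let the regex check the shape and ordinary code check the values. The sketch below assumes Python; the helper names is_ipv4 and is_real_date are hypothetical, and the checks are still loose (leading zeros are accepted, for instance).

```python
import re
from datetime import date

IPV4_SHAPE = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})")
ISO_SHAPE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

def is_ipv4(text):
    """Shape check via regex, then range-check each octet separately."""
    m = IPV4_SHAPE.fullmatch(text)
    return m is not None and all(int(octet) <= 255 for octet in m.groups())

def is_real_date(text):
    """Shape check via regex, then let datetime decide if the date exists."""
    m = ISO_SHAPE.fullmatch(text)
    if m is None:
        return False
    try:
        date(*(int(part) for part in m.groups()))
    except ValueError:
        return False
    return True

# key=value: findall returns (group 1, group 2) tuples, which dict() accepts.
pairs = dict(re.findall(r"(\w+)=(\S+)", "host=db1 port=5432 mode=ro"))

print(is_ipv4("192.168.0.1"), is_ipv4("999.1.1.1"))            # True False
print(is_real_date("2024-02-29"), is_real_date("2023-02-30"))  # True False
print(pairs)  # {'host': 'db1', 'port': '5432', 'mode': 'ro'}
```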

Interview gotchas

  • Catastrophic backtracking - nested quantifiers over overlapping patterns, like '(a+)+$', can hang on certain inputs and are a real denial-of-service vector (ReDoS). Avoid ambiguous nesting and prefer possessive quantifiers or atomic groups where supported.
  • Always anchor validation patterns with '^' and '$' (or '\A' and '\z' where the engine supports them) - an unanchored pattern matches a substring, so '<script>a@b.co' would 'pass' an unanchored email check.
  • '.' does not match newlines by default - when the engine supports it, reach for the dotall flag, not '[\s\S]', when you mean 'any character including newlines'.
  • Inside a character class most metacharacters lose their meaning - '[.]' matches a literal dot, and '-' is literal when first or last.
  • Lookbehind support and syntax differ across engines; some older flavors do not support it or require fixed-length lookbehind.
  • The global flag in JavaScript makes a single RegExp object stateful via 'lastIndex' - reusing it in a loop can silently skip matches.
  • Standard regex cannot match arbitrarily nested structures (balanced parentheses, full HTML) - that needs a real parser, and saying so is a strong answer. Some engines add recursion extensions, but a parser is still the better tool.
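
The anchoring and character-class gotchas above can be demonstrated with a sketch in Python, reusing the email-ish pattern from the practical patterns section. The payload string is made up; re.fullmatch is Python's way of requiring the whole string to match.

```python
import re

EMAILISH = r"[\w.+-]+@[\w-]+\.[\w.-]+"
payload = "<script>a@b.co</script>"

# Unanchored: search finds the email-shaped substring and "passes".
unanchored_ok = re.search(EMAILISH, payload) is not None
# Anchored to the whole string: the payload is rejected, a bare address passes.
anchored_ok = re.fullmatch(EMAILISH, payload) is not None
clean_ok = re.fullmatch(EMAILISH, "a@b.co") is not None

# In Python, $ also matches just before a trailing newline.
dollar_ok = re.search(r"^\d+$", "123\n") is not None
strict_ok = re.fullmatch(r"\d+", "123\n") is not None

# Inside a class, '.' is literal and a trailing '-' is literal too.
dots = re.findall(r"[.]", "a.b")
dash_ok = re.fullmatch(r"[a.-]+", "a-.") is not None

print(unanchored_ok, anchored_ok, clean_ok)  # True False True
print(dollar_ok, strict_ok)                  # True False
print(dots, dash_ok)                         # ['.'] True
```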
