Regex Cheat Sheet for Beginners (2026)

April 20, 2026 · 8 min read

Arun Gopal · Developer & Founder, ToolPry

Regular expressions are one of those things every developer knows they should understand but many never fully master. This cheat sheet covers everything from basic character matching to advanced lookaheads — with real-world examples you can test immediately in ToolPry's Regex Tester.

Anchors

Anchors do not match characters — they match positions in a string.

Pattern	What it matches	Example
^	Start of string (or line with m flag)	`^Hello` matches "Hello world" but not "Say Hello"
$	End of string (or line with m flag)	`world$` matches "Hello world" but not "world peace"
\b	Word boundary	`\bcat\b` matches "cat" but not "catch" or "tomcat"
\B	Non-word boundary	`\Bcat\B` matches "scatter" but not "cat"
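
The rows above can be checked directly in JavaScript. This is a small sketch; the variable names are illustrative, and the sample strings are the ones from the table plus one made-up sentence.

```javascript
// Anchors and boundaries assert a position; they consume no characters.
const startsWithHello = /^Hello/;
const endsWithWorld = /world$/;
const wholeWordCat = /\bcat\b/;

const anchorChecks = {
  helloAtStart: startsWithHello.test('Hello world'),   // true
  helloMidString: startsWithHello.test('Say Hello'),   // false
  worldAtEnd: endsWithWorld.test('Hello world'),       // true
  worldMidString: endsWithWorld.test('world peace'),   // false
  catAsWord: wholeWordCat.test('the cat sat'),         // true
  catInsideCatch: wholeWordCat.test('catch'),          // false
  catInsideTomcat: wholeWordCat.test('tomcat'),        // false
};

console.log(anchorChecks);
```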

Character Classes

Pattern	What it matches
.	Any character except newline
\d	Any digit [0-9]
\D	Any non-digit
\w	Word character [a-zA-Z0-9_]
\W	Non-word character
\s	Whitespace (space, tab, newline)
\S	Non-whitespace
[abc]	Any of a, b, or c
[^abc]	Anything except a, b, or c
[a-z]	Any lowercase letter
[a-zA-Z]	Any letter

Quantifiers

Quantifiers specify how many times a pattern must match. By default they are greedy — they match as much as possible.

Quantifier	Meaning	Example
*	0 or more	`go*` matches "g", "go", "goo", "gooo"
+	1 or more	`go+` matches "go", "goo" but not "g"
?	0 or 1 (optional)	`colou?r` matches "color" and "colour"
{n}	Exactly n times	`\d{4}` matches "2026" but not "26"
{n,}	n or more times	`\d{2,}` matches "12", "123", "1234"
{n,m}	Between n and m times	`\d{2,4}` matches "12", "123", "1234"
*?	Lazy: 0 or more, as few as possible	`<.*?>` matches one tag at a time
+?	Lazy: 1 or more, as few as possible	`.+?` matches the shortest possible string

Greedy vs lazy: <.*> on <b>bold</b> matches the entire string. <.*?> matches only <b>. Add ? after a quantifier to make it lazy.
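
To see the difference side by side, here is a minimal sketch that assumes a sample string containing one pair of bold tags. The variable names are illustrative.

```javascript
const sample = '<b>bold</b>';

// Greedy: .* runs to the end, then backs off only as far as the last '>'.
const greedy = sample.match(/<.*>/)[0];   // '<b>bold</b>'

// Lazy: .*? stops at the first '>' it can reach.
const lazy = sample.match(/<.*?>/)[0];    // '<b>'

// With the g flag, the lazy pattern returns each tag separately.
const allTags = sample.match(/<.*?>/g);   // ['<b>', '</b>']

console.log({ greedy, lazy, allTags });
```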

Groups and Capturing

Pattern	What it does
(abc)	Capturing group — captures "abc" for back-reference
(?:abc)	Non-capturing group — groups without capturing
(?<name>abc)	Named capturing group
\1	Back-reference to group 1
\k<name>	Back-reference to named group
a|b	Alternation — matches a or b
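
Back-references and non-capturing groups are easier to grasp with a concrete input. The sketch below uses made-up sample strings and illustrative variable names.

```javascript
// \1 refers back to whatever group 1 actually matched (here: a repeated word).
const doubledWord = /\b(\w+)\s+\1\b/;
const doubled = 'it was the the best'.match(doubledWord);
// doubled[0] is 'the the', doubled[1] is 'the'

// \k<quote> requires the closing quote to be the same character as the opening one.
const quoted = /(?<quote>["']).*?\k<quote>/;
const phrase = `say "it's fine" now`.match(quoted);
// phrase[0] is '"it\'s fine"' (the apostrophe does not end the match)

// (?:...) groups for repetition without adding an entry to the match array.
const capturing = 'abcabc'.match(/(abc)+/);       // ['abcabc', 'abc']
const nonCapturing = 'abcabc'.match(/(?:abc)+/);  // ['abcabc']

console.log(doubled[0], phrase[0], capturing.length, nonCapturing.length);
```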

Example: Extract year, month, day from a date

const regex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = '2026-05-02'.match(regex);
console.log(match.groups.year);  // "2026"
console.log(match.groups.month); // "05"
console.log(match.groups.day);   // "02"

Lookaheads and Lookbehinds

Lookarounds let you match a pattern only if it is (or is not) followed by or preceded by another pattern — without including that other pattern in the match.

Pattern	Type	What it matches
(?=abc)	Positive lookahead	Position followed by "abc"
(?!abc)	Negative lookahead	Position NOT followed by "abc"
(?<=abc)	Positive lookbehind	Position preceded by "abc"
(?<!abc)	Negative lookbehind	Position NOT preceded by "abc"

Example: Match price numbers without the $ sign

// Matches the number in "$19.99" without including $
const regex = /(?<=\$)\d+(\.\d{2})?/;
'$19.99'.match(regex)[0]; // "19.99"
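
The other three lookaround types work the same way. The following sketch shows one common use of each; the inputs and variable names are invented for illustration.

```javascript
// Positive lookahead: insert a comma wherever a multiple of three digits follows.
const withCommas = '1234567'.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
// '1,234,567'

// Negative lookahead: accept .js files unless the name ends in .min.js.
const unminifiedJs = /^(?!.*\.min\.js$).*\.js$/;
const acceptsApp = unminifiedJs.test('app.js');        // true
const acceptsMin = unminifiedJs.test('app.min.js');    // false

// Negative lookbehind: numbers that are not part of a dollar amount.
// \d is included in the lookbehind so the match cannot start mid-number.
const notPrices = '$19 or 25'.match(/(?<![$\d])\d+/g); // ['25']

console.log(withCommas, acceptsApp, acceptsMin, notPrices);
```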

Flags

Flag	Meaning
g	Global — find all matches, not just the first
i	Case-insensitive matching
m	Multiline — ^ and $ match line boundaries
s	Dotall — . matches newlines too
u	Unicode mode — enables full Unicode matching
v	UnicodeSets — advanced Unicode (ES2024+)
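
A quick way to learn the flags is to run the same pattern with and without each one. This is a sketch with an invented two-line string; the names are illustrative.

```javascript
const twoLines = 'first line\nsecond line';

const flagChecks = {
  caretWithoutM: /^second/.test(twoLines),        // false: ^ only matches at index 0
  caretWithM: /^second/m.test(twoLines),          // true: ^ also matches after \n
  dotWithoutS: /first.*second/.test(twoLines),    // false: . stops at the newline
  dotWithS: /first.*second/s.test(twoLines),      // true: . crosses the newline
  unicodeUpper: /^\p{Lu}/u.test('\u00C9clair'),   // true: \p{...} needs the u flag
};

const everyA = 'A a'.match(/a/gi);    // ['A', 'a']
const firstA = 'A a'.match(/a/i)[0];  // 'A'

console.log(flagChecks, everyA, firstA);
```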

Real-World Regex Patterns

Email validation

/^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/

URL matching

/https?:\/\/(www\.)?[-a-zA-Z0-9@:%._+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_+.~#?&/=]*)/

IP address (IPv4)

/^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/
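
Here is the same pattern exercised against a few inputs. It is a sketch: the helper name isIPv4 is hypothetical. Note that the pattern accepts leading zeros, which some parsers treat differently.

```javascript
const ipv4 = /^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/;

// Hypothetical helper wrapping the pattern above.
function isIPv4(value) {
  return ipv4.test(value);
}

console.log(isIPv4('192.168.1.1'));   // true
console.log(isIPv4('10.0.0.255'));    // true
console.log(isIPv4('256.1.1.1'));     // false: 256 is out of range
console.log(isIPv4('1.2.3'));         // false: only three octets
console.log(isIPv4('01.02.03.004'));  // true: leading zeros are allowed by this pattern
```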

Strong password check

// At least 8 chars, one uppercase, one lowercase, one digit, one special char
/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/
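
Each lookahead in that pattern checks one requirement, and the final character class limits which characters are allowed at all. The sketch below uses made-up passwords and a hypothetical variable name to show both effects.

```javascript
const strongPassword =
  /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

console.log(strongPassword.test('Passw0rd!'));  // true: all four requirements met
console.log(strongPassword.test('password'));   // false: no uppercase, digit, or special char
console.log(strongPassword.test('Sh0rt!a'));    // false: only 7 characters
console.log(strongPassword.test('Passw0rd#'));  // false: '#' is not in the allowed set
```

The last case is worth noting: because the character class is a closed list, any symbol outside it makes the whole password fail.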

Strip HTML tags

const clean = html.replace(/<[^>]*>/g, '');

Match hex color codes

/#([a-fA-F0-9]{6}|[a-fA-F0-9]{3})\b/g

Extract all URLs from text

/https?:\/\/[^\s"'<>]+/g

Regex in Python vs JavaScript

The core syntax is the same, but there are key differences to know.

Python

import re

# Search (find first match anywhere)
m = re.search(r'\d+', 'Order 42')
print(m.group())  # "42"

# Find all matches
matches = re.findall(r'\d+', '1 cat and 2 dogs')
print(matches)  # ['1', '2']

# Substitute
result = re.sub(r'\s+', '-', 'hello world')
print(result)  # "hello-world"

# Compile for reuse (performance)
pattern = re.compile(r'^\d{4}-\d{2}-\d{2}$')
print(bool(pattern.match('2026-05-02')))  # True

JavaScript

// Literal syntax
const regex = /\d+/g;

// Constructor (useful for dynamic patterns)
const regex2 = new RegExp('\\d+', 'g');

// test() — returns boolean
/^\d+$/.test('123');  // true

// match() — returns array or null
'hello 42 world'.match(/\d+/g);  // ['42']

// replace()
'hello world'.replace(/\s+/g, '-');  // 'hello-world'

// replaceAll() with regex (ES2021+)
'aabbcc'.replaceAll(/(.)\1/g, '$1');  // 'abc'

Tips for Writing Better Regex

Always test your patterns against edge cases — empty strings, Unicode characters, very long strings. Use non-capturing groups (?:...) when you do not need the captured value, as this is slightly faster. Avoid catastrophic backtracking by not nesting quantifiers like (a+)+. Compile patterns you use repeatedly. And always use a live regex tester while writing patterns — it saves enormous amounts of debugging time.

For more pattern inspiration, the Word Counter tool uses regex internally to count sentences and syllables. You can also use the URL Encoder when you need to escape special regex characters for use in URLs.

Writing Regex for Real Use Cases

The gap between knowing regex syntax and applying it confidently to real problems is where most developers get stuck. Understanding each element individually is straightforward; knowing which combination to reach for when facing an actual input is harder. These patterns cover the situations you will encounter most frequently.

Validate an email address

No email regex is perfect: a pattern that implements the full RFC 5322 address grammar runs to thousands of characters. A short pattern like the one under Real-World Regex Patterns accepts the vast majority of real-world addresses while rejecting obvious typos. For production, validate server-side with a library and confirm ownership via a verification email rather than relying solely on a regex.
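
As a sketch, here is the short email pattern from the Real-World Regex Patterns section wrapped in a helper. The name looksLikeEmail is hypothetical, and the last case shows the kind of looseness a simple pattern accepts.

```javascript
const emailPattern = /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/;

// Hypothetical helper: a first-pass format check, not proof the address exists.
function looksLikeEmail(value) {
  return emailPattern.test(value);
}

console.log(looksLikeEmail('user@example.com'));                // true
console.log(looksLikeEmail('first.last+tag@sub.example.org'));  // true
console.log(looksLikeEmail('user@localhost'));                  // false: no dot in the domain
console.log(looksLikeEmail('user@@example.com'));               // false: second @ is not allowed
console.log(looksLikeEmail('a..b@example.com'));                // true, although consecutive dots are not valid
```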

Match a URL
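
One possible approach, sketched below, is to use a loose pattern to find candidates in text and then let the built-in URL constructor decide whether each one is well formed. The helper name isHttpUrl and the sample text are assumptions for illustration.

```javascript
const urlInText = /https?:\/\/[^\s"'<>]+/g;

const note = 'Docs: https://example.com/docs?page=2 and http://test.org/a. Done';

// The loose pattern also picks up the sentence-ending period after the second URL.
const candidates = note.match(urlInText) ?? [];
// ['https://example.com/docs?page=2', 'http://test.org/a.']

// Trim trailing punctuation that is far more likely prose than part of the URL.
const trimmed = candidates.map((u) => u.replace(/[.,;:!?)]+$/, ''));

// Hypothetical helper: defer the real validation to the URL parser.
function isHttpUrl(candidate) {
  try {
    const parsed = new URL(candidate);
    return parsed.protocol === 'http:' || parsed.protocol === 'https:';
  } catch {
    return false;
  }
}

console.log(trimmed.filter(isHttpUrl));
```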

Extract numbers from text
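
A minimal sketch, assuming you want integers and decimals with an optional leading minus sign. The sample sentence is invented.

```javascript
// -? optional sign, \d+ integer part, (?:\.\d+)? optional decimal part.
const numberPattern = /-?\d+(?:\.\d+)?/g;

const sentence = 'Order 42 costs $19.99, or -5 with coupon';
const asStrings = sentence.match(numberPattern) ?? [];  // ['42', '19.99', '-5']
const asNumbers = asStrings.map(Number);                // [42, 19.99, -5]

console.log(asNumbers);
```

Be aware that a hyphen used as a range separator, as in "3-4", would be read as 3 and -4 by this pattern. Drop the -? if your input never contains negative numbers.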

Slugify a string (URL-safe)
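
There are many ways to build a slug. The sketch below is one of them, using a hypothetical slugify helper that keeps only ASCII letters and digits.

```javascript
// Hypothetical helper: turns a title into a lowercase, hyphen-separated slug.
function slugify(input) {
  return input
    .normalize('NFKD')                 // split accented letters into base + combining mark
    .replace(/[\u0300-\u036f]/g, '')   // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')       // collapse every other run of characters into one hyphen
    .replace(/^-+|-+$/g, '');          // trim hyphens from both ends
}

console.log(slugify('  Hello, World! 2026 '));        // 'hello-world-2026'
console.log(slugify('Cr\u00e8me Br\u00fbl\u00e9e'));  // 'creme-brulee'
```

Characters with no ASCII base letter (for example CJK text) are removed entirely by this version, so it suits mostly Latin-script titles.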

Validate a hex colour code
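
For validation, an anchored variant of the hex pattern from earlier in this article is a reasonable starting point. This is a sketch; the helper name isHexColour is hypothetical, and 4- or 8-digit forms with alpha are deliberately not covered.

```javascript
// Accepts #RGB or #RRGGBB, nothing else.
const hexColour = /^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/;

// Hypothetical helper wrapping the pattern.
function isHexColour(value) {
  return hexColour.test(value);
}

console.log(isHexColour('#fff'));     // true
console.log(isHexColour('#1A2b3C'));  // true
console.log(isHexColour('#ffff'));    // false: four digits
console.log(isHexColour('fff'));      // false: missing #
console.log(isHexColour('#ggg'));     // false: g is not a hex digit
```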

Strip HTML tags safely
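
The one-line replace shown earlier is fine for turning simple markup into display text, but it is not a sanitizer. The sketch below, with a hypothetical stripTags helper, shows both a case where it works and one where it does not.

```javascript
// Hypothetical helper: removes anything that looks like <...>.
function stripTags(html) {
  return html.replace(/<[^>]*>/g, '');
}

console.log(stripTags('<p>Hello <b>world</b>!</p>'));
// 'Hello world!'

// A '>' inside an attribute value ends the match too early and leaks markup.
console.log(stripTags('<a title="a > b">link</a>'));
// ' b">link'
```

For untrusted input, parse the HTML with a real parser and extract the text content instead.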

Common Regex Mistakes

Forgetting to escape special characters. The characters . * + ? ^ $ { } [ ] | ( ) \ are all metacharacters in regex. To match them literally, escape with a backslash. Trying to match a period with . actually matches any character — use \. to match a literal period.
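
When the text to match comes from a variable, escaping by hand is not practical. A common approach is a small helper like the hypothetical escapeRegExp below, which prefixes every metacharacter with a backslash before the string is passed to new RegExp.

```javascript
// Hypothetical helper: escape every regex metacharacter in a literal string.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const unescaped = new RegExp('^1.5$');
const escaped = new RegExp('^' + escapeRegExp('1.5') + '$');

console.log(unescaped.test('125'));  // true: the dot matched '2'
console.log(escaped.test('125'));    // false
console.log(escaped.test('1.5'));    // true
```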

Catastrophic backtracking. Patterns like (a+)+ or (.*)* cause exponential backtracking on certain inputs, making your regex take seconds or minutes on a short string. Avoid nesting quantifiers. Use atomic groups or possessive quantifiers where available: Python 3.11+ supports both, while JavaScript's built-in engine supports neither.

Using greedy when you need lazy. <.*> on <b>text</b> matches the entire string from the first < to the last >. Add ? after the quantifier: <.*?> matches one tag at a time.

Forgetting the global flag when replacing. str.replace(/pattern/, replacement) only replaces the first match. Use /pattern/g to replace all occurrences. str.replaceAll() requires the g flag when it is given a regex and throws a TypeError without it.
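
A short sketch of how replace() and replaceAll() behave with and without the g flag. The sample string and variable names are illustrative.

```javascript
const dashed = 'a-b-c';

const firstOnly = dashed.replace(/-/, '+');       // 'a+b-c'
const everyMatch = dashed.replace(/-/g, '+');     // 'a+b+c'
const viaReplaceAll = dashed.replaceAll(/-/g, '+'); // 'a+b+c'

// replaceAll() refuses a regex that lacks the g flag.
let replaceAllError = null;
try {
  dashed.replaceAll(/-/, '+');
} catch (err) {
  replaceAllError = err; // a TypeError
}

console.log(firstOnly, everyMatch, viaReplaceAll, replaceAllError instanceof TypeError);
```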

Not anchoring validation patterns. /\d{4}/ matches any string containing four consecutive digits — including abc12345. For validation, always anchor: /^\d{4}$/ matches only exactly four digits and nothing else.
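
The difference is easy to confirm. This sketch uses the same input as the paragraph above; the variable names are illustrative.

```javascript
const unanchored = /\d{4}/;
const anchored = /^\d{4}$/;

console.log(unanchored.test('abc12345'));  // true: '1234' is found inside the string
console.log(anchored.test('abc12345'));    // false: the whole string must be four digits
console.log(anchored.test('2026'));        // true
```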

Testing and Debugging Regex

Always test regex patterns against edge cases before using them in production: empty strings, very long inputs, strings with Unicode characters, inputs with special characters, and boundary conditions. ToolPry's Regex Tester lets you test patterns against multiple inputs simultaneously with real-time matching highlights, making edge case testing fast.

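For complex patterns, compile the regex once and reuse it. In JavaScript, define const pattern = /your-pattern/; outside the loop; in Python, use pattern = re.compile(r'your-pattern'). Recompiling on every iteration is a measurable performance cost in tight loops. If the reused JavaScript pattern carries the g or y flag, remember that test() and exec() resume from lastIndex, so reset it to 0 or drop the flag between unrelated inputs.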
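
Reusing one regex object has a well-known pitfall in JavaScript: with the g flag, test() remembers where the previous match ended. The sketch below shows the surprise and a simple way around it; the names are illustrative.

```javascript
// With g, the regex object carries state between calls.
const digitsGlobal = /\d+/g;
const firstCall = digitsGlobal.test('42');   // true, lastIndex is now 2
const secondCall = digitsGlobal.test('42');  // false: the search started at index 2

// Without g, a hoisted pattern is safe to reuse for yes/no checks.
const digits = /\d+/;
const inputs = ['42', 'x', '7'];
const results = inputs.map((s) => digits.test(s));  // [true, false, true]

console.log(firstCall, secondCall, results);
```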

Frequently Asked Questions

What is the difference between test(), match(), and exec()?

In JavaScript, test() returns a boolean and is the fastest way to check if a pattern exists. match() returns an array of all matches (with the g flag) or the first match with capture groups (without g). exec() returns one match at a time with full group information, and maintains state between calls with the g flag — useful for iterating through all matches in a loop while accessing group data.
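
Here is a small sketch of the three methods applied to one made-up input, with illustrative variable names.

```javascript
const input = 'a=1, b=22';
const pair = /(\w)=(\d+)/g;

// test(): just a yes/no answer.
const hasDigit = /\d/.test(input);                // true

// match() without g: first match plus its capture groups.
const firstPair = input.match(/(\w)=(\d+)/);      // ['a=1', 'a', '1']

// match() with g: every full match, but no groups.
const allPairs = input.match(pair);               // ['a=1', 'b=22']

// exec() with g: one match per call, with groups and index, until it returns null.
const found = [];
let m;
while ((m = pair.exec(input)) !== null) {
  found.push({ key: m[1], value: m[2], index: m.index });
}

console.log(hasDigit, firstPair.slice(0, 3), allPairs, found);
```

In newer code, input.matchAll(pair) gives the same per-match detail as the exec() loop without the manual state handling.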

When should I use a regex library vs built-in regex?

Use built-in regex for most tasks. Use a library when you need features the built-in engine lacks: PCRE-compatible lookbehinds, named backreferences in older environments, non-backtracking engines for untrusted input (safer against ReDoS attacks), or multiple engine support. The XRegExp library extends JavaScript regex with Unicode properties, named groups in older browsers, and other advanced features.

Can regex parse HTML or JSON?

You should not parse HTML or JSON with regex. HTML is not a regular language — it has arbitrarily deep nesting that regex cannot handle correctly. Use DOMParser or a library like Cheerio for HTML. JSON should be parsed with JSON.parse(). Regex is appropriate for extracting simple patterns from known, well-structured strings — not for parsing arbitrary markup or structured data formats.
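
A small sketch of why this matters for JSON: a naive pattern breaks as soon as the value contains an escaped quote, while JSON.parse() handles it. The sample document and variable names are invented.

```javascript
// The name value contains escaped double quotes.
const raw = '{"name":"Ada \\"The Countess\\" Lovelace","tags":["math"]}';

// Naive regex: stops at the first double quote it sees, even an escaped one.
const viaRegex = raw.match(/"name":"([^"]*)"/)[1];

// The parser understands the escape sequences.
const viaParser = JSON.parse(raw).name;

console.log(viaRegex);   // Ada \
console.log(viaParser);  // Ada "The Countess" Lovelace
```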

How do I make a regex case-insensitive?

Add the i flag: /pattern/i in JavaScript, or re.IGNORECASE / re.I in Python. Case-insensitive matching is slightly slower than case-sensitive because the engine must check both cases for each character, but the difference is negligible for typical use cases.

Regex Performance Tips

For most everyday use, regex performance is not a concern. When you are running patterns against millions of strings in a loop, or processing large files, these optimisations matter.

Compile patterns once. In Python, calling re.compile(r'pattern') ahead of the loop avoids a cache lookup on every iteration. The re module caches recently used patterns, so the gain over calling re.search(r'pattern', string) each time is modest but measurable in hot loops. In JavaScript, engines typically compile a regex literal once and reuse the compiled form, whereas new RegExp(...) inside a loop builds a new object on every pass, so hoist it out of the loop.

Anchor when possible. ^pattern and pattern$ let the engine stop early when a string does not match. Without anchors, the engine tries the pattern at every position in the string. For validation (checking if an entire string matches a format), always anchor both ends.

Prefer specific character classes over dot. When you know which characters to expect, [0-9] or \d is usually a better choice than . because a narrow class fails fast on unexpected input and leaves the engine fewer paths to backtrack through. Likewise, use \d rather than a wider class such as [0-9a-fA-F] if you only need digits.

Avoid catastrophic backtracking. The pattern (a+)+b on a string like aaaaaaaaaaac causes exponential backtracking: the engine tries every possible way of splitting the run of a's into groups before concluding there is no match. In a browser this can freeze the page, and in Node.js it blocks the event loop. Test suspect patterns with a tool like ToolPry's Regex Tester against inputs designed to trigger worst-case behaviour.
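
The usual fix is to remove the nesting. In the sketch below, a+b accepts exactly the same strings as (a+)+b, and a rough timing loop lets you watch the nested version slow down as the input grows. The helper timeMatch is hypothetical, the input lengths are kept small on purpose, and the timings depend on your machine.

```javascript
const nested = /(a+)+b/;  // nested quantifier: many ways to split a run of a's
const flat = /a+b/;       // same matches, only one way to split

// Both patterns agree on what matches.
const samples = ['ab', 'aaab', 'b', 'aaac', 'xaab'];
const agree = samples.every((s) => nested.test(s) === flat.test(s));
console.log('same results:', agree);

// Hypothetical helper: milliseconds taken by one test() call.
function timeMatch(regex, text) {
  const start = Date.now();
  regex.test(text);
  return Date.now() - start;
}

// Non-matching inputs are the worst case. Increase n with care.
for (const n of [12, 16, 20]) {
  const text = 'a'.repeat(n) + 'c';
  console.log(n, 'nested ms:', timeMatch(nested, text), 'flat ms:', timeMatch(flat, text));
}
```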

Use non-capturing groups when you do not need the capture. (?:abc) instead of (abc) is slightly faster because the engine does not need to store the captured text. In tight loops processing many strings, this adds up.