Regex Cheat Sheet for Beginners (2026)

May 2, 2026 · 8 min read

Regular expressions are one of those things every developer knows they should understand but many never fully master. This cheat sheet covers everything from basic character matching to advanced lookaheads — with real-world examples you can test immediately in ToolPry's Regex Tester.

Anchors

Anchors do not match characters — they match positions in a string.

Pattern  What it matches                        Example
^        Start of string (or line with m flag)  ^Hello matches "Hello world" but not "Say Hello"
$        End of string (or line with m flag)    world$ matches "Hello world" but not "world peace"
\b       Word boundary                          \bcat\b matches "cat" but not "catch" or "tomcat"
\B       Non-word boundary                      \Bcat\B matches "scatter" but not "cat"
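The rows above can be checked directly with test(). This is a minimal sketch that reuses the sample strings from the table; the variable names are illustrative only.

```javascript
// Anchors match positions, so the result depends on where the text sits.
const startsWithHello = /^Hello/;
const endsWithWorld = /world$/;
const wholeWordCat = /\bcat\b/;
const innerCat = /\Bcat\B/;

const anchorResults = {
  helloAtStart: startsWithHello.test('Hello world'), // true
  helloLater: startsWithHello.test('Say Hello'),     // false
  worldAtEnd: endsWithWorld.test('Hello world'),     // true
  worldEarlier: endsWithWorld.test('world peace'),   // false
  catAlone: wholeWordCat.test('cat'),                // true
  catInCatch: wholeWordCat.test('catch'),            // false
  catInTomcat: wholeWordCat.test('tomcat'),          // false
  catInScatter: innerCat.test('scatter'),            // true
  catAloneInner: innerCat.test('cat'),               // false
};
console.log(anchorResults);
```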

Character Classes

Pattern   What it matches
.         Any character except newline
\d        Any digit [0-9]
\D        Any non-digit
\w        Word character [a-zA-Z0-9_]
\W        Non-word character
\s        Whitespace (space, tab, newline)
\S        Non-whitespace
[abc]     Any of a, b, or c
[^abc]    Anything except a, b, or c
[a-z]     Any lowercase letter
[a-zA-Z]  Any letter
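To see how these classes carve up a string, here is a small sketch run against one made-up sample string (the string and variable names are assumptions for illustration).

```javascript
const sample = 'Room 42, floor_3!';

const classResults = {
  digits: sample.match(/\d+/g),            // ['42', '3']
  wordRuns: sample.match(/\w+/g),          // ['Room', '42', 'floor_3']
  nonWordRuns: sample.match(/\W+/g),       // [' ', ', ', '!']
  lowercaseRuns: sample.match(/[a-z]+/g),  // ['oom', 'floor']
  notLetters: sample.match(/[^a-zA-Z]+/g), // [' 42, ', '_3!']
};
console.log(classResults);
```

Note that \w treats the underscore as a word character, which is why floor_3 stays in one piece.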

Quantifiers

Quantifiers specify how many times a pattern must match. By default they are greedy — they match as much as possible.

Quantifier  Meaning                              Example
*           0 or more                            go* matches "g", "go", "goo", "gooo"
+           1 or more                            go+ matches "go", "goo" but not "g"
?           0 or 1 (optional)                    colou?r matches "color" and "colour"
{n}         Exactly n times                      \d{4} matches "2026" but not "26"
{n,}        n or more times                      \d{2,} matches "12", "123", "1234"
{n,m}       Between n and m times                \d{2,4} matches "12", "123", "1234"
*?          Lazy: 0 or more, as few as possible  <.*?> matches one tag at a time
+?          Lazy: 1 or more, as few as possible  .+? matches the shortest possible string

Greedy vs lazy: <.*> on <b>bold</b> matches the entire string. <.*?> matches only <b>. Add ? after a quantifier to make it lazy.
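The greedy and lazy behaviour described above is easy to confirm in a console. A minimal sketch, using the same <b>bold</b> input:

```javascript
const html = '<b>bold</b>';

const greedy = html.match(/<.*>/)[0];   // '<b>bold</b>' (runs to the last >)
const lazy = html.match(/<.*?>/)[0];    // '<b>' (stops at the first >)
const allTags = html.match(/<.*?>/g);   // ['<b>', '</b>']

console.log(greedy, lazy, allTags);
```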

Groups and Capturing

Pattern       What it does
(abc)         Capturing group: captures "abc" for back-reference
(?:abc)       Non-capturing group: groups without capturing
(?<name>abc)  Named capturing group
\1            Back-reference to group 1
\k<name>      Back-reference to named group
a|b           Alternation: matches a or b

Example: Extract year, month, day from a date

const regex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = '2026-05-02'.match(regex);
console.log(match.groups.year);  // "2026"
console.log(match.groups.month); // "05"
console.log(match.groups.day);   // "02"
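Back-references (\1 and \k<name>) are the part of this table that tends to need an example. The sketch below shows two common uses, finding a doubled word and matching a quoted string that closes with the same quote it opened with. The sample sentences and group names are made up for illustration.

```javascript
// \1 must repeat exactly what group 1 captured.
const doubledWord = /\b(\w+)\s+\1\b/;
const doubledMatch = 'This is is a test'.match(doubledWord);
console.log(doubledMatch[0]); // "is is"
console.log(doubledMatch[1]); // "is"

// \k<quote> must repeat whichever quote character opened the string.
const quoted = /(?<quote>["'])(?<body>.*?)\k<quote>/;
const quotedMatch = `say "it's fine" now`.match(quoted);
console.log(quotedMatch.groups.body); // "it's fine"
```

Because the closing quote has to equal the opening one, the apostrophe inside the double-quoted text does not end the match early.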

Lookaheads and Lookbehinds

Lookarounds let you match a pattern only if it is (or is not) followed by or preceded by another pattern — without including that other pattern in the match.

Pattern   Type                 What it matches
(?=abc)   Positive lookahead   Position followed by "abc"
(?!abc)   Negative lookahead   Position NOT followed by "abc"
(?<=abc)  Positive lookbehind  Position preceded by "abc"
(?<!abc)  Negative lookbehind  Position NOT preceded by "abc"

Example: Match price numbers without the $ sign

// Matches the number in "$19.99" without including $
const regex = /(?<=\$)\d+(\.\d{2})?/;
'$19.99'.match(regex)[0]; // "19.99"
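The other three lookaround types work the same way. Here is a hedged sketch with one example of each; the input strings are invented for illustration, and lookbehind needs an engine that supports it (current Node.js and evergreen browsers do).

```javascript
// Positive lookahead: numbers that are followed by "px".
const pxValues = '12px 3em 40px'.match(/\d+(?=px)/g); // ['12', '40']

// Negative lookbehind: numbers NOT preceded by "$" (or by another digit,
// which stops the engine from matching the tail of a price).
const nonPrices = 'Pay $19 for 42 items'.match(/(?<![$\d])\d+/g); // ['42']

// Lookahead plus negative lookahead: insert thousands separators.
const withCommas = '1234567'.replace(/\B(?=(\d{3})+(?!\d))/g, ','); // '1,234,567'

console.log(pxValues, nonPrices, withCommas);
```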

Flags

Flag  Meaning
g     Global: find all matches, not just the first
i     Case-insensitive matching
m     Multiline: ^ and $ match line boundaries
s     Dotall: . matches newlines too
u     Unicode mode: enables full Unicode matching
v     UnicodeSets: advanced Unicode (ES2024+)
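A short sketch of what the first five flags change in practice. The two-line sample text is an assumption made for the example.

```javascript
const text = 'first line\nSecond line';

const flagResults = {
  lineStartsNoM: text.match(/^\w+/g),               // ['first']
  lineStartsWithM: text.match(/^\w+/gm),            // ['first', 'Second']
  dotWithoutS: /first.*Second/.test(text),          // false: . stops at the newline
  dotWithS: /first.*Second/s.test(text),            // true
  caseInsensitive: /second/i.test(text),            // true
  // U+1F600 is one code point stored as two UTF-16 code units.
  dotNoU: '\u{1F600}'.match(/./)[0].length,         // 1 (half of the pair)
  dotWithU: '\u{1F600}'.match(/./u)[0].length,      // 2 (the whole code point)
};
console.log(flagResults);
```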

Real-World Regex Patterns

Email validation

/^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/

URL matching

/https?:\/\/(www\.)?[-a-zA-Z0-9@:%._+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_+.~#?&/=]*)/

IP address (IPv4)

/^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/
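Before relying on the IPv4 pattern, it helps to run it against a few edge cases. The sample addresses below are chosen for illustration; note in particular that the pattern accepts leading zeros.

```javascript
const ipv4 = /^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/;

const samples = ['192.168.1.1', '255.255.255.255', '256.1.1.1', '1.2.3', '01.2.3.4'];
const ipResults = Object.fromEntries(samples.map((s) => [s, ipv4.test(s)]));

console.log(ipResults);
// '192.168.1.1' and '255.255.255.255' pass, '256.1.1.1' and '1.2.3' fail,
// and '01.2.3.4' passes because [01]?\d\d? allows a leading zero.
```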

Strong password check

// At least 8 chars, one uppercase, one lowercase, one digit, one special char
/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/
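A quick sketch of how the password pattern behaves. The candidate passwords are invented for the example. One detail worth noticing is that the final character class is also an allow-list, so a password containing a symbol outside @$!%*?& is rejected even if it meets every other rule.

```javascript
const strongPassword =
  /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

const passwordResults = {
  ok: strongPassword.test('Passw0rd!'),            // true
  noUpperDigitSymbol: strongPassword.test('password'), // false
  tooShort: strongPassword.test('Pw0!'),           // false
  symbolNotAllowed: strongPassword.test('Passw0rd#'),  // false: # is not in the class
};
console.log(passwordResults);
```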

Strip HTML tags

const clean = html.replace(/<[^>]*>/g, '');

Match hex color codes

/#([a-fA-F0-9]{6}|[a-fA-F0-9]{3})\b/g

Extract all URLs from text

/https?:\/\/[^\s"'<>]+/g
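A usage sketch for the extraction pattern. Because the pattern only stops at whitespace, quotes, and angle brackets, a URL at the end of a sentence picks up the trailing full stop. The cleanup step shown here is one possible fix and is an assumption, not part of the original pattern.

```javascript
const urlPattern = /https?:\/\/[^\s"'<>]+/g;
const message = 'Docs: https://example.com/docs. Mirror: <http://test.org/a?b=1>';

// match() returns null when nothing matches, so fall back to an empty array.
const rawUrls = message.match(urlPattern) ?? [];
// ['https://example.com/docs.', 'http://test.org/a?b=1']

// Strip trailing sentence punctuation that is unlikely to belong to the URL.
const urls = rawUrls.map((url) => url.replace(/[.,;:!?)]+$/, ''));
// ['https://example.com/docs', 'http://test.org/a?b=1']

console.log(urls);
```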

Regex in Python vs JavaScript

The core syntax is the same, but there are key differences to know.

Python

import re

# Search (find first match anywhere)
m = re.search(r'\d+', 'Order 42')
print(m.group())  # "42"

# Find all matches
matches = re.findall(r'\d+', '1 cat and 2 dogs')
print(matches)  # ['1', '2']

# Substitute
result = re.sub(r'\s+', '-', 'hello world')
print(result)  # "hello-world"

# Compile for reuse (performance)
pattern = re.compile(r'^\d{4}-\d{2}-\d{2}$')
print(bool(pattern.match('2026-05-02')))  # True

JavaScript

// Literal syntax
const regex = /\d+/g;

// Constructor (useful for dynamic patterns)
const regex2 = new RegExp('\\d+', 'g');

// test() — returns boolean
/^\d+$/.test('123');  // true

// match() — returns array or null
'hello 42 world'.match(/\d+/g);  // ['42']

// replace()
'hello world'.replace(/\s+/g, '-');  // 'hello-world'

// replaceAll() with regex (ES2021+)
'aabbcc'.replaceAll(/(.)\1/g, '$1');  // 'abc'

Tips for Writing Better Regex

Always test your patterns against edge cases — empty strings, Unicode characters, very long strings. Use non-capturing groups (?:...) when you do not need the captured value, as this is slightly faster. Avoid catastrophic backtracking by not nesting quantifiers like (a+)+. Compile patterns you use repeatedly. And always use a live regex tester while writing patterns — it saves enormous amounts of debugging time.

For more pattern inspiration, the Word Counter tool uses regex internally to count sentences and syllables. You can also use the URL Encoder when you need to escape special regex characters for use in URLs.

Writing Regex for Real Use Cases

The gap between knowing regex syntax and applying it confidently to real problems is where most developers get stuck. Understanding each element individually is straightforward; knowing which combination to reach for when facing an actual input is harder. These patterns cover the situations you will encounter most frequently.

Validate an email address

/^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/

No email regex is perfect: a pattern that follows the full RFC 5322 grammar runs to thousands of characters. This pattern accepts the address formats you will see in everyday use, though it lets through some invalid addresses and rejects some unusual valid ones. For production, validate server-side with a library and confirm ownership via a verification email rather than relying solely on a regex.
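The limits of the pattern show up quickly when you test it. In this sketch the sample addresses are made up; the last one is syntactically invalid (two consecutive dots in the domain) but still passes, which is why a verification email remains necessary.

```javascript
const emailPattern = /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/;

const emailResults = {
  plain: emailPattern.test('user@example.com'),            // true
  plusTag: emailPattern.test('user+news@mail.example.org'), // true
  noTld: emailPattern.test('user@example'),                // false
  doubleAt: emailPattern.test('user@@example.com'),        // false
  space: emailPattern.test('user name@example.com'),       // false
  doubleDot: emailPattern.test('user@example..com'),       // true (a false positive)
};
console.log(emailResults);
```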

Match a URL

/https?:\/\/(www\.)?[\w\-]+(\.[\w\-]+)+([\w.,@?^=%&:/~+#\-]*[\w@?^=%&/~+#\-])?/g

Extract numbers from text

// All integers
'Order 42, item 7, qty 100'.match(/\d+/g); // ['42', '7', '100']

// Decimal numbers
'$12.99 and $3.50'.match(/\d+\.\d+/g);   // ['12.99', '3.50']

// Signed integers (optional leading + or -)
'-5 and +3 and 42'.match(/[+-]?\d+/g);    // ['-5', '+3', '42']

Slugify a string (URL-safe)

function slugify(str) {
  return str
    .toLowerCase()
    .trim()
    .replace(/[^\w\s-]/g, '')   // drop everything except word chars, whitespace, and hyphens
    .replace(/[\s_-]+/g, '-')   // collapse runs of whitespace, underscores, and hyphens into one hyphen
    .replace(/^-+|-+$/g, '');   // trim leading/trailing hyphens
}
slugify('Hello, World! 2026'); // 'hello-world-2026'

Validate a hex colour code

/^#([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/

// Test:
/^#([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/.test('#22d3ee'); // true
/^#([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/.test('#xyz');    // false

Strip HTML tags safely

const stripped = html.replace(/<[^>]*>/g, '');
// Note: for untrusted HTML, use DOMPurify instead of regex

Common Regex Mistakes

Forgetting to escape special characters. The characters . * + ? ^ $ { } [ ] | ( ) \ are all metacharacters in regex. To match them literally, escape with a backslash. Trying to match a period with . actually matches any character — use \. to match a literal period.
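When the text to match comes from a variable, escaping by hand is not an option. A common approach is a small helper that backslash-escapes every metacharacter before the string goes into new RegExp(). The helper name escapeRegExp below is an assumption for this sketch, not a built-in it relies on.

```javascript
// Hypothetical helper: prefix each regex metacharacter with a backslash.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Unescaped, the dot matches any character, so "1.5" also matches "125".
const unescaped = new RegExp('1.5').test('125');              // true
const escaped = new RegExp(escapeRegExp('1.5')).test('125');  // false

const literal = new RegExp(escapeRegExp('1.5 (approx)'));
const found = literal.test('about 1.5 (approx) metres');      // true

console.log(unescaped, escaped, found);
```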

Catastrophic backtracking. Patterns like (a+)+ or (.*)* cause exponential backtracking on certain inputs, making your regex take seconds or minutes on a short string. Avoid nesting quantifiers. Use atomic groups or possessive quantifiers where available.

Using greedy when you need lazy. <.*> on <b>text</b> matches the entire string from the first < to the last >. Add ? after the quantifier: <.*?> matches one tag at a time.

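Forgetting the global flag when replacing. str.replace(/pattern/, replacement) only replaces the first match. Use /pattern/g to replace all occurrences. str.replaceAll() also needs the g flag when it is given a regex, and throws a TypeError without it.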
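A small sketch of the difference, including what replaceAll() does when the g flag is missing. The sample string is made up.

```javascript
const csv = 'a,b,c';

const firstOnly = csv.replace(/,/, ';');      // 'a;b,c'
const everyMatch = csv.replace(/,/g, ';');    // 'a;b;c'
const viaReplaceAll = csv.replaceAll(/,/g, ';'); // 'a;b;c'

// replaceAll() refuses a regex that lacks the g flag.
let threwTypeError = false;
try {
  csv.replaceAll(/,/, ';');
} catch (error) {
  threwTypeError = error instanceof TypeError;
}

console.log(firstOnly, everyMatch, viaReplaceAll, threwTypeError);
```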

Not anchoring validation patterns. /\d{4}/ matches any string containing four consecutive digits — including abc12345. For validation, always anchor: /^\d{4}$/ matches only exactly four digits and nothing else.
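The effect of anchoring is easiest to see side by side. A minimal sketch using the abc12345 input mentioned above:

```javascript
const unanchored = /\d{4}/;
const anchored = /^\d{4}$/;

const anchoringResults = {
  unanchoredOnJunk: unanchored.test('abc12345'), // true: finds "1234" inside
  anchoredOnJunk: anchored.test('abc12345'),     // false
  anchoredOnYear: anchored.test('2026'),         // true
  anchoredOnFive: anchored.test('20260'),        // false
};
console.log(anchoringResults);
```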

Testing and Debugging Regex

Always test regex patterns against edge cases before using them in production: empty strings, very long inputs, strings with Unicode characters, inputs with special characters, and boundary conditions. ToolPry's Regex Tester lets you test patterns against multiple inputs simultaneously with real-time matching highlights, making edge case testing fast.

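For complex patterns, compile the regex once and reuse it. In JavaScript: const pattern = /your-pattern/; defined outside a loop. In Python: pattern = re.compile(r'your-pattern'). Recompiling on every iteration can be a measurable performance cost in tight loops. One caveat in JavaScript: a regex with the g (or y) flag remembers its position in lastIndex between test() and exec() calls, so drop the g flag or reset lastIndex to 0 before reusing it on a new string.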
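One thing to watch when reusing a compiled JavaScript regex is the g flag. A global regex stores its position in lastIndex, so repeated test() calls on the same object can give surprising answers. The sketch below shows the behaviour and one way around it; the variable names and inputs are illustrative.

```javascript
// With g, the second call resumes from where the first match ended.
const globalDigits = /\d+/g;
const firstCall = globalDigits.test('42');   // true, lastIndex is now 2
const secondCall = globalDigits.test('42');  // false, search started at index 2

// Without g, the regex is stateless and safe to reuse for yes/no checks.
const digits = /\d+/;
const inputs = ['42', '7', 'x'];
const hasDigits = inputs.map((input) => digits.test(input)); // [true, true, false]

console.log(firstCall, secondCall, hasDigits);
```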

Frequently Asked Questions

What is the difference between test(), match(), and exec()?

In JavaScript, test() returns a boolean and is the fastest way to check if a pattern exists. match() returns an array of all matches (with the g flag) or the first match with capture groups (without g). exec() returns one match at a time with full group information, and maintains state between calls with the g flag — useful for iterating through all matches in a loop while accessing group data.
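The three methods side by side, as a minimal sketch on a made-up key=value string:

```javascript
const text = 'a=1, b=22';

// test(): yes/no answer.
const hasPair = /(\w)=(\d+)/.test(text);          // true

// match() with g: every full match, no groups.
const allPairs = text.match(/(\w)=(\d+)/g);       // ['a=1', 'b=22']

// match() without g: first match plus its groups.
const firstPair = text.match(/(\w)=(\d+)/);       // ['a=1', 'a', '1']

// exec() with g: one match per call, groups included.
const pairPattern = /(\w)=(\d+)/g;
const parsed = [];
let m;
while ((m = pairPattern.exec(text)) !== null) {
  parsed.push([m[1], Number(m[2])]);
}
// parsed is [['a', 1], ['b', 22]]

console.log(hasPair, allPairs, firstPair.slice(0, 3), parsed);
```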

When should I use a regex library vs built-in regex?

Use built-in regex for most tasks. Use a library when you need features the built-in engine lacks: PCRE-compatible lookbehinds, named backreferences in older environments, non-backtracking engines for untrusted input (safer against ReDoS attacks), or multiple engine support. The XRegExp library extends JavaScript regex with Unicode properties, named groups in older browsers, and other advanced features.

Can regex parse HTML or JSON?

You should not parse HTML or JSON with regex. HTML is not a regular language — it has arbitrarily deep nesting that regex cannot handle correctly. Use DOMParser or a library like Cheerio for HTML. JSON should be parsed with JSON.parse(). Regex is appropriate for extracting simple patterns from known, well-structured strings — not for parsing arbitrary markup or structured data formats.

How do I make a regex case-insensitive?

Add the i flag: /pattern/i in JavaScript, or re.IGNORECASE / re.I in Python. Case-insensitive matching can be slightly slower than case-sensitive matching because the engine has to compare characters without regard to case, but the difference is negligible for typical use cases.

Regex Performance Tips

For most everyday use, regex performance is not a concern. When you are running patterns against millions of strings in a loop, or processing large files, these optimisations matter.

Compile patterns once. In Python, calling re.compile(r'pattern') ahead of the loop skips the per-call lookup that re.search(r'pattern', string) performs. The re module caches recently compiled patterns, so the gain is usually modest, but it is measurable in tight loops. In JavaScript, create the regex once outside the loop: a new RegExp(...) call inside a loop builds a new regex object on every iteration.
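In JavaScript the same idea looks like this. The function name countLinesWithWord is hypothetical, and the sketch assumes word contains only letters or digits (anything else would need escaping before going into new RegExp()).

```javascript
// Hypothetical helper: count the lines that contain `word` as a whole word.
function countLinesWithWord(lines, word) {
  // Built once, before the loop, rather than once per line.
  const pattern = new RegExp(`\\b${word}\\b`, 'i');
  let count = 0;
  for (const line of lines) {
    if (pattern.test(line)) count += 1;
  }
  return count;
}

const total = countLinesWithWord(['The cat sat', 'Concatenate', 'CAT!'], 'cat');
console.log(total); // 2
```

The pattern has no g flag, so test() carries no state from one line to the next.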

Anchor when possible. ^pattern and pattern$ let the engine stop early when a string does not match. Without anchors, the engine tries the pattern at every position in the string. For validation (checking if an entire string matches a format), always anchor both ends.

Prefer specific character classes over dot. When you know the expected characters, [0-9] is usually a better choice than . because it fails sooner on unexpected input and leaves the engine less to backtrack over. Similarly, prefer \d over a broader class such as [0-9a-fA-F] if you only need digits.

Avoid catastrophic backtracking. The pattern (a+)+b on a string like aaaaaaaaaaac causes exponential backtracking: the engine tries every possible way of splitting the run of a's between the inner and outer quantifier before concluding there is no match. In Node.js this blocks the event loop, and in a browser it can freeze the page. Test suspect patterns with a tool like ToolPry's Regex Tester against inputs designed to trigger worst-case behaviour.
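The usual fix is to remove the nesting. For this pattern, (a+)+ matches exactly the same text as a+, so the rewrite below accepts the same strings without the exponential blow-up. The sketch anchors both versions with ^ and $ to make them whole-string checks (an assumption added for the example) and deliberately keeps the risky pattern away from long inputs.

```javascript
const risky = /^(a+)+b$/;  // nested quantifier
const safe = /^a+b$/;      // same language, no nesting

// On short inputs the two patterns agree.
const shortInputs = ['ab', 'aaab', 'aaac', 'b', ''];
const patternsAgree = shortInputs.every((s) => risky.test(s) === safe.test(s));

// Only the safe pattern is run against the long non-matching input.
const longInput = 'a'.repeat(5000) + 'c';
const safeResult = safe.test(longInput); // false

console.log(patternsAgree, safeResult);
```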

Use non-capturing groups when you do not need the capture. (?:abc) instead of (abc) is slightly faster because the engine does not need to store the captured text. In tight loops processing many strings, this adds up.