What is the difference between greedy and lazy quantifiers in regex?

Greedy quantifiers (*, +, {n,m}) match as much text as possible. Lazy quantifiers (*?, +?, {n,m}?) match as little as possible. For example, on "<b>bold</b>", <.+> matches the entire string greedily, while <.+?> matches each individual tag. When your pattern is capturing too much, switching to a lazy quantifier is usually the first thing to try.

Why does my regex work in Python but not in JavaScript?

Different languages use different regex engines with different feature sets. Python uses (?P<name>...) syntax for named groups, and its built-in re module only accepts fixed-width lookbehinds. JavaScript uses (?<name>...) for named groups and only added lookbehind support in ES2018. Go's standard library uses RE2 syntax, which doesn't support lookbehind or backreferences at all. Use a Regex Flavor Converter to adapt patterns when switching between languages.

What are named capture groups and when should I use them?

Named capture groups use the syntax (?<name>...) in most flavors (Python uses (?P<name>...)) and let you reference captured text by name instead of position number. They're most useful in complex patterns with multiple groups, where group numbering becomes hard to track, or in replacement strings where you want readable substitution references like $<year> instead of $1.

Is regex reliable for email validation?

Regex can catch obvious formatting errors in email addresses (missing @ sign, no domain, and so on), but it cannot validate whether an address actually exists or whether the domain accepts mail. The full address syntax (RFC 5321 and RFC 5322) also allows uncommon forms like quoted local parts that most regex patterns reject. A pragmatic pattern like ^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$ covers the vast majority of real-world addresses. For anything beyond basic format checking, send a confirmation email.


Regex Patterns Every Developer Should Know (and Stop Rewriting Every Time)

Updated Jun 30, 2026

Regex is one of those tools you reach for constantly but rarely sit down to actually learn. Here are the syntax elements, common patterns, and pitfalls worth committing to memory — plus a few to just bookmark and paste.



Most developers have a relationship with regex that goes something like this: write a pattern, get it mostly working, paste it into production, forget how it works three weeks later, rewrite it from scratch the next time the same problem comes up. This article is an attempt to break that cycle.

What follows is a practical reference for the regex elements you’ll reach for constantly, plus explanations of the parts that consistently trip people up. Pull up the Regex Tester and try these as you go.

Anchors and Character Classes Worth Memorizing

These are the building blocks. If you’re fuzzy on any of them, that’s where most pattern bugs originate.

^ — matches the start of a string (or start of a line in multiline mode)
$ — matches the end of a string (or end of a line in multiline mode)
. — matches any character except a newline (use s flag to include newlines)
\d — matches any digit, equivalent to [0-9]
\w — matches word characters: [a-zA-Z0-9_]
\s — matches whitespace: spaces, tabs, newlines
[abc] — character class, matches a, b, or c
[^abc] — negated class, matches anything that is NOT a, b, or c
\b — word boundary — the position between a word character and a non-word character

The uppercase versions are negated: \D is anything not a digit, \W is anything not a word character, \S is anything not whitespace.

One common confusion: \b is a zero-width assertion — it doesn’t consume any characters. \bcat\b matches the word “cat” but not “category” or “concatenate”.
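A quick way to convince yourself of this is to run the pattern against a few strings. The sketch below is in JavaScript; the sample strings and variable names are only illustrative.

```javascript
// \b asserts a position; it never becomes part of the matched text.
const wordCat = /\bcat\b/;

const samples = ["the cat sat", "category", "concatenate", "cat"];
const boundaryResults = samples.map((text) => wordCat.test(text));
console.log(boundaryResults); // [ true, false, false, true ]

// The match is exactly three characters long, so the boundaries are zero-width.
const found = "the cat sat".match(wordCat);
console.log(found[0].length, found.index); // 3 4
```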

Quantifiers: Greedy vs Lazy

Quantifiers control how many times the preceding element must match.

* — zero or more
+ — one or more
? — zero or one (also makes the preceding quantifier lazy)
{n} — exactly n times
{n,m} — between n and m times
{n,} — n or more times

By default all quantifiers are greedy — they match as much as possible. Add ? after any quantifier to make it lazy (match as little as possible).

# Input: <b>bold</b> and <i>italic</i>

# Greedy: matches the entire string between the first < and the last >
<.+>  →  matches "<b>bold</b> and <i>italic</i>"

# Lazy: matches each tag individually
<.+?>  →  matches "<b>", "</b>", "<i>", "</i>"

Greedy behavior is the source of more regex bugs than almost anything else. When a pattern is matching too much, try making the quantifier lazy first.
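Here is the same comparison as a runnable JavaScript sketch. The sample HTML string is an assumption chosen for illustration, and the third pattern (a negated character class) is an alternative that isn't covered above but often expresses the same intent without relying on laziness.

```javascript
const html = "<b>bold</b> and <i>italic</i>";

// Greedy: .+ runs to the end of the string, then backtracks to the last ">".
const greedy = html.match(/<.+>/g);
console.log(greedy); // [ '<b>bold</b> and <i>italic</i>' ]

// Lazy: .+? stops at the first ">" it can reach.
const lazy = html.match(/<.+?>/g);
console.log(lazy); // [ '<b>', '</b>', '<i>', '</i>' ]

// Negated class: "anything that is not >" cannot run past the tag's end.
const negated = html.match(/<[^>]+>/g);
console.log(negated); // [ '<b>', '</b>', '<i>', '</i>' ]
```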

Capture Groups vs Non-Capturing Groups

Parentheses group parts of a pattern and capture what they match. That captured value can be referenced later — in a replacement string, or as a backreference within the same pattern.

# Capturing group — stores the match in group 1
(\d{4})-(\d{2})-(\d{2})
# Matches "2024-03-15", captures: group 1 = "2024", group 2 = "03", group 3 = "15"

# Non-capturing group — groups without storing
(?:\d{4})-(?:\d{2})-(?:\d{2})
# Same match, but no captured groups

Use non-capturing groups (?:...) when you only need grouping for alternation or quantifiers, and don’t need the captured value. It’s slightly more efficient and keeps your group numbering clean.
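To see the difference in practice, compare what each form leaves in the match result. This is a small JavaScript sketch; the file-extension pattern at the end is a made-up example of grouping purely for alternation.

```javascript
const date = "2024-03-15";

const capturing = date.match(/(\d{4})-(\d{2})-(\d{2})/);
const nonCapturing = date.match(/(?:\d{4})-(?:\d{2})-(?:\d{2})/);

console.log(capturing.slice(1));  // [ '2024', '03', '15' ]
console.log(capturing.length);    // 4: the full match plus three groups
console.log(nonCapturing.length); // 1: the full match only

// Grouping only for alternation: no capture needed, so use (?:...).
const imageExt = /\.(?:jpe?g|png)$/i;
console.log(imageExt.test("photo.JPG")); // true
console.log(imageExt.test("photo.gif")); // false
```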

Named groups let you reference captures by name instead of position — much more readable in complex patterns:

# Named capture groups
(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})

# In JavaScript:
const match = "2024-03-15".match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);
console.log(match.groups.year);  // "2024"
console.log(match.groups.month); // "03"

Lookaheads and Lookbehinds

Lookaround assertions let you match something only when it is (or isn’t) preceded or followed by something else — without including that something else in the match.

(?=...) — positive lookahead: the pattern must follow
(?!...) — negative lookahead: the pattern must NOT follow
(?<=...) — positive lookbehind: the pattern must precede
(?<!...) — negative lookbehind: the pattern must NOT precede

Practical example: extract prices but only when they’re in USD, without capturing the currency symbol:

# Match numbers preceded by a $ sign
(?<=\$)\d+(?:\.\d{2})?

# Input: "$19.99 or €24.99"
# Matches: "19.99" (not the euro amount)
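In JavaScript (ES2018 or later, since this relies on lookbehind), that pattern can be tried like this. The second input string is an extra illustrative case.

```javascript
const usdAmount = /(?<=\$)\d+(?:\.\d{2})?/g;

// The $ is required before the number but is not part of the match.
const fromMixed = "$19.99 or €24.99".match(usdAmount);
console.log(fromMixed); // [ '19.99' ]

const fromSeveral = "$5 and $12.50".match(usdAmount);
console.log(fromSeveral); // [ '5', '12.50' ]
```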

Another common use: password validation — require at least one digit, but express it without dictating where the digit must appear:

# Password: at least 8 chars, must contain a digit and uppercase letter
^(?=.*\d)(?=.*[A-Z]).{8,}$
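A minimal JavaScript sketch of that check follows. The function name meetsPolicy and the sample passwords are hypothetical; each lookahead scans from the start of the string, so the digit and the uppercase letter can appear anywhere.

```javascript
const PASSWORD_RE = /^(?=.*\d)(?=.*[A-Z]).{8,}$/;

// Hypothetical helper: true when the password satisfies all three rules.
function meetsPolicy(password) {
  return PASSWORD_RE.test(password);
}

console.log(meetsPolicy("Passw0rdX")); // true
console.log(meetsPolicy("password1")); // false: no uppercase letter
console.log(meetsPolicy("PASSWORDX")); // false: no digit
console.log(meetsPolicy("Ab1"));       // false: shorter than 8 characters
```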

Flags That Actually Matter

g (global) — find all matches, not just the first one
i (case-insensitive) — Hello matches hello, HELLO
m (multiline) — makes ^ and $ match line starts/ends instead of string start/end
s (dotAll) — makes . match newline characters too

# Without multiline: ^ only matches start of entire string
# With multiline: ^ matches start of each line
/^\w+/gm  // matches first word on every line
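The effect of m and s is easiest to see side by side. A short JavaScript sketch with an illustrative three-line string:

```javascript
const text = "alpha one\nbeta two\ngamma three";

// Without m, ^ only anchors at the very start of the string.
const firstOnly = text.match(/^\w+/g);
console.log(firstOnly); // [ 'alpha' ]

// With m, ^ also anchors after every newline.
const everyLine = text.match(/^\w+/gm);
console.log(everyLine); // [ 'alpha', 'beta', 'gamma' ]

// Without s, . refuses to cross the newline between "one" and "beta".
const dotPlain = /one.beta/.test(text);
const dotAll = /one.beta/s.test(text);
console.log(dotPlain, dotAll); // false true
```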

Common Practical Patterns

These are the patterns you’ll Google repeatedly. Save them somewhere, or just open the Regex Cheat Sheet when you need a quick reference.

Email Validation (and Why It’s Hard)

Email validation with regex is genuinely difficult because the full address syntax (RFC 5321 for SMTP, RFC 5322 for message headers) allows things most developers don’t expect: quoted strings with spaces, comments in parentheses, and, through later standards, internationalized domain names. For most applications, a pragmatic pattern is good enough:

# Practical email pattern: not RFC-complete, but covers the vast majority of real addresses
^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$

The real validation for email is sending a confirmation link. Regex just catches obvious typos.
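As a rough sketch of how this might sit in application code, the JavaScript below wraps the pragmatic pattern in a helper. The name looksLikeEmail and the sample addresses are hypothetical. Note the last case: a quoted local part is valid under the RFCs but rejected here, which is the trade-off described above.

```javascript
const EMAIL_RE = /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/;

// Hypothetical helper: a format check only, not proof that the mailbox exists.
function looksLikeEmail(input) {
  return EMAIL_RE.test(input.trim());
}

console.log(looksLikeEmail("dev@example.com"));                  // true
console.log(looksLikeEmail("first.last+tag@sub.example.co.uk")); // true
console.log(looksLikeEmail("no-at-sign.example.com"));           // false
console.log(looksLikeEmail("user@localhost"));                   // false: no dot in the domain
console.log(looksLikeEmail('"quoted local"@example.com'));       // false, although RFC-valid
```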

URL Matching

# Match http and https URLs
https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)

This handles the vast majority of URLs in real content. For full URI parsing including edge cases, use a dedicated URL parser instead of regex — most languages have one built in.
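In JavaScript, the built-in parser is the WHATWG URL class, available as a global in browsers and Node.js. The sketch below is one way to use it for validation; the helper name parseHttpUrl is hypothetical.

```javascript
// Hypothetical helper: returns a URL object for http(s) input, otherwise null.
function parseHttpUrl(input) {
  try {
    const url = new URL(input); // throws a TypeError on unparseable input
    return url.protocol === "http:" || url.protocol === "https:" ? url : null;
  } catch {
    return null;
  }
}

const parsed = parseHttpUrl("https://example.com/path?q=1");
console.log(parsed.hostname, parsed.pathname, parsed.search); // example.com /path ?q=1

console.log(parseHttpUrl("not a url"));         // null
console.log(parseHttpUrl("ftp://example.com")); // null: parses, but wrong scheme
```

The regex is still the right tool for finding URLs inside free text; the parser is for deciding whether one candidate string is well-formed.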

Slug Validation

# Valid URL slug: lowercase letters, numbers, hyphens, no leading/trailing hyphens
^[a-z0-9]+(?:-[a-z0-9]+)*$

IPv4 Address

# IPv4 address — validates 0-255 range per octet
^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$

Note the structure: the first three octets are grouped with a trailing dot \., and the fourth is matched separately without one.
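A few test inputs make the behavior concrete. This JavaScript sketch uses the pattern exactly as written above; the sample addresses are illustrative. One thing worth knowing: because of the [01]?\d\d? branch, octets with leading zeros such as 01 are accepted.

```javascript
const IPV4_RE = /^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$/;

const ipv4Cases = {
  "192.168.1.1": IPV4_RE.test("192.168.1.1"),         // true
  "255.255.255.255": IPV4_RE.test("255.255.255.255"), // true
  "256.1.1.1": IPV4_RE.test("256.1.1.1"),             // false: octet out of range
  "1.2.3": IPV4_RE.test("1.2.3"),                     // false: only three octets
  "01.2.3.4": IPV4_RE.test("01.2.3.4"),               // true: leading zero is allowed
};
console.log(ipv4Cases);
```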

UUID Format

# UUID v4 format
^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$

# Or if you want to accept any UUID version, case-insensitive:
^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$
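The two patterns differ in what they reject, which a short JavaScript sketch can show. The sample UUIDs are made up for illustration; the strict pattern checks the version digit (4) and the variant character (8, 9, a, or b) and only accepts lowercase.

```javascript
const UUID_V4_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/;
const UUID_ANY_RE = /^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$/;

const v4Sample = "123e4567-e89b-42d3-a456-426614174000"; // version digit is 4
const v1Sample = "123e4567-e89b-12d3-a456-426614174000"; // version digit is 1

console.log(UUID_V4_RE.test(v4Sample));               // true
console.log(UUID_V4_RE.test(v1Sample));               // false: wrong version digit
console.log(UUID_V4_RE.test(v4Sample.toUpperCase())); // false: strict pattern is lowercase-only
console.log(UUID_ANY_RE.test(v1Sample));              // true
console.log(UUID_ANY_RE.test(v4Sample.toUpperCase())); // true
```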

Semantic Version (semver)

# Semver: MAJOR.MINOR.PATCH with optional pre-release and build metadata
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$

# Simpler version if you just need MAJOR.MINOR.PATCH:
^\d+\.\d+\.\d+$
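The full pattern has five capture groups, so it can double as a parser. Below is a JavaScript sketch under that assumption; parseSemver and the returned field names are hypothetical. It also shows one practical difference between the two patterns: the full one rejects leading zeros, the simple one does not.

```javascript
const SEMVER_RE = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$/;
const SIMPLE_RE = /^\d+\.\d+\.\d+$/;

// Hypothetical helper: returns the version parts, or null if the input is not semver.
function parseSemver(version) {
  const m = SEMVER_RE.exec(version);
  if (!m) return null;
  // Unmatched optional groups come back as undefined, so default them to null.
  const [, major, minor, patch, prerelease = null, build = null] = m;
  return { major: Number(major), minor: Number(minor), patch: Number(patch), prerelease, build };
}

console.log(parseSemver("1.4.0-beta.2+build.7"));
// { major: 1, minor: 4, patch: 0, prerelease: 'beta.2', build: 'build.7' }
console.log(parseSemver("1.2.3").prerelease); // null
console.log(parseSemver("01.2.3"));           // null: leading zero in MAJOR
console.log(SIMPLE_RE.test("01.2.3"));        // true: the simple pattern is more permissive
```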

Flavor Differences Between Languages

Regex is not one standard; it’s a family of related dialects. A pattern that works in Python may fail silently (or loudly) in JavaScript or Go. Use the Regex Flavor Converter when porting patterns between languages.

Named Backreferences

Syntax varies for referencing named groups in replacements:

# PCRE2: use \k<name> in pattern, ${name} in pcre2_substitute replacements
# (Perl uses $+{name}; PHP's preg_replace accepts only numbered references like $1)
# JavaScript: use \k<name> in pattern, $<name> in replacement string
# Python: use (?P<name>...) to define, (?P=name) to backreference, \g<name> in replacement
# Go: use (?P<name>...) to define, ${name} in regexp.ReplaceAllString

# Python named group syntax (different from JS/PCRE)
(?P<year>\d{4})-(?P<month>\d{2})
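For the JavaScript row, here is a small sketch showing both forms: \k<name> inside the pattern and $<name> in the replacement string. The sample strings and variable names are illustrative.

```javascript
// \k<word> must match the same text that the group named "word" captured.
const doubledWord = /\b(?<word>\w+)\s+\k<word>\b/;

const sentence = "this is is a test";
const repeated = sentence.match(doubledWord)[0];
console.log(repeated); // "is is"

// $<word> in the replacement inserts the named capture once.
const deduped = sentence.replace(doubledWord, "$<word>");
console.log(deduped); // "this is a test"

// Named references make reordering readable.
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const reordered = "2024-03-15".replace(isoDate, "$<day>/$<month>/$<year>");
console.log(reordered); // "15/03/2024"
```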

Lookbehind Support

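JavaScript didn’t support lookbehind at all until ES2018, and even now some older environments lack it. Go’s standard regexp package (which uses RE2 syntax) doesn’t support lookbehind or backreferences at all. Python’s built-in re module requires fixed-width lookbehinds; the third-party regex module lifts that restriction. PCRE has traditionally required each lookbehind alternative to be fixed-length, with bounded variable-length lookbehind arriving only in recent PCRE2 releases. JavaScript engines that do support lookbehind allow variable-length patterns inside it.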

In Go specifically, if you need lookaround assertions, you’ll need to restructure your pattern or use a third-party library. It’s a real constraint.
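One common restructuring is to match the surrounding context and capture only the part you want. The sketch below is in JavaScript to stay consistent with the other examples here; the pattern uses nothing beyond a plain capturing group, so it should port to engines without lookbehind, though that is worth verifying in your target language.

```javascript
const priceText = "$19.99 or €24.99";

// With lookbehind: the $ is required but stays out of the match.
const viaLookbehind = priceText.match(/(?<=\$)\d+(?:\.\d{2})?/g);

// Without lookbehind: match the $ as well, then read capture group 1.
const viaCapture = [...priceText.matchAll(/\$(\d+(?:\.\d{2})?)/g)].map((m) => m[1]);

console.log(viaLookbehind); // [ '19.99' ]
console.log(viaCapture);    // [ '19.99' ]
```

The trade-off is that the full match now includes the $ sign, so anything that relies on match positions or on replacing the whole match needs to account for it.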

Unicode Handling

JavaScript requires the u flag for proper Unicode support. Python 3 regexes handle Unicode by default, although the built-in re module has no \p{...} property escapes. Go’s RE2 engine is Unicode-aware. PCRE requires the u modifier for Unicode properties.

# JavaScript: use u flag for Unicode property escapes (ES2018+)
/\p{Letter}+/u  // matches letters in any language
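To show what the property escape buys you, this JavaScript sketch compares it with an ASCII-only class on a mixed-script string. The sample text is illustrative.

```javascript
const mixed = "Москва 東京 hello";

// \p{Letter} with the u flag matches letters from any script.
const anyScript = mixed.match(/\p{Letter}+/gu);
console.log(anyScript); // [ 'Москва', '東京', 'hello' ]

// An ASCII-only class, like \w in JavaScript, sees only the Latin word.
const asciiOnly = mixed.match(/[a-zA-Z]+/g);
console.log(asciiOnly); // [ 'hello' ]

// \w does not match Cyrillic letters in JavaScript, even with the u flag.
const cyrillicViaW = /\w/u.test("Москва");
console.log(cyrillicViaW); // false
```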

Test Your Regex Before Shipping

The single most reliable way to avoid shipping a broken regex is to test it against actual inputs — including edge cases — before it touches production code. That means empty strings, strings with only special characters, very long inputs, and inputs that should explicitly NOT match.

Use the Regex Tester to run patterns against multiple test cases at once. You can toggle flags, check match positions, and see capture groups, which is much faster than round-tripping through your application to see if a pattern works.

Also: regex can be slow. Catastrophic backtracking is a real attack vector (often called ReDoS). A pattern like (a+)+b on a long string of a’s with no trailing b can take exponential time in a backtracking engine. If you’re validating user-provided input server-side, either test with worst-case inputs or use an engine that guarantees linear-time matching, such as RE2, Go’s regexp package, or Rust’s regex crate.
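A rough way to test with worst-case inputs is to time the pattern on growing strings and watch how the numbers scale. The JavaScript sketch below does that with a hypothetical timeTest helper; input sizes are kept small on purpose, and the timings will vary by machine and engine. It also shows that a+b accepts the same strings as (a+)+b without the nested quantifier.

```javascript
// Hypothetical helper: run one test() call and report the elapsed milliseconds.
function timeTest(regex, input) {
  const start = Date.now();
  const matched = regex.test(input);
  return { matched, ms: Date.now() - start };
}

const risky = /(a+)+b/; // nested quantifier: many ways to split a run of a's
const safer = /a+b/;    // same set of matching strings, one way to split

// Worst case: only a's, no trailing b, so every split is tried before failing.
for (const n of [10, 14, 18]) {
  const input = "a".repeat(n);
  const slow = timeTest(risky, input);
  const fast = timeTest(safer, input);
  console.log(`n=${n} risky=${slow.ms}ms safer=${fast.ms}ms matched=${slow.matched}`);
}

const bothMatch = risky.test("aaab") && safer.test("aaab");
console.log(bothMatch); // true
```

If the risky column roughly doubles with each extra character, the pattern is exponential on that input and should be rewritten before it sees user data.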

Quick Reference Summary

Anchors: ^ $ \b
Character classes: \d \w \s and their uppercase negations
Quantifiers: * + ? {n,m} (add ? to make lazy)
Groups: (...) capturing, (?:...) non-capturing, (?<name>...) named
Lookaround: (?=...) (?!...) (?<=...) (?<!...)
Flags: g i m s

For a full syntax reference organized by category, the Regex Cheat Sheet has everything in one place. And when you need to port a pattern between PCRE, JavaScript, Python, or Go, the Regex Flavor Converter handles the syntax translation automatically.