Regex Patterns Every Developer Should Know (and Stop Rewriting Every Time)

Regex is one of those tools you reach for constantly but rarely sit down to actually learn. Here are the syntax elements, common patterns, and pitfalls worth committing to memory — plus a few to just bookmark and paste.

Most developers have a relationship with regex that goes something like this: write a pattern, get it mostly working, paste it into production, forget how it works three weeks later, rewrite it from scratch the next time the same problem comes up. This article is an attempt to break that cycle.

What follows is a practical reference for the regex elements you’ll reach for constantly — plus explanations of the parts that consistently trip people up. Pull up the RegEx-Tester and try these as you go.

Anchors and Character Classes Worth Memorizing

These are the building blocks. If you’re fuzzy on any of them, that’s where most pattern bugs originate.

  • ^ — matches the start of a string (or start of a line in multiline mode)
  • $ — matches the end of a string (or end of a line in multiline mode)
  • . — matches any character except a newline (use the s flag to include newlines)
  • \d — matches any digit, equivalent to [0-9]
  • \w — matches word characters: [a-zA-Z0-9_]
  • \s — matches whitespace: spaces, tabs, newlines
  • [abc] — character class, matches a, b, or c
  • [^abc] — negated class, matches anything that is NOT a, b, or c
  • \b — word boundary — the position between a word character and a non-word character

The uppercase versions are negated: \D is anything not a digit, \W is anything not a word character, \S is anything not whitespace.

One common confusion: \b is a zero-width assertion — it doesn’t consume any characters. \bcat\b matches the word “cat” but not “category” or “concatenate”.
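
To see the zero-width behavior in practice, here is a minimal JavaScript sketch. The sample strings and variable names are made up for illustration.

```javascript
// \b asserts a position between a word character and a non-word character.
// It consumes nothing, so the match itself is exactly "cat".
const wordCat = /\bcat\b/;

const samples = ["the cat sat", "category", "concatenate", "cat"];
const results = samples.map((text) => wordCat.test(text));
console.log(results); // [ true, false, false, true ]

// The match covers only the three letters, not the surrounding spaces.
const hit = "the cat sat".match(wordCat);
console.log(hit[0], hit.index); // cat 4
```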

Quantifiers: Greedy vs Lazy

Quantifiers control how many times the preceding element must match.

  • * — zero or more
  • + — one or more
  • ? — zero or one (also makes the preceding quantifier lazy)
  • {n} — exactly n times
  • {n,m} — between n and m times
  • {n,} — n or more times

By default all quantifiers are greedy — they match as much as possible. Add ? after any quantifier to make it lazy (match as little as possible).

# Input: <b>bold</b> and <i>italic</i>

# Greedy: matches everything from the first < to the last >
<.+>  →  matches "<b>bold</b> and <i>italic</i>"

# Lazy: matches each tag individually
<.+?>  →  matches "<b>", "</b>", "<i>", "</i>"

Greedy behavior is the source of more regex bugs than almost anything else. When a pattern is matching too much, try making the quantifier lazy first.
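
A small JavaScript sketch of the difference, using a made-up HTML snippet as input. The third pattern, a negated character class, is an alternative added here for comparison. It often expresses the same intent without relying on lazy matching.

```javascript
// Hypothetical sample input with two pairs of tags.
const html = "<b>bold</b> and <i>italic</i>";

// Greedy: .+ runs to the end, then backtracks to the LAST ">".
const greedy = html.match(/<.+>/g);
console.log(greedy); // [ '<b>bold</b> and <i>italic</i>' ]

// Lazy: .+? stops at the FIRST ">" it can reach.
const lazy = html.match(/<.+?>/g);
console.log(lazy); // [ '<b>', '</b>', '<i>', '</i>' ]

// Alternative: say what you mean, "anything that is not a >".
const negated = html.match(/<[^>]+>/g);
console.log(negated); // [ '<b>', '</b>', '<i>', '</i>' ]
```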

Capture Groups vs Non-Capturing Groups

Parentheses group parts of a pattern and capture what they match. That captured value can be referenced later — in a replacement string, or as a backreference within the same pattern.

# Capturing group — stores the match in group 1
(\d{4})-(\d{2})-(\d{2})
# Matches "2024-03-15", captures: group 1 = "2024", group 2 = "03", group 3 = "15"

# Non-capturing group — groups without storing
(?:\d{4})-(?:\d{2})-(?:\d{2})
# Same match, but no captured groups

Use non-capturing groups (?:...) when you only need grouping for alternation or quantifiers, and don’t need the captured value. It’s slightly more efficient and keeps your group numbering clean.

Named groups let you reference captures by name instead of position — much more readable in complex patterns:

# Named capture groups
(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})

# In JavaScript:
const match = "2024-03-15".match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);
console.log(match.groups.year);  // "2024"
console.log(match.groups.month); // "03"
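
Numbered groups show up in two more places: replacement strings ($1, $2, ...) and backreferences inside the pattern itself (\1). The sketch below assumes a made-up input string and a doubled-word check added purely for illustration.

```javascript
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;

// Index 0 is the whole match, 1..3 are the capture groups.
const [whole, year, month, day] = "released 2024-03-15".match(datePattern);
console.log(whole, year, month, day); // 2024-03-15 2024 03 15

// $1, $2, $3 refer to the groups inside a replacement string.
const european = "2024-03-15".replace(datePattern, "$3.$2.$1");
console.log(european); // 15.03.2024

// \1 inside the pattern means "the same text group 1 just matched".
const doubledWord = /\b(\w+)\s+\1\b/;
console.log(doubledWord.test("this is is a typo")); // true
console.log(doubledWord.test("this is fine"));      // false
```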

Lookaheads and Lookbehinds

Lookaround assertions let you match something only when it is (or isn’t) preceded or followed by something else — without including that something else in the match.

  • (?=...) — positive lookahead: the pattern must follow
  • (?!...) — negative lookahead: the pattern must NOT follow
  • (?<=...) — positive lookbehind: the pattern must precede
  • (?<!...) — negative lookbehind: the pattern must NOT precede

Practical example: extract prices but only when they’re in USD, without capturing the currency symbol:

# Match numbers preceded by a $ sign
(?<=\$)\d+(?:\.\d{2})?

# Input: "$19.99 or €24.99"
# Matches: "19.99" (not the euro amount)
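
In JavaScript (ES2018 or later), the same lookbehind can be used with the g flag to pull out every USD amount. The input string here is a made-up extension of the example above.

```javascript
// The lookbehind (?<=\$) requires a "$" before the digits
// but leaves it out of the match.
const usdAmount = /(?<=\$)\d+(?:\.\d{2})?/g;

const prices = "$19.99 or €24.99, plus $5 shipping".match(usdAmount);
console.log(prices); // [ '19.99', '5' ]

// The matches are strings; convert them if you need numbers.
const amounts = prices.map(Number);
console.log(amounts); // [ 19.99, 5 ]
```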

Another common use: password validation — require at least one digit, but express it without dictating where the digit must appear:

# Password: at least 8 chars, must contain a digit and uppercase letter
^(?=.*\d)(?=.*[A-Z]).{8,}$
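
A quick JavaScript check of that pattern against a few made-up candidates. Each lookahead scans from the start of the string, so the digit and the uppercase letter can appear anywhere.

```javascript
const strongEnough = /^(?=.*\d)(?=.*[A-Z]).{8,}$/;

// Hypothetical candidate passwords, one per failure mode.
const candidates = [
  "Passw0rdX",    // 9 chars, digit, uppercase
  "password1",    // no uppercase letter
  "Short1A",      // only 7 chars
  "NoDigitsHere", // no digit
];

const verdicts = candidates.map((c) => strongEnough.test(c));
console.log(verdicts); // [ true, false, false, false ]
```

If you want to tell the user which rule failed, it is usually easier to run separate small tests (one for length, one for the digit, one for the uppercase letter) than to decode a single combined pattern.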

Flags That Actually Matter

  • g (global) — find all matches, not just the first one
  • i (case-insensitive) — Hello matches hello, HELLO, etc.
  • m (multiline) — makes ^ and $ match line starts/ends instead of string start/end
  • s (dotAll) — makes . match newline characters as well

# Without multiline: ^ only matches start of entire string
# With multiline: ^ matches start of each line
/^\w+/gm  // matches first word on every line
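
Here is a short JavaScript sketch that exercises each flag on made-up input, so you can see what changes when a flag is added.

```javascript
const text = "alpha one\nbeta two\ngamma three";

// Without m, ^ only anchors at the very start of the string.
const firstOnly = text.match(/^\w+/g);
console.log(firstOnly); // [ 'alpha' ]

// With m, ^ anchors at the start of every line.
const perLine = text.match(/^\w+/gm);
console.log(perLine); // [ 'alpha', 'beta', 'gamma' ]

// Without s, the dot refuses to cross the newline between "one" and "beta".
const dotPlain = /one.beta/.test(text);
const dotAll = /one.beta/s.test(text);
console.log(dotPlain, dotAll); // false true

// i ignores case; g collects every occurrence.
const hellos = "Hello HELLO hello".match(/hello/gi);
console.log(hellos.length); // 3
```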

Common Practical Patterns

These are the patterns you’ll Google repeatedly. Save them somewhere, or just open the Regex Cheat Sheet when you need a quick reference.

Email Validation (and Why It’s Hard)

Email validation with regex is genuinely difficult because the full address grammar in the email RFCs (RFC 5321, RFC 5322) allows things most developers don’t expect: quoted strings with spaces, comments in parentheses, internationalized domain names. For most applications, a pragmatic pattern is good enough:

# Practical email pattern: not RFC-complete, but covers the vast majority of real-world addresses
^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$

The real validation for email is sending a confirmation link. Regex just catches obvious typos.
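
A minimal sketch of how the pragmatic pattern behaves in JavaScript. The helper name looksLikeEmail and the sample addresses are made up for illustration. Note the last case: the pattern is permissive enough to accept a doubled dot in the domain, which is one more reason to treat it as a typo filter only.

```javascript
const emailPattern = /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/;

// Hypothetical helper: trims whitespace, then applies the pattern.
function looksLikeEmail(input) {
  return emailPattern.test(input.trim());
}

console.log(looksLikeEmail("dev@example.com"));                  // true
console.log(looksLikeEmail("first.last+tag@sub.example.co.uk")); // true
console.log(looksLikeEmail("no-at-sign.example.com"));           // false
console.log(looksLikeEmail("user@localhost"));                   // false

// Known gap: consecutive dots in the domain still pass.
console.log(looksLikeEmail("user@example..com"));                // true
```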

URL Matching

# Match http and https URLs
https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)

This handles the vast majority of URLs in real content. For full URI parsing including edge cases, use a dedicated URL parser instead of regex — most languages have one built in.
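
In JavaScript, the built-in parser is the WHATWG URL class, available as a global in browsers and Node.js. A minimal sketch, with a hypothetical helper name parseHttpUrl:

```javascript
// Hypothetical helper: returns a URL object for http/https input, else null.
function parseHttpUrl(input) {
  try {
    const url = new URL(input); // throws a TypeError on invalid input
    const isHttp = url.protocol === "http:" || url.protocol === "https:";
    return isHttp ? url : null;
  } catch {
    return null;
  }
}

const parsed = parseHttpUrl("https://example.com/path?q=1#top");
console.log(parsed.hostname);              // example.com
console.log(parsed.pathname);              // /path
console.log(parsed.searchParams.get("q")); // 1

console.log(parseHttpUrl("not a url"));         // null
console.log(parseHttpUrl("ftp://example.com")); // null
```

A reasonable split is to use the regex for finding URL-like text inside larger content, then hand each candidate to the parser for the actual validation.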

Slug Validation

# Valid URL slug: lowercase letters, numbers, hyphens, no leading/trailing hyphens
^[a-z0-9]+(?:-[a-z0-9]+)*$

IPv4 Address

# IPv4 address — validates 0-255 range per octet
^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$

Note the structure: the first three octets are grouped with a trailing dot \., and the fourth is matched separately without one.
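
A quick JavaScript check with made-up sample addresses. One thing worth knowing: the [01]?\d\d? branch accepts leading zeros, so "010.0.0.1" passes. Whether that is acceptable depends on your application.

```javascript
const ipv4 = /^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$/;

const checks = {
  "192.168.0.1": ipv4.test("192.168.0.1"),         // true
  "255.255.255.255": ipv4.test("255.255.255.255"), // true
  "256.1.1.1": ipv4.test("256.1.1.1"),             // false: octet out of range
  "1.2.3": ipv4.test("1.2.3"),                     // false: only three octets
  "010.0.0.1": ipv4.test("010.0.0.1"),             // true: leading zero accepted
};
console.log(checks);
```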

UUID Format

# UUID v4 format
^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$

# Or if you want to accept any UUID version, case-insensitive:
^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$
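
A short JavaScript sketch comparing the two patterns. The sample UUIDs are illustrative values chosen to have a version-4 and a version-1 layout.

```javascript
const uuidV4 = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/;
const uuidAny = /^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$/;

const v4Sample = "f47ac10b-58cc-4372-a567-0e02b2c3d479"; // third group starts with 4
const v1Sample = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"; // third group starts with 1

console.log(uuidV4.test(v4Sample));  // true
console.log(uuidV4.test(v1Sample));  // false: wrong version digit
console.log(uuidAny.test(v1Sample)); // true

// The v4 pattern above is lowercase-only; add the i flag to accept uppercase.
const upper = v4Sample.toUpperCase();
console.log(uuidV4.test(upper));                           // false
console.log(new RegExp(uuidV4.source, "i").test(upper));   // true
```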

Semantic Version (semver)

# Semver: MAJOR.MINOR.PATCH with optional pre-release and build metadata
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$

# Simpler version if you just need MAJOR.MINOR.PATCH:
^\d+\.\d+\.\d+$
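
The full pattern has five capture groups (major, minor, patch, pre-release, build), which makes it usable as a small parser. A sketch in JavaScript, with a hypothetical helper name parseSemver:

```javascript
const semver = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$/;

// Hypothetical helper: returns the parsed parts, or null if the input is not semver.
function parseSemver(version) {
  const m = semver.exec(version);
  if (!m) return null;
  // Optional groups that did not participate are undefined; default them to null.
  const [, major, minor, patch, prerelease = null, build = null] = m;
  return {
    major: Number(major),
    minor: Number(minor),
    patch: Number(patch),
    prerelease,
    build,
  };
}

console.log(parseSemver("1.2.3-beta.1+build.5"));
// { major: 1, minor: 2, patch: 3, prerelease: 'beta.1', build: 'build.5' }
console.log(parseSemver("01.2.3")); // null: leading zero in MAJOR
console.log(parseSemver("1.2"));    // null: PATCH is missing
```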

Flavor Differences Between Languages

Regex is not one standard. It’s a family of related dialects. A pattern that works in Python may fail silently (or loudly) in JavaScript or Go. Use the Regex Flavor Converter when porting patterns between languages.

Named Backreferences

Syntax varies for referencing named groups in replacements:

# Perl: use \k<name> in pattern, $+{name} in replacement
# PHP (PCRE): use \k<name> in pattern; preg_replace replacements accept only numbered references ($1 or \1)
# JavaScript: use \k<name> in pattern, $<name> in replacement string
# Python: use (?P<name>...) to define, (?P=name) to backreference, \g<name> in replacement
# Go: use (?P<name>...) to define, ${name} in regexp.ReplaceAllString

# Python named group syntax (different from JS/PCRE)
(?P<year>\d{4})-(?P<month>\d{2})
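
For comparison, here is the JavaScript side of that table as a runnable sketch. The quoted-string example and its input are made up for illustration.

```javascript
// \k<quote> inside the pattern must match the same character the
// "quote" group captured, so the closing quote has to equal the opening one.
const quoted = /(?<quote>["'])(?<body>.*?)\k<quote>/;
const found = quoted.exec(`say "it's fine" now`);
console.log(found.groups.body); // it's fine

// $<name> inside a replacement string inserts the named capture.
const swapped = "2024-03".replace(/(?<year>\d{4})-(?<month>\d{2})/, "$<month>/$<year>");
console.log(swapped); // 03/2024
```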

Lookbehind Support

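JavaScript didn’t support lookbehind at all until ES2018, and some older environments still lack it. Go’s standard regexp package (which uses RE2 syntax) doesn’t support lookahead, lookbehind, or backreferences at all. Python’s built-in re module requires fixed-width lookbehinds, and PCRE has traditionally required every alternative inside a lookbehind to have a fixed length. JavaScript (where lookbehind is available) and .NET accept variable-length lookbehinds.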

In Go specifically, if you need lookaround assertions, you’ll need to restructure your pattern or use a third-party library. It’s a real constraint.
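
The usual restructuring is to match the context as ordinary text and wrap the part you want in a capture group. The sketch below shows the idea in JavaScript so both forms can be compared side by side. The capture-group form uses no lookaround, so the same pattern shape carries over to engines without it.

```javascript
const input = "$19.99 or €24.99";

// With lookbehind: the "$" is asserted but not part of the match.
const withLookbehind = /(?<=\$)\d+(?:\.\d{2})?/g;
const viaLookbehind = input.match(withLookbehind);

// Without lookbehind: match the "$" literally, capture only the number.
const withCapture = /\$(\d+(?:\.\d{2})?)/g;
const viaCapture = Array.from(input.matchAll(withCapture), (m) => m[1]);

console.log(viaLookbehind); // [ '19.99' ]
console.log(viaCapture);    // [ '19.99' ]
```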

Unicode Handling

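JavaScript requires the u flag for proper Unicode support (code-point matching and \p{...} property escapes). Python 3 regexes match Unicode strings by default, although the built-in re module has no \p{...} support. Go’s RE2 engine is Unicode-aware. In PHP, PCRE needs the u modifier to treat pattern and subject as UTF-8.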

# JavaScript: use u flag for Unicode property escapes (ES2018+)
/\p{Letter}+/u  // matches letters in any language
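
A small sketch of the difference between \p{Letter} and \w on non-ASCII text. The sample string is made up, and it is written with escapes so the exact code points are unambiguous.

```javascript
// "naïve café Ωmega 東京", written with escapes for precomposed characters.
const sample = "na\u00efve caf\u00e9 \u03a9mega \u6771\u4eac";

// \p{Letter} with the u flag matches letters from any script.
const words = sample.match(/\p{Letter}+/gu);
console.log(words); // [ 'naïve', 'café', 'Ωmega', '東京' ]

// \w is ASCII-only in JavaScript, so accented and non-Latin letters split or vanish.
const asciiOnly = sample.match(/\w+/g);
console.log(asciiOnly); // [ 'na', 've', 'caf', 'mega' ]
```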

Test Your Regex Before Shipping

The single most reliable way to avoid shipping a broken regex is to test it against actual inputs — including edge cases — before it touches production code. That means empty strings, strings with only special characters, very long inputs, and inputs that should explicitly NOT match.
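
One lightweight way to do that is a table of inputs and expected results that you run every time the pattern changes. Here is a minimal sketch using the slug pattern from earlier. The test cases are made up, and the structure is just one possible layout.

```javascript
const slug = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Each entry: [input, should it match?]
const cases = [
  ["hello-world", true],
  ["post-42", true],
  ["", false],               // empty string
  ["---", false],            // only special characters
  ["-leading", false],
  ["trailing-", false],
  ["double--hyphen", false],
  ["Has-Uppercase", false],
  ["a".repeat(10000), true], // very long input
];

// Collect every case where the pattern disagrees with the expectation.
const failures = cases.filter(([input, expected]) => slug.test(input) !== expected);
console.log(failures.length === 0 ? "all cases pass" : failures);
```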

Use the RegEx-Tester to run patterns against multiple test cases at once. You can toggle flags, check match positions, and see capture groups, which is much faster than round-tripping through your application to see if a pattern works.

Also: regex can be slow. Catastrophic backtracking is a real attack vector (often called ReDoS). A pattern like (a+)+b on a long string of a characters with no trailing b can take exponential time. If you’re validating user-provided input server-side, either test with worst-case inputs or use an engine that guarantees linear-time matching, such as RE2, Go’s regexp package, or Rust’s regex crate.
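
If you are stuck with a backtracking engine such as JavaScript’s, two cheap defenses are to remove the nested quantifier and to cap input length before matching. The sketch below assumes a hypothetical limit MAX_LEN and a hypothetical helper safeTest. It deliberately uses only short inputs, because running the risky pattern on a long run of a characters is exactly what you want to avoid.

```javascript
// Nested quantifier: many ways to split a run of "a" between the inner and outer +.
const risky = /(a+)+b/;
// Same set of matching strings, but only one way to consume the run.
const safer = /a+b/;

// Hypothetical length cap; pick a value that fits your real inputs.
const MAX_LEN = 256;

// Hypothetical helper: refuses oversized input before the engine sees it.
function safeTest(pattern, input) {
  return input.length <= MAX_LEN && pattern.test(input);
}

const shortInputs = ["aaab", "aaa", "b", "xxab"];
const agree = shortInputs.every((s) => risky.test(s) === safer.test(s));
console.log(agree); // true

console.log(safeTest(safer, "aaab"));                // true
console.log(safeTest(safer, "a".repeat(1000) + "b")); // false: rejected by the length cap
```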

Quick Reference Summary

  • Anchors: ^ $ \b
  • Character classes: \d \w \s and their uppercase negations
  • Quantifiers: * + ? {n,m} (add ? to make lazy)
  • Groups: (...) capturing, (?:...) non-capturing, (?<name>...) named
  • Lookaround: (?=...) (?!...) (?<=...) (?<!...)
  • Flags: g i m s

For a full syntax reference organized by category, the Regex Cheat Sheet has everything in one place. And when you need to port a pattern between PCRE, JavaScript, Python, or Go, the Regex Flavor Converter handles the syntax translation automatically.
