🔍 Regular Expressions Tutorial: From Zero to Hero

Regular expressions (regex) are one of the most powerful tools in a developer's arsenal—and one of the most feared. This tutorial breaks down regex into digestible pieces, building from basic concepts to patterns you can use in real projects.

🎯 By the end of this tutorial, you'll be able to write patterns for email validation, password requirements, phone numbers, and more. You'll also understand quantifiers, groups, and lookaheads, and know when to use (or avoid) regex.

What Are Regular Expressions?

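Regular expressions are patterns that describe sets of strings. In this tutorial they are used to validate input (email addresses, phone numbers, passwords), to find text such as URLs, and to capture the individual parts of a match.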

The Basics: Literal Characters

The simplest regex is just literal text:

Pattern: hello
Matches: "hello" in "hello world"
Does not match: "Hello" (case-sensitive by default)
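
The tutorial does not name a programming language, so the sketches here assume Python's standard re module. This one shows the literal pattern above and its case-sensitive behavior:

```python
import re

# re.search scans the string and returns the first match, or None.
match = re.search(r"hello", "hello world")
print(match.group(0))  # hello

# Matching is case-sensitive by default, so "Hello" does not match.
no_match = re.search(r"hello", "Hello")
print(no_match)  # None
```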

Character Classes

Match any single character from a set:

Pattern   Matches                 Example
[abc]     a, b, or c              "cat" matches c, a
[a-z]     Any lowercase letter    "Hello" matches e, l, l, o
[A-Z]     Any uppercase letter    "Hello" matches H
[0-9]     Any digit               "abc123" matches 1, 2, 3
[^abc]    NOT a, b, or c          "dog" matches d, o, g
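
To check the examples in the table, a small sketch (assuming Python's re module) can list every single-character match with re.findall:

```python
import re

# re.findall returns every non-overlapping match, left to right.
set_matches = re.findall(r"[abc]", "cat")
lower_matches = re.findall(r"[a-z]", "Hello")
upper_matches = re.findall(r"[A-Z]", "Hello")
digit_matches = re.findall(r"[0-9]", "abc123")
negated_matches = re.findall(r"[^abc]", "dog")

print(set_matches)      # ['c', 'a']
print(lower_matches)    # ['e', 'l', 'l', 'o']
print(upper_matches)    # ['H']
print(digit_matches)    # ['1', '2', '3']
print(negated_matches)  # ['d', 'o', 'g']
```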

Shorthand Character Classes

Shorthand   Equivalent          Description
\d          [0-9]               Any digit
\D          [^0-9]              Not a digit
\w          [a-zA-Z0-9_]        Word character
\W          [^a-zA-Z0-9_]       Not a word character
\s          [ \t\n\r\f\v]       Whitespace
\S          [^ \t\n\r\f\v]      Not whitespace
.           (almost anything)   Any character except newline
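
A short sketch of the shorthand classes, assuming Python's re module and a made-up sample string. Note that in Python 3, \d and \w also match non-ASCII digits and letters unless the re.ASCII flag is passed, so the "Equivalent" column is exact only for ASCII text:

```python
import re

text = "Order 42, item_7!"  # hypothetical sample input

digits = re.findall(r"\d", text)
words = re.findall(r"\w+", text)
spaces = re.findall(r"\s", text)
# The dot skips the newline between "a" and "b".
dot_chars = re.findall(r".", "a\nb")

print(digits)     # ['4', '2', '7']
print(words)      # ['Order', '42', 'item_7']
print(spaces)     # [' ', ' ']
print(dot_chars)  # ['a', 'b']
```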

Quantifiers: How Many?

Quantifier   Meaning           Example
*            0 or more         ab*c matches "ac", "abc", "abbc"
+            1 or more         ab+c matches "abc", "abbc", not "ac"
?            0 or 1            colou?r matches "color" and "colour"
{n}          Exactly n         \d{4} matches "2026"
{n,}         n or more         \d{2,} matches "12", "123", "1234"
{n,m}        Between n and m   \d{2,4} matches "12", "123", "1234"
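
The quantifier rows can be verified with a small helper. This is a sketch assuming Python's re module; fully_matches is a hypothetical name, and it uses re.fullmatch so the whole string has to fit the pattern:

```python
import re

def fully_matches(pattern: str, text: str) -> bool:
    """Return True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

print(fully_matches(r"ab*c", "ac"))        # True: b may appear 0 times
print(fully_matches(r"ab+c", "ac"))        # False: + needs at least one b
print(fully_matches(r"colou?r", "color"))  # True: u is optional
print(fully_matches(r"\d{4}", "2026"))     # True: exactly four digits
print(fully_matches(r"\d{2,}", "1"))       # False: fewer than two digits
print(fully_matches(r"\d{2,4}", "12345"))  # False: more than four digits
```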

✅ Greedy vs. Lazy

Quantifiers are "greedy" by default: they match as much as possible. Add ? after a quantifier to make it "lazy" (match as little as possible), for example .* versus .*?
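
The difference is easiest to see on a string with repeated delimiters. A minimal sketch, assuming Python's re module and a made-up HTML-like snippet:

```python
import re

html = "<b>one</b> and <b>two</b>"  # hypothetical sample input

# Greedy: .* runs to the last </b>, so both tags land in one match.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: .*? stops at the first </b> it can, giving one match per tag.
lazy = re.findall(r"<b>.*?</b>", html)

print(greedy)  # ['<b>one</b> and <b>two</b>']
print(lazy)    # ['<b>one</b>', '<b>two</b>']
```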

Anchors: Where to Match

Anchor   Position               Example
^        Start of string/line   ^Hello matches "Hello world", not "Say Hello"
$        End of string/line     world$ matches "Hello world", not "world peace"
\b       Word boundary          \bcat\b matches "cat" but not "category"
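
A quick check of the anchor examples, sketched with Python's re module (an assumption; "the cat sat" is a made-up sentence). In Python, ^ and $ anchor to the whole string unless the multiline flag is set:

```python
import re

def found(pattern: str, text: str) -> bool:
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

print(found(r"^Hello", "Hello world"))    # True
print(found(r"^Hello", "Say Hello"))      # False: not at the start
print(found(r"world$", "Hello world"))    # True
print(found(r"world$", "world peace"))    # False: not at the end
print(found(r"\bcat\b", "the cat sat"))   # True: "cat" is a whole word
print(found(r"\bcat\b", "category"))      # False: no boundary after "cat"
```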

Groups and Capturing

Parentheses create groups, which let you treat several tokens as a single unit and capture parts of a match for later use.

Example: Capturing Groups

Pattern: (\d{3})-(\d{3})-(\d{4})
Input: "555-123-4567"

Group 0 (full match): "555-123-4567"
Group 1: "555"
Group 2: "123"
Group 3: "4567"
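
In code, the groups listed above are available on the match object. A minimal sketch, assuming Python's re module:

```python
import re

match = re.search(r"(\d{3})-(\d{3})-(\d{4})", "555-123-4567")

print(match.group(0))  # 555-123-4567 (full match)
print(match.group(1))  # 555
print(match.group(2))  # 123
print(match.group(3))  # 4567

# groups() returns all captured groups as a tuple, without group 0.
print(match.groups())  # ('555', '123', '4567')
```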

Non-Capturing Groups

Use (?:...) when you need grouping but don't need to capture:

// Groups but doesn't capture
(?:https?|ftp)://
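
The effect shows up in the list of captured groups. In this sketch (assuming Python's re module), the trailing (\S+) group and the ftp address are additions made for illustration and are not part of the pattern above:

```python
import re

url = "ftp://files.example.org"  # hypothetical sample input

# Capturing version: the scheme becomes group 1, the rest group 2.
capturing = re.search(r"(https?|ftp)://(\S+)", url)
print(capturing.groups())  # ('ftp', 'files.example.org')

# Non-capturing version: the scheme is grouped for the | alternation
# but not stored, so the rest of the URL becomes group 1.
non_capturing = re.search(r"(?:https?|ftp)://(\S+)", url)
print(non_capturing.groups())  # ('files.example.org',)
```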

Practical Patterns

Email Validation (Basic)

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

Matches: typical addresses of the form name@domain.tld
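
One way to wrap this pattern in a validator is sketched here, assuming Python's re module. The function name looks_like_email and the sample addresses are hypothetical. As the heading says, this is a basic check of the address shape, not proof that the mailbox exists:

```python
import re

EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

def looks_like_email(value: str) -> bool:
    """Return True if the value has the basic shape of an email address."""
    # fullmatch also rejects a trailing newline, which $ alone would allow.
    return EMAIL_RE.fullmatch(value) is not None

print(looks_like_email("user@example.com"))                 # True
print(looks_like_email("first.last+tag@mail.example.org"))  # True
print(looks_like_email("no-at-sign.example.com"))           # False
print(looks_like_email("user@localhost"))                   # False: no dot and TLD
```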

Phone Number (US)

^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$

Matches: (555) 123-4567, 555-123-4567, 555.123.4567
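
Because the pattern captures the three digit groups, it can also normalize the accepted formats. A sketch assuming Python's re module; normalize_us_phone is a hypothetical helper, and the NNN-NNN-NNNN output format is a choice made for this example:

```python
import re
from typing import Optional

PHONE_RE = re.compile(r"^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$")

def normalize_us_phone(value: str) -> Optional[str]:
    """Return the number as NNN-NNN-NNNN, or None if it does not match."""
    match = PHONE_RE.match(value)
    if match is None:
        return None
    return "-".join(match.groups())

print(normalize_us_phone("(555) 123-4567"))  # 555-123-4567
print(normalize_us_phone("555.123.4567"))    # 555-123-4567
print(normalize_us_phone("5551234567"))      # 555-123-4567
print(normalize_us_phone("555-1234"))        # None
```

Since every separator is optional, the pattern also accepts ten digits with no punctuation at all, as the third call shows.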

Password (8+ Characters, Uppercase, Lowercase, Digit)

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$

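Each lookahead asserts that one required character type appears somewhere in the string without consuming any characters; .{8,} then enforces the minimum length.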
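
A minimal sketch of the password rule in use, assuming Python's re module. The function name is_strong_enough and the sample passwords are made up for illustration:

```python
import re

PASSWORD_RE = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$")

def is_strong_enough(password: str) -> bool:
    """Check length >= 8 plus at least one lowercase, uppercase, and digit."""
    return PASSWORD_RE.match(password) is not None

print(is_strong_enough("Passw0rdX"))     # True
print(is_strong_enough("password1"))     # False: no uppercase letter
print(is_strong_enough("NoDigitsHere"))  # False: no digit
print(is_strong_enough("Sh0rt"))         # False: fewer than 8 characters
```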

URL

https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/\S*)?

Matches: http://example.com, https://sub.domain.com/path
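
This pattern has no anchors, so it can pull URLs out of longer text. A sketch assuming Python's re module and a made-up sentence; it uses finditer and group(0) because re.findall would return only the contents of the optional path group:

```python
import re

URL_RE = re.compile(r"https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(/\S*)?")

text = "See http://example.com and https://sub.domain.com/path for details."

# group(0) is the full match, regardless of the capturing group.
urls = [match.group(0) for match in URL_RE.finditer(text)]
print(urls)  # ['http://example.com', 'https://sub.domain.com/path']
```

Because the path part is \S*, a URL followed directly by punctuation (for example a comma after the path) would include that punctuation in the match.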

Lookahead and Lookbehind

Match based on what comes before or after, without including it in the match:

Type                  Syntax     Description
Positive Lookahead    (?=...)    Followed by ...
Negative Lookahead    (?!...)    NOT followed by ...
Positive Lookbehind   (?<=...)   Preceded by ...
Negative Lookbehind   (?<!...)   NOT preceded by ...

// Match "foo" only if followed by "bar"
foo(?=bar) // matches "foo" in "foobar", not in "foobaz"

// Match $ amount (digit preceded by $)
(?<=\$)\d+ // matches "100" in "$100"
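
All four lookaround types can be tried side by side. This sketch assumes Python's re module, where lookbehind patterns must have a fixed width; the negative examples and their sample strings are additions for illustration:

```python
import re

# Positive lookahead: "foo" only when "bar" follows; "bar" is not consumed.
print(re.search(r"foo(?=bar)", "foobar").group(0))  # foo
print(re.search(r"foo(?=bar)", "foobaz"))           # None

# Negative lookahead: "foo" only when "bar" does NOT follow.
not_bar = re.findall(r"foo(?!bar)", "foobar foobaz")
print(not_bar)  # ['foo'] (the one in "foobaz")

# Positive lookbehind: digits preceded by "$".
amount = re.search(r"(?<=\$)\d+", "$100").group(0)
print(amount)  # 100

# Negative lookbehind: whole numbers NOT preceded by "$". The \b keeps
# the match from starting in the middle of "100".
plain_numbers = re.findall(r"(?<!\$)\b\d+", "$100 and 200")
print(plain_numbers)  # ['200']
```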

Flags/Modifiers

Flag   Description
i      Case-insensitive matching
g      Global - find all matches, not just first
m      Multiline - ^ and $ match line boundaries
s      Dotall - . matches newlines too
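
The single-letter flags above are the form used in /pattern/flags literals. As a sketch of how they map onto another API, Python's re module (an assumption, since the tutorial names no language) spells them re.IGNORECASE, re.MULTILINE, and re.DOTALL, and has no g flag: re.search finds the first match and re.findall finds all of them.

```python
import re

text = "Hello\nhello"  # hypothetical two-line input

first_only = re.search(r"hello", text, re.IGNORECASE).group(0)
all_matches = re.findall(r"hello", text, re.IGNORECASE)
print(first_only)   # Hello
print(all_matches)  # ['Hello', 'hello']

# Without MULTILINE, ^ matches only at the start of the string.
start_of_string = re.findall(r"^hello", text, re.IGNORECASE)
start_of_lines = re.findall(r"^hello", text, re.IGNORECASE | re.MULTILINE)
print(start_of_string)  # ['Hello']
print(start_of_lines)   # ['Hello', 'hello']

# Without DOTALL, the dot cannot cross the newline.
print(re.search(r"Hello.hello", text))                        # None
print(re.search(r"Hello.hello", text, re.DOTALL) is not None)  # True
```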

Common Mistakes

🔧 Test Your Patterns

Use our free RegEx Tester to experiment with patterns and see matches in real-time.


When Not to Use Regex

Conclusion

Regular expressions are like a superpower—incredibly useful once you learn them, but easy to misuse. Start with simple patterns, test incrementally, and don't be afraid to use comments or break complex patterns into pieces.

The key to mastering regex is practice. Use the patterns in this tutorial as building blocks, experiment with variations, and soon you'll be writing patterns confidently.