Regular expressions (regex) are one of the most powerful tools in a developer's arsenal—and one of the most feared. This tutorial breaks down regex into digestible pieces, building from basic concepts to patterns you can use in real projects.
🎯 By the end of this tutorial, you'll be able to write patterns for email validation, password requirements, phone numbers, and more. You'll understand quantifiers, groups, and lookaheads, and know when to use (or avoid) regex.
What Are Regular Expressions?
Regular expressions are patterns that describe sets of strings. They're used to:
- Search: Find text matching a pattern
- Validate: Check if input matches expected format
- Extract: Pull specific parts from text
- Replace: Transform text based on patterns
The Basics: Literal Characters
The simplest regex is just literal text. For example, the pattern `hello`:
Matches: "hello" in "hello world"
Does not match: "Hello" (case-sensitive by default)
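A minimal JavaScript sketch of the literal match described above. JavaScript is assumed here because the tutorial's later snippets use `//` comments and `str.includes()`; the variable names are illustrative only.

```javascript
// Literal pattern: matches the exact characters h-e-l-l-o, in that order.
const literal = /hello/;

const lowerMatch = literal.test("hello world"); // true
const upperMatch = literal.test("Hello world"); // false: "H" is not "h"

console.log(lowerMatch, upperMatch);
```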
Character Classes
Match any single character from a set:
| Pattern | Matches | Example |
|---|---|---|
| `[abc]` | a, b, or c | "cat" matches c, a |
| `[a-z]` | Any lowercase letter | "Hello" matches e, l, l, o |
| `[A-Z]` | Any uppercase letter | "Hello" matches H |
| `[0-9]` | Any digit | "abc123" matches 1, 2, 3 |
| `[^abc]` | NOT a, b, or c | "dog" matches d, o, g |
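The table's examples can be checked with a short sketch, assuming JavaScript and the `g` flag so that every match is returned rather than only the first. Variable names are illustrative.

```javascript
// Each call returns every single character that belongs to the class.
const lowercase = "Hello".match(/[a-z]/g);  // ["e", "l", "l", "o"]
const uppercase = "Hello".match(/[A-Z]/g);  // ["H"]
const digits = "abc123".match(/[0-9]/g);    // ["1", "2", "3"]
const notAbc = "dog".match(/[^abc]/g);      // ["d", "o", "g"]

console.log(lowercase, uppercase, digits, notAbc);
```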
Shorthand Character Classes
| Shorthand | Equivalent | Description |
|---|---|---|
| `\d` | `[0-9]` | Any digit |
| `\D` | `[^0-9]` | Not a digit |
| `\w` | `[a-zA-Z0-9_]` | Word character |
| `\W` | `[^a-zA-Z0-9_]` | Not a word character |
| `\s` | `[ \t\n\r\f\v]` (roughly) | Whitespace |
| `\S` | `[^ \t\n\r\f\v]` (roughly) | Not whitespace |
| `.` | (almost anything) | Any character except newline |
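As a rough illustration of the shorthands, here is a small JavaScript sketch. The sample string is made up for this example, and `+` (one or more, covered in the quantifier section) is used so that runs of characters come back as single matches.

```javascript
// Hypothetical sample text, chosen to contain digits, an underscore, and punctuation.
const sample = "Order 42, item_7!";

const digitRuns = sample.match(/\d+/g); // ["42", "7"]
const wordRuns = sample.match(/\w+/g);  // ["Order", "42", "item_7"]
const pieces = sample.split(/\s+/);     // ["Order", "42,", "item_7!"]

console.log(digitRuns, wordRuns, pieces);
```

Note that `\w` includes the underscore, which is why "item_7" stays in one piece.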
Quantifiers: How Many?
| Quantifier | Meaning | Example |
|---|---|---|
| `*` | 0 or more | `ab*c` matches "ac", "abc", "abbc" |
| `+` | 1 or more | `ab+c` matches "abc", "abbc", not "ac" |
| `?` | 0 or 1 | `colou?r` matches "color" and "colour" |
| `{n}` | Exactly n | `\d{4}` matches "2026" |
| `{n,}` | n or more | `\d{2,}` matches "12", "123", "1234" |
| `{n,m}` | Between n and m | `\d{2,4}` matches "12", "123", "1234" |
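The quantifier rows can be verified with a sketch like the following, assuming JavaScript. The `^` and `$` anchors are added here only so that each candidate must match in full; the candidate lists are illustrative.

```javascript
// Filter each candidate list down to the strings the pattern accepts in full.
const candidates = ["ac", "abc", "abbc"];
const zeroOrMore = candidates.filter((s) => /^ab*c$/.test(s)); // ["ac", "abc", "abbc"]
const oneOrMore = candidates.filter((s) => /^ab+c$/.test(s));  // ["abc", "abbc"]

const optionalU = ["color", "colour", "colouur"].filter((s) => /^colou?r$/.test(s)); // ["color", "colour"]
const twoToFour = ["1", "12", "1234", "12345"].filter((s) => /^\d{2,4}$/.test(s));   // ["12", "1234"]

console.log(zeroOrMore, oneOrMore, optionalU, twoToFour);
```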
✅ Greedy vs. Lazy
Quantifiers are "greedy" by default: they match as much as possible. Add `?` after a quantifier to make it "lazy" (match as little as possible), for example `.*` vs. `.*?`.
Anchors: Where to Match
| Anchor | Position | Example |
|---|---|---|
| `^` | Start of string/line | `^Hello` matches "Hello world", not "Say Hello" |
| `$` | End of string/line | `world$` matches "Hello world", not "world peace" |
| `\b` | Word boundary | `\bcat\b` matches "cat" but not "category" |
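A short JavaScript sketch of the three anchors, using the same inputs as the table. The object and property names are only for illustration.

```javascript
// Each property records whether the anchored pattern finds a match.
const anchorChecks = {
  startHit: /^Hello/.test("Hello world"),   // true: "Hello" is at the start
  startMiss: /^Hello/.test("Say Hello"),    // false: "Hello" is not at the start
  endHit: /world$/.test("Hello world"),     // true: "world" is at the end
  endMiss: /world$/.test("world peace"),    // false: "world" is not at the end
  wordHit: /\bcat\b/.test("the cat sat"),   // true: "cat" stands alone
  wordMiss: /\bcat\b/.test("category"),     // false: "cat" is followed by "e"
};

console.log(anchorChecks);
```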
Groups and Capturing
Parentheses create groups for:
- Applying quantifiers to multiple characters: `(ab)+`
- Capturing matched text for later use
- Creating alternatives: `(cat|dog)`
Example: Capture Groups
Input: "555-123-4567"
Group 0 (full match): "555-123-4567"
Group 1: "555"
Group 2: "123"
Group 3: "4567"
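The pattern that produces these groups is not shown above. A minimal sketch, assuming it is something like `(\d{3})-(\d{3})-(\d{4})` and that the code is JavaScript, would look like this:

```javascript
// Assumed pattern: three digit groups separated by hyphens.
const phonePattern = /(\d{3})-(\d{3})-(\d{4})/;

const phoneMatch = "555-123-4567".match(phonePattern);

// Index 0 is the full match; indexes 1..3 are the capture groups.
const [fullMatch, areaCode, prefix, lineNumber] = phoneMatch;

console.log(fullMatch, areaCode, prefix, lineNumber);
```

The names `areaCode`, `prefix`, and `lineNumber` are labels chosen for this example, not part of the original tutorial.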
Non-Capturing Groups
Use (?:...) when you need grouping but don't need to capture:
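To make the difference concrete, the following JavaScript sketch runs the same repetition with a capturing and a non-capturing group. The input string is invented for the example.

```javascript
// Same repetition, with and without capturing.
const capturing = "ababab!".match(/(ab)+!/);
const nonCapturing = "ababab!".match(/(?:ab)+!/);

console.log(capturing.length);    // 2: full match plus group 1
console.log(capturing[1]);        // "ab" (the last repetition the group matched)
console.log(nonCapturing.length); // 1: full match only, nothing captured
```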
Practical Patterns
Email Validation (Basic)
Matches: [email protected], [email protected]
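The tutorial's own email pattern is not shown here, so the following is only one plausible basic check, written in JavaScript: something without spaces or `@`, then `@`, then a domain containing at least one dot. The addresses are hypothetical test values.

```javascript
// Assumed basic pattern, deliberately loose: local@domain.tld with no spaces.
const basicEmail = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

const emailResults = [
  "user@example.com",            // true
  "first.last@mail.example.org", // true
  "no-at-sign.example.com",      // false: no "@"
  "user@localhost",              // false: no dot in the domain
].map((address) => basicEmail.test(address));

console.log(emailResults);
```

A check like this catches obvious typos but does not prove an address is valid or deliverable.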
Phone Number (US)
Matches: (555) 123-4567, 555-123-4567, 555.123.4567
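One pattern that accepts the three formats listed above is sketched below in JavaScript. It is an assumption, not necessarily the pattern the tutorial originally showed.

```javascript
// Optional parentheses around the area code, then optional separators: hyphen, dot, or whitespace.
// Limitation: it does not require the parentheses to be balanced.
const usPhone = /^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/;

const phoneResults = [
  "(555) 123-4567", // true
  "555-123-4567",   // true
  "555.123.4567",   // true
  "555-1234",       // false: too few digits
].map((number) => usPhone.test(number));

console.log(phoneResults);
```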
Password (8+ Characters, Uppercase, Lowercase, Digit)
Uses lookaheads to require different character types.
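A sketch of how such a pattern could be built in JavaScript, assuming one lookahead per required character type followed by a length check. The sample passwords are made up.

```javascript
// Each (?=...) scans ahead from the start without consuming characters:
// at least one lowercase letter, one uppercase letter, and one digit.
// Then .{8,} requires eight or more characters in total.
const strongPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$/;

const passwordResults = [
  "Passw0rdX",    // true
  "password1",    // false: no uppercase letter
  "Sh0rt",        // false: fewer than 8 characters
  "NoDigitsHere", // false: no digit
].map((candidate) => strongPassword.test(candidate));

console.log(passwordResults);
```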
URL
Matches: http://example.com, https://sub.domain.com/path
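The original URL pattern is not shown; the JavaScript sketch below is one simple assumption that covers the two examples above: an `http` or `https` scheme, a dotted host name, and an optional path.

```javascript
// Assumed simple pattern: scheme, host with a final dot-separated label, optional path.
const simpleUrl = /^https?:\/\/[\w.-]+\.[a-z]{2,}(?:\/\S*)?$/i;

const urlResults = [
  "http://example.com",          // true
  "https://sub.domain.com/path", // true
  "ftp://example.com",           // false: scheme is not http or https
  "example.com",                 // false: no scheme
].map((url) => simpleUrl.test(url));

console.log(urlResults);
```

It ignores ports, query rules, and internationalized domains, so a real project would probably lean on the platform's URL parser instead.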
Lookahead and Lookbehind
Match based on what comes before or after, without including it in the match:
| Type | Syntax | Description |
|---|---|---|
| Positive Lookahead | `(?=...)` | Followed by ... |
| Negative Lookahead | `(?!...)` | NOT followed by ... |
| Positive Lookbehind | `(?<=...)` | Preceded by ... |
| Negative Lookbehind | `(?<!...)` | NOT preceded by ... |
foo(?=bar) // matches "foo" in "foobar", not in "foobaz"
// Match $ amount (digit preceded by $)
(?<=\$)\d+ // matches "100" in "$100"
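The two patterns above can be run as follows. This sketch assumes a JavaScript runtime that supports lookbehind (it was added in ES2018, so older engines may reject it), and it adds a negative lookahead for comparison.

```javascript
// Positive lookahead: "foo" only when "bar" follows; "bar" is not part of the match.
const fooBeforeBar = "foobar".match(/foo(?=bar)/)[0]; // "foo"
const fooBeforeBaz = /foo(?=bar)/.test("foobaz");     // false

// Negative lookahead: "foo" only when "bar" does NOT follow.
const fooNotBar = /foo(?!bar)/.test("foobaz");        // true

// Positive lookbehind: digits preceded by "$"; the "$" is not part of the match.
const amount = "$100".match(/(?<=\$)\d+/)[0];         // "100"

console.log(fooBeforeBar, fooBeforeBaz, fooNotBar, amount);
```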
Flags/Modifiers
| Flag | Description |
|---|---|
| `i` | Case-insensitive matching |
| `g` | Global - find all matches, not just first |
| `m` | Multiline - `^` and `$` match line boundaries |
| `s` | Dotall - `.` matches newlines too |
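A brief JavaScript sketch of each flag's effect. In JavaScript, flags follow the closing slash of a regex literal; the `s` flag needs an ES2018-capable runtime. The inputs are illustrative.

```javascript
// i: ignore case.
const ignoreCase = /hello/i.test("Hello");      // true

// g: return every match instead of only the first.
const firstDigit = "a1b2c3".match(/\d/)[0];     // "1"
const allDigits = "a1b2c3".match(/\d/g);        // ["1", "2", "3"]

// m: ^ and $ also match at line breaks inside the string.
const twoLines = "one\ntwo";
const withoutM = /^two$/.test(twoLines);        // false
const withM = /^two$/m.test(twoLines);          // true

// s: the dot also matches the newline character.
const withoutS = /one.two/.test(twoLines);      // false
const withS = /one.two/s.test(twoLines);        // true

console.log(ignoreCase, firstDigit, allDigits, withoutM, withM, withoutS, withS);
```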
Common Mistakes
- Forgetting to escape: `.`, `*`, `+`, `?`, etc. have special meaning. Use `\.` to match a literal period.
- Greedy matching: `.*` matches too much. Use `.*?` for lazy matching.
- Missing anchors: `\d{4}` matches "2026" anywhere. Use `^\d{4}$` for exact match.
- Overcomplicated patterns: Sometimes string methods or multiple simple patterns are clearer.
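The first three mistakes can be reproduced in a few lines of JavaScript. The input strings are invented for the demonstration.

```javascript
// Mistake 1: an unescaped dot matches any character, not just a period.
const unescapedDot = /file.txt/.test("file-txt"); // true (probably not intended)
const escapedDot = /file\.txt/.test("file-txt");  // false

// Mistake 2: greedy .* runs to the LAST closing quote; lazy .*? stops at the first.
const quoted = 'say "one" and "two"';
const greedy = quoted.match(/".*"/)[0];           // "one" and "two", outer quotes included
const lazy = quoted.match(/".*?"/)[0];            // "one", quotes included

// Mistake 3: without anchors, four digits anywhere in the string are enough.
const looseYear = /\d{4}/.test("id 202600");      // true
const exactYear = /^\d{4}$/.test("id 202600");    // false
const validYear = /^\d{4}$/.test("2026");         // true

console.log(unescapedDot, escapedDot, greedy, lazy, looseYear, exactYear, validYear);
```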
🔧 Test Your Patterns
Use our free RegEx Tester to experiment with patterns and see matches in real time.
When Not to Use Regex
- Parsing HTML/XML: Use a proper parser. Regex can't handle nested tags correctly.
- Complex validation: The email RFC is incredibly complex. Use a library.
- Simple tasks: `str.includes()` or `str.startsWith()` are clearer than regex.
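For the simple-task case, a small JavaScript comparison shows that the string methods give the same answers as the equivalent regex while stating the intent directly. The filename is a made-up value.

```javascript
// Hypothetical input for the comparison.
const filename = "report-2026.csv";

// Regex versions.
const startsViaRegex = /^report/.test(filename);
const containsViaRegex = /2026/.test(filename);

// String-method versions: same results, no pattern syntax to decode.
const startsViaMethod = filename.startsWith("report");
const containsViaMethod = filename.includes("2026");

console.log(startsViaRegex === startsViaMethod, containsViaRegex === containsViaMethod);
```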
Conclusion
Regular expressions are like a superpower—incredibly useful once you learn them, but easy to misuse. Start with simple patterns, test incrementally, and don't be afraid to use comments or break complex patterns into pieces.
The key to mastering regex is practice. Use the patterns in this tutorial as building blocks, experiment with variations, and soon you'll be writing patterns confidently.