
Regex Patterns and Syntax

The document provides an overview of regular expressions (regex) in Python, covering basic syntax, character classes, anchors, groups, lookarounds, flags, and practical examples. It explains regex components such as `.` for matching any character, `^` for the start of a string, and `\b` for word boundaries. It also includes practical examples for extracting emails, finding dates, and validating phone numbers.

Uploaded by

Samit Mishra


Regular Expressions (Regex) in Python

### Basic Syntax

- `.`: Matches any single character except a newline.

- Example: `a.b` matches `aab`, `acb`, but not `a\nb`.

- `^`: Matches the start of a string.

- Example: `^abc` matches `abc` only if it is at the beginning of the string.

- `$`: Matches the end of a string, or just before a newline that ends the string.

- Example: `abc$` matches `abc` only if it is at the end of the string.

- `*`: Matches 0 or more repetitions of the preceding element.

- Example: `ab*c` matches `ac`, `abc`, `abbc`, etc.

- `+`: Matches 1 or more repetitions of the preceding element.

- Example: `ab+c` matches `abc`, `abbc`, but not `ac`.

- `?`: Matches 0 or 1 repetition of the preceding element.

- Example: `ab?c` matches `ac` and `abc`, but not `abbc`.

- `{m,n}`: Matches between `m` and `n` repetitions of the preceding element.

- Example: `a{2,4}` matches `aa`, `aaa`, and `aaaa`.
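
The quantifier examples above can be checked directly. The following is a minimal sketch that assumes `re.fullmatch` as the test, so each string must match the pattern in its entirety; the `cases` table and the `check` helper are illustrative names, not part of the original notes.

```python
import re

# Each pattern is paired with strings that should match in full
# and strings that should not.
cases = {
    r"a.b": (["aab", "acb"], ["a\nb"]),
    r"ab*c": (["ac", "abc", "abbc"], ["adc"]),
    r"ab+c": (["abc", "abbc"], ["ac"]),
    r"ab?c": (["ac", "abc"], ["abbc"]),
    r"a{2,4}": (["aa", "aaa", "aaaa"], ["a", "aaaaa"]),
}


def check(pattern, text):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


for pattern, (good, bad) in cases.items():
    assert all(check(pattern, s) for s in good)
    assert not any(check(pattern, s) for s in bad)
    print(f"{pattern!r}: ok")
```

With `re.search` instead of `re.fullmatch`, a pattern such as `a{2,4}` would also be found inside a longer run like `aaaaa`, which is why the sketch uses a full match.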

### Character Classes

- `[...]`: Matches any one of the characters inside the brackets.

- Example: `[abc]` matches `a`, `b`, or `c`.

- `[^...]`: Matches any character not inside the brackets.

- Example: `[^abc]` matches any character except `a`, `b`, or `c`.

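- `\d`: Matches any digit. For ASCII text this is equivalent to `[0-9]`; on `str` patterns Python also matches other Unicode decimal digits unless `re.ASCII` is set.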

- Example: `\d` matches `1`, `2`, `3`, etc.

- `\D`: Matches any non-digit.

- Example: `\D` matches `a`, `b`, `!`, etc.

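- `\w`: Matches any word character (alphanumeric + underscore). For ASCII text this is equivalent to `[A-Za-z0-9_]`; on `str` patterns Python also matches Unicode letters and digits unless `re.ASCII` is set.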

- Example: `\w` matches `a`, `1`, `_`, etc.

- `\W`: Matches any non-word character.

- Example: `\W` matches `!`, `@`, `#`, etc.

- `\s`: Matches any whitespace character (spaces, tabs, newlines).

- Example: `\s` matches ` `, `\t`, `\n`, etc.

- `\S`: Matches any non-whitespace character.

- Example: `\S` matches `a`, `1`, `!`, etc.
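
As a rough illustration of the character classes above, the sketch below runs each class over one sample string. The string `sample` and the variable names are made up for this example; the `re.ASCII` comparison at the end uses an Arabic-Indic digit to show the Unicode behaviour of `\d` on `str` patterns.

```python
import re

sample = "Room 42_b\t!"

digits = re.findall(r"\d", sample)          # digit characters
non_word = re.findall(r"\W", sample)        # anything that is not a word character
whitespace = re.findall(r"\s", sample)      # space and tab
vowels = re.findall(r"[aeiou]", sample)     # any one of the listed characters
symbols = re.findall(r"[^\w\s]", sample)    # neither word character nor whitespace

print(digits)      # ['4', '2']
print(non_word)    # [' ', '\t', '!']
print(whitespace)  # [' ', '\t']
print(vowels)      # ['o', 'o']
print(symbols)     # ['!']

# U+0663 is ARABIC-INDIC DIGIT THREE. By default \d accepts it;
# with re.ASCII only 0-9 are accepted.
mixed = "\u0663" + "3"
unicode_digits = re.findall(r"\d", mixed)
ascii_digits = re.findall(r"\d", mixed, re.ASCII)
print(len(unicode_digits), ascii_digits)  # 2 ['3']
```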

### Anchors

- `\b`: Matches a word boundary.

- Example: `\bword\b` matches `word` but not `sword` or `words`.

- `\B`: Matches a position that is not a word boundary.

- Example: `\Bword\B` matches the `word` inside `swordfish` but not `word` or `sword`.
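
The boundary examples can be verified by looking at where each pattern matches. This is a small sketch with an invented test string; the reported numbers are start offsets into `text`.

```python
import re

text = "word sword words swordfish"

# \bword\b needs a boundary on both sides, so only the standalone word matches.
whole_word = [m.start() for m in re.finditer(r"\bword\b", text)]

# \Bword\B needs word characters on both sides, so only "swordfish" qualifies.
inside_word = [m.start() for m in re.finditer(r"\Bword\B", text)]

print(whole_word)   # [0]
print(inside_word)  # [18]
```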

### Groups and Alternations

- `(...)`: Groups a pattern together.

- Example: `(abc)+` matches `abc`, `abcabc`, `abcabcabc`, etc.

- `|`: Matches either the pattern before or the pattern after the `|`.

- Example: `a|b` matches `a` or `b`.
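
A short sketch of grouping and alternation follows. The `gr(?:a|e)y` comparison is an added illustration, not from the notes above: it shows that `|` applies to everything on either side of it unless a group limits its scope. `(?:...)` is a non-capturing group.

```python
import re

# A repeated group: group(0) is the whole match, group(1) the last repetition.
repeated = re.fullmatch(r"(abc)+", "abcabc")
print(repeated.group(0))  # abcabc
print(repeated.group(1))  # abc

# Alternation between two whole words.
pets = re.findall(r"cat|dog", "cat, dog, bird")
print(pets)  # ['cat', 'dog']

# Without a group, "gra|ey" means "gra" OR "ey", not "gray" OR "grey".
grouped = re.fullmatch(r"gr(?:a|e)y", "grey") is not None
ungrouped = re.fullmatch(r"gra|ey", "grey") is not None
print(grouped, ungrouped)  # True False
```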

### Escaped Characters

- `\`: Escapes a special character, making it literal.

- Example: `\.` matches `.` instead of any character.
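
The effect of escaping can be seen by comparing an escaped and an unescaped dot. The second half of this sketch uses `re.escape` from the standard library, which escapes special characters in a string for you; the string `literal` is an invented example.

```python
import re

# Unescaped, the dot matches every character; escaped, only the literal dot.
any_char = re.findall(r".", "a.b")
only_dot = re.findall(r"\.", "a.b")
print(any_char)  # ['a', '.', 'b']
print(only_dot)  # ['.']

# Used as a pattern, "1+1=2" means "one or more 1s, then 1, then =2",
# so it does not match the text "1+1=2" itself.
literal = "1+1=2"
unescaped = re.search(literal, "1+1=2")
escaped = re.search(re.escape(literal), "is 1+1=2?")
print(unescaped)        # None
print(escaped.group())  # 1+1=2
```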

### Lookahead and Lookbehind

- `(?=...)`: Positive lookahead assertion. Succeeds only if the text that follows matches the given pattern, without consuming it.

- Example: `a(?=b)` matches `a` in `ab`, but not in `ac`.

- `(?!...)`: Negative lookahead assertion. Succeeds only if the text that follows does not match the given pattern.

- Example: `a(?!b)` matches `a` in `ac`, but not in `ab`.

- `(?<=...)`: Positive lookbehind assertion. Succeeds only if the text just before the current position matches the given pattern.

- Example: `(?<=b)a` matches `a` in `ba`, but not in `ca`.

- `(?<!...)`: Negative lookbehind assertion. Succeeds only if the text just before the current position does not match the given pattern.

- Example: `(?<!b)a` matches `a` in `ca`, but not in `ba`.
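
The four assertions can be tried on a single line of text. This is a sketch with made-up data (`prices`) and variable names; it shows that the text checked by a lookaround is required or forbidden but is never part of the match itself.

```python
import re

prices = "apple $5, pear 7, plum $12"

# Positive lookbehind: digits that come right after a dollar sign.
dollar_amounts = re.findall(r"(?<=\$)\d+", prices)

# Negative lookbehind: whole numbers that do not follow a dollar sign.
plain_numbers = re.findall(r"(?<!\$)\b\d+", prices)

# Positive lookahead: the comma is required but not included in the match.
before_comma = re.findall(r"\w+(?=,)", prices)

# Negative lookahead: start offsets of an "a" that is not followed by "b".
a_not_before_b = [m.start() for m in re.finditer(r"a(?!b)", "ab ac")]

print(dollar_amounts)  # ['5', '12']
print(plain_numbers)   # ['7']
print(before_comma)    # ['5', '7']
print(a_not_before_b)  # [3]
```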

### Flags

- `re.IGNORECASE` or `re.I`: Ignore case.

- Example: `re.search('abc', 'ABC', re.I)` matches.

- `re.MULTILINE` or `re.M`: Make `^` and `$` match the start and end of each line.

- Example: `re.search('^def', 'abc\ndef', re.M)` matches, while the same search without `re.M` does not.

- `re.DOTALL` or `re.S`: Make `.` match any character including newlines.

- Example: `re.search('a.b', 'a\nb', re.S)` matches.
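
Each flag is easiest to understand by running the same pattern with and without it. The following sketch uses an invented two-line string; flags are combined with `|`.

```python
import re

text = "abc\nDEF"

# Without flags, ^ only matches at the very start and matching is case-sensitive.
no_flags = re.search(r"^def", text)

# re.M lets ^ match after the newline; re.I ignores case.
multi_ignore = re.search(r"^def", text, re.M | re.I)

# Without re.S the dot stops at the newline; with it the dot matches "\n".
dot_default = re.search(r"c.D", text)
dot_all = re.search(r"c.D", text, re.S)

# With re.M, ^ and $ anchor every line.
lines = re.findall(r"^\w+$", text, re.M)

print(no_flags)              # None
print(multi_ignore.group())  # DEF
print(dot_default)           # None
print(repr(dot_all.group())) # 'c\nD'
print(lines)                 # ['abc', 'DEF']
```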

### Practical Examples

1. Extracting Email Addresses:

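```python
import re

# The address appears redacted as "[email protected]"; substitute a real one to see a match.
text = "Please contact us at [email protected] for assistance."

# [A-Za-z] is used for the top-level domain; [A-Z|a-z] would also accept a literal "|".
emails = re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', text)

print(emails)  # Output: a list of the addresses found in text
```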

2. Finding Dates in a Text:

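```python
import re

text = "The event is scheduled for 2023-08-10."

# Four digits, two digits, two digits, separated by hyphens (format only, not a calendar check).
dates = re.findall(r'\b\d{4}-\d{2}-\d{2}\b', text)

print(dates)  # Output: ['2023-08-10']
```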
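
If the individual parts of the date are needed, one option is to name the groups. This is a sketch rather than part of the original example: `(?P<name>...)` is Python's named-group syntax, and `DATE_RE` and `parts` are illustrative names. Like the pattern above, it checks the format only and would accept an impossible date such as `2023-13-40`.

```python
import re

# Named groups make each component available by name.
DATE_RE = re.compile(r"\b(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})\b")

match = DATE_RE.search("The event is scheduled for 2023-08-10.")

# groupdict() maps each group name to the text it captured.
parts = {name: int(value) for name, value in match.groupdict().items()}

print(parts)  # {'year': 2023, 'month': 8, 'day': 10}
```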

3. Replacing Multiple Whitespace Characters with a Single Space:

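```python
import re

# Input with irregular runs of whitespace between the words.
text = "This   is   an    example   text."

# Collapse every run of whitespace into a single space.
clean_text = re.sub(r'\s+', ' ', text)

print(clean_text)  # Output: This is an example text.
```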

4. Validating a Phone Number:

```python
import re

phone = "123-456-7890"

# Expect exactly three digits, three digits, four digits, separated by hyphens.
if re.match(r'^\d{3}-\d{3}-\d{4}$', phone):
    print("Valid phone number")
else:
    print("Invalid phone number")
```
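
One detail worth knowing for validation: `$` also matches just before a trailing newline, so a pattern anchored with `^...$` accepts input that ends in `\n`. A stricter variant, sketched below under the assumption that the whole string must be the phone number, uses `re.fullmatch` on a compiled pattern. `PHONE_RE` and `is_valid_phone` are hypothetical names for this example.

```python
import re

# Compile once so the pattern can be reused for many inputs.
PHONE_RE = re.compile(r"\d{3}-\d{3}-\d{4}")


def is_valid_phone(value):
    """Return True only if the entire string has the form NNN-NNN-NNNN."""
    return PHONE_RE.fullmatch(value) is not None


print(is_valid_phone("123-456-7890"))    # True
print(is_valid_phone("123-456-789"))     # False
print(is_valid_phone("123-456-7890\n"))  # False

# The ^...$ form accepts the trailing newline.
lenient = re.match(r"^\d{3}-\d{3}-\d{4}$", "123-456-7890\n") is not None
print(lenient)  # True
```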

### Summary

Regular expressions provide a powerful way to search, match, and manipulate strings based on specific patterns. By understanding these patterns and syntax, you can perform complex text processing tasks efficiently.

