
Regular Expressions Cheat Sheet

This document is a cheat sheet for regular expressions (RegEx), providing a quick reference for essential constructs, syntax, and practical examples for text pattern matching and manipulation. It covers foundational elements like character classes, anchors, and quantifiers, as well as advanced features such as groups and lookaheads. The cheat sheet also includes popular Python re module functions and inline flags to enhance regex usage.

Uploaded by dr.poplisakshi2

Copyright: © All Rights Reserved

Regular Expressions

This cheat sheet provides a quick reference for essential regular expression (RegEx) constructs, helping you perform text pattern matching and manipulation with ease. It covers foundational syntax, such as character classes, anchors, and quantifiers, alongside advanced features like groups, lookaheads, and inline flags. Whether you're cleaning data, validating input, or performing complex text searches, this cheat sheet ensures you can find the right tools for the task.

Each entry includes practical examples that demonstrate regex's flexibility, from identifying patterns to modifying text with functions like re.sub(). The examples are paired with concise explanations to simplify learning and application. To test and refine your own patterns interactively, visit regex101.com, an indispensable tool for exploring how regex behaves.

Designed to make regex approachable and useful, this handy resource is perfect for tackling challenges in text processing, data cleaning, and parsing. Keep it close-by to be ready to streamline workflows and work effectively with text-based data.

Table of Contents

Special Characters
^ $ . \ | + * ? {#} {#,#} {#,#}?

Sets
[rEsz] [a-z] [a\-z] [a-] [-a] [a-z0-9] [(+*)] [^ers]

Character Classes | Special Sequences
\w \W \d \D \s \S \b \B \A \Z

Popular Python Re Module Functions
findall, search, split, sub, match

Groups
cat(?=fish) (?<=cat)fish cat(?!fish) (?<!cat)fish
(?P=pet) (?P<pet>cat) (ro) (?:cat) (cat)\1 (?#...)

Inline Flags
(?a) (?i) (?L) (?m) (?s) (?u) (?x)

RegEx Cheat Sheet Free resources at: dataquest.io/guide


Special Characters

Syntax: ^r
Sample text: regular expressions
Explained: The ^ anchor matches the character or group to its right only at the start of a string. It does not match if r appears elsewhere.

Syntax: s$
Sample text: she sells seashells
Explained: The $ anchor matches the character or group to its left only at the end of a string. It does not match if s appears elsewhere.

Syntax: .
Sample text: regular expressions
Explained: The . wildcard matches any single character (including spaces) but not newline characters \n. It does not match multiple characters unless combined with a quantifier like * or +.

Syntax: \.
Sample text: www.example.com
Explained: The \ character is used to escape special characters (e.g., \. for a literal dot) or to denote character classes (e.g., \d for digits). See the Character Classes section for more details.

Syntax: A|B
Sample text: Action Button
Explained: The | (OR) operator matches either the expression to its left A or its right B, finding all possible matches across the string.

Syntax: b+
Sample text: abc bbb d b
Explained: The + quantifier greedily matches the preceding expression b one or more times. It captures the longest possible sequences of b in the string.

Syntax: b*
Sample text: abc bbb d b
Explained: The * quantifier greedily matches zero or more occurrences of the preceding expression b, including empty matches at positions where no b exists.

Syntax: colou?r
Sample text: color colour
Explained: The ? quantifier matches the preceding character or group zero or one times, making it optional.

Syntax: u{3}
Sample text: uuu uuuu uu u
Explained: The {m} quantifier matches the preceding character u exactly m times.

Syntax: u{2,3}
Sample text: uuu uuuu uu u
Explained: The {m,n} quantifier matches the preceding character at least m times but not more than n times.

Syntax: u{2,3}?
Sample text: uuu uuuu uu u
Explained: The {m,n}? quantifier matches the preceding character at least m times but not more than n times, in a non-greedy (lazy) manner.
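The anchors, the dot, the escape character, and alternation described in the Special Characters table can be tried with a short Python sketch. The sample strings come from the table except for the dot example, whose string "a b\nc" is an assumption chosen to show the newline rule; the variable names are illustrative only.

```python
import re

# ^ anchors the match to the start of the string.
start = re.findall(r"^r", "regular expressions")

# $ anchors the match to the end of the string.
end = re.findall(r"s$", "she sells seashells")

# . matches any single character, including a space, but not a newline.
dots = re.findall(r".", "a b\nc")

# \. matches a literal dot instead of "any character".
literal_dots = re.findall(r"\.", "www.example.com")

# | matches either the left or the right alternative.
either = re.findall(r"A|B", "Action Button")

print(start)         # ['r']
print(end)           # ['s']
print(dots)          # ['a', ' ', 'b', 'c']
print(literal_dots)  # ['.', '.']
print(either)        # ['A', 'B']
```

Note that the second r in "regular" and the other s characters in "she sells seashells" are skipped, because the anchors restrict where a match may begin or end.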

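The quantifiers +, *, ?, {m}, {m,n}, and {m,n}? can be compared on the same sample strings used in the Special Characters table. This is a minimal sketch; the variable names are illustrative, and the exact list returned for b* is not printed because the placement of empty matches can differ between Python versions.

```python
import re

text = "abc bbb d b"

# + needs at least one b and takes the longest run it can.
plus = re.findall(r"b+", text)

# * also succeeds with zero b's, so empty strings appear in the result.
star = re.findall(r"b*", text)

# ? makes the preceding u optional.
optional = re.findall(r"colou?r", "color colour")

runs = "uuu uuuu uu u"
exact = re.findall(r"u{3}", runs)     # exactly three u's
greedy = re.findall(r"u{2,3}", runs)  # two or three, as many as possible
lazy = re.findall(r"u{2,3}?", runs)   # two or three, as few as possible

print(plus)      # ['b', 'bbb', 'b']
print(optional)  # ['color', 'colour']
print(exact)     # ['uuu', 'uuu']
print(greedy)    # ['uuu', 'uuu', 'uu']
print(lazy)      # ['uu', 'uu', 'uu', 'uu']
```

The greedy and lazy results differ only in how much each match consumes: on "uuuu" the greedy form takes three characters and leaves one unmatched, while the lazy form takes two characters twice.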


Sets

Syntax: [rEsz]
Sample text: Regular Expression
Explained: Square brackets [] define a set, where each character is matched independently. A match occurs if any character from the set appears in the text.

Syntax: [a-z]
Sample text: 1 Fig, 2 NewTons
Explained: The - in [m-n] is a range operator, matching any character from m to n.

Syntax: [a\-z]
Sample text: a to z is not = A-Z
Explained: The \ escapes the -, treating it as a literal character instead of a range operator. This set matches a, z, and - only.

Syntax: [a-]
Sample text: regular-expression
Explained: Matches a and the literal - because - is treated as a character when placed at the start or end of a set.

Syntax: [-a]
Sample text: regular-expression
Explained: As above, matches a or -.

Syntax: [a-z0-9]
Sample text: 396 ExpressionS
Explained: Matches characters from a to z and also from 0 to 9.

Syntax: [(+*)]
Sample text: (valid) *expressions+words
Explained: Special characters become literal inside a set, so this matches (, +, *, and ).

Syntax: [^ers]
Sample text: regular expression
Explained: The ^ negates the set, matching any character not in the set. Here, it matches characters that are not e, r, or s.

Character Classes

Syntax: \w
Sample text: Ch 4 racter _ Class 3 s
Explained: Matches all alphanumeric characters (a-z, A-Z, and 0-9). It also matches the underscore _.

Syntax: \W
Sample text: !@# $%^&*()
Explained: Matches any non-word character, which includes symbols, punctuation, and spaces. Non-word characters are anything not in the set [a-zA-Z0-9_].

Syntax: \d
Sample text: 1a2b 3 c
Explained: Matches all digits 0-9.

Syntax: \D
Sample text: 1a2b 3 c
Explained: Matches any non-digits.

Syntax: \s
Sample text: character classes
Explained: Matches whitespace characters including the \t, \n, \r, and space characters.

Syntax: \S
Sample text: character classes
Explained: Matches non-whitespace characters.

Syntax: \b
Sample text: character classes
Explained: Matches a word boundary, the position between a \w character (letter, digit, or underscore) and a \W character (non-word character). It doesn’t match actual characters but positions like the start or end of words.

Syntax: \B
Sample text: character classes
Explained: Matches where \b does not, that is, any position that is not a word boundary, such as between two \w characters or between two \W characters.

Syntax: \Ac
Sample text: color colour
Explained: Matches the start of the string. The backslash \ escapes the normal meaning of A, turning it into a special positional anchor. Unlike ^, which matches the start of each line in multi-line mode, \A always matches the very beginning of the entire string.

Syntax: r\Z
Sample text: color colour
Explained: Matches the end of the string. The backslash \ escapes the normal meaning of Z, turning it into a special positional anchor. Unlike $, which matches the end of each line in multi-line mode, \Z matches only at the very end of the entire string. In Python, \Z also does not match before a trailing newline, whereas $ does.
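The set rules in the Sets table (ranges, a literal hyphen, special characters inside brackets, and negation) can be checked with a short sketch that reuses the table's sample strings. The variable names are illustrative, and results are joined into strings only to keep the output compact.

```python
import re

# Each character in the set is tried independently; the match is case-sensitive.
members = re.findall(r"[rEsz]", "Regular Expression")

# A range matches any character between its endpoints.
lowercase = "".join(re.findall(r"[a-z]", "1 Fig, 2 NewTons"))

# An escaped hyphen, or one placed first or last, is a literal character.
escaped = re.findall(r"[a\-z]", "a to z is not = A-Z")
trailing = re.findall(r"[a-]", "regular-expression")
leading = re.findall(r"[-a]", "regular-expression")

# Two ranges can be combined in one set.
alnum = "".join(re.findall(r"[a-z0-9]", "396 ExpressionS"))

# Special characters lose their meaning inside a set.
literals = re.findall(r"[(+*)]", "(valid) *expressions+words")

# A leading ^ negates the set.
negated = "".join(re.findall(r"[^ers]", "regular expression"))

print(members)    # ['r', 'E', 'r', 's', 's']
print(lowercase)  # igewons
print(escaped)    # ['a', 'z', '-']
print(trailing)   # ['a', '-']
print(alnum)      # 396xpression
print(literals)   # ['(', ')', '*', '+']
print(negated)    # gula xpion
```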

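The shorthand classes and positional anchors from the Character Classes table can be explored with the sketch below. The sample strings "1a2b3c", "Ch4racter_Class3s!", "color\ncolour", and "colour\n" are assumptions adapted from the table so that each rule has a visible effect; the variable names are illustrative.

```python
import re

digits = re.findall(r"\d", "1a2b3c")
non_digits = re.findall(r"\D", "1a2b3c")

# \w covers letters, digits, and the underscore; \W is everything else.
words = re.findall(r"\w+", "Ch4racter_Class3s!")
non_words = re.findall(r"\W", "Ch4racter_Class3s!")

whitespace = re.findall(r"\s", "a\tb\nc d")
chunks = re.findall(r"\S+", "character classes")

# \b requires a word boundary before c; \B requires that there is none.
at_boundary = re.findall(r"\bc", "character classes")
inside_word = re.findall(r"\Bc", "character classes")

# In multi-line mode ^ matches at every line start, \A only at the string start.
two_lines = "color\ncolour"
line_starts = re.findall(r"^c", two_lines, re.MULTILINE)
string_start = re.findall(r"\Ac", two_lines, re.MULTILINE)

# $ tolerates one trailing newline; \Z does not.
dollar = re.search(r"r$", "colour\n")
capital_z = re.search(r"r\Z", "colour\n")

print(digits, non_digits)        # ['1', '2', '3'] ['a', 'b', 'c']
print(words, non_words)          # ['Ch4racter_Class3s'] ['!']
print(whitespace)                # ['\t', '\n', ' ']
print(chunks)                    # ['character', 'classes']
print(at_boundary, inside_word)  # ['c', 'c'] ['c']
print(line_starts, string_start) # ['c', 'c'] ['c']
print(dollar is not None, capital_z is None)  # True True
```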


Popular Python re Module Functions

Syntax: re.findall(A, B)
Explained: Finds all non-overlapping matches of the pattern A in string B and returns them as a list. If no matches are found, it returns an empty list.

Syntax: re.search(A, B)
Explained: Searches string B for the first occurrence of the pattern A and returns a match object. If no match is found, it returns None.

Syntax: re.split(A, B)
Explained: Splits string B into a list at each occurrence of the pattern A. If no match is found, it returns the original string as a single-element list.

Syntax: re.sub(A, B, C)
Explained: Replaces all occurrences of the pattern A in string C with the string B and returns the modified string. The original string C remains unchanged.

Syntax: re.match(A, B)
Explained: Attempts to match the pattern A starting strictly at position 0 in string B. If the pattern doesn’t match at the start, it returns None. Unlike re.search(), it does not evaluate the rest of the string.

Note: The re module is a part of Python's standard library so it does not need to be installed. Run import re to access these functions.

Groups

Syntax: (ro)
Sample text: groups
Explained: Captures the substring ro as a group. Groups are denoted by parentheses () and can be accessed later for further processing.

Syntax: (?:cat)
Sample text: cat fish dog
Explained: A non-capturing group groups patterns without creating a capturing group. Use non-capturing groups when grouping is needed for logic but you don’t need to extract the group.

Syntax: cat(?=fish)
Sample text: catfish catdog
Explained: A positive lookahead asserts that the pattern fish must follow cat for a match. It checks the context after the current match without consuming it.

Syntax: (?<=cat)fish
Sample text: catfish dogfish
Explained: A positive lookbehind asserts that the pattern cat must precede fish for a match. It checks the context before the current match without consuming it.

Syntax: cat(?!fish)
Sample text: catfish catdog
Explained: A negative lookahead asserts that the pattern fish must not follow cat for a match. It checks the context after the current match without consuming it.

Syntax: (?<!cat)fish
Sample text: catfish dogfish
Explained: A negative lookbehind asserts that the pattern cat must not precede fish for a match. It checks the context before the current match without consuming it.

Syntax: (cat)\1
Sample text: catcat dogcat
Explained: The backreference construct \1 refers to the first captured group in the pattern. Subsequent groups can be referenced with \2, \3, and so on.
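Capturing groups, non-capturing groups, the four lookaround assertions, and the \1 backreference from the Groups table can be illustrated as follows. This is a sketch under assumptions: the combined sample string "catfish catdog dogfish" and the pattern suffix "ups" are not from the table, and match start positions are used only to show which occurrence was found.

```python
import re

# With one capturing group, findall returns the group, not the whole match.
captured = re.findall(r"(ro)ups", "groups")
# A non-capturing group leaves the whole match as the result.
uncaptured = re.findall(r"(?:ro)ups", "groups")

text = "catfish catdog dogfish"

def starts(pattern):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

ahead = starts(r"cat(?=fish)")        # cat followed by fish
not_ahead = starts(r"cat(?!fish)")    # cat not followed by fish
behind = starts(r"(?<=cat)fish")      # fish preceded by cat
not_behind = starts(r"(?<!cat)fish")  # fish not preceded by cat

# \1 must repeat exactly what group 1 captured.
doubled = re.search(r"(cat)\1", "catcat dogcat")

print(captured, uncaptured)  # ['ro'] ['roups']
print(ahead, not_ahead)      # [0] [8]
print(behind, not_behind)    # [3] [18]
print(doubled.group(0))      # catcat
```

Because lookarounds do not consume text, the matches for cat(?=fish) and (?<=cat)fish sit next to each other in "catfish" without overlapping.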

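The five functions in the Popular Python re Module Functions table can be compared side by side on one string. The sample string "cat, dog; fish" and the variable names are assumptions chosen for illustration.

```python
import re

text = "cat, dog; fish"

# findall returns every non-overlapping match as a list.
all_words = re.findall(r"\w+", text)

# search scans the whole string and stops at the first match.
found = re.search(r"dog", text)

# match only tries position 0, so "dog" is not found there.
at_start = re.match(r"dog", text)
at_start_ok = re.match(r"cat", text)

# split cuts the string at every match of the separator pattern.
parts = re.split(r"[,;]\s*", text)
unsplit = re.split(r"x", text)

# sub returns a new string; the original is left untouched.
replaced = re.sub(r"cat", "bird", text)

print(all_words)            # ['cat', 'dog', 'fish']
print(found.start())        # 5
print(at_start)             # None
print(at_start_ok.group())  # cat
print(parts)                # ['cat', 'dog', 'fish']
print(unsplit)              # ['cat, dog; fish']
print(replaced)             # bird, dog; fish
```

Since re.search() and re.match() return None when nothing matches, it is usually safer to test the result before calling methods such as .group() on it.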


Groups (continued)

Syntax: (?P<pet>cat)
Sample text: dog cat fish
Explained: The named group construct (?P<name>...) assigns a name to the captured group for easy reference later in the regex. The P in ?P stands for Python.

Syntax: (?P=pet)
Sample text: dog cat fish
Explained: The named group backreference (?P=name) matches the content previously captured by the named group name, so it must come after that group in the same pattern. In this example, it matches the word cat.

Syntax: (?#...)
Explained: The comment construct allows you to include comments in your regex. These comments are ignored by the regex engine and do not affect the match result.

Inline Flags

Syntax: (?aiLmsux)
Explained: The inline flag setting construct (?flags) applies one or more flags to modify the behavior of the regex pattern that follows it. Use (?flags:regex) for group matching with flags.

Syntax: (?a)\w+
Sample text: cat 123_ CAT
Explained: The ASCII-only flag a restricts shorthand character classes like \w, \W, \b, and \B to match only ASCII characters, excluding Unicode.

Syntax: (?i)cat
Sample text: cat Cat CAT CaT
Explained: The ignore case flag i makes the pattern case-insensitive, allowing matches regardless of capitalization.

Syntax: (?L)\w+
Sample text: straße cafe Éclair
Explained: The locale-dependent flag L makes shorthand character classes like \w locale-sensitive, allowing matches based on cultural or regional rules. In Python 3, this flag can be used only with bytes patterns, not with string patterns.

Syntax: (?m)^cat
Sample text: catdog catfish
Explained: The multi-line flag m makes ^ and $ match the start and end of each line, rather than the start and end of the entire string.

Syntax: (?s)cat.dog
Sample text: cat\ndog
Explained: The dot matches all flag s allows the . character to match newline characters in addition to all other characters.

Syntax: (?u)\w+
Sample text: naïve cat café
Explained: The Unicode flag u makes shorthand classes like \w, \W, \b, and \B match Unicode characters. Unlike the L flag, which applies locale-specific rules, the u flag uses Unicode rules to ensure consistent matching across languages. In Python 3, Unicode matching is already the default for string patterns, so the flag is redundant there.

Syntax: (?x)c a t
Sample text: cat
Explained: The verbose flag x enables extended formatting by allowing spaces and comments in the pattern for improved readability. Spaces are ignored unless escaped with a backslash.
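The effect of the inline flags can be seen by running the same pattern with and without each flag. This is a minimal sketch assuming Python 3 string patterns; the two-line string "catdog\ncatfish" and the escaped form of "naïve café" are assumptions made so the behavior is unambiguous, and the L flag is left out because it applies only to bytes patterns.

```python
import re

# (?i) ignores case for the whole pattern.
any_case = re.findall(r"(?i)cat", "cat Cat CAT CaT")

# (?i:...) limits the flag to the group it wraps.
scoped = re.findall(r"(?i:c)at", "cat Cat CAT")

# (?m) lets ^ match at the start of every line.
lines = "catdog\ncatfish"
first_only = re.findall(r"^cat", lines)
every_line = re.findall(r"(?m)^cat", lines)

# (?s) lets the dot match a newline.
without_s = re.search(r"cat.dog", "cat\ndog")
with_s = re.search(r"(?s)cat.dog", "cat\ndog")

# (?a) restricts \w to ASCII; the default for str patterns is Unicode.
accented = "na\u00efve caf\u00e9"  # "naïve café"
ascii_words = re.findall(r"(?a)\w+", accented)
unicode_words = re.findall(r"\w+", accented)

# (?x) ignores unescaped whitespace and trailing comments in the pattern.
verbose = re.findall(r"(?x) c a t   # spaces and this comment are ignored", "cat")

print(any_case)     # ['cat', 'Cat', 'CAT', 'CaT']
print(scoped)       # ['cat', 'Cat']
print(first_only, every_line)  # ['cat'] ['cat', 'cat']
print(without_s is None, with_s is not None)  # True True
print(ascii_words)  # ['na', 've', 'caf']
print(verbose)      # ['cat']
```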

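Named groups, named backreferences, and inline comments from the Groups table can be combined in a short sketch. The alternation cat|dog, the sample string "catdog dogdog", and the replacement template are assumptions added for illustration; \g<pet> is the standard way to refer to a named group inside a re.sub() replacement string.

```python
import re

# A named group can be read back by name from the match object.
named = re.search(r"(?P<pet>cat)", "dog cat fish")

# (?P=pet) must repeat exactly what the group named pet captured.
repeated = re.search(r"(?P<pet>cat|dog)(?P=pet)", "catdog dogdog")

# In a replacement string, \g<pet> inserts the named group's text.
bracketed = re.sub(r"(?P<pet>cat)", r"[\g<pet>]", "dog cat fish")

# (?#...) is skipped by the engine and does not change the match.
commented = re.findall(r"cat(?#the pet)fish", "catfish")

print(named.group("pet"), named.start("pet"))  # cat 4
print(named.groupdict())   # {'pet': 'cat'}
print(repeated.group(0))   # dogdog
print(bracketed)           # dog [cat] fish
print(commented)           # ['catfish']
```

In "catdog" the backreference fails because the text after cat is dog rather than a second cat, so the first successful match is "dogdog".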
