Regular Expressions Cheat Sheet
This cheat sheet provides a quick reference for essential regular expression
(RegEx) constructs, helping you perform text pattern matching and
manipulation with ease. It covers foundational syntax, such as character
classes, anchors, and quantifiers, alongside advanced features like groups,
lookaheads, and inline flags. Whether you're cleaning data, validating input,
or performing complex text searches, this cheat sheet ensures you can find
the right tools for the task.
Designed to make regex approachable and useful, this handy resource is
perfect for tackling challenges in text processing, data cleaning, and parsing.

Special Characters
^ $ . \ | + * ? {#} {#,#} {#,#}?

Sets
[rEsz] [a-z] [a\-z] [a-] [-a] [a-z0-9] [(+*)] [^ers]

Groups
cat(?=fish) (?<=cat)fish cat(?!fish) (?<!cat)fish

Sets

[rEsz]
Example text: Regular Expression
Square brackets [] define a set, where each character is matched
independently. A match occurs if any character from the set appears in the
text.

[a-z]
Example text: 1 Fig, 2 NewTons
The - in [m-n] is a range operator, matching any character from m to n.

[a\-z]
Example text: a to z is not = A-Z
The \ escapes the -, treating it as a literal character instead of a range
operator. This set matches a, z, and - only.

[-a]
Example text: regular-expression
As with [a-], matches a or -. A hyphen at the start or end of a set is
treated as a literal character.

[a-z0-9]
Example text: 396 ExpressionS
Matches characters from a to z and also from 0 to 9.

[(+*)]
Example text: (valid) *expressions+words
Special characters become literal inside a set, so this matches (, +, *,
and ).

[^ers]
Example text: regular expression
The ^ negates the set, matching any character not in the set. Here, it
matches characters that are not e, r, or s.

Character Classes and Anchors

\w
Example text: Ch 4 racter _ Class 3 s
Matches all alphanumeric characters (a-z, A-Z, and 0-9). It also matches the
underscore _.

\W
Example text: !@# $%^&*()
Matches any non-word character, which includes symbols, punctuation, and
spaces. Non-word characters are anything not in the set [a-zA-Z0-9_].

\d
Example text: 1a2b 3 c
Matches all digits 0-9.

\D
Example text: 1a2b 3 c
Matches any non-digits.

\b
Example text: character classes
Matches a word boundary, the position between a \w character (letter, digit,
or underscore) and a \W character (non-word character). It doesn't match
actual characters but positions like the start or end of words.

\B
Example text: character classes
Matches wherever \b does not, that is, at any position that is not a word
boundary, such as between two \w characters or between two \W characters.

\Ac
Example text: color colour
Matches the start of the string. The backslash \ escapes the normal meaning
of A, turning it into a special positional anchor. Unlike ^, which matches
the start of each line in multi-line mode, \A always matches the very
beginning of the entire string.

r\Z
Example text: color colour
Matches the end of the string. The backslash \ escapes the normal meaning of
Z, turning it into a special positional anchor. Unlike $, which matches the
end of each line in multi-line mode, \Z matches only at the very end of the
entire string. In Python, \Z does not match before a trailing newline,
whereas $ does.
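
The sets, character classes, and anchors above can be tried out with a short sketch like the following. It assumes Python's re module; the variable names and the extra sample strings ("cat concat cats", "color\n") are illustrative and not part of the cheat sheet.

```python
import re

# [a-z] matches lowercase letters only; digits, spaces, and capitals are skipped.
lower = re.findall(r"[a-z]", "1 Fig, 2 NewTons")

# [a\-z] matches only 'a', '-', and 'z' because the hyphen is escaped.
escaped = re.findall(r"[a\-z]", "a to z is not = A-Z")

# [^ers] matches every character that is not 'e', 'r', or 's'.
negated = "".join(re.findall(r"[^ers]", "regular expression"))

# \d matches digits; \D matches everything else.
digits = re.findall(r"\d", "1a2b3c")
non_digits = re.findall(r"\D", "1a2b3c")

# \b restricts the match to whole words (illustrative sample string).
whole_words = re.findall(r"\bcat\b", "cat concat cats")

# In multi-line mode ^ matches at every line start, \A only at the string start.
two_lines = "color\ncolour"
line_starts = re.findall(r"^c", two_lines, re.MULTILINE)
string_start = re.findall(r"\Ac", two_lines, re.MULTILINE)

# $ can match just before a trailing newline; \Z matches only at the very end.
dollar_match = re.search(r"r$", "color\n")
z_match = re.search(r"r\Z", "color\n")

print(lower, escaped, negated, digits, non_digits, whole_words)
print(line_starts, string_start, dollar_match is not None, z_match is None)
```

Raw strings (the r"..." prefix) are used so that backslashes reach the regex engine unchanged.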

re Module Functions

re.findall(A, B)
Finds all non-overlapping matches of the pattern A in string B and returns
them as a list. If no matches are found, it returns an empty list.

re.search(A, B)
Searches string B for the first occurrence of the pattern A and returns a
match object. If no match is found, it returns None.

re.split(A, B)
Splits string B into a list at each occurrence of the pattern A. If no match
is found, it returns the original string as a single-element list.

re.sub(A, B, C)
Replaces all occurrences of the pattern A in string C with the string B and
returns the modified string. The original string C remains unchanged.

re.match(A, B)
Attempts to match the pattern A starting strictly at position 0 in string B.
If the pattern doesn't match at the start, it returns None. Unlike
re.search(), it does not evaluate the rest of the string.

Note: The re module is a part of Python's standard library so it does not
need to be installed. Run import re to access these functions.

Groups

(ro)
Example text: groups
Captures the substring ro as a group. Groups are denoted by parentheses ()
and can be accessed later for further processing.

(?:cat)
Example text: cat fish dog
A non-capturing group groups patterns without creating a capturing group.
Use non-capturing groups when grouping is needed for logic but you don't
need to extract the group.

cat(?=fish)
Example text: catfish catdog
A positive lookahead asserts that the pattern fish must follow cat for a
match. It checks the context after the current match without consuming it.

(?<=cat)fish
Example text: catfish dogfish
A positive lookbehind asserts that the pattern cat must precede fish for a
match. It checks the context before the current match without consuming it.

cat(?!fish)
Example text: catfish catdog
A negative lookahead asserts that the pattern fish must not follow cat for a
match. It checks the context after the current match without consuming it.

(?<!cat)fish
Example text: catfish dogfish
A negative lookbehind asserts that the pattern cat must not precede fish for
a match. It checks the context before the current match without consuming
it.
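
As a rough illustration of the re functions and the group and lookaround constructs, the sketch below runs each of them on one sample string. The sample string, the replacement word "bird", and the variable names are assumptions made for this example only.

```python
import re

text = "catfish catdog dogfish"

# re.search finds the first match anywhere; re.match only tries position 0.
first_dog = re.search(r"dog", text)
dog_at_start = re.match(r"dog", text)      # None: the string starts with "cat"

# re.split cuts the string at each match of the pattern (here, whitespace).
words = re.split(r"\s+", text)

# re.sub returns a new string; the original text is left unchanged.
replaced = re.sub(r"cat", "bird", text)

# A capturing group can be read back from the match object.
captured = re.search(r"g(ro)ups", "groups").group(1)

# A non-capturing group groups without adding an entry to groups().
only_fish = re.search(r"(?:cat) (fish)", "cat fish dog").groups()

# Lookarounds check context without consuming it; record where matches start.
def starts(pattern):
    return [m.start() for m in re.finditer(pattern, text)]

cat_before_fish = starts(r"cat(?=fish)")       # "cat" followed by "fish"
cat_not_before_fish = starts(r"cat(?!fish)")   # "cat" not followed by "fish"
fish_after_cat = starts(r"(?<=cat)fish")       # "fish" preceded by "cat"
fish_not_after_cat = starts(r"(?<!cat)fish")   # "fish" not preceded by "cat"

print(first_dog.group(), dog_at_start, words, replaced, captured, only_fish)
print(cat_before_fish, cat_not_before_fish, fish_after_cat, fish_not_after_cat)
```

Collecting start positions rather than matched text makes it visible which occurrence of cat or fish each lookaround selected.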

(?P<pet>cat)
Example text: dog cat fish
The named group construct (?P<name>...) assigns a name to the captured group
for easy reference later in the regex. The P in ?P stands for Python.

(?P=pet)
Example text: dog cat fish
The named group backreference (?P=name) matches the content previously
captured by the named group name. In this example, it matches the word cat.

(?#...)
The comment construct allows you to include comments in your regex. These
comments are ignored by the regex engine and do not affect the match result.

Inline Flags

(?i)cat
Example text: cat Cat CAT CaT
The ignore case flag i makes the pattern case-insensitive, allowing matches
regardless of capitalization.

(?L)\w+
Example text: straße cafe Éclair
The locale-dependent flag L makes shorthand character classes like \w
locale-sensitive, allowing matches based on cultural or regional rules. In
Python 3, this flag is accepted only in bytes patterns.

(?m)^cat
Example text: catdog catfish
The multi-line flag m makes ^ and $ match the start and end of each line,
rather than the start and end of the entire string.

(?s)cat.dog
Example text: cat\ndog
The dot matches all flag s allows the . character to match newline
characters in addition to all other characters.

(?u)\w+
Example text: naïve cat café
The Unicode flag u makes shorthand classes like \w, \W, \b, and \B match
Unicode characters. Unlike the L
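
The named-group and inline-flag constructs can be exercised with a minimal sketch such as the one below. It assumes Python 3, where str patterns already match Unicode by default; the sample strings with a repeated word and with an embedded newline are illustrative additions.

```python
import re

# A named group is read back by name from the match object.
named = re.search(r"(?P<pet>cat)", "dog cat fish").group("pet")

# (?P=pet) must match the same text the group "pet" captured earlier.
repeated = re.search(r"(?P<pet>cat) (?P=pet)", "dog cat cat fish").group()

# (?#...) is ignored by the engine, so the pattern still matches plain "cat".
commented = re.findall(r"cat(?#the pet we are looking for)", "dog cat fish")

# (?i) makes the whole pattern case-insensitive.
any_case = re.findall(r"(?i)cat", "cat Cat CAT CaT")

# (?m) lets ^ match at the start of every line, not only the string start.
two_lines = "catdog\ncatfish"
with_m = re.findall(r"(?m)^cat", two_lines)
without_m = re.findall(r"^cat", two_lines)

# (?s) lets . match a newline; without it, . stops at line breaks.
with_s = re.search(r"(?s)cat.dog", "cat\ndog")
without_s = re.search(r"cat.dog", "cat\ndog")

# (?u) is redundant for str patterns in Python 3 but is still accepted.
unicode_words = re.findall(r"(?u)\w+", "na\u00efve cat caf\u00e9")

print(named, repeated, commented, any_case, with_m, without_m)
print(with_s is not None, without_s is None, unicode_words)
```

Inline flags written at the start of the pattern behave like passing the matching constant (re.I, re.M, re.S) as the flags argument.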