Search for a string in Python (Check if a substring is included/Get a substring position)
Modified: 2023-05-07 | Tags: Python, String, Regex
This article explains how to check whether a string contains a
specific substring and how to get the position of that substring in
Python. The re module in the standard library enables more flexible
searches with regular expressions.
Contents
Check if a string contains a given substring: in
Get the position (index) of a given substring: find(), rfind()
Case-insensitive search
Check and get a position with regex: re.search()
Get all results with regex: re.findall(), re.finditer()
Search multiple strings with regex
Use special characters and sequences
Case-insensitive search with regex: re.IGNORECASE
See the following article on how to count specific characters or
substrings in a string.
Count characters and strings in Python
See the following articles on how to extract, replace, and compare
strings.
Extract a substring from a string in Python (position, regex)
Replace strings in Python (replace, translate, re.sub, re.subn)
String comparison in Python (exact/partial match, etc.)
If you want to search the contents of a text file, read the file as a
string.
Read, write, and create files in Python (with and open())
Check if a string contains a given substring: in
Use the in operator to check if a string contains a given substring.
The in operator is case-sensitive, and the same applies to the
string methods described below. You can check for the presence of
multiple substrings using and and or.
Boolean operators in Python (and, or, not)
s = 'I am Sam'
print('Sam' in s)
# True
print('sam' in s)
# False
print('I' in s and 'Sam' in s)
# True
source: str_in_find_rfind.py
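When there are many substrings to check, chaining and and or becomes unwieldy. As a minimal sketch, the built-in any() and all() functions can express the same checks over a list; the keywords list here is a hypothetical example, not part of the original sample code.

```python
s = 'I am Sam'
keywords = ['Sam', 'Tom', 'am']  # hypothetical search terms

# True if at least one keyword is contained in s (like chaining with or)
contains_any = any(k in s for k in keywords)

# True only if every keyword is contained in s (like chaining with and)
contains_all = all(k in s for k in keywords)

print(contains_any)
# True
print(contains_all)
# False
```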
For more complex operations, consider using regular expressions,
as described in the following sections.
Note that the in operator can also be used for lists, tuples, and
dictionaries. See the following article for details.
The in operator in Python (for list, string, dictionary, etc.)
Get the position (index) of a given substring: find(), rfind()
You can get the position of a given substring in the string with
the find() method of str.
Built-in Types - str.find() — Python 3.11.3 documentation
If the substring specified as the first argument is found, the method
returns its starting position (the position of the first character); if
not found, -1 is returned.
s = 'I am Sam'
print(s.find('Sam'))
# 5
print(s.find('XXX'))
# -1
source: str_in_find_rfind.py
In Python, the index of the first character in a string is 0.
I am Sam
01234567
If there are multiple occurrences of the substring, the position of
the first occurrence (the leftmost substring) is returned.
To find all occurrences, you can call find() repeatedly while
adjusting the start argument; however, using the regex approach
described below is more convenient.
print(s.find('am'))
# 2
source: str_in_find_rfind.py
By specifying the second argument start and the third
argument end, the search will be limited to the range of the
slice [start:end].
How to slice a list, string, tuple in Python
print(s.find('am', 3))
# 6
print(s.find('am', 3, 5))
# -1
source: str_in_find_rfind.py
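The start argument also makes it possible to collect every occurrence with find() alone. The following is a minimal sketch; find_all() is a hypothetical helper name, and it assumes that only non-overlapping occurrences are wanted.

```python
def find_all(s, sub):
    """Return the start positions of all non-overlapping occurrences of sub in s."""
    if not sub:
        raise ValueError('sub must not be empty')
    positions = []
    start = 0
    while True:
        pos = s.find(sub, start)
        if pos == -1:  # no more occurrences
            break
        positions.append(pos)
        start = pos + len(sub)  # continue searching after this occurrence
    return positions

print(find_all('I am Sam', 'am'))
# [2, 6]
```

To include overlapping occurrences as well, advance with start = pos + 1 instead of pos + len(sub).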
The rfind() method searches the string starting from the right side.
Built-in Types - str.rfind() — Python 3.11.3 documentation
If the substring occurs multiple times, the position of the
rightmost occurrence is returned. As with find(), you can also
specify start and end arguments for the rfind() method.
print(s.rfind('am'))
# 6
print(s.rfind('XXX'))
# -1
print(s.rfind('am', 2))
# 6
print(s.rfind('am', 2, 5))
# 2
source: str_in_find_rfind.py
The index() and rindex() methods work like find() and rfind(),
except for how they handle a missing substring: find() and rfind()
return -1, whereas index() and rindex() raise a ValueError.
Built-in Types - str.index() — Python 3.11.3 documentation
Built-in Types - str.rindex() — Python 3.11.3 documentation
print(s.index('am'))
# 2
# print(s.index('XXX'))
# ValueError: substring not found
print(s.rindex('am'))
# 6
# print(s.rindex('XXX'))
# ValueError: substring not found
source: str_in_find_rfind.py
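Since index() raises an exception rather than returning a sentinel value, it is typically wrapped in try and except. A minimal sketch follows; position_or_none() is a hypothetical helper, and returning None for a missing substring is just one possible design.

```python
def position_or_none(s, sub):
    """Return the position of sub in s, or None if sub is not found."""
    try:
        return s.index(sub)
    except ValueError:  # raised by index() when sub is not in s
        return None

s = 'I am Sam'
print(position_or_none(s, 'am'))
# 2
print(position_or_none(s, 'XXX'))
# None
```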
Case-insensitive search
Note that the in operator and the string methods mentioned so far
are case-sensitive.
For case-insensitive searches, you can convert both the search
string and target string to uppercase or lowercase. Use
the upper() method to convert a string to uppercase, and
the lower() method to convert it to lowercase.
Uppercase and lowercase strings in Python (conversion and
checking)
s = 'I am Sam'
print(s.upper())
# I AM SAM
print(s.lower())
# i am sam
print('sam' in s)
# False
print('sam' in s.lower())
# True
print(s.find('sam'))
# -1
print(s.lower().find('sam'))
# 5
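The conversion can be wrapped in small helper functions so that both strings are always lowercased consistently. This is only a sketch; contains_ignore_case() and find_ignore_case() are hypothetical names.

```python
def contains_ignore_case(s, sub):
    """Check whether s contains sub, ignoring case."""
    return sub.lower() in s.lower()

def find_ignore_case(s, sub):
    """Return the position of sub in s, ignoring case, or -1 if not found."""
    return s.lower().find(sub.lower())

s = 'I am Sam'
print(contains_ignore_case(s, 'SAM'))
# True
print(find_ignore_case(s, 'SAM'))
# 5
print(find_ignore_case(s, 'xxx'))
# -1
```

For text beyond ASCII, the casefold() method of str applies a more aggressive conversion than lower() and may be a better fit for caseless comparison.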
Check and get a position with regex: re.search()
Use regular expressions with the re module of the standard library.
Regular expressions with the re module in Python
Use re.search() to check whether a string contains a match for a
regex pattern.
The first argument is a regex pattern, and the second is a target
string. Although special characters and sequences can be used in
the regex pattern, the following example demonstrates the simplest
pattern by using the string as it is.
If the pattern matches, a match object is returned;
otherwise, None is returned.
import re
s = 'I am Sam'
print(re.search('Sam', s))
# <re.Match object; span=(5, 8), match='Sam'>
print(re.search('XXX', s))
# None
source: str_search_regex.py
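Using a string as it is works only when it contains no regex special characters. If the search string may include characters such as $ or ., one option is to escape it with re.escape() first. The sample string below is a hypothetical example.

```python
import re

s = 'Price: $5.00'

# '$' and '.' are special characters in regex, so the raw string does not match
print(re.search('$5.00', s))
# None

# re.escape() escapes the special characters so the string is matched literally
m = re.search(re.escape('$5.00'), s)
print(m)
# <re.Match object; span=(7, 12), match='$5.00'>
```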
You can get various information with the methods of the match
object.
How to use regex match objects in Python
group() returns the matched string, start() returns the start
position, end() returns the end position, and span() returns a tuple
of (start position, end position).
m = re.search('Sam', s)
print(m.group())
# Sam
print(m.start())
# 5
print(m.end())
# 8
print(m.span())
# (5, 8)
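Because re.search() returns None when nothing matches, calling a method such as group() on the result without checking raises an AttributeError. The following sketch guards against that case; first_position() is a hypothetical helper name.

```python
import re

def first_position(pattern, s):
    """Return the start index of the first match, or None if there is no match."""
    m = re.search(pattern, s)
    if m is None:  # no match: avoid calling methods on None
        return None
    return m.start()

s = 'I am Sam'
print(first_position('Sam', s))
# 5
print(first_position('XXX', s))
# None
```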
Get all results with regex: re.findall(), re.finditer()
re.search() returns only the first match object, even if there are
multiple matching occurrences in the string.
s = 'I am Sam'
print(re.search('am', s))
# <re.Match object; span=(2, 4), match='am'>
re.findall() returns all matching parts as a list of strings (when
the pattern contains no capturing groups).
print(re.findall('am', s))
# ['am', 'am']
To get the positions of all matching parts, use re.finditer() along
with list comprehensions.
List comprehensions in Python
print([m.span() for m in re.finditer('am', s)])
# [(2, 4), (6, 8)]
In the above example, span() is used so that a list of tuples, (start
position, end position), is returned. If you want to get a list of only
start or end positions, use start() or end().
Note that re.finditer() returns an iterator yielding match objects
over all matches.
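As a short illustration of the points above, the following sketch collects only the start or end positions, and also loops over the iterator directly instead of building a list.

```python
import re

s = 'I am Sam'

# Lists of only the start positions and only the end positions
starts = [m.start() for m in re.finditer('am', s)]
ends = [m.end() for m in re.finditer('am', s)]
print(starts)
# [2, 6]
print(ends)
# [4, 8]

# The iterator can also be consumed directly in a for loop
for m in re.finditer('am', s):
    print(m.group(), m.span())
# am (2, 4)
# am (6, 8)
```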
Search multiple strings with regex
Even if you do not have much experience with regular expressions,
it is helpful to know the | symbol.
If the regex pattern is A|B, it matches A or B. A and B can be plain
strings or patterns containing special characters and sequences. To
search for three or more alternatives, chain them as in A|B|C.
You can search for multiple strings as follows.
s = 'I am Sam Adams'
print(re.findall('Sam|Adams', s))
# ['Sam', 'Adams']
print([m.span() for m in re.finditer('Sam|Adams', s)])
# [(5, 8), (9, 14)]
source: str_search_regex.py
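When the search strings are held in a list, the pattern can be assembled with join(). The sketch below also applies re.escape() to each string so that special characters are treated literally; the words list and the sample string are hypothetical.

```python
import re

words = ['Sam', 'Adams', 'C++']  # hypothetical search strings

# Escape each string, then join them with | to form the pattern A|B|C
pattern = '|'.join(re.escape(w) for w in words)

s = 'I am Sam Adams, and I like C++'
print(re.findall(pattern, s))
# ['Sam', 'Adams', 'C++']
```

Alternatives are tried from left to right, so if one search string is a prefix of another (for example, Sam and Samantha), placing the longer one first lets it take precedence.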
Use special characters and sequences
Using special characters and sequences in regex patterns allows
for more complex searches.
s = 'I am Sam Adams'
print(re.findall('am', s))
# ['am', 'am', 'am']
print(re.findall('[a-zA-Z]+am[a-z]*', s))
# ['Sam', 'Adams']
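As one more hedged illustration, the special sequence \b matches a word boundary, which makes it possible to match am only as a whole word. Raw strings (r'...') are used here so that the backslash is passed to the regex engine unchanged.

```python
import re

s = 'I am Sam Adams'

# \b restricts the match to 'am' as a standalone word
whole_word = re.findall(r'\bam\b', s)
print(whole_word)
# ['am']

# Words that start with an uppercase letter followed by lowercase letters
capitalized = re.findall(r'\b[A-Z][a-z]+\b', s)
print(capitalized)
# ['Sam', 'Adams']
```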
See the following article for basic examples of utilizing regex
patterns, such as wildcard-like patterns.
Extract a substring from a string in Python (position, regex)
Case-insensitive search with regex: re.IGNORECASE
You can specify re.IGNORECASE as the flags argument of functions
such as re.search() and re.findall() to perform a case-insensitive
search.
s = 'I am Sam'
print(re.search('sam', s))
# None
print(re.search('sam', s, flags=re.IGNORECASE))
# <re.Match object; span=(5, 8), match='Sam'>
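The same flag works with re.findall() and re.finditer(), and re.I is available as a shorter alias of re.IGNORECASE. A brief sketch:

```python
import re

s = 'I am Sam'

# Matched parts are returned with their original case
matched = re.findall('sam', s, flags=re.IGNORECASE)
print(matched)
# ['Sam']

# re.I is an alias of re.IGNORECASE
spans = [m.span() for m in re.finditer('AM', s, flags=re.I)]
print(spans)
# [(2, 4), (6, 8)]
```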