String Method Functions
Strings implement all of the common sequence operations, along with the additional
methods described below.
Strings also support two styles of string formatting, one providing a large degree of
flexibility and customization (see str.format(), Format String Syntax and Custom String
Formatting) and the other based on C printf style formatting that handles a narrower
range of types and is slightly harder to use correctly, but is often faster for the cases it can
handle (printf-style String Formatting).
The Text Processing Services section of the standard library covers a number of other
modules that provide various text related utilities (including regular expression support in
the re module).
str.capitalize()
Return a copy of the string with its first character capitalized and the rest lowercased.
Changed in version 3.8: The first character is now put into titlecase rather than
uppercase. This means that characters like digraphs will only have their first letter
capitalized, instead of the full character.
str.casefold()
Return a casefolded copy of the string. Casefolded strings may be used for caseless
matching.
The casefolding algorithm is described in section 3.13 ‘Default Case Folding’ of the
Unicode Standard.
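A minimal sketch of caseless matching with casefold(). The sample strings are illustrative and do not come from the text above; the German letter 'ß' is used because lowercasing leaves it unchanged while casefolding does not.

```python
# Lowercasing keeps 'ß' as it is, because 'ß' is already lowercase.
lowered = 'Stra\u00dfe'.lower()

# Casefolding is more aggressive and folds 'ß' to 'ss'.
folded = 'Stra\u00dfe'.casefold()

# Caseless comparison: casefold both sides before comparing.
same = 'STRASSE'.casefold() == 'Stra\u00dfe'.casefold()

print(lowered, folded, same)
```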
str.center(width[, fillchar])
Return the string centered in a string of length width. Padding is done using the specified fillchar
(default is an ASCII space). The original string is returned if width is less than or equal to
len(s). For example:
>>> 'Python'.center(10)
'  Python  '
>>> 'Python'.center(10, '-')
'--Python--'
>>> 'Python'.center(4)
'Python'
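str.count(sub[, start[, end]])
Return the number of non-overlapping occurrences of substring sub in the range [start,
end]. If sub is empty, returns the number of empty strings between characters, which is
the length of the string plus one.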
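A short sketch of str.count(), including the empty-substring case described above. The sample strings are illustrative.

```python
text = 'spam, spam, spam'

n_spam = text.count('spam')         # every non-overlapping match
n_from_5 = text.count('spam', 5)    # only matches starting at index 5 or later
n_overlap = 'aaaa'.count('aa')      # matches do not overlap
n_empty = 'spam'.count('')          # len('spam') + 1 empty strings

print(n_spam, n_from_5, n_overlap, n_empty)
```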
str.encode(encoding='utf-8', errors='strict')
Return the string encoded to bytes.
errors controls how encoding errors are handled. If 'strict' (the default), a
UnicodeError exception is raised. Other possible values are 'ignore', 'replace',
'xmlcharrefreplace', 'backslashreplace' and any other name registered via
codecs.register_error(). See Error Handlers for details.
For performance reasons, the value of errors is not checked for validity unless an
encoding error actually occurs, Python Development Mode is enabled or a debug build is
used.
Changed in version 3.9: The value of the errors argument is now checked in Python
Development Mode and in debug mode.
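A minimal sketch of how the errors argument changes the result of str.encode(). The sample text and the choice of the 'ascii' codec are illustrative assumptions, picked so that one character cannot be encoded.

```python
text = 'caf\u00e9'  # 'café'; the last character is not ASCII

utf8 = text.encode()  # encoding defaults to 'utf-8'

# The same text encoded to ASCII with different error handlers.
ascii_ignored = text.encode('ascii', errors='ignore')
ascii_replaced = text.encode('ascii', errors='replace')
ascii_xml = text.encode('ascii', errors='xmlcharrefreplace')
ascii_escaped = text.encode('ascii', errors='backslashreplace')

# With the default 'strict' handler a UnicodeError subclass is raised.
try:
    text.encode('ascii')
except UnicodeError:
    strict_failed = True
else:
    strict_failed = False

print(utf8, ascii_ignored, ascii_replaced, ascii_xml, ascii_escaped, strict_failed)
```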
str.endswith(suffix[, start[, end]])
Return True if the string ends with the specified suffix, otherwise return False. suffix can
also be a tuple of suffixes to look for. With optional start, test beginning at that position.
With optional end, stop comparing at that position. Using start and end is equivalent to
str[start:end].endswith(suffix). For example:
>>> 'Python'.endswith('on')
True
>>> 'a tuple of suffixes'.endswith(('at', 'in'))
False
>>> 'a tuple of suffixes'.endswith(('at', 'es'))
True
>>> 'Python is amazing'.endswith('is', 0, 9)
True
str.expandtabs(tabsize=8)
Return a copy of the string where all tab characters are replaced by one or more spaces,
depending on the current column and the given tab size. Tab positions occur every
tabsize characters (default is 8, giving tab positions at columns 0, 8, 16 and so on). To
expand the string, the current column is set to zero and the string is examined character
by character. If the character is a tab (\t), one or more space characters are inserted in
the result until the current column is equal to the next tab position. (The tab character
itself is not copied.) If the character is a newline (\n) or return (\r), it is copied and the
current column is reset to zero. Any other character is copied unchanged and the current
column is incremented by one regardless of how the character is represented when
printed. For example:
>>> '01\t012\t0123\t01234'.expandtabs()
'01      012     0123    01234'
>>> '01\t012\t0123\t01234'.expandtabs(4)
'01  012 0123    01234'
>>> print('01\t012\n0123\t01234'.expandtabs(4))
01  012
0123    01234
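str.find(sub[, start[, end]])
Return the lowest index in the string where substring sub is found within the slice
s[start:end]. Return -1 if sub is not found.
Note
The find() method should be used only if you need to know the position of sub. To check
whether sub is a substring, use the in operator.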
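A brief sketch contrasting the in operator with find(). The sample strings are illustrative.

```python
text = 'Python'

# Membership test: use "in" when only a yes/no answer is needed.
has_py = 'Py' in text
has_java = 'Java' in text

# Position lookup: use find() when the index matters; -1 means "not found".
pos = text.find('th')
missing = text.find('Java')

print(has_py, has_java, pos, missing)
```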
str.format(*args, **kwargs)
Perform a string formatting operation. The string on which this method is called can
contain literal text or replacement fields delimited by braces {}. Each replacement field
contains either the numeric index of a positional argument, or the name of a keyword
argument. Returns a copy of the string where each replacement field is replaced with the
string value of the corresponding argument.
See Format String Syntax for a description of the various formatting options that can be
specified in format strings.
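A minimal sketch of replacement fields addressed by position and by keyword. The template strings and values are illustrative.

```python
# Positional argument, selected by numeric index.
by_position = 'The sum of 1 + 2 is {0}'.format(1 + 2)

# Keyword arguments, selected by name.
by_name = '{name} was born in {country}'.format(name='Guido', country='the Netherlands')

# Both styles can be mixed, and a field may be used more than once.
mixed = '{0} and {other}, then {0} again'.format('first', other='second')

print(by_position)
print(by_name)
print(mixed)
```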
Note
When formatting a number (int, float, complex, decimal.Decimal and subclasses) with
the n type (ex: '{:n}'.format(1234)), the function temporarily sets the LC_CTYPE locale
to the LC_NUMERIC locale to decode decimal_point and thousands_sep fields of
localeconv() if they are non-ASCII or longer than 1 byte, and the LC_NUMERIC locale is
different than the LC_CTYPE locale. This temporary change affects other threads.
Changed in version 3.7: When formatting a number with the n type, the function temporarily
sets the LC_CTYPE locale to the LC_NUMERIC locale in some cases.
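str.format_map(mapping)
Similar to str.format(**mapping), except that mapping is used directly and not copied to
a dict. This is useful if, for example, mapping is a dict subclass:
>>> class Default(dict):
...     def __missing__(self, key):
...         return key
...
>>> '{name} was born in {country}'.format_map(Default(name='Guido'))
'Guido was born in country'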
str.isalnum()
Return True if all characters in the string are alphanumeric and there is at least one
character, False otherwise. A character c is alphanumeric if one of the following returns
True: c.isalpha(), c.isdecimal(), c.isdigit(), or c.isnumeric().
str.isalpha()
Return True if all characters in the string are alphabetic and there is at least one
character, False otherwise. Alphabetic characters are those characters defined in the
Unicode character database as “Letter”, i.e., those with general category property being
one of “Lm”, “Lt”, “Lu”, “Ll”, or “Lo”. Note that this is different from the Alphabetic property
defined in the section 4.10 ‘Letters, Alphabetic, and Ideographic’ of the Unicode
Standard.
str.isascii()
Return True if the string is empty or all characters in the string are ASCII, False
otherwise. ASCII characters have code points in the range U+0000-U+007F.
str.isdecimal()
Return True if all characters in the string are decimal characters and there is at least one
character, False otherwise. Decimal characters are those that can be used to form
numbers in base 10, e.g. U+0660, ARABIC-INDIC DIGIT ZERO. Formally a decimal
character is a character in the Unicode General Category “Nd”.
str.isdigit()
Return True if all characters in the string are digits and there is at least one character,
False otherwise. Digits include decimal characters and digits that need special handling,
such as the compatibility superscript digits. This covers digits which cannot be used to
form numbers in base 10, like the Kharosthi numbers. Formally, a digit is a character that
has the property value Numeric_Type=Digit or Numeric_Type=Decimal.
str.isidentifier()
Return True if the string is a valid identifier according to the language definition, section
Identifiers and keywords.
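A short sketch of isidentifier(). Reserved words such as def are syntactically valid identifiers, so this example also uses keyword.iskeyword() from the standard library to tell them apart; that function is not mentioned in the text above.

```python
from keyword import iskeyword

# A normal name: a valid identifier and not a reserved word.
hello = ('hello'.isidentifier(), iskeyword('hello'))

# A reserved word: still a valid identifier as far as isidentifier() is concerned.
reserved = ('def'.isidentifier(), iskeyword('def'))

# Identifiers cannot start with a digit or contain spaces.
starts_with_digit = '2abc'.isidentifier()
has_space = 'two words'.isidentifier()

print(hello, reserved, starts_with_digit, has_space)
```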
str.islower()
Return True if all cased characters [4] in the string are lowercase and there is at least one
cased character, False otherwise.
str.isnumeric()
Return True if all characters in the string are numeric characters, and there is at least one
character, False otherwise. Numeric characters include digit characters, and all
characters that have the Unicode numeric value property, e.g. U+2155, VULGAR
FRACTION ONE FIFTH. Formally, numeric characters are those with the property value
Numeric_Type=Digit, Numeric_Type=Decimal or Numeric_Type=Numeric.
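The three predicates isdecimal(), isdigit() and isnumeric() accept progressively larger sets of characters. The sketch below compares them on a few sample characters; U+0660 and U+2155 are the ones named above, while the ASCII digit and the superscript two are added here as illustrations.

```python
samples = {
    'ascii digit': '7',
    'arabic-indic zero': '\u0660',   # Numeric_Type=Decimal
    'superscript two': '\u00b2',     # Numeric_Type=Digit
    'one fifth': '\u2155',           # Numeric_Type=Numeric
}

# Map each sample name to (isdecimal, isdigit, isnumeric).
results = {
    name: (ch.isdecimal(), ch.isdigit(), ch.isnumeric())
    for name, ch in samples.items()
}

for name, flags in results.items():
    print(f'{name:18} {flags}')
```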
str.isprintable()
Return True if all characters in the string are printable, False if it contains at least one
non-printable character.
Here “printable” means the character is suitable for repr() to use in its output; “non-
printable” means that repr() on built-in types will hex-escape the character. It has no
bearing on the handling of strings written to sys.stdout or sys.stderr.
The printable characters are those which in the Unicode character database (see
unicodedata) have a general category in group Letter, Mark, Number, Punctuation, or
Symbol (L, M, N, P, or S); plus the ASCII space 0x20. Nonprintable characters are those
in group Separator or Other (Z or C), except the ASCII space.
str.isspace()
Return True if there are only whitespace characters in the string and there is at least one
character, False otherwise.
str.istitle()
Return True if the string is a titlecased string and there is at least one character, for
example uppercase characters may only follow uncased characters and lowercase
characters only cased ones. Return False otherwise.
str.isupper()
Return True if all cased characters [4] in the string are uppercase and there is at least
one cased character, False otherwise.
>>> 'BANANA'.isupper()
True
>>> 'banana'.isupper()
False
>>> 'baNana'.isupper()
False
>>> ' '.isupper()
False
str.join(iterable)
Return a string which is the concatenation of the strings in iterable. A TypeError will be
raised if there are any non-string values in iterable, including bytes objects. The
separator between elements is the string providing this method.
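A minimal sketch of join(), including the TypeError raised for non-string items. The sample lists are illustrative.

```python
words = ['alpha', 'beta', 'gamma']

# The string the method is called on is used as the separator.
csv_line = ', '.join(words)

numbers = [1, 2, 3]

# Non-string items are rejected ...
try:
    '-'.join(numbers)
except TypeError:
    join_failed = True
else:
    join_failed = False

# ... so convert them to strings first.
joined_numbers = '-'.join(str(n) for n in numbers)

print(csv_line, join_failed, joined_numbers)
```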
str.ljust(width[, fillchar])
Return the string left justified in a string of length width. Padding is done using the
specified fillchar (default is an ASCII space). The original string is returned if width is less
than or equal to len(s).
str.lower()
Return a copy of the string with all the cased characters [4] converted to lowercase.
The lowercasing algorithm used is described in section 3.13 ‘Default Case Folding’ of the
Unicode Standard.
str.lstrip([chars])
Return a copy of the string with leading characters removed. The chars argument is a
string specifying the set of characters to be removed. If omitted or None, the chars
argument defaults to removing whitespace. The chars argument is not a prefix; rather, all
combinations of its values are stripped.
See str.removeprefix() for a method that will remove a single prefix string rather than
all of a set of characters.
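A short sketch of the difference between stripping a set of leading characters and removing a single prefix. The sample strings are illustrative; str.removeprefix() requires Python 3.9 or later.

```python
# With no argument, leading whitespace is removed.
no_leading_space = '   spacious   '.lstrip()

# chars is a set of characters, not a prefix: any leading run of them is stripped.
stripped_set = 'www.example.com'.lstrip('cmowz.')

# The 't', 'h' and 'r' of 'three' are also in the set, so they are stripped too.
over_stripped = 'Arthur: three!'.lstrip('Arthur: ')

# removeprefix() (Python 3.9+) removes exactly one occurrence of the prefix.
prefix_removed = 'Arthur: three!'.removeprefix('Arthur: ')

print(repr(no_leading_space), stripped_set, over_stripped, prefix_removed)
```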
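static str.maketrans(x[, y[, z]])
This static method returns a translation table usable for str.translate().
If there is only one argument, it must be a dictionary mapping Unicode ordinals (integers)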
or characters (strings of length 1) to Unicode ordinals, strings (of arbitrary lengths) or
None. Character keys will then be converted to ordinals.
If there are two arguments, they must be strings of equal length, and in the resulting
dictionary, each character in x will be mapped to the character at the same position in y. If
there is a third argument, it must be a string, whose characters will be mapped to None in
the result.
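The one-, two- and three-argument forms described above can be sketched with str.maketrans() as follows. The mappings are illustrative.

```python
# One argument: a dict; character keys are converted to ordinals.
table_dict = str.maketrans({'a': '4', 'e': '3', 'x': None})

# Two arguments: equal-length strings mapped position by position.
table_xy = str.maketrans('abc', 'xyz')

# Three arguments: characters in the third string are mapped to None (deleted).
table_xyz = str.maketrans('abc', 'xyz', '!?')

leet = 'axe'.translate(table_dict)
swapped = 'aabbcc!?'.translate(table_xyz)

print(table_dict, table_xy, leet, swapped)
```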
str.partition(sep)
Split the string at the first occurrence of sep, and return a 3-tuple containing the part
before the separator, the separator itself, and the part after the separator. If the separator
is not found, return a 3-tuple containing the string itself, followed by two empty strings.
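str.removeprefix(prefix)
If the string starts with the prefix string, return string[len(prefix):]. Otherwise, return a
copy of the original string:
>>> 'TestHook'.removeprefix('Test')
'Hook'
>>> 'BaseTestCase'.removeprefix('Test')
'BaseTestCase'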
Added in version 3.9.
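str.removesuffix(suffix)
If the string ends with the suffix string and that suffix is not empty, return
string[:-len(suffix)]. Otherwise, return a copy of the original string:
>>> 'MiscTests'.removesuffix('Tests')
'Misc'
>>> 'TmpDirMixin'.removesuffix('Tests')
'TmpDirMixin'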
str.rjust(width[, fillchar])
Return the string right justified in a string of length width. Padding is done using the
specified fillchar (default is an ASCII space). The original string is returned if width is less
than or equal to len(s).
str.rpartition(sep)
Split the string at the last occurrence of sep, and return a 3-tuple containing the part
before the separator, the separator itself, and the part after the separator. If the separator
is not found, return a 3-tuple containing two empty strings, followed by the string itself.
str.rsplit(sep=None, maxsplit=-1)
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is
given, at most maxsplit splits are done, the rightmost ones. If sep is not specified or None,
any whitespace string is a separator. Except for splitting from the right, rsplit() behaves
like split() which is described in detail below.
str.rstrip([chars])
Return a copy of the string with trailing characters removed. The chars argument is a
string specifying the set of characters to be removed. If omitted or None, the chars
argument defaults to removing whitespace. The chars argument is not a suffix; rather, all
combinations of its values are stripped.
See str.removesuffix() for a method that will remove a single suffix string rather than
all of a set of characters.
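A short sketch of the difference between stripping a set of trailing characters and removing a single suffix. The sample strings are illustrative; str.removesuffix() requires Python 3.9 or later.

```python
# With no argument, trailing whitespace is removed.
no_trailing_space = '   spacious   '.rstrip()

# chars is a set of characters: any trailing run of 'i', 'p' or 'z' is stripped.
stripped_set = 'mississippi'.rstrip('ipz')

# Every trailing character found in ' Python' is stripped, not just the word.
over_stripped = 'Monty Python'.rstrip(' Python')

# removesuffix() (Python 3.9+) removes exactly one occurrence of the suffix.
suffix_removed = 'Monty Python'.removesuffix(' Python')

print(repr(no_trailing_space), stripped_set, over_stripped, suffix_removed)
```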
str.split(sep=None, maxsplit=-1)
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is
given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1
elements). If maxsplit is not specified or -1, then there is no limit on the number of splits
(all possible splits are made).
If sep is given, consecutive delimiters are not grouped together and are deemed to delimit
empty strings (for example, '1,,2'.split(',') returns ['1', '', '2']). The sep
argument may consist of multiple characters as a single delimiter (to split with multiple
delimiters, use re.split()). Splitting an empty string with a specified separator returns
[''].
For example:
>>> '1,2,3'.split(',')
['1', '2', '3']
>>> '1,2,3'.split(',', maxsplit=1)
['1', '2,3']
>>> '1,2,,3,'.split(',')
['1', '2', '', '3', '']
>>> '1<>2<>3<4'.split('<>')
['1', '2', '3<4']
If sep is not specified or is None and maxsplit is 0, only leading runs of consecutive
whitespace are considered.
For example:
>>> "".split(None, 0)
[]
>>> " ".split(None, 0)
[]
>>> " foo ".split(maxsplit=0)
['foo ']
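When sep is omitted or None, split() treats runs of consecutive whitespace as a single separator and produces no empty strings at either end. A minimal sketch with illustrative inputs, contrasted with an explicit separator:

```python
# No separator given: runs of whitespace count as one delimiter.
words = '   1   2   3   '.split()

# maxsplit still limits the number of splits.
first_and_rest = '1 2 3'.split(maxsplit=1)

# An empty or whitespace-only string yields an empty list.
from_empty = ''.split()
from_blank = '   '.split()

# With an explicit separator, consecutive delimiters produce empty strings.
explicit = '1  2'.split(' ')

print(words, first_and_rest, from_empty, from_blank, explicit)
```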
str.splitlines(keepends=False)
Return a list of the lines in the string, breaking at line boundaries. Line breaks are not
included in the resulting list unless keepends is given and true.
This method splits on the following line boundaries. In particular, the boundaries are a
superset of universal newlines.
Representation    Description
\n                Line Feed
\r                Carriage Return
Changed in version 3.2: \v and \f added to list of line boundaries.
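A minimal sketch of splitlines() on a string that mixes several line boundaries. The inputs are illustrative; besides \n and \r, the example relies on \r\n being treated as a single boundary and on \v and \f being boundaries (as noted for version 3.2).

```python
text = 'ab c\n\nde fg\rkl\r\n'

# Line breaks are dropped by default ...
lines = text.splitlines()

# ... and kept when keepends is true.
kept = text.splitlines(keepends=True)

# Vertical tab and form feed also end a line.
extra = 'a\vb\fc'.splitlines()

print(lines, kept, extra)
```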
Unlike split() when a delimiter string sep is given, this method returns an empty list for
the empty string, and a terminal line break does not result in an extra line:
>>> "".splitlines()
[]
>>> "One line\n".splitlines()
['One line']
>>> ''.split('\n')
['']
>>> 'Two lines\n'.split('\n')
['Two lines', '']
str.strip([chars])
Return a copy of the string with the leading and trailing characters removed. The chars
argument is a string specifying the set of characters to be removed. If omitted or None, the
chars argument defaults to removing whitespace. The chars argument is not a prefix or
suffix; rather, all combinations of its values are stripped.
The outermost leading and trailing chars argument values are stripped from the string.
Characters are removed from the leading end until reaching a string character that is not
contained in the set of characters in chars. A similar action takes place on the trailing end.
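A short sketch of strip() with and without a chars argument. The sample strings are illustrative. Note that characters from chars that appear in the interior of the string are left alone.

```python
# With no argument, whitespace is removed from both ends.
trimmed = '   spacious   '.strip()

# chars is a set: characters are removed from each end until one is not in the set.
domain = 'www.example.com'.strip('cmowz.')

# Interior '.' and '#' characters are untouched; only the outer runs are removed.
heading = '#....... Section 3.2.1 Issue #32 .......'.strip('.#! ')

print(trimmed, domain, heading)
```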
str.swapcase()
Return a copy of the string with uppercase characters converted to lowercase and vice
versa. Note that it is not necessarily true that s.swapcase().swapcase() == s.
str.title()
Return a titlecased version of the string where words start with an uppercase character
and the remaining characters are lowercase.
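The algorithm uses a simple language-independent definition of a word as groups of
consecutive letters. This works in many contexts, but it means that apostrophes in
contractions and possessives form word boundaries, which may not be the desired result.
The string.capwords() function does not have this problem, as it splits words on spaces
only.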
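A minimal sketch comparing title() with string.capwords() on text that contains apostrophes. The sample sentence is illustrative.

```python
import string

phrase = "they're bill's friends from the UK"

# title() starts a new word after every non-letter, including apostrophes.
titled = phrase.title()

# capwords() splits on whitespace only, so apostrophes stay inside words.
capped = string.capwords(phrase)

print(titled)
print(capped)
```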
A workaround for apostrophes can be constructed using regular expressions:
>>> import re
>>> def titlecase(s):
...     return re.sub(r"[A-Za-z]+('[A-Za-z]+)?",
...                   lambda mo: mo.group(0).capitalize(),
...                   s)
...
>>> titlecase("they're bill's friends.")
"They're Bill's Friends."
str.translate(table)
Return a copy of the string in which each character has been mapped through the given
translation table. The table must be an object that implements indexing via
__getitem__(), typically a mapping or sequence. When indexed by a Unicode ordinal
(an integer), the table object can do any of the following: return a Unicode ordinal or a
string, to map the character to one or more other characters; return None, to delete the
character from the return string; or raise a LookupError exception, to map the character
to itself.
See also the codecs module for a more flexible approach to custom character mappings.
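A minimal sketch of translate() with plain dictionaries keyed by Unicode ordinals. The tables are illustrative. A dict raises KeyError (a LookupError) for ordinals it does not contain, so unlisted characters are mapped to themselves.

```python
# Mapping an ordinal to None deletes the character.
delete_vowels = {ord(v): None for v in 'aeiou'}
without_vowels = 'translation table'.translate(delete_vowels)

# Mapping an ordinal to a string can expand one character into several.
expand = {ord('&'): ' and ', ord('\u00df'): 'ss'}
expanded = 'R&D Stra\u00dfe'.translate(expand)

print(without_vowels)
print(expanded)
```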
str.upper()
Return a copy of the string with all the cased characters [4] converted to uppercase. Note
that s.upper().isupper() might be False if s contains uncased characters or if the
Unicode category of the resulting character(s) is not “Lu” (Letter, uppercase), but e.g. “Lt”
(Letter, titlecase).
The uppercasing algorithm used is described in section 3.13 ‘Default Case Folding’ of the
Unicode Standard.
str.zfill(width)
Return a copy of the string left filled with ASCII '0' digits to make a string of length width.
A leading sign prefix ('+'/'-') is handled by inserting the padding after the sign character
rather than before. The original string is returned if width is less than or equal to len(s).
For example:
>>> "42".zfill(5)
'00042'
>>> "-42".zfill(5)
'-0042'
Formatted String Literals (f-strings)
Changed in version 3.7: await and async for can be used in expressions within f-strings.
Changed in version 3.12: Many restrictions on expressions within f-strings have been
removed. Notably, nested strings, comments, and backslashes are now permitted.
An f-string (formally a formatted string literal) is a string literal that is prefixed with f or F.
This type of string literal allows embedding arbitrary Python expressions within
replacement fields, which are delimited by curly brackets ({}). These expressions are
evaluated at runtime, similarly to str.format(), and are converted into regular str
objects.
A single opening curly bracket, '{', marks a replacement field that can contain any
Python expression. To include a literal { or }, use a double bracket:
>>> x = 42
>>> f'{{x}} is {x}'
'{x} is 42'
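A short sketch of replacement fields holding different kinds of expressions. The variable names and values are illustrative.

```python
name = 'World'
width = 3

# A plain variable reference.
greeting = f'Hello, {name}!'

# Arithmetic is evaluated at runtime.
arithmetic = f'{width} squared is {width ** 2}'

# Method and function calls are allowed as well.
calls = f'{name.upper()} has {len(name)} letters'

print(greeting)
print(arithmetic)
print(calls)
```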
To use an explicit conversion, use the ! (exclamation mark) operator, followed by any of
the valid formats, which are:
Conversion    Meaning
!a            ascii()
!r            repr()
!s            str()
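A minimal sketch of the three explicit conversions applied to the same value. The sample text is illustrative and contains one non-ASCII character so that !a differs from !r.

```python
text = 'caf\u00e9'  # 'café'

as_str = f'{text!s}'     # str(): the text itself
as_repr = f'{text!r}'    # repr(): quoted
as_ascii = f'{text!a}'   # ascii(): quoted, non-ASCII characters escaped

print(as_str, as_repr, as_ascii)
```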
While debugging it may be helpful to see both the expression and its value, by using the
equals sign (=) after the expression. This preserves spaces within the brackets, and can
be used with a converter. By default, the debugging operator uses the repr() (!r)
conversion.
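A short sketch of the = debugging specifier, which requires Python 3.8 or later. The Fraction value is an illustrative choice because its str() and repr() differ.

```python
from fractions import Fraction

calculation = Fraction(1, 3)

plain = f'{calculation=}'          # expression text followed by repr() of the value
spaced = f'{calculation = }'       # spaces around '=' are preserved
with_str = f'{calculation = !s}'   # an explicit converter overrides the default repr()

print(plain)
print(spaced)
print(with_str)
```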
Once the output has been evaluated, it can be formatted using a format specifier
following a colon (':'). After the expression has been evaluated, and possibly converted
to a string, the __format__() method of the result is called with the format specifier, or
the empty string if no format specifier is given. The formatted result is then used as the
final value for the replacement field.
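A minimal sketch of format specifiers after the colon. The values are illustrative. The last line shows that the specifier is passed to the object's own __format__() method, which for date objects interprets strftime-style codes.

```python
from datetime import date

price = 1234.5
ratio = 0.256
label = 'id'
released = date(1991, 2, 20)

fixed = f'{price:.2f}'                  # two digits after the decimal point
grouped = f'{price:,.1f}'               # thousands separator
percent = f'{ratio:.1%}'                # percentage
padded = f'{label:>6}|{label:<6}|'      # right- and left-aligned in six columns
iso_like = f'{released:%Y-%m-%d}'       # handled by date.__format__()

print(fixed, grouped, percent, padded, iso_like)
```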
printf-style String Formatting
Note
The formatting operations described here exhibit a variety of quirks that lead to a number
of common errors (such as failing to display tuples and dictionaries correctly). Using the
newer formatted string literals, the str.format() interface, or template strings may help
avoid these errors. Each of these alternatives provides its own trade-offs and benefits
of simplicity, flexibility, and/or extensibility.
String objects have one unique built-in operation: the % operator (modulo). This is also
known as the string formatting or interpolation operator. Given format % values (where
format is a string), % conversion specifications in format are replaced with zero or more
elements of values. The effect is similar to using the sprintf() function in the C
language.
If format requires a single argument, values may be a single non-tuple object. [5]
Otherwise, values must be a tuple with exactly the number of items specified by the
format string, or a single mapping object (for example, a dictionary).
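A minimal sketch of the % operator with a single value and with a tuple, including the tuple quirk mentioned in the note above. The format strings and values are illustrative.

```python
# One conversion: the right-hand side may be a single non-tuple object.
single = 'Hello, %s!' % 'World'

# Several conversions: the right-hand side must be a tuple of matching length.
several = '%s has %d quote types.' % ('Python', 2)

point = (1, 2)

# To format a tuple as ONE value, wrap it in a one-element tuple.
wrapped = 'point: %s' % (point,)

# Passing the tuple directly supplies two values for one conversion and fails.
try:
    'point: %s' % point
except TypeError:
    unwrapped_failed = True
else:
    unwrapped_failed = False

print(single, several, wrapped, unwrapped_failed)
```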
A conversion specifier contains two or more characters and has the following
components, which must occur in this order:
1. The '%' character, which marks the start of the specifier.
2. Mapping key (optional), consisting of a parenthesised sequence of characters (for
example, (somename)).
3. Conversion flags (optional), which affect the result of some conversion types.
4. Minimum field width (optional). If specified as an '*' (asterisk), the actual width is
read from the next element of the tuple in values, and the object to convert comes
after the minimum field width and optional precision.
5. Precision (optional), given as a '.' (dot) followed by the precision. If specified as
'*' (an asterisk), the actual precision is read from the next element of the tuple in
values, and the value to convert comes after the precision.
6. Length modifier (optional).
7. Conversion type.
When the right argument is a dictionary (or other mapping type), then the formats in the
string must include a parenthesised mapping key into that dictionary inserted immediately
after the '%' character. The mapping key selects the value to be formatted from the
mapping.
In this case no * specifiers may occur in a format (since they require a sequential
parameter list).
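A short sketch of mapping keys in a format string. The dictionary contents are illustrative. The second conversion also combines a mapping key with the '0' flag and a minimum field width of 3.

```python
values = {'language': 'Python', 'number': 2}

# Each %(key) selects its value from the mapping on the right-hand side.
sentence = '%(language)s has %(number)03d quote types.' % values

# A key may be used more than once, and unused keys are simply ignored.
repeated = '%(language)s, %(language)s!' % values

print(sentence)
print(repeated)
```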
The conversion flag characters are:
Flag    Meaning
'#'     The value conversion will use the “alternate form” (where defined below).
'0'     The conversion will be zero padded for numeric values.
'-'     The converted value is left adjusted (overrides the '0' conversion if both are given).
' '     (a space) A blank should be left before a positive number (or empty string) produced by a signed conversion.
'+'     A sign character ('+' or '-') will precede the conversion (overrides a “space” flag).
A length modifier (h, l, or L) may be present, but is ignored as it is not necessary for
Python – so e.g. %ld is identical to %d.
Conversion Meaning Notes
Notes:
1. The alternate form causes a leading octal specifier ('0o') to be inserted before the
first digit.
2. The alternate form causes a leading '0x' or '0X' (depending on whether the 'x' or
'X' format was used) to be inserted before the first digit.
3. The alternate form causes the result to always contain a decimal point, even if no
digits follow it.
The precision determines the number of digits after the decimal point and defaults
to 6.
4. The alternate form causes the result to always contain a decimal point, and trailing
zeroes are not removed as they would otherwise be.
The precision determines the number of significant digits before and after the
decimal point and defaults to 6.
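The four notes above can be illustrated with the '#' flag. The pairing of each note with the conversion types 'o', 'x'/'X', 'f' and 'g' is an assumption of this sketch, since the conversion table itself is not reproduced here; the values are illustrative.

```python
# Note 1: octal gains a leading '0o'.
octal_plain, octal_alt = '%o' % 8, '%#o' % 8

# Note 2: hexadecimal gains '0x' or '0X', matching the case of the conversion.
hex_lower, hex_upper = '%#x' % 255, '%#X' % 255

# Note 3: a decimal point is always present; precision counts digits after it.
float_default = '%f' % 1.5          # default precision of 6
float_plain, float_alt = '%.0f' % 3, '%#.0f' % 3

# Note 4: trailing zeros are kept; precision counts significant digits.
general_plain, general_alt = '%g' % 1.5, '%#g' % 1.5

print(octal_plain, octal_alt, hex_lower, hex_upper)
print(float_default, float_plain, float_alt, general_plain, general_alt)
```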