Strings in Python
What you already know
● A string is a sequence of Unicode characters.
● Can be created by enclosing characters inside single, double or triple quotes.
● Triple quoted strings can span multiple lines.
● Single- or double-quoted strings can span multiple lines using implicit or explicit line
continuation.
● Indices start with 0 and end with one less than the length of the string.
● len() function returns the length of the string.
● Strings can be concatenated with the + operator and repeated any number of times with the *
operator.
But did you also know?
● The multiplier operand n must be an integer. It can therefore also be zero or negative, in which case the result is an empty string.
● What will be the output of -
● Python also provides negative indexing.
● The index of -1 refers to the last item, -2 to the second last item and so on.
● Trying to access a character outside the valid index range raises an IndexError.
● The index must be an integer. Using a float or any other type results in a TypeError.
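A minimal sketch of the points above. The sample string and variable names are illustrative and not taken from the slides.

```python
s = "spam"  # arbitrary sample string

# Repetition: zero or a negative multiplier gives an empty string.
repeated = s * 2           # 'spamspam'
empty_zero = s * 0         # ''
empty_negative = s * -3    # ''

# Negative indices count from the right-hand end.
last = s[-1]               # 'm'
second_last = s[-2]        # 'a'

# An index outside the valid range raises IndexError.
try:
    s[10]
except IndexError as exc:
    index_error = type(exc).__name__

# A float index raises TypeError, even if its value is whole.
position = 1.0
try:
    s[position]
except TypeError as exc:
    type_error = type(exc).__name__

print(repeated, repr(empty_zero), repr(empty_negative))
print(last, second_last, index_error, type_error)
```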
The in Operator
● This membership operator can also be used with strings.
● It returns True if the first operand is contained within the second, and False otherwise.
● The not in operator does the opposite.
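A short sketch of both membership operators, using an arbitrary sample string:

```python
s = "foo bar baz"  # arbitrary sample string

found = "bar" in s          # True: 'bar' is a substring of s
missing = "qux" in s        # False: 'qux' does not occur in s
absent = "qux" not in s     # True: the opposite of the line above

print(found, missing, absent)
```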
Built-in String Functions
ord()
● Returns an integer value for the given character.
● ord() function will return numeric values for Unicode characters as well.
chr()
● Returns a character value for the given integer.
● Does the reverse of ord().
● chr() handles Unicode characters as well.
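A small sketch of ord() and chr() as inverses of each other. The characters chosen here are only examples.

```python
# ord() maps a single character to its integer code point.
code_a = ord("a")          # 97
code_euro = ord("€")       # 8364, a non-ASCII Unicode character

# chr() maps an integer code point back to a character.
char_a = chr(97)           # 'a'
char_euro = chr(8364)      # '€'

# Applying one after the other gives back the starting value.
round_trip = chr(ord("Z")) # 'Z'

print(code_a, code_euro, char_a, char_euro, round_trip)
```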
str()
● Returns a string representation of an object.
● The repr() built-in function is similar, but it returns a representation aimed at developers. For a string argument, the result includes the quotes.
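A brief sketch of str() next to repr(), with arbitrary sample values. For many objects the two results match, but for a string argument repr() keeps the quotes.

```python
number_text = str(49.2)       # '49.2'
list_text = str([1, 2, 3])    # '[1, 2, 3]'

plain = str("foo")            # 'foo' (3 characters)
quoted = repr("foo")          # "'foo'" (5 characters, quotes included)

print(number_text, list_text, plain, quoted)
```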
String Slicing
● A form of indexing syntax that extracts substrings from a string.
● s[m:n] returns the portion of string s starting with position m, and up to but not including
position n.
● s[m:n] will return a substring that is n - m characters in length.
● If you omit the first index, the slice starts at the beginning of the string.
Thus, s[:m] and s[0:m] are equivalent.
● If you omit the second index as in s[n:], the slice extends from the first
index through the end of the string.
● For any string s and any integer n (0 ≤ n ≤ len(s)), s[:n] + s[n:] will be equal to s.
● Omitting both indices returns the original string, in its entirety.
● It’s not a copy, it’s a reference to the original string.
● If the first index in a slice is greater than or equal to the second index,
Python returns an empty string.
● Negative indices can be used with slicing as well.
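The slicing rules above can be sketched as follows. The string 'foobar' is just a convenient sample.

```python
s = "foobar"  # indices: f=0 o=1 o=2 b=3 a=4 r=5

middle = s[2:5]            # 'oba': positions 2, 3, 4 (5 is excluded)
head = s[:4]               # 'foob', same as s[0:4]
tail = s[2:]               # 'obar': from position 2 to the end
rejoined = s[:4] + s[4:]   # 'foobar': the two halves always rebuild s
whole = s[:]               # 'foobar': both indices omitted

empty_equal = s[2:2]       # '': first index equals the second
empty_greater = s[4:2]     # '': first index is greater than the second

negative = s[-5:-2]        # 'oob', same as s[1:4]

print(middle, head, tail, rejoined, whole, negative)
```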
Specifying a stride in a slice
● Adding an additional : and a third index designates a stride (also called a step).
● Indicates how many characters to jump after retrieving each character in the
slice.
● Can specify a negative stride value as well, in which case Python steps
backward through the string. In that case, the starting/first index should be
greater than the ending/second index.
● When you are stepping backward, if the first and second indices are omitted, the defaults are
reversed: the slice starts at the end of the string and runs back to the beginning.
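A short sketch of positive and negative strides, again on an arbitrary sample string:

```python
s = "foobar"  # indices: f=0 o=1 o=2 b=3 a=4 r=5

every_second = s[0:6:2]    # 'foa': positions 0, 2, 4
from_one = s[1:6:2]        # 'obr': positions 1, 3, 5

# Negative stride: the first index should be greater than the second.
backward = s[5:0:-2]       # 'rbo': positions 5, 3, 1
wrong_order = s[0:5:-2]    # '': nothing lies between 0 and 5 going backward

# With both indices omitted the defaults are reversed, so this reverses s.
reversed_s = s[::-1]       # 'raboof'

print(every_second, from_one, backward, repr(wrong_order), reversed_s)
```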
Modifying strings. But can you?
● No, you can’t. But you already knew this. Hopefully.
● Strings are immutable.
● But you can easily accomplish what you want by generating a copy of the original string that has
the desired change in place.
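A minimal sketch of what immutability means in practice, and two ways to build a modified copy instead. The sample string is arbitrary.

```python
s = "foobar"

# Assigning to an index fails because strings are immutable.
try:
    s[3] = "x"
except TypeError as exc:
    error_name = type(exc).__name__

# Build a new string with the change in place instead.
by_slicing = s[:3] + "x" + s[4:]    # 'fooxar'
by_replace = s.replace("b", "x")    # 'fooxar'

print(error_name, by_slicing, by_replace, s)  # s itself is unchanged
```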
Built-in String Methods
Case Conversion
Methods in this group perform case
conversion on the target string
1. s.capitalize()
➔ Capitalizes the target string.
➔ Returns a copy of s with the first character converted to
uppercase and all other characters converted to lowercase.
➔ Non-alphabetic characters are unchanged.
2. s.lower()
➔ Returns a copy of s with all alphabetic characters converted to lowercase.
3. s.upper()
➔ Returns a copy of s with all alphabetic characters converted to uppercase.
4. s.swapcase()
➔ Returns a copy of s with uppercase alphabetic characters converted to lowercase and vice
versa.
5. s.title()
➔ Returns a copy of s in which the first letter of each word is converted to uppercase and
remaining letters are lowercase.
➔ It does not attempt to distinguish between important and unimportant words, and it does
not handle apostrophes, possessives, or acronyms gracefully.
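A quick sketch of the five case-conversion methods on an arbitrary mixed-case string, followed by a case where title() handles an apostrophe and an acronym poorly.

```python
s = "foO BaR BAZ quX"  # arbitrary mixed-case sample

capitalized = s.capitalize()   # 'Foo bar baz qux'
lowered = s.lower()            # 'foo bar baz qux'
uppered = s.upper()            # 'FOO BAR BAZ QUX'
swapped = s.swapcase()         # 'FOo bAr baz QUx'
titled = s.title()             # 'Foo Bar Baz Qux'

# title() treats the apostrophe as a word boundary and lowercases 'UK'.
awkward = "they're bill's friends from the UK".title()
# "They'Re Bill'S Friends From The Uk"

print(capitalized, lowered, uppered, swapped, titled, awkward, sep="\n")
```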
Find
Methods in this group provide various
means of searching the target string
for a specified substring
1. s.count(<sub>[, <start>[, <end>]])
➔ Returns the number of non-overlapping occurrences of substring <sub> in s.
➔ The count is restricted to the number of occurrences within the substring indicated by
<start> and <end>, if they are specified.
2. s.find(<sub>[, <start>[, <end>]])
➔ Returns the lowest index in s where substring <sub> is found.
➔ Returns -1 if the specified substring is not found.
➔ The search is restricted to the substring indicated by <start> and <end>, if they are
specified.
3. s.rfind(<sub>[, <start>[, <end>]])
➔ Returns the highest index in s where substring <sub> is found.
➔ If the substring is not found, -1 is returned.
➔ The search is restricted to the substring indicated by <start> and <end>, if they are
specified.
4. s.startswith(<sub>[, <start>[, <end>]])
➔ Returns True if s starts with the specified <sub> and False otherwise.
➔ The comparison is restricted to the substring indicated by <start> and <end>, if they
are specified.
5. s.endswith(<sub>[, <start>[, <end>]])
➔ Returns True if s ends with the specified <sub> and False otherwise.
➔ The comparison is restricted to the substring indicated by <start> and <end>, if they
are specified.
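The search methods above can be sketched like this. The sample string is arbitrary, and the start and end arguments behave like slice indices.

```python
s = "foo goo moo"  # 'oo' occurs at indices 1, 5 and 9

total = s.count("oo")                 # 3
limited = s.count("oo", 0, 8)         # 2: only 'foo goo ' is searched
non_overlapping = "aaaa".count("aa")  # 2, not 3

first = s.find("oo")                  # 1
first_from_4 = s.find("oo", 4)        # 5
not_found = s.find("xx")              # -1
last = s.rfind("oo")                  # 9

starts = s.startswith("foo")          # True
starts_at_4 = s.startswith("goo", 4)  # True: comparison begins at index 4
ends = s.endswith("moo")              # True
ends_before_7 = s.endswith("goo", 0, 7)  # True: only 'foo goo' is compared

print(total, limited, first, first_from_4, not_found, last)
```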
Character Classification
Methods in this group classify a string
based on the characters it contains
1. s.isalnum()
➔ Returns True if s is nonempty and all its characters are alphanumeric (either a letter or a
number), and False otherwise.
2. s.isalpha()
➔ Returns True if s is nonempty and all its characters are alphabetic, and False otherwise.
3. s.isdigit()
➔ Returns True if s is nonempty and all its characters are numeric digits, and False
otherwise.
4. s.islower()
➔ Returns True if s is nonempty and all the alphabetic characters it contains are lowercase,
and False otherwise.
➔ Non-alphabetic characters are ignored.
5. s.isupper()
➔ Returns True if s is nonempty and all the alphabetic characters it contains are uppercase,
and False otherwise.
➔ Non-alphabetic characters are ignored.
6. s.isspace()
➔ Returns True if s is nonempty and all characters are whitespace characters, and False
otherwise.
➔ The most common whitespace characters are space ' ', tab '\t', and newline '\n'.
➔ However, there are a few other ASCII and Unicode characters that qualify as whitespace.
7. s.istitle()
➔ Returns True if s is nonempty, the first alphabetic character of each word is uppercase,
and all other alphabetic characters in each word are lowercase.
➔ It returns False otherwise.
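A compact sketch of the classification methods. Each key is an illustrative call written out as text, paired with its actual result.

```python
checks = {
    "'abc123'.isalnum()": "abc123".isalnum(),          # True
    "'abc$123'.isalnum()": "abc$123".isalnum(),        # False: '$' is not alphanumeric
    "''.isalnum()": "".isalnum(),                      # False: empty string
    "'ABCabc'.isalpha()": "ABCabc".isalpha(),          # True
    "'abc123'.isalpha()": "abc123".isalpha(),          # False
    "'123'.isdigit()": "123".isdigit(),                # True
    "'12.3'.isdigit()": "12.3".isdigit(),              # False: '.' is not a digit
    "'abc1$d'.islower()": "abc1$d".islower(),          # True: '1' and '$' are ignored
    "'ABC1$D'.isupper()": "ABC1$D".isupper(),          # True
    "' \\t\\n'.isspace()": " \t\n".isspace(),          # True
    "'This Is A Title'.istitle()": "This Is A Title".istitle(),  # True
    "'This is a title'.istitle()": "This is a title".istitle(),  # False
}

for call, result in checks.items():
    print(f"{call:32} {result}")
```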
String Formatting
Methods in this group modify or
enhance the format of a string
1. s.replace(<old>, <new>[, <count>])
➔ Returns a copy of s with all occurrences of substring <old> replaced by <new>.
➔ If the optional <count> argument is specified, a maximum of <count> replacements are
performed, starting at the left end of s.
2. s.lstrip([<chars>])
➔ Returns a copy of s with any whitespace characters removed from the left end.
➔ If the optional <chars> argument is specified, it is a string that specifies the set of
characters to be removed.
3. s.rstrip([<chars>])
➔ Returns a copy of s with any whitespace characters removed from the right end.
➔ If the optional <chars> argument is specified, it is a string that specifies the set of
characters to be removed.
4. s.strip([<chars>])
➔ Equivalent to invoking s.lstrip() and s.rstrip() in succession.
➔ Without the <chars> argument, it removes leading and trailing whitespace.
➔ The optional <chars> argument specifies the set of characters to be removed.
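A short sketch of replace() and the strip family, using arbitrary sample strings. Note that the chars argument is treated as a set of characters, not as a prefix or suffix.

```python
s = "foo bar foo baz foo qux"

replaced_all = s.replace("foo", "grault")
# 'grault bar grault baz grault qux'
replaced_two = s.replace("foo", "grault", 2)
# 'grault bar grault baz foo qux' (only the two leftmost occurrences)

padded = "   foo bar  "
left = padded.lstrip()     # 'foo bar  '
right = padded.rstrip()    # '   foo bar'
both = padded.strip()      # 'foo bar'

# Every leading 'x' or 'y' is removed, in any order, until another character appears.
chars_left = "xyxfoo".lstrip("xy")   # 'foo'
chars_both = "--foo--".strip("-")    # 'foo'

print(replaced_all, replaced_two, repr(left), repr(right), repr(both), sep="\n")
```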
Conversion between Strings & Lists
Methods in this group convert
between a string and some composite
data type
1. s.join(<iterable>)
➔ Returns the string that results from concatenating the objects in <iterable> separated
by s.
➔ Every object in <iterable> must itself be a string; otherwise a TypeError is raised.
2. s.partition(<sep>)
➔ Splits s at the first occurrence of string <sep>.
➔ The return value is a three-part tuple consisting of:
- The portion of s preceding <sep>
- <sep> itself
- The portion of s following <sep>
➔ If <sep> is not found in s, the returned tuple contains s followed by two empty strings.
3. s.split(sep=None, maxsplit=-1)
➔ Splits s into substrings delimited by any sequence of whitespace and returns the substrings
as a list.
➔ If <sep> is specified, it is used as the delimiter for splitting.
➔ If <sep> is specified with a value of None, the string is split delimited by whitespace, just
as though <sep> had not been specified at all.
➔ When <sep> is explicitly given as a delimiter, consecutive delimiters in s are assumed to
delimit empty strings, which will be returned.
➔ This is not the case when <sep> is omitted, however.
➔ If <maxsplit> is specified, at most <maxsplit> splits are performed, counted from the left end of s.
➔ The default value for <maxsplit> is -1, which means all possible splits are
performed, the same as if <maxsplit> were omitted entirely.
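A sketch of join(), partition() and split() with arbitrary sample strings, including the difference between whitespace splitting and an explicit separator.

```python
joined = ", ".join(["foo", "bar", "baz"])     # 'foo, bar, baz'

# join() rejects non-string items.
try:
    ", ".join(["foo", 23])
except TypeError as exc:
    join_error = type(exc).__name__

parts = "foo.bar.baz".partition(".")          # ('foo', '.', 'bar.baz')
no_sep = "foo".partition(".")                 # ('foo', '', '')

by_whitespace = "foo bar   baz".split()       # ['foo', 'bar', 'baz']
by_dot = "foo.bar..baz".split(".")            # ['foo', 'bar', '', 'baz']
one_split = "a.b.c.d".split(".", maxsplit=1)  # ['a', 'b.c.d']

print(joined, join_error, parts, no_sep)
print(by_whitespace, by_dot, one_split)
```

With an explicit separator, the two consecutive dots produce an empty string in the result, while runs of whitespace are collapsed when no separator is given.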
Quiz Time!
1. What is the output of -
a) True
b) False
Ans: (a)
2. What is the slice expression that gives
every third character of string s, starting with
the last character and proceeding backward to
the first?
Ans: s[::-3]
3. Suppose s = 'foobar'.
All of the following expressions produce the same result except one. Which one?
a) s[::-5]
b) s[::-1][::-5]
c) s[::-1][-1] + s[len(s)-1]
d) s[::5]
e) s[0] + s[-1]
Ans: (a)
4. Which of the following are true?
a) s[::-1][::-1] is s
b) s[:] == s
c) s[:] is s
d) s[::-1][::-1] == s
Ans: (b),(c),(d)
5. What is the output from this print()
function call?
a) 3 1 0
b) 3 1 2
c) 3 1 1
d) 3 2 1
Ans: (b)
6. Suppose s = 'foo-bar-baz'.
Which of the following expressions evaluate to a string equal to s?
a) s.upper().lower()
b) '-'.join(s.split('-'))
c) s.strip('-')
d) '-'.join(s.partition('-'))
Ans: (a),(b),(c)
7. What is the result of this statement?
print(ord('foo'))
a) 324
b) 102
c) 102 111 111
d) Exception
Ans: (d)
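If you want to check the answers to questions 3, 6 and 7 yourself, a sketch along these lines should do. The variable names are illustrative.

```python
# Question 3: only option (a) differs from the rest.
s = "foobar"
q3 = {
    "a": s[::-5],                        # 'rf'
    "b": s[::-1][::-5],                  # 'fr'
    "c": s[::-1][-1] + s[len(s) - 1],    # 'fr'
    "d": s[::5],                         # 'fr'
    "e": s[0] + s[-1],                   # 'fr'
}

# Question 6: option (d) adds extra separators, so it is not equal to t.
t = "foo-bar-baz"
q6 = {
    "a": t.upper().lower() == t,             # True
    "b": "-".join(t.split("-")) == t,        # True
    "c": t.strip("-") == t,                  # True
    "d": "-".join(t.partition("-")) == t,    # False: gives 'foo---bar-baz'
}

# Question 7: ord() accepts exactly one character.
try:
    ord("foo")
except TypeError as exc:
    q7 = type(exc).__name__

print(q3, q6, q7, sep="\n")
```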
Fin.