07_strings.ipynb - Colab
We already encountered strings when we introduced Python using the popular example of
printing the string literal "Hello World!" on the screen (standard output). Remember that these
literals can also be written with single quotation marks ('Hello World!').
A string is a sequence of characters. You can access the characters one at a time with the
bracket operator, e.g., string[i], where the index i within the brackets starts from 0.
fruit = 'banana'
letter_1st = fruit[0]
letter_2nd = fruit[1]
print(letter_1st)
print(letter_2nd)
last_letter = fruit[len(fruit)-1]
print(last_letter)
last_letter = fruit[-1]
print(last_letter)
print(type(fruit))
print(type(letter_1st)) # a character is a string made of a singleton character
print('len of variable \'fruit\' :', len(fruit)) # note the escaped special charac
# '\' allows an alternative interp
# in a sequence. For example \n (
b
a
a
a
<class 'str'>
<class 'str'>
len of variable 'fruit' : 6
ananab
ananab
ananab
%who
for i in range(len(fruit)):
    print("char", i, ": ", fruit[i])
print("----------------")
i = 0
while i < len(fruit):
    print("char", i, ": ", fruit[i])
    i += 1 # equivalent to i = i + 1
print("----reverse ----")
for i in range(-1, -(len(fruit)+1), -1):
    print(fruit[i], end = " ")
print("\n----------------")
index = -1
while index > -(len(fruit)+1):
    letter = fruit[index]
    print(letter, end = " ")
    index -= 1 # equivalent to index = index - 1
char 0 : b
char 1 : a
char 2 : n
char 3 : a
char 4 : n
char 5 : a
----------------
char 0 : b
char 1 : a
char 2 : n
char 3 : a
char 4 : n
char 5 : a
----reverse ----
a n a n a b
----------------
a n a n a b
Since a string is an ordered sequence of characters, we can use a for loop with the in
operator:
In addition, the operators in and not in can be used to test whether or not a character is
included in a given string.
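As a minimal sketch of both uses of in, assuming the variable fruit holds 'banana' as elsewhere in this notebook (the names has_b and lacks_c are introduced here only for illustration):

```python
fruit = 'banana'

# Visit each character in order, without managing an index by hand.
for char in fruit:
    print(char, end=' ')
print()

# Membership tests evaluate to a Boolean.
has_b = 'b' in fruit
lacks_c = 'c' not in fruit
print(has_b, lacks_c)
```

The loop form is usually preferable to range(len(fruit)) when the index itself is not needed.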
print('\n++++++++++++++++++++++++++')
if 'b' in fruit:
    print("'b' is in the string at index", fruit.index('b')) # index is a builtin
try:
    print('\'c\' is in the string at index', fruit.index('c'))
except ValueError: # index() raises ValueError when the substring is not found
    print("'c' is NOT in the string")
if 'c' in fruit:
    print('\'c\' is in the string at index', fruit.index('c'))
else:
    print("'c' is NOT in the string")
print(fruit.index('na'))
b a n a n a
++++++++++++++++++++++++++
'b' is in the string at index 0
'c' is NOT in the string
'c' is NOT in the string
'k' is NOT in the string
2
String concatenation
The operator + can also be used on strings, to concatenate them:
This is a case of operator overloading, where + is given extended semantics beyond its
predefined operational meaning. Thus + is used to add two integers as well as to join two
strings.
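A short sketch of concatenation follows. The strings are illustrative, and the names greeting, message, ab and ba are assumptions made for this example:

```python
greeting = 'Hi! '
message = greeting + 'Hello World ...'   # a new string; greeting is unchanged
print(message, len(message))

# Unlike integer addition, the order of the operands matters.
ab = 'ab' + 'cd'
ba = 'cd' + 'ab'
print(ab, ba)
```

Note that + does not convert types implicitly: joining a string and an integer requires an explicit str() conversion.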
The following example shows how to use concatenation (string addition) and a for loop to
generate an abecedarian series (that is, in alphabetical order). In Robert McCloskey’s book
Make Way for Ducklings, the names of the ducklings are Jack, Kack, Lack, Mack, Nack, Ouack,
Pack, and Quack. This loop outputs these names in order, except for “Ouack” and “Quack”,
which are misspelled.
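One possible version of this loop is sketched below. It reuses the prefixes and suffix strings from this notebook; the extra 'u' for the letters O and Q, and the list names, are assumptions added here so that all eight names come out spelled as in the book:

```python
prefixes = 'JKLMNOPQ'
suffix = 'ack'

names = []
for letter in prefixes:
    if letter in 'OQ':
        # These two names need an extra 'u' between prefix and suffix.
        names.append(letter + 'u' + suffix)
    else:
        names.append(letter + suffix)

for name in names:
    print(name)
```

Without the if branch, the loop would print "Oack" and "Qack" for those two letters.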
var = 'c'
if var in "aeiouAEIOU":
    print(var, "occurs in", '"aeiouAEIOU"')
else:
    print(var, "doesn't occur in", '"aeiouAEIOU"')
var = 'u'
print(var in "aeiouAEIOU")
Hi! 4
Hi! Hello World ... 19
True
# Make Way for Ducklings
prefixes = 'JKLMNOPQ'
suffix = 'ack'
Jack
Kack
Lack
Mack
Nack
Ouack
Pack
Quack
Jack
Kack
Lack
Mack
Nack
Ouack
Pack
Quack
String replication
Python can be used to multiply things other than numbers. This is another case of operator
overloading, this time for * . Indeed, we can multiply a string by an integer, which causes the
string to be repeated that many times.
for i in range(10):
    new = "Hello! " * i
    print(i, len(new), new)
0 0
1 7 Hello!
2 14 Hello! Hello!
3 21 Hello! Hello! Hello!
4 28 Hello! Hello! Hello! Hello!
5 35 Hello! Hello! Hello! Hello! Hello!
6 42 Hello! Hello! Hello! Hello! Hello! Hello!
7 49 Hello! Hello! Hello! Hello! Hello! Hello! Hello!
8 56 Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello!
9 63 Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello!
HelloHello
String slicing
The bracket operator can be used to extract a slice of characters from a string. Specifically,
[n:m] returns the part of the string from the n-th character to the m-th character, including
the first but excluding the last. This means that if you write hello[1:1] , the returned string is
the empty one '' , while hello[2:3] is equivalent to hello[2] .
If you omit the first index (before the colon), the slice starts at the beginning of the string. If
you omit the second index, the slice goes to the end of the string.
The general form of a slice is [start:stop:step] : start is the index of the element at which
the slice begins, stop is the index of the element just before which the slice ends, and step
allows us to skip some characters within the start:stop range.
Note that step can be negative, and also start can be negative.
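The rules above can be checked on a short string. This is only a sketch; the variable names are chosen here for illustration:

```python
s = 'abcdefg'

middle = s[2:5]        # 'cde': indexes 2, 3 and 4
first_three = s[:3]    # 'abc': start omitted
from_two = s[2:]       # 'cdefg': stop omitted
every_other = s[::2]   # 'aceg': step of 2
last_two = s[-2:]      # 'fg': negative start counts from the end
reversed_s = s[::-1]   # 'gfedcba': negative step walks backwards
empty = s[1:1]         # '': start equal to stop

print(middle, first_three, from_two, every_other, last_two, reversed_s)
```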
s = "abcdefg"
print(s[2:5])
subs=""
for i in range(2, 5):
    subs = subs + s[i]
print("--->", subs)
subs=""
for i in range(2, 5):
    subs = s[i] + subs # note that '+' is NOT commutative when applied to strings!!
print("+++>", subs)
hello = 'ABCDEFGHIKLMNOPQRSTUVWXYZ'
# 0123456789012345678
print(hello + '\n')
[1:1]:
[10:14]: LMNO
[:3]: ABC
[2:]: CDEFGHIKLMNOPQRSTUVWXYZ
[::2]: ACEGILNPRTVXZ
[1::2]: BDFHKMOQSUWY
-1 -2 -3 -4 -5 -6 -7 -8 -9 -10 -11 -12 -13 -14 -15 -16 -17 -18 -19 -20 -21 -22
[-1:-len(hello)-1:-1]: ZYXWVUTSRQPONMLKIHGFEDCBA
[::-1]: ZYXWVUTSRQPONMLKIHGFEDCBA
Search a letter
Define a function that finds and returns the index of the first occurrence of a letter, or -1
if the letter does not occur:
def find(word, letter):
    index = 0
    while index < len(word):
        if word[index] == letter:
            return index
        index = index + 1
    return -1
fruit = 'banana'
print(fruit, ": found letter 'n' at index:", find(fruit, 'n'))
print(fruit, ": found letter 'b' at index:", find(fruit, 'b'))
print(fruit, ": found letter 'x' at index:", find(fruit, 'x'))
2. Since you can check the equality between two strings, e.g.,
if fruit[0:2] == 'ba':
    <do something ...>
define and use a new function that searches for the first occurrence of a substring:
word = "banana"
substr = "ana"
nchars = len(substr)
for i in range(len(word) - nchars + 1):
    print(word[i : i+nchars])
ban
ana
nan
ana
find_print("automotive", "t")
# Exercise 1
# print directly all the occurrences (indexes) of the second parameter 'letter'
find_print("automotive", "t")
find_print("automotive", "o")
find_print("automotive", "k")
t : 2
t : 6
o : 3
o : 5
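One way Exercise 1 could be approached is sketched below. The name find_print comes from the calls above, but its body is not shown in these notes, so this implementation is an assumption; in particular, returning the list of indexes is added here only to make the result easy to check.

```python
def find_print(word, letter):
    """Print every index at which letter occurs in word (hypothetical solution)."""
    found = []
    for index in range(len(word)):
        if word[index] == letter:
            print(letter, ":", index)
            found.append(index)
    return found  # assumption: returned for convenience, not required by the exercise


find_print("automotive", "t")
find_print("automotive", "o")
find_print("automotive", "k")   # prints nothing: 'k' does not occur
```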
return -1
def find_substr(word, substr):
    if len(substr) > len(word):
        return -1
    nchars = len(substr)
    for i in range(len(word) - nchars + 1):
        print(word[i : i + nchars])
        if word[i : i + nchars] == substr: # for nchars=2, we have (word[i], word[i
            return i
    return -1
sub = "an"
fruit = "banana"
print(find_substr(fruit, sub))
sub = "aa"
fruit = "banana"
print(find_substr(fruit, sub))
ba
an
1
ba
an
na
an
na
-1
last char: a
# Exercise 2
print(find_substr("automotive", "ive"))
print(find_substr("automotive", "automotive big"))
7
-1
The in operator
The Boolean operator in takes 2 strings and returns True if the first appears as a substring
in the second, False otherwise.
if 'ban' in fruit:
    print('\'ban\' is a substring of', fruit)
String comparison
Not only are the + and * operators overloaded to work on strings, but so are the relational
operators, which are used to test conditions on pairs of strings.
First, we can check whether two strings are equal or not by using the operators == and !=.
We can use >, <, <= and >= to compare two strings lexicographically (dictionary or alphabetic
order).
Remember that characters are encoded as numbers, so string comparison ultimately
reduces to comparing numbers, e.g., the ASCII code points. The ASCII encoding respects
alphabetical order within each case. For example, the numerical code of a is smaller than
that of b , which is smaller than that of c , etc. Moreover, the uppercase letters are associated
with codes that are smaller than those of the lowercase ones. The digits are associated with
smaller codes than the letters.
Finally, whitespace ' ' is smaller than digits and letters.
Thus, try to check if the following inequalities hold, and explain why this happens:
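A few illustrative comparisons are sketched below. They are chosen here as examples and are not necessarily the ones intended by the exercise; the built-in ord() is used to show the code point behind each result.

```python
# ord() returns the integer code point of a single character.
codes = [ord(c) for c in ' 0Aa']
print(codes)   # space, digit, uppercase, lowercase in increasing order

checks = [
    'apple' < 'banana',      # 'a' (97) is smaller than 'b' (98)
    'Zebra' < 'apple',       # uppercase 'Z' (90) is smaller than lowercase 'a' (97)
    '9' < 'A',               # digit '9' (57) is smaller than 'A' (65)
    ' ' < '0',               # space (32) is smaller than '0' (48)
    'auto' < 'automobile',   # a proper prefix comes before the longer string
]
print(checks)
```

Comparison proceeds character by character from the left and stops at the first difference, which is why 'Zebra' < 'apple' holds even though it looks wrong in dictionary terms.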
True
True
True
True
True
False
104
'auto' < 'automobile': True
auto : 97 117 116 111 32
automobile: 97 117 116 111 109 111 98 105 108 101
String methods
The string class provides methods that perform a variety of useful operations. A method is
similar to a function (it takes arguments and returns a value), but the syntax is different.
For example, the method upper takes a string and returns a new string with all uppercase
letters.
Given a string variable word , instead of the function syntax upper(word) , the method syntax
is word.upper() . Thus, the implicit first argument of the method is the string word on which
the method is called.
https://docs.python.org/3/library/stdtypes.html#string-methods
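A small sketch of the method syntax, using a few methods documented at the link above (the variable names are chosen here for illustration):

```python
word = 'banana'

upper_word = word.upper()         # returns a new string; word is not modified
first_na = word.find('na')        # index of the first occurrence of 'na'
missing = word.find('x')          # find() returns -1 when nothing is found
starts = word.startswith('ban')   # Boolean test on the beginning of the string

print(upper_word, first_na, missing, starts)
print(word)                       # still 'banana'
```

Unlike index(), which raises ValueError for a missing substring, find() signals failure with -1, so its result should be checked before being used as an index.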
word = "hello" # named word rather than str, to avoid shadowing the built-in str type
word.find("he")
word.find("lo", 2)
word.find("lo", 2, 4)
0
3
-1
False
True
False