07_strings.ipynb - Colab

The document provides an overview of string manipulation in Python, covering topics such as string indexing, concatenation, replication, slicing, and searching for characters or substrings. It includes examples of using loops to traverse strings, as well as defining functions to find the first and last occurrences of characters. Additionally, it presents exercises for further practice with string functions.

Uploaded by

899141
Copyright
© © All Rights Reserved
 Strings

We already encountered strings when we introduced Python using the popular example of
printing the string literal "Hello World!" on the screen (standard output). Remember that these
literals can be written with single quotation marks too ('Hello World!').

A string is a sequence of characters. You can access the characters one at a time with the
bracket operator, e.g., string[i] , where index i within brackets starts from 0.

len is a built-in function that returns the number of characters in a string.

If we want to take the last/penultimate element of a string, we can enumerate characters
from the tail of the string. To this end, we can use negative indexing: -1 identifies the last one,
-2 the penultimate, etc.

fruit = 'banana'
letter_1st = fruit[0]
letter_2nd = fruit[1]

print(letter_1st)
print(letter_2nd)

last_letter = fruit[len(fruit)-1]
print(last_letter)
last_letter = fruit[-1]
print(last_letter)

print(type(fruit))
print(type(letter_1st)) # a single character is itself a string of length 1

print('len of variable \'fruit\' :', len(fruit)) # note the escaped special charac
# '\' allows an alternative interp
# in a sequence. For example \n (

print(fruit[-1], fruit[-2], fruit[-3], fruit[-4], fruit[-5], fruit[-6], sep="")

for i in range(-1, -len(fruit)-1, -1):
    print(fruit[i], end='')
print()
for i in range(1, 7):
    print(fruit[-i], end='')

b
a
a
a
<class 'str'>
<class 'str'>
len of variable 'fruit' : 6
ananab
ananab
ananab

%who

fruit last_letter letter_1st letter_2nd


 Different ways to traverse a string with loops
In the following we show two traversals of string fruit with a for and a while loop. The
string is the following, where the character indexes are shown:

Python string image

for i in range(len(fruit)):
    print("char", i, ": ", fruit[i])

print("----------------")

i = 0
while i < len(fruit):
    print("char", i, ": ", fruit[i])
    i += 1 # equivalent to i = i + 1

print("----reverse ----")
for i in range(-1, -(len(fruit)+1), -1):
    print(fruit[i], end = " ")

print("\n----------------")

index = -1
while index > -(len(fruit)+1):
    letter = fruit[index]
    print(letter, end = " ")
    index -= 1 # equivalent to index = index - 1

char 0 : b
char 1 : a
char 2 : n
char 3 : a
char 4 : n
char 5 : a
----------------
char 0 : b
char 1 : a
char 2 : n
char 3 : a
char 4 : n
char 5 : a
----reverse ----
a n a n a b
----------------
a n a n a b
Since a string is an ordered sequence of characters, we can use a for loop with the in
operator:

for letter in fruit:

This is thus an alternative way of traversing strings.

In addition, the operators in and not in can be used to test whether a character is included or
not in a given string.

for letter in fruit:
    print(letter, end = ' ')

print('\n++++++++++++++++++++++++++')
if 'b' in fruit:
    print("'b' is in the string at index", fruit.index('b')) # index is a builtin

try:
    print('\'c\' is in the string at index', fruit.index('c'))
except ValueError: # index raises ValueError when the substring is missing
    print("'c' is NOT in the string")

if 'c' in fruit:
    print('\'c\' is in the string at index', fruit.index('c'))
else:
    print("'c' is NOT in the string")

if 'k' not in fruit:
    print("'k' is NOT in the string")

print(fruit.index('na'))

b a n a n a
++++++++++++++++++++++++++
'b' is in the string at index 0
'c' is NOT in the string
'c' is NOT in the string
'k' is NOT in the string
2
 String concatenation
The operator + can also be used on strings, to concatenate them:

hello = 'Hi! ' + "Hello " + 'World ...'

This is a case of operator overloading, where + is given an extended semantics beyond its
predefined operational meaning. Thus + is used to add two integers as well as to join two
strings.

The following example shows how to use concatenation (string addition) and a for loop to
generate an abecedarian series (that is, in alphabetical order). In Robert McCloskey’s book
Make Way for Ducklings, the names of the ducklings are Jack, Kack, Lack, Mack, Nack, Ouack,
Pack, and Quack. A loop that simply appends "ack" to each letter of "JKLMNOPQ" outputs these
names in order, except for “Ouack” and “Quack”, which come out misspelled as "Oack" and "Qack".

As an exercise, modify the program to fix this error.
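
Before fixing the error, it may help to see the misspelling itself. The following is a minimal sketch of the naive loop, assuming the same prefixes and suffix used in the solution; the accumulator variable `result` is introduced here only for illustration.

```python
# Naive version: every prefix letter gets exactly the same suffix.
prefixes = 'JKLMNOPQ'
suffix = 'ack'

result = ""                              # illustrative accumulator (not in the original notebook)
for letter in prefixes:
    name = letter + suffix               # string concatenation with +
    result = result + name + " "

print(result)                            # Jack Kack Lack Mack Nack Oack Pack Qack
```

The names "Oack" and "Qack" show why the letters O and Q need an extra "u" between the prefix and the suffix.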

hello = "Hi! "
print(hello, len(hello))
hello = hello + "Hello World ..."
print(hello, len(hello))

var = 'c'
if var in "aeiouAEIOU":
    print(var, "occurs in", '"aeiouAEIOU"')
else:
    print(var, "doesn't occur in", '"aeiouAEIOU"')

var = 'u'
print(var in "aeiouAEIOU")

Hi! 4
Hi! Hello World ... 19
True
# Make Way for Ducklings

prefixes = 'JKLMNOPQ'
suffix = 'ack'

for letter in prefixes:
    if letter == "O" or letter == "Q": #letter in "OQ":
        string_to_print = letter + "u" + suffix
    else:
        string_to_print = letter + suffix
    print(string_to_print)

for letter in prefixes:
    if letter == "O" or letter == "Q": #letter in "OQ":
        print(letter, "u", suffix, sep='')
    else:
        print(letter, suffix, sep='')

Jack
Kack
Lack
Mack
Nack
Ouack
Pack
Quack
Jack
Kack
Lack
Mack
Nack
Ouack
Pack
Quack

 String replication
Python can be used to multiply things other than numbers. This is another case of operator
overloading, this time for * . Indeed, we can multiply a string by an integer, which causes the
string to be replicated that many times.
for i in range(10):
    new = "Hello! " * i
    print(i, len(new), new)

# new = "Hello" * 2.6 # error, only int are allowed
# print(new)
# new = 2 * "Hello" # '*' is commutative

0 0
1 7 Hello!
2 14 Hello! Hello!
3 21 Hello! Hello! Hello!
4 28 Hello! Hello! Hello! Hello!
5 35 Hello! Hello! Hello! Hello! Hello!
6 42 Hello! Hello! Hello! Hello! Hello! Hello!
7 49 Hello! Hello! Hello! Hello! Hello! Hello! Hello!
8 56 Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello!
9 63 Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello!
HelloHello
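
As a small illustration of where replication is handy, the sketch below builds an underline whose length matches a title, and shows that a float multiplier is rejected. The variable names (`title`, `rule`, `indent`, `float_rejected`) are chosen here for the example only.

```python
title = "Strings"
rule = "-" * len(title)          # one dash per character of the title
print(title)
print(rule)

indent = 2 * "  "                # the integer may also come first
print(indent + "nested", len(indent))

empty = "Hello! " * 0            # replicating zero times gives the empty string
print(repr(empty))

float_rejected = False
try:
    bad = "Hello" * 2.6          # only an int multiplier is accepted
except TypeError:
    float_rejected = True
print("float multiplier rejected:", float_rejected)
```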

 String slicing
The bracket operator can be used to extract a slice of characters from a string. Specifically,
[n:m] returns the part of the string from the n-th character to the m-th character, including
the first but excluding the last. This means that if you write hello[1:1] , the returned string is
the empty one '' , while hello[2:3] is equivalent to hello[2] .

If you omit the first index (before the colon), the slice starts at the beginning of the string. If
you omit the second index, the slice goes to the end of the string.

The full slice syntax is: start:stop:step .

start refers to the index of the element which is used as a start of our slice, stop refers to
the index of the element we should stop just before to finish our slice, step allows us to skip
some characters within the start:stop range.

Default values: start=0 , stop=len(str) , step=1 .

Note that step can be negative, and also start can be negative.
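
The following sketch illustrates negative values for start, stop, and step on a short string. The variable names are chosen for the example. Note that when step is negative, an omitted start or stop defaults to the end or the beginning of the string respectively, so that the whole string is traversed backwards.

```python
s = "abcdefg"

tail = s[-3:]             # negative start: the last three characters
head = s[:-2]             # negative stop: everything except the last two
backwards = s[5:1:-1]     # negative step: indexes 5, 4, 3, 2
every_other = s[::-2]     # indexes 6, 4, 2, 0
last_three_rev = s[-1:-4:-1]

print(tail)               # efg
print(head)               # abcde
print(backwards)          # fedc
print(every_other)        # geca
print(last_three_rev)     # gfe
```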


for i in range(2, 5):
    print(i , end=" ")
print()

s = "abcdefg"
print(s[2:5])

for i in range(2, 5):
    print(s[i] , end="")
print()

subs=""
for i in range(2, 5):
    subs = subs + s[i]
print("--->", subs)

subs=""
for i in range(2, 5):
    subs = s[i] + subs # note that '+' is NOT commutative when applied to strings!!
print("+++>", subs)

hello = 'ABCDEFGHIKLMNOPQRSTUVWXYZ'
# 0123456789012345678
print(hello + '\n')

print("[1:1]: ", hello[1:1])

print("[10:14]: ", hello[10:14])


print("[:3]: ", hello[:3])
print("[2:]: ", hello[2:])

print("[::2]: ", hello[::2]) # even positions


print("[1::2]: ", hello[1::2]) # odd positions

for i in range(-1, -len(hello)-1, -1):
    print (i, end=" ")
print()

print("[-1:-len(hello)-1:-1]: ", hello[-1:-len(hello)-1:-1])
print("[::-1]: ", hello[::-1]) # equivalent to the previous one

2 3 4
cde
cde
---> cde
+++> edc
ABCDEFGHIKLMNOPQRSTUVWXYZ

[1:1]:
[10:14]: LMNO
[:3]: ABC
[2:]: CDEFGHIKLMNOPQRSTUVWXYZ
[::2]: ACEGILNPRTVXZ
[1::2]: BDFHKMOQSUWY
-1 -2 -3 -4 -5 -6 -7 -8 -9 -10 -11 -12 -13 -14 -15 -16 -17 -18 -19 -20 -21 -22
[-1:-len(hello)-1:-1]: ZYXWVUTSRQPONMLKIHGFEDCBA
[::-1]: ZYXWVUTSRQPONMLKIHGFEDCBA


 Search a letter
Define a function that finds and returns the index of the first occurrence of a letter, or -1
if the letter does not occur:

def find(word, letter):

def find(word, letter):
    # scan the string 'word', and check char by char against 'letter'

File "<ipython-input-15-b1e6a875913a>", line 2
    # scan the string 'word', and check char by char against 'letter'
    ^
SyntaxError: incomplete input

def find(word, letter):
    for i in range(len(word)):
        if word[i] == letter:
            return i
    return -1

    """
    index = 0
    while index < len(word):
        if word[index] == letter:
            return index
        index = index + 1
    return -1
    """

# find the index of the last occurrence of letter, -1 otherwise
def find_last(word, letter):
    for i in range(len(word)-1,-1,-1):
        if word[i] == letter:
            return i
    return -1

# find the negative index of the last occurrence of 'letter', 0 otherwise
def find_last_negative(word, letter):
    for i in range(-1,-len(word)-1,-1):
        if word[i] == letter:
            return i
    return 0

fruit = 'banana'
print(fruit, ": found letter 'n' at index:", find(fruit, 'n'))
print(fruit, ": found letter 'b' at index:", find(fruit, 'b'))
print(fruit, ": found letter 'x' at index:", find(fruit, 'x'))

print(fruit, ": found last letter 'n' at index:", find_last(fruit, 'n'))
print(fruit, ": found last letter 'b' at index:", find_last(fruit, 'b'))
print(fruit, ": found last letter 'x' at index:", find_last(fruit, 'x'))

print(fruit, ": found last letter 'n' at index:", find_last_negative(fruit, 'n'))
print(fruit, ": found last letter 'b' at index:", find_last_negative(fruit, 'b'))
print(fruit, ": found last letter 'x' at index:", find_last_negative(fruit, 'x'))

banana : found letter 'n' at index: 2


banana : found letter 'b' at index: 0
banana : found letter 'x' at index: -1
banana : found last letter 'n' at index: 4
banana : found last letter 'b' at index: 0
banana : found last letter 'x' at index: -1
banana : found last letter 'n' at index: -2
banana : found last letter 'b' at index: -6
banana : found last letter 'x' at index: 0
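
The positive and negative results above refer to the same positions: adding the length of the string to a negative index gives the corresponding positive index. A minimal sketch of this relation, with variable names chosen for the example:

```python
fruit = 'banana'

neg = -2                       # negative index of the last 'n'
pos = neg + len(fruit)         # same position counted from the left
print(neg, pos, fruit[neg], fruit[pos])    # -2 4 n n

# the relation holds for every valid negative index
all_match = True
for i in range(-len(fruit), 0):
    if fruit[i] != fruit[i + len(fruit)]:
        all_match = False
print(all_match)               # True
```

This also suggests why the negative-index variant signals "not found" with 0 rather than -1: -1 is a valid negative index (the last character), whereas 0 can never be the negative index of a character.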
 Exercises
1. Modify function find to print directly all the occurrences (the indexes) of the second
parameter 'letter'. Functions of this kind, which do not return a value, are called void functions.

2. Since you can check the equality between two strings, e.g.,

if fruit[0:2] == 'ba':
    <do something ...>

define and use a new function that searches for the first occurrence of a substring:

def find_substr(word, substr):
    <body>

word = "banana"
substr = "ana"
nchars = len(substr)
for i in range(len(word) - nchars + 1):
    print(word[i : i+nchars])

ban
ana
nan
ana
# Exercise 1
# print directly all the occurrences (indexes) of the second parameter 'letter'

def find_print(word, letter):
    for i in range(len(word)):
        if word[i] == letter:
            print(letter, ":", i)

find_print("automotive", "t")
find_print("automotive", "o")
find_print("automotive", "k")

t : 2
t : 6
o : 3
o : 5
def find_substr(word, substr):
    if len(substr) > len(word):
        return -1

    nchars = len(substr)
    for i in range(len(word) - nchars + 1):
        print(word[i : i + nchars])
        if word[i : i + nchars] == substr: # for nchars=2, we have (word[i], word[i
            return i
    return -1

sub = "an"
fruit = "banana"
print(find_substr(fruit, sub))

sub = "aa"
fruit = "banana"
print(find_substr(fruit, sub))

print("last char:", fruit[5:8])

ba
an
1
ba
an
na
an
na
-1
last char: a

# Exercise 2

def find_substr(word, substr):
    l_substr = len(substr)
    if l_substr > len(word):
        return -1
    for i in range(len(word) - l_substr + 1): # this avoids slices smaller than sub
        if word[i : i+l_substr] == substr:
            return i
    return -1

print(find_substr("automotive", "ive"))
print(find_substr("automotive", "automotive big"))

7
-1
 The in operator

The Boolean operator in takes 2 strings and returns True if the first appears as a substring
in the second, False otherwise.

if 'ban' in fruit:
    print('\'ban\' is a substring of', fruit)
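
The in operator answers the same yes/no question as a substring search that checks whether a valid index was found. The sketch below relates the two, using the built-in string method find; the helper name `contains` is hypothetical and introduced only for this example.

```python
def contains(word, substr):
    # hypothetical helper: True if substr occurs somewhere in word
    return word.find(substr) != -1

fruit = "banana"
print('ban' in fruit, contains(fruit, 'ban'))   # True True
print('nan' in fruit, contains(fruit, 'nan'))   # True True
print('bat' in fruit, contains(fruit, 'bat'))   # False False
```

When only the yes/no answer is needed, in is the more readable choice; a search function is useful when the position matters as well.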

String comparison
Not only are the + and * operators overloaded to work on strings, but so are the relational
operators, which are used to test conditions on pairs of strings.

First, we can check whether two strings are equal or not by using the operators == and !=.

We can use >, <, <= and >= to compare two strings lexicographically (dictionary or alphabetic
order).

Remember that characters are encoded as numbers, so string comparison ultimately
reduces to comparing numbers, e.g., the ASCII code points. The ASCII encoding respects
the lexicographic order. For example, the numerical code of a is smaller than that of b , which is
smaller than that of c , etc. Moreover, the uppercase letters are associated with codes that are
smaller than those of the lowercase ones. The digits are associated with smaller codes
than letters.
Finally, whitespace ' ' is smaller than digits and letters.

ASCII table image

Thus, try to check if the following inequalities hold, and explain why this happens:

"1utoma" < "automa"
"Automa" < "automa"
"1utoma" < "Automa"
"auto" < "automobile"
"auto " < "automobile"
print("1utoma" < "automa")
print("Automa" < "automa")
print("1utoma" < "Automa")
print("auto" < "automobile")
print("auto " < "automobile")
print("accd" < "abde")

True
True
True
True
True
False

# `ord()` returns the integer code that represents a character argument:
print(ord("h"))

# function that prints all the codes of the characters in a string s
def print_ASCII(s):
    print(s, ": ", sep='', end='')
    for c in s:
        print(ord(c), end=' ')
    print()

print("'auto' < 'automobile': ", "auto " < "automobile")
print_ASCII("auto ")
print_ASCII("automobile")

104
'auto' < 'automobile': True
auto : 97 117 116 111 32
automobile: 97 117 116 111 109 111 98 105 108 101
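
The comparison rule described in this section can be sketched by hand with ord: scan the two strings in parallel, let the first position where the codes differ decide, and if one string is a prefix of the other, the shorter one is smaller. The function name `less_than` is hypothetical; this is an illustrative sketch, not how Python implements < internally.

```python
def less_than(a, b):
    # hypothetical helper: True if a precedes b in lexicographic order
    for i in range(min(len(a), len(b))):
        if ord(a[i]) != ord(b[i]):
            return ord(a[i]) < ord(b[i])   # first differing position decides
    return len(a) < len(b)                 # a proper prefix is smaller

print(less_than("1utoma", "automa"), "1utoma" < "automa")        # True True
print(less_than("Automa", "automa"), "Automa" < "automa")        # True True
print(less_than("auto", "automobile"), "auto" < "automobile")    # True True
print(less_than("auto ", "automobile"), "auto " < "automobile")  # True True
print(less_than("accd", "abde"), "accd" < "abde")                # False False
```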
 String methods
The string class provides methods that perform a variety of useful operations. A method is
similar to a function (it takes arguments and returns a value), but the syntax is different.

For example, the method upper takes a string and returns a new string with all uppercase
letters.

Given a string variable word , instead of the function syntax upper(word) , the method syntax
is word.upper() . Thus, the default argument of the method is the string word to which the
method is applied.
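
A minimal sketch of the method syntax, using the built-in string method upper; the variable names are chosen for the example. Note that the method returns a new string and leaves the original unchanged.

```python
word = "banana"
shout = word.upper()      # method syntax: the string comes before the dot

print(shout)              # BANANA
print(word)               # banana (the original string is not modified)
```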

Look at this page for all the methods for strings:

https://docs.python.org/3/library/stdtypes.html#string-methods

One example of a method of a string object str is:

Method find: str.find(sub[, start[, end]])

Return the lowest index in the string where substring sub is found within the slice
s[start:end] . Therefore, start and end are interpreted as in slice notation.
Brackets indicate optional arguments. So we have 3 different ways to call the method:

str = "hello"
str.find("he")
str.find("lo", 2)
str.find("lo", 2, 4)

The method returns -1 if `sub` is not found.


str = "hello"
print(str.find("he"))
print(str.find("lo", 2))
print(str.find("lo", 2, 4))
print(str.isdigit())
str="123"
print(str.isdigit())

# try str. : this shows all the possible methods of strings

0
3
-1
False
True
False
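
The method find and the method index (used earlier with fruit.index) perform the same search but report a missing substring differently: find returns -1, while index raises a ValueError. A short sketch of the difference, with variable names chosen for the example:

```python
text = "hello"

found = text.find("lo")        # lowest index where "lo" occurs
missing = text.find("xyz")     # not found: find returns -1
print(found, missing)          # 3 -1

raised = False
try:
    text.index("xyz")          # not found: index raises ValueError
except ValueError:
    raised = True
print("index raised ValueError:", raised)   # index raised ValueError: True
```

Using find suits code that tests the result against -1, while index suits code that prefers to handle the failure with try/except.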
