Slither Into Python
Slither Into Python
Contents
................................................................................................................................................................ 1
Chapter 1 ............................................................................................................................................... 7
Prerequisites ...................................................................................................................................... 7
1.1 Learning outcomes and who this book is for ......................................................................... 7
1.2 A bit about Python ..................................................................................................................... 9
1.3 Our first program ..................................................................................................................... 10
Chapter 2 - Integers, Operators and Syntax ................................................................................... 11
2.1 Integers ...................................................................................................................................... 11
2.2 Operator precedence ............................................................................................................... 13
2.3 Syntax ......................................................................................................................................... 13
2.4 Exercises..................................................................................................................................... 15
Chapter 3 - Variables, Floats, Input and Printing ........................................................................... 16
3.1 - Variable Assignment ............................................................................................................. 16
3.2 - Floating-point types .............................................................................................................. 20
3.3 - Getting user input.................................................................................................................. 21
3.4 - The print function .................................................................................................................. 23
3.5 - Exercises .................................................................................................................................. 24
Chapter 4 - Control Flow ................................................................................................................... 26
4.1 - Booleans.................................................................................................................................. 26
4.2 - The if Statement..................................................................................................................... 28
4.3 - The if-else statement ............................................................................................................. 30
4.4 - The elif statement .................................................................................................................. 32
4.5 - Nested if statements ............................................................................................................. 33
4.6 - Exercises .................................................................................................................................. 33
Chapter 5 - Looping & Iterations ..................................................................................................... 35
5.1 - while loops.............................................................................................................................. 35
5.2 - Looping patterns ................................................................................................................... 38
5.3 - break and continue ................................................................................................................ 40
5.4 - while loop examples .............................................................................................................. 42
5.5 - Exercises .................................................................................................................................. 43
Chapter 6 - Strings ............................................................................................................................. 45
6.1 - Introduction to strings .......................................................................................................... 45
2
6.2 - String functions ...................................................................................................................... 46
6.3 - String methods ...................................................................................................................... 47
6.4 - String indexing ....................................................................................................................... 49
6.5 - Immutability ........................................................................................................................... 51
6.6 - String slicing ........................................................................................................................... 52
6.7 - String operators ..................................................................................................................... 54
6.8 - Printing & formatting ............................................................................................................ 55
6.9 - Exercises .................................................................................................................................. 58
Chapter 7 - Linear Search .................................................................................................................. 61
7.1 - What is linear search? ........................................................................................................... 61
7.2 Linear search using while ........................................................................................................ 61
7.3 - Examples ................................................................................................................................. 62
7.4 - Exercises .................................................................................................................................. 63
Chapter 8 - Lists .................................................................................................................................. 67
8.1 - List literals ............................................................................................................................... 67
8.2 - List methods and functions .................................................................................................. 68
8.3 - List indexing ........................................................................................................................... 70
8.4 - References & Mutability ....................................................................................................... 70
8.5 - List operations ........................................................................................................................ 74
8.6 - List slicing ............................................................................................................................... 74
8.7 - Exercises .................................................................................................................................. 75
Chapter 9 - Basic Sorting Algorithms .............................................................................................. 79
9.1 - What is sorting? ..................................................................................................................... 79
9.2 - Selection sort.......................................................................................................................... 79
9.3 - Insertion sort .......................................................................................................................... 82
9.4 - Comparison of Selection sort and Insertion sort............................................................... 84
9.5 - Exercises .................................................................................................................................. 87
Input, File & Text Processing............................................................................................................. 88
10.1 - What is file & text processing? .......................................................................................... 88
10.2 - The Sys module.................................................................................................................... 88
10.3 - Reading from standard input ............................................................................................. 90
10.4 - Writing to standard output ................................................................................................ 93
10.5 - Standard error ...................................................................................................................... 95
3
10.6 - Opening & reading files ..................................................................................................... 95
10.7 - Writing to files & closing files ............................................................................................ 96
10.8 - Putting is all together ............................................................................................................. 98
10.9 - Exercises ................................................................................................................................ 99
Chapter 11 - for loops & Dictionaries............................................................................................ 103
11.1 - for loops.............................................................................................................................. 103
11.2 - Dictionaries ......................................................................................................................... 105
11.3 - Dictionary accessing .......................................................................................................... 106
11.4 - Adding and removing entries .......................................................................................... 107
11.5 - Dictionary operators & methods..................................................................................... 108
11.6 - Sorting dictionaries ........................................................................................................... 110
11.7 - Exercises .............................................................................................................................. 111
Chapter 12 - Functions & Modules ................................................................................................ 115
12.1 - What are functions? .......................................................................................................... 115
12.2 - Defining our own functions.............................................................................................. 115
12.3 - Arguments & parameters ................................................................................................. 116
12.4 - Variable Scope ................................................................................................................... 119
12. 5 - Default mutable argument trap...................................................................................... 119
12.6 - What are modules? ........................................................................................................... 121
12.7 - Creating our own modules............................................................................................... 122
12.8 - Exercises .............................................................................................................................. 123
Chapter 13 - Binary Search .............................................................................................................. 127
13.1 - Number guessing game ................................................................................................... 127
13.2 - Binary Search ...................................................................................................................... 128
13.3 - Implementing Binary Search ............................................................................................ 130
13.4 - Improving our number guessing game .......................................................................... 131
13.5 Analysis of Binary Search..................................................................................................... 131
13.6 - Exercises .............................................................................................................................. 133
Chapter 14 - Error Handling ............................................................................................................ 135
14.1 - Raising Exceptions ............................................................................................................. 135
14.2 - Assertions ........................................................................................................................... 136
14.3 - try & except ........................................................................................................................ 137
14.4 - finally ................................................................................................................................... 141
4
14.5 - Exercises .............................................................................................................................. 142
Chapter 15 - More on Data Types .................................................................................................. 145
15.1 - Tuples .................................................................................................................................. 145
15.2 - Sets ...................................................................................................................................... 147
15.3 - Lists & list comprehensions ............................................................................................. 150
15.4 - Exercises .............................................................................................................................. 151
Chapter 16 - Recursion & Quicksort .............................................................................................. 154
16.1 - Recursion ............................................................................................................................ 154
16.2 - Memoization ...................................................................................................................... 156
16.3 - Quicksort part 1 ................................................................................................................. 158
16.4 - Quicksort part 2 ................................................................................................................. 160
16.5 - Exercises .............................................................................................................................. 160
Chapter 17 - Object Oriented Programming (OOP) .................................................................... 162
17.1 - Classes & Objects .............................................................................................................. 162
17.2 - Defining a new type .......................................................................................................... 163
17.3 - The self variable ................................................................................................................. 164
17.4 - Exercises .............................................................................................................................. 166
Chapter 18 - OOP: Special Methods and Operator Overloading............................................... 169
18.1 - The __init__() method ........................................................................................................ 169
18.2 - The __str__() method ......................................................................................................... 170
18.3 - The __len__() method......................................................................................................... 171
18.4 - Operator overloading ....................................................................................................... 172
18.5 - Exercises .............................................................................................................................. 175
Chapter 19 - OOP: Method Types & Access Modifiers ............................................................... 177
19.1 - What are instance methods? ........................................................................................... 177
19.2 - What are class methods? .................................................................................................. 177
19.3 - What are static methods? ................................................................................................. 180
19.4 - Public, Private & Protected attributes............................................................................. 181
19.5 - Exercises .............................................................................................................................. 182
Chapter 20 - OOP Inheritance ........................................................................................................ 186
20.1 - What is inheritance? .......................................................................................................... 186
20.2 - Parent & child classes ....................................................................................................... 186
20.3 - Multiple inheritance .......................................................................................................... 191
5
20.4 - Exercises .............................................................................................................................. 192
Chapter 21 - Basic Data Structures ................................................................................................ 195
21.1 - The Stack............................................................................................................................. 195
21.2 - The Queue .......................................................................................................................... 199
21.3 - Examples ............................................................................................................................. 201
21.4 - Exercises .............................................................................................................................. 203
Chapter 22 - More Advanced Data Structures ............................................................................. 206
22.1 - Linked Lists ......................................................................................................................... 206
22.2 - Binary Search Trees (BSTs)................................................................................................ 210
22.3 - Exercises .............................................................................................................................. 218
6
Chapter 1
Prerequisites
There are only two things you need to have done before reading this book: have
Python installed on your computer. I’m not going to go through the process of doing
that, so I’ve left some links below on how to install it on whatever operating system
you’re using. We’ll be using Python 3.7 throughout this book; however you might
already have Python installed if you’re on Mac or Linux. I’ll also be developing on
Windows 10 but that shouldn’t make a difference.
Secondly you should have a have a text editor installed. I recommend Sublime Text
as it will provide everything we need to get started. A link to download Sublime Text
is also linked below.
I also highly recommend becoming familiar with the command line. This is going to
be essential if you want to be productive. I know I've said I'll be developing on
Windows, but I also recommend you install some Linux distribution (Ubuntu is
usually a good choice for beginners). You can install this alongside Windows if that's
what you've got installed. All great developers know the Linux command line inside
and out and if you work in this industry, you'll be expected to know it. Below are
some links on how to install Ubuntu alongside Windows and an introduction to both
the Linux and Windows command lines.
7
No prior programming experience or computer science background is necessary.
Unlike any other Python resources I have found (not that they don’t exist), they don’t
explain important computer science concepts such as memory or “how computers
work” which I believe is essential to becoming a good programmer (having some
basic computer science knowledge can often help you get out of odd situations
when programming). In this book I will cover the fundamentals of the Python
programming language and introduce these important computer science concepts
and theories.
Another important skill all great programmers have is problem solving. Being able to
think about a problem from a different perspective, break it down into an easier
more manageable problem or re-frame the problem is unbelievably important.
I will also explain everything very simply in the beginning of the book and gradually
become more technical as to not overwhelm you in the beginning (I found this
worked well for me when I was learning to program).
I hope that by the end of this book you will have a solid understanding of the Python
programming language, a firm understanding of important computer science
8
concepts, the ability to put problem solving techniques to good use and most
importantly, think computationally.
High-level means we are abstracted away from the details of the computer. For the
purposes of explanation, a low-language like Assembly is used in specialised
applications such as operating systems (In fact, the Apollo 11 Guidance Computer
was written in assembly and the source code can be found here: Apollo 11 Source
Code). Below is a snippet of this low-level assembly language which we will re-write
in Python in the following section.
1. SECTION .DATA
2. hello: db 'Hello world!', 10
3. helloLen: equ $-hello
4.
5. SECTION .TEXT
6. GLOBAL _start
7.
8. _start:
9. mov eax,4
10. mov ebx,1
11. mov ecx,hello
12. mov edx,helloLen
13. int 80h
14.
15. mov eax,1
16. mov ebx,0
17. int 80h
Python is being widely adopted by universities all around the world as the
introductory language taught to first year computer science students due to its
simplicity. In saying that, it may be an easy to understand language and easy to write
language, but it is also extremely powerful. It is used everywhere from automating
everyday computing tasks to data analytics and artificial intelligence.
Python has a huge community behind it. If you get stuck on a problem or can’t make
sense of an error message, I can guarantee you’ll find your answer in a matter of
minutes.
9
Python also has a ridiculous number of open source libraries, frameworks and
modules available to do whatever you want to do. This makes developing your
applications even more straightforward.
To top it all off, if you’re planning on making a career out of this, the average salary
for a Python developer in the US is in the region of $92,000.
1. print("Hello world!")
That’s it! It really is that simple. Create a new file in your text editor, copy the above
code into it and save it as hello.py. All Python programs end with the .py extension.
Next, to run our program we need to open a terminal window. On Windows, open
the file explorer and go to the directory (folder) where you saved the file. On the top
of the file explorer window you will see something similar to:
Click into this and type “cmd” and hit enter. This will open a terminal window in that
directory.
Finally, to run your program, in the terminal window next to _C:NAME>, type the
following.
python hello.py
You should now see “Hello world!” was printed out. If you remember from the
previous section, python is the interpreter and we’re telling it to run our program
‘hello.py’
Congratulations! You’ve written your first of many Python programs. Now I want you
to forget the above program. I have it here simply to serve as a boost in motivation.
In the next chapter we’ll take it right back to the very basics and get you on the road
to mastering the Python language.
10
Chapter 2 - Integers, Operators and
Syntax
2.1 Integers
Before we begin, getting to grips with the material in this chapter is important
yet it’s very easy! There isn’t really anything difficult in this chapter. However, I
will say, some of the exercises at the end may catch you out! so pay attention
to the content.
As I said at the end of the previous chapter, forget that hello world program.
In this chapter we’re going to start off with the very basics and that means
using Python as an integer calculator.
In the first part of this chapter we’re going to use something called the Python
REPL to write our programs.
At the command prompt (in the terminal window), type the following (then
‘Enter’):
python
Now lets try some integer calculations by typing them at the REPL prompt:
5
5 + 7
5 - 3
4 * 4
11
15 // 5
5 * 2 + 3
3 + 2 * 5
(3 + 2) * 5
These are known as integer expressions. They are expressions which evaluate
to an integer value.
A single value on its own such as the first expression in the above list of
expressions is called an integer literal.
We can also write negative numbers in the same way we normally would.
5 + -5 evaluates to 0.
For completeness, the minus sign before the second 5 in the above expression
is called the unary negation operator. It acts on a single value. The + is a
binary operator as it operates on two things called operands (5 and -5 in this
case).
• Integer literals
• Integer operators
o Binary operators: +, -, * and //
o Unary operators: - (negation)
• Operands: 5 + 3
• Integer expressions
12
• Integer values
What did you notice when you entered these expressions into the REPL?
1. (…)
2. *, /
3. +, -
For completeness, + and - are said to have equal precedence. Similarly, with *
and //. However, precedence doesn’t tell us the order of evaluation for
operators of equal precedence and it can be important!
The integer operators we have met are left associative meaning they are
evaluated from left to right.
2.3 Syntax
13
Now let’s throw a spanner in the works. In the Python REPL try these
expressions:
4 + 3 +
(1 + 1))
5 + 3 2
9 + - 3
8 + + 9
1 * * 6
Another way of looking at syntax is, the set of rules which govern what is a
valid program and what is not.
A syntax error occurs when we don’t obey these rules (or form “meaningless
words”). It’s much the same with English for example. If someone wrote down
the word “thwoelaman” you’d probably look at them as if they had two heads
and that’s essentially how the interpreter is looking at you, it can’t make sense
of what you’ve written!
When you write code that doesn’t follow the Python syntax and you get a
syntax error, the python interpreter usually throws out a very useful error
message. For example, if we try to evaluate the following expression:
>>> (1 + (// 4))
We are told the syntax error has probably occurred on line 1. We’re then
shown where on line 1 it has probably occurred. This is shown by the ^
character below the expression we tried to evaluate. Python is quite good at
14
pointing out where these errors occurred. In fact, Python is quite good at
pointing out where most errors occurred compared to many other languages
(at least in my opinion).
I want to point out two other operators. The ** (power) operator and the %
(modulus) operator.
The function of the power operator (**) is quite obvious but the modulus
operator (%), maybe not so much. Try these expressions in the REPL:
2 ** 2
3 ** 2
4 % 3
13 % 10
That’s right, the % operator returns the remainder after division. In the case
of 13 % 10, this returns the remainder from 13 divided by 10 which is 3.
2.4 Exercises
Question 1
• Is the ** (power) operator left or right associative? (Try figuring this out
in the REPL)
Question 2
• Is the % operator left or right associative? (Try figuring this out in the
REPL)
Question 3
Question 4
Question 5
15
• From the following list of integer expressions, which expression(s) have
invalid syntax?
o ((4 * 6 - (4 // 2))
o 9 * - 11 + 6
o 6 - +4
Variables utilize the computer’s memory to store the data. You can think
of main memory or memory (RAM) as cubbyholes with each cubbyhole
labelled with some identifier (x for example). When we want to store a value,
we are essentially telling the computer to put a value Y, into the cubbyhole
labelled x. In Python this is called variable assignment. Let's look at this in
action.
1. x = 7
2. y = 5
3.
4. z = x - y # The value 2 is now essentially stored in the variable z
For this section you'd be best off by opening up your text editor and a
command prompt (terminal window). Create a new file and call it whatever
you want but make sure you save it with the .py extension.
Copy the three above assignment statements into the file and save it. Then in
your command prompt, run the program (python FILENAME.py). Nothing
16
appears to have happened BUT I'll illustrate below what went on in the
background.
The Python interpreter considers each statement in turn from top to bottom,
evaluates the expression on the right-hand side, allocates an area of memory
for the new variable x and stores the value 7 in the newly allocated memory
slot. Well start with the first statement x = 7 and consider what happens in
memory:
1) Memory:
---------------------------------
| | | | | | | | | # Expression on right-hand side is
evaluated
---------------------------------
2) Memory: x
---------------------------------
| | | | | | | | | # Allocate an area of memory for x
---------------------------------
3) Memory: x
---------------------------------
17
5) Memory: x y
---------------------------------
6) Memory: x y z
---------------------------------
| | | | 7 | | | 5 | | # Allocate an area of memory for z
---------------------------------
7) Memory: x y z
---------------------------------
| | | | 7 | | | 5 | 2 | # Store the value 2 in this area
of memory
---------------------------------
18
In this case we 're-use' (or update) the memory slot as shown below
1) Memory: x y z
---------------------------------
| | | |*7*| | | 5 | 2 | # Memory already allocated for x,
re-use it.
---------------------------------
2) Memory: x y z
---------------------------------
Now we can use this to do some useful things, for example, lets calculate the
surface area of earth (approx.).
1. radius_of_earth = 6371
2. pi = 3.14
3. surface_area = 4 * pi * (radius_of_earth ** 2)
IMPORTANT NOTE:
19
3.2 - Floating-point types
In the previous piece of code about computing the surface area of the earth
you may have noticed that I had the statement pi = 3.14
We haven't seen these yet. This is a new type (like integer). They are
called floating-point numbers (decimals numbers to the average person). In
Python this is the float type.
>>> type(3.14)
<type 'float'>
Floats work pretty much the same as integers (which I'm sure you'd expect).
We can perform the same calculations with them. Float expressions also
evaluate to float types.
>>> 2.0 // 1.0
2.0
>>> 2.0 * 4.0
8.0
# etc....
I don't know about you, but that division symbol seems strange. Why not just
use a single / ?
Well you can! // is for floor division (digits after decimal point are removed).
Let’s look at how / and // work:
>>> 100 // 3
33
>>> 100 / 3
33.333333333333336
We can see here that when using integers, // rounds down to a whole number
(int) but when we use / the result is converted to a float. This is called type
casting.
Similarly:
>>> 100.0 // 3.0
33.0
20
>>> 100.0 / 3.0
33.333333333333336
This is something you need to watch out for when you're doing division. If you
use / with integer operands and the resulting value contains digits after the
decimal place, you may run into an error if the next part of your code is
expecting an integer.
We also have all the other standard operators we've met (float operators in
this case).
To do this we use a function, namely input(). We'll get to functions later in this
book so don't worry about them, just know that when we type input(), our
program will wait for the user to input something at the command prompt.
We can also store the input from the user. This is done as follow:
>>> x = input()
Hello # User types this
>>> x
'Hello'
By default, input() returns a string (str), that’s what the ' ' are around the
Hello shows. This is another type which we will get to later but its just textual
data (nothing too scary).
Essentially, input() takes in a line from input (we'll get to this later), converts
the line into a string and returns it.
21
We can see that from the above code it might look like our program has
frozen (it hasn't). We can add a user-friendly prompt that shows when asking
the user for input. Let’s take a look at a more user-friendly program.
>>> name = input("Enter your name: ")
Enter your name: John
>>> name
'John'
Perfect! We can now have the user input data into our program.
What if we don't want the user to input a string? What if we want an integer or
a float?
There are a couple of ways to handle this situation, let's take a look at them:
# Method 1
x = eval(input())
# Method 2
x = int(input())
y = float(input())
eval() is able to infer the type the user has inputted (if the user inputs 3,
then eval() will cast the string '3' to an int. The same happens if the user
inputs a float. I don't like this eval() function as it isn't very safe but it works
fine here.
I prefer the second method, but we will get an error and our program will
crash if the user inputs something that can't be cast to an int such as 'hello'.
>>> x = int(input())
Hello
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'Hello'
>>>
22
We get a ValueError. We'll look at different types of errors a little later on.
We'll also look at being able to handle these situations (hence why I prefer the
second method).
To clarify things for now, you can think of a function as a machine that takes
some input, does something with that input, then spits out a value.
In this case, print() takes in something (an argument) and prints it to output
(we'll also get to this later).
This, in my opinion, is one of the most useful functions Python has to offer. As
we progress through this book and things start to get a little more difficult, I
can bet you'll be using this quite a lot.
Nothing too complicated! For now, though, let's recap on what we've covered
so far.
Recap:
• We've figured out how to store data within our program (variable
assignment)
• We've looked at a new type, the float.
• We've discovered that division can be done two ways:
• // or floor division
• / or your regular old division
• We've learned how to take input from the user with the input() function
• We've learned how to print information to the terminal window with
the print() function.
23
Let's look at a program that puts all of this together! (Text editor time)
In the line print("Hi " + name), the + joins the string "Hi" and name together
to make a single string, this is called string concatenation and we'll come to
this later in the strings chapter.
3.5 - Exercises
Important Note:
Extend the above program to allow the user to input the number of rabbits
and the time period after which you want to know the rabbit population e.g. 4
years
Question 3
Assume the user is going to input 2 numbers, x and y for example. Write a
Python program that will swap the numbers stored in these variables e.g.
# Assume the user has input these
x = 3
y = 11
# Your code to swap here
# After, the variables should be:
# x = 11
# y = 3
24
Question 4
The goal of this task is to derive a single assignment statement for the
following sequences such that, if the variable x has the first value in a
sequence, then x will have the next value in the sequence after the assignment
statement is executed. And if the assignment statement were executed
repeatedly, then x would cycle through all of the values in the sequence.
# EXAMPLE
# Sequence:
# 0 1 2 3 4 5 6 7 ...
#SOLUTION
x = x + 1
Sequence 1:
0 2 4 6 8 10 12 ...
Sequence 2:
0 -1 -2 -3 -4 -5 -6 ...
Sequence 3:
6 -6 6 -6 6 -6.....
Sequence 4 (Difficult):
-6 -3 0 -6 -3 0 -6 -3 0 -6 -3 0...
25
Chapter 4 - Control Flow
4.1 - Booleans
Important Note:
If you can remember back to the example of making a cup of tea, we had a
step that went something like:
- if want milk:
- take milk from fridge
- pour milk
The above snippet allowed us to decide. In programming this construct is
called control flow. We either wanted milk or we didn't. In order to implement
this kind of thing in our programs we need to introduce a new type called
the bool which is short for Boolean.
For example, with integers, 0 is falsey while any other integer is truthy. With
floats, 0.0 is falsey while any other float is truthy and finally with strings, the
empty string ('') is falsey while any other string is truthy.
>>> False == 0
True
>>> False == 1
False
>>> False == 0
True
>>> False == 0.1
False
26
We can also cast any other value to a bool using the bool() function if the
value can be interpreted as a truth value.
Booleans also have operators. These are different to the operators we've seen
with integers and floats. These are Boolean operators.
OR Truth Table
P Q P OR Q
True True True
True False True
False True True
False False False
P Q P AND Q
True True True
True False False
False True False
False False False
P NOT P
True False
False True
Here are a few examples in Python
27
>>> not True
False
>>> not False
True
Again, knowing these truth tables off like the back of your hand is crucial.
You'll be dealing with these a lot!
>>> 1 == 1
True
>>> 1 == 2
False
>>> 10 != 4
True
>>> 4 != 4
False
>>> 11 > 9
True
>>> 13 < 23
True
>>> 16 < 2
False
>>> 13 < 13
False
>>> 13 <= 13
True
I think you get the gist. So, let's get onto the if statement.
1. wants_drink = input("Do you want a drink with your meal? (yes or no): ")
2. if wants_drink == "yes":
3. print("That will cost an extra £1.50")
4. print("Thank you! Enjoy your meal")
28
If we run the above code with two different types of input, we can get two
different results:
# USER ANSWERS NO
Do you want a drink with your meal? (yes or no): no
Thank you! Enjoy your meal
The above code reads quite clearly as well, almost like English. If the user says
yes to wanting a drink, then print that it will cost more.
if <CONDITION>:
# Do something if condition evaluates to True
A couple of important things to note on the syntax. It is always if followed by
the condition, then a colon (:). For the block of code that follows, it must be
indented!! I like to use tabs however you can use spaces. Right now, I'm going
to tell you to pick a number of spaces. It's typical to use 4 spaces. For tabs, 1
tab will suffice.
The above program does nothing useful, in fact it's complete garbage but it
shows the syntax for if statements quite well.
You'll probably make a lot of syntax mistakes in the beginning, we all did so
don't let that discourage you at all, it's completely normal. Even the most
experienced developers make syntax mistakes every now and again.
29
4.3 - The if-else statement
In the previous section we learned how to add some very basic flow control to
our program however we imagined a scenario in which we wanted to do
something else if the condition for the if statement didn't pass.
It again, is quite easy to read the above program. When we execute the above
program we could have two different outputs:
# SCENARIO 1
Enter your age: 33
You can enter
Goodbye
# SCENARIO 2
Enter your age: 14
You are too young to enter
Goodbye
The syntax again is obvious:
if <CONDITION>:
# Do something if the condition was True
else:
# Do something else!
REMEMBER:
Don't forget the colon at the end of each statement and don't forget to indent
the code block. Each statement in the code block must also be indented the
same number of spaces!
Let's look at another example. In this example we want to write a program that
will tell us if an integer is even or odd:
1. number = int(input("Enter a number: "))
2. if (number % 2) == 0:
3. print("Your number was even")
4. else:
5. print("Your number was odd")
30
Remember back to the chapter on integers, we had the modulus (%) operator?
To refresh your memory, this operator returned the remainder after division.
The == comparison operator takes two operands; in the above case the left
operand was an expression. This will evaluate to a value which is then checked
to see if it is equal to zero. If it is, then it is an even number.
We can also combine comparison operators and Boolean operators, let's look
at an example of how this is done:
1. number = int(input("Enter an even number between 50 and 100: "))
2. if (number % 2) == 0 and number < 100 and number > 50:
3. print("You've entered an even number between 50 and 100")
4. else:
5. print("I don't care about your number")
The condition might look nasty but it actually reads quite easily:
"if the number is even AND the number is less than 100 AND the number is
greater than 50"
We can even simplify it a bit. Remember I told you that values could be truthy
or falsey? Well let’s look at our even number program again.
1. ...
2.
3. if (number % 2) != 0:
4. print("Your number was odd")
5. else:
6. print("Your number was even")
7.
8. ...
Note that we are saying if the remainder doesn't equal zero in the condition.
If number % 2 is 0, then we don't enter the code block for if because 0 is falsey.
31
4.4 - The elif statement
Our programs are now getting a little more complex and we have some
decent control flow but they're acting as if there's only two options. What if
there were 3 or more choices? The elif statement can solve this for us!
if <condition 1>:
# Do something
elif <condition 2>:
# Do something else
elif <condition 3>:
# Do something else
.
.
.
elif <condition N>:
# Do something else
else:
# Do something else
We can have as many elif statements as we please.
Let's write a program that calculates the grade of a student based on their
mark:
1. mark = int(input("Enter a mark: "))
2.
3. if mark >= 70:
4. print("You got a first")
5. elif mark >= 50 and mark < 70:
6. print("You got a second")
7. elif mark >= 40 and mark < 50:
8. print("You got a third")
9. else:
10. print("You failed!")
If you are coming from another language, Python does not provide
switch/case statements as do other languages, but we can use if...elif...else to
mimic them.
32
4.5 - Nested if statements
What if we have a condition and based on that condition, we need to check
other conditions? Well we can do that by nesting if...elif...else statements inside
other if...elif...else statements.
This is a toy example but nested if...elif...else statements are used everywhere,
and you'll come across them frequently so spend time becoming familiar with
writing them!
They're also quite straight forward so I'm going to get straight into some
exercises so you can practice these things.
4.6 - Exercises
Important Note:
Question 1
Write a program that takes three inputs, a, b and c and prints out whether or
not those numbers form a right-angled triangle. The formula you'll need is
c=(√a2+b2)c=(a2+b2)
33
Question 2
Fizz buzz is a game in which the following rules apply:
Examples:
1 = 1
2 = 2
3 = fizz
4 = 4
5 = buzz
6 = 6
.
.
15 = fizz-buzz
* Remember what I said about how if...elif..else statements are evaluated sequentially (first to last) until
the first condition passes
In your solution you should expect the user to enter a single number and
either print the number, fizz, buzz or fizz-buzz
Question 3
Write a program which takes in six numbers, x1, y1, r1 and x2, y2, r2 (which are
the x and y coordinates of the centre of a circle and that circles radius) and
print out whether or not the two circles overlap.
You're going to need the formula for the distance between two points to solve
this which is:
d(P,Q)=(√(x2−x1)2+(y2−y1)2))d(P,Q)=((x2−x1)2+(y2−y1)2))
The rest requires some problem solving! Perhaps drawing out the scenario on paper might
help?
34
Chapter 5 - Looping & Iterations
Important Note:
This chapter may be a fair bit more challenging than the previous chapters and there
is a lot to take in here.
In this chapter we look at looping and iteration, in particular, the while loop. While
loops are syntactically very similar to if statements.
# IF STATEMENTS
if <condition>:
# Do something
# And another thing
# WHILE LOOPS
while <condition>:
# Do something
# And another thing
As with if statements, the <condition> is a Boolean expression. The body also contains
a sequence of statements. Similarly, the extent of body of the loop is determined by
the indentation of the statements.
The way they work however is slightly different and they operate as follows:
35
8.
9. x = True #
10. while x == True: # <- Evaluate the expression (evaluates to True)
11. print("Hello") #
12. x = False #
13. print("x is now false") #
14.
15. -------------------------------------------------------------------------------
16.
17. x = True #
18. while x == True: #
19. print("Hello") # <- Print "Hello" to the terminal window
20. x = False #
21. print("x is now false") #
22.
23. -------------------------------------------------------------------------------
24.
25. x = True #
26. while x == True: #
27. print("Hello") #
28. x = False # <- Assign False to x (x contains False)
29. print("x is now false") #
30.
31. ------------------------------------------------------------------------------
32.
33. x = True #
34. while x == True: #
35. print("Hello") #
36. x = False # <- We've reached the end so re-evaluate condition
37. print("x is now false") #
38.
39. ------------------------------------------------------------------------------
40.
41. x = True #
42. while x == True: # <- Evaluate the expression (it fails, so exit loop)
1. i = 0
2. while i < 5:
3. print(i)
4. i = i + 1
5.
6. # THE OUTPUT
7. 0
8. 1
9. 2
10. 3
11. 4
36
This might look a little confusing but let me explain what's going on. The variable i is
called a counter (or loop variable) and it controls the number of times the loop
iterates.
• The variable i is initially 0 and the condition evaluates to True, so the body of
the loop is executed.
• Print the value of the variable i to the terminal and increment i by 1. (i is now 1
instead of 0).
• Evaluate the condition again. It is still True as i is still less than 5.
• Print the value of the variable i to the terminal and increment i by 1. (i is now
2).
• Repeat the above steps until i is 5 in which case the loops condition is False so
the loop terminates and we execute any statements which follow.
What if we were to have the condition as i < 10000? Our 4 lines of code would still
hold, and the program would count from 0 to 9999.
You may be wondering why we counted from 0 as it seems unnatural but there are
good reasons for this. The explanation for it may be a little bit too technical for you
at this stage but let me give an example that hopefully explains it.
Remember I said you can think of memory as cubbyholes? Well, lets take that
example:
-------------------------------------
| | | | | # Each "cubbyhole" is a *memory location*
-------------------------------------
-------------------------------------
| | | | | # Each has an *address*
-------------------------------------
-------------------------------------
| 32 | 44 | 89 | 12 | # Suppose these are filled with values
-------------------------------------
-------------------------------------
| 32 | 44 | 89 | 12 | # Let's say the address of the first
location is 2
-------------------------------------
^
|
2
37
# THIS FORMULA NEEDS TO BE GENERAL SO IF WE WANTED THE FIRST LOCATION
# THAT WOULD BE: current_address + 0 (2 + 0 = 2 which is the address of first
location)
We don't actually count from 0 we start indexes (which we will get to later) with 0. It's
just good practice to count from 0 in loops like this as we usually use a value such as
i to index into lists (we'll get to these later) and I recommend getting used to it now.
If this still confuses you, it will make complete sense when we get to lists a little later
on.
It's very rare to see counters beginning at something like 30 and where counters do
begin at odd or large number like these, there is usually a very, very good reason for
doing so.
Let's look at another example. We'll write a program that allows the user to input 3
numbers and we'll add them up and print the result.
1. total = 0
2.
3. i = 0
4. while i < 3:
5. total = total + int(input())
6. i += 1
7. print(total)
There are two patterns we usually follow when writing while loops. These are:
1. Do-something-n-times
2. Do-while-not-condition
I'll get to the second pattern in a minute but for now I'll look at the first pattern.
We have been using the first pattern in the previous examples which is as follows:
while i < N:
# Do something
i += 1
In the above pattern, we know the value of N prior to evaluating the loops condition.
It may have come from user input or it may be the result of some previous
calculation.
38
As a naming convention, it's good practice to name the counter as i and if i is being
used as a variable name somewhere else, then j or k will do.
Here's another example which may seem to go against this pattern upon thinking
about the problem but what if we wanted to count backwards, so instead of 0, 1, 2,
we wanted 2, 1, 0 (You're probably thinking we should do i = i - 1)?
Here is how it's done but the user inputs the number we're counting backwards from:
1. n = int(input())
2.
3. i = 0
4. while i < n:
5. print(n - i - 1)
6. i += 1
Let's look at the second looping pattern, the "Do-while-not-condition" pattern. With
this pattern we don't know how many times we'll go around the loop before it
terminates.
Suppose the user inputs numbers some number of times, we don't know how many
times yet, and our condition is that they can't input a negative number:
1. number = int(input())
2.
3. while not number < 0:
4. print(number)
5. number = int(input())
6. print("You entered a negative number")
In this example we continuously ask the user for a number and print it. We continue
like this until they input a negative number in which case the condition fails, and the
loop terminates.
To recap:
39
Important Note:
The break statement is used to exit the loop by skipping the rest of the loop's code
block without testing the condition for the loop. break interrupts the flow of the
program by breaking out of the loop and continues to execute the rest of your
program.
1. number = 10
2.
3. i = 0
4. while i < number:
5. if i == 5:
6. break
7. print(i)
8. i += 1
9. print("Exiting program")
10.
11. # OUTPUT
12. 0
13. 1
14. 2
15. 3
16. 4
17. Exiting program
I think it's quite self-explanatory how break works so let's take a look at
the continue statement.
40
The continue statement is used to skip the remaining statements in the body of the
loop and start the next iteration of the loop. Unlike
the break statement, continue doesn't terminate the loop, it just jumps back to the
start.
1. number = 5
2.
3. i = -1
4. while i < number:
5. i += 1
6. if i == 2:
7. continue
8. print(i)
9. print("Exiting program")
10.
11. # OUTPUT
12. 0
13. 1
14. 3
15. 4
16. Exiting program
Notice how this has kind of messed up our looping pattern as we had to put i += 1 as
the first statement and we had to initialize i to -1.
Now this is a trivial case but continue has its uses. Let's look at when and where these
statements should be used. For now, though, I recommend avoiding them and don't
use them as a crutch to get out of a loop. For the most part, in this book, if you find
yourself using these statements there’s usually something wrong with your loop and
you should think about the problem a bit more.
When used at the start of a loop's code block, they act like preconditions. A
precondition is a condition placed on something (loop, function, etc) which must
hold true at the beginning of that "something" in order for it work correctly.
When placed at the end of the loop's code block, they act like postconditions. A
postcondition is a condition placed on something which must hold true after the
execution of that "something" in order to be satisfied that it has executed correctly.
Now this next one might be a bit confusing so don't get too caught up in it, I'm
putting it here for completeness.
41
These statements may also be used as invariants, in particular, loop invariants. An
invariant is a statement that is placed in a code segment that is repeated (loops for
example) and should be true each time about the loop. Invariants are often used in
loops and recursion (which we will get too later).
This part may be confusing. If an invariant is placed at the beginning of a while loop
whose condition doesn't change the state of the computation (that is, it has no side
effects), then the invariant should also be true at the end of the loop. When used in
object-oriented programming (which we'll also get to later), invariants can refer to
the state of an object.
For now, though, avoid them, especially in the exercises later on.
Example 1
The first example problem is to check if a given number is a prime number assuming
the given number is greater than or equal to 2. Here's the code:
1. n = int(input())
2.
3. i = 2
4. while i < n and (n % i) != 0:
5. i += 1
6. if i < n:
7. print("Your number is not a prime number")
8. else:
9. print("Your number is a prime number!")
Try and figure out why I have the if/else statements by working it out on some paper
with small numbers.
Example 2
The next example is going to be an extension to the Fizz-Buzz program you've
written before. There is a slight change though. Last time you wrote a program that
prints the number, fizz, buzz or fizz-buzz for a given number. This time I'm going to
write a program that prints the sequence up to a given number.
1. n = int(input())
2.
3. i = 1
4. while i < n:
5. if i % 3 == 0 and i % 5 == 0:
6. print("fizz-buzz")
7. elif i // 3 == 0:
42
8. print("fizz")
9. elif i // 5 == 0:
10. print("buzz")
11. else:
12. print(i)
13. i += 1
Notice the order the if/elif/else statements go in. Since in Python statements are
executed sequentially, we need to check that the number isn't divisible by both 3 and
5 first.
Example 3
In this example program we're going to write a program that allows the user to
continuously enter numbers until they enter the number 0. When they enter the
number 0, we'll print out the total of all the other numbers they inputted.
1. number = int(input())
2.
3. total = 0
4. while not number == 0:
5. total += number
6. number = int(input())
7. print(total)
If you now feel a bit more comfortable with while loops then let’s look at some
problems for you to solve!
5.5 - Exercises
Important Note:
These exercises take a step up in difficulty so think carefully about your solutions.
Question 1
Write a program that prints the first n numbers of the Fibonacci sequence. The
Fibonacci sequence is defined as:
Fn=Fn−1+Fn−2F0=0, F1=1Fn=Fn−1+Fn−2F0=0,F1=1
43
That is, the next number in the sequence is the sum of the previous two numbers and
the first two numbers in the sequence are 0 and 1. A part of the sequence is shown
below:
Question 2
Write a program that prints a sequence such as the one shown below. The number of
stars in the widest part should be equal to the number a user input:
# INPUT
7
# OUTPUT
*
**
***
****
*****
******
*******
******
*****
****
***
**
*
Question 3
Write a Python program that reads in a single integer from the user. Then read in an
unknown number of integers, each time printing whether or not the number was
higher or lower than the previous. You can assume that every number is different and
you can stop when the number entered is 0. Here is an example input and output:
# INPUT
5
2
9
7
0
# OUTPUT
lower
higher
lower
Exiting...
44
Chapter 6 - Strings
6.1 - Introduction to strings
In this chapter we're going to take a closer look at strings. We have met them
various times throughout the book and we know they represent textual data.
Python also offers a large set of operations, methods and functions for
working with strings, some of which we'll meet later on in this chapter.
Working with strings in Python is also quite easy compared to other
languages.
There are a couple of things I'd like to point out before we jump onto string
methods. Firstly, I want to take another quick look at the print function.
# OUTPUT
"This and that"
In the above code, each of the strings is an argument and you can have as
many arguments as you like (we'll look at arguments in more detail later in the
functions chapter).
When working with textual data (strings) we have some special characters.
Some of these are:
45
Special Character Meaning
\n Newline
\r Carriage return
\t Tab
And here is what they do:
>>> print("Hello\nWorld")
Hello
World
>>>print("Welcome to my\rhome")
Homeome to my
>>>print("Apples\tOranges")
Apples Oranges
The \t adds a tab, the \nwill move to a new line and \r will essentially take
whatever comes after it, and overwrite whatever comes before it up to the
length of whatever comes after it. Don't worry about it too much I've only
encountered it a handful of times.
There is also a special string called the empty string. This is represented as "".
>>> len("Hello")
5
>>> len("Hello\nWorld") # Newline character counts!
11
46
>>> str(5 + 5)
'10'
>>> chr(65)
'A'
>>> ord("B")
66
len() and str() work as expected. We cast a type to a string using
the str() function.
The simplest of these schemes is called ASCII which you may have heard of. It
stands for American Standard Code for Information Interchange and it covers
the common Latin characters you're probably used to working with.
chr() and ord() are used for translating between codes and characters.
1object.method(<arguments>)
Let's look some at the most commonly used string methods:
Method Meaning
s.capitalize() returns a copy of s with the first character capitalized
s.title() returns a copy of s with the first letter of each word capitalized
s.upper() returns a copy of s with all alphabetic characters capitalized
returns a copy of s with all alphabetic characters converted to
s.lower()
lowercase
returns True if s is nonempty and all its characters are
s.isalnum()
alphanumeric (letters and numbers) and False otherwise.
returns True if s is nonempty and all its characters are alphabetic
s.isalpha()
and False otherwise
47
Method Meaning
returns True if s is nonempty and all its characters can be casted
s.isdigit()
to integers and False otherwise
returns a copy of s with the whitespace on the left-hand side and
s.strip()
right-hand side removed
returns a copy of s with the whitespace on the left-hand side
s.lstrip()
removed
returns a copy of s with the whitespace on the right-hand side
s.rstrip()
removed.
Let's look at these in action:
>>> s = "76324"
>>> s.isdigit()
True
48
"This is a string"
Remember we talked about memory and counting from 0 for indices? Well
let's combine those two ideas here.
-------------------------------
| H | E | L | L | O | # Each memory location contains one character
-------------------------------
-------------------------------
| H | E | L | L | O | # Each location also has an address
-------------------------------
| | | | |
13 14 15 16 17 # These are hypothetical addresses
-------------------------------
| H | E | L | L | O | # We can get the address of 'E' as 13 + 1
------------------------------- # Similarly for 'H' as 13 + 0
| | | | |
13 14 15 16 17
49
We can index strings in a similar way! The syntax for indexing into a string is as
follows:
>>> s = "Hello"
>>> s[0]
"H"
>>> s[1]
"e"
>>> s[2]
"l"
>>> s[3]
"l"
>>> s[4]
"o"
>>> s[5]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: string index out of range
Notice in the last line we try to index to string at position 5 but we get
an IndexError and we're told the index is out of range. This is because we
start counting from 0 and 0-4 are the 5 indices of the string.
So, the final index of string s is: len(s) - 1 and we can index using
this: s[len(s) - 1]
String indexes may also be negative numbers and will specify locations relative
to the end of the string.
>>> s = "Hello"
>>> s[-1]
"o"
>>> s[-2]
"l"
>>> s[-3]
"l"
>>> s[-4]
"e"
>>> s[-5]
"H"
123H E L L O # String
0 1 2 3 4 # Positive indices
-5 -4 -3 -2 -1 # Negative indices
Recap for this section:
• An index is a position.
• Using indices, we can reference characters at particular positions.
• For indices, we always start counting from 0.
• The first character is at index 0.
• The last character is at the index len(s) - 1
50
6.5 - Immutability
In Python, every variable holds an instance of an object and there are two
types of objects, mutable and immutable. So far, we have only
met immutable types i.e ints, floats, bools and strings. An object being
immutable means it's value can't be changed once created.
10 = 3
True = False
"Hello" = "World"
This wouldn't make any sense and it's probably quite obvious why. However,
now that we know about string indexing, doing something such as:
s = "string"
s[0] = "b"
This might seem reasonable. However, it's not and there is good reason for it.
What if we tried something such as:
s = "string"
s[0] = "bl"
Now that might also seem reasonable but remember how strings are stored in
memory. When they are created, they have a length and trying to change that
length such as above would cause issues on a lower level (not that it can't be
done, it's just not worth it in 99% of cases). Making strings immutable types
increases performance and security but don't worry about that for now. Just
know you can't change a strings value once it's created.
To get a better understanding of how strings are stored in memory and how
they're referenced, look at the code below, then the diagram that follows.
a = "Hello"
b = a
a = "World"
51
From the first part of the diagram, I said before a contains "Hello", however
that isn't the case. That was for simplicity. What's actually happening is, Python
maintains a namespace of mappings from variables to objects they refer to.
After a = "Hello" is executed, a mapping from a to the value "Hello" is added
to the namespace. a now points to the memory location that holds the value
"Hello". We say that a contains a reference to the value "Hello" stored in
memory.
When the line b = a is executed, what happens is, b now points to the same
memory location that a points to and when a = "World" is executed, a new
string instance is created and saved in some new memory location and the
location that a points to is updated.
52
What if we wanted to get the characters from the first to third position of the
string "Hello" i.e. "Hel".
Well we can! We use slicing to do so. Slicing works very similarly to indexing
except, rather than just specifying the start index, we also give the end index
we want.
>>> s = "Hello"
>>> s[0:3]
'Hel'
>>> s[0:1]
'H'
The syntax is: string[ <start> : <end> ]
Notice how slicing works though. Using the first example, slicing returns a new
string with everything from s at the start index, up to but not including the
end index.
If we wanted everything after the first letter in s, we can omit the end index.
We can also omit the first index and leave the end index which will give us
everything from index 0 up to the specified end index
>>> s = "Hello"
>>> s[1:]
'ello'
>>> s[:3]
'Hel'
We can also omit both indexes, in which case we'll get the entire string
returned:
>>> s = "Hello"
>>> s[:]
'Hello'
Note that the colon must remain in either case.
>>> first = 2
>>> last = 4
>>> s = "Hello"
>>> s[first:last]
'll'
The indices may also be negative numbers and will specify locations relative to
the end of the string:
53
>>> s = "Hello"
>>> s[:-1]
'Hell'
>>> s[-4:]
Python also offers extended slicing for strings. This adds a third parameter
when slicing the string. This third parameter indicates the step size and the
syntax is s[first:last:step].
The + operator is called the concatenation operator and is used to join strings
together.
>>> s1 = "Hello"
>>> s2 = "World"
>>> s1 + s2
"HelloWorld"
The * operator is called the replication operator and is used to replicate a
string.
>>> s = "Hello"
>>> s * 3
"HelloHelloHello"
These operators also have precedence associated with them:
54
There is also another special operator called the in operator. This will return a
Boolean based on whether a substring is contained within a string.
Sometimes we want our output to look nice and have better readability. Let's
look at an example with bad readability.
1. i = 0
2. while i < 12:
3. print(i, '* 15 = ', i * 15)
4. i += 1
5.
6. # OUTPUT
7. 0 * 15 = 0
8. 1 * 15 = 15
9. 2 * 15 = 30
10. 3 * 15 = 45
11. 4 * 15 = 60
12. 5 * 15 = 75
13. 6 * 15 = 90
14. 7 * 15 = 105
15. 8 * 15 = 120
16. 9 * 15 = 135
17. 10 * 15 = 150
18. 11 * 15 = 165
We can see that as the numbers become larger in the output, the spacing
becomes off and lines start to jut out. We will look at how to make this look
nice a little later in this section.
The string on which format operates ('{}') is called the format string. The {} is
called a placeholder or (replacement field). Inside the placeholders we specify
how data should be displayed and the arguments to the format method are
55
the items of data that will be inserted into each placeholder and formatted
according to the placeholder’s format commands.
The general syntax for a format command is {[: <align> <minimum width>
<.precision> <type>]} where square brackets indicate optional parameters.
>>> pi = 3.14159265359
>>> print('{:.3f}'.format(pi))
'3.142'
Let’s break down the format command here ({:.3f}).
>>> pi = 3.14159265359
>>> e = 2.71828182845
>>> print('The number pi is: {:.3f} and the number e is {:.5f}'.format(pi, e))
'The number pi is: 3.142 and the number e is 2.71828'
The first argument to be formatted is matched with the first placeholder and
the second argument to the second placeholder and so on.
Nested placeholders are matched from left to right as well. A way of helping
to think about this is, each time we encounter an opening curly brace, we will
be inserting the next argument.
We can use this minimum width to solve the problem we had with the output
from the 15 times tables.
56
1. i = 0
2. while i < 12:
3. print('{:2d} * {:2d} = {:3d}'.format(i, 15, i*15))
4. i += 1
5.
6. # OUTPUT
7. 0 * 15 = 0
8. 1 * 15 = 15
9. 2 * 15 = 30
10. 3 * 15 = 45
11. 4 * 15 = 60
12. 5 * 15 = 75
13. 6 * 15 = 90
14. 7 * 15 = 105
15. 8 * 15 = 120
16. 9 * 15 = 135
17. 10 * 15 = 150
18. 11 * 15 = 165
The minimum width will pad out the argument with whitespace to the
specified minimum width.
Let's look at the star example, this is by no means a good solution, it's simply
to show the use of the alignment format command:
1. i = 0
2. while i < 6:
3. print('{:^10s}'.format('* '*i))
4. i += 1
5.
6. i -= 2
7. while i > 0:
8. print('{:^10s}'.format('* '*i))
9. i -= 1
10.
11. # OUTPUT
12. *
13. * *
14. * * *
15. * * * *
16. * * * * *
17. * * * *
18. * * *
19. * *
20. *
As you can see in the format command, we specify the string to be centred
with a minimum width of 10 and the type to be s which is a string.
57
6.9 - Exercises
IMPORTANT NOTE:
I can't show you everything there is to python. The book would go on forever.
A really important skill all good programmers have is being able to know what
to look for when they run into a problem, they can't solve themselves. For
example, I might want to be able to something with a string, but I don't know
how. Knowing where to look to find out the answer is super important. You
may have even come across a website called stack overflow by now. This is
going to be a really good resource for you. Therefore, some the questions in
these exercises may require you to go and search for methods to help you
arrive at your solution.
There is a lot to take in from this chapter. String are important and used all the
time. Becoming comfortable with them is equally important.
Question 1
Write a program that takes as input, a single integer from the user which will
specify how many decimal places the number e should be formatted to.
Take e to be 2.7182818284590452353602874713527
# EXAMPLE INPUT
4
# EXAMPLE OUTPUT
2.7183
Question 2
Write a program that will take as input, a string and two integers. The two
integers will represent indices. If the string can be sliced using the two indices,
then print the sliced string. If either or both integers are outside the strings
index range, then print that the string cannot be sliced at those integers.
# EXAMPLE INPUT
"This is my string"
2
9
# EXAMPLE OUTPUT
58
'is is m'
# EXAMPLE INPUT
"This is a string"
10
22
# EXAMPLE OUTPUT
'Cannot slice string using those indices'
Question 3
When you sign up for accounts on website or apps, you may be told your
password strength when entering it for the first time. In this exercise, you are
to write a program that takes in as input, a string that will represent a
password.
• digits
• lowercase letters
• uppercase letters
• special characters (take these to be: $, #, @, and ^)
# EXAMPLE INPUTS
978
hjj
jKl
nmM2
r@num978LL
LLLL
# EXAMPLE OUTPUTS
1
59
1
2
3
4
1
Question 4
Write a program that takes 3 floating point numbers as input from the user: a
starting radius, radius increment and an ending radius. Based on these three
numbers, your program should output a table with a spheres corresponding
surface area and volume.
The surface area and volume of a sphere are given by the following formulae:
A = 4πr2A = 4πr2
V=43πr3V=43πr3
# EXAMPLE INPUT
1
1
10
# EXAMPLE OUTPUT
Radius Area Volume
---------- ---------- ------------
1.0 12.57 4.19
2.0 50.27 33.51
3.0 113.10 113.10
4.0 201.06 268.08
5.0 314.16 523.60
6.0 452.39 904.78
7.0 615.75 1436.76
8.0 804.25 2144.66
9.0 1017.88 3053.63
10.0 1256.64 4188.79
60
Chapter 7 - Linear Search
7.1 - What is linear search?
In this chapter you're going to code your first algorithm. We'll take a look at
the linear search algorithm.
(Usually) we are searching for the position of the element we want. Therefore,
in some cases, the element we are searching for may not exist and we need
our algorithm to be able to handle that.
i += 1
Let's look at an example to help clarify. In this example we want to find the
position of the first occurrence of the letter "W":
1. s = "Hello World"
2.
3. i = 0
4. while i < len(s) and s[i] != "W":
5. i += 1
What we do here is, begin at index 0 ("H") and continue through the string,
checking each index until "W" is found and that is linear search.
61
Now we have another problem. There are two conditions in the while loop.
Either "W" is found, and we exit the loop or we search the entire string and
"W" is not found. How do we know which condition caused the loop to
terminate?
If condition 1 (i < len(s)) is True then "W" was found as we clearly didn't
reach the end of the string OR condition 1 is False in which case we did reach
the end of the string and "W" was not found.
1. s = "Hello World"
2.
3. i = 0
4. while i < len(s) and s[i] != "W":
5. i += 1
6.
7. if i < len(s):
8. print("The letter 'W' was found at position " + str(i) + " in the string")
9. else:
10. print("The letter 'W' was not found in the string")
This is usually how linear search works; however, we can have more
complicated versions which we will get to in the next section.
7.3 - Examples
Important Note:
In this section I'm going to go through some more examples of linear search
using while loops, some of which may get a little complicated so spend some
time going through them.
For our first example, let's recreate the lstrip() string method.
1. s = input()
2.
3. i = 0
4. while i < len(s) and s[i] == " ":
5. i += 1
6.
7. if i < len(s):
8. print(s[i:])
9. else:
10. print("No alpha-numeric characters exist in this string")
62
The above program will strip all leading white space characters.
Let's look at a more difficult example. For this problem we want the user to
input a string and print out that string, removing all leading and trailing
whitespace and only one white space character between each word.
# INPUT
" This is a string with lots of whitespace "
# OUTPUT
This is difficult to solve and can remember being given this problem when I
was learning to program. To solve this problem, we'll need to nest while loops
inside each other.
1. s = input()
2. output = ""
3.
4. i = 0
5. while i < len(s):
6. # Find a non whitespace character (start of a word)
7. while i < len(s) and s[i] == " ":
8. i += 1
9.
10. j = i
11. if i < len(s):
12. # Find the end of that word
13. while j < len(s) and s[j] != " ":
14. j += 1
15.
16. # Build up a string from the original without the excess whitespace
17. output += " " + s[i:j]
18.
19. i = j + 1
20.
21.
22. # Print out the final string
23. print(output[1:])
I'm not going to explain this, I want you to work through it. Take a simple
input string to help you walk through the program.
7.4 - Exercises
Important Note:
63
Question 1
Write a program that takes a string from the user and prints the first number
encountered along with its position.
# EXAMPLE OUTPUT
"7 16"
# EXAMPLE INPUT
"There are no digits here"
#EXAMPLE OUTPUT
Question 2
Building upon the previous exercise, write a program that takes a string as
input from the user and prints the first number encountered along with the
starting position of the number.
# EXAMPLE OUTPUT
"1208 17"
# EXAMPLE INPUT
"There is no numbers here"
64
# EXAMPLE OUTPUT
"No numbers in string"
Question 3
Write a program that takes a string as input from the user. The string will
consist of digits and your program should print out the first repdigit. A
repdigit is a number where all digits in the number are the same. Examples of
repdigits are 22, 77, 2222, 99999, 444444. You can also assume that each
number will have differing digits unless it is a repdigit.
# EXAMPLE INPUT
"34 74 86 34576 47093 3 349852 777 9082"
# EXAMPLE OUTPUT
"777"
# EXAMPLE INPUT
"98 3462 245 87658"
# EXAMPLE OUTPUT
Question 4
If you open a browser and go to any web page that has text on it and press
Ctrl+f, a search box will pop up. You can type any word in this search box and
all occurrences of that word on the page are highlighted. This doesn't use
linear search; it uses something called regular expressions (don't worry about
what they are). As well as the occurrences of the word being highlighted, you
are also told how many times the word occurs.
You are to write a program which mimics this behaviour. Your program should
take a string as input from the user and then a second string as input which is
the word the user wants to search for. Your program should print out how
many times the word occurs in the string.
Assume that it still counts if the word you're searching for is embedded in
another word, for example if the user was searching for the word "search":
65
search and searching <- you can count this as 2 occurrences
# EXAMPLE INPUT
"In this type of search, a sequential search is made over all items one by
one"
"search"
# EXAMPLE OUTPUT
2
# EXAMPLE INPUT
"There wont be an occurrence here"
"python"
# EXAMPLE OUTPUT
0
66
Chapter 8 - Lists
Important Note:
Just a heads up, there's a lot to cover in this chapter and it can get quite
technical in places (especially section 8.4) so pay particular attention to that
section, it’s really important!
The most basic kind of list is the empty list and is represented as []. The empty
list is falsey
>>> empty_list = []
>>> empty_list == False
True
In a list, elements are separated by commas.
Lists are also a sequence type meaning each object in the collection occupies
a specific numbered location within it (which means elements are ordered).
Lists are also iterable so we can loop through their elements using indices.
67
However, they differ in two significant ways:
• A list can contain objects of different types while a string can only
contain characters
• A list is a mutable type (We'll get to this later).
>>> student_grades = [88, 23, 56, 75, 34, 23, 84, 63, 52, 77, 96]
>>> sorted(student_grades)
[23, 23, 34, 52, 56, 63, 75, 77, 84, 88, 96]
In computer science, searching and sorting are two big problems and are
important topics.
We can append and pop elements to and from the end of a list. Append is add
and pop is remove. We can do this because lists are mutable.
68
>>> my_list.pop()
99
>>> my_list
[1, 5, 8]
>>> my_list.pop(2)
5
>>> my_list
[1, 8]
Notice that when we pop, we are returned the number we popped. By default,
if no arguments are passed to pop() then the last element is popped. We can
also pass an index to pop() and it will pop the element at that index
and return it to us.
We can store the value that is returned after calling pop() in another variable
However, if we had words, we might want to join them together with spaces:
69
8.3 - List indexing
As we could with strings, we can also index into lists the same way.
A mutable type (or mutable sequence) is one that can be changed after it is
created. A list is a mutable sequence and can therefore be changed in place.
Let's refresh our memory on how strings couldn't be changed after they were
created
70
>>> s[0] = "b"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
We get an error. Take a look back at the chapter on strings (Chapter 6) and
revise the diagram that exists in that chapter.
I'm going to introduce the is keyword now. The is keyword compares object
IDs (identities). This is similar to == which checks if the objects referred to by
the variables have the same content (equality).
We would say shirt_1 == shirt_2 evaluates to True (They are both the same
color, size and have the same graphic printed on them)
We would say shirt_1 is shirt_2 evaluates to False (Even though they look
similar, they are not the same shirt).
>>> s = "Hello"
>>> t = s
>>> t is s
True
>>> t == s
True
These variables, s and t are referencing the same object (The string
"Hello").What if we try to update s?
>>> s = "Hello"
>>> t = s
>>> t is s
True
>>> s = "World"
>>> t
"Hello"
The variable s is now referencing a different object and t still references
"Hello". As strings are immutable we must create a new instance. We cannot
change it in place.
>>> a = [1, 3, 7]
>>> b = a
71
>>> b is a
True
>>> a.append(9)
>>> a
[1, 3, 7, 9]
>>> b
[1, 3, 7, 9]
>>> a is b
True
The behavior has changed here and that is because the object pointed to
by a and b is mutable. Modifying it does not create a new object and the
original reference is not overwritten to point to a new object. We do not
overwrite the reference in the variable a, instead we write through it to modify
the mutable object it points to.
There are however some little tricks thrown in with this behavior. Consider the
following code:
>>> a = [1, 3, 7]
>>> b = a
>>> a = a + [9]
>>> a
[1, 3, 7, 9]
>>> b
[1, 3, 7]
What's going on here? That goes against what we just talked about doesn't it?
Well, obviously not, the Python developers wouldn't leave a bug that big in
language. What's happening here is we executed the following code: a = a +
[9]. This doesn't write through a to modify the list it references. Instead a new
list is created from the concatenation of the list a and the list [9].
A reference to this new list overwrites the original reference in a. The list
referenced by b is hence unchanged.
That's not all the little tricks thrown in. There's one more subtlety. Consider the
following code:
>>> a = [1, 3, 7]
>>> b = a
>>> a += [9]
>>> a
[1, 3, 7, 9]
>>> b
[1, 3, 7, 9]
It turns out x += y isn't always shorthand for x = x + y. Although with
immutable types they do the same thing, with lists they behave very
differently. It turns out the += operator when applied to lists, modifies the
72
list in-place. This means we write through the reference in a and
append [9] to the list.
>>> a = [1, 3]
>>> b = a
>>> a[1] = 9
** BEFORE UPDATING A **
** AFTER UPDATING A **
b references the same list object. But when we "change" a, we're clearly not
changing it. We're updating a reference within it. While b just points to the
list, a has changed what’s inside it hence b's reference is not affected.
We can still see the 3 floating around in the second part of the diagram. This
will be picked up by something called the garbage collector (A garbage
73
collector keeps memory clean and stops it from becoming flooded with stuff
like the unreferenced 3).
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> a + b
[1, 2, 3, 4, 5, 6]
>>> a += b
>>> a
[1, 2, 3, 4, 5, 6]
We also have the * operator, which does as expected:
>>> a = [1, 2, 3]
>>> a * 3
[1, 2, 3, 1, 2, 3, 1, 2, 3]
We also have a new operator. This is the in operator. The in operator is
a membership test operator and returns True if a sequence with a specified
value is in an object
74
>>> a[::2]
[1, 3, True]
Important:
>>> a = [1, 2, 3]
>>> b = a
If we updated a, then b would also be effected. What if we wanted to make a
copy of a and store it in b? We can use slicing to do that as slicing returns a
new object
>>> a = [1, 2, 3]
>>> b = a[:]
>>> b is a
False
We can see now they do not reference the same object so updating one will
not affect the other.
8.7 - Exercises
Important Note: We've learned a lot up to this point. Your solutions should
be combining (where necessary) everything we've learned so far. If you are
unsure of something, don't be afraid to go back to previous chapters. It's
completely normal not to remember everything this early on. The more
practice you have the more of this stuff will stick!
Question 1
Write a program that takes a string as input from the user, followed by a
number and output each word which has a length greater than or equal to the
number.
The string will be a series of words such as "apple orange pear grape melon
lemon"
# EXAMPLE INPUT
"elephant cat dog mouse bear wolf lion horse"
5
# EXAMPLE OUTPUT
elephant
mouse
75
horse
You may use the following code at the beginning of your program
my_words = input().split()
num = int(input())
The first string will be a series of words such as "apples oranges pears grapes
lemons melons"
# EXAMPLE INPUT
"apples oranges pears grapes lemons melons"
"es"
# EXAMPLE OUTPUT
apples
oranges
grapes
Hint: Your solution should be generalized for any suffix. Don't assume the
length of the suffix will be 2
You may use the following code at the beginning of your program
words = input().split()
suffix = input()
# EXAMPLE INPUT
3
5
6
7
0
# EXAMPLE OUTPUT
76
[3, 5, 6, 7]
Question 4
Building on your solution to question 3, taking a list built up from user input,
ask the user for two more pieces of input. Integers this time.
The integers will represent indices. You are to swap the elements at the
specified indices with eachother.
# EXAMPLE INPUT
6
3
9
0
*********************************
* LIST SHOULD NOW BE: [6, 3, 9] *
*********************************
1
2
# EXAMPLE OUTPUT
[6, 9, 3]
You may assume that the indices will be within the index range of the list.
Question 5
Write a program that builds a list of integers from user input. You program
should then find the smallest of those integers and put it at the first index (0)
in the list. Input ends when the user inputs 0.
# EXAMPLE INPUT
10
87
11
5
65
342
12
0
# LIST SHOULD NOW BE: [10, 87, 11, 5, 65, 342, 12]
# EXAMPLE OUTPUT
[5, 87, 11, 10, 65, 342, 12]
This problem is a little harder, think about your solution!
77
Question 6
** THIS IS A HARD PROBLEM **
Write a program that takes two sorted lists as input from the user and merge
them together. The resulting list should also be sorted. Both of the input lists
will be in increasing numerical order, the resulting list should be also.
# EXAMPLE INPUT
2
6
34
90
0 # END OF FIRST INPUT
5
34
34
77
98
0 # END OF SECOND INPUT
# EXAMPLE OUTPUT
[2, 5, 6, 34, 34, 34, 77, 90, 98]
Don't assume both lists will be the same length!
78
Chapter 9 - Basic Sorting Algorithms
9.1 - What is sorting?
We've learned a lot so far and its now time to put what we've learned to good
use. We're now going to look at your first important algorithms.
• Sorted subarray - The portion of the list that has been sorted
• Unsorted subarray - The portion of the list that has not yet been sorted
• a - The list we are sorting
In selection sort, we break the list into two parts, the sorted subarray and the
unsorted subarray and all of the elements in the sorted subarray are less than
or equal to all the elements in the unsorted subarray. We also begin with the
assumption that the entire list is unsorted.
79
Next, we search the entire list for the position of the smallest element. We
must search the entire list to be certain that we have found the smallest
element. If there are multiple occurrences of the smallest element, we take the
position of the first one. We then move that element into the correct position
of the sorted subarray. We repeat the above steps on the unsorted subarray
--------------------------------------------------------------------------
--
--------------------------------------------------------------------------
--
80
SORTED PART UNSORTED # ELEMENT IN THE LIST AND SWAP WITH
7
PART # 7 IS THE SMALLEST (NO SWAP)
--------------------------------------------------------------------------
--
--------------------------------------------------------------------------
--
Does this seem somewhat familiar? If you did the exercises from the previous
chapter, exercise 5 was basically this process!
It's time to code selection sort! Let's look at the code for it below:
1. i = 0
2. while i < len(a):
3. p = i
4. j = i + 1
5. while j < len(a):
6. if a[j] < a[p]:
7. p = j
8. j += 1
9.
10. tmp = a[p]
11. a[p] = a[i]
12. a[i] = tmp
13.
14. i += 1
That's it! Let's break this down as there is a lot going on here
81
• The three lines above i += 1 are swapping the smallest element into the
correct position.
• The process then repeats.
A skill that all programmers have is being able to step through code and
figure out how it's working. I encourage you to do the same with this
algorithm (pen and paper and walking through an example)
Another good way to figure out what's going in the middle of some code is to
stick in a print() statement to print out what variables hold what values at
that current time. Another good way is to set break points. I'm not going to
show you how to do this as it's generally a feature of the text editor you're
using so look up how to set break points for your editor and how to use them!
It's a skill you'd be expected to have if you worked in this field.
• Selection Sort: Select the smallest element from the unsorted subarray
and append it to the sorted subarray of the list.
• Insertion Sort: Take the next element in the unsorted subarray and insert
it into its correct position in the sorted subarray.
-----------------------------------------------------------------------
82
SORTED UNSORTED # CORRECT POSITION
-----------------------------------------------------------------------
3
2 4 5 _ 9 6
|=============||==========| # TAKE 3 OUT OF THE LIST
SORTED UNSORTED
-----------------------------------------------------------------------
3
2 4 5 _ 9 6
|=============||==========| # IS 3 < 5? YES, SO MOVE 5 UP
SORTED UNSORTED
-----------------------------------------------------------------------
3
2 4 _ 5 9 6
|=============||==========| # IS 3 < 4? YES, SO MOVE 4 UP
SORTED UNSORTED
-----------------------------------------------------------------------
3
2 _ 4 5 9 6
|=============||==========| # IS 3 < 2? NO, PLACE AFTER THE 2
SORTED UNSORTED
-----------------------------------------------------------------------
83
2 3 4 5 9 6
|=============||==========| # 3 IS NOW IN THE CORRECT POSITION
SORTED UNSORTED
-----------------------------------------------------------------------
2 3 4 5 9 6
|==================||=====| # MOVE ONTO THE NEXT ELEMENT
SORTED UNSORTED
It's time to code insertion sort! Let's take a look at the code for it, I'll give the
explanation as comments:
Selection sort and Insertion sort are what we call quadratic sorting algorithms.
84
This means they both have a big O time complexity of:
O(n2)
Time complexity refers to the time it takes for the algorithm to complete.
There are generally three categories:
• Best case scenario - For example with insertion sort, let's say our list is
already sorted and we try to sort it, then we are in a best case scenario
as we don't need to move any elements around and the algorithm
finishes quickly.
• Average case: This is how the algorithm performs in terms of time on
average
• Worst case scenario: This is how the algorithm would perform in terms
of time if our list was completely scrambled and out of order.
Big O deals with the worst case and it is the case we are usually concerned
with!
There's a lot of maths behind this and there's a whole topic of computer
science related to it so I'm not covering it in detail here.
Anyway, when we say that both of these algorithms have a time complexity
of O(n^2), we are essentially saying that the performance of the algorithms
will grow proportionally to the square of the size of the input (lists in this
case). Another way to think of O(n^2) is that if we double the size of our input,
it will quadruple the time taken for the algorithm to complete. These are just
estimates. It doesn't mean that two algorithms with time complexities
of O(n^2) will both take the exact same time to complete. You can look at this
as a guide as to how your algorithm will perform in a general sense.
I've taken the time to run some algorithms with different time complexities on
lists of varying sizes and we can look at how the algorithm takes longer to
complete as the size of the list grows. You can see this in the diagram below:
85
Time complexity
In this diagram, the y-axis represents how long it took the algorithm to finish
in seconds and the x-axis shows the number of elements in the list.
The blue line is for an O(n^2) algorithm and the orange is for an O(n)
algorithm (Linear time algorithm).
I have a fast computer, so it doesn't appear that the orange line is changing (it
is, just very slowly).
However, it is very clear that the O(n^2) algorithms become very slow when
the size of the input becomes large.
In the real world, a list of size 1000 is small and selection sort or insertion sort
wouldn't cut it. However, that doesn't mean they don't have their uses. In fact,
in practice, selection sort and insertion sort outperform the faster algorithms
on small lists and some of the fast sorting algorithms will actually switch to
these O(n^2) algorithms when they are nearing the end of the sorting process.
It turns out the whole process finishes faster when this is done (in some cases).
We also have something called space complexity and this deals with how
much memory is taken up by an algorithm. Both of these algorithms have O(1)
space complexity. This means they use constant memory. They use constant
memory as they are in-place sorting algorithms. They don't create additional
lists to assist during the sorting process.
86
There is usually a trade-off between time and space complexity. As you can
see here, we have constant memory (This is good) but quadratic time
complexity (This is bad). We could write some algorithm that is faster but takes
more memory. It depends on the problem and the computing resources we
have.
9.5 - Exercises
There won't be any coding exercises in this chapter, just theory questions.
They are just as important to understand and get right though!
Question 1
What is the time complexity of:
1. Insertion sort
2. Selection sort
Question 2
What is the space complexity of:
1. Selection sort
2. Insertion sort
Question 3
Give two examples of when we might use selection sort or insertion sort in the
real world?
Question 4
If we ended up in a situation in which the input was sorted but we didn't know
and we try to sort the input, why might we prefer insertion sort over selection
sort?
87
Input, File & Text Processing
10.1 - What is file & text processing?
File processing is the process of creating, storing and accessing a files content.
Up until now we've only been able to temporarily store data in our programs
in objects such as lists. In the real world, this wouldn't be much use. We need
some way of being able to save data permanently. To do this, our programs
need to store data on a hard disk. This is persistent storage and the data in
this storage will be able to survive a system reboot or our program crashing. If
we reboot our system, RAM is cleared and we lose our data. We've been
storing data in RAM until now.
Up until now we've been using input() and print() for input and output (I/O).
These are high-level interfaces to the underlying operating-system services
and sometimes we need finer control, especially when we want to read and
write files.
Python implements file objects as general interfaces to data streams. These file
objects allow us to read and write to and from standard input and standard
output (this is what input() and print() do and we'll get to standard input and
output later). The file objects also allow us to read and write files.
88
module can define functions, classes and variables. To access a module you
must import it into your code using the import keyword.
import sys
# YOUR CODE
Through the sys module we can access arguments that are passed to our
script at the command line. Lots of Python scripts need access to these
arguments. We get access to them via argv (sys.argv).
argv is shorthand for argument vector. This is a list containing the command-
line arguments passed to the script. The first element in this list is the script
itself. The arguments for the script come after the name of the script. Let's
look at how this works. Create a new Python file and save it as myscript.py and
enter the following code:
import sys
print(sys.argv)
1. import sys
2.
3. total = 0
4.
5. i = 1
6. while i < len(sys.argv):
7. total += int(sys.argv[i])
8. i += 1
9. print(total)
89
At the command line type the following, then Enter:
$ py calculator.py 5 3 6
14
We can pass the numbers we want to add together to the script. This saves us
having to ask the user for input each time we want new input. Remember
though, argv is a list of strings and that’s why we had to cast the arguments to
integers in the above code.
The sys module also provides us with three standard file objects. These
are: stdin, stdout, and stderr. These stand for standard input, standard
output and standard error.
Important: The following methods, unlike input() and print(), will never add
or remove newline characters. We will have to handle newline characters
ourselves.
The first is the read() method. It is the most basic of the three. read() will read
the entire contents of the file and the entire contents will be assigned to a
single string.
12345import sys
contents = sys.stdin.read()
print(contents)
90
The syntax for this method is file.read() where sys.stdin is a file (file object
here).
You should create a text file and fill it up with some text. I recommend you
have about 10 lines of text. Each line can be a single word if you like.
To run the above code, type the following at the command line. Make sure the
text file and the Python script are in the same directory. Type the following,
then Enter.
$ py myscript.py < input.txt
This is line one.
I'm line two
And I'm line three
This is the output we get. Yours will look different depending on what you
have in your file. As you can see we've printed out that file you just created.
It reads the file in as a single string and the string looks like so:
"This is line one.\nI'm line two\nAnd I'm line three"
1. import sys
2.
3. contents = sys.stdin.readlines()
4.
5. print(contents)
The syntax for this method is file.readlines() where sys.stdin is a file (file
object here).
As you can see our text file is now stored in a list, with each line being an
element in the list. Notice how the newline characters haven't been removed.
This method is good as we can now do some manipulation on each line of text
more easily.
91
Rather than using the input redirection operator we can just read lines from
terminal.
You can run the above script again, this time omitting the redirection operator.
Important: With files we eventually reach the end of them. This is indicated in
the file by EOF (end of file). We don't need to worry about how this works as
it's handled at a lower level. When we do not redirect standard input to read
from a file, we do however, need indicate EOF at the command line. On
Windows this is done by pressing ctrl+z. On Linux this is indicated by pressing
ctrl+d. When you run the above script again and omit the redirect, you will be
prompted to continuously input lines of text. When you are finished press
ctrl+z or ctrl+d.
$ py myscript.py
line one
line two
final line # PRESS ENTER, THEN CTRL+Z OR CTRL+D TO INDICATE EOF.
We can still read in many lines of input using this method, we'll just need a
loop. Let's look at how that is done.
1. import sys
2.
3. line = sys.stdin.readline()
4. while line:
5. print(line.strip()) # Strip the newline character as print() will add one.
6. line = sys.stdin.readline()
The syntax for this method is file.readline() where sys.stdin is a file (file
object here).
92
I'm line two.
And I'm line three
I want to take another look at the read() method. We can actually control how
much of the input is read. If we redirect input to a file, we can limit how much
of the file we read at any given time. Similarly, if we don't redirect, we can limit
how much of user input through the console we read. We do this by passing
an integer as an argument to the method. This integer represents how many
characters we want to read.
1. import sys
2.
3. contents = sys.stdin.read(6)
4.
5. print(contents)
Running the code and redirecting standard input to a text file would yield:
$ py myscript.py < input.txt
This i
As you can see, we passed the integer 6 to the read function, telling it to only
read 6 characters.
Get used to these methods! This is how we'll be the primary way of taking
input into our functions from now on!
93
Let's look at that in action in the Python REPL:
>>> import sys
>>> sys.stdout.write("Hello")
Hello5
>>>
We can see that we wrote "Hello" to standard output and we were returned
the number of characters that were written (5). The reason the 5 is on the same
line is because, unlike print(), we must append the newline character
ourselves:
>>> import sys
>>> sys.stdout.write("Hello\n")
Hello
5
>>>
We can also redirect the standard output stream to a file using the output
redirection operator (>). Let's look at how this is done.
1. import sys
2.
3. sys.stdout.write("Hello World!\n")
If you check the directory where your script is stored, you should now see a
file called output.txt.
Open it and view its contents, it should contain what we just wrote to the
output stream.
We also have the writelines() method available to us which is not all that
dissimilar to readlines() except it writes instead of reads.
1. import sys
2.
3. my_lines = ["Line one\n", "Line two"]
4. sys.stdou.writelines(my_lines)
94
Then, at the command-line, run:
1$ py myscript.py > output.txt
If you open output.txt you'll find that it contains the strings in the list above on
two separate lines.
Having the standard output stream solves the semi predicate problem by
allowing output and errors to be distinguished. We can redirect standard
error to a different output stream to standard output. (an error log file for
example).
That's all I want to say on standard error. We won't have to worry about it
again in this book. It's just good to know about all the standard data streams
we have access to.
When reading a file, we initialize a file object that acts as a link from the
program to the file stored on the disk.
To open a file we use the open() function. The open() function takes two
arguments: The filename and the mode in which the file is to be opened. The
syntax is: open(<filename>, <mode>)
There are various modes in which we can open a file, we're only concerned
with four of them here.
95
2. 'w' - This mode opens a file for writing. If the file doesn't exist, it creates
a new file with the specified file name. If the file does exist and we write
to it, anything that was previously in the file is overwritten.
3. 'a' - This opens a file in append mode. If the file doesn't exist, it creates
a new file with the specified file name. If it does exist, we add to the file
rather than overwriting it.
4. 'x' - This opens a file for exclusive creation, failing if it already exists
and throwing a FileExistsError.
Once the file is open, we can read its contents using the methods from section
3.4: read(), readlines() and readline().
Let's look at an example of opening a file from the disk and reading in its
contents:
Make sure you have a file called "input.txt" saved in the same directory as your
script and for demo purposes make sure it has a few lines of text in it.
You can now see that the contents of your file have been read in successfully.
It is in list form as we used the readlines() method.
As you may have guessed, we write to files using the write() method we met
in previous sections. To write to a file though, it must be opened in write
mode.
96
1$ py myscript.py
Now open the file called output.txt. Don't worry if it didn't exist, one will be
created. You should now see the line "I am being written to a file" contained in
that file.
When a file is opened, the operating system allocates memory to track that
file's state and the operating system cannot reallocate that memory until the
file is closed. If we don't close open files, then we are using up memory
unnecessarily.
When your program exists, the file is closed automatically but maybe we don't
want to exit straight after we are finished reading and writing to a file.
The file has now been closed and memory can be reallocated for other uses.
Having to open files and remember to close them can become a bit of a pain.
We can use something called the with statement to handle this for us.
So, what is going on here? We can see the with statement, but we also have
an as statement. The as statement is used to assign the returned object to a
new identifier. In this case, we open a file and are returned a file object. The
file object is then assigned to the variable my_file using the as statement.
97
We then do whatever we need to do. In this case we write to the file. When we
exit the with block the file is closed. Opening and closing files this way, is
usually how most people do it.
Another use of the as statement which may clarify things is, for example, we
were importing the sys module and we didn't like how long the name "sys"
was (I know this is a silly example but some modules have long names), we
could import it as such:
1. import sys as s
2. content = s.stdin.read()
Now, every time we want to refer to the sys module, we call it by its alias, s.
Let's assume we have a text file that contains many lines. Each line has a
student’s name followed by the mark they received for a specific class. Here is
a sample from that text file:
Liam 84
Noah 33
William 43
James 37
Logan 59
What we want to do is process this file and output to a new file whether each
student passed or failed the class.
Logan PASS
98
We want to pass the input filename and output filename as arguments to the
script and we take a fail to be a mark less than 40.
1. import sys
2.
3. src = sys.argv[1] # The input source file
4. dst = sys.argv[2] # The file to output to
5.
6. with open(src, 'r') as fin, open(dst, 'a') as fout: # Open in append mode
7.
8. student = fin.readline().strip() # Strip the newline character
9. while student:
10.
11. student_data = student.split() # ['Liam', '84'] for example
12.
13. name = student_data[0]
14. mark = int(student_data[1])
15.
16. if mark < 40:
17. grade = "FAIL"
18. else:
19. grade = "PASS"
20.
21. fout.write('{:s} {:s}\n'.format(name, grade)) # Write to the output fi
le
22.
23. student = fin.readline().strip() # Get the next student
Logan PASS
File processing is a common task, as I've said, so become used to it, you'll
probably be doing it a lot!
10.9 - Exercises
Important Note: These questions tougher than previous ones and get
considerably more difficult as they go on, especially question 4. Don't be
discouraged, however. Stick at them. If you manage to complete these 4
99
questions, you're on your way to becoming a great developer and problem
solver!
Question 1
Write a program that multiplies an arbitrary number of command-line
arguments together. The arguments will be numbers (convert them first).
To test that your solution is correct, you should get the following output
$ py calc.py 8 8 9 3 2
3456
Question 2
Write a program that reads in lines from standard input (no redirection) and
outputs them to standard output (no redirection).
Question 3
Write a program that reads a text file and outputs (using print() is fine) how
many words are contained within that file. The name of the text file should be
passed as a command-line argument to your script.
Your program shouldn't consider blank lines as words and you need not worry
about punctuation. For example, you can consider "trust," to be a single
word.
To test that your solution is correct, use the Second Inaugural Address of
Abraham Lincoln as your input text and your program should output: 701
Question 4
Write a program that reads in the contents of a file. Each line will contain a
word or phrase. Your program should output (using print() is fine) whether or
not each line is a palindrome or not, "True" or "False". The name of the text file
should be passed as a command-line argument to your script.
100
A palindrome is a word or phrase that is the same backwards as it is forwards.
For example, "racecar" is a palindrome while "AddE" is not.
To test that your solution is correct, use the following as your input text:
racecar
AddE
HmllpH
Was it a car or a cat I saw
Hannah
T can arise in context where language is played wit
Able was I ere I saw Elba
True
Question 5
** THIS QUESTION IS HARD **
Like question 3, write a program that reads a text file. This time your program
should output how many unique words are contained within the file.
101
This time you do need to care about punctuation and your solution should not
be case sensitive. For example, "trust" and "trust," and "Trust" should be
considered the same word, therefore only one occurrence of that word should
be recorded as unique and if you were to come across "trust" again, then
don't record it.
Hint 2: The solution to this exercise may make use of some string methods
that we looked at back in the strings chapter.
102
Chapter 11 - for loops & Dictionaries
11.1 - for loops
So far, we have been using while loops to iterate over lists and strings. I
purposely chose to not mention for loops yet as using while loops help with
computational thinking in the beginning. However, we have had enough
practice and if you completed the final question from the last chapter, you're
more than ready to move on.
for loops are used when you have a block of code which we need to execute a
known number of times (kind of similar to our do-something-n-times loop
pattern). When using for loops we assume we don't need to check for some
condition, and we need to iterate over every value in the collection or
sequence.
The for loop makes use of the in operator. The in operator checks for the
presence of something in some collection.
I'll show you this working in an example and then walk through how it works:
In the above example, name is the loop variable. We can pick any variable name
for this and I have chosen name as it is appropriate as we are looping
through names.
103
The loop variable is a temporary variable and its role is to store a given
element for the duration of the given iteration. Let's step through our example
and see how this works.
1. We pick the first element from the list and check if it exists
2. It does, so we assign it to a variable called name.
3. We enter the body of the loop and do any work we need to (printing
the name in this case).
4. We go back to start of the loop and repeat steps 1 to 4.
5. We will get to a point when there is no "next element", in which case we
exit the loop.
What if we were in a case where we didn't have something to iterate over but
we wanted to execute a block of code a fixed number of times?
Python provides two built-in functions for doing this. They are
the range() function and the xrange() function.
The range() function takes three parameters, a start and stop and a step. The
start and step are optional. The range() function builds a sequence based on
these parameters.
As you can see, we start at 5 and go up to but not including the 10. If we omit
the start parameter, then it defaults to 0 and I'm sure you can guess at this
point what the step parameter does from your knowledge of it in extended
slicing.
104
The range function has built up a sequence from 5 to 9 and we iterate over
that using the for loop.
11.2 - Dictionaries
So far, we've met lots of different types. We have also met Lists and Strings.
These two types are examples of data structures. In this section we're going to
look at another Python built-in data structure called the dictionary. It is of
type dict.
Dictionaries are collection types but not sequences. That is, its elements aren't
ordered like they are in a list or string.
|----------| |------------------|
The Big O complexity for the lookup operation is O(1). It is constant. It doesn't
matter how large the dictionary is, the time for searching for a value is always
the same. Magic, right? Take it as magic, I won't be going into how it does so
in this book as it's a little too advanced for beginners, but you're more than
welcome to look up how it's done.
105
We also do not know the order in which key-value pairs are stored in a Python
dictionary so don't write programs that rely on its order. If you have a
dictionary that is the same each time you run your program, the order will be
different each time.
For the above phone book example, that would be done as follows:
With dictionaries, a key can be any immutable type. The values can be of any
type, including other dictionaries.
If we try to access a dictionary based on a key that doesn't exist within the
dictionary, we get an error. This error is known as a KeyError in Python.
>>> phonebook = {"Tony": "457-2344356", "Adam" : "359-5550983"}
>>> phonebook["Bob"]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'Bob'
106
>>> my_dict[3]
'three'
>>> my_dict[0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 0
If we have a key-value pair in which the value is a list, we can index the list as
follows:
>>> my_dict = {"first": [2, 4, 8]}
>>> my_dict["first"][0]
2
{"one":1}
As we can add entries to the dictionary without a new dictionary being created
every time, dictionaries are mutable types.
Remember that the keys must be a mutable type and values can be of any
type. This will catch a lot of people out.
We can remove entries from the dictionary by using the del statement. This is
done as follows:
>>> phonebook = {"Tony": "457-2344356", "Adam" : "359-5550983"}
>>> del phonebook["Tony"]
107
In the above example we are saying, delete the entry from the dictionary
whose key is "Tony".
We can use the in operator to test whether or not a key exists within the
dictionary
108
These aren't really lists as we can't index them but we can iterate over them
using a for loop.
I want to briefly introduce another data structure. It's called the tuple. I won't
go into detail on it just yet but it's basically an immutable list. We can index
into it but can't change it (much like a string).
The reason I need to let you know about it here is because we can retrieve a
list of tuples using the dictionary method items().
We can see the tuples have the form (e1, e2, ....., en) where e1....en are
elements.
Remember, dict.items() returns a 'list' of tuples with each tuple containing two
elements, the key and the value. When we say for (k, v) in
my_dict.items() we are essentially saying for each tuple in the list of tuples,
assign k to the element in the first position in the tuple and v to the element in
the second position. I'll illustrate this below:
(k, v)
| |
V V
("one", 1)
109
Key: one and Value: 1
Notice how they're not in the same order as they appear when we created the
dictionary.
As you can see, the dictionary has been sorted based on the keys. Since the
keys are strings, they are sorted lexicographically.
We can also sort on values, however, we don't have the knowledge to do that
yet. We'll come back to that in the functions chapter.
We can however reverse the order they are sorted in. The sorted() function
has an optional parameter called reverse. We can set this to True and the
dictionary will be sorted in reverse order based on the keys.
phonebook = {"Bill": "457-2344356", "Adam" : "359-5550983"}
110
11.7 - Exercises
Question 1
Write a program that reads a file. The file will contain a number of items
stocked in a shop and the number of each item available. Your program
should parse the file and build a dictionary from this it, then output the stock
available in alphabetical order and in a specific format. You can use the
following as your input text file:
Oranges 12
Apples 10
Pears 22
Milk 7
Water Bottles 33
Chocolate Bars 11
Energy Drinks 8
Question 2
Write a program that reads a text file. The file is a list of contact details for
people. With each contact you're given the name, phone number and email
address. Your dictionary should be a mapping from contact names to email
and phone number for that contact. Your program should then ask for user
input. The input should be a single name. If the name can be found in the
contact list then output the name and contact details for that person,
otherwise output No contact with that name.
111
The contact list file is:
Liam 345-45643454 [email protected]
Noah 324-43576413 [email protected]
Tony 987-56543239 [email protected]
Bert 654-99275234 [email protected]
Name: Noah
Email: [email protected]
Phone: 324-43576413
Name: Bert
Email: [email protected]
Phone: 654-99275234
These details should be printed one by one as the user enters the names, not
in bulk as I have above.
Question 3
112
In the previous chapter we tried to count how many words were contained
within a piece of text. In this question you are to do the same except you
should output how many times each word occurred in the text.
To test that your solution is correct, use the Second Inaugural Address of
Abraham Lincoln as your input text and your program should output (this is a
sample from the out):
.
.
.
the : 58
oath : 1
of : 22
presidential : 1
office : 1
there : 2
is : 6
less : 2
occasion : 2
for : 9
an : 3
extended : 1
address : 2
than : 4
.
.
113
.
You need not format the output. The length of your dictionary should be 701.
Question 4
Write a program that takes two dictionaries and outputs their intersection. The
intersection of two dictionaries should be a third dictionary that contains key-
value pairs that are present in both of the other two dictionaries. You can hard
code these two dictionaries in. They don't need to be read from anywhere.
To test if your solution is correct, you can run your program with the two
following dictionaries:
d1 = {"k1": True, "k2": True, "k3": True, "k4": True}
d2 = {"k6": True, "k2": True, "k5": True, "k1": True, "k8": True, "k4":
True}
114
Chapter 12 - Functions & Modules
12.1 - What are functions?
In Python, a function is some code that takes input, performs some
computation and produces an output. In programming, a function is a
reusable block of code which performs a specific task.
Some of the common functions we have met are len() and print(). When we
want to know the length of something we pass that something to
the len() function for example.
We have also met methods. Methods are functions that are associated with an
object. For example the format() method operates on strings. Methods are
implemented the same way as functions and are essentially functions that
operate on associated objects, but we'll get back to those later.
We define our functions using the def keyword. Let's take a look at a function
that prints "Hello, World!" to the screen.
1. def hello():
2. print("Hello, World!")
Above we have defined, using def, a new function called hello(). When
we call our function, the functions code block will execute, in this case we just
115
print "Hello, World!" to the screen. This function can be used as follows within
our code:
1. def hello():
2. print("Hello, World!")
3.
4. hello()
The last line above calls our function. This should be familiar as you've called
other functions many times now!
return result
I'll explain what arg1, arg2, ..., argn is in the next section but let's take a
look at the return statement.
In our first function we didn't get a result back, it simply printed something to
the screen. Let's write a function that will return the text "Hello, world!" to us.
1. def hello():
2. return "Hello, World!"
1. def hello():
2. return "Hello, World!"
3.
4. x = hello()
5. print(x)
6.
7. # ALTERNATIVELY
8. print(hello())
In this function, the string Hello, World! is handed back to the function caller.
Since the function returns a value, its caller is expected to collect that value.
Above we see the caller collects the returned value and stores it in x. You can
see that the print() function also collects the value returned by
the hello() function.
116
In the previous section we looked at the syntax for a function. In that, we had
something like:
def myFunc(arg1, arg2, arg3):
#
The arg1, arg2, arg3 are called the functions parameters. A function
parameter is like a variable that is local to the function thus cannot be
referenced outside of that function. Another name of a parameter is a formal
parameter. If our function has one formal parameter, then we must pass a
value to the function when we call it. This value is called an argument. Another
name for an argument is an actual parameter.
In the above function, x and y in the function definition are called the
parameters.
When we call the function, we pass 5 and 7 to the function. These are the
arguments that we supplied to the function.
The variable result is also local to the function and cannot be referenced from
outside the function.
A procedure changes the state of our program. For example, they may update
values of variables, print something to the screen or update a file.
Functions that return a value inspect the state of our program. They return a
value based on their inspection.
117
Remember the split() method? I said if we don't pass any arguments to this
method then it will split a string based on whitespace. This is called
a default argument.
In the above function, we must pass a radius as an argument when calling it.
We don't however have to pass an x or y coordinate. If we don't pass an x or y
coordinate, our function defaults these values to 0.
1. def my_function(*argv):
2. for arg in argv:
3. print(arg)
4.
5. my_function("first", "second", "even a third")
In the above example, argv is some arbitrary name we chose as the parameter
name. The * beforehand indicates that it is a variable length parameter.
1. def my_function(**kwargs):
2. for k, v in kwargs:
3. print("{} : {}".format(k, v))
4.
5. my_function(first="Hello", second="World")
118
As you can probably tell, kwargs is a dictionary. We indicate that it is a keyword
variable length parameter by the ** before the variable name.
Both normal and keyword variable length parameters must be put at the end
of the parameter list when defining your functions parameters. There is good
reason for this. If we had def func(x, y, *argv, z) we wouldn't know
where *argv ended.
I'm going to say, be cautious when using variable length parameters. They
should only be used when you have a good reason to use them (and that isn't
very often).
Variables which are defined inside code blocks are local to that block.
A block is a construct that delimits the scope of any declaration within it. In
Python, a variable defined inside a function is local to that function. It is
accessible from when it is defined until the end of that function.
119
9. word = "orange"
10. tlist = add_to_list(word, ['pear'])
11. print(tlist)
12. word = "banana"
13. tlist = add_to_list(word)
14. print(tlist)
15.
16. main()
$ py trap.py
['apple']
['pear', 'orange']
['apple', 'banana']
Let's look at why this is strange behaviour. We see that the second parameter
to add_to_list() is optional (indicated by the default value []). If no argument
for this parameter is supplied when calling the function then that parameter
takes on the value of [], the empty list.
The first time we call the function and pass it apple, we get back ['apple'].
This is fine and it's what we expected. The second time we call the function
and pass it orange and a list of ['pear'], we get back the list ['pear',
'orange']. This is also fine and what we expected. The third time however, we
get some unexpected behaviour.
Why is apple in the list if we called the function and passed banana and no list.
Surely we should just get back ['banana']? This is called the default mutable
argument trap and no, it's not a bug.
The takeaway from this is to not use mutable types as default arguments.
Instead, work around it as follows:
120
14. word = "banana"
15. tlist = add_to_list(word)
16. print(tlist)
17.
18. main()
Modules allow you to logically organize your Python code. Grouping code into
modules makes it easier to understand and use your code. It is not uncommon
for a software project to span hundreds of files.
You can import modules using the import statement which we have seen when
importing sys and string.
We can import specific attributes from modules using the from keyword
e.g. from string import punctuation. We can also import all attributes from a
module using * e.g. from string import *.
When we import a module, the Python interpreter searches for the module in
the following manner:
121
12.7 - Creating our own modules
We can create our own modules to logically organize our code. Let's say we
want to create a module that contains some maths functions. We do this as
follows:
1. # maths.py
2.
3. def add(x, y):
4. return x + y
5.
6. def multiply(x, y)
7. return x * y
8.
9. def main():
10. print(add(x, y))
11. print(multiply(x, y))
12.
13. if __name__=="__main__":
14. main()
The add() and multiply() functions are fine and we have come across similar
functions before. The main() function we have also come across in a previous
section but the if __name__=="__main__":, we have not come across before.
When you execute a program directly from the command line (as you've been
doing), the Python interpreter sets a special variable for that script. That
variable is __name__. When a script is executed directly, that variable is set
to "__main__", otherwise, if it is being imported for example, then it is set to
the name of the module, "math" in this case.
It is good practice to include this check in your scripts. This check is also
usually located at the end of the script.
The way Python handles the __name__ variable has changed in Python 3.7. It is
now under PEP 567. On the surface nothing has drastically changed.
122
12.8 - Exercises
When completing these exercises, use the following pattern when writing your
scripts:
If you are asked to write module, then you should save the module in the
same directory as your testing script.
Question 1
Write a module named sorting.py that contains functions that implement
both selection and insertion sort i.e your module (sorting.py) should contain
a selection_sort(a) function and an insertion_sort(a) function. Your function
should return the values, not print them.
1. import sorting
2.
3. a = [5, 6, 3, 8, 7, 2]
4. print(sorting.selection_sort(a))
5. a = [5, 6, 3, 8, 7, 2]
6. print(sorting.insertion_sort(a))
$ py test.py
[2, 3, 5, 6, 7, 8]
[2, 3, 5, 6, 7, 8]
Question 2
Write a module called arithmetic.py. This module should include the following
functions:
• add()
• multiply()
123
• divide()
• subtract()
$ py test.py
22
5
-18
1
2
6
18
Question 3
Write a function called length() that mimics the len() function. Your function
should work for strings, dictionaries and lists. length() should only take 1
argument and return the length of the data-structure passed to it.
1. a = [5, 3, 4, 1, 2, 3]
2. print(length(a))
3.
4. a = []
5. print(length(a))
6.
124
7. d = {}
8. print(length(d))
9.
10. d = {"one": True, "two": True}
11. print(length(d))
12.
13. s = "This is a string"
14. print(length(s))
15.
16. s = ""
17. print(length(s))
$ py test.py
6
0
0
2
16
Question 4
Write a function called fib() that calculates and returns the n-th Fibonacci
number. Your function should take 1 argument, the Fibonacci number you
want to calculate.
1. print(fib(3))
2. print(fib(0))
3. print(fib(1))
4. print(fib(10))
5. print(fib(13))
$ py test.py
3
1
1
89
377
Remember:
fib(n)=fib(n − 1)+fib(n − 2)
fib(0)=1
125
fib(1)=1
Question 5
Write a function called read_file() which takes a single argument, a filename
(as a string), and returns the contents of the file in list form with each element
in the list being a single line from the file.
1. lines = read_file("input.txt")
Question 6
Write a function procedure called rep_all() that takes 3 arguments, a list of
integers and 2 numbers. Your procedure should replace all occurrences of the
first number with the second.
Calling your function as follows and printing the list afterwards should
produce the following:
1. a = [4, 2, 3, 3, 7, 8]
2. rep_all(a, 3, 10)
3. print(a)
$ py test.py
[4, 2, 10, 10, 7, 8]
126
Chapter 13 - Binary Search
13.1 - Number guessing game
Let's assume we have a sorted array of length 100,000. Further assume
somebody has chosen a number from that array, at random and we don't
know what it is. We're given 20 attempts to guess the answer. This may seem
impossible and that we only have luck on our side.
We can use the knowledge we have so far to write a function that guesses this,
and we have a couple of ways of doing it.
Let's look at two ways we might attempt this. The first is linear search.
The first line imports the randint() function from the random module which
takes two integer arguments and returns a random integer between the
arguments.
The second line builds up the array using something called a list
comprehension. We'll get back to these soon.
The third line chooses that random number as the secret number and the rest
of the code is standard linear search to try find the secret.
Obviously, this is a terrible idea as the secret number must be between 0 and
20 which has a 20/100000 chance of happening.
127
1. from random import randint
2. arr = [x for x in range(0, 100000)]
3. secret = randint(0, len(arr))
4.
5. guess = randint(0, len(arr))
6. i = 1
7. while i < 20 and guess != secret:
8. guess = randint(0, len(arr))
9. i += 1
10.
11. if i < 20:
12. print("The secret number is " + str(i))
13. else:
14. print("Hard luck, you didn't guess the secret number")
Better approach? Maybe, but we can do much better. In fact, we can guess the
number within 20 guesses every time without fail. We use a new algorithm
called Binary Search to do that.
2 4 9 12 34 35 77
| low | high
|_____________________| # We will call these two position low and
high
2 4 9 12 34 35 77
| low | high
|_____________________| # The number is somewhere between low and
high
2 4 9 12 34 35 77
| low | high # We're also able to calculate the index
for
128
|_____________________| # the number in the middle of low and high
2 4 9 12 34 35 77
| low | middle | high # We're also able to calculate the index
for
|_________|____________| # the number in the middle of low and high
#
# Assume the number we're searching for is 4
#
2 4 9 12 34 35 77
| low | middle | high # We're also able to calculate the index
for
|_________|____________| # the number in the middle of low and high
2 4 9 12 34 35 77
| low | middle | high # As the list is sorted we can check if
the
|_________|____________| # number at 'middle' is <= or > 4
2 4 9 12 34 35 77
| low | middle | high # In this case 4 is < 12 so we don't need
to
|_________|____________| # bother searching anything above middle
index
2 4 9 12
| low | high # So we cut that portion of the list out
|_________| # and make middle the new high
2 4 9 12
| low | high # Repeat the process,
129
|_________| # continuously halving the list until the
condition
# that low < high fails i.e low == high
We calculate the midpoint by getting the average of low and high i.e. (low +
high) // 2. Notice we're doing integer division here, so we'll round down if
there is a decimal place.
Roughly, only 10% of developers are able to code binary search properly.
There’s a little cop out in the middle of it so we need to be careful so that our
solution works correctly in all cases.
It is important that mid and high are never equal. This is one place where some
developers mess up when coding this algorithm.
130
We must add 1 when updating the low value because if low == mid, we will end
up in an infinite loop and that is bad news! This is another place where
developers mess up.
Since binary search has its cop outs like those above, I recommend you learn it
off by heart.
Our improved version of the game will guess (find) the secret number within
20 attempts every time.
This is important because we get guess the answer correctly, in the worst case,
in 20 attempts, whereas our previous approaches, the worst case would have
been 100,000 attempts!
131
O(n2)
O(log(n))
More specifically:
O(log2n)
It is a little trickier to tell what the runtime of an algorithm like this is. As a
general rule of thumb, if we halve the input upon each iteration then it will
have a logarithmic runtime (or at least, a logarithmic component to its runtime
complexity).
Recall the graph of the Insertion Sort and an O(n) algorithm. Here is the graph
of an algorithm with binary search's runtime complexity.
You can clearly see that as the input gets much larger, the time it takes for the
algorithm to complete begins to level off and increasing the input size begins
to have a smaller and smaller effect on the algorithm’s performance.
132
Binary Search's space complexity is also O(1), which is constant memory, as we
don't create any copies of the list when searching.
13.6 - Exercises
Question 1
Answer the following questions:
Question 2
Using what you've learned in this chapter, write a function called insert()that
uses an efficient algorithm (hint, hint...) and takes a list and an element as
input and returns the index at which that element should be inserted into the
list.
For example, calling the insert function as such should return the following:
1. insert([1, 3, 4, 5], 2)
2.
3. # RETURNS
4. 1 # i.e. 2 should be inserted at index 1
Question 3
Again, using what you've learned, write a function called contains() that takes
two arguments, a list and an element and returns whether or not that element
is contained within that list.
For example:
contains([1, 3, 5, 7], 3)
# RETURNS
True
------------------------------
133
contains([1, 3, 5, 7], 2)
#RETURNS
False
134
Chapter 14 - Error Handling
14.1 - Raising Exceptions
In Python you'll run into many kinds of errors and I'm sure you've ran into
plenty so far!
We've seen syntax errors which arise from improper syntax e.g. too many
closing brackets. You'll also run into something called exception errors.
Exception errors arise whenever syntactically correct Python code results in an
error. There are many types of exception errors, to name a few, you'll get
an ImportError when you try to import a module that cannot be found, you'll
get a ZeroDivisionError when the second operand of the division or modulo
operator is zero. If you run into an error that doesn't fall into a specific
category of exception, Python will throw a RuntimeError.
Running this with the following inputs would give the following output:
135
$ py change_vel.py
5
6
7
5
Traceback (most recent call last):
File "test3.py", line 14, in <module>
main()
File "test3.py", line 10, in main
raise Exception("Car cannot have a negative speed")
Exception: Car cannot have a negative speed
As you can see, we raised an exception with our own, custom error message.
14.2 - Assertions
If you remember back to the beginning of this book, I talked about pre and
post conditions. A pre-condition is a condition that always holds true prior to
the execution of some code. Preconditions may be used in functions to
enforce a contract between a function and its invoker. Let's assume we have
some function that requires a list as an argument. We want to assert that the
argument passed to the function is a list and not something else like a string.
1. def sorter(arr):
2. assert(type(arr) == list)
3. # Do the sorting
4.
5. def main():
6.
7. arr = [6, 3, 5, 1, 8]
136
8. sorter(arr)
9.
10. print("We've made it this far")
11.
12. arr = "Hello"
13. sorter(arr)
14.
15. if __name__=="__main__":
16. main()
AssertionError
As you can see our assertion was True when we passed the list, so the program
continued as normal. When we passed a string to the sorter function, the
assertion returned False and our program exited and raised an AssertionError.
The way try and except blocks work are not all that different from
an if and else statement. Inside the try block we put code that should run as
"normal". If an exception is raised in the try block, then we catch this
exception in the except block. By catching the exception, we can handle the
error without our program crashing.
Let me give two examples. One where we don't try to catch the exception and
one where we do.
1. filename = input()
2. with open(filename) as f:
137
3. lines = f.readlines()
4.
5. print(lines)
Our program has crash. What if we wanted to handle this appropriately and
allow the user to enter the filename again? We use a try and except block to
do this.
138
+-- GeneratorExit
+-- Exception
+-- StopIteration
+-- StopAsyncIteration
+-- ArithmeticError
| +-- FloatingPointError
| +-- OverflowError
| +-- ZeroDivisionError
+-- AssertionError
+-- AttributeError
+-- BufferError
+-- EOFError
+-- ImportError
| +-- ModuleNotFoundError
+-- LookupError
| +-- IndexError
| +-- KeyError
+-- MemoryError
+-- NameError
| +-- UnboundLocalError
+-- OSError
| +-- BlockingIOError
| +-- ChildProcessError
| +-- ConnectionError
| | +-- BrokenPipeError
| | +-- ConnectionAbortedError
| | +-- ConnectionRefusedError
| | +-- ConnectionResetError
| +-- FileExistsError
| +-- FileNotFoundError
| +-- InterruptedError
| +-- IsADirectoryError
139
| +-- NotADirectoryError
| +-- PermissionError
| +-- ProcessLookupError
| +-- TimeoutError
+-- ReferenceError
+-- RuntimeError
| +-- NotImplementedError
| +-- RecursionError
+-- SyntaxError
| +-- IndentationError
| +-- TabError
+-- SystemError
+-- TypeError
+-- ValueError
| +-- UnicodeError
| +-- UnicodeDecodeError
| +-- UnicodeEncodeError
| +-- UnicodeTranslateError
+-- Warning
+-- DeprecationWarning
+-- PendingDeprecationWarning
+-- RuntimeWarning
+-- SyntaxWarning
+-- UserWarning
+-- FutureWarning
+-- ImportWarning
+-- UnicodeWarning
+-- BytesWarning
+-- ResourceWarning
We were being quite specific with our exception in the previous example.
Assume another error might occur somewhere in our program and we are
unsure of what it might be, but we still want to allow the user to enter a
140
filename after it occurs. We can have multiple except clauses after
the try block.
Now, if we run into any other error, we have handled it with a more general
exception.
1. inp = int(input())
2. try:
3. if inp < 5:
4. raise ValueError
5. else:
6. print(inp)
7. except ValueError:
8. print("Input shouldn't be less than 5")
14.4 - finally
Usually, if we have a try and except block, we'll try to do something specific
that may be prone to errors. In the example from the previous section, that
specific something was opening a file. We shouldn't have to ask the user for
input again in the try block. We also shouldn't have to ask the user to enter
the filename again after we handled the exception in the except block.
141
11. filename = input("Enter a file name: ")
14.5 - Exercises
Question 1
Write a program that takes a number from a command-line argument and
print out the multiples of that number up to 10. The user should also be able
to specify a formatting flag and/or shortening flag at the command line. As an
example, take a look at a couple of different ways a user could run the
program:
$ py times_tables.py 15
0 * 15 = 0
1 * 15 = 15
2 * 15 = 30
3 * 15 = 45
4 * 15 = 60
5 * 15 = 75
6 * 15 = 90
7 * 15 = 105
8 * 15 = 120
9 * 15 = 135
10 * 15 = 150
$ py times_tables.py -f 15
0 * 15 = 0
1 * 15 = 15
2 * 15 = 30
3 * 15 = 45
4 * 15 = 60
5 * 15 = 75
142
6 * 15 = 90
7 * 15 = 105
8 * 15 = 120
9 * 15 = 135
10 * 15 = 150
$ py times_tables.py -f -s 15
0
15
30
45
60
75
90
105
120
135
150
In the second example, notice the -f flag and, the -s flag in the third example.
Your program should be able to handle the following situations that may raise
exceptions:
Question 2
When should we use the following?
143
• raise
• finally
• try and except
• What is a precondition?
• What is a postcondition?
144
Chapter 15 - More on Data Types
15.1 - Tuples
A tuple is essentially an immutable list. Tuples, like strings, cannot be modified
after creation. However, a tuple is different to a string in that while strings
consist solely of characters, a tuple can contain objects of any type (like a list).
<class 'int'>
We can concatenate tuples using the + operator and repeat tuples using
the * operator:
>>> t = (2, 4, 6, 8)
>>> u = (1, 3, 5, 7)
>>> t + u
(2, 4, 6, 8, 1, 3, 5, 7)
>>> t * 2
(2, 4, 6, 8, 2, 4, 6, 8)
145
We can also slice and index in the same way we do with strings:
>>> t = (5, 4, 3, 2, 1)
>>> t[3]
2
>>> t[::-1]
(1, 2, 3, 4, 5)
>>> t[1:3]
(4, 3)
We can also iterate over tuples using loops. I'm going to use the for loop here,
but you can use while too:
>>> chars = ("a", "b", "c")
>>> for c in chars:
... print(c)
...
a
b
c
146
... print(k, v)
...
a 10
b 11
c 12
15.2 - Sets
A set is a collection of objects of arbitrary type. The set's objects are called
its members. Sets are an unordered type (like dictionaries). They are also
iterable. An important feature of sets is that they only contain one copy of a
particular object i.e. duplicates are not allowed. Sets are mutable types.
Sets are generally used for membership testing and removing duplicates. They
usually hold a collection of similar items.
To take full advantage of sets, you'll need to have a good understanding of set
theory. I won't be covering set theory in this book so you can read up on it
here: Introduction to Set Theory. You may however be familiar with basic set
theory as you covered it in school. Set's are very important and are one of the
data types that creep up in interviews.
We can also create a set using comma separated values enclosed in curly
braces:
x = {1, 2, 3, 4}
Note that we cannot use {} to create an empty set as this would create an
empty dictionary.
When we use the in operator on strings and lists, the runtime complexity is
O(n). When we use this operator on sets it's on average O(1). This is the same
with dictionaries. It is therefore very fast to access.
147
To add a new member to a set we use the add() method:
>>> x = {2, 4, 6}
>>> x.add(8)
>>> x
{8, 2, 4, 6}
{9, 2, 4, 5, 6}
148
We can get the union of two sets A and B using the union() method. The
union of two sets is the set of elements in both A and B:
>>> A = {1, 3, 3, 7}
>>> B = {2, 3, 6, 8}
>>> A.union(B)
{1, 2, 3, 6, 7, 8}
We can get the set difference between two sets A and B using
the difference() method. The A set difference B is the set of elements in A but
not in B. Likewise, B set difference A is the set of elements in B but not in A:
>>> A = {1, 2, 3, 4, 5, 6}
>>> B = {4, 5, 6, 7, 8, 9}
>>> A.difference(B)
{1, 2, 3}
>>> B.difference(A)
{8, 9, 7}
We can also check if set B is a subset of A using the issubset() method. Set A
is a subset of set B if every member of A is also a member of B:
>>> A = {2, 4, 6}
>>> B = {1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> A.issubset(B)
True
>>> B.issubset(A)
False
149
15.3 - Lists & list comprehensions
We've seen how to create lists and you should be used to doing it at this
point. However, consider the following task:
Write a program that builds a list that contains all the even numbers from a
second list.
The task of building one list by processing elements from another list is a very
common task in computing. Until now we have had to use some kind of loop
to iterate over all the elements in the second list and pick out the even
numbers. Python has provided a short-cut for this called a list comprehension.
You have seen me using a list comprehension in a previous chapter to build a
list of 1,000 integers.
For example, to create a list that contains every number from 0 to 1,000 we
could use the following list comprehension:
lst = [x for x in range(0, 1000)]
The above list comprehension reads: "Add x to the new list for each x in the list
of elements from 0 to 1000".
To solve the task, I mentioned at the beginning of this section we could use
the following list comprehension:
>>> my_lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
>>> evens = [x for x in my_lst if not x % 2]
>>> evens
[2, 4, 6, 8, 10, 12, 14]
In the above list comprehension, x is the expression, for x in my_lst is the for-
clause and if not x % 2 is the condition. It reads as follows: "Add x to the new
list for each x in my_lst if x modulus 2 does not equal 0".
150
We can use a list comprehension to build up a list of squared values from
another list using the following list comprehension:
>>> vals = [1, 2, 3, 4, 5, 6]
>>> squares = [x ** 2 for x in vals]
>>> squares
[1, 4, 9, 16, 25, 36]
Let's look at a list comprehension that will square the even numbers and leave
any odd numbers the same:
>>> vals = [1, 2, 3, 4, 5, 6, 7, 8]
>>> evens_squared = [x ** 2 if not x % 2 else x for x in vals]
>>> evens_squared
The above list comprehension reads as: "Add the square of x if x is even
otherwise just x for each x in vals"
List comprehensions are so common to stumble across and you'll often see
them being used if you research any problems online. They are generally a
construct only found in Python. If you move on to Java for example in the
future, you won't be able to use list comprehensions as they don't exist in
Java.
Comprehensions don't just apply to lists. They can be used as short-cuts for
building a new collection from another collections. We therefore have set and
dictionary comprehensions:
>>> my_set = {x for x in range(10)}
>>> my_set
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> my_dict = {x:True for x in range(5)}
>>> my_dict
{0: True, 1: True, 2: True, 3: True, 4: True}
Practice with these! Use them when you can (which will probably be often).
Making use of comprehensions is the Pythonic way of writing Python code.
15.4 - Exercises
151
Download the following file (copy this page to a text file): Dictionary.
Question 1
During a study, several software engineers were asked what their top 3
favourite programming languages were.
Your program should be run as follows and give the following output:
$ py unique.py input.txt
8
Question 2
Write a list comprehension that creates a list containing all the odd numbers
between 0 and 10,000.
Question 3
Using the dictionary you downloaded earlier, write a program that builds a list
containing all the words that are longer than or equal to 18 characters. You
should use a list comprehension to do this!
Note: This may take some time to run as the input file is large!
Question 4
152
Using the dictionary, you downloaded earlier, write a program that builds a list
containing all the words that contain every vowel (a, e, i, o and u). For example,
the word "equation" should be in the list. You should make good use of a list
comprehension to do this!
Question 5
Using the dictionary, you downloaded earlier, write a program that builds a list
containing all the words that contain exactly 4 a's and end in 'ian'. For
example, the word "alabastrian" should be in the list. You should make good
use of a list comprehension to do this!
Question 6
Using the dictionary, you downloaded earlier, write a program that builds a list
containing all the words in the dictionary whose reverse also occurs. For
example, the words "lager" and "regal" should be in the list. You should make
good use of comprehensions to do this!
Tip: You may want to think about your solution for this. Your program may
take a very long time to run if you don't implement a more efficient solution!
153
Chapter 16 - Recursion & Quicksort
16.1 - Recursion
Before I go any further, I want to say this, writing a recursive solution is
DIFFICULT when starting out. Don’t expect it to work first time. Writing a
recursive solution involves a whole new way of thinking and can in fact be
quite off putting to newcomers but stick with it and you’ll wonder how you
ever found it so difficult.
So, what exactly is recursion? A recursive solution is one where the solution to
a problem is expressed as an operation on a simplified version of the same
problem.
Now that probably didn’t make a whole lot of sense and it’s very tricky to
explain exactly how it works so let’s look at a simple example:
1. def reduce(x):
2. return reduce(x-1)
So, what’s happening here? We have a function called reduce which takes a
parameter x (an integer) and we want it to reduce it to ‘0’. Sounds simple so
let’s run it when we pass 5 as the parameter and see what happens:
>>> reduce(5)
RuntimeError: maximum recursion depth exceeded
To fix this problem we need to add in something referred to as the base case.
This is simply a check we add so that our function knows when to stop calling
itself. The base case is normally the simplest version of the original problem
and one we always know the answer to.
154
Let’s fix our error by adding a base case. For this function we want it to stop
when we reach 0.
1. def reduce(x):
2. if x == 0:
3. return 0
4. return reduce(x-1)
Great it works! Let’s take another look at what is happening this time. We call
reduce(5) which calls reduce(4)….which calls reduce(0). Ok stop here, we have
hit our base case. Now what the function will do is return 0 to the previous call
of reduce(1) which returns 0 to reduce(2)….which returns 0 to reduce(5) which
returns our answer. There you go, your first recursive solution!
Let’s look at a more practical example. Let’s write a recursive solution to find
N! (or N factorial):
For example, 4! = 4 * 3 * 2 * 1
Our base case for this example is N = 0 as 0! is defined as 1. So, lets write our
function:
1. def factorial(n):
2. if n == 0:
3. return 1
4. return n * factorial(n-1)
OK let’s see that in action. We’ll call our function and pass 4 in for n:
factorial(4) calls factorial (3) ….. which calls factorial(0) which is our base case
so we return 1 back to the previous call of factorial(1) which takes our 1 and
multiplies it by 1 which evaluate to 1 which passes that back to factorial(2)
which takes our 1 and multiplies it by 2 which evaluates to 2 which passes that
back to factorial(3) which multiplies 3 2 to get 6 which passes that back to
factorial(4) which multiples 4 6 which evaluates to 24 which is returned as our
answer.
These are very simple cases and not super useful but being able to spot when
a recursive solution may be implemented and then implementing that solution
155
is a skill that will make you a better programmer. For certain problems
recursion may offer an intuitive, simple, and elegant solution.
For example, you may implement a recursive solution to find the number of
nodes in a tree structure, count how many leaf nodes are in a tree or even
return whether a binary search tree is AVL balanced or not. These solutions
tend to be quite short and elegant and take away from the long iterative
approach you may have been taking.
As I’ve said, recursion isn’t easy but stick with it, it will click and seem so simple
you’ll wonder how you ever found it so difficult.
16.2 - Memoization
We've written the solution to the Fibonacci problem a couple of times
throughout this book. When writing those solutions, we've used an iterative
approach. We can, however, take a recursive approach. It may be difficult to
recognize when a recursive solution is an option, but this is a great example of
when a recursive solution is an option!
Write a program that takes an integer n and computes the n-th Fibonacci
number:
1. def fib(n):
2. if n <= 1:
3. return 1
4. else:
5. return fib(n-1) + fib(n-2)
That’s much simpler! In this example we'll take fib(0) to be 1 and fib(1) to be 1
also. Therefore, our base case is if n is 0 or 1 we return 1. The recursive case
comes straight from the definition of the Fibonacci sequence, that is:
fib(n)=fib(n − 1)+fib(n − 2)
However, you may have noticed that if you input anything greater than
(roughly) 35 - 40 your program begins to really grind to a halt and take a VERY
long time and anything above 45~50, well... go make a cup of coffee and
come back and it MIGHT be finished. What's causing this? Let's look at a
diagram of the first few recursive calls to fib().
156
We can see that when we call fib(5) we, at various points in our functions
execution, call fib(1) 5 times! and fib(2) 3 times! This is unnecessary. Why
should we have to recompute the value for fib(1) over and over again?
We can use a dictionary to implement a cache. This will give us O(1) for
accessing and will drastically improve the performance of our function.
157
We can now call fib(900) and get the answer straight away! By default, Python
limits the recursion depth to 1000. We can override this but it's usually not a
good idea!
We can see from the tree that there are less calls to the function even though
we are a value higher. This is a significant performance increase. Even calling
from fib(5) we have saved ourselves 6 function calls!
To achieve this, I’ll first cover quicksort’s core operation: partitioning. It works
as follows:
>>> A = [6, 3, 17, 11, 4, 44, 76, 23, 12, 30]
>>> partition(A, 0, len(A)-1)
>>> print(A)
158
[6, 3, 17, 11, 4, 23, 12, 30, 76, 44]
So, what happened here and how does it work? We need to pick some
number as our pivot. Our partition function takes 3 arguments, the list,
the first element in the list, and the pivot. What we are trying to achieve here is
that when we partition the list, everything to the left of the pivot is less than
the pivot and everything to the right is greater than the pivot. For the first
partition seen above, 30 is our pivot. We'll always take our partition to be the
last element in the list. After the partition we see some elements have changed
position but everything to the left of 30 is less than it and everything to the
right is greater than it.
So, what does that mean for us? Well it means that 30 is now in
its correct position in the list AND we now have two easier lists to sort. All of
this is done in-place, so we are not creating new lists.
The return q at the end isn’t necessary for our partition but it is essential for
sorting the entire list. The above code works its way across the list A and
maintains indices p, q, j, r.
p is fixed and is the first element in the list. r is the pivot and is the last
element in the list. Elements in the range A[p:q-1] are known to be less than
or equal to the pivot and everything from A[q-1:r-1] are greater than the
pivot. The only indices that change are q and j. At each step we
compare A[j] with A[r]. If it is greater than the pivot it is in the correct
position, so we increment j and move to the next element. If A[j] is less
than A[r] we swap A[q] with A[j]. After this swap, we increment q, thus
extending the range of elements known to be less than or equal to the pivot.
We also increment j to move to the next element to be processed.
159
16.4 - Quicksort part 2
Now onto the quicksort part. Remember it is a recursive algorithm so it will
continuously call on partition() until there is nothing left to partition and all
elements are in their correct positions. After the first partition we
call partition() again, this time we call it on the list of elements to the left of
the pivot and the list of elements to the right of the pivot.
It’s that simple. All we do here is check if the index of the pivot, is less than or
equal to the index of the start of our list we want to partition. If it is we return
as whatever list was passed does not need to be partitioned any further.
Otherwise, we partition the list A, and call quicksort again on the two new sub
lists.
Quicksort works best on large lists that are completely scrambled. It has really
bad performance on lists that are almost sorted. Or in Big-O notation, the best
case (scrambled) is O(n log(n)) and in the worst case, (almost or completely
ordered list) is O(n^2).
Again, I encourage you to try this on paper with a simple list. It will help clarify
what is going on.
16.5 - Exercises
Important: These exercises are quite difficult! Make use of problem-solving
techniques to help you arrive at a solution. Sketching out what's happening on
paper, it's a good way to help you out!
160
Some of these questions are the kind of questions you can expect if you get a
recursion question when interviewing at companies such as Google, Facebook
or Amazon.
Question 1
What is:
• Recursion?
• Memoization?
Question 2
Write a recursive function that adds up all the numbers from 1 to 100.
Question 3
Write a recursive function that takes an integer as an argument and returns
whether that integer is a power of 2. Your function should return True or False
Question 4
Write a recursive function that takes a string as an argument and returns
whether that string is a palindrome. Your function should return True or False.
Question 5
Write a recursive function that takes a list of integers as an argument and
returns the maximum of that list.
Hint: Thinking recursively, the maximum is either the first element of the list or
the maximum of the remaining list!
161
Chapter 17 - Object Oriented
Programming (OOP)
17.1 - Classes & Objects
So far, we’ve met many of Pythons built-in types such
as Booleans, integers, dictionaries etc. These are all class types. This means
that any instance of a string, list, float etc, is an object of the class string, list or
float. In other words, every string object, e.g. "Hello", is an instance of
the string class. An object is an implementation of a type
In OOP, programs are designed by making them out of objects that interact
with one another. An important feature of objects is that they use their
methods to access and modify their attributes. For example, if we wanted to
model a Person, they might have an age attributes and a name attribute. Each
year they get one year older so they may have a change_age() method to
update their age.
Most of the time, Python’s built-in types are not enough to model data we
want, so we built our own types (or classes) which model the data exactly how
we need.
A good way to think of this is that a class is a cookie cutter and an instance of
that class is the cookie we cut out.
162
17.2 - Defining a new type
Let’s say we wanted to model Time. Python’s built in data types are not going
to be enough, to easily and logically model this. Instead, we would create our
own Time class. We define a new class using the class statement.
1. class Time:
2. pass
We have now defined a new class called time. As a side note, pass basically
means “do nothing”. I have it there so our program will not crash.
Save the above code to a file called my_time.py and import it as follows:
>>> from my_time import Time
>>> t = Time()
>>> type(t)
<class 'time.Time'>
>>> isinstance(t, Time)
True
>>> print(t)
<time.Time object at 0x000001A7C6902F60>
We can see that t is of type Time and t is an instance of Time and we can see it
is an object stored at the memory location 0x000001A7C6902F60.
1. class Time:
2.
3. def set_time(time_obj, hours, minutes, seconds):
4. time_obj.hours = hours
5. time_obj.minutes = minutes
6. time_obj.seconds = seconds
7.
8. def display(time_obj):
9. print("The time is {}:{}:{}".format(time_obj.hours,
10. time_obj.minutes,
11. time_obj.seconds))
In the above class definition, the set_time() method requires four arguments.
The first is the time object we want to initialise and the other three are the
hours, minutes, and seconds we want to set for our time object.
163
As shown above, we access the hours, minutes and seconds attributes using
the period operator e.g time_obj.hours = hours. This is saying, for the time
object that was passed, set its hours attribute to the hours passed as an
argument.
In the display() method, we access the hours, minutes, and seconds using the
period operator.
164
'The time is 11:32:45'
Note how we only have to pass three arguments to the set_time() method
rather than the four we declared. This is because when we invoke an instance
method on an object, that object is automatically passed as the first argument
to the method. Therefore, we supply one less argument than the number we
declared in the method definition. Python will automatically supply the
missing object as the first argument on our behalf.
Instance methods are methods that act upon a particular instance of an object.
It’s first parameter is always the object upon which it will operate. By default,
all a classes’ methods are instance methods unless we declare them as class
methods which we will get to in a later chapter.
1. class Time:
2. def set_time(self, hours, minutes, seconds):
3. self.hours = hours
4. self.minutes = minutes
5. self.seconds = seconds
6.
7. def display(self):
8. print("The time is {}:{}:{}".format(self.hours,
9. self.minutes,
10. self.seconds))
165
I’m going to leave it at that for this chapter. There is a lot to take on here and
its really important stuff! The rest of the book will be focused solely on object-
oriented programming. So, make sure you understand the material in this
book and complete the following exercises.
17.4 - Exercises
Question 1
Write a class that models a Lamp. The lamp class will have two methods. The
first will initialise the lamp and is called plug_in(). This will set
the is_on attribute which is a boolean to False. The second method is
called toggle(). If the lamp is off, it will turn it on (set is_on to True) and if the
lamp is on, it will turn it off (set is_on to True).
False
Question 2
Write a class that models a Cirlce. The circle class will have four methods. The
first will initialise the circle. The initialise() method should take three
parameters, x_coord, y_coord, and radius. This will set the circles x and y
coordinates and it radius. The next method is called calc_area(). It should
calculate the area of the circle (Look up the formula). The third method is
called calc_perimeter() and should calculate the circumference of the circle
(Look up the formula). The fourth method is overlaps(). It should take as an
argument, another circle and print out whether or not the two circles overlap.
166
Your class may be imported and used as follows:
>>> from my_circle import Circle
>>> c1 = Cirlce()
>>> c1.initialise(2, 2, 4)
>>> c2 = Circle()
>>> c2.initialise(3, 2, 2)
>>> c1.calc_perimeter()
25.13
>>> c2.calc_perimeter()
12.57
>>> c1.calc_area()
50.27
>>> c1.overlaps(c2)
True
Question 3
Write a class that models a students grades for particular modules. The class
will have four methods. The first should initialise the student. The student
should have a name and age. The grades should be modelled as a dictionary
in which the key is the name of the module and the value is the grade. This
should initially be empty. The second method is called add_module() and
should add a module to the student object. The third is
called update_module_grade() and should update the grade for a particular
module. The final method is called show_grades() and should print out the
students modules and the grade associated with each module.
167
>>> s1.add_module("Networks")
>>> s1.update_module_grade("Networks", 77)
>>> s1.show_grades()
Noahs grades are:
Python: 87
Networks: 77
168
Chapter 18 - OOP: Special Methods and
Operator Overloading
18.1 - The __init__() method
In the previous chapter we looked at how to define new types by creating
classes. We also had to initialise our instance objects using methods
like initialise(). We don’t have to do this with Python’s built-in types such as
strings.
For example, to initialise a list we could just type my_list = [1, 2, 3].
As you can see, our __init__() method was called automatically when we
created an instance of the point class.
If the __init__() method takes two arguments (excluding self), then we must
pass two arguments, otherwise we’ll get an error.
169
TypeError: __init__() missing 2 required positional arguments: 'x' and 'y'
We can however, set default values for these arguments. This will allow us to
initialise an instance of Point without passing the arguments:
1. class Point:
2. def __init__(self, x=0, y=0):
3. self.x = x
4. self.y = y
5.
6. def display(self):
7. print("X coordinate: {}\nY coordinate: {}".format(self.x, self.y))
What happens when we define this special method and create an instance is:
Under the hood, the list built in type implements this __str__() method and
hence, allows us to print our lists in human readable format.
We can override this method to allow us to format and print our objects as we
see fit. Therefore, when we call print() on an object of a class we’ve defined,
we’ll no longer get something like <time.Time object at
0x000001A7C6902F60> printed to the terminal.
170
We do this by returning a formatted string. Let’s look at how we might do this
for the Pointclass. We could now replace our display() function and replace it
with __str__(), that way we can just print our point instance.
1. class Point:
2. def __init__(self, x=0, y=0):
3. self.x = x
4. self.y = y
5.
6. def __str__(self):
7. return "X coordinate: {}\nY coordinate: {}".format(self.x, self.y)
We can now call the len() function on instances of our Point class.
171
It is unfortunate that we can’t return a floating-point number, therefore if we
want real accuracy we’ll have to implement a different method that we’ll have
to call on the instance.
In this case, we overload the __eq__() method by comparing two tuples. If the
two tuples, each containing the x and y coordinates of points, are equal, then
our two points are equal. We can use this as follows:
We could also add two points together to get a new point. In this case we
would overload the __add()__ method. The __add__() method implements the
functionality of the + operator, not the += operator. Therefore, we must return
a new object as the + operator is not for in-place addition.
172
Let’s look at that:
1. class Point:
2. def __init__(self, x=0, y=0):
3. self.x = x
4. self.y = y
5.
6. def __str__(self):
7. return "X coordinate: {}\nY coordinate: {}".format(self.x, self.y)
8.
9. def __len__(self):
10. distance_from_center = ((self.x)**2 + (self.y)**2)**0.5
11. return int(distance_from_center)
12.
13. def __eq__(self, other):
14. return ((self.x, self.y) == (other.x, other.y))
15.
16. def __add__(self, other):
17. new_x = self.x + other.x
18. new_y = self.y + other.y
19. return Point(new_x, new_y)
We can overload the += operator, which is for in-place addition. This way we
don’t need to return a reference to a new point object and assign that to a
new variable p3, instead we update the object’s (p1 in this case) data attributes.
173
This is done by overloading the __iadd()__ method.
1. class Point:
2. def __init__(self, x=0, y=0):
3. self.x = x
4. self.y = y
5.
6. def __str__(self):
7. return "X coordinate: {}\nY coordinate: {}".format(self.x, self.y)
8.
9. def __len__(self):
10. distance_from_center = ((self.x)**2 + (self.y)**2)**0.5
11. return int(distance_from_center)
12.
13. def __eq__(self, other):
14. return ((self.x, self.y) == (other.x, other.y))
15.
16. def __add__(self, other):
17. new_x = self.x + other.x
18. new_y = self.y + other.y
19. return Point(new_x, new_y)
20.
21. def __iadd__(self, other):
22. temp_point = self + other
23. self.x, self.y = temp_point.x, temp_point.y
24. return self
Anyway, after we update the attributes for self, we must return self.
18.5 - Exercises
Question 1
Create a BankAccount class. The bank account class should have the following
attributes: balance and account_owner. The bank account class should also
have deposit and withdraw methods. You should also be able to print the bank
account, to display the account owners name and balance. When
implemented correctly, you should be able to use the Bank account class as
follows:
175
Balance: $150
Question 2
Create a Line class. The line class should have four attributes: an x1 and
a y1 coordinate for one point on the line and an x2 and a y2 coordinate for the
second point on the line. You should be able to do a few things with an
instance of a line. You should be able to call the len method on a line to show
its length, add two lines together, subtract two lines, multiply a line by an
integer, print the line and compare two lines to see if they are equal. In this
case equality means both lines are the same length. When implemented
correctly, you should be able to use the Line class as follows:
176
Chapter 19 - OOP: Method Types &
Access Modifiers
19.1 - What are instance methods?
By the end of this chapter you should be ready for your first project!
We have already met instance methods. They are the type of methods we have
been working with in regard to object-oriented programming until this point.
Instance methods are the most common type of methods in classes. They are
called instance methods as they act on specific instances of a class. They can
access the data attributes that are unique to that class.
For example, if we have a Person class, then each instance of a person may
have a unique name, age etc. Instance methods have self as the first
parameter, and this allows us to go through self to access data attributes
unique to that instance and also other methods that may reside within our
class.
I'm not going to provide an example for instance methods as we've seen them
time and time again!
Class methods have limited access to methods within the class and can modify
class specific details such as class variables which we will see in a moment. This
allows us to modify the class's state which will be applied across all instances
of the class. They are one of the more confusing method types. Class methods
are also permanently bound to its class.
1. class Time:
2. def __init__(self, hours=0, minutes=0, seconds=0):
3. self.hours = hours
4. self.minutes = minutes
5. self.seconds = seconds
6.
7. def __str__(self):
8. return "The time is {:02}:{:02}:{:02}".format(self.hours,
9. self.minutes,
10. self.seconds)
11.
12. # OTHER METHODS CAN GO HERE...
13.
14. @classmethod
15. def seconds_to_time(cls, s):
16. minutes, seconds = divmod(s, 60)
17. hours, minutes = divmod(minutes, 60)
18. extra, hours = divmod(hours, 24)
19. return cls(hours, minutes, seconds)
For this example, I've cut out other instance methods that we don't care about
for now. The seconds to time class method takes the class as the first
parameter and some number of seconds, s, as its second parameter.
A side note on the divmod() function: The divmod() function takes two
arguments, the first is the number of seconds we want to convert to 24 hour
time, the second will be divided into it (60 in this case). This returns a tuple in
which the first element is the number of times the second parameter divided
into the first, and the second element of the tuple is the remainder after that
division. For example, if we called divmod(180, 60), we'd get back a tuple of (3,
0) in which we use multiple assignment to assign the 3 to the minutes and 0 to
the seconds which is what we expect as 180 seconds is 3 minutes.
Back to our class method. When we have calculated our hours, minutes and
seconds, we return cls(hours, minutes, seconds). cls will be replaced
with Time in this case. So really, we're returning a new Time object.
178
>>> t = Time.seconds_to_time(11982)
>>> print(t)
I also mentioned class variables earlier. Much like before, class variables are
bound to a class rather than a specific instance. We can use a class method to
check the number of times an instance of the time class has been created for
example.
With convention, class variables are usually all uppercase for the variable
name. Let's modify our time class above to add a class method
called COUNT which will count the number of times an instance of the class has
been created.
1. class Time:
2.
3. COUNT = 0
4.
5. def __init__(self, hours=0, minutes=0, seconds=0):
6. self.hours = hours
7. self.minutes = minutes
8. self.seconds = seconds
9. Time.COUNT += 1
10.
11. def __str__(self):
12. return "The time is {:02}:{:02}:{:02}".format(self.hours,
13. self.minutes,
14. self.seconds)
15.
16. # OTHER METHODS CAN GO HERE...
17.
18. @classmethod
19. def seconds_to_time(cls, s):
20. minutes, seconds = divmod(s, 60)
21. hours, minutes = divmod(minutes, 60)
22. extra, hours = divmod(hours, 24)
23. return cls(hours, minutes, seconds)
Note how we have a new COUNT variable just before our constructor and inside
our constructor, after initializing all the attributes, we increase the COUNT class
variable by 1. Now we can see how many times instance of our class has been
created.
>>> from my_time import Time
>>> t1 = Time(23, 11, 37)
>>> t2 = Time()
>>> Time.COUNT
2
179
>>> t3 = Time(12, 34, 54)
>>> Time.COUNT
It may be tricky trying to spot when to use these. As a general rule of thumb, if
a method can be invoked (and makes sense to) in the absence of an instance,
then it seems that, that method is a good candidate for a class method.
Static methods are methods that are related to the class in some way, but
don't need to access any class specific data and don't need an instance to be
invoked. You can simply call them as pleased.
In general, static methods don't know anything about class state. They are
methods that act more as utilities.
1. class Time:
2.
3. COUNT = 0
4.
5. def __init__(self, hours=0, minutes=0, seconds=0):
6. self.hours = hours
7. self.minutes = minutes
8. self.seconds = seconds
9. Time.COUNT += 1
10.
11. def __str__(self):
12. return "The time is {:02}:{:02}:{:02}".format(self.hours,
13. self.minutes,
14. self.seconds)
15.
16. # OTHER METHODS CAN GO HERE...
17.
18. @classmethod
19. def seconds_to_time(cls, s):
20. minutes, seconds = divmod(s, 60)
21. hours, minutes = divmod(minutes, 60)
180
22. extra, hours = divmod(hours, 24)
23. return cls(hours, minutes, seconds)
24.
25.
26. @staticmethod
27. def validate(hours, minutes, seconds):
28. return 0 <= hours <= 23 and 0 <= minutes <= 59 and 0<= seconds <= 59
This static method is simply a utility that allows us to verify that times are
correct. We may use this to stop a user from creating a time such
as 35:88:14 as that wouldn't make any logical sense.
>>> from my_time import Time
>>> Time.validate(5, 23, 44)
True
>>> Time.validate(25, 34, 63)
False
>>> t = Time(22, 45, 23)
>>> t.validate(t.hours, t.minutes, t.seconds)
True
It may also be tricky to spot when to use static methods, but again, if you find
that you have a function that may be useful but it won't be applicable to any
other class yet at the same time it doesn't need to be bound to a instance and
it doesn't need to access or modify any class data then it is a good candidate
for a static method.
Access modifiers are used to make code more robust and secure by limiting
access to variables that don't need to be accessed by every (or maybe they do
in which case they're public).
181
Until now we have been using public attributes and methods. Classes can call
the methods and access the variables in other classes.
In Python, there is no strict checking for access modifiers, in fact they don't
exist, but Python coders have adopted a convention to overcome this.
The third access modifier I'll talk about is protected. This is the same as private
except all subclasses can also access the methods and member variables. I'll
talk about subclasses in the next chapter so don't worry about that now, but
by convention, we use a single underscore before the variable name.
19.5 - Exercises
Question 1
Write a Python program that contains a class called Person. An instance of
a Person should have name and age attributes. You should be able to print an
instance of the class which, for example, should output John is 28 years old.
182
Your class should have a method called fromBirthYear() which as parameters
should take an age and a birth year. For example, fromBirthYear('John',
1990) should return an instance of Person whose age is 29 (at the time of
writing this book, 2019).
Question 2
Write a Student class that has initializes a student with 3
attributes: name, grades and student_number. Grades should be a dictionary of
modules to grades. The student number of the first student instance should be
19343553. The student number of the second student instance should be
19343554 and so on. You should have three methods, add_module() which
takes one parameter, a module name, and initialises the grade to 0. The
second method should be, update_module() which takes two parameters, the
module name and the grade for the module. The final method should allow
the user to print a student.
Think carefully about how you will implement the increasing student number.
183
Grades:
Python 88
>>> adam = Student("Adam")
>>> adam.add_module("Java")
>>> adam.update_module("Java", 60)
>>> print(adam)
Name: Adam
Student Number: 19343554
Grades:
Java: 60
Question 3
Modify the Student class to include a method that will validate grades. A grade
should be between 0 and 100.
Python 88
Project
You are tasked with creating a network simulation application in which two
parties exist. We'll call these parties the Sender and Receiver. There should also
exist a process that continuously checks if the Sender has any packets of data
184
to send, if it does, then the process should deliver them to the Receiver.
The Sender should periodically create data packets to be sent (These can be
randomly generated strings). The Receiver should periodically check if it has
received any packets of data, if it has, then it should print them, then delete
them.
Hint: You should think carefully about what classes are needed and what
methods they should contain. Something called a 'buffer' may be helpful here
for your process that moves packets between Sender and Receiver. Read up
about buffers and what they. Think carefully about what Python built-in types
might be able to implement a buffer.
185
Chapter 20 - OOP Inheritance
20.1 - What is inheritance?
In programming we often come across objects that are quite similar or we may
see that one class is a type of another class. For example, a car is a type of
vehicle. Similarly, a motorbike is a type of vehicle. We can see a relationship
emerging here. The type of relationship is an “is a” relationship.
Let’s say we have Rectangle and Square classes. Many of the methods and
attributes between a Rectangle and Square will be similar. For example both a
rectangle and a square have an area() and a perimeter(). Up until now we
may have implemented these classes as follows:
186
1. class Rectangle:
2. def __init__(self, length, width):
3. self.length = length
4. self.width = width
5.
6. def area(self):
7. return self.length * self.width
8.
9. def perimeter(self):
10. return 2 * self.length + 2 * self.width
11.
12. class Square:
13. def __init__(self, length):
14. self.length = length
15.
16. def area(self):
17. return self.length * self.length
18.
19. def perimeter(self):
20. return 4 * self.length
This seems a little bit redundant however as both a square and rectangle have
4 sides each and they both have an area and a perimeter. If we look more
closely at this situation we notice that a square is a special case of a rectangle
in which all sides are the same length.
We can use inheritance to reflect this relationship and reduce the amount of
code we have to write. In this case a Square is a type of Rectangle so it makes
sense for a square to inherit from the rectangle class.
Let’s look at this code again but this time I have made changes to the square
class to reflect this relationship. There are a few new things going on here but
I’ll explain after.
1. class Rectangle:
2. def __init__(self, length, width):
3. self.length = length
4. self.width = width
5.
6. def area(self):
7. return self.length * self.width
8.
9. def perimeter(self):
187
10. return 2 * self.length + 2 * self.width
11.
12. class Square(Rectangle):
13. def __init__(self, length):
14. super().__init__(length, length)
The first change here is when we define the name of the Square class. I
have class Square(Rectangle). This means that the Square inherits from
the Rectangle class.
The next change is to the square’s __init__() method. We can see we are
calling a new super() method. The super() method in Python returns a proxy
object that delegates method calls to a parent or sibling of type. That
definition may seem a little confusing but what that means in this case is that
we have used super() to call the __init__() method of the Rectangle class,
which allows us to use it without having to re-implement it in the Square class.
Now, a squares length is length and its width is also length as we passed
length in twice to the call to the super()’s __init__() method.
Since we’ve inherited Rectangle in the Square class, we also have access to
the area() and perimeter methods without having to redefine them. We can
now use these classes as follows:
>>> rect = Rectangle(2, 3)
>>> square = Square(2)
>>> print(rect.area())
6
>>> print(square.area())
4
>>> print(rect.perimeter())
10
>>> print(square.perimeter())
8
As you can see, the super() method allows you to call methods of the
superclass in your subclass. The main use case of this is to extend functionality
of the inherited method.
With inheritance we can also override methods in our child classes. Let’s add a
new class called a Cube. The cube will inherit from Square (which inherits from
Rectangle). The Cube class will extend the functionality of the area method to
188
calculate the surface area of a cube. It will also have a new method
called volume to calculate the cubes volume. We can make good use
of super() here to help us reduce the amount of code we’ll need.
1. class Square(Rectangle):
2. def __init__(self, length):
3. super().__init__(length, length)
4.
5. class Cube(Square):
6. def area(self):
7. side_area = super().area()
8. return 6 * side_area
9.
10. def volume(self):
11. side_area = super().area()
12. return side_area * self.length
As you can see, we haven’t had to override the __init__() method as it’s
inherited from Square. We have however overrided the area() method from
the parent class (Square) on the Cube to reflect how the area of a cube is
calculated. We have also used the super() method to help us with this.
As you can see, by using inheritance we have greatly reduced the amount of
code compared to how we would have implemented these three classes
previously.
189
I just want to go back to overriding the __init__() method for a second. Let’s
assume a Person class and an Employee class that is derived from Person.
Everything about an employee is the same as a Person but the employee will
also have an employee ID.
1. class Person:
2. def __init__(self, name, age):
3. self.name = name
4. self.age = age
5.
6. def show_name(self):
7. print("Name: " + self.name)
8.
9. def show_age(self):
10. print("Age: " + str(self.age))
11.
12. class Employee(Person):
13. def __init__(self, name, age, employeeId):
14. super().__init__(name, age)
15. self.employeeId = employeeId
16.
17. def show_id(self):
18. print("Employee ID: " + str(self.employeeId))
The reason I’m showing you this is because, although I’ve already shown how
to override methods, people get confused when overriding
the __init__() method to also include new attributes on the derived class.
190
20.3 - Multiple inheritance
So far we’ve looked a single inheritance. That is, child classes that inherit from
a single parent class. Multiple inheritance is when a class can inherit from
more than one parent class.
Th;is allows programs to reduce redundancy, but it can also introduce a certain
amount of complexity as well as ambiguity. If you plan on doing multiple
inheritance in your programs, it should be done with care. Give thought to
your overall program and how this might affect it. It may also affect the
readability of your programs.
For example:
1. class A:
2. def method():
3. # your code here
4.
5. class B:
6. def method():
7. # your code here
8.
9. class C(A, B):
10. pass
11.
12. >>> C.method()
Which method is the C class referring to? The method from A or the method
from B? This may make your code harder to read. In Python, when
calling method() on C, Python first checks if C has an implementation
of method then it checks if A has implemented method. It has in this case so use
that. If it hadn’t then Python would check each parent class in turn (A, B, ……)
and so on until it finds the first parent class to have implemented method.
I’m not going to go too deep into multiple inheritance as it’s quite self-
explanatory on how to use it, so I’m only going to give one example of how it
can be used:
1. class CPU:
2. def __init__(self, num_registers):
3. self.num_registers = registers
4. # Other methods
5.
6. class RAM:
7. def __init__(self, amount):
8. self.amount = amount
9. # Other methods
191
10.
11. class Computer:
12. def __init__(self, num_registers, amount):
13. CPU.__init__(num_registers)
14. RAM.__init__(amount)
15.
16. # Other methods
20.4 - Exercises
Question 1
Create two classes: Person and Employee. The person class should
have name and age attributes. The employee class should have one
additional employeeId attribute. The employ should also
have clock_in and clock_out methods. When implemented correctly, you
should be able to use the two classes as follows:
>>> person = Person("John", 33)
>>> employee = Employee("Tom", 42)
>>> employee.clock_in()
Tom has clocked in.
>>> employee.clock_out()
Tom has clocked out.
Question 2
Create a Shape class. The shape class should have two
attributes: num_sides and side_length. It should also have one method
called show_type()
Create two other classes: Triangle and Pentagon. Each of these classes should
inherit from the Shape class.
192
Both the Triangle and Pentagon classes should have area() methods.
Question 3
Create a Clock class. It should have three attributes: hour, minute, and second. It
should also have a method called show_time() that displays the time in 24 hour
time. It should also have a method called tick() that advances the time by 1
second.
Create a Calendar class. It should have three attributes: day, month, and year. It
should have two methods, one called show_date() and the other
called next() that advances the date to the next day.
Note: There are a few ways of completing this question, however I want you to
use multiple inheritance. If you do, then the code for CalendarClock should be
quite short.
194
Chapter 21 - Basic Data Structures
The backbone of every piece of software you will write will consist of two main
things: data and algorithms. We know that algorithms are used to manipulate
the data within our software (and that manipulation should be done as
efficiently as possible). For this reason, it is important to structure our data so
that our algorithms can manipulate the data efficiently.
We have met various data structures already, lists and dictionaries to name a
few. You also know that using lists for example, makes it very easy for us to
store and manipulate sequences of data (Trust me, without this abstraction it
becomes a bit of a pain).
We have also seen that data structures allow us to model systems that we see
in the real world. For example, we could design and build a class that allows us
to model a library. We could easily use this to build a library management
system.
In this chapter we're going to look at two fundamental data structures that are
used everywhere in computing. We're also going to look at some examples of
how to take advantage of them.
A stack is a linear data structure that follows a particular order in which the
operations are performed. A stack is a LIFO (Last in First Out) data structure.
That is, the last element entered into the stack is the first element that will
leave the stack. You can think of a stack as a stack of dinner plates. Plates are
stacked, one on top of the other. The last plate to be put on top of the stack
will be the first plate to be removed as we cant remove the plate at the
bottom (or the stack of plates would fall over and they'd all smash).
A stack (usually) has 4 main operations: push (add an element to the top),
pop(remove an element from the top), top (show the element at the top) and
195
isEmpty (is the stack empty). We can also add a length method and it is usually
useful!
1. #######
2. # Stack
3. #######
4.
5. |------| <-- Top
6. |------|
7. |------|
8. |------|
9. |------|
10. |------|
11. |------|
12.
13.
14. ###########
15. # If we pop:
16. ###########
17.
18. |------| <-- Pop
19.
20. |------| <-- New top
21. |------|
22. |------|
23. |------|
24. |------|
25. |------|
26.
27.
28. ###############
29. # If we push
30. # a new element:
31. ###############
32.
33. |------| <-- Push new
34. |------| element
35. |------| (new top)
36. |------|
37. |------|
38. |------|
39. |------|
40. |------|
The stack is such a fundamental data structure that many computer instruction
sets provide special instructions for manipulating stacks. For example, the Intel
Pentium processor implements the x86 instruction set. This allows for machine
language programmers to program computers at a very low level (Assembly
language for example).
A very special type of stack called the "Call stack" or "Program stack", usually
shortened to "The stack" is what some of the instructions in the x86 instruction
set manipulate.
196
Let's consider what happens with the following program at a low level (For
anyone with a solid understanding of computers this is a little high level but a
good example).
.
.
.
print(4)
.
.
.
(I'M AWARE THIS ISN'T WHAT ACTUALLY HAPPENS BUT LET'S GO WITH THIS
FOR NOW).
197
When we run a program, it is loaded into memory as machine code (Code the
computer knows how to execute). Instructions are stored sequentially. In this
view, we can look at functions as "mini programs" inside our main program.
That is, the code for them is stored elsewhere in memory. When we
call print(4) we are basically saying, jump to where the code for print is and
start sequentially executing instructions until the function has completed, then
return to the original location and continue on executing where you left off.
Before I explain this, the CPU has many special pieces of memory within it,
called registers (These are like tiny pieces of RAM, usually 32-bits or 64-bits in
size). One of these registers is called the instruction pointer register often
abbreviated to IP. It contains the memory address of the next piece of code to
execute.
In this case we execute the instruction at 0x0000004 ( then increase the IP), in
the next instruction we call print(4) which is at location 0x0000005 yet the code
for the print function is stored at 0x0000001 to 0x0000003. This is where the
stack comes into play. We know that when we print the number we want to
continue where we left off and execute the instruction at 0x0000006. In this case
we push the return address (0x0000006) onto the stack, set the IP
to 0x0000001 and start executing the code for the print function. When we're
finished, we pop the return address off the stack (0x0000006) and place this in
the IP. Now our program continues to execute where we left off.
I'm sure you can imagine how recursion works now at this level (essentially a
series of push, push, push ...... pop, pop, pop).
Things are a lot more complex than this, but it isn't necessary for the
explanation of how stacks are used at the most fundamental levels. Don't
worry too much if you didn't understand all of that. It isn't necessary for this
book, but it shows just how important the stack is.
Stacks have many high levels uses too, for example, undo mechanisms in text
editors in which we keep track of all text changes in a stack. They may be used
in browsers to implement a back/forward mechanism to go back and forward
between web pages.
Let's look at the code for a stack in Python. We will implement a stack using a
list
1. class Stack:
198
2. def __init__(self):
3. self.stack = []
4.
5. def push(self, elem):
6. self.stack.append(elem)
7.
8. def pop(self):
9. if len(self.stack) == 0:
10. return None
11. else:
12. return self.stack.pop()
13.
14. def isEmpty(self):
15. if len(self.stack) == 0:
16. return True
17. return False
18.
19. def top(self):
20. return self.stack[-1]
This is a very simple version of a stack, but we can now use our stack as
follows:
1. stack = Stack()
2.
3. stack.push(1)
4. stack.push(4)
5. stack.push(2)
6.
7. print(stack.top())
8.
9. print(stack.pop())
10. print(stack.pop())
11. print(stack.pop())
When we call the top() method, we don't actually remove the top element, we
are just told what it is. When we use pop() method, we do remove the top
element.
They have similar methods for adding and removing except they
are enqueue() and dequeue() respectively.
I think you get the gist of how you may visualize a queue so I'm just going to
jump to the code for implementing a queue. Again, we will implement a queue
using a list.
1. class Queue:
2. def __init__(self):
3. self.q = []
4.
5. def enqueue(self, elem):
6. self.q.append(elem)
7.
8. def dequeue(self):
9. if len(self.q) != 0:
10. return self.q.pop(0)
11.
12. def isEmpty(self):
13. if len(self.q) > 0:
14. return False
15. return True
And there you go. A very simple Queue class. Notice how we pop from index
0. Recall that the pop() method for lists takes an optional argument which is
the index of the list you want to remove an element from. If we don't supply
200
this optional argument it defaults to -1 (the end of the list, which is what
stacks do). Here we remove the element at the start of the list but add
elements to the end.
1. queue = Queue()
2.
3. queue.enqueue(1)
4. queue.enqueue(4)
5. queue.enqueue(2)
6.
7. print(queue.isEmpty())
8.
9. print(queue.dequeue())
10. print(queue.dequeue())
11. print(queue.dequeue())
12.
13. print(queue.isEmpty())
False
1
4
2
True
21.3 - Examples
The first example I'm going to look at is for stacks. This question is known to
have been asked during past Google coding interviews!
The problem is: Given an input string consisting of opening and closing
parentheses, write a python program that outputs whether or not the string is
balanced.
For example:
Input: ([{}])
Output: Balanced
Input: (([])
Output: Unbalanced
Input: {(){[]}}
201
Output: Balanced
Input: [()(){]
Output: Unbalanced
One approach we can take with this problem is to use a stack. Each time we
encounter an opening bracket we append that to the stack. When we
encounter a closing bracket we pop from the stack and check that the element
we popped is the opening bracket for the closing bracket. For example, [ is
the opening bracket for ], similarly, { is the opening bracket for } and so on.
Here is a possible solution. Assume our stack class from earlier exists:
We can make better use of the pairs map and make our code a bit neater but
it won't be as readable so I've taken a slightly longer approach.
Now we can run our code as follows to get the desired output from the
example input and output above:
202
21.4 - Exercises
Question 1
You are given two strings S and T. Both strings either contain alphanumeric
characters or a '#' character. A '#' means backspace. Your goal is to use your
knowledge of stacks to design a function that will decide whether both strings
are equal after the back space operation has been applied. For example:
Input: S = "xy#z", T = "xw#z"
Output: True
Explanation: When the backspace operation is applied, both strings become
"xz"
Question 2
It is possible to create a queue through the use of two stacks. Implement a
queue class like above, except rather than using a list to implement the queue
you use two stacks.
Question 3
It is also possible to create a stack through the use of two queues. Implement
a stack class like the one we have seen earlier, except rather than using a list to
implement the stack, use two queues.
203
This might be a little tricky, you'll have to think about this.
Mini Project 1
You are to design an RPN calculator. RPN stands for Reverse Polish Notation.
An RPN calculator is also known as a postfix calculator. We as humans do
math as follows, for example, 4 + 7 or we must use BEMDAS rules: 8 + ((10 *
3) / 2). In RPN, we would represent these as: 4 7 + and the last expression we
represent as 8 10 3 2 / * +. Whats happening here is the operator (+ is in the
prefix position rather than infix position as we are used to). This might look a
little confusing but its actually quite simple and we can use a stack to solve
this problem.
If we had a unary operator such as negation then we would pop one element
from the stack, calculate the result and push it back.
• Addition: (+)
• Subtraction: (-)
• Multiplication (*)
• Floor division (//)
• Negation (n)
• Power (e)
204
Your calculator should read in RPN expressions, one line at a time and
evaluate them. There is one additional operator to implement. This is p which
stands for print. If you encounter p then print the result. You can assume p will
always occur at the end of the input.
Example:
Input: 4 7 3 + + p
Output: 14
Infix: 4 + 7 + 3
Input: 5 8 * n p
Output: -40
Infix: -(5 * 8)
Input: 3 6 1 + + 2 e p
Output: 100
Infix: (3 + 6 + 1)^2
Think carefully about this problem. It's even a great beginner project to host
on your Github account to showcase your work to the world!
205
Chapter 22 - More Advanced Data
Structures
In the previous chapter, we learned about some simple yet fundamental data
structures with rather simple implementations. In this chapter we're going to
take a look at a couple of new data structures. They'll also be a little bit more
advanced in their implementation.
Throughout this book I may have referred to Python lists as arrays and visa-
versa. Python lists have a complicated implementation under the hood. This
allows for lists to be of arbitrary size, grow and shrink and store many items of
various types.
In many other programming languages, an array is much like a fixed size list
that stores elements of a specific type. They store elements contiguously in
memory. That isn't necessarily the case in Python.
There are many variations of linked lists but in this chapter, we're going to
look at one variation called a Singly Linked List.
206
There are various elements to this so let me explain:
Linked Lists have various advantages over arrays (not Python lists). These
include dynamic size (they can grow and shrink as needed much like python
lists, whereas arrays can't do this) and ease of insertion and deletion.
When we delete something from a Python list, the list must be resized and
various other things must be done (this is done under the hood so we don't
worry about this).
From the above diagram we can break the problem of implementing a linked
list down into two things:
1. A node class: This will contain the data the node will store and a pointer
to the next node.
2. Linked List class: This will contain the head node and methods we can
perform on the linked list.
This is our linked list node class. It's very simple and only contains two
attributes. The first is the data the node will store and a next attribute. The
next attribute is set to None by default as it won't point to anything initially.
207
This concept of a node is very powerful when it comes to data structures. It
gives us a lot of flexibility when developing other data structures as we'll see
later on.
When we initialize our LinkedList it will be empty, that is, the head will point
to nothing (None). That's all we need to start implementing methods.
To get started, we'll take a look at the add() method. Firstly, we need to think
about how we're going to go about doing this. In our linked list, we'll be
adding new elements to the end of the list. Firstly, we need to check if the
head is empty. If it is, then we just point the LinkedList head to a new node.
Otherwise, we need to "follow" this trail of pointers to nodes until one of the
node's next attribute points to None. When we find this node then we update
its next attribute to point to the element we're trying to add.
Here we check that the head of the linked list isn't empty. If it has then we
assign a new Node to the linked list head. Otherwise we create a new variable
called curr. This keeps track of what node we're on. If we didn't do this we
would end up updating the linked list's head node by mistake. After we create
this new variable, we loop through the elements of the list, going from node
to node through the next pointer until we find a node whose next pointer is
empty. When we find that node, we update its next pointer to point to a
new Node.
208
This may take a little time to wrap your head around or understand but the
more you work with data structures like this or ones similar, the more sense it
will make.
Let's look at our deletion method. We have many options when deleting
elements from a linked list just like we do with adding them. We could remove
the first element, the last element or the first occurrence of an element. Since
our linked list can hold many occurrences of the same data, we will remove the
first occurrence of an element.
Let's look at how that may be done with the help of a diagram:
As you can see, from this diagram we want to delete the node that contains 2.
To do this we find the first node who's next pointer points to a Node that
contains the value 2. When we find this Node we just update it's next pointer to
point to the same node the Node we want to delete points to. Essentially, we
re-route the next pointer to bypass the element we want to delete. This is
quite easy to do.
We have to cover a couple of cases here. The first is that the list is empty in
which case return nothing. The second is that the head of the linked list is the
209
element we want to delete. If it is then we make the head of the list the
second Node in the list (Which may be None but thats ok, it just means the list
only had one item). Finally, if it is neither of those cases we traverse the list
using linear search until we find the first occurrence of the element then we
're-route' the next pointer of the Node that points to the Node we want to
delete, to the Node the next pointer of the Node we want to delete points to.
That last part might seem confusing but we're doing what you see in the
diagram. It's probably best at this point if you're confused by that to draw this
scenario out on paper to help clarify things.
There are a couple of other useful methods we can add here but we'll leave
them for later.
Just before I finish up on linked lists, I want to talk a little bit more about the
complexities of their methods and why you might use them (or not).
The run time complexity of the add method is O(n) in this case as we must
iterate over each entry in the list to find the last element. The runtime
complexity of the deletion method is also O(n). Although we don't always
iterate over each element in the list, Big-O deals with the worst case, which is
going through the entire list. There are optimizations we could make though
and we'll get to them later on too.
Linked Lists in Python usually don't have a use case. This is because Python
lists are already very well optimized and dynamic. However, in languages such
as C or C++, Linked Lists are essential (Where dynamic arrays, like python lists,
may not exist). They also pop up in technical interviews so you're best to know
them. In fact, if I was hiring an engineer, I'd be very skeptical about hiring one
who didn't know how to code a linked list.
210
A binary search tree (BST), is somewhat different to the data structures we
have met so far. It is not a linear data structure. It is a type of ordered data
structure that stores items. BSTs allow for fast searches, addition and removal
of items. BSTs keep their items in sorted order so that the operations that can
be performed on them are fast, unlike a linked list who's add and remove
operations are O(n). The add and remove operations on a BST are O(log n).
Searching a BST is also O(log n). This is the average case for those three
operations.
This is a similar situation to linear search vs. binary search which we looked at
earlier.
In computer science, a 'tree' is a widely used abstract data type (a data type
defined by its behaviour not implementation) that simulates a hierarchical tree
structure (much like a family tree), with a root node and subtrees of children
with a parent node represented as a set of linked nodes.
A binary tree is a type of tree in which each Node in the tree has at most, two
child nodes. A binary search tree is a special type of binary tree in which the
elements are ordered. A diagram might help you visualize this, so here's one:
There is a lot to take in here so let me explain. The root of this tree is the node
that contains the element 10. The big thing labeled Sub Tree is the right sub
tree of the root node.
211
As you can see from this diagram, it is a binary tree as each node has at most
two child nodes. It also satisfies an important property that makes it a binary
search tree. That is, that if we take any node, 10 for example, everything to the
right is greater than it and everything to the left is less than it. Similarly, if we
look at the node 13, everything to the right is greater than 13 and everything
to the left is less than 13.
What makes searching fast is this, if we wanted to check if 2 was in the tree, we
start at the root node and check if 2 is less than the value at the root. In this
case 2 is less than 10 so we know we don't need to bother searching the root
node's right sub tree. Essentially what we've done here is cut out everything in
that blue circle as the element 2 will not be in there.
Let's look at implementing a binary search tree. Again, the concept of a Node is
important here. Our Node class for a binary search tree will be slightly different
than what it was for a linked list. Let's take a look:
1. class Node:
2. def __init__(self, elem):
3. self.elem = elem
4. self.left = None
5. self.right = None
As you can see here, each node will have an element, a left child and a right
child (either or both of which may be None).
This time we can look at the search operation first. It will be a good look at
how to use recursion in a practical situation. We either want to return None if
the element is not found or return the Node that contains what we're looking
for if the element is found.
212
1. class BinarySearchTree:
2. def __init__(self):
3. self.root = None
4.
5. def search(self, value):
6. return self.recursive_search(self.root, value)
7.
8. def recursive_search(self, node, value):
9. if node is None or node.elem == value:
10. return node
11. if value < node.elem:
12. return self.recursive_search(node.left, value)
13. else:
14. return self.recursive_search(node.right, value)
This may seem a little odd as we have two search functions but there is good
reason for it. The search method allows us to call the recursive_search method
and pass in the root node, that way, we now have a 'copy' of the root node
and not the root node itself so we don't end up in an infinite loop when
recursively searching. Again, the recursive nature of this is very difficult to
explain so I'd try working through an example on paper to help clarify after
the following explanation.
Inside the recursive_search method we have a base case which checks if we've
reached None (we didn't find what we're looking for) or we've reached
the Node that contains what we're searching for.
If we don't pass our base case then we check if the value we're searching for is
less than the value at the node we're currently at; if it is then we search the left
subtree of that node by calling the recursive_search method and passing the
current nodes left tree. Otherwise, the value we're searching for is greater
than the value at the current node, in which case we call recursive_search and
pass in the current nodes right tree.
Remember a node's left or right tree is simply a pointer to a node (we can look
at this node as the root node of a sub tree).
For example, if we're searching for the value 15 in the tree from the diagram
above, then the path we take looks like this:
213
Just to clarify, a leaf nodes’ left and right 'trees' are None.
Next, we will look at inserting an element into the tree. What we're doing here
is following pretty much the same thing we did with search except when we
reach a leaf node on our path we will set it's left or right pointer to point to a
new node (depending on whether or not the value is less than or greater than
the value at the child node).
• Base case: If the node we are at is None then insert the new node here.
• Recursive case:
• If the value we're inserting is less than the value of the current node
then we check if left tree of the current node is None; If it is, then insert
there, otherwise, call insert again and pass the left subtree.
• If the value we're inserting is greater than the value of the current node
then we check if the right tree of the current node is None; If it is, then
insert there, otherwise, call insert again and pass the right subtree.
214
Here's how that's done:
1. class BinarySearchTree:
2. def __init__(self):
3. self.root = None
4.
5. def insert(self, val):
6. if self.root is None:
7. self.root = Node(val)
8. else:
9. self.recursive_insert(self.root, val)
10.
11. def recursive_insert(self, node, val):
12. if val < node.elem:
13. if node.left is None:
14. node.left = Node(val)
15. else:
16. self.recursive_insert(node.left, val)
17. else:
18. if node.right is None:
19. node.right = Node(val)
20. else:
21. self.recursive_insert(node.right, val)
That's it. I hope you can see why recursion is useful here. It makes our code
read better and when we're planning this out in our heads, it naturally fits into
our algorithm.
Next, we're going to look deleting an element. This is quite a bit more difficult.
With insertion, we always insert into one of the leaf nodes but if we're deleting
something, then three possibilities arise. The first is the simplest case:
1. The node we're removing has no child nodes (i.e. a leaf node). In this
case we can just remove the node.
2. The next case is when we're removing a node that only has one child.
Again this is simple enough to deal with. In this case we just cut the
node we're removing from the tree and link its child to its parent.
215
3. The final case is when we're removing a node that has two child nodes.
This is the most complex case and requires some smart thinking. There
is however, one useful property of BSTs that we can use to solve this
problem. That is, the same set of values can be represented as different
binary-search trees. Let's consider the following set of values: {9, 23, 11,
30}. We can represent this in two different ways:
Both of these trees are different yet they are both valid. What did we do here
though to transform the first tree into the second?
We can take this idea and apply it to removing elements with two child nodes.
Here's how we'll do that:
216
• Replace the element of the node to be removed with the minimum we
just found.
• Be careful! The right subtree now contains a duplicate.
• Recursively apply the removal to the right subtree
When applying the removal to the right subtree to remove the duplicate, we
can be certain that the duplicate node will fall under case 1 or 2. It won't fall
under case 3 (It wouldn't have been the minimum if we did).
There are two functions we'll need here. One is the remove method and the
other will find the minimum node. We get a free function here (we will have
a min operation on our BST!).
Here's how to code it, I'll also comment the code as it's a bit complex:
1. class BinarySearchTree:
2. def __init__(self):
3. self.root = None
4.
5. def remove(self, value):
6. return self.recursive_remove(self.root, None, value)
7.
8. def recursive_remove(self, node, parent, value):
9. # Helper function
10. def min_node(node):
11. curr = node
12. while curr.left != None:
13. curr = curr.left
14. return curr
15.
16. # Base case (element doesn't exist in the tree)
17. if node == None:
18. return node
19.
20. # If element to remove is less than current node
21. # then it is in the left subtree. We must set the current
22. # node's left subtree equal to the result of the removal
23. if value < node.elem:
24. node.left = self.recursive_remove(node.left, node, value)
25.
26. # If element to remove is greater than current node
27. # then it is in the right subtree. We must set the current
28. # node's right subtree equal to the result of the removal
29. elif value > node.elem:
30. node.right = self.recursive_remove(node.right, node, value)
31.
32. # If the element to remove is equal to the value of
33. # current node, then this is the node to remove
34. else:
35. print(node.elem, not parent is None)
36. # Not the root node
37. if not parent is None:
38. # Node has no children (CASE 2)
39. if node.num_children() == 0:
40. if parent.left == node:
41. parent.left = None
42. else:
217
43. parent.right = None
44. # Node has only one child (CASE 2)
45. elif node.num_children() == 1:
46. # Get the child node
47. if node.left != None:
48. child = node.left
49. else:
50. child = node.right
51.
52. # Point the parent node to the child node of
53. # the node we're removing
54. if parent.left == node:
55. parent.left = child
56. else:
57. parent.right = child
58.
59. # Node has two children (CASE 3)
60. elif node.num_children() == 2:
61. # Get the min node in the right subtree of node to remove
62. smallest_node = min_node(node.right)
63.
64. # Copy the smallest nodes value to the node formerly holding
65. # the value we wanted to delete
66. node.elem = smallest_node.elem
67. self.recursive_remove(node, parent, value)
68. # Node to delete is the root node
69. else:
70. # Only 1 element in the tree (the root) - set the root to None
71. if node.num_children() == 0:
72. node = None
73. # Root has one child - make that the root
74. elif node.num_children() == 1:
75. if node.left != None:
76. self.root = node.left
77. if node.right != None:
78. self.root = node.right
79. elif node.num_children() == 2:
80. smallest_node = min_node(node.right)
81. node.elem = smallest_node.elem
82. self.recursive_remove(node, None, value)
83. return node
This is tough. Make sure you understand the delete operation, even if that
means going through it over and over again. Use a pen and paper if you need!
It is a good question for an employer to ask as it shows how someone
approaches a tough problem (pointing out different scenarios, breaking the
problem down, etc.).
That is all I want to cover on Binary Search Tree's for now. There are a few
more methods we can add, and I'll leave that up to you in the exercises!
22.3 - Exercises
Important: Some of these exercises are quite difficult! Make use of problem-
solving techniques to help you arrive at a solution. Sketching out what's
happening on paper is a good way to help you out!
218
It's time to turn up the heat a bit on the difficulty of the questions. There are
some really tough questions in here. Good luck!
Question 1
We talked earlier about the run-time complexities of the Linked List methods
we implemented. I mentioned that we could improve their complexities.
Question 2
Change the implementation of the add() or remove() methods on the Linked
List class in order to achieve a Linked List that behaves like:
• A Stack
• A Queue
Question 3
Another useful method our Linked List class may have is a length() method.
That is, it returns the number of elements in the linked list.
Question 4
If you thought carefully about your implementation of the length() method
from the previous exercise then you might be able to skip this question.
However, I'm not expecting that you did.
If my guess is correct, your length() method iterates over each element in the
list, adding 1 to your length each time until you reach the end of the list. This
will give your length() method a run-time complexity of O(n).
We can do better than that though. How might you change the Linked List
class so that the run-time complexity of your length() method is O(1)?
219
Question 5
The height (depth) of a Tree is the number of edges on the longest path from
the root node to leaf node. (An edge is basically the arrows in our diagrams
that we've looked at before)
The height is determined by the number of edges in the longest path, like the
following:
220
You can also view the height of a tree as such:
You should write a recursive method to find the height of any given binary
search tree
Question 6
This question is very difficult
Consider what happens when we add elements to the tree in the following
order: {2, 5, 7, 12, 13}. We end up with a Linked List. That means that our
search and insert is no longer O(log n). It's now O(n). This isn't really that great
for us.
We can however, make an optimization to our binary search tree so that our
search and insert are always O(log n), regardless of what order we enter the
elements.
The optimization is that the difference between height of the left and right
subtrees is no greater than one for all nodes. When the difference between left
and right subtrees height is no greater than 1, we say the tree is balanced.
This means, if we insert or remove an element and the height between left and
right subtrees becomes greater than 1 then we must rearrange the nodes.
This optimization means that our binary search tree will become self-
balancing.
This type of self-balancing binary search tree has a special name. It's called
an AVL Tree.
221