Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
1 Python Primer 1
1.1 Python Overview . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 The Python Interpreter . . . . . . . . . . . . . . . . . . 2
1.1.2 Preview of a Python Program . . . . . . . . . . . . . . 3
1.2 Objects in Python . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.1 Identifiers, Objects, and the Assignment Statement . . . 4
1.2.2 Creating and Using Objects . . . . . . . . . . . . . . . . 6
1.2.3 Python’s Built-In Classes . . . . . . . . . . . . . . . . . 7
1.3 Expressions, Operators, and Precedence . . . . . . . . . . . 12
1.3.1 Compound Expressions and Operator Precedence . . . . 17
1.4 Control Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.4.1 Conditionals . . . . . . . . . . . . . . . . . . . . . . . . 18
1.4.2 Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.5 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.5.1 Information Passing . . . . . . . . . . . . . . . . . . . . 24
1.5.2 Python’s Built-In Functions . . . . . . . . . . . . . . . . 28
1.6 Simple Input and Output . . . . . . . . . . . . . . . . . . . . 30
1.6.1 Console Input and Output . . . . . . . . . . . . . . . . 30
1.6.2 Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.7 Exception Handling . . . . . . . . . . . . . . . . . . . . . . . 33
1.7.1 Raising an Exception . . . . . . . . . . . . . . . . . . . 34
1.7.2 Catching an Exception . . . . . . . . . . . . . . . . . . 36
1.8 Iterators and Generators . . . . . . . . . . . . . . . . . . . . 39
1.9 Additional Python Conveniences . . . . . . . . . . . . . . . . 42
1.9.1 Conditional Expressions . . . . . . . . . . . . . . . . . . 42
1.9.2 Comprehension Syntax . . . . . . . . . . . . . . . . . . 43
1.9.3 Packing and Unpacking of Sequences . . . . . . . . . . 44
1.10 Scopes and Namespaces . . . . . . . . . . . . . . . . . . . . 46
1.11 Modules and the Import Statement . . . . . . . . . . . . . . 48
1.11.1 Existing Modules . . . . . . . . . . . . . . . . . . . . . 49
1.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2 Object-Oriented Programming 56
2.1 Goals, Principles, and Patterns . . . . . . . . . . . . . . . . 57
2.1.1 Object-Oriented Design Goals . . . . . . . . . . . . . . 57
2.1.2 Object-Oriented Design Principles . . . . . . . . . . . . 58
2.1.3 Design Patterns . . . . . . . . . . . . . . . . . . . . . . 61
2.2 Software Development . . . . . . . . . . . . . . . . . . . . . 62
2.2.1 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.2.2 Pseudo-Code . . . . . . . . . . . . . . . . . . . . . . . 64
2.2.3 Coding Style and Documentation . . . . . . . . . . . . . 64
2.2.4 Testing and Debugging . . . . . . . . . . . . . . . . . . 67
2.3 Class Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 69
2.3.1 Example: CreditCard Class . . . . . . . . . . . . . . . . 69
2.3.2 Operator Overloading and Python’s Special Methods . . 74
2.3.3 Example: Multidimensional Vector Class . . . . . . . . . 77
2.3.4 Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.3.5 Example: Range Class . . . . . . . . . . . . . . . . . . . 80
2.4 Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.4.1 Extending the CreditCard Class . . . . . . . . . . . . . . 83
2.4.2 Hierarchy of Numeric Progressions . . . . . . . . . . . . 87
2.4.3 Abstract Base Classes . . . . . . . . . . . . . . . . . . . 93
2.5 Namespaces and Object-Orientation . . . . . . . . . . . . . 96
2.5.1 Instance and Class Namespaces . . . . . . . . . . . . . . 96
2.5.2 Name Resolution and Dynamic Dispatch . . . . . . . . . 100
2.6 Shallow and Deep Copying . . . . . . . . . . . . . . . . . . . 101
2.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
8 Trees 299
8.1 General Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
8.1.1 Tree Definitions and Properties . . . . . . . . . . . . . . 301
8.1.2 The Tree Abstract Data Type . . . . . . . . . . . . . . 305
8.1.3 Computing Depth and Height . . . . . . . . . . . . . . . 308
8.2 Binary Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
8.2.1 The Binary Tree Abstract Data Type . . . . . . . . . . . 313
8.2.2 Properties of Binary Trees . . . . . . . . . . . . . . . . 315
8.3 Implementing Trees . . . . . . . . . . . . . . . . . . . . . . . 317
8.3.1 Linked Structure for Binary Trees . . . . . . . . . . . . . 317
8.3.2 Array-Based Representation of a Binary Tree . . . . . . 325
8.3.3 Linked Structure for General Trees . . . . . . . . . . . . 327
8.4 Tree Traversal Algorithms . . . . . . . . . . . . . . . . . . . 328
8.4.1 Preorder and Postorder Traversals of General Trees . . . 328
8.4.2 Breadth-First Tree Traversal . . . . . . . . . . . . . . . 330
8.4.3 Inorder Traversal of a Binary Tree . . . . . . . . . . . . 331
8.4.4 Implementing Tree Traversals in Python . . . . . . . . . 333
8.4.5 Applications of Tree Traversals . . . . . . . . . . . . . . 337
8.4.6 Euler Tours and the Template Method Pattern . . . . 341
8.5 Case Study: An Expression Tree . . . . . . . . . . . . . . . . 348
8.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
Bibliography 732
Index 737
Chapter 1
Python Primer
Figure 1.1: The identifier temperature references an instance of the float class
having value 98.6.
Identifiers
Identifiers in Python are case-sensitive, so temperature and Temperature are dis-
tinct names. Identifiers can be composed of almost any combination of letters,
numerals, and underscore characters (or more general Unicode characters). The
primary restrictions are that an identifier cannot begin with a numeral (thus 9lives
is an illegal name), and that there are 33 specially reserved words that cannot be
used as identifiers, as shown in Table 1.1.
Reserved Words
False as continue else from in not return yield
None assert def except global is or try
True break del finally if lambda pass while
and class elif for import nonlocal raise with
Table 1.1: A listing of the reserved words in Python. These names cannot be used
as identifiers.
1.2. Objects in Python 5
For readers familiar with other programming languages, the semantics of a
Python identifier is most similar to a reference variable in Java or a pointer variable
in C++. Each identifier is implicitly associated with the memory address of the
object to which it refers. A Python identifier may be assigned to a special object
named None, serving a similar purpose to a null reference in Java or C++.
Unlike Java and C++, Python is a dynamically typed language, as there is no
advance declaration associating an identifier with a particular data type. An iden-
tifier can be associated with any type of object, and it can later be reassigned to
another object of the same (or different) type. Although an identifier has no de-
clared type, the object to which it refers has a definite type. In our first example,
the characters 98.6 are recognized as a floating-point literal, and thus the identifier
temperature is associated with an instance of the float class having that value.
A programmer can establish an alias by assigning a second identifier to an
existing object. Continuing with our earlier example, Figure 1.2 portrays the result
of a subsequent assignment, original = temperature.
Figure 1.2: Identifiers temperature and original are aliases for the same object.
Once an alias has been established, either name can be used to access the under-
lying object. If that object supports behaviors that affect its state, changes enacted
through one alias will be apparent when using the other alias (because they refer to
the same object). However, if one of the names is reassigned to a new value using
a subsequent assignment statement, that does not affect the aliased object, rather it
breaks the alias. Continuing with our concrete example, we consider the command:
temperature = temperature + 5.0
The execution of this command begins with the evaluation of the expression on the
right-hand side of the = operator. That expression, temperature + 5.0, is eval-
uated based on the existing binding of the name temperature, and so the result
has value 103.6, that is, 98.6 + 5.0. That result is stored as a new floating-point
instance, and only then is the name on the left-hand side of the assignment state-
ment, temperature, (re)assigned to the result. The subsequent configuration is dia-
grammed in Figure 1.3. Of particular note, this last command had no effect on the
value of the existing float instance that identifier original continues to reference.
Figure 1.3: The temperature identifier has been assigned to a new value, while
original continues to refer to the previously existing value.
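The aliasing behavior portrayed in Figures 1.2 and 1.3 can be checked directly in the interpreter. The following is a minimal sketch that reuses the identifiers temperature and original from the text; the names same_before and same_after are introduced here only for illustration, and the is operator (discussed in Section 1.3) is used to test whether two names refer to the same object.

```python
temperature = 98.6
original = temperature                  # original is now an alias for the same float

same_before = original is temperature   # True: both names reference one object

temperature = temperature + 5.0         # builds a new float, then rebinds temperature

same_after = original is temperature    # False: the alias has been broken
print(same_before, same_after)          # True False
print(original)                         # 98.6, the earlier instance is unchanged
```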
6 Chapter 1. Python Primer
Calling Methods
Python supports traditional functions (see Section 1.5) that are invoked with a syn-
tax such as sorted(data), in which case data is a parameter sent to the function.
Python’s classes may also define one or more methods (also known as member
functions), which are invoked on a specific instance of a class using the dot (“.”)
operator. For example, Python’s list class has a method named sort that can be
invoked with a syntax such as data.sort( ). This particular method rearranges the
contents of the list so that they are sorted.
The expression to the left of the dot identifies the object upon which the method
is invoked. Often, this will be an identifier (e.g., data), but we can use the dot operator
to invoke a method upon the immediate result of some other operation. For
example, if response identifies a string instance (we will discuss strings later in this
section), the syntax response.lower().startswith('y') first evaluates the method
call, response.lower(), which itself returns a new string instance, and then the
startswith('y') method is called on that intermediate string.
When using a method of a class, it is important to understand its behavior.
Some methods return information about the state of an object, but do not change
that state. These are known as accessors. Other methods, such as the sort method
of the list class, do change the state of an object. These methods are known as
mutators or update methods.
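As a brief illustration of the distinction between functions, accessors, and mutators, consider the following sketch. The sample list and the names ordered, snapshot, count_of_four, and is_yes are hypothetical, chosen only for this example.

```python
data = [31, 4, 15, 9]
snapshot = list(data)            # independent copy, kept for comparison

ordered = sorted(data)           # function: returns a new sorted list
print(data == snapshot)          # True, sorted(data) did not change data

count_of_four = data.count(4)    # accessor: reports on the list, no change of state
result = data.sort()             # mutator: rearranges data in place
print(result)                    # None, the sort method returns no value
print(data)                      # [4, 9, 15, 31]

response = 'Yes'                 # hypothetical user input
is_yes = response.lower().startswith('y')   # startswith acts on the new string 'yes'
print(is_yes, response)          # True Yes (the original string is unchanged)
```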
The int and float classes are the primary numeric types in Python. The int class is
designed to represent integer values with arbitrary magnitude. Unlike Java and
C++, which support different integral types with different precisions (e.g., int,
short, long), Python automatically chooses the internal representation for an in-
teger based upon the magnitude of its value. Typical literals for integers include 0,
137, and −23. In some contexts, it is convenient to express an integral value using
binary, octal, or hexadecimal. That can be done by using a prefix of the number 0
and then a character to describe the base. Examples of such literals are, respectively,
0b1011, 0o52, and 0x7f.
The integer constructor, int( ), returns value 0 by default. But this constructor
can be used to construct an integer value based upon an existing value of another
type. For example, if f represents a floating-point value, the syntax int(f) produces
the truncated value of f. For example, both int(3.14) and int(3.99) produce the
value 3, while int(−3.9) produces the value −3. The constructor can also be used
to parse a string that is presumed to represent an integral value (such as one en-
tered by a user). If s represents a string, then int(s) produces the integral value
that string represents. For example, the expression int('137') produces the integer
value 137. If an invalid string is given as a parameter, as in int('hello'), a
ValueError is raised (see Section 1.7 for discussion of Python’s exceptions). By default,
the string must use base 10. If conversion from a different base is desired, that
base can be indicated as a second, optional, parameter. For example, the expression
int('7f', 16) evaluates to the integer 127.
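The behaviors of the int constructor described above can be confirmed with a short experiment. This is only a sketch; the names values, literals, and parsed are introduced here for illustration.

```python
values = [int(), int(3.14), int(3.99), int(-3.9), int('137'), int('7f', 16)]
print(values)                    # [0, 3, 3, -3, 137, 127]

literals = [0b1011, 0o52, 0x7f]  # binary, octal, and hexadecimal literals
print(literals)                  # [11, 42, 127]

try:
    int('hello')                 # not a valid base-10 integer
    parsed = True
except ValueError:
    parsed = False
print(parsed)                    # False
```

Note that int(-3.9) yields -3 rather than -4, because the conversion truncates toward zero instead of taking the floor.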
The float class is the sole floating-point type in Python, using a fixed-precision
representation. Its precision is more akin to a double in Java or C++, rather than
those languages’ float type. We have already discussed a typical literal form, 98.6.
We note that the floating-point equivalent of an integral number can be expressed
directly as 2.0. Technically, the trailing zero is optional, so some programmers
might use the expression 2. to designate this floating-point literal. One other form
of literal for floating-point values uses scientific notation. For example, the literal
6.022e23 represents the mathematical value 6.022 × 10^23.
The constructor form of float( ) returns 0.0. When given a parameter, the con-
structor attempts to return the equivalent floating-point value. For example, the call
float(2) returns the floating-point value 2.0. If the parameter to the constructor is
a string, as with float('3.14'), it attempts to parse that string as a floating-point
value, raising a ValueError if the string cannot be parsed.
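A similar experiment illustrates the float constructor and the literal forms mentioned above. The names results, avogadro, and ok are hypothetical and serve only this sketch.

```python
results = [float(), float(2), float('3.14'), 2.]
print(results)                   # [0.0, 2.0, 3.14, 2.0]

avogadro = 6.022e23              # scientific notation for 6.022 × 10^23
print(type(avogadro).__name__)   # float

try:
    float('hello')               # cannot be parsed as a floating-point value
    ok = True
except ValueError:
    ok = False
print(ok)                        # False
```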
Sequence Types: The list, tuple, and str Classes
The list, tuple, and str classes are sequence types in Python, representing a col-
lection of values in which the order is significant. The list class is the most general,
representing a sequence of arbitrary objects (akin to an “array” in other languages).
The tuple class is an immutable version of the list class, benefiting from a stream-
lined internal representation. The str class is specially designed for representing
an immutable sequence of text characters. We note that Python does not have a
separate class for characters; they are just strings with length one.
[Figure: the list primes holds the values 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31 at indices 0 through 10; the string SAMPLE holds its characters at indices 0 through 5.]
Python’s set class represents the mathematical notion of a set, namely a collection
of elements, without duplicates, and without an inherent order to those elements.
The major advantage of using a set, as opposed to a list, is that it has a highly
optimized method for checking whether a specific element is contained in the set.
This is based on a data structure known as a hash table (which will be the primary
topic of Chapter 10). However, there are two important restrictions due to the
algorithmic underpinnings. The first is that the set does not maintain the elements
in any particular order. The second is that only instances of immutable types can be
added to a Python set. Therefore, objects such as integers, floating-point numbers,
and character strings are eligible to be elements of a set. It is possible to maintain a
set of tuples, but not a set of lists or a set of sets, as lists and sets are mutable. The
frozenset class is an immutable form of the set type, so it is legal to have a set of
frozensets.
Python uses curly braces { and } as delimiters for a set, for example, as {17}
or {'red', 'green', 'blue'}. The exception to this rule is that { } does not
represent an empty set; for historical reasons, it represents an empty dictionary
(see next paragraph). Instead, the constructor syntax set( ) produces an empty set.
If an iterable parameter is sent to the constructor, then the set of distinct elements
is produced. For example, set('hello') produces {'h', 'e', 'l', 'o'}.
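The following sketch exercises the set behaviors just described, including the restriction on mutable elements. All identifiers (letters, empty_set, points, groups, list_allowed) are chosen here for illustration only.

```python
letters = set('hello')                 # duplicates collapse to 4 distinct characters
print(len(letters), 'l' in letters)    # 4 True

empty_set = set()                      # the literal {} would be an empty dict
print(type({}).__name__)               # dict

points = {(0, 0), (1, 2)}              # tuples are immutable, so they may be elements
groups = {frozenset({1, 2}), frozenset({3})}   # a set of frozensets is legal

try:
    bad = {[1, 2], [3]}                # lists are mutable and are rejected
    list_allowed = True
except TypeError:
    list_allowed = False
print(list_allowed)                    # False
```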
Python’s dict class represents a dictionary, or mapping, from a set of distinct keys
to associated values. For example, a dictionary might map from unique student ID
numbers, to larger student records (such as the student’s name, address, and course
grades). Python implements a dict using an almost identical approach to that of a
set, but with storage of the associated values.
A dictionary literal also uses curly braces, and because dictionaries were intro-
duced in Python prior to sets, the literal form { } produces an empty dictionary.
A nonempty dictionary is expressed using a comma-separated series of key:value
pairs. For example, the dictionary {'ga': 'Irish', 'de': 'German'} maps
'ga' to 'Irish' and 'de' to 'German'.
The constructor for the dict class accepts an existing mapping as a parameter,
in which case it creates a new dictionary with identical associations as the existing
one. Alternatively, the constructor accepts a sequence of key-value pairs as a parameter,
as in dict(pairs) with pairs = [('ga', 'Irish'), ('de', 'German')].
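Both forms of the dict constructor can be seen in the sketch below. The names languages, copy_of_languages, and from_pairs are introduced only for this example.

```python
languages = {'ga': 'Irish', 'de': 'German'}

copy_of_languages = dict(languages)            # built from an existing mapping
pairs = [('ga', 'Irish'), ('de', 'German')]
from_pairs = dict(pairs)                       # built from key-value pairs

print(from_pairs == languages)                 # True, same associations
print(copy_of_languages is languages)          # False, a distinct dictionary
print(languages['de'])                         # German
print(type({}).__name__)                       # dict
```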
Logical Operators
Python supports the following keyword operators for Boolean values:
not unary negation
and conditional and
or conditional or
The and and or operators short-circuit, in that they do not evaluate the second
operand if the result can be determined based on the value of the first operand.
This feature is useful when constructing Boolean expressions in which we first test
that a certain condition holds (such as a reference not being None), and then test a
condition that could have otherwise generated an error condition had the prior test
not succeeded.
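A small sketch of this guarding idiom follows. The function name first_is_positive is hypothetical; the point is that the later operands of and are evaluated only when the earlier tests succeed.

```python
def first_is_positive(data):
    """Return True if data is a nonempty list whose first element is positive."""
    # data[0] is never attempted when data is None or empty,
    # because and stops at the first operand that fails.
    return data is not None and len(data) > 0 and data[0] > 0

print(first_is_positive(None))       # False, and no error is raised
print(first_is_positive([]))         # False, and no IndexError is raised
print(first_is_positive([3, -1]))    # True
```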
Equality Operators
Python supports the following operators to test two notions of equality:
is same identity
is not different identity
== equivalent
!= not equivalent
The expression a is b evaluates to True, precisely when identifiers a and b are
aliases for the same object. The expression a == b tests a more general notion of
equivalence. If identifiers a and b refer to the same object, then a == b should also
evaluate to True. Yet a == b also evaluates to True when the identifiers refer to
different objects that happen to have values that are deemed equivalent. The precise
notion of equivalence depends on the data type. For example, two strings are con-
sidered equivalent if they match character for character. Two sets are equivalent if
they have the same contents, irrespective of order. In most programming situations,
the equivalence tests == and != are the appropriate operators; use of is and is not
should be reserved for situations in which it is necessary to detect true aliasing.
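The difference between identity and equivalence can be observed with lists, as in this minimal sketch (the identifiers a, b, and c follow the naming used above, with c added for illustration).

```python
a = [1, 2, 3]
b = a                  # alias for the same list
c = [1, 2, 3]          # a distinct list with equivalent contents

alias_check = (a is b, a == b)
copy_check = (a is c, a == c)
print(alias_check)     # (True, True)
print(copy_check)      # (False, True)

print({1, 2, 3} == {3, 2, 1})   # True, set equivalence ignores order
```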
Comparison Operators
Data types may define a natural order via the following operators:
< less than
<= less than or equal to
> greater than
>= greater than or equal to
These operators have expected behavior for numeric types, and are defined lexicographically,
and case-sensitively, for strings. An exception is raised if operands
have incomparable types, as with 5 < 'hello'.
Arithmetic Operators
Python supports the following arithmetic operators:
+    addition
-    subtraction
*    multiplication
/    true division
//   integer division
%    the modulo operator
The use of addition, subtraction, and multiplication is straightforward, noting that if
both operands have type int, then the result is an int as well; if one or both operands
have type float, the result will be a float.
Python takes more care in its treatment of division. We first consider the case
in which both operands have type int, for example, the quantity 27 divided by
4. In mathematical notation, 27 ÷ 4 = 6 3/4 = 6.75. In Python, the / operator
designates true division, returning the floating-point result of the computation.
Thus, 27 / 4 results in the float value 6.75. Python supports the pair of opera-
tors // and % to perform the integral calculations, with expression 27 // 4 evalu-
ating to int value 6 (the mathematical floor of the quotient), and expression 27 % 4
evaluating to int value 3, the remainder of the integer division. We note that lan-
guages such as C, C++, and Java do not support the // operator; instead, the / op-
erator returns the truncated quotient when both operands have integral type, and the
result of true division when at least one operand has a floating-point type.
Python carefully extends the semantics of // and % to cases where one or both
operands are negative. For the sake of notation, let us assume that variables n
and m represent respectively the dividend and divisor of a quotient n/m, and that
q = n // m and r = n % m. Python guarantees that q ∗ m + r will equal n. We
already saw an example of this identity with positive operands, as 6 ∗ 4 + 3 = 27.
When the divisor m is positive, Python further guarantees that 0 ≤ r < m. As
a consequence, we find that −27 // 4 evaluates to −7 and −27 % 4 evaluates
to 1, as (−7) ∗ 4 + 1 = −27. When the divisor is negative, Python guarantees that
m < r ≤ 0. As an example, 27 // −4 is −7 and 27 % −4 is −1, satisfying the
identity 27 = (−7) ∗ (−4) + (−1).
The conventions for the // and % operators are even extended to floating-point
operands, with the expression q = n // m being the integral floor of the
quotient, and r = n % m being the “remainder” to ensure that q ∗ m + r equals
n. For example, 8.2 // 3.14 evaluates to 2.0 and 8.2 % 3.14 evaluates to approximately
1.92 (subject to floating-point rounding), as 2.0 ∗ 3.14 + 1.92 = 8.2.
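The identity relating // and % can be verified for each sign combination discussed above. This is a sketch only; the names results, q, and r are introduced for illustration, and the floating-point case is checked up to a small tolerance rather than for exact equality.

```python
# Integer operands: q * m + r == n holds for every sign combination.
results = []
for n, m in [(27, 4), (-27, 4), (27, -4)]:
    q, r = n // m, n % m
    assert q * m + r == n
    results.append((q, r))
print(results)                  # [(6, 3), (-7, 1), (-7, -1)]

print(27 / 4)                   # 6.75, true division yields a float

# Float operands: the identity holds up to rounding error.
q, r = 8.2 // 3.14, 8.2 % 3.14
print(q)                        # 2.0
print(abs(q * 3.14 + r - 8.2) < 1e-12)   # True
```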
Bitwise Operators
Python provides the following bitwise operators for integers:
~    bitwise complement (prefix unary operator)
&    bitwise and
|    bitwise or
^    bitwise exclusive-or
<<   shift bits left, filling in with zeros
>>   shift bits right, filling in with sign bit
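A few concrete evaluations may make these operators easier to follow. The operand values below are arbitrary choices for this sketch, written as binary literals so the bit patterns are visible.

```python
a, b = 0b1100, 0b1010       # 12 and 10

print(bin(a & b))           # 0b1000
print(bin(a | b))           # 0b1110
print(bin(a ^ b))           # 0b110
print(~a)                   # -13, since ~x equals -x - 1 for Python integers
print(a << 2)               # 48
print(-16 >> 2)             # -4, the sign is preserved when shifting right
```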
Sequence Operators
Each of Python’s built-in sequence types (str, tuple, and list) supports the following
operator syntaxes:
s[j]                 element at index j
s[start:stop]        slice including indices [start,stop)
s[start:stop:step]   slice including indices start, start + step,
                     start + 2*step, ..., up to but not equalling stop
s + t                concatenation of sequences
k * s                shorthand for s + s + s + ... (k times)
val in s             containment check
val not in s         non-containment check
Python relies on zero-indexing of sequences, thus a sequence of length n has ele-
ments indexed from 0 to n − 1 inclusive. Python also supports the use of negative
indices, which denote a distance from the end of the sequence; index −1 denotes
the last element, index −2 the second to last, and so on. Python uses a slicing
notation to describe subsequences of a sequence. Slices are described as half-open
intervals, with a start index that is included, and a stop index that is excluded. For
example, the syntax data[3:8] denotes a subsequence including the five indices:
3, 4, 5, 6, 7. An optional “step” value, possibly negative, can be indicated as a third
parameter of the slice. If a start index or stop index is omitted in the slicing nota-
tion, it is presumed to designate the respective extreme of the original sequence.
Because lists are mutable, the syntax s[j] = val can be used to replace an ele-
ment at a given index. Lists also support a syntax, del s[j], that removes the desig-
nated element from the list. Slice notation can also be used to replace or delete a
sublist.
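The indexing, slicing, and list-mutation syntaxes can be tried on a small list, as sketched here. The contents of data and the names window, head, tail, stepped, and backwards are assumptions made for the example.

```python
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

window = data[3:8]        # indices 3, 4, 5, 6, 7
head = data[:3]           # omitted start means the beginning
tail = data[-2:]          # negative index counts from the end
stepped = data[1:8:3]     # indices 1, 4, 7
backwards = data[::-1]    # negative step traverses in reverse
print(window)             # [40, 50, 60, 70, 80]
print(head, tail)         # [10, 20, 30] [90, 100]
print(stepped)            # [20, 50, 80]

data[0] = 11              # replace a single element
del data[1]               # remove the element at index 1 (the value 20)
data[0:2] = [1, 2, 3]     # replace a sublist with a list of different length
del data[-3:]             # delete a sublist
print(data)               # [1, 2, 3, 40, 50, 60, 70]
```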
The notation val in s can be used for any of the sequences to see if there is an
element equivalent to val in the sequence. For strings, this syntax can be used to
check for a single character or for a larger substring, as with 'amp' in 'example'.
All sequences define comparison operations based on lexicographic order, per-
forming an element by element comparison until the first difference is found. For
example, [5, 6, 9] < [5, 7] because of the entries at index 1. Therefore, the follow-
ing operations are supported by sequence types:
s == t equivalent (element by element)
s != t not equivalent
s < t lexicographically less than
s <= t lexicographically less than or equal to
s > t lexicographically greater than
s >= t lexicographically greater than or equal to
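The sequence operators and comparisons above can be sampled as follows. This is a sketch with arbitrarily chosen operands; the name comparable is hypothetical.

```python
print([5, 6, 9] < [5, 7])        # True, decided by the entries at index 1
print((1, 2) < (1, 2, 0))        # True, a proper prefix is considered smaller
print('apple' < 'banana')        # True
print('Zebra' < 'apple')         # True, comparison of strings is case-sensitive
print('amp' in 'example')        # True, substring containment
print(2 * 'ab' + 'c')            # ababc

try:
    5 < 'hello'                  # operands of incomparable types
    comparable = True
except TypeError:
    comparable = False
print(comparable)                # False
```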
Dictionaries also support many useful behaviors through named methods, which
we explore more fully in Chapter 10.
Python supports an extended assignment operator for most binary operators, for
example, allowing a syntax such as count += 5. By default, this is a shorthand for
the more verbose count = count + 5. For an immutable type, such as a number or
a string, one should not presume that this syntax changes the value of the existing
object, but instead that it will reassign the identifier to a newly constructed value.
(See discussion of Figure 1.3.) However, it is possible for a type to redefine such
semantics to mutate the object, as the list class does for the += operator.
alpha = [1, 2, 3]
beta = alpha # an alias for alpha
beta += [4, 5] # extends the original list with two more elements
beta = beta + [6, 7] # reassigns beta to a new list [1, 2, 3, 4, 5, 6, 7]
print(alpha) # will be [1, 2, 3, 4, 5]
This example demonstrates the subtle difference between the list semantics for the
syntax beta += foo versus beta = beta + foo.
Operator Precedence
Type Symbols
1 member access expr.member
function/method calls expr(...)
2
container subscripts/slices expr[...]
3 exponentiation
4 unary operators +expr, −expr, ˜expr
5 multiplication, division , /, //, %
6 addition, subtraction +, −
7 bitwise shifting <<, >>
8 bitwise-and &
9 bitwise-xor ˆ
10 bitwise-or |
comparisons is, is not, ==, !=, <, <=, >, >=
11
containment in, not in
12 logical-not not expr
13 logical-and and
14 logical-or or
15 conditional val1 if cond else val2
16 assignments =, +=, −=, =, etc.
Table 1.3: Operator precedence in Python, with categories ordered from highest
precedence to lowest precedence. When stated, we use expr to denote a literal,
identifier, or result of a previously evaluated expression. All operators without
explicit mention of expr are binary operators, with syntax expr1 operator expr2.
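A few expressions show how these precedence levels affect evaluation. The expressions and the names p1 through p5 are chosen here for illustration; parentheses can always be used to make the intended grouping explicit.

```python
p1 = 2 + 3 * 4                   # multiplication before addition
p2 = -2 ** 2                     # exponentiation before unary minus
p3 = 1 + 2 << 3                  # addition before shifting
p4 = not 1 == 2                  # comparison before logical-not
p5 = True or False and False     # logical-and before logical-or

print(p1, p2, p3, p4, p5)        # 14 -4 24 True True
print((2 + 3) * 4)               # 20, parentheses override the default grouping
```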
1.4.1 Conditionals
Conditional constructs (also known as if statements) provide a way to execute a
chosen block of code based on the run-time evaluation of one or more Boolean
expressions. In Python, the most general form of a conditional is written as follows:
if first_condition:
    first_body
elif second_condition:
    second_body
elif third_condition:
    third_body
else:
    fourth_body
Each condition is a Boolean expression, and each body contains one or more com-
mands that are to be executed conditionally. If the first condition succeeds, the first
body will be executed; no other conditions or bodies are evaluated in that case.
If the first condition fails, then the process continues in similar manner with the
evaluation of the second condition. The execution of this overall construct will
cause precisely one of the bodies to be executed. There may be any number of
elif clauses (including zero), and the final else clause is optional. As described on
page 7, nonboolean types may be evaluated as Booleans with intuitive meanings.
For example, if response is a string that was entered by a user, and we want to
condition a behavior on this being a nonempty string, we may write
if response:
as a shorthand for the equivalent,
if response != '':
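To illustrate, the sketch below combines an if/elif/else chain with the implicit Boolean evaluation of a string. The function name classify and its return values are hypothetical, chosen only for this example.

```python
def classify(response):
    """Classify a string typed by a user (hypothetical helper)."""
    if not response:                           # an empty string is treated as False
        return 'empty'
    elif response.lower().startswith('y'):
        return 'yes'
    elif response.lower().startswith('n'):
        return 'no'
    else:
        return 'other'

print(classify(''))        # empty
print(classify('Yes'))     # yes
print(classify('no'))      # no
print(classify('maybe'))   # other
```

Exactly one of the four bodies executes on each call, and later conditions are not evaluated once an earlier one succeeds.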
As a simple example, a robot controller might have the following logic:
if door_is_closed:
    open_door()
advance()
Notice that the final command, advance( ), is not indented and therefore not part of
the conditional body. It will be executed unconditionally (although after opening a
closed door).
We may nest one control structure within another, relying on indentation to
make clear the extent of the various bodies. Revisiting our robot example, here is a
more complex control that accounts for unlocking a closed door.
if door_is_closed:
    if door_is_locked:
        unlock_door()
    open_door()
advance()
The logic expressed by this example can be diagrammed as a traditional flowchart,
as portrayed in Figure 1.6.
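[Figure 1.6: flowchart of the nested conditional. If door_is_closed is True, door_is_locked is tested; when that is also True, unlock_door() is called, and then open_door() is called. In every case, advance() is executed last.]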
1.4.2 Loops
Python offers two distinct looping constructs. A while loop allows general repeti-
tion based upon the repeated testing of a Boolean condition. A for loop provides
convenient iteration of values from a defined series (such as characters of a string,
elements of a list, or numbers within a given range). We discuss both forms in this
section.
While Loops
The syntax for a while loop in Python is as follows:
while condition:
    body
As with an if statement, condition can be an arbitrary Boolean expression, and
body can be an arbitrary block of code (including nested control structures). The
execution of a while loop begins with a test of the Boolean condition. If that condi-
tion evaluates to True, the body of the loop is performed. After each execution of
the body, the loop condition is retested, and if it evaluates to True, another iteration
of the body is performed. When the conditional test evaluates to False (assuming
it ever does), the loop is exited and the flow of control continues just beyond the
body of the loop.
As an example, here is a loop that advances an index through a sequence of
characters until finding an entry with value 'X' or reaching the end of the sequence.
j = 0
while j < len(data) and data[j] != 'X':
    j += 1
The len function, which we will introduce in Section 1.5.2, returns the length of a
sequence such as a list or string. The correctness of this loop relies on the short-
circuiting behavior of the and operator, as described on page 12. We intention-
ally test j < len(data) to ensure that j is a valid index, prior to accessing element
data[j]. Had we written that compound condition with the opposite order, the evaluation
of data[j] would eventually raise an IndexError when 'X' is not found. (See
Section 1.7 for discussion of exceptions.)
As written, when this loop terminates, variable j’s value will be the index of
the leftmost occurrence of 'X', if found, or otherwise the length of the sequence
(which is recognizable as an invalid index to indicate failure of the search). It is
worth noting that this code behaves correctly, even in the special case when the list
is empty, as the condition j < len(data) will initially fail and the body of the loop
will never be executed.
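The loop can be packaged as a function to examine the three cases just described: target found, target absent, and empty sequence. The function name find_first and its target parameter are assumptions made for this sketch; the loop itself is the one given above.

```python
def find_first(data, target):
    """Return the index of the leftmost target in data, or len(data) if absent."""
    j = 0
    # Testing j < len(data) first guarantees that data[j] is a valid access.
    while j < len(data) and data[j] != target:
        j += 1
    return j

print(find_first('ABXCX', 'X'))   # 2, the leftmost occurrence
print(find_first('ABC', 'X'))     # 3, the length signals that the search failed
print(find_first([], 'X'))        # 0, the body never executes for an empty list
```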
For Loops
For readers familiar with Java, the semantics of Python’s for loop is similar to the
“for each” loop style introduced in Java 1.5.
As an instructive example of such a loop, we consider the task of computing
the sum of a list of numbers. (Admittedly, Python has a built-in function, sum, for
this purpose.) We perform the calculation with a for loop as follows, assuming that
data identifies the list:
total = 0
for val in data:
    total += val               # note use of the loop variable, val
The loop body executes once for each element of the data sequence, with the iden-
tifier, val, from the for-loop syntax assigned at the beginning of each pass to a
respective element. It is worth noting that val is treated as a standard identifier. If
the element of the original data happens to be mutable, the val identifier can be
used to invoke its methods. But a reassignment of identifier val to a new value has
no effect on the original data, nor on the next iteration of the loop.
As a second classic example, we consider the task of finding the maximum
value in a list of elements (again, admitting that Python’s built-in max function
already provides this support). If we can assume that the list, data, has at least one
element, we could implement this task as follows:
biggest = data[0]              # as we assume nonempty list
for val in data:
    if val > biggest:
        biggest = val
Although we could accomplish both of the above tasks with a while loop, the
for-loop syntax has the advantage of simplicity, as there is no need to manage an
explicit index into the list nor to author a Boolean loop condition. Furthermore, we
can use a for loop in cases for which a while loop does not apply, such as when
iterating through a collection, such as a set, that does not support any direct form
of indexing.
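The points made about for loops can be brought together in one short sketch: summing and finding a maximum, the harmlessness of reassigning the loop variable, and iteration over a set that offers no indexing. The sample values and the names colors and longest are hypothetical.

```python
data = [3, 8, 2, 8, 5]

total = 0
biggest = data[0]              # assumes a nonempty list
for val in data:
    total += val
    if val > biggest:
        biggest = val
    val = 0                    # rebinding val has no effect on data
print(total, biggest)          # 26 8
print(data)                    # [3, 8, 2, 8, 5]

colors = {'red', 'green', 'blue'}   # a set supports iteration but not indexing
longest = ''
for name in colors:
    if len(name) > len(longest):
        longest = name
print(longest)                 # green
```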
Index-Based For Loops
The simplicity of a standard for loop over the elements of a list is wonderful; how-
ever, one limitation of that form is that we do not know where an element resides
within the sequence. In some applications, we need knowledge of the index of an
element within the sequence. For example, suppose that we want to know where
the maximum element in a list resides.
Rather than directly looping over the elements of the list in that case, we prefer
to loop over all possible indices of the list. For this purpose, Python provides
a built-in class named range that generates integer sequences. (We will discuss
generators in Section 1.8.) In simplest form, the syntax range(n) generates the
series of n values from 0 to n − 1. Conveniently, these are precisely the series of
valid indices into a sequence of length n. Therefore, a standard Python idiom for
looping through the series of indices of a data sequence uses a syntax,
for j in range(len(data)):
In this case, identifier j is not an element of the data—it is an integer. But the
expression data[j] can be used to retrieve the respective element. For example, we
can find the index of the maximum element of a list as follows:
big_index = 0
for j in range(len(data)):
    if data[j] > data[big_index]:
        big_index = j
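For experimentation, the same idiom can be wrapped in a function. This is only a sketch; the function name index_of_max and the sample list temps are assumptions made for the example.

```python
def index_of_max(data):
    """Return the index of the first maximal element of a nonempty sequence."""
    big_index = 0
    for j in range(len(data)):
        if data[j] > data[big_index]:   # strict test keeps the earliest maximum
            big_index = j
    return big_index

temps = [61, 75, 68, 75, 59]      # hypothetical sample data
print(index_of_max(temps))        # the first 75 resides at index 1
```

Because the comparison is strict, a later element that merely ties the current maximum does not replace it.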
1.5 Functions
In this section, we explore the creation of and use of functions in Python. As we
did in Section 1.2.2, we draw a distinction between functions and methods. We
use the general term function to describe a traditional, stateless function that is in-
voked without the context of a particular class or an instance of that class, such as
sorted(data). We use the more specific term method to describe a member function
that is invoked upon a specific object using an object-oriented message passing syn-
tax, such as data.sort( ). In this section, we only consider pure functions; methods
will be explored with more general object-oriented principles in Chapter 2.
We begin with an example to demonstrate the syntax for defining functions in
Python. The following function counts the number of occurrences of a given target
value within any form of iterable data set.
def count(data, target):
    n = 0
    for item in data:
        if item == target:        # found a match
            n += 1
    return n
The first line, beginning with the keyword def, serves as the function’s signature.
This establishes a new identifier as the name of the function (count, in this exam-
ple), and it establishes the number of parameters that it expects, as well as names
identifying those parameters (data and target, in this example). Unlike Java and
C++, Python is a dynamically typed language, and therefore a Python signature
does not designate the types of those parameters, nor the type (if any) of a return
value. Those expectations should be stated in the function’s documentation (see
Section 2.2.3) and can be enforced within the body of the function, but misuse of a
function will only be detected at run-time.
The remainder of the function definition is known as the body of the func-
tion. As is the case with control structures in Python, the body of a function is
typically expressed as an indented block of code. Each time a function is called,
Python creates a dedicated activation record that stores information relevant to the
current call. This activation record includes what is known as a namespace (see
Section 1.10) to manage all identifiers that have local scope within the current call.
The namespace includes the function’s parameters and any other identifiers that are
defined locally within the body of the function. An identifier in the local scope
of the function has no relation to any identifier with the same name in the
caller’s scope (although identifiers in different scopes may be aliases to the same
object). In our first example, the identifier n has scope that is local to the function
call, as does the identifier item, which is established as the loop variable.
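The independence of the two scopes can be observed directly. In the following sketch, the caller deliberately uses an identifier n of its own; the values 100 and the sample list are assumptions for illustration.

```python
def count(data, target):
    n = 0                         # n is local to this call
    for item in data:
        if item == target:        # found a match
            n += 1
    return n

n = 100                           # caller's identifier, unrelated to the local n
result = count(['A', 'B', 'A'], 'A')
print(n, result)                  # the caller's n is still 100
```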
Return Statement
A return statement is used within the body of a function to indicate that the func-
tion should immediately cease execution, and that an expressed value should be
returned to the caller. If a return statement is executed without an explicit argu-
ment, the None value is automatically returned. Likewise, None will be returned if
the flow of control ever reaches the end of a function body without having executed
a return statement. Often, a return statement will be the final command within the
body of the function, as was the case in our earlier example of a count function.
However, there can be multiple return statements in the same function, with con-
ditional logic controlling which such command is executed, if any. As a further
example, consider the following function that tests if a value exists in a sequence.
def contains(data, target):
    for item in data:
        if item == target:        # found a match
            return True
    return False
If the conditional within the loop body is ever satisfied, the return True statement is
executed and the function immediately ends, with True designating that the target
value was found. Conversely, if the for loop reaches its conclusion without ever
finding the match, the final return False statement will be executed.
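The two ways in which None can be returned are easy to confirm. The functions first_negative and do_nothing below are hypothetical examples, not part of the preceding discussion.

```python
def first_negative(data):
    """Return the first negative value in data, or None if there is none."""
    for item in data:
        if item < 0:
            return item           # ends the call immediately
    # reaching the end of the body returns None implicitly

def do_nothing():
    return                        # a return without an argument also gives None

print(first_negative([3, -2, -7]))    # -2
print(first_negative([1, 2]))         # None
print(do_nothing())                   # None
```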
Figure 1.7: A portrayal of parameter passing in Python, for the function call
count(grades, 'A'). Identifiers data and target are formal parameters defined
within the local scope of the count function.
The communication of a return value from the function back to the caller is
similarly implemented as an assignment. Therefore, with our sample invocation of
prizes = count(grades, 'A'), the identifier prizes in the caller’s scope is assigned
to the object that is identified as n in the return statement within our function body.
An advantage to Python’s mechanism for passing information to and from a
function is that objects are not copied. This ensures that the invocation of a function
is efficient, even in a case where a parameter or return value is a complex object.
Mutable Parameters
Python’s parameter passing model has additional implications when a parameter is
a mutable object. Because the formal parameter is an alias for the actual parameter,
the body of the function may interact with the object in ways that change its state.
Considering again our sample invocation of the count function, if the body of the
function executes the command data.append('F'), the new entry is added to the
end of the list identified as data within the function, which is one and the same as
the list known to the caller as grades. As an aside, we note that reassigning a new
value to a formal parameter within a function body, such as by setting data = [ ],
does not alter the actual parameter; such a reassignment simply breaks the alias.
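The contrast between mutating a parameter and reassigning it can be sketched as follows. The function names add_failing_grade and try_to_clear are assumptions introduced only for this demonstration.

```python
def add_failing_grade(data):
    data.append('F')              # mutates the object shared with the caller

def try_to_clear(data):
    data = []                     # rebinds the local name only; the alias is broken

grades = ['A', 'B+']
add_failing_grade(grades)         # caller's list gains a new entry
try_to_clear(grades)              # caller's list is unaffected
print(grades)                     # ['A', 'B+', 'F']
```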
Our hypothetical example of a count function that appends a new element to a
list lacks common sense. There is no reason to expect such a behavior, and it would
be quite a poor design to have such an unexpected effect on the parameter. There
are, however, many legitimate cases in which a function may be designed (and
clearly documented) to modify the state of a parameter. As a concrete example,
we present the following implementation of a function named scale whose primary
purpose is to multiply all entries of a numeric data set by a given factor.
def scale(data, factor):
    for j in range(len(data)):
        data[j] *= factor
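A brief usage sketch follows, under the assumption that the loop body multiplies each entry in place with the *= operator. The list readings is hypothetical caller data.

```python
def scale(data, factor):
    """Multiply every entry of the mutable sequence data by factor, in place."""
    for j in range(len(data)):
        data[j] *= factor         # reassigns slot j of the caller's list

readings = [1.5, 2.0, 4.0]        # hypothetical caller data
scale(readings, 2)                # no return value is needed
print(readings)                   # [3.0, 4.0, 8.0]
```

The caller observes the change because data and readings are aliases for the same list.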
Default Parameter Values
Python provides means for functions to support more than one possible calling
signature. Such a function is said to be polymorphic (which is Greek for “many
forms”). Most notably, functions can declare one or more default values for pa-
rameters, thereby allowing the caller to invoke a function with varying numbers of
actual parameters. As an artificial example, if a function is declared with signature
def foo(a, b=15, c=27):
there are three parameters, the last two of which offer default values. A caller is
welcome to send three actual parameters, as in foo(4, 12, 8), in which case the de-
fault values are not used. If, on the other hand, the caller only sends one parameter,
foo(4), the function will execute with parameter values a=4, b=15, c=27. If a
caller sends two parameters, they are assumed to be the first two, with the third be-
ing the default. Thus, foo(8, 20) executes with a=8, b=20, c=27. However, it is
illegal to define a function with a signature such as bar(a, b=15, c) with b having
a default value, yet not the subsequent c; if a default parameter value is present for
one parameter, it must be present for all further parameters.
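These calling patterns can be checked with a trivial body. The body below, which simply echoes its parameters as a tuple, is an assumption; the text specifies only the signature.

```python
def foo(a, b=15, c=27):
    return (a, b, c)              # hypothetical body that echoes the parameters

print(foo(4, 12, 8))              # (4, 12, 8): no defaults used
print(foo(4))                     # (4, 15, 27): both defaults used
print(foo(8, 20))                 # (8, 20, 27): only c takes its default
```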
As a more motivating example for the use of a default parameter, we revisit
the task of computing a student’s GPA (see Code Fragment 1.1). Rather than as-
sume direct input and output with the console, we prefer to design a function that
computes and returns a GPA. Our original implementation uses a fixed mapping
from each letter grade (such as a B−) to a corresponding point value (such as
2.67). While that point system is somewhat common, it may not agree with the
system used by all schools. (For example, some may assign an A+ grade a value
higher than 4.0.) Therefore, we design a compute_gpa function, given in Code
Fragment 1.2, which allows the caller to specify a custom mapping from grades to
values, while offering the standard point system as a default.
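Code Fragment 1.2 is not reproduced in this section. Purely as an illustrative sketch of the idea, and not necessarily the implementation given there, a function with a default mapping might look as follows. The abbreviated table of point values is an assumption (only the value 2.67 for a B- comes from the discussion above), as is the choice to skip unrecognized grades.

```python
def compute_gpa(grades, points={'A': 4.0, 'A-': 3.67, 'B+': 3.33,
                                'B': 3.0, 'B-': 2.67}):
    """Return the average point value of the recognized grades."""
    num_courses = 0
    total_points = 0
    for g in grades:
        if g in points:           # skip grades missing from the mapping
            num_courses += 1
            total_points += points[g]
    return total_points / num_courses   # assumes at least one recognized grade

standard = compute_gpa(['A', 'B'])                          # default mapping
custom = compute_gpa(['A+', 'A'], {'A+': 4.3, 'A': 4.0})    # caller's mapping
```

The default dictionary is never modified by the function, so sharing it across calls is harmless here.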
Keyword Parameters
The traditional mechanism for matching the actual parameters sent by a caller to
the formal parameters declared by the function signature is based on the concept
of positional arguments. For example, with signature foo(a=10, b=20, c=30),
parameters sent by the caller are matched, in the given order, to the formal param-
eters. An invocation of foo(5) indicates that a=5, while b and c are assigned their
default values.
Python supports an alternate mechanism for sending a parameter to a function
known as a keyword argument. A keyword argument is specified by explicitly
assigning an actual parameter to a formal parameter by name. For example, with
the above definition of function foo, a call foo(c=5) will invoke the function with
parameters a=10, b=20, c=5.
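Positional and keyword arguments can also be mixed in one call. In this sketch the function body is an assumption that simply echoes the parameter values.

```python
def foo(a=10, b=20, c=30):
    return (a, b, c)              # hypothetical body that echoes the parameters

print(foo(5))                     # (5, 20, 30): positional match to a
print(foo(c=5))                   # (10, 20, 5): keyword match to c
print(foo(1, c=3))                # (1, 20, 3): positional a, keyword c
```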
A function’s author can require that certain parameters be sent only through the
keyword-argument syntax. We never place such a restriction in our own function
definitions, but we will see several important uses of keyword-only parameters in
Python’s standard libraries. As an example, the built-in max function accepts a
keyword parameter, coincidentally named key, that can be used to vary the notion
of “maximum” that is used.
By default, max operates based upon the natural order of elements according
to the < operator for that type. But the maximum can be computed by comparing
some other aspect of the elements. This is done by providing an auxiliary function
that converts a natural element to some other value for the sake of comparison.
For example, if we are interested in finding a numeric value with magnitude that is
maximal (i.e., considering −35 to be larger than +20), we can use the calling syn-
tax max(a, b, key=abs). In this case, the built-in abs function is itself sent as the
value associated with the keyword parameter key. (Functions are first-class objects
in Python; see Section 1.10.) When max is called in this way, it will compare abs(a)
to abs(b), rather than a to b. The motivation for the keyword syntax as an alternate
to positional arguments is important in the case of max. This function is polymor-
phic in the number of arguments, allowing a call such as max(a,b,c,d); therefore,
it is not possible to designate a key function as a traditional positional element.
Sorting functions in Python also support a similar key parameter for indicating a
nonstandard order. (We explore this further in Section 9.4 and in Section 12.6.1,
when discussing sorting algorithms.)
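The effect of the key parameter can be seen in a short experiment. The list values is an assumption chosen for illustration.

```python
print(max(-35, 20))               # 20, using the natural order
print(max(-35, 20, key=abs))      # -35, since abs(-35) > abs(20)

values = [4, -35, 20, -7]         # hypothetical data
by_magnitude = sorted(values, key=abs)
print(by_magnitude)               # [4, -7, 20, -35]
```

Note that max and sorted still return the original elements; the key function influences only the comparisons.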
A Sample Program
Here is a simple, but complete, program that demonstrates the use of the input
and print functions. The tools for formatting the final output are discussed in
Appendix A.
age = int(input('Enter your age in years: '))
max_heart_rate = 206.9 - (0.67 * age)     # as per Med Sci Sports Exerc.
target = 0.65 * max_heart_rate
print('Your target fat-burning heart rate is', target)
1.6.2 Files
Files are typically accessed in Python beginning with a call to a built-in function,
named open, that returns a proxy for interactions with the underlying file. For
example, the command, fp = open('sample.txt'), attempts to open a file named
sample.txt, returning a proxy that allows read-only access to the text file.
The open function accepts an optional second parameter that determines the
access mode. The default mode is 'r' for reading. Other common modes are 'w'
for writing to the file (causing any existing file with that name to be overwritten),
or 'a' for appending to the end of an existing file. Although we focus on use of
text files, it is possible to work with binary files, using access modes such as 'rb'
or 'wb'.
When processing a file, the proxy maintains a current position within the file as
an offset from the beginning, measured in number of bytes. When opening a file
with mode 'r' or 'w', the position is initially 0; if opened in append mode, 'a',
the position is initially at the end of the file. The syntax fp.close( ) closes the file
associated with proxy fp, ensuring that any written contents are saved. A summary
of methods for reading and writing a file is given in Table 1.5.
Table 1.5: Behaviors for interacting with a text file via a file proxy (named fp).
Writing to a File
When a file proxy is writable, for example, if created with access mode 'w' or
'a', text can be written using methods write or writelines. For example, if we define
fp = open('results.txt', 'w'), the syntax fp.write('Hello World.\n')
writes a single line to the file with the given string. Note well that write does not
explicitly add a trailing newline, so desired newline characters must be embedded
directly in the string parameter. Recall that the output of the print function can be
redirected to a file using a keyword parameter, as described in Section 1.6.
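The access modes and writing behaviors described above can be combined in a short experiment. To avoid disturbing any existing file, this sketch writes into a temporary directory; that location, and the use of the standard tempfile and os modules, are assumptions of the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'results.txt')   # hypothetical location

fp = open(path, 'w')              # 'w' creates or overwrites the file
fp.write('Hello World.\n')        # newline must be embedded explicitly
print('second line', file=fp)     # print adds its own trailing newline
fp.close()                        # ensures written contents are saved

fp = open(path, 'a')              # 'a' positions at the end of the file
fp.write('appended\n')
fp.close()

fp = open(path)                   # default mode 'r' gives read-only access
contents = fp.read()
fp.close()
print(contents)
```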
1.7. Exception Handling 33
Class               Description
Exception           A base class for most error types
AttributeError      Raised by syntax obj.foo, if obj has no member named foo
EOFError            Raised if “end of file” reached for console or file input
IOError             Raised upon failure of I/O operation (e.g., opening file)
IndexError          Raised if index to sequence is out of bounds
KeyError            Raised if nonexistent key requested for set or dictionary
KeyboardInterrupt   Raised if user types ctrl-C while program is executing
NameError           Raised if nonexistent identifier used
StopIteration       Raised by next(iterator) if no element; see Section 1.8
TypeError           Raised when wrong type of parameter is sent to a function
ValueError          Raised when parameter has invalid value (e.g., sqrt(-5))
ZeroDivisionError   Raised when any division operator used with 0 as divisor
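Several of these exception classes can be triggered deliberately. The helper describe_failure below is a hypothetical function written for this demonstration; it runs a zero-argument function and reports the name of any exception raised.

```python
def describe_failure(action):
    """Call the zero-argument function action and name any exception it raises."""
    try:
        action()
    except Exception as e:        # Exception is a base class for most error types
        return type(e).__name__
    return 'no error'

print(describe_failure(lambda: [1, 2, 3][5]))      # IndexError
print(describe_failure(lambda: {'a': 1}['b']))     # KeyError
print(describe_failure(lambda: 1 / 0))             # ZeroDivisionError
print(describe_failure(lambda: int('abc')))        # ValueError
print(describe_failure(lambda: abs('abc')))        # TypeError
```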
Generators
In Section 2.3.4, we will explain how to define a class whose instances serve as
iterators. However, the most convenient technique for creating iterators in Python
is through the use of generators. A generator is implemented with a syntax that
is very similar to a function, but instead of returning values, a yield statement is
executed to indicate each element of the series. As an example, consider the goal
of determining all factors of a positive integer. For example, the number 100 has
factors 1, 2, 4, 5, 10, 20, 25, 50, 100. A traditional function might produce and
return a list containing all factors, implemented as:
def factors(n):                   # traditional function that computes factors
    results = [ ]                 # store factors in a new list
    for k in range(1,n+1):
        if n % k == 0:            # divides evenly, thus k is a factor
            results.append(k)     # add k to the list of factors
    return results                # return the entire list
In contrast, an implementation of a generator for computing those factors could be
implemented as follows:
def factors(n):                   # generator that computes factors
    for k in range(1,n+1):
        if n % k == 0:            # divides evenly, thus k is a factor
            yield k               # yield this factor as next result
Notice use of the keyword yield rather than return to indicate a result. This indicates
to Python that we are defining a generator, rather than a traditional function. A
generator should not use a return statement to report results; a zero-argument return
simply causes the generator to end its execution (and since Python 3.3, a value given
to return is attached to the resulting StopIteration exception rather than yielded). If
a programmer writes a loop such as for factor in factors(100):, an instance of our
generator is created. For each iteration of the loop, Python executes our procedure
until a yield statement indicates the next value. At that point, the procedure is tem-
porarily interrupted, only to be resumed when another value is requested. When
the flow of control naturally reaches the end of our procedure (or a zero-argument
return statement), a StopIteration exception is automatically raised. Although this
particular example uses a single yield statement in the source code, a generator can
rely on multiple yield statements in different constructs, with the generated series
determined by the natural flow of control. For example, we can greatly improve
the efficiency of our generator for computing factors of a number, n, by only test-
ing values up to the square root of that number, while reporting the factor n//k
that is associated with each k (unless n//k equals k). We might implement such a
generator as follows:
def factors(n):                   # generator that computes factors
    k = 1
    while k * k < n:              # while k < sqrt(n)
        if n % k == 0:
            yield k
            yield n // k
        k += 1
    if k * k == n:                # special case if n is perfect square
        yield k
We should note that this generator differs from our first version in that the factors
are not generated in strictly increasing order. For example, factors(100) generates
the series 1, 100, 2, 50, 4, 25, 5, 20, 10.
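The interruption and resumption of a generator can be observed by requesting values one at a time with the built-in next function. The following sketch assumes the square-root version of factors described above; the identifiers gen, first, rest, and exhausted are chosen for illustration.

```python
def factors(n):                   # generator that computes factors
    k = 1
    while k * k < n:              # while k < sqrt(n)
        if n % k == 0:
            yield k
            yield n // k
        k += 1
    if k * k == n:                # special case if n is perfect square
        yield k

gen = factors(4)                  # creates the generator; the body has not run yet
first = next(gen)                 # runs until the first yield, producing 1
rest = list(gen)                  # resumes and drains the remaining values: [4, 2]

try:
    next(gen)                     # nothing is left to produce
    exhausted = False
except StopIteration:
    exhausted = True

print(list(factors(100)))         # [1, 100, 2, 50, 4, 25, 5, 20, 10]
```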
In closing, we wish to emphasize the benefits of lazy evaluation when using a
generator rather than a traditional function. The results are only computed if re-
quested, and the entire series need not reside in memory at one time. In fact, a
generator can effectively produce an infinite series of values. As an example, the
Fibonacci numbers form a classic mathematical sequence, starting with value 0,
then value 1, and then each subsequent value being the sum of the two preceding
values. Hence, the Fibonacci series begins as: 0, 1, 1, 2, 3, 5, 8, 13, . . .. The follow-
ing generator produces this infinite series.
def fibonacci( ):
    a = 0
    b = 1
    while True:                   # keep going...
        yield a                   # report value, a, during this pass
        future = a + b
        a = b                     # this will be next value reported
        b = future                # and subsequently this
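Because the series is infinite, a caller must decide how many values to consume. One option, sketched here, is the islice function of the standard itertools module, which is our choice for the example rather than something the discussion above prescribes.

```python
from itertools import islice

def fibonacci():
    a = 0
    b = 1
    while True:                   # keep going...
        yield a                   # report value, a, during this pass
        future = a + b
        a = b                     # this will be next value reported
        b = future                # and subsequently this

first_eight = list(islice(fibonacci(), 8))   # take only the first 8 values
print(first_eight)                # [0, 1, 1, 2, 3, 5, 8, 13]
```

Thanks to lazy evaluation, only the eight requested values are ever computed.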
The generator syntax is particularly attractive when results do not need to be stored
in memory. For example, to compute the sum of the first n squares, the generator
syntax, total = sum(k * k for k in range(1, n+1)), is preferred to the use of an
explicitly instantiated list comprehension as the parameter.
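The two forms give the same result; the difference lies only in whether an intermediate list is built. The value n = 10 below is an assumption for illustration.

```python
n = 10                                                # hypothetical value
total_lazy = sum(k * k for k in range(1, n + 1))      # generator comprehension
total_list = sum([k * k for k in range(1, n + 1)])    # builds the full list first
print(total_lazy, total_list)                         # 385 385
```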
Python provides two additional conveniences involving the treatment of tuples and
other sequence types. The first is rather cosmetic. If a series of comma-separated
expressions are given in a larger context, they will be treated as a single tuple, even
if no enclosing parentheses are provided. For example, the assignment
data = 2, 4, 6, 8
results in identifier, data, being assigned to the tuple (2, 4, 6, 8). This behavior
is called automatic packing of a tuple. One common use of packing in Python is
when returning multiple values from a function. If the body of a function executes
the command,
return x, y
it will be formally returning a single object that is the tuple (x, y).
As a dual to the packing behavior, Python can automatically unpack a se-
quence, allowing one to assign a series of individual identifiers to the elements
of a sequence. As an example, we can write
a, b, c, d = range(7, 11)
which has the effect of assigning a=7, b=8, c=9, and d=10, as those are the four
values in the sequence returned by the call to range. For this syntax, the right-hand
side expression can be any iterable type, as long as the number of variables on the
left-hand side is the same as the number of elements in the iteration.
This technique can be used to unpack tuples returned by a function. For exam-
ple, the built-in function, divmod(a, b), returns the pair of values (a // b, a % b)
associated with an integer division. Although the caller can consider the return
value to be a single tuple, it is possible to write
quotient, remainder = divmod(a, b)
to separately identify the two entries of the returned tuple. This syntax can also be
used in the context of a for loop, when iterating over a sequence of iterables, as in
for x, y in [ (7, 2), (5, 8), (6, 4) ]:
In this example, there will be three iterations of the loop. During the first pass, x=7
and y=2, and so on. This style of loop is quite commonly used to iterate through
key-value pairs that are returned by the items( ) method of the dict class, as in:
for k, v in mapping.items( ):
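The packing and unpacking behaviors described in this section can be gathered into one runnable sketch. The operands 17 and 5, the dictionary mapping, and the lists sums and pairs are assumptions made for the example.

```python
data = 2, 4, 6, 8                 # automatic packing into a tuple
a, b, c, d = range(7, 11)         # unpacking any iterable of matching length
quotient, remainder = divmod(17, 5)

sums = []
for x, y in [(7, 2), (5, 8), (6, 4)]:     # unpack each pair during iteration
    sums.append(x + y)

mapping = {'one': 1, 'two': 2}    # hypothetical dictionary
pairs = []
for k, v in mapping.items():      # unpack each key-value pair
    pairs.append((k, v))

print(data, (a, b, c, d), quotient, remainder, sums)
```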
1.9. Additional Python Conveniences 45
Simultaneous Assignments
When computing a sum with the syntax x + y in Python, the names x and y must
have been previously associated with objects that serve as values; a NameError
will be raised if no such definitions are found. The process of determining the
value associated with an identifier is known as name resolution.
Whenever an identifier is assigned to a value, that definition is made with a
specific scope. Top-level assignments are typically made in what is known as global
scope. Assignments made within the body of a function typically have scope that is
local to that function call. Therefore, an assignment, x = 5, within a function has
no effect on the identifier, x, in the broader scope.
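A minimal sketch of this rule follows; the function name assign_locally is an assumption.

```python
x = 1                             # assignment in global scope

def assign_locally():
    x = 5                         # creates a separate x, local to this call
    return x

inner = assign_locally()
print(x, inner)                   # 1 5: the global x is unaffected
```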
Each distinct scope in Python is represented using an abstraction known as a
namespace. A namespace manages all identifiers that are currently defined in a
given scope. Figure 1.8 portrays two namespaces, one being that of a caller to our
count function from Section 1.5, and the other being the local namespace during
the execution of that function.
Figure 1.8: A portrayal of the two namespaces associated with a user’s call
count(grades, 'A'), as defined in Section 1.5. The left namespace is the caller’s
and the right namespace represents the local scope of the function.
Python implements a namespace with its own dictionary that maps each identifying
string (e.g., 'n') to its associated value. Python provides several ways to
examine a given namespace. The function, dir, reports the names of the identifiers
in a given namespace (i.e., the keys of the dictionary), while the function, vars,
returns the full dictionary. By default, calls to dir( ) and vars( ) report on the most
locally enclosing namespace in which they are executed.
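As a small experiment, the following sketch calls dir and vars without arguments from inside a function, so that they report on that call's local namespace. The function show_namespace and its parameters are hypothetical, loosely mirroring the count example.

```python
def show_namespace(data, target):
    n = 0
    # with no arguments, dir() and vars() inspect the local scope of this call
    return sorted(dir()), vars()

names, namespace = show_namespace(['A-', 'B+', 'A-'], 'A')
print(names)                      # ['data', 'n', 'target']
print(namespace['n'], namespace['target'])
```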
1.10. Scopes and Namespaces 47
When an identifier is indicated in a command, Python searches a series of
namespaces in the process of name resolution. First, the most locally enclosing
scope is searched for a given name. If not found there, the next outer scope is
searched, and so on. We will continue our examination of namespaces, in Sec-
tion 2.5, when discussing Python’s treatment of object-orientation. We will see
that each object has its own namespace to store its attributes, and that classes each
have a namespace as well.
First-Class Objects
Existing Modules
Module Name    Description
array          Provides compact array storage for primitive types.
collections    Defines additional data structures and abstract base classes
               involving collections of objects.
copy           Defines general functions for making copies of objects.
heapq          Provides heap-based priority queue functions (see Section 9.3.7).
math           Defines common mathematical constants and functions.
os             Provides support for interactions with the operating system.
random         Provides random number generation.
re             Provides support for processing regular expressions.
sys            Provides additional level of interaction with the Python interpreter.
time           Provides support for measuring time, or delaying a program.
Table 1.7: Some existing Python modules relevant to data structures and algorithms.
Syntax                          Description
seed(hashable)                  Initializes the pseudo-random number generator
                                based upon the hash value of the parameter.
random( )                       Returns a pseudo-random floating-point
                                value in the interval [0.0, 1.0).
randint(a,b)                    Returns a pseudo-random integer
                                in the closed interval [a, b].
randrange(start, stop, step)    Returns a pseudo-random integer in the standard
                                Python range indicated by the parameters.
choice(seq)                     Returns an element of the given sequence
                                chosen pseudo-randomly.
shuffle(seq)                    Reorders the elements of the given
                                sequence pseudo-randomly.
Table 1.8: Methods supported by instances of the Random class, and as top-level
functions of the random module.
1.12 Exercises
For help with exercises, please visit the site, www.wiley.com/college/goodrich.
Reinforcement
R-1.1 Write a short Python function, is_multiple(n, m), that takes two integer
values and returns True if n is a multiple of m, that is, n = mi for some
integer i, and False otherwise.
R-1.2 Write a short Python function, is_even(k), that takes an integer value and
returns True if k is even, and False otherwise. However, your function
cannot use the multiplication, modulo, or division operators.
R-1.3 Write a short Python function, minmax(data), that takes a sequence of
one or more numbers, and returns the smallest and largest numbers, in the
form of a tuple of length two. Do not use the built-in functions min or
max in implementing your solution.
R-1.4 Write a short Python function that takes a positive integer n and returns
the sum of the squares of all the positive integers smaller than n.
R-1.5 Give a single command that computes the sum from Exercise R-1.4, rely-
ing on Python’s comprehension syntax and the built-in sum function.
R-1.6 Write a short Python function that takes a positive integer n and returns
the sum of the squares of all the odd positive integers smaller than n.
R-1.7 Give a single command that computes the sum from Exercise R-1.6, rely-
ing on Python’s comprehension syntax and the built-in sum function.
R-1.8 Python allows negative integers to be used as indices into a sequence,
such as a string. If string s has length n, and expression s[k] is used for in-
dex −n ≤ k < 0, what is the equivalent index j ≥ 0 such that s[j] references
the same element?
R-1.9 What parameters should be sent to the range constructor, to produce a
range with values 50, 60, 70, 80?
R-1.10 What parameters should be sent to the range constructor, to produce a
range with values 8, 6, 4, 2, 0, −2, −4, −6, −8?
R-1.11 Demonstrate how to use Python’s list comprehension syntax to produce
the list [1, 2, 4, 8, 16, 32, 64, 128, 256].
R-1.12 Python’s random module includes a function choice(data) that returns a
random element from a non-empty sequence. The random module in-
cludes a more basic function randrange, with parameterization similar to
the built-in range function, that returns a random choice from the given
range. Using only the randrange function, implement your own version
of the choice function.
Creativity
C-1.13 Write a pseudo-code description of a function that reverses a list of n
integers, so that the numbers are listed in the opposite order than they
were before, and compare this method to an equivalent Python function
for doing the same thing.
C-1.14 Write a short Python function that takes a sequence of integer values and
determines if there is a distinct pair of numbers in the sequence whose
product is odd.
C-1.15 Write a Python function that takes a sequence of numbers and determines
if all the numbers are different from each other (that is, they are distinct).
C-1.16 In our implementation of the scale function (page 25), the body of the loop
executes the command data[j] *= factor. We have discussed that numeric
types are immutable, and that use of the *= operator in this context causes
the creation of a new instance (not the mutation of an existing instance).
How is it still possible, then, that our implementation of scale changes the
actual parameter sent by the caller?
C-1.17 Had we implemented the scale function (page 25) as follows, does it work
properly?
def scale(data, factor):
    for val in data:
        val *= factor
Explain why or why not.
C-1.18 Demonstrate how to use Python’s list comprehension syntax to produce
the list [0, 2, 6, 12, 20, 30, 42, 56, 72, 90].
C-1.19 Demonstrate how to use Python’s list comprehension syntax to produce
the list ['a', 'b', 'c', ..., 'z'], but without having to type all 26 such
characters literally.
C-1.20 Python’s random module includes a function shuffle(data) that accepts a
list of elements and randomly reorders the elements so that each possi-
ble order occurs with equal probability. The random module includes a
more basic function randint(a, b) that returns a uniformly random integer
from a to b (including both endpoints). Using only the randint function,
implement your own version of the shuffle function.
C-1.21 Write a Python program that repeatedly reads lines from standard input
until an EOFError is raised, and then outputs those lines in reverse order
(a user can indicate end of input by typing ctrl-D).
C-1.22 Write a short Python program that takes two arrays a and b of length n
storing int values, and returns the dot product of a and b. That is, it returns
an array c of length n such that c[i] = a[i] · b[i], for i = 0, . . . , n − 1.
C-1.23 Give an example of a Python code fragment that attempts to write an ele-
ment to a list based on an index that may be out of bounds. If that index
is out of bounds, the program should catch the exception that results, and
print the following error message:
“Don’t try buffer overflow attacks in Python!”
C-1.24 Write a short Python function that counts the number of vowels in a given
character string.
C-1.25 Write a short Python function that takes a string s, representing a sentence,
and returns a copy of the string with all punctuation removed. For example,
if given the string "Let's try, Mike.", this function would return
"Lets try Mike".
C-1.26 Write a short program that takes as input three integers, a, b, and c, from
the console and determines if they can be used in a correct arithmetic
formula (in the given order), like “a + b = c,” “a = b − c,” or “a ∗ b = c.”
tation of a function named norm such that norm(v, p) returns the p-norm
value of v and norm(v) returns the Euclidean norm of v. You may assume
that v is a list of numbers.
Projects
P-1.29 Write a Python program that outputs all possible strings formed by using
the characters 'c', 'a', 't', 'd', 'o', and 'g' exactly once.
P-1.30 Write a Python program that can take a positive integer greater than 2 as
input and write out the number of times one must repeatedly divide this
number by 2 before getting a value less than 2.
P-1.31 Write a Python program that can “make change.” Your program should
take two numbers as input, one that is a monetary amount charged and the
other that is a monetary amount given. It should then return the number
of each kind of bill and coin to give back as change for the difference
between the amount given and the amount charged. The values assigned
to the bills and coins can be based on the monetary system of any current
or former government. Try to design your program so that it returns as
few bills and coins as possible.
P-1.32 Write a Python program that can simulate a simple calculator, using the
console as the exclusive input and output device. That is, each input to the
calculator, be it a number, like 12.34 or 1034, or an operator, like + or =,
can be done on a separate line. After each such input, you should output
to the Python console what would be displayed on your calculator.
P-1.33 Write a Python program that simulates a handheld calculator. Your pro-
gram should process input from the Python console representing buttons
that are “pushed,” and then output the contents of the screen after each op-
eration is performed. Minimally, your calculator should be able to process
the basic arithmetic operations and a reset/clear operation.
P-1.34 A common punishment for school children is to write out a sentence mul-
tiple times. Write a Python stand-alone program that will write out the
following sentence one hundred times: “I will never spam my friends
again.” Your program should number each of the sentences and it should
make eight different random-looking typos.
P-1.35 The birthday paradox says that the probability that two people in a room
will have the same birthday is more than half, provided n, the number of
people in the room, is more than 23. This property is not really a paradox,
but many people find it surprising. Design a Python program that can test
this paradox by a series of experiments on randomly generated birthdays,
which test this paradox for n = 5, 10, 15, 20, . . . , 100.
P-1.36 Write a Python program that inputs a list of words, separated by white-
space, and outputs how many times each word appears in the list. You
need not worry about efficiency at this point, however, as this topic is
something that will be addressed later in this book.
Chapter Notes
The official Python Web site (http://www.python.org) has a wealth of information,
including a tutorial and full documentation of the built-in functions, classes, and standard
modules. The Python interpreter is itself a useful reference, as the interactive command
help(foo) provides documentation for any function, class, or module that foo identifies.
Books providing an introduction to programming in Python include titles authored by
Campbell et al. [22], Cedar [25], Dawson [32], Goldwasser and Letscher [43], Lutz [72],
Perkovic [82], and Zelle [105]. More complete reference books on Python include titles by
Beazley [12] and Summerfield [91].
Chapter 2 Object-Oriented Programming
Contents
2.1 Goals, Principles, and Patterns
2.1.1 Object-Oriented Design Goals
2.1.2 Object-Oriented Design Principles
2.1.3 Design Patterns
2.2 Software Development
2.2.1 Design
2.2.2 Pseudo-Code
2.2.3 Coding Style and Documentation
2.2.4 Testing and Debugging
2.3 Class Definitions
2.3.1 Example: CreditCard Class
2.3.2 Operator Overloading and Python’s Special Methods
2.3.3 Example: Multidimensional Vector Class
2.3.4 Iterators
2.3.5 Example: Range Class
2.4 Inheritance
2.4.1 Extending the CreditCard Class
2.4.2 Hierarchy of Numeric Progressions
2.4.3 Abstract Base Classes
2.5 Namespaces and Object-Orientation
2.5.1 Instance and Class Namespaces
2.5.2 Name Resolution and Dynamic Dispatch
2.6 Shallow and Deep Copying
2.7 Exercises
2.1 Goals, Principles, and Patterns
Robustness
Every good programmer wants to develop software that is correct, which means that
a program produces the right output for all the anticipated inputs in the program’s
application. In addition, we want software to be robust, that is, capable of handling
unexpected inputs that are not explicitly defined for its application. For example,
if a program is expecting a positive integer (perhaps representing the price of an
item) and instead is given a negative integer, then the program should be able to
recover gracefully from this error. More importantly, in life-critical applications,
where a software error can lead to injury or loss of life, software that is not robust
could be deadly. This point was driven home in the late 1980s in accidents involv-
ing Therac-25, a radiation-therapy machine, which severely overdosed six patients
between 1985 and 1987, some of whom died from complications resulting from
their radiation overdose. All six accidents were traced to software errors.
Adaptability
Modern software applications, such as Web browsers and Internet search engines,
typically involve large programs that are used for many years. Software, there-
fore, needs to be able to evolve over time in response to changing conditions in its
environment. Thus, another important goal of quality software is that it achieves
adaptability (also called evolvability). Related to this concept is portability, which
is the ability of software to run with minimal change on different hardware and
operating system platforms. An advantage of writing software in Python is the
portability provided by the language itself.
Reusability
Going hand in hand with adaptability is the desire that software be reusable, that
is, the same code should be usable as a component of different systems in various
applications. Developing quality software can be an expensive enterprise, and its
cost can be offset somewhat if the software is designed in a way that makes it easily
reusable in future applications. Such reuse should be done with care, however, for
one of the major sources of software errors in the Therac-25 came from inappropri-
ate reuse of Therac-20 software (which was not object-oriented and not designed
for the hardware platform used with the Therac-25).
Abstraction
The notion of abstraction is to distill a complicated system down to its most funda-
mental parts. Typically, describing the parts of a system involves naming them and
explaining their functionality. Applying the abstraction paradigm to the design of
data structures gives rise to abstract data types (ADTs). An ADT is a mathematical
model of a data structure that specifies the type of data stored, the operations sup-
ported on them, and the types of parameters of the operations. An ADT specifies
what each operation does, but not how it does it. We will typically refer to the
collective set of behaviors supported by an ADT as its public interface.
As a programming language, Python provides a great deal of latitude in regard
to the specification of an interface. Python has a tradition of treating abstractions
implicitly using a mechanism known as duck typing. Because Python is an interpreted
and dynamically typed language, there is no “compile time” checking of data types,
and no formal requirement for declarations of abstract base classes. Instead,
programmers assume that an object supports a set of known behaviors, with
the interpreter raising a run-time error if those assumptions fail. The description
of this as “duck typing” comes from an adage attributed to poet James Whitcomb
Riley, stating that “when I see a bird that walks like a duck and swims like a duck
and quacks like a duck, I call that bird a duck.”
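As a small illustration of duck typing, consider the following sketch. The classes Duck and Robot and the function describe are hypothetical names invented for this example; they are not part of the book's source code. The two classes share no base class, yet describe works with either because it only assumes that its argument supports a quack method. When that assumption fails, the interpreter raises an AttributeError at run time.

```python
class Duck:
    """A hypothetical class that happens to support quack()."""

    def quack(self):
        return 'Quack!'


class Robot:
    """An unrelated hypothetical class that also supports quack()."""

    def quack(self):
        return 'Beep (simulated quack)'


def describe(bird):
    """Call quack() on the argument without checking its type."""
    return bird.quack()


print(describe(Duck()))
print(describe(Robot()))

try:
    describe(42)                      # an int has no quack method
except AttributeError as e:
    print('run-time error:', e)
```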
More formally, Python supports abstract data types using a mechanism known
as an abstract base class (ABC). An abstract base class cannot be instantiated
(i.e., you cannot directly create an instance of that class), but it defines one or more
common methods that all implementations of the abstraction must have. An ABC
is realized by one or more concrete classes that inherit from the abstract base class
while providing implementations for those methods declared by the ABC. Python’s
abc module provides formal support for ABCs, although we omit such declarations
for simplicity. We will make use of several existing abstract base classes coming
from Python’s collections module, which includes definitions for several common
data structure ADTs (located in the collections.abc submodule in current versions
of Python), and concrete implementations of some of those abstractions.
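For readers curious about the formal declarations that we omit, the following is a minimal sketch using the abc module. The names Shape and Square are hypothetical and chosen only for illustration. The abstract class declares an area method without implementing it, the concrete class supplies the implementation, and an attempt to instantiate the abstract class directly results in a TypeError.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Hypothetical abstract base class: declares area() without implementing it."""

    @abstractmethod
    def area(self):
        """Return the area of the shape."""

    def describe(self):
        """Concrete method inherited by all implementations."""
        return '{0} with area {1}'.format(type(self).__name__, self.area())


class Square(Shape):
    """Concrete class that realizes the Shape abstraction."""

    def __init__(self, side):
        self._side = side

    def area(self):
        return self._side * self._side


print(Square(3).describe())

try:
    Shape()                           # an abstract class cannot be instantiated
except TypeError as e:
    print('cannot instantiate:', e)
```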
Encapsulation
Another important principle of object-oriented design is encapsulation. Different
components of a software system should not reveal the internal details of their
respective implementations. One of the main advantages of encapsulation is that it
gives one programmer freedom to implement the details of a component, without
concern that other programmers will be writing code that intricately depends on
those internal decisions. The only constraint on the programmer of a component
is to maintain the public interface for the component, as other programmers will
be writing code that depends on that interface. Encapsulation yields robustness
and adaptability, for it allows the implementation details of parts of a program to
change without adversely affecting other parts, thereby making it easier to fix bugs
or add new functionality with relatively local changes to a component.
Throughout this book, we will adhere to the principle of encapsulation, making
clear which aspects of a data structure are assumed to be public and which are
assumed to be internal details. With that said, Python provides only loose support
for encapsulation. By convention, names of members of a class (both data members
and member functions) that start with a single underscore character (e.g., _secret)
are assumed to be nonpublic and should not be relied upon. Those conventions
are reinforced by the intentional omission of those members from automatically
generated documentation.
2.2.1 Design
For object-oriented programming, the design step is perhaps the most important
phase in the process of developing software, for it is in the design step that we
decide how to divide the workings of our program into classes, how
these classes will interact, what data each will store, and what actions each will
perform. Indeed, one of the main challenges that beginning programmers face is
deciding what classes to define to do the work of their program. While general
prescriptions are hard to come by, there are some rules of thumb that we can apply
when determining how to design our classes:
• Responsibilities: Divide the work into different actors, each with a different
responsibility. Try to describe responsibilities using action verbs. These
actors will form the classes for the program.
• Independence: Define the work for each class to be as independent from
other classes as possible. Subdivide responsibilities between classes so that
each class has autonomy over some aspect of the program. Give data (as in-
stance variables) to the class that has jurisdiction over the actions that require
access to this data.
• Behaviors: Define the behaviors for each class carefully and precisely, so
that the consequences of each action performed by a class will be well understood
by other classes that interact with it. These behaviors will define
the methods that this class performs, and the set of behaviors for a class forms
the interface to the class, as these are the means for other pieces of code to
interact with objects from the class.
Defining the classes, together with their instance variables and methods, is key
to the design of an object-oriented program. A good programmer will naturally
develop greater skill in performing these tasks over time, as experience teaches
him or her to notice patterns in the requirements of a program that match patterns
that he or she has seen before.
A common tool for developing an initial high-level design for a project is the
use of CRC cards. Class-Responsibility-Collaborator (CRC) cards are simple in-
dex cards that subdivide the work required of a program. The main idea behind this
tool is to have each card represent a component, which will ultimately become a
class in the program. We write the name of each component on the top of an index
card. On the left-hand side of the card, we begin writing the responsibilities for
this component. On the right-hand side, we list the collaborators for this compo-
nent, that is, the other components that this component will have to interact with to
perform its duties.
The design process iterates through an action/actor cycle, where we first iden-
tify an action (that is, a responsibility), and we then determine an actor (that is, a
component) that is best suited to perform that action. The design is complete when
we have assigned all actions to actors. In using index cards for this process (rather
than larger pieces of paper), we are relying on the fact that each component should
have a small set of responsibilities and collaborators. Enforcing this rule helps keep
the individual classes manageable.
As the design takes form, a standard approach to explain and document the
design is the use of UML (Unified Modeling Language) diagrams to express the
organization of a program. UML diagrams are a standard visual notation to express
object-oriented software designs. Several computer-aided tools are available to
build UML diagrams. One type of UML figure is known as a class diagram. An
example of such a diagram is given in Figure 2.3, for a class that represents a
consumer credit card. The diagram has three portions, with the first designating
the name of the class, the second designating the recommended instance variables,
and the third designating the recommended methods of the class. In Section 2.2.3,
we discuss our naming conventions, and in Section 2.3.1, we provide a complete
implementation of a Python CreditCard class based on this design.
Class:      CreditCard
Fields:     _customer         _balance
            _bank             _limit
            _account
Behaviors:  get_customer()    get_balance()
            get_bank()        get_limit()
            get_account()     charge(price)
            make_payment(amount)
2.2.2 Pseudo-Code
As an intermediate step before the implementation of a design, programmers are
often asked to describe algorithms in a way that is intended for human eyes only.
Such descriptions are called pseudo-code. Pseudo-code is not a computer program,
but is more structured than usual prose. It is a mixture of natural language and
high-level programming constructs that describe the main ideas behind a generic
implementation of a data structure or algorithm. Because pseudo-code is designed
for a human reader, not a computer, we can communicate high-level ideas, without
being burdened with low-level implementation details. At the same time, we should
not gloss over important steps. Like many forms of human communication, finding
the right balance is an important skill that is refined through practice.
In this book, we rely on a pseudo-code style that we hope will be evident to
Python programmers, yet with a mix of mathematical notations and English prose.
For example, we might use the phrase “indicate an error” rather than a formal raise
statement. Following conventions of Python, we rely on indentation to indicate
the extent of control structures and on an indexing notation in which entries of a
sequence A with length n are indexed from A[0] to A[n − 1]. However, we choose
to enclose comments within curly braces { like these } in our pseudo-code, rather
than using Python’s # character.
http://www.python.org/dev/peps/pep-0008/
◦ Classes (other than Python’s built-in classes) should have a name that
serves as a singular noun, and should be capitalized (e.g., Date rather
than date or Dates). When multiple words are concatenated to form a
class name, they should follow the so-called “CamelCase” convention
in which the first letter of each word is capitalized (e.g., CreditCard).
Multiline block comments are good for explaining more complex code sections.
In Python, these are technically multiline string literals, typically delimited
with triple quotes ("""), which have no effect when executed. In the
next section, we discuss the use of block comments for documentation.
Documentation
http://www.python.org/dev/peps/pep-0257/
In this book, we will try to present docstrings when space allows. Omitted
docstrings can be found in the online version of our source code.
Testing
A careful testing plan is an essential part of writing a program. While verifying the
correctness of a program over all possible inputs is usually infeasible, we should
aim at executing the program on a representative subset of inputs. At the very
minimum, we should make sure that every method of a class is tested at least once
(method coverage). Even better, each code statement in the program should be
executed at least once (statement coverage).
Programs often tend to fail on special cases of the input. Such cases need to be
carefully identified and tested. For example, when testing a method that sorts (that
is, puts in order) a sequence of integers, we should consider the following inputs:
• The sequence has zero length (no elements).
• The sequence has one element.
• All the elements of the sequence are the same.
• The sequence is already sorted.
• The sequence is reverse sorted.
In addition to special inputs to the program, we should also consider special
conditions for the structures used by the program. For example, if we use a Python
list to store data, we should make sure that boundary cases, such as inserting or
removing at the beginning or end of the list, are properly handled.
While it is essential to use handcrafted test suites, it is also advantageous to
run the program on a large collection of randomly generated inputs. The random
module in Python provides several means for generating random numbers, or for
randomizing the order of collections.
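As a minimal sketch of these ideas, assume that a hypothetical function insertion_sort is the code under test (it is not part of this chapter's source code). The fragment first checks the special cases listed above, and then compares the function against Python's built-in sorted function on randomly generated inputs produced with the random module.

```python
import random


def insertion_sort(data):
    """Hypothetical function under test: return a new sorted list."""
    result = list(data)
    for i in range(1, len(result)):
        cur = result[i]
        j = i
        while j > 0 and result[j - 1] > cur:
            result[j] = result[j - 1]     # shift larger element rightward
            j -= 1
        result[j] = cur
    return result


# handcrafted special cases
special_cases = [
    [],                  # zero length
    [7],                 # one element
    [4, 4, 4, 4],        # all elements the same
    [1, 2, 3, 4, 5],     # already sorted
    [5, 4, 3, 2, 1],     # reverse sorted
]
for case in special_cases:
    assert insertion_sort(case) == sorted(case)

# randomly generated inputs, compared against the built-in sorted function
random.seed(0)           # fixed seed so that any failure can be reproduced
for trial in range(100):
    n = random.randrange(0, 20)
    case = [random.randrange(-50, 50) for _ in range(n)]
    assert insertion_sort(case) == sorted(case)

print('all sorting tests passed')
```

Fixing the seed is a design choice for this sketch: it keeps the randomized tests repeatable, at the cost of exploring the same inputs on every run.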
The dependencies among the classes and functions of a program induce a hi-
erarchy. Namely, a component A is above a component B in the hierarchy if A
depends upon B, such as when function A calls function B, or function A relies on
a parameter that is an instance of class B. There are two main testing strategies,
top-down and bottom-up, which differ in the order in which components are tested.
Top-down testing proceeds from the top to the bottom of the program hierarchy.
It is typically used in conjunction with stubbing, a boot-strapping technique that
replaces a lower-level component with a stub, a replacement for the component
that simulates the functionality of the original. For example, if function A calls
function B to get the first line of a file, when testing A we can replace B with a stub
that returns a fixed string.
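The stubbing scenario just described might be sketched as follows. All names here (first_line, first_line_stub, count_words, and the reader parameter) are assumptions made for this illustration. Passing the lower-level component as a parameter is only one of several ways to substitute a stub.

```python
def first_line(filename):
    """Component B: return the first line of the named file."""
    with open(filename) as fp:
        return fp.readline().rstrip('\n')


def first_line_stub(filename):
    """Stub for B: ignore the file system and return a fixed string."""
    return 'title: sample data'


def count_words(filename, reader=first_line):
    """Component A: count the words on the first line of a file.

    The reader parameter is an assumption of this sketch; it lets a
    test substitute the stub for the real component.
    """
    return len(reader(filename).split())


# top-down test of A, with B replaced by its stub (no file is opened)
assert count_words('unused.txt', reader=first_line_stub) == 3
print('count_words passed its test using the stub')
```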
Bottom-up testing proceeds from lower-level components to higher-level com-
ponents. For example, bottom-level functions, which do not invoke other functions,
are tested first, followed by functions that call only bottom-level functions, and so
on. Similarly a class that does not depend upon any other classes can be tested
before another class that depends on the former. This form of testing is usually
described as unit testing, as the functionality of a specific component is tested in
isolation from the larger software project. If used properly, this strategy better isolates
the cause of errors to the component being tested, as lower-level components upon
which it relies should have already been thoroughly tested.
Python provides several forms of support for automated testing. When func-
tions or classes are defined in a module, testing for that module can be embedded
in the same file. The mechanism for doing so was described in Section 1.11. Code
that is shielded in a conditional construct of the form
    if __name__ == '__main__':
        # perform tests...
will be executed when Python is invoked directly on that module, but not when the
module is imported for use in a larger software project. It is common to put tests
in such a construct to test the functionality of the functions and classes specifically
defined in that module.
More robust support for automation of unit testing is provided by Python’s
unittest module. This framework allows the grouping of individual test cases into
larger test suites, and provides support for executing those suites, and reporting or
analyzing the results of those tests. As software is maintained, the act of regression
testing is used, whereby all previous tests are re-executed to ensure that changes to
the software do not introduce new bugs in previously tested components.
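To give a sense of the unittest framework, here is a brief sketch that tests a hypothetical function named average (invented for this example). Individual test cases are written as methods of a class derived from unittest.TestCase, grouped into a suite, and executed by a runner that reports the results. Re-running the same suite after each change to the software is a simple form of regression testing.

```python
import unittest


def average(values):
    """Hypothetical function under test: return the mean of a nonempty sequence."""
    if len(values) == 0:
        raise ValueError('values must be nonempty')
    return sum(values) / len(values)


class TestAverage(unittest.TestCase):
    """Each method whose name begins with test_ is an individual test case."""

    def test_typical(self):
        self.assertEqual(average([1, 2, 3]), 2)

    def test_single(self):
        self.assertEqual(average([5]), 5)

    def test_empty(self):
        with self.assertRaises(ValueError):
            average([])


# build a suite from the test case class and execute it
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAverage)
result = unittest.TextTestRunner().run(suite)
```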
Debugging
The simplest debugging technique consists of using print statements to track the
values of variables during the execution of the program. A problem with this ap-
proach is that eventually the print statements need to be removed or commented
out, so they are not executed when the software is finally released.
A better approach is to run the program within a debugger, which is a special-
ized environment for controlling and monitoring the execution of a program. The
basic functionality provided by a debugger is the insertion of breakpoints within
the code. When the program is executed within the debugger, it stops at each
breakpoint. While the program is stopped, the current value of variables can be
inspected.
The standard Python distribution includes a module named pdb, which provides
debugging support directly within the interpreter. Most IDEs for Python, such as
IDLE, provide debugging environments with graphical user interfaces.
2.3 Class Definitions
class CreditCard:
    """A consumer credit card."""

    def __init__(self, customer, bank, acnt, limit):
        """Create a new credit card instance.

        The initial balance is zero.

        customer  the name of the customer (e.g., 'John Bowman')
        bank      the name of the bank (e.g., 'California Savings')
        acnt      the account identifier (e.g., '5391 0375 9387 5309')
        limit     credit limit (measured in dollars)
        """
        self._customer = customer
        self._bank = bank
        self._account = acnt
        self._limit = limit
        self._balance = 0

    def get_customer(self):
        """Return name of the customer."""
        return self._customer

    def get_bank(self):
        """Return the bank's name."""
        return self._bank

    def get_account(self):
        """Return the card identifying number (typically stored as a string)."""
        return self._account

    def get_limit(self):
        """Return current credit limit."""
        return self._limit

    def get_balance(self):
        """Return current balance."""
        return self._balance
Code Fragment 2.1: The beginning of the CreditCard class definition (continued in
Code Fragment 2.2).
    def charge(self, price):
        """Charge given price to the card, assuming sufficient credit limit.

        Return True if charge was processed; False if charge was denied.
        """
        if price + self._balance > self._limit:   # if charge would exceed limit,
            return False                          # cannot accept charge
        else:
            self._balance += price
            return True

    def make_payment(self, amount):
        """Process customer payment that reduces balance."""
        self._balance -= amount
Code Fragment 2.2: The conclusion of the CreditCard class definition (continued
from Code Fragment 2.1). These methods are indented within the class definition.
The Constructor
A user can create an instance of the CreditCard class using syntax such as:
    cc = CreditCard('John Doe', '1st Bank', '5391 0375 9387 5309', 1000)
Internally, this results in a call to the specially named __init__ method that serves
as the constructor of the class. Its primary responsibility is to establish the state of
a newly created credit card object with appropriate instance variables. In the case
of the CreditCard class, each object maintains five instance variables, which we
name: _customer, _bank, _account, _limit, and _balance. The initial values for the
first four of those five are provided as explicit parameters that are sent by the user
when instantiating the credit card, and assigned within the body of the constructor.
For example, the command, self._customer = customer, assigns the value of the
parameter customer to the instance variable self._customer; note that because customer is
unqualified on the right-hand side, it refers to the parameter in the local namespace.
Encapsulation
By the conventions described in Section 2.2.3, a single leading underscore in the
name of a data member, such as _balance, implies that it is intended as nonpublic.
Users of a class should not directly access such members.
As a general rule, we will treat all data members as nonpublic. This allows
us to better enforce a consistent state for all instances. We can provide accessors,
such as get_balance, to provide a user of our class read-only access to a trait. If
we wish to allow the user to change the state, we can provide appropriate update
methods. In the context of data structures, encapsulating the internal representation
allows us greater flexibility to redesign the way a class works, perhaps to improve
the efficiency of the structure.
Additional Methods
The most interesting behaviors in our class are charge and make_payment. The
charge function typically adds the given price to the credit card balance, to reflect
a purchase of said price by the customer. However, before accepting the charge,
our implementation verifies that the new purchase would not cause the balance to
exceed the credit limit. The make_payment function reflects the customer sending
payment to the bank for the given amount, thereby reducing the balance on the
card. We note that in the command, self._balance -= amount, the expression
self._balance is qualified with the self identifier because it represents an instance
variable of the card, while the unqualified amount represents the local parameter.
Error Checking
Our implementation of the CreditCard class is not particularly robust. First, we
note that we did not explicitly check the types of the parameters to charge and
make_payment, nor any of the parameters to the constructor. If a user were to make
a call such as visa.charge('candy'), our code would presumably crash when attempting
to add that parameter to the current balance. If this class were to be widely
used in a library, we might use more rigorous techniques to raise a TypeError when
facing such misuse (see Section 1.7).
Beyond the obvious type errors, our implementation may be susceptible to log-
ical errors. For example, if a user were allowed to charge a negative price, such
as visa.charge(-300), that would serve to lower the customer’s balance. This provides
a loophole for lowering a balance without making a payment. Of course,
this might be considered valid usage if modeling the credit received when a cus-
tomer returns merchandise to a store. We will explore some such issues with the
CreditCard class in the end-of-chapter exercises.
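As one possible way to guard against such misuse, the following sketch adds explicit checks to a charge method. The class CheckedCard is a hypothetical, pared-down stand-in for CreditCard, defined here only so that the example is self-contained. Rejecting negative prices is a policy assumed for this sketch, not a requirement; a design that models merchandise returns might allow them.

```python
class CheckedCard:
    """Hypothetical, pared-down card that validates the argument to charge."""

    def __init__(self, limit):
        self._limit = limit
        self._balance = 0

    def get_balance(self):
        """Return current balance."""
        return self._balance

    def charge(self, price):
        """Charge given price to the card, assuming sufficient credit limit.

        Raise TypeError if price is not a number; ValueError if it is negative.
        Return True if charge was processed; False if charge was denied.
        """
        if not isinstance(price, (int, float)):
            raise TypeError('price must be numeric')
        if price < 0:
            raise ValueError('price cannot be negative')
        if price + self._balance > self._limit:
            return False
        self._balance += price
        return True


visa = CheckedCard(1000)
print(visa.charge(200))               # a valid charge is accepted

try:
    visa.charge('candy')
except TypeError as e:
    print('TypeError:', e)

try:
    visa.charge(-300)
except ValueError as e:
    print('ValueError:', e)

print(visa.get_balance())             # rejected calls leave the balance unchanged
```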
Testing the Class
In Code Fragment 2.3, we demonstrate some basic usage of the CreditCard class,
inserting three cards into a list named wallet. We use loops to make some charges
and payments, and use various accessors to print results to the console.
These tests are enclosed within a conditional, if __name__ == '__main__':,
so that they can be embedded in the source code with the class definition. Using
the terminology of Section 2.2.4, these tests provide method coverage, as each of
the methods is called at least once, but they do not provide statement coverage, as
there is never a case in which a charge is rejected due to the credit limit. This
is not a particularly advanced form of testing, as the output of the given tests must
be manually audited in order to determine whether the class behaved as expected.
Python has tools for more formal testing (see discussion of the unittest module
in Section 2.2.4), so that resulting values can be automatically compared to the
predicted outcomes, with output generated only when an error is detected.
if __name__ == '__main__':
    wallet = []
    wallet.append(CreditCard('John Bowman', 'California Savings',
                             '5391 0375 9387 5309', 2500))
    wallet.append(CreditCard('John Bowman', 'California Federal',
                             '3485 0399 3395 1954', 3500))
    wallet.append(CreditCard('John Bowman', 'California Finance',
                             '5391 0375 9387 5309', 5000))

    for val in range(1, 17):
        wallet[0].charge(val)
        wallet[1].charge(2*val)
        wallet[2].charge(3*val)

    for c in range(3):
        print('Customer =', wallet[c].get_customer())
        print('Bank =', wallet[c].get_bank())
        print('Account =', wallet[c].get_account())
        print('Limit =', wallet[c].get_limit())
        print('Balance =', wallet[c].get_balance())
        while wallet[c].get_balance() > 100:
            wallet[c].make_payment(100)
            print('New balance =', wallet[c].get_balance())
        print()
Code Fragment 2.3: Testing the CreditCard class.
Non-Operator Overloads
In addition to traditional operator overloading, Python relies on specially named
methods to control the behavior of various other functionality, when applied to
user-defined classes. For example, the syntax, str(foo), is formally a call to the
constructor for the string class. Of course, if the parameter is an instance of a user-
defined class, the original authors of the string class could not have known how
that instance should be portrayed. So the string constructor calls a specially named
method, foo.__str__(), that must return an appropriate string representation.
Similar special methods are used to determine how to construct an int, float, or
bool based on a parameter from a user-defined class. The conversion to a Boolean
value is particularly important, because the syntax, if foo:, can be used even when
foo is not formally a Boolean value (see Section 1.4.1). For a user-defined class,
that condition is evaluated by the special method foo.__bool__().
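As a brief sketch of these two special methods, consider a hypothetical Account class (invented for this illustration). Its __str__ method determines the result of str(acct), and its __bool__ method determines how an instance is treated in a condition. Treating an account with a zero balance as false is merely a choice made for this example.

```python
class Account:
    """Hypothetical class used only to illustrate __str__ and __bool__."""

    def __init__(self, owner, balance):
        self._owner = owner
        self._balance = balance

    def __str__(self):
        """Return the string produced by str(acct) and shown by print(acct)."""
        return '{0}: {1}'.format(self._owner, self._balance)

    def __bool__(self):
        """Treat an account as true only when it holds a nonzero balance."""
        return self._balance != 0


acct = Account('Pat', 0)
print(str(acct))                      # relies on acct.__str__()
if not acct:                          # relies on acct.__bool__()
    print('account is empty')
```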
Implied Methods
As a general rule, if a particular special method is not implemented in a user-defined
class, the standard syntax that relies upon that method will raise an exception. For
example, evaluating the expression, a + b, for instances of a user-defined class
without __add__ or __radd__ will raise an error.
However, there are some operators that have default definitions provided by
Python, in the absence of special methods, and there are some operators whose
definitions are derived from others. For example, the __bool__ method, which
supports the syntax if foo:, has default semantics so that every object other than
None is evaluated as True. However, for container types, the __len__ method is
typically defined to return the size of the container. If such a method exists, then
the evaluation of bool(foo) is interpreted by default to be True for instances with
nonzero length, and False for instances with zero length, allowing a syntax such as
if waitlist: to be used to test whether there are one or more entries in the waitlist.
In Section 2.3.4, we will discuss Python’s mechanism for providing iterators
for collections via the special method, __iter__. With that said, if a container class
provides implementations for both __len__ and __getitem__, a default iteration is
provided automatically (using means we describe in Section 2.3.4). Furthermore,
once an iterator is defined, default functionality of __contains__ is provided.
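These implied behaviors can be observed with a small sketch. The Waitlist class below is hypothetical and defines only __len__ and __getitem__, yet its instances can be used in a condition, in a for loop, and with the in operator.

```python
class Waitlist:
    """Hypothetical container defining only __len__ and __getitem__."""

    def __init__(self, names):
        self._names = list(names)

    def __len__(self):
        """Return the number of entries."""
        return len(self._names)

    def __getitem__(self, j):
        """Return the jth entry (raises IndexError if j is out of range)."""
        return self._names[j]


waitlist = Waitlist(['Ann', 'Bo'])
if waitlist:                          # no __bool__, so truth is based on __len__
    print('someone is waiting')
for name in waitlist:                 # default iteration relies on __getitem__
    print(name)
print('Bo' in waitlist)               # default containment relies on iteration
print(bool(Waitlist([])))             # an instance with zero length is False
```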
In Section 1.3 we drew attention to the distinction between expression a is b
and expression a == b, with the former evaluating whether identifiers a and b are
aliases for the same object, and the latter testing a notion of whether the two iden-
tifiers reference equivalent values. The notion of “equivalence” depends upon the
context of the class, and semantics is defined with the __eq__ method. However, if
no implementation is given for __eq__, the syntax a == b is legal with semantics
of a is b, that is, an instance is equivalent to itself and no others.
We should caution that some natural implications are not automatically provided
by Python. For example, providing a __lt__ method supports syntax a < b, and
indirectly b > a, but providing both __lt__ and __eq__ does not imply semantics
for a <= b. The relationship between a == b and a != b depends on the language
version: in Python 3, a class that provides __eq__ but not __ne__ evaluates a != b
as the negation of a == b, whereas in Python 2 providing __eq__ does not affect
the evaluation of a != b. (An explicit __ne__ method can be provided in either case,
typically returning not (a == b) as a result.)
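The ordering behavior can be observed directly. The sketch below assumes a hypothetical Version class with only __eq__ and __lt__, and then shows one standard-library option, functools.total_ordering, for deriving the remaining comparisons; that decorator is not used elsewhere in this chapter.

```python
import functools


class Version:
    """Hypothetical class defining only __eq__ and __lt__."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        return self.number == other.number

    def __lt__(self, other):
        return self.number < other.number


a, b = Version(1), Version(2)
less = a < b                  # uses a.__lt__(b)
greater = b > a               # indirectly supported via a.__lt__(b)
try:
    a <= b                    # no __le__ (and no reflected __ge__) exists
    le_supported = True
except TypeError:
    le_supported = False


@functools.total_ordering
class OrderedVersion(Version):
    """Same class, with <=, >, and >= derived from __lt__ and __eq__."""


derived_le = OrderedVersion(1) <= OrderedVersion(2)
```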
class Vector:
    """Represent a vector in a multidimensional space."""

    def __init__(self, d):
        """Create d-dimensional vector of zeros."""
        self._coords = [0] * d

    def __len__(self):
        """Return the dimension of the vector."""
        return len(self._coords)

    def __getitem__(self, j):
        """Return jth coordinate of vector."""
        return self._coords[j]

    def __setitem__(self, j, val):
        """Set jth coordinate of vector to given value."""
        self._coords[j] = val

    def __add__(self, other):
        """Return sum of two vectors."""
        if len(self) != len(other):          # relies on __len__ method
            raise ValueError('dimensions must agree')
        result = Vector(len(self))           # start with vector of zeros
        for j in range(len(self)):
            result[j] = self[j] + other[j]
        return result

    def __eq__(self, other):
        """Return True if vector has same coordinates as other."""
        return self._coords == other._coords

    def __ne__(self, other):
        """Return True if vector differs from other."""
        return not self == other             # rely on existing __eq__ definition

    def __str__(self):
        """Produce string representation of vector."""
        return '<' + str(self._coords)[1:-1] + '>'   # adapt list representation
2.3.4 Iterators
class SequenceIterator:
    """An iterator for any of Python's sequence types."""

    def __init__(self, sequence):
        """Create an iterator for the given sequence."""
        self._seq = sequence      # keep a reference to the underlying data
        self._k = -1              # will increment to 0 on first call to next

    def __next__(self):
        """Return the next element, or else raise StopIteration error."""
        self._k += 1                        # advance to next index
        if self._k < len(self._seq):
            return self._seq[self._k]       # return the data element
        else:
            raise StopIteration()           # there are no more elements

    def __iter__(self):
        """By convention, an iterator must return itself as an iterator."""
        return self
Code Fragment 2.5: An iterator class for any sequence type.
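The protocol that such an iterator class follows is the same one used by Python’s built-in iterators. As a minimal sketch, the loop below drives a built-in list iterator by hand with iter and next, which is roughly what a for loop does behind the scenes; the variable names are illustrative.

```python
data = ['a', 'b', 'c']
it = iter(data)                  # obtain an iterator for the sequence

collected = []
while True:
    try:
        collected.append(next(it))   # invokes the iterator's __next__ method
    except StopIteration:            # signals that no elements remain
        break

returns_itself = iter(it) is it      # an iterator's __iter__ returns itself
```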
class Range:
    """A class that mimics the built-in range class."""

    def __init__(self, start, stop=None, step=1):
        """Initialize a Range instance.

        Semantics is similar to built-in range class.
        """
        if step == 0:
            raise ValueError('step cannot be 0')

        if stop is None:                  # special case of range(n)
            start, stop = 0, start        # should be treated as if range(0,n)

        # calculate the effective length once, as the ceiling of
        # (stop - start) / step; this form is valid for either sign of step
        self._length = max(0, -((start - stop) // step))

        # need knowledge of start and step (but not stop) to support __getitem__
        self._start = start
        self._step = step

    def __len__(self):
        """Return number of entries in the range."""
        return self._length

    def __getitem__(self, k):
        """Return entry at index k (using standard interpretation if negative)."""
        if k < 0:
            k += len(self)                # attempt to convert negative index

        if not 0 <= k < self._length:
            raise IndexError('index out of range')

        return self._start + k * self._step
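The number of entries in such a range is the ceiling of (stop - start) / step, clamped at zero. A minimal sketch that checks this rule against the built-in range for both positive and negative steps; range_length is a hypothetical helper name, not a method of the class above.

```python
def range_length(start, stop, step):
    """Number of values start, start+step, ... that lie strictly before stop."""
    if step == 0:
        raise ValueError('step cannot be 0')
    # ceiling of (stop - start) / step, using only integer floor division
    return max(0, -((start - stop) // step))


cases = [(0, 10, 1), (0, 10, 3), (2, 50, 5),
         (5, 0, -1), (10, 0, -3), (0, 10, -1), (3, 3, 1)]
agrees = all(range_length(*c) == len(range(*c)) for c in cases)
```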
2.4 Inheritance
There are two ways in which a subclass can differentiate itself from its su-
perclass. A subclass may specialize an existing behavior by providing a new im-
plementation that overrides an existing method. A subclass may also extend its
superclass by providing brand new methods.
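Both forms of differentiation appear in the following sketch, which uses hypothetical Greeter and PoliteGreeter classes chosen only for illustration.

```python
class Greeter:
    def greet(self):
        return 'Hello'


class PoliteGreeter(Greeter):
    def greet(self):                      # specializes: overrides an existing method
        return super().greet() + ', pleased to meet you'

    def farewell(self):                   # extends: a brand new method
        return 'Goodbye'


g = PoliteGreeter()
greeting = g.greet()
parting = g.farewell()
```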
Python’s Exception Hierarchy
Another example of a rich inheritance hierarchy is the organization of various ex-
ception types in Python. We introduced many of those classes in Section 1.7, but
did not discuss their relationship with each other. Figure 2.5 illustrates a (small)
portion of that hierarchy. The BaseException class is the root of the entire hierar-
chy, while the more specific Exception class includes most of the error types that
we have discussed. Programmers are welcome to define their own special exception
classes to denote errors that may occur in the context of their application. Those
user-defined exception types should be declared as subclasses of Exception.
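As a minimal sketch of a user-defined exception, the hypothetical InsufficientCreditError below is declared as a subclass of Exception; the class name and the standalone charge function are assumptions made for this example.

```python
class InsufficientCreditError(Exception):
    """Hypothetical application-specific error type."""


def charge(balance, limit, price):
    """Return the new balance, or raise an error if the limit would be exceeded."""
    if balance + price > limit:
        raise InsufficientCreditError('charge would exceed credit limit')
    return balance + price


try:
    charge(900, 1000, 250)
    caught = None
except Exception as e:            # a handler for Exception also catches subclasses
    caught = e
```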
[Figure 2.5: A portion of Python’s hierarchy of exception types, rooted at BaseException.]
[Figure 2.6: Overview of the PredatoryCreditCard class as an extension of CreditCard.]
Class:      CreditCard
Fields:     _customer, _bank, _account, _balance, _limit
Behaviors:  get_customer(), get_bank(), get_account(), get_balance(),
            get_limit(), charge(price), make_payment(amount)

Class:      PredatoryCreditCard
Fields:     _apr
Behaviors:  process_month(), charge(price)
Figure 2.6 provides an overview of our use of inheritance in designing the new
PredatoryCreditCard class, and Code Fragment 2.7 gives a complete Python
implementation of that class.
To indicate that the new class inherits from the existing CreditCard class, our
definition begins with the syntax, class PredatoryCreditCard(CreditCard). The
body of the new class provides three member functions: __init__, charge, and
process_month. The __init__ constructor serves a very similar role to the original
CreditCard constructor, except that for our new class, there is an extra parameter
to specify the annual percentage rate. The body of our new constructor relies upon
making a call to the inherited constructor to perform most of the initialization (in
fact, everything other than the recording of the percentage rate). The mechanism
for calling the inherited constructor relies on the syntax, super(). Specifically, at
line 15 the command
    super().__init__(customer, bank, acnt, limit)
calls the __init__ method that was inherited from the CreditCard superclass. Note
well that this method only accepts four parameters. We record the APR value in a
new field named _apr.
In similar fashion, our PredatoryCreditCard class provides a new implementation
of the charge method that overrides the inherited method. Yet, our implementation
of the new method relies on a call to the inherited method, with syntax
super().charge(price) at line 24. The return value of that call designates whether
the charge was successful. We examine that return value to decide whether to assess
a fee, and in turn we return that value to the caller of the method, so that the new
version of charge has an outward interface similar to that of the original.
 1  class PredatoryCreditCard(CreditCard):
 2      """An extension to CreditCard that compounds interest and fees."""
 3
 4      def __init__(self, customer, bank, acnt, limit, apr):
 5          """Create a new predatory credit card instance.
 6
 7          The initial balance is zero.
 8
 9          customer  the name of the customer (e.g., 'John Bowman')
10          bank      the name of the bank (e.g., 'California Savings')
11          acnt      the account identifier (e.g., '5391 0375 9387 5309')
12          limit     credit limit (measured in dollars)
13          apr       annual percentage rate (e.g., 0.0825 for 8.25% APR)
14          """
15          super().__init__(customer, bank, acnt, limit)   # call super constructor
16          self._apr = apr
17
18      def charge(self, price):
19          """Charge given price to the card, assuming sufficient credit limit.
20
21          Return True if charge was processed.
22          Return False and assess $5 fee if charge is denied.
23          """
24          success = super().charge(price)          # call inherited method
25          if not success:
26              self._balance += 5                   # assess penalty
27          return success                           # caller expects return value
28
29      def process_month(self):
30          """Assess monthly interest on outstanding balance."""
31          if self._balance > 0:
32              # if positive balance, convert APR to monthly multiplicative factor
33              monthly_factor = pow(1 + self._apr, 1/12)
34              self._balance *= monthly_factor
Code Fragment 2.7: A subclass of CreditCard that assesses interest and fees.
The process_month method is a new behavior, so there is no inherited version
upon which to rely. In our model, this method should be invoked by the bank,
once each month, to add new interest charges to the customer’s balance. The most
challenging aspect in implementing this method is making sure we have working
knowledge of how an annual percentage rate translates to a monthly rate. We do
not simply divide the annual rate by twelve to get a monthly rate (that would be too
predatory, as it would result in a higher APR than advertised). The correct
computation is to take the twelfth root of 1 + self._apr, and use that as a
multiplicative factor. For example, if the APR is 0.0825 (representing 8.25%), we
compute \(\sqrt[12]{1.0825} \approx 1.006628\), and therefore charge 0.6628% interest
per month. In this way, each $100 of debt will amass $8.25 of compounded interest
in a year.
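The arithmetic can be checked with a short computation. This sketch compares the twelfth-root factor with the naive approach of dividing the annual rate by twelve; the variable names are illustrative.

```python
apr = 0.0825

# twelfth root of (1 + apr): twelve applications compound to exactly the APR
monthly_factor = pow(1 + apr, 1 / 12)
balance = 100.0
for month in range(12):
    balance *= monthly_factor

# naive alternative: divide the annual rate by twelve and compound monthly
naive_balance = 100.0
for month in range(12):
    naive_balance *= 1 + apr / 12
```

After twelve months, balance is 108.25 up to floating-point rounding, while naive_balance is larger, which is why the simple division overstates the advertised rate.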
Protected Members
Our PredatoryCreditCard subclass directly accesses the data member self._balance,
which was established by the parent CreditCard class. The underscored name, by
convention, suggests that this is a nonpublic member, so we might ask if it is okay
that we access it in this fashion. While general users of the class should not be
doing so, our subclass has a somewhat privileged relationship with the superclass.
Several object-oriented languages (e.g., Java, C++) draw a distinction for nonpublic
members, allowing declarations of protected or private access modes. Members
that are declared as protected are accessible to subclasses, but not to the general
public, while members that are declared as private are not accessible to either. In
this respect, we are using _balance as if it were protected (but not private).
Python does not support formal access control, but names beginning with a single
underscore are conventionally akin to protected, while names beginning with a
double underscore (other than special methods) are akin to private. In choosing to
use protected data, we have created a dependency in that our PredatoryCreditCard
class might be compromised if the author of the CreditCard class were to change
the internal design. Note that we could have relied upon the public get_balance()
method to retrieve the current balance within the process_month method. But the
current design of the CreditCard class does not afford an effective way for a
subclass to change the balance, other than by direct manipulation of the data member.
It may be tempting to use charge to add fees or interest to the balance. However,
that method does not allow the balance to go above the customer’s credit limit,
even though a bank would presumably let interest compound beyond the credit
limit, if warranted. If we were to redesign the original CreditCard class, we might
add a nonpublic method, _set_balance, that could be used by subclasses to effect a
change without directly accessing the data member _balance.
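One way such a redesign might look is sketched below. The CreditCard class here is a pared-down, hypothetical stand-in with only the members needed for the example, and the _set_balance method is the suggested addition rather than part of the original design.

```python
class CreditCard:
    """Pared-down, hypothetical stand-in for the CreditCard class."""

    def __init__(self, limit):
        self._limit = limit
        self._balance = 0

    def get_balance(self):
        return self._balance

    def _set_balance(self, b):            # nonpublic: intended for subclasses
        self._balance = b

    def charge(self, price):
        if price + self._balance > self._limit:
            return False                  # charge would exceed the limit
        self._balance += price
        return True


class PredatoryCreditCard(CreditCard):
    def __init__(self, limit, apr):
        super().__init__(limit)
        self._apr = apr

    def process_month(self):
        balance = self.get_balance()      # public accessor, not the data member
        if balance > 0:
            monthly_factor = pow(1 + self._apr, 1 / 12)
            self._set_balance(balance * monthly_factor)


card = PredatoryCreditCard(limit=100, apr=0.0825)
charged = card.charge(100)                # uses the entire credit limit
card.process_month()                      # interest may push balance past the limit
```

In this arrangement the subclass never names _balance, so the superclass remains free to change how the balance is stored.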
class Progression:
    """Iterator producing a generic progression.

    Default iterator produces the whole numbers 0, 1, 2, ...
    """

    def __init__(self, start=0):
        """Initialize current to the first value of the progression."""
        self._current = start

    def _advance(self):
        """Update self._current to a new value.

        This should be overridden by a subclass to customize progression.

        By convention, if current is set to None, this designates the
        end of a finite progression.
        """
        self._current += 1

    def __next__(self):
        """Return the next element, or else raise StopIteration error."""
        if self._current is None:       # our convention to end a progression
            raise StopIteration()
        else:
            answer = self._current      # record current value to return
            self._advance()             # advance to prepare for next time
            return answer               # return the answer

    def __iter__(self):
        """By convention, an iterator must return itself as an iterator."""
        return self

    def print_progression(self, n):
        """Print next n values of the progression."""
        print(' '.join(str(next(self)) for j in range(n)))
Code Fragment 2.8: A general numeric progression class.
An Arithmetic Progression Class
class FibonacciProgression(Progression):
    """Iterator producing a generalized Fibonacci progression."""

    def __init__(self, first=0, second=1):
        """Create a new fibonacci progression.

        first      the first term of the progression (default 0)
        second     the second term of the progression (default 1)
        """
        super().__init__(first)            # start progression at first
        self._prev = second - first        # fictitious value preceding the first

    def _advance(self):
        """Update current value by taking sum of previous two."""
        self._prev, self._current = self._current, self._prev + self._current
Code Fragment 2.11: A class that produces a Fibonacci progression.
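The same pattern of overriding _advance supports other progressions. As a minimal sketch, the block below pairs a pared-down Progression base class with a hypothetical ArithmeticProgression subclass; the parameter names increment and start are assumptions made for this example.

```python
class Progression:
    """Pared-down version of the general progression iterator."""

    def __init__(self, start=0):
        self._current = start

    def _advance(self):
        self._current += 1

    def __next__(self):
        if self._current is None:
            raise StopIteration()
        answer = self._current
        self._advance()
        return answer

    def __iter__(self):
        return self


class ArithmeticProgression(Progression):
    """Hypothetical subclass in which each value adds a fixed increment."""

    def __init__(self, increment=1, start=0):
        super().__init__(start)           # initialize base class
        self._increment = increment

    def _advance(self):                   # override inherited version
        self._current += self._increment


seq = ArithmeticProgression(5, 2)
first_ten = [next(seq) for _ in range(10)]
```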
To complete our presentation, Code Fragment 2.12 provides a unit test for all of
our progression classes, and Code Fragment 2.13 shows the output of that test.
if __name__ == '__main__':
    print('Default progression:')
    Progression().print_progression(10)
Default progression:
0 1 2 3 4 5 6 7 8 9
Arithmetic progression with increment 5:
0 5 10 15 20 25 30 35 40 45
Arithmetic progression with increment 5 and start 2:
2 7 12 17 22 27 32 37 42 47
Geometric progression with default base:
1 2 4 8 16 32 64 128 256 512
Geometric progression with base 3:
1 3 9 27 81 243 729 2187 6561 19683
Fibonacci progression with default start values:
0 1 1 2 3 5 8 13 21 34
Fibonacci progression with start values 4 and 6:
4 6 10 16 26 42 68 110 178 288
Code Fragment 2.13: Output of the unit tests from Code Fragment 2.12.
This implementation relies on two advanced Python techniques. The first is that
we declare the ABCMeta class of the abc module as a metaclass of our Sequence
class. A metaclass is different from a superclass, in that it provides a template for
the class definition itself. Specifically, the ABCMeta declaration assures that the
constructor for the class raises an error.
The second advanced technique is the use of the @abstractmethod decorator
immediately before the __len__ and __getitem__ methods are declared. That
declares these two particular methods to be abstract, meaning that we do not provide
an implementation within our Sequence base class, but that we expect any concrete
subclasses to support those two methods. Python enforces this expectation, by
disallowing instantiation for any subclass that does not override the abstract methods
with concrete implementations.
The rest of the Sequence class definition provides tangible implementations for
other behaviors, under the assumption that the abstract __len__ and __getitem__
methods will exist in a concrete subclass. If you carefully examine the source code,
the implementations of methods __contains__, index, and count do not rely on any
assumption about the self instances, other than that syntax len(self) and self[j] are
supported (by special methods __len__ and __getitem__, respectively). Support
for iteration is automatic as well, as described in Section 2.3.4.
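The two techniques can be sketched together as follows. This is an illustrative reconstruction under the assumptions stated in the text (an ABCMeta metaclass, two abstract methods, and concrete __contains__, index, and count built only on len(self) and self[j]); the Letters subclass is hypothetical.

```python
from abc import ABCMeta, abstractmethod


class Sequence(metaclass=ABCMeta):
    """Sketch of an abstract base class for sequence-like containers."""

    @abstractmethod
    def __len__(self):
        """Return the length of the sequence."""

    @abstractmethod
    def __getitem__(self, j):
        """Return the element at index j of the sequence."""

    def __contains__(self, val):
        """Return True if val is found in the sequence; False otherwise."""
        for j in range(len(self)):
            if self[j] == val:
                return True
        return False

    def index(self, val):
        """Return leftmost index at which val is found (or raise ValueError)."""
        for j in range(len(self)):
            if self[j] == val:
                return j
        raise ValueError('value not in sequence')

    def count(self, val):
        """Return the number of elements equal to given value."""
        k = 0
        for j in range(len(self)):
            if self[j] == val:
                k += 1
        return k


class Letters(Sequence):
    """Hypothetical concrete subclass supplying the two abstract methods."""

    def __init__(self, text):
        self._text = text

    def __len__(self):
        return len(self._text)

    def __getitem__(self, j):
        return self._text[j]


try:
    Sequence()                    # abstract methods remain, so this fails
    instantiable = True
except TypeError:
    instantiable = False

word = Letters('banana')
```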
In the remainder of this book, we omit the formality of using the abc module.
If we need an “abstract” base class, we simply document the expectation that
subclasses provide assumed functionality, without technical declaration of the methods
as abstract. But we will make use of the wonderful abstract base classes that are
defined within the collections.abc module (such as Sequence). To use such a class,
we need only rely on standard inheritance techniques.
For example, our Range class, from Code Fragment 2.6 of Section 2.3.5, is an
example of a class that supports the __len__ and __getitem__ methods. But that
class does not support methods count or index. Had we originally declared it with
Sequence as a superclass, then it would also inherit the count and index methods.
The syntax for such a declaration would begin as:
    class Range(collections.abc.Sequence):
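A minimal sketch of that idea follows, using the Sequence class from the standard collections.abc module. The Range class here is simplified for brevity (it assumes a positive step and requires an explicit stop), so it is not a full substitute for the earlier version.

```python
from collections.abc import Sequence


class Range(Sequence):
    """Simplified range-like class; assumes a positive step for brevity."""

    def __init__(self, start, stop, step=1):
        self._start = start
        self._step = step
        self._length = max(0, (stop - start + step - 1) // step)

    def __len__(self):
        return self._length

    def __getitem__(self, k):
        if k < 0:
            k += len(self)
        if not 0 <= k < self._length:
            raise IndexError('index out of range')
        return self._start + k * self._step


r = Range(0, 20, 5)               # represents 0, 5, 10, 15
position = r.index(10)            # inherited from Sequence
occurrences = r.count(15)         # inherited from Sequence
contains_seven = 7 in r           # inherited __contains__
```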
Nested Classes
It is also possible to nest one class definition within the scope of another class.
This is a useful construct, which we will exploit several times in this book in the
implementation of data structures. This can be done by using a syntax such as
class A:                # the outer class
    class B:            # the nested class
        ...
In this case, class B is the nested class. The identifier B is entered into the name-
space of class A associated with the newly defined class. We note that this technique
is unrelated to the concept of inheritance, as class B does not inherit from class A.
Nesting one class in the scope of another makes clear that the nested class
exists for support of the outer class. Furthermore, it can help reduce potential name
conflicts, because it allows for a similarly named class to exist in another context.
For example, we will later introduce a data structure known as a linked list and will
define a nested node class to store the individual components of the list. We will
also introduce a data structure known as a tree that depends upon its own nested
node class. These two structures rely on different node definitions, and by nesting
those within the respective container classes, we avoid ambiguity.
Another advantage of one class being nested as a member of another is that it
allows for a more advanced form of inheritance in which a subclass of the outer
class overrides the definition of its nested class. We will make use of that technique
in Section 11.2.1 when specializing the nodes of a tree structure.
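Both uses of nesting can be sketched briefly. The Tree and ColoredTree classes below are hypothetical stand-ins, not the tree classes developed later in the book; the point is only that the subclass redefines the nested _Node class and the inherited make_node method picks up the new definition.

```python
class Tree:
    """Hypothetical container whose nodes are instances of a nested class."""

    class _Node:
        def __init__(self, element):
            self.element = element

        def describe(self):
            return 'node(' + str(self.element) + ')'

    def make_node(self, element):
        return self._Node(element)        # looks up _Node through the instance


class ColoredTree(Tree):
    """Subclass that overrides the definition of the nested node class."""

    class _Node(Tree._Node):
        def __init__(self, element, color='red'):
            super().__init__(element)
            self.color = color

        def describe(self):
            return self.color + ' ' + super().describe()


plain = Tree().make_node(7)
colored = ColoredTree().make_node(7)      # inherited method, overridden node class
```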
By default, Python represents each namespace with an instance of the built-in dict
class (see Section 1.2.3) that maps identifying names in that scope to the associated
objects. While a dictionary structure supports relatively efficient name lookups,
it requires additional memory usage beyond the raw data that it stores (we will
explore the data structure used to implement dictionaries in Chapter 10).
Python provides a more direct mechanism for representing instance namespaces
that avoids the use of an auxiliary dictionary. To use the streamlined representation
for all instances of a class, that class definition must provide a class-level member
named __slots__ that is assigned to a fixed sequence of strings that serve as names
for instance variables. For example, with our CreditCard class, we would declare
the following:
class CreditCard:
    __slots__ = '_customer', '_bank', '_account', '_balance', '_limit'
In this example, the right-hand side of the assignment is technically a tuple (see
discussion of automatic packing of tuples in Section 1.9.3).
When inheritance is used, if the base class declares __slots__, a subclass must
also declare __slots__ to avoid creation of instance dictionaries. The declaration
in the subclass should only include names of supplemental instance variables that
are newly introduced. For example, our PredatoryCreditCard declaration would
include the following declaration:
class PredatoryCreditCard(CreditCard):
    __slots__ = '_apr'      # in addition to the inherited members
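The effect of these declarations can be observed with a short experiment. The sketch below uses pared-down versions of the two classes (constructors only) and assumes nothing beyond the field names given above; the _nickname attribute is a hypothetical name used to show that undeclared attributes are rejected.

```python
class CreditCard:
    __slots__ = '_customer', '_bank', '_account', '_balance', '_limit'

    def __init__(self, customer, bank, acnt, limit):
        self._customer = customer
        self._bank = bank
        self._account = acnt
        self._limit = limit
        self._balance = 0


class PredatoryCreditCard(CreditCard):
    __slots__ = '_apr'            # in addition to the inherited members

    def __init__(self, customer, bank, acnt, limit, apr):
        super().__init__(customer, bank, acnt, limit)
        self._apr = apr


card = PredatoryCreditCard('John Bowman', 'California Savings',
                           '5391 0375 9387 5309', 2500, 0.0825)
has_dict = hasattr(card, '__dict__')      # no per-instance dictionary is created
try:
    card._nickname = 'JB'                 # not declared in any __slots__
    extra_allowed = True
except AttributeError:
    extra_allowed = False
```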
We could choose to use the __slots__ declaration to streamline every class in
this book. However, we do not do so because such rigor would be atypical for
Python programs. With that said, there are a few classes in this book for which
we expect to have a large number of instances, each representing a lightweight
construct. For example, when discussing nested classes, we suggest linked lists
and trees as data structures that are often composed of a large number of individual
nodes. To promote greater efficiency in memory usage, we will use an explicit
__slots__ declaration in any nested classes for which we expect many instances.