Intermediate Python
Obi Ike-Nwosu
This book is for sale at http://leanpub.com/intermediatepython
This version was published on 2015-10-18
This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing
process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and
many iterations to get reader feedback, pivot until you have the right book and build traction once
you do.
© 2015 Obi Ike-Nwosu
Contents

1. Acknowledgements
2. An Introduction
   2.1 The Evolution of Python
   2.2 Python 2 vs Python 3
   2.3 The Python Programming Language
4. Intermezzo: Glossary
   4.1 Names and Binding
   4.2 Code Blocks
   4.3 Name-spaces
   4.4 Scopes
   4.5 eval()
   4.6 exec()
5. Objects 201
   5.1 Strong and Weak Object References
   5.2 The Type Hierarchy
      None Type
      NotImplemented Type
      Ellipsis Type
      Numeric Type
      Sequence Type
      Set
      Mapping
      Callable Types
      Custom Type
      Module Type
      File/IO Types
      Built-in Types
7. The Function
   7.1 Function Definitions
   7.2 Functions are Objects
   7.3 Functions are descriptors
   7.4 Calling Functions
      Unpacking Function Arguments
      * and ** Function Parameters
   7.5 Nested functions and Closures
   7.6 A Byte of Functional Programming
      The Basics
      Comprehensions
      Functools
1. Acknowledgements
I would love to take the opportunity to thank all who have reviewed and spotted issues in the manuscript. This includes, but is not limited to, Ngozi Nwosu for taking the time out to review the whole manuscript and point out a whole load of grammatical errors, and Olivia Enewally, Roman Turna and Abhen Ng for pointing out some factual and grammatical errors. A whole lot of other people on Reddit have pointed out errors, and to those people I am really grateful.
Without the input of all these people, this manuscript would be worth much less than it currently is. Thank you all!
2. An Introduction
The Python programming language has been around for quite a while. Guido van Rossum started development work on the first version of Python in 1989. Since then, it has grown into a highly loved and revered language that has been used, and continues to be used, in a host of different application types.
The Python interpreter and the extensive standard library that comes with it are available for free, in source or binary form, for all major platforms from the Python web site, https://www.python.org/. This site also contains distributions of and pointers to many free third-party Python modules, programs and tools, and additional documentation.
The Python interpreter can easily be extended with new functions and data types implemented in C, C++ or any other language that is callable from C. Python is also suitable as an extension language for customisable applications. One of the most notable features of Python is its easy, white-space-aware syntax.
This book is intended as a concise, intermediate-level treatise on the Python programming language. There is a need for this because materials for Python programmers at this level are scarce. The material in this book is targeted at the programmer who has been through a beginner-level introduction to the Python programming language, or who has some experience in a different object-oriented programming language such as Java, and who wants to gain a more in-depth understanding of the Python programming language in a holistic manner. It is not intended as an introductory tutorial for beginners, although programmers with some experience in other languages may find the very short tutorial included instructive.
The book covers only a handful of topics but tries to provide a holistic and in-depth coverage of these topics. It starts with a short tutorial introduction to get the reader up to speed with the basics of Python; experienced programmers from other object-oriented languages such as Java may find that this is all the introduction to Python that they need. This is followed by a discussion of the Python object model, then it moves on to discussing object-oriented programming in Python. With a firm understanding of the Python object model, it goes on to discuss functions and functional programming. This is followed by a discussion of meta-programming techniques and their applications. The remaining chapters cover generators, a complex but very interesting topic in Python; modules and packaging; and Python runtime services. In between, intermezzos discuss topics that are worth knowing because of the added understanding they provide.
I hope the book achieves the purpose for which it was written. I welcome all feedback readers may have and actively encourage readers to provide such feedback.
The next major milestone was Python 3 released on December 2, 2008. This was designed to rectify
certain fundamental design flaws in the language that could not be implemented while maintaining
full backwards compatibility with the 2.x series.
A user can type in Python statements at the interpreter prompt and get instant feedback. For example, we can evaluate expressions at the REPL and get values for such expressions.
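A minimal sketch of such a session; the expressions themselves are illustrative:

```python
>>> 2 + 2
4
>>> "Py" + "thon"
'Python'
```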
Typing Ctrl-D at the primary prompt causes the interpreter to exit the session.
Multiple physical lines can be explicitly joined into a single logical line by use of the line continuation
character, \, as shown below:
>>> name = "Obi Ike-Nwosu"
>>> cleaned_name = name.replace("-", " "). \
...
replace(" ", "")
>>> cleaned_name
'ObiIkeNwosu'
>>>
Lines are joined implicitly, eliminating the need for line continuation characters, when expressions in triple-quoted strings, or enclosed in parentheses (), brackets [] or braces {}, span multiple lines.
From the discussions above, it can be inferred that there are two types of statements in Python:
1. Simple statements that span a single logical line. These include statements such as assignment statements, yield statements etc.
2. Compound statements that span multiple logical lines. These include statements such as the while and for statements.
Compound statements are made up of one or more clauses. A clause is made up of a header and a suite. The clause headers of a given compound statement are all at the same indentation level; they begin with a unique identifier, while, if etc., and end with a colon. The suite's execution is controlled by the header. This is illustrated with the example below:
>>> num = 6
# the if statement is a compound statement; the clause header
# controls execution of the indented suite block that follows
>>> if num % 2 == 0:
...     print("The number {} is even".format(num))
...
The number 6 is even
>>>
The suite may be a set of one or more statements that follow the header's colon, with each statement separated from the previous one by a semi-colon, as shown in the following example.
```python
>>> x = 1
>>> y = 2
>>> z = 3
>>> if x < y < z: print(x); print(y); print(z)
...
1
2
3
```
The suite is conventionally written as one or more indented statements on subsequent lines that
follow the header such as below:
```python
>>> x = 1
>>> y = 2
>>> z = 3
>>> if x < y < z:
...     print(x)
...     print(y)
...     print(z)
...
1
2
3
```
Indentation is used to denote code blocks such as function bodies, conditionals, loops and classes. Leading white-space at the beginning of a logical line is used to compute the indentation level for that line, which in turn is used to determine the grouping of statements. Indentation used within a code body must always match the indentation of the first statement of the block of code.
3.3 Strings
Strings are represented in Python using double "..." or single '...' quotes. Special characters can
be used within a string by escaping them with \ as shown in the following example:
# the quote is used as an apostrophe so we escape it for Python to
# treat it as an apostrophe rather than the closing quote for a string
>>> name = 'men\'s'
>>> name
"men's"
>>>
To avoid the interpretation of characters as special characters, the character r is added before the opening quote of the string, making it a raw string, as shown in the following example.
>>> print('C:\some\name') # here \n means newline!
C:\some
ame
>>> print(r'C:\some\name') # note the r before the quote
C:\some\name
String literals that span multiple lines can be created with the triple quotes but newlines are
automatically added at the end of a line as shown in the following snippet.
>>> para = """hello world I am putting together a
... book for beginners to get to the next level in python"""
# notice the new line character
>>> para
'hello world I am putting together a \nbook for beginners to get to the next level in python'
# printing this will cause the string to go on multiple lines
>>> print(para)
hello world I am putting together a
book for beginners to get to the next level in python
>>>
To avoid the inclusion of a newline, the continuation character \ should be used at the end of a line
as shown in the following example.
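A minimal sketch reusing the para string from above; the trailing backslash suppresses the newline inside the triple-quoted string:

```python
>>> para = """hello world I am putting together a \
... book for beginners to get to the next level in python"""
>>> para
'hello world I am putting together a book for beginners to get to the next level in python'
```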
Strings are immutable, so once created they cannot be modified. There is no character type, so characters are assumed to be strings of length 1. Strings are sequence types and so support sequence-type operations, with the exception of assignment due to their immutability. Strings can be indexed with integers as shown in the following snippet:
>>> name = 'obiesie'
>>> name[1]
'b'
>>>
Strings can be concatenated to create new strings as shown in the following example:
>>> name = 'obiesie'
>>> surname = " Ike-Nwosu"
>>> full_name = name + surname
>>> full_name
'obiesie Ike-Nwosu'
>>>
One or more string literals can be concatenated together by writing them next to each other as
shown in the following snippet:
>>> 'Py' 'thon'
'Python'
>>>
The built-in function len can also be used to get the length of a string as shown in the following snippet.
>>> name = "obi"
>>> len(name)
3
>>>
The if statement can be followed by zero or more elif statements and an optional else statement
that is executed when none of the conditions in the if or elif statements have been met.
>>> if name == "obi":
...
print("Hello Obi")
... elif name == "chuks":
...
print("Hello chuks")
... else:
...
print("Hello Stranger")
Hello Stranger
>>>
Most programming languages have a syntax similar to the following for iterating over a progression
of numbers:
for(int x = 10; x < 20; x = x+1) {
// do something here
}
Python replaces the above with the simpler range() built-in function, which is used to generate an arithmetic progression of integers.
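For example, a minimal sketch mirroring the C-style loop above:

```python
>>> for x in range(10, 20):
...     print(x, end=" ")
...
10 11 12 13 14 15 16 17 18 19
```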
The range function has the signature range(start, stop, step). The stop value is never part of the progression that is returned.
while statement
The while statement executes the statements in its suite as long as the condition expression in the
while statement evaluates to true.
>>> counter = 10
>>> while counter > 0:  # the conditional expression is 'counter > 0'
...     print(counter)
...     counter = counter - 1
...
10
9
8
7
6
5
4
3
2
1
The continue keyword is used to force the start of the next iteration of a loop. When used the
interpreter ignores all statements that come after the continue statement and continues with the
next iteration of the loop.
>>> for i in range(10):
...     # if i is 5 then skip the rest of the suite and
...     # continue with the next iteration
...     if i == 5:
...         continue
...     print("The value is " + str(i))
...
The value is 0
The value is 1
The value is 2
The value is 3
The value is 4
# no printed value for i == 5
The value is 6
The value is 7
The value is 8
The value is 9
In the example above, it can be observed that the number 5 is not printed due to the use of continue when the value is 5; however, all subsequent values are printed.
else clause with looping constructs
Python has a quirky feature in which the else keyword can be used with looping constructs. When an else keyword is used with a looping construct such as while or for, the statements within the suite of the else clause are executed as long as the looping construct was not ended by a break statement.
If the loop was exited by a break statement, the execution of the suite of the else statement is
skipped as shown in the following example:
>>> for i in range(10):
...     if i == 5:
...         break
...     print(i)
... else:
...     print("I am in quirky else loop")
...
0
1
2
3
4
>>>
Enumerate
Sometimes, when iterating over a list, a tuple or a sequence in general, having access to the index of an item, as well as the item itself, may be necessary. This could be achieved using a while loop.
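A minimal sketch of such a while loop, assuming a names list that matches the enumerate example below:

```python
>>> names = ["Joe", "Obi", "Chris", "Jamie"]
>>> index = 0
>>> while index < len(names):
...     print("{}. {}".format(index, names[index]))
...     index += 1
...
0. Joe
1. Obi
2. Chris
3. Jamie
```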
The above solution is how one would go about it in most languages, but Python has a better alternative in the form of the enumerate built-in function. The above solution can be reworked beautifully in Python as shown in the following snippet:
>>> for index, name in enumerate(names):
...     print("{}. {}".format(index, name))
...
0. Joe
1. Obi
2. Chris
3. Jamie
>>>
3.5 Functions
Named functions are defined with the def keyword, which must be followed by the function name and a parenthesized list of formal parameters. The return keyword is used to return a value from a function definition. A Python function definition is shown in the example below:
def full_name(first_name, last_name):
return " ".join((first_name, last_name))
Functions are invoked by calling the function name with the required arguments in parentheses, for example full_name("Obi", "Ike-Nwosu"). Python functions can return multiple values by returning a tuple of the required values, as in a function that returns the quotient and remainder from a division operation.
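A minimal sketch; the function name divide is illustrative:

```python
>>> def divide(a, b):
...     quotient = a // b
...     remainder = a % b
...     return quotient, remainder
...
>>> divide(7, 2)
(3, 1)
```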
Python functions can be defined without the return keyword. In that case the default returned value is None, as shown in the following snippet:
>>> def print_name(first_name, last_name):
...     print(" ".join((first_name, last_name)))
...
>>> print_name("Obi", "Ike-Nwosu")
Obi Ike-Nwosu
>>> x = print_name("Obi", "Ike-Nwosu")
Obi Ike-Nwosu
>>> x
>>> type(x)
<class 'NoneType'>
>>>
The return keyword does not even have to return a value in python as shown in the following
example.
>>> def dont_return_value():
...     print("How to use return keyword without a value")
...     return
...
>>> dont_return_value()
How to use return keyword without a value
Python also supports anonymous functions defined with the lambda keyword. Python's lambda support is rather limited, crippled a few people may say, because it supports only a single expression in the body of the lambda expression. Lambda expressions are another form of syntactic sugar and are equivalent to conventional named function definitions. An example of a lambda expression is the following:
>>> square_of_number = lambda x: x**2
>>> square_of_number
<function <lambda> at 0x101a07158>
>>> square_of_number(2)
4
>>>
Elements can also be added to other parts of a list, not just the end, using the `insert` method. Two or more lists can be concatenated together with the `+` operator, as shown in the sketch below. To get a full listing of all methods of the list, run the help command with list as argument, i.e. help(list).
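A minimal sketch of these list operations; the fruits list is illustrative:

```python
>>> fruits = ["apple", "banana"]
>>> fruits.insert(1, "mango")    # insert at index 1, not just at the end
>>> fruits
['apple', 'mango', 'banana']
>>> fruits + ["pear"]            # + concatenation returns a new list
['apple', 'mango', 'banana', 'pear']
```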
1. Tuples: These are another type of sequence structure. A tuple consists of a number of comma-separated objects.
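For example, a minimal sketch with illustrative company names:

```python
>>> companies = "Google", "Microsoft", "SpaceX"
>>> companies
('Google', 'Microsoft', 'SpaceX')
>>> type(companies)
<class 'tuple'>
```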
When defining a non-empty tuple, the parentheses are optional, but when the tuple is part of a larger expression, the parentheses are required. The parentheses come in handy when defining an empty tuple, for instance:
>>> companies = ()
>>> type(companies)
<class 'tuple'>
>>>
Tuples have a quirky syntax that some people may find surprising. When defining a single-element tuple, the comma must be included after the single element regardless of whether or not parentheses are included. If the comma is left out, then the result of the expression is not a tuple. For instance:
>>> company = "Google",
>>> type(company)
<class 'tuple'>
>>>
>>> company = ("Google",)
>>> type(company)
<class 'tuple'>
# absence of the comma returns the value contained within the parenthesis
>>> company = ("Google")
>>> company
'Google'
>>> type(company)
<class 'str'>
>>>
Tuples are integer indexed just like lists but are immutable; once created, the contents cannot be changed by any means such as by assignment.
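For instance, a minimal sketch of the failure mode, mirroring the tuple assignment error shown later in this book:

```python
>>> companies = ("Google", "Microsoft")
>>> companies[0] = "Apple"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
```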
However, if the object in a tuple is a mutable object such as a list, such object can be changed as
shown in the following example:
>>> companies = (["lockheedMartin", "Boeing"], ["Google", "Microsoft"])
>>> companies
(['lockheedMartin', 'Boeing'], ['Google', 'Microsoft'])
>>> companies[0].append("SpaceX")
>>> companies
(['lockheedMartin', 'Boeing', 'SpaceX'], ['Google', 'Microsoft'])
>>>
1. Sets: A set is an unordered collection of objects that does not contain any duplicates. An empty set is created using set(); note that empty curly braces, {}, create an empty dictionary rather than an empty set. Sets are unordered, so unlike tuples or lists they cannot be indexed by integers. However sets, with the exception of frozen sets, are mutable, so one can add to, update or remove from a set as shown in the following:
>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
>>> basket_set = set()
>>> basket_set
set()
>>> basket_set.update(basket)
>>> basket_set
{'pear', 'orange', 'apple', 'banana'}
>>> basket_set.add("clementine")
>>> basket_set
{'pear', 'orange', 'apple', 'banana', 'clementine'}
>>> basket_set.remove("apple")
>>> basket_set
{'pear', 'orange', 'banana', 'clementine'}
>>>
The primary operations of interest that are offered by dictionaries are the storage of a value by key and the retrieval of stored values, also by key. Values are retrieved by indexing the dictionary with the key using square brackets, as shown in the following example.
>>> ages = {"obi": 24}
>>> ages["obi"]
24
Dictionaries are mutable, so the value indexed by a key can be changed, and keys can be deleted from and added to the dict.
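A minimal sketch of these operations, continuing the illustrative ages mapping from above:

```python
>>> ages["ada"] = 30     # add a new key
>>> ages["obi"] = 25     # change the value indexed by an existing key
>>> del ages["obi"]      # delete a key
>>> ages
{'ada': 30}
```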
Python's data structures are not limited to just those listed in this section. For example, the collections module provides additional data structures such as queues and deques; however, the data structures listed in this section form the workhorse for most Python applications. To get better insight into the capabilities of a data structure, the help() function is used with the name of the data structure as argument, for example help(list).
3.7 Classes
The class statement is used to define new types in python as shown in the following example:
class Account:
    # class variable that is common to all instances of a class
    num_accounts = 0

    def __init__(self, name, balance):
        # start of instance variables
        self.name = name
        self.balance = balance
        # end of instance variables
        Account.num_accounts += 1

    def deposit(self, amt):
        self.balance = self.balance + amt

    def withdraw(self, amt):
        self.balance = self.balance - amt

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)

    @classmethod
    def count(cls):
        # illustrative class method body; class methods receive
        # the class itself as the first argument
        return cls.num_accounts
Classes in Python, just like classes in other languages, have class variables, instance variables, class methods, static methods and instance methods. When defining classes, the base classes are included in the parentheses that follow the class name. For those that are familiar with Java, the __init__ method is similar to a constructor; it is in this method that instance variables are initialized. The above defined class can be instantiated by calling the defined class with the required arguments to __init__ in parentheses, ignoring the self argument, as shown in the following example.
>>> acct = Account("obie", 10000000)
Methods in a class that are defined with self as first argument are instance methods. The self argument is similar to this in Java and refers to the object instance. Methods are called in Python using the dot notation syntax as shown below:
>>> acct = Account("obie", 10000000)
>>> acct.inquiry()
'Name=obie, balance=10000000'
Python comes with a built-in function, dir, for introspection of objects. The dir function can be called with an object as argument and it returns a list of all attributes, methods and variables of a class.
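A minimal sketch using the Account class defined above:

```python
>>> acct = Account("obie", 10000000)
>>> "deposit" in dir(acct)
True
>>> "balance" in dir(acct)
True
```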
3.8 Modules
Functions and classes provide a means for structuring your Python code, but as the code grows in size and complexity, there is a need for such code to be split into multiple files, with each source file containing related definitions. The source files can then be imported as needed in order to access definitions in any such source file. In Python, we refer to source files as modules, and modules have the .py extension.
For example, the Account class definition from the previous section can be saved to a module called
Account.py. To use this module else where, the import statement is used to import the module as
shown in the following example:
>>> import Account
>>> acct = Account.Account("obie", 10000000)
Note that the import statement takes the name of the module without the .py extension. Using the import statement creates a name-space, in this case the Account name-space, and all definitions in the module are available in such name-space. The dot notation (.) is used to access the definitions as required. An alias for an imported module can also be created using the as keyword, so the example from above can be reformulated.
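A minimal sketch, with the alias acc chosen for illustration:

```python
>>> import Account as acc
>>> acct = acc.Account("obie", 10000000)
```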
It is also possible to import only the definitions that are needed from the module resulting in the
following:
>>> from Account import Account
>>> account = Account("obie", 10000000)
All the definitions in a module can also be imported by using the wild card symbol as shown below:
>>> from Account import *
This method of importing is not always advised as it can result in name clashes when one of the name definitions being imported is already used in the current name-space. This is avoided by importing the module as a whole. Modules are also objects in Python, so we can introspect on them using the dir introspection function. Python modules can be further grouped together into packages. Modules and packages are discussed in depth in a subsequent chapter.
3.9 Exceptions
Python has support for exceptions and exception handling. For example, when an attempt is made to divide by zero, a ZeroDivisionError is thrown by the Python interpreter as shown in the following example.
>>> 2/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
>>>
>>>
During the execution of a program, an exception is raised when an error occurs; if the exception is
not handled, a trace-back is dumped to the screen. Errors that are not handled will normally cause
an executing program to terminate.
Exceptions can be handled in Python by using the try...except statement. For example, the divide by zero exception from above could be handled as shown in the following snippet.
>>> try:
...     2/0
... except ZeroDivisionError as e:
...     print("Attempting to divide by 0. Not allowed")
...
Attempting to divide by 0. Not allowed
>>>
Exceptions in Python can be of different types. For example, if an attempt was made to catch an IOError in the previous snippet, the program would terminate because the exception actually raised is a ZeroDivisionError exception. To catch all types of exceptions with a single handler, except Exception is used, but this is advised against as it becomes impossible to tell what kind of exception has occurred, thus masking the exception.
Custom exceptions can be defined to handle application-specific error conditions in our code. To do this, define a custom exception class that inherits from the Exception base class.
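A minimal sketch; the exception name InsufficientFundsError is illustrative:

```python
class InsufficientFundsError(Exception):
    """Raised when a withdrawal exceeds the available balance."""
    pass

try:
    raise InsufficientFundsError("balance too low")
except InsufficientFundsError as e:
    print(e)
```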
The open function returns a file object or throws an exception if the file does not exist. The file object supports a number of methods, such as read, which reads the whole content of the file into a string, or readline, which reads the contents of the file one line at a time. Python supports the following syntactic sugar for iterating through the lines of a file.
for line in open("afile.txt"):
print(line)
f = open("out.txt", "w")
contents = ["I", "love", "python"]
for content in contents:
f.write(content)
f.close()
Python also has support for writing to standard output and reading from standard input. This can be done using sys.stdout.write() and sys.stdin.readline() from the sys module.
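A minimal sketch; the prompt text is illustrative:

```python
import sys

sys.stdout.write("what is your name? ")
name = sys.stdin.readline()   # reads one line from standard input
sys.stdout.write("hello " + name)
```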
4. Intermezzo: Glossary
A number of terms and esoteric Python functions are used throughout this book, and a good understanding of these terms is integral to gaining a better and deeper understanding of Python. A description of these terms and functions is provided in the sections that follow.
4.1 Names and Binding
Consider a simple assignment statement such as x = 5. Here, x is a name that references the object, 5. The process of assigning a reference to 5 to x is called binding. A binding causes a name to be associated with an object in the innermost scope of the currently executing program. Bindings may occur in a number of instances, such as during variable assignment, or during a function or method call when a supplied argument is bound to a parameter. It is important to note that names are just symbols and they have no type associated with them; names are just references to objects that actually have types.
4.3 Name-spaces
A name-space, as the name implies, is a context in which a given set of names is bound to objects. Name-spaces in Python are currently implemented as dictionary mappings. The built-in name-space is an example of a name-space that contains all the built-in functions, and this can be accessed by entering __builtins__.__dict__ at the terminal (the output is of considerable size). The interpreter has access to multiple name-spaces, including the global name-space, the built-in name-space and the local name-space. Name-spaces are created at different times and have different lifetimes. For example, a new local name-space is created at the start of a function execution and
this name-space is discarded when the function exits or returns. The global name-space refers to the module-wide name-space; all names defined in this name-space are available module-wide. The local name-space is created by function invocations, while the built-in name-space contains all the built-in names. These three name-spaces are the main name-spaces available to the interpreter.
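A minimal sketch of inspecting these name-spaces; the function f is illustrative:

```python
>>> import builtins
>>> "len" in builtins.__dict__   # the built-in name-space maps names to objects
True
>>> def f():
...     x = 1
...     print(locals())          # the local name-space of the executing function
...
>>> f()
{'x': 1}
```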
4.4 Scopes
A scope is an area of a program in which a set of name bindings (name-spaces) is visible and directly accessible. Direct access is an important characteristic of a scope, as will be explained when classes are discussed. It simply means that a name can be used as is, without the need for dot notation such as SomeClassOrModule.name, to access it. At runtime, the following scopes may be available:
1. The innermost scope, which is searched first and contains the local names.
2. The scopes of any enclosing functions, which are searched starting with the nearest enclosing scope and contain non-local, but also non-global, names.
3. The next-to-last scope, which contains the current module's global names.
4. The outermost scope, which is searched last and is the name-space containing the built-in names.
When a name is used in Python, the interpreter searches the name-spaces of the scopes in ascending order as listed above, and if the name is not found in any of the name-spaces, an exception is raised. Python supports static scoping, also known as lexical scoping; this means that the visibility of a set of name bindings can be inferred by only inspecting the program text.
Note
Python has a quirky scoping rule that prevents a reference to an object in the global scope from
being modified in a local scope; such an attempt will throw an UnboundLocalError exception. In
order to modify an object from the global scope within a local scope, the global keyword has to be
used with the object name before modification is attempted. The following example illustrates this.
>>> a = 1
>>> def inc_a(): a += 2
...
>>> inc_a()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in inc_a
UnboundLocalError: local variable 'a' referenced before assignment
In order to modify the object from the global scope, the global statement is used as shown in the
following snippet.
>>> a = 1
>>> def inc_a():
...     global a
...     a += 1
...
>>> inc_a()
>>> a
2
Python also has the nonlocal keyword that is used when there is a need to modify a variable bound
in an outer non-global scope from an inner scope. This proves very handy when working with nested
functions (also referred to as closures). A very trivial illustration of the nonlocal keyword in action
is shown in the following snippet that defines a simple counter object that counts in ascending order.
>>> def make_counter():
...     count = 0
...     def counter():
...         # nonlocal captures the count binding from the
...         # enclosing scope, not the global scope
...         nonlocal count
...         count += 1
...         return count
...     return counter
...
>>> counter_1 = make_counter()
>>> counter_2 = make_counter()
>>> counter_1()
1
>>> counter_1()
2
>>> counter_2()
1
>>> counter_2()
2
4.5 eval()
eval is a Python built-in function for dynamically evaluating Python expressions in a string (the content of the string must be a valid Python expression) or code objects. The function has the following signature: eval(expression, globals=None, locals=None). If supplied, the globals argument to the eval function must be a dictionary, while the locals argument can be any mapping. The evaluation of the supplied expression is done using the globals and locals mappings as the global and local name-spaces. If __builtins__ is absent from the globals dictionary, the current globals are copied into globals before the expression is parsed. This means that the expression will have either full or restricted access to the standard built-ins depending on the execution environment; this way the execution environment of eval can be restricted or sandboxed. eval, when called, returns the result of executing the expression or code object, for example:
```python
>>> eval("2 + 1") # note the expression is in a string
3
```
Since eval can take arbitrary code objects as argument and return the value of executing such expressions, it, along with exec, is used in executing arbitrary Python code that has been compiled into code objects using the compile method. Online Python interpreters are able to execute Python code supplied by their users using both eval and exec among other methods.
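A minimal sketch of eval on a compiled code object:

```python
>>> code = compile("2 + 1", "<string>", "eval")   # compile an expression to a code object
>>> eval(code)
3
```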
4.6 exec()
exec is the counterpart to eval. It executes a string interpreted as a suite of Python statements, or a code object. The code supplied must be valid as file input in both cases. exec has the following signature: exec(object[, globals[, locals]]).
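A minimal sketch; unlike eval, exec accepts whole statements:

```python
>>> exec("for i in range(3): print(i)")
0
1
2
```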
In all instances, if optional arguments are omitted, the code is executed in the current scope. If only the globals argument is provided, it has to be a dictionary, which is used for both the global and the local variables. If globals and locals are given, they are used for the global and local variables, respectively. If provided, the locals argument can be any mapping object. If the globals dictionary does not contain a value for the key __builtins__, a reference to the dictionary of the built-in module builtins is inserted under that key. One can control the built-ins that are available to the executed code by inserting a custom __builtins__ dictionary into globals before passing it to exec(), thus creating a sandbox.
5. Objects 201
Python objects are the basic abstraction over data in Python; every value is an object in Python. Every object has an identity, a type and a value. An object's identity never changes once it has been created. The id(obj) function returns an integer representing the object's identity. The is operator compares the identity of two objects, returning a boolean. In CPython, the id() function returns an integer that is the memory location for the object, thus uniquely identifying the object. This is an implementation detail, and implementations of Python are free to return whatever value uniquely identifies objects within the interpreter.
The type() function returns an object's type; the type of an object is also an object itself. An object's type is also normally unchangeable. An object's type determines the operations that the object supports and also defines the possible values for objects of that type. Python is a dynamic language because types are not associated with variables, so a variable, x, may refer to a string and later refer to an integer as shown in the following example.
x = 1
x = "Nkem"
However, Python, unlike dynamic languages such as Javascript, is strongly typed, because the interpreter will never change the type of an object. This means that actions such as adding a string to a number will cause an exception in Python, as shown in the following snippet:
>>> x = "Nkem"
>>> x + 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Can't convert 'int' object to str implicitly
This is unlike Javascript where the above succeeds because the interpreter implicitly converts the
integer to a string then adds it to the supplied string.
Python objects are one of the following:
1. Mutable objects: These refer to objects whose value can change. For example, a list is a mutable data structure, as we can grow or shrink the list at will.
>>> x = [1, 2, 4]
>>> y = [5, 6, 7]
>>> x = x + y
>>> x
[1, 2, 4, 5, 6, 7]
Programmers new to Python from other languages may find some behaviors of mutable objects puzzling. Python is a pass-by-object-reference language, which means that the values passed to function or method calls are object references, and names bound to variables hold such reference values. For example, consider the snippets shown in the following example.
>>> x = [1, 2, 3]
>>> x
[1, 2, 3]
# now x and y refer to the same list
>>> y = x
# a change to x will also be reflected in y
>>> x.extend([4, 5, 6])
>>> y
[1, 2, 3, 4, 5, 6]
y and x refer to the same object so a change to x is reflected in y. To fully understand why this is
so, it must be noted that the variable, x does not actually hold the list, [1, 2, 3], rather it holds
a reference that points to the location of that object so when the variable, y is bound to the value
contained in x, it now also contains the reference to the original list, [1, 2, 3]. Any operation on
x finds the list that x refers to and carries out the operation on the list; y also refers to the same list
thus the change is also reflected in the variable, y.
1. Immutable objects: These objects have values that cannot be changed. A tuple is an example of an immutable data structure because once created, we cannot change the constituent objects, as shown below:
>>> x = (1, 2, 3, 4)
>>> x[0]
1
>>> x[0] = 10
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>>
However, if an immutable object contains a mutable object, the contained mutable object can have its value changed. For example, a tuple is an immutable data structure; however, if a tuple contains a list object, a mutable object, then we can change the value of the list object as shown in the following snippet.
```python
>>> t = [1, 2, 3, 4]
>>> x = t,
>>> x
([1, 2, 3, 4],)
>>> x[0]
[1, 2, 3, 4]
>>> x[0].append(10)
>>> x
([1, 2, 3, 4, 10],)
>>>
```
5.1 Strong and Weak Object References
Two kinds of references, strong and weak references, exist in Python, but when discussing references, it is almost certainly the strong reference that is being referred to. The previous example, for instance, has three references, and these are all strong references. The defining characteristic of a strong reference in Python is that whenever a new strong reference is created, the reference count for the referenced object is incremented by 1. This means that the garbage collector will never collect an object that is strongly referenced, because the garbage collector collects only objects that have a reference count of 0. Weak references, on the other hand, do not increase the reference count of the referenced object. Weak referencing is provided by the weakref module. The following snippet shows weak referencing in action.
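A minimal sketch consistent with the proxy example that follows; the Foo class and the memory addresses are illustrative:

```python
>>> import weakref
>>> class Foo:
...     pass
...
>>> a = Foo()            # 'a' is a strong reference
>>> b = a                # 'b' is a second strong reference to the same object
>>> c = weakref.ref(a)   # 'c' weakly references the object
>>> c
<weakref at 0x10138ba08; to 'Foo' at 0x1012d6828>
>>> c() is a             # calling the weak reference returns the object
True
```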
The weakref.ref function returns an object that, when called, returns the weakly referenced object. The weakref module provides the weakref.proxy alternative to the weakref.ref function for creating weak references. This method creates a proxy object that can be used just like the original object without the need for a call, as shown in the following snippet.
>>> d = weakref.proxy(a)
>>> d
<weakproxy at 0x10138ba98 to Foo at 0x1012d6828>
>>> d.__dict__
{}
When all the strong references to an object have been deleted, the weak reference loses its reference to the original object and the object is ready for garbage collection. This is shown in the following example.
>>> del a
>>> del b
>>> d
<weakproxy at 0x10138ba98 to NoneType at 0x1002040d0>
>>> d.__dict__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ReferenceError: weakly-referenced object no longer exists
>>> c()
>>> c().__dict__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute '__dict__'
5.2 The Type Hierarchy
None Type
The None type has a single value, and the singleton object with this value is accessed through the built-in name None. It is used to signify the absence of a value in many situations, e.g., it is returned by functions that don't explicitly return a value, as illustrated below:
```python
>>> def print_name(name):
...     print(name)
...
>>> name = print_name("nkem")
nkem
>>> name
>>> type(name)
<class 'NoneType'>
>>>
```
NotImplemented Type
The NotImplemented type is another singleton type with a single value. The value of this object is accessed through the built-in name NotImplemented. This object should be returned when we want to delegate the search for the implementation of a method to the interpreter, rather than throwing a runtime NotImplementedError exception. For example, consider the two types, Foo and Bar, below:
class Foo:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Foo):
            print('Comparing an instance of Foo with another instance of Foo')
            return other.value == self.value
        elif isinstance(other, Bar):
            print('Comparing an instance of Foo with an instance of Bar')
            return other.value == self.value
        print('Could not compare an instance of Foo with the other class')
        return NotImplemented


class Bar:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Bar):
            print('Comparing an instance of Bar with another instance of Bar')
            return other.value == self.value
        print('Could not compare an instance of Bar with the other class')
        return NotImplemented
When an attempt is made at comparisons, the effect of returning NotImplemented can be clearly observed. In Python, a == b results in a call to a.__eq__(b). In this example, instances of Foo and Bar have implementations for comparing themselves with other instances of the same class, for example:
>>> f = Foo(1)
>>> b = Bar(1)
>>> f == b
Comparing an instance of Foo with an instance of Bar
True
>>> f == f
Comparing an instance of Foo with another instance of Foo
True
>>> b == b
Comparing an instance of Bar with another instance of Bar
True
>>>
What actually happens when we compare f with b? The implementation of __eq__() in Foo checks that the other argument is an instance of Bar and handles it accordingly, returning a value of True:
>>> f == b
Comparing an instance of Foo with an instance of Bar
True
If b is compared with f, then b.__eq__(f) is invoked and the NotImplemented object is returned because the implementation of __eq__() in Bar only supports comparison with Bar instances. However, it can be seen in the following snippet that the comparison operation actually succeeds; what has happened?
>>> b == f
Could not compare an instance of Bar with the other class
Comparing an instance of Foo with an instance of Bar
True
>>>
The call to the b.__eq__(f) method returned NotImplemented, causing the Python interpreter to invoke the __eq__() method in Foo, and since a comparison between Foo and Bar is defined in the implementation of the __eq__() method in Foo, the correct result, True, is returned.
The NotImplemented object has a truth value of true:
>>> bool(NotImplemented)
True
Ellipsis Type
This is another singleton type that has a single value. The value of this object is accessed through the literal ... or the built-in name Ellipsis. The truth value of the Ellipsis object is true. The Ellipsis object is mainly used in numeric Python for indexing and slicing matrices. The numpy documentation, http://docs.scipy.org/doc/numpy/user/basics.indexing.html, provides more insight into how the Ellipsis object is used.
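A minimal sketch of the literal and its truth value:

```python
>>> x = ...
>>> x
Ellipsis
>>> bool(x)
True
```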
Numeric Type
Numeric types are otherwise referred to as numbers. Numeric objects are immutable thus once
created their value cannot be changed. Python numbers fall into one of the following categories:
1. Integers: These represent elements from the set of positive and negative integers. These fall
into one of the following types:
1. Plain integers: These are numbers in the range of -2147483648 through 2147483647 on a 32-bit machine; the range is dependent on the machine word size. Long integers are returned when results of operations fall outside the range of plain integers, and in some cases the exception OverflowError is raised. For the purpose of shift and mask operations, integers are assumed to have a binary, 2's complement notation using 32 or more bits, and hiding no bits from the user.
2. Long integers: Long integers are used to hold integer values that are as large as the virtual
memory on a system can handle. This is illustrated in the following example.
>>> 238**238
422003234274091507517421795325920182528086611140712666297183769
390925685510755057402680778036236427150019987694212157636287196
316333783750877563193837256416303318957733860108662430281598286
073858990878489423027387093434036402502753142182439305674327314
588077348865742839689189553235732976315624152928932760343933360
660521328084551181052724703073395502160912535704170505456773718
101922384718032634785464920586864837524059460946069784113790792
337938047537052436442366076757495221197683115845225278869129420
5907022278985117566190920525466326339246613410508288691503104L
It is important to note that from the perspective of a user, there is no difference between plain and long integers, as any conversions are done under the covers by the interpreter.
3. Booleans: These represent the truth values False and True. The Boolean type is a subtype of plain integers. The False and True Boolean values behave like the 0 and 1 values respectively, except when converted to a string, in which case the strings 'False' or 'True' are returned respectively. For example:
>>> x = 1
>>> y = True
>>> x + y
2
>>> a = 1
>>> b = False
>>> a + b
1
>>> b == 0
True
>>> y == 1
True
>>>
>>> str(True)
'True'
>>> str(False)
'False'
2. Float: These represent machine-level double precision floating point numbers. The underlying machine architecture and the specific Python implementation determine the accepted range and the handling of overflow; CPython will be limited by the underlying C language while Jython will be limited by the underlying Java language.
3. Complex Numbers: These represent complex numbers as a pair of machine-level double precision floating point numbers. The same caveats apply as for floating point numbers. Complex numbers can be created using the complex built-in as shown in the following example.
>>> complex(1,2)
(1+2j)
>>>
Complex numbers can also be created by suffixing a number literal with j. For instance, the previous complex number example can be created by the expression 1+2j. The real and imaginary parts of a complex number z can be retrieved through the read-only attributes z.real and z.imag.
Sequence Type
Sequence types are finite, ordered collections of objects that can be indexed by integers; using negative indices in Python is legal. Sequences fall into two categories - mutable and immutable sequences.
1. Immutable sequences: An immutable sequence type object is one whose value cannot change once it is created. This means that the collection of objects that are directly referenced by an immutable sequence is fixed. The collection of objects referenced by an immutable sequence may be composed of mutable objects whose values may change at runtime, but the mutable object itself that is directly referenced by an immutable sequence cannot be changed. For example, a tuple is an immutable sequence, but if one of the elements in the tuple is a list, a mutable sequence, then the list can change, but the reference to the list object that the tuple holds cannot be changed, as shown below:
>>> t = [1, 2, 3], "obi", "ike"
>>> type(t)
<class 'tuple'>
>>> t[0].append(4) # mutate the list
>>> t
([1, 2, 3, 4], 'obi', 'ike')
>>> t[0] = [] # attempt to change the reference in tuple
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> b = "abc".encode("utf-16")
>>> b.decode("utf-16")
'abc'
>>> type(b.decode("utf-16"))
<class 'str'>
3. Tuple: A tuple is a sequence of arbitrary Python objects. Tuples of two or more items are formed by comma-separated lists of expressions. A tuple of one item is formed by affixing a comma to an expression, while an empty tuple is formed by an empty pair of parentheses. This is illustrated in the following example.
>>> names = "Obi",
>>> names
('Obi',)
>>> type(names)
<class 'tuple'>
# tuple of 1
# tuple of 2 or more
4. Mutable sequences: A mutable sequence type is one whose value can change after it has been created. There are currently two built-in mutable sequence types - byte arrays and lists.
1. Byte Arrays: Bytearray objects are mutable arrays of bytes. Byte arrays are created using
the built-in bytearray() constructor. Apart from being mutable and thus unhashable,
byte arrays provide the same interface and functionality as immutable byte objects.
Bytearrays are very useful when the efficiency offered by their mutability is required.
For example, when receiving an unknown amount of data over a network, byte arrays
are more efficient because the array can be extended as more data is received without
having to allocate new objects as would be the case if the immutable byte type was used.
2. Lists: Lists are sequences of arbitrary Python objects. Lists are formed by placing a comma-separated list of expressions in square brackets. The empty list is formed with the empty square brackets, []. A list can be created from any iterable by passing such an iterable to the list method. The list data structure is one of the most widely used data types in Python.
Some operations are common to all sequence types. These are described in the following table, where x is an object, s and t are sequences, and n, i, j and k are integers.
| Operation | Result |
| --- | --- |
| x in s | True if an item of s is equal to x, else False |
| x not in s | False if an item of s is equal to x, else True |
| s + t | the concatenation of s and t |
| s * n or n * s | n shallow copies of s concatenated |
| s[i] | the ith item of s, origin 0 |
| s[i:j] | the slice of s from i to j |
| s[i:j:k] | the slice of s from i to j with step k |
| len(s) | the length of s |
| min(s) | the smallest item of s |
| max(s) | the largest item of s |
| s.index(x[, i[, j]]) | the index of the first occurrence of x in s (at or after index i and before index j) |
| s.count(x) | the total number of occurrences of x in s |
Note
1. Values of n that are less than 0 are treated as 0, and this yields an empty sequence of the same type as s, as shown below:
>>> x = "obi"
>>> x*-2
''
2. Copies made from using the * operation are shallow copies; any nested structures are not
copied. This can result in some confusion when trying to create copies of a structure such as
a nested list.
>>> lists = [[]] * 3 # shallow copy
>>> lists
[[], [], []] # all three copies reference the same list
>>> lists[0].append(3)
>>> lists
[[3], [3], [3]]
To avoid shallow copies when dealing with nested lists, the following method can be adopted
```python
>>> lists = [[] for i in range(3)]
>>> lists[0].append(3)
>>> lists[1].append(5)
>>> lists[2].append(7)
>>> lists
[[3], [5], [7]]
```
3. When i or j is negative, the index is relative to the end of the sequence, thus len(s) + i or len(s) + j is substituted for the negative value of i or j.
4. Concatenating immutable sequences such as strings always results in a new object for
example:
>>> name = "Obi"
>>> id(name)
4330660336
>>> name += "Obi" + " Ike-Nwosu"
>>> id(name)
4330641208
Python defines the interfaces (that's the closest word that can be used), Sequence and MutableSequence, in the collections library, and these define all the methods a type must implement to be considered a mutable or immutable sequence; when abstract base classes are discussed, this concept will become much clearer.
Set
These are unordered, finite collections of unique Python objects. Sets are unordered, so they cannot be indexed by integers. The members of a set must be hashable, so only immutable objects can be members of a set. This is so because sets in Python are implemented using a hash table; a hash table uses some kind of hash function to compute an index into a slot. If a mutable value is used, then the index calculated will change when this object changes; thus mutable values are not allowed in sets. Sets provide efficient solutions for membership testing, de-duplication, and computing intersections, unions and differences. Sets can be iterated over, and the built-in function len() returns the number of items in a set. There are currently two intrinsic set types: the mutable set type and the immutable frozenset type. Both have a number of common methods that are shown in the following table.
| Method | Description |
| --- | --- |
| len(s) | returns the number of elements in the set s |
| x in s | tests x for membership in s |
| x not in s | tests x for non-membership in s |
| isdisjoint(other) | returns True if the set has no elements in common with other |
1. Frozen set: This represents an immutable set. A frozen set is created by the built-in frozenset() constructor. A frozenset is immutable and thus hashable, so it can be used as an element of another set, or as a dictionary key.
2. Set: This represents a mutable set and it is created using the built-in set() constructor. The mutable set is not hashable and cannot be part of another set. A non-empty set can also be created using a set literal such as {1, 2, 3}; note that empty braces {} create a dictionary, not a set. Methods unique to the mutable set include:
| Method | Description |
| --- | --- |
| add(elem) | adds elem to the set |
| update(*others) | updates the set, adding elements from all others |
| remove(elem) | removes elem from the set; raises KeyError if elem is not present |
| discard(elem) | removes elem from the set if it is present |
| pop() | removes and returns an arbitrary element; raises KeyError if the set is empty |
| clear() | removes all elements from the set |
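A minimal sketch of the common set operations:

```python
>>> a = {1, 2, 3}
>>> b = {3, 4}
>>> a & b                 # intersection
{3}
>>> a | b                 # union
{1, 2, 3, 4}
>>> a - b                 # difference
{1, 2}
>>> frozenset(b) in {frozenset(b)}   # a frozenset is hashable, so it can be a set member
True
```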
Mapping
A Python mapping is a finite set of objects (values) indexed by a set of immutable Python objects (keys). The keys in a mapping must be hashable for the same reason given previously when describing set members, thus eliminating mutable types like lists, sets and mappings. The expression a[k] selects the item indexed by the key, k, from the mapping a, and can be used in assignments or del statements. The dictionary, mostly called dict for convenience, is the only intrinsic mapping type built into Python:
1. Dictionary: Dictionaries can be created by placing a comma-separated sequence of key: value pairs within braces, for example {'name': "obi", 'age': 18}, or by the dict() constructor. The main operations supported by the dictionary type are the addition, deletion and selection of values using a given key. When adding a key that is already in use within a dict, the old value associated with that key is forgotten. Attempting to access a value with a non-existent key will result in a KeyError exception. Dictionaries are perhaps one of the most important types within the interpreter. Without explicitly making use of a dictionary, the interpreter is already using them in a number of different places. For example, name-spaces (discussed in the glossary intermezzo) in Python are implemented using dictionaries; this means that every time a symbol is referenced within a program, a dictionary access occurs. Objects are layered on dictionaries in Python; all attributes of Python objects are stored in a dictionary attribute, __dict__. These are but a few applications of this type within the Python interpreter.
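A minimal sketch of the __dict__ attribute; the Point class is illustrative:

```python
>>> class Point:
...     def __init__(self, x, y):
...         self.x = x
...         self.y = y
...
>>> p = Point(1, 2)
>>> p.__dict__        # instance attributes live in a dictionary
{'x': 1, 'y': 2}
```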
Python supplies more advanced forms of the dictionary type in its collections library. These are the OrderedDict, which introduces order into a dictionary, thus remembering the order in which items were inserted, and the defaultdict, which takes a factory function that is called to produce a value when a key is missing. If a key is missing from a defaultdict instance, the factory function is called to produce a value for the key, the dictionary is updated with this key-value pair, and the created value is returned. For example:
```python
>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> d
defaultdict(<class 'int'>, {})
>>> d[7]
0
>>> d
defaultdict(<class 'int'>, {7: 0})
```
Callable Types
These are types that support the function call operation. The function call operation is the use of ()
after the type name. In the example below, the function is print_name and the function call is when
the () is appended to the function name as such print_name().
def print_name(name):
    print(name)

Functions are not the only callable types in python; any object type that implements the __call__ special method is a callable type. The built-in function callable(obj) can be used to check whether a given object
is callable. The following are built-in callable python types:
1. User-defined functions: these are functions that a user defines with the def statement such as
the print_name function from the previous section.
2. Methods: these are functions defined within a class and accessible within the scope of the class
or a class instance. These methods could either be instance methods, static or class methods.
3. Built-in functions: These are functions available within the interpreter core such as the len
function.
4. Classes: Classes are also callable types. The process of creating a class instance involves calling
the class such as Foo().
Each of the above types is covered in detail in subsequent chapters.
Custom Type
Custom types are created using the class statement. Custom class objects have a type of type.
These are types created by user-defined programs and they are discussed in the chapter on object
oriented programming.
Module Type
A module is one of the organizational units of Python code, just like functions or classes. A
module is also an object, just like every other value in python. Module objects are created
by the import system, as invoked either by the import statement or by calling functions such as
importlib.import_module() and the built-in __import__().
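A short interactive session shows module objects being produced by the import machinery:

>>> import importlib
>>> m = importlib.import_module("math")
>>> type(m)
<class 'module'>
>>> m.sqrt(4)
2.0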
File/IO Types
A file object represents an open file. Files are created using the open built-in function (see
https://docs.python.org/3/library/functions.html#open), which opens and returns a file object on the
local file system; the file object can be opened in either binary or text mode. Other methods for
creating file objects include:

1. os.fdopen, which takes a file descriptor and creates a file object from it. The os.open method,
not to be confused with the open built-in function, is used to create a file descriptor that can then
be passed to os.fdopen to create a file object, as shown in the following example.
>>> import os
>>> fd = os.open("test.txt", os.O_RDWR|os.O_CREAT)
>>> type(fd)
<class 'int'>
>>> fd
3
>>> fo = os.fdopen(fd, "w")
>>> fo
<_io.TextIOWrapper name=3 mode='w' encoding='UTF-8'>
>>> type(fo)
<class '_io.TextIOWrapper'>
Built-in Types
These are objects used internally by the python interpreter but accessible by a user program. They
include traceback objects, code objects, frame objects and slice objects.
Code Objects
Code objects represent compiled executable Python code, or bytecode. A code object holds the bytecode
for the python virtual machine along with everything that is necessary for the execution of the
bytecode it represents. Code objects are normally created when a block of code is compiled. This
executable piece of code can only be executed using the exec() or eval() built-in functions. To give a
concrete understanding of code objects, we define a very simple function below and dissect its code object.

def return_author_name():
    return "obi Ike-Nwosu"
The code object for the above function can be obtained from the function object by accessing its
__code__ attribute as shown below:
>>> return_author_name.__code__
<code object return_author_name at 0x102279270, file "<stdin>", line 1>
We can go further and inspect the code object using the dir function to see the attributes of the code
object.
>>> dir(return_author_name.__code__)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
'__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__',
'__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
'__str__', '__subclasshook__', 'co_argcount', 'co_cellvars', 'co_code', 'co_consts',
'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_kwonlyargcount',
'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']
Of particular interest to us at this point are the non-special attributes, that is, attributes that do
not start with an underscore. We give a brief description of each of these in the following table.
Attribute         Description
co_argcount       the number of positional arguments the code object takes
co_code           the string of raw compiled bytecode
co_consts         a tuple of the constants used in the bytecode
co_filename       the name of the file in which the code object was created
co_firstlineno    the first source line number of the code object
co_flags          an integer encoding a number of interpreter flags for the code object
co_lnotab         an encoded mapping of bytecode offsets to line numbers
co_name           the name with which the code object was defined
co_names          a tuple of the names used by the bytecode
co_nlocals        the number of local variables used (including arguments)
co_stacksize      the required virtual machine stack size for the code object
co_varnames       a tuple of the names of arguments and local variables
We can view the bytecode string for the function using the co_code attribute of the code object as
shown below.
>>> return_author_name.__code__.co_code
b'd\x01\x00S'
The returned bytecode is, however, basically of no use to someone investigating code objects. This is
where the python dis module comes into play. The dis module can be used to generate a human
readable version of the code object. We use the dis function from the dis module to disassemble the
code object for the return_author_name function.
>>> import dis
>>> dis.dis(return_author_name)
  2           0 LOAD_CONST               1 ('obi Ike-Nwosu')
              3 RETURN_VALUE
The above shows the human readable version of the python code object. The LOAD_CONST
instruction reads a value from the co_consts tuple and pushes it onto the top of the stack (the
CPython interpreter is a stack based virtual machine). The RETURN_VALUE instruction pops the top
of the stack and returns it to the calling scope, signalling the end of the execution of that python
code block.
Code objects serve a number of purposes while programming. They contain information that can aid
in interactive debugging while programming and can provide us with readable tracebacks during
an exception.
Frame Objects
Frame objects represent execution frames. Python code blocks are executed in execution frames. The
call stack of the interpreter stores information about currently executing subroutines and the call
stack is made up of stack frame objects. Frame objects on the stack have a one-to-one mapping with
subroutine calls by the program executing or the interpreter. The frame object contains code objects
and all necessary information, including references to the local and global name-spaces, necessary
for the runtime execution environment. The frame objects are linked together to form the call stack.
To simplify how this all fits together, the call stack can be thought of as a stack data structure
(it actually is one): every time a subroutine is called, a frame object is created and pushed onto the
stack, and then the code object contained within the frame is executed. Some special read-only
attributes of frame objects include:

1. f_back refers to the previous stack frame (towards the caller), or None if this is the bottom stack
frame.
2. f_code is the code object being executed in this frame.
3. f_locals is the dictionary used to look up local variables.
4. f_globals is the dictionary used to look up global variables.
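The following CPython-specific sketch uses sys._getframe, which returns the frame at the top of the call stack, to peek at some of these attributes:

>>> import sys
>>> def outer():
...     def inner():
...         frame = sys._getframe()
...         print(frame.f_code.co_name)         # code object executing in this frame
...         print(frame.f_back.f_code.co_name)  # the caller's frame
...     inner()
...
>>> outer()
inner
outer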
Traceback Objects

Traceback objects represent the stack trace of an exception. Some special read-only attributes of
traceback objects include:

Attribute    Description
tb_next      the next level in the stack trace (towards the frame where the exception occurred), or None if there is no next level
tb_frame     points to the execution frame of the current level
tb_lineno    gives the line number where the exception occurred
tb_lasti     indicates the precise instruction. The line number and last instruction in the traceback may differ from the line number of its frame object if the exception occurred in a try statement with no matching except clause or with a finally clause.
Slice Objects
Slice objects represent slices for the __getitem__() methods of sequence-like objects (more on special
methods such as __getitem__() in the chapter on object oriented programming). Slice objects return
a subset of the sequence they are applied to, as shown below.
>>> t = [i for i in range(10)]
>>> t
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> t[:10:2]
[0, 2, 4, 6, 8]
Slice objects are also created by the built-in slice(start, stop[, step]) function. The returned object
can be used inside square brackets just like a regular slice.
>>> t = [i for i in range(10)]
>>> t
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> t[:10:2]
[0, 2, 4, 6, 8]
>>> s = slice(None, 10, 2)
>>> s
slice(None, 10, 2)
>>> t[s]
[0, 2, 4, 6, 8]
Attribute   Description
start       the lower bound of the slice
stop        the upper bound of the slice
step        the stride of the slice
Each of the optional attributes is None if omitted. Slices can take a number of forms in addition to
the standard slice(start, stop[, step]). Other forms include:

a[start:end]   # items start through end-1
a[start:]      # items start through the rest of the sequence
a[:end]        # items from the beginning through end-1
a[:]           # a shallow copy of the whole sequence
The start or end values may also be negative, in which case counting is from the end of the sequence as
shown below:

a[-1]    # the last item in the sequence
a[-2:]   # the last two items in the sequence
a[:-2]   # everything except the last two items
Slice objects support a single method:

1. slice.indices(self, length): This method takes a single integer argument, length, and returns a
tuple of three integers, (start, stop, stride), that indicates how the slice would apply to a
sequence of the given length. The start and stop values are the actual indices the slice would
use in a sequence of that length. An example is shown below:
```python
>>> s = slice(10, 30, 1)
# applying slice(10, 30, 1) to sequence of length 100 gives [10:30]
>>> s.indices(100)
(10, 30, 1)
# applying slice(10, 30, 1) to sequence of length 15 gives [10:15]
>>> s.indices(15)
(10, 15, 1)
# applying slice(10, 30, 1) to sequence of length 1 gives [1:1]
>>> s.indices(1)
(1, 1, 1)
>>> s.indices(0)
(0, 0, 1)
```
Generator Objects
Generator objects are created by the invocation of generator functions; these are functions that use
the yield keyword. This type is discussed in detail in the chapter on Sequences and Generators.
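A minimal example shows that calling a generator function produces a generator object rather than running its body:

>>> def count_up_to(n):
...     i = 1
...     while i <= n:
...         yield i
...         i += 1
...
>>> g = count_up_to(3)
>>> type(g)
<class 'generator'>
>>> next(g)
1
>>> next(g)
2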
With a strong understanding of the built-in type hierarchy, the stage is now set for examining object
oriented programming and how users can create their own type hierarchy and even make such types
behave like built-in types.
Class definitions introduce class objects, instance objects and method objects.
Class Objects
The execution of a class statement creates a class object. At the start of the execution of a class
statement, a new name-space is created and this serves as the name-space into which all class
attributes go; unlike languages such as Java, this name-space does not create a new local scope that can
be used by class methods, hence the need for fully qualified names when accessing class attributes. The
Account class from the previous section illustrates this; a method trying to access the num_accounts
variable must use the fully qualified name, Account.num_accounts, or an error results, as happens when
the fully qualified name is not used in the __init__ method shown below:
class Account(object):
    num_accounts = 0

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        num_accounts += 1

    def del_account(self):
        Account.num_accounts -= 1

    def deposit(self, amt):
        self.balance = self.balance + amt

    def withdraw(self, amt):
        self.balance = self.balance - amt

    def inquiry(self):
        return self.balance

>>> acct = Account('obi', 10)
Traceback (most recent call last):
  File "python", line 1, in <module>
  File "python", line 9, in __init__
UnboundLocalError: local variable 'num_accounts' referenced before assignment
At the end of the execution of a class statement, a class object is created; the scope preceding the
class definition is reinstated, and the class object is bound in this scope to the class name given in
the class definition header.
A little diversion here: one may ask, if the class created is an object, then what is the class of the class
object? In accordance with the Python philosophy that every value is an object, the class object does
indeed have a class from which it is created; this is the type class.
>>> type(Account)
<class 'type'>
So, just to confuse you a bit, the type of a type, the Account type, is type. To get a better understanding
of the fact that a class is indeed an object with its own class, we go behind the scenes to explain what
really goes on during the execution of a class statement, using the Account example from above.
>>> class_name = "Account"
>>> class_parents = (object,)
>>> class_body = """
num_accounts = 0

def __init__(self, name, balance):
    self.name = name
    self.balance = balance
    Account.num_accounts += 1

def del_account(self):
    Account.num_accounts -= 1

def deposit(self, amt):
    self.balance = self.balance + amt

def withdraw(self, amt):
    self.balance = self.balance - amt

def inquiry(self):
    return self.balance
"""
# a new dict is used as the local name-space
>>> class_dict = {}
# the body of the class is executed using the dict from above as the local
# name-space
>>> exec(class_body, globals(), class_dict)
# viewing the class dict reveals the name bindings from the class body
>>> class_dict
{'del_account': <function del_account at 0x106be60c8>, 'num_accounts': 0, 'inquiry': <function inquiry at 0x106beac80>, 'deposit': <function deposit at 0x106be66e0>, 'withdraw': <function withdraw at 0x106be6de8>, '__init__': <function __init__ at 0x106be2c08>}
# the final step of class creation
>>> Account = type(class_name, class_parents, class_dict)
# view the created class object
>>> Account
<class '__main__.Account'>
>>> type(Account)
<class 'type'>
During the execution of a class statement, the interpreter carries out the following steps behind the
scenes:

1. The body of the class statement is isolated in a string.
2. A class dictionary representing the name-space for the class is created.
3. The body of the class is executed as a set of statements within this name-space.
4. During the final step, the class object is created by instantiating the type class, passing in
the class name, base classes and class dictionary as arguments. The type class used here in
creating the Account class object is a meta-class, the class of a class. The meta-class used
in the class object creation can be explicitly specified by supplying the metaclass keyword
argument in the class definition. If it is not supplied, the class statement
examines the first entry in the tuple of base classes, if any. If no base classes are used,
the global variable __metaclass__ is searched for, and if no value is found for this, Python
uses the default meta-class. More about meta-classes is discussed in subsequent chapters.
Class objects support attribute references and object instantiation. Attributes are referenced using the
standard dot syntax: an object followed by a dot and then the attribute name, obj.name. Valid attribute
names are all the variable and method names present in the class name-space when the class object
was created. For example:

>>> Account.num_accounts
0
>>> Account.deposit
<function Account.deposit at 0x101a3e730>
Object instantiation is carried out by calling the class object like a normal function with required
parameters for the __init__ method of the class as shown in the following example:
>>> Account("obi", 0)
An instance object, initialized with the supplied arguments, is returned from the instantiation
of a class object. In the case of the Account class, the account name and account balance are set, and
the number of instances is incremented by 1 in the __init__ method.
Instance Objects
If class objects are the cookie cutters, then instance objects are the cookies that result from
instantiating class objects. Instance objects are returned after the correct initialization of a class,
just as shown in the previous section. Attribute references are the only operations that are valid on
instance objects. Instance attributes are either data attributes, better known as instance variables in
languages like Java, or method attributes.
Method Objects
If x is an instance of the Account class, then x.deposit is an example of a method object. Method objects
are similar to functions; however, during a method definition, an extra argument is included in the
argument list, the self argument. This self argument refers to an instance of the class, but why do
we have to pass an instance as an argument to a method? This is best illustrated by a method call
such as the following.
>>> x = Account('obi', 10)
>>> x.inquiry()
10
But what exactly happens when an instance method is called? It can be observed that
x.inquiry() is called without an argument above, even though the method definition for inquiry()
requires the self argument. What happened to this argument?

In the example above, the call to x.inquiry() is exactly equivalent to Account.inquiry(x);
notice that the object instance, x, is being passed as an argument to the method; this is the
self argument. Invoking a method on an object instance with an argument list is equivalent to invoking
the corresponding function from the object's class with an argument list that is created by inserting
the instance object at the start of the argument list. To understand this, note that methods
are stored as plain functions in the class dict.
>>> type(Account.inquiry)
<class 'function'>
To fully understand how this transformation takes place, one has to understand descriptors and
Python's attribute reference algorithm. These are discussed in subsequent sections of this chapter.
In summary, the method object is a wrapper around a function object; when the method object
is called with an argument list, a new argument list is constructed from the instance object and
the original argument list, and the underlying function object is called with this new argument list. This
applies to all instance method objects, including the __init__ method. Note that self is actually
not a keyword; the name self is just a convention, and any valid argument name can be
used, as shown in the Account class definition below.
class Account(object):
    num_accounts = 0

    def __init__(obj, name, balance):
        obj.name = name
        obj.balance = balance
        Account.num_accounts += 1

    def del_account(obj):
        Account.num_accounts -= 1

    def deposit(obj, amt):
        obj.balance = obj.balance + amt

    def withdraw(obj, amt):
        obj.balance = obj.balance - amt

    def inquiry(obj):
        return obj.balance
>>> Account.num_accounts
0
>>> x = Account('obi', 0)
>>> x.deposit(10)
>>> Account.inquiry(x)
10
Attempting to perform the math.ceil operation in an __init__ method would cause the object initialization
to fail. The __new__ method can also be overridden to create a Singleton super class; subclasses of
this class can only ever have a single instance throughout the execution of a program. The following
example illustrates this.
class Singleton:
    def __new__(cls, *args, **kwds):
        it = cls.__dict__.get("__it__")
        if it is not None:    # an instance already exists, so return it
            return it
        cls.__it__ = it = object.__new__(cls)
        it.init(*args, **kwds)
        return it

    def init(self, *args, **kwds):
        pass
It is worth noting that an implementation of the __new__ method must call its base class's
__new__, and the implementation must return an object. Users are already familiar with defining
the __init__ method; the __init__ method is overridden to perform attribute initialization for an
instance of a mutable type.
Special methods for attribute access
The special methods in this category provide a means for customizing attribute references, whether
to access or to set an attribute. The special methods available for this include:

1. __getattr__: This method can be implemented to handle situations in which a referenced
attribute cannot be found. It is only called when an attribute that is referenced is
neither an instance attribute nor found in the class tree of the object. This method should
return some value for the attribute or raise an AttributeError exception. For example, if x
is an instance of the Account class defined above, trying to access an attribute that does not
exist will result in a call to this method, as shown in the following snippet.
class Account(object):
    num_accounts = 0

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def del_account(self):
        Account.num_accounts -= 1

    def __getattr__(self, name):
        return "Hey I don't see any attribute called {}".format(name)

    def deposit(self, amt):
        self.balance = self.balance + amt

    def withdraw(self, amt):
        self.balance = self.balance - amt

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)

>>> x = Account('obi', 0)
>>> x.balaance
"Hey I don't see any attribute called balaance"
Care should be taken with the implementation of __getattr__ because if the implementation
references an instance attribute that does not exist, an infinite loop may occur because the
__getattr__ method is called successively without end.
class Account(object):
    num_accounts = 0

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def del_account(self):
        Account.num_accounts -= 1

    def __getattr__(self, name):
        # referencing an attribute that does not exist causes __getattr__
        # to call itself over and over again
        return self.namee

    def deposit(self, amt):
        self.balance = self.balance + amt

    def withdraw(self, amt):
        self.balance = self.balance - amt

>>> x = Account('obi', 0)
>>> x.balaance # results in a RuntimeError: maximum recursion depth exceeded while calling a Python object
2. __getattribute__: This method is implemented to customize attribute access for a class.
Unlike __getattr__, it is called unconditionally on every attribute access for instances of the class.
3. __setattr__: This method is implemented to unconditionally handle all attribute assignments.
An implementation of __setattr__ should insert the value being assigned into the dictionary of
instance attributes rather than using self.name = value, which results in an infinite recursive
call. The base class method of the same name should be called instead, as in
super().__setattr__(name, value).
4. __delattr__: This is implemented to customize attribute deletion; it is invoked whenever
del obj.name is called.
5. __dir__: This is implemented to customize the list of object attributes returned by a call to
dir(obj).
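As a brief sketch of the correct pattern for __setattr__, the following illustrative class logs every assignment while delegating the actual storage to the base class implementation, avoiding the infinite recursion mentioned above:

class LoggedAccount:
    def __setattr__(self, name, value):
        print("setting {} = {}".format(name, value))
        # delegate to the base class rather than writing self.name = value,
        # which would call this method again recursively
        super().__setattr__(name, value)

>>> acct = LoggedAccount()
>>> acct.balance = 10
setting balance = 10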
Numeric types are emulated by implementing the arithmetic special methods. Some of these are listed in
the following table.

Special Method                  Description
a.__add__(self, b)              binary addition, a + b
a.__sub__(self, b)              binary subtraction, a - b
a.__mul__(self, b)              binary multiplication, a * b
a.__truediv__(self, b)          division of a by b, a / b
a.__floordiv__(self, b)         truncating division of a by b, a // b
a.__mod__(self, b)              a modulo b, a % b
a.__divmod__(self, b)           returns the pair (a // b, a % b)
a.__pow__(self, b[, modulo])    a raised to the bth power, a ** b
Python has the concept of reflected operations; this was covered in the section on the
NotImplemented type in the previous chapter. The idea behind this concept is that if the left operand
of a binary arithmetic operation does not support a required operation and returns NotImplemented,
then an attempt is made to call the corresponding reflected operation on the right operand, provided
the types of both operands differ. An example of this rarely used functionality is shown in the following
trivial example for emphasis.
class MyNumber(object):
    def __init__(self, x):
        self.x = x

    def __str__(self):
        return str(self.x)

>>> 10 - MyNumber(9) # the int type does not know how to subtract a MyNumber, and MyNumber does not handle the operation either
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'int' and 'MyNumber'
In the next snippet, the class implements the reflected special method, and this reflected method is
called by the interpreter.

class MyFixedNumber(MyNumber):
    def __rsub__(self, other): # reflected operation implemented
        return MyNumber(other - self.x)

>>> (10 - MyFixedNumber(9)).x
1
Special Method                   Description
a.__radd__(self, b)              reflected addition, b + a
a.__rsub__(self, b)              reflected subtraction, b - a
a.__rmul__(self, b)              reflected multiplication, b * a
a.__rtruediv__(self, b)          reflected division, b / a
a.__rfloordiv__(self, b)         reflected truncating division, b // a
a.__rmod__(self, b)              reflected modulo, b % a
a.__rdivmod__(self, b)           reflected divmod, divmod(b, a)
a.__rpow__(self, b[, modulo])    reflected power, b ** a
Another set of operators that work with numeric types are the augmented assignment operators.
An example of an augmented operation is shown in the following code snippet.
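For example, using += on a list invokes list.__iadd__, which mutates the list in place:

>>> a = [1, 2]
>>> a += [3]    # equivalent to a.__iadd__([3]) for types that implement __iadd__
>>> a
[1, 2, 3]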
A few of the special methods for implementing augmented arithmetic operations are listed in the
following table.

Special Method                   Description
a.__iadd__(self, b)              a += b
a.__isub__(self, b)              a -= b
a.__imul__(self, b)              a *= b
a.__itruediv__(self, b)          a /= b
a.__ifloordiv__(self, b)         a //= b
a.__imod__(self, b)              a %= b
a.__ipow__(self, b[, modulo])    a **= b
Container types such as lists, tuples and dicts are emulated by implementing the following special
methods, among others.

Special Method           Description
__len__(self)            returns the number of items in the container; invoked by len(obj)
__getitem__(self, key)   returns the item indexed by key; invoked by obj[key]
__iter__(self)           returns an iterator over the container; invoked by iter(obj)
Sequence types such as lists support the addition operator, +, for concatenating sequences and the
multiplication operator, *, for creating copies, by defining the methods __add__(), __radd__(),
__iadd__(), __mul__(), __rmul__() and __imul__(). Sequence types also implement the
__reversed__ method, which is used by the reversed() built-in for reverse iteration
over a sequence. User defined classes can implement these special methods to get the required
functionality, as the sketch below shows.
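The following sketch (the class name and behaviour are invented for illustration) implements just enough of the sequence protocol to support len(), indexing, iteration and reversed():

class Squares:
    """A read-only sequence of the first n square numbers."""
    def __init__(self, n):
        self.n = n

    def __len__(self):             # invoked by len(obj); also needed by reversed()
        return self.n

    def __getitem__(self, index):  # invoked by obj[index]; also drives iteration
        if not 0 <= index < self.n:
            raise IndexError("index out of range")
        return index * index

>>> s = Squares(4)
>>> len(s)
4
>>> s[3]
9
>>> list(s)            # iteration falls back to __getitem__ with 0, 1, 2, ...
[0, 1, 4, 9]
>>> list(reversed(s))  # reversed() uses __len__ and __getitem__
[9, 4, 1, 0]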
Emulating Callable Types
Callable types support the function call syntax, (args). Classes that implement the __call__(self[,
args...]) method are callable. User defined classes for which this functionality makes sense can
implement this method to make class instances callable. The following example shows a class
implementing the __call__(self[, args...]) method and how instances of this class can be called
using the function call syntax.
class Account(object):
    num_accounts = 0

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def __call__(self, arg):
        return "I was called with '{}'".format(arg)

    def del_account(self):
        Account.num_accounts -= 1

    def deposit(self, amt):
        self.balance = self.balance + amt

    def withdraw(self, amt):
        self.balance = self.balance - amt

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)
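An instance can then be called directly:

>>> acct = Account('obi', 10)
>>> acct('test')
"I was called with 'test'"
>>> callable(acct)
True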
Rich comparison methods implement the comparison operators for user defined classes.

Special Method        Description
a.__lt__(self, b)     a < b
a.__le__(self, b)     a <= b
a.__eq__(self, b)     a == b
a.__ne__(self, b)     a != b
a.__gt__(self, b)     a > b
a.__ge__(self, b)     a >= b
In Python, the truth of x == y does not imply that x != y is false, so __eq__() should be defined along
with __ne__() so that the operators are well behaved. __lt__() and __gt__() are each other's
reflection, as are __le__() and __ge__(), while __eq__() and __ne__() are their own reflection; this
means that if a call to the implementation of any of these methods on the left argument returns
NotImplemented, the reflected operator is used.
A class can define the __slots__ attribute to pre-declare the names of its instance attributes; instances
of such a class do without the per-instance __dict__ attribute and reject assignments to undeclared
names, as the following example shows.

class Account:
    """base class for representing user accounts"""
    # we could also use __slots__ = ("name", "balance")
    __slots__ = ['name', 'balance']
    num_accounts = 0

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def del_account(self):
        Account.num_accounts -= 1

    def __getattr__(self, name):
        """handle attribute reference for non-existent attribute"""
        return "Hey I don't see any attribute called {}".format(name)

    def deposit(self, amt):
        self.balance = self.balance + amt

    def withdraw(self, amt):
        self.balance = self.balance - amt

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)
>>> acct = Account("obi", 10)
>>> acct.__dict__ # the __dict__ attribute is gone
"Hey I don't see any attribute called __dict__"
>>> acct.x = 10
Traceback (most recent call last):
  File "acct.py", line 32, in <module>
    acct.x = 10
AttributeError: 'Account' object has no attribute 'x'
>>> acct.__slots__
['name', 'balance']
A few things that are worth noting about __slots__ include the following:

1. If a superclass has the __dict__ attribute, then using __slots__ in sub-classes is of no use, as
the dictionary is still available.
2. If __slots__ is used, then attempting to assign to a variable not listed in __slots__
will result in an AttributeError, as shown in the previous example.
3. Sub-classes will have a __dict__ even if they inherit from a base class with a __slots__
declaration; subclasses have to define their own __slots__ attribute, which must contain only
the additional names, in order to avoid having a __dict__ for storing names.
4. Subclasses of variable-length built-in types such as int, bytes and tuple cannot have a
non-empty __slots__ declaration.
Other special methods provide for the basic customization of classes; these include the following:

1. __bool__: This method implements truth value testing for a given class; it is invoked by
the built-in bool() operation and should return True or False. In the absence of an
implementation, __len__() is called if it is implemented, and the object's truth value
is considered True if the result of the call to __len__ is non-zero. If a class defines neither
__len__() nor __bool__(), all its instances are considered True.
2. __repr__ and __str__: These two closely related methods both return string
representations of a given object and differ only subtly in the intent behind them.
They are invoked by calls to the repr and str built-ins respectively. The __repr__ method
implementation should return an unambiguous string representation of the object it is
being called on. Ideally, the representation returned should be an expression that,
when evaluated by the eval function, returns the given object; when this is not possible,
the representation returned should be as unambiguous as possible. On the other hand,
__str__ exists to provide a human readable version of an object; a version that would make
sense to someone reading the output but who does not necessarily understand the semantics of
the language. A very good illustration of how both methods differ is shown below by calling
both methods on a datetime object.
>>> import datetime
>>> today = datetime.datetime.now()
>>> str(today)
'2015-07-05 20:55:58.642018' # human readable version of datetime object
>>> repr(today)
'datetime.datetime(2015, 7, 5, 20, 55, 58, 642018)' # eval will return the datetime object
When using string interpolation, %r makes a call to repr while %s makes a call to str.
3. __bytes__: This is invoked by a call to the bytes() built-in and should return a byte string
representation of an object; the returned value should be a bytes object.
4. __hash__: This is invoked by the hash() built-in. It is also used by operations on
types such as set, frozenset and dict that make use of object hash values. Providing a
__hash__ implementation for user defined classes is an involved and delicate act that should be
carried out with care, as will be seen. Immutable built-in types are hashable while mutable
types such as lists are not. For example, the hash of a small integer is the value of the number, as
shown in the following snippet.
>>> hash(1)
1
>>> hash(12345)
12345
>>>
User defined classes have a default hash value that is derived from their id() value. Any __hash__()
implementation must return an integer, and objects that compare equal must have the
same hash value, so for two objects a and b, if a == b then hash(a) == hash(b) must hold. A few
rules for implementing a __hash__() method include the following:

1. A class should only define the __hash__() method if it also defines the __eq__() method.
2. The absence of an implementation of the __hash__() method in a class renders its instances
unhashable.
3. The interpreter provides user-defined classes with default implementations for __eq__() and
__hash__(). By default, all objects compare unequal except with themselves, and x.__hash__()
returns a value such that x == y implies both that x is y and that hash(x) == hash(y).
In CPython, the default __hash__() implementation returns a value derived from the id() of
the object.
4. Overriding the __eq__() method without defining the __hash__() method sets the __hash__
attribute to None in the class. When the __hash__ attribute of a class is None, an
instance of the class raises an appropriate TypeError when an attempt is made to retrieve
its hash value, and it is correctly identified as unhashable when checking
isinstance(obj, collections.Hashable).
5. If a class overrides __eq__() and needs to keep the implementation of __hash__() from
a base class, this must be done explicitly by setting __hash__ = BaseClass.__hash__.
6. A class that does not override __eq__() can suppress hash support by setting __hash__ to
None. If a class instead defines its own __hash__() method that explicitly raises a TypeError,
instances of such a class will be incorrectly identified as hashable by an isinstance(obj,
collections.Hashable) test.
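The following sketch applies these rules to a simple user defined class; the Point class here is invented for illustration:

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # objects that compare equal must have equal hash values,
        # so hash the same data that __eq__ compares
        return hash((self.x, self.y))

>>> p, q = Point(1, 2), Point(1, 2)
>>> p == q, hash(p) == hash(q)
(True, True)
>>> len({p, q})    # the two equal points collapse to one set member
1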
The following fragment of a Vec class, a user defined vector type, shows some of these special methods in use, together with equality tests on vector instances.

def __mul__(self, other):
    # if other is a vector, return the dot product of self and other
    if isinstance(other, Vec):
        return dot(self, other)
    else:
        return NotImplemented  # will cause other.__rmul__(self) to be invoked

def __truediv__(self, other):  # scalar division
    return (1/other)*self

>>> a = Vec({'a','e','i','o','u'}, {'a':0,'e':1,'i':2})
>>> b = Vec({'a','e','i','o','u'}, {'o':4,'u':7})
>>> c = a + b
>>> c == Vec({'a','e','i','o','u'}, {'a':0,'e':1,'i':2,'o':4,'u':7})
True
>>> b == c
False
>>> a == Vec({'a','e','i','o','u'}, {'a':0,'e':1,'i':2})
True
>>> b == Vec({'a','e','i','o','u'}, {'o':4,'u':7})
True
>>> e = Vec({'x','y','z'}, {'x':2,'y':1})
>>> f = Vec({'x','y','z'}, {'z':4,'y':-1})
>>> g = e + f
>>> g == Vec({'x','y','z'}, {'x':2,'y':0,'z':4})
True
>>> e == f
False
>>> e == Vec({'x','y','z'}, {'x':2,'y':1})
True
6.4 Inheritance
Inheritance is one of the basic tenets of object oriented programming, and python supports multiple
inheritance just like C++. Inheritance provides a mechanism for creating new classes that specialise
or modify a base class, thereby introducing new functionality. We call the base class the parent class
or the super class. An example of a class inheriting from a base class in python is given in the
following example.
class Account:
    """base class for representing user accounts"""
    num_accounts = 0

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def del_account(self):
        Account.num_accounts -= 1

    def __getattr__(self, name):
        """handle attribute reference for non-existent attribute"""
        return "Hey I don't see any attribute called {}".format(name)

    def deposit(self, amt):
        self.balance = self.balance + amt

    def withdraw(self, amt):
        self.balance = self.balance - amt

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)

class SavingsAccount(Account):
    def __init__(self, name, balance, rate):
        super().__init__(name, balance)
        self.rate = rate

    def __repr__(self):
        return "SavingsAccount({}, {}, {})".format(self.name, self.balance, self.rate)
Multiple Inheritance
In multiple inheritance, a class can have multiple parent classes. This type of hierarchy is strongly
discouraged. One of the issues with this kind of inheritance is the complexity involved in properly
resolving methods when called. Imagine a class, D, that inherits from two classes, B and C, and there
is a need to call a method from the parent classes; however, both parent classes implement the same
method. How is the order in which classes are searched for the method determined? A Method
Resolution Order (MRO) algorithm determines how a method is found in a class or in any of the class's
base classes. In Python, the resolution order is calculated at class definition time and stored in the
class __dict__ as the __mro__ attribute. To illustrate this, imagine a class hierarchy with multiple
inheritance such as that shown in the following example.
>>> class A:
...     def meth(self): return "A"
...
>>> class B(A):
...     def meth(self): return "B"
...
>>> class C(A):
...     def meth(self): return "C"
...
>>> class D(B, C):
...     def meth(self): return "X"
...
>>> D.__mro__
(<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <class 'object'>)
To obtain an MRO, the interpreter's method resolution algorithm carries out a left to right, depth first
listing of all classes in the hierarchy. In the trivial example above, this results in the following class
list: [D, B, A, C, A, object]. Note that all classes inherit from the root object class if no
parent class is supplied during class definition. Finally, for each class that occurs multiple times, all
occurrences are removed except the last, resulting in an MRO of [D, B, C, A, object] for
the previous class hierarchy. This result is the order in which classes would be searched for attributes
for a given instance of D.
Cooperative method calls with super
This section shows the power of the super built-in in a multiple inheritance hierarchy, using the
class hierarchy from the previous section. This example is from the excellent write-up by
Guido van Rossum on unifying types and classes
(https://www.python.org/download/releases/2.2.3/descrintro/). Imagine that A defines a method that is
overridden by B, C and D, and suppose that there is a requirement that all the methods are called; the
method may be a save method that saves an attribute for each type it is defined for, so missing any call
will result in some unsaved data in the hierarchy. A combination of super and __mro__ provides the
ammunition for solving this problem. This solution is referred to as the call-next method by Guido van
Rossum and is shown in the following snippet:
class A(object):
    def meth(self):
        "save A's data"
        print("saving A's data")

class B(A):
    def meth(self):
        "save B's data"
        super(B, self).meth()
        print("saving B's data")

class C(A):
    def meth(self):
        "save C's data"
        super(C, self).meth()
        print("saving C's data")

class D(B, C):
    def meth(self):
        "save D's data"
        super(D, self).meth()
        print("saving D's data")
When self.meth() is called on an instance of D, super(D, self).meth() will find and
call B.meth(self), since B is the first base class following D that defines meth in D.__mro__. Now
in B.meth, super(B, self).meth() is called, and since self is an instance of D, the next class after B
is C (the __mro__ is [D, B, C, A]), so the search for a definition of meth continues there. This finds
C.meth, which is called and which in turn calls super(C, self).meth(). Using the same MRO, the next
class after C is A, and thus A.meth is called. This is the original definition of meth, so no further super()
call is made at this point. Using super and the method resolution order, the interpreter is able
to find and call every version of the meth method implemented by each of the classes in the hierarchy.
However, multiple inheritance is best avoided because for more complex class hierarchies, the calls
may be far more complicated than this.
Static Methods
Static methods are normal functions that exist in the name-space of a class. Referencing a static
method from a class shows that, rather than an unbound method type, a function type is returned, as
shown below:
class Account(object):
    num_accounts = 0

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def del_account(self):
        Account.num_accounts -= 1

    def deposit(self, amt):
        self.balance = self.balance + amt

    def withdraw(self, amt):
        self.balance = self.balance - amt

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)

    @staticmethod
    def static_test_method():
        return "Current Account"

>>> Account.static_test_method
<function Account.static_test_method at 0x101b846a8>
To define a static method, the @staticmethod decorator is used, and such methods do not require
the self argument. Static methods provide a mechanism for better organization, as code related
to a class is placed in that class and can be overridden in a sub-class as needed. Unlike ordinary
class methods, which are wrappers around the actual underlying functions, static methods return the
underlying functions without any modification when used.
Class Methods
Class methods, as the name implies, operate on classes themselves rather than on instances. Class
methods are created using the @classmethod decorator, with the class rather than an instance passed
as the first argument to the method.
import json

class Account(object):
    num_accounts = 0

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def del_account(self):
        Account.num_accounts -= 1

    def deposit(self, amt):
        self.balance = self.balance + amt

    def withdraw(self, amt):
        self.balance = self.balance - amt

    def inquiry(self):
        return "Name={}, balance={}".format(self.name, self.balance)

    @classmethod
    def from_json(cls, params_json):
        params = json.loads(params_json)
        return cls(params.get("name"), params.get("balance"))

    @staticmethod
    def type():
        return "Current Account"
A motivating example of the use of class methods is as a factory for object creation. Imagine that data
for the Account class comes in different formats, such as tuples, JSON strings and so on. It is not
possible to define multiple __init__ methods in a class, so class methods come in handy for such
situations. In the Account class defined above, for example, there is a requirement to initialize an
account from a JSON string object, so we define a class factory method, from_json, that takes a JSON
string object and handles the extraction of parameters and the creation of the account object using the
extracted parameters. Another example of a class method acting as a factory method is dict.fromkeys
(https://docs.python.org/3/library/stdtypes.html#dict.fromkeys), which is used for creating dict objects
from a sequence of supplied keys and a value.
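The factory method can then be used as an alternate constructor:

>>> acct = Account.from_json('{"name": "obi", "balance": 100}')
>>> acct.name, acct.balance
('obi', 100)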
The method above may be feasible for enforcing such type checking for one or two data
attributes, but as the attributes increase in number it gets cumbersome. Alternatively, a
type_check(type, val) function could be defined and called in the __init__ method
before assignment, but this cannot be applied elegantly when an attribute value is set after
initialization. A quick solution that comes to mind is the getter and setter pattern from Java,
but that is un-pythonic and cumbersome.
2. Consider a program that needs object attributes to be read-only once initialized. One could
also think of ways of implementing this using Python special methods, but once again such an
implementation could be unwieldy and cumbersome.
3. Finally, consider a program in which attribute access needs to be customized. This may be
to log such attribute access or even to perform some kind of transformation of the attribute,
for example. Once again, it is not too difficult to come up with a solution, although such a
solution may be unwieldy and not reusable.

All the issues mentioned above are linked together by the fact that they all relate to attribute
references: in each case, attribute access needs to be customized.
Objects implementing only the __get__ method are non-data descriptors, so they can only be read
from after initialization, while objects implementing both __get__ and __set__ are data descriptors,
meaning that such descriptor objects are writable.

To get a better understanding of descriptors, descriptor based solutions are provided for the issues
mentioned in the previous section. Implementing type checking on an object attribute using
descriptors is a simple and straightforward task. A descriptor implementing this type checking is
shown in the following snippet.
class TypedAttribute:
    def __init__(self, name, type, default=None):
        self.name = "_" + name
        self.type = type
        self.default = default if default else type()

    def __get__(self, instance, cls):
        return getattr(instance, self.name, self.default)

    def __set__(self, instance, value):
        if not isinstance(value, self.type):
            raise TypeError("Must be a %s" % self.type)
        setattr(instance, self.name, value)

    def __delete__(self, instance):
        raise AttributeError("Can't delete attribute")

class Account:
    name = TypedAttribute("name", str)
    balance = TypedAttribute("balance", int, 42)
>>> acct = Account()
>>> acct.name = "obi"
>>> acct.balance = 1234
>>> acct.balance
1234
>>> acct.name
'obi'
# trying to assign a string to an int attribute fails
>>> acct.balance = '1234'
TypeError: Must be a <class 'int'>
In the example, a descriptor, TypedAttribute, is implemented, and this descriptor class enforces
rudimentary type checking for any attribute of the class it is used in. It is important
to note that descriptors are only effective in this kind of use when defined at the class level rather
than at the instance level, i.e. inside the __init__ method.
Descriptors are integral to the Python language. Descriptors provide the mechanism behind
properties, static methods, class methods, super and a host of other functionality in Python classes.
In fact, descriptors are the first type of object searched for during an attribute reference. When
an attribute of an object is referenced, a reference b.x is transformed into
type(b).__dict__['x'].__get__(b, type(b)). The algorithm then searches for the attribute in the
following order:
1. type(b).__dict__ is searched for the attribute name, and if a data descriptor is found, the
result of calling the descriptor's __get__ method is returned. If it is not found, then all base
classes of type(b) are searched in the same way.
2. b.__dict__ is searched, and if the attribute name is found here, it is returned.
3. type(b).__dict__ is searched for a non-data descriptor with the given attribute name, and if
found, it is returned.
4. If the name is not found, an AttributeError is raised, or __getattr__() is called if it is provided.
This precedence chain can be overridden by defining custom __getattribute__ methods for a given
object class (the precedence defined above is contained in the default __getattribute__ provided
by the interpreter).
With a firm understanding of the mechanics of descriptors, it is easy to implement elegant solutions
to the second and third issues raised in the previous section. Implementing a read-only attribute
with descriptors becomes a simple case of implementing a non-data descriptor, i.e. a descriptor with
no __set__ method. To solve the custom access issue, whatever functionality is required is added to
the __get__ and __set__ methods respectively.
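A minimal sketch of such a customization, with invented names, is a descriptor that logs reads and writes of the attribute it manages:

class LoggedAttribute:
    def __init__(self, name, default=None):
        self.name = "_" + name
        self.default = default

    def __get__(self, instance, cls):
        print("reading {}".format(self.name))
        return getattr(instance, self.name, self.default)

    def __set__(self, instance, value):
        print("writing {} = {}".format(self.name, value))
        setattr(instance, self.name, value)

class Account:
    balance = LoggedAttribute("balance", 0)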
Class Properties
Defining descriptor classes each time a descriptor is required is cumbersome. Python properties
provide a concise way of adding data descriptors to attributes. A property signature is given below:

property(fget=None, fset=None, fdel=None, doc=None) -> property attribute

fget, fset and fdel are the getter, setter and deleter methods for such class attributes. The process
of creating a property from these accessor methods is illustrated in the following example.
class Account(object):
    def __init__(self):
        self._acct_num = None

    def get_acct_num(self):
        return self._acct_num

    def set_acct_num(self, value):
        self._acct_num = value

    def del_acct_num(self):
        del self._acct_num

    acct_num = property(get_acct_num, set_acct_num, del_acct_num, "Account number property.")
If acct is an instance of Account, acct.acct_num will invoke the getter, acct.acct_num = value
will invoke the setter, and del acct.acct_num will invoke the deleter.
The property object and its functionality can be implemented in python using the descriptor protocol,
as illustrated in the Descriptor How-To Guide (https://docs.python.org/2/howto/descriptor.html) and
shown below:
class Property(object):
    "Emulate PyProperty_Type() in Objects/descrobject.c"

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)
Python also provides the @property decorator that can be used to create read only attributes. A
property object has getter, setter, and deleter decorator methods that can be used to create a copy
of the property with the corresponding accessor function set to the decorated function. This is best
explained with an example:
class C(object):
    def __init__(self):
        self._x = None

    # the x property; the decorator creates a read-only property
    @property
    def x(self):
        return self._x

    # the x property setter makes the property writeable
    @x.setter
    def x(self, value):
        self._x = value

    @x.deleter
    def x(self):
        del self._x
However, object methods are of bound method type as shown in the following snippet.
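Using the Account class from the static method example above:

>>> acct = Account('obi', 0)
>>> acct.deposit
<bound method Account.deposit of <__main__.Account object at 0x101b85a90>>
>>> type(acct.deposit)
<class 'method'>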
To understand how this transformation takes place, note that a bound method is just a thin wrapper
around the class function. Functions are descriptors because they have the __get__ method attribute,
so a reference to a function results in a call to the function's __get__ method, and this returns
the desired type (the function itself or a bound method), depending on whether the reference is from
a class or from an instance of the class. It is not difficult to imagine how static and class methods
may be implemented using the function descriptor machinery, and this is left to the reader to work out.
A class that inherits from an abstract base class but fails to implement all of its abstract methods
cannot be instantiated, as the following example shows.

class Car(Vehicle):
    def __init__(self, make, model, color):
        self.make = make
        self.model = model
        self.color = color

    # abstract methods not implemented
>>> car = Car("Toyota", "Avensis", "silver")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Car with abstract methods change_gear, start_engine
>>>
Once a class implements all the abstract methods, it becomes a concrete class and can be
instantiated by a user.
from abc import ABCMeta, abstractmethod

class Vehicle(metaclass=ABCMeta):
    @abstractmethod
    def change_gear(self):
        pass

    @abstractmethod
    def start_engine(self):
        pass

class Car(Vehicle):
    def __init__(self, make, model, color):
        self.make = make
        self.model = model
        self.color = color

    def change_gear(self):
        print("Changing gear")

    def start_engine(self):
        print("Starting engine")
>>> car = Car("Toyota", "Avensis", "silver")
>>> print(isinstance(car, Vehicle))
True
Abstract base classes also allow existing classes to register as part of their hierarchy, but no check is
performed on whether such classes actually implement the methods and properties that have been marked
as abstract. This provides a simple solution to the second issue raised in the opening paragraph. Now
a proxy class can be registered with an abstract base class, and an isinstance check will return the
correct answer when used.
from abc import ABCMeta, abstractmethod

class Vehicle(metaclass=ABCMeta):
    @abstractmethod
    def change_gear(self):
        pass

    @abstractmethod
    def start_engine(self):
        pass

class Car(object):
    def __init__(self, make, model, color):
        self.make = make
        self.model = model
        self.color = color
>>> Vehicle.register(Car)
>>> car = Car("Toyota", "Avensis", "silver")
>>> print(isinstance(car, Vehicle))
True
Abstract base classes are used widely in the python standard library. They provide a means to group
python objects such as the number types, which have a relatively flat hierarchy. The collections module
also contains abstract base classes for various kinds of operations involving sets, sequences and
dictionaries. Whenever we want to enforce contracts between classes in python, just as interfaces do
in Java, abstract base classes are the way to go.
7. The Function
The function is another organizational unit of code in Python. Python functions are either named
or anonymous sets of statements or expressions. In Python, functions are first class objects. This
means that there is no restriction on the use of functions as values: functions can be introspected,
assigned to variables, used as arguments to other functions, and returned from method or function
calls just like any other python value such as strings and numbers.
When a function definition, such as that of the square function used in this chapter, is encountered,
only the function definition statement, that is def square(x), is executed; this implies that default
arguments are evaluated at this point. The evaluation of default arguments has some implications for
default arguments that have mutable data structures as values; this is covered later in this chapter.
The execution of a function definition binds the function name in the current name-space to a function
object, which is a wrapper around the executable code for the function. This function object contains
a reference to the current global name-space, which is the global name-space used when the function is
called. The function definition does not execute the function body; the body is executed only when the
function is called.
Python also has support for anonymous functions. These functions are created using the lambda
keyword. Lambda expressions in python are of the form:

lambda_expr ::= "lambda" [parameter_list]: expression

Lambda expressions return function objects after evaluation and have the same attributes as named
functions. Lambda expressions are normally used only for very simple functions in python, because a
lambda definition can contain only one expression. A lambda version of the square
function is given in the following snippet.
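square = lambda x: x**2

>>> square(4)
16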
Like every other object, introspection on functions using the dir() function provides a list of
function attributes.
def square(x):
    return x**2
>>> square
<function square at 0x031AA230>
>>> dir(square)
['__annotations__', '__call__', '__class__', '__closure__', '__code__', '__defaults__',
'__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
'__get__', '__getattribute__', '__globals__', '__gt__', '__hash__', '__init__',
'__kwdefaults__', '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__',
'__qualname__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
'__str__', '__subclasshook__']
>>>
>>> def square(x: int) -> int:
...     return x*x
...
>>> square.__annotations__
{'x': <class 'int'>, 'return': <class 'int'>}
Parameters are annotated by a colon after the parameter name, followed by an expression
evaluating to the value of the annotation. Return values are annotated by a literal ->, followed
by an expression, between the parameter list and the colon denoting the end of the def
statement header. For parameters with default values, the annotation goes between the name and the
default, as follows:

>>> def def_annotation(x: int, y: str = "obi"):
...     pass
__defaults__ returns a tuple of the default argument values. Default arguments are discussed
later on.

__kwdefaults__ returns a dict containing default values for keyword-only arguments.

__globals__ returns a reference to the dictionary that holds the function's global variables
(see chapter 5 for a word on global variables).

>>> square.__globals__
{'__builtins__': <module 'builtins' (built-in)>, '__name__': '__main__', 'square': <function square at 0x10f099c08>, '__doc__': None, '__package__': None}
__dict__ returns the name-space supporting arbitrary function attributes.

>>> square.__dict__
{}
__closure__ returns a tuple of cells that contain bindings for the function's free variables.
Closures are discussed later in this chapter.
Functions can be passed as arguments to other functions. Functions that take other functions as
arguments are commonly referred to as higher order functions, and these form a very important part
of functional programming. A very good example of a higher order function is the map function
(https://docs.python.org/2/library/functions.html#map), which takes a function and an iterable and
applies the function to each item in the iterable. In the following example, we illustrate the use of
map by passing the square function previously defined and an iterable of numbers; note that in Python 3
map returns an iterator, so we use list() to view the results.

>>> list(map(square, range(10)))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
A function can be defined inside another function as well as returned from a function call.
>>> def make_counter():
...
count = 0
...
def counter():
...
nonlocal count # nonlocal captures the count binding from enclosing scope not gl\
obal scope
...
count += 1
...
return count
...
return counter
In the previous example, a function, counter, is defined within another function, make_counter, and
the counter function is returned whenever make_counter is executed. Functions can
also be assigned to variables just like any other python object, as shown below:
>>> def make_counter():
...     count = 0
...     def counter():
...         nonlocal count # nonlocal captures the count binding from the enclosing scope, not the global scope
...         count += 1
...         return count
...     return counter
...
>>> func = make_counter()
>>> func
<function make_counter.<locals>.counter at 0x031AA270>
>>>
In the above example, the make_counter function returns a function when called and this is assigned
to the variable func. This variable refers to a function object and can be called just like any other
function as shown in the following example:
>>> func()
1
The __get__ method is called whenever a function is referenced, and it provides the mechanism for
handling both method calls on objects and ordinary function calls. This descriptor characteristic of a
function enables a function to return either itself or a bound method when referenced, depending on
where and how it is referenced.
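As a quick sketch reusing the square function from above (the class and object address here are made up for illustration), calling __get__ directly produces a bound method:

>>> class A:
...     pass
...
>>> a = A()
>>> square.__get__(a, A)
<bound method square of <__main__.A object at 0x101d0e4e0>>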
The show_args function used below is defined with a single normal positional argument, arg, and two
default arguments, def_arg and def_arg2. The function can be called in any of the following
ways.

Supplying values for the non-default positional arguments only; in this case the other arguments
take on their default values:
def show_args(arg, def_arg=1, def_arg2=2):
    return "arg={}, def_arg={}, def_arg2={}".format(arg, def_arg, def_arg2)
>>> show_args("tranquility")
'arg=tranquility, def_arg=1, def_arg2=2'
Supplying values for all arguments, overriding even the arguments that have default values:
def show_args(arg, def_arg=1, def_arg2=2):
    return "arg={}, def_arg={}, def_arg2={}".format(arg, def_arg, def_arg2)
>>> show_args("tranquility", "to Houston", "the eagle has landed")
'arg=tranquility, def_arg=to Houston, def_arg2=the eagle has landed'
It is also very important to be careful when using mutable data structures as default arguments.
Default argument values are evaluated only once, at function definition time, so a mutable
default is created once and shared by all calls to the function, as shown in the following
example:
def show_args_using_mutable_defaults(arg, def_arg=[]):
    def_arg.append("Hello World")
    return "arg={}, def_arg={}".format(arg, def_arg)
>>> show_args_using_mutable_defaults("test")
"arg=test, def_arg=['Hello World']"
>>> show_args_using_mutable_defaults("test 2")
"arg=test 2, def_arg=['Hello World', 'Hello World']"
On every function call, Hello World is added to the def_arg list, so after two function calls
the default argument contains two Hello World strings. It is important to keep this in mind when
using mutable values as default arguments.
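A common way to avoid this pitfall, shown here as a sketch with a hypothetical show_args_with_safe_defaults function, is to use None as the default and create the mutable object inside the function body:

def show_args_with_safe_defaults(arg, def_arg=None):
    # a new list is created on every call instead of being shared across calls
    if def_arg is None:
        def_arg = []
    def_arg.append("Hello World")
    return "arg={}, def_arg={}".format(arg, def_arg)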
Keyword Arguments: Functions can be called using keyword arguments of the form kwarg=value,
where kwarg refers to the name of an argument in the function definition. Take the function
defined below with positional non-default and default arguments.
def show_args(arg, def_arg=1):
    return "arg={}, def_arg={}".format(arg, def_arg)
To illustrate function calls with keyword arguments, this function can be called in
any of the following ways:
show_args(arg="test", def_arg=3)
show_args(test)
show_args(arg="test")
show_args("test", 3)
In a function call, keyword arguments must not come before non-keyword arguments, thus
the following will fail with a SyntaxError:

show_args(def_arg=3, "test")
A function call cannot supply duplicate values for an argument, so the following is illegal:

show_args("test", arg="testing")
Arbitrary Argument Lists: Python also supports defining functions that take an arbitrary number
of arguments; the extra arguments are passed to the function in a tuple. An example of this
from the Python tutorial is given below:
def write_multiple_items(file, separator, *args):
    file.write(separator.join(args))
The arbitrary arguments must come after the normal arguments; in this case, after the
file and separator arguments. The following is an example of a call to the function
defined above:
f = open("test.txt", "wb")
write_multiple_items(f, " ", "one", "two", "three", "four", "five")
The arguments one, two, three, four and five are all bunched together into a tuple that can be
accessed via the args argument.
If the values for a function call are in a list, then these values can be unpacked directly into the
function with the * operator. Given a print_args function that simply prints each of its positional
arguments (a plausible definition is included here for completeness):

>>> def print_args(*args):
...     for arg in args:
...         print(arg)
...
>>> args = [1, 2]
>>> print_args(*args)
1
2
Similarly, dictionaries can be used to store keyword-to-value mappings, and the ** operator is used
to unpack the keyword arguments into the function, as shown below:

>>> def parrot(voltage, state='a stiff', action='voom'):
...     print("-- This parrot wouldn't", action, end=' ')
...     print("if you put", voltage, "volts through it.", end=' ')
...     print("E's", state, "!")
...
>>> d = {"voltage": "four million", "state": "bleedin' demised", "action": "VOOM"}
>>> parrot(**d)
-- This parrot wouldn't VOOM if you put four million volts through it. E's bleedin' demised !
The *args argument represents a sequence of positional arguments of unknown length, while
**kwargs represents a dict that may contain any number of keyword-to-value mappings. *args must
come before **kwargs in a function definition. The sketch below illustrates this:
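A minimal sketch of such a definition (the body here is an assumption):

def show_args(arg, *args, **kwargs):
    # print the mandatory argument, then any extra positional and keyword arguments
    print(arg)
    for a in args:
        print(a)
    for key, value in kwargs.items():
        print(key, value)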
The normal argument(s) must be supplied to the function, but the *args and **kwargs arguments are
optional, as shown below:

>>> show_args("hey")
hey
At function call time, the normal argument(s) are supplied as usual, while the optional arguments are
unpacked. This kind of function definition comes in handy when dealing with function decorators,
as will be seen in the chapter on decorators.
In the nested function definition, the counter function is in scope only inside the make_counter function, so the counter function is most useful when returned from the make_counter function.
In nested functions such as the one in the above example, a new instance of the nested function is created
on each call to the outer function. This is because during each execution of the make_counter function,
the def statement for counter is executed, creating a new function object, but the body of counter is
not executed. A nested function also has access to the environment in which it was created. As a result,
a variable defined in the outer function can be referenced in the inner function even after the outer
function has finished executing.
>>> x = make_counter()
>>> x
<function counter at 0x0273BCF0>
>>> x()
1
When nested functions reference variables from the outer function in which they are defined, the
nested function is said to be closed over the referenced variable. The __closure__ special attribute
of a function object is used to access the closed variables as shown in the next example.
>>> cl = x.__closure__
>>> cl
(<cell at 0x029E4470: int object at 0x02A0FD90>,)
>>> cl[0].cell_contents
1
Closures in older versions of Python have a quirky behaviour. In Python 2.x, a variable from
an enclosing scope cannot be rebound inside a closure; an assignment to the name within the
nested function creates a new local variable instead. The following example illustrates this.
def counter():
    count = 0
    def c():
        count += 1
        return count
    return c
>>> c = counter()
>>> c()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 4, in c
UnboundLocalError: local variable 'count' referenced before assignment
A rather wonky solution to this is to make use of a mutable type to capture the closure as shown
below:
def counter():
    count = [0]
    def c():
        count[0] += 1
        return count[0]
    return c

>>> c = counter()
>>> c()
1
>>> c()
2
>>> c()
3
Python 3 introduced the nonlocal keyword that fixes this closure scoping issue, as shown in the
following snippet.
def counter():
    count = 0
    def c():
        nonlocal count
        count += 1
        return count
    return c
Closures can be used for maintaining state (isn't that what classes are for?) and, for some simple
cases, provide a more succinct and readable solution than classes. A class based version of a logging
API is shown in the following example.
class Log:
    def __init__(self, level):
        self._level = level

    def __call__(self, message):
        print("{}: {}".format(self._level, message))

log_info = Log("info")
log_warning = Log("warning")
log_error = Log("error")
The same functionality can be implemented with function closures, as shown in the following
snippet:
http://tech.pro/tutorial/1512/python-decorators
def make_log(level):
    def _(message):
        print("{}: {}".format(level, message))
    return _

log_info = make_log("info")
log_warning = make_log("warning")
log_error = make_log("error")
As can be seen, the closure based version is more succinct and readable, even though both
versions implement exactly the same functionality. Closures also play a major role in function
decorators, a widely used feature that is explained in the chapter on metaprogramming, and they
form the basis for the partial function that is described in detail in the next section. With a firm
understanding of functions, a tour of some techniques and modules for functional programming
in Python is given in the following section.
A functional version of the above avoids any modification of its arguments and instead creates new
values that are then returned, as shown in the following example.
def squares(numbers):
    return map(lambda x: x*x, numbers)
Language features such as first class functions make functional programming possible, while
techniques such as mapping, reducing, filtering, currying and recursion are used to implement a
functional style of programming. In the above example, the map function applies the function
lambda x: x*x to each element of the supplied sequence of numbers.

Python provides the built-in functions map and filter, as well as functools.reduce, to aid in
functional programming. A description of these functions follows.
1. map(func, iterable): This is a classic functional programming construct that takes a function
and an iterable as arguments and returns an iterator that applies the function to each item in
the iterable. The squares function from above is an illustration of map in use. The ideas behind
the map and reduce constructs have seen application in large scale data processing with the
popular MapReduce programming model, which fans out (maps) an operation on large data
streams to a cluster of distributed machines for computation and then gathers the results of
these computations together (reduces).
2. filter(func, iterable): This also takes a function and an iterable as arguments. It returns an
iterator that applies func to each element of the iterable and returns the elements for which
the result of the application is True. The following trivial example selects all even
numbers from a range.
>>> even = lambda x: x%2==0
>>> even(10)
True
>>> filter(even, range(10))
<filter object at 0x101c7b208>
>>> list(filter(even, range(10)))
[0, 2, 4, 6, 8]
3. functools.reduce(func, iterable[, initializer]): This applies func cumulatively to the items
of the iterable from left to right, reducing the iterable to a single value; in Python 3 it lives
in the functools module. The following example uses it to flatten a nested list.

import functools

def flatten_list(nested_list):
    return functools.reduce(lambda x, y: x + y, nested_list, [])

>>> flatten_list([[1, 3, 4], [5, 6, 7], [8, 9, 10]])
[1, 3, 4, 5, 6, 7, 8, 9, 10]
The functions listed above are examples of higher order functions that ship with Python. Some of the
functionality they provide can be replicated using more common constructs, and comprehensions are
one of the most popular alternatives.
Comprehensions
Python comprehensions are syntactic constructs that enable sequences to be built from other
sequences in a clear and concise manner. Python comprehensions are of three types namely:
1. List Comprehensions.
2. Set Comprehensions.
3. Dictionary Comprehensions.
List Comprehensions
List comprehensions are by far the most popular Python comprehension construct. They provide a
concise way to create a new list of elements satisfying a given condition from an iterable. A list of
squares for a sequence of numbers can be computed using the following squares function that makes
use of the map function.
def squares(numbers):
    return map(lambda x: x*x, numbers)

>>> sq = squares(range(10))
The same list can be created in a more concise manner by using list comprehensions rather than the
map function as in the following example.
>>> squares = [x**2 for x in range(10)]
The comprehension version is clearer and more concise than the map based method, especially for a
reader without any experience with higher order functions.
According to the python documentation,
a list comprehension consists of square brackets containing an expression followed
by a for clause and zero or more for or if clauses.
[expression for item1 in iterable1 if condition1
for item2 in iterable2 if condition2
...
for itemN in iterableN if conditionN ]
The result of a list comprehension expression is a new list that results from evaluating the expression
in the context of the for and if clauses that follow it. For example, to create a list of the squares of
the even numbers from 0 to 9, the following comprehension is used.
>>> even_squares = [i**2 for i in range(10) if i % 2 == 0]
>>> even_squares
[0, 4, 16, 36, 64]
The expression i**2 is computed in the context of the for clause that iterates over the numbers from
0 to 9 and the if clause that filters out non-even numbers.
Nested for loops and List Comprehensions: List comprehensions can also be used with multiple
or nested for loops. Consider, for example, the simple code fragment shown below that creates
tuples from pairs of numbers drawn from two given sequences.
>>> combs = []
>>> for x in [1, 2, 3]:
...     for y in [3, 1, 4]:
...         if x != y:
...             combs.append((x, y))
...
>>> combs
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
The above can be rewritten in a more concise and simple manner using a list comprehension,
as shown below.
>>> [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
It is important to take into consideration the order of the for clauses used in a list comprehension.
Careful observation of the two snippets, with and without a comprehension, shows that the order of
the for clauses in the comprehension is the same as it would be if the loops were written without a
comprehension. The same applies to nested for loops with a nesting depth greater than two.
Nested List Comprehensions: List comprehensions can also be nested. Consider the following
example, drawn from the Python documentation, of a 3x4 matrix implemented as a list of 3 lists,
each of length 4:
>>> matrix = [
...     [1, 2, 3, 4],
...     [5, 6, 7, 8],
...     [9, 10, 11, 12],
... ]
Transposition is a matrix operation that creates a new matrix from an old one using the rows of the
old matrix as the columns of the new matrix and the columns of the old matrix as the rows of the
new matrix. The rows and columns of the matrix can be transposed using the following nested list
comprehension:
>>> [[row[i] for row in matrix] for i in range(4)]
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
https://docs.python.org/2/tutorial/datastructures.html
The nested comprehension is equivalent to the following snippet that uses explicit for loops.

>>> transposed = []
>>> for i in range(4):
...     transposed.append([row[i] for row in matrix])
...
>>> transposed
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
Set Comprehensions
In set comprehensions, braces rather than square brackets are used to create new sets. For example,
to create the set of the squares of all the numbers from 0 to 9, the following set comprehension
is used.

>>> x = {i**2 for i in range(10)}
>>> x
{0, 1, 4, 81, 64, 9, 16, 49, 25, 36}
Dict Comprehensions
Braces are also used to create new dictionaries in dict comprehensions. In the following example,
a mapping of a number to its square is created using dict comprehensions.
>>> x = {i:i**2 for i in range(10)}
>>> x
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
Functools
The functools module in Python contains a few higher order functions that act on and return
other functions. A few of the interesting higher order functions included in this module are
described below.
1. partial(func, *args, **keywords): This function returns an object that can be called like the
original func, with *args and **keywords pre-supplied as arguments. If the returned object is
called with additional *args or **keywords arguments, these are added to the original *args
and **keywords, and the combined set of arguments is used in the function call. This is
illustrated with the following trivial example.
>>> from functools import partial
>>> basetwo = partial(int, base=2)
>>> basetwo('10010')
18
In the above example, a new callable, basetwo, that takes the string representation of a binary
number and converts it to a decimal number, is created. What has happened is that the int()
function, which takes two arguments, has been wrapped by a callable, basetwo, that takes only
one argument. To understand how this works, think back to the discussion about closures and
how variable capture works. Once this is understood, it is easy to imagine how the partial
function may be implemented. The partial function is equivalent to the following closure given
in the Python documentation.
def partial(func, *args, **keywords):
    def newfunc(*fargs, **fkeywords):
        newkeywords = keywords.copy()
        newkeywords.update(fkeywords)
        return func(*(args + fargs), **newkeywords)
    newfunc.func = func
    newfunc.args = args
    newfunc.keywords = keywords
    return newfunc
Partial objects provide elegant solutions to some practical problems that are encountered
during development. For example, suppose one has a list of points represented as tuples of
(x,y) coordinates and there is a requirement to sort all the points according to their distance
from some other central point. The following function computes the distance between two
points in the xy plane:
>>> points = [(1, 2), (3, 4), (5, 6), (7, 8)]
>>> import math
>>> def distance(p1, p2):
...     x1, y1 = p1
...     x2, y2 = p2
...     return math.hypot(x2 - x1, y2 - y1)
The built-in sort() method of lists is handy here; it accepts a key argument that can be
used to customize sorting, but it only works with functions that take a single argument, so
distance() is unsuitable. The partial function provides an elegant way of dealing with
this, as shown in the following snippet.
>>> pt = (4, 3)
>>> points.sort(key=partial(distance, pt))
>>> points
[(3, 4), (1, 2), (5, 6), (7, 8)]
>>>
The partial function creates and returns a callable that takes a single argument, a point. Note
that the partial object has already captured the reference point, pt, so when the key function is
called with a point argument, the distance function passed to partial is used to compute the
distance between the supplied point and the reference point.
2. @functools.lru_cache(maxsize=128, typed=False): This decorator wraps a function with a
memoizing callable that saves up to maxsize of the most recent calls. When maxsize is
reached, the least recently used cached values are evicted. Caching can save time when an
expensive or I/O bound function is periodically called with the same arguments. The
decorator uses a dictionary for storing results, so it is limited to caching arguments that are
hashable. Functions wrapped with lru_cache also gain a cache_info() function that reports
statistics on cache usage.
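As a quick sketch of its use, memoizing a recursive Fibonacci function:

from functools import lru_cache

@lru_cache(maxsize=128)
def fib(n):
    # each distinct n is computed only once; repeated calls hit the cache
    return n if n < 2 else fib(n - 1) + fib(n - 2)

>>> fib(30)
832040
>>> fib.cache_info()
CacheInfo(hits=28, misses=31, maxsize=128, currsize=31)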
3. @functools.singledispatch: This is a decorator that transforms a function into a single dispatch
generic function. The functionality aims to handle dynamic overloading, in which a single
function can handle multiple types. The mechanics of this are illustrated with the following
code snippet.
from functools import singledispatch

@singledispatch
def fun(arg, verbose=False):
    if verbose:
        print("Let me just say,", end=" ")
    print(arg)

@fun.register(int)
def _(arg, verbose=False):
    if verbose:
        print("Strength in numbers, eh?", end=" ")
    print(arg)

@fun.register(list)
def _(arg, verbose=False):
    if verbose:
        print("Enumerate this:")
    for i, elem in enumerate(arg):
        print(i, elem)

fun("Hello, world.")
fun(1, verbose=True)
fun([1, 2, 3], verbose=True)
fun((1, 2, 3), verbose=True)
Hello, world.
Strength in numbers, eh? 1
Enumerate this:
0 1
1 2
2 3
Let me just say, (1, 2, 3)
A generic function is defined with the @singledispatch decorator; the register decorator is
then used to define a function for each type that is handled. Dispatch to the correct function is
carried out based on the type of the first argument of the function call, hence the name single
dispatch. In the event that no function is registered for the type of the first argument, the base
generic function, fun in this case, is called.
8.1 Iterators
An iterable in Python is any object that implements the __iter__ special method, which, when
called, returns an iterator (the __iter__ special method is invoked by a call to iter(obj)). Simply put,
a Python iterable is any type that can be used with a for...in loop. Python lists, tuples, dicts and
sets are all examples of built-in iterables. Iterators are objects that implement the iterator protocol.
The iterator protocol defines the following set of methods that need to be implemented by any
object that wants to be used as an iterator.
1. An __iter__ method that is called on initialization of the iterator. This should return an object
that has a __next__ method.
2. A __next__ method that is called whenever the next() built-in function is invoked with the
iterator as argument. The iterator's __next__ method should return the next value of the iterable.
When an iterator is used with a for...in loop, the for loop implicitly calls next() on the
iterator object. This method should raise a StopIteration exception when there are no more
values to return, to signal the end of the iteration.
Care should be taken to distinguish between an iterable and an iterator, because an iterable
is not necessarily an iterator. The following snippet shows how this is possible.
>>> x = [1, 2, 3]
>>> type(x)
<class 'list'>
>>> x_iter = iter(x)
>>> type(x_iter)
<class 'list_iterator'>
# x is iterable & can be used in a for loop but is not an iterator as it
# does not have the __next__ method
>>> dir(x)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__\
eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', \
'__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '\
__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', \
'__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', \
'insert', 'pop', 'remove', 'reverse', 'sort']
# x_iter is an iterator as it has the __iter__ and __next__ methods
>>> dir(x_iter)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattri\
bute__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__length_hint__', '__lt__', '__ne_\
_', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__',\
'__sizeof__', '__str__', '__subclasshook__']
Something worth noting is that often an iterable object is also an iterator, in which case a call to
such an object's __iter__ special method returns the object itself. This will be seen later in this section.
Any class that fully implements the iterator protocol can be used as an iterator. This is illustrated
in the following by implementing a simple iterator that returns Fibonacci numbers up to a given
maximum value.
class Fib:
    def __init__(self, max):
        self.max = max

    def __iter__(self):
        self.a = 0
        self.b = 1
        return self  # object is an iterable and an iterator

    def __next__(self):
        fib = self.a
        if fib > self.max:
            raise StopIteration
        self.a, self.b = self.b, self.a + self.b
        return fib

>>> for i in Fib(10):
...     print(i)
0
1
1
2
3
5
8
A custom range function for looping through numbers can also be modelled as an iterator. The
following is a simple implementation of a range function that loops from 0 upwards.
class CustomRange:
    def __init__(self, max):
        self.max = max

    def __iter__(self):
        self.curr = 0
        return self

    def __next__(self):
        numb = self.curr
        if self.curr >= self.max:
            raise StopIteration
        self.curr += 1
        return numb

for i in CustomRange(10):
    print(i)

0
1
2
3
4
5
6
7
8
9
Before attempting to move on, stop for a second and study both examples carefully. The essence of
an iterator is that it knows how to calculate and return the elements of a sequence as needed, not all
at once. CustomRange does not return all the elements of the range when it is initialized; rather it
returns an object whose __iter__ method returns an iterator that can calculate the next element of
the range using the steps defined in the __next__ method. It is possible to define a range that returns
all positive whole numbers (an infinite sequence) by simply removing the upper-bound check in the
__next__ method. The same idea applies to the Fib iterator. This basic idea can be seen in built-in
functions that return sequences. For example, the built-in range function does not return a list as one
might intuitively expect, but returns an object that returns a range iterator object when its __iter__
method is called. To get the sequence as expected, the range object is passed to the list constructor,
as shown in the following example.
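>>> range(10)
range(0, 10)
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]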
The iterator protocol implements a form of computing that is referred to as lazy computation; it
does not do more work than it has to do at any given time.
2. chain(*iterables): This takes a variable number of iterables and returns an iterator that yields
the contents of the first iterable until it is exhausted, then the contents of the next, and so on,
forming a union of all the supplied iterables. The related classmethod chain.from_iterable
accepts a single iterable containing other iterables, as shown below.
>>> from itertools import chain
>>> x = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
>>> chain.from_iterable(x)
<itertools.chain object at 0x101c6a208>
>>> list(chain.from_iterable(x))
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
>>>
3. combinations(iterable, r): This returns an iterator over r-length subsequences of elements
from the input iterable. Elements are treated as unique based on their position, not on their
value.
>>> from itertools import combinations
>>> combinations('ABCDE', 3)
<itertools.combinations object at 0x101c71138>
>>> list(combinations('ABCDE', 3))
[('A', 'B', 'C'), ('A', 'B', 'D'), ('A', 'B', 'E'), ('A', 'C', 'D'), ('A', 'C', 'E'), ('A', 'D'\
, 'E'), ('B', 'C', 'D'), ('B', 'C', 'E'), ('B', 'D', 'E'), ('C', 'D', 'E')]
>>>
4. filterfalse(predicate, iterable): This returns an iterator that filters elements from the
iterable, returning only those for which applying the predicate to the element yields False.
If predicate is None, it returns the items that are falsy.
5. groupby(iterable, key=None): This returns an iterator that returns consecutive keys and
corresponding groups for these keys from the iterable argument. The key argument is a
function computing a key value for each element. If a key function is not specified or is None,
the key defaults to an identity function that returns the element unchanged. Generally, the
iterable needs to already be sorted on the same key function. The returned group is itself an
iterator that shares the underlying iterable with groupby(). An example usage of this is shown
in the following snippet.
>>> from itertools import groupby
>>> {k:list(g) for k, g in groupby('AAAABBBCCD')}
{'D': ['D'], 'B': ['B', 'B', 'B'], 'A': ['A', 'A', 'A', 'A'], 'C': ['C', 'C']}
>>> [k for k, g in groupby('AAAABBBCCDAABBB')]
['A', 'B', 'C', 'D', 'A', 'B']
6. islice(iterable, start, stop[, step]): This returns an iterator that returns selected elements
from the iterable. If start is non-zero, elements from the iterable are skipped until start is
reached. Afterwards, elements are returned consecutively, skipping step elements each time if
step is greater than one, just as in conventional slicing, until the iterable is exhausted. Unlike
conventional slicing, islice() does not support negative values for start, stop, or step.
7. permutations(iterable, r=None): This returns successive r-length permutations of elements
from the iterable. If r is not specified or is None, it defaults to the length of the iterable.
Elements are treated as unique based on their position, not on their value, so if the input
elements are unique, there will be no repeated values within each permutation.
8. product(*iterables, repeat=1): This returns an iterator over successive tuples from the
Cartesian product of the input iterables. It is equivalent to nested for loops in a generator
expression; for example, product(A, B) returns the same values as ((x, y) for x in A for y
in B). The function can compute the product of an iterable with itself by specifying the number
of repetitions with the optional repeat keyword argument; for example, product(A, repeat=4)
means the same as product(A, A, A, A). A quick REPL check is shown below.
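>>> from itertools import product
>>> list(product('AB', [1, 2]))
[('A', 1), ('A', 2), ('B', 1), ('B', 2)]
>>> list(product('AB', repeat=2))
[('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]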
8.2 Generators
Generators and iterators have a very intimate relationship. In short, Python generators are iterators
and understanding generators gives one an idea of how iterators can be implemented. This may
sound quite circular but after going through an explanation of generators, it will become clearer.
PEP 255 that describes simple generators refers to generators by their full name, generator-iterators.
Generators, just like the name suggests, generate (or consume) values when their __next__ method is
called. A generator is used either by explicitly calling its __next__ method or by using the generator
object in a for...in loop. Generators are of two types:
1. Generator Functions
2. Generator Expressions
Generator Functions
Generator functions are functions that contain the yield expression. Calling a function that contains
a yield expression returns a generator object. For example, the Fibonacci iterator can be recast as
a generator using the yield keyword as shown in the following example.
def fib(max):
    a, b = 0, 1
    while a < max:
        yield a
        a, b = b, a + b
The yield expression is central to generator functions, but what does it really do? To understand
the yield expression, contrast it with the return keyword. When a return is encountered, control
returns to the caller of the function, effectively ending the function's execution. This is shown
by the following example of a conventional Fibonacci function that returns all Fibonacci numbers
less than 10 at once.
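A return based version of such a function might look like the following sketch (fib_list is a hypothetical name):

def fib_list(max):
    # the full list is built before control returns to the caller
    numbers = []
    a, b = 0, 1
    while a < max:
        numbers.append(a)
        a, b = b, a + b
    return numbers

>>> fib_list(10)
[0, 1, 1, 2, 3, 5, 8]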
https://www.python.org/dev/peps/pep-0255/
On the other hand, the presence of a yield expression in a function changes things a bit. When
a function with a yield expression is called, the function body does not run like a normal function
body; instead the call returns a generator object. This is illustrated by a call to the fib generator
function in the following snippet.
>>> f = fib(10)
>>> f
<generator object fib at 0x1013d8828>
The generator object begins executing when its __next__ method is invoked, running the statements
in the function body until a yield expression is encountered.
>>> f.__next__()
0
>>> f.__next__()
1
>>> f.__next__()
1
>>> f.__next__()
2
The generator object suspends execution at that point, saves its context and returns the value of the
expression_list to the caller. When the caller invokes the generator object's __next__() method,
execution of the function continues till another yield or return expression is encountered or the end
of the function is reached. This continues till the loop condition is false and a StopIteration exception
is raised to signal that there is no more data to generate. To quote PEP 255:
If a yield statement is encountered, the state of the function is frozen, and the value
of expression_list is returned to .__next__()'s caller. By frozen we mean that all
local state is retained, including the current bindings of local variables, the instruction
pointer, and the internal evaluation stack: enough information is saved so that the next
time .next() is invoked, the function can proceed exactly as if the yield statement were
just another external call.

On the other hand, when a function encounters a return statement, it returns to the caller along
with any value following the return statement, and the execution of such a function is complete
for all intents and purposes. One can think of yield as causing only a temporary interruption in
the execution of a function.
With a better understanding of generators, it is not difficult to see how generators can be used to
implement iterators. Generators know how to calculate the next value in a sequence, so functions
that return iterators can be rewritten using the yield statement. To illustrate this, the accumulate
function from the itertools module can be rewritten using generators, as in the following snippet.
import operator

def accumulate(iterable, func=operator.add):
    'Return running totals'
    # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
    # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
    it = iter(iterable)
    try:
        total = next(it)
    except StopIteration:
        return
    yield total
    for element in it:
        total = func(total, element)
        yield total
Similarly, one can emulate a generator object by implementing the iterator protocol discussed at the
start of this chapter. However, the yield keyword provides a more succinct and elegant method for
creating generators.
Generator Expressions
In the previous chapter, list comprehensions were discussed. One drawback with list comprehensions
is that values are calculated all at once regardless of whether the values are needed at that time or
not (eager calculation). This may sometimes consume an inordinate amount of computer memory.
PEP 289 proposed the generator expression to resolve this and this proposal was accepted and added
to the language. Generator expressions are like list comprehensions; the only difference is that the
square brackets in list comprehensions are replaced by circular brackets that return a generator
expression object.
To generate a list of the square of number from 0 to 10 using list comprehensions, the following is
done.
>>> squares = [i**2 for i in range(10)]
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
A generator expression could be used in place of a list comprehension as shown in the following
snippet.
https://www.python.org/dev/peps/pep-0289/
The values of the generator can then be accessed using for...in loops or via a call to the __next__()
method of the generator object as shown below.
>>> squares = (i**2 for i in range(10))
>>> for square in squares:
...     print(square)
...
0
1
4
9
16
25
36
49
64
81
Generator expressions create generator objects without using the yield expression.
When the algorithm terminates, the remaining numbers not marked in the list are all the primes
below n. This is a rather simple algorithm, and it can be implemented using generators, as
sketched below.
https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
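A sketch of such an implementation, using the counts, filter_multiples_of_n and sieve names referenced in the discussion that follows (the exact bodies are assumptions):

def counts(start):
    # an infinite stream of consecutive integers beginning at start
    n = start
    while True:
        yield n
        n += 1

def filter_multiples_of_n(n, ints):
    # pass through only the numbers that are not multiples of n
    for i in ints:
        if i % n != 0:
            yield i

def sieve():
    ints = counts(2)
    while True:
        prime = ints.__next__()
        yield prime
        # stack a new filtering generator on top of the previous stream
        ints = filter_multiples_of_n(prime, ints)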
The above example, though very simple, shows the beauty of how generators can be chained together,
with the output of one acting as the input to another; think of this stacking of generators as a kind
of processing pipeline. The filter_multiples_of_n function is worth discussing a bit here because
it may be confusing at first. counts(2) returns a generator that produces a sequence of consecutive
numbers starting from 2, so the line prime = ints.__next__() returns 2 on the first iteration. After
the yield expression, ints = filter_multiples_of_n(2, ints) is invoked, creating a generator that
returns a stream of numbers that are not multiples of 2; note that the original sequence generator is
captured within this new generator (this is very important). On the next iteration of the loop within
the sieve function, the ints generator is invoked. This generator loops through the original sequence,
now [3, 4, 5, 6, 7, ...], yielding the first number that is not divisible by 2; 3 in this case. This part
of the pipeline is easy to understand. The prime, 3, is yielded from the sieve function, then another
generator that returns non-multiples of 3 is created and assigned to ints. This generator captures the
previous generator that produces non-multiples of 2, and that generator captured the original generator
that produces the infinite sequence of consecutive numbers. A call to the __next__() method of this
generator will loop through the previous generator that returns non-multiples of 2, and every
non-multiple of 2 returned by that generator is checked for divisibility by 3; if the number is not
divisible by 3, it is yielded. This chaining of generators goes on and on: the next prime is 5, so the
generator excluding multiples of 5 loops through the generator that returns non-multiples of 3, which
in turn loops through the generator that produces non-multiples of 2.
This streaming of data through multiple generators can be applied to the space efficient (and
sometimes time efficient) processing of any other massive stream of data, such as log files and
databases. Generators, however, have other nifty and mind-blowing use cases, as will be seen in the
following sections.
To fully grasp the send() method, observe that the argument passed to send() becomes the result of
the yield expression inside the generator, so in the above example the value that send() is called with
is assigned to the variable line. The rest of the function is straightforward to understand. Note that
calling send(None) is equivalent to calling the generator's __next__() method.
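A minimal sketch of these semantics, using a hypothetical echo coroutine:

def echo():
    while True:
        line = yield  # the argument passed to send() becomes the value of this expression
        print("received: {}".format(line))

>>> e = echo()
>>> e.send(None)  # prime the coroutine; equivalent to next(e)
>>> e.send("hello")
received: hello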
A little thinking shows that multiple generators or coroutines can be scheduled to run in an
interleaved manner, so it appears as if they were executing simultaneously. With that, we have
multitasking, or something like it. In this section, rudimentary multitasking is simulated to illustrate
the versatility of generators and coroutines.
In reality, even a full blown multitasking operating system is only ever executing a single task at
any one time. A task is any set of instructions with its own internal state. For example, a simple task
may take a stream of text and count the occurrences of some words, printing a running count, or it
may just print any word it receives. What is important here is that tasks are totally independent of
each other. The illusion of multitasking is achieved by giving each task a slice of time to run until
it encounters a trap that forces it to stop running so that other tasks can run. This happens so fast
that human users cannot sense what is happening. This can easily be simulated with a collection of
coroutines that run independently of each other, as shown in this section. A very simple example of
this multitasking is shown in the following snippet, in which text is read from a source and then sent
for processing to multiple coroutines. In the snippet, a task is modelled as a thin wrapper around a
coroutine.
def run():
    f = open("data.txt")
    source = read(f)
    tasks = [Task(print_line()), Task(word_count())]
    for line in source:
        for task in tasks:
            try:
                task.run(line)
            except StopIteration:
                tasks.remove(task)

if __name__ == '__main__':
    run()
We love python don't we?
Word distribution so far is defaultdict(<class 'int'>, {"don't": 1, 'we?': 1, 'python': 1, 'lov\
e': 1, 'We': 1})
No we don't love python
Word distribution so far is defaultdict(<class 'int'>, {"don't": 2, 'we?': 1, 'python': 2, 'we'\
: 1, 'We': 1, 'No': 1, 'love': 2})
Observe how the outputs are interleaved: each coroutine executes for a limited time, then another
coroutine executes. The above example is very instructive in showing the power of generators and
coroutines, though it is provided for illustration purposes only; the asyncio module, available since
Python 3.4, supports concurrent programming using coroutines.
As previously mentioned, yielding data from a delegated generator was not the only reason for the
introduction of the yield from keyword because the previous yield from snippet can be replicated
without yield from as shown in the following example.
>>> def g(x):
...     r = []
...     for i in range(x, 0, -1):
...         r.append(i)
...     for j in range(x):
...         r.append(j)
...     return r
...
>>> x = g(5)
>>> x
[5, 4, 3, 2, 1, 0, 1, 2, 3, 4]
The real benefit of the yield from keyword comes from the ability of a calling generator to send
values into the delegated generator, as shown in the following example. If a value is sent into a
generator, yield from enables that generator to implicitly forward the same value into the delegated
generator.
def accumulate():
    tally = 0
    while 1:
        next = yield
        if next is None:
            return tally
        tally += next

def gather_tallies(tallies):
    while 1:
        tally = yield from accumulate()
        tallies.append(tally)

>>> tallies = []
>>> acc = gather_tallies(tallies)
>>> next(acc)  # Ensure the accumulator is ready to accept values
>>> for i in range(4):
...     acc.send(i)
...
>>> acc.send(None)  # Finish the first tally
>>> for i in range(5):
...     acc.send(i)
...
>>> acc.send(None)  # Finish the second tally
>>> tallies
[6, 10]
The complete semantics of yield from are explained in PEP 380 and summarized below.
1. Any values that the iterator yields are passed directly to the caller.
2. Any values sent to the delegating generator using send() are passed directly to the iterator. If
the sent value is None, the iterator's __next__() method is called. If the sent value is not None,
the iterator's send() method is called. If the call raises StopIteration, the delegating
generator is resumed. Any other exception is propagated to the delegating generator.
3. Exceptions other than GeneratorExit thrown into the delegating generator are passed to the
throw() method of the iterator. If the call raises StopIteration, the delegating generator is
resumed. Any other exception is propagated to the delegating generator.
4. If a GeneratorExit exception is thrown into the delegating generator, or the close() method
of the delegating generator is called, then the close() method of the iterator is called if it has
one. If this call results in an exception, it is propagated to the delegating generator. Otherwise,
GeneratorExit is raised in the delegating generator.
5. The value of the yield from expression is the first argument to the StopIteration exception
raised by the iterator when it terminates.
6. return expr in a generator causes StopIteration(expr) to be raised upon exit from the
generator.
https://www.python.org/dev/peps/pep-0380/
1. Any live cell with fewer than two live neighbours dies.
2. Any live cell with two or three live neighbours lives on to the next generation.
3. Any live cell with more than three live neighbours dies.
4. Any dead cell with exactly three live neighbours becomes a live cell.
The initial pattern of cells on the grid constitutes the seed of the system. The first generation is
created by applying the above rules simultaneously to every cell and the discrete moment at which
this happens is sometimes called a tick. The rules continue to be applied repeatedly to create further
generations.
In the following implementation, each cell's simulation is carried out using a coroutine, with the
state of the cells stored in a Grid object from generation to generation.
The coroutine for each cell inspects the states of its eight neighbours and counts those that are
alive, along the lines of the following sketch (the get_state accessor and the ALIVE constant are
assumptions):

def count_live_neighbors(y, x, get_state):
    n_ = get_state(y - 1, x + 0)  # North
    ne = get_state(y - 1, x + 1)  # Northeast
    e_ = get_state(y + 0, x + 1)  # East
    se = get_state(y + 1, x + 1)  # Southeast
    s_ = get_state(y + 1, x + 0)  # South
    sw = get_state(y + 1, x - 1)  # Southwest
    w_ = get_state(y + 0, x - 1)  # West
    nw = get_state(y - 1, x - 1)  # Northwest
    neighbor_states = [n_, ne, e_, se, s_, sw, w_, nw]
    # tally the neighbouring cells that are alive
    return len([state for state in neighbor_states if state == ALIVE])
0 -----*----**---*-----
1 ------*---**--**-----
2 ------**-*----**-----
3 ------*--*----*------
4 ---------**----------
5 ---------------------
Generators are a fascinating topic and this chapter has barely scratched the surface of what is
possible. David Beazley has given a series of excellent talks, 1, 2 and 3, that go into great detail
about very advanced usage of generators.
http://www.dabeaz.com/generators-uk/GeneratorsUK.pdf
http://www.dabeaz.com/coroutines/Coroutines.pdf
http://www.dabeaz.com/finalgenerator/FinalGenerator.pdf
9.1 Decorators
A decorator is a function that wraps another function or class. It introduces new functionality to the
wrapped class or function without altering the original functionality, so the interface of the class or
function remains the same.
Function Decorators
A good understanding of functions as first class objects is important in order to understand function
decorators, and a reader will be well served by reviewing the material on functions. When functions
are first class objects, the following apply to them:
1. Functions can be passed as arguments to other functions.
2. Functions can be returned from other function calls.
3. Functions can be defined within other functions resulting in closures.
The properties of first class functions listed above provide the foundation needed to explain function
decorators. Put simply, function decorators are wrappers that enable the execution of code
before and after the function they decorate, without modifying the function itself.

Function decorators are not unique to Python, so to explain them, the Python decorator syntax is
ignored for the moment and the focus is placed on the essence of function decorators instead. To
understand what decorators do, a very trivial function is decorated in the following example with
another trivial function that logs calls to the decorated function. This function decoration is achieved
using function composition as shown below (follow the comments):
import datetime

# decorator expects another function as argument
def logger(func_to_decorate):
    # A wrapper function is defined on the fly
    def func_wrapper():
        # add any pre original function execution functionality
        print("Calling function: {} at {}".format(func_to_decorate.__name__, datetime.datetime.now()))
        # execute original function
        func_to_decorate()
        # add any post original function execution functionality
        print("Finished calling : {}".format(func_to_decorate.__name__))
    # return the wrapper function defined on the fly. The body of the
    # wrapper function has not been executed yet but a closure
    # over func_to_decorate has been created.
    return func_wrapper

def print_full_name():
    print("My name is John Doe")

# use composition to decorate the print_full_name function
>>> decorated_func = logger(print_full_name)
>>> decorated_func
# the returned value, decorated_func, is a reference to a func_wrapper
<function func_wrapper at 0x101ed2578>
>>> decorated_func()
# decorated_func call output
Calling function: print_full_name at 2015-01-24 13:48:05.261413
# the original functionality is preserved
My name is John Doe
Finished calling : print_full_name
In the trivial example defined above, the decorator adds a new feature, printing some information
before and after the original function call, without altering the original function. The decorator,
logger, takes a function to be decorated, print_full_name, and returns a function, func_wrapper,
that calls the decorated function when it is executed. The decoration process here is calling the
decorator with the function to be decorated as argument. The returned function, func_wrapper, is
closed over the reference to the decorated function, print_full_name, and thus can invoke the
decorated function while executing. In the above, calling decorated_func results in print_full_name
being executed, in addition to the code snippets that implement the new functionality. This ability
to add new functionality to a function without modifying the original function is the essence of
function decorators. Once this concept is understood, the concept of decorators is understood.
Decorators in Python
Now that the essence of function decorators has been discussed, an attempt is made to deconstruct
the Python constructs that make defining decorators easier. The previous section describes the
essence of decorators, but having to apply decorators via function composition as described is
cumbersome. Python introduces the @ symbol for decorating functions. Decorating a function using
the Python decorator syntax is achieved as shown in the following example.
@decorator
def a_stand_alone_function():
    pass

is equivalent to

def a_stand_alone_function():
    pass

a_stand_alone_function = decorator(a_stand_alone_function)

Decorators can also be stacked, so that

@dec2
@dec1
def func(arg1, arg2, ...):
    pass

is equivalent to

def func(arg1, arg2, ...):
    pass

func = dec2(dec1(func))
without the intermediate assignment to the func variable. In the above, @dec1 and @dec2 are the
decorator invocations. Stop, think carefully and ensure you understand this: dec1 and dec2 are
references to function objects, and these are the actual decorators. These names can even be
replaced by any function call or value that, when evaluated, returns a function that takes another
function as argument. What is of paramount importance is that the name following the @ symbol
references a function object (for this tutorial we assume it is a function object, but in reality it need
only be a callable) that takes a function as argument. Understanding this fact will help in
understanding Python decorators and more involved decorator topics such as decorators that take
arguments.
https://www.python.org/dev/peps/pep-0318/#why
Note how the *args and **kwargs parameters are used in defining the inner wrapper function; this
is for the simple reason that it cannot be known beforehand which functions are going to be
decorated, and thus what the signatures of those functions will be. A generic sketch of this pattern
is shown below.
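from functools import wraps

def generic_decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        # *args and **kwargs forward whatever arguments the decorated
        # function is called with, regardless of its signature
        return func(*args, **kwargs)
    return wrapper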
As mentioned previously, the key to understanding what is going on here is to note that the
reference following the @ in a function decoration can be replaced with any value that evaluates to
a function object that takes another function as argument. In the above snippet, the value returned
by the function call decorator_maker_with_arguments("Apollo 11 Landing") is the decorator; the
call evaluates to a function, decorator, that accepts a function as argument. Thus the decoration
@decorator_maker_with_arguments("Apollo 11 Landing") is equivalent to @decorator, but with
the decorator closed over the argument Apollo 11 Landing by the decorator_maker_with_arguments
function call. Note that the arguments supplied to a decorator cannot be dynamically changed at
run time, as the decoration is executed when the module is imported. A sketch of such a decorator
factory is shown below (the wrapper body here is illustrative).
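def decorator_maker_with_arguments(message):
    def decorator(func_to_decorate):
        def wrapper(*args, **kwargs):
            # message is closed over from the factory call
            print("Decorator argument: {}".format(message))
            return func_to_decorate(*args, **kwargs)
        return wrapper
    return decorator

@decorator_maker_with_arguments("Apollo 11 Landing")
def print_name():
    print("John Doe")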
Functools.wraps

Using decorators involves swapping out one function for another. As a result, meta information
such as the docstring of the swapped out function is lost when the function is decorated. This is
illustrated below:
import datetime

# decorator expects another function as argument
def logger(func_to_decorate):
    # A wrapper function is defined on the fly
    def func_wrapper():
        # add any pre original function execution functionality
        print("Calling function: {} at {}".format(func_to_decorate.__name__, datetime.datetime.now()))
        # execute original function
        func_to_decorate()
        # add any post original function execution functionality
        print("Finished calling : {}".format(func_to_decorate.__name__))
    # return the wrapper function defined on the fly. The body of the
    # wrapper function has not been executed yet but a closure
    # over func_to_decorate has been created.
    return func_wrapper

@logger
def print_full_name():
    """return john doe's full name"""
    print("My name is John Doe")
>>> print(print_full_name.__doc__)
None
>>> print(print_full_name.__name__)
func_wrapper
In the above example, an attempt to print the documentation string returns None because the
decorator has swapped out the print_full_name function for the func_wrapper function, which has
no documentation string. Even the function name now references the name of the wrapper function
rather than the decorated function. Most times, this is not what we want when using decorators. To
work around this, the Python functools module provides the wraps function, which also happens to
be a decorator. This decorator is applied to the wrapper function and takes the function to be
decorated as argument. The usage is illustrated in the following example.
import datetime
from functools import wraps

# decorator expects another function as argument
def logger(func_to_decorate):
    # A wrapper function is defined on the fly
    @wraps(func_to_decorate)
    def func_wrapper(*args, **kwargs):
        # add any pre original function execution functionality
        print("Calling function: {} at {}".format(func_to_decorate.__name__, datetime.datetime.now()))
        # execute original function
        func_to_decorate(*args, **kwargs)
        # add any post original function execution functionality
        print("Finished calling : {}".format(func_to_decorate.__name__))
    # return the wrapper function defined on the fly. The body of the
    # wrapper function has not been executed yet but a closure over
    # func_to_decorate has been created.
    return func_wrapper

@logger
def print_full_name(first_name, last_name):
    """return john doe's full name"""
    print("My name is {} {}".format(first_name, last_name))
>>> print(print_full_name.__doc__)
return john doe's full name
>>> print(print_full_name.__name__)
print_full_name
Class Decorators
Like functions, classes can also be decorated. Class decorators serve the same purpose as function
decorators: introducing new functionality without modifying the actual classes. An example of a
class decorator is the following singleton decorator, which ensures that only one instance of the
decorated class is ever initialised throughout the lifetime of the program.
def singleton(cls):
    instances = {}
    def get_instance():
        if cls not in instances:
            instances[cls] = cls()
        return instances[cls]
    return get_instance
Putting the decorator to use shows how this works. In the following example, the Foo class is
initialized twice; however, comparing the ids of both initialized objects shows that they both refer
to the same object.
@singleton
class Foo(object):
    pass
>>> x = Foo()
>>> id(x)
4310648144
>>> y = Foo()
>>> id(y)
4310648144
>>> id(y) == id(x) # both x and y are the same object
True
>>>
The same singleton functionality can be achieved with a metaclass by overriding the __call__ method of the metaclass, as shown below:
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

class Foo(metaclass=Singleton):
    pass

>>> x = Foo()
>>> y = Foo()
>>> id(x)
4310648400
>>> id(y)
4310648400
>>> id(y) == id(x)
True
>>> Spam.class_method(10)
10
9
8
7
6
5
4
3
2
1
0.00019788742065429688
>>> Spam.static_method(10)
10
9
8
7
6
5
4
3
2
1
0.00014591217041015625
https://wiki.python.org/moin/PythonDecoratorLibrary
1. A decorator that logs the entry and exit points of a function call is shown below.

import functools
import logging

def log(func):
    '''Returns a wrapper that wraps func. The wrapper will log the entry
    and exit points of the function with the logging.INFO level.'''
    logging.basicConfig()
    logger = logging.getLogger(func.__module__)
    @functools.wraps(func)
    def wrapper(*args, **kwds):
        logger.info("About to execute {}".format(func.__name__))
        f_result = func(*args, **kwds)
        logger.info("Finished the execution of {}".format(func.__name__))
        return f_result
    return wrapper
2. A memoization decorator can be used to decorate a function that performs a calculation so that,
for a given argument, if the result has been previously computed the stored value is returned,
and if it has not, it is computed and stored before being returned to the caller. This kind of
decorator is available in the functools module, as discussed in the chapter on functions. An
implementation of such a decorator is shown in the following example.
import collections
import functools
import logging

def cache(func):
    cache = {}
    logging.basicConfig()
    logger = logging.getLogger(func.__module__)
    logger.setLevel(10)
    @functools.wraps(func)
    def wrapper(*arg, **kwds):
        if not isinstance(arg, collections.Hashable):
            logger.info("Argument cannot be cached: {}".format(arg))
            return func(*arg, **kwds)
        if arg in cache:
            logger.info("Found precomputed result, {}, for argument, {}".format(cache[arg], arg))
            return cache[arg]
        else:
            logger.info("No precomputed result was found for argument, {}".format(arg))
            value = func(*arg, **kwds)
            cache[arg] = value
            return value
    return wrapper
3. Decorators can also easily be used to implement functionality that retries a callable up to a
maximum number of times, as in the following sketch (the retry name and its policy are
illustrative assumptions).
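import functools
import time

def retry(max_tries=3, delay_seconds=1):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_tries):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # give up and re-raise once the maximum number of tries is exhausted
                    if attempt == max_tries - 1:
                        raise
                    time.sleep(delay_seconds)
        return wrapper
    return decorator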
4. Another very interesting decorator recipe is the use of decorators to enforce types for function
calls, as shown in the following example.
def accepts(*types, **kw):
    '''Function decorator. Checks that the decorated function's arguments
    are of the expected types.

    Parameters:
    types -- The expected types of the inputs to the decorated function.
             A type must be specified for each parameter.
    kw    -- Optional specification of the 'debug' level (this is the only
             valid keyword argument, no other should be given).
             debug = ( 0 | 1 | 2 )
    '''
    if not kw:
        # default level: MEDIUM
        debug = 1
    else:
        try:
            debug = kw['debug']
        except KeyError as err:
            raise KeyError("{} is not a valid keyword argument".format(err))

    def decorator(f):
        def newf(*args):
            if debug == 0:
                return f(*args)
            assert len(args) == len(types)
            argtypes = tuple(map(type, args))
            if argtypes != types:
                msg = info(f.__name__, types, argtypes, 0)
                if debug == 1:
                    raise TypeError(msg)
            return f(*args)
        newf.__name__ = f.__name__
        return newf
    return decorator

def info(fname, expected, actual, flag):
    '''Convenience function returns nicely formatted error/warning msg.'''
    format = lambda types: ', '.join([str(t).split("'")[1] for t in types])
    expected, actual = format(expected), format(actual)
    msg = "'{}' method ".format(fname) \
          + ("accepts", "returns")[flag] + " ({}), but ".format(expected) \
          + ("was given", "result is")[flag] + " ({})".format(actual)
    return msg
>>> @test_concat.accepts(int, int, int)
... def div_sum_by_two(x, y, z):
...     return sum([x, y, z])/2
...
>>> div_sum_by_two('obi', 'nkem', 'chuks') # calling with wrong arguments
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/c4obi/src/test_concat.py", line 104, in newf
raise TypeError(msg)
TypeError: 'div_sum_by_two' method accepts (int, int, int), but was given (str, str, str)
5. A common use of class decorators is for registering classes as their class statements are executed, as shown in the following example.
registry = {}

def register(cls):
    registry[cls.__clsid__] = cls
    return cls

@register
class Foo(object):
    __clsid__ = ".mp3"

    def bar(self):
        pass
A more comprehensive listing of recipes, including the examples discussed in this section, can be found at the Python Decorator Library website (https://wiki.python.org/moin/PythonDecoratorLibrary).
9.3 Metaclasses
Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don't.
Tim Peters
All values in Python are objects, including classes, so a given class object must have another class from which it is created. Consider an instance, f, of a user defined class Foo. The type/class of the instance, f, can be found by using the built-in method, type, and in the case of the object, f, the type of f is Foo.
>>> class Foo(object):
...     pass
...
>>> f = Foo()
>>> type(f)
<class '__main__.Foo'>
>>>
This introspection can also be extended to a class object to find out the type/class of such a class. The following example shows the result of applying the type() function to the Foo class.
class Foo(object):
    pass
>>> type(Foo)
<class 'type'>
In Python, the class of all other class objects is the type class. This applies to user defined classes as shown above as well as to built-in classes as shown in the following code example.
>>> type(dict)
<class 'type'>
A class, such as the type class, that is used to create other classes is called a metaclass. That is all there is to a metaclass - a metaclass is a class that is used to create other classes. Custom metaclasses are not used often in Python but it is sometimes necessary to control the way classes are created, especially when working on large projects with large teams.
Before explaining how metaclasses are used to customize class creation, a recap of how class objects are created when a class statement is encountered during the execution of a program is in order. The following snippet is the class definition for a simple class that every Python user is familiar with, but this is not the only way a class can be defined.
# class definition
class Foo(object):
    def __init__(self, name):
        self.name = name

    def print_name(self):
        print(self.name)
The following snippet shows a more involved method for defining the same class with all the
syntactic sugar provided by the class keyword stripped away. This snippet provides a better
understanding of what actually goes on under the covers during the execution of a class statement.
class_name = "Foo"
class_parents = (object,)
class_body = """
def __init__(self, name):
    self.name = name

def print_name(self):
    print(self.name)
"""
# a new dict is used as the local namespace
class_dict = {}
# the body of the class is executed using the dict from above as the local
# namespace
exec(class_body, globals(), class_dict)

# viewing the class dict reveals the name bindings from the class body
>>> class_dict
{'__init__': <function __init__ at 0x10066f8c8>, 'print_name': <function print_name at 0x10066fa60>}

# final step of class creation
Foo = type(class_name, class_parents, class_dict)
During the execution of a class statement, the interpreter carries out the following procedures behind the scenes:
1. The name of the class is determined.
2. The tuple of base classes for the class is determined.
3. The body of the class statement is executed in a new namespace dictionary.
4. The class object is created by instantiating the metaclass, type by default, passing in the class name, the tuple of base classes and the namespace dictionary as arguments.
Metaclasses in Action
It is possible to define custom metaclasses that can be used when creating classes. These custom metaclasses will normally inherit from type and re-implement certain methods such as __init__ or __new__.
Imagine that you are the chief architect for a shiny new project and you have diligently read dozens of software engineering books and style guides that have hammered on the importance of docstrings, so you want to enforce the requirement that all non-private methods in the project must have docstrings; how would you enforce this requirement?
A simple and straightforward solution is to create a custom metaclass for use across the project that
enforces this requirement. The snippet that follows though not of production quality is an example
of such a metaclass.
class DocMeta(type):
    def __init__(self, name, bases, attrs):
        for key, value in attrs.items():
            # skip special and private methods
            if key.startswith("__"):
                continue
            # skip any non-callable
            if not hasattr(value, "__call__"):
                continue
            # check for a doc string. a better way may be to store
            # all methods without a docstring then throw an error showing
            # all of them rather than stopping on first encounter
            if not getattr(value, '__doc__'):
                raise TypeError("%s must have a docstring" % key)
        type.__init__(self, name, bases, attrs)
DocMeta is a type subclass that overrides the type class's __init__ method. The implemented __init__ method iterates through all the class attributes searching for any non-private method that is missing a docstring and raises a TypeError when one is found.
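With this metaclass in place, a class definition containing an undocumented public method fails as soon as the class statement executes; the following session is illustrative.

>>> class Foo(metaclass=DocMeta):
...     def bar(self):
...         pass
...
Traceback (most recent call last):
  ...
TypeError: bar must have a docstring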
Another trivial example that illustrates an application of a metaclass is in the creation of a final class,
that is a class that cannot be sub-classed. Some people may argue that final classes are unpythonic
but for illustration purposes such functionality is implemented using a metaclass in the following
snippet.
class Final(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        for c in bases:
            if isinstance(c, Final):
                raise TypeError(c.__name__ + " is final")

>>> class B(object, metaclass=Final):
...     pass
...
>>> class C(B):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in __init__
TypeError: B is final
In the example, the metaclass simply performs a check ensuring that the final class is never part of
the base classes for any class being created.
Another very good example of a metaclass in action is in the Abstract Base Classes that were previously discussed. When defining an abstract base class, the ABCMeta metaclass from the abc module is used as the metaclass for the abstract base class being defined, and the @abstractmethod and @abstractproperty decorators are used to create methods and properties that must be implemented by non-abstract subclasses.
from abc import ABCMeta, abstractmethod

class Vehicle(metaclass=ABCMeta):
    @abstractmethod
    def change_gear(self):
        pass

    @abstractmethod
    def start_engine(self):
        pass

class Car(Vehicle):
    pass

>>> car = Car()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Car with abstract methods change_gear, start_engine
Once a class implements all abstract methods then such a class becomes a concrete class and can be
instantiated by a user.
from abc import ABCMeta, abstractmethod

class Vehicle(metaclass=ABCMeta):
    @abstractmethod
    def change_gear(self):
        pass

    @abstractmethod
    def start_engine(self):
        pass

class Car(Vehicle):
    def __init__(self, make, model, color):
        self.make = make
        self.model = model
        self.color = color

    def change_gear(self):
        print("Changing gear")

    def start_engine(self):
        print("Starting engine")
>>> car = Car("Toyota", "Avensis", "silver")
>>> print(isinstance(car, Vehicle))
True
It would not be possible to modify class attributes such as the list of base classes or the attribute names in the __init__ method because, as has been said previously, this method is called after the class object has already been created. On the other hand, when the intent is just to carry out initialization or validation checks, such as was done with the DocMeta and Final metaclasses, then the __init__ method of the metaclass should be overridden.
# create a context
with open('output.txt', 'w') as f:
    # carry out operations within context
    f.write('Hi there!')
The with statement can be used with any object that implements the context management protocol. This protocol defines a pair of operations, __enter__ and __exit__, that are executed just before the start of the execution of some piece of code and after the end of its execution respectively. Generally, the definition and use of a context manager is shown in the following snippet.
class context:
    def __enter__(self):
        # set the resource up
        return resource

    def __exit__(self, type, value, traceback):
        # tear the resource down
        ...

# the context object returned by the __enter__ method is bound to name
with context() as name:
    # do some functionality within the context
    ...
If the initialised resource is used within the context then the __enter__ method must return the resource object so that it is bound within the with statement using the as mechanism. A resource object need not be returned if the code being executed in the context doesn't require a reference to the object that is set up. The following is a very trivial example of a class that implements the context management protocol in a very simple fashion.
>>> import time
>>> class Timer:
...     def __init__(self):
...         pass
...     def __enter__(self):
...         self.start_time = time.time()
...     def __exit__(self, type, value, traceback):
...         print("Operation took {} seconds to complete".format(time.time()-self.start_time))
...
>>> with Timer():
...     print("Hey testing context managers")
...
Hey testing context managers
Operation took 0.00010395050048828125 seconds to complete
>>>
When the with statement executes, the __enter__() method is called to create a new context; if a resource is initialized for use here then it is returned, but this is not the case in this example. After the operations within the context are executed, the __exit__() method is called with the type, value and traceback as arguments. If no exception is raised during the execution of the operations within the context then all the arguments are set to None. The __exit__ method returns True or False depending on whether any raised exceptions have been handled. When False is returned, any exception raised is propagated outside of the context for other code blocks to handle. Any resource clean-up is also carried out within the __exit__() method. This is all there is to context management. Now rather than write try...finally code to ensure that a file is closed or that a lock is released every time such a resource is used, such chores can be handled in the __exit__ method of a context manager class, thus eliminating code duplication and making the code more intelligible.
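The contextlib module provides the contextmanager decorator with which a context manager can be written as a simple generator function. The following sketch defines such a generator, named time_func to match the discussion below; the timing behaviour mirrors the Timer class above and the details are illustrative.

import time
from contextlib import contextmanager

@contextmanager
def time_func():
    # code before the yield plays the role of __enter__
    start_time = time.time()
    try:
        yield
    finally:
        # code after the yield plays the role of __exit__
        print("Operation took {} seconds to complete".format(time.time() - start_time))

with time_func():
    print("Hey testing context managers")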
This context generator function, time_func in this case, must yield exactly one value if it is required that a value be bound to a name in the with statement's as clause. When the generator yields, the code block nested in the with statement is executed. The generator is then resumed after the code block finishes execution. If an exception occurs during the execution of the block and is not handled in the block, the exception is re-raised inside the generator at the point where the yield occurred. If an exception is caught for purposes other than adequately handling such an exception, then the generator must re-raise that exception; otherwise the generator context manager will indicate to the with statement that the exception has been handled, and execution will resume normally after the context block.
Context managers, just like decorators and metaclasses, provide a clean method for abstracting away this kind of repetitive code that can clutter a program and make its logic difficult to follow.
10.1 Modules
Modules enable the reuse of programs. A module is a file that contains a collection of definitions
and statements and has a .py extension. The contents of a module can be used by importing the
module either into another module or into the interpreter. To illustrate this, our favourite Account
class shown in the following snippet is saved in a module called account.py.
class Account:
    num_accounts = 0

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def del_account(self):
        Account.num_accounts -= 1

    def deposit(self, amt):
        self.balance = self.balance + amt

    def withdraw(self, amt):
        self.balance = self.balance - amt

    def inquiry(self):
        return self.balance
To re-use the module definitions, the import statement is used to import the module as shown in
the following snippet.
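The following session illustrates this; the argument values are arbitrary.

>>> import account
>>> acct = account.Account("obi", 10)
>>> acct.deposit(100)
>>> acct.inquiry()
110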
All executable statements contained within a module are executed when the module is imported. A module is also an object that has a type - module - and as such all the generic operations that apply to objects can be applied to modules. The following snippets show some unintuitive ways of manipulating module objects.

Python 3.4.2 (v3.4.2:ab2c023a9432, Oct 5 2014, 20:42:22)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import account
>>> type(account)
<class 'module'>
>>> getattr(account, 'Account') # access the Account class using getattr
<class 'account.Account'>
>>> account.__dict__
{'json': <module 'json' from '/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/json/__init__.py'>, '__cached__': '/Users/c4obi/writings/scratch/src/__pycache__/account.cpython-34.pyc', '__loader__': <_frozen_importlib.SourceFileLoader object at 0x10133d4e0>, '__doc__': None, '__file__': '/Users/c4obi/writings/scratch/src/account.py', 'Account': <class 'account.Account'>, '__package__': '', '__builtins__': {...} ...
}
Each module possesses its own unique global namespace that is used by all functions and classes defined within the module; when this feature is properly used, it eliminates worries about name clashes with third party modules. The dir() function without any argument can be used within a module to find out what names are available in the module's namespace.
As mentioned, a module can import another module; when this happens, and depending on the form of the import, the imported module's name, some of the names defined within the imported module or all the names defined within the imported module could be placed in the namespace of the module doing the importing. For example, from account import Account imports and places the Account name from the account module into the namespace, import account imports and adds the account name referencing the whole module to the namespace, while from account import * will import and add all names in the account module, except those that start with an underscore, to the current namespace. Using from module import * as a form of import is strongly advised against as it may import names that the developer is not aware of and that conflict with names used in the module doing the importing. Python has the __all__ special variable that can be used within modules. The value of the __all__ variable should be a list that contains the names within a module that are imported from such a module when the from module import * syntax is used. Defining this variable is totally optional on the part of the developer. We illustrate the use of the __all__ special variable with the following example.
__all__ = ['Account']

class Account:
    num_accounts = 0

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        Account.num_accounts += 1

    def del_account(self):
        Account.num_accounts -= 1

    def deposit(self, amt):
        self.balance = self.balance + amt

    def withdraw(self, amt):
        self.balance = self.balance - amt

    def inquiry(self):
        return self.balance

class SharedAccount:
    pass
>>> from account import *
>>> dir()
['Account', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__'] # only Account has been imported
>>>
The name of an imported module is obtained by referencing the __name__ attribute of the imported module. In the case of the module that is currently executing, the __name__ value is set to __main__. Python modules can be executed with python module.py <arguments>. A corollary of the fact that the __name__ of the currently executing module is set to __main__ is that we can have a recipe such as the following.

if __name__ == "__main__":
    # run some code

This makes the module usable as a standalone script as well as an importable module. A popular use of the above recipe is for running unit tests; we can run the module standalone to test it but then import it for use in another module without running the test cases.
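For example, the account module could end with the following guard so that running it directly executes its tests; the test class here is only illustrative.

import unittest

class AccountTest(unittest.TestCase):
    def test_deposit(self):
        acct = Account("obi", 10)   # Account is defined earlier in the module
        acct.deposit(100)
        self.assertEqual(acct.inquiry(), 110)

if __name__ == "__main__":
    unittest.main()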
Reloading Modules
Once modules have been imported into the interpreter, any change to such a module is not reflected within the interpreter. However, Python provides the importlib.reload function that can be used to re-import a module once again into the current namespace.
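A short illustrative session follows; the reported module path will vary.

>>> import importlib
>>> import account
>>> importlib.reload(account)
<module 'account' from '/Users/c4obi/writings/scratch/src/account.py'>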
The sys.path list can be modified at runtime by adding or removing elements from this list.
However, when the interpreter is started conventionally, the sys.path list contains paths that come
from three sources namely: sys.prefix, PYTHONPATH and initialization by the site.py module.
1. sys.prefix: This variable specifies the base location for a given Python installation. From this
base location, the Python interpreter can work out the location of the Python standard library
modules. The location of the standard library is given by the following paths.
sys.prefix + '/lib/python3X.zip'
sys.prefix + '/lib/python3.X'
sys.prefix + '/lib/python3.X/plat-sysname'
sys.exec_prefix + '/lib/python3.X/lib-dynload'
The paths of the standard library can be found by running the Python interpreter with the -S option; this prevents the site.py initialization that adds the third party package paths to the sys.path list. The location of the standard library can also be overridden by defining the PYTHONHOME environment variable whose value replaces sys.prefix and sys.exec_prefix.
2. PYTHONPATH: Users can define the PYTHONPATH environment variable and the value of this variable is added to the front of the sys.path list. This variable can be set to the directory where a user keeps user defined modules.
3. site.py: This is a path configuration module that is loaded during the initialization of the interpreter. This module adds site-specific paths to the module search path. The site.py module starts by constructing up to four directories from a prefix and a suffix. For the prefix, it uses sys.prefix and sys.exec_prefix. For the suffix, it uses the empty string and then lib/site-packages on Windows or lib/pythonX.Y/site-packages on Unix and Macintosh. For each of these distinct combinations, if it refers to an existing directory, it is added to sys.path and further inspected for configuration files. The configuration files are files with a .pth extension whose contents are additional items, one per line, to be added to sys.path. Non-existing items are never added to sys.path, and no check is made that the item refers to a directory rather than a file. Each item is added to sys.path once. Blank lines and lines beginning with # are skipped. Lines starting with import followed by a space or tab are executed. After these path manipulations, an attempt is made to import a module named sitecustomize that can perform arbitrary site-specific customizations. It is typically created by a system administrator in the site-packages directory. If this import fails with an ImportError exception, it is silently ignored. After this, if ENABLE_USER_SITE is true, an attempt is made to import a module named usercustomize that can perform arbitrary user-specific customizations. This file is intended to be created in the user site-packages directory, which is part of sys.path unless disabled by -s. Any ImportError is silently ignored.
10.3 Packages
Just as modules provide a means for organizing statements and definitions, packages provide a means for organizing modules. A close but imperfect analogy for the relationship of packages to modules is that of folders to files on computer file systems. A package, just like a folder, can be composed of a number of module files. In Python however, packages are just like modules; in fact all packages are modules but not all modules are packages. The difference between a module and a package is the presence of a __path__ special variable in a package object that does not have a None value. Packages can have sub-packages and so on; when referencing a package and its corresponding sub-packages, the dot notation is used, so a complex number sub-package within a mathematics package would be referenced as math.complex.
There are currently two types of packages:- regular packages and namespace packages.
Regular Packages
A regular package is one that consists of a group of modules in a folder along with an __init__.py module within the folder. The presence of this __init__.py file within the folder causes the interpreter to treat the folder as a package. An example of a package structure is the following.
parent/                 <----- folder
    __init__.py
    one/                <----- sub-folder
        __init__.py
        a.py
    two/                <----- sub-folder
        __init__.py
        b.py
The parent, one and two folders are all packages because they each contain an __init__.py module within their respective folders. one and two are sub-packages of the parent package. Whenever a package is imported, the __init__.py module of such a package is executed. One can think of the __init__.py module as the store of attributes for the package - only symbols defined in this module are attributes of the imported package. Assuming the __init__.py module from the above parent package is empty and the package, parent, is imported using import parent, the parent package will have no module or sub-package as an attribute. The following code listing shows this.
>>> import parent
>>> dir()
['__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'parent']
>>> dir(parent)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__']
As the example shows, none of the modules or sub-packages is listed as an attribute of the imported package object. On the other hand, if a symbol, package="testing packages", is defined in the __init__.py module of the parent package and the parent package is imported, the package object has this symbol as an attribute as shown in the following code listing.
>>> import parent
>>> dir()
['__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'parent']
>>> dir(parent)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'package']
>>> parent.package
'testing packages'
>>>
When a sub-package is imported, all the __init__.py modules of the parent packages are imported in addition to the __init__.py module of the sub-package. Sub-packages are referenced during import using the dot notation just like modules in packages are. In the previous package structure, the notation would be parent.one to reference the one sub-package. Packages support the same kind of import semantics as modules; individual modules or packages can be imported as in the following example.
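# importing a module using its fully qualified name
import parent.one.a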
When the above form is used, the fully qualified name for the module, parent.one.a, must be used to access any symbol in the module. Note that when using this form of import, the last symbol can only be a module or a sub-package; classes, functions or variables defined within modules are not allowed. It is also possible to import just the module or sub-package that is needed, as the following example shows.
# importing just required module
from parent.one import a
# importing just required sub-package
from parent import one
Symbols defined in the a module or modules in the one package can then be referenced using the dot notation with just a or one as the prefix. The import forms, from package import * or from package.subpackage import *, can be used to import all the modules in a package or sub-package. This form of import should, however, be used carefully, if ever, as it may import names into the namespace that cause naming conflicts. Packages support the __all__ variable (the value of this should, by convention, be a list) for listing the modules or names that are visible when the package is imported using the from package import * syntax. If __all__ is not defined, the statement from package import * does not import all the submodules from the package into the current namespace; rather, it only ensures that the package has been imported, possibly running any initialization code in __init__.py, and then imports whatever symbols are defined in the __init__.py module, including any names defined and submodules imported there.
Namespace Packages
A namespace package is a package in which the component modules and sub-packages of the package may reside in multiple different locations. The various components may reside on different parts of the file system, in zip files, on the network or in any other location searched by the interpreter during the import process; however, when the package is imported, all the components exist in a common namespace. To illustrate a namespace package, observe the following directory structures containing modules; both directories, apollo and gemini, could be located on any part of the file system and not necessarily next to each other.
apollo/
    space/
        test.py

gemini/
    space/
        test1.py
In these directories, the name, space, is used as a common namespace and will serve as the package name. Observe the absence of __init__.py modules in either directory; the absence of this module within these directories is a signal to the interpreter that it should create a namespace package when it encounters such a structure. To be able to import this space package, the paths of its components must first of all be added to the interpreter's module search path, sys.path.
>>> import sys
>>> sys.path.extend(['apollo', 'gemini'])
>>> import space.test
>>> import space.test1
Observe that the two different package directories are now logically regarded as a single namespace and either space.test or space.test1 can be imported as if they existed in the same package. The key to a namespace package is the absence of the __init__.py module in the top-level directory that serves as the common namespace. The absence of the __init__.py module causes the interpreter to create a list of all directories within its sys.path variable that contain a matching directory name rather than throw an exception. A special namespace package module is then created and a read-only copy of the list of directories is stored in its __path__ variable. The following code listing gives an example of this.
>>> space.__path__
_NamespacePath(['apollo/space', 'gemini/space'])
Namespace packages bring added flexibility to package manipulation because a namespace can be extended by anyone with their own code, thus eliminating the need to modify the package structures of third party packages. For example, suppose a user had his or her own directory of code like this:
my-package/
    space/
        custom.py
Once this directory is added to sys.path along with the other packages, it would seamlessly merge
together with the other space package directories and the contents can also be imported along with
any existing artefacts.
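An illustrative session follows; because the __path__ of a namespace package is computed dynamically, the new directory is picked up once it is on sys.path.

>>> sys.path.append('my-package')
>>> import space.custom
>>> space.__path__
_NamespacePath(['apollo/space', 'gemini/space', 'my-package/space'])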
The interpreter continues the search for the module by querying each finder on the meta path to find out which of them can handle the module. The finder objects must implement a find_spec method that takes three arguments: the first is the fully qualified name of the module; the second is an import path that is used for the module search - this is None for top level modules but, for sub-modules or sub-packages, it is the value of the parent package's __path__; the third argument is an existing module object that is passed in by the system only when a module is being reloaded.
If one of the finders locates the module, it returns a module spec that is used by the interpreter's import machinery to create and load the module (loading is tantamount to executing the module). The loaders carry out the module execution in the module's global namespace. This is done by a call to the importlib.abc.Loader.exec_module() method with the already created module object as argument.
Customizing the import process
The import process can be customized via import hooks. There are two types of such hooks: meta hooks and import path hooks.
Meta hooks: These are called at the start of the import process, immediately after the sys.modules cache lookup and before any other processing. These hooks can override any of the default finders' search processes. Meta hooks are registered by adding new finder objects to the sys.meta_path variable.
To understand how a custom meta_path hook can be implemented, a very simple case is illustrated. In online Python interpreters, some built-in modules such as os are disabled or restricted to prevent malicious use. A very simple way to achieve this is to implement a meta import hook that raises an exception any time a restricted import is attempted; the following snippet shows such an example.
class RestrictedImportFinder:
    def __init__(self):
        self.restr_module_names = ['os']

    def find_spec(self, fqn, path=None, module=None):
        if fqn in self.restr_module_names:
            raise ImportError("%s is a restricted module and cannot be imported" % fqn)
        return None

import sys
# remove os from the sys.modules cache
del sys.modules['os']
sys.meta_path.insert(0, RestrictedImportFinder())

import os
Traceback (most recent call last):
  File "test_concat.py", line 16, in <module>
    import os
  File "test_concat.py", line 9, in find_spec
    raise ImportError("%s is a restricted module and cannot be imported" % fqn)
ImportError: os is a restricted module and cannot be imported
Import path hooks: These hooks are called as part of the sys.path or package.__path__ processing. Recall from our previous discussion that the path based finder is one of the default meta finders and that this finder works with the entries in the sys.path variable. The meta path based finder delegates the job of finding modules on sys.path to other finders - these are the import path hooks. sys.path_hooks is a collection of built-in path entry finders. By default, the Python interpreter has support for processing files in zip folders and normal files in directories as shown in the following snippet.
>>> import sys
>>> sys.path_hooks
[<class 'zipimport.zipimporter'>, <function FileFinder.path_hook.<locals>.path_hook_for_FileFinder at 0x1003c1b70>]
Each hook knows how to handle a particular kind of path entry. For example, the following snippet attempts to get the finder for one of the entries in sys.path.
>>> sys.path_hooks
[<class 'zipimport.zipimporter'>, <function FileFinder.path_hook.<locals>.path_hook_for_FileFinder at 0x1003c1b70>]
# sys.prefix is a directory
>>> path = sys.prefix
# sys.path_hooks[0] is associated with zip files
>>> finder = sys.path_hooks[0](path)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zipimport.ZipImportError: not a Zip file
>>> finder = sys.path_hooks[1](path)
>>> finder
FileFinder('/Library/Frameworks/Python.framework/Versions/3.4')
>>>
New import path hooks can be added by inserting new callables into the sys.path_hooks.
A set-up script using distutils is a setup.py file. For a program with the following package
structure,
```python
parent/
    __init__.py
    spam.py
    one/
        __init__.py
        a.py
    two/
        __init__.py
        b.py
```
The setup.py file must exist at the top level directory, so in this case it should exist at parent/setup.py. The values used in the set-up script are self explanatory: py_modules will contain the names of all single-file Python modules, packages will contain a list of all the packages and scripts will contain a list of all the scripts within the program. The rest of the arguments, though not exhaustive of the possible parameters, are self explanatory. A sketch of such a script follows.
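The following minimal set-up script is consistent with the package structure above; the name and version values are illustrative assumptions.

```python
# an illustrative setup.py; the metadata values are assumptions
from distutils.core import setup

setup(
    name='parent',                                      # distribution name
    version='1.0',                                      # illustrative version
    packages=['parent', 'parent.one', 'parent.two'],    # all the packages
    py_modules=[],                                      # no single-file modules here
    scripts=[],                                         # no scripts here
)
```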
Once the setup.py file is ready, the following command is used at the command line to create an archive file for distribution.

python setup.py sdist
sdist will create an archive file (e.g., a tarball on Unix or a ZIP file on Windows) containing the setup script setup.py and your modules and packages. The archive file will be named parent-1.0.tar.gz (or .zip), and will unpack into a directory parent-1.0. To install the created distribution, the file is unzipped and python setup.py install is run inside the directory. This will install the package in the site-packages directory of the installation.
One can also create one or more built distributions for a program. For instance, on a Windows machine, one can make installation easy for end users by creating an executable installer with the bdist_wininst command. For example:
python setup.py bdist_wininst
Some of the methods from the inspect module for handling source code include:
1. inspect.getdoc(object): This returns the documentation string for the argument object. The
string returned is cleaned up with inspect.cleandoc().
3. inspect.getfile(object): Return the name of the file in which an object was defined. The
argument should be a module, class, method, function, traceback, frame or code object. This
will fail with a TypeError if the object is a built-in module, class, or function.
>>> inspect.getfile(test.Account)
'/Users/c4obi/src/test_concat.py'
>>>
4. inspect.getmodule(object): This function attempts to guess the module that the argument
object was defined in.
>>> inspect.getmodule(acct)
<module 'test' from '/Users/c4obi/src/test.py'>
>>>
5. inspect.getsourcefile(object): This returns the name of the Python source file in which
the argument object was defined. This will fail with a TypeError if the object is a built-in
module, class, or function.
>>> inspect.getsourcefile(test_concat.Account)
'/Users/c4obi/src/test.py'
>>>
6. inspect.getsourcelines(object): This returns a tuple of the list of source code lines and
the line number on which the source code for the argument object begins. The argument may
be a module, class, method, function, traceback, frame, or code object. An OSError is raised if
the source code cannot be retrieved.
>>> inspect.getsourcelines(test_concat.Account)
(['class Account:\n',
  '    """base class for representing user accounts"""\n',
  '    num_accounts = 0\n',
  '\n',
  '    def __init__(self, name, balance):\n',
  '        self.name = name\n',
  '        self.balance = balance\n',
  '        Account.num_accounts += 1\n',
  '\n',
  '    def del_account(self):\n',
  '        Account.num_accounts -= 1\n',
  '\n',
  '    def __getattr__(self, name):\n',
  '        """handle attribute reference for non-existent attribute"""\n',
  '        return "Hey I dont see any attribute called {}".format(name)\n',
  '\n',
  '    def deposit(self, amt):\n',
  '        self.balance = self.balance + amt\n',
  '\n',
  '    def withdraw(self, amt):\n',
  '        self.balance = self.balance - amt\n',
  '\n',
  '    def inquiry(self):\n',
  '        return "Name={}, balance={}".format(self.name, self.balance)\n'], 52)
7. inspect.getsource(object): This returns the human readable text of the source code for the argument object. The argument may be a module, class, method, function, traceback, frame, or code object. The source code is returned as a single string. An OSError is raised if the source code cannot be retrieved. The difference between this and inspect.getsourcelines is that this method returns the source code as a single string while inspect.getsourcelines returns a list of source code lines.
>>> inspect.getsource(test.Account)
'class Account:\n    """base class for representing user accounts"""\n    num_accounts = 0\n\n    def __init__(self, name, balance):\n        self.name = name\n        self.balance = balance\n        Account.num_accounts += 1\n\n    def del_account(self):\n        Account.num_accounts -= 1\n\n    def __getattr__(self, name):\n        """handle attribute reference for non-existent attribute"""\n        return "Hey I dont see any attribute called {}".format(name)\n\n    def deposit(self, amt):\n        self.balance = self.balance + amt\n\n    def withdraw(self, amt):\n        self.balance = self.balance - amt\n\n    def inquiry(self):\n        return "Name={}, balance={}".format(self.name, self.balance)\n'
>>>
Important attributes of a Parameter object are the name and kind attributes. The kind attribute describes how arguments are bound to the parameter and takes one of the values POSITIONAL_ONLY, POSITIONAL_OR_KEYWORD, VAR_POSITIONAL, KEYWORD_ONLY or VAR_KEYWORD.
3. BoundArguments: This is the return value of a Signature.bind or Signature.bind_partial method call.
>>> sig
<inspect.Signature object at 0x101b3c5c0>
>>> sig.bind(1, 2)
<inspect.BoundArguments object at 0x1019e6048>
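The typeassert decorator used in the example below is built with Signature.bind_partial; the following is only a sketch of how such a decorator could be defined, consistent with the traceback in the example that follows.

from inspect import signature
import functools

def typeassert(*ty_args, **ty_kwargs):
    def decorate(func):
        sig = signature(func)
        # map the supplied types onto the function's parameter names
        bound_types = sig.bind_partial(*ty_args, **ty_kwargs).arguments

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound_values = sig.bind(*args, **kwargs)
            # enforce the asserted type for every bound argument
            for name, value in bound_values.arguments.items():
                if name in bound_types and not isinstance(value, bound_types[name]):
                    raise TypeError('Argument {} must be {}'.format(name, bound_types[name]))
            return func(*args, **kwargs)
        return wrapper
    return decorate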
The defined decorator, typeassert, can then be used to enforce type assertions as shown in the following example.
>>> @typeassert(str)
... def print_name(name):
...     print("My name is {}".format(name))
...
>>> print_name(10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/c4obi/src/test_concat.py", line 43, in wrapper
    raise TypeError('Argument {} must be {}'.format(name, bound_types[name]))
TypeError: Argument name must be <class 'str'>
>>>
bind_partial is used rather than bind so that we do not have to specify a type for every argument; the idea behind the partial function from the functools module is the same idea at work here.
The inspect module further defines some functions for interacting with classes and functions. A cross-section of these functions includes:
1. inspect.getclasstree(classes, unique=False): This arranges the list of classes into a
hierarchy of nested lists. If the returned list contains a nested list, the nested list contains
classes derived from the class whose entry immediately precedes the list. Each entry is a tuple
containing a class and a tuple of its base classes.
>>> class Account:
...     pass
...
>>> class CheckingAccount(Account):
...     pass
...
>>> class SavingsAccount(Account):
...     pass
...
>>> import inspect
>>> inspect.getclasstree([Account, CheckingAccount, SavingsAccount])
[(<class 'object'>, ()), [(<class '__main__.Account'>, (<class 'object'>,)), [(<class '__main__.CheckingAccount'>, (<class '__main__.Account'>,)), (<class '__main__.SavingsAccount'>, (<class '__main__.Account'>,))]]]
>>>
The inspect.getmembers(object[, predicate]) function takes an object and an optional predicate as arguments; the predicate serves as a filter on the values returned. For example, for a given class instance, i, we can get a list of the attribute members of i that are methods by making the call inspect.getmembers(i, inspect.ismethod); this returns a list of tuples of the attribute name and attribute object. The following example illustrates this.
>>> acct = test_concat.Account("obi", 1000000000)
>>> import inspect
>>> inspect.getmembers(acct, inspect.ismethod)
[('__getattr__', <bound method Account.__getattr__ of <test_concat.Account object at 0x101b3c470>>), ('__init__', <bound method Account.__init__ of <test_concat.Account object at 0x101b3c470>>), ('del_account', <bound method Account.del_account of <test_concat.Account object at 0x101b3c470>>), ('deposit', <bound method Account.deposit of <test_concat.Account object at 0x101b3c470>>), ('inquiry', <bound method Account.inquiry of <test_concat.Account object at 0x101b3c470>>), ('withdraw', <bound method Account.withdraw of <test_concat.Account object at 0x101b3c470>>)]
The inspect module has predicates for use with this method that include isclass, ismethod, isfunction, isgeneratorfunction, isgenerator, istraceback, isframe, iscode, isbuiltin, isroutine, isabstract and ismethoddescriptor.
5. inspect.stack(context=1): This returns a list of frame records for the caller's stack. The first entry in the returned list represents the caller; the last entry represents the outermost call on the stack.
6. inspect.trace(context=1): This returns a list of frame records for the stack between the current frame and the frame in which the exception currently being handled was raised. The first entry in the list represents the caller; the last entry represents where the exception was raised.