Python Mastery
David Beazley (@dabeaz)
https://www.dabeaz.com
Copyright (C) 2007-2024
(CC BY-SA 4.0), David Beazley
https://www.dabeaz.com/courses.html
Books/Video
https://www.safaribooksonline.com
Copyright (C) 2007-2024 (CC BY-SA 4.0), David Beazley, https://www.dabeaz.com
Course License and Usage
• Official course materials are here
https://github.com/dabeaz-course/python-mastery
You are free to use and adapt this material in any way that you wish as long
as you give attribution to the original source. If you choose to use selected
presentation slides in your own course, I kindly ask that the copyright notice,
license, and authorship in the lower-left corner remain present.
Course Setup
Python Review
(Optional)
• Downloads
• Documentation and tutorial
• Community Links
• News and more
• Tutorial
Running Python
• Python programs run inside an interpreter
• The interpreter is a simple "console-based"
application that normally starts from a
command shell (e.g., the Unix shell)
bash % python3
Python 3.6.0 (default, Jan 27 2017, 13:20:23)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)]
>>>
• Command line
bash % python3 helloworld.py
hello world
bash %
• #! (Unix)
#!/usr/bin/env python3
# helloworld.py
print('hello world')
Time: 10 minutes
• Prefer lowercase
• Use leading _ for "private" names
def _internal_helper():
    ...
• If-elif-else
if a == '+':
    op = PLUS
elif a == '-':
    op = MINUS
elif a == '*':
    op = TIMES
else:
    op = UNKNOWN
• .format() method
print('{:10s} {:10d} {:10.2f}'.format(name,shares,price))
• Use % operator
print('%10s %10d %10.2f' % (name, shares, price))
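The two formatting styles above can be compared side by side. The following is a minimal sketch with made-up sample values (`'IBM'`, `100`, `91.1` are assumptions for illustration only). It also shows an f-string, which accepts the same format codes as `.format()`. One detail worth noticing is that `{:10s}` left-aligns a string, whereas `%10s` right-aligns it.

```python
# Hypothetical sample values, chosen only for illustration
name, shares, price = 'IBM', 100, 91.1

# str.format(): strings are left-aligned by default, numbers right-aligned
with_format = '{:10s} {:10d} {:10.2f}'.format(name, shares, price)

# % operator: every field is right-aligned unless a '-' flag is given
with_percent = '%10s %10d %10.2f' % (name, shares, price)

# f-string: same format codes as .format(), written inline
with_fstring = f'{name:10s} {shares:10d} {price:10.2f}'

print(repr(with_format))
print(repr(with_percent))
print(with_format == with_fstring)
```

If identical output is needed from both styles, `'%-10s'` (left-align) or `'{:>10s}'` (right-align) can be used to make the alignment explicit.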
Time: 15 minutes
• To read data
line = f.readline() # Read a line of text
data = f.read([maxbytes]) # Read data
• Write to a file
with open('foo.txt','w') as f:
    f.write('some text\n')
    ...
Time: 10 minutes
• Calling a function
a = sumcount(100)
Time: 10 minutes
• What is a class?
• It's all of the function definitions that
implement the various methods
Instances
• Created by calling the class as a function
>>> a = Player(2, 3)
>>> b = Player(10, 20)
>>>
class Player:
    def __init__(self, x, y):     # self is the newly created object
        self.x = x
        self.y = y
        self.health = 100
Time: 10 minutes
a = foo.grok(2)
b = foo.spam('Hello')
...
a = m.sin(x)
def rectangular(r,theta):
    x = r*cos(theta)
    y = r*sin(theta)
    return x,y
...
r = gauss(0.0,1.0) # In what module?
Time: 5 minutes
Data Handling
Core Topics
• Data structures
• Containers and collections
• Iteration
• Understanding the builtins
• Object model
• Some options
• Tuple
• Dictionary
• Class instance
• Let's take a short tour
• Immutable
s[1] = 75 # TypeError. No item assignment
@dataclass
class Stock:
    name: str
    shares: int
    price: float
class Stock(typing.NamedTuple):
    name: str
    shares: int
    price: float
Stock = namedtuple('Stock',
['name', 'shares', 'price'])
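The three definitions above look alike but behave differently. The sketch below puts them next to each other under different names (`StockData`, `StockTyped`, and `StockPlain` are hypothetical names used here only so the three can coexist in one file). It illustrates that a dataclass instance is mutable by default, while both named-tuple forms are immutable and support unpacking.

```python
import typing
from collections import namedtuple
from dataclasses import dataclass

@dataclass
class StockData:                       # mutable; fields accessed by attribute
    name: str
    shares: int
    price: float

class StockTyped(typing.NamedTuple):   # immutable tuple with type hints
    name: str
    shares: int
    price: float

# Immutable tuple without type hints
StockPlain = namedtuple('StockPlain', ['name', 'shares', 'price'])

d = StockData('GOOG', 100, 490.1)
d.shares = 75                          # allowed: dataclasses are mutable by default

t = StockTyped('GOOG', 100, 490.1)
name, shares, price = t                # named tuples unpack like any tuple

try:
    t.shares = 75                      # not allowed on a named tuple
except AttributeError as e:
    print('Cannot modify a named tuple:', e)

p = StockPlain('GOOG', 100, 490.1)
print(p._asdict())                     # convert to a dictionary when needed
```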
Time : 30 minutes
• Common use
p = prices['IBM'] # Value lookup
p = prices.get('AAPL', 0.0) # Lookup with default value
prices['HPE'] = 37.42 # Assignment
• Usage:
p = prices['ACME', '2017-01-01']
prices['ACME','2017-01-04'] = 515.20
• Set comprehension
{ expression for item in sequence if condition }
• Dict comprehension
{ key:value for item in sequence if condition }
• What it means
result = []
for item in sequence:
    if condition:
        result.append(expression)
• Unique values
unique_names = {s['name'] for s in portfolio}
>>> d = defaultdict(list)
>>> d
defaultdict(<class 'list'>, {})
>>> d['x']
[]
>>> d
defaultdict(<class 'list'>, {'x': []})
>>>
>>> q = deque()
>>> q.append(1)
>>> q.append(2)
>>> q.appendleft(3)
>>> q.appendleft(4)
>>> q
deque([4, 3, 1, 2])
>>> q.pop()
2
>>> q.popleft()
4
>>>
history = deque(maxlen=N)
with open(filename) as f:
    for line in f:
        history.append(line)
        ...
• Solution: ChainMap
from collections import ChainMap
allprices = ChainMap(techs, auto)
>>> allprices['HPE']
34.23
>>> allprices['F']
51.1
>>>
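The ChainMap session above can be reproduced with a small sketch. The contents of `techs` and `auto` are assumptions: only the `'HPE'` and `'F'` values appear in the session, and the remaining entries are invented for illustration. The sketch also shows two behaviors that are easy to miss: a `ChainMap` is a live view over the underlying dictionaries, and writes always go to the first mapping.

```python
from collections import ChainMap

# Hypothetical price tables; only the 'HPE' and 'F' values come from the session above
techs = {'IBM': 91.5, 'HPE': 34.23}
auto = {'F': 51.1, 'GM': 23.45}

allprices = ChainMap(techs, auto)      # lookups search techs first, then auto
print(allprices['HPE'])
print(allprices['F'])

# A ChainMap is a view: changes to the underlying dicts are visible immediately
techs['AAPL'] = 123.45
print('AAPL' in allprices)

# Writes through the ChainMap always land in the first mapping
allprices['F'] = 52.0
print(techs['F'], auto['F'])
```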
Commentary
Time : 45 minutes
name values
'GOOG' [490.1, 485.25, 487.5 ]
'IBM' [91.5]
'HPE' [13.75, 12.1, 13.25, 14.2, 13.5 ]
'CAT' [52.5, 51.2]
for j in range(10,20):
    # j = 10,11,..., 19
for k in range(10,50,2):
    # k = 10,12,...,48
>>> c
{ 'name': 'GOOG', 'shares':100, 'price': 490.1,
  'date': '6/11/2007', 'time': '9:45am' }
>>>
1 4 9 16
>>> for n in squares:
...     print(n, end=' ')
...
(Notice: no output; the generator is already spent)
>>>
Time : 10 minutes
Time : 30 Minutes
pointer array
a b c d e
used reserved
list
items.append(5) 1 2 3 4 5
items.append(6) 1 2 3 4 5 6
items.append(7) 1 2 3 4 5 6 7
>>> a.__hash__()
-539294296
>>> b.__hash__()
1034194775
>>> c.__hash__()
2135385778
• Recurrence
i, h = perturb(i, h, size)
i = 7, 6, 1, 4, 5, 2, 3, 0, ...
[Figure: hash table slots 3 through 7, with slot 5 marked OCCUPIED]
Time : 25 Minutes
"c" [•, •]
>>> a = [1,2,3]
>>> func(a)
>>> a
[1, 2, 3, 42]
>>>
ref = 1
a = [4,5,6] "a" [4,5,6]
ref = 1
"b" [1,2,3]
ref = 7 ref = 2
'name' 'MSFT'
if op == '+':
    r = add(x, y)
elif op == '-':
    r = sub(x, y)
elif op == '*':
    r = mul(x, y)
elif op == '/':
    r = div(x, y)

ops = {
    '+' : add,
    '-' : sub,
    '*' : mul,
    '/' : div
}
r = ops[op](x,y)
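A runnable version of the dictionary dispatch idea might look like the sketch below. It assumes the functions come from the standard `operator` module (the slide does not say where `add`, `sub`, `mul`, and `div` are defined), and `calculate` is a hypothetical wrapper name added for illustration.

```python
import operator

# Map operator symbols to functions. Functions are ordinary objects,
# so they can be stored in a dictionary like any other value.
ops = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.truediv,
}

def calculate(op, x, y):
    # Hypothetical helper: look up the function, then call it
    try:
        func = ops[op]
    except KeyError:
        raise ValueError(f'Unknown operator: {op!r}') from None
    return func(x, y)

print(calculate('+', 2, 3))
print(calculate('/', 1, 4))
```

Adding a new operation then only requires a new dictionary entry, not another `elif` branch.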
Time : 25 Minutes
• Behavior
c = Connection('www.python.org',80)
c.open()
c.send(data)
c.recv()
c.close()
• What is a class?
• Mostly, it's a set of functions that carry out
various operations on so-called "instances"
Instances
• Instances are the actual "objects" that you
manipulate in your program
• Created by calling the class as a function
>>> a = Player(2, 3)
>>> b = Player(10, 20)
>>>
class Player:
    ...
    def move(self, dx, dy):
        self.x += dx
        self.y += dy
Time : 20 Minutes
• Example: Output
attributes = [ 'name', 'shares', 'price']
for attr in attributes:
    print(attr, '=', getattr(obj, attr))
Time : 15 Minutes
def cost(self):
    return self.shares * self.price
class NewDate(Date):
    ...
d = NewDate.today()       # Gets the correct class (e.g., NewDate)
• Example:
>>> SomeClass.yow()
SomeClass.yow
>>>
Time : 15 Minutes
class Child(Base):
    def spam(self):
        print('Spam', self._name)
@property
def shares(self):
    return self._shares
@shares.setter
def shares(self, value):          # assignment calls the setter
    if not isinstance(value, int):
        raise TypeError('Expected int')
    self._shares = value
@property
def cost(self):
    return self.shares * self.price
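To see the property fragments working together, here is a minimal sketch that places them inside a complete class. The `__init__` method is an assumption (the slides show only the property methods); note that assigning `self.shares` inside `__init__` already goes through the setter, so the type check applies at construction time as well.

```python
class Stock:
    def __init__(self, name, shares, price):
        # Assumed constructor; not shown in the fragments above
        self.name = name
        self.shares = shares      # goes through the shares setter below
        self.price = price

    @property
    def shares(self):
        return self._shares

    @shares.setter
    def shares(self, value):
        if not isinstance(value, int):
            raise TypeError('Expected int')
        self._shares = value

    @property
    def cost(self):
        # Computed attribute: accessed as s.cost, with no parentheses
        return self.shares * self.price

s = Stock('GOOG', 100, 490.1)
s.shares = 50                     # calls the setter
print(s.cost)

try:
    s.shares = '50'               # rejected by the setter
except TypeError as e:
    print('Error:', e)
```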
Time : 15 Minutes
class Child(Parent):
    ...
class MyStock(Stock):
    def __init__(self, name, shares, price, factor):
        super().__init__(name, shares, price)
        self.factor = factor
    def cost(self):
        return self.factor * super().cost()
class MyStock(Stock):
    ...
Time : 20 Minutes
def __str__(self):
    return '%d-%d-%d' % (self.year,
                         self.month,
                         self.day)
def __repr__(self):
    return 'Date(%r,%r,%r)' % (self.year,
                               self.month,
                               self.day)
• Definition in a class
class Container:
    def __len__(self):
        ...
    def __getitem__(self,a):
        ...
    def __setitem__(self,a,v):
        ...
    def __delitem__(self,a):
        ...
    def __contains__(self,a):
        ...
Methods: Mathematics
• Mathematical operators
a + b a.__add__(b)
a - b a.__sub__(b)
a * b a.__mul__(b)
a / b a.__truediv__(b)
a // b a.__floordiv__(b)
a % b a.__mod__(b)
a << b a.__lshift__(b)
a >> b a.__rshift__(b)
a & b a.__and__(b)
a | b a.__or__(b)
a ^ b a.__xor__(b)
a ** b a.__pow__(b)
-a a.__neg__()
~a a.__invert__()
abs(a) a.__abs__()
d = Date.today()
• Typical uses:
• Proper shutdown of system resources (e.g.,
network connections)
• Releasing locks (e.g., threading)
• Avoid defining it for any other purpose
• Example use:
>>> m = Manager()
>>> with m:
... print('Hello World')
...
Entering
Hello World
Leaving
>>>
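The session above can be produced by a class along the following lines. This is a sketch of one possible `Manager`, since the class definition itself does not appear in this excerpt; the only requirement is that it implements `__enter__` and `__exit__`.

```python
class Manager:
    def __enter__(self):
        print('Entering')
        return self               # value bound by "with m as name:"

    def __exit__(self, exc_type, exc_value, traceback):
        # Called on the way out, whether or not an exception occurred
        print('Leaving')
        return False              # do not suppress exceptions

m = Manager()
with m:
    print('Hello World')
```

Returning a false value from `__exit__` lets any exception raised inside the `with` block propagate after the cleanup code has run.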
Exercise 3.6
Time : 15 Minutes
class IStream(ABC):
    @abstractmethod
    def read(self, maxbytes=None):
        pass
    @abstractmethod
    def write(self, data):
        pass
>>> p = UnixPipe()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class UnixPipe with abstract methods read
>>>
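The error above arises when a subclass leaves an abstract method unimplemented. The sketch below reproduces it. The body of `UnixPipe` is an assumption (the slide shows only the error), and `MemoryStream` is a hypothetical complete implementation added for contrast.

```python
from abc import ABC, abstractmethod

class IStream(ABC):
    @abstractmethod
    def read(self, maxbytes=None):
        pass

    @abstractmethod
    def write(self, data):
        pass

class UnixPipe(IStream):
    # Assumed incomplete subclass: write() is defined, read() is missing
    def write(self, data):
        pass

class MemoryStream(IStream):
    # Hypothetical complete subclass, for comparison
    def __init__(self):
        self._buffer = b''

    def read(self, maxbytes=None):
        if maxbytes is None:
            maxbytes = len(self._buffer)
        data, self._buffer = self._buffer[:maxbytes], self._buffer[maxbytes:]
        return data

    def write(self, data):
        self._buffer += data

try:
    p = UnixPipe()                # fails: read() is still abstract
except TypeError as e:
    print('TypeError:', e)

m = MemoryStream()                # works: every abstract method is implemented
m.write(b'hello world')
print(m.read(5))
```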
Handler Classes
formatter = TextTableFormatter()
print_table(portfolio, ['name','shares'], formatter)
parser = DictCSVParser()
portfolio = parser.parse('portfolio.csv')
Time : 15 Minutes
class Child(Parent):
    def spam(self):
        print('Different spam')
        super().spam()
class B(A):
...
>>> d = LoudDog()
>>> d.noise()
'WOOF'
>>>
Time : 15 Minutes
self.__dict__
{
    'name' : 'GOOG',
    'shares' : 100,
    'price' : 490.10
}
Stock.__dict__ (methods)
{
    'cost' : <function>,
    'sell' : <function>,
    '__init__' : <function>,
}
Instances and Classes
• Instances and classes are linked together
• __class__ attribute refers back to the class
>>> s = Stock('GOOG', 100, 490.10)
>>> s.__dict__
{'name':'GOOG','shares':100,'price':490.10 }
>>> s.__class__
<class '__main__.Stock'>
>>>
[Figure: instances have .__dict__ {attrs} and .__class__, which refers to the class; the class has .__dict__ {methods}]
Time : 10 Minutes
3
look in __bases__
Single Inheritance
• In inheritance hierarchies, attributes are
found by walking up the inheritance tree
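class A(object): pass
class B(A): pass
class C(A): pass
class D(B): pass
class E(D): pass
[Figure: inheritance tree with object at the top, A below it, B and C below A, and D below B]
• With single inheritance, there is a ...
[Figure: MRO chain D, A, B, C, Base, object, traversed by super()]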
Time : 25 Minutes
Stock.__dict__ (methods)
{
    'cost' : <function>,
    'sell' : <function>,
    '__init__' : <function>,
}
Reading Attributes (Reprise)
• Recall that a two-step process is used to
locate attributes on objects
>>> s = Stock(...)
>>> s.name
'GOOG'
>>> s.cost()
49010.0
>>>
[Figure: (1) s.__dict__ holds {'name': 'GOOG', 'shares': 100} and s.__class__ refers to Stock; (2) Stock.__dict__ holds {'cost': <func>, 'sell': <func>, '__init__': ..}]
f = Foo()
f.a = 23 # Stores value in f.__dict__['a']
>>> f = Foo()
>>> f.a
Getting!
>>> f.__dict__['a'] = 42
>>> f.a
42
>>>
Notice how the value in the dictionary hides the descriptor
Descriptor Naming
• Descriptors can define a name setter
class Descriptor:
    def __init__(self, name=None):
        self.name = name
Time : 15 Minutes
• __getattribute__(self,name)
• Called every time an attribute is read
• Default behavior looks for descriptors,
checks the instance dictionary, checks base
classes (inheritance), etc.
• If it can't find the attribute after all of those
steps, it invokes __getattr__(self,name)
• __getattr__(self,name)
• A failsafe method. Called if an attribute can't
be found using the standard mechanism
• Default behavior is to raise AttributeError
• Sometimes customized
• __setattr__(self,name,value)
• Called every time an attribute is set
• Default behavior checks for descriptors,
stores values in the instance dictionary, etc.
• __delattr__(self,name)
• Called every time an attribute is deleted
• Default behavior checks for descriptors and
deletes from the instance dictionary
>>> p = Proxy(c)
>>> p
<__main__.Proxy object at 0x37f130>
>>> p.radius
getattr: radius
4.0
>>> p.area()
getattr: area
50.26548245743669
>>>
Notice how attribute access gets captured by __getattr__ and then redirected to the original object
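A proxy that behaves like the session above can be sketched as follows. Both class definitions are assumptions: the excerpt shows only the interactive output, so `Circle` is a hypothetical wrapped class chosen to match the values printed (a radius of 4.0), and `_obj` is a hypothetical attribute name.

```python
import math

class Circle:
    # Hypothetical wrapped class, chosen to match the session output
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

class Proxy:
    def __init__(self, obj):
        self._obj = obj           # stored in the proxy's own __dict__

    def __getattr__(self, name):
        # Only called when normal lookup fails, so self._obj itself
        # is found directly and does not recurse into this method
        print('getattr:', name)
        return getattr(self._obj, name)

c = Circle(4.0)
p = Proxy(c)
print(p.radius)
print(p.area())
```

Because `__getattr__` is a fallback rather than an interception of every lookup, attributes defined on the proxy itself are never forwarded.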
Time : 10 Minutes
Functions
• Function design
• Functional Programming
• Error handling and Logging
• Testing
arguments function
Yes No
with open('Data.csv') as f:
    data = read_data(f)
r = requests.get('http://place/data.csv')
data = read_data(r.iter_lines(decode_unicode='utf-8'))
Time : 10 Minutes
t1 = Thread(target=foo)      # concurrent execution
t1.start()
t2 = Thread(target=bar)
t2.start()
fut = Future()
• To store a result
fut.set_result(value)
• Coordination is required
Future Example
def func(x, y, fut):
    time.sleep(20)
    fut.set_result(x+y)
def caller():
    fut = Future()
    threading.Thread(target=func, args=(2, 3, fut)).start()
    result = fut.result()
    print('Got:', result)
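The Future example becomes runnable once its imports are added. The sketch below assumes `Future` is `concurrent.futures.Future` (the slide does not show the import) and shortens the sleep from 20 seconds to a fraction of a second so it finishes quickly. The standard library documentation notes that `Future` objects are normally created by executors rather than directly, so treat this as a demonstration of the mechanism only.

```python
import threading
import time
from concurrent.futures import Future   # assumed source of Future

def func(x, y, fut):
    time.sleep(0.1)              # shortened from the 20 seconds on the slide
    fut.set_result(x + y)        # store the result and wake up any waiter

def caller():
    fut = Future()
    threading.Thread(target=func, args=(2, 3, fut)).start()
    result = fut.result()        # blocks until set_result() is called
    print('Got:', result)
    return result

caller()
```

The `Future` handles the coordination: `caller()` needs no explicit lock or event to wait for the worker thread.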
Time : 15 Minutes
• Essential features...
• Functions can accept functions as input
• Functions can return functions as results
• Python supports both
def square(x):
    return x * x
nums = [1, 2, 3, 4]
r = sum_map(square, nums)
nums = [1, 2, 3, 4]
result = sum_map(lambda x: x*x, nums)
• functools.partial
from functools import partial
dist_from10 = partial(distance, 10)
def square(x):
    return x * x
nums = [1, 2, 3, 4]
result = reduce(lambda x, y: x + y, map(square, nums))   # reduce() is in functools
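The higher-order function examples can be tied together in one runnable sketch. The definition of `sum_map` is an assumption, since only its calls appear in this excerpt; here it applies a function to each value and adds up the results. For the map-reduce form, `operator.add` is used as the two-argument combining function, because `reduce()` expects a function of two arguments.

```python
import operator
from functools import reduce

def sum_map(func, values):
    # Assumed definition: apply func to each value and total the results
    total = 0
    for v in values:
        total += func(v)
    return total

def square(x):
    return x * x

nums = [1, 2, 3, 4]

r1 = sum_map(square, nums)                     # function passed by name
r2 = sum_map(lambda x: x * x, nums)            # same thing with a lambda
r3 = reduce(operator.add, map(square, nums))   # map, then reduce pairwise

print(r1, r2, r3)
```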
Commentary
• Subdividing problems into small composable
parts is a useful software architecture tool
Data
map
CPUs
reduce
Time : 10 Minutes
>>> a = add(3, 4)
>>> a.__closure__
(<cell at 0x54f30: int object at 0x54fe0>,
<cell at 0x54fd0: int object at 0x54f60>)
>>> a.__closure__[0].cell_contents
3
>>> a.__closure__[1].cell_contents
4
>>> a = add(3, 4)
>>> a.__closure__
(<cell at 0x10bb52708: int object at 0x10b5d3610>,)
>>> a.__closure__[0].cell_contents
7
>>>
>>> c = counter(10)
>>> c()
11
>>> c()
12
>>>
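A `counter` function that behaves like the session above could be written as a closure. This is a sketch under the assumption that the counter keeps its state in the enclosing function's variable; the slide's own definition is not part of this excerpt.

```python
def counter(n):
    def incr():
        nonlocal n               # modify the variable in the enclosing scope
        n += 1
        return n
    return incr

c = counter(10)
print(c())
print(c())

# The state lives in the closure cell, not in a global variable
print(c.__closure__[0].cell_contents)

# Each call to counter() creates an independent closure
d = counter(100)
print(d())
```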
Time : 10 Minutes
• Argh!!!! Boom!
try:
    # Some complicated operation
    ...
except Exception:
    # !! TODO
    pass
[Figure: Ariane 5]
• Unwrapping it later
try:
    ...
except TaskError as e:
    print("It didn't seem to work.")
    print("Reason:", e.__cause__)
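For `e.__cause__` to be populated, the exception has to be raised with `raise ... from ...`. The sketch below shows both halves. `TaskError` is defined here as a plain `Exception` subclass and `run_task` is a hypothetical function; neither definition appears in this excerpt.

```python
class TaskError(Exception):
    # Assumed definition of the application-level exception
    pass

def run_task():
    # Hypothetical task that fails with a low-level error
    try:
        return int('N/A')
    except ValueError as e:
        # Wrap the low-level error; "from e" records it as __cause__
        raise TaskError('Task failed') from e

cause = None
try:
    run_task()
except TaskError as e:
    print("It didn't seem to work.")
    print("Reason:", e.__cause__)
    cause = e.__cause__          # keep a reference; e is cleared after the block
```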
class SomeOtherError(ApplicationError):
    pass
def read_data(filename):
    ...
    try:
        name = row[0]
        shares = int(row[1])
        price = float(row[2])
    except ValueError as e:
        log.warning("Bad row: %s", row)
        log.debug("Reason : %s", e)
Time : 15 Minutes
>>> add(2,2)
4
>>> add('hello','world')
'helloworld'
>>>
• Let's test it
Assertions/Contracts
• Assertions are runtime checks
def add(x, y):
    '''
    Adds x and y
    '''
    assert isinstance(x, int)
    assert isinstance(y, int)
    return x + y
def test_str(self):
    # Test with strings
    r = simple.add('hello', 'world')
    self.assertEqual(r, 'helloworld')
# Assert that x == y
self.assertEqual(x,y)
Time : 15 Minutes
1 (2,3,4,5)
{ 'flag' : True,
'mode' : 'fast',
'header' : 'debug' }
func(data, **kwargs)
# Same as func(data,color='red',delimiter=',',width=400)
Time : 20 minutes
def func():
    y = value # Local variable
def func():
    global x
    x = 37
Time : 15 minutes
• Stored in __annotations__
>>> func.__annotations__
{'a': <class 'int'>, 'b': <class 'int'>,
'return': <class 'int'>}
>>>
func.threadsafe = False
func.blah = 42
args = (1, 2)
kwargs = {'c': 10}
Time : 15 minutes
Time : 15 minutes
Time : 15 Minutes
Metaprogramming
• Example use:
>>> add(3, 4)
7
>>> logged_add(3, 4)
Calling add
7
>>>
This extra output is created by the wrapper, but the original function is still called to get the result
• Usage:
>>> logged_add = logged(add)
>>> logged_add
<function wrapper at 0x378670>
>>> logged_add(3, 4)
Calling add
7
>>>
Wrappers as Replacements
• When you create a wrapper, you often want
to replace the original function with it
def add(x, y):
    return x + y
• Usage:
@timethis
def bigcalculation():
    statements
    statements
Time : 15 Minutes
>>> add.__name__
'add'
>>> add.__doc__
'Adds x and y'
>>> help(add)
Help on function add in module __main__:
add(x, y)
Adds x and y
>>>
>>> add.__name__
'wrapper'
>>> add.__doc__
>>> help(add)
Help on function wrapper in module __main__:
wrapper(*args, **kwargs)
>>>
• This is a problem
Copying Metadata
• Decorators should copy metadata
def logged(func):
    def wrapper(*args, **kwargs):
        print('Calling', func.__name__)
        return func(*args, **kwargs)
    wrapper.__name__ = func.__name__     # manual copying
    wrapper.__doc__ = func.__doc__       # of metadata
    return wrapper
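The standard library offers `functools.wraps` as a way to do this metadata copying without writing it by hand. The following sketch applies it to the same `logged` decorator. Whether the course introduces `wraps` at this point is not visible in this excerpt, so treat it as one common approach rather than the author's.

```python
from functools import wraps

def logged(func):
    @wraps(func)                 # copies __name__, __doc__, and more
    def wrapper(*args, **kwargs):
        print('Calling', func.__name__)
        return func(*args, **kwargs)
    return wrapper

@logged
def add(x, y):
    '''Adds x and y'''
    return x + y

print(add(3, 4))
print(add.__name__)
print(add.__doc__)
```

`wraps` also sets `__wrapped__` on the wrapper, which gives access to the original undecorated function.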
• Example use:
@logmsg('You called {name}')
def add(x, y):
    return x + y
@logged
def add(x, y):
    return x + y
Time : 15 Minutes
# Replacement method
def __getattribute__(self, name):
    print('Getting:', name)
    return orig_getattribute(self, name)
>>> s = MyClass()
>>> s.x = 23
>>> s.x
Getting: x
>>> s.foo()
Getting: foo
>>> s.bar()
Getting: bar
>>>
Notice how all lookups now have logging
class Child1(Parent):
    pass
Time : 15 Minutes
>>> s = Spam()
>>> type(s)
<class '__main__.Spam'>
>>>
>>> type
<class 'type'>
>>>
Time : 15 Minutes
Time : 10 Minutes
class definition:
type.__new__(type, name, bases, dict)
type.__init__(cls, name, bases, dict)
class dupemeta(type):
    @classmethod
    def __prepare__(cls, name, bases):
        return dupedict()
• Example:
class A(metaclass=dupemeta):
    def bar(self):
        pass
    def bar(self):       # Fails! Duplicate
        pass
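The duplicate detection relies on the dictionary returned by `__prepare__`, which Python uses as the namespace while the class body executes. The definition of `dupedict` is not part of this excerpt, so the version below is an assumption: a `dict` subclass that raises `TypeError` (an arbitrary choice of exception) on a second assignment to the same key.

```python
class dupedict(dict):
    # Assumed helper: a dict that rejects a second assignment to any key
    def __setitem__(self, key, value):
        if key in self:
            raise TypeError(f'{key!r} is defined more than once')
        super().__setitem__(key, value)

class dupemeta(type):
    @classmethod
    def __prepare__(cls, name, bases):
        # The returned mapping receives every name defined in the class body
        return dupedict()

class Good(metaclass=dupemeta):      # no duplicates: works normally
    def bar(self):
        pass

error_message = None
try:
    class A(metaclass=dupemeta):
        def bar(self):
            pass
        def bar(self):               # second definition triggers the check
            pass
except TypeError as e:
    error_message = str(e)
    print('Fails!', error_message)
```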
class meta(type):
    @staticmethod
    def __new__(meta, clsname, bases, dict):
        for key, val in dict.items():
            if callable(val):
                dict[key] = decorator(val)
        return super().__new__(meta, clsname, bases, dict)
• Example:
>>> class A(metaclass=meta):
...     pass
...
>>> a = A()
Creating instance of <class '__main__.A'>
>>>
Time : 10 Minutes
def __iter__(self):
    return self._holdings.__iter__()
def __iter__(self):
    n = self.n
    while n > 0:
        yield n
        n -= 1
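The second `__iter__` fragment above can be placed in a complete class to show why a generator-based `__iter__` is convenient. The class name `Countdown` and its constructor are assumptions added for illustration.

```python
class Countdown:
    # Hypothetical class wrapped around the __iter__ fragment above
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # A generator function: each call returns a fresh iterator
        n = self.n
        while n > 0:
            yield n
            n -= 1

c = Countdown(3)
for x in c:
    print(x)

# Because __iter__ copies self.n into a local, the object can be iterated again
print(list(c))
```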
Time : 20 Minutes
def producer():
    ...
    yield item
    ...
b = processing(a)
c = consumer(b)
Time : 15 minutes
@consumer
def match(pattern):
    ...
• Hooking it up
follow('access-log',
       match('python',
             printer()))
• A picture
send() send()
follow() match() printer()
send()
coroutine
Time : 15 Minutes
Time : 10 Minutes
def countup(n):
    x = 0
    while x < n:
        print('Up we go', x)
        yield
        x += 1
Time : 20 Minutes
def up_and_down(n):
    countup(n)
    countdown(n)
>>> g = greeting('Guido')
>>> g
<coroutine object greeting at 0x10b8b8258>
>>> g.send(None)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration: Hello Guido
Time : 15 Minutes
a = foo.grok(2)
b = foo.spam('Hello')
...
def import_module(name):
    # locate the module and get source code
    filename = find_module(name)
    code = open(filename).read()
    # make a new module object
    mod = types.ModuleType(name)
    # Run it
    exec(code, mod.__dict__, mod.__dict__)
    return mod
def import_module(name):
    # Check for cached module
    if name in sys.modules:
        return sys.modules[name]
    filename = find_module(name)
    code = open(filename).read()
    mod = types.ModuleType(name)
    sys.modules[name] = mod
    exec(code, mod.__dict__, mod.__dict__)
    return mod
grok(2)
grok(2)
spam('Hello')
...
10 minutes
• Example :
# bar.py
• Example:
>>> import xml
>>> xml.__package__
'xml'
>>> xml.__path__
['/usr/local/lib/python3.5/xml']
>>>
10 minutes
# bar.py
class Bar:
    ...
...
# __init__.py
f = spam.Foo()
b = spam.Bar()
...
_collections_abc.py: Container, Hashable, Mapping, ...
from _collections_abc import *
class OrderedDict(dict):
    ...
class Counter(dict):
    ...
class Foo:
    ...
...
class Bar:
    ...
...
# bar.py
class Bar:
    ...
...
# __init__.py
20 minutes
15 minutes