December 2013
SOLID Python
SOLID principles applied to a dynamic programming language
Duncan Watson-Parris
We’re starting to use Python more and more on projects and increasingly for more complex
applications rather than just as a scripting language. The language itself is also rapidly maturing and
there are a huge number of very capable (and freely available) scientific and numerical libraries.
There is now a GPG for Python which is a useful reference for dos and don’ts, and acceptable style,
but this article will hopefully be a bit less prescriptive and help you think about how you structure and
design your Python code in a more 'Pythonic' way. 'Pythonic' just means idiomatic Python: it refers to the style, but also to the architecture and design of the code. Often we come to develop code in Python bringing our experience from another language such as Java or C#, and try to bend Python to the architectural styles we are used to in those strongly typed languages.
First I will discuss each of the SOLID principles in turn, then have a brief discussion about one of the
other core design principles which you find in good Python (EAFP) and how this might relate to the
SOLID principles. Finally, in the Appendix, I have included a set of other hints and ideas which will
hopefully inspire you to write more Pythonic code!
An online presentation on applying SOLID principles in Python was pointed out to me during the review of this article. The author makes some points similar to mine, and the examples might be useful, but unfortunately there isn't any accompanying voice-over so it is quite hard to follow.
1 SOLID Principles
I won’t dwell on each of the principles themselves here, as there are far better accounts of them in [1]
and [2]. Rather I’m more interested in how to apply these principles in Python. It would be very easy
to just take the examples provided in [2] and apply them to Python, but applying them in such a way
that the code remains Pythonic is a bit more challenging.
It's worth noting that these are principles only, and it won't necessarily be possible or even desirable to apply them in every case - that's what makes architecture difficult!
1.1 The Single Responsibility Principle (SRP)
A class should have only one responsibility. More specifically, a class should have only one reason to change. This principle sounds very simple,
and it is. Sticking to it, and fixing it once you spot there is a problem can be more complicated though.
It’s very easy to find yourself in the situation where you are adding a method to a class because there
isn’t really anywhere else it fits, or maybe it could fit in any of two or three classes. Over time you find
that a class which started off with a well-defined responsibility now has many responsibilities. The
class is coupling these responsibilities together, and changes to one of those responsibilities can lead
to the class being unable to meet its other responsibilities. Thus the coupling has led to complexity
and fragility.
Until the need arises to change one of the responsibilities, it may be that having multiple responsibilities is a perfectly sensible design; again [2] gives some excellent advice: "An axis of
change is an axis of change only if the changes actually occur.”
Fortunately, if we have written good flexible Python, fixing this is extremely easy. Take the following
example:
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def draw(self):
        # Do some drawing
        pass

    def area(self):
        return self.width * self.height
We have a trivial Rectangle class which is responsible for both the geometric properties of the
rectangle and also the GUI representation of it. This may be acceptable early in the development of
the system but later on we realise we need to split the responsibility because the GUI representation
needs factoring out. So we simply split the class:
class GeometricRectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class DrawRectangle:
    def draw(self):
        # Do some drawing
        pass
Now the individual classes will work wherever the main class was used before – assuming the code
was well structured. That is, if the code using those types relied only on the properties of the object that it actually needs to do its job. This is something I will touch on again later. Notice that, because of
Python’s duck-typing, we don’t have to update any signatures of interfaces to use the new types. Of
course, we may have decided to have DrawRectangle inherit GeometricRectangle.
1.2 The Open/Closed Principle (OCP)
Software entities should be open for extension, but closed for modification. At first reading this statement may seem contradictory, but in any OOP language this is trivially
achieved through abstraction. The base (or abstract) class is closed for modification and we
implement concrete subclasses in order to modify their behaviour. This is also important in Python -
and again easy to follow.
Sub-classing is straightforward, the decorator pattern can also be useful, and if we needed abstract base classes we could use the abc module [4], but it's interesting to note that Python offers us some other, more exotic, options, as explored below. These are mostly exotic for a reason though! Most
of the time we can achieve OCP without resorting to these, but as [2] says: “there will always be some
kind of change against which [our module] is not closed”, and this is where these options can be
useful.
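As a minimal sketch of the usual route to OCP (the Shape and Circle names here are purely illustrative), a reporting function can be closed against change by depending only on a base class, with new behaviour added through new subclasses:

class Shape(object):
    def area(self):
        # Subclasses provide the actual behaviour
        raise NotImplementedError("Subclasses must implement area()")


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius ** 2


def total_area(shapes):
    # Closed for modification: supporting a new shape means adding a
    # subclass, not editing this function
    return sum(shape.area() for shape in shapes)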
1.2.1 Mix-ins
Python allows for multiple inheritance of concrete subclasses. This allows us to create ‘mix-ins’,
multiple classes which each provide specific functionality, and are intended to be inherited together to
create a 'mixed' class. This possibility doesn't exist in Java or C#, but if you are coming from those languages it might be useful to think of a mix-in as an interface which you don't have to implement, because it has already been implemented for you (an analogy suggested in an answer on SO).
The benefits are obvious: it allows you to ensure your classes retain only one responsibility (as
discussed above) while giving you powerful options for modifying the properties of your base class.
This option should be used with care though as using it a lot can lead to namespace pollution,
particularly for large classes which you expect to be in turn sub-classed. You should also consider the
order of inheritance, as it may not be what you expect. That is not to say you cannot create reliable and extensible classes using multiple inheritance; [5] provides an excellent explanation of the
method resolution order, with many examples.
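A minimal sketch of the idea (the class and method names are purely illustrative): each mix-in carries one well-defined piece of behaviour, and the 'mixed' class simply inherits them all:

import json


class JsonSerializableMixin(object):
    def to_json(self):
        return json.dumps(self.__dict__)


class ComparableByAreaMixin(object):
    def larger_than(self, other):
        return self.area() > other.area()


class MixedRectangle(JsonSerializableMixin, ComparableByAreaMixin):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height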
1.2.2 Monkey-Patching
Another option available to us in Python which may not exist in statically typed languages is the
possibility of monkey patching (see a good explanation, with appropriate health warnings, here [6]).
In Python we are able to change the functionality of any method, class or function at will. We can
even add methods to classes (or individual instances!) at run-time. For example, imagine we had
created a GeometricRectangle using our previous example, but in order to make it fit into a badly
designed API which insisted on the object having a name() attribute we might consider the following
solution:
shape = GeometricRectangle(2, 5)
def name():
return "I'm a rectangle"
shape.name = name
We've added a new function to a single instance of an object just by assignment! Note that I have chosen my words carefully here: this really is a function on the instance – not a method. It has no
access to the instance attributes. It is however possible to add methods to classes in this way,
(stretching the rectangle example to the limit!) consider the following example:
def square_area(self):
return self.width ** 2
GeometricRectangle.area = square_area
square = GeometricRectangle(2, 5)
We've completely changed the implementation of the area method on every instance of GeometricRectangle, whether created before or after the patch! Modifying classes themselves however is moving into
meta-programming and out of the scope of this article (see [7] for a good introduction). Now, it goes
without saying that if you start using this all over the place then you will very quickly end up with un-
testable, un-manageable code. With a little respect, and used judiciously it can however be a very
powerful tool to allow us to modify behaviour without changing the underlying code.
Using the @overload decorator proposed in PEP 3124 [8] it would be possible to create function overloads which perform different behaviour for different argument types. In my mind this is slightly better than using isinstance checks inside the function body, though note that the proposal was deferred and never made it into the standard library.
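PEP 3124 aside, if run-time dispatch on argument type is genuinely needed, functools.singledispatch (added in Python 3.4) provides something similar for the first argument; a minimal sketch with an invented half function:

from functools import singledispatch


@singledispatch
def half(value):
    # Fallback when no more specific overload is registered
    raise TypeError("Unsupported type: %r" % type(value))


@half.register(int)
def _(value):
    return value // 2


@half.register(str)
def _(value):
    return value[:len(value) // 2]


print(half(10))      # 5
print(half("spam"))  # 'sp'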
1.3 The Liskov Substitution Principle (LSP)
The Liskov Substitution Principle basically states that a subclass should be usable anywhere its parent class is expected, without the client being able to tell the difference. Again this is a simple enough principle which throws up some quite subtle difficulties in implementation.
Often we use the IS-A test when deciding whether a type should sub-class another type. For
example, a cat "is a" mammal, therefore in our OO design we might define a Cat class which
subclasses our abstract base class Mammal. This might be a sensible design decision, but it might
not – depending on the use of our subclass. [1 and 2] provide the same excellent example where
Square subclasses Rectangle. This may seem acceptable at first but the difficulties become apparent
when we consider the behaviour of each class when manipulating their Height and Width properties.
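A minimal sketch of that problem (the setter names are invented for this illustration): a Square that keeps its sides equal silently breaks code written against Rectangle:

class Rectangle(object):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def set_width(self, width):
        self.width = width

    def set_height(self, height):
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    def set_width(self, width):
        # A square must keep its sides equal
        self.width = self.height = width

    def set_height(self, height):
        self.width = self.height = height


def stretch(rect):
    # Written with Rectangle in mind: width and height are assumed independent
    rect.set_width(4)
    rect.set_height(5)
    assert rect.area() == 20  # Passes for a Rectangle, fails for a Square (25)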
Most of the arguments and examples on this principle are equally valid in Python, but I think you have
to be particularly careful of this in Python because it is so easy to override methods and variables – as
we have seen above! Changes in behaviour using e.g. monkey patching will almost inevitably break
the Liskov Substitution Principle (but may be justified in some circumstances).
It’s interesting to consider for a moment how this principle relates to classes with multiple inheritance.
For example if we had used mix-ins to satisfy OCP should our subclass be replaceable with all of its
base types? I would argue not. The whole point of using mix-ins is that the behaviour of the subclass
is the sum of the behaviours of the base classes.
1.4 The Interface Segregation Principle (ISP)
This principle aims to ensure that clients are not forced to depend on methods which they do not use.
For me this is a key principle in good Python programming, and something I will come back to in the
next section. A good way of ensuring this is by separation through multiple inheritance. In [1 & 2] this
is done using interfaces because that is the only way of implementing multiple inheritance in Java. In
Python we are free to inherit from multiple concrete classes, and this is precisely the purpose of the mix-ins discussed above – to provide each client with only the specific behaviours it needs.
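As a minimal illustration (the class and function names are invented for this sketch), each client function depends only on the behaviour it actually uses, and the mixed class satisfies both without either client knowing about the other:

class Printable(object):
    def pretty_print(self):
        print(self.__dict__)


class Persistable(object):
    def save(self, path):
        with open(path, "w") as f:
            f.write(str(self.__dict__))


class Report(Printable, Persistable):
    def __init__(self, title):
        self.title = title


def show(item):
    # This client only ever relies on pretty_print()
    item.pretty_print()


def archive(item, path):
    # This client only ever relies on save()
    item.save(path)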
1.5 The Dependency Inversion Principle (DIP)
The Dependency Inversion Principle basically states that even high level modules should depend
upon abstractions, not low level classes (details). I think we get this for free with good Python: the
answer to this problem in [2] is the use of interfaces to define high-level abstractions which the details
need to implement; we go one better than that and only rely on the given object having the required
properties.
For me this is a key point: You shouldn’t assume properties on an object unless they’re needed in the
operation the function is performing. This is related to the principle mentioned above. Because of
duck typing a client can’t be forced to rely on a method they do not use – but we should actively
ensure that clients only rely on the minimum properties required. A good example of this is a method which checks an object's length before looping over it: in the future it may be passed an object (such as a generator, see Appendix 5.2) which is fine to iterate over but doesn't have a length. By checking the length up front the method needlessly excludes such objects, even though the loop itself would have worked perfectly well.
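A minimal sketch of the difference (the function names are invented here):

def total_lbyl(items):
    # Needlessly assumes the argument supports len()
    if len(items) == 0:
        return 0
    return sum(items)


def total_eafp(items):
    # Only assumes the argument is iterable, so generators work too
    return sum(items)


total_eafp(x * x for x in range(10))  # fine
total_lbyl(x * x for x in range(10))  # TypeError: generators have no len()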
In fact, I would argue in Python that you often don’t need interfaces at all: You shouldn't separate the
definition of the behaviour from the implementation unless you have to. A given client can assume an
argument has a given property and it is up to the programmer and his unit tests to ensure it does. A
well-documented function should describe the behaviours it expects of an argument – not the type of
an argument.
There are situations when you might need to define an interface in order to make an explicit contract,
such as for APIs or classes which you expect to be extended as part of a library or framework. In this
case you can use abstract base classes [4] which, because Python allows multiple inheritance, are
essentially the same as interfaces.
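Where such an explicit contract is wanted, a minimal sketch using the abc module might look like the following (the Serializer name and its method are illustrative; the __metaclass__ form matches the Python 2 documentation referenced in [4], on Python 3 you would declare the metaclass in the class statement instead):

import abc
import json


class Serializer(object):
    __metaclass__ = abc.ABCMeta  # Python 3: class Serializer(metaclass=abc.ABCMeta)

    @abc.abstractmethod
    def serialize(self, obj):
        """Turn obj into a string representation."""


class JsonSerializer(Serializer):
    def serialize(self, obj):
        return json.dumps(obj)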
2 EAFP
One of the other core idioms of good Python is EAFP: it's "Easier to Ask Forgiveness than Permission". Rather than checking preconditions up front ("Look Before You Leap", or LBYL), we simply attempt the operation and handle the exception if it fails. Consider the following trivial example function which takes the square root of a real number:

import math

# LBYL
def square_root_lbyl(i):
    if i >= 0.0:
        return math.sqrt(i)
    else:
        pass  # handle the error case


# EAFP
def square_root_eafp(i):
    try:
        return math.sqrt(i)
    except ValueError:
        pass  # handle the error case
In the LBYL example the very act of checking assumes the ability to compare i to a real number,
whereas in the EAFP example we make no such assumption. In fact, if in the future we wanted to be able to use our function for taking the square root of complex numbers as well (by switching the implementation to cmath.sqrt, say), the comparison in the LBYL code would have to change. The EAFP code would still work with no modifications.
Admittedly the try, except blocks can become complicated but Python provides us with a tool which
may help here. The ‘with’ statement allows the encapsulation of try, except, finally blocks in a simple
and elegant way. The most common usage of this is probably for file handling:
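For example (the filename here is purely illustrative):

with open("data.txt") as input_file:
    # do something with the file
    contents = input_file.read()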
The ‘with’ statement doesn’t take care of the fact that the file may not exist, or other IO errors, but it
does ensure that if an exception occurs in the ‘do something’ block then the file gets closed
regardless. Obviously, this is most useful for IO, or network connections where you have to ensure
some finally block is executed, but should be extendable to more general scenarios.
In order to be able to use a with statement in your own code you can create a context manager which implements both __enter__() and __exit__() methods (see PEP-343 [3] for details), or more simply use the built-in contextlib module. A good example is provided by StackOverflow [9]:
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    current_dir = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(current_dir)
with working_directory("data/stuff"):
    pass  # do something within data/stuff
# here I am back again in the original working directory
This probably doesn’t help with checking real numbers but it does get you into the mind-set of EAFP.
3 Conclusions
I think the EAFP idiom runs right through Python, and hopefully the arguments for using interfaces as
little as possible make more sense in this light. When we use strongly typed languages the language
is in effect performing LBYL on every function call, checking that what you passed in is what you said
you were going to (whether this is at run-time or compile time). This in its very nature inhibits flexibility,
flexibility which we can exploit if we assume the object we were given has the properties we need it
to.
I think Python can be an extremely effective programming language, especially for agile development
where the need for flexibility as you develop is vitally important. The SOLID principles provide an
excellent set of general rules for working with evolving code, but might lead to comparatively rigid
code if not applied with some thought. Writing Python, and following the SOLID principles, in a way that makes the most of this flexibility rather than trying to inhibit it can, with care, lead to better code and better software. Of course, it is a matter of taste to some extent, but for me if you've
chosen to develop in Python and then start using interfaces all over the place then you've probably
missed the point of Python!
I’ve also included a number of more general tips for making your code more Pythonic in the Appendix,
and these will hopefully reinforce some of the points made above.
4 References
[1] http://testube:8080/Testube/video/69/architectural-principles
[2] Agile Software Development, Robert C Martin, 2012. (Course Number 5131)
[3] PEP-343 on the ‘with’ statement: http://docs.python.org/release/2.5/whatsnew/pep-343.html
[4] ABC: http://docs.python.org/2/library/abc.html
[5] Multiple inheritance resolution order: http://www.python.org/download/releases/2.3/mro/
[6] Monkey patching: http://www.codinghorror.com/blog/2008/07/monkeypatching-for-humans.html
[7] Meta-programming: http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html
[8] PEP-3124 Overloading: http://www.python.org/dev/peps/pep-3124/#overloading-generic-functions
[9] StackOverflow: http://stackoverflow.com/questions/3012488/what-is-the-python-with-statement-designed-for
5 Appendix
I’ve also put together a collection of nice tricks and time savers that might help make your Python
more Pythonic! In no particular order:
5.1 Use list comprehensions
These one line constructs make creating list objects trivially easy. e.g.
my_list = [ x.attribute for x in some_iterable ]
For the more adventurous it's also possible to include conditional logic and nested comprehensions, but don't overdo it; I've seen five-line comprehensions before and it's not pretty!
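A minimal sketch of both (the values are illustrative):

values = [3, -1, 4, -1, 5]
positives = [x for x in values if x > 0]  # conditional logic
pairs = [(x, y) for x in range(3) for y in range(3) if x != y]  # nested loops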
5.2 Know when not to use list comprehensions - using generators instead
Generators allow you to declare a function that behaves as an iterator. That is, the resulting values are not evaluated and stored in memory up front (as they are for a list comprehension); rather each value is produced lazily, one at a time, as the generator is iterated over.
For cases where the expression is evaluated only once, or where the expression would be too large
to store in memory, the benefits are obvious. It is easy to define functions which act as generators, but you can also use a 'generator comprehension' which is almost identical to a list comprehension except that it uses parentheses, e.g.
my_gen = ( x.attribute for x in some_iterable )
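A minimal sketch of a generator function (the names are illustrative); yield produces one value at a time:

def squares(n):
    # Nothing is stored up front; each square is produced on demand
    for x in range(n):
        yield x * x


for value in squares(5):
    print(value)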
5.3 Classes and functions are first-class objects
It may not be immediately obvious to new Python programmers but because classes and functions
are first class objects it is trivially easy to store these in lists, or even dictionaries. One great example
of this is an implementation of the strategy pattern using dictionaries. e.g.
my_new_obj = my_dict[key]() # where my_dict contains key:Class mappings
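A slightly fuller sketch of the idea (the exporter classes and keys are invented for this example):

import json


class CsvExporter(object):
    def export(self, data):
        return ",".join(str(d) for d in data)


class JsonExporter(object):
    def export(self, data):
        return json.dumps(data)


exporters = {"csv": CsvExporter, "json": JsonExporter}

# Pick the strategy at run-time from a key, then instantiate and use it
exporter = exporters["csv"]()
print(exporter.export([1, 2, 3]))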
5.4 Use map
This function makes it really easy to perform operations on any collection of objects, e.g.

squares = map(lambda x: x ** 2, range(10))

It returns the results of mapping the given function onto the values (which may be any form of iterable); note that this is a list in Python 2 but a lazy iterator in Python 3.
5.5 Use numpy
Numpy is a numerical library with very fast linear algebra operations and a number of extremely
useful constructs. See http://www.numpy.org/.
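A tiny sketch of the element-wise style numpy encourages (the values are illustrative):

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
print(a + b)         # element-wise addition: [ 5.  7.  9.]
print(np.dot(a, b))  # dot product: 32.0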
5.6 List indexing and slicing
There are a number of ways of indexing lists which you may not have been aware of:
You can count backwards, e.g. access the last element in a list using my_list[-1]
Reversing a list using my_list[::-1].
The above is just a special case of setting an increment e.g. my_list[::2] gives a step of 2.
All of the above work on strings!
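A few of these in action (the values are illustrative):

my_list = [0, 1, 2, 3, 4, 5]
print(my_list[-1])    # 5, the last element
print(my_list[::-1])  # [5, 4, 3, 2, 1, 0], reversed
print(my_list[::2])   # [0, 2, 4], every second element
print("hello"[::-1])  # 'olleh', slicing works on strings too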
5.7 Use enumerate
The function enumerate returns a counter as well as the item to be enumerated which can be very
useful if you need the index of an item as well as the item itself. e.g.
for i, x in enumerate(my_list):
    pass  # do something with i and x
5.8 Dictionary defaults
In order to avoid having to catch a KeyError every time you query a dictionary use:
val = my_dict.get(key, default)
To provide a default value if the key is not present.
Also, there is a defaultdict class in the collections module which gives missing keys default values, or you can use my_dict.setdefault to set a default on a standard dict. There are some subtle differences though
about when the default is created, and some code might expect a KeyError, so take care with this
one.
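A small sketch of those differences (the keys and values are illustrative): defaultdict creates and stores the default on any lookup of a missing key, get returns a default without storing anything, and setdefault stores it:

from collections import defaultdict

d = defaultdict(list)
d["a"].append(1)   # missing key: an empty list is created, stored and appended to
print(d["a"])      # [1]

plain = {}
print(plain.get("a", []))            # [] is returned but nothing is stored
plain.setdefault("b", []).append(2)  # the default is stored, then appended to
print(plain)                         # {'b': [2]}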
5.9 String formatting
When formatting strings the easiest way is probably using named placeholders, e.g.:
print("The {foo} is {bar}".format(foo='answer', bar=42))
# Note that you can also unpack a dict into format!
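For instance (the values are illustrative):

values = {'foo': 'answer', 'bar': 42}
print("The {foo} is {bar}".format(**values))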
5.10 The ternary operator
Although generally frowned upon, the ternary operator in Python is actually fairly readable and intuitive when written as:
x = 3 if (y==1) else 2
5.11 Defining classes at run-time
This one is definitely not for the faint-hearted. Because classes are first class objects in Python it is possible to define them at run-time, e.g. within if statements or even functions. Use with care!
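A minimal sketch (the names are invented for this example): the class statement is just executable code, so it can live inside a function, and the built-in type() can build a class from scratch:

def make_counter_class(start):
    # The class statement executes at run-time and captures 'start'
    class Counter(object):
        def __init__(self):
            self.value = start

        def increment(self):
            self.value += 1
            return self.value

    return Counter


Counter = make_counter_class(10)
print(Counter().increment())  # 11

# Equivalently, type(name, bases, namespace) builds a class dynamically
Point = type("Point", (object,), {"x": 0, "y": 0})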