From Python To Numpy
There are already a fair number of books about Numpy (see Bibliography) and a legitimate question
is to wonder if another book is really necessary. As you may have guessed by reading these lines, my
personal answer is yes, mostly because I think there is room for a different approach concentrating
on the migration from Python to Numpy through vectorization. There are a lot of techniques that you
don't find in books and such techniques are mostly learned through experience. The goal of this book
is to explain some of these techniques and to provide an opportunity to gain this experience in the
process.
Website: http://www.labri.fr/perso/nrougier/from-python-to-numpy
Table of Contents
1. Preface
   • About the author
   • About this book
   • License
2. Introduction
   • Simple example
   • Readability vs speed
3. Anatomy of an array
   • Introduction
   • Memory layout
   • Views and copies
   • Conclusion
4. Code vectorization
   • Introduction
   • Uniform vectorization
   • Temporal vectorization
   • Spatial vectorization
   • Conclusion
5. Problem vectorization
   • Introduction
   • Path finding
   • Fluid Dynamics
   • Blue noise sampling
   • Conclusion
6. Custom vectorization
   • Introduction
   • Typed list
   • Memory aware array
   • Conclusion
7. Beyond Numpy
   • Back to Python
   • Numpy & co
   • Scipy & co
   • Conclusion
8. Conclusion
9. Quick References
   • Data type
   • Creation
   • Indexing
   • Reshaping
   • Broadcasting
10. Bibliography
   • Tutorials
   • Articles
   • Books
Disclaimer: All external pictures should have associated credits. If any credits are missing, please tell
me and I will correct it. Similarly, all excerpts should be sourced (Wikipedia mostly). If not, this is an
error and I will correct it as soon as you tell me.
1 Preface
Contents
About the author
About this book
Prerequisites
Conventions
How to contribute
Publishing
License
Prerequisites
This is not a beginner's guide to Python: you should have an intermediate level in Python and ideally
a beginner level in numpy. If this is not the case, have a look at the bibliography for a curated list of
resources.
Conventions
We will use the usual naming conventions. If not stated explicitly, each script should import numpy,
scipy and matplotlib as:
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
We'll use up-to-date versions (at the date of writing, i.e. January 2017) of the different packages:

Packages     Version
Python       3.6.0
Numpy        1.12.0
Scipy        0.18.1
Matplotlib   2.0.0
How to contribute
Publishing
If you're an editor interested in publishing this book, you can contact me if you agree to have this
version and all subsequent versions open access (i.e. online at this address), you know how to deal
with restructured text (Word is not an option), you provide real added value as well as supporting
services, and, more importantly, you have a truly amazing LaTeX book template (and be warned that
I'm a bit picky about typography & design: Edward Tufte is my hero). Still here?
1.3 License
Book
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0
International License. You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Code
The code is licensed under the OSI-approved BSD 2-Clause License.
2 Introduction
Contents
Simple example
Readability vs speed
Numpy is all about vectorization. If you are familiar with Python, this is the main difficulty you'll face
because you'll need to change your way of thinking and your new friends (among others) are named
"vectors", "arrays", "views" or "ufuncs".
2.1 Simple example

Let's take a very simple example, a random walk. One possible object-oriented approach would be to
define a RandomWalker class and write a walk method that would return the current position after
each (random) step. It's nice, it's readable, but it is slow:
Object oriented approach
import random

class RandomWalker:
    def __init__(self):
        self.position = 0
    def walk(self, n):
        # Yield the current position after each random unit step
        for i in range(n):
            yield self.position
            self.position += 2*random.randint(0, 1) - 1

walker = RandomWalker()
walk = [position for position in walker.walk(1000)]
Procedural approach
For such a simple problem, we can probably save the class definition and concentrate only on the
walk method that computes successive positions after each random step.
def random_walk(n):
    position = 0
    walk = [position]
    for i in range(n):
        position += 2*random.randint(0, 1) - 1
        walk.append(position)
    return walk

walk = random_walk(1000)
This new method saves some CPU cycles but not that much because this function is pretty much the
same as in the object-oriented approach and the few cycles we saved probably come from the inner
Python object-oriented machinery.
Vectorized approach
But we can do better using the itertools Python module that offers a set of functions creating iterators
for efficient looping. If we observe that a random walk is an accumulation of steps, we can rewrite the
function by first generating all the steps and accumulating them without any loop:
def random_walk_faster(n=1000):
    from itertools import accumulate
    # random.choices is only available from Python 3.6
    steps = random.choices([-1,+1], k=n)
    return [0]+list(accumulate(steps))

walk = random_walk_faster(1000)
In fact, we've just vectorized our function. Instead of looping to pick sequential steps and adding
them to the current position, we first generated all the steps at once and used the accumulate
function to compute all the positions. We got rid of the loop and this makes things faster:
We gained 85% of computation time compared to the previous version, not so bad. But the
advantage of this new version is that it makes numpy vectorization super simple. We just have to
translate the itertools calls into numpy ones.
def random_walk_fastest(n=1000):
    # No 's' in numpy choice (Python offers choice & choices)
    steps = np.random.choice([-1,+1], n)
    return np.cumsum(steps)

walk = random_walk_fastest(1000)
This book is about vectorization, be it at the code or problem level. We'll see this difference is
important before looking at custom vectorization.
2.2 Readability vs speed
Before heading to the next chapter, I would like to warn you about a potential problem you may
encounter once you've become familiar with numpy. It is a very powerful library and you can
make wonders with it but, most of the time, this comes at the price of readability. If you don't
comment your code at the time of writing, you won't be able to tell what a function is doing after a
few weeks (or possibly days). For example, can you tell what the two functions below are doing?
Probably you can tell for the first one, but unlikely for the second (unless your name is Jaime Fernández
del Río and you don't need to read this book).
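The two functions did not survive in this copy; a representative pair in the same spirit (finding where
a subsequence occurs, first in pure Python on lists, then on numpy arrays) could read:

def function_1(seq, sub):
    # Return the indices where sub occurs in seq (seq, sub are lists)
    return [i for i in range(len(seq) - len(sub) + 1)
            if seq[i:i+len(sub)] == sub]

def function_2(seq, sub):
    # Same result for numeric numpy arrays, via correlation and fancy indexing
    target = np.dot(sub, sub)
    candidates = np.where(np.correlate(seq, sub, mode='valid') == target)[0]
    check = candidates[:, np.newaxis] + np.arange(len(sub))
    mask = np.all((np.take(seq, check) == sub), axis=-1)
    return candidates[mask]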
As you may have guessed, the second function is the vectorized-optimized-faster-numpy version of
the first function. It is 10 times faster than the pure Python version, but it is hardly readable.
3 Anatomy of an array
Contents
Introduction
Memory layout
Views and copies
Direct and indirect access
Temporary copy
Conclusion
3.1 Introduction
As explained in the Preface, you should have a basic experience with numpy to read this book. If this
is not the case, you'd better start with a beginner tutorial before coming back here. Consequently I'll
only give here a quick reminder on the basic anatomy of numpy arrays, especially regarding the
memory layout, views, copies and data types. These are critical notions to understand if you want
your computation to benefit from the numpy philosophy.
Let's consider a simple example where we want to clear all the values of an array whose dtype is
np.float32. How does one write it to maximize speed? The syntax below is rather obvious (at least for
those familiar with numpy), but the question remains: is it the fastest way?
If you look more closely at both the dtype and the size of the array, you can observe that this array
can be cast (i.e. viewed) into many other "compatible" data types. By compatible, I mean that
Z.size * Z.itemsize can be divided by the new dtype's itemsize.
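The snippet and its timings were lost in this copy; a minimal sketch of the idea (clearing through the
obvious assignment versus clearing the same memory viewed with another compatible dtype,
timings not reproduced):

>>> Z = np.ones(4*1000000, np.float32)
>>> Z[...] = 0                # the obvious way
>>> Z.view(np.int8)[...] = 0  # same memory, viewed as int8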
3.2 Memory layout

Said differently, an array is mostly a contiguous block of memory whose parts can be accessed using
an indexing scheme. Such an indexing scheme is in turn defined by a shape and a data type, and this
is precisely what is needed when you define a new array:
Z = np.arange(9).reshape(3,3).astype(np.int16)
Here, we know that Z's itemsize is 2 bytes (int16), that the shape is (3,3) and that the number of
dimensions is 2 (len(Z.shape)).
>>> print(Z.itemsize)
2
>>> print(Z.shape)
(3, 3)
>>> print(Z.ndim)
2
Furthermore, and because Z is not a view, we can deduce the strides of the array, which define the
number of bytes to step in each dimension when traversing the array.
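For this Z, a quick check (my session, the values are easily verified):

>>> strides = Z.shape[1]*Z.itemsize, Z.itemsize
>>> print(strides)
(6, 2)
>>> print(Z.strides)
(6, 2)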
With all this information, we know how to access a specific item (designated by an index tuple) and,
more precisely, how to compute the start and end offsets:
offset_start = 0
for i in range(ndim):
    offset_start += strides[i]*index[i]
offset_end = offset_start + Z.itemsize
>>> Z = np.arange(9).reshape(3,3).astype(np.int16)
>>> index = 1, 1
>>> print(Z[index].tobytes())
b'\x04\x00'
>>> offset_start = 0
>>> for i in range(Z.ndim):
...     offset_start += Z.strides[i]*index[i]
>>> offset_end = offset_start + Z.itemsize
>>> print(Z.tobytes()[offset_start:offset_end])
b'\x04\x00'
This array can actually be considered from different perspectives (i.e. layouts):
Item layout
shape[1]
(=3)
┌───────────┐
┌ ┌───┬───┬───┐ ┐
│ │ 0 │ 1 │ 2 │ │
│ ├───┼───┼───┤ │
shape[0] │ │ 3 │ 4 │ 5 │ │ len(Z)
(=3) │ ├───┼───┼───┤ │ (=3)
│ │ 6 │ 7 │ 8 │ │
└ └───┴───┴───┘ ┘
└───────────────────────────────────┘
Z.size
(=9)
Memory layout (C order, big endian)

strides[1]
(=2)
┌─────────────────────┐
┌ ┌──────────┬──────────┐ ┐
│ p+00: │ 00000000 │ 00000000 │ │
│ ├──────────┼──────────┤ │
│ p+02: │ 00000000 │ 00000001 │ │ strides[0]
│ ├──────────┼──────────┤ │ (=2x3)
│ p+04 │ 00000000 │ 00000010 │ │
│ ├──────────┼──────────┤ ┘
│ p+06 │ 00000000 │ 00000011 │
│ ├──────────┼──────────┤
Z.nbytes │ p+08: │ 00000000 │ 00000100 │
(=3x3x2) │ ├──────────┼──────────┤
│ p+10: │ 00000000 │ 00000101 │
│ ├──────────┼──────────┤
│ p+12: │ 00000000 │ 00000110 │
│ ├──────────┼──────────┤
│ p+14: │ 00000000 │ 00000111 │
│ ├──────────┼──────────┤
│ p+16: │ 00000000 │ 00001000 │
└ └──────────┴──────────┘
└─────────────────────┘
Z.itemsize
Z.dtype.itemsize
(=2)
Let's now take a slice of this array:

V = Z[::2,::2]

Such a view is specified using a shape, a dtype and strides, because the strides can no longer be
deduced from the dtype and shape alone:
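We can check this on the arrays above (my session, the values are easily verified):

>>> Z = np.arange(9).reshape(3,3).astype(np.int16)
>>> V = Z[::2, ::2]
>>> V.shape, V.strides
((2, 2), (12, 4))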
Item layout
shape[1]
(=2)
┌───────────┐
┌ ┌───┬╌╌╌┬───┐ ┐
│ │ 0 │ │ 2 │ │ ┌───┬───┐
│ ├───┼╌╌╌┼───┤ │ │ 0 │ 2 │
shape[0] │ ╎ ╎ ╎ ╎ │ len(V) → ├───┼───┤
(=2) │ ├───┼╌╌╌┼───┤ │ (=2) │ 6 │ 8 │
│ │ 6 │ │ 8 │ │ └───┴───┘
└ └───┴╌╌╌┴───┘ ┘
┌───┬╌╌╌┬───┬╌╌╌┬╌╌╌┬╌╌╌┬───┬╌╌╌┬───┐ ┌───┬───┬───┬───┐
│ 0 │ │ 2 │ ╎ ╎ │ 6 │ │ 8 │ → │ 0 │ 2 │ 6 │ 8 │
└───┴╌╌╌┴───┴╌╌╌┴╌╌╌┴╌╌╌┴───┴╌╌╌┴───┘ └───┴───┴───┴───┘
└─┬─┘ └─┬─┘ └─┬─┘ └─┬─┘
└───┬───┘ └───┬───┘
└───────────┬───────────┘
V.size
(=4)
Memory layout (C order, big endian)

┌ ┌──────────┬──────────┐ ┐ ┐
┌─┤ p+00: │ 00000000 │ 00000000 │ │ │
│ └ ├──────────┼──────────┤ │ strides[1] │
┌─┤ p+02: │ │ │ │ (=4) │
│ │ ┌ ├──────────┼──────────┤ ┘ │
│ └─┤ p+04 │ 00000000 │ 00000010 │ │
│ └ ├──────────┼──────────┤ │ strides[0]
│ p+06: │ │ │ │ (=12)
│ ├──────────┼──────────┤ │
V.nbytes ─┤ p+08: │ │ │ │
(=8) │ ├──────────┼──────────┤ │
│ p+10: │ │ │ │
│ ┌ ├──────────┼──────────┤ ┘
│ ┌─┤ p+12: │ 00000000 │ 00000110 │
│ │ └ ├──────────┼──────────┤
└─┤ p+14: │ │ │
│ ┌ ├──────────┼──────────┤
└─┤ p+16: │ 00000000 │ 00001000 │
└ └──────────┴──────────┘
└─────────────────────┘
Z.itemsize
Z.dtype.itemsize
(=2)
3.3 Views and copies

Direct and indirect access

First, we have to distinguish between indexing and fancy indexing. The first will always return a view
while the second will return a copy. This difference is important because in the first case, modifying
the view modifies the base array, while this is not true in the second case:
>>> Z = np.zeros(9)
>>> Z_view = Z[:3]
>>> Z_view[...] = 1
>>> print(Z)
[ 1. 1. 1. 0. 0. 0. 0. 0. 0.]
>>> Z = np.zeros(9)
>>> Z_copy = Z[[0,1,2]]
>>> Z_copy[...] = 1
>>> print(Z)
[ 0. 0. 0. 0. 0. 0. 0. 0. 0.]
Thus, if you need fancy indexing, it's better to keep a copy of your fancy index (especially if it was
complex to compute) and to work with it:
>>> Z = np.zeros(9)
>>> index = [0,1,2]
>>> Z[index] = 1
>>> print(Z)
[ 1. 1. 1. 0. 0. 0. 0. 0. 0.]
If you are unsure whether the result of your indexing is a view or a copy, you can check the base of
your result. If it is None, then your result is a copy:
>>> Z = np.random.uniform(0,1,(5,5))
>>> Z1 = Z[:3,:]
>>> Z2 = Z[[0,1,2], :]
>>> print(np.allclose(Z1,Z2))
True
>>> print(Z1.base is Z)
True
>>> print(Z2.base is Z)
False
>>> print(Z2.base is None)
True
Note that some numpy functions return a view when possible (e.g. ravel) while some others always
return a copy (e.g. flatten):
>>> Z = np.zeros((5,5))
>>> Z.ravel().base is Z
True
>>> Z[::2,::2].ravel().base is Z
False
>>> Z.flatten().base is Z
False
Temporary copy
Copies can be made explicitly like in the previous section, but the most general case is the implicit
creation of intermediate copies. This is the case when you are doing some arithmetic with arrays:
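The example itself was lost in this copy; a minimal reconstruction matching the description below:

X = np.ones(10, dtype=int)
Y = np.ones(10, dtype=int)
A = 2*X + 2*Y   # 2*X, 2*Y and their sum are all allocated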
In the example above, three intermediate arrays have been created. One for holding the result of
2*X, one for holding the result of 2*Y and the last one for holding the result of 2*X+2*Y. In this
specific case, the arrays are small enough and this does not really make a difference. However, if your
arrays are big, then you have to be careful with such expressions and wonder if you can do it
differently. For example, if only the final result matters and you don't need X nor Y afterwards, an
alternate solution would be:
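A sketch of such an in-place alternative, using the out argument of numpy ufuncs:

X = np.ones(10, dtype=int)
Y = np.ones(10, dtype=int)
np.multiply(X, 2, out=X)   # X <- 2*X, in place
np.multiply(Y, 2, out=Y)   # Y <- 2*Y, in place
np.add(X, Y, out=X)        # X <- 2*X + 2*Y, no temporary array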
Using this alternate solution, no temporary array has been created. The problem is that there are
many other cases where such copies need to be created, and this impacts performance, as
demonstrated in the example below:
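The benchmark itself did not survive in this copy; a comparison in the same spirit (IPython session,
timings not reproduced):

>>> X = np.ones(10000000, dtype=int)
>>> Y = np.ones(10000000, dtype=int)
>>> %timeit X = X + Y            # allocates a temporary for the result
>>> %timeit np.add(X, Y, out=X)  # writes directly into X, no temporary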
3.4 Conclusion
As a conclusion, we'll make an exercise. Given two vectors Z1 and Z2, we would like to know whether
Z2 is a view of Z1 and, if so, what this view is.
>>> Z1 = np.arange(10)
>>> Z2 = Z1[1:-1:2]
╌╌╌┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬╌╌
Z1 │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │
╌╌╌┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴╌╌
╌╌╌╌╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌╌╌╌╌╌╌╌
Z2 │ 1 │ │ 3 │ │ 5 │ │ 7 │
╌╌╌╌╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌╌╌╌╌╌╌╌
The next difficulty is to find the start and the stop indices. To do this, we can take advantage of the
np.byte_bounds function that returns pointers to the end-points of an array.
byte_bounds(Z1)[0] byte_bounds(Z1)[-1]
↓ ↓
╌╌╌┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬╌╌
Z1 │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │
╌╌╌┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴╌╌
byte_bounds(Z2)[0] byte_bounds(Z2)[-1]
↓ ↓
╌╌╌╌╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌┬───┬╌╌╌╌╌╌╌╌╌╌
Z2 │ 1 │ │ 3 │ │ 5 │ │ 7 │
╌╌╌╌╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌┴───┴╌╌╌╌╌╌╌╌╌╌
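Putting it together (a minimal sketch; the linked solution follows the same idea):

>>> Z1 = np.arange(10)
>>> Z2 = Z1[1:-1:2]
>>> print(Z2.base is Z1)                   # Z2 is a view on Z1
True
>>> step = Z2.strides[0] // Z1.strides[0]  # stride ratio gives the step
>>> offset_start = np.byte_bounds(Z2)[0] - np.byte_bounds(Z1)[0]
>>> offset_stop = np.byte_bounds(Z2)[-1] - np.byte_bounds(Z1)[-1]
>>> start = offset_start // Z1.itemsize
>>> stop = Z1.size + offset_stop // Z1.itemsize
>>> print(start, stop, step)
1 8 2
>>> print(np.allclose(Z1[start:stop:step], Z2))
True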
As an exercise, you can improve this first and very simple implementation by taking into account:
Negative steps
Multi-dimensional arrays
Solution to the exercise.
4 Code vectorization
Contents
Introduction
Uniform vectorization
The Game of Life
Python implementation
Numpy implementation
Exercise
Sources
References
Temporal vectorization
Python implementation
Numpy implementation
Faster numpy implementation
Visualization
Exercise
Sources
References
Spatial vectorization
Boids
Python implementation
Numpy implementation
Exercise
Sources
References
Conclusion
4.1 Introduction
Code vectorization means that the problem you're trying to solve is inherently vectorizable and only
requires a few numpy tricks to make it faster. Of course, it does not mean it is easy or straightforward,
but at least it does not necessitate totally rethinking your problem (as will be the case in the
Problem vectorization chapter). Still, it may require some experience to see where code can be
vectorized. Let's illustrate this through a simple example where we want to sum up two lists of
integers. One simple way using pure Python is:
def add_python(Z1, Z2):
    return [z1+z2 for (z1, z2) in zip(Z1, Z2)]
This first naive solution can be vectorized very easily using numpy:
def add_numpy(Z1, Z2):
    return np.add(Z1, Z2)
Without any surprise, benchmarking the two approaches shows that the second method is faster by
about one order of magnitude.
Not only is the second approach faster, but it also naturally adapts to the shape of Z1 and Z2. This is
the reason why we did not write Z1 + Z2, because it would not work if Z1 and Z2 were both lists. In
the first Python method, the inner + is interpreted differently depending on the nature of the two
objects, such that if we consider two nested lists, we get the following outputs:
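The session itself was lost in this copy; a reconstruction consistent with the description that follows:

>>> Z1 = [[1, 2], [3, 4]]
>>> Z2 = [[5, 6], [7, 8]]
>>> Z1 + Z2                  # list concatenation
[[1, 2], [3, 4], [5, 6], [7, 8]]
>>> add_python(Z1, Z2)       # concatenates the inner lists
[[1, 2, 5, 6], [3, 4, 7, 8]]
>>> add_numpy(Z1, Z2)        # element-wise addition
array([[ 6,  8],
       [10, 12]])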
The first method concatenates the two lists together, the second method concatenates the internal
lists together and the last one computes what is (numerically) expected. As an exercise, you can
rewrite the Python version such that it accepts nested lists of any depth.
4.2 Uniform vectorization

Figure 4.1
Conus textile snail exhibits a cellular automaton pattern on its shell. Image by Richard Ling, 2005.

The Game of Life
Note
Excerpt from the Wikipedia entry on the Game of Life
The Game of Life is a cellular automaton devised by the British mathematician John Horton Conway
in 1970. It is the best-known example of a cellular automaton. The "game" is actually a zero-player
game, meaning that its evolution is determined by its initial state, needing no input from human
players. One interacts with the Game of Life by creating an initial configuration and observing how it
evolves.
The universe of the Game of Life is an infinite two-dimensional orthogonal grid of square cells, each
of which is in one of two possible states, live or dead. Every cell interacts with its eight neighbours,
which are the cells that are directly horizontally, vertically, or diagonally adjacent. At each step in
time, the following transitions occur:
1. Any live cell with fewer than two live neighbours dies, as if caused by underpopulation.
2. Any live cell with more than three live neighbours dies, as if by overcrowding.
3. Any live cell with two or three live neighbours lives, unchanged, to the next generation.
4. Any dead cell with exactly three live neighbours becomes a live cell.
The initial pattern constitutes the 'seed' of the system. The first generation is created by applying the
above rules simultaneously to every cell in the seed – births and deaths happen simultaneously, and
the discrete moment at which this happens is sometimes called a tick. (In other words, each
generation is a pure function of the one before.) The rules continue to be applied repeatedly to
create further generations.
Python implementation
Note
We could have used the more efficient python array interface but it is more convenient to use
the familiar list object.
In pure Python, we can code the Game of Life using a list of lists representing the board where cells
are supposed to evolve. Such a board will be equipped with a border of 0 that allows us to accelerate
things a bit by avoiding specific tests for borders when counting the number of neighbours.
Z = [[0,0,0,0,0,0],
     [0,0,0,1,0,0],
     [0,1,0,1,0,0],
     [0,0,1,1,0,0],
     [0,0,0,0,0,0],
     [0,0,0,0,0,0]]
def compute_neighbours(Z):
    shape = len(Z), len(Z[0])
    N = [[0,]*(shape[1]) for i in range(shape[0])]
    for x in range(1, shape[0]-1):
        for y in range(1, shape[1]-1):
            N[x][y] = Z[x-1][y-1]+Z[x][y-1]+Z[x+1][y-1] \
                    + Z[x-1][y]            +Z[x+1][y]   \
                    + Z[x-1][y+1]+Z[x][y+1]+Z[x+1][y+1]
    return N
To iterate one step in time, we then simply count the number of neighbours for each internal cell and
we update the whole board according to the four aforementioned rules:
def iterate(Z):
    shape = len(Z), len(Z[0])
    N = compute_neighbours(Z)
    for x in range(1, shape[0]-1):
        for y in range(1, shape[1]-1):
            if Z[x][y] == 1 and (N[x][y] < 2 or N[x][y] > 3):
                Z[x][y] = 0
            elif Z[x][y] == 0 and N[x][y] == 3:
                Z[x][y] = 1
    return Z
The figure below shows four iterations on a 4x4 area where the initial state is a glider, a structure
discovered by Richard K. Guy in 1970.
Figure 4.2
The glider pattern is known to replicate itself one step diagonally in 4 iterations.
Numpy implementation
Starting from the Python version, the vectorization of the Game of Life requires two parts, one
responsible for counting the neighbours and one responsible for enforcing the rules. Neighbour-
counting is relatively easy if we remember we took care of adding a null border around the arena. By
considering partial views of the arena we can actually access neighbours quite intuitively as illustrated
below for the one-dimensional case:
┏━━━┳━━━┳━━━┓───┬───┐
Z[:-2] ┃ 0 ┃ 1 ┃ 1 ┃ 1 │ 0 │ (left neighbours)
┗━━━┻━━━┻━━━┛───┴───┘
↓︎
┌───┏━━━┳━━━┳━━━┓───┐
Z[1:-1] │ 0 ┃ 1 ┃ 1 ┃ 1 ┃ 0 │ (actual cells)
└───┗━━━┻━━━┻━━━┛───┘
↑
┌───┬───┏━━━┳━━━┳━━━┓
Z[+2:] │ 0 │ 1 ┃ 1 ┃ 1 ┃ 0 ┃ (right neighbours)
└───┴───┗━━━┻━━━┻━━━┛
Going to the two dimensional case requires just a bit of arithmetic to make sure to consider all the
eight neighbours.
N = np.zeros(Z.shape, dtype=int)
N[1:-1,1:-1] += (Z[ :-2, :-2] + Z[ :-2,1:-1] + Z[ :-2,2:] +
                 Z[1:-1, :-2]                + Z[1:-1,2:] +
                 Z[2:  , :-2] + Z[2:  ,1:-1] + Z[2:  ,2:])
For the rule enforcement, we can write a first version using numpy's argwhere method that will give
us the indices where a given condition is True.
# Flatten arrays
N_ = N.ravel()
Z_ = Z.ravel()
# Apply rules
R1 = np.argwhere( (Z_==1) & (N_ < 2) )
R2 = np.argwhere( (Z_==1) & (N_ > 3) )
R3 = np.argwhere( (Z_==1) & ((N_==2) | (N_==3)) )
R4 = np.argwhere( (Z_==0) & (N_==3) )
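The assignment step was lost in this copy; a reconstruction of how these indices would be applied
(R3 cells keep their value, so no assignment is needed for them):

# Set new values
Z_[R1] = 0
Z_[R2] = 0
Z_[R4] = 1

# Make sure borders stay null
Z[0, :] = Z[-1, :] = Z[:, 0] = Z[:, -1] = 0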
Even if this first version does not use nested loops, it is far from optimal because the four argwhere
calls may be quite slow. We can instead factorize the rules into cells that will survive (stay at 1) and
cells that will give birth. For doing this, we can take advantage of numpy's boolean capabilities and
write quite naturally.
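The corresponding code did not survive in this copy; a reconstruction consistent with the note and
the explanation that follow:

birth = (N == 3)[1:-1, 1:-1] & (Z[1:-1, 1:-1] == 0)
survive = ((N == 2) | (N == 3))[1:-1, 1:-1] & (Z[1:-1, 1:-1] == 1)
Z[...] = 0                           # clear the board in place
Z[1:-1, 1:-1][birth | survive] = 1   # set newborn and surviving cells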
Note

We did not write Z = 0 as this would simply assign the value 0 to Z, which would then become a
simple scalar.
If you look at the birth and survive lines, you'll see that these two variables are arrays that can be
used to set Z values to 1 after having cleared it.
Figure 4.3
The Game of Life. Gray levels indicate how much a cell has been active in the past.
Exercise
Reaction and diffusion of chemical species can produce a variety of patterns, reminiscent of those
often seen in nature. The Gray-Scott equations model such a reaction. For more information on this
chemical system, see the article Complex Patterns in a Simple System (John E. Pearson, Science, Volume
261, 1993). Let's consider two chemical species U and V with respective concentrations u and v and
diffusion rates Du and Dv. V is converted into P with a rate of conversion k. f represents the rate of
the process that feeds U and drains U, V and P. This can be written as:

∂u/∂t = Du·∇²u − u·v² + f·(1 − u)
∂v/∂t = Dv·∇²v + u·v² − (f + k)·v
Based on the Game of Life example, try to implement such reaction-diffusion system. Here is a set of
interesting parameters to test:
Name            Du     Dv     f      k
Bacteria 1      0.16   0.08   0.035  0.065
Bacteria 2      0.14   0.06   0.035  0.065
Coral           0.16   0.08   0.060  0.062
Fingerprint     0.19   0.05   0.060  0.062
Spirals         0.10   0.10   0.018  0.050
Spirals Dense   0.12   0.08   0.020  0.050
Spirals Fast    0.10   0.16   0.020  0.050
Unstable        0.16   0.08   0.020  0.055
Worms 1         0.16   0.08   0.050  0.065
Worms 2         0.16   0.08   0.054  0.063
Zebrafish       0.16   0.08   0.035  0.060
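To get started (a hint, not the full solution, and the helper name is mine), the Laplacian needed by
the equations can be computed with the same border-and-slices trick as the Game of Life:

def laplacian(Z):
    # 5-point stencil over the interior cells, assuming a 1-cell border
    return (Z[ :-2, 1:-1] + Z[2:, 1:-1] +
            Z[1:-1,  :-2] + Z[1:-1, 2:] - 4*Z[1:-1, 1:-1])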
The figure below shows some animations of the model for a specific set of parameters.
Figure 4.4
Reaction-diffusion Gray-Scott model. From left to right, Bacteria 1, Coral and Spiral Dense.
Sources
game_of_life_python.py
game_of_life_numpy.py
gray_scott.py (solution to the exercise)
References
John Conway's new solitaire game "life", Martin Gardner, Scientific American 223, 1970.
Gray Scott Model of Reaction Diffusion, Abelson, Adams, Coore, Hanson, Nagpal, Sussman, 1997.
Reaction-Diffusion by the Gray-Scott Model, Robert P. Munafo, 1996.
4.3 Temporal vectorization

Figure 4.5
Romanesco broccoli, showing self-similar form approximating a natural fractal. Image by Jon
Sullivan, 2004.

The Mandelbrot set is the set of complex numbers c for which the sequence z → z² + c, iterated from
z = 0, remains bounded; this is the computation we want to vectorize.
Python implementation
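The code did not survive in this copy; a minimal version consistent with the description (note the
early return as soon as a point diverges):

def mandelbrot_python(xmin, xmax, ymin, ymax, xn, yn, maxiter, horizon=2.0):
    def mandelbrot(z, maxiter):
        # Iterate z <- z*z + c until divergence or maxiter
        c = z
        for n in range(maxiter):
            if abs(z) > horizon:
                return n
            z = z*z + c
        return maxiter
    r1 = [xmin + i*(xmax - xmin)/xn for i in range(xn)]
    r2 = [ymin + i*(ymax - ymin)/yn for i in range(yn)]
    return [mandelbrot(complex(x, y), maxiter) for y in r2 for x in r1]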
The interesting (and slow) part of this code is the mandelbrot function that actually computes the
sequence f_c(f_c(f_c(...))). The vectorization of such code is not totally straightforward because the
internal return implies a differential processing of the elements. Once an element has diverged, we
don't need to iterate it any more and we can safely record the iteration count at divergence. The
problem is then to do the same in numpy. But how?
Numpy implementation
The trick is to search, at each iteration, values that have not yet diverged and to update relevant
information for these values and only these values. Because we start from Z = 0, we know that each
value will be updated at least once (when they're equal to 0, they have not yet diverged) and will stop
being updated as soon as they've diverged. To do that, we'll use numpy fancy indexing with the
less(x1, x2) function that returns the truth value of (x1 < x2) element-wise.
def mandelbrot_numpy(xmin, xmax, ymin, ymax, xn, yn, maxiter, horizon=2.0):
    X = np.linspace(xmin, xmax, xn, dtype=np.float32)
    Y = np.linspace(ymin, ymax, yn, dtype=np.float32)
    C = X + Y[:,None]*1j
    N = np.zeros(C.shape, dtype=int)
    Z = np.zeros(C.shape, np.complex64)
    for n in range(maxiter):
        I = np.less(abs(Z), horizon)
        N[I] = n
        Z[I] = Z[I]**2 + C[I]
    N[N == maxiter-1] = 0
    return Z, N
The gain is roughly a 5x factor, not as much as we could have expected. Part of the problem is that
the np.less function implies xn × yn tests at every iteration, while we know that some values have
already diverged. Even if these tests are performed at the C level (through numpy), the cost is
nonetheless significant. Another approach proposed by Dan Goodman is to work at each iteration on
a dynamic array that stores only the points which have not yet diverged. It requires more lines but
the result is faster and leads to a 10x speed improvement compared to the Python version.
# Excerpt: C is flattened; Xi, Yi hold original indices; N_, Z_ store results
Z = np.zeros(C.shape, np.complex64)
for i in range(itermax):
    if not len(Z): break
    # Iterate the remaining points and detect those that just diverged
    np.multiply(Z, Z, Z)
    np.add(Z, C, Z)
    I = abs(Z) > horizon
    N_[Xi[I], Yi[I]] = i+1
    Z_[Xi[I], Yi[I]] = Z[I]
    # Keep only the points that have not yet diverged
    np.logical_not(I, I)
    Z, Xi, Yi, C = Z[I], Xi[I], Yi[I], C[I]
Visualization
In order to visualize our results, we could directly display the N array using the matplotlib imshow
command, but this would result in a "banded" image that is a known consequence of the escape
count algorithm that we've been using. Such banding can be eliminated by using a fractional escape
count. This can be done by measuring how far the iterated point landed outside of the escape cutoff.
See the reference below about the renormalization of the escape count. Here is a picture of the result
where we use recount normalization, and added a power normalized color map (gamma=0.3) as well
as light shading.
Figure 4.6
The Mandelbrot as rendered by matplotlib using recount normalization, power normalized
color map (gamma=0.3) and light shading.
Exercise
Note
You should look at the ufunc.reduceat method that performs a (local) reduce with specified
slices over a single axis.
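A tiny illustration of reduceat (my example, not the book's):

>>> Z = np.arange(10)
>>> np.add.reduceat(Z, [0, 4, 8])   # sums over Z[0:4], Z[4:8], Z[8:]
array([ 6, 22, 17])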
We now want to measure the fractal dimension of the Mandelbrot set using the Minkowski–Bouligand
dimension. To do that, we need to do box-counting with a decreasing box size (see figure below). As
you can imagine, we cannot use pure Python because it would be way too slow. The goal of the
exercise is to write a function using numpy that takes a two-dimensional float array and returns the
dimension. We'll consider values in the array to be normalized (i.e. all values are between 0 and 1).
Figure 4.7
The Minkowski–Bouligand dimension of the Great Britain coastlines is approximately 1.24.
Sources
mandelbrot.py
mandelbrot_python.py
mandelbrot_numpy_1.py
mandelbrot_numpy_2.py
fractal_dimension.py (solution to the exercise)
References
How To Quickly Compute the Mandelbrot Set in Python, Jean Francois Puget, 2015.
My Christmas Gift: Mandelbrot Set Computation In Python, Jean Francois Puget, 2015.
Fast fractals with Python and Numpy, Dan Goodman, 2009.
Renormalizing the Mandelbrot Escape, Linas Vepstas, 1997.
4.4 Spatial vectorization

Figure 4.8
Flocking birds are an example of self-organization in biology. Image by Christoffer A
Rasmussen, 2012.
Boids
Note
Excerpt from the Wikipedia entry Boids
Boids is an artificial life program, developed by Craig Reynolds in 1986, which simulates the flocking
behaviour of birds. The name "boid" corresponds to a shortened version of "bird-oid object", which
refers to a bird-like object.
As with most artificial life simulations, Boids is an example of emergent behavior; that is, the
complexity of Boids arises from the interaction of individual agents (the boids, in this case) adhering
to a set of simple rules. The rules applied in the simplest Boids world are as follows:
separation: steer to avoid crowding local flock-mates
alignment: steer towards the average heading of local flock-mates
cohesion: steer to move toward the average position (center of mass) of local flock-mates
Figure 4.9
Boids are governed by a set of three local rules (separation, cohesion and alignment) that serve
to compute velocity and acceleration.
Python implementation
Since each boid is an autonomous entity with several properties such as position and velocity, it
seems natural to start by writing a Boid class:
import math
import random
from vec2 import vec2

class Boid:
    def __init__(self, x=0, y=0):
        self.position = vec2(x, y)
        angle = random.uniform(0, 2*math.pi)
        self.velocity = vec2(math.cos(angle), math.sin(angle))
        self.acceleration = vec2(0, 0)
The vec2 object is a very simple class that handles all common vector operations with 2
components. It will save us some writing in the main Boid class. Note that there are some vector
packages in the Python Package Index, but that would be overkill for such a simple example.
Boids present a difficult case for regular Python because each boid interacts with its local neighbours.
However, because boids are moving, finding such local neighbours requires computing, at each
time step, the distance to each and every other boid in order to select those within a given
interaction radius. The prototypical way of writing the three rules is thus something like the sketch below.
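A sketch of one rule (separation), assuming vec2 supports the usual arithmetic operators and a
length() method; the radius value is illustrative:

def separation(self, boids):
    # Steer away from every boid that is too close
    steer, count = vec2(0, 0), 0
    for other in boids:
        d = (self.position - other.position).length()
        if 0 < d < 25:   # 25 is an illustrative separation radius
            steer += (self.position - other.position) / d
            count += 1
    if count > 0:
        steer = steer / count
    return steer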
Full sources are given in the references section below; they would be too long to describe here and
there is no real difficulty.
To complete the picture, we can also create a Flock object:
class Flock:
    def __init__(self, count=150):
        self.boids = []
        for i in range(count):
            boid = Boid()
            self.boids.append(boid)

    def run(self):
        for boid in self.boids:
            boid.run(self.boids)
Using this approach, we can have up to 50 boids until the computation time becomes too slow for a
smooth animation. As you may have guessed, we can do much better using numpy, but let me first
point out the main problem with this Python implementation. If you look at the code, you will
certainly notice there is a lot of redundancy. More precisely, we do not exploit the fact that the
Euclidean distance is symmetric, that is, |x − y| = |y − x|. In this naive Python implementation, each
rule (function) computes n² distances while n²/2 would be sufficient if properly cached. Furthermore,
each rule re-computes every distance without caching the result for the other functions. In the end,
we are computing 3n² distances instead of n²/2.
Numpy implementation
As you might expect, the numpy implementation takes a different approach and we'll gather all our
boids into a position array and a velocity array:
n = 500
velocity = np.zeros((n, 2), dtype=np.float32)
position = np.zeros((n, 2), dtype=np.float32)
The first step is to compute the local neighborhood for all boids, and for this we need to compute all
paired distances:
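The snippet did not survive in this copy; a reconstruction consistent with the next paragraph, which
refers to dx, dy and hypot:

dx = np.subtract.outer(position[:, 0], position[:, 0])
dy = np.subtract.outer(position[:, 1], position[:, 1])
distance = np.hypot(dx, dy)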
We could have used the scipy cdist function, but we'll need the dx and dy arrays later. Once those
have been computed, it is faster to use the hypot method. Note that distance has shape (n, n): each
line relates to one boid, i.e. each line gives the distance to all other boids (including itself).
From these distances, we can now compute the local neighbourhood for each of the three rules,
taking advantage of the fact that we can mix them together. We can actually compute a mask for
distances that are strictly positive (i.e. no self-interaction) and multiply it with the other distance
masks.
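The mask code is missing from this copy; a sketch with illustrative radii, consistent with the counts
computed below:

mask_0 = (distance > 0)    # exclude self-interaction
mask_1 = (distance < 25)   # separation neighbourhood (radius illustrative)
mask_2 = (distance < 50)   # alignment/cohesion neighbourhood (radius illustrative)
mask_1 *= mask_0
mask_2 *= mask_0
mask_3 = mask_2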
Note
If we suppose that boids cannot occupy the same position, how can you compute mask_0
more efficiently?
Then, we compute the number of neighbours within the given radius and we ensure it is at least 1 to
avoid division by zero.
mask_1_count = np.maximum(mask_1.sum(axis=1), 1)
mask_2_count = np.maximum(mask_2.sum(axis=1), 1)
mask_3_count = mask_2_count
Cohesion
Separation