Applied Computer Science
Shane Torbert
Contents
1 Simulation
1.1 Random Walk
1.2 Air Resistance
1.3 Lunar Module
2 Graphics
2.1 Pixel Mapping
2.2 Scalable Format
2.3 Building Software
3 Visualization
3.1 Geospatial Population Data
3.2 Particle Diffusion
3.3 Approximating π
4 Efficiency
4.1 Text and Language
4.2 Babylonian Method
4.3 Workload Balance
5 Recursion
5.1 Disease Outbreak
5.2 Runtime Analysis
5.3 Guessing Games
6 Projects
6.1 Sliding Tile Puzzle
6.2 Anagram Scramble
6.3 Collision Detection
7 Modeling
7.1 Predator-Prey
7.2 Laws of Motion
7.3 Bioinformatics
Postscript
Chapter 1
Simulation
Experiments are often limited by danger or expense, or are simply impossible to
arrange, as in developing new medical treatments, building vehicles for space
flight, and studying geologic events. In these cases we may benefit from the use
of computer simulation to refine our understanding and narrow our investigation
prior to an “official” observation. Since the promise of this technique is to
accelerate information gathering at a relatively low total cost, interest in it
has been gaining momentum.
Examples in this chapter include discrete, continuous, and interactive systems
with particular importance given to the long-term (asymptotic) trend as the scale
of a problem grows toward infinity. Our approach favors clarity and context over
trivia or theory with the hope that this better enables early success, but of course in
learning there are never any real guarantees. Efficiency matters but is not paramount
as we prefer to implement more widely accessible solutions while perhaps merely
suggesting a state-of-the-art method.
For all code listings we use Python 2 with Tk and PIL, plus gnuplot, but other
options are available (e.g., Matlab®, Mathematica, Python 3, Java, C/C++, Fortran,
Scheme, Pascal). Our belief is that these choices are sufficiently intuitive to
serve as instructive pseudocode, which also happens to run, for any environment
you use. Regardless, we feel the central issues up front are quality problems, a
thoughtful sequencing of topics, and rapid feedback.
Imagine yourself outside on a pleasant day looking for a nice place to sit and read.
By chance you are standing exactly halfway between your two favorite spots but
are unable to decide which one to take. Out of curiosity you engage in a rather
obscure process to settle the matter: you flip a coin and take one step in the indicated
direction, then flip again followed by another step, and again and again, moving
back-and-forth as the coin dictates, possibly coming very close to one spot or the
other but ending only when a final step brings you all the way there. We wish to
simulate this random drifting process with computer code. Each lab is numbered so
that the first digit indicates chapter (1-7), the second a particular problem (1-3) in
the chapter, and the third digit your specific assignment (1-5) related to the problem.
We begin with output as shown in Table 1.1 where variables specify the size of our
walk and also track current position. If n is the distance from the halfway point to
the edge just next to either destination then m = 2n + 1 is the total distance available
for drifting.
Code Listing 1.1: Initializing variables.
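#
n=5      # distance from the halfway point to either edge
m=2*n+1  # total distance available for drifting
j=n+1    # current position, starting at the halfway point
#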
The values of n and m remain constant here but our current position j will start
at the halfway point j = n + 1 and then change as we drift until either j = 0 or
j = m + 1 and we have arrived at a reading spot. Whenever j = 1 or j = m we are
at one of the two edges, needing but a single step more in that direction to complete
our walk. Commands to initialize and update these variables “over time” are shown
in Code Listings 1.1 and 1.2, respectively.
Of course we want to see the walk, too. Once we know everything is working
properly this “trace” will be less important but we should first verify that our code
is behaving as expected. We could just print the value of j at each step but it will be
better if we draw a “picture” instead. Helper variable k loops over the entire drifting
area in Code Listing 1.3 to display a single row from any of Table 1.1’s examples.
Your first assignment is to put all this code together into a 1-D random walk
simulation, including counting-up the total number of steps.
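One way to assemble these pieces is sketched below. The trace format of Table 1.1 is not reproduced here, so the row layout (a `*` at the current position and a `.` everywhere else) and the helper names `show_row` and `random_walk` are our own assumptions rather than the listings themselves.

```python
from random import random

def show_row(j, m):
    """Return one row of the trace; k loops over positions 0..m+1 (assumed layout)."""
    row = ""
    for k in range(m + 2):
        row += "*" if k == j else "."
    return row

def random_walk(n, trace=False):
    """Drift from the halfway point until a reading spot is reached.

    Returns (steps, j) where j is 0 or m+1, the spot where the walk ended.
    """
    m = 2 * n + 1
    j = n + 1
    steps = 0
    if trace:
        print(show_row(j, m))
    while 1 <= j <= m:
        if random() < 0.5:   # heads: one step to the right
            j += 1
        else:                # tails: one step to the left
            j -= 1
        steps += 1           # count up the total number of steps
        if trace:
            print(show_row(j, m))
    return steps, j

steps, j = random_walk(5, trace=True)
print("total steps: %d" % steps)
```

Each run prints a different picture, but the walk always ends with the `*` in the first or last column.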
Questions to consider:
[Figure 1.1: observed probability (0.45 to 0.52) versus Trials, number of flips (0 to 9000).]
Before proceeding we ask an essential question: After how many coin flips are we
reasonably sure the heads-tails ratio is “close” to even? Your assignment is to write
a small program that only flips coins, over and over and over again, to calculate the
percentage of heads, or tails, obtained. Code Listing 1.4 is a sample gnuplot script
for a plot similar to the one shown in Figure 1.1, where we assume the program’s
output (total observed probability after each trial) has been stored in a plain text file.
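A minimal sketch of such a coin-flipping program follows. The function names and the two-column file layout (trial number, observed fraction of heads) are assumptions chosen for illustration; any plain text layout that gnuplot can read would do.

```python
from random import random

def flip_coins(trials):
    """Return the observed fraction of heads after each of `trials` flips."""
    heads = 0
    fractions = []
    for i in range(1, trials + 1):
        if random() < 0.5:
            heads += 1
        fractions.append(heads / float(i))
    return fractions

def save(fractions, filename):
    """Write 'trial fraction' pairs, one per line, to a plain text file."""
    with open(filename, "w") as f:
        for i, p in enumerate(fractions, 1):
            f.write("%d %f\n" % (i, p))

fractions = flip_coins(10000)
print("after %d flips: %.4f" % (len(fractions), fractions[-1]))
```

Calling `save(fractions, "flips.txt")` (a hypothetical file name) stores the running fraction so that it can be plotted later.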
[Plot: Steps (0 to 120) versus Distance to Edge, n (1 to 10).]
Fig. 1.2: Quadratic growth in the average number of steps to complete a walk.
We show the average number of steps for walks of size n ≤ 10 in Figure 1.2,
overlaid with the curve f(n) = (n + 1)^2 since it takes n + 1 total steps in a
particular direction to actually end the walk. A trial is now an entire random
walk, not just a single coin flip, and we use Algorithm 1.1.1 and 10,000 trials
for accurate results.
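The comparison with (n + 1)^2 can be sketched as follows. Algorithm 1.1.1 is not reproduced here, so this is only our reading of it: repeat the walk many times and average the step counts. The names `walk_steps` and `average_steps` are hypothetical.

```python
from random import random

def walk_steps(n):
    """Run one complete walk of size n and return its number of steps."""
    m = 2 * n + 1
    j = n + 1
    steps = 0
    while 1 <= j <= m:
        if random() < 0.5:
            j += 1
        else:
            j -= 1
        steps += 1
    return steps

def average_steps(n, trials=10000):
    """Average the step count over many independent walks (trials)."""
    total = 0
    for _ in range(trials):
        total += walk_steps(n)
    return total / float(trials)

# Fewer trials than in the text, only to keep this demonstration quick.
for n in range(1, 11):
    print("%2d %8.2f %4d" % (n, average_steps(n, 2000), (n + 1) ** 2))
```

The second and third columns should track each other more closely as the number of trials is raised.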
Fig. 1.3: A large plume of ash during an eruption of the Mt. Cleveland volcano,
Alaska, as seen from the International Space Station on 23 May 2006. Image cour-
tesy of the Image Science and Analysis Laboratory, NASA Johnson Space Center.
Observation of the physical world often reveals behavior that may be explained at
least partially by drifting. For instance, the plume of ash shown in Figure 1.3 moves
in two fundamental ways. First, wind blows the ash in some direction, a topic we
will consider in more detail with the next problem. Second, the ash spreads out and
eventually the plume will break-up in a process called diffusion that can be modeled
by the random motion of individual ash particles. This means that our random walk,
while an absurd method for picking a spot to read, does relate accurately to the
diffusive aspect of a plume’s movement over time.
However, volcanic eruptions may have a worldwide impact over many years,
scales far too large for 3-D particle-by-particle calculations even for state of the art
algorithms running night and day on the fastest computers in the world. Simulations
of such cases require a team of experts (we need you!) to build different models,
better algorithms, and bigger computers.
Fig. 1.4: Blue Mountain supercomputer, Los Alamos National Laboratory, New
Mexico, one of the most powerful computing systems in the world at the end of
the twentieth century. Image courtesy of Los Alamos National Security, LLC.
Parallel Processing
One popular technique for large-scale simulations is the use of multiple systems
connected together to process sub-units of the overall code in parallel. For instance,
individual trials in the previous lab are independent of each other and therefore may
be run simultaneously on separate machines in order to speed up calculation.
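As a rough illustration on a single multi-core machine rather than separate machines, the trials can be split into batches and handed to worker processes with Python's standard multiprocessing module. The batch layout, the per-batch seeds, and the names `run_batch` and `average_steps_parallel` are assumptions made for this sketch.

```python
from multiprocessing import Pool
from random import Random

def run_batch(args):
    """Run `trials` independent walks of size n and return the total steps.

    Each batch gets its own generator, seeded separately, so that batches
    do not repeat one another's coin flips.
    """
    n, trials, seed = args
    rng = Random(seed)
    m = 2 * n + 1
    total = 0
    for _ in range(trials):
        j = n + 1
        while 1 <= j <= m:
            j += 1 if rng.random() < 0.5 else -1
            total += 1
    return total

def average_steps_parallel(n, trials=10000, workers=4):
    """Split the trials evenly across worker processes and combine the totals."""
    per_worker = trials // workers
    jobs = [(n, per_worker, seed) for seed in range(workers)]
    pool = Pool(workers)
    try:
        totals = pool.map(run_batch, jobs)   # batches run simultaneously
    finally:
        pool.close()
        pool.join()
    return sum(totals) / float(per_worker * workers)
```

On platforms that start workers by re-importing the script, `average_steps_parallel(5)` should be called from under an `if __name__ == "__main__":` guard.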
The parallel computing system shown in Figure 1.4 was one of the world’s most
powerful supercomputers. It has since been decommissioned and not even a decade
later the same level of capability became commercially available in graphics cards,
also massively parallel but no longer the exclusive domain of a premier national lab
the use of whose image requires:
Unless otherwise indicated, this information has been authored by an employee or employ-
ees of the Los Alamos National Security, LLC (LANS), operator of the Los Alamos Na-
tional Laboratory under Contract No. DE-AC52-06NA25396 with the U.S. Department of
Energy. The U.S. Government has rights to use, reproduce, and distribute this information.
The public may copy and use this information without charge, provided that this Notice
and any statement of authorship are reproduced on all copies. Neither the Government nor
LANS makes any warranty, express or implied, or assumes any liability or responsibility
for the use of this information.
[Plot: vertical axis 0.5 to 0.9 versus Distance to Edge, n (0 to 25).]
Fig. 1.5: As the size of the walk increases the importance of the first step decreases.
We want to know how important the first step is in determining the direction in
which our drifting finally leads. Code Listing 1.5 shows how the first step can be
“remembered” for later comparison after the entire walk has finished. Your job is
to count the number of times this very first step matches the final direction.
Results for n ≤ 25, shown in Figure 1.5, again reflect the use of 10,000 trials
for each size.
Code Listing 1.5: Remembering the first step for later comparison.
from random import random
#
j=n+1
if random()<0.5:
    j+=1
else:
    j-=1
first_step=(j-(n+1)) # either 1 or -1
#
while 1<=j<=m:
    #
    ...
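A sketch of the counting itself, built around Code Listing 1.5, might look like this. The function names and the encoding of the final direction as 1 or -1 are our own choices.

```python
from random import random

def first_step_matches(n):
    """Run one walk; return True if the first step matches the final direction."""
    m = 2 * n + 1
    j = n + 1
    if random() < 0.5:
        j += 1
    else:
        j -= 1
    first_step = j - (n + 1)            # either 1 or -1
    while 1 <= j <= m:
        if random() < 0.5:
            j += 1
        else:
            j -= 1
    final_direction = 1 if j == m + 1 else -1
    return first_step == final_direction

def match_fraction(n, trials=10000):
    """Fraction of trials in which the first step matched the final direction."""
    count = 0
    for _ in range(trials):
        if first_step_matches(n):
            count += 1
    return count / float(trials)

for n in (0, 1, 5, 25):
    print("%2d %.3f" % (n, match_fraction(n, 2000)))
```

For n = 0 the first step is also the last, so the fraction is exactly 1; for larger n it falls toward one half.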
[Plot: vertical axis 0.5 to 0.9 versus Distance to Edge, n (0 to 25).]
Fig. 1.6: The importance of the first edge reached increases with the size of the walk.
Asking a similar question about the first edge allows us to use the symmetry of our
problem to speed up runtime. It is not necessary to start in the middle and wait for
the drifting to eventually reach an edge; instead, just start on an edge at step one!
Results for n ≤ 25 shown in Figure 1.6 are consistent regardless of which edge we
start on and again reflect the use of 10,000 trials for each size.
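The shortcut can be sketched as follows, assuming we start on the left edge (j = 1) and ask how often the walk ends at the reading spot right next to it (j = 0). The function names are hypothetical.

```python
from random import random

def ends_next_to_first_edge(n):
    """Start on the left edge and report whether the walk ends at j = 0."""
    m = 2 * n + 1
    j = 1                      # by symmetry, skip the drift from the middle
    while 1 <= j <= m:
        if random() < 0.5:
            j += 1
        else:
            j -= 1
    return j == 0

def first_edge_fraction(n, trials=10000):
    """Fraction of trials ending at the spot beside the first edge reached."""
    count = 0
    for _ in range(trials):
        if ends_next_to_first_edge(n):
            count += 1
    return count / float(trials)

for n in (0, 1, 5, 25):
    print("%2d %.3f" % (n, first_edge_fraction(n, 2000)))
```

Starting on the right edge (j = m) and testing for j = m + 1 should give statistically the same fractions.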
Imagine some friends in a sunny field kicking a soccer ball around. If the ball
leaves your foot at 60 miles per hour, at a 30 degree angle to the ground, then
how far does it go? What does its trajectory look like? Such questions are
often asked in math and science classrooms∗ but our investigation will approach
this problem from the standpoint of computer simulation.
We begin by neglecting air resistance but it will be clear that, using our approach,
including air resistance in the model is straightforward. In all cases we restrict our-
selves to 2-D and some behavior, such as lift and spin, will not be considered.
The soccer ball’s velocity and position are initialized in Code Listing 1.7 where
we assume v0 = 26.82 meters per second and θ = π/6 radians have already been
converted from mph and degrees, and cos and sin are imported from math.
Our main loop shown in Code Listing 1.8 updates the position (x, y) based on
the velocity (vx, vy) with vx constant (for now) and vy itself updated by the “pull” of
gravity, g = −9.81 m/s^2 near the surface of the Earth.
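Code Listings 1.7 and 1.8 are not reproduced here, so the following is only a sketch of the initialization and main loop as described. The timestep dt = 0.001, the stopping test y >= 0, and the function name `simulate` are assumptions.

```python
from math import cos, sin, pi

def simulate(v0=26.82, theta=pi / 6, g=-9.81, dt=0.001):
    """Return a list of (t, x, y) rows for a projectile with no air resistance."""
    x, y = 0.0, 0.0
    vx = v0 * cos(theta)       # horizontal velocity, constant for now
    vy = v0 * sin(theta)       # vertical velocity, changed by gravity
    t = 0.0
    rows = []
    while y >= 0.0:            # keep going until the ball is back on the ground
        t += dt
        x += vx * dt
        y += vy * dt
        vy += g * dt
        rows.append((t, x, y))
    return rows

rows = simulate()
t, x, y = rows[-1]
print("landed after %.3f seconds at x = %.2f meters" % (t, x))
```

Writing every row of `rows` to a text file gives the (t, x, y) output that a plotting script can read.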
Deconstructed Parabola
[Figure 1.7: Height, meters (−1 to 10) versus Distance, meters (0 to 70).]
Obviously vy changes since what goes up must come down. We can think of vx
as a non-diffusive “transport” similar to the previously unexamined aspect of the
ash plume’s motion. We use a timestep dt = 0.001 seconds and total time t can also
be tracked. If our loop outputs (t, x, y) at each step then Code Listing 1.9 plots the
overall (x, y) trajectory as shown in Figure 1.7.
Two observations about a soccer ball’s motion make this simulation possible:
• Horizontal and vertical movement can be treated separately.
• Small enough timesteps allow for straightforward modeling.
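To see what "small enough" means in practice, the simulated range can be compared with the textbook closed form for a parabolic trajectory, v0^2 sin(2θ)/|g|. This sketch reuses the same update loop with several timesteps; the particular timesteps chosen and the function name `flight_range` are ours.

```python
from math import cos, sin, pi

V0, THETA, G = 26.82, pi / 6, -9.81

def flight_range(dt):
    """Horizontal distance covered when the ball first drops below y = 0."""
    x, y = 0.0, 0.0
    vx, vy = V0 * cos(THETA), V0 * sin(THETA)
    while y >= 0.0:
        x += vx * dt           # horizontal movement, treated separately
        y += vy * dt           # vertical movement
        vy += G * dt
    return x

exact = V0 ** 2 * sin(2 * THETA) / abs(G)   # closed form, no air resistance
for dt in (0.1, 0.01, 0.001, 0.0001):
    print("dt = %-6g range = %.3f" % (dt, flight_range(dt)))
print("closed form range = %.3f" % exact)
```

As dt shrinks the simulated range settles toward the closed-form value, at the price of proportionally more loop iterations.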
[Plot: Distance, meters (63 to 67) versus Timestep, seconds (1e-06 to 1).]
[Plot: Height, meters (−1 to 6) versus Distance, meters (0 to 70).]
[Plot: Height, meters (−1 to 5) versus Distance, meters (0 to 35).]
A better model for air resistance is a = −c1 v − c2 v^2 but the quadratic term is
more complicated in 2-D and our focus here is foremost on the “big idea” and how
it appears in your code. Namely, the idea is that air resistance opposes the direction
of motion (hence the minus sign) and also scales with increased speed.
Our plots reflect c1 = 0.5, an ad hoc value, where in general this would depend
on the material property of the projectile, its shape, the composition of air, current
altitude, etc. The mass is also an implicit part of the c1 coefficient.
Note the dramatic changes in both height and range compared to our parabolic
trajectory; if this seems too unreasonable you should change c1. (After all, it is your
code.) Disclaimer: Any similarity to an actual product or scenario-of-interest is
unintentional and must occur purely by chance.
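A sketch of the linear term alone, a = −c1 v applied to each component, is given below. It assumes the same launch values as before and c1 = 0.5; the function name `trajectory` and its return values (range and peak height) are our own.

```python
from math import cos, sin, pi

def trajectory(c1=0.5, v0=26.82, theta=pi / 6, g=-9.81, dt=0.001):
    """Return (range, peak height) with linear air resistance a = -c1*v."""
    x, y = 0.0, 0.0
    vx, vy = v0 * cos(theta), v0 * sin(theta)
    peak = 0.0
    while y >= 0.0:
        ax = -c1 * vx          # opposes horizontal motion, scales with speed
        ay = g - c1 * vy       # gravity plus resistance to vertical motion
        x += vx * dt
        y += vy * dt
        vx += ax * dt
        vy += ay * dt
        peak = max(peak, y)
    return x, peak

for c1 in (0.0, 0.25, 0.5):
    r, p = trajectory(c1)
    print("c1 = %.2f: range %.2f m, peak %.2f m" % (c1, r, p))
```

Setting c1 = 0 recovers the parabolic trajectory, which makes the loss of height and range at c1 = 0.5 easy to compare.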
Code Listing 1.11: Fix the x-axis in place for comparing multiple plots.
set terminal png
set output "lab124.png"
set xrange[:35]
plot "lab124.txt" u 2:3 w l notitle,0 w l notitle
[Plot: Height, meters (−1 to 5) versus Distance, meters (0 to 35).]
Fig. 1.11: Trajectory plot with air resistance and a headwind of 10 mph.
Lab124: Wind
Code Listing 1.11 shows a script that fixes the x-axis in place for comparison of
multiple plots. If we assume there is a horizontal wind only and call vw its velocity
then we can model this wind by calculating the relative velocity (vx − vw ) as shown
in Code Listing 1.12. We use relative velocity instead of the absolute vx to determine
the horizontal acceleration due to air resistance.
Since vy is not affected by vw the calculation of ay is exactly as it was before
and comparing Figures 1.10 and 1.11 we see that height is unchanged but range is
diminished. Plots reflecting even stronger winds are shown in Figure 1.12.
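Code Listing 1.12 is not reproduced here; the sketch below shows one way the relative velocity could enter the loop. We assume a headwind is a negative vw, consistent with the positive tailwind used later, and the function name `trajectory` is hypothetical.

```python
from math import cos, sin, pi

MPH = 0.44704   # meters per second in one mile per hour

def trajectory(vw, c1=0.5, v0=26.82, theta=pi / 6, g=-9.81, dt=0.001):
    """Return (range, peak height) for a horizontal wind of velocity vw in m/s."""
    x, y = 0.0, 0.0
    vx, vy = v0 * cos(theta), v0 * sin(theta)
    peak = 0.0
    while y >= 0.0:
        ax = -c1 * (vx - vw)   # resistance depends on velocity relative to the air
        ay = g - c1 * vy       # vertical calculation exactly as before
        x += vx * dt
        y += vy * dt
        vx += ax * dt
        vy += ay * dt
        peak = max(peak, y)
    return x, peak

for mph in (0, 10, 20, 30, 40):
    r, p = trajectory(-mph * MPH)          # negative vw: a headwind
    print("%2d mph headwind: range %.2f m, peak %.2f m" % (mph, r, p))
```

The peak column stays the same in every row because vw never enters the vertical update.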
[Three stacked plots: Height, meters (−1 to 5) versus Distance, meters (0 to 35).]
Fig. 1.12: Headwind trajectory plots. Top to bottom: 20 mph, 30 mph, and 40 mph.
Fig. 1.13: Early users of modern computing systems. Left: A tank in Tunisia, 1943,
the World War II campaign that motivated ENIAC. Right: Ada Lovelace, 1838.
Images courtesy of the U.S. Army and NASA. Tank photo credit U.S. Army Military
History Institute, WWII Signal Corps Photograph Collection.
The idea of a computer is not new. In the nineteenth century mechanical devices
were built to perform otherwise difficult calculations. The first programmer was
Ada Lovelace, shown in Figure 1.13 alongside a World War II era tank, and the U.S.
Defense Department programming language Ada was named after her. During the
North African Campaign our need to update artillery firing tables motivated ENIAC,
famous as the “first” computer although the Z1 was already programmable in Germany
and the ABC was already electronic at Iowa State University.
While more elaborate equations are used in practice, each entry of a firing table
based on our 2-D model would require finding the range x given the wind vw and
initial angle θ , assuming that v0 , g, and all other parameters are fixed. These ranges
could be calculated in parallel and since each entry is independent of any other
this is an “embarrassingly parallel” problem, so-called because the parallel code is
relatively straightforward to write.
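A toy firing table based on our 2-D model might be generated as follows. The grid of angles and winds, the dictionary layout, and the function name `shot_range` are all illustrative assumptions; the physical constants are the ad hoc values used earlier.

```python
from math import cos, sin, radians

def shot_range(theta, vw, v0=26.82, c1=0.5, g=-9.81, dt=0.001):
    """Range x for launch angle theta (radians) and horizontal wind vw (m/s)."""
    x, y = 0.0, 0.0
    vx, vy = v0 * cos(theta), v0 * sin(theta)
    while y >= 0.0:
        ax = -c1 * (vx - vw)
        ay = g - c1 * vy
        x += vx * dt
        y += vy * dt
        vx += ax * dt
        vy += ay * dt
    return x

angles = (15, 30, 45, 60, 75)      # degrees
winds = (-5.0, 0.0, 5.0)           # m/s, negative values are headwinds

# Each entry depends only on its own (angle, wind) pair, so the entries
# could be computed in any order, or on different processors.
table = {}
for deg in angles:
    for vw in winds:
        table[(deg, vw)] = shot_range(radians(deg), vw)

for deg in angles:
    print("%2d " % deg + " ".join("%7.2f" % table[(deg, vw)] for vw in winds))
```

Because no entry reads another entry's result, the double loop could be replaced by a parallel map over the (angle, wind) pairs without changing the table.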
Besides having already generated such a firing table and simply looking up the
answer, how could you find θ to hit a specific target range xT given a known vw ?
[Plot: Height, meters (−200 to 1400) versus Distance, meters (0 to 35).]
Fig. 1.14: Trajectory plot for free fall from a height of just under one mile with a
tailwind of not quite 1 mph. Note carefully the vast difference between horizontal
and vertical scales.
To simulate free fall we set v0 = 0.0 and perhaps y0 = 1500.0 meters, with a slight
tailwind of vw = 0.447 meters per second. Results are shown in Figure 1.14 where
we note the vast difference between horizontal and vertical scales.
When vy stops changing the projectile has reached “terminal velocity” indicating
that ay = 0 because the two terms g and c1 vy , shown again in Code Listing 1.13 for
reference, balance each other out. Detailed plots in Figure 1.15 show the evolution
of vertical velocity and acceleration over time.
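Setting ay = g − c1 vy = 0 gives a terminal velocity of vy = g/c1, about −19.6 m/s for c1 = 0.5. The sketch below checks this numerically using only the vertical update; the 14 second duration and the function name `free_fall` are our own choices.

```python
def free_fall(duration=14.0, c1=0.5, g=-9.81, dt=0.001):
    """Return (vy, ay) after falling from rest for `duration` seconds."""
    t, vy = 0.0, 0.0
    while t < duration:
        ay = g - c1 * vy       # the two terms that eventually balance
        vy += ay * dt
        t += dt
    return vy, g - c1 * vy

vy, ay = free_fall()
print("vy = %.2f m/s, ay = %.3f m/s^2" % (vy, ay))
print("g/c1 = %.2f m/s" % (-9.81 / 0.5))
```

A larger c1 means the balance is reached sooner and at a lower speed.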
Question: When would ax = 0 and thus vx stop changing, too?
[Figure 1.15. Top: Velocity, m/s (−20 to −5) versus Time, seconds (0 to 14). Bottom: Acceleration, m/s^2 (−10 to 0) versus Time, seconds (0 to 14).]
Our final problem for this chapter involves landing on the Moon. Unlike the two pre-
vious problems our code will now have to allow the user to interact with a running
program. Note that Figure 1.16 contains an anachronism: the Earthrise background
photo is from the Apollo 8 mission on 24 December 1968, but the Eagle lunar mod-
ule photo is from Apollo 11 on 20 July 1969, the day before extravehicular activity.
We begin first with no user interaction because the animation alone requires your
complete, focused attention. The underlying model is 1-D freefall with no air resis-
tance as the lunar atmosphere is practically a vacuum. Code Listing 1.14 shows a
complete program, written in Python 2 with Tk and PIL, and your assignment is to
understand what it does.
Line 31 establishes that ay is determined entirely by gM = −1.62 m/s^2 and we
note that compared to Earth’s gE = −9.81 m/s^2 the gravitational acceleration is only
16.5% as strong on the Moon. So, when we conclude that our speed on impact is
just over 40 mph keep in mind that this figure would be closer to 100 mph on Earth;
well, only if air resistance is neglected... thank goodness for air! And parachutes!
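These figures can be checked without running the animation, since free fall from rest through a height y0 ends at speed sqrt(2|g|y0). The sketch below assumes y0 = 100 meters as in the listing; the helper name `impact_speed` is hypothetical.

```python
from math import sqrt

MPH_PER_MS = 3600.0 / 1609.344   # miles per hour in one meter per second

def impact_speed(g, y0=100.0):
    """Speed in m/s after falling from rest through y0 meters, no air resistance."""
    return sqrt(2.0 * abs(g) * y0)

moon = impact_speed(-1.62)
earth = impact_speed(-9.81)
print("Moon:  %.2f m/s = %.1f mph" % (moon, moon * MPH_PER_MS))
print("Earth: %.2f m/s = %.1f mph" % (earth, earth * MPH_PER_MS))
print("gM/gE = %.1f%%" % (100.0 * 1.62 / 9.81))
```

The simulated impact speed differs slightly from these values because the loop advances in finite timesteps.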
Lines 14-16 are familiar; note, however, that they are no longer contained in a loop
but instead in a function tick that controls animation. Commands to make this
animation work appear on Lines 11 and 24, and the call on Line 57 to the canvas
object’s after method starts the entire process with a request that “after 1
millisecond” the tick function should be called, but it does not actually call
the tick function itself.
The function is defined starting on Line 11 and takes no arguments. The condi-
tional call on Line 24 keeps the process going (tick, tick, tick, tick, tick, ...) following
the first call. In addition to facilitating animation we can also think of this repeated
calling as taking the place of our main loop, with Line 23’s if-statement acting like
the “keep going” condition of a while-loop.
Of course only a single frame can be displayed on the screen at any one time
and the graphics are repeatedly updated by Line 21 which moves our image of the
Eagle as it falls. The illusion of crashing is maintained because Line 20 and Line 37
consider y = 0 to be only 70% of the way downscreen, an ad hoc figure.
Curiously, the Tk system requires that the names pmg1 and pmg2 on Line 47
and Line 51 be different; otherwise, if the variable of a photo image is discarded
then the corresponding Tk image will disappear (!) from the window.
Figure 1.17 shows our goal: landing on the Moon instead of crashing.
Fig. 1.16: Descent of the lunar module (not shown to scale). Top: beginning free fall
at an altitude of 100 meters, a contrived scenario. Bottom: crashing at over 40 mph.
Images courtesy of NASA, lunar module photo credit Michael Collins.
[Two plots: Altitude, meters (−20 to 80) versus Time, seconds (0 to 12).]
Fig. 1.17: Altitude over time during lunar descent. Top: crashing at over 40 mph.
Bottom: controlled landing with vertical thrusters burning just before touchdown so
that impact velocity is well under 2 mph.
Code Listing 1.14: A complete program, written in Python 2 with Tk and PIL.
 1 ###############################################################################
 2 #
 3 # Chapter 1: Simulation
 4 # Problem 3: Lunar Module
 5 # Lab 1.3.1: Uncontrolled Descent
 6 #
 7 ###############################################################################
 8 from Tkinter import Tk,Canvas
 9 from PIL import Image,ImageTk
10 ###############################################################################
11 def tick():
12     global t,y,vy
13     #
14     t += dt
15     y += (vy*dt)
16     vy += (ay*dt)
17     #
18     print t,y,vy
19     #
20     yp = 0.7*h/y0*(y0-y)
21     cnvs.coords(tkid,w/4.0,yp)
22     #
23     if y>0.0:
24         cnvs.after(1,tick)
25     #
26 ###############################################################################
27 w,h= 800,600
28 #
29 y0 = 100.0 # meters --> we must stipulate that images are not shown to scale
30 vy = 0.0 # m/s --> so, assuming some previous arrangements at play here
31 ay = -1.62 # m/s^2, acceleration due to gravity near the surface of the Moon
32 #
33 t = 0.0
34 dt = 0.001
35 #
36 y = y0
37 yp = 0.7*h/y0*(y0-y) # linearly interpolate --> crash before the very bottom
38 #
39 print t,y,vy
40 ###############################################################################
41 #
42 root=Tk()
43 cnvs=Canvas(root,width=w,height=h,bg='black')
44 cnvs.pack()
45 #
46 img1=Image.open('earthrise.jpg').resize((w,h))
47 pmg1=ImageTk.PhotoImage(img1)
48 cnvs.create_image(w/2.0,h/2.0,image=pmg1)
49 #
50 img2=Image.open('eagle.jpg').resize((200,170))
51 pmg2=ImageTk.PhotoImage(img2)
52 tkid=cnvs.create_image(w/4.0,yp,image=pmg2)
53 #
54 f=('Times',14,'bold')
55 cnvs.create_text(w-110,h-15,text='Images courtesy of NASA.',font=f)
56 #
57 cnvs.after(1,tick)
58 root.mainloop()
59 #
60 # end of file
61 #
62 ###############################################################################
Fig. 1.18: Controlled lunar descent where the black rectangle now artificially marks
our landing site and the current velocity is indicated in meters per second. Images
courtesy of NASA, lunar module photo credit Michael Collins.
Now that our animation is working we can add user interaction. Code Listing 1.15
shows commands both to display current velocity vy and also to fire vertical thrusters
by pressing the spacebar. Figure 1.18 suggests further helping the user control for a
soft landing by artificially marking the landing site.
Code Listing 1.15: Additional commands that allow for user interaction.
def tick():
    # add this line to the existing body of tick in Code Listing 1.14
    cnvs.itemconfigure(tkid2,text='%0.2f'%vy)
#
def spacebar(evnt):
    global vy
    vy+=1.0 # another ad hoc figure
#
tkid2=cnvs.create_text(w-75,50,text='%0.2f'%vy,fill='white')
#
root.bind('<space>',spacebar)
[Two plots: Altitude, meters (−20 to 80) and Velocity, m/s (−18 to 2) versus Time, seconds (0 to 14).]
Fig. 1.19: Altitude and velocity over time during lunar descent. The case shown here
is a “soft landing” where vertical thrusters are burned just after the 10 second mark.
[Figure 1.20: plot over Time, seconds (0 to 12) with paired vertical tick labels 60 and 168, 40 and 252, 20 and 336, 0 and 420.]
The “sawtooth curve” in Figure 1.19’s velocity plot reflects the fact that we instan-
taneously add 1.0 to vy whenever the spacebar is pressed, because the code is easier
to write this way, but in reality the act of burning thrusters would itself take time
making the vy curve smoother.
Translating from altitude to screen position, as shown again in Code Listing 1.16
for reference, assumes
\[ \frac{y_p - 0}{0.7h - 0} = \frac{y - y_0}{0.0 - y_0} \]
which calculates yp from a known y given that 0 (screen position) corresponds to
y0, and 0.7h with 0.0 (altitude). Vertical coordinates are often inverted as
Figure 1.20 shows because the upper-left corner tends to anchor a graphics window.
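Solving this proportion for yp gives the expression used on Line 20 of Code Listing 1.14. The sketch below wraps it in a hypothetical helper `to_screen` and evaluates a few altitudes using h = 600 and y0 = 100.0 from that listing.

```python
def to_screen(y, y0=100.0, h=600):
    """Map altitude y (meters) to a vertical screen position (pixels).

    Altitude y0 maps to screen position 0 and altitude 0.0 maps to 0.7*h,
    so the screen coordinate grows as the altitude shrinks.
    """
    return 0.7 * h / y0 * (y0 - y)

for y in (100.0, 60.0, 40.0, 20.0, 0.0):
    print("altitude %5.1f m -> screen position %5.1f" % (y, to_screen(y)))
```

The same function could be inverted to convert a mouse click on the screen back into an altitude.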
Your assignment is to generate the plots shown in Figures 1.21, 1.22, and 1.23.
[Two plots: Altitude, meters (−20 to 80) and Velocity, m/s (−14 to −2) versus Time, seconds (0 to 35).]
Fig. 1.21: Altitude and velocity over time during lunar descent. In this case the lunar
module “hovers” mid-descent before crash landing at just under 30 mph.
[Two plots: Altitude, meters (−20 to 140) and Velocity, m/s (−25 to 20) versus Time, seconds (0 to 40).]
Fig. 1.22: Altitude and velocity over time during lunar descent. In this case the lunar
module performs a secondary “climb” before crash landing at just under 50 mph.
[Two plots: Altitude, meters (−20 to 80) and Velocity, m/s (−3.5 to 0.5) versus Time, seconds (0 to 40).]
Fig. 1.23: Altitude and velocity over time during lunar descent. In this case the lunar
module maintains a constant velocity “profile” and lands safely at just over 5 mph.
[Plot: bloodflow profile versus Time, seconds (0 to 1).]
Fig. 1.24: Computing applied to diagnosis and treatment. Left: An operating room
during surgery, the culmination of extensive preparation. Right: Typical volumetric
bloodflow profile from a single heartbeat, one starting point toward the expansion
of knowledge. Image courtesy of the U.S. Army, photo credit C. Todd Lopez.
Medical Applications
Simulation has become sufficiently robust to influence medicine by, for instance,
modeling the flow of blood within our bodies. Figure 1.24 suggests one piece of
minimal information needed for diagnosis or treatment.
The process by which arteries transport blood away from our heart is similar to
incompressible flow in a pipe where viscosity dictates the interaction with tissue
lining the inner wall of our blood vessels. Unlike a lunar module descent velocity,
bloodflow is periodic and the waveform pattern is repeated with each heartbeat.
These flows were studied by Poiseuille in the nineteenth century while the more
general set of equations was, around the same time, formulated by both Navier and
Stokes. An inviscid model had already been developed by Euler in the eighteenth
century using Newton’s famous second law of motion from the seventeenth century.
But only in the latter twentieth century have computers become powerful enough to
apply these models to practical cases.
So, science waited centuries for machines to catch up. Now what?
Running an experiment that may pose some danger to a human patient is
governed chiefly by issues related to medical ethics, and we choose to err on the
side of “first, do no harm.” Again this makes simulation an attractive technique,
but when a patient’s life or death (!) depends on proper diagnosis and treatment
the accurate performance of our code takes on a fundamentally different level
of importance.
Fig. 1.25: Controlled lunar descent with limited fuel availability. Remaining fuel is
indicated in number of times vertical thrusters may be burned, initially set to 20.
Images courtesy of NASA, lunar module photo credit Michael Collins.
While we can press the spacebar as often as we like, in real life the lunar module
contained a limited amount of fuel. Code Listing 1.17 shows one way to model
limited fuel in our code and Figure 1.25 shows the current fuel level communicated
to the user; when the number reaches 0 we can no longer fire the thrusters.
Code Listing 1.17: Limited fuel where we track the current level remaining.
def spacebar(evnt):
    global vy,u
    #
    if u>0:
        #
        vy+=1.0
        u -=1
[Plot: Velocity, m/s (−12 to −2) versus Time, seconds (0 to 18).]
Fig. 1.26: As shown, vertical thrusters fire at regular intervals but we do not even
use all our fuel. Upon modification of the firing schedule your author’s best effort
was a vy = −0.17 m/s landing, under 0.5 mph.
Do you think the target rectangle, fuel indicator, and velocity text are easy to use?
An alternative would be to define a strategy in the code and thus automate the entire
process (i.e., no longer use the spacebar at all). Code Listing 1.18 specifies a firing
schedule based on current elapsed time, with the result (a crash) reported in Figure 1.26.
Code Listing 1.18: Control of vertical thrusters with a firing schedule.
def tick():
    global t,y,vy,i
    #
    if i<len(fs) and t>=fs[i]:
        vy+=1.0
        i +=1
    #
#
i=0
fs=[ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0,10.0,\
    11.0,12.0,13.0,14.0,15.0,16.0,17.0,18.0,19.0,20.0]
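A firing schedule can be tried out quickly without any graphics. The sketch below runs the same update as tick in a plain loop, so that different schedules can be compared by their landing velocity; the function name `descend` and its return values are assumptions, and the constants are those of Code Listing 1.14.

```python
def descend(fs, y0=100.0, ay=-1.62, dt=0.001):
    """Simulate the descent with firing schedule fs (a list of times in seconds).

    Returns (t, vy, i): landing time, landing velocity, and the number of
    times the thrusters were fired before touchdown.
    """
    t, y, vy, i = 0.0, y0, 0.0, 0
    while y > 0.0:
        if i < len(fs) and t >= fs[i]:
            vy += 1.0          # same ad hoc thrust as the spacebar
            i += 1
        t += dt
        y += vy * dt
        vy += ay * dt
    return t, vy, i

fs = [float(k) for k in range(1, 21)]      # fire once per second, 20 times
t, vy, fired = descend(fs)
print("landed at t = %.2f s with vy = %.2f m/s after %d firings" % (t, vy, fired))
```

Editing the list `fs` and rerunning is a fast way to search for a softer landing before returning to the animated version.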
Chapter 2
Graphics
Curiosity might ask how an image is built at all. In this chapter we consider three
different representations: a bitmap assigning colors to each pixel, a scalable model
requiring later calculations to render the image, and an even higher level specifica-
tion where the underlying details are completely hidden from view.
As an example of our first distinction, PPM stands for “portable pixel map” and
this format specifies literally hundreds of thousands of red-green-blue color values
in very large image files. SVG is “scalable vector graphics” where images contain
only a geometric description of lines and curves, thus reducing file size by delaying
the pixel-by-pixel rendering process. This approach has the advantage that enlarg-
ing a vector image is immune from any pixelation issues the corresponding bitmap
would face, but it also means the original image must be conceived in terms of
geometric objects which is not trivial for, say, a photograph.
Fig. 2.1: Example image files: gold, silver, and Olympus Mons, Mars, the largest
known volcano. Image courtesy of NASA from the Viking 1 mission, 22 June 1978.
Table 2.1 shows a very small 5 × 5 image. If one byte is used to specify each color
value and there are 25 pixels, then we require only 75 bytes to store all RGB data
for this entire image. (Each dot stores red 0-255, green 0-255, blue 0-255 at the
pixel center.) PPM files also include a header listing such information as the width
and height of our image, but the header does not scale with image size in the same
manner that the amount of color data will.
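As a rough check of the byte count above, the following sketch builds a binary PPM (the "P6" variant) in memory. The function name make_ppm and the solid fill color are assumptions made only for illustration.

```python
def make_ppm(width, height, color=(255, 215, 0)):
    """Return (header, data) for a solid-color binary PPM image."""
    # Header: magic number, width and height, maximum color value.
    header = ("P6\n%d %d\n255\n" % (width, height)).encode("ascii")
    # One byte each for red, green, and blue at every pixel.
    data = bytes(bytearray(color)) * (width * height)
    return header, data

header, data = make_ppm(5, 5)      # len(data) is 3 * 25 = 75 bytes
```

Writing header + data to a file opened in binary mode should give a viewable image; note that the header grows only with the number of digits in the width and height.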
Lab211: Circle π
Figure 2.2 shows one quadrant of the unit circle within a unit square. Your assignment
is to produce such an image. Code Listing 2.2 specifies a side-length m which
in turn determines the total number of pixels $n = m^2$. The variable count will be
used to count how many of these n pixels are also inside our unit circle.
In addition, Figure 2.3 shows pixelation when a smaller bitmap is enlarged, either
by direct calculation or with interpolation of the color values. Free tools are available
for this kind of image manipulation.
Fig. 2.2: One quadrant of the unit circle within a unit square. Note y-coordinates are
often inverted in graphical systems, for either interactive windows or stored files.
Fig. 2.3: Pixelation when a smaller bitmap is enlarged. Left: direct calculation.
Right: linear interpolation of colors using the GNU Image Manipulation Program.
Table 2.3: Size (no image), computed value of π , and associated runtime.
The area of a circle is $A = \pi r^2$ and so for the unit circle, where $r = 1$, area is
$A = \pi$. For only one quadrant area is $A = \pi/4$. If the n pixels in our image represent
a unit square with area $A = 1$, the number of pixels inside the circle will relate to n
by a π : 4 ratio. Since we count these pixels in our code we may approximate π with
improving accuracy as n increases, shown in Tables 2.2 and 2.3 with runtimes for
an Intel® Core i7-940 chip.
Code Listing 2.3 converts from pixel coordinates to unit coordinates, and also
shows how the putpixel method is called on an image to set the RGB color
value of a single pixel. Of course the value of (x, y) rather than (xp, yp) determines
if a point is inside the unit circle or not.
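The counting idea might be sketched as follows, without drawing any image. The names m, n, count, xp, and yp follow the text; the function name estimate_pi and the choice to test each pixel at its center are assumptions.

```python
def estimate_pi(m):
    """Approximate pi by counting pixels inside one quadrant of the unit circle."""
    n = m * m
    count = 0
    for yp in range(m):
        for xp in range(m):
            # Convert pixel coordinates to unit coordinates (pixel centers).
            x = (xp + 0.5) / m
            y = (yp + 0.5) / m
            if x * x + y * y <= 1.0:
                count += 1
    # count : n is approximately pi : 4
    return 4.0 * count / n
```

Since every one of the n pixels is checked, runtime grows with the square of m.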
∗ Estimated runtime for size $m = 10^6$ is over ten days. Our eleven-second runtime is based on a
more efficient calculation suggested on the next page. A common story: the necessity of running a
large problem is what compels us to consider a more sophisticated technique in the first place.
2.1 Pixel Mapping 37
Fig. 2.4: Circle divide-and-conquer. Left: binary search, only the marked pixels are
checked and all other pixels are classified automatically. Right: quadtree, a similar
idea in 2-D where only the corners of each box are checked. In general our goal is to
localize calculations for larger sizes along the edge of the circle, where they matter.
[Plot: In-Out Checks, percentage, versus Size of Image File, pixels per side (500 to 3500), for binary search and quadtree.]
Fig. 2.5: Divide-and-conquer savings. For the previous size $m = 10^6$ result we might
alternatively process the $10^6 \times 10^6$ = 1 trillion pixels in parallel using over 75,000
computers to achieve the same runtime performance; thus we tend to prefer a better
algorithm to a bigger computer (or more computers) when possible.
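One possible reading of the binary search idea is sketched below: within a single row the pixels change from inside to outside exactly once, so the boundary can be located with a handful of checks. This is not necessarily the calculation used for the reported runtimes, and the names inside and count_binary are hypothetical.

```python
def inside(xp, yp, m):
    """True if the center of pixel (xp, yp) lies inside the unit circle."""
    x = (xp + 0.5) / m
    y = (yp + 0.5) / m
    return x * x + y * y <= 1.0

def count_binary(m):
    """Count inside pixels using one binary search per row."""
    count = 0
    checks = 0
    for yp in range(m):
        lo, hi = 0, m              # number of inside pixels lies in [lo, hi]
        while lo < hi:
            mid = (lo + hi) // 2
            checks += 1
            if inside(mid, yp, m):
                lo = mid + 1       # boundary is to the right of mid
            else:
                hi = mid           # boundary is at mid or to its left
        count += lo                # pixels 0..lo-1 of this row are inside
    return count, checks
```

The count matches the brute-force result while the number of checks grows like m log m rather than m squared.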
Fig. 2.6: Percolate pixelate. Top to bottom: p = 0.5, p = 0.6, and p = 0.7.
Consider an 80 × 60 image where each pixel is colored green with probability p and
purple with probability 1 − p. Algorithm 2.1.1 enlarges this image to 800 × 600 by
converting each pixel into a 10 × 10 block of 100 pixels. Figure 2.6 shows typical
results for p = 0.5, p = 0.6, and p = 0.7. Later in Chapter 5 we will ask:
For what probabilities will there be a green pathway connecting all four sides?
If each pixel has at most four neighbors (i.e., not counting diagonals) then it appears
the p = 0.5 image does not have a path while p = 0.7 clearly does. For p = 0.6 it is
not at all obvious what will happen in general.
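Algorithm 2.1.1 is not reproduced here, but the enlargement step might look like the sketch below, which works on a grid of Boolean values (True for green, False for purple) rather than on an image file. The function names and the fixed seed are assumptions.

```python
import random

def random_grid(w, h, p, rng):
    """Grid of Booleans: True (green) with probability p, else False (purple)."""
    return [[rng.random() < p for _ in range(w)] for _ in range(h)]

def enlarge(grid, k):
    """Replace every pixel by a k-by-k block of the same value."""
    big = []
    for row in grid:
        wide = [v for v in row for _ in range(k)]   # repeat across
        for _ in range(k):                          # repeat down
            big.append(list(wide))
    return big

rng = random.Random(0)          # fixed seed only to make runs repeatable
small = random_grid(80, 60, 0.6, rng)
big = enlarge(small, 10)        # 800 columns by 600 rows
```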
This question is related to “percolation” or the flow of fluids (e.g., groundwater)
in porous material such as rock or a layer of sediment (or the flow of boiling water
through coffee grounds in a percolator). Percolation theory applies graph algorithms
and statistics to what was originally conceived as a physical science problem.
Also, in this context pixelation is actually helpful because it aids the human eye in
tracing pathways across the image. It would not be desirable to use a “better” image
with some form of interpolation, although eventually this will not matter once we
have coded an automatic tool to determine if such a pathway exists.
Fix three points P1, P2, and P3. As shown in Figure 2.7 these are P1 = (0.5, 0.1),
P2 = (0.1, 0.9), and P3 = (0.9, 0.9), if we map the image pixel coordinates to unit
square coordinates. Randomly initialize a fourth point P = (x, y), pick one of the
three fixed-points also at random, then move P halfway toward that point and draw
the corresponding pixel in your image. Now repeat: randomly pick one of the three
fixed-points, move P halfway from its current position, and draw the pixel.
Our result is a famous fractal called Sierpinski’s Gasket and this random drawing
process is known as a Chaos Game. Experiment with the total number of loops for
yourself but Figure 2.8 provides some guidance for a 300 × 300 image. Note how
we reach “carrying capacity” because there are only so many pixels to draw.
Alternative drawing techniques are suggested in Figure 2.9.
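A minimal sketch of the Chaos Game follows. It records the set of pixels that would be drawn rather than writing an image, which also makes the "carrying capacity" easy to observe. The function name chaos_game and its parameters are assumptions.

```python
import random

def chaos_game(loops, size=300, seed=None):
    """Return the set of (xp, yp) pixels visited by the Chaos Game."""
    rng = random.Random(seed)
    fixed = [(0.5, 0.1), (0.1, 0.9), (0.9, 0.9)]   # P1, P2, P3
    x, y = rng.random(), rng.random()              # random starting point P
    pixels = set()
    for _ in range(loops):
        fx, fy = rng.choice(fixed)                 # pick a fixed point at random
        x = (x + fx) / 2.0                         # move P halfway toward it
        y = (y + fy) / 2.0
        pixels.add((int(x * size), int(y * size)))
    return pixels
```

Comparing len(chaos_game(25000)) with len(chaos_game(100000)) gives a feel for how few new pixels the later loops contribute.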
[Plot: horizontal axis, Total Number of Loops (0 to 100000); vertical axis ticks from 0 to 7500.]
Fig. 2.8: Diminishing marginal returns. Since there are only so many pixels to draw
eventually more-and-more looping introduces fewer-and-fewer new pixels.
Fig. 2.9: Pascal’s Triangle. Obviously we could generate a similar image geometrically
by starting with a whole triangle and recursively discarding the middle quarter.
Or, here we suggest a technique based on coloring the even-odd values in Pascal’s
much used triangle, albeit for a sample this small our image is quite pixelated.
\[ x = x_1 + t \cdot (x_2 - x_1) \]
\[ y = y_1 + t \cdot (y_2 - y_1) \]
Clearly when t = 0.0 we draw P1 , when t = 1.0 we draw P2 , and for 0.0 < t < 1.0
the pixels in-between are drawn. However, if we choose dt too large then we may
draw a “dotted” line and for dt too small we waste time re-drawing the same pixels
over and over again.
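One way to choose dt, offered only as a sketch, is to take as many steps as the longer of the horizontal and vertical extents so that no step travels more than one pixel. The function name line_pixels is hypothetical and the pixels are returned in a list instead of being drawn.

```python
import math

def line_pixels(x1, y1, x2, y2):
    """Pixels along the segment from (x1, y1) to (x2, y2), in drawing order."""
    steps = int(math.ceil(max(abs(x2 - x1), abs(y2 - y1))))
    if steps == 0:
        return [(int(math.floor(x1 + 0.5)), int(math.floor(y1 + 0.5)))]
    pixels = []
    for k in range(steps + 1):
        t = k / float(steps)               # dt = 1/steps, so t runs 0.0 to 1.0
        x = x1 + t * (x2 - x1)
        y = y1 + t * (y2 - y1)
        # Round to the nearest pixel.
        pixels.append((int(math.floor(x + 0.5)), int(math.floor(y + 0.5))))
    return pixels
```

With this dt the line is never "dotted" and at most a few pixels are drawn twice.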
As shown in Figure 2.10 your assignment is to draw 50 random lines where P1
is chosen inside a circle with radius r = 75 (all units are in pixels and the image
size is 250 × 250) and P2 is outside that circle but inside another concentric circle
with r = 125. Note the lack of anti-aliasing here, as in Jack Bresenham’s famous
line drawing algorithm but later addressed by Xiaolin Wu.
We might imagine the spirals shown in Figure 2.11 are the result of four bugs chasing
each other, or four dogs, or even penguins. But as the next problem will use
turtles we imagine them initialized at the corners of a box and looking only at their
nearest clockwise neighbors. Steps “simultaneously” move the turtles 10% of the
way toward these nearest neighbors while also drawing the lines of sight.
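The "simultaneous" step might be sketched as below, using unit-square coordinates and returning the lines of sight instead of drawing them. The corner ordering, the function name chase, and the fraction parameter are assumptions.

```python
def chase(steps, fraction=0.1):
    """Four turtles start at the corners of the unit square and chase."""
    # Corner order is an assumption: each turtle pursues the next in the list.
    pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    sight = []                                   # lines of sight to draw
    for _ in range(steps):
        nxt = pts[1:] + pts[:1]                  # each turtle's target
        sight.extend(zip(pts, nxt))
        # Build the new list from the OLD positions so that all four
        # turtles move "simultaneously".
        pts = [(x + fraction * (tx - x), y + fraction * (ty - y))
               for (x, y), (tx, ty) in zip(pts, nxt)]
    return pts, sight
```

Updating the turtles one at a time, in place, would let later turtles see positions that have already moved and so distort the spiral.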
Code Listing 2.4: Color codes for tan and dark green.
#
img=Image.new('RGB',(w,h),(210,180,140)) # tan
#
img.putpixel((x,y),(0,100,0)) # dark green
#
Like an SVG file our turtle programs will specify how shapes should be drawn, and
both systems delay the actual pixel-by-pixel rendering process until either the vector
image is displayed or the turtle walks along its path (carrying a marker, of course).
Each format is a high-level representation of a drawing that may be scaled to an image
of any size. Similar techniques are used to render 3-D models from CAD files.
In Code Listing 2.5 a general drawline function is defined on Line 11, essential
turtle functions on Lines 25-45, and Line 46 onward is program specific. Output is
shown in Figure 2.12. Initialization of the turtle (Lines 54-56) is required for any
drawing. Your assignment is to replace the ellipses with working code.
Fig. 2.12: Turtle square, rendered in a variety of sizes, sequences, and directions.
2.2 Scalable Format 45
Figure 2.13 shows polygons with n = 3, 4, 5, 6 sides and side-length 40 pixels (top)
and n = 7, 8, 9, 10 of side-length 20 pixels (bottom). Code Listing 2.6 shows how
easy square is to write once the poly function is working.
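The geometry behind a poly function can be sketched independently of the turtle code in the listings: move forward one side, then turn by the exterior angle 360/n, repeated n times. The functions below are hypothetical stand-ins that return vertices rather than drawing.

```python
import math

def poly_vertices(n, side, x=0.0, y=0.0, heading=0.0):
    """Vertices visited by a turtle drawing a regular n-sided polygon."""
    pts = [(x, y)]
    for _ in range(n):
        x += side * math.cos(math.radians(heading))   # move forward one side
        y += side * math.sin(math.radians(heading))
        heading += 360.0 / n                          # turn by the exterior angle
        pts.append((x, y))
    return pts

def square_vertices(side):
    """A square is just the n = 4 case."""
    return poly_vertices(4, side)
```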
Figure 2.14 shows polygons spinning in beautiful ways. The idea of turtles drawing
vector graphics is not new and versions for many different programming languages
might be used in a variety of contexts and education levels.
Our code uses “local” variables in drawline because values of t, x, and y are
unimportant once the function has finished executing. But we use “global” variables
in jump and turn because the values of xt, yt, and ht are essential in tracking our
turtle as it moves across the image over time.
Function move uses each kind of “scope” and note how Python treats arguments
like r, dh, and size as local copies∗ of the original. By default an assignment
in a function will create a local variable, as with t and x and y. When you see
the global command it is telling Python not to do this, that the variable is not local,
as with a turtle’s data xt and yt and ht.
∗ Variables stored with a “pointer” and passed to a function may be altered but not by assignment.
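The footnote's distinction might be illustrated with a list, one example of a value that is passed by reference. The names reassign, mutate, turtle_a, and turtle_b are made up for this sketch.

```python
def reassign(pos):
    pos = [0.0, 0.0]        # assignment rebinds the local name only

def mutate(pos):
    pos[0] = 0.0            # alters the caller's list in place
    pos[1] = 0.0

turtle_a = [20.0, 100.0]
reassign(turtle_a)          # turtle_a is still [20.0, 100.0]

turtle_b = [20.0, 100.0]
mutate(turtle_b)            # turtle_b is now [0.0, 0.0]
```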
#
def jump(xnew,ynew):
    global xt,yt
    #
    xt=xnew
    yt=ynew
    #
#
...
#
xt= 20.0
yt=100.0
#
jump(0.0,0.0)
#
print xt,yt # output is 0.0 0.0
#
#
def jump(xnew,ynew):
    #
    xt=xnew
    yt=ynew
    #
#
...
#
xt= 20.0
yt=100.0
#
jump(0.0,0.0)
#
print xt,yt # output is 20.0 100.0
#
Figure 2.15 shows polygons spinning while side-length also changes. In particular,
the circular spiral was inspired by an automated lawn mower project where the gap
width between layers had to match the specific width of the real-life lawn mower.
Can you draw this spiral with a purposeful gap width?
Turtle code may be organized into modules as Python does with its Tk library,
math module, the Python Imaging Library (PIL), and even a built-in turtle:
http://docs.python.org/library/turtle.html
A drawline function might be in a general-use module for graphics separate from
the turtle library. (In fact PythonWare® has ImageDraw in PIL.) Programs could access
these resources with the same kind of import statements we have been using
for “official” modules. We did not organize our programs this way only because the
code is small and, especially when beginning, ease-of-use is highly valued.
Fig. 2.16: Random tree, we might also use random angles at each branching.
Figure 2.16 shows most, but not all, of a tree. The tree was drawn by 800 turtles,
each beginning at the root and walking up the trunk, making 799 redundant lines
just to start. Then, at random, half of the turtles branched left and half right.
This random branching process continued for nine total steps with size decreasing
at each level. Some turtles walked the exact same total path as others, not contributing
anything new to the overall drawing. This redundancy is more probable at
all levels as time passes so we again have diminishing marginal returns, as shown
in Figure 2.17. Depending on how much the size changes the details of these plots
will vary but the overall characteristic remains the same.
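The redundancy can be measured without drawing anything. The sketch below counts new line segments rather than new pixels (an assumption made to keep it short), identifying each segment by the sequence of left and right choices leading to it. The function name tree_walks is hypothetical.

```python
import random

def tree_walks(turtles=800, levels=9, seed=0):
    """Count the new segments contributed by each random walk up the tree."""
    rng = random.Random(seed)
    drawn = set()                 # each segment is identified by its path
    new_per_walk = []
    for _ in range(turtles):
        path = ""                 # the empty path stands for the trunk
        new = 0
        if path not in drawn:
            drawn.add(path)
            new += 1
        for _level in range(levels):
            path += rng.choice("LR")      # branch left or right at random
            if path not in drawn:
                drawn.add(path)
                new += 1
        new_per_walk.append(new)
    return new_per_walk, len(drawn)
```

The first turtle always contributes ten new segments (the trunk plus nine branches); later turtles contribute fewer and fewer.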
Later in Chapter 5 we will see an alternative technique called recursion that can
be used to draw the entire tree precisely with a single (!) turtle. Other recursive
possibilities are shown in Figures 2.18 and 2.19.
[Plots: per-walk values (0 to 250) and Total Pixels Drawn by All Walks (0 to 5000), each versus Random Walks (0 to 800).]
Fig. 2.17: Diminishing marginal returns. Top: new pixels drawn by current walk.
Bottom: total pixels drawn by all walks, shown after each new walk.
Fig. 2.20: Click count, where your own program need not show the counts.
We return to interactive graphics with a goal to build drawing programs for non-
commercial sketchwork. The final suggested version will include both tool and color
selectors and is a starting point for a full piece of software. The user experience is
foremost so we continually ask ourselves: “How would someone new to this react?”
Feature-creep must be carefully guarded against as clutter and incoherence do not
make a positive interface. Keep in mind what the application is supposed to be.
Writing “software” is not the same as writing “code” because code is written for
only yourself to run. Software has to keep working even after you leave the room so
we need to make assumptions about what a typical user would want, and also what
they can actually do. Comparing vector graphics to pixel-by-pixel data, software is a
representation at an even higher level where the user does not see any of the details.
In the same way, as a coder you are like the user when importing a library you did
not write, or a library that you wrote but just not recently.
2.3 Building Software 55
Our first idea is to respond when the user clicks the mouse; every second click we
draw a line connecting the two previous click locations. The click count numbers
shown in Figure 2.20 have been added only to show how the lines were drawn and
they should not be present in your own version unless you really mean for them to
be a part of someone’s sketch.
Code Listing 2.7 is a shell where the event-object evnt knows the (x, y) location
of each mouse click. Note again how we must specify that count, x, and y are defined
at the global scope since function click contains an assignment statement for
each of these variables. In this manner their values will persist between successive
calls (i.e., between successive clicks). Line 30 assumes we want left-clicks rather
than right-clicks, thus Button-1 rather than Button-3.
Fig. 2.21: Polyline, right-click ends each chain, left-click starts the next one.
Lab232: Polyline
Figure 2.21 shows a “polyline” where a right-click ends each chain. Key events are
also shown in Code Listing 2.8 and function exit is from the sys module.
Code Listing 2.8: Mouse click events and quitting the program.
#
def click(evnt):
    ...
#
def rightclick(evnt):
    ...
#
def quit(evnt):
    exit(0)
#
root.bind('<Button-1>',click)
root.bind('<Button-3>',rightclick)
root.bind('q',quit)
#
Of course when we draw a pencil sketch we do not press the paper only at the
endpoints of a line but at every single point. We can implement this feature using
drag events as shown in Code Listing 2.9 and Figure 2.22, a quick “review” session.
Fig. 2.22: A stroll down memory lane with highlights from our previous labs.
Fig. 2.23: A lot of rectangles, each is drawn so you see it grow as you drag.
Lab234: Rectangles!
The idea is that as the mouse is dragged you can see the rectangle changing size,
updated dynamically. Each rectangle is fixed in place only as the button is released.
Code Listing 2.10: Rectangle commands.
#
tkid=cnvs.create_rectangle( ... ,fill='')
cnvs.coords(tkid, ... )
#
We might also like to see the growing line update as we move (not drag) in our
previous line and polyline programs, before a click sets the second or next endpoint.
The coords command works the same on lines as it does on rectangles, shown in
Figure 2.24 where a line is growing first from Point 5 to Point 6 and then, presumably,
a future Point 7. Code Listing 2.12 shows how to respond to mouse motion
events when no button is being pressed.
In addition, Code Listing 2.13 shows a double-click event we might use to connect
the last endpoint of a polyline back to the first endpoint. In this case we must
assume the location of our first click was remembered at the time it happened because
it cannot be reconstructed later.
Code Listing 2.12: Mouse motion event.
#
def move(evnt):
    ...
#
root.bind('<Motion>',move)
#
Figure 2.25 shows all our previous drawing tools as options, plus a color selector,
and one last new feature: spray paint. (Disclaimer, vandalism is a crime.) This tool
presents a number of coding challenges most notably that the spray paint should still
work when the button is pressed even if the mouse is not moving, so drag events
alone are not sufficient.
One solution is to use animation where a click sets some Boolean variable true,
the subsequent release sets it false, and drag events update the (x, y) location. All the
while our tick function is drawing random 1 × 1 rectangles somewhere in a circle
centered at the current (x, y), for as long as the mouse button is still being pressed.
In this context random points clustered near the center of a circle actually match the
reality of a spray can.
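The random points themselves might be generated as sketched below, leaving out all window and event handling. Choosing the radius uniformly is what clusters points near the center; the function name spray_points and the seed value are assumptions.

```python
import math
import random

def spray_points(xc, yc, radius, n, rng):
    """Return n random points in a circle, clustered toward the center."""
    pts = []
    for _ in range(n):
        r = radius * rng.random()             # uniform radius: dense near center
        a = 2.0 * math.pi * rng.random()      # uniform angle
        pts.append((xc + r * math.cos(a), yc + r * math.sin(a)))
    return pts

rng = random.Random(2011)     # explicit seed: the same pattern on every run
dots = spray_points(100.0, 100.0, 20.0, 50, rng)
```

About half of the points land within half the radius, although that inner disk holds only a quarter of the area.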
The graffiti icon displayed among the tools will be different with each run of
the program unless we set an explicit random number seed. (In the same way exact
replication of simulation results may be obtained, an important requirement of any
experiment.) You might also consider including a fill-the-area tool, a cut-and-paste
option, and some facility for saving the current picture to an image file.
Fig. 2.25: Graffiti tool, one of many options our users enjoy.
Chapter 3
Visualization
An ability to create graphics in various forms is not the same as an ability to say
something important and understandable using graphics. A famous high-quality
example is shown in Figure 3.1 by Charles Joseph Minard, from 1861. This map
depicts a shrinking French army in 1812, first invading Russia and then retreating.
Fig. 3.1: Napoleon’s 1812 invasion of Russia, by Charles Joseph Minard, 1861.
Fig. 3.2: Cholera outbreak in London, by John Snow, 1854. The black dots represent
cholera deaths and stacks of these can be found near the Broad Street water pump.
Figure 3.2 is from John Snow’s 1854 study of a cholera outbreak. The concentration
of cases near a particular water source, made obvious by Snow’s drawing, led to an
understanding that contaminated water was involved in spreading the disease.
The composite image of our nighttime Earth shown in Figure 3.3 was produced
by NASA from satellite data. Such a picture is not only impossible but probably
also unimaginable without advanced technology, and a time-traveler from the past
could not be expected to make much sense of it. However, it should say a lot to us.
County-level results from Florida’s contentious Bush v. Gore presidential election
are shown in Figure 3.4. Geospatial plots have become much easier to build since
polygon shape files for U.S. regions, as well as election results, are now publicly
accessible on the Internet:
http://www.census.gov/geo/www/cob/bdy_files.html
http://election.dos.state.fl.us/elections/resultsarchive/
Article 2, Section 1 of the U.S. Constitution says the Electoral College is based
on states, not counties, but since administration of elections happens at the county
level (local level) this is where complaints were made.
3.1 Geospatial Population Data 63
Fig. 3.3: Earth at night, a high-tech composite of satellite data, 27 November 2000.
Image courtesy of NASA, production credit C. Mayhew and R. Simmon.
Fig. 3.4: Florida county-level presidential election results from 2 November 2000.
Counties are light gray if won by George W. Bush and dark gray if won by Al Gore.
A missing piece in our Florida graphic is population data because a landslide win
in one county could equate with many close wins in others. We begin by drawing a
polygon as shown in Figure 3.5 where our octagon is almost regular but not quite.
We do not “find” the min/max in Code Listing 3.1 since they are plainly −2 and 2.
However, the 49 regions of the contiguous U.S. require $O(10^5)$ vertices and those,
in turn, require a data file. Code Listing 3.2 reads the vertices used by Figure 3.6.
Code Listing 3.2: Polygon data from a file being read into a list.
#
data=open('lab312.txt','r').read().split()
xy=[]
#
j=0
while j<len(data):
    x=float(data[j])
    j+=1
    y=float(data[j])
    j+=1
    ...
    xy.append((x,y))
#
Figure 3.7 shows extreme values xmin = −124.733, xmax = −66.950, ymin = 24.545,
and ymax = 49.384, thus the aspect ratio is 2.33 : 1 not 1.33 : 1 (or 4 : 3) and we are
now bound by width rather than height.
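The aspect-ratio argument can be checked with a short calculation. The 800 × 600 window below is an assumption (any 4 : 3 window behaves the same way), as are the function names fit_scale and to_pixel.

```python
def fit_scale(xmin, xmax, ymin, ymax, w, h):
    """Pixels per data unit so the whole bounding box fits a w-by-h window."""
    sx = w / float(xmax - xmin)
    sy = h / float(ymax - ymin)
    # The smaller scale is the binding constraint.
    return min(sx, sy), ("width" if sx < sy else "height")

def to_pixel(x, y, xmin, ymax, scale):
    """Map data (x, y) to pixel (xp, yp); y is inverted for the screen."""
    return (x - xmin) * scale, (ymax - y) * scale

aspect = (-66.950 - -124.733) / (49.384 - 24.545)       # about 2.33
scale, bound = fit_scale(-124.733, -66.950, 24.545, 49.384, 800, 600)
```

Because the data is wider than the window is, relative to height, the map fills the full width and leaves unused rows at the bottom.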
Our cleaned-up data file lists regions alphabetically by postal abbreviation where
each region may contain one or more polygons. Code Listing 3.3 shows the last
two points of Alabama’s second polygon and the first two points of Arkansas’s only
polygon. Parsing the Internet’s “dirty” dataset is a topic for another course.
Code Listing 3.3: Part of a file containing all 49 contiguous U.S. regions.
...
-88.1665689999999955 30.2492550000000016
-88.1884692736000062 30.2469340542000005
END_ONE_POLY
END_ALL_POLY
AR
-94.4760497582000056 36.4993199124999990
-94.4568835367000048 36.4993666550000029
...
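A parser for this layout might look like the sketch below. The exact rules are inferred from the excerpt and are therefore assumptions: a region begins with its postal abbreviation, END_ONE_POLY closes each polygon, and END_ALL_POLY closes the region. The sample coordinates are invented, not taken from the data file.

```python
def parse_regions(text):
    """Return {abbreviation: [polygon, ...]}; a polygon is a list of (x, y)."""
    regions = {}
    name = None
    poly = []
    tokens = text.split()
    j = 0
    while j < len(tokens):
        tok = tokens[j]
        if tok == "END_ONE_POLY":
            regions[name].append(poly)       # finish the current polygon
            poly = []
            j += 1
        elif tok == "END_ALL_POLY":
            name = None                      # next token names a new region
            j += 1
        elif name is None:
            name = tok
            regions[name] = []
            j += 1
        else:
            poly.append((float(tok), float(tokens[j + 1])))
            j += 2
    return regions

sample = """AL
-88.2 30.2
-88.1 30.3
END_ONE_POLY
-88.16 30.24
-88.18 30.25
END_ONE_POLY
END_ALL_POLY
AR
-94.47 36.49
-94.45 36.50
END_ONE_POLY
END_ALL_POLY
"""
regions = parse_regions(sample)
```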
Figure 3.8 distinguishes regions with more than 5,600,000 people, those with fewer
than 2,675,000, or in-between. Code Listing 3.4 reads in the population data and
stores it in a hashtable by associating keys with values: print htable['DC']
Code Listing 3.4: Population data from a file being read into a hashtable.
#
data=open('lab314.txt','r').read().split()
htable={}
#
j=0
while j<len(data):
    key=data[j]
    j+=1
    val=int(data[j])
    j+=1
    #
    htable[key]=val
#
Fig. 3.11: Dynamic updates, a breakpoint is modified if the mouse is hovering over
its corresponding square as the scroll-wheel turns... and the polygons change colors
immediately!
cnvs.itemconfigure(tkidnum,fill=color)
This command is run inside two loops: one loop for all the regions and then another
for all the polygons of each region, where only the color values are changing.
Consider again a random walk but now in 2-D rather than 1-D and with hundreds of
particles not just one, bringing us closer to our previous atmospheric example.
cnvs.move(tkid[j],dx,dy)
In our code we go a step further and assume the existence of a lattice constraining
all particle motion to one of only four nearest neighbors: up, down, left, or right.
Fig. 3.12: Diffusion after 1000 steps where all particles began at the center.
3.2 Particle Diffusion 73
[Figure panels labeled 2000 and 3000; plot with horizontal axis Time, total steps (0 to 20000) and vertical axis ticks from 0 to 100.]
Fig. 3.14: Average distance, measured from the particles’ shared starting location.
If the center is at $(x_c, y_c)$ and some particle $P_i$ is at $(x_i, y_i)$ then the straight-line
distance $d_i$ is such that $d_i^2 = (x_i - x_c)^2 + (y_i - y_c)^2$. Figures 3.12 and 3.13 showed
the average distance growing over time and Figure 3.14 shows the same information
in a single graphic without also showing the particles. Another measurement called
the “root mean square” is shown too, as well as the square root of elapsed time.
\[ \mathrm{AVG} = \frac{1}{N} \sum_{i=0}^{N-1} \sqrt{d_i^2} \]
\[ \mathrm{RMS} = \sqrt{\frac{1}{N} \sum_{i=0}^{N-1} d_i^2} \]
In calculating RMS, the square root in di and the square in the formula cancel
each other out; ignore both to avoid performing useless calculations.
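Both measurements might be computed as in this sketch of a lattice random walk, with no graphics. Note that the square root is taken per particle only for AVG, while RMS accumulates the squared distance directly. The function name diffuse and its parameters are assumptions.

```python
import math
import random

def diffuse(num, steps, seed=0):
    """Lattice random walk: return (AVG, RMS) distance from the start."""
    rng = random.Random(seed)
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # right, left, up, down
    total_d = 0.0
    total_d2 = 0
    for _p in range(num):
        x = y = 0                                # all start at the center
        for _s in range(steps):
            dx, dy = rng.choice(moves)
            x += dx
            y += dy
        d2 = x * x + y * y                       # squared distance, no sqrt
        total_d += math.sqrt(d2)                 # AVG needs the distance itself
        total_d2 += d2                           # RMS uses d2 directly
    return total_d / num, math.sqrt(total_d2 / float(num))
```

For this kind of walk the RMS value should stay close to the square root of the number of steps, which is presumably why that curve is included in the plot.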
[Plot: horizontal axis, Time, total steps (0 to 20000); vertical axis ticks from 0 to 140.]
Fig. 3.15: Standard deviation, measuring the concentration of data near the average.
An increasing average distance tells us the particles are moving away from the center
but it does not tell us they are moving away from each other; how do we know they
are not all making the exact same movements together, or that a tight circle is not
maintained with an expanding radius? Answer: calculate the standard deviation.
Figure 3.15 tells us the particles are spreading out away from each other. If we
call the average distance μ then the standard deviation σ is calculated by:
\[ \sigma = \sqrt{\frac{1}{N} \sum_{i=0}^{N-1} (\mu - d_i)^2} \]
The two dashed circles in the previous plots were shown with a radius of one
standard deviation above and below the average. Observation indicates that the area
between these two circles accounts for almost 70% of our particles.
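The standard deviation formula, and the fraction of particles lying between the two dashed circles, might be computed as follows for a list of distances. The function names are hypothetical.

```python
import math

def mean_and_std(distances):
    """Return (mu, sigma) using the population formula with 1/N."""
    n = float(len(distances))
    mu = sum(distances) / n
    sigma = math.sqrt(sum((mu - d) ** 2 for d in distances) / n)
    return mu, sigma

def fraction_within_one_sigma(distances):
    """Fraction of distances between mu - sigma and mu + sigma."""
    mu, sigma = mean_and_std(distances)
    inside = [d for d in distances if mu - sigma <= d <= mu + sigma]
    return len(inside) / float(len(distances))
```

As a small worked case, the distances 2, 4, 4, 4, 5, 5, 7, 9 have μ = 5 and σ = 2, and six of the eight values fall within one standard deviation.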
As shown in Figure 3.16 and Code Listing 3.6 we interpolate y from 0.0 to 0.03 and
assume a normal distribution with formula:
\[ y = \frac{1}{\sigma \sqrt{2\pi}} \cdot e^{-(x-\mu)^2/(2\sigma^2)} \]
Note that x is measured in pixels for raw screen distance with no interpolation so
we scale only by a factor of 2, but since y is a probability it must be scaled by the
window height or our curve will appear exceedingly small.
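The formula and the two scalings might be coded as below. The factor of 2 for x and the 0.03 ceiling for y come from the text; the function names and the rounding to the nearest pixel row are assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def to_screen(x, y, h, ymax=0.03):
    """Map distance x and probability y to window coordinates."""
    xp = 2.0 * x                              # raw pixel distance, scaled by 2
    yp = int(h - h * (y / ymax) + 0.5)        # y from 0 to ymax fills the height
    return xp, yp
```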
The vertical line in Figure 3.17 is at x = μ and the horizontal line has length σ .
Code Listing 3.7 shows how to draw our curve point-by-point and also how to draw
the vertical line. What happens to the vertical and horizontal lines over time?
Code Listing 3.8: Bins form a histogram to check our model assumptions.
#
def tick():
    ...
    #
    j=0
    while j<bins:
        #
        # current coordinates of this bin
        #
        x1,y1,x2,y2=canvas2.coords( ... )
        #
        # desired height of this bin
        #
        y=bincount[j]/(1.0*num)/((w/2.0)/num_bins)
        #
        # translate bin height to pixel coordinates
        #
        y1=int(h-h*(y/0.03)+0.5)
        #
        # only the bin height changes !!!
        #
        canvas2.coords( ... ,x1,y1,x2,y2)
        j+=1
    #
    ...
But the normal distribution assumption is wrong! Distance cannot be less than zero
so an entire “tail” is cut off. Figure 3.18 shows this by grouping the particles into
various bins, forming a histogram. (To avoid this truncation issue we might also
initialize the particles along a circle of radius 100 instead of clumped together.)
As the bin width decreases (i.e., as the number of bins increases) an accurate
model’s curve would closely match the observed distribution of particles. You might
use arrow keys to interactively change the bin settings...
root.bind('<Left>',leftarrow)
...but be careful, with only num=1000 particles using too many bins will ruin any
model approximation.
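The binning itself might be sketched without any canvas code. Dividing each count by both the number of particles and the bin width, as in the listing above, turns the bars into a density that can be compared with the model curve. The function name histogram and the max_dist parameter are assumptions.

```python
def histogram(distances, num_bins, max_dist):
    """Return (bincount, density) for distances in [0, max_dist)."""
    width = max_dist / float(num_bins)
    bincount = [0] * num_bins
    for d in distances:
        j = int(d / width)
        if 0 <= j < num_bins:            # ignore anything off the plot
            bincount[j] += 1
    num = float(len(distances))
    # Dividing by the bin width makes the bar AREAS sum to one, so the
    # bars can be compared directly with a probability density curve.
    density = [c / num / width for c in bincount]
    return bincount, density
```

With too many bins most counts are zero or one and the bars become too noisy to compare with any curve.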
Fig. 3.18: A histogram for the observed particle distances indicating that our assumption
of a normal distribution requires correction.
A petri dish limits each particle’s motion, shown in Figures 3.19 and 3.20 with a
uniform distribution at time zero. Fix one particle in place at the center. If a particle
occupies an adjacent lattice point of any fixed-particle then it becomes fixed itself.
In this way a crystalline-like “aggregation” develops as the number of fixed particles
grows over time. These tendrils make it difficult for a still-traveling particle to reach
the interior of the structure.
This is not an embarrassingly parallel problem because the location where one
particle becomes fixed will influence where other particles can continue to move
freely. Thus, this problem is said to be coupled. Two parallel options are possible.
First, if each parallel node tracks particles at any location then we must communicate
where they become fixed. Each node might send a broadcast message to the
entire group, or we could designate a “manager” node to collect everyone’s information
one-by-one and then make a single broadcast.
Alternatively, this application lends itself to spatial decomposition. If each node
only tracks particles in a particular region of space, they only need to communicate
fixed-particles on their boundary and only to that nearest-neighbor node. Of course
if a particle leaves its region then a node will have to communicate this as well.
[Plot: Approximating π; Computed Value (0 to 4) versus Number of Terms in Series (0 to 30).]
Fig. 3.21: Plot values, showing both alternating terms and convergence to π .
3.3 Approximating π
Consider again how to calculate digits of π but now using series approximations
such as Gregory’s series, centuries-old (!) and based on tan 45◦ = tan (π /4) = 1.
\[ \frac{\pi}{4} = \frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \cdots \]
A cumbersome way to visualize this series is shown in Table 3.1 where we output
the approximation as each new term is added. Note the very slow convergence.
Since the sign alternates and the denominator grows we can bound our error without
even knowing (!) the true value, as shown in Figure 3.21 for the first 30 terms.
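A minimal sketch of Gregory's series and of the error bound follows. Because the terms alternate in sign and shrink, any two consecutive partial sums lie on opposite sides of π. The function names gregory and bracket are hypothetical.

```python
def gregory(terms):
    """Approximate pi from the first `terms` terms of Gregory's series."""
    total = 0.0
    sign = 1.0
    for k in range(terms):
        total += sign / (2 * k + 1)      # 1/1, -1/3, 1/5, -1/7, ...
        sign = -sign
    return 4.0 * total

def bracket(terms):
    """Two consecutive partial sums, ordered as (lower, upper) bounds."""
    a, b = gregory(terms), gregory(terms + 1)
    return min(a, b), max(a, b)
```

The width of the bracket after n terms is 4/(2n + 1), which shrinks very slowly.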
[Figure panels: step 1, 4.000000; step 2, 2.666667; step 3, 3.466667; step 250, 3.137593.]
Since π/4 radians is a 45° angle we might plot our own approximation as an angle to
compare. Figure 3.22 shows the first three steps and the 250th, which is so close that
the angles cannot be distinguished. These drawings require sin or cos functions,
defined either using the famous CORDIC algorithm or their own series:
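\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} - \frac{x^{11}}{11!} + \cdots \]
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \frac{x^{10}}{10!} + \cdots \]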
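These two series might be coded as below, building each term from the previous one so that no factorial or power is ever computed from scratch. The function names and the default of ten terms are assumptions; the sketch is intended for small angles such as those in this section.

```python
def sin_series(x, terms=10):
    """Sum the first `terms` terms of the sine series."""
    total = 0.0
    term = x                                          # x^1 / 1!
    for k in range(terms):
        total += term
        n = 2 * k + 1
        term = -term * x * x / ((n + 1) * (n + 2))    # next odd power
    return total

def cos_series(x, terms=10):
    """Sum the first `terms` terms of the cosine series."""
    total = 0.0
    term = 1.0                                        # x^0 / 0!
    for k in range(terms):
        total += term
        n = 2 * k
        term = -term * x * x / ((n + 1) * (n + 2))    # next even power
    return total
```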
\[ \frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{6^2} + \cdots \]
Text output is shown in Table 3.2 where sign no longer matters (angles would
approach 45◦ from only one side) but we must remember to square the denominator
and take a square root after multiplying through by six. Runtime for 25 million terms
was almost 14 seconds and the last term added is nearly zero in 16-digit floating-point
representation, assuming use of the 64-bit IEEE standard format.
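The three reminders (square the denominator, multiply through by six, take a square root) might be collected as in this sketch. The function name basel_pi is hypothetical and far fewer than 25 million terms are needed to see the one-sided approach.

```python
import math

def basel_pi(terms):
    """Approximate pi from the sum of 1/k^2 for k = 1 .. terms."""
    total = 0.0
    for k in range(1, terms + 1):
        total += 1.0 / (k * k)        # squared denominator, always positive
    return math.sqrt(6.0 * total)     # multiply through by six, then root
```

Every term is positive, so each approximation is below π and closer than the one before.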
Floating-Point Representation
              31
        9            22
     4     5     14      8
    3 1   4 1   5 9     2 6
Fine-Grain Parallelism
Figure 3.23 suggests a technique for calculating any sum in parallel, in this case:
3 + 1 + 4 + 1 + 5 + 9 + 2 + 6 = 31
In serial we might initialize a variable sum=0 and do a loop that adds each term:
sum=3, sum=4, sum=8, sum=9, sum=14, sum=23, sum=25, sum=31
However, as we have seen for series requiring a large number of terms, this loop
may need to run for a long time. An alternative is to pair the terms up so that each
pair-sum can be calculated simultaneously. Here 3+1=4 and 4+1=5 and 5+9=14
and 2+6=8 could all be calculated at the same time in parallel. This idea is repeatedly
applied on the intermediate sums, so 4+5=9 and 14+8=22 at the next level,
after which 9+22=31 gives the final result.
Rather than requiring N = 25 million loops we now need only $\log_2 N$ levels of
this parallel tree. Since
\[ \log_2(25{,}000{,}000) < 25 \]
we might speed up our run by a factor of a million! (In theory.) This is not an
embarrassingly parallel problem or even one where spatial decomposition applies,
but is known instead as fine-grain parallelism.
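The pairing scheme might be written as below. The code still runs serially, so it only illustrates the structure: every addition within one level is independent of the others and could be given to a different processor. The function name tree_sum is hypothetical.

```python
def tree_sum(values):
    """Sum by repeatedly adding neighbors in pairs; return (sum, levels)."""
    level = list(values)
    levels = 0
    while len(level) > 1:
        nxt = []
        for j in range(0, len(level) - 1, 2):
            nxt.append(level[j] + level[j + 1])   # independent of other pairs
        if len(level) % 2 == 1:
            nxt.append(level[-1])                 # odd one out moves up unchanged
        level = nxt
        levels += 1
    return (level[0] if level else 0), levels
```

For the eight digits above, tree_sum([3, 1, 4, 1, 5, 9, 2, 6]) gives a sum of 31 in three levels.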
math.pi 3.1415926535897931
π :10−16 3.1415926535897932
High-quality results in Table 3.4 are based on sin 30◦ = sin (π /6) = 1/2 = x and:
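\[ \frac{\pi}{6} = c_1 x + c_3 \frac{x^3}{3} + c_5 \frac{x^5}{5} + c_7 \frac{x^7}{7} + c_9 \frac{x^9}{9} + c_{11} \frac{x^{11}}{11} + \cdots \]
\[ \frac{\pi}{6} = (1)x + \frac{1}{2} c_1 \frac{x^3}{3} + \frac{3}{4} c_3 \frac{x^5}{5} + \frac{5}{6} c_5 \frac{x^7}{7} + \frac{7}{8} c_7 \frac{x^9}{9} + \frac{9}{10} c_9 \frac{x^{11}}{11} + \cdots \]
\[ c_1 = 1, \quad c_3 = \frac{1}{2}, \quad c_5 = \frac{1 \cdot 3}{2 \cdot 4}, \quad c_7 = \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6}, \quad c_9 = \frac{1 \cdot 3 \cdot 5 \cdot 7}{2 \cdot 4 \cdot 6 \cdot 8}, \quad c_{11} = \frac{1 \cdot 3 \cdot 5 \cdot 7 \cdot 9}{2 \cdot 4 \cdot 6 \cdot 8 \cdot 10} \]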
Note how we experience swamping after step 23 long before underflow at step 531.
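The second form of the series suggests a running update in which each coefficient is obtained from the previous one, as in this sketch. The function name arcsin_pi is hypothetical.

```python
def arcsin_pi(terms):
    """Approximate pi as six times the series above, evaluated at x = 1/2."""
    x = 0.5
    c = 1.0            # c1
    power = x          # x^n
    total = 0.0
    n = 1
    for _ in range(terms):
        total += c * power / n
        c *= n / (n + 1.0)       # c_{n+2} = c_n * n / (n + 1)
        power *= x * x           # x^{n+2}
        n += 2
    return 6.0 * total
```

Each term is roughly a quarter of the previous one, so a few dozen terms already exhaust the precision of a 64-bit float and further terms no longer change the sum.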
Chapter 4
Efficiency
To solve a small problem it is unnecessary to write code that runs as fast as possible,
but here “small” must be understood reflexively to mean only that a fast running
code is not required. Regardless, to solve problems of practical interest it will certainly
be necessary that our code runs fast for large cases, and as habits govern so
much of our behavior we will do best by writing code for small cases with potential
large cases already in mind.
[Plot: Relative Runtime (0 to 100) versus Size of Problem (0 to 10), with linear and quadratic curves.]
Fig. 4.1: Comparison of runtime efficiency between linear and quadratic algorithms
as the size of a problem grows. Small problems appear only in the lower-left corner.
Figure 4.1 shows the relative difference in runtime between linear and quadratic
algorithms as a problem grows larger and larger. With every order of magnitude
increase in problem size the relative difference in runtime grows by two orders of
magnitude. So if there is a linear solution, find it!
As one example, the Declaration of Independence contains just under 1500 words
while Federalist 78 has just over 3000. To determine how many times each unique
word appears in these two texts we might use either a linear or quadratic algorithm.
On an old PowerPC® chip the linear code ran in 0.009 and 0.02 seconds, respectively,
while the quadratic code took 1.1 and 3.9 seconds.
Note that as the problem size doubled the runtime of the linear code also doubled
but the quadratic code quadrupled. A fair objection: 4 seconds is not a long time to
wait for a code to run regardless of the algorithm. Yet if we consider instead every
issue of every newspaper from the past year, or every page of every website from
the past 10 years, then only the efficiency of our code will matter.
Research in “natural language processing” involves the analysis of these kinds of
texts using statistical inference, among other techniques, potentially also studying
changes over time. Since understanding requires accurate results (and otherwise
what is the point?) it becomes necessary to draw on a very large body of work and
this in turn requires a very fast running code.
We want to build a word cloud that quickly relays information about the main topics
of a work in a visual manner. In preparation, we first build a population cloud for
the United States from Census 2000 data as shown in Figure 4.2.
Code Listing 4.2 shows population data previously used and Code Listing 4.3
shows part of a file to translate from each state’s postal abbreviation to its full name.
Be careful using the split function here because some state names include more
than one word; in fact, this is why the abbreviation and full name are listed on
alternating lines in the file, so that split('\n') can be used. This will separate the data
at newlines only and not at any other whitespace characters.
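As a small illustration of splitting at newlines only, the sketch below builds a dictionary from abbreviation to full name. The function name and the sample string are assumptions; in the lab the text would be read from the translation file.

```python
def parse_state_names(text):
    """Map postal abbreviation -> full name from alternating lines."""
    lines = text.strip().split('\n')    # split at newlines only
    names = {}
    for i in range(0, len(lines) - 1, 2):
        names[lines[i].strip()] = lines[i + 1].strip()
    return names

# Hypothetical sample in the same layout as Code Listing 4.3.
sample = "AL\nAlabama\nNH\nNew Hampshire\nDC\nDistrict of Columbia\n"
names = parse_state_names(sample)
print(names["DC"])
```

Calling split() with no argument would instead break "New Hampshire" into two separate items and shift every later pair out of step.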
Your assignment is to use the state names and populations to display full names,
at random, where font size is determined by relative population size. (Note there is
a big lie here: the difference in font size does not reflect the actual difference in, say,
the populations of Wyoming and California.)
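One way to turn a population into a font size is a simple linear interpolation between a smallest and a largest size. This is only a sketch: the function name and the 8 to 48 point range are assumptions, and the populations are the seven values visible in Code Listing 4.2.

```python
def font_size(pop, pop_min, pop_max, size_min=8, size_max=48):
    """Interpolate linearly from population to font size (rounded)."""
    frac = (pop - pop_min) / float(pop_max - pop_min)
    return int(size_min + frac * (size_max - size_min) + 0.5)

populations = {"AL": 4447100, "AK": 626932, "AZ": 5130632, "AR": 2673400,
               "CA": 33871648, "CO": 4301261, "CT": 3405565}
lo, hi = min(populations.values()), max(populations.values())
sizes = {st: font_size(p, lo, hi) for st, p in populations.items()}
print(sizes)
```

Because the smallest state is pinned at size_min rather than at zero, the ratio of font sizes is much smaller than the ratio of populations, which is the "big lie" mentioned above.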
[Figure 4.2: population cloud of U.S. state names from Census 2000 data.]
Code Listing 4.2: Part of the file from Lab314 containing Census 2000 data.
AL 4447100
AK 626932
AZ 5130632
AR 2673400
CA 33871648
CO 4301261
CT 3405565
...
Code Listing 4.3: Part of a file to translate state abbreviation to state full name.
AL
Alabama
AK
Alaska
AZ
Arizona
...
Fig. 4.3: Word cloud where font size is based on relative frequency in the text.
Common words such as “of”, “the”, “to”, and “and” make Figure 4.3 not immediately
identifiable with the Declaration of Independence. But producing this word
cloud does require first tabulating the frequency of each unique word in the text,
which is a start. Algorithm 4.1.1 shows a quadratic calculation of these frequencies.
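A quadratic tabulation might look like the sketch below, which is not necessarily the same as Algorithm 4.1.1. For every word in the text it scans a plain list of the unique words seen so far, so the work grows with the product of text length and vocabulary size. The function name is an assumption.

```python
def count_quadratic(words):
    """Tabulate word frequencies using only lists and linear scans."""
    unique, counts = [], []
    for w in words:
        for i in range(len(unique)):     # scan everything seen so far
            if unique[i] == w:
                counts[i] += 1
                break
        else:                            # not found: a new unique word
            unique.append(w)
            counts.append(1)
    return list(zip(unique, counts))

print(count_quadratic("we hold these truths we hold".split()))
```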
Fig. 4.4: Word cloud more readily associated with the Declaration of Independence,
excluding common words and only including words that appear at least three times.
Files of common words can be found on the Internet, or you can build your own.
Then, a word cloud like the one shown in Figure 4.4 may be produced by filtering
these words out before (!) tabulating frequencies. Algorithm 4.1.2 outlines an improved
linear calculation that counts up each word's frequency step by step in one pass over the text.
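A linear tabulation can be sketched with a dictionary, where each lookup and update takes roughly constant time on average. This is an illustration rather than a transcription of Algorithm 4.1.2; the function names, the punctuation handling, and the tiny list of common words are all assumptions.

```python
import string

def tokenize(text):
    """Lowercase the text, drop punctuation, and split on whitespace."""
    table = str.maketrans('', '', string.punctuation)
    return text.lower().translate(table).split()

def count_linear(words, common=()):
    """One pass over the words, skipping common words before counting."""
    skip = set(common)
    counts = {}
    for w in words:
        if w in skip:
            continue
        counts[w] = counts.get(w, 0) + 1
    return counts

counts = count_linear(tokenize("The people, the laws, and the people."),
                      common=("the", "and"))
frequent = {w: n for w, n in counts.items() if n >= 2}   # keep repeated words only
print(counts, frequent)
```

Filtering before counting means the common words never enter the dictionary at all, and a final threshold (three in Figure 4.4, two in this toy example) trims the rare words.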
Fig. 4.5: Sunset at Cerro Tololo Inter-American Observatory near La Serena, Chile.
Image courtesy of the U.S. Navy, photo credit Dr. Brian Mason.
Databases
Theory ⇔ Data
Experiment ⇔ Simulation
4.1 Text and Language 97
Fig. 4.6: Word cloud from the famous Federalist 10, written by James Madison,
shown without overlap between the bounding boxes that surround each word’s text.
Avoiding overlap requires first finding the width and height of the bounding box
that surrounds each word’s text. Code Listing 4.4 shows how we might determine
these dimensions but still this is a difficult problem related to an even more difficult
problem in general: bin packing. Big words are drawn first as it does not get any
easier to find a large open space once more-and-more other words have been drawn.
Code Listing 4.4: Text width and height for a specific font and a specific word.
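#
# Python 3 import; under Python 2 this was: from tkFont import Font
from tkinter.font import Font
#
# A Tk root window must already exist; fsize and the_word are set elsewhere.
f=Font(family='Times',size=fsize,weight='bold')
#
width = f.measure(the_word)          # width of the word in pixels
height = f.metrics()['linespace']    # height of one line of text in pixels
#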
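Once widths and heights are known, one simple (and far from optimal) placement strategy is to try random positions for each box, largest first, and keep the first position that overlaps nothing already placed. The sketch below assumes this trial-and-error approach; the function names, the number of tries, and the box sizes are made up for illustration, and a word that never finds room is simply dropped.

```python
import random

def overlaps(a, b):
    """a and b are (x, y, width, height) boxes; True if they intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def place_boxes(sizes, canvas_w, canvas_h, tries=1000, rng=random):
    """Place (width, height) boxes at random positions, biggest first."""
    placed = []
    for w, h in sorted(sizes, key=lambda s: s[0] * s[1], reverse=True):
        if w > canvas_w or h > canvas_h:
            continue                       # can never fit on this canvas
        for _ in range(tries):
            box = (rng.randint(0, canvas_w - w), rng.randint(0, canvas_h - h), w, h)
            if not any(overlaps(box, other) for other in placed):
                placed.append(box)
                break
    return placed

random.seed(1)
boxes = place_boxes([(120, 40), (80, 30), (60, 20), (40, 15)], 400, 300)
print(boxes)
```

Sorting by area before placing reflects the point made above: open space only gets scarcer as more words are drawn.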
Fig. 4.7: Word cloud from Federalist 51, also written by James Madison.
Lab415: Comparison
Comparing the word clouds in Figures 4.6, 4.7, and 4.8, we see clearly the variety of
topics discussed in the Federalist Papers. Keep in mind also that they were written
by different people: James Madison, Alexander Hamilton, and John Jay.
Implement a feature that cycles through all 85 papers showing one cloud on the
screen at a time and somehow indicating (background color, footnote) who wrote
each article. Note that frequencies of multiple similar words could be combined, too.
We might, in this case, count “promise”, “promises”, “promised”, and “promising”
as essentially the same word.
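Combining similar words can be approximated by stripping a few common endings before adding up the counts. The rule below is deliberately crude and is an assumption made for illustration; it handles the four example words but will mis-handle many others, and the names crude_stem and combine_counts are hypothetical.

```python
def crude_stem(word, min_stem=4):
    """Strip one common ending, keeping at least min_stem letters."""
    for suffix in ("ing", "ed", "es", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[:-len(suffix)]
    return word

def combine_counts(counts):
    """Add together the frequencies of words that share a crude stem."""
    combined = {}
    for w, n in counts.items():
        key = crude_stem(w)
        combined[key] = combined.get(key, 0) + n
    return combined

merged = combine_counts({"promise": 2, "promises": 1, "promised": 3,
                         "promising": 1, "law": 4})
print(merged)
```

The stem "promis" is not itself a word, so a display would probably show the most frequent original spelling instead.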
Fig. 4.8: Word clouds from Federalists 78 and 81, written by Alexander Hamilton.
Fig. 4.9: Some of the crumbling ruins of ancient Babylon, 1932. Image courtesy of
the G. Eric and Edith Matson Photograph Collection, Library of Congress Prints and
Photographs Division, original source American Colony (Jerusalem), Photo Dept.
Figure 4.9 shows the ruins of Babylon in 1932. A clay tablet over 3500 years old
depicts the Babylonians' version of Algorithm 4.2.1, a technique now widely used in general.
Algebra verifies that Line 3 is equivalent to $x^2 = 2$ (multiply both sides by 2x and
simplify), but an assignment statement in a computer program is not the same as a
mathematical equation; the input and output values will not actually match unless
$x = \sqrt{2}$ within machine precision.
Algorithm 4.2.1 Babylonian method for calculating the square root of two.
1: x = 5
2: while changing do
3: x = 0.5 · (x + 2/x)
4: print x
5: end while
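Algorithm 4.2.1 translates almost line for line into Python. In this sketch "changing" is taken to mean that the new value differs from the old one, and a cap on the number of steps is added as a safety net; both choices, and the function name, are assumptions.

```python
def babylonian_sqrt2(x=5.0, max_steps=100):
    """Repeat x = 0.5 * (x + 2 / x) until x stops changing."""
    sequence = []
    for _ in range(max_steps):
        new_x = 0.5 * (x + 2.0 / x)
        if new_x == x:          # no longer changing
            break
        x = new_x
        sequence.append(x)
    return sequence

for value in babylonian_sqrt2():
    print(repr(value))
```

Python 3 prints the shortest decimal string that identifies each float, so the digits shown may be fewer than in a listing produced with 17 significant digits.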
4.2 Babylonian Method 101
Your assignment is to draw a cobweb diagram as shown in Figure 4.10 where sample
output in Code Listing 4.5 displays our sequence of approximations. The solid line
in the cobweb diagram is y = x from the left-hand side of Line 3 in Algorithm 4.2.1
and the dashed line is y = 0.5 · (x + 2/x) taken from the right-hand side. Their point
of intersection occurs when the input and output values are equal, when $x = \sqrt{2}$.
Code Listing 4.5: A sequence of approximations approaching the square root of two.
2.7000000000000002
1.7203703703703703
1.4414553681776501
1.4144709813677712
1.4142135857968836
1.4142135623730954
1.4142135623730949
[Plot: Cobweb Diagram, with axes x and y each starting at 0 and x running to 6.]
Fig. 4.10: Cobweb diagram, square root of two. Even if our initial guess is absurd
the value of x stops changing after only seven steps. A vertical line shows how an
input produces an output while a horizontal line shows how we then use that output
as the next input in order to build a sequence of better-and-better approximations.
[Plot: Cobweb Diagram, with axes x and y.]
Fig. 4.11: Even though our initial guess is very close to $2 + \sqrt{3} \approx 3.732$ we find the
attracting solution at $2 - \sqrt{3} \approx 0.2679$ instead.
Points of intersection on cobweb plots are called “fixed points” because the input-
output x is unchanged at these locations. Figure 4.11 uses x = 1/(4 − x) and x0 = 3.7
to solve $0 = x^2 - 4x + 1$ but other schemes are possible and they may be faster or
slower than this one; they may also find the other fixed point.
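The scheme of Figure 4.11 is easy to try out. The helper below, whose name and fixed step count are assumptions, simply applies a function to its own output over and over.

```python
import math

def fixed_point(g, x0, steps=50):
    """Apply x = g(x) repeatedly, starting from x0."""
    x = x0
    for _ in range(steps):
        x = g(x)
    return x

root = fixed_point(lambda x: 1.0 / (4.0 - x), 3.7)
print(root, 2.0 - math.sqrt(3.0))
```

The slope of 1/(4 − x) is $1/(4 - x)^2$, which is about 14 at $2 + \sqrt{3}$ and about 0.07 at $2 - \sqrt{3}$; a slope larger than 1 in size pushes nearby guesses away, which is why the iteration leaves the neighborhood of 3.732 and settles near 0.2679.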
Even in cases where a direct method is available (we could have used the quadratic
formula, after all) code using an iterative method may run faster, and if our results
are sufficiently close that we cannot tell which came from which method then in
what sense is the formula more true? Our previous approximation of $\sqrt{2}$ converged
to 1.4142135623730949 rather than the value 1.4142135623730951 returned by the
math module’s sqrt function. Pretty close. Say we were manufacturing a car and
needed this calculation; could an industrial saw cut metal with enough precision
for the difference to matter? Afterwards, could we even measure the cut metal this
precisely? And how does the sqrt function calculate its result anyway?
Fig. 4.12: A gecko climbing up glass. Evidence has suggested that this ability is
possible only because an exceedingly large number of very small bristles use
intermolecular attraction in a massively parallel way. Image courtesy of Tim Vickers.
The ideal gas law can, under certain conditions, determine volume V when absolute
pressure P, the number of moles n, and absolute temperature T are all known:
PV = nRT ⇒ V = nRT /P
When this fails the following corrections were suggested by van der Waals:
\[ \left( P + \frac{n^2 a}{V^2} \right) \cdot (V - nb) = nRT \]
Constants a and b are determined experimentally for particular gases and these
added terms account for both intermolecular attraction and the volume of individual
particles. (Figure 4.12 shows another practical use of attraction between molecules.)
But now how can we solve a cubic equation for V ? One option is an iterative method
where we use the ideal gas law’s solution (!) as our first approximation V0 and then
build a sequence of improved approximations for V from there.
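One possible iterative scheme, chosen here only for illustration, rearranges the van der Waals equation as V = nRT / (P + n²a/V²) + nb and starts from the ideal gas volume. The constants a and b below are rough values of the kind often quoted for carbon dioxide and should be treated as assumptions, as should the pressure and temperature.

```python
R = 0.082057    # gas constant in L*atm/(mol*K)

def vdw_volume(P, n, T, a, b, steps=100):
    """Iterate V = nRT / (P + n^2 a / V^2) + nb from the ideal gas guess."""
    V = n * R * T / P                       # V0 from the ideal gas law
    for _ in range(steps):
        V = n * R * T / (P + n * n * a / (V * V)) + n * b
    return V

# Illustrative inputs (assumed): 1 mol at 10 atm and 300 K.
P, n, T = 10.0, 1.0, 300.0
a, b = 3.59, 0.0427                         # approximate, in L^2*atm/mol^2 and L/mol
V = vdw_volume(P, n, T, a, b)
print(n * R * T / P, V)
```

Other rearrangements of the same equation are possible, and as with the cobweb diagrams some will converge faster, slower, or not at all.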
[Plot: Cobweb Diagram, with axes x and y each running from 0 to 1.]
Fig. 4.13: Periodic behavior, oscillation between two points: 0.802 and 0.509
Lab423: Periodic
Lab424: Chaotic
Figure 4.14 shows a very different kind of behavior, chaos, although the only change
is now x = 3.99 · x · (1 − x) and x0 = 0.25. Systems that are this sensitive to slight
variations in a model parameter occur most famously in weather forecasting where
the so-called “butterfly effect” discovered by Lorenz makes it difficult to predict
events beyond the very short term. (As the story goes, a butterfly flapping its wings
in San Francisco causes a rainstorm to develop later in New Jersey.)
Code Listing 4.6 shows sample output after $O(10^4)$ loops, seemingly a sufficient
amount of time for any transient movement to have ended and yet still no regular
pattern has emerged between any number of points.
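The experiment can be repeated with a short loop. The function name and the choice to keep sixteen values are assumptions; the parameter 3.99 and the starting value 0.25 come from the text.

```python
def logistic_tail(r=3.99, x0=0.25, warmup=10**4, keep=16):
    """Run x = r*x*(1-x) through a warm-up, then return the next values."""
    x = x0
    for _ in range(warmup):
        x = r * x * (1.0 - x)
    tail = []
    for _ in range(keep):
        x = r * x * (1.0 - x)
        tail.append(x)
    return tail

for value in logistic_tail():
    print("%.16f" % value)
```

Because the sequence is so sensitive, the exact digits printed may depend on the precise warm-up count and on floating-point details, so they need not match any particular listing.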
[Figure 4.14: Cobweb Diagram showing chaotic behavior, with axes x and y each running from 0 to 1.]
Code Listing 4.6: A sequence that fails to “settle down” over time.
:
0.8032323382113545
0.6306200947608700
0.9294241794701987
0.2617235476045228
0.7709650856129653
0.7045459102912460
0.8305622726266709
0.5615070498244030
0.9824053624593747
0.0689674144190529
0.2562015315679398
0.7603436040928253
0.7270626191537534
0.7917858422623198
0.6577954787985217
0.8981513416142737
xp ⇔ r
yp ⇔ x
At left we see a stable population, first at a constant value increasing with r and
later periodic after the so-called “bifurcation” where the population then oscillates
between two, four, or eight values. These periodic regions might represent cyclic
changes in our deer population, from either seasonal or migratory effects perhaps,
but they are stable nevertheless. The bifurcation points themselves would appear
even sharper if more pre-drawing loops were used. At right we see... chaos.
Code Listing 4.7 shows how to animate this plot. It is easier to change xp at each
step and interpolate from there to r than the other way around. Also, drawing very
small 1 × 1 rectangles for each pixel is too slow so we use a PIL image instead.
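The per-column work can be sketched without any graphics: step xp across the screen, interpolate to r, discard some pre-drawing loops, and record which rows would be lit. The range of r, the loop counts, and the function names are assumptions, and a set of row numbers stands in for the putpixel calls on a PIL image.

```python
def r_from_xp(xp, w, rmin=2.5, rmax=4.0):
    """Interpolate from a screen column to the model parameter r."""
    return rmin + xp * (rmax - rmin) / float(w)

def column_rows(r, h, pre=1000, plot=200, x0=0.25):
    """Rows (yp) visited by x = r*x*(1-x) after the pre-drawing loops."""
    x = x0
    for _ in range(pre):                 # let transients die out, draw nothing
        x = r * x * (1.0 - x)
    rows = set()
    for _ in range(plot):
        x = r * x * (1.0 - x)
        rows.add(int((1.0 - x) * (h - 1) + 0.5))   # inverted: yp = 0 is the top
    return rows

w, h = 600, 400
for xp in (0, 300, 599):
    r = r_from_xp(xp, w)
    print(xp, r, len(column_rows(r, h)))
```

A column with one lit row corresponds to a constant population, two rows to the first periodic region, and many rows to chaos.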
Consider a similar iterative scheme but now in 2-D using complex numbers:
\[ z = x + yi, \qquad i = \sqrt{-1} \]
Except for the rare historical genius it was only the latter half of the last century
when computers allowed humans to even see the structures these schemes generate.
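\[ z_0 = 0, \qquad z = z^2 + c \]
First calculate the constant term $c = c_1 + c_2 i$ from (xp, yp) screen coordinates and
then update z = x + yi with:
\[ x = x^2 - y^2 + c_1 \]
\[ y = 2xy + c_2 \]
Both right-hand sides use the values of x and y from the previous step, so the two
updates must be made together.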
Even though complex numbers are used in this formulation our code will only
involve real numbers; but two numbers now instead of one. Code Listing 4.8 shows
three test cases and we want to know if each sequence is “well-behaved” or not,
meaning that convergence to a single value, periodic oscillation, and chaotic
behavior are all treated the same.
On the other hand unbounded growth (such a case is said to “blow up”) is of
particular interest, and we note how many steps it takes to explode. If at any step
$x^2 + y^2 > 4$, that is $|z| > 2$, then the sequence is certain to blow up and so
we can give up immediately. Otherwise we wait until 100 steps to stop the loop.
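A step counter for this scheme might look like the sketch below. It uses the standard escape test |z| > 2, written as x² + y² > 4, and updates x and y together with a tuple assignment so that both formulas see the old values. The function name is an assumption.

```python
def escape_count(c1, c2, max_count=100):
    """Number of steps before z = z^2 + c blows up, or max_count if it never does."""
    x, y = 0.0, 0.0                      # z0 = 0
    for count in range(max_count):
        if x * x + y * y > 4.0:          # |z| > 2: the sequence will blow up
            return count
        x, y = x * x - y * y + c1, 2.0 * x * y + c2   # simultaneous update
    return max_count

print(escape_count(-0.625, -0.250))      # the test case that converges
print(escape_count(1.0, 1.0))            # a point that blows up quickly
```

Writing the two updates on separate lines would silently use the new x when computing y, which is a common bug in this lab.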
Code Listing 4.8: Three test cases for an iterative scheme using complex numbers.
c1=-0.625 converges
c2=-0.250
Use code similar to our animated plot of the logistic map. Again xp is updated at
each step but its value is now associated with c1 rather than r, and within tick we
loop over all yp and each value is translated to c2. Figure 4.16 shows our plot, drawn
column-by-column. Color each pixel based on the number of steps its associated
sequence makes before blowing up. Code Listing 4.9 shows how to get a color
value from the loop count assuming some maximum number of loops (e.g., 100).
Code Listing 4.9: Some commands to help generate and plot the Mandelbrot Set.
xmin,xmax=-2.0,2.0
ymin,ymax=-1.5,1.5
#
c1=xmin+xp*(xmax-xmin)/w
c2=ymax+yp*(ymin-ymax)/h # inverted
#
value=(1.0-count/100.0)**3.0
color=int(255.0*value+0.5)
Fig. 4.17: Zooming in. Left: The arrow points to a box indicating the zoom region.
Right: The zoom region shows self-similarity as the overall set is present here again.
Lab433: Zooming In
Figure 4.17 shows what the Mandelbrot Set looks like as we zoom in on a particular
region and you can begin to see why this structure has fascinated so many people.
Code Listing 4.10 manually specifies this zoom region while Code Listing 4.11 uses
mouse clicks in order to zoom interactively, reducing both x and y by a factor of 2.
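Interactive zooming comes down to computing a new window centered on the clicked pixel with each range cut in half. In this sketch the function name and argument order are assumptions; the coordinate translation follows Code Listing 4.9.

```python
def zoom_window(xmin, xmax, ymin, ymax, xp, yp, w, h, factor=2.0):
    """Return (xmin, xmax, ymin, ymax) centered on pixel (xp, yp), shrunk by factor."""
    cx = xmin + xp * (xmax - xmin) / float(w)
    cy = ymax + yp * (ymin - ymax) / float(h)    # inverted: yp = 0 is the top
    half_w = (xmax - xmin) / (2.0 * factor)
    half_h = (ymax - ymin) / (2.0 * factor)
    return cx - half_w, cx + half_w, cy - half_h, cy + half_h

# Clicking the center of a 400 x 300 image of the full plot.
print(zoom_window(-2.0, 2.0, -1.5, 1.5, 200, 150, 400, 300))
```

In a Tkinter program this function would be called from the mouse-click callback with the event's pixel coordinates, after which the whole image is redrawn.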
Fig. 4.18: Zooming and sharpening. Left: Zoom region taken from upper-left quadrant
of previous zoom region. Right: Doubling of iteration count sharpens features.
In the case of interactive zooming we must take special care not to draw the image
too inefficiently. For instance, if the putpixel command is accidentally placed inside
our inner-most loop, and if the itemconfigure command is placed inside any loop,
then runtime may increase by as much as a factor of 1000 (!).
Also, if we zoom in on a small enough region then we will need to increase the
maximum iteration count so that the details of our plot do not become smeared,
shown in Figure 4.18. Doubling of the iteration count from 100 to 200 increases our
overall runtime roughly by a factor of two. In particular those sequences that do not
blow up will necessarily loop for the maximum count. Code Listing 4.12 manually
specifies this zoom region. On the next few pages Figures 4.19, 4.20, and 4.21 show
zoom regions with a variety of other “interesting” features.
Fig. 4.19: Spiral, located along the upper-right edge of the overall set, as we zoom
further-and-further the spiral turns round-and-round, seemingly forever. Top: spiral
with “tip” first pointing up and left, turning counterclockwise. Bottom: the same
spiral zoomed-in more so that the tip is now pointing left and down.
4.3 Workload Balance 113
Fig. 4.20: Spiral, if we zoom in far enough the image will pixelate once the floating-
point values are sufficiently close together. Top: same as before but now pointing
down and right. Bottom: a few “full” spins later our spiral has become pixelated;
note that intensities have been artificially enhanced to sharpen the details.
Fig. 4.21: More features. Top: a “valley” between the overall “head” and “body”.
Bottom: zoomed further, one of many tendril-like “flowers” present in this valley.
Lab434: Movie
We make a “movie” of the Mandelbrot Set by zooming in, see Code Listing 4.13,
then panning across that image. Code Listing 4.14 shows left to right motion where
only the end-column is calculated for our next image; all the others just shift over,
thus saving a factor of O(100), i.e. the max iteration count, a comparable speed-up
to rendering on a 100-node parallel system, only cheaper! Each image is saved as a
separate PNG file with frames numbered from 000 to 999, and once we have all the
frames they can be combined together into a single animated GIF file:
convert -delay 10 -loop 1 frame*.png movie.gif
In only minutes a program of size 3.4 KB is generating files of total size 15 MB,
a factor of over 4,000. Wow!
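The shifting of columns during a pan can be sketched with a double-ended queue: the oldest column falls off one side and the single newly calculated column is added on the other. The names here are hypothetical, the image is represented as a queue of column lists rather than a PIL image, and the new column is assumed to enter at the right edge.

```python
from collections import deque

def pan_right(columns, new_column):
    """Drop the left-most column and append a freshly computed one on the right."""
    columns.popleft()
    columns.append(new_column)

def frame_name(k):
    """File name for frame k, numbered 000 to 999."""
    return "frame%03d.png" % k

columns = deque([[0, 0], [1, 1], [2, 2]])    # three tiny columns of pixel values
pan_right(columns, [3, 3])
print(list(columns), frame_name(7))
```

Zero-padded frame numbers matter because the shell expands frame*.png in alphabetical order, which then matches numerical order.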
Fig. 4.22: An example Julia Set closely related to the Mandelbrot Set.
We can generate any number of so-called Julia Sets by first fixing c1 and c2 then
using (xp, yp) to initialize x and y instead (i.e., not starting off at z0 = 0 anymore).
One such plot is shown in Figure 4.22, see also Code Listing 4.15, but others are
possible. By displaying a sequence of these Julia Sets we get another movie and
frame-to-frame transitions are “interesting” when the corresponding c-values mark
a continuous path, particularly from along the Mandelbrot Set’s edge. Such a movie
based on our plot here might show the “spikes” waving back-and-forth over time.
Code Listing 4.15: Some commands to help generate and plot a Julia Set.
xmin,xmax=-1.5,1.5
ymin,ymax=-1.125,1.125
#
c1=-0.7375 # fixed
c2= 0.0625
#
x0=xmin+xp*(xmax-xmin)/w
y0=ymax+yp*(ymin-ymax)/h
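For a Julia Set the update rule is unchanged but the roles are swapped: c stays fixed and the starting point comes from the pixel. The sketch below uses the fixed c-values from Code Listing 4.15 as defaults and the standard escape test x² + y² > 4; the function name is an assumption.

```python
def julia_count(x0, y0, c1=-0.7375, c2=0.0625, max_count=100):
    """Steps before z = z^2 + c blows up when starting from z0 = x0 + y0*i."""
    x, y = x0, y0                        # start at the pixel, not at zero
    for count in range(max_count):
        if x * x + y * y > 4.0:
            return count
        x, y = x * x - y * y + c1, 2.0 * x * y + c2
    return max_count

print(julia_count(0.0, 0.0), julia_count(2.0, 2.0))
```

A movie would call this for every pixel of every frame while c1 and c2 move in small steps along some path.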
Chapter 5
Recursion
Fig. 5.1: A scene in 3-D rendered using recursive ray tracing, where reflections
appear within reflections of their own reflections. Image courtesy of Al Hart.
Imagine a group of people who all come in close contact on a regular basis. It may be
the case that Alice never comes in contact with Bob but instead they both contact the
same third person, Eve, so that if any one of the three should become sick it would
be possible for the illness to travel (directly or indirectly) to each of the others.
We begin with random initialization of a 2-D grid to establish the social relationships
described above. We use a 1-D list to represent this grid and translate back-and-forth
between list index and (x, y) grid coordinates as shown in Code Listings 5.1 and 5.2.
Results for p = 0.6 and p = 0.7 are shown in Figure 5.3 on the next page.
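A random grid and the two index translations might be sketched as follows. Here p is taken to be the probability that a slot is occupied, which matches 40% empty at p = 0.6; the fraction of occupied slots that start out sick is not given above, so the value 0.1 is purely an assumption, as are the function names.

```python
import random

def init_grid(w, h, p, sick=0.1, rng=random):
    """1-D list of w*h slots: '-' empty, 'O' healthy, 'X' sick."""
    grid = []
    for _ in range(w * h):
        if rng.random() < p:
            grid.append('X' if rng.random() < sick else 'O')
        else:
            grid.append('-')
    return grid

def to_index(x, y, w):
    """Translate (x, y) grid coordinates to a list index."""
    return y * w + x

def to_xy(j, w):
    """Translate a list index back to (x, y) grid coordinates."""
    return j % w, j // w

w, h = 40, 30
grid = init_grid(w, h, 0.6)
for y in range(3):                       # show the first few rows
    print(' '.join(grid[y * w:(y + 1) * w]))
```

The integer division operator // is used so that the row number stays an integer under both Python 2 and Python 3.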
Code Listing 5.3: Random grids. Empty slots are shown as dashes, healthy people
as ‘O’, and sick people as ‘X’ (also in bold). Top: 40% empty. Bottom: 30% empty.
O - X O O - X - O X O O O O - - O O O O - - O O O - O O - O O O X O O X - O - -
O O O - O - O - O O O X - X O - O O - - O X - O - O X X - X - O O O - O O - O O
X O O O O - O - - - - - O - - O O - O - O O - O - O - - O O O - X - - O - O O O
O O O - - O - O O O - - - O O O - X O O - O X O O O O O O O O - - - - - O - O X
- O - O - - - O - O O X X O - O O X O - O - - O O O X O - - O O O O - X - - - -
O X X O - O O O - - O O O - - - - O - X - O O O - O X - - O - O O - O O - O O O
O - O O O - - - O - O - - O - - O O - O O O - O O - O O - - O O O O O O O - - -
O O - O O - O O O O X O O - - - O O - O - - X - O - - O - - O O X O O O - O O O
O X - - O - - O O O - - O O - O X O O - X - - O O O O - O O - O - O O O - O O -
O - - - - O - - - - O O O O O - O O O X - X O - O - O - O - O O O - - - - - O O
- - O - - O - O O X - - - O O - - X O O O O - - - O O O O O - O O O - O O - O -
O - O - X O O - - - O O O - - O O - - O O O O - - O - O O - X O O O - O X - - O
X - - X O O O - O - - O - - - - - - - O O O X O - O - O O O O O X O - - O - O -
X - O O - O - X O - O - - X O - O - - O - X - X O O - X O O - O O O - O - - O O
X O - O O - O O - - - - X X - - X O - O - O - - O O - O - O O - - O X O O O O O
O O O - O - X O - O O O O O O O X O - X O O - - O O O - O - X - O O O X O O O -
O O O - O O - - - O O O O - O O O - O O - O O O - O - X O - - O X O - X - - X O
X O - O O O - X O - O X O X X X O - - O O O - O X O O X - O O O - O X O O O O -
- X - X X O O - O - O O X - - - - O O - O O - X O O O O O O O O O O - - X O O -
O O O - O O O - - - - - O - O - - - O - O O O O - O X - O O - O - - - O O - X O
O O O - - - - O - - O O O O X - - O - - - - O O X - - - - - - O O O - X - O - -
O - O - - O - X - O X O - O O X O - - O O O X X O O O O O - - O - - - - O - - O
O - - O O O - - - - O X O - O - - - - - - - O O - O - - O - O - - - X O - - - O
- O - - O - O O - - O O O - - O - X O - O X X O O - O X - - O - O O O O - O O -
O O O O O - O - - - O - O - O - O O O - - - O - O - - O O O O - X - X O - O O -
- O - - O O X X O X - O X - O - - O - - O O X - - O - O - O O O - O O O O O O O
O O O O O O O X - O X - X O - O - - - O - O O X O O O X O X - O O O X X - - - X
X X O O O - O O X - O O O - O X - - O O - O X X O - X O O X X O - - O O - - - O
O O O O - O O X O X O O O - - - - O O O O O O O O O O O O - - - - - O - - - O O
O - X O - X O X - - O O O O O - - O O X O - X X O O - - - - O - - X O - - O O O
O O X O O O X - O X O O O O - O O O O O - - O O O - O O O O O O X O O X - O - O
O O O - O - O - O O O X - X O O O O - - O X - O O O X X - X O O O O - O O - O O
X O O O O - O - - - O - O - - O O - O - O O - O - O - O O O O - X O - O - O O O
O O O - O O O O O O - - - O O O - X O O - O X O O O O O O O O - O O - - O - O X
- O - O - - - O - O O X X O - O O X O - O - - O O O X O - - O O O O - X - - - -
O X X O - O O O - O O O O - - - O O - X - O O O - O X - - O - O O - O O - O O O
O - O O O O - - O - O - - O - - O O - O O O - O O O O O - O O O O O O O O - - -
O O - O O - O O O O X O O O - O O O - O O - X - O - O O - - O O X O O O - O O O
O X - - O O - O O O - - O O - O X O O - X - - O O O O - O O O O O O O O - O O -
O - O - - O O - - O O O O O O O O O O X - X O - O - O - O O O O O - O - - - O O
O O O - - O - O O X - - - O O O - X O O O O - O - O O O O O - O O O - O O - O -
O - O - X O O - O - O O O - - O O - O O O O O - - O O O O - X O O O - O X O - O
X O O X O O O - O - - O - O - - - - - O O O X O - O - O O O O O X O - O O - O -
X - O O - O - X O O O - - X O - O - - O - X O X O O - X O O - O O O - O - - O O
X O - O O - O O - - - - X X O - X O - O - O - - O O - O - O O O - O X O O O O O
O O O - O - X O O O O O O O O O X O - X O O O - O O O - O - X - O O O X O O O O
O O O O O O - - O O O O O - O O O - O O - O O O - O O X O O - O X O - X - O X O
X O O O O O O X O - O X O X X X O - O O O O O O X O O X O O O O O O X O O O O O
- X O X X O O - O O O O X - O - O O O - O O - X O O O O O O O O O O - - X O O -
O O O - O O O O O - O - O - O - - - O - O O O O O O X - O O - O - O - O O O X O
O O O - - - O O - O O O O O X - - O - - - - O O X - O O - O - O O O O X O O - -
O - O O - O - X - O X O O O O X O - - O O O X X O O O O O - - O - O - - O - O O
O O - O O O O - - - O X O O O - O - - - O - O O - O - - O - O - - O X O O - - O
- O - - O - O O - - O O O - - O - X O O O X X O O - O X O - O O O O O O - O O O
O O O O O O O - O O O - O - O O O O O - - - O O O - - O O O O - X O X O - O O -
O O - - O O X X O X - O X O O - - O O O O O X - O O - O - O O O - O O O O O O O
O O O O O O O X - O X - X O - O O - O O - O O X O O O X O X - O O O X X - - - X
X X O O O - O O X - O O O - O X O - O O O O X X O O X O O X X O - - O O - - - O
O O O O - O O X O X O O O - O - - O O O O O O O O O O O O O - - - - O O - - O O
O O X O - X O X O - O O O O O - O O O X O - X X O O O - O - O - - X O O - O O O
Code Listing 5.4: Infection. Top: Same 40% empty grid. Bottom: Once-healthy
nearest neighbors of everyone initially marked as sick are now also sick themselves.
O - X O O - X - O X O O O O - - O O O O - - O O O - O O - O O O X O O X - O - -
O O O - O - O - O O O X - X O - O O - - O X - O - O X X - X - O O O - O O - O O
X O O O O - O - - - - - O - - O O - O - O O - O - O - - O O O - X - - O - O O O
O O O - - O - O O O - - - O O O - X O O - O X O O O O O O O O - - - - - O - O X
- O - O - - - O - O O X X O - O O X O - O - - O O O X O - - O O O O - X - - - -
O X X O - O O O - - O O O - - - - O - X - O O O - O X - - O - O O - O O - O O O
O - O O O - - - O - O - - O - - O O - O O O - O O - O O - - O O O O O O O - - -
O O - O O - O O O O X O O - - - O O - O - - X - O - - O - - O O X O O O - O O O
O X - - O - - O O O - - O O - O X O O - X - - O O O O - O O - O - O O O - O O -
O - - - - O - - - - O O O O O - O O O X - X O - O - O - O - O O O - - - - - O O
- - O - - O - O O X - - - O O - - X O O O O - - - O O O O O - O O O - O O - O -
O - O - X O O - - - O O O - - O O - - O O O O - - O - O O - X O O O - O X - - O
X - - X O O O - O - - O - - - - - - - O O O X O - O - O O O O O X O - - O - O -
X - O O - O - X O - O - - X O - O - - O - X - X O O - X O O - O O O - O - - O O
X O - O O - O O - - - - X X - - X O - O - O - - O O - O - O O - - O X O O O O O
O O O - O - X O - O O O O O O O X O - X O O - - O O O - O - X - O O O X O O O -
O O O - O O - - - O O O O - O O O - O O - O O O - O - X O - - O X O - X - - X O
X O - O O O - X O - O X O X X X O - - O O O - O X O O X - O O O - O X O O O O -
- X - X X O O - O - O O X - - - - O O - O O - X O O O O O O O O O O - - X O O -
O O O - O O O - - - - - O - O - - - O - O O O O - O X - O O - O - - - O O - X O
O O O - - - - O - - O O O O X - - O - - - - O O X - - - - - - O O O - X - O - -
O - O - - O - X - O X O - O O X O - - O O O X X O O O O O - - O - - - - O - - O
O - - O O O - - - - O X O - O - - - - - - - O O - O - - O - O - - - X O - - - O
- O - - O - O O - - O O O - - O - X O - O X X O O - O X - - O - O O O O - O O -
O O O O O - O - - - O - O - O - O O O - - - O - O - - O O O O - X - X O - O O -
- O - - O O X X O X - O X - O - - O - - O O X - - O - O - O O O - O O O O O O O
O O O O O O O X - O X - X O - O - - - O - O O X O O O X O X - O O O X X - - - X
X X O O O - O O X - O O O - O X - - O O - O X X O - X O O X X O - - O O - - - O
O O O O - O O X O X O O O - - - - O O O O O O O O O O O O - - - - - O - - - O O
O - X O - X O X - - O O O O O - - O O X O - X X O O - - - - O - - X O - - O O O
O - X X O - X - X X X X O X - - O O O O - - O O O - X X - X O X X X X X - O - -
X O X - O - X - O X X X - X X - O O - - X X - O - X X X - X - O X O - X O - O O
X X O O O - O - - - - - O - - O O - O - O X - O - O - - O X O - X - - O - O O X
X O O - - O - O O O - - - O O O - X X O - X X X O O X O O O O - - - - - O - X X
- X - O - - - O - O X X X X - O X X X - O - - O O X X X - - O O O O - X - - - -
X X X X - O O O - - O X X - - - - X - X - O O O - X X - - O - O O - O X - O O O
O - X O O - - - O - X - - O - - O O - X O O - O O - X O - - O O X O O O O - - -
O X - O O - O O O X X X O - - - X O - O - - X - O - - O - - O X X X O O - O O O
X X - - O - - O O O - - O O - X X X O - X - - O O O O - O O - O - O O O - O O -
O - - - - O - - - - O O O O O - X X X X - X X - O - O - O - O O O - - - - - O O
- - O - - O - O X X - - - O O - - X X X O X - - - O O O O O - O O O - O X - O -
X - O - X X O - - - O O O - - O O - - O O O X - - O - O O - X X X O - X X - - O
X - - X X O O - O - - O - - - - - - - O O X X X - O - X O O X X X X - - X - O -
X - O X - O - X X - O - - X X - X - - O - X - X X O - X X O - O X O - O - - O O
X X - O O - X X - - - - X X - - X X - X - X - - O O - X - O X - - X X X O O O O
X O O - O - X X - O O O X X O X X X - X X O - - O O O - O - X - X O X X X O X -
X O O - O O - - - O O X O - X X X - O X - O O O - O - X X - - X X X - X - - X X
X X - X X O - X X - X X X X X X X - - O O O - X X X X X - O O O - X X X X O X -
- X - X X X O - O - O X X - - - - O O - O O - X X O X X O O O O O O - - X X X -
O X O - X O O - - - - - X - X - - - O - O O O X - X X - O O - O - - - X X - X X
O O O - - - - X - - X O O X X - - O - - - - X X X - - - - - - O O O - X - O - -
O - O - - O - X - X X X - O X X X - - O O X X X X O O O O - - O - - - - O - - O
O - - O O O - - - - X X X - O - - - - - - - X X - O - - O - O - - - X X - - - O
- O - - O - O O - - O X O - - O - X X - X X X X O - X X - - O - X O X O - O O -
O O O O O - X - - - O - X - O - O X O - - - X - O - - X O O O - X - X X - O O -
- O - - O X X X X X - X X - O - - O - - O X X - - O - X - X O O - O X X O O O X
X X O O O O X X - X X - X X - X - - - O - O X X X O X X X X - O O X X X - - - X
X X X O O - O X X - X O X - X X - - O O - X X X X - X X X X X X - - X X - - - X
X X X O - X X X X X X O O - - - - O O X O O X X O O X O O - - - - - O - - - O O
O - X X - X X X - - O O O O O - - O X X X - X X X O - - - - O - - X X - - O O O
5.1 Disease Outbreak 121
Imagine now that all the sick people cough. Assume that everyone standing directly
next to this cough becomes infected. These neighbors are not sick right away but
they will be shortly. Consider only the four directions up, down, left, and right.
Regarding the spread of disease this represents just a single step. The sick people
have coughed on their neighbors but those neighbors have not yet coughed on their
neighbors... so the neighbor’s neighbor may not be sick (for now). Code Listing 5.4
on the previous page shows output for the same grid initialization as before.
Watch out for the common error of looping low to high and thus cascading the
disease farther to the right and down than you really mean to. In the next lab we will
do exactly that on purpose and in all four directions. Code Listing 5.5 shows the
sick (maybe) infecting their neighbors if the neighboring slot is even in bounds.
Code Listing 5.5: The sick infecting only (!) their nearest neighbors.
#
# Assumes w, h, and grid are already defined as in the earlier listings.
#
def maybe_infect(grid,x,y):
    if 0<=x<w and 0<=y<h:            # Are we even in bounds?
        j=y*w+x
        if grid[j]!='-':             # Skip empty slots.
            if grid[j]!='X':         # Skip the already sick.
                grid[j]='*'          # Mark "going to be sick."
#
j=0
while j<w*h:
    if grid[j]=='X':
        #
        y=j//w                       # translate from 1-D to 2-D
        x=j%w
        #
        maybe_infect(grid,x,y-1)     # up
        maybe_infect(grid,x,y+1)     # down
        maybe_infect(grid,x-1,y)     # left
        maybe_infect(grid,x+1,y)     # right
        #
    j+=1
#
# Convert the "going to be sick" into actually sick.
#
j=0
while j<w*h:
    if grid[j]=='*': grid[j]='X'
    j+=1
#
Lab513: Floodfill
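A recursive spread, in which every newly infected person immediately infects their own neighbors, can be sketched as below. This is one possible formulation rather than the lab's own listing, and the function names and the tiny test grid are assumptions.

```python
def infect_all(grid, x, y, w, h):
    """Infect the person at (x, y), if any, and then everyone they can reach."""
    if not (0 <= x < w and 0 <= y < h):   # out of bounds
        return
    j = y * w + x
    if grid[j] != 'O':                    # empty slot or already sick
        return
    grid[j] = 'X'
    infect_all(grid, x, y - 1, w, h)      # up
    infect_all(grid, x, y + 1, w, h)      # down
    infect_all(grid, x - 1, y, w, h)      # left
    infect_all(grid, x + 1, y, w, h)      # right

def floodfill(grid, w, h):
    """Spread the disease from every initially sick person."""
    initially_sick = [j for j in range(w * h) if grid[j] == 'X']
    for j in initially_sick:
        x, y = j % w, j // w
        for dx, dy in ((0, -1), (0, 1), (-1, 0), (1, 0)):
            infect_all(grid, x + dx, y + dy, w, h)

# A 5 x 2 grid: the group on the right never touches the sick person.
grid = list("XO-OO" "--OO-")
floodfill(grid, 5, 2)
print(''.join(grid[:5]))
print(''.join(grid[5:]))
```

Marking a slot as sick before recursing is what stops the function from bouncing back and forth between two neighbors forever. On large grids a long chain of people can exceed Python's default recursion limit, which may need to be raised with sys.setrecursionlimit.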
Code Listing 5.7: Recursive infection. Top: Again, same 40% empty grid as before.
Bottom: Everyone in the social circle of a sick person is now sick themselves, too.
O - X O O - X - O X O O O O - - O O O O - - O O O - O O - O O O X O O X - O - -
O O O - O - O - O O O X - X O - O O - - O X - O - O X X - X - O O O - O O - O O
X O O O O - O - - - - - O - - O O - O - O O - O - O - - O O O - X - - O - O O O
O O O - - O - O O O - - - O O O - X O O - O X O O O O O O O O - - - - - O - O X
- O - O - - - O - O O X X O - O O X O - O - - O O O X O - - O O O O - X - - - -
O X X O - O O O - - O O O - - - - O - X - O O O - O X - - O - O O - O O - O O O
O - O O O - - - O - O - - O - - O O - O O O - O O - O O - - O O O O O O O - - -
O O - O O - O O O O X O O - - - O O - O - - X - O - - O - - O O X O O O - O O O
O X - - O - - O O O - - O O - O X O O - X - - O O O O - O O - O - O O O - O O -
O - - - - O - - - - O O O O O - O O O X - X O - O - O - O - O O O - - - - - O O
- - O - - O - O O X - - - O O - - X O O O O - - - O O O O O - O O O - O O - O -
O - O - X O O - - - O O O - - O O - - O O O O - - O - O O - X O O O - O X - - O
X - - X O O O - O - - O - - - - - - - O O O X O - O - O O O O O X O - - O - O -
X - O O - O - X O - O - - X O - O - - O - X - X O O - X O O - O O O - O - - O O
X O - O O - O O - - - - X X - - X O - O - O - - O O - O - O O - - O X O O O O O
O O O - O - X O - O O O O O O O X O - X O O - - O O O - O - X - O O O X O O O -
O O O - O O - - - O O O O - O O O - O O - O O O - O - X O - - O X O - X - - X O
X O - O O O - X O - O X O X X X O - - O O O - O X O O X - O O O - O X O O O O -
- X - X X O O - O - O O X - - - - O O - O O - X O O O O O O O O O O - - X O O -
O O O - O O O - - - - - O - O - - - O - O O O O - O X - O O - O - - - O O - X O
O O O - - - - O - - O O O O X - - O - - - - O O X - - - - - - O O O - X - O - -
O - O - - O - X - O X O - O O X O - - O O O X X O O O O O - - O - - - - O - - O
O - - O O O - - - - O X O - O - - - - - - - O O - O - - O - O - - - X O - - - O
- O - - O - O O - - O O O - - O - X O - O X X O O - O X - - O - O O O O - O O -
O O O O O - O - - - O - O - O - O O O - - - O - O - - O O O O - X - X O - O O -
- O - - O O X X O X - O X - O - - O - - O O X - - O - O - O O O - O O O O O O O
O O O O O O O X - O X - X O - O - - - O - O O X O O O X O X - O O O X X - - - X
X X O O O - O O X - O O O - O X - - O O - O X X O - X O O X X O - - O O - - - O
O O O O - O O X O X O O O - - - - O O O O O O O O O O O O - - - - - O - - - O O
O - X O - X O X - - O O O O O - - O O X O - X X O O - - - - O - - X O - - O O O
X - X X X - X - X X X X X X - - X X X X - - X X X - X X - X X X X X X X - O - -
X X X - X - X - X X X X - X X - X X - - X X - X - X X X - X - X X X - X X - X X
X X X X X - X - - - - - O - - X X - X - X X - X - X - - X X X - X - - X - X X X
X X X - - O - X X X - - - X X X - X X X - X X X X X X X X X X - - - - - O - X X
- X - X - - - X - X X X X X - X X X X - O - - X X X X X - - X X X X - X - - - -
X X X X - X X X - - X X X - - - - X - X - X X X - X X - - O - X X - X X - O O O
X - X X X - - - X - X - - O - - X X - X X X - X X - X X - - X X X X X X X - - -
X X - X X - X X X X X X X - - - X X - X - - X - X - - X - - X X X X X X - O O O
X X - - X - - X X X - - X X - X X X X - X - - X X X X - X X - X - X X X - O O -
X - - - - X - - - - X X X X X - X X X X - X X - X - X - X - X X X - - - - - O O
- - O - - X - X X X - - - X X - - X X X X X - - - X X X X X - X X X - X X - O -
X - O - X X X - - - O O O - - O O - - X X X X - - X - X X - X X X X - X X - - O
X - - X X X X - X - - O - - - - - - - X X X X X - X - X X X X X X X - - X - X -
X - X X - X - X X - O - - X X - X - - X - X - X X X - X X X - X X X - X - - X X
X X - X X - X X - - - - X X - - X X - X - X - - X X - X - X X - - X X X X X X X
X X X - X - X X - X X X X X X X X X - X X X - - X X X - X - X - X X X X X X X -
X X X - X X - - - X X X X - X X X - X X - X X X - X - X X - - X X X - X - - X X
X X - X X X - X X - X X X X X X X - - X X X - X X X X X - X X X - X X X X X X -
- X - X X X X - X - X X X - - - - O O - X X - X X X X X X X X X X X - - X X X -
X X X - X X X - - - - - X - X - - - O - X X X X - X X - X X - X - - - X X - X X
X X X - - - - X - - X X X X X - - O - - - - X X X - - - - - - X X X - X - O - -
X - X - - X - X - X X X - X X X X - - X X X X X X X X X X - - X - - - - O - - O
X - - X X X - - - - X X X - X - - - - - - - X X - X - - X - X - - - X X - - - O
- X - - X - X X - - X X X - - O - X X - X X X X X - X X - - X - X X X X - X X -
X X X X X - X - - - X - X - O - X X X - - - X - X - - X X X X - X - X X - X X -
- X - - X X X X X X - X X - O - - X - - X X X - - X - X - X X X - X X X X X X X
X X X X X X X X - X X - X X - X - - - X - X X X X X X X X X - X X X X X - - - X
X X X X X - X X X - X X X - X X - - X X - X X X X - X X X X X X - - X X - - - X
X X X X - X X X X X X X X - - - - X X X X X X X X X X X X - - - - - X - - - X X
X - X X - X X X - - X X X X X - - X X X X - X X X X - - - - O - - X X - - X X X
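The recursive infection shown above can be sketched as follows. This is a minimal Python 3 sketch under stated assumptions, not necessarily the listing used to produce the output: it assumes the same 1-D list layout with `w` columns and `h` rows, and the names `floodfill` and `infect_all` are hypothetical. On large grids Python's recursion depth limit may be reached.

```python
def floodfill(grid, w, h, x, y):
    """Recursively infect every healthy person connected to slot (x, y)."""
    if not (0 <= x < w and 0 <= y < h):   # out of bounds
        return
    j = y * w + x
    if grid[j] != 'O':                    # empty slot, or already sick
        return
    grid[j] = 'X'                         # infect, then spread in all four directions
    floodfill(grid, w, h, x, y - 1)
    floodfill(grid, w, h, x, y + 1)
    floodfill(grid, w, h, x - 1, y)
    floodfill(grid, w, h, x + 1, y)


def infect_all(grid, w, h):
    """Spread from every initially sick slot through its whole social circle."""
    sick = [j for j in range(w * h) if grid[j] == 'X']
    for j in sick:
        y, x = divmod(j, w)
        for dx, dy in ((0, -1), (0, 1), (-1, 0), (1, 0)):
            floodfill(grid, w, h, x + dx, y + dy)
```

Unlike the single-step version, no temporary marker is needed here because the cascade is exactly what we want.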
1 - 1 1 1 - 2 - 3 3 3 3 3 3 - - 4 4 4 4 - - 4 4 4 - 4 4 - 4 4 4 4 4 4 4 - 5 - -
1 1 1 - 1 - 2 - 3 3 3 3 - 3 3 - 4 4 - - 4 4 - 4 - 4 4 4 - 4 - 4 4 4 - 4 4 - 6 6
1 1 1 1 1 - 2 - - - - - 7 - - 4 4 - 4 - 4 4 - 4 - 4 - - 4 4 4 - 4 - - 4 - 6 6 6
1 1 1 - - 8 - 4 4 4 - - - 4 4 4 - 4 4 4 - 4 4 4 4 4 4 4 4 4 4 - - - - - 9 - 6 6
- 1 - 1 - - - 4 - 4 4 4 4 4 - 4 4 4 4 - A - - 4 4 4 4 4 - - 4 4 4 4 - 4 - - - -
1 1 1 1 - 4 4 4 - - 4 4 4 - - - - 4 - 4 - 4 4 4 - 4 4 - - B - 4 4 - 4 4 - C C C
1 - 1 1 1 - - - 4 - 4 - - D - - 4 4 - 4 4 4 - 4 4 - 4 4 - - 4 4 4 4 4 4 4 - - -
1 1 - 1 1 - 4 4 4 4 4 4 4 - - - 4 4 - 4 - - E - 4 - - 4 - - 4 4 4 4 4 4 - F F F
1 1 - - 1 - - 4 4 4 - - 4 4 - 4 4 4 4 - G - - 4 4 4 4 - 4 4 - 4 - 4 4 4 - F F -
1 - - - - H - - - - 4 4 4 4 4 - 4 4 4 4 - 4 4 - 4 - 4 - 4 - 4 4 4 - - - - - F F
- - I - - H - J J J - - - 4 4 - - 4 4 4 4 4 - - - 4 4 4 4 4 - 4 4 4 - K K - F -
L - I - H H H - - - M M M - - N N - - 4 4 4 4 - - 4 - 4 4 - 4 4 4 4 - K K - - O
L - - H H H H - P - - M - - - - - - - 4 4 4 4 4 - 4 - 4 4 4 4 4 4 4 - - K - 4 -
L - H H - H - P P - Q - - R R - R - - 4 - 4 - 4 4 4 - 4 4 4 - 4 4 4 - 4 - - 4 4
L L - H H - P P - - - - R R - - R R - 4 - 4 - - 4 4 - 4 - 4 4 - - 4 4 4 4 4 4 4
L L L - H - P P - R R R R R R R R R - 4 4 4 - - 4 4 4 - 4 - 4 - 4 4 4 4 4 4 4 -
L L L - H H - - - R R R R - R R R - 4 4 - 4 4 4 - 4 - 4 4 - - 4 4 4 - 4 - - 4 4
L L - H H H - S S - R R R R R R R - - 4 4 4 - 4 4 4 4 4 - 4 4 4 - 4 4 4 4 4 4 -
- L - H H H H - S - R R R - - - - T T - 4 4 - 4 4 4 4 4 4 4 4 4 4 4 - - 4 4 4 -
L L L - H H H - - - - - R - R - - - T - 4 4 4 4 - 4 4 - 4 4 - 4 - - - 4 4 - 4 4
L L L - - - - U - - R R R R R - - V - - - - 4 4 4 - - - - - - 4 4 4 - 4 - W - -
L - L - - R - U - R R R - R R R R - - 4 4 4 4 4 4 4 4 4 4 - - 4 - - - - X - - Y
L - - R R R - - - - R R R - R - - - - - - - 4 4 - 4 - - 4 - 4 - - - 4 4 - - - Y
- R - - R - R R - - R R R - - Z - a a - 4 4 4 4 4 - 4 4 - - 4 - 4 4 4 4 - 4 4 -
R R R R R - R - - - R - R - b - a a a - - - 4 - 4 - - 4 4 4 4 - 4 - 4 4 - 4 4 -
- R - - R R R R R R - R R - b - - a - - 4 4 4 - - 4 - 4 - 4 4 4 - 4 4 4 4 4 4 4
R R R R R R R R - R R - R R - c - - - 4 - 4 4 4 4 4 4 4 4 4 - 4 4 4 4 4 - - - 4
R R R R R - R R R - R R R - c c - - 4 4 - 4 4 4 4 - 4 4 4 4 4 4 - - 4 4 - - - 4
R R R R - R R R R R R R R - - - - 4 4 4 4 4 4 4 4 4 4 4 4 - - - - - 4 - - - 4 4
R - R R - R R R - - R R R R R - - 4 4 4 4 - 4 4 4 4 - - - - d - - 4 4 - - 4 4 4
Code Listing 5.9: Is there a component that spans the grid? Yes or no?
- - 1 - - 2 - 2 2 2 2 - - 3 - 3 - 3 3 3 3 - 4 4 - 5 5 - - - 6 6 - 7 7 7 - - - 8
3 - 1 - 2 2 - - - - 2 - 3 3 3 3 - - 3 - - 9 - - 5 5 5 5 5 5 - - 7 7 7 - - - A -
3 - - 2 2 2 2 - 2 2 2 - 3 3 - 3 - - 3 3 - 9 9 - 5 5 - - 5 - 7 7 7 7 7 - - - A A
3 - - - 2 - 2 2 2 2 - B - 3 - 3 3 3 3 3 3 - 9 9 - - 5 5 5 5 - 7 7 - - A A - A A
3 3 3 3 - C - 2 2 2 2 - - 3 3 - - - - - 3 3 - - - 5 5 - - 5 - - - D - A A A A -
3 3 - - 3 - - - - 2 - - 3 - - 3 3 - 3 - 3 - - - E - - 3 - 5 5 5 - - A A A A A A
- 3 3 3 3 3 - F - - 3 3 3 - 3 - 3 3 3 3 3 3 - 3 - - 3 3 - - 5 - - G - - - - A A
- - 3 3 3 - H - 3 3 3 3 3 - 3 3 3 - - 3 - - 3 3 3 3 3 3 - I - - J - - A - A A A
3 3 3 3 3 3 - 3 3 3 3 3 3 3 3 3 3 3 - - 3 3 3 - 3 - 3 3 3 - - - - K - A A A A -
- - 3 - 3 3 3 3 3 - 3 3 3 - - 3 - 3 - 3 3 - - - - - 3 3 - - L - K K K - - A A -
3 3 - M - - 3 3 3 - 3 3 3 3 3 3 3 - 3 3 3 3 3 3 3 3 3 3 3 - - - K K - - - A A A
- 3 3 - N - - 3 3 - - 3 3 3 3 3 3 3 - 3 3 - 3 3 3 - - 3 3 - - - - - O - - A A A
- 3 3 - N - 3 3 3 3 - 3 3 - 3 - - 3 - - 3 3 3 3 3 3 - 3 3 - - 3 3 3 - 3 - A - -
3 3 3 3 - - 3 3 - - 3 3 3 3 3 - - 3 3 3 3 3 3 3 3 3 3 3 3 3 3 - 3 3 3 3 - A A A
3 3 3 - 3 3 3 - 3 3 3 3 3 3 - 3 3 - - - 3 - - 3 - - 3 - 3 3 3 3 3 - - - P - - -
- - 3 3 3 - - - 3 - 3 3 - - 3 3 - 3 3 - - - 3 - - 3 - 3 3 3 - 3 3 - 3 - - 3 3 3
- 3 3 - 3 - - 3 3 - - 3 - 3 3 3 3 3 3 3 3 - 3 3 3 3 - - 3 3 3 3 - 3 3 3 3 3 - -
- 3 - Q - - 3 3 - - 3 3 3 3 3 3 3 3 3 3 - 3 3 3 - 3 3 - 3 3 - 3 3 3 3 3 - 3 3 3
- - R - - 3 3 3 3 - - - 3 - - 3 3 - 3 - 3 3 - 3 3 3 - - - - - - - 3 3 - - 3 3 -
R R R - 3 3 - 3 3 3 - S - - T - - - 3 3 3 3 - - 3 3 - 3 3 - U - 3 3 3 3 - 3 3 -
- R R - - 3 3 - 3 - - - 3 3 - 3 3 3 - 3 3 3 3 - 3 - - 3 3 3 - - 3 3 3 3 3 - 3 3
R R R R - - - 3 3 - 3 3 3 - 3 3 3 - - 3 3 - 3 - - - V - - 3 3 3 3 3 - - - W - 3
R R - - - - - - 3 3 3 3 3 3 - 3 3 - 3 3 3 3 3 - 3 3 - X - - - - - 3 3 - 3 - - -
R R R - Y Y Y - 3 3 3 - - - - 3 - 3 3 - 3 3 3 - 3 3 - X - - - Z - 3 3 3 3 3 - 3
R - - Y Y - Y - - - 3 3 - a - 3 3 3 - 3 - 3 - - - 3 - - Z Z Z Z Z - - 3 3 3 3 3
R - Y Y - - - - 3 3 3 - - a - 3 3 - 3 3 - 3 3 3 3 3 3 3 - - Z - Z Z - 3 3 - - 3
R - Y - - Y Y - - - 3 - b - c - 3 3 3 - 3 3 3 - - 3 - 3 - - - Z Z - 3 3 - d - -
- Y Y Y Y Y Y - e - 3 - - - - - 3 - - 3 3 3 3 3 - - 3 3 3 - - - Z - - - - - - f
g - - Y Y Y Y - e - 3 - h h - 3 3 3 - - - - 3 - - i - - - - i - - j j - k - l -
g g - Y Y Y Y - - 3 3 3 - h - 3 - - - 3 3 3 3 3 - i i i i i i - - - j - - l l l
Lab515: Spanning
[Plot "Runtime Comparison". Series: recursion, loop/list. Horizontal axis: Fibonacci Number, n not $F_n$, 0 to 25. Vertical axis: Runtime, seconds, 0 to 0.012.]
A certain problem of size n = 31 has a runtime of 0.75 seconds. Increasing the size
to n = 33 then takes almost 2 seconds, and for n = 37 over 13 seconds. Why?
It gets worse. If the trend continues then n = 51 will take over 10,000 seconds or
almost three hours. Likewise, n = 65 needs nearly 10,000,000 seconds or 15 weeks.
Ridiculously, n = 75 runs for over 1,000,000,000 seconds, just shy of 36 years.
The scenario described above is based on exponential growth. Figures 5.2 and 5.3
indicate a similar runtime for the Fibonacci function shown in Code Listing 5.10.
Note the dramatic changes in vertical scale between the two plots whereas the
problem size has not moved that far along the horizontal axis. If recursion is desired
then Code Listing 5.11 suggests an incremental (inchworming) treatment of
$F_n = F_{n-1} + F_{n-2}$
and $F_1 = F_2 = 1$, a tail recursion that translates into code using just a loop and a list.
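To make the comparison concrete, here is a minimal Python 3 sketch of the two approaches. It is written in the spirit of the listings referenced above rather than copied from them; the function names `fib_recursive` and `fib_list` are hypothetical.

```python
import time


def fib_recursive(n):
    """Naive recursion: the tree of calls grows exponentially with n."""
    if n <= 2:
        return 1
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_list(n):
    """Inchworm along a list, reusing the two most recent values."""
    f = [0, 1, 1]                 # f[1] = f[2] = 1; f[0] is only padding
    while len(f) <= n:
        f.append(f[-1] + f[-2])
    return f[n]


if __name__ == '__main__':
    for n in (10, 20, 25):
        for func in (fib_recursive, fib_list):
            tic = time.perf_counter()
            value = func(n)
            toc = time.perf_counter()
            print(n, func.__name__, value, toc - tic)
```

Timing larger values of n with the recursive version quickly becomes impractical, which is the point of the plots.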
5.2 Runtime Analysis 127
[Plot "Runtime Comparison". Series: recursion, loop/list. Horizontal axis: Fibonacci Number, n not $F_n$, 0 to 40. Vertical axis: Runtime, seconds, 0 to 14.]
Fig. 5.3: Runtime analysis, recursion versus a loop and a list, continued.
This famous algorithm for calculating the greatest common divisor (GCD) of two
positive integers (a and b) comes from Euclid's Elements, circa 300 B.C.
See Code Listing 5.12 for the procedure and Table 5.1 for an example. We could
also use a loop instead of recursion but either way Euclid’s approach offers a clear
improvement over other standard methods:
1. Loop from 1 up to min (a, b) remembering only the last common divisor you see.
Since the loop goes up, when it finishes the last one you saw must be the greatest.
2. Loop from min (a, b) down to 1 stopping as soon as you find a common divisor.
If you make it all the way down to 1 then a and b are said to be relatively prime.
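The two standard methods listed above can be sketched as follows. This is a minimal Python 3 sketch for comparison purposes; the names `gcd_loop_up` and `gcd_loop_down` are hypothetical.

```python
def gcd_loop_up(a, b):
    """Method 1: loop upward, remembering the last common divisor seen."""
    last = 1
    d = 1
    while d <= min(a, b):
        if a % d == 0 and b % d == 0:
            last = d
        d += 1
    return last


def gcd_loop_down(a, b):
    """Method 2: loop downward, stopping at the first common divisor found."""
    d = min(a, b)
    while d > 1:
        if a % d == 0 and b % d == 0:
            return d
        d -= 1
    return 1                      # a and b are relatively prime
```

Both loops may need on the order of min(a, b) iterations, which is the cost Euclid's approach avoids.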
Code Listing 5.12: Tail recursion again, which could be coded with just a loop.
#
def gcd(a,b):
    #
    mod=(a%b)
    #
    if mod==0:
        return b
    else:
        return gcd(b,mod)
    #
#
call    a    b    mod
  1    20   72    20
  2    72   20    12
  3    20   12     8
  4    12    8     4
  5     8    4     0
Code Listing 5.13 shows three more test cases, but of course you can come up
with a set of your own cases, too. Note how in the last case Euclid requires only
three calls, and not O(1000) loop iterations, to determine that 999 and 1000 are
relatively prime.
Code Listing 5.13: A small set of test cases.
20 72: 4
230 4050: 10
999 1000: 1
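Since the text notes that a loop could replace the recursion, here is a minimal Python 3 sketch of that variant. The name `gcd_loop` and the extra step counter are assumptions added for illustration only.

```python
def gcd_loop(a, b):
    """Euclid's algorithm with a loop; returns (gcd, number of mod operations)."""
    steps = 0
    while True:
        mod = a % b
        steps += 1
        if mod == 0:
            return b, steps
        a, b = b, mod             # same hand-off as the recursive call gcd(b, mod)


if __name__ == '__main__':
    for a, b in ((20, 72), (230, 4050), (999, 1000)):
        print(a, b, gcd_loop(a, b))
```

The second returned value counts how many times the `%` operator runs, so it can be compared directly with the iteration count of a brute-force loop.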
What does the computer actually do when we make a recursive call? Say we are in
the main program and we call gcd(70, 15)...
But when the last recursive call returns “the answer” it does so only to the previ-
ous recursive call, who must pass it on, who must pass it on...
In this way the answer (5) is finally returned to the original call of gcd(70, 15)
in our main program. Table 5.2 shows the system call stack during execution of this
recursive function, assuming that successive calls occupy contiguous memory.
          ...
call #1   argument #2      b = 15
          argument #1      a = 70
          return address   main program
          local variable   mod = 10
call #2   argument #2      b = 10
          argument #1      a = 15
          return address   call #1
          local variable   mod = 5
call #3   argument #2      b = 5
          argument #1      a = 10
          return address   call #2
          local variable   mod = 0
          ...
Now you can see why there has to be a recursive depth limit. Each function call
accounts for a non-zero amount of memory and there is finite memory available
to any computer. This also suggests that we could code “recursion” using only an
explicit stack and (!) no function. That is true but also a topic for another course.
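As a small taste of that idea, the sketch below mimics the call stack with an ordinary Python list. It is only an illustration under stated assumptions: the dictionary "frames" and the name `gcd_with_explicit_stack` are hypothetical and are not how the system actually lays out memory.

```python
def gcd_with_explicit_stack(a, b):
    """Mimic the call stack of the recursive gcd using a plain list of frames."""
    stack = []                    # our stand-in for the system call stack
    # "Call" phase: push one frame per would-be recursive call.
    while True:
        frame = {'a': a, 'b': b, 'mod': a % b}
        stack.append(frame)
        if frame['mod'] == 0:
            break
        a, b = b, frame['mod']
    depth = len(stack)            # how deep the recursion would have gone
    # "Return" phase: the deepest frame holds the answer...
    answer = stack.pop()['b']
    while stack:
        stack.pop()               # ...and every earlier call just passes it on
    return answer, depth


if __name__ == '__main__':
    print(gcd_with_explicit_stack(70, 15))
```

The list grows by one frame per call, which makes the memory cost of deep recursion easy to see.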
Your assignment is to produce the first plot shown in Figure 5.4, the top one.
Using our previous code to determine whether or not there is a connected com-
ponent that spans a grid, Algorithm 5.2.1 loops p from 0.0 to 1.0 and finds for each
different p the likelihood that a spanning component will exist. So, the horizontal
axis in our plot shows the value of p and the vertical axis shows what fraction of the
associated trials had a spanning component present for that p.
These plots both used 1000 trials. Running more trials would make the plot
smoother, as would incrementing p more finely (as shown, dp = 0.01); however,
both of those refinements have an associated runtime cost.
Note carefully in the code outline how p only affects initialization of the grid.
From the plots we see that for p < 0.5 there is “never” a spanning component and
for p > 0.7 there “always” is one. For the in-between range we transition toward
ever more likely spanning as p increases.
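The experiment described above can be sketched as follows. This is a minimal Python 3 sketch, not the book's algorithm outline. It assumes that "spanning" means connecting the left edge to the right edge, it uses an explicit stack rather than recursion to avoid the depth limit, and the names `make_grid`, `spans`, and `spanning_fraction` are hypothetical.

```python
import random


def make_grid(w, h, p):
    """Occupy each slot independently with probability p (the only use of p)."""
    return [random.random() < p for _ in range(w * h)]


def spans(grid, w, h):
    """True if occupied slots connect the left edge to the right edge."""
    seen = [False] * (w * h)
    stack = [y * w for y in range(h) if grid[y * w]]   # occupied slots in column 0
    for j in stack:
        seen[j] = True
    while stack:
        j = stack.pop()
        y, x = divmod(j, w)
        if x == w - 1:
            return True
        for nx, ny in ((x, y - 1), (x, y + 1), (x - 1, y), (x + 1, y)):
            if 0 <= nx < w and 0 <= ny < h:
                k = ny * w + nx
                if grid[k] and not seen[k]:
                    seen[k] = True
                    stack.append(k)
    return False


def spanning_fraction(w, h, p, trials):
    """Fraction of random grids at occupancy p that contain a spanning component."""
    count = sum(spans(make_grid(w, h, p), w, h) for _ in range(trials))
    return count / trials


if __name__ == '__main__':
    p = 0.50
    while p <= 0.75 + 1e-9:
        print('%.2f %.3f' % (p, spanning_fraction(40, 30, p, 100)))
        p += 0.05
```

The demonstration uses only 100 trials and a coarse step in p to keep the runtime short; both can be refined at the cost of a longer run.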
As we scale the size of our grid toward infinity (an order of magnitude increase
is shown in the bottom plot) this transition will sharpen until eventually we see only
a step-function that jumps directly from 0% to 100% at the critical probability.
To save on runtime in both zooming and sharpening it is only necessary to refine
our plot locally near this critical probability. We certainly never need to do any
refinement where we already know the spanning percentage is either 0% or 100%
because how could any change possibly have an effect on that!?
Unlike the Mandelbrot movie problem where a small program generated a large
amount of image-frame data in a short amount of time, drawing a high-resolution
spanning plot involves the input of a single value, p, then your code runs for hours,
only to output another single value. This scenario is ideal for parallel computing.
[Two plots, top and bottom. Horizontal axis: Probability of Occupying each Slot, 0.5 to 0.75. Vertical axis: Observed Probability of a Spanning Component.]
Fig. 5.4: For what probabilities are we likely to observe a spanning component?
Top: grid size 40 × 30. Bottom: grid size 120 × 90, an order of magnitude larger.
The next few pages show the simulation of a forest fire ignited on the left edge of
our grid that moves through populated slots and eventually burns out. We want to
know how the average burnout time changes as the density of trees is varied.
Our floodfill algorithm would recursively burn the fire through the entire forest
all at once, so that is not desirable here. Instead we need to move the fire step-by-step
so we can count the number of steps it takes to burn out.
Code Listing 5.14 advances the fire a single step where currently burning trees
ignite their nearest neighbors and then burn out themselves. In the grid plots shown
in Code Listings 5.15, 5.16, and 5.17, the asterisks (∗) from the code appear as filled-in
black dots. Note how the code uses ‘F’ to mark trees as “going to be on fire” but we
never see this character in any of the plots.
Your code should also calculate burnout time and display a running step count.
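One way to structure the single step and the step count is sketched below. This is a minimal Python 3 sketch and not the book's listing: it assumes `'T'` for a tree, `'*'` for a burning tree, `'F'` for "going to be on fire", and a space for an empty or burned-out slot, and it counts the step in which the last tree burns out. The function names are hypothetical.

```python
import random


def init_forest(w, h, density):
    """Random forest; every tree in the left-most column starts on fire."""
    grid = ['T' if random.random() < density else ' ' for _ in range(w * h)]
    for y in range(h):
        if grid[y * w] == 'T':
            grid[y * w] = '*'
    return grid


def step(grid, w, h):
    """Advance the fire one step; return how many trees are burning afterwards."""
    for j in range(w * h):
        if grid[j] == '*':
            y, x = divmod(j, w)
            for nx, ny in ((x, y - 1), (x, y + 1), (x - 1, y), (x + 1, y)):
                if 0 <= nx < w and 0 <= ny < h and grid[ny * w + nx] == 'T':
                    grid[ny * w + nx] = 'F'      # going to be on fire
            grid[j] = ' '                         # this tree has now burned out
    burning = 0
    for j in range(w * h):
        if grid[j] == 'F':                        # convert into actually on fire
            grid[j] = '*'
            burning += 1
    return burning


def burnout_time(grid, w, h):
    """Count the steps until nothing is burning any more."""
    steps = 0
    while '*' in grid:
        step(grid, w, h)
        steps += 1
    return steps
```

Marking neighbors with `'F'` first prevents the fire from cascading several slots within a single step.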
A faster-running alternative is to use a special kind of list called a queue, similar
to a system call stack except that items are added at one end and removed from the
other. The advancing-front characteristic of the forest fire is related to a famous
technique known as breadth-first search. Again, since there is a time component to
our fire burning out it does not make sense to recursively floodfill all the way across
the forest at once. We could modify Lab513 in the same way.
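The queue idea can be sketched with `collections.deque` from the Python standard library. This is a minimal Python 3 sketch under stated assumptions: the grid is a 1-D list using `'T'` for a tree, `'*'` for a burning tree, and a space for an empty slot, and the step in which the last tree burns out is counted. The name `burnout_time_bfs` is hypothetical.

```python
from collections import deque


def burnout_time_bfs(grid, w, h):
    """Burnout time via breadth-first search from all initially burning trees."""
    queue = deque()
    for j in range(w * h):
        if grid[j] == '*':
            queue.append((j, 1))          # (slot, step on which it burns out)
    last = 0
    while queue:
        j, t = queue.popleft()            # remove from the front...
        last = t
        grid[j] = ' '
        y, x = divmod(j, w)
        for nx, ny in ((x, y - 1), (x, y + 1), (x - 1, y), (x + 1, y)):
            if 0 <= nx < w and 0 <= ny < h:
                k = ny * w + nx
                if grid[k] == 'T':
                    grid[k] = '*'         # ignite exactly once
                    queue.append((k, t + 1))   # ...and add at the back
    return last
```

Each slot enters the queue at most once, so the whole fire is processed without rescanning the grid on every step.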
Code Listing 5.15: Forest fire. Top: Initial conditions. Bottom: After five steps.
T T T T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
• • T T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T
Code Listing 5.16: Forest fire. Top: After 10 steps. Bottom: After 20 steps.
T • T T T T T
T T T T T T T T T T T T T T T
T T T T T T
T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T
T T T T T T T T T T T T T T T T T T
T T T T
T T T T T T T T T T T T T T T T T
T T T T T T T
T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
T T T T T T T
T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T
T T T T T T
T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
T T T T T T T
T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
T T T T
T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T
• T T T
T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
• T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
T T T T T T • T T T T T T T T T T T T T
T T T • T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T
T T • T T T T T T T T T T T T T T
T • T T T T T T T T T T T T T T
T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T
T T T T T T T T T T T T
T T T T T T T T T T T T
T T T T T T T T T T T T T T T
Code Listing 5.17: Forest fire. Top: After 40 steps. Bottom: Done after 78 steps.
T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T • T T T T T T T T T T T
T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T T
T T T T T T T T T T T T T T T
T T T T T T T T T T T T T
T T T T T T T T
T T T T T T T
T T T T T T T T T T
T T T T T T T T T T T
T T T T T T T T T
T T T T T T T
T T T T T T T T T
T T T T T T T T T T T
T T T T T T T T T
T T T T T T T T T
T T T T T T T T
T T T • T T T T T T
T T T T T T T T T T T
T T T T
T T
T T
T T
T T T
T T
T T T
T T T T T
T T T T T T T
T T T T T T
T T T
T T T
T T T T T
T T T T T
T T T T T T
T T T T
T T T
T T T
T T T T T T T T
T T T T T T T T T T
T T T T T T T T T
T T T T T T T
T T T T T T T T T
T T T T T T T T T T T
T T T T T T T T T
T T T T T T T T T
T T T T T T T T
T T T T T T T T T
[Fig. 5.5 plot. Horizontal axis: Density of Trees, 0 to 1. Vertical axis ticks: 0 to 60.]
Your assignment is to generate a plot similar to the one shown in Figure 5.5 which
used 1000 trials for each different density of trees.
In the bottom-left we see that very sparse forests burn out quickly because the
fire has nowhere to spread.
To the right-hand side we see that very dense forests have a burnout time pro-
portional to the width (here the grid was 40 × 30) because the fire spreads directly
across the forest without having to make its way back-and-forth at all.
Closer to the familiar-looking critical probability we might expect a spanning
component of the grid to just barely reach all four borders. In this case the fire takes
a long time to travel up and down and left and right through all the fractal-like
tendrils of the forest.
As we scale the size of the grid toward infinity the peak burnout time itself goes
to infinity and forms a vertical asymptote at the critical probability.
Wow!
5.3 Guessing Games 137
Consider two games, first a simple case of flipping a coin and guessing heads or tails
and then a more complicated game where we guess higher or lower as a sequence
of the digits 1-9 is constructed.
The coin-flipping game is presented for the sake of comparison because it is so
much easier to analyze than high-low. In particular, for coin flips there is only one
successful path through the game: always guess the next 50-50 flip correctly.
We try to guess if a coin will land on heads or tails when flipped and we assume
there is no cheating so the chance of success is really 50%. It does not matter if we
guess heads, guess tails, or guess at random; all three “strategies” have the same
success rate and if we run 10,000 trials the odds will average out close enough.
So, how likely are we to end the game having guessed n flips correctly in a row?
The answer is $1/2^{n+1}$ where the “plus one” accounts for the fact that we must guess
the last coin flip wrong to end the streak, also a 50-50 chance.
Figure 5.6 confirms this but you should write your own code to convince yourself.
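A minimal Python 3 sketch of such a check is shown below. The names `coin_game`, `simulate`, and `exact` are hypothetical, and the sketch assumes a fair coin modeled with `random.random()`.

```python
import random


def coin_game():
    """Guess coin flips until the first miss; return the length of the streak."""
    streak = 0
    while random.random() < 0.5:      # each guess is right half the time
        streak += 1
    return streak


def simulate(game, trials):
    """Return a dict mapping streak length to its observed frequency."""
    counts = {}
    for _ in range(trials):
        s = game()
        counts[s] = counts.get(s, 0) + 1
    return {s: c / trials for s, c in counts.items()}


def exact(n):
    """Probability of ending with exactly n correct guesses: 1/2^(n+1)."""
    return 1.0 / 2 ** (n + 1)


if __name__ == '__main__':
    observed = simulate(coin_game, 10000)
    for n in range(10):
        print(n, observed.get(n, 0.0), exact(n))
```

Passing the game function into `simulate` keeps the trial loop reusable for other guessing games.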
[Plot. Horizontal axis: Steps in Streak, 0 to 9. Vertical axis ticks: 0 to 0.4.]
Fig. 5.6: Guessing coin flips. Before we flip the first coin a streak of nine steps is
highly unlikely: less than one-tenth of one percent. But if we have already guessed
eight flips correctly then the chance of a ninth is 50-50, a coin flip.
[Plot. Horizontal axis: Steps in Streak, 0 to 9. Vertical axis ticks: 0 to 0.4.]
Fig. 5.7: Guessing high or low. Before we make the first guess a streak of nine steps
has a 1.3% chance of occurring. So, we have a better chance here than we did just
flipping coins but surviving the first guess is still 50-50, a “coin flip.”
In the high-low game we always start with the number 5. The next number
cannot be the same as the current number so the second number will never be 5.
Numbers are chosen from the digits 1-9 and you have to guess if the next number
will be higher or lower than the current one. Many paths are possible because if you
guess lower on the first turn then it could be a 1 or a 2 or a 3 or a 4. In the sample
run in Code Listing 5.18 the words “lower” and “higher” were typed by the user.
Code Listing 5.18: A sample run with a four-guess winning streak.
Algorithm 5.3.1 outlines an optimal strategy and Figure 5.7 on the previous page
shows our results. Algorithm 5.3.2 outlines the structure of an overall simulation
which could also be used for coin flipping if we substitute out the call to game.
Algorithm 5.3.1 A function game() that plays high-low with an optimal strategy.
streak = 0
number = 5
while true do
    if number < 5 then
        guess = higher
    else if number > 5 then
        guess = lower
    else
        guess = random
    end if
    nextnumber = generate
    if correct then
        streak = streak + 1
        number = nextnumber
    else
        return streak
    end if
end while
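The outline of Algorithm 5.3.1 translates into Python fairly directly. The following is a minimal Python 3 sketch under one stated assumption: the next number is drawn uniformly from the eight digits other than the current one. The helper `simulate` is hypothetical and stands in for the overall simulation structure.

```python
import random


def game():
    """Play one round of high-low using the strategy of Algorithm 5.3.1."""
    streak = 0
    number = 5
    while True:
        if number < 5:
            guess = 'higher'
        elif number > 5:
            guess = 'lower'
        else:
            guess = random.choice(['higher', 'lower'])
        # Assumption: uniform choice among the eight digits that differ from number.
        nextnumber = random.choice([d for d in range(1, 10) if d != number])
        correct = (nextnumber > number) == (guess == 'higher')
        if correct:
            streak += 1
            number = nextnumber
        else:
            return streak


def simulate(trials):
    """Return a dict mapping streak length to its observed frequency."""
    counts = {}
    for _ in range(trials):
        s = game()
        counts[s] = counts.get(s, 0) + 1
    return {s: c / trials for s, c in counts.items()}


if __name__ == '__main__':
    observed = simulate(10000)
    for n in sorted(observed):
        print(n, observed[n])
```

Starting from 5, either guess has four winning digits out of eight, so the first guess succeeds half the time.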
An alternative to randomized simulation (which may play out the same exact game
multiple times) is to use recursion and generate all possible game paths directly.
This is straightforward with heads-tails because we can list out possible games
using the binary representation of integers, 0=heads and 1=tails. There are $2^n$ different
bitstrings with length n where we “pad” the front as in 00110001.
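The padded bitstrings can be generated with Python's built-in `format`. This is a minimal Python 3 sketch; the name `all_games` is hypothetical.

```python
def all_games(n):
    """List every heads-tails sequence of length n as a zero-padded bitstring."""
    games = []
    for k in range(2 ** n):
        # '0%db' % n builds a format spec such as '08b': binary, padded to n digits
        games.append(format(k, '0%db' % n))
    return games


if __name__ == '__main__':
    for bits in all_games(3):
        print(bits)
```

With n = 8 the integer 49 becomes the string 00110001 used as the example above.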
Code Listing 5.19 shows a similar, albeit more meandering, enumeration of all
possible paths for high-low. This recursive function is called using: recur([5])
Code Listing 5.19: Recursively print all possible high-low game paths.
def recur(seqnums):
    #
    if len(seqnums)>maxstreak:       # maxstreak is a global set beforehand
        print(seqnums+['...'])
        return
    #
    num=seqnums[-1]                  # current number
    #
    # loop over all possible next numbers
    #
    nextnum=1
    while nextnum<=9:
        #
        if num==nextnum:             # prohibited
            pass
        #
        elif num<5 and nextnum<num:  # we guessed higher, it went lower
            print(seqnums+[nextnum,'lose'])
        elif num>5 and nextnum>num:  # we guessed lower, it went higher
            print(seqnums+[nextnum,'lose'])
        #
        # assume we guess 'lower' on a 5
        #
        elif num==5 and nextnum>num:
            print(seqnums+[nextnum,'lose'])
        #
        # finally now, a correct guess...
        #
        else:
            recur(seqnums+[nextnum])
        #
        nextnum+=1
    #
#
Using the tree of all possible paths we can also calculate the likelihood of any particular
length streak. In fact, previously mentioned Figure 5.7 shows simulation results
overlaid with precisely these analytic findings.
Code Listing 5.20 shows our recursive function where the argument p indicates
the probability that we have even made it this far and p diminishes by a factor of
eight down each particular path. To start use: calc(0,5,1.0)
Note that simulation runtime for similarly accurate results may be much faster.
Code Listing 5.20: Recursively calculate the probability of each game path.
htable={}
#
def calc(streak,num,p):
    #
    if streak>maxstreak:
        return
    #
    # How likely is the streak to end now?
    #
    chances=4+abs(5-num)
    if streak not in htable:
        htable[streak]=0.0
    htable[streak]+=(p*(1.0-chances/8.0))
    #
    if num>=5:
        #
        # in this case we guess lower...
        #
        nextnum=1
        while nextnum<num:
            calc(streak+1,nextnum,p/8.0)
            nextnum+=1
        #
    else:
        #
        # in this case we guess higher...
        #
        nextnum=num+1
        while nextnum<=9:
            calc(streak+1,nextnum,p/8.0)
            nextnum+=1
        #
    #
#
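The same probabilities can also be accumulated without walking every path separately. The sketch below is an alternative technique, not the recursive method of the listing: it keeps, for each streak length, the total probability of currently standing on each digit, and it assumes the same rules (guess lower on a 5 or above, eight equally likely next numbers). The name `streak_distribution` is hypothetical.

```python
def streak_distribution(maxstreak):
    """Exact probability of each final streak length, 0 through maxstreak."""
    at = {5: 1.0}                 # at[num] = probability of standing on num now
    dist = []
    for streak in range(maxstreak + 1):
        ended = 0.0
        nxt = {}
        for num, p in at.items():
            chances = 4 + abs(5 - num)            # winning next numbers, out of eight
            ended += p * (1.0 - chances / 8.0)
            if num >= 5:
                winners = range(1, num)           # we guess lower
            else:
                winners = range(num + 1, 10)      # we guess higher
            for nextnum in winners:
                nxt[nextnum] = nxt.get(nextnum, 0.0) + p / 8.0
        dist.append(ended)
        at = nxt
    return dist


if __name__ == '__main__':
    for streak, prob in enumerate(streak_distribution(9)):
        print(streak, prob)
```

Because paths that arrive at the same digit are merged, the work per streak length stays constant instead of growing with the number of paths.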
Code Listing 5.21 shows a 2-D grid initialized with both rectangles (left) and also
at random (right). We do not need a recursive floodfill to label the connected slots
on the left-hand side. Just use two loops. However, on the right-hand side loops are
insufficient, or at least more cumbersome to code than recursion.
Code Listing 5.21: Comparison of when to use loops and when to use recursion.
- - - - - - - - - - - - - - - - - - - - - - X X X - X X - X X X X X X X - X - -
- X X X X - - - - - - - - - - - - - - - - X - X - X X X - X - X X X - X X - X X
- X X X X - - - - - - - - - - - - - - - - X - X - X - - X X X - X - - X - X X X
- X X X X - - - - - - - - - - - - - - - - X X X X X X X X X X - - - - - X - X X
- X X X X - - - X X X X X X X X X X - - - - - X X X X X - - X X X X - X - - - -
- X X X X - - - X X X X X X X X X X - - - X X X - X X - - X - X X - X X - X X X
- X X X X - - - X X X X X X X X X X - - - X - X X - X X - - X X X X X X X - - -
- X X X X - - - X X X X X X X X X X - - - - X - X - - X - - X X X X X X - X X X
- X X X X - - - X X X X X X X X X X - - - - - X X X X - X X - X - X X X - X X -
- X X X X - - - X X X X X X X X X X - - - X X - X - X - X - X X X - - - - - X X
- X X X X - - - - - - - - - - - - - - - - X - - - X X X X X - X X X - X X - X -
- X X X X - - - - - - - - - - - - - - - - X X - - X - X X - X X X X - X X - - X
- X X X X - X X X X X X X X X X X X X - - X X X - X - X X X X X X X - - X - X -
- X X X X - X X X X X X X X X X X X X - - X - X X X - X X X - X X X - X - - X X
- X X X X - X X X X X X X X X X X X X - - X - - X X - X - X X - - X X X X X X X
- - - - - - X X X X X X X X X X X X X - - X - - X X X - X - X - X X X X X X X -
- - - - - - X X X X X X X X X X X X X - - X X X - X - X X - - X X X - X - - X X
- - - - - - X X X X X X X X X X X X X - - X - X X X X X - X X X - X X X X X X -
- - - - - - X X X X X X X X X X X X X - - X - X X X X X X X X X X X - - X X X -
- - - - - - X X X X X X X X X X X X X - - X X X - X X - X X - X - - - X X - X X
- - - - - - X X X X X X X X X X X X X - - - X X X - - - - - - X X X - X - X - -
- - - - - - X X X X X X X X X X X X X - - X X X X X X X X - - X - - - - X - - X
- - - - - - X X X X X X X X X X X X X - - - X X - X - - X - X - - - X X - - - X
- - - - - - X X X X X X X X X X X X X - - X X X X - X X - - X - X X X X - X X -
- - - - - - X X X X X X X X X X X X X - - - X - X - - X X X X - X - X X - X X -
- - - - - - - - - - - - - - - - - - - - - X X - - X - X - X X X - X X X X X X X
X X X X X X X X X X X X X X X - - - - - - X X X X X X X X X - X X X X X - - - X
X X X X X X X X X X X X X X X - - - - - - X X X X - X X X X X X - - X X - - - X
X X X X X X X X X X X X X X X - - - - - - X X X X X X X X - - - - - X - - - X X
X X X X X X X X X X X X X X X - - - - - - - X X X X - - - - X - - X X - - X X X
As you continue learning you will find that one of the most important considerations
at the start of any problem is deciding which tools to even use in your solution.
Chapter 6
Projects
Everyone has ideas, and part of the joy of computing is to see your ideas come alive.
Three very different types of projects are presented in this chapter with the hope
that everyone will find something that speaks to them. A few guidelines:
• Build everything up step-by-step... test as you go!
• Think about the big picture even as you struggle with the small details.
• Watch out for feature-creep. Keep in mind, what are you trying to do?
Imagine a 4 × 4 grid of tiles where one tile has been removed. There are always at
least two tiles next to the blank space and any tile next to the blank can slide over,
effectively swapping positions. Tiles move horizontally and vertically but only in
discrete chunks; it would not make any sense to slide a tile halfway into a slot.
Starting small, we begin with just text where Code Listing 6.1 shows how to
create a generic text object at screen coordinates (xp, yp) displaying an empty string
in a large font, and then also how to modify its text at a later time.
The illusion of the tiles is maintained by a rectangular grid drawn only in the
background that does not move. Arrange these however you like but also plan ahead:
we may scale the size to as much as 10 × 10, so write your code for an n × n grid.
Code Listing 6.1: Creating a Tk text object and remembering its ID number.
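# a large, bold font
f=('Times',36,'bold')
# create the text object (empty for now) and remember its Tk ID number
tkid=cnvs.create_text(xp,yp,text='',font=f)
# later, use the stored ID to change the displayed text
cnvs.itemconfigure(tkid,text='Tile #1')
#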
Tile #7
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15
Fig. 6.1: Click tile, the text at top updates the tile number as the user clicks.
We may determine which tile is clicked, if any, by translating from pixel coordinates
evnt.x and evnt.y into column and row (x, y) coordinates in the 4 × 4 puzzle.
Here x and y are each 0, 1, 2, or 3. From these 2-D values the 1-D index is found
directly using 4y + x because each row contains four slots. As shown in Figure 6.1
we might display “BLANK” if the blank space is clicked or “You missed!” if the
user happens to click somewhere completely outside the puzzle region.
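The translation from pixels to a slot index can be sketched as follows. This is only an illustration under stated assumptions: the names slot_index, x0, y0, and size are hypothetical, with (x0, y0) taken to be the pixel position of the puzzle's top-left corner and size the side length of one square slot.

```python
def slot_index(px, py, x0=50, y0=100, size=80, n=4):
    """Return the 1-D slot index for a click at pixel (px, py), or None on a miss."""
    # floor division maps a pixel offset to a column (x) and row (y) number
    x = (px - x0) // size
    y = (py - y0) // size
    if 0 <= x < n and 0 <= y < n:
        return n * y + x   # each row contains n slots
    return None            # the click landed outside the puzzle region
```

Inside a click handler this might be called as slot_index(evnt.x, evnt.y), with a result of None leading to the "You missed!" message.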
6.1 Sliding Tile Puzzle 145
Tile #13
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15
Fig. 6.2: Change tile, the text at top is still updated as before and now the tile’s text
object is also modified to show where the user has clicked.
In order to change the font size of a particular tile’s text object we must store the
Tk ID numbers of all the text objects in a list. We do not need to store Tk ID numbers
of any rectangles because they are not changing. There are two options for resetting the
previous tile back to a regular font size: (1) set the font of all tiles after every click; or
(2) remember which tile was clicked last and reset only that one. Since $n^2$ is small in
this case, either will work.
Tile #12
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15
Fig. 6.3: Neighbor of blank, while the other tiles continue to be modified as before,
the neighbors of the blank space are now treated differently when clicked.
Step 15
5 1 2 3
9 10 6 4
13 7 8
14 15 11 12
Fig. 6.4: Slide tile, where quite obviously the user has clicked the tiles in a spiral
pattern and thus the “puzzle” is not much of a challenge.
Now we want the tiles to move. At first the tile number corresponds to our list index
but after a series of moves it may not. We can find the current text using:
num=cnvs.itemcget(tiles[j],'text')
Figures 6.4, 6.5, and 6.6 show this feature in action.
∗ Hint: do not actually move the location of the text objects, just swap their displayed text instead.
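A minimal sketch of the "swap the displayed text" idea, with a plain list of strings standing in for the text shown by each Tk object (in the lab the reads would use itemcget and the writes itemconfigure). The function names are hypothetical, and the blank is assumed to be stored as an empty string.

```python
def are_neighbors(j, k, n=4):
    """True if slots j and k sit side by side in an n-by-n grid."""
    # row difference plus column difference must be exactly one
    return abs(j // n - k // n) + abs(j % n - k % n) == 1

def try_slide(labels, j, n=4):
    """Swap the label in slot j with the blank, but only if they are neighbors."""
    b = labels.index('')              # slot currently showing the blank
    if are_neighbors(j, b, n):
        labels[j], labels[b] = labels[b], labels[j]
        return True
    return False                      # not next to the blank: nothing moves

labels = [str(k) for k in range(1, 16)] + ['']   # solved 4-by-4 puzzle
```

Note that the slots themselves never move; only the strings they display are exchanged.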
Step 20
5 1 2 3
9 10 6
13 15 7 4
14 11 12 8
Step 27
9 5 1 2
13 10 6 3
14 15 7 4
11 12 8
Step 60
9 2 10 6
14 15 13 1
5 12 8
11 3 4 7
Step 100
9 14 15 6
3 12 10 2
13 7 8
5 11 4 1
0 GO 0
3 14 15 9
2 6 5
4 8 10 12
11 1 13 7
Fig. 6.7: Possible features, here a “go” button is used so that after a friend has
shuffled the puzzle both a timer (left) and an update of total moves (right) can start.
Figure 6.7 shows one way a “game” feature could be implemented. We could also
provide a shuffle button for one-player use but be careful (!) to only make random
legal moves; half of all states that result from generic shuffling are not reachable by
sliding tiles (you would need to pop out one of the tiles, which is cheating).
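One way to shuffle using only legal moves is sketched below, again with a list of strings as a stand-in for the tiles. The names legal_moves and shuffle_legally, and the default of 200 random moves, are assumptions made for illustration.

```python
from random import choice

def legal_moves(b, n=4):
    """Slots whose tile may slide into the blank at slot b."""
    row, col = b // n, b % n
    moves = []
    if row > 0:
        moves.append(b - n)     # tile above the blank
    if row < n - 1:
        moves.append(b + n)     # tile below the blank
    if col > 0:
        moves.append(b - 1)     # tile to the left
    if col < n - 1:
        moves.append(b + 1)     # tile to the right
    return moves

def shuffle_legally(labels, steps=200, n=4):
    """Apply random legal slides so the result is always reachable."""
    b = labels.index('')
    for _ in range(steps):
        j = choice(legal_moves(b, n))
        labels[j], labels[b] = labels[b], labels[j]
        b = j                   # the blank is now where the tile was
    return labels
```

Because every step is a legal slide, the shuffled state can always be solved by sliding tiles back.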
If the tiles display words instead of numbers then be careful (!) again because
repeated letters may not have interchangeable positions, and keeping track of which
is which would be a real challenge. Figure 6.8 suggests a similar idea only using an
image that was first broken into pieces with PIL. Also, we might keep track of the
moves made and print them to a file so we can replay the game later.
Time 0 Step 0
Time 60 Step 79
Fig. 6.8: Possible features, shuffle the broken-up pieces of an image using PIL.
Image of George Washington courtesy of the U.S. Army, photo credit U.S. Army
Military History Institute.
Impossible!
1 2 3 4
5 6 7 8
9 10 11 12
13 15 14
Fig. 6.9: Impossible to solve, note that the 14 and 15 have been swapped.
Two tiles are neighbors if the sum of the absolute value of their row difference and
the absolute value of their column difference is equal to one.
Likewise, two states are neighbors if a single move transitions the 4 × 4 grid of
tiles from one state to the other. While there are only $O(n^2)$ tiles in a grid there are
an astronomical $O((n^2)!/2)$ states. For a 5 × 5 puzzle that translates to $O(10^{24})$.
Computer search for any sequence of moves to solve a puzzle is thus very difficult
but (!) a related problem is so important that it now has a very, very large prize:
http://www.claymath.org/millennium/P_vs_NP/
The puzzle in Figure 6.9 offered a similar prize in the 1800s, and sales were brisk!
6.2 Anagram Scramble 153
All results in this section reflect the use of the word list: lab621.txt
Desired output is shown in Code Listing 6.2 where we identify four-letter words
found in the middle of six-letter words. Code Listing 6.3 shows two options for
storing the four-letter words, assuming they are pre-screened out of the word list.
When the user types a six-letter word we must check if the middle four characters
also form a word. We might use the in operator to do this but with a list that will
require a loop. (Even if we do not see the loop it still happens.) On the other hand,
checking in with a hashtable does not use a loop because the hash function maps
directly to the key in question. Really this is a hash set since only the keys matter.
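The difference between the two storage options can be sketched like this. The four words are toy data rather than the contents of lab621.txt, and middle_is_word is a hypothetical helper name.

```python
words4_list = ['hake', 'lamb', 'team', 'part']   # toy data, not lab621.txt
words4_set = set(words4_list)                    # a hash set: only keys matter

def middle_is_word(word6, words4):
    """Check whether the middle four letters of a six-letter word form a word."""
    middle = word6[1:5]
    # with a list, "in" loops over the items; with a set it hashes straight to the key
    return middle in words4
```

Both containers give the same answer; only the amount of work hidden inside the in operator differs.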
We now find all pairs of four- and six-letter words where the middle of the longer
word is the same as the shorter word. Partial output is shown in Code Listing 6.4.
Our word list contains O(5000) six-letter words and O(1800) four-letter words.
Imagine that every six-letter word “shakes hands” with every four-letter word.
This would total O(5000 × 1800) = O(9, 000, 000) handshakes. But of all those
millions there are only O(500) hits where we actually find one word inside another,
less than a 0.01% hit rate. This is what happens to our code if we store the four-letter
words in a list because the in operator must check all of those words in a loop.
In the hashtable case the six-letter words do not shake hands with everyone but
instead attempt to shake hands directly with a single four-letter word. This means
there are now only O(5000) handshakes and our hit rate goes up to 10%, not bad!
Code Listing 6.5 shows the six-letter words shaking hands.
Code Listing 6.5: How many times do six-letter words shake hands?
#
for word in list6:
    #
    middle = ...
    #
    if middle in ...
        # first letter, then the middle word in capitals, then last letter
        print(word[0] + middle.upper() + word[5])
        #
    #
#
Lab623: Anagrams
Example anagrams are shown in Code Listing 6.6, four-letter words only.
Code Listing 6.7 shows one way to directly check if two words are anagrams.
One word is converted into a list of characters and then we loop over the characters
in the other word. If at any point we find a character that does not match then we
immediately return failure. When characters do match we must remove them from
the character list otherwise we may fall into a common trap regarding duplicate
letters: “rare” and “area” are not anagrams.
Only when all the letters have matched do we return success. Can you think of
another way to find anagrams? Perhaps a more efficient way?
lamb balm
calm clam
team mate
slow owls
part trap
save vase
Code Listing 6.7: A direct check to see if two words are anagrams.
#
def anagrams(word1,word2):
    # assumes both words have the same length
    chlist=list(word1)
    #
    for ch in word2:
        #
        if ch not in chlist:
            return False
        else:
            chlist.remove(ch)   # each letter may be matched only once
        #
    #
    return True
#
We now find all pairs of four-letter words that are anagrams. Coincidentally, there
are O(500) such pairs. Algorithm 6.2.1 shows how to pair together all possible com-
binations of any two four-letter words. Note that the inner-loop does not reset to zero
each time because the handshakes are always initiated by the word with the lower
index in the word list, to avoid duplicate pairings.
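The pairing loop described above can be sketched as follows. This is an illustration of the idea rather than a reproduction of Algorithm 6.2.1, and the name all_pairs is hypothetical.

```python
def all_pairs(words):
    """Return every unordered pair of words exactly once."""
    pairs = []
    j = 0
    while j < len(words):
        k = j + 1          # start just past j, never back at zero
        while k < len(words):
            pairs.append((words[j], words[k]))
            k += 1
        j += 1
    return pairs
```

For n words this produces n(n-1)/2 handshakes, and the word with the lower index always appears first in each pair.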
Code Listings 6.8 and 6.9 suggest two indirect ways to check for anagrams.
Namely, if two words are anagrams then they will have the same letter frequency
tables and the same sorted letter strings. You can either import lowercase from
the string module or just make a variable:
lowercase='abcdefghijklmnopqrstuvwxyz'
Code Listing 6.10 shows a dramatic increase in algorithm efficiency. Rather than
generating all possible pairs and checking each pair for anagrams, the code shown
uses a word’s sorted letter string as the key in a hashtable where the value is a list of
all words who share that key. But those lists are precisely the anagram sets!
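A short sketch of the hashtable idea, assuming a Python dictionary as the hashtable. The function name anagram_sets is hypothetical and the book's own Code Listing 6.10 may be organized differently.

```python
def anagram_sets(words):
    """Group words by their sorted letter string; return groups of two or more."""
    table = {}
    for word in words:
        key = ''.join(sorted(word))          # e.g. 'lamb' and 'balm' both give 'ablm'
        table.setdefault(key, []).append(word)
    # a key held by only one word has no anagram partner
    return [group for group in table.values() if len(group) > 1]
```

Each word is visited once, so no pair of words ever has to shake hands directly.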
A word ladder is a type of puzzle where a starting and an ending word are specified
and to solve the puzzle you must provide a list of intermediate words such that from
one word to the next only the following changes may occur:
• Insert a letter.
• Delete a letter.
• Substitute a letter.
• Substitute the entire word for an anagram.
Table 6.1 shows puzzles where only the “substitute a letter” change is allowed
but even here we may quickly face an exceedingly large search space once again.
6.3 Collision Detection 159
Jugglers often use beanbags in their performances because they are easy to handle
and also safe if there should be any kind of accident.
Likewise, in Figure 6.10 we show the final version of a game in which we launch
beanbags to pop bubbles as they drift upward. We will build this game step-by-step
and include the following features:
• Aiming the beanbag cannon by using the arrow keys to rotate left or right.
• Launch a beanbag by hitting spacebar. The beanbag moves linearly offscreen.
• Launch multiple beanbags in rapid succession but with a limited total number.
• Randomized bubbles that move up linearly but drift back-and-forth sideways.
• Detection of collisions between beanbags and bubbles... so the bubbles pop!
Safety First
Do not attempt to play this game in real life without permission and supervision.
Code Listing 6.12: Shell for part of a program to control a beanbag cannon.
#
def leftarrow(evnt):
    global theta
    #
    theta += ...
    if theta > ...
        theta = ...
    #
    ...
#
...
root.bind('<Left>' ,leftarrow)
root.bind('<Right>',rightarrow)
Code arrow keys to move cannon side-to-side. Stay within upward range, as shown.
Fig. 6.11: Aim beanbag cannon, do not allow θ to get too close to the horizontal.
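Keeping the aim within an upward range amounts to clamping the angle. The sketch below assumes θ is measured in degrees counterclockwise from the rightward horizontal; the step size and both limits are made-up values, and rotate is a hypothetical helper.

```python
STEP = 5          # degrees per key press (assumed)
THETA_MIN = 15    # closest allowed to the right-hand horizontal (assumed)
THETA_MAX = 165   # closest allowed to the left-hand horizontal (assumed)

def rotate(theta, delta):
    """Return the new aim angle, clamped so the cannon always points upward."""
    return max(THETA_MIN, min(THETA_MAX, theta + delta))
```

A left-arrow handler might then set theta = rotate(theta, STEP) and a right-arrow handler theta = rotate(theta, -STEP).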
Fig. 6.12: Cannon moving, 45° to right (top) and then 15° over to left (bottom).
Fig. 6.13: Launch beanbag, a single flying object is created by pressing spacebar.
Code Listing 6.13 launches a single beanbag. If the object already exists then we do
not launch another one. Code Listing 6.14 is the animation loop; our beanbag moves
linearly and once offscreen we clear the Tk oval and reset obj=None. Access the
class in Code Listing 6.15 using: from beanbag import Beanbag
Code Listing 6.13: Press the spacebar to launch a beanbag from the cannon.
obj=None
#
def space(evt):
    global obj
    #
    if obj is None:   # launch only when no beanbag is in flight
        xc,yc,xpos,ypos=cnvs.coords(cannon_aim)
        ...
        obj=Beanbag(xpos,ypos, ... , ... ,cnvs)
    #
#
Code Listing 6.14: Our beanbag object moves linearly across the screen.
def tick():
    global obj
    #
    if obj is not None:
        if obj.offscreen():
            obj.delete_me()
            obj=None
        else:
            obj.move_me()
    #
    cnvs.after(1,tick)   # schedule the next animation step
#
Code Listing 6.15: Shell for the Beanbag class, filename: beanbag.py
class Beanbag:
    #
    def __init__(self,x,y,vx,vy,cnvs):
        #
        ...
        #
        self.r = 5
        self.w = int(cnvs.cget('width'))
        self.h = int(cnvs.cget('height'))
        #
        self.tkid = cnvs.create_oval( ... )
        #
    #
    def offscreen(self):
        #
        ...
        #
    #
    def delete_me(self):
        #
        self.cnvs.delete(self.tkid)
        #
    #
    def move_me(self):
        #
        ...
        #
    #
#
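The linear motion and the offscreen test can be tried out without any Tk canvas. The sketch below is not the Beanbag class itself: Projectile and launch_velocity are hypothetical names, the screen size is passed in directly, and the aim angle is assumed to be in degrees measured counterclockwise from the rightward horizontal.

```python
from math import cos, sin, radians

def launch_velocity(theta, speed):
    """Split a launch speed into (vx, vy) for an aim angle in degrees."""
    a = radians(theta)
    # screen y grows downward, so an upward launch needs a negative vy
    return speed * cos(a), -speed * sin(a)

class Projectile:
    def __init__(self, x, y, vx, vy, w, h, r=5):
        self.x, self.y = x, y         # current position of the center
        self.vx, self.vy = vx, vy     # constant velocity: linear motion
        self.w, self.h = w, h         # size of the visible region
        self.r = r                    # radius of the drawn circle

    def move_me(self, dt=1.0):
        self.x += self.vx * dt
        self.y += self.vy * dt

    def offscreen(self):
        # fully outside once the whole circle has crossed an edge
        return (self.x < -self.r or self.x > self.w + self.r or
                self.y < -self.r or self.y > self.h + self.r)
```

In a Tk version, move_me would also update the oval's coordinates on the canvas.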
Code Listing 6.16: A program both defining and using a Turtle class.
#
from math import cos,sin,pi
#
class Turtle:
    #
    def __init__(self,x,y,h):
        #
        # instance variables
        #
        self.x = x
        self.y = y
        self.h = h
        #
    #
    def move(self,r):
        #
        oldx,oldy=self.x,self.y
        #
        self.x += r*cos(self.h*pi/180.0)
        self.y += -r*sin(self.h*pi/180.0)
        #
        drawline(oldx,oldy,self.x,self.y)
        #
    ...
    #
#
######################
#
# main program
#
smidge=Turtle(0,0,90)
smidge.move(100)
#
Scope
Code Listing 6.16 shows a class definition for Turtle, from Chapter 2, where instance
data such as self.x are directly accessible by functions of the class (note the first
argument of move is named self), thus the global command is unnecessary.
Local variables such as oldx occupy memory that is allocated for their use only
during execution of the function. Once the function has ended then, as we have seen,
that portion of the system call stack may be re-allocated for other data.
Fig. 6.14: List of beanbags, note how their direction is not forever tied to the cannon.
First, the Beanbag class code does not change at all here. What has to change in the
program to make multiple beanbags? Figure 6.14 and Code Listing 6.17 show how
each beanbag knows its own position and velocity independently of the others.
Code Listing 6.17: Loop backwards to avoid skipping over any beanbags.
def tick():
    #
    j=len(objlist)-1
    while j>=0:
        #
        if objlist[j].offscreen():
            objlist[j].delete_me()
            objlist.pop(j)
        else:
            objlist[j].move_me()
        j-=1
    ...
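Why loop backwards? The toy comparison below removes every copy of a value from a list of numbers, first counting up (buggy on purpose) and then counting down. Both function names are hypothetical and the numbers simply stand in for beanbag objects.

```python
def remove_forward(values, bad):
    """Buggy on purpose: after a pop the next item slides into slot j and is skipped."""
    values = list(values)
    j = 0
    while j < len(values):
        if values[j] == bad:
            values.pop(j)
        j += 1
    return values

def remove_backward(values, bad):
    """Counting down is safe: a pop only shifts items already visited."""
    values = list(values)
    j = len(values) - 1
    while j >= 0:
        if values[j] == bad:
            values.pop(j)
        j -= 1
    return values
```

With the input [1, 0, 0, 2] the forward loop leaves one 0 behind while the backward loop removes both.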
Code Listing 6.18: Bubbles move up linearly but drift back-and-forth sideways.
class Bubble:
    ...
    #
    def move_me(self):
        #
        if random()<0.5:
            self.x += self.vx*dt   # drift one way...
        else:
            self.x -= self.vx*dt   # ...or the other, at random
        #
        self.y -= self.vy*dt       # always move up the screen
        ...
#
Lab634: Bubbles
See Code Listing 6.18 for movement. Figures 6.15 and 6.16 show one, then many.
Fig. 6.15: Bubble drifts upward, where obviously we would prefer a list of them.
Fig. 6.16: List of bubbles, no collisions yet so the beanbags just fly right on by.
Lab635: Collisions
Code Listing 6.19: Handshake problem, two groups: beanbags and bubbles.
def tick():
    #
    ...
    #
    for obj in objlist: # beanbags do not pop
        #
        j=len(bubbles)-1
        while j>=0:
            #
            if obj.collide(bubbles[j]):
                #
                ...
                #
            j-=1
    ...
Code Listing 6.20: Beanbags are the ones detecting collisions, for now.
class Beanbag:
    ...
    #
    def collide(self,the_bubble):
        #
        x1=self.x
        y1=self.y
        #
        x2=the_bubble.x
        y2=the_bubble.y
        #
        ...
        #
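The elided test is left for the reader, but one common approach for two circles is to compare the distance between their centers with the sum of their radii. The standalone function below is a sketch of that approach under the assumption that both objects are drawn as circles; its name and argument order are hypothetical.

```python
def circles_collide(x1, y1, r1, x2, y2, r2):
    """True if two circles overlap or just touch."""
    dx = x2 - x1
    dy = y2 - y1
    # compare squared distances so no square root is needed
    return dx * dx + dy * dy <= (r1 + r2) ** 2
```

A collide method could gather x1, y1, x2, y2 as in the listing above and then apply the same comparison using each object's radius.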
Fig. 6.17: Collisions, where clearly the bubbles hit by a beanbag now pop.
The only differences between the two classes are movement, radius, fill color, outline
color, and outline width. A beanbag is small and gray and moves quickly in a straight line.
A bubble is large with a white border and drifts upward. Code Listing 6.21 shows where
the collide function should be defined and Code Listing 6.22 shows how to inherit from
a more generic super-class type.
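A rough sketch of how such a super-class arrangement might look, without any Tk drawing. The class name FlyingObject, the attribute names, and the specific radii are all assumptions for illustration; they are not taken from Code Listings 6.21 or 6.22.

```python
from random import random

class FlyingObject:
    radius = 5                        # overridden by sub-classes as needed

    def __init__(self, x, y, vx, vy):
        self.x, self.y = x, y
        self.vx, self.vy = vx, vy

    def move_me(self, dt=1.0):
        # default behavior: straight-line motion
        self.x += self.vx * dt
        self.y += self.vy * dt

    def collide(self, other):
        # defined once here, so every sub-class inherits it
        dx, dy = other.x - self.x, other.y - self.y
        return dx * dx + dy * dy <= (self.radius + other.radius) ** 2

class Beanbag(FlyingObject):
    radius = 5                        # small; keeps the linear move_me

class Bubble(FlyingObject):
    radius = 20                       # large

    def move_me(self, dt=1.0):
        # drift sideways at random while always rising
        if random() < 0.5:
            self.x += self.vx * dt
        else:
            self.x -= self.vx * dt
        self.y -= self.vy * dt
```

Only what differs between the two kinds of object is written in the sub-classes; everything shared lives in one place.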
7.1 Predator-Prey
We first model rabbits that move in discrete chunks (i.e., step by 2r) on a grid.
Code Listings 7.1 and 7.2 get you started on the class-based lonely rabbit simulation
shown in Figure 7.1, where you may want to “steal” from your beanbag code, too.
Code Listing 7.1: Defining a rabbit object, filename: rabbit.py
#
class Rabbit:
    #
    def __init__(self,cnvs):
        #
        self.hungry=False
        #
    #
    def mark_as_hungry(self):
        #
        self.hungry=True
        #
    #
    def check_is_hungry(self):
        #
        return self.hungry
        #
    #
#
Fig. 7.1: A lonely rabbit stuck on a lattice: up, down, left, right.
Code Listing 7.2: A list of rabbits but with only one object, for now.
#
from rabbit import Rabbit
#
rabbits=[]
#
def tick():
    #
    ...
    #
#
rabbits.append( Rabbit(cnvs) )
#
A lonely rabbit no more. Code Listing 7.3 uses an unnatural breeding mechanism
but the real question is: do the observed results in Figure 7.2 match nature or not?
Code Listing 7.3: Breeding rabbits but with an artificial limit imposed.
n=len(rabbits)
j=0
while j<n and len(rabbits)<1000:
    #
    rabbits.append( Rabbit(cnvs) ) # breed
    j+=1
174 7 Modeling
[Plot: Size of Population (vertical axis) against Time, number of steps (horizontal axis).]
Fig. 7.3: Carrying capacity, the initial population boom levels off quickly → logistic.
Lab713: Hunger
A less artificial mechanism for controlling population growth is to model the competition
for food as shown in Code Listings 7.4 and 7.5. If two rabbits are close
enough together then one will get hungry (at random) because there is only so much
food to eat. Figures 7.3 and 7.4 show the results, a better match than before.
Code Listing 7.4: Handshake problem, one group: rabbits.
j=0
while j<len(rabbits):
    #
    k=j+1
    while k<len(rabbits):
        #
        if rabbits[j].collide(rabbits[k]):
            if ...
                rabbits[j].mark_as_hungry()
            else:
                rabbits[k].mark_as_hungry()
        ...
Fig. 7.4: Competition for food, some neighboring rabbits are about to go hungry.
Code Listing 7.5: Looping backwards, again, so as not to skip over any rabbits.
j=len(rabbits)-1
#
while j>=0:
    #
    if rabbits[j].check_is_hungry():
        #
        rabbits[j].delete_me() # die/starve
        rabbits.pop(j)
    else:
        #
        rabbits[j].move_me() # move
        ...
[Plot: horizontal axis is Time, number of steps.]
Fig. 7.5: Maximum age, how old is the oldest rabbit? Unfortunately not very old.
[Two plots, top and bottom: horizontal axis is Age, number of steps.]
Fig. 7.6: Age distribution. Top: after 13 steps. Bottom: after 32 steps.
[Plot titled Population Dynamics: Population against Time, number of steps, with one curve for rabbits and one for wolves.]
Fig. 7.7: Population dynamics, rabbits and wolves alternate between peak and valley.
Lab715: Wolves
Code Listing 7.7 shows a wolf always winning the “coin flip” but also needing to eat
every four steps or it dies of hunger itself. Wolves have two offspring but only once
at age ten. Results are shown in Figures 7.7 and 7.8. Since the underlying parameters
are ad hoc you may tune them as you see fit, just be prepared for major changes.
Code Listing 7.7: Handshake problem, two groups: wolves and rabbits.
shuffle(wolves)
for wolf in wolves:
    #
    j=len(rabbits)-1
    while j>=0:
        if wolf.collide(rabbits[j]):
            #
            rabbits[j].delete_me() # die/eaten
            rabbits.pop(j)
            #
            wolf.reset_hunger()
        ...
Fig. 7.8: Two extreme population snapshots. Top: step 55, 12 rabbits, 61 wolves.
Bottom: step 110, 52 rabbits, 17 wolves.
Analytic Model
A purely mathematical model can also describe our two interacting populations:
$$\begin{aligned}
dx &= c_1 x - c_2 x y \\
dy &= -c_3 y + c_4 x y
\end{aligned}$$
At each step the changes are calculated from the current population sizes and then
each population size is updated. Obviously this is a coupled system with each variable
depending on the other variable’s value, too.
As shown in Figure 7.9 we call x the rabbits and y the wolves, using:
$$c_1 = 4.00, \quad c_2 = 1.00, \quad c_3 = 2.00, \quad c_4 = 0.25$$
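The update rule can be sketched in a few lines. The four constants are the ones given above, but the step size dt and the starting populations are assumptions made here for illustration, since neither is stated in the text; the function name simulate is also hypothetical.

```python
def simulate(x, y, steps, dt=0.001, c1=4.00, c2=1.00, c3=2.00, c4=0.25):
    """Step the coupled rabbit (x) and wolf (y) populations forward in time."""
    history = [(x, y)]
    for _ in range(steps):
        # both changes are calculated from the current sizes...
        dx = c1 * x - c2 * x * y
        dy = -c3 * y + c4 * x * y
        # ...and only then is each population updated
        x += dx * dt
        y += dy * dt
        history.append((x, y))
    return history
```

With these constants both changes are zero when x = 8 and y = 4, so a run started there stays put, while any other positive start cycles around that point.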
[Plot titled Population Dynamics: Population against Time, number of steps, with one curve for rabbits and one for wolves.]
Fig. 7.9: Population dynamics, rabbits and wolves alternating but not as before.
Fig. 7.10: Knobs attached to “springs” but only the middle knob moves, for now.
Code Listings 7.8 and 7.9 show a generic “knob” and Figures 7.10, 7.11, and 7.12
show its motion over time. We calculate acceleration (Hooke and Newton) based on
vertical displacement from a desired position halfway between the two neighbors.
Code Listing 7.9: Shell of a program that uses three knob objects.
def tick():
    #
    knob2.move_me(knob1,knob3) # only knob2 moves
    #
    cnvs.coords(tkid1, ... ,knob1.y, ... ,knob2.y)
    ...
#
knob1=Knob( ... , ... ,cnvs)
knob2=Knob( ... , ... ,cnvs)
knob3=Knob( ... , ... ,cnvs)
#
tkid1=cnvs.create_line( ... ,fill='black',width=2)
tkid2=cnvs.create_line( ... ,fill='black',width=2)
[Plot: horizontal axis is Time, steps.]
Fig. 7.11: Vertical position of the middle knob as it oscillates between its neighbors.
7.2 Laws of Motion 183
Fig. 7.12: Hooke’s Law, motion governed by displacement and spring constant.
Fig. 7.13: Hooke’s Law, without the arrow we could not determine which direction
the knob is moving. Top: after 0.4 seconds. Bottom: after 1.6 seconds.
Still only the middle knob moves but now there is an entire list of objects.
Code Listing 7.10: A list of knobs but still only one knob is moving.
def tick():
    #
    knobs[4].move_me(knobs[3],knobs[5]) # middle knob
    #
    cnvs.coords(tkid[3], ... , ... , ... , ... )
    ...
#
j=0
while j<9:
    #
    if j==4:
        knobs.append( Knob( ... , ... ,cnvs) )
    else:
        knobs.append( Knob( ... , ... ,cnvs) )
    j+=1
[Plot: Velocity, vy (vertical axis) against Time, steps (horizontal axis).]
Fig. 7.14: Vertical velocity of the middle knob as it oscillates between its neighbors.
Fig. 7.15: Hooke’s Law, depending on which knobs are fixed in place an asymmetry
can be established. Top: after 0.2 seconds. Bottom: after 1.2 seconds.
Lab723: Asymmetry
Figures 7.15 and 7.16 show multiple knobs moving. Since the initial displacement
is asymmetric relative to the fixed knobs an “interesting” sequence unfolds.
We might also connect four nearest neighbors in 2-D (or six in 3-D) to model
a cloth mesh or even solid objects. This suggests the possibility of using computer
modeling in the real-time development of industrial products where, for instance,
simulation could limit the amount of time wasted re-shipping prototypes.
Code Listing 7.11: A loop over the list of knobs tells each to move.
def tick():
    #
    j=1
    while j<6:
        #
        knobs[j].move_me(knobs[j-1],knobs[j+1])
        j+=1
    ...
Fig. 7.16: Hooke’s Law after 2.2 seconds. Each knob has the same desire to be
halfway between its two nearest neighbors. Only... they might be moving, too.
Lab724: Damping
Figures 7.17 and 7.18 show many more knobs and an extreme initial displacement.
The result is a wave that propagates first to the right and then back to the left again.
It would continue reflecting back-and-forth forever except that Code Listing 7.12
adds a damping term to ay so that the chain will eventually settle down.
Code Listing 7.12: A damping term added to each knob’s acceleration calculation.
class Knob:
    ...
    def move_me(self,knobLeft,knobRight):
        #
        yT = 0.5*(knobLeft.y+knobRight.y)   # target: halfway between neighbors
        #
        ay = 0.1*(yT-self.y)-0.5*self.vy # ad hoc damping
        #
        self.vy += ay*dt
        self.y += self.vy*dt
        ...
Fig. 7.17: Hooke’s Law plus damping, fixed ends and extreme initial conditions.
Fig. 7.18: Hooke’s Law plus damping, a wave pattern emerges and propagates back-
and-forth across the chain. Top: after 0.2 seconds. Bottom: after 1.0 seconds.
Fig. 7.19: Hooke’s Law plus damping and gravity, fixed ends. Top: initial conditions.
Bottom: after 1.2 seconds.
Lab725: Gravity
Code Listing 7.13 now adds an ad hoc gravity term to our ad hoc spring plus ad hoc
damping calculation. In this case, without gravity there would be no motion because
the initial conditions do not include any displacement. The two end-knobs are still
fixed in place and Figures 7.19 and 7.20 show motion toward a steady state.
Code Listing 7.13: A gravity term added to each knob’s acceleration calculation.
class Knob:
    ...
    def move_me(self,knobLeft,knobRight):
        #
        yT = 0.5*(knobLeft.y+knobRight.y)   # target: halfway between neighbors
        #
        ay = 10.0*(yT-self.y)-0.5*self.vy+10.0 # ad hoc
        #
        self.vy += ay*dt
        self.y += self.vy*dt
        ...
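The approach to a steady state can be checked without any graphics. The sketch below reuses the acceleration formula from the listing above but strips out the canvas; the time step, the number of knobs, the starting height, and the function name settle are all assumptions.

```python
DT = 0.01   # time step (assumed; not stated in the listing)

class Knob:
    def __init__(self, y):
        self.y = y       # vertical position, screen style: larger y is lower
        self.vy = 0.0    # vertical velocity

    def move_me(self, knobLeft, knobRight, dt=DT):
        yT = 0.5 * (knobLeft.y + knobRight.y)            # halfway between neighbors
        ay = 10.0 * (yT - self.y) - 0.5 * self.vy + 10.0  # spring, damping, gravity
        self.vy += ay * dt
        self.y += self.vy * dt

def settle(n=9, steps=20000, y0=100.0):
    """Run a chain of n knobs with both end-knobs fixed; return the final heights."""
    knobs = [Knob(y0) for _ in range(n)]
    for _ in range(steps):
        for j in range(1, n - 1):          # the two ends never move
            knobs[j].move_me(knobs[j - 1], knobs[j + 1])
    return [k.y for k in knobs]
```

At rest each moving knob sits exactly 1.0 below the midpoint of its neighbors (the gravity term divided by the spring constant), which for nine knobs gives a sag of 16 at the middle.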
Fig. 7.20: Hooke’s Law plus damping and gravity, fixed ends. Steady state.
Fig. 7.21: U.S. Navy carrier group in the Pacific Ocean. Image courtesy of the U.S.
Navy, photo credit Mass Communication Specialist 2nd Class Walter M. Wayman.
A Technological World
Figure 7.21 shows one way nations project power around the world. It is not an
insignificant task to develop, build, and maintain a carrier group.
Consider now the following aspects of these operations...
• Hull design.
• Navigation and control.
• Weapon systems.
• On-board defensive measures.
• Communication.
• Communication security.
...and imagine all the ways that computational modeling can impact those efforts.
7.3 Bioinformatics
The Levenshtein distance finds the total number of edits necessary to transform from
one word to another. Consider first the case where two words have the same length
and just count the total differences. Code Listing 7.14 shows a few test cases.
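For the same-length case the count is a single pass over both words. The function name count_differences and the example words used with it here are illustrative choices, not the test cases from Code Listing 7.14.

```python
def count_differences(word1, word2):
    """Number of positions at which two same-length words disagree."""
    if len(word1) != len(word2):
        raise ValueError('words must have the same length')
    return sum(1 for a, b in zip(word1, word2) if a != b)
```

For instance, "cart" and "card" differ only in their last letter, so their count is one.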
Code Listing 7.14: A few test cases, counting the number of different letters.
If the lengths differ by one then we can use a loop to find where in the shorter word
we should insert a “gap” for the best possible match. Code Listing 7.15 shows two
examples where only the best placement of the gap is reported and the associated
number is the sum of the difference count and the number of gaps.
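The single-gap search can be sketched as a loop over every slot where the gap might go. The name best_single_gap is hypothetical, and when two placements tie this version simply keeps the first one found.

```python
def best_single_gap(longer, shorter):
    """Try a gap in every slot of the shorter word; return (gapped word, score)."""
    best_text, best_score = None, None
    for p in range(len(shorter) + 1):
        gapped = shorter[:p] + '-' + shorter[p:]
        # count letter mismatches, skipping the gap itself
        diffs = sum(1 for a, b in zip(longer, gapped) if b != '-' and a != b)
        score = diffs + 1                  # difference count plus one gap
        if best_score is None or score < best_score:
            best_text, best_score = gapped, score
    return best_text, best_score
```

Called with "kitchen" and "kitten" it reports kit-ten with a score of 2, and with "cart" and "cat" it reports ca-t with a score of 1, matching the two examples.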
Code Listing 7.15: Two examples of inserting a single gap with a loop.
kitchen
kit-ten
2
cart
ca-t
1
Code Listing 7.16 recursively places multiple gaps and Code Listing 7.17 reports
partial results of the corresponding search for a best match.
Code Listing 7.16: Recursive placement of gaps, all possibilities.
#
def recur(word1,word2,j,chlist,gaps,maxgaps):
    #
    if len(chlist)==len(word1):
        ...
    else:
        #
        # try inserting a gap in this slot
        #
        if gaps<maxgaps:
            recur(word1 ,word2 ,
                  j     ,chlist+['-'] ,
                  gaps+1,maxgaps )
        #
        # try inserting the actual letter
        #
        if j<len(word2):
            recur(word1 ,word2 ,
                  j+1   ,chlist+[word2[j]],
                  gaps  ,maxgaps )
#
Code Listing 7.17: Trying all possible arrangements. Looking for the best match.
:
kitchens
ki-t-ten
6
kitchens
ki-tt-en
6
kitchens
ki-tte-n
5
kitchens
ki-tten-
4
kitchens
kit--ten
5
kitchens
kit-t-en
5
kitchens
kit-te-n
4
kitchens # best
kit-ten-
3
kitchens
kitt--en
5
kitchens
kitt-e-n
4
kitchens # best
kitt-en-
3
kitchens
kitte--n
5
kitchens
kitte-n-
4
kitchens
kitten--
5
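A complete version of the recursive search might look like the sketch below. It follows the same gap-first, letter-second order as the recursion shown earlier, but it is written as a generator and the scoring helper is separate; both choices, along with the names score and arrangements, are assumptions rather than the book's own code.

```python
def score(word1, gapped):
    """Difference count (ignoring gap slots) plus the number of gaps."""
    diffs = sum(1 for a, b in zip(word1, gapped) if b != '-' and a != b)
    return diffs + gapped.count('-')

def arrangements(word1, word2, j=0, chlist=None, gaps=0, maxgaps=None):
    """Yield every way to pad word2 with gaps out to the length of word1."""
    if chlist is None:
        chlist = []
        maxgaps = len(word1) - len(word2)
    if len(chlist) == len(word1):
        yield ''.join(chlist)              # a finished arrangement
        return
    if gaps < maxgaps:                     # try inserting a gap in this slot
        yield from arrangements(word1, word2, j, chlist + ['-'],
                                gaps + 1, maxgaps)
    if j < len(word2):                     # try inserting the actual letter
        yield from arrangements(word1, word2, j + 1, chlist + [word2[j]],
                                gaps, maxgaps)
```

For "kitchens" and "kitten" this produces 28 arrangements, and taking the minimum of score over all of them gives 3, in agreement with the two lines marked best.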
We naively apply the previous technique to our BCHE gene sequences, where only
the difference count is required if we stay within the following same-length groups:
• human, tiger, cat, dog, cattle, horse, orangutan
• mouse, chicken
Code Listing 7.18: Which pair of sequences has the closest similarity measure?
Dynamic Programming
      1    2    3    4    5    6    7    8    9   10
 0   12  115  176  149  195  187  330  331  306  478
 1       109  172  144  190  182  324  322  301  477
 2            192  162  193  183  335  335  303  492
 3                 154  175  170  341  337  282  468
 4                      172  168  332  331  281  473
 5                            23  348  337  132  471
 6                                343  329  148  464
 7                                     137  445  509
 8                                          455  513
 9                                               566
Code Listing 7.20 shows results for the BCHE gene, where a smaller score means
more similar, and Code Listing 7.21 outlines the code in our main program.
Code Listing 7.21: Handshake problem, one group: BCHE gene sequences.
#
filenames=open('list_of_files.txt','r').read().split()
j=0
while j<len(filenames):
    #
    gene1=open(filenames[j]+'.txt','r').read().strip()
    #
    k=j+1
    while k<len(filenames):
        ...
        result=levenshtein_distance(gene1,gene2)
#
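The function called above is not shown in this section. A minimal sketch of the standard table-filling (dynamic programming) recurrence for Levenshtein distance follows; it may differ in detail from the author's version, and keeping only two rows of the table at a time is a choice made here to save memory on long gene sequences.

```python
def levenshtein_distance(s, t):
    """Fewest single-character insertions, deletions, or substitutions from s to t."""
    prev = list(range(len(t) + 1))         # row for the empty prefix of s
    for i in range(1, len(s) + 1):
        curr = [i]                         # turning s[:i] into '' takes i deletions
        for j in range(1, len(t) + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            curr.append(min(prev[j] + 1,          # delete from s
                            curr[j - 1] + 1,      # insert into s
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]
```

On the earlier word examples this agrees with the gap-based search: "kitchens" and "kitten" are three edits apart.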
Tree of Life
For example, tiger and cat have the best similarity score (12) so they are paired
together first. Then, all other scores involving tiger or cat are replaced with an average
from the two. Eventually we will also pair together pairs rather than individuals
but the same rules apply. The process continues:
• tiger and cat
• human and orangutan
• dog and tiger/cat
• mouse and rat
• chimpanzee and human/orangutan
• ...
• ALL and chicken
It should not be surprising that chicken is matched last since it is the only non-
mammal in the list. Code Listing 7.22 on the following page shows our final result.
|---horse
|-------|
| |---cattle
|---|
| | |---cat
| | |---|
| | | |---tiger
| |---|
| |-------dog
|---|
| | |---orangutan
| | |---|
| | | |---human
| |-------|
| |-------chimpanzee
|---|
| | |---rat
| |---------------|
| |---mouse
---|
|-----------------------chicken
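The pairing process that builds such a tree can be sketched as follows: repeatedly pair the two closest groups, then score the new pair against everything else by averaging. The function name build_tree, the use of nested tuples for the tree, and the dictionary keyed by frozenset are all assumptions made for this illustration.

```python
def build_tree(names, dist):
    """dist maps frozenset({a, b}) to a score; returns the tree as nested tuples."""
    clusters = list(names)
    dist = dict(dist)                      # work on a copy
    while len(clusters) > 1:
        # find the closest pair among the current groups
        best = None
        for j in range(len(clusters)):
            for k in range(j + 1, len(clusters)):
                d = dist[frozenset((clusters[j], clusters[k]))]
                if best is None or d < best[0]:
                    best = (d, clusters[j], clusters[k])
        _, a, b = best
        merged = (a, b)
        rest = [c for c in clusters if c != a and c != b]
        # scores involving a or b are replaced with the average of the two
        for c in rest:
            dist[frozenset((merged, c))] = 0.5 * (dist[frozenset((a, c))] +
                                                  dist[frozenset((b, c))])
        clusters = rest + [merged]
    return clusters[0]
```

With a small set of made-up scores in which tiger and cat are closest, dog is next, and chicken is far from everything, the result nests as (chicken, (dog, (tiger, cat))).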
Postscript
I wrote this book while on a sabbatical from Thomas Jefferson High School for
Science and Technology where I have taught Computer Science since 2001.
I am fortunate to have worked in the Computer Systems Lab at TJ since 2004
and this past year in the Center for Computational Fluid Dynamics at George Mason
University. I owe more than I could ever repay to both of these places.
This book is designed to speak to a wide and varied audience. News related to
high school and undergraduate participation in Computer Science has become grim
in recent years. Efforts are underway to change those trends and over time I believe
the field will grow. I have myself been writing courses since 2003 and this book is
my view on how best to present CS to beginning students.
I hope you find something here to grab hold of, something to spark your interest,
something even that takes your breath away.