Contracts
15-122: Principles of Imperative Computation
Frank Pfenning
Lecture 2
January 17, 2013
1 Introduction
In these notes we review contracts, which we use to collectively denote
function contracts, loop invariants, and other assertions about the program.
Contracts will play a central role in this class, since they represent the key
to connect algorithmic ideas to imperative programs. We follow the exam-
ple from lecture, developing annotations to a given program that express
the contracts, thereby making the program understandable (and allowing
us to find the bug).
In terms of our learning goals, this lecture addresses:
Programming: Contracts
If you have not seen this example, we invite you to read this section by
section to see how much of the story you can figure out on your own before
moving on to the next section.
2 A Mysterious Program
You are a new employee in a company, and a colleague comes to you with
the following program, written by your predecessor who was summarily
fired for being a poor programmer. Your colleague claims he has tracked a
bug in a larger project to this function. It is your job to find and correct this
bug.
Before you read on, you might examine this program for a while to try
to determine what it does, or is supposed to do, and see if you can spot the
problem.
3 Forming a Conjecture
The first step is to execute the program on some input values to see its
results. The code is in a file called mystery2.c0, so we invoke the coin inter-
preter to let us experiment with the code.
% coin mystery2.c0
C0 interpreter (coin) 0.3.2 ’Nickel’ (r256, Thu Jan 3 14:18:03 EST 2013)
Type ‘#help’ for help or ‘#quit’ to exit.
-->
At this point we can type in statements and they will be executed. One
form of statement is an expression, in which case coin will show its value.
For example:
--> 3+8;
11 (int)
-->
We can also use the function in the files that we loaded when we started
coin. In this case, the mystery function is called f, so we can evaluate it on
some arguments.
--> f(2,3);
8 (int)
--> f(2,4);
16 (int)
--> f(1,7);
1 (int)
--> f(3,2);
9 (int)
-->
From these and similar examples, you might form the conjecture that
f(x, y) = x^y, that is, x to the power y. One can confirm this with a few more
values, such as
--> f(-2,3);
-8 (int)
--> f(2,8);
256 (int)
--> f(2,10);
1024 (int)
-->
It seems to work out! Our next task is to see why this function actually
computes the power function. Understanding this is necessary so we can
try to find the error and correct it.
To understand why this loop works we need to find a so-called loop in-
variant: a quantity that does not change throughout the loop. In this exam-
ple, when y is a power of 2, the value of r (which remains 1) is a loop
invariant. Can you see a loop invariant involving just x and y?
Going back to our earlier conjecture, we are trying to show that this
function computes x^y. Interestingly, after every iteration of the loop, this
quantity is exactly the same! Before the first iteration it is 2^8 = 256. After
the first iteration it is 4^4 = 256. After the second iteration it is 16^2 = 256.
After the third iteration it is 256^1 = 256. Let's note it down in the table.
iteration     x     y     r     x^y
    0         2     8     1     256
    1         4     4     1     256
    2        16     2     1     256
    3       256     1     1     256
When we exit the loop with y = 1 (and r still 1), the function returns
r * x = 1 * x = x = x^1 = x^y, so in this case it indeed computes the power.
7 Termination
In this case it is easy to see that the loop always terminates, because if we
start with y = 2^n we go around the loop exactly n times before y = 2^(n-n) = 1
and we exit the loop. We used here that (2^k)/2 = 2^(k-1) for k ≥ 1.
Our next challenge then will be to extend this result to arbitrary y. Be-
fore we do this, now that we have some positive results, let’s try to see if
we find some counterexample since the function is supposed to have a bug
somewhere!
Please try to find a counterexample to the conjecture that f(x, y) = x^y
before you move on, taking the above information into account.
8 A Counterexample
We don’t have to look at powers of 2 — we already know the function
works correctly there. Some of the earlier examples were not powers of
two, and the function still worked:
--> f(2,3);
8 (int)
--> f(-2,3);
-8 (int)
--> f(2,1);
2 (int)
-->
--> f(2,0);
2 (int)
--> f(2,-1);
2 (int)
-->
But f(2,0) should be 2^0 = 1, not 2, so we have found a counterexample to
our conjecture. The call f(2,-1) also raises the question what the function
should even mean for a negative exponent.
9 Imposing a Precondition
Let's go back to a mathematical definition of the power function x^y on inte-
gers x and y. We define:

x^0 = 1
x^(y+1) = x * x^y    for y ≥ 0
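In C0, this requirement is written as a //@requires annotation between the
function header and its body. For the mystery function (with its body exactly
as in the full listing below), this first half of the contract looks as follows:

int f (int x, int y)
//@requires y >= 0;
{
  int r = 1;
  while (y > 1) {
    if (y % 2 == 1) {
      r = x * r;
    }
    x = x * x;
    y = y / 2;
  }
  return r * x;
}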
This is the first part of what we call the function contract. It expresses what
the function requires of any client that calls it, namely that the second ar-
gument is non-negative. It is an error to call it with a negative argument; no
promises are made about what the function might return otherwise. It
might even abort the computation due to a contract violation.
But a contract usually has two sides. What does f promise? We know it
promises to compute the exponential function, so this should be formally
expressed.
10 Promising a Postcondition
The C0 language does not have a built-in power function. So we need to
write it explicitly ourselves. But wait! Isn’t that what f is supposed to do?
The idea in this and many other examples is to capture a specification in the
simplest possible form, even if it may not be computationally efficient, and
then promise in the postcondition to satisfy this simple specification. Here,
we can transcribe the mathematical definition into a recursive function.
int POW (int x, int y)
//@requires y >= 0;
{
if (y == 0)
return 1;
else
return x * POW(x, y-1);
}
In the rest of the lecture we often silently go back and forth between x^y
and POW(x, y). Now we incorporate POW into a formal postcondition for
the function. Postconditions have the form //@ensures e;, where e is a
boolean expression. They are also written before the function body, by con-
vention after the preconditions. Postconditions can use a special variable
\result to refer to the value returned by the function.
int f (int x, int y)
//@requires y >= 0;
//@ensures \result == POW(x,y);
{
int r = 1;
while (y > 1) {
if (y % 2 == 1) {
r = x * r;
}
x = x * x;
y = y / 2;
}
return r * x;
}
Note that as far as the function f is concerned, if we are considering calling
it we do not need to look at its body at all. Just looking at the pre- and post-
conditions tells us how it may be used. To check the contracts dynamically
while experimenting, we load the file with coin's -d flag:
% coin solution2b.c0 -d
C0 interpreter (coin) 0.3.2 ’Nickel’ (r256, Thu Jan 3 14:18:03 EST 2013)
Type ‘#help’ for help or ‘#quit’ to exit.
--> f(3,2);
9 (int)
--> f(3,-1);
foo.c0:12.4-12.20: @requires annotation failed
The fact that the @requires annotation fails in the second example call means
that our call is to blame, not f. Conversely, if an @ensures annotation fails,
as it would for example for the call f(2,0) with the buggy body above (it
returns 2 although POW(2,0) = 1), then the function f does not satisfy its
contract and is therefore to blame.
What quantity remains invariant now, throughout the loop? Try to form a
conjecture for a more general loop invariant before reading on.
Let's make a table again, this time to trace a call when the exponent is
not a power of two, say, while computing 2^7 by calling f(2, 7). Here we
write b and e for the changing values of the base and exponent inside the
loop, so that x and y can continue to refer to the original arguments.
iteration     b     e     r     b^e
    0         2     7     1     128
    1         4     3     2      64
    2        16     1     8      16
As we can see, b^e is not invariant, but r * b^e = 128 is! The extra factor from
the equation on the previous page is absorbed into r.
We now express this proposed invariant formally in C0. This requires
the @loop_invariant annotation. It must come immediately before the
loop body, but it is checked just before the loop exit condition. We would
like to say that the expression r * POW(b,e) is invariant, but this is not
possible directly.
Loop invariants in C0 are boolean expressions which must be either true
or false. We can achieve this by stating that r * POW(b,e) == POW(x,y).
Observe that x and y do not change in the loop, so this guarantees that
r * POW(b,e) never changes either. But it says a little more, stating what
the invariant quantity is in terms of the original function parameters.
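The listing of the revised function is not reproduced above, so here is a
sketch consistent with the discussion, introducing local variables b, e, and r
so that the parameters x and y remain unchanged:

int f (int x, int y)
//@requires y >= 0;
//@ensures \result == POW(x, y);
{
  int b = x;   /* base */
  int e = y;   /* exponent */
  int r = 1;   /* result accumulator */
  while (e > 0)
  //@loop_invariant e >= 0;
  //@loop_invariant r * POW(b, e) == POW(x, y);
  {
    if (e % 2 == 1) {
      r = b * r;
    }
    b = b * b;
    e = e / 2;
  }
  return r;
}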
Now when the exponent y = 0 we skip the loop body and return r = 1,
which is the right answer for x^0! Indeed:
% coin solution2d.c0 -d
Coin 0.2.3 "Penny" (r1478, Thu Jan 20 16:14:15 EST 2011)
Type ‘#help’ for help or ‘#quit’ to exit.
--> f(2,0);
1 (int)
-->
Init: The invariant holds initially. When we enter the loop, e = y and y ≥ 0
by the precondition of the function. Done.

Preservation: Assume the invariant holds just before the exit condition is
checked. We have to show that it is true again when we reach the exit
condition after one iteration of the loop.

Assumption: e ≥ 0.

To show: e' ≥ 0 where e' = e/2, with integer division. This clearly
holds.
Init: The invariant holds initially, because when entering the loop we have
r = 1, b = x and e = y, so r * POW(b, e) = 1 * POW(x, y) = POW(x, y).

Preservation (the case where e is odd): Assume r * POW(b, e) = POW(x, y)
holds just before an iteration. After the iteration we have r' = b * r,
b' = b * b, and e' = e/2 = (e-1)/2, since e is odd. We reason:

r' * POW(b', e') = (b * r) * POW(b * b, (e-1)/2)
                 = (b * r) * POW(b, 2 * ((e-1)/2))   since (a^2)^c = a^(2*c)
                 = (b * r) * POW(b, e-1)             since e-1 is even
                 = r * POW(b, e)                     since a * a^c = a^(c+1)
                 = POW(x, y)                         by assumption
16 Termination
The previous argument for termination still holds. By the loop invariant, we
know that e ≥ 0. When we enter the body of the loop, the condition must
be true so e > 0. Now we just use that e/2 < e for e > 0, so the value
of e is strictly decreasing and positive, which, as an integer, means it must
eventually become 0, upon which we exit the loop and return from the
function after one additional step.
17 A Surprise
Now, let’s try our function on some larger numbers, computing some pow-
ers of 2.
% coin -d solution2e.c0
Coin 0.2.3 "Penny" (r1478, Thu Jan 20 16:14:15 EST 2011)
Type ‘#help’ for help or ‘#quit’ to exit.
--> f(2,30);
1073741824 (int)
--> f(2,31);
-2147483648 (int)
--> f(2,32);
0 (int)
-->
2^30 looks plausible, but how could 2^31 be negative or 2^32 be zero? We
claimed we just proved it correct!
The reason is that the values of type int in C0 or C and many other
languages actually do not represent arbitrarily large integers, but have a
fixed-size representation. In mathematical terms, this means that we are
dealing with modular arithmetic. The fact that 2^32 = 0 provides a clue that
integers in C0 have 32 bits, and arithmetic operations implement arithmetic
modulo 2^32.
In this light, the results above are actually correct. We examine modular
arithmetic in detail in the next lecture.
@loop_invariant: A loop invariant. This is checked every time just before
the loop exit condition is tested.
@requires: At the call sites we have to prove that the precondition for the
function is satisfied for the given arguments. We can then assume it
when reasoning in the body of the function.
@ensures: At the return sites inside a function we have to prove that the
postcondition is satisfied for the given return value. We can then as-
sume it at the call site.
Init: The loop invariant is satisfied initially, when the loop is first
encountered.
Preservation: Assuming the loop invariant is satisfied at the begin-
ning of the loop (just before the exit test), we have to show it still
holds when the beginning of the loop is reached again, after one
iteration of the loop.
We are then allowed to assume that the loop invariant holds after the
loop exits, together with the negation of the exit condition.
Contracts are crucial for reasoning since (a) they express what needs to
be proved in the first place (give the program’s specification), and (b) they
localize reasoning: from a big program to the conditions on the individual
functions, from the inside of a big function to each loop invariant or asser-
tion.
Exercises
Exercise 1 Rewrite first POW and then f so that it signals an error in case of an
overflow rather than silently working in modular arithmetic.
Lecture 3
January 22, 2013
1 Introduction
Two fundamental types in almost any programming language are booleans
and integers. Booleans are comparatively straightforward: they have two
possible values (true and false) and conditionals to test boolean values.
We will return to their properties in a later lecture.
Integers . . . , -2, -1, 0, 1, 2, . . . are considerably more complex, because
there are infinitely many of them. Because memory is finite, only a finite
subrange of them can be represented in computers. In this lecture we dis-
cuss how integers are represented, how we can deal with the limited range
in the representation, and how various operations are defined on these rep-
resentations.
In terms of our learning goals, this lecture addresses:
10011[2] = (((1 * 2 + 0) * 2 + 0) * 2 + 1) * 2 + 1 = 19

In general, a binary number with digits b_(n-1) ... b_1 b_0 (most significant
first) has the value

(···((b_(n-1) * 2 + b_(n-2)) * 2 + b_(n-3)) * 2 + ··· + b_1) * 2 + b_0
For example, taking the binary number 10010110[2], we write the digits
from most significant to least significant, calculating the cumulative value
from left to right by writing it top to bottom.

  1           =   1
  1 * 2 + 0   =   2
  2 * 2 + 0   =   4
  4 * 2 + 1   =   9
  9 * 2 + 0   =  18
 18 * 2 + 1   =  37
 37 * 2 + 1   =  75
 75 * 2 + 0   = 150
Conversely, to convert a decimal number, say 198, to binary we repeatedly
divide by 2 and record the remainders; read from bottom to top, they give
198 = 11000110[2].

198 =  99 * 2 + 0
 99 =  49 * 2 + 1
 49 =  24 * 2 + 1
 24 =  12 * 2 + 0
 12 =   6 * 2 + 0
  6 =   3 * 2 + 0
  3 =   1 * 2 + 1
  1 =   0 * 2 + 1
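As a small illustration (a sketch of our own, not taken from the lecture code;
the name binval and the use of a bool array for the bits are our choices), the
left-to-right evaluation scheme can be written in C0 as follows:

/* computes the value of the n bits b[0..n), where b[n-1] is the most
   significant bit, by the left-to-right scheme shown above */
int binval(bool[] b, int n)
//@requires 0 <= n && n <= \length(b);
{
  int v = 0;
  for (int i = n-1; i >= 0; i--)
  //@loop_invariant i >= -1 && i <= n-1;
  {
    v = 2 * v + (b[i] ? 1 : 0);
  }
  return v;
}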
3 Modular Arithmetic
Within a computer, there is a natural size of words that can be processed
by single instructions. In early computers, the word size was typically 8
bits; now it is 32 or 64. In programming languages that are relatively close
to machine instructions like C or C0, this means that the native type int of
integers is limited to the size of machine words. In C0, we decided that the
values of type int occupy 32 bits.
This is very easy to deal with for small numbers, because the more sig-
nificant digits can simply be 0. According to the formula that yields their
number value, these bits do not contribute to the overall value. But we
have to decide how to deal with large numbers, when operations such as
addition or multiplication would yield numbers that are too big to fit into
a fixed number of bits. One possibility would be to raise overflow excep-
tions. This is somewhat expensive (since the overflow condition must be
explicitly detected), and has other negative consequences. For example,
(n + n) - n is no longer equal to n + (n - n) because the former can overflow
while the latter always yields n and does not overflow. Another possibility
is to carry out arithmetic operations modulo the number of representable
integers, which would be 2^32 in the case of C0. We say that the machine
implements modular arithmetic.
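As a minimal demonstration of this behavior (our own sketch, not part of the
lecture code), the following assertions hold for C0's 32-bit ints precisely
because arithmetic is carried out modulo 2^32:

int main() {
  int max = 2147483647;            /* 2^31 - 1, the largest value of type int */
  //@assert max + 1 == -max - 1;   /* wraps around to -2^31 */
  //@assert max + max == -2;       /* (2^31-1)+(2^31-1) = 2^32 - 2, i.e. -2 modulo 2^32 */
  return 0;
}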
In higher-level languages, one would be more inclined to think of the
type int as containing integers of essentially unbounded size. This means
that a value of this type would have to be represented by a whole vector of
machine words.

To see how negative numbers can be represented in a fixed number of
bits, consider a 4-bit word, where arithmetic is carried out modulo 16. We
would like to divide the sixteen bit patterns into positive and negative
numbers, where we include 0 among the positive ones. From these
considerations we can see that 0, . . . , 7 should be positive and
-8, . . . , -1 should be negative, and that the highest bit of the 4-bit binary
representation tells us if the number is positive or negative.
Just for verification, let's check that 7 + (-7) = 0 (mod 16), where -7 is
represented by the bit pattern 1001:

       0 1 1 1     (7)
   +   1 0 0 1     (-7)
   -------------
   (1) 0 0 0 0

The carry out of the highest bit (shown in parentheses) is discarded, so the
result is 0, as expected.
6 Hexadecimal Notation
In C0, we use 32 bit integers. Writing these numbers out in decimal nota-
tion is certainly feasible, but sometimes awkward since the bit pattern of
the representation is not easy to discern. Binary notation is rather expan-
sive (using 32 bits for one number) and therefore difficult to work with.
A good compromise is found in hexadecimal notation, which is a represen-
tation in base 16 with the sixteen digits 0–9 and A–F. In C0 (as in C), hexa-
decimal constants are written with the prefix 0x; for example, 0xFF denotes 255.
7 Useful Powers of 2
The drive to expand the native word size of machines by making circuits
smaller was influenced by two different considerations. For one, since the
bits of a machine word (like 32 or 64) are essentially treated in parallel in
the circuitry, operations on larger numbers are much more efficient. For
another, we can address more memory directly by using a machine word
as an address.
A useful way to relate this to common measurements of memory and
storage capacity is to use
2^10 = 1024 = 1K
Note that this use of “1K” in computer science is slightly different from
its use in other sciences where it would indicate one thousand (1, 000). If
we want to see how much memory we can address with a 16 bit word we
calculate
2^16 = 2^6 * 2^10 = 64K
so we can address roughly 64K cells of memory (each usually holding a byte,
which is 8 bits wide). We also have
2^20 = 2^10 * 2^10 = 1,048,576 = 1M
(pronounced “1 Meg”) which is roughly 1 million and
2^30 = 2^10 * 2^10 * 2^10 = 1,073,741,824 = 1G
(pronounced “1 Gig”) which is roughly 1 billion.
In a more recent processor with a word size of 32 we can therefore ad-
dress
2^32 = 2^2 * 2^10 * 2^10 * 2^10 = 4GB
of memory where “GB” stands for Gigabyte.
The next significant number would be 1024GB, which is 1TB (a terabyte).
The bit-level operations on single bits are given by the following truth tables
for and (&), exclusive or (^), or (|), and negation (~):

 &  0  1      ^  0  1      |  0  1      ~  0  1
 0  0  0      0  0  1      0  0  1         1  0
 1  0  1      1  1  0      1  1  1
(x/y) * y + (x%y) = x

so that x%y is like the remainder of division. The above is not yet sufficient
to define the two operations. In addition we say 0 ≤ |x%y| < |y|. Still, this
leaves open the possibility that the modulus is positive or negative when y
does not divide x. We fix this by stipulating that integer division truncates
its result towards zero. This means that the modulus must be negative if x
is negative and there is a remainder, and it must be positive if x is positive.
By contrast, the quotient operation always truncates down (towards -∞),
which means that the remainder is always positive. There are no primitive
operators for quotient and remainder, but they can be implemented with
the ones at hand.
Of course, the above constraints are impossible to satisfy when y = 0,
because 0 ≤ |x%0| < |0| is impossible. But division by zero is defined to
raise an error, and so is the modulus.
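A few concrete values, following the rules just described (a small sketch of
our own; the assertions record the expected results):

int main() {
  //@assert 7 / 2 == 3 && 7 % 2 == 1;
  //@assert -7 / 2 == -3 && -7 % 2 == -1;   /* division truncates towards zero */
  //@assert 7 / -2 == -3 && 7 % -2 == 1;    /* the modulus has the sign of x */
  //@assert (-7 / 2) * 2 + (-7 % 2) == -7;  /* (x/y)*y + x%y == x */
  return 0;
}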
10 Shifts
We also have some hybrid operators on ints, somewhere between bit-level
and arithmetic. These are the shift operators. We write x << k for the result
of shifting x by k bits to the left, and x >> k for the result of shifting x by k
bits to the right. In both cases, the value of k must be between 0 (inclusive)
and 32 (exclusive) – any other value is an arithmetic error like division by
zero. We assume below that k is in that range.
The left shift, x << k (for 0 ≤ k < 32), fills the result with zeroes on
the right, so that bits 0, . . . , k-1 will be 0. Every left shift corresponds to a
multiplication by 2, so x << k returns x * 2^k (modulo 2^32). We illustrate this
with 8-bit numbers.
b7 b6 b5 b4 b3 b2 b1 b0
<<1
b6 b5 b4 b3 b2 b1 b0 0
b7 b6 b5 b4 b3 b2 b1 b0
<<2
b5 b4 b3 b2 b1 b0 0 0
The right shift, x >> k (for 0 ≤ k < 32), copies the highest bit while
shifting to the right, so that bits 31, . . . , 32-k of the result will be equal to
the highest bit of x. If we view x as an integer, this means that the sign of the
result is equal to the sign of x, and shifting x right by k bits corresponds to
integer division by 2^k except that it truncates towards -∞. For example,
-1 >> 1 == -1.
b7 b6 b5 b4 b3 b2 b1 b0
>>1
b7 b7 b6 b5 b4 b3 b2 b1
b7 b6 b5 b4 b3 b2 b1 b0
>>2
b7 b7 b7 b6 b5 b4 b3 b2
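A few sample values consistent with this description (again a sketch of our
own, with the expected results in assertions):

int main() {
  //@assert (5 << 1) == 10;    /* left shift by k multiplies by 2^k */
  //@assert (5 << 3) == 40;
  //@assert (13 >> 2) == 3;    /* 13 / 4, rounded down */
  //@assert (-13 >> 2) == -4;  /* rounds towards -infinity, not towards 0 */
  //@assert (-1 >> 1) == -1;   /* the sign bit is copied in */
  return 0;
}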
11 Representing Colors
As a small example of using the bitwise interpretation of ints, we consider
colors. Colors are decomposed into their primary components red, green,
and blue; the intensity of each uses 8 bits and therefore varies between
0 and 255 (or 0x00 and 0xFF). We also have the so-called α-channel which
indicates how opaque the color is when superimposed over its background.
Here, 0xFF indicates completely opaque, and 0x00 completely transparent.
For example, to extract the intensity of the red color in a given pixel p,
we could compute (p >> 16) & 0xFF. The first shift moves the red color
value into the bits 0–7; the bitwise and masks out all the other bits by setting
them to 0. The result will always be in the desired range, from 0–255.
Conversely, if we want to set the intensity of green of the pixel p to
the value of g (assuming we already have 0 ≤ g ≤ 255), we can compute
(p & 0xFFFF00FF) | (g << 8). This works by first setting the green in-
tensity to 0, while keeping everything else the same, and then combining it
with the value of g, shifted to the right position in the word.
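The two operations just described, written as small C0 functions (a sketch;
the names get_red and set_green are ours, and we assume the usual packing
with alpha in bits 24–31, red in 16–23, green in 8–15, and blue in 0–7):

int get_red(int p)
//@ensures 0 <= \result && \result <= 255;
{
  return (p >> 16) & 0xFF;              /* move red into bits 0-7, mask the rest */
}

int set_green(int p, int g)
//@requires 0 <= g && g <= 255;
{
  return (p & 0xFFFF00FF) | (g << 8);   /* clear the green bits, then insert g */
}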
For more on color values and some examples, see Assignment 1.
Exercises
Exercise 1 Write functions quot and rem that calculate quotient and remainder
as explained in Section 9. Your functions should have the property that
quot(x,y)*y + rem(x,y) == x;
for all ints x and y unless quot overflows. How is that possible?
Exercise 2 Write a function int2hex that returns a string containing the hex-
adecimal representation of a given integer as a string. Your function should have
prototype
Exercise 3 Write a function lsr (logical shift right), which is like right shift (>>)
except that it fills the most significant bits with zeroes instead of copying the sign
bit. Explain what lsr(x,1) means on integers in two’s complement representa-
tion.
Lecture 4
January 24, 2013
1 Introduction
So far we have seen how to process primitive data like integers in impera-
tive programs. That is useful, but certainly not sufficient to handle bigger
amounts of data. In many cases we need aggregate data structures which
contain other data. A common data structure, in particular in imperative
programming languages, is that of an array. An array can be used to store
and process a fixed number of data elements that all have the same type.
We will also take a first detailed look at the issue of program safety.
A program is safe if it will execute without exceptional conditions which
would cause its execution to abort. So far, only division and modulus are
potentially unsafe operations, since division or modulus by 0 is defined
as a runtime error.1 Trying to access an array element for which no space
has been allocated is a second form of runtime error. Array accesses are
therefore potentially unsafe operations and must be proved safe.
With respect to our learning goals we will look at the following notions.
Computational Thinking: Safety
Algorithms and Data Structures: Fixed-size arrays
Programming: The type t[]; for-loops
In lecture, we only discussed a smaller example of programming with
arrays, so some of the material here is a slightly more complex illustration
of how to use for loops and loop invariants when working with arrays.
1
as is division or modulus of the minimal integer by -1
2 Using Arrays
When t is a type, then t[] is the type of an array with elements of type t.
Note that t is arbitrary: we can have an array of integers (int[]), and an
array of booleans (bool[]) or an array of arrays of characters (char[][]).
This syntax for the type of arrays is like Java, but is a minor departure from
C, as we will see later in class.
Each array has a fixed size, and it must be explicitly allocated using the
expression alloc_array(t, n). Here t is the type of the array elements,
and n is their number. With this operation, C0 will reserve a piece of mem-
ory with n elements, each having type t. Let’s try in coin:
% coin
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> int[] A = alloc_array(int, 10);
A is 0xECE2FFF0 (int[] with 10 elements)
-->
--> A[0];
0 (int)
--> A[1];
0 (int)
--> A[2];
0 (int)
--> A[10];
Error: accessing element 10 in 10-element array
Last position: <stdio>:1.1-1.5
--> A[-1];
Error: accessing element -1 in 10-element array
Last position: <stdio>:1.1-1.5
-->
We notice that after allocating the array, all elements appear to be 0. This
is guaranteed by the implementation, which initializes all array elements
to a default value which depends on the type. The default value of type
int is 0. Generally speaking, one should try to avoid exploiting implicit
initialization because for a reader of the program it may not be clear if the
initial values are important or not.
We also observe that trying to access an array element not in the spec-
ified range of the array will lead to an error. In this example, the valid
accesses are A[0], A[1], . . ., A[9] (which comes to 10 elements); everything
else is illegal. And every other attempt to access the contents of the array
would not make much sense, because the array has been allocated to hold
10 elements. How could we ever meaningfully ask what its element num-
ber 20 is if it has only 10? Nor would it make sense to ask for A[-4]. In both
cases, coin and cc0 will give you an error message telling you that you
have accessed the array outside the bounds. While an error is guaranteed
in C0, in C no such guarantee is made. Accessing an array element that has
not been allocated leads to undefined behavior and, in principle, anything
could happen. This is highly problematic because implementations typi-
cally choose to just read from or write to the memory location where some
element would be if it had been allocated. Since it has not been, some other
unpredictable memory location may be altered, which permits infamous
buffer overflow attacks which may compromise your machines.
How do we change an element of an array? We can use it on the left-
hand side of an assignment. We can set A[i] = e; as long as e is an expres-
sion of the right type for an array element. For example:
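For instance, continuing with the 10-element array A allocated above, we
could set the first few elements (the values are chosen to match the memory
diagram below):

A[0] = 5;
A[1] = 10;
A[2] = 20;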
(Diagram: the array A laid out in memory starting at address 0xECE2FFF0,
one four-byte cell per element, with A[0] = 5, A[1] = 10, A[2] = 20, and
A[3] through A[9] still 0.)
Characteristically, the exit condition of the loop tests i < n, where i is the
array index and n is the length of the array (here 10).
After we type in the first line (the header of the for-loop), coin responds
with the prompt ... instead of -->. This indicates that the expression or
statement it has parsed so far is incomplete. We complete it by supplying
the body of the loop, the assignment A[i] = i * i * i;. Note that no
value is shown for the loop itself, since it is a statement rather than an
expression.
f_0 = 0
f_1 = 1
f_(n+2) = f_(n+1) + f_n    for n ≥ 0
int[] fib(int n) {
int[] F = alloc_array(int, n);
F[0] = 0;
F[1] = 1;
for (int i = 0; i < n; i++)
F[i+2] = F[i+1] + F[i];
return F;
}
This looks straightforward. Is there a problem with the code or will it run
correctly? In order to understand whether this function works correctly, we
systematically develop a specification for it. Before you read on, can you
spot a bug in the code? Or can you find a reason why it will work correctly?
Allocating an array will also fail if we ask for a negative number of ele-
ments. Since the number of elements we ask for in alloc_array(int, n)
is n, and n is a parameter passed to the function, we need to add n ≥ 0 into
the precondition of the function. In return, the function can safely promise
to return an array that has exactly the size n. This is a property that the code
using, e.g., fib(10) has to rely on. Unless the fib function promises to re-
turn an array of a specific size, the user has no way of knowing how many
elements in the array can be accessed safely without exceeding its bounds.
Without such a corresponding postcondition, code calling fib(10) could
not even safely access position 0 of the array that fib(10) returns.
For referring to the length of an array, C0 contracts have a special func-
tion \length(A) that stands for the number of elements in the array A. Just
like the \result variable, the function \length is part of the contract lan-
guage and cannot be used in C0 program code. Its purpose is to be used in
contracts to specify the requirements and behavior of a program. For the
Fibonacci function, we want to specify the postcondition that the length of
the array that the function returns is n.
int[] fib(int n)
//@requires n >= 0;
//@ensures \length(\result) == n;
{
int[] F = alloc_array(int, n);
F[0] = 0;
F[1] = 1;
for (int i = 0; i < n; i++) {
F[i+2] = F[i+1] + F[i];
}
return F;
}
int[] fib(int n)
//@requires n >= 0;
//@ensures \length(\result) == n;
{
int[] F = alloc_array(int, n);
F[0] = 0;
F[1] = 1;
for (int i = 0; i < n; i++)
//@loop_invariant i >= 0;
{
F[i+2] = F[i+1] + F[i];
}
return F;
}
Clearly, if i ≥ 0 then the other array accesses F[i+1] and F[i+2] also will
not violate the lower bound of the array, because i + 1 ≥ 0 and i + 2 ≥ 0.
Will the program work correctly now?
The big issue with the code is that, even though it ensures that no array
access exceeds the lower bound 0 of the array F, we do not know whether
the upper bound of the array, i.e., \length(F), which equals n, is always
respected. For each array access, we need to ensure that it is within the
bounds. In particular, we need i < n for the array access F[i], i + 1 < n
for the array access F[i+1], and i + 2 < n for F[i+2]. The last condition
does not work out, because the loop body also runs when i = n - 1, at
which point i + 2 = (n - 1) + 2 = n + 1, which exceeds the bounds of the
array F that we allocated with size n.
We can also easily observe this bug by running the function in coin.
% coin fibc.c0 -d
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> fib(5);
Error: accessing element 5 in 5-element array
Last position: fibc.c0:11.7-11.30
fib from <stdio>:1.1-1.7
Consequently, we need to stop the loop earlier and can only continue as
long as i + 2 < n. Since the loop condition in a for loop can be any boolean
expression, we could trivially ensure this by changing the loop as follows:
int[] fib(int n)
//@requires n >= 0;
//@ensures \length(\result) == n;
{
int[] F = alloc_array(int, n);
F[0] = 0;
F[1] = 1;
for (int i = 0; i+2 < n; i++)
//@loop_invariant i >= 0;
{
F[i+2] = F[i+1] + F[i];
}
return F;
}
Since it can be more convenient to see the exact bounds of a for loop, we
can replace the loop condition i + 2 < n by i < n - 2, since both are equiv-
alent. It does not make much difference which one we use, but the latter
can make it more intuitive to see how many iterations the loop needs to complete.
int[] fib(int n)
//@requires n >= 0;
//@ensures \length(\result) == n;
{
int[] F = alloc_array(int, n);
F[0] = 0;
F[1] = 1;
for (int i = 0; i < n-2; i++)
//@loop_invariant i >= 0;
{
F[i+2] = F[i+1] + F[i];
}
return F;
}
This program looks good and behaves well on a number of tests.
Is it correct? Before you read on, try to find an answer yourself.
% coin fibe.c0 -d
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> fib(5);
0xFF4FF780 (int[] with 5 elements)
--> fib(2);
0xFF4FF760 (int[] with 2 elements)
--> fib(1);
Error: accessing element 1 in 1-element array
Last position: fibe.c0:7.3-7.12
fib from <stdio>:1.1-1.7
--> fib(0);
Error: accessing element 0 in 0-element array
Last position: fibe.c0:6.3-6.12
fib from <stdio>:1.1-1.7
-->
To solve this issue we guard the two initial assignments with tests, so that
they only run if the array is big enough to contain the corresponding entry.
See fibf.c0
int[] fib(int n)
//@requires n >= 0;
//@ensures \length(\result) == n;
{
int[] F = alloc_array(int, n);
if (n > 0) F[0] = 0; /* line 0 */
if (n > 1) F[1] = 1; /* line 1 */
for (int i = 0; i < n-2; i++)
//@loop_invariant i >= 0;
{
F[i+2] = F[i+1] + F[i]; /* line 2 */
}
return F;
}
Init: When we enter the loop for the first time, the for loop initialization
assigns i = 0, so i ≥ 0.
In the last case, we do not reason about how the loop operates but rely
solely on the loop invariant instead. This is crucial, since the loop invariant
is supposed to contain all the relevant information about the relevant effect
of the loop. In particular, our reasoning about the array accesses does not
depend on understanding what exactly the loop does after, say, 5 iterations
or where i started and how it evolved since. All that matters is whether we
can conclude from the loop invariant i ≥ 0 and the loop condition i < n - 2
that the array accesses are okay. In this way, loop invariants have the ef-
fect of entirely localizing our reasoning to one general scenario to consider
for the loop body. This is how loop invariants can greatly contribute to
understanding programs and ensure we have implemented them correctly.
Similar effects occur in other scenarios: our understanding of the behavior
of a loop becomes focused on a single local question, just by virtue of what
we can conclude from the loop invariant.
Needless to say, before we use a loop invariant in our reasoning about the
behavior of the code, we should convince ourselves that the loop invariant
is correct by a proof.
8 Aliasing
We have seen assignments to array elements, such as A[0] = 0;. But we
have also seen assignments to array variables themselves, such as
% coin -d fibf.c0
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> int[] F;
--> int[] G;
--> F = fib(15);
F is 0xF6969A80 (int[] with 15 elements)
--> G[2];
Error: uninitialized value used
Last position: <stdio>:1.1-1.5
--> G = F;
G is 0xF6969A80 (int[] with 15 elements)
--> G = fib(10);
G is 0xF6969A30 (int[] with 10 elements)
-->
% coin
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> int[] A = alloc_array(int, 5);
A is 0xE8176FF0 (int[] with 5 elements)
--> int[] B = A;
B is 0xE8176FF0 (int[] with 5 elements)
--> A[0] = 42;
A[0] is 42 (int)
--> B[0];
42 (int)
-->
/* file copy.c0 */
int[] array_copy(int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@ensures \length(\result) == n;
{
  int[] B = alloc_array(int, n);
  for (int i = 0; i < n; i++) {
    B[i] = A[i];
  }
  return B;
}
For example, we can create B as a copy of A; assigning to the copy B will
then not affect A. We will invoke coin with the -d flag to make sure
that if a pre- or post-condition or loop invariant is violated we get an error
message.
% coin copy.c0 -d
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> int[] A = alloc_array(int, 10);
A is 0xF3B8DFF0 (int[] with 10 elements)
--> for (int i = 0; i < 10; i++) A[i] = i*i;
--> int[] B = array_copy(A, 10);
B is 0xF3B8DFB0 (int[] with 10 elements)
--> B[9];
81 (int)
--> A[9] = 17;
A[9] is 17 (int)
--> B[9];
81 (int)
-->
Exercises
Exercise 1 Write a function array_part that creates a copy of a part of a given
array, namely the elements from position i to position j. Your function should have
prototype
Develop a specification and loop invariants for this function. Prove that it works
correctly by checking the loop invariant and proving array bounds.
Exercise 2 Write a function copy_into that copies a part of a given array source,
namely n elements starting at position i, into another given array target, starting
at position j. Your function should have prototype
As an extra service, make your function return the last position in the target ar-
ray that it entered data into. Develop a specification and loop invariants for this
function. Prove that it works correctly by checking the loop invariant and proving
array bounds. What is difficult about this case?
Develop a specification and loop invariants for this function. Prove that it works
correctly by checking the loop invariant and proving array bounds. The num-
ber returned by can_copy_into should be compatible with the specification of
copy_into. Which calls to copy_into are guaranteed to work correctly after a
call of
Lecture 5
January 29, 2013
1 Introduction
One of the fundamental and recurring problems in computer science is to
find elements in collections, such as elements in sets. An important algo-
rithm for this problem is binary search. We use binary search for an integer
in a sorted array to exemplify it. As a preliminary study in this lecture we
analyze linear search, which is simpler, but not nearly as efficient. Still it is
often used when the requirements for binary search are not satisfied, for
example, when we do not have the elements we have to search arranged in
a sorted array.
In terms of our learning goals, we discuss the following:
Computational Thinking: We will see for the first time the power of order in
various algorithmic problems.
3 Sorted Arrays
A number of algorithms on arrays would like to assume that they are sorted.
Such algorithms would return a correct result only if they are actually run-
ning on a sorted array. Thus, the first thing we need to figure out is how
to specify sortedness in function specifications. The specification function
is_sorted(A,lower,upper) traverses the array A from left to right, start-
ing at lower and stopping just before reaching upper , checking that each el-
ement is smaller or equal to its right neighbor. We need to be careful about
the loop invariant to guarantee that there will be no attempt to access a
memory element out of bounds.
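A sketch of such a specification function, consistent with the loop condition
and invariant discussed next (the lecture's own listing is not reproduced here):

bool is_sorted(int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
{
  for (int i = lower; i < upper - 1; i++)
  //@loop_invariant lower <= i;
  {
    if (!(A[i] <= A[i+1])) return false;
  }
  return true;
}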
The loop invariant here does not have an upper bound on i. Fortunately,
when we are inside the loop, we know the loop condition is true, so we
know i < upper - 1. That together with lower ≤ i guarantees that both
accesses are in bounds.
We could also try i ≤ upper - 1 as a loop invariant, but this turns out to
be false. It is instructive to think about why. If you cannot think of a good
reason, try to prove it carefully. Your proof should fail somewhere.

Actually, the attempted proof already fails at the initial step. If lower =
upper = 0 (which is permitted by the precondition) then it is not true that
0 = lower = i ≤ upper - 1 = 0 - 1 = -1. We could say i ≤ upper, but that
wouldn't seem to serve any particular purpose here since the array accesses
are already safe.
Let's reason through that. Why is the access A[i] safe? By the loop
invariant lower ≤ i and the precondition 0 ≤ lower we have 0 ≤ i, which
is the first part of safety. Secondly, we have i < upper - 1 (by the loop
condition, since we are in the body of the loop) and upper ≤ \length(A)
(by the precondition), so i will be in bounds. In fact, even i + 1 will be in
bounds, since 0 ≤ lower ≤ i < i + 1 (and i + 1 cannot overflow, since i is
bounded from above) and i + 1 < (upper - 1) + 1 = upper ≤ \length(A).
Whenever you see an array access, you must have a very good reason
why the access must be in bounds. You should develop a coding instinct
where you deliberately pause every time you access an array in your code
and verify that it should be safe according to your knowledge at that point
in the program. This knowledge can be embedded in preconditions, loop
invariants, or assertions that you have verified.
This does not exploit that the array is sorted. We would like to exit the
loop and return -1 as soon as we find that A[i] > x. If we haven't found x
already, we will not find it subsequently since all elements to the right of i
will be greater or equal to A[i] and therefore strictly greater than x. But we
have to be careful: the following program has a bug.
Can you spot the problem? If you cannot spot it immediately, reason
through the loop invariant. Read on if you are confident in your answer.
Now A[i] <= x will only be evaluated if i < n and the access will be in
bounds since we also know 0 i from the loop invariant.
Alternatively, and perhaps easier to read, we can move the test into the
loop body.
This program is not yet satisfactory, because the loop invariant does not
have enough information to prove the postcondition. We do know that if we
return directly from inside the loop, that A[i] = x and so A[\result] == x
holds. But we cannot deduce that !is_in(x, A, 0, n) if we return 1.
Before you read on, consider which loop invariant you might add to
guarantee that. Try to reason why the fact that the exit condition must
be false and the loop invariant true is enough information to know that
!is_in(x, A, 0, n) holds.
Did you try to exploit that the array is sorted? If not, then your invariant
is most likely too weak, because the function is incorrect if the array is not
sorted!
What we want to say is that all elements in A to the left of index i are smaller
than x. Just saying A[i-1] < x isn’t quite right, because when the loop is
entered the first time we have i = 0 and we would try to access A[-1]. We
again exploit short-circuiting evaluation, this time for disjunction.
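Putting these pieces together, here is a sketch of the resulting linear search
over a sorted array (the function name search and the exact phrasing of the
contracts are our reconstruction; it assumes the specification functions
is_sorted and is_in discussed in the text):

int search(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
/*@ensures (\result == -1 && !is_in(x, A, 0, n))
        || (0 <= \result && \result < n && A[\result] == x); @*/
{
  for (int i = 0; i < n && A[i] <= x; i++)
  //@loop_invariant 0 <= i && i <= n;
  //@loop_invariant i == 0 || A[i-1] < x;
  {
    if (A[i] == x) return i;
  }
  return -1;
}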
Lecture 6
January 31, 2013
1 Introduction
One of the fundamental and recurring problems in computer science is to
find elements in collections, such as elements in sets. An important al-
gorithm for this problem is binary search. We use binary search for an in-
teger in a sorted array to exemplify it. We started in the last lecture by
discussing linear search and giving some background on the problem. This
lecture clearly illustrates the power of order in algorithm design: if an array
is sorted we can search through it very efficiently, much more efficiently
than when it is not ordered.
We will also once again see the importance of loop invariants in writing
correct code. Here is a note by Jon Bentley about binary search:
I’ve assigned [binary search] in courses at Bell Labs and IBM. Profes-
sional programmers had a couple of hours to convert [its] description
into a program in the language of their choice; a high-level pseudocode
was fine. At the end of the specified time, almost all the programmers
reported that they had correct code for the task. We would then take
thirty minutes to examine their code, which the programmers did with
test cases. In several classes and with over a hundred programmers,
the results varied little: ninety percent of the programmers found bugs
in their programs (and I wasn’t always convinced of the correctness of
the code in which no bugs were found).
I was amazed: given ample time, only about ten percent of profes-
sional programmers were able to get this small program right. But
they aren’t the only ones to find this task difficult: in the history in
Section 6.2.1 of his Sorting and Searching, Knuth points out that
while the first binary search was published in 1946, the first published
binary search without bugs did not appear until 1962.
—Jon Bentley, Programming Pearls (1st edition), pp.35–36
2 Binary Search
Can we do better than searching through the array linearly? If you don’t
know the answer already it might be surprising that, yes, we can do signif-
icantly better! Perhaps almost equally surprising is that the code is almost
as short!
Before we write the code, let us describe the algorithm. We start by
examining the middle element of the array. If it is smaller than x, then x must
be in the upper half of the array (if it is there at all); if it is greater than x, then
it must be in the lower half. Now we continue by restricting our attention
to either the upper or lower half, again finding the middle element and
proceeding as before.
We stop if we either find x, or if the size of the subarray shrinks to zero,
in which case x cannot be in the array.
Before we write a program to implement this algorithm, let us analyze
the running time. Assume for the moment that the size of the array is a
power of 2, say 2k . Each time around the loop, when we examine the mid-
dle element, we cut the size of the subarrays we look at in half. So before the
first iteration the size of the subarray of interest is 2^k. After the first iter-
ation it is of size 2^(k-1), then 2^(k-2), etc. After k iterations it will be 2^(k-k) = 1,
so we stop after the next iteration. Altogether we can have at most k + 1
iterations. Within each iteration, we perform a constant amount of work:
computing the midpoint, and a few comparisons. So, overall, when given
an array of size n, binary search performs at most about log2(n) + 1 iterations,
each taking a constant amount of time.
We declare two variables, lower and upper, which hold the lower and up-
per end of the subinterval in the array that we are considering. We start
with lower as 0 and upper as n, so the interval includes lower and excludes
upper. This often turns out to be a convenient choice when computing with
arrays (but see Exercise 1).
The for loop from linear search becomes a while loop, exiting when
the interval has size zero, that is, lower == upper. We can easily write the
1
In general in computer science, we are mostly interested in logarithms to the base 2,
so we will just write log(n) for the logarithm to the base 2 from now on, unless we
are considering a different base.
first loop invariant, relating lower and upper to each other and the overall
bound of the array.
In the body of the loop, we first compute the midpoint mid. By elemen-
tary arithmetic it is indeed between lower and upper .
Next in the loop body we check if A[mid ] = x. If so, we have found the
element and return mid .
Now comes the hard part. What is the missing part of the invariant?
The first instinct might be to say that x should be in the interval from
A[lower ] to A[upper ]. But that may not even be true when the loop is en-
tered the first time.
Let’s consider a generic situation in the form of a picture and collect
some ideas about what might be appropriate loop invariants. Drawing
diagrams to reason about an algorithm and the code that we are trying
to construct is an extremely helpful general technique.
index  0  1   2   3   4   5   6   7   8   9
A      5  7  11  19  34  42  65  65  89  123
(a red box in the original figure surrounds elements 2 through 5)
The red box around elements 2 through 5 marks the segment of the
array still under consideration. This means we have ruled out everything
to the right of (and including) upper and to the left of (and not including)
lower. Everything to the left is ruled out, because those values have been
recognized to be strictly less than x, while the ones on the right are known
to be strictly greater than x; the middle is still unknown.
We can depict this as follows:
(The same array, with the cells to the left of lower marked as known to be
less than x and the cells from upper onward marked as known to be greater
than x.)
We can summarize this by stating that A[lower - 1] < x and A[upper] > x.
This implies that x cannot be in the segments A[0..lower) and A[upper..n),
because the array is sorted (so all array elements to the left of A[lower - 1]
will also be less than x and all array elements to the right of A[upper] will
also be greater than x). For an alternative, see Exercise 2.
We can postulate these as invariants in the code.
(The same array diagram, annotated with the invariants just stated.)
At this point, let’s check if the loop invariant is strong enough to imply
the postcondition of the function. If we return from inside the loop because
A[mid ] = x we return mid , so A[\result] == x as required.
If we exit the loop because lower < upper is false, we know lower =
upper , by the first loop invariant. Now we have to distinguish some cases.
1. If A[lower - 1] < x and A[upper] > x, then A[lower] > x (since lower =
upper). Because the array is sorted, x cannot be in it.
Notice that we could verify all this without even knowing the complete
program! As long as we can finish the loop to preserve the invariant and
terminate, we will have a correct implementation! This would again be a
good point for you to interrupt your reading and to try to complete the
loop, reasoning from the invariant.
We have already tested if A[mid ] = x. If not, then A[mid ] must be less or
greater than x. If it is less, then we can keep the upper end of the interval
as is, and set the lower end to mid + 1. Now A[lower - 1] < x (because
A[mid ] < x and lower = mid + 1), and the condition on the upper end
remains unchanged.
If A[mid ] > x we can set upper to mid and keep lower the same. We do
not need to test this last condition, because the fact that the tests A[mid ] = x
and A[mid ] < x both failed implies that A[mid ] > x. We note this in an
assertion.
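Collecting the invariants and the reasoning above, here is a sketch of the
complete function (the name binsearch and some details are our reconstruction;
is_sorted and is_in are the specification functions from the previous lecture):

int binsearch(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
/*@ensures (\result == -1 && !is_in(x, A, 0, n))
        || (0 <= \result && \result < n && A[\result] == x); @*/
{
  int lower = 0;
  int upper = n;
  while (lower < upper)
  //@loop_invariant 0 <= lower && lower <= upper && upper <= n;
  //@loop_invariant lower == 0 || A[lower-1] < x;
  //@loop_invariant upper == n || A[upper] > x;
  {
    int mid = lower + (upper - lower)/2;
    //@assert lower <= mid && mid < upper;
    if (A[mid] == x) return mid;
    else if (A[mid] < x) lower = mid+1;
    else { //@assert A[mid] > x;
      upper = mid;
    }
  }
  return -1;
}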
4 Termination
Does this function terminate? If the loop body executes, that is, lower <
upper , then the interval from lower to upper is non-empty. Moreover, the
intervals from lower to mid and from mid + 1 to upper are both strictly
smaller than the original interval. Unless we find the element, the differ-
ence between upper and lower must eventually become 0 and we exit the
loop.
but that is in fact incorrect. Consider this change and try to find out why
this would introduce a bug.
Were you able to see it? It’s subtle, but somewhat related to other prob-
lems we had. When we compute (lower + upper)/2; we could actually
have an overflow, if lower + upper > 2^31 - 1. This is somewhat unlikely in
practice, since 2^31 = 2G, about 2 billion, so the array would have to have at
least 1 billion elements. This is not impossible, and, in fact, a bug like this
in the Java libraries was actually exposed.
Fortunately, the fix is simple: because lower < upper, we know that
upper - lower > 0 and represents the size of the interval. So we can divide
that in half and add it to the lower end of the interval to get its midpoint.
Let us convince ourselves why the assert is correct. The division by two
rounds towards zero, which here means rounding down, because upper - lower > 0.
Thus, 0 ≤ (upper - lower)/2 < upper - lower, because dividing a positive
number by two makes it strictly smaller. Hence,

mid = lower + (upper - lower)/2 < lower + (upper - lower) = upper
6 Some Measurements
Algorithm design is an interesting mix between mathematics and an ex-
perimental science. Our analysis above, albeit somewhat preliminary in
nature, allows us to make some predictions of the running times of our imple-
mentations. We start with linear search. We first set up a file to do some
experiments. We assume we have already tested our functions for correct-
ness, so only timing is at stake. See the file find-time.c0 on the course web
pages. We compile this file, together with our implementation from
this lecture with the cc0 command below. We can get an overall end-to-
end timing with the Unix time command. Note that we do not use the -d
flag, since that would dynamically check contracts and completely throw
off our timings.
When running linear search 2000 times (1000 times with elements in the
array and 1000 times with random elements) on 2^18 elements (256K elements)
we get the following answer:
Timing 1000 times with 2^18 elements
0
4.602u 0.015s 0:04.63 99.5% 0+0k 0+0io 0pf+0w
which indicates 4.602 seconds of user time.
Running linear search 2000 times on random arrays of size 2^18, 2^19, and
2^20, we get the following timings on our MacBook Pro:
Exercises
Exercise 1 Rewrite the binary search function so that both lower and upper bounds
of the interval are inclusive. Make sure to rewrite the loop invariants and the loop
body appropriately, and prove the correctness of the new loop invariants. Also
explicitly prove termination by giving a measure that strictly decreases each time
around the loop and is bounded from below.
Exercise 2 Rewrite the invariants of the binary search function to use is_in(x, A, l, u),
which returns true if and only if there is an i such that x = A[i] for l ≤ i < u.
is_in assumes that 0 ≤ l ≤ u ≤ n, where n is the length of the array.
Then prove the new loop invariants, and verify that they are strong enough to
imply the function’s postcondition.
Exercise 3 Binary search as presented here may not find the leftmost occurrence
of x in the array in case the occurrences are not unique. Give an example demon-
strating this.
Now change the binary search function and its loop invariants so that it will
always find the leftmost occurrence of x in the given array (if it is actually in the
array, or -1 as before if it is not).
Prove the loop invariants and the postconditions for this new version, and
verify termination.
Exercise 4 If you were to replace the midpoint computation by
int mid = (lower + upper)/2;
then which part of the contract will alert you to a flaw in your thinking? Why?
Give an example showing how the contracts can fail in that case.
Exercise 5 In lecture, we used design-by-invariant to construct the loop body im-
plementation from the loop invariant that we have identified before. We could also
have maintained the loop invariant by replacing the whole loop body just with
// .... loop_invariant elided ....
{
lower = lower;
upper = upper;
}
Prove the loop invariants for this loop body. What is wrong with this choice?
Which part of our proofs fail, thereby indicating why this loop body would not
implement binary search correctly?
Lecture 7
February 5, 2013
1 Introduction
We begin this lecture by discussing how to compare running times of func-
tions in an abstract, mathematical way. The same underlying mathematics
can be used for other purposes, like comparing memory consumption or
the amount of parallelism permitted by an algorithm. We then use this to
take a first look at sorting algorithms, of which there are many. In this lec-
ture we consider selection sort, because of its simplicity.
In terms of our learning goals, we will work on:
Computational Thinking: Still trying to understand how order can lead
to efficient computation. Worst-case asymptotic complexity of func-
tions.
2 Big-O Notation
Our brief analysis in the last lecture already indicates that linear search
should take about n iterations of a loop while binary search takes about
log2(n) iterations, with a constant number of operations in each loop body.
This suggests that binary search should be more efficient. In the design and
analysis of algorithms we do not want to distinguish running times that
differ only in constant factors: a naive comparison, which would require
g(n) ≤ f(n) for every n, is too strict, because if constant factors don't matter,
then g and c * g should be considered equivalent for any constant c > 0. We can
repair this by allowing the right-hand side to be multiplied by an arbitrary
constant.
This notation derives from the view of O(f ) as a set of functions, namely
those that eventually are smaller than a constant times f .1 Just to be ex-
plicit, we also write out the definition of O(f ) as a set of functions:
O(f) = { g | there exist c > 0 and n_0 such that for all n ≥ n_0, g(n) ≤ c * f(n) }
With this definition we can check that O(f(n)) = O(c * f(n)).
When we characterize the running time of a function using big-O nota-
tion we refer to it as the asymptotic complexity of the function. Here, asymp-
totic refers to the fundamental principles listed above: we only care about
the function in the long run, and we ignore constant factors. Usually, we
use an analysis of the worst case among the inputs of a given size. Trying
to do average case analysis is much harder, because it depends on the distri-
bution of inputs. Since we often don’t know the distribution of inputs it is
much less clear whether an average case analysis may apply in a particular
use of an algorithm.
The asymptotic worst-case time complexity of linear search is O(n),
which we also refer to as linear time. The worst-case asymptotic time com-
plexity of binary search is O(log(n)), which we also refer to as logarithmic
time. Constant time is usually described as O(1), expressing that the running
time is independent of the size of the input.
Some brief fundamental facts about big-O. For any polynomial, only
the highest power of n matters, because it eventually comes to dominate the
function. For example, O(5*n^2 + 3*n + 83) = O(n^2). Also O(log(n)) ⊆ O(n),
but O(n) ⊈ O(log(n)).
1
In textbooks and research papers you may sometimes see this written as g = O(f ) but
that is questionable, comparing a function with a set of functions.
That is the same as saying O(log(n)) ⊊ O(n), which means that O(log(n))
is a proper subset of O(n); that is, O(log(n)) is a subset of O(n)
(O(log(n)) ⊆ O(n)), but they are not equal (O(log(n)) ≠ O(n)). Logarithms
to different (constant) bases are asymptotically the same: O(log2(n)) = O(logb(n))
because logb(n) = log2(n)/log2(b).
As a side note, it is mathematically correct to say the worst-case running
time of binary search is O(n), because log(n) ∈ O(n). It is, however, a
looser characterization than saying that the running time of binary search
is O(log(n)), which is also correct. Of course, it would be incorrect to say
that the running time is O(1). Generally, when we ask you to characterize
the worst-case running time of an algorithm we are asking for the tightest
bound in big-O notation.
3 Sorting Algorithms
We have seen in the last lecture that sorted arrays drastically reduce the
time to search for an element when compared to unsorted arrays. Asymp-
totically, it is the difference between O(n) (linear time) and O(log(n)) (loga-
rithmic time), where n is the length of the input array. This suggests that it
may be important to establish this invariant, namely sorting a given array.
In practice, this is indeed the case: sorting is an important component of
many other data structures or algorithms.
There are many different algorithms for sorting: bucket sort, bubble
sort, insertion sort, selection sort, heap sort, etc. This is testimony to the
importance and complexity of the problem, despite its apparent simplicity.
In this lecture we discuss selection sort, which is one of the simplest
algorithms. In the next lecture we will discuss quicksort. Earlier course in-
stances used mergesort as another example of efficient sorting algorithms.
4 Selection Sort
Selection sort is based on the idea that on each iteration we select the small-
est element of the part of the array that has not yet been sorted and move it
to the end of the sorted part at the beginning of the array.
Let’s play this through for two steps on an example array. Initially, we
consider the whole array (from i = 0 to the end). We write this as A[0..n),
that is the segment of the array starting at 0 up to n, where n is excluded.
index  0   1   2  3  4   5   6   7   8   9
A      12  87  21  3  2  78  97  16  89  21
(i = 0, n = 10)
We now find the minimal element of the array segment under consid-
eration (2) and move it to the front of the array. What do we do with the
element that is there? We move it to the place where 2 was (namely at A[4]).
In other words, we swap the first element with the minimal element. Swap-
ping is a useful operation when we sort an array in place by modifying
it, because the result is clearly a permutation of the input. If swapping is
our only operation we are immediately guaranteed that the result is a per-
mutation of the input.
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
A" 2" 87" 21" 3" 12" 78" 97" 16" 89" 21"
i" n"
Now 2 is in the right place, and we find the smallest element in the
remaining array segment and move it to the beginning of the segment (i =
1).
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
A" 2" 3" 21" 87" 12" 78" 97" 16" 89" 21"
i" n"
Let’s pause and see if we can write down properties of the variables and
array segments that allow us to write the code correctly. First we observe
rather straightforwardly that
0 ≤ i ≤ n
where i = n after the last iteration and i = 0 before the first iteration. Next
we observe that the elements to the left of i are already sorted.
A[0..i) sorted
These two invariants are not yet sufficient to prove the correctness of selec-
tion sort. We also need to know that all elements to the left of i are less or
equal to all elements to the right of i. We abbreviate this:
A[0..i) ≤ A[i..n)
saying that every element in the left segment is smaller than or equal to
every element in the right segment.
Let’s reason through without any code (for the moment), why these invari-
ants are preserved. Let’s look at the picture again.
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
A" 2" 3" 21" 87" 12" 78" 97" 16" 89" 21"
i" n"
In the next iteration we pick the minimal element among A[i..n), which
would be 12 = A[4]. We now swap this to i = 2 and increment i. We write
here i′ = i + 1 in order to distinguish the old value of i from the new one,
as we do in proofs of preservation of the loop invariant.
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
A" 2" 3" 12" 87" 21" 78" 97" 16" 89" 21"
We encourage you to now write the function, using the auxiliary functions
swap and min_index together with suitable specification functions for the
contracts. Please write it and then compare it to the version sketched below.
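Since the original listing does not appear in these notes, the following is a minimal sketch of what such a solution might look like. The names is_sorted, le_seg, and le_segs for the specification functions, and the exact signatures, are assumptions made for this sketch; swap and min_index are the auxiliary functions discussed in the next section.

/* assumed specification and auxiliary function declarations */
bool is_sorted(int[] A, int lower, int upper);      /* A[lower..upper) is sorted */
bool le_seg(int x, int[] A, int lower, int upper);  /* x <= every element of A[lower..upper) */
bool le_segs(int[] A, int lower1, int upper1, int lower2, int upper2);
  /* every element of A[lower1..upper1) <= every element of A[lower2..upper2) */
void swap(int[] A, int i, int j);
int min_index(int[] A, int lower, int upper);       /* index of a minimal element of A[lower..upper) */

void sort(int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@ensures is_sorted(A, 0, n);
{
  for (int i = 0; i < n; i++)
  //@loop_invariant 0 <= i && i <= n;
  //@loop_invariant is_sorted(A, 0, i);
  //@loop_invariant le_segs(A, 0, i, i, n);  /* A[0..i) <= A[i..n) */
  {
    int m = min_index(A, i, n);
    //@assert le_seg(A[m], A, i, n);         /* A[m] <= A[i..n) */
    swap(A, i, m);
  }
}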
At this point, let us verify that the loop invariants are initially satisfied.
We should also verify the assertion we added in the loop body. It ex-
presses that A[m] is less or equal to any element in the segment A[i..n),
abbreviated mathematically as A[m] ≤ A[i..n). This should be implied by
the postcondition of the min_index function.
How can we prove the postcondition (@ensures) of the sorting func-
tion? By the loop invariant 0 ≤ i ≤ n and the negation of the loop condition
i < n we know i = n. The second loop invariant then states that A[0..n) is
sorted, which is the postcondition.
6 Auxiliary Functions
Besides the specification functions in contracts, we also used two auxiliary
functions: swap and min_index.
Here is the implementation of swap.
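The listing itself is omitted here; a swap that exchanges two array elements is standard and might look like this:

void swap(int[] A, int i, int j)
//@requires 0 <= i && i < \length(A);
//@requires 0 <= j && j < \length(A);
{
  int tmp = A[i];
  A[i] = A[j];
  A[j] = tmp;
}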
For min_index, we recommend you follow the method used for selec-
tion sort: follow the algorithm for a couple of steps on a generic example,
write down the invariants in general terms, and then synthesize the simple
code and invariants from the result. What we have is below, for complete-
ness.
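The listing referred to here is not reproduced in these notes; a sketch consistent with the postcondition used in the sorting function (again treating le_seg as an assumed specification function) is:

int min_index(int[] A, int lower, int upper)
//@requires 0 <= lower && lower < upper && upper <= \length(A);
//@ensures lower <= \result && \result < upper;
//@ensures le_seg(A[\result], A, lower, upper);
{
  int m = lower;
  for (int j = lower+1; j < upper; j++)
  //@loop_invariant lower+1 <= j && j <= upper;
  //@loop_invariant lower <= m && m < j;
  //@loop_invariant le_seg(A[m], A, lower, j);  /* A[m] is minimal so far */
  {
    if (A[j] < A[m]) m = j;
  }
  return m;
}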
O(n(n + 1)/2) = O(n²/2 + n/2) = O(n²)
The last equation follows since for a polynomial, as we remarked earlier,
only the degree matters.
We summarize this by saying that the worst-case running time of selec-
tion sort is quadratic. In this algorithm there isn’t a significant difference
between average case and worst case analysis: the number of iterations is
exactly the same, and we only save one or two assignments per iteration in
the loop body of the min_index function if the array is already sorted.
8 Empirical Validation
If the running time were really O(n²) and not asymptotically faster, we pre-
dict the following: for large inputs, its running time should be essentially
c·n² for some constant c. If we double the size of the input to 2n, then the
running time should roughly become c·(2n)² = 4(c·n²), which means the
function should take approximately 4 times as many seconds as before.
We try this with the function sort_time(n, r) which generates a ran-
dom array of size n and then sorts it r times. You can find the C0 code at
sort-time.c0. We run this code several times, with different parameters.
   n    Time (s)   Ratio
1000      0.700
2000      2.700    3.85
4000     10.790    4.00
8000     42.796    3.97
We see that especially for the larger numbers, the ratio is almost exactly 4
when doubling the size of the input. Our conjecture of quadratic asymp-
totic running time has been experimentally confirmed.
Lecture 8
February 7, 2013
1 Introduction
In this lecture we first sketch two related algorithms for sorting that achieve
a much better running time than the selection sort from last lecture: merge-
sort and quicksort. We then develop quicksort and its invariants in detail.
As usual, contracts and loop invariants will bridge the gap between the
abstract idea of the algorithm and its implementation.
We will revisit many of the computational thinking, algorithm, and pro-
gramming concepts from the previous lectures. We highlight the following
important ones:
2 Mergesort
Let’s see how we can apply the divide-and-conquer technique to sorting.
How do we divide?
One simple idea is just to divide a given array in half and sort each
half independently. Then we are left with an array where the left half is
sorted and the right half is sorted. We then need to merge the two halves
into a single sorted array. We actually don’t really “split” the array into
two separate arrays, but we always sort array segments A[lower ..upper ).
We stop when the array segment is of length 0 or 1, because then it must be
sorted.
A straightforward implementation of this idea would be as follows:
void mergesort (int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(A, lower, upper);
{
if (upper-lower <= 1) return;
int mid = lower + (upper-lower)/2;
mergesort(A, lower, mid); //@assert is_sorted(A, lower, mid);
mergesort(A, mid, upper); //@assert is_sorted(A, mid, upper);
merge(A, lower, mid, upper);
return;
}
We would still have to write merge, of course. We use the specification func-
tion is_sorted from the last lecture that takes an array segment, defined
by its lower and upper bounds.
The simple and efficient way to merge two sorted array segments (so
that the result is again sorted) is to create a temporary array, scan each of
the segments from left to right, copying the smaller of the two into the
temporary array. This is a linear time (O(n)) operation, but it also requires
a linear amount of temporary space. Other algorithms, like quicksort later
in this lecture, sort entirely in place and do not require temporary memory
to be allocated. We do not develop the merge operation here further.
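The lecture deliberately does not develop merge further, but for concreteness, here is a minimal sketch of the idea just described; the signature and the use of a temporary array B are assumptions of this sketch.

void merge(int[] A, int lower, int mid, int upper)
//@requires 0 <= lower && lower <= mid && mid <= upper && upper <= \length(A);
//@requires is_sorted(A, lower, mid) && is_sorted(A, mid, upper);
//@ensures is_sorted(A, lower, upper);
{
  int[] B = alloc_array(int, upper-lower);  /* temporary space, O(n) */
  int i = lower;   /* next unread element of the left segment */
  int j = mid;     /* next unread element of the right segment */
  int k = 0;       /* next free slot of B */
  while (i < mid && j < upper) {
    if (A[i] <= A[j]) { B[k] = A[i]; i++; }
    else { B[k] = A[j]; j++; }
    k++;
  }
  while (i < mid)  { B[k] = A[i]; i++; k++; }   /* copy leftovers of the left segment */
  while (j < upper) { B[k] = A[j]; j++; k++; }  /* copy leftovers of the right segment */
  for (int l = 0; l < upper-lower; l++) {       /* copy the result back into A[lower..upper) */
    A[lower+l] = B[l];
  }
}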
The mergesort function represents an example of recursion: a function
(mergesort) calls itself on a smaller argument. When we analyze such a
function call it would be a mistake to try to analyze the function that we
call recursively. Instead, we reason about it using contracts.
This applies no matter whether the call is recursive, like it is in this example,
or not. In the mergesort code above the precondition is easy to see. We
have illustrated the postcondition with two explicit @assert annotations.
Reasoning about recursive functions using their contracts is an excel-
lent illustration of computational thinking, separating the what (that is, the
contract) from the how (that is, the definition of the function). To analyze
the recursive call we only care about what the function does.
We also need to analyze the termination behavior of the function, verify-
ing that the recursive calls are on strictly smaller arguments. What smaller
means differs for different functions; here the size of the subrange of the
array is what decreases. The quantity upper − lower is divided by two for
each recursive call and is therefore smaller since it is always greater or equal
to 2. If it were less than 2 we would return immediately and not make a
recursive call.
Let’s consider the asymptotic complexity of mergesort, assuming that
merging two adjacent sorted segments of combined length n takes O(n) operations.
[Diagram: the recursion tree for mergesort. At the top level there is 1 merge of n
elements, at the next level 2 merges of n/2, then 4 merges of n/4, and so on, down
to segments of length 1; each level costs O(n) in total. Mergesort, worst case:
log(n) levels, O(n) per level.]
We see that the asymptotic running time will be O(n·log(n)), because there
are O(log(n)) levels, and on each level we have to perform O(n) operations
to merge.
n"
1"par**on"*"n:"O(n)"
n/2"""""""""""""""""""""""""""""""""""""""""""""""""""""""n/2"
2"par**ons"*"n/2:"O(n)"
n/4""""""""""""""""""""n/4"""""""""""""""""""""""n/4"""""""""""""""""""""n/4"
4"par**ons"*"n/4:"O(n)"
1" 1" 1" 1" 1" 1" 1" 1" 1" 1" 1" 1" 1" 1" 1" 1" 1" 1" 1"
Quicksort,"best"case:"log(n)"levels,"O(n)"per"level"
At each level the total work is O(n) operations to perform the partition.
In the best case there will be O(log(n)) levels, leading us to the O(n·log(n))
best-case asymptotic complexity.
How many recursive calls do we have in the worst case, and how long
are the array segments? In the worst case, we always pick either the small-
est or largest element in the array so that one side of the partition will be
empty, and the other has all elements except for the pivot itself. In the ex-
ample above, the recursive calls might proceed as follows (where we have
surrounded the unsorted part of the array with brackets):
array                  pivot
[3, 1, 4, 4, 8, 2, 7]    1
1, [3, 4, 4, 8, 2, 7]    2
1, 2, [3, 4, 4, 8, 7]    3
1, 2, 3, [4, 4, 8, 7]    4
1, 2, 3, 4, [4, 8, 7]    4
1, 2, 3, 4, 4, [8, 7]    7
1, 2, 3, 4, 4, 7, [8]
All other recursive calls are with the empty array segment, since we never
have any unsorted elements less than the pivot. We see that in the worst
case there are n − 1 significant recursive calls for an array of size n. The
kth recursive call has to sort a subarray of size n − k, which proceeds by
partitioning, requiring O(n − k) comparisons.
This means that, overall, for some constant c we have
c · (0 + 1 + · · · + (n − 1)) = c · n(n − 1)/2 ∈ O(n²)
comparisons. Here we used the fact that O(p(n)) for a polynomial p(n) is
always equal to O(n^k) where k is the leading exponent of the polyno-
mial. This is because the largest exponent of a polynomial will eventually
dominate the function, and big-O notation ignores constant coefficients.
So quicksort has quadratic complexity in the worst case. How can we
mitigate this? If we could always pick the median among the elements in
the subarray we are trying to sort, then half the elements would be less and
half the elements would be greater. So in this case there would be only
log(n) recursive calls, where at each layer we have to do a total amount of
n comparisons, yielding an asymptotic complexity of O(n·log(n)).
Unfortunately, it is not so easy to compute the median to obtain the
optimal partitioning. It turns out that if we pick a random element, its ex-
pected rank will be close enough to the median that the expected running
time of the algorithm is still O(n·log(n)).
Quicksort solves the same problem as selection sort, so their contract is the
same, but their implementation differs. We sort the segment A[lower ..upper )
of the array between lower (inclusively) and upper (exclusively). The pre-
condition in the @requires annotation verifies that the bounds are mean-
ingful with respect to A. The postcondition in the @ensures clause guaran-
tees that the given segment is sorted when the function returns. It does not
express that the output is a permutation of the input, which is required to
hold but is not formally expressed in the contract (see Exercise 1).
Before we start the body of the function, we should consider how to
terminate the recursion. We don’t have to do anything if we have an array
segment with 0 or 1 elements. So we just return if upper − lower ≤ 1.
[Footnote: Actually not quite, with the code that we have shown. Can you find the reason?]
Here we use the auxiliary functions ge_seg (for greater or equal than segment)
and le_seg (for less or equal than segment), where ge_seg compares a given
element against a whole array segment, and le_seg does the same in the other
direction.
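The body of the sorting function itself is not reproduced at this point in the notes; a plausible sketch, using partition (developed in the next two sections) and the specification functions just mentioned, is given below. The exact signatures ge_seg(x, A, lower, upper) and le_seg(x, A, lower, upper), and the choice of the middle element as pivot, are assumptions of this sketch.

void sort(int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(A, lower, upper);
{
  if (upper-lower <= 1) return;
  int pivot_index = lower + (upper-lower)/2;   /* the middle element; see Exercise 3 */
  int mid = partition(A, lower, pivot_index, upper);
  //@assert ge_seg(A[mid], A, lower, mid);     /* A[lower..mid) <= A[mid] */
  //@assert le_seg(A[mid], A, mid+1, upper);   /* A[mid] <= A[mid+1..upper) */
  sort(A, lower, mid);
  sort(A, mid+1, upper);
}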
5 Partitioning
The trickiest aspect of quicksort is the partitioning step, in particular since
we want to perform this operation in place. Let’s consider situation when
partition is called:
[Diagram: the segment A[lower..upper) contains 2, 87, 21, 3, 12, 78, 97, 16, 89, 21, with pivot_index pointing at the element 16.]
Perhaps the first thing we notice is that we do not know where the pivot
will end up in the partitioned array! That’s because we don’t know how
many elements in the segment are smaller and how many are larger than
the pivot. In particular, the return value of partition could be different
than the pivot index that we pass in, even if the element that used to be at
the pivot index in the array before calling partition will be at the returned
index when partition is done. One idea is to make a pass over the seg-
ment and count the number of smaller elements, move the pivot into its
place, and then scan the remaining elements and put them into their place.
Fortunately, this extra pass is not necessary. We start by moving the pivot
element out of the way, by swapping it with the rightmost element in the
array segment.
[Diagram: after swapping the pivot 16 with the rightmost element, the segment reads 2, 87, 21, 3, 12, 78, 97, 21, 89, 16; the pivot now sits at index upper − 1.]
Now the idea is to gradually work towards the middle, accumulating el-
ements less than the pivot on the left and elements greater than the pivot
on the right end of the segment (excluding the pivot itself). For this pur-
pose we introduce two indices, left and right. We start them out at lower
and upper − 2, respectively.
[Diagram: the same segment, now with left starting at lower and right at upper − 2.]
Since 2 < pivot, we can advance the left index: this element is in the proper
place.
[Diagram: after advancing left past the 2, the region A[lower..left) is known to be ≤ pivot; the segment still reads 2, 87, 21, 3, 12, 78, 97, 21, 89, 16.]
At this point, 87 > pivot, so we swap it into A[right] and decrement the
right index.
…% 2" 89" 21" 3" 12" 78" 97" 21" 87" 16" …%
Let’s take one more step: 89 > pivot, so we swap it into A[right] and decre-
ment the right index again.
…% 2" 21" 21" 3" 12" 78" 97" 89" 87" 16" …%
At this point we pause to read off the general invariants which will
allow us to synthesize the program. We see:
(1) A[lower..left) ≤ pivot
(2) pivot ≤ A[right+1..upper−1)
…% 2" 12" 3" 21" 78" 97" 21" 89" 87" 16" …%
lower% upper%
Where do left and right need to be, according to our invariants? By invari-
ant (1), all elements up to but excluding left must be less or equal to pivot.
To guarantee we are finished, therefore, the left must address the element
21 at lower + 3. Similarly, invariant (2) states that the pivot must be less or
equal to all elements starting from right + 1 up to but excluding upper − 1.
Therefore, right must address the element 3 at lower + 2.
…% 2" 12" 3" 21" 78" 97" 21" 89" 87" 16" …%
This means after the last iteration, just before we exit the loop, we have
left = right + 1, and the invariants above continue to hold.
Now comes the last step: since left = right + 1, we have pivot ≤ A[left], and we
can swap the pivot at upper − 1 with the element at left to complete the partition
operation. We can also see that left should be returned as the new position
of the pivot element.
6 Implementing Partitioning
Now that we understand the algorithm and its correctness proof, it remains
to turn these insights into code. We start by swapping the pivot element to
the end of the segment.
At this point we initialize left and right to lower and upper − 2, respectively.
We have to make sure that the invariants are satisfied when we enter the
loop for the first time, so let’s write these.
The crucial observation here is that lower < upper by the precondition of
the function. Therefore left ≤ upper − 1 = right + 1 when we first en-
ter the loop, since right = upper − 2. The segments A[lower..left) and
A[right+1..upper−1) will both be empty, initially.
The code in the body of the loop just compares the element at index left
with the pivot and either increments left, or swaps the element to A[right].
Now we just note the observations about the final loop state with an as-
sertion, swap the pivot into place, and return the index left. The complete
function is sketched below, for reference.
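Since the original listing is not included in these notes, the following is a sketch of the complete function under the same assumptions as before (the prototype and the ge_seg/le_seg signatures); it follows the invariants developed above rather than reproducing the lecture's exact code.

int partition(int[] A, int lower, int pivot_index, int upper)
//@requires 0 <= lower && lower <= pivot_index;
//@requires pivot_index < upper && upper <= \length(A);
//@ensures lower <= \result && \result < upper;
//@ensures ge_seg(A[\result], A, lower, \result);
//@ensures le_seg(A[\result], A, \result+1, upper);
{
  int pivot = A[pivot_index];
  swap(A, pivot_index, upper-1);   /* move the pivot out of the way */

  int left = lower;
  int right = upper-2;
  while (left <= right)
  //@loop_invariant lower <= left && left <= right+1 && right+1 <= upper-1;
  //@loop_invariant ge_seg(pivot, A, lower, left);       /* A[lower..left) <= pivot */
  //@loop_invariant le_seg(pivot, A, right+1, upper-1);  /* pivot <= A[right+1..upper-1) */
  {
    if (A[left] <= pivot) {
      left++;                      /* element already in the <= pivot region */
    } else {
      swap(A, left, right);        /* move it to the >= pivot region */
      right--;
    }
  }
  //@assert left == right+1;
  swap(A, left, upper-1);          /* put the pivot into its final position */
  return left;
}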
Exercises
Exercise 1 In this exercise we explore strengthening the contracts on in-place
sorting functions.
3. Discuss any specific difficulties or problems that arise. Assess the outcome.
Exercise 2 Prove that the precondition for sort together with the contract for
partition implies the postcondition. During this reasoning you may also assume
that the contract holds for recursive calls.
Exercise 3 Our implementation of partitioning did not pick a random pivot, but
took the middle element. Construct an array with seven elements on which our
algorithm will exhibit its worst-case behavior, that is, on each step, one of the par-
titions is empty.
Exercise 4 An alternative way to track the unscanned part of the array segment
during partitioning is to make the segment A[left..right) exclusive on the right.
Rewrite the code for partition, including its invariants, for this version of the
indices.
Lecture 9
February 12, 2013
1 Introduction
In this lecture we introduce queues and stacks as data structures, e.g., for
managing tasks. They follow similar principles of organizing the data.
Both provide functionality for putting new elements into them. But they dif-
fer in the order in which the elements come out of the data structure
again. Both queues and stacks as well as many other data structures could
be added to the programming language. But they can be implemented eas-
ily as a library in C0. In this lecture, we will focus on the abstract principles
of queues and stacks and defer a detailed implementation to the next lec-
ture.
Relating this to our learning goals, we have
Computational Thinking: We illustrate the power of abstraction by con-
sidering both client-side and library-side of the interface to a data
structure.
Algorithms and Data Structures: We are looking at queues and stacks as
important data structures, we introduce abstract datatypes by exam-
ple.
Programming: Use and design of interfaces.
the top of the stack to make the stack bigger, and remove items from the top
as well to make the stack smaller. This makes stacks a LIFO (Last In First
Out) data structure – the data we have put in last is what we will get out
first.
Before we consider the implementation to a data structure it is helpful
to consider the interface. We then program against the specified interface.
Based on the description above, we require the following functions:
/* type elem must be defined */
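The rest of the interface listing does not appear in these notes; a plausible reconstruction of the declarations it contains is sketched here. The names match the functions used below, but the listing itself is an assumption, and elem is taken to be string for this lecture's running example.

typedef string elem;             /* client's choice of element type (here strings) */

/* the type stack itself remains abstract to the client */
bool stack_empty(stack S);       /* O(1), check if stack is empty */
stack stack_new();               /* O(1), create a new empty stack */
void push(stack S, elem e);      /* O(1), add item on top of the stack */
elem pop(stack S)                /* O(1), remove item from the top */
//@requires !stack_empty(S);
;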
x_1, x_2, . . . , x_n
where x_1 is the bottom of the stack and x_n is the top of the stack. We push
elements on the top and also pop them from the top.
For example:
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
“a”$ “b”$ “c”$
bo0om" top"
and write the pre- and postconditions for functions that implement the in-
terface. Here, it is a simple check of making sure that the bottom and top
indices are in the range of the array and that bottom stays at 0, where we
expect it to be.
bool is_stack(stack S)
{
if (!(S->bottom == 0)) return false;
if (!(S->bottom <= S->top)) return false;
//@assert S->top < \length(S->data);
return true;
}
so that we can read off the invariants of the data structure. A specification
function like is_stack should be safe – it should only ever return true or
false or raise an assertion violation – and if possible it should avoid rais-
ing an assertion violation. Assertion violations are sometimes unavoidable
because we can only check the length of an array inside of the assertion
language.
bool stack_empty(stack S)
//@requires is_stack(S);
{
return S->top == S->bottom;
}
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
“a”$ “b”$ “c”$
bo0om" top"
to
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
“a”$ “b”$ “c”$
bo0om" top"
The "c" can still be present in the array at position 3, but it is now a part of
the array that we don’t care about, which we indicate by putting an X over
it. In code, popping looks like this:
string pop(stack S)
//@requires is_stack(S);
//@requires !stack_empty(S);
//@ensures is_stack(S);
{
string r = S->data[S->top];
S->top--;
return r;
}
Notice that contracts are cumulative. Since we already indicated
//@requires !stack_empty(S);
in the interface of pop, we would not have to repeat this requires clause in
the implementation. We repeat it regardless to emphasize its importance.
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
“a”$ “b”$ “c”$
bo0om" top"
to
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
“a”$ “b”$ “c”$ “e”$
bo0om" top"
In code, we first need to record the capacity of the stack, so we extend the header struct:
struct stack_header {
string[] data;
int top;
int bottom;
int capacity; // capacity == \length(data);
};
typedef struct stack_header* stack;
Giving us the following updated view of array-based stacks:
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
“a”$ “b”$ “c”$
bool is_stack(stack S)
{
if (!(S->bottom == 0)) return false;
if (!(S->bottom <= S->top)) return false;
if (!(S->top < S->capacity)) return false;
//@assert S->capacity == \length(S->data);
return true;
}
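The push code itself is not shown in these notes; a sketch that mirrors the convention used by pop above, with the hard assert discussed next (its exact form is an assumption of this sketch), is:

void push(stack S, string x)
//@requires is_stack(S);
//@ensures is_stack(S);
//@ensures !stack_empty(S);
{
  assert(S->top < S->capacity-1);  /* hard assert: otherwise out of resources */
  S->top++;
  S->data[S->top] = x;
}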
This assertion can indeed fail if the client tries to push too many ele-
ments on the stack, which is why we use a hard assert – an assertion that
will run whether or not we compile with -d. The alternative would be to
expose the capacity of the stack to the user with a stack_full function
and then add a precondition //@requires !stack_full(S) to our push()
function.
stack stack_new()
//@ensures stack_empty(\result);
//@ensures is_stack(\result);
{
stack S = alloc(struct stack_header);
S->bottom = 0;
S->top = 0;
S->capacity = 100; // arbitrary resource bound
S->data = alloc_array(elem, S->capacity);
return S;
}
As shown above, we also need to allocate an array data to store the ele-
ments in. At this point, at the latest, we realize a downside of our stack im-
plementation. If we want to implement stacks in arrays in the simple way
that we just did, the trouble is that we need to decide its capacity ahead
of time. That is, we need to decide how many elements at maximum will
ever be allowed in the stack at the same time. Here, we arbitrarily choose
the capacity 100, but this gives us a rather poor implementation of stacks in
case the client needs to store more data. We will see how to solve this issue
with a better implementation of stacks in the next lecture.
This completes the implementation of stacks, which are a very simple
and pervasive data structure.
5 Abstraction
An important point about formulating a precise interface to a data structure
like a stack is to achieve abstraction. This means that as a client of the data
structure we can only use the functions in the interface. In particular, we
are not permitted to use or even know about details of the implementation
of stacks.
Let’s consider an example of a client-side program. We would like to
examine the element at the top of the stack without removing it from the stack.
Such a function would have the declaration
string peek(stack S)
//@requires !stack_empty(S);
;
A client might be tempted to implement peek by reaching into the representation of the stack:
string peek(stack S)
//@requires !stack_empty(S);
{
return S->data[S->top];
}
We don’t see any top field, or any data field, so accessing these as a
client of the data structure would violate the abstraction. Why is this so
wrong? The problem is that if the library implementer decided to improve
the code, or perhaps even just rename some of the structures to make it eas-
ier to read, then the client code will suddenly break! In fact, we will provide
a different implementation of stacks in the next lecture, which would make
the above implementation of peek break. With the above client-side im-
plementation of peek, the stack interface does not serve the purpose it is
intended for, namely provide a reliable way to work with a data structure.
Interfaces are supposed to separate the implementation of a data structure
in a clean way from its use so that we can change one of the two without
affecting the other.
So what can we do? It is possible to implement the peek operation
without violating the abstraction! Consider how before you read on.
The idea is that we pop the top element off the stack, remember it in a
temporary variable, and then push it back onto the stack before we return.
string peek(stack S)
//@requires !stack_empty(S);
{
string x = pop(S);
push(S, x);
return x;
}
This is clearly less efficient: instead of just looking up the fields of a struct
and accessing an element of an array we actually have to pop an element
and then push it back onto the stack. However, it is still a constant-time
operation (O(1)) since both pop and push are constant-time operations.
Nonetheless, this gives us a possible argument for including a function peek in
the interface and implementing it library-side instead of client-side, to save a
small constant amount of time.
If we are actually prepared to extend the interface, then we can go back
to our original implementation.
string peek(stack S)
//@requires !stack_empty(S);
{
return S->data[S->top];
}
Is this a good implementation? Not quite. First we note that inside the
library we should refer to elements as having type elem, not string. For
our running example, this is purely a stylistic matter because these two
are synonyms. But, just as it is important that clients respect the library
interface, it is important that the library respect the client interface. In this
case, that means that the users of a stack can, without changing the library,
decide to change the definition of the elem type in order to store different data
in the stack.
Second we note that we are now missing a precondition. In order to
even check if the stack is non-empty, we first need to be assured that it
is a valid stack. On the client side, all elements of type stack come from
the library, and any violation of data structure invariants could only be
discovered when we hand it back through the library interface to a function
implemented in the library. Therefore, the client can assume that values of
type stack are valid and we don’t have explicit pre- or post-conditions for
those. Inside the library, however, we are constantly manipulating the data
structure in ways that break and then restore the invariants, so we should
check if the stack is indeed valid.
From these two considerations we obtain the following code for inside
the library:
elem peek(stack S)
//@requires is_stack(S);
//@requires !stack_empty(S);
{
return S->data[S->top];
}
int stack_size(stack S)
//@requires is_stack(S);
{
return S->top - S->bottom;
}
x_1, x_2, . . . , x_n
where x_1 is the front of the queue and x_n is the back of the queue. We enqueue
elements in the back and dequeue them from the front.
For example:
queue C = Q;
will not have the effect of copying the queue Q into a new queue C. Just
as for the case of array, this assignment makes C and Q alias, so if we
change one of the two, for example enqueue an element into C, then the
other queue will have changed as well. Just as for the case of arrays, we
need to implement a function for copying the data.
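For reference, the queue interface assumed by the client-side code below might contain declarations like these. This is only a sketch; elem is taken to be string for this example, and queue remains an abstract type to the client.

typedef string elem;
bool queue_empty(queue Q);   /* O(1), check if queue is empty */
queue queue_new();           /* O(1), create a new empty queue */
void enq(queue Q, elem e);   /* O(1), add item at the back */
elem deq(queue Q)            /* O(1), remove item from the front */
//@requires !queue_empty(Q);
;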
The queue interface provides functions that allow us to dequeue data
from the queue, which we can do as long as the queue is not empty. So we
create a new queue C. Then we read all data from queue Q and put it into
the new queue C.
queue C = queue_new();
while (!queue_empty(Q)) {
enq(C, deq(Q));
}
//@assert queue_empty(Q);
Now the new queue C will contain all data that was previously in Q, so C
is a copy of what used to be in Q. But there is a problem with this approach.
Before you read on, can you find out which problem?
We could try to enqueue all data that we have read from Q back into Q
before putting it into C.
queue C = queue_new();
while (!queue_empty(Q)) {
string s = deq(Q);
enq(Q, s);
enq(C, s);
}
//@assert queue_empty(Q);
But there is something very fundamentally wrong with this idea. Can you
figure it out?
The problem with the above attempt is that the loop will never termi-
nate unless Q is empty to begin with. For every element that the loop body
dequeues from Q, it enqueues one element back into Q. That way, Q will
always have the same number of elements and will never become empty.
Therefore, we must go back to our original strategy and first read all ele-
ments from Q. But instead of putting them into C, we will put them into a
third queue T for temporary storage. Then we will read all elements from
the temporary storage T and enqueue them into both the copy C and back
into the original queue Q. At the end of this process, the temporary queue
T will be empty, which is fine, because we will not need it any longer. But
both the copy C and the original queue Q will be replenished with all the
elements that Q had originally. And C will be a copy of Q.
queue queue_copy(queue Q) {
queue T = queue_new();
while (!queue_empty(Q)) {
enq(T, deq(Q));
}
//@assert queue_empty(Q);
queue C = queue_new();
while (!queue_empty(T)) {
string s = deq(T);
enq(Q, s);
enq(C, s);
}
//@assert queue_empty(T);
return C;
}
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
“a”$ “b”$ “c”$
struct queue_header {
elem[] data;
int front;
int back;
int capacity;
};
typedef struct queue_header* queue;
When does a struct of this type represent a valid queue? In fact, when-
ever we define a new data type representation we should first think about
the data structure invariants. Making these explicit is important as we
think about and write the pre- and postconditions for functions that im-
plement the interface.
What we need here is simply that the front and back are within the
array bounds for the array data and that the capacity is not too small. The
element at the back of the queue is not used (marked X) but must still lie within
the array, so we decide to
require that the capacity of a queue be at least 2 to make sure we can store
at least one element. (The WARNING about NULL still applies here.)
bool is_queue(queue Q)
{
if (Q->capacity < 2) return false;
if (Q->front < 0 || Q->front >= Q->capacity) return false;
if (Q->back < 0 || Q->back >= Q->capacity) return false;
//@assert Q->capacity == \length(Q->data);
return true;
}
To check if the queue is empty we just compare its front and back. If
they are equal, the queue is empty; otherwise it is not. We require that we
are being passed a valid queue. Generally, when working with a data struc-
ture, we should always require and ensure that its invariants are satisfied
in the pre- and post-conditions of the functions that manipulate it. Inside
the function, we will generally temporarily violate the invariants.
bool queue_empty(queue Q)
//@requires is_queue(Q);
{
return Q->front == Q->back;
}
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
“a”$ “b”$ “c”$
And in code:
elem deq(queue Q)
//@requires is_queue(Q);
//@requires !queue_empty(Q);
//@ensures is_queue(Q);
{
elem e = Q->data[Q->front];
Q->front++;
return e;
}
To enqueue something, that is, add a new item to the back of the queue,
we just write the data (here: a string) into the extra element at the back, and
increment back. You should draw yourself a diagram before you write this
kind of code. Here is a before-and-after diagram for inserting "e":
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
“a”$ “b”$ “c”$
0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
“a”$ “b”$ “c”$ “e”$
In code:
void enq(queue Q, string s)
//@requires is_queue(Q);
//@ensures is_queue(Q);
//@ensures !queue_empty(Q);
{
assert(Q->back < Q->capacity-1); // otherwise out of resources
Q->data[Q->back] = s;
Q->back++;
}
To obtain a new empty queue, we allocate a queue header struct and initialize both
front and back to 0, the first element of the array. We do not initialize the
elements in the array because its contents are irrelevant until some data is
put in. It is good practice to always initialize memory if we care about its
contents, even if it happens to be the same as the default value placed there.
queue queue_new()
//@ensures is_queue(\result);
//@ensures queue_empty(\result);
{
queue Q = alloc(struct queue_header);
Q->front = 0;
Q->back = 0;
Q->capacity = 100;
Q->data = alloc_array(elem, Q->capacity);
return Q;
}
Those contracts need to hold for all queue implementations. Why did we
decide not to include them? The reason is that there are not many situa-
tions in which this knowledge about queues is useful, because we rarely
want to dequeue an element right after enqueueing it. This is in contrast
to the //@requires !queue_empty(Q) contract of deq, which is critical for
the client to know about, because he can only dequeue elements from non-
empty queues and has to check for non-emptiness before calling deq.
Similar observations hold for our rationale for designing the stack in-
terface.
Then we change the precondition of enq to require that elements can only
be enqueued if the queue is not full, and similarly require that pushing is only
possible if the stack is not full (see the sketch below).
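In the interface, the strengthened contracts might then read as follows. This is a sketch; stack_full was suggested earlier, and queue_full is the analogous name assumed here.

bool queue_full(queue Q);
void enq(queue Q, elem e)
//@requires !queue_full(Q);
;

bool stack_full(stack S);
void push(stack S, elem e)
//@requires !stack_full(S);
;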
The advantage of this design is that the client now has a way of checking
whether there still is space in the stack/queue. The downside, however, is
that the client still does not have a way of increasing the capacity if he wants
to store more data in it.
In the next lecture, we will see a better implementation of stacks and
queues that does not have any of those capacity bounds. That implemen-
tation uses pointers and linked lists.
Exercises
Exercise 1 Can you implement a version of stack that does not use the bottom
field in the struct stack_header?
Exercise 2 Consider what would happen if we pop an element from the empty
stack when contracts are not checked? When does an error arise?
Exercise 5 Our queue design always “wasted” one element that we marked X.
Can we save this memory and implement the queue without extra elements? What
are the tradeoffs and alternatives when implementing a queue?
Exercise 6 The stack implementation using arrays may run out of space if its
capacity is exceeded. Can you think of a way of implementing unbounded stacks
stored in an array?
Lecture 10
February 14, 2013
1 Introduction
In this lecture we complete our discussion of types in C0 by discussing
pointers and structs, two great tastes that go great together. We will discuss
using contracts to ensure that pointer accesses are safe, as well as the use
of linked lists to implement the stack and queue interfaces that were
introduced last time. The linked list implementation of stacks and queues
allows us to handle lists of any length.
Relating this to our learning goals, we have
Algorithms and Data Structures: Linked lists are a fundamental data struc-
ture.
Programming: We will see structs and pointers, and the use of recursion in
the definition of structs.
string[]"A"
struct img_header {
pixel[] data;
int width;
int height;
};
Here data, width, and height are not variables, but fields of the struct.
The declaration expresses that every image has an array of data as well as a
width and a height. This description is incomplete, as there are some miss-
ing consistency checks – we would expect the length of data to be equal to
the width times the height, for instance, but we can capture such properties
in a separate data structure invariant.
Structs do not necessarily fit into a machine word because they can have
arbitrarily many components, so they must be allocated on the heap (in
memory, just like arrays). This is true even if they happen to be small enough to
fit into a word, in order to maintain a uniform and simple language.
% coin structdemo.c0
C0 interpreter (coin) 0.3.2 ’Nickel’
Type ‘#help’ for help or ‘#quit’ to exit.
--> struct img_header IMG;
<stdio>:1.1-1.22:error:type struct img_header not small
[Hint: cannot pass or store structs in variables directly; use
pointers]
We can access the fields of a struct, for reading or writing, through the
notation p->f, where p is a pointer to a struct and f is the name of a field
in that struct. Continuing above, after allocating the struct on the heap with
IMG = alloc(struct img_header), let's see what the default values are in the
allocated memory.
--> IMG->data;
(default empty int[] with 0 elements)
--> IMG->width;
0 (int)
--> IMG->height;
0 (int)
We can write to the fields of a struct by using the arrow notation on the
left-hand side of an assignment.
3 Pointers
As we have seen in the previous section, a pointer is needed to refer to a
struct that has been allocated on the heap. It can also be used more gener-
ally to refer to an element of arbitrary type that has been allocated on the
heap. For example:
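The original example is not shown at this point; the following minimal snippet, with an arbitrary value, illustrates the same idea.

int* p = alloc(int);   /* allocate an int cell on the heap; it starts out as 0 */
*p = 42;               /* write through the pointer */
int y = *p;            /* read through the pointer; y is now 42 */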
In this case we refer to the value using the notation *p, either to read (when
we use it inside an expression) or to write (if we use it on the left-hand side
of an assignment).
So we would be tempted to say that a pointer value is simply an ad-
dress. But this story, which was correct for arrays, is not quite correct for
pointers. There is also a special value NULL. Its main feature is that NULL is
not a valid address, so we cannot dereference it to obtain stored data. For
example:
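The example itself is missing here; a minimal illustration (with arbitrary variable names) is:

int* p = NULL;
int x = *p;   /* aborts: dereferencing NULL is a safety violation in C0 */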
4 Linked Lists
Linked lists are a common alternative to arrays in the implementation of
data structures. Each item in a linked list contains a data element of some
type and a pointer to the next item in the list. It is easy to insert and delete
elements in a linked list, which are not natural operations on arrays, since
arrays have a fixed size. On the other hand access to an element in the
middle of the list is usually O(n), where n is the length of the list.
An item in a linked list consists of a struct containing the data element
and a pointer to another linked list. In C0 we have to commit to the type
of element that is stored in the linked list. We will refer to this data as
having type elem, with the expectation that there will be a type definition
elsewhere telling C0 what elem is supposed to be. Keeping this in mind
ensures that none of the code actually depends on what type is chosen.
These considerations give rise to the following definition:
struct list_node {
elem data;
struct list_node* next;
};
typedef struct list_node list;
Note that a struct may not contain a field of its own struct type directly, as in the following erroneous definition, since such a struct would have to have infinite size; it may only contain a pointer to another value of the same struct type.
struct infinite {
int x;
struct infinite next;
};
[Diagram: a queue represented as a linked list, with a front pointer to the first list node and a back pointer to a final, unused list node.]
struct queue_header {
list* front;
list* back;
};
typedef struct queue_header* queue;
We call this a header because it doesn’t hold any elements of the queue, just
pointers to the linked list that really holds them. The type definition allows
us to use queue as a type that represents a pointer to a queue header. We
define it this way so we can hide the true implementation of queues from
the client and just call it an element of type queue.
When does a struct of this type represent a valid queue? In fact, when-
ever we define a new data type representation we should first think about
the data structure invariants. Making these explicit is important as we
think about and write the pre- and postconditions for functions that im-
plement the interface.
What we need here is if we follow front and then move down the
linked list we eventually arrive at back. We call this a list segment. We
also want both front and back not to be NULL so it conforms to the picture,
with one element already allocated even if the queue is empty.
bool is_queue(queue Q) {
if (Q == NULL) return false;
if (Q->front == NULL) return false;
if (Q->back == NULL) return false;
if (!is_segment(Q->front, Q->back)) return false;
return true;
}
Next, the code for checking whether two pointers delineate a list segment.
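That code is not reproduced in these notes; a version consistent with the description that follows might be:

bool is_segment(list* start, list* end)
{
  list* p = start;
  while (p != end) {
    if (p == NULL) return false;  /* fell off the end of the list without finding end */
    p = p->next;
  }
  return true;                    /* reached end, so we have a valid segment */
}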
When both start and end are NULL, we consider it a valid list segment, even
though this will never come up for queues. It is a common code pattern for
working with linked lists and similar data representation to have a pointer
variable, here called p, that is updated to the next item in the list on each
iteration until we hit the end of the list.
If we reach NULL before finding end, the segment is not valid and we
return false. The other situation is if we find end, in which case we return
true since we have a valid list segment. This function may not terminate
if the list contains a cycle. We will address this issue in the next lecture; for
now we assume all lists are acyclic.
To check if the queue is empty we just compare its front and back. If
they are equal, the queue is empty; otherwise it is not. We require that we
are being passed a valid queue. Generally, when working with a data struc-
ture, we should always require and ensure that its invariants are satisfied
in the pre- and post-conditions of the functions that manipulate it. Inside
the function, we will generally temporarily violate the invariants.
bool queue_empty(queue Q)
//@requires is_queue(Q);
{
return Q->front == Q->back;
}
To obtain a new empty queue, we just allocate a list struct and point both
front and back of the new queue to this struct. We do not initialize the list
element because its contents are irrelevant, according to our representation.
It is good practice to always initialize memory if we care about its contents,
even if it happens to be the same as the default value placed there.
queue queue_new()
//@ensures is_queue(\result);
//@ensures queue_empty(\result);
{
queue Q = alloc(struct queue_header);
list* p = alloc(struct list_node);
Q->front = p;
Q->back = p;
return Q;
}
Let’s take one of these lines apart. Why does
queue Q = alloc(struct queue_header);
make sense? According to the definition of alloc, we might expect
struct queue_header* Q = alloc(struct queue_header);
since allocation returns the address of what we allocated. Fortunately, we
defined queue to be a short-hand for struct queue_header* so all is well.
To enqueue something, that is, add a new item to the back of the queue,
we just write the data (here: a string) into the extra element at the back,
create a new back element, and make sure the pointers are updated correctly.
You should draw yourself a diagram before you write this kind of code.
Here is a before-and-after diagram for inserting "3" into a list. The new or
updated items are dashed in the second diagram.
[Diagram: before the enqueue, the list holds 1 and 2, with back pointing to an unused node at the end; after enqueueing 3, the formerly unused node holds 3 and points to a newly allocated unused node, which back now addresses.]
Another important point to keep in mind as you are writing code that ma-
nipulates pointers is to make sure you perform the operations in the right
order, if necessary saving information in temporary variables.
void enq(queue Q, string s)
//@requires is_queue(Q);
//@ensures is_queue(Q);
{
list* p = alloc(struct list_node);
Q->back->data = s;
Q->back->next = p;
Q->back = p;
}
[Diagram: before the dequeue, front points to the node holding 1; after the dequeue, front points to the node holding 2, and the node holding 1 is no longer part of the queue.]
And in code:
string deq(queue Q)
//@requires is_queue(Q);
//@requires !queue_empty(Q);
//@ensures is_queue(Q);
{
string s = Q->front->data;
Q->front = Q->front->next;
return s;
}
Let's verify that our pointer dereferencing operations are safe. We have
Q->front->data
bool is_queue(queue Q) {
if (Q == NULL) return false;
if (Q->front == NULL) return false;
if (Q->back == NULL) return false;
if (!is_segment(Q->front, Q->back)) return false;
return true;
}
We see that Q->front is okay, because by the first test we know that Q != NULL
if the precondition holds. By the next two tests we see that both Q->front and
Q->back are not null, and we can therefore dereference them.
We also make the assignment Q->front = Q->front->next. Why does
this preserve the invariant? Because we know that the queue is not empty
(second precondition of deq) and therefore Q->front != Q->back. Because
Q->front to Q->back is a valid non-empty segment, Q->front->next can-
not be null.
An interesting point about the dequeue operation is that we do not ex-
plicitly deallocate the first element. If the interface is respected there cannot
be another pointer to the item at the front of the queue, so it becomes un-
reachable: no operation in the remainder of the running program could
ever refer to it. This means that the garbage collector of the C0 runtime sys-
tem will recycle this list item when it runs short of space.
the data at the bottom of the stack is meaningless and will not be used in
our implementation. A typical stack then has the following form:
[Diagram: a stack represented as a linked list of nodes holding 3, 2, 1, with top pointing to the first node and bottom to the last.]
struct list_node {
elem data;
struct list_node* next;
};
typedef struct list_node list;
struct stack_header {
list* top;
list* bottom;
};
typedef struct stack_header* stack;
bool is_stack(stack S) {
if (S == NULL) return false;
if (S->top == NULL) return false;
if (S->bottom == NULL) return false;
if (!is_segment(S->top, S->bottom)) return false;
return true;
}
Popping from a stack requires taking an item from the front of the
linked list, which is much like dequeuing.
elem pop(stack S)
//@requires is_stack(S);
//@requires !stack_empty(S);
//@ensures is_stack(S);
{
elem e = S->top->data;
S->top = S->top->next;
return e;
}
To push an element onto the stack, we create a new list item, set its data
field and then its next field to the current top of the stack – the opposite end
of the linked list from the queue. Finally, we need to update the top field of
the stack to point to the new list item. While this is simple, it is still a good
idea to draw a diagram. We go from
[Diagram: the stack as a list 3 -> 2 -> 1, with top pointing at the node holding 3]
to
[Diagram: the same list with a new node in front of 3; the new node's next pointer addresses the old top, and top now points to the new node.]
In code:
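The listing is omitted here; following the description above, a sketch of push is:

void push(stack S, elem e)
//@requires is_stack(S);
//@ensures is_stack(S);
//@ensures !stack_empty(S);
{
  list* p = alloc(struct list_node);  /* new node for the new top element */
  p->data = e;
  p->next = S->top;                   /* old top becomes the second element */
  S->top = p;
}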
Exercises
Exercise 1 Consider what would happen if we pop an element from the empty
stack when contracts are not checked in the linked list implementation? When
does an error arise?
Exercise 2 Stacks are usually implemented with just one pointer in the header, to
the top of the stack. Rewrite the implementation in this style, dispensing with the
bottom pointer, terminating the list with NULL instead.
Lecture 12
February 21, 2013
1 Introduction
Most lectures so far had topics related to all three major categories of learn-
ing goals for the course: computational thinking, algorithms, and program-
ming. The same is true for this lecture. With respect to algorithms, we in-
troduce unbounded arrays and operations on them. Analyzing them requires
amortized analysis, a particular way to reason about sequences of operations
on data structures. We also briefly talk again about data structure in-
variants and interfaces, which are crucial computational thinking concepts.
2 Unbounded Arrays
In the second homework assignment, you were asked to read in some files
such as the Collected Works of Shakespeare, the Scrabble Players Dictionary, or
anonymous tweets collected from Twitter. What kind of data structure do
we want to use when we read the file? In later parts of the assignment
we want to look up words, perhaps sort them, so it is natural to want to
use an array of strings, each string constituting a word. A problem is that
before we start reading we don’t know how many words there will be in
the file so we cannot allocate an array of the right size! One solution uses
either a queue or a stack. We discussed this in Lecture 9 on Queues and
in Lecture 10 on Pointers. Unlike the linked-list implementation of queues
from lecture 10, the array-based implementation of queues from Lecture 9
was still capacity-bounded. It would work, however, if we had unbounded
arrays. In fact, in unbounded arrays, we could store the data directly. While
arrays are a language primitive, unbounded arrays are a data structure that
we need to implement.
Thinking about it abstractly, an unbounded array should be like an ar-
ray in the sense that we can get and set the value of an arbitrary element
via its index i. We should also be able to add a new element to the end of
the array, and delete an element from the end of the array.
We use the unbounded array by creating an empty one before reading a
file. Then we read words from the file, one by one, and add them to the end
of the unbounded array. Meanwhile we can count the number of elements
to know how many words we have read. We trust the data structure not to
run out of space unless we hit some hard memory limit, which is unlikely
for the kind of task we have in mind, given modern operating systems.
When we have read the whole file the words will be in the unbounded
array, in order, the first word at index 0, the second at index 1, etc.
The general implementation strategy is as follows. We maintain an ar-
ray of a fixed length limit and an internal index size which tracks how many
elements are actually used in the array. When we add a new element we
increment size, when we remove an element we decrement size. The tricky
issue is how to proceed when we are already at the limit and want to add
another element. At that point, we allocate a new array with a larger limit
and copy the elements we already have to the new array. For reasons we
explain later in this lecture, every time we need to enlarge the array we dou-
ble its size. Removing an element from the end is simple: we just decrement
size. There are some issues to consider if we want to shrink the array, but
this is optional.
struct uba_header {
int limit; /* 0 < limit */
int size; /* 0 <= size && size <= limit */
elem[] A; /* \length(A) == limit */
};
typedef struct uba_header* uba;
violated. However, since the lengths of arrays can only be checked in con-
tracts (they may not be available when a program is compiled without -d
to make computation as efficient as possible) we may have to use contracts
to some extent even for functions whose intended use is only in contracts.
Note that we must check to make sure that L != NULL before checking any
other fields, including L->size and L->A (i.e. L->limit == \length(L->A))
in order to make sure the pointer dereferences on L are safe. Safety of an-
notations and safety of contract functions is just as indispensable as safety
in the rest of the code.
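The contract function itself is not shown in these notes; a sketch matching the invariants annotated in the struct declaration (the name is_uba follows the uba_rem code later in this section) is:

bool is_uba(uba L)
{
  if (L == NULL) return false;                         /* check this before any field access */
  if (!(0 < L->limit)) return false;
  if (!(0 <= L->size && L->size <= L->limit)) return false;
  //@assert L->limit == \length(L->A);                 /* array length only checkable in contracts */
  return true;
}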
To create a new unbounded array, we allocate a struct uba_header
and an array of a supplied initial limit.
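A sketch of that allocation (the function name and signature are assumptions, and elem is assumed to be defined, e.g. as string):

uba uba_new(int limit)
//@requires 0 < limit;
//@ensures is_uba(\result);
{
  uba L = alloc(struct uba_header);
  L->limit = limit;
  L->size = 0;
  L->A = alloc_array(elem, limit);
  return L;
}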
We check that doubling the array size would not overflow and raise an
assertion failure. Using assert as a statement instead of inside an anno-
tation means that the assertion will always be checked, even if the code
is compiled without -d. It will have the same effect as, for example, the
alloc_array function when there is not enough memory to allocate the
array.
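The corresponding code is not reproduced here; a sketch of adding an element, with the doubling and the hard overflow check described above (the names and the exact bound used in the assert are assumptions of this sketch), might be:

void uba_resize(uba L, int new_limit)
//@requires is_uba(L);
//@requires L->size <= new_limit && 0 < new_limit;
//@ensures is_uba(L);
{
  elem[] B = alloc_array(elem, new_limit);
  for (int i = 0; i < L->size; i++) {
    B[i] = L->A[i];                    /* one write per copied element */
  }
  L->limit = new_limit;
  L->A = B;
}

void uba_add(uba L, elem e)
//@requires is_uba(L);
//@ensures is_uba(L);
{
  if (L->size == L->limit) {
    assert(L->limit <= 1073741823);    /* hard assert: 2*limit must not exceed 2^31-1 */
    uba_resize(L, 2*L->limit);
  }
  L->A[L->size] = e;                   /* the write for the new element itself */
  L->size++;
}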
We discuss how to remove an element from an array in section 6.
5 Amortized Analysis
It is easy to see that reading from or writing to an unbounded array at a
given index is a constant-time operation. However, adding an element to
an array is not. Most of the time it takes constant time O(1), but when we
have run out of space it takes time O(size) because we have to copy the old
elements to the new underlying array. On the other hand, it doesn’t seem
to happen very often. Can we characterize this situation more precisely?
This is the subject of amortized analysis.
In order to make the analysis as concrete as possible, we want to count
the number of writes to an array, that is, the number of assignments of the
form A[i] = e that are performed. Calling the operation to add a new element
to an unbounded array an insert, we claim: a sequence of n inserts into an
initially empty unbounded array requires only O(n) array writes in total, so
the amortized cost of each insert is constant.
This statement is quite different from what we have done before, when we
have analyzed the cost of a particular function call like sort or binsearch.
Based on the common use of unbounded arrays, we should consider the
cost of multiple operations together. Many other data structures introduced
later in the course will be subject to a similar analysis.
How do we prove the above bound? A simple insert (when there is
room in the array) requires a single write operation, so we count it as 1.
Similarly, we count the act of copying one element from one array to an-
other as 1 operation, because it requires one write operation. Now per-
forming a sequence of inserts, starting with an empty array with limit 4, say,
looks as follows.
call op’s size limit
uba_add(L,"a") 1 1 4
uba_add(L,"b") 1 2 4
uba_add(L,"c") 1 3 4
uba_add(L,"d") 1 4 4
uba_add(L,"e") 5 5 8
uba_add(L,"f") 1 6 8
uba_add(L,"g") 1 7 8
uba_add(L,"h") 1 8 8
uba_add(L,"i") 9 9 16
We have taken 4 extra operations when inserting "e" in order to copy "a"
through "d". Overall, we have performed 21 operations for inserting 9
elements. Would that be O(n) by the time we had inserted n elements?
We approach this by giving ourselves an overall budget of c · n operations
(“tokens”) before we start to insert n elements. Every time we perform
a write operation we spend a token. If we perform all n inserts without
running out of tokens, we have achieved the desired amortized complexity.
One difficulty is to guess the right constant c. We already know that
c = 1 or c = 2 will not be enough, because in the sequence above we must
spend 21 tokens to insert 9 elements. Let’s try c = 3, so we start with 27
tokens.
call             op's  tokens left  size  limit
uba_add(L,"a")    1        26         1     4
uba_add(L,"b")    1        25         2     4
uba_add(L,"c")    1        24         3     4
uba_add(L,"d")    1        23         4     4
uba_add(L,"e")    5        18         5     8
uba_add(L,"f")    1        17         6     8
uba_add(L,"g")    1        16         7     8
uba_add(L,"h")    1        15         8     8
uba_add(L,"i")    9         6         9    16
We see that we spend 4 tokens when adding "e" to copy "a" through "d",
and we spend one more for the insertion of "e" itself.
One of the insights of amortized analysis is that we don’t need to know
the number n of inserts ahead of time. In order to achieve the bound of
c · n operations, we simply allow each call to perform c operations. If it
performs fewer, these remain in the budget and may be spent later! Let’s
go through the same sequence of calls again.
call             op's  allocated  spent  saved  total saved  size  limit
                       tokens     tokens tokens tokens
uba_add(L,"a")    1        3        1      2        2          1     4
uba_add(L,"b")    1        3        1      2        4          2     4
uba_add(L,"c")    1        3        1      2        6          3     4
uba_add(L,"d")    1        3        1      2        8          4     4
uba_add(L,"e")    5        3        5     -2        6          5     8
uba_add(L,"f")    1        3        1      2        8          6     8
uba_add(L,"g")    1        3        1      2       10          7     8
uba_add(L,"h")    1        3        1      2       12          8     8
uba_add(L,"i")    9        3        9     -6        6          9    16
The crucial property we need is that there are k ≥ 0 tokens left just after
we have doubled the size of the array. We think of this as an invariant of
the computation: it should always be true, no matter how many strings we
insert. In this example we reach 6 tokens after 5 inserts and again after 9
inserts.
To prove this invariant, we must show that it holds the first time we
have to double the size of the array, and that it is preserved by the opera-
tions.
When we create the array, we give it some initial limit limit_0. We run
out of space once we have inserted limit_0 elements, arriving at the following
situation.
[Diagram: the array of size limit_0 is full, holding "a" through "d", and limit_0 tokens ($) have been saved; a new array of size 2·limit_0 is allocated and the elements are copied over, leaving k ≥ 0 tokens and an array with limit = 2·size.]
After size more inserts we are at limit and have added another 2 · size = limit
tokens.
On the next insert we double the size of the array and copy limit array
elements, spending limit tokens.
6 Removing Elements
Removing elements from the end of the array is simple, and does not change
our amortized analysis, unless we want to shrink the size of the array.
A first idea might be to simply cut the array in half whenever size
reaches half the size of the array. However, this cannot work in constant
amortized time. The example demonstrating that is an alternating sequence
of n inserts and n deletes precisely when we are at the limit of the array. In
that case the total cost of the 2·n operations will be O(n²).
To avoid this problem we cut the size of the array in half only when the
number of elements in it reaches limit/4. The amortized analysis requires
two tokens for any deletion: one to delete the element, and one for any
future copy. Then if size = limit/2 just after we doubled the size of the
array and have no tokens, putting aside one token on every delete means
that we have size/2 = limit/4 tokens when we arrive at a size of limit/4.
Again, we have just enough tokens to copy the limit/4 elements to the new,
smaller array of size limit/2.
The code for uba_rem (“remove from end”):
elem uba_rem(uba L)
//@requires is_uba(L);
//@requires L->size > 0;
//@ensures is_uba(L);
{
if (L->size <= L->limit/4 && L->limit >= 2)
uba_resize(L, L->limit/2);
L->size--;
elem e = L->A[L->size];
return e;
}
We explicitly check that L->limit >= 2 to make sure that the limit never
becomes 0, which would violate one of our data structure invariants.
One side remark: before we decrement size, we should delete the el-
ement from the array by writing L->A[L->size] = "". In C0, we do not
have any explicit memory management. Storage will be reclaimed and
used for future allocation when the garbage collector can see that the data are
no longer reachable.
Exercises
Exercise 1 In the amortized cost analysis for uba_add, we have concluded
Exercise 2 When removing elements from the unbounded array we resize if the
limit grossly exceeds its size. Namely when L->size <= L->limit/4. Your first
instinct might have been to already shrink the array when L->size <= L->limit/2.
We have argued by example why that does not give us constant amortized cost
O(n) for a sequence of n operations. We have also sketched an argument why
L->size <= L->limit/4 gives the right amortized cost. At which step in that
argument would you notice that L->size <= L->limit/2 is the wrong choice?
Lecture 13
February 28, 2013
1 Introduction
In this lecture we re-introduce the dictionaries that were implemented as a
part of Clac and generalize them as so-called associative arrays. Associative
arrays are data structures that are similar to arrays but are not indexed by
integers, but other forms of data such as strings. One popular data struc-
ture for the implementation of associative arrays is the hash table. To analyze
the asymptotic efficiency of hash tables we have to explore a new point of
view, that of average case complexity. Another computational thinking con-
cept that we revisit is randomness. In order for hash tables to work effi-
ciently in practice we need hash functions whose behavior is predictable
(deterministic) but has some aspects of randomness.
Relating to our learning goals, we have
2 Associative Arrays
Arrays can be seen as a mapping, associating with every integer in a given
interval some data item. It is finitary, because its domain, and therefore
also its range, is finite. There are many situations when we want to index
elements differently than just by integers. Common examples are strings
(for dictionaries, phone books, menus, database records), or structs (for
dates, or names together with other identifying information). They are so
common that they are primitive in some languages such as PHP, Python,
or Perl and perhaps account for some of the popularity of these languages.
In many applications, associative arrays are implemented as hash tables
because of their performance characteristics. We will develop them incre-
mentally to understand the motivation underlying their design.
4 Chains
A first idea to explore is to implement the associative array as a linked
list, called a chain. If we have a key k and look for it in the chain, we just
traverse it, compute the intrinsic key for each data entry, and compare it
with k. If they are equal, we have found our entry; if not, we continue the
search. If we reach the end of the chain and do not find an entry with key k,
then no entry with the given key exists. If we keep the chain unsorted this
gives us O(n) worst case complexity for finding a key in a chain of length
n, assuming that computing and comparing keys is constant time.
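The following sketch makes this concrete; it is not code from the lecture, and it assumes the list node type and the client-side functions elem_key and key_equal that we introduce for hash tables in the next lecture.

elem chain_lookup(list* p, key k) {
  while (p != NULL) {
    //@assert p->data != NULL;
    if (key_equal(elem_key(p->data), k))
      return p->data;   /* found an entry with key k */
    p = p->next;        /* otherwise keep searching down the chain */
  }
  return NULL;          /* no entry with key k in this chain */
}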
Given what we have seen so far in our search data structures, this seems
very poor behavior, but if we know our data collections will always be
small, it may in fact be reasonable on occasion.
Can we do better? One idea goes back to binary search. If keys are or-
dered we may be able to arrange the elements in an array or in the form of
a tree and then cut the search space roughly in half every time we make a
comparison. We will begin thinking about this approach just before Spring
Break, and it will occupy us for a few lectures after the break as well. De-
signing such data structures is a rich and interesting subject, but the best
we can hope for with this approach is O(log(n)), where n is the number of
entries. We have seen that this function grows very slowly, so this is quite
a practical approach.
Nevertheless, the question arises whether we can do better than O(log(n)),
say, constant time O(1), to find an entry with a given key. We know that
it can be done for arrays, indexed by integers, which allow constant-time
access. Can we also do it, for example, for strings?
5 Hashing
The first idea behind hash tables is to exploit the efficiency of arrays. So:
to map a key to an entry, we first map a key to an integer and then use the
integer to index an array A. The first map is called a hash function. We write
it as hash( ). Given a key k, our access could then simply be A[hash(k)].
There is an immediate problem with this approach: there are 2^31 positive
integers, so we would need a huge array, negating any possible performance
advantages. But even if we were willing to allocate such a huge
array, there are many more strings than int’s so there cannot be any hash
function that always gives us different int’s for different strings.
The solution is to allocate an array of smaller size, say m, and then look
up the result of the hash function modulo m, for example, A[hash(k)%m].
This creates a new problem: it is inevitable that multiple strings will map
to the same array index. For example, if the array has size m then if we
have more than m elements, at least two must map to the same index. In
practice, this will happen much sooner than that.
If the hash function maps two keys to the same integer value (modulo m),
we say we have a collision. In general, we would like to avoid collisions,
but we must be prepared to handle them when they occur.
6 Separate Chaining
How do we deal with collisions of hash values? The simplest is a technique
called separate chaining. Assume we have hash(k1 )%m = i = hash(k2 )%m,
where k1 and k2 are the distinct keys for two data entries e1 and e2 we want
to store in the table. In this case we just arrange e1 and e2 into a chain
(implemented as a linked list) and store this list in A[i].
In general, each element A[i] in the array will either be NULL or a chain of
entries. All of these must have the same hash value for their key (modulo
m), namely i. As an exercise, you might consider other data structures
here instead of chains and weigh their merits: how about sorted lists? Or
queues? Or doubly-linked lists? Or another hash table?
We stick with chains because they are simple and fast, provided the
chains don’t become too long. This technique is called separate chaining
because the chains are stored separately, not directly in the array. Another
technique, which we do not discuss, is linear probing, where we continue by
searching (linearly) for an unused spot in the array itself, starting from the place
where the hash function put us.
Under separate chaining, a snapshot of a hash table might look some-
thing like this picture.
[Figure omitted: an array of m slots, indexed 0 to m-1, where some slots hold chains of entries and others are NULL.]
To find an entry with key k, we then proceed as follows:
1. Compute the hash value hash(k).
2. Compute the index i = hash(k) % m.
3. Search the chain starting at A[i] for an element whose key matches k.
We will analyze this next.
The complexity of the last step depends on the length of the chain. In the
worst case it could be O(n), because all n elements could be stored in one
chain. This worst case could arise if we allocated a very small array (say,
m = 1), or because the hash function maps all input strings to the same
table index i, or just out of sheer bad luck.
Ideally, all the chains would be approximately the same length, namely
n/m. Then for a fixed load factor such as n/m = α = 2 we would take on
the average 2 steps to go down the chain and find k. In general, as long
as we don’t let the load factor become too large, the average time should be
O(1).
If the load factor does become too large, we could dynamically adapt the
size of the hash table, as in an unbounded array. As for unbounded arrays, it is beneficial
to double the size of the hash table when the load factor becomes too high,
or possibly halve it if the size becomes too small. Analyzing these factors
is a task for amortized analysis, just as for unbounded arrays.
8 Randomness
The average case analysis relies on the fact that the hash values of the key
are relatively evenly distributed. This can be restated as saying that the
probability that each key maps to an array index i should be about the
same, namely 1/m. In order to avoid systematically creating collisions,
small changes in the input string should result in unpredictable changes in
the output hash value that is uniformly distributed over the range of C0 in-
tegers. We can achieve this with a pseudorandom number generator (PRNG).
A simple pseudorandom number generator produces, from a starting seed, a
sequence of values such as
0 → 7 → (−6) → (−7) → 4 → (−5) → (−2) → 3 → (−8) → (−1) → 1 → (−4) → 3 → 6 → 5 → 0 → ...
This kind of generator is fine for random testing or (indeed) the basis for
a hashing function, but the results are too predictable to use it for cryp-
tographic purposes such as encrypting a message. In particular, a linear
congruential generator will sometimes have repeating patterns in the lower
bits. If one wants numbers from a small range it is better to use the higher
bits of the generated results rather than just applying the modulus opera-
tion.
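Concretely, a linear congruential generator can be as small as the following sketch. The struct and function names here are illustrative, not part of any library; the constants are the ones that reappear in the string hashing example of the next lecture, and the arithmetic simply wraps around on overflow, as C0 guarantees.

struct lcg {
  int state;
};
typedef struct lcg* lcg_t;

lcg_t lcg_new(int seed) {
  lcg_t g = alloc(struct lcg);
  g->state = seed;
  return g;
}

int lcg_next(lcg_t g) {
  /* one step of the linear congruential recurrence */
  g->state = 1664525 * g->state + 1013904223;
  return g->state;
}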
It is important to realize that these numbers just look random; they aren't
really random. In particular, we can reproduce the exact same sequence if
we give it the exact same seed. This property is important both for testing
purposes and for hashing. If we discover a bug during testing with
pseudorandom inputs, we can reproduce it by rerunning the test with the
same seed.
Exercises
Exercise 1 What happens when you replace the data structure for separate chain-
ing by something other than a linked list? Discuss the changes and identify ben-
efits and disadvantages when using a sorted list, a queue, a doubly-linked list, or
another hash table for separate chaining.
Exercise 2 Consider the situation of writing a hash function for strings of length
two, that only use the characters ’A’ to ’Z’. There are 676 different such strings.
You were hoping to get away with implementing a hash table without collisions,
since you are only using 79 out of those 676 two-letter words. But you still see
collisions most of the time. Explain this phenomenon with the birthday problem.
Lecture 14
October 16, 2012
1 Introduction
The notion of an interface to an implementation of an abstract data type or li-
brary is an extremely important concept in computer science. The interface
defines not only the types, but also the available operations on them and the
pre- and postconditions for these operations. For general data structures it
is also important to note the asymptotic complexity of the operations so
that potential clients can decide if the data structure serves their purpose.
For the purposes of this lecture we call the data structures and the op-
erations on them provided by an implementation the library and code that
uses the library the client.
What makes interfaces often complex is that in order for the library to
provide its services it may in turn require some operations provided by the
client. Hash tables provide an excellent example for this complexity, so we
will discuss the interface to hash tables in detail before giving the hash
table implementation. Binary search trees, discussed in Lecture 15, provide
another excellent example.
Relating to our learning goals, we have
Programming: We revisit the char data type and use it to consider string
hashing.
where we have left it open for now (indicated by ___) how the type ht of
hash tables will eventually be defined. That is really the only type pro-
vided by the implementation. In addition, it is supposed to provide three
functions:
The function ht_new(int capacity) takes the initial capacity of the hash
table as an argument (which must be strictly positive) and returns a new
hash table without any elements.
The function ht_search(ht H, key k) searches for an element with
key k in the hash table H. If such an element exists, it is returned. If it does
not exist, we return NULL instead.
From these decisions we can see that the client must provide the type of
keys and the type of elements. Only the client can know what these might
be in any particular use of the library. In addition, we observe that NULL
must be a value of type elem.
The function ht_insert(ht H, elem e) inserts an element e into the
hash table H, which is changed as an effect of this operation. But NULL
cannot be a valid element to insert, because otherwise we would not know
whether the return value NULL for ht_search means that an element is
present or not. We therefore require e not to be null.
To summarize, these are the types that we have discovered must come from the
client:
/* client-side types */
typedef ___* elem;
typedef ___ key;
We have noted the fact that elem must be a pointer by already filling in the
* in its definition. Keys, in contrast, can be arbitrary.
Does the client also need to provide any functions? Yes! Any function
the hash table implementation requires which must understand the imple-
mentations of the type elem and key must be provided by the client, since
the library is supposed to be generic.
It turns out there are three. First, and most obviously, we need a hash
function which maps keys to integers. We also provide the hash function
with a modulus, which will be the size of the array in the hash table
implementation.
/* client-side functions */
int hash(key k, int m)
//@requires m > 0;
//@ensures 0 <= \result && \result < m;
;
The result must be in the range specified by m. For the hash table im-
plementation to achieve its advertised (average-case) asymptotic complex-
ity, the hash function should have the property that its results are evenly
distributed between 0 and m. Interestingly, it will work correctly (albeit
slowly), as long as hash satisfies its contract even, for example, if it maps
every key to 0.
Now recall how lookup in a hash table works. We map the key to an
integer and retrieve the chain of elements stored in this slot in the array.
Then we walk down the chain and compare keys of the stored elements
with the lookup key. This requires the client to provide two additional
operations: one to compare keys, and one to extract a key from an element.
/* client-side functions */
bool key_equal(key k1, key k2);
key elem_key(elem e)
//@requires e != NULL;
;
/*************************/
/* client-side interface */
/*************************/
typedef ___* elem;
typedef ___ key;
key elem_key(elem e)
//@requires e != NULL;
;
/**************************/
/* library side interface */
/**************************/
typedef struct ht_header* ht;
ht ht_new(int capacity)
//@requires capacity > 0;
;
elem ht_search(ht H, key k); /* O(1) avg. */
void ht_insert(ht H, elem e) /* O(1) avg. */
//@requires e != NULL;
;
int ht_size(ht H); /* O(1) */
void ht_stats(ht h);
The function ht_size reports the total number of elements in the table
(remember that the load factor is the size n divided by the capacity m). The
function ht_stats has no effect, but prints out a histogram reporting how
many chains in the hash table are empty, how many have length 1, how
many have length 2, and so on. For a hash table to have good performance,
chains should generally be short.
3 A Tiny Client
One sample application is to count word occurrences – say, in a corpus of
Twitter data or in the collected works of Shakespeare. In this application,
the keys are the words, represented as strings. Data elements are pairs of
words and word counts, the latter represented as integers.
/******************************/
/* client-side implementation */
/******************************/
struct wcount {
string word;
int count;
};
int hash_string(string s) {
int len = string_length(s);
int h = 0;
for (int i = 0; i < len; i++)
//@loop_invariant 0 <= i;
{
int ch = char_ord(string_charat(s, i));
// Do something to combine h and ch
}
return h;
}
Now, if we don't add anything to replace the comment, the function above
will still allow the hash table to work correctly; it will just be very slow,
because the hash value of every string will be zero.
A slightly better idea is to combine h and ch with addition or multiplication:
for (int i = 0; i < len; i++)
//@loop_invariant 0 <= i;
{
int ch = char_ord(string_charat(s, i));
h = h + ch;
}
This is still pretty bad, however. We can see how bad by entering
the n = 45,600 news vocabulary words from Homework 2 into a table with
m = 22,800 chains (load factor 2) and running ht_stats:
Hash table distribution: how many chains have size...
...0: 21217
...1: 239
...2: 132
...3: 78
...4: 73
...5: 55
...6: 60
...7: 46
...8: 42
...9: 23
...10+: 835
Longest chain: 176
Most of the chains are empty, and many of the chains are very, very long.
One problem is that most strings are likely to have very small hash values
when we use this hash function. An even bigger problem is that rearrang-
ing the letters in a string will always produce another string with the same
hash value – so we know that "cab" and "abc" will always collide in a
hash table. Hash collisions are inevitable, but when we can easily predict
that two strings have the same hash value, we should be suspicious that
something is wrong.
To address this, we can mix the accumulated hash value in some way before
we combine it with each new character, so that the order of the characters
matters. Some versions of Java use exactly this idea in their default string
hashing function, multiplying the running hash value by the constant 31
before adding each character.
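The combination step would then look like the sketch below, in the style of the loops above; the constant 31 is the multiplier used by Java's String.hashCode, and the rest is the hash_string skeleton from before.

for (int i = 0; i < len; i++)
//@loop_invariant 0 <= i;
{
  int ch = char_ord(string_charat(s, i));
  h = 31 * h;   /* shift the contribution of earlier characters */
  h = h + ch;   /* then mix in the next character */
}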
This does much better when we add all the news vocabulary strings into
the hash table. We can push the idea further by multiplying with a fresh
pseudorandom number at every step rather than with a fixed constant:
rand_t r = init_rand(0x1337BEEF);
for (int i = 0; i < len; i++)
//@loop_invariant 0 <= i;
{
int ch = char_ord(string_charat(s, i));
h = rand(r) * h;
h = h + ch;
}
If we look at the performance of this function, it is comparable to the Java
hash function, though it is not actually quite as good – more of the chains
are empty, and more are longer.
Hash table distribution: how many chains have size...
...0: 3796
...1: 6214
...2: 5424
...3: 3589
...4: 2101
...5: 1006
...6: 455
...7: 145
...8: 48
...9: 15
...10+: 7
Longest chain: 11
Many other variants are possible; for instance, we could directly apply
the linear congruential generator to the hash value at every step:
for (int i = 0; i < len; i++)
//@loop_invariant 0 <= i;
{
int ch = char_ord(string_charat(s, i));
h = 1664525 * h + 1013904223;
h = h + ch;
}
The key goals are that we want a hash function that is very quick to com-
pute and that nevertheless achieves good distribution across our hash ta-
ble. Handwritten hash functions often do not work well, which can signifi-
cantly affect the performance of the hash table. Whenever possible, the use
of randomness can help to avoid any systematic bias.
/*******************************/
/* library-side implementation */
/*******************************/
struct list_node {
elem data; /* data != NULL */
struct list_node* next;
};
typedef struct list_node list;
struct ht_header {
int size; /* size >= 0 */
int capacity; /* capacity > 0 */
list*[] table; /* \length(table) == capacity */
};
bool is_ht(ht H) {
if (H == NULL) return false;
if (!(H->size >= 0)) return false;
if (!(H->capacity > 0)) return false;
//@assert \length(H->table) == H->capacity;
/* check that each element of table is a valid chain */
/* includes checking that all elements are non-null */
return true;
}
Recall that the test on the length of the array must be inside an annotation,
because the \length function is not available when the code is compiled
without dynamic checking enabled.
Allocating a hash table is straightforward.
ht ht_new(int capacity)
//@requires capacity > 0;
//@ensures is_ht(\result);
{
ht H = alloc(struct ht_header);
H->size = 0;
H->capacity = capacity;
H->table = alloc_array(list*, capacity);
/* initialized to NULL */
return H;
}
Inside ht_insert, we first compute the key k = elem_key(e) and the index
i = hash(k, H->capacity), and then traverse the chain stored in H->table[i].
We can extract the key from the element p->data because the data cannot
be null in a valid hash table.
list* p = H->table[i];
while (p != NULL)
// loop invariant: p points to chain
{
  //@assert p->data != NULL;
  if (key_equal(elem_key(p->data), k))
  {
    /* overwrite existing element */
    p->data = e;
    return;
  } else {
    p = p->next;
  }
}
//@assert p == NULL;
/* prepend new element */
list* q = alloc(struct list_node);
q->data = e;
q->next = H->table[i];
H->table[i] = q;
(H->size)++;
return;
}
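For comparison, here is a sketch of ht_search on the same representation; it performs the same traversal but returns the element instead of overwriting it. This is a plausible reconstruction, not necessarily the exact lecture code.

elem ht_search(ht H, key k)
//@requires is_ht(H);
{
  int i = hash(k, H->capacity);   /* the contract of hash guarantees 0 <= i < capacity */
  list* p = H->table[i];
  while (p != NULL)
  {
    //@assert p->data != NULL;
    if (key_equal(elem_key(p->data), k))
      return p->data;             /* found an element with key k */
    p = p->next;
  }
  return NULL;                    /* no element with key k in the table */
}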
Exercises
Exercise 1 Extend the hash table implementation so it dynamically resizes itself
when the load factor exceeds a certain threshold. When doubling the size of the
hash table you will need to explicitly insert every element from the old hash table
into the new one, because the result of hashing depends on the size of the hash table.
Exercise 2 Extend the hash table interface with new functions ht_size that re-
turns the number of elements in a table and ht_tabulate that returns an array
with the elements in the hash table, in some arbitrary order.
Exercise 3 Complete the client-side code to build a hash table containing word
frequencies for the words appearing in Shakespeare’s collected works. You should
build upon the code in Assignment 2.
Exercise 4 Extend the hash table interface with a new function to delete an ele-
ment with a given key from the table. To be extra ambitious, shrink the size of the
hash table once the load factor drops below some minimum, similarly to the way
we could grow and shrink unbounded arrays.
Lecture 15
March 07, 2013
1 Introduction
In this lecture, we will continue considering associative arrays as we did in
the hash table lecture. This time, we will follow a different implementation
principle, however. With binary search trees we try to obtain efficient insert
and search times for associative arrays (dictionaries), which we have previously
implemented as hash tables. We will eventually be able to achieve
O(log(n)) worst-case asymptotic complexity for insert and search. This also
extends to delete, although we won’t discuss that operation in lecture. In
particular, the worst-case complexity of associative array operations im-
plemented as binary search trees is better than the worst-case complexity
when implemented as hash tables.
In a sorted array, as used for binary search, we could find the position for a
new element in O(log(n)) steps, but it would then take O(n) steps to make
room for that new element by shifting all bigger elements over. We would
also need to grow the array as in unbounded arrays to make sure it does not
run out of capacity. In this lecture, we will
follow similar principles, but move to a different data structure to make
insertion a cheap operation as well, not just lookup. In particular, arrays
themselves are not flexible enough for insertion, but the data structure that
we will be devising in this lecture will be.
or as
We use the first case if advancing from mid to the left (where next_mid < mid),
because the element we are looking for is smaller than the element at mid,
so we can discard all elements to the right of mid and have to look on the
left of mid. The second case will be used if advancing from mid to the right
(where next_mid > mid), because the element we are looking for is bigger
than the one at mid, so we can discard all elements to the left of mid. In
Lecture 6, we also saw that both computations might actually overflow in
int arithmetic, so we devised a more clever way of computing the midpoint,
but we will ignore this for simplicity here. In Lecture 6, we also considered
only int as the data type. Now we study data of an arbitrary type elem
provided by the client. In particular, as one step of abstraction, we will now
actually compare elements in terms of their keys.
Unfortunately, inserting into arrays remains an O(n) operation. For
other data structures, insertion is easy. For example, insertion into a doubly
linked list at a given list node is O(1). But if we use a sorted doubly linked
list, the insertion step will be easy, but finding the right position by binary
search is challenging, because we can only advance one step to the left or
right in a doubly linked list. That would throw us back into linear search
through the list to find the element, which gives a lookup complexity of
O(n). How can we combine the advantages of both: fast navigation by sev-
eral elements as in arrays, together with fast insertion as in doubly linked
lists? Before you read on, try to see if you can find an answer yourself.
1. Compare the current node to what we are looking for. Stop if equal.
2. If what we are looking for is smaller, proceed to the left child.
3. If what we are looking for is bigger, proceed to the right child.
What do we need to know about the binary tree to make sure that this prin-
ciple will always lookup elements correctly? What data structure invariant
do we need to maintain for the binary search tree? Do you have a sugges-
tion?
This implies that no key occurs more than once in a tree, and we have to
make sure our insertion function maintains this invariant.
If our binary search tree were perfectly balanced, that is, had the same
number of nodes on the left as on the right for every subtree, then the order-
ing invariant would ensure that search for an element with a given key has
asymptotic complexity O(log(n)), where n is the number of elements in the
tree. Why? When searching for a given key k in the tree, we just compare
k with the key k' of the entry at the root. If they are equal, we have found
the entry. If k < k' we recursively search in the left subtree, and if k' < k
we recursively search in the right subtree. This is just like binary search,
except that instead of an array we traverse a tree data structure. Unlike in
an array, however, we will see that insertion is quick as well.
6 The Interface
The basic interface for binary search trees is almost the same as for hash
tables, because both implement the same abstract principle: associative
arrays. Binary search trees, of course, do not need a hash function. We
assume that the client defines a type elem of elements and a type key of
keys, as well as functions to extract keys from elements and to compare
keys. Then the implementation of binary search trees will provide a type
bst and functions to insert an element and to search for an element with a
given key.
/* Client-side interface */
key elem_key(elem e)
//@requires e != NULL;
;
/* Library interface */
bst bst_new();
void bst_insert(bst B, elem e)
//@requires e != NULL;
;
elem bst_lookup(bst B, key k); /* return NULL if not in tree */
We stipulate that elem is some form of pointer type so we can return NULL
if no element with the given key can be found. Generally, some more oper-
ations may be requested at the interface, such as the number of elements in
the tree or a function to delete an element with a given key.
The key_compare function provided by the client is different from the
equality function we used for hash tables. For binary search trees, we ac-
tually need to compare keys k1 and k2 and determine if k1 < k2 , k1 = k2 ,
or k1 > k2 . A standard approach to this in imperative languages is for a
comparison function to return an integer r, where r < 0 means k1 < k2 ,
r = 0 means k1 = k2 , and r > 0 means k1 > k2 . Our contract captures that
we expect key_compare to return no values other than -1, 0, and 1.
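A sketch of the corresponding declaration (the exact phrasing of the contract in the lecture code may differ slightly):

/* client-side function */
int key_compare(key k1, key k2)
//@ensures -1 <= \result && \result <= 1;
;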
struct tree_node {
elem data;
struct tree_node* left;
struct tree_node* right;
};
typedef struct tree_node tree;
As usual, we have a header which in this case just consists of a pointer to the
root of the tree. We often keep other information associated with the data
structure in these headers, such as the size.
struct bst_header {
tree* root;
};
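To make the recursive search concrete, here is a sketch of lookup on this representation, using the client's key_compare and elem_key functions (Exercise 1 at the end of this lecture asks for an iterative version); the precondition refers to the ordering invariant is_ordtree developed below.

elem tree_lookup(tree* T, key k)
//@requires is_ordtree(T);
{
  if (T == NULL) return NULL;              /* key k is not in the tree */
  int r = key_compare(k, elem_key(T->data));
  if (r == 0) return T->data;              /* found the entry with key k */
  else if (r < 0) return tree_lookup(T->left, k);
  else return tree_lookup(T->right, k);    /* r > 0 */
}

elem bst_lookup(bst B, key k)
//@requires is_bst(B);
{
  return tree_lookup(B->root, k);
}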
9 Inserting an Element
Inserting an element is almost as simple. We just proceed as if we are look-
ing for the key of the given element. If we find a node with that key, we just
overwrite its data field. If not, we insert it in the place where it would have
been, had it been there in the first place. This last clause, however, creates
a small difficulty. When we hit a null pointer (which indicates the key was
not already in the tree), we cannot just modify NULL. Instead, we return the
new tree so that the parent can modify itself.
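A sketch of this strategy follows; it is a plausible reconstruction rather than the exact lecture code. The function returns the (possibly new) root of the modified subtree so that the caller can update its own pointer, which is exactly how we handle the null-pointer case described above.

tree* tree_insert(tree* T, elem e)
//@requires e != NULL;
{
  if (T == NULL) {
    /* the key was not in the tree: create a new leaf for e */
    tree* N = alloc(struct tree_node);
    N->data = e;
    N->left = NULL;
    N->right = NULL;
    return N;
  }
  int r = key_compare(elem_key(e), elem_key(T->data));
  if (r == 0) T->data = e;                      /* overwrite the existing entry */
  else if (r < 0) T->left = tree_insert(T->left, e);
  else T->right = tree_insert(T->right, e);     /* r > 0 */
  return T;
}

void bst_insert(bst B, elem e)
//@requires e != NULL;
{
  B->root = tree_insert(B->root, e);
}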
While this should always be true for a binary search tree, it is far weaker
than the ordering invariant stated at the beginning of lecture. Before read-
ing on, you should check your understanding of that invariant to exhibit a
tree that would satisfy the above, but violate the ordering invariant.
There is actually more than one problem with this. The most glaring
one is that following tree would pass this test:
        7
      /   \
     5     11
    / \
   1   9
Even though, locally, the key of the left node is always smaller and on the
right is always bigger, the node with key 9 is in the wrong place and we
would not find it with our search algorithm since we would look in the
right subtree of the root.
An alternative way of thinking about the invariant is as follows. As-
sume we are at a node with key k.
The general idea then is to traverse the tree recursively, and pass down
an interval with lower and upper bounds for all the keys in the tree. The
following diagram illustrates this idea. We start at the root with an unre-
stricted interval, allowing any key, which is written as (−∞, +∞). As usual
in mathematics we write intervals as (x, z) = {y | x < y and y < z}. At
the leaves we write the interval for the subtree. For example, if there were
a left subtree of the node with key 7, all of its keys would have to be in the
interval (5, 7).

                       9    (−∞, +∞)
                     /   \
        (−∞, 9)    5      (9, +∞)
                 /   \
        (−∞, 5)       7   (5, 9)
                    /   \
               (5, 7)   (7, 9)
bool is_ordtree(tree* T) {
/* initially, we have no bounds - pass in NULL */
return is_ordered(T, NULL, NULL);
}
bool is_bst(bst B) {
return B != NULL && is_ordtree(B->root);
}
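The helper is_ordered is not shown above; here is a sketch under the assumption that the bounds are passed as elements (NULL meaning "no bound"), so that their keys can be compared with the client's key_compare.

bool is_ordered(tree* T, elem lower, elem upper) {
  if (T == NULL) return true;
  if (T->data == NULL) return false;
  key k = elem_key(T->data);
  /* the key must lie strictly between the bounds, if they are present */
  if (lower != NULL && key_compare(elem_key(lower), k) >= 0) return false;
  if (upper != NULL && key_compare(k, elem_key(upper)) >= 0) return false;
  /* the node's own element becomes a bound for its subtrees */
  return is_ordered(T->left, lower, T->data)
      && is_ordered(T->right, T->data, upper);
}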
[Diagrams omitted: on the left, inserting the keys 2, 3, 4 in order produces a right-leaning chain; on the right, inserting 3, 1, 4, 2 produces a tree with root 3, children 1 and 4, and 2 below 1.]
Clearly, the last tree is much more balanced. In the extreme, if we insert
elements with their keys in order, or reverse order, the tree will be linear,
and search time will be O(n) for n items.
These observations mean that it is extremely important to pay attention
to the balance of the tree. We will discuss ways to keep binary search trees
balanced in a later lecture.
Exercises
Exercise 1 Rewrite tree_lookup to be iterative rather than recursive.
Exercise 3 The binary search tree interface only expected a single function for key
comparison to be provided by the client:
An alternative design would have been to, instead, expect the client to provide a
set of key comparison functions, one for each outcome:
Lecture 16
October 18, 2012
1 Introduction
In this lecture we will look at priority queues as an abstract type and dis-
cuss several possible implementations. We then pick heaps and start to
work towards an implementation. Heaps have the
structure of binary trees, a very common structure since a (balanced) bi-
nary tree with n elements has depth O(log(n)). During the presentation of
algorithms on heaps we will also come across the phenomenon that invari-
ants must be temporarily violated and then restored. We will study this in
more depth in the next lecture. From the programming point of view, we
will see a cool way to implement binary trees in arrays which, alas, does
not work very often.
2 Priority Queues
Priority queues are a generalization of stacks and queues. Rather than in-
serting and deleting elements in a fixed order, each element is assigned a
priority represented by an integer. We always remove an element with the
highest priority, which is given by the minimal integer priority assigned.
Priority queues often have a fixed size. For example, in an operating system
the runnable processes might be stored in a priority queue, where certain
system processes are given a higher priority than user processes. Similarly,
in a network router packets may be routed according to some assigned pri-
orities. In both of these examples, bounding the size of the queues helps to
3 Some Implementations
Before we come to heaps, it is worth considering different implementation
choices and consider the complexity of various operations.
The first idea is to use an unordered array of size limit, where we keep
a current index n. Inserting into such an array is a constant-time operation,
since we only have to insert it at n and increment n. However, finding
the minimum will take O(n), since we have to scan the whole portion of
the array that’s in use. Consequently, deleting the minimal element also
takes O(n): first we find the minimal element, then we swap it with the last
element in the array, and decrement n.
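As a sketch, with illustrative names (the real priority queue stores general elements; here we store the integer priorities directly, and we take limit == \length(data) as a data structure invariant):

struct upq {
  int limit;   /* maximal number of elements, == \length(data) */
  int n;       /* current number of elements, 0 <= n <= limit   */
  int[] data;  /* priorities stored in data[0..n)                */
};

void upq_insert(struct upq* Q, int p)
//@requires 0 <= Q->n && Q->n < Q->limit;
{
  Q->data[Q->n] = p;   /* O(1): put the new priority at the end */
  Q->n++;
}

int upq_delmin(struct upq* Q)
//@requires 0 < Q->n && Q->n <= Q->limit;
{
  int m = 0;
  for (int i = 1; i < Q->n; i++)     /* O(n): scan for the minimum */
    if (Q->data[i] < Q->data[m]) m = i;
  int p = Q->data[m];
  Q->data[m] = Q->data[Q->n - 1];    /* move the last element into the hole */
  Q->n--;
  return p;
}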
A second idea is to keep the array sorted. In this case, inserting an el-
ement is O(n). We can quickly (in O(log(n)) steps) find the place i where
it belongs using binary search, but then we need to shift elements to make
room for the insertion. This takes O(n) copy operations. Finding the mini-
mum is O(1) (since it is stored at index 0 in the array). We can also make
deleting it O(1) if we keep the array sorted in descending order, or if we
keep two array indices: one for the smallest current element and one for
the largest.
To anticipate our analysis, heaps will have logarithmic time for insert
and deleting the minimal element.
Heap ordering invariant, alternative (1) : The key of each node in the tree
is less or equal to all of its childrens’ keys.
Heap ordering invariant, alternative (2) : The key of each node in the tree
except for the root is greater or equal to its parent’s key.
the invariant.
[Diagram: the heap with root 2, children 4 and 3, and leaves 9, 7, 8, shown before and after inserting the new element 1 as the right child of 3; the new edge from 3 to 1 is dashed.]
The dashed line indicates where the ordering invariant might be violated.
And, indeed, 3 > 1.
We can fix the invariant at the dashed edge by swapping the two nodes.
The result is shown on the right.
[Diagram: on the left, the heap before the swap (3 is the parent of 1); on the right, after the swap: root 2 with children 4 and 1, and 1's children are 8 and 3.]
The link from the node with key 1 to the node with key 8 will always satisfy
the invariant, because we have replaced the previous key 3 with a smaller
key (1). But the invariant might now be violated going up the tree to the
root. And, indeed 2 > 1.
We repeat the operation, swapping 1 with 2.
[Diagram: on the left, root 2 with children 4 and 1; on the right, after swapping 1 with 2: root 1 with children 4 and 2, and 2's children are 8 and 3.]
As before, the link between the root and its left child continues to satisfy
the invariant because we have replaced the key at the root with a smaller
one. Furthermore, since the root node has no parent, we have fully restored
the ordering invariant.
In general, we swap a node with its parent if the parent has a strictly
greater key. If not, or if we reach the root, we have restored the ordering
invariant. The shape invariant was always satisfied since we inserted the
new node into the next open place in the tree.
The operation that restores the ordering invariant is called sifting up,
since we take the new node and move it up the heap until the invariant has
been reestablished. The complexity of this operation is O(l), where l is the
number of levels in the tree. For a tree of n ≥ 1 nodes there are log(n) + 1
levels. So the complexity of inserting a new node is O(log(n)), as promised.
[Diagram: deleting the minimal element 2 from the heap with root 2, children 3 and 4, and leaves 9, 7, 8: the last element 8 is moved to the root, leaving root 8 with children 3 and 4 and leaves 9, 7.]
However, the node that is now at the root might have a strictly greater key
than one or both of its children, which would violate the ordering invariant.
If the ordering invariant is indeed violated, we swap the node with the
smaller of its two children.
[Diagram: swapping the root 8 with its smaller child 3: the result has root 3 with children 8 and 4 and leaves 9, 7.]
This will reestablish the invariant at the root. In general we see this as
follows. Assume that before the swap the invariant is violated, and the left
child has a smaller key than the right one. It must also be smaller than
the root, otherwise the ordering invariant would hold. Therefore, after we
swap the root with its left child, the root will be smaller than its left child. It
will also be smaller than its right child, because the left was smaller than the
right before the swap. When the right is smaller than the left, the argument
is symmetric.
Unfortunately, we may not be done, because the invariant might now
be violated at the place where the old root ended up. If not, we stop. If yes,
we compare the children as before and swap with the smaller one.
[Diagram: the node 8 is swapped with its smaller child 7: the result has root 3 with children 7 and 4, and 7's children are 9 and 8.]
We stop this downward movement of the new node if either the order-
ing invariant is satisfied, or we reach a leaf. In both cases we have fully
restored the ordering invariant. This process of restoring the invariant is
called sifting down, since we move a node down the tree. As in the case for
insert, the number of operations is bounded by the number of levels in the
tree, which is O(log(n)) as promised.
Exercises
Exercise 1 During the lecture, students suggested to work with a sorted linked
list instead of a sorted array to implement priority queues. What is the complex-
ity of the priority queue operations on this representation? What are the advan-
tages/disadvantages compared to an ordered array?
Lecture 17
October 23, 2012
1 Introduction
In this lecture we will implement operations on heaps. The theme of this
lecture is reasoning with invariants that are partially violated, and making
sure they are restored before the completion of an operation. We will only
briefly review the algorithms for inserting and deleting the minimal node
of the heap; you should read the notes for Lecture 16 on priority queues
and keep them close at hand.
Temporarily violating and restoring invariants is a common theme in
algorithms. It is a technique you need to master.
The test in the loop is not quite right, but let's just verify that it is at least
safe.
The middle line is a little stronger than we need for safety, but it is
important that we never access an element that is meaningless, like the one
stored at index 0, and the ones stored at H->next and beyond. Then the
final version of our is_heap function is:
bool is_heap(struct heap_header* H) {
  if (!(H != NULL)) return false;
  //@assert \length(H->data) == H->limit;
  if (!(1 <= H->next && H->next <= H->limit)) return false;
  for (int i = 2; i < H->next; i++)
    //@loop_invariant 2 <= i;
    if (!(priority(H, i/2) <= priority(H, i))) return false;
  return true;
}
4 Creating Heaps
We start with the simple code to test if a heap is empty or full, and to allo-
cate a new (empty) heap. A heap is empty if the next element to be inserted
would be at index 1. A heap is full if the next element to be inserted would
be at index limit (the size of the array).
bool pq_empty(heap H)
//@requires is_heap(H);
{
return H->next == 1;
}
bool pq_full(heap H)
//@requires is_heap(H);
{
return H->next == H->limit;
}
To create a new heap, we allocate a struct and an array and set all the
right initial values.
int i = H->next - 1;
while (i > 1 && priority(H,i) < priority(H,i/2))
{
swap(H->data, i, i/2);
i = i/2;
}
Here, swap is the standard function, swapping two elements of the array.
Setting i = i/2 is moving up in the array, to the place we just swapped the
new element to.
At this point, as always, we should ask why accesses to the elements
of the priority queue are safe. By short-circuiting of conjunction, we know
that i > 1 when we ask priority(H, i) < priority(H, i/2). But we need a
loop invariant to make sure that it respects the upper bound. The index i
starts at H->next - 1, so it should always be strictly less than H->next.
int i = H->next - 1;
while (i > 1 && priority(H,i) < priority(H,i/2))
//@loop_invariant 1 <= i && i < H->next;
{
swap(H->data, i, i/2);
i = i/2;
}
One small point regarding the loop invariant: we just incremented H->next,
so it must be strictly greater than 1, and therefore the invariant 1 <= i must
be satisfied initially.
But how do we know that swapping the element up the tree restores the
ordering invariant? We need an additional loop invariant which states that
H is a valid heap except at index i. Index i may be smaller than its parent,
but it still needs to be less or equal to its children. We therefore postulate a
function is_heap_except_up and use it as a loop invariant.
int i = H->next - 1;
while (i > 1 && priority(H,i) < priority(H,i/2))
//@loop_invariant 1 <= i && i < H->next;
//@loop_invariant is_heap_except_up(H, i);
{
swap(H->data, i, i/2);
i = i/2;
}
The next step is to write this function. We copy the is_heap function, but
check a node against its parent only when it is different from the distin-
guished element where the exception is allowed.
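A sketch of this function, mirroring is_heap but skipping the parent check exactly at the distinguished index n (the lecture code may differ in details):

bool is_heap_except_up(struct heap_header* H, int n) {
  if (!(H != NULL)) return false;
  //@assert \length(H->data) == H->limit;
  if (!(1 <= H->next && H->next <= H->limit)) return false;
  for (int i = 2; i < H->next; i++)
    //@loop_invariant 2 <= i;
    /* n is allowed to be smaller than its parent */
    if (!(i == n || priority(H, i/2) <= priority(H, i))) return false;
  return true;
}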
Now we try to prove that this is indeed a loop invariant, and there-
fore our function is correct. Rather than using a lot of text we verify these
properties on general diagrams. Other versions of this diagram are entirely
symmetric. On the left is the relevant part of the heap before the swap and
on the right is the relevant part of the heap after the swap. The relevant
nodes in the tree are labeled with their priority. Nodes that may be above
a or below c, c1 , c2 and to the right of a are not shown. These do not enter
into the invariant discussion, since their relations between each other and
the shown nodes remain fixed. Also, if x is in the last row the constraints
regarding c1 and c2 are vacuous.
[Diagram: the relevant fragment of the heap before the swap (grandparent a, its child b, b's children x and c, and x's children c1 and c2) and after swapping x with b (a's child is now x, x's children are b and c, and b's children are c1 and c2).]
We know the following properties on the left from which the properties
shown on the right follow as shown:
So we see that simply stipulating the (temporary) invariant that every node
is greater or equal to its parent except for the one labeled x is not strong
enough. It is not necessarily preserved by a swap.
But we can strengthen it a bit. You might want to think about how
before you move on to the next page.
The strengthened invariant also requires that the children of the po-
tentially violating node x are greater or equal to their grandparent! Let’s
reconsider the diagrams.
[Diagram: the same before/after fragment as above, now annotated with the strengthened invariant: x's children c1 and c2 are also greater or equal to their grandparent b.]
We have more assumptions on the left now ((6) and (7)), but we also have
two additional proof obligations on the right (a ≤ c and a ≤ b).
Before the swap                      After the swap
a ≤ b   (1)  order                   a ? x    allowed exception
b ≤ c   (2)  order                   a ≤ c    from (1) and (2)
x ≤ c1  (3)  order                   a ≤ b    from (1)
x ≤ c2  (4)  order                   x ≤ c    from (5) and (2)
x < b   (5)  since we swap           x ≤ b    from (5)
b ≤ c1  (6)  grandparent             b ≤ c1   from (6)
b ≤ c2  (7)  grandparent             b ≤ c2   from (7)
Note that the strengthened loop invariant (or, rather, the strengthened
definition of what it means to be a heap except in one place) is not necessary
to show that the postcondition of pq_insert (i.e. is_heap(H)) is implied.
Postcondition: If the loop exits, we know the loop invariants and the negated
loop guard:
Next we need to restore the heap invariant by sifting down from the root,
with sift_down(H, 1). We only do this if there is at least one element left
in the heap.
But what is the precondition for the sifting down operation? Again, we
cannot express this using the functions we have already written. Instead,
we need a function is_heap_except_down(H, n) which verifies that the
heap invariant is satisfied in H, except possibly at n. This time, though,
it is between n and its children where things may go wrong, rather than
between n and its parent as in is_heap_except_up(H, n). In the pictures
below this would be at n = 1 on the left and n = 2 on the right.
[Diagram: on the left, the heap with root 8 violating the invariant at n = 1; on the right, after swapping the root with its smaller child 3, the possible violation has moved down to n = 2.]
return true;
}
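The code fragment just above is presumably the tail of is_heap_except_down; a plausible reconstruction of the whole function, again modeled on is_heap but this time skipping the checks from n down to its children, is:

bool is_heap_except_down(struct heap_header* H, int n) {
  if (!(H != NULL)) return false;
  //@assert \length(H->data) == H->limit;
  if (!(1 <= H->next && H->next <= H->limit)) return false;
  for (int i = 2; i < H->next; i++)
    //@loop_invariant 2 <= i;
    /* skip the check when the parent is the distinguished index n */
    if (!(i/2 == n || priority(H, i/2) <= priority(H, i))) return false;
  return true;
}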
With this we can have the right invariant to write our sift_down func-
tion. The tricky part of this function is the nature of the loop. Our loop
index i starts at n (which actually will always be 1 when this function is
called). We have reached a leaf if 2*i >= H->next, because if there is no left
child, there cannot be a right one, either. So the outline of our function
shapes up as follows:
We also have written down three loop invariants: the bounds for i, the
heap invariant (everywhere, except possibly at i, looking down), and the
invariant defining local variables left and right, standing for the left and
right children of i.
We want to return from the function if we have restored the invariant,
that is, priority(H, i) <= priority(H, 2*i) and priority(H, i) <= priority(H, 2*i + 1).
However, the latter reference might be out of bounds, namely if
we found a node that has a left child but not a right child. So we have to
guard this access by a bounds check. Clearly, when there is no right child,
checking the left one is sufficient.
return;
...
}
If this test fails, we have to determine the smaller of the two children. If
there is no right child, we pick the left one, of course. Once we have found
the smaller one we swap the current one with the smaller one, and then
make the child the new current node i.
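Putting these pieces together, sift_down might look like the following sketch (a reconstruction consistent with the description above, not necessarily the exact lecture code):

void sift_down(heap H, int n)
//@requires 1 <= n && n < H->next;
//@requires is_heap_except_down(H, n);
//@ensures is_heap(H);
{
  int i = n;
  while (2*i < H->next)
  //@loop_invariant 1 <= i && i < H->next;
  //@loop_invariant is_heap_except_down(H, i);
  {
    int left = 2*i;
    int right = left + 1;
    /* if i is no greater than its existing children, the invariant is restored */
    if (priority(H, i) <= priority(H, left)
        && (right >= H->next || priority(H, i) <= priority(H, right)))
      return;
    /* otherwise swap with the smaller child and continue from there */
    if (right >= H->next || priority(H, left) <= priority(H, right)) {
      swap(H->data, i, left);
      i = left;
    } else {
      swap(H->data, i, right);
      i = right;
    }
  }
}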
7 Heapsort
We rarely discuss testing in these notes, but it is useful to consider how to
write decent test cases. Mostly, we have been doing random testing, which
has some drawbacks but is often a tolerable first cut at giving the code a
workout. It is much more effective in languages that are type safe such as
C0, and even more effective when we dynamically check invariants along
the way.
In the example of heaps, one nice way to test the implementation is to
insert a random sequence of numbers, then repeatedly remove the minimal
element until the heap is empty. If we store the elements in an array in the
order we take them out of the heap, the array should be sorted when the
heap is empty! This is the idea behind heapsort. We first show the code,
using the random number generator we have used for several lectures now,
then analyze the complexity.
int main() {
int n = (1<<9)-1; // 1<<9 for -d; 1<<13 for timing
int num_tests = 10; // 10 for -d; 100 for timing
int seed = 0xc0c0ffee;
rand_t gen = init_rand(seed);
int[] A = alloc_array(int, n);
heap H = pq_new(n);
}
print("Passed all tests!\n");
return 0;
}
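Based on the description above, the omitted test loop inside main would look roughly like the following sketch. The names pq_insert and pq_delmin for inserting and removing the minimum are assumptions here, as is the choice to store plain ints in the heap.

for (int j = 0; j < num_tests; j++) {
  /* insert n pseudorandom numbers */
  for (int i = 0; i < n; i++)
    pq_insert(H, rand(gen));
  /* remove the minimum repeatedly; the results must come out sorted */
  for (int i = 0; i < n; i++)
    A[i] = pq_delmin(H);
  assert(pq_empty(H));
  for (int i = 0; i < n-1; i++)
    assert(A[i] <= A[i+1]);
}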
Now for the complexity analysis. Inserting n elements into the heap is
bounded by O(n * log(n)), since each of the n inserts is bounded by log(n).
Then the n element deletions are also bounded by O(n * log(n)), since each
of the n deletions is bounded by log(n). So all together we get O(2 * n *
log(n)) = O(n * log(n)). Heapsort is asymptotically as good as mergesort
(Lecture 7) or as good as the expected complexity of quicksort with random
pivots (Lecture 8).
The sketched algorithm uses O(n) auxiliary space, namely the heap.
One can use the same basic idea to do heapsort in place, using the unused
portion of the heap array to accumulate the sorted array.
Testing, including random testing, has many problems. In our context,
one of them is that it does not test the strength of the invariants. For ex-
ample, say we write no invariants whatsoever (the weakest possible form),
then compiling with or without dynamic checking will always yield the
same test results. We really should be testing the invariants themselves by
giving examples where they are not satisfied. However, we should not be
able to construct such instances of the data structure on the client side of the
interface. Furthermore, within the language we have no way to “capture”
an exception such as a failed assertion and continue computation.
8 Summary
We briefly summarize key points of how to deal with invariants that must
be temporarily violated and then restored.
1. Make sure you have a clear high-level understanding of why invari-
ants must be temporarily violated, and how they are restored.
2. Ensure that at the interface to the abstract type, only instances of the
data structure that satisfy the full invariants are being passed. Other-
wise, you should rethink all the invariants.
3. Write predicates that test whether the partial invariants hold for a
data structure. Usually, these will occur in the preconditions and
loop invariants for the functions that restore the invariants. This will
force you to be completely precise about the intermediate states of the
data structure, which should help you a lot in writing correct code for
restoring the full invariants.
Exercises
Exercise 1 Write a recursive version of is_heap.
Exercise 4 Give a diagrammatical proof for the invariant property of sifting down
for delete (called is_heap_except_down), along the lines of the one we gave for
sifting up for insert.
Exercise 5 Say we want to extend priority queues so that when inserting a new
element and the queue is full, we silently delete the element with the lowest priority
(= maximal key value) before adding the new element. Describe an algorithm,
analyze its asymptotic complexity, and provide its implementation.
Exercise 6 Using the invariants described in this lecture, write a function heapsort
which sorts a given array in place by first constructing a heap, element by element,
within the same array and then deconstructing the heap, element by element.
[Hint: It may be easier to sort the array in descending order and reverse it in a last
pass, or to use so-called max-heaps where the maximal element is at the top.]
Lecture 18b
March 26, 2013
1 Introduction
In this lecture we will start the transition from C0 to C. In some ways, the
lecture is therefore about knowledge rather than principles, a return to the
emphasis on programming that we had earlier in the semester. In future
lectures, we will explore some deeper issues in the context of C, but this
lecture is full of cautionary tales.
The main theme of this lecture is the way C manages memory. Unlike
C0 and other modern languages like Java, C#, or ML, C requires programs
to explicitly manage their memory. Allocation is relatively straightforward,
like in C0, requiring only that we correctly calculate the size of allocated
memory. Deallocating (“freeing”) memory, however, is difficult and error-
prone, even for experienced C programmers. Mistakes can either lead to
attempts to access memory that has already been deallocated, in which case
the result is undefined and may be catastrophic, or it can lead the running
program to hold on to memory no longer in use, which may slow it down
and eventually crash it when it runs out of memory. The second category
is a so-called memory leak.
2 A First Look at C
Syntactically, C and C0 are very close. Philosophically, they diverge rather
drastically. Underlying C0 are the principles of memory safety and type
safety. A program is memory safe if it only reads from memory that has
been properly allocated and initialized, and only writes to memory that
has been properly allocated. A program is type safe if all data it manipulates
have their declared types. In C0, all programs are type safe and memory
safe. The compiler guarantees this through a combination of static (that
is, compile-time) and dynamic (that is, run-time) checks. An example of
a static check is the error issued by the compiler when trying to assign an
integer to a variable declared to hold a pointer, such as
int* p = 37;
An example of a dynamic check is an array out-of-bounds error, which
would try to access memory that has not been allocated by the program.
Modern languages such as Java, ML, or Haskell are both type safe and
memory safe.
In contrast, C is neither type safe nor memory safe. This means that the
behavior of many operations in C is undefined. Unfortunately, undefined
behavior in C may yield any result or have any effect, which means that
the outcome of many programs is unpredictable. In many cases, even pro-
grams that are patently absurd will yield a consistent answer on a given
machine with a given compiler, or perhaps even across different machines
and different compilers. No amount of testing will catch the fact that such
programs have bugs, but they may break when, say, the compiler is up-
graded or details of the runtime system changes. Taken together, these
design decisions make it very difficult to write correct programs in C. This
fact is in evidence every day, when we download so-called security critical
updates to operating systems, browsers, and other software. In many cases,
the security critical flaws arise because an attacker can exploit behavior that
is undefined, but predictable across some spectrum of implementations, in
order to cause your machine to execute some arbitrary malicious code. You
will learn in 15-213 Computer Systems exactly how such attacks work.
These difficulties are compounded by the fact that there are other parts
of the C standard that are implementation defined. For example, the size of
values of type int is explicitly not specified by the C standard, but each
implementation must of course provide a size. This makes it very diffi-
cult to write portable programs. Even on one machine, the behavior of a
program might differ from compiler to compiler. We will talk more about
implementation defined behavior in the next lecture.
Despite all these problems, almost 40 years after its inception, C is still
a significant language. For one, it is the origin of the object-oriented lan-
guages C++ and strongly influenced Java and C#. For another, much systems
code such as operating systems, file systems, garbage collectors, or similar
low-level infrastructure is still written in C.
3 Undefined Behavior in C
For today's lecture, there are three important undefined behaviors in C:
Null pointer dereference: dereferencing the null pointer has undefined results.
Out-of-bounds memory access: reading or writing memory outside of what has been allocated has undefined results.
Use after free: accessing memory after it has been freed (or freeing it twice) has undefined results.
For example, the following program
int main() {
int* A = malloc(sizeof(int));
A[0] = 0; /* ok - A[0] is like *A */
A[1] = 1; /* error - not allocated */
A[317] = 29; /* error - not allocated */
A[-1] = 32; /* error - not allocated(!) */
printf("A[-1] = %d\n", A[-1]);
return 0;
}
will not raise any compile time error or even warnings, even under the
strictest settings. Here, the call to malloc allocates enough space for a single
integer in memory. In this class, we are using gcc with the following flags:
gcc -Wall -Wextra -Werror -std=c99 -pedantic ...
which generates all warnings (-Wall and -Wextra), turns all warnings into
errors (-Werror), and applies the C99 standard (-std=c99) pedantically
(-pedantic). The code above executes ok, and in fact prints 32, despite
four blatant errors in the code.
To discover whether such errors may have occurred at runtime, we can
use the valgrind tool.
% valgrind ./a.out
...
==nnnn== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)
which produces useful error messages (elided above) and indeed flags 4
errors in code whose observable behavior was bug-free.
valgrind slows down execution, but if at all feasible you should test all
your C code in this manner to uncover memory problems. For best error
messages, you should pass the -g flag to gcc which preserves some corre-
lation between binary and source code.
You can also guard memory accesses with appropriate assert statements
that abort the program when attempting out-of-bounds accesses.
Conflating pointers and arrays provides a hint on how to convert C0
programs to C. We need to convert t[] which indicates a C0 array with
elements of type t to t* to indicate a pointer instead. In addition, the
alloc and alloc_array calls need to be changed, or defined by appropri-
ate macros (we’ll talk about this more later).
errors down the road. But we were able to describe, in each of the examples
above, what sorts of things were likely to happen.
There's an old joke that whenever you encounter undefined behavior,
your computer could decide to play Happy Birthday or it could catch on fire.
This is less of a joke considering recent events:
• The Stuxnet worm caused centrifuges, such as those used for ura-
nium enrichment in Iran, to malfunction, physically damaging the
devices.2
Not quite playing Happy Birthday and catching on fire, but close enough.
4 Memory Allocation
Two important system-provided functions for allocating memory are malloc
and calloc.
malloc(sizeof(t)) allocates enough memory to hold a value of type
t. In C0, we would have written alloc(t) instead. The difference is that
alloc(t) has type t*, while malloc(sizeof(t)) returns a special type
void*, which we will discuss more in the next lecture. The important thing
to realize is that C will not even check that the pointer we allocated is the
right size, so that while we can write this:
int* p = malloc(sizeof(int));
nothing stops us from also writing this:
int* p = malloc(sizeof(char));
which will generally have undefined results. Also, malloc does not guarantee
that the memory it returns has been initialized, so the following code
is an error:
int* p = malloc(sizeof(char));
printf("%d\n", *p);
1
Scott Wolchok, Eric Wustrow, Dawn Isabel, and J. Alex Halderman. Attacking the Washington, D.C. Internet Voting System. Proceedings of the 16th Conference on Financial Cryptography and Data Security, February 2012.
2
Holger Stark. Stuxnet Virus Opens New Era of Cyber War. Spiegel Online, August 8, 2011.
5 Header Files
To understand how the xalloc library works, and to take our C0 im-
plementation of binary search trees and begin turning it into a C imple-
mentation, we will need to start by explaining the C convention of using
header files to specify interfaces. Header files have the extension .h and
contain type declarations and definitions as well as function prototypes
and macros, but never code. Header files are not listed on the command
line when the compiler is invoked, but included in C source files (with the
.c extension) using the #include preprocessor directive. The typical use is
to #include3 the header file both in the implementation of a data structure
and all of its clients. In this way, we know both match the same interface.
This applies to standard libraries as well as user-written libraries. For
example, the client of the C implementation of BSTs starts with
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
#include "xalloc.h"
#include "contracts.h"
#include "bst.h"
#include "bst-client.h"
3
when we say “include” in the rest of this lecture, we mean #include
The form #include <filename.h> includes file filename.h which must
be one of the system libraries provided by the suite of compilation tools
(gcc, in our case). The second form #include "filename.h" looks for
filename.h in the current source directory, so this is reserved for user
files. The names of the standard libraries and the types and functions they
provide can be found in the standard reference book The C Programming
Language, 2nd edition by Brian Kernighan and Dennis Ritchie or in various
places on the Internet.4
#include <string.h>
#ifndef _BST_CLIENT_H_
#define _BST_CLIENT_H_
#endif
Note that the way we are defining a client interface here is not good C style,
but it will take a few more lectures before we have the tools to do a better
job.
4
for example, http://www.acm.uiuc.edu/webmonkeys/book/c_guide/
The core of this file is exactly the client interface part of our C0 BST spec-
ification. It defines the type elem as a pointer to a struct wcount, whose
implementation remains hidden. There is one new function, elem_free,
which we have not yet discussed. All of our contracts are gone - C does not
have a facility for putting contracts in an interface.
We also see a certain idiom, a so-called header guard, which ensures that the
contents of a header file are processed at most once even if the file is
#included several times:
#ifndef _BST_CLIENT_H_
#define _BST_CLIENT_H_
...
#endif
/*********************/
/* Library interface */
/*********************/
#include "bst-client.h"
#ifndef _BST_H_
#define _BST_H_
bst bst_new();
#endif
6 Macros
Macros are another extension of C that we left out from C0. We use macros
to get some semblance of C0's contracts in C; the corresponding macros are
defined in the header file contracts.h.
Macros are expanded by a preprocessor and the result is fed to the “reg-
ular” C compiler. When we do not want REQUIRES to be checked (which is
the default, just as for @requires), there is a macro definition
#define REQUIRES(COND) ((void)0)
which can be found in the file contracts.h. The right-hand side of this
definition, ((void)0) is the number zero, cast to have type void which
means it cannot be used as an argument to a function or operator; its result
must be ignored. When the code is compiled with
gcc -DDEBUG ...
then it is defined instead as a regular assert:
#define REQUIRES(COND) assert(COND)
In this case, any use of REQUIRES(e) is expanded into assert(e) before the
result is compiled into a runtime test.
The three macros, all of which behave identically, are
REQUIRES(e);
ENSURES(e);
ASSERT(e);
although they are intended for different purposes, mirroring the @requires,
@ensures, and @assert annotations of C0. @loop_invariant is missing,
since there appears to be no good syntax to support loop invariants di-
rectly; we recommend you check them right after the exit test or at the end
of the loop using the ASSERT macro.
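For example, a loop invariant that we would write with @loop_invariant in
C0 can be approximated by an ASSERT at the beginning of the loop body,
right after the exit test. The following is a small sketch; the function itself
is made up purely for illustration:
#include "contracts.h"

int sum_upto(int n) {
  REQUIRES(n >= 0);
  int sum = 0;
  for (int i = 0; i < n; i++) {
    /* in C0 this would be: @loop_invariant 0 <= i && i <= n; */
    ASSERT(0 <= i && i <= n);
    sum += i;
  }
  return sum;
}
When compiled with -DDEBUG, each ASSERT expands into an assert and is
checked on every iteration; otherwise it expands into ((void)0) and costs
nothing.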
Another common use for macros is to define compile-time constants.
In general, it is considered good style to isolate “magic” numbers at the
beginning of a file, for easy reference; for instance, if we were coding our
E0 editor in C, it would make sense to
#define GAP_BUFFER_SIZE 16
to make it easy to change from size 16 gap buffers to some other size. The
C implementation itself uses them as well; for example, limits.h defines
INT_MAX as the maximal (signed) integer, INT_MIN as the minimal
signed integer, and similarly UINT_MAX for unsigned integers.
A third common use of macros is conditional compilation, as in
#ifdef DEBUG
...debugging statements...
#endif
where the symbol DEBUG is usually set on the gcc command line with
-DDEBUG, as shown earlier.
7 Freeing Memory
Unlike C0, C does not automatically manage memory. As a result, pro-
grams have to free the memory they allocate explicitly; otherwise, long-
running programs or memory-intensive programs are likely to run out of
space. For that, the C standard library provides the function free, declared
in stdlib.h as void free(void *p). When managing memory explicitly, we
pursue two goals:
1. All memory that is allocated is eventually freed, so that long-running
programs do not run out of space.
2. After memory has been freed, it is no longer referenced by the pro-
gram in any way.
Freeing memory counts as a final use, so the goals imply that you should
not free memory twice. And, indeed, in C the behavior of freeing mem-
ory that has already been freed is undefined and may be exploited by an
adversary. If these rules are violated, the result of the operations is un-
defined. The valgrind tool will catch dynamically occurring violations of
these rules, but it cannot check statically if your code will respect these
rules when executed.
The golden rule of memory management in C is
You allocate it, you free it!
By inference, if you didn’t allocate it, you are not allowed to free it! But
this rule is tricky in practice, because sometimes we do need to transfer
ownership of allocated memory so that it “belongs” to a data structure.
Binary search trees are one example. When we allocate an element to
the binary search tree, are we still in charge of freeing that element, or
should it be freed when it is removed from the binary search tree? There
are arguments to be made for both of these options. If we want the BST
to “own” the reference, and therefore be in charge of freeing it, we can
write the following functions that free a binary search tree, given a func-
tion elem_free() that frees elements.
void tree_free(tree *T) {
  REQUIRES(is_ordtree(T));
  if (T != NULL) {
    elem_free(T->data);
    tree_free(T->left);
    tree_free(T->right);
    free(T);
  }
  return;
}

void bst_free(bst B) {
  REQUIRES(is_bst(B));
  tree_free(B->root);
  free(B);
  return;
}
Lecture 19
March 28, 2013
1 Introduction
Binary search trees are an excellent data structure to implement associa-
tive arrays, maps, sets, and similar interfaces. The main difficulty, as dis-
cussed in Lecture 15, is that they are efficient only when they are balanced.
Straightforward sequences of insertions can lead to highly unbalanced trees
with poor asymptotic complexity and unacceptable practical efficiency. For
example, if we insert n elements with keys that are in strictly increasing or
decreasing order, the complexity will be O(n^2). On the other hand, if we
can keep the height to O(log(n)), as it is for a perfectly balanced tree, then
the complexity is bounded by O(n * log(n)).
The solution is to dynamically rebalance the search tree during insert
or search operations. We have to be careful not to destroy the ordering
invariant of the tree while we rebalance. Because of the importance of bi-
nary search trees, researchers have developed many different algorithms
for keeping trees in balance, such as AVL trees, red/black trees, splay trees,
or randomized binary search trees. They differ in the invariants they main-
tain (in addition to the ordering invariant), and when and how the rebal-
ancing is done.
In this lecture we use AVL trees, which are a simple and efficient data
structure for maintaining balance, and were also the first such structure to
be proposed. They are named after their inventors, G.M. Adelson-Velskii
and E.M. Landis, who described them in 1962.
To describe AVL trees we need the concept of tree height, which we de-
fine as the maximal length of a path from the root to a leaf. So the empty
tree has height 0, the tree with one node has height 1, a balanced tree with
three nodes has height 2. If we add one more node to this last tree it will
have height 3. Alternatively, we can define it recursively by saying that the
empty tree has height 0, and the height of any node is one greater than the
maximal height of its two children. AVL trees maintain a height invariant
(also sometimes called a balance invariant).
Height Invariant. At any node in the tree, the heights of the left
and right subtrees differ by at most 1.
            10
           /  \              height = 3
          4    16            height inv. satisfied
         / \   / \
        1   7 13  19
If we insert a new element with a key of 14, the insertion algorithm for
binary search trees without rebalancing will put it to the right of 13.
            10
           /  \              height = 4
          4    16            height inv. satisfied
         / \   / \
        1   7 13  19
                \
                 14
Now the tree has height 4, and one path is longer than the others. However,
it is easy to check that at each node, the height of the left and right subtrees
still differ only by one. For example, at the node with key 16, the left subtree
has height 2 and the right subtree has height 1, which still obeys our height
invariant.
Now consider another insertion, this time of an element with key 15.
This is inserted to the right of the node with key 14.
            10
           /  \              height = 5
          4    16            height inv. violated at 13, 16, 10
         / \   / \
        1   7 13  19
                \
                 14
                  \
                   15
All is well at the node labeled 14: the left subtree has height 0 while the
right subtree has height 1. However, at the node labeled 13, the left subtree
has height 0, while the right subtree has height 2, violating our invariant.
Moreover, at the node with key 16, the left subtree has height 3 while the
right subtree has height 1, also a difference of 2 and therefore an invariant
violation.
We therefore have to take steps to rebalance the tree. We can see without
too much trouble that we can restore the height invariant if we move the
node labeled 14 up and push node 13 down and to the right, resulting in
the following tree.
            10
           /  \              height = 4
          4    16            height inv. restored at 14, 16, 10
         / \   / \
        1   7 14  19
              / \
            13   15
        x                                      y
       / \                                    / \
   (α,x)  y        left rotation at x        x   (y,ω)
         / \            ----->              / \
     (x,y)  (y,ω)                       (α,x)  (x,y)

(both trees contain exactly the keys in the interval (α, ω))
From the intervals we can see that the ordering invariants are preserved, as
are the contents of the tree. We can also see that it shifts some nodes from
the right subtree to the left subtree. We would invoke this operation if the
invariants told us that we have to rebalance from right to left.
We implement this with some straightforward code. First, recall the
type of trees from last lecture. We do not repeat the function is_ordtree
that checks if a tree is ordered.
struct tree_node {
elem data;
struct tree_node *left;
struct tree_node *right;
};
typedef struct tree_node tree;
bool is_ordtree(tree *T);
The main point to keep in mind is to use (or save) a component of the
input before writing to it. We apply this idea systematically, writing to a
location immediately after using it on the previous line, as in the sketch of
the left rotation below.
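The rotation code itself is not reproduced in these notes; here is a minimal
sketch of rotate_left following that discipline (the precondition shown is
an assumption):
tree *rotate_left(tree *T) {
  REQUIRES(T != NULL && T->right != NULL);
  tree *root = T->right;      /* y, the new root */
  T->right = root->left;      /* (x,y) becomes the right subtree of x */
  root->left = T;             /* x becomes the left child of y */
  return root;
}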
These rotations work generically. When we apply them to AVL trees specif-
ically later in this lecture, we will also have to recalculate the heights of the
two nodes involved. This involves only looking up the height of their chil-
dren.
The right rotation is exactly the inverse. First in pictures:
            z                                      y
           / \                                    / \
          y   (z,ω)    right rotation at z    (α,y)   z
         / \                ----->                   / \
     (α,y) (y,z)                                 (y,z) (z,ω)

(again both trees contain exactly the keys in the interval (α, ω))
Then in code:
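The code did not survive in these notes; a minimal sketch, symmetric to
rotate_left above (the precondition is again an assumption):
tree *rotate_right(tree *T) {
  REQUIRES(T != NULL && T->left != NULL);
  tree *root = T->left;       /* the new root */
  T->left = root->right;      /* its right subtree moves over to T */
  root->right = T;            /* the old root becomes the right child */
  return root;
}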
5 Inserting an Element
The basic recursive structure of inserting an element is the same as for
searching for an element. We compare the element’s key with the keys
associated with the nodes of the trees, inserting recursively into the left or
right subtree. When we find an element with the exact key we overwrite
the element in that node. If we encounter a null tree, we construct a new
tree with the element to be inserted and no children and then return it. As
we return the new subtrees (with the inserted element) towards the root,
we check if we violate the height invariant. If so, we rebalance to restore
the invariant and then continue up the tree to the root.
The main cleverness of the algorithm lies in analyzing the situations
when we have to rebalance and need to apply the appropriate rotations to
restore the height invariant. It turns out that one or two rotations on the
whole tree always suffice for each insert operation, which is a very elegant
result.
First, we keep in mind that the left and right subtrees’ heights before
the insertion can differ by at most one. Once we insert an element into one
of the subtrees, they can differ by at most two. We now draw the trees in
such a way that the height of a node is indicated by the height that we are
drawing it at.
The first situation we describe is where we insert into the right subtree,
which is already of height h + 1 where the left subtree has height h. If we
are unlucky, the result of inserting into the right subtree will give us a new
right subtree of height h + 2 which raises the height of the overall tree to
h + 3, violating the height invariant. This situation is depicted below. Note
that the node we inserted does not need to be z, but there must be a node z
in the indicated position.
           x                                        x
          / \                                      / \
      (α,x)  y        insert to the right      (α,x)  y
            / \            of y                      / \
        (x,y) (y,ω)        ----->                (x,y)  z
                                                       / \
                                                   (y,z) (z,ω)

(before: height h+2, with (α,x) and (x,y) of height h and the subtree rooted
at y of height h+1; after: the subtree rooted at y has height h+2, the whole
tree has height h+3, and the height invariant is violated at x)
If the new right subtree has height h + 2, either its right or its left subtree
must be of height h + 1 (and only one of them; think about why). If it is the
right subtree we are in the situation depicted on the right above (and on the
left below). While the trees (α, x) and (x, y) must have exactly height h, the
trees (y, z) and (z, ω) need not. However, they differ by at most 1, because
we are investigating the case where the lowest place in the tree where the
invariant is violated is at x.
           x                                           y
          / \                                        /   \
      (α,x)  y         left rotation at x           x     z
            / \              ----->                / \   / \
        (x,y)  z                              (α,x)(x,y)(y,z)(z,ω)
              / \
          (y,z) (z,ω)

(the overall height drops from h+3 back to h+2)
We fix this with a left rotation at x, the result of which is displayed to the
right. Because the height of the overall tree is reduced to its original h + 2,
no further rotation higher up in the tree will be necessary.
In the second case we consider, we insert to the left of the right subtree,
and the result has height h+1. This situation is depicted on the right below.
h+3" (α,"ω)"
x"
(α,"ω)"
h+2"
x" insert"to"the"le7"of"z" z"
z" h+1"
y"
h"
(α,"x)" (x,"z)" (z,"ω)" (α,"x)" (x,"y)" (y,"z)" (z,"ω)"
In the situation on the right, the subtrees labeled (α, x) and (z, ω) must have
exactly height h, but only one of (x, y) and (y, z). In this case, a single left
rotation alone will not restore the invariant (see Exercise 1). Instead, we
apply a so-called double rotation: first a right rotation at z, then a left rotation
at the root labeled x. When we do this we obtain the picture on the right:
           x                                           y
          / \                                        /   \
      (α,x)  z        double rotation:              x     z
            / \        right at z, then            / \   / \
           y  (z,ω)    left at x              (α,x)(x,y)(y,z)(z,ω)
          / \               ----->
      (x,y) (y,z)

(the overall height again drops from h+3 back to h+2)
There are two additional symmetric cases to consider, if we insert the new
element on the left (see Exercise 4).
We can see that in each of the possible cases where we have to restore
the invariant, the resulting tree has the same height h + 2 as before the
insertion. Therefore, the height invariant above the place where we just
restored it will be automatically satisfied, without any further rotations.
6 Checking Invariants
The interface for the implementation is exactly the same as for binary search
trees, as is the code for searching for a key. In various places in the algo-
rithm we have to compute the height of the tree. This could be an operation
of asymptotic complexity O(n), unless we store it in each node and just look
it up. So we have:
struct tree_node {
elem data;
int height;
struct tree_node *left;
struct tree_node *right;
};
Of course, if we store the height of the trees for fast access, we need to
adapt it when rotating trees. After all, the whole purpose of tree rotations
is to rebalance and change the height. For that, we implement a function
fix_height that computes the height of a tree from the height of its chil-
dren. Its implementation directly follows the definition of the height of a
tree. The implementation of rotate_right and rotate_left needs to be
adapted to include calls to fix_height. These calls need to compute the
heights of the children first, before computing that of the root, because the
height of the root depends on the height we had previously computed for
the child. Hence, we need to update the height of the child before updating
the height of the root. Look at the code for details.
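Neither height nor fix_height is reproduced above; here is a minimal
sketch of both, assuming the convention from above that the empty tree has
height 0 and that every node stores its own height:
/* height of a possibly empty tree, read off the stored field */
int height(tree *T) {
  return T == NULL ? 0 : T->height;
}

/* recompute T->height from the (already correct) heights of its children */
void fix_height(tree *T) {
  REQUIRES(T != NULL);
  int hl = height(T->left);
  int hr = height(T->right);
  T->height = (hl > hr ? hl : hr) + 1;
}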
When checking if a tree is balanced, we also check that all the heights
that have been computed are correct.
We use this, for example, in a utility function that creates a new leaf
from an element (which may not be null).
tree *leaf(elem e) {
REQUIRES(e != NULL);
tree *T = xmalloc(sizeof(struct tree_node));
T->left = NULL;
T->data = e;
T->right = NULL;
T->height = 1;
ENSURES(is_avl(T));
return T;
}
Recall that, unlike alloc in C0, xmalloc in C does not initialize the memory
it returns, so the initialization of T->left and T->right to NULL is crucial.
7 Implementing Insertion
The code for inserting an element into the tree is mostly identical with
the code for plain binary search trees. The difference is that after we in-
sert into the left or right subtree, we call a function rebalance_left or
rebalance_right, respectively, to restore the invariant if necessary and cal-
culate the new height. The body of the insertion function is not reproduced
here; it ends by asserting its postcondition:
ENSURES(is_avl(T));
return T;
}
The pre- and post-conditions of this function are actually not strong enough
to prove this function correct. We also need an assertion about how the tree
might change due to insertion, which is somewhat tedious. If we perform
dynamic checking with the contract above, however, we establish that the
result is indeed an AVL tree. As we have observed several times already:
we can test for the desired property, but we may need to strengthen the
pre- and post-conditions in order to rigorously prove it.
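For reference, here is a minimal sketch of how an insertion function might
drive the two rebalancing functions. The function name, its signature, and
the client comparison function elem_compare are assumptions, since the
original listing is not reproduced in these notes:
tree *tree_insert(tree *T, elem e) {
  REQUIRES(e != NULL);
  if (T == NULL) return leaf(e);        /* new node of height 1 */
  int c = elem_compare(e, T->data);     /* hypothetical client comparison */
  if (c < 0) {
    T->left = tree_insert(T->left, e);
    T = rebalance_left(T);              /* restore invariant, fix height */
  } else if (c > 0) {
    T->right = tree_insert(T->right, e);
    T = rebalance_right(T);
  } else {
    T->data = e;                        /* equal keys: overwrite element */
  }
  ENSURES(is_avl(T));
  return T;
}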
We show only the function rebalance_right; rebalance_left is sym-
metric.
tree *rebalance_right(tree *T) {   /* precondition checks not shown here */
  tree *l = T->left;
  tree *r = T->right;
  int hl = height(l);
  int hr = height(r);
  if (hr > hl+1) {
    ASSERT(hr == hl+2);
    if (height(r->right) > height(r->left)) {
      ASSERT(height(r->right) == hl+1);
      T = rotate_left(T);
      ASSERT(height(T) == hl+2);
    } else {
      ASSERT(height(r->left) == hl+1);
      /* double rotate left */
      T->right = rotate_right(T->right);
      T = rotate_left(T);
      ASSERT(height(T) == hl+2);
    }
  } else {
    ASSERT(!(hr > hl+1));
    fix_height(T);
  }
  ENSURES(is_avl(T));
  return T;
}
Note that the preconditions are weaker than we would like. In partic-
ular, they do not imply some of the assertions we have added in order to
show the correspondence to the pictures. This is left as the (difficult) Ex-
ercise 5. Such assertions are nevertheless useful because they document
expectations based on informal reasoning we do behind the scenes. Then,
if they fail, they may be evidence for some error in our understanding, or
in the code itself, which might otherwise go undetected.
8 Experimental Evaluation
We would like to assess the asymptotic complexity and then experimen-
tally validate it. It is easy to see that both insert and search operations take
time O(h), where h is the height of the tree. But how is the height of the tree
related to the number of elements stored, if we use the balance invariant of
AVL trees? It turns out that h is O(log(n)). It is not difficult to prove this,
but it is beyond the scope of this course.
To experimentally validate this prediction, we have to run the code with
inputs of increasing size. A convenient way of doing this is to double the
size of the input and compare running times. If we insert n elements into
the tree and look them up, the running time should be bounded by c * n *
log(n) for some constant c. Assume we run it at some size n and observe
r = c * n * log(n). If we double the input size we have c * (2 * n) * log(2 * n) =
2 * c * n * (1 + log(n)) = 2 * r + 2 * c * n, so we mainly expect the running
time to double, with an additional summand that itself roughly doubles as n
doubles. In order to smooth out minor variations and get bigger numbers,
we run each experiment 100 times. (The table of measured running times is
not reproduced in these notes.)
We see in the third column, where 2r stands for the doubling of the previ-
ous value, that we are quite close to the predicted running time, with an
approximately linearly increasing additional summand.
In the fourth column we have run the experiment with plain binary
search trees which do not rebalance automatically. First of all, we see that
they are much less efficient, and second we see that their behavior with
increasing size is difficult to predict, sometimes jumping considerably and
sometimes not much at all. In order to understand this behavior, we need
to know more about the order and distribution of keys that were used in
this experiment. They were strings, compared lexicographically. The keys
were generated by counting integers upward and then converting them to
strings. The distribution of these keys is haphazard, but not random. For
example, if we start counting at 0
"0" < "1" < "2" < "3" < "4" < "5" < "6" < "7" < "8" < "9"
< "10" < "12" < ...
the first ten strings are in ascending order but then numbers are inserted
between "1" and "2". This kind of haphazard distribution is typical of
many realistic applications, and we see that binary search trees without
rebalancing perform quite poorly and unpredictably compared with AVL
trees.
The complete code for this lecture can be found in directory 19-avl/ on the
course website.
Exercises
Exercise 1 Show that in the situation on page 9 a single left rotation at the root
will not necessarily restore the height invariant.
Exercise 3 Show that left and right rotations are inverses of each other. What can
you say about double rotations?
Exercise 4 Show the two cases that arise when inserting into the left subtree
might violate the height invariant, and show how they are repaired by a right ro-
tation, or a double rotation. Which two single rotations does the double rotation
consist of in this case?
Exercise 5 Strengthen the invariants in the AVL tree implementation so that the
assertions and postconditions which guarantee that rebalancing restores the height
invariant and reduces the height of the tree follow from the preconditions.
Lecture 20
April 2, 2013
1 Introduction
In lecture 18, we emphasized the things we lost by going to C:
• Many operations that would safely cause an error in C0, like derefer-
encing NULL or reading outside the bounds of an array, are undefined
in C – we cannot predict or reason about what happens when we have
undefined behaviors.
• In C, pointers and arrays are the same – and we declare them like
pointers, writing int *i.
• The C0 types string and char[] are both represented in C as char*, a
pointer to char.
In this lecture, we will endeavor to look on the bright side and look at the
new things that C gives us. But remember: with great power comes great
responsibility.
This lecture has three parts. First, we will continue our discussion of
memory management in C: everything has an address and we can use the
address-of operation &e to obtain this address. Second, we will look at the
different ways that C represents numbers and the general, though mostly
implementation-defined, behavior of casts between the different numeric
types. Third, we will use void pointers to make data structures more generic.
2 Address-of
In C0, we can only obtain new pointers and arrays with the built-in alloc
and alloc_array operations. As we discussed last time, alloc(ty) in C0
roughly translates to malloc(sizeof(ty)) in C, with the exception that C
does not initialize allocated memory to default values. Similarly, the C0
code alloc_array(ty, n) roughly translates to calloc(n, sizeof(ty)),
and calloc does initialize allocated memory to default values. Because
both of these operations can return NULL, we also introduced xmalloc and
xcalloc that allow us to safely assume a non-NULL result.
C also gives us a new way to create pointers. If e is an expression (like
x, A[12], or *x) that describes a memory location which we can read from
and potentially write to, then the expression &e gives us a pointer to that
memory location. In C0, if we have a struct containing a string and an
integer, it’s not possible to get a pointer to just the integer. This is possible
in C:
struct wcount {
char *word;
int count;
};
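For instance, given such a struct we can take the address of the count field
by itself. This is a small sketch; the variable names are made up:
struct wcount *wc = xmalloc(sizeof(struct wcount));
wc->word = "hello";
wc->count = 1;
int *p = &wc->count;   /* points into the middle of the struct */
(*p)++;                /* increments wc->count to 2 */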
3 Stack Allocation
In C, we can also allocate data on the system stack (which is different from
the explicit stack data structure used in the running example). As discussed
in the lecture on memory layout, each function allocates memory in its so-
called stack frame for local variables. We can obtain a pointer to this memory
using the address-of operator. For example:
int main () {
int a1 = 1;
int a2 = 2;
increment(&a1);
increment(&a2);
...
}
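The increment function used in this example is not shown in these notes;
presumably it takes a pointer to an int and increments the integer it points
to, along these lines:
void increment(int *p) {
  REQUIRES(p != NULL);
  (*p)++;   /* modify the caller's variable through the pointer */
}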
Note that there is no call to malloc or calloc which allocate spaces on the
system heap (again, this is different from the heap data structure we used
for priority queues).
Note that we can only free memory allocated with malloc or calloc,
but not memory that is on the system stack. Such memory will automat-
ically be freed when the function whose frame it belongs to returns. This
has two important consequences. The first is that the following is a bug,
because free will try to free the memory holding a1, which is not on the
heap:
int main() {
int a1 = 1;
int a2 = 2;
free(&a1);
...
}
The second consequence is that pointers to data stored on the system stack do
not survive the function’s return. For example, the following is a bug:
int *f_ohno() {
int a = 1; /* bug: a is deallocated when f_ohno() returns */
return &a;
}
A correct implementation requires us to allocate on the system heap, using
a call to malloc or calloc (or one of the library functions which calls them
in turn).
int *f() {
int* x = xmalloc(sizeof(int));
*x = 1;
return x;
}
In general, stack allocation is more efficient than heap allocation, be-
cause it is freed automatically when the function in which it is defined
returns. That removes the overhead of managing the memory explicitly.
However, if the data structure we allocate needs to survive past the end of
the current function, it must be allocated on the heap.
4 Pointer Arithmetic in C
We have already discussed that C does not distinguish between pointers
and arrays; essentially a pointer holds a memory address which may be
the beginning of an array. In C we can actually calculate with memory
addresses. Before we explain how, please heed our recommendation:
Do not perform arithmetic on pointers!
Code with explicit pointer arithmetic will generally be harder to read and
is more error-prone than using the usual array access notation A[i].
Now that you have been warned, here is how it works. We can add an
integer to a pointer in order to obtain a new address. For example, we can
allocate an array and then pass pointers to its first, second, and third
elements to the increment function from before.
int* A = xcalloc(3, sizeof(int));
A[0] = 0; A[1] = 1; A[2] = 2;
increment(A); /* A[0] now equals 1 */
increment(A+1); /* A[1] now equals 2 */
increment(A+2); /* A[2] now equals 3 */
The actual address denoted by A + 1 depends on the size of the elements
stored at *A, in this case, the size of an int. A much better way to achieve
the same effect is
int* A = xcalloc(3, sizeof(int));
A[0] = 0; A[1] = 1; A[2] = 2;
increment(&A[0]); /* A[0] now equals 1 */
increment(&A[1]); /* A[1] now equals 2 */
increment(&A[2]); /* A[2] now equals 3 */
We cannot free array elements individually, even though they are located
on the heap. The rule is that we can apply free only to pointers returned
from malloc or calloc. So in the example code we can only free A.
5 Numbers in C
In addition to the undefined behavior resulting from bad memory access
(dereferencing a NULL pointer or reading outside of an array), there are
other sources of undefined behavior in C. In particular:
• Arithmetic overflow for signed types like int is undefined. (In C0,
this is defined as modular arithmetic.)
This has some strange effects. If x and y are signed integers, then the
expressions x < x+1 and x/y == x/y are either true or undefined (due to
signed overflow or division by zero, respectively). So the compiler is allowed
to pretend that these expressions are just true all the time. The compiler
is also allowed to behave the same way C0 does, returning false in the
first case when x is the maximum integer and raising an exception in the
second case when y is 0. The compiler is also free to check for signed inte-
ger overflow and division by zero and start playing Rick Astley’s “Never
Gonna Give You Up” if either occurs, though this last option is unlikely
in practice. Undefined behavior is unpredictable – it can and does change
from compiler to compiler and from one optimization level to the next.
As an aside, C also lets us allocate arrays whose size is computed at run
time on the system stack, as in the following function (uint is presumably
a typedef for unsigned int):
uint fib(int n) {
REQUIRES(n >= 0);
uint A[n+2]; /* stack-allocated array A */
A[0] = 0;
A[1] = 1;
for (int i = 0; i <= n-2; i++)
A[i+2] = A[i] + A[i+1];
return A[n]; /* deallocate A just before actual return */
}
In addition to int, which is a signed type, there are the signed types
short and long, and unsigned versions of each of these types – short is
smaller than int and long is bigger. The numeric type char is smaller than
short and always takes up one byte. The maximum and minimum values
of these numeric types can be found in the standard header file limits.h.
C, annoyingly, does not define whether char is signed or unsigned. A
signed char is definitely signed, an unsigned char is unsigned. The type
char can be either signed or unsigned – this is implementation defined.
(C also gives us floating point numbers, float and double, but we will
not cover these in 122.)
6 Implementation-defined Behavior
It is often very difficult to say useful and precise things about the C pro-
gramming language, because many of the features of C that we have to
rely on in practice are not part of the C standard. Instead, they are things
that the C standard leaves up to the implementation – implementation
defined behaviors. Implementation defined behaviors make it quite dif-
ficult to write code on one computer that will compile and run on another
computer, because the compiler on the other computer may make completely
different choices about implementation-defined behavior.
The first example we have seen is that, while a char is always exactly
one byte, we don’t know whether it is signed or unsigned – whether it
can represent integer values in the range [-128, 128) or integer values in the
range [0, 256). And it is even worse, because a byte can be more than 8 bits!
If you really want to mean “8 bits,” you should say octet.
In this class we going to rely on a number of implementation-defined
behaviors. For example, you can always assume that bytes are 8 bits. When
it is important to not rely on integer sizes being implementation-defined, it
is possible to use the types defined in stdint.h, which defines signed and
unsigned types of specific sizes. In the systems that you are going to use for
programming, you can reasonably expect a common set of implementation-
defined behaviors: char will be a signed 8-bit integer and so on. (The chart
from lecture relating char, short, int, and long to the fixed-size types in
stdint.h is not reproduced in these notes.)
Now consider a value of type unsigned char whose bit pattern is 11110000,
that is, the hexadecimal value 0xF0. What such a value turns into under a
cast is a problem you will actually encounter later in this semester. We can
cast this character value to an integer value by writing (int)e.
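The code in question did not survive in these notes; it was presumably
along these lines, where e names that character value:
unsigned char e = 0xF0;   /* bit pattern 11110000 */
int x = (int)e;           /* what ends up in x?   */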
However, what will the value of this integer be? You can run this code and
find out on your own, but the important thing to realize is that it’s not clear,
because there are two different stories we can tell.
In the first story, we start by transforming the unsigned char into an
unsigned int. When we cast from a small unsigned quantity to a large un-
signed quantity, we can be sure that the value will be preserved. Because
the bits 11110000 are understood as the unsigned integer 240, the unsigned
int will also be 240, written in hexadecimal as 0x000000F0. Then, when
we cast from an unsigned int to a signed int, we can expect the bits to re-
main the same (though this is really implementation defined), and because
the interpretation of signed integers is two’s-complement (also implemen-
tation defined) the final value will be 240.
In the second story, we transform the unsigned char into a signed
char. Again, the implementation-defined behavior we expect is that we will
interpret the result as an 8-bit signed two’s-complement quantity, meaning
that 0xF0 is understood as -16. Then, when we cast from the small signed
quantity (char) to a large signed quantity (int), we know the quantity -16
will be preserved, meaning that we will end up with a signed integer writ-
ten in hexadecimal as 0xFFFFFFF0.
                        0xF0
                 (as a uint8_t: 240)
                /                    \
        preserve value         preserve bit pattern
              /                          \
        0x000000F0                      0xF0
   (as a uint32_t: 240)           (as an int8_t: -16)
             |                            |
    preserve bit pattern           preserve value
             |                            |
        0x000000F0                   0xFFFFFFF0
   (as an int32_t: 240)           (as an int32_t: -16)
8 Void Pointers
In C, a special type void* denotes a pointer to a value of unknown type.
For most pointers, the type of a pointer tells C how big the value it points to is. When you
have a char*, it represents an address that points to one byte (or, equiva-
lently, an array of one-byte objects). When you have a int*, it represents
an address that points to four bytes (assuming the implementation defines
4-byte integers), so when C dereferences this pointer it will read or write to
four bytes at a time. A void* is just an address; C does not know how to
read or write from it. We can cast back and forth between void pointers to
other pointers.
int x = 12;
int *y = xcalloc(1, sizeof(int));
int *z;
void *px = (void*)&x;
void *py = (void*)y;
z = (int*)px;
z = (int*)py;
Casting out of void* incorrectly is generally either undefined or implementation-
defined. We can also cast between pointers and the intptr_t types that can
contain them.
int x = 12;
int *y = xcalloc(1, sizeof(int));
int *z;
intptr_t ipx = (intptr_t)&x;
uintptr_t ipy = (uintptr_t)y;
z = (int*)ipx;
z = (int*)ipy;
Thus, we don’t strictly need the void* type – we could always use uintptr_t
– but it is helpful to use the C type system to help us avoid accidentally, say
adding two pointers together.
The return type of xmalloc and company is actually a void pointer. We can
even use casts through intptr_t to squeeze a small integer value itself into
a void*, rather than an address:
int x = 12;
void *px = (void*)(intptr_t)12;
int y = (int)(intptr_t)px;
9 Simple Libraries
We can use void pointers to make data structures more generic. For exam-
ple, an interface to generic stacks might be specified as
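something like the following sketch. The original listing is not reproduced
in these notes; this version is pieced together from the operations used
below, and the struct name stack_header is an assumption:
#include <stdbool.h>

typedef struct stack_header *stack;

stack stack_new();             /* create a new, empty stack       */
bool  stack_empty(stack S);    /* is the stack empty?             */
void  push(stack S, void *x);  /* push an arbitrary pointer       */
void *pop(stack S);            /* pop; requires a non-empty stack */
void  stack_free(stack S);     /* free an empty stack             */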
Notice the use of void* for the element that is pushed onto the stack and
for the return type of pop.
stack S = stack_new();
struct wcount *wc = malloc(sizeof(struct wcount));
wc->word = "wherefore";
wc->count = 3;
push(S, wc);
wc = malloc(sizeof(struct wcount));
wc->word = "henceforth";
wc->count = 5;
push(S, wc);
while (!stack_empty(S)) {
  wc = (struct wcount*)pop(S);
  printf("Popped %s with count %d\n", wc->word, wc->count);
  free(wc);
}
Because we can squeeze integers into a void*, we can also use the
generic stacks to store integers:
stack S = stack_new();
push(S, (void*)(intptr_t)6);
push(S, (void*)(intptr_t)12);
while(!stack_empty(S)) {
printf("Popped: %d\n", (int)(intptr_t)pop(S));
}
Translating stacks from C0 to C and making them generic is no different
than translating BSTs. In fact, we no longer need stacks to know about the
client interface, because rather than having one specific element, we have
a generic element. The trade-off is that we no longer know how we are
supposed to free a generic element when we free a stack. As the previous
example shows, the elements stored as void pointers might not even be
pointers!
The easy way out is to require that stack_free only be called on empty
stacks, which means there are no elements that we have to consider freeing.
This makes the implementation of stack_free simple:
void stack_free(stack S) {
  REQUIRES(is_stack(S) && stack_empty(S));
  ASSERT(S->top == S->bottom);
  free(S->top);
  free(S);
}
In the next C lecture (Lecture 22), we will learn how to extend the stack
implementation so that we can free non-empty stacks without leaks. This
will require passing a pointer to a function that frees elements.
Lecture 21
April 4, 2012
1 Introduction
In the data structures implementing associative arrays so far, we have needed
either an equality operation and a hash function, or a comparison operator
with a total order on keys. Similarly, our sorting algorithms just used a total
order on keys and worked by comparisons of keys. We obtain a different
class of representations and algorithms if we analyze the structure of keys
and decompose them. In this lecture we explore tries, an example from this
class of data structures. The asymptotic complexity we obtain has a differ-
ent nature from data structures based on comparisons, depending on the
structure of the key rather than the number of elements stored in the data
structure.
In the grid
E F R A
H G D R
P S N A
E E B E
we have the words SEE, SEEP, and BEARDS, but not SEES. Scoring as-
signs points according to the lengths of the words found, where longer
words score higher.
One simple possibility for implementing this game is to systematically
search for potential words and then look them up in a dictionary, perhaps
stored as a sorted word list, some kind of binary search tree, or a hash table.
The problem is that there are too many potential words on the grid, so we
want to consider prefixes and abort the search when a prefix does not start
a word. For example, if we start in the upper right-hand corner and try
horizontally first, then EF is a prefix for a number of words, but EFR, EFD,
EFG, EFH are not and we can abandon our search quickly. A few more
possibilities reveal that no word with 3 letters or more in the above grid
starts in the upper left-hand corner.
Because a dictionary is sorted alphabetically, by prefix, we may be able
to use a sorted array effectively in order for the computer to play Boggle
and quickly determine all possible words on a grid. But we may still look
for potentially more efficient data structures which take into account that
we are searching for words that are constructed by incrementally extending
the prefix.
3 Multi-Way Tries
One possibility is to use a multi-way trie, where each node has a potential
child for each letter in the alphabet. Consider the word SEE. We start at the
root and follow the link labeled S, which gets us to a node on the second
level in the tree. This tree indexes all words with first character S. From
here we follow the link labeled E, which gets us to a node indexing all
words that start with SE. After one more step we are at SEE. At this point
we cannot be sure if this is a complete word or just a prefix for words stored
in it. In order to record this, we can either store a Boolean (true if the
current prefix is a complete word) or terminate the word with a special
character that cannot appear in the word itself.
A" B" C" D" E" …" Z" A" B" C" D" E" …" Z"
false" true"
While the paths to finding each word are quite short, including one more
node than characters in the word, the data structure consumes a lot of
space, because there are a lot of nearly empty arrays.
An interesting property is that the lookup time for a word is O(k),
where k is the number of characters in the word. This is independent of
how many words are stored in the data structure! Contrast this with, say,
balanced binary search trees where the search time is O(log(n)), where n is
the number of words stored. For the latter analysis we assumed that key
comparisons where constant time, which is not really true because the keys
(which are strings) have to be compared character by character. So each
comparison, while searching through a binary search tree, might take up to
O(k) individual character comparisons, which would make it O(k * log(n))
in the worst case. Compare that with O(k) for a trie.
On the other hand, the wasted space of the multi-way trie with an array
at each node costs time in practice. This is not only because this memory
must be allocated, but because on modern architectures the so-called mem-
ory hierarchy means that accesses to memory cells close to each other will be
much faster than accessing distant cells. You will learn more about this in
15-213 Computer Systems.
4 Binary Tries
The idea of the multi-way trie is quite robust, and there are useful special
cases. One of these arises if we want to represent sets of numbers. In that case
we can decompose the binary representation of numbers bit by bit in order
to index data stored in the trie. We could start with the most significant or
least significant bit, depending on the kind of numbers we expect. In this
case every node would have at most two successors, one for 0 and one for
1. This does not waste nearly as much space and can be efficient for many
purposes.
5 Linked Lists
For the particular application we have in mind, namely searching for words
on a grid of letters, we could either use multiway tries directly (wasting
space) or use binary tries (wasting time and space, because each character
is decomposed into individual bits).
A compromise solution is replacing the array (which may end up mostly
empty) with a linked list. This gives us two fundamentally different uses of
pointers. Child pointers (drawn in blue) correspond to forward movement
through the string. The next pointers of the linked list (drawn in red), on
the other hand, connect what used to be parts of the same array list.
In this representation, it also becomes natural to have the Boolean “end
of word” flag stored with the final character, rather than one step below
the final character like we did above. (This means it’s no longer possible to
store the empty string, however.) The tree above, containing BACCALAU-
REATE, BE, and BEE, now looks like this:
[Figure: the trie containing BACCALAUREATE, BE, and BEE with linked
lists in place of arrays; child pointers move forward through the string,
next pointers link the alternative characters at one position, and each
character node carries its end-of-word flag.]
First"Character"
b"
o"
Second"Character"
e" r"
a"
Third"Character"
e"
c" d"
7 Specifying an Interface
Specifying an interface for tries is tricky. If we want to just use tries as
dictionaries, then we can store arbitrary elements, but we commit to strings
as keys. In that sense our interface is not very abstract, but well-suited to
our application. (To relate this to our discussion above, an X in the diagram
above can represent the presence or absence of an element rather than a
bool flag.)
trie trie_new();
elem trie_lookup(trie TR, char *s);
void trie_insert(trie TR, char *s, elem e);
void trie_free(trie TR);
9 Checking Invariants
The declarations of the types are completely straightforward.
struct trie_header {
tst *root;
};
It only makes sense to add one to CHAR_MAX because int is a bigger type
than char. 0 and CHAR_MAX+1 essentially function as −∞ and +∞ for check-
ing the intervals of a binary search tree with strictly positive character val-
ues as keys.
If the tree is null, the word is not stored in the trie and we return NULL.
On the other hand, if we are at the end of the string (s[i+1] == '\0') we
return the stored data. Otherwise, we continue lookup in the left, middle,
or right subtree as appropriate. Important for the last case: if the string
character s[i] is equal to the character stored at the node, then we look for
the remainder of the word in the middle subtrie. This is implemented by
passing i + 1 to the subtrie.
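The lookup code itself is not reproduced above; here is a minimal sketch
following this description. The typedef for tst and the exact function
names are assumptions, taken from the fragments elsewhere in this lecture:
/* assumes: typedef struct trie_node tst;  (fields c, data, left, middle, right)
   and that elem is a pointer type, so NULL can signal "not found" */
elem tst_lookup(tst *T, char *s, int i) {
  if (T == NULL) return NULL;                             /* word not in the trie */
  if (s[i] < T->c) return tst_lookup(T->left, s, i);      /* stay at position i   */
  if (s[i] > T->c) return tst_lookup(T->right, s, i);     /* stay at position i   */
  if (s[i+1] == '\0') return T->data;                     /* matched last char    */
  return tst_lookup(T->middle, s, i+1);                   /* matched s[i]: advance */
}

elem trie_lookup(trie TR, char *s) {
  REQUIRES(TR != NULL && s != NULL && s[0] != '\0');
  return tst_lookup(TR->root, s, 0);
}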
11 Implementing Insertion
Insertion follows the same structure as search, which is typical for the kind
of data structure we have been considering in the last few weeks. If the tree
to insert into is null, we create a new node with the character of the string
we are currently considering (the ith) and null children and then continue
with the insertion algorithm.
if (T == NULL)
{
T = xmalloc(sizeof(struct trie_node));
T->c = s[i];
T->data = NULL;
T->left = NULL;
T->right = NULL;
T->middle = NULL;
}
As usual with recursive algorithms, we return the trie after insertion to
handle the null case gracefully, but we operate imperatively on the subtries.
At the top level we just insert into the root, with an initial index of 0. At
this (non-recursive) level, insertion is done purely by modifying the data
structure.
Exercises
Exercise 1 Implement the game of Boggle as sketched in this lecture. Make sure
to pick the letters according to the distribution of their occurrence in the English
language. You might use the Scrabble dictionary itself, for example, to calculate
the relative frequency of the letters.
If you are ambitious, try to design a simple textual interface to print a random
grid and then input words from the human player and show the words missed by
the player.
B"
A"
C" E"
C"
D"
Exercise 4 Using this modified TST implementation from the question above, al-
low repeated searching of prefixes with the following interface:
Exercise 5 Consider other implementations of the interface above that allow re-
peated searching of prefixes.
Lecture 22
November 15, 2012
1 Introduction
Using void* to represent pointers to values of arbitrary type, we were able
to implement generic stacks in that the types of the elements were arbitrary.
The main remaining restriction was that they had to be pointers. Generic
queues or unbounded arrays can be implemented in an analogous fashion.
However, when considering, say, hash tables or binary search trees, we run
into difficulties because implementations of these data structures require
operations on data provided by the client. For example, a hash table im-
plementation requires a hash function and an equality function on keys.
Similarly, binary search trees require a comparison function on keys with
respect to an order. In this lecture we show how to overcome this limitation
using function pointers as introduced in the previous lecture.
/************************************/
/* Hash table client-side interface */
/************************************/
key elem_key(elem e)
//@requires e != NULL;
;
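The rest of the client-side interface did not survive in these notes. Judging
from the function pointers that the C version of ht_new takes below, it
presumably also declared an equality function and a hash function on
keys, roughly as follows (the contracts are omitted here):
bool key_equal(key k1, key k2);
int key_hash(key k, int m);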
We were careful to write the implementation so that it did not need to know
what these types and functions were. But due to limitations in C0, we could
not obtain multiple implementations of hash tables to be used in the same
application, because once we fix elem, key, and the above three functions,
they cannot be changed.
Given the above, the library provides a type ht of hash tables and means
to create, insert, and search through a hash table.
/*************************************/
/* Hash table library side interface */
/*************************************/
typedef struct ht_header* ht;
ht ht_new(int capacity)
//@requires capacity > 0;
;
elem ht_lookup(ht H, key k); /* O(1) avg. */
void ht_insert(ht H, elem e) /* O(1) avg. */
//@requires e != NULL;
;
3 Generic Types
Since both keys and elements are defined by the clients, they turn into
generic pointer types when we implement a truly generic structure in C.
We might try the following in a file ht.h, where we have added the func-
tion ht_free to the interface. The latter takes a pointer to the function that
frees elements stored in the table, as explained in a previous lecture.
#include <stdbool.h>
#include <stdlib.h>
#ifndef _HASHTABLE_H_
#define _HASHTABLE_H_
#endif
We use type definitions instead of writing void* in this interface so the role
of the arguments as keys or elements is made explicit (even if the compiler
is blissfully unaware of this distinction). We write ht_elem now in the C
code instead of elem to avoid clashes with functions of variables of that
name.
However, this does not yet work. Before you read on, try to think about
why not, and how we might solve it.
#ifndef _HASHTABLE_H_
#define _HASHTABLE_H_
#endif
We have made some small changes to exploit the presence of unsigned in-
tegers (in key_hash) and the (also unsigned) type size_t to provide more
appropriate types to certain functions.
Storing the function for manipulating the data brings us closer to the
realm of object-oriented programming where such functions are called meth-
ods, and the structures they are stored in are objects. We don’t pursue this
analogy further in this course, but you may see it in follow-up courses,
specifically 15-214 Software System Construction.
/* elements */
struct wc {
char *word; /* key */
int count; /* information */
};
typedef struct wc *ht_elem;
ht ht_new(size_t capacity,
ht_key (*elem_key)(ht_elem e),
bool (*key_equal)(ht_key k1, ht_key k2),
unsigned int (*key_hash)(ht_key k, unsigned int m),
void (*elem_free)(ht_elem e))
{
REQUIRES(capacity > 0);
ht H = xmalloc(sizeof(struct ht_header));
H->size = 0;
H->capacity = capacity;
H->table = xcalloc(capacity, sizeof(chain*));
/* initialized to NULL */
H->elem_key = elem_key;
H->key_equal = key_equal;
H->key_hash = key_hash;
H->elem_free = elem_free;
ENSURES(is_ht(H));
return H;
}
chain *p = H->table[i];
while (p != NULL) {
  ASSERT(is_chain(H, i, NULL));
  ASSERT(p->data != NULL);
  if (keyequal(H, elemkey(H, p->data), k)) {
    /* overwrite existing element */
    p->data = e;
    return;
  } else {
    p = p->next;
  }
}
ASSERT(p == NULL);
...
}
At the end of the while loop, we know that the key k is not already in the
hash table. But this code fragment has a subtle memory leak. Can you see
it? (The code author overlooked this in the port of the code from C0 to C,
but one of the students noticed.)
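One way to plug the leak, sketched here under the assumption that we
want to dispose of the old element using the elem_free function pointer
stored in the hash table header, is to free it just before overwriting:
if (keyequal(H, elemkey(H, p->data), k)) {
  (*H->elem_free)(p->data);   /* free the element we are about to replace */
  p->data = e;
  return;
}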
The client has to be aware that the element already in the table will be freed
when a new one with the same key is added.
In order to avoid this potentially dangerous convention, we can also
just return the old element if there is one, and NULL otherwise. The infor-
mation that such an element already existed may be useful to the client in
other situations, so it seems like the preferable solution. The client could
always immediately apply its element-free function if that is appropriate.
This requires a small change in the interface, but first we show the relevant
code.
chain *p = H->table[i];
while (p != NULL) {
  ASSERT(p->data != NULL);
  if (keyequal(H, elemkey(H, p->data), k)) {
    /* overwrite existing element and return it */
    ht_elem tmp = p->data;
    p->data = e;
    return tmp;
  } else {
    p = p->next;
  }
}
The relevant part of the revised header file ht.h now reads:
typedef void* ht_elem;
typedef void* ht_key;
ht ht_new(size_t capacity,
ht_key (*elem_key)(ht_elem e),
bool (*key_equal)(ht_key k1, ht_key k2),
unsigned int (*key_hash)(ht_key k, unsigned int m),
void (*elem_free)(ht_elem e));
8 Separate Compilation
Although the C language does not provide much support for modularity,
convention helps. The convention rests on a distinction between header files
(with extension .h) and program files (with extension .c).
When we implement a data structure or other code, we provide not
only filename.c with the code, but also a header file filename.h with
declarations providing the interface for the code in filename.c. The im-
plementation filename.c contains #include "filename.h" at its top, and
every client will have the same line. The fact that both implementation and client
include the same header file provides a measure of consistency between
the two.
Header files filename.h should never contain any function definitions
(that is, code), only type definitions, structure declarations, macros, and
function declarations (so-called function prototypes). In contrast, program
files filename.c can contain both declarations and definitions, with the
understanding that the definitions are not available to other files.
We only ever #include header files, never program files, in order to
maintain the separation between code and interface.
• Every file filename, except for the one with the main function, has a
header file filename.h and a program file filename.c.
• The program filename.c and any client that would like to use it has
a line #include "filename.h" at the beginning.
• The header file filename.h never contains any code, only macros,
type definitions, structure definitions, and function prototypes. It
has appropriate header guards to avoid problems if it is loaded more
than once.
• We never #include any program files, only header files (with .h ex-
tension).
Exercises
Exercise 1 Convert the interface and implementation for binary search trees from
C0 to C and make them generic. Also convert the testing code, and verify that no
memory is leaked in your tests. Make sure to adhere to the conventions described
in Section 8.
EXAMPLE 1
int main () {
return (3+4)*5/2;
}
We compile it with
% cc0 -b ex1.c0
to generate the corresponding byte code file ex1.bc0:
C0 C0 FF EE # magic number
00 09 # version 4, arch = 1 (64 bits)
00 01 # function count
# function_pool
#<main>
00 00 # number of arguments = 0
00 00 # number of local variables = 0
00 0C # code length = 12 bytes
10 03 # bipush 3 # 3
10 04 # bipush 4 # 4
60 # iadd # (3 + 4)
10 05 # bipush 5 # 5
68 # imul # ((3 + 4) * 5)
10 02 # bipush 2 # 2
6C # idiv # (((3 + 4) * 5) / 2)
B0 # return #
00 00 # native count
# native pool
EXAMPLE 2
int main () {
return mid(3,6);
}
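The source of the function mid is not shown in these notes; judging from
the comments in the bytecode below, it presumably was
int mid(int lower, int upper) {
  int mid = lower + (upper - lower)/2;
  return mid;
}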
#<mid>
00 02 # number of arguments = 2
00 03 # number of local variables = 3
00 10 # code length = 16 bytes
15 00 # vload 0 # lower
15 01 # vload 1 # upper
15 00 # vload 0 # lower
64 # isub # (upper - lower)
10 02 # bipush 2 # 2
6C # idiv # ((upper - lower) / 2)
60 # iadd # (lower + ((upper - lower) / 2))
36 02 # vstore 2 # mid = ...;
15 02 # vload 2 # mid
B0 # return #
EXAMPLE 3
int main() {
return next_rand(0xdeadbeef);
}
BYTECODE:
C0 C0 FF EE # magic number
00 09 # version 4, arch = 1 (64 bits)
00 02 # function count
# function_pool
#<main>
00 00 # number of arguments = 0
00 01 # number of local variables = 1
00 07 # code length = 7 bytes
13 00 02 # ildc 2 # c[2] = -559038737
B8 00 01 # invokestatic 1 # next_rand(-559038737)
B0 # return #
#<next_rand>
00 01 # number of arguments = 1
00 01 # number of local variables = 1
00 0B # code length = 11 bytes
15 00 # vload 0 # last
13 00 00 # ildc 0 # c[0] = 1664525
68 # imul # (last * 1664525)
13 00 01 # ildc 1 # c[1] = 1013904223
60 # iadd # ((last * 1664525) + 1013904223)
B0 # return #
00 00 # native count
# native pool
EXAMPLE 4
int main () {
int sum = 0;
for (int i = 1; i < 100; i += 2)
//@loop_invariant 0 <= i && i <= 100;
sum += i;
return sum;
}
#<main>
00 00 # number of arguments = 0
00 02 # number of local variables = 2
00 26 # code length = 38 bytes
10 00 # bipush 0 # 0
36 00 # vstore 0 # sum = 0;
10 01 # bipush 1 # 1
36 01 # vstore 1 # i = 1;
# <00:loop>
15 01 # vload 1 # i
10 64 # bipush 100 # 100
A1 00 06 # if_icmplt +6 # if (i < 100) goto <01:body>
A7 00 14 # goto +20 # goto <02:exit>
# <01:body>
15 00 # vload 0 # sum
15 01 # vload 1 # i
60 # iadd #
36 00 # vstore 0 # sum += i;
15 01 # vload 1 # i
10 02 # bipush 2 # 2
60 # iadd #
36 01 # vstore 1 # i += 2;
A7 FF E8 # goto -24 # goto <00:loop>
# <02:exit>
15 00 # vload 0 # sum
B0 # return #
EXAMPLE 5
struct point {
int x;
int y;
};
typedef struct point* point;
point reflect(point p) {
point q = alloc(struct point);
q->x = p->y;
q->y = p->x;
return q;
}
int main () {
point p = alloc(struct point);
p->x = 1;
p->y = 2;
point q = reflect(p);
return q->x*10 + q->y;
}
#<reflect>
00 01 # number of arguments = 1
00 02 # number of local variables = 2
00 1B # code length = 27 bytes
BB 08 # new 8 # alloc(struct point)
36 01 # vstore 1 # q = alloc(struct point);
15 01 # vload 1 # q
62 00 # aaddf 0 # &q->x
15 00 # vload 0 # p
62 04 # aaddf 4 # &p->y
2E # imload # p->y
4E # imstore # q->x = p->y;
15 01 # vload 1 # q
62 04 # aaddf 4 # &q->y
15 00 # vload 0 # p
62 00 # aaddf 0 # &p->x
2E # imload # p->x
4E # imstore # q->y = p->x;
15 01 # vload 1 # q
B0 # return #
EXAMPLE 6
#use <conio>
int main() {
int[] A = alloc_array(int, 100);
for (int i = 0; i < 100; i++)
A[i] = i;
return A[99];
}
#<main>
00 00 # number of arguments = 0
00 02 # number of local variables = 2
00 2D # code length = 45 bytes
10 64 # bipush 100 # 100
BC 04 # newarray 4 # alloc_array(int, 100)
36 00 # vstore 0 # A = alloc_array(int, 100);
10 00 # bipush 0 # 0
36 01 # vstore 1 # i = 0;
# <00:loop>
15 01 # vload 1 # i
10 64 # bipush 100 # 100
A1 00 06 # if_icmplt +6 # if (i < 100) goto <01:body>
A7 00 15 # goto +21 # goto <02:exit>
# <01:body>
15 00 # vload 0 # A
15 01 # vload 1 # i
63 # aadds # &A[i]
15 01 # vload 1 # i
4E # imstore # A[i] = i;
15 01 # vload 1 # i
10 01 # bipush 1 # 1
60 # iadd #
36 01 # vstore 1 # i += 1;
A7 FF E7 # goto -25 # goto <00:loop>
# <02:exit>
15 00 # vload 0 # A
10 63 # bipush 99 # 99
63 # aadds # &A[99]
2E # imload # A[99]
B0 # return #
EXAMPLE 7
#use <string>
#use <conio>
int main () {
string h = "Hello ";
string hw = string_join(h, "World!\n");
print(hw);
return string_length(hw);
}
BYTECODE:
C0 C0 FF EE # magic number
00 09 # version 4, arch = 1 (64 bits)
00 01 # function count
# function_pool
#<main>
00 00 # number of arguments = 0
00 02 # number of local variables = 2
00 1B # code length = 27 bytes
14 00 00 # aldc 0 # s[0] = "Hello "
36 00 # vstore 0 # h = "Hello ";
15 00 # vload 0 # h
14 00 07 # aldc 7 # s[7] = "World!\n"
B7 00 00 # invokenative 0 # string_join(h, "World!\n")
36 01 # vstore 1 # hw = ...
15 01 # vload 1 # hw
B7 00 01 # invokenative 1 # print(hw)
57 # pop # (ignore result)
15 01 # vload 1 # hw
B7 00 02 # invokenative 2 # string_length(hw)
B0 # return #
00 03 # native count
# native pool
00 02 00 4F # string_join
00 01 00 06 # print
00 01 00 50 # string_length
Lecture Notes on
Programs as Data: The C0VM
Lecture 23
November 20, 2012
1 Introduction
A recurring theme in computer science is to view programs as data. For
example, a compiler has to read a program as a string of characters and
translate it into some internal form, a process called parsing. Another in-
stance are first-class functions, which you will study in great depth in 15–
150, a course dedicated to functional programming. When you learn about
computer systems in 15–213 you will see how programs are represented as
machine code in binary form.
In this lecture we will take a look at a virtual machine. In general, when
a program is read by a compiler, it will be translated to some lower-level
form that can be executed. For C and C0, this is usually machine code. For
example, the cc0 compiler you have been using in this course translates
the input file to a file in the C language, and then a C compiler (gcc) trans-
lates that in turn into code that can be executed directly by the machine. In
contrast, Java implementations typically translate into some intermediate
form called byte code which is saved in a class file. Byte code is then inter-
preted by a virtual machine called the JVM (for Java Virtual Machine). So
the program that actually runs on the machine hardware is the JVM which
interprets byte code and performs the requested computations.
Using a virtual machine has one big drawback, which is that it will be
slower than directly executing a binary on the machine. But it also has a
number of important advantages. One is portability: as long as we have an
implementation of the virtual machine on our target computing platform,
we can run the byte code there. So we need a virtual machine implementa-
tion for each computing platform, but only one compiler. A second advan-
tage is safety: when we execute binary code, we give away control over the
actions of the machine. When we interpret byte code, we can decide at each
step if we want to permit an action or not, possibly terminating execution if
the byte code would do something undesirable like reformatting the hard
disk or crashing the computer. The combination of these two advantages
led the designers of Java to create an abstract machine. The intent was for
Java to be used for mobile code, embedded in web pages or downloaded
from the Internet, which may not be trusted or simply be faulty. Therefore
safety was one of the overriding concerns in the design.
In this lecture we explore how to apply the same principles to develop
a virtual machine to implement C0. We call this the C0VM and in Assign-
ment 8 of this course you will have the opportunity to implement it. The
cc0 compiler has an option (-b) to produce bytecode appropriate for the
C0VM. This will give you insight not only into programs-as-data, but also
into how C0 is executed, its operational semantics.
As a side remark, at the time the C language was designed, machines
were slow and memory was scarce compared to today. Therefore, efficiency
was a principal design concern. As a result, C sacrificed safety in a number
of crucial places, a decision we still pay for today. Any time you download
a security patch for some program, chances are a virus or worm or other
malware was found that takes advantage of the lack of safety in C in order
to attack your machine. The most gaping hole is that C does not check if
array accesses are in bounds. So by assigning to A[k] where k is greater
than the size of the array, you may be able to write to some arbitrary place
in memory and, for example, install malicious code. In 15–213 Computer
Systems you will learn precisely how these kinds of attacks work, because
you will carry out some of your own!
In C0, we spent considerable time and effort to trim down the C lan-
guage so that it would permit a safe implementation. This makes it mar-
ginally slower than C on some programs, but it means you will not have
to try to debug programs that crash unpredictably. You have been intro-
duced to all the unsafe features of C, when the course switched to C, and
we taught you programming practices that avoid these kinds of behavior.
But avoiding them consistently is very difficult, even for experienced teams of programmers, as the
large number of security-relevant bugs in today’s commercial software at-
tests. One might ask: why program in C at all? One reason is that many
of you, as practicing programmers, will have to deal with large amounts
of legacy code that is written in C or C++. As such, you should be able to
understand, write, and work with these languages. The other reason is that
there are low-level systems-oriented programs such as operating systems
kernels, device drivers, garbage collectors, networking software, etc. that
are difficult to write in safe languages and are usually written in a combina-
tion of C and machine code. But don’t lose hope: research in programming
languages has made great strides over the last two decades, and there is an
ongoing effort at Carnegie Mellon to build an operating system based on
a safe language that is a cousin of C. So perhaps we won’t be tied to an
unsafe language and a flood of security patches forever.
Implementation of a virtual machine is actually one of the applications
where even today C is usually the language of choice. That’s because C
gives you control over the memory layout of data, and also permits the
kind of optimizations that are crucial to make a virtual machine efficient.
Here, we don’t care so much about efficiency, being mostly interested in
correctness and clarity, but we still use C to implement the C0VM.
2 A Stack Machine
The C0VM is a stack machine. This means that the evaluation of expressions
uses a stack, called the operand stack. It is written from left to right, with the
rightmost element denoting the top of the stack.
We begin with a simple example, evaluating an expression without
variables:
(3 + 4) * 5 / 2
In the table below we show the virtual machine instruction on the left, in
textual form, and on the right the operand stack after the instruction has been
executed. We write ‘·’ for the empty stack.
Instruction Operand Stack
·
bipush 3 3
bipush 4 3, 4
iadd 7
bipush 5 7, 5
imul 35
bipush 2 35, 2
idiv 17
The translation of expressions to instructions is what a compiler would
normally do. Here we just write the instructions by hand, in effect simulat-
ing the compiler. The important part is that executing the instructions will
compute the correct answer for the expression. We always start with the
empty stack and end up with the answer as the only item on the stack.
In the C0VM, instructions are represented as bytes. This means we only
have at most 256 different instructions. Some of these instructions require
more than one byte. For example, the bipush instruction requires a second
byte for the number to push onto the stack. The following is an excerpt
from the C0VM reference, listing only the instructions needed above.
0x10 bipush <b> S -> S,b
0x60 iadd S,x,y -> S,x+y
0x68 imul S,x,y -> S,x*y
0x6C idiv S,x,y -> S,x/y
On the right-hand side we see the effect of the operation on the stack S.
Using these instructions we can translate the expression into byte code.
Here, and in the rest of these notes, we always show byte code in hexadecimal
form, without the 0x prefix. In a binary file that contains this program we
would just see the bytes
10 03 10 04 60 10 05 68 10 02 6C
and it would be up to the C0VM implementation to interpret them ap-
propriately. The file format we use is essentially this, except we don’t
use binary but represent the hexadecimal numbers as strings separated by
whitespace, literally as written in the display above.
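To make this concrete, here is a minimal C sketch of an interpreter loop for just these four instructions. The operand stack is simply an array of integers; everything else about the representation is a simplification for the purposes of illustration.

#include <stdint.h>
#include <stdio.h>

int main(void) {
  /* byte code for (3 + 4) * 5 / 2, as derived above */
  uint8_t code[] = { 0x10, 3, 0x10, 4, 0x60, 0x10, 5, 0x68, 0x10, 2, 0x6C };

  int32_t S[16];               /* operand stack, S[0..top) */
  int top = 0;

  for (size_t pc = 0; pc < sizeof(code); ) {
    switch (code[pc]) {
    case 0x10:                 /* bipush <b>: push the signed byte operand */
      S[top++] = (int8_t)code[pc+1];
      pc += 2;
      break;
    case 0x60:                 /* iadd */
      S[top-2] = S[top-2] + S[top-1]; top--; pc++;
      break;
    case 0x68:                 /* imul */
      S[top-2] = S[top-2] * S[top-1]; top--; pc++;
      break;
    case 0x6C:                 /* idiv: a real implementation must check for division by 0 */
      S[top-2] = S[top-2] / S[top-1]; top--; pc++;
      break;
    default:
      return 1;                /* unknown instruction */
    }
  }
  printf("%d\n", S[0]);        /* prints 17 */
  return 0;
}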
3 Compiling to Bytecode
The cc0 compiler provides an option -b to generate bytecode. You can use
this to experiment with different programs to see what they translate to.
For the simple arithmetic expression from the previous section we could
create a file ex1.c0:
int main () {
return (3+4)*5/2;
}
We compile it with
% cc0 -b ex1.c0
which will write a file ex1.bc0. In the current version of the compiler, this
has the following content:
C0 C0 FF EE # magic number
00 09 # version 4, arch = 1 (64 bits)
00 01 # function count
# function_pool
#<main>
00 00 # number of arguments = 0
00 00 # number of local variables = 0
00 0C # code length = 12 bytes
10 03 # bipush 3 # 3
10 04 # bipush 4 # 4
60 # iadd # (3 + 4)
10 05 # bipush 5 # 5
68 # imul # ((3 + 4) * 5)
10 02 # bipush 2 # 2
6C # idiv # (((3 + 4) * 5) / 2)
B0 # return #
00 00 # native count
# native pool
The lines
#<main>
00 00 # number of arguments = 0
00 00 # number of local variables = 0
00 0C # code length = 12 bytes
tell the virtual machine that the function main takes no arguments, uses no
local variables, and its code has a total length of 12 bytes (0x0C in hex). The
next few lines embody exactly the code we wrote by hand. The comments
first show the virtual machine instruction and then the expression in the
source code that was translated to the corresponding byte code.
10 03 # bipush 3 # 3
10 04 # bipush 4 # 4
60 # iadd # (3 + 4)
10 05 # bipush 5 # 5
68 # imul # ((3 + 4) * 5)
10 02 # bipush 2 # 2
6C # idiv # (((3 + 4) * 5) / 2)
B0 # return #
The return instruction at the end means that the function returns the value
that is currently the only one on the stack. When this function is exe-
cuted, this will be the value of the expression shown on the previous line,
(((3 + 4) * 5) / 2).
As we proceed through increasingly complex language constructs, you
should experiment yourself, writing C0 programs, compiling them to byte
code, and testing your understanding by checking that it is as expected (or
at least correct).
4 Local Variables
So far, the only part of the runtime system that we needed was the local
operand stack. Next, we add the ability to handle function arguments and
local variables to the machine. For that purpose, a function has an array
V containing local variables. We can push the value of a local variable onto
the operand stack with the vload instruction, and we can pop the value
from the top of the stack and store it in a local variable with the vstore
instruction. Initially, when a function is called, its arguments x0, ..., x(n-1)
are stored as local variables V[0], ..., V[n-1].
Assume we want to implement the function mid.
int mid(int lower, int upper) {
int mid = lower + (upper - lower)/2;
return mid;
}
Here is a summary of the instructions we need
0x15 vload <i> S -> S,v (v = V[i])
0x36 vstore <i> S,v -> S (V[i] = v)
0x64 isub S,x,y -> S,x-y
0xB0 return .,v -> .
Notice that for return, there must be exactly one element on the stack. Us-
ing these instructions, we obtain the following code for our little function.
We indicate the operand stack on the right, using symbolic expressions to
denote the corresponding runtime values. The operand stack is not part of
the code; we just write it out as an aid to reading the program.
#<mid>
00 02 # number of arguments = 2
00 03 # number of local variables = 3
00 10 # code length = 16 bytes
15 00 # vload 0 # lower
15 01 # vload 1 # lower, upper
15 00 # vload 0 # lower, upper, lower
64 # isub # lower, (upper - lower)
10 02 # bipush 2 # lower, (upper - lower), 2
6C # idiv # lower, ((upper - lower) / 2)
60 # iadd # (lower + ((upper - lower) / 2))
36 02 # vstore 2 # mid = (lower + ((upper - lower) / 2));
15 02 # vload 2 # mid
B0 # return #
We can optimize this piece of code, simply removing the last vstore 2 and
vload 2, but we translated the original literally to clarify the relationship
between the function and its translation.
5 Constants
So far, the bipush <b> instruction is the only way to introduce a constant
into the computation. Here, b is a signed byte, so that its possible values are
-128 ≤ b < 128. What if the computation requires a larger constant?
The solution for the C0VM and similar machines is not to include the
constant directly as arguments to instructions, but store them separately
in the byte code file, giving each of them an index that can be referenced
from instructions. Each segment of the byte code file is called a pool. For
example, we have a pool of integer constants. The instruction to refer to an
integer is ildc (integer load constant).
The index into the constant pool is a 16-bit unsigned quantity, given in two
bytes with the most significant byte first. This means we can have at most
2^16 - 1 = 65,535 different constants in a byte code file.
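In a C implementation of the C0VM, the two operand bytes c1 and c2 of ildc (and of other instructions that reference a pool) might be combined as follows; the function name here is only for illustration.

#include <stdint.h>

/* Combine two operand bytes into a 16-bit unsigned pool index,
   most significant byte first. */
uint16_t pool_index(uint8_t c1, uint8_t c2) {
  return (uint16_t)((c1 << 8) | c2);
}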
As an example, consider a function that is part of a linear congruen-
tial pseudorandom number generator. It generates the next pseudorandom
number in a sequence from the previous number.
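For instance, the generator function next_rand might be written as follows; the constants are the ones that appear in the byte code below.

int next_rand(int last) {
  /* linear congruential generator */
  return last * 1664525 + 1013904223;
}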
int main() {
return next_rand(0xdeadbeef);
}
There are three constants in this file that require more than one byte to
represent: 1664525, 1013904223, and 0xdeadbeef. Each of them is assigned
an index in the integer pool. The constants are then pushed onto the stack
with the ildc instruction.
C0 C0 FF EE # magic number
00 09 # version 4, arch = 1 (64 bits)
00 02 # function count
# function_pool
#<main>
00 00 # number of arguments = 0
00 01 # number of local variables = 1
00 07 # code length = 7 bytes
13 00 02 # ildc 2 # c[2] = -559038737
B8 00 01 # invokestatic 1 # next_rand(-559038737)
B0 # return #
#<next_rand>
00 01 # number of arguments = 1
00 01 # number of local variables = 1
00 0B # code length = 11 bytes
15 00 # vload 0 # last
13 00 00 # ildc 0 # c[0] = 1664525
68 # imul # (last * 1664525)
13 00 01 # ildc 1 # c[1] = 1013904223
60 # iadd # ((last * 1664525) + 1013904223)
B0 # return #
00 00 # native count
# native pool
The comments denote the ith integer in the constant pool by c[i].
There are other pools in this file. The string pool contains string con-
stants. The function pool contains the information on each of the functions,
as explained in the next section. The native pool contains references to “na-
tive” functions, that is, library functions not defined in this file.
6 Function Calls
As already explained, the function pool contains the information on each
function which is the number of arguments, the number of local variables,
the code length, and then the byte code for the function itself. Each function
is assigned a 16-bit unsigned index into this pool. The main function always
has index 0. We call a function with the invokestatic instruction.
0xB8 invokestatic <c1,c2> S, v1, v2, ..., vn -> S, v
We find the function g at function_pool[c1<<8|c2], which must take n
arguments. After g(v1 , . . . , vn ) returns, its value will be on the stack instead
of the arguments.
Execution of the function will start with the first instruction and ter-
minate with a return (which does not need to be the last byte code in the
function). So the description of functions themselves is not particularly
tricky, but the implementation of function calls is.
Let’s collect the kind of information we already know about the runtime
system of the virtual machine. We have a number of pools which come from
the byte code file. These pools are constant in that they never change when
the program executes.
Then we have the operand stack which expands and shrinks within each
function’s operation, and the local variable array which holds function argu-
ments and the local variables needed to execute the function body.
In order to correctly implement function calls and returns we need one
further runtime structure, the call stack. The call stack is a stack of so-called
frames. We now analyze what the role of the frames is and what they need
to contain.
Consider the situation where a function f is executing and calls a func-
tion g with n arguments. At this point, we assume that f has pushed the
arguments onto the operand stack. Now we need to take the following steps:
1. Create a new local variable array Vg for the function g.
2. Pop the arguments from f ’s operand stack Sf and store them in g’s
local variable array Vg [0..n).
3. Save f ’s local variable array Vf , its remaining operand stack Sf , and its
return address pcf in a frame, and push that frame onto the call stack.
4. Start executing the code of g with a new, empty operand stack Sg .
When the called function g returns, its return value is the only value on its
operand stack Sg . We need to do the following:
1. Pop the last frame from the call stack. This frame holds Vf , Sf , and
pcf (the return address), which we restore.
2. Push g’s return value onto the restored operand stack Sf .
3. Resume execution of f at the return address pcf .
Concretely, we suggest that a frame from the call stack contain the fol-
lowing information:
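That is: the caller's local variable array Vf, its operand stack Sf, and the return address pcf. In a C implementation, a frame might be represented roughly as follows; the types and field names are merely illustrative.

#include <stdint.h>
#include <stddef.h>

/* A runtime value is either a 32-bit word ("primitive") or an address. */
typedef union { int32_t w; void *a; } value;

/* One frame of the call stack: everything needed to resume the caller. */
typedef struct frame {
  value   *V;       /* the caller's local variable array                 */
  value   *S;       /* the caller's operand stack                        */
  size_t   S_count; /* number of values currently on that operand stack */
  uint8_t *pc;      /* return address: the caller's next instruction     */
} frame;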
7 Conditionals
The C0VM does not have if-then-else or conditional expressions. Like ma-
chine code and other virtual machines, it has conditional branches that jump
to another location in the code if a condition is satisfied and otherwise con-
tinue with the next instruction in sequence.
As part of the test, the arguments are popped from the operand stack. Each
of the branching instructions takes two bytes as arguments which describe
a signed 16-bit offset. If that offset is positive we jump forward; if it is negative we
jump backward in the program.
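In a C implementation, the branch target might be computed from the two offset bytes like this (a sketch; pc points to the branch instruction itself, which is what the offsets in the compiled code below are relative to):

#include <stdint.h>

uint8_t *branch_target(uint8_t *pc, uint8_t o1, uint8_t o2) {
  int16_t offset = (int16_t)((o1 << 8) | o2);  /* signed 16-bit offset */
  return pc + offset;
}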
As an example, we compile the following loop, adding up odd numbers
to obtain perfect squares.
int main () {
int sum = 0;
for (int i = 1; i < 100; i += 2)
//@loop_invariant 0 <= i && i <= 100;
sum += i;
return sum;
}
#<main>
00 00 # number of arguments = 0
00 02 # number of local variables = 2
00 23 # code length = 35 bytes
10 00 # bipush 0 # 0
36 00 # vstore 0 # sum = 0;
10 01 # bipush 1 # 1
36 01 # vstore 1 # i = 1;
# <00:loop>
15 01 # vload 1 # i
10 64 # bipush 100 # 100
A2 00 14 # if_icmpge 20 # if (i >= 100) goto <01:endloop>
15 00 # vload 0 # sum
15 01 # vload 1 # i
60 # iadd #
36 00 # vstore 0 # sum += i;
15 01 # vload 1 # i
10 02 # bipush 2 # 2
60 # iadd #
36 01 # vstore 1 # i += 2;
A7 FF EB # goto -21 # goto <00:loop>
# <01:endloop>
15 00 # vload 0 # sum
B0 # return #
The compiler has embedded symbolic labels in this code, like <00:loop>
and <01:endloop> which are the targets of jumps or conditional branches.
In the actual byte code, they are turned into relative offsets. For example,
if we count forward 20 bytes, starting from A2 (the byte code of if_icmpge,
the negation of the test i < 100 in the source) we land at <01:endloop>
which labels the vload 0 instruction just before the return. Similarly, if we
count backwards 21 bytes from A7 (which is a goto), we land at <00:loop>
which starts with vload 1.
8 The Heap
In C0, structs and arrays can only be allocated on the system heap. The
virtual machine must therefore also provide a heap in its runtime system.
If you implement this in C, the simplest way to do this is to use the runtime
heap of the C language to implement the heap of the C0VM byte code that
you are interpreting. One can use a garbage collector for C such as libgc
in order to manage this memory. We can also sidestep this difficulty by
assuming that the C0 code we interpret does not run out of memory.
We have two instructions to allocate memory.
0xBB new <s> S -> S, a:* (*a is now allocated, size <s>)
0xBC newarray <s> S, n:w32 -> S, a:* (a[0..n) now allocated)
The new instruction takes a size s as an argument, which is the size (in
bytes) of the memory to be allocated. The call returns the address of the
allocated memory. It can also fail with an exception, in case there is insuffi-
cient memory available, but it will never return NULL. newarray also takes
the number n of elements from the operand stack, so that the total size of
allocated space is n * s bytes.
For a pointer to a struct, we can compute the address of a field by using
the aaddf instruction. It takes an unsigned byte offset f as an argument,
pops the address a from the stack, adds the offset, and pushes the resulting
address a + f back onto the stack. If a is null, an error is signaled, because
the address computation would be invalid.
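The example below uses a small point struct, which we assume is declared as follows (the typedef makes point a pointer to the struct, matching the function's signature):

struct point {
  int x;
  int y;
};
typedef struct point* point;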
point reflect(point p) {
point q = alloc(struct point);
q->x = p->y;
q->y = p->x;
return q;
}
The reflect function is compiled to the following code. When reading this
code, recall that q->x, for example, stands for (*q).x. In the comments, the
compiler writes the address of the x field in the struct pointed to by q as
&(*(q)).x, in analogy with C’s address-of operator &.
#<reflect>
00 01 # number of arguments = 1
00 02 # number of local variables = 2
00 1B # code length = 27 bytes
BB 08 # new 8 # alloc(struct point)
36 01 # vstore 1 # q = alloc(struct point);
15 01 # vload 1 # q
62 00 # aaddf 0 # &q->x
15 00 # vload 0 # p
62 04 # aaddf 4 # &p->y
2E # imload # p->y
4E # imstore # q->x = p->y;
15 01 # vload 1 # q
62 04 # aaddf 4 # &q->y
15 00 # vload 0 # p
62 00 # aaddf 0 # &p->x
2E # imload # p->x
4E # imstore # q->y = p->x;
15 01 # vload 1 # q
B0 # return #
We see that in this example, the size of a struct point is 8 bytes, 4 each for
the x and y fields. You should scrutinize this code carefully to make sure
you understand how structs work.
Array accesses are similar, except that the address computation takes
an index i from the stack. The size of the array elements is stored in the
runtime structure, so it is not passed as an explicit argument. Instead, the
byte code interpreter must retrieve the size from memory. The following is
our sample program.
int main() {
int[] A = alloc_array(int, 100);
for (int i = 0; i < 100; i++)
A[i] = i;
return A[99];
}
Showing only the loop, we have the code below (again slightly edited).
Notice the use of aadds to consume A and i from the stack, pushing &A[i]
onto the stack.
# <00:loop>
15 01 # vload 1 # i
10 64 # bipush 100 # 100
A2 00 15 # if_icmpge 21 # if (i >= 100) goto <01:endloop>
15 00 # vload 0 # A
15 01 # vload 1 # i
63 # aadds # &A[i]
15 01 # vload 1 # i
4E # imstore # A[i] = i;
15 01 # vload 1 # i
10 01 # bipush 1 # 1
60 # iadd #
36 01 # vstore 1 # i += 1;
A7 FF EA # goto -22 # goto <00:loop>
# <01:endloop>
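The aadds and arraylength instructions need the element size and the number of elements at run time. One plausible representation (a sketch, not the one required by the C0VM specification) stores both next to the data:

#include <stdint.h>
#include <stddef.h>

/* A possible layout for a C0 array at runtime: a header followed by the elements. */
typedef struct {
  int32_t count;     /* number of elements, as returned by arraylength      */
  int32_t elt_size;  /* size s of each element in bytes, as needed by aadds */
  char    elems[];   /* the elements themselves                             */
} c0_array;

/* Address computation for A[i], as performed by aadds.
   A real implementation would signal an error for a NULL array
   or an out-of-bounds index. */
char *array_elem(c0_array *A, int32_t i) {
  return A->elems + (size_t)A->elt_size * (size_t)i;
}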
#use <string>
#use <conio>
int main () {
string h = "Hello ";
string hw = string_join(h, "World!\n");
print(hw);
return string_length(hw);
}
There are two string constants, "Hello " and "World!\n". In the byte code
file below they are stored in the string pool at index positions 0 and 7.
C0 C0 FF EE # magic number
00 05 # version 2, arch = 1 (64 bits)
In the byte code program, we access these strings by pushing their address
onto the stack using the aldc instruction.
We can see its use in the byte code for the main function.
#<main>
00 00 # number of arguments = 0
00 02 # number of local variables = 2
00 1B # code length = 27 bytes
14 00 00 # aldc 0 # s[0] = "Hello "
36 00 # vstore 0 # h = "Hello ";
15 00 # vload 0 # h
14 00 07 # aldc 7 # s[7] = "World!\n"
B7 00 00 # invokenative 0 # string_join(h, "World!\n")
36 01 # vstore 1 # hw = string_join(h, "World!\n");
15 01 # vload 1 # hw
B7 00 01 # invokenative 1 # print(hw)
57 # pop # (ignore result)
15 01 # vload 1 # hw
B7 00 02 # invokenative 2 # string_length(hw)
B0 # return #
Another noteworthy aspect of the code is the use of native functions with
index 0, 1, and 2. For each of these, the native pool contains the number of
arguments and an internal index.
00 03 # native count
# native pool
00 02 00 4E # string_join
00 01 00 06 # print
00 01 00 4F # string_length
There is a further subtle point regarding the memory load and store
instructions and their interaction with strings. As we can see from the
string pool representation, a character takes only one byte of memory. The
operand stack and local variable array maintain all primitive types as 4-
byte quantities. We need to mediate this difference when loading or storing
characters. Booleans similarly take only one byte, where 0 stands for false
and 1 for true. For this purpose, the C0VM has variants of the mload and
mstore instructions that load and store only a single byte.
trivial, but we see here that it just establishes a data structure invariant for
the byte code interpreter.
It is important to recognize that there are limits to what can be done
with bytecode verification before the code is executed. For example, we
cannot check in general if division might try to divide by 0, or if the pro-
gram will terminate. There is a lot of research in the area of programming
languages concerned with pushing the boundaries of static verification, in-
cluding here at Carnegie Mellon University. Perhaps future instances of
this course will benefit from this research by checking your C0 program in-
variants, at least to some extent, and pointing out bugs before you ever run
your program just like the parser and type checker do.
Instruction operands:
<i> = local variable index (unsigned)
<b> = byte (signed)
<s> = element size in bytes (unsigned)
<f> = field offset in struct in bytes (unsigned)
<c> = <c1,c2> = pool index = (c1<<8|c2) (unsigned)
<o> = <o1,o2> = pc offset = (o1<<8|o2) (signed)
Stack operands:
a : * = address ("reference")
x, i, n : w32 = 32 bit word representing an int, bool, or char ("primitive")
v = arbitrary value (v:* or v:w32)
Stack operations
Arithmetic
Local Variables
Constants
Control Flow
Functions
Memory
0xBB new <s> S -> S, a:* (*a is now allocated, size <s>)
0xBC newarray <s> S, n:w32 -> S, a:* (a[0..n) now allocated)
0xBE arraylength S, a:* -> S, n:w32 (n = \length(a))
struct bc0_file {
u4 magic; # magic number, always 0xc0c0ffee
u2 version+arch; # version number (now 2) and architecture
u2 int_count; # number of integer constants
i4 int_pool[int_count]; # integer constants
u2 string_count; # number of characters in string pool
u1 string_pool[string_count]; # adjacent ’\0’-terminated strings
u2 function_count; # number of functions
fi function_pool[function_count]; # function info
u2 native_count; # number of native (library) functions
ni native_pool[native_count]; # native function info
};
struct function_info {
u2 num_args; # number of arguments, V[0..num_args)
u2 num_vars; # number of variables, V[0..num_vars)
u2 code_length; # number of bytes of bytecode
u1 code[code_length]; # bytecode
};
struct native_info {
u2 num_args; # number of arguments, V[0..num_args)
u2 function_table_index; # index into table of library functions
};
Lecture 24
April 23, 2013
1 Introduction
In this lecture we introduce graphs. Graphs provide a uniform model for
many structures, for example, maps with distances or Facebook relation-
ships. Algorithms on graphs are therefore important to many applications.
They will be a central subject in the algorithms courses later in the curricu-
lum; here we only provide a very small sample of graph algorithms.
2 Paths in Graphs
We start with undirected graphs which consist of a set V of vertices (also
called nodes) and a set E of edges, each connecting two different vertices.
In particular, these graphs have no edges from a node back to itself. A
graph is connected if we can reach any vertex from any other vertex by
following edges in either direction. In a directed graph edges provide a con-
nection from one node to another, but not necessarily in the opposite direc-
tion. More mathematically, we say that the edge relation between vertices is
symmetric for undirected graphs. In this lecture we only discuss undirected
graphs, although directed graphs also play an important role in many ap-
plications.
The following is a simple example of a connected, undirected graph
with 5 vertices (A, B, C, D, E) and 6 edges (AB, BC, CD, AE, BE, CE).
[Figure: the example graph with vertices A, B, C, D, E and edges AB, BC, CD, AE, BE, CE]
A path is a sequence of vertices v0, v1, v2, v3, ..., vl
of some length l ≥ 0 such that there is an edge from vi to vi+1 in the graph
for each i < l. For example, all of the following are paths in the graph
above:
A B E C D
A B A
E C D C B
B
The last one is a special case: The length of a path is given by the number of
edges in it, so a node by itself is a path of length 0 (without following any
edges). Paths always have a starting vertex and an ending vertex, which
coincide in a path of length 0. We also say that a path connects its end-
points.
The graph reachability problem is to determine if there is a path connecting
two given vertices in a graph. If we know the graph is connected, this
problem is easy since one can reach any node from any other node. But
we might refine our specification to request that the algorithm return not
just a boolean value (reachable or not), but an actual path. At that point
the problem is somewhat interesting even for connected graphs. Using our
earlier terminology, a path from vertex v to vertex w is a certificate or explicit
evidence for the fact that vertex w is reachable from another vertex v. It is
easy to check whether the certificate is valid, since it is easy to check if each
node in the path is connected to the next one by an edge. It is more difficult
to produce such a certificate.
3 Implicit Graphs
There are many, many different ways to represent graphs. In some appli-
cations they are never explicitly constructed but remain implicit in the way
the problem was solved. One such example was peg solitaire. The vertices of
the graph implicit in this problem are board positions. There is an edge from
A to B if we can make a move in position A to reach position B. Note that
this implicit graph is actually a directed graph since the game does not allow
us to undo a move we just made. The classical reachability question here
would be if from some initial position we can reach another given final po-
sition. We actually solved a related question, namely if we can reach any of
a number of alternative positions (those with exactly one peg) from a given
initial position. We win the game if we can reach any of those positions
with a single peg.
The reason why we did not explicitly construct the full graph is that
for standard boards it is unreasonably large – there are too many reachable
positions. Instead, we incrementally construct it as we search for a solu-
tion in the hope we can find a solution without ever generating all nodes.
In some examples (like the standard English board), this hope was justi-
fied if we were lucky enough to pick a good move strategy. To make sure
that unsolvable boards had no solution, however, we still had to visit ev-
ery reachable position. Just because one attempt at solving the board leaves
us with 3 pegs remaining does not mean we could not have been more
successful had we moved the pegs around in a different way.
4 A Graph Interface
We use the C0 notation for contracts on the interface functions here. Even
though C compilers do not recognize the //@requires contract and will
simply discard it as a comment, the contract still serves an important role
for the programmer reading the program. For the graph interface, we de-
cide that it does not make sense to add an edge into a graph when that edge
is already there, hence the second requires.
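A minimal version of such an interface might look as follows. This is a sketch: graph_new, graph_addedge, and graph_size appear in the code in these notes, while graph_hasedge is an assumed name used here to state the second precondition.

#include <stdbool.h>

typedef unsigned int vertex;
typedef struct graph_header *graph;

graph graph_new(unsigned int size);
//   the new graph has vertices 0, 1, ..., size-1 and no edges

unsigned int graph_size(graph G);

bool graph_hasedge(graph G, vertex v, vertex w);
//@requires v < graph_size(G) && w < graph_size(G);

void graph_addedge(graph G, vertex v, vertex w);
//@requires v < graph_size(G) && w < graph_size(G) && v != w;
//@requires !graph_hasedge(G, v, w);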
With this minimal interface, we can create a graph for our running ex-
ample (letting A = 0, B = 1, and so on).
graph G = graph_new(5);
graph_addedge(G, 0, 1);
graph_addedge(G, 1, 2);
graph_addedge(G, 2, 3);
graph_addedge(G, 0, 4);
graph_addedge(G, 1, 4);
graph_addedge(G, 2, 4);
5 Adjacency Matrices
There are two simple ways to implement the graph interface. One way is
to represent the graph as a two-dimensional array that represents its edge
relation. We can check if there is an edge from B (= 1) to D (= 3) by looking
for a checkmark in row 1, column 3. In an undirected graph, the top-right
half of this two-dimensional array will be a mirror image of the bottom-left, since the edge relation is symmetric:
      A    B    C    D    E
A          ✔              ✔
B     ✔         ✔         ✔
C          ✔         ✔    ✔
D               ✔
E     ✔    ✔    ✔
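With this representation, checking for an edge is a single lookup in the two-dimensional array. As a sketch (the struct and function names here are only for illustration, and are distinct from the adjacency-list representation below):

#include <stdbool.h>

typedef unsigned int vertex;

struct matrix_graph {
  unsigned int size;
  bool **edge;        /* edge[v][w] is true iff there is an edge between v and w */
};

bool matrix_hasedge(struct matrix_graph *G, vertex v, vertex w)
//@requires v < G->size && w < G->size;
{
  return G->edge[v][w];
}

void matrix_addedge(struct matrix_graph *G, vertex v, vertex w)
//@requires v < G->size && w < G->size && v != w;
{
  G->edge[v][w] = true;
  G->edge[w][v] = true;   /* undirected: record the edge in both directions */
}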
6 Adjacency Lists
The other classic representation of a graph is as an adjacency list. In an
adjacency list representation, we have a one-dimensional array that looks
much like a hash table. Each vertex has a spot in the array, and each spot
in the array contains a linked list of all the other vertices connected to that
vertex. Our running example would look like this as an adjacency list:
struct adjlist_node {
vertex vert;
adjlist *next;
};
struct graph_header {
unsigned int size;
adjlist *adj[]; // Flexible array member!
};
The array adj of adjacency lists will be contiguous in memory with the
size field – this is quite different than, say, a hashtable chain, which is a
linked list with data fields of type elem and next pointers. We allocate
this adjacency list using xcalloc to make sure that the adjacency list is
initialized to an empty array. Behind the scenes xcalloc just multiplies
its two arguments; because we are allocating a struct with a flexible array
member, we pass in 1 for the first argument and explicitly figure out the
desired size of the array for the second argument:
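A sketch of how graph_new might do this, using the struct declarations above; xcalloc is the allocation wrapper used in this course, which behaves like calloc but aborts instead of returning NULL.

#include <stdlib.h>

void *xcalloc(size_t nmemb, size_t size);   /* from the course's xalloc library */

graph graph_new(unsigned int size) {
  /* one struct whose flexible array member has 'size' entries; xcalloc
     zeroes the memory, so every adjacency list starts out empty (NULL) */
  graph G = xcalloc(1, sizeof(struct graph_header) + size * sizeof(adjlist*));
  G->size = size;
  return G;
}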
7 Depth-First Search
The first algorithm we consider for determining if one vertex is reachable
from another is called depth-first search.
Let’s try to work our way up to this algorithm. Assume we are trying to
find a path from u to w. We start at u. If it is equal to w we are done, because
w is reachable by a path of length 0. If not we pick an arbitrary edge leaving
u to get us to some node v. Now we have “reduced” the original problem
to the one of finding a path from v to w.
The problem here is of course that we may never arrive at w even if there
is a path. For example, say we want to find a path from A to D in our earlier
example graph.
[Figure: the example graph]
To avoid running in circles, we mark every node we visit. Say we start at A, mark it, and move to B. When we are at B we mark B and have three choices for the next step.
1. We could go back to A, but A has already been marked.
2. We could go to E.
3. We could go to C.
Say we pick E. At this point we again have three choices. We might consider
A as the next node on the path, but it is ruled out because A has already been
marked. We show this by dashing the edge from A to E to indicate it was
considered, but ineligible. The only possibility now is to go to C, because
we have been at B as well (we just came from B). From C we can continue
to D, and we have found the path

A → B → E → C → D
[Figure: the example graph, repeated]

Consider now the same graph, and the goal to find a path from E to B.
Let’s say we start E → C and then C → D. At this point, all the vertices we
could go to (which is only C) have already been marked! So we have to
backtrack to the most recent choice point and pursue alternatives. In this
case, this could be C, where the only remaining alternative would be B,
completing the path E → C → B. Notice
that when backtracking we have to go back to C even though it is already
marked.
D" F"
A"
E"
B" G"
C"
We write the current node we are visiting on the left and on the right a stack
of nodes we have to return to when we backtrack. For each of these we
also remember which choices remain (in parentheses). We annotate marked
nodes with an asterisk, which means that we never pick them as the next
node to visit. In particular, when our current node has been marked, then
we have been there and did not find the goal (yet), so we do not explore
again. Hence, we do not care what their neighbors are now, because those
had already been put on the stack earlier and will be or will have been
considered at some point.
For example, at step 5 we do not consider E* but go to D instead. We
backtrack when no unmarked neighbors of the current node remain.
Step  Current  Neighbors    Stack                                      Remark
  1   A        (E, B)
  2   E        (C, B, A*)   A*(B)
  3   C        (G, E*, D)   E*(B, A*) | A*(B)
  4   G        (C*)         C*(E*, D) | E*(B, A*) | A*(B)
  5   C*       don't care   G*() | C*(E*, D) | E*(B, A*) | A*(B)       Backtrack
  6   E*       don't care   C*(D) | E*(B, A*) | A*(B)
  7   D        (F, C*)      C*() | E*(B, A*) | A*(B)
  8   F        (D*)         D*(C*) | C*() | E*(B, A*) | A*(B)
  9   D*       don't care   F*() | D*(C*) | C*() | E*(B, A*) | A*(B)   Backtrack
 10   C*       don't care   D*() | C*() | E*(B, A*) | A*(B)            Backtrack ×2
 11   B        (A*)         E*(B, A*) | A*(B)                          Goal Reached
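A sketch of the recursive implementation, written directly against the adjacency-list representation above (mark[] records which vertices have been visited):

#include <stdbool.h>
#include <stddef.h>

bool dfs(graph G, bool mark[], vertex v, vertex target) {
  if (v == target) return true;
  mark[v] = true;
  /* each call walks v's adjacency list with its own pointer p */
  for (adjlist *p = G->adj[v]; p != NULL; p = p->next) {
    if (!mark[p->vert] && dfs(G, mark, p->vert, target))
      return true;
  }
  return false;
}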
Note that this recursive implementation of DFS uses the (implicit) stack
of the function calls to dfs and each dfs function body has its own linked
list for the adjacency list. In effect, that gives the search management data
structure the form of a stack of queues (see Clac lab) as indicated in the
example above. The stack elements are separated by | and the elements of
the queue are wrapped in parentheses, (B, A*) etc.
We can also manage the stack explicitly instead of relying on recursion:

bool dfs(graph G, vertex start, vertex target) {
  stack S = stack_new();
  push(S, (void*)(uintptr_t)start);
  bool mark[graph_size(G)];
  for (unsigned int i = 0; i < graph_size(G); i++)
    mark[i] = false;
  while (!stack_empty(S)) {
    vertex v = (vertex)(uintptr_t)pop(S);
    if (!mark[v]) {
      printf("Visiting %d\n", v);
      mark[v] = true;
      if (v == target) {
        stack_free(S, NULL);
        return true;
      }
      /* push all neighbors of v; already marked ones are skipped when popped */
      for (adjlist *p = G->adj[v]; p != NULL; p = p->next)
        push(S, (void*)(uintptr_t)p->vert);
    }
  }
  stack_free(S, NULL);
  return false;
}
We explicitly put the starting node on the stack. Then, every time we
pop an element of the stack, we check to see whether it has been marked
already (like we did in the recursive implementation). If it wasn’t, we visit
the node by marking it and comparing it to the target. Ultimately, we push
all neighbors on the stack to make sure we look at them later.
9 Breadth-First Search
The iterative DFS algorithm managed its agenda, i.e., the list of nodes it
still had to look at, using a stack. But there is no reason to insist on a stack
for this purpose. What happens if we replace the stack
by a queue instead? All of a sudden, we will no longer explore the most
recently found neighbor first as in depth-first search, but, instead, we will
look at the oldest neighbor first. This corresponds to a breadth-first search
where you explore the graph layer by layer. So BFS completes a layer of
the graph before proceeding to the next layer. The code for that and many
other interesting variations of graph search can be found on the web page.
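For instance, a sketch of such a breadth-first search, mirroring the iterative DFS above but with a queue; the queue function names here are assumptions of this sketch, analogous to the stack interface used above.

#include <stdbool.h>
#include <stdint.h>

/* assumed queue interface */
typedef struct queue_header *queue;
queue queue_new(void);
bool  queue_empty(queue Q);
void  enq(queue Q, void *e);
void *deq(queue Q);
void  queue_free(queue Q, void (*elem_free)(void *e));

bool bfs(graph G, vertex start, vertex target) {
  queue Q = queue_new();
  enq(Q, (void*)(uintptr_t)start);
  bool mark[graph_size(G)];
  for (unsigned int i = 0; i < graph_size(G); i++)
    mark[i] = false;
  while (!queue_empty(Q)) {
    vertex v = (vertex)(uintptr_t)deq(Q);   /* oldest discovered vertex first */
    if (!mark[v]) {
      mark[v] = true;
      if (v == target) { queue_free(Q, NULL); return true; }
      for (adjlist *p = G->adj[v]; p != NULL; p = p->next)
        enq(Q, (void*)(uintptr_t)p->vert);
    }
  }
  queue_free(Q, NULL);
  return false;
}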
Lecture 25
April 25, 2013
[Figures: the example graph on the vertices A, B, C, D, E, together with two of its spanning trees]
5. A graph with exactly one path between any two distinct vertices,
where a path is a sequence of distinct vertices where each is connected
to the next by an edge. (For paths in a tree to be distinct, we have to
disallow paths that double back on themselves).
When considering the asymptotic complexity it is often useful to cate-
gorize graphs as dense or sparse. Dense graphs have a lot of edges compared
to the number of vertices. Writing n = |V| for the number of vertices (which
will be our notation in the rest of the lecture), we know there can be at most
n*(n-1)/2 edges: every node could be connected to every other node (n*(n-1)
ordered pairs), but each undirected edge is counted only once (n*(n-1)/2). If
we write e for the number of edges, we have e = O(n^2). By comparison, a
tree is sparse because e = n - 1 = O(n).
To construct a spanning tree of a connected graph, we can proceed as follows:
1. Start with the collection of singleton trees, each with exactly one node.
2. As long as we have more than one tree, connect two trees together
with an edge in the graph.
Let’s try this algorithm on our first graph, considering edges in the
listed order: (AB, BC, CD, AE, BE, CE).
[Figure: the given graph, the completely disconnected starting forest, and the forest after adding each of the edges AB, BC, CD, and AE]
The first graph is the given graph; the completely disconnected graph is the
starting point for this algorithm. At the bottom right we have computed the
spanning tree, which we know because we have added n - 1 = 4 edges. If
we tried to continue, the next edge BE could not be added because it does
not connect two trees, and neither could CE. The spanning tree is complete.
[Figure: the example graph with edge weights: AE, BE, and CE have weight 2, while AB, BC, and CD have weight 3]
1. Sort the edges in increasing order of weight.
2. Consider the edges in this order. If an edge does not create a cycle, add
it to the spanning tree; otherwise discard it. Stop when n - 1 edges
have been added, because then we must have a spanning tree.
[Figure: the example graph with edge weights]

To find a minimum spanning tree for this graph, we first sort the edges.
There is some ambiguity among edges of equal weight—say we obtain the
following list:
AE 2
BE 2
CE 2
BC 3
CD 3
AB 3
We now add the edges in order, making sure we do not create a cycle. After
adding the first three edges AE, BE, and CE (all of weight 2), the vertices
A, B, C, and E form a single tree, while D is still a tree by itself.

[Figure: the partial spanning tree with edges AE, BE, and CE]
At this point we consider BC. However, this edge would create a cycle
BCE since it connects two vertices in the same tree instead of two differ-
ent trees. We therefore do not add it to the spanning tree. Next we consider
CD, which does connect two trees. At this point we have a minimum span-
ning tree
[Figure: the minimum spanning tree with edges AE, BE, CE (weight 2) and CD (weight 3)]
We do not consider the last edge, AB, because we have already added
n - 1 = 4 edges.
In the next lecture we will analyze the problem of incrementally adding
edges to a tree in a way that allows us to quickly determine if an edge
would create a cycle.
Exercises
Exercise 1 Write a function to generate a random permutation of a given array,
using a random number generator with the interface in the standard rand library.
What is the asymptotic complexity of your function?
Exercise 2 Prove that the cycle property implies the correctness of Kruskal’s algo-
rithm.
Lecture 26
April 30, 2013
1 Introduction
Kruskal’s algorithm for minimum weight spanning trees starts with a col-
lection of single-node trees and adds edges until it has constructed a span-
ning tree. At each step, it must decide if adding the edge under consid-
eration would create a cycle. If so, the edge would not be added to the
spanning tree; if not, it will.
In this lecture we will consider an efficient data structure for checking if
adding an edge to a partial spanning tree would create a cycle, a so-called
union-find structure.
Adding an edge between two trees takes the union of the two equivalence
classes, because each node in either of the two trees is now connected to all
nodes in both trees.
When we have to decide if adding an edge between two nodes u and w
would create a cycle, we have to determine if u and w belong to the same
equivalence class. If so, then there is already a path between u and w; adding
the edge would create a cycle. If not, then there is not already such a path,
and adding the edge would therefore not create a cycle.
The union-find data structure maintains a so-called canonical represen-
tative of each equivalence class, which can be computed efficiently from
any element in the class. We then determine if two nodes u and w are in
the same class by computing the canonical representatives from u and w,
respectively, and comparing them. If they are equal, they must be in the
same class, otherwise they are in two different classes.
3 An Example
In order to motivate how the union-find data structure works, we consider
an example of Kruskal’s algorithm. We have the following graph, with the
indicated edge weights.
[Figure: the example graph with edge weights: AE, ED, FB, and CF have weight 1, while AD, EF, and CB have weight 2]
We have to consider the edges in increasing order, so let’s fix the order AE,
ED, FB, CF, AD, EF, CB. We represent the nodes A–F as integers 0–5
and keep the canonical representation for each node in an array.
[Figure: the six nodes A–F, with no edges added yet]

node:   A  B  C  D  E  F
index:  0  1  2  3  4  5
A[i]:   0  1  2  3  4  5
We begin by considering the edge AE. We see that A and E are in two dif-
ferent equivalence classes because A[0] = 0 and A[4] = 4, and 0 ≠ 4. This
means we have to add an edge between A and E.
[Figure: the forest after adding the edge AE]

node:   A  B  C  D  E  F
index:  0  1  2  3  4  5
A[i]:   0  1  2  3  0  5
Next we consider ED. Again, this edge should be added because A[4] =
0 ≠ 3 = A[3].
[Figure: the forest after adding the edges AE and ED]

node:   A  B  C  D  E  F
index:  0  1  2  3  4  5
A[i]:   0  1  2  0  0  5
We now combine two more steps, because they are analogous to the
above, adding edges FB and CF.
[Figure: the forest after also adding the edges FB and CF]

The array:

node:   A  B  C  D  E  F
index:  0  1  2  3  4  5
A[i]:   0  5  5  0  0  5
The next edge is AD. In the array we have that A[0] = 0 = A[3], so A
and D belong to the same equivalence class. Adding the edge would create
a cycle, so we ignore it and move on.
The next edge to consider is EF. Since A[4] = 0 ≠ 5 = A[5], E and F are
in different equivalence classes. The two classes are of equal size, so we
have to decide which to make the canonical representative. If it is 0, then
we would need to change A[1], A[2], and A[5] all to be 0. This could take
up to n/2 changes in the array, potentially multiple times at different stages
during the algorithm.
In order to avoid this we make one representative “point” to the other
(say, setting A[5] = 0), but we do not change A[1] and A[2]. Now, to find
the canonical representative of, say, 1 we first look up A[1] = 5. Next we
look up A[5] = 0. Then we look up A[0] = 0. Since A[0] = 0 we know that
we have found a canonical element and stop. In essence, we follow a chain
of pointers until we reach a root, which is the canonical representative of
the equivalence class and looks like it points to itself.
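In C, this chase up the chain might be written as follows (a sketch over the array representation just described):

/* Return the canonical representative of the class containing i:
   follow the array until we reach a root, i.e., an index that points to itself. */
int find(int A[], int i) {
  while (A[i] != i)
    i = A[i];
  return i;
}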
[Figure: the spanning tree after adding the edge EF]

node:   A  B  C  D  E  F
index:  0  1  2  3  4  5
A[i]:   0  5  5  0  0  0
At this point we can stop, and we don’t even need to consider the last
edge BC. That’s because we have already added 5 = n - 1 edges (where n
is the number of nodes), so we must have a spanning tree at this stage.
Examining the union-find structure we see that the representative for
all nodes is indeed 0, so we have reduced it to one equivalence class and
therefore a spanning tree.
In this algorithm we are trying to keep the chain from any node to its
representative short. Therefore, when we are applying the union to two
classes, we want the one with shorter chains to point to the one with longer
chains. In that case, we will only increase the size of the longest chain if
both are equal.
In this case, we can show relatively easily that the worst-case complex-
ity of a sequence of n find or union operations is O(n * log(n)) (see Exer-
cise 1).
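As a sketch of this strategy, using the find function from above and keeping the chain length of each root in a separate depth array (the extra array is an assumption of this sketch, not the representation used in lecture):

int find(int A[], int i);   /* as sketched above */

void union_classes(int A[], int depth[], int i, int j) {
  int ri = find(A, i);
  int rj = find(A, j);
  if (ri == rj) return;      /* already in the same equivalence class */
  if (depth[ri] < depth[rj]) {
    A[ri] = rj;              /* root with shorter chains points to the other */
  } else if (depth[rj] < depth[ri]) {
    A[rj] = ri;
  } else {
    A[rj] = ri;              /* equal depths: pick one root ...           */
    depth[ri]++;             /* ... whose longest chain may grow by one   */
  }
}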
4 An Implementation
Instead of developing the implementation here, we refer the reader to the
code posted at 26-unionfind. There is a simple implementation, unionfind-
lin.c, as we developed in lecture, which does not try to maintain balance,
and is therefore linear in the worst case.
The second implementation, at unionfind-log.c, changes the representa-
tion we used above slightly. In the above representation, an element i is
canonical if A[i] = i. In the improved representation, A[i] = d, where d is
the maximal length of the chain leading to the representative i. This allows
us to make a quick decision about how to pick a representative for the union.
Exercises
Exercise 1 Prove that after n union operations, the longest chain from an element
to its representative is O(log(n)) if we always take care to have the class with
longer chains be the canonical representative of the union. This is without any
form of path compression.
For the programming portion of this week’s homework, you’ll write two C0 files corre-
sponding to two different string processing tasks, and two other files that perform unit
tests on a potentially buggy sorting implementation:
You should submit these files electronically by 11:59 pm on the due date. Detailed submission
instructions can be found below.
15-122 Homework 2 Page 2 of 11
Starter code. Download the file hw2-handout.tgz from the course website. When you
unpack it, you will find a lib/ directory with several C0 files, including stringsearch.c0
and readfile.c0. You will also see a texts/ directory with some sample text files you may
use to test your code. You should not modify or submit code in the lib directory.
For this homework, you are not provided any main() functions. Instead, you should write
your own main() functions for testing your code. You should put this test code in separate
files from the ones you will submit for the problems below (e.g. duplicates-test.c0). You
may hand in these files or not.
Compiling and running. You will compile and run your code using the standard C0 tools.
For example, if you’ve completed the program duplicates that relies on functions defined
in stringsearch.c0 and you’ve implemented some test code in duplicates-test.c0, you
might compile with a command like the following:
Don’t forget to include the -d switch if you’d like to enable dynamic annotation checking,
but this check should be turned off when you are evaluating the running time of a function.
Submitting. Once you’ve completed some files, you can submit them to Autolab. There
are two ways to do this:
Your files can also be submitted to the web interface of Autolab. To do so, please tar
them, for example:
You can submit this assignment as often as you would like. When we grade your assign-
ment, we will consider the most recent version submitted before the due date. If you get any
errors while trying to submit your code, you should contact the course staff immediately.
Unit testing. You should write unit tests for your code. This involves writing a separate
main() function that runs individual functions many times with various inputs, asserting
that the expected output is produced. You should specifically choose function inputs that
are tricky or are otherwise prone to fail. While you will not directly receive a large amount
of credit for these tests, your tests will help you check the correctness of your code, pinpoint
the location of bugs, and save you hours of frustration.
Style. Strive to write code with good style: indent every line of a block to the same level,
use descriptive variable names, keep lines to 80 characters or fewer, document your code
with comments, etc. If you find yourself writing the same code over and over, you should
write a separate function to handle that computation and call it whenever you need it. We
will read your code when we grade it, and good style is sure to earn our good graces. Feel
free to ask on Piazza if you’re unsure of what constitutes good style.
1
http://c0.typesafety.net/tutorial/Strings.html
2 Removing Duplicates
In this programming exercise, you will take a sorted array of strings and return a new sorted
array that contains the same strings without duplicates. The length of the new array should
be just big enough to hold the resulting strings. Place your code for this section in a file
called duplicates.c0; you’ll want this file to start with #use "lib/stringsearch.c0" in
order to get the is_sorted function from class adapted to string arrays. Implement unit
tests for all of the functions in this section in a file called duplicates_test.c0.
Task 1 (1 pt) Implement a function matching the following function declaration:
bool is_unique(string[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
where n represents the size of the subarray of A that we are considering. This function should
return true if the given string array contains no repeated strings and false otherwise.
Task 2 (1 pt) Implement a function matching the following function declaration:
int count_unique(string[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
where n represents the size of the subarray of A that we are considering. This function should
return the number of unique strings in the array, and your implementation should have an
appropriate asymptotic running time given the precondition.
where n represents the size of the subarray of A that we are considering. The strings in
the array should be sorted before the array is passed to your function. This function should
return a new array that contains only one copy of each distinct string in the array A. Your
new array should be sorted as well. Your implementation should have a linear asymptotic
running time. Your solution should include annotations for at least 3 strong postconditions.
You must include annotations for the precondition(s), postcondition(s) and loop invari-
ant(s) for each function. You may include additional annotations for assertions as necessary.
You may include any auxiliary functions you need in the same file, but you should not include
a main() function.
Your job: In this exercise, you will write a function for analyzing the number of tokens
from a Twitter feed that appear (or not) in a user’s vocabulary. The user’s expected vo-
cabulary will be represented by a sorted array of strings vocab that has length v, and we
will maintain another integer array, freq, where freq[i] represents the number of times we
have seen vocab[i] in tweets so far (where i ∈ [0, v)).
[Figure: a sample vocab array of words with its parallel freq array; the initial frequency counts are 1, 12, 0, 0, 2, 4, 1, 2]
2
Any resemblance between this scenario and Dr. Luis von Ahn’s company DuoLingo (www.duolingo.com)
is purely coincidental and should not be construed otherwise.
This is an important pattern, and one that we will see repeatedly throughout the semester
in 15-122: the (sorted) vocabulary words stored in vocab are keys and the frequency counts
stored in freq are values.
The function count_vocab that we will write updates the values – the frequency counts
– based on the unsorted Twitter data we are getting in. For example, consider a Twitter
corpus containing only this tweet by weatherman Scott Harbaugh:
[Figure: the tweet, and the vocab and freq arrays after processing it; the frequency counts are now 2, 12, 1, 1, 2, 5, 2, 2]
Your data: DosLingos has given you 4 data files for your project in the texts/ directory:
Your tools: DosLingos already has a C0 library for reading text files, provided to you as
lib/readfile.c0, which implements the following functions:
You need not understand anything about the type string_bundle other than that you
can extract its underlying string array and the length of that array:
$ coin lib/readfile.c0
--> string_bundle bund = read_words("texts/scotttweet.txt");
bund is 0xFFAFB8E0 (struct fat_string_array*)
--> string_bundle_length(bund);
6 (int)
--> string[] tweet = string_bundle_array(bund);
tweet is 0xFFAFB670 (string[] with 6 elements)
--> tweet[0];
"phil" (string)
--> tweet[5];
"burrow" (string)
Being connoisseurs of efficient algorithms, DosLingos has also implemented their own set
of string search algorithms in lib/stringsearch.c0, which you may also find useful for this
assignment:
You can include these libraries in your code by writing #use "lib/readfile.c0" and
#use "lib/stringsearch.c0".
The function should return the number of occurrences of words in the file tweetfile that
do not appear in the array vocab, and should update the frequency counts in freq with the
number of times each word in the vocabulary appears. If a word appears multiple times in
the tweetfile, you should count each occurrence separately, so the tweet “ha ha ha LOL
LOL” would cause the frequency count for “ha” to be incremented by 3 and would cause
2 to be returned, assuming LOL was not in the vocabulary.
Note that a precondition of count_vocab is that the vocab must be sorted, a fact you
should exploit. Your function should use the linear search algorithm when fast is set to false
and it should use the binary search algorithm when fast is true.
Because count_vocab uses the is_unique function you wrote earlier, when you write a
count_vocab-test.c0 function to test your implementation, you’ll want to include the file
duplicates.c0 on the command line:
1. Give the asymptotic running time of count_vocab under (1) linear and (2) binary
search using big-O notation. This should be in terms of v, the size of the vocabulary,
and n, the number of tweets in tweetfile.
2. How many seconds did it take your function to run on the linear search strategy
(fast=false) using the small 1K twitter text? Do not use contract checking via
the -d option. Also, these tests should use cc0, not Coin, so you’ll need to write a
file count_vocab-time.c0 to help you when you do this step.
You should use the Unix command time in this step. You can report either wall clock
time or CPU time, but say which one you used.
Example: time ./count_vocab
3. How many seconds did it take for the binary search strategy (fast=true) to run on the
small 1K twitter text?
4. How many seconds did it take for the binary search strategy (fast=true) on the larger
200K twitter text?
4 Unit testing
DosLingos’s old selection sort is no longer up to the task of sorting large texts to make
vocabularies. Your colleagues currently use two sorts, both given in lib/stringsort.c0:
The first is an in-place sort like we discussed in class, and the second is a copying sort that
must leave the original array A unchanged and return a sorted array of length upper-lower.
(Note that neither of these conditions are directly expressed in the contracts.)
DosLingos decided to pay another company to write faster sorting algorithms with the
same interface. Unfortunately, they didn’t realize that the other company was a closed-source
shop, so now your company’s future is depending on code you can’t see – you know that the
contracts are set up correctly, but you don’t know anything about the implementation. This
causes (at least) two big problems.
First, you can’t prove that your sorting functions always respect their contracts – the
best you can do is give a counterexample, writing a test that causes the @ensures statement
to fail. If the outside contractors give you this completely bogus implementation. . .
. . . then you can show that their implementation is buggy by writing a test file with a main()
function that performs a sort and observing that the @ensures statement fails when you
compile the test with -d and run it.
Second, even code that does satisfy the contracts may not actually be correct! For exam-
ple, this sort_copy function will never fail the postcondition, but it is definitely incorrect:
In order to catch this kind of incorrect sort, you will have to write a test file with a main()
function that runs the sort and then uses //@assert statements to check that other correct-
ness conditions hold – to catch this bug, the is_in() and/or binsearch() functions might
be helpful, for instance, though they certainly aren’t necessary.
To get full credit on the next task, you’ll need to write tests with extra assertions that will
fail with assertion errors both if the outside contractors wrote a sometimes-postcondition-
failing implementation and if they exploited the contracts to give you a bogus-but-contract-
abiding implementation.
Task 6 (5 pts) Write two files, sort-test.c0 and sort_copy-test.c0, that test the two
sorting functions. The autograder will assign you a grade based on the ability of your unit
tests to pass when given a correct sort and fail when given various buggy sorts. Your tests
must still be safe: it should not be possible for your code to make an array access out-of-
bounds when -d is turned on.
You do not need to catch all our bugs to get full points, but catching additional bugs will
be reflected on the scoreboard (and considered for bonus points).
All four of these tests should compile and run, but the last two invocations of ./a.out
should trigger a contract violation if your tests are more than minimal. We will only test
one function at a time, so sort-test.c0 must only reference the sort() function and
sort_copy-test.c0 must only reference the sort_copy() function. Both can reference the
specification function is_sorted and all other functions defined in lib/stringsearch.c0.
You can #use "lib/stringsearch.c0" in your tests, but it is important that you do not
#use "lib/stringsort.c0".
15-122 Homework 6 Page 1 of 11
Name:
Andrew ID:
Recitation:
The written portion of this week’s homework will give you some practice working with heaps,
priority queues, BSTs, and AVL trees, as well as begin our transition to full C. You can either
type up your solutions or write them neatly by hand in the spaces provided. You should
submit your work in class on the due date just before lecture or recitation begins. Please
remember to staple your written homework before submission.
1. Heaps.
We represent heaps, conceptually, as trees. For example, consider the min-heap below.1
(2) (a) Assume a heap is stored in an array as discussed in class where the root is stored
at index 1. Using the above min-heap, at what index is the element with value 32
stored? At what index is its parent stored? At what indices are its left and right
children stored?
Solution:
The value 32 is stored at index ___________.
(1) (b) Suppose we have a non-empty min-heap of integers of size n and we wish to find
the maximum integer in the heap. Describe precisely where the maximum must
be in the min-heap. (You should be able to answer this question with one short
sentence.)
Solution:
1
Diagram courtesy of Hamilton (http://hamilton.herokuapp.com)
(1) (c) Using the following C0 definition for a heap of integers (position 0 in the array is
not used):
struct heap_header {
int limit; // size of the array of integers
int next; // next available array position for an integer
int[] value;
};
typedef struct heap_header* heap;
Write a C0 function find_max that takes a non-empty min-heap and returns the
maximum value. Your code should examine only those cells that could possibly
hold the maximum.
Solution:
int find_max(heap H)
//@requires is_heap(H);
//@requires H->next > 1;
{
(1) (d) What is the worst-case runtime complexity in big-O notation of your find_max
function on a non-empty min-heap of n elements from the previous problem?
Solution:
Solution:
(1) (b) Draw a tree with at least four elements that is a BST and satisfies the (min-)heap
ordering invariant.
Solution:
(1) (c) Why is it not a good idea to have a data structure that enforces both the (min-)heap
ordering invariant and the BST ordering invariant? (Be brief!)
Solution:
(3) (d) Maintaining the BST ordering invariant and the heap invariant on the same set of
values may not be a good idea, but it can be useful to have a tree structure where
each node has two separate values – a key used for the BST ordering invariant and
a priority used for the heap ordering invariant. Such trees are called treaps; we will
use strings as keys and ints as priorities in this question.
The treap below satisfies the BST ordering invariant, but violates the heap ordering
invariant because of the relationship between the “e”/9 node and its parent. In a
heap, we restore the heap ordering invariant using swaps. But in a treap, such a
swap would violate the BST ordering invariant. However, by using the same local
rotations we learned about for AVL trees, it is possible to restore the heap ordering
invariant while preserving the BST ordering invariant.
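As background, a single rotation on a treap node might be sketched in C0 as follows; the
node type and function name here are hypothetical illustrations, not part of the assignment.

// Hypothetical treap node: ordered as a BST on key and as a heap on priority.
struct treap_node {
  string key;
  int priority;
  struct treap_node* left;
  struct treap_node* right;
};
typedef struct treap_node* tnode;

// Rotate N's left child C up into N's position (a "right rotation").
// The in-order sequence of keys is unchanged, so the BST ordering invariant
// is preserved, while C and its priority move one level closer to the root.
tnode rotate_right(tnode N)
//@requires N != NULL && N->left != NULL;
{
  tnode C = N->left;
  N->left = C->right;   // C's right subtree lies between C's key and N's key
  C->right = N;
  return C;             // C is the new root of this subtree
}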
The heap ordering invariant for the tree below can be restored with two tree rotations.
Draw the tree that results from each rotation. You should be drawing two trees.
Solution:
3. Priority Queues.
In a priority queue, each element has a priority value which is represented as an
integer. As in the previous question, the lower the integer, the higher the priority.
When we call pq_delmin, we remove the element with the highest priority.
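For concreteness, a priority queue of integers following this convention might expose a client
interface like the C0 sketch below; the type name and signatures are illustrative assumptions,
not the interface from lecture or the library, and here an element's value serves as its own
priority.

// Illustrative interface; lower int means higher priority. Names are hypothetical.
typedef struct pq_header* pq;

pq   pq_new(int capacity);    // a new, empty priority queue; requires capacity > 0
bool pq_empty(pq P);          // true iff P holds no elements
void pq_insert(pq P, int x);  // add element x to P
int  pq_delmin(pq P);         // remove and return the highest-priority (numerically
                              // smallest) element; requires !pq_empty(P)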
(a) Consider the following ways that we can implement a priority queue. Using big-O
notation, what is the worst-case runtime complexity for each implementation to
perform pq_insert and pq_delmin on a priority queue with n elements?
(1) i. Using an unsorted array.
Solution:
pq_insert:
pq_delmin:
(1) ii. Using a sorted array, where the elements are stored from lowest to highest
priority.
Solution:
pq_insert:
pq_delmin:
(1) iii. Using a heap, as discussed in class.
Solution:
pq_insert:
pq_delmin:
(1) (b) Which implementation in (a) is preferable if the numbers of pq_insert and pq_delmin
operations are relatively balanced? Explain in one sentence.
Solution:
(1) (c) Under what specific condition does a priority queue behave like a FIFO queue if
it is implemented using a heap? (Warning: if you use words like “higher,” “lower,”
“increasing,” or “decreasing” in your answer, be clear whether you are talking about
priority or integer value.)
Solution:
(1) (d) Under what specific condition does a priority queue behave like a LIFO stack if it
is implemented using a heap?
Solution:
4. AVL Trees.
(4) (a) Draw the AVL trees that result after successively inserting the following keys into
an initially empty tree, in the order shown:
Show the tree after each insertion and subsequent re-balancing (if any): the tree
after the first element, 98, is inserted into an empty tree, then the tree after 88 is
inserted into the first tree, and so on for a total of seven trees. Make it clear what
order the trees are in.
Be sure to maintain and restore the BST invariants and the additional balance
invariant required for an AVL tree after each insert.
Solution:
h    n
0    0
1    1
2    2
3    ____
4    ____
5    ____
(2) ii. Recall that the xth Fibonacci number F(x) is defined by:
F(0) = 0
F(1) = 1
F(x) = F(x-1) + F(x-2), for x > 1
Using the table in part (i), give an expression for T(h), where T(h) = n. You
may find it useful to use F(n) in your answer, but your answer does not need
to be a closed-form expression.
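(For reference, the first few values under this definition are F(2) = 1, F(3) = 2, F(4) = 3,
F(5) = 5, and F(6) = 8.)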
Solution:
5. Pass by reference
bool my_int_parser(char *s, int *i); // Returns true iff parse succeeds
Using the address-of operator, rewrite the body of the parseit function so that it
does not heap-allocate, free, or leak any memory on the heap. You may assume
my_int_parser has been implemented (its prototype is given above).
Solution:
void parseit(char *s) {
  REQUIRES(s != NULL);
  return;
}
(3) (b) In both C and C0, multiple values can be ‘returned’ by bundling them in a struct:
struct bundle { int x; int y; };

struct bundle *foo(int x) {
  ...
  struct bundle *B = xmalloc(sizeof(struct bundle));
  B->x = e1;
  B->y = e2;
  return B;
}

int main() {
  ...
  struct bundle *B = foo(e);
  int x = B->x;
  int y = B->y;
  free(B);
  ...
}
Rewrite the declaration and the last few lines of the function foo, as well as the
snippet of main, to avoid heap-allocating, freeing, or leaking any memory on the
heap. The rest of the code (...) should continue to behave exactly as it did before.
Solution:
_______________ foo(___________________________________________) {
...
________________________________________________________________
________________________________________________________________
________________________________________________________________
}
int main() {
...
________________________________________________________________
________________________________________________________________
________________________________________________________________
________________________________________________________________
...
}