
Lecture Notes on

Contracts
15-122: Principles of Imperative Computation
Frank Pfenning

Lecture 2
January 17, 2013

1 Introduction
In these notes we review contracts, which we use to collectively denote
function contracts, loop invariants, and other assertions about the program.
Contracts will play a central role in this class, since they represent the key
to connect algorithmic ideas to imperative programs. We follow the exam-
ple from lecture, developing annotations to a given program that express
the contracts, thereby making the program understandable (and allowing
us to find the bug).
In terms of our learning goals, this lecture addresses:

Computational Thinking: Specification versus implementation; correctness of programs

Algorithms and Data Structures: Efficient and inefficient implementation

Programming: Contracts

If you have not seen this example, we invite you to read this section by
section to see how much of the story you can figure out on your own before
moving on to the next section.


2 A Mysterious Program
You are a new employee in a company, and a colleague comes to you with
the following program, written by your predecessor who was summarily
fired for being a poor programmer. Your colleague claims he has tracked a
bug in a larger project to this function. It is your job to find and correct this
bug.

int f (int x, int y) {


int r = 1;
while (y > 1) {
if (y % 2 == 1) {
r = x * r;
}
x = x * x;
y = y / 2;
}
return r * x;
}

Before you read on, you might examine this program for a while to try
to determine what it does, or is supposed to do, and see if you can spot the
problem.


3 Forming a Conjecture
The first step is to execute the program on some input values to see its
results. The code is in a file called mystery2.c0, so we invoke the coin
interpreter to let us experiment with the code.

% coin mystery2.c0
C0 interpreter (coin) 0.3.2 ’Nickel’ (r256, Thu Jan 3 14:18:03 EST 2013)
Type ‘#help’ for help or ‘#quit’ to exit.
-->

At this point we can type in statements and they will be executed. One
form of statement is an expression, in which case coin will show its value.
For example:

--> 3+8;
11 (int)
-->

We can also use the function in the files that we loaded when we started
coin. In this case, the mystery function is called f, so we can evaluate it on
some arguments.

--> f(2,3);
8 (int)
--> f(2,4);
16 (int)
--> f(1,7);
1 (int)
--> f(3,2);
9 (int)
-->

Can you form a conjecture from these values?


From these and similar examples, you might form the conjecture that
f(x, y) = x^y, that is, x to the power y. One can confirm that with a few more
values, such as

--> f(-2,3);
-8 (int)
--> f(2,8);
256 (int)
--> f(2,10);
1024 (int)
-->

It seems to work out! Our next task is to see why this function actually
computes the power function. Understanding this is necessary so we can
try to find the error and correct it.


4 Finding a Loop Invariant


Now we start to look inside the function and see how it computes.
int f (int x, int y) {
int r = 1;
while (y > 1) {
if (y % 2 == 1) {
r = x * r;
}
x = x * x;
y = y / 2;
}
return r * x;
}
We notice the conditional
if (y % 2 == 1) {
r = x * r;
}
The condition tests if y modulo 2 is 1. For positive y, this is true if y is odd.
We also observe that in the loop body, y must indeed be positive so this is
a correct test for whether y is odd.
Each time around the loop we divide y by 2, using integer division
(which rounds towards 0). It is exact division if y is even. If y starts as
a power of 2, it will remain even throughout the iteration. In this case r
will remain 1 throughout the execution of the function. Let’s tabulate how
the loop works for x = 2 and y = 8. But at which point in the program do
we tabulate the values? It turns out that, for a loop, generally the best place is just
before the exit condition is tested. By iteration 0 we mean when we enter the
loop the first time and test the condition, iteration 1 is after the loop body
has been traversed once and we are looking again at the exit condition, etc.
iteration    x    y    r
    0        2    8    1
    1        4    4    1
    2       16    2    1
    3      256    1    1

After 3 iterations, x = 256 and y = 1, so the loop condition y > 1 becomes
false and we exit the loop. We return r * x = 256.


To understand why this loop works we need to find a so-called loop in-
variant: a quantity that does not change throughout the loop. In this exam-
ple, when y is a power of 2 then r is a loop invariant. Can you see a loop
invariant involving just x and y?


Going back to our earlier conjecture, we are trying to show that this
function computes x^y. Interestingly, after every iteration of the loop, this
quantity is exactly the same! Before the first iteration it is 2^8 = 256. After
the first iteration it is 4^4 = 256. After the second iteration it is 16^2 = 256.
After the third iteration it is 256^1 = 256. Let's note it down in the table.

iteration    x    y    r    x^y
    0        2    8    1    256
    1        4    4    1    256
    2       16    2    1    256
    3      256    1    1    256

Still concentrating on this special case where y is a power of 2, let’s see


if we can use the invariant to show that the function is correct.


5 Proving the Loop Invariant


To show that the quantity x^y is a loop invariant, we have to prove that
if we execute the loop body once, x^y before will be equal to x^y after. We
cannot write this as x^y = x^y, because that is of course always true, speaking
mathematically. Mathematics does not understand the idea of assigning a
new value to a variable. The general convention we follow is to add a prime
(') to the name of a variable to denote its value after an iteration.
So assume we have x and y, and y is a power of 2. After one iteration
we have x' = x * x and y' = y/2. To show that x^y is a loop invariant, we have
to show that x^y = (x')^(y'). So let's calculate:

(x')^(y') = (x * x)^(y/2)    By definition of x' and y'
          = (x^2)^(y/2)      Since a * a = a^2
          = x^(2*(y/2))      Since (a^b)^c = a^(b*c)
          = x^y              Since 2 * (a/2) = a when a is even

Moreover, if y is a power of 2, then y' = y/2 is also a power of 2 (subtracting
1 from the exponent).
We have confirmed that x^y is a loop invariant if y is a power of 2. Does
this help us to ascertain that the function is correct when y is a power of
two?


6 Loop Invariant Implies Postcondition


The postcondition of a function is usually a statement about the result it re-
turns. Here, the postcondition is that f(x, y) = x^y. Let's recall the function:

int f (int x, int y) {


int r = 1;
while (y > 1) {
if (y % 2 == 1) {
r = x * r;
}
x = x * x;
y = y / 2;
}
return r * x;
}

If y is a power of 2, then the quantity x^y never changes in the loop (as we
have just shown). Also, in that case r never changes, remaining equal to 1.
When we exit the loop, y = 1 because y starts out as some (positive) power
of 2 and is divided by 2 every time around the loop. So then

r * x = 1 * x = x = x^1 = x^y

so we return the correct result, x^y!
By using two loop invariant expressions (r and x^y) we were able to
show that the function returns the correct answer if it does return an answer. Does the loop always terminate?


7 Termination
In this case it is easy to see that the loop always terminates, because if we
start with y = 2^n we go around the loop exactly n times before y = 2^(n-n) = 1
and we exit the loop. We used here that (2^k)/2 = 2^(k-1) for k >= 1.
Our next challenge then will be to extend this result to arbitrary y. Be-
fore we do this, now that we have some positive results, let’s try to see if
we find some counterexample since the function is supposed to have a bug
somewhere!
Please try to find a counterexample to the conjecture that f(x, y) = x^y
before you move on, taking the above information into account.


8 A Counterexample
We don’t have to look at powers of 2 — we already know the function
works correctly there. Some of the earlier examples were not powers of
two, and the function still worked:

--> f(2,3);
8 (int)
--> f(-2,3);
-8 (int)
--> f(2,1);
2 (int)
-->

What about 0, or negative exponents?

--> f(2,0);
2 (int)
--> f(2,-1);
2 (int)
-->

Looks like we have found at least two problems. 2^0 = 1, so the answer 2
is definitely incorrect. 2^(-1) = 1/2, so one might argue it should return 0. Or
one might argue that in the absence of fractions (we are working with integers),
a negative exponent does not make sense. In any case, f(2, -1) should
certainly not return 2.


9 Imposing a Precondition
Let's go back to a mathematical definition of the power function x^y on integers x and y. We define:

x^0     = 1
x^(y+1) = x * x^y   for y >= 0

In this form it remains undefined for negative exponents. In programming,


this is captured as a precondition: we require that the second argument to f
not be negative. Preconditions are written as //@requires and come before
the body of the function.

int f (int x, int y)


//@requires y >= 0;
{
int r = 1;
while (y > 1) {
if (y % 2 == 1) {
r = x * r;
}
x = x * x;
y = y / 2;
}
return r * x;
}

This is the first part of what we call the function contract. It expresses what
the function requires of any client that calls it, namely that the second argument is not negative. It is an error to call it with a negative argument; no
promises are made about what the function might return otherwise. It
might even abort the computation due to a contract violation.
But a contract usually has two sides. What does f promise? We know it
promises to compute the exponential function, so this should be formally
expressed.


10 Promising a Postcondition
The C0 language does not have a built-in power function. So we need to
write it explicitly ourselves. But wait! Isn’t that what f is supposed to do?
The idea in this and many other examples is to capture a specification in the
simplest possible form, even if it may not be computationally efficient, and
then promise in the postcondition to satisfy this simple specification. Here,
we can transcribe the mathematical definition into a recursive function.
int POW (int x, int y)
//@requires y >= 0;
{
if (y == 0)
return 1;
else
return x * POW(x, y-1);
}
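
Once a file containing POW is loaded, we can also test the specification function itself in coin (a sample interaction; output shown as coin would report it):

--> POW(2,8);
256 (int)
--> POW(3,2);
9 (int)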
In the rest of the lecture we often silently go back and forth between x^y
and POW(x, y). Now we incorporate POW into a formal postcondition for
the function. Postconditions have the form //@ensures e;, where e is a
boolean expression. They are also written before the function body, by con-
vention after the preconditions. Postconditions can use a special variable
\result to refer to the value returned by the function.
int f (int x, int y)
//@requires y >= 0;
//@ensures \result == POW(x,y);
{
int r = 1;
while (y > 1) {
if (y % 2 == 1) {
r = x * r;
}
x = x * x;
y = y / 2;
}
return r * x;
}
Note that as far as the function f is concerned, if we are considering calling
it, we do not need to look at its body at all. Just looking at the pre- and
postconditions (the @requires and @ensures clauses) tells us everything we
need to know. As long as we adhere to our contract and pass f a nonnegative y,
then f will adhere to its contract and return x^y.


11 Dynamically Checking Contracts


During the program development phase, we can instruct the C0 compiler
or interpreter to check adherence to contracts. This is done with the -d flag
on the command line, which stands for dynamic checking. Let’s see how the
implementation now reacts to correct and incorrect inputs, assuming we
have added POW as well as pre- and postconditions as shown above.
% coin solution2a.c0 -d
foo.c0:10.5-10.6:error:cannot assign to variable ’x’
used in @ensures annotation
x = x * x;
~
Unable to load files, exiting...
%
The error is that we are changing the value of x in the body of the loop,
while the postcondition refers to x. If it were allowed, it would violate the
principle that we need to look at the contract only when calling the func-
tion, because assignments to x change the meaning of the postcondition.
We want \result == POW(x,y) for the original x and y we passed as argu-
ments to f and not the values x and y might hold at the end of the function.
We therefore change the function body, creating auxiliary variables b
(for base) and e (for exponent) to replace x and y which we leave un-
changed.
int f (int x, int y)
//@requires y >= 0;
//@ensures \result == POW(x,y);
{
int r = 1;
int b = x; /* base */
int e = y; /* exponent */
while (e > 1) {
if (e % 2 == 1) {
r = b * r;
}
b = b * b;
e = e / 2;
}
return r * b;
}


Now invoking the interpreter with -d works correctly when we return


the right answer, but raises an exception if we give it arguments where we
know the function to be incorrect, or arguments that violate the precondi-
tion to the function.

% coin solution2b.c0 -d
C0 interpreter (coin) 0.3.2 ’Nickel’ (r256, Thu Jan 3 14:18:03 EST 2013)
Type ‘#help’ for help or ‘#quit’ to exit.
--> f(3,2);
9 (int)
--> f(3,-1);
foo.c0:12.4-12.20: @requires annotation failed

Last position: foo.c0:12.4-12.20


f from <stdio>:1.1-1.8
--> f(2,0);
foo.c0:13.4-13.32: @ensures annotation failed
Last position: foo.c0:13.4-13.32
f from <stdio>:1.1-1.7
-->

The fact that the @requires annotation fails in the second example call means
that our call is to blame, not f. The fact that the @ensures annotation fails
in the third example call means the function f does not satisfy its contract
and is therefore to blame.


12 Generalizing the Loop Invariant


Before fixing the bug with an exponent of 0, let’s figure out why the func-
tion apparently works when the exponent is odd. Our loop invariant so far
only works when y is a power of 2. It uses the basic law that b^(2*c) = (b^2)^c =
(b * b)^c in the case where the exponent e = 2*c is even.
What about the case where the exponent is odd? Then we are trying
to compute b^(2*c+1). With reasoning analogous to the above we obtain b^(2*c+1) =
b * b^(2*c) = b * (b * b)^c. This means there is an additional factor of b in the
answer. We see that we multiply r by b exactly in the case that e is odd!

int f (int x, int y)


//@requires y >= 0;
//@ensures \result == POW(x,y);
{
int r = 1;
int b = x; /* base */
int e = y; /* exponent */
while (e > 1) {
if (e % 2 == 1) {
r = b * r;
}
b = b * b;
e = e / 2;
}
return r * b;
}

What quantity remains invariant now, throughout the loop? Try to form a
conjecture for a more general loop invariant before reading on.


Let’s make a table again, this time to trace a call when the exponent is
not a power of two, say, while computing 2^7 by calling f(2, 7).

iteration    b    e    r    b^e
    0        2    7    1    128
    1        4    3    2     64
    2       16    1    8     16

As we can see, b^e is not invariant, but r * b^e = 128 is! The extra factor from
the equation on the previous page is absorbed into r.
We now express this proposed invariant formally in C0. This requires
the @loop_invariant annotation. It must come immediately before the
loop body, but it is checked just before the loop exit condition. We would
like to say that the expression r * POW(b,e) is invariant, but this is not
possible directly.
Loop invariants in C0 are boolean expressions which must be either true
or false. We can achieve this by stating that r * POW(b,e) == POW(x,y).
Observe that x and y do not change in the loop, so this guarantees that
r * POW(b,e) never changes either. But it says a little more, stating what
the invariant quantity is in terms of the original function parameters.

int f (int x, int y)


//@requires y >= 0;
//@ensures \result == POW(x,y);
{
int r = 1;
int b = x; /* base */
int e = y; /* exponent */
while (e > 1)
//@loop_invariant r * POW(b,e) == POW(x,y);
{
if (e % 2 == 1) {
r = b * r;
}
b = b * b;
e = e / 2;
}
return r * b;
}


13 Fixing the Function


The bug we have discovered so far was for y = 0. In that case, e = 0
so we never go through the loop. If we exit the loop and e = 1, then the
loop invariant implies the function postcondition. To see this, note that we
return r * b and r * b = r * b^1 = r * b^e = x^y, where the last equation is the
loop invariant. When y (and therefore e) is 0, however, this reasoning does
not apply because we exit the loop and e = 0, not 1.
Think about how you might fix the function and its annotations before
reading on.


We can fix it by carrying on with the while loop until e = 0. On the


last iteration e is 1, which is odd, so we set r' = b * r. This means we now
should return r' (the new r) after the one additional iteration of the loop,
and not r * b.

int f (int x, int y)


//@requires y >= 0;
//@ensures \result == POW(x,y);
{
int r = 1;
int b = x; /* base */
int e = y; /* exponent */
while (e > 0)
//@loop_invariant r * POW(b,e) == POW(x,y);
{
if (e % 2 == 1) r = b * r;
b = b * b;
e = e / 2;
}
return r;
}

Now when the exponent y = 0 we skip the loop body and return r = 1,
which is the right answer for x^0! Indeed:

% coin solution2d.c0 -d
Coin 0.2.3 "Penny" (r1478, Thu Jan 20 16:14:15 EST 2011)
Type ‘#help’ for help or ‘#quit’ to exit.
--> f(2,0);
1 (int)
-->


14 Strengthening the Loop Invariant Again


We would now like to show that the improved function is correct. That
requires two steps: one is that the loop invariant implies the postcondition;
another is that the proposed loop invariant is indeed a loop invariant. The
loop invariant, r * b^e = x^y, implies that the result r = x^y if we know that
e = 0 (since b^0 = 1).
But how do we know that e = 0 when we exit the loop? Actually,
we don't: the loop invariant is too weak to prove that. The negation of
the exit condition only tells us that e <= 0. However, if we add another
loop invariant, namely that e >= 0, then we know e = 0 when the loop is
exited and the postcondition follows. For clarity, we also add a (redundant)
assertion to this effect after the loop and before the return statement.

int f (int x, int y)


//@requires y >= 0;
//@ensures \result == POW(x,y);
{
int r = 1;
int b = x; /* base */
int e = y; /* exponent */
while (e > 0)
//@loop_invariant e >= 0;
//@loop_invariant r * POW(b,e) == POW(x,y);
{
if (e % 2 == 1) {
r = b * r;
}
b = b * b;
e = e / 2;
}
//@assert e == 0;
return r;
}

The @assert annotation can be used to verify an expression that should


be true. If it is not, our reasoning must have been faulty somewhere else.
@assert is a useful debugging tool and sometimes helps the reader under-
stand better what the code author intended.


15 Verifying the Loop Invariants


It seems like we have beaten this example to death: we have added pre- and
post-conditions, stated loop invariants, fixed the original bug and shown
that the loop invariants imply the postcondition. But we have not yet veri-
fied that the loop invariant actually holds! Ouch! Let’s do it.
We begin with the invariant e >= 0. We have to demonstrate two properties.

Init: The invariant holds initially. When we enter the loop, e = y and y >= 0
by the precondition of the function. Done.

Preservation: Assume the invariant holds just before the exit condition is
checked. We have to show that it is true again when we reach the exit
condition after one iteration of the loop.

Assumption: e >= 0.
To show: e' >= 0 where e' = e/2, with integer division. This clearly
holds.

Next, we look at the invariant r * POW(b, e) = POW(x, y).

Init: The invariant holds initially, because when entering the loop we have
r = 1, b = x and e = y.

Preservation: We show that the invariant is preserved on every iteration.
For this, we distinguish two cases: e is even and e is odd.

Assumption: r * POW(b, e) = POW(x, y).
To show: r' * POW(b', e') = POW(x, y), where r', b', and e' are the
values of r, b, and e after one iteration.

Case: e is even. Then r' = r, b' = b * b and e' = e/2 and we reason:

r' * POW(b', e') = r * POW(b * b, e/2)
                 = r * POW(b, 2 * (e/2))    Since (a^2)^c = a^(2*c)
                 = r * POW(b, e)            Since e is even
                 = POW(x, y)                By assumption

Case: e is odd. Then r' = b * r, b' = b * b and e' = (e - 1)/2 (because
e is odd, integer division rounds towards 0, and e >= 0) and we
reason:

r' * POW(b', e') = (b * r) * POW(b * b, (e - 1)/2)
                 = (b * r) * POW(b, 2 * ((e - 1)/2))   Since (a^2)^c = a^(2*c)
                 = (b * r) * POW(b, e - 1)             Since e - 1 is even
                 = r * POW(b, e)                       Since a * a^c = a^(c+1)
                 = POW(x, y)                           By assumption

This shows that both loop invariants hold on every iteration.


16 Termination
The previous argument for termination still holds. By loop invariant, we
know that e >= 0. When we enter the body of the loop, the condition must
be true so e > 0. Now we just use that e/2 < e for e > 0, so the value
of e is strictly decreasing and positive, which, as an integer, means it must
eventually become 0, upon which we exit the loop and return from the
function after one additional step.


17 A Surprise
Now, let’s try our function on some larger numbers, computing some pow-
ers of 2.

% coin -d solution2e.c0
Coin 0.2.3 "Penny" (r1478, Thu Jan 20 16:14:15 EST 2011)
Type ‘#help’ for help or ‘#quit’ to exit.
--> f(2,30);
1073741824 (int)
--> f(2,31);
-2147483648 (int)
--> f(2,32);
0 (int)
-->

2^30 looks plausible, but how could 2^31 be negative or 2^32 be zero? We
claimed we just proved it correct!
The reason is that the values of type int in C0 or C and many other
languages actually do not represent arbitrarily large integers, but have a
fixed-size representation. In mathematical terms, this means that we are
dealing with modular arithmetic. The fact that 2^32 = 0 provides a clue that
integers in C0 have 32 bits, and arithmetic operations implement arithmetic
modulo 2^32.
In this light, the results above are actually correct. We examine modular
arithmetic in detail in the next lecture.


18 Summary: Contracts, and Why They are Important


We have introduced contracts, using the example of a function that computes the power x^y.
Contracts are expressed in the form of annotations, starting with //@. These
annotations are checked when the program is executed if it is compiled or
interpreted with the -d flag. Otherwise, they are ignored.
The forms of contracts, and how they are checked, are:

@requires: A precondition to a function. This is checked just before the


function body executes.

@ensures: A postcondition for a function. This is checked just after the


function body has been executed. We use \result to refer to the value
returned by the function to impose a condition on it.

@loop_invariant: A loop invariant. This is checked every time just before
the loop exit condition is tested.

@assert: An assertion. This is like a statement and is checked every time


it is encountered during execution.

Contracts are important for two purposes.

Testing: Contracts represent a kind of generic test of a function. Rather


than stating specific inputs (like gcd(9,12) and testing the answer 3),
contracts talk about expected properties for arbitrary values. On the
other hand, contracts are only useful in this regard if we have a good
set of test cases, because contracts that are not executed cannot cause
execution to abort.

Reasoning: Contracts express important properties of programs so we can


prove them. Ultimately, this can mathematically verify program cor-
rectness. Since correctness is the most important concern about pro-
grams, this is a crucial aspect of program development. Different
forms of contracts have different roles, reviewed below.

The proof obligations for contracts are as follows:

@requires: At the call sites we have to prove that the precondition for the
function is satisfied for the given arguments. We can then assume it
when reasoning in the body of the function.


@ensures: At the return sites inside a function we have to prove that the
postcondition is satisfied for the given return value. We can then as-
sume it at the call site.

@loop_invariant: We have to show:

Init: The loop invariant is satisfied initially, when the loop is first
encountered.
Preservation: Assuming the loop invariant is satisfied at the begin-
ning of the loop (just before the exit test), we have to show it still
holds when the beginning of the loop is reached again, after one
iteration of the loop.

We are then allowed to assume that the loop invariant holds after the
loop exits, together with the exit condition.

@assert: We have to show that an assertion is satisfied when it is reached


during program execution. We can then assume it for subsequent
statements.

Contracts are crucial for reasoning since (a) they express what needs to
be proved in the first place (give the program’s specification), and (b) they
localize reasoning: from a big program to the conditions on the individual
functions, from the inside of a big function to each loop invariant or asser-
tion.


Exercises
Exercise 1 Rewrite first POW and then f so that it signals an error in case of an
overflow rather than silently working in modular arithmetic.



Lecture Notes on
Ints
15-122: Principles of Imperative Computation
Frank Pfenning

Lecture 3
January 22, 2013

1 Introduction
Two fundamental types in almost any programming language are booleans
and integers. Booleans are comparatively straightforward: they have two
possible values (true and false) and conditionals to test boolean values.
We will return to their properties in a later lecture.
Integers ..., -2, -1, 0, 1, 2, ... are considerably more complex, because
there are infinitely many of them. Because memory is finite, only a finite
subrange of them can be represented in computers. In this lecture we dis-
cuss how integers are represented, how we can deal with the limited range
in the representation, and how various operations are defined on these rep-
resentations.
In terms of our learning goals, this lecture addresses:

Computational Thinking: Resource limitations

Algorithms and Data Structures: Algorithms for binary addition; fixed-


size data structures for the representation of numbers

Programming: The type int

2 Binary Representation of Natural Numbers


For the moment, we only consider the natural numbers 0, 1, 2, . . . and we
do not yet consider the problems of limited range. Number notations have
a base b. To write down numbers in base b we need b distinct digits. Each
digit is multiplied by an increasing power of b, starting with b^0 at the right
end. For example, in base 10 we have the ten digits 0-9 and the string 9380
represents the number 9 * 10^3 + 3 * 10^2 + 8 * 10^1 + 0 * 10^0. We call numbers in
base 10 decimal numbers. Unless it is clear from context that we are talking
about a certain base, we use a subscript [b] to indicate a number in base b.
In computer systems, two bases are of particular importance. Binary
numbers use base 2, with digits 0 and 1, and hexadecimal numbers (explained
more below) use base 16, with digits 0–9 and A–F . Binary numbers are
so important because the basic digits, 0 and 1, can be modeled inside the
computer by two different voltages, usually “off” for 0 and “on” for 1. To
find the number represented by a sequence of binary digits we multiply
each digit by the appropriate power of 2 and add up the results. In general,
the value of a bit sequence is

b_(n-1) ... b_1 b_0 [2] = b_(n-1) * 2^(n-1) + ... + b_1 * 2^1 + b_0 * 2^0
                        = sum_{i=0}^{n-1} b_i * 2^i

For example, 10011[2] represents 1 * 2^4 + 0 * 2^3 + 0 * 2^2 + 1 * 2^1 + 1 * 2^0 =
16 + 2 + 1 = 19.
We can also calculate the value of a binary number in a nested way,
exploiting Horner’s rule for evaluating polynomials.

10011[2] = (((1 * 2 + 0) * 2 + 0) * 2 + 1) * 2 + 1 = 19

In general, if we have an n-bit number with bits b_(n-1) ... b_0, we can calculate

(...((b_(n-1) * 2 + b_(n-2)) * 2 + b_(n-3)) * 2 + ... + b_1) * 2 + b_0

For example, taking the binary number 10010110[2], we write the digits
from most significant to least significant, calculating the cumulative value
from left to right by writing it top to bottom.

         1  =   1
1 * 2 + 0  =   2
2 * 2 + 0  =   4
4 * 2 + 1  =   9
9 * 2 + 0  =  18
18 * 2 + 1 =  37
37 * 2 + 1 =  75
75 * 2 + 0 = 150


Reversing this process allows us to convert a number into binary form.


Here we start with the number and successively divide by two, calculating
the remainder. At the end, the least significant bit is at the top.
For example, converting 198 to binary form would proceed as follows:

198 = 99 * 2 + 0
 99 = 49 * 2 + 1
 49 = 24 * 2 + 1
 24 = 12 * 2 + 0
 12 =  6 * 2 + 0
  6 =  3 * 2 + 0
  3 =  1 * 2 + 1
  1 =  0 * 2 + 1

We read off the answer, from bottom to top, arriving at 11000110[2].
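
To make the conversion procedure concrete, here is a small C0 sketch (a hypothetical helper, not part of these notes; it assumes the <conio> library for printing). It emits the binary digits of a positive number from least significant to most significant, mirroring the remainder column of the table above read top to bottom:

#use <conio>

void print_binary_lsb_first(int x)
//@requires x > 0;
{
  while (x > 0) {
    printint(x % 2);  // the remainder is the next binary digit
    x = x / 2;        // integer division drops that digit
  }
  println("");
}

For 198 this prints 01100011; reading it right to left recovers 11000110[2].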

3 Modular Arithmetic
Within a computer, there is a natural size of words that can be processed
by single instructions. In early computers, the word size was typically 8
bits; now it is 32 or 64. In programming languages that are relatively close
to machine instructions like C or C0, this means that the native type int of
integers is limited to the size of machine words. In C0, we decided that the
values of type int occupy 32 bits.
This is very easy to deal with for small numbers, because the more sig-
nificant digits can simply be 0. According to the formula that yields their
number value, these bits do not contribute to the overall value. But we
have to decide how to deal with large numbers, when operations such as
addition or multiplication would yield numbers that are too big to fit into
a fixed number of bits. One possibility would be to raise overflow excep-
tions. This is somewhat expensive (since the overflow condition must be
explicitly detected), and has other negative consequences. For example,
(n + n) - n is no longer equal to n + (n - n) because the former can overflow
while the latter always yields n and does not overflow. Another possibility
is to carry out arithmetic operations modulo the number of representable
integers, which would be 2^32 in the case of C0. We say that the machine
implements modular arithmetic.
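
For example, with modular arithmetic the two expressions above agree even though the intermediate sum wraps around. An illustrative coin interaction (the values follow from C0's 32-bit modular int):

--> (2000000000 + 2000000000) - 2000000000;
2000000000 (int)
--> 2000000000 + (2000000000 - 2000000000);
2000000000 (int)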
In higher-level languages, one would be more inclined to think of the
type of int to be inhabited by integers of essentially unbounded size. This
means that a value of this type would consist of a whole vector of machine
words whose size may vary as computation proceeds. Basic operations
such as addition no longer map directly onto machine instructions, but are
implemented by small programs. Whether this overhead is acceptable depends on the application.
Returning to modular arithmetic, the idea is that any operation is car-
ried out modulo 2^p for size p. Even when the modulus is not a power of
two, many of the usual laws of arithmetic continue to hold, which makes it
possible to write programs confidently without having to worry, for exam-
ple, about whether to write x + (y + z) or (x + y) + z. We have the following
properties of the abstract algebraic class of rings which are shared between
ordinary integers and integers modulo a fixed number n.

Commutativity of addition         x + y = y + x
Associativity of addition         (x + y) + z = x + (y + z)
Additive unit                     x + 0 = x
Additive inverse                  x + (-x) = 0
Cancellation                      -(-x) = x
Commutativity of multiplication   x * y = y * x
Associativity of multiplication   (x * y) * z = x * (y * z)
Multiplicative unit               x * 1 = x
Distributivity                    x * (y + z) = x * y + x * z
Annihilation                      x * 0 = 0
Some of these laws, such as associativity and distributivity, do not hold
for so-called floating point numbers that approximate real numbers. This
significantly complicates the task of reasoning about programs with float-
ing point numbers which we have therefore omitted from C0.

4 An Algorithm for Binary Addition


In the examples, we use arithmetic modulo 2^4, with 4-bit numbers. Addition proceeds from right to left, adding binary digits modulo 2, and using
a carry if the result is 2 or greater. For example,

      1 0 1 1   = 11
  +   1 0 0 1   =  9
  (1) 0 1 0 0   = 20 = 4 (mod 16)

where carries propagate from the right. The final carry,
shown in parentheses, is ignored, yielding the answer of 4 which is correct
modulo 16.


This grade-school algorithm is quite easy to implement in software, but


it is not suitable for a hardware implementation because it is too sequential.
On 32 bit numbers the algorithm would go through 32 stages, for an oper-
ation which, ideally, we should be able to perform in one machine cycle.
Modern hardware accomplishes this by using an algorithm where more of
the work can be done in parallel.

5 Two’s Complement Representation


So far, we have concentrated on the representation of natural numbers
0, 1, 2, . . .. In practice, of course, we would like to program with nega-
tive numbers. How do we define negative numbers? We define nega-
tive numbers as additive inverses: -x is the number y such that x + y = 0.
A crucial observation is that in modular arithmetic, additive inverses already exist! For example, -1 = 15 (mod 16) because -1 + 16 = 15. And
1 + 15 = 16 = 0 (mod 16), so, indeed, 15 is the additive inverse of 1 modulo
16.
Similarly, -2 = 14 (mod 16), -3 = 13 (mod 16), etc. Writing out
the equivalence classes of numbers modulo 16 together with their binary
representation, we have

...  -16    0   16  ...   0000
...  -15    1   17  ...   0001
...  -14    2   18  ...   0010
...  -13    3   19  ...   0011
...  -12    4   20  ...   0100
...  -11    5   21  ...   0101
...  -10    6   22  ...   0110
...   -9    7   23  ...   0111
...   -8    8   24  ...   1000
...   -7    9   25  ...   1001
...   -6   10   26  ...   1010
...   -5   11   27  ...   1011
...   -4   12   28  ...   1100
...   -3   13   29  ...   1101
...   -2   14   30  ...   1110
...   -1   15   31  ...   1111

At this point we just have to decide which numbers we interpret as positive


and which as negative. We would like to have an equal number of positive and negative numbers, where we include 0 among the positive ones.
From these considerations we can see that 0, ..., 7 should be positive and
-8, ..., -1 should be negative and that the highest bit of the 4-bit binary
representation tells us if the number is positive or negative.
Just for verification, let's check that 7 + (-7) = 0 (mod 16):

      0 1 1 1   =  7
  +   1 0 0 1   = -7 (that is, 9 mod 16)
  (1) 0 0 0 0   =  0 (mod 16)

It is easy to see that we can obtain -x from x on the bit representation
by first complementing all the bits and then adding 1. In fact, the addition
of x with its bitwise complement (written ~x) always consists of all 1's,
because in each position we have a 0 and a 1, and no carries at all. Adding
one to the number 11...11 will always result in 00...00, with a final carry
of 1 that is ignored.
These considerations also show that, regardless of the number of bits,
-1 is always represented as a string of 1's.
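
We can check the complement-and-add-one rule in coin (a sample interaction):

--> ~7 + 1;
-7 (int)
--> ~0;
-1 (int)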
In 4-bit numbers, the maximal positive number is 7 and the minimal
negative number is -8, thus spanning a range of 16 = 2^4 numbers. In
general, in a representation with p bits, the positive numbers go from 0
to 2^(p-1) - 1 and the negative numbers from -2^(p-1) to -1. It is remarkable
that because of the origin of this representation in modular arithmetic, the
“usual” bit-level algorithms for addition and multiplication can ignore that
some numbers are interpreted as positive and others as negative and still
yield the correct answer modulo 2^p.
However, for comparisons, division, and modulus operations the sign
does matter. We discuss division below in Section 9. For comparisons, we
just have to properly take into account the highest bit because, say, -1 =
15 (mod 16), but -1 < 0 and 0 < 15.

6 Hexadecimal Notation
In C0, we use 32 bit integers. Writing these numbers out in decimal nota-
tion is certainly feasible, but sometimes awkward since the bit pattern of
the representation is not easy to discern. Binary notation is rather expan-
sive (using 32 bits for one number) and therefore difficult to work with.
A good compromise is found in hexadecimal notation, which is a represen-
tation in base 16 with the sixteen digits 0–9 and A–F . “Hexadecimal” is


often abbreviated as “hex”. In the concrete syntax of C0 and C, hexadeci-


mal numbers are preceded by 0x in order to distinguish them from decimal
numbers.
binary hex decimal
0000 0x0 0
0001 0x1 1
0010 0x2 2
0011 0x3 3
0100 0x4 4
0101 0x5 5
0110 0x6 6
0111 0x7 7
1000 0x8 8
1001 0x9 9
1010 0xA 10
1011 0xB 11
1100 0xC 12
1101 0xD 13
1110 0xE 14
1111 0xF 15
Hexadecimal notation is convenient because most common word sizes
(8 bits, 16 bits, 32 bits, and 64 bits) are multiples of 4. For example, a 32
bit number can be represented by eight hexadecimal digits. We can even do
a limited amount of arithmetic on them, once we get used to calculating
modulo 16. Mostly, though, we use hexadecimal notation when we use
bitwise operations rather than arithmetic operations.
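
In coin, hexadecimal literals are simply another notation for values of type int (a sample interaction):

--> 0xF;
15 (int)
--> 0x7FFFFFFF;
2147483647 (int)
--> 0xFFFFFFFF;
-1 (int)

The last two lines show the maximal positive int and the two's complement representation of -1 as a string of 1-bits.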

7 Useful Powers of 2
The drive to expand the native word size of machines by making circuits
smaller was influenced by two different considerations. For one, since the
bits of a machine word (like 32 or 64) are essentially treated in parallel in
the circuitry, operations on larger numbers are much more efficient. For
another, we can address more memory directly by using a machine word
as an address.
A useful way to relate this to common measurements of memory and
storage capacity is to use

2^10 = 1024 = 1K


Note that this use of “1K” in computer science is slightly different from
its use in other sciences where it would indicate one thousand (1, 000). If
we want to see how much memory we can address with a 16 bit word we
calculate
2^16 = 2^6 * 2^10 = 64K

so roughly 64K cells of memory, each usually holding a byte (which is 8 bits
wide). We also have

2^20 = 2^10 * 2^10 = 1,048,576 = 1M

(pronounced "1 Meg") which is roughly 1 million and

2^30 = 2^10 * 2^10 * 2^10 = 1,073,741,824 = 1G

(pronounced "1 Gig") which is roughly 1 billion.
In a more recent processor with a word size of 32 we can therefore address

2^32 = 2^2 * 2^10 * 2^10 * 2^10 = 4GB

of memory, where "GB" stands for Gigabyte. The next significant number
would be 1024GB, which would be 1TB (Terabyte).

8 Bitwise Operations on Ints


Ints are also used to represent other kinds of data. An example, explored in
the first programming assignment, is colors (see Section 11). The so-called
ARGB color model divides an int into four 8-bit quantities. The highest 8
bits represent the opaqueness of the color against its background, while the
lower 24 bits represent the intensity of the red, green and blue components
of a color. Manipulating this representation with addition and multiplica-
tion is quite unnatural; instead we usually use bitwise operations.
The bitwise operations are defined by their action on a single bit and
then applied in parallel to a whole word. The tables below define the mean-
ing of bitwise and &, bitwise exclusive or ^ and bitwise or |. We also have bitwise
negation ~ as a unary operation.
And           Exclusive Or      Or             Negation

 & | 0 1        ^ | 0 1          | | 0 1        x  | 0 1
 0 | 0 0        0 | 0 1          0 | 0 1        ~x | 1 0
 1 | 0 1        1 | 1 0          1 | 1 1
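
Applied in parallel to all 32 bits of a word, these definitions give, for example (a sample coin interaction; 0xF0 is 11110000 and 0x3C is 00111100 in binary):

--> 0xF0 & 0x3C;
48 (int)
--> 0xF0 ^ 0x3C;
204 (int)
--> 0xF0 | 0x3C;
252 (int)

In binary the results are 00110000 (0x30), 11001100 (0xCC), and 11111100 (0xFC).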


9 Integer Division and Modulus


The division and modulus operators on integers are somewhat special. As
a multiplicative inverse, division is not always defined, so we adopt a dif-
ferent definition. We write x/y for integer division of x by y and x%y for
integer modulus. The two operations must satisfy the property

(x/y) * y + (x%y) = x

so that x%y is like the remainder of division. The above is not yet sufficient
to define the two operations. In addition we say 0 <= |x%y| < |y|. Still, this
leaves open the possibility that the modulus is positive or negative when y
does not divide x. We fix this by stipulating that integer division truncates
its result towards zero. This means that the modulus must be negative if x
is negative and there is a remainder, and it must be positive if x is positive.
By contrast, the quotient operation always truncates down (towards -∞),
which means that the remainder is always positive. There are no primitive
operators for quotient and remainder, but they can be implemented with
the ones at hand.
Of course, the above constraints are impossible to satisfy when y = 0,
because 0 <= |x%0| < |0| is impossible. But division by zero is defined to
raise an error, and so is the modulus.
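
A few sample evaluations in coin illustrate truncation towards zero and the sign of the modulus:

--> 7 / 2;
3 (int)
--> -7 / 2;
-3 (int)
--> -7 % 2;
-1 (int)
--> 7 % -2;
1 (int)

In each case (x/y) * y + (x%y) == x holds, and the modulus takes the sign of x.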

10 Shifts
We also have some hybrid operators on ints, somewhere between bit-level
and arithmetic. These are the shift operators. We write x << k for the result
of shifting x by k bits to the left, and x >> k for the result of shifting x by k
bits to the right. In both cases, the value of k must be between 0 (inclusive)
and 32 (exclusive) – any other value is an arithmetic error like division by
zero. We assume below that k is in that range.
The left shift, x << k (for 0 <= k < 32), fills the result with zeroes on
the right, so that bits 0, ..., k-1 will be 0. Every left shift corresponds to a
multiplication by 2, so x << k returns x * 2^k (modulo 2^32). We illustrate this
with 8-bit numbers.


b7 b6 b5 b4 b3 b2 b1 b0

<<1

b6 b5 b4 b3 b2 b1 b0 0

b7 b6 b5 b4 b3 b2 b1 b0

<<2

b5 b4 b3 b2 b1 b0 0 0

The right shift, x >> k (for 0 <= k < 32), copies the highest bit while
shifting to the right, so that bits 31, ..., 32-k of the result will be equal to
the highest bit of x. If viewing x as an integer, this means that the sign of the
result is equal to the sign of x, and shifting x right by k bits corresponds to
integer division by 2^k except that it truncates towards -∞. For example,
-1 >> 1 == -1.

b7 b6 b5 b4 b3 b2 b1 b0

>>1

b7 b7 b6 b5 b4 b3 b2 b1

b7 b6 b5 b4 b3 b2 b1 b0

>>2

b7 b7 b7 b6 b5 b4 b3 b2
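
A few sample evaluations in coin (the right shifts on negative numbers copy the sign bit):

--> 5 << 2;
20 (int)
--> 20 >> 2;
5 (int)
--> -8 >> 1;
-4 (int)
--> -1 >> 1;
-1 (int)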


11 Representing Colors
As a small example of using the bitwise interpretation of ints, we consider
colors. Colors are decomposed into their primary components red, green,
and blue; the intensity of each uses 8 bits and therefore varies between
0 and 255 (or 0x00 and 0xFF). We also have the so-called alpha channel which
indicates how opaque the color is when superimposed over its background.
Here, 0xFF indicates completely opaque, and 0x00 completely transparent.
For example, to extract the intensity of the red color in a given pixel p,
we could compute (p >> 16) & 0xFF. The shift moves the red color
value into bits 0-7; the bitwise and then masks out all the other bits by setting
them to 0. The result will always be in the desired range, from 0–255.
Conversely, if we want to set the intensity of green of the pixel p to
the value of g (assuming we already have 0 <= g <= 255), we can compute
(p & 0xFFFF00FF) | (g << 8). This works by first setting the green in-
tensity to 0, while keeping everything else the same, and then combining it
with the value of g, shifted to the right position in the word.
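
As a sketch, the two operations could be packaged as C0 functions (hypothetical helper names, not part of the assignment's starter code):

int get_red(int p) {
  return (p >> 16) & 0xFF;   // move bits 16-23 down to 0-7 and mask the rest
}

int set_green(int p, int g)
//@requires 0 <= g && g <= 255;
{
  return (p & 0xFFFF00FF) | (g << 8);   // clear bits 8-15, then install g there
}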
For more on color values and some examples, see Assignment 1.


Exercises
Exercise 1 Write functions quot and rem that calculate quotient and remainder
as explained in Section 9. Your functions should have the property that

quot(x,y)*y + rem(x,y) == x;

for all ints x and y unless quot overflows. How is that possible?

Exercise 2 Write a function int2hex that returns a string containing the hex-
adecimal representation of a given integer as a string. Your function should have
prototype

string int2hex(int x);

Exercise 3 Write a function lsr (logical shift right), which is like right shift (>>)
except that it fills the most significant bits with zeroes instead of copying the sign
bit. Explain what lsr(x,1) means on integers in two’s complement representa-
tion.



Lecture Notes on
Arrays

15-122: Principles of Imperative Computation


Frank Pfenning, André Platzer

Lecture 4
January 24, 2012

1 Introduction
So far we have seen how to process primitive data like integers in impera-
tive programs. That is useful, but certainly not sufficient to handle bigger
amounts of data. In many cases we need aggregate data structures which
contain other data. A common data structure, in particular in imperative
programming languages, is that of an array. An array can be used to store
and process a fixed number of data elements that all have the same type.
We will also take a first detailed look at the issue of program safety.
A program is safe if it will execute without exceptional conditions which
would cause its execution to abort. So far, only division and modulus are
potentially unsafe operations, since division or modulus by 0 is defined
as a runtime error.[1] Trying to access an array element for which no space
has been allocated is a second form of runtime error. Array accesses are
therefore potentially unsafe operations and must be proved safe.
With respect to our learning goals we will look at the following notions.
Computational Thinking: Safety
Algorithms and Data Structures: Fixed-size arrays
Programming: The type t[]; for-loops
In lecture, we only discussed a smaller example of programming with
arrays, so some of the material here is a slightly more complex illustration
of how to use for loops and loop invariants when working with arrays.
[1] As is division or modulus of the minimal integer by -1.


2 Using Arrays
When t is a type, then t[] is the type of an array with elements of type t.
Note that t is arbitrary: we can have an array of integers (int[]), and an
array of booleans (bool[]) or an array of arrays of characters (char[][]).
This syntax for the type of arrays is like Java, but is a minor departure from
C, as we will see later in class.
Each array has a fixed size, and it must be explicitly allocated using the
expression alloc_array(t, n). Here t is the type of the array elements,
and n is their number. With this operation, C0 will reserve a piece of mem-
ory with n elements, each having type t. Let’s try in coin:

% coin
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> int[] A = alloc_array(int, 10);
A is 0xECE2FFF0 (int[] with 10 elements)
-->

The result may be surprising: A is an array of integers with 10 elements


(obvious), but what does it mean to say A is 0xECE2FFF0 here? As we dis-
cussed in the lecture on integers, variables can only hold values of a small
fixed size, the word size of the machine. An array of 10 integers would be
10 times this size, so we cannot hold it directly in the variable A. Instead,
the variable A holds the address in memory where the actual array elements
are stored. In this case, the address happens to be 0xECE2FFF0 (incidentally
presented in hexadecimal notation), but there is no guarantee that the next
time you run coin you will get the same address. Fortunately, this is okay
because you cannot actually ever do anything directly with this address as
a number and never need to either. Instead you access the array elements
using the syntax A[i] where 0 <= i < n, where n is the length of the array.
That is, A[0] will give you element 0 of the array, A[1] will be element 1,
and so on. We say that arrays are zero-based because elements are numbered
starting at 0. For example:

--> A[0];
0 (int)
--> A[1];
0 (int)
--> A[2];
0 (int)
--> A[10];
Error: accessing element 10 in 10-element array
Last position: <stdio>:1.1-1.5
--> A[-1];
Error: accessing element -1 in 10-element array
Last position: <stdio>:1.1-1.5
-->

We notice that after allocating the array, all elements appear to be 0. This
is guaranteed by the implementation, which initializes all array elements
to a default value which depends on the type. The default value of type
int is 0. Generally speaking, one should try to avoid exploiting implicit
initialization because for a reader of the program it may not be clear if the
initial values are important or not.
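
For instance, elements of a freshly allocated bool array default to false (a sample interaction; the printed address will vary from run to run):

--> bool[] B = alloc_array(bool, 2);
B is 0xECE2FF80 (bool[] with 2 elements)
--> B[0];
false (bool)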
We also observe that trying to access an array element not in the spec-
ified range of the array will lead to an error. In this example, the valid
accesses are A[0], A[1], . . ., A[9] (which comes to 10 elements); everything
else is illegal. And every other attempt to access the contents of the array
would not make much sense, because the array has been allocated to hold
10 elements. How could we ever meaningfully ask what its element number 20 is if it has only 10? Nor would it make sense to ask for A[-4]. In both
cases, coin and cc0 will give you an error message telling you that you
have accessed the array outside the bounds. While an error is guaranteed
in C0, in C no such guarantee is made. Accessing an array element that has
not been allocated leads to undefined behavior and, in principle, anything
could happen. This is highly problematic because implementations typi-
cally choose to just read from or write to the memory location where some
element would be if it had been allocated. Since it has not been, some other
unpredictable memory location may be altered, which permits infamous
buffer overflow attacks which may compromise your machines.
How do we change an element of an array? We can use it on the left-
hand side of an assignment. We can set A[i] = e; as long as e is an expres-
sion of the right type for an array element. For example:

--> A[0] = 5; A[1] = 10; A[2] = 20;


A[0] is 5 (int)
A[1] is 10 (int)
A[2] is 20 (int)
-->


After these assignments, the contents of memory might be displayed as


follows, where A = 0xECE2FFF0:

address:  0xECE2FFF0 ...F4 ...F8 ...FC 0xECE30000 ...04 ...08 ...0C ...10 ...14
value:         5       10    20     0       0       0     0     0     0     0
element:     A[0]     A[1]  A[2]  A[3]    A[4]    A[5]  A[6]  A[7]  A[8]  A[9]

Recall that an assignment (like A[0] = 5;) is a statement and as such


has an effect, but no value. coin will print back the effect of the assign-
ment. Here we have given three statements together, so all three effects are
shown. Again, exceeding the array bounds will result in an error message
and the program aborts, because it does not make sense to store data in an
array at a position that is outside the size of that array.

--> A[10] = 100;


Error: accessing element 10 in 10-element array
Last position: <stdio>:1.1-1.11
-->

3 Using For-Loops to Traverse Arrays


A common pattern of access and traversal of arrays is for-loops, where an
index i is counted up from 0 to the length of the array. To continue the
example above, we can assign i^3 to the ith element of the array as follows:

--> for (int i = 0; i < 10; i++)


... A[i] = i * i * i;
--> A[6];
216 (int)
-->

Characteristically, the exit condition of the loop tests i < n where i is the
array index and n is the length of the array (here 10).
After we type in the first line (the header of the for-loop), coin responds
with the prompt ... instead of -->. This indicates that the expression or
statement it has parsed so far is incomplete. We complete it by supplying
the body of the loop, the assignment A[i] = i * i * i;. Note that no
assignment effect is printed. This is because the assignment is part of a


loop. In general, coin will only print effects of top-level statements such
as assignments, because when a complicated program is executed, a huge
number of effects could be taking place.

4 Specifications for Arrays


When we use loops to traverse arrays, we need to make sure that all the
array accesses are in bounds. In many cases this is evident, but it can be
tricky in particular if we have two-dimensional data (for example, images).
As an aid to this reasoning, we state an explicit loop invariant which ex-
presses what will be true on every iteration of the loop.
To illustrate arrays, we develop a function that computes an array of the
first n Fibonacci numbers, starting to count from 0. It uses the recurrence:

f_0 = 0
f_1 = 1
f_(n+2) = f_(n+1) + f_n   for n >= 0

When we represent f_n in an array as A[n], we can write the recurrence


directly as a loop operating on the array:

int[] fib(int n) {
int[] F = alloc_array(int, n);
F[0] = 0;
F[1] = 1;
for (int i = 0; i < n; i++)
F[i+2] = F[i+1] + F[i];
return F;
}

This looks straightforward. Is there a problem with the code or will it run
correctly? In order to understand whether this function works correctly, we
systematically develop a specification for it. Before you read on, can you
spot a bug in the code? Or can you find a reason why it will work correctly?


Allocating an array will also fail if we ask for a negative number of ele-
ments. Since the number of elements we ask for in alloc_array(int, n)
is n, and n is a parameter passed to the function, we need to add n >= 0 into
the precondition of the function. In return, the function can safely promise
to return an array that has exactly the size n. This is a property that the code
using, e.g., fib(10) has to rely on. Unless the fib function promises to re-
turn an array of a specific size, the user has no way of knowing how many
elements in the array can be accessed safely without exceeding its bounds.
Without such a corresponding postcondition, code calling fib(10) could
not even safely access position 0 of the array that fib(10) returns.
For referring to the length of an array, C0 contracts have a special func-
tion \length(A) that stands for the number of elements in the array A. Just
like the \result variable, the function \length is part of the contract lan-
guage and cannot be used in C0 program code. Its purpose is to be used in
contracts to specify the requirements and behavior of a program. For the
Fibonacci function, we want to specify the postcondition that the length of
the array that the function returns is n.

int[] fib(int n)
//@requires n >= 0;
//@ensures \length(\result) == n;
{
int[] F = alloc_array(int, n);
F[0] = 0;
F[1] = 1;
for (int i = 0; i < n; i++) {
F[i+2] = F[i+1] + F[i];
}
return F;
}


5 Loop Invariants for Arrays


By writing specifications, we should convince ourselves that all array ac-
cesses will be within the bounds. In the loop, we access F [i], which would
raise an error if i were negative, because that would violate the lower
bounds of the array. So we need to specify a loop invariant that ensures
i ≥ 0.

int[] fib(int n)
//@requires n >= 0;
//@ensures \length(\result) == n;
{
int[] F = alloc_array(int, n);
F[0] = 0;
F[1] = 1;
for (int i = 0; i < n; i++)
//@loop_invariant i >= 0;
{
F[i+2] = F[i+1] + F[i];
}
return F;
}

Clearly, if i ≥ 0 then the other array accesses F[i+1] and F[i+2] also will
not violate the lower bounds of the array, because i + 1 ≥ 0 and i + 2 ≥ 0.
Will the program work correctly now?


The big issue with the code is that, even though the code ensures that
no array access exceeds the lower bound 0 of the array F, we do not know
whether the upper bound of the array, i.e., \length(F), which equals n,
is always respected. For each array access, we need to ensure that it is
within the bounds. In particular, we need the condition i < n for the array
access F[i], the condition i + 1 < n for F[i+1], and the condition i + 2 < n
for F[i+2]. But the last condition does not work out, because the loop
body also runs when i = n − 1, at which point i + 2 = (n − 1) + 2 = n + 1 < n
does not hold, because we have allocated array F to have size n.
We can also easily observe this bug by running the function in coin.

% coin fibc.c0 -d
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> fib(5);
Error: accessing element 5 in 5-element array
Last position: fibc.c0:11.7-11.30
fib from <stdio>:1.1-1.7

Consequently, we need to stop the loop earlier and can only continue as
long as i + 2 < n. Since the loop condition in a for loop can be any boolean
expression, we could trivially ensure this by changing the loop as follows:

int[] fib(int n)
//@requires n >= 0;
//@ensures \length(\result) == n;
{
int[] F = alloc_array(int, n);
F[0] = 0;
F[1] = 1;
for (int i = 0; i+2 < n; i++)
//@loop_invariant i >= 0;
{
F[i+2] = F[i+1] + F[i];
}
return F;
}


Since it can be more convenient to see the exact bounds of a for loop, we
can replace the loop condition i + 2 < n by i < n − 2, since both are equiv-
alent. It does not make much difference which one we use, but the latter
can make it easier to see how many iterations the loop takes to complete.

int[] fib(int n)
//@requires n >= 0;
//@ensures \length(\result) == n;
{
int[] F = alloc_array(int, n);
F[0] = 0;
F[1] = 1;
for (int i = 0; i < n-2; i++)
//@loop_invariant i >= 0;
{
F[i+2] = F[i+1] + F[i];
}
return F;
}

This program looks good and behaves well on a number of tests.
Is it correct? Before you read on, try to find an answer yourself.


When we verify the previous program, we suddenly realize that there
are two array accesses for which we have not yet convinced ourselves that
they will be within bounds: the two array accesses F[0] and F[1] be-
fore the loop. And in fact, they may fail when we run coin.
We can easily exhibit this bug with coin on either fibe.c0 or fibd.c0:

% coin fibe.c0 -d
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> fib(5);
0xFF4FF780 (int[] with 5 elements)
--> fib(2);
0xFF4FF760 (int[] with 2 elements)
--> fib(1);
Error: accessing element 1 in 1-element array
Last position: fibe.c0:7.3-7.12
fib from <stdio>:1.1-1.7
--> fib(0);
Error: accessing element 0 in 0-element array
Last position: fibe.c0:6.3-6.12
fib from <stdio>:1.1-1.7
-->

To solve this issue we guard the two assignments with tests, so that each
only runs if the array is big enough to contain that entry (see fibf.c0).

int[] fib(int n)
//@requires n >= 0;
//@ensures \length(\result) == n;
{
int[] F = alloc_array(int, n);
if (n > 0) F[0] = 0; /* line 0 */
if (n > 1) F[1] = 1; /* line 1 */
for (int i = 0; i < n-2; i++)
//@loop_invariant i >= 0;
{
F[i+2] = F[i+1] + F[i]; /* line 2 */
}
return F;
}
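As an aside, one way to avoid reasoning about the index expression i + 2
altogether is to let the loop index range over the positions being written.
The following variant is a sketch of our own under the same contract (the
name fib2 is hypothetical, not part of the lecture code):

int[] fib2(int n)
//@requires n >= 0;
//@ensures \length(\result) == n;
{
  int[] F = alloc_array(int, n);
  if (n > 0) F[0] = 0;
  if (n > 1) F[1] = 1;
  for (int i = 2; i < n; i++)
  //@loop_invariant 2 <= i;
  {
    F[i] = F[i-1] + F[i-2];  // safe: 0 <= i-2 by the invariant and i < n
  }
  return F;
}

Here every access is easily seen to be in bounds: the loop condition gives
i < n, and the loop invariant gives i − 2 ≥ 0.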


6 Proving Correctness: Loop Invariants


The loop invariant states a property that must be true just before the exit
condition is tested. Variable i is initialized to 0 when the for
loop begins. Clearly, i is incremented each time around the loop (with
the step statement i++, which is the same as i = i+1), so i will always be
greater than or equal to 0. Let us prove this precisely.

Init: When we enter the loop for the first time, the for loop initialization
assigns i = 0, so i ≥ 0.

Preservation: Assume that i ≥ 0 (the loop invariant) when we enter the
loop; we have to show it still holds after we traverse the loop body
once. We obtain the next value of i by executing i = i+1, so
the new value of i, written i′, will only be bigger, so it must still be
greater than or equal to 0: i′ = i + 1 ≥ 0.
A subtle point: we are in two's complement arithmetic, but i + 1 can-
not overflow since i is bounded from above by n − 2.

7 Proving Correctness: Array Bounds


Now we have verified the loop invariant but still need to verify that all
array accesses are guaranteed to be in bounds, otherwise the program still
would not run correctly.

1. In line 0 we assign to F[0]. If the length of the array F (which is n)
were 0, this would be out of bounds. But we check that n > 0 in the
if statement so that the assignment only takes place if there is at least
one element in the array, labeled F[0].

2. Similarly, in line 1 we access F[1], but this is okay because we only
access it if n > 1.

3. In line 2, we access F[i+2], F[i+1] and F[i]. By the loop invariant
we know that i + 2, i + 1, and i are all greater than or equal to 0,
because i ≥ 0. Since we only enter the loop body if the loop condition
i < n − 2 holds, and n is the length of the array, we also know that i + 2
is less than the length of the array (and so are i + 1 and i because they
are only smaller). So all three accesses must always be in bounds.


In the last case, we do not reason about how the loop operates but rely
solely on the loop invariant instead. This is crucial, since the loop invariant
is supposed to contain all the relevant information about the effect
of the loop. In particular, our reasoning about the array accesses does not
depend on understanding what exactly the loop does after, say, 5 iterations,
or where i started and how it evolved since. All that matters is whether we
can conclude from the loop invariant i ≥ 0 and the loop condition i < n − 2
that the array accesses are okay. In this way, loop invariants entirely
localize our reasoning to one general scenario to consider
for the loop body. This is how loop invariants can greatly contribute to
understanding programs and ensuring we have implemented them correctly.
The same effect occurs elsewhere: whenever we can draw a conclusion
directly from the loop invariant, our understanding of a loop's behavior
reduces to a single, local question.
Needless to say, before we use a loop invariant in our reasoning about the
behavior of the code, we should convince ourselves that the loop invariant
is correct by a proof.

8 Aliasing
We have seen assignments to array elements, such as A[0] = 0;. But we
have also seen assignments to array variables themselves, such as

int[] A = alloc_array(int, n);

What do they mean? To explore this, we separate the declarations of the
array variables (here: F and G) from the assignments to them.

% coin -d fibf.c0
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> int[] F;
--> int[] G;
--> F = fib(15);
F is 0xF6969A80 (int[] with 15 elements)
--> G[2];
Error: uninitialized value used
Last position: <stdio>:1.1-1.5
--> G = F;
G is 0xF6969A80 (int[] with 15 elements)


--> G = fib(10);
G is 0xF6969A30 (int[] with 10 elements)
-->

The first assignment to F is as expected: it is the address of an array
of Fibonacci numbers with 15 elements. The use of G in G[2], of course,
cannot succeed, because we have only declared G to have a type of integer
arrays, but did not assign any array to G.
Afterwards, however, when we assign G = F, then G and F (as vari-
ables) hold the same address! Holding the same address means that F and G
are aliased. When we make the second assignment to G (changing its value)
we get a new array, which is in fact smaller and definitely no longer aliased
to F (note the different address). Aliasing (or the lack thereof) is crucial, be-
cause modifying one of two aliased arrays will also change the other. For
example:

% coin
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> int[] A = alloc_array(int, 5);
A is 0xE8176FF0 (int[] with 5 elements)
--> int[] B = A;
B is 0xE8176FF0 (int[] with 5 elements)
--> A[0] = 42;
A[0] is 42 (int)
--> B[0];
42 (int)
-->
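The same effect shows up across function boundaries, since arrays are
passed by address. The following small sketch (the function set_first is
our own illustration, not part of the lecture code) writes through one alias,
and the change is visible through the other:

void set_first(int[] X)
//@requires \length(X) > 0;
{
  X[0] = 99;  // writes to the array X points to, whatever its other names
}

After int[] B = A; a call set_first(B); makes reading A[0] also yield 99,
because A and B hold the same address.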

C0 has no built-in way to copy from one array to another (ultimately
we will see that there are multiple meaningful ways to copy arrays
of more complicated types). Here is a simple function to copy arrays of
integers.

/* file copy.c0 */
int[] array_copy(int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@ensures \length(\result) == n;
{
int[] B = alloc_array(int, n);
for (int i = 0; i < n; i++)
//@loop_invariant 0 <= i;
B[i] = A[i];
return B;
}

For example, we can create B as a copy of A, and then assigning to the copy
B will not affect A. We invoke coin with the -d flag to make sure
that if a pre- or postcondition or loop invariant is violated we get an error
message.

% coin copy.c0 -d
Coin 0.2.9 ’Penny’(r10, Fri Jan 6 22:08:54 EST 2012)
Type ‘#help’ for help or ‘#quit’ to exit.
--> int[] A = alloc_array(int, 10);
A is 0xF3B8DFF0 (int[] with 10 elements)
--> for (int i = 0; i < 10; i++) A[i] = i*i;
--> int[] B = array_copy(A, 10);
B is 0xF3B8DFB0 (int[] with 10 elements)
--> B[9];
81 (int)
--> A[9] = 17;
A[9] is 17 (int)
--> B[9];
81 (int)
-->


Exercises
Exercise 1 Write a function array_part that creates a copy of a part of a given
array, namely the elements from position i to position j. Your function should have
prototype

int[] array_part(int[] A, int i, int j);

Develop a specification and loop invariants for this function. Prove that it works
correctly by checking the loop invariant and proving array bounds.

Exercise 2 Write a function copy_into that copies a part of a given array source,
namely n elements starting at position i, into another given array target, starting
at position j. Your function should have prototype

int copy_into(int[] source, int i, int n, int[] target, int j);

As an extra service, make your function return the last position in the target ar-
ray that it entered data into. Develop a specification and loop invariants for this
function. Prove that it works correctly by checking the loop invariant and proving
array bounds. What is difficult about this case?

Exercise 3 Write a function can_copy_into that returns an integer indicating
how many elements, starting from position i, of an array source of a given length
n can be copied safely into a given array target, starting at position j. Your
function should have prototype

int can_copy_into(int[] source, int i, int[] target, int j, int n);

Develop a specification and loop invariants for this function. Prove that it works
correctly by checking the loop invariant and proving array bounds. The num-
ber returned by can_copy_into should be compatible with the specification of
copy_into. Which calls to copy_into are guaranteed to work correctly after a
call of

int r = can_copy_into(source, i, target, j, n);

Exercise 4 Can you develop a reasonable (non-degenerate) and useful function
with the following prototype? Discuss.

int f(int[] A);



Lecture Notes on
Linear Search
15-122: Principles of Imperative Computation
Frank Pfenning

Lecture 5
January 29, 2013

1 Introduction
One of the fundamental and recurring problems in computer science is to
find elements in collections, such as elements in sets. An important algo-
rithm for this problem is binary search. We use binary search for an integer
in a sorted array to exemplify it. As a preliminary study in this lecture we
analyze linear search, which is simpler, but not nearly as efficient. Still it is
often used when the requirements for binary search are not satisfied, for
example, when we do not have the elements we have to search arranged in
a sorted array.
In terms of our learning goals, we discuss the following:

Computational Thinking: We will see for the first time the power of order in
various algorithmic problems.

Algorithms and Data Structures: We will see a simple linear search in a
fixed-size array.

Programming: We will practice deliberate programming together in lectures.
We also emphasize the importance of contracts for testing and rea-
soning, both about safety and correctness.

2 Linear Search in an Unsorted Array


If we are given an array of integers A without any further information and
have to decide if an element x is in A, we just have to search through it,
element by element. We return true as soon as we find an element that
equals x, false if no such element can be found.

bool is_in(int x, int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
{
for (int i = lower; i < upper; i++)
//@loop_invariant lower <= i && i <= upper;
{
if (A[i] == x) return true;
}
return false;
}

We used the statement i++, which is equivalent to i = i+1, to step through
the array, element by element.
The precondition is very common when working with arrays. We pass
an array, and we also pass bounds – typically we will let lower be 0 and
upper be the length of the array. The added flexibility of allowing lower
and upper to take other values will be useful if we want to limit search to
the first n elements of an array and do not care about the others. It will also
be useful later to express invariants such as x is not among the first k elements
of A, which we will write in code as !is_in(x, A, 0, k) and which we
will write in mathematical notation as x ∉ A[0, k).
The loop invariant is also typical for loops over an array. We examine
every element (i ranges from lower to upper − 1). But we will have i = upper
after the last iteration, so the loop invariant which is checked just before the
exit condition must allow for this case.
Could we strengthen the loop invariant, or write a postcondition? We
could try something like

//@loop_invariant !is_in(x, A, lower, i);

where !b is the negation of b. However, it is difficult to make sense of this
use of recursion in a contract or loop invariant, so we will avoid it.
This is a small illustration of the general observation that some functions
are basic specifications and are themselves not subject to further specifica-
tion. Because such basic specifications are generally very inefficient, they
are mostly used in other specifications (that is, pre- or post-conditions, loop
invariants, general assertions) rather than in code intended to be executed.


3 Sorted Arrays
A number of algorithms on arrays would like to assume that they are sorted.
Such algorithms would return a correct result only if they are actually run-
ning on a sorted array. Thus, the first thing we need to figure out is how
to specify sortedness in function specifications. The specification function
is_sorted(A,lower,upper) traverses the array A from left to right, start-
ing at lower and stopping just before reaching upper , checking that each el-
ement is smaller or equal to its right neighbor. We need to be careful about
the loop invariant to guarantee that there will be no attempt to access a
memory element out of bounds.

bool is_sorted(int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
{
for (int i = lower; i < upper-1; i++)
//@loop_invariant lower <= i;
if (!(A[i] <= A[i+1])) return false;
return true;
}

The loop invariant here does not have an upper bound on i. Fortunately,
when we are inside the loop, we know the loop condition is true, so we
know i < upper − 1. That together with lower ≤ i guarantees that both
accesses are in bounds.
We could also try i ≤ upper − 1 as a loop invariant, but this turns out to
be false. It is instructive to think about why. If you cannot think of a good
reason, try to prove it carefully. Your proof should fail somewhere.
Actually, the attempted proof already fails at the initial step. If lower =
upper = 0 (which is permitted by the precondition) then it is not true that
0 = lower = i ≤ upper − 1 = 0 − 1 = −1. We could say i ≤ upper, but that
wouldn’t seem to serve any particular purpose here since the array accesses
are already safe.
Let’s reason through that. Why is the access A[i] safe? By the loop
invariant lower ≤ i and the precondition 0 ≤ lower we have 0 ≤ i, which
is the first part of safety. Secondly, we have i < upper − 1 (by the loop
condition, since we are in the body of the loop) and upper ≤ length(A)
(by the precondition), so i will be in bounds. In fact, even i + 1 will be in
bounds, since 0 ≤ lower ≤ i < i + 1 (and, since i is bounded from above,
i + 1 cannot overflow) and i + 1 < (upper − 1) + 1 = upper ≤ length(A).


Whenever you see an array access, you must have a very good reason
why the access must be in bounds. You should develop a coding instinct
where you deliberately pause every time you access an array in your code
and verify that it should be safe according to your knowledge at that point
in the program. This knowledge can be embedded in preconditions, loop
invariants, or assertions that you have verified.

4 Linear Search in a Sorted Array


Next, we want to search for an element x in an array A which we know is
sorted in ascending order. We want to return −1 if x is not in the array and
the index of the element if it is.
The pre- and postcondition as well as a first version of the function itself
are relatively easy to write.

int search(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A,0,n);
/*@ensures (\result == -1 && !is_in(x, A, 0, n))
|| ((0 <= \result && \result < n) && A[\result] == x);
@*/
{
for (int i = 0; i < n; i++)
//@loop_invariant 0 <= i && i <= n;
if (A[i] == x) return i;
return -1;
}

This does not exploit that the array is sorted. We would like to exit the
loop and return −1 as soon as we find that A[i] > x. If we haven’t found x
already, we will not find it subsequently since all elements to the right of i
will be greater or equal to A[i] and therefore strictly greater than x. But we
have to be careful: the following program has a bug.


int search(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A,0,n);
/*@ensures (-1 == \result && !is_in(x, A, 0, n))
|| ((0 <= \result && \result < n) && A[\result] == x);
@*/
{
for (int i = 0; A[i] <= x && i < n; i++)
//@loop_invariant 0 <= i && i <= n;
if (A[i] == x) return i;
return -1;
}

Can you spot the problem? If you cannot spot it immediately, reason
through the loop invariant. Read on if you are confident in your answer.


The problem is that the loop invariant only guarantees that 0 ≤ i ≤ n
before the exit condition is tested. So it is possible that i = n and the test
A[i] <= x will try to access an array element out of bounds: the n elements
of A are numbered from 0 to n − 1.
We can solve this problem by taking advantage of the so-called short-
circuiting evaluation of the boolean operators of conjunction (“and”) && and
disjunction (“or”) ||. If we have a condition e1 && e2 (e1 and e2) then we
do not attempt to evaluate e2 if e1 is false. This is because a conjunction
will always be false when the first conjunct is false, so the work would be
redundant.
Similarly, in a disjunction e1 || e2 (e1 or e2) we do not evaluate e2 if
e1 is true. This is because a disjunction will always be true when the first
disjunct is true, so the work would be redundant.
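As a minimal illustration of this idiom (the function le_at is our own
example, not from the lecture code), the following is safe even when i = n,
because A[i] is evaluated only after the test i < n has succeeded:

bool le_at(int[] A, int n, int i, int x)
//@requires 0 <= n && n <= \length(A);
//@requires 0 <= i && i <= n;
{
  return i < n && A[i] <= x;  // short-circuiting guards the array access
}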
In our linear search program, we just swap the two conjuncts in the exit
test.

int search(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A,0,n);
/*@ensures (-1 == \result && !is_in(x, A, 0, n))
|| ((0 <= \result && \result < n) && A[\result] == x);
@*/
{
for (int i = 0; i < n && A[i] <= x; i++)
//@loop_invariant 0 <= i && i <= n;
if (A[i] == x) return i;
return -1;
}

Now A[i] <= x will only be evaluated if i < n and the access will be in
bounds since we also know 0 ≤ i from the loop invariant.
Alternatively, and perhaps easier to read, we can move the test into the
loop body.


int search(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A,0,n);
/*@ensures (-1 == \result && !is_in(x, A, 0, n))
|| ((0 <= \result && \result < n) && A[\result] == x);
@*/
{
for (int i = 0; i < n; i++)
//@loop_invariant 0 <= i && i <= n;
{
if (A[i] == x) return i;
else if (A[i] > x) return -1;
}
return -1;
}

This program is not yet satisfactory, because the loop invariant does not
have enough information to prove the postcondition. We do know that if we
return directly from inside the loop, that A[i] = x and so A[\result] == x
holds. But we cannot deduce that !is_in(x, A, 0, n) if we return 1.
Before you read on, consider which loop invariant you might add to
guarantee that. Try to reason why the fact that the exit condition must
be false and the loop invariant true is enough information to know that
!is_in(x, A, 0, n) holds.


Did you try to exploit that the array is sorted? If not, then your invariant
is most likely too weak, because the function is incorrect if the array is not
sorted!
What we want to say is that all elements in A to the left of index i are smaller
than x. Just saying A[i-1] < x isn’t quite right, because when the loop is
entered the first time we have i = 0 and we would try to access A[−1]. We
again exploit short-circuiting evaluation, this time for disjunction.

int search(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A,0,n);
/*@ensures (-1 == \result && !is_in(x, A, 0, n))
|| ((0 <= \result && \result < n) && A[\result] == x);
@*/
{
for (int i = 0; i < n; i++)
//@loop_invariant 0 <= i && i <= n;
//@loop_invariant i == 0 || A[i-1] < x;
{
if (A[i] == x) return i;
else if (A[i] > x) return -1;
//@assert A[i] < x;
}
return -1;
}

It is easy to see that this invariant is preserved. Upon loop entry, i = 0.
Before we test the exit condition, we have just incremented i. We did not return
while inside the loop, so A[i − 1] ≠ x and also A[i − 1] ≤ x. From these two
together we have A[i − 1] < x. We have added a corresponding assertion
to the program to highlight the importance of that fact.
Why does the loop invariant imply the postcondition of the function? If
we exit the loop normally, then the loop condition must be false. So i ≥ n,
and since the loop invariant gives i ≤ n, we have i = n. By the second loop
invariant we then know A[n − 1] = A[i − 1] < x (or n = 0, in which case the
array is empty). Since the array is sorted, all elements from
0 to n − 1 are less than or equal to A[n − 1] and so also strictly less than x, and x
cannot be in the array.
If we exit from the loop because A[i] > x, we also know that A[i − 1] < x
(or i = 0), so x cannot be in the array since it is sorted.


5 Analyzing the Number of Operations


In the worst case, linear search goes around the loop n times, where n is the
given bound. On each iteration except the last, we perform three compar-
isons: i < n, A[i] = x and A[i] > x. Therefore, the number of comparisons
is almost exactly 3 * n in the worst case. We can express this by saying that
the running time is linear in the size of the input (n). This allows us to pre-
dict the running time pretty accurately. We run it for some reasonably large
n and measure its time. Doubling the size of the input to n′ = 2 * n means that
we now perform 3 * n′ = 3 * 2 * n = 2 * (3 * n) operations, twice as many as
for n inputs.
We will introduce more abstract measurements for the running times in
the lecture after next.



Lecture Notes on
Binary Search

15-122: Principles of Imperative Computation


Frank Pfenning

Lecture 6
January 31, 2013

1 Introduction
One of the fundamental and recurring problems in computer science is to
find elements in collections, such as elements in sets. An important al-
gorithm for this problem is binary search. We use binary search for an in-
teger in a sorted array to exemplify it. We started in the last lecture by
discussing linear search and giving some background on the problem. This
lecture clearly illustrates the power of order in algorithm design: if an array
is sorted we can search through it very efficiently, much more efficiently
than when it is not ordered.
We will also once again see the importance of loop invariants in writing
correct code. Here is a note by Jon Bentley about binary search:

I’ve assigned [binary search] in courses at Bell Labs and IBM. Profes-
sional programmers had a couple of hours to convert [its] description
into a program in the language of their choice; a high-level pseudocode
was fine. At the end of the specified time, almost all the programmers
reported that they had correct code for the task. We would then take
thirty minutes to examine their code, which the programmers did with
test cases. In several classes and with over a hundred programmers,
the results varied little: ninety percent of the programmers found bugs
in their programs (and I wasn’t always convinced of the correctness of
the code in which no bugs were found).
I was amazed: given ample time, only about ten percent of profes-
sional programmers were able to get this small program right. But
they aren’t the only ones to find this task difficult: in the history in
Section 6.2.1 of his Sorting and Searching, Knuth points out that
while the first binary search was published in 1946, the first published
binary search without bugs did not appear until 1962.
—Jon Bentley, Programming Pearls (1st edition), pp.35–36

I contend that what these programmers are missing is the understanding
of how to use loop invariants in composing their programs. They help
us to make assumptions explicit and clarify the reasons why a particular
program is correct. Part of the magic of pre- and post-conditions as well as
loop invariants and assertions is that they localize reasoning. Rather than
having to look at the whole program, or the whole function, we can focus
on individual statements tracking properties via the loop invariants and
assertions.

2 Binary Search
Can we do better than searching through the array linearly? If you don’t
know the answer already it might be surprising that, yes, we can do signif-
icantly better! Perhaps almost equally surprising is that the code is almost
as short!
Before we write the code, let us describe the algorithm. We start by
examining the middle element of the array. If it is smaller than x then x must
be in the upper half of the array (if it is there at all); if it is greater than x then
it must be in the lower half. Now we continue by restricting our attention
to either the upper or lower half, again finding the middle element and
proceeding as before.
We stop if we either find x, or if the size of the subarray shrinks to zero,
in which case x cannot be in the array.
Before we write a program to implement this algorithm, let us analyze
the running time. Assume for the moment that the size of the array is a
power of 2, say 2^k. Each time around the loop, when we examine the mid-
dle element, we cut the size of the subarrays we look at in half. So before the
first iteration the size of the subarray of interest is 2^k. After the first iter-
ation it is of size 2^(k−1), then 2^(k−2), etc. After k iterations it will be 2^(k−k) = 1,
so we stop after the next iteration. Altogether we can have at most k + 1
iterations. Within each iteration, we perform a constant amount of work:
computing the midpoint, and a few comparisons. So, overall, when given
an array of size n we perform c * log2(n) operations.¹


If the size n is not a power of 2, then we can round n up to the next
power of 2, and the reasoning above still applies. For example, if n = 13
we round it up to 16 = 2^4. The actual number of steps can only be smaller
than this bound, because some of the actual subintervals may be smaller
than the bound we obtained when rounding up n.
The logarithm grows much slower than the linear function that we ob-
tained when analyzing linear search. As before, consider that we are dou-
bling the size of the input, n′ = 2 * n. Then the number of operations will be
c * log(2 * n) = c * (log(2) + log(n)) = c * (1 + log(n)) = c + c * log(n). So the
number of operations increases only by a constant amount c when we dou-
ble the size of the input. Considering that the largest representable positive
number in two's complement representation is 2^31 − 1 (about 2 billion), bi-
nary search even for unreasonably large arrays will only traverse the loop
31 times! So the maximal number of operations is effectively bounded by a
constant if it is logarithmic.
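As a quick worked example (our own numbers, added for concreteness):
for an array of n = 2^20 elements, about a million, binary search performs
at most about c * log(2^20) = 20 * c operations, and doubling the array to
2^21 elements adds only c more, since c * log(2^21) = c * (1 + log(2^20)) =
20 * c + c.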

3 Implementing Binary Search


The specification for binary search is the same as for linear search.

int binsearch(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
/*@ensures (-1 == \result && !is_in(x, A, 0, n))
|| ((0 <= \result && \result < n) && A[\result] == x);
@*/
;

We declare two variables, lower and upper, which hold the lower and up-
per end of the subinterval in the array that we are considering. We start
with lower as 0 and upper as n, so the interval includes lower and excludes
upper. This often turns out to be a convenient choice when computing with
arrays (but see Exercise 1).
The for loop from linear search becomes a while loop, exiting when
the interval has size zero, that is, lower == upper. We can easily write the
¹ In general in computer science, we are mostly interested in logarithms to base 2, so
we will just write log(n) for log to the base 2 from now on unless we are considering a
different base.


first loop invariant, relating lower and upper to each other and the overall
bound of the array.

int binsearch(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
/*@ensures (-1 == \result && !is_in(x, A, 0, n))
|| ((0 <= \result && \result < n) && A[\result] == x);
@*/
{ int lower = 0;
int upper = n;
while (lower < upper)
//@loop_invariant 0 <= lower && lower <= upper && upper <= n;
{
// ...??...
}
return -1;
}

In the body of the loop, we first compute the midpoint mid. By elemen-
tary arithmetic it is indeed between lower and upper.
Next in the loop body we check if A[mid] = x. If so, we have found the
element and return mid.

int binsearch(int x, int[] A, int n)
// ... contract elided ...
{ int lower = 0;
int upper = n;
while (lower < upper)
//@loop_invariant 0 <= lower && lower <= upper && upper <= n;
//@loop_invariant ...??...
{ int mid = lower + (upper-lower)/2;
//@assert lower <= mid && mid < upper;
if (A[mid] == x) return mid;
// ...??...
}
return -1;
}

Now comes the hard part. What is the missing part of the invariant?
The first instinct might be to say that x should be in the interval from
A[lower] to A[upper]. But that may not even be true when the loop is en-
tered the first time.
Let’s consider a generic situation in the form of a picture and collect
some ideas about what might be appropriate loop invariants. Drawing
diagrams to reason about an algorithm and the code that we are trying
to construct is an extremely helpful general technique.

[Diagram: the sorted array A = 5, 7, 11, 19, 34, 42, 65, 65, 89, 123 at
indices 0..9, with a box around the segment from lower = 2 up to, but
excluding, upper = 6.]

The red box around elements 2 through 5 marks the segment of the
array still under consideration. This means we have ruled out everything
to the right of (and including) upper and to the left of (and not including)
lower . Everything to the left is ruled out, because those values have been
recognized to be strictly less than x, while the ones on the right are known
to be strictly greater than x, while the middle is still unknown.
We can depict this as follows:

[Diagram: the same array, now annotated: elements in 0..lower−1 are
known to be < x, elements in upper..n−1 are known to be > x, and the
boxed segment lower..upper−1 is still unknown (?).]

We can summarize this by stating that A[lower − 1] < x and A[upper] >
x. This implies that x cannot be in the segments A[0..lower) and A[upper..n)
because the array is sorted (so all array elements to the left of A[lower − 1]
will also be less than x and all array elements to the right of A[upper] will
also be greater than x). For an alternative, see Exercise 2.
We can postulate these as invariants in the code.


int binsearch(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
/*@ensures (-1 == \result && !is_in(x, A, 0, n))
|| ((0 <= \result && \result < n) && A[\result] == x);
@*/
{ int lower = 0;
int upper = n;
while (lower < upper)
//@loop_invariant 0 <= lower && lower <= upper && upper <= n;
//@loop_invariant A[lower-1] < x;
//@loop_invariant A[upper] > x;
{ int mid = lower + (upper-lower)/2;
if (A[mid] == x) return mid;
// ...??...
}
return -1;
}

Now a very powerful programming instinct should tell you something
is fishy. Can you spot the problem with the new invariants even before
writing any more code in the body of the loop?


Whenever you access an element of an array, you must have good
reason to know that the access will be in bounds!

In the code we blithely wrote A[lower − 1] and A[upper] because they
were in the middle of the array in our diagram. But initially (and poten-
tially through many iterations) this may not be the case. Fortunately, it is
easy to fix, following what we did for linear search. Consider the following
picture when we start the search.

[Diagram: the same array at the start of the search, with lower = 0 and
upper = n: nothing has been ruled out yet, so the entire array is still
marked unknown (?).]

In this case all elements of the array have to be considered candidates.
All elements strictly to the left of 0 (of which there are none) and to the right
of n (of which there are none) have been ruled out. As in linear search, we
can add this to our invariant using disjunction.

int binsearch(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
/*@ensures (-1 == \result && !is_in(x, A, 0, n))
|| ((0 <= \result && \result < n) && A[\result] == x);
@*/
{ int lower = 0;
int upper = n;
while (lower < upper)
//@loop_invariant 0 <= lower && lower <= upper && upper <= n;
//@loop_invariant (lower == 0 || A[lower-1] < x);
//@loop_invariant (upper == n || A[upper] > x);
{ int mid = lower + (upper-lower)/2;
if (A[mid] == x) return mid;
// ...??...
}
return -1;
}


At this point, let’s check if the loop invariant is strong enough to imply
the postcondition of the function. If we return from inside the loop because
A[mid] = x we return mid, so A[\result] == x as required.
If we exit the loop because lower < upper is false, we know lower =
upper, by the first loop invariant. Now we have to distinguish some cases.

1. If A[lower − 1] < x and A[upper] > x, then A[lower] > x (since lower =
upper). Because the array is sorted, x cannot be in it.

2. If lower = 0, then upper = 0. By the third loop invariant, either
n = 0 (and so the array has no elements and we must return −1), or
A[upper] = A[lower] = A[0] > x. Because A is sorted, x cannot be in
A if its first element is already strictly greater than x.

3. If upper = n, then lower = n. By the second loop invariant,
either n = 0 (and so we must return −1), or A[n − 1] = A[upper − 1] =
A[lower − 1] < x. Because A is sorted, x cannot be in A if its last
element is already strictly less than x.

Notice that we could verify all this without even knowing the complete
program! As long as we can finish the loop to preserve the invariant and
terminate, we will have a correct implementation! This would again be a
good point for you to interrupt your reading and to try to complete the
loop, reasoning from the invariant.
We have already tested if A[mid] = x. If not, then A[mid] must be less or
greater than x. If it is less, then we can keep the upper end of the interval
as is, and set the lower end to mid + 1. Now A[lower − 1] < x (because
A[mid] < x and lower = mid + 1), and the condition on the upper end
remains unchanged.
If A[mid] > x we can set upper to mid and keep lower the same. We do
not need to test this last condition, because the fact that the tests A[mid] = x
and A[mid] < x both failed implies that A[mid] > x. We note this in an
assertion.


int binsearch(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
/*@ensures (-1 == \result && !is_in(x, A, 0, n))
|| ((0 <= \result && \result < n) && A[\result] == x);
@*/
{ int lower = 0;
int upper = n;
while (lower < upper)
//@loop_invariant 0 <= lower && lower <= upper && upper <= n;
//@loop_invariant (lower == 0 || A[lower-1] < x);
//@loop_invariant (upper == n || A[upper] > x);
{ int mid = lower + (upper-lower)/2;
//@assert lower <= mid && mid < upper;
if (A[mid] == x) return mid;
else if (A[mid] < x) lower = mid+1;
else /*@assert(A[mid] > x);@*/
upper = mid;
}
return -1;
}


Let’s set up the proof of the loop invariants more schematically.


Init: When the loop is first reached, we have lower = 0 and upper = n, so
the first loop invariant follows from the precondition to the function.
Furthermore, the first disjunct in loop invariants two (lower == 0)
and three (upper == n) is satisfied.
Preservation: Assume the loop invariants are satisfied and we enter the
loop:

0 ≤ lower ≤ upper ≤ n                 (Inv 1)
(lower = 0 or A[lower − 1] < x)       (Inv 2)
(upper = n or A[upper] > x)           (Inv 3)
lower < upper                         (loop condition)

We compute mid = lower + ⌊(upper − lower)/2⌋. Now we distinguish
three cases:

A[mid] = x: In that case we exit the function, so we don't need to
show preservation. We do have to show the postcondition, but
we already considered this earlier in the lecture.

A[mid] < x: Then
lower′ = mid + 1
upper′ = upper
The first loop invariant 0 ≤ lower′ ≤ upper′ ≤ n follows from the
formula for mid, our assumptions, and elementary arithmetic.
For the second loop invariant, we calculate:
A[lower′ − 1] = A[(mid + 1) − 1]   since lower′ = mid + 1
             = A[mid]              by arithmetic
             < x                   since we are in the case A[mid] < x
The third loop invariant is preserved, since upper′ = upper.

A[mid] > x: Then
lower′ = lower
upper′ = mid
Again, by elementary arithmetic, 0 ≤ lower′ ≤ upper′ ≤ n.
The second loop invariant is preserved since lower′ = lower.
For the third loop invariant, we calculate:
A[upper′] = A[mid]   since upper′ = mid
          > x        since we are in the case A[mid] > x


4 Termination
Does this function terminate? If the loop body executes, that is, lower <
upper , then the interval from lower to upper is non-empty. Moreover, the
intervals from lower to mid and from mid + 1 to upper are both strictly
smaller than the original interval. Unless we find the element, the differ-
ence between upper and lower must eventually become 0 and we exit the
loop.
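One way to make this argument concrete in code is to record the size of the
interval at the top of the loop body and assert that it has strictly decreased
at the bottom. This is a sketch of our own (the local variable size and the
name binsearch_check are not part of the lecture code):

int binsearch_check(int x, int[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
{ int lower = 0;
  int upper = n;
  while (lower < upper)
  //@loop_invariant 0 <= lower && lower <= upper && upper <= n;
  { int size = upper - lower;         // termination measure, bounded below by 0
    int mid = lower + (upper-lower)/2;
    if (A[mid] == x) return mid;
    else if (A[mid] < x) lower = mid+1;
    else upper = mid;
    //@assert upper - lower < size;   // the measure strictly decreases
  }
  return -1;
}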

5 One More Observation


You might be tempted to calculate the midpoint with

int mid = (lower + upper)/2;

but that is in fact incorrect. Consider this change and try to find out why
this would introduce a bug.


Were you able to see it? It’s subtle, but somewhat related to other prob-
lems we had. When we compute (lower + upper)/2; we could actually
have an overflow, if lower + upper > 2^31 − 1. This is somewhat unlikely in
practice, since 2^31 = 2G, about 2 billion, so the array would have to have at
least 1 billion elements. This is not impossible, and, in fact, a bug like this
in the Java libraries² was actually exposed.
Fortunately, the fix is simple: because lower < upper, we know that
upper − lower > 0 and represents the size of the interval. So we can divide
that in half and add it to the lower end of the interval to get its midpoint.

int mid = lower + (upper-lower)/2; // as shown in binary search
//@assert lower <= mid && mid < upper;

Let us convince ourselves why the assert is correct. The division by two
rounds toward zero, which here means rounding down, because upper − lower > 0.
Thus, 0 ≤ (upper − lower)/2 < upper − lower, because dividing a positive
number by two will make it strictly smaller. Hence,

mid = lower + (upper − lower)/2 < lower + (upper − lower) = upper

Since dividing positive numbers by two will still result in a nonnegative
number, the first part of the assert is correct as well:

mid = lower + (upper − lower)/2 ≥ lower + 0 = lower

Other operations in this binary search take place on quantities bounded
from above by the int n and thus cannot overflow.
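To see the wraparound concretely, here is a small sketch with hypothetical
values (C0 ints are 32-bit two's complement, so lower + upper silently
wraps around):

int lower = 1500000000;
int upper = 2000000000;
int bad  = (lower + upper)/2;        // lower+upper wraps to -794967296, so bad is negative
int good = lower + (upper-lower)/2;  // 1750000000, as intended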
Why did we choose to look at the middle element and not another el-
ement at all? Because, whatever the outcome of our comparison to that
middle element may be, we maximize how much we have learned about
the contents of the array by doing this one comparison. If we find the ele-
ment, we are happy because we are done. If the middle element is smaller
than what we are looking for, however, we are happy as well, because we
have just learned that the lower half of the array has become irrelevant.
Similarly, if the middle element is bigger, then we have made substantial
progress by learning that we never need to look at the upper half of the
array anymore. There are other choices, however, where binary search will
also still work in essentially the same way.
² See Joshua Bloch's Extra, Extra blog entry.


6 Some Measurements
Algorithm design is an interesting mix between mathematics and an ex-
perimental science. Our analysis above, albeit somewhat preliminary in
nature, allows us to make some predictions of the running times of our imple-
mentations. We start with linear search. We first set up a file to do some
experiments. We assume we have already tested our functions for correct-
ness, so only timing is at stake. See the file find-time.c0 on the course web
pages. We compile this file, together with our implementation from
this lecture, with the cc0 command below. We can get an overall end-to-
end timing with the Unix time command. Note that we do not use the -d
flag, since that would dynamically check contracts and completely throw
off our timings.

% cc0 find.c0 find-time.c0
% time ./a.out

When running linear search 2000 times (1000 elements in the array and 1000
random elements) on 2^18 elements (256 K elements) we get the following
answer
Timing 1000 times with 2^18 elements
0
4.602u 0.015s 0:04.63 99.5% 0+0k 0+0io 0pf+0w
which indicates 4.602 seconds of user time.
Running linear search 2000 times on random arrays of size 2^18, 2^19 and
2^20 we get the timings on our MacBook Pro:

array size   time (secs)
2^18         4.602
2^19         9.027
2^20         19.239
The running times are fairly close to doubling consistently. Due to mem-
ory locality effects and other overheads, for larger arrays we would expect
larger numbers.
Running the same experiments with binary search we get

array size   time (secs)
2^18         0.020
2^19         0.039
2^20         0.077


which is much, much faster, but looks suspiciously linear as well.
Reconsidering the code we see that the time might increase linearly be-
cause we actually must iterate over the whole array in order to initialize it
with random elements!
We comment out the testing code to measure only the initialization
time, and we see that for 2^20 elements we measure 0.072 seconds, as com-
pared to 0.077, a difference which is insignificant. Effectively, we have been measuring
the time to set up the random array, rather than to find elements in it with
binary search!
This is a vivid illustration of the power of divide-and-conquer. Loga-
rithmic running times of algorithms grow very slowly, a crucial difference
to linear-time algorithms when the data sizes become large.


Exercises
Exercise 1 Rewrite the binary search function so that both lower and upper bounds
of the interval are inclusive. Make sure to rewrite the loop invariants and the loop
body appropriately, and prove the correctness of the new loop invariants. Also
explicitly prove termination by giving a measure that strictly decreases each time
around the loop and is bounded from below.
Exercise 2 Rewrite the invariants of the binary search function to use is_in(x, A, l, u),
which returns true if and only if there is an i such that x = A[i] for l ≤ i < u.
is_in assumes that 0 ≤ l ≤ u ≤ n where n is the length of the array.
Then prove the new loop invariants, and verify that they are strong enough to
imply the function’s postcondition.
Exercise 3 Binary search as presented here may not find the leftmost occurrence
of x in the array in case the occurrences are not unique. Give an example demon-
strating this.
Now change the binary search function and its loop invariants so that it will
always find the leftmost occurrence of x in the given array (if it is actually in the
array, −1 as before if it is not).
Prove the loop invariants and the postconditions for this new version, and
verify termination.
Exercise 4 If you were to replace the midpoint computation by
int mid = (lower + upper)/2;
then which part of the contract will alert you to a flaw in your thinking? Why?
Give an example showing how the contracts can fail in that case.
Exercise 5 In lecture, we used design-by-invariant to construct the loop body im-
plementation from the loop invariant that we have identified before. We could also
have maintained the loop invariant by replacing the whole loop body just with
// .... loop_invariant elided ....
{
lower = lower;
upper = upper;
}

Prove the loop invariants for this loop body. What is wrong with this choice?
Which part of our proofs fail, thereby indicating why this loop body would not
implement binary search correctly?



Lecture Notes on
Sorting

15-122: Principles of Imperative Computation


Frank Pfenning

Lecture 7
February 5, 2013

1 Introduction
We begin this lecture by discussing how to compare running times of func-
tions in an abstract, mathematical way. The same underlying mathematics
can be used for other purposes, like comparing memory consumption or
the amount of parallelism permitted by an algorithm. We then use this to
take a first look at sorting algorithms, of which there are many. In this lec-
ture it will be selection sort because of its simplicity.
In terms of our learning goals, we will work on:
Computational Thinking: Still trying to understand how order can lead
to efficient computation. Worst-case asymptotic complexity of func-
tions.

Algorithms and Data Structures: In-place sorting of arrays in general, and
selection sort in particular. Big-O notation.

Programming: More examples of programming with arrays and algorithm
invariants.

2 Big-O Notation
Our brief analysis in the last lecture already indicates that linear search
should take about n iterations of a loop while binary search takes about
log2(n) iterations, with a constant number of operations in each loop body.
This suggests that binary search should be more efficient. In the design and


analysis of algorithms we try to make this mathematically precise by deriv-
ing so-called asymptotic complexity measures for algorithms. There are two
fundamental principles that guide our mathematical analysis.
1. We only care about the behavior of an algorithm on large inputs, that
is, when it takes a long time. It is when the inputs are large that differ-
ences between algorithms become really pronounced. For example,
linear search on a 10-element array will be practically the same as bi-
nary search on a 10-element array, but once we have an array of, say,
a million entries the difference will be huge.

2. We do not care about constant factors in the mathematical analysis.
For example, in analyzing the search algorithms we count how of-
ten we have to iterate, not exactly how many operations we have to
perform on each iteration. In practice, constant factors can make a
big difference, but they are influenced by so many factors (compiler,
runtime system, machine model, available memory, etc.) that at the
abstract, mathematical level a precise analysis is neither appropriate
nor feasible.
Let’s see how these two fundamental principles guide us in the comparison
between functions that measure the running time of an algorithm.
Let’s say we have functions f and g that measure the number of oper-
ations of an algorithm as a function of the size of the input. For example
f(n) = 3 * n measures the number of comparisons performed in linear
search for an array of size n, and g(n) = 3 * log(n) measures the number of
comparisons performed in binary search for an array of size n.
The simplest form of comparison would be

g ≤₀ f if for every n ≥ 0, g(n) ≤ f(n).

However, this violates principle (1) because we compare the values of g
and f on all possible inputs n.
We can refine this by saying that eventually, g will always be smaller or
equal to f. We express “eventually” by requiring that there be a number n₀
such that g(n) ≤ f(n) for all n that are greater than n₀.

g ≤₁ f if there is some n₀ such that for every n ≥ n₀ it is the case
that g(n) ≤ f(n).

This now incorporates the first principle (we only care about the func-
tion on large inputs), but constant factors still matter. For example, accord-
ing to the last definition we have 3 * n ≤₁ 5 * n but 5 * n ≰₁ 3 * n. But if
constant factors don't matter, then the two should be equivalent. We can
repair this by allowing the right-hand side to be multiplied by an arbitrary
constant.

g ≤₂ f if there is a constant c > 0 and some n₀ such that for
every n ≥ n₀ we have g(n) ≤ c * f(n).

This definition is now appropriate.
The less-or-equal symbol ≤ is already overloaded with many meanings,
so we write instead:

g ∈ O(f) if there is a constant c > 0 and some n₀ such that for
every n ≥ n₀ we have g(n) ≤ c * f(n).

This notation derives from the view of O(f) as a set of functions, namely
those that eventually are smaller than a constant times f.¹ Just to be ex-
plicit, we also write out the definition of O(f) as a set of functions:

O(f) = { g | there are c > 0 and n₀ s.t. for all n ≥ n₀, g(n) ≤ c * f(n) }

With this definition we can check that O(f(n)) = O(c * f(n)).
When we characterize the running time of a function using big-O nota-
tion we refer to it as the asymptotic complexity of the function. Here, asymp-
totic refers to the fundamental principles listed above: we only care about
the function in the long run, and we ignore constant factors. Usually, we
use an analysis of the worst case among the inputs of a given size. Trying
to do average case analysis is much harder, because it depends on the distri-
bution of inputs. Since we often don’t know the distribution of inputs it is
much less clear whether an average case analysis may apply in a particular
use of an algorithm.
The asymptotic worst-case time complexity of linear search is O(n),
which we also refer to as linear time. The worst-case asymptotic time com-
plexity of binary search is O(log(n)), which we also refer to as logarithmic
time. Constant time is usually described as O(1), expressing that the running
time is independent of the size of the input.
Some brief fundamental facts about big-O. For any polynomial, only
the highest power of n matters, because it eventually comes to dominate the
function. For example, O(5 * n² + 3 * n + 83) = O(n²). Also O(log(n)) ⊆ O(n),
but O(n) ⊈ O(log(n)).
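To make the polynomial example concrete, here is one possible choice of
witnesses (a worked check we add here): for all n ≥ 1 we have n ≤ n² and
1 ≤ n², so

5 * n² + 3 * n + 83 ≤ 5 * n² + 3 * n² + 83 * n² = 91 * n²

and hence c = 91 and n₀ = 1 witness 5 * n² + 3 * n + 83 ∈ O(n²).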
¹ In textbooks and research papers you may sometimes see this written as g = O(f) but
that is questionable, comparing a function with a set of functions.


That is the same as to say O(log(n)) ⊊ O(n), which means that O(log(n))
is a proper subset of O(n), that is, O(log(n)) is a subset (O(log(n)) ⊆ O(n)),
but they are not equal (O(log(n)) ≠ O(n)). Logarithms to different (con-
stant) bases are asymptotically the same: O(log2(n)) = O(logb(n)) because
logb(n) = log2(n) / log2(b).
As a side note, it is mathematically correct to say the worst-case running
time of binary search is O(n), because log(n) ∈ O(n). It is, however, a
looser characterization than saying that the running time of binary search
is O(log(n)), which is also correct. Of course, it would be incorrect to say
that the running time is O(1). Generally, when we ask you to characterize
the worst-case running time of an algorithm we are asking for the tightest
bound in big-O notation.

3 Sorting Algorithms
We have seen in the last lecture that sorted arrays drastically reduce the
time to search for an element when compared to unsorted arrays. Asymp-
totically, it is the difference between O(n) (linear time) and O(log(n)) (loga-
rithmic time), where n is the length of the input array. This suggests that it
may be important to establish this invariant, namely sorting a given array.
In practice, this is indeed the case: sorting is an important component of
many other data structures or algorithms.
There are many different algorithms for sorting: bucket sort, bubble
sort, insertion sort, selection sort, heap sort, etc. This is testimony to the
importance and complexity of the problem, despite its apparent simplicity.
In this lecture we discuss selection sort, which is one of the simplest
algorithms. In the next lecture we will discuss quicksort. Earlier course in-
stances used mergesort as another example of efficient sorting algorithms.

4 Selection Sort
Selection sort is based on the idea that on each iteration we select the small-
est element of the part of the array that has not yet been sorted and move it
to the end of the sorted part at the beginning of the array.
Let’s play this through for two steps on an example array. Initially, we
consider the whole array (from i = 0 to the end). We write this as A[0..n),
that is the segment of the array starting at 0 up to n, where n is excluded.


[Diagram: the array A = 12, 87, 21, 3, 2, 78, 97, 16, 89, 21 at indices 0..9,
with the whole segment from i = 0 to n under consideration.]

We now find the minimal element of the array segment under consid-
eration (2) and move it to the front of the array. What do we do with the
element that is there? We move it to the place where 2 was (namely at A[4]).
In other words, we swap the first element with the minimal element. Swap-
ping is a useful operation when we sort an array in place by modifying
it, because the result is clearly a permutation of the input. If swapping is
our only operation we are immediately guaranteed that the result is a per-
mutation of the input.

[Diagram: after the first swap, A = 2, 87, 21, 3, 12, 78, 97, 16, 89, 21;
the segment still to be sorted now runs from i = 1 to n.]

Now 2 is in the right place, and we find the smallest element in the
remaining array segment and move it to the beginning of the segment (i =
1).


[Diagram: after the second swap, A = 2, 3, 21, 87, 12, 78, 97, 16, 89, 21;
the segment still to be sorted now runs from i = 2 to n.]

Let’s pause and see if we can write down properties of the variables and
array segments that allow us to write the code correctly. First we observe
rather straightforwardly that

0in

where i = n after the last iteration and i = 0 before the first iteration. Next
we observe that the elements to the left of i are already sorted.

A[0..i) sorted

These two invariants are not yet sufficient to prove the correctness of selec-
tion sort. We also need to know that all elements to the left of i are less or
equal to all elements to the right of i. We abbreviate this:

A[0..i) ≤ A[i..n)

saying that every element in the left segment is smaller than or equal to
every element in the right segment.


We summarize the invariants


0in
A[0..i) sorted
A[0..i)  A[i..n)

Let’s reason through without any code (for the moment), why these invari-
ants are preserved. Let’s look at the picture again.

[Array diagram (as above): A = [2, 3, 21, 87, 12, 78, 97, 16, 89, 21], with i = 2]

In the next iteration we pick the minimal element among A[i..n), which
would be 12 = A[4]. We now swap this to i = 2 and increment i. We write
here i′ = i + 1 in order to distinguish the old value of i from the new one,
as we do in proofs of preservation of the loop invariant.

[Array diagram after swapping the minimum 12 into position i = 2: A = [2, 3, 12, 87, 21, 78, 97, 16, 89, 21]; the new marker is i′ = i + 1 = 3]

Since we only step when i < n, the bounds on i are preserved.


Why is A[0..i+1) sorted? We know by the third invariant that any element in A[0..i) is less than or equal to any element in A[i..n), and in particular the one we moved to A[i].
Why is A[0..i+1) ≤ A[i+1..n)? We know from the loop invariant before the iteration that A[0..i) ≤ A[i+1..n). So it remains to show that A[i..i+1) ≤ A[i+1..n). But that is true since A[i] was a minimal element of A[i..n), which is the same as saying that it is smaller or equal to all the elements in A[i..n) and therefore also A[i+1..n) after we swap the old A[i] into its new position.


5 Programming Selection Sort


From the above invariants and description of the algorithm, the correct
code is simple to write, including its invariants. The function does not
return a value, since it modifies the given array A, so it has declaration:

void sort(int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(A, lower, upper);
;

We encourage you to now write the function, using the following aux-
iliary and contract functions:

1. is_sorted(A, lower, upper) which is true if the array segment A[lower..upper) is sorted.

2. le_seg(x, A, lower, upper) which is true if x ≤ A[lower..upper) (which means x is less than or equal to all elements in the array segment).

3. le_segs(A, lower1, upper1, lower2, upper2) which is true if A[lower1..upper1) ≤ A[lower2..upper2) (which means all elements in the first segment are less or equal to all elements in the second array segment).

4. swap(A, i, j) modifies the array A by swapping A[i] with A[j]. Of course, if i = j, the array remains unchanged.

5. min_index(A, lower, upper) which returns the index m of a minimal element in the segment A[lower..upper).

Please write it and then compare it to our version below.


void sort(int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(A, lower, upper);
{
for (int i = lower; i < upper; i++)
//@loop_invariant lower <= i && i <= upper;
//@loop_invariant is_sorted(A, lower, i);
//@loop_invariant le_segs(A, lower, i, i, upper);
{
int m = min_index(A, i, upper);
//@assert le_seg(A[m], A, i, upper);
swap(A, i, m);
}
return;
}

At this point, let us verify that the loop invariants are initially satisfied (for simplicity, we reason about the case lower = 0 and upper = n).

• 0 ≤ i and i ≤ n since i = 0 and 0 ≤ n (by the precondition (@requires)).

• A[0..i) is sorted, since for i = 0 the segment A[0..0) is empty (has no elements) since the right bound is exclusive.

• A[0..i) ≤ A[i..n) is true since for i = 0 the segment A[0..0) has no elements. The other segment, A[0..n), is the whole array.

We should also verify the assertion we added in the loop body. It ex-
presses that A[m] is less or equal to any element in the segment A[i..n),
abbreviated mathematically as A[m] ≤ A[i..n). This should be implied by
the postcondition of the min_index function.
How can we prove the postcondition (@ensures) of the sorting func-
tion? By the loop invariant 0 ≤ i ≤ n and the negation of the loop condition, i ≥ n, we know i = n. The second loop invariant then states that A[0..n) is
sorted, which is the postcondition.


6 Auxiliary Functions
Besides the specification functions in contracts, we also used two auxiliary
functions: swap and min_index.
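These notes do not show the bodies of the specification functions. For concreteness, here is one way is_sorted, le_seg, and le_segs might be written — a sketch only, since the versions used with the course code may differ in details:

bool is_sorted(int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
{
  for (int i = lower; i < upper-1; i++)
    if (!(A[i] <= A[i+1])) return false;
  return true;
}

bool le_seg(int x, int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
{
  // checks x <= y for every y in A[lower..upper)
  for (int i = lower; i < upper; i++)
    if (!(x <= A[i])) return false;
  return true;
}

bool le_segs(int[] A, int lower1, int upper1, int lower2, int upper2)
//@requires 0 <= lower1 && lower1 <= upper1 && upper1 <= \length(A);
//@requires 0 <= lower2 && lower2 <= upper2 && upper2 <= \length(A);
{
  // checks A[lower1..upper1) <= A[lower2..upper2), elementwise
  for (int i = lower1; i < upper1; i++)
    if (!le_seg(A[i], A, lower2, upper2)) return false;
  return true;
}

Since these functions appear only in annotations, their cost is incurred only when the code is compiled with contract checking (-d) enabled.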
Here is the implementation of swap.

void swap(int[] A, int i, int j)
//@requires 0 <= i && i < \length(A);
//@requires 0 <= j && j < \length(A);
{
int tmp = A[i];
A[i] = A[j];
A[j] = tmp;
return;
}

For min_index, we recommend you follow the method used for selec-
tion sort: follow the algorithm for a couple of steps on a generic example,
write down the invariants in general terms, and then synthesize the simple
code and invariants from the result. What we have is below, for complete-
ness.

int min_index(int[] A, int lower, int upper)
//@requires 0 <= lower && lower < upper && upper <= \length(A);
//@ensures lower <= \result && \result < upper;
//@ensures le_seg(A[\result], A, lower, upper);
{
int m = lower;
int min = A[lower];
for (int i = lower+1; i < upper; i++)
//@loop_invariant lower < i && i <= upper;
//@loop_invariant le_seg(min, A, lower, i);
//@loop_invariant A[m] == min;
if (A[i] < min) {
m = i;
min = A[i];
}
return m;
}


7 Asymptotic Complexity Analysis


Previously, we have had to expend some effort to prove that functions actu-
ally terminate (like in the function for greatest common divisor). Here we
do more: we do counting in order to give a big-O classification of the num-
ber of operations. If we have an explicit bound on the number of operations
that, of course, implies termination.
The outer loop iterates n times, from i = 0 to i = n − 1. Actually, we could stop one iteration earlier, but that does not affect the asymptotic complexity, since it only involves a constant number of additional operations.
For each iteration of the outer loop (identified by the value for i), we
do a linear search through the array segment to the right of i and then a
simple swap. The linear search will take n − i iterations, and cannot be
easily improved since the array segment A[i..n) is not (yet) sorted. So the
total number of iterations (counting the number of inner iterations for each
outer one)
n + (n − 1) + (n − 2) + · · · + 1 = n(n + 1)/2
During each of these iterations, we only perform a constant amount of op-
erations (some comparisons, assignments, and increments), so, asymptoti-
cally, the running time can be estimated as

O(n(n + 1)/2) = O(n^2/2 + n/2) = O(n^2)
The last equation follows since for a polynomial, as we remarked earlier,
only the degree matters.
We summarize this by saying that the worst-case running time of selec-
tion sort is quadratic. In this algorithm there isn’t a significant difference
between average case and worst case analysis: the number of iterations is
exactly the same, and we only save one or two assignments per iteration in
the loop body of the min_index function if the array is already sorted.


8 Empirical Validation
If the running time were really O(n^2) and not asymptotically faster, we predict the following: for large inputs, its running time should be essentially c·n^2 for some constant c. If we double the size of the input to 2n, then the running time should roughly become c·(2n)^2 = 4(c·n^2), which means the function should take approximately 4 times as many seconds as before.
We try this with the function sort_time(n, r) which generates a ran-
dom array of size n and then sorts it r times. You can find the C0 code at
sort-time.c0. We run this code several times, with different parameters.
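A rough sketch of what sort_time might look like follows, assuming C0's rand library (init_rand and rand); the actual sort-time.c0 differs in details such as reading n and r from the command line and printing the banner seen in the transcript below:

#use <rand>

void sort_time(int n, int r)
//@requires 0 < n && 0 <= r;
{
  rand_t gen = init_rand(0x122);  // fixed seed, arbitrarily chosen
  int[] A = alloc_array(int, n);
  for (int i = 0; i < n; i++)
    A[i] = rand(gen);             // fill the array with pseudo-random values
  for (int j = 0; j < r; j++)
    sort(A, 0, n);                // the selection sort developed above
}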

% cc0 selectsort.c0 sort-time.c0


% time ./a.out -n 1000 -r 100
Timing array of size 1000, 100 times
0
0.700u 0.001s 0:00.70 100.0% 0+0k 0+0io 0pf+0w
% time ./a.out -n 2000 -r 100
Timing array of size 2000, 100 times
0
2.700u 0.001s 0:02.70 100.0% 0+0k 0+0io 0pf+0w
% time ./a.out -n 4000 -r 100
Timing array of size 4000, 100 times
0
10.790u 0.002s 0:10.79 100.0% 0+0k 0+0io 0pf+0w
% time ./a.out -n 8000 -r 100
Timing array of size 8000, 100 times
0
42.796u 0.009s 0:42.80 99.9% 0+0k 0+0io 0pf+0w
%

Calculating the ratios of successive running times, we obtain

   n      Time    Ratio
1000     0.700
2000     2.700     3.85
4000    10.790     4.00
8000    42.796     3.97

We see that especially for the larger numbers, the ratio is almost exactly 4
when doubling the size of the input. Our conjecture of quadratic asymp-
totic running time has been experimentally confirmed.



Lecture Notes on
Quicksort
15-122: Principles of Imperative Computation
Frank Pfenning

Lecture 8
February 7, 2013

1 Introduction
In this lecture we first sketch two related algorithms for sorting that achieve
a much better running time than the selection sort from last lecture: merge-
sort and quicksort. We then develop quicksort and its invariants in detail.
As usual, contracts and loop invariants will bridge the gap between the
abstract idea of the algorithm and its implementation.
We will revisit many of the computational thinking, algorithm, and pro-
gramming concepts from the previous lectures. We highlight the following
important ones:

Computational Thinking: We revisit the divide-and-conquer technique from the lecture on binary search. We will also see the importance of randomness for the first time.

Algorithms and Data Structures: We examine mergesort and quicksort, both of which use divide-and-conquer, but with different overall strategies.

Programming: We have occasionally seen recursion in specification functions. In both mergesort and quicksort, it will be a central computational technique.

Both mergesort and quicksort are examples of divide-and-conquer. We divide a problem into simpler subproblems that can be solved independently
and then combine the solutions. As we have seen for binary search, the
ideal divide step breaks a problem into two of roughly equal size, because it
means we need to divide only logarithmically many times before we have a basic problem, presumably with an immediate answer. Mergesort achieves this, quicksort not quite, which presents an interesting tradeoff when considering which algorithm to choose for a particular class of applications.
Recall linear search for an element in an array, which has asymptotic
complexity of O(n). The divide-and-conquer technique of binary search
divides the array in half, determines which half our element would have
to be in, and then proceeds with only that subarray. An interesting twist
here is that we divide, but then we need to conquer only a single new sub-
problem. So if the length of the array is 2k and we divide it by two on each
step, we need at most k iterations. Since there is only a constant number of
operations on each iteration, the overall complexity is O(log(n)). As a side
remark, if we divided the array into 3 equal sections, the complexity would remain O(log(n)) because 3^k = (2^(log2(3)))^k = 2^(log2(3)·k), so log2(n) and log3(n) only differ in a constant factor, namely log2(3).

2 Mergesort
Let’s see how we can apply the divide-and-conquer technique to sorting.
How do we divide?
One simple idea is just to divide a given array in half and sort each
half independently. Then we are left with an array where the left half is
sorted and the right half is sorted. We then need to merge the two halves
into a single sorted array. We actually don’t really “split” the array into
two separate arrays, but we always sort array segments A[lower ..upper ).
We stop when the array segment is of length 0 or 1, because then it must be
sorted.
A straightforward implementation of this idea would be as follows:
void mergesort (int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(A, lower, upper);
{
if (upper-lower <= 1) return;
int mid = lower + (upper-lower)/2;
mergesort(A, lower, mid); //@assert is_sorted(A, lower, mid);
mergesort(A, mid, upper); //@assert is_sorted(A, mid, upper);
merge(A, lower, mid, upper);
return;
}


We would still have to write merge, of course. We use the specification func-
tion is_sorted from the last lecture that takes an array segment, defined
by its lower and upper bounds.
The simple and efficient way to merge two sorted array segments (so
that the result is again sorted) is to create a temporary array, scan each of
the segments from left to right, copying the smaller of the two into the
temporary array. This is a linear time (O(n)) operation, but it also requires
a linear amount of temporary space. Other algorithms, like quicksort later
in this lecture, sort entirely in place and do not require temporary memory
to be allocated. We do not develop the merge operation here further.
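For concreteness, a direct implementation of the merging idea just described might look like the sketch below, which uses a temporary array B (this is only an illustration, not code from the course):

void merge(int[] A, int lower, int mid, int upper)
//@requires 0 <= lower && lower <= mid && mid <= upper && upper <= \length(A);
//@requires is_sorted(A, lower, mid) && is_sorted(A, mid, upper);
//@ensures is_sorted(A, lower, upper);
{
  int[] B = alloc_array(int, upper-lower);
  int i = lower;  // next unconsumed element of the left segment
  int j = mid;    // next unconsumed element of the right segment
  int k = 0;      // next free position in B
  while (i < mid && j < upper) {
    // copy the smaller of the two front elements into B
    if (A[i] <= A[j]) { B[k] = A[i]; i++; }
    else { B[k] = A[j]; j++; }
    k++;
  }
  while (i < mid) { B[k] = A[i]; i++; k++; }    // drain the left segment
  while (j < upper) { B[k] = A[j]; j++; k++; }  // drain the right segment
  for (int l = 0; l < upper-lower; l++)
    A[lower+l] = B[l];  // copy the merged result back into A
}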
The mergesort function represents an example of recursion: a function
(mergesort) calls itself on a smaller argument. When we analyze such a
function call it would be a mistake to try to analyze the function that we
call recursively. Instead, we reason about it using contracts.

1. We have to ascertain that the preconditions of the function we are calling are satisfied.

2. We are allowed to assume that the postconditions of the function we are calling are satisfied when it returns.

This applies no matter whether the call is recursive, like it is in this example,
or not. In the mergesort code above the precondition is easy to see. We
have illustrated the postcondition with two explicit @assert annotations.
Reasoning about recursive functions using their contracts is an excel-
lent illustration of computational thinking, separating the what (that is, the
contract) from the how (that is, the definition of the function). To analyze
the recursive call we only care about what the function does.
We also need to analyze the termination behavior of the function, verify-
ing that the recursive calls are on strictly smaller arguments. What smaller
means differs for different functions; here the size of the subrange of the
array is what decreases. The quantity upper − lower is divided by two for
each recursive call and is therefore smaller since it is always greater or equal
to 2. If it were less than 2 we would return immediately and not make a
recursive call.
Let’s consider the asymptotic complexity of mergesort, assuming that
the merging operation is O(n).

[Recursion-tree diagram:
level 0: n                          — 1 merge of size n:    O(n)
level 1: n/2, n/2                   — 2 merges of size n/2: O(n)
level 2: n/4, n/4, n/4, n/4         — 4 merges of size n/4: O(n)
...
bottom:  1, 1, ..., 1
Mergesort, worst case: log(n) levels, O(n) per level]

We see that the asymptotic running time will be O(n log(n)), because there
are O(log(n)) levels, and on each level we have to perform O(n) operations
to merge.
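The same bound can be read off from a recurrence (a side calculation; assume merging segments of total length n costs at most c·n operations):

T(n) = 2·T(n/2) + c·n
     = 4·T(n/4) + 2·c·n
     = 8·T(n/8) + 3·c·n
     = · · ·
     = 2^k·T(n/2^k) + k·c·n

so after k = log2(n) unrollings we are left with n·T(1) + c·n·log2(n) ∈ O(n log(n)).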

3 The Quicksort Algorithm


A characteristic of mergesort is that the divide phase of divide-and-conquer
is immediate: we only need to calculate the midpoint. On the other hand,
it is complicated and expensive (linear in time and temporary space) to
combine the results of solving the two independent subproblems with the
merging operation.
Quicksort uses the technique of divide-and-conquer in a different man-
ner. We proceed as follows:
1. Pick an arbitrary element of the array (the pivot).
2. Divide the array into two segments, those that are smaller and those
that are greater, with the pivot in between (the partition phase).
3. Recursively sort the segments to the left and right of the pivot.
In quicksort, dividing the problem into subproblems will be linear time,
but putting the results back together is immediate. This kind of trade-off is
frequent in algorithm design.


Let us analyze the asymptotic complexity of the partitioning phase of the algorithm. Say we have the array
3, 1, 4, 4, 7, 2, 8
and we pick 3 as our pivot. Then we have to compare each element of this
(unsorted!) array to the pivot to obtain a partition where 2, 1 are to the left
and 4, 7, 8, 4 are to the right of the pivot. We have picked an arbitrary order
for the elements in the array segments; all that matters is that all smaller
ones are to the left of the pivot and all larger ones are to the right.
Since we have to compare each element to the pivot, but otherwise
just collect the elements, it seems that the partition phase of the algorithm
should have complexity O(k), where k is the length of the array segment
we have to partition.
It should be clear that in the ideal (best) case, the pivot element will be
magically the median value among the array values. This just means that
half the values will end up in the left partition and half the values will end
up in the right partition. So we go from the problem of sorting an array of
length n to an array of length n/2. Repeating this process, we obtain the
following picture:

[Recursion-tree diagram:
level 0: n                          — 1 partition of size n:    O(n)
level 1: n/2, n/2                   — 2 partitions of size n/2: O(n)
level 2: n/4, n/4, n/4, n/4         — 4 partitions of size n/4: O(n)
...
bottom:  1, 1, ..., 1
Quicksort, best case: log(n) levels, O(n) per level]

At each level the total work is O(n) operations to perform the partition.
In the best case there will be O(log(n)) levels, leading us to the O(n log(n))
best-case asymptotic complexity.


How many recursive calls do we have in the worst case, and how long
are the array segments? In the worst case, we always pick either the small-
est or largest element in the array so that one side of the partition will be
empty, and the other has all elements except for the pivot itself. In the ex-
ample above, the recursive calls might proceed as follows (where we have
surrounded the unsorted part of the array with brackets):

array                    pivot
[3, 1, 4, 4, 8, 2, 7]    1
1, [3, 4, 4, 8, 2, 7]    2
1, 2, [3, 4, 4, 8, 7]    3
1, 2, 3, [4, 4, 8, 7]    4
1, 2, 3, 4, [4, 8, 7]    4
1, 2, 3, 4, 4, [8, 7]    7
1, 2, 3, 4, 4, 7, [8]

All other recursive calls are with the empty array segment, since we never
have any unsorted elements less than the pivot. We see that in the worst
case there are n − 1 significant recursive calls for an array of size n. The kth recursive call has to sort a subarray of size n − k, which proceeds by partitioning, requiring O(n − k) comparisons.
This means that, overall, for some constant c we have

c · (0 + 1 + 2 + · · · + (n − 1)) = c · n(n − 1)/2 ∈ O(n^2)

comparisons. Here we used the fact that O(p(n)) for a polynomial p(n) is
always equal to O(n^k), where k is the leading exponent of the polyno-
mial. This is because the largest exponent of a polynomial will eventually
dominate the function, and big-O notation ignores constant coefficients.
So quicksort has quadratic complexity in the worst case. How can we
mitigate this? If we could always pick the median among the elements in
the subarray we are trying to sort, then half the elements would be less and
half the elements would be greater. So in this case there would be only
log(n) recursive calls, where at each layer we have to do a total amount of
n comparisons, yielding an asymptotic complexity of O(n log(n)).
Unfortunately, it is not so easy to compute the median to obtain the
optimal partitioning. It turns out that if we pick a random element, its ex-
pected rank will be close enough to the median that the expected running
time of the algorithm is still O(n log(n)).


We really should make this selection randomly. With a fixed-pick strategy, there may be simple inputs on which the algorithm takes O(n^2) steps. For example, if we always pick the first element, then if we supply an array that is already sorted, quicksort will take O(n^2) steps (and similarly if it is “almost” sorted with a few exceptions)! If we pick the pivot randomly each time, the kind of array we get does not matter: the expected running time is always the same, namely O(n log(n)).¹
matter and beyond the scope of this course. This is an important example
of how to exploit randomness to obtain a reliable average case behavior,
no matter what the distribution of the input values.
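To make this concrete: using C0's rand library, a random pivot index in [lower, upper) could be computed as in the following sketch (a hypothetical helper, not necessarily what qsort.c0 does; note that rand may return a negative int, so the remainder must be adjusted):

#use <rand>

int random_pivot_index(rand_t gen, int lower, int upper)
//@requires lower < upper;
//@ensures lower <= \result && \result < upper;
{
  int r = rand(gen) % (upper - lower);
  if (r < 0) r += upper - lower;  // make the remainder non-negative
  return lower + r;
}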

4 The Quicksort Function


We now turn our attention to developing an imperative implementation of
quicksort, following our high-level description. We implement quicksort
in the function sort as an in-place sorting function that modifies a given
array instead of creating a new one. It therefore returns no value, which is
expressed by giving a return type of void.

void sort(int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(A, lower, upper);
{
...
}

Quicksort solves the same problem as selection sort, so their contract is the
same, but their implementation differs. We sort the segment A[lower ..upper )
of the array between lower (inclusively) and upper (exclusively). The pre-
condition in the @requires annotation verifies that the bounds are mean-
ingful with respect to A. The postcondition in the @ensures clause guaran-
tees that the given segment is sorted when the function returns. It does not
express that the output is a permutation of the input, which is required to
hold but is not formally expressed in the contract (see Exercise 1).
Before we start the body of the function, we should consider how to
terminate the recursion. We don’t have to do anything if we have an array
segment with 0 or 1 elements. So we just return if upper − lower ≤ 1.

¹Actually not quite, with the code that we have shown. Can you find the reason?


void sort(int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(A, lower, upper);
{
if (upper-lower <= 1) return;
...
}

Next we have to select a pivot element and call a partition function. We tell that function the index of the element that we chose as the pivot.
For illustration purposes, we use the middle element as a pivot (to work
reasonably well for arrays that are sorted already), but it should really be
a random element, as in the code in qsort.c0. We want partitioning to be
done in place, modifying the array A. Still, partitioning needs to return the
index mid of the pivot element because we then have to recursively sort
the two subsegments to the left and right of where the pivot is after
partitioning. So we declare:

int partition(int[] A, int lower, int pivot_index, int upper)
//@requires 0 <= lower && lower <= pivot_index;
//@requires pivot_index < upper && upper <= \length(A);
//@ensures lower <= \result && \result < upper;
//@ensures ge_seg(A[\result], A, lower, \result);
//@ensures le_seg(A[\result], A, \result+1, upper);
;

Here we use the auxiliary functions ge_seg (for greater or equal than segment) and le_seg (for less or equal than segment), where

• ge_seg(x, A, lower, mid) if x ≥ y for every y in A[lower..mid).

• le_seg(x, A, mid+1, upper) if x ≤ y for every y in A[mid+1..upper).

Their definitions can be found in the file sortutil.c0.


Some details on this specification: we require pivot_index to be a valid index in the array range, i.e., lower ≤ pivot_index < upper. In particular,
we require lower < upper because if they were equal, then the segment
could be empty and we cannot possibly pick a pivot element or return its
index.
Now we can fill in the remainder of the main sorting function.


void sort(int[] A, int lower, int upper)
//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(A, lower, upper);
{
if (upper-lower <= 1) return;
int pivot_index = lower + (upper-lower)/2; /* should be random */

int mid = partition(A, lower, pivot_index, upper);
sort(A, lower, mid);
sort(A, mid+1, upper);
return;
}
It is a simple but instructive exercise to reason about this program, using
only the contract for partition together with the pre- and postconditions
for sort (see Exercise 2).
To show that the sort function terminates, we have to show the array
segment becomes strictly smaller in each recursive call. First, mid − lower < upper − lower since mid < upper by the postcondition for partition. Second, upper − (mid + 1) < upper − lower because lower < mid + 1, also by the postcondition for partition.

5 Partitioning
The trickiest aspect of quicksort is the partitioning step, in particular since
we want to perform this operation in place. Let’s consider the situation when partition is called:

[Diagram: array segment A[lower..upper) = 2, 87, 21, 3, 12, 78, 97, 16, 89, 21, with pivot_index pointing at the element 16]

Perhaps the first thing we notice is that we do not know where the pivot
will end up in the partitioned array! That’s because we don’t know how
many elements in the segment are smaller and how many are larger than
the pivot. In particular, the return value of partition could be different
than the pivot index that we pass in, even if the element that used to be at
the pivot index in the array before calling partition will be at the returned
index when partition is done. One idea is to make a pass over the seg-
ment and count the number of smaller elements, move the pivot into its
place, and then scan the remaining elements and put them into their place.
Fortunately, this extra pass is not necessary. We start by moving the pivot
element out of the way, by swapping it with the rightmost element in the
array segment.

[Diagram: after swapping the pivot to the right end, the segment is 2, 87, 21, 3, 12, 78, 97, 21, 89, 16, with pivot = 16 now at index upper − 1]

Now the idea is to gradually work towards the middle, accumulating el-
ements less than the pivot on the left and elements greater than the pivot
on the right end of the segment (excluding the pivot itself). For this pur-
pose we introduce two indices, left and right. We start them out as lower and upper − 2.²

[Diagram: same segment, with left starting at lower (element 2) and right starting at upper − 2 (element 89); pivot = 16 at upper − 1]


²After lecture, it occurred to me that sticking with the convention that the right end is exclusive might have been slightly better. The invariants remain here as we developed them in class, so you can also see what code and conditions look like when bounds on an array segment are inclusive.


Since 2 < pivot, we can advance the left index: this element is in the proper
place.

[Diagram: 2, 87, 21, 3, 12, 78, 97, 21, 89, 16; left has advanced past 2, so A[lower..left) is ≤ pivot; the pivot 16 remains at the right end]

At this point, 87 > pivot, so we swap it into A[right] and decrement the
right index.

[Diagram: 2, 89, 21, 3, 12, 78, 97, 21, 87, 16; 87 was swapped to A[right] and right was decremented, so A[right+1..upper−1) is ≥ pivot]

Let’s take one more step: 89 > pivot, so we swap it into A[right] and decre-
ment the right index again.

[Diagram: 2, 21, 21, 3, 12, 78, 97, 89, 87, 16; 89 was swapped to A[right] and right was decremented again]


At this point we pause to read off the general invariants which will
allow us to synthesize the program. We see:
(1) A[lower..left) ≤ pivot

(2) pivot ≤ A[right+1..upper−1)

(3) A[upper−1] = pivot


We may not be completely sure about the termination condition, but we
can play the algorithm through to its end and observe:

[Diagram: final state 2, 12, 3, 21, 78, 97, 21, 89, 87, 16; the elements ≤ pivot are at the left, the elements ≥ pivot at the right, and the pivot 16 at upper − 1]

Where do left and right need to be, according to our invariants? By invari-
ant (1), all elements up to but excluding left must be less or equal to pivot.
To guarantee we are finished, therefore, the left must address the element
21 at lower + 3. Similarly, invariant (2) states that the pivot must be less or
equal to all elements starting from right + 1 up to but excluding upper − 1.
Therefore, right must address the element 3 at lower + 2.

[Diagram: 2, 12, 3, 21, 78, 97, 21, 89, 87, 16, with right at lower + 2 (element 3) and left at lower + 3 (element 21)]

This means after the last iteration, just before we exit the loop, we have
left = right + 1, and throughout:


(4) lower ≤ left ≤ right + 1 ≤ upper − 1

Now comes the last step: since left = right + 1, pivot ≤ A[left] and we can swap the pivot at upper − 1 with the element at left to complete the partition operation. We can also see that left should be returned as the new position of the pivot element.

6 Implementing Partitioning
Now that we understand the algorithm and its correctness proof, it remains
to turn these insights into code. We start by swapping the pivot element to
the end of the segment.

int partition(int[] A, int lower, int pivot_index, int upper)
//@requires 0 <= lower && lower <= pivot_index;
//@requires pivot_index < upper && upper <= \length(A);
//@ensures lower <= \result && \result < upper;
//@ensures ge_seg(A[\result], A, lower, \result);
//@ensures le_seg(A[\result], A, \result+1, upper);
{
int pivot = A[pivot_index];
swap(A, pivot_index, upper-1);

...
}

At this point we initialize left and right to lower and upper − 2, respectively.
We have to make sure that the invariants are satisfied when we enter the
loop for the first time, so let’s write these.

int left = lower;
int right = upper-2;
while (left <= right)
//@loop_invariant lower <= left && left <= right+1 && right+1 < upper;
//@loop_invariant ge_seg(pivot, A, lower, left);
//@loop_invariant le_seg(pivot, A, right+1, upper-1);
{
...
}


The crucial observation here is that lower < upper by the precondition of the function. Therefore left ≤ upper − 1 = right + 1 when we first enter the loop, since right = upper − 2. The segments A[lower..left) and A[right+1..upper−1) will both be empty, initially.
The code in the body of the loop just compares the element at index left
with the pivot and either increments left, or swaps the element to A[right].

int left = lower;
int right = upper-2;
while (left <= right)
//@loop_invariant lower <= left && left <= right+1 && right+1 < upper;
//@loop_invariant ge_seg(pivot, A, lower, left);
//@loop_invariant le_seg(pivot, A, right+1, upper-1);
{
if (A[left] <= pivot) {
left++;
} else { //@assert A[left] > pivot;
swap(A, left, right);
right--;
}
}

Now we just note the observations about the final loop state with an assertion, swap the pivot into place, and return the index left. The complete function is given below, for reference.


int partition(int[] A, int lower, int pivot_index, int upper)
//@requires 0 <= lower && lower <= pivot_index;
//@requires pivot_index < upper && upper <= \length(A);
//@ensures lower <= \result && \result < upper;
//@ensures ge_seg(A[\result], A, lower, \result);
//@ensures le_seg(A[\result], A, \result+1, upper);
{
int pivot = A[pivot_index];
swap(A, pivot_index, upper-1);

int left = lower;
int right = upper-2;
while (left <= right)
//@loop_invariant lower <= left && left <= right+1 && right+1 < upper;
//@loop_invariant ge_seg(pivot, A, lower, left);
//@loop_invariant le_seg(pivot, A, right+1, upper-1);
{
if (A[left] <= pivot) {
left++;
} else { //@assert A[left] > pivot;
swap(A, left, right);
right--;
}
}
//@assert left == right+1;
//@assert A[upper-1] == pivot;

swap(A, left, upper-1);
return left;
}


Exercises
Exercise 1 In this exercise we explore strengthening the contracts on in-place
sorting functions.

1. Write a function is_permutation which checks that one segment of an array is a permutation of another.

2. Extend the specifications of sorting and partitioning to include the permutation property.

3. Discuss any specific difficulties or problems that arise. Assess the outcome.

Exercise 2 Prove that the precondition for sort together with the contract for
partition implies the postcondition. During this reasoning you may also assume
that the contract holds for recursive calls.

Exercise 3 Our implementation of partitioning did not pick a random pivot, but
took the middle element. Construct an array with seven elements on which our
algorithm will exhibit its worst-case behavior, that is, on each step, one of the par-
titions is empty.

Exercise 4 An alternative way to track the unscanned part of the array segment
during partitioning is to make the segment A[left..right) exclusive on the right.
Rewrite the code for partition, including its invariants, for this version of the
indices.

Exercise 5 An alternative way of implementing the partition function is to use extra memory for temporary storage. Develop such an implementation of

int partition(int[] A, int lower, int pivot_index, int upper)
//@requires 0 <= lower && lower <= pivot_index;
//@requires pivot_index < upper && upper <= \length(A);
//@ensures lower <= \result && \result < upper;
//@ensures ge_seg(A[\result], A, lower, \result);
//@ensures le_seg(A[\result], A, \result+1, upper);



Lecture Notes on
Stacks & Queues
15-122: Principles of Imperative Computation
Frank Pfenning, André Platzer, Rob Simmons

Lecture 9
February 12, 2013

1 Introduction
In this lecture we introduce queues and stacks as data structures, e.g., for
managing tasks. They follow similar principles of organizing the data. Both provide operations for putting new elements in; they differ in the order in which the elements come out of the data structure again. Both queues and stacks, as well as many other data structures, could be added to the programming language, but they can be implemented easily as a library in C0.
of queues and stacks and defer a detailed implementation to the next lec-
ture.
Relating this to our learning goals, we have
Computational Thinking: We illustrate the power of abstraction by con-
sidering both client-side and library-side of the interface to a data
structure.
Algorithms and Data Structures: We are looking at queues and stacks as
important data structures, we introduce abstract datatypes by exam-
ple.
Programming: Use and design of interfaces.

2 The Stack Interface


Stacks are data structures that allow us to insert and remove items. They operate like a stack of papers or books on our desk: we add new things to
the top of the stack to make the stack bigger, and remove items from the top
as well to make the stack smaller. This makes stacks a LIFO (Last In First
Out) data structure – the data we have put in last is what we will get out
first.
Before we consider the implementation of a data structure it is helpful
to consider the interface. We then program against the specified interface.
Based on the description above, we require the following functions:
/* type elem must be defined */

bool stack_empty(stack S); /* O(1), check if stack empty */
stack stack_new(); /* O(1), create new empty stack */
void push(stack S, elem e); /* O(1), add item on top of stack */
elem pop(stack S) /* O(1), remove item from top */
//@requires !stack_empty(S);
;
We want the creation of a new (empty) stack as well as pushing and pop-
ping an item all to be constant-time operations, as indicated by O(1). Fur-
thermore, pop is only possible on non-empty stacks. This is a fundamental
aspect of the interface to a stack, that a client can only read data from a
non-empty stack. So we include this as a requires contract in the interface.
We are being quite abstract here — we do not write, in this file, what
type the elements of the stack have to be. Instead we assume that at the top
of the file, or before this file is read, we have already defined a type elem
for the type of stack elements. We say that the implementation is generic
or polymorphic in the type of the elements. Unfortunately, neither C nor C0
provide a good way to enforce this in the language and we have to rely on
programmer discipline.
In the future, we will sometimes indicate that we have a typedef waiting
to be filled in by the client by writing the following:
typedef _________ elem;
This is not actually valid C0, but the client using this library will be able
to fill in the underscores with a valid type to make the stack a stack of this
type. In this example, we will assume that the client wrote
typedef string elem;
The critical point here is that this is a choice that is up to the user of the
library (the client), and it is not a choice that the stack library needs to know
or care about.
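For example, a different client could obtain a stack of integers, without any change to the stack library itself, by writing instead:

typedef int elem;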


3 Using the Stack Interface


We play through some simple examples to illustrate the idea of a stack and
how to use the interface above. We write a stack as

x1, x2, . . . , xn

where x1 is the bottom of the stack and xn is the top of the stack. We push
elements on the top and also pop them from the top.
For example:

Stack         Command                   Other variables
              stack S = stack_new();
              push(S, "a");
"a"           push(S, "b");
"a", "b"      string e = pop(S);        e = "b"
"a"           push(S, "c");             e = "b"
"a", "c"      e = pop(S);               e = "c"
"a"                                     e = "c"

4 One Stack Implementation (With Arrays)


Any programming language is going to come with certain data structures
“built-in.” Arrays, the only really complex data structure we have used so
far in this class, are one example in C0. Other data structures, like stacks
and queues, need to be built in to the language using existing language
features.
We will get to a more proper implementation of stacks in the next lec-
ture, using linked lists. For this lecture we will implement stacks by using
the familiar arrays that we have already been using so far in this class.
The idea is to put all data elements in an array and maintain an integer
top, which is the index where we read off elements.

[Diagram: array of length 11 with "a", "b", "c" stored at indices 1–3; bottom marks index 0, top marks index 3]


To help identify the similarities with the queue implementation, we decide
to also remember an integer bottom, which is the index of the bottom of the
stack. (The bottom will, in fact, remain 0.)
With this design decision, we do not have to handle the bottom of the
stack much different than any other element on the stack. The difference is
that the data at the bottom of the stack is meaningless and will not be used
in our implementation.
There appears to be a very big limitation to this design of stacks: our
stack can’t contain more than 9 elements, like a pile of books on our desk
that cannot grow too high lest it reach the ceiling or fall over. There are
multiple solutions to this problem, but for this lecture we will be content to
work with stacks that have a limited maximum capacity.

4.1 Structs and data structure invariants


Currently, our picture of a stack includes three different things: an array
containing the struct data, an integer indicating where the top is, and an
integer indicating where the bottom is. This is similar to the situation in
Homework 1 where we had data (an array of pixels) and two integers, a
width and a height.
C0 has a feature that allows us to bundle these things up into a struct
rather than passing around all the pieces separately. We define:
struct stack_header {
string[] data;
int top;
int bottom;
};
typedef struct stack_header* stack;
What this notation means exactly, and especially what the part with
struct stack_header* is all about, will be explained in the next lecture.
(These are pointers and it is crucial to understand them, but we defer this
topic for now.) For now, it is sufficient to think of this as providing a nota-
tion for bundling aggregate data. When we have a struct S of type stack,
we can refer to the data as S->data, the integer representing the top of the
stack as S->top, and the integer representing the bottom of the stack as
S->bottom.
When does a struct of this type represent a valid stack? Whenever we
define a new data type representation we should first think about the data
structure invariants. Making these explicit is important as we think about
and write the pre- and postconditions for functions that implement the in-
terface. Here, it is a simple check of making sure that the bottom and top
indices are in the range of the array and that bottom stays at 0, where we
expect it to be.

bool is_stack(stack S)
{
if (!(S->bottom == 0)) return false;
if (!(S->bottom <= S->top)) return false;
//@assert S->top < \length(S->data);
return true;
}

WARNING: This specification function is missing something very important (a check for NULL) – we will return to this next time!
When we write specification functions, we use a style of repeatedly say-
ing

if (!(some invariant of the data structure)) return false;

so that we can read off the invariants of the data structure. A specification
function like is_stack should be safe – it should only ever return true or
false or raise an assertion violation – and if possible it should avoid rais-
ing an assertion violation. Assertion violations are sometimes unavoidable
because we can only check the length of an array inside of the assertion
language.

4.2 Checking for emptiness


To check if the stack is empty, we only need to check whether top and
bottom are the same number.

bool stack_empty(stack S)
//@requires is_stack(S);
{
return S->top == S->bottom;
}


4.3 Popping from a stack


To pop an element from the stack we just look up the data that is stored
at the position indicated by the top field of the stack in the array S->data
of the data field of the stack. To indicate that this element has now been
removed from the stack, we decrement the top field of the stack. We go
from

[Diagram: before the pop, "a", "b", "c" at indices 1–3; bottom = 0, top = 3]

to

[Diagram: after the pop, top = 2, with an X over the "c" at index 3]

The "c" can still be present in the array at position 3, but it is now a part of
the array that we don’t care about, which we indicate by putting an X over
it. In code, popping looks like this:
string pop(stack S)
//@requires is_stack(S);
//@requires !stack_empty(S);
//@ensures is_stack(S);
{
string r = S->data[S->top];
S->top--;
return r;
}
Notice that contracts are cumulative. Since we already indicated
//@requires !stack_empty(S);
in the interface of pop, we would not have to repeat this requires clause in
the implementation. We repeat it regardless to emphasize its importance.


4.4 Pushing onto a stack


To push an element onto the stack, we increment the top field of the stack
to reflect that there are more elements on the stack. And then we put the
element e at position top into the array S->data that is stored in the data
field. While this is simple, it is still a good idea to draw a diagram. We go
from

[Diagram: before the push, "a", "b", "c" at indices 1–3; bottom = 0, top = 3]

to

[Diagram: after pushing "e", the array holds "a", "b", "c", "e" at indices 1–4; top = 4]

In code:

void push(stack S, string e)
//@requires is_stack(S);
//@ensures is_stack(S);
{
S->top++;
S->data[S->top] = e;
}

Why is the array access S->data[S->top] safe? Is it even safe? At this
point, it is important to note that it is not safe if we ever try to push more el-
ements on the stack than we have reserved space for. We fully address this
shortcoming of our stack implementation in the next lecture. What we can
do right now to address the issue is to redesign the struct stack_header
by adding a capacity field that remembers the length of the array of the
data field:


struct stack_header {
string[] data;
int top;
int bottom;
int capacity; // capacity == \length(data);
};
typedef struct stack_header* stack;
Giving us the following updated view of array-based stacks:

[Diagram: array with "a", "b", "c" at indices 1–3; bottom = 0, top = 3, and capacity marking the length of the data array]

The comment that capacity == \length(data) is helpful for indicating what the intent of capacity is, but it is preferable for us to modify our
is_stack function to account for the change. (The WARNING from before
still applies here.)

bool is_stack(stack S)
{
if (!(S->bottom == 0)) return false;
if (!(S->bottom <= S->top)) return false;
if (!(S->top < S->capacity)) return false;
//@assert S->capacity == \length(S->data);
return true;
}

With a capacity in hand, we check for sufficient space with an explicit assert statement before we try to access the array or change top.

void push(stack S, string e)
//@requires is_stack(S);
//@ensures is_stack(S);
{
assert(S->top < S->capacity - 1); // otherwise no space left
S->top++;
S->data[S->top] = e;
}


This assertion can indeed fail if the client tries to push too many ele-
ments on the stack, which is why we use a hard assert – an assertion that
will run whether or not we compile with -d. The alternative would be to
expose the capacity of the stack to the user with a stack_full function
and then add a precondition //@requires !stack_full(S) to our push()
function.
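Such a stack_full function is not part of our interface, but given the capacity field it would be a one-liner inside the library; a minimal sketch:

bool stack_full(stack S)
//@requires is_stack(S);
{
  return S->top == S->capacity - 1;  // no room for another push
}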

4.5 Creating a new stack


For creating a new stack, we allocate a struct stack_header and initialize
the top and bottom numbers to 0.

stack stack_new()
//@ensures stack_empty(\result);
//@ensures is_stack(\result);
{
stack S = alloc(struct stack_header);
S->bottom = 0;
S->top = 0;
S->capacity = 100; // arbitrary resource bound
S->data = alloc_array(elem, S->capacity);
return S;
}

As shown above, we also need to allocate an array data to store the ele-
ments in. At this point, at the latest, we realize a downside of our stack im-
plementation. If we want to implement stacks in arrays in the simple way
that we just did, the trouble is that we need to decide its capacity ahead
of time. That is, we need to decide how many elements at maximum will
ever be allowed in the stack at the same time. Here, we arbitrarily choose
the capacity 100, but this gives us a rather poor implementation of stacks in
case the client needs to store more data. We will see how to solve this issue
with a better implementation of stacks in the next lecture.
This completes the implementation of stacks, which are a very simple
and pervasive data structure.


5 Abstraction
An important point about formulating a precise interface to a data structure
like a stack is to achieve abstraction. This means that as a client of the data
structure we can only use the functions in the interface. In particular, we
are not permitted to use or even know about details of the implementation
of stacks.
Let’s consider an example of a client-side program. We would like to
examine the element on top of the stack without removing it from the stack.
Such a function would have the declaration

string peek(stack S)
//@requires !stack_empty(S);
;

The first instinct might be to write it as follows:

string peek(stack S)
//@requires !stack_empty(S);
{
return S->data[S->top];
}

However, this would be completely wrong. Let’s recall the interface:

bool stack_empty(stack S); /* O(1), check if stack empty */
stack stack_new(); /* O(1), create new empty stack */
void push(stack S, string e); /* O(1), add item on top of stack */
string pop(stack S); /* O(1), remove item from top */
//@requires !stack_empty(S);
;

We don’t see any top field, or any data field, so accessing these as a
client of the data structure would violate the abstraction. Why is this so
wrong? The problem is that if the library implementer decided to improve
the code, or perhaps even just rename some of the structures to make it eas-
ier to read, then the client code will suddenly break! In fact, we will provide
a different implementation of stacks in the next lecture, which would make
the above implementation of peek break. With the above client-side im-
plementation of peek, the stack interface does not serve the purpose it is
intended for, namely provide a reliable way to work with a data structure.
Interfaces are supposed to separate the implementation of a data structure

L ECTURE N OTES F EBRUARY 12, 2013


Stacks & Queues L9.11

in a clean way from its use so that we can change one of the two without
affecting the other.
So what can we do? It is possible to implement the peek operation
without violating the abstraction! Consider how, before you read on.


The idea is that we pop the top element off the stack, remember it in a
temporary variable, and then push it back onto the stack before we return.

string peek(stack S)
//@requires !stack_empty(S);
{
string x = pop(S);
push(S, x);
return x;
}

This is clearly less efficient: instead of just looking up the fields of a struct
and accessing an element of an array we actually have to pop an element
and then push it back onto the stack. However, it is still a constant-time
operation (O(1)) since both pop and push are constant-time operations.
Nonetheless, we have a possible argument to include a function peek in
the interface and implement it library-side instead of client-side to save a
small constant of time.
If we are actually prepared to extend the interface, then we can go back
to our original implementation.

string peek(stack S)
//@requires !stack_empty(S);
{
return S->data[S->top];
}

Is this a good implementation? Not quite. First we note that inside the
library we should refer to elements as having type elem, not string. For
our running example, this is purely a stylistic matter because these two
are synonyms. But, just as it is important that clients respect the library
interface, it is important that the library respect the client interface. In this
case, that means that the users of a stack can, without changing the library,
decide to change the definition of elem type in order to store different data
in the stack.
Second we note that we are now missing a precondition. In order to
even check if the stack is non-empty, we first need to be assured that it
is a valid stack. On the client side, all elements of type stack come from
the library, and any violation of data structure invariants could only be
discovered when we hand it back through the library interface to a function
implemented in the library. Therefore, the client can assume that values of

L ECTURE N OTES F EBRUARY 12, 2013


Stacks & Queues L9.13

type stack are valid and we don’t have explicit pre- or post-conditions for
those. Inside the library, however, we are constantly manipulating the data
structure in ways that break and then restore the invariants, so we should
check if the stack is indeed valid.
From these two considerations we obtain the following code for inside
the library:

elem peek(stack S)
//@requires is_stack(S);
//@requires !stack_empty(S);
{
return S->data[S->top];
}

6 Computing the Size of a Stack


Let’s exercise our data structure once more by developing two implemen-
tations of a function that returns the size of a stack: one on the client’s side,
using only the interface, and one on the library’s side, exploiting the data
representation. Let’s first consider a client-side implementation, using only
the interface so far.

int stack_size(stack S);

Again, we encourage you to consider this problem and program it before you read on.


First we reassure ourselves that it will not be a simple operation. We do not have access to the underlying data representation (in fact, we cannot know how it is implemented), so the only thing we can do is pop all the elements off the stack.
This can be accomplished with a prototypical while-loop that finishes as
soon as the stack is empty.
int stack_size(stack S) {
int count = 0;
while (!stack_empty(S)) {
pop(S);
count++;
}
return count;
}
However, this function has a big problem: in order to compute the size
we have to destroy the stack! Clearly, there may be situations where we
would like to know the number of elements in a stack without deleting all
of its elements. Fortunately, we can use the idea from the peek function in
amplified form: we maintain a new temporary stack T to hold the elements
we pop from S. Once we are done counting, we push them back onto S to
repair the damage.
int stack_size(stack S) {
stack T = stack_new();
int count = 0;
while (!stack_empty(S)) {
push(T, pop(S));
count++;
}
while (!stack_empty(T)) {
push(S, pop(T));
}
return count;
}
The complexity of this function is clearly O(n), where n is the number of
elements in the stack S, since we traverse each while loop n times, and
perform a constant number of operations in the body of both loops. For
that, we need to know that push and pop are constant time (O(1)).
What about a library-side implementation of stack_size? This can be
done more efficiently.


int stack_size(stack S)
//@requires is_stack(S);
{
return S->top - S->bottom;
}

7 The Queue Interface


A queue is a data structure where we add elements at the back and remove
elements from the front. In that way a queue is like “waiting in line”: the
first one to be added to the queue will be the first one to be removed from
the queue. This is also called a FIFO (First In First Out) data structure.
Queues are common in many applications. For example, when we read a
book from a file as in Assignment 2, it would be natural to store the words
in a queue so that when we are finished reading the file the words are in the
order they appear in the book. Another common example is buffers for
network communication that temporarily store packets of data arriving on
a network port. Generally speaking, we want to process them in the order
that they arrive.
Here is our interface:

/* type elem must be defined */

bool queue_empty(queue Q); /* O(1), check if queue is empty */
queue queue_new(); /* O(1), create new empty queue */
void enq(queue Q, elem s); /* O(1), add item at back */
elem deq(queue Q); /* O(1), remove item from front */
//@requires !queue_empty(Q);
;

Dequeuing is only possible on non-empty queues, which we indicate by a requires contract in the interface.
We can write out this interface without committing to an implementa-
tion of queues. In particular, the type queue remains abstract in the sense
that we have not given its definition. This is important so that different
implementations of the functions in this interface can choose different rep-
resentations. Clients of this data structure should not care about the inter-
nals of the implementation. In fact, they should not be allowed to access
them at all and operate on queues only through the functions in this inter-
face. Some languages with strong module systems enforce such abstraction
rigorously. In C, it is mostly a matter of adhering to conventions.

8 Using the Queue Interface


We play through some simple examples to illustrate the idea of a queue
and how to use the interface above. We write a queue as

x1, x2, . . . , xn

where x1 is the front of the queue and xn is the back of the queue. We enqueue
elements in the back and dequeue them from the front.
For example:

Queue         Command                   Other variables
              queue Q = queue_new();
              enq(Q, "a");
"a"           enq(Q, "b");
"a", "b"      string s = deq(Q);        s = "a"
"b"           enq(Q, "c");              s = "a"
"b", "c"      s = deq(Q);               s = "b"
"c"                                     s = "b"

9 Copying a Queue Using Its Interface


Suppose we have a queue Q and want to obtain a copy of it. That is, we
want to create a new queue C and implement an algorithm that will make
sure that Q and C have the same elements and in the same order. How can
we do that? Before you read on, see if you can figure it out for yourself.

L ECTURE N OTES F EBRUARY 12, 2013


Stacks & Queues L9.17

The first thing to note is that

queue C = Q;

will not have the effect of copying the queue Q into a new queue C. Just
as for the case of array, this assignment makes C and Q alias, so if we
change one of the two, for example enqueue an element into C, then the
other queue will have changed as well. Just as for the case of arrays, we
need to implement a function for copying the data.
The queue interface provides functions that allow us to dequeue data
from the queue, which we can do as long as the queue is not empty. So we
create a new queue C. Then we read all data from queue Q and put it into
the new queue C.

queue C = queue_new();
while (!queue_empty(Q)) {
enq(C, deq(Q));
}
//@assert queue_empty(Q);

Now the new queue C will contain all data that was previously in Q, so C
is a copy of what used to be in Q. But there is a problem with this approach.
Before you read on, can you find out which problem?


Queue C now is a copy of what used to be in Q before we started copy-


ing. But our copying process was destructive! By dequeueing all elements
from Q to put them into C, Q has now become empty. In fact, our assertion
at the end of the above loop even indicated queue_empty(Q). So what we
need to do is put all data back into Q when we are done copying it all into
C. But where do we get it from? We could read it from the copy C to put
it back into Q, but, after that, the copy C would be empty, so we are back
to where we started from. Can you figure out how to copy all data into C
and make sure that it also ends up in Q? Before you read on, try to find a
solution for yourself.


We could try to enqueue all data that we have read from Q back into Q
before putting it into C.

queue C = queue_new();
while (!queue_empty(Q)) {
string s = deq(Q);
enq(Q, s);
enq(C, s);
}
//@assert queue_empty(Q);

But there is something very fundamentally wrong with this idea. Can you
figure it out?


The problem with the above attempt is that the loop will never termi-
nate unless Q is empty to begin with. For every element that the loop body
dequeues from Q, it enqueues one element back into Q. That way, Q will
always have the same number of elements and will never become empty.
Therefore, we must go back to our original strategy and first read all ele-
ments from Q. But instead of putting them into C, we will put them into a
third queue T for temporary storage. Then we will read all elements from
the temporary storage T and enqueue them into both the copy C and back
into the original queue Q. At the end of this process, the temporary queue
T will be empty, which is fine, because we will not need it any longer. But
both the copy C and the original queue Q will be replenished with all the
elements that Q had originally. And C will be a copy of Q.

queue queue_copy(queue Q) {
queue T = queue_new();
while (!queue_empty(Q)) {
enq(T, deq(Q));
}
//@assert queue_empty(Q);
queue C = queue_new();
while (!queue_empty(T)) {
string s = deq(T);
enq(Q, s);
enq(C, s);
}
//@assert queue_empty(T);
return C;
}

Note that when queue_copy returns, neither C nor Q will be empty,
except if Q was empty to begin with, in which case both C and Q will still
be empty in the end.
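
As a quick sanity check, a client can exercise queue_copy through the
interface alone. This is a hypothetical test; it assumes, as in the examples
above, that elem is string:

queue Q = queue_new();
enq(Q, "a");
enq(Q, "b");
queue C = queue_copy(Q);
string s1 = deq(Q);   /* s1 == "a": Q still has its elements */
string s2 = deq(C);   /* s2 == "a": C has the same elements, in order */

Both dequeues return "a", confirming that Q keeps its contents and that C
is an independent copy.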


10 The Queue Implementation


In this lecture, we implement the queue using an array, similar to how we
have implemented stacks in this lecture.

[Diagram: the data array, indices 0 to 10, holds "a", "b", "c" at indices 0 to 2; front marks index 0, back marks index 3 (the unused cell, marked X); capacity is the length of the array.]

A queue is implemented as a struct with a front and back field. The


front field is the index of the front of the queue, the back field is the index
of the back of the queue. We need both so that we can dequeue (at the front)
and enqueue (back).
In the stack, we did not use anything outside the range (bottom, top],
and for queues, we do not use anything outside the range [front, back).
Again, we mark this in diagrams with an X.
The above picture yields the following definition, where we will again
remember the capacity of the queue, i.e., the length of the array stored in
the data field.

struct queue_header {
elem[] data;
int front;
int back;
int capacity;
};
typedef struct queue_header* queue;

When does a struct of this type represent a valid queue? In fact, when-
ever we define a new data type representation we should first think about
the data structure invariants. Making these explicit is important as we
think about and write the pre- and postconditions for functions that im-
plement the interface.
What we need here is simply that the front and back are within the
array bounds for array data and that the capacity is not too small. The
back of the queue is within the array but not used (marked X), so we decide to


require that the capacity of a queue be at least 2 to make sure we can store
at least one element. (The WARNING about NULL still applies here.)

bool is_queue(queue Q)
{
if (Q->capacity < 2) return false;
if (Q->front < 0 || Q->front >= Q->capacity) return false;
if (Q->back < 0 || Q->back >= Q->capacity) return false;
//@assert Q->capacity == \length(Q->data);
return true;
}

To check if the queue is empty we just compare its front and back. If
they are equal, the queue is empty; otherwise it is not. We require that we
are being passed a valid queue. Generally, when working with a data struc-
ture, we should always require and ensure that its invariants are satisfied
in the pre- and post-conditions of the functions that manipulate it. Inside
the function, we will generally temporarily violate the invariants.

bool queue_empty(queue Q)
//@requires is_queue(Q);
{
return Q->front == Q->back;
}

To dequeue an element, we only need to increment the field front,


which represents the index in data of the front of the queue. To emphasize
that we never use portions of the array outside the front to back range, we
first save the dequeued element in a temporary variable so we can return it
later. In diagrams:

[Diagram: the same array holding "a", "b", "c"; after the dequeue, front has advanced from index 0 to index 1, while back and the array contents are unchanged.]


And in code:
elem deq(queue Q)
//@requires is_queue(Q);
//@requires !queue_empty(Q);
//@ensures is_queue(Q);
{
elem e = Q->data[Q->front];
Q->front++;
return e;
}
To enqueue something, that is, add a new item to the back of the queue,
we just write the data (here: a string) into the extra element at the back, and
increment back. You should draw yourself a diagram before you write this
kind of code. Here is a before-and-after diagram for inserting "e":

[Diagram, before: the array holds "a", "b", "c" with front at index 0 and back at index 3. After enq of "e": "e" has been written at index 3 and back has advanced to index 4.]

In code:
void enq(queue Q, elem e)
//@requires is_queue(Q);
//@ensures is_queue(Q);
//@ensures !queue_empty(Q);
{
assert(Q->back < Q->capacity-1); // otherwise out of resources
Q->data[Q->back] = e;
Q->back++;
}


To obtain a new empty queue, we allocate a queue header struct and initialize
both front and back to 0, the index of the first element of the array. We do not
initialize the elements in the array because their contents are irrelevant until some data is
put in. It is good practice to always initialize memory if we care about its
contents, even if it happens to be the same as the default value placed there.

queue queue_new()
//@ensures is_queue(\result);
//@ensures queue_empty(\result);
{
queue Q = alloc(struct queue_header);
Q->front = 0;
Q->back = 0;
Q->capacity = 100;
Q->data = alloc_array(elem, Q->capacity);
return Q;
}

Observe that, unlike the queue implementation, the queue interface


only uses a single contract: that deq requires a non-empty queue to work.
The queue implementation has several additional implementation contracts.
All queue implementation functions use is_queue(Q) in their requires and
ensures contract. The only exception is the queue_new implementation,
which ensures the analogue is_queue(\result) instead. These is_queue
contracts do not appear in the queue interface because is_queue itself does
not appear in the interface, because it is an internal data structure invariant. If
the client obeys the interface abstraction, all he can do with queues is create
them via queue_new and then pass them to the various queue operations in
the interface. At no point in this process does the client have the oppor-
tunity to tamper with the queue data structure to make it fail is_queue,
unless the client violates the interface.
But there are other additional contracts in the queue implementation,
which we want to use to check our implementation, and they still are not
part of the interface. For example, we could have included the following
additional contracts in the interface

queue queue_new() /* O(1), create new empty queue */


//@ensures queue_empty(\result);
;
void enq(queue Q, elem s) /* O(1), add item at back */
//@ensures !queue_empty(Q);


Those contracts need to hold for all queue implementations. Why did we
decide not to include them? The reason is that there are not many situa-
tions in which this knowledge about queues is useful, because we rarely
want to dequeue an element right after enqueueing it. This is in contrast
to the //@requires !queue_empty(Q) contract of deq, which is critical for
the client to know about, because he can only dequeue elements from non-
empty queues and has to check for non-emptiness before calling deq.
Similar observations hold for our rationale for designing the stack in-
terface.

11 Bounded versus Unbounded Stacks & Queues


Both the queue and the stack implementation that we have seen so far have
a fundamental limitation. They are of bounded capacity. However large
we allocate their data arrays, there is a way of enqueuing elements into the
queue or pushing elements onto the stack that requires more space than
the array has had in the first place. And if that happens, the enq or push
operations will fail an assertion because of a resource bound that the client
has no way of knowing about. This is bad, because the client would have
to expect that any of his enq or push operations might fail, because he does
not know about the capacity of the queue and has no way of influencing
this.
One way of solving this problem would be to add operations into the
interface that make it possible to check whether a queue is full.

bool queue_full(queue Q);

Then we change the precondition of enq to require that elements can only
be enqueued if the queue is not full

void enq(queue Q, elem s)


//@requires !queue_full(Q)
....
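
For the array-based implementation from earlier in this lecture, such a
check is easy to provide. A minimal sketch, relying on the representation
where the cell at back is kept unused:

bool queue_full(queue Q)
//@requires is_queue(Q);
{
  return Q->back == Q->capacity - 1;
}

This matches the assertion in enq: enqueueing succeeds exactly when back
has not yet reached the last index of the array.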

Similarly, we could add an operation to the interface of stacks to check


whether the stack is full

bool stack_full(stack S);

And require that pushing is only possible if the stack is not full


void push(stack S, elem s)


//@requires !stack_full(S)
....
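
Again, a sketch for the array-based stack is a one-liner, under the
assumption that the stack_header stores a capacity field and uses the
element range (bottom, top] as described above:

bool stack_full(stack S)
//@requires is_stack(S);
{
  return S->top == S->capacity - 1;  /* no room above the current top */
}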

The advantage of this design is that the client now has a way of checking
whether there still is space in the stack/queue. The downside, however, is
that the client still does not have a way of increasing the capacity if he wants
to store more data in it.
In the next lecture, we will see a better implementation of stacks and
queues that does not have any of those capacity bounds. That implemen-
tation uses pointers and linked lists.


Exercises
Exercise 1 Can you implement a version of stack that does not use the bottom
field in the struct stack_header?

Exercise 2 Consider what would happen if we pop an element from the empty
stack when contracts are not checked. When does an error arise?

Exercise 3 The check in is_queue requires the capacity to be at least 2 to have
space for the unused data at back. Suppose we weakened the check to require only
a capacity of at least 1. Could we actually enqueue data into a queue with
capacity == 1? If not, explain why is_queue needs the stronger requirement to
make sure the queue can hold at least one element before it runs out of space.

Exercise 4 Our queue implementation wastes a lot of space unnecessarily. After


enqueuing and dequeueing a number of elements, the back may reach the capacity
limit. If the front has moved on, then there is a lot of space wasted in the beginning
of the data array. How can you change the implementation to reuse this storage
for enqueuing elements? How do you need to change the implementation of enq
and deq for that purpose?

Exercise 5 Our queue design always “wasted” one element that we marked X.
Can we save this memory and implement the queue without extra elements? What
are the tradeoffs and alternatives when implementing a queue?

Exercise 6 The stack implementation using arrays may run out of space if its
capacity is exceeded. Can you think of a way of implementing unbounded stacks
stored in an array?



Lecture Notes on
Pointers
15-122: Principles of Imperative Computation
Frank Pfenning, Rob Simmons

Lecture 10
February 14, 2013

1 Introduction
In this lecture we complete our discussion of types in C0 by discussing
pointers and structs, two great tastes that go great together. We will discuss
using contracts to ensure that pointer accesses are safe, as well as the use
of linked lists to implement the stack and queue interfaces that were
introduced last time. The linked list implementation of stacks and queues
allows us to handle lists of any length.
Relating this to our learning goals, we have

Computational Thinking: We emphasize the importance of abstraction by


producing a second implementation of the stacks and queues we in-
troduced in the last lecture.

Algorithms and Data Structures: Linked lists are a fundamental data struc-
ture.

Programming: We will see structs and pointers, and the use of recursion in
the definition of structs.

2 Structs and pointers


So far in this course, we’ve worked with five different C0 types – int, bool,
char, string, and arrays t[] (there is an array type t[] for every type t). The
character, string, Boolean, and integer values that we manipulate, store lo-
cally, and pass to functions are just the values themselves, but when we


consider arrays, the things we store in assignable variables or pass to func-


tions are addresses, references to the place where the data stored in the array
can be accessed. The picture we work with looks like this:

[Diagram: the local variables char c = '\n' and int i = 4 hold their values directly, while string[] A holds the address of an array allocated elsewhere in memory, containing "b", "e", "e", "f".]

An array allows us to store and access some number of values of the


same type (which we reference as A[0], A[1], and so on). The next data
structure we will consider is the struct. A struct can be used to aggregate
together different types of data, which helps us to create data structures.
Compare this to arrays, which aggregate elements of the same type.
Structs must be explicitly declared. If we think of an image, we want to
store an array of pixels alongside the width and height of the image, and a
struct allows us to do that:

typedef int pixel;

struct img_header {
pixel[] data;
int width;
int height;
};

Here data, width, and height are not variables, but fields of the struct.
The declaration expresses that every image has an array of data as well as a
width and a height. This description is incomplete, as there are some miss-
ing consistency checks – we would expect the length of data to be equal to
the width times the height, for instance, but we can capture such properties
in a separate data structure invariant.
Structs do not necessarily fit into a machine word because they can have
arbitrarily many components, so they must be allocated on the heap (in
memory, just like arrays). This is true even if they happen to be small enough to
fit into a word, in order to keep the language uniform and simple.


% coin structdemo.c0
C0 interpreter (coin) 0.3.2 ’Nickel’
Type ‘#help’ for help or ‘#quit’ to exit.
--> struct img_header IMG;
<stdio>:1.1-1.22:error:type struct img_header not small
[Hint: cannot pass or store structs in variables directly; use
pointers]

How, then, do we manipulate structs? We use the same solution as


for arrays: we manipulate them via their address in memory. Instead of
alloc_array we call alloc which returns a pointer to the struct that has
been allocated in memory. Let’s look at an example in coin.

--> struct img_header* IMG = alloc(struct img_header);


IMG is 0xFFAFFF20 (struct img_header*)

We can access the fields of a struct, for reading or writing, through the
notation p->f where p is a pointer to a struct, and f is the name of a field
in that struct. Continuing above, let’s see what the default values are in the
allocated memory.

--> IMG->data;
(default empty int[] with 0 elements)
--> IMG->width;
0 (int)
--> IMG->height;
0 (int)

We can write to the fields of a struct by using the arrow notation on the
left-hand side of an assignment.

--> IMG->data = alloc_array(pixel, 2);


IMG->data is 0xFFAFC130 (int[] with 2 elements)
--> IMG->width = 1;
IMG->width is 1 (int)
--> (*IMG).height = 2;
(*(IMG)).height is 2 (int)
--> IMG->data[0] = 0xFF00FF00;
IMG->data[0] is -16711936 (int)
--> IMG->data[1] = 0xFFFF0000;
IMG->data[1] is -65536 (int)


The notation (*p).f is a longer form of p->f. First, *p follows the


pointer to arrive at the struct in memory, then .f selects the field f. We
will rarely use this dot-notation (*p).f in this course, preferring the arrow-
notation p->f.
An updated picture of memory, taking into account the initialization
above, looks like this:

[Diagram: as before, c and i hold their values directly, while A and IMG hold addresses. IMG points to a struct whose data field holds the address of a two-element pixel array containing 0xFF00FF00 and 0xFFFF0000, with width = 1 and height = 2.]

3 Pointers
As we have seen in the previous section, a pointer is needed to refer to a
struct that has been allocated on the heap. It can also be used more gener-
ally to refer to an element of arbitrary type that has been allocated on the
heap. For example:

--> int* ptr1 = alloc(int);


ptr1 is 0xFFAFC120 (int*)
--> *ptr1 = 16;
*(ptr1) is 16 (int)
--> *ptr1;
16 (int)

In this case we refer to the value using the notation *p, either to read (when
we use it inside an expression) or to write (if we use it on the left-hand side
of an assignment).
So we would be tempted to say that a pointer value is simply an ad-
dress. But this story, which was correct for arrays, is not quite correct for
pointers. There is also a special value NULL. Its main feature is that NULL is
not a valid address, so we cannot dereference it to obtain stored data. For
example:


--> int* ptr2 = NULL;


ptr2 is NULL (int*)
--> *ptr2;
Error: null pointer was accessed
Last position: <stdio>:1.1-1.3

Graphically, NULL is sometimes represented with the ground symbol, so we


can represent our updated setting like this:

[Diagram: the same picture extended with int* ptr1, which points to an allocated cell holding 16, and int* ptr2, which is NULL, drawn with the ground symbol.]

To rephrase, we say that a pointer value is an address, of which there


are two kinds. A valid address is one that has been allocated explicitly with
alloc, while NULL is an invalid address. In C, there are opportunities to
create many other invalid addresses, as we will discuss in another lecture.
Attempting to dereference the null pointer is a safety violation in the
same class as trying to access an array with an out-of-bounds index. In C0,
you will reliably get an error message, but in C the result is undefined and
will not necessarily lead to an error. Therefore:

Whenever you dereference a pointer p, either as *p or p->f, you must


have a reason to know that p cannot be NULL.

In many cases this may require function preconditions or loop invariants,


just as for array accesses.
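
For example, a function that reads a field of an image struct can record
this reason as a precondition; a minimal sketch:

int image_width(struct img_header* IMG)
//@requires IMG != NULL;
{
  return IMG->width;  /* safe: the precondition rules out NULL */
}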

4 Linked Lists
Linked lists are a common alternative to arrays in the implementation of
data structures. Each item in a linked list contains a data element of some
type and a pointer to the next item in the list. It is easy to insert and delete
elements in a linked list, which are not natural operations on arrays, since


arrays have a fixed size. On the other hand access to an element in the
middle of the list is usually O(n), where n is the length of the list.
An item in a linked list consists of a struct containing the data element
and a pointer to another linked list. In C0 we have to commit to the type
of element that is stored in the linked list. We will refer to this data as
having type elem, with the expectation that there will be a type definition
elsewhere telling C0 what elem is supposed to be. Keeping this in mind
ensures that none of the code actually depends on what type is chosen.
These considerations give rise to the following definition:

struct list_node {
elem data;
struct list_node* next;
};
typedef struct list_node list;

This definition is an example of a recursive type. A struct of this type


contains a pointer to another struct of the same type, and so on. We usually
use the special element of type t*, namely NULL, to indicate that we have
reached the end of the list. Sometimes (as will be the case for our use of
linked lists in stacks and queues), we can avoid the explicit use of NULL and
obtain more elegant code. The type definition is there to create the type
name list, which stands for struct list_node, so that a pointer to a list
node will be list*.
There are some restrictions on recursive types. For example, a declara-
tion such as

struct infinite {
int x;
struct infinite next;
};

would be rejected by the C0 compiler because it would require an infinite


amount of space. The general rule is that a struct can be recursive, but
the recursion must occur beneath a pointer or array type, whose values are
addresses. This allows a finite representation for values of the struct type.
We don’t introduce any general operations on lists; let’s wait and see
what we need where they are used. Linked lists as we use them here are
a concrete type which means we do not construct an interface and a layer of
abstraction around them. When we use them we know about and exploit
their precise internal structure. This is in contrast to abstract types such as


queues or stacks (see the previous lecture) whose implementation is hidden behind


an interface, exporting only certain operations. This limits what clients
can do, but it allows the author of a library to improve its implementation
without having to worry about breaking client code. Concrete types are
cast in concrete once and for all.

5 Queues with Linked Lists


Here is a picture of the queue data structure the way we envision imple-
menting it, where we have elements 1, 2, and 3 in the queue.
[Diagram: a linked list of nodes with data and next fields holding 1, 2, 3, followed by one extra node whose contents are unused (marked X); front points to the node holding 1 and back points to the extra node.]

A queue is implemented as a struct with a front and back field. The


front field points to the front of the queue, the back field points to the back
of the queue. We need these two pointers so we can efficiently access both
ends of the queue, which is necessary since dequeue (front) and enqueue
(back) access different ends of the list.
In the array implementation of queues, we kept the back as one greater
than the index of the last element in the array. In the linked-list implemen-
tation of queues, we use a similar strategy, making sure the back pointer
points to one element past the end of the queue. Unlike arrays, there must
be something in memory for the pointer to refer to, so there is always one
extra element at the end of the queue which does not have valid data or
next pointer. We have indicated this in the diagram by writing X.
The above picture yields the following definition.

struct queue_header {
list* front;
list* back;
};
typedef struct queue_header* queue;


We call this a header because it doesn’t hold any elements of the queue, just
pointers to the linked list that really holds them. The type definition allows
us to use queue as a type that represents a pointer to a queue header. We
define it this way so we can hide the true implementation of queues from
the client and just call it an element of type queue.
When does a struct of this type represent a valid queue? In fact, when-
ever we define a new data type representation we should first think about
the data structure invariants. Making these explicit is important as we
think about and write the pre- and postconditions for functions that im-
plement the interface.
What we need here is that if we follow front and then move down the
linked list, we eventually arrive at back. We call this a list segment. We
also want both front and back not to be NULL so it conforms to the picture,
with one element already allocated even if the queue is empty.

bool is_queue(queue Q) {
if (Q == NULL) return false;
if (Q->front == NULL) return false;
if (Q->back == NULL) return false;
if (!is_segment(Q->front, Q->back)) return false;
return true;
}

Next, the code for checking whether two pointers delineate a list segment.
When both start and end are NULL, we consider it a valid list segment, even
though this will never come up for queues. It is a common code pattern for
working with linked lists and similar data representations to have a pointer
variable, here called p, that is updated to the next item in the list on each
iteration until we hit the end of the list.

bool is_segment(list* start, list* end) {


list* p = start;
while (p != end) {
if (p == NULL) return false;
p = p->next;
}
return true;
}

Here we stop in two situations: if p == NULL, then we cannot come up
against end any more because we have run off the end of the list, so we


return false. The other situation is if we find end, in which case we return
true since we have a valid list segment. This function may not terminate
if the list contains a cycle. We will address this issue in the next lecture; for
now we assume all lists are acyclic.
To check if the queue is empty we just compare its front and back. If
they are equal, the queue is empty; otherwise it is not. We require that we
are being passed a valid queue. Generally, when working with a data struc-
ture, we should always require and ensure that its invariants are satisfied
in the pre- and post-conditions of the functions that manipulate it. Inside
the function, we will generally temporarily violate the invariants.
bool queue_empty(queue Q)
//@requires is_queue(Q);
{
return Q->front == Q->back;
}
To obtain a new empty queue, we just allocate a list struct and point both
front and back of the new queue to this struct. We do not initialize the list
element because its contents are irrelevant, according to our representation.
It is good practice to always initialize memory if we care about its contents,
even if it happens to be the same as the default value placed there.
queue queue_new()
//@ensures is_queue(\result);
//@ensures queue_empty(\result);
{
queue Q = alloc(struct queue_header);
list* p = alloc(struct list_node);
Q->front = p;
Q->back = p;
return Q;
}
Let’s take one of these lines apart. Why does
queue Q = alloc(struct queue_header);
make sense? According to the definition of alloc, we might expect
struct queue_header* Q = alloc(struct queue_header);
since allocation returns the address of what we allocated. Fortunately, we
defined queue to be a short-hand for struct queue_header* so all is well.


To enqueue something, that is, add a new item to the back of the queue,
we just write the data (here: a string) into the extra element at the back,
create a new back element, and make sure the pointers are updated correctly.
You should draw yourself a diagram before you write this kind of code.
Here is a before-and-after diagram for inserting "3" into a list. The new or
updated items are dashed in the second diagram.

[Diagram, before: the list holds 1 and 2, with back pointing to the unused extra node. After enq of 3: the data 3 has been written into that node, a freshly allocated unused node has been linked in after it, and back points to the new node; the new and updated items are dashed.]

Another important point to keep in mind as you are writing code that ma-
nipulates pointers is to make sure you perform the operations in the right
order, if necessary saving information in temporary variables.
void enq(queue Q, string s)
//@requires is_queue(Q);
//@ensures is_queue(Q);
{
list* p = alloc(struct list_node);
Q->back->data = s;
Q->back->next = p;
Q->back = p;
}


Finally, we have the dequeue operation. For that, we only need to


change the front pointer, but first we have to save the dequeued element
in a temporary variable so we can return it later. In diagrams:

[Diagram, before: front points to the node holding 1. After deq: front has advanced to the node holding 2; the list itself is unchanged.]

And in code:

string deq(queue Q)
//@requires is_queue(Q);
//@requires !queue_empty(Q);
//@ensures is_queue(Q);
{
string s = Q->front->data;
Q->front = Q->front->next;
return s;
}


Let’s verify that our pointer dereferencing operations are safe. We have

Q->front->data

which entails two pointer dereferences. We know is_queue(Q) from the


precondition of the function. Recall:

bool is_queue(queue Q) {
if (Q == NULL) return false;
if (Q->front == NULL) return false;
if (Q->back == NULL) return false;
if (!is_segment(Q->front, Q->back)) return false;
return true;
}

We see that Q->front is okay, because by the first test we know that Q != NULL
if the precondition holds. By the next two tests we see that both Q->front and
Q->back are not null, and we can therefore dereference them.
We also make the assignment Q->front = Q->front->next. Why does
this preserve the invariant? Because we know that the queue is not empty
(second precondition of deq) and therefore Q->front != Q->back. Because
Q->front to Q->back is a valid non-empty segment, Q->front->next can-
not be null.
An interesting point about the dequeue operation is that we do not ex-
plicitly deallocate the first element. If the interface is respected there cannot
be another pointer to the item at the front of the queue, so it becomes un-
reachable: no operation in the remainder of the running program could
ever refer to it. This means that the garbage collector of the C0 runtime sys-
tem will recycle this list item when it runs short of space.

6 Stacks with Linked Lists


For the implementation of stacks, we can reuse linked lists and the basic
structure of our queue implementation, except that we read off elements
from the same end that we write them to. We call the pointer to this end
top. Since we do not perform operations on the other side of the stack, we
do not necessarily need a pointer to the other end. For structural reasons,
and in order to identify the similarities with the queue implementation,
we still decide to remember a pointer bottom to the bottom of the stack.
With this design decision, we do not have to handle the bottom of the stack
much differently from any other element on the stack. The difference is that


the data at the bottom of the stack is meaningless and will not be used in
our implementation. A typical stack then has the following form:

[Diagram: a linked list of nodes holding 3, 2, 1, followed by one extra node whose data is unused; top points to the node holding 3 and bottom points to the extra node at the end.]

Here, 3 is the element at the top of the stack.


We define:

struct list_node {
elem data;
struct list_node* next;
};
typedef struct list_node list;

struct stack_header {
list* top;
list* bottom;
};
typedef struct stack_header* stack;

To test if some structure is a valid stack, we only need to check that


the list starting at top ends in bottom; this is almost identical to the data
structure invariant for queues:

bool is_stack(stack S) {
  if (S == NULL) return false;
  if (S->top == NULL) return false;
  if (S->bottom == NULL) return false;
  if (!is_segment(S->top, S->bottom)) return false;
  return true;
}
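
The remaining operations mirror the queue implementation. A sketch of
the emptiness check and of stack creation under this representation:

bool stack_empty(stack S)
//@requires is_stack(S);
{
  return S->top == S->bottom;
}

stack stack_new()
//@ensures is_stack(\result);
//@ensures stack_empty(\result);
{
  stack S = alloc(struct stack_header);
  list* p = alloc(struct list_node);  /* the unused bottom element */
  S->top = p;
  S->bottom = p;
  return S;
}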


Popping from a stack requires taking an item from the front of the
linked list, which is much like dequeuing.
elem pop(stack S)
//@requires is_stack(S);
//@requires !stack_empty(S);
//@ensures is_stack(S);
{
elem e = S->top->data;
S->top = S->top->next;
return e;
}
To push an element onto the stack, we create a new list item, set its data
field and then its next field to the current top of the stack – the opposite end
of the linked list from the queue. Finally, we need to update the top field of
the stack to point to the new list item. While this is simple, it is still a good
idea to draw a diagram.

[Diagram, before the push: top points to the node holding 3, which links to 2, then 1, then the unused bottom node. After pushing 4: a new node holding 4 has its next pointer set to the old top node, and top points to the new node; the rest of the list is unchanged.]


In code:

void push(stack S, elem e)


//@requires is_stack(S);
//@ensures is_stack(S);
{
list* p = alloc(struct list_node);
p->data = e;
p->next = S->top;
S->top = p;
}

This completes the implementation of stacks, which are a very simple


and pervasive data structure.

Exercises
Exercise 1 Consider what would happen if we pop an element from the empty
stack when contracts are not checked in the linked list implementation. When
does an error arise?

Exercise 2 Stacks are usually implemented with just one pointer in the header, to
the top of the stack. Rewrite the implementation in this style, dispensing with the
bottom pointer, terminating the list with NULL instead.



Lecture Notes on
Unbounded Arrays

15-122: Principles of Imperative Computation


Frank Pfenning

Lecture 12
February 21, 2013

1 Introduction
Most lectures so far had topics related to all three major categories of learn-
ing goals for the course: computational thinking, algorithms, and program-
ming. The same is true for this lecture. With respect to algorithms, we in-
troduce unbounded arrays and operations on them. Analyzing them requires
amortized analysis, a particular way to reason about sequences of operations
on data structures. We also briefly talk about again about data structure in-
variants and interfaces, which are crucial computational thinking concepts.

2 Unbounded Arrays
In the second homework assignment, you were asked to read in some files
such as the Collected Works of Shakespeare, the Scrabble Players Dictionary, or
anonymous tweets collected from Twitter. What kind of data structure do
we want to use when we read the file? In later parts of the assignment
we want to look up words, perhaps sort them, so it is natural to want to
use an array of strings, each string constituting a word. A problem is that
before we start reading we don’t know how many words there will be in
the file so we cannot allocate an array of the right size! One solution uses
either a queue or a stack. We discussed this in Lecture 9 on Queues and
in Lecture 10 on Pointers. Unlike the linked-list implementation of queues
from Lecture 10, the array-based implementation of queues from Lecture 9
was still capacity-bounded. It would work, however, if we had unbounded


arrays. In fact, in unbounded arrays, we could store the data directly. While
arrays are a language primitive, unbounded arrays are a data structure that
we need to implement.
Thinking about it abstractly, an unbounded array should be like an ar-
ray in the sense that we can get and set the value of an arbitrary element
via its index i. We should also be able to add a new element to the end of
the array, and delete an element from the end of the array.
We use the unbounded array by creating an empty one before reading a
file. Then we read words from the file, one by one, and add them to the end
of the unbounded array. Meanwhile we can count the number of elements
to know how many words we have read. We trust the data structure not to
run out of space unless we hit some hard memory limit, which is unlikely
for the kind of task we have in mind, given modern operating systems.
When we have read the whole file the words will be in the unbounded
array, in order, the first word at index 0, the second at index 1, etc.
The general implementation strategy is as follows. We maintain an ar-
ray of a fixed length limit and an internal index size which tracks how many
elements are actually used in the array. When we add a new element we
increment size, when we remove an element we decrement size. The tricky
issue is how to proceed when we are already at the limit and want to add
another element. At that point, we allocate a new array with a larger limit
and copy the elements we already have to the new array. For reasons we
explain later in this lecture, every time we need to enlarge the array we dou-
ble its size. Removing an element from the end is simple: we just decrement
size. There are some issues to consider if we want to shrink the array, but
this is optional.

3 An Interface to Unbounded Arrays


As usual when designing a data structure, we start by thinking about its
interface. We must be able to create a new unbounded array, access its
elements (both for reading and writing), and add or remove elements at the
end. The elements of the array should be of arbitrary type (like ordinary
arrays), but we cannot achieve this form of genericity in C0 at present. We
will discuss ways to write generic code later in the course when we move
to C. Instead, we just indicate this by defining and using a type name elem
(here as string). These considerations lead us to the following interface:

typedef string elem;


/* Interface of unbounded arrays */

typedef struct uba_header* uba;

uba uba_new(int initial_limit)


//@requires initial_limit > 0;
;

int uba_size(uba L) /* "\length(L)" */


//@ensures \result >= 0;
;

elem uba_get(uba L, int index) /* "L[index]" */


//@requires 0 <= index && index < uba_size(L);
;

void uba_set(uba L, int index, elem e) /* "L[index] = e" */


//@requires 0 <= index && index < uba_size(L);
;

void uba_add(uba L, elem e); /* add e at the end of L */

elem uba_rem(uba L) /* remove last element in L */


//@requires uba_size(L) > 0;
;

Contracts on interfaces are cumulative with respect to the contracts on


the implementations: both are checked when a function is called through
its interface. Note that we do not mention is_uba, since this function is
not exposed to the client. Client code should only ever be able to obtain
valid unbounded arrays if it uses the interface, so preservation of the data
structure invariants should be considered an internal invariant of the data
structure implementation.
Please read over the interface carefully to make sure you understand all
of its provisions. We would like all the specified operations to take only
constant time, that is, O(1). As we will see in the remainder of this lecture
this is quite tricky and we have to make some intriguing qualifications in
our statement of the asymptotic complexity.


Unfortunately, C (and, by association, C0) does not provide a way to en-


force that clients do not incorrectly exploit details of the implementation of
a data structure. Higher-level languages such as Java or ML have interfaces
and data abstraction as one of their explicit design goals. In this course, the
use of interfaces is a matter of programming discipline. As we discuss fur-
ther data structures we generally focus on the interface first, before writing
any code. This is because the interface often guides the selection of an im-
plementation technique and the individual functions.

4 Implementing Unbounded Arrays


According to our implementation sketch, an unbounded array needs to
track three forms of data: an integer limit, an integer size and an array
of strings. We can put these together in a struct with fields limit, size and
A as the fields of the struct. It is declared with

struct uba_header {
int limit; /* 0 < limit */
int size; /* 0 <= size && size <= limit */
elem[] A; /* \length(A) == limit */
};

Also recall the line from the interface

typedef struct uba_header* uba;

which states that a uba is a pointer to a struct uba_header. Recall that


structs can only be allocated on the heap (rather than stored in variables),
so we always work with the addresses of structs. And addresses are the
values of pointers.
There are some data structure invariants that we maintain, although they
may be temporarily violated as the elements of the structure are manipu-
lated at a low level. Generally, when we pass a pointer to the data structure
or assign it to a variable we expect these invariants to hold. C0, however,
has no intrinsic support for ensuring these invariants. Instead, our method
is to define a function to test them and then verify adherence to the in-
variants in contracts as well as loop invariants and assertions. Here, the
function is_uba serves that purpose. In previous lectures we had functions
is_queue and is_stack that fulfill a similar role.
Generally, we would like contract functions like is_uba not to fail with
a contract exception, but to return false if the data structure invariant is


violated. However, since the lengths of arrays can only be checked in con-
tracts (they may not be available when a program is compiled without -d
to make computation as efficient as possible) we may have to use contracts
to some extent even for functions whose intended use is only in contracts.

bool is_uba (uba L)


{
if (L == NULL) return false;
if (!(0 <= L->size)) return false;
if (!(L->size <= L->limit)) return false;
if (!(0 < L->limit)) return false;
//@assert L->limit == \length(L->A);
return true;
}

Note that we must check to make sure that L != NULL before checking any
other fields, including L->size and L->A (i.e. L->limit == \length(L->A))
in order to make sure the pointer dereferences on L are safe. Safety of an-
notations and safety of contract functions is just as indispensable as safety
in the rest of the code.
To create a new unbounded array, we allocate a struct uba_header
and an array of a supplied initial limit.

uba uba_new (int initial_limit)


//@requires initial_limit > 0;
//@ensures is_uba(\result);
{
uba L = alloc(struct uba_header);
L->limit = initial_limit;
L->size = 0;
L->A = alloc_array(elem, L->limit);
return L;
}

Getting and setting an element of an unbounded array is straightfor-


ward. However, we do have to verify that the array access is in bounds.
This is stricter than checking that it is within the allocated array (below
limit), because everything beyond the current size should be considered
to be undefined. These array elements have not yet been added to the array,
so reading or writing them is meaningless. We show only the operation of
writing to an unbounded array, uba_set.


void uba_set(uba L, int index, elem e)


//@requires is_uba(L);
//@requires 0 <= index && index < L->size;
{
L->A[index] = e;
return;
}
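
Reading an element is symmetric; a sketch:

elem uba_get(uba L, int index)
//@requires is_uba(L);
//@requires 0 <= index && index < L->size;
{
  return L->A[index];
}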
More interesting is the operation of adding an element to the end of an
unbounded array. For that we need a function to resize an unbounded ar-
ray. This function takes an unbounded array L and a new limit new_limit.
It is required that the new limit is strictly greater than the current size, to
make sure we have enough room to preserve all current elements and one
more for the next one to add. We also stipulate that the size does not change
by stating L->size == \old(L->size) in the postcondition. In general,
\old(e) in a postcondition evaluates e just after the function is called and
before the body is executed. This allows us to refer to the state of memory
when the function is called in the postcondition.
void uba_resize(uba L, int new_limit)
//@requires is_uba(L);
//@requires L->size < new_limit;
//@ensures is_uba(L);
//@ensures L->limit == new_limit && L->size == \old(L->size);
//@ensures L->size < L->limit;
{
elem[] B = alloc_array(elem, new_limit);
for (int i = 0; i < L->size; i++)
//@loop_invariant 0 <= i && i <= L->size;
{
B[i] = L->A[i];
}
L->limit = new_limit;
/* L->size remains unchanged */
L->A = B;
return;
}
Finally we are ready to write the function that adds an element to the
end of an unbounded array. We first check whether there is room for an-
other element and, if not, double the size of the underlying array of strings.
The contract just states that the array is valid before and after the operation.


void uba_add(uba L, elem e)


//@requires is_uba(L);
//@ensures is_uba(L);
{
if (L->size == L->limit) {
/* Check for overflow */
assert(L->limit <= int_max()/2);
uba_resize(L, 2*L->limit);
}

//@assert L->size < L->limit;


L->A[L->size] = e;
L->size++;
return;
}

We check that doubling the array size would not overflow and, if it would,
raise an assertion failure. Using assert as a statement instead of inside an anno-
tation means that the assertion will always be checked, even if the code
is compiled without -d. It will have the same effect as, for example, the
alloc_array function when there is not enough memory to allocate the
array.
We discuss how to remove an element from an array in section 6.

5 Amortized Analysis
It is easy to see that reading from or writing to an unbounded array at a
given index is a constant-time operation. However, adding an element to
an array is not. Most of the time it takes constant time O(1), but when we
have run out of space it takes time O(size) because we have to copy the old
elements to the new underlying array. On the other hand, it doesn’t seem
to happen very often. Can we characterize this situation more precisely?
This is the subject of amortized analysis.
In order to make the analysis as concrete as possible, we want to count
the number of writes to an array, that is, the number of assignments A[i] = e
that are performed. Calling the operation to add a new element to an
unbounded array an insert, we claim:

The worst-case cost of n insert operations into an unbounded array is


O(n).


This statement is quite different from what we have done before, when we
have analyzed the cost of a particular function call like sort or binsearch.
Based on the common use of unbounded arrays, we should consider the
cost of multiple operations together. Many other data structures introduced
later in the course will be subject to a similar analysis.
How do we prove the above bound? A simple insert (when there is
room in the array) requires a single write operation, so we count it as 1.
Similarly, we count the act of copying one element from one array to an-
other as 1 operation, because it requires one write operation. Now per-
forming a sequence of inserts, starting with an empty array with limit 4,
looks as follows.
call            op's   size   limit
uba_add(L,"a")    1      1      4
uba_add(L,"b")    1      2      4
uba_add(L,"c")    1      3      4
uba_add(L,"d")    1      4      4
uba_add(L,"e")    5      5      8
uba_add(L,"f")    1      6      8
uba_add(L,"g")    1      7      8
uba_add(L,"h")    1      8      8
uba_add(L,"i")    9      9     16

We have taken 4 extra operations when inserting "e" in order to copy "a"
through "d". Overall, we have performed 1 + 1 + 1 + 1 + 5 + 1 + 1 + 1 + 9 = 21
operations for inserting 9 elements. Would that be O(n) by the time we had
inserted n elements?
We approach this by giving ourselves an overall budget of c * n operations
(“tokens”) before we start to insert n elements. Every time we perform
a write operation we spend a token. If we perform all n inserts without
running out of tokens, we have achieved the desired amortized complexity.
One difficulty is to guess the right constant c. We already know that
c = 1 or c = 2 will not be enough, because in the sequence above we must
spend 21 tokens to insert 9 elements. Let’s try c = 3, so we start with 27


tokens.
call            op's   tokens left   size   limit
uba_add(L,"a")    1        26          1      4
uba_add(L,"b")    1        25          2      4
uba_add(L,"c")    1        24          3      4
uba_add(L,"d")    1        23          4      4
uba_add(L,"e")    5        18          5      8
uba_add(L,"f")    1        17          6      8
uba_add(L,"g")    1        16          7      8
uba_add(L,"h")    1        15          8      8
uba_add(L,"i")    9         6          9     16
We see that we spend 4 tokens when adding "e" to copy "a" through "d",
and spend one more token for the insertion of "e" itself.
One of the insights of amortized analysis is that we don’t need to know
the number n of inserts ahead of time. In order to achieve the bound of
c * n operations, we simply allow each call to perform c operations. If it
performs fewer, these remain in the budget and may be spent later! Let’s
go through the same sequence of calls again.
                      allocated   spent   saved   total saved
call            op's    tokens   tokens  tokens        tokens   size   limit
uba_add(L,"a")    1        3        1       2              2      1      4
uba_add(L,"b")    1        3        1       2              4      2      4
uba_add(L,"c")    1        3        1       2              6      3      4
uba_add(L,"d")    1        3        1       2              8      4      4
uba_add(L,"e")    5        3        5      -2              6      5      8
uba_add(L,"f")    1        3        1       2              8      6      8
uba_add(L,"g")    1        3        1       2             10      7      8
uba_add(L,"h")    1        3        1       2             12      8      8
uba_add(L,"i")    9        3        9      -6              6      9     16

The crucial property we need is that there are k ≥ 0 tokens left just after
we have doubled the size of the array. We think of this as an invariant of
the computation: it should always be true, no matter how many strings we
insert. In this example we reach 6 tokens after 5 inserts and again after 9
inserts.
To prove this invariant, we must show that it holds the first time we
have to double the size of the array, and that it is preserved by the opera-
tions.
When we create the array, we give it some initial limit limit0. We run
out of space once we have inserted limit0 elements, arriving at the following


situation.

[Diagram: a full array of length limit0 holding "a" through "d", with two saved tokens ($$) drawn above each element.]


We have accrued 2 * limit0 tokens. We have to spend limit0 of them to copy
the elements so far, keeping limit0 > 0 tokens in the bank.

[Diagram: above, the full array of length limit0 with its 2 * limit0 saved tokens; below, the elements copied into the first half of a new array of length 2 * limit0, with limit0 tokens remaining in the bank.]


So the invariant holds the first time we double the size.


Now assume we have just doubled the size of the array and the invari-
ant holds, that is, we have k ≥ 0 tokens, and 2 * size = limit.

[Diagram: an array of length limit = 2 * size whose first half holds "a" through "d", with k tokens (k*$) in the bank.]


After size more inserts we are at limit and have added another 2 * size = limit
tokens.

[Diagram: the array is now full, holding "a" through "h"; each of the size new elements contributed two tokens ($$), in addition to the k tokens already in the bank.]



On the next insert we double the size of the array and copy limit array
elements, spending limit tokens.

[Diagram: above, the full array of length limit = 2 * size with its saved tokens; below, "a" through "h" copied into the first half of a new array of twice the length, the copy having spent limit tokens, leaving k tokens in the bank.]


Our bank account is reduced back to k tokens, but we know k ≥ 0, preserv-


ing our invariant.
Since we only save a constant number of tokens on each operation, in
addition to the constant time the operation itself takes, we never perform
more operations than a constant times the number of inserts. So our
claim above is true: any sequence of n operations performs at most O(n)
steps. We also say that the insert operation has constant amortized time.
This completes the argument.
In the example, the number of tokens will now never fall below 6. If we
add another 8 elements, we will also put 2 * 8 = 16 tokens into the bank.
We will need to spend these to copy the 16 elements already in the array
and we are back down to 6.
Tokens are a conceptual tool in our analysis, but they don’t need to be
implemented. The fact that there are always 0 or more tokens during any
sequence of operations is an invariant of the data structure, although not
quite in the same way as discussed before because it tracks sequences of
operations rather than the internal state of the structure. In fact, it would
be possible to add a new field to the representation of the array that would
count tokens and raise an exception if it becomes negative. That would
alert us to some kind of mistake, either in our amortized analysis or in our
program. This would, however, incur a runtime overhead even when asser-
tions are not checked, so tokens are rarely, if ever, explicitly implemented.
This kind of analysis is important to avoid serious programming mis-
takes. For example, let’s say we decide to increase the size of the array
only by 1 whenever we run out of space. The token scheme above does
not work, because we cannot set aside enough tokens before we need to
copy the array again. And, indeed, after we hit limit the first time, the next


sequence of n inserts takes O(n^2) operations, because we copy the array on


each step until we reach 2 * limit.

6 Removing Elements
Removing elements from the end of the array is simple, and does not change
our amortized analysis, unless we want to shrink the size of the array.
A first idea might be to simply cut the array in half whenever size
reaches half the size of the array. However, this cannot work in constant
amortized time. The example demonstrating that is an alternating sequence
of n inserts and n deletes precisely when we are at the limit of the array. In
that case the total cost of the 2 * n operations will be O(n^2).
To avoid this problem we cut the size of the array in half only when the
number of elements in it reaches limit/4. The amortized analysis requires
two tokens for any deletion: one to delete the element, and one for any
future copy. Then if size = limit/2 just after we doubled the size of the
array and have no tokens, putting aside one token on every delete means
that we have size/2 = limit/4 tokens when we arrive at a size of limit/4.
Again, we have just enough tokens to copy the limit/4 elements to the new,
smaller array of size limit/2.
The code for uba_rem (“remove from end”):

elem uba_rem(uba L)
//@requires is_uba(L);
//@requires L->size > 0;
//@ensures is_uba(L);
{
if (L->size <= L->limit/4 && L->limit >= 2)
uba_resize(L, L->limit/2);
L->size--;
elem e = L->A[L->size];
return e;
}

We explicitly check that L->limit >= 2 to make sure that the limit never
becomes 0, which would violate one of our data structure invariants.
One side remark: after we have saved the element in e, we should delete the el-
ement from the array by writing L->A[L->size] = "". In C0, we do not
have any explicit memory management. Storage will be reclaimed and
used for future allocation when the garbage collector can see that data are


no longer accessible from the program. If we remove an element from an


unbounded array, but keep the element in the array, the garbage collector
cannot determine that we will not access it again; the reason is
rather subtle and lies in the bounds check for uba_get. In order to allow
the garbage collector to free the space occupied by the strings stored in
the array, we therefore should overwrite the array element with the empty
string "", which is the default element for strings. This, however, makes
the code specific to strings, which we try to avoid.


Exercises
Exercise 1 In the amortized cost analysis for uba_add, we have concluded

Exercise 2 When removing elements from the unbounded array we resize if the
limit grossly exceeds its size. Namely when L->size <= L->limit/4. Your first
instinct might have been to already shrink the array when L->size <= L->limit/2.
We have argued by example why that does not give us constant amortized cost
O(n) for a sequence of n operations. We have also sketched an argument why
L->size <= L->limit/4 gives the right amortized cost. At which step in that
argument would you notice that L->size <= L->limit/2 is the wrong choice?



Lecture Notes on
Hash Tables
15-122: Principles of Imperative Computation
Frank Pfenning, Rob Simmons

Lecture 13
February 28, 2013

1 Introduction
In this lecture we re-introduce the dictionaries that were implemented as a
part of Clac and generalize them as so-called associative arrays. Associative
arrays are data structures that are similar to arrays but are indexed not by
integers but by other forms of data such as strings. One popular data struc-
ture for the implementation of associative arrays is the hash table. To analyze
the asymptotic efficiency of hash tables we have to explore a new point of
view, that of average case complexity. Another computational thinking con-
cept that we revisit is randomness. In order for hash tables to work effi-
ciently in practice we need hash functions whose behavior is predictable
(deterministic) but has some aspects of randomness.
Relating to our learning goals, we have

Computational Thinking: We consider the importance of randomness in al-


gorithms, and also discuss average case analysis, which is how we can
argue that hash tables have acceptable performance.

Algorithms and Data Structures: We describe a linear congruential genera-
tor, which is a certain kind of pseudorandom number generator. We also
discuss hashtables and their implementation with separate chaining
(an array of linked lists).

Programming: We review the implementation of the rand library in C0.


2 Associative Arrays
Arrays can be seen as a mapping, associating with every integer in a given
interval some data item. It is finitary, because its domain, and therefore
also its range, is finite. There are many situations when we want to index
elements differently than just by integers. Common examples are strings
(for dictionaries, phone books, menus, data base records), or structs (for
dates, or names together with other identifying information). They are so
common that they are primitive in some languages such as PHP, Python,
or Perl and perhaps account for some of the popularity of these languages.
In many applications, associative arrays are implemented as hash tables
because of their performance characteristics. We will develop them incre-
mentally to understand the motivation underlying their design.

3 Keys and values


In many applications requiring associative arrays, we are storing complex
data values and want to access them by a key which is derived from the
data. A typical example of keys are strings, which are appropriate for many
scenarios. For example, the key might be a student id and the data entry
might be a collection of grades, perhaps another associative array where the
key is the name of assignment or exam and the data is a score. We make
the assumption that keys are unique in the sense that in an associative array
there is at most one data item associated with a given key.
We can think of built-in C0 arrays as having a set number of keys: a
C0 array of length 3 has three keys 0, 1, and 2. Our implementation of
unbounded arrays allowed us to add a specific new key, 3, to an array; we
want to be able to add new keys to the array. We want our associative arrays
to allow us to have more interesting keys (like strings, or non-sequential
integers) while keeping the property that there is a unique value for each
valid key.

4 Chains
A first idea to explore is to implement the associative array as a linked
list, called a chain. If we have a key k and look for it in the chain, we just
traverse it, compute the intrinsic key for each data entry, and compare it
with k. If they are equal, we have found our entry; if not, we continue the
search. If we reach the end of the chain and do not find an entry with key k,
then no entry with the given key exists. If we keep the chain unsorted this
gives us O(n) worst case complexity for finding a key in a chain of length
n, assuming that computing and comparing keys is constant time.
Given what we have seen so far in our search data structures, this seems
very poor behavior, but if we know our data collections will always be
small, it may in fact be reasonable on occasion.
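To make the traversal concrete, here is a sketch of lookup in an unsorted
chain. It assumes the client-provided functions elem_key and key_equal
and the list_node struct that are introduced in Lecture 14 below.

struct list_node {
  elem data;               /* data != NULL */
  struct list_node* next;
};
typedef struct list_node list;

elem chain_lookup(list* p, key k) {
  while (p != NULL) {
    if (key_equal(elem_key(p->data), k))
      return p->data;      /* found an entry with key k */
    p = p->next;           /* otherwise continue down the chain */
  }
  return NULL;             /* no entry with key k in this chain */
}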
Can we do better? One idea goes back to binary search. If keys are or-
dered we may be able to arrange the elements in an array or in the form of
a tree and then cut the search space roughly in half every time we make a
comparison. We will begin thinking about this approach just before Spring
Break, and it will occupy us for a few lectures after the break as well. De-
signing such data structures is a rich and interesting subject, but the best
we can hope for with this approach is O(log(n)), where n is the number of
entries. We have seen that this function grows very slowly, so this is quite
a practical approach.
Nevertheless, the question arises whether we can do better than O(log(n)),
say, constant time O(1) to find an entry with a given key. We know that
it can be done for arrays, indexed by integers, which allow constant-time
access. Can we also do it, for example, for strings?

5 Hashing
The first idea behind hash tables is to exploit the efficiency of arrays. So:
to map a key to an entry, we first map a key to an integer and then use the
integer to index an array A. The first map is called a hash function. We write
it as hash( ). Given a key k, our access could then simply be A[hash(k)].
There is an immediate problem with this approach: there are 2^31 pos-
itive integers, so we would need a huge array, negating any possible per-
formance advantages. But even if we were willing to allocate such a huge
array, there are many more strings than int’s so there cannot be any hash
function that always gives us different int’s for different strings.
The solution is to allocate an array of smaller size, say m, and then look
up the result of the hash function modulo m, for example, A[hash(k)%m].
This creates a new problem: it is inevitable that multiple strings will map
to the same array index. For example, if the array has size m then if we
have more than m elements, at least two must map to the same index. In
practice, this will happen much sooner than that.
If two keys map to the same integer value (modulo m), we say we have
a collision. In general, we would like to avoid collisions, because some
additional operations will be required to deal with them, slowing down
operations and taking more space. We analyze the cost of
collisions more below.

6 Separate Chaining
How do we deal with collisions of hash values? The simplest is a technique
called separate chaining. Assume we have hash(k1)%m = i = hash(k2)%m,
where k1 and k2 are the distinct keys for two data entries e1 and e2 we want
to store in the table. In this case we just arrange e1 and e2 into a chain
(implemented as a linked list) and store this list in A[i].
In general, each element A[i] in the array will either be NULL or a chain of
entries. All of these must have the same hash value for their key (modulo
m), namely i. As an exercise, you might consider other data structures
here instead of chains and weigh their merits: how about sorted lists? Or
queues? Or doubly-linked lists? Or another hash table?
We stick with chains because they are simple and fast, provided the
chains don’t become too long. This technique is called separate chaining
because the chains are stored separately, not directly in the array. Another
technique, which we do not discuss, is linear probing where we continue by
searching (linearly) for an unused spot in the array itself, starting from the place
where the hash function put us.
Under separate chaining, a snapshot of a hash table might look some-
thing like this picture.

[Figure: an array of m slots, indexed 0 to m−1, where each slot is either
NULL or points to a chain of entries whose keys hash to that index.]

7 Average Case Analysis


How long do we expect the chains to be on average? For a total number
n of entries in a table of size m, it is n/m. This important number is also
called the load factor of the hash table. How long does it take to search for
an entry with key k? We follow these steps:

1. Compute i = hash(k)%m. This will be O(1) (constant time), assuming
it takes constant time to compute the hash function.

2. Go to A[i], which again is constant time O(1).

3. Search the chain starting at A[i] for an element whose key matches k.
We will analyze this next.

The complexity of the last step depends on the length of the chain. In the
worst case it could be O(n), because all n elements could be stored in one
chain. This worst case could arise if we allocated a very small array (say,
m = 1), or because the hash function maps all input strings to the same
table index i, or just out of sheer bad luck.
Ideally, all the chains would be approximately the same length, namely
n/m. Then for a fixed load factor such as n/m = α = 2 we would take on
the average 2 steps to go down the chain and find k. In general, as long
as we don’t let the load factor become too large, the average time should be
O(1).
If the load factor does become too large, we could dynamically adapt its
size, like in an unbounded array. As for unbounded arrays, it is beneficial
to double the size of the hash table when the load factor becomes too high,
or possibly halve it if the size becomes too small. Analyzing these factors
is a task for amortized analysis, just as for unbounded arrays.
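As a sketch of this idea, an insert operation could first check the load
factor and grow the table. The code below uses the representation from
Lecture 14; ht_resize is a hypothetical helper that allocates a table of the
new capacity and reinserts every element (rehashing is necessary because
hash values are taken modulo the capacity), and the threshold 2 is an
arbitrary choice.

void ht_check_load(ht H)
//@requires is_ht(H);
//@ensures is_ht(H);
{
  if (H->size >= 2 * H->capacity)    /* load factor n/m has reached 2 */
    ht_resize(H, 2 * H->capacity);   /* hypothetical: rehash into a larger table */
}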

8 Randomness
The average case analysis relies on the fact that the hash values of the key
are relatively evenly distributed. This can be restated as saying that the
probability that each key maps to an array index i should be about the
same, namely 1/m. In order to avoid systematically creating collisions,
small changes in the input string should result in unpredicable change in
the output hash value that is uniformly distributed over the range of C0 in-
tegers. We can achieve this with a pseudorandom number generator (PRNG).


A pseudorandom number generator is just a function that takes one number
and obtains another in a way that is both unpredictable and easy to
calculate. The C0 rand library is a pseudorandom number generator with
a fairly simple interface:

/* library file rand.h0 */


typedef struct rand* rand_t;
rand_t init_rand (int seed);
int rand(rand_t gen);

One can create a random number generator (type rand_t) by initializing
it with an arbitrary seed. Then we can generate a sequence of random
numbers by repeatedly calling rand on such a generator.
The rand library in C0 is implemented as a linear congruential genera-
tor. A linear congruential generator takes a number x and finds the next
number by calculating (a × x) + c modulo m. In C0, it’s easiest to say that
m is just 2^32, since addition and multiplication in C0 are already defined
modulo 2^32. The trick is finding a good multiplier a and summand c.
If we were using 4-bit numbers (from −8 to 7, where multiplication and
addition are modulo 16) then we could set a to 5 and c to 7 and our pseudo-
random number generator would generate the following series of numbers:

0 → 7 → (−6) → (−7) → 4 → (−5) → (−2) → (−3) →
(−8) → (−1) → 2 → 1 → (−4) → 3 → 6 → 5 → 0 → ...

The PRNG used in C0’s library sets a to 1664525 and c to 1013904223
and generates the following series of numbers starting from 0:

0 → 1013904223 → 1196435762 → (−775096599) → (−1426500812) → ...

This kind of generator is fine for random testing or (indeed) the basis for
a hashing function, but the results are too predictable to use it for cryp-
tographic purposes such as encrypting a message. In particular, a linear
congruential generator will sometimes have repeating patterns in the lower
bits. If one wants numbers from a small range it is better to use the higher
bits of the generated results rather than just applying the modulus opera-
tion.
It is important to realize that these numbers just look random; they aren’t
really random. In particular, we can reproduce the exact same sequence if
we give it the exact same seed. This property is important for both test-
ing purposes and for hashing. If we discover a bug during testing with
pseudorandom numbers, we want to be able to reliably reproduce it, and
whenever we hash the same key using pseudorandom numbers, we need
to be sure we will eventually get the same result.

/* library file rand.c0 */

struct rand {
  int seed;   /* current state of the generator */
};

rand_t init_rand (int seed) {
  rand_t gen = alloc(struct rand);
  gen->seed = seed;
  return gen;
}

int rand(rand_t gen) {
  /* one linear congruential step with a = 1664525, c = 1013904223 */
  gen->seed = gen->seed * 1664525 + 1013904223;
  return gen->seed;
}
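As a small usage example (the seed value is arbitrary): re-initializing
with the same seed reproduces exactly the same sequence.

int main() {
  rand_t gen = init_rand(0x1337BEEF);  /* arbitrary seed */
  int x = rand(gen);                   /* first pseudorandom number */
  int y = rand(gen);                   /* second pseudorandom number */
  /* init_rand(0x1337BEEF) again would reproduce x, y, ... exactly */
  return 0;
}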

We will discuss using random number generators to hash strings in Lecture 14.

Exercises
Exercise 1 What happens when you replace the data structure for separate chain-
ing by something other than a linked list? Discuss the changes and identify ben-
efits and disadvantages when using a sorted list, a queue, a doubly-linked list, or
another hash table for separate chaining.

Exercise 2 Consider the situation of writing a hash function for strings of length
two, that only use the characters ’A’ to ’Z’. There are 676 different such strings.
You were hoping to get away with implementing a hash table without collisions,
since you are only using 79 out of those 676 two-letter words. But you still see
collisions most of the time. Explain this phenomenon with the birthday problem.



Lecture Notes on
Interfaces
15-122: Principles of Imperative Computation
Frank Pfenning

Lecture 14
October 16, 2012

1 Introduction
The notion of an interface to an implementation of an abstract data type or li-
brary is an extremely important concept in computer science. The interface
defines not only the types, but also the available operations on them and the
pre- and postconditions for these operations. For general data structures it
is also important to note the asymptotic complexity of the operations so
that potential clients can decide if the data structure serves their purpose.
For the purposes of this lecture we call the data structures and the op-
erations on them provided by an implementation the library and code that
uses the library the client.
What makes interfaces often complex is that in order for the library to
provide its services it may in turn require some operations provided by the
client. Hash tables provide an excellent example for this complexity, so we
will discuss the interface to hash tables in detail before giving the hash
table implementation. Binary search trees, discussed in Lecture 15 provide
another excellent example.
Relating to our learning goals, we have

Computational Thinking: We discuss the separation of client interfaces
and client implementations.

Algorithms and Data Structures: We discuss algorithms for hashing strings.

Programming: We revisit the char data type and use it to consider string
hashing.


2 Generic Hash Tables


We call hash tables generic because the implementation should work re-
gardless of the type of keys or elements to be stored in the table.
We start with the types. Which types are provided by the library?
Clearly, the type of hash tables.

/* library side types */


typedef ___ ht;

where we have left it open for now (indicated by ___) how the type ht of
hash tables will eventually be defined. That is really the only type pro-
vided by the implementation. In addition, it is supposed to provide three
functions:

/* library side functions */


ht ht_new(int capacity)
//@requires capacity > 0;
;
elem ht_lookup(ht H, key k); /* O(1) avg. */
void ht_insert(ht H, elem e) /* O(1) avg. */
//@requires e != NULL;
;

The function ht_new(int capacity) takes the initial capacity of the hash
table as an argument (which must be strictly positive) and returns a new
hash table without any elements.
The function ht_lookup(ht H, key k) searches for an element with
key k in the hash table H. If such an element exists, it is returned. If it does
not exist, we return NULL instead.
From these decisions we can see that the client must provide the type of
keys and the type of elements. Only the client can know what these might
be in any particular use of the library. In addition, we observe that NULL
must be a value of type elem.
The function ht_insert(ht H, elem e) inserts an element e into the
hash table H, which is changed as an effect of this operation. But NULL
cannot be a valid element to insert, because otherwise we would not know
whether the return value NULL for ht_lookup means that an element is
present or not. We therefore require e not to be null.
To summarize the types we have discovered will have to come from the
client:


/* client-side types */
typedef ___* elem;
typedef ___ key;

We have noted the fact that elem must be a pointer by already filling in the
* in its definition. Keys, in contrast, can be arbitrary.
Does the client also need to provide any functions? Yes! Any function
the hash table implementation requires that depends on the implementa-
tion of the types elem and key must be provided by the client, since
the library is supposed to be generic.
It turns out there are three. First, and most obviously, we need a hash
function which maps keys to integers. We also provide the hash function
with a modulus, which will be the size of the array in the hash table implemen-
tation.

/* client-side functions */
int hash(key k, int m)
//@requires m > 0;
//@ensures 0 <= \result && \result < m;
;

The result must be in the range specified by m. For the hash table im-
plementation to achieve its advertised (average-case) asymptotic complex-
ity, the hash function should have the property that its results are evenly
distributed between 0 and m − 1. Interestingly, it will work correctly (albeit
slowly) as long as hash satisfies its contract, even if, for example, it maps
every key to 0.
Now recall how lookup in a hash table works. We map the key to an
integer and retrieve the chain of elements stored in this slot in the array.
Then we walk down the chain and compare keys of the stored elements
with the lookup key. This requires the client to provide two additional
operations: one to compare keys, and one to extract a key from an element.

/* client-side functions */
bool key_equal(key k1, key k2);

key elem_key(elem e)
//@requires e != NULL;
;

Key extraction works only on elements that are not null.


This completes the interface which we now summarize.

/*************************/
/* client-side interface */
/*************************/
typedef ___* elem;
typedef ___ key;

int hash(key k, int m)


//@requires m > 0;
//@ensures 0 <= \result && \result < m;
;

bool key_equal(key k1, key k2);

key elem_key(elem e)
//@requires e != NULL;
;

/**************************/
/* library side interface */
/**************************/
typedef struct ht_header* ht;

ht ht_new(int capacity)
//@requires capacity > 0;
;
elem ht_lookup(ht H, key k); /* O(1) avg. */
void ht_insert(ht H, elem e) /* O(1) avg. */
//@requires e != NULL;
;
int ht_size(ht H); /* O(1) */
void ht_stats(ht h);

The function ht_size reports the total number of elements in the array
(remember that the load factor is the size n divided by the capacity m). The
function ht_stats has no effect, but prints out a histogram reporting how
many chains in the hash table are empty, how many have length 1, how
many have length 2, and so on. For a hashtable to have good performance,
chains should be generally short.


3 A Tiny Client
One sample application is to count word occurrences – say, in a corpus of
Twitter data or in the collected works of Shakespeare. In this application,
the keys are the words, represented as strings. Data elements are pairs of
words and word counts, the latter represented as integers.

/******************************/
/* client-side implementation */
/******************************/
struct wcount {
string word;
int count;
};

int hash(string s, int m) {
  return abs(hash_string(s) % m); /* from hash-string.c0 */
}

bool key_equal(string s1, string s2) {
  return string_equal(s1, s2);
}

string elem_key(struct wcount* wc) {
  return wc->word;
}

We can now fill in the types in the client-side of the interface.


typedef struct wcount* elem;
typedef string key;
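To see the pieces fit together, here is a hypothetical snippet of client code
exercising this instantiation (the word and count are made up):

void example() {
  ht H = ht_new(100);                  /* table with capacity 100 */
  struct wcount* wc = alloc(struct wcount);
  wc->word = "hamlet";
  wc->count = 1;
  ht_insert(H, wc);                    /* insert under key "hamlet" */
  elem e = ht_lookup(H, "hamlet");     /* retrieve it again */
  //@assert e != NULL && e->count == 1;
}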

4 A Universal Hash Function


One question we have to answer is how to hash strings, that is, how to map
strings to integers so that the integers are evenly distributed no matter how
the input strings are distributed.
We can get access to the individual characters in a string with the string_charat(s, i)
function, and we can get the integer ASCII value of a char with the char_ord(c)
function; both of these are defined in the C0 string library. Therefore, our
general picture of hashing strings looks like this:


int hash_string(string s) {
int len = string_length(s);
int h = 0;
for (int i = 0; i < len; i++)
//@loop_invariant 0 <= i;
{
int ch = char_ord(string_charat(s, i));
// Do something to combine h and ch
}
return h;
}
Now, if we don’t add anything to replace the comment, the function above
will still allow the hashtable to work correctly; it will just be very slow
because the hash value of every string will be zero.
A slightly better idea is to combine h and ch with addition or multipli-
cation:
for (int i = 0; i < len; i++)
//@loop_invariant 0 <= i;
{
int ch = char_ord(string_charat(s, i));
h = h + ch;
}
This is still pretty bad, however. We can see how bad by entering the
n = 45,600 news vocabulary words from Homework 2 into a table with
m = 22,800 chains (load factor is 2) and running ht_stats:
Hash table distribution: how many chains have size...
...0: 21217
...1: 239
...2: 132
...3: 78
...4: 73
...5: 55
...6: 60
...7: 46
...8: 42
...9: 23
...10+: 835
Longest chain: 176


Most of the chains are empty, and many of the chains are very, very long.
One problem is that most strings are likely to have very small hash values
when we use this hash function. An even bigger problem is that rearrang-
ing the letters in a string will always produce another string with the same
hash value – so we know that "cab" and "abc" will always collide in a
hash table. Hash collisions are inevitable, but when we can easily predict
that two strings have the same hash value, we should be suspicious that
something is wrong.
To address this, we can manipulate the hash value in some way before
we combine it with the next character. Some versions of Java use this as
their default string hashing function.

for (int i = 0; i < len; i++)


//@loop_invariant 0 <= i;
{
int ch = char_ord(string_charat(s, i));
h = 31*h;
h = h + ch;
}

This does much better when we add all the news vocabulary strings into
the hash table:

Hash table distribution: how many chains have size...


...0: 3057
...1: 6210
...2: 6139
...3: 4084
...4: 2151
...5: 809
...6: 271
...7: 53
...8: 21
...9: 4
...10+: 1
Longest chain: 10

We can try adding a bit of randomness into this function in a number
of different ways. For instance, instead of multiplying by 31, we could
multiply by a number generated by the pseudorandom number generator
from C0’s library:


rand_t r = init_rand(0x1337BEEF);
for (int i = 0; i < len; i++)
//@loop_invariant 0 <= i;
{
int ch = char_ord(string_charat(s, i));
h = rand(r) * h;
h = h + ch;
}
If we look at the performance of this function, it is comparable to the Java
hash function, though it is not actually quite as good – more of the chains
are empty, and more are longer.
Hash table distribution: how many chains have size...
...0: 3796
...1: 6214
...2: 5424
...3: 3589
...4: 2101
...5: 1006
...6: 455
...7: 145
...8: 48
...9: 15
...10+: 7
Longest chain: 11
Many other variants are possible; for instance, we could directly apply
the linear congruential generator to the hash value at every step:
for (int i = 0; i < len; i++)
//@loop_invariant 0 <= i;
{
int ch = char_ord(string_charat(s, i));
h = 1664525 * h + 1013904223;
h = h + ch;
}
The key goals are that we want a hash function that is very quick to com-
pute and that nevertheless achieves good distribution across our hash ta-
ble. Handwritten hash functions often do not work well, which can signifi-
cantly affect the performance of the hash table. Whenever possible, the use
of randomness can help to avoid any systematic bias.
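Putting the skeleton and the multiplier-31 loop body together, a complete
version of that string hash might read as follows (a sketch assembled from
the fragments above):

int hash_string(string s) {
  int len = string_length(s);
  int h = 0;
  for (int i = 0; i < len; i++)
  //@loop_invariant 0 <= i;
  {
    int ch = char_ord(string_charat(s, i));
    h = 31*h + ch;   /* multiply first, then mix in the next character */
  }
  return h;
}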


5 A Fixed-Size Implementation of Hash Tables


The implementation of hash tables we wrote in lecture did not adjust their
size. This requires that we can a priori predict a good size. Choose the size
too large and it wastes space and slows the program down due to a lack of
locality. Choose the size too small and the load factor will be high, leading
to poor asymptotic (and practical) running time.
We start with the type of lists to represent the chains of elements, and
the hash table type itself.

/*******************************/
/* library-side implementation */
/*******************************/
struct list_node {
elem data; /* data != NULL */
struct list_node* next;
};
typedef struct list_node list;

struct ht_header {
int size; /* size >= 0 */
int capacity; /* capacity > 0 */
list*[] table; /* \length(table) == capacity */
};

The first thing after the definition of a data structure is a function to
verify its invariants. We do not specify them here fully, so you can discover
some of them for yourself in the homework assignment. Besides the in-
variants noted above we should check that each data value in each chain
in the hash table is non-null and that the hash value of the key of every
element in each chain stored in A[i] is indeed i.

bool is_ht(ht H) {
if (H == NULL) return false;
if (!(H->size >= 0)) return false;
if (!(H->capacity > 0)) return false;
//@assert \length(H->table) == H->capacity;
/* check that each element of table is a valid chain */
/* includes checking that all elements are non-null */
return true;
}


Recall that the test on the length of the array must be inside an annotation,
because the \length function is not available when the code is compiled
without dynamic checking enabled.
Allocating a hash table is straightforward.

ht ht_new(int capacity)
//@requires capacity > 0;
//@ensures is_ht(\result);
{
ht H = alloc(struct ht_header);
H->size = 0;
H->capacity = capacity;
H->table = alloc_array(list*, capacity);
/* initialized to NULL */
return H;
}

Equally straightforward is searching for an element with a given key. We
omit an additional loop invariant and add an assertion that should follow
from it instead.

elem ht_lookup(ht H, key k)
//@requires is_ht(H);
{
int i = hash(k, H->capacity);
list* p = H->table[i];
while (p != NULL)
// loop invariant: p points to chain
{
//@assert p->data != NULL;
if (key_equal(elem_key(p->data), k))
return p->data;
else
p = p->next;
}
/* not in list */
return NULL;
}

We can extract the key from the element p->data because the data cannot
be null in a valid hash table.


Inserting an element follows generally the same structure as search. If
we find an element in the right chain with the same key we replace it. If we
find none, we insert a new one at the beginning of the chain.

void ht_insert(ht H, elem e)
//@requires is_ht(H);
//@requires e != NULL;
//@ensures is_ht(H);
//@ensures ht_lookup(H, elem_key(e)) != NULL;
{
key k = elem_key(e);
int i = hash(k, H->capacity);

list* p = H->table[i];
while (p != NULL)
// loop invariant: p points to chain
{
//@assert p->data != NULL;
if (key_equal(elem_key(p->data), k))
{
/* overwrite existing element */
p->data = e;
return;
} else {
p = p->next;
}
}
//@assert p == NULL;
/* prepend new element */
list* q = alloc(struct list_node);
q->data = e;
q->next = H->table[i];
H->table[i] = q;
(H->size)++;
return;
}


Exercises
Exercise 1 Extend the hash table implementation so it dynamically resizes itself
when the load factor exceeds a certain threshold. When doubling the size of the
hash table you will need to explicitly insert every element from the old hash table
into the new one, because the result of hashing depends on the size of the hash table.

Exercise 2 Extend the hash table interface with new functions ht_size that re-
turns the number of elements in a table and ht_tabulate that returns an array
with the elements in the hash table, in some arbitrary order.

Exercise 3 Complete the client-side code to build a hash table containing word
frequencies for the words appearing in Shakespeare’s collected works. You should
build upon the code in Assignment 2.

Exercise 4 Extend the hash table interface with a new function to delete an ele-
ment with a given key from the table. To be extra ambitious, shrink the size of the
hash table once the load factor drops below some minimum, similarly to the way
we could grow and shrink unbounded arrays.



Lecture Notes on
Binary Search Trees

15-122: Principles of Imperative Computation


Frank Pfenning André Platzer

Lecture 15
March 07, 2013

1 Introduction
In this lecture, we will continue considering associative arrays as we did in
the hash table lecture. This time, we will follow a different implementation
principle, however. With binary search trees we try to obtain efficient insert
and search times for associative arrays (dictionaries), which we have pre-
viously implemented as hash tables. We will eventually be able to achieve
O(log(n)) worst-case asymptotic complexity for insert and search. This also
extends to delete, although we won’t discuss that operation in lecture. In
particular, the worst-case complexity of associative array operations im-
plemented as binary search trees is better than the worst-case complexity
when implemented as hash tables.

2 Ordered Associative Arrays


Hashtables are associative arrays that organize the data in an array at an
index that is determined from the key using a hash function. If the hash
function is good, this means that the element will be placed at a reasonably
random position spread out across the whole array. If it is bad, linear search
is needed to locate the element.
There are many alternative ways of implementing associative arrays.
For example, we could have stored the elements in an array, sorted by key.
Then lookup by binary search would have been O(log(n)), but insertion
would be O(n), because it takes O(log n) steps to find the right place, but
then O(n) steps to make room for that new element by shifting all bigger
elements over. We would also need to grow the array as in unbounded
arrays to make sure it does not run out of capacity. In this lecture, we will
follow similar principles, but move to a different data structure to make
insertion a cheap operation as well, not just lookup. In particular, arrays
themselves are not flexible enough for insertion, but the data structure that
we will be devising in this lecture will be.

3 Abstract Binary Search


What are the operations that we needed to be able to perform binary search?
We needed a way of comparing the key we were looking for with the key of
a given element in our data structure. Depending on the result of that com-
parison, binary search returns the position of that element if they were the
same, advances to the left if what we are looking for is smaller, or advances
to the right if what we are looking for is bigger. For binary search to work
with the complexity O(log n), it was important that binary search advances
to the left or right many steps at once, not just by one element. Indeed, if we
would follow the abstract binary search principle starting from the middle
of the array but advancing only by one index in the array, we would obtain
linear search, which has complexity O(n), not O(log n).
Thus, binary search needs a way of comparing keys and a way of ad-
vancing through the elements of the data structure very quickly, either to
the left (towards elements with smaller keys) or to the right (towards big-
ger ones). In arrays, advancing quickly is easy, because we just compute
the new index to look at as either

int next_mid = (lower + mid) / 2;

or as

int next_mid = ((mid+1) + upper) / 2;

We use the first case if advancing from mid to the left (where next_midmid),
because the element we are looking for is smaller than the element at mid,
so we can discard all elements to the right of mid and have to look on the
left of mid. The second case will be used if advancing from mid to the right
(where next_mid > mid), because the element we are looking for is bigger
than the one at mid, so we can discard all elements to the left of mid. In
Lecture 6, we also saw that both computations might actually overflow in
arithmetic, so we devised a more clever way of computing the midpoint,
but we will ignore this for simplicity here. In Lecture 6, we considered
int as the data type; now we study data of an arbitrary type elem
provided by the client. In particular, as one step of abstraction, we will now
actually compare elements in terms of their keys.
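As a reminder, the overflow-safe way of computing a midpoint from
Lecture 6 can be phrased as follows:

int midpoint(int lower, int upper)
//@requires 0 <= lower && lower <= upper;
{
  return lower + (upper - lower) / 2;  /* lower + upper itself could overflow */
}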
Unfortunately, inserting into arrays remains an O(n) operation. For
other data structures, insertion is easy. For example, insertion into a doubly
linked list at a given list node is O(1). But if we use a sorted doubly linked
list, the insertion step will be easy, but finding the right position by binary
search is challenging, because we can only advance one step to the left or
right in a doubly linked list. That would throw us back into linear search
through the list to find the element, which gives a lookup complexity of
O(n). How can we combine the advantages of both: fast navigation by sev-
eral elements as in arrays, together with fast insertion as in doubly linked
lists? Before you read on, try to see if you can find an answer yourself.


In order to obtain the advantages of both, and, thus, enable binary
search on a data structure that supports fast insertion, we proceed as fol-
lows. The crucial observation is that arrays provide fast access to any arbi-
trary index in the array, which is why they are called a random access data
structure, but binary search only needs very selective access from each el-
ement. Whatever element binary search is looking at, it only needs access
to that element and one (sufficiently far away) left element and one (suf-
ficiently far away) right element. If binary search has just looked at index
mid, then it will subsequently only look at either (lower + mid) / 2 or
((mid+1) + upper) / 2. In particular, for each element, we need to remem-
ber what its key is, what its left successor is and what its right successor
is, but nothing else. We use this insight to generalize the principle behind
binary search to a more general data structure.

4 Binary Search in Binary Search Trees


The data structure we have developed so far results in a (binary) tree. A
binary tree consists of a set of nodes and, for each node, its left and its right
child. Finding an element in a binary search tree follows exactly the same
idea that binary search did, just on a more abstract data structure:

1. Compare the current node to what we are looking for. Stop if equal.

2. If what we are looking for is smaller, proceed to the left successor.

3. If what we are looking for is bigger, proceed to the right successor.

What do we need to know about the binary tree to make sure that this prin-
ciple will always look up elements correctly? What data structure invariant
do we need to maintain for the binary search tree? Do you have a sugges-
tion?


5 The Ordering Invariant


Binary search was only correct for arrays if the array was sorted. Only then
do we know that it is okay not to look at the upper half of the array if the
element we are looking for is smaller than the middle element, because, in
a sorted array, it can then only occur in the lower half, if at all. For binary
search to work correctly on binary search trees, we, thus, need to maintain a
corresponding data structure invariant. All elements to the right of a node
have keys that are bigger than the key of that node. And all the nodes to
the left of that node have smaller keys than the key at that node.
At the core of binary search trees is the ordering invariant.

Ordering Invariant. At any node with key k in a binary search
tree, all keys of the elements in the left subtree are strictly less
than k, while all keys of the elements in the right subtree are
strictly greater than k.

This implies that no key occurs more than once in a tree, and we have to
make sure our insertion function maintains this invariant.
If our binary search tree were perfectly balanced, that is, had the same
number of nodes on the left as on the right for every subtree, then the order-
ing invariant would ensure that search for an element with a given key has
asymptotic complexity O(log(n)), where n is the number of elements in the
tree. Why? When searching for a given key k in the tree, we just compare
k with the key k' of the entry at the root. If they are equal, we have found
the entry. If k < k' we recursively search in the left subtree, and if k' < k
we recursively search in the right subtree. This is just like binary search,
except that instead of an array we traverse a tree data structure. Unlike in
an array, however, we will see that insertion is quick as well.

6 The Interface
The basic interface for binary search trees is almost the same as for hash
tables, because both implement the same abstract principle: associative
arrays. Binary search trees, of course, do not need a hash function. We
assume that the client defines a type elem of elements and a type key of
keys, as well as functions to extract keys from elements and to compare
keys. Then the implementation of binary search trees will provide a type
bst and functions to insert an element and to search for an element with a
given key.


/* Client-side interface */
typedef ______* elem;
typedef ______ key;

key elem_key(elem e)
//@requires e != NULL;
;

int key_compare(key k1, key k2)
//@ensures -1 <= \result && \result <= 1;
;

/* Library interface */
typedef ________ bst;

bst bst_new();
void bst_insert(bst B, elem e)
//@requires e != NULL;
;
elem bst_lookup(bst B, key k); /* return NULL if not in tree */
We stipulate that elem is some form of pointer type so we can return NULL
if no element with the given key can be found. Generally, some more oper-
ations may be requested at the interface, such as the number of elements in
the tree or a function to delete an element with a given key.
The key_compare function provided by the client is different from the
equality function we used for hash tables. For binary search trees, we ac-
tually need to compare keys k1 and k2 and determine if k1 < k2, k1 = k2,
or k1 > k2. A standard approach to this in imperative languages is for a
comparison function to return an integer r, where r < 0 means k1 < k2,
r = 0 means k1 = k2, and r > 0 means k1 > k2. Our contract captures that
we expect key_compare to return no values other than -1, 0, and 1.
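For instance, a client using integer keys might satisfy this contract as
follows (a sketch):

typedef int key;

int key_compare(key k1, key k2)
//@ensures -1 <= \result && \result <= 1;
{
  if (k1 < k2) return -1;
  else if (k1 > k2) return 1;
  else return 0;
}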

7 A Representation with Pointers


We will use a pointer-based implementation for trees where every node has
two pointers: one to its left child and one to its right child. A missing child
is represented as NULL, so a leaf just has two null pointers.


struct tree_node {
elem data;
struct tree_node* left;
struct tree_node* right;
};
typedef struct tree_node tree;
As usual, we have a header which in this case just consists of a pointer to the
root of the tree. We often keep other information associated with the data
structure in these headers, such as the size.
struct bst_header {
tree* root;
};

8 Searching for a Key


In this lecture, we will implement insertion and lookup first before consid-
ering the data structure invariant. This is not the usual way we proceed,
but it turns out finding a good function to test the invariant is a signifi-
cant challenge—meanwhile we would like to exercise programming with
pointers in a tree a little. For now, we just assume we have two functions
bool is_ordtree(tree* T);
bool is_bst(bst B);
Search is quite straightforward, implementing the informal description
above. Recall that key_compare(k1,k2) returns -1 if k1 < k2, 0 if k1 = k2,
and 1 if k1 > k2.
elem tree_lookup(tree* T, key k)
//@requires is_ordtree(T);
//@ensures \result == NULL || key_compare(elem_key(\result), k) == 0;
{
if (T == NULL) return NULL;
int r = key_compare(k, elem_key(T->data));
if (r == 0)
return T->data;
else if (r < 0)
return tree_lookup(T->left, k);
else //@assert r > 0;
return tree_lookup(T->right, k);
}


elem bst_lookup(bst B, key k)


//@requires is_bst(B);
//@ensures \result == NULL || key_compare(elem_key(\result), k) == 0;
{
return tree_lookup(B->root, k);
}

We chose here a recursive implementation, following the structure of a tree,
but in practice an iterative version may also be a reasonable alternative (see
Exercise 1).
We can check the invariant: if T is ordered when tree_lookup(T) is
called (and presumably is_bst would guarantee that), then both subtrees
should be ordered as well and the invariant is preserved.

9 Inserting an Element
Inserting an element is almost as simple. We just proceed as if we are look-
ing for the key of the given element. If we find a node with that key, we just
overwrite its data field. If not, we insert it in the place where it would have
been, had it been there in the first place. This last clause, however, creates
a small difficulty. When we hit a null pointer (which indicates the key was
not already in the tree), we cannot just modify NULL. Instead, we return the
new tree so that the parent can modify itself.

tree* tree_insert(tree* T, elem e)
//@requires is_ordtree(T);
//@requires e != NULL;
//@ensures is_ordtree(\result);
{
if (T == NULL) {
/* create new node and return it */
T = alloc(struct tree_node);
T->data = e;
T->left = NULL; T->right = NULL;
return T;
}
int r = key_compare(elem_key(e), elem_key(T->data));
if (r == 0)
T->data = e; /* modify in place */
else if (r < 0)
T->left = tree_insert(T->left, e);
else //@assert r > 0;
T->right = tree_insert(T->right, e);
return T;
}

For the same reason as in tree_lookup, we expect the subtrees to be or-
dered when we make recursive calls. The result should be ordered for
analogous reasons. The returned subtree will also be useful at the root.

void bst_insert(bst B, elem e)
//@requires is_bst(B);
//@requires e != NULL;
//@ensures is_bst(B);
{
B->root = tree_insert(B->root, e);
return;
}
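A hypothetical client snippet exercising these functions, assuming some
element e provided by the client:

void example(elem e)
//@requires e != NULL;
{
  bst B = bst_new();
  bst_insert(B, e);                        /* B now contains e */
  elem found = bst_lookup(B, elem_key(e));
  //@assert found != NULL;                 /* e is found under its own key */
}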


10 Checking the Ordering Invariant


When we analyze the structure of the recursive functions implementing
search and insert, we are tempted to say that a binary search tree is ordered if
either it is null, or the key of its left child is smaller, the key of its right child
is greater, and both subtrees are themselves ordered. This
would yield the following code:

/* THIS CODE IS BUGGY */


bool is_ordtree(tree* T) {
if (T == NULL) return true; /* an empty tree is a BST */
key k = elem_key(T->data);
return (T->left == NULL
|| (key_compare(elem_key(T->left->data), k) < 0
&& is_ordtree(T->left)))
&& (T->right == NULL
|| (key_compare(k, elem_key(T->right->data)) < 0
&& is_ordtree(T->right)));
}

While this should always be true for a binary search tree, it is far weaker
than the ordering invariant stated at the beginning of lecture. Before read-
ing on, you should check your understanding of that invariant to exhibit a
tree that would satisfy the above, but violate the ordering invariant.


There is actually more than one problem with this. The most glaring
one is that the following tree would pass this test:

[Figure: a tree with root 7, left child 5 and right child 11, where node 5
has children 1 and 9.]


Even though, locally, the key of the left node is always smaller and on the
right is always bigger, the node with key 9 is in the wrong place and we
would not find it with our search algorithm since we would look in the
right subtree of the root.
An alternative way of thinking about the invariant is as follows. As-
sume we are at a node with key k.

1. If we go to the left subtree, we establish an upper bound on the keys in
the subtree: they must all be smaller than k.

2. If we go to the right subtree, we establish a lower bound on the keys in
the subtree: they must all be larger than k.

The general idea then is to traverse the tree recursively, and pass down
an interval with lower and upper bounds for all the keys in the tree. The
following diagram illustrates this idea. We start at the root with an unre-
stricted interval, allowing any key, which is written as (−∞, +∞). As usual
in mathematics we write intervals as (x, z) = {y | x < y and y < z}. At
the leaves we write the interval for the subtree. For example, if there were
a left subtree of the node with key 7, all of its keys would have to be in the
interval (5, 7).

[Figure: a tree with root 9 whose left child is 5, and 5's right child is 7,
annotated with intervals: the root is visited with (−∞, +∞), node 5 with
(−∞, 9), and node 7 with (5, 9); the empty subtrees receive (−∞, 5),
(5, 7), (7, 9), and (9, +∞).]


The only difficulty in implementing this idea is the unbounded intervals,
written above as −∞ and +∞. Here is one possibility: we pass not the key
itself but an element that bounds the tree, from which we can extract the
key. This allows us to pass NULL in case there is no
lower or upper bound.

bool is_ordered(tree* T, elem lower, elem upper) {
if (T == NULL) return true;
if (T == NULL) return true;
if (T->data == NULL) return false;
key k = elem_key(T->data);
if (!(lower == NULL || key_compare(elem_key(lower),k) < 0))
return false;
if (!(upper == NULL || key_compare(k,elem_key(upper)) < 0))
return false;
return is_ordered(T->left, lower, T->data)
&& is_ordered(T->right, T->data, upper);
}

bool is_ordtree(tree* T) {
/* initially, we have no bounds - pass in NULL */
return is_ordered(T, NULL, NULL);
}

bool is_bst(bst B) {
return B != NULL && is_ordtree(B->root);
}


A word of caution: the is_ordtree(T) pre- and post-condition of the
function tree_insert is actually not strong enough to prove the correct-
ness of the recursive function. A similar remark applies to tree_lookup.
This is because of the missing information of the bounds. We will return to
this issue in a later lecture.

11 The Shape of Binary Search Trees


We have already mentioned that balanced binary search trees have good
properties, such as logarithmic time for insertion and search. The question
is whether binary search trees will be balanced. This depends on the order of
insertion. Consider the insertion of numbers 1, 2, 3, and 4.
If we insert them in increasing order we obtain the following trees in
sequence.
[Figure: the four trees after inserting 1, then 2, then 3, then 4: each new
key becomes the right child of the previous one, so the tree degenerates
into a chain.]


Similarly, if we insert them in decreasing order we get a straight line,
always going to the left. If we instead insert in the order 3, 1, 4, 2, we obtain
the following sequence of binary search trees:

[Figure: the four trees after inserting 3, then 1, then 4, then 2: first the
single node 3; then 3 with left child 1; then 3 with left child 1 and right
child 4; finally 2 is added as the right child of 1.]


Clearly, the last tree is much more balanced. In the extreme, if we insert
elements with their keys in order, or reverse order, the tree will be linear,
and search time will be O(n) for n items.
These observations mean that it is extremely important to pay attention
to the balance of the tree. We will discuss ways to keep binary search trees
balanced in a later lecture.


Exercises
Exercise 1 Rewrite tree_lookup to be iterative rather than recursive.

Exercise 2 Rewrite tree_insert to be iterative rather than recursive. [Hint:
The difficulty will be to update the pointers in the parents when we replace a node
that is null. For that purpose we can keep a “trailing” pointer which should be the
parent of the node currently under consideration.]

Exercise 3 The binary search tree interface only expected a single function for key
comparison to be provided by the client:

int key_compare(key k1, key k2)
//@ensures -1 <= \result && \result <= 1;

An alternative design would have been to, instead, expect the client to provide a
set of key comparison functions, one for each outcome:

bool key_equal(key k1, key k2);
bool key_greater(key k1, key k2);
bool key_less(key k1, key k2);

What are the advantages and disadvantages of such a design?



Lecture Notes on
Priority Queues

15-122: Principles of Imperative Computation


Frank Pfenning

Lecture 16
October 18, 2012

1 Introduction
In this lecture we will look at priority queues as an abstract type and dis-
cuss several possible implementations. We then pick the implementation
as heaps and start to work towards an implementation. Heaps have the
structure of binary trees, a very common structure since a (balanced) bi-
nary tree with n elements has depth O(log(n)). During the presentation of
algorithms on heaps we will also come across the phenomenon that invari-
ants must be temporarily violated and then restored. We will study this in
more depth in the next lecture. From the programming point of view, we
will see a cool way to implement binary trees in arrays which, alas, does
not work very often.

2 Priority Queues
Priority queues are a generalization of stacks and queues. Rather than in-
serting and deleting elements in a fixed order, each element is assigned a
priority represented by an integer. We always remove an element with the
highest priority, which is given by the minimal integer priority assigned.
Priority queues often have a fixed size. For example, in an operating system
the runnable processes might be stored in a priority queue, where certain
system processes are given a higher priority than user processes. Similarly,
in a network router packets may be routed according to some assigned pri-
orities. In both of these examples, bounding the size of the queues helps to
prevent so-called denial-of-service attacks where a system is essentially dis-
abled by flooding its task store. This can happen accidentally or on purpose
by a malicious attacker.
Here is an abstract interface to a (bounded) priority queue. Our imple-
mentation uses a data structure called a heap which we discuss shortly.
/* Library-side interface */

typedef _______________ pq;

pq pq_new(int capacity) /* create new heap of given capacity */
//@requires capacity > 0;
;
bool pq_empty(pq P); /* is P empty? */
bool pq_full(pq P); /* is P full? */
void pq_insert(pq P, elem e) /* insert e into P */
//@requires !pq_full(P);
;
elem pq_min(pq P) /* find minimum */
//@requires !pq_empty(P);
;
elem pq_delmin(pq P) /* delete minimum */
//@requires !pq_empty(P);
;
On the client side we must have a function that extracts the priority of an
element, since the library cannot know in general what this priority would
be. Since priorities in general are supposed to be a linear order, we just
use integers directly, rather than an abstract type such as key together with a
comparison function.
/* Client-side interface */

typedef ______________ elem;

int elem_priority(elem e);
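For example, a client managing processes might instantiate this interface
as follows; the struct and field names here are made up for illustration.

struct proc {
  string name;
  int priority;    /* smaller number = higher priority, removed first */
};
typedef struct proc* elem;

int elem_priority(elem e)
//@requires e != NULL;
{
  return e->priority;
}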

3 Some Implementations
Before we come to heaps, it is worth considering different implementation
choices and consider the complexity of various operations.


The first idea is to use an unordered array of size limit, where we keep
a current index n. Inserting into such an array is a constant-time operation,
since we only have to insert it at n and increment n. However, finding
the minimum will take O(n), since we have to scan the whole portion of
the array that’s in use. Consequently, deleting the minimal element also
takes O(n): first we find the minimal element, then we swap it with the last
element in the array, and decrement n.
A second idea is to keep the array sorted. In this case, inserting an el-
ement is O(n). We can quickly (in O(log(n)) steps) find the place i where
it belongs using binary search, but then we need to shift elements to make
room for the insertion. This takes O(n) copy operations. Finding the mini-
mum is O(1) (since it is stored at index 0 in the array). We can also make
deleting it O(1) if we keep the array sorted in descending order, or if we
keep two array indices: one for the smallest current element and one for
the largest.
To anticipate our analysis, heaps will have logarithmic time for insert
and deleting the minimal element.

                   insert       delmin       findmin
unordered array    O(1)         O(n)         O(n)
ordered array      O(n)         O(1)         O(1)
heap               O(log(n))    O(log(n))    O(1)

4 The Heap Invariant


Typically, when using a priority queue, we expect the number of inserts
and deletes to roughly balance. Then neither the unordered nor the or-
dered array provide a good data structure since a sequence of n inserts and
deletes will have worst-case complexity O(n^2).
The idea of the heap is to use something cleverly situated in between.
A heap is like an array that is ordered to some extent: enough, that the least
element can be found in O(1), but not so rigidly that inserting would take
O(n) time. A heap is a binary tree where the invariant guarantees that the
least element is at the root. For this to be the case we just require that the
key of a node is less or equal to the keys of its children. Alternatively, we
could say that each node except the root is greater or equal to its parent.

Heap ordering invariant, alternative (1) : The key of each node in the tree
is less or equal to all of its children’s keys.


Heap ordering invariant, alternative (2) : The key of each node in the tree
except for the root is greater or equal to its parent’s key.

These two characterizations are equivalent. Sometimes it turns out to be
convenient to think of it one way, sometimes the other. Either of them im-
plies that the minimal element in the heap is at the root, due to the transi-
tivity of the ordering.
There is a second invariant, not as crucial but convenient, which is that
we fill the tree level by level, from left to right. This means the shape of the
tree is completely determined by the number of elements in it. Here are the
shapes of heaps with 1 through 7 nodes.

[Figure: the shapes of heaps with 1 through 7 nodes, each level filled from
left to right.]


We call this latter invariant the shape invariant.

5 Inserting into a Heap


When we insert into a heap, we already know (by the shape invariant)
where a new node has to go. However, we cannot simply put the new
data element there, because it might violate the ordering invariant. We do
it anyway and then work to restore the invariant. We will talk more about
temporarily violating a data structure invariant in the next lecture, as we
develop code. Let’s consider an example. On the left is the heap before
insertion of data with key 1; on the right after, but before we have restored
the invariant.

[Figure: on the left, a heap with root 2, children 4 and 3, where 4 has
children 9 and 7 and 3 has child 8; on the right, the same heap with 1
inserted as the right child of 3, with a dashed edge between 3 and 1.]


The dashed line indicates where the ordering invariant might be violated.
And, indeed, 3 > 1.
We can fix the invariant at the dashed edge by swapping the two nodes.
The result is shown on the right.

[Figure: on the left, the heap with the violation at the edge from 3 to 1;
on the right, the heap after the swap, with 1 as the right child of the root
and 3 as 1's right child, and a dashed edge between 2 and 1.]


The link from the node with key 1 to the node with key 8 will always satisfy
the invariant, because we have replaced the previous key 3 with a smaller
key (1). But the invariant might now be violated going up the tree to the
root. And, indeed, 2 > 1.
We repeat the operation, swapping 1 with 2.

[Figure: on the left, the heap with the violation at the edge from 2 to 1;
on the right, the heap after swapping 1 with 2, so that 1 is now at the root
with children 4 and 2.]


As before, the link between the root and its left child continues to satisfy
the invariant because we have replaced the key at the root with a smaller
one. Furthermore, since the root node has no parent, we have fully restored
the ordering invariant.
In general, we swap a node with its parent if the parent has a strictly
greater key. If not, or if we reach the root, we have restored the ordering
invariant. The shape invariant was always satisfied since we inserted the
new node into the next open place in the tree.
The operation that restores the ordering invariant is called sifting up,
since we take the new node and move it up the heap until the invariant has
been reestablished. The complexity of this operation is O(l), where l is the
number of levels in the tree. For a tree of n ≥ 1 nodes there are ⌊log(n)⌋ + 1
levels. So the complexity of inserting a new node is O(log(n)), as promised.

6 Deleting the Minimal Element


To delete the minimal element from the priority queue we cannot sim-
ply delete the root node where the minimal element is stored, because we
would not be left with a tree. But by the shape invariant we know what the
tree has to look like. So we take the last element in the tree and move it to
the root, and delete that last node.

[Figure: on the left, a heap with root 2, children 3 and 4, where 3 has
children 9 and 7 and 4 has child 8; on the right, the root 2 has been
deleted and the last element 8 has been moved to the root, with dashed
edges to its children 3 and 4.]


However, the node that is now at the root might have a strictly greater key
than one or both of its children, which would violate the ordering invariant.
If the ordering invariant is indeed violated, we swap the node with the
smaller of its children.

      8                          3
     / \                        / \
    3   4                      8   4
   / \                        / \
  9  7                       9  7


This will reestablish the invariant at the root. In general we see this as
follows. Assume that before the swap the invariant is violated, and the left
child has a smaller key than the right one. It must also be smaller than
the root, otherwise the ordering invariant would hold. Therefore, after we
swap the root with its left child, the root will be smaller than its left child. It
will also be smaller than its right child, because the left was smaller than the
right before the swap. When the right is smaller than the left, the argument
is symmetric.
Unfortunately, we may not be done, because the invariant might now
be violated at the place where the old root ended up. If not, we stop. If yes,
we compare the children as before and swap with the smaller one.

      3                          3
     / \                        / \
    8   4                      7   4
   / \                        / \
  9  7                       9  8


We stop this downward movement of the new node if either the order-
ing invariant is satisfied, or we reach a leaf. In both cases we have fully
restored the ordering invariant. This process of restoring the invariant is
called sifting down, since we move a node down the tree. As in the case for
insert, the number of operations is bounded by the number of levels in the
tree, which is O(log(n)) as promised.

7 Finding the Minimal Element


Since the minimal element is at the root, finding the minimal element is a
constant-time operation.

8 Representing Heaps as Arrays


A first thought on how to represent a heap would be using structs with
pointers. The sift-down operation follows the pointers from nodes to their
children, and the sift-up operation goes from children to their par-
ents. This means all interior nodes require three pointers: one to each child
and one to the parent, the root requires two, and each leaf requires one.
While a pointer structure is not unreasonable, there is a more elegant
representation using arrays. We use binary numbers as addresses of tree
nodes. Assume a node has index i. Then we append a 0 to the binary
representation of i to obtain the index for the left child and a 1 to obtain the
index of the right child. We start at the root with the number 1. If we tried
to use 0, then the root and its left child would get the same address. The
node numbers for a full three-level tree are shown below, on the left in
binary and on the right in decimal.

          1                          1
        /   \                      /   \
      10     11                   2     3
     /  \   /  \                 / \   / \
   100  101 110 111             4   5 6   7


Mapping this back to numeric operations, for a node at index i we obtain
its left child as 2*i, its right child as 2*i+1, and its parent as i/2. Care must
be taken, since any of these may be out of bounds of the array. A node may
not have a right child, or neither right nor left child, and the root does not
have a parent.
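To make this index arithmetic concrete, here is a minimal sketch of navigation helpers in C; the names left_child, right_child, and parent are our own choice for illustration and are not part of the priority queue interface.

/* Sketch: index arithmetic for the 1-based array representation of a heap.
   Callers must still check the results against next and limit. */
int left_child(int i)  { return 2*i; }
int right_child(int i) { return 2*i + 1; }
int parent(int i)      { return i/2; }

For example, the node at index 3 has children at indices 6 and 7, and both 6/2 and 7/2 equal 3 under integer division; parent(1) yields the unused index 0, reflecting that the root has no parent.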
In the next lecture we will write some code to implement heaps and
reason about its correctness.

Exercises
Exercise 1 During the lecture, students suggested to work with a sorted linked
list instead of a sorted array to implement priority queues. What is the complex-
ity of the priority queue operations on this representation? What are the advan-
tages/disadvantages compared to an ordered array?

Exercise 2 Consider implementing priority queues using an unordered list in-
stead of an unordered array. What is the complex-
ity of the priority queue operations on this representation? What are the advan-
tages/disadvantages compared to an unordered array?



Lecture Notes on
Restoring Invariants

15-122: Principles of Imperative Computation


Frank Pfenning

Lecture 17
October 23, 2012

1 Introduction
In this lecture we will implement operations on heaps. The theme of this
lecture is reasoning with invariants that are partially violated, and making
sure they are restored before the completion of an operation. We will only
briefly review the algorithms for inserting and deleting the minimal node
of the heap; you should read the notes for Lecture 16 on priority queues
and keep them close at hand.
Temporarily violating and restoring invariants is a common theme in
algorithms. It is a technique you need to master.

2 The Heap Structure


We use the following header struct to represent heaps.
struct heap_header {
int limit; /* limit = capacity+1 */
int next; /* 1 <= next && next <= limit */
elem[] data; /* \length(data) == limit */
};
typedef struct heap_header* heap;
Since the significant array elements start at 1, as explained in the previous
lecture, the limit must be one greater than the desired capacity. The next in-
dex must be between 1 and limit, and the element array must have exactly
limit elements.

3 The Heap Ordering Invariant


Before we implement the operations, we define a function that checks the
heap invariants. The shape invariant is automatically satisfied due to the
representation of heaps as arrays, but we need to carefully check the order-
ing invariants. It is crucial that no instance of the data structure that is not a
true heap will leak across the interface to the client, because the client may
then incorrectly call operations that require heaps on data structures that
are not heaps.
First, we check that the heap is not null and that the length of the array
matches the given limit. The latter must be checked in an annotation,
because, in C and C0, the length of an array is not available to us at runtime
except in contracts.
Second, we check that next is in range, between 1 and limit. As a general
stylistic choice, when writing functions that check data structure invariants
and have to return a boolean, we think of the tests like assertions. If they
would fail, we return false instead. Therefore we usually write negated
conditionals and return false if the negated condition is true. In the code
below, we think
//@assert H != NULL;
//@assert \length(H->data) == H->limit;
//@assert 1 <= H->next && H->next <= H->limit;
and write
bool is_heap(heap H) {
if (!(H != NULL)) return false;
//@assert \length(H->data) == H->limit;
if (!(1 <= H->next && H->next <= H->limit)) return false;
...
}
The remaining code has to check the ordering invariant. It turns out to be
simpler in the second form, which stipulates that each node except the root
needs to be greater or equal to its parent. To check this we iterate through
the array and compare the priority of each node data[i] with its parent,
except for the root (i = 1) which has no parent. As a matter of program-
ming style, we always put the parent to left in any comparison, to make
it easy to see that we are comparing the correct elements. We also write
struct heap_header* H for the argument to emphasize that the argument
H is not necessarily a heap.

bool is_heap(struct heap_header* H) {


if (!(H != NULL)) return false;
//@assert \length(H->data) == H->limit;
if (!(1 <= H->next && H->next <= H->limit)) return false;
/* check parent <= node for all nodes except root (i = 1) */
for (int i = 2; i < H->next; i++)
//@loop_invariant 2 <= i;
if (!(H->data[i/2] <= H->data[i])) return false;
return true;
}

The test in the loop is not quite right, but let's just verify that it is at least
safe:

• We can dereference H->data because we have checked that H is not
  null.

• We can access H->data at i, because (by the loop invariant) i >= 2 >= 1
  and (by the loop guard) i < H->next. The latter implies safety since
  H->next <= H->limit = \length(H->data).

• We can access H->data at i/2, because i/2 >= 1 since i >= 2 (by the
  loop invariant), and i/2 < i < \length(H->data).

Why is it incorrect? Recall that in our interface we specified heaps to con-
tain data of type elem, and that no assumption should be made about this
type except that the client provides a function elem_priority. So we need
to extract the priority from the data element.

if (!(elem_priority(H->data[i/2]) <= elem_priority(H->data[i])))
  return false;

We commonly need to access the priority of data stored in the heap, so
we separate this out as a function. The only tricky aspect of this function
is its contract. We cannot require the argument to be a heap, since in the
is_heap function we don’t know this yet! It would also make is_heap and
the priority function mutually recursive, leading to nontermination. But
we need to say enough so that access to the heap array is safe.

int priority(struct heap_header* H, int i)
//@requires H != NULL;
//@requires 1 <= i && i < H->next;
//@requires H->next <= \length(H->data);
{
return elem_priority(H->data[i]);
}

The middle line is a little stronger than we need for safety, but it is im-
portant that we never access an element that is meaningless, like the one
stored at index 0, and the ones stored at H->next and beyond. Then the
final version of our is_heap function is:
bool is_heap(struct heap_header* H) {
if (!(H != NULL)) return false;
//@assert \length(H->data) == H->limit;
if (!(1 <= H->next && H->next <= H->limit)) return false;
for (int i = 2; i < H->next; i++)
//@loop_invariant 2 <= i;
if (!(priority(H, i/2) <= priority(H, i))) return false;
return true;
}

4 Creating Heaps
We start with the simple code to test if a heap is empty or full, and to allo-
cate a new (empty) heap. A heap is empty if the next element to be inserted
would be at index 1. A heap is full if the next element to be inserted would
be at index limit (the size of the array).
bool pq_empty(heap H)
//@requires is_heap(H);
{
return H->next == 1;
}

bool pq_full(heap H)
//@requires is_heap(H);
{
return H->next == H->limit;
}

To create a new heap, we allocate a struct and an array and set all the
right initial values.

heap pq_new(int capacity)
//@requires capacity > 0;
//@ensures is_heap(\result) && pq_empty(\result);
{
heap H = alloc(struct heap_header);
H->limit = capacity+1;
H->next = 1;
H->data = alloc_array(elem, capacity+1);
return H;
}

5 Insert and Sifting Up


The shape invariant tells us exactly where to insert the new element: at the
index H!next in the data array. Then we increment the next index.

void pq_insert(heap H, elem e)
//@requires is_heap(H) && !pq_full(H);
//@ensures is_heap(H);
{
H->data[H->next] = e;
(H->next)++;
...
}

By inserting e in its specified place, we have, of course, violated the order-
ing invariant. We need to sift up the new element until we have restored
the invariant. The invariant is restored when the new element is bigger
than or equal to its parent or when we have reached the root. We still need
to sift up when the new element is less than its parent. This suggests the
following code:

int i = H->next - 1;
while (i > 1 && priority(H,i) < priority(H,i/2))
{
swap(H->data, i, i/2);
i = i/2;
}

Here, swap is the standard function, swapping two elements of the array.
Setting i = i/2 is moving up in the array, to the place we just swapped the
new element to.
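For completeness, here is a minimal sketch of such a swap function in C; the C0 version used with this code would take an elem[] argument and carry //@requires bounds contracts.

/* Sketch: exchange A[i] and A[j]; both indices must be in bounds. */
void swap(elem A[], int i, int j) {
  elem tmp = A[i];
  A[i] = A[j];
  A[j] = tmp;
}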
At this point, as always, we should ask why accesses to the elements
of the priority queue are safe. By short-circuiting of conjunction, we know
that i > 1 when we ask priority(H, i) < priority(H, i/2). But we need a
loop invariant to make sure that it respects the upper bound. The index i
starts at H->next - 1, so it should always be strictly less than H->next.

int i = H->next - 1;
while (i > 1 && priority(H,i) < priority(H,i/2))
//@loop_invariant 1 <= i && i < H->next;
{
swap(H->data, i, i/2);
i = i/2;
}

One small point regarding the loop invariant: we just incremented H->next,
so it must be strictly greater than 1 and therefore the invariant 1 <= i must
be satisfied.
But how do we know that swapping the element up the tree restores the
ordering invariant? We need an additional loop invariant which states that
H is a valid heap except at index i. The node at index i may be smaller than
its parent, but it still needs to be less or equal to its children. We therefore
postulate a function is_heap_except_up and use it as a loop invariant.

int i = H->next - 1;
while (i > 1 && priority(H,i) < priority(H,i/2))
//@loop_invariant 1 <= i && i < H->next;
//@loop_invariant is_heap_except_up(H, i);
{
swap(H->data, i, i/2);
i = i/2;
}

The next step is to write this function. We copy the is_heap function, but
check a node against its parent only when it is different from the distin-
guished element where the exception is allowed.

bool is_heap_except_up(heap H, int n) {
if (H == NULL) return false;
//@assert \length(H->data) == H->limit;
if (!(1 <= H->next && H->next <= H->limit)) return false;
for (int i = 2; i < H->next; i++)
//@loop_invariant 2 <= i;
{
if (i != n && !(priority(H, i/2) <= priority(H, i)))
return false;
}
return true;
}

We observe that is_heap_except_up(H, 1) is equivalent to is_heap(H). That's
because the loop over i starts at 2, so the exception i != n is always true.

L ECTURE N OTES O CTOBER 23, 2012


Restoring Invariants L17.8

Now we try to prove that this is indeed a loop invariant, and there-
fore our function is correct. Rather than using a lot of text we verify these
properties on general diagrams. Other versions of this diagram are entirely
symmetric. On the left is the relevant part of the heap before the swap and
on the right is the relevant part of the heap after the swap. The relevant
nodes in the tree are labeled with their priority. Nodes that may be above
a or below c, c1 , c2 and to the right of a are not shown. These do not enter
into the invariant discussion, since their relations between each other and
the shown nodes remain fixed. Also, if x is in the last row the constraints
regarding c1 and c2 are vacuous.

        a                          a
        |                          |
        b                          x
       / \                        / \
      c   x                      c   b
         / \                        / \
       c1   c2                    c1   c2

     (before swap)              (after swap)

We know the following properties on the left from which the properties
shown on the right follow as shown:

ab (1) order a?x allowed exception


bc (2) order xc from (5) and (2)
x  c1 (3) order xb from (5)
x  c2 (4) order
b  c1 ??
x<b (5) since we swap b  c2 ??

So we see that simply stipulating the (temporary) invariant that every node
is greater or equal to its parent except for the one labeled x is not strong
enough. It is not necessarily preserved by a swap.
But we can strengthen it a bit. You might want to think about how
before you move on to the next page.

The strengthened invariant also requires that the children of the po-
tentially violating node x are greater or equal to their grandparent! Let’s
reconsider the diagrams.

        a                          a
        |                          |
        b                          x
       / \                        / \
      c   x                      c   b
         / \                        / \
       c1   c2                    c1   c2

     (before swap)              (after swap)

We have more assumptions on the left now ((6) and (7)), but we have also
two additional proof obligations on the right (a <= c and a <= b).

  Before (left):                   After (right):

  a <= b   (1) order               a ? x    allowed exception
  b <= c   (2) order               a <= c   from (1) and (2)
  x <= c1  (3) order               a <= b   (1)
  x <= c2  (4) order               x <= c   from (5) and (2)
  x < b    (5) since we swap       x <= b   from (5)
  b <= c1  (6) grandparent         b <= c1  (6)
  b <= c2  (7) grandparent         b <= c2  (7)

Success! We just need to update the code for is_heap_except_up to check
this additional property.
bool is_heap_except_up(heap H, int n) {
if (H == NULL) return false;
//@assert \length(H->data) == H->limit;
if (!(1 <= H->next && H->next <= H->limit)) return false;
for (int i = 2; i < H->next; i++)
//@loop_invariant 2 <= i;
{
if (i != n && !(priority(H, i/2) <= priority(H, i)))
return false;
/* for children of node n, check grandparent */
if (i/2 == n && (i/2)/2 >= 1
&& !(priority(H, (i/2)/2) <= priority(H,i)))
return false;
}
return true;
}

Note that the strengthened loop invariant (or, rather, the strengthened
definition of what it means to be a heap except in one place) is not necessary
to show that the postcondition of pq_insert (i.e. is_heap(H)) is implied.
Postcondition: If the loop exits, we know the loop invariants and the negated
loop guard:

  1 <= i < next                                         (LI 1)
  is_heap_except_up(H, i)                               (LI 2)
  Either i <= 1 or priority(H, i) >= priority(H, i/2)   (negated loop guard)

We distinguish the two cases.

Case: i <= 1. Then i = 1 from (LI 1), and is_heap_except_up(H, 1). As
observed before, that is equivalent to is_heap(H).

Case: priority(H, i) >= priority(H, i/2). Then the only possible index
i where is_heap_except_up(H, i) makes an exception and does not
check whether priority(H, i/2) <= priority(H, i) is actually no ex-
ception, and we have is_heap(H).

6 Deleting the Minimum and Sifting Down


Recall that deleting the minimum swaps the root with the last element in
the current heap and then applies the sifting down operation to restore the
invariant. As with insert, the operation itself is rather straightforward, al-
though there are a few subtleties. First, we have to check that H is a heap,
and that it is not empty. Then we save the minimal element, swap it with
the last element (at next - 1), and delete the last element (now the element
that was previously at the root) from the heap by decrementing next.
elem pq_delmin(heap H)
//@requires is_heap(H) && !pq_empty(H);
//@ensures is_heap(H);
{
int n = H->next;
elem min = H->data[1];
H->data[1] = H->data[n-1];
H->next = n-1;
if (H->next > 1) sift_down(H, 1);
return min;
}

Next we need to restore the heap invariant by sifting down from the root,
with sift_down(H, 1). We only do this if there is at least one element left
in the heap.
But what is the precondition for the sifting down operation? Again, we
cannot express this using the functions we have already written. Instead,
we need a function is_heap_except_down(H, n) which verifies that the
heap invariant is satisfied in H, except possibly at n. This time, though,
it is between n and its children where things may go wrong, rather than
between n and its parent as in is_heap_except_up(H, n). In the pictures
below this would be at n = 1 on the left and n = 2 on the right.

      8                          3
     / \                        / \
    3   4                      8   4
   / \                        / \
  9  7                       9  7

  (n = 1)                    (n = 2)


We change the test accordingly. Anticipating the earlier problem, we again
require that the children of the exceptional node are greater or equal to
their grandparent.

bool is_heap_except_down(heap H, int n) {
if (H == NULL) return false;
//@assert \length(H->data) == H->limit;
if (!(1 <= H->next && H->next <= H->limit)) return false;
/* check parent <= node for all nodes except root (i = 1) */
/* and children of n (i/2 = n) */
for (int i = 2; i < H->next; i++)
//@loop_invariant 2 <= i;
{
if (i/2 != n && !(priority(H, i/2) <= priority(H, i)))
return false;
/* for children of node n, check grandparent */
if (i/2 == n && (i/2)/2 >= 1
&& !(priority(H, (i/2)/2) <= priority(H,i)))
return false;
}

return true;
}

With this we can have the right invariant to write our sift_down func-
tion. The tricky part of this function is the nature of the loop. Our loop
index i starts at n (which actually will always be 1 when this function is
called). We have reached a leaf if 2*i >= next because if there is no left
child, there cannot be a right one, either. So the outline of our function
shapes up as follows:

void sift_down(heap H, int i)
//@requires 1 <= i && i < H->next;
//@requires is_heap_except_down(H, i);
//@ensures is_heap(H);
{ int n = H->next;
int left = 2*i;
int right = left+1;
while (left < n)
//@loop_invariant 1 <= i && i < n;
//@loop_invariant left == 2*i && right == 2*i+1;
//@loop_invariant is_heap_except_down(H, i);
...
}

We also have written down three loop invariants: the bounds for i, the
heap invariant (everywhere, except possibly at i, looking down), and the
invariant defining local variables left and right, standing for the left and
right children of i.
We want to return from the function if we have restored the invariant,
that is priority(H, i) <= priority(H, 2*i) and priority(H, i) <= priority(H,
2*i + 1). However, the latter reference might be out of bounds, namely if
we found a node that has a left child but not a right child. So we have to
guard this access by a bounds check. Clearly, when there is no right child,
checking the left one is sufficient.

while (left < n)
//@loop_invariant 1 <= i && i < n;
//@loop_invariant left == 2*i && right == 2*i+1;
//@loop_invariant is_heap_except_down(H, i);
{ if (priority(H,i) <= priority(H,left)
&& (right >= n || priority(H,i) <= priority(H,right)))
    return;
...
}

If this test fails, we have to determine the smaller of the two children. If
there is no right child, we pick the left one, of course. Once we have found
the smaller one we swap the current one with the smaller one, and then
make the child the new current node i.

void sift_down(heap H, int i)
//@requires 1 <= i && i < H->next;
//@requires is_heap_except_down(H, i);
//@ensures is_heap(H);
{ int n = H->next;
int left = 2*i;
int right = left+1;
while (left < n)
//@loop_invariant 1 <= i && i < n;
//@loop_invariant left == 2*i && right == 2*i+1;
//@loop_invariant is_heap_except_down(H, i);
{ if (priority(H,i) <= priority(H,left)
&& (right >= n || priority(H,i) <= priority(H,right)))
return;
if (right >= n || priority(H,left) < priority(H,right)) {
swap(H->data, i, left);
i = left;
} else {
//@assert right < n && priority(H, right) <= priority(H,left);
swap(H->data, i, right);
i = right;
}
left = 2*i;
right = left+1;
}
//@assert i < n && 2*i >= n;
//@assert is_heap_except_down(H, i);
return;
}

Before the second return, we know that is_heap_except_down(H,i) and
2*i >= next. This means there is no node j in the heap such that j/2 = i
and the exception in is_heap_except_down will never apply. H is indeed a
heap.
At this point we should give a proof that is_heap_except_down is really
an invariant. This is left as Exercise 4.

7 Heapsort
We rarely discuss testing in these notes, but it is useful to consider how to
write decent test cases. Mostly, we have been doing random testing, which
has some drawbacks but is often a tolerable first cut at giving the code a
workout. It is much more effective in languages that are type safe such as
C0, and even more effective when we dynamically check invariants along
the way.
In the example of heaps, one nice way to test the implementation is to
insert a random sequence of numbers, then repeatedly remove the minimal
element until the heap is empty. If we store the elements in an array in the
order we take them out of the heap, the array should be sorted when the
heap is empty! This is the idea behind heapsort. We first show the code,
using the random number generator we have used for several lectures now,
then analyze the complexity.
int main() {
int n = (1<<9)-1; // 1<<9 for -d; 1<<13 for timing
int num_tests = 10; // 10 for -d; 100 for timing
int seed = 0xc0c0ffee;
rand_t gen = init_rand(seed);
int[] A = alloc_array(int, n);
heap H = pq_new(n);

print("Testing heap of size "); printint(n);
print(" "); printint(num_tests); print(" times\n");
for (int j = 0; j < num_tests; j++) {
for (int i = 0; i < n; i++) {
pq_insert(H, rand(gen));
}
for (int i = 0; i < n; i++) {
A[i] = pq_delmin(H);
}
assert(pq_empty(H)); /* heap not empty */
assert(is_sorted(A, 0, n)); /* heapsort failed */

}
print("Passed all tests!\n");
return 0;
}
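The helper is_sorted(A, lower, upper) is in the style of our earlier sorting lectures; as a reminder, here is a minimal sketch of the property it checks, namely that the segment A[lower..upper) is in ascending order.

#include <stdbool.h>

/* Sketch: true iff A[lower..upper) is sorted in ascending order. */
bool is_sorted(int A[], int lower, int upper) {
  for (int i = lower; i < upper - 1; i++) {
    if (A[i] > A[i+1]) return false;
  }
  return true;
}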
Now for the complexity analysis. Inserting n elements into the heap is
bounded by O(n * log(n)), since each of the n inserts is bounded by O(log(n)).
Then the n element deletions are also bounded by O(n * log(n)), since each
of the n deletions is bounded by O(log(n)). So all together we get O(2 * n *
log(n)) = O(n * log(n)). Heapsort is asymptotically as good as mergesort
(Lecture 7) or as good as the expected complexity of quicksort with random
pivots (Lecture 8).
The sketched algorithm uses O(n) auxiliary space, namely the heap.
One can use the same basic idea to do heapsort in place, using the unused
portion of the heap array to accumulate the sorted array.
Testing, including random testing, has many problems. In our context,
one of them is that it does not test the strength of the invariants. For ex-
ample, say we write no invariants whatsoever (the weakest possible form),
then compiling with or without dynamic checking will always yield the
same test results. We really should be testing the invariants themselves by
giving examples where they are not satisfied. However, we should not be
able to construct such instances of the data structure on the client side of the
interface. Furthermore, within the language we have no way to “capture”
an exception such as a failed assertion and continue computation.

8 Summary
We briefly summarize key points of how to deal with invariants that must
be temporarily violated and then restored.
1. Make sure you have a clear high-level understanding of why invari-
ants must be temporarily violated, and how they are restored.

2. Ensure that at the interface to the abstract type, only instances of the
data structure that satisfy the full invariants are being passed. Other-
wise, you should rethink all the invariants.

3. Write predicates that test whether the partial invariants hold for a
data structure. Usually, these will occur in the preconditions and
loop invariants for the functions that restore the invariants. This will
force you to be completely precise about the intermediate states of the
data structure, which should help you a lot in writing correct code for
restoring the full invariants.

Exercises
Exercise 1 Write a recursive version of is_heap.

Exercise 2 Write a recursive version of is_heap_except_up.

Exercise 3 Write a recursive version of is_heap_except_down.

Exercise 4 Give a diagrammatical proof for the invariant property of sifting down
for delete (called is_heap_except_down), along the lines of the one we gave for
sifting up for insert.

Exercise 5 Say we want to extend priority queues so that when inserting a new
element and the queue is full, we silently delete the element with the lowest priority
(= maximal key value) before adding the new element. Describe an algorithm,
analyze its asymptotic complexity, and provide its implementation.

Exercise 6 Using the invariants described in this lecture, write a function heapsort
which sorts a given array in place by first constructing a heap, element by element,
within the same array and then deconstructing the heap, element by element.
[Hint: It may be easier to sort the array in descending order and reverse it in
a last pass, or to use so-called max heaps where the maximal element is at the top.]

Exercise 7 Is the array H->data of a heap always sorted?



Lecture Notes on
Memory Management

15-122: Principles of Imperative Computation


Frank Pfenning, Rob Simmons

Lecture 18b
March 26, 2013

1 Introduction
In this lecture we will start the transition from C0 to C. In some ways, the
lecture is therefore about knowledge rather than principles, a return to the
emphasis on programming that we had earlier in the semester. In future
lectures, we will explore some deeper issues in the context of C, but this
lecture is full of cautionary tales.
The main theme of this lecture is the way C manages memory. Unlike
C0 and other modern languages like Java, C#, or ML, C requires programs
to explicitly manage their memory. Allocation is relatively straightforward,
like in C0, requiring only that we correctly calculate the size of allocated
memory. Deallocating (“freeing”) memory, however, is difficult and error-
prone, even for experienced C programmers. Mistakes can either lead to
attempts to access memory that has already been deallocated, in which case
the result is undefined and may be catastrophic, or it can lead the running
program to hold on to memory no longer in use, which may slow it down
and eventually crash it when it runs out of memory. The second category
is a so-called memory leak.

2 A First Look at C
Syntactically, C and C0 are very close. Philosophically, they diverge rather
drastically. Underlying C0 are the principles of memory safety and type
safety. A program is memory safe if it only reads from memory that has
been properly allocated and initialized, and only writes to memory that
has been properly allocated. A program is type safe if all data it manipulates
have their declared types. In C0, all programs are type safe and memory
safe. The compiler guarantees this through a combination of static (that
is, compile-time) and dynamic (that is, run-time) checks. An example of
a static check is the error issued by the compiler when trying to assign an
integer to a variable declared to hold a pointer, such as
int* p = 37;
An example of a dynamic check is an array out-of-bounds error, which
would try to access memory that has not been allocated by the program.
Modern languages such as Java, ML, or Haskell are both type safe and
memory safe.
In contrast, C is neither type safe nor memory safe. This means that the
behavior of many operations in C is undefined. Unfortunately, undefined
behavior in C may yield any result or have any effect, which means that
the outcome of many programs is unpredictable. In many cases, even pro-
grams that are patently absurd will yield a consistent answer on a given
machine with a given compiler, or perhaps even across different machines
and different compilers. No amount of testing will catch the fact that such
programs have bugs, but they may break when, say, the compiler is up-
graded or details of the runtime system changes. Taken together, these
design decisions make it very difficult to write correct programs in C. This
fact is in evidence every day, when we download so-called security critical
updates to operating systems, browsers, and other software. In many cases,
the security critical flaws arise because an attacker can exploit behavior that
is undefined, but predictable across some spectrum of implementations, in
order to cause your machine to execute some arbitrary malicious code. You
will learn in 15-213 Computer Systems exactly how such attacks work.
These difficulties are compounded by the fact that there are other parts
of the C standard that are implementation defined. For example, the size of
values of type int is explicitly not specified by the C standard, but each
implementation must of course provide a size. This makes it very diffi-
cult to write portable programs. Even on one machine, the behavior of a
program might differ from compiler to compiler. We will talk more about
implementation defined behavior in the next lecture.
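For example, one can observe the size a particular implementation chooses directly. The small C program below commonly prints 4 on current desktop platforms, but nothing in the standard guarantees that.

#include <stdio.h>

int main(void) {
  /* sizeof(int) is implementation defined */
  printf("sizeof(int) = %zu bytes\n", sizeof(int));
  return 0;
}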
Despite all these problems, almost 40 years after its inception, C is still
a significant language. For one, it is the origin of the object-oriented lan-
guages C++ and strongly influenced Java and C#. For another, much sys-
tems code such as operating systems, file systems, garbage collectors, or
networking code are still written in C. Designing type-safe alternative lan-
guages for systems code is still an active area of research, including the
Static OS project at Carnegie Mellon University.

3 Undefined Behavior in C
For today’s lecture, there are three important undefined behaviors in C:

Out-of-bounds array access: accessing outside the range of an allocated
array has undefined results.

Null pointer dereference: dereferencing the null pointer has undefined re-
sults.

Double-free: We’ll talk about this in Section 7.

3.1 Arrays, pointers, and out-of-bounds access


When compared to C0, the most shocking difference is that C does not
distinguish arrays from pointers. Array accesses are not checked at all,
and out-of-bounds memory references (whose result is formally undefined)
may lead to unpredictable results. For example, the code fragment

int main() {
int* A = malloc(sizeof(int));
A[0] = 0; /* ok - A[0] is like *A */
A[1] = 1; /* error - not allocated */
A[317] = 29; /* error - not allocated */
A[-1] = 32; /* error - not allocated(!) */
printf("A[-1] = %d\n", A[-1]);
return 0;
}

will not raise any compile time error or even warnings, even under the
strictest settings. Here, the call to malloc allocates enough space for a single
integer in memory. In this class, we are using gcc with the following flags:

gcc -Wall -Wextra -Werror -std=c99 -pedantic

which generates all warnings (-Wall and -Wextra), turns all warnings into
errors (-Werror), and applies the C99 standard (-std=c99) pedantically
(-pedantic). The code above executes ok, and in fact prints 32, despite
four blatant errors in the code.
To discover whether such errors may have occurred at runtime, we can
use the valgrind tool.

% valgrind ./a.out
...
==nnnn== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)

which produces useful error messages (elided above) and indeed flags 4
errors in code whose observable behavior was bug-free.
valgrind slows down execution, but if at all feasible you should test all
your C code in this manner to uncover memory problems. For best error
messages, you should pass the -g flag to gcc which preserves some corre-
lation between binary and source code.
You can also guard memory accesses with appropriate assert statements
that abort the program when attempting out-of-bounds accesses.
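For example, if we track the length of an array ourselves (C gives us no way to recover it from the pointer), a guarded access might look like the sketch below; the function get and the parameter n are illustrative only.

#include <assert.h>
#include <stddef.h>

/* Sketch: abort via assert instead of silently reading out of bounds. */
int get(int *A, size_t n, size_t i) {
  assert(i < n);
  return A[i];
}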
Conflating pointers and arrays provides a hint on how to convert C0
programs to C. We need to convert t[] which indicates a C0 array with
elements of type t to t* to indicate a pointer instead. In addition, the
alloc and alloc_array calls need to be changed, or defined by appropri-
ate macros (we’ll talk about this more later).
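One plausible definition of such macros is sketched below; the course's actual header may differ. Using calloc means the memory starts out zero-initialized, matching C0's allocation behavior.

#include <stdlib.h>

/* Sketch only: emulating C0's alloc and alloc_array in C. */
#define alloc(t)          ((t*)calloc(1, sizeof(t)))
#define alloc_array(t, n) ((t*)calloc((size_t)(n), sizeof(t)))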

3.2 Null pointer dereference


In C0, an out-of-bounds array access or null pointer dereference will always
cause the program to print Segmentation fault and abort (with signal
SIGSEGV). In C, reading or writing to an array out of
bounds may cause a segmentation fault, but it is impossible to rely on this
behavior in practice.
In contrast, it is so common that dereferencing the null pointer will lead
to a segmentation fault that it may be overlooked that this is not defined.
Nevertheless, it is undefined: dereferencing NULL may not yield an excep-
tion, particularly if your code runs in kernel mode, as part of the operating
system.

3.3 What actually happens?


If you do not get an error, then perhaps nothing at all will happen, and
perhaps memory will become silently corrupted and cause unpredictable
errors down the road. But we were able to describe, in each of the examples
above, what sorts of things were likely to happen.
There’s an old joke that whenever your encounter undefined behavior,
your computer could decide to play Happy Birthday or it could catch on fire.
This is less of a joke considering recent events:

• In 2010, Alex Halderman’s team at the University of Michigan suc-
cessfully hacked into Washington D.C.’s prototype online voting sys-
tem, and caused its web page to play the University of Michigan fight
song, “The Victors.” [1]

• The Stuxnet worm caused centrifuges, such as those used for ura-
nium enrichment in Iran, to malfunction, physically damaging the
devices. [2]

[1] Scott Wolchok, Eric Wustrow, Dawn Isabel, and J. Alex Halderman. Attacking the Washington, D.C. Internet Voting System. Proceedings of the 16th Conference on Financial Cryptography and Data Security, February 2012.

[2] Holger Stark. Stuxnet Virus Opens New Era of Cyber War. Spiegel Online, August 8, 2011.

Not quite playing Happy Birthday and catching on fire, but close enough.

4 Memory Allocation
Two important system-provided functions for allocating memory are malloc
and calloc.
malloc(sizeof(t)) allocates enough memory to hold a value of type
t. In C0, we would have written alloc(t) instead. The difference is that
alloc(t) has type t*, while malloc(sizeof(t)) returns a special type
void*, which we will discuss more in the next lecture. The important thing
to realize is that C will not even check that the pointer we allocated is the
right size, so that while we can write this:

int* p = malloc(sizeof(int));

we can also write this:

int* p = malloc(sizeof(char));

which will generally have undefined results. Also, malloc does not guar-
antee that the memory it returns has been initialized, so the following code
is an error:

int* p = malloc(sizeof(char));
printf("%d\n", *p);

Valgrind will report the error “Use of uninitialised value of size 8”
if code with the above two lines is compiled and run.
calloc(n, sizeof(t)) allocates enough memory for n objects of type
t. Unlike malloc, it also guarantees that all memory cells are initialized
with 0. For many types, this yields a default element, such as false for
booleans, 0 for ints, ’\0’ for char, or NULL for pointers.
Both malloc and calloc may fail when there is not enough memory
available. In that case, they just return NULL. This means any code calling
these two functions should check that the return value is not NULL before
proceeding. Because this makes it tedious and error-prone to write safe code,
we have defined functions xmalloc and xcalloc which are just like malloc
and calloc, respectively, but abort computation in case the operation fails.
They are thereby guaranteed to return a pointer that is not NULL, if they
return at all. These functions are in the file xalloc.c; their declarations are
in xalloc.h (see Section 5 for an explanation of header files).
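A plausible implementation of xmalloc along these lines is sketched below; the actual version in xalloc.c may differ in details.

#include <stdio.h>
#include <stdlib.h>

/* Sketch: like malloc, but aborts instead of returning NULL. */
void *xmalloc(size_t size) {
  void *p = malloc(size);
  if (p == NULL) {
    fprintf(stderr, "allocation failed\n");
    abort();
  }
  return p;
}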

5 Header Files
To understand how the xalloc library works, and to take our C0 im-
plementation of binary search trees and begin turning it into a C imple-
mentation, we will need to start by explaining the C convention of using
header files to specify interfaces. Header files have the extension .h and
contain type declarations and definitions as well as function prototypes
and macros, but never code. Header files are not listed on the command
line when the compiler is invoked, but included in C source files (with the
.c extension) using the #include preprocessor directive. The typical use is
to #include the header file (when we say “include” in the rest of this lecture,
we mean #include) both in the implementation of a data structure
and all of its clients. In this way, we know both match the same interface.
This applies to standard libraries as well as user-written libraries. For
example, the client of the C implementation of BSTs starts with

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
#include "xalloc.h"
#include "contracts.h"
#include "bst.h"
#include "bst-client.h"
The form #include <filename.h> includes file filename.h which must
be one of the system libraries provided by the suite of compilation tools
(gcc, in our case). The second form #include "filename.h" looks for
filename.h in the current source directory, so this is reserved for user
files. The names of the standard libraries and the types and functions they
provide can be found in the standard reference book The C Programming
Language, 2nd edition by Brian Kernighan and Dennis Ritchie or in various
places on the Internet.4

5.1 Client interface


We’ll start with bst-client.h, which contains the client interface to
BSTs:
/*************************/
/* Client-side interface */
/*************************/

#include <string.h>

#ifndef _BST_CLIENT_H_
#define _BST_CLIENT_H_

typedef struct wcount *elem;


typedef string key;

key elem_key(elem e); /* e cannot be NULL! */


int key_compare(key k1, key k2); /* returns -1, 0, or 1 */
void elem_free(elem e);

#endif
Note that the way we are defining a client interface here is not good C style,
but it will take a few more lectures before we have the tools to do a better
job.
The core of this file is exactly the client interface part of our C0 BST spec-
ification. It defines the type elem as a pointer to a struct wcount, whose
implementation remains hidden. There is one new function, elem_free,
which we have not yet discussed. All of our contracts are gone - C does not
have a facility for putting contracts in an interface.
We also see a certain idiom

#ifndef _BST_CLIENT_H_
#define _BST_CLIENT_H_
...
#endif

which is interpreted by the preprocessor, like other directives starting with


#. This is a header guard, which prevents the header from being processed
multiple times. The first time the header file is processed, the preprocessor
variable _BST_CLIENT_H_ will not be defined, so the test #ifndef (if not de-
fined) will succeed. The next directive defines the variable _BST_CLIENT_H_
(as the empty string, but that is irrelevant) and then processes the following
declarations up to the matching #endif, usually at the end of the file.
Now if this file were included a second time, which happens frequently
because standard libraries, for example, are included in many different
source files that are compiled together, then the variable _BST_CLIENT_H_ would
be defined, the test would fail, and the body of the file ignored.

5.2 Library interface


Our library interface uses types defined in bst-client.h, so we include
this file in the library interface, bst.h.

/*********************/
/* Library interface */
/*********************/

#include "bst-client.h"

#ifndef _BST_H_
#define _BST_H_

typedef struct bst_header *bst;

bst bst_new();

void bst_insert(bst B, elem e); /* e cannot be NULL! */


elem bst_lookup(bst B, key k); /* return NULL if not in tree */
void bst_free(bst B);

#endif

6 Macros
Macros are another extension of C that we left out from C0. We use macros
to get some semblance of C0's contracts in C; they are defined in the header
file contracts.h.
Macros are expanded by a preprocessor and the result is fed to the “reg-
ular” C compiler. When we do not want REQUIRES to be checked (which is
the default, just as for @requires), there is a macro definition
#define REQUIRES(COND) ((void)0)
which can be found in the file contracts.h. The right-hand side of this
definition, ((void)0) is the number zero, cast to have type void which
means it cannot be used as an argument to a function or operator; its result
must be ignored. When the code is compiled with
gcc -DDEBUG ...
then it is defined instead as a regular assert:
#define REQUIRES(COND) assert(COND)
In this case, any use of REQUIRES(e) is expanded into assert(e) before the
result is compiled into a runtime test.
The three macros, all of which behave identically are
REQUIRES(e);
ENSURES(e);
ASSERT(e);
although they are intended for different purposes, mirroring the @requires,
@ensures, and @assert annotations of C0. @loop_invariant is missing,
since there appears to be no good syntax to support loop invariants di-
rectly; we recommend you check them right after the exit test or at the end
of the loop using the ASSERT macro.
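For example, a loop invariant can be checked with ASSERT right after the exit test, at the top of the loop body; the function below is a made-up illustration, not part of any course library.

#include "contracts.h"

int sum_upto(int n) {
  int sum = 0;
  for (int i = 0; i < n; i++) {
    ASSERT(0 <= i && i < n);  /* plays the role of //@loop_invariant */
    sum += i;
  }
  return sum;
}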
Another common use for macros is to define compile-time constants.
In general, it is considered good style to isolate “magic” numbers at the

L ECTURE N OTES M ARCH 26, 2013


Memory Management L18b.10

beginning of a file, for easy reference; for instance, if we were coding our
E0 editor in C, it would make sense to

#define GAP_BUFFER_SIZE 16

to make it easy to change from size 16 gap buffers to some other size. The
C implementation itself uses them as well; for example, limits.h defines
INT_MAX as the maximal (signed) integer and INT_MIN as the minimal
signed integer, and similarly UINT_MAX for unsigned integers.

6.1 Conditional compilation


Header guards _BST_H_ are an example of conditional compilation which is
often used in systems files in order to make header files and their imple-
mentation portable. Another idiomatic use of conditional compilation is

#ifdef DEBUG
...debugging statements...
#endif

where the variable DEBUG is usually set on the gcc command line with

gcc -DDEBUG ...

Guarding debugging statements in this way generalizes the simple asser-
tion macros provided in contracts.h.

7 Freeing Memory
Unlike C0, C does not automatically manage memory. As a result, pro-
grams have to free the memory they allocate explicitly; otherwise, long-
running programs or memory-intensive programs are likely to run out of
space. For that, the C standard library provides the function free, declared
with

void free(void* p);

The restrictions as to its proper use are

1. It is only called on pointers that were returned from malloc or calloc
(or realloc, which we have not discussed).

2. After memory has been freed, it is no longer referenced by the pro-
gram in any way.

Freeing memory counts as a final use, so these rules imply that you should
not free memory twice. And, indeed, in C the behavior of freeing mem-
ory that has already been freed is undefined and may be exploited by an
adversary. If these rules are violated, the result of the operations is un-
defined. The valgrind tool will catch dynamically occurring violations of
these rules, but it cannot check statically if your code will respect these
rules when executed.
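As a minimal illustration, the double-free error mentioned in Section 3 looks like this; valgrind reports the second call as an invalid free.

#include <stdlib.h>

int main(void) {
  int *p = malloc(sizeof(int));
  free(p);    /* ok: final use of p */
  free(p);    /* undefined behavior: p was already freed */
  return 0;
}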
The golden rule of memory management in C is
You allocate it, you free it!
By inference, if you didn’t allocate it, you are not allowed to free it! But
this rule is tricky in practice, because sometimes we do need to transfer
ownership of allocated memory so that it “belongs” to a data structure.
Binary search trees are one example. When we add an allocated element to
the binary search tree, are we still in charge of freeing that element, or
should it be freed when it is removed from the binary search tree? There
are arguments to be made for both of these options. If we want the BST
to “own” the reference, and therefore be in charge of freeing it, we can
write the following functions that free a binary search tree, given a func-
tion elem_free() that frees elements.
void tree_free(tree *T) {
REQUIRES(is_ordtree(T));
if(T != NULL) {
elem_free(T->data);
tree_free(T->left);
tree_free(T->right);
free(T);
}
return;
}

void bst_free(bst B) {
REQUIRES(is_bst(B));
tree_free(B->root);
free(B);
return;
}

We should never free elements allocated elsewhere; rather, we should
use the appropriate function provided in the interface to free the memory
associated with the data structure. Freeing a data structure (for instance, by
calling free(T)) is something the client itself cannot do reliably, because it
would need to be privy to the internals of its implementation. If we called
free(B) on a binary search tree it would only free the header; the tree itself
would be irrevocably leaked memory.



Lecture Notes on
AVL Trees
15-122: Principles of Imperative Computation
Frank Pfenning

Lecture 19
March 28, 2013

1 Introduction
Binary search trees are an excellent data structure to implement associa-
tive arrays, maps, sets, and similar interfaces. The main difficulty, as dis-
cussed in Lecture 15, is that they are efficient only when they are balanced.
Straightforward sequences of insertions can lead to highly unbalanced trees
with poor asymptotic complexity and unacceptable practical efficiency. For
example, if we insert n elements with keys that are in strictly increasing or
decreasing order, the complexity will be O(n^2). On the other hand, if we
can keep the height to O(log(n)), as it is for a perfectly balanced tree, then
the complexity is bounded by O(n * log(n)).
The solution is to dynamically rebalance the search tree during insert
or search operations. We have to be careful not to destroy the ordering
invariant of the tree while we rebalance. Because of the importance of bi-
nary search trees, researchers have developed many different algorithms
for keeping trees in balance, such as AVL trees, red/black trees, splay trees,
or randomized binary search trees. They differ in the invariants they main-
tain (in addition to the ordering invariant), and when and how the rebal-
ancing is done.
In this lecture we use AVL trees, which is a simple and efficient data
structure to maintain balance, and is also the first that has been proposed.
It is named after its inventors, G.M. Adelson-Velskii and E.M. Landis, who
described it in 1962.

2 The Height Invariant


Recall the ordering invariant for binary search trees.

Ordering Invariant. At any node with key k in a binary search
tree, all keys of the elements in the left subtree are strictly less
than k, while all keys of the elements in the right subtree are
strictly greater than k.

To describe AVL trees we need the concept of tree height, which we de-
fine as the maximal length of a path from the root to a leaf. So the empty
tree has height 0, the tree with one node has height 1, a balanced tree with
three nodes has height 2. If we add one more node to this last tree it will
have height 3. Alternatively, we can define it recursively by saying that the
empty tree has height 0, and the height of any node is one greater than the
maximal height of its two children. AVL trees maintain a height invariant
(also sometimes called a balance invariant).

Height Invariant. At any node in the tree, the heights of the left
and right subtrees differs by at most 1.

As an example, consider the following binary search tree of height 3.

                10                      height = 3
               /  \
              4    16                   height inv. satisfied
             / \   / \
            1   7 13  19


If we insert a new element with a key of 14, the insertion algorithm for
binary search trees without rebalancing will put it to the right of 13.

                10                      height = 4
               /  \
              4    16                   height inv. satisfied
             / \   / \
            1   7 13  19
                    \
                     14


Now the tree has height 4, and one path is longer than the others. However,
it is easy to check that at each node, the height of the left and right subtrees
still differ only by one. For example, at the node with key 16, the left subtree
has height 2 and the right subtree has height 1, which still obeys our height
invariant.
Now consider another insertion, this time of an element with key 15.
This is inserted to the right of the node with key 14.

                10                      height = 5
               /  \
              4    16                   height inv. violated at 13, 16, 10
             / \   / \
            1   7 13  19
                    \
                     14
                       \
                        15


All is well at the node labeled 14: the left subtree has height 0 while the
right subtree has height 1. However, at the node labeled 13, the left subtree
has height 0, while the right subtree has height 2, violating our invariant.
Moreover, at the node with key 16, the left subtree has height 3 while the
right subtree has height 1, also a difference of 2 and therefore an invariant
violation.
We therefore have to take steps to rebalance the tree. We can see without
too much trouble that we can restore the height invariant if we move the
node labeled 14 up and push node 13 down and to the right, resulting in
the following tree.

                10                      height = 4
               /  \
              4    16                   height inv. restored at 14, 16, 10
             / \   / \
            1   7 14  19
                 / \
               13   15


The question is how to do this in general. In order to understand this we
need a fundamental operation called a rotation, which comes in two forms,
left rotation and right rotation.

3 Left and Right Rotations


Below, we show the situation before a left rotation. We have generically
denoted the crucial key values in question with x and y. Also, we have
summarized whole subtrees with the intervals bounding their key values.
At the root of the subtree we can have intervals that are unbounded on the
left or right. We denote these with pseudo-bounds −∞ on the left and +∞
on the right. We then write α for a left endpoint, which could be either an
integer or −∞, and ω for a right endpoint, which could be either an integer
or +∞. The tree on the right is after the left rotation.

        x (α,ω)                               y (α,ω)
       / \         left rotation at x        / \
  (α,x)   y        ----------------->       x   (y,ω)
         / \                               / \
     (x,y) (y,ω)                      (α,x) (x,y)

From the intervals we can see that the ordering invariants are preserved, as
are the contents of the tree. We can also see that it shifts some nodes from
the right subtree to the left subtree. We would invoke this operation if the
invariants told us that we have to rebalance from right to left.
We implement this with some straightforward code. First, recall the
type of trees from last lecture. We do not repeat the function is_ordtree
that checks if a tree is ordered.

struct tree_node {
elem data;
struct tree_node *left;
struct tree_node *right;
};
typedef struct tree_node tree;
bool is_ordtree(tree *T);

The main point to keep in mind is to use (or save) a component of the
input before writing to it. We apply this idea systematically, writing to a
location immediately after using it on the previous line. We repeat the type
specification of tree from last lecture.

tree *rotate_left(tree *T) {
REQUIRES(is_ordtree(T));
REQUIRES(T != NULL && T->right != NULL);
tree *root = T->right;
T->right = root->left;
root->left = T;
ENSURES(is_ordtree(root));

ENSURES(root != NULL && root->left != NULL);
return root;
}

These rotations work generically. When we apply them to AVL trees specif-
ically later in this lecture, we will also have to recalculate the heights of the
two nodes involved. This involves only looking up the height of their chil-
dren.
The right rotation is exactly the inverse. First in pictures:

        z (α,ω)                               y (α,ω)
       / \         right rotation at z       / \
      y  (z,ω)     ------------------>  (α,y)   z
     / \                                       / \
(α,y)  (y,z)                              (y,z) (z,ω)

Then in code:

tree *rotate_right(tree *T) {
REQUIRES(is_ordtree(T));
REQUIRES(T != NULL && T->left != NULL);
tree *root = T->left;
T->left = root->right;
root->right = T;
ENSURES(is_ordtree(root));
ENSURES(root != NULL && root->right != NULL);
return root;
}

4 Searching for a Key


Searching for a key in an AVL tree is identical to searching for it in a plain
binary search tree as described in Lecture 15. The reason is that we only
need the ordering invariant to find the element; the height invariant is only
relevant for inserting an element.
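A sketch of this search in C, relying only on the ordering invariant; it uses the tree struct from the rotation code above and the client functions elem_key and key_compare from the BST interface, and the name tree_lookup is our own.

/* Sketch: return the element with key k, or NULL if k is not in the tree. */
elem tree_lookup(tree *T, key k) {
  if (T == NULL) return NULL;
  int cmp = key_compare(k, elem_key(T->data));
  if (cmp == 0) return T->data;
  else if (cmp < 0) return tree_lookup(T->left, k);
  else return tree_lookup(T->right, k);
}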

5 Inserting an Element
The basic recursive structure of inserting an element is the same as for
searching for an element. We compare the element’s key with the keys
associated with the nodes of the trees, inserting recursively into the left or
right subtree. When we find an element with the exact key we overwrite
the element in that node. If we encounter a null tree, we construct a new
tree with the element to be inserted and no children and then return it. As
we return the new subtrees (with the inserted element) towards the root,
we check if we violate the height invariant. If so, we rebalance to restore
the invariant and then continue up the tree to the root.
The main cleverness of the algorithm lies in analyzing the situations
when we have to rebalance and need to apply the appropriate rotations to
restore the height invariant. It turns out that one or two rotations on the
whole tree always suffice for each insert operation, which is a very elegant
result.
First, we keep in mind that the left and right subtrees’ heights before
the insertion can differ by at most one. Once we insert an element into one
of the subtrees, they can differ by at most two. We now draw the trees in
such a way that the height of a node is indicated by the height that we are
drawing it at.
The first situation we describe is where we insert into the right subtree,
which is already of height h + 1 where the left subtree has height h. If we
are unlucky, the result of inserting into the right subtree will give us a new
right subtree of height h + 2 which raises the height of the overall tree to
h + 3, violating the height invariant. This situation is depicted below. Note
that the node we inserted does not need to be z, but there must be a node z
in the indicated position.

Before (height h+2):                After inserting to the right of y (height h+3):

        x (α,ω)                              x (α,ω)
       / \                                  / \
  (α,x)   y                            (α,x)   y
         / \                                  / \
     (x,y) (y,ω)                          (x,y)  z
                                                / \
                                            (y,z) (z,ω)

(Subtrees (α,x) and (x,y) have height h throughout; on the left y has height
h+1, on the right z has height h+1 and y has height h+2.)

If the new right subtree has height h + 2, either its right or its left subtree


must be of height h + 1 (and only one of them; think about why). If it is the
right subtree, we are in the situation depicted on the right above (and on the
left below). While the trees (α, x) and (x, y) must have exactly height h, the
trees (y, z) and (z, ω) need not. However, they differ by at most 1, because
we are investigating the case where the lowest place in the tree where the
invariant is violated is at x.

(α,"ω)"
h+3"
x"
(α,"ω)"
y" h+2" y"
le1"rota6on"at"x"
h+1" x"
z" z"

h"
(α,"x)" (x,"y)" (y,"z)" (z,"ω)" (α,"x)" (x,"y)" (y,"z)" (z,"ω)"

We fix this with a left rotation at x, the result of which is displayed to the
right. Because the height of the overall tree is reduced to its original h + 2,
no further rotation higher up in the tree will be necessary.
In the second case we consider we insert to the left of the right subtree,
and the result has height h+1. This situation is depicted on the right below.

h+3" (α,"ω)"
x"
(α,"ω)"
h+2"
x" insert"to"the"le7"of"z" z"

z" h+1"
y"

h"
(α,"x)" (x,"z)" (z,"ω)" (α,"x)" (x,"y)" (y,"z)" (z,"ω)"

In the situation on the right, the subtrees labeled (α, x) and (z, ω) must have
exactly height h, but only one of (x, y) and (y, z) does. In this case, a single left
rotation alone will not restore the invariant (see Exercise 1). Instead, we
apply a so-called double rotation: first a right rotation at z, then a left rotation
at the root labeled x. When we do this we obtain the picture on the right,


restoring the height invariant.

(α,"ω)"
h+3"
x"
(α,"ω)"
h+2" y"
z"
double"rota8on"at"z"and"x"
h+1" x"
y" z"

h"
(α,"x)" (x,"y)" (y,"z)" (z,"ω)" (α,"x)" (x,"y)" (y,"z)" (z,"ω)"

There are two additional symmetric cases to consider, if we insert the new
element on the left (see Exercise 4).
We can see that in each of the possible cases where we have to restore
the invariant, the resulting tree has the same height h + 2 as before the
insertion. Therefore, the height invariant above the place where we just
restored it will be automatically satisfied, without any further rotations.

6 Checking Invariants
The interface for the implementation is exactly the same as for binary search
trees, as is the code for searching for a key. In various places in the algo-
rithm we have to compute the height of the tree. This could be an operation
of asymptotic complexity O(n), unless we store it in each node and just look
it up. So we have:

typedef struct tree_node tree;

struct tree_node {
  elem data;
  int height;
  struct tree_node *left;
  struct tree_node *right;
};

/* height(T) returns the precomputed height of T in O(1) */
int height(tree *T) {
  return T == NULL ? 0 : T->height;
}

Of course, if we store the height of the trees for fast access, we need to
adapt it when rotating trees. After all, the whole purpose of tree rotations
is to rebalance and change the height. For that, we implement a function
fix_height that computes the height of a tree from the height of its chil-
dren. Its implementation directly follows the definition of the height of a
tree. The implementation of rotate_right and rotate_left needs to be
adapted to include calls to fix_height. These calls need to compute the
heights of the children first, before computing that of the root, because the
height of the root depends on the height we had previously computed for
the child. Hence, we need to update the height of the child before updating
the height of the root. Look at the code for details.
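The notes do not reproduce fix_height here; a minimal sketch consistent with the definition of the height of a tree would be:

void fix_height(tree *T) {
  REQUIRES(T != NULL);
  int hl = height(T->left);
  int hr = height(T->right);
  T->height = (hl > hr ? hl : hr) + 1;  /* 1 + height of the taller child */
}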
When checking if a tree is balanced, we also check that all the heights
that have been computed are correct.

bool is_balanced(tree *T) {
  if (T == NULL) return true;
  int h = T->height;
  int hl = height(T->left);
  int hr = height(T->right);
  if (!(h == (hl > hr ? hl+1 : hr+1))) return false;
  if (hl > hr+1 || hr > hl+1) return false;
  return is_balanced(T->left) && is_balanced(T->right);
}

A tree is an AVL tree if it is both ordered (as defined and implemented
in the BST lecture) and balanced.

bool is_avl(tree *T) {
  return is_ordtree(T) && is_balanced(T);
}

We use this, for example, in a utility function that creates a new leaf
from an element (which may not be null).

tree *leaf(elem e) {
  REQUIRES(e != NULL);
  tree *T = xmalloc(sizeof(struct tree_node));
  T->left = NULL;
  T->data = e;
  T->right = NULL;
  T->height = 1;
  ENSURES(is_avl(T));
  return T;
}

Recall that, unlike allocation in C0, xmalloc in C does not initialize memory,
so the initialization of T->left and T->right to NULL is crucial.

7 Implementing Insertion
The code for inserting an element into the tree is mostly identical with
the code for plain binary search trees. The difference is that after we in-
sert into the left or right subtree, we call a function rebalance_left or
rebalance_right, respectively, to restore the invariant if necessary and cal-
culate the new height.

tree *tree_insert(tree *T, elem e) {
  REQUIRES(is_avl(T));
  REQUIRES(e != NULL);
  if (T == NULL) {
    T = leaf(e); /* create new leaf with data e */
  } else {
    int r = key_compare(elem_key(e), elem_key(T->data));
    if (r < 0) {
      T->left = tree_insert(T->left, e);
      T = rebalance_left(T); /* also fixes height */
    } else if (r == 0) {
      elem_free(T->data);
      T->data = e;
    } else {
      ASSERT(r > 0);
      T->right = tree_insert(T->right, e);
      T = rebalance_right(T); /* also fixes height */
    }
  }
  ENSURES(is_avl(T));
  return T;
}


The pre- and post-conditions of this function are actually not strong enough
to prove this function correct. We also need an assertion about how the tree
might change due to insertion, which is somewhat tedious. If we perform
dynamic checking with the contract above, however, we establish that the
result is indeed an AVL tree. As we have observed several times already:
we can test for the desired property, but we may need to strengthen the
pre- and post-conditions in order to rigorously prove it.
We show only the function rebalance_right; rebalance_left is sym-
metric.

tree *rebalance_right(tree *T) {
  REQUIRES(T != NULL);
  REQUIRES(is_avl(T->left) && is_avl(T->right));
  /* also requires that T->right is result of insert into T */

  tree *l = T->left;
  tree *r = T->right;
  int hl = height(l);
  int hr = height(r);
  if (hr > hl+1) {
    ASSERT(hr == hl+2);
    if (height(r->right) > height(r->left)) {
      ASSERT(height(r->right) == hl+1);
      T = rotate_left(T);
      ASSERT(height(T) == hl+2);
    } else {
      ASSERT(height(r->left) == hl+1);
      /* double rotate left */
      T->right = rotate_right(T->right);
      T = rotate_left(T);
      ASSERT(height(T) == hl+2);
    }
  } else {
    ASSERT(!(hr > hl+1));
    fix_height(T);
  }
  ENSURES(is_avl(T));
  return T;
}


Note that the preconditions are weaker than we would like. In partic-
ular, they do not imply some of the assertions we have added in order to
show the correspondence to the pictures. This is left as the (difficult) Ex-
ercise 5. Such assertions are nevertheless useful because they document
expectations based on informal reasoning we do behind the scenes. Then,
if they fail, they may be evidence for some error in our understanding, or
in the code itself, which might otherwise go undetected.

8 Experimental Evaluation
We would like to assess the asymptotic complexity and then experimen-
tally validate it. It is easy to see that both insert and search operations take
time O(h), where h is the height of the tree. But how is the height of the tree
related to the number of elements stored, if we use the balance invariant of
AVL trees? It turns out that h is O(log(n)). It is not difficult to prove this,
but it is beyond the scope of this course.
To experimentally validate this prediction, we have to run the code with
inputs of increasing size. A convenient way of doing this is to double the
size of the input and compare running times. If we insert n elements into
the tree and look them up, the running time should be bounded by
c * n * log(n) for some constant c. Assume we run it at some size n and observe
r = c * n * log(n). If we double the input size we have c * (2n) * log(2n) =
2 * c * n * (1 + log(n)) = 2r + 2 * c * n, so we mainly expect the running
time to double, with an additional summand that roughly doubles as n
doubles. In order to smooth out minor variations and get bigger numbers,
we run each experiment 100 times. Here is the table with the results:

    n      AVL trees   increase       BSTs
  2^9         0.129                  1.018
  2^10        0.281    2r + 0.023    2.258
  2^11        0.620    2r + 0.058    3.094
  2^12        1.373    2r + 0.133    7.745
  2^13        2.980    2r + 0.234   20.443
  2^14        6.445    2r + 0.485   27.689
  2^15       13.785    2r + 0.895   48.242

We see in the third column, where 2r stands for the doubling of the previous
value, that we are quite close to the predicted running time, with an
approximately linearly increasing additional summand.
In the fourth column we have run the experiment with plain binary
search trees which do not rebalance automatically. First of all, we see that
they are much less efficient, and second we see that their behavior with
increasing size is difficult to predict, sometimes jumping considerably and
sometimes not much at all. In order to understand this behavior, we need
to know more about the order and distribution of keys that were used in
this experiment. They were strings, compared lexicographically. The keys
were generated by counting integers upward and then converting them to
strings. The distribution of these keys is haphazard, but not random. For
example, if we start counting at 0

"0" < "1" < "2" < "3" < "4" < "5" < "6" < "7" < "8" < "9"
< "10" < "12" < ...

the first ten strings are in ascending order but then numbers are inserted
between "1" and "2". This kind of haphazard distribution is typical of
many realistic applications, and we see that binary search trees without
rebalancing perform quite poorly and unpredictably compared with AVL
trees.
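As an aside, a minimal harness for such a doubling experiment might look as follows; run_trial is a hypothetical callback that performs the n insertions and lookups being measured, and the constants simply mirror the table above.

#include <stdio.h>
#include <time.h>

/* Sketch of the doubling experiment described in this section. */
void doubling_experiment(void (*run_trial)(int n)) {
  for (int n = 1 << 9; n <= 1 << 15; n *= 2) {
    clock_t start = clock();
    for (int rep = 0; rep < 100; rep++)   /* repeat to smooth variation */
      run_trial(n);
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("n = %d: %.3f seconds\n", n, secs);
  }
}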

The complete code for this lecture can be found in directory 19-avl/ on the
course website.


Exercises
Exercise 1 Show that in the situation on page 9 a single left rotation at the root
will not necessarily restore the height invariant.

Exercise 2 Show, in pictures, that a double rotation is a composition of two ro-


tations. Discuss the situation with respect to the height invariants after the first
rotation.

Exercise 3 Show that left and right rotations are inverses of each other. What can
you say about double rotations?

Exercise 4 Show the two cases that arise when inserting into the left subtree
might violate the height invariant, and show how they are repaired by a right ro-
tation, or a double rotation. Which two single rotations does the double rotation
consist of in this case?

Exercise 5 Strengthen the invariants in the AVL tree implementation so that the
assertions and postconditions which guarantee that rebalancing restores the height
invariant and reduces the height of the tree follow from the preconditions.



Lecture Notes on
Types in C

15-122: Principles of Imperative Computation


Frank Pfenning, Rob Simmons

Lecture 20
April 2, 2013

1 Introduction
In lecture 18, we emphasized the things we lost by going to C:

• Many operations that would safely cause an error in C0, like derefer-
encing NULL or reading outside the bounds of an array, are undefined
in C – we cannot predict or reason about what happens when we have
undefined behaviors.

• It is not possible to capture or check the length of C arrays.

• In C, pointers and arrays are the same – and we declare them like
pointers, writing int *i.

• The C0 types string, char* and char[] are all represented as point-
ers to char in C.

• C is not garbage collected, so we have to explicitly say when we expect
memory to be freed, which can easily lead to memory leaks.

In this lecture, we will endeavor to look on the bright side and look at the
new things that C gives us. But remember: with great power comes great
responsibility.
This lecture has three parts. First, we will continue our discussion of
memory management in C: everything has an address and we can use the
address-of operation &e to obtain this address. Second, we will look at the
different ways that C represents numbers and the general, though mostly


implementation-defined, properties of these numbers that we frequently


count on. Third, we will look at the type void* and how it can be used to
implement generic data structures.

2 Address-of
In C0, we can only obtain new pointers and arrays with the built-in alloc
and alloc_array operations. As we discussed last time, alloc(ty) in C0
roughly translates to malloc(sizeof(ty)) in C, with the exception that C
does not initialize allocated memory to default values. Similarly, the C0
code alloc_array(ty, n) roughly translates to calloc(n, sizeof(ty)),
and calloc does initialize allocated memory to default values. Because
both of these operations can return NULL, we also introduced xmalloc and
xcalloc that allow us to safely assume a non-NULL result.
C also gives us a new way to create pointers. If e is an expression (like
x, A[12], or *x) that describes a memory location which we can read from
and potentially write to, then the expression &e gives us a pointer to that
memory location. In C0, if we have a struct containing a string and an
integer, it’s not possible to get a pointer to just the integer. This is possible
in C:

struct wcount {
  char *word;
  int count;
};

void increment(int *p) {
  REQUIRES(p != NULL);
  *p = *p + 1;
}

void increment_count(struct wcount *wc) {
  REQUIRES(wc != NULL);
  increment(&(wc->count));
}

Because the type of wc->count is int, the expression &(wc->count) is
a pointer to an int. Calling increment_count on a non-NULL struct pointer will
cause the count field of the struct to be incremented by the increment func-
tion, which is passed a pointer to the second field of the struct.
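For illustration, a client could exercise these functions as follows (the string literal is just an example):

struct wcount *wc = xmalloc(sizeof(struct wcount));
wc->word = "hello";
wc->count = 0;
increment_count(wc);   /* wc->count is now 1 */
free(wc);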


3 Stack Allocation
In C, we can also allocate data on the system stack (which is different from
the explicit stack data structure used in the running example). As discussed
in the lecture on memory layout, each function allocates memory in its so-
called stack frame for local variables. We can obtain a pointer to this memory
using the address-of operator. For example:
int main() {
  int a1 = 1;
  int a2 = 2;
  increment(&a1);
  increment(&a2);
  ...
}
Note that there is no call to malloc or calloc, which allocate space on the
system heap (again, this is different from the heap data structure we used
for priority queues).
Note that we can only free memory allocated with malloc or calloc,
but not memory that is on the system stack. Such memory will automat-
ically be freed when the function whose frame it belongs to returns. This
has two important consequences. The first is that the following is a bug,
because free will try to free the memory holding a1, which is not on the
heap:

int main() {
  int a1 = 1;
  int a2 = 2;
  free(&a1);   /* bug: a1 is on the system stack, not the heap */
  ...
}
The second consequence is that pointers to data stored on the system stack do
not survive the function's return. For example, the following is a bug:

int *f_ohno() {
  int a = 1; /* bug: a is deallocated when f_ohno() returns */
  return &a;
}
A correct implementation requires us to allocate on the system heap, using
a call to malloc or calloc (or one of the library functions which calls them
in turn).


int *f() {
  int* x = xmalloc(sizeof(int));
  *x = 1;
  return x;
}
In general, stack allocation is more efficient than heap allocation, be-
cause it is freed automatically when the function in which it is defined
returns. That removes the overhead of managing the memory explicitly.
However, if the data structure we allocate needs to survive past the end of
the current function, you must allocate it on the heap.
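For example, a caller of f above receives heap memory and is then responsible for eventually freeing it:

int *p = f();      /* *p == 1; the allocation survives the call to f */
*p = *p + 1;
free(p);           /* the caller must free what f allocated */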

4 Pointer Arithmetic in C
We have already discussed that C does not distinguish between pointers
and arrays; essentially a pointer holds a memory address which may be
the beginning of an array. In C we can actually calculate with memory
addresses. Before we explain how, please heed our recommendation:

Do not perform arithmetic on pointers!
Code with explicit pointer arithmetic will generally be harder to read and
is more error-prone than using the usual array access notation A[i].
Now that you have been warned, here is how it works. We can add an
integer to a pointer in order to obtain a new address. In our running ex-
ample, we can allocate an array and then push pointers to the first, second,
and third elements in the array onto a stack.
int* A = xcalloc(3, sizeof(int));
A[0] = 0; A[1] = 1; A[2] = 2;
increment(A); /* A[0] now equals 1 */
increment(A+1); /* A[1] now equals 2 */
increment(A+2); /* A[2] now equals 3 */
The actual address denoted by A + 1 depends on the size of the elements
stored at *A, in this case, the size of an int. A much better way to achieve
the same effect is
int* A = xcalloc(3, sizeof(int));
A[0] = 0; A[1] = 1; A[2] = 2;
increment(&A[0]); /* A[0] now equals 1 */
increment(&A[1]); /* A[1] now equals 2 */
increment(&A[2]); /* A[2] now equals 3 */

We cannot free array elements individually, even though they are located
on the heap. The rule is that we can apply free only to pointers returned
from malloc or calloc. So in the example code we can only free A.

int* A = xcalloc(3, sizeof(int));
A[0] = 0; A[1] = 1; A[2] = 2;
free(&A[2]); /* bug: cannot free A[1] or A[2] separately */

The correct way to free this is as follows.

int* A = xcalloc(3, sizeof(int));
A[0] = 0; A[1] = 1; A[2] = 2;
free(A);

5 Numbers in C
In addition to the undefined behavior resulting from bad memory access
(dereferencing a NULL pointer or reading outside of an array), there is
undefined arithmetic behavior in C. In particular:

• Division by zero is undefined. (In C0, this always causes an excep-


tion.)

• Shifting left or right by negative numbers or by a too-large number is


undefined. (In C0, this always causes an exception.)

• Arithmetic overflow for signed types like int is undefined. (In C0,
this is defined as modular arithmetic.)

This has some strange effects. If x and y are signed integers, then the
expressions x < x+1 and x/y == x/y are either true or undefined (due to
signed overflow or division by zero, respectively). So the compiler is allowed
to pretend that these expressions are just true all the time. The compiler
is also allowed to behave the same way C0 does, returning false in the
first case when x is the maximum integer and raising an exception in the
second case when y is 0. The compiler is also free to check for signed
integer overflow and division by zero and start playing Rick Astley's "Never
Gonna Give You Up" if either occurs, though this last option is unlikely
in practice. Undefined behavior is unpredictable – it can and does change


dramatically between different computers, different compilers, and even
different versions of the same compiler.
The fact that signed integer arithmetic is undefined is particularly an-
noying. In situations where we expect integer overflow to occur, we need to
use unsigned types: unsigned int instead of int. As an example, consider
a simple function to compute Fibonacci numbers. There are even faster
ways of doing this, but what we do here is to allocate an array on the stack,
fill it with successive Fibonacci numbers, and finally return it at the end.

unsigned int fib(int n) {
  REQUIRES(n >= 0);
  unsigned int A[n+2]; /* stack-allocated array A */
  A[0] = 0;
  A[1] = 1;
  for (int i = 0; i <= n-2; i++)
    A[i+2] = A[i] + A[i+1];
  return A[n]; /* A is deallocated just before the actual return */
}

In addition to int, which is a signed type, there are the signed types
short and long, and unsigned versions of each of these types – short is
smaller than int and long is bigger. The numeric type char is smaller than
short and always takes up one byte. The maximum and minimum values
of these numeric types can be found in the standard header file limits.h.
C, annoyingly, does not define whether char is signed or unsigned. A
signed char is definitely signed, an unsigned char is unsigned. The type
char can be either signed or unsigned – this is implementation defined.
(C also gives us floating point numbers, float and double, but we will
not cover these in 122.)

6 Implementation-defined Behavior
It is often very difficult to say useful and precise things about the C pro-
gramming language, because many of the features of C that we have to
rely on in practice are not part of the C standard. Instead, they are things
that the C standard leaves up to the implementation – implementation
defined behaviors. Implementation defined behaviors make it quite dif-
ficult to write code on one computer that will compile and run on another
computer, because the other compiler may make completely different
choices about implementation defined behavior.


The first example we have seen is that, while a char is always exactly
one byte, we don’t know whether it is signed or unsigned – whether it
can represent integer values in the range [-128, 128) or integer values in the
range [0, 256). And it is even worse, because a byte can be more than 8 bits!
If you really want to mean “8 bits,” you should say octet.
In this class we are going to rely on a number of implementation-defined
behaviors. For example, you can always assume that bytes are 8 bits. When
it is important to not rely on integer sizes being implementation-defined, it
is possible to use the types defined in stdint.h, which defines signed and
unsigned types of specific sizes. In the systems that you are going to use for
programming, you can reasonably expect a common set of implementation-
defined behaviors: char will be a signed 8-bit integer and so on. This chart
describes how these types line up:

C (signed)   stdint.h (signed)   stdint.h (unsigned)   C (unsigned)
char         int8_t              uint8_t               unsigned char
short        int16_t             uint16_t              unsigned short
int          int32_t             uint32_t              unsigned int
long         int64_t             uint64_t              unsigned long

However, please remember that we cannot count on this correspondence
holding in all C compilers!
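For example, when exact sizes matter, we can spell them out with the stdint.h types; this is an illustrative sketch, not code from the course:

#include <stdint.h>

uint8_t byte = 0xF0;              /* exactly 8 bits, unsigned, everywhere */
int32_t delta = -122;             /* exactly 32 bits, two's complement */
uint64_t big = UINT64_C(1) << 40; /* forces a 64-bit computation */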
There are two other crucial numerical types. The first, size_t, is the
type used to represent memory sizes and array indices. The sizeof(ty) op-
eration in C actually returns just the size of a type in bytes, so malloc
and xmalloc actually take one argument of type size_t and calloc and
xcalloc take two arguments of type size_t. In the early decades of the
21st century, we’re still used to finding both 32-bit and 64-bit machines and
programs; size_t will usually be the same as uint32_t in a 32-bit system
and the same as uint64_t in a 64-bit system. The same goes for uintptr_t,
an integer type used to represent pointers. Everything in C has an address,
every address can be turned into a pointer with the address-of operation,
and every address is ultimately representable as a number. To make sense
of what it means to store a pointer in an integer type, we're going to need to
introduce a new topic, casting.

7 Casting Between Numeric Types


Suppose we have the hexadecimal value 0xF0 – the series of bits 11110000 – stored
in an unsigned char, and we want to turn that value into an int. (This is
a problem you will actually encounter later in this semester.) We can cast
this character value to an integer value by writing (int)e.

unsigned char c = 0xF0;
int i = (int)c;

However, what will the value of this integer be? You can run this code and
find out on your own, but the important thing to realize is that it’s not clear,
because there are two different stories we can tell.
In the first story, we start by transforming the unsigned char into an
unsigned int. When we cast from a small unsigned quantity to a large un-
signed quantity, we can be sure that the value will be preserved. Because
the bits 11110000 are understood as the unsigned integer 240, the unsigned
int will also be 240, written in hexadecimal as 0x000000F0. Then, when
we cast from an unsigned int to a signed int, we can expect the bits to re-
main the same (though this is really implementation defined), and because
the interpretation of signed integers is two’s-complement (also implemen-
tation defined) the final value will be 240.
In the second story, we transform the unsigned char into the signed
char. Again, the implementation-defined behavior we expect is that we will
interpret the result as an 8-bit signed two's-complement quantity, meaning
that 0xF0 is understood as -16. Then, when we cast from the small signed
quantity (char) to a large signed quantity (int), we know the quantity -16
will be preserved, meaning that we will end up with a signed integer writ-
ten in hexadecimal as 0xFFFFFFF0.

[Figure: the two casting orders for 0xF0 (as a uint8_t: 240). Preserving the value first gives 0x000000F0 (as a uint32_t: 240) and then, preserving the bit pattern, 0x000000F0 (as an int32_t: 240). Preserving the bit pattern first gives 0xF0 (as an int8_t: -16) and then, preserving the value, 0xFFFFFFF0 (as an int32_t: -16).]

The order in which we do these two steps matters! Therefore, if we
want to be clear about what result we want, we should cast in smaller steps
to be explicit about how we want our casts to work:


unsigned char c = 0xF0;
int i1 = (int)(unsigned int)c;
int i2 = (int)(char)c;
assert(i1 == 240);
assert(i2 == -16);

8 Void Pointers
In C, a special type void* denotes a pointer to a value of unknown type.
For most pointers, the type of a pointer tells C how big the pointed-to data is. When you
have a char*, it represents an address that points to one byte (or, equiva-
lently, an array of one-byte objects). When you have a int*, it represents
an address that points to four bytes (assuming the implementation defines
4-byte integers), so when C dereferences this pointer it will read or write to
four bytes at a time. A void* is just an address; C does not know how to
read or write from it. We can cast back and forth between void pointers to
other pointers.
int x = 12;
int *y = xcalloc(1, sizeof(int));
int *z;
void *px = (void*)&x;
void *py = (void*)y;
z = (int*)px;
z = (int*)py;
Casting out of void* incorrectly is generally either undefined or implementation-
defined. We can also cast between pointers and the intptr_t types that can
contain them.
int x = 12;
int *y = xcalloc(1, sizeof(int));
int *z;
intptr_t ipx = (intptr_t)&x;
uintptr_t ipy = (uintptr_t)y;
z = (int*)ipx;
z = (int*)ipy;
Thus, we don't strictly need the void* type – we could always use uintptr_t
– but it is helpful to use the C type system to help us avoid accidentally, say,
adding two pointers together.
The return type of xmalloc and company is actually a void pointer.


void *xcalloc(size_t nobj, size_t size);
void *xmalloc(size_t size);

We have not shown explicit casts when we allocate, because C is willing to
insert some casts for us. This is convenient when allocating memory, but
in other situations it is a source of buggy code and does more harm than
good. If we wanted to be explicit about the cast from void* to int*, we
would write this:

int *px = (int*)xmalloc(sizeof(int));

As one last example, while this is implementation defined behavior, we
can also store integers directly inside of a void pointer:

int x = 12;
void *px = (void*)(intptr_t)12;
int y = (int)(intptr_t)px;

This is a bit of an abuse – px does not contain a memory address, it contains
the number 12 pretending to be an address – but this is a fairly common
practice.

9 Simple Libraries
We can use void pointers to make data structures more generic. For exam-
ple, an interface to generic stacks might be specified as

typedef struct stack* stack;

bool stack_empty(stack S);    /* O(1) */
stack stack_new();            /* O(1) */
void push(stack S, void* e);  /* O(1) */
void* pop(stack S);           /* O(1) */
void stack_free(stack S);     /* S must be empty! */

Notice the use of void* for the element argument to push and for the return
type of pop.

stack S = stack_new();
struct wcount *wc = malloc(sizeof(struct wcount));
wc->word = "wherefore";
wc->count = 3;
push(S, wc);

wc = malloc(sizeof(struct wcount));
wc->word = "henceforth";
wc->count = 5;
push(S, wc);

while (!stack_empty(S)) {
  wc = (struct wcount*)pop(S);
  printf("Popped %s with count %d\n", wc->word, wc->count);
  free(wc);
}

Because we can squeeze integers into a void*, we can also use the
generic stacks to store integers:
stack S = stack_new();
push(S, (void*)(intptr_t)6);
push(S, (void*)(intptr_t)12);

while (!stack_empty(S)) {
  printf("Popped: %d\n", (int)(intptr_t)pop(S));
}
Translating stacks from C0 to C and making them generic is no different
than translating BSTs. In fact, we no longer need stacks to know about the
client interface, because rather than having one specific element, we have
a generic element. The trade-off is that we no longer know how we are
supposed to free a generic element when we free a stack. As the previous
example shows, the elements stored as void pointers might not even be
pointers!
The easy way out is to require that stack_free only be called on empty
stacks, which means there are no elements that we have to consider freeing.
This makes the implementation of stack_free simple:
void stack_free(stack S) {
  REQUIRES(is_stack(S) && stack_empty(S));
  ASSERT(S->top == S->bottom);
  free(S->top);
  free(S);
}
In the next C lecture (Lecture 22), we will learn how to extend the stack
implementation so that we can free non-empty stacks without leaks. This


strategy will also be necessary to make generic versions of more interesting


data structures like BSTs, hash tables, and priority queues.



Lecture Notes on
Tries
15-122: Principles of Imperative Computation
Thomas Cortina, Frank Pfenning, Rob Simmons

Lecture 21
April 4, 2012

1 Introduction
In the data structures implementing associative arrays so far, we have needed
either an equality operation and a hash function, or a comparison operator
with a total order on keys. Similarly, our sorting algorithms just used a total
order on keys and worked by comparisons of keys. We obtain a different
class of representations and algorithms if we analyze the structure of keys
and decompose them. In this lecture we explore tries, an example from this
class of data structures. The asymptotic complexity we obtain has a differ-
ent nature from data structures based on comparisons, depending on the
structure of the key rather than the number of elements stored in the data
structure.

2 The Boggle Word Game


The Boggle word game is played on an n × n grid (usually 4 × 4 or 5 × 5).
We have n * n dice that have letters on all 6 sides and which are shaken so
that they randomly settle into the grid. At that point we have an n × n grid
filled with letters. Now the goal is to find as many words as possible in this
grid within a specified time limit. To construct a word we can start at an
arbitrary position and use any of the 8 adjacent letters as the second letter.
From there we can again pick any adjacent letter as the third letter in the
word, and so on. We may not reuse any particular place in the grid in the
same word, but they may be in common for different words. For example,


in the grid
E F R A
H G D R
P S N A
E E B E

we have the words SEE, SEEP, and BEARDS, but not SEES. Scoring as-
signs points according to the lengths of the words found, where longer
words score higher.
One simple possibility for implementing this game is to systematically
search for potential words and then look them up in a dictionary, perhaps
stored as a sorted word list, some kind of binary search tree, or a hash table.
The problem is that there are too many potential words on the grid, so we
want to consider prefixes and abort the search when a prefix does not start
a word. For example, if we start in the upper right-hand corner and try
horizontally first, then EF is a prefix for a number of words, but EFR, EFD,
EFG, EFH are not and we can abandon our search quickly. A few more
possibilities reveal that no word with 3 letters or more in the above grid
starts in the upper left-hand corner.
Because a dictionary is sorted alphabetically, by prefix, we may be able
to use a sorted array effectively in order for the computer to play Boggle
and quickly determine all possible words on a grid. But we may still look
for potentially more efficient data structures which take into account that
we are searching for words that are constructed by incrementally extending
the prefix.

3 Multi-Way Tries
One possibility is to use a multi-way trie, where each node has a potential
child for each letter in the alphabet. Consider the word SEE. We start at the
root and follow the link labeled S, which gets us to a node on the second
level in the tree. This tree indexes all words with first character S. From
here we follow the link labeled E, which gets us to a node indexing all
words that start with SE. After one more step we are at SEE. At this point
we cannot be sure if this is a complete word or just a prefix for words stored
in it. In order to record this, we can either store a Boolean (true if the
current prefix is a complete word) or terminate the word with a special
character that cannot appear in the word itself.
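In code, a node of such a multi-way trie might be declared as follows; this is an illustrative sketch, not the representation we settle on later in this lecture:

#include <stdbool.h>

#define ALPHABET_SIZE 26

typedef struct mtrie_node mtrie;
struct mtrie_node {
  bool is_word;                /* true if the prefix leading here is a word */
  mtrie *next[ALPHABET_SIZE];  /* one child per letter, mostly NULL */
};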


Below is an example of a multi-way trie indexing the three words BE,


BEE, and BACCALAUREATE.

A" B" C" D" E" …" Z"


false"

A" B" C" D" E" …" Z"


false"

A" B" C" D" E" …" Z" A" B" C" D" E" …" Z"
false" true"

A" B" C" D" E" …" Z"


true"

While the paths to finding each word are quite short, including one more
node than characters in the word, the data structure consumes a lot of
space, because there are a lot of nearly empty arrays.
An interesting property is that the lookup time for a word is O(k),
where k is the number of characters in the word. This is independent of
how many words are stored in the data structure! Contrast this with, say,
balanced binary search trees where the search time is O(log(n)), where n is
the number of words stored. For the latter analysis we assumed that key
comparisons were constant time, which is not really true because the keys
(which are strings) have to be compared character by character. So each
comparison, while searching through a binary search tree, might take up to
O(k) individual character comparisons, which would make it O(k * log(n))
in the worst case. Compare that with O(k) for a trie.
On the other hand, the wasted space of the multi-way trie with an array
at each node costs time in practice. This is not only because this memory
must be allocated, but because on modern architectures the so-called mem-
ory hierarchy means that accesses to memory cells close to each other will be
much faster than accessing distant cells. You will learn more about this in
15-213 Computer Systems.


4 Binary Tries
The idea of the multi-way trie is quite robust, and there are useful special
cases. One of these arises if we want to represent sets of numbers. In that case
we can decompose the binary representation of numbers bit by bit in order
to index data stored in the trie. We could start with the most significant or
least significant bit, depending on the kind of numbers we expect. In this
case every node would have at most two successors, one for 0 and one for
1. This does not waste nearly as much space and can be efficient for many
purposes.
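A binary trie node would then have just two children; again a sketch with illustrative names:

#include <stdbool.h>

typedef struct btrie_node btrie;
struct btrie_node {
  bool is_member;  /* true if the bits consumed so far form a stored number */
  btrie *zero;     /* child for next bit 0 */
  btrie *one;      /* child for next bit 1 */
};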

5 Linked Lists
For the particular application we have in mind, namely searching for words
on a grid of letters, we could either use multiway tries directly (wasting
space) or use binary tries (wasting time and space, because each character
is decomposed into individual bits).
A compromise solution is replacing the array (which may end up mostly
empty) with a linked list. This gives us two fundamentally different uses of
pointers. Child pointers (drawn in blue) correspond to forward movement
through the string. The next pointers of the linked list (drawn in red), on
the other hand, connect what used to be parts of the same array list.
In this representation, it also becomes natural to have the Boolean “end
of word” flag stored with the final character, rather than one step below
the final character like we did above. (This means it’s no longer possible to
store the empty string, however.) The tree above, containing BACCALAU-
REATE, BE, and BEE, now looks like this:

[Figure: the linked-list trie containing BACCALAUREATE, BE, and BEE. Child pointers (blue) move forward through the string; next pointers (red) link alternative characters at the same position; each node carries its end-of-word flag.]


If we add OR and BED, the result looks like this:

[Figure: the same linked-list trie after additionally inserting OR and BED.]

For lookup, we have to make at most 26 comparisons between each
character in the input string and the characters stored in the tree. Therefore
search time is O(26 * k) = O(k), where k is the length of the string. Insertion
has the same asymptotic complexity bound. Note that this still does not
change with the number of strings stored, only with the length of the string.
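A node in this linked-list representation might be declared as follows (the field names are illustrative; the colors refer to the diagrams above):

#include <stdbool.h>

typedef struct ltrie_node ltrie;
struct ltrie_node {
  char c;        /* the character stored at this node */
  bool is_word;  /* end-of-word flag, stored with the final character */
  ltrie *child;  /* forward movement through the string (blue) */
  ltrie *next;   /* next entry of the linked list (red) */
};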

6 Ternary Search Tries


It should be apparent that we could get some performance gains over this
linked list solution if we kept the linked lists sorted. This is an idea that
allows us to motivate a more suitable data structure, a ternary search trie
(TST) which combines ideas from binary search trees with tries. Roughly,
at each node in a trie we store a binary search tree with characters as keys.
The entries are pointers to the subtries.
More precisely, at each node we store a character c and three pointers.
The left subtree stores all words starting with characters alphabetically less
than c. The right subtree stores all words starting with characters alphabet-
ically greater than c and the middle stores a subtrie with all words starting
with c, from the second character on. The middle children work exactly
like the child pointers from our linked list implementation. The left and
right pointers work like the next pointers from the linked list version, and
are correspondingly drawn in red in the diagram below, which contains
the words BE, BED, BEE, BACCALAUREATE, and OR. In this diagram, we
represent the flag for “end of word” using the presence or absence of an X.


First"Character"
b"

o"
Second"Character"
e" r"

a"

Third"Character"
e"

c" d"

We have not discussed any strategy for balancing TSTs. However, in


the worst case (a completely unbalanced tree) we end up with something
similar to the linked list implementation and we have to make at most 26
comparisons between each character in the input string and the characters
stored in the tree. Therefore search time is O(26 * k) = O(k), where k is the
length of the string. Even when the embedded trees are perfectly balanced,
the constant factor decreases, but not the asymptotic complexity, because
O(log(26) * k) = O(k).

7 Specifying an Interface
Specifying an interface for tries is tricky. If we want to just use tries as
dictionaries, then we can store arbitrary elements, but we commit to strings
as keys. In that sense our interface is not very abstract, but well-suited to
our application. (To relate this to our discussion above, an X in the diagram
above can represent the presence or absence of an element rather than a
bool flag.)


typedef struct trie_header *trie;

trie trie_new();
elem trie_lookup(trie TR, char *s);
void trie_insert(trie TR, char *s, elem e);
void trie_free(trie TR);

The interface above is unable to handle freeing the data encoded in a
trie. We discussed a client interface function elem_free when we ported
binary search trees to C, but here we treat it as the client's responsibility
to track and free their elements. If we typedef the type elem to be a void
pointer, we now have a data structure without generic keys (keys are de-
fined to be strings) but with generic values. By casting into and out of
void*, we can store different values in the trie without copying code the
way we did in Claclab.
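Under that convention, trie_free only reclaims the nodes themselves; a sketch, using the node representation from Section 9 below:

void tst_free(tst *T) {
  if (T == NULL) return;
  tst_free(T->left);
  tst_free(T->middle);
  tst_free(T->right);
  free(T);  /* elements stored in T->data are the client's to free */
}

void trie_free(trie TR) {
  REQUIRES(TR != NULL);
  tst_free(TR->root);
  free(TR);
}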

8 Adapting the interface


However, it turns out we oversimplified. While the interface works well
to use the trie as a kind of dictionary, it does not work well for our boggle
application since we cannot tell if a string is a valid prefix of a complete
word or not.
If we want to be able to search for valid prefixes in tries, not just com-
plete words, one option is to expose to the user a little bit of what the
internal structure of a trie looks like. The function trie_lookup_prefix
returns a sub-trie (or NULL if the prefix is not in the trie) and the function
tnode_elem checks to see if there is data (a checkmark) stored in that loca-
tion in the trie.

typedef struct trie_node tnode;


tnode *trie_lookup_prefix(trie T, char *s); /* i < strlen(s) */
elem tnode_elem(tnode *T); /* T != NULL */

Exposing more of our interface always comes with a cost, because we


may be less able to change our internal representation of tries once we have
exposed a richer interface. However, if our goal is to use tries to play Bog-
gle, some interface that checks prefixes is absolutely necessary.


9 Checking Invariants
The declarations of the types are completely straightforward.

typedef struct tst_node tst;

struct tst_node {
  char c;       /* discriminating character */
  elem data;    /* possible data element (void pointer) */
  tst *left;    /* left child in bst */
  tst *middle;  /* subtrie following c */
  tst *right;   /* right child in bst */
};

struct trie_header {
  tst *root;
};

To check that a trie is valid we use two mutually recursive functions.
One checks the order invariant for the binary search trees embedded in the
trie, the other checks the correctness of the subtries. For mutually recursive
functions we need to forward-declare the function which comes textually
second in the file so that type checking by the compiler can be done in order.

bool is_tst(tst *T, int lower, int upper);
bool is_tst_root(tst *T);

bool is_tst(tst *T, int lower, int upper) {
  if (T == NULL) return true;
  if (!(lower < T->c && T->c < upper)) return false;
  if (!(is_tst(T->left, lower, T->c))) return false;
  if (!(is_tst_root(T->middle))) return false;
  if (!(is_tst(T->right, T->c, upper))) return false;
  if (T->middle == NULL && T->data == NULL) return false;
  return true;
}

bool is_tst_root(tst *T) {
  return is_tst(T, 0, ((int)CHAR_MAX)+1);
}


It only makes sense to add one to CHAR_MAX because int is a bigger type
than char. 0 and CHAR_MAX+1 essentially function as −∞ and +∞ for check-
ing the intervals of a binary search tree with strictly positive character val-
ues as keys.

10 Implementing Lookup on TSTs


Implementing lookup is just a direct combination of traversing a trie and
searching through binary search trees. We pass a trie T, a string s and an
index i which must be a valid string index. If s[i] is the last character of
the string (as determined by checking whether s[i+1] is '\0'), we consider
the character to be word-ending.

tnode *tnode_lookup(tnode *T, char *s, size_t i) {
  REQUIRES(is_tnode_root(T));
  REQUIRES(s != NULL);
  REQUIRES(i < strlen(s));

  if (T == NULL) return NULL;
  if (s[i] < T->c) return tnode_lookup(T->left, s, i);
  if (s[i] > T->c) return tnode_lookup(T->right, s, i);
  if (s[i+1] == '\0') return T;
  return tnode_lookup(T->middle, s, i+1);
}

This function can then be used to implement several interface functions:

tnode *trie_lookup_prefix(trie TR, char *s) {
  REQUIRES(is_trie(TR));
  REQUIRES(s != NULL);
  return tnode_lookup(TR->root, s, 0);
}

elem tnode_elem(tnode *T) {
  REQUIRES(is_tnode_root(T));
  REQUIRES(T != NULL);
  return T->data; /* could be NULL */
}


elem trie_lookup(trie TR, char *s) {
  REQUIRES(is_trie(TR));
  REQUIRES(s != NULL);
  tnode *T = trie_lookup_prefix(TR, s);
  if (T == NULL)
    return NULL;
  else
    return tnode_elem(T);
}

If the tree is null, the word is not stored in the trie and we return NULL.
On the other hand, if we are at the end of the string (s[i+1] == '\0') we
return the stored data. Otherwise, we continue lookup in the left, middle,
or right subtree as appropriate. Important for the last case: if the string
character s[i] is equal to the character stored at the node, then we look for
the remainder of the word in the middle subtrie. This is implemented by
passing i + 1 to the subtrie.

11 Implementing Insertion
Insertion follows the same structure as search, which is typical for the kind
of data structure we have been considering in the last few weeks. If the tree
to insert into is null, we create a new node with the character of the string
we are currently considering (the ith) and null children and then continue
with the insertion algorithm.

tnode *tnode_insert(tnode *T, char *s, size_t i, elem e) {
  REQUIRES(is_tnode_root(T));
  REQUIRES(s != NULL);
  REQUIRES(i < strlen(s));

  if (T == NULL) {
    T = xmalloc(sizeof(struct trie_node));
    T->c = s[i];
    T->data = NULL;
    T->left = NULL;
    T->right = NULL;
    T->middle = NULL;
  }

  if (s[i] < T->c) T->left = tnode_insert(T->left, s, i, e);
  else if (s[i] > T->c) T->right = tnode_insert(T->right, s, i, e);
  else if (s[i+1] == '\0') T->data = e;
  else T->middle = tnode_insert(T->middle, s, i+1, e);
  return T;
}

As usual with recursive algorithms, we return the trie after insertion to
handle the null case gracefully, but we operate imperatively on the subtries.
At the top level we just insert into the root, with an initial index of 0. At
this (non-recursive) level, insertion is done purely by modifying the data
structure.

void trie_insert(trie TR, char *s, elem e) {
  REQUIRES(is_trie(TR));
  REQUIRES(s != NULL);
  TR->root = tnode_insert(TR->root, s, 0, e);
}


Exercises
Exercise 1 Implement the game of Boggle as sketched in this lecture. Make sure
to pick the letters according to the distribution of their occurrence in the English
language. You might use the Scrabble dictionary itself, for example, to calculate
the relative frequency of the letters.
If you are ambitious, try to design a simple textual interface to print a random
grid and then input words from the human player and show the words missed by
the player.

Exercise 2 Modify the implementation of search in TSTs so it can process a star
(’*’) character in the search string. It can match any number of characters for
words stored in the trie. This matching is done by adding all matching strings to a
queue that is an input argument to the generalized search function.
For example, after we insert BE, BED, and BACCALAUREATE, the string
"BE*" matches the first two words, and "*A*" matches only the third, in three
different ways. The search string "*" should match the entire set of words stored
in the trie and produce them in alphabetical order. You should decide if the different
ways to match a search string should show up multiple times in the result queue
or just once.

Exercise 3 Modify TSTs to use a special non-alphabetic character (either period
’.’ or the nil character ’\0’), which is shown as a small filled circle in the dia-
gram, to represent the end of the word.

[Figure: a TST storing BE, BED, and BACCALAUREATE in which each word is terminated by an end-of-word node, drawn as a small filled circle.]


Exercise 4 Using this modified TST implementation from the question above, al-
low repeated searching of prefixes with the following interface:

/* Prefix search interface */


typedef struct trie_node tnode;
tnode *trie_root(trie TR);
tnode *tnode_lookup_prefix(tnode *T, char *s); /* i < strlen(s) */
elem tnode_elem(tnode *T);

The tnode pointer returned by trie_root or tnode_lookup_prefix should be a
pointer to the root of a TST, and tnode_elem will look for the special character
’.’ or ’\0’ by following left and right pointers and return any data stored at
that special node.

Exercise 5 Consider other implementations of the interface above that allow re-
peated searching of prefixes.



Lecture Notes on
Generic Data Structures
15-122: Principles of Imperative Computation
Frank Pfenning

Lecture 22
November 15, 2012

1 Introduction
Using void* to represent pointers to values of arbitrary type, we were able
to implement generic stacks in that the types of the elements were arbitrary.
The main remaining restriction was that they had to be pointers. Generic
queues or unbounded arrays can be implemented in an analogous fashion.
However, when considering, say, hash tables or binary search trees, we run
into difficulties because implementations of these data structures require
operations on data provided by the client. For example, a hash table im-
plementation requires a hash function and an equality function on keys.
Similarly, binary search trees require a comparison function on keys with
respect to an order. In this lecture we show how to overcome this limitation
using function pointers as introduced in the previous lecture.

2 The Hash Table Interface Revisited


Recall the client-side interface for hash tables, in file ht-resize.c0. The client
must provide a type elem (which must be a pointer), a type key (which
was arbitrary), a hash function on keys, an equality function on keys, and
a function to extract a key from an element. We write ___ while a concrete
type must be supplied there in the actual file.

/************************************/
/* Hash table client-side interface */
/************************************/


typedef ___* elem;


typedef ___ key;

int hash(key k, int m)


//@requires m > 0;
//@ensures 0 <= \result && \result < m;
;

bool key_equal(key k1, key k2);

key elem_key(elem e)
//@requires e != NULL;
;

We were careful to write the implementation so that it did not need to know
what these types and functions were. But due to limitations in C0, we could
not obtain multiple implementations of hash tables to be used in the same
application, because once we fix elem, key, and the above three functions,
they cannot be changed.
Given the above the library provides a type ht of hash tables and means
to create, insert, and search through a hash table.

/*************************************/
/* Hash table library side interface */
/*************************************/
typedef struct ht_header* ht;

ht ht_new(int capacity)
//@requires capacity > 0;
;
elem ht_lookup(ht H, key k); /* O(1) avg. */
void ht_insert(ht H, elem e) /* O(1) avg. */
//@requires e != NULL;
;

3 Generic Types
Since both keys and elements are defined by the clients, they turn into
generic pointer types when we implement a truly generic structure in C.


We might try the following in a file ht.h, where we have added the func-
tion ht_free to the interface. The latter takes a pointer to the function that
frees elements stored in the table, as explained in a previous lecture.

#include <stdbool.h>
#include <stdlib.h>

#ifndef _HASHTABLE_H_
#define _HASHTABLE_H_

typedef void *ht_elem;


typedef void *ht_key;

/* Hash table interface */


typedef struct ht_header *ht;

ht ht_new (size_t capacity);


void ht_insert(ht H, ht_elem e);
ht_elem ht_lookup(ht H, ht_key k);
void ht_free(ht H, void (*elem_free)(ht_elem e));

#endif

We use type definitions instead of writing void* in this interface so the role
of the arguments as keys or elements is made explicit (even if the compiler
is blissfully unaware of this distinction). We write ht_elem now in the C
code instead of elem to avoid clashes with functions or variables of that
name.
However, this does not yet work. Before you read on, try to think about
why not, and how we might solve it.


4 Generic Operations via Function Pointers


The problem with the approach in the previous section is that the imple-
mentation of hash tables must call the functions elem_key, key_equal, and
hash. Their types would now involve void*, but in the environment in
which the hash table implementation is compiled, there can still only be
one of each of these functions. This means the implementation cannot be
truly generic. We could not even use two hash tables with different element
types simultaneously this way.
Instead, we should pass pointers to these functions! But where do we
pass them? We could pass all three to ht_insert and ht_lookup, where
they are actually used. However, it is awkward to do this on every call.
We notice that for a particular hash table, all three functions should be the
same for all calls to insert into and search this table, because a single hash
table stores elements of the same type and key. We can therefore pass these
functions just once, when we first create the hash table, and store them with
the table!
This gives us the following interface (in file ht.h):
#include <stdbool.h>
#include <stdlib.h>

#ifndef _HASHTABLE_H_
#define _HASHTABLE_H_

typedef void* ht_key;


typedef void* ht_elem;

/* Hash table interface */


typedef struct ht* ht;

ht ht_new(size_t capacity,
          ht_key (*elem_key)(ht_elem e),
          bool (*key_equal)(ht_key k1, ht_key k2),
          unsigned int (*key_hash)(ht_key k, unsigned int m));
void ht_insert(ht H, ht_elem e);
ht_elem ht_lookup(ht H, ht_key k);
void ht_free(ht H, void (*elem_free)(ht_elem e));

#endif
We have made some small changes to exploit the presence of unsigned
integers (in key_hash) and the likewise unsigned size_t type to provide
more appropriate types to certain functions.
Storing the function for manipulating the data brings us closer to the
realm of object-oriented programming where such functions are called meth-
ods, and the structure they are stored in are objects. We don’t pursue this
analogy further in this course, but you may see it in follow-up courses,
specifically 15-214 Software System Construction.

5 Using Generic Hashtables


First, we see how the client code works with the above interface. We use
here the example of word counts, which we also used to illustrate and test
hash tables earlier. The structure contains a string and a count.

/* elements */
struct wc {
  char *word;  /* key */
  int count;   /* information */
};
typedef struct wc *ht_elem;

As mentioned before, strings are represented as arrays of characters (type


char*). The C function strcmp from library with header string.h com-
pares strings. We then define:

bool word_equal(ht_key w1, ht_key w2) {
  return strcmp((char*)w1, (char*)w2) == 0;
}

Keep in mind that ht_key is defined to be void*. We therefore have to
cast it to the appropriate type char* before we pass it to strcmp, which
requires two strings as arguments. Similarly, when extracting a key from
an element, we are given a pointer of type void* and have to cast it to
type struct wc*.

/* extracting keys from elements */
ht_key elem_key(ht_elem e) {
  REQUIRES(e != NULL);
  struct wc *wcount = (struct wc*)e;
  return wcount->word;
}


The hash function is defined in a similar manner.
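The notes do not show it, but a sketch under these conventions might look as follows; the mixing constant 31 is illustrative, not from the course, and the unsigned arithmetic deliberately relies on well-defined unsigned overflow:

unsigned int key_hash(ht_key k, unsigned int m) {
  char *s = (char*)k;               /* keys are strings */
  unsigned int h = 0;
  for (size_t i = 0; s[i] != '\0'; i++)
    h = 31 * h + (unsigned char)s[i];  /* unsigned overflow is defined */
  return h % m;                     /* result in [0, m) */
}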


Here is an example where we insert strings created from integers (func-
tion itoa) into a hash table and then search for them.
int n = (1<<10);
ht H = ht_new(n/5, &elem_key, &key_equal, &key_hash);
for (int i = 0; i < n; i++) {
  struct wc *e = xmalloc(sizeof(struct wc));
  e->word = itoa(i);
  e->count = i;
  ht_insert(H, e);
}
for (int i = 0; i < n; i++) {
  char *s = itoa(i);
  struct wc *wcount = (struct wc*)ht_lookup(H, s);
  assert(wcount->count == i);
  free(s);
}
Note the required cast when we receive an element from the table, while
the arguments e and s do not need to be cast because the conversion from
t* to void* is performed implicitly by the compiler.

6 Implementing Generic Hash Tables


The hash table structure, defined in file hashtable.c now needs to store
the function pointers passed to it.
struct ht_header {
  size_t size;       /* size >= 0 */
  size_t capacity;   /* capacity > 0 */
  chain **table;     /* \length(table) == capacity */
  ht_key (*elem_key)(ht_elem e);
  bool (*key_equal)(ht_key k1, ht_key k2);
  unsigned int (*key_hash)(ht_key k, unsigned int m);
  void (*elem_free)(ht_elem e);
};
We have also decided here to add the elem_free function to the hash table header, instead of passing it in to the free function. This exploits the fact that we can generally anticipate how the elements will be freed when we first create the hash table. A corresponding change must be made in the header file ht.h.


ht ht_new(size_t capacity,
          ht_key (*elem_key)(ht_elem e),
          bool (*key_equal)(ht_key k1, ht_key k2),
          unsigned int (*key_hash)(ht_key k, unsigned int m),
          void (*elem_free)(ht_elem e))
{
  REQUIRES(capacity > 0);
  ht H = xmalloc(sizeof(struct ht_header));
  H->size = 0;
  H->capacity = capacity;
  H->table = xcalloc(capacity, sizeof(chain*)); /* initialized to NULL */
  H->elem_key = elem_key;
  H->key_equal = key_equal;
  H->key_hash = key_hash;
  H->elem_free = elem_free;
  ENSURES(is_ht(H));
  return H;
}

When we search for an element (and insertion is similar) we retrieve the functions from the hash table structure and call them. It is good style to wrap this in short functions to make the code more readable. We use here the static inline specifier to instruct the compiler to inline the function, which means that wherever a call to this function occurs, we just replace it by the body. This provides a similar but semantically cleaner and less error-prone alternative to C preprocessor macros.

static inline ht_key elemkey(ht H, ht_elem e) {
  return (*H->elem_key)(e);
}

static inline bool keyequal(ht H, ht_key k1, ht_key k2) {
  return (*H->key_equal)(k1, k2);
}

static inline unsigned int keyhash(ht H, ht_key k, unsigned int m) {
  return (*H->key_hash)(k, m);
}


We exploit here that C allows function pointers to be directly applied to arguments, implicitly dereferencing the pointer. We use these wrappers in the lookup function:

/* ht_lookup(H, k) returns NULL if key k not present in H */
ht_elem ht_lookup(ht H, ht_key k)
{
  REQUIRES(is_ht(H));
  unsigned int i = keyhash(H, k, H->capacity);
  chain *p = H->table[i];
  while (p != NULL) {
    ASSERT(p->data != NULL);
    if (keyequal(H, elemkey(H, p->data), k))
      return p->data;
    else
      p = p->next;
  }
  /* not in chain */
  return NULL;
}

This concludes this short discussion of generic implementations of libraries, exploiting void* and function pointers.
In more modern languages such as ML, so-called parametric polymorphism can eliminate the need for checks when coercing from void*. The corresponding construct in object-oriented languages such as Java is usually called generics. We do not discuss these in this course.


7 A Subtle Memory Leak


Let’s look at the beginning of the code for insertion into the hash table.

void ht_insert(ht H, ht_elem e) {
  REQUIRES(is_ht(H));
  REQUIRES(e != NULL);
  ht_key k = elemkey(H, e);
  unsigned int i = keyhash(H, k, H->capacity);

  chain *p = H->table[i];
  while (p != NULL) {
    ASSERT(is_chain(H, i, NULL));
    ASSERT(p->data != NULL);
    if (keyequal(H, elemkey(H, p->data), k)) {
      /* overwrite existing element */
      p->data = e;
      return;
    } else {
      p = p->next;
    }
  }
  ASSERT(p == NULL);
  ...
}

At the end of the while loop, we know that the key k is not already in the hash table. But this code fragment has a subtle memory leak. Can you see it?[1]

[1] The code author overlooked this in the port of the code from C0 to C, but one of the students noticed.


The problem is that when we overwrite p->data with e, the element currently stored in that field may be lost and can potentially no longer be freed.
There seem to be two solutions. The first is for the hash table to apply
the elem_free function it was given. We should guard this with a check
that the element we are inserting is indeed new, otherwise we would have
a freed element in the hash table, leading to undefined behavior.

if (keyequal(H, elemkey(H, p->data), k)) {
  /* free existing element, if different from new one */
  if (p->data != e) (*H->elem_free)(p->data);
  /* overwrite existing element */
  p->data = e;
  return;
}

The client has to be aware that the element already in the table will be freed when a new one with the same key is added.
In order to avoid this potentially dangerous convention, we can also just return the old element if there is one, and NULL otherwise. The information that such an element already existed may be useful to the client in other situations, so this seems like the preferable solution. The client can always immediately apply its element free function if that is appropriate. This requires a small change in the interface, but first we show the relevant code.

chain *p = H->table[i];
while (p != NULL) {
  ASSERT(p->data != NULL);
  if (keyequal(H, elemkey(H, p->data), k)) {
    /* overwrite existing element and return it */
    ht_elem tmp = p->data;
    p->data = e;
    return tmp;
  } else {
    p = p->next;
  }
}


The relevant part of the revised header file ht.h now reads:

typedef void* ht_elem;
typedef void* ht_key;

typedef struct ht_header* ht;

ht ht_new(size_t capacity,
          ht_key (*elem_key)(ht_elem e),
          bool (*key_equal)(ht_key k1, ht_key k2),
          unsigned int (*key_hash)(ht_key k, unsigned int m),
          void (*elem_free)(ht_elem e));

/* ht_insert(H,e) returns the previous element with the key of e, if one exists */
ht_elem ht_insert(ht H, ht_elem e);

/* ht_lookup(H,k) returns NULL if no element with key k exists */
ht_elem ht_lookup(ht H, ht_key k);

void ht_free(ht H);
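With this convention, a client that wants the old element freed can do so itself. A small sketch (free_wc is a hypothetical client function, not part of the interface, that frees a struct wc together with its string):

/* client code sketch for the revised ht_insert */
struct wc *old = (struct wc*)ht_insert(H, e);
if (old != NULL && old != e)
  free_wc(old);   /* hypothetical: frees old->word, then old itself */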

8 Separate Compilation
Although the C language does not provide much support for modularity,
convention helps. The convention rests on a distinction between header files
(with extension .h) and program files (with extension .c).
When we implement a data structure or other code, we provide not
only filename.c with the code, but also a header file filename.h with
declarations providing the interface for the code in filename.c. The im-
plementation filename.c contains #include "filename.h" at its top, and any client will have the same line. The fact that both implementation and client
include the same header file provides a measure of consistency between
the two.
Header files filename.h should never contain any function definitions
(that is, code), only type definitions, structure declarations, macros, and
function declarations (so-called function prototypes). In contrast, program
files filename.c can contain both declarations and definitions, with the
understanding that the definitions are not available to other files.
We only ever #include header files, never program files, in order to
maintain the separation between code and interface.


When gcc is invoked with multiple files, it behaves somewhat differently than cc0. It compiles each file separately, referring only to the included header files. Those come in two forms, #include <syslib.h> where syslib is a system library, and #include "filename.h", where filename.h is provided in the local directory. Therefore, if the right header files are not included, the program file will not compile correctly. We never pass a header file directly to gcc.
The compiler then produces a separate so-called object file filename.o for each filename.c that is compiled. All the object files are then linked together to create the executable. By default, that is a.out, but a different name can be given with the -o executable switch.
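For the hash table example, compiling and linking might look as follows (the file names here are hypothetical; suppose ht-test.c contains the main function and includes "ht.h"):

% gcc -Wall -o ht-test ht.c ht-test.c
% ./ht-test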
Let us summarize the most important conventions:

• Every file filename, except for the one with the main function, has a
header file filename.h and a program file filename.c.

• The program filename.c and any client that would like to use it has
a line #include "filename.h" at the beginning.

• The header file filename.h never contains any code, only macros, type definitions, structure definitions, and function prototypes. It has appropriate header guards to avoid problems if it is loaded more than once (see the sketch after this list).

• We never #include any program files, only header files (with .h ex-
tension).

• We only pass program files (with .c extension) to gcc on the command line.
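As a minimal illustration of these conventions, a hypothetical header file foo.h might look as follows (the names foo and foo_new are made up for this sketch):

/* foo.h -- declarations only, protected by header guards */
#ifndef _FOO_H_
#define _FOO_H_

typedef struct foo_header* foo;   /* type definition */

foo foo_new(int x);               /* function prototypes, no code */
void foo_free(foo F);

#endif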


Exercises
Exercise 1 Convert the interface and implementation for binary search trees from
C0 to C and make them generic. Also convert the testing code, and verify that no
memory is leaked in your tests. Make sure to adhere to the conventions described
in Section 8.



EXAMPLE 1

int main () {
return (3+4)*5/2;
}

We compile it with
% cc0 -b ex1.c0
to generate the corresponding byte code file ex1.bc0:

C0 C0 FF EE # magic number
00 09 # version 4, arch = 1 (64 bits)

00 00 # int pool count


# int pool

00 00 # string pool total size


# string pool

00 01 # function count
# function_pool

#<main>
00 00 # number of arguments = 0
00 00 # number of local variables = 0
00 0C # code length = 12 bytes
10 03 # bipush 3 # 3
10 04 # bipush 4 # 4
60 # iadd # (3 + 4)
10 05 # bipush 5 # 5
68 # imul # ((3 + 4) * 5)
10 02 # bipush 2 # 2
6C # idiv # (((3 + 4) * 5) / 2)
B0 # return #

00 00 # native count
# native pool
EXAMPLE 2

int mid(int lower, int upper) {
  int mid = lower + (upper - lower)/2;
  return mid;
}

int main () {
  return mid(3,6);
}

Local variable array V = [lower, upper, mid]

Corresponding byte code for the mid function (other parts of the bytecode file not shown):

#<mid>
00 02 # number of arguments = 2
00 03 # number of local variables = 3
00 10 # code length = 16 bytes
15 00 # vload 0 # lower
15 01 # vload 1 # upper
15 00 # vload 0 # lower
64 # isub # (upper - lower)
10 02 # bipush 2 # 2
6C # idiv # ((upper - lower) / 2)
60 # iadd # (lower + ((upper - lower) / 2))
36 02 # vstore 2 # mid = ...;
15 02 # vload 2 # mid
B0 # return #
EXAMPLE 3

int next_rand(int last) {
  return last * 1664525 + 1013904223;
}

int main() {
  return next_rand(0xdeadbeef);
}

BYTECODE:

C0 C0 FF EE # magic number
00 09 # version 4, arch = 1 (64 bits)

00 03 # int pool count


# int pool
00 19 66 0D
3C 6E F3 5F
DE AD BE EF

00 00 # string pool total size


# string pool

00 02 # function count
# function_pool

#<main>
00 00 # number of arguments = 0
00 01 # number of local variables = 1
00 07 # code length = 7 bytes
13 00 02 # ildc 2 # c[2] = -559038737
B8 00 01 # invokestatic 1 # next_rand(-559038737)
B0 # return #

#<next_rand>
00 01 # number of arguments = 1
00 01 # number of local variables = 1
00 0B # code length = 11 bytes
15 00 # vload 0 # last
13 00 00 # ildc 0 # c[0] = 1664525
68 # imul # (last * 1664525)
13 00 01 # ildc 1 # c[1] = 1013904223
60 # iadd # ((last * 1664525) + 1013904223)
B0 # return #

00 00 # native count
# native pool
EXAMPLE 4

int main () {
  int sum = 0;
  for (int i = 1; i < 100; i += 2)
    //@loop_invariant 0 <= i && i <= 100;
    sum += i;
  return sum;
}

BYTECODE (only <main> shown):

#<main>
00 00 # number of arguments = 0
00 02 # number of local variables = 2
00 26 # code length = 38 bytes
10 00 # bipush 0 # 0
36 00 # vstore 0 # sum = 0;
10 01 # bipush 1 # 1
36 01 # vstore 1 # i = 1;
# <00:loop>
15 01 # vload 1 # i
10 64 # bipush 100 # 100
A1 00 06 # if_icmplt +6 # if (i < 100) goto <01:body>
A7 00 14 # goto +20 # goto <02:exit>
# <01:body>
15 00 # vload 0 # sum
15 01 # vload 1 # i
60 # iadd #
36 00 # vstore 0 # sum += i;
15 01 # vload 1 # i
10 02 # bipush 2 # 2
60 # iadd #
36 01 # vstore 1 # i += 2;
A7 FF E8 # goto -24 # goto <00:loop>
# <02:exit>
15 00 # vload 0 # sum
B0 # return #
EXAMPLE 5

struct point {
  int x;
  int y;
};
typedef struct point* point;

point reflect(point p) {
  point q = alloc(struct point);
  q->x = p->y;
  q->y = p->x;
  return q;
}

int main () {
  point p = alloc(struct point);
  p->x = 1;
  p->y = 2;
  point q = reflect(p);
  return q->x*10 + q->y;
}

BYTECODE (only <reflect> shown):

#<reflect>
00 01 # number of arguments = 1
00 02 # number of local variables = 2
00 1B # code length = 27 bytes
BB 08 # new 8 # alloc(struct point)
36 01 # vstore 1 # q = alloc(struct point);
15 01 # vload 1 # q
62 00 # aaddf 0 # &q->x
15 00 # vload 0 # p
62 04 # aaddf 4 # &p->y
2E # imload # p->y
4E # imstore # q->x = p->y;
15 01 # vload 1 # q
62 04 # aaddf 4 # &q->y
15 00 # vload 0 # p
62 00 # aaddf 0 # &p->x
2E # imload # p->x
4E # imstore # q->y = p->x;
15 01 # vload 1 # q
B0 # return #
EXAMPLE 6

#use <conio>

int main() {
  int[] A = alloc_array(int, 100);
  for (int i = 0; i < 100; i++)
    A[i] = i;
  return A[99];
}

BYTECODE (only <main> shown):

#<main>
00 00 # number of arguments = 0
00 02 # number of local variables = 2
00 2D # code length = 45 bytes
10 64 # bipush 100 # 100
BC 04 # newarray 4 # alloc_array(int, 100)
36 00 # vstore 0 # A = alloc_array(int, 100);
10 00 # bipush 0 # 0
36 01 # vstore 1 # i = 0;
# <00:loop>
15 01 # vload 1 # i
10 64 # bipush 100 # 100
A1 00 06 # if_icmplt +6 # if (i < 100) goto <01:body>
A7 00 15 # goto +21 # goto <02:exit>
# <01:body>
15 00 # vload 0 # A
15 01 # vload 1 # i
63 # aadds # &A[i]
15 01 # vload 1 # i
4E # imstore # A[i] = i;
15 01 # vload 1 # i
10 01 # bipush 1 # 1
60 # iadd #
36 01 # vstore 1 # i += 1;
A7 FF E7 # goto -25 # goto <00:loop>
# <02:exit>
15 00 # vload 0 # A
10 63 # bipush 99 # 99
63 # aadds # &A[99]
2E # imload # A[99]
B0 # return #
EXAMPLE 7

#use <string>
#use <conio>

int main () {
  string h = "Hello ";
  string hw = string_join(h, "World!\n");
  print(hw);
  return string_length(hw);
}

BYTECODE:

C0 C0 FF EE # magic number
00 09 # version 4, arch = 1 (64 bits)

00 00 # int pool count


# int pool

00 0F # string pool total size


# string pool
48 65 6C 6C 6F 20 00 # "Hello "
57 6F 72 6C 64 21 0A 00 # "World!\n"

00 01 # function count
# function_pool

#<main>
00 00 # number of arguments = 0
00 02 # number of local variables = 2
00 1B # code length = 27 bytes
14 00 00 # aldc 0 # s[0] = "Hello "
36 00 # vstore 0 # h = "Hello ";
15 00 # vload 0 # h
14 00 07 # aldc 7 # s[7] = "World!\n"
B7 00 00 # invokenative 0 # string_join(h, "World!\n")
36 01 # vstore 1 # hw = ...
15 01 # vload 1 # hw
B7 00 01 # invokenative 1 # print(hw)
57 # pop # (ignore result)
15 01 # vload 1 # hw
B7 00 02 # invokenative 2 # string_length(hw)
B0 # return #

00 03 # native count
# native pool
00 02 00 4F # string_join
00 01 00 06 # print
00 01 00 50 # string_length
Lecture Notes on
Programs as Data: The C0VM

15-122: Principles of Imperative Computation


Thomas Cortina
Notes by Frank Pfenning

Lecture 23
November 20, 2012

1 Introduction
A recurring theme in computer science is to view programs as data. For
example, a compiler has to read a program as a string of characters and
translate it into some internal form, a process called parsing. Another in-
stance is first-class functions, which you will study in great depth in 15-150, a course dedicated to functional programming. When you learn about computer systems in 15-213 you will see how programs are represented as
machine code in binary form.
In this lecture we will take a look at a virtual machine. In general, when
a program is read by a compiler, it will be translated to some lower-level
form that can be executed. For C and C0, this is usually machine code. For
example, the cc0 compiler you have been using in this course translates
the input file to a file in the C language, and then a C compiler (gcc) trans-
lates that in turn into code that can be executed directly by the machine. In
contrast, Java implementations typically translate into some intermediate
form called byte code which is saved in a class file. Byte code is then inter-
preted by a virtual machine called the JVM (for Java Virtual Machine). So
the program that actually runs on the machine hardware is the JVM which
interprets byte code and performs the requested computations.
Using a virtual machine has one big drawback, which is that it will be
slower than directly executing a binary on the machine. But it also has a
number of important advantages. One is portability: as long as we have an
implementation of the virtual machine on our target computing platform,
we can run the byte code there. So we need a virtual machine implementa-
tion for each computing platform, but only one compiler. A second advan-
tage is safety: when we execute binary code, we give away control over the
actions of the machine. When we interpret byte code, we can decide at each
step if we want to permit an action or not, possibly terminating execution if
the byte code would do something undesirable like reformatting the hard
disk or crashing the computer. The combination of these two advantages
led the designers of Java to create an abstract machine. The intent was for
Java to be used for mobile code, embedded in web pages or downloaded
from the Internet, which may not be trusted or simply be faulty. Therefore
safety was one of the overriding concerns in the design.
In this lecture we explore how to apply the same principles to develop
a virtual machine to implement C0. We call this the C0VM and in Assign-
ment 8 of this course you will have the opportunity to implement it. The
cc0 compiler has an option (-b) to produce bytecode appropriate for the
C0VM. This will give you insight not only into programs-as-data, but also
into how C0 is executed, its operational semantics.
As a side remark, at the time the C language was designed, machines
were slow and memory was scarce compared to today. Therefore, efficiency
was a principal design concern. As a result, C sacrificed safety in a number
of crucial places, a decision we still pay for today. Any time you download
a security patch for some program, chances are a virus or worm or other
malware was found that takes advantage of the lack of safety in C in order
to attack your machine. The most gaping hole is that C does not check if
array accesses are in bounds. So by assigning to A[k] where k is greater
than the size of the array, you may be able to write to some arbitrary place
in memory and, for example, install malicious code. In 15-213 Computer Systems you will learn precisely how these kinds of attacks work, because
you will carry out some of your own!
In C0, we spent considerable time and effort to trim down the C lan-
guage so that it would permit a safe implementation. This makes it mar-
ginally slower than C on some programs, but it means you will not have
to try to debug programs that crash unpredictably. You have been introduced to all the unsafe features of C when the course switched to C, and we taught you programming practices that avoid these kinds of behavior. But avoiding unsafe behavior consistently is very difficult, even for experienced teams of programmers, as the
large number of security-relevant bugs in today’s commercial software at-
tests. One might ask why program in C at all? One reason is that many
of you, as practicing programmers, will have to deal with large amounts
of legacy code that is written in C or C++. As such, you should be able to
understand, write, and work with these languages. The other reason is that
there are low-level systems-oriented programs such as operating systems
kernels, device drivers, garbage collectors, networking software, etc. that
are difficult to write in safe languages and are usually written in a combina-
tion of C and machine code. But don’t lose hope: research in programming
languages has made great strides over the last two decades, and there is an
ongoing effort at Carnegie Mellon to build an operating system based on
a safe language that is a cousin of C. So perhaps we won’t be tied to an
unsafe language and a flood of security patches forever.
Implementation of a virtual machine is actually one of the applications
where even today C is usually the language of choice. That’s because C
gives you control over the memory layout of data, and also permits the
kind of optimizations that are crucial to make a virtual machine efficient.
Here, we don’t care so much about efficiency, being mostly interested in
correctness and clarity, but we still use C to implement the C0VM.

2 A Stack Machine
The C0VM is a stack machine. This means that the evaluation of expressions
uses a stack, called the operand stack. It is written from left to right, with the
rightmost element denoting the top of the stack.
We begin with a simple example, evaluating an expression without
variables:
(3 + 4) * 5 / 2
In the table below we show the virtual machine instruction on left, in tex-
tual form, and the operand stack after the instruction on the right has been
executed. We write ‘·’ for the empty stack.
Instruction Operand Stack
·
bipush 3 3
bipush 4 3, 4
iadd 7
bipush 5 7, 5
imul 35
bipush 2 35, 2
idiv 17
The translation of expressions to instructions is what a compiler would
normally do. Here we just write the instructions by hand, in effect simulating the compiler. The important part is that executing the instructions will
compute the correct answer for the expression. We always start with the
empty stack and end up with the answer as the only item on the stack.
In the C0VM, instructions are represented as bytes. This means we only
have at most 256 different instructions. Some of these instructions require
more than one byte. For example, the bipush instruction requires a second
byte for the number to push onto the stack. The following is an excerpt
from the C0VM reference, listing only the instructions needed above.
0x10 bipush <b> S -> S,b
0x60 iadd S,x,y -> S,x+y
0x68 imul S,x,y -> S,x*y
0x6C idiv S,x,y -> S,x/y
On the right-hand side we see the effect of the operation on the stack S.
Using these codes we can translate the expression into byte code.

Code Instruction Operand Stack


·
10 03 bipush 3 3
10 04 bipush 4 3, 4
60 iadd 7
10 05 bipush 5 7, 5
68 imul 35
10 02 bipush 2 35, 2
6C idiv 17

In the figure above, and in the rest of these notes, we always show bytecode
in hexadecimal form, without the 0x prefix. In a binary file that contains
this program we would just see the bytes
10 03 10 04 60 10 05 68 10 02 6C
and it would be up to the C0VM implementation to interpret them ap-
propriately. The file format we use is essentially this, except we don’t
use binary but represent the hexadecimal numbers as strings separated by
whitespace, literally as written in the display above.
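To make "interpreting them appropriately" concrete, here is a minimal sketch of such an interpreter in C, restricted to just these four instructions plus return. The hard-coded program array and the fixed-size stack are simplifying assumptions for illustration, not the design of the actual C0VM:

#include <stdio.h>
#include <stdint.h>

int main(void) {
  /* bipush 3, bipush 4, iadd, bipush 5, imul, bipush 2, idiv, return */
  uint8_t P[] = { 0x10,3, 0x10,4, 0x60, 0x10,5, 0x68, 0x10,2, 0x6C, 0xB0 };
  int32_t S[16];        /* operand stack */
  size_t sp = 0;        /* next free slot */
  size_t pc = 0;
  for (;;) {
    switch (P[pc]) {
    case 0x10:          /* bipush <b>: push the sign-extended byte */
      S[sp++] = (int32_t)(int8_t)P[pc+1]; pc += 2; break;
    case 0x60: sp--; S[sp-1] = S[sp-1] + S[sp]; pc++; break; /* iadd */
    case 0x68: sp--; S[sp-1] = S[sp-1] * S[sp]; pc++; break; /* imul */
    case 0x6C: sp--; S[sp-1] = S[sp-1] / S[sp]; pc++; break; /* idiv */
    case 0xB0:          /* return: exactly one value left on the stack */
      printf("%d\n", S[sp-1]);  /* prints 17 */
      return 0;
    }
  }
}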

3 Compiling to Bytecode
The cc0 compiler provides an option -b to generate bytecode. You can use
this to experiment with different programs to see what they translate to.
For the simple arithmetic expression from the previous section we could
create a file ex1.c0:
int main () {
  return (3+4)*5/2;
}
We compile it with
% cc0 -b ex1.c0
which will write a file ex1.bc0. In the current version of the compiler, this
has the following content:
C0 C0 FF EE # magic number
00 09 # version 4, arch = 1 (64 bits)

00 00 # int pool count


# int pool

00 00 # string pool total size


# string pool

00 01 # function count
# function_pool

#<main>
00 00 # number of arguments = 0
00 00 # number of local variables = 0
00 0C # code length = 12 bytes
10 03 # bipush 3 # 3
10 04 # bipush 4 # 4
60 # iadd # (3 + 4)
10 05 # bipush 5 # 5
68 # imul # ((3 + 4) * 5)
10 02 # bipush 2 # 2
6C # idiv # (((3 + 4) * 5) / 2)
B0 # return #

00 00 # native count
# native pool


We will explain various parts of this file later on. It consists of a sequence of bytes, each represented by two hexadecimal
digits. In order to make the bytecode readable, it also includes comments.
Each comment starts with # and extends to the end of the line. Comments
are completely ignored by the virtual machine and are there only for you
to read.
We focus on the section starting with #<main>. The first three lines

#<main>
00 00 # number of arguments = 0
00 00 # number of local variables = 0
00 0C # code length = 12 bytes

tell the virtual machine that the function main takes no arguments, uses no
local variables, and its code has a total length of 12 bytes (0x0C in hex). The
next few lines embody exactly the code we wrote by hand. The comments
first show the virtual machine instruction and then the expression in the
source code that was translated to the corresponding byte code.

10 03 # bipush 3 # 3
10 04 # bipush 4 # 4
60 # iadd # (3 + 4)
10 05 # bipush 5 # 5
68 # imul # ((3 + 4) * 5)
10 02 # bipush 2 # 2
6C # idiv # (((3 + 4) * 5) / 2)
B0 # return #

The return instruction at the end means that the function returns the value
that is currently the only one on the stack. When this function is exe-
cuted, this will be the value of the expression shown on the previous line,
(((3 + 4) * 5) / 2).
As we proceed through increasingly complex language constructs, you
should experiment yourself, writing C0 programs, compiling them to byte
code, and testing your understanding by checking that it is as expected (or
at least correct).

4 Local Variables
So far, the only part of the runtime system that we needed was the local
operand stack. Next, we add the ability to handle function arguments and
local variables to the machine. For that purpose, a function has an array
V containing local variables. We can push the value of a local variable onto
the operand stack with the vload instruction, and we can pop the value
from the top of the stack and store it in a local variable with the vstore
instruction. Initially, when a function is called, its arguments x0, ..., x(n-1) are stored as local variables V[0], ..., V[n-1].
Assume we want to implement the function mid.
int mid(int lower, int upper) {
  int mid = lower + (upper - lower)/2;
  return mid;
}
Here is a summary of the instructions we need
0x15 vload <i> S -> S,v (v = V[i])
0x36 vstore <i> S,v -> S (V[i] = v)
0x64 isub S,x,y -> S,x-y
0xB0 return .,v -> .
Notice that for return, there must be exactly one element on the stack. Us-
ing these instructions, we obtain the following code for our little function.
We indicate the operand stack on the right, using symbolic expressions to
denote the corresponding runtime values. The operand stack is not part of
the code; we just write it out as an aid to reading the program.
#<mid>
00 02 # number of arguments = 2
00 03 # number of local variables = 3
00 10 # code length = 16 bytes
15 00 # vload 0 # lower
15 01 # vload 1 # lower, upper
15 00 # vload 0 # lower, uppper, lower
64 # isub # lower, (upper - lower)
10 02 # bipush 2 # lower, (upper - lower), 2
6C # idiv # lower, ((upper - lower) / 2)
60 # iadd # (lower + ((upper - lower) / 2))
36 02 # vstore 2 # mid = (lower + ((upper - lower) / 2));
15 02 # vload 2 # mid
B0 # return #
We can optimize this piece of code, simply removing the last vstore 2 and
vload 2, but we translated the original literally to clarify the relationship
between the function and its translation.
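For reference, the optimized body would then be just the following (the code length field would shrink to 12 bytes):

15 00 # vload 0  # lower
15 01 # vload 1  # lower, upper
15 00 # vload 0  # lower, upper, lower
64    # isub     # lower, (upper - lower)
10 02 # bipush 2 # lower, (upper - lower), 2
6C    # idiv     # lower, ((upper - lower) / 2)
60    # iadd     # (lower + ((upper - lower) / 2))
B0    # return   #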


5 Constants
So far, the bipush <b> instruction is the only way to introduce a constant
into the computation. Here, b is a signed byte, so that its possible values are
-128 <= b < 128. What if the computation requires a larger constant?
The solution for the C0VM and similar machines is not to include the
constant directly as arguments to instructions, but store them separately
in the byte code file, giving each of them an index that can be referenced
from instructions. Each segment of the byte code file is called a pool. For
example, we have a pool of integer constants. The instruction to refer to an
integer is ildc (integer load constant).

0x13 ildc <c1,c2> S -> S, x:w32 (x = int_pool[(c1<<8)|c2])

The index into the constant pool is a 16-bit unsigned quantity, given in two
bytes with the most significant byte first. This means we can have at most
2^16 - 1 = 65,535 different constants in a byte code file.
As an example, consider a function that is part of a linear congruen-
tial pseudorandom number generator. It generates the next pseudorandom
number in a sequence from the previous number.

int next_rand(int last) {
  return last * 1664525 + 1013904223;
}

int main() {
  return next_rand(0xdeadbeef);
}

There are three constants in this file that require more than one byte to
represent: 1664525, 1013904223, and 0xdeadbeef. Each of them is assigned
an index in the integer pool. The constants are then pushed onto the stack
with the ildc instruction.

C0 C0 FF EE # magic number
00 09 # version 4, arch = 1 (64 bits)

00 03 # int pool count


# int pool
00 19 66 0D
3C 6E F3 5F
DE AD BE EF


00 00 # string pool total size


# string pool

00 02 # function count
# function_pool

#<main>
00 00 # number of arguments = 0
00 01 # number of local variables = 1
00 07 # code length = 7 bytes
13 00 02 # ildc 2 # c[2] = -559038737
B8 00 01 # invokestatic 1 # next_rand(-559038737)
B0 # return #

#<next_rand>
00 01 # number of arguments = 1
00 01 # number of local variables = 1
00 0B # code length = 11 bytes
15 00 # vload 0 # last
13 00 00 # ildc 0 # c[0] = 1664525
68 # imul # (last * 1664525)
13 00 01 # ildc 1 # c[1] = 1013904223
60 # iadd # ((last * 1664525) + 1013904223)
B0 # return #

00 00 # native count
# native pool

The comments denote the ith integer in the constant pool by c[i].
There are other pools in this file. The string pool contains string con-
stants. The function pool contains the information on each of the functions,
as explained in the next section. The native pool contains references to “na-
tive” functions, that is, library functions not defined in this file.


6 Function Calls
As already explained, the function pool contains the information on each
function which is the number of arguments, the number of local variables,
the code length, and then the byte code for the function itself. Each function
is assigned a 16-bit unsigned index into this pool. The main function always
has index 0. We call a function with the invokestatic instruction.
0xB8 invokestatic <c1,c2> S, v1, v2, ..., vn -> S, v
We find the function g at function_pool[c1<<8|c2], which must take n
arguments. After g(v1, ..., vn) returns, its value will be on the stack instead
of the arguments.
Execution of the function will start with the first instruction and ter-
minate with a return (which does not need to be the last byte code in the
function). So the description of functions themselves is not particularly
tricky, but the implementation of function calls is.
Let’s collect the kind of information we already know about the runtime
system of the virtual machine. We have a number of pools which come from
the byte code file. These pools are constant in that they never change when
the program executes.
Then we have the operand stack which expands and shrinks within each
function’s operation, and the local variable array which holds function argu-
ments and the local variables needed to execute the function body.
In order to correctly implement function calls and returns we need one
further runtime structure, the call stack. The call stack is a stack of so-called
frames. We now analyze what the role of the frames is and what they need
to contain.
Consider the situation where a function f is executing and calls a func-
tion g with n arguments. At this point, we assume that f has pushed the
arguments onto the operand stack. Now we need to take the following steps:
1. Create a new local variable array Vg for the function g.

2. Pop the arguments from f ’s operand stack Sf and store them in g’s
local variable array Vg [0..n).

3. Push a frame containing Vf, Sf, and the next program counter pc_f on the call stack.

4. Create a new (empty) operand stack Sg for g.

5. Start executing the code for g.


When the called function g returns, its return value is the only value on its
operand stack Sg . We need to do the following

1. Pop the last frame from the call stack. This frame holds Vf, Sf, and pc_f (the return address).

2. Take the return value from Sg and push it onto Sf .

3. Restore the local variable array Vf .

4. Deallocate any data structures of g (such as Vg and Sg) that are no longer required.

5. Continue with the execution of f at pc_f.

Concretely, we suggest that a frame from the call stack contain the fol-
lowing information:

1. An array of local variables V .

2. The operand stack S.

3. A pointer to the function body.

4. The return address which specifies where to continue execution.
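In C, such a frame might be represented roughly as follows. This is a sketch: the names and the simplification that every stack slot holds a 32-bit value are our assumptions, not the actual types of the starter code.

#include <stddef.h>
#include <stdint.h>

typedef int32_t c0_value;  /* simplified: every slot holds 32 bits (assumption) */

typedef struct frame {
  c0_value *V;             /* local variable array V[0..num_vars) */
  c0_value *S;             /* the caller's suspended operand stack */
  size_t num_operands;     /* current height of S */
  uint8_t *code;           /* body of the suspended (calling) function */
  size_t pc;               /* return address: index of the next instruction */
} frame;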

We recommend that you simulate the behavior of the machine on a simple function call sequence to make sure you understand the role of the call stack.

7 Conditionals
The C0VM does not have if-then-else or conditional expressions. Like ma-
chine code and other virtual machines, it has conditional branches that jump
to another location in the code if a condition is satisfied and otherwise con-
tinue with the next instruction in sequence.

0x9F if_cmpeq <o1,o2> S, v1, v2 -> S (pc = pc+(o1<<8|o2) if v1 == v2)


0xA0 if_cmpne <o1,o2> S, v1, v2 -> S (pc = pc+(o1<<8|o2) if v1 != v2)
0xA1 if_icmplt <o1,o2> S, x:w32, y:w32 -> S (pc = pc+(o1<<8|o2) if x < y)
0xA2 if_icmpge <o1,o2> S, x:w32, y:w32 -> S (pc = pc+(o1<<8|o2) if x >= y)
0xA3 if_icmpgt <o1,o2> S, x:w32, y:w32 -> S (pc = pc+(o1<<8|o2) if x > y)
0xA4 if_icmple <o1,o2> S, x:w32, y:w32 -> S (pc = pc+(o1<<8|o2) if x <= y)
0xA7 goto <o1,o2> S -> S (pc = pc+(o1<<8|o2))


As part of the test, the arguments are popped from the operand stack. Each
of the branching instructions takes two bytes as arguments, which describe
a signed 16-bit offset. If that is positive we jump forward, if it is negative we
jump backward in the program.
As an example, we compile the following loop, adding up odd numbers
to obtain perfect squares.

int main () {
  int sum = 0;
  for (int i = 1; i < 100; i += 2)
    //@loop_invariant 0 <= i && i <= 100;
    sum += i;
  return sum;
}

The compiler currently produces somewhat idiosyncratic code, so what we show below has been edited to make the correspondence to the source code
more immediate.

#<main>
00 00 # number of arguments = 0
00 02 # number of local variables = 2
00 23 # code length = 35 bytes
10 00 # bipush 0 # 0
36 00 # vstore 0 # sum = 0;
10 01 # bipush 1 # 1
36 01 # vstore 1 # i = 1;
# <00:loop>
15 01 # vload 1 # i
10 64 # bipush 100 # 100
A2 00 14 # if_icmpge 20 # if (i >= 100) goto <01:endloop>
15 00 # vload 0 # sum
15 01 # vload 1 # i
60 # iadd #
36 00 # vstore 0 # sum += i;
15 01 # vload 1 # i
10 02 # bipush 2 # 2
60 # iadd #
36 01 # vstore 1 # i += 2;
A7 FF EB # goto -21 # goto <00:loop>
# <01:endloop>
15 00 # vload 0 # sum
B0 # return #

The compiler has embedded symbolic labels in this code, like <00:loop>
and <01:endloop> which are the targets of jumps or conditional branches.
In the actual byte code, they are turned into relative offsets. For example,
if we count forward 20 bytes, starting from A2 (the byte code of if_icmpge,
the negation of the test i < 100 in the source) we land at <01:endloop>
which labels the vload 0 instruction just before the return. Similarly, if we
count backwards 21 bytes from A7 (which is a goto), we land at <00:loop>
which starts with vload 1.
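The relative offsets themselves are just the two operand bytes read as a signed 16-bit number. The following small C check (our own illustration, not course code) decodes the FF EB from the goto above:

#include <stdio.h>
#include <stdint.h>

int main(void) {
  uint8_t o1 = 0xFF, o2 = 0xEB;               /* operand bytes of A7 FF EB */
  int16_t offset = (int16_t)((o1 << 8) | o2); /* signed 16-bit pc offset */
  printf("%d\n", offset);                     /* prints -21 */
  return 0;
}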

8 The Heap
In C0, structs and arrays can only be allocated on the system heap. The
virtual machine must therefore also provide a heap in its runtime system.
If you implement this in C, the simplest way to do this is to use the runtime
heap of the C language to implement the heap of the C0VM byte code that
you are interpreting. One can use a garbage collector for C such as libgc
in order to manage this memory. We can also sidestep this difficulty by
assuming that the C0 code we interpret does not run out of memory.
We have two instructions to allocate memory.

0xBB new <s> S -> S, a:* (*a is now allocated, size <s>)
0xBC newarray <s> S, n:w32 -> S, a:* (a[0..n) now allocated)

The new instruction takes a size s as an argument, which is the size (in
bytes) of the memory to be allocated. The call returns the address of the
allocated memory. It can also fail with an exception, in case there is insuffi-
cient memory available, but it will never return NULL. newarray also takes
the number n of elements from the operand stack, so that the total size of
allocated space is n * s bytes.
For a pointer to a struct, we can compute the address of a field by using
the aaddf instruction. It takes an unsigned byte offset f as an argument,
pops the address a from the stack, adds the offset, and pushes the resulting
address a + f back onto the stack. If a is null, an error is signaled, because
the address computation would be invalid.

0x62 aaddf <f> S, a:* -> S, (a+f):* (a != NULL; f field offset)


To access memory at an address we have computed, we have the mload and mstore instructions. They vary, depending on the size of data that are
loaded from or stored to memory.
0x2E imload S, a:* -> S, x:w32 (x = *a, a != NULL, load 4 bytes)
0x2F amload S, a:* -> S, b:* (b = *a, a != NULL, load address)
0x4E imstore S, a:*, x:w32 -> S (*a = x, a != NULL, store 4 bytes)
0x4F amstore S, a:*, b:* -> S (*a = b, a != NULL, store address)
They all consume an address from the operand stack. imload reads a 4-
byte value from the given memory address and pushes it on the operand
stack. imstore pops a 4-byte value from the operand stack and stores it at
the given address. The amload and amstore versions load and store an address, respectively. There are also cmload and cmstore, explained in the
next section for single-byte loads and stores.
As an example, consider the following struct declaration and function.
struct point {
  int x;
  int y;
};
typedef struct point* point;

point reflect(point p) {
  point q = alloc(struct point);
  q->x = p->y;
  q->y = p->x;
  return q;
}
The reflect function is compiled to the following code. When reading this
code, recall that q->x, for example, stands for (*q).x. In the comments, the
compiler writes the address of the x field in the struct pointed to by q as
&(*(q)).x, in analogy with C’s address-of operator &.
#<reflect>
00 01 # number of arguments = 1
00 02 # number of local variables = 2
00 1B # code length = 27 bytes
BB 08 # new 8 # alloc(struct point)
36 01 # vstore 1 # q = alloc(struct point);
15 01 # vload 1 # q
62 00 # aaddf 0 # &q->x
15 00 # vload 0 # p
62 04 # aaddf 4 # &p->y
2E # imload # p->y
4E # imstore # q->x = p->y;
15 01 # vload 1 # q
62 04 # aaddf 4 # &q->y
15 00 # vload 0 # p
62 00 # aaddf 0 # &p->x
2E # imload # p->x
4E # imstore # q->y = p->x;
15 01 # vload 1 # q
B0 # return #

We see that in this example, the size of a struct point is 8 bytes, 4 each for
the x and y fields. You should scrutinize this code carefully to make sure
you understand how structs work.
Array accesses are similar, except that the address computation takes
an index i from the stack. The size of the array elements is stored in the
runtime structure, so it is not passed as an explicit argument. Instead, the
byte code interpreter must retrieve the size from memory. The following is
our sample program.

int main() {
  int[] A = alloc_array(int, 100);
  for (int i = 0; i < 100; i++)
    A[i] = i;
  return A[99];
}

Showing only the loop, we have the code below (again slightly edited).
Notice the use of aadds to consume A and i from the stack, pushing &A[i]
onto the stack.

# <00:loop>
15 01 # vload 1 # i
10 64 # bipush 100 # 100
A2 00 15 # if_icmpge 21 # if (i >= 100) goto <01:endloop>
15 00 # vload 0 # A
15 01 # vload 1 # i
63 # aadds # &A[i]
15 01 # vload 1 # i
4E # imstore # A[i] = i;
15 01 # vload 1 # i
10 01 # bipush 1 # 1
60 # iadd #
36 01 # vstore 1 # i += 1;
A7 FF EA # goto -22 # goto <00:loop>
# <01:endloop>

There is a further subtlety regarding booleans and characters stored in memory, as explained in the next section.

9 Characters and Strings


Characters in C0 are ASCII characters in the range 0 <= c < 128.
Strings are sequences of non-NUL characters. While C0 does not pre-
scribe the representation, we follow the convention of C to represent them
as an array of characters, terminated by ’\0’ (NUL). Arrays (and therefore
strings) are manipulated via their addresses, and therefore add to the types
we denote by a:*.
But what about constant strings appearing in the program? For them,
we introduce the string pool as another section of the byte code file. This
pool consists of a sequence of strings, each of them terminated by ’\0’,
represented as the byte 0x00. Consider the program

#use <string>
#use <conio>

int main () {
  string h = "Hello ";
  string hw = string_join(h, "World!\n");
  print(hw);
  return string_length(hw);
}

There are two string constants, "Hello " and "World!\n". In the byte code
file below they are stored in the string pool at index positions 0 and 7.

C0 C0 FF EE # magic number
00 05 # version 2, arch = 1 (64 bits)


00 00 # int pool count


# int pool

00 0F # string pool total size


# string pool
48 65 6C 6C 6F 20 00 # "Hello "
57 6F 72 6C 64 21 0A 00 # "World!\n"

In the byte code program, we access these strings by pushing their address
onto the stack using the aldc instruction.

0x14 aldc <c1,c2> S -> S, a:* (a = &string_pool[(c1<<8)|c2])

We can see its use in the byte code for the main function.

#<main>
00 00 # number of arguments = 0
00 02 # number of local variables = 2
00 1B # code length = 27 bytes
14 00 00 # aldc 0 # s[0] = "Hello "
36 00 # vstore 0 # h = "Hello ";
15 00 # vload 0 # h
14 00 07 # aldc 7 # s[7] = "World!\n"
B7 00 00 # invokenative 0 # string_join(h, "World!\n")
36 01 # vstore 1 # hw = string_join(h, "World!\n");
15 01 # vload 1 # hw
B7 00 01 # invokenative 1 # print(hw)
57 # pop # (ignore result)
15 01 # vload 1 # hw
B7 00 02 # invokenative 2 # string_length(hw)
B0 # return #

Another noteworthy aspect of the code is the use of native functions with
index 0, 1, and 2. For each of these, the native pool contains the number of
arguments and an internal index.

00 03 # native count
# native pool
00 02 00 4E # string_join
00 01 00 06 # print
00 01 00 4F # string_length


There is a further subtle point regarding the memory load and store
instructions and their interaction with strings. As we can see from the
string pool representation, a character takes only one byte of memory. The
operand stack and local variable array maintain all primitive types as 4-
byte quantities. We need to mediate this difference when loading or storing
characters. Booleans similarly take only one byte, where 0 stands for false
and 1 for true. For this purpose, the C0VM has variants of the mload and
mstore instructions that load and store only a single byte.

0x34 cmload S, a:* -> S, x:w32 (x = (w32)(*a), a != NULL, load 1 byte)


0x55 cmstore S, a:*, x:w32 -> S (*a = x & 0x7f, a != NULL, store 1 byte)

As part of the load operation we have to convert the byte to a four-byte


quantity to be pushed onto the stack; when writing we have to mask out
the upper bits. Because characters c in C0 are in the range 0 <= c < 128 and
booleans are represented by just 0 (for false) and 1 (for true), we exploit
and enforce that all bytes represent 7-bit unsigned quantities.

10 Byte Code Verification


So far, we have not discussed any invariants to be satisfied by the informa-
tion stored in the byte code file. What are the invariants for code, encoded
as data? How do we establish them?
We can try to derive this from the program that interprets the bytecode.
First, we would like to check that there is valid instruction at every address
we can reach when the program is executed. This is slightly complicated by
forward and backward conditional branches and jumps, but overall not too
difficult to check. We also want to check that all local variables used are less
that num_vars, so that references V [i] will always be in bounds. Further, we
check that when a function returns, there is exactly one value on the stack.
This more difficult to check, again due to conditional branches and jumps,
because the stack grows and shrinks. As part of this we should also verify
that at any given instruction there are enough items on the stack to execute
the instruction, for example, at least two for iadd.
These and a few other checks are performed by byte code verification of
the Java Virtual Machine (JVM). The most important one we omitted here
is type checking. It is not relevant for the C0VM because we simplified the
file format by eliminating type information. After byte code verification, a
number of runtime checks can be avoided because we have verified statically that they cannot occur. Realistic byte code verification is far from
trivial, but we see here that it just establishes a data structure invariant for
the byte code interpreter.
It is important to recognize that there are limits to what can be done
with bytecode verification before the code is executed. For example, we
can not check in general if division might try to divide by 0, or if the pro-
gram will terminate. There is a lot of research in the area of programming
languages concerned with pushing the boundaries of static verification, in-
cluding here at Carnegie Mellon University. Perhaps future instances of
this course will benefit from this research by checking your C0 program in-
variants, at least to some extent, and pointing out bugs before you ever run
your program just like the parser and type checker do.

11 Implementing the C0VM


For some information, tips, and hints for implementing the C0VM in C we
refer the reader to the Assignment 8 writeup and starter code.


12 C0VM Instruction Reference


S = operand stack
V = local variable array, V[0..num_vars)

Instruction operands:
<i> = local variable index (unsigned)
<b> = byte (signed)
<s> = element size in bytes (unsigned)
<f> = field offset in struct in bytes (unsigned)
<c> = <c1,c2> = pool index = (c1<<8|c2) (unsigned)
<o> = <o1,o2> = pc offset = (o1<<8|o2) (signed)

Stack operands:
a : * = address ("reference")
x, i, n : w32 = 32 bit word representing an int, bool, or char ("primitive")
v = arbitrary value (v:* or v:w32)

Stack operations

0x59 dup S, v -> S, v, v


0x57 pop S, v -> S
0x5F swap S, v1, v2 -> S, v2, v1

Arithmetic

0x60 iadd S, x:w32, y:w32 -> S, x+y:w32


0x7E iand S, x:w32, y:w32 -> S, x&y:w32
0x6C idiv S, x:w32, y:w32 -> S, x/y:w32
0x68 imul S, x:w32, y:w32 -> S, x*y:w32
0x80 ior S, x:w32, y:w32 -> S, x|y:w32
0x70 irem S, x:w32, y:w32 -> S, x%y:w32
0x78 ishl S, x:w32, y:w32 -> S, x<<y:w32
0x7A ishr S, x:w32, y:w32 -> S, x>>y:w32
0x64 isub S, x:w32, y:w32 -> S, x-y:w32
0x82 ixor S, x:w32, y:w32 -> S, x^y:w32

Local Variables

0x15 vload <i> S -> S, v v = V[i]


0x36 vstore <i> S, v -> S V[i] = v


Constants

0x01 aconst_null S -> S, null:*


0x10 bipush <b> S -> S, x:w32 (x = (w32)b, sign extended)
0x13 ildc <c1,c2> S -> S, x:w32 (x = int_pool[(c1<<8)|c2])
0x14 aldc <c1,c2> S -> S, a:* (a = &string_pool[(c1<<8)|c2])

Control Flow

0x00 nop S -> S


0x9F if_cmpeq <o1,o2> S, v1, v2 -> S (pc = pc+(o1<<8|o2) if v1 == v2)
0xA0 if_cmpne <o1,o2> S, v1, v2 -> S (pc = pc+(o1<<8|o2) if v1 != v2)
0xA1 if_icmplt <o1,o2> S, x:w32, y:w32 -> S (pc = pc+(o1<<8|o2) if x < y)
0xA2 if_icmpge <o1,o2> S, x:w32, y:w32 -> S (pc = pc+(o1<<8|o2) if x >= y)
0xA3 if_icmpgt <o1,o2> S, x:w32, y:w32 -> S (pc = pc+(o1<<8|o2) if x > y)
0xA4 if_icmple <o1,o2> S, x:w32, y:w32 -> S (pc = pc+(o1<<8|o2) if x <= y)
0xA7 goto <o1,o2> S -> S (pc = pc+(o1<<8|o2))
0xBF athrow S, a:* -> S (c0_user_error(a))
0xCF assert S, x:w32, a:* -> S (c0_assertion_failure(a) if x == 0)

Functions

0xB8 invokestatic <c1,c2> S, v1, v2, ..., vn -> S, v


(function_pool[c1<<8|c2] => g, g(v1,...,vn) = v)
0xB0 return ., v -> . (return v to caller)
0xB7 invokenative <c1,c2> S, v1, v2, ..., vn -> S, v
(native_pool[c1<<8|c2] => g, g(v1,...,vn) = v)

Memory

0xBB new <s> S -> S, a:* (*a is now allocated, size <s>)
0xBC newarray <s> S, n:w32 -> S, a:* (a[0..n) now allocated)
0xBE arraylength S, a:* -> S, n:w32 (n = \length(a))

0x62 aaddf <f> S, a:* -> S, (a+f):* (a != NULL; f field offset)


0x63 aadds S, a:*, i:w32 -> S, (a+s*i):*
(a != NULL, 0 <= i < \length(a))

0x2E imload S, a:* -> S, x:w32 (x = *a, a != NULL, load 4 bytes)


0x2F amload S, a:* -> S, b:* (b = *a, a != NULL, load address)
0x4E imstore S, a:*, x:w32 -> S (*a = x, a != NULL, store 4 bytes)
0x4F amstore S, a:*, b:* -> S (*a = b, a != NULL, store address)

0x34 cmload S, a:* -> S, x:w32 (x = (w32)(*a), a != NULL, load 1 byte)


0x55 cmstore S, a:*, x:w32 -> S (*a = x & 0x7f, a != NULL, store 1 byte)


13 C0VM File Format Reference


C0VM Byte Code File Reference, Version 4, Spring 2013

u4 - 4 byte unsigned integer


u2 - 2 byte unsigned integer
u1 - 1 byte unsigned integer
i4 - 4 byte signed (two’s complement) integer
fi - struct function_info, defined below
ni - struct native_info, defined below

The size of some arrays is variable, depending on earlier fields.
These are only arrays conceptually; in the file, all the information
is just stored as sequences of bytes, in hexadecimal notation,
separated by whitespace. We present the file format in a
pseudostruct notation.

struct bc0_file {
u4 magic; # magic number, always 0xc0c0ffee
u2 version+arch; # version number (now 4) and architecture
u2 int_count; # number of integer constants
i4 int_pool[int_count]; # integer constants
u2 string_count; # number of characters in string pool
u1 string_pool[string_count]; # adjacent ’\0’-terminated strings
u2 function_count; # number of functions
fi function_pool[function_count]; # function info
u2 native_count; # number of native (library) functions
ni native_pool[native_count]; # native function info
};

struct function_info {
u2 num_args; # number of arguments, V[0..num_args)
u2 num_vars; # number of variables, V[0..num_vars)
u2 code_length; # number of bytes of bytecode
u1 code[code_length]; # bytecode
};

struct native_info {
u2 num_args; # number of arguments, V[0..num_args)
u2 function_table_index; # index into table of library functions
};



Lecture Notes on
Search in Graphs
(Partial Draft)
15-122: Principles of Imperative Computation
Frank Pfenning, André Platzer, Rob Simmons

Lecture 24
April 23, 2013

1 Introduction
In this lecture we introduce graphs. Graphs provide a uniform model for
many structures, for example, maps with distances or Facebook relation-
ships. Algorithms on graphs are therefore important to many applications.
They will be a central subject in the algorithms courses later in the curricu-
lum; here we only provide a very small sample of graph algorithms.

2 Paths in Graphs
We start with undirected graphs which consist of a set V of vertices (also
called nodes) and a set E of edges, each connecting two different vertices.
In particular, these graphs have no edges from a node back to itself. A
graph is connected if we can reach any vertex from any other vertex by
following edges in either direction. In a directed graph edges provide a con-
nection from one node to another, but not necessarily in the opposite direc-
tion. More mathematically, we say that the edge relation between vertices is
symmetric for undirected graphs. In this lecture we only discuss undirected
graphs, although directed graphs also play an important role in many ap-
plications.
The following is a simple example of a connected, undirected graph
with 5 vertices (A, B, C, D, E) and 6 edges (AB, BC, CD, AE, BE, CE).

[Graph diagram: vertices A, B, C, D, E with edges AB, BC, CD, AE, BE, CE]


A path in a graph is a sequence of vertices where each vertex is connected


to the next by an edge. That is, a path is a sequence

v0, v1, v2, v3, ..., vl

of some length l >= 0 such that there is an edge from vi to vi+1 in the graph
for each i < l. For example, all of the following are paths in the graph
above:
A B E C D
A B A
E C D C B
B
The last one is a special case: The length of a path is given by the number of
edges in it, so a node by itself is a path of length 0 (without following any
edges). Paths always have a starting vertex and an ending vertex, which
coincide in a path of length 0. We also say that a path connects its end-
points.
The graph reachability problem is to determine if there is a path connecting
two given vertices in a graph. If we know the graph is connected, this
problem is easy since one can reach any node from any other node. But
we might refine our specification to request that the algorithm return not
just a boolean value (reachable or not), but an actual path. At that point
the problem is somewhat interesting even for connected graphs. Using our
earlier terminology, a path from vertex v to vertex w is a certificate or explicit
evidence for the fact that vertex w is reachable from another vertex v. It is
easy to check whether the certificate is valid, since it is easy to check if each
node in the path is connected to the next one by an edge. It is more difficult
to produce such a certificate.


For example, the path


A B E C D
is a certificate for the fact that vertex D is reachable from vertex A in the
above graph. It is easy to check this certificate by following along the path
and checking whether the indicated edges are in the graph.

3 Implicit Graphs
There are many, many different ways to represent graphs. In some appli-
cations they are never explicitly constructed but remain implicit in the way
the problem was solved. One such example was peg solitaire. The vertices of
the graph implicit in this problem are board positions. There is an edge from
A to B if we can make a move in position A to reach position B. Note that
this implicit graph is actually a directed graph since the game does not allow
us to undo a move we just made. The classical reachability question here
would be if from some initial position we can reach another given final po-
sition. We actually solved a related question, namely if we can reach any of
a number of alternative positions (those with exactly one peg) from a given
initial position. We win the game if we can reach any of those positions
with a single peg.
The reason why we did not explicitly construct the full graph is that
for standard boards it is unreasonably large – there are too many reachable
positions. Instead, we incrementally construct it as we search for a solu-
tion in the hope we can find a solution without ever generating all nodes.
In some examples (like the standard English board), this hope was justi-
fied if we were lucky enough to pick a good move strategy. To make sure
that unsolvable boards had no solution, however, we still had to visit ev-
ery reachable position. Just because we have 3 pegs remaining with one
attempt of trying to solve the board does not mean we could not have been
more successful if we had moved the pegs around in a different way.

4 Explicit Graphs & Graph Interface


Sometimes, we do want to represent a graph as a specific set of edges and
vertices, however. In the C code, we’ll refer to our vertices with unsigned
integers. A minimal interface for graphs would allow us to create and free
graphs, check whether an edge exists in the graph, and add a new edge to
the graph.


typedef unsigned int vertex;
typedef struct graph_header* graph;

graph graph_new(unsigned int numvert);
void graph_free(graph G);
unsigned int graph_size(graph G);
// number of vertices in the graph
bool graph_hasedge(graph G, vertex v, vertex w);
//@requires v < graph_size(G) && w < graph_size(G);
void graph_addedge(graph G, vertex v, vertex w);
//@requires v < graph_size(G) && w < graph_size(G);
//@requires !graph_hasedge(G, v, w);

We use the C0 notation for contracts on the interface functions here. Even
though C compilers do not recognize the //@requires contract and will
simply discard it as a comment, the contract still serves an important role
for the programmer reading the program. For the graph interface, we de-
cide that it does not make sense to add an edge into a graph when that edge
is already there, hence the second requires.
With this minimal interface, we can create a graph for our running ex-
ample (letting A = 0, B = 1, and so on).

graph G = graph_new(5);
graph_addedge(G, 0, 1);
graph_addedge(G, 1, 2);
graph_addedge(G, 2, 3);
graph_addedge(G, 0, 4);
graph_addedge(G, 1, 4);
graph_addedge(G, 2, 4);

5 Adjacency Matrices
There are two simple ways to implement the graph interface. One way is
to represent the graph as a two-dimensional array that represents its edge
relation. We can check if there is an edge from B (= 1) to D (= 3) by looking
for a checkmark in row 1, column 3. In an undirected graph, the top-right half of this two-dimensional array will be a mirror image of the bottom-left, because the edge relation is symmetric.

     A   B   C   D   E
A        ✔           ✔
B    ✔       ✔       ✔
C        ✔       ✔   ✔
D            ✔
E    ✔   ✔   ✔

This representation of a graph is called an adjacency matrix, because it is a matrix that stores which nodes are neighbors.
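To make this concrete, here is a minimal sketch (not the posted code) of the two edge operations for an adjacency matrix. The header struct with a flattened, row-major bool array is an assumption for illustration; REQUIRES is the contract macro used in the C code later in these notes.

// A possible header for the matrix representation (an assumption
// for illustration; the notes below use a different header for
// adjacency lists).
struct graph_header {
  unsigned int size;
  bool *matrix;   // size * size entries, row-major
};

bool graph_hasedge(graph G, vertex v, vertex w) {
  REQUIRES(G != NULL && v < G->size && w < G->size);
  // constant-time lookup of the (v, w) entry
  return G->matrix[v * G->size + w];
}

void graph_addedge(graph G, vertex v, vertex w) {
  REQUIRES(G != NULL && v < G->size && w < G->size);
  REQUIRES(!graph_hasedge(G, v, w));
  // undirected graph: record the edge in both directions
  G->matrix[v * G->size + w] = true;
  G->matrix[w * G->size + v] = true;
}

With this representation both operations take O(1) time, at the cost of O(n²) space regardless of how many edges are present.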

6 Adjacency Lists
The other classic representation of a graph is as an adjacency list. In an
adjacency list representation, we have a one-dimensional array that looks
much like a hash table. Each vertex has a spot in the array, and each spot
in the array contains a linked list of all the other vertices connected to that
vertex. Our running example would look like this as an adjacency list:

A" B" E"


B" A" C" E"
C" B" D" E"
D" C"
E" A" B" C"

Adjacency lists and adjacency matrices have different tradeoffs in the time and space it takes to perform operations. In a sparse graph, where there are lots of vertices and not a lot of edges, it usually makes more sense to use an adjacency list. In a dense graph, where there are lots of edges, it may be more efficient to use an adjacency matrix.
If we do use an adjacency list representation, it will often make sense to
extend the interface to graphs to give the client access to this linked list of
adjacent edges.

typedef struct adjlist_node adjlist;
struct adjlist_node {
  vertex vert;
  adjlist *next;
};

/* Returns a linked list of the neighbors of vertex v.
 * This adjacency list is owned by the graph and should
 * not be modified by the user.
 * @requires(v < graph_size(G)) */
adjlist *graph_connections(graph G, vertex v);

One way to implement the adjacency list version of graphs is as a pointer to a special kind of C struct, a struct with a flexible array member, i.e., a struct whose last field is an array whose length is only specified at runtime, not at the time of declaring the type of the struct. Here the struct has two fields: the first is an unsigned integer representing the actual size, and the second is an array of adjacency lists.

struct graph_header {
unsigned int size;
adjlist *adj[]; // Flexible array member!
};

The array adj of adjacency lists will be contiguous in memory with the size field – this is quite different from, say, a hashtable chain, which is a linked list with data fields of type elem and next pointers. We allocate this header using xcalloc to make sure that each adjacency list is initialized to NULL, i.e., the empty list. Behind the scenes xcalloc just multiplies its two arguments; because we are allocating a struct with a flexible array member, we pass in 1 for the first argument and explicitly figure out the desired size of the array for the second argument:

graph graph_new(unsigned int size) {
  size_t adj_size = sizeof(adjlist*) * size;
  graph G = xcalloc(1, sizeof(struct graph_header) + adj_size);
  G->size = size;
  ENSURES(is_graph(G));
  return G;
}
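For concreteness, here is a rough sketch (not the posted implementation) of the two edge operations in this representation; it assumes xmalloc, which, like xcalloc, aborts rather than returning NULL.

bool graph_hasedge(graph G, vertex v, vertex w) {
  REQUIRES(G != NULL && v < G->size && w < G->size);
  // scan v's adjacency list for w
  for (adjlist *L = G->adj[v]; L != NULL; L = L->next) {
    if (L->vert == w) return true;
  }
  return false;
}

void graph_addedge(graph G, vertex v, vertex w) {
  REQUIRES(G != NULL && v < G->size && w < G->size);
  REQUIRES(!graph_hasedge(G, v, w));
  // undirected graph: prepend w to v's list and v to w's list
  adjlist *L = xmalloc(sizeof(adjlist));
  L->vert = w;  L->next = G->adj[v];  G->adj[v] = L;
  L = xmalloc(sizeof(adjlist));
  L->vert = v;  L->next = G->adj[w];  G->adj[w] = L;
}

Note that graph_hasedge now takes time proportional to the degree of v, in exchange for space proportional to the number of vertices plus edges.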


7 Depth-First Search
The first algorithm we consider for determining if one vertex is reachable
from another is called depth-first search.
Let’s try to work our way up to this algorithm. Assume we are trying to
find a path from u to w. We start at u. If it is equal to w we are done, because
w is reachable by a path of length 0. If not we pick an arbitrary edge leaving
u to get us to some node v. Now we have “reduced” the original problem
to the one of finding a path from v to w.
The problem here is of course that we may never arrive at w even if there is a path. For example, say we want to find a path from A to D in our earlier example graph.

[Figure: the example graph with vertices A, B, C, D, E and edges AB, BC, CD, AE, BE, CE]

We can go A → B → E → A → B → E → ··· without ever reaching D (or we can go just A → B → A → B → ···), even though there exists a path.
We need to avoid repeating nodes in the path we are exploring. A cy-
cle is a path of length 1 or greater that has the same starting and ending
point. So another way to say we need to avoid repeating nodes is to say
that we need to avoid cycles in the path. We accomplish this by marking the
nodes we have already visited so when we visit them again we know not
to consider them again.
Let’s go back to the earlier example and play through this idea while
trying to find a path from A to D. We start by marking A (indicated by hol-
lowing the circle) and go to B. We indicate the path we have been following

L ECTURE N OTES A PRIL 23, 2013


Search in Graphs
(Partial Draft) L24.8

by drawing a double-line along the edges contained in it.

A" D" A" D" A" D" A" D"


E" E" E" E"

B" C" B" C" B" C" B" C"

When we are at B we mark B and have three choices for the next step.

1. We could go back to A, but A is already marked and therefore ruled out.

2. We could go to E.

3. We could go to C.

Say we pick E. At this point we again have three choices. We might consider A as the next node on the path, but it is ruled out because A has already been marked. We show this by dashing the edge from A to E to indicate it was considered, but ineligible. The only possibility now is to go to C, because we have been at B as well (we just came from B).

A" D" A" D"


E" E"

B" C" B" C"

From C we consider the link to D (before considering the link to B) and we arrive at D, declaring success with the path

A → B → E → C → D

which, by construction, has no cycles.


There is one more consideration to make, namely what we do when we get stuck. Let's reconsider the original graph

[Figure: the example graph with vertices A, B, C, D, E and edges AB, BC, CD, AE, BE, CE]

and the goal to find a path from E to B. Let's say we start E → C and then C → D. At this point, all the vertices we could go to from D (which is only C) have already been marked! So we have to backtrack to the most recent choice point and pursue alternatives. In this case, this could be C, where the only remaining alternative would be B, completing the path E → C → B. Notice that when backtracking we have to go back to C even though it is already marked.


Depth-first search is characterized not only by the marking, but also by the fact that we always return to our most recent choice and follow a different path. When no other alternatives are available, we backtrack further. Let's consider the following slightly larger graph, where we explore the outgoing edges using the alphabetically last label first.

D" F"
A"
E"

B" G"
C"

We write the current node we are visiting on the left and on the right a stack of nodes we have to return to when we backtrack. For each of these we also remember which choices remain (in parentheses). We annotate marked nodes with an asterisk, which means that we never pick them as the next node to visit. In particular, when our current node has been marked, then we have been there and did not find the goal (yet), so we do not explore it again. Hence, we do not care what its neighbors are now, because those had already been put on the stack earlier and will be or will have been considered at some point.

For example, at step 5 we do not consider E* but go to D instead. We backtrack when no unmarked neighbors of the current node remain.
Step  Current  Neighbors    Stack                                     Remark
 1    A        (E, B)
 2    E        (C, B, A*)   A*(B)
 3    C        (G, E*, D)   E*(B, A*) | A*(B)
 4    G        (C*)         C*(E*, D) | E*(B, A*) | A*(B)
 5    C*       don't care   G*() | C*(E*, D) | E*(B, A*) | A*(B)      Backtrack
 6    E*       don't care   C*(D) | E*(B, A*) | A*(B)
 7    D        (F, C*)      C*() | E*(B, A*) | A*(B)
 8    F        (D*)         D*(C*) | C*() | E*(B, A*) | A*(B)
 9    D*       don't care   F*() | D*(C*) | C*() | E*(B, A*) | A*(B)  Backtrack
10    C*       don't care   D*() | C*() | E*(B, A*) | A*(B)           Backtrack ×2
11    B        (A*)         E*(B, A*) | A*(B)                         Goal Reached


We are keeping the visited nodes on a stack so we can easily return to the most recent one.

bool dfs(graph G, bool *mark, vertex start, vertex target) {
  REQUIRES(G != NULL && mark != NULL);
  REQUIRES(start < graph_size(G) && target < graph_size(G));
  if (mark[start]) return false;
  mark[start] = true;

  printf("Visiting %d\n", start);
  if (start == target) return true;

  for (adjlist *L = graph_connections(G, start); L != NULL; L = L->next) {
    if (dfs(G, mark, L->vert, target)) return true;
  }
  return false;
}

Note that this recursive implementation of DFS uses the (implicit) stack of the function calls to dfs, and each dfs function body has its own linked list for the adjacency list. In effect, that gives the search management data structure the form of a stack of queues (see Clac lab) as indicated in the example above. The stack elements are separated by | and the elements of the queues are wrapped in parentheses, as in (B, A*).

8 Depth-First Search with a single stack


When scrutinizing the above example, we notice that the sophisticated data
structure of a stack of queues was really quite unnecessary for DFS. The
recursive implementation is simple and elegant, but its effect on the data
management is nontrivial.
This can all be simplified by making the stack explicit. In that case there
is a single stack with all the nodes on it that we still need to look at.


Step  Current  Neighbors     + Old stack         = New stack
 1    A        (E, B)        ()                  (E, B)
 2    E        (C, B, A*)    (B)                 (C, B, A*, B)
 3    C        (G, E*, D)    (B, A*, B)          (G, E*, D, B, A*, B)
 4    G        (C*)          (E*, D, B, A*, B)   (C*, E*, D, B, A*, B)
 5    C*       don't care    (E*, D, B, A*, B)   (E*, D, B, A*, B)
 6    E*       don't care    (D, B, A*, B)       (D, B, A*, B)
 7    D        (F, C*)       (B, A*, B)          (F, C*, B, A*, B)
 8    F        (D*)          (C*, B, A*, B)      (D*, C*, B, A*, B)
 9    D*       don't care    (C*, B, A*, B)      (C*, B, A*, B)
10    C*       don't care    (B, A*, B)          (B, A*, B)
11    B        goal reached  (A*, B)

bool dfs_iter1(graph G, vertex start, vertex target) {
  REQUIRES(G != NULL);
  REQUIRES(start < graph_size(G) && target < graph_size(G));

  stack S = stack_new();
  push(S, (void*)(uintptr_t)start);

  bool mark[graph_size(G)];
  for (unsigned int i = 0; i < graph_size(G); i++)
    mark[i] = false;

  while (!stack_empty(S)) {
    vertex v = (vertex)(uintptr_t)pop(S);
    if (!mark[v]) {
      printf("Visiting %d\n", v);
      mark[v] = true;
      if (v == target) {
        stack_free(S, NULL);
        return true;
      }
      for (adjlist *L = graph_connections(G, v); L != NULL; L = L->next)
        push(S, (void*)(uintptr_t)L->vert);
    }
  }
  stack_free(S, NULL);
  return false;
}

We explicitly put the starting node on the stack. Then, every time we pop an element off the stack, we check whether it has been marked already (just as in the recursive implementation). If it wasn't, we visit the node by marking it and comparing it to the target. Finally, we push all of its neighbors onto the stack to make sure we look at them later.

9 Breadth-First Search

The iterative DFS algorithm managed its agenda, i.e., the list of nodes it still had to look at, using a stack. But there is no reason to insist on a stack for this purpose. What happens if we manage the agenda with a queue instead? All of a sudden, we will no longer explore the most recently found neighbor first as in depth-first search; instead, we will look at the oldest neighbor first. This corresponds to a breadth-first search, where we explore the graph layer by layer: BFS completes a layer of the graph before proceeding to the next layer. The code for this and many other interesting variations of graph search can be found on the web page.
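As a sketch of this idea (the queue interface names queue_new, enq, deq, queue_empty, and queue_free are assumptions here, chosen by analogy with the stack interface used in dfs_iter1), the change really is just a substitution of operations:

bool bfs(graph G, vertex start, vertex target) {
  REQUIRES(G != NULL);
  REQUIRES(start < graph_size(G) && target < graph_size(G));

  queue Q = queue_new();              // was: stack_new()
  enq(Q, (void*)(uintptr_t)start);    // was: push(S, ...)

  bool mark[graph_size(G)];
  for (unsigned int i = 0; i < graph_size(G); i++)
    mark[i] = false;

  while (!queue_empty(Q)) {
    vertex v = (vertex)(uintptr_t)deq(Q);   // was: pop(S)
    if (!mark[v]) {
      mark[v] = true;
      if (v == target) {
        queue_free(Q, NULL);
        return true;
      }
      for (adjlist *L = graph_connections(G, v); L != NULL; L = L->next)
        enq(Q, (void*)(uintptr_t)L->vert);
    }
  }
  queue_free(Q, NULL);
  return false;
}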



Lecture Notes on
Spanning Trees

15-122: Principles of Imperative Computation


Frank Pfenning

Lecture 26
April 25, 2013

The following is a simple example of a connected, undirected graph with 5 vertices (A, B, C, D, E) and 6 edges (AB, BC, CD, AE, BE, CE).

[Figure: the example graph with vertices A, B, C, D, E and edges AB, BC, CD, AE, BE, CE]

In this lecture we are particularly interested in the problem of computing a spanning tree for a connected graph. What is a tree here? Trees here are a bit different from the binary search trees we considered earlier. One simple definition is that a tree is a connected graph with no cycles, where a cycle lets you go from a node to itself without repeating an edge. A spanning tree for a connected graph G is a tree containing all the vertices of G. Below are


two examples of spanning trees for our original example graph.

[Figure: two different spanning trees of the example graph, each containing all five vertices and four of the six edges]


When dealing with a new kind of data structure, it is a good strategy to try to think of as many different characterizations as we can. This is somewhat similar to the problem of coming up with good representations of the data; different ones may be appropriate for different purposes. Here are some alternative characterizations the class came up with:

1. Connected graph with no cycle (original definition).

2. Connected graph where no two neighbors are otherwise connected. Neighbors are vertices connected directly by an edge; otherwise connected means connected without the connecting edge.

3. Two trees connected by a single edge. This is a recursive characterization. The base case is a single node, with the empty tree (no vertices) as a possible special case.

4. A connected graph with exactly n − 1 edges, where n is the number of vertices.

5. A graph with exactly one path between any two distinct vertices, where a path is a sequence of distinct vertices where each is connected to the next by an edge. (For paths in a tree to be distinct, we have to disallow paths that double back on themselves.)
When considering the asymptotic complexity it is often useful to categorize graphs as dense or sparse. Dense graphs have a lot of edges compared to the number of vertices. Writing n = |V| for the number of vertices (which will be our notation in the rest of the lecture), we know there can be at most n(n − 1)/2 edges: every node can be connected to every other node (n(n − 1) ordered pairs), but in an undirected graph each edge is counted only once (n(n − 1)/2). If we write e for the number of edges, we have e = O(n²). By comparison, a tree is sparse because e = n − 1 = O(n).


1 Computing a Spanning Tree

There are many algorithms to compute a spanning tree for a connected graph. The first is an example of a vertex-centric algorithm.

1. Pick an arbitrary node and mark it as being in the tree.

2. Repeat until all nodes are marked as in the tree:

   (a) Pick an arbitrary node u in the tree with an edge e to a node w not in the tree. Add e to the spanning tree and mark w as in the tree.

We iterate n − 1 times in Step 2, because there are n − 1 vertices that have to be added to the tree. The efficiency of the algorithm is determined by how efficiently we can find a qualifying w.
The second algorithm is edge-centric.

1. Start with the collection of singleton trees, each with exactly one node.

2. As long as we have more than one tree, connect two trees together with an edge in the graph.

This second algorithm also performs n − 1 steps, because it has to add n − 1 edges to the trees until we have a spanning tree. Its efficiency is determined by how quickly we can tell if an edge would connect two trees or would connect two nodes already in the same tree, a question we come back to in the next lecture.


Let’s try this algorithm on our first graph, considering edges in the
listed order: (AB, BC, CD, AE, BE, CE).

A
 D
 A
 D
 A
 D
 A
 D

E
 E
 E
 E


B
 C
 B
 C
 B
 C
 B
 C


A
 D
 A
 D

E
 E


B
 C
 B
 C


The first graph is the given graph, the completley disconnected graph is the
starting point for this algorithm. At the bottom right we have computed the
spanning tree, which we know because we have added n 1 = 4 edges. If
we tried to continue, the next edge BE could not be added because it does
not connect two trees, and neither can CE. The spanning tree is complete.

2 Creating a Random Maze


We can use the algorithm to compute a spanning tree for creating a random
maze. We start with the graph where the vertices are the cells and the
edges represent the neighbors we can move to in the maze. In the graph,
all potential neighbors are connected. A spanning tree will be defined by a
subset of the edges in which all cells in the maze are still connected by some
(unique) path. Because a spanning tree connects all cells, we can arbitrarily
decide on the starting point and end point after we have computed it.
How would we ensure that the maze is random? The idea is to generate a random permutation (see Exercise 1) of the edges and then consider the edges in this fixed order. Each edge is either added (if it connects two disconnected parts of the maze) or not (if the two vertices are already connected).


3 Minimum Weight Spanning Trees


In many applications of graphs, there is some measure associated with the
edges. For example, when the vertices are locations then the edge weights
could be distances. We might then be interested in not any spanning tree,
but one whose total edge weight is minimal among all the possible span-
ning trees, a so-called minimum weight spanning tree (MST). An MST is not
necessarily unique. For example, all the edge weights could be identical in
which case any spanning tree will be minimal.
We annotate the edges in our running example with edge weights as
shown on the left below. On the right is the minimum weight spanning
tree, which has weight 9.

[Figure: left, the example graph with edge weights AB = 3, BC = 3, CD = 3, AE = 2, BE = 2, CE = 2; right, the minimum weight spanning tree with edges AE, BE, CE, CD and total weight 9]

Before we develop a refinement of our edge-centric algorithm for spanning trees to take edge weights into account, we discuss a basic property it is based on.

Cycle Property. Let C ⊆ E be a cycle, and let e be an edge of maximal weight in C. Then e does not need to be in an MST.

How do we convince ourselves of this property? Assume we have a spanning tree, and edge e from the cycle property connects vertices u and w. If e is not in the spanning tree, then, indeed, we don't need it. If e is in the spanning tree, we will construct another MST without e. Removing e splits the spanning tree into two subtrees. There must be another edge e′ from C connecting the two subtrees. Removing e and adding e′ instead yields another spanning tree, and one which does not contain e. It has weight less than or equal to that of the first spanning tree, since e′ must have weight less than or equal to that of e.
The cycle property is the basis for Kruskal’s algorithm.


1. Sort all edges in increasing weight order.

2. Consider the edges in order. If the edge does not create a cycle, add it to the spanning tree; otherwise discard it. Stop when n − 1 edges have been added, because then we must have a spanning tree.

Why does this create a minimum-weight spanning tree? It is a straightforward application of the cycle property (see Exercise 2).

Sorting the edges will take O(e log e) steps with the most appropriate sorting algorithms. The complexity of the second part of the algorithm depends on how efficiently we can check if adding an edge will create a cycle or not. As we will see in Lecture 27, this can be O(n log n) or even more efficient if we use a so-called union-find data structure.
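To preview how the pieces fit together, here is a hedged sketch of Kruskal's main loop in C. The edge struct and the union-find interface (uf_new, uf_find, uf_union, uf_free) are hypothetical placeholders for the data structure developed in the next lecture; only the structure of the loop is the point.

typedef struct edge { vertex u; vertex v; int weight; } edge;

// edges[0..e) must already be sorted by increasing weight;
// n is the number of vertices, T an initially empty graph
void kruskal(edge *edges, unsigned int e, unsigned int n, graph T) {
  unionfind U = uf_new(n);          // every vertex in its own class
  unsigned int added = 0;
  for (unsigned int i = 0; i < e && added < n-1; i++) {
    vertex u = edges[i].u;
    vertex v = edges[i].v;
    if (uf_find(U, u) != uf_find(U, v)) {   // different trees: no cycle
      graph_addedge(T, u, v);
      uf_union(U, u, v);
      added++;
    }                                        // else: discard the edge
  }
  uf_free(U);
}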
Illustrating the algorithm on our example

[Figure: the example graph with edge weights AB = 3, BC = 3, CD = 3, AE = 2, BE = 2, CE = 2]

we first sort the edges. There is some ambiguity among equal weights; say we obtain the following list:

AE 2
BE 2
CE 2
BC 3
CD 3
AB 3
We now add the edges in order, making sure we do not create a cycle. After AE, BE, CE, we have

[Figure: the partial spanning tree with edges AE = 2, BE = 2, CE = 2]


At this point we consider BC. However, this edge would create a cycle
BCE since it connects two vertices in the same tree instead of two differ-
ent trees. We therefore do not add it to the spanning tree. Next we consider
CD, which does connect two trees. At this point we have a minimum span-
ning tree

[Figure: the minimum spanning tree with edges AE = 2, BE = 2, CE = 2, CD = 3]


We do not consider the last edge, AB, because we have already added n − 1 = 4 edges.
In the next lecture we will analyze the problem of incrementally adding
edges to a tree in a way that allows us to quickly determine if an edge
would create a cycle.


Exercises
Exercise 1 Write a function to generate a random permutation of a given array,
using a random number generator with the interface in the standard rand library.
What is the asymptotic complexity of your function?

Exercise 2 Prove that the cycle property implies the correctness of Kruskal’s algo-
rithm.



Lecture Notes on
Union-Find
15-122: Principles of Imperative Computation
Frank Pfenning

Lecture 26
April 30, 2013

1 Introduction
Kruskal’s algorithm for minimum weight spanning trees starts with a col-
lection of single-node trees and adds edges until it has constructed a span-
ning tree. At each step, it must decide if adding the edge under consid-
eration would create a cycle. If so, the edge would not be added to the
spanning tree; if not, it will.
In this lecture we will consider an efficient data structure for checking if
adding an edge to a partial spanning tree would create a cycle, a so-called
union-find structure.

2 Maintaining Equivalence Classes


The basic idea behind the data structure is to maintain equivalence classes
of nodes, efficiently. An equivalence class is a set of nodes related by an
equivalence relation, which must be reflexive, symmetric, and transitive. In our
case, this equivalence relation is defined on nodes in a partial spanning
tree, where two nodes are related if there is a path between them. This is
reflexive, because from each node u we can reach u by a path of length 0.
It is symmetric because we are working with undirected graphs, so if u is
connected to w, then w is also connected to u. It is transitive because if there
is a path from u to v and one from v to w, then the concatenation is a path
from u to w.
Initially in Kruskal’s algorithm, each node is in its own equivalence
class. When we connect two trees with an edge, we have to form the union

L ECTURE N OTES A PRIL 30, 2013


Union-Find L26.2

of the two equivalence classes, because each node in either of the two trees
is now connected to all nodes in both trees.
When we have to decide if adding an edge between two nodes u and w would create a cycle, we have to determine if u and w belong to the same equivalence class. If so, then there is already a path between u and w, and adding the edge would create a cycle. If not, then there is not already such a path, and adding the edge would therefore not create a cycle.
The union-find data structure maintains a so-called canonical represen-
tative of each equivalence class, which can be computed efficiently from
any element in the class. We then determine if two nodes u and w are in
the same class by computing the canonical representatives from u and w,
respectively, and comparing them. If they are equal, they must be in the
same class, otherwise they are in two different classes.

3 An Example
In order to motivate how the union-find data structure works, we consider
an example of Kruskal’s algorithm. We have the following graph, with the
indicated edge weights.

[Figure: graph with vertices A–F and edge weights AE = 1, ED = 1, FB = 1, CF = 1, AD = 2, EF = 2, CB = 2]


We have to consider the edges in increasing order, so let's fix the order AE, ED, FB, CF, AD, EF, CB. We represent the nodes A–F as integers 0–5 and keep the canonical representative for each node in an array.


Initially, each node is in its own equivalence class.

[Figure: the six nodes A–F, each in its own singleton tree]

In the array, we have the following state:

node    A  B  C  D  E  F
index   0  1  2  3  4  5
A[i]    0  1  2  3  4  5


We begin by considering the edge AE. We see that A and E are in two different equivalence classes because A[0] = 0 and A[4] = 4, and 0 ≠ 4. This means we have to add an edge between A and E.

[Figure: the forest after adding the edge AE]

In the array of canonical representatives, we either have to set A[0] = 4 or A[4] = 0, depending on whether we choose 0 or 4 as the representative.


Let’s assume it’s 0. The array then would be the following:

A
 B
 C
 D
 E
 F

0
 1
 2
 3
 4
 5

0
 1
 2
 3
 0
 5


Next we consider ED. Again, this edge should be added because A[4] = 0 ≠ 3 = A[3].

[Figure: the forest after adding the edge ED]


If we want to maintain the array, it is clearly easier to change A[3] to 0 than to change A[4] and A[0] to 3, since the latter would require two changes. In general, the representative of the larger class should be the representative of the union of the two classes.

node    A  B  C  D  E  F
index   0  1  2  3  4  5
A[i]    0  1  2  0  0  5



We now combine two more steps, because they are analogous to the above, adding edges FB and CF.

[Figure: the forest after adding the edges FB and CF]

The array:

node    A  B  C  D  E  F
index   0  1  2  3  4  5
A[i]    0  5  5  0  0  5


Next was the edge AD. In the array we have that A[0] = 0 = A[3], so A and D belong to the same equivalence class. Adding the edge would create a cycle, so we ignore it and move on.

The next edge to consider is EF. Since A[4] = 0 ≠ 5 = A[5], the two nodes are in different equivalence classes. The two classes are of equal size, so we have to decide which to make the canonical representative. If it is 0, then we would need to change A[1], A[2], and A[5] all to be 0. This could take up to n/2 changes in the array, potentially multiple times at different stages during the algorithm.

In order to avoid this we make one representative "point" to the other (say, setting A[5] = 0), but we do not change A[1] and A[2]. Now, to find the canonical representative of, say, 1 we first look up A[1] = 5. Next we look up A[5] = 0. Then we look up A[0] = 0. Since A[0] = 0 we know that we have found a canonical element and stop. In essence, we follow a chain of pointers until we reach a root, which is the canonical representative of the equivalence class and looks like it points to itself.
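In code, following such a chain is a simple loop. The following is a minimal sketch (the function name is invented for illustration), using the convention from the example that an element i is canonical exactly when A[i] = i:

// Returns the canonical representative of i's equivalence class
// by following parent pointers until we reach a root (A[i] == i).
vertex find(vertex *A, vertex i) {
  while (A[i] != i)
    i = A[i];
  return i;
}

The running time of one such find is proportional to the length of the chain, which is exactly what the balancing discussed below tries to keep small.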


Taking this step we now have

[Figure: the forest after adding the edge EF]

and a union-find structure which is as follows:

node    A  B  C  D  E  F
index   0  1  2  3  4  5
A[i]    0  5  5  0  0  0


At this point we can stop, and we don't even need to consider the last edge BC. That's because we have already added 5 = n − 1 edges (where n is the number of nodes), so we must have a spanning tree at this stage.

Examining the union-find structure we see that the representative for all nodes is indeed 0, so we have reduced it to one equivalence class and therefore a spanning tree.
In this algorithm we are trying to keep the chain from any node to its representative short. Therefore, when we are applying the union to two classes, we want the one with shorter chains to point to the one with longer chains. In that case, we only increase the length of the longest chain if both are equal.

In this case, we can show relatively easily that the worst-case complexity of a sequence of n find or union operations is O(n log n) (see Exercise 1).


4 An Implementation

Instead of developing the implementation here, we refer the reader to the code posted at 26-unionfind. There is a simple implementation, unionfind-lin.c, as we developed in lecture, which does not try to maintain balance and is therefore linear in the worst case.

The second implementation at unionfind-log.c changes the representation we used above slightly. In the above representation, an element i is canonical if A[i] = i. In the improved representation, a canonical element i stores A[i] = d, where d is the maximal length of the chain leading to the representative i. This allows us to make a quick decision about how to pick a representative for the union.
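One way to realize such a representation in C is sketched below; this is not necessarily the convention of the posted code. Here a root i stores the negated chain length, so roots are recognizable by a negative entry, and union makes the class with the shorter chain point to the one with the longer chain (the function names are invented for illustration).

// Sketch: A[i] >= 0 means A[i] is i's parent; A[i] < 0 means i is
// a root and -A[i] is a bound on the longest chain leading to i.
// Initially every entry is -1.
int uf_find(int *A, int i) {
  while (A[i] >= 0)
    i = A[i];
  return i;
}

void uf_union(int *A, int i, int j) {
  int ri = uf_find(A, i);
  int rj = uf_find(A, j);
  if (ri == rj) return;        // already in the same class
  if (A[ri] < A[rj]) {
    A[rj] = ri;                // ri's chain is longer: rj points to ri
  } else if (A[ri] > A[rj]) {
    A[ri] = rj;                // rj's chain is longer: ri points to rj
  } else {
    A[rj] = ri;                // equal: pick ri, whose chain grows by one
    A[ri]--;
  }
}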

Exercises
Exercise 1 Prove that after n union operations, the longest chain from an element
to its representative is O(log(n)) if we always take care to have the class with
longer chains be the canonical representative of the union. This is without any
form of path compression.

Exercise 2 Modify the simple implementation from lecture in unionfind-lin.c so


it does strong path compression, which means that on every find operation, every
intermediate node will be redirected to point directly to its canonical representative.

Exercise 3 Modify the more efficient implementation at unionfind-log.c. Note that this may require loosening the invariants, since in the straightforward implementation the stored number is only a bound on the longest path and may not be exact (since paths may be compressed).




15-122: Principles of Imperative Computation, Spring 2013

Homework 2 Programming: Twitterlab

Due: Monday, February 11, 2013 by 23:59

For the programming portion of this week's homework, you'll write two C0 files corresponding to two different string processing tasks, and two other files that perform unit tests on a potentially buggy sorting implementation:

• duplicates.c0 (described in Section 2)

• count_vocab.c0 (described in Section 3)

• sort-test.c0 (described in Section 4)

• sort_copy-test.c0 (described in Section 4)

You should submit these files electronically by 11:59 pm on the due date. Detailed submission
instructions can be found below.

Assignment: String Processing (20 points)

Starter code. Download the file hw2-handout.tgz from the course website. When you
unpack it, you will find a lib/ directory with several C0 files, including stringsearch.c0
and readfile.c0. You will also see a texts/ directory with some sample text files you may
use to test your code. You should not modify or submit code in the lib directory.
For this homework, you are not provided any main() functions. Instead, you should write
your own main() functions for testing your code. You should put this test code in separate
files from the ones you will submit for the problems below (e.g. duplicates-test.c0). You
may hand in these files or not.

Compiling and running. You will compile and run your code using the standard C0 tools.
For example, if you’ve completed the program duplicates that relies on functions defined
in stringsearch.c0 and you’ve implemented some test code in duplicates-test.c0, you
might compile with a command like the following:

% cc0 duplicates.c0 duplicates-test.c0

Don’t forget to include the -d switch if you’d like to enable dynamic annotation checking,
but this check should be turned o↵ when you are evaluating the running time of a function.

Submitting. Once you’ve completed some files, you can submit them to Autolab. There
are two ways to do this:

From the terminal on Andrew Linux (via cluster or ssh) type:

% handin hw2 duplicates.c0 count_vocab.c0 \


sort-test.c0 sort_copy-test.c0 README.txt

Your score will then be available on the Autolab website.

Your files can also be submitted to the web interface of Autolab. To do so, please tar
them, for example:

% tar -czvf sol.tgz duplicates.c0 count_vocab.c0 \


sort-test.c0 sort_copy-test.c0 README.txt

Then, go to https://autolab.cs.cmu.edu to submit.

You can submit this assignment as often as you would like. When we grade your assign-
ment, we will consider the most recent version submitted before the due date. If you get any
errors while trying to submit your code, you should contact the course staff immediately.

Annotations. Be sure to include appropriate //@requires, //@ensures, //@assert, and


//@loop_invariant annotations in your program. For this assignment, we have provided
the pre- and postconditions for many of the functions that you will need to implement.
However, you should provide loop invariants and any assertions that you use to check your
reasoning. If you write any “helper” functions, include precise and appropriate pre- and
postconditions.
You should write these as you are writing the code rather than after you’re done: docu-
menting your code as you go along will help you reason about what it should be doing, and
thus help you write code that is both clearer and more correct. Annotations are part of
your score for the programming problems; you will not receive maximum credit
if your annotations are weak or missing.

Unit testing. You should write unit tests for your code. This involves writing a separate
main() function that runs individual functions many times with various inputs, asserting
that the expected output is produced. You should specifically choose function inputs that
are tricky or are otherwise prone to fail. While you will not directly receive a large amount
of credit for these tests, your tests will help you check the correctness of your code, pinpoint
the location of bugs, and save you hours of frustration.

Style. Strive to write code with good style: indent every line of a block to the same level,
use descriptive variable names, keep lines to 80 characters or fewer, document your code
with comments, etc. If you find yourself writing the same code over and over, you should
write a separate function to handle that computation and call it whenever you need it. We
will read your code when we grade it, and good style is sure to earn our good graces. Feel
free to ask on Piazza if you’re unsure of what constitutes good style.

Task 0 (5 points) 5 points on this assignment will be given for style.



1 String Processing Overview


The three short programming problems you have for this assignment deal with processing
strings. In the C0 language, a string is a sequence of characters. Unlike languages like C, a
string is not the same as an array of characters. (See section 8 in the C0 language reference,
section 2.2 of the C0 library reference, and the page on Strings in the C0 tutorial1 for more
information on strings). There is a library of string functions (which you include in your
code by #use <string>) that you can use to process strings:

// Returns the length of the given string
int string_length(string s)

// Returns the character at the given index of the string.
// If the index is out of range, aborts.
char string_charat(string s, int idx)
//@requires 0 <= idx && idx < string_length(s);

// Returns a new string that is the result of concatenating b to a.
string string_join(string a, string b)
//@ensures string_length(\result) == string_length(a) + string_length(b);

// Returns the substring composed of the characters of s beginning at
// index given by start and continuing up to but not including the
// index given by end. If end <= start, the empty string is returned
string string_sub(string a, int start, int end)
//@requires 0 <= start && start <= end && end <= string_length(a);
//@ensures string_length(\result) == end - start;

bool string_equal(string a, string b)

int string_compare(string a, string b)
//@ensures -1 <= \result && \result <= 1;

The string_compare function performs a lexicographic comparison of two strings, which is essentially the ordering used in a dictionary, but with character comparisons being based on the characters' ASCII codes, not just alphabetical order. For this reason, the ordering used here is sometimes whimsically referred to as "ASCIIbetical" order. A table of all the ASCII codes is shown in Figure 1. The ASCII value for '0' is 0x30 (48 in decimal), the ASCII code for 'A' is 0x41 (65 in decimal), and the ASCII code for 'a' is 0x61 (97 in decimal). Note that ASCII codes are set up so the character 'A' is "less than" the character 'B', which is less than the character 'C', and so on, so the "ASCIIbetical" order coincides roughly with ordinary alphabetical order. (For example, "Zebra" compares less than "apple" in this order, because 'Z' at 0x5A precedes 'a' at 0x61.)

1 http://c0.typesafety.net/tutorial/Strings.html

Figure 1: The ASCII table

2 Removing Duplicates
In this programming exercise, you will take a sorted array of strings and return a new sorted
array that contains the same strings without duplicates. The length of the new array should
be just big enough to hold the resulting strings. Place your code for this section in a file
called duplicates.c0; you’ll want this file to start with #use "lib/stringsearch.c0" in
order to get the is_sorted function from class adapted to string arrays. Implement unit
tests for all of the functions in this section in a file called duplicates_test.c0.
Task 1 (1 pt) Implement a function matching the following function declaration:
bool is_unique(string[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
where n represents the size of the subarray of A that we are considering. This function should
return true if the given string array contains no repeated strings and false otherwise.
Task 2 (1 pt) Implement a function matching the following function declaration:
int count_unique(string[] A, int n)
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
where n represents the size of the subarray of A that we are considering. This function should
return the number of unique strings in the array, and your implementation should have an
appropriate asymptotic running time given the precondition.

Task 3 (3 pts) Implement a function matching the following function declaration:

string[] remove_duplicates(string[] A, int n)


//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);

where n represents the size of the subarray of A that we are considering. The strings in
the array should be sorted before the array is passed to your function. This function should
return a new array that contains only one copy of each distinct string in the array A. Your
new array should be sorted as well. Your implementation should have a linear asymptotic
running time. Your solution should include annotations for at least 3 strong postconditions.

You must include annotations for the precondition(s), postcondition(s) and loop invari-
ant(s) for each function. You may include additional annotations for assertions as necessary.
You may include any auxiliary functions you need in the same file, but you should not include
a main() function.

3 DosLingos (Counting Common Words)


The story: You’re working for a Natural Language Processing (NLP) startup company
called DosLingos.2 Already, your company has managed to convince thousands of users to
translate material from English to Spanish for free. In a recent experiment, you had users
translate only newswire text and you’ve managed to train your users to recognize words in
an English newspaper. However, now you’re considering having these same users translate
Twitter tweets as well, but you’re not sure how many words of English Twitter dialect your
Spanish-speaking users will be able to recognize.

Your job: In this exercise, you will write functions for analyzing the number of tokens from a Twitter feed that appear (or not) in a user's vocabulary. The user's expected vocabulary will be represented by a sorted array of strings vocab that has length v, and we will maintain another integer array, freq, where freq[i] represents the number of times we have seen vocab[i] in tweets so far (where i ∈ [0, v)).

vocab&

“burrow”( “ha”( “his”( “is”( “list”( “of”( “out”( “winter”(

freq&

1( 12( 0( 0( 2( 4( 1( 2(

2 Any resemblance between this scenario and Dr. Luis von Ahn's company DuoLingo (www.duolingo.com) is purely coincidental and should not be construed otherwise.

This is an important pattern, and one that we will see repeatedly throughout the semester in 15-122: the (sorted) vocabulary words stored in vocab are keys and the frequency counts stored in freq are values.

The function count_vocab that we will write updates the values, the frequency counts, based on the unsorted Twitter data we are getting in. For example, consider a Twitter corpus containing only this tweet by weatherman Scott Harbaugh:

[Image: Scott Harbaugh's tweet, containing the six words "phil", "is", "out", "of", "his", and "burrow"]

We would expect count_vocab(vocab,freq,8,"texts/scotttweet.txt",b) to return 1 (because only one word, "Phil," is not in our example vocabulary), leave the contents of vocab unchanged, and update the frequency counts in freq as follows:

vocab&

“burrow”( “ha”( “his”( “is”( “list”( “of”( “out”( “winter”(

freq&

2( 12( 1( 1( 2( 5( 2( 2(

Your data: DosLingos has given you 4 data files for your project in the texts/ directory:

• news_vocab_sorted.txt - A sorted list of vocabulary words from news text that DosLingos users are familiar with.

• scotttweet.txt - Scott Harbaugh's tweet above.

• twitter_1k.txt - A small collection of 1000 tweets to be used for testing slower algorithms.

• twitter_200k.txt - A larger collection of 200k tweets to be used for testing faster algorithms.

Your tools: DosLingos already has a C0 library for reading text files, provided to you as
lib/readfile.c0, which implements the following functions:

// first call read_words to read in the content of the file
string_bundle read_words(string filename)

You need not understand anything about the type string_bundle other than that you can extract its underlying string array and the length of that array:

// access the array inside of the string_bundle using:
string[] string_bundle_array(string_bundle sb)

// to determine the length of the array in the string_bundle, use:
int string_bundle_length(string_bundle sb)

Here’s an example of these functions being used on Scott Harbaugh’s tweet:

$ coin lib/readfile.c0
--> string_bundle bund = read_words("texts/scotttweet.txt");
bund is 0xFFAFB8E0 (struct fat_string_array*)
--> string_bundle_length(bund);
6 (int)
--> string[] tweet = string_bundle_array(bund);
tweet is 0xFFAFB670 (string[] with 6 elements)
--> tweet[0];
"phil" (string)
--> tweet[5];
"burrow" (string)

Being connoisseurs of efficient algorithms, DosLingos has also implemented their own set of string search algorithms in lib/stringsearch.c0, which you may also find useful for this assignment:

int linsearch(string x, string[] A, int n) // Linear search
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
/*@ensures (-1 == \result && !is_in(x, A, 0, n))
  || ((0 <= \result && \result < n) && string_equal(A[\result], x)); @*/

int binsearch(string x, string[] A, int n) // Binary search
//@requires 0 <= n && n <= \length(A);
//@requires is_sorted(A, 0, n);
/*@ensures (-1 == \result && !is_in(x, A, 0, n))
  || ((0 <= \result && \result < n) && string_equal(A[\result], x)); @*/

You can include these libraries in your code by writing #use "lib/readfile.c0" and #use "lib/stringsearch.c0".

Task 4 (4 pts) Create a file count_vocab.c0 containing a function definition count_vocab


that matches the following function declaration:

int count_vocab(string[] vocab, int[] freq, int v,


string tweetfile,
bool fast)
//@requires v == \length(vocab) && v == \length(freq);
//@requires is_sorted(vocab, 0, v) && is_unique(vocab, v);

The function should return the number of occurrences of words in the file tweetfile that do not appear in the array vocab, and should update the frequency counts in freq with the number of times each word in the vocabulary appears. If a word appears multiple times in the tweetfile, you should count each occurrence separately, so the tweet "ha ha ha LOL LOL" would cause the frequency count for "ha" to be incremented by 3 and would cause 2 to be returned, assuming LOL was not in the vocabulary.

Note that a precondition of count_vocab is that the vocab must be sorted, a fact you should exploit. Your function should use the linear search algorithm when fast is set to false, and it should use the binary search algorithm when fast is true.

Because count_vocab uses the is_unique function you wrote earlier, when you write a
count_vocab-test.c0 function to test your implementation, you’ll want to include the file
duplicates.c0 on the command line:

% cc0 duplicates.c0 count_vocab.c0 count_vocab-test.c0

Task 5 (1 pt) Create a file README.txt answering the following questions:

1. Give the asymptotic running time of count_vocab under (1) linear and (2) binary
search using big-O notation. This should be in terms of v, the size of the vocabulary,
and n, the number of tweets in tweetfile.

2. How many seconds did it take your function to run on the linear search strategy
(fast=false) using the small 1K twitter text? Do not use contract checking via
the -d option. Also, these tests should use cc0, not Coin, so you’ll need to write a
file count_vocab-time.c0 to help you when you do this step.
You should use the Unix command time in this step. You can report either wall clock
time or CPU time, but say which one you used.
Example: time ./count_vocab

3. How many seconds did it take for the binary search strategy (fast=true) to run on the
small 1K twitter text?

4. How many seconds did it take for the binary search strategy (fast=true) on the larger
200K twitter text?

Submit this file along with the rest of your code.



4 Unit testing
DosLingos’s old selection sort is no longer up to the task of sorting large texts to make
vocabularies. Your colleagues currently use two sorts, both given in lib/stringsort.c0:

void sort(string[] A, int lower, int upper)


//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(A, lower, upper);

string[] sort_copy(string[] A, int lower, int upper)


//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(\result, 0, \length(\result));

The first is an in-place sort like we discussed in class, and the second is a copying sort that
must leave the original array A unchanged and return a sorted array of length upper-lower.
(Note that neither of these conditions are directly expressed in the contracts.)
DosLingos decided to pay another company to write faster sorting algorithms with the
same interface. Unfortunately, they didn’t realize that the other company was a closed-source
shop, so now your company’s future is depending on code you can’t see – you know that the
contracts are set up correctly, but you don’t know anything about the implementation. This
causes (at least) two big problems.
First, you can’t prove that your sorting functions always respect their contracts – the
best you can do is give a counterexample, writing a test that causes the @ensures statement
to fail. If the outside contractors give you this completely bogus implementation. . .

void sort(string[] A, int lower, int upper)


//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(A, lower, upper);
{
return;
}

...then you can show that their implementation is buggy by writing a test file with a main() function that performs a sort and observing that the @ensures statement fails when you compile the test with -d and run it.
Second, even code that does satisfy the contracts may not actually be correct! For exam-
ple, this sort_copy function will never fail the postcondition, but it is definitely incorrect:

string[] sort_copy(string[] A, int lower, int upper)


//@requires 0 <= lower && lower <= upper && upper <= \length(A);
//@ensures is_sorted(\result, 0, \length(\result));
{
return alloc_array(string, upper-lower);
}

In order to catch this kind of incorrect sort, you will have to write a test file with a main()
function that runs the sort and then uses //@assert statements to check that other correct-
ness conditions hold – to catch this bug, the is_in() and/or binsearch() functions might
be helpful, for instance, though they certainly aren’t necessary.
To get full credit on the next task, you’ll need to write tests with extra assertions that will
fail with assertion errors both if the outside contractors wrote a sometimes-postcondition-
failing implementation and if they exploited the contracts to give you a bogus-but-contract-
abiding implementation.

Task 6 (5 pts) Write two files, sort-test.c0 and sort_copy-test.c0, that test the two
sorting functions. The autograder will assign you a grade based on the ability of your unit
tests to pass when given a correct sort and fail when given various buggy sorts. Your tests
must still be safe: it should not be possible for your code to make an array access out-of-
bounds when -d is turned on.
You do not need to catch all our bugs to get full points, but catching additional bugs will be reflected on the scoreboard (and considered for bonus points).

4.1 Testing your tests


You can test your functions with DosLingos’s own (presumably correct) selection sort algo-
rithm, and on the two awful badly broken implementations given in this section, by running
the following commands:

% cc0 -d lib/stringsort.c0 sort-test.c0


% ./a.out
% cc0 -d lib/stringsort.c0 sort_copy-test.c0
% ./a.out
% cc0 -d lib/sort-awful.c0 sort-test.c0
% ./a.out
% cc0 -d lib/sort_copy-awful.c0 sort_copy-test.c0
% ./a.out

All four of these tests should compile and run, but the last two invocations of ./a.out
should trigger a contract violation if your tests are more than minimal. We will only test
one function at a time, so sort-test.c0 must only reference the sort() function and
sort_copy-test.c0 must only reference the sort_copy() function. Both can reference the
specification function is_sorted and all other functions defined in lib/stringsearch.c0.
You can #use "lib/stringsearch.c0" in your tests, but it is important that you do not
#use "lib/stringsort.c0".

15-122 : Principles of Imperative Computation, Spring 2013

Homework 6 Theory [UPDATE 1]

Due: Thursday, April 4, 2013, at the beginning of lecture

Name:

Andrew ID:

Recitation:

The written portion of this week’s homework will give you some practice working with heaps,
priority queues, BSTs, and AVL trees, as well as begin our transition to full C. You can either
type up your solutions or write them neatly by hand in the spaces provided. You should
submit your work in class on the due date just before lecture or recitation begins. Please
remember to staple your written homework before submission.

Question Points Score


1 5
2 6
3 6
4 8
5 5
Total: 30

You must use this printout, include this cover sheet, and staple the whole thing together before turning it in. Either type up the assignment using 15122-theory6.tex, or print this PDF and write your answers neatly by hand.

1. Heaps.
We represent heaps, conceptually, as trees. For example, consider the min-heap below.1

[Figure: a min-heap diagram; part (a) refers to the element with value 32 in it]

(2) (a) Assume a heap is stored in an array as discussed in class where the root is stored
at index 1. Using the above min-heap, at what index is the element with value 32
stored? At what index is its parent stored? At what indices are its left and right
children stored?

Solution:
The value 32 is stored at index ___________.

The parent of value 32 is stored at index ____________.

The left child of value 32 is stored at index ____________.

The right child of value 32 is stored at index ____________.

(1) (b) Suppose we have a non-empty min-heap of integers of size n and we wish to find
the maximum integer in the heap. Describe precisely where the maximum must
be in the min-heap. (You should be able to answer this question with one short
sentence.)

Solution:

1 Diagram courtesy of Hamilton (http://hamilton.herokuapp.com)

(1) (c) Using the following C0 definition for a heap of integers (position 0 in the array is
not used):

struct heap_header {
int limit; // size of the array of integers
int next; // next available array position for an integer
int[] value;
};
typedef struct heap_header* heap;

Write a C0 function find_max that takes a non-empty min-heap and returns the
maximum value. Your code should examine only those cells that could possibly
hold the maximum.

Solution:
int find_max(heap H)
//@requires is_heap(H);
//@requires H->next > 1;
{

(1) (d) What is the worst-case runtime complexity in big-O notation of your find_max
function on a non-empty min-heap of n elements from the previous problem?

Solution:

2. Heaps and BSTs.


Though heaps and binary search trees (BSTs) are very different in terms of their invari-
ants and uses, they are both conceptually represented as trees. This question asks about
three invariants of trees: the BST ordering invariant, the heap shape invariant, and
the heap ordering invariant (for min-heaps, where higher-priority keys are lower integer
values). For the first part of this question, we assume that each element has a single C0
int that is used as both the BST key and the heap priority.
(1) (a) Draw a tree with five elements that is a BST and satisfies the heap shape invariant.

Solution:

(1) (b) Draw a tree with at least four elements that is a BST and satisfies the (min-)heap
ordering invariant.

Solution:

(1) (c) Why is it not a good idea to have a data structure that enforces both the (min-)heap
ordering invariant and the BST ordering invariant? (Be brief!)

Solution:

(3) (d) Maintaining the BST ordering invariant and the heap invariant on the same set of
values may not be a good idea, but it can be useful to have a tree structure where
each node has two separate values – a key used for the BST ordering invariant and
a priority used for the heap ordering invariant. Such trees are called treaps; we will
use strings as keys and ints as priorities in this question.
The treap below satisfies the BST ordering invariant, but violates the heap ordering invariant because of the relationship between the "e"/9 node and its parent. In a heap, we restore the heap ordering invariant using swaps. But in a treap, such a swap would violate the BST ordering invariant. However, by using the same local rotations we learned about for AVL trees, it is possible to restore the heap ordering invariant while preserving the BST ordering invariant.

[Figure: the treap, a BST over string keys with integer priorities, including the offending "e"/9 node]

The heap ordering invariant for the tree below can be restored with two tree rotations. Draw the tree that results from each rotation. You should be drawing two trees.

Solution:

3. Priority Queues.
In a priority queue, each element has a priority value which is represented as an
integer. As in the previous question, the lower the integer, the higher the priority.
When we call pq_delmin, we remove the element with the highest priority.
(a) Consider the following ways that we can implement a priority queue. Using big-O
notation, what is the worst-case runtime complexity for each implementation to
perform pq_insert and pq_delmin on a priority queue with n elements?
(1) i. Using an unsorted array.

Solution:
pq_insert:

pq_delmin:

(1) ii. Using a sorted array, where the elements are stored from lowest to highest
priority.

Solution:
pq_insert:

pq_delmin:

(1) iii. Using a heap.

Solution:
pq_insert:

pq_delmin:

(1) (b) Which implementation in (a) is preferable if the number of pq_insert and pq_delmin
operations are relatively balanced? Explain in one sentence.

Solution:

(1) (c) Under what specific condition does a priority queue behave like a FIFO queue if it is implemented using a heap? (Warning: if you use words like "higher," "lower," "increasing," or "decreasing" in your answer, be clear whether you are talking about priority or integer value.)

Solution:

(1) (d) Under what specific condition does a priority queue behave like a LIFO stack if it
is implemented using a heap?

Solution:

4. AVL Trees.
(4) (a) Draw the AVL trees that result after successively inserting the following keys into
an initially empty tree, in the order shown:

98, 88, 54, 67, 23, 72, 39

Show the tree after each insertion and subsequent re-balancing (if any): the tree
after the first element, 98, is inserted into an empty tree, then the tree after 88 is
inserted into the first tree, and so on for a total of seven trees. Make it clear what
order the trees are in.
Be sure to maintain and restore the BST invariants and the additional balance
invariant required for an AVL tree after each insert.
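As a reminder, the balance invariant can be phrased as a recursive check (a
sketch in C, assuming a node type like the one sketched in the trees question
and a height function with height(NULL) == 0):

// Sketch: true iff at every node the heights of the two subtrees
// differ by at most 1.
bool is_balanced(tree *T) {
  if (T == NULL) return true;
  int diff = height(T->left) - height(T->right);
  return -1 <= diff && diff <= 1
      && is_balanced(T->left) && is_balanced(T->right);
}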

Solution:

(b) Recall our definition for the height h of a tree:


The height of a tree is the maximum length of a path from the
root to a leaf. So the empty tree has height 0, the tree with
one node has height 1, and a balanced tree with three nodes has
height 2.
The minimum number of nodes n in a valid AVL tree is related to its height. The
goal of this question is to quantify this relationship.
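One way to sanity-check your table entries: a minimally-sized AVL tree of height
h > 1 consists of a root, a minimal subtree of height h-1, and a minimal subtree
of height h-2. The following sketch (in C, for checking only, not a substitute
for your own reasoning) computes n from h under that recurrence:

// Minimum number of nodes in a valid AVL tree of height h, using the
// height convention above (the empty tree has height 0).
int min_nodes(int h) {
  if (h == 0) return 0;
  if (h == 1) return 1;
  return 1 + min_nodes(h - 1) + min_nodes(h - 2);
}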
(2) i. Fill in the table below relating the variables h and n:

h   n
0   0
1   1
2   2
3
4
5

(2) ii. Recall that the xth Fibonacci number F(x) is defined by:

F(0) = 0
F(1) = 1
F(x) = F(x-1) + F(x-2), x > 1

Using the table in part (i), give an expression for T(h), where T(h) = n. You
may find it useful to use F(n) in your answer, but your answer does not need
to be a closed form expression.

Solution:

5. Pass by reference.

We now begin our transition in 15-122 to full C!

At various points in our C0 programming experience we had to use somewhat awkward
workarounds to deal with functions that need to return more than one value. The
address-of operator (&) in C gives us a new way of dealing with this issue.
(2) (a) Sometimes, a function needs to be able to both 1) signal whether it can return a
result, and 2) return that result if it is able to. One such function that we’ve seen
is peg_solve. When a solution is found, peg_solve returns 1 and modifies the
originally-empty stack passed in with the winning moves, and when no solution is
found peg_solve simply returns the minimum number of pegs seen. Parsing also
fits this pattern. Consider the following code:

bool my_int_parser(char *s, int *i); // Returns true iff parse succeeds

void parseit(char *s) {
  REQUIRES(s != NULL);
  int *i = xmalloc(sizeof(int));
  if (my_int_parser(s, i))
    printf("Success: %d.\n", *i);
  else
    printf("Failure.\n");
  free(i);
  return;
}
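As a reminder of how the address-of operator behaves with stack-allocated
variables (a generic sketch, independent of my_int_parser):

int n = 0;    // n lives in the current stack frame; no xmalloc involved
int *p = &n;  // &n is the address of n, so p points to n
*p = 42;      // writing through p updates n; n == 42 here
              // no free() is needed: n's storage is reclaimed automatically
              // when the enclosing function returns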

Using the address-of operator, rewrite the body of the parseit function so that it
does not heap-allocate, free, or leak any memory on the heap. You may assume
my_int_parser has been implemented (its prototype is given above).

Solution:
void parseit(char *s) {
  REQUIRES(s != NULL);

  return;
}

(3) (b) In both C and C0, multiple values can be ‘returned’ by bundling them in a struct:
struct bundle { int x; int y; };

struct bundle *foo(int x) {
  ...
  struct bundle *B = xmalloc(sizeof(struct bundle));
  B->x = e1;
  B->y = e2;
  return B;
}

int main() {
  ...
  struct bundle *B = foo(e);
  int x = B->x;
  int y = B->y;
  free(B);
  ...
}
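One relevant fact, shown with a hypothetical struct point (a generic sketch,
not the solution itself): C structs are values, so small structs can be
returned and copied without any heap allocation.

struct point { int x; int y; };

struct point make_point(int x, int y) {
  struct point p;
  p.x = x;
  p.y = y;
  return p;  // returned by value: the caller receives a copy
}

// Usage: struct point q = make_point(3, 4);
// afterwards q.x == 3 and q.y == 4, and no free() is ever needed.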
Rewrite the declaration and the last few lines of the function foo, as well as the
snippet of main, to avoid heap-allocating, freeing, or leaking any memory on the
heap. The rest of the code (...) should continue to behave exactly as it did before.

Solution:
_______________ foo(___________________________________________) {
...

________________________________________________________________

________________________________________________________________

________________________________________________________________
}

int main() {
...

________________________________________________________________

________________________________________________________________

________________________________________________________________

________________________________________________________________
...
}
