Software Construction 4

Software Construction 4 complete

Uploaded by

qamarmemon
Copyright
© All Rights Reserved

Reading 1: Static Checking

Objectives for Today’s Class

Today’s class has two topics:

static typing
the big three properties of good software

Hailstone Sequence

As a running example, we’re going to explore the hailstone sequence,
which is defined as follows. Starting with a number n, the next
number in the sequence is n/2 if n is even, or 3n+1 if n is odd. The
sequence ends when it reaches 1. Here are some examples:

2, 1
3, 10, 5, 16, 8, 4, 2, 1
4, 2, 1
2^n, 2^(n-1), ..., 4, 2, 1
5, 16, 8, 4, 2, 1
7, 22, 11, 34, 17, 52, 26, 13, 40, ...? (where does this stop?)

Because of the odd-number rule, the sequence may bounce up and
down before decreasing to 1. It’s conjectured that all hailstones
eventually fall to the ground – i.e., the hailstone sequence reaches 1
for all starting n – but that’s still an open question. Why is it called a
hailstone sequence? Because hailstones form in clouds by bouncing
up and down, until they eventually build enough weight to fall to earth.

Computing Hailstones
Here’s some code for computing and printing the hailstone sequence
for some starting n. We’ll write Java and Python side by side for
comparison:

// Java                         # Python
int n = 3;                      n = 3
while (n != 1) {                while n != 1:
    System.out.println(n);          print(n)
    if (n % 2 == 0) {               if n % 2 == 0:
        n = n / 2;                      n = n // 2
    } else {                        else:
        n = 3 * n + 1;                  n = 3 * n + 1
    }
}
System.out.println(n);          print(n)

A few things are worth noting here:

The basic semantics of expressions and statements in Java are
very similar to Python: while and if behave the same, for
example.

Java requires semicolons at the ends of statements. The extra
punctuation can be a pain, but it also gives you more freedom in
how you organize your code – you can split a statement into
multiple lines for more readability.

Java requires parentheses around the conditions of the if and
while.

Java requires curly braces around blocks, instead of indentation.
You should always indent the block, even though Java won’t pay
any attention to your extra spaces. Programming is a form of
communication, and you’re communicating not only to the
compiler, but to human beings. Humans need that indentation.
We’ll come back to this later.

Types

The most important semantic difference between the Python and
Java code above is the declaration of the variable n, which specifies
its type: int.

A type is a set of values, along with operations that can be
performed on those values.

Java has several primitive types, among them:

int (for integers like 5 and -200, but limited to the range ±2^31, or
roughly ±2 billion)
long (for larger integers up to ±2^63)
boolean (for true or false)
double (for floating-point numbers, which represent a subset of the
real numbers)
char (for single characters like 'A' and '$')

Java also has object types, for example:

String represents a sequence of characters, like a Python string.
BigInteger represents an integer of arbitrary size, so it acts like a
Python integer.

By Java convention, primitive types are lowercase, while object types
start with a capital letter.

Operations are functions that take inputs and produce outputs (and
sometimes change the values themselves). The syntax for operations
varies, but we still think of them as functions no matter how they’re
written. Here are three different syntaxes for an operation in Python
or Java:

As an infix, prefix, or postfix operator. For example, a + b invokes
the operation + : int × int → int.

As a method of an object. For example, bigint1.add(bigint2) calls
the operation add: BigInteger × BigInteger → BigInteger.

As a function. For example, Math.sin(theta) calls the operation
sin: double → double. Here, Math is not an object. It’s the class that
contains the sin function.

Contrast Java’s str.length() with Python’s len(str). It’s the same
operation in both languages – a function that takes a string and
returns its length – but it just uses different syntax.

Some operations are overloaded in the sense that the same
operation name is used for different types. The arithmetic operators
+, -, *, / are heavily overloaded for the numeric primitive types in
Java. Methods can also be overloaded. Most programming
languages have some degree of overloading.
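The three call syntaxes and the idea of overloading can be tried out in a few lines of Java. This is only an illustrative sketch: the class name OperationSyntax and the values are made up for this example, while +, /, BigInteger.add, Math.sin, Math.max, and String.length are standard Java.

```java
import java.math.BigInteger;

public class OperationSyntax {
    public static void main(String[] args) {
        // Infix operator: + : int x int -> int
        int a = 2;
        int b = 3;
        System.out.println(a + b);                 // 5

        // Method of an object: add : BigInteger x BigInteger -> BigInteger
        BigInteger bigint1 = new BigInteger("123456789012345678901234567890");
        BigInteger bigint2 = BigInteger.valueOf(10);
        System.out.println(bigint1.add(bigint2));  // 123456789012345678901234567900

        // Function that lives in a class: sin : double -> double
        double theta = 0.0;
        System.out.println(Math.sin(theta));       // 0.0

        // Same operation as Python's len(str), written as a method call
        String str = "hailstone";
        System.out.println(str.length());          // 9

        // Overloaded operator: / divides ints one way and doubles another
        System.out.println(7 / 2);                 // 3
        System.out.println(7.0 / 2.0);             // 3.5

        // Overloaded method: Math.max accepts two ints or two doubles
        System.out.println(Math.max(3, 8));        // 8
        System.out.println(Math.max(3.5, 8.25));   // 8.25
    }
}
```

In each overloaded case the compiler picks the version of the operation from the static types of the arguments.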

Static Typing

Java is a statically-typed language. The types of all variables are
known at compile time (before the program runs), and the compiler
can therefore deduce the types of all expressions as well. If a and b
are declared as ints, then the compiler concludes that a+b is also an
int. The Eclipse environment does this while you’re writing the code,
in fact, so you find out about many errors while you’re still typing.

In dynamically-typed languages like Python, this kind of checking is
deferred until runtime (while the program is running).

Static typing is a particular kind of static checking, which means
checking for bugs at compile time. Bugs are the bane of
programming. Many of the ideas in this course are aimed at
eliminating bugs from your code, and static checking is the first idea
that we’ve seen for this. Static typing prevents a large class of bugs
from infecting your program: to be precise, bugs caused by applying
an operation to the wrong types of arguments. If you write a broken
line of code like:

"5" * "6"

that tries to multiply two strings, then static typing will catch this error
while you’re still programming, rather than waiting until the line is
reached during execution.

Static Checking, Dynamic Checking, No Checking


It’s useful to think about three kinds of automatic checking that a
language can provide:

Static checking: the bug is found automatically before the
program even runs.
Dynamic checking: the bug is found automatically when the code
is executed.
No checking: the language doesn’t help you find the error at all.
You have to watch for it yourself, or end up with wrong answers.

Needless to say, catching a bug statically is better than catching it
dynamically, and catching it dynamically is better than not catching it
at all.

Here are some rules of thumb for what errors you can expect to be
caught at each of these times.

Static checking can catch:

syntax errors, like extra punctuation or spurious words. Even
dynamically-typed languages like Python do this kind of static
checking. If you have an indentation error in your Python program,
you’ll find out before the program starts running.
wrong names, like Math.sine(2). (The right name is sin.)
wrong number of arguments, like Math.sin(30, 20).
wrong argument types, like Math.sin("30").
wrong return types, like return "30"; from a function that’s
declared to return an int.

Dynamic checking can catch:

illegal argument values. For example, the integer expression x/y is
only erroneous when y is actually zero; otherwise it works. So in
this expression, divide-by-zero is not a static error, but a dynamic
error.
unrepresentable return values, i.e., when the specific return value
can’t be represented in the type.
out-of-range indexes, e.g., using a negative or too-large index on
a string.
calling a method on a null object reference (null is like Python
None).

Static checking tends to be about types, errors that are independent
of the specific value that a variable has. A type is a set of values.
Static typing guarantees that a variable will have some value from
that set, but we don’t know until runtime exactly which value it has. So
if the error would be caused only by certain values, like divide-by-zero
or index-out-of-range, then the compiler won’t raise a static error
about it.

Dynamic checking, by contrast, tends to be about errors caused by
specific values.
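To see dynamic checking at work, the following sketch triggers three of the errors listed above and reports which exception Java raises. The class DynamicChecks and its helper methods are hypothetical names chosen for this illustration; the exception types are the standard ones from the Java library.

```java
public class DynamicChecks {
    /** Returns x / y as text, or the name of the exception raised at runtime. */
    public static String divide(int x, int y) {
        try {
            return String.valueOf(x / y);
        } catch (ArithmeticException e) {
            return "ArithmeticException";
        }
    }

    /** Returns the character of s at index as text, or the name of the exception raised. */
    public static String charAt(String s, int index) {
        try {
            return String.valueOf(s.charAt(index));
        } catch (StringIndexOutOfBoundsException e) {
            return "StringIndexOutOfBoundsException";
        }
    }

    /** Returns the length of s as text, or the name of the exception raised. */
    public static String lengthOf(String s) {
        try {
            return String.valueOf(s.length());
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(divide(10, 2));      // 5
        System.out.println(divide(10, 0));      // ArithmeticException
        System.out.println(charAt("hail", 1));  // a
        System.out.println(charAt("hail", 10)); // StringIndexOutOfBoundsException
        System.out.println(lengthOf("hail"));   // 4
        System.out.println(lengthOf(null));     // NullPointerException
    }
}
```

Every call here type-checks, so the compiler accepts all of them. Only the particular argument values decide whether an exception is thrown.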

Surprise: Primitive Types Are Not True Numbers


One trap in Java – and many other programming languages – is that
its primitive numeric types have corner cases that do not behave like
the integers and real numbers we’re used to. As a result, some
errors that really should be dynamically checked are not checked at
all. Here are the traps:

Integer division. 5/2 does not return a fraction, it returns a
truncated integer. So this is an example of where what we might
have hoped would be a dynamic error (because a fraction isn’t
representable as an integer) frequently produces the wrong
answer instead.

Integer overflow. The int and long types are actually finite sets
of integers, with maximum and minimum values. What happens
when you do a computation whose answer is too positive or too
negative to fit in that finite range? The computation quietly
overflows (wraps around), and returns an integer from
somewhere in the legal range but not the right answer.

Special values in floating-point types. Floating-point types like
double have several special values that aren’t real numbers: NaN
(which stands for “Not a Number”), POSITIVE_INFINITY, and
NEGATIVE_INFINITY. So when you apply certain operations to a double
that you’d expect to produce dynamic errors, like dividing by zero
or taking the square root of a negative number, you will get one of
these special values instead. If you keep computing with it, you’ll
end up with a bad final answer.
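The three traps are easy to reproduce. Here is a small sketch (the class name NumericTraps is made up for this example) that runs without any error from the compiler or the runtime, yet produces values that are not ordinary arithmetic results.

```java
public class NumericTraps {
    public static void main(String[] args) {
        // Integer division truncates instead of producing a fraction.
        System.out.println(5 / 2);                      // 2
        System.out.println(5 / 2.0);                    // 2.5

        // Integer overflow wraps around silently.
        int max = Integer.MAX_VALUE;                    // 2147483647
        System.out.println(max + 1);                    // -2147483648

        // Floating-point operations return special values instead of failing.
        double zero = 0.0;
        System.out.println(1.0 / zero);                 // Infinity
        System.out.println(-1.0 / zero);                // -Infinity
        System.out.println(Math.sqrt(-1.0));            // NaN
        System.out.println(Double.isNaN(zero / zero));  // true
    }
}
```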
reading exercises

Let’s try some examples of buggy code and see how they behave in
Java. Are these bugs caught statically, dynamically, or not at all?

int n = 5;
if (n) {
n = n + 1;
}

static error
dynamic error
no error, wrong answer


int big = 200000; // 200,000
big = big * big; // big should be 4 billion now

static error
dynamic error
no error, wrong answer


double probability = 1/5;


static error
dynamic error
no error, wrong answer


int sum = 0;
int n = 0;
int average = sum/n;

static error
dynamic error
no error, wrong answer


double sum = 7;
double n = 0;
double average = sum/n;

static error
dynamic error
no error, wrong answer

Arrays and Collections

Let’s change our hailstone computation so that it stores the sequence
in a data structure, instead of just printing it out. Java has two kinds
of list-like types that we could use: arrays and Lists.

Arrays are fixed-length sequences of another type T. For example,
here’s how to declare an array variable and construct an array value
to assign to it:

int[] a = new int[100];

The int[] array type includes all possible array values, but a
particular array value, once created, can never change its length.
Operations on array types include:

indexing: a[2]
assignment: a[2]=0
length: a.length (note that this is different syntax from
String.length() – a.length is not a method call, so you don’t put
parentheses after it)

Here’s a crack at the hailstone code using an array. We start by
constructing the array, and then use an index variable i to step
through the array, storing values of the sequence as we generate
them.

int[] a = new int[100]; // <==== DANGER WILL ROBINSON
int i = 0;
int n = 3;
while (n != 1) {
    a[i] = n;
    i++; // very common shorthand for i=i+1
    if (n % 2 == 0) {
        n = n / 2;
    } else {
        n = 3 * n + 1;
    }
}
a[i] = n;
i++;

Something should immediately smell wrong in this approach. What’s
that magic number 100? What would happen if we tried an n that
turned out to have a very long hailstone sequence? It wouldn’t fit in a
length-100 array. We have a bug. Would Java catch the bug
statically, dynamically, or not at all? Incidentally, bugs like these –
overflowing a fixed-length array, a structure commonly used in
less-safe languages like C and C++ that don’t do automatic runtime
checking of array accesses – have been responsible for a large
number of network security breaches and internet worms.
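One way to find out how Java treats the overflow is to run the same loop against an array that is too small. The sketch below is an illustration, not code from the reading: the class ArrayHailstone, the method fill, and the convention of returning -1 are all hypothetical choices made here.

```java
public class ArrayHailstone {
    /**
     * Stores the hailstone sequence starting at n (assumed positive) in an
     * array of the given capacity.
     * @return the number of values stored, or -1 if the sequence did not fit
     */
    public static int fill(int n, int capacity) {
        int[] a = new int[capacity];
        int i = 0;
        try {
            while (n != 1) {
                a[i] = n;          // checked at runtime: fails when i == a.length
                i++;
                if (n % 2 == 0) {
                    n = n / 2;
                } else {
                    n = 3 * n + 1;
                }
            }
            a[i] = n;
            i++;
            return i;
        } catch (ArrayIndexOutOfBoundsException e) {
            return -1;             // Java detected the out-of-range index
        }
    }

    public static void main(String[] args) {
        System.out.println(fill(3, 100));   // 8
        System.out.println(fill(27, 100));  // -1, the sequence from 27 has more than 100 values
    }
}
```

The compiler accepts the code for any capacity. The out-of-range store is only detected when the program actually reaches it.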

Instead of a fixed-length array, let’s use the List type. Lists are
variable-length sequences of another type T . Here’s how we can
declare a List variable and make a list value:

List<Integer> list = new ArrayList<Integer>();

And here are some of its operations:

indexing: list.get(2)
assignment: list.set(2, 0)
length: list.size()

Note that List is an interface, a type that can’t be constructed directly
with new, but that instead specifies the operations that a List must
provide. We’ll talk about this notion in a future class on abstract data
types. ArrayList is a class, a concrete type that provides
implementations of those operations. ArrayList isn’t the only
implementation of the List type, though it’s the most commonly used
one. LinkedList is another. Check them out in the Java API
documentation, which you can find by searching the web for “Java 8
API”. Get to know the Java API docs; they’re your friend. (“API”
means “application programmer interface,” and is commonly used as
a synonym for “library.”)

Note also that we wrote List<Integer> instead of List<int>.
Unfortunately we can’t write List<int> in direct analog to int[]. Lists
only know how to deal with object types, not primitive types. In Java,
each of the primitive types (which are written in lowercase and often
abbreviated, like int) has an equivalent object type (which is
capitalized, and fully spelled out, like Integer). Java requires us to use
these object type equivalents when we parameterize a type with
angle brackets. But in other contexts, Java automatically converts
between int and Integer, so we can write Integer i = 5 without any
type error.
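The List operations and the automatic conversion between int and Integer can be seen together in a short sketch. The class ListBasics and the method demo are names invented for this illustration; add, set, get, and size are standard List methods.

```java
import java.util.ArrayList;
import java.util.List;

public class ListBasics {
    /** Builds a small list using the List operations described above. */
    public static List<Integer> demo() {
        List<Integer> list = new ArrayList<Integer>();
        list.add(3);       // the int 3 is converted to an Integer automatically
        list.add(10);
        list.add(5);
        list.set(2, 0);    // assignment: the element at index 2 becomes 0
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = demo();
        int first = list.get(0);            // Integer converted back to int
        Integer boxed = 5;                  // int to Integer, no type error
        System.out.println(list);           // [3, 10, 0]
        System.out.println(first);          // 3
        System.out.println(list.size());    // 3
        System.out.println(boxed + 1);      // 6
    }
}
```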

Here’s the hailstone code written with Lists:

List<Integer> list = new ArrayList<Integer>();
int n = 3;
while (n != 1) {
    list.add(n);
    if (n % 2 == 0) {
        n = n / 2;
    } else {
        n = 3 * n + 1;
    }
}
list.add(n);

Not only simpler but safer too, because the List automatically
enlarges itself to fit as many numbers as you add to it (until you run
out of memory, of course).

Iterating

A for loop steps through the elements of an array or a list, just as in
Python, though the syntax looks a little different. For example:

// find the maximum point of a hailstone sequence stored in list
int max = 0;
for (int x : list) {
    max = Math.max(x, max);
}

You can iterate through arrays as well as lists. The same code would
work if the list were replaced by an array.

Math.max() is a handy function from the Java API. The Math class is full
of useful functions like this – search for “java 8 Math” on the web to
find its documentation.
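As a quick check of the claim that the same loop works for arrays, here is the loop applied to an int[] instead of a List. The class MaxOfArray and the method name max are hypothetical names for this sketch.

```java
public class MaxOfArray {
    /** Finds the maximum point of a hailstone sequence stored in an array. */
    public static int max(int[] values) {
        int max = 0;
        for (int x : values) {     // same for-loop syntax as with a List
            max = Math.max(x, max);
        }
        return max;
    }

    public static void main(String[] args) {
        int[] sequence = {3, 10, 5, 16, 8, 4, 2, 1};
        System.out.println(max(sequence));   // 16
    }
}
```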

Methods

In Java, statements generally have to be inside a method, and every
method has to be in a class, so the simplest way to write our
hailstone program looks like this:

import java.util.ArrayList;
import java.util.List;

public class Hailstone {
    /**
     * Compute a hailstone sequence.
     * @param n Starting number for sequence. Assumes n > 0.
     * @return hailstone sequence starting with n and ending with 1.
     */
    public static List<Integer> hailstoneSequence(int n) {
        List<Integer> list = new ArrayList<Integer>();
        while (n != 1) {
            list.add(n);
            if (n % 2 == 0) {
                n = n / 2;
            } else {
                n = 3 * n + 1;
            }
        }
        list.add(n);
        return list;
    }
}

Let’s explain a few of the new things here.

public means that any code, anywhere in your program, can refer to
the class or method. Other access modifiers, like private, are used to
get more safety in a program, and to guarantee immutability for
immutable types. We’ll talk more about them in an upcoming class.

static means that the method doesn’t take a self parameter (which
in Java is implicit anyway; you won’t ever see it as a method
parameter). Static methods aren’t called on an object. Contrast that
with the List add() method or the String length() method, for example,
which require an object to come first. Instead, the right way to call a
static method uses the class name instead of an object reference:

Hailstone.hailstoneSequence(83)

Take note also of the comment before the method, because it’s very
important. This comment is a specification of the method, describing
the inputs and outputs of the operation. The specification should be
concise and clear and precise. The comment provides information
that is not already clear from the method types. It doesn’t say, for
example, that n is an integer, because the int n declaration just below
already says that. But it does say that n must be positive, which is
not captured by the type declaration but is very important for the
caller to know.

We’ll have a lot more to say about how to write good specifications in
a few classes, but you’ll have to start reading them and using them
right away.

Mutating Values vs. Reassigning Variables


The next reading will introduce snapshot diagrams to give us a way
to visualize the distinction between changing a variable and changing
a value. When you assign to a variable, you’re changing where the
variable’s arrow points. You can point it to a different value.

When you assign to the contents of a mutable value – such as an
array or list – you’re changing references inside that value.
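The difference shows up as soon as two variables refer to the same list. In this sketch (the class name ReassignVsMutate and the variable names are made up for illustration), mutating the list is visible through both variables, while reassigning one variable leaves the other untouched.

```java
import java.util.ArrayList;
import java.util.List;

public class ReassignVsMutate {
    public static void main(String[] args) {
        List<Integer> first = new ArrayList<Integer>();
        List<Integer> alias = first;        // both variables point to the same list

        first.add(1);                       // mutates the list value
        System.out.println(alias);          // [1]

        first = new ArrayList<Integer>();   // reassigns the variable first
        first.add(2);
        System.out.println(first);          // [2]
        System.out.println(alias);          // [1]
    }
}
```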

Change is a necessary evil. Good programmers avoid things that
change, because they may change unexpectedly.

Immutability (immunity from change) is a major design principle in this
course. Immutable types are types whose values can never change
once they have been created. (At least not in a way that’s visible to
the outside world – there are some subtleties there that we’ll talk
more about in a future class about immutability.) Which of the types
we’ve discussed so far are immutable, and which are mutable?

Java also gives us immutable references: variables that are assigned
once and never reassigned. To make a reference immutable, declare
it with the keyword final:

final int n = 5;

If the Java compiler isn’t convinced that your final variable will only be
assigned once at runtime, then it will produce a compiler error. So
final gives you static checking for immutable references.
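A final variable cannot be reassigned, but if it refers to a mutable value, that value can still be mutated. The following sketch illustrates this; the class FinalDemo and the method build are hypothetical names, and the commented-out lines are the ones the compiler would reject.

```java
import java.util.ArrayList;
import java.util.List;

public class FinalDemo {
    /** Builds a list through a final reference. */
    public static List<Integer> build() {
        final int n = 5;
        // n = n + 1;                        // compile error: n is final

        final List<Integer> list = new ArrayList<Integer>();
        list.add(3);                         // allowed: mutates the list value
        list.add(10);
        list.add(n);
        // list = new ArrayList<Integer>();  // compile error: list is final
        return list;
    }

    public static void main(String[] args) {
        System.out.println(build());         // [3, 10, 5]
    }
}
```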

It’s good practice to use final for declaring the parameters of a
method and as many local variables as possible. Like the type of the
variable, these declarations are important documentation, useful to
the reader of the code and statically checked by the compiler.

There are two variables in our hailstoneSequence method: can we
declare them final, or not?

public static List<Integer> hailstoneSequence(final int n) {
    final List<Integer> list = new ArrayList<Integer>();

Documenting Assumptions

Writing the type of a variable down documents an assumption about
it: e.g., this variable will always refer to an integer. Java actually
checks this assumption at compile time, and guarantees that there’s
no place in your program where you violated this assumption.

Declaring a variable final is also a form of documentation, a claim that
the variable will never change after its initial assignment. Java checks
that too, statically.

We documented another assumption that Java (unfortunately) doesn’t
check automatically: that n must be positive.
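Since Java won’t check this assumption for us, one option is to check it ourselves at runtime. The sketch below is not part of the original Hailstone class: the class name CheckedHailstone and the decision to throw IllegalArgumentException are assumptions made for this example, turning a silent violation into a dynamic error.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckedHailstone {
    /**
     * Compute a hailstone sequence.
     * @param n Starting number for sequence. Requires n > 0.
     * @return hailstone sequence starting with n and ending with 1.
     * @throws IllegalArgumentException if n is not positive (check added for this sketch)
     */
    public static List<Integer> hailstoneSequence(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive, but was " + n);
        }
        List<Integer> list = new ArrayList<Integer>();
        while (n != 1) {
            list.add(n);
            if (n % 2 == 0) {
                n = n / 2;
            } else {
                n = 3 * n + 1;
            }
        }
        list.add(n);
        return list;
    }

    public static void main(String[] args) {
        System.out.println(hailstoneSequence(3));   // [3, 10, 5, 16, 8, 4, 2, 1]
        try {
            hailstoneSequence(0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());     // n must be positive, but was 0
        }
    }
}
```

Without the check, a call with n = 0 would loop forever, because 0 is even and 0 / 2 is still 0.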

Why do we need to write down our assumptions? Because
programming is full of them, and if we don’t write them down, we
won’t remember them, and other people who need to read or change
our programs later won’t know them. They’ll have to guess.

Programs have to be written with two goals in mind:

communicating with the computer. First persuading the compiler
that your program is sensible – syntactically correct and
type-correct. Then getting the logic right so that it gives the right
results at runtime.
communicating with other people. Making the program easy to
understand, so that when somebody has to fix it, improve it, or
adapt it in the future, they can do so.

Hacking vs. Engineering


We’ve written some hacky code in this class. Hacking is often marked
by unbridled optimism:

Bad: writing lots of code before testing any of it
Bad: keeping all the details in your head, assuming you’ll
remember them forever, instead of writing them down in your
code
Bad: assuming that bugs will be nonexistent or else easy to find
and fix

But software engineering is not hacking. Engineers are pessimists:

Good: write a little bit at a time, testing as you go. In a future
class, we’ll talk about test-first programming.
Good: document the assumptions that your code depends on
Good: defend your code against stupidity – especially your own!
Static checking helps with that.

Our primary goal in this course is learning how to produce software
that is:

Safe from bugs. Correctness (correct behavior right now), and
defensiveness (correct behavior in the future).
Easy to understand. Has to communicate to future programmers
who need to understand it and make changes in it (fixing bugs or
adding new features). That future programmer might be you,
months or years from now. You’ll be surprised how much you
forget if you don’t write it down, and how much it helps your own
future self to have a good design.
Ready for change. Software always changes. Some designs
make it easy to make changes; others require throwing away and
rewriting a lot of code.

There are other important properties of software (like performance,
usability, security), and they may trade off against these three. But
these are the Big Three that we care about in 6.005, and that
software developers generally put foremost in the practice of building
software. It’s worth considering every language feature, every
programming practice, every design pattern that we study in this
course, and understanding how they relate to the Big Three.

Why we use Java in this course


Since you’ve had 6.01, we’re assuming that you’re comfortable with
Python. So why aren’t we using Python in this course? Why do we
use Java in 6.005?

Safety is the first reason. Java has static checking (primarily type
checking, but other kinds of static checks too, like that your code
returns values from methods declared to do so). We’re studying
software engineering in this course, and safety from bugs is a key
tenet of that approach. Java dials safety up to 11, which makes it a
good language for learning about good software engineering
practices. It’s certainly possible to write safe code in dynamic
languages like Python, but it’s easier to understand what you need to
do if you learn how in a safe, statically-checked language.

Ubiquity is another reason. Java is widely used in research,
education, and industry. Java runs on many platforms, not just
Windows/Mac/Linux. Java can be used for web programming (both
on the server and in the client), and native Android programming is
done in Java. Although other programming languages are far better
suited to teaching programming (Scheme and ML come to mind),
regrettably these languages aren’t as widespread in the real world.
Java on your resume will be recognized as a marketable skill. But
don’t get us wrong: the real skills you’ll get from this course are not
Java-specific, but carry over to any language that you might program
in. The most important lessons from this course will survive language
fads: safety, clarity, abstraction, engineering instincts.

In any case, a good programmer must be multilingual. Programming
languages are tools, and you have to use the right tool for the job.
You will certainly have to pick up other programming languages
before you even finish your MIT career (JavaScript, C/C++, Scheme
or Ruby or ML or Haskell), so we’re getting started now by learning a
second one.

As a result of its ubiquity, Java has a wide array of interesting and
useful libraries (both its enormous built-in library, and other libraries
out on the net), and excellent free tools for development (IDEs like
Eclipse, editors, compilers, test frameworks, profilers, code
coverage, style checkers). Even Python is still behind Java in the
richness of its ecosystem.

There are some reasons to regret using Java. It’s wordy, which
makes it hard to write examples on the board. It’s large, having
accumulated many features over the years. It’s internally inconsistent
(e.g. the final keyword means different things in different contexts,
and the static keyword in Java has nothing to do with static
checking). It’s weighted with the baggage of older languages like
C/C++ (the primitive types and the switch statement are good
examples). It has no interpreter like Python’s, where you can learn by
playing with small bits of code.

But on the whole, Java is a reasonable choice of language right now
to learn how to write code that is safe from bugs, easy to
understand, and ready for change. And that’s our goal.

Summary

The main idea we introduced today is static checking. Here’s how
this idea relates to the goals of the course:

Safe from bugs. Static checking helps with safety by catching
type errors and other bugs before runtime.

Easy to understand. It helps with understanding, because types
are explicitly stated in the code.

Ready for change. Static checking makes it easier to change
your code by identifying other places that need to change in
tandem. For example, when you change the name or type of a
variable, the compiler immediately displays errors at all the places
where that variable is used, reminding you to update them as well.

Reading 2: Basic Java

Objectives

Learn basic Java syntax and semantics


Transition from writing Python to writing Java

Software in 6.005

Safe from bugs          Easy to understand          Ready for change
Correct today and ...   Communicating clearly ...   Designed to ...

Suppose we’re editing the body of a function in Java, declaring and using
local variables.

int a = 5;          // (1)
if (a > 10) {       // (2)
    int b = 2;      // (3)
} else {            // (4)
    int b = 4;      // (5)
}                   // (6)
b *= 3;             // (7)

Which line of Java code causes a compilation error?


Fix the bug

Select the smallest set of changes that will fix the bug:
Declare int b; after line 1
Assign b = 0; before line 2
Assign b = 2; instead of line 3
Assign b = 4; instead of line 5
Declare and assign int b *= 3; instead of line 7


Who are you again?

If we make the required changes above, what will happen if we comment
out the body of the else clause from the if-else?
b will be 0
b will be 3
b will be 6
We will receive an error from the Java compiler, before we run the
program
We will receive an error when we run the program, before we reach the
last line
We will receive an error when we run the program, when we reach the
last line


Numbers and strings

Read Numbers and Strings.

Don’t worry if you find the Number wrapper classes confusing. They are.

You should be able to answer the questions on both Questions and
Exercises pages.

Questions: Numbers
Questions: Characters, Strings

reading exercises

Numbers and strings


Does this Python code give an accurate conversion from Fahrenheit to
Celsius?

fahrenheit = 212.0
celsius = (fahrenheit - 32) * 5/9

Yes
No: integer arithmetic will cause celsius to be zero
No: integer arithmetic will cause celsius to be rounded down

Double shot
Rewrite the first line in Java:
int fahrenheit = 212.0;
Integer fahrenheit = 212.0;
float fahrenheit = 212.0;
Float fahrenheit = 212.0;
double fahrenheit = 212.0;
Double fahrenheit = 212.0;
And the second line, where ??? is the same type you selected above:
??? celsius = (fahrenheit - 32) * 5/9;
??? celsius = (fahrenheit - 32) * (5 / 9);
??? celsius = (fahrenheit - 32) * (5. / 9);


Fit to print
How should we print the result?
System.out.println(fahrenheit, " -> ", celsius);
System.out.println(fahrenheit + " -> " + celsius);
System.out.println("%s -> %s" % (fahrenheit, celsius));
System.out.println(Double.toString(fahrenheit) + " -> " + Double.toString(celsius));


Classes and objects

Read Classes and Objects.

You should be able to answer the questions on the first two Questions and
Exercises pages.
Questions: Classes
Questions: Objects

Don’t worry if you don’t understand everything in Nested Classes and Enum
Types right now. You can go back to those constructs later in the semester
when we see them in class.

reading exercises

Classes and objects

class Tortoise:
    def __init__(self):
        self.position = 0

    def forward(self):
        self.position += 1

pokey = Tortoise()
pokey.forward()
print(pokey.position)

If we translate Tortoise to Java, how do we declare the class?


Under construction

In Python we declare an __init__ function to initialize new objects.

What will the equivalent declaration look like in the Java Tortoise?

And how can we obtain a reference to a new Tortoise object?

Methodical

To declare the forward method on Tortoise objects in Java:

public void forward() {
    // self.position += 1 (Python)
}

What’s the appropriate line of code for the body of the method? (check all
that apply)
position += 1;
self.position += 1;
this.position += 1;
Tortoise.position += 1;


On your mark

In Python, we used self.position = 0 to give Tortoise objects a position that
starts at zero.

In Java, we can do this either in one line:

Which of the options initializes position in one line?

public class Tortoise {

    private int position = 0;          // (1)
    static int position = 0;           // (2)

    public Tortoise() {
        int position = 0;              // (3)
        int self.position = 0;         // (4)
        int this.position = 0;         // (5)
        int Tortoise.position = 0;     // (6)
    }
    // ...
}

1
2
3
4
5
6

… or in a combination of lines:

Which of the options initializes position using two lines?

public class Tortoise {

    private int position;              // (1)
    static int position;               // (2)

    public Tortoise() {
        self.position = 0;             // (3)
        this.position = 0;             // (4)
        Tortoise.position = 0;         // (5)
    }
    // ...
}

1
2
3
4
5


Hello, world!

Read Hello World!

You should be able to create a new HelloWorldApp.java file, enter the code
from that tutorial page, and compile and run the program to see Hello World!
on the console.

Snapshot diagrams

It will be useful for us to draw pictures of what’s happening at runtime, in
order to understand subtle questions. Snapshot diagrams represent the
internal state of a program at runtime – its stack (methods in progress and
their local variables) and its heap (objects that currently exist).

Here’s why we use snapshot diagrams in 6.005:

To talk to each other through pictures (in class and in team meetings)
To illustrate concepts like primitive types vs. object types, immutable
values vs. immutable references, pointer aliasing, stack vs. heap,
abstractions vs. concrete representations.
To help explain your design for your team project (with each other and
with your TA).
To pave the way for richer design notations in subsequent courses. For
example, snapshot diagrams generalize into object models in 6.170.

Although the diagrams in this course use examples from Java, the notation
can be applied to any modern programming language, e.g., Python,
JavaScript, C++, Ruby.

Primitive values

Primitive values are represented by bare constants. The incoming arrow is a
reference to the value from a variable or an object field.

Object values

An object value is a circle labeled by its type. When we want to show more
detail, we write field names inside it, with arrows pointing out to their values.
For still more detail, the fields can include their declared types. Some
people prefer to write x:int instead of int x , but both are fine.

Mutating values vs. reassigning variables

Snapshot diagrams give us a way to visualize the distinction between
changing a variable and changing a value:

When you assign to a variable or a field, you’re changing where the
variable’s arrow points. You can point it to a different value.

When you assign to the contents of a mutable value – such as an array
or list – you’re changing references inside that value.

Reassignment and immutable values


For example, if we have a String variable s , we can reassign it from a value
of "a" to "ab" .

String s = "a";
s = s + "b";

String is an example of an immutable type, a type whose values can never
change once they have been created. Immutability (immunity from change)
is a major design principle in this course, and we’ll talk much more about it in
future readings.

Immutable objects (intended by their designer to always represent the same
value) are denoted in a snapshot diagram by a double border, like the String
objects in our diagram.

Mutable values

By contrast, StringBuilder (another built-in Java class) is a mutable object
that represents a string of characters, and it has methods that change the
value of the object:

StringBuilder sb = new StringBuilder("a");
sb.append("b");

These two snapshot diagrams look very different, which is good: the
difference between mutability and immutability will play an important role in
making our code safe from bugs.
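The difference between re-pointing a variable and mutating an object becomes observable as soon as a second variable refers to the same object. The following is a minimal sketch, not part of the original reading: the class name MutationDemo and the alias variables t and tb are assumptions added only to make the effect visible.

```java
class MutationDemo {
    /** Reassigns a String variable and returns what an earlier alias still sees. */
    static String reassignString() {
        String s = "a";
        String t = s;     // t points to the same String object as s
        s = s + "b";      // builds a new String and re-points s; the old object is untouched
        return t;         // the alias still sees "a"
    }

    /** Mutates a StringBuilder and returns what an earlier alias now sees. */
    static String mutateBuilder() {
        StringBuilder sb = new StringBuilder("a");
        StringBuilder tb = sb;   // tb is an alias for the same object
        sb.append("b");          // changes the object itself
        return tb.toString();    // the alias sees the change
    }

    public static void main(String[] args) {
        System.out.println(reassignString()); // prints a
        System.out.println(mutateBuilder());  // prints ab
    }
}
```

In snapshot-diagram terms, the first method moves the arrow for s while t keeps pointing at the double-bordered "a"; the second method leaves both arrows in place and changes the contents of the single object they share.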

Immutable references

Java also gives us immutable references: variables that are assigned once
and never reassigned. To make a reference immutable, declare it with the
keyword final:

final int n = 5;

If the Java compiler isn’t convinced that your final variable will only be
assigned once at runtime, then it will produce a compiler error. So final
gives you static checking for immutable references.

In a snapshot diagram, an immutable reference (final) is denoted by a
double arrow. Here’s an object whose id never changes (it can’t be
reassigned to a different number), but whose age can change.

Notice that we can have an immutable reference to a mutable value (for
example: final StringBuilder sb) whose value can change even though we’re
pointing to the same object.

We can also have a mutable reference to an immutable value (like
String s), where the value of the variable can change because it can be
re-pointed to a different object.
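The two combinations just described can be tried directly in code. This is a small sketch under assumed names (FinalDemo and its methods are hypothetical); the commented-out line marks the assignment that the compiler would reject.

```java
class FinalDemo {
    /** Immutable reference to a mutable value: the object changes, the arrow does not. */
    static String appendThroughFinalReference() {
        final StringBuilder sb = new StringBuilder("a");
        sb.append("b");                    // allowed: mutates the object sb points to
        // sb = new StringBuilder("c");    // would not compile: sb is final
        return sb.toString();
    }

    /** Mutable reference to an immutable value: the arrow moves, the objects do not change. */
    static String repointMutableReference() {
        String s = "a";   // s is not final, so it may be re-pointed
        s = "ab";         // the String "a" itself is never modified
        return s;
    }

    public static void main(String[] args) {
        System.out.println(appendThroughFinalReference()); // prints ab
        System.out.println(repointMutableReference());     // prints ab
    }
}
```

Both methods end with the text "ab", but they get there differently: the first by changing one object behind a fixed reference, the second by moving a reference between two unchanging objects.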
Java Collections

The very first Language Basics tutorial discussed arrays, which are fixed-
length containers for a sequence of objects or primitive values. Java
provides a number of more powerful and flexible tools for managing
collections of objects: the Java Collections Framework.

Lists, Sets, and Maps

A Java List is similar to a Python list. A List contains an ordered
collection of zero or more objects, where the same object might appear
multiple times. We can add and remove items to and from the List, which
will grow and shrink to accommodate its contents.

Example List operations:

Java                      description                     Python
int count = lst.size();   count the number of elements    count = len(lst)
lst.add(e);               append an element to the end    lst.append(e)
if (lst.isEmpty()) ...    test if the list is empty       if not lst: ...

In a snapshot diagram, we represent a List as an object with indices drawn
as fields:

This list of cities might represent a trip from Boston to Bogotá to Barcelona.

A Set is an unordered collection of zero or more unique objects. Like a
mathematical set or a Python set – and unlike a List – an object cannot
appear in a set multiple times. Either it’s in or it’s out.

Example Set operations:


Java                  description                           Python
s1.contains(e)        test if the set contains an element   e in s1
s1.containsAll(s2)    test whether s1 ⊇ s2                  s1.issuperset(s2)
                                                            s1 >= s2
s1.removeAll(s2)      remove s2 from s1                     s1.difference_update(s2)
                                                            s1 -= s2

In a snapshot diagram, we represent a Set as an object with no-name fields:

Here we have a set of integers, in no particular order: 42, 1024, and -7.

A Map is similar to a Python dictionary. In Python, the keys of a dictionary
must be hashable. Java has a similar requirement that we’ll discuss when we
confront how equality works between Java objects.

Example Map operations:

Java                    description                        Python
map.put(key, val)       add the mapping key → val          map[key] = val
map.get(key)            get the value for a key            map[key]
map.containsKey(key)    test whether the map has a key     key in map
map.remove(key)         delete a mapping                   del map[key]
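To see the operations from the three tables working together, here is a short sketch using ArrayList, HashSet, and HashMap as the concrete implementations. That choice is an assumption, since the text above only names the List, Set, and Map interfaces. The class name CollectionsDemo and the contents of the map are made up for illustration; the city list and the integer set reuse the values mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CollectionsDemo {
    /** Builds the three-city trip as an ordered List. */
    static List<String> trip() {
        List<String> cities = new ArrayList<>();
        cities.add("Boston");
        cities.add("Bogot\u00e1");   // Bogota, with the accented a written as a Unicode escape
        cities.add("Barcelona");
        return cities;
    }

    /** Builds the set {42, 1024, -7}; adding 42 a second time has no effect. */
    static Set<Integer> numbers() {
        Set<Integer> s = new HashSet<>();
        s.add(42);
        s.add(1024);
        s.add(-7);
        s.add(42);
        return s;
    }

    /** Builds a small Map; these keys and values are hypothetical. */
    static Map<String, Integer> inventory() {
        Map<String, Integer> m = new HashMap<>();
        m.put("apple", 3);
        m.put("banana", 5);
        m.put("apple", 4);     // put on an existing key replaces its value
        m.remove("banana");    // delete a mapping
        return m;
    }

    public static void main(String[] args) {
        System.out.println(trip().size());                   // prints 3
        System.out.println(numbers().contains(1024));        // prints true
        System.out.println(inventory().containsKey("apple")); // prints true
    }
}
```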

In a snapshot diagram, we represent a Map as an object that contains
key/value pairs:

Reading 3: Testing

Validation

Testing is an example of a more general process called validation. The purpose of validation is to
uncover problems in a program and thereby increase your confidence in the program’s correctness.
Validation includes:

Formal reasoning about a program, usually called verification. Verification constructs a formal
proof that a program is correct. Verification is tedious to do by hand, and automated tool support
for verification is still an active area of research. Nevertheless, small, crucial pieces of a program
may be formally verified, such as the scheduler in an operating system, or the bytecode interpreter
in a virtual machine, or the filesystem in an operating system.
Code review. Having somebody else carefully read your code, and reason informally about it, can
be a good way to uncover bugs. It’s much like having somebody else proofread an essay you have
written. We’ll talk more about code review in the next reading.
Testing. Running the program on carefully selected inputs and checking the results.

Even with the best validation, it’s very hard to achieve perfect quality in software. Here are some
typical residual defect rates (bugs left over after the software has shipped) per kloc (one thousand
lines of source code):

1 - 10 defects/kloc: Typical industry software.
0.1 - 1 defects/kloc: High-quality validation. The Java libraries might achieve this level of
correctness.
0.01 - 0.1 defects/kloc: The very best, safety-critical validation. NASA and companies like Praxis
can achieve this level.

This can be discouraging for large systems. For example, if you have shipped a million lines of typical
industry source code (1 defect/kloc), it means you missed 1000 bugs!

Why Software Testing is Hard

Here are some approaches that unfortunately don’t work well in the world of software.

Exhaustive testing is infeasible. The space of possible test cases is generally too big to cover
exhaustively. Imagine exhaustively testing a 32-bit floating-point multiply operation, a*b . There are 2^64
test cases!
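To put the 2^64 figure in perspective, the sketch below works out how long an exhaustive run would take. The rate of one billion tests per second is purely an assumption chosen for illustration, as are the class and method names.

```java
import java.math.BigInteger;

class ExhaustiveCost {
    /** Number of (a, b) pairs for a binary operation on 32-bit operands: 2^32 * 2^32 = 2^64. */
    static BigInteger testCases() {
        return BigInteger.valueOf(2).pow(64);
    }

    /** Whole years needed to run every case at the given (hypothetical) rate. */
    static BigInteger yearsAt(long testsPerSecond) {
        BigInteger secondsPerYear = BigInteger.valueOf(365L * 24 * 60 * 60);
        return testCases()
                .divide(BigInteger.valueOf(testsPerSecond))
                .divide(secondsPerYear);
    }

    public static void main(String[] args) {
        System.out.println(testCases());                // prints 18446744073709551616
        System.out.println(yearsAt(1_000_000_000L));    // prints 584
    }
}
```

Even at that optimistic rate, covering one simple arithmetic operation would take several centuries, which is why the rest of this reading focuses on choosing a small number of test cases well.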

Haphazard testing (“just try it and see if it works”) is less likely to find bugs, unless the program is so
buggy that an arbitrarily-chosen input is more likely to fail than to succeed. It also doesn’t increase our
confidence in program correctness.

Random or statistical testing doesn’t work well for software. Other engineering disciplines can test
small random samples (e.g. 1% of hard drives manufactured) and infer the defect rate for the whole
production lot. Physical systems can use many tricks to speed up time, like opening a refrigerator
1000 times in 24 hours instead of 10 years. These tricks give known failure rates (e.g. mean lifetime of
a hard drive), but they assume continuity or uniformity across the space of defects. This is true for
physical artifacts.

But it’s not true for software. Software behavior varies discontinuously and discretely across the space
of possible inputs. The system may seem to work fine across a broad range of inputs, and then
abruptly fail at a single boundary point. The famous Pentium division bug affected approximately 1 in 9
billion divisions. Stack overflows, out of memory errors, and numeric overflow bugs tend to happen
abruptly, and always in the same way, not with probabilistic variation. That’s different from physical
systems, where there is often visible evidence that the system is approaching a failure point (cracks in
a bridge) or failures are distributed probabilistically near the failure point (so that statistical testing will
observe some failures even before the point is reached).

Instead, test cases must be chosen carefully and systematically, and that’s what we’ll look at next.

reading exercises

Testing basics

In the 1990s, the Ariane 5 launch vehicle, designed and built for the European Space Agency, self-
destructed 37 seconds after its first launch.

The reason was a control software bug that went undetected. The Ariane 5’s guidance software was
reused from the Ariane 4, which was a slower rocket. When the velocity calculation converted from a
64-bit floating point number (a double in Java terminology, though this software wasn’t written in Java)
to a 16-bit signed integer (a short ), it overflowed the small integer and caused an exception to be
thrown. The exception handler had been disabled for efficiency reasons, so the guidance software
crashed. Without guidance, the rocket crashed too. The cost of the failure was $1 billion.

What ideas does this story demonstrate?

Even high-quality safety-critical software may still have residual bugs.
Testing all possible inputs is the best solution to this problem.
Software exhibits discontinuous behavior, unlike many physically-engineered systems.
Static type checking could have detected this bug.


Putting on Your Testing Hat

Testing requires having the right attitude. When you’re coding, your goal is to make the program work,
but as a tester, you want to make it fail.

That’s a subtle but important difference. It is all too tempting to treat code you’ve just written as a
precious thing, a fragile eggshell, and test it very lightly just to see it work.

Instead, you have to be brutal. A good tester wields a sledgehammer and beats the program
everywhere it might be vulnerable, so that those vulnerabilities can be eliminated.

Test-first Programming

Test early and often. Don’t leave testing until the end, when you have a big pile of unvalidated code.
Leaving testing until the end only makes debugging longer and more painful, because bugs may be
anywhere in your code. It’s far more pleasant to test your code as you develop it.

In test-first programming, you write tests before you even write any code. The development of a single
function proceeds in this order:

1. Write a specification for the function.
2. Write tests that exercise the specification.
3. Write the actual code. Once your code passes the tests you wrote, you’re done.

The specification describes the input and output behavior of the function. It gives the types of the
parameters and any additional constraints on them (e.g. sqrt ’s parameter must be nonnegative). It also
gives the type of the return value and how the return value relates to the inputs. You’ve already seen
and used specifications on your problem sets in this class. In code, the specification consists of the
method signature and the comment above it that describes what it does. We’ll have much more to say
about specifications a few classes from now.

Writing tests first is a good way to understand the specification. The specification can be buggy, too —
incorrect, incomplete, ambiguous, missing corner cases. Trying to write tests can uncover these
problems early, before you’ve wasted time writing an implementation of a buggy spec.

Choosing Test Cases by Partitioning


Creating a good test suite is a challenging and interesting design problem. We want to pick a set of
test cases that is small enough to run quickly, yet large enough to validate the program.

To do this, we divide the input space into subdomains, each consisting of a set of inputs. Taken
together the subdomains completely cover the input space, so that every input lies in at least one
subdomain. Then we choose one test case from each subdomain, and that’s our test suite.

The idea behind subdomains is to partition the input space into sets of similar inputs on which the
program has similar behavior. Then we use one representative of each set. This approach makes the
best use of limited testing resources by choosing dissimilar test cases, and forcing the testing to
explore parts of the input space that random testing might not reach.

We can also partition the output space into subdomains (similar outputs on which the program has
similar behavior) if we need to ensure our tests will explore different parts of the output space. Most of
the time, partitioning the input space is sufficient.

Example: BigInteger.multiply()

Let’s look at an example. BigInteger is a class built into the Java library that can represent integers of
any size, unlike the primitive types int and long that have only limited ranges. BigInteger has a method
multiply that multiplies two BigInteger values together:

/**
 * @param val another BigInteger
 * @return a BigInteger whose value is (this * val).
 */
public BigInteger multiply(BigInteger val)

For example, here’s how it might be used:

BigInteger a = ...;
BigInteger b = ...;
BigInteger ab = a.multiply(b);

This example shows that even though only one parameter is explicitly shown in the method’s
declaration, multiply is actually a function of two arguments: the object you’re calling the method on ( a
in the example above), and the parameter that you’re passing in the parentheses ( b in this example). In
Python, the object receiving the method call would be explicitly named as a parameter called self in
the method declaration. In Java, you don’t mention the receiving object in the parameters, and it’s
called this instead of self .

So we should think of multiply as a function taking two inputs, each of type BigInteger , and producing
one output of type BigInteger :

multiply : BigInteger × BigInteger → BigInteger

So we have a two-dimensional input space, consisting of all the pairs of integers (a,b). Now let’s
partition it. Thinking about how multiplication works, we might start with these partitions:

a and b are both positive
a and b are both negative
a is positive, b is negative
a is negative, b is positive

There are also some special cases for multiplication that we should check: 0, 1, and -1.

a or b is 0, 1, or -1

Finally, as a suspicious tester trying to find bugs, we might suspect that the implementor of BigInteger
might try to make it faster by using int or long internally when possible, and only fall back to an
expensive general representation (like a list of digits) when the value is too big. So we should definitely
also try integers that are very big, bigger than the biggest long .

a or b is small
the absolute value of a or b is bigger than Long.MAX_VALUE, the biggest possible primitive integer in
Java, which is roughly 2^63.

Let’s bring all these observations together into a straightforward partition of the whole (a,b) space.
We’ll choose a and b independently from:

0
1
-1
small positive integer
small negative integer
huge positive integer
huge negative integer

So this will produce 7 × 7 = 49 partitions that completely cover the space of pairs of integers.

To produce the test suite, we would pick an arbitrary pair (a,b) from each square of the grid, for
example:

(a,b) = (-3, 25) to cover (small negative, small positive)
(a,b) = (0, 30) to cover (0, small positive)
(a,b) = (2^100, 1) to cover (huge positive, 1)
etc.

The figure at the right shows how the two-dimensional (a,b) space is divided by this partition, and the
points are test cases that we might choose to completely cover the partition.
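The 7 × 7 grid can be generated mechanically by choosing one representative per part and pairing every representative with every other. The sketch below does that. Because it has no independent oracle for the expected products, it checks properties that any correct product must satisfy (sign, commutativity, and dividing back); that substitution is an assumption of this sketch, and a real test suite would more likely compare against hand-computed expected values. The class name and the specific representatives other than -3, 25, and 2^100 are hypothetical.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;

class MultiplyPartitionTest {
    /** A value bigger than Long.MAX_VALUE, to force the general representation. */
    static final BigInteger HUGE = BigInteger.valueOf(2).pow(100);

    /** One representative per part: 0, 1, -1, small +, small -, huge +, huge -. */
    static final List<BigInteger> REPRESENTATIVES = Arrays.asList(
            BigInteger.ZERO, BigInteger.ONE, BigInteger.valueOf(-1),
            BigInteger.valueOf(25), BigInteger.valueOf(-3),
            HUGE, HUGE.negate());

    /** Checks one pair against properties that any correct product has. */
    static boolean check(BigInteger a, BigInteger b) {
        BigInteger ab = a.multiply(b);
        boolean signOk = ab.signum() == a.signum() * b.signum();
        boolean commutes = ab.equals(b.multiply(a));
        boolean dividesBack = b.signum() == 0 || ab.divide(b).equals(a);
        return signOk && commutes && dividesBack;
    }

    /** Runs the full Cartesian product and returns the number of failing pairs. */
    static int failures() {
        int failed = 0;
        for (BigInteger a : REPRESENTATIVES) {
            for (BigInteger b : REPRESENTATIVES) {
                if (!check(a, b)) {
                    failed++;
                }
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        int pairs = REPRESENTATIVES.size() * REPRESENTATIVES.size();
        System.out.println(pairs + " pairs, " + failures() + " failures");
    }
}
```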

Example: max()

Let’s look at another example from the Java library: the integer max() function, found in the Math class.

/**
 * @param a an argument
 * @param b another argument
 * @return the larger of a and b.
 */
public static int max(int a, int b)

Mathematically, this method is a function of the following type:

max : int × int → int

From the specification, it makes sense to partition this function as:

a < b
a = b
a > b

Our test suite might then be:

(a, b) = (1, 2) to cover a < b
(a, b) = (9, 9) to cover a = b
(a, b) = (-5, -6) to cover a > b

Include Boundaries in the Partition

Bugs often occur at boundaries between subdomains. Some examples:

0 is a boundary between positive numbers and negative numbers
the maximum and minimum values of numeric types, like int and double
emptiness (the empty string, empty list, empty array) for collection types
the first and last element of a collection

Why do bugs often happen at boundaries? One reason is that programmers often make off-by-one
mistakes (like writing <= instead of < , or initializing a counter to 0 instead of 1). Another is that some
boundaries may need to be handled as special cases in the code. Another is that boundaries may be
places of discontinuity in the code’s behavior. When an int variable grows beyond its maximum positive
value, for example, it abruptly becomes a negative number.
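The int overflow mentioned above is easy to reproduce. In this small sketch (the class and method names are hypothetical), plain addition wraps around silently, while Math.addExact, a standard library method, reports the same overflow by throwing ArithmeticException.

```java
class OverflowBoundary {
    /** Adds 1 to the largest int; ordinary int arithmetic wraps around without warning. */
    static int nextAfterMax() {
        int n = Integer.MAX_VALUE;
        return n + 1;
    }

    /** Returns true if Math.addExact detects the same overflow. */
    static boolean addExactThrows() {
        try {
            Math.addExact(Integer.MAX_VALUE, 1);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nextAfterMax() == Integer.MIN_VALUE); // prints true
        System.out.println(addExactThrows());                    // prints true
    }
}
```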

It’s important to include boundaries as subdomains in your partition, so that you’re choosing an input
from the boundary.

Let’s redo max : int × int → int.

Partition into:

relationship between a and b
    a < b
    a = b
    a > b
value of a
    a = 0
    a < 0
    a > 0
    a = minimum integer
    a = maximum integer
value of b
    b = 0
    b < 0
    b > 0
    b = minimum integer
    b = maximum integer

Now let’s pick test values that cover all these classes:

(1, 2) covers a < b, a > 0, b > 0
(-1, -3) covers a > b, a < 0, b < 0
(0, 0) covers a = b, a = 0, b = 0
(Integer.MIN_VALUE, Integer.MAX_VALUE) covers a < b, a = minint, b = maxint
(Integer.MAX_VALUE, Integer.MIN_VALUE) covers a > b, a = maxint, b = minint
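These five test values translate directly into executable checks against Math.max. The sketch below uses plain boolean checks rather than a testing framework so that it stays self-contained; the class and helper names are assumptions.

```java
class MaxBoundaryTest {
    /** Returns true if Math.max gives the expected answer for this case. */
    static boolean check(int a, int b, int expected) {
        return Math.max(a, b) == expected;
    }

    /** Runs the five cases chosen above, one per line. */
    static boolean runAll() {
        return check(1, 2, 2)                                                // a < b, a > 0, b > 0
            && check(-1, -3, -1)                                             // a > b, a < 0, b < 0
            && check(0, 0, 0)                                                // a = b, a = 0, b = 0
            && check(Integer.MIN_VALUE, Integer.MAX_VALUE, Integer.MAX_VALUE) // a = minint, b = maxint
            && check(Integer.MAX_VALUE, Integer.MIN_VALUE, Integer.MAX_VALUE); // a = maxint, b = minint
    }

    public static void main(String[] args) {
        System.out.println(runAll() ? "all tests OK" : "some test failed");
    }
}
```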

Two Extremes for Covering the Partition

After partitioning the input space, we can choose how exhaustive we want the test suite to be:

Full Cartesian product.
Every legal combination of the partition dimensions is covered by one test case. This is what we
did for the multiply example, and it gave us 7 × 7 = 49 test cases. For the max example that
included boundaries, which has three dimensions with 3 parts, 5 parts, and 5 parts respectively, it
would mean up to 3 × 5 × 5 = 75 test cases. In practice not all of these combinations are possible,
however. For example, there’s no way to cover the combination a < b, a = 0, b = 0, because if a and b
are both zero then a cannot also be less than b.

Cover each part.
Every part of each dimension is covered by at least one test case, but not necessarily every
combination. With this approach, the test suite for max might be as small as 5 test cases if carefully
chosen. That’s the approach we took above, which allowed us to choose 5 test cases.

Often we strike some compromise between these two extremes, based on human judgement and
caution, and influenced by whitebox testing and code coverage tools, which we look at next.

reading exercises

Partitioning

Consider the following specification:

/**
 * Reverses the end of a string.
 *
 *                          012345                     012345
 * For example: reverseEnd("Hello, world", 5) returns "Hellodlrow ,"
 *                               <----->                    <----->
 *
 * With start == 0, reverses the entire text.
 * With start == text.length(), reverses nothing.
 *
 * @param text non-null String that will have its end reversed
 * @param start the index at which the remainder of the input is reversed,
 *              requires 0 <= start <= text.length()
 * @return input text with the substring from start to the end of the string reversed
 */
public static String reverseEnd(String text, int start)

Which of the following are reasonable partitions for the start parameter?

start = 0, start = 5, start = 100
start < 0, start = 0, start > 0
start = 0, 0 < start < text.length(), start = text.length()
start < text.length(), start = text.length(), start > text.length()


Partitioning a String

Which of the following are reasonable partitions for the text parameter?

text contains some letters; text contains no letters, but some numbers; text contains neither letters
nor numbers
text.length() = 0; text.length() > 0
text.length() = 0; text.length()-start is odd; text.length()-start is even
text is every possible string from length 0 to 100


Blackbox and Whitebox Testing


Recall from above that the specification is the description of the function’s behavior — the types of
parameters, type of return value, and constraints and relationships between them.

Blackbox testing means choosing test cases only from the specification, not the implementation of
the function. That’s what we’ve been doing in our examples so far. We partitioned and looked for
boundaries in multiply and max without looking at the actual code for these functions.

Whitebox testing (also called glass box testing) means choosing test cases with knowledge of how
the function is actually implemented. For example, if the implementation selects different algorithms
depending on the input, then you should partition according to those domains. If the implementation
keeps an internal cache that remembers the answers to previous inputs, then you should test repeated
inputs.

When doing whitebox testing, you must take care that your test cases don’t require specific
implementation behavior that isn’t specifically called for by the spec. For example, if the spec says
“throws an exception if the input is poorly formatted,” then your test shouldn’t check specifically for a
NullPointerException just because that’s what the current implementation does. The specification in this
case allows any exception to be thrown, so your test case should likewise be general to preserve the
implementor’s freedom. We’ll have much more to say about this in the class on specs.
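The difference between an over-specified test and a spec-level test can be shown with a small example. Everything here is hypothetical: parseCount stands in for a function whose spec says only "throws an exception if the input is poorly formatted," and its current implementation happens to throw NumberFormatException rather than the NullPointerException used as the example above.

```java
class SpecGeneralTest {
    /** Hypothetical function under test. Spec: throws an exception if s is poorly formatted. */
    static int parseCount(String s) {
        // The current implementation happens to throw NumberFormatException on bad input.
        return Integer.parseInt(s.trim());
    }

    /** Over-specified check: passes only while the implementation throws NumberFormatException. */
    static boolean tooSpecific(String bad) {
        try {
            parseCount(bad);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    /** Spec-level check: passes for any exception, which is all the spec promises. */
    static boolean specLevel(String bad) {
        try {
            parseCount(bad);
            return false;
        } catch (Exception e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(specLevel("12x")); // prints true
    }
}
```

If the implementor later switched to a different exception type, specLevel would keep passing while tooSpecific would break, even though the new implementation would still satisfy the spec.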

reading exercises

Blackbox and whitebox testing

Consider the following function:

/**
 * Sort a list of integers in nondecreasing order. Modifies the list so that
 * values.get(i) <= values.get(i+1) for all 0 <= i < values.size()-1
 */
public static void sort(List<Integer> values) {
    // choose a good algorithm for the size of the list
    if (values.size() < 10) {
        radixSort(values);
    } else if (values.size() < 1000*1000*1000) {
        quickSort(values);
    } else {
        mergeSort(values);
    }
}

Which of the following test cases are likely to be boundary values produced by white box testing?

values = [] (the empty list)
values = [1, 2, 3]
values = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
values = [0, 0, 1, 0, 0, 0, 0]

Documenting Your Testing Strategy

For the example function below, here is how we can document the testing strategy we worked on in
the partitioning exercises above. The strategy also addresses some boundary values we didn’t
consider before.

/**
 * Reverses the end of a string.
 *
 * For example:
 *   reverseEnd("Hello, world", 5)
 *   returns "Hellodlrow ,"
 *
 * With start == 0, reverses the entire text.
 * With start == text.length(), reverses nothing.
 *
 * @param text  non-null String that will have
 *              its end reversed
 * @param start the index at which the
 *              remainder of the input is
 *              reversed, requires 0 <=
 *              start <= text.length()
 * @return input text with the substring from
 *         start to the end of the string
 *         reversed
 */
static String reverseEnd(String text, int start)

Document the strategy at the top of the test class:

/*
 * Testing strategy
 *
 * Partition the inputs as follows:
 * text.length(): 0, 1, > 1
 * start:         0, 1, 1 < start < text.length(),
 *                text.length() - 1, text.length()
 * text.length()-start: 0, 1, even > 1, odd > 1
 *
 * Include even- and odd-length reversals because
 * only odd has a middle element that doesn't move.
 *
 * Exhaustive Cartesian coverage of partitions.
 */

Document how each test case was chosen, including white box tests:

// covers text.length() = 0,
//        start = 0 = text.length(),
//        text.length()-start = 0
@Test public void testEmpty() {
    assertEquals("", reverseEnd("", 0));
}

// ... other test cases ...
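To make the strategy concrete, here is one possible implementation of reverseEnd together with a few more cases drawn from the partitions. This is only a sketch: the reading does not show an implementation, so the body below (built on StringBuilder.reverse) is an assumption, and the checks use plain boolean comparisons instead of JUnit so the example is self-contained.

```java
class ReverseEnd {
    /** One possible implementation; requires text non-null and 0 <= start <= text.length(). */
    static String reverseEnd(String text, int start) {
        String head = text.substring(0, start);
        String tail = text.substring(start);
        return head + new StringBuilder(tail).reverse();
    }

    /** Returns true if reverseEnd(text, start) equals expected. */
    static boolean check(String expected, String text, int start) {
        return expected.equals(reverseEnd(text, start));
    }

    /** A handful of cases, each annotated with the parts it covers. */
    static boolean runAll() {
        return check("", "", 0)                           // length 0, start = 0, length-start = 0
            && check("a", "a", 0)                         // length 1, start = 0, length-start = 1
            && check("cba", "abc", 0)                     // length > 1, start = 0, odd > 1
            && check("adcb", "abcd", 1)                   // start = 1, odd > 1 (middle c stays put)
            && check("abdc", "abcd", 2)                   // 1 < start < length, even > 1
            && check("abc", "abc", 3)                     // start = length, length-start = 0
            && check("Hellodlrow ,", "Hello, world", 5);  // the example from the spec
    }

    public static void main(String[] args) {
        System.out.println(runAll() ? "all tests OK" : "some test failed");
    }
}
```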

Coverage

One way to judge a test suite is to ask how thoroughly it exercises the program. This notion is called
coverage. Here are three common kinds of coverage:

Statement coverage: is every statement run by some test case?
Branch coverage: for every if or while statement in the program, are both the true and the false
direction taken by some test case?
Path coverage: is every possible combination of branches — every path through the program —
taken by some test case?

Branch coverage is stronger (requires more tests to achieve) than statement coverage, and path
coverage is stronger than branch coverage. In industry, 100% statement coverage is a common goal,
but even that is rarely achieved due to unreachable defensive code (like “should never get here”
assertions). 100% branch coverage is highly desirable, and safety critical industry code has even more
arduous criteria (e.g., “MC/DC,” modified condition/decision coverage). Unfortunately 100% path
coverage is infeasible, requiring exponential-size test suites to achieve.
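A tiny example shows why branch coverage is stronger than statement coverage. The method below is hypothetical and deliberately ignores the Integer.MIN_VALUE corner case. A single test with a negative input executes every statement, yet never takes the false direction of the if, so it achieves full statement coverage without full branch coverage.

```java
class CoverageDemo {
    /** Returns the absolute value of x (ignoring the Integer.MIN_VALUE corner case). */
    static int abs(int x) {
        int result = x;
        if (x < 0) {          // an if with no else: the false direction has no statements of its own
            result = -x;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(abs(-5)); // runs every statement; takes only the true branch
        System.out.println(abs(5));  // adds the false branch, completing branch coverage
    }
}
```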

A standard approach to testing is to add tests until the test suite achieves adequate statement
coverage: i.e., so that every reachable statement in the program is executed by at least one test case.
In practice, statement coverage is usually measured by a code coverage tool, which counts the
number of times each statement is run by your test suite. With such a tool, white box testing is easy;
you just measure the coverage of your black box tests, and add more test cases until all important
statements are logged as executed.

A good code coverage tool for Eclipse is EclEmma, shown on the right.

Lines that have been executed by the test suite are colored green, and lines not yet covered are red. If
you saw this result from your coverage tool, your next step would be to come up with a test case that
causes the body of the while loop to execute, and add it to your test suite so that the red lines become
green.

reading exercises

Using a coverage tool

Install EclEmma in Eclipse on your laptop. Use your laptop, because you’ll need it for testing exercises
in class, too.

Then create a new Java class called Hailstone.java (you can make a new project for it, or just put it in
the project from class 2 exercises) containing this code:

public class Hailstone {

    public static void main(String[] args) {
        int n = 3;
        while (n != 1) {
            if (n % 2 == 0) {
                n = n / 2;
            } else {
                n = 3 * n + 1;
            }
        }
    }
}

Run this class with EclEmma code coverage highlighting turned on, by choosing Run → Coverage As
→ Java Application.

By changing the initial value of n , you can observe how EclEmma highlights different lines of code
differently.

When n=3 initially, what color is the line n = n/2 after execution?


When n=16 initially, what color is the line n = 3 * n + 1 after execution?


What initial value of n would make the line while (n != 1) yellow after execution?


Unit Testing and Stubs

A well-tested program will have tests for every individual module (where a module is a method or a
class) that it contains. A test that tests an individual module, in isolation if possible, is called a unit
test. Testing modules in isolation leads to much easier debugging. When a unit test for a module fails,
you can be more confident that the bug is found in that module, rather than anywhere in the program.

The opposite of a unit test is an integration test, which tests a combination of modules, or even the
entire program. If all you have are integration tests, then when a test fails, you have to hunt for the
bug. It might be anywhere in the program. Integration tests are still important, because a program can
fail at the connections between modules. For example, one module may be expecting different inputs
than it’s actually getting from another module. But if you have a thorough set of unit tests that give you
confidence in the correctness of individual modules, then you’ll have much less searching to do to find
the bug.

Suppose you’re building a web search engine. Two of your modules might be getWebPage(), which
downloads web pages, and extractWords(), which splits a page into its component words:

/** @return the contents of the web page downloaded from url
 */
public static String getWebPage(URL url) {...}

/** @return the words in string s, in the order they appear,
 *          where a word is a contiguous sequence of
 *          non-whitespace and non-punctuation characters
 */
public static List<String> extractWords(String s) { ... }

These methods might be used by another module makeIndex() as part of the web crawler that makes
the search engine’s index:

/** @return an index mapping a word to the set of URLs
 *          containing that word, for all webpages in the input set
 */
public static Map<String, Set<URL>> makeIndex(Set<URL> urls) {
    ...
    for (URL url : urls) {
        String page = getWebPage(url);
        List<String> words = extractWords(page);
        ...
    }
    ...
}

In our test suite, we would want:

unit tests just for getWebPage() that test it on various URLs
unit tests just for extractWords() that test it on various strings
unit tests for makeIndex() that test it on various sets of URLs

One mistake that programmers sometimes make is writing test cases for extractWords() in such a way
that the test cases depend on getWebPage() to be correct. It’s better to think about and test
extractWords() in isolation, and partition it. Using test partitions that involve web page content might be
reasonable, because that’s how extractWords() is actually used in the program. But don’t actually call
getWebPage() from the test case, because getWebPage() may be buggy! Instead, store web page content
as a literal string, and pass it directly to extractWords() . That way you’re writing an isolated unit test,
and if it fails, you can be more confident that the bug is in the module it’s actually testing, extractWords() .

Note that the unit tests for makeIndex() can’t easily be isolated in this way. When a test case calls
makeIndex() , it is testing the correctness of not only the code inside makeIndex() , but also all the methods
called by makeIndex() . If the test fails, the bug might be in any of those methods. That’s why we want
separate tests for getWebPage() and extractWords() , to increase our confidence in those modules
individually and localize the problem to the makeIndex() code that connects them together.

Isolating a higher-level module like makeIndex() is possible if we write stub versions of the modules that
it calls. For example, a stub for getWebPage() wouldn’t access the internet at all, but instead would return
mock web page content no matter what URL was passed to it. A stub for a class is often called a
mock object.
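One way to swap a stub in for getWebPage() is to pass the page fetcher to makeIndex() as a parameter. The sketch below takes that route, which is an assumption: the makeIndex() shown earlier calls getWebPage() directly and takes a Set<URL>, whereas this variant takes an extra Function argument and represents URLs as plain strings to stay short. The extractWords() body is also a simplification that splits on anything other than letters and digits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class IndexWithStub {
    /** Simplified word splitter: words are runs of letters and digits. */
    static List<String> extractWords(String s) {
        List<String> words = new ArrayList<>();
        for (String w : s.split("[^\\p{Alnum}]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    /** Variant of makeIndex that receives the page fetcher, so a test can supply a stub. */
    static Map<String, Set<String>> makeIndex(Set<String> urls,
                                              Function<String, String> getWebPage) {
        Map<String, Set<String>> index = new HashMap<>();
        for (String url : urls) {
            String page = getWebPage.apply(url);
            for (String word : extractWords(page)) {
                index.computeIfAbsent(word, k -> new HashSet<>()).add(url);
            }
        }
        return index;
    }

    /** Stub: never touches the network and returns the same canned page for every URL. */
    static String stubGetWebPage(String url) {
        return "hello, stub world";
    }

    public static void main(String[] args) {
        Set<String> urls = new HashSet<>();
        urls.add("http://example.com/a");
        urls.add("http://example.com/b");
        Map<String, Set<String>> index = makeIndex(urls, IndexWithStub::stubGetWebPage);
        System.out.println(index.get("hello").size()); // prints 2
    }
}
```

With the stub in place, a failing makeIndex() test can no longer be blamed on network access or on a bug in the real getWebPage(), which narrows the search to makeIndex() and extractWords().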
Automated Testing and Regression Testing

Nothing makes tests easier to run, and more likely to be run, than complete automation. Automated
testing means running the tests and checking their results automatically. A test driver should not be an
interactive program that prompts you for inputs and prints out results for you to manually check.
Instead, a test driver should invoke the module itself on fixed test cases and automatically check that
the results are correct. The result of the test driver should be either “all tests OK” or “these tests
failed: …” A good testing framework, like JUnit, helps you build automated test suites.

Note that automated testing frameworks like JUnit make it easy to run the tests, but you still have to
come up with good test cases yourself. Automatic test generation is a hard problem, still a subject of
active computer science research.

Once you have test automation, it’s very important to rerun your tests when you modify your code.
This prevents your program from regressing — introducing other bugs when you fix new bugs or add
new features. Running all your tests after every change is called regression testing.

Whenever you find and fix a bug, take the input that elicited the bug and add it to your automated test
suite as a test case. This kind of test case is called a regression test. This helps to populate your test
suite with good test cases. Remember that a test is good if it elicits a bug — and every regression
test did in one version of your code! Saving regression tests also protects against reversions that
reintroduce the bug. The bug may be an easy error to make, since it happened once already.

This idea also leads to test-first debugging. When a bug arises, immediately write a test case for it
that elicits it, and immediately add it to your test suite. Once you find and fix the bug, all your test
cases will be passing, and you’ll be done with debugging and have a regression test for that bug.
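As an illustration of turning a bug report into a regression test, suppose (hypothetically) that extractWords() once split the page with String.split alone, which returns a one-element array containing the empty string when the page is empty. The sketch below shows a fixed version plus the regression test that pins the bug down; the class name and the exact behavior of extractWords() are assumptions, not the course's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ExtractWordsRegression {
    // Fixed version. The hypothetical buggy version returned
    // Arrays.asList(page.split("\\s+")) directly, which is [""] for an empty page.
    static List<String> extractWords(final String page) {
        final String trimmed = page.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Regression test: the exact input that once elicited the bug.
    static boolean testEmptyPageHasNoWords() {
        return extractWords("").isEmpty();
    }

    // Ordinary test kept alongside it, so the fix does not break the common case.
    static boolean testTwoWords() {
        return extractWords("hello world").equals(Arrays.asList("hello", "world"));
    }
}
```

In test-first debugging, testEmptyPageHasNoWords() would be written first, observed to fail against the buggy version, and then kept in the suite permanently once the fix makes it pass.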

In practice, these two ideas, automated testing and regression testing, are almost always used in
combination. Regression testing is only practical if the tests can be run often, automatically. Conversely, if you
already have automated testing in place for your project, then you might as well use it to prevent
regressions. So automated regression testing is a best practice of modern software engineering.

reading exercises

Regression testing

Which of the following best defines regression testing?

Changes should be tested against all inputs that elicited bugs in earlier versions of the code.
Every component in your code should have an associated set of tests that exercises all the corner
cases in its specification.
Tests should be written before you write the code as a way of checking your understanding of the
specification.
When a new test exposes a bug, you should run it on all previous versions of the code until you find
the version where the bug was introduced.

Running automated tests

Which of the following are good times to rerun all your JUnit tests?

Before doing git add/commit/push
After rewriting a function to make it faster
When using a code coverage tool
After you think you fixed a bug

Testing techniques

Which of these techniques are useful for choosing test cases in test-first programming, before any
code is written?

black box
regression
static typing
partitioning
boundaries
white box
coverage


Summary
In this reading, we saw these ideas:

Test-first programming. Write tests before you write code.
Partitioning and boundaries for choosing test cases systematically.
White box testing and statement coverage for filling out a test suite.
Unit-testing each module, in isolation as much as possible.
Automated regression testing to keep bugs from coming back.

The topics of today’s reading connect to our three key properties of good software as follows:
Safe from bugs. Testing is about finding bugs in your code, and test-first programming is about
finding them as early as possible, immediately after you introduced them.

Easy to understand. Testing doesn’t help with this as much as code review does.

Ready for change. We prepared for change by writing tests that depend only on
behavior in the spec. We also talked about automated regression testing, which helps keep bugs
from coming back when changes are made to code.
Reading 4: Code Review

Objectives for Today’s Class

In today’s class, we will practice:

code review: reading and discussing code written by somebody else
general principles of good coding: things you can look for in every
code review, regardless of programming language or program
purpose

Code Review

Code review is careful, systematic study of source code by people
who are not the original author of the code. It’s analogous to
proofreading a term paper.

Code review really has two purposes:

Improving the code. Finding bugs, anticipating possible bugs,
checking the clarity of the code, and checking for consistency with
the project’s style standards.
Improving the programmer. Code review is an important way
that programmers learn and teach each other, about new
language features, changes in the design of the project or its
coding standards, and new techniques. In open source projects,
particularly, much conversation happens in the context of code
reviews.

Code review is widely practiced in open source projects like Apache
and Mozilla. Code review is also widely practiced in industry. At
Google, you can’t push any code into the main repository until another
engineer has signed off on it in a code review.

Style Standards

Most companies and large projects have coding style standards (for
example, Google Java Style). These can get pretty detailed, even to
the point of specifying whitespace (how deep to indent) and where
curly braces and parentheses should go. These kinds of questions
often lead to holy wars since they end up being a matter of taste and
style.

For Java, there’s a general style guide (unfortunately not updated for
the latest versions of Java). Some of its advice gets very specific:

The opening brace should be at the end of the line that begins the
compound statement; the closing brace should begin a line and be
indented to the beginning of the compound statement.

In 6.005, we have no official style guide of this sort. We’re not going
to tell you where to put your curly braces. That’s a personal decision
that each programmer should make. It’s important to be self-
consistent, however, and it’s very important to follow the conventions
of the project you’re working on. If you’re the programmer who
reformats every module you touch to match your personal style, your
teammates will hate you, and rightly so. Be a team player.

But there are some rules that are quite sensible and target our big
three properties, in a stronger way than placing curly braces. The
rest of this reading talks about some of these rules, at least the ones
that are relevant at this point in the course, where we’re mostly
talking about writing basic Java. These are some things you should
start to look for when you’re code reviewing other students, and when
you’re looking at your own code for improvement. Don’t consider it an
exhaustive list of code style guidelines, however. Over the course of
the semester, we’ll talk about a lot more things — specifications,
abstract data types with representation invariants, concurrency and
thread safety — which will then become fodder for code review.

Smelly Example #1
Programmers often describe bad code as having a “bad smell” that
needs to be removed. “Code hygiene” is another word for this. Let’s
start with some smelly code.

public static int dayOfYear(int month, int dayOfMonth, int year) {
    if (month == 2) {
        dayOfMonth += 31;
    } else if (month == 3) {
        dayOfMonth += 59;
    } else if (month == 4) {
        dayOfMonth += 90;
    } else if (month == 5) {
        dayOfMonth += 31 + 28 + 31 + 30;
    } else if (month == 6) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31;
    } else if (month == 7) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30;
    } else if (month == 8) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30 + 31;
    } else if (month == 9) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31;
    } else if (month == 10) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31 + 30;
    } else if (month == 11) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31 + 30 + 31;
    } else if (month == 12) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31 + 30 + 31 + 31;
    }
    return dayOfMonth;
}

The next few sections and exercises will pick out the particular smells
in this code example.

Don’t Repeat Yourself


Duplicated code is a risk to safety. If you have identical or very similar
code in two places, then the fundamental risk is that there’s a bug in
both copies, and some maintainer fixes the bug in one place but not
the other.

Avoid duplication like you’d avoid crossing the street without looking.
Copy-and-paste is an enormously tempting programming tool, and
you should feel a frisson of danger run down your spine every time
you use it. The longer the block you’re copying, the riskier it is.

Don’t Repeat Yourself, or DRY for short, has become a
programmer’s mantra.

The dayOfYear example is full of identical code. How would you DRY it
out?

reading exercises

Don’t repeat yourself

Some of the repetition in dayOfYear() is repeated values. How many
times is the number of days in April written in dayOfYear() ?

Don’t repeat yourself

One reason why repeated code is bad is that a problem in the
repeated code has to be fixed in many places, not just one. Suppose
our calendar changed so that February really has 30 days instead of
28. How many numbers in this code have to be changed?

Don’t repeat yourself

Another kind of repetition in the code is dayOfMonth+= . Assume you have
an array:

int[] monthLengths = new int[] { 31, 28, 31, 30, ..., 31}

Which of the following code skeletons could be used to DRY the code
out enough so that dayOfMonth+= appears only once?

for (int m = 1; m < month; ++m) { ... }
switch (month) { case 1: ...; break; case 2: ...; break; ... }
while (m < month) { ...; m += 1; }
if (month == 1) { ... } else { ... dayOfYear(month-1, dayOfMonth, year) ... }

Comments Where Needed

A quick general word about commenting. Good software developers
write comments in their code, and do it judiciously. Good comments
should make the code easier to understand, safer from bugs
(because important assumptions have been documented), and ready
for change.

One kind of crucial comment is a specification, which appears above
a method or above a class and documents the behavior of the
method or class. In Java, this is conventionally written as a Javadoc
comment, meaning that it starts with /** and includes @ -syntax, like
@param and @return for methods. Here’s an example of a spec:

/**
 * Compute the hailstone sequence.
 * See http://en.wikipedia.org/wiki/Collatz_conjecture#Statement_of_the_pro
 * @param n starting number of sequence; requires n > 0.
 * @return the hailstone sequence starting at n and ending with 1.
 *         For example, hailstoneSequence(3)=[3,10,5,16,8,4,2,1].
 */
public static List<Integer> hailstoneSequence(int n) {
...
}
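The method body is elided above. As one possible sketch of an implementation that satisfies this spec (the enclosing class name Hailstone is an assumption, and the precondition n > 0 is relied on rather than checked):

```java
import java.util.ArrayList;
import java.util.List;

final class Hailstone {
    /**
     * Compute the hailstone sequence.
     * @param n starting number of sequence; requires n > 0.
     * @return the hailstone sequence starting at n and ending with 1.
     */
    public static List<Integer> hailstoneSequence(final int n) {
        final List<Integer> sequence = new ArrayList<>();
        int current = n;
        while (current != 1) {
            sequence.add(current);
            // Even numbers are halved; odd numbers go to 3n + 1.
            current = (current % 2 == 0) ? current / 2 : 3 * current + 1;
        }
        sequence.add(current);  // the sequence always ends with 1
        return sequence;
    }
}
```

Because the spec states the precondition, a caller who passes 0 or a negative number has violated the contract; this sketch would loop forever in that case, which is one reason a later section argues for failing fast.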

Specifications document assumptions. We’ve already mentioned
specs a few times, and there will be much more to say about them in
a future reading.

Another crucial comment is one that records the provenance or source
of a piece of code that was copied or adapted from elsewhere.
One reason for documenting sources is to avoid violations of
copyright. Code posted on Stack Overflow is licensed under Creative
Commons terms that require attribution, and code copied from other
sources may be proprietary or covered by other kinds of open source
licenses, which may be more restrictive. Another reason for documenting
sources is that the code can fall out of date; a Stack Overflow answer,
for example, may evolve significantly in the years after it was first
posted.

Some comments are bad and unnecessary. Direct transliterations of
code into English, for example, do nothing to improve understanding,
because you should assume that your reader at least knows Java:

while (n != 1) { // test whether n is 1 (don't write comments like this!)
    ++i; // increment i
    l.add(n); // add n to l
}

But obscure code should get a comment:

sendMessage("as you wish"); // this basically says "I love you"

The dayOfYear code needs some comments — where would you put
them? For example, where would you document whether month runs
from 0 to 11 or from 1 to 12?
if (month == 2) { // we're in February [C2]
dayOfMonth += 31; // add in the days of January that already passe
} else if (month == 3) {
dayOfMonth += 59; // month is 3 here [C4]
} else if (month == 4) {
dayOfMonth += 90;
}
...
} else if (month == 12) {
dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31 + 30 + 31 + 31;
}
return dayOfMonth; // the answer [C5]
}

C1
C2
C3
C4
C5


Fail Fast

Failing fast means that code should reveal its bugs as early as
possible. The earlier a problem is observed (the closer to its cause),
the easier it is to find and fix. As we saw in the first reading, static
checking fails faster than dynamic checking, and dynamic checking
fails faster than producing a wrong answer that may corrupt
subsequent computation.

The dayOfYear function doesn’t fail fast: if you pass it the arguments
in the wrong order, it will quietly return the wrong answer. In fact, the
way dayOfYear is designed, it’s highly likely that a non-American will
pass the arguments in the wrong order! It needs more checking —
either static checking or dynamic checking.
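As a sketch of what adding both kinds of checking might look like, the version below uses an enum for the month (so swapped arguments usually become a static type error) and a range check on the day (so a year passed as a day becomes a dynamic error). The class name FailFastDate and the MONTH_LENGTHS table are assumptions for illustration, and like the original this sketch ignores leap years.

```java
final class FailFastDate {
    enum Month { JANUARY, FEBRUARY, MARCH, APRIL, MAY, JUNE,
                 JULY, AUGUST, SEPTEMBER, OCTOBER, NOVEMBER, DECEMBER }

    // Days in each month of a non-leap year, indexed by Month.ordinal().
    private static final int[] MONTH_LENGTHS =
            { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

    // year is accepted but unused, because leap years are not handled here.
    static int dayOfYear(final Month month, final int dayOfMonth, final int year) {
        final int daysInMonth = MONTH_LENGTHS[month.ordinal()];
        if (dayOfMonth < 1 || dayOfMonth > daysInMonth) {
            // Dynamic check: fail at the call that passed the bad value.
            throw new IllegalArgumentException("dayOfMonth out of range: " + dayOfMonth);
        }
        int daysBeforeMonth = 0;
        for (int m = 0; m < month.ordinal(); ++m) {
            daysBeforeMonth += MONTH_LENGTHS[m];
        }
        return daysBeforeMonth + dayOfMonth;
    }
}
```

With this signature, a call such as dayOfYear(9, Month.FEBRUARY, 2019) does not compile, and dayOfYear(Month.FEBRUARY, 2019, 9) throws immediately instead of quietly returning a wrong answer.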

reading exercises

Fail fast

public static int dayOfYear(int month, int dayOfMonth, int year) {
    if (month == 2) {
        dayOfMonth += 31;
    } else if (month == 3) {
        dayOfMonth += 59;
    } else if (month == 4) {
        dayOfMonth += 90;
    } else if (month == 5) {
        dayOfMonth += 31 + 28 + 31 + 30;
    } else if (month == 6) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31;
    } else if (month == 7) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30;
    } else if (month == 8) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30 + 31;
    } else if (month == 9) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31;
    } else if (month == 10) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31 + 30;
    } else if (month == 11) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31 + 30 + 31;
    } else if (month == 12) {
        dayOfMonth += 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31 + 30 + 31 + 31;
    }
    return dayOfMonth;
}

Suppose the date is February 9, 2019. The correct dayOfYear() result
for this date is 40, since it’s the fortieth day of the year.

Which of the following are plausible ways that a programmer might
(mistakenly) call dayOfYear() ? And for each one, does it lead to a static
error, dynamic error, or wrong answer?

dayOfYear(2, 9, 2019)
dayOfYear(1, 9, 2019)
dayOfYear(9, 2, 2019)
dayOfYear("February", 9, 2019)
dayOfYear(2019, 2, 9)
dayOfYear(2, 2019, 9)

Fail faster

Which of the following changes (considered separately) would make
the code fail faster if it were called with arguments in the wrong
order?

public static int dayOfYear(String month, int dayOfMonth, int year) {
    ...
}

public static int dayOfYear(int month, int dayOfMonth, int year) {
    if (month < 1 || month > 12) {
        return -1;
    }
    ...
}

public static int dayOfYear(int month, int dayOfMonth, int year) {
    if (month < 1 || month > 12) {
        throw new IllegalArgumentException();
    }
    ...
}

public enum Month { JANUARY, FEBRUARY, MARCH, ..., DECEMBER };
public static int dayOfYear(Month month, int dayOfMonth, int year) {
    ...
}

public static int dayOfYear(int month, int dayOfMonth, int year) {
    if (month == 1) {
        ...
    } else if (month == 2) {
        ...
    }
    ...
    } else if (month == 12) {
        ...
    } else {
        throw new IllegalArgumentException("month out of range");
    }
}

Avoid Magic Numbers


There are really only two constants that computer scientists
recognize as valid in and of themselves: 0, 1, and maybe 2. (Okay,
three constants.)

All other constants are called magic because they appear as if out of
thin air with no explanation.
One way to explain a number is with a comment, but a far better way
is to declare the number as a named constant with a good, clear
name.

dayOfYear is full of magic numbers:

The months 2, …, 12 would be far more readable as FEBRUARY , …,
DECEMBER .

The days-of-months 30, 31, 28 would be more readable (and
eliminate duplicate code) if they were in a data structure like an
array, list, or map, e.g. MONTH_LENGTH[month] .

The mysterious numbers 59 and 90 are particularly pernicious
examples of magic numbers. Not only are they uncommented and
undocumented, they are actually the result of a computation done
by hand by the programmer. Don’t hardcode constants that you’ve
computed by hand. Java is better at arithmetic than you are.
Explicit computations like 31 + 28 make the provenance of these
mysterious numbers much clearer.
MONTH_LENGTH[JANUARY] + MONTH_LENGTH[FEBRUARY] would be clearer still.
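The suggestions above can be combined into one sketch. It assumes months are numbered 1 to 12 (the original code never says so explicitly), uses a MONTH_LENGTH table with an unused slot at index 0 so that MONTH_LENGTH[month] reads naturally, and still ignores leap years. The class name and the exact constant names are illustrative choices.

```java
final class DayOfYearConstants {
    static final int JANUARY = 1;
    static final int FEBRUARY = 2;
    static final int DECEMBER = 12;

    // Days per month in a non-leap year. Index 0 is unused padding so that
    // MONTH_LENGTH[month] works with months numbered JANUARY..DECEMBER.
    static final int[] MONTH_LENGTH =
            { 0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

    // year is unused because this sketch, like the original, ignores leap years.
    static int dayOfYear(final int month, final int dayOfMonth, final int year) {
        int daysBeforeMonth = 0;
        for (int m = JANUARY; m < month; ++m) {
            daysBeforeMonth += MONTH_LENGTH[m];  // the only place month lengths are added
        }
        return daysBeforeMonth + dayOfMonth;
    }
}
```

Each month length now appears exactly once, so a calendar change means editing one array entry, and no hand-computed totals like 59 or 90 remain.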

reading exercises

Avoid magic numbers

In the code:

if (month == 2) { ... }

what might a reasonable programmer plausibly assume about the
meaning of the magic number 2?

2 might mean January
2 might mean February
2 might mean March
2 might mean the year 2 AD

What happens when you assume

Suppose you’re reading some code that uses a turtle graphics library
that you don’t know well, and you see the code:

turtle.rotate(3);

Which of the following are likely assumptions you might make about
the meaning of the magic number 3?

3 might mean 3 degrees clockwise
3 might mean 3 degrees counterclockwise
3 might mean 3 radians clockwise
3 might mean 3 full revolutions

Names instead of numbers

Consider this code:

for (int i = 0; i < 5; ++i) {
    turtle.forward(36);
    turtle.turn(72);
}

The magic numbers in this code cause it to fail all three of our
measures of code quality: it’s not safe from bugs (SFB), not easy to
understand (ETU) and not ready for change (RFC).

For each of the following rewrites, judge whether it improves SFB,
ETU, and/or RFC, or none of the above.

final int five = 5;
final int thirtySix = 36;
final int seventyTwo = 72;
for (int i = 0; i < five; ++i) {
    turtle.forward(thirtySix);
    turtle.turn(seventyTwo);
}

no improvement (or worse)
safer from bugs
easier to understand
more ready for change

int[] numbers = new int[] { 5, 36, 72 };
for (int i = 0; i < numbers[0]; ++i) {
    turtle.forward(numbers[1]);
    turtle.turn(numbers[2]);
}

no improvement (or worse)
safer from bugs
easier to understand
more ready for change

int x = 5;
for (int i = 0; i < x; ++i) {
    turtle.forward(36);
    turtle.turn(360.0 / x);
}

no improvement (or worse)
safer from bugs
easier to understand
more ready for change

final double fullCircleDegrees = 360.0;
final int numSides = 5;
final int sideLength = 36;
for (int i = 0; i < numSides; ++i) {
    turtle.forward(sideLength);
    turtle.turn(fullCircleDegrees / numSides);
}

no improvement (or worse)
safer from bugs
easier to understand
more ready for change

One Purpose For Each Variable

In the dayOfYear example, the parameter dayOfMonth is reused to
compute a very different value: the return value of the function,
which is not the day of the month.

Don’t reuse parameters, and don’t reuse variables. Variables are not
a scarce resource in programming. Introduce them freely, give them
good names, and just stop using them when you stop needing them.
You will confuse your reader if a variable that used to mean one thing
suddenly starts meaning something different a few lines down.

Not only is this an ease-of-understanding question, but it’s also a
safety-from-bugs and ready-for-change question.

Method parameters, in particular, should generally be left unmodified.
(This is important for being ready-for-change: in the future, some
other part of the method may want to know what the original
parameters of the method were, so you shouldn’t blow them away
while you’re computing.) It’s a good idea to use final for method
parameters, and as many other variables as you can. The final
keyword says that the variable should never be reassigned, and the
Java compiler will check it statically. For example:

public static int dayOfYear(final int month, final int dayOfMonth, final in
...
}

Smelly Example #2
There was a latent bug in dayOfYear . It didn’t handle leap years at all.
As part of fixing that, suppose we write a leap-year method.

public static boolean leap(int y) {


String tmp = String.valueOf(y);
if (tmp.charAt(2) == '1' || tmp.charAt(2) == '3' || tmp.charAt(2) == 5
if (tmp.charAt(3)=='2'||tmp.charAt(3)=='6') return true; /*R1*/
else
return false; /*R2*/
}else{
if (tmp.charAt(2) == '0' && tmp.charAt(3) == '0') {
return false; /*R3*/
}
if (tmp.charAt(3)=='0'||tmp.charAt(3)=='4'||tmp.charAt(3)=='8')retu
}
return false; /*R5*/
}

What are the bugs hidden in this code? And what style problems that
we’ve already talked about?

reading exercises

Mental execution 2016

What happens when you call:

leap(2016)

returns true on line R1
returns false on line R2
returns false on line R3
returns true on line R4
returns false on line R5
error before program starts
error while program is running

Mental execution 2017

What happens when you call:

leap(2017)

returns true on line R1
returns false on line R2
returns false on line R3
returns true on line R4
returns false on line R5
error before program starts
error while program is running

Mental execution 2050

What happens when you call:

leap(2050)

returns true on line R1
returns false on line R2
returns false on line R3
returns true on line R4
returns false on line R5
error before program starts
error while program is running

Mental execution 10016

What happens when you call:

leap(10016)

returns true on line R1
returns false on line R2
returns false on line R3
returns true on line R4
returns false on line R5
error before program starts
error while program is running

Mental execution 916

What happens when you call:

leap(916)

returns true on line R1
returns false on line R2
returns false on line R3
returns true on line R4
returns false on line R5
error before program starts
error while program is running

Magic numbers
How many magic numbers are in this code? Count every occurrence
if some appear more than once.


DRYing out

Suppose you wrote the helper function:

public static boolean isDivisibleBy(int number, int factor) { return number

If leap() were rewritten to use isDivisibleBy(year, ...) , and to correctly
follow the leap year algorithm, how many magic numbers would be
in the code?

Use Good Names


Good method and variable names are long and self-descriptive.
Comments can often be avoided entirely by making the code itself
more readable, with better names that describe the methods and
variables.

For example, you can rewrite

int tmp = 86400; // tmp is the number of seconds in a day (don't do this!)

as:

int secondsPerDay = 86400;

In general, variable names like tmp , temp , and data are awful,
symptoms of extreme programmer laziness. Every local variable is
temporary, and every variable is data, so those names are generally
meaningless. Better to use a longer, more descriptive name, so that
your code reads clearly all by itself.

Follow the lexical naming conventions of the language. In Python,
classes are typically Capitalized, variables are lowercase, and
words_are_separated_by_underscores. In Java:

methodsAreNamedWithCamelCaseLikeThis
variablesAreAlsoCamelCase
CONSTANTS_ARE_IN_ALL_CAPS_WITH_UNDERSCORES
ClassesAreCapitalized
packages.are.lowercase.and.separated.by.dots

Method names are usually verb phrases, like getDate or isUpperCase ,
while variable and class names are usually noun phrases. Choose
short words, and be concise, but avoid abbreviations. For example,
message is clearer than msg , and word is so much better than wd . Keep in
mind that many of your teammates in class and in the real world will
not be native English speakers, and abbreviations can be even harder
for non-native speakers.

ALL_CAPS_WITH_UNDERSCORES is used for static final
constants. All variables declared inside a method, including final
ones, use camelCaseNames.

The leap method has bad names: the method name itself, and the
local variable name. What would you call them instead?

reading exercises

Better method names

public static boolean leap(int y) {


String tmp = String.valueOf(y);
if (tmp.charAt(2) == '1' || tmp.charAt(2) == '3' || tmp.charAt(2) == 5
if (tmp.charAt(3)=='2'||tmp.charAt(3)=='6') return true;
else
return false;
}else{
if (tmp.charAt(2) == '0' && tmp.charAt(3) == '0') {
return false;
}
if (tmp.charAt(3)=='0'||tmp.charAt(3)=='4'||tmp.charAt(3)=='8')retu
}
return false;
}

Which of the following are good names for the leap() method?

leap
isLeapYear
IsLeapYear
is_divisible_by_4


Better variable names


Which of the following are good names for the tmp variable inside
leap() ?

leapYearString
yearString
temp
secondsPerDay
s


Use Whitespace to Help the Reader

Use consistent indentation. The leap example is bad at this. The
dayOfYear example is much better. In fact, dayOfYear nicely lines up all
the numbers into columns, making them easy for a human reader to
compare and check. That’s a great use of whitespace.

Put spaces within code lines to make them easy to read. The leap
example has some lines that are packed together — put in some
spaces.

Never use tab characters for indentation, only space characters. Note
that we say characters, not keys. We’re not saying you should never
press the Tab key, only that your editor should never put a tab
character into your source file in response to your pressing the Tab
key. The reason for this rule is that different tools treat tab characters
differently — sometimes expanding them to 4 spaces, sometimes to
2 spaces, sometimes to 8. If you run “git diff” on the command line, or
if you view your source code in a different editor, then the indentation
may be completely screwed up. Just use spaces. Always set your
programming editor to insert space characters when you press the
Tab key.
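Pulling together the advice on names, whitespace, and repeated code, the leap method might be rewritten along the following lines. This is only a sketch: the body of isDivisibleBy() is truncated in the exercise above, so the version here is an assumption, and the rule implemented is the standard Gregorian one (divisible by 4, except centuries, unless divisible by 400).

```java
final class LeapYear {
    // Assumed body for the helper named in the exercise above.
    static boolean isDivisibleBy(final int number, final int factor) {
        return number % factor == 0;
    }

    // Gregorian leap-year rule, working on the number itself rather than
    // on characters of its decimal string.
    static boolean isLeapYear(final int year) {
        if (isDivisibleBy(year, 400)) {
            return true;
        }
        if (isDivisibleBy(year, 100)) {
            return false;
        }
        return isDivisibleBy(year, 4);
    }
}
```

Because it never indexes into a string, this version behaves the same for three-digit and five-digit years such as 916 and 10016, which the character-based version mishandles.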

Smelly Example #3
Here’s a third example of smelly code that will illustrate the remaining
points of this reading.

public static int LONG_WORD_LENGTH = 5;
public static String longestWord;

public static void countLongWords(List<String> words) {
    int n = 0;
    longestWord = "";
    for (String word: words) {
        if (word.length() > LONG_WORD_LENGTH) ++n;
        if (word.length() > longestWord.length()) longestWord = word;
    }
    System.out.println(n);
}

Don’t Use Global Variables

Avoid global variables. Let’s break down what we mean by global
variable. A global variable is:

a variable, a name whose meaning can be changed
that is global, accessible and changeable from anywhere in the
program.

Why Global Variables Are Bad (cached version) has a good list of the
dangers of global variables.

In Java, a global variable is declared public static . The public modifier
makes it accessible anywhere, and static means there is a single
instance of the variable.

In general, change global variables into parameters and return values,
or put them inside objects that you’re calling methods on. We’ll see
many techniques for doing that in future readings.

reading exercises

Identifying global variables

In this code, which of these are global variables?

countLongWords
n
LONG_WORD_LENGTH
longestWord
word
words


Effect of final

Making a variable into a constant by adding the final keyword can
eliminate the risk of global variables. What happens to each of these
when the final keyword is added?

n
LONG_WORD_LENGTH
longestWord
word
words

Methods Should Return Results, not Print Them


countLongWords isn’t ready for change. It sends some of its result to the
console, System.out . That means that if you want to use it in another
context (where the number is needed for some other purpose, like
computation rather than human eyes), it would have to be rewritten.

In general, only the highest-level parts of a program should interact
with the human user or the console. Lower-level parts should take
their input as parameters and return their output as results. The sole
exception here is debugging output, which can of course be printed to
the console. But that kind of output shouldn’t be a part of your design,
only a part of how you debug your design.
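One way the third smelly example might be reworked, as a sketch rather than the definitive fix: the threshold becomes a final constant, the count is returned instead of printed, and the longest word is computed by its own method instead of being stored in a global variable. The class name LongWords and the decision to split the work into two methods are assumptions made for illustration.

```java
import java.util.List;

final class LongWords {
    // A constant rather than a global variable: final means it cannot be reassigned.
    public static final int LONG_WORD_LENGTH = 5;

    // Returns the number of words longer than LONG_WORD_LENGTH instead of printing it.
    static int countLongWords(final List<String> words) {
        int count = 0;
        for (final String word : words) {
            if (word.length() > LONG_WORD_LENGTH) {
                ++count;
            }
        }
        return count;
    }

    // Returns the longest word (the first one on ties), or "" for an empty list.
    static String longestWord(final List<String> words) {
        String longest = "";
        for (final String word : words) {
            if (word.length() > longest.length()) {
                longest = word;
            }
        }
        return longest;
    }
}
```

A top-level part of the program can still print the count with System.out.println(LongWords.countLongWords(words)), while other callers are free to use the number in further computation.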

Summary
Code review is a widely-used technique for improving software quality
by human inspection. Code review can detect many kinds of problems
in code, but as a starter, this reading talked about these general
principles of good code:

Don’t Repeat Yourself (DRY)
Comments where needed
Fail fast
Avoid magic numbers
One purpose for each variable
Use good names
No global variables
Return results, don’t print them
Use whitespace for readability

The topics of today’s reading connect to our three key properties of
good software as follows:

Safe from bugs. In general, code review uses human reviewers
to find bugs. DRY code lets you fix a bug in only one place,
without fear that it has propagated elsewhere. Commenting your
assumptions clearly makes it less likely that another programmer
will introduce a bug. The Fail Fast principle detects bugs as early
as possible. Avoiding global variables makes it easier to localize
bugs related to variable values, since non-global variables can be
changed in only limited places in the code.

Easy to understand. Code review is really the only way to find
obscure or confusing code, because other people are reading it
and trying to understand it. Using judicious comments, avoiding
magic numbers, keeping one purpose for each variable, using
good names, and using whitespace well can all improve the
understandability of code.

Ready for change. Code review helps here when it’s done by
experienced software developers who can anticipate what might
change and suggest ways to guard against it. DRY code is more
ready for change, because a change only needs to be made in
one place. Returning results instead of printing them makes it
easier to adapt the code to a new purpose.
Reading 5: Version Control

Introduction
Version control systems are essential tools of the software engineering world. More or less
every project — serious or hobby, open source or proprietary — uses version control. Without
version control, coordinating a team of programmers all editing the same project’s code will
reach pull-out-your-hair levels of aggravation.

Version control systems you’ve already used

Dropbox
Undo/redo buffer
Keeping multiple copies of files with version numbers

[Figure: a folder of files named Project Report, Project Report v2, Project Report v3, Project Report final, Project Report final-v2, and Project Report final-v2-fix-part-5]

Inventing version control
Suppose Alice is working on a problem set by herself.

[Figure: Alice’s machine holding Version 1 of Hello.java]

She starts with one file Hello.java in her pset, which she works on for several days.

At the last minute before she needs to hand in her pset to be graded, she realizes she has
made a change that breaks everything. If only she could go back in time and retrieve a past
version!

A simple discipline of saving backup files would get the job done.

[Figure: Alice’s files Hello.1.java (Version 1), Hello.2.java (Version 2), and Hello.java (Version 3, HEAD)]

Alice uses her judgment to decide when she has reached some milestone that justifies saving
the code. She saves the versions of Hello.java as Hello.1.java , Hello.2.java , and Hello.java . She
follows the convention that the most recent version is just Hello.java to avoid confusing Eclipse.
We will call the most recent version the head.

Now when Alice realizes that version 3 is fatally flawed, she can just copy version 2 back into
the location for her current code. Disaster averted! But what if version 3 included some changes
that were good and some that were bad? Alice can compare the files manually to find the
changes, and sort them into good and bad changes. Then she can copy the good changes into
version 2.

This is a lot of work, and it’s easy for the human eye to miss changes. Luckily, there are
standard software tools for comparing text; in the UNIX world, one such tool is diff . A better
version control system will make diffs easy to generate.
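To give a feel for what automated comparison buys Alice, here is a deliberately naive sketch that compares two versions line by line, position by position. It is not how diff works: real diff tools align the files so that an inserted line does not make every later line look changed. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class NaiveLineCompare {
    // Returns a description of each line position where the two versions differ.
    static List<String> changedLines(final List<String> oldLines, final List<String> newLines) {
        final List<String> changes = new ArrayList<>();
        final int longest = Math.max(oldLines.size(), newLines.size());
        for (int i = 0; i < longest; ++i) {
            final String before = i < oldLines.size() ? oldLines.get(i) : null;
            final String after = i < newLines.size() ? newLines.get(i) : null;
            if (before == null) {
                changes.add("line " + (i + 1) + " added: " + after);
            } else if (after == null) {
                changes.add("line " + (i + 1) + " removed: " + before);
            } else if (!before.equals(after)) {
                changes.add("line " + (i + 1) + " changed: " + before + " -> " + after);
            }
        }
        return changes;
    }
}
```

Even this crude comparison never overlooks a changed line the way a human eye can, which is the property Alice needs when sorting version 3's changes into good and bad.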

[Figure: the cloud and Alice’s machine each holding Hello.1.java (Version 1), Hello.2.java (Version 2), and Hello.java (Version 3)]

Alice also wants to be prepared in case her laptop gets run over by a bus, so she saves a
backup of her work in the cloud, uploading the contents of her working directory whenever she’s
satisfied with its contents.

If her laptop is kicked into the Charles, Alice can retrieve the backup and resume work on the
pset on a fresh machine, retaining the ability to time-travel back to old versions at will.
Furthermore, she can develop her pset on multiple machines, using the cloud provider as a
common interchange point. Alice makes some changes on her laptop and uploads them to the
cloud. Then she downloads onto her desktop machine at home, does some more work, and
uploads the improved code (complete with old file versions) back to the cloud.

[Diagram: the cloud sits between Alice on her laptop, with Version 5L of Hello.java, and Alice
on her desktop, with Version 5D of Hello.java.]

If Alice isn’t careful, though, she can run into trouble with this approach. Imagine that she starts
editing Hello.java to create “version 5” on her laptop. Then she gets distracted and forgets about
her changes. Later, she starts working on a new “version 5” on her desktop machine, including
different improvements. We’ll call these versions “5L” and “5D,” for “laptop” and “desktop.”

When it comes time to upload changes to the cloud, there is an opportunity for a mishap! Alice
might copy all her local files into the cloud, causing it to contain version 5D only. Later Alice
syncs from the cloud to her laptop, potentially overwriting version 5L, losing the worthwhile
changes. What Alice really wants here is a merge, to create a new version based on the two
version 5’s.

At this point, considering just the scenario of one programmer working alone, we already have a
list of operations that should be supported by a version control scheme:

reverting to a past version
comparing two different versions
pushing full version history to another location
pulling history back from that location
merging versions that are offshoots of the same earlier version

Multiple developers

Now let’s add into the picture Bob, another developer. The picture isn’t too different from what
we were just thinking about.

[Diagram: the cloud sits between Alice, with Version 5A of Hello.java and Greet.java, and Bob,
with Version 5B of Hello.java and Greet.java.]

Alice and Bob here are like the two Alices working on different computers. They no longer share
a brain, which makes it even more important to follow a strict discipline in pushing to and pulling
from the shared cloud server. The two programmers must coordinate on a scheme for coming
up with version numbers. Ideally, the scheme allows us to assign clear names to whole sets of
files, not just individual files. (Files depend on other files, so thinking about them in isolation
allows inconsistencies.)

Merely uploading new source files is not a very good way to communicate to others the high-
level idea of a set of changes. So let’s add a log that records for each version who wrote it,
when it was finalized, and what the changes were, in the form of a short human-authored
message.

[Diagram: as before, Alice holds Ver. 5A and Bob holds Ver. 5B of Hello.java and Greet.java,
and each now also keeps a log. Both logs contain entries such as 1 (Alice, 7pm, ...) and
4 (Bob, 8pm, ...); Alice’s log ends with 5A (Alice, 9pm, ...) and Bob’s with 5B (Bob, 9pm, ...).]

Pushing another version now gets a bit more complicated, as we need to merge the logs. This
is easier to do than for Java files, since logs have a simpler structure – but without tool support,
Alice and Bob will need to do it manually! We also want to enforce consistency between the
logs and the actual sets of available files: for each log entry, it should be easy to extract the
complete set of files that were current at the time the entry was made.

But with logs, all sorts of useful operations are enabled. We can look at the log for just a
particular file: a view of the log restricted to those changes that involved modifying some file.
We can also use the log to figure out which change contributed each line of code, or, even
better, which person contributed each line, so we know who to complain to when the code
doesn’t work. This sort of operation would be tedious to do manually; the automated operation
in version control systems is called annotate (or, unfortunately, blame).
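As a rough sketch of what such a log might look like in code, the following Java class records entries and answers two of the questions above. Everything here is hypothetical: the names VersionLog, Entry, historyOf, and lastAuthorOf are ours, and lastAuthorOf is only a coarse, per-file stand-in for annotate, which in real systems works line by line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative change log; not how any real version control system stores its log. */
class VersionLog {

    /** One log entry: who wrote a version, when, why, and which files it modified. */
    static class Entry {
        final String version;
        final String author;
        final String time;
        final String message;
        final List<String> files;

        Entry(String version, String author, String time, String message, String... files) {
            this.version = version;
            this.author = author;
            this.time = time;
            this.message = message;
            this.files = Arrays.asList(files);
        }
    }

    private final List<Entry> entries = new ArrayList<>();  // oldest entry first

    /** Appends an entry to the end of the log. */
    void record(Entry entry) {
        entries.add(entry);
    }

    /** Returns the log restricted to the versions that modified the given file. */
    List<Entry> historyOf(String file) {
        List<Entry> result = new ArrayList<>();
        for (Entry entry : entries) {
            if (entry.files.contains(file)) {
                result.add(entry);
            }
        }
        return result;
    }

    /** Returns the author of the latest version that modified the file, or null if none did. */
    String lastAuthorOf(String file) {
        List<Entry> history = historyOf(file);
        return history.isEmpty() ? null : history.get(history.size() - 1).author;
    }
}
```

With entries for versions 1 (Alice, 7pm), 4 (Bob, 8pm), and 5A (Alice, 9pm), historyOf("Hello.java") would return only the entries whose file lists include Hello.java. Which files each version touched is something we would have to supply when recording the entry.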

Multiple branches

It sometimes makes sense for a subset of the developers to go off and work on a branch, a
parallel code universe for, say, experimenting with a new feature. The other developers don’t
want to pull in the new feature until it is done, even if several coordinated versions are created
in the meantime. Even a single developer can find it useful to create a branch, for the same
reasons that Alice was originally using the cloud server despite working alone.

In general, it will be useful to have many shared places for exchanging project state. There may
be multiple branch locations at once, each shared by several programmers. With the right set-
up, any programmer can pull from or push to any location, creating serious flexibility in
cooperation patterns.

The shocking conclusion

Of course, it turns out we haven’t invented anything here: Git does all these things for you, and
so do many other version control systems.

Distributed vs. centralized

[Diagram: centralized collaboration graph with Alice, Bob, Carol, and Dan around the cloud
server.]

Traditional centralized version control systems like CVS and Subversion do a subset of the
things we’ve imagined above. They support a collaboration graph – who’s sharing what changes
with whom – with one master server and copies that only communicate with the master.

In a centralized system, everyone must share their work to and from the master repository.
Changes are safely stored in version control if they are in the master repository, because that’s
the only repository.

[Diagram: distributed collaboration graph with Alice, Bob, Carol, Dan, and the cloud server.]

In contrast, distributed version control systems like Git and Mercurial allow all sorts of different
collaboration graphs, where teams and subsets of teams can experiment easily with alternate
versions of code and history, merging versions together as they are determined to be good
ideas.

In a distributed system, all repositories are created equal, and it’s up to users to assign them
different roles. Different users might share their work to and from different repos, and the team
must decide what it means for a change to be in version control. If the change is stored in just
a single programmer’s repo, do they still need to share it with a designated collaborator or
specific server before the rest of the team considers it official?

reading exercises

More equal
In 6.005, which of these problem set repos has a special role?
The repository on 6.005’s Athena locker
The repository on Didit
The repository on your laptop
The repository on your desktop


Version control terminology

Repository: a local or remote store of the versions in our project
Working copy: a local, editable copy of our project that we can work on
File: a single file in our project
Version or revision: a record of the contents of our project at a point in time
Change or diff: the difference between two versions
Head: the current version

Features of a version control system

Reliable: keep versions around for as long as we need them; allow backups
Multiple files: track versions of a project, not single files
Meaningful versions: what were the changes, why were they made?
Revert: restore old versions, in whole or in part
Compare versions
Review history: for the whole project or individual files
Not just for code: prose, images, …

It should allow multiple people to work together:

Merge: combine versions that diverged from a common previous version
Track responsibility: who made that change, who touched that line of code?
Work in parallel: allow one programmer to work on their own for a while (without giving up
version control)
Work-in-progress: allow multiple programmers to share unfinished work (without disrupting
others, without giving up version control)

Git

The version control system we’ll use in 6.005 is Git. It’s powerful and worth learning. But Git’s
user interface can be terribly frustrating. What is Git’s user interface?

In 6.005, we will use Git on the command line. The command line is a fact of life,
ubiquitous because it is so powerful.

The command line can make it very difficult to see what is going on in your repositories. You
may find SourceTree for Mac & Windows useful. On any platform, gitk can give you a basic
Git GUI. Ask Google for other suggestions.

An important note about tools for Git:

Eclipse has built-in support for Git. If you follow the problem set instructions, Eclipse will
know your project is in Git and will show you helpful icons. We do not recommend using the
Eclipse Git UI to make changes, commit, etc., and course staff may not be able to help you
with problems.

GitHub makes desktop apps for Mac and Windows. Because the GitHub app changes how
some Git operations work, if you use the GitHub app, course staff will not be able to help
you.

Getting started with Git

On the Git website, you can find two particularly useful resources:

Pro Git documents everything you might need to know about Git.
The Git command reference can help with the syntax of Git commands.

You’ve already completed PS0 and the Getting Started intro to Git.

The Git object graph

Read: Pro Git 1.3: Git Basics

That reading introduces the three pieces of a Git repo: .git directory, working directory, and
staging area.

All of the operations we do with Git — clone, add, commit, push, log, merge, … — are
operations on a graph data structure that stores all of the versions of files in our project, and all
the log entries describing those changes. The Git object graph is stored in the .git directory of
your local repository. Another copy of the graph, e.g. for PS0, is on Athena in:
/mit/6.005/git/fa16/psets/ps0/[your username].git

Copy an object graph with git clone


How do you get the object graph from Athena to your local machine in order to start working on
the problem set? git clone copies the graph.

Suppose your username is bitdiddle :

git clone ssh://.../psets/ps0/bitdiddle.git ps0

This command performs three steps:

1. Create an empty local directory ps0, and ps0/.git.

2. Copy the object graph from ssh://.../psets/ps0/bitdiddle.git into ps0/.git.

3. Check out the current version of the master branch.


We still haven’t explained what’s in the object graph. But before we do that, let’s understand
step 3 of git clone : check out the current version of the master branch.

The object graph is stored on disk in a convenient and efficient structure for performing Git
operations, but not in a format we can easily use. In Alice’s invented version control scheme, the
current version of Hello.java was just called Hello.java because she needed to be able to edit it
normally. In Git, we obtain normal copies of our files by checking them out from the object
graph. These are the files we see and edit in Eclipse.

We also decided above that it might be useful to support multiple branches in the version
history. Multiple branches are essential for large teams working on long-term projects. To keep
things simple in 6.005, we will not use branches and we don’t recommend that you create any.
Every Git repo comes with a default branch called master , and all of our work will be on the
master branch.

So step 2 of git clone gets us an object graph, and step 3 gets us a working directory full of
files we can edit, starting from the current version of the project.

Let’s finally dive into that object graph!

Clone an example repo: https://github.com/mit6005/fa16-ex05-hello-git.git

Using commands from Getting Started or Pro Git 2.3: Viewing the Commit History, or by using
a tool like SourceTree, explain the history of this little project to yourself.

Here’s the output of git lol for this example repository:

* b0b54b3 (HEAD, origin/master, origin/HEAD, master) Greeting in Java
* 3e62e60 Merge
|\
| * 6400936 Greeting in Scheme
* | 82e049e Greeting in Ruby
|/
* 1255f4e Change the greeting
* 41c4b8f Initial commit

The history of a Git project is a directed acyclic graph (DAG). The history graph is the
backbone of the full object graph stored in .git, so let’s focus on it for a minute.

Each node in the history graph is a commit a.k.a. version a.k.a. revision of the project: a
complete snapshot of all the files in the project at that point in time. You may recall from our
earlier reading that each commit is identified by a unique ID, displayed as a hexadecimal
number.

Except for the initial commit, each commit has a pointer to its parent commit. For example,
commit 1255f4e has parent 41c4b8f : this means 41c4b8f happened first, then 1255f4e .

Some commits have the same parent: they are versions that diverged from a common previous
version. And some commits have two parents: they are versions that tie divergent histories back
together.

A branch — remember master will be our only branch for now — is just a name that points to a
commit.

Finally, HEAD points to our current commit — almost. We also need to remember which branch
we’re working on. So HEAD points to the current branch, which points to the current commit.
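The pieces just described (commits with parent pointers, a branch as a name for a commit, and HEAD as a pointer to the current branch) can be modeled in a few lines of Java. This is only a sketch under our own assumptions: the class HistoryGraph and its methods are hypothetical and say nothing about how Git stores the graph on disk. The commit IDs in example() are the ones from the git lol output above.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of a history graph: commits, parent pointers, branches, and HEAD. */
class HistoryGraph {
    // commit ID -> IDs of its parent commits (empty for the initial commit)
    private final Map<String, List<String>> parents = new HashMap<>();
    // branch name -> ID of the commit the branch points to
    private final Map<String, String> branches = new HashMap<>();
    // HEAD names the current branch, which in turn points to the current commit
    private final String head = "master";

    /** Adds a commit with the given parents and moves the current branch to it. */
    void addCommit(String id, String... parentIds) {
        parents.put(id, Arrays.asList(parentIds));
        branches.put(head, id);
    }

    /** Follows HEAD to the current branch, then the branch to its commit. */
    String currentCommit() {
        return branches.get(head);
    }

    /** Returns every commit reachable from id by following parent pointers. */
    Set<String> ancestorsOf(String id) {
        Set<String> seen = new HashSet<>();
        Deque<String> toVisit =
            new ArrayDeque<>(parents.getOrDefault(id, Collections.emptyList()));
        while (!toVisit.isEmpty()) {
            String commit = toVisit.pop();
            if (seen.add(commit)) {
                toVisit.addAll(parents.getOrDefault(commit, Collections.emptyList()));
            }
        }
        return seen;
    }

    /** Builds the history of the example repository shown above. */
    static HistoryGraph example() {
        HistoryGraph graph = new HistoryGraph();
        graph.addCommit("41c4b8f");                        // Initial commit
        graph.addCommit("1255f4e", "41c4b8f");             // Change the greeting
        graph.addCommit("6400936", "1255f4e");             // Greeting in Scheme
        graph.addCommit("82e049e", "1255f4e");             // Greeting in Ruby
        graph.addCommit("3e62e60", "82e049e", "6400936");  // Merge: two parents
        graph.addCommit("b0b54b3", "3e62e60");             // Greeting in Java
        return graph;
    }
}
```

One simplification to note: in this model every new commit moves master, whereas in the real example the Scheme and Ruby commits were made in two different repositories. The parent pointers, which are what make the history a DAG, are the same either way.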

Check your understanding…

reading exercises

HEAD count
How many commits are in this project?

How many different versions of hello.txt are there?


How many times has a new file been added to the project?

How many times has an existing file been modified?

And how many times has a file been deleted?


First impression
What was the original contents of hello.txt ?


Graph-ical
Which of these are a correct representation of the history of this repository?

Choose all the correct answers.



Around and around


What would be the meaning of a cycle in the history graph?
Diverging changes were made in parallel
More than two diverging histories were merged in a single merge
Some commit is its own ancestor
Some commit is a descendant of itself
A pair of commits contain inverse changes
This is bad


What else is in the object graph?


The history graph is the backbone of the full object graph. What else is in there?

Each commit is a snapshot of our entire project, which Git represents with a tree node. For a
project of any reasonable size, most of the files won’t change in any given revision. Storing
redundant copies of the files would be wasteful, so Git doesn’t do that.

Instead, the Git object graph stores each version of an individual file once, and allows multiple
commits to share that one copy. To the left is a more complete rendering of the Git object graph
for our example.

Keep this picture in the back of your mind, because it’s a wonderful example of the sharing
enabled by immutable data types, which we’re going to discuss a few classes from now.

Each commit also has log data — who, when, short log message, etc. — not shown in the
diagram.
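The sharing described above can be sketched with a store that keeps each distinct file version once and hands out an ID for it. This is a simplified illustration, not Git’s actual storage format: the class ObjectStore and its methods are hypothetical, and although Git does derive object IDs from a SHA-1 hash, it hashes a header together with the contents, so the IDs computed here will not match real Git IDs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Toy content store: each distinct version of a file is stored exactly once. */
class ObjectStore {
    // content ID -> file contents
    private final Map<String, String> blobs = new HashMap<>();

    /** Stores the contents if they are not already present, and returns their ID. */
    String store(String contents) {
        String id = sha1Hex(contents);
        blobs.putIfAbsent(id, contents);
        return id;
    }

    /** Returns the contents stored under the given ID, or null if there are none. */
    String contentsOf(String id) {
        return blobs.get(id);
    }

    /** Returns the number of distinct file versions stored. */
    int size() {
        return blobs.size();
    }

    /** Hashes the text with SHA-1 and formats the result as 40 hexadecimal digits. */
    private static String sha1Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }
}
```

A snapshot can then be a map from file name to content ID. If two commits both contain an unchanged hello.txt, their maps hold the same ID and the store holds the text only once. Because stored contents are never modified, sharing them between commits is safe.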

Add to the object graph with git commit


How do we add new commits to the history graph? git commit creates a new commit.

In some alternate universe, git commit might create a new commit based on the current contents
of your working directory. So if you edited Hello.java and then did git commit , the snapshot would
include your changes.

We’re not in that universe; in our universe, Git uses that third and final piece of the repository:
the staging area (a.k.a. the index, which is only a useful name to know because sometimes it
shows up in documentation).
The staging area is like a proto-commit, a commit-in-progress. Here’s how we use the staging
area and git add to build up a new snapshot, which we then cast in stone using git commit :

Modify hello.txt, git add hello.txt, git commit:

1. If we haven’t made any changes yet, then the working directory, staging area, and HEAD
commit are all identical.
2. Make a change to a file. For example, let’s edit hello.txt. Other changes might be creating
a new file, or deleting a file.
3. Stage those changes using git add.
4. Create a new commit out of all the staged changes using git commit.

At step 1, before any changes have been made, git status reports:

$ git status
On branch master
Your branch is up-to-date with 'origin/master'.

nothing to commit, working directory clean

Use git status frequently to keep track of whether you have no changes, unstaged changes, or
staged changes; and whether you have new commits in your local repository that haven’t been
pushed.
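One way to keep the three pieces straight is to model them as three collections of files. The sketch below is a toy under our own assumptions: TinyRepo and its methods are hypothetical, file deletion is ignored, and real Git tracks far more than file names and contents. It does show the essential flow: edits touch only the working directory, add copies a file into the staging area, and commit snapshots the staging area.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of working directory, staging area, and commit history. */
class TinyRepo {
    // file name -> contents, for each of the three pieces of the repository
    private final Map<String, String> workingDir = new HashMap<>();
    private final Map<String, String> stagingArea = new HashMap<>();
    private final List<Map<String, String>> commits = new ArrayList<>();

    /** Edits (or creates) a file in the working directory only. */
    void edit(String file, String contents) {
        workingDir.put(file, contents);
    }

    /** Like git add: copies the file's current contents into the staging area. */
    void add(String file) {
        stagingArea.put(file, workingDir.get(file));
    }

    /** Like git commit: the new commit is a snapshot of the staging area. */
    void commit() {
        commits.add(new HashMap<>(stagingArea));
    }

    /** Returns the snapshot of the most recent commit (empty if there is none). */
    Map<String, String> head() {
        if (commits.isEmpty()) {
            return new HashMap<>();
        }
        return commits.get(commits.size() - 1);
    }

    /** True if the working directory differs from the staging area. */
    boolean hasUnstagedChanges() {
        return !workingDir.equals(stagingArea);
    }

    /** True if the staging area differs from the most recent commit. */
    boolean hasStagedChanges() {
        return !stagingArea.equals(head());
    }
}
```

In this model, git status corresponds to calling hasUnstagedChanges() and hasStagedChanges(): both are false exactly when the working directory, staging area, and HEAD commit are identical.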

reading exercises

Classy

The Java compiler compiles .java files into .class files.


Should you commit .class files to version control?
yes
no


Take the stage


Can we have both staged and unstaged changes at the same time?
yes
no


Upstaged

Suppose we have a repo and there are changes staged for commit.

We run git commit (with no fancy arguments).

After the commit, can there still be changes that are staged?
yes
no


Can there still be changes that are unstaged?


yes
no


Downplayed

Suppose we start at version A of our project.

In version B, we make some changes.

Then in version C, we make exactly the inverse changes we made in version B.

Which of the following is true?


Our working copy now looks like it did when we started at version A
Our version history now looks like it did when we started at version A
Committing version C doesn’t add a new commit, it removes commit B
This sequence of operations makes sense if commit A was a bad idea entirely
This sequence of operations makes sense if commit B was a bad idea entirely
This sequence of operations makes sense if commit C was a bad idea entirely
This sequence of operations doesn’t make any sense


Sequences, trees, and graphs

When you’re working independently, on a single machine, the DAG of your version history will
usually look like a sequence: commit 1 is the parent of commit 2 is the parent of commit 3…

There are three programmers involved in the history of our example repository. Two of them –
Alyssa and Ben – made changes “at the same time.” In this case, “at the same time” doesn’t
mean precisely contemporaneous. Instead, it means they made two different new versions
based on the same previous version, just as Alice made version 5L and 5D on her laptop and
desktop.

When multiple commits share the same parent commit, our history DAG changes from a
sequence to a tree: it branches apart. Notice that a branch in the history of the project doesn’t
require anyone to create a new Git branch, merely that we start from the same commit and
work in parallel on different copies of the repository:


* commit 82e049e248c63289b8a935ce71b130a74dc04152
| Author: Ben Bitdiddle <[email protected]>
| Greeting in Ruby
|
| * commit 64009369c5ab93492931ad07962ee81bda921ded
|/ Author: Alyssa P. Hacker <[email protected]>
| Greeting in Scheme
|
* commit 1255f4e4a5836501c022deb337fda3f8800b02e4
| Author: Max Goldman <[email protected]>
| Change the greeting

Finally, the history DAG changes from tree- to graph-shaped when the branching changes are
merged together:

* commit 3e62e60a7b4a0c262cd8eb4308ac3e5a1e94d839
|\ Author: Max Goldman <[email protected]>
| | Merge
| |
* | commit 82e049e248c63289b8a935ce71b130a74dc04152
| | Author: Ben Bitdiddle <[email protected]>
| | Greeting in Ruby
| |
| * commit 64009369c5ab93492931ad07962ee81bda921ded
|/ Author: Alyssa P. Hacker <[email protected]>
| Greeting in Scheme
|
* commit 1255f4e4a5836501c022deb337fda3f8800b02e4
| Author: Max Goldman <[email protected]>
| Change the greeting

How is it that changes are merged together? First we’ll need to understand how history is
shared between different users and repositories.

Send & receive object graphs with git push & git pull
We can send new commits to a remote repository using git push :

git push origin master

A typical round trip looks like this:

1. When we clone a repository, we obtain a copy of the history graph. Git remembers where we
cloned from as a remote repository called origin.
2. Using git commit, we add new commits to the local history on the master branch.
3. To send those changes back to the origin remote, use git push origin master.

And we receive new commits using git pull. Note that git pull, in addition to fetching new parts
of the object graph, also updates the working copy by checking out the latest version (just like
git clone checked out a working copy to start with).

Merging
Now, let’s examine what happens when changes occur in parallel:

Create and commit hello.scm and hello.rb in parallel:

1. Both Alyssa and Ben clone the repository with two commits (41c4b8f and 1255f4e).
2. Alyssa creates hello.scm and commits her change as 6400936.
3. At the same time, Ben creates hello.rb and commits his change as 82e049e. At this point,
both of their changes only exist in their local repositories. In each repo, master now points
to a different commit.
4. Let’s suppose Alyssa is the first to push her change up to Athena.
5. What happens if Ben tries to push now? The push will be rejected: if the server updates
master to point to Ben’s commit, Alyssa’s commit will disappear from the project history!
6. Ben must merge his changes with Alyssa’s. To perform the merge, he pulls her commit from
Athena, which does two things:
(a) Downloads new commits into Ben’s repository’s object graph.
(b) Merges Ben’s history with Alyssa’s, creating a new commit (3e62e60) that joins together
the disparate histories. This commit is a snapshot like any other: a snapshot of the
repository with both of their changes applied.
7. Now Ben can git push, because no history will go missing when he does.
8. And Alyssa can git pull to obtain Ben’s work.

In this example, Git was able to merge Alyssa’s and Ben’s changes automatically, because they
each modified different files. If both of them had edited the same parts of the same files, Git
would report a merge conflict. Ben would have to manually weave their changes together
before committing the merge. All of this is discussed in the Getting Started section on merges,
merging, and merge conflicts.
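The decision Git makes here can be illustrated with a three-way merge: compare each side against the common ancestor and keep whichever side changed. The sketch below is deliberately coarser than Git. It merges whole files rather than regions within files, and the names FileMerge, merge, and MergeConflict are hypothetical. Each snapshot is a map from file name to contents, and a missing key means the file does not exist in that snapshot.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Three-way merge at whole-file granularity, for illustration only. */
class FileMerge {

    /** Thrown when both sides changed the same file in different ways. */
    static class MergeConflict extends RuntimeException {
        MergeConflict(String file) {
            super("merge conflict in " + file);
        }
    }

    /**
     * Merges two snapshots that both descend from the snapshot base.
     * Each map goes from file name to file contents.
     */
    static Map<String, String> merge(Map<String, String> base,
                                     Map<String, String> ours,
                                     Map<String, String> theirs) {
        Set<String> files = new HashSet<>();
        files.addAll(base.keySet());
        files.addAll(ours.keySet());
        files.addAll(theirs.keySet());

        Map<String, String> merged = new HashMap<>();
        for (String file : files) {
            String b = base.get(file);    // null means the file is absent
            String o = ours.get(file);
            String t = theirs.get(file);

            String result;
            if (Objects.equals(o, t)) {
                result = o;               // both sides agree
            } else if (Objects.equals(o, b)) {
                result = t;               // only their side changed this file
            } else if (Objects.equals(t, b)) {
                result = o;               // only our side changed this file
            } else {
                throw new MergeConflict(file);  // both changed it, differently
            }
            if (result != null) {
                merged.put(file, result);
            }
        }
        return merged;
    }
}
```

In the example above, the base is the snapshot at 1255f4e, Ben’s side adds hello.rb, and Alyssa’s side adds hello.scm, so the merged snapshot contains hello.txt, hello.rb, and hello.scm. Had both of them changed hello.txt to different contents, this sketch would throw MergeConflict, the analogue of Git asking Ben to weave the changes together by hand.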

reading exercises

Merge
Alice and Bob both start with the same Java file:

public class Hello {
    public static void greet(String name) {
        System.out.println(greeting() + ", " + name);
    }
    public static String greeting() {
        return "Hello";
    }
}

Alice changes greet(..):

public static void greet(String name) {
    System.out.println(greeting() +
                       ", " + name + "!");
}

Bob changes greeting():

public static String greeting() {
    return "Ciao";
}

If Git merges the changes of Alice and Bob, what is the result of Hello.greet("Eve") ?
Hello, Eve
Hello, Eve!
Ciao, Eve
Ciao, Eve!
we can automatically merge, but the resulting code is broken (static error)
we can automatically merge, but the resulting code is broken (dynamic error)
we can automatically merge, but the resulting code is broken (no error, wrong answer)
we cannot automatically merge the changes


Dangerous Merge Ahead

Same starting program:

public class Hello {
    public static void greet(String name) {
        System.out.println(greeting() + ", " + name);
    }
    public static String greeting() {
        return "Hello";
    }
}

Alice changes greeting():

public static String greeting() {
    return "Ciao";
}

Bob changes where the comma appears:

public static void greet(String name) {
    System.out.println(greeting() + name);
}
public static String greeting() {
    return "Hello, ";
}

If Git merges the changes of Alice and Bob, what is the result of Hello.greet("Eve") ?
Hello, Eve
HelloEve
Ciao, Eve
CiaoEve

we can automatically merge, but the resulting code is broken (static error)
we can automatically merge, but the resulting code is broken (dynamic error)
we can automatically merge, but the resulting code is broken (no error, wrong answer)
we cannot automatically merge the changes


Continue Merging

Same starting program:

public class Hello {
    public static void greet(String name) {
        System.out.println(greeting() + ", " + name);
    }
    public static String greeting() {
        return "Hello";
    }
}

Alice changes greet(..) to return instead of print:

public static String greet(String name) {
    return greeting() + ", " + name;
}

Bob creates a new file, Main.java:

public class Main {
    public static void main(String[] args) {
        // print a greeting to Eve
        Hello.greet("Eve");
    }
}

If Git merges the changes of Alice and Bob, what is the result of running main ?
Hello, Eve
Hello, Eve!
Ciao, Eve
Ciao, Eve!
we can automatically merge, but the resulting code is broken (static error)
we can automatically merge, but the resulting code is broken (dynamic error)
we can automatically merge, but the resulting code is broken (no error, wrong answer)
we cannot automatically merge the changes


Why do commits look like diffs?


We’ve defined a commit as a snapshot of our entire project, but if you ask Git, it doesn’t seem
to see things that way:

$ git show 1255f4e
commit 1255f4e4a5836501c022deb337fda3f8800b02e4
Author: Max Goldman <[email protected]>
Date: Mon Sep 14 14:58:40 2015 -0400

    Change the greeting

diff --git a/hello.txt b/hello.txt
index c1106ab..3462165 100644
--- a/hello.txt
+++ b/hello.txt
@@ -1 +1 @@
-Hello, version control!
+Hello again, version control!

Git is assuming that most of our project does not change in any given commit, so showing only
the differences will be more useful. Almost all the time, that’s true.

But we can ask Git to show us what was in the repo at a particular commit:

$ git show 3e62e60:
tree 3e62e60:

hello.rb
hello.scm
hello.txt

Yes, the addition of a : completely changes the meaning of that command.

We can also see what was in a particular file in that commit:

$ git show 3e62e60:hello.scm
(display "Hello, version control!")

This is one of the simplest ways you can use Git to recover from a disaster: ask it to git show
you the contents of a now-broken file at some earlier version when the file was OK.

We’ll practice some disaster recovery commands in class.

Version control and the big three


How does version control relate to the three big ideas of 6.005?

Safe from bugs

find when and where something broke
look for other, similar mistakes
gain confidence that code hasn’t changed accidentally

Easy to understand

why was a change made?
what else was changed at the same time?
who can I ask about this code?

Ready for change

all about managing and organizing changes
accept and integrate changes from other developers
isolate speculative work on branches
Checking preconditions is an example of defensive programming.
Real programs are rarely bug-free. Defensive programming offers a
way to mitigate the effects of bugs even if you don’t know where they
are.

Assertions
It is common practice to define a procedure for these kinds of
defensive checks, usually called assert :

assert (x >= 0);

This approach abstracts away from what exactly happens when the
assertion fails. The failed assert might exit; it might record an event in
a log file; it might email a report to a maintainer.

Assertions have the added benefit of documenting an assumption
about the state of the program at that point. To somebody reading
your code, assert (x >= 0) says “at this point, it should always be true
that x >= 0.” Unlike a comment, however, an assertion is executable
code that enforces the assumption at runtime.

In Java, runtime assertions are a built-in feature of the language. The
simplest form of the assert statement takes a boolean expression,
exactly as shown above, and throws AssertionError if the boolean
expression evaluates to false:

assert x >= 0;

An assert statement may also include a description expression, which
is usually a string, but may also be a primitive type or a reference to
an object. The description is printed in an error message when the
assertion fails, so it can be used to provide additional details to the
programmer about the cause of the failure. The description follows
the asserted expression, separated by a colon. For example:

assert (x >= 0) : "x is " + x;

If x == -1, then this assertion fails with the error message

x is -1

along with a stack trace that tells you where the assert statement
was found in your code and the sequence of calls that brought the
program to that point. This information is often enough to get started
in finding the bug.

A serious problem with Java assertions is that assertions are off
by default.

If you just run your program as usual, none of your assertions will be
checked! Java’s designers did this because checking assertions can
sometimes be costly to performance. For example, a procedure that
searches an array using binary search has a requirement that the
array be sorted. Asserting this requirement requires scanning through
the entire array, however, turning an operation that should run in
logarithmic time into one that takes linear time. You should be willing
(eager!) to pay this cost during testing, since it makes debugging
much easier, but not after the program is released to users. For most
applications, however, assertions are not expensive compared to the
rest of the code, and the benefit they provide in bug-checking is worth
that small cost in performance.
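The binary search case can be sketched as follows. The class and method names are our own, chosen for illustration. The point is that the sortedness check is a full scan of the array, so it costs linear time whenever assertions are enabled and nothing when they are disabled.

```java
/** Binary search whose precondition is checked by an assertion. */
class SortedSearch {

    /**
     * Requires: a is sorted in ascending order.
     * Returns an index at which target occurs in a, or -1 if it does not occur.
     */
    static int binarySearch(int[] a, int target) {
        // Linear-time check of the precondition; skipped unless -ea is passed.
        assert isSorted(a) : "array must be sorted";

        int lo = 0;
        int hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;  // written this way to avoid overflow in lo + hi
            if (a[mid] == target) {
                return mid;
            } else if (a[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -1;
    }

    /** Scans the whole array once to confirm ascending order. */
    private static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Note that isSorted only reads the array. It has no side effects, so the result of binarySearch on a sorted array is the same whether or not assertions are enabled.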

So you have to enable assertions explicitly by passing -ea (which
stands for enable assertions) to the Java virtual machine. In Eclipse,
you enable assertions by going to Run → Run Configurations →
Arguments, and putting -ea in the VM arguments box. It’s best, in
fact, to enable them by default by going to Preferences → Java →
Installed JREs → Edit → Default VM Arguments, as you hopefully did
in the Getting Started instructions.

It’s always a good idea to have assertions turned on when you’re
running JUnit tests. You can ensure that assertions are enabled using
the following test case:

@Test(expected=AssertionError.class)
public void testAssertionsEnabled() {
    assert false;
}

If assertions are turned on as desired, then assert false throws an
AssertionError. The annotation (expected=AssertionError.class) on the test
expects and requires this error to be thrown, so the test passes. If
assertions are turned off, however, then the body of the test will do
nothing, failing to throw the expected exception, and JUnit will mark
the test as failing.

Note that the Java assert statement is a different mechanism from the
JUnit methods assertTrue(), assertEquals(), etc. They all assert a
predicate about your code, but are designed for use in different
contexts. The assert statement should be used in implementation
code, for defensive checks inside the implementation. JUnit
assert...() methods should be used in JUnit tests, to check the result
of a test. The assert statements don’t run without -ea, but the JUnit
assert...() methods always run.

What to Assert
Here are some things you should assert:

Method argument requirements, like we saw for sqrt.

Method return value requirements. This kind of assertion is
sometimes called a self check. For example, the sqrt method might
square its result to check whether it is reasonably close to x:

public double sqrt(double x) {
    assert x >= 0;
    double r;
    ... // compute result r
    assert Math.abs(r*r - x) < .0001;
    return r;
}

Covering all cases. If a conditional statement or switch does not
cover all the possible cases, it is good practice to use an assertion to
block the illegal cases:

switch (vowel) {
    case 'a':
    case 'e':
    case 'i':
    case 'o':
    case 'u': return "A";
    default: assert false;
}

The assertion in the default clause has the effect of asserting that
vowel must be one of the five vowel letters.

When should you write runtime assertions? As you write the code, not
after the fact. When you’re writing the code, you have the invariants in
mind. If you postpone writing assertions, you’re less likely to do it,
and you’re liable to omit some important invariants.

What Not to Assert


Runtime assertions are not free. They can clutter the code, so they
must be used judiciously. Avoid trivial assertions, just as you would
avoid uninformative comments. For example:

// don't do this:
x = y + 1;
assert x == y+1;

This assertion doesn’t find bugs in your code. It finds bugs in the
compiler or Java virtual machine, which are components that you
should trust until you have good reason to doubt them. If an assertion
is obvious from its local context, leave it out.

Never use assertions to test conditions that are external to your
program, such as the existence of files, the availability of the network,
or the correctness of input typed by a human user. Assertions test the
internal state of your program to ensure that it is within the bounds of
its specification. When an assertion fails, it indicates that the program
has run off the rails in some sense, into a state in which it was not
designed to function properly. Assertion failures therefore indicate
bugs. External failures are not bugs, and there is no change you can
make to your program in advance that will prevent them from
happening. External failures should be handled using exceptions
instead.
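
As a rough illustration of the difference, the sketch below reads a file and reports a missing or unreadable file through IOException rather than through an assertion. The class name ExternalFailureDemo, the method countLines, and the default file name notes.txt are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch: names here are assumptions, not part of the reading.
class ExternalFailureDemo {
    /**
     * @param file path of a text file
     * @return the number of lines in the file
     * @throws IOException if the file is missing or unreadable (an external failure)
     */
    static int countLines(Path file) throws IOException {
        // don't do this: assert Files.exists(file);
        // A missing file is not a bug in this program, and the check
        // would vanish whenever assertions are disabled.
        return Files.readAllLines(file).size();
    }

    public static void main(String[] args) {
        // "notes.txt" is a hypothetical default used when no argument is given
        Path file = Paths.get(args.length > 0 ? args[0] : "notes.txt");
        try {
            System.out.println(countLines(file) + " lines");
        } catch (IOException e) {
            // the caller decides how to recover from the external failure
            System.out.println("could not read " + file + ": " + e.getMessage());
        }
    }
}
```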

Many assertion mechanisms are designed so that assertions are
executed only during testing and debugging, and turned off when the
program is released to users. Java’s assert statement behaves this
way. Since assertions may be disabled, the correctness of your
program should never depend on whether or not the assertion
expressions are executed. In particular, asserted expressions should
not have side-effects. For example, if you want to assert that an
element removed from a list was actually found in the list, don’t write
it like this:

// don't do this:
assert list.remove(x);

If assertions are disabled, the entire expression is skipped, and x is
never removed from the list. Write it like this instead:

boolean found = list.remove(x);
assert found;

reading exercises

Assertions

Consider this (incomplete) function:

/**
* Solves quadratic equation ax^2 + bx + c = 0.
*
* @param a quadratic coefficient, requires a != 0
* @param b linear coefficient
* @param c constant term
* @return a list of the real roots of the equation
*/
public static List<Double> quadraticRoots(final int a, final int b, final i
List<Double> roots = new ArrayList<Double>();
// A
... // compute roots
// B
return roots;
}

What statements would be reasonable to write at position A?


assert a != 0;
assert b != 0;
assert c != 0;
assert roots.size() >= 0;
assert roots.size() <= 2;
for (double x : roots) { assert Math.abs(a*x*x + b*x + c) < 0.0001; }

What statements would be reasonable to write at position B?
assert a != 0;
assert b != 0;
assert c != 0;
assert roots.size() >= 0;
assert roots.size() <= 2;
for (double x : roots) { assert Math.abs(a*x*x + b*x + c) < 0.0001; }


Incremental Development

A great way to localize bugs to a tiny part of the program is
incremental development. Build only a bit of your program at a time,
and test that bit thoroughly before you move on. That way, when you
discover a bug, it’s more likely to be in the part that you just wrote,
rather than anywhere in a huge pile of code.

Our class on testing talked about two techniques that help with this:

Unit testing: when you test a module in isolation, you can be
confident that any bug you find is in that unit – or maybe in the test
cases themselves.
Regression testing: when you’re adding a new feature to a big
system, run the regression test suite as often as possible. If a
test fails, the bug is probably in the code you just changed.

Modularity & Encapsulation


You can also localize bugs by better software design.
Modularity. Modularity means dividing up a system into components,
or modules, each of which can be designed, implemented, tested,
reasoned about, and reused separately from the rest of the system.
The opposite of a modular system is a monolithic system – big and
with all of its pieces tangled up and dependent on each other.

A program consisting of a single, very long main() function is
monolithic – harder to understand, and harder to isolate bugs in. By
contrast, a program broken up into small functions and classes is
more modular.

Encapsulation. Encapsulation means building walls around a module
(a hard shell or capsule) so that the module is responsible for its own
internal behavior, and bugs in other parts of the system can’t damage
its integrity.

One kind of encapsulation is access control, using public and private
to control the visibility and accessibility of your variables and
methods. A public variable or method can be accessed by any code
(assuming the class containing that variable or method is also public).
A private variable or method can only be accessed by code in the
same class. Keeping things private as much as possible, especially
for variables, provides encapsulation, since it limits the code that
could inadvertently cause bugs.
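
A small sketch of access control at work follows. The Counter class is hypothetical and not part of the reading; it only shows how a private field limits the code that can change it.

```java
// Hypothetical example class, for illustration only.
class Counter {
    // private: only code inside Counter can read or write this field
    private int count = 0;

    /** Adds one to the counter. */
    public void increment() {
        ++count;
    }

    /** @return the number of times increment() has been called */
    public int getCount() {
        return count;
    }
}
```

If count ever holds a wrong value, the bug has to be in one of Counter's own methods, because no other class can assign to it.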

Another kind of encapsulation comes from variable scope. The
scope of a variable is the portion of the program text over which that
variable is defined, in the sense that expressions and statements can
refer to the variable. A method parameter’s scope is the body of the
method. A local variable’s scope extends from its declaration to the
next closing curly brace. Keeping variable scopes as small as
possible makes it much easier to reason about where a bug might be
in the program. For example, suppose you have a loop like this:

for (i = 0; i < 100; ++i) {
    ...
    doSomeThings();
    ...
}

…and you’ve discovered that this loop keeps running forever – i
never reaches 100. Somewhere, somebody is changing i . But
where? If i is declared as a global variable like this:

public static int i;
...
for (i = 0; i < 100; ++i) {
    ...
    doSomeThings();
    ...
}

…then its scope is the entire program. It might be changed anywhere
in your program: by doSomeThings() , by some other method that
doSomeThings() calls, by a concurrent thread running some completely
different code. But if i is instead declared as a local variable with a
narrow scope, like this:

for (int i = 0; i < 100; ++i) {
    ...
    doSomeThings();
    ...
}

…then the only place where i can be changed is within the for
statement – in fact, only in the … parts that we’ve omitted. You don’t
even have to consider doSomeThings() , because doSomeThings() doesn’t
have access to this local variable.

Minimizing the scope of variables is a powerful practice for bug
localization. Here are a few rules that are good for Java:

Always declare a loop variable in the for-loop initializer. So
rather than declaring it before the loop:

int i;
for (i = 0; i < 100; ++i) {

which makes the scope of the variable the entire rest of the outer
curly-brace block containing this code, you should do this:

for (int i = 0; i < 100; ++i) {

which makes the scope of i limited just to the for loop.

Declare a variable only when you first need it, and in the
innermost curly-brace block that you can. Variable scopes in
Java are curly-brace blocks, so put your variable declaration in
the innermost one that contains all the expressions that need to
use the variable. Don’t declare all your variables at the start of the
function – it makes their scopes unnecessarily large. But note that
in languages without static type declarations, like Python and
Javascript, the scope of a variable is normally the entire function
anyway, so you can’t restrict the scope of a variable with curly
braces, alas.
Avoid global variables. Very bad idea, especially as programs
get large. Global variables are often used as a shortcut to provide
a parameter to several parts of your program. It’s better to just
pass the parameter into the code that needs it, rather than putting
it in global space where it can be inadvertently reassigned.
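
The three rules above can be seen together in a short sketch. The class ScopeDemo and its join method are hypothetical names chosen for illustration: the separator is passed as a parameter rather than stored in a global, the loop variable lives in the for-loop initializer, and word is declared in the innermost block that uses it.

```java
import java.util.List;

// Hypothetical example, not part of the reading.
class ScopeDemo {
    /**
     * @param words strings to join
     * @param separator text placed between consecutive words
     * @return the words joined in order, with separator between them
     */
    static String join(List<String> words, String separator) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < words.size(); ++i) {   // i is visible only inside the loop
            if (i > 0) {
                result.append(separator);          // a parameter, not a global variable
            }
            String word = words.get(i);            // declared where first needed
            result.append(word);
        }
        return result.toString();
    }
}
```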

reading exercises

Variable scope

Consider the following code (which is missing some variable
declarations):

1 class Apartment {
2 Apartment(String newAddress) {
3 this.address = newAddress;
4 this.roommates = new HashSet<Person>();
5 }
6
7 String getAddress() {
8 return address;
9 }
10
11 void addRoommate(Person newRoommate) {
12 roommates.add(newRoommate);
13 if (roommates.size() > MAXIMUM_OCCUPANCY) {
14 roommates.remove(newRoommate);
15 throw new TooManyPeopleException();
16 }
17 }
18
19 int getMaximumOccupancy() {
20 return MAXIMUM_OCCUPANCY;
21 }
22 }

Which of these lines are within the scope of the newRoommate variable?
line 3
line 8
line 12
line 15
line 20


What would be the scope for the (currently undeclared) address
variable?
lines 2-21
lines 3-4
line 8
lines 12-16


Out of the choices below, what is the best declaration for the
roommates variable?
List<Person> roommates;
Set<Person> roommates;
final Set<Person> roommates;
HashSet<Person> roommates;


Out of the choices below, what is the best declaration for the
MAXIMUM_OCCUPANCY variable?
int MAXIMUM_OCCUPANCY = 8;
final int MAXIMUM_OCCUPANCY = 8;
static int MAXIMUM_OCCUPANCY = 8;
static final int MAXIMUM_OCCUPANCY = 8;

Summary

In this reading, we looked at some ways to minimize the cost of
debugging:

Avoid debugging
make bugs impossible with techniques like static typing,
automatic dynamic checking, and immutable types and
references
Keep bugs confined
failing fast with assertions keeps a bug’s effects from
spreading
incremental development and unit testing confine bugs to your
recent code
scope minimization reduces the amount of the program you
have to search

Thinking about our three main measures of code quality:

Safe from bugs. We’re trying to prevent them and get rid of
them.
Easy to understand. Techniques like static typing, final
declarations, and assertions are additional documentation of the
assumptions in your code. Variable scope minimization makes it
easier for a reader to understand how the variable is used,
because there’s less code to look at.
Ready for change. Assertions and static typing document the
assumptions in an automatically-checkable way, so that when a
future programmer changes the code, accidental violations of
those assumptions are detected.
Reading 9: Mutability & Immutability

Objectives

Understand mutability and mutable objects


Identify aliasing and understand the dangers of mutability
Use immutability to improve correctness, clarity, & changeability

Mutability

Recall from Basic Java when we discussed snapshot diagrams that some
objects are immutable: once created, they always represent the same
value. Other objects are mutable: they have methods that change the value
of the object.

String is an example of an immutable type. A String object always
represents the same string. StringBuilder is an example of a mutable type. It
has methods to delete parts of the string, insert or replace characters, etc.
Since String is immutable, once created, a String object always has the
same value. To add something to the end of a String, you have to create a
new String object:

String s = "a";
s = s.concat("b"); // s+="b" and s=s+"b" also mean the same thing

By contrast, StringBuilder objects are mutable. This class has methods that
change the value of the object, rather than just returning new values:

StringBuilder sb = new StringBuilder("a");
sb.append("b");

StringBuilder has other methods as well, for deleting parts of the string,
inserting in the middle, or changing individual characters.

So what? In both cases, you end up with s and sb referring to the string of
characters "ab" . The difference between mutability and immutability doesn’t
matter much when there’s only one reference to the object. But there are
big differences in how they behave when there are other references to the
object. For example, when another variable t points to the same String
object as s , and another variable tb points to the same StringBuilder as sb ,
then the differences between the immutable and mutable objects become
more evident:

String t = s;
t = t + "c";

StringBuilder tb = sb;
tb.append("c");

This shows that changing t had no effect on s , but changing tb affected sb
too, possibly to the surprise of the programmer. That’s the essence of the
problem we’re going to look at in this reading.
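
The snippets above can be combined into one runnable sketch to observe the difference directly. The class name AliasingDemo and the run method are hypothetical; the statements inside are the ones from the text.

```java
// Hypothetical wrapper around the snippets above, for illustration only.
class AliasingDemo {
    /** @return the final values of s, t, sb, and tb, in that order */
    static String[] run() {
        String s = "a";
        s = s.concat("b");           // s now refers to a new String "ab"
        String t = s;                // t and s refer to the same String object
        t = t + "c";                 // t is redirected to a new String; s is untouched

        StringBuilder sb = new StringBuilder("a");
        sb.append("b");              // the one StringBuilder now holds "ab"
        StringBuilder tb = sb;       // tb is an alias for the same StringBuilder
        tb.append("c");              // mutates the shared object, visible through sb

        return new String[] { s, t, sb.toString(), tb.toString() };
    }

    public static void main(String[] args) {
        for (String value : run()) {
            System.out.println(value);   // prints ab, abc, abc, abc on separate lines
        }
    }
}
```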

Since we have the immutable String class already, why do we even need
the mutable StringBuilder in programming? A common use for it is to
concatenate a large number of strings together. Consider this code:

String s = "";
for (int i = 0; i < n; ++i) {
    s = s + i;
}

Using immutable strings, this makes a lot of temporary copies — the first
number of the string ( "0" ) is actually copied n times in the course of building
up the final string, the second number is copied n-1 times, and so on. It
actually costs O(n²) time just to do all that copying, even though we only
concatenated n elements.

StringBuilder is designed to minimize this copying. It uses a simple but clever
internal data structure to avoid doing any copying at all until the very end,
when you ask for the final String with a toString() call:

StringBuilder sb = new StringBuilder();
for (int i = 0; i < n; ++i) {
sb.append(String.valueOf(i));
}
String s = sb.toString();

Getting good performance is one reason why we use mutable objects.
Another is convenient sharing: two parts of your program can communicate
more conveniently by sharing a common mutable data structure.

reading exercises

Follow me

Can a client with the variable terrarium modify the Turtle in red?
No, because the terrarium reference is immutable
No, because the Turtle object is immutable
Yes, because the reference from list index 0 to the Turtle is mutable
Yes, because the Turtle object is mutable


Can a client with the variable george modify the Gecko in blue?
No, because the george reference is immutable
No, because the Gecko object is immutable
Yes, because the reference from list index 1 to the Gecko is mutable
Yes, because the Gecko object is mutable


Can a client with just the variable petStore make it impossible for a client with
just the variable terrarium to reach the Gecko in blue?

Choose the best answer.

No, because the terrarium reference is immutable
No, because the Gecko object is immutable
Yes, because the petStore reference is mutable
Yes, because the PetStore object is mutable
Yes, because the List object is mutable
Yes, because the reference from list index 1 to the Gecko is mutable


Risks of mutation

Mutable types seem much more powerful than immutable types. If you were
shopping in the Datatype Supermarket, and you had to choose between a
boring immutable String and a super-powerful-do-anything mutable
StringBuilder , why on earth would you choose the immutable one?
StringBuilder should be able to do everything that String can do, plus setCharAt()
and append() and everything else.


The answer is that immutable types are safer from bugs, easier to
understand, and more ready for change. Mutability makes it harder to
understand what your program is doing, and much harder to enforce
contracts. Here are two examples that illustrate why.

Risky example #1: passing mutable values

Let’s start with a simple method that sums the integers in a list:

/** @return the sum of the numbers in the list */
public static int sum(List<Integer> list) {
    int sum = 0;
    for (int x : list)
        sum += x;
    return sum;
}

Suppose we also need a method that sums the absolute values. Following
good DRY practice (Don’t Repeat Yourself), the implementer writes a
method that uses sum() :

/** @return the sum of the absolute values of the numbers in the list */
public static int sumAbsolute(List<Integer> list) {
// let's reuse sum(), because DRY, so first we take absolute values
for (int i = 0; i < list.size(); ++i)
list.set(i, Math.abs(list.get(i)));
return sum(list);
}

Notice that this method does its job by mutating the list directly. It
seemed sensible to the implementer, because it’s more efficient to reuse the
existing list. If the list is millions of items long, then you’re saving the time
and memory of generating a new million-item list of absolute values. So the
implementer has two very good reasons for this design: DRY, and
performance.

But the resulting behavior will be very surprising to anybody who uses it! For
example:

// meanwhile, somewhere else in the code...
public static void main(String[] args) {
// ...
List<Integer> myData = Arrays.asList(-5, -3, -2);
System.out.println(sumAbsolute(myData));
System.out.println(sum(myData));
}

What will this code print? Will it be 10 followed by -10? Or something else?

reading exercises

Risky #1
What will the code print?


Let’s think about the key points here:

Safe from bugs? In this example, it’s easy to blame the implementer of
sumAbsolute() for going beyond what its spec allowed. But really, passing
mutable objects around is a latent bug. It’s just waiting for some
programmer to inadvertently mutate that list, often with very good
intentions like reuse or performance, but resulting in a bug that may be
very hard to track down.

Easy to understand? When reading main() , what would you assume
about sum() and sumAbsolute() ? Is it clearly visible to the reader that myData
gets changed by one of them?
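
For comparison, here is a sketch of one way sumAbsolute() could keep reusing sum() without touching its argument. This is an illustrative alternative, not the reading's own fix: the class name SumDemo is hypothetical, and the version shown pays for safety with a temporary copy of the list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container class for the two methods discussed above.
class SumDemo {
    /** @return the sum of the numbers in the list */
    static int sum(List<Integer> list) {
        int sum = 0;
        for (int x : list) {
            sum += x;
        }
        return sum;
    }

    /** @return the sum of the absolute values of the numbers in the list */
    static int sumAbsolute(List<Integer> list) {
        // Build a fresh list of absolute values so the caller's list is never modified.
        List<Integer> absolutes = new ArrayList<>();
        for (int x : list) {
            absolutes.add(Math.abs(x));
        }
        return sum(absolutes);
    }
}
```

The copy costs time and memory proportional to the list length, which is exactly the cost the original implementer was trying to avoid; the trade is performance for a method whose effect on its argument is easy to predict.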


The this keyword is used at one point to refer to the instance object, in
particular to refer to an instance variable ( this.list ). This was done to
disambiguate two different variables named list (an instance variable and a
constructor parameter). Most of MyIterator ’s code refers to instance
variables without an explicit this , but this is just a convenient shorthand that
Java supports — e.g., index actually means this.index .

private is used for the object’s internal state and internal helper methods,
while public indicates methods and constructors that are intended for clients
of the class (access control).

final is used to indicate which of the object’s internal variables can be
reassigned and which can’t. index is allowed to change ( next() updates it as
it steps through the list), but list cannot (the iterator has to keep pointing at
the same list for its entire life — if you want to iterate through another list,
you’re expected to create another iterator object).

Here’s a snapshot diagram showing a typical state for a MyIterator object in
action:

Note that we draw the arrow from list with a double line, to indicate that it’s
final. That means the arrow can’t change once it’s drawn. But the ArrayList
object it points to is mutable (elements can be changed within it), and
declaring list as final has no effect on that.
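
A short sketch shows the same point in code. The class FinalReferenceDemo is hypothetical; it only demonstrates that final stops reassignment of the variable, not mutation of the object it refers to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example, not the MyIterator class from the reading.
class FinalReferenceDemo {
    /** @return the list after it has been mutated through a final reference */
    static List<String> demo() {
        final List<String> list = new ArrayList<>();
        list.add("a");        // allowed: mutates the ArrayList object
        list.set(0, "b");     // allowed: still the same object, new contents
        // list = new ArrayList<>();   // would not compile: list is final
        return list;
    }
}
```

Useful immutable types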
Useful immutable types
Since immutable types avoid so many pitfalls, let’s enumerate some
commonly-used immutable types in the Java API:

The primitive types and primitive wrappers are all immutable. If you need
to compute with large numbers, BigInteger and BigDecimal are immutable.

Don’t use mutable Date s, use the appropriate immutable type from
java.time based on the granularity of timekeeping you need.

The usual implementations of Java’s collections types (List, Set, Map)
are all mutable: ArrayList , HashMap , etc. The Collections utility class has
methods for obtaining unmodifiable views of these mutable collections:

Collections.unmodifiableList

Collections.unmodifiableSet

Collections.unmodifiableMap

You can think of the unmodifiable view as a wrapper around the
underlying list/set/map. A client who has a reference to the wrapper and
tries to perform mutations — add , remove , put , etc. — will trigger an
UnsupportedOperationException .

Before we pass a mutable collection to another part of our program, we
can wrap it in an unmodifiable wrapper. We should be careful at that
point to forget our reference to the mutable collection, lest we
accidentally mutate it. (One way to do that is to let it go out of scope.)
Just as a mutable object behind a final reference can be mutated, the
mutable collection inside an unmodifiable wrapper can still be modified
by someone with a reference to it, defeating the wrapper.

Collections also provides methods for obtaining immutable empty
collections: Collections.emptyList , etc. Nothing’s worse than discovering
your definitely very empty list is suddenly definitely not empty!
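
Both halves of the wrapper's behavior can be checked with a small sketch. The class UnmodifiableDemo and its method names are hypothetical; Collections.unmodifiableList and UnsupportedOperationException are the standard library pieces named above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical example class, for illustration only.
class UnmodifiableDemo {
    /** @return true if mutating through the wrapper is rejected */
    static boolean wrapperRejectsAdd() {
        List<String> mutable = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> view = Collections.unmodifiableList(mutable);
        try {
            view.add("c");                 // mutation through the wrapper
            return false;
        } catch (UnsupportedOperationException e) {
            return true;                   // the wrapper refused the mutation
        }
    }

    /** @return the size seen through the wrapper after the underlying list grows */
    static int sizeAfterBackdoorAdd() {
        List<String> mutable = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> view = Collections.unmodifiableList(mutable);
        mutable.add("c");                  // mutation through the kept reference
        return view.size();                // the view reflects the change
    }
}
```

The second method is the reason to drop the reference to the mutable collection once it has been wrapped: the wrapper is a view, not a copy.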

reading exercises

Immutability
Which of the following are correct?
1. A class is immutable if all of its fields are final
2. A class is immutable if instances of it always represent the same value
3. Instances of an immutable class can be safely shared
4. Objects can be made immutable using defensive copying
5. Immutability allows us to reason about global properties instead of
local ones


Summary
In this reading, we saw that mutability is useful for performance and
convenience, but it also creates risks of bugs by requiring the code that
uses the objects to be well-behaved on a global level, greatly complicating
the reasoning and testing we have to do to be confident in its correctness.

Make sure you understand the difference between an immutable object (like
a String ) and an immutable reference (like a final variable). Snapshot
diagrams can help with this understanding. Objects are values, represented
by circles in a snapshot diagram, and an immutable one has a double
border indicating that it never changes its value. A reference is a pointer to
an object, represented by an arrow in the snapshot diagram, and an
immutable reference is an arrow with a double line, indicating that the arrow
can’t be moved to point to a different object.

The key design principle here is immutability: using immutable objects and
immutable references as much as possible. Let’s review how immutability
helps with the main goals of this course:

Safe from bugs. Immutable objects aren’t susceptible to bugs caused
by aliasing. Immutable references always point to the same object.

Easy to understand. Because an immutable object or reference always
means the same thing, it’s simpler for a reader of the code to reason
about — they don’t have to trace through all the code to find all the
places where the object or reference might be changed, because it can’t
be changed.

Ready for change. If an object or reference can’t be changed at
runtime, then code that depends on that object or reference won’t have
to be revised when the program changes.
Reading 11: Abstract Data Types

Objectives

Today’s class introduces two ideas:

Abstract data types


Representation independence

In this reading, we look at a powerful idea, abstract data types, which enable us to separate how we
use a data structure in a program from the particular form of the data structure itself.

Abstract data types address a particularly dangerous problem: clients making assumptions about the
type’s internal representation. We’ll see why this is dangerous and how it can be avoided. We’ll also
discuss the classification of operations, and some principles of good design for abstract data types.

Access Control in Java

You should already have read: Controlling Access to Members of a Class in the Java Tutorials.

reading exercises

The following questions use the code below. Study it first, then answer the questions.

class Wallet {
    private int amount;

    public void loanTo(Wallet that) {
        // put all of this wallet's money into that wallet
/*A*/   that.amount += this.amount;
/*B*/   amount = 0;
    }

    public static void main(String[] args) {
/*C*/   Wallet w = new Wallet();
/*D*/   w.amount = 100;
/*E*/   w.loanTo(w);
    }
}

class Person {
    private Wallet w;

    public int getNetWorth() {
/*F*/   return w.amount;
    }

    public boolean isBroke() {
/*G*/   return Wallet.amount == 0;
    }
}

Access control A
Which of the following statements are true about the line marked /*A*/ ?

that.amount += this.amount;

The reference to this.amount is allowed by Java.


The reference to this.amount is not allowed by Java because it uses this to access a private field.
The reference to that.amount is allowed by Java.
The reference to that.amount is not allowed by Java because that.amount is a private field in a different
object.
The reference to that.amount is not allowed by Java because it writes to a private field.
The illegal access(es) are caught statically.
The illegal access(es) are caught dynamically.


Access control B
Which of the following statements are true about the line marked /*B*/ ?

amount = 0;

The reference to amount is allowed by Java.


The reference to amount is not allowed by Java because it doesn’t use this .

The illegal access is caught statically.


The illegal access is caught dynamically.


Access control C
Which of the following statements are true about the line marked /*C*/ ?

Wallet w = new Wallet();

The call to the Wallet() constructor is allowed by Java.


The call to the Wallet() constructor is not allowed by Java because there is no public Wallet()

constructor declared.
The illegal access is caught statically.
The illegal access is caught dynamically.


Access control D
Which of the following statements are true about the line marked /*D*/ ?

w.amount = 100;

The access to w.amount is allowed by Java.


The access to w.amount is not allowed by Java because amount is private.
The illegal access is caught statically.
The illegal access is caught dynamically.


Access control E
Which of the following statements are true about the line marked /*E*/ ?

w.loanTo(w);

The call to loanTo() is allowed by Java.


The call to loanTo() is not allowed by Java because this and that will be aliases to the same object.
The problem will be found by a static check.
The problem will be found by a dynamic check.
After this line, the Wallet object pointed to by w will have amount 0.
After this line, the Wallet object pointed to by w will have amount 100.
After this line, the Wallet object pointed to by w will have amount 200.


Access control F
Which of the following statements are true about the line marked /*F*/ ?

return w.amount;

The reference to w.amount is allowed by Java because both w and amount are private variables.
The reference to w.amount is allowed by Java because amount is a primitive type, even though it’s
private.
The reference to w.amount is not allowed by Java because amount is a private field in a different class.
The illegal access is caught statically.
The illegal access is caught dynamically.


Access control G
Which of the following statements are true about the line marked /*G*/ ?

return Wallet.amount == 0;

The reference to Wallet.amount is allowed by Java because Wallet has permission to access its own
private field amount .

The reference to Wallet.amount is allowed by Java because amount is a static variable.


The reference to Wallet.amount is not allowed by Java because amount is a private field.
The reference to Wallet.amount is not allowed by Java because amount is an instance variable.
The illegal access is caught statically.
The illegal access is caught dynamically.


What Abstraction Means

Abstract data types are an instance of a general principle in software engineering, which goes by many
names with slightly different shades of meaning. Here are some of the names that are used for this
idea:

Abstraction. Omitting or hiding low-level details with a simpler, higher-level idea.


Modularity. Dividing a system into components or modules, each of which can be designed,
implemented, tested, reasoned about, and reused separately from the rest of the system.
Encapsulation. Building walls around a module (a hard shell or capsule) so that the module is
responsible for its own internal behavior, and bugs in other parts of the system can’t damage its
integrity.
Information hiding. Hiding details of a module’s implementation from the rest of the system, so
that those details can be changed later without changing the rest of the system.
Separation of concerns. Making a feature (or “concern”) the responsibility of a single module,
rather than spreading it across multiple modules.

As a software engineer, you should know these terms, because you will run into them frequently. The
fundamental purpose of all of these ideas is to help achieve the three important properties that we care
about in 6.005: safety from bugs, ease of understanding, and readiness for change.

User-Defined Types

In the early days of computing, a programming language came with built-in types (such as integers,
booleans, strings, etc.) and built-in procedures, e.g., for input and output. Users could define their own
procedures: that’s how large programs were built.
A major advance in software development was the idea of abstract types: that one could design a
programming language to allow user-defined types, too. This idea came out of the work of many
researchers, notably Dahl (the inventor of the Simula language), Hoare (who developed many of the
techniques we now use to reason about abstract types), Parnas (who coined the term information
hiding and first articulated the idea of organizing program modules around the secrets they
encapsulated), and here at MIT, Barbara Liskov and John Guttag, who did seminal work in the
specification of abstract types, and in programming language support for them – and developed the
original 6.170, the predecessor to 6.005. Barbara Liskov earned the Turing Award, computer science’s
equivalent of the Nobel Prize, for her work on abstract types.

The key idea of data abstraction is that a type is characterized by the operations you can perform on
it. A number is something you can add and multiply; a string is something you can concatenate and
take substrings of; a boolean is something you can negate, and so on. In a sense, users could already
define their own types in early programming languages: you could create a record type date, for
example, with integer fields for day, month, and year. But what made abstract types new and different
was the focus on operations: the user of the type would not need to worry about how its values were
actually stored, in the same way that a programmer can ignore how the compiler actually stores
integers. All that matters is the operations.

In Java, as in many modern programming languages, the separation between built-in types and user-
defined types is a bit blurry. The classes in java.lang, such as Integer and Boolean, are built-in; whether
you regard all the collections of java.util as built-in is less clear (and not very important anyway). Java
complicates the issue by having primitive types that are not objects. The set of these types, such as int
and boolean, cannot be extended by the user.

reading exercises

Abstract Data Types

Consider an abstract data type Bool . The type has the following operations:

true : Bool
false : Bool
and : Bool × Bool → Bool
or : Bool × Bool → Bool
not : Bool → Bool

… where the first two operations construct the two values of the type, and the last three operations
have the usual meanings of logical and, logical or, and logical not on those values.

Which of the following are possible ways that Bool might be implemented, and still be able to satisfy the
specs of the operations? Choose all that apply.
As a single bit, where 1 means true and 0 means false.
As an int value where 5 means true and 8 means false.
As a reference to a String object where "false" means true and "true" means false.
As a long value where all possible values mean true.


Classifying Types and Operations

Types, whether built-in or user-defined, can be classified as mutable or immutable. The objects of a
mutable type can be changed: that is, they provide operations which, when executed, cause other
operations on the same object to give different results. So Date is mutable, because you can
call setMonth and observe the change with the getMonth operation. But String is immutable, because its
operations create new String objects rather than changing existing ones. Sometimes a type will be
provided in two forms, a mutable and an immutable form. StringBuilder , for example, is a mutable
version of String (although the two are certainly not the same Java type, and are not interchangeable).
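
The contrast can be seen in a few lines. This is a minimal sketch that uses only standard String and StringBuilder methods; the class name MutabilityDemo and its method names are made up for illustration.

```java
class MutabilityDemo {
    /** Immutable: concat returns a new String and leaves s untouched. */
    static String immutableExample() {
        String s = "ab";
        String t = s.concat("cd");   // creates a new String object
        return s + "," + t;          // s is still "ab"
    }

    /** Mutable: append changes the StringBuilder itself. */
    static String mutableExample() {
        StringBuilder sb = new StringBuilder("ab");
        int before = sb.length();    // observed before the mutation
        sb.append("cd");             // changes sb in place
        return before + "," + sb.length() + "," + sb;
    }

    public static void main(String[] args) {
        System.out.println(immutableExample());
        System.out.println(mutableExample());
    }
}
```

After append, the same StringBuilder object reports a different length, which is exactly the "other operations give different results" test for mutability. No String operation can make s.length() change.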

The operations of an abstract type are classified as follows:

Creators create new objects of the type. A creator may take an object as an argument, but not an
object of the type being constructed.
Producers create new objects from old objects of the type. The concat method of String , for
example, is a producer: it takes two strings and produces a new one representing their
concatenation.
Observers take objects of the abstract type and return objects of a different type. The size method
of List , for example, returns an int .
Mutators change objects. The add method of List , for example, mutates a list by adding an element
to the end.

We can summarize these distinctions schematically like this (explanation to follow):

creator : t* → T
producer : T+, t* → T
observer : T+, t* → t
mutator : T+, t* → void | t | T

These show informally the shape of the signatures of operations in the various classes. Each T is the
abstract type itself; each t is some other type. The + marker indicates that the type may occur one or
more times in that part of the signature, and the * marker indicates that it occurs zero or more times.
| indicates or. For example, a producer may take two values of the abstract type T, like String.concat()
does:

concat : String × String → String

Some observers take zero arguments of other types t, such as:

size : List → int

… and others take several:

regionMatches : String × boolean × int × String × int × int → boolean

A creator operation is often implemented as a constructor, like new ArrayList() . But a creator can simply
be a static method instead, like Arrays.asList() . A creator implemented as a static method is often
called a factory method. The various String.valueOf methods in Java are other examples of creators
implemented as factory methods.

Mutators are often signaled by a void return type. A method that returns void must be called for some
kind of side-effect, since it doesn’t otherwise return anything. But not all mutators return void. For
example, Set.add() returns a boolean that indicates whether the set was actually changed. In Java’s
graphical user interface toolkit, Component.add() returns the object itself, so that multiple add() calls can
be chained together.
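
As a rough illustration of the four kinds of operation, and of a mutator that does not return void, here is a short sketch using only standard library calls. The class name OperationKindsDemo is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OperationKindsDemo {
    static String run() {
        List<String> list = new ArrayList<>();  // creator (constructor)
        list.add("a");                          // mutator
        int n = list.size();                    // observer
        String ab = "a".concat("b");            // producer

        Set<String> set = new HashSet<>();      // creator
        boolean first = set.add("x");           // mutator: true, the set changed
        boolean second = set.add("x");          // mutator: false, "x" was already there
        return n + " " + ab + " " + first + " " + second;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```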

Abstract Data Type Examples

Here are some examples of abstract data types, along with some of their operations, grouped by kind.

int is Java’s primitive integer type. int is immutable, so it has no mutators.

creators: the numeric literals 0 , 1 , 2 , …


producers: arithmetic operators + , - , * , /

observers: comparison operators == , != , < , >

mutators: none (it’s immutable)

List is Java’s list type. List is mutable. List is also an interface, which means that other classes
provide the actual implementation of the data type. These classes include ArrayList and LinkedList .

creators: ArrayList and LinkedList constructors, Collections.singletonList

producers: Collections.unmodifiableList

observers: size , get

mutators: add , remove , addAll , Collections.sort

String is Java’s string type. String is immutable.

creators: String constructors


producers: concat , substring , toUpperCase

observers: length , charAt

mutators: none (it’s immutable)



Designing an Abstract Type

Designing an abstract type involves choosing good operations and determining how they should
behave. Here are a few rules of thumb.

It’s better to have a few, simple operations that can be combined in powerful ways, rather than lots
of complex operations.

Each operation should have a well-defined purpose, and should have a coherent behavior rather than
a panoply of special cases. We probably shouldn’t add a sum operation to List , for example. It might
help clients who work with lists of integers, but what about lists of strings? Or nested lists? All these
special cases would make sum a hard operation to understand and use.

The set of operations should be adequate in the sense that there must be enough to do the kinds of
computations clients are likely to want to do. A good test is to check that every property of an object of
the type can be extracted. For example, if there were no get operation, we would not be able to find
out what the elements of a list are. Basic information should not be inordinately difficult to obtain. For
example, the size method is not strictly necessary for List, because we could apply get on increasing
indices until we get a failure, but this is inefficient and inconvenient.

The type may be generic: a list or a set, or a graph, for example. Or it may be domain-specific: a
street map, an employee database, a phone book, etc. But it should not mix generic and domain-
specific features. A Deck type intended to represent a sequence of playing cards shouldn’t have a
generic add method that accepts arbitrary objects like integers or strings. Conversely, it wouldn’t make
sense to put a domain-specific method like dealCards into the generic type List .

Representation Independence
Critically, a good abstract data type should be representation independent. This means that the use
of an abstract type is independent of its representation (the actual data structure or data fields used to
implement it), so that changes in representation have no effect on code outside the abstract type itself.
For example, the operations offered by List are independent of whether the list is represented as a
linked list or as an array.

You won’t be able to change the representation of an ADT at all unless its operations are fully specified
with preconditions and postconditions, so that clients know what to depend on, and you know what you
can safely change.
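
A small sketch of what this means for a client of List: the method below depends only on the size and get operations, so it should behave the same whichever representation sits behind the list. The names RepIndependenceDemo, sum, and oneTwoThree are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class RepIndependenceDemo {
    /** Client code: relies only on the List spec, never on its rep. */
    static int sum(List<Integer> numbers) {
        int total = 0;
        for (int i = 0; i < numbers.size(); i++) {
            total += numbers.get(i);
        }
        return total;
    }

    /** Adds 1, 2, 3 to the given list and returns it. */
    static List<Integer> oneTwoThree(List<Integer> list) {
        list.add(1);
        list.add(2);
        list.add(3);
        return list;
    }

    public static void main(String[] args) {
        // Same client code, two different representations.
        System.out.println(sum(oneTwoThree(new ArrayList<>())));
        System.out.println(sum(oneTwoThree(new LinkedList<>())));
    }
}
```

The two calls may differ in running time (get on a linked list is slower than on an array), but not in their result.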

Example: Different Representations for Strings

Let’s look at a simple abstract data type to see what representation independence means and why it’s
useful. The MyString type below has far fewer operations than the real Java String , and their specs are
a little different, but it’s still illustrative. Here are the specs for the ADT:

/** MyString represents an immutable sequence of characters. */
public class MyString {

    //////////////////// Example of a creator operation ///////////////

    /** @param b a boolean value
     *  @return string representation of b, either "true" or "false" */
    public static MyString valueOf(boolean b) { ... }

    //////////////////// Examples of observer operations ///////////////

    /** @return number of characters in this string */
    public int length() { ... }

    /** @param i character position (requires 0 <= i < string length)
     *  @return character at position i */
    public char charAt(int i) { ... }

    //////////////////// Example of a producer operation ///////////////

    /** Get the substring between start (inclusive) and end (exclusive).
     *  @param start starting index
     *  @param end ending index. Requires 0 <= start <= end <= string length.
     *  @return string consisting of charAt(start)...charAt(end-1) */
    public MyString substring(int start, int end) { ... }
}

These public operations and their specifications are the only information that a client of this data type is
allowed to know. Following the test-first programming paradigm, in fact, the first client we should
create is a test suite that exercises these operations according to their specs. At the moment,
however, writing test cases that use assertEquals directly on MyString objects wouldn’t work, because we
don’t have an equality operation defined on MyString . We’ll talk about how to implement equality
carefully in a later reading. For now, the only operations we can perform with MyStrings are the ones
we’ve defined above: valueOf , length , charAt , and substring . Our tests have to limit themselves to those
operations. For example, here’s one test for the valueOf operation:

MyString s = MyString.valueOf(true);
assertEquals(4, s.length());
assertEquals('t', s.charAt(0));
assertEquals('r', s.charAt(1));
assertEquals('u', s.charAt(2));
assertEquals('e', s.charAt(3));

We’ll come back to the question of testing ADTs at the end of this reading.

For now, let’s look at a simple representation for MyString : just an array of characters, exactly the
length of the string, with no extra room at the end. Here’s how that internal representation would be
declared, as an instance variable within the class:

private char[] a;

With that choice of representation, the operations would be implemented in a straightforward way:

public static MyString valueOf(boolean b) {
    MyString s = new MyString();
    s.a = b ? new char[] { 't', 'r', 'u', 'e' }
            : new char[] { 'f', 'a', 'l', 's', 'e' };
    return s;
}

public int length() {
    return a.length;
}

public char charAt(int i) {
    return a[i];
}

public MyString substring(int start, int end) {
    MyString that = new MyString();
    that.a = new char[end - start];
    System.arraycopy(this.a, start, that.a, 0, end - start);
    return that;
}

(The ?: syntax in valueOf is called the ternary conditional operator and it’s a shorthand if-else statement.
See The Conditional Operators on this page of the Java Tutorials.)

Question to ponder: Why don’t charAt and substring have to check whether their parameters are within
the valid range? What do you think will happen if the client calls these implementations with illegal
inputs?

One problem with this implementation is that it’s passing up an opportunity for performance
improvement. Because this data type is immutable, the substring operation doesn’t really have to copy
characters out into a fresh array. It could just point to the original MyString object’s character array and
keep track of the start and end that the new substring object represents. The String implementation in
some versions of Java does this.

To implement this optimization, we could change the internal representation of this class to:

private char[] a;
private int start;
private int end;

With this new representation, the operations are now implemented like this:

public static MyString valueOf(boolean b) {
    MyString s = new MyString();
    s.a = b ? new char[] { 't', 'r', 'u', 'e' }
            : new char[] { 'f', 'a', 'l', 's', 'e' };
    s.start = 0;
    s.end = s.a.length;
    return s;
}

public int length() {
    return end - start;
}

public char charAt(int i) {
    return a[start + i];
}

public MyString substring(int start, int end) {
    MyString that = new MyString();
    that.a = this.a;
    that.start = this.start + start;
    that.end = this.start + end;
    return that;
}
Reading : Interfaces

Objectives

The topic of today’s class is interfaces: separating the interface of an abstract data type from its
implementation, and using Java interface types to enforce that separation.

After today’s class, you should be able to define ADTs with interfaces, and write classes that
implement interfaces.

Interfaces

Java’s interface is a useful language mechanism for expressing an abstract data type. An interface
in Java is a list of method signatures, but no method bodies. A class implements an interface if it
declares the interface in its implements clause, and provides method bodies for all of the interface’s
methods. So one way to define an abstract data type in Java is as an interface, with its
implementation as a class implementing that interface.

One advantage of this approach is that the interface specifies the contract for the client and nothing
more. The interface is all a client programmer needs to read to understand the ADT. The client can’t
create inadvertent dependencies on the ADT’s rep, because instance variables can’t be put in an
interface at all. The implementation is kept well and truly separated, in a different class altogether.

Another advantage is that multiple different representations of the abstract data type can co-exist in
the same program, as different classes implementing the interface. When an abstract data type is
represented just as a single class, without an interface, it’s harder to have multiple representations.
In the MyString example from Abstract Data Types, MyString was a single class. We explored two
different representations for MyString , but we couldn’t have both representations for the ADT in the
same program.

Java’s static type checking allows the compiler to catch many mistakes in implementing an ADT’s
contract. For instance, it is a compile-time error to omit one of the required methods, or to give a
method the wrong return type. Unfortunately, the compiler doesn’t check for us that the code
adheres to the specs of those methods that are written in documentation comments.
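
To make that last point concrete, here is a sketch with a made-up Counter interface (not part of this reading's examples). Both classes compile, because both match the method signature, but only one of them obeys the documented spec.

```java
/** A hypothetical counter ADT, used only for this illustration. */
interface Counter {
    /** @return a value one greater than the previous return value;
     *          the first call returns 1 */
    int next();
}

/** Compiles, and satisfies the spec. */
class GoodCounter implements Counter {
    private int count = 0;
    @Override public int next() { return ++count; }
}

/** Also compiles, but violates the spec: the compiler cannot tell. */
class BrokenCounter implements Counter {
    @Override public int next() { return 1; }
}
```

Catching the bug in BrokenCounter takes testing or code review against the spec; static type checking alone will not find it.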
reading exercises

Java interfaces

Consider this Java interface and Java class, which are intended to implement an immutable set data
type:

/** Represents an immutable set of elements of type E. */
public interface Set<E> {
    /** make an empty set */
A   public Set();
    /** @return true if this set contains e as a member */
    public boolean contains(E e);
    /** @return a set which is the union of this and that */
B   public ArraySet<E> union(Set<E> that);
}

/** Implementation of Set<E>. */
public class ArraySet<E> implements Set<E> {
    /** make an empty set */
    public ArraySet() { ... }
    /** @return a set which is the union of this and that */
    public ArraySet<E> union(Set<E> that) { ... }
    /** add e to this set */
    public void add(E e) { ... }
}

Which of the following statements are true about Set<E> and ArraySet<E> ?

The line labeled A is a problem because Java interfaces can’t have constructors.

True
False


The line labeled B is a problem because Set mentions ArraySet , but ArraySet also mentions Set , which
is circular.

True
False


The line labeled B is a problem because it isn’t representation-independent.

True
False


ArraySet doesn’t correctly implement Set because it’s missing the contains() method.
True
False


ArraySet doesn’t correctly implement Set because it includes a method that Set doesn’t have.

True
False


ArraySet doesn’t correctly implement Set because ArraySet is mutable while Set is immutable.

True
False


Subtypes

Recall that a type is a set of values. The Java List type is defined by an interface. If we think about
all possible List values, none of them are List objects: we cannot create instances of an interface.
Instead, those values are all ArrayList objects, or LinkedList objects, or objects of another class that
implements List . A subtype is simply a subset of the supertype: ArrayList and LinkedList are
subtypes of List .

“B is a subtype of A” means “every B is an A.” In terms of specifications: “every B satisfies the
specification for A.”

That means B is only a subtype of A if B’s specification is at least as strong as A’s specification.
When we declare a class that implements an interface, the Java compiler enforces part of this
requirement automatically: for example, it ensures that every method in A appears in B, with a
compatible type signature. Class B cannot implement interface A without implementing all of the
methods declared in A.

But the compiler cannot check that we haven’t weakened the specification in other ways:
strengthening the precondition on some inputs to a method, weakening a postcondition, weakening
a guarantee that the interface abstract type advertises to clients. If you declare a subtype in Java
— implementing an interface is our current focus — then you must ensure that the subtype’s spec is
at least as strong as the supertype’s.
reading exercises

Immutable shapes

Let’s define an interface for rectangles:

/** An immutable rectangle. */
public interface ImmutableRectangle {
    /** @return the width of this rectangle */
    public int getWidth();
    /** @return the height of this rectangle */
    public int getHeight();
}

It follows that every square is a rectangle:

/** An immutable square. */
public class ImmutableSquare {
    private final int side;
    /** Make a new side x side square. */
    public ImmutableSquare(int side) { this.side = side; }
    /** @return the width of this square */
    public int getWidth() { return side; }
    /** @return the height of this square */
    public int getHeight() { return side; }
}

Does ImmutableSquare.getWidth() satisfy the spec of ImmutableRectangle.getWidth() ?

Yes
No

Does ImmutableSquare.getHeight() satisfy the spec of ImmutableRectangle.getHeight() ?

Yes
No

Does the whole ImmutableSquare spec satisfy the ImmutableRectangle spec?

Yes
No


Mutable shapes

/** A mutable rectangle. */
public interface MutableRectangle {
    // ... same methods as above ...
    /** Set this rectangle's dimensions to width x height. */
    public void setSize(int width, int height);
}

Surely every square is still a rectangle?

/** A mutable square. */
public class MutableSquare {
    private int side;
    // ... same constructor and methods as above ...
    // TODO implement setSize(..)
}

For each possible MutableSquare.setSize(..) implementation below, is it a valid implementation?

/** Set this square's dimensions to width x height.
 *  Requires width = height. */
public void setSize(int width, int height) { ... }


/** Set this square's dimensions to width x height.
 *  @throws BadSizeException if width != height */
public void setSize(int width, int height) throws BadSizeException { ... }


/** If width = height, set this square's dimensions to width x height.
 *  Otherwise, new dimensions are unspecified. */
public void setSize(int width, int height) { ... }


/** Set this square's dimensions to side x side. */
public void setSize(int side) { ... }


Example: MyString

Let’s revisit MyString . Using an interface instead of a class for the ADT, we can support multiple
implementations:

/** MyString represents an immutable sequence of characters. */
public interface MyString {

    // We'll skip this creator operation for now
    // /** @param b a boolean value
    //  *  @return string representation of b, either "true" or "false" */
    // public static MyString valueOf(boolean b) { ... }

    /** @return number of characters in this string */
    public int length();

    /** @param i character position (requires 0 <= i < string length)
     *  @return character at position i */
    public char charAt(int i);

    /** Get the substring between start (inclusive) and end (exclusive).
     *  @param start starting index
     *  @param end ending index. Requires 0 <= start <= end <= string length.
     *  @return string consisting of charAt(start)...charAt(end-1) */
    public MyString substring(int start, int end);
}

We’ll skip the static valueOf method and come back to it in a minute. Instead, let’s go ahead using a
different technique from our toolbox of ADT concepts in Java: constructors.

Here’s our first implementation:

public class SimpleMyString implements MyString {

    private char[] a;

    /* Create an uninitialized SimpleMyString. */
    private SimpleMyString() {}

    /** Create a string representation of b, either "true" or "false".
     *  @param b a boolean value */
    public SimpleMyString(boolean b) {
        a = b ? new char[] { 't', 'r', 'u', 'e' }
              : new char[] { 'f', 'a', 'l', 's', 'e' };
    }

    @Override public int length() { return a.length; }

    @Override public char charAt(int i) { return a[i]; }

    @Override public MyString substring(int start, int end) {
        SimpleMyString that = new SimpleMyString();
        that.a = new char[end - start];
        System.arraycopy(this.a, start, that.a, 0, end - start);
        return that;
    }
}

And here’s the optimized implementation:

public class FastMyString implements MyString {

    private char[] a;
    private int start;
    private int end;

    /* Create an uninitialized FastMyString. */
    private FastMyString() {}

    /** Create a string representation of b, either "true" or "false".
     *  @param b a boolean value */
    public FastMyString(boolean b) {
        a = b ? new char[] { 't', 'r', 'u', 'e' }
              : new char[] { 'f', 'a', 'l', 's', 'e' };
        start = 0;
        end = a.length;
    }

    @Override public int length() { return end - start; }

    @Override public char charAt(int i) { return a[start + i]; }

    @Override public MyString substring(int start, int end) {
        FastMyString that = new FastMyString();
        that.a = this.a;
        that.start = this.start + start;
        that.end = this.start + end;
        return that;
    }
}

Compare these classes to the implementations of MyString in Abstract Data Types. Notice how
the code that previously appeared in static valueOf methods now appears in the constructors,
slightly changed to refer to the rep of this .

Also notice the use of @Override . This annotation informs the compiler that the method must have
the same signature as one of the methods in the interface we’re implementing. But since the
compiler already checks that we’ve implemented all of the interface methods, the primary value
of @Override here is for readers of the code: it tells us to look for the spec of that method in the
interface. Repeating the spec wouldn’t be DRY, but saying nothing at all makes the code harder
to understand.

And notice the private empty constructors we use to make new instances in substring(..) before
we fill in their reps with data. We didn’t have to write these empty constructors before because
Java provides them by default when we don’t declare any others. Adding the constructors that
take boolean b means we have to declare the empty constructors explicitly.

Now that we know good ADTs scrupulously preserve their own invariants, these do-nothing
constructors are a bad pattern: they don’t assign any values to the rep, and they certainly don’t
establish any invariants. We should strongly consider revising the implementation. Since MyString
is immutable, a starting point would be making all the fields final .
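
One possible shape for such a revision is sketched below. It is an assumption about how the rewrite could go, not the implementation this reading settles on: the do-nothing constructor is replaced by a private constructor that receives the complete rep, so every field can be final. The MyString interface is repeated in abbreviated form (specs omitted) only to keep the sketch self-contained.

```java
interface MyString {
    int length();
    char charAt(int i);
    MyString substring(int start, int end);
}

class FastMyString implements MyString {
    private final char[] a;
    private final int start;
    private final int end;

    /** Private: make a string that views a[start..end-1] without copying. */
    private FastMyString(char[] a, int start, int end) {
        this.a = a;
        this.start = start;
        this.end = end;
    }

    /** Create a string representation of b, either "true" or "false". */
    public FastMyString(boolean b) {
        this.a = b ? new char[] { 't', 'r', 'u', 'e' }
                   : new char[] { 'f', 'a', 'l', 's', 'e' };
        this.start = 0;
        this.end = a.length;
    }

    @Override public int length() { return end - start; }

    @Override public char charAt(int i) { return a[start + i]; }

    @Override public MyString substring(int start, int end) {
        // Share the array; only the window [start, end) changes.
        return new FastMyString(this.a, this.start + start, this.start + end);
    }
}
```

With this shape, no FastMyString object ever exists with an unassigned rep, and the compiler rejects any later attempt to reassign a , start , or end .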

How will clients use this ADT? Here’s an example:

MyString s = new FastMyString(true);
System.out.println("The first character is: " + s.charAt(0));

This code looks very similar to the code we write to use the Java collections classes:

List<String> s = new ArrayList<String>();
...

Unfortunately, this pattern breaks the abstraction barrier we’ve worked so hard to build between
the abstract type and its concrete representations. Clients must know the name of the concrete
representation class. Because interfaces in Java cannot contain constructors, they must directly call
one of the concrete class’s constructors. The spec of that constructor won’t appear anywhere in the
interface, so there’s no static guarantee that different implementations will even provide the same
constructors.

Fortunately, (as of Java 8) interfaces are allowed to contain static methods, so we can implement
the creator operation valueOf as a static factory method in the interface MyString :

public interface MyString {

    /** @param b a boolean value
     *  @return string representation of b, either "true" or "false" */
    public static MyString valueOf(boolean b) {
        return new FastMyString(b);
    }

    // ...

Now a client can use the ADT without breaking the abstraction barrier:

MyString s = MyString.valueOf(true);
System.out.println("The first character is: " + s.charAt(0));

reading exercises

Code review

Let’s review the code for FastMyString . Which of these are useful criticisms:

I wish the abstraction function was documented

True
False


I wish the representation invariant was documented

True
False


I wish the rep fields were final so they could not be reassigned

True
False


I wish the private constructor was public so clients could use it to construct empty strings

True
False


I wish the charAt specification did not expose that the rep contains individual characters

True
False


I wish the charAt implementation behaved more helpfully when i is greater than the length of the
string

True
False


Example: Generic Set<E>


Java’s collection classes provide a good example of the idea of separating interface and
implementation.

Let’s consider as an example one of the ADTs from the Java collections library, Set . Set is the ADT
of finite sets of elements of some other type E . Here is a simplified version of the Set interface:

/** A mutable set.
 *  @param <E> type of elements in the set */
public interface Set<E> {

Set is an example of a generic type: a type whose specification is in terms of a placeholder type to
be filled in later. Instead of writing separate specifications and implementations for Set<String> ,
Set<Integer> , and so on, we design and implement one Set<E> .

We can match Java interfaces with our classification of ADT operations, starting with a creator:

    // example creator operation

    /** Make an empty set.
     *  @param <E> type of elements in the set
     *  @return a new set instance, initially empty */
    public static <E> Set<E> make() { ... }

The make operation is implemented as a static factory method. Clients will write code like:

Set<String> strings = Set.make();

and the compiler will understand that the new Set is a set of String objects. (We write <E> at the
front of this signature because make is a static method. It needs its own generic type parameter,
separate from the E we’re using in instance method specs.)
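
For readers who want to see such a factory compile and run, here is a minimal sketch. To avoid clashing with java.util.Set it uses a made-up interface SimpleSet and a made-up implementation ListBackedSet; neither name comes from this reading, and the list-based rep is chosen only for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** A hypothetical, cut-down set ADT for illustration. */
interface SimpleSet<E> {
    /** Make an empty set. The <E> here belongs to make itself. */
    static <E> SimpleSet<E> make() {
        return new ListBackedSet<E>();
    }
    int size();
    boolean contains(E e);
    void add(E e);
}

/** One possible implementation: elements kept in a list, no duplicates. */
class ListBackedSet<E> implements SimpleSet<E> {
    private final List<E> elements = new ArrayList<>();

    @Override public int size() { return elements.size(); }
    @Override public boolean contains(E e) { return elements.contains(e); }
    @Override public void add(E e) {
        if (!elements.contains(e)) { elements.add(e); }
    }
}
```

A client writes SimpleSet<String> strings = SimpleSet.make(); and the compiler infers E = String from the declared type of strings , without the client ever naming ListBackedSet.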

    // example observer operations

    /** Get size of the set.
     *  @return the number of elements in this set */
    public int size();

    /** Test for membership.
     *  @param e an element
     *  @return true iff this set contains e */
    public boolean contains(E e);

Next we have two observer methods. Notice how the specs are in terms of our abstract notion of a
set; it would be malformed to mention the details of any particular implementation of sets with
particular private fields. These specs should apply to any valid implementation of the set ADT.

    // example mutator operations

    /** Modifies this set by adding e to the set.
     *  @param e element to add */
    public void add(E e);

    /** Modifies this set by removing e, if found.
     *  If e is not found in the set, has no effect.
     *  @param e element to remove */
    public void remove(E e);
    // ...
}

The representations used by CharSet1 / 2 / 3 are not suited for representing sets of arbitrary-type
elements. The String reps, for example, cannot represent a Set<Integer> without careful work to
define a new rep invariant and abstraction function that handles multi-digit numbers.

Generic interface, generic implementation. We can also implement the generic Set<E> interface
without picking a type for E . In that case, we write our code blind to the actual type that clients will
choose for E . Java’s HashSet does that for Set . Its declaration looks like:

public interface Set<E> {
    // ...
}

public class HashSet<E> implements Set<E> {
    // ...
}

A generic implementation can only rely on details of the placeholder types that are included in the
interface’s specification. We’ll see in a future reading how HashSet relies on methods that every type
in Java is required to implement — and only on those methods, because it can’t rely on methods
declared in any specific type.
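
The restriction can be seen in a small generic method. In this sketch (the names GenericDemo and countDistinct are hypothetical), the code may call equals on its elements because every Java object has it, but it could not call a String-specific method even when the client happens to pick E = String. Elements are assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.List;

class GenericDemo {
    /** @param items list of non-null elements
     *  @return number of distinct elements in items, compared with equals() */
    static <E> int countDistinct(List<E> items) {
        List<E> seen = new ArrayList<>();
        for (E item : items) {
            // Only methods declared on Object, such as equals(), are available here.
            // Writing item.length() would be a compile-time error for an arbitrary E.
            boolean found = false;
            for (E s : seen) {
                if (s.equals(item)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                seen.add(item);
            }
        }
        return seen.size();
    }
}
```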

Why Interfaces?

Interfaces are used pervasively in real Java code. Not every class is associated with an interface,
but there are a few good reasons to bring an interface into the picture.

Documentation for both the compiler and for humans. Not only does an interface help the
compiler catch ADT implementation bugs, but it is also much more useful for a human to read
than the code for a concrete implementation. Such an implementation intersperses ADT-level
types and specs with implementation details.
Allowing performance trade-offs. Different implementations of the ADT can provide methods
with very different performance characteristics. Different applications may work better with
different choices, but we would like to code these applications in a way that is representation-
independent. From a correctness standpoint, it should be possible to drop in any new
implementation of a key ADT with simple, localized code changes.
Optional methods. List from the Java standard library marks all mutator methods as optional.
By building an implementation that does not support these methods, we can provide immutable
lists. Some operations are hard to implement with good enough performance on immutable lists,
so we want mutable implementations, too. Code that doesn’t call mutators can be written to
work automatically with either kind of list.
Methods with intentionally underdetermined specifications. An ADT for finite sets could
leave unspecified the element order one gets when converting to a list. Some implementations
might use slower method implementations that manage to keep the set representation in some
sorted order, allowing quick conversion to a sorted list. Other implementations might make many
methods faster by not bothering to support conversion to sorted lists.
Multiple views of one class. A Java class may implement multiple interfaces. For instance, a
user interface widget displaying a drop-down list is natural to view as both a widget and a list.
The class for this widget could implement both interfaces. In other words, we don’t implement an
ADT multiple times just because we are choosing different data structures; we may make
multiple implementations because many different sorts of objects may also be seen as special
cases of the ADT, among other useful perspectives.
More and less trustworthy implementations. Another reason to implement an interface
multiple times might be that it is easy to build a simple implementation that you believe is correct,
while you can work harder to build a fancier version that is more likely to contain bugs. You can
choose implementations for applications based on how bad it would be to get bitten by a bug.
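
The optional-methods point can be tried out with the standard library. In this sketch, Collections.unmodifiableList returns a List whose mutators throw UnsupportedOperationException, while code that only calls observers works with either kind of list. The class and method names (OptionalMethodsDemo, totalLength, rejectsAdd) are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class OptionalMethodsDemo {
    /** Calls only observers, so it works with mutable and unmodifiable lists. */
    static int totalLength(List<String> words) {
        int total = 0;
        for (String w : words) {
            total += w.length();
        }
        return total;
    }

    /** @return true if list refuses add(); note that a successful add mutates list */
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> mutable = new ArrayList<>();
        mutable.add("ab");
        mutable.add("c");
        List<String> readOnly = Collections.unmodifiableList(mutable);

        System.out.println(totalLength(mutable));
        System.out.println(totalLength(readOnly));
        System.out.println(rejectsAdd(readOnly));
        System.out.println(rejectsAdd(mutable));
    }
}
```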

Realizing ADT Concepts in Java

ADT concept          Ways to do it in Java      Examples

Abstract data type   Single class               String
                     Interface + class(es)      List and ArrayList
                     Enum                       DayOfWeek

Creator operation    Constructor                ArrayList()
                     Static (factory) method    Collections.singletonList() , Arrays.asList()
                     Constant                   BigInteger.ZERO

Observer operation   Instance method            List.get()
                     Static method              Collections.max()

Producer operation   Instance method            String.trim()
                     Static method              Collections.unmodifiableList()

Mutator operation    Instance method            List.add()
                     Static method              Collections.copy()

Representation       private fields

reading exercises

Suppose you have an abstract data type for rational numbers, similar to the one we discussed in
Abstraction Functions & Rep Invariants, which is currently represented as a Java class:

public class RatNum {
    ...
}

You decide to change RatNum to a Java interface instead, along with an implementation class called
IntFraction :

public interface RatNum {
    ...
}

public class IntFraction implements RatNum {
    ...
}

For each piece of code below from the old RatNum class, identify it and decide where it should go in
the new interface-plus-implementation-class design.

Interface + implementation 1

private int numer;
private int denom;

This piece of code is: (check all that apply)
abstraction function
creator
mutator

It should be put in:
the interface
the implementation class
both
Reading 14: Recursion

Objectives

After today’s class, you should:

be able to decompose a recursive problem into recursive steps and base cases
know when and how to use helper methods in recursion
understand the advantages and disadvantages of recursion vs. iteration

Recursion
In today’s class, we’re going to talk about how to implement a method, once you already have a
specification. We’ll focus on one particular technique, recursion. Recursion is not appropriate for every
problem, but it’s an important tool in your software development toolbox, and one that many people
scratch their heads over. We want you to be comfortable and competent with recursion, because you will
encounter it over and over. (That’s a joke, but it’s also true.)

Since you’ve taken 6.01, recursion is not completely new to you, and you have seen and written recursive
functions like factorial and fibonacci before. Today’s class will delve more deeply into recursion than you
may have gone before. Comfort with recursive implementations will be necessary for upcoming classes.

A recursive function is defined in terms of base cases and recursive steps.

In a base case, we compute the result immediately given the inputs to the function call.
In a recursive step, we compute the result with the help of one or more recursive calls to this same
function, but with the inputs somehow reduced in size or complexity, closer to a base case.

Consider writing a function to compute factorial. We can define factorial in two different ways:

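Product: $n! = n \times (n-1) \times \cdots \times 2 \times 1 = \prod_{k=1}^{n} k$
(where the empty product equals the multiplicative identity 1)

Recurrence relation: $n! = 1$ if $n = 0$, and $n! = (n-1)! \times n$ if $n > 0$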

which leads to two different implementations:

Iterative:

public static long factorial(int n) {
    long fact = 1;
    for (int i = 1; i <= n; i++) {
        fact = fact * i;
    }
    return fact;
}

Recursive:

public static long factorial(int n) {
    if (n == 0) {
        return 1;
    } else {
        return n * factorial(n-1);
    }
}

In the recursive implementation, the base case is n = 0, where we compute and return the
result immediately: 0! is defined to be 1. The recursive step is n > 0, where we compute the result with
the help of a recursive call to obtain (n-1)!, then complete the computation by multiplying by n.

To visualize the execution of a recursive function, it is helpful to diagram the call stack of currently-
executing functions as the computation proceeds.

Let’s run the recursive implementation of factorial in a main method:

public static void main(String[] args) {


long x = factorial(3);
}

The call stack at each step, in time order, with the most recent call listed first:

starts in main                  main (x)
calls factorial(3)              factorial n=3, main (x)
calls factorial(2)              factorial n=2, factorial n=3, main (x)
calls factorial(1)              factorial n=1, factorial n=2, factorial n=3, main (x)
calls factorial(0)              factorial n=0, factorial n=1, factorial n=2, factorial n=3, main (x)
returns 1 to factorial(1)       factorial n=1, factorial n=2, factorial n=3, main (x)
returns 1 to factorial(2)       factorial n=2, factorial n=3, main (x)
returns 2 to factorial(3)       factorial n=3, main (x)
returns 6 to main               main (x)

In the diagram, we can see how the stack grows as main calls factorial and factorial then calls itself, until
factorial(0) does not make a recursive call. Then the call stack unwinds, each call to factorial returning
its answer to the caller, until factorial(3) returns to main .

Here’s an interactive visualization of factorial . You can step through the computation to see the
recursion in action. New stack frames grow down instead of up in this visualization.

You’ve probably seen factorial before, because it’s a common example for recursive functions. Another
common example is the Fibonacci series:

/**
 * @param n >= 0
 * @return the nth Fibonacci number
 */
public static int fibonacci(int n) {
    if (n == 0 || n == 1) {
        return 1; // base cases
    } else {
        return fibonacci(n-1) + fibonacci(n-2); // recursive step
    }
}

Fibonacci is interesting because it has multiple base cases: n=0 and n=1. You can look at an interactive
visualization of Fibonacci. Notice that where factorial’s stack steadily grows to a maximum depth and
then shrinks back to the answer, Fibonacci’s stack grows and shrinks repeatedly over the course of the
computation.
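One way to check this claim without the interactive visualization is to instrument the function. The following is only a sketch: the FibTrace class and its calls and maxDepth counters are hypothetical names introduced for illustration, and the counters are kept in an instance rather than in static variables.

```java
/** Hypothetical instrumentation wrapper; not part of the reading's code. */
class FibTrace {
    int calls = 0;          // total number of calls made so far
    int maxDepth = 0;       // deepest the call stack has been
    private int depth = 0;  // current number of active fibonacci frames

    /**
     * @param n >= 0
     * @return the nth Fibonacci number, using the same definition as above
     */
    int fibonacci(int n) {
        calls++;
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        int result;
        if (n == 0 || n == 1) {
            result = 1; // base cases
        } else {
            result = fibonacci(n - 1) + fibonacci(n - 2); // recursive step
        }
        depth--; // this frame is about to be popped
        return result;
    }

    public static void main(String[] args) {
        FibTrace trace = new FibTrace();
        int value = trace.fibonacci(5);
        System.out.println(value + " computed with " + trace.calls
                + " calls, maximum stack depth " + trace.maxDepth);
    }
}
```

For n = 5 the function is called 15 times, yet the stack never holds more than 5 fibonacci frames at once, which is the repeated growing and shrinking described above.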


Recursive Fibonacci

Consider this recursive implementation of the Fibonacci sequence.

public static int fibonacci(int n) {
    if (n == 0 || n == 1) {
        return 1; // base cases
    } else {
        return fibonacci(n-1) + fibonacci(n-2); // recursive step
    }
}
subsequences("c")
What does subsequences("c") return?
"c"
""
",c"
"c,"


subsequences("gc")
What does subsequences("gc") return?
"g,c"
",g,c,gc"
",gc,g,c"
"g,c,gc"


Structure of Recursive Implementations


A recursive implementation always has two parts:

base case, which is the simplest, smallest instance of the problem, that can’t be decomposed any
further. Base cases often correspond to emptiness – the empty string, the empty list, the empty set,
the empty tree, zero, etc.

recursive step, which decomposes a larger instance of the problem into one or more simpler or
smaller instances that can be solved by recursive calls, and then recombines the results of those
subproblems to produce the solution to the original problem.

It’s important for the recursive step to transform the problem instance into something smaller, otherwise
the recursion may never end. If every recursive step shrinks the problem, and the base case lies at the
bottom, then the recursion is guaranteed to be finite.

A recursive implementation may have more than one base case, or more than one recursive step. For
example, the Fibonacci function has two base cases, n=0 and n=1.

reading exercises

Recursive structure

Recursive methods have a base case and a recursive step. What other concepts from computer science
also have (the equivalent of) a base case and a recursive step?

proof by induction
regression testing
recessive functions
binary trees


Helper Methods
The recursive implementation we just saw for subsequences() is one possible recursive decomposition of
the problem. We took a solution to a subproblem – the subsequences of the remainder of the string after
removing the first character – and used it to construct solutions to the original problem, by taking each
subsequence and adding the first character or omitting it. This is in a sense a direct recursive
implementation, where we are using the existing specification of the recursive method to solve the
subproblems.

In some cases, it’s useful to require a stronger (or different) specification for the recursive steps, to make
the recursive decomposition simpler or more elegant. In this case, what if we built up a partial
subsequence using the initial letters of the word, and used the recursive calls to complete that partial
subsequence using the remaining letters of the word? For example, suppose the original word is
“orange”. We’ll both select “o” to be in the partial subsequence, and recursively extend it with all
subsequences of “range”; and we’ll skip “o”, use “” as the partial subsequence, and again recursively
extend it with all subsequences of “range”.

Using this approach, our code now looks much simpler:

/**
 * Return all subsequences of word (as defined above) separated by commas,
 * with partialSubsequence prepended to each one.
 */
private static String subsequencesAfter(String partialSubsequence, String word) {
    if (word.isEmpty()) {
        // base case
        return partialSubsequence;
    } else {
        // recursive step
        return subsequencesAfter(partialSubsequence, word.substring(1))
             + ","
             + subsequencesAfter(partialSubsequence + word.charAt(0), word.substring(1));
    }
}

This subsequencesAfter method is called a helper method. It satisfies a different spec from the original
subsequences , because it has a new parameter partialSubsequence . This parameter plays a role similar to that of a
local variable in an iterative implementation. It holds temporary state during the evolution of the
computation. The recursive calls steadily extend this partial subsequence, selecting or ignoring each letter
in the word, until finally reaching the end of the word (the base case), at which point the partial
subsequence is returned as the only result. Then the recursion backtracks and fills in other possible
subsequences.

To finish the implementation, we need to implement the original subsequences spec, which gets the ball
rolling by calling the helper method with an initial value for the partial subsequence parameter:

public static String subsequences(String word) {
    return subsequencesAfter("", word);
}

Don’t expose the helper method to your clients. Your decision to decompose the recursion this way
instead of another way is entirely implementation-specific. In particular, if you discover that you need
temporary variables like partialSubsequence in your recursion, don’t change the original spec of your
method, and don’t force your clients to correctly initialize those parameters. That exposes your
implementation to the client and reduces your ability to change it in the future. Use a private helper
function for the recursion, and have your public method call it with the correct initializations, as shown
above.
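Putting the two pieces together gives a small runnable sketch of this arrangement. The enclosing class name SubsequencesDemo and the main method are assumptions added for illustration; the two methods are the ones shown above.

```java
/** Illustrative wrapper class; the name SubsequencesDemo is hypothetical. */
class SubsequencesDemo {

    /** Public entry point: clients see only the original spec. */
    public static String subsequences(String word) {
        return subsequencesAfter("", word);
    }

    /**
     * Private helper with the stronger spec: returns all subsequences of word,
     * separated by commas, with partialSubsequence prepended to each one.
     */
    private static String subsequencesAfter(String partialSubsequence, String word) {
        if (word.isEmpty()) {
            // base case: nothing left to choose, so the partial result is complete
            return partialSubsequence;
        } else {
            // recursive step: first skip the leading letter, then include it
            return subsequencesAfter(partialSubsequence, word.substring(1))
                 + ","
                 + subsequencesAfter(partialSubsequence + word.charAt(0), word.substring(1));
        }
    }

    public static void main(String[] args) {
        System.out.println(subsequences("ab")); // prints ",b,a,ab"
    }
}
```

Because all state travels through the partialSubsequence parameter, each call works on its own copy and the helper can be replaced later without any client noticing.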

reading exercises

Unhelpful 1

Louis Reasoner doesn’t want to use a helper method, so he tries to implement subsequences() by storing
partialSubsequence as a static variable instead of a parameter. Here is his implementation:

private static String partialSubsequence = "";

public static String subsequencesLouis(String word) {
    if (word.isEmpty()) {
        // base case
        return partialSubsequence;
    } else {
        // recursive step
        String withoutFirstLetter = subsequencesLouis(word.substring(1));
        partialSubsequence += word.charAt(0);
        String withFirstLetter = subsequencesLouis(word.substring(1));
        return withoutFirstLetter + "," + withFirstLetter;
    }
}

Suppose we call subsequencesLouis("c") followed by subsequencesLouis("a") .

What does subsequencesLouis("c") return?


"c"
""
",c"
"c,"

What does subsequencesLouis("a") return?


"a"
""
",a"
"a,"
"c,ca"


Unhelpful 2
Louis fixes that problem by making partialSubsequence public:

/**
* Requires: caller must set partialSubsequence to "" before calling subsequencesLouis().
*/
public static String partialSubsequence;

Alyssa P. Hacker throws up her hands when she sees what Louis did. Which of these statements are true
about his code?

partialSubsequence is risky – it should be final


partialSubsequence is risky – it is a global variable
partialSubsequence is risky – it points to a mutable object


Unhelpful 3

Louis gives in to Alyssa’s strenuous arguments, hides his static variable again, and takes care of
initializing it properly before starting the recursion:

public static String subsequences(String word) {
    partialSubsequence = "";
    return subsequencesLouis(word);
}

private static String partialSubsequence = "";

public static String subsequencesLouis(String word) {
    if (word.isEmpty()) {
        // base case
        return partialSubsequence;
    } else {
        // recursive step
        String withoutFirstLetter = subsequencesLouis(word.substring(1));
        partialSubsequence += word.charAt(0);
        String withFirstLetter = subsequencesLouis(word.substring(1));
        return withoutFirstLetter + "," + withFirstLetter;
    }
}

Unfortunately a static variable is simply a bad idea in recursion. Louis’s solution is still broken. To
illustrate, let’s trace through the call subsequences("xy") . You can step through an interactive visualization
of this version to see what happens. It will produce these recursive calls to subsequencesLouis() :

1. subsequencesLouis("xy")
2. subsequencesLouis("y")
3. subsequencesLouis("")
4. subsequencesLouis("")
5. subsequencesLouis("y")
6. subsequencesLouis("")
7. subsequencesLouis("")

When each of these calls starts, what is the value of the static variable partialSubsequence?

1. subsequencesLouis("xy")
2. subsequencesLouis("y")

3. subsequencesLouis("")

4. subsequencesLouis("")

5. subsequencesLouis("y")

6. subsequencesLouis("")

7. subsequencesLouis("")


Choosing the Right Recursive Subproblem


Let’s look at another example. Suppose we want to convert an integer to a string representation with a
given base, following this spec:

/**
* @param n integer to convert to string
* @param base base for the representation. Requires 2<=base<=10.
* @return n represented as a string of digits in the specified base, with
* a minus sign if n<0.
*/
public static String stringValue(int n, int base)

For example, stringValue(16, 10) should return "16" , and stringValue(16, 2) should return "10000" .

Let’s develop a recursive implementation of this method. One recursive step here is straightforward: we
can handle negative integers simply by recursively calling for the representation of the corresponding
positive integer:

if (n < 0) return "-" + stringValue(-n, base);

This shows that the recursive subproblem can be smaller or simpler in more subtle ways than just the
value of a numeric parameter or the size of a string or list parameter. We have still effectively reduced
the problem by reducing it to positive integers.

The next question is, given that we have a positive n, say n=829 in base 10, how should we decompose it
into a recursive subproblem? Thinking about the number as we would write it down on paper, we could
either start with 8 (the leftmost or highest-order digit), or 9 (the rightmost, lower-order digit). Starting at
the left end seems natural, because that’s the direction we write, but it’s harder in this case, because we
would need to first find the number of digits in the number to figure out how to extract the leftmost digit.
Instead, a better way to decompose n is to take its remainder modulo base (which gives the rightmost
digit) and also divide by base (which gives the subproblem, the remaining higher-order digits):

return stringValue(n/base, base) + "0123456789".charAt(n%base);
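To see why the rightmost digit is the easier one to peel off, it may help to look at the two arithmetic operations on their own. This is a minimal sketch; the class DigitSplit and its method names are hypothetical and are not part of stringValue .

```java
/** Hypothetical helper class illustrating the decomposition used in the recursive step. */
class DigitSplit {

    /** @return the rightmost (lowest-order) digit of n in the given base; requires n >= 0, base >= 2 */
    static int rightmostDigit(int n, int base) {
        return n % base;
    }

    /** @return the number formed by the remaining higher-order digits; requires n >= 0, base >= 2 */
    static int higherOrderDigits(int n, int base) {
        return n / base;
    }

    public static void main(String[] args) {
        // 829 in base 10 splits into the subproblem 82 and the digit 9
        System.out.println(higherOrderDigits(829, 10) + " and " + rightmostDigit(829, 10));
        // 16 in base 2 (10000) splits into 8 (1000) and the digit 0
        System.out.println(higherOrderDigits(16, 2) + " and " + rightmostDigit(16, 2));
    }
}
```

Neither operation needs to know how many digits n has, which is exactly what made the leftmost-digit approach awkward.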

Think about several ways to break down the problem, and try to write the recursive steps. You
want to find the one that produces the simplest, most natural recursive step.

It remains to figure out what the base case is, and include an if statement that distinguishes the base
case from this recursive step.

reading exercises

Implementing stringValue

Here is the recursive implementation of stringValue() with the recursive steps brought together but with
the base case still missing:

/**
 * @param n integer to convert to string
 * @param base base for the representation. Requires 2<=base<=10.
 * @return n represented as a string of digits in the specified base, with
 *         a minus sign if n<0. No unnecessary leading zeros are included.
 */
public static String stringValue(int n, int base) {
    if (n < 0) {
        return "-" + stringValue(-n, base);
    } else if (BASE CONDITION) {
        BASE CASE
    } else {
        return stringValue(n/base, base) + "0123456789".charAt(n%base);
    }
}

Which of the following can be substituted for the BASE CONDITION and BASE CASE to make the code correct?

else if (n == 0) { return "0"; }


else if (n < base) { return "" + n; }
else if (n == 0) { return ""; }
else if (n < base) { return "0123456789".substring(n,n+1); }


Calling stringValue

Assuming the code is completed with one of the base cases identified in the previous problem, what does
stringValue(170, 16) do?

returns "AA"

returns "170"

returns "1010"

throws StringIndexOutOfBoundsException

doesn’t compile, static error


StackOverflowError

infinite loop


Recursive Problems vs. Recursive Data


The examples we’ve seen so far have been cases where the problem structure lends itself naturally to a
recursive definition. Factorial is easy to define in terms of smaller subproblems. Having a recursive
problem like this is one cue that you should pull a recursive solution out of your toolbox.

Another cue is when the data you are operating on is inherently recursive in structure. We’ll see many
examples of recursive data a few classes from now, but for now let’s look at the recursive data found in
every laptop computer: its filesystem. A filesystem consists of named files. Some files are folders, which
can contain other files. So a filesystem is recursive: folders contain other folders which contain other
folders, until finally at the bottom of the recursion are plain (non-folder) files.

The Java library represents the file system using java.io.File . This is a recursive data type, in the sense
that f.getParentFile() returns the parent folder of a file f , which is a File object as well, and f.listFiles()
returns the files contained by f , which is an array of other File objects.

For recursive data, it’s natural to write recursive implementations:

/**
 * @param f a file in the filesystem
 * @return the full pathname of f from the root of the filesystem
 */
public static String fullPathname(File f) {
    if (f.getParentFile() == null) {
        // base case: f is at the root of the filesystem
        return f.getName();
    } else {
        // recursive step
        return fullPathname(f.getParentFile()) + "/" + f.getName();
    }
}

Recent versions of Java have added a new API, java.nio.file.Files and java.nio.file.Path , which offer a cleaner
separation between the filesystem and the pathnames used to name files in it. But the data structure is
still fundamentally recursive.
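The same idea works in the downward direction, using f.listFiles() instead of f.getParentFile() . The following sketch counts the plain files at or below a folder. The class FileCounter and the method countPlainFiles are hypothetical names, and the code assumes the folder tree contains no cycles (for example, through symbolic links).

```java
import java.io.File;

/** Hypothetical example class; not part of the Java library or the reading. */
class FileCounter {

    /**
     * @param f a file or folder
     * @return the number of plain (non-folder) files at or below f;
     *         0 if f does not exist or cannot be listed
     */
    static int countPlainFiles(File f) {
        if (f.isFile()) {
            // base case: a plain file counts as one
            return 1;
        }
        File[] children = f.listFiles(); // null if f is not a readable folder
        if (children == null) {
            // base case: nothing to count
            return 0;
        }
        // recursive step: add up the counts of everything inside this folder
        int count = 0;
        for (File child : children) {
            count += countPlainFiles(child);
        }
        return count;
    }
}
```

The shape of the code mirrors the shape of the data: plain files are the base case, and folders are handled by recursing into each of their members.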

Reentrant Code
Recursion – a method calling itself – is a special case of a general phenomenon in programming called
reentrancy. Reentrant code can be safely re-entered, meaning that it can be called again even while a
call to it is underway. Reentrant code keeps its state entirely in parameters and local variables, and
doesn’t use static variables or global variables, and doesn’t share aliases to mutable objects with other
parts of the program, or other calls to itself.

Direct recursion is one way that reentrancy can happen. We’ve seen many examples of that during this
reading. The factorial() method is designed so that factorial(n-1) can be called even though factorial(n)
hasn’t yet finished working.

Mutual recursion between two or more functions is another way this can happen – A calls B, which calls
A again. Direct mutual recursion is virtually always intentional and designed by the programmer. But
unexpected mutual recursion can lead to bugs.

When we talk about concurrency later in the course, reentrancy will come up again, since in a concurrent
program, a method may be called at the same time by different parts of the program that are running
concurrently.

It’s good to design your code to be reentrant as much as possible. Reentrant code is safer from bugs
and can be used in more situations, like concurrency, callbacks, or mutual recursion.

When to Use Recursion Rather Than Iteration


We’ve seen two common reasons for using recursion:

The problem is naturally recursive (e.g. Fibonacci)


The data is naturally recursive (e.g. filesystem)

Another reason to use recursion is to take more advantage of immutability. In an ideal recursive
implementation, all variables are final, all data is immutable, and the recursive methods are all pure
functions in the sense that they do not mutate anything. The behavior of a method can be understood
simply as a relationship between its parameters and its return value, with no side effects on any other
part of the program. This kind of paradigm is called functional programming, and it is far easier to
reason about than imperative programming with loops and variables.

In iterative implementations, by contrast, you inevitably have non-final variables or mutable objects that
are modified during the course of the iteration. Reasoning about the program then requires thinking about
snapshots of the program state at various points in time, rather than thinking about pure input/output
behavior.

One downside of recursion is that it may take more space than an iterative solution. Building up a stack
of recursive calls consumes memory temporarily, and the stack is limited in size, which may become a
limit on the size of the problem that your recursive implementation can solve.
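A small experiment can make this space cost visible. The sketch below sums the integers from 1 to n both ways; SumDemo and its methods are hypothetical names chosen for illustration. Whether the recursive call actually overflows for a given n depends on the stack size the JVM was started with, so no particular outcome is assumed.

```java
/** Hypothetical example class comparing stack use of recursion and iteration. */
class SumDemo {

    /** @return 1 + 2 + ... + n, computed recursively; requires n >= 0 */
    static long sumRecursive(int n) {
        if (n == 0) {
            return 0; // base case
        }
        return n + sumRecursive(n - 1); // one stack frame per value of n
    }

    /** @return 1 + 2 + ... + n, computed with a loop; requires n >= 0 */
    static long sumIterative(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i; // constant stack space, whatever n is
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 10_000_000;
        System.out.println("iterative: " + sumIterative(n));
        try {
            System.out.println("recursive: " + sumRecursive(n));
        } catch (StackOverflowError e) {
            // the stack filled up before the base case was reached
            System.out.println("recursive: ran out of stack space");
        }
    }
}
```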

Common Mistakes in Recursive Implementations


Here are two common ways that a recursive implementation can go wrong:

The base case is missing entirely, or the problem needs more than one base case but not all the base
cases are covered.
The recursive step doesn’t reduce to a smaller subproblem, so the recursion doesn’t converge.

Look for these when you’re debugging.


    private String lastName;
    ...

    public boolean equals(Object obj) {
        if (!(obj instanceof Person)) return false;
        Person that = (Person) obj;
        return this.lastName.toUpperCase().equals(that.lastName.toUpperCase());
    }

    public int hashCode() {
        // TODO
    }
}

Which of the following could be put in place of the line marked TODO to
make hashCode() consistent with equals() ?
return 42;
return firstName.toUpperCase();
return lastName.toUpperCase().hashCode();
return firstName.hashCode() + lastName.hashCode();


Equality of Mutable Types


We’ve been focusing on equality of immutable objects so far in this
reading. What about mutable objects?

Recall our definition: two objects are equal when they cannot be
distinguished by observation. With mutable objects, there are two
ways to interpret this:
when they cannot be distinguished by observation that doesn’t
change the state of the objects, i.e., by calling only observer,
producer, and creator methods. This is often strictly called
observational equality, since it tests whether the two objects
“look” the same, in the current state of the program.
when they cannot be distinguished by any observation, even state
changes. This interpretation allows calling any methods on the two
objects, including mutators. This is often called behavioral
equality, since it tests whether the two objects will “behave” the
same, in this and all future states.

For immutable objects, observational and behavioral equality are
identical, because there aren’t any mutator methods.

For mutable objects, it’s tempting to implement strict observational
equality. Java uses observational equality for most of its mutable data
types, in fact. If two distinct List objects contain the same sequence
of elements, then equals() reports that they are equal.

But using observational equality leads to subtle bugs, and in fact
allows us to easily break the rep invariants of other collection data
structures. Suppose we make a List , and then drop it into a Set :

List<String> list = new ArrayList<>();
list.add("a");

Set<List<String>> set = new HashSet<List<String>>();
set.add(list);

We can check that the set contains the list we put in it, and it does:

set.contains(list) → true

But now we mutate the list:

list.add("goodbye");

And it no longer appears in the set!

set.contains(list) → false!

It’s worse than that, in fact: when we iterate over the members of the
set, we still find the list in there, but contains() says it’s not there!

for (List<String> l : set) {
    set.contains(l) → false!
}

If the set’s own iterator and its own contains() method disagree about
whether an element is in the set, then the set clearly is broken. You
can see this code in action on Online Python Tutor.

What’s going on? List<String> is a mutable object. In the standard
Java implementation of collection classes like List , mutations affect
the result of equals() and hashCode() . When the list is first put into the
HashSet , it is stored in the hash bucket corresponding to its hashCode()
result at that time. When the list is subsequently mutated, its
hashCode() changes, but HashSet doesn’t realize it should be moved to a
different bucket. So it can never be found again.

When equals() and hashCode() can be affected by mutation, we can
break the rep invariant of a hash table that uses that object as a key.

Here’s a telling quote from the specification of java.util.Set :


Note: Great care must be exercised if mutable objects are used
as set elements. The behavior of a set is not specified if the value
of an object is changed in a manner that affects equals
comparisons while the object is an element in the set.
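The steps above can be collected into one runnable sketch. The class MutableKeyDemo and its run() method are hypothetical names. Since the Set specification leaves this situation unspecified, the results shown are what the standard HashSet implementation does in practice, not something the specification promises.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical demo class reproducing the broken-set scenario described above. */
class MutableKeyDemo {

    /**
     * @return three observations, in order: contains() before the mutation,
     *         contains() after the mutation, and whether iteration still finds the list
     */
    static boolean[] run() {
        List<String> list = new ArrayList<>();
        list.add("a");

        Set<List<String>> set = new HashSet<>();
        set.add(list); // stored under the hash code of ["a"]

        boolean containsBefore = set.contains(list);

        list.add("goodbye"); // the list's hashCode() changes, but the set is not told

        boolean containsAfter = set.contains(list);

        boolean foundByIteration = false;
        for (List<String> l : set) {
            if (l == list) {
                foundByIteration = true; // the very same object is still in the set
            }
        }
        return new boolean[] { containsBefore, containsAfter, foundByIteration };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("contains before mutation: " + r[0]);
        System.out.println("contains after mutation:  " + r[1]);
        System.out.println("found by iteration:       " + r[2]);
    }
}
```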

The Java library is unfortunately inconsistent about its interpretation
of equals() for mutable classes. Collections use observational equality,
but other mutable classes (like StringBuilder ) use behavioral equality.

The lesson we should draw from this example is that equals() should
implement behavioral equality. In general, that means that two
references should be equals() if and only if they are aliases for the
same object. So mutable objects should just inherit equals() and
hashCode() from Object . For clients that need a notion of observational
equality (whether two mutable objects “look” the same in the current
state), it’s better to define a new method, e.g., similar() .

The Final Rule for equals() and hashCode()


For immutable types:

equals() should compare abstract values. This is the same as
saying equals() should provide behavioral equality.
hashCode() should map the abstract value to an integer.

So immutable types must override both equals() and hashCode() .

For mutable types:

equals() should compare references, just like == . Again, this is the
same as saying equals() should provide behavioral equality.
hashCode() should map the reference into an integer.

So mutable types should not override equals() and hashCode() at all,
and should simply use the default implementations provided by Object .

Java doesn’t follow this rule for its collections, unfortunately, leading
to the pitfalls that we saw above.
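As a minimal sketch of the rule, here are two small types, one immutable and one mutable. Both classes, ImmutablePoint and MutableCounter , are hypothetical examples invented for illustration; similar() follows the naming suggested earlier in this reading.

```java
import java.util.Objects;

/** Hypothetical immutable type: equals() and hashCode() are based on the abstract value. */
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof ImmutablePoint)) return false;
        ImmutablePoint that = (ImmutablePoint) obj;
        return this.x == that.x && this.y == that.y;
    }

    @Override
    public int hashCode() {
        // equal points must produce equal hash codes
        return Objects.hash(x, y);
    }
}

/** Hypothetical mutable type: inherits equals() and hashCode() from Object. */
final class MutableCounter {
    private int count = 0;

    void increment() {
        count++;
    }

    /** Observational comparison: do the two counters look the same right now? */
    boolean similar(MutableCounter that) {
        return this.count == that.count;
    }
}
```

Two ImmutablePoint objects built from the same coordinates are equals() and can safely be used as hash table keys. Two fresh MutableCounter objects are similar() but not equals() , so mutating one of them can never disturb a HashSet that holds it.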

reading exercises

Bag

Suppose Bag<E> is a mutable ADT representing what is often called a
multiset, an unordered collection of objects where an object can
occur more than once. It has the following operations:

/** make an empty bag */
public Bag<E>()

/** modify this bag by adding an occurrence of e, and return this bag */
public Bag<E> add(E e)

/** modify this bag by removing an occurrence of e (if any), and return this bag */
public Bag<E> remove(E e)

/** return number of times e occurs in this bag */
public int count(E e)

Suppose we run this code:

Bag<String> b1 = new Bag<>().add("a").add("b");
Bag<String> b2 = new Bag<>().add("a").add("b");
Bag<String> b3 = b1.remove("b");
Bag<String> b4 = new Bag<>().add("b").add("a"); // swap!

Which of the following expressions are true after all the code has
been run?
b1.count("a") == 1
b1.count("b") == 1
b2.count("a") == 1
b2.count("b") == 1
b3.count("a") == 1
b3.count("b") == 1
b4.count("a") == 1
b4.count("b") == 1


Bag behavior

If Bag is implemented with behavioral equality, which of the following
expressions are true?

b1.equals(b2)
b1.equals(b3)
b1.equals(b4)
b2.equals(b3)
b2.equals(b4)
b3.equals(b1)


Bean bag

If Bag were part of the Java API, it would probably implement
observational equality, counter to the recommendation in the reading.

If Bag implemented observational equality despite the dangers, which
of the following expressions are true?
b1.equals(b2)
b1.equals(b3)
b1.equals(b4)
b2.equals(b3)
b2.equals(b4)
b3.equals(b1)


Autoboxing and Equality



One more instructive pitfall in Java. We’ve talked about primitive
types and their object type equivalents – for example, int and Integer .
The object type implements equals() in the correct way, so that if you
create two Integer objects with the same value, they’ll be equals() to
each other:

Integer x = new Integer(3);
Integer y = new Integer(3);
x.equals(y) → true

But there’s a subtle problem here; == is overloaded. For reference
types like Integer , it implements referential equality:

x == y // returns false

But for primitive types like int , == implements behavioral equality:

(int)x == (int)y // returns true

So you can’t really use Integer interchangeably with int . The fact that
Java automatically converts between int and Integer (this is called
autoboxing and autounboxing) can lead to subtle bugs! You have to
be aware what the compile-time types of your expressions are.
Consider this:

Map<String, Integer> a = new HashMap<>(), b = new HashMap<>();
a.put("c", 130); // put ints into the map
b.put("c", 130);
a.get("c") == b.get("c") → ?? // what do we get out of the map?

You can see this code in action on Online Python Tutor.
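If the online tool is not at hand, the same experiment can be run locally. The sketch below wraps the comparison in two hypothetical helper methods, equalByEquals and equalAfterUnboxing , inside a hypothetical class BoxingDemo . The last line of main prints the result of the == comparison from the snippet above, so you can compare it with your own prediction.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical demo class for comparing boxed values retrieved from maps. */
class BoxingDemo {

    /** Compares the two Integer objects with equals(), which looks at their values. */
    static boolean equalByEquals(Map<String, Integer> a, Map<String, Integer> b, String key) {
        return a.get(key).equals(b.get(key));
    }

    /** Unboxes both results to int first, so that == compares numeric values. */
    static boolean equalAfterUnboxing(Map<String, Integer> a, Map<String, Integer> b, String key) {
        int i = a.get(key); // auto-unboxing: Integer to int
        int j = b.get(key);
        return i == j;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = new HashMap<>(), b = new HashMap<>();
        a.put("c", 130); // autoboxing: int to Integer
        b.put("c", 130);

        System.out.println(equalByEquals(a, b, "c"));
        System.out.println(equalAfterUnboxing(a, b, "c"));
        // Both operands have compile-time type Integer, so == compares references here
        System.out.println(a.get("c") == b.get("c"));
    }
}
```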

reading exercises

Boxes

In the last code example above…

What is the compile-time type of the expression 130 ?


After executing a.put("c", 130) , what is the runtime type that is used to
represent the value 130 in the map?


What is the compile-time type of a.get("c") ?



Circles

Map<String, Integer> a = new HashMap<>(), b = new HashMap<>();
a.put("c", 130); // put ints into the map
b.put("c", 130);

Draw a snapshot diagram after the code above has executed. How
many HashMap objects are in your snapshot diagram?


How many Integer objects are in your snapshot diagram?


Equals

Map<String, Integer> a = new HashMap<>(), b = new HashMap<>();
a.put("c", 130); // put ints into the map
b.put("c", 130);

After this code executes, what would a.get("c").equals(b.get("c"))
return?


What would a.get("c") == b.get("c") return?



Unboxes

Now suppose you assign the get() results to int variables:

int i = a.get("c");
int j = b.get("c");
boolean isEqual = (i == j);

After executing this code, what is the value of isEqual ?


Summary
Equality should be an equivalence relation (reflexive, symmetric,
transitive).
Equality and hash code must be consistent with each other, so
that data structures that use hash tables (like HashSet and HashMap )
work properly.
The abstraction function is the basis for equality in immutable data
types.
Reference equality is the basis for equality in mutable data types;
this is the only way to ensure consistency over time and avoid
breaking rep invariants of hash tables.

Equality is one part of implementing an abstract data type, and we’ve
already seen how important ADTs are to achieving our three primary
objectives. Let’s look at equality in particular:

Safe from bugs. Correct implementation of equality and hash
codes is necessary for use with collection data types like sets and
maps. It’s also highly desirable for writing tests. Since every
object in Java inherits the Object implementations, immutable types
must override them.

Easy to understand. Clients and other programmers who read
our specs will expect our types to implement an appropriate
equality operation, and will be surprised and confused if we do
not.

Ready for change. Correctly-implemented equality for immutable
types separates equality of reference from equality of abstract
value, hiding from clients our decisions about whether values are
shared. Choosing behavioral rather than observational equality for
mutable types helps avoid unexpected aliasing bugs.

Reading 16: Recursive Data Types
Software in 6.005

Safe from bugs          Easy to understand           Ready for change

Correct today and       Communicating clearly        Designed to
correct in the          with future programmers,     accommodate change
unknown future.         including future you.        without rewriting.

Objectives

Understand recursive datatypes
Read and write datatype definitions
Understand and implement functions over recursive datatypes
Understand immutable lists and know the standard operations on
immutable lists
Know and follow a recipe for writing programs with ADTs

Introduction
In this reading we’ll look at recursively-defined types, how to specify
operations on such types, and how to implement them. Our main
example will be immutable lists.

Then we’ll use another recursive datatype example, matrix
multiplications, to walk through our process for programming with
ADTs.

Part 1: Recursive Data Types
Part 2: Writing a Program with ADTs

Summary
Let’s review how recursive datatypes fit in with the main goals of this
course:

Safe from bugs. Recursive datatypes allow us to tackle
problems with a recursive or unbounded structure. Implementing
appropriate data structures that encapsulate important operations
and maintain their own invariants is crucial for correctness.

Easy to understand. Functions over recursive datatypes,
specified in the abstract type and implemented in each concrete
variant, organize the different behavior of the type.

Ready for change. A recursive ADT, like any ADT, separates
abstract values from concrete representations, making it possible
to change low-level code and high-level structure of the
implementation without changing clients.

Reading 17: Regular Expressions & Grammars

Objectives

After today’s class, you should:

Understand the ideas of grammar productions and regular


expression operators
Be able to read a grammar or regular expression and determine
whether it matches a sequence of characters
Be able to write a grammar or regular expression to match a set
of character sequences and parse them into a data structure

Introduction
Today’s reading introduces several ideas:

grammars, with productions, nonterminals, terminals, and


operators
regular expressions
parser generators

Some program modules take input or produce output in the form of a


sequence of bytes or a sequence of characters, which is called a
string when it’s simply stored in memory, or a stream when it flows
into or out of a module. In today’s reading, we talk about how to write
a specification for such a sequence. Concretely, a sequence of bytes
or characters might be:

A file on disk, in which case the specification is called the file


format
Messages sent over a network, in which case the specification is
a wire protocol
A command typed by the user on the console, in which case the
specification is a command line interface
A string stored in memory

For these kinds of sequences, we introduce the notion of a grammar,


which allows us not only to distinguish between legal and illegal
sequences, but also to parse a sequence into a data structure that a
program can work with. The data structure produced from a grammar
will often be a recursive data type like we talked about in the
recursive data type reading.

We also talk about a specialized form of a grammar called a regular


expression. In addition to being used for specification and parsing,
regular expressions are a widely-used tool for many string-processing
tasks that need to disassemble a string, extract information from it, or
transform it.
The next reading will talk about parser generators, a kind of tool that
translates a grammar automatically into a parser for that grammar.

Grammars
To describe a sequence of symbols, whether they are bytes,
characters, or some other kind of symbol drawn from a fixed set, we
use a compact representation called a grammar.

A grammar defines a set of sentences, where each sentence is a


sequence of symbols. For example, our grammar for URLs will
specify the set of sentences that are legal URLs in the HTTP
protocol.

The symbols in a sentence are called terminals (or tokens).


They’re called terminals because they are the leaves of a tree that
represents the structure of the sentence. They don’t have any
children, and can’t be expanded any further. We generally write
terminals in quotes, like 'http' or ':' .

A grammar is described by a set of productions, where each


production defines a nonterminal. You can think of a nonterminal like
a variable that stands for a set of sentences, and the production as
the definition of that variable in terms of other variables
(nonterminals), operators, and constants (terminals). Nonterminals
are internal nodes of the tree representing a sentence.

A production in a grammar has the form


nonterminal ::= expression of terminals, nonterminals, and
operators

In 6.005, we will name nonterminals using lowercase identifiers, like x

or y or url .

One of the nonterminals of the grammar is designated as the root.


The set of sentences that the grammar recognizes are the ones that
match the root nonterminal. This nonterminal is often called root or
start , but in the grammars below we will typically choose more
memorable names like url , html , and markdown .

Grammar Operators

The three most important operators in a production expression are:

concatenation

x ::= y z an x is a y followed by a z

repetition

x ::= y* an x is zero or more y

union (also called alternation)

x ::= y | z an x is a y or a z

You can also use additional operators which are just syntactic sugar
(i.e., they’re equivalent to combinations of the big three operators):

option (0 or 1 occurrence)
x ::= y? an x is a y or is the empty sentence

1+ repetition (1 or more occurrences)

x ::= y+ an x is one or more y


(equivalent to x ::= y y* )

character classes

x ::= [abc] is equivalent to x ::= 'a' | 'b' | 'c'

x ::= [^b] is equivalent to x ::= 'a' | 'c' | 'd' | 'e' | 'f'


| ... (all other characters)

By convention, the operators * , ? , and + have highest precedence,


which means they are applied first. Alternation | has lowest
precedence, which means it is applied last. Parentheses can be used
to override this precedence, so that a sequence or alternation can be
repeated:

grouping using parentheses

x ::= (y z | a b)* an x is zero or more y-z or a-b pairs
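
The same operators appear in the regex syntax of java.util.regex, which this reading turns to later, so their meaning and precedence can be checked experimentally. The sketch below is only an illustration: the class GrammarOperatorsDemo and its helper are made up here, and single letters stand in for the terminals y, z, a, and b.

```java
import java.util.regex.Pattern;

class GrammarOperatorsDemo {
    /** @return true if the entire string s matches the regex */
    static boolean matches(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        System.out.println(matches("ab", "ab"));           // concatenation: true
        System.out.println(matches("a*", ""));             // repetition allows zero: true
        System.out.println(matches("a|b", "b"));           // union: true
        System.out.println(matches("a?", "aa"));           // option is 0 or 1: false
        System.out.println(matches("a+", ""));             // 1+ repetition: false
        System.out.println(matches("[^b]", "c"));          // character class: true
        // * binds tighter than concatenation, so ab* means a followed by b*
        System.out.println(matches("ab*", "abab"));        // false
        System.out.println(matches("(ab)*", "abab"));      // grouping: true
        System.out.println(matches("(yz|ab)*", "yzabyz")); // true
    }
}
```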

reading exercises

Reading a Grammar 1

Consider this grammar:

S ::= (B C)* T
B ::= M+ | P B P
C ::= B | E+

What are the nonterminals in this grammar? (Note that capitalization


and quoting won’t give you a clue here, so go by the structure of the
grammar alone.)
B
C
E
M
P
S
T
|
*
+
(
)


What are the terminals in this grammar?


B
C
E
M
P
S
T
|
+
*
(
)


Which productions are recursive?


S
B
C


Reading a Grammar 2
Which strings match the root nonterminal of this grammar?

root ::= 'a'+ 'b'* 'c'?

aabcc
bbbc
aaaaaaaa
abc
abab
aac


Reading a Grammar 3
Which strings match the root nonterminal of this grammar?

root ::= integer ('-' integer)+


integer ::= [0-9]+
617
617-253
617-253-1000
---
integer-integer-integer
5--5
3-6-293-1


Reading a Grammar 4
Which strings match the root nonterminal of this grammar?

root ::= (A B)+


A ::= [Aa]
B ::= [Bb]

aaaBBB
abababab
aBAbabAB
AbAbAbA


Example: URL

Suppose we want to write a grammar that represents URLs. Let’s


build up a grammar gradually by starting with simple examples and
extending the grammar as we go.

Here’s a simple URL:


http://mit.edu/

A grammar that represents the set of sentences containing only this
URL would look like:

url ::= 'http://mit.edu/'

But let’s generalize it to capture other domains, as well:

http://stanford.edu/
http://google.com/

We can write this as one line, like this:

url ::= 'http://' [a-z]+ '.' [a-z]+ '/'

This grammar represents the set of all URLs that consist of just a
two-part hostname, where each part of the hostname consists of 1 or
more letters. So http://mit.edu/ and http://yahoo.com/ would match, but
not http://ou812.com/. Since it has only one nonterminal, a parse tree
for this URL grammar would look like the picture on the right.

In this one-line form, with a single nonterminal whose production uses
only operators and terminals, a grammar is called a regular
expression (more about that later). But it will be easier to understand
if we name the parts using new nonterminals:

url ::= 'http://' hostname '/'
hostname ::= word '.' word
word ::= [a-z]+

[Figure: the parse tree produced by parsing 'http://mit.edu' with a
grammar with url, hostname, and word nonterminals]

The parse tree for this grammar is now shown at right. The tree has
more structure now. The leaves of the tree are the parts of the string
that have been parsed. If we concatenated the leaves together, we
would recover the original string. The hostname and word nonterminals
are labeling nodes of the tree whose subtrees match those rules in
the grammar. Notice that the immediate children of a nonterminal
node like hostname follow the pattern of the hostname rule, word '.' word .

How else do we need to generalize? Hostnames can have more than


two components, and there can be an optional port number:

http://didit.csail.mit.edu:4949/

To handle this kind of string, the grammar is now:

url ::= 'http://' hostname (':' port)? '/'
hostname ::= word '.' hostname | word '.' word
port ::= [0-9]+
word ::= [a-z]+

[Figure: the parse tree produced by parsing 'http://mit.edu' with a
grammar with a recursive hostname rule]

Notice how hostname is now defined recursively in terms of itself.


Which part of the hostname definition is the base case, and which
part is the recursive step? What kinds of hostnames are allowed?

Using the repetition operator, we could also write hostname like this:

hostname ::= (word '.')+ word

Another thing to observe is that this grammar allows port numbers


that are not technically legal, since port numbers can only range from
0 to 65535. We could write a more complex definition of port that
would allow only these integers, but that’s not typically done in a
grammar. Instead, the constraint 0 <= port <= 65535 would be
specified alongside the grammar.
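
As a rough sketch of what “specified alongside the grammar” can mean in code, the example below first checks the shape of the string against the url grammar, written as a java.util.regex pattern (regular expressions are covered later in this reading), and then enforces the port range as a separate step. The class UrlChecker and its method are hypothetical names for this illustration.

```java
import java.math.BigInteger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlChecker {
    // url ::= 'http://' hostname (':' port)? '/' with hostname and port inlined.
    // Group 3 captures the digits of the port, if a port is present.
    private static final Pattern URL =
        Pattern.compile("http://([a-z]+\\.)+[a-z]+(:([0-9]+))?/");

    private static final BigInteger MAX_PORT = BigInteger.valueOf(65535);

    /** @return true if s matches the url grammar and its port, if any, is at most 65535 */
    static boolean isLegalUrl(String s) {
        Matcher m = URL.matcher(s);
        if (!m.matches()) {
            return false; // rejected by the grammar itself
        }
        String port = m.group(3); // null when the optional port is absent
        // BigInteger avoids overflow on very long digit strings.
        return port == null || new BigInteger(port).compareTo(MAX_PORT) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(isLegalUrl("http://didit.csail.mit.edu:4949/")); // true
        System.out.println(isLegalUrl("http://mit.edu:70000/"));            // false
    }
}
```

The grammar stays simple, and the numeric constraint lives in one clearly named place next to it.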

There are more things we should do to go farther:


generalizing http to support the additional protocols that URLs can
have
generalizing the / at the end to a slash-separated path
allowing hostnames with the full set of legal characters instead of
just a-z

reading exercises

Writing a Grammar

Suppose we want the url grammar to also match strings of the form:

https://websis.mit.edu/
ftp://ftp.athena.mit.edu/

but not strings of the form:

ptth://web.mit.edu/
mailto:[email protected]

So we change the grammar to:

url ::= protocol '://' hostname (':' port)? '/'


protocol ::= TODO
hostname ::= word '.' hostname | word '.' word
port ::= [0-9]+
word ::= [a-z]+

What could you put in place of TODO to match the desirable URLs
but not the undesirable ones?
word
'ftp' | 'http' | 'https'
('http' 's'?) | 'ftp'
('f' | 'ht') 'tp' 's'?

Example: Markdown and HTML

Now let’s look at grammars for some file formats. We’ll be using two
different markup languages that represent typographic style in text.
Here they are:

Markdown

This is _italic_.

HTML

Here is an <i>italic</i> word.

For simplicity, our example HTML and Markdown grammars will only
specify italics, but other text styles are of course possible.

Here’s the grammar for our simplified version of Markdown:

markdown ::= ( normal | italic ) *
italic ::= '_' normal '_'
normal ::= text
text ::= [^_]*

[Figure: a parse tree produced by the Markdown grammar]

Here’s the grammar for our simplified version of HTML:

html ::= ( normal | italic ) *
italic ::= '<i>' html '</i>'
normal ::= text
text ::= [^<>]*

[Figure: a parse tree produced by the HTML grammar]

reading exercises

Recursive Grammars

Look at the markdown and html grammars above, and compare their
italic productions. Notice that not only do they differ in delimiters ( _
in one case, < > tags in the other), but also in the nonterminal that is
matched between those delimiters. One grammar is recursive; the
other grammar is not.

For each string below, if you match the specified grammar against it,
which letters are inside matches to the italic nonterminal? Your
answer should be some subset of the letters abcde .

markdown: a_b_c_d_e


html: a<i>b<i>c</i>d</i>e


Regular Expressions
A regular grammar has a special property: by substituting every
nonterminal (except the root one) with its righthand side, you can
reduce it down to a single production for the root, with only terminals
and operators on the right-hand side.

Our URL grammar was regular. By replacing nonterminals with their


productions, it can be reduced to a single expression:

url ::= 'http://' ([a-z]+ '.')+ [a-z]+ (':' [0-9]+)? '/'

The Markdown grammar is also regular:

markdown ::= ([^_]* | '_' [^_]* '_' )*

But our HTML grammar can’t be reduced completely. By substituting


righthand sides for nonterminals, you can eventually reduce it to
something like this:

html ::= ( [^<>]* | '<i>' html '</i>' )*

…but the recursive use of html on the righthand side can’t be


eliminated, and can’t be simply replaced by a repetition operator
either. So the HTML grammar is not regular.
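
One way to see why the recursion can’t be removed is to write a recognizer for the reduced html production by hand. The sketch below is an illustration only: the class HtmlRecognizer and its methods are invented here and are not part of any library. Notice that the method html() calls itself at exactly the point where the production mentions html on its right-hand side, which is what lets it handle italics nested to any depth.

```java
class HtmlRecognizer {
    private final String input;
    private int pos = 0; // index of the next unread character

    private HtmlRecognizer(String input) {
        this.input = input;
    }

    /** @return true if all of s matches html ::= ( [^<>]* | '<i>' html '</i>' )* */
    static boolean matches(String s) {
        HtmlRecognizer r = new HtmlRecognizer(s);
        return r.html() && r.pos == s.length();
    }

    // Consumes the longest prefix matching html; returns false on an unclosed <i>.
    private boolean html() {
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (input.startsWith("<i>", pos)) {
                pos += 3;
                if (!html()) { // recursive use of the html nonterminal
                    return false;
                }
                if (!input.startsWith("</i>", pos)) {
                    return false;
                }
                pos += 4;
            } else if (c != '<' && c != '>') {
                pos++; // one character of normal text
            } else {
                break; // leave a closing tag (or stray bracket) to the caller
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matches("a<i>b<i>c</i>d</i>e")); // true
        System.out.println(matches("a<i>b"));               // false
    }
}
```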

The reduced expression of terminals and operators can be written in


an even more compact form, called a regular expression. A regular
expression does away with the quotes around the terminals, and the
spaces between terminals and operators, so that it consists just of
terminal characters, parentheses for grouping, and operator
characters. For example, the regular expression for our markdown

format is just
([^_]*|_[^_]*_)*

Regular expressions are also called regexes for short. A regex is far
less readable than the original grammar, because it lacks the
nonterminal names that documented the meaning of each
subexpression. But a regex is fast to implement, and there are
libraries in many programming languages that support regular
expressions.

The regex syntax commonly implemented in programming language
libraries has a few more special operators, in addition to the ones we
used above in grammars. Here are some common useful ones:

.    any single character
\d   any digit, same as [0-9]
\s   any whitespace character, including space, tab, newline
\w   any word character, including letters, digits, and underscore
\., \(, \), \*, \+, ...   escapes an operator or special character so that it matches literally

Using backslashes is important whenever there are terminal


characters that would be confused with special characters. Because
our url regular expression has . in it as a terminal, we need to use a
backslash to escape it:

http://([a-z]+\.)+[a-z]+(:[0-9]+)?/

reading exercises

Regular Expressions
Consider the following regular expression:

[A-G]+(♭|♯)?

Which of the following strings match the regular expression?


A♭
C♯
ABK♭
A♭B
GFE


Using regular expressions in Java

Regular expressions (“regexes”) are widely used in programming,


and you should have them in your toolbox.

In Java, you can use regexes for manipulating strings (see


String.split, String.matches, java.util.regex.Pattern). They’re built in as
a first-class feature of modern scripting languages like Python, Ruby,
and JavaScript, and you can use them in many text editors for find
and replace. Regular expressions are your friend! Most of the time.
Here are some examples.

Replace all runs of spaces with a single space:

String singleSpacedString = string.replaceAll(" +", " ");

Match a URL:
Pattern regex = Pattern.compile("http://([a-z]+\\.)+[a-z]+(:[0-9]+)?/");
Matcher m = regex.matcher(string);
if (m.matches()) {
// then string is a url
}

Extract part of an HTML tag:

Pattern regex = Pattern.compile("<a href=['\"]([^']*)['\"]>");


Matcher m = regex.matcher(string);
if (m.matches()) {
String url = m.group(1);
// Matcher.group(n) returns the nth parenthesized part of the regex
}

Notice the backslashes in the URL and HTML tag examples. In the
URL example, we want to match a literal period . , so we have to first
escape it as \. to protect it from being interpreted as the regex
match-any-character operator, and then we have to further escape it
as \\. to protect the backslash from being interpreted as a Java
string escape character. In the HTML example, we have to escape
the quote mark " as \" to keep it from ending the string. The
frequency of backslash escapes makes regexes still less readable.
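
Here is a small sketch of the escaping rule in action with String.split and Matcher. The class RegexEscapingDemo and its methods are made-up names for this example. It also shows the difference between matches(), which requires the whole string to match, and find(), which scans for the next match anywhere in the string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexEscapingDemo {
    /** Splits a dotted hostname into its parts. "\\." is the regex \. (a literal period). */
    static List<String> hostnameParts(String hostname) {
        return Arrays.asList(hostname.split("\\."));
    }

    /** @return every maximal run of digits in text, in order of appearance */
    static List<String> digitRuns(String text) {
        List<String> runs = new ArrayList<>();
        Matcher m = Pattern.compile("\\d+").matcher(text);
        while (m.find()) { // find() looks for the next match anywhere in the text
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(hostnameParts("didit.csail.mit.edu")); // [didit, csail, mit, edu]
        // Unescaped, "." matches every character, so nothing is left between delimiters.
        System.out.println("a.b".split(".").length);              // 0
        System.out.println(digitRuns("ports 80 and 4949"));       // [80, 4949]
        System.out.println("617-253".matches("\\d+"));            // false: not all digits
    }
}
```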

reading exercises

Using regexes in Java

Write the shortest regex you can to remove single-word, lowercase-


letter-only HTML tags from a string:

String input = "The <b>Good</b>, the <i>Bad</i>, and the <strong>Ugly</stro


String regex = "TODO";
String output = input.replaceAll(regex, "");
If the desired output is "The Good, the Bad, and the Ugly" , what is
the shortest regex you can put in place of TODO? You may find it useful
to run this example in the Online Python Tutor.


Context-Free Grammars

In general, a language that can be expressed with our system of


grammars is called context-free. Not all context-free languages are
also regular; that is, some grammars can’t be reduced to single
nonrecursive productions. Our HTML grammar is context-free but not
regular.

The grammars for most programming languages are also context-


free. In general, any language with nested structure (like nesting
parentheses or braces) is context-free but not regular. That
description applies to the Java grammar, shown here in part:

statement ::=
'{' statement* '}'
| 'if' '(' expression ')' statement ('else' statement)?
| 'for' '(' forinit? ';' expression? ';' forupdate? ')' statement
| 'while' '(' expression ')' statement
| 'do' statement 'while' '(' expression ')' ';'
| 'try' '{' statement* '}' ( catches | catches? 'finally' '{' statement* '}
| 'switch' '(' expression ')' '{' switchgroups '}'
| 'synchronized' '(' expression ')' '{' statement* '}'
| 'return' expression? ';'
| 'throw' expression ';'
| 'break' identifier? ';'
| 'continue' identifier? ';'
| expression ';'
| identifier ':' statement
| ';'

Summary
Machine-processed textual languages are ubiquitous in computer
science. Grammars are the most popular formalism for describing
such languages, and regular expressions are an important subclass
of grammars that can be expressed without recursion.

The topics of today’s reading connect to our three properties of good


software as follows:

Safe from bugs. Grammars and regular expressions are


declarative specifications for strings and streams, which can be
used directly by libraries and tools. These specifications are often
simpler, more direct, and less likely to be buggy than parsing code
written by hand.

Easy to understand. A grammar captures the shape of a


sequence in a form that is easier to understand than hand-written
parsing code. Regular expressions, alas, are often not easy to
understand, because they are a one-line reduced form of what
might have been a more understandable regular grammar.

Ready for change. A grammar can be easily edited, but regular


expressions, unfortunately, are much harder to change, because a
complex regular expression is cryptic and hard to understand.
Reading 18: Parser Generators

Objectives

After today’s class, you should:

Be able to use a grammar in combination with a parser generator,


to parse a character sequence into a parse tree
Be able to convert a parse tree into a useful data type

Parser Generators
A parser generator is a good tool that you should make part of your
toolbox. A parser generator takes a grammar as input and
automatically generates source code that can parse streams of
characters using the grammar.

The generated code is a parser, which takes a sequence of


characters and tries to match the sequence against the grammar.
The parser typically produces a parse tree, which shows how
grammar productions are expanded into a sentence that matches the
character sequence. The root of the parse tree is the starting
nonterminal of the grammar. Each node of the parse tree expands
into one production of the grammar. We’ll see how a parse tree
actually looks in the next section.

The final step of parsing is to do something useful with this parse


tree. We are going to translate it into a value of a recursive data
type. Recursive abstract data types are often used to represent an
expression in a language, like HTML, or Markdown, or Java, or
algebraic expressions. A recursive abstract data type that represents
a language expression is called an abstract syntax tree (AST).

For this class, we are going to use “ParserLib”, a parser generator


for Java that we have developed specifically for 6.005. The parser
generator is similar in spirit to more widely used parser generators
like Antlr, but it has a simpler interface and is generally easier to use.

A ParserLib Grammar
The code for the examples that follow can be found on GitHub as
fa16-ex18-parser-generators.

Here is what our HTML grammar looks like as a ParserLib source


file:

root ::= html;


html ::= ( italic | normal ) *;
italic ::= '<i>' html '</i>';
normal ::= text;
text ::= [^<>]+; /* represents a string of one or more characters that are
Let’s break it down.

Each ParserLib rule consists of a name, followed by a ::= , followed


by its definition, terminated by a semicolon. The ParserLib grammar
can also include Java-style comments, both single line and multiline.

By convention, we use lower-case for non-terminals: root, html,
normal, italic. (The ParserLib library is actually case insensitive with
respect to non-terminal names; internally, it canonicalizes names to
all-lowercase, so even if you don’t write all your names in
lowercase, you will see them as lowercase when you print your
grammar.) Terminals are either quoted strings, like '<i>', or names
like text defined in terms of regular expressions over strings.

root ::= html;

root is the entry point of the grammar. This is the nonterminal that the
whole input needs to match. We don’t have to call it root . When
loading the grammar into our program, we will tell the library which
nonterminal to use as the entry point.

html ::= ( normal | italic ) *;

This rule shows that ParserLib rules can have the alternation operator
|, the repetition operators * and + , and parentheses for grouping, in
the same way we’ve been using in the grammars reading. Optional
parts can be marked with ? , just like we did earlier, but this particular
grammar doesn’t use ? .

italic ::= '<i>' html '</i>';


normal ::= text;
text ::= [^<>]+;

Note that the terminal text uses the notation [^<>] from before to
represent all characters except < and > .
In general, terminal symbols do not have to be a fixed string; they can
be a regular expression as in the example. For example, here are
some other terminal patterns we used in the URL grammar earlier in
the reading, now written in ParserLib syntax:

identifier ::= [a-z]+;


integer ::= [0-9]+;

Whitespace
Consider the grammar shown below.

root ::= sum;


sum ::= primitive ('+' primitive)*;
primitive ::= number | '(' sum ')';
number ::= [0-9]+;

This grammar will accept an expression like 42+2+5 , but will reject a
similar expression that has any spaces between the numbers and the
+ signs. We could modify the grammar to allow white space around
the plus sign by modifying the production rule for sum like this:

sum ::= primitive (whitespace* '+' whitespace* primitive)*;


whitespace ::= [ \t\r\n];

However, this can become cumbersome very quickly once the


grammar becomes more complicated. ParserLib allows a shorthand
to indicate that certain kinds of characters should be skipped.
//The IntegerExpression grammar
@skip whitespace{
root ::= sum;
sum ::= primitive ('+' primitive)*;
primitive ::= number | '(' sum ')';
}
whitespace ::= [ \t\r\n];
number ::= [0-9]+;

The @skip whitespace notation indicates that any text matching the
whitespace nonterminal should be skipped in between the parts that
make up the definitions of sum, root, and primitive. Two things are
important to note. First, there is nothing special about whitespace . The
@skip directive works with any nonterminal or terminal defined in the
grammar. Second, note how the definition of number was intentionally
left outside the @skip block. This is because we want to accept
expressions like 42 + 2 + 5 , but we want to reject expressions like
4 2 + 2 + 5. In the rest of the text, we refer to this grammar as the
IntegerExpression grammar.
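
To see the difference between skipping whitespace between tokens and allowing it inside a number, here is a rough Java sketch that does not use ParserLib at all. It writes the explicit-whitespace version of the sum rule shown earlier as a java.util.regex pattern, restricted to sums without parentheses (nested parentheses need recursion, which a regex can’t express). The class FlatSumDemo and its method are names invented for this illustration.

```java
import java.util.regex.Pattern;

class FlatSumDemo {
    // number (whitespace* '+' whitespace* number)*
    // where number ::= [0-9]+ and whitespace ::= [ \t\r\n].
    // Whitespace is allowed around '+', but never inside a number.
    private static final Pattern FLAT_SUM =
        Pattern.compile("[0-9]+([ \\t\\r\\n]*\\+[ \\t\\r\\n]*[0-9]+)*");

    /** @return true if s is a sum of numbers with optional whitespace around each '+' */
    static boolean isFlatSum(String s) {
        return FLAT_SUM.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isFlatSum("42+2+5"));      // true
        System.out.println(isFlatSum("42 + 2 + 5"));  // true
        System.out.println(isFlatSum("4 2 + 2 + 5")); // false: space inside a number
    }
}
```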

Generating the parser


The rest of this reading will use as a running example the
IntegerExpression grammar defined earlier, which we’ll store in a file
called IntegerExpression.g .

The ParserLib parser generator tool converts a grammar source file


like IntegerExpression.g into a parser. In order to do this, you need to
follow three steps. First, you need to import the ParserLib library,
which resides in a package lib6005.parser :

import lib6005.parser.*;
The second step is to define an Enum type that contains all the
terminals and non-terminals used by your grammar. This will tell the
compiler which definitions to expect in the grammar and will allow it to
check for any missing ones.

enum IntegerGrammar {ROOT, SUM, PRIMITIVE, NUMBER, WHITESPACE};

Note that ParserLib itself is case insensitive, but by convention, the


names of enum values are all upper case.

From within your code, you can create a parser by calling the compile

static method in GrammarCompiler .

...
Parser<IntegerGrammar> parser = GrammarCompiler.compile(new File("IntegerEx

The code opens the file IntegerExpression.g and compiles it using the
GrammarCompiler into a Parser object. The compile method takes as a
second argument the name of the nonterminal to use as the entry
point of the grammar; root in the case of this example.

Assuming you don’t have any syntax errors in your grammar file, the
result will be a Parser object that can be used to parse text in either a
string or a file. Notice that the Parser is a generic type that is
parameterized by the enum you defined earlier.

Calling the parser


Now that you’ve generated the parser object, you are ready to parse
your own text. The parser has a method called parse that takes in the
text to be parsed (in the form of either a String , an InputStream , a File

or a Reader ) and returns a ParseTree . Calling it produces a parse tree:

ParseTree<IntegerGrammar> tree = parser.parse("5+2+3+21");

Note that the ParseTree is also a generic type that is parameterized by


the enum type IntegerGrammar .

For debugging, we can then print this tree out:

System.out.println(tree.toString());

You can also try calling the method display() which will attempt to
open a browser window that will show you a visualization of your
parse tree. If for any reason it is not able to open the browser
window, the method will print a URL to the terminal which you can
copy and paste to your browser to view the visualization.

In the example code: Main.java lines 34-35, which use the enum in
lines 13-17.

reading exercises

Parse trees
Which of the following statements are true of a parse tree?
the root node of the tree corresponds to the starting symbol of the
grammar
the leaves of the tree correspond to terminals
the internal nodes of the tree correspond to nonterminals
only a grammar with recursive productions can generate a parse
tree

Traversing the parse tree


So we’ve used the parser to turn a stream of characters into a parse
tree, which shows how the grammar matches the stream. Now we
need to do something with this parse tree. We’re going to translate it
into a value of a recursive abstract data type.

The first step is to learn how to traverse the parse tree. The ParseTree

object has four methods that you need to be most familiar with.

/**
* Returns the substring of the original string that corresponds to this pa
* @return String containing the contents of this parse tree.
*/
public String getContents()

/**
* Ordered list of all the children nodes of this ParseTree node.
* @return a List of all children of this ParseTree node, ordered by positi
*/
public List<ParseTree<Symbols>> children()

/**
* Tells you whether a node corresponds to a terminal or a non-terminal.
* If it is terminal, it won't have any children.
* @return true if it is a terminal value.
*/
public boolean isTerminal()

/**
* Get the symbol for the terminal or non-terminal corresponding to this pa
* @return T will generally be an Enum representing the different symbols
* in the grammar, so the return value will be one of those.
*/
public Symbols getName()
Additionally, you can query the ParseTree for all children that match a
particular production rule:

/**
* Get all the children of this ParseTree node corresponding to a particul
* @param name
* Name of the non-terminal corresponding to the desired production rule.
* @return
* List of children ParseTree objects that match that name.
*/
public List <ParseTree<Symbols>> childrenByName(Symbols name);

Note that like the Parser itself, the ParseTree is also parameterized by
the type of the Symbols , which is expected to be an enum type that lists
all the symbols in the grammar.

The ParseTree implements the Iterable interface, so you can iterate


over all the children using a for loop. One way to visit all the nodes in
a parse tree is to write a recursive function. For example, the
recursive function below prints all nodes in the parse tree with proper
indentation.

/**
* Traverse a parse tree, indenting to make it easier to read.
* @param node
* Parse tree to print.
* @param indent
* Indentation to use.
*/
void visitAll(ParseTree<IntegerGrammar> node, String indent){
if(node.isTerminal()){
System.out.println(indent + node.getName() + ":" + node.getContents
}else{
System.out.println(indent + node.getName());
for(ParseTree<IntegerGrammar> child: node){
visitAll(child, indent + " ");
}
}
}

Constructing an abstract syntax tree


We need to convert the parse tree into a recursive data type. Here’s
the definition of the recursive data type that we’re going to use to
represent integer arithmetic expressions:

IntegerExpression = Number(n:int)
+ Plus(left:IntegerExpression, right:IntegerExpression)

If this syntax is mysterious, review recursive data type definitions.

When a recursive data type represents a language this way, it is


often called an abstract syntax tree. An IntegerExpression value
captures the important features of the expression – its grouping and
the integers in it – while omitting unnecessary details of the sequence
of characters that created it.

By contrast, the parse tree that we just generated with the


IntegerExpression parser is a concrete syntax tree. It’s called
concrete, rather than abstract, because it contains more details about
how the expression is represented in actual characters. For example,
the strings 2+2 , ((2)+(2)) , and 0002+0002 would each produce a different
concrete syntax tree, but these trees would all correspond to the
same abstract IntegerExpression value: Plus(Number(2), Number(2)) .
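
To make the claim concrete, here is a self-contained sketch that does not use ParserLib. It defines the two variants of IntegerExpression as immutable nested classes with value equality, plus a small hand-written parser for the sum and primitive rules (without whitespace skipping). All of the code and names below, including the wrapper class AstDemo, are assumptions made for illustration and are not the example code from Main.java.

```java
import java.util.Objects;

class AstDemo {
    /** IntegerExpression = Number(n:int) + Plus(left:IntegerExpression, right:IntegerExpression) */
    interface IntegerExpression { }

    static final class Number implements IntegerExpression {
        private final int n;
        Number(int n) { this.n = n; }
        @Override public boolean equals(Object o) {
            return o instanceof Number && ((Number) o).n == n;
        }
        @Override public int hashCode() { return n; }
        @Override public String toString() { return "Number(" + n + ")"; }
    }

    static final class Plus implements IntegerExpression {
        private final IntegerExpression left;
        private final IntegerExpression right;
        Plus(IntegerExpression left, IntegerExpression right) {
            this.left = left;
            this.right = right;
        }
        @Override public boolean equals(Object o) {
            return o instanceof Plus
                && ((Plus) o).left.equals(left) && ((Plus) o).right.equals(right);
        }
        @Override public int hashCode() { return Objects.hash(left, right); }
        @Override public String toString() { return "Plus(" + left + ", " + right + ")"; }
    }

    private final String s;
    private int pos = 0; // index of the next unread character

    private AstDemo(String s) { this.s = s; }

    /** Parses s according to the sum rule; throws IllegalArgumentException if it doesn't match. */
    static IntegerExpression parse(String s) {
        AstDemo p = new AstDemo(s);
        IntegerExpression e = p.sum();
        if (p.pos != s.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return e;
    }

    // sum ::= primitive ('+' primitive)*
    private IntegerExpression sum() {
        IntegerExpression result = primitive();
        while (pos < s.length() && s.charAt(pos) == '+') {
            pos++;
            result = new Plus(result, primitive());
        }
        return result;
    }

    // primitive ::= number | '(' sum ')'
    private IntegerExpression primitive() {
        if (pos < s.length() && s.charAt(pos) == '(') {
            pos++;
            IntegerExpression inner = sum(); // parentheses leave no trace in the AST
            if (pos >= s.length() || s.charAt(pos) != ')') {
                throw new IllegalArgumentException("expected ')' at " + pos);
            }
            pos++;
            return inner;
        }
        int start = pos;
        while (pos < s.length() && s.charAt(pos) >= '0' && s.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("expected number at " + pos);
        }
        return new Number(Integer.parseInt(s.substring(start, pos))); // leading zeros vanish
    }

    public static void main(String[] args) {
        System.out.println(parse("2+2"));                                // Plus(Number(2), Number(2))
        System.out.println(parse("2+2").equals(parse("((2)+(2))")));     // true
        System.out.println(parse("2+2").equals(parse("0002+0002")));     // true
    }
}
```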

Now, we can create a simple recursive function that walks the


ParseTree to produce an IntegerExpression as follows.

Here’s the code:


Main.java line 41

/**
* Function converts a ParseTree to an IntegerExpression.
* @param p
* ParseTree<IntegerGrammar> that is assumed to have been constructed
* @return
*/
IntegerExpression buildAST(ParseTree<IntegerGrammar> p){

switch(p.getName()){
/*
* Since p is a ParseTree parameterized by the type IntegerGrammar,
* returns an instance of the IntegerGrammar enum. This allows the
* that we have covered all the cases.
*/
case NUMBER:
/*
* A number will be a terminal containing a number.
*/
return new Number(Integer.parseInt(p.getContents()));
case PRIMITIVE:
/*
* A primitive will have either a number or a sum as child (in
* By checking which one, we can determine which case we are in
*/

if(p.childrenByName(IntegerGrammar.NUMBER).isEmpty()){
return buildAST(p.childrenByName(IntegerGrammar.SUM).get(0)
}else{
return buildAST(p.childrenByName(IntegerGrammar.NUMBER).get
}

case SUM:
/*
* A sum will have one or more children that need to be summed
* Note that we only care about the children that are primitive
* some whitespace children which we want to ignore.
*/
boolean first = true;
IntegerExpression result = null;
for(ParseTree<IntegerGrammar> child : p.childrenByName(IntegerG
if(first){
result = buildAST(child);
first = false;
}else{
result = new Plus(result, buildAST(child));
}
}
if(first){ throw new RuntimeException("sum must have a non whit
return result;
case ROOT:
/*
* The root has a single sum child, in addition to having poten
*/
return buildAST(p.childrenByName(IntegerGrammar.sum).get(0));
case WHITESPACE:
/*
* Since we are always avoiding calling buildAST with whitespac
* the code should never make it here.
*/
throw new RuntimeException("You should never reach here:" + p);
}
/*
* The compiler should be smart enough to tell that this code is un
*/
throw new RuntimeException("You should never reach here:" + p);
}

The function is quite simple, and very much follows the structure of
the grammar. An important thing to note is that there is a very strong
assumption that the code will process a ParseTree that corresponds to
the grammar in IntegerExpression.g. If you feed it a different kind of
ParseTree, the code will likely fail with a RuntimeException, but it will
always terminate and will never return a null reference.

reading exercises

String to AST 1

If the input string is "19+23+18", which abstract syntax tree would be
produced by buildAST above?

Plus(Number(19))
Plus(19, 23, 18)
Plus(Plus(19, 23), 18)
Plus(Plus(Number(19), Number(23)), Number(18))
Plus(Number(19), Plus(Number(23), Number(18)))


String to AST 2

Which of the following input strings would produce:

Plus(Plus(Number(1), Number(2)),
Plus(Number(3), Number(4)))

"(1+2)+(3+4)"
"1+2+3+4"
"(1+2)+3+4"
"(((1+2)))+(3+4)"


Handling errors
Several things can go wrong when parsing a file.

Your grammar file may fail to open.
Your grammar may be syntactically incorrect.
The string you are trying to parse may not be parseable with your
given grammar, either because your grammar is incorrect, or
because your string is incorrect.

In the first case, the compile method will throw an IOException. In the
second case, it will throw an UnableToParseException. In the third case,
the UnableToParseException will be thrown by the parse method. The
UnableToParseException will contain some information about the
possible location of the error. Parse errors are sometimes inherently
difficult to localize, though, since the parser cannot know what string
you intended to write, so you may need to search a little to find the
true location of the error.

Left recursion and other ParserLib limitations


ParserLib works by generating a top-down recursive descent parser.
These kinds of parsers have a few limitations in terms of the
grammars that they can parse. There are two in particular that are
worth pointing out.

Left recursion. A recursive descent parser can go into an infinite
loop if the grammar involves left recursion. This is a case where a
definition for a non-terminal involves that non-terminal as its leftmost
symbol. For example, the grammar below includes left recursion
because one of the possible definitions of sum is sum '+' number, which
has sum as its leftmost symbol.

//The IntegerExpression grammar
@skip whitespace{
    root ::= sum;
    sum ::= number | sum '+' number;
}
whitespace ::= [ \t\r\n];
number ::= [0-9]+;

Left recursion can also happen indirectly. For example, changing the
grammar above to the one below does not address the problem
because the definition of sum still indirectly involves a symbol that has
sum as its first symbol.

//The IntegerExpression grammar
@skip whitespace{
    root ::= sum;
    sum ::= number | thing number;
    thing ::= sum '+';
}
whitespace ::= [ \t\r\n];
number ::= [0-9]+;

If you give either of these grammars to ParserLib and then try to use
it to parse a string, ParserLib will fail with an
UnableToParseException listing the offending non-terminal.

There are some general techniques to eliminate left recursion; for our
purposes, the simplest approach will be to replace left recursion with
repetition ( * ), so the grammar above becomes:

//The IntegerExpression grammar


@skip whitespace{
root ::= sum;
sum ::= (number '+')* number;
}
whitespace ::= [ \t\r\n];
number ::= [0-9]+;
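To see why repetition sidesteps the problem, here is a hand-written recursive descent sketch for this grammar. It is not the code ParserLib generates; the class SumParser and its method names are hypothetical. A method for the left-recursive rule sum ::= sum '+' number would have to call itself before consuming any input, so it would never return. The loop below consumes a number on every iteration, so it always makes progress. It parses number ('+' number)*, which accepts the same strings as (number '+')* number.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written parser for sums of integers (illustrative, not ParserLib). */
class SumParser {
    private final String input;
    private int pos = 0;   // index of the next unread character

    private SumParser(String input) { this.input = input; }

    /** Parses the whole input as a sum and returns its numbers in order. */
    static List<Integer> parse(String input) {
        SumParser p = new SumParser(input);
        List<Integer> numbers = p.sum();
        if (p.pos != input.length()) {
            throw new IllegalArgumentException("unexpected input at position " + p.pos);
        }
        return numbers;
    }

    // sum ::= number ('+' number)*
    private List<Integer> sum() {
        List<Integer> numbers = new ArrayList<>();
        numbers.add(number());
        skipWhitespace();
        while (pos < input.length() && input.charAt(pos) == '+') {
            pos++;                    // consume '+'
            numbers.add(number());    // consumes at least one digit, or throws
            skipWhitespace();
        }
        return numbers;
    }

    // number ::= [0-9]+
    private int number() {
        skipWhitespace();
        int start = pos;
        while (pos < input.length() && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("expected a digit at position " + pos);
        }
        return Integer.parseInt(input.substring(start, pos));
    }

    // whitespace ::= [ \t\r\n]
    private void skipWhitespace() {
        while (pos < input.length() && " \t\r\n".indexOf(input.charAt(pos)) >= 0) {
            pos++;
        }
    }
}
```

For example, SumParser.parse("19 + 23+18") returns the list [19, 23, 18], and SumParser.parse("1+") throws an IllegalArgumentException.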

Greediness. This is not an issue that you will run into in this class,
but it is a limitation of ParserLib you should be aware of. The
ParserLib parsers are greedy in that at every point they try to match
a maximal string for any rule they are currently considering. For
example, consider the following grammar.

root ::= ab threeb;
ab ::= 'a'*'b'*;
threeb ::= 'bbb';

The string 'aaaabbb' is clearly in the grammar, but a greedy parser
cannot parse it because it will try to parse a maximal substring that
matches the ab symbol, and then it will find that it cannot parse threeb
because it has already consumed the entire string. Unlike left
recursion, which is easy to fix, this is a more fundamental limitation of
the type of parser implemented by ParserLib, but as mentioned
before, this is not something you should run into in this class.
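Java's own regular expressions offer a rough analogy for this behavior. This is only an analogy and says nothing about how ParserLib is implemented; the class and method names are hypothetical. A possessive quantifier (*+) matches as much as it can and never gives characters back, like the greedy parser described above, while an ordinary quantifier (*) backtracks when the rest of the pattern fails.

```java
import java.util.regex.Pattern;

class GreedinessDemo {
    /** ab matched possessively: b*+ consumes every 'b' and never gives one back. */
    static boolean possessiveMatch(String s) {
        return Pattern.matches("a*+b*+bbb", s);
    }

    /** ab matched with backtracking: b* gives 'b's back so that bbb can still match. */
    static boolean backtrackingMatch(String s) {
        return Pattern.matches("a*b*bbb", s);
    }

    public static void main(String[] args) {
        System.out.println(possessiveMatch("aaaabbb"));    // prints false
        System.out.println(backtrackingMatch("aaaabbb"));  // prints true
    }
}
```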

Summary
The topics of today’s reading connect to our three properties of good
software as follows:

Safe from bugs. A grammar is a declarative specification for
strings and streams, which can be implemented automatically by a
parser generator. These specifications are often simpler, more
direct, and less likely to be buggy than parsing code written by
hand.

Easy to understand. A grammar captures the shape of a
sequence in a form that is compact and easier to understand than
hand-written parsing code.

Ready for change. A grammar can be easily edited, then run
through a parser generator to regenerate the parsing code.

Reading 19: Concurrency
Software in 6.005

Safe from bugs          Easy to understand           Ready for change
Correct today and       Communicating clearly        Designed to
correct in the          with future programmers,     accommodate change
unknown future.         including future you.        without rewriting.

Objectives

The message passing and shared memory models of concurrency
Concurrent processes and threads, and time slicing
The danger of race conditions

Concurrency
Concurrency means multiple computations are happening at the
same time. Concurrency is everywhere in modern programming,
whether we like it or not:

Multiple computers in a network
Multiple applications running on one computer
Multiple processors in a computer (today, often multiple processor
cores on a single chip)

In fact, concurrency is essential in modern programming:

Web sites must handle multiple simultaneous users.
Mobile apps need to do some of their processing on servers (“in
the cloud”).
Graphical user interfaces almost always require background work
that does not interrupt the user. For example, Eclipse compiles
your Java code while you’re still editing it.

Being able to program with concurrency will still be important in the
future. Processor clock speeds are no longer increasing. Instead,
we’re getting more cores with each new generation of chips. So in
the future, in order to get a computation to run faster, we’ll have to
split up a computation into concurrent pieces.

Two Models for Concurrent Programming


There are two common models for concurrent programming: shared
memory and message passing.

[Figure: shared memory]

Shared memory. In the shared memory model of concurrency,
concurrent modules interact by reading and writing shared objects in
memory.

Examples of the shared-memory model:

A and B might be two processors (or processor cores) in the
same computer, sharing the same physical memory.

A and B might be two programs running on the same computer,
sharing a common filesystem with files they can read and write.

A and B might be two threads in the same Java program (we’ll
explain what a thread is below), sharing the same Java objects.

[Figure: message passing]

Message passing. In the message-passing model, concurrent
modules interact by sending messages to each other through a
communication channel. Modules send off messages, and incoming
messages to each module are queued up for handling. Examples
include:

A and B might be two computers in a network, communicating by
network connections.

A and B might be a web browser and a web server – A opens a
connection to B and asks for a web page, and B sends the web
page data back to A.

A and B might be an instant messaging client and server.

A and B might be two programs running on the same computer
whose input and output have been connected by a pipe, like
ls | grep typed into a command prompt.

Processes, Threads, Time-slicing


The message-passing and shared-memory models are about how
concurrent modules communicate. The concurrent modules
themselves come in two different kinds: processes and threads.

Process. A process is an instance of a running program that is
isolated from other processes on the same machine. In particular, it
has its own private section of the machine’s memory.

The process abstraction is a virtual computer. It makes the program
feel like it has the entire machine to itself – like a fresh computer has
been created, with fresh memory, just to run that program.

Just like computers connected across a network, processes normally
share no memory between them. A process can’t access another
process’s memory or objects at all. Sharing memory between
processes is possible on most operating systems, but it needs
special effort. By contrast, a new process is automatically ready for
message passing, because it is created with standard input & output
streams, which are the System.out and System.in streams you’ve used
in Java.

Thread. A thread is a locus of control inside a running program. Think
of it as a place in the program that is being run, plus the stack of
method calls that led to that place (so the thread can go back up the
stack when it reaches return statements).

Just as a process represents a virtual computer, the thread
abstraction represents a virtual processor. Making a new thread
simulates making a fresh processor inside the virtual computer
represented by the process. This new virtual processor runs the
same program and shares the same memory as other threads in the
process.

Threads are automatically ready for shared memory, because
threads share all the memory in the process. It takes special effort to
get “thread-local” memory that’s private to a single thread. It’s also
necessary to set up message-passing explicitly, by creating and using
queue data structures. We’ll talk about how to do that in a future
reading.

[Figure: time-slicing]

How can I have many concurrent threads with only one or two
processors in my computer? When there are more threads than
processors, concurrency is simulated by time slicing, which means
that the processor switches between threads. The figure
shows how three threads T1, T2, and T3 might be time-sliced on a
machine that has only two actual processors. In the figure, time
proceeds downward, so at first one processor is running thread T1
and the other is running thread T2, and then the second processor
switches to run thread T3. Thread T2 simply pauses, until its next
time slice on the same processor or another processor.

On most systems, time slicing happens unpredictably and
nondeterministically, meaning that a thread may be paused or
resumed at any time.

In the Java Tutorials, read:

Processes & Threads (just 1 page)
Defining and Starting a Thread (just 1 page)

The second Java Tutorials reading shows two ways to create a
thread.

Never use their second way (subclassing Thread).
Always implement the Runnable interface and use the new Thread(..)
constructor.

Their example declares a named class that implements Runnable:

public class HelloRunnable implements Runnable {
    public void run() {
        System.out.println("Hello from a thread!");
    }
}
// ... in the main method:
new Thread(new HelloRunnable()).start();

A very common idiom is starting a thread with an anonymous Runnable,
which eliminates the named class:

new Thread(new Runnable() {
    public void run() {
        System.out.println("Hello from a thread!");
    }
}).start();

Read: using an anonymous Runnable to start a thread
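Since Java 8, the same idiom can also be written with a lambda expression, because Runnable is an interface with a single method. The sketch below is an addition to the reading; the class name HelloLambda, the helper startAndWait, and the join() call are assumptions for illustration.

```java
class HelloLambda {
    /** Starts a thread from a lambda and waits for it; returns true if it finished. */
    static boolean startAndWait() {
        // the lambda's body becomes the body of Runnable.run()
        Thread t = new Thread(() -> System.out.println("Hello from a thread!"));
        t.start();   // start() runs the body in a new thread
        try {
            t.join();   // wait until the new thread has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        startAndWait();
    }
}
```

The rule is unchanged: the thread is still created with the new Thread(..) constructor and a Runnable, and nothing runs concurrently until start() is called.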

reading exercises

Processes and threads 1

When you run a Java program (for example, using the Run button in
Eclipse), how many processors, processes, and threads are created
at first?

Processors:

Processes:

Threads:


Processes and threads 2

Suppose we run main in this program, which contains bugs:

public class Moirai {
    public static void main(String[] args) {
        Thread clotho = new Thread(new Runnable() {
            public void run() { System.out.println("spinning"); };
        });
        clotho.start();
        new Thread(new Runnable() {
            public void run() { System.out.println("measuring"); };
        }).start();
        new Thread(new Runnable() {
            public void run() { System.out.println("cutting"); };
        });
    }
}

How many new Thread objects are created?

How many new threads are run?

What is the maximum number of threads that might be running at the
same time?

Processes and threads 3

Suppose we run main in this program, which demonstrates two
common bugs:

public class Parcae {
    public static void main(String[] args) {
        Thread nona = new Thread(new Runnable() {
            public void run() { System.out.println("spinning"); };
        });
        nona.run();
        Runnable decima = new Runnable() {
            public void run() { System.out.println("measuring"); };
        };
        decima.run();
        // ...
    }
}

How many new Thread objects are created?

How many new threads are run?

Shared Memory Example


Let’s look at an example of a shared memory system. The point of
this example is to show that concurrent programming is hard,
because it can have subtle bugs.

[Figure: shared memory model for bank accounts]

Imagine that a bank has cash machines that use a shared memory
model, so all the cash machines can read and write the same account
objects in memory.

To illustrate what can go wrong, let’s simplify the bank down to a
single account, with a dollar balance stored in the balance variable,
and two operations deposit and withdraw that simply add or remove a
dollar:

// suppose all the cash machines share a single bank account
private static int balance = 0;

private static void deposit() {
    balance = balance + 1;
}
private static void withdraw() {
    balance = balance - 1;
}

Customers use the cash machines to do transactions like this:

deposit(); // put a dollar in
withdraw(); // take it back out

In this simple example, every transaction is just a one-dollar deposit
followed by a one-dollar withdrawal, so it should leave the balance in
the account unchanged. Throughout the day, each cash machine in
our network is processing a sequence of deposit/withdraw
transactions.

// each ATM does a bunch of transactions that
// modify balance, but leave it unchanged afterward
private static void cashMachine() {
    for (int i = 0; i < TRANSACTIONS_PER_MACHINE; ++i) {
        deposit(); // put a dollar in
        withdraw(); // take it back out
    }
}

So at the end of the day, regardless of how many cash machines
were running, or how many transactions we processed, we should
expect the account balance to still be 0.

But if we run this code, we frequently discover that the balance at the
end of the day is not 0. If more than one cashMachine() call is running at
the same time – say, on separate processors in the same computer –
then balance may not be zero at the end of the day. Why not?

Interleaving
Here’s one thing that can happen. Suppose two cash machines, A
and B, are both working on a deposit at the same time. Here’s how
the deposit() step typically breaks down into low-level processor
instructions:

get balance (balance=0)
add 1
write back the result (balance=1)

When A and B are running concurrently, these low-level instructions
interleave with each other (some might even be simultaneous in some
sense, but let’s just worry about interleaving for now):

A                                     B
A get balance (balance=0)
A add 1
A write back the result (balance=1)
                                      B get balance (balance=1)
                                      B add 1
                                      B write back the result (balance=2)

This interleaving is fine – we end up with balance 2, so both A and B
successfully put in a dollar. But what if the interleaving looked like
this:

A                                     B
A get balance (balance=0)
                                      B get balance (balance=0)
A add 1
                                      B add 1
A write back the result (balance=1)
                                      B write back the result (balance=1)

The balance is now 1 – A’s dollar was lost! A and B both read the
balance at the same time, computed separate final balances, and
then raced to store back the new balance – which failed to take the
other’s deposit into account.
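To try the experiment yourself, here is a self-contained sketch that runs several cash machines in separate threads. The class name BankRaceDemo, the simulate method, and its parameters are assumptions added for illustration (the parameter transactionsPerMachine stands in for the constant TRANSACTIONS_PER_MACHINE). With one machine the final balance is always 0. With several machines the result depends on timing, so the sketch makes no claim about what it will print on any particular run.

```java
import java.util.ArrayList;
import java.util.List;

class BankRaceDemo {
    // all the cash machines share a single bank account
    private static int balance = 0;

    private static void deposit()  { balance = balance + 1; }
    private static void withdraw() { balance = balance - 1; }

    private static void cashMachine(int transactionsPerMachine) {
        for (int i = 0; i < transactionsPerMachine; ++i) {
            deposit();   // put a dollar in
            withdraw();  // take it back out
        }
    }

    /** Runs each cash machine in its own thread and returns the final balance. */
    static int simulate(int machines, int transactionsPerMachine) {
        balance = 0;
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < machines; ++i) {
            Thread t = new Thread(() -> cashMachine(transactionsPerMachine));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();   // wait for every machine to finish its day
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return balance;
    }

    public static void main(String[] args) {
        System.out.println("1 machine:  " + simulate(1, 1_000_000));  // no concurrency: 0
        System.out.println("4 machines: " + simulate(4, 1_000_000));  // may not be 0
    }
}
```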

Race Condition
This is an example of a race condition. A race condition means that
the correctness of the program (the satisfaction of postconditions and
invariants) depends on the relative timing of events in concurrent
computations A and B. When this happens, we say “A is in a race
with B.”

Some interleavings of events may be OK, in the sense that they are
consistent with what a single, nonconcurrent process would produce,
but other interleavings produce wrong answers – violating
postconditions or invariants.

Tweaking the Code Won’t Help


All these versions of the bank-account code exhibit the same race
condition:

// version 1
private static void deposit() { balance = balance + 1; }
private static void withdraw() { balance = balance - 1; }

// version 2
private static void deposit() { balance += 1; }
private static void withdraw() { balance -= 1; }

// version 3
private static void deposit() { ++balance; }
private static void withdraw() { --balance; }

You can’t tell just from looking at Java code how the processor is
going to execute it. You can’t tell what the indivisible operations – the
atomic operations – will be. It isn’t atomic just because it’s one line of
Java. It doesn’t touch balance only once just because the balance
identifier occurs only once in the line. The Java compiler, and in fact
the processor itself, makes no commitments about what low-level
operations it will generate from your code. In fact, a typical modern
Java compiler produces exactly the same code for all three of these
versions!

The key lesson is that you can’t tell by looking at an expression
whether it will be safe from race conditions.

Read: Thread Interference (just 1 page)

Reordering
It’s even worse than that, in fact. The race condition on the bank
account balance can be explained in terms of different interleavings of
sequential operations on different processors. But in fact, when
you’re using multiple variables and multiple processors, you can’t
even count on changes to those variables appearing in the same
order.

Here’s an example. Note that it uses a loop that continuously checks
for a concurrent condition; this is called busy waiting and it is not a
good pattern. In this case, the code is also broken:

private boolean ready = false;
private int answer = 0;

// computeAnswer runs in one thread
private void computeAnswer() {
    answer = 42;
    ready = true;
}

// useAnswer runs in a different thread
private void useAnswer() {
    while (!ready) {
        Thread.yield();
    }
    if (answer == 0) throw new RuntimeException("answer wasn't ready!");
}

We have two methods that are being run in different threads.
computeAnswer does a long calculation, finally coming up with the
answer 42, which it puts in the answer variable. Then it sets the ready
variable to true, in order to signal to the method running in the other
thread, useAnswer, that the answer is ready for it to use. Looking at the
code, answer is set before ready is set, so once useAnswer sees ready as
true, then it seems reasonable that it can assume that the answer will
be 42, right? Not so.

The problem is that modern compilers and processors do a lot of
things to make the code fast. One of those things is making
temporary copies of variables like answer and ready in faster storage
(registers or caches on a processor), and working with them
temporarily before eventually storing them back to their official
location in memory. The storeback may occur in a different order than
the variables were manipulated in your code. Here’s what might be
going on under the covers (but expressed in Java syntax to make it
clear). The processor is effectively creating two temporary variables,
tmpr and tmpa, to manipulate the fields ready and answer:

private void computeAnswer() {
    boolean tmpr = ready;
    int tmpa = answer;
    tmpa = 42;
    tmpr = true;

    ready = tmpr;
    // <-- what happens if useAnswer() interleaves here?
    // ready is set, but answer isn't.
    answer = tmpa;
}

reading exercises

Interleaving 1

Here’s the buggy code from our earlier exercise where two new
threads are started:

public class Moirai {
    public static void main(String[] args) {
        Thread clotho = new Thread(new Runnable() {
            public void run() { System.out.println("spinning"); };
        });
        clotho.start();
        new Thread(new Runnable() {
            public void run() { System.out.println("measuring"); };
        }).start();
        new Thread(new Runnable() {
            public void run() { System.out.println("cutting"); };
        });
        // bug! never started
    }
}

Which of the following are possible outputs from this program:

spinning
measuring
cutting
spinning
measuring
measuring
spinning
cutting
measuring
spinning
spinning
measuring


Interleaving 2

Here’s the buggy code from our earlier exercise where no new
threads are started:

public class Parcae {
    public static void main(String[] args) {
        Thread nona = new Thread(new Runnable() {
            public void run() { System.out.println("spinning"); };
        });
        nona.run(); // bug! called run instead of start
        Runnable decima = new Runnable() {
            public void run() { System.out.println("measuring"); };
        };
        decima.run(); // bug? maybe meant to create a Thread?
        // ...
    }
}

Which of the following are possible outputs from this program:

spinning
measuring
measuring
spinning
spinning
measuring

Race conditions 1

Consider the following code:

private static int x = 1;

public static void methodA() {
    x *= 2;
    x *= 3;
}

public static void methodB() {
    x *= 5;
}

Suppose methodA and methodB run sequentially, i.e. first one and then
the other. What is the final value of x ?


Race conditions 2

Now suppose methodA and methodB run concurrently, so that their
instructions might interleave arbitrarily. Which of the following are
possible final values of x?

1
2
5
6
10
30
150


Message Passing Example


[Figure: message passing bank account example]

Now let’s look at the message-passing approach to our bank account
example.

Now not only are the cash machines modules, but the accounts are
modules, too. Modules interact by sending messages to each other.
Incoming requests are placed in a queue to be handled one at a time.
The sender doesn’t stop working while waiting for an answer to its
request. It handles more requests from its own queue. The reply to
its request eventually comes back as another message.

Unfortunately, message passing doesn’t eliminate the possibility of
race conditions. Suppose each account supports get-balance and
withdraw operations, with corresponding messages. Two users, at
cash machines A and B, are both trying to withdraw a dollar from the
same account. They check the balance first to make sure they never
withdraw more than the account holds, because overdrafts trigger big
bank penalties:

get-balance
if balance >= 1 then withdraw 1

The problem is again interleaving, but this time interleaving of the
messages sent to the bank account, rather than the instructions
executed by A and B. If the account starts with a dollar in it, then
what interleaving of messages will fool A and B into thinking they can
both withdraw a dollar, thereby overdrawing the account?

One lesson here is that you need to carefully choose the operations
of a message-passing model. withdraw-if-sufficient-funds would be a
better operation than just withdraw.
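Here is one way the withdraw-if-sufficient-funds idea might be sketched with threads and queues from java.util.concurrent. Everything in it (the class MessagePassingAccount, the WithdrawRequest message type, the per-request reply queue, and the simulate method) is an assumption made for illustration, not the design the reading goes on to present. The account module handles one message at a time, so the balance check and the withdrawal cannot be separated by another machine's message.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MessagePassingAccount {
    /** A withdraw-if-sufficient-funds message; the reply comes back on its own queue. */
    static class WithdrawRequest {
        final int amount;
        final BlockingQueue<Boolean> reply = new ArrayBlockingQueue<>(1);
        WithdrawRequest(int amount) { this.amount = amount; }
    }

    /**
     * Two cash machines each ask to withdraw one dollar from an account that
     * starts with startingBalance. Returns how many withdrawals succeeded.
     */
    static int simulate(int startingBalance) {
        BlockingQueue<WithdrawRequest> inbox = new LinkedBlockingQueue<>();

        // The account module: only this thread ever touches balance.
        Thread account = new Thread(() -> {
            int balance = startingBalance;
            try {
                for (int handled = 0; handled < 2; ++handled) {
                    WithdrawRequest request = inbox.take();   // one message at a time
                    boolean sufficient = balance >= request.amount;
                    if (sufficient) { balance -= request.amount; }
                    request.reply.put(sufficient);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        account.start();

        // Cash machines A and B send their requests concurrently.
        WithdrawRequest fromA = new WithdrawRequest(1);
        WithdrawRequest fromB = new WithdrawRequest(1);
        new Thread(() -> inbox.add(fromA)).start();
        new Thread(() -> inbox.add(fromB)).start();

        // For simplicity this sketch blocks while waiting for the two replies.
        try {
            int succeeded = 0;
            if (fromA.reply.take()) { succeeded++; }
            if (fromB.reply.take()) { succeeded++; }
            return succeeded;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(simulate(1) + " of 2 withdrawals succeeded");
    }
}
```

Whichever request reaches the queue first wins the dollar; the other is refused, so an account that starts with one dollar is never overdrawn.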

Concurrency is Hard to Test and Debug


If we haven’t persuaded you that concurrency is tricky, here’s the
worst of it. It’s very hard to discover race conditions using testing.
And even once a test has found a bug, it may be very hard to localize
it to the part of the program causing it.

Concurrency bugs exhibit very poor reproducibility. It’s hard to make
them happen the same way twice. Interleaving of instructions or
messages depends on the relative timing of events that are strongly
influenced by the environment. Delays can be caused by other running
programs, other network traffic, operating system scheduling
decisions, variations in processor clock speed, etc. Each time you run
a program containing a race condition, you may get different
behavior.

These kinds of bugs are heisenbugs, which are nondeterministic and
hard to reproduce, as opposed to a bohrbug, which shows up
repeatedly whenever you look at it. Almost all bugs in sequential
programming are bohrbugs.

A heisenbug may even disappear when you try to look at it with
println or debugger! The reason is that printing and debugging are so
much slower than other operations, often 100-1000x slower, that they
dramatically change the timing of operations, and the interleaving. So
inserting a simple print statement into cashMachine():

private static void cashMachine() {
    for (int i = 0; i < TRANSACTIONS_PER_MACHINE; ++i) {
        deposit(); // put a dollar in
        withdraw(); // take it back out
        System.out.println(balance); // makes the bug disappear!
    }
}

…and suddenly the balance is always 0, as desired, and the bug
appears to disappear. But it’s only masked, not truly fixed. A change
in timing somewhere else in the program may suddenly make the bug
come back.

Concurrency is hard to get right. Part of the point of this reading is to
scare you a bit. Over the next several readings, we’ll see principled
ways to design concurrent programs so that they are safer from
these kinds of bugs.

reading exercises

Testing concurrency

You’re running a JUnit test suite (for code written by somebody else),
and some of the tests are failing. You add System.out.println
statements to the one method called by all the failing test cases, in
order to display some of its local variables, and the test cases
suddenly start passing. Which of the following are likely reasons for
this?

The method is calling a random number generator (e.g.
Math.random()), so sometimes its tests will pass by random chance.
The method has code running concurrently.
The method has a race condition.
The method’s preconditions are not being met by the test cases.

Summary
Concurrency: multiple computations running simultaneously
Shared-memory & message-passing paradigms
Processes & threads
    Process is like a virtual computer; thread is like a virtual
    processor
Race conditions
    When correctness of result (postconditions and invariants)
    depends on the relative timing of events

These ideas connect to our three key properties of good software
mostly in bad ways. Concurrency is necessary but it causes serious
problems for correctness. We’ll work on fixing those problems in the
next few readings.

Safe from bugs. Concurrency bugs are some of the hardest bugs
to find and fix, and require careful design to avoid.

Easy to understand. Predicting how concurrent code might
interleave with other concurrent code is very hard for
programmers to do. It’s best to design your code in such a way
that programmers don’t have to think about interleaving at all.

Ready for change. Not particularly relevant here.


Reading 20: Thread Safety
Software in 6.005

Safe from bugs          Easy to understand           Ready for change
Correct today and       Communicating clearly        Designed to
correct in the          with future programmers,     accommodate change
unknown future.         including future you.        without rewriting.

Objectives

Recall race conditions: multiple threads sharing the same mutable
variable without coordinating what they’re doing. This is unsafe,
because the correctness of the program may depend on accidents of
timing of their low-level operations.

There are basically four ways to make variable access safe in
shared-memory concurrency:

Confinement. Don’t share the variable between threads. This
idea is called confinement, and we’ll explore it today.
Immutability. Make the shared data immutable. We’ve talked a
lot about immutability already, but there are some additional
constraints for concurrent programming that we’ll talk about in this
reading.
Threadsafe data type. Encapsulate the shared data in an
existing threadsafe data type that does the coordination for you.
We’ll talk about that today.
Synchronization. Use synchronization to keep the threads from
accessing the variable at the same time. Synchronization is what
you need to build your own threadsafe data type.

We’ll talk about the first three ways in this reading, along with how to
make an argument that your code is threadsafe using those three
ideas. We’ll talk about the fourth approach, synchronization, in a later
reading.

The material in this reading is inspired by an excellent book: Brian
Goetz et al., Java Concurrency in Practice, Addison-Wesley, 2006.

What Threadsafe Means


A data type or static method is threadsafe if it behaves correctly
when used from multiple threads, regardless of how those threads
are executed, and without demanding additional coordination from the
calling code.

“behaves correctly” means satisfying its specification and
preserving its rep invariant;
“regardless of how threads are executed” means threads might
be on multiple processors or timesliced on the same processor;
“without additional coordination” means that the data type can’t
put preconditions on its caller related to timing, like “you can’t call
get() while set() is in progress.”

Remember Iterator? It’s not threadsafe. Iterator’s specification says
that you can’t modify a collection at the same time as you’re iterating
over it. That’s a timing-related precondition put on the caller, and
Iterator makes no guarantee to behave correctly if you violate it.
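The precondition can be violated even without a second thread. The sketch below (the class name, method name, and list contents are made up for illustration) removes an element from a java.util.ArrayList inside a for-each loop over that same list. In this particular case ArrayList's iterator notices the change and throws ConcurrentModificationException, but that detection is a best-effort courtesy rather than something the caller may rely on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class IteratorPreconditionDemo {
    /** Mutates a list while iterating over it; returns true if the iterator detected it. */
    static boolean modifyWhileIterating() {
        List<String> subjects = new ArrayList<>(Arrays.asList("6.045", "6.005", "6.813"));
        try {
            for (String subject : subjects) {   // the for-each loop uses an Iterator
                if (subject.startsWith("6.0")) {
                    subjects.remove(subject);   // violates the Iterator's precondition
                }
            }
        } catch (ConcurrentModificationException e) {
            return true;    // the iterator's next() call detected the modification
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(modifyWhileIterating());   // prints true
    }
}
```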

Strategy 1: Confinement
Our first way of achieving thread safety is confinement. Thread
confinement is a simple idea: you avoid races on mutable data by
keeping that data confined to a single thread. Don’t give any other
threads the ability to read or write the data directly.

Since shared mutable data is the root cause of a race condition,
confinement solves it by not sharing the mutable data.

Local variables are always thread confined. A local variable is stored
in the stack, and each thread has its own stack. There may be
multiple invocations of a method running at a time (in different threads
or even at different levels of a single thread’s stack, if the method is
recursive), but each of those invocations has its own private copy of
the variable, so the variable itself is confined.

But be careful – the variable is thread confined, but if it’s an object
reference, you also need to check the object it points to. If the object
is mutable, then we want to check that the object is confined as well
– there can’t be references to it that are reachable from any other
thread.

Confinement is what makes the accesses to n, i, and result safe in
code like this:

import java.math.BigInteger;

public class Factorial {

    /**
     * Computes n! and prints it on standard output.
     * @param n must be >= 0
     */
    private static void computeFact(final int n) {
        BigInteger result = new BigInteger("1");
        for (int i = 1; i <= n; ++i) {
            System.out.println("working on fact " + n);
            result = result.multiply(new BigInteger(String.valueOf(i)));
        }
        System.out.println("fact(" + n + ") = " + result);
    }

    public static void main(String[] args) {
        new Thread(new Runnable() { // create a thread using an
            public void run() {     // anonymous Runnable
                computeFact(99);
            }
        }).start();
        computeFact(100);
    }
}

This code starts the thread for computeFact(99) with an anonymous
Runnable, a common idiom discussed in the previous reading.

Let’s trace what happens when this code runs:

1. When we start the program, we start with one thread running main.

2. main creates a second thread using the anonymous Runnable idiom,
   and starts that thread.

3. At this point, we have two concurrent threads of execution. Their
   interleaving is unknown! But one possibility for the next thing
   that happens is that thread 1 enters computeFact.

4. Then, the next thing that might happen is that thread 2 also
   enters computeFact. At this point, we see how confinement helps
   with thread safety: each execution of computeFact has its own n, i,
   and result variables. None of the objects they point to are
   mutable; if they were mutable, we would need to check that the
   objects are not aliased from other threads.

5. The computeFact computations proceed independently, updating
   their respective variables.

Avoid Global Variables

Unlike local variables, static variables are not automatically thread
confined.

If you have static variables in your program, then you have to make
an argument that only one thread will ever use them, and you have to
document that fact clearly. Better, you should eliminate the static
variables entirely.

Here’s an example:

// This class has a race condition in it.
public class PinballSimulator {

    private static PinballSimulator simulator = null;
    // invariant: there should never be more than one PinballSimulator
    //            object created

    private PinballSimulator() {
        System.out.println("created a PinballSimulator object");
    }

    // factory method that returns the sole PinballSimulator object,
    // creating it if it doesn't exist
    public static PinballSimulator getInstance() {
        if (simulator == null) {
            simulator = new PinballSimulator();
        }
        return simulator;
    }
}

This class has a race in the getInstance() method – two threads could
call it at the same time and end up creating two copies of the
PinballSimulator object, which we don’t want.

To fix this race using the thread confinement approach, you would
specify that only a certain thread (maybe the “pinball simulation
thread”) is allowed to call PinballSimulator.getInstance() . The risk here
is that Java won’t help you guarantee this.
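Java will not enforce such a rule at compile time, but a class can at least detect violations at run time. The sketch below is one hypothetical way to do that, not something the reading prescribes: ConfinedCounter and its methods are invented names, and the check only fails fast when the rule is broken, it does not make the class threadsafe.

```java
final class ConfinedCounter {
    // The thread that constructed this object. The field is final, so it
    // is safe to read from any thread.
    private final Thread owner = Thread.currentThread();
    private int count = 0; // mutable state, meant to be confined to owner

    /** Increments the counter and returns the new value; owner thread only. */
    int increment() {
        checkOwner();
        count += 1;
        return count;
    }

    private void checkOwner() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("called from outside the owning thread");
        }
    }

    /** Returns true if a call made from a second thread is rejected. */
    static boolean rejectsOtherThread(ConfinedCounter counter) {
        final boolean[] rejected = { false };
        Thread intruder = new Thread(() -> {
            try {
                counter.increment();
            } catch (IllegalStateException e) {
                rejected[0] = true;
            }
        });
        intruder.start();
        try {
            intruder.join(); // join makes the intruder's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected[0];
    }

    public static void main(String[] args) {
        ConfinedCounter counter = new ConfinedCounter();
        System.out.println(counter.increment());
        System.out.println(rejectsOtherThread(counter));
    }
}
```

The same idea could guard PinballSimulator.getInstance(), provided the owning thread is fixed before any other thread can call it.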

In general, static variables are very risky for concurrency. They might
be hiding behind an innocuous function that seems to have no side-
effects or mutations. Consider this example:

// is this method threadsafe?

/**
 * @param x integer to test for primeness; requires x > 1
 * @return true if x is prime with high probability
 */
public static boolean isPrime(int x) {
    if (cache.containsKey(x)) return cache.get(x);
    boolean answer = BigInteger.valueOf(x).isProbablePrime(100);
    cache.put(x, answer);
    return answer;
}

private static Map<Integer,Boolean> cache = new HashMap<>();

This function stores the answers from previous calls in case they’re
requested again. This technique is called memoization, and it’s a
sensible optimization for slow functions like exact primality testing.
But now the isPrime method is not safe to call from multiple threads,
and its clients may not even realize it. The reason is that the HashMap
referenced by the static variable cache is shared by all calls to
isPrime(), and HashMap is not threadsafe. If multiple threads mutate the
map at the same time, by calling cache.put(), then the map can
become corrupted in the same way that the bank account became
corrupted in the last reading. If you’re lucky, the corruption may cause
an exception deep in the hash map, like a NullPointerException or
IndexOutOfBoundsException. But it also may just quietly give wrong
answers, as we saw in the bank account example.

reading exercises

Factorial

In the factorial example above, main looks like:

public static void main(String[] args) {
    new Thread(new Runnable() { // create a thread using an
        public void run() {     // anonymous Runnable
            computeFact(99);
        }
    }).start();
    computeFact(100);
}

Which of the following are possible interleavings?

The call to computeFact(100) starts before the call to computeFact(99) starts
The call to computeFact(99) starts before the call to computeFact(100) starts
The call to computeFact(100) finishes before the call to computeFact(99) starts
The call to computeFact(99) finishes before the call to computeFact(100) starts


PinballSimulator

Here’s part of the pinball simulator example above:

public class PinballSimulator {

    private static PinballSimulator simulator = null;

    // ...

    public static PinballSimulator getInstance() {
1)      if (simulator == null) {
2)          simulator = new PinballSimulator();
        }
3)      return simulator;
    }
}

The code has a race condition that invalidates the invariant that only
one simulator object is created.

Suppose two threads are running getInstance(). One thread is about
to execute one of the numbered lines above; the other thread is about
to execute the other. For each pair of possible line numbers, is it
possible the invariant will be violated?

About to execute lines 1 and 3

Yes, it could be violated
No, we’re safe

About to execute lines 1 and 2

Yes, it could be violated
No, we’re safe

About to execute lines 1 and 1

Yes, it could be violated
No, we’re safe

Confinement

In the following code, which variables are confined to a single thread?

public class C {
    public static void main(String[] args) {
        new Thread(new Runnable() {
            public void run() {
                threadA();
            }
        }).start();

        new Thread(new Runnable() {
            public void run() {
                threadB();
            }
        }).start();
    }

    private static String name = "Napoleon Dynamite";
    private static int cashLeft = 150;

    private static void threadA() {
        int amountA = 20;
        cashLeft = spend(amountA);
    }

    private static void threadB() {
        int amountB = 30;
        cashLeft = spend(amountB);
    }

    private static int spend(int amountToSpend) {
        return cashLeft - amountToSpend;
    }
}

amountA
amountB
amountToSpend
cashLeft
name


Strategy 2: Immutability
Our second way of achieving thread safety is by using immutable
references and data types. Immutability tackles the shared-mutable-
data cause of a race condition and solves it simply by making the
shared data not mutable.

Final variables are immutable references, so a variable declared final
is safe to access from multiple threads. You can only read the
variable, not write it. Be careful, because this safety applies only to
the variable itself, and we still have to argue that the object the
variable points to is immutable.

Immutable objects are usually also threadsafe. We say “usually” here
because our current definition of immutability is too loose for
concurrent programming. We’ve said that a type is immutable if an
object of the type always represents the same abstract value for its
entire lifetime. But that actually allows the type the freedom to mutate
its rep, as long as those mutations are invisible to clients. We saw an
example of this notion, called benevolent or beneficent mutation,
when we looked at an immutable list that cached its length in a
mutable field the first time the length was requested by a client.
Caching is a typical kind of beneficent mutation.

For concurrency, though, this kind of hidden mutation is not safe. An
immutable data type that uses beneficent mutation will have to make
itself threadsafe using locks (the same technique required of mutable
data types), which we’ll talk about in a future reading.

Stronger definition of immutability

So in order to be confident that an immutable data type is threadsafe
without locks, we need a stronger definition of immutability:

no mutator methods
all fields are private and final
no representation exposure
no mutation whatsoever of mutable objects in the rep – not even beneficent mutation

If you follow these rules, then you can be confident that your
immutable type will also be threadsafe.
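The rules above can be sketched in a small class. Playlist and its methods are hypothetical names chosen for illustration; declaring the class final follows the Java Tutorials strategy rather than the list above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class Playlist {                  // final: no subclass can add mutation
    private final String title;        // all fields private and final
    private final List<String> songs;  // never mutated after construction

    Playlist(String title, List<String> songs) {
        this.title = title;
        // Defensive copy, wrapped as unmodifiable: the caller keeps no alias
        // into the rep, and observers can return the list without exposing it.
        this.songs = Collections.unmodifiableList(new ArrayList<>(songs));
    }

    String title() { return title; }

    List<String> songs() { return songs; }

    /** Producer: returns a new Playlist rather than mutating this one. */
    Playlist with(String song) {
        List<String> extended = new ArrayList<>(songs);
        extended.add(song);
        return new Playlist(title, extended);
    }

    public static void main(String[] args) {
        Playlist p = new Playlist("road trip", Arrays.asList("a", "b"));
        Playlist q = p.with("c");
        System.out.println(p.songs() + " " + q.songs());
    }
}
```

Because nothing in the rep changes after the constructor returns, no interleaving of observer calls from different threads can see an inconsistent state.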

In the Java Tutorials, read:

A Strategy for Defining Immutable Objects (1 page)

reading exercises

Immutability

Suppose you’re reviewing an abstract data type which is specified to
be immutable, to decide whether its implementation actually is
immutable and threadsafe.

Which of the following elements would you have to look at?

fields
creator implementations
client calls to creators
producer implementations
client calls to producers
observer implementations
client calls to observers
mutator implementations
client calls to mutators


Strategy 3: Using Threadsafe Data Types


Our third major strategy for achieving thread safety is to store shared
mutable data in existing threadsafe data types.

When a data type in the Java library is threadsafe, its documentation
will explicitly state that fact. For example, here’s what StringBuffer
says:

[StringBuffer is] A thread-safe, mutable sequence of characters.
A string buffer is like a String, but can be modified. At any point in
time it contains some particular sequence of characters, but the
length and content of the sequence can be changed through
certain method calls.

String buffers are safe for use by multiple threads. The methods
are synchronized where necessary so that all the operations on
any particular instance behave as if they occur in some serial
order that is consistent with the order of the method calls made
by each of the individual threads involved.

This is in contrast to StringBuilder:

[StringBuilder is] A mutable sequence of characters. This class
provides an API compatible with StringBuffer, but with no
guarantee of synchronization. This class is designed for use as a
drop-in replacement for StringBuffer in places where the string
buffer was being used by a single thread (as is generally the
case). Where possible, it is recommended that this class be used
in preference to StringBuffer as it will be faster under most
implementations.

It’s become common in the Java API to find two mutable data types
that do the same thing, one threadsafe and the other not. The reason
is what this quote indicates: threadsafe data types usually incur a
performance penalty compared to an unsafe type.

It’s deeply unfortunate that StringBuffer and StringBuilder are named
so similarly, without any indication in the name that thread safety is
the crucial difference between them. It’s also unfortunate that the
interfaces they do share (such as CharSequence and Appendable) cover
only part of their API, so you can’t always simply swap in one
implementation for the other for the times when you need thread
safety. The Java collection interfaces do much better in this respect,
as we’ll see next.

Threadsafe Collections

The collection interfaces in Java – List, Set, Map – have basic
implementations that are not threadsafe. The implementations of
these that you’ve been used to using, namely ArrayList, HashMap, and
HashSet, cannot be used safely from more than one thread.

Fortunately, just like the Collections API provides wrapper methods
that make collections immutable, it provides another set of wrapper
methods to make collections threadsafe, while still mutable.

These wrappers effectively make each method of the collection
atomic with respect to the other methods. An atomic action
effectively happens all at once – it doesn’t interleave its internal
operations with those of other actions, and none of the effects of the
action are visible to other threads until the entire action is complete,
so it never looks partially done.

Now we see a way to fix the isPrime() method we had earlier in the
reading:

private static Map<Integer,Boolean> cache =
        Collections.synchronizedMap(new HashMap<>());

A few points here.

Don’t circumvent the wrapper. Make sure to throw away
references to the underlying non-threadsafe collection, and access it
only through the synchronized wrapper. That happens automatically in
the line of code above, since the new HashMap is passed only to
synchronizedMap() and never stored anywhere else. (We saw this same
warning with the unmodifiable wrappers: the underlying collection is
still mutable, and code with a reference to it can circumvent
immutability.)

Iterators are still not threadsafe. Even though method calls on the
collection itself ( get() , put() , add() , etc.) are now threadsafe, iterators
created from the collection are still not threadsafe. So you can’t use
iterator() , or the for loop syntax:

for (String s: lst) { ... } // not threadsafe, even if lst is a synchronize

The solution to this iteration problem will be to acquire the collection’s
lock when you need to iterate over it, which we’ll talk about in a future
reading.
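As a brief preview of that solution, here is a minimal sketch. It relies on the documented behavior of Collections.synchronizedList, whose wrapper uses the returned list itself as its lock; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SynchronizedIteration {
    /** Returns the elements of lst joined by commas.
     *  Requires lst to be a list returned by Collections.synchronizedList. */
    static String joinAll(List<String> lst) {
        StringBuilder sb = new StringBuilder(); // local, so confined
        synchronized (lst) { // hold the wrapper's lock for the whole traversal
            for (String s : lst) {
                if (sb.length() > 0) {
                    sb.append(",");
                }
                sb.append(s);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> lst = Collections.synchronizedList(
                new ArrayList<>(Arrays.asList("a", "b", "c")));
        System.out.println(joinAll(lst));
    }
}
```

While one thread is inside the synchronized block, other threads calling add() or remove() on the same wrapper wait, so the iterator never observes a concurrent modification.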

Finally, atomic operations aren’t enough to prevent races: the
way that you use the synchronized collection can still have a race
condition. Consider this code, which checks whether a list has at
least one element and then gets that element:

if ( ! lst.isEmpty()) { String s = lst.get(0); ... }

Even if you make lst into a synchronized list, this code still may have
a race condition, because another thread may remove the element
between the isEmpty() call and the get() call.
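One way to close that gap, sketched here under the assumption that lst came from Collections.synchronizedList (so the list object itself is the lock), is to hold the lock across both calls so that they behave as one atomic action. The names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class CheckThenAct {
    /** Returns the first element of lst, or null if lst is empty.
     *  Requires lst to be a list returned by Collections.synchronizedList. */
    static String firstOrNull(List<String> lst) {
        synchronized (lst) {
            // No other thread can add or remove between these two calls,
            // because every wrapper method needs the same lock.
            if (!lst.isEmpty()) {
                return lst.get(0);
            }
            return null;
        }
    }

    public static void main(String[] args) {
        List<String> lst = Collections.synchronizedList(new ArrayList<String>());
        System.out.println(firstOrNull(lst));
        lst.add("x");
        System.out.println(firstOrNull(lst));
    }
}
```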

Even the isPrime() method still has potential races:

if (cache.containsKey(x)) return cache.get(x);
boolean answer = BigInteger.valueOf(x).isProbablePrime(100);
cache.put(x, answer);

The synchronized map ensures that containsKey() , get() , and put() are
now atomic, so using them from multiple threads won’t damage the
rep invariant of the map. But those three operations can now
interleave in arbitrary ways with each other, which might break the
invariant that isPrime needs from the cache: if the cache maps an
integer x to a value f, then x is prime if and only if f is true. If the
cache ever fails this invariant, then we might return the wrong result.

So we have to argue that the races between containsKey(), get(), and
put() don’t threaten this invariant.

1. The race between containsKey() and get() is not harmful because
   we never remove items from the cache – once it contains a result
   for x, it will continue to do so.

2. There’s a race between containsKey() and put(). As a result, it may
   end up that two threads will both test the primeness of the same x
   at the same time, and both will race to call put() with the answer.
   But both of them should call put() with the same answer, so it
   doesn’t matter which one wins the race – the result will be the
   same.

The need to make these kinds of careful arguments about safety –
even when you’re using threadsafe data types – is the main reason
that concurrency is hard.
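One way to shorten the argument for isPrime, offered as an alternative sketch rather than the reading’s own fix, is to replace the three separate map calls with a single atomic one. ConcurrentHashMap.computeIfAbsent performs the lookup and the insertion as one atomic step; the class name PrimeCache is hypothetical.

```java
import java.math.BigInteger;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class PrimeCache {
    // Invariant: if cache maps x to f, then f is the primality answer for x.
    private static final Map<Integer, Boolean> cache = new ConcurrentHashMap<>();

    /**
     * @param x integer to test for primeness; requires x > 1
     * @return true if x is prime with high probability
     */
    static boolean isPrime(int x) {
        // Lookup and (if absent) computation plus insertion happen as one
        // atomic operation on the map, so there is no window between a
        // containsKey() and a put() to reason about.
        return cache.computeIfAbsent(x,
                k -> BigInteger.valueOf(k).isProbablePrime(100));
    }

    public static void main(String[] args) {
        System.out.println(isPrime(97));
        System.out.println(isPrime(98));
    }
}
```

The safety argument still has to be written down, but it now rests on one documented atomic operation instead of an analysis of every interleaving of three.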

In the Java Tutorials, read:

Wrapper Collections (1 page)


Concurrent Collections (1 page)

reading exercises

Threadsafe data types

Consider this class’s rep:

public class Building {
    private final String buildingName;
    private int numberOfFloors;
    private final int[] occupancyPerFloor;
    private final List<String> companyNames = Collections.synchronizedList(
    ...
}

Which of these variables use a threadsafe data type?

buildingName
numberOfFloors
occupancyPerFloor
companyNames

Which of these variables are safe for use by multiple threads?

buildingName
numberOfFloors
occupancyPerFloor
companyNames

Which of these variables cannot be involved in any race condition?

buildingName
numberOfFloors
occupancyPerFloor
companyNames

How to Make a Safety Argument


We’ve seen that concurrency is hard to test and debug. So if you
want to convince yourself and others that your concurrent program is
correct, the best approach is to make an explicit argument that it’s
free from races, and write it down.

A safety argument needs to catalog all the threads that exist in your
module or program, and the data that they use, and argue which
of the four techniques you are using to protect against races for each
data object or variable: confinement, immutability, threadsafe data
types, or synchronization. When you use the last two, you also need
to argue that all accesses to the data are appropriately atomic – that
is, that the invariants you depend on are not threatened by
interleaving. We gave one of those arguments for isPrime above.

Thread Safety Arguments for Data Types

Let’s see some examples of how to make thread safety arguments
for a data type. Remember our four approaches to thread safety:
confinement, immutability, threadsafe data types, and
synchronization. Since we haven’t talked about synchronization in this
reading, we’ll just focus on the first three approaches.

Confinement is not usually an option when we’re making an argument
just about a data type, because you have to know what threads exist
in the system and what objects they’ve been given access to. If the
data type creates its own set of threads, then you can talk about
confinement with respect to those threads. Otherwise, the threads
are coming in from the outside, carrying client calls, and the data type
may have no guarantees about which threads have references to
what. So confinement isn’t a useful argument in that case. Usually we
use confinement at a higher level, talking about the system as a
whole and arguing why we don’t need thread safety for some of our
modules or data types, because they won’t be shared across threads
by design.

Immutability is often a useful argument:

/** MyString is an immutable data type representing a string of characters. */
public class MyString {
    private final char[] a;
    // Thread safety argument:
    //    This class is threadsafe because it's immutable:
    //    - a is final
    //    - a points to a mutable char array, but that array is encapsulated
    //      in this object, not shared with any other object or exposed to a
    //      client

Here’s another rep for MyString that requires a little more care in the
argument:

/** MyString is an immutable data type representing a string of characters. */
public class MyString {
    private final char[] a;
    private final int start;
    private final int len;
    // Rep invariant:
    //    0 <= start <= a.length
    //    0 <= len <= a.length-start
    // Abstraction function:
    //    represents the string of characters a[start],...,a[start+len-1]
    // Thread safety argument:
    //    This class is threadsafe because it's immutable:
    //    - a, start, and len are final
    //    - a points to a mutable char array, which may be shared with other
    //      MyString objects, but they never mutate it
    //    - the array is never exposed to a client

Note that since this MyString rep was designed for sharing the array
between multiple MyString objects, we have to ensure that the sharing
doesn’t threaten its thread safety. As long as it doesn’t threaten the
MyString ’s immutability, however, we can be confident that it won’t
threaten the thread safety.

We also have to avoid rep exposure. Rep exposure is bad for any
data type, since it threatens the data type’s rep invariant. It’s also
fatal to thread safety.
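To make the argument concrete, here is a minimal sketch of how such a type might keep its shared array unexposed. MyStringSketch and its operations are hypothetical; the reading does not show MyString’s methods, so the constructor, substring, and toCharArray below are assumptions about what such a class could offer.

```java
import java.util.Arrays;

final class MyStringSketch {
    private final char[] a;
    private final int start;
    private final int len;
    // Thread safety argument:
    //   immutable: all fields final; a is copied on the way in and on the
    //   way out, shared only with other MyStringSketch objects, and never
    //   written after construction.

    MyStringSketch(char[] chars) {
        this.a = chars.clone(); // defensive copy: the caller keeps no alias
        this.start = 0;
        this.len = a.length;
    }

    // Private constructor used only for sharing the array between instances.
    private MyStringSketch(char[] shared, int start, int len) {
        this.a = shared;
        this.start = start;
        this.len = len;
    }

    int length() { return len; }

    /** Requires 0 <= i < length(). */
    char charAt(int i) {
        if (i < 0 || i >= len) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return a[start + i];
    }

    /** Requires 0 <= begin <= end <= length(). Shares the array, never mutates it. */
    MyStringSketch substring(int begin, int end) {
        if (begin < 0 || end > len || begin > end) {
            throw new IndexOutOfBoundsException(begin + ".." + end);
        }
        return new MyStringSketch(a, start + begin, end - begin);
    }

    /** Returns a fresh copy, so clients cannot reach the rep. */
    char[] toCharArray() {
        return Arrays.copyOfRange(a, start, start + len);
    }
}
```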

Bad Safety Arguments

Here are some incorrect arguments for thread safety:

/** MyStringBuffer is a threadsafe mutable string of characters. */
public class MyStringBuffer {
    private String text;
    // Rep invariant:
    //   none
    // Abstraction function:
    //   represents the sequence text[0],...,text[text.length()-1]
    // Thread safety argument:
    //   text is an immutable (and hence threadsafe) String,
    //   so this object is also threadsafe

Why doesn’t this argument work? String is indeed immutable and
threadsafe; but the rep pointing to that string, specifically the text
variable, is not immutable. text is not a final variable, and in fact it
can’t be final in this data type, because we need the data type to
support insertion and deletion operations. So reads and writes of the
text variable itself are not threadsafe. This argument is false.

Here’s another broken argument:

public class Graph {
    private final Set<Node> nodes =
                   Collections.synchronizedSet(new HashSet<>());
    private final Map<Node,Set<Node>> edges =
                   Collections.synchronizedMap(new HashMap<>());
    // Rep invariant:
    //    for all x, y such that y is a member of edges.get(x),
    //        x, y are both members of nodes
    // Abstraction function:
    //    represents a directed graph whose nodes are the set of nodes
    //        and whose edges are the set (x,y) such that
    //                         y is a member of edges.get(x)
    // Thread safety argument:
    //    - nodes and edges are final, so those variables are immutable
    //      and threadsafe
    //    - nodes and edges point to threadsafe set and map data types

This is a graph data type, which stores its nodes in a set and its
edges in a map. (Quick quiz: is Graph a mutable or immutable data
type? What do the final keywords have to do with its mutability?)
Graph relies on other threadsafe data types to help it implement its
rep – specifically the threadsafe Set and Map wrappers that we talked
about above. That prevents some race conditions, but not all,
because the graph’s rep invariant includes a relationship between the
node set and the edge map. All nodes that appear in the edge map
also have to appear in the node set. So there may be code like this:

public void addEdge(Node from, Node to) {
    if ( ! edges.containsKey(from)) {
        edges.put(from, Collections.synchronizedSet(new HashSet<>()));
    }
    edges.get(from).add(to);
    nodes.add(from);
    nodes.add(to);
}

This code has a race condition in it. There is a crucial moment when
the rep invariant is violated, right after the edges map is mutated, but
just before the nodes set is mutated. Another operation on the graph
might interleave at that moment, discover the rep invariant broken,
and return wrong results. Even though the threadsafe set and map
data types guarantee that their own add() and put() methods are
atomic and noninterfering, they can’t extend that guarantee to
interactions between the two data structures. So the rep invariant of
Graph is not safe from race conditions. Just using immutable and
threadsafe-mutable data types is not sufficient when the rep invariant
depends on relationships between objects in the rep.

We’ll have to fix this with synchronization, and we’ll see how in a
future reading.
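As a rough preview only, one possible shape of that fix is to guard every operation with the graph object’s own lock, so that no other operation can run during the moment when the rep invariant is temporarily broken. This sketch is an assumption about the approach, not the later reading’s code: SynchronizedGraph is a hypothetical name, and it uses String in place of the reading’s Node type to stay self-contained.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class SynchronizedGraph {
    // Plain collections suffice because every access holds this object's lock.
    private final Set<String> nodes = new HashSet<>();
    private final Map<String, Set<String>> edges = new HashMap<>();
    // Rep invariant:
    //    for all x, y such that y is a member of edges.get(x),
    //        x, y are both members of nodes

    synchronized void addEdge(String from, String to) {
        // The invariant is briefly violated after the next line, but no
        // other synchronized method can run until this one returns.
        edges.computeIfAbsent(from, k -> new HashSet<>()).add(to);
        nodes.add(from);
        nodes.add(to);
    }

    synchronized boolean hasEdge(String from, String to) {
        Set<String> targets = edges.get(from);
        return targets != null && targets.contains(to);
    }

    synchronized boolean hasNode(String node) {
        return nodes.contains(node);
    }
}
```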

reading exercises

Safety arguments

Consider the following ADT with a bad safety argument that
appeared above:

/** MyStringBuffer is a threadsafe mutable string of characters. */
public class MyStringBuffer {
    private String text;
    // Rep invariant:
    //   none
    // Abstraction function:
    //   represents the sequence text[0],...,text[text.length()-1]
    // Thread safety argument:
    //   text is an immutable (and hence threadsafe) String,
    //   so this object is also threadsafe

    /** @return the string represented by this buffer,
     *          with all letters converted to uppercase */
    public String toUpperCase() { return text.toUpperCase(); }

    /** @param pos position to insert text into the buffer,
     *             requires 0 <= pos <= length of the current string
     *  @param s text to insert
     *  Mutates this buffer to insert s as a substring at position pos. */
    public void insert(int pos, String s) {
        text = text.substring(0, pos) + s + text.substring(pos);
    }

    /** @return the string represented by this buffer */
    public String toString() { return text; }

    /** Resets this buffer to the empty string. */
    public void clear() { text = ""; }

    /** @return the first character of this buffer, or "" if this buffer is empty */
    public String first() {
        if (text.length() > 0) {
            return String.valueOf(text.charAt(0));
        } else {
            return "";
        }
    }
}

Which of these methods are counterexamples to the buggy safety
argument, because they have a race condition?

In particular, you should mark method A as a counterexample if it’s
possible that, if one thread is running method A at the same time as
another thread is running some other method, some interleaving
would violate A’s postcondition:

toUpperCase
insert
toString
clear
first


Serializability

Look again at the code for the exercise above. We might also be
concerned that clear and insert could interleave such that a client
sees clear violate its postcondition.

Suppose two threads are sharing MyStringBuffer sb representing "z".
They run clear and insert concurrently as shown below.

Thread A                      Thread B

call sb.clear()
                              call sb.insert(0, "a")
- in clear: text = ""
                              - in insert: text = "" + "a" + "z"
- clear returns
                              - insert returns
assert sb.toString()
          .equals("")

Thread A’s assertion will fail, but not because clear violated its
postcondition. Indeed, when all the code in clear has finished running,
the postcondition is satisfied.

The real problem is that thread A has not anticipated possible
interleaving between clear() and the assert. With any threadsafe
mutable type where atomic mutators are called concurrently, some
mutation has to “win” by being the last one applied. The result that
thread A observed is identical to the execution below, where the
mutators don’t interleave at all:

Thread A                      Thread B

call sb.clear()
- in clear: text = ""
- clear returns
                              call sb.insert(0, "a")
                              - in insert: text = "" + "a" + ""
                              - insert returns
assert sb.toString()
          .equals("")

What we demand from a threadsafe data type is that when clients
call its atomic operations concurrently, the results are consistent with
some sequential ordering of the calls. In this case, clearing and
inserting, that means either clear-followed-by-insert, or
insert-followed-by-clear. This property is called serializability: for
any set of operations executed concurrently, the result (the values and
state observable by clients) must be a result given by some sequential
ordering of those operations.
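Serializability can be observed with the library’s own StringBuffer, whose methods are synchronized. In this sketch (the class and method names are hypothetical), setLength(0) stands in for clear(). The two sequential orderings give "a" (clear, then insert) or "" (insert, then clear), so those are the only results a serializable type may produce.

```java
final class SerializabilityDemo {
    /** Runs clear and insert(0, "a") concurrently on a buffer holding "z"
     *  and returns the buffer's final contents. */
    static String clearAndInsertConcurrently() {
        StringBuffer sb = new StringBuffer("z");
        Thread a = new Thread(() -> sb.setLength(0));   // plays the role of clear()
        Thread b = new Thread(() -> sb.insert(0, "a"));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints either "[a]" or "[]" depending on which mutator ran last.
        System.out.println("[" + clearAndInsertConcurrently() + "]");
    }
}
```

A result such as "az" would not correspond to any sequential ordering of the two calls, and would therefore show that the type is not threadsafe.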

reading exercises

Serializability

Suppose two threads are sharing a MyStringBuffer representing "z".

For each pair of concurrent calls and their result, does that outcome
violate serializability (and therefore demonstrate that MyStringBuffer is
not threadsafe)?

clear() and insert(0, "a") → insert throws an IndexOutOfBoundsException

Violates serializability
Consistent with serializability

clear() and insert(1, "a") → insert throws an IndexOutOfBoundsException

Violates serializability
Consistent with serializability

first() and insert(0, "a") → first returns "a"

Violates serializability
Consistent with serializability

first() and clear() → first returns "z"

Violates serializability
Consistent with serializability

first() and clear() → first throws an IndexOutOfBoundsException

Violates serializability
Consistent with serializability

Summary
This reading talked about three major ways to achieve safety from
race conditions on shared mutable data:

Confinement: not sharing the data.
Immutability: sharing, but keeping the data immutable.
Threadsafe data types: storing the shared mutable data in a single threadsafe datatype.

These ideas connect to our three key properties of good software as
follows:

Safe from bugs. We’re trying to eliminate a major class of
concurrency bugs, race conditions, and eliminate them by design,
not just by accident of timing.

Easy to understand. Applying these general, simple design
patterns is far more understandable than a complex argument
about which thread interleavings are possible and which are not.

Ready for change. We’re writing down these justifications
explicitly in a thread safety argument, so that maintenance
programmers know what the code depends on for its thread
safety.

Reading 21: Sockets & Networking

Software in 6.005

Safe from bugs: Correct today and correct in the unknown future.
Easy to understand: Communicating clearly with future programmers, including future you.
Ready for change: Designed to accommodate change without rewriting.

Objectives

In this reading we examine client/server communication over the
network using the socket abstraction.

Network communication is inherently concurrent, so building clients
and servers will require us to reason about their concurrent behavior
and to implement them with thread safety. We must also design the
wire protocol that clients and servers use to communicate, just as we
design the operations that clients of an ADT use to work with it.

Some of the operations with sockets are blocking: they block the
progress of a thread until they can return a result. Blocking makes
writing some code easier, but it also foreshadows a new class of
concurrency bugs we’ll soon contend with in depth: deadlocks.

Client/server design pattern


In this reading (and in the problem set) we explore the client/server
design pattern for communication with message passing.

In this pattern there are two kinds of processes: clients and servers.
A client initiates the communication by connecting to a server. The
client sends requests to the server, and the server sends replies
back. Finally, the client disconnects. A server might handle
connections from many clients concurrently, and clients might also
connect to multiple servers.

Many Internet applications work this way: web browsers are clients
for web servers, an email program like Outlook is a client for a mail
server, etc.

On the Internet, client and server processes are often running on
different machines, connected only by the network, but it doesn’t have
to be that way: the server can be a process running on the same
machine as the client.

Network sockets

IP addresses

A network interface is identified by an IP address. IPv4 addresses
are 32-bit numbers written in four 8-bit parts. For example (as of this
writing):

18.9.22.69 is the IP address of an MIT web server. Every address
whose first octet is 18 is on the MIT network.

18.9.25.15 is the address of an MIT incoming email handler.

173.194.123.40 is the address of a Google web server.

127.0.0.1 is the loopback or localhost address: it always refers to
the local machine. Technically, any address whose first octet is 127
is a loopback address, but 127.0.0.1 is standard.

You can ask Google for your current IP address. In general, as you
carry around your laptop, every time you connect your machine to the
network it can be assigned a new IP address.
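To make the “32-bit number written in four 8-bit parts” description concrete, here is a small sketch that packs a dotted address into an int and unpacks it again. The class Ipv4 and its methods are hypothetical helpers written for this illustration; real programs would normally use java.net.InetAddress instead.

```java
final class Ipv4 {
    /** Packs a dotted-quad string such as "18.9.22.69" into a 32-bit int. */
    static int toInt(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four parts: " + dotted);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet; // shift in the next 8 bits
        }
        return value;
    }

    /** Unpacks a 32-bit int into dotted-quad notation. */
    static String toDotted(int value) {
        return ((value >>> 24) & 0xFF) + "." + ((value >>> 16) & 0xFF) + "."
                + ((value >>> 8) & 0xFF) + "." + (value & 0xFF);
    }

    /** Returns true if the first octet is 127, i.e. a loopback address. */
    static boolean isLoopback(int value) {
        return (value >>> 24) == 127;
    }

    public static void main(String[] args) {
        int mit = toInt("18.9.22.69");
        System.out.println(toDotted(mit) + " first octet: " + (mit >>> 24));
        System.out.println(isLoopback(toInt("127.0.0.1")));
    }
}
```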

Hostnames

Hostnames are names that can be translated into IP addresses. A
single hostname can map to different IP addresses at different times;
and multiple hostnames can map to the same IP address. For
example:

web.mit.edu is the name for MIT’s web server. You can translate
this name to an IP address yourself using dig, host, or nslookup on
the command line, e.g.:

$ dig +short web.mit.edu
18.9.22.69

dmz-mailsec-scanner-4.mit.edu is the name for one of MIT’s spam
filter machines responsible for handling incoming email.

google.com is exactly what you think it is. Try using one of the
commands above to find google.com’s IP address. What do you
see?

localhost is a name for 127.0.0.1. When you want to talk to a
server running on your own machine, talk to localhost.

Translation from hostnames to IP addresses is the job of the Domain
Name System (DNS). It’s super cool, but not part of our discussion
today.

Port numbers

A single machine might have multiple server applications that clients
wish to connect to, so we need a way to direct traffic on the same
network interface to different processes.

Network interfaces have multiple ports identified by a 16-bit number
from 0 (which is reserved, so we effectively start at 1) to 65535.

A server process binds to a particular port; it is now listening on
that port. Clients have to know which port number the server is
listening on. There are some well-known ports which are reserved for
system-level processes and provide standard ports for certain
services. For example:

Port 22 is the standard SSH port. When you connect to
athena.dialup.mit.edu using SSH, the software automatically uses
port 22.
Port 25 is the standard email server port.
Port 80 is the standard web server port. When you connect to the
URL http://web.mit.edu in your web browser, it connects to
18.9.22.69 on port 80.

When the port is not a standard port, it is specified as part of the
address. For example, the URL http://128.2.39.10:9000 refers to port
9000 on the machine at 128.2.39.10.
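The rule that a URL names a port only when it is not the standard one can be checked with java.net.URI. This is a small sketch, not part of the reading's code: the class UrlPorts and the method effectivePort are hypothetical names, and only the http scheme is handled.

```java
import java.net.URI;

public class UrlPorts {
    /** Returns the port a client would connect to for the given http URL. */
    public static int effectivePort(String url) {
        URI uri = URI.create(url);
        int explicitPort = uri.getPort();   // -1 when the URL does not name a port
        if (explicitPort != -1) {
            return explicitPort;
        }
        // Assumption of this sketch: we only know the well-known port for http.
        if ("http".equals(uri.getScheme())) {
            return 80;
        }
        throw new IllegalArgumentException("no default port known for " + url);
    }

    public static void main(String[] args) {
        System.out.println(effectivePort("http://web.mit.edu"));       // 80
        System.out.println(effectivePort("http://128.2.39.10:9000"));  // 9000
    }
}
```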

When a client connects to a server, that outgoing connection also
uses a port number on the client’s network interface, usually chosen
at random from the available non-well-known ports.

Network sockets

A socket represents one end of the connection between client and
server.

A listening socket is used by a server process to wait for
connections from remote clients.

In Java, use ServerSocket to make a listening socket, and use its
accept method to listen to it.

A connected socket can send and receive messages to and
from the process on the other end of the connection. It is
identified by both the local IP address and port number plus the
remote address and port, which allows a server to differentiate
between concurrent connections from different IPs, or from the
same IP on different remote ports.

In Java, clients use a Socket constructor to establish a socket
connection to a server. Servers obtain a connected socket as a
Socket object returned from ServerSocket.accept.
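To make the two kinds of socket concrete, here is a minimal sketch that creates a listening socket and both ends of one connection inside a single process, using the loopback address. The class and method names (SocketEndpoints, connectOnLoopback) are invented for this example, and port 0 asks the operating system for any free port. It shows that each end's local port is the other end's remote port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class SocketEndpoints {
    /**
     * Returns { listening port, client's remote port,
     *           client's local port, server side's remote port }.
     */
    public static int[] connectOnLoopback() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (
            // listening socket: port 0 means "pick any free port"
            ServerSocket listening = new ServerSocket(0, 1, loopback);
            // client end of the connection; the OS picks its local port
            Socket client = new Socket(loopback, listening.getLocalPort());
            // server end of the same connection
            Socket serverSide = listening.accept();
        ) {
            return new int[] {
                listening.getLocalPort(),
                client.getPort(),
                client.getLocalPort(),
                serverSide.getPort(),
            };
        }
    }

    public static void main(String[] args) throws IOException {
        int[] ports = connectOnLoopback();
        System.out.println("server listens on " + ports[0] + ", client connected to " + ports[1]);
        System.out.println("client's local port is " + ports[2] + ", server sees remote port " + ports[3]);
    }
}
```

The client can connect before accept is called because the operating system completes the connection and queues it for the listening socket.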

I/O
Buffers

The data that clients and servers exchange over the network is sent
in chunks. These are rarely just byte-sized chunks, although they
might be. The sending side (the client sending a request or the server
sending a response) typically writes a large chunk (maybe a whole
string like “HELLO, WORLD!” or maybe 20 megabytes of video
data). The network chops that chunk up into packets, and each
packet is routed separately over the network. At the other end, the
receiver reassembles the packets together into a stream of bytes.

The result is a bursty kind of data transmission: the data may
already be there when you want to read them, or you may have to
wait for them to arrive and be reassembled.

When data arrive, they go into a buffer, an array in memory that
holds the data until you read them.

reading exercises

Client server socket buffer*

You’re developing a new web server program on your own laptop.
You start the server running on port 8080.

Fill in the blanks for the URL you should visit in your web browser to
talk to your server:

__A__://__B__:__C__

__A__
__B__

__C__


Address hostname network stuffer*


A connected socket is identified by:
local IP address
remote IP address
local hostname
remote hostname
local port number
remote port number
local buffer
remote buffer


* see What if Dr. Seuss Did Technical Writing?, although the issue
described in the first stanza is no longer relevant with the
obsolescence of floppy disk drives

Streams

The data going into or coming out of a socket is a stream of bytes.

In Java, InputStream objects represent sources of data flowing into
your program. For example:

Reading from a file on disk with a FileInputStream
User input from System.in
Input from a network socket

OutputStream objects represent data sinks, places we can write data
to. For example:

FileOutputStream for saving to files
System.out is a PrintStream, an OutputStream that prints readable
representations of various types
Output to a network socket

In the Java Tutorials, read:

I/O Streams up to and including I/O from the Command Line (8
pages)

With sockets, remember that the output of one process is the input of
another process. If Alice and Bob have a socket connection, Alice
has an output stream that flows to Bob’s input stream, and vice
versa.
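The Alice and Bob picture can be sketched in one process by connecting two sockets over the loopback address. Everything here is illustrative: the class AliceAndBob and the method sendLine are made-up names, and a real Alice and Bob would be separate processes. Alice writes to her socket's output stream, and the same characters arrive on Bob's socket's input stream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class AliceAndBob {
    /** Alice sends one line; returns the line Bob reads at the other end. */
    public static String sendLine(String message) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (
            ServerSocket listening = new ServerSocket(0, 1, loopback);
            Socket alice = new Socket(loopback, listening.getLocalPort());
            Socket bob = listening.accept();
            // Alice's output stream, wrapped so we can write lines of text
            PrintWriter aliceOut = new PrintWriter(
                new OutputStreamWriter(alice.getOutputStream(), StandardCharsets.UTF_8), true);
            // Bob's input stream, wrapped so we can read lines of text
            BufferedReader bobIn = new BufferedReader(
                new InputStreamReader(bob.getInputStream(), StandardCharsets.UTF_8));
        ) {
            aliceOut.println(message);   // autoflush pushes the line onto the network
            return bobIn.readLine();     // blocks until the whole line has arrived
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Bob received: " + sendLine("HELLO, WORLD!"));
    }
}
```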

Blocking
Blocking means that a thread waits (without doing further work) until
an event occurs. We can use this term to describe methods and
method calls: if a method is a blocking method, then a call to that
method can block, waiting until some event occurs before it returns
to the caller.

Socket input/output streams exhibit blocking behavior:

When an incoming socket’s buffer is empty, calling read blocks
until data are available.
When the destination socket’s buffer is full, calling write blocks
until space is available.

Blocking is very convenient from a programmer’s point of view,
because the programmer can write code as if the read (or write) call
will always work, no matter what the timing of data arrival. If data (or
for write, space) is already available in the buffer, the call might return
very quickly. But if the read or write can’t succeed, the call blocks.
The operating system takes care of the details of delaying that thread
until read or write can succeed.

Blocking happens throughout concurrent programming, not just in I/O
(communication into and out of a process, perhaps over a network, or
to/from a file, or with the user on the command line or a GUI, …).
Concurrent modules don’t work in lockstep, like sequential programs
do, so they typically have to wait for each other to catch up when
coordinated action is required.

We’ll see in the next reading that this waiting gives rise to the second
major kind of bug (the first was race conditions) in concurrent
programming: deadlock, where modules are waiting for each other
to do something, so none of them can make any progress. But that’s
for next time.

Using network sockets


Make sure you’ve read about streams at the Java Tutorial link above,
then read about network sockets:

In the Java Tutorials, read:

All About Sockets (4 pages)

This reading describes everything you need to know about creating
server- and client-side sockets and writing to and reading from their
I/O streams.

On the second page

The example uses a syntax we haven’t seen: the try-with-resources
statement. This statement has the form:

try (
    // create new objects here that require cleanup after being used,
    // and assign them to variables
) {
    // code here runs with those variables
    // cleanup happens automatically after the code completes
} catch(...) {
    // you can include catch clauses if the code might throw exceptions
}
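A concrete instance of that form may help. In this sketch, TrackedResource is a hypothetical stand-in for a socket or stream: it only remembers whether close was called. The method closedAfterUse (also a made-up name) shows that cleanup happens whether the body finishes normally or throws.

```java
public class TryWithResourcesDemo {
    /** A stand-in for a socket or stream: it only remembers whether it was closed. */
    static class TrackedResource implements AutoCloseable {
        private boolean closed = false;
        public boolean isClosed() { return closed; }
        @Override public void close() { closed = true; }
    }

    /** Uses a resource, optionally failing midway; returns whether it ended up closed. */
    public static boolean closedAfterUse(boolean failMidway) {
        TrackedResource tracked = new TrackedResource();
        try (
            TrackedResource resource = tracked;
        ) {
            if (failMidway) {
                throw new IllegalStateException("simulated failure while using " + resource);
            }
        } catch (IllegalStateException e) {
            // by the time a catch clause runs, the resource has already been closed
        }
        return tracked.isClosed();
    }

    public static void main(String[] args) {
        System.out.println(closedAfterUse(false));  // true
        System.out.println(closedAfterUse(true));   // true
    }
}
```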

On the last page

Notice how both ServerSocket.accept() and in.readLine() are blocking.
This means that the server will need a new thread to handle I/O with
each new client. While the client-specific thread is working with that
client (perhaps blocked in a read or a write), another thread (perhaps
the main thread) is blocked waiting to accept a new connection.

Unfortunately, their multithreaded Knock Knock Server implementation
creates that new thread by subclassing Thread. That’s not the
recommended strategy. Instead, create a new class that implements
Runnable, or use an anonymous Runnable that calls a method where that
client connection will be handled until it’s closed. Don’t use
extends Thread. And while subclassing was popular when the Java API
was designed, we don’t discuss or recommend it at all because it has
many downsides.
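One way to follow that advice is sketched below. This is not the tutorial's Knock Knock Server: ThreadedEchoServer, start, handle, and echoOnce are names invented for this example, the server merely echoes lines, and it has no shutdown mechanism. The point is the structure: one thread blocks in accept, and each accepted connection is handed to an anonymous Runnable that calls the method handling that client.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ThreadedEchoServer {
    private final ServerSocket serverSocket;

    /** Listen on the given port of the loopback interface (0 = any free port). */
    public ThreadedEchoServer(int port) throws IOException {
        serverSocket = new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
    }

    public int port() { return serverSocket.getLocalPort(); }

    /** Accept clients on a background thread; each client gets its own handler thread. */
    public void start() {
        Thread acceptor = new Thread(new Runnable() {
            public void run() { acceptLoop(); }
        });
        acceptor.setDaemon(true);  // sketch only: lets the JVM exit without a shutdown protocol
        acceptor.start();
    }

    private void acceptLoop() {
        try {
            while (true) {
                final Socket client = serverSocket.accept();  // blocks until a client connects
                new Thread(new Runnable() {
                    public void run() { handle(client); }
                }).start();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /** Echo lines back to one client until it closes the connection. */
    private static void handle(Socket client) {
        try (
            Socket socket = client;
            BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
            PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
        ) {
            for (String line = in.readLine(); line != null; line = in.readLine()) {
                out.println(line);  // both readLine and println may block
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /** Client side: send one line to a server on this machine and return its reply. */
    public static String echoOnce(int port, String line) throws IOException {
        try (
            Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
            PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
            BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
        ) {
            out.println(line);
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        ThreadedEchoServer server = new ThreadedEchoServer(0);
        server.start();
        System.out.println(echoOnce(server.port(), "knock knock"));
    }
}
```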

reading exercises

Network sockets 1

Alice has a connected socket with Bob. How does she send a
message to Bob?

write to her socket’s input stream


write to her socket’s output stream
write to Bob’s socket’s input stream
write to Bob’s socket’s output stream


Network sockets 2
Which of these is it necessary for a client to know in order to connect
to and communicate with a server?

server IP address
server hostname
server port number
server process name
wire protocol


Echo echo echo echo

In the EchoClient example, which of these might block?

echoSocket.getInputStream()
new BufferedReader(new InputStreamReader(...))
userInput = stdIn.readLine()
in.readLine()


And in EchoServer, which of these might block?

new ServerSocket(...)
Socket clientSocket = serverSocket.accept()
inputLine = in.readLine()
e.getMessage()


Block block block block


Since BufferedReader.readLine() is a blocking method, which of these is
true:

When a thread calls readLine, all other threads block until readLine
returns
When a thread calls readLine, that thread blocks until readLine
returns
When a thread calls readLine, the call can be blocked and an
exception is thrown
BufferedReader has its own thread for readLine, which runs a block of
code passed in by the client


Wire protocols
Now that we have our client and server connected up with sockets,
what do they pass back and forth over those sockets?

A protocol is a set of messages that can be exchanged by two
communicating parties. A wire protocol in particular is a set of
messages represented as byte sequences, like hello world and bye
(assuming we’ve agreed on a way to encode those characters into
bytes).

Many Internet applications use simple ASCII-based wire protocols.
You can use a program called Telnet to check them out.

Telnet client
telnet is a utility that allows you to make a direct network connection
to a listening server and communicate with it via a terminal interface.
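Many Linux systems and older versions of Mac OS X have telnet
installed by default; recent macOS releases no longer include it, so
you may need to install it first.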

Windows users should first check if telnet is installed by running the
command telnet on the command line.

If you do not have telnet, you can install it via Control Panel →
Programs and Features → Turn Windows features on/off →
Telnet client. However, this version of telnet may be very hard to
use. If it does not show you what you’re typing, you will need to
turn on the localecho option.

A better alternative is PuTTY: download putty.exe. To connect
using PuTTY, enter a hostname and port, select Connection type:
Raw, and Close window on exit: Never. The last option will
prevent the window from disappearing as soon as the server
closes its end of the connection.

Let’s look at some examples of wire protocols:

HTTP

Hypertext Transfer Protocol (HTTP) is the language of the World
Wide Web. We already know that port 80 is the well-known port for
speaking HTTP to web servers, so let’s talk to one on the command
line.

You'll be using Telnet on the problem set, so try these out now. In
the transcripts below, user input is the command after the $ prompt
plus every line ending in ↵, which marks a newline (pressing enter)
typed into the telnet connection:

$ telnet www.eecs.mit.edu 80
Trying 18.62.0.96...
Connected to eecsweb.mit.edu.
Escape character is '^]'.
GET /↵
<!DOCTYPE html>
... lots of output ...
<title>Homepage | MIT EECS</title>
... lots more output ...

The GET command gets a web page. The / is the path of the page
you want on the site. So this command fetches the page at
http://www.eecs.mit.edu:80/. Since 80 is the default port for HTTP, this is
equivalent to visiting http://www.eecs.mit.edu/ in your web browser.
The result is HTML code that your browser renders to display the
EECS homepage.

Internet protocols are defined by RFC specifications (RFC stands for
“request for comment”, and some RFCs are eventually adopted as
standards). RFC 1945 defined HTTP version 1.0, and was
superseded by HTTP 1.1 in RFC 2616. So for many web sites, you
might need to speak HTTP 1.1 if you want to talk to them. For
example:

$ telnet web.mit.edu 80
Trying 18.9.22.69...
Connected to web.mit.edu.
Escape character is '^]'.
GET /aboutmit/ HTTP/1.1↵
Host: web.mit.edu↵

HTTP/1.1 200 OK
Date: Tue, 31 Mar 2015 15:14:22 GMT
... more headers ...

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
... more HTML ...
<title>MIT — About</title>
... lots more HTML ...

This time, your request must end with a blank line. HTTP version 1.1
requires the client to specify some extra information (called headers)
with the request, and the blank line signals the end of the headers.
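The exact bytes typed in that session can be written out as a Java string. This is only a sketch: HttpRequestText and its get method are hypothetical helpers, not part of any library, and they build just the minimal request shown above. Each line ends with a carriage return and line feed, and the empty line at the end is what tells the server that the headers are finished.

```java
public class HttpRequestText {
    private static final String CRLF = "\r\n";

    /** Returns the text of a minimal HTTP 1.1 GET request for the given host and path. */
    public static String get(String host, String path) {
        return "GET " + path + " HTTP/1.1" + CRLF   // request line
             + "Host: " + host + CRLF               // the one header we send
             + CRLF;                                // blank line: end of headers
    }

    public static void main(String[] args) {
        System.out.print(get("web.mit.edu", "/aboutmit/"));
    }
}
```

Writing this string to the output stream of a socket connected to port 80 would send the same request that was typed into telnet.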

You will also more than likely find that telnet does not exit after
making this request: this time, the server keeps the connection
open so you can make another request right away. To quit Telnet
manually, type the escape character (probably Ctrl-]) to bring up the
telnet> prompt, and type quit:

... lots more HTML ...
</html>
Ctrl-]↵
telnet> quit↵
Connection closed.

SMTP

Simple Mail Transfer Protocol (SMTP) is the protocol for sending
email (different protocols are used for client programs that retrieve
email from your inbox). Because the email system was designed in a
time before spam, modern email communication is fraught with traps
and heuristics designed to prevent abuse. But we can still try to
speak SMTP. Recall that the well-known SMTP port is 25, and
dmz-mailsec-scanner-4.mit.edu was the name of an MIT email handler.

You’ll need to fill in your-IP-address-here and your-username-here;
the ↵ symbols mark newlines for clarity. This will only work if you’re on
MITnet, and even then your mail might be rejected for looking
suspicious:

$ telnet dmz-mailsec-scanner-4.mit.edu 25
Trying 18.9.25.15...
Connected to dmz-mailsec-scanner-4.mit.edu.
Escape character is '^]'.
220 dmz-mailsec-scanner-4.mit.edu ESMTP Symantec Messaging Gateway
HELO your-IP-address-here↵
250 2.0.0 dmz-mailsec-scanner-4.mit.edu says HELO to your-ip-address:port
MAIL FROM: <your-username-here@mit.edu>↵
250 2.0.0 MAIL FROM accepted
RCPT TO: <your-username-here@mit.edu>↵
250 2.0.0 RCPT TO accepted
DATA↵
354 3.0.0 continue. finished with "\r\n.\r\n"
From: <your-username-here@mit.edu>↵
To: <your-username-here@mit.edu>↵
Subject: testing↵
This is a hand-crafted artisanal email.↵
.↵
250 2.0.0 OK 99/00-11111-22222222
QUIT↵
221 2.3.0 dmz-mailsec-scanner-4.mit.edu closing connection
Connection closed by foreign host.

SMTP is quite chatty in comparison to HTTP, providing some
human-readable instructions like continue. finished with "\r\n.\r\n" to tell us
how to terminate our message content.

Designing a wire protocol

When designing a wire protocol, apply the same rules of thumb you
use for designing the operations of an abstract data type:

Keep the number of different messages small. It’s better to have
a few commands and responses that can be combined rather than
many complex messages.

Each message should have a well-defined purpose and coherent
behavior.

The set of messages must be adequate for clients to make the
requests they need to make and for servers to deliver the results.

Just as we demand representation independence from our types, we
should aim for platform-independence in our protocols. HTTP can
be spoken by any web server and any web browser on any operating
system. The protocol doesn’t say anything about how web pages are
stored on disk, how they are prepared or generated by the server,
what algorithms the client will use to render them, etc.

We can also apply the three big ideas in this class:

Safe from bugs

The protocol should be easy for clients and servers to
generate and parse. Simpler code for reading and writing the
protocol (whether written with a parser generator like ANTLR,
with regular expressions, etc.) will have fewer opportunities for
bugs.

Consider the ways a broken or malicious client or server could
stuff garbage data into the protocol to break the process on
the other end.

Email spam is one example: when we spoke SMTP above, the
mail server asked us to say who was sending the email, and
there’s nothing in SMTP to prevent us from lying outright.
We’ve had to build systems on top of SMTP to try to stop
spammers who lie about From: addresses.

Security vulnerabilities are a more serious example. For
example, protocols that allow a client to send requests with
arbitrary amounts of data require careful handling on the
server to avoid running out of buffer space, or worse.

Easy to understand: for example, choosing a text-based
protocol means that we can debug communication errors by
reading the text of the client/server exchange. It even allows us to
speak the protocol “by hand” as we saw above.

Ready for change: for example, HTTP includes the ability to
specify a version number, so clients and servers can agree with
one another which version of the protocol they will use. If we need
to make changes to the protocol in the future, older clients or
servers can continue to work by announcing the version they will
use.

Serialization is the process of transforming data structures in
memory into a format that can be easily stored or transmitted (not
the same as serializability from Thread Safety). Rather than invent a
new format for serializing your data between clients and servers, use
an existing one. For example, JSON (JavaScript Object Notation) is a
simple, widely-used format for serializing basic values, arrays, and
maps with string keys.

Specifying a wire protocol

In order to precisely define for clients & servers what messages are
allowed by a protocol, use a grammar.

For example, here is a very small part of the HTTP 1.1 request
grammar from RFC 2616 section 5:

request ::= request-line
            ((general-header | request-header | entity-header) CRLF)*
            CRLF
            message-body?
request-line ::= method SPACE request-uri SPACE http-version CRLF
method ::= "OPTIONS" | "GET" | "HEAD" | "POST" | ...
...

Using the grammar, we can see that in this example request from
earlier:

GET /aboutmit/ HTTP/1.1
Host: web.mit.edu

GET is the method: we’re asking the server to get a page for us.
/aboutmit/ is the request-uri: the description of what we want to
get.
HTTP/1.1 is the http-version.
Host: web.mit.edu is some kind of header; we would have to
examine the rules for each of the ...-header options to discover
which one.
And we can see why we had to end the request with a blank line:
since a single request can have multiple headers that end in CRLF
(newline), we have another CRLF at the end to finish the request.
We don’t have any message-body, and since the server didn’t wait
to see if we would send one, presumably that only applies for
other kinds of requests.
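The request-line rule is small enough to turn into a regular expression. The sketch below is a simplification under stated assumptions, not the RFC's full grammar: only the four methods listed above are accepted, a request-uri is taken to be any run of non-space characters, and an http-version is taken to be HTTP/ followed by digits, a dot, and digits. RequestLineParser and parse are names made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RequestLineParser {
    // request-line ::= method SPACE request-uri SPACE http-version CRLF
    // (simplified: see the assumptions stated in the text above)
    private static final Pattern REQUEST_LINE = Pattern.compile(
        "(OPTIONS|GET|HEAD|POST) (\\S+) (HTTP/\\d+\\.\\d+)\r\n");

    /**
     * Returns { method, request-uri, http-version }.
     * @throws IllegalArgumentException if the text is not a request-line
     */
    public static String[] parse(String requestLine) {
        Matcher m = REQUEST_LINE.matcher(requestLine);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a request-line: " + requestLine);
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse("GET /aboutmit/ HTTP/1.1\r\n");
        System.out.println("method = " + parts[0]);
        System.out.println("request-uri = " + parts[1]);
        System.out.println("http-version = " + parts[2]);
    }
}
```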

The grammar is not enough: it fills a similar role to method signatures
when defining an ADT. We still need the specifications:

What are the preconditions of a message? For example, if a
particular field in a message is a string of digits, is any number
valid? Or must it be the ID number of a record known to the
server?

Under what circumstances can a message be sent? Are certain
messages only valid when sent in a certain sequence?

What are the postconditions? What action will the server take
based on a message? What server-side data will be mutated?
What reply will the server send back to the client?

reading exercises

Wire protocols 1

Which of the following tools could you use to speak HTTP with a web
server?

Chrome on your laptop


Chrome on your phone
SSH client
Telnet client
A web browser written in Java
A web browser you wrote in Java
java.net.HttpURLConnection , which has HTTP-specific features
java.net.Socket , the socket class discussed in this reading


Wire protocols 2

Consider this example wire protocol, specified using two grammars…

Messages from the client to the server

The client can turn lights, identified by numerical IDs, on and off. The
client can also request help.

MESSAGE ::= ( ON | OFF | HELP_REQ ) NEWLINE
ON ::= "on " ID
OFF ::= "off " ID
HELP_REQ ::= "help"
NEWLINE ::= "\r"? "\n"
ID ::= [1-9][0-9]*

Messages from the server to the client

The server can report the status of the lights and provides arbitrary
help messages.

MESSAGE ::= ( STATUS | HELP ) NEWLINE
STATUS ::= ONE_STATUS ( NEWLINE "and " ONE_STATUS )*
ONE_STATUS ::= ID " is " ( "on" | "off" )
HELP ::= [^\r\n]+
NEWLINE ::= "\r"? "\n"
ID ::= [1-9][0-9]*

We’ll use ↵ to represent a newline.

Which of these is a valid message from the client?


on ↵
on 0↵
help
off 10000000000000↵
OFF 1↵


Which of these is a valid message from the server?


1 is off
1 is off↵
1 is off↵ and 2 is on↵
Isn't this awesome? You can turn lights on and off!↵
Turn lights on and off using the commands:↵on↵off↵

Testing client/server code
Remember that concurrency is hard to test and debug. We can’t
reliably reproduce race conditions, and the network adds a source of
latency that is entirely beyond our control. You need to design for
concurrency and argue carefully for the correctness of your code.

Separate network code from data structures and algorithms

Most of the ADTs in your client/server program don’t need to rely on
networking. Make sure you specify, test, and implement them as
separate components that are safe from bugs, easy to understand,
and ready for change, in part because they don’t involve any
networking code.

If those ADTs will need to be used concurrently from multiple threads
(for example, threads handling different client connections), our next
reading will discuss your options. Otherwise, use the thread safety
strategies of confinement, immutability, and existing threadsafe data
types.

Separate socket code from stream code

A function or module that needs to read from and write to a socket
may only need access to the input/output streams, not to the socket
itself. This design allows you to test the module by connecting it to
streams that don’t come from a socket.

Two useful Java classes for this are ByteArrayInputStream and
ByteArrayOutputStream. Suppose we want to test this method:

void upperCaseLine(BufferedReader input, PrintWriter output) throws IOException
  requires: input and output are open
  effects: attempts to read a line from input
           and attempts to write that line, in upper
           case, to output

The method is normally used with a socket:

Socket sock = ...

// read a stream of characters from the socket input stream
BufferedReader in = new BufferedReader(new InputStreamReader(sock.getInputStream()));
// write characters to the socket output stream
PrintWriter out = new PrintWriter(sock.getOutputStream(), true);

upperCaseLine(in, out);

If the case conversion is a function we implement, it should already
be specified, tested, and implemented separately. But now we can
also test the read/write behavior of upperCaseLine:

// fixed input stream of "dog" (line 1) and "cat" (line 2)
String inString = "dog\ncat\n";
ByteArrayInputStream inBytes = new ByteArrayInputStream(inString.getBytes());
ByteArrayOutputStream outBytes = new ByteArrayOutputStream();

// read a stream of characters from the fixed input string
BufferedReader in = new BufferedReader(new InputStreamReader(inBytes));
// write characters to temporary storage
PrintWriter out = new PrintWriter(outBytes, true);

upperCaseLine(in, out);

// check that it read the expected amount of input
assertEquals("expected input line 2 remaining", "cat", in.readLine());
// check that it wrote the expected output
assertEquals("expected upper case of input line 1", "DOG\n", outBytes.toString());

In this test, inBytes and outBytes are test stubs. To isolate and test
just upperCaseLine , we replace the components it normally depends on
(input/output streams from a socket) with components that satisfy the
same spec but have canned behavior: an input stream with fixed
input, and an output stream that stores the output in memory.
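The reading gives only the spec of upperCaseLine, so the body below is one possible implementation, written as an assumption for the sake of a runnable example. The class name UpperCaseLineDemo and the helper runWithStubs are also invented here. The helper wires the method to in-memory streams in the same way as the test shown earlier and returns what was written and what input remains.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class UpperCaseLineDemo {
    /**
     * requires: input and output are open
     * effects: attempts to read a line from input and attempts to write
     *          that line, in upper case, to output
     */
    public static void upperCaseLine(BufferedReader input, PrintWriter output) throws IOException {
        String line = input.readLine();
        if (line == null) {
            return;  // end of stream: nothing to convert
        }
        // write an explicit "\n" rather than println's platform-specific separator
        output.print(line.toUpperCase(Locale.ROOT) + "\n");
        output.flush();
    }

    /** Runs upperCaseLine on in-memory stubs; returns { output written, next unread input line }. */
    public static String[] runWithStubs(String inString) throws IOException {
        ByteArrayInputStream inBytes =
            new ByteArrayInputStream(inString.getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream outBytes = new ByteArrayOutputStream();
        BufferedReader in =
            new BufferedReader(new InputStreamReader(inBytes, StandardCharsets.UTF_8));
        PrintWriter out =
            new PrintWriter(new OutputStreamWriter(outBytes, StandardCharsets.UTF_8), true);

        upperCaseLine(in, out);

        return new String[] { outBytes.toString("UTF-8"), in.readLine() };
    }

    public static void main(String[] args) throws IOException {
        String[] result = runWithStubs("dog\ncat\n");
        System.out.print("wrote: " + result[0]);
        System.out.println("next input line: " + result[1]);
    }
}
```

One design choice worth noting: an implementation that used println instead would write the platform's line separator, so a check against "DOG\n" could fail on a system whose separator is "\r\n".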

Testing strategies for more complex modules might use a mock
object to simulate the behavior of a real client or server by producing
entire canned sequences of interaction and asserting the correctness
of each message received from the other component.

Summary
In the client/server design pattern, concurrency is inevitable: multiple
clients and multiple servers are connected on the network, sending
and receiving messages simultaneously, and expecting timely replies.
A server that blocks waiting for one slow client when there are other
clients waiting to connect to it or to receive replies will not make
those clients happy. At the same time, a server that performs
incorrect computations or returns bogus results because of
concurrent modification to shared mutable data by different clients will
not make anyone happy.

All the challenges of making our multi-threaded code safe from
bugs, easy to understand, and ready for change apply when we
design network clients and servers. These processes run
concurrently with one another (often on different machines), and any
server that wants to talk to multiple clients concurrently (or a client
that wants to talk to multiple servers) must manage that
multi-threaded communication.
Reading 22: Queues and Message-Passing
Software in 6.005

Safe from bugs: Correct today and correct in the unknown future.
Easy to understand: Communicating clearly with future programmers,
including future you.
Ready for change: Designed to accommodate change without rewriting.

Objectives

After reading the notes and examining the code for this class, you
should be able to use message passing (with synchronous queues)
instead of shared memory for communication between threads.

Two models for concurrency


In our introduction to concurrency, we saw two models for concurrent
programming: shared memory and message passing.

In the shared memory model, concurrent modules interact by
reading and writing shared mutable objects in memory. Creating
multiple threads inside a single Java process is our primary
example of shared-memory concurrency.

In the message passing model, concurrent modules interact by
sending immutable messages to one another over a
communication channel. We’ve had one example of message
passing so far: the client/server pattern, in which clients and
servers are concurrent processes, often on different machines,
and the communication channel is a network socket.

The message passing model has several advantages over the shared
memory model, which boil down to greater safety from bugs. In
message-passing, concurrent modules interact explicitly, by passing
messages through the communication channel, rather than implicitly
through mutation of shared data. The implicit interaction of shared
memory can too easily lead to inadvertent interaction, sharing and
manipulating data in parts of the program that don’t know they’re
concurrent and aren’t cooperating properly in the thread safety
strategy. Message passing also shares only immutable objects (the
messages) between modules, whereas shared memory requires
sharing mutable objects, which we have already seen can be a
source of bugs.

We’ll discuss in this reading how to implement message passing
within a single process, as opposed to between processes over the
network. We’ll use blocking queues (an existing threadsafe type) to
implement message passing between threads within a process.

Message passing with threads


We’ve previously talked about message passing between processes:
clients and servers communicating over network sockets. We can
also use message passing between threads within the same process,
and this design is often preferable to a shared memory design with
locks, which we’ll talk about in the next reading.

Use a synchronized queue for message passing between threads.
The queue serves the same function as the buffered network
communication channel in client/server message passing. Java
provides the BlockingQueue interface for queues with blocking
operations:

In an ordinary Queue:

add(e) adds element e to the end of the queue.
remove() removes and returns the element at the head of the
queue, or throws an exception if the queue is empty.

A BlockingQueue extends this interface: it additionally supports
operations that wait for the queue to become non-empty when
retrieving an element, and wait for space to become available in
the queue when storing an element.

put(e) blocks until it can add element e to the end of the queue (if
the queue does not have a size bound, put will not block).
take() blocks until it can remove and return the element at the
head of the queue, waiting until the queue is non-empty.

When you are using a BlockingQueue for message passing between
threads, make sure to use the put() and take() operations, not add()
and remove().
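The difference between the two pairs of operations shows up as soon as the queue is full or empty. The sketch below uses a queue with room for one element; the class and method names are made up for this illustration. With add and remove, the caller gets an exception instead of waiting, which is why they are the wrong choice for message passing.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class AddRemoveVersusPutTake {
    /** Describes what add() does when the queue is already full. */
    public static String addToFullQueue() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        queue.add("first");
        try {
            queue.add("second");   // no room: add() does not wait
            return "added";
        } catch (IllegalStateException e) {
            return "exception";
        }
    }

    /** Describes what remove() does when the queue is empty. */
    public static String removeFromEmptyQueue() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        try {
            queue.remove();        // nothing there: remove() does not wait
            return "removed";
        } catch (NoSuchElementException e) {
            return "exception";
        }
    }

    /** put() then take() on one thread: neither blocks, since there is room and then an element. */
    public static String putThenTake() throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        queue.put("message");
        return queue.take();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("add on full queue: " + addToFullQueue());
        System.out.println("remove on empty queue: " + removeFromEmptyQueue());
        System.out.println("put then take: " + putThenTake());
    }
}
```

Had the second add been a put, the calling thread would have waited until some other thread took an element.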

Analogous to the client/server pattern for message passing over a
network is the producer-consumer design pattern for message
passing between threads. Producer threads and consumer threads
share a synchronized queue. Producers put data or requests onto the
queue, and consumers remove and process them. One or more
producers and one or more consumers might all be adding and
removing items from the same queue. This queue must be safe for
concurrency.

Java provides two implementations of BlockingQueue:

ArrayBlockingQueue is a fixed-size queue that uses an array
representation. Putting a new item on the queue will block if the
queue is full.
LinkedBlockingQueue is a growable queue using a linked-list
representation. If no maximum capacity is specified, the queue will
never fill up, so put will never block.
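Here is a minimal producer-consumer sketch using an ArrayBlockingQueue. The names (ProducerConsumerDemo, sumThroughQueue) and the capacity of 2 are arbitrary choices for this example. The small capacity means the producer will sometimes block in put until the consumer catches up, and the consumer will sometimes block in take until the producer supplies more.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumerDemo {
    /** Sends 1..count through a small queue from one thread to another; returns their sum. */
    public static int sumThroughQueue(final int count) throws InterruptedException {
        final BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2);
        final int[] sum = new int[1];  // written only by the consumer thread

        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 1; i <= count; i++) {
                        queue.put(i);          // blocks while the queue is full
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 0; i < count; i++) {
                        sum[0] += queue.take(); // blocks while the queue is empty
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();   // after join, the consumer's writes are visible here
        return sum[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumThroughQueue(5));  // 15
    }
}
```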
Unlike the streams of bytes sent and received by sockets, these
synchronized queues (like normal collections classes in Java) can
hold objects of an arbitrary type. Instead of designing a wire protocol,
we must choose or design a type for messages in the queue. It must
be an immutable type. And just as we did with operations on a
threadsafe ADT or messages in a wire protocol, we must design our
messages here to prevent race conditions and enable clients to
perform the atomic operations they need.

Bank account example

Our first example of message passing was the bank account
example.

Each cash machine and each account is its own module, and modules
interact by sending messages to one another. Incoming messages
arrive on a queue.

We designed messages for get-balance and withdraw, and said that
each cash machine checks the account balance before withdrawing
to prevent overdrafts:

get-balance
if balance >= 1 then withdraw 1

But it is still possible to interleave messages from two cash machines
so they are both fooled into thinking they can safely withdraw the last
dollar from an account with only $1 in it.

We need to choose a better atomic operation:
withdraw-if-sufficient-funds would be a better operation than just
withdraw.
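As a sketch of what withdraw-if-sufficient-funds could look like with queues, the account below is a thread that owns the balance and handles one message at a time, so the check and the withdrawal cannot be interleaved with another request. All names here (AccountDemo, Withdraw, Outcome, startAccount) are hypothetical, and for brevity a single thread plays both cash machines by putting two requests on the queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AccountDemo {
    /** Immutable request: withdraw amount only if the balance is sufficient. */
    static final class Withdraw {
        final String machine;
        final int amount;
        Withdraw(String machine, int amount) { this.machine = machine; this.amount = amount; }
    }

    /** Immutable reply: whether the named machine's withdrawal went through. */
    static final class Outcome {
        final String machine;
        final boolean succeeded;
        Outcome(String machine, boolean succeeded) { this.machine = machine; this.succeeded = succeeded; }
    }

    /** Starts an account thread that handles a fixed number of requests, then stops. */
    static Thread startAccount(final int openingBalance, final int requestsToHandle,
                               final BlockingQueue<Withdraw> requests,
                               final BlockingQueue<Outcome> replies) {
        Thread account = new Thread(new Runnable() {
            public void run() {
                int balance = openingBalance;  // confined to the account thread
                try {
                    for (int i = 0; i < requestsToHandle; i++) {
                        Withdraw request = requests.take();
                        // check and update happen while handling a single message
                        boolean ok = request.amount <= balance;
                        if (ok) {
                            balance -= request.amount;
                        }
                        replies.put(new Outcome(request.machine, ok));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        account.start();
        return account;
    }

    /** Two machines each ask for the last dollar; returns how many requests succeed. */
    public static int successfulWithdrawals() throws InterruptedException {
        BlockingQueue<Withdraw> requests = new LinkedBlockingQueue<>();
        BlockingQueue<Outcome> replies = new LinkedBlockingQueue<>();
        Thread account = startAccount(1, 2, requests, replies);

        requests.put(new Withdraw("machine A", 1));
        requests.put(new Withdraw("machine B", 1));

        int successes = 0;
        for (int i = 0; i < 2; i++) {
            if (replies.take().succeeded) {
                successes++;
            }
        }
        account.join();
        return successes;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(successfulWithdrawals() + " withdrawal(s) succeeded");
    }
}
```

With an opening balance of 1, exactly one of the two requests is granted, whichever order they are handled in.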

Implementing message passing with queues


You can see all the code for this example on GitHub: squarer
example. All the relevant parts are excerpted below.

Here’s a message passing module for squaring integers:

SquareQueue.java line 6

/** Squares integers. */
public class Squarer {

    private final BlockingQueue<Integer> in;
    private final BlockingQueue<SquareResult> out;
    // Rep invariant: in, out != null

    /** Make a new squarer.
     *  @param requests queue to receive requests from
     *  @param replies queue to send replies to */
    public Squarer(BlockingQueue<Integer> requests,
                   BlockingQueue<SquareResult> replies) {
        this.in = requests;
        this.out = replies;
    }

    /** Start handling squaring requests. */
    public void start() {
        new Thread(new Runnable() {
            public void run() {
                while (true) {
                    // TODO: we may want a way to stop the thread
                    try {
                        // block until a request arrives
                        int x = in.take();
                        // compute the answer and send it back
                        int y = x * x;
                        out.put(new SquareResult(x, y));
                    } catch (InterruptedException ie) {
                        ie.printStackTrace();
                    }
                }
            }
        }).start();
    }
}

Incoming messages to the Squarer are integers; the squarer knows
that its job is to square those numbers, so no further details are
required.

Outgoing messages are instances of SquareResult :

SquareQueue.java line 48

/** An immutable squaring result message. */
public class SquareResult {
    private final int input;
    private final int output;

    /** Make a new result message.
     *  @param input input number
     *  @param output square of input */
    public SquareResult(int input, int output) {
        this.input = input;
        this.output = output;
    }

    @Override public String toString() {
        return input + "^2 = " + output;
    }
}

We would probably add additional observers to SquareResult so clients
can retrieve the input number and output result.

Finally, here’s a main method that uses the squarer:

SquareQueue.java line 77

public static void main(String[] args) {

    BlockingQueue<Integer> requests = new LinkedBlockingQueue<>();
    BlockingQueue<SquareResult> replies = new LinkedBlockingQueue<>();

    Squarer squarer = new Squarer(requests, replies);
    squarer.start();

    try {
        // make a request
        requests.put(42);
        // ... maybe do something concurrently ...
        // read the reply
        System.out.println(replies.take());
    } catch (InterruptedException ie) {
        ie.printStackTrace();
    }
}

It should not surprise us that this code has a very similar flavor to the
code for implementing message passing with sockets.

reading exercises

Rep invariant

Write the rep invariant of SquareResult, as an expression that could be
used in checkRep() below. Use the minimum number of characters you
can without any method calls in your answer.
private void checkRep() {
assert REP_INVARIANT;
}


Code review

The code above undergoes a code review and produces the following
comments. Evaluate the comments.

“The Squarer constructor shouldn’t be putting references to the two
queues directly into its rep; it should make defensive copies.”
True
False


“ Squarer.start() has an infinite loop in it, so the thread will never stop
until the whole process is stopped.”
True
False


“Squarer can have only one client using it, because if multiple clients
put requests in its input queue, their results will get mixed up in the
result queue.”
True
False

Stopping
What if we want to shut down the Squarer so it is no longer waiting for
new inputs? In the client/server model, if we want the client or server
to stop listening for our messages, we close the socket. And if we
want the client or server to stop altogether, we can quit that process.
But here, the squarer is just another thread in the same process, and
we can’t “close” a queue.

One strategy is a poison pill: a special message on the queue that
signals the consumer of that message to end its work. To shut down
the squarer, since its input messages are merely integers, we would
have to choose a magic poison integer (everyone knows the square
of 0 is 0, right? no one will need to ask for the square of 0…) or use
null (don’t use null). Instead, we might change the type of elements on
the requests queue to an ADT:

SquareRequest = IntegerRequest + StopRequest

with operations:

input : SquareRequest → int


shouldStop : SquareRequest → boolean

and when we want to stop the squarer, we enqueue a StopRequest
where shouldStop returns true.

For example, in Squarer.start() :


public void run() {
    while (true) {
        try {
            // block until a request arrives
            SquareRequest req = in.take();
            // see if we should stop
            if (req.shouldStop()) { break; }
            // compute the answer and send it back
            int x = req.input();
            int y = x * x;
            out.put(new SquareResult(x, y));
        } catch (InterruptedException ie) {
            ie.printStackTrace();
        }
    }
}

It is also possible to interrupt a thread by calling its interrupt()
method. If the thread is blocked waiting, the method it’s blocked in
will throw an InterruptedException (that’s why we have to try-catch that
exception almost any time we call a blocking method). If the thread
was not blocked, an interrupted flag will be set. The thread must
check for this flag to see whether it should stop working. For
example:

public void run() {
    // handle requests until we are interrupted
    while ( ! Thread.interrupted()) {
        try {
            // block until a request arrives
            int x = in.take();
            // compute the answer and send it back
            int y = x * x;
            out.put(new SquareResult(x, y));
        } catch (InterruptedException ie) {
            // stop
            break;
        }
    }
}
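
To see the interrupt approach from the client's side, here is a small self-contained sketch. It is a simplification, not the course's Squarer: the class InterruptDemo and the method runAndInterrupt are hypothetical names, and replies are plain integers rather than SquareResult messages. The client sends one request, reads the reply, interrupts the worker, and then waits for it to finish with join():

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class InterruptDemo {
    /**
     * Starts a squaring worker, sends it one request, then interrupts it.
     * @return true if the reply was correct and the worker thread has terminated
     */
    static boolean runAndInterrupt() throws InterruptedException {
        BlockingQueue<Integer> in = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> out = new LinkedBlockingQueue<>();

        Thread worker = new Thread(() -> {
            // handle requests until we are interrupted
            while (!Thread.interrupted()) {
                try {
                    int x = in.take();   // blocks; throws if interrupted while waiting
                    out.put(x * x);
                } catch (InterruptedException ie) {
                    break;               // stop
                }
            }
        });
        worker.start();

        in.put(7);
        int reply = out.take();

        worker.interrupt();  // either wakes the blocked take() or sets the flag
        worker.join();       // wait for the worker to exit its loop
        return reply == 49 && !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runAndInterrupt());
    }
}
```

Whichever state the worker is in when interrupt() is called, it leaves the loop: a blocked take() throws InterruptedException, and otherwise the loop condition (or the next blocking call) observes the interrupted flag.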
reading exercises

Implementing poison pills

Using the data type definition above:

SquareRequest = IntegerRequest + StopRequest

For each option below: is the snippet of code a correct outline for
how you would implement this in Java that takes maximum advantage
of static checking?

interface SquareRequest { ... }


class IntegerRequest implements SquareRequest { ... }
class StopRequest implements SquareRequest { ... }

Yes
No


class SquareRequest { ... }


class IntegerRequest { ... }
class StopRequest { ... }

Yes
No


class SquareRequest {
private final String requestType;
public static final String INTEGER_REQUEST = "integer";
public static final String STOP_REQUEST = "stop";
...
}
Yes
No


Thread safety arguments with message passing


A thread safety argument with message passing might rely on:

Existing threadsafe data types for the synchronized queue. This
queue is definitely shared and definitely mutable, so we must
ensure it is safe for concurrency.

Immutability of messages or data that might be accessible to
multiple threads at the same time.

Confinement of data to individual producer/consumer threads.
Local variables used by one producer or consumer are not visible
to other threads, which only communicate with one another using
messages in the queue.

Confinement of mutable messages or data that are sent over the
queue but will only be accessible to one thread at a time. This
argument must be carefully articulated and implemented. But if
one module drops all references to some mutable data like a hot
potato as soon as it puts them onto a queue to be delivered to
another thread, only one thread will have access to those data at
a time, precluding concurrent access.

In comparison to synchronization, message passing can make it
easier for each module in a concurrent system to maintain its own
thread safety invariants. We don’t have to reason about multiple
threads accessing shared data if the data are instead transferred
between modules using a threadsafe communication channel.

reading exercises

Message passing

Leif Noad just started a new job working for a stock trading company:

public interface Trade {
    public int numShares();
    public String stockName();
}

public class TradeWorker implements Runnable {
    private final Queue<Trade> tradesQueue;

    public TradeWorker(Queue<Trade> tradesQueue) {
        this.tradesQueue = tradesQueue;
    }

    public void run() {
        while (true) {
            Trade trade = tradesQueue.poll();
            TradeProcessor.handleTrade(trade.numShares(), trade.stockName());
        }
    }
}

public class TradeProcessor {
    public static void handleTrade(int numShares, String stockName) {
        /* ... process the trade ... takes a while ... */
    }
}

What are TradeWorkers?


separate threads
objects that can be passed to the thread constructor
producers in the producer/consumer pattern
consumers in the producer/consumer pattern


Mistakes were made

Suppose we have several TradeWorkers processing trades off the same shared
queue.

Notice that we are not using BlockingQueue! Workers call poll to retrieve
items from the queue.

Javadoc for Queue.poll()

Which of the following can happen?

trades are not processed by the TradeProcessor in the same order they were in on the queue

a single trade can be processed multiple times
we can crash with a NullPointerException

Summary
Rather than synchronize with locks, message passing systems
synchronize on a shared communication channel, e.g. a stream or
a queue.

Threads communicating with blocking queues is a useful pattern
for message passing within a single process.

Reading 23: Locks and Synchronization
Software in 6.005

Safe from bugs: Correct today and correct in the unknown future.
Easy to understand: Communicating clearly with future programmers, including future you.
Ready for change: Designed to accommodate change without rewriting.

Objectives

Understand how a lock is used to protect shared mutable data


Be able to recognize deadlock and know strategies to prevent it
Know the monitor pattern and be able to apply it to a data type

Introduction
Earlier, we defined thread safety for a data type or a function as behaving correctly when used from
multiple threads, regardless of how those threads are executed, without additional coordination.

Here’s the general principle: the correctness of a concurrent program should not depend on
accidents of timing.

To achieve that correctness, we enumerated four strategies for making code safe for concurrency:

1. Confinement: don’t share data between threads.


2. Immutability: make the shared data immutable.
3. Use existing threadsafe data types: use a data type that does the coordination for you.
4. Synchronization: prevent threads from accessing the shared data at the same time. This is what we
use to implement a threadsafe type, but we didn’t discuss it at the time.

We talked about strategies 1-3 earlier. In this reading, we’ll finish talking about strategy 4, using
synchronization to implement your own data type that is safe for shared-memory concurrency.

Synchronization
The correctness of a concurrent program should not depend on accidents of timing.

Since race conditions caused by concurrent manipulation of shared mutable data are disastrous bugs —
hard to discover, hard to reproduce, hard to debug — we need a way for concurrent modules that share
memory to synchronize with each other.

Locks are one synchronization technique. A lock is an abstraction that allows at most one thread to own
it at a time. Holding a lock is how one thread tells other threads: “I’m working with this thing, don’t touch
it right now.”

Locks have two operations:


acquire allows a thread to take ownership of a lock. If a thread tries to acquire a lock currently owned
by another thread, it blocks until the other thread releases the lock. At that point, it will contend with
any other threads that are trying to acquire the lock. At most one thread can own the lock at a time.

release relinquishes ownership of the lock, allowing another thread to take ownership of it.

Using a lock also tells the compiler and processor that you’re using shared memory concurrently, so that
registers and caches will be flushed out to shared storage. This avoids the problem of reordering,
ensuring that the owner of a lock is always looking at up-to-date data.

Bank account example

Our first example of shared memory concurrency was a bank with cash machines. The diagram from that
example is on the right.

The bank has several cash machines, all of which can read and write the same account objects in
memory.

Of course, without any coordination between concurrent reads and writes to the account balances, things
went horribly wrong.

To solve this problem with locks, we can add a lock that protects each bank account. Now, before they
can access or update an account balance, cash machines must first acquire the lock on that account.

In the diagram to the right, both A and B are trying to access account 1. Suppose B acquires the lock
first. Then A must wait to read and write the balance until B finishes and releases the lock. This ensures
that A and B are synchronized, but another cash machine C is able to run independently on a different
account (because that account is protected by a different lock).
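
As a hedged illustration of one lock per account, the sketch below uses java.util.concurrent.locks.ReentrantLock, whose lock() and unlock() methods play the roles of acquire and release. This is our own substitution for the sake of showing the two operations explicitly; the class names LockedAccount and CashMachineDemo are hypothetical and are not the bank example's code:

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class LockedAccount {
    private final Lock lock = new ReentrantLock(); // one lock per account
    private int balance;

    void deposit(int amount) {
        lock.lock();            // acquire: blocks while another thread owns the lock
        try {
            balance = balance + amount;
        } finally {
            lock.unlock();      // release, even if the body throws
        }
    }

    int balance() {
        lock.lock();            // reads are guarded too
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }
}

class CashMachineDemo {
    /** Runs two cash machines that each deposit $1 into the same account n times. */
    static int runTwoMachines(int n) throws InterruptedException {
        LockedAccount account = new LockedAccount();
        Runnable machine = () -> {
            for (int i = 0; i < n; i++) {
                account.deposit(1);
            }
        };
        Thread a = new Thread(machine);
        Thread b = new Thread(machine);
        a.start();
        b.start();
        a.join();
        b.join();
        return account.balance();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTwoMachines(10_000)); // 20000: no deposits are lost
    }
}
```

Because each LockedAccount owns its own lock, machines working on different accounts never wait for each other.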

Deadlock
When used properly and carefully, locks can prevent race conditions. But then another problem rears its
ugly head. Because the use of locks requires threads to wait ( acquire blocks when another thread is
holding the lock), it’s possible to get into a situation where two threads are waiting for each other —
and hence neither can make progress.

In the figure to the right, suppose A and B are making simultaneous transfers between two accounts in
our bank.

A transfer between accounts needs to lock both accounts, so that money can’t disappear from the
system. A and B each acquire the lock on their respective “from” account: A acquires the lock on account
1, and B acquires the lock on account 2. Now, each must acquire the lock on their “to” account: so A is
waiting for B to release the account 2 lock, and B is waiting for A to release the account 1 lock.
Stalemate! A and B are frozen in a “deadly embrace,” and accounts are locked up.

Deadlock occurs when concurrent modules are stuck waiting for each other to do something. A deadlock
may involve more than two modules: the signal feature of deadlock is a cycle of dependencies, e.g. A is
waiting for B which is waiting for C which is waiting for A. None of them can make progress.

You can also have deadlock without using any locks. For example, a message-passing system can
experience deadlock when message buffers fill up. If a client fills up the server’s buffer with requests, and
then blocks waiting to add another request, the server may then fill up the client’s buffer with results and
then block itself. So the client is waiting for the server, and the server waiting for the client, and neither
can make progress until the other one does. Again, deadlock ensues.
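
One common way to rule out such a cycle between two accounts is to have every thread acquire the locks in the same global order. The sketch below illustrates that idea under our own assumptions: OrderedAccount, its id field, and transfer are hypothetical names, each account is assumed to have a unique id, and ReentrantLock is used so that acquire and release are explicit:

```java
import java.util.concurrent.locks.ReentrantLock;

class OrderedAccount {
    final int id;                                   // hypothetical unique account number
    final ReentrantLock lock = new ReentrantLock();
    int balance;

    OrderedAccount(int id, int balance) {
        this.id = id;
        this.balance = balance;
    }
}

class TransferDemo {
    /** Moves amount between accounts, always locking the lower-numbered account first. */
    static void transfer(OrderedAccount from, OrderedAccount to, int amount) {
        OrderedAccount first = from.id < to.id ? from : to;
        OrderedAccount second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    /** Runs n transfers in each direction at the same time and returns both balances. */
    static int[] runOpposingTransfers(int n) throws InterruptedException {
        OrderedAccount one = new OrderedAccount(1, 1000);
        OrderedAccount two = new OrderedAccount(2, 1000);
        Thread a = new Thread(() -> {
            for (int i = 0; i < n; i++) transfer(one, two, 1);
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < n; i++) transfer(two, one, 1);
        });
        a.start();
        b.start();
        a.join();
        b.join();
        return new int[] { one.balance, two.balance };
    }
}
```

Both threads lock account 1 before account 2 regardless of the direction of the transfer, so neither can hold one lock while waiting for a lock the other already holds.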

In the Java Tutorials, read:

Deadlock (1 page)

Developing a threadsafe abstract data type


Let’s see how to use synchronization to implement a threadsafe ADT.

You can see all the code for this example on GitHub: edit buffer example. You are not expected to read
and understand all the code. All the relevant parts are excerpted below.

Suppose we’re building a multi-user editor, like Google Docs, that allows multiple people to connect to it
and edit it at the same time. We’ll need a mutable datatype to represent the text in the document. Here’s
the interface; basically it represents a string with insert and delete operations:
EditBuffer.java

/** An EditBuffer represents a threadsafe mutable
 *  string of characters in a text editor. */
public interface EditBuffer {
    /**
     * Modifies this by inserting a string.
     * @param pos position to insert at
     *            (requires 0 <= pos <= current buffer length)
     * @param ins string to insert
     */
    public void insert(int pos, String ins);

    /**
     * Modifies this by deleting a substring
     * @param pos starting position of substring to delete
     *            (requires 0 <= pos <= current buffer length)
     * @param len length of substring to delete
     *            (requires 0 <= len <= current buffer length - pos)
     */
    public void delete(int pos, int len);

    /**
     * @return length of text sequence in this edit buffer
     */
    public int length();

    /**
     * @return content of this edit buffer
     */
    public String toString();
}

A very simple rep for this datatype would just be a string:

SimpleBuffer.java

public class SimpleBuffer implements EditBuffer {
    private String text;
    // Rep invariant:
    //   text != null
    // Abstraction function:
    //   represents the sequence text[0],...,text[text.length()-1]

The downside of this rep is that every time we do an insert or delete, we have to copy the entire string
into a new string. That gets expensive. Another rep we could use would be a character array, with space
at the end. That’s fine if the user is just typing new text at the end of the document (we don’t have to
copy anything), but if the user is typing at the beginning of the document, then we’re copying the entire
document with every keystroke.

A more interesting rep, which is used by many text editors in practice, is called a gap buffer. It’s basically
a character array with extra space in it, but instead of having all the extra space at the end, the extra
space is a gap that can appear anywhere in the buffer. Whenever an insert or delete operation needs to
be done, the datatype first moves the gap to the location of the operation, and then does the insert or
delete. If the gap is already there, then nothing needs to be copied — an insert just consumes part of the
gap, and a delete just enlarges the gap! Gap buffers are particularly well-suited to representing a string
that is being edited by a user with a cursor, since inserts and deletes tend to be focused around the
cursor, so the gap rarely moves.
GapBuffer.java

/** GapBuffer is a non-threadsafe EditBuffer that is optimized
 *  for editing with a cursor, which tends to make a sequence of
 *  inserts and deletes at the same place in the buffer. */
public class GapBuffer implements EditBuffer {
    private char[] a;
    private int gapStart;
    private int gapLength;
    // Rep invariant:
    //   a != null
    //   0 <= gapStart <= a.length
    //   0 <= gapLength <= a.length - gapStart
    // Abstraction function:
    //   represents the sequence a[0],...,a[gapStart-1],
    //                           a[gapStart+gapLength],...,a[a.length-1]

In a multiuser scenario, we’d want multiple gaps, one for each user’s cursor, but we’ll use a single gap for
now.
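
To make the rep concrete, here is a minimal sketch of how the rep invariant and abstraction function above might translate into code. It is not the GapBuffer from the provided code: the class name GapBufferSketch, its constructor, and insertAtGap are hypothetical, and the sketch never moves or grows the gap.

```java
/** Single-threaded sketch of a gap buffer rep (not the course's GapBuffer). */
class GapBufferSketch {
    private final char[] a;
    private int gapStart;
    private int gapLength;

    /** Makes a buffer representing before + after, with a gap of gapSize between them. */
    GapBufferSketch(String before, int gapSize, String after) {
        a = new char[before.length() + gapSize + after.length()];
        before.getChars(0, before.length(), a, 0);
        after.getChars(0, after.length(), a, before.length() + gapSize);
        gapStart = before.length();
        gapLength = gapSize;
        checkRep();
    }

    /** Asserts the rep invariant (only checked when assertions are enabled with -ea). */
    private void checkRep() {
        assert a != null;
        assert 0 <= gapStart && gapStart <= a.length;
        assert 0 <= gapLength && gapLength <= a.length - gapStart;
    }

    /** Inserts at the gap: no copying of existing text, the gap just shrinks. */
    void insertAtGap(String ins) {
        if (ins.length() > gapLength) {
            throw new IllegalArgumentException("this sketch does not grow the array");
        }
        ins.getChars(0, ins.length(), a, gapStart);
        gapStart += ins.length();
        gapLength -= ins.length();
        checkRep();
    }

    /** @return number of characters represented, which excludes the gap */
    int length() {
        return a.length - gapLength;
    }

    /** Applies the abstraction function: the text before the gap, then the text after it. */
    @Override public String toString() {
        return new String(a, 0, gapStart)
             + new String(a, gapStart + gapLength, a.length - gapStart - gapLength);
    }
}
```

For instance, a buffer built from "hel", a gap of 4, and "lo" represents "hello"; calling insertAtGap("p!") then yields "help!lo" without moving any of the existing characters.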

Steps to develop the datatype

Recall our recipe for designing and implementing an ADT:

1. Specify. Define the operations (method signatures and specs). We did that in the EditBuffer interface.

2. Test. Develop test cases for the operations. See EditBufferTest in the provided code. The test suite
includes a testing strategy based on partitioning the parameter space of the operations.

3. Rep. Choose a rep. We chose two of them for EditBuffer , and this is often a good idea:

1. Implement a simple, brute-force rep first. It’s easier to write, you’re more likely to get it right,
and it will validate your test cases and your specification so you can fix problems in them before
you move on to the harder implementation. This is why we implemented SimpleBuffer before moving
on to GapBuffer . Don’t throw away your simple version, either — keep it around so that you have
something to test and compare against in case things go wrong with the more complex one.

2. Write down the rep invariant and abstraction function, and implement checkRep(). checkRep()
asserts the rep invariant at the end of every constructor, producer, and mutator method. (It’s
typically not necessary to call it at the end of an observer, since the rep hasn’t changed.) In fact,
assertions can be very useful for testing complex implementations, so it’s not a bad idea to also
assert the postcondition at the end of a complex method. You’ll see an example of this in
GapBuffer.moveGap() in the code with this reading.

In all these steps, we’re working entirely single-threaded at first. Multithreaded clients should be in the
back of our minds at all times while we’re writing specs and choosing reps (we’ll see later that careful
choice of operations may be necessary to avoid race conditions in the clients of your datatype). But get it
working, and thoroughly tested, in a sequential, single-threaded environment first.

Now we’re ready for the next step:


4. Synchronize. Make an argument that your rep is threadsafe. Write it down explicitly as a comment in
your class, right by the rep invariant, so that a maintainer knows how you designed thread safety into
the class.

This part of the reading is about how to do step 4. We already saw how to make a thread safety
argument, but this time, we’ll rely on synchronization in that argument.

And then the extra step we hinted at above:

5. Iterate. You may find that your choice of operations makes it hard to write a threadsafe type with the
guarantees clients require. You might discover this in step 1, or in step 2 when you write tests, or in
steps 3 or 4 when you implement. If that’s the case, go back and refine the set of operations your
ADT provides.

Locking
Locks are so commonly used that Java provides them as a built-in language feature.

In Java, every object has a lock implicitly associated with it — a String , an array, an ArrayList , and every
class you create, all of their object instances have a lock. Even a humble Object has a lock, so bare
Object s are often used for explicit locking:

Object lock = new Object();

You can’t call acquire and release on Java’s intrinsic locks, however. Instead you use the synchronized
statement to acquire the lock for the duration of a statement block:

synchronized (lock) { // thread blocks here until lock is free
    // now this thread has the lock
    balance = balance + 1;
    // exiting the block releases the lock
}

Synchronized regions like this provide mutual exclusion: only one thread at a time can be in a
synchronized region guarded by a given object’s lock. In other words, you are back in sequential
programming world, with only one thread running at a time, at least with respect to other synchronized
regions that refer to the same object.
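
Mutual exclusion can be observed directly with a small experiment. The sketch below is illustrative only (MutualExclusionDemo and its methods are hypothetical names): every thread counts itself in on entering the synchronized region and out on leaving, and the class records the largest number of threads ever seen inside at once.

```java
import java.util.concurrent.atomic.AtomicInteger;

class MutualExclusionDemo {
    private final Object lock = new Object();
    private final AtomicInteger inside = new AtomicInteger(0);    // threads currently in the region
    private final AtomicInteger maxInside = new AtomicInteger(0); // largest value of inside seen

    void enterRegion() {
        synchronized (lock) {
            int now = inside.incrementAndGet();
            maxInside.accumulateAndGet(now, Math::max);
            inside.decrementAndGet();
        }
    }

    /** @return the largest number of threads observed inside the region at the same time */
    static int maxThreadsInRegion(int threads, int iterations) throws InterruptedException {
        MutualExclusionDemo demo = new MutualExclusionDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    demo.enterRegion();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return demo.maxInside.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(maxThreadsInRegion(4, 1000)); // 1
    }
}
```

Because every entry goes through synchronized (lock), the recorded maximum is 1 no matter how the threads are scheduled.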

Locks guard access to data

Locks are used to guard a shared data variable, like the account balance shown here. If all accesses to
a data variable are guarded (surrounded by a synchronized block) by the same lock object, then those
accesses will be guaranteed to be atomic — uninterrupted by other threads.

Because every object in Java has a lock implicitly associated with it, you might think that simply owning
an object’s lock would prevent other threads from accessing that object. That is not the case. Acquiring
the lock associated with object obj using

synchronized (obj) { ... }


in thread t does one thing and one thing only: prevents other threads from entering a synchronized(obj)
block, until thread t finishes its synchronized block. That’s it.

Locks only provide mutual exclusion with other threads that acquire the same lock. All accesses to a data
variable must be guarded by the same lock. You might guard an entire collection of variables behind a
single lock, but all modules must agree on which lock they will all acquire and release.
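
The following sketch shows, under a deliberately artificial setup, that holding an object's lock does not stop code that never asks for that lock. The class and method names are hypothetical. The main thread holds the lock on a list while a second thread calls add() on it without any synchronized block:

```java
import java.util.ArrayList;
import java.util.List;

class UnguardedAccessDemo {
    /** @return the size of the list after another thread adds to it while we hold its lock */
    static int addWhileLocked() throws InterruptedException {
        List<String> list = new ArrayList<>();
        synchronized (list) {
            // This thread owns list's lock, but the other thread never tries to acquire it.
            Thread other = new Thread(() -> list.add("added without the lock"));
            other.start();
            other.join();   // returns normally: the unguarded add() was not blocked
        }
        return list.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(addWhileLocked()); // 1
    }
}
```

If the second thread had wrapped its add() in synchronized (list), the join() inside the block would never return, since each thread would be waiting for the other. That is the sense in which all modules must agree on which lock guards the data.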

Monitor pattern
When you are writing methods of a class, the most convenient lock is the object instance itself, i.e. this .

As a simple approach, we can guard the entire rep of a class by wrapping all accesses to the rep inside
synchronized (this) .

/** SimpleBuffer is a threadsafe EditBuffer with a simple rep. */
public class SimpleBuffer implements EditBuffer {
    private String text;
    ...
    public SimpleBuffer() {
        synchronized (this) {
            text = "";
            checkRep();
        }
    }
    public void insert(int pos, String ins) {
        synchronized (this) {
            text = text.substring(0, pos) + ins + text.substring(pos);
            checkRep();
        }
    }
    public void delete(int pos, int len) {
        synchronized (this) {
            text = text.substring(0, pos) + text.substring(pos+len);
            checkRep();
        }
    }
    public int length() {
        synchronized (this) {
            return text.length();
        }
    }
    public String toString() {
        synchronized (this) {
            return text;
        }
    }
}

Note the very careful discipline here. Every method that touches the rep must be guarded with the lock —
even apparently small and trivial ones like length() and toString() . This is because reads must be guarded
as well as writes — if reads are left unguarded, then they may be able to see the rep in a partially-
modified state.

This approach is called the monitor pattern. A monitor is a class whose methods are mutually exclusive,
so that only one thread can be inside an instance of the class at a time.

Java provides some syntactic sugar for the monitor pattern. If you add the keyword synchronized to a
method signature, then Java will act as if you wrote synchronized (this) around the method body. So the
code below is an equivalent way to implement the synchronized SimpleBuffer :

/** SimpleBuffer is a threadsafe EditBuffer with a simple rep. */
public class SimpleBuffer implements EditBuffer {
    private String text;
    ...
    public SimpleBuffer() {
        text = "";
        checkRep();
    }
    public synchronized void insert(int pos, String ins) {
        text = text.substring(0, pos) + ins + text.substring(pos);
        checkRep();
    }
    public synchronized void delete(int pos, int len) {
        text = text.substring(0, pos) + text.substring(pos+len);
        checkRep();
    }
    public synchronized int length() {
        return text.length();
    }
    public synchronized String toString() {
        return text;
    }
}

Notice that the SimpleBuffer constructor doesn’t have a synchronized keyword. Java actually forbids it,
syntactically, because an object under construction is expected to be confined to a single thread until it
has returned from its constructor. So synchronizing constructors should be unnecessary.

In the Java Tutorials, read:

Synchronized Methods (1 page)


Intrinsic Locks and Synchronization (1 page)

reading exercises

Synchronizing with locks

If thread B tries to acquire a lock currently held by thread A:

What happens to thread A?

blocks until B acquires the lock


blocks until B releases the lock
nothing

What happens to thread B?

blocks until A acquires the lock


blocks until A releases the lock
nothing


This list is mine, all mine

Suppose list is an instance of ArrayList<String> .


What is true while A is in a synchronized (list) { ... } block?

it owns the lock on list


it does not own the lock on list
no other thread can use observers of list

no other thread can use mutators of list

no other thread can acquire the lock on list


no other thread can acquire locks on elements in list


OK fine but this synchronized List is totally mine

Suppose sharedList is a List returned by Collections.synchronizedList .

It is now safe to use sharedList from multiple threads without acquiring any locks… except! Which of the
following would require a synchronized(sharedList) { ... } block?

call isEmpty

call add

iterate over the list


call isEmpty , if it returns false, call remove(0)


I heard you like locks so I acquired your lock so you can lock while you acquire

Suppose we run this code:

synchronized (obj) {
    // ...
    synchronized (obj) { // <-- uh oh, deadlock?
        // ...
    }
    // <-- do we own the lock on obj?
}

On the line “uh oh, deadlock?”, do we experience deadlock?

yes
no

If we don’t deadlock, on the line “do we own the lock on obj”, does the thread own the lock on obj?

yes
no
we deadlocked


Thread safety argument with synchronization


Now that we’re protecting SimpleBuffer ’s rep with a lock, we can write a better thread safety argument:

/** SimpleBuffer is a threadsafe EditBuffer with a simple rep. */
public class SimpleBuffer implements EditBuffer {
    private String text;
    // Rep invariant:
    //   text != null
    // Abstraction function:
    //   represents the sequence text[0],...,text[text.length()-1]
    // Thread safety argument:
    //   all accesses to text happen within SimpleBuffer methods,
    //   which are all guarded by SimpleBuffer's lock

The same argument works for GapBuffer , if we use the monitor pattern to synchronize all its methods.

Note that the encapsulation of the class, the absence of rep exposure, is very important for making this
argument. If text were public:

public String text;

then clients outside SimpleBuffer would be able to read and write it without knowing that they should first
acquire the lock, and SimpleBuffer would no longer be threadsafe.

Locking discipline

A locking discipline is a strategy for ensuring that synchronized code is threadsafe. We must satisfy two
conditions:

1. Every shared mutable variable must be guarded by some lock. The data may not be read or written
except inside a synchronized block that acquires that lock.

2. If an invariant involves multiple shared mutable variables (which might even be in different objects),
then all the variables involved must be guarded by the same lock. Once a thread acquires the lock,
the invariant must be reestablished before releasing the lock.

The monitor pattern as used here satisfies both rules. All the shared mutable data in the rep — which the
rep invariant depends on — are guarded by the same lock.

Atomic operations
Consider a find-and-replace operation on the EditBuffer datatype:

/** Modifies buf by replacing the first occurrence of s with t.
 *  If s not found in buf, then has no effect.
 *  @return true if and only if a replacement was made
 */
public static boolean findReplace(EditBuffer buf, String s, String t) {
    int i = buf.toString().indexOf(s);
    if (i == -1) {
        return false;
    }
    buf.delete(i, s.length());
    buf.insert(i, t);
    return true;
}

This method makes three different calls to buf — to convert it to a string in order to search for s , to
delete the old text, and then to insert t in its place. Even though each of these calls individually is atomic,
the findReplace method as a whole is not threadsafe, because other threads might mutate the buffer while
findReplace is working, causing it to delete the wrong region or put the replacement back in the wrong
place.

To prevent this, findReplace needs to synchronize with all other clients of buf .

Giving clients access to a lock

It’s sometimes useful to make your datatype’s lock available to clients, so that they can use it to
implement higher-level atomic operations using your datatype.

So one approach to the problem with findReplace is to document that clients can use the EditBuffer ’s lock
to synchronize with each other:

/** An EditBuffer represents a threadsafe mutable string of characters
 *  in a text editor. Clients may synchronize with each other using the
 *  EditBuffer object itself. */
public interface EditBuffer {
    ...
}

And then findReplace can synchronize on buf :

public static boolean findReplace(EditBuffer buf, String s, String t) {
    synchronized (buf) {
        int i = buf.toString().indexOf(s);
        if (i == -1) {
            return false;
        }
        buf.delete(i, s.length());
        buf.insert(i, t);
        return true;
    }
}

The effect of this is to enlarge the synchronization region that the monitor pattern already put around the
individual toString , delete , and insert methods, into a single atomic region that ensures that all three
methods are executed without interference from other threads.

Sprinkling synchronized everywhere?

So is thread safety simply a matter of putting the synchronized keyword on every method in your program?
Unfortunately not.

First, you actually don’t want to synchronize methods willy-nilly. Synchronization imposes a large cost on
your program. Making a synchronized method call may take significantly longer, because of the need to
acquire a lock (and flush caches and communicate with other processors). Java leaves many of its
mutable datatypes unsynchronized by default exactly for these performance reasons. When you don’t
need synchronization, don’t use it.

Another argument for using synchronized in a more deliberate way is that it minimizes the scope of access
to your lock. Adding synchronized to every method means that your lock is the object itself, and every client
with a reference to your object automatically has a reference to your lock, that it can acquire and release
at will. Your thread safety mechanism is therefore public and can be interfered with by clients. Contrast
that with using a lock that is an object internal to your rep, and acquired appropriately and sparingly using
synchronized() blocks.
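
A minimal sketch of that alternative, using a counter rather than an edit buffer to keep it short, might look like this. The names PrivateLockCounter and PrivateLockDemo are hypothetical. The lock is a private object in the rep, so a client that synchronizes on the counter itself is holding a different lock and cannot interfere:

```java
class PrivateLockCounter {
    private final Object lock = new Object(); // internal to the rep, never exposed to clients
    private int count = 0;

    public void increment() {
        synchronized (lock) {
            count++;
        }
    }

    public int get() {
        synchronized (lock) {
            return count;
        }
    }
}

class PrivateLockDemo {
    /** A client holds the counter object's own lock while another thread increments it. */
    static int incrementWhileClientHoldsObjectLock() throws InterruptedException {
        PrivateLockCounter counter = new PrivateLockCounter();
        synchronized (counter) {     // not the lock that guards count
            Thread t = new Thread(counter::increment);
            t.start();
            t.join();                // completes: increment() only needs the private lock
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(incrementWhileClientHoldsObjectLock()); // 1
    }
}
```

The tradeoff is that clients can no longer use the object's lock to build larger atomic operations, as findReplace did with synchronized (buf).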

Finally, it’s not actually sufficient to sprinkle synchronized everywhere. Dropping synchronized onto a method
without thinking means that you’re acquiring a lock without thinking about which lock it is, or about
whether it’s the right lock for guarding the shared data access you’re about to do. Suppose we had tried
to solve findReplace ’s synchronization problem simply by dropping synchronized onto its declaration:

public static synchronized boolean findReplace(EditBuffer buf, ...) {

This wouldn’t do what we want. It would indeed acquire a lock — because findReplace is a static method,
it would acquire a static lock for the whole class that findReplace happens to be in, rather than an instance
object lock. As a result, only one thread could call findReplace at a time — even if other threads want to
operate on different buffers, which should be safe, they’d still be blocked until the single lock was free.
So we’d suffer a significant loss in performance, because only one user of our massive multiuser editor
would be allowed to do a find-and-replace at a time, even if they’re all editing different documents.

Worse, however, it wouldn’t provide useful protection, because other code that touches the document
probably wouldn’t be acquiring the same lock. It wouldn’t actually eliminate our race conditions.
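
The lock that a static synchronized method acquires can be checked with Thread.holdsLock, which reports whether the current thread owns a given object's lock. In the sketch below, Editor is a hypothetical class standing in for whatever class findReplace happens to live in:

```java
class Editor {
    /** @return whether the calling thread owns the lock on the Editor class object */
    static synchronized boolean holdsClassLock() {
        return Thread.holdsLock(Editor.class);
    }

    /** @return whether the calling thread owns the lock on buf while in a static synchronized method */
    static synchronized boolean holdsLockOn(Object buf) {
        return Thread.holdsLock(buf);
    }

    public static void main(String[] args) {
        System.out.println(holdsClassLock());            // true
        System.out.println(holdsLockOn(new Object()));   // false
    }
}
```

The method holds the lock of the class object Editor.class and nothing else, so the buffer passed in is left completely unguarded.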

The synchronized keyword is not a panacea. Thread safety requires a discipline — using confinement,
immutability, or locks to protect shared data. And that discipline needs to be written down, or maintainers
won’t know what it is.

Designing a datatype for concurrency


findReplace ’s problem can be interpreted another way: that the EditBuffer interface really isn’t that friendly
to multiple simultaneous clients. It relies on integer indexes to specify insert and delete locations, which
are extremely brittle to other mutations. If somebody else inserts or deletes before the index position,
then the index becomes invalid.

So if we’re designing a datatype specifically for use in a concurrent system, we need to think about
providing operations that have better-defined semantics when they are interleaved. For example, it might
be better to pair EditBuffer with a Position datatype representing a cursor position in the buffer, or even a
Selection datatype representing a selected range. Once obtained, a Position could hold its location in the
text against the wash of insertions and deletions around it, until the client was ready to use that Position .

If some other thread deleted all the text around the Position , then the Position would be able to inform a
subsequent client about what had happened (perhaps with an exception), and allow the client to decide
what to do. These kinds of considerations come into play when designing a datatype for concurrency.

As another example, consider the ConcurrentMap interface in Java. This interface extends the existing Map
interface, adding a few key methods that are commonly needed as atomic operations on a shared
mutable map, e.g.:

map.putIfAbsent(key, value) is an atomic version of

if (!map.containsKey(key)) map.put(key, value);

map.replace(key, value) is an atomic version of

if (map.containsKey(key)) map.put(key, value);
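
To see why clients cannot safely assemble these operations from containsKey and put, here is a minimal sketch in Python (the other language these readings use). The class LockedMap and its method names are hypothetical, invented for illustration; this is not how ConcurrentMap is implemented, and the return values are simplified compared with Java's. The only point is that the check and the update happen under one lock acquisition, so no other thread can run between them.

```python
import threading

class LockedMap:
    """Hypothetical map whose compound operations are atomic (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()   # guards _data
        self._data = {}

    def put_if_absent(self, key, value):
        """Store value only if key is missing; return the value now in the map."""
        with self._lock:
            # the check and the update form one atomic step
            if key not in self._data:
                self._data[key] = value
            return self._data[key]

    def replace(self, key, value):
        """Overwrite the value only if key is present; return True if replaced."""
        with self._lock:
            if key in self._data:
                self._data[key] = value
                return True
            return False

    def get(self, key):
        with self._lock:
            return self._data.get(key)
```

A client that called a separate "contains" method and then a separate "put" method would release the lock between the two calls, which is exactly the window in which another thread could change the map.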

Deadlock rears its ugly head


The locking approach to thread safety is powerful, but (unlike confinement and immutability) it introduces
blocking into the program. Threads must sometimes wait for other threads to get out of synchronized
regions before they can proceed. And blocking raises the possibility of deadlock — a very real risk, and
frankly far more common in this setting than in message passing with blocking I/O (where we first
mentioned it).

With locking, deadlock happens when threads acquire multiple locks at the same time, and two threads
end up blocked while holding locks that they are each waiting for the other to release. The monitor
pattern unfortunately makes this fairly easy to do. Here’s an example.

Suppose we’re modeling the social network of a series of books:

public class Wizard {
    private final String name;
    private final Set<Wizard> friends;
    // Rep invariant:
    //    name, friends != null
    //    friend links are bidirectional:
    //        for all f in friends, f.friends contains this
    // Concurrency argument:
    //    threadsafe by monitor pattern: all accesses to rep
    //    are guarded by this object's lock

    public Wizard(String name) {
        this.name = name;
        this.friends = new HashSet<Wizard>();
    }

    public synchronized boolean isFriendsWith(Wizard that) {
        return this.friends.contains(that);
    }

    public synchronized void friend(Wizard that) {
        if (friends.add(that)) {
            that.friend(this);
        }
    }

    public synchronized void defriend(Wizard that) {
        if (friends.remove(that)) {
            that.defriend(this);
        }
    }
}

Like Facebook, this social network is bidirectional: if x is friends with y, then y is friends with x. The
friend() and defriend() methods enforce that invariant by modifying the reps of both objects. Because
they use the monitor pattern, that means acquiring the locks on both objects as well.

Let’s create a couple of wizards:

Wizard harry = new Wizard("Harry Potter");
Wizard snape = new Wizard("Severus Snape");

And then think about what happens when two independent threads are repeatedly running:

// thread A                    // thread B
harry.friend(snape);           snape.friend(harry);
harry.defriend(snape);         snape.defriend(harry);

We will deadlock very rapidly. Here’s why. Suppose thread A is about to execute harry.friend(snape) , and
thread B is about to execute snape.friend(harry) .

Thread A acquires the lock on harry (because the friend method is synchronized).
Then thread B acquires the lock on snape (for the same reason).
They both update their individual reps independently, and then try to call friend() on the other object
— which requires them to acquire the lock on the other object.

So A is holding Harry and waiting for Snape, and B is holding Snape and waiting for Harry. Both threads
are stuck in friend() , so neither one will ever manage to exit the synchronized region and release the lock
to the other. This is a classic deadly embrace. The program simply stops.

The essence of the problem is acquiring multiple locks, and holding some of the locks while waiting for
another lock to become free.

Notice that it is possible for thread A and thread B to interleave such that deadlock does not occur:
perhaps thread A acquires and releases both locks before thread B has enough time to acquire the first
one. If the locks involved in a deadlock are also involved in a race condition — and very often they are —
then the deadlock will be just as difficult to reproduce or debug.
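
Because the bad interleaving is hard to hit on demand, a demonstration has to force it. The sketch below is a Python stand-in for the Java scenario, not a translation of the Wizard class: two plain locks play the roles of Harry's and Snape's locks, a barrier makes sure each thread holds its first lock before either tries for the second, and a timeout (an assumption of this sketch; a Java synchronized block has none) lets each thread report the stall instead of hanging forever. All names are hypothetical.

```python
import threading

harry_lock = threading.Lock()
snape_lock = threading.Lock()
both_hold_first = threading.Barrier(2)   # forces the unlucky interleaving
both_gave_up = threading.Barrier(2)      # keeps first locks held until both attempts end
got_second = {}                          # thread name -> did it get its second lock?

def worker(name, first, second):
    with first:
        both_hold_first.wait()
        # A synchronized block would wait here forever; the timeout
        # turns the deadlock into an observable failure.
        acquired = second.acquire(timeout=0.2)
        got_second[name] = acquired
        both_gave_up.wait()
        if acquired:
            second.release()

a = threading.Thread(target=worker, args=("A", harry_lock, snape_lock))
b = threading.Thread(target=worker, args=("B", snape_lock, harry_lock))
a.start()
b.start()
a.join()
b.join()
print(got_second)   # neither thread obtained its second lock
```

Each thread holds one lock while waiting for the other, so neither acquisition can succeed until somebody gives up.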

Deadlock solution 1: lock ordering

One way to prevent deadlock is to put an ordering on the locks that need to be acquired simultaneously,
and to ensure that all code acquires the locks in that order.

In our social network example, we might always acquire the locks on the Wizard objects in alphabetical
order by the wizard’s name. Since thread A and thread B are both going to need the locks for Harry and
Snape, they would both acquire them in that order: Harry’s lock first, then Snape’s. If thread A gets
Harry’s lock before B does, it will also get Snape’s lock before B does, because B can’t proceed until A
releases Harry’s lock again. The ordering on the locks forces an ordering on the threads acquiring them,
so there’s no way to produce a cycle in the waiting-for graph.

Here’s what the code might look like:

public void friend(Wizard that) {
    Wizard first, second;
    if (this.name.compareTo(that.name) < 0) {
        first = this; second = that;
    } else {
        first = that; second = this;
    }
    synchronized (first) {
        synchronized (second) {
            if (friends.add(that)) {
                that.friend(this);
            }
        }
    }
}

(Note that the decision to order the locks alphabetically by the person’s name would work fine for this
book, but it wouldn’t work in a real life social network. Why not? What would be better to use for lock
ordering than the name?)
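
The same discipline can be sketched in Python. This is an illustrative analogue, not the reading's Java code: each Wizard carries an explicit threading.RLock (reentrant, like a Java object lock), the helper _locks_in_order is a hypothetical name, and ordering by name assumes names are unique, as they are in the book.

```python
import threading

class Wizard:
    def __init__(self, name):
        self.name = name
        self.friends = set()
        self.lock = threading.RLock()   # reentrant, like a Java object lock

    def _locks_in_order(self, that):
        # Every thread sorts the same way: alphabetically by name
        # (assumes names are unique).
        first, second = sorted((self, that), key=lambda w: w.name)
        return first.lock, second.lock

    def friend(self, that):
        first, second = self._locks_in_order(that)
        with first, second:             # acquires first, then second
            self.friends.add(that)
            that.friends.add(self)

    def defriend(self, that):
        first, second = self._locks_in_order(that)
        with first, second:
            self.friends.discard(that)
            that.friends.discard(self)
```

Whichever wizard a thread starts from, it asks for the two locks in the same order, so no thread can hold the second lock while waiting for the first.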

Although lock ordering is useful (particularly in code like operating system kernels), it has a number of
drawbacks in practice.

First, it’s not modular — the code has to know about all the locks in the system, or at least in its
subsystem.
Second, it may be difficult or impossible for the code to know exactly which of those locks it will need
before it even acquires the first one. It may need to do some computation to figure it out. Think about
doing a depth-first search on the social network graph, for example — how would you know which
nodes need to be locked, before you’ve even started looking for them?

Deadlock solution 2: coarse-grained locking

A more common approach than lock ordering, particularly for application programming (as opposed to
operating system or device driver programming), is to use coarser locking — use a single lock to guard
many object instances, or even a whole subsystem of a program.

For example, we might have a single lock for an entire social network, and have all the operations on any
of its constituent parts synchronize on that lock. In the code below, all Wizard s belong to a Castle , and we
just use that Castle object’s lock to synchronize:

public class Wizard {
    private final Castle castle;
    private final String name;
    private final Set<Wizard> friends;
    ...
    public void friend(Wizard that) {
        synchronized (castle) {
            if (this.friends.add(that)) {
                that.friend(this);
            }
        }
    }
}

Coarse-grained locks can have a significant performance penalty. If you guard a large pile of mutable
data with a single lock, then you’re giving up the ability to access any of that data concurrently. In the
worst case, having a single lock protecting everything, your program might be essentially sequential —
only one thread is allowed to make progress at a time.
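
For comparison, here is a minimal Python sketch of the same coarse-grained idea. It is an analogue under stated assumptions, not the reading's Java: Castle is reduced to a holder for one reentrant lock, and every Wizard passed to friend() or defriend() is assumed to belong to the same Castle.

```python
import threading

class Castle:
    def __init__(self):
        self.lock = threading.RLock()   # the single lock for the whole network

class Wizard:
    def __init__(self, castle, name):
        self.castle = castle
        self.name = name
        self.friends = set()

    def friend(self, that):
        with self.castle.lock:          # one lock, so no ordering to get wrong
            if that not in self.friends:
                self.friends.add(that)
                that.friend(self)       # re-enters the same lock

    def defriend(self, that):
        with self.castle.lock:
            if that in self.friends:
                self.friends.remove(that)
                that.defriend(self)
```

With only one lock in play there is no second lock to wait for, which is why deadlock between wizards cannot arise; the price is that unrelated friendships are also serialized.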

reading exercises

Deadlock

In the code below three threads 1, 2, and 3 are trying to acquire locks on objects alpha , beta , and gamma .

Thread 1                        Thread 2                                    Thread 3

synchronized (alpha) {          synchronized (gamma) {                      synchronized (gamma) {
    // using alpha                  synchronized (alpha) {                      synchronized (alpha) {
    // ...                              synchronized (beta) {                       // using alpha & gamma
}                                           // using alpha, beta, & gamma           // ...
                                            // ...                              }
synchronized (gamma) {                  }                                   }
    synchronized (beta) {           }
        // using beta & gamma   }                                           synchronized (beta) {
        // ...                  // finished                                     synchronized (gamma) {
    }                                                                               // using beta & gamma
}                                                                                   // ...
// finished                                                                     }
                                                                            }
                                                                            // finished

This system is susceptible to deadlock.

For each of the scenarios below, determine whether the system is in deadlock if the threads are currently
on the indicated lines of code.

Scenario A

Thread 1 inside using alpha

Thread 2 blocked on synchronized (alpha)

Thread 3 finished

deadlock
not deadlock


Scenario B

Thread 1 finished
Thread 2 blocked on synchronized (beta)

Thread 3 blocked on 2nd synchronized (gamma)

deadlock
not deadlock


Scenario C

Thread 1 running synchronized (beta)

Thread 2 blocked on synchronized (gamma)

Thread 3 blocked on 1st synchronized (gamma)

deadlock
not deadlock


Scenario D

Thread 1 blocked on synchronized (beta)

Thread 2 finished
Thread 3 blocked on 2nd synchronized (gamma)

deadlock
not deadlock


Locked out

Examine the code again.

In the previous problem, we saw deadlocks involving beta and gamma .

What about alpha ?

there is a possible deadlock where thread 1 owns the lock on alpha


there is a possible deadlock where thread 2 owns the lock on alpha
there is a possible deadlock where thread 3 owns the lock on alpha
there are no deadlocks involving alpha


Goals of concurrent program design


Now is a good time to pop up a level and look at what we’re doing. Recall that our primary goals are to
create software that is safe from bugs, easy to understand, and ready for change.

Building concurrent software is clearly a challenge for all three of these goals. We can break the issues
into two general classes. When we ask whether a concurrent program is safe from bugs, we care about
two properties:

Safety. Does the concurrent program satisfy its invariants and its specifications? Races in accessing
mutable data threaten safety. Safety asks the question: can you prove that some bad thing never
happens?

Liveness. Does the program keep running and eventually do what you want, or does it get stuck
somewhere waiting forever for events that will never happen? Can you prove that some good thing
eventually happens?

Deadlocks threaten liveness. Liveness may also require fairness, which means that concurrent modules
are given processing capacity to make progress on their computations. Fairness is mostly a matter for
the operating system’s thread scheduler, but you can influence it (for good or for ill) by setting thread
priorities.

Concurrency in practice
What strategies are typically followed in real programs?

Library data structures either use no synchronization (to offer high performance to single-threaded
clients, while leaving it to multithreaded clients to add locking on top) or the monitor pattern.

Mutable data structures with many parts typically use either coarse-grained locking or thread
confinement. Most graphical user interface toolkits follow one of these approaches, because a
graphical user interface is basically a big mutable tree of mutable objects. Java Swing, the graphical
user interface toolkit, uses thread confinement. Only a single dedicated thread is allowed to access
Swing’s tree. Other threads have to pass messages to that dedicated thread in order to access the
tree.

Search often uses immutable datatypes. Our Boolean formula satisfiability search would be easy to
make multithreaded, because all the datatypes involved were immutable. There would be no risk of
either races or deadlocks.

Operating systems often use fine-grained locks in order to get high performance, and use lock
ordering to deal with deadlock problems.

We’ve omitted one important approach to mutable shared data because it’s outside the scope of this
course, but it’s worth mentioning: a database. Database systems are widely used for distributed
client/server systems like web applications. Databases avoid race conditions using transactions, which
are similar to synchronized regions in that their effects are atomic, but they don’t have to acquire locks,
though a transaction may fail and be rolled back if it turns out that a race occurred. Databases can also
manage locks, and handle locking order automatically. For more about how to use databases in system
design, 6.170 Software Studio is strongly recommended; for more about how databases work on the
inside, take 6.814 Database Systems.

And if you’re interested in the performance of concurrent programs — since performance is often one of
the reasons we add concurrency to a system in the first place — then 6.172 Performance Engineering is
the course for you.

Summary
Producing a concurrent program that is safe from bugs, easy to understand, and ready for change
requires careful thinking. Heisenbugs will skitter away as soon as you try to pin them down, so debugging
simply isn’t an effective way to achieve correct threadsafe code. And threads can interleave their
operations in so many different ways that you will never be able to test even a small fraction of all
possible executions.

Make thread safety arguments about your datatypes, and document them in the code.

Acquiring a lock allows a thread to have exclusive access to the data guarded by that lock, forcing
other threads to block — as long as those threads are also trying to acquire that same lock.

The monitor pattern guards the rep of a datatype with a single lock that is acquired by every method.

Blocking caused by acquiring multiple locks creates the possibility of deadlock.


Reading 24: Map, Filter, Reduce
Software in 6.005

Safe from bugs: Correct today and correct in the unknown future.
Easy to understand: Communicating clearly with future programmers, including future you.
Ready for change: Designed to accommodate change without rewriting.

Objectives

In this reading you’ll learn a design pattern for implementing functions that
operate on sequences of elements, and you’ll see how treating functions
themselves as first-class values that we can pass around and manipulate in
our programs is an especially powerful idea.

Map/filter/reduce
Lambda expressions
Functional objects
Higher-order functions

Introduction: an example
Suppose we’re given the following problem: write a method that finds the
words in the Java files in your project.

Following good practice, we break it down into several simpler steps and
write a method for each one:

find all the files in the project, by scanning recursively from the project’s
root folder
restrict them to files with a particular suffix, in this case .java
open each file and read it in line-by-line
break each line into words

Writing the individual methods for these substeps, we’ll find ourselves
writing a lot of low-level iteration code. For example, here’s what the
recursive traversal of the project folder might look like:

/**
 * Find all the files in the filesystem subtree rooted at folder.
 * @param folder root of subtree, requires folder.isDirectory() == true
 * @return list of all ordinary files (not folders) that have folder as
 *         their ancestor
 */
public static List<File> allFilesIn(File folder) {
    List<File> files = new ArrayList<>();
    for (File f : folder.listFiles()) {
        if (f.isDirectory()) {
            files.addAll(allFilesIn(f));
        } else if (f.isFile()) {
            files.add(f);
        }
    }
    return files;
}

And here’s what the filtering method might look like, which restricts that file
list down to just the Java files (imagine calling this like
onlyFilesWithSuffix(files, ".java") ):

/**
 * Filter a list of files to those that end with suffix.
 * @param files list of files (all non-null)
 * @param suffix string to test
 * @return a new list consisting of only those files whose names end with
 *         suffix
 */
public static List<File> onlyFilesWithSuffix(List<File> files, String suffix) {
    List<File> result = new ArrayList<>();
    for (File f : files) {
        if (f.getName().endsWith(suffix)) {
            result.add(f);
        }
    }
    return result;
}

→ full Java code for the example

In this reading we discuss map/filter/reduce, a design pattern that
substantially simplifies the implementation of functions that operate over
sequences of elements. In this example, we’ll have lots of sequences —
lists of files; input streams that are sequences of lines; lines that are
sequences of words; frequency tables that are sequences of (word, count)
pairs. Map/filter/reduce will enable us to operate on those sequences with
no explicit control flow — not a single for loop or if statement.

Along the way, we’ll also see an important Big Idea: functions as “first-
class” data values, meaning that they can be stored in variables, passed as
arguments to functions, and created dynamically like other values.

Using first-class functions in Java is more verbose and uses some unfamiliar
syntax, and the interaction with static typing adds some complexity. So to
get started with map/filter/reduce, we’ll switch back to Python.

Abstracting out control flow


We’ve already seen one design pattern that abstracts away from the details
of iterating over a data structure: Iterator.

Iterator abstraction

Iterator gives you a sequence of elements from a data structure, without
you having to worry about whether the data structure is a set or a token
stream or a list or an array — the Iterator looks the same no matter what
the data structure is.

For example, given a List<File> files, we can iterate using indices:

for (int ii = 0; ii < files.size(); ii++) {
    File f = files.get(ii);
    // ...

But this code depends on the size and get methods of List , which might be
different in another data structure. Using an iterator abstracts away the
details:

Iterator<File> iter = files.iterator();
while (iter.hasNext()) {
    File f = iter.next();
    // ...

Now the loop will be identical for any type that provides an Iterator . There
is, in fact, an interface for such types: Iterable . Any Iterable can be used
with Java’s enhanced for statement — for (File f : files) — and under the
hood, it uses an iterator.
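
Python's iterator protocol plays the same role. As a small sketch (the function name count_elements is hypothetical), the loop below uses only iter() and next(), so it works unchanged on a list, a string, or a stream of lines:

```python
import io

def count_elements(seq):
    """Count the elements of any iterable using only iter() and next()."""
    it = iter(seq)
    count = 0
    while True:
        try:
            next(it)
        except StopIteration:   # the iterator has run out of elements
            return count
        count += 1

print(count_elements([1, 2, 3]))                    # a list of numbers
print(count_elements("abcd"))                       # a string of characters
print(count_elements(io.StringIO("one\ntwo\n")))    # a stream of lines
```

A Python for loop does this same iter()/next() dance under the hood, just as Java's enhanced for statement uses an Iterator.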

Map/filter/reduce abstraction

The map/filter/reduce patterns in this reading do something similar to
Iterator, but at an even higher level: they treat the entire sequence of
elements as a unit, so that the programmer doesn’t have to name and work
with the elements individually. In this paradigm, the control statements
disappear: specifically, the for statements, the if statements, and the
return statements in the code from our introductory example will be gone.
We’ll also be able to get rid of most of the temporary names (i.e., the local
variables files , f , and result ).

Sequences

Let’s imagine an abstract datatype Seq<E> that represents a sequence of
elements of type E.

For example, [1, 2, 3, 4] ∈ Seq<Integer>.

Any datatype that has an iterator can qualify as a sequence: array, list, set,
etc. A string is also a sequence (of characters), although Java’s strings
don’t offer an iterator. Python is more consistent in this respect: not only are
lists iterable, but so are strings, tuples (which are immutable lists), and even
input streams (which produce a sequence of lines). We’ll see these
examples in Python first, since the syntax is very readable and familiar to
you, and then we’ll see how it works in Java.

We’ll have three operations for sequences: map, filter, and reduce. Let’s
look at each one in turn, and then look at how they work together.

Map
Map applies a unary function to each element in the sequence and returns a
new sequence containing the results, in the same order:

map : (E → F) × Seq<E> → Seq<F>

For example, in Python:

>>> from math import sqrt
>>> map(sqrt, [1, 4, 9, 16])
[1.0, 2.0, 3.0, 4.0]
>>> map(str.lower, ['A', 'b', 'C'])
['a', 'b', 'c']
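
The transcripts in this reading show map returning a list, which is how Python 2 behaves. In Python 3 the built-in map returns a lazy iterator instead, so to see the same values you wrap the call in list(). A minimal sketch (the variable names are only for illustration):

```python
from math import sqrt

roots = map(sqrt, [1, 4, 9, 16])    # in Python 3: a lazy map object, not a list
as_list = list(roots)               # consuming it produces the values
lowered = list(map(str.lower, ['A', 'b', 'C']))

print(as_list)    # [1.0, 2.0, 3.0, 4.0]
print(lowered)    # ['a', 'b', 'c']
```

Note that a map object can be consumed only once; calling list(roots) a second time gives an empty list.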

map is built-in, but it is also straightforward to implement in Python:

def map(f, seq):
    result = []
    for elt in seq:
        result.append(f(elt))
    return result

This operation captures a common pattern for operating over sequences:
doing the same thing to each element of the sequence.

Functions as values
Let’s pause here for a second, because we’re doing something unusual with
functions. The map function takes a reference to a function as its first
argument — not to the result of that function. When we wrote

map(sqrt, [1, 4, 9, 16])

we didn’t call sqrt (as we would by writing sqrt(25)); instead we just used its name. In
Python, the name of a function is a reference to an object representing that
function. You can assign that object to another variable if you like, and it still
behaves like sqrt:

>>> mySquareRoot = sqrt
>>> mySquareRoot(25)
5.0

You can also pass a reference to the function object as a parameter to
another function, and that’s what we’re doing here with map. You can use
function objects the same way you would use any other data value in
Python (like numbers or strings or objects).

Functions are first-class in Python, meaning that they can be assigned to
variables, passed as parameters, used as return values, and stored in data
structures. First-class functions are a very powerful programming idea. The
first practical programming language that used them was Lisp, invented by
John McCarthy at MIT. But the idea of programming with functions as first-
class values actually predates computers, tracing back to Alonzo Church’s
lambda calculus. The lambda calculus used the Greek letter λ to define new
functions; this term stuck, and you’ll see it as a keyword not only in Lisp and
its descendants, but also in Python.
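
Each of those four uses can be shown in a few lines. This is a small sketch; the names operations, choose, and apply_twice are hypothetical, made up for the illustration:

```python
from math import sqrt

# stored in a data structure (and assigned to a variable)
operations = {'root': sqrt, 'double': lambda x: 2 * x}

# used as a return value
def choose(name):
    return operations[name]

# passed as a parameter
def apply_twice(f, x):
    return f(f(x))

print(choose('root')(25))                        # 5.0
print(apply_twice(operations['double'], 3))      # 12
```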

We’ve seen how to use built-in library functions as first-class values; how do
we make our own? One way is using a familiar function definition, which
gives the function a name:

>>> def powerOfTwo(k):
...     return 2**k
...
>>> powerOfTwo(5)
32
>>> map(powerOfTwo, [1, 2, 3, 4])
[2, 4, 8, 16]

When you only need the function in one place, however — which often
comes up in programming with functions — it’s more convenient to use a
lambda expression:

lambda k: 2**k

This expression represents a function of one argument (called k) that
returns the value 2^k. You can use it anywhere you would have used
powerOfTwo:

>>> (lambda k: 2**k)(5)
32
>>> map(lambda k: 2**k, [1, 2, 3, 4])
[2, 4, 8, 16]

Python lambda expressions are unfortunately syntactically limited, to
functions that can be written with just a return statement and nothing else
(no if statements, no for loops, no local variables). But remember that’s
our goal with map/filter/reduce anyway, so it won’t be a serious obstacle.
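
One thing that is still allowed is a conditional expression (value if test else other), because it is an expression rather than a statement. A small sketch, with sign as a hypothetical example function:

```python
# The body of a lambda is a single expression; a conditional expression qualifies.
sign = lambda x: -1 if x < 0 else (1 if x > 0 else 0)

signs = list(map(sign, [-5, 0, 7]))
print(signs)    # [-1, 0, 1]
```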

Guido van Rossum, the creator of Python, wrote a blog post about the
design principle that led not only to first-class functions in Python, but first-
class methods as well: First-class Everything.

More ways to use map

Map is useful even if you don’t care about the return value of the function.
When you have a sequence of mutable objects, for example, you can map a
mutator operation over them:

map(IOBase.close, streams)  # closes each stream on the list
map(Thread.join, threads)   # waits for each thread to finish
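
One caution if you try this in Python 3: map is lazy there, so the mutator is not actually called until something consumes the result. The sketch below uses in-memory io.StringIO streams as stand-ins for real files (an assumption made so the example is self-contained) and records the state before and after consuming the map:

```python
import io

streams = [io.StringIO("a"), io.StringIO("b")]

pending = map(io.StringIO.close, streams)      # Python 3: nothing has run yet
closed_before = [s.closed for s in streams]

for _ in pending:                              # consuming the map runs close() on each
    pass
closed_after = [s.closed for s in streams]

print(closed_before)    # [False, False]
print(closed_after)     # [True, True]
```

When the point is the side effect rather than the returned sequence, many Python 3 programmers simply write the for loop directly.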

Some versions of map (including Python’s built-in map ) also support mapping
functions with multiple arguments. For example, you can add two lists of
numbers element-wise:

>>> import operator
>>> map(operator.add, [1, 2, 3], [4, 5, 6])
[5, 7, 9]

reading exercises

map 1

Try these in the Python interpreter if you’re not sure!

What is the result of map(len, [ [1], [2], [3] ]) ?

1
[1, 1, 1]
[1, 2, 3]
[ [1], [1], [1] ]
[ [1], [2], [3] ]

error


map 2

What is the result of map(len, [1, 2, 3]) ?

1
[1, 1, 1]
[1, 2, 3]
[ [1], [1], [1] ]
[ [1], [2], [3] ]
error


map 3

What is the result of map(len, ['1', '2', '3']) ?

1
[1, 1, 1]
[1, 2, 3]
[ [1], [1], [1] ]
[ [1], [2], [3] ]

error


map 4

What is the result of map(lambda x: x.split(' '), 'a b c') ?

'a b c'
['a b c']
['a', 'b', 'c']
[ ['a', 'b', 'c'] ]

something else
error


map 5

What is the result of map(lambda x: x.split(' '), ['a b c']) ?

'a b c'
['a b c']
['a', 'b', 'c']
[ ['a', 'b', 'c'] ]

something else
error


Filter
Our next important sequence operation is filter, which tests each element
with a unary predicate. Elements that satisfy the predicate are kept; those
that don’t are removed. A new list is returned; filter doesn’t modify its input
list.

filter : (E → boolean) × Seq<E> → Seq<E>

Python examples:

>>> filter(str.isalpha, ['x', 'y', '2', '3', 'a'])
['x', 'y', 'a']

>>> def isOdd(x): return x % 2 == 1
...
>>> filter(isOdd, [1, 2, 3, 4])
[1, 3]

>>> filter(lambda s: len(s)>0, ['abc', '', 'd'])
['abc', 'd']

We can define filter in a straightforward way:

def filter(f, seq):
    result = []
    for elt in seq:
        if f(elt):
            result.append(elt)
    return result
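
As with map, the built-in filter in Python 3 returns a lazy iterator, so the list results shown above need a list() wrapper there. The sketch below also shows the equivalent list comprehension, which many Python programmers use for the same job; is_odd is the isOdd example renamed in Python style.

```python
def is_odd(x):
    return x % 2 == 1

kept = list(filter(is_odd, [1, 2, 3, 4]))           # Python 3 needs list(...)
same = [x for x in [1, 2, 3, 4] if is_odd(x)]       # equivalent comprehension
letters = list(filter(str.isalpha, ['x', 'y', '2', '3', 'a']))

print(kept)       # [1, 3]
print(letters)    # ['x', 'y', 'a']
```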
reading exercises

filter 1

Try these in the Python interpreter if you’re not sure!

Given:

x1 = {'x': 1}
y2 = {'y': 2}
x3_y4 = {'x': 3, 'y': 4}

What is the result of filter(lambda d: 'x' in d.keys(), [ x1, y2, x3_y4 ]) ?

[{'x': 1}, {'y': 2}, {'x': 3, 'y': 4}]


[{'x': 1}, {'y': 2}, {'x': 3}]
[{'x': 1}, {'x': 3, 'y': 4}]
[{'x': 1}, {'x': 3}]
[{'x': 1}]


filter 2

Again given:

x1 = {'x': 1}
y2 = {'y': 2}
x3_y4 = {'x': 3, 'y': 4}

What is the result of filter(lambda d: 0 in d.values(), [ x1, y2, x3_y4 ]) ?

0
False
None
[]
[ {}, {}, {} ]
[ None, None, None ]

filter 3

What is the result of filter(str.isalpha, [ 'a', '1', 'b', '2' ]) ?

''
'ab'
['ab']
['a', 'b']
['a', '', 'b', '']
['a', '1', 'b', '2']


filter 4

What is the result of filter(str.swapcase, [ 'a', '1', 'b', '2' ]) ?

'A1B2'
'a1b2'
['A1B2']
['a1b2']
['A', '1', 'B', '2']
['a', '1', 'b', '2']


Reduce
Our final operator, reduce, combines the elements of the sequence
together, using a binary function. In addition to the function and the list, it
also takes an initial value that initializes the reduction, and that ends up
being the return value if the list is empty.

reduce : (F × E → F) × Seq<E> × F → F

reduce(f, list, init) combines the elements of the list from left to right, as
follows:

result_0 = init
result_1 = f(result_0, list[0])
result_2 = f(result_1, list[1])
...
result_n = f(result_{n-1}, list[n-1])

result_n is the final result for an n-element list.

Adding numbers is probably the most straightforward example:

>>> reduce(lambda x,y: x+y, [1, 2, 3], 0)
6
# --or--
>>> import operator
>>> reduce(operator.add, [1, 2, 3], 0)
6

There are two design choices in the reduce operation. First is whether to
require an initial value. In Python’s reduce function, the initial value is
optional, and if you omit it, reduce uses the first element of the list as its
initial value. So you get behavior like this instead:

result_0 = undefined (reduce throws an exception if the list is empty)
result_1 = list[0]
result_2 = f(result_1, list[1])
...
result_n = f(result_{n-1}, list[n-1])

This makes it easier to use reducers like max, which have no well-defined
initial value:

>>> reduce(max, [5, 8, 3, 1])
8
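
In Python 3, reduce is no longer a built-in; it lives in the functools module. The following sketch repeats the examples above in that form and also checks the empty-list behavior with and without an initial value (the variable names are only for illustration):

```python
from functools import reduce   # Python 3: reduce must be imported
import operator

total = reduce(operator.add, [1, 2, 3], 0)
largest = reduce(max, [5, 8, 3, 1])
empty_with_init = reduce(operator.add, [], 0)   # the initial value is returned

try:
    reduce(max, [])                             # no initial value, nothing to return
    empty_without_init = "no error"
except TypeError:
    empty_without_init = "TypeError"

print(total, largest, empty_with_init, empty_without_init)
```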

The second design choice is the order in which the elements are
accumulated. For associative operators like add and max it makes no
difference, but for other operators it can. Python’s reduce is also called
fold-left in other programming languages, because it combines the
sequence starting from the left (the first element). Fold-right goes in the
other direction:

fold-right : (E × F → F) × Seq<E> × F → F

where fold-right(f, list, init) of an n-element list follows this pattern:

result_0 = init
result_1 = f(list[n-1], result_0)
result_2 = f(list[n-2], result_1)
...
result_n = f(list[0], result_{n-1})

to produce result_n as the final result.

Here are the two ways to reduce, from the left or from the right, applied to subtraction:

fold-left : (F × E → F) × Seq<E> × F → F
fold-left(-, [1, 2, 3], 0) = ((0 - 1) - 2) - 3 = -6

fold-right : (E × F → F) × Seq<E> × F → F
fold-right(-, [1, 2, 3], 0) = 1 - (2 - (3 - 0)) = 2
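
Both directions are short to write in Python. This is a sketch in Python 3: fold_left and fold_right are hypothetical names (Python itself provides only functools.reduce, which folds from the left), and fold_right here assumes a sequence that reversed() accepts, such as a list.

```python
from functools import reduce
import operator

def fold_left(f, seq, init):
    # f takes (accumulated result, element)
    return reduce(f, seq, init)

def fold_right(f, seq, init):
    # f takes (element, accumulated result); walk the sequence from the right
    return reduce(lambda acc, elt: f(elt, acc), reversed(seq), init)

print(fold_left(operator.sub, [1, 2, 3], 0))    # ((0 - 1) - 2) - 3 = -6
print(fold_right(operator.sub, [1, 2, 3], 0))   # 1 - (2 - (3 - 0)) = 2
```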

The return type of the reduce operation doesn’t have to match the type of
the list elements. For example, we can use reduce to glue together a
sequence into a string:

>>> reduce(lambda s,x: s+str(x), [1, 2, 3, 4], '')
'1234'

Or to flatten out nested sublists into a single list:

>>> reduce(operator.concat, [[1, 2], [3, 4], [], [5]], [])
[1, 2, 3, 4, 5]

This is a useful enough sequence operation that we’ll define it as flatten,
although it’s just a reduce step inside:

def flatten(list):
    return reduce(operator.concat, list, [])

More examples

Suppose we have a polynomial represented as a list of coefficients, a[0],
a[1], ..., a[n-1], where a[i] is the coefficient of x^i. Then we can evaluate it
using map and reduce:

def evaluate(a, x):
    xi = map(lambda i: x**i, range(0, len(a)))  # [x^0, x^1, x^2, ..., x^(n-1)]
    axi = map(operator.mul, a, xi)              # [a[0]*x^0, a[1]*x^1, ..., a[n-1]*x^(n-1)]
    return reduce(operator.add, axi, 0)         # sum of axi

This code uses the convenient Python generator method range(a,b) , which
generates a list of integers from a to b-1. In map/filter/reduce programming,
this kind of method replaces a for loop that indexes from a to b.
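
As a quick check, here is the same function as a self-contained Python 3 sketch with a worked value. The only additions are the imports; the lazy map and range objects of Python 3 work here without change because reduce consumes them.

```python
from functools import reduce
import operator

def evaluate(a, x):
    xi = map(lambda i: x**i, range(0, len(a)))   # x^0, x^1, ..., x^(n-1)
    axi = map(operator.mul, a, xi)               # a[i] * x^i for each i
    return reduce(operator.add, axi, 0)          # sum of the terms

# 1 + 2x + 3x^2 at x = 2 is 1 + 4 + 12
print(evaluate([1, 2, 3], 2))    # 17
```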

Now let’s look at a typical database query example. Suppose we have a
database about digital cameras, in which each object is of type Camera with
observer methods for its properties (brand(), pixels(), cost(), etc.). The
whole database is in a list called cameras. Then we can describe queries on
this database using map/filter/reduce:

# What's the highest resolution Nikon sells?
reduce(max, map(Camera.pixels, filter(lambda c: c.brand() == "Nikon", cameras)))

Relational databases use the map/filter/reduce paradigm (where it’s called
project/select/aggregate). SQL (Structured Query Language) is the de
facto standard language for querying relational databases. A typical SQL
query looks like this:

select max(pixels) from cameras where brand = "Nikon"

cameras is a sequence (a list of rows, where each row has the data for
one camera)

where brand = "Nikon" is a filter

pixels is a map (extracting just the pixels field from the row)

max is a reduce
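
To make the Python version of the query runnable, here is a minimal sketch. The Camera class below is a hypothetical stand-in with just the observers named in the text, and the three cameras are invented data, not real products or prices.

```python
from functools import reduce

class Camera:
    """Hypothetical immutable record with observer methods."""
    def __init__(self, brand, pixels, cost):
        self._brand, self._pixels, self._cost = brand, pixels, cost
    def brand(self):
        return self._brand
    def pixels(self):
        return self._pixels
    def cost(self):
        return self._cost

# invented sample data
cameras = [Camera("Nikon", 24, 900), Camera("Canon", 30, 1200), Camera("Nikon", 45, 2500)]

nikons = filter(lambda c: c.brand() == "Nikon", cameras)   # where brand = "Nikon"
resolutions = map(Camera.pixels, nikons)                    # select pixels
highest = reduce(max, resolutions)                          # max(...)

print(highest)    # 45
```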

reading exercises

reduce 1

Which is the best description of this reduction?

reduce(lambda x, y: x and (y == 'True'), [ ... ], True)

returns False

returns True iff the list is empty


returns True iff all the values in the list are True

returns True iff all the values in the list are strings
returns True iff all the values in the list are the string 'True'

returns True iff some value in the list is the string 'True'


reduce 2
Try these in the Python interpreter if you’re not sure!

What is the result of:

reduce(lambda a,b: a * b, [1, 2, 3], 0)


What is the result of:

reduce(lambda a,b: a if len(a) > len(b) else b, ["oscar", "papa", "tango"])


Back to the intro example


Going back to the example we started with, where we want to find all the
words in the Java files in our project, let’s try creating a useful abstraction
for filtering files by suffix:

def fileEndsWith(suffix):
    return lambda file: file.getName().endsWith(suffix)

fileEndsWith returns functions that are useful as filters: it takes a filename
suffix like .java and dynamically generates a function that we can use with
filter to test for that suffix:

filter(fileEndsWith(".java"), files)

fileEndsWith is a different kind of beast than our usual functions. It’s a
higher-order function, meaning that it’s a function that takes another
function as an argument, or returns another function as its result.
Higher-order functions are operations on the datatype of functions; in this
case, fileEndsWith is a creator of functions.
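The same idea can be tried on plain strings. This is only a sketch: ends_with and the filename list are hypothetical stand-ins for fileEndsWith and a list of File objects.

```python
def ends_with(suffix):
    """Return a new predicate that tests whether a filename ends with suffix."""
    return lambda name: name.endswith(suffix)

# Made-up filenames, for illustration only.
filenames = ["Main.java", "notes.txt", "Server.java", "build.xml"]

is_java = ends_with(".java")          # a function created at runtime
java_files = list(filter(is_java, filenames))
print(java_files)
```

Calling ends_with does not test anything itself; it manufactures the function that filter will call once per element.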

Now let’s use map, filter, and flatten (which we defined above using reduce)
to recursively traverse the folder tree:

def allFilesIn(folder):
    children = folder.listFiles()
    subfolders = filter(File.isDirectory, children)
    descendants = flatten(map(allFilesIn, subfolders))
    return descendants + filter(File.isFile, children)

The first line gets all the children of the folder, which might look like this:

["src/client", "src/server", "src/Main.java", ...]

The second and third lines are the key bit: they filter the children for just
the subfolders, and then recursively map allFilesIn against this list of
subfolders! The result might look like this:

[["src/client/MyClient.java", ...], ["src/server/MyServer.java", ...], ...]

So we have to flatten it to remove the nested structure. Then we add the
immediate children that are plain files (not folders), and that’s our result.
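For a version that runs in ordinary Python 3, here is a sketch using pathlib in place of the Java File methods. The names all_files_in and flatten are our own; flatten is written with reduce, which is an assumption about how it was defined earlier, and the folder tree is a throwaway built just for the demonstration.

```python
import tempfile
from functools import reduce
from pathlib import Path

def flatten(list_of_lists):
    """Concatenate a sequence of lists into a single list."""
    return reduce(lambda acc, xs: acc + xs, list_of_lists, [])

def all_files_in(folder):
    """Return every plain file under folder, searching subfolders recursively."""
    children = list(folder.iterdir())
    subfolders = filter(Path.is_dir, children)
    descendants = flatten(map(all_files_in, subfolders))
    # filter returns an iterator in Python 3, so convert before concatenating
    return descendants + list(filter(Path.is_file, children))

# Build a small temporary folder tree to try it on.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "client").mkdir()
    (root / "client" / "MyClient.java").write_text("class MyClient {}")
    (root / "Main.java").write_text("class Main {}")
    found = sorted(p.name for p in all_files_in(root))

print(found)
```

Note the explicit list(...) around the last filter: in Python 3, map and filter return lazy iterators, which cannot be added to a list directly.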

We can also do the other pieces of the problem with map/filter/reduce.
Once we have the list of files we want to extract words from, we’re ready
to load their contents. We can use map to get their pathnames as strings,
open them, and then read in each file as a list of lines:

pathnames = map(File.getPath, files)
streams = map(open, pathnames)
lines = map(list, streams)

This actually looks like a single map operation where we want to apply
three functions to the elements, so let’s pause to create another useful
higher-order function: composing functions together.

def compose(f, g):
    """Requires that f and g are functions, f:A->B and g:B->C.
    Returns a function A->C by composing f with g."""
    return lambda x: g(f(x))

Now we can use a single map:

lines = map(compose(compose(File.getPath, open), list), files)

Better, since we already have three functions to apply, let’s design a way to
compose an arbitrary chain of functions:

def chain(funcs):
    """Requires funcs is a list of functions [A->B, B->C, ..., Y->Z].
    Returns a fn A->Z that is the left-to-right composition of funcs."""
    return reduce(compose, funcs)

So that the map operation becomes:

lines = map(chain([File.getPath, open, list]), files)

Now we see more of the power of first-class functions. We can put
functions into data structures and use operations on those data structures,
like map, reduce, and filter, on the functions themselves!
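Here is a small runnable sketch of compose and chain applied to ordinary string functions. The pipeline [str.strip, str.split, len] and the name count_words are our own illustration, not part of the file example.

```python
from functools import reduce

def compose(f, g):
    """Given f: A->B and g: B->C, return a function A->C that applies f, then g."""
    return lambda x: g(f(x))

def chain(funcs):
    """Left-to-right composition of a non-empty list of functions."""
    return reduce(compose, funcs)

# A list of functions is just data; reduce folds it into one function.
pipeline = [str.strip, str.split, len]
count_words = chain(pipeline)

print(count_words("  to be or not  "))
```

Because chain uses reduce without an initial value, it raises a TypeError on an empty list, which is why its spec should require at least one function.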

Since this map will produce a list of lists of lines (one list of lines for each
file), let’s flatten it to get a single line list, ignoring file boundaries:

allLines = flatten(map(chain([File.getPath, open, list]), files))

Then we split each line into words similarly:

words = flatten(map(str.split, allLines))

And we’re done, we have our list of all words in the project’s Java files! As
promised, the control statements have disappeared.

→ full Python code for the example


Benefits of abstracting out control
Map/filter/reduce can often make code shorter and simpler, and allow the
programmer to focus on the heart of the computation rather than on the
details of loops, branches, and control flow.

By arranging our program in terms of map, filter, and reduce, and in
particular using immutable datatypes and pure functions (functions that do
not mutate data) as much as possible, we’ve created more opportunities for
safe concurrency. Maps and filters using pure functions over immutable
datatypes are instantly parallelizable: invocations of the function on
different elements of the sequence can be run in different threads, on
different processors, even on different machines, and the result will still be
the same. MapReduce is a pattern for parallelizing large computations in
this way.
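As a small illustration of that claim, the sketch below maps a pure function over a list twice, once sequentially and once on a thread pool from Python's concurrent.futures module. The function square and the pool size are arbitrary choices for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    """Pure function: no side effects, and the result depends only on x."""
    return x * x

numbers = list(range(10))

sequential = list(map(square, numbers))

# Executor.map runs calls on several threads but yields results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(square, numbers))

print(sequential == parallel)
```

The two results agree only because square is pure; a function that mutated shared state could produce different answers depending on how the threads interleave.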

reading exercises

map/filter/reduce

This Python function accepts a list of numbers and computes the product of
all the odd numbers:

def productOfOdds(list):
    result = 1
    for x in list:
        if x % 2 == 1:
            result *= x
    return result

Rewrite the Python code using map, filter, and reduce:

def productOfOdds(list):
    return reduce(r_func, filter(f_func, map(m_func, list)))

Where m_func , f_func , and r_func are each one of the following:

A. list

B. x

C. y

D. def identity_function(x):
       return x

E. identity = lambda x: x

F. def always_true(x):
       return True

G. def modulus_tester(i):
       return lambda x: x % 2 == i

H. def is_odd(x):
       return x % 2 == 1

I. x_is_odd = x % 2 == 1

J. def odd_or_identity(x):
       return x if is_odd(x) else 1

K. def sum(x, y):
       return x + y

L. def product(x, y):
       return x * y

M. operator.mul

N. x * y

For each of the choices below, is it a correct implementation?

def productOfOdds(list):
    return reduce(r_func, filter(f_func, map(m_func, list)))

D + H + L: reduce(product, filter(is_odd, map(identity_function, list)))

Yes
No

E + F + L: reduce(product, filter(always_true, map(identity, list)))

Yes
No


E + G + L: reduce(product, filter(modulus_tester, map(identity, list)))

Yes
No


B + J + L: reduce(product, filter(odd_or_identity, map(x, list)))

Yes
No


J + F + M: reduce(operator.mul, filter(always_true, map(odd_or_identity, list)))

Yes
No


D + I + N: reduce(x * y, filter(x_is_odd, map(identity_function, list)))

Yes
No

First-class functions in Java
We’ve seen what first-class functions look like in Python; how does this all
work in Java?

In Java, the only first-class values are primitive values (ints, booleans,
characters, etc.) and object references. But objects can carry functions with
them, in the form of methods. So it turns out that the way to implement a
first-class function, in an object-oriented programming language like Java
that doesn’t support first-class functions directly, is to use an object with a
method representing the function.

We’ve actually seen this before several times already:

The Runnable object that you pass to a Thread constructor is a first-class
function, void run().

The Comparator<T> object that you pass to a sorted collection (e.g.
SortedSet) is a first-class function, int compare(T o1, T o2).

In a future class, we’ll see KeyListener objects that you register with the
graphical user interface toolkit to get keyboard events. They act as a
bundle of several functions, keyPressed(KeyEvent), keyReleased(KeyEvent), etc.

This design pattern is called a functional object or functor, an object
whose purpose is to represent a function.
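To see the pattern in isolation, here is a sketch in Python, where it is not strictly needed but is easy to read. LengthComparator is a hypothetical class of our own whose only job is to carry a compare method, in the spirit of Java's Comparator; functools.cmp_to_key adapts that method for use with sorted.

```python
from functools import cmp_to_key

class LengthComparator:
    """Hypothetical functional object: it exists only to carry compare()."""
    def compare(self, s1, s2):
        # negative if s1 is shorter, zero if equal length, positive if longer
        return len(s1) - len(s2)

comparator = LengthComparator()   # an object standing in for a function
words = ["papa", "oscar", "to"]
by_length = sorted(words, key=cmp_to_key(comparator.compare))
print(by_length)
```

The object is passed around like any other value, and the code that receives it calls the method it carries.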

Lambda expressions in Java

Java’s lambda expression syntax provides a succinct way to create
instances of functional objects. For example, instead of writing:

new Thread(new Runnable() {
    public void run() {
        System.out.println("Hello!");
    }
}).start();

we can use a lambda expression:

new Thread(() -> {
    System.out.println("Hello!");
}).start();

On the Java Tutorials page for Lambda Expressions, read Syntax of
Lambda Expressions.

There’s no magic here: Java still doesn’t have first-class functions. So you
can only use a lambda when the Java compiler can verify two things:

1. It must be able to determine the type of the functional object the lambda
will create. In this example, the compiler sees that the Thread constructor
takes a Runnable, so it will infer that the type must be Runnable.

2. This inferred type must be a functional interface: an interface with only
one (abstract) method. In this example, Runnable indeed has only a single
method, void run(), so the compiler knows the code in the body of
the lambda belongs in the body of a run method of a new Runnable object.

Java provides some standard functional interfaces we can use to write
code in the map/filter/reduce pattern, e.g.:

Function<T,R> represents unary functions from T to R

BiFunction<T,U,R> represents binary functions from T × U to R

Predicate<T> represents functions from T to boolean

So we could implement map in Java like so:

/**
 * Apply a function to every element of a list.
 * @param f function to apply
 * @param list list to iterate over
 * @return [f(list[0]), f(list[1]), ..., f(list[n-1])]
 */
public static <T,R> List<R> map(Function<T,R> f, List<T> list) {
    List<R> result = new ArrayList<>();
    for (T t : list) {
        result.add(f.apply(t));
    }
    return result;
}

And here’s an example of using map; first we’ll write it using the familiar
syntax:

// anonymous classes like this one are effectively lambda expressions
Function<String,String> toLowerCase = new Function<>() {
    public String apply(String s) { return s.toLowerCase(); }
};
map(toLowerCase, Arrays.asList(new String[] {"A", "b", "C"}));

And with a lambda expression:

map(s -> s.toLowerCase(), Arrays.asList(new String[] {"A", "b", "C"}));
// --or--
map((s) -> s.toLowerCase(), Arrays.asList(new String[] {"A", "b", "C"}));
// --or--
map((s) -> { return s.toLowerCase(); }, Arrays.asList(new String[] {"A", "b", "C"}));

In this example, the lambda expression is just wrapping a call to String’s
toLowerCase. We can use a method reference to avoid writing the lambda,
with the syntax :: . The signature of the method we refer to must match the
signature required by the functional interface for static typing to be
satisfied:

map(String::toLowerCase, Arrays.asList(new String[] {"A", "b", "C"}));

In the Java Tutorials, you can read more about method references if you
want the details.

Using a method reference (vs. calling it) in Java serves the same purpose
as referring to a function by name (vs. calling it) in Python.

Map/filter/reduce in Java
The abstract sequence type we defined above exists in Java as Stream,
which defines map, filter, reduce, and many other operations.

Collection types like List and Set provide a stream() operation that returns a
Stream for the collection, and there’s an Arrays.stream function for creating a
Stream from an array.

Here’s one implementation of allFilesIn in Java with map and filter:

public class Words {
    static Stream<File> allFilesIn(File folder) {
        File[] children = folder.listFiles();
        Stream<File> descendants = Arrays.stream(children)
                                         .filter(File::isDirectory)
                                         .flatMap(Words::allFilesIn);
        return Stream.concat(descendants,
                             Arrays.stream(children).filter(File::isFile));
    }

The map-and-flatten pattern is so common that Java provides a flatMap
operation to do just that, and we’ve used it instead of defining flatten.

Here’s endsWith:

    static Predicate<File> endsWith(String suffix) {
        return f -> f.getPath().endsWith(suffix);
    }

Given a Stream<File> files, we can now write e.g.
files.filter(endsWith(".java")) to obtain a new filtered stream.

Look at the revised Java code for this example.

You can compare all three versions: the familiar Java implementation,
Python with map/filter/reduce, and Java with map/filter/reduce.

Higher-order functions in Java


Map/filter/reduce are of course higher-order functions; so is endsWith above.
Let’s look at two more that we saw before: compose and chain .

The Function interface provides compose, but the implementation is very
straightforward. In particular, once you get the types of the arguments and
return values correct, Java’s static typing makes it pretty much impossible
to get the method body wrong:

/**
 * Compose two functions.
 * @param f function A->B
 * @param g function B->C
 * @return new function A->C formed by composing f with g
 */
public static <A,B,C> Function<A,C> compose(Function<A,B> f,
                                            Function<B,C> g) {
    return t -> g.apply(f.apply(t));
    // --or--
    // return new Function<A,C>() {
    //     public C apply(A t) { return g.apply(f.apply(t)); }
    // };
}

It turns out that we can’t write chain in strongly-typed Java, because Lists
(and other collections) must be homogeneous: we can specify a list
whose elements are all of type Function<A,B>, but not one whose first
element is a Function<A,B>, second is a Function<B,C>, and so on.

But here’s chain for functions of the same input/output type:

/**
 * Compose a chain of functions.
 * @param funcs list of functions A->A to compose
 * @return function A->A made by composing list[0] ... list[n-1]
 */
public static <A> Function<A,A> chain(List<Function<A,A>> funcs) {
    return funcs.stream().reduce(Function.identity(), Function::compose);
}

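Our Python version didn’t use an initial value in the reduce; it required a
non-empty list of functions. In Java, we’ve provided the identity function
(that is, f(t) = t) as the identity value for the reduction. Note that
Function::compose applies its argument first, so this reduction applies the
last function in the list first; using Function::andThen instead would apply
the functions from left to right, matching the Python chain.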
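The Python chain can be given the same treatment. The sketch below passes an identity function as the initial value of reduce, so an empty list of functions is also allowed; identity, add_one, and double are names invented for this example.

```python
from functools import reduce

def compose(f, g):
    """Return a function that applies f, then g."""
    return lambda x: g(f(x))

def identity(x):
    """The identity value for composition: composing with it changes nothing."""
    return x

def chain(funcs):
    """Left-to-right composition; an empty list yields the identity function."""
    return reduce(compose, funcs, identity)

add_one = lambda n: n + 1
double = lambda n: n * 2

print(chain([add_one, double])(3))   # double(add_one(3))
print(chain([])(3))                  # no functions: the input comes back unchanged
```

Supplying an initial value weakens the precondition: callers no longer have to guarantee that the list is non-empty.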

reading exercises

Comparator<Dog>

In Java, suppose we have:

public interface Dog {
    public String name();
    public Breed breed();
    public int loudnessOfBark();
}

We have several Dog objects, and we’d like to keep a collection of them,
sorted by how loud they bark.

Java provides an interface SortedSet for sorted sets of objects. TreeSet
implements SortedSet, so we’ll use that.

The TreeSet constructor takes as an argument a Comparator that tells it how to
compare two objects in the set; in this case, two Dogs.

Comparator is a functional interface: it has a single unimplemented method:
int compare(...).

So our code will look like:

SortedSet<Dog> dogsQuietToLoud = new TreeSet<>(COMPARATOR);
dogsQuietToLoud.add(...);
dogsQuietToLoud.add(...);
dogsQuietToLoud.add(...);
// ...

An instance of Comparator is an example of:

functional object
lambda expression
method reference
something we can’t do in Java
something we can’t do in Python


Barking up the wrong TreeSet

Which of these would create a TreeSet to sort our dogs from quietest bark
to loudest?

Read the documentation for Comparator.compare(...) to understand what it
needs to do.

new TreeSet<>(new Comparator<Dog>() {
    public int compare(Dog dog1, Dog dog2) {
        return dog2.loudnessOfBark() - dog1.loudnessOfBark();
    }
});

Correct
Incorrect


new TreeSet<>(new Comparator<Dog>() {
    public Dog compare(Dog dog1, Dog dog2) {
        return dog1.loudnessOfBark() > dog2.loudnessOfBark() ? dog1 : dog2;
    }
});

Correct
Incorrect


new TreeSet<>((dog1, dog2) -> {
    return dog1.loudnessOfBark() - dog2.loudnessOfBark();
});

Correct
Incorrect


public class DogBarkComparator implements Comparator<Dog> {
    public int compare(Dog dog1, Dog dog2) {
        return dog1.loudnessOfBark() - dog2.loudnessOfBark();
    }
}
// ...
new TreeSet<>(new DogBarkComparator());

Correct
Incorrect


Summary
This reading is about modeling problems and implementing systems with
immutable data and operations that implement pure functions, as opposed
to mutable data and operations with side effects. Functional programming
is the name for this style of programming.

Functional programming is much easier to do when you have first-class
functions in your language and you can build higher-order functions that
abstract away control flow code.

Some languages (Haskell, Scala, OCaml) are strongly associated with
functional programming. Many other languages (JavaScript, Swift, several
.NET languages, Ruby, and so on) use functional programming to a
greater or lesser extent. With Java’s recently-added functional language
features, if you continue programming in Java you should expect to see
more functional programming there, too.

Reading 25: Graphical User Interfaces
Software in 6.005

Safe from bugs: Correct today and correct in the unknown future.

Easy to understand: Communicating clearly with future programmers,
including future you.

Ready for change: Designed to accommodate change without rewriting.

Objectives

Today we’ll take a high-level look at the software architecture of GUI
software, focusing on the design patterns that have proven most
useful. Three of the most important patterns are:

the view tree, which is a central feature in the architecture of
every important GUI toolkit;

the model-view-controller pattern, which separates input, output,
and data;

the listener pattern, which is essential to decoupling the model
from the view and controller.

View Tree
[Figure: a graphical user interface with views labeled]

[Figure: snapshot diagram of the view tree]


Graphical user interfaces are composed of view objects, each of
which occupies a certain portion of the screen, generally a
rectangular area called its bounding box. The view concept goes by a
variety of names in various UI toolkits. In Java Swing, they’re
JComponent objects; in HTML, they’re elements or nodes; in other
toolkits, they may be called widgets, controls, or interactors.

This leads to the first important pattern we’ll talk about today: the
view tree. Views are arranged into a hierarchy of containment, in
which some views contain other views. Typical containers are
windows, panels, and toolbars. The view tree is not just an arbitrary
hierarchy, but is in fact a spatial one: child views are nested inside
their parent’s bounding box.

How the View Tree is Used


Virtually every GUI system has some kind of view tree. The view tree
is a powerful structuring idea, which is loaded with responsibilities in a
typical GUI:

Output. Views are responsible for displaying themselves, and the
view tree directs the display process. GUIs change their output by
mutating the view tree. For example, to show a new set of photos in
a photo album GUI, the current thumbnails are removed from the view
tree and a new set of thumbnails is added in their place. A redraw
algorithm built into the GUI toolkit automatically redraws the affected
parts of the subtree. In Java Swing, every view in the tree has a
paint() method that knows how to draw itself on the screen. The
repaint process is driven by calling paint() on the root of the tree,
which recursively calls paint() down through all the descendant nodes
of the view tree.

Input. Views can have input handlers, and the view tree controls how
mouse and keyboard input is processed. More on this in a moment.

Layout. The view tree controls how the views are laid out on the
screen, i.e. how their bounding boxes are assigned. An automatic
layout algorithm calculates the positions and sizes of views.
Specialized containers (like JSplitPane, JScrollPane) do layout
themselves. More generic containers (JPanel, JFrame) delegate layout
decisions to a layout manager (e.g. GroupLayout, BorderLayout, BoxLayout,
…).
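To make the recursive structure concrete, here is a toy view tree, sketched in Python rather than Swing. The View class, its fields, and its paint method are hypothetical; paint returns lines of text instead of drawing pixels, but it walks the tree from the root down in the way described above for output.

```python
class View:
    """Hypothetical view: a bounding box plus child views nested inside it."""
    def __init__(self, name, x, y, width, height, children=()):
        self.name = name
        self.x, self.y, self.width, self.height = x, y, width, height
        self.children = list(children)

    def paint(self, depth=0):
        """Return one line per view, each parent before its children."""
        lines = ["  " * depth + self.name]
        for child in self.children:
            lines.extend(child.paint(depth + 1))
        return lines

window = View("window", 0, 0, 400, 300, [
    View("toolbar", 0, 0, 400, 40, [View("button", 5, 5, 60, 30)]),
    View("panel", 0, 40, 400, 260),
])

painted = window.paint()
print("\n".join(painted))
```

Changing what appears on screen then amounts to mutating the children lists and painting again from the root.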

reading exercises

View Tree

Swing’s view tree type, JComponent, is a recursive data type. Here’s a
partial data type definition for it:

JComponent = JLabel(label:String)
+ JPanel(children:JComponent[])
+ ...

This definition is partial because it omits some of the details of the
reps (e.g. JLabel has many fields, not just a string label), and because
it omits many of the variant classes that implement JComponent.

Let’s fill in some more of the “…” on the righthand side of this
definition. To answer the questions below, you may need to look at
the documentation for the particular classes.
Which is the best description of JButton for the righthand side of the
definition?
JButton()
JButton(label:String)
JButton(children:JComponent[])


Which is the best description of JCheckBox on the righthand side of the
definition?
JCheckBox()
JCheckBox(label:String, children:JComponent[])
JCheckBox(label:String, selected:boolean)


Which is the best description of JScrollPane on the righthand side of
the definition?
JScrollPane()
JScrollPane(children:JComponent[])
JScrollPane(child:JComponent)


Input Handling
Input is handled somewhat differently in GUIs than we’ve been
handling it in parsers and servers. In those systems, we’ve seen a
single parser that peels apart the input and decides how to direct it to
different modules of the program. If a GUI were written that way, it
might look like this (in pseudocode):

while (true) {
    read mouse click
    if (clicked on Thrash button) doThrash();
    else if (clicked on textbox) doPlaceCursor();
    else if (clicked on a name in the listbox) doSelectItem();
    ...
}

In a GUI, we don’t directly write this kind of method, because it’s not
modular – it mixes up responsibilities for button, listbox, and textbox
all in one place. Instead, GUIs exploit the spatial separation provided
by the view tree to provide functional separation as well. Mouse
clicks and keyboard events are distributed around the view tree,
depending on where they occur.

GUI input event handling is an instance of the Listener pattern (also
known as Publish-Subscribe). In the Listener pattern:

An event source generates a stream of discrete events, which
correspond to state transitions in the source.

One or more listeners register interest (subscribe) to the stream
of events, providing a function to be called when a new event
occurs.

In this case, the mouse is the event source, and the events are
changes in the state of the mouse: its x,y position or the state of its
buttons (whether they are pressed or released). Events often include
additional information about the transition (such as the x,y position of
the mouse), which might be bundled into an event object or passed as
parameters.

When an event occurs, the event source distributes it to all
subscribed listeners, by calling their callback methods.
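A minimal sketch of the pattern follows, in Python. EventSource, add_listener, and fire are hypothetical names, and the event is just a dictionary carrying the extra information about the transition.

```python
class EventSource:
    """Hypothetical event source that publishes events to registered listeners."""
    def __init__(self):
        self._listeners = []

    def add_listener(self, callback):
        """Subscribe: callback will be called with each future event."""
        self._listeners.append(callback)

    def fire(self, event):
        """Distribute one event to every subscribed listener, in order."""
        for callback in list(self._listeners):
            callback(event)

mouse = EventSource()
log = []
mouse.add_listener(lambda e: log.append(("first", e["x"], e["y"])))
mouse.add_listener(lambda e: log.append(("second", e["button"])))

mouse.fire({"x": 10, "y": 20, "button": "pressed"})
print(log)
```

The source knows nothing about what its listeners do; it only promises to call them, which is what decouples the two sides.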

The control flow through a graphical user interface proceeds like this:

A top-level event loop reads input from mouse and keyboard. In
Java Swing, and most graphical user interface toolkits, this loop is
actually hidden from you. It’s buried inside the toolkit, and listeners
appear to be called magically.

For each input event, it finds the right view in the tree (by looking
at the x,y position of the mouse) and sends the event to that
view’s listeners.

Each listener does its thing (which might involve e.g. modifying
objects in the view tree), and then returns immediately to the
event loop.
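The three steps above can be sketched as a toy event loop. Everything here is hypothetical (the View class, absolute bounding-box coordinates, click events as (x, y) pairs); real toolkits hide this loop and handle many more kinds of events.

```python
from collections import deque

class View:
    """Hypothetical view with an absolute bounding box, children, and listeners."""
    def __init__(self, name, x, y, width, height, children=()):
        self.name = name
        self.x, self.y, self.width, self.height = x, y, width, height
        self.children = list(children)
        self.listeners = []

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

    def find(self, px, py):
        """Return the deepest view whose bounding box contains the point, or None."""
        if not self.contains(px, py):
            return None
        for child in self.children:
            hit = child.find(px, py)
            if hit is not None:
                return hit
        return self

def run_event_loop(root, events):
    """Read each click, find the view under it, and call that view's listeners."""
    while events:
        px, py = events.popleft()
        target = root.find(px, py)
        if target is not None:
            for listener in target.listeners:
                listener((px, py))   # listeners should return quickly

button = View("button", 10, 10, 80, 30)
window = View("window", 0, 0, 200, 100, [button])
clicks = []
button.listeners.append(lambda pos: clicks.append(pos))

run_event_loop(window, deque([(20, 20), (150, 90)]))
print(clicks)
```

Only the first click lands inside the button's bounding box, so only that click reaches the button's listener; the second falls on the window, which has no listeners here.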

The last part – listeners return to the event loop as fast as possible –
is very important, because it preserves the responsiveness of the
user interface. We’ll come back to this later in the reading.

The Listener pattern isn’t just used for low-level input events like
mouse clicks and keyboard keypresses. Many GUI objects generate
their own higher-level events, often as a result of some combination
of low-level input events. For example:

JButton sends an action event when it is pressed (whether by
mouse or keyboard)

JList sends a selection event when the selected element changes
(whether by mouse or by keyboard)

JTextField sends change events when the text inside it changes for
any reason

A button can be pressed either by the mouse (with a mouse down
and mouse up event) or by the keyboard (which is important for
people who can’t use a mouse, like blind users). So you should
always listen for these high-level events, not the low-level input
events. Use an ActionListener to respond to a JButton press, not a
mouse listener.

reading exercises

Listeners

Put the following items in order according to when they would happen
during the execution of a Swing graphical user interface.

launchButton = new JButton("Launch the Missiles");

launchButton.addActionListener(launchMissiles);

launchMissiles ’ actionPerformed() method is called

Mouse click event on the launch button is handled by the Swing event
loop


Separating Frontend from Backend


We’ve seen how GUI programs are structured around a view tree,
and how input events are handled by attaching listeners to views. This
is the start of a separation of concerns – output handled by views,
and input handled by listeners.

But we’re still missing the application itself – the backend that
represents the data and logic that the user interface is showing and
editing. (Why do we want to separate this from the user interface?)

The Model-View-Controller pattern has this separation of concerns
as its primary goal. It separates the user interface frontend from the
application backend, by putting backend code into the model and
frontend code into the view and controller. MVC also separates input
from output; the controller is supposed to handle input, and the view
is supposed to handle output.

[Figure: Model-View-Controller pattern]

The model is responsible for maintaining application-specific data and
providing access to that data. Models are often mutable, and they
provide methods for changing the state safely, preserving its
representation invariants. OK, all mutable objects do that. But a
model must also notify its clients when there are changes to its data,
so that dependent views can update their displays, and dependent
controllers can respond appropriately. Models do this notification
using the listener pattern, in which interested views and controllers
register themselves as listeners for change events generated by the
model.

View objects are responsible for output. A view usually occupies
some chunk of the screen, usually a rectangular area. Basically, the
view queries the model for data and draws the data on the screen. It
listens for changes from the model so that it can update the screen to
reflect those changes.

Finally, the controller handles the input. It receives keyboard and
mouse events, and instructs the model to change accordingly.
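Here is a toy version of the three roles, sketched in Python with hypothetical classes TextModel, TextView, and TextController. The view records what it would draw in a list instead of painting on a screen.

```python
class TextModel:
    """Model: holds the data and notifies listeners when it changes."""
    def __init__(self):
        self._text = ""
        self._listeners = []

    def add_change_listener(self, callback):
        self._listeners.append(callback)

    def text(self):
        return self._text

    def insert(self, ch):
        self._text += ch
        for callback in self._listeners:
            callback()

class TextView:
    """View: queries the model and redraws whenever the model changes."""
    def __init__(self, model):
        self.model = model
        self.rendered = []              # stand-in for the screen
        model.add_change_listener(self.redraw)

    def redraw(self):
        self.rendered.append("[" + self.model.text() + "]")

class TextController:
    """Controller: turns input events into operations on the model."""
    def __init__(self, model):
        self.model = model

    def key_typed(self, ch):
        self.model.insert(ch)

model = TextModel()
view = TextView(model)
controller = TextController(model)
for ch in "hi":
    controller.key_typed(ch)
print(view.rendered)
```

The controller never touches the view directly: it changes the model, and the view finds out through the model's change events. A second TextView on the same model would stay in sync with no extra code.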

[Figure: Model-View-Controller pattern as shown in a JTextField]

A simple example of the MVC pattern is a text field. The figure at
right shows Java Swing’s text field, called JTextField. Its model is a
mutable string of characters. The view is an object that draws the
text on the screen (usually with a rectangle around it to indicate that
it’s an editable text field). The controller is an object that receives
keystrokes typed by the user and inserts them into the mutable
string.

Instances of the MVC pattern appear at many scales in GUI
software. At a higher level, this text field might be part of a view (like
an address book editor), with a different controller listening to it (for
text-changed events), for a different model (like the address book).
But when you drill down to a lower level, the text field itself is an
instance of MVC.

[Figure: Model-View-Controller pattern as shown in a filesystem browser]

Here’s a larger example, in which the view is a filesystem browser
(like the Mac Finder or Windows Explorer), the model is the disk
filesystem, and the controller is an input handler that translates the
user’s keystrokes and mouse clicks into operations on the model and
view.

The separation of model and view has several benefits. First, it
allows the interface to have multiple views showing the same
application data. For example, a database field might be shown in a
table and in an editable form at the same time. Second, it allows
views and models to be reused in other applications. The MVC
pattern enables the creation of user interface toolkits, which are
libraries of reusable views. Java Swing is such a toolkit. You can
easily reuse view classes from this library (like JButton and JTree)
while plugging your own models into them.

reading exercises

Model-View-Controller

Thinking about the separation of concerns implied by the
model-view-controller pattern, which of the following design decisions make
sense, knowing nothing else about the programs in question?

“All the data is kept in JTextField objects in the window, and other
classes can look it up just by getting a reference to the JTextField and
calling getText() .”

True
False


“If the view listens for ball-moved events from the pinball board, then
we can have multiple views showing the same board.”

True
False

“Let’s put the double-click listener in the model class.”

True
False


“Looks like the model is the best place to store the name of the
pinball board.”

True
False


Background Processing in Graphical User Interfaces


The last major topic for today connects back to concurrency.

First, some motivation. Why do we need to do background
processing in graphical user interfaces? Even though computer
systems are steadily getting faster, we’re also asking them to do
more. Many programs need to do operations that may take some
time: retrieving URLs over the network, running database queries,
scanning a filesystem, doing complex calculations, etc.

But graphical user interfaces are event-driven programs, which
means (generally speaking) everything is triggered by an input event
handler. For example, in a web browser, clicking a hyperlink starts
loading a new web page. But if the click handler is written so that it
actually retrieves the web page itself, then the web browser will be
very painful to use. Why? Because its interface will appear to freeze
up until the click handler finishes retrieving the web page and returns
to the event loop. Here’s why.

This happens because input handling and screen repainting are both
handled from a single thread. That thread (called the event-dispatch
thread) has a loop that reads an input event from the queue and
dispatches it to listeners on the view tree. When there are no input
events left to process, it repaints the screen. But if an input handler
you’ve written delays returning to this loop – because it’s blocking on
a network read, or because it’s searching for the solution to a big
Sudoku puzzle – then input events stop being handled, and the screen
stops updating. So long tasks need to run in the background: on a
different thread, not the event-dispatch thread.

In Java, the event-dispatch thread is distinct from the main thread of
the program (see below). It is started automatically when a user
interface object is created. As a result, every Java GUI program is
automatically multithreaded. Many programmers don’t notice,
because the main thread typically doesn’t do much in a GUI program
– it starts creation of the view, and then the main thread just exits,
leaving only the event-dispatch thread to do the main work of the
program.

The fact that Swing programs are multithreaded by default creates
risks. There’s very often a shared mutable datatype in your GUI: the
model. If you use background threads to modify the model without
blocking the event-dispatch thread, then you have to make sure your
data structure is threadsafe.

But another important shared mutable datatype in your GUI is the
view tree. Java Swing’s view tree is not threadsafe. In general, you
cannot safely call methods on a Swing object from anywhere but the
event-dispatch thread.

The view tree is a big meatball of shared state, and the Swing
specification doesn’t guarantee that there’s any lock protecting it.
Instead the view tree is confined to the event-dispatch thread, by
specification. So it’s ok to access view objects from the
event-dispatch thread (i.e., in response to input events), but the Swing
specification forbids touching – reading or writing – any JComponent
objects from a different thread. See Swing threading and the
event-dispatch thread.

In the actual Swing implementation, there is one big lock
(Component.getTreeLock()) but only some Swing methods use it, so it’s
not effective as a synchronization mechanism.

The safe way to access the view tree is to do it from the event-
dispatch thread. So Swing takes a clever approach: it uses the event
queue itself as a message-passing queue. In other words, you can
put your own custom messages on the event queue, the same queue
used for mouse clicks, keypresses, button action events, and so
forth. Your custom message is actually a piece of executable code,
an object that implements Runnable , and you put it on the queue using
SwingUtilities.invokeLater . For example:

SwingUtilities.invokeLater(new Runnable() {
    public void run() {
        content.add(thumbnail);
        ...
    }
});

invokeLater() drops this Runnable object at the end of the queue,
and when Swing’s event loop reaches it, it simply calls run(). Thus the
body of run() ends up being run by the event-dispatch thread, where it can
safely call observers and mutators on the view tree.
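
The pattern described above, slow work on a background thread followed by a message back to the event-dispatch thread, might look like the following minimal sketch. The class name InvokeLaterSketch, the method runSketch, and the StringBuilder standing in for a view object are all hypothetical; only SwingUtilities.invokeLater and SwingUtilities.isEventDispatchThread are real Swing calls.

```java
import java.util.concurrent.CountDownLatch;
import javax.swing.SwingUtilities;

class InvokeLaterSketch {
    /**
     * Does slow work on a background thread, then hands the result to the
     * event-dispatch thread with invokeLater. Returns what the "view" shows.
     */
    static String runSketch() {
        // Hypothetical stand-in for a view object; only touched on the event-dispatch thread.
        StringBuilder view = new StringBuilder();
        CountDownLatch updated = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            // Stand-in for a long computation or a network fetch.
            long sum = 0;
            for (int i = 0; i < 1000; i++) {
                sum += i;
            }
            final long result = sum;

            // Do not touch the view here; queue a message for the event loop instead.
            SwingUtilities.invokeLater(() -> {
                view.append("sum=").append(result)
                    .append(" onEventThread=")
                    .append(SwingUtilities.isEventDispatchThread());
                updated.countDown();
            });
        });
        worker.start();

        try {
            // Demo only: wait for the update so the caller can read the result.
            updated.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return view.toString();
    }
}
```

The latch exists only so this demo can report the result; a real GUI would simply return to the event loop rather than wait.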

In the Java Tutorials, read:

Concurrency in Swing (1 page)
Initial Threads (1 page)
The Event Dispatch Thread (1 page)

reading exercises

Background Processing

Suppose you’re using a graphical user interface written in Java
Swing. You press a button, and the UI just locks up – you can’t scroll,
press other buttons, even type anything into a textbox. Which of the
following are likely explanations?

Deadlock – two different parts of the UI are trying to acquire locks
on the view tree, and are deadlocking with each other.

True
False

Event queue blocking – the event loop is waiting for an input event
on the event queue, but the queue is empty, so nothing is happening
in the program.

True
False


Too much work in the event-dispatch thread – the UI is doing a lot
of computation in response to your button press, and it hasn’t
returned to the event loop to handle more input events yet.

True
False

Network delay on the event-dispatch thread – the UI is trying to
fetch data from the network in response to your button press, and it
hasn’t returned to the event loop to handle more input events yet.

True
False

Summary
The view tree organizes the screen into a tree of nested
rectangles, and it is used in dispatching input events as well as
displaying output.

The Listener pattern sends a stream of events (like mouse or
keyboard events, or button action events) to registered listeners.

The Model-View-Controller pattern separates responsibilities:
model = data, view = output, controller = input.

Long-running processing should be moved to a background
thread, but the Swing view tree is confined to the event-dispatch
thread. So accessing Swing objects from another thread requires
using the event loop as a message-passing queue, to get back to
the event-dispatch thread.

Reading 26: Little Languages
Software in 6.005

Safe from bugs: Correct today and correct in the unknown future.
Easy to understand: Communicating clearly with future programmers, including future you.
Ready for change: Designed to accommodate change without rewriting.

Objectives

In this reading we will begin to explore the design of a little language
for constructing and manipulating music. Here’s the bottom line: when
you need to solve a problem, instead of writing a program to solve
just that one problem, build a language that can solve a range of
related problems.

The goal for this reading is to introduce the idea of representing
code as data and familiarize you with an initial version of the music
language.

Representing code as data


Recall the Formula datatype from Recursive Data Types:

Formula = Variable(name:String)
        + Not(formula:Formula)
        + And(left:Formula, right:Formula)
        + Or(left:Formula, right:Formula)

We used instances of Formula to take propositional logic formulas, e.g.
(p ∨ q) ∧ (¬p ∨ r), and represent them in a data structure, e.g.:

And(Or(Variable("p"), Variable("q")),
    Or(Not(Variable("p")), Variable("r")))

In the parlance of grammars and parsers, formulas are a language,
and Formula is an abstract syntax tree.

But why did we define a Formula type? Java already has a way to
represent expressions of Boolean variables with logical and, or, and
not. For example, given boolean variables p , q , and r :

(p || q) && ((!p) || r)

Done!

The answer is that the Java code expression (p || q) && ((!p) || r) is
evaluated as soon as we encounter it in our running program. The
Formula value And(Or(...), Or(...)) is a first-class value that can be
stored, passed and returned from one method to another,
manipulated, and evaluated now or later (or more than once) as
needed.
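
To make "evaluated now or later (or more than once)" concrete, here is a minimal sketch of the Formula variants with an evaluate operation. The evaluate method, the Map-based environment, and the helper names example and evaluateWith are assumptions made for this illustration; they are not taken from the Recursive Data Types reading.

```java
import java.util.HashMap;
import java.util.Map;

class FormulaSketch {
    interface Formula {
        /** Hypothetical operation: evaluate given a truth value for each variable name. */
        boolean evaluate(Map<String, Boolean> env);
    }

    static final class Variable implements Formula {
        private final String name;
        Variable(String name) { this.name = name; }
        @Override public boolean evaluate(Map<String, Boolean> env) { return env.get(name); }
    }

    static final class Not implements Formula {
        private final Formula formula;
        Not(Formula formula) { this.formula = formula; }
        @Override public boolean evaluate(Map<String, Boolean> env) { return !formula.evaluate(env); }
    }

    static final class And implements Formula {
        private final Formula left, right;
        And(Formula left, Formula right) { this.left = left; this.right = right; }
        @Override public boolean evaluate(Map<String, Boolean> env) {
            return left.evaluate(env) && right.evaluate(env);
        }
    }

    static final class Or implements Formula {
        private final Formula left, right;
        Or(Formula left, Formula right) { this.left = left; this.right = right; }
        @Override public boolean evaluate(Map<String, Boolean> env) {
            return left.evaluate(env) || right.evaluate(env);
        }
    }

    /** Builds (p or q) and (not p or r) once, as a value; nothing is evaluated yet. */
    static Formula example() {
        return new And(new Or(new Variable("p"), new Variable("q")),
                       new Or(new Not(new Variable("p")), new Variable("r")));
    }

    /** Evaluates a stored formula under one assignment; can be called repeatedly. */
    static boolean evaluateWith(Formula f, boolean p, boolean q, boolean r) {
        Map<String, Boolean> env = new HashMap<>();
        env.put("p", p);
        env.put("q", q);
        env.put("r", r);
        return f.evaluate(env);
    }
}
```

The same Formula value returned by example() can be stored and evaluated under as many different assignments as needed, which the plain Java expression cannot do without being wrapped in a method.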

The Formula type is an example of representing code as data, and
we’ve seen many more.

Consider this functional object:

class VariableNameComparator implements Comparator<Variable> {
    public int compare(Variable v1, Variable v2) {
        return v1.name().compareTo(v2.name());
    }
}

An instance of VariableNameComparator is a value that can be passed
around, returned, and stored. But at any time, the function that it
represents can be invoked by calling its compare method with a couple
of Variable arguments:

Variable v1, v2;
Comparator<Variable> c = new VariableNameComparator();
...
int a = c.compare(v1, v2);
int b = c.compare(v2, v1);
SortedSet<Variable> vars = new TreeSet<>(c); // vars is sorted by name

Lambda expressions allow us to create functional objects with a
compact syntax:

Comparator<Variable> c = (v1, v2) -> v1.name().compareTo(v2.name());

Building languages to solve problems


When we define an abstract data type, we’re extending the universe
of built-in types provided by Java to include a new type, with new
operations, appropriate to our problem domain. This new type is like
a new language: a new set of nouns (values) and verbs (operations)
we can manipulate. Of course, those nouns and verbs are
abstractions built on top of the existing nouns and verbs, which were
themselves already abstractions.

A language has greater flexibility than a mere program, because we
can use a language to solve a large class of related problems,
instead of just a single problem.

That’s the difference between writing (p || q) && ((!p) || r) and
devising a Formula type to represent the semantically-equivalent
Boolean formula.

And it’s the difference between writing a matrix multiplication
function and devising a MatrixExpression type to represent matrix
multiplications — and store them, manipulate them, optimize them,
evaluate them, and so on.

First-class functions and functional objects enable us to create
particularly powerful languages because we can capture patterns of
computation as reusable abstractions.

Music language
In class, we will design and implement a language for generating and
playing music. To prepare, let’s first understand the Java APIs for
playing music with the MIDI synthesizer. We’ll see how to write a
program to play MIDI music. Then we’ll begin to develop our music
language by writing a recursive abstract data type for simple musical
tunes. We’ll choose a notation for writing music in strings, and we’ll
implement a parser to create instances of our Music type.

The full source code for the basic music language is on GitHub.

Clone the fa16-ex26-music-starting repo so you can run the code and
follow the discussion below.

Playing MIDI music


music.midi.MidiSequencePlayer uses the Java MIDI APIs to play
sequences of notes. It’s quite a bit of code, and you don’t need to
understand how it works.

MidiSequencePlayer implements the music.SequencePlayer interface,
allowing clients to use it without depending on the particular MIDI
implementation. We do need to understand this interface and the
types it depends on:

addNote : SequencePlayer × Instrument × Pitch × double × double → void
(SequencePlayer.java:15) is the workhorse of our music player.
Calling this method schedules a musical pitch to be played at some
time during the piece of music.

play : SequencePlayer → void (SequencePlayer.java:20) actually plays
the music. Until we call this method, we’re just scheduling music that
will, eventually, be played.

The addNote operation depends on two more types:

Instrument is an enumeration of all the available MIDI instruments.

Pitch is an abstract data type for musical pitches (think keys on the
piano keyboard).

Read and understand the Pitch documentation and the specifications
for its public constructor and all its public methods.

Our music data type will rely on Pitch in its rep, so be sure to
understand the Pitch spec as well as its rep and abstraction function.
Using the MIDI sequence player and Pitch , we’re ready to write code
for our first bit of music!

Read and understand the music.examples.ScaleSequence code.

Run the main method in ScaleSequence. You should hear a one-octave
scale!

reading exercises

Pitch

Which observers could MidiSequencePlayer use to determine what
frequency an arbitrary Pitch represents?

transpose(int)
difference(Pitch)
value()
equals(Object)
toString()


transpose

Pitch.transpose(int) is a:

creator
producer
observer
mutator


addNote

SequencePlayer.addNote(..) is a:

creator
producer
observer
mutator


Music data type

The Pitch datatype is useful, but if we want to represent a whole
piece of music using Pitch objects, we should create an abstract data
type to encapsulate that representation.

To start, we’ll define the Music type with a few operations:

notes : String × Instrument → Music (MusicLanguage.java:51) makes a
new Music from a string of simplified abc notation, described below.

duration : Music → double (Music.java:11) returns the duration, in beats,
of the piece of music.

play : Music × SequencePlayer × double → void (Music.java:18) plays the
piece of music using the given sequence player.

We’ll implement duration and play as instance methods of Music , so we
declare them in the Music interface.

notes will be a static factory method; rather than put it in Music (which
we could do), we’ll put it in a separate class: MusicLanguage will be our
place for all the static methods we write to operate on Music .

Now that we’ve chosen some operations in the spec of Music , let’s
choose a representation.

Looking at ScaleSequence, the first concrete variant that might jump
out at us is one to capture the information in each call to addNote: a
particular pitch on a particular instrument played for some amount
of time. We’ll call this a Note.

The other basic element of music is the silence between notes:
Rest.

Finally, we need a way to glue these basic elements together into
larger pieces of music. We’ll choose a tree-like structure:
Concat(m1,m2:Music) represents m1 followed by m2, where m1 and m2
are any music.

This tree structure turns out to be an elegant decision as we
further develop our Music type later on. In a real design process,
we might iterate on the recursive structure of Music before we find
the best implementation.

Here’s the datatype definition:

Music = Note(duration:double, pitch:Pitch, instrument:Instrument)
      + Rest(duration:double)
      + Concat(m1:Music, m2:Music)

Composite

Music is an example of the composite pattern, in which we treat both
single objects (primitives, e.g. Note and Rest) and groups of objects
(composites, e.g. Concat) the same way.

Formula is also an example of the composite pattern.

The GUI view tree relies heavily on the composite pattern: there
are primitive views like JLabel and JTextField that don’t have
children, and composite views like JPanel and JScrollPane that do
contain other views as children. Both extend the common
JComponent class.

The composite pattern gives rise to a tree data structure, with
primitives at the leaves and composites at the internal nodes.

Emptiness

One last design consideration: how do we represent the empty
music? It’s always good to have a representation for nothing, and
we’re certainly not going to use null.

We could introduce an Empty variant, but instead we’ll use a Rest of
duration 0 to represent emptiness.
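
As a rough illustration of the datatype definition and the choice of Rest(0) for emptiness, here is a minimal sketch with only the duration operation. It is not the code in the course repository: the wrapper class MusicSketch, the empty() helper, and the use of String in place of Pitch and Instrument are simplifying assumptions.

```java
class MusicSketch {
    interface Music {
        /** @return duration of this music, in beats */
        double duration();
    }

    static final class Note implements Music {
        private final double duration;
        private final String pitch;       // stand-in for the Pitch type
        private final String instrument;  // stand-in for the Instrument enum
        Note(double duration, String pitch, String instrument) {
            this.duration = duration;
            this.pitch = pitch;
            this.instrument = instrument;
        }
        @Override public double duration() { return duration; }
    }

    static final class Rest implements Music {
        private final double duration;
        Rest(double duration) { this.duration = duration; }
        @Override public double duration() { return duration; }
    }

    static final class Concat implements Music {
        private final Music m1;
        private final Music m2;
        Concat(Music m1, Music m2) {
            this.m1 = m1;
            this.m2 = m2;
        }
        // A composite's duration is the sum of its parts.
        @Override public double duration() { return m1.duration() + m2.duration(); }
    }

    /** Hypothetical helper: the empty music is a rest of duration 0. */
    static Music empty() {
        return new Rest(0);
    }
}
```

Because the empty music is just another Rest, Concat needs no special case for it: concatenating empty() with any music leaves the duration unchanged.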

Implementing basic operations


First we need to create the Note, Rest, and Concat variants. All three
are straightforward to implement, starting with constructors, checkRep,
some observers, toString, and the equality methods.

Since the duration operation is an instance method, each variant
implements duration appropriately.

The play operation is also an instance method; we’ll discuss it
below under implementing the player.

And we’ll discuss the notes operation in implementing the parser.

Read and understand the Note , Rest , and Concat classes.

To avoid representation exposure, let’s add some additional static
factory methods to MusicLanguage:

note : double × Pitch × Instrument → Music (MusicLanguage.java:92)

rest : double → Music (MusicLanguage.java:100)

concat : Music × Music → Music (MusicLanguage.java:113) is our first
producer operation.

All three of them are easy to implement by constructing the
appropriate variant.

reading exercises

Music rep

Assume we have
import music.*;
import static music.Instrument.*;
import static music.MusicLanguage.*;

Which of the following represent a middle C followed by A above
middle C?

new Concat(new Pitch('C'), new Pitch('A'))

new Concat(new Note(1, new Pitch('C'), PIANO),
           new Note(1, new Pitch('A'), PIANO))

concat(note(1, new Pitch('C'), PIANO),
       note(1, new Pitch('A'), PIANO))

concat(rest(0),
       concat(note(1, new Pitch('C'), PIANO),
              note(1, new Pitch('A'), PIANO)))

concat(concat(rest(0),
              note(1, new Pitch('C'), PIANO)),
       note(1, new Pitch('A'), PIANO))

Music notation

We will write pieces of music using a simplified version of abc
notation, a text-based music format.

We’ve already been representing pitches using their familiar letters.
Our simplified abc notation represents sequences of notes and rests
with syntax for indicating their duration, accidental (sharp or flat),
and octave.

For example:
C D E F G A B C' B A G F E D C represents the one-octave ascending and
descending C major scale we played in ScaleSequence . C is middle C,
and C' is C one octave above middle C. Each note is a quarter note.

C/2 D/2 _E/2 F/2 G/2 _A/2 _B/2 C' is the ascending scale in C minor,
played twice as fast. The E, A, and B are flat. Each note is an eighth
note.

Read and understand the specification of notes in MusicLanguage .

You don’t need to understand the parser implementation yet, but you
should understand the simplified abc notation enough to make sense
of the examples.

If you’re not familiar with music theory — why is an octave 8 notes
but only 12 semitones? — don’t worry. You might not be able to look
at the abc strings and guess what they sound like, but you can
understand the point of choosing a convenient textual syntax.

reading exercises

Simplified abc syntax

Which of these notes are twice as long as E/4 ?

E2/4
E1/2
E/2
B'/2
B''/2
C,/2
C,,/2
_D/2
^E/2


Implementing the parser

The notes method parses strings of simplified abc notation into Music .

notes : String × Instrument → Music (MusicLanguage.java:51) splits the
input into individual symbols (e.g. A,,/2 , .1/2 ). We start with the empty
Music, rest(0), parse each symbol individually, and build up the
Music using concat.

parseSymbol : String × Instrument → Music (MusicLanguage.java:62)
returns a Rest or a Note for a single abc symbol (symbol in the
grammar). It only parses the type (rest or note) and duration; it relies
on parsePitch to handle pitch letters, accidentals, and octaves.

parsePitch : String → Pitch (MusicLanguage.java:77) returns a Pitch by
parsing a pitch grammar production. You should be able to
understand the recursion — what’s the base case? What are the
recursive cases?
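
The build-up described for notes (start from rest(0), parse one symbol at a time, combine with concat) can be traced with a small sketch. To stay self-contained it builds strings that describe the tree rather than real Music objects, and it assumes symbols are separated by whitespace and that a leading "." marks a rest, as in the ".1/2" example. The class and method bodies here are hypothetical and much simpler than the real MusicLanguage parser.

```java
class NotesShapeSketch {
    /** Describes a single symbol; a leading '.' is assumed to mark a rest. */
    static String parseSymbol(String symbol) {
        if (symbol.startsWith(".")) {
            return "rest(" + symbol.substring(1) + ")";
        }
        return "note(" + symbol + ")";
    }

    /** Folds the symbols left to right, starting from the empty music rest(0). */
    static String notes(String input) {
        String music = "rest(0)";
        for (String symbol : input.trim().split("\\s+")) {
            if (!symbol.isEmpty()) {
                // Each step wraps everything built so far in one more concat.
                music = "concat(" + music + ", " + parseSymbol(symbol) + ")";
            }
        }
        return music;
    }
}
```

Under these assumptions, notes("C D .1/2") yields concat(concat(concat(rest(0), note(C)), note(D)), rest(1/2)), a tree that leans to the left with the empty music at its deepest leaf.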

reading exercises

parsePitch

Which of these inputs is handled by the base case of parsePitch ?

C
_C
C'
C/2
.
a single space
a single vertical bar


Implementing the player

Recall our operation for playing music:

play : Music × SequencePlayer × double → void (Music.java:18) plays the
piece of music using the given sequence player after the given
number of beats delay.

Why does this operation take atBeat ? Why not simply play the music
now?

If we define play in that way, we won’t be able to play sequences of
notes over time unless we actually pause during the play operation,
for example with Thread.sleep. Our sequence player’s addNote operation
is already designed to schedule notes in the future — it handles the
delay.

With that design decision, it’s straightforward to implement play in
every variant of Music.

Read and understand the Note.play, Rest.play, and Concat.play
methods.
You should be able to follow their recursive implementations.
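
To show why passing atBeat makes the recursion simple, here is a minimal sketch of play on the three variants. RecordingPlayer is a hypothetical stand-in for SequencePlayer that only records what was scheduled, and pitches are plain strings; the real interfaces in the repository take Instrument and Pitch arguments and may differ in other details.

```java
import java.util.ArrayList;
import java.util.List;

class PlaySketch {
    /** Hypothetical stand-in for SequencePlayer: records each scheduled note. */
    static final class RecordingPlayer {
        final List<String> scheduled = new ArrayList<>();
        void addNote(String pitch, double startBeat, double numBeats) {
            scheduled.add(pitch + "@" + startBeat + "+" + numBeats);
        }
    }

    interface Music {
        double duration();
        /** Schedules this music on the player, starting at beat atBeat. */
        void play(RecordingPlayer player, double atBeat);
    }

    static final class Note implements Music {
        private final String pitch;
        private final double duration;
        Note(String pitch, double duration) { this.pitch = pitch; this.duration = duration; }
        @Override public double duration() { return duration; }
        @Override public void play(RecordingPlayer player, double atBeat) {
            player.addNote(pitch, atBeat, duration);
        }
    }

    static final class Rest implements Music {
        private final double duration;
        Rest(double duration) { this.duration = duration; }
        @Override public double duration() { return duration; }
        @Override public void play(RecordingPlayer player, double atBeat) {
            // Silence: nothing to schedule.
        }
    }

    static final class Concat implements Music {
        private final Music m1;
        private final Music m2;
        Concat(Music m1, Music m2) { this.m1 = m1; this.m2 = m2; }
        @Override public double duration() { return m1.duration() + m2.duration(); }
        @Override public void play(RecordingPlayer player, double atBeat) {
            m1.play(player, atBeat);
            // m2 starts when m1 ends; no sleeping is needed.
            m2.play(player, atBeat + m1.duration());
        }
    }
}
```

In this sketch a Rest contributes nothing except its duration, which Concat uses to shift the start beat of whatever follows it.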

Just one more piece of utility code before we’re ready to jam:
music.midi.MusicPlayer plays a Music using the MidiSequencePlayer. Music
doesn’t know about the concrete type of the sequence player, so we
need a bit of code to bring them together.

Bringing this all together, let’s use the Music ADT:

Read and understand the music.examples.ScaleMusic code.

Run the main method in ScaleMusic . You should hear the same one-
octave scale again.

That’s not very exciting, so read music.examples.RowYourBoatInitial and
run the main method. You should hear Row, row, row your boat!

Can you follow the flow of the code from calling notes(..) to having an
instance of Music to the recursive play(..) call to individual addNote(..)
calls?

reading exercises

notes

There are 27 notes in Row, row, row your boat.

Given the actual implementation, how many Music objects will be
created by the notes call in RowYourBoatInitial?

27
28
29
54
55
56
more than 56


duration

What should be the result of rowYourBoat.duration() ?


Music

Assume we have

import music.*;
import static music.Instrument.*;
import static music.MusicLanguage.*;

And

Music r = rest(1);
Pitch p = new Pitch('A').transpose(6);
Music n = note(1, p, GLOCKENSPIEL);
List<Music> s = Arrays.asList(r, n);

Which of the following is a valid Music ?

r
concat(r, r)
concat(r, r, r)
p
n
s


To be continued
Playing Row, row, row your boat is pretty exciting, but so far the most
powerful thing we’ve done is not so much the music language as it is
the very basic music parser. Writing music using the simplified abc
notation is clearly much more easy to understand, safe from bugs,
and ready for change than writing page after page of addNote addNote
addNote …

In class, we’ll expand our music language and turn it into a powerful
tool for constructing and manipulating complex musical structures.
Reading 27: Team Version Control
Software in 6.005

Safe from bugs: Correct today and correct in the unknown future.
Easy to understand: Communicating clearly with future programmers, including future you.
Ready for change: Designed to accommodate change without rewriting.

Objectives

Review Git basics and the commit graph
Practice multi-user Git scenarios

Git workflow
You’ve been using Git for problem sets and in-class exercises for a while now.
Most of the time, you haven’t had to coordinate with other people pushing and
pulling to and from the same repository as you at the same time. For the group
projects, that will change.

In this reading, prepare for some in-class Git exercises by reviewing what you
know and brushing up on some commands. Now that you’re more comfortable
with Git basics, it’s a good time to go back and review some of the resources
from the beginning of the semester.

Review Inventing version control: one developer, multiple developers, and
branches.

If you need to, review Learn the Git workflow from the Getting Started page.

Viewing commit history


Review 2.3 Viewing the Commit History from Pro Git.
You don’t need to remember all the different command-line options presented in
the book! Instead, learn what’s possible so you know what to search for when
you need it.

Clone the example repo from Version Control:

https://github.com/mit6005/fa16-ex05-hello-git.git

Use log commands to make sure you understand the history of the repo.

Graph of commits
Recall that the history recorded in a Git repository is a directed acyclic graph.
The history of any particular branch in the repo (such as the default master
branch) starts at some initial commit, and then its history may split apart and
come back together, if multiple developers made changes in parallel (or if a
single developer worked on two different machines without
committing-pushing-pulling before the switch).

Here’s the output of git lol for the example repository, which shows an ASCII-
art graph:

* b0b54b3 (HEAD, origin/master, origin/HEAD, master) Greeting in Java
* 3e62e60 Merge
|\
| * 6400936 Greeting in Scheme
* | 82e049e Greeting in Ruby
|/
* 1255f4e Change the greeting
* 41c4b8f Initial commit

And here is a diagram of the DAG:

In the ex05-hello-git example repo, make sure you can explain where the history
of master splits apart, and where it comes back together.
Review Merging from the Version Control reading.

You should understand every step of the process, and how it relates to the
result in the example repo.

Review the Getting Started section on merges, including merging and merge
conflicts.

reading exercises

Merge

Alice and Bob both start with the same Java file:

public class Hello {
    public static void greet(String name) {
        System.out.println(greeting() + ", " + name);
    }
    public static String greeting() {
        return "Hello";
    }
}

Alice changes greet(..):

public static void greet(String name) {
    System.out.println(greeting() +
                       ", " + name + "!");
}

Bob changes greeting():

public static String greeting() {
    return "Ciao";
}

If Git merges the changes of Alice and Bob, what is the result of
Hello.greet("Eve") ?
Hello, Eve
Hello, Eve!
Ciao, Eve
Ciao, Eve!
we can automatically merge, but the resulting code is broken (static error)
we can automatically merge, but the resulting code is broken (dynamic error)
we can automatically merge, but the resulting code is broken (no error,
wrong answer)
we cannot automatically merge the changes


Dangerous Merge Ahead

Same starting program:

public class Hello {
    public static void greet(String name) {
        System.out.println(greeting() + ", " + name);
    }
    public static String greeting() {
        return "Hello";
    }
}

Alice changes greeting():

public static String greeting() {
    return "Ciao";
}

Bob changes how the functions work together:

public static void greet(String name) {
    greeting();
    System.out.println(", " + name);
}
public static void greeting() {
    System.out.println("Hello");
}

If Git merges the changes of Alice and Bob, what is the result of
Hello.greet("Eve") ?
Hello, Eve
Hello, Eve!
Ciao, Eve
Ciao, Eve!
we can automatically merge, but the resulting code is broken (static error)
we can automatically merge, but the resulting code is broken (dynamic error)
we can automatically merge, but the resulting code is broken (no error,
wrong answer)
we cannot automatically merge the changes


Continue Merging

Same starting program:

public class Hello {
    public static void greet(String name) {
        System.out.println(greeting() + ", " + name);
    }
    public static String greeting() {
        return "Hello";
    }
}

Alice changes greet(..) to return instead of print:

public static String greet(String name) {
    return greeting() + ", " + name;
}

Bob creates a new file, Main.java :

public class Main {
    public static void main(String[] args) {
        // print a greeting to Eve
        Hello.greet("Eve");
    }
}

If Git merges the changes of Alice and Bob, what is the result of running main ?
Hello, Eve
Hello, Eve!
Ciao, Eve
Ciao, Eve!

we can automatically merge, but the resulting code is broken (static error)
we can automatically merge, but the resulting code is broken (dynamic error)
we can automatically merge, but the resulting code is broken (no error,
wrong answer)
we cannot automatically merge the changes


Using version control in a team


Every team develops its own standards for version control, and the size of the
team and the project they’re working on is a major factor. Here are some
guidelines for a small-scope team project of the kind you will undertake in
6.005:

Communicate. Tell your teammates what you’re going to work on. Tell them
that you’re working on it. And tell them that you worked on it. Communication
is the best way to avoid wasted time and effort cleaning up broken code.

Write specs. Necessary for the things we care about in 6.005, and part of
good communication.

Write tests. Don’t wait for a giant pile of code to accumulate before you try
to test it. Avoid having one person write tests while another person writes
implementation (unless the implementation is a prototype you plan to throw
away). Write tests first to make sure you agree on the specs. Everyone
should take responsibility for the correctness of their code.

Run the tests. Tests can’t help you if you don’t run them. Run them before
you start working, run them again before you commit.

Automate. You’ve already automated your tests with a tool like JUnit, but
now you want to automate running those tests whenever the project
changes. For 6.005 group projects, we provide Didit as a way to
automatically run your tests every time a team member pushes to Athena.
This also removes “it worked on my machine” from the equation: either it
works in the automated build, or it needs to be fixed.

Review what you commit. Use git diff --staged or a GUI program to see
what you’re about to commit. Run the tests. Don’t use commit -a: that’s a
great way to fill your repo with printlns and other stuff you didn’t mean to
commit. Don’t annoy your teammates by committing code that doesn’t
compile, spews debug output, isn’t actually used, etc.

Pull before you start working. Otherwise, you probably don’t have the
latest version as your starting point — you’re editing an old version of the
code! You’re guaranteed to have to merge your changes later, and you’re in
danger of having to waste time resolving a merge conflict.

Sync up. At the end of a day or at the end of a work session, make sure
everyone has pushed and pulled all the changes, you’re all at the same
commit, and everyone is satisfied with the state of the project.

We don’t recommend using features like branching or rebasing for 6.005-sized
projects.

We do strongly recommend working together in the same place at the same
time, especially if this is your first group software engineering experience.

reading exercises

Team version control

Which of these demonstrate good team software development practice?

Pushing small commits, one for each file changed during some work
Pushing small commits, one for each different change to the project
Pushing small commits, including intermediate work that doesn’t compile
Always committing all changes (for example, git commit -a)
Always reformatting any file you edit (indentation, braces, etc.)
Always pulling changes from the remote repo before working on the project

Team project
Most class times during the project phase will be devoted to group work.

These classes are required, just as normal classes are, and you must check in
with your project mentor TA during class.
