Project English
The Analytical Engine has no pretensions whatever to originate any thing. It can do whatever we know how to order it
to perform. It can follow analysis; but it has no power of anticipating any analytical relations or truths. Its province is to
assist us in making available what we are already acquainted with.
Augusta Ada, Countess of Lovelace, in Notes on the Analytical Engine, 1843
What distinguishes a computer from other machines is its programmability. Without a program, a
computer is an overpriced door stopper. With the right program, though, a computer can be a tool
for communicating across the continent, discovering a new molecule that can cure cancer,
composing a symphony, or managing the logistics of a retail empire.
Programming is the act of writing instructions that make the computer do something useful. It is an
intensely creative activity, involving aspects of art, engineering, and science. Good programs are
written to be executed efficiently by computers, but also to be read and understood by humans. The
best programs are delightful in ways similar to the best architecture, elegant in both form and
function.
The ideal programmer would have the vision of Isaac Newton, the intellect of Albert Einstein, the
creativity of Miles Davis, the aesthetic sense of Maya Lin, the wisdom of Benjamin Franklin, the
literary talent of William Shakespeare, the oratorical skills of Martin Luther King, the audacity of
John Roebling, and the self-confidence of Grace Hopper.
Fortunately, it is not necessary to possess all of those rare qualities to be a good programmer!
Indeed, anyone who is able to master the intellectual challenge of learning a language (which,
presumably, anyone who has gotten this far has done at least for English) can become a good
programmer. Since programming is a new way of thinking, many people find it challenging and
even frustrating at first. Because the computer does exactly what it is told, a small mistake in a
program may prevent it from working as intended. With a bit of patience and persistence, however,
the tedious parts of programming become easier, and you will be able to focus your energies on the
fun and creative problem-solving parts.
In the previous chapter, we explored the components of language and mechanisms for defining
languages. In this chapter, we explain why natural languages are not a satisfactory way to define
procedures, introduce a language for programming computers, and show how it can be used to define
procedures.
We begin by surveying several of the reasons natural languages fall short for this purpose. We use
specifics from English, although all natural languages suffer from these problems to varying degrees.
Complexity. Although English may seem simple to you now, it took many years of intense effort
(most of it subconscious) for you to learn it. Despite using it for most of their waking hours for
many years, native English speakers know a small fraction of the entire language. The Oxford
English Dictionary contains 615,000 words, of which a typical native English speaker knows about
40,000.
Ambiguity. Not only do natural languages have huge numbers of words, most words have many
different meanings. Understanding the intended meaning of an utterance requires knowing the
context, and sometimes pure guesswork. For example, what does it mean to be paid biweekly?
According to the American Heritage Dictionary, biweekly has two definitions: (1) happening every
two weeks, and (2) happening twice a week.
So, depending on which definition is intended, someone who is paid biweekly could either be paid
once or four times every two weeks! The behavior of a payroll management program had better not
depend on how biweekly is interpreted. Even if we can agree on the definition of every word, the
meaning of a sentence is often ambiguous. This particularly difficult example is taken from the
instructions with a shipment of ballistic missiles from the British Admiralty:
It is necessary for technical reasons that these warheads be stored upside down, that is, with the
top at the bottom and the bottom at the top. In order that there be no doubt as to which is the
bottom and which is the top, for storage purposes, it will be seen that the bottom of each warhead
has been labeled ’TOP’.
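To see why a payroll program cannot leave the meaning of biweekly to interpretation, the two readings can be turned into arithmetic. The following Python sketch is illustrative only; the names `payments_in`, `EVERY_TWO_WEEKS`, and `TWICE_A_WEEK` are assumptions made for this example and do not come from any real payroll system.

```python
# Each reading of "biweekly" is written as (payments, days): that many
# payments occur in every period of that many days.
EVERY_TWO_WEEKS = (1, 14)   # reading 1: once every two weeks
TWICE_A_WEEK = (2, 7)       # reading 2: twice every week


def payments_in(days, reading):
    """Count the payments made over `days` days under one reading."""
    payments, period = reading
    return days * payments // period


two_weeks = 14
print(payments_in(two_weeks, EVERY_TWO_WEEKS))  # 1
print(payments_in(two_weeks, TWICE_A_WEEK))     # 4
```

The same word produces a fourfold difference in the number of payments, which is why a program needs the intended reading stated explicitly.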
Irregularity. Because natural languages evolve over time as different cultures interact and speakers
misspeak and listeners mishear, natural languages end up a morass of irregularity. Nearly all
grammar rules have exceptions. For example, English has a rule that we can make a word plural by
appending an s. The new word means “more than one of the original word’s meaning”. This rule
works for most words: word ↦ words, language ↦ languages, person ↦ persons. It does not
work for all words, however. The plural of goose is geese (and gooses is not an English word), the
plural of deer is deer (and deers is not an English word), and the plural of beer is controversial (and
may depend on whether you speak American English or Canadian English). These irregularities can
be charming for a natural language, but they are a constant source of difficulty for non-native
speakers attempting to learn a language. There is no sure way to predict when the rule can be
applied, and it is necessary to memorize each of the irregular forms.
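The cost of irregularity shows up as soon as we try to write the pluralization rule down as a procedure. The Python sketch below is a minimal illustration, not a usable pluralizer: the function name `pluralize` and the table `IRREGULAR_PLURALS` are assumptions, and the table holds only the irregular examples mentioned above.

```python
# Irregular forms cannot be predicted by the rule, so they are stored
# in a table. This table holds only the examples from the text and is
# nowhere near complete.
IRREGULAR_PLURALS = {
    "goose": "geese",
    "deer": "deer",
}


def pluralize(word):
    """Return a memorized irregular plural if one exists, else append 's'."""
    if word in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[word]
    return word + "s"


print(pluralize("word"))   # words
print(pluralize("goose"))  # geese
print(pluralize("deer"))   # deer
```

The rule itself takes one line; everything else is a list of exceptions that would have to keep growing to cover the rest of the language.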
Uneconomic. It requires a lot of space to express a complex idea in a natural language. Many
superfluous words are needed for grammatical correctness, even though they do not contribute to
the desired meaning. Since natural languages evolved for everyday communication, they are not
well suited to describing the precise steps and decisions needed in a computer program.
As an example, consider a procedure for finding the maximum of two numbers. In English, we
could describe it like this:
To find the maximum of two numbers, compare them. If the first number is greater than the second
number, the maximum is the first number. Otherwise, the maximum is the second number.
Perhaps shorter descriptions are possible, but any much shorter description probably assumes the
reader already knows a lot. By contrast, we can express the same steps in the Scheme programming
language in a very concise way:
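The steps of the English description can be sketched in Python, the language used in the later parts of the book. This is only an illustration of how directly the description maps onto code; the function name `bigger` is an assumption made for the example.

```python
def bigger(a, b):
    """Return the maximum of two numbers by comparing them."""
    if a > b:
        # The first number is greater, so it is the maximum.
        return a
    else:
        # Otherwise, the maximum is the second number.
        return b


print(bigger(3, 7))  # 7
print(bigger(7, 3))  # 7
```

Each clause of the English description corresponds to one line of the procedure, with none of the extra words that grammar requires.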
Limited means of abstraction. Natural languages provide small, fixed sets of pronouns to use as
means of abstraction, and the rules for binding pronouns to meanings are often unclear. Since
programming often involves using simple names to refer to complex things, we need more powerful
means of abstraction than natural languages provide.
For programming computers, we want simple, unambiguous, regular, and economical languages
with powerful means of abstraction. A programming language is a language that is designed to be
read and written by humans to create programs that can be executed by computers.
Programming languages come in many flavors. It is difficult to simultaneously satisfy all desired
properties since simplicity is often at odds with economy. Every feature that is added to a language
to increase its expressiveness incurs a cost in reducing simplicity and regularity. For the first two
parts of this book, we use the Scheme programming language, which was designed primarily for
simplicity. For the later parts of the book, we use the Python programming language, which
provides more expressiveness but at the cost of some added complexity.
Another reason there are many different programming languages is that they are at different levels
of abstraction. Some languages provide programmers with detailed control over machine resources,
such as selecting a particular location in memory where a value is stored. Other languages hide
most of the details of the machine operation from the programmer, allowing them to focus on
higher-level actions.
Ultimately, we want a program the computer can execute. This means at the lowest level we need
languages the computer can understand directly. At this level, the program is just a sequence of bits
encoding machine instructions. Code at this level is not easy for humans to understand or write, but
it is easy for a processor to execute quickly. The machine code encodes instructions that direct the
processor to take simple actions like moving data from one place to another, performing simple
arithmetic, and jumping around to find the next instruction to execute.
For example, the bit sequence 1110101111111110 encodes an instruction in the Intel x86
instruction set (used on most PCs) that instructs the processor to jump backwards two locations.
Since the instruction itself requires two locations of space, jumping back two locations actually
jumps back to the beginning of this instruction. Hence, the processor gets stuck running forever
without making any progress.
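The arithmetic behind this example can be checked with a short Python sketch. It assumes the first eight bits (11101011, hexadecimal EB) are the opcode of the x86 short jump and the last eight bits (11111110) are a signed displacement counted from the end of the instruction. The function name `short_jump_target` and the starting address 100 are hypothetical, chosen only for illustration.

```python
SHORT_JUMP_OPCODE = 0xEB   # bit pattern 11101011
INSTRUCTION_LENGTH = 2     # one opcode byte plus one displacement byte


def short_jump_target(bits, address):
    """Return the location executed next after a short jump at `address`."""
    if len(bits) != 16:
        raise ValueError("expected exactly 16 bits")
    opcode = int(bits[:8], 2)
    displacement = int(bits[8:], 2)
    if opcode != SHORT_JUMP_OPCODE:
        raise ValueError("not a short jump instruction")
    # Interpret the displacement byte as a two's-complement signed number.
    if displacement >= 128:
        displacement -= 256
    # The displacement is measured from the end of this instruction.
    return address + INSTRUCTION_LENGTH + displacement


print(short_jump_target("1110101111111110", 100))  # 100
```

The target equals the address of the instruction itself, so the processor executes the same jump again and again.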
The computer’s processor is designed to execute very simple instructions like jumping, adding two
small numbers, or comparing two values. This means each instruction can be executed very quickly.
A typical modern processor can execute billions of instructions in a second.
Until the early 1950s, all programming was done at the level of simple instructions. The problem
with instructions at this level is that they are not easy for humans to write and understand, and you
need many simple instructions before you have a useful program.
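To give a feel for how many simple instructions even a small task needs, the Python sketch below simulates a toy machine. The instruction set (`sub`, `move`, `jump`, `jump_if_positive`) is hypothetical and far simpler than any real processor; the names `run` and `MAX_PROGRAM` are likewise assumptions made for this example.

```python
def run(program, registers):
    """Execute toy instructions in order, starting at instruction 0."""
    pc = 0  # index of the next instruction to execute
    while pc < len(program):
        op, *args = program[pc]
        if op == "sub":                  # dst = a - b
            dst, a, b = args
            registers[dst] = registers[a] - registers[b]
            pc += 1
        elif op == "move":               # dst = src
            dst, src = args
            registers[dst] = registers[src]
            pc += 1
        elif op == "jump":               # continue at target
            (pc,) = args
        elif op == "jump_if_positive":   # continue at target if reg > 0
            reg, target = args
            pc = target if registers[reg] > 0 else pc + 1
        else:
            raise ValueError("unknown instruction: " + op)
    return registers


# Finding the maximum of registers a and b takes five instructions.
MAX_PROGRAM = [
    ("sub", "t", "a", "b"),         # 0: t = a - b
    ("jump_if_positive", "t", 4),   # 1: if a > b, go to instruction 4
    ("move", "result", "b"),        # 2: result = b
    ("jump", 5),                    # 3: skip instruction 4 (and stop)
    ("move", "result", "a"),        # 4: result = a
]

print(run(MAX_PROGRAM, {"a": 3, "b": 7})["result"])  # 7
```

A task that takes one sentence to describe already needs a comparison, two jumps, and two moves at this level.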
A compiler is a computer program that generates other programs. It translates an input program
written in a high-level language that is easier for humans to create into a program in a machine-
level language that can be executed by the computer. Admiral Grace Hopper developed the first
compilers in the 1950s.
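The idea of translating a program from one language into another can be illustrated with a toy compiler. The Python sketch below translates arithmetic expressions into instructions for a hypothetical stack machine; it says nothing about how the first compilers actually worked, and the names `compile_expression`, `execute`, and the instruction names are assumptions made for this example. It relies on Python's standard `ast` module to parse the input.

```python
import ast

# Map syntax-tree operator types to instructions of a hypothetical
# stack machine.
OPERATIONS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}


def compile_expression(source):
    """Translate an arithmetic expression into stack-machine instructions."""
    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return [("push", node.value)]
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATIONS:
            # Emit both operands first, then the operator.
            return emit(node.left) + emit(node.right) + [(OPERATIONS[type(node.op)],)]
        raise ValueError("unsupported expression")
    return emit(ast.parse(source, mode="eval").body)


def execute(instructions):
    """Run the instructions on a stack and return the final value."""
    stack = []
    for op, *args in instructions:
        if op == "push":
            stack.append(args[0])
            continue
        right = stack.pop()
        left = stack.pop()
        if op == "add":
            stack.append(left + right)
        elif op == "sub":
            stack.append(left - right)
        elif op == "mul":
            stack.append(left * right)
        else:
            raise ValueError("unknown instruction: " + op)
    return stack.pop()


program = compile_expression("2 + 3 * 4")
print(program)
print(execute(program))  # 14
```

The input is easy for a person to write, while the output is a flat list of simple steps of the kind a machine can carry out one at a time.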