Other topics discussed include:
• Encapsulation and the need for an implementation to provide the behavior
  defined by the specification
• Techniques to help readers of code understand and reason about it, focusing
  on such properties as rep invariants and abstraction functions
• Type hierarchy and its use in defining families of related data abstractions
• Debugging, testing, and requirements analysis
• Program design as a top-down, iterative process, and design patterns
The Java™ programming language is used for the book's examples. However, the techniques presented
are language independent, and an introduction to key Java concepts is included for programmers who
may not be familiar with the language.
Contents
Preface
Acknowledgments
1 Introduction
  1.1 Decomposition and Abstraction
  1.2 Abstraction
    1.2.1 Abstraction by Parameterization
    1.2.2 Abstraction by Specification
    1.2.3 Kinds of Abstractions
  1.3 The Remainder of the Book
  Exercises
2 Understanding Objects in Java
  2.1 Program Structure
  2.2 Packages
  2.3 Objects and Variables
    2.3.1 Mutability
    2.3.2 Method Call Semantics
  2.4 Type Checking
    2.4.1 Type Hierarchy
    2.4.2 Conversions and Overloading
  2.5 Dispatching
  2.6 Types
    2.6.1 Primitive Object Types
    2.6.2 Vectors
  2.7 Stream Input/Output
  2.8 Java Applications
  Exercises
3 Procedural Abstraction
  3.2 Specifications
  3.3 Specifications of Procedural Abstractions
  3.4 Implementing Procedures
  3.5 Designing Procedural Abstractions
  3.6 Summary
  Exercises
4 Exceptions
  4.1 Specifications
  4.2 The Java Exception Mechanism
    4.2.1 Exception Types
    4.2.2 Defining Exception Types
    4.2.3 Throwing Exceptions
    4.2.4 Handling Exceptions
    4.2.5 Coping with Unchecked Exceptions
  4.3 Programming with Exceptions
    4.3.1 Reflecting and Masking
  4.4 Design Issues
    4.4.1 When to Use Exceptions
    4.4.2 Checked versus Unchecked Exceptions
  4.5 Defensive Programming
  4.6 Summary
  Exercises
5 Data Abstraction
  5.1 Specifications for Data Abstractions
    5.1.1 Specification of IntSet
    5.1.2 The Poly Abstraction
  5.2 Using Data Abstractions
  5.3 Implementing Data Abstractions
    5.3.1 Implementing Data Abstractions in Java
    5.3.2 Implementation of IntSet
    5.3.3 Implementation of Poly
    5.3.4 Records
  5.4 Additional Methods
  5.5 Aids to Understanding Implementations
    5.5.1 The Abstraction Function
    5.5.2 The Representation Invariant
    5.5.3 Implementing the Abstraction Function and Rep Invariant
    5.5.4 Discussion
  5.6 Properties of Data Abstraction Implementations
    5.6.1 Benevolent Side Effects
    5.6.2 Exposing the Rep
  5.7 Reasoning about Data Abstractions
    5.7.1 Preserving the Rep Invariant
    5.7.2 Reasoning about Operations
    5.7.3 Reasoning at the Abstract Level
  5.8 Design Issues
    5.8.1 Mutability
    5.8.2 Operation Categories
    5.8.3 Adequacy
  5.9 Locality and Modifiability
  5.10 Summary
  Exercises
6 Iteration Abstraction
  6.1 Iteration in Java
  6.2 Specifying Iterators
  6.3 Using Iterators
  6.4 Implementing Iterators
  6.5 Rep Invariants and Abstraction Functions for Generators
  6.6 Ordered Lists
  6.7 Design Issues
  6.8 Summary
  Exercises
7 Type Hierarchy
  7.1 Assignment and Dispatching
    7.1.1 Assignment
    7.1.2 Dispatching
  7.2 Defining a Type Hierarchy
  7.3 Defining Hierarchies in Java
  7.4 A Simple Example
  7.5 Exception Types
  7.6 Abstract Classes
  7.7 Interfaces
  7.8 Multiple Implementations
  7.9 The Meaning of Subtypes
    7.9.1 The Methods Rule
    7.9.2 The Properties Rule
    7.9.3 Equality
  7.10 Discussion of Type Hierarchy
  7.11 Summary
8 Polymorphic Abstractions
  8.1 Polymorphic Data Abstractions
  8.2 Using Polymorphic Data Abstractions
  8.3 Equality Revisited
  8.4 Additional Methods
  8.5 More Flexibility
  8.6 Polymorphic Procedures
  8.7 Summary
  Exercises
9 Specifications
  9.1 Specifications and Specificand Sets
  9.2 Some Criteria for Specifications
    9.2.1 Restrictiveness
    9.2.2 Generality
    9.2.3 Clarity
  9.3 Why Specifications?
  9.4 Summary
  Exercises
10 Testing and Debugging
  10.1 Testing
    10.1.1 Black-Box Testing
    10.1.2 Glass-Box Testing
  10.2 Testing Procedures
  10.3 Testing Iterators
  10.4 Testing Data Abstractions
  10.5 Testing Polymorphic Abstractions
  10.6 Testing a Type Hierarchy
  10.7 Unit and Integration Testing
  10.8 Tools for Testing
  10.9 Debugging
  10.10 Defensive Programming
  10.11 Summary
  Exercises
11 Requirements Analysis
  11.1 The Software Life Cycle
  11.2 Requirements Analysis Overview
  11.3 The Stock Tracker
  11.4 Summary
  Exercises
12 Requirements Specifications
  12.1 Data Models
    12.1.1 Subsets
    12.1.2 Relations
    12.1.3 Textual Information
  12.2 Requirements Specifications
  12.3 Requirements Specification for Stock Tracker
    12.3.1 The Data Model
    12.3.2 Stock Tracker Specification
  12.4 Requirements Specification for a Search Engine
  12.5 Summary
  Exercises
13 Design
  13.1 An Overview of the Design Process
  13.2 The Design Notebook
    13.2.1 The Introductory Section
    13.2.2 The Abstraction Sections
  13.3 The Structure of Interactive Programs
  13.4 Starting the Design
  13.5 Discussion of the Method
  13.6 Continuing the Design
  13.7 The Query Abstraction
  13.8 The WordTable Abstraction
  13.9 Finishing Up
  13.10 Interaction between FP and UI
  13.11 Module Dependency Diagrams versus Data Models
  13.12 Review and Discussion
    13.12.1 Inventing Helpers
    13.12.2 Specifying Helpers
    13.12.3 Continuing the Design
    13.12.4 The Design Notebook
  13.13 Top-Down Design
  13.14 Summary
  Exercises
14 Between Design and Implementation
  14.1 Evaluating a Design
    14.1.1 Correctness and Performance
    14.1.2 Structure
  14.2 Ordering the Program Development Process
  14.3 Summary
  Exercises
15 Design Patterns
  15.1 Hiding Object Creation
  15.2 Neat Hacks
    15.2.1 Flyweights
    15.2.2 Singletons
    15.2.3 The State Pattern
  15.3 The Bridge Pattern
  15.4 Procedures Should Be Objects Too
  15.5 Composites
    15.5.1 Traversing the Tree
  15.6 The Power of Indirection
  15.7 Publish/Subscribe
    15.7.1 Abstracting Control
  15.8 Summary
  Exercises
Glossary
Index

Preface
Constructing production-quality programs—programs that are used over an
extended period of time—is well known to be extremely difficult. The goal
of this book is to improve the effectiveness of programmers in carrying out
this task. I hope the reader will become a better programmer as a result of
reading the book. I believe the book succeeds at improving programming skills
because my students tell me that it happens for them.

What makes a good programmer? It is a matter of efficiency over the entire
production of a program. The key is to reduce wasted effort at each stage.
Things that can help include thinking through your implementation before
you start coding, coding in a way that eliminates errors before you test, doing
rigorous testing so that errors are found early, and paying careful attention
to modularity so that when errors are discovered, they can be corrected with
minimal impact on the program as a whole. This book covers techniques in all
these areas.

Modularity is the key to writing good programs. It is essential to break up
a program into small modules, each of which interacts with the others through
a narrow, well-defined interface. With modularity, an error in one part of a
program can be corrected without having to consider all the rest of the code,
and a part of the program can be understood without having to understand the
entire thing. Without modularity, a program is a large collection of intricately
interrelated parts. It is difficult to comprehend and to modify such a program,
and also difficult to get it to work correctly.

The focus of this book therefore is on modular program construction: how
to organize a program as a collection of well-chosen modules. The book
relates modularity to abstraction. Each module corresponds to an abstraction,
such as an index that keeps track of interesting words in a large collection
of documents or a procedure that uses the index to find documents that
match a particular query. Particular emphasis is placed on object-oriented
programming: the use of data abstraction and objects in developing programs.

The book uses Java for its programming examples. Familiarity with Java
is not assumed. It is worth noting, however, that the concepts in this book are
language independent and can be used to write programs in any programming
language.
How Can the Book Be Used?
Program Development in Java can be used in two ways. The first is as the text
for a course that focuses on an object-oriented methodology for the design and
implementation of complex systems. The second is use by computing
professionals who want to improve their programming skills and their knowledge
of modular, object-oriented design.

When used as a text, the book is intended for a second or third programming
course; we have used the book for many years in the second programming
course at MIT, which is taken by sophomores and juniors. At this stage,
students already know how to write small programs. The course builds on this
material in two ways: by getting them to think more carefully about small
programs, and by teaching them how to construct large programs using smaller
ones as components. This book could also be used later in the curriculum, for
example, in a software engineering course.

A course based on the book is suitable for all computer science majors.
Even though many students will never be designers of truly large programs,
they may work at development organizations where they will be responsible
for the design and implementation of subsystems that must fit into the overall
structure. The material on modular design is central to this kind of a task. It
is equally important for those who take on larger design tasks.
What Is This Book About?
Roughly two-thirds of the book is devoted to the issues that arise in building
individual program modules. The remainder of the book is concerned with
how to use these modules to construct large programs.
Program Modules
This part of the book focuses on abstraction mechanisms. It discusses
procedures and exceptions, data abstraction, iteration abstraction, families of data
abstractions, and polymorphic abstractions.

Three activities are emphasized in the discussion of abstractions. The first
is deciding on exactly what the abstraction is: what behavior it is providing to
its users. Inventing abstractions is a key part of design, and the book discusses
how to choose among possible alternatives and what goes into inventing good
abstractions.

The second activity is capturing the meaning of an abstraction by giving a
specification for it. Without some description, an abstraction is too vague to be
useful. The specification provides the needed description. This book defines a
format for specifications, discusses the properties of a good specification, and
provides many examples.

The third activity is implementing abstractions. The book discusses how
to design an implementation and the trade-off between simplicity and
performance. It emphasizes encapsulation and the need for an implementation
to provide the behavior defined by the specification. It also presents
techniques (in particular, the use of representation invariants and abstraction
functions) that help readers of code to understand and reason about it.
Both rep invariants and abstraction functions are implemented to the extent
possible, which is useful for debugging and testing.

The material on type hierarchy focuses on its use as an abstraction
technique: a way of grouping related data abstractions into families. An
important issue here is whether it is appropriate to define one type to be a
subtype of another. The book defines the substitution principle, a methodical
way for deciding whether the subtype relation holds by examining the
specifications of the subtype and the supertype.

This book also covers debugging and testing. It discusses how to come up
with a sufficient number of test cases for thorough black box and glass box
tests, and it emphasizes the importance of regression testing.
Programming in the Large
The latter part of Program Development in Java is concerned with how to
design and implement large programs in a modular way. It builds on the
material about abstractions and specifications covered in the earlier part of
the book.
The material on programming in the large covers four main topics. The
first concerns requirements analysis: how to develop an understanding of
what is wanted of the program. The book discusses how to carry out
requirements analysis and also describes a way of writing the resulting requirements
specification, by making use of a data model that describes the abstract state
of the program. Using the model leads to more precise specifications, and
it also makes the requirements analysis more rigorous, resulting in a better
understanding of the requirements.

The second programming in the large topic is program design, which is
treated as an iterative process. The design process is organized around
discovering useful abstractions, ones that can serve as desirable building blocks
within the program as a whole. These abstractions are carefully specified
during design so that when the program is implemented, the modules that
implement the abstractions can be developed independently. The design is
documented by a design notebook, which includes a module dependency
diagram that describes the program structure.

The third topic is implementation and testing. The book discusses the need
for design analysis prior to implementation and how design reviews can be
carried out. It also discusses implementation and testing order. This section
compares top-down and bottom-up organizations, discusses the use of drivers
and stubs, and emphasizes the need to develop an ordering strategy prior to
implementation that meets the needs of the development organization and its
clients.

This book concludes with a chapter on design patterns. Some patterns
are introduced in earlier chapters; for example, iteration abstraction is a
major component of the methodology. The final chapter discusses patterns
not covered earlier. It is intended as an introduction to this material. The
interested reader can then go on to read more complete discussions contained
in other books.

Barbara Liskov

Acknowledgments
John Guttag was a coauthor of an earlier version of this book. Many chapters
still bear his stamp. In addition, he has made numerous helpful suggestions
about the current material.

Thousands of students have used various drafts of the book, and many
of them have contributed useful comments. Scores of graduate students have
been teaching assistants in courses based on the material in this book. Many
students have contributed to examples and exercises that have found their
way into this text. I sincerely thank all of them for their contributions.

My colleagues both at MIT and elsewhere have also contributed in
important ways. Special thanks are due to Jeannette Wing and Daniel Jackson.
Jeannette Wing (CMU) helped to develop the material on the substitution
principle. Daniel Jackson (MIT) collaborated on teaching recent versions of
the course and contributed to the material in many ways; the most important
of these is the data model used to write requirements specifications, which is
based on his research.

In addition, the publisher obtained a number of helpful reviews, and I
want to acknowledge the efforts of James M. Coggins (University of North
Carolina), David H. Hutchens (Millersville University), Gail Kaiser (Columbia
University), Gail Murphy (University of British Columbia), James Purtilo
(University of Maryland), and David Riley (University of Wisconsin at LaCrosse).
I found their comments very useful, and I tried to work their suggestions into
the final manuscript.

Finally, MIT's Department of Electrical Engineering and Computer Science
and its Laboratory for Computer Science have supported this project in
important ways. By reducing my teaching load, the department has given me time
to write. The laboratory has provided an environment that enabled research
leading to many of the ideas presented in this book.
Introduction
This book will develop a methodology for constructing software systems. Our
goal is to help programmers construct programs of high quality: programs
that are reliable, efficient, and reasonably easy to understand, modify, and
maintain.

A very small program, consisting of no more than a few hundred lines,
can be implemented as a single monolithic unit. As the size of the program
increases, however, such a monolithic structure is no longer reasonable
because the code becomes difficult to understand. Instead, the program must be
decomposed into a number of small independent programs, called modules,
that together provide the desired function. We shall focus on this
decomposition process: how to decompose large programming problems into small ones,
what kinds of modules are most useful in this process, and what techniques
increase the likelihood that modules can be combined to solve the original
problem.

Doing decomposition properly becomes more and more important as the
size of the program increases for a number of reasons. First, many people must
be involved in the construction of a large program. If just a few people are
working on a program, they naturally interact regularly. Such contact reduces
the possibility of misunderstandings about who is doing what and lessens the
seriousness of the consequences should misunderstandings occur. If many
people work on a project, regular communication becomes impossible because
it consumes too much time. Instead, the program must be decomposed into
pieces that the individuals can work on independently with a minimum of
contact.

The useful life of a program (its production phase) begins when it is
delivered to the customer. Work on the program is not over at this point, however.
The code will probably contain residual errors that will need attention, and
program modifications will often be required to upgrade the program's
serviceability or to provide services better matched to the user's needs. This activity
of program modification and maintenance is likely to consume more than half
of the total effort put into the project.

For modification and maintenance, it is rarely practical to start from scratch
and reimplement the entire program. Instead, one must retrofit modifications
within the existing structure, and it is therefore important that the structure
accommodate change. In particular, the pieces of the program must be
independent, so that a change to one piece can be made without requiring changes
to all pieces.

Finally, most programs have a long lifetime. Programmers often have to deal
with programs long after they have first worked on them. Moreover, there is
likely to be substantial turnover of personnel over the life of any project, and
program modification and maintenance are typically done by people other
than the original implementors. All of these factors require that programs be
structured in such a way that they can be understood easily.

In the methodology we shall describe in this book, programs will be
developed by means of problem decomposition based on a recognition of
useful abstractions. Decomposition and abstraction, the two key concepts in
this book, form our next subject.
1.1 Decomposition and Abstraction

The basic paradigm for tackling any large problem is clear: we must “divide
and rule.” Unfortunately, merely deciding to follow Machiavelli's dictum still
leaves us a long way from solving the problem at hand. Exactly how we choose
to divide the problem is of overriding importance.

Our goal in decomposing a program is to create modules that are themselves
small programs that interact with one another in simple, well-defined
ways. If we achieve this goal, different people will be able to work on
different modules independently, without needing much communication among
themselves, and yet the modules will work together. In addition, during
program modification and maintenance, it will be possible to modify some of the
modules without affecting all of the others.

When we decompose a problem, we factor it into separable subproblems
in such a way that

• Each subproblem is at the same level of detail.
• Each subproblem can be solved independently.
• The solutions to the subproblems can be combined to solve the original
  problem.

Sorting using merge sort is an elegant example of problem solving by
decomposition. It breaks the problem of sorting a list of arbitrary size into the two
simpler problems of sorting a list of size two and merging two sorted lists of
arbitrary size.
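The merge sort decomposition can be sketched in Java as follows. This is a minimal illustration rather than code from the book: the class name `MergeSortSketch` and its method names are hypothetical, and the recursion here bottoms out at arrays of length one or zero instead of the size-two lists the text mentions.

```java
import java.util.Arrays;

class MergeSortSketch {
    // Sorts a copy of a into ascending order by splitting it into two halves,
    // sorting each half independently, and merging the two sorted results.
    static int[] sort(int[] a) {
        if (a.length <= 1) {
            return a.clone();                 // small enough to need no further splitting
        }
        int mid = a.length / 2;
        int[] left = sort(Arrays.copyOfRange(a, 0, mid));
        int[] right = sort(Arrays.copyOfRange(a, mid, a.length));
        return merge(left, right);
    }

    // Combines two sorted arrays into one sorted array.
    static int[] merge(int[] left, int[] right) {
        int[] out = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) out[k++] = left[i++];
        while (j < right.length) out[k++] = right[j++];
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sort(new int[] {5, 2, 9, 1, 5})));
    }
}
```

Each subproblem (sorting a half, merging two sorted halves) can be understood and tested separately, and the solutions combine to solve the original problem.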
Decomposition is a time-honored and useful technique in many disciplines.
From Babbage's day onward, people have recognized the utility of such
things as macros and subroutines as decomposition devices for programmers.
It is important to recognize, however, that decomposition is not a panacea
and when used improperly, it can have a harmful effect. Furthermore, large
or poorly understood problems are difficult to decompose properly. The most
common problem is creating individual components that solve the stated
subproblems but do not combine to solve the original problem. This is one of the
reasons why system integration is often difficult.

For example, imagine creating a play by assembling a group of writers,
giving each a list of characters and a general plot outline, and asking each
of them to write a single character's lines. The authors might accomplish
their individual tasks admirably, but it is highly unlikely that their combined
efforts will be an admirable play. It would probably lack any sort of coherence
or sense. Individually acceptable solutions simply cannot be expected to
combine properly if the original task has been divided in a counterproductive
way.

Abstraction is a way to do decomposition productively by changing the
level of detail to be considered. When we abstract from a problem, we agree to
ignore certain details in an effort to convert the original problem to a simpler
one. We might, for example, abstract from the problem of writing a play to
the problem of deciding how many acts it should have, or what its plot will
be, or even the sense (but not the wording) of individual pieces of dialog.
After this has been done, the original problem (of writing all of the dialog)
remains, but it has been considerably simplified, perhaps even to the point
where it could be turned over to another or even several others. (Alexandre
Dumas père churned out novels in this way.)

The paradigm of abstracting and then decomposing is typical of the program
design process: decomposition is used to break software into components
that can be combined to solve the original problem; abstractions assist in
making a good choice of components. We alternate between the two processes until
we have reduced the original problem to a set of problems we already know
how to solve.
1.2 Abstraction

The process of abstraction can be seen as an application of a many-to-one
mapping. It allows us to forget information and consequently to treat things
that are different as if they were the same. We do this in the hope of simplifying
our analysis by separating attributes that are relevant from those that are not.
It is crucial to remember, however, that relevance often depends upon context.
In the context of an elementary school classroom we learn to abstract both
(8/3) × 3 and 5 + 3 to the concept we represent by the numeral 8. Much later
we learn, often under unpleasant circumstances, that on many computing
machines this abstraction can get us into a world of trouble.
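One way the classroom abstraction can break down on a computer is shown in the sketch below. It assumes the first expression is a division followed by a multiplication, such as (8/3) × 3, and evaluates it with Java `int` arithmetic; the class and method names are hypothetical.

```java
class ArithmeticAbstractionSketch {
    // On paper this expression denotes 8, but Java's int division
    // truncates 8 / 3 to 2 before the multiplication takes place.
    static int viaDivision() {
        return (8 / 3) * 3;
    }

    // Integer addition behaves as the classroom abstraction predicts.
    static int viaAddition() {
        return 5 + 3;
    }

    public static void main(String[] args) {
        System.out.println(viaDivision());  // prints 6
        System.out.println(viaAddition());  // prints 8
    }
}
```

Two expressions that are "the same" at the level of school arithmetic are different computations at the level of machine integers, which is why relevance depends on context.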
For example, consider the structure shown in Figure 1.1, in which the
concept is “mammal.” All mammals share certain characteristics, such as the
fact that females produce milk. At this level of abstraction, we focus on these
common characteristics and ignore the differences between the various types
of mammals.

At a lower level of abstraction, we might be interested in particular
instances of mammals. However, even here we can abstract by considering not
individuals, or even species, but groups of related species. At this level, we
would have groupings such as primates or rodents. Here again, we are
interested in common characteristics rather than the differences between, say,
humans and chimpanzees. Such differences are relevant at a still lower level
of abstraction.

Figure 1.1 An abstraction hierarchy (nodes include primates and rodents)

The abstraction hierarchy of Figure 1.1 comes from the field of zoology,
but it might well appear in a program that implemented some zoological
application. A more specifically computer-oriented example that is useful
in many programs is the concept of a “file.” Files abstract from raw storage
and provide long-term, online storage of named entities. Operating systems
differ in their realizations of files; for example, the structure of the filenames
differs from system to system, as does the way in which the files are stored on
secondary storage devices.
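As a rough illustration of how the zoological hierarchy might appear in a program, the sketch below models it with Java classes. All class and method names are hypothetical; only the concepts (mammals, primates, rodents, humans, chimpanzees, and the shared characteristic that females produce milk) come from the text.

```java
// Top level: characteristics shared by every mammal.
abstract class Mammal {
    boolean femalesProduceMilk() {
        return true;
    }

    // Each group of related species reports its own name.
    abstract String group();
}

// Middle level: groups of related species.
abstract class Primate extends Mammal {
    String group() { return "primates"; }
}

abstract class Rodent extends Mammal {
    String group() { return "rodents"; }
}

// Lower level: individual species, where finer differences would be recorded.
class Human extends Primate { }

class Chimpanzee extends Primate { }

class HierarchySketch {
    // Code written against Mammal ignores the differences between species.
    static String describe(Mammal m) {
        return m.group() + ", females produce milk: " + m.femalesProduceMilk();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Human()));
        System.out.println(describe(new Chimpanzee()));
    }
}
```

The `describe` method works at the "mammal" level of abstraction: it treats humans and chimpanzees as the same kind of thing and relies only on what all mammals share.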
In this book, we are interested in abstraction as it is used in programs in
general. The most significant development to date in this area is high-level
languages. By dealing directly with the constructs of a high-level language,
rather than with the many possible sequences of machine instructions into
which they can be translated, the programmer achieves a significant
simplification.

In recent years, however, programmers have become dissatisfied with the
level of abstraction generally achieved even in high-level language programs.
Consider, for example, the program fragments in Figure 1.2. At the level of
abstraction defined by the programming language, these fragments are clearly
different; if there is an occurrence of e in a, one fragment finds the index of
the first occurrence and the other, the index of the last. If e does not occur in
a, one sets i to a.length and the other to -1. It is not improbable, however,
that both were written to accomplish the same goal: to set found to false if
there is no occurrence of e in a and, otherwise, to set found to true and z to
the index of some occurrence of e in a. If this is what we want, it is not evident
from the program fragments by themselves.
Figure 1.2 Two program fragments

    // search upwards
    found = false;
    for (int i = 0; i < a.length; i++)
        if (a[i] == e) {
            z = i;
            found = true;
        }

    // search downwards
    found = false;
    for (int i = a.length-1; i >= 0; i--)
        if (a[i] == e) {
            z = i;
            found = true;
        }

One approach to dealing with this problem lies in the invention of
“very-high-level languages” built around some fixed set of relatively general data
structures and a powerful set of primitives that can be used to manipulate
them. For example, suppose a language provided isIn and indexOf as primitive
operations on arrays. Then we could accomplish the task outlined in
Figure 1.2 simply by writing

    found = a.isIn(e);
    if (found) z = a.indexOf(e);

The flaw in this approach is that it presumes that the designer of the
programming language will build into the language most of the abstractions
that users of the language will want. Such foresight is not given to many; and
even if it were, a language containing so many built-in abstractions might well
be so unwieldy as to be unusable.
A preferable alternative is to design into the language mechanisms that
allow programmers to construct their own abstractions as they need them. One
common mechanism is the use of procedures. By separating procedure definition
and invocation, a programming language makes two important methods of
abstraction possible: abstraction by parameterization and abstraction by
specification. These abstraction mechanisms are summarized in Sidebar 1.1.
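To illustrate programmers constructing their own abstractions with procedures, the sketch below defines `isIn` and `indexOf` as ordinary Java static methods rather than as language primitives. The class name `ArraySearchSketch` and the choice to return -1 when the element is absent are assumptions made for this example.

```java
class ArraySearchSketch {
    // EFFECTS: Returns true if e occurs in a; otherwise returns false.
    static boolean isIn(int[] a, int e) {
        return indexOf(a, e) >= 0;
    }

    // EFFECTS: Returns the index of some occurrence of e in a,
    //          or -1 if e does not occur in a (an assumed convention).
    static int indexOf(int[] a, int e) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == e) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {4, 7, 7, 1};
        int e = 7;
        int z = -1;
        boolean found = isIn(a, e);
        if (found) {
            z = indexOf(a, e);
        }
        System.out.println(found + " " + z);
    }
}
```

The calling code states the goal (is e present, and where) without committing to searching upwards or downwards, which is exactly the detail the two fragments of Figure 1.2 obscured.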
Sidebar 1.1 Abstraction Mechanisms

• Abstraction by parameterization abstracts from the identity of the data by replacing
  them with parameters. It generalizes modules so that they can be used in more
  situations.
• Abstraction by specification abstracts from the implementation details (how the module
  is implemented) to the behavior users can depend on (what the module does). It
  isolates modules from one another's implementations; we require only that a module's
  implementation supports the behavior being relied on.
1.2.1 Abstraction by Parameterization

Abstraction by parameterization, through the introduction of parameters,
allows us to represent a potentially infinite set of different computations with
a single program text that is an abstraction of all of them. For example,

    x * x + y * y

describes a computation that adds the square of the value stored in the variable
x to the square of the value stored in the variable y.

On the other hand, the lambda expression

    λx, y: int.(x * x + y * y)

describes the set of computations that square the value stored in some integer
variable, which we shall temporarily refer to as x, and add to it the square of
the value stored in another integer variable, which we shall temporarily call
y. In such a lambda expression, we refer to x and y as the formal parameters
and x * x + y * y as the body of the expression. We invoke a computation by
binding the formal parameters to arguments and then evaluating the body.
For example,

    λx, y: int.(x * x + y * y) w, z

is identical in meaning to

    w * w + z * z

In more familiar notation, we might denote the previous lambda expression
by

    int squares (int x, int y)
    {
        return x * x + y * y;
    }

and the binding of actual to formal parameters and evaluation of the body by
the procedure call

    squares(w, z);
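The correspondence between the lambda expression and the procedure can be made concrete in current Java, which supports both forms. The sketch below is illustrative only: the class name `SquaresSketch` and the constant `SQUARES` are hypothetical, and `IntBinaryOperator` is simply one standard functional interface that fits a function of two `int` values.

```java
import java.util.function.IntBinaryOperator;

class SquaresSketch {
    // Procedure form: x and y are the formal parameters.
    static int squares(int x, int y) {
        return x * x + y * y;
    }

    // Lambda form of the same abstraction.
    static final IntBinaryOperator SQUARES = (x, y) -> x * x + y * y;

    public static void main(String[] args) {
        int w = 3;
        int z = 4;
        // Binding the formals to the arguments w and z and evaluating the body
        // gives the same value as writing w * w + z * z directly.
        System.out.println(squares(w, z));              // prints 25
        System.out.println(SQUARES.applyAsInt(w, z));   // prints 25
        System.out.println(w * w + z * z);              // prints 25
    }
}
```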
Programmers often use abstraction by parameterization without even
noticing that they are doing so. For example, suppose we need a procedure
that sorts an array of integers a. At some time in the future, we shall probably
have to sort some other array, perhaps even somewhere else in this same
program. It is highly unlikely, however, that every array we need to sort will be
named a; we therefore invoke abstraction by parameterization to generalize
the procedure and thus make it more useful.

Abstraction by parameterization is an important means of achieving
generality in programs. A sort routine that works on any array of integers is
much more generally useful than one that works only on a particular array
of integers. By further abstraction, we can achieve even more generality. For
example, we might define a sort abstraction that works on arrays of reals as
well as arrays of integers, or even one that works on arraylike structures in
general.
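The two levels of generality described above can be sketched in Java. This is a hypothetical illustration: `sortInts` abstracts from which integer array is sorted, and `sortAny` uses a type parameter (an assumption about how one might express the further abstraction in current Java) to also abstract from the element type. Insertion sort is used only because it is short.

```java
import java.util.Arrays;

class GeneralSortSketch {
    // Works on any int array, whatever it is named at the call site.
    static void sortInts(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Further abstraction: works on arrays of any mutually comparable elements.
    static <T extends Comparable<T>> void sortAny(T[] a) {
        for (int i = 1; i < a.length; i++) {
            T key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j].compareTo(key) > 0) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        int[] scores = {3, 1, 2};
        sortInts(scores);
        System.out.println(Arrays.toString(scores));

        Double[] reals = {2.5, 1.0, 1.75};
        sortAny(reals);
        System.out.println(Arrays.toString(reals));
    }
}
```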
Abstraction by parameterization is an extremely powerful mechanism. Not
only does it allow us to describe a large (even infinite) number of computations
relatively simply, but it is easily and efficiently realizable in programming
languages. Nonetheless, it is not a sufficiently powerful mechanism to describe
conveniently and fully the abstraction that the careful use of procedures can
provide.
Abstraction by Specification
Abstraction by specification allows us to abstract from the computation (or
computations) described by the body of a procedure to the end that procedure
was designed to accomplish. We do this by associating with each procedure
a specification of its intended effect and then considering the meaning of a
procedure call to be based on this specification rather than on the procedure's
body.
Figure 1.3 The sqrt procedure
float sqrt (float coef) {
// REQUIRES: coef > 0
// EFFECTS: Returns an approximation to the square root of coef
float ans = coef/2.0f;
int i = 1;
while (i < 7) {
ans = ans - ((ans * ans - coef)/(2.0f*ans));
i = i + 1;
}
return ans;
}
We are making use of abstraction by specification whenever we associate
with a procedure a comment that is sufficiently informative to allow others
to use that procedure without looking at its body. A good way to write such
comments is to use pairs of assertions. The requires assertion (or precondition)
of a procedure specifies something that is assumed to be true on entry to
the procedure. In practice, what is most often asserted is a set of conditions
sufficient to ensure the proper operation of the procedure. (This is often simply
the vacuous assertion "true.") The effects assertion (or postcondition) specifies
something that is supposed to be true at the completion of any invocation of
the procedure for which the precondition was satisfied.
Consider, for example, the sqrt procedure in Figure 1.3. Because a specification
is provided, we can ignore the body of the procedure and take the
meaning of the procedure call y = sqrt(x) to be "If x is greater than zero when
the procedure is invoked, then after the execution of the procedure, y is an
approximation to the square root of x." Notice that the requires and effects
assertions permit us to say nothing about the value of y if x is not greater than
zero. This is important, since a user might otherwise quite reasonably assume
that sqrt(0) returned a meaningful answer.
In using a specification to reason about the meaning of a procedure call,
we follow two distinct rules:
1. After the execution of the procedure, we can assume that the postcondition
holds provided the precondition held when the call was made.
2. We can assume only those properties that can be inferred from the
postcondition.
The two rules mirror the two benefits of abstraction by specification. The first
asserts that users of the procedure need not bother looking at the body of the
procedure in order to use it. They are thus spared the effort of first understanding
the details of the computations described by the body and then abstracting
from these details to discover that the procedure really does compute an
approximation to the square root of its argument. For complicated procedures,
or even simple ones using unfamiliar algorithms, this is a nontrivial benefit.
The second rule makes it clear that we are indeed abstracting from the
procedure body, that is, omitting some supposedly irrelevant information.
This insistence on forgetting information is what distinguishes abstraction
from decomposition. By examining the body of sqrt, users of the procedure
could gain a considerable amount of information that cannot be gleaned from
the postcondition and therefore should not be relied on; for example, that
sqrt(4) will return +2. In the specification, however, we are saying that this
information about the returned result is to be ignored. We are thus saying that
the procedure sqrt is an abstraction representing the set of all computations
that return "an approximation to the square root of x."
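To make the second rule concrete, here is a small sketch of client code that relies only on the postcondition of sqrt. The body follows Figure 1.3; the class name SqrtSketch, the helper isCloseToRoot, and the tolerance value are assumptions introduced for the example, since the specification itself does not say how good the approximation is.

```java
class SqrtSketch {
    // REQUIRES: coef > 0
    // EFFECTS: Returns an approximation to the square root of coef
    static float sqrt(float coef) {
        float ans = coef / 2.0f;
        for (int i = 1; i < 7; i++) {
            ans = ans - ((ans * ans - coef) / (2.0f * ans));
        }
        return ans;
    }

    // Hypothetical client-side check: it uses only what the postcondition
    // promises (y is approximately a square root of x), never an exact value.
    static boolean isCloseToRoot(float x, float y, float tolerance) {
        return Math.abs(y * y - x) <= tolerance;
    }

    public static void main(String[] args) {
        float y = sqrt(4.0f);
        // A client should test closeness, not y == 2.0f, because the exact
        // result is a property of this body rather than of the specification.
        System.out.println(isCloseToRoot(4.0f, y, 0.001f));
    }
}
```

A different body that met the same specification could replace this one without affecting any client written in this style.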
In this book, abstraction by specification will be the major method used in
program construction. Abstraction by parameterization will be taken almost
for granted; abstractions will have parameters as a matter of course.
Kinds of Abstractions
Abstraction by parameterization and by specification are powerful methods
for program construction. They enable us to define three different kinds of
abstractions: procedural abstraction, data abstraction, and iteration abstraction.
In general, each procedural, data, and iteration abstraction will incorporate
both methods within it.
For example, sqrt is like an operation: it abstracts a single action or task.
We shall refer to abstractions that are operationlike as procedural abstractions.
Note that sqrt incorporates both abstraction by parameterization and
abstraction by specification.
Procedural abstraction is a powerful tool. It allows us to extend the virtual
machine defined by a programming language by adding a new operation. This
kind of extension is most useful when we are dealing with problems that are
conveniently decomposable into independent functional units. However, it is
often more fruitful to think of adding new kinds of data objects to the virtual
machine.
The behavior of the data objects is expressed most naturally in terms of
a set of operations that are meaningful for those objects. This set includes
operations to create objects, to obtain information from them, and possibly to
modify them. For example, push and pop are among the meaningful operations
for stacks, integers need the usual arithmetic operations, and a bank account
object in a banking system would have operations to deposit and withdraw
money. Thus a data abstraction (or data type) consists of a set of objects and a
set of operations characterizing the behavior of the objects.
As an example, consider MultiSets. MultiSets are like ordinary sets
except that elements can occur more than once in a MultiSet. MultiSet
operations might include empty, insert, delete, numberOf, and size. These
operations create an empty MultiSet, add and delete elements from a
MultiSet, tell how many times a particular element occurs in a MultiSet,
and tell how many elements are in a MultiSet, respectively. The operations
might be implemented within the runtime environment of the programming
language by calls to various procedures. Programmers using MultiSets,
however, need not worry about how these procedures are implemented. To them,
empty, insert, delete, numberOf, and size are abstractions defined by such
statements as
• The size of the MultiSet insert(s, e) is equal to size(s) + 1.
• For all e, the numberOf times e occurs in the MultiSet empty() is 0.
The key thing to notice is that each of these statements deals with more
than one operation. We do not present independent definitions of each
operation, but rather define them by showing how they relate to one another.
The emphasis on the relationships among operations is what makes a data
abstraction something more than just a set of procedures. The importance of this
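The two MultiSet statements above can be read as checks that any implementation must pass. The following is a minimal sketch under stated assumptions: the class name MultiSetSketch, the use of a HashMap of counts, and the choice to make insert mutate the MultiSet (rather than return a new one, as the notation insert(s, e) suggests) are all choices made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class MultiSetSketch {
    // Assumed representation: element -> number of occurrences.
    private final Map<Integer, Integer> counts = new HashMap<>();
    private int size = 0;

    static MultiSetSketch empty() {
        return new MultiSetSketch();
    }

    void insert(int e) {
        counts.merge(e, 1, Integer::sum);
        size++;
    }

    void delete(int e) {
        Integer n = counts.get(e);
        if (n == null) return;          // e does not occur: nothing to delete
        if (n == 1) counts.remove(e); else counts.put(e, n - 1);
        size--;
    }

    int numberOf(int e) {
        return counts.getOrDefault(e, 0);
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        MultiSetSketch s = empty();
        // For all e, numberOf(e) on an empty MultiSet is 0 (spot check).
        System.out.println(s.numberOf(42) == 0);
        // size after insert is size before insert plus 1.
        int before = s.size();
        s.insert(42);
        System.out.println(s.size() == before + 1);
    }
}
```

Note that each check in main involves two operations at once (empty with numberOf, insert with size), which is exactly the point made about relationships among operations.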
The majority of the content of both classes and interfaces consists of
definitions of methods. A class that defines a group of procedures provides
a method for each procedure; for example, a class providing procedures
that manipulate integer arrays might contain a method to sort an array and
another method to search an array for a match with a particular integer. A
class or interface that defines a data type provides methods for the operations
associated with the objects of that type. For example, in the case of a MultiSet,
there might be an insert method to add an integer in the MultiSet and a
numberOf method to determine how many times a given integer appears in
the MultiSet.
An example of a class that defines a group of procedures is given in
Figure 2.1. (Comments begin with the // symbol and continue to the end
of the line.) Such methods are named by indicating their class and then their
method name. Here are examples of calls of the methods in class Num:
int x = Num.gcd(15, 6);
if (Num.isPrime(x))
Figure 2.1 Class Num (listing fragment)
public static boolean isPrime(int p) {
// implementation goes here
}
A method takes zero or more arguments and returns a single result. Its
header indicates this information. The arguments are often referred to as the
formal parameters of the method. For example, gcd has two formals,
x and y, both of which are integers; it returns an integer result. A method
may also terminate by throwing an exception; exceptions will be covered
in detail in Chapter 4.
Because Java requires that every method have a result, a special form is
used when there is no result. Such a method indicates that its return type is
void. For example, suppose the Arrays class provides routines that are useful
in manipulating arrays, among them a way of sorting arrays:
public static void sort (int[ ] a)
(The form int[ ] indicates that the argument is an array of integers of
unspecified length.) This method doesn't return a result; instead, it sorts its argument
array in place. It can be called as follows:
2.2 Packages
Classes and interfaces are grouped into packages. Packages serve two purposes.
First, they are an encapsulation mechanism; they provide a way to share
information within the package while preventing its use on the outside.
Each class and interface has a declared visibility. Only classes and interfaces
declared to be public can be accessed by code in other packages; for example,
the Num class in Figure 2.1 can be used outside its package. The remaining
definitions can be used only within the package.
In addition, the declarations within a class have a declared visibility. Only
entities declared to be public, such as the gcd and isPrime methods in the
Num class, are accessible to code in other packages. Other kinds of declared
visibility limit the code that can access the entity, for example, to just its
class or just its package; the details will be discussed in later chapters.
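For readers who want something runnable, the following sketch shows what a class like Num might look like. The method bodies are assumptions (gcd by repeated subtraction, isPrime by trial division) and are not taken from Figure 2.1. The class is left without the public modifier so that the sketch can live in a file of any name; in a real package, declaring the class public is what would make it usable from other packages.

```java
// Hypothetical stand-in for the Num class discussed in the text.
class Num {
    // REQUIRES: x > 0 and y > 0 (an assumption of this sketch)
    // EFFECTS: Returns the greatest common divisor of x and y
    public static int gcd(int x, int y) {
        while (x != y) {
            if (x > y) x = x - y; else y = y - x;
        }
        return x;
    }

    // EFFECTS: Returns true if p is prime and false otherwise
    public static boolean isPrime(int p) {
        if (p < 2) return false;
        for (int d = 2; (long) d * d <= p; d++) {
            if (p % d == 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Static methods are named by their class and then their method name.
        int x = Num.gcd(15, 6);
        System.out.println(x);
        System.out.println(Num.isPrime(x));
    }
}
```

Because gcd and isPrime are declared public, they would be accessible from other packages once the class itself is public; a method declared private would be usable only inside Num.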
The other use of packages is for naming. Each package has a hierarchical
name that distinguishes it from all other packages. Classes and interfaces
within the package have names that are relative to the package name. This
means that there are no name conflicts between classes and interfaces defined
in different packages.
Code in a package can refer to other classes and interfaces of its own
package by using their class or interface name. For example, if the mathRoutines
package contains the class Num, code within that package can refer to that
class by using the name Num. Definitions in other packages can be referred to
using their fully qualified names, that is, their name appended to their
package's hierarchical name. For example, the fully qualified name for the Num class
might be mathRoutines.Num. It is also possible to use short names to refer to
definitions in other packages by using the import statement, to either import
all public definitions from a package, or to import specific public definitions
from a package. In either case, the imported definition can be referred to using
its class or interface name.
One problem with short names is the possibility of name conflicts. For
example, suppose two packages both define classes named Num. In this case,
if code uses both classes, it cannot use a short name for each. Either it could
use a fully qualified name for each or it could import one of the classes and
use a long name for the other.
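The same situation arises in the standard library, where both java.util and java.sql define a class named Date. The sketch below, in which the class name NameConflictSketch is an assumption of the example, imports one of the two and refers to the other by its fully qualified name.

```java
import java.util.Date;   // the short name Date now means java.util.Date

class NameConflictSketch {
    static String describe() {
        Date utilDate = new Date(0L);                    // short name
        java.sql.Date sqlDate = new java.sql.Date(0L);   // fully qualified name
        return utilDate.getClass().getName() + " / " + sqlDate.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe());   // prints "java.util.Date / java.sql.Date"
    }
}
```

Importing both classes by single-type imports would be rejected by the compiler, which is why one of the two must keep its long name.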
Sometimes there is a conflict between encapsulation and naming. It is
convenient to group many definitions in the same package because then code
outside the package has access to all of them by importing the whole package.
But this kind of grouping may be wrong from the point of encapsulation
because code within a package can sometimes access internal information of
other definitions within that package. In general, such a conflict should be
resolved in favor of encapsulation.
Objects and Variables
All data are accessed by means of variables. Local variables, such as those
declared within methods, reside on the runtime stack; space is allocated for
them when the method is called and deallocated when the method returns.
int [ ] a = {1,3,5,7,9}; // creates a 5-element array
int [ ] b = new int[3];
String s = "abcdef"; // creates a new string
String t = null;
Here all variables except have been given an initial value. Variable t has been
initialized to null; this special value provides a way of initializing a variable
that will eventually refer to an object.
This example shows a number of object creations. Strings are created by
indicating their content; thus, s refers to a string object containing "abcdef".
Arrays can be created similarly, by indicating their elements; thus, a refers
to a five-element array. The assignment to b shows the usual way of creating
a new object, by calling the built-in new operator. This operator creates an
object of the indicated class on the heap and then initializes it by running a
special kind of method, called a constructor, for that class. For example,
the array constructor initializes each element of a new array of integers to 0.
Thus, b refers to a three-element array of integers, where each element of the
array is 0.
Every object has an identity that is distinct from that of every other object.
That is, when an object is created by a call to new, or through use of the
special forms such as "abcdef" for strings and {1,3,5,7,9} for arrays, what
is obtained is an object that is distinct from any other object in existence.
An assignment
v = e
copies the value obtained by evaluating the expression e into the variable v.
If the expression evaluates to a reference to an object, the reference is copied.
This situation is illustrated in Figure 2.2b, which shows the results of the
following assignments:
b = a;
t = s;
Note that in the case of the string and array variables, we now have two
variables pointing to the same object. Thus, assignment involving references
causes variables to share objects.
The == operator can be used to determine whether two variables contain
the same value. This operator is used primarily for primitive types, for
example, to compare two ints, as in i == j, or to determine whether a variable
that might refer to an object instead contains null, such as t == null. It can
also be used to determine whether two variables refer to the same object; in
the situation in Figure 2.2a, for example, a == b will not be true, whereas in
the situation in Figure 2.2b, a == b is true.
Objects in the heap continue to exist as long as they are reachable from
some variable on the stack, either directly or via a path through other objects.
When an object is no longer reachable, its storage becomes available for
reclamation by the garbage collector. For example, in the state shown in
Figure 2.2b, the array formerly referred to by b is no longer reachable and
is therefore available for reclamation by the garbage collector.
Mutability
All objects are either immutable or mutable. The state of an immutable object
never changes, while the state of a mutable object can change.
Strings are immutable: there are no String methods that cause the state of a
String object to change. For example, strings have a concatenation operator
+, but it does not modify either of its arguments; instead, it returns a new
string whose state is the concatenation of the states of its arguments. If we did
the following assignment to t with the state shown in Figure 2.2b:
t = t + "g";
the result, as shown in Figure 2.2c, is that t now refers to a new String object
whose state is "abcdefg", and the object referred to by s is unaffected.
On the other hand, arrays are mutable. The assignment
a[i] = e;
causes the state of array a to change by replacing its ith element with the value
obtained by evaluating expression e. (The modification occurs only if i is in
bounds for a; otherwise, an exception is thrown.)
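The contrast between immutable strings and mutable arrays, together with the sharing that assignment of references produces, can be sketched as follows. The class name SharingSketch and the method demo are assumptions made for the example; the variable names and values follow the running example in the text.

```java
class SharingSketch {
    // Returns true if every observation described in the comments holds.
    static boolean demo() {
        int[] a = {1, 3, 5, 7, 9};
        int[] b = new int[3];
        boolean distinctBefore = (a != b);   // two separately created arrays

        b = a;                               // b and a now share one array
        b[0] = 6;                            // mutation through b ...
        boolean visibleThroughA = (a[0] == 6);   // ... is visible through a

        String s = "abcdef";
        String t = s;                        // t and s share one string
        t = t + "g";                         // concatenation builds a new string
        boolean sUnchanged = s.equals("abcdef") && t.equals("abcdefg");

        return distinctBefore && a == b && visibleThroughA && sUnchanged;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Sharing an immutable object is harmless, since neither variable can observe a change; sharing a mutable one means every holder of the reference sees each modification.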
If a mutable object is shared by two or more variables, modifications made
through one of the variables will be visible when the object is used through
the other variable. For example, suppose the shared array in Figure 2.2b is
modified by
b[0] = 6;
This causes the zero element of the array to contain 6 (instead of the 1 it used
to contain), as shown in Figure 2.2c. Furthermore, the change is visible when
the array is used later, via either variable b or variable a; for example, in
if (a[0] == 6)
the expression will evaluate to true, and therefore, the then branch will be
executed.
Sidebar 2.2 summarizes mutability and sharing.
Sidebar 2.2 Mutability and Sharing
• An object is mutable if its state can change. For example, arrays are mutable.
• An object is immutable if its state never changes. For example, strings are immutable.
• An object is shared by two variables if it can be accessed through either of them.
• If a mutable object is shared by two variables, modifications made through one of the
variables will be visible when the object is used through the other.
Figure 2.3 Method call
Method Call Semantics
An attempt to call a method, e.m(...), first evaluates e to obtain the class or
object whose method is being called. Then the expressions for the arguments
are evaluated to obtain actual parameter values; this evaluation happens left
to right. Next an activation record is created for the call and pushed onto
the stack; the activation record contains room for the formal parameters of
the method (as discussed earlier, the formals are the variables declared in the
method header) and any other local storage the method requires. Then the
actual parameters are assigned to the formals; this kind of parameter passing
is called call by value. Finally, control is dispatched to the called method e.m;
Section 2.5 discusses how this works.
Just as was the case for assignment to variables, if an actual parameter
value is a reference to an object, that reference is assigned to the formal. This
means that the called procedure shares objects with its caller. Furthermore, if
these objects are mutable, and the called procedure changes their state, these
changes are visible to the caller when it returns.
For example, suppose the Arrays class mentioned earlier contained a
method, multiples, that multiplies each element of its array argument a by
its multiplier argument:
public static void multiples (int [ ] a, int m) {
if (a == null) return;
for (int i = 0; i < a.length; i++) a[i] = a[i]*m;
}
This method works on any size array; it uses a.length to determine the length
of the array. Figure 2.3 shows what happens when the method is called by
the following code:
int [ ] b = {1,3,5,7,9};
Arrays.multiples(b, 2);
Figure 2.3a shows the situation just before the call. Figure 2.3b shows the
situation just after the call has occurred; the stack now contains the activation
record for the call, and the formals have been initialized to contain the actuals.
Thus, the formal a of Arrays.multiples refers to the same array as b does.
Finally, Figure 2.3c shows the situation just after Arrays.multiples returns.
At this point, the activation record created for the call has been discarded.
However, the argument array has been modified, and this modification is
visible to the caller through variable b.
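One consequence of call by value is worth seeing side by side with the multiples example: mutating the shared array is visible to the caller, but assigning a new array to the formal is not. In the sketch below, multiples is the method from the text, while the class name CallByValueSketch and the method rebind are hypothetical additions for contrast.

```java
import java.util.Arrays;

class CallByValueSketch {
    // From the text: changes the state of the array shared with the caller.
    static void multiples(int[] a, int m) {
        if (a == null) return;
        for (int i = 0; i < a.length; i++) a[i] = a[i] * m;
    }

    // Hypothetical contrast: assigns to the formal only. The formal lives in
    // this call's activation record, so the caller's variable is untouched.
    static void rebind(int[] a) {
        a = new int[] {0, 0, 0};
    }

    public static void main(String[] args) {
        int[] b = {1, 3, 5, 7, 9};
        multiples(b, 2);
        System.out.println(Arrays.toString(b));   // prints [2, 6, 10, 14, 18]
        rebind(b);
        System.out.println(Arrays.toString(b));   // still [2, 6, 10, 14, 18]
    }
}
```

In both calls the reference held in b is copied into the formal a; only in the first does the called method use that reference to change the object itself.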
In a call e.m in which e is supposed to evaluate to an object, it is possible
that e might instead evaluate to null and thus not refer to any object. If this
happens, the call is not made, but instead the NullPointerException is raised
(exceptions are discussed in Chapter 4).
Type Checking
Java is a strongly typed language, which means that the Java compiler checks
the code to ensure that every assignment and every call is type correct. If a
type error is discovered, compilation fails with an error message.
Type checking depends on the fact that every variable declaration gives
the type of the variable, and the header of every method and constructor
defines its signature: the types of its arguments and results (and also the types
of any exceptions it throws). This information allows the compiler to deduce
an apparent type for any expression. And this deduction then allows it to
determine the legality of an assignment.
For example, consider
int y = 7;
int z = 3;
int x = Num.gcd(z, y);
When the compiler processes the call to Num.gcd, it knows that Num.gcd
requires two integer arguments, and it also knows that expressions z and y
are both of type int. Therefore, it knows the call of gcd is legal. Furthermore,
it knows that gcd returns an int, and therefore it knows that the assignment
to x is legal.
Java has an important property: legal Java programs (that is, those accepted
by the compiler) are guaranteed to be type safe. This means that there cannot
be any type errors when the program runs: it is not possible for the program
to manipulate data belonging to one type as if it belonged to a different type.
Type safety is achieved by three mechanisms: compile-time type checking,
automatic storage management, and array bounds checking. Type safety is
summarized in Sidebar 2.3.
Type Hierarchy
Java types are organized into a hierarchy in which a type can have a number
of supertypes; we say the type is a subtype of each of its supertypes. (Other
texts may use the term superclass [subclass] to mean supertype [subtype]; in
addition, some texts say a type extends another type to mean it is a subtype
of the other type.) Type hierarchy provides a way of abstracting from the
differences among subtypes to their common behavior, which is captured by
their supertype.
Sidebar 2.3 Type Safety
One important difference between Java and C and C++ is that Java provides type
safety. This is accomplished by three mechanisms:
• Java is a strongly typed language. This means that type errors such as using a
pointer as an integer are detected by the compiler.
• Java provides automatic storage management for all objects. In C and C++,
programs manage storage for objects in the heap explicitly. Explicit management is
a major source of errors such as dangling references, in which storage is deallocated
while a program still refers to it.
• Java checks all array accesses to ensure they are within bounds.
These techniques ensure that type mismatches cannot occur at runtime. In this way
an important source of errors is eliminated from your code.
The subtype relation is transitive: if R is a subtype of S, and S is a
subtype of T, then R is a subtype of T. The relation is also reflexive: type S is a
subtype of itself.
If S is a subtype of T, its objects are intended to be usable in any context
that expects to use objects belonging to T. For S objects to be usable, they
must have all the methods that T objects have; this requirement is enforced
by the Java compiler. In addition, all the method calls must behave the same
way on S and T objects; this requirement is not enforced by Java, nor could it
be, since it requires processing beyond the abilities of a compiler (it requires
proving that the two programs behave in the same way). We will discuss this
requirement in greater detail in Chapter 7 when we discuss how to define
sub- and supertypes. For now, you will only make use of predefined type
hierarchies, and you can assume that subtypes are defined properly.
The special type Object is at the top of the type hierarchy in Java; all
object types, including String and array types, are subtypes of this type.
This means all objects have certain methods, namely, the ones specified for
Object. For example, Object methods include equals and toString, with the
following headers:
boolean equals (Object 0)
String toString ()
Object and its methods will be discussed further in Chapter 5.
Since objects of a subtype behave like those of a supertype, it makes
sense to allow them to be referred to by a variable whose declared type is
a supertype. This usage is permitted by Java: an assignment v = e is legal if
the type of e is a subtype of the type of v. For example, the following is legal:
Object o1 = a;
Object o2 = s;
Here a is an array, and s is a string, as shown in Figure 2.2.
An implication of the assignment rule is that the actual type of an object
obtained by evaluating an expression is a subtype of the apparent type of
the expression deduced by the compiler using declarations. For example, the
apparent type of o2 is Object, but its actual type is String.
Type checking is always done using the apparent type. This means, for
example, that any method calls made using the object will be determined
to be legal based on the apparent type. Therefore only Object methods like
equals can be called on o2; string methods like length (which returns a count
of the number of characters in the string) cannot be called:
if (o2.equals("abc")) // legal
if (o2.length( ) == 0) // illegal
Furthermore, the following is illegal:
s = o2; // illegal
because the apparent type of o2 is not a subtype of String. Compilation will
fail when the program contains illegal code as in these examples.
Sometimes a program needs to determine the actual type of an object at
runtime, for example, so that a method not provided by the apparent type can
be called. This can be done by casting. For example,
if (((String) o2).length( ) == 0) // legal
s = (String) o2; // legal
The use of a cast causes a check to occur at runtime; if the check succeeds, the
indicated computation is allowed, and otherwise, the ClassCastException
will be raised. In the example, the casts check whether o2's actual type is the
same as the indicated type String; these checks succeed, and therefore, the
method call in the first statement and the assignment in the second statement
are allowed.
Sidebar 2.4 summarizes this discussion.
Sidebar 2.4 Type Hierarchy
• Java supports type hierarchy, in which one type can be the supertype of other types,
which are its subtypes. A subtype's objects have all the methods defined by the
supertype.
• All object types are subtypes of Object, which is the top of the type hierarchy.
Object defines a number of methods, including equals and toString. Every object is
guaranteed to have these methods.
• The apparent type of a variable is the type understood by the compiler from information
available in declarations. The actual type of an object is its real type, the type it receives
when it is created.
• Java guarantees that the apparent type of any expression is a supertype of its actual
type.
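The difference between apparent and actual types, and the runtime check performed by a cast, can be tried out with a sketch like the following. The class name CastSketch and both method names are assumptions of the example, and the use of instanceof to test the actual type before casting is an addition not discussed in the text above.

```java
class CastSketch {
    // The apparent type of o is Object, so o.length() would not compile.
    // The cast succeeds at runtime only if the actual type is String.
    static int lengthOf(Object o) {
        if (o instanceof String) {
            String s = (String) o;   // runtime check succeeds here
            return s.length();
        }
        return -1;                   // hypothetical convention: not a string
    }

    // Returns true if casting o to String raises ClassCastException.
    static boolean castFails(Object o) {
        try {
            String s = (String) o;   // runtime check happens here
            return false;
        } catch (ClassCastException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object o2 = "abc";               // apparent type Object, actual type String
        Object o1 = new int[] {1, 3};    // apparent type Object, actual type int[]
        System.out.println(lengthOf(o2));
        System.out.println(castFails(o1));
    }
}
```

The compiler accepts both casts because String is a subtype of Object; which of them succeeds is decided only when the program runs.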
2.4.2 Conversions and Overloading
The determination of type correctness is actually not as simple as described
previously, for two reasons. First, Java allows certain implicit conversions of
a value of one type to a value of another type. Implicit conversions involve only
the primitive types. For example, Java allows chars to be widened to numeric
types. Thus, the assignment to n in the following is legal:
In general, conversions involve computation; that is, they cause the
production of a new value (of the variable's type) that is then assigned to the variable.
After the compiler determines the conversion needed to make the assignment
legal, it generates the code needed to produce the new value. You can learn
what conversions are legal, and what computations they involve, by
consulting a Java text.
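A brief sketch of implicit widening follows. The class name WideningSketch and the particular values are assumptions of the example; the conversions themselves (char to int, int to long, long to float) are standard widening primitive conversions in Java.

```java
class WideningSketch {
    // The assignment to n is legal: the char is implicitly widened to an int,
    // producing a new value of type int.
    static int widen(char c) {
        int n = c;
        return n;
    }

    public static void main(String[] args) {
        int n = widen('a');
        long big = n;        // int widened to long
        float f = big;       // long widened to float
        System.out.println(n);   // prints 97
        System.out.println(f);
        // The reverse direction needs an explicit cast, for example:
        int back = (int) big;
        System.out.println(back == n);
    }
}
```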
In addition, Java allows overloading. This means that there can be several
method definitions with the same name. Most languages allow overloaded
definitions of operators; for example, + is defined for both integers and floats.
Java allows overloading of operators, but in addition, it allows programmers
to overload method names as well.
For example, consider a class C with the following methods:
static int comp(int, long) // defn. 1
static float comp(long, int) // defn. 2
static int comp(long, long) // defn. 3
This class provides three overloaded definitions of comp.
When there are overloaded definitions, several of them might work for a
particular call. For example, suppose you have the
declarations
int x;
long y;
float z;
In Java, an int can be widened to a long, and also a long can be widened
to a float. Therefore a call C.comp(x, y) could go to either the first definition
of comp (since here the types match exactly) or the third definition of comp
(by widening x to a long). The second definition is not possible since it isn't
possible to widen a long to an int.
The rule used to determine which method to call when there are several
choices, as in this example, is "most specific." A method m1 is more specific
than another method m2 if any legal call of m1 would also be a legal call of m2 if
more conversions were done. For example, the first definition of comp would
be selected for the call C.comp(x, y) since it is more specific than the third
definition.
If there is no most specific method, a compile time error occurs. For
example, all three definitions are possible matches for the call C.comp(x, x).
However, none of these is most specific, and therefore, the call is illegal.
The programmer can resolve the ambiguity in a case like this by making
the conversion explicit; for example, C.comp((long) x, x) selects the second
definition.
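The selection rule can be observed directly with a sketch such as the one below. To make the chosen definition visible, each overload here returns a String naming itself rather than the int or float results given in the text; that change, the class name OverloadSketch, and the helper methods are assumptions of the example.

```java
class OverloadSketch {
    static String comp(int a, long b)  { return "defn. 1"; }
    static String comp(long a, int b)  { return "defn. 2"; }
    static String comp(long a, long b) { return "defn. 3"; }

    static String callWithIntLong() {
        int x = 1;
        long y = 2;
        return comp(x, y);          // defn. 1 and defn. 3 apply; defn. 1 is most specific
    }

    static String callWithExplicitCast() {
        int x = 1;
        return comp((long) x, x);   // defn. 2 and defn. 3 apply; defn. 2 is most specific
    }

    static String callWithLongLong() {
        long y = 2;
        return comp(y, y);          // only defn. 3 applies
    }

    public static void main(String[] args) {
        System.out.println(callWithIntLong());
        System.out.println(callWithExplicitCast());
        System.out.println(callWithLongLong());
        // comp(x, x) with int x is deliberately absent: all three definitions
        // apply, none is most specific, and the compiler rejects the call.
    }
}
```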
Overloading decisions also take into account assignments from sub- to
supertypes. For example, consider
void foo (T a, int x) // defn. 1
void foo (S b, long x) // defn. 2
Then C.foo(e, 9), where S is a subtype of T and e is a variable of type S, is
not legal since neither definition is most specific.
Dispatching
When a method is called on some object, it is essential that the call go to the
code provided by that object for that method, because only that code can do
the right thing. For example, consider
String t = "ab";
Object o = t + "c"; // concatenation
String r = "abc";
boolean b = o.equals(r);
Here the intention is to find out whether o's value is the string "abc". This
intention will be satisfied if the call goes to the string's code for equals, since this
will compare the values of the two strings. If instead the call goes to Object's
code for equals, we will only learn whether o and r are the very same object.
The problem is that the compiler doesn't necessarily know what code to
call at compile time because it only knows the apparent type of the object
and not its actual type. This is illustrated in the example; the compiler only
knows that o is an Object. If the apparent type were used to determine the
code to call, the wrong result would happen; for example, b would contain
false because o and r are distinct objects.
Therefore, we need a way to dispatch a method call to the code of the actual
object. This requires a runtime mechanism since the compiler cannot figure
out what to do at compile time.
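The effect of dispatching can be observed by comparing the result of equals with the result of ==, which tests whether two variables refer to the very same object. The sketch below reuses the variables of the example; the class name DispatchSketch and the method demo are assumptions made for illustration.

```java
class DispatchSketch {
    // Returns { result of o.equals(r), result of o == r }.
    static boolean[] demo() {
        String t = "ab";
        Object o = t + "c";   // concatenation at runtime creates a new String object
        String r = "abc";
        // The apparent type of o is Object, but the call is dispatched to the
        // code of its actual type, String, which compares the characters.
        boolean sameValue = o.equals(r);
        // == compares references: o and r are distinct objects.
        boolean sameObject = (o == r);
        return new boolean[] { sameValue, sameObject };
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println(result[0]);   // prints true
        System.out.println(result[1]);   // prints false
    }
}
```

Had the call gone to Object's code for equals, the first result would have matched the second.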
Figure 2.4 illustrates one way that dispatching works. Each object contains
a reference to a dispatch vector. The dispatch vector contains an entry for each
of the object's methods. The compiler generates code to access the location in
the vector that points to the code of the method being called and branch to
that code. The figure shows the situation for object o; a call to the equals
method would branch to the code referred to by the first location in the table,
and thus the call will go to the implementation provided by String.
Figure 2.4 Dispatching
Types
This section describes a few object types that are nonstandard (i.e., they don't
appear in other languages) and that we will use throughout the book.
Primitive Object Types
Primitive types like int and char are not subtypes of Object, and their values,
such as 8 and c, cannot be used in contexts where objects are required. For
example, such values cannot be stored in Vectors; Vectors are discussed in
Section 2.6.2.
Primitive values can be used in contexts requiring objects by wrapping
them in objects. Each primitive type has an associated object type (e.g.,
Integer for int, Character for char). Such a type provides a constructor for
producing one of its objects by wrapping a value of the associated primitive
type, and a method to do the reverse transformation. For example, for Integer
we have
public Integer(int x) // the constructor
public int intValue( ) // the method
‘These types also provide methods to produce objects oftheir associated types
from strings. Thus,
int n = Integer. pareetne(s);
2.6.2
2.6—Types
‘will return the int described by string e. Forexample, ifs isthe string "1024",
jn will contain the integer 1024. Ifs cannot be interpreted as an integer, the
method will throw YunberFornatException,
“The primitive object types have a numberof other useful methods; consult
a Java text to learn about them. They are defined in the package java. lang,
“This package defines « uumber of types that are so central to Java that the
package can be used without needing to import it. For example, the types
String and Object are defined in java. lang,
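A minimal sketch of wrapping, unwrapping, and parsing follows. The class WrapperDemo and the helper parseOrDefault are hypothetical names used only for illustration. The sketch uses Integer.valueOf rather than the constructor shown above, because recent Java releases deprecate the wrapper constructors.

```java
class WrapperDemo {
    // Returns the int described by s, or fallback if s cannot be
    // interpreted as an integer (parseInt throws NumberFormatException).
    static int parseOrDefault(String s, int fallback) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Wraps a primitive value in an object and converts it back.
    static int roundTrip(int x) {
        Integer boxed = Integer.valueOf(x); // wrap
        return boxed.intValue();            // unwrap
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("1024", -1));
        System.out.println(parseOrDefault("ten", -1));
        System.out.println(roundTrip(7));
    }
}
```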
2.6.2 Vectors
Vectors are extensible arrays; they are empty when first created and can grow
and shrink on the high end. Vectors are defined in the java.util package.
Here we discuss some of their methods; for more information, consult a Java
text.
Like an array, a vector contains elements numbered from zero up to one less
than its current length. The length of a vector can be determined by calling
its size method.
Each element in the vector has the apparent type Object. This means that
vectors can be heterogeneous: different elements of a vector can be objects of
different types. However, vectors typically are used in a more limited way,
so that all elements of a vector are of the same type or of a few closely related
types.
When a vector is created, it is empty and its length is zero; for example,
Vector v = new Vector( ); // creates a new, empty Vector
if (v.size( ) == 0) // true
A vector can be made to grow by using the add method to add an element to
its high end; for example,
v.add("abc");
This method increases the size of the vector by 1 and stores its argument in
the new location.
Vector elements can be accessed for legal indices. The get method fetches
the indexed element; for example,
String s = (String) v.get(0);
Note that get returns an Object, and the using code must then cast the result
to the appropriate type. If the given index is not within bounds, get throws
IndexOutOfBoundsException:
String t = (String) v.get(1); // throws IndexOutOfBoundsException
The set method is used to change a particular element; for example,
v.set(0, "def"); // now v contains the single element "def"
Finally, the vector can be caused to shrink by using the remove method; for
example,
v.remove(0);
Because all elements of a vector must belong to types that are subtypes of
Object, vectors cannot contain elements of primitive types such as int and
char. Such values can be stored in a vector by using the associated object
types. For example,
v.add(3); // a compile time error
v.add(new Integer(3)); // legal
To use such an element later, it must be both cast and converted to a value; for
example,
int x = ((Integer) v.get(2)).intValue( );
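The vector operations described in this section can be combined in one small sketch. VectorDemo and sumOfIntegers are hypothetical names. The sketch declares the vector as Vector<Object>, which assumes a Java version with generics and keeps the apparent element type Object as in the text; it also wraps values explicitly with Integer.valueOf, although later Java versions would wrap an int argument automatically.

```java
import java.util.Vector;

class VectorDemo {
    // Returns the sum of the Integer elements of v; other elements are skipped.
    static int sumOfIntegers(Vector<Object> v) {
        int sum = 0;
        for (int i = 0; i < v.size(); i++) {
            Object o = v.get(i);                 // apparent type is Object
            if (o instanceof Integer) {
                sum += ((Integer) o).intValue(); // cast, then convert to a value
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Vector<Object> v = new Vector<Object>(); // new, empty vector
        v.add("abc");                            // grows to size 1
        v.set(0, "def");                         // replaces element 0
        v.add(Integer.valueOf(3));
        v.add(Integer.valueOf(4));
        String s = (String) v.get(0);
        System.out.println(s + " " + sumOfIntegers(v));
        v.remove(0);                             // shrinks the vector
        System.out.println(v.size());
    }
}
```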
2.7 Stream Input/Output
The package java.io provides a number of types of input and output streams.
These facilities are briefly described here; more detail can be found in a Java
text.
Input/output (I/O) is done using character streams. Input is done using
objects that belong to type Reader or one of its subtypes. For example,
BufferedReader objects can be used to read characters from a stream. (The
term buffered indicates that input is done in larger chunks than individual
characters, and the data are then kept in a buffer until they are used.) The
content of a file can be read by using subtype FileReader; for example,
FileReader in = new FileReader(filename);
where the string, filename, is the pathname of the file.
Output occurs on objects of type Writer or a subtype of this type. Subtype
PrintWriter can be used to print values and objects to an output device, while
subtype FileWriter can be used to send output to a file.
Java also provides some predefined objects for doing standard I/O; these
objects are defined in class System of package java.lang:
System.in // Standard input to the program.
System.out // Standard output from the program.
System.err // Error output from the program.
These objects are not character stream objects (since they were defined in Java
1.0, before character streams, which were introduced in Java 1.1, had been
invented). In particular, System.in is an InputStream, while System.out and
System.err are PrintStreams. However, this difference need not concern you
very much because InputStreams behave like Readers (i.e., have the same
methods), and OutputStreams are similar to Writers. Furthermore, you can
use them as character streams by wrapping them; for example,
PrintWriter myOut = new PrintWriter(System.out);
BufferedReader myIn = new BufferedReader(new InputStreamReader(System.in));
Most methods on streams do error checking (e.g., to check for end of file
when data is being input from a file) and throw IOException if an error is
detected.
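As a rough sketch of how these types fit together, the hypothetical method numberLines below reads lines through a BufferedReader and prints them through a PrintWriter. It accepts any Reader and Writer, so it could be called with new InputStreamReader(System.in) and new PrintWriter(System.out), or with in-memory streams as in main. Converting IOException to UncheckedIOException is a choice made for this sketch only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class StreamDemo {
    // Copies every line from in to out, prefixing each line with its number.
    // Returns the number of lines copied.
    static int numberLines(Reader in, Writer out) {
        BufferedReader reader = new BufferedReader(in);
        PrintWriter writer = new PrintWriter(out);
        int count = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) { // null signals end of input
                count++;
                writer.println(count + ": " + line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        writer.flush(); // PrintWriter buffers output; flush to make it visible
        return count;
    }

    public static void main(String[] args) {
        StringWriter result = new StringWriter();
        int n = numberLines(new StringReader("first\nsecond\n"), result);
        System.out.print(result);
        System.out.println(n + " lines");
    }
}
```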
2.8 Java Applications
There are two kinds of Java applications: those run from the command line on
a terminal, and those run by interacting with a user interface. We will discuss
the latter kind of application in Chapters 11 through 14.
Applications that run from the command line provide a main method. This
method takes an array of strings as an argument:
public static void main(String[ ])
where the argument array contains the command-line arguments.
The following example is a trivial complete program that prints the string
"Hello world" followed by a newline to standard output:
public class HelloWorld {
  public static void main(String[ ] args) {
    System.out.println("Hello world");
  }
}
Since the name of the method is main, the program can be run from the
command line.
The next example reads an integer from an input stream and prints its
factorial to an output stream. It shows how to read and write integers from
streams; other built-in types can be read/written similarly.
import java.io.*;

public class ComputeFactorial {
  public static void main (String[ ] args) {
    PrintWriter out = new PrintWriter(System.out);
    BufferedReader in =
      new BufferedReader(new InputStreamReader(System.in));
    PrintWriter err = new PrintWriter(System.err);
    out.println("Enter an integer: ");
    out.flush( ); // make the prompt visible before waiting for input
    String s = null;
    try {
      s = in.readLine( );
      int n = Integer.parseInt(s);
      if (n > 0) {
        out.print(n);
        out.print("! = ");
        // Num.fact is assumed to be a factorial procedure defined elsewhere
        out.println(Num.fact(n));
      } else err.println("input not positive");
    } catch (Exception e) { err.println("bad input"); }
    // flush buffered output before the program ends
    out.close( ); err.close( );
  }
}
Note that the code does not check directly for badly formatted input.
Instead, it relies on the checking done within the call to the Integer.parseInt
method; recall that this method will throw NumberFormatException if
there is a formatting problem. If this exception, or IOException, occurs, it is
handled by the try-catch construct (as discussed further in Chapter 4), and
the code produces an appropriate error message.
Although these examples perform a single computation that transforms an
input into an output and then terminates, more generally applications run for
a long time and interact with a user or other programs to determine what to
do. We will discuss long-lived applications in Chapters 11 through 14.
Exercises
2.1 Consider the following code:
String s1
String s2 =
String s3 = s1;
String s4 = s3 + s2; // concatenation
Illustrate the effect of the code on the heap and stack by drawing a diagram
similar to that in Figure 2.2.
2.2 Consider the code:
int[ ] a = {1, 2, 3};
ant) b= nev int fal;
int[ ] c = a;
int x = c[0];
Illustrate the effect of this code on the heap and stack by drawing a diagram
similar to that in Figure 2.2.
2.3 Extend the diagram you produced in question 2.2 to show the effect of the
following code:
to} = x;
alt) = 6;
x= 000;
ye ath:
2.4 Consider the routine:
void sums (int[ ] z) {
  if (z == null || z.length == 0) return;
  for (int i = 1; i < z.length; i++)
    z[i] = z[i-1] + z[i];
}
This routine modifies its argument z so that when the routine returns, each
element z[i] contains the sum of the values z[0], ..., z[i] as of the time of
the call. Show the effect of the following code:
int[ ] a = {2, 4, 6, 8};
Arrays.sums(a);
by providing diagrams similar to those in Figure 2.3. Show the state of the
program right before the call of sums, right after the call of sums starts running,
and right after sums returns.
2.5 Consider the following code:
Object o = "abc";
For each of the following statements, indicate whether or not a compile-
time error will occur, and for those statements that are legal at compile time,
indicate whether they will return normally or by throwing an exception; if
the return is normal, also indicate the result.
boolean b + o-equala(ta, by 6%
char c = o.charAt(1);
Object o2 = b;
String s = o;
String t = (String) o;
c= cucharst();
© = techarat(a);
2.6 Consider the following code:
int[ ] a = {1, 2, 3};
Object o = "123";
String t = "12";
String v= t+ 2";
boolean b = o.equals(a);
boolean b2
boolean b3
boolean bi
Show the effect of executing this code by means of a diagram similar to that
in Figure 2.2. Also explain how the code arrived at the results in b, b1, b2,
and b3.
2.7 Consider the following definitions:
void m (Object o, long x, long y) // defn 1
void m (String s, int x, long y) // defn 2
void m (Object o, int x, long y) // defn 3
void m (String s, long x, int y) // defn 4
and suppose you have the following variable declarations:
object u;
String vi
int a;
long b;
For each of the following calls, determine which definitions would match a
particular call; also decide whether the call is legal, and if so, which of the
preceding definitions is selected.
abr, ay Di
atv, a, i
aly, B®
ate, B,D:
ale, b, B):
lo, a, ai
Chapter 3 Procedural Abstraction
In this chapter, we discuss the most familiar kind of abstraction used in
programming, the procedural abstraction, or procedure for short. Anyone who
has introduced a subroutine to provide a function that can be used in other
programs has used procedural abstraction. Procedures combine the methods
of abstraction by parameterization and specification in a way that allows us
to abstract a single action or task, such as computing the greatest common
divisor (gcd) of two integers or sorting an array.
A procedure provides a transformation from input arguments to output
arguments. More precisely, it is a mapping from a set of input arguments to a set
of output results, with possible modifications of the inputs. The set of inputs
or outputs, or both, might be empty. For example, gcd has two inputs and one
output, but it does not modify its inputs. By contrast, a sort procedure might
have one input (the array to be sorted) and no output, and it does modify its
input (by sorting it).
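The contrast between the two kinds of procedure can be sketched as follows. ProcedureDemo, gcd, and sortInPlace are hypothetical names for this illustration; gcd uses Euclid's algorithm and assumes nonnegative arguments that are not both zero, and sortInPlace simply delegates to java.util.Arrays.sort.

```java
import java.util.Arrays;

class ProcedureDemo {
    // Two inputs, one output, no modification of the inputs.
    // Assumes n >= 0, d >= 0, and not both zero.
    static int gcd(int n, int d) {
        while (d != 0) {
            int r = n % d;
            n = d;   // only the local copies change; the caller's values do not
            d = r;
        }
        return n;
    }

    // One input, no output; the input array is modified (sorted).
    static void sortInPlace(int[] a) {
        Arrays.sort(a);
    }

    public static void main(String[] args) {
        System.out.println(gcd(12, 18));
        int[] a = {3, 1, 6, 1};
        sortInPlace(a);
        System.out.println(Arrays.toString(a));
    }
}
```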
We begin with the benefits of abstraction and, in particular, of abstraction
by specification. Next we discuss specifications and why they are needed.
Then we discuss how to specify and implement standalone procedures; these
are procedures that are independent of particular objects. We conclude with
some general remarks about their design.
3.1 The Benefits of Abstraction
An abstraction is a many-to-one map. It "abstracts" from "irrelevant" details,
describing only those details that are relevant to the problem at hand. Its
realizations must all agree in the relevant details but can differ in the irrelevant
ones. Of course, distinguishing what is relevant from what is irrelevant is not
always easy. A major portion of this book will be concerned with how this is
done.
In abstraction by parameterization, we abstract from the identity of the
data being used. The abstraction is defined in terms of formal parameters;
the actual data are bound to these formals when the abstraction is used. Thus,
the identity of the actual data is irrelevant, but the presence, number, and
types of the actuals are relevant. Parameterization generalizes abstractions,
making them useful in more situations. A virtue of such generalizations is
that they decrease the amount of code that needs to be written and, thus,
modified and maintained.
In abstraction by specification, we focus on the behavior that the user
can depend on and abstract from the details of implementing that behavior.
Therefore, the behavior ("what" is done) is relevant, while the method of
realizing that behavior ("how" it is done) is irrelevant. For example, for an
isPrime procedure, the fact that the procedure determines whether or not its
argument is a prime is relevant, but the details of how this is determined are
irrelevant.
Figure 3.1 The general structure of abstraction by specification
Sidebar 3.1 Benefits of Abstraction by Specification
• Locality: The implementation of an abstraction can be read or written without needing
to examine the implementations of any other abstractions.
• Modifiability: An abstraction can be reimplemented without requiring changes to
any abstractions that use it.
A key advantage of abstraction by specification is that it allows us to
change to another implementation without affecting the meaning of any
program that uses the abstraction (Figure 3.1). For example, we could change
the algorithm used to implement the isPrime procedure, and programs
using isPrime would continue to run correctly with this new implementation
(although some change in performance might be noticed). The
implementations could even be written in different programming languages, provided
that the data types of the arguments are treated the same in these languages.
For example, in many systems implemented in higher-level languages, it is
common to implement some abstractions in machine language to improve
performance.
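To make the isPrime example concrete, here is a sketch of two implementations that satisfy the same specification. The names PrimeDemo, isPrimeSimple, and isPrimeFast are hypothetical, and neither method is taken from the book; they only illustrate that callers cannot tell the two apart except by running time.

```java
class PrimeDemo {
    // EFFECTS: Returns true if n is prime; otherwise returns false.
    // Tries every candidate divisor below n.
    static boolean isPrimeSimple(int n) {
        if (n < 2) return false;
        for (int d = 2; d < n; d++)
            if (n % d == 0) return false;
        return true;
    }

    // EFFECTS: Returns true if n is prime; otherwise returns false.
    // Same specification, different algorithm: tries only odd divisors
    // up to the square root of n.
    static boolean isPrimeFast(int n) {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (int d = 3; (long) d * d <= n; d += 2)
            if (n % d == 0) return false;
        return true;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 30; n++)
            if (isPrimeSimple(n) != isPrimeFast(n))
                System.out.println("implementations disagree at " + n);
        System.out.println(isPrimeFast(97));
    }
}
```

A program written against the specification keeps its meaning when one method is substituted for the other.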
Abstraction by specification provides a method for achieving a program
structure with two advantageous properties. These benefits are summarized
in Sidebar 3.1. The first property is locality, which means that the
implementation of one abstraction can be read or written without needing to examine
the implementation of any other abstraction. To write a program that uses an
abstraction, a programmer need understand only its behavior, not the details
of its implementation.
Locality is beneficial both when a program is being written and later
when someone wants to understand it or reason about its behavior. Because of
locality, different abstractions that make up a program can be implemented by
people working independently. One person can implement an abstraction that
uses another abstraction being implemented by someone else. As long as both
people agree on what the used abstraction is, they can work independently
and still produce programs that work together properly. Also, understanding a
program can be accomplished one abstraction at a time. To understand the code
that implements one abstraction, it is necessary to understand what the used
abstractions are, but not the code that implements them. In a large program,
the amount of information that is not needed can be enormous; we can ignore
not only the code of the used abstractions but also the code of any abstractions
they use, and so on.
The second property is modifiability. Abstraction by specification helps
to bound the effects of program modification and maintenance. If the
implementation of an abstraction changes but its specification does not, the rest of
the program will not be affected by the change. Of course, if the number of
abstractions that must be reimplemented is large, making a modification will
still be a lot of work. As will be discussed later, the workload can be reduced
by identifying potential modifications while designing the program and then
trying to limit their effects to a small number of abstractions. For example,
if the effects of machine dependencies can be limited to just a few
abstractions, the result will be software that can be transported readily to another
machine.
Modifiability leads to a sensible method of tuning performance.
Programmers are notoriously bad at predicting where time will actually be spent in
a complex system, probably because it is difficult to anticipate where
bottlenecks will arise. Since it is unwise to invest effort in inventing techniques that
avoid nonexistent bottlenecks, a better method is to start with a simple set of
abstractions, run the system to discover where the bottlenecks are, and then
reimplement the abstractions that are bottlenecks.
3.2 Specifications
It is essential that abstractions be given precise definitions; otherwise, the
advantages discussed in Section 3.1 cannot be achieved. For example, we can
replace one implementation of an abstraction by another only if everything
that was depended on by users of the old implementation is supported by the
new one. The entity depended on and supported is the abstraction. Therefore,
we must know what the abstraction is.
We shall define abstractions by means of specifications, which are written
in a specification language that can be either formal or informal. The advantage
of formal specifications is that they have a precise meaning. However, we shall
use informal specifications in this book, in which the behavior of the
abstraction is given in English. Informal specifications are easier to read and write
than formal ones, but giving them a precise meaning is difficult because the
informal specification language is not precise. Despite this, informal
specifications can be very informative and can be written in such a way that readers
will have little trouble understanding their intended meaning.
A specification is distinct from any implementation of the abstraction it
defines. The implementations are all similar because they implement the same
abstraction; they differ because they implement it in different ways. The
specification defines their commonality.
A specification language is not a programming language. Thus, our
specifications will not be written in Java. Furthermore, specifications are usually
quite different from programs because they focus on describing what the
abstraction is rather than how it is implemented. This allows them to be much
shorter and easier to read than the corresponding implementation.
3.3 Specifications of Procedural Abstractions
The specification of a procedure consists of a header and a description of
effects. The header gives the name of the procedure, the number, order, and
types of its parameters, and the type of its result; it also lists any exceptions
thrown by the procedure, but we defer discussion of exceptions to Chapter 4.
In addition, names must be given for the parameters. For example, the header
for removeDupls is
void removeDupls (Vector v);
while the header of sqrt is
float sqrt (float x);
The information in the header is syntactic; it describes the "form" of
the procedure. It is similar to a description of the "form" of a mathematical
function, as in
f: integer → integer
In neither case is the meaning (what the procedure or the function does)
described. The meaning is captured in the semantic part of the specification,
in which the behavior of the procedure is described in English, possibly
extended with convenient mathematical notation. This description makes use
of the names of the inputs.
Figure 3.2 shows a template of a procedure specification. The semantic
part of a specification consists of three parts: the requires, modifies, and effects
clauses. These clauses should appear in the order shown, although the requires
and modifies clauses are optional. The clauses are shown as comments because
they should always appear in your code.
Figure 3.2 Specification template for procedural abstractions
return_type pname (...)
  // REQUIRES: This clause states any constraints on use
  // MODIFIES: This clause identifies all modified inputs
  // EFFECTS: This clause defines the behavior
The clauses describe a relation between the procedure's inputs and results.
For most procedures, the inputs are exactly the parameters that are listed in the
procedure header. However, some procedures have additional implicit inputs.
For example, a procedure might read a file and write some information on
System.out; the file and System.out are also inputs of the procedure.
The requires clause states the constraints under which the abstraction is
defined. The requires clause is needed if the procedure is partial, that is, if
its behavior is not defined for some inputs. If the procedure is total, that is,
if its behavior is defined for all type-correct inputs, the requires clause can
be omitted. In this case, the only restrictions on a legal call are those implied
by the header, that is, the number and types of the arguments.
The modifies clause lists the names of any inputs (including implicit inputs)
that are modified by the procedure. If some inputs are modified, we say the
procedure has a side effect. The modifies clause can be omitted when no inputs
are modified. The absence of the modifies clause means that none of the inputs
is modified.
Finally, the effects clause describes the behavior of the procedure for all
inputs not ruled out by the requires clause. It must define what outputs are
produced and also what modifications are made to the inputs listed in the
modifies clause. The effects clause is written under the assumption that the
requires clause is satisfied, and it says nothing about the procedure's behavior
when the requires clause is not satisfied.
In Java, standalone procedures are defined as static methods of classes.
To use such a method, it is necessary to know its class. Therefore, we need
to include this information with the specification, giving us the expanded
template shown in Figure 3.3. We have simply added a little information
about the class: its name and a brief description of its purpose. Additionally,
the specification indicates the visibility of the class and each standalone
procedure; the visibility of the class and the procedures usually will be public,
so that the standalone procedures can be used in other packages.
Figure 3.3 Specification template for class providing standalone procedures
visibility class cname {
  // OVERVIEW: This clause defines the purpose of the class as a whole.
  visibility static p1 ...
  visibility static p2 ...
}
Figure 3.4 Standalone procedure specifications
public class Arrays {
  // OVERVIEW: This class provides a number of standalone procedures that
  //   are useful for manipulating arrays of ints.
  public static int search (int[ ] a, int x)
    // EFFECTS: If x is in a, returns an index where x is stored;
    //   otherwise, returns -1.
  public static int searchSorted (int[ ] a, int x)
    // REQUIRES: a is sorted in ascending order
    // EFFECTS: If x is in a, returns an index where x is stored;
    //   otherwise, returns -1.
  public static void sort (int[ ] a)
    // MODIFIES: a
    // EFFECTS: Rearranges the elements of a into ascending order
    //   e.g., if a = [3, 1, 6, 1] before the call, on return a = [1, 1, 3, 6].
}
A partial specification of a class, Arrays, which provides a number of
standalone procedures that are useful for manipulating arrays of integers, is
given in Figure 3.4. Since the class and the methods are public, the methods
can be used by code outside the package containing the class definition.
In the specification, we can see that search and searchSorted do not
modify their inputs, but sort modifies its input, as indicated in the modifies
clause. Note the use of an example in the sort specification. Examples can
clarify a specification and should be used whenever convenient.
Note also that sort and search are total, since their specifications do not
contain a requires clause. searchSorted, however, is partial; it only does its
job if its argument array is sorted. Note that the effects clause does not state
what searchSorted does if the argument does not meet this constraint. In
this case, the implementor can do whatever is convenient; for example, the
implementation could even run forever. Obviously, this is not a very desirable
situation, and therefore you should avoid the use of the requires clause as much
as possible. This issue is discussed further in Section 3.5.
When a procedure modifies the state of some input, the specification needs
to relate the state of the object at return with its state at the time of call. This
is what happens in the specification of sort. Writing such specifications can
be simplified by having notation to identify these different states explicitly.
We will make use of the following notation: the name of a formal argument,
for example, x, denotes its state at the time of call, and x_post denotes its state
at return. Thus, an alternative way of writing the specification for sort is
public static void sort (int[ ] a)
  // MODIFIES: a
  // EFFECTS: Rearranges the elements of a into ascending order.
  //   For example, if a = [3, 1, 6, 1], a_post = [1, 1, 3, 6].
Sometimes a procedure must produce a new object. For example, consider
public static int[ ] boundArray (int[ ] a, int n)
  // EFFECTS: Returns a new array containing the elements of a in the
  //   order they appear in a except that any elements of a that are
  //   greater than n are replaced by n.
You might wonder whether boundArray could return its argument array if
none of its elements exceed n. However, this possibility is ruled out by the
specification, which indicates that boundArray must return a new object.
And obviously this requirement is important, since arrays are mutable: if
boundArray returned its argument, the using code is likely to notice the
sharing.
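One possible implementation of boundArray is sketched below. It is not the book's code; the class name BoundDemo is hypothetical, and the sketch assumes that the argument array is not null. The point of the sketch is that a fresh array is allocated on every call, even when no element exceeds n.

```java
class BoundDemo {
    // EFFECTS: Returns a new array containing the elements of a in the order
    //   they appear in a, except that any elements greater than n are
    //   replaced by n.
    // Assumption of this sketch: a is not null.
    static int[] boundArray(int[] a, int n) {
        int[] result = new int[a.length]; // always a new object, never a itself
        for (int i = 0; i < a.length; i++) {
            result[i] = Math.min(a[i], n);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] a = {1, 2};
        int[] b = boundArray(a, 5);  // no element exceeds 5
        b[0] = 99;                   // changing b must not affect a
        System.out.println(a[0]);
        System.out.println(a == b);
    }
}
```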
In Figure 3.4, all procedures use only formal parameters as inputs. Here is
an example of a specification of a procedure that has implicit inputs, namely
System.in and System.out:
public static void copyLine( )
  // REQUIRES: System.in contains a line of text
  // MODIFIES: System.in and System.out
  // EFFECTS: Reads a line of text from System.in, advances the cursor in
  //   System.in to the end of the line, and writes the line on System.out.
Note that the specification describes what the procedure does to the implicit
inputs.
Typically, specifications are written first, in advance of writing the code
that implements them. At that point, the class should be given a skeleton
implementation, consisting of just the method headers and specifications. The
bodies of the routines will be missing; code will be provided for these bodies
at a later time.
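A skeleton of this kind might look like the following sketch. The class name ArraysSkeleton is hypothetical. Because Java requires a body for every non-abstract method, the sketch uses a placeholder body that throws UnsupportedOperationException; this placeholder is an assumption of the sketch, not something the text prescribes.

```java
class ArraysSkeleton {
    // OVERVIEW: This class provides a number of standalone procedures that
    //   are useful for manipulating arrays of ints.

    static int search(int[] a, int x) {
        // EFFECTS: If x is in a, returns an index where x is stored;
        //   otherwise, returns -1.
        throw new UnsupportedOperationException("search: body not yet written");
    }

    static void sort(int[] a) {
        // MODIFIES: a
        // EFFECTS: Rearranges the elements of a into ascending order.
        throw new UnsupportedOperationException("sort: body not yet written");
    }
}
```

Code that uses the class can already be compiled against these headers while the bodies are being written.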
3.4 Implementing Procedures
The implementation of a procedure should produce the behavior defined by
its specification. In particular, it should modify only those inputs that appear
in the modifies clause; and if all inputs satisfy the requires clause, it should
produce the result in accordance with the effects clause.
Figure 3.5 shows a Java method that implements searchSorted (specified
in Figure 3.4) using linear search. Note that the implementation of
searchSorted returns -1 when passed null in place of the argument array. This
behavior is consistent with what is described in its specification. However, a
better specification might have treated this case specially, by indicating that
an exception should be thrown. Exceptions will be discussed in Chapter 4.
Note also that we have included a comment in the code explaining the
algorithm in use; such a comment is not needed if the algorithm is straightforward
but should be included if it is not.
As a second example, consider the sort procedure specified in Figure 3.4.
One possible method is quick sort, which partitions the elements of the array
into two contiguous groups such that all the elements in the first group are
no larger than those in the second group; it continues to partition recursively
until the entire array is sorted. To carry out these steps, we use two subsidiary
procedures: quickSort, which causes the partitioning of smaller and smaller
subparts of the array, and partition, which performs the partitioning of a
designated subpart of the array.
Figure 3.5 An implementation of searchSorted
public class Arrays {
  // OVERVIEW: This class provides a number of standalone procedures that
  //   are useful for manipulating arrays of ints.
  public static int searchSorted (int[ ] a, int x) {
    // REQUIRES: a is sorted in ascending order
    // EFFECTS: If x is in a, returns an index where x is stored;
    //   otherwise, returns -1.
    // uses linear search
    if (a == null) return -1;
    for (int i = 0; i < a.length; i++)
      if (a[i] == x) return i; else if (a[i] > x) return -1;
    return -1;
  }
  // other static methods go here
}
Figure 3.6 shows the sort implementation. Note that the quickSort and
partition routines are not declared to be public; instead, their use is limited
to the Arrays class. This is appropriate because they are just helper routines
and have little utility in their own right. Nevertheless, we have provided
specifications for them; these specifications are of interest to someone
interested in understanding how quickSort is implemented but not to a user of
quickSort.
As another example, consider a class Vectors that is similar to Arrays but
instead provides useful routines for vectors (recall that vectors are extensible
arrays of objects). One routine provided by this class removes duplicates from
a vector. Figure 3.7 on page 50 contains the specification and implementation
of this routine. Note that the specification explains what "duplicate" means:
it is determined by using the equals method to compare elements of the
vector.
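Figure 3.6 Quick sort implementation
public class Arrays {
  // OVERVIEW: ...
  public static void sort (int[ ] a) {
    // MODIFIES: a
    // EFFECTS: Sorts a[0], ..., a[a.length - 1] into ascending order.
    if (a == null) return;
    quickSort(a, 0, a.length-1); }
  private static void quickSort(int[ ] a, int low, int high) {
    // REQUIRES: a is not null and 0 <= low & high < a.length
    // MODIFIES: a
    // EFFECTS: Sorts a[low], a[low+1], ..., a[high] into ascending order.
    if (low >= high) return;
    int mid = partition(a, low, high);
    quickSort(a, low, mid);
    quickSort(a, mid + 1, high); }
  private static int partition(int[ ] a, int i, int j) {
    // REQUIRES: a is not null and 0 <= i < j < a.length
    // MODIFIES: a
    // EFFECTS: Reorders a[i], ..., a[j] into two contiguous groups such that
    //   no element of the first group is larger than any element of the
    //   second group; returns the index of the last element of the first group.
    // The lines from here to the first inner loop are missing in the source;
    // they are reconstructed on the assumption of a standard pivot partition.
    int x = a[i];
    while (true) {
      while (a[j] > x) j--;
      while (a[i] < x) i++;
      if (i < j) { // need to swap
        int temp = a[i]; a[i] = a[j]; a[j] = temp;
        j--; i++; }
      else return j; }
  }
}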
Chapter 3
Figure 3.7,
3.5
Procedural Abstraction
Removing duplicates from a vector
public class Vectors {
{|[ overview: Provides useful standalone procedures for manipulating vectors.
public static void renoveDuple (Vector ¥) {
J) noquines: All elements of are not nl.
J] movies: ¥
ij recs: Removes all duplicate elements from v; uses equals £9
i] determine duplicates. The order of remaining elements may change.
Sf (y == auld) return;
for (int 4 = 0; 4 < vsizet ); s4+) ¢
Object x = v.gee(t)
sot sia;
1 remove all dups of x from the rest of ©
while (j < v.size( ))
Af Cix.equale(v.gee(3))) j++:
else { v.set(j, v-LlaseEleaent( ));
verenove(y.size( )-1);
3.5 Designing Procedural Abstractions
In this section, we discuss a number of issues that arise in designing procedural
abstractions.
Procedures are introduced during program design to shorten the calling
code and clarify its structure. In this way, the calling code becomes easier to
understand and to reason about. However, it is possible to introduce too many
procedures. For example, the partition procedure in Figure 3.6 is worth
introducing because it has a well-defined purpose and because it allows us to
separate the details of partitioning the array from controlling the partitioning,
thus making quickSort easier to understand. Further decomposition is
probably counterproductive, however. For example, the loop body in partition
could be made into a procedure, but its purpose would be difficult to state,
and neither partition itself nor the new procedure would do much.
Procedures, as well as the other kinds of abstractions that we shall discuss
later, should be designed to be minimally constraining; care should be taken
to constrain details of the procedure's behavior only to the extent necessary.
In this way, we leave more freedom to the implementor, who may be able
to provide a more efficient implementation as a result. However, details that
matter to users must be constrained or the procedure will not be what is
needed.
One kind of detail that is almost certainly left undefined is the algorithm
to be used in the implementation. Generally, users do not depend on such
details. (There are exceptions, however: for example, a numerical procedure
may be constrained to use a well-known numerical method so that its behavior
with respect to rounding errors will be well defined.) Some details of what
the procedure does may also be left undefined, leading to a procedure that is
underdetermined. This means that for certain inputs, instead of a single correct
output, there is a set of acceptable outputs. An implementation is constrained
to produce some member of that set, but any member will do.
The search and searchSorted procedures are underdetermined because
we did not state exactly what index should be returned if x occurs in the
array more than once. This means that implementations can differ in this
regard. For example, Figure 3.8 shows another implementation of
searchSorted using binary search. This implementation differs from the one using
linear search (see Figure 3.5) in many details. For example, for all but very
small arrays, binary search is faster than linear search. Moreover, if x appears
in a more than once, the two procedures may return different indices.
Finally, if x is contained in a but a is not sorted, the implementation using
binary search may return -1 when the other implementation finds the
index of x, or vice versa (as an example, consider a = [1, 7, 6, 4, 9] and x = 7).
Nevertheless, both implementations are correct realizations of the
searchSorted abstraction since both provide behavior that is consistent with the
specification.
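The differences described above can be observed directly. The following is only a sketch: the class name SearchDemo and the two method names are hypothetical, chosen so that both versions can sit side by side, and the linear version assumes that the search in Figure 3.5 scans from the left and stops at the first element greater than x.

```java
// Hypothetical demo class; both methods are intended to satisfy the
// searchSorted specification (REQUIRES: a is sorted in ascending order).
class SearchDemo {
    // Linear search: scans left to right and gives up at the first
    // element that is larger than x.
    static int linearSearchSorted(int[] a, int x) {
        if (a == null) return -1;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == x) return i;
            if (a[i] > x) return -1;
        }
        return -1;
    }

    // Binary search: repeatedly halves the range that could contain x.
    static int binarySearchSorted(int[] a, int x) {
        if (a == null) return -1;
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (x == a[mid]) return mid;
            if (x < a[mid]) high = mid - 1; else low = mid + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sortedWithDuplicates = {1, 7, 7, 7, 9};
        int[] unsorted = {1, 7, 6, 4, 9};   // violates the requires clause
        System.out.println(linearSearchSorted(sortedWithDuplicates, 7));
        System.out.println(binarySearchSorted(sortedWithDuplicates, 7));
        System.out.println(linearSearchSorted(unsorted, 7));
        System.out.println(binarySearchSorted(unsorted, 7));
    }
}
```

On the sorted array with duplicates, the two methods return different but equally acceptable indices. On the unsorted array, the requires clause is violated, so neither result is wrong with respect to the specification even though only one of them finds x.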
removeDupls (see Figure 3.7) is also underdetermined, since it does not
necessarily preserve the order of elements in its input vector. This lack of
constraint may be a mistake, because users may care about the order: if the
input vector is sorted, for example, it might be desirable to preserve the order.
The important point is that what matters depends on what users need. Details
that matter to users should be specified; the others can be left undefined.
An underdetermined abstraction usually has a deterministic implementation;
that is, one that, if called twice with identical inputs, behaves identically
on the two calls. Both implementations of searchSorted are deterministic.
(Nondeterministic implementations require the use of nondeterministic
primitives, global data, or static variables; for example, the implementation might
read the system clock each time it is called and use that value as a way of
producing a different result from any previous call.)
Figure 3.8 Implementing searchSorted using binary search
   public class Arrays {
      // OVERVIEW: ...

      public static int searchSorted (int[ ] a, int x) {
         // uses binary search
         if (a == null) return -1;
         int low = 0;
         int high = a.length - 1;
         while (low <= high) {
            int mid = (low + high) / 2; // computes the floor
            if (x == a[mid]) return mid;
            if (x < a[mid]) high = mid - 1; else low = mid + 1;
         }
         return -1;
      }
   }
In addition to minimality, another important property of procedures is
generality, which is often achieved by using parameters instead of specific
variables or assumptions. For example, a procedure that searches for an
arbitrary integer in an array, where the integer is an argument of the procedure,
is more general than one that works only for a specific integer. Similarly, a
procedure that works on any size array is more general than one that works
only on arrays of some fixed size. Generalizing a procedure is only
worthwhile, however, if doing so increases its usefulness. This is almost always true
when size assumptions are eliminated, since by doing so we ensure that a
minor change in the context of use (for example, doubling the size of an array)
requires little, if any, program modification. See Sidebar 3.2 for a summary of
the properties of procedural abstractions. Generalization is discussed further
in Chapter 8.
Sidebar 3.2 Properties of Procedures and Their Implementations
• Minimality: One specification is more minimal than another if it contains fewer
  constraints on allowable behavior.
• Underdetermined behavior: A procedure is underdetermined if for certain inputs its
  specification allows more than one possible result.
• Deterministic implementation: An implementation of a procedure is deterministic
  if, for the same inputs, it always produces the same result. Implementations of
  underdetermined procedures are almost always deterministic.
• Generality: One specification is more general than another if it can handle a larger
  class of inputs.
Sidebar 3.3 Total versus Partial Procedures
• A procedure is total if its behavior is specified for all legal inputs; otherwise, it is
  partial. The specification of a partial procedure always contains a requires clause.
• Partial procedures are less safe than total ones. Therefore, they should be used only
  when the context of use is limited or when they enable a substantial benefit, such as
  better performance.
• When possible, the implementation should check the constraints in the requires clause
  and throw an exception if they are not satisfied.
Another important property of procedures is simplicity. A procedure
should have a well-defined and easily explained purpose that is independent
of its context of use. A good check for simplicity is to give the procedure a
name that describes its purpose. If it is difficult to think of a name, there may
be a problem with the procedure.
Some of the procedures discussed earlier are partial, while others are total.
This dichotomy leads to the question of when it is appropriate to define a
partial abstraction. Partial procedures are not as safe as total ones, since they
leave it to the user to satisfy the constraints in the requires clause. When
the requires clause is not satisfied, the behavior of a partial procedure is
completely unconstrained, and this can cause the using program to fail in
mysterious ways. For example, searchSorted might not return or it might
return the wrong index when its input array is not sorted. In the latter case,
the error may not be noticed until long after searchSorted returns. By then,
the reason for the error may be obscure, and important objects may have been
damaged.
On the other hand, partial procedures can be more efficient to implement
than total ones. For example, if searchSorted had to work even when the
input array was not sorted, then neither implementation (in Figure 3.5 or
Figure 3.8) would be correct; only a less-efficient implementation that examined
all elements of the array could be used.
In choosing between a partial and a total procedure, we have to make a
trade-off. On the one hand is efficiency; on the other is safe behavior, with
fewer potential surprises at runtime. How is such a choice to be made? An
important consideration is the expected context of use. If the procedure is
intended for general use (for example, if it is to be made available as part of a
program library), safety considerations should be given great weight. In such
a situation, it is impossible to examine all code that calls the procedure to
ensure that the calls satisfy the constraints. Therefore, it is wise to avoid the
constraints if possible.
Alternatively, some procedures are intended to be used only in a limited
context. This was the situation with partition and quickSort, which can be
used only within the Arrays class. In a limited context, it is easy to establish
that constraints are satisfied. For example, partition assumes that i is less
than j, but this condition is established by quickSort, which is its only caller.
Therefore, we might choose a partial procedure in such a case if this can
improve performance or lead to a simpler implementation.
Another point is that the implementation of an abstraction is not forbidden
to check the constraint given in a requires clause. If the check indicates that
the requires clause is not satisfied, the procedure could produce an error
message, but a better approach is usually to throw an exception; exceptions
are discussed in the next chapter. Sidebar 3.3 summarizes our discussion of
total versus partial procedures.
Of course, it doesn't make sense to check a constraint when the checking
is very expensive, for example, as it would be in the searchSorted routine.
But sometimes a constraint is not expensive to check; this is the case for
removeDupls, which requires all elements of the vector to be non-null. In
such a case it is a good idea to do the check and throw an exception if it fails.
Since such checks aren't required by the specification, they can be disabled
later, when the program is in production use, if this becomes necessary to
achieve good performance.
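As an illustration of such an inexpensive check, here is a minimal sketch of removeDupls that verifies its requires clause before modifying the vector. The class name CheckedVectors, the use of a parameterized Vector, and the choice of IllegalArgumentException are assumptions made for this example only; the text does not say which exception should be thrown.

```java
import java.util.Vector;

// Hypothetical variant of Vectors.removeDupls that checks its requires clause.
class CheckedVectors {
    static void removeDupls(Vector<Object> v) {
        if (v == null) return;
        // Cheap check: one pass over v before anything is modified.
        for (Object e : v) {
            if (e == null) {
                throw new IllegalArgumentException(
                    "CheckedVectors.removeDupls: null element");
            }
        }
        for (int i = 0; i < v.size(); i++) {
            Object x = v.get(i);
            int j = i + 1;
            // Remove all duplicates of x from the rest of v.
            while (j < v.size()) {
                if (!x.equals(v.get(j))) {
                    j++;
                } else {
                    // Overwrite the duplicate with the last element, then
                    // drop the last slot; order is not preserved.
                    v.set(j, v.lastElement());
                    v.remove(v.size() - 1);
                }
            }
        }
    }
}
```

Because the check runs before the main loop, the vector is left unchanged whenever the exception is thrown, and the check could later be removed without affecting callers that satisfy the requires clause.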
Finally, it is worth noting that a specification is the only record of its
abstraction. Therefore, it is crucial that the specification be clear and precise.
How to write good specifications is the subject of Chapter 9.
3.6 Summary
This chapter has been concerned primarily with procedures: what they are,
how to describe their behavior, and how to implement them. We also discussed
two important benefits of abstraction and the need for specifications.
A procedure is a mapping from inputs to outputs, with possible
modifications of some of the inputs. Its behavior, like that of any other kind of
abstraction, is described by a specification, and we presented a form for
informal specifications of procedures. A procedure is implemented in Java by a
static method; in other languages, it would be implemented by a function or
subroutine.
Abstraction provides the two key benefits of locality and modifiability.
Both are based on the distinction between an abstraction and its
implementations. Locality means that each implementation can be understood in isolation.
An abstraction can be used without having to understand how it is
implemented, and it can be implemented without having to understand how it is
used. Modifiability means that one implementation can be substituted for
another without disturbing the using programs.
To obtain these benefits, we must have a description of the abstraction
that is distinct from any implementation. To this end, we introduced the
specification, which describes the behavior of an abstraction using a special
specification language. This language can be formal or informal; we used
an informal language but with a fixed structure consisting of the requires,
modifies, and effects clauses. Users can assume the behavior described by
the specification, and implementors must provide this behavior. Thus, the
specification serves as a contract between users and implementors.
Since we are interested in design and how to invent good abstractions, we
concluded the chapter with a discussion of what procedures should be like.
Desirable properties include minimality, simplicity, and generality.
Minimality often gives rise to underdetermined abstractions. We also discussed the
pros and cons of partial and total procedures. We shall continue to discuss
desirable properties in the following chapters as we introduce additional kinds
of abstractions.
Exercises
3.1 Computing the greatest common divisor by repeated subtraction (see
    Figure 2.1 in Chapter 2) is not very efficient. Reimplement gcd to use division
    instead.
3.2 Specify and implement a method with the header
       public static int sum (int[ ] a)
    that returns the sum of the elements of a.
3.3 Specify and implement a procedure isPrime that determines whether an
    integer is prime.
3.4 Specify and implement a procedure that determines whether or not a string is a
    palindrome. (A palindrome reads the same backward and forward; an example
    is "deed.")
3.5 You are to choose between two procedures, both of which compute the
    minimum value in an array of integers. One procedure returns the smallest integer
    if its array argument is empty. The other requires a nonempty array. Which
    procedure should you choose and why?
3.6 Suppose that the implementation of sorting by quick sort shown in Figure 3.6
    were changed as follows: Procedure partition is retained, but quickSort is
    eliminated, so that its work is done directly in sort. Is this change a good
    idea? What purpose does quickSort have? Discuss.
3.7 Suppose the implementation of partition in Figure 3.6 were changed to
    return i instead of returning j. Would this work? Explain your reasoning.

4 Exceptions
A procedural abstraction is a mapping from arguments to results, with possible
modification of some of the arguments. The arguments are members of the
domain of the procedure, and the results are members of its range.
A procedure often makes sense only for arguments in a subset of its
domain. For example, a procedure that computes the factorial makes sense
only if its argument is positive. As another example, the search procedure
can return the index of the element only if the element appears in the array.
One way of coping with such a situation is to use partial procedures,
as discussed in Chapter 3. For example, we might define gcd only when its
arguments are positive:
   public static int gcd (int n, int d)
      // REQUIRES: n, d > 0
      // EFFECTS: Returns the greatest common divisor of n and d.
The caller of a partial procedure must ensure that the arguments are in the
permitted subset of the domain, and the implementor can ignore arguments
outside this subset. Thus, in implementing gcd, we could ignore the case of
nonpositive arguments.
Partial procedures are generally a bad idea, however, since there is no
guarantee that their arguments are in the permitted subset, and the procedure may
therefore be called with arguments outside the subset. When this happens,
the procedure is allowed to do anything: it might loop forever or return an
erroneous result. The latter case is especially bad since it can lead to an
obscure error that is difficult to track down. For example, the calling code might
continue to run, using the erroneous result, and possibly damage important
databases.
Partial procedures lead to programs that are not robust. A robust program
is one that continues to behave reasonably even in the presence of errors. If
an error occurs, the program may not be able to provide exactly the same
behavior as if there were no error, but it should behave in a well-defined way.
Ideally, it should continue after the error by providing some approximation
of its behavior in the absence of an error; a program like this is said to provide
graceful degradation. At worst, it should halt with a meaningful error message
and without causing damage to permanent data.
A method that enhances robustness is to use total procedures: procedures
whose behavior is defined for all inputs in the domain. If the procedure is
unable to perform its "intended" function for some of these inputs, at least it
can inform its caller of the problem. In this way, the situation is brought to
the attention of the caller, which may be able to do something about it, or at
least avoid harmful consequences of the error.
How should the caller be notified if a problem arises? One possibility is
to use a particular result to convey the information. For example, a factorial
procedure might return zero if its argument is not positive:
   public static int fact (int n)
      // EFFECTS: If n > 0 returns n! else returns 0.
This solution is not very satisfactory. Since the call with illegal arguments is
probably an error, it is more constructive to treat this case in a special way, so
that a programmer who uses the procedure is less likely to ignore the error by
mistake. Also, returning a special result may be inconvenient for the calling
code, which then must check for it. For example, rather than writing
   z = x + Num.fact(y);
the calling code instead must do the check:
   int r = Num.fact(y);
   if (r > 0) z = x + r; else ...
Furthermore, if every value of the return type is a possible result of the
procedure, the solution of returning a special result is impossible, since there
is no leftover value to use. For example, the get method of Vector returns
the value of the vector's ith element, and that value can be any object or null.
Therefore, we cannot convey information about the index being out of bounds
by returning a particular object or by returning null.
What is needed is an approach that conveys information about unusual
situations in all cases, even when every value of the return type is a legitimate
result. In addition, it is desirable for the approach to distinguish these
situations in some way, so that users cannot ignore them by mistake. It would
also be nice if the approach allowed the handling of these situations to be
separated from the normal program control flow.
An exception mechanism provides what we want. It allows a procedure to
terminate either normally, by returning a result, or exceptionally. There can
be several different exceptional terminations. In Java, each exceptional
termination corresponds to a different exception type. The names of the exception
types are selected by the definer of the procedure to convey some information
about what the problem is. For example, the get method of Vector throws
IndexOutOfBoundsException.
In this chapter, we discuss how to specify, implement, and use procedures
with exceptions. We also discuss a number of related design issues.
4.1 Specifications
A procedure that can terminate exceptionally is indicated by having a throws
clause in its header:
   throws < list_of_types >
For example,
   public static int fact (int n) throws NonPositiveException
states that fact can terminate by throwing an exception; and in this case, it
throws an object of type NonPositiveException.
A procedure can throw more than one type of exception; for example,
   public static int search (int[ ] a, int x)
         throws NullPointerException, NotFoundException
states that search can throw two exceptions: NullPointerException (if a is
null) and NotFoundException (if a is not null and x is not in a).
Figure 4.1 Some specifications with exceptions
   public static int fact (int n) throws NonPositiveException
      // EFFECTS: If n is non-positive, throws NonPositiveException, else
      //   returns the factorial of n.

   public static int search (int[ ] a, int x)
         throws NullPointerException, NotFoundException
      // REQUIRES: a is sorted
      // EFFECTS: If a is null throws NullPointerException; else if x is not
      //   in a, throws NotFoundException; else returns i such that a[i] = x.
The specification of a procedure that throws exceptions must make it clear
to users exactly what is going on. First, we require that its header list all
exceptions that it can throw as part of its "ordinary" behavior, for example,
for all inputs that meet its requires clause.
Second, the effects clause must explain what causes each exception to
be thrown. As before, the effects clause should define the behavior of the
procedure for all inputs not ruled out by the requires clause. Since this
behavior includes exceptions, the effects section must define what causes
the procedure to terminate with each exception, and what its behavior is
in each case. Furthermore, if a procedure signals an exception for a certain
subset of arguments, that subset should not be excluded in the requires clause.
Termination by throwing an exception is part of the ordinary behavior of the
procedure.
Figure 4.1 shows specifications of fact and search. Note that the
specification of search contains a requires clause and that, as usual, its effects
section assumes that the requires clause is satisfied.
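To make the link between these specifications and code concrete, the following sketch shows one way the two procedures might be implemented. It is not the book's implementation: the class name SpecSketch, the exception class bodies, and the use of binary search are assumptions made for this example.

```java
// Hypothetical definitions of the exception types named in the specifications.
class NonPositiveException extends Exception {
    NonPositiveException(String s) { super(s); }
}

class NotFoundException extends Exception {
    NotFoundException(String s) { super(s); }
}

class SpecSketch {
    // EFFECTS: If n is non-positive, throws NonPositiveException, else
    //   returns the factorial of n (the int result overflows for n > 12).
    static int fact(int n) throws NonPositiveException {
        if (n <= 0) throw new NonPositiveException("SpecSketch.fact");
        int result = 1;
        for (int i = 2; i <= n; i++) result *= i;
        return result;
    }

    // REQUIRES: a is sorted
    // EFFECTS: If a is null throws NullPointerException; else if x is not
    //   in a, throws NotFoundException; else returns i such that a[i] = x.
    static int search(int[] a, int x)
            throws NullPointerException, NotFoundException {
        if (a == null) throw new NullPointerException("SpecSketch.search");
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (x == a[mid]) return mid;
            if (x < a[mid]) high = mid - 1; else low = mid + 1;
        }
        // Every exit from the loop without a return means x is absent.
        throw new NotFoundException("SpecSketch.search");
    }
}
```

Each throw statement corresponds to one case named in the effects clause, and the sortedness assumption from the requires clause is relied on but not checked.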
When a procedure has side effects, its specification must make clear how
these interact with exceptions. The modifies section of a specification indicates
that an argument may be modified but does not say when this will happen.
If there are exceptions, it is likely that the modification will happen only for
some of them. Exactly what happens must be described in the effects section.
Modifications must be described explicitly in each case where they occur; if
no modifications are described, this means none happens. For example, the
following specification indicates that v is modified only when addMax returns
normally:
   public static void addMax (Vector v, Integer x)
         throws NullPointerException, NotSmallException
      // REQUIRES: All elements of v are Integers.
      // MODIFIES: v
      // EFFECTS: If v is null throws NullPointerException; if v contains an
      //   element larger than x throws NotSmallException; else adds x to v.
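An implementation can honor this "modified only on normal return" rule by finishing every check before it touches v. The sketch below assumes a parameterized Vector and a hypothetical definition of NotSmallException; the class name AddMaxSketch is likewise invented for the example.

```java
import java.util.Vector;

// Hypothetical definition of the exception type named in the specification.
class NotSmallException extends Exception {
    NotSmallException(String s) { super(s); }
}

class AddMaxSketch {
    // MODIFIES: v
    // EFFECTS: If v is null throws NullPointerException; if v contains an
    //   element larger than x throws NotSmallException; else adds x to v.
    static void addMax(Vector<Integer> v, Integer x)
            throws NullPointerException, NotSmallException {
        if (v == null) throw new NullPointerException("AddMaxSketch.addMax");
        // All checks happen first, so v is untouched if an exception is thrown.
        for (Integer e : v) {
            if (e.intValue() > x.intValue()) {
                throw new NotSmallException("AddMaxSketch.addMax");
            }
        }
        v.add(x);   // the only modification, reached only on the normal path
    }
}
```

Placing the single call to add after the loop is what guarantees that neither exceptional case leaves v partially updated.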
4.2 The Java Exception Mechanism
This section provides a brief discussion of how exceptions are supported in
Java.
4.2.1 Exception Types
Exception types are subtypes of either Exception or RuntimeException, both
of which are subtypes of type Throwable. Figure 4.2 shows the hierarchy of
exception types. The main point to note is that there are two kinds of exceptions:
checked exceptions and unchecked exceptions. Unchecked exceptions are
subtypes of RuntimeException; checked exceptions are subtypes of Exception
but not of RuntimeException.
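Figure 4.2 The exception type hierarchy
[Figure: Throwable is at the root, with Exception below it. Below Exception are
RuntimeException and the checked exceptions; below RuntimeException are the
unchecked exceptions.]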
Most exceptions that are defined by Java are unchecked (e.g.,
NullPointerException, IndexOutOfBoundsException), but others are checked
(e.g., IOException). User-defined exceptions can similarly be either checked
or unchecked.
There are two differences in how checked and unchecked exceptions can
be used in Java:
1. If a procedure might throw a checked exception, Java requires that the
   exception be listed in the procedure's header; otherwise, there will be a
   compile-time error. Unchecked exceptions need not be listed in the header.
2. If code calls a procedure that might throw a checked exception, Java
   requires that it handle the exception as explained in Section 4.2.4; otherwise,
   there will be a compile-time error. Unchecked exceptions need not be
   handled in the calling code.
These differences between checked and unchecked exceptions make it
necessary to think carefully when defining a new exception type about whether or
not it should be checked. We will discuss this design issue in Section 4.4.2.
We will deviate from the Java rules in one important way: we require
that the header of a procedure list all exceptions it throws, whether checked
or unchecked. For example, the header of search in Figure 4.1 lists
NullPointerException even though this is an unchecked exception. The reason
for listing unchecked exceptions is that from the point of view of someone
using the procedure, any exception that can occur is of interest; you can't
understand how to use a procedure without this information. Of course, you
could obtain the information from the effects clause of the specification, but
including the information in the header brings it to the attention of the user
in a very direct way. It also provides a good approach for the specifier: list all
exceptions in the header, and then make sure the effects clause explains each
of them.
4.2.2 Defining Exception Types
When a new exception type is defined, its declaration indicates whether
it is checked or unchecked by indicating its supertype: if the supertype is
Exception, it is checked; while if the supertype is RuntimeException, it is
unchecked. For example, Figure 4.3 gives a definition of a new exception
type. The header of the class states that the new type, NewKindOfException,
is a subtype of type Exception; this is the meaning of
   extends Exception
Therefore, the exception being defined in the figure is a checked exception.
The definition of an unchecked exception differs only in that its header
contains
   extends RuntimeException
Figure 4.3 Defining a new exception type
   public class NewKindOfException extends Exception {
      public NewKindOfException( ) { super( ); }
      public NewKindOfException(String s) { super(s); }
   }
As illustrated in Figure 4.3, a class defining a new exception type need only
define constructors; recall that constructors are special methods that are used
to initialize newly created objects of the class. Defining a new exception type
requires very little work because most of the code for the new type is inherited
from the class that implements its supertype. We will discuss inheritance, and
also provide more detail about the special forms used in this definition, in
Chapter 7.
The exception type provides two constructors; in other words, the
constructor name is overloaded as discussed in Section 2.4.2. The second
constructor initializes the exception object to contain the string provided as its
argument; as we shall see in Section 4.2.3, this string will explain why the
exception was thrown. For example,
   Exception e1 = new NewKindOfException("this is the reason");
causes exception object e1 to contain the string "this is the reason". The
first constructor initializes the object to contain the empty string, for example,
   Exception e2 = new NewKindOfException( );
The string, together with the type of exception, can be obtained by calling
the toString method on the exception object. For example,
   String s = e1.toString( );
causes s to contain the string
   "NewKindOfException: this is the reason"
Exception types must be defined in some package. One possibility is to
define them in the same package that contains the class of the methods that
throw them. However, in this case, we would need a longer name, for example,
NotFoundFromSearchException, to avoid name conflicts with exception types
defined for other procedures. A better alternative, therefore, is to have a
package that defines exception types. This allows the same exception type
to be used in many routines.
Java does not require that exception types have names ending in Exception.
However, it is good programming style to follow this convention since it makes
it easy to distinguish exception types, which should be used only for throwing
and handling exceptions, from ordinary types.
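The definition and the toString behavior described in this section can be tried out with a short program. This is only a sketch: the demo class ExceptionStringDemo and its describe helper are hypothetical, and the exception class is declared without the public modifier so that everything fits in one file. As far as I know, in current Java the no-argument Exception constructor leaves the message as null rather than an empty string, in which case toString yields just the class name.

```java
// Same shape as the definition in Figure 4.3, minus the public modifier.
class NewKindOfException extends Exception {
    NewKindOfException() { super(); }
    NewKindOfException(String s) { super(s); }
}

// Hypothetical demo class.
class ExceptionStringDemo {
    // Returns the text a handler would typically print or log.
    static String describe(Exception e) {
        return e.toString();
    }

    public static void main(String[] args) {
        Exception e1 = new NewKindOfException("this is the reason");
        Exception e2 = new NewKindOfException();
        // toString combines the class name with the message, if there is one.
        System.out.println(describe(e1));
        System.out.println(describe(e2));
        // getMessage returns only the string passed to the constructor.
        System.out.println(e1.getMessage());
    }
}
```

If the class is placed in a named package, the class name that toString reports is the fully qualified one.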
4.2.3 Throwing Exceptions
A Java procedure can terminate by throwing an exception. It does this by
using the throw statement. For example, in fact we might have
   if (n <= 0) throw new NonPositiveException("Num.fact");
Here we are throwing an object of type NonPositiveException; we actually
construct this object as part of the throw, by calling new.
The main issue when throwing exceptions is what to use for the string
argument. To answer this question, we need to understand the purpose of the
string. The string is used primarily to convey information to a person when
the program isn't able to handle the exception and therefore stops with an
error message, or writes an error message to a log.
Therefore, the string must enable the user to find out what went wrong.
A good way to accomplish this is to have the string identify the procedure that
threw the exception, since in general many procedures will throw the same
exception type. The information should allow a person to find the specification
of that procedure. Giving the class and method name is usually sufficient;
however, if the method is overloaded, the types of its arguments must also be
given.
4.2.4 Handling Exceptions
When a procedure terminates with an exception, execution does not continue
right after the call. Instead, control is transferred to some code that handles
the exception.
Code deals with an exception in two ways. The first is to handle it explicitly
by using the try statement. For example, the following code uses a try
statement to handle NonPositiveException should it be thrown by the call
of fact.
   try { x = Num.fact(y); }
   catch (NonPositiveException e) {
      // in here can use e
   }
If the call of fact throws NonPositiveException, the catch clause is executed;
the exception object is assigned to variable e so that this object can be used
while handling the exception.
This example has one catch clause; however, several catch clauses can
be attached to the try statement so that several different exceptions can be
handled. Also, try statements can be nested. If an exception thrown by the
body of the inner try statement is not caught by one of its catch clauses, it can
be caught by one of the catch clauses of the outer try statement. For example, in
   try { ...;
      try { x = Arrays.search(v, 7); }
      catch (NullPointerException e) {
         throw new NotFoundException( ); }
   } catch (NotFoundException b) { ... }
the catch clause in the outer try statement will handle
NotFoundException if it is thrown by the call of Arrays.search or by the catch clause
for NullPointerException.
The catch clauses do not have to identify the actual type of an exception
object. Instead, the clause can list a supertype of the type. For example, in
   try { x = Arrays.search(v, y); }
   catch (Exception e) { s.println(e); return; }
the catch clause will handle both NullPointerException and
NotFoundException. (Here s is a PrintWriter, and println uses e's toString method
to obtain the information to print.)
The second way to deal with an exception is to propagate it. This occurs
when a call within some procedure P signals an exception that is not handled
by a catch clause of any containing try statement in P. In this case, Java
automatically propagates the exception to P's caller provided one of the following
is true:
• that exception type or one of its supertypes is listed in P's header,
• the exception type is unchecked.
Otherwise, there is a compile-time error.
A procedure should only raise exceptions that are listed in its specification
since this is what a person writing code that uses the procedure relies on.
Unfortunately, Java does not enforce this requirement for unchecked exceptions.
Therefore, you must enforce it yourself: make sure that any exception your
code raises, either by automatic propagation or by an explicit throw, is listed
in the header of the procedure you are implementing (even if the exception is
unchecked) and described in that procedure's specification.
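Automatic propagation and the header convention can be seen together in a small sketch. The class PropagationSketch, the procedure indexOfZero, and the linear search used here are all hypothetical stand-ins introduced for the example.

```java
// Hypothetical checked exception type.
class NotFoundException extends Exception {
    NotFoundException(String s) { super(s); }
}

class PropagationSketch {
    // EFFECTS: If a is null throws NullPointerException; else if x is not
    //   in a, throws NotFoundException; else returns i such that a[i] = x.
    static int search(int[] a, int x)
            throws NullPointerException, NotFoundException {
        if (a == null) throw new NullPointerException("PropagationSketch.search");
        for (int i = 0; i < a.length; i++) {
            if (a[i] == x) return i;
        }
        throw new NotFoundException("PropagationSketch.search");
    }

    // EFFECTS: If a is null throws NullPointerException; else if 0 is not
    //   in a, throws NotFoundException; else returns an index of 0 in a.
    // NotFoundException is checked, so Java requires it in the header.
    // NullPointerException is unchecked and is listed only because of the
    // convention adopted in the text.
    static int indexOfZero(int[] a)
            throws NullPointerException, NotFoundException {
        // No try statement here: both exceptions propagate to the caller.
        return search(a, 0);
    }
}
```

Removing NotFoundException from the header of indexOfZero would be rejected by the compiler, while removing NullPointerException would compile but would hide information from users of the procedure.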
4.2.5 Coping with Unchecked Exceptions
Any call can potentially throw any unchecked exception. This means we have
a problem in catching unchecked exceptions because it's hard to know where
they come from. For example, in
   try { x = y[n]; i = Arrays.search(z, x); }
   catch (IndexOutOfBoundsException e) {
      // handle IndexOutOfBoundsException from the array access y[n]
   }
   // code here continues assuming problem has been fixed
IndexOutOfBoundsException, which is an unchecked exception, might have
occurred because of an error in the implementation of search.
The only way to be certain about the origin of an unchecked exception is to
narrow the scope of the try statement. For example, it is certain the exception
comes from the array access in the following code:
   try { x = y[n]; }
   catch (IndexOutOfBoundsException e) {
      // handle IndexOutOfBoundsException from the array access y[n]
   }
   i = Arrays.search(z, x);
We will discuss these issues further in Section 4.4.2.
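A self-contained version of the narrowed try statement might look like the sketch below. The class NarrowScopeSketch, the lookup procedure, its fallback parameter, and the simple search helper are hypothetical; they exist only so that the fragment can be compiled and run.

```java
class NarrowScopeSketch {
    // Hypothetical stand-in for Arrays.search; returns -1 when x is absent.
    static int search(int[] z, int x) {
        for (int i = 0; i < z.length; i++) {
            if (z[i] == x) return i;
        }
        return -1;
    }

    // Looks up y[n], using fallback if n is out of bounds, then searches z.
    static int lookup(int[] y, int n, int[] z, int fallback) {
        int x;
        try {
            x = y[n];                 // only the array access is guarded
        } catch (IndexOutOfBoundsException e) {
            // The handler can be sure the exception came from y[n].
            x = fallback;
        }
        // An IndexOutOfBoundsException raised inside search is not caught
        // above, so a bug there cannot be mistaken for a bad index n.
        return search(z, x);
    }
}
```

The price of the narrower scope is a slightly longer method, since x must be declared outside the try statement.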
4.3 Programming with Exceptions
When implementing a procedure with exceptions, the programmer's job, as
always, is to provide the behavior defined by the specification. If this behavior
includes exceptions, the program must throw the proper exceptions at the
proper times with the meaning described in the specification. To accomplish
this task, the program may need to handle exceptions that are thrown by
procedures it calls.
Some exceptions are handled specifically: the catch clause attempts to
respond to the specific situation that gave rise to the exception. Other exceptions
are handled generically. In this case, the catch clause does not attempt to deal
with the exception in any specific way. Instead, it takes a generic action. It
might stop the program after reporting the problem to a user, or it might
"restart" the program by reverting to an earlier state, without an attempt to fix
the exact problem. For example, such a program might carry out some kind of
shutdown, followed by a clean restart. (The shutdown should also be logged,
so that if it was due to a program error, the error can be fixed.)
Reflecting and Masking
There are two ways to deal with an exception. Sometimes an exception is
reflected up another level; that is, the caller also terminates by throwing an
exception. Reflecting an exception can be accomplished by automatic propagation,
as discussed in Section 4.2.4, or by explicitly catching an exception
and then throwing an exception. The former is more limited because the same
exception object is thrown. More commonly, we want to throw a different
object, of a different exception type, because the meaning of the information
has changed. Another point is that before reflecting an exception, the caller
may need to do some local processing in order to satisfy its specification.
For example, many programs that iterate through arrays need to “prime”
the iteration by obtaining an initial value from the array. This is the case in the
min procedure shown in Figure 4.4. min simply fetches the zeroth element of the
array. If the array argument is null, the call will raise NullPointerException,
and this is reflected to the caller of min by being propagated automatically. If
the array is empty, the call will raise IndexOutOfBoundsException. It would
not make sense to reflect this exception to min's caller, since we want exceptions
that are related to the min abstraction rather than exceptions having to
do with how min is implemented. Instead, min throws EmptyException, which
is an exception that is meaningful for it. Note that the string in the exception
object identifies Arrays.min as the thrower.
A second possibility is that the caller masks the exception; that is, it handles
the exception itself and then continues with the normal flow. This situation
is illustrated in the sorted procedure in Figure 4.4. Again, the code is
priming the loop; but in this case, if the array is empty, it simply means it is
sorted.
One point to note about both examples is how we used exceptions to
control program flow. This is perfectly acceptable programming practice:
exceptions can be used to avoid other work. For example, in both min and
sorted, the code does not need to check the length of the array explicitly.
(However, depending on how the exception mechanism is implemented, it
may be expensive to handle exceptions, and you should weigh this cost against
the benefit of using exceptions to avoid the extra work.)
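For comparison, the alternative being weighed here is an explicit length test. The sketch below is one way it might look; the class name LengthCheckDemo is hypothetical, and EmptyException is modeled as an unchecked exception, which is an assumption made for brevity.

```java
// Assumption: EmptyException is modeled here as an unchecked exception.
class EmptyException extends RuntimeException {
    EmptyException(String s) { super(s); }
}

class LengthCheckDemo {
    // EFFECTS: If a is null throws NullPointerException else if a is empty
    //   throws EmptyException else returns the minimum value of a.
    static int min(int[] a) {
        // Explicit test replaces the try/catch around a[0]; a null argument
        // still raises NullPointerException here, which propagates automatically.
        if (a.length == 0) throw new EmptyException("LengthCheckDemo.min");
        int m = a[0];
        for (int i = 1; i < a.length; i++)
            if (a[i] < m) m = a[i];
        return m;
    }

    public static void main(String[] args) {
        System.out.println(min(new int[] {4, -1, 9}));   // prints -1
    }
}
```

Both versions satisfy the same specification; they differ only in whether the empty case costs one comparison on every call or one exception on the rare empty call.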
Design Issues
Now we consider how to decide about the use of exceptions when designing
abstractions. There are two main issues: when to use an exception, and
whether to use a checked or unchecked exception.
An important point is that exceptions are not synonymous with errors.
Exceptions are a mechanism that allows a method to bring some information
to the attention of its caller. That information might not concern an error.
For example, there isn't anything erroneous about search being called on
an element that isn't in the array; instead, this is just an interesting situation
Figure 4.4 Reflecting and masking exceptions
public class Arrays {
   public static int min (int[ ] a) throws NullPointerException, EmptyException {
      // EFFECTS: If a is null throws NullPointerException else if a is empty
      //   throws EmptyException else returns the minimum value of a.
      int m;
      try { m = a[0]; }
      catch (IndexOutOfBoundsException e) {
         throw new EmptyException("Arrays.min"); }
      for (int i = 1; i < a.length; i++)
         if (a[i] < m) m = a[i];
      return m; }
   public static boolean sorted (int[ ] a) throws NullPointerException {
      // EFFECTS: If a is null throws NullPointerException else if a is
      //   sorted in ascending order returns true else returns false.
      int prev;
      try { prev = a[0]; }
      catch (IndexOutOfBoundsException e) { return true; }
      for (int i = 1; i < a.length; i++)
         if (prev <= a[i]) prev = a[i]; else return false;
      return true; }
}
catch (IndexOutOfBoundsException e) {
   // handle IndexOutOfBoundsException from use of array y
}
// code here continues assuming problem has been fixed
the catch clause might handle IndexOutOfBoundsException from search by
mistake. Whatever corrective action is taken by the catch clause will fix only
the problem with y and n, but not the problem with search. It is unlikely that
the code after the catch clause will work in this case, and when the error is
finally discovered, it may be very difficult to track down.
Why does Java have unchecked exceptions when they are a problem? The
reason is that checked exceptions are also a problem: if your code is certain not
to cause one to be raised, you still must handle it! This is why many exceptions
defined by Java are in fact unchecked.
So there are good reasons on both sides here. This means that there is a
design issue: when you define a new exception type, you must think carefully
about whether it should be checked or unchecked.
Choosing between checked and unchecked exceptions should be based on
expectations about how the exception will be used. If you expect using code to
avoid calls that raise the exception, the exception should be unchecked. This
is the rationale behind IndexOutOfBoundsException: arrays are supposed to
be used primarily in for loops that control the indices and thus ensure that
all calls on array methods have indices within bounds.
Otherwise, exceptions should be checked. For example, it is likely that
many calls of search will be made without knowledge of whether the
searched-for integer is in the array. In such a case, it would be an error for
the calling code not to handle the exception. Therefore, the exception type
should be checked so that such errors can be detected by the compiler.
The question of whether the exception is “usually” avoided often has to
do with the cost and convenience of avoiding it. For example, it is convenient
and inexpensive to determine the size of a vector (by calling the size method,
which returns in constant time); therefore, using code is likely to use this
method to avoid IndexOutOfBoundsException. But sometimes there is no
convenient way to avoid the exception, or avoiding the exception is costly.
Both situations arise for search. There may be no other procedure to determine
whether the element is in the array, since this is (partly) the purpose of search.
Furthermore, if such a procedure existed, its call would be costly.
The rules for choosing between checked and unchecked exceptions are
summarized in Sidebar 4.2.
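The practical difference between the two kinds of exception shows up at compile time. In the following sketch, the class CheckedVsUnchecked and its linear search are hypothetical simplifications (this is not the search of Figure 4.1); NotFoundException is declared as a checked exception, in line with the reasoning above.

```java
// Checked: callers usually cannot know in advance whether x is present.
class NotFoundException extends Exception {
    NotFoundException(String s) { super(s); }
}

class CheckedVsUnchecked {
    // Simplified linear search (an assumption; not the search of Figure 4.1).
    static int search(int[] a, int x) throws NotFoundException {
        for (int i = 0; i < a.length; i++)
            if (a[i] == x) return i;
        throw new NotFoundException("CheckedVsUnchecked.search");
    }

    // The compiler rejects this method unless it catches NotFoundException
    // or lists it in its own header.
    static boolean contains(int[] a, int x) {
        try {
            search(a, x);
            return true;
        } catch (NotFoundException e) {
            return false;
        }
    }

    // IndexOutOfBoundsException is unchecked: the loop bounds keep every index
    // in range, so no handler is needed and none is required.
    static int sum(int[] a) {
        int total = 0;
        for (int i = 0; i < a.length; i++) total += a[i];
        return total;
    }

    public static void main(String[] args) {
        int[] a = {3, 8, 5};
        System.out.println(contains(a, 8) + " " + contains(a, 4) + " " + sum(a));   // prints "true false 16"
    }
}
```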
Sidebar 4.2 Checked versus Unchecked Exceptions
■ You should use an unchecked exception only if you expect that users will usually
  write code that ensures the exception will not happen, because
  • There is a convenient and inexpensive way to avoid the exception.
  • The context of use is local.
Otherwise, you should use a checked exception.
Defensive Programming
Exceptions can be used to support the practice of defensive programming,
that is, writing each procedure to defend itself against errors. Errors can be
introduced by other procedures, by the hardware, or by the user entering
data; these latter errors will continue to exist even if the software is error
free. An exception mechanism provides a means for conveying information
about errors and a way to handle errors without cluttering the main flow of a
routine. Therefore, it encourages a methodology of writing code that checks
for problems and reports them in an orderly way.
For example, the implementation of a procedure with a requires clause
should check, if possible, whether the requires clause is satisfied. This raises
the question of what to do if the requires clause is not satisfied. One possibility
is to halt the program with an error message if the check fails. However, this is
not a very robust approach. It's better to use the exception mechanism because
then, if the call occurs in a context in which a higher level can recover from
problems in a generic way (e.g., by doing a restart), it will be able to do this
for the failed check as well.
It's a good idea to have a particular exception type devoted to situations
such as the requires clause not being satisfied. A good name for this type is
FailureException; it is an unchecked exception.
Headers of procedures should not list FailureException, and their specifications
should not mention throwing it. The reason is that this exception
is used for situations that do not correspond to what is described in a procedure's
specification. Instead, the exception indicates that something is broken
so that the procedure is unable to satisfy its specification.
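A requires-clause check of this kind might look like the sketch below. FailureException is defined here as a subclass of RuntimeException, which is an assumption that makes it unchecked; RequiresCheckDemo and isqrt are hypothetical names chosen for illustration.

```java
// Assumption: FailureException is modeled as a subclass of RuntimeException,
// which makes it unchecked.
class FailureException extends RuntimeException {
    FailureException(String s) { super(s); }
}

class RequiresCheckDemo {
    // REQUIRES: x >= 0
    // EFFECTS: Returns the largest integer r such that r * r <= x.
    // FailureException is deliberately absent from the header and the specification.
    static int isqrt(int x) {
        if (x < 0)
            throw new FailureException("RequiresCheckDemo.isqrt: negative argument " + x);
        int r = 0;
        while ((long) (r + 1) * (r + 1) <= x) r++;
        return r;
    }

    public static void main(String[] args) {
        System.out.println(isqrt(17));   // prints 4
        try {
            isqrt(-1);
        } catch (FailureException e) {
            // A higher level can recover generically, for example by restarting.
            System.out.println("recovered: " + e.getMessage());
        }
    }
}
```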
There are many other situations in which FailureException should be
thrown. For example, suppose you are using search in a context in which
you know x is in the array, yet your call of search throws NotFoundException.
Since this is a checked exception, you must catch it; your code can then
throw FailureException. The string within the FailureException, as usual,
should indicate what the problem is. One easy way to do this is to concatenate
information about your class and method with the string obtained from
NotFoundException, for example,
catch (NotFoundException e) {
   throw new FailureException("C.p" + e.toString( )); }
More generally, FailureException should be raised whenever your code
checks an assumption that should hold and discovers it doesn't. We will see
examples of this in later chapters.
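Put together, converting a checked exception that "cannot happen" into a FailureException could be sketched as follows. The Roster class, its slotOf method, and the simplified linear search are all hypothetical; both exception classes are defined locally so that the sketch is self-contained.

```java
class NotFoundException extends Exception {
    NotFoundException(String s) { super(s); }
}

class FailureException extends RuntimeException {
    FailureException(String s) { super(s); }
}

class Roster {
    // Simplified linear search (an assumption; a stand-in for the book's search).
    static int search(int[] a, int x) throws NotFoundException {
        for (int i = 0; i < a.length; i++)
            if (a[i] == x) return i;
        throw new NotFoundException("Roster.search");
    }

    // REQUIRES: id was stored in ids earlier, so it is known to be present.
    static int slotOf(int[] ids, int id) {
        try {
            return search(ids, id);
        } catch (NotFoundException e) {
            // The assumption was violated: report it as a failure, naming this
            // class and method and keeping the text of the original exception.
            throw new FailureException("Roster.slotOf " + e.toString());
        }
    }

    public static void main(String[] args) {
        System.out.println(slotOf(new int[] {7, 9, 11}, 9));   // prints 1
    }
}
```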
Of course, checking for problems takes time, and it is tempting not to
bother with the checks, or to use them only while debugging and disable
them during production. This is generally an unwise practice. Defensive programming
is particularly valuable during production because it can prevent
a small error from causing a large problem, such as a damaged database. Disabling
checks during production is analogous to disconnecting warning lights
in an airplane; a pilot would never do this because the results could be catastrophic.
Checks should be disabled only if we have proved that the errors can
never occur or if the checks are costly.
Summary
In this chapter, we have extended procedures to include exceptions. Exceptions
are needed in robust programs because they provide a way to respond to
errors and unusual situations. If an argument is not what is expected, a procedure
can notify the caller of this fact rather than simply failing or encoding
the information in a special result. Since this notification is distinct from the
normal case, the caller cannot confuse the two.
Exceptions are introduced when procedures are designed. Most procedures
should be defined over the entire input domain; exceptions are used to
take care of situations in which the “usual” behavior cannot happen. Partial
procedures are suitable only when it is either too expensive or not possible
to check the condition, or when the procedure is used in a limited context in
which it can be proved that all calls have proper arguments.
In implementing a procedure, the programmer must ensure that it terminates
as specified in all situations. Only exceptions permitted by the specification
should be signaled, and each should be signaled only in the situation
indicated in the specification. In addition, it is a good idea to practice defensive
programming by checking for errors in as many cases as possible; FailureException
can be used to report such errors. An example is checking in a
partial procedure for inputs that do not satisfy the requires clause.
Exercises
4.1 Implement a standalone procedure to read in a file containing words and white
space and produce a compressed version of the file in an output file. The
compressed version should contain all of the words in the input file and none
of the white space, except that it should preserve lines.
4.2 Implement search as specified in Figure 4.1 in two ways: using for loops, and
using while (true) loops that are terminated when accessing the array raises
IndexOutOfBoundsException. Which implementation is better? Discuss.
4.3 A specification for a procedure that computes the sum of the elements in an
array of integers might require a nonempty array, return 0 if the array is empty,
or throw an exception if the array is empty. Discuss which alternative is best
and provide the specification for the procedure.
4.4 Consider a procedure
static void combine (int[ ] a, int[ ] b)
that multiplies each element of a by the sum of the elements of b; for example,
if a = [1, 2, 3] and b = [4, 5], then on return a = [9, 18, 27]. What should
this procedure do if a or b is null or empty? Give a specification for combine
that answers these questions and explain why your specification is a good one.
Data Abstraction
This chapter discusses the most important abstraction mechanism, data abstraction.
Data abstraction allows us to abstract from the details of how data
objects are implemented to how the objects behave. This focus on the behavior
of objects forms the basis of object-oriented programming.
Data abstraction allows us to extend the programming language in use
(e.g., Java) with new data types. What new types are needed depends on the
application domain of the program. For example, in implementing a compiler
or interpreter, stacks and symbol tables are useful, while accounts are a natural
abstraction in a banking system. Polynomials arise in a symbolic manipulation
system, and matrices are useful in defining a package of numeric functions. In
each case, the data abstraction consists of a set of objects (for example, stacks
or polynomials) plus a set of operations. For example, matrix operations
include addition, multiplication, and so on, and deposit and withdraw are
operations on accounts.
The new data types should incorporate abstraction both by parameterization
and by specification. Abstraction by parameterization can be achieved in
the same way as for procedures, by using parameters wherever it is sensible
to do so. We achieve abstraction by specification by making the operations
part of the type. To understand why the operations are needed, consider what
happens if we view a type as just a set of objects. Then all that is needed to
implement the type is to select a storage representation for the objects; all the
using programs can be implemented in terms of this representation. However,
if the representation changes, or even if its interpretation changes, all programs
that use the type must be changed; there is no way to limit the impact
of the change.
On the other hand, suppose we include operations in the type, obtaining
data abstraction = (objects, operations)
and we require users to call the operations instead of accessing the representation
directly. Then to implement the type, we implement the operations in
terms of the chosen representation, and we must reimplement the operations
if we change the representation. However, we need not reimplement any using
programs because they did not use the representation. Now we have abstracted
from the representation details; using code depends only on the specified behavior
of the type with its operations. Therefore, we have achieved abstraction
by specification.
If enough operations are provided, lack of access to the representation will
not cause users any difficulty: anything they need to do to the objects can
be done, and done efficiently, by calls on the operations. In general, there will
be operations to create and modify objects and to obtain information about
their values. Of course, users can augment the set of operations by defining
standalone procedures, but such procedures would not have access to the
representation.
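To make the idea concrete, here is a minimal sketch of an account abstraction in the spirit of the banking example. The names Account, AccountUser, and transfer are hypothetical, as is the choice to store the balance in cents and to signal an overdraft with IllegalArgumentException.

```java
class Account {
    // The rep: a balance in cents. It is private, so using code cannot depend on it.
    private long cents;

    Account() { cents = 0; }

    void deposit(long amountInCents) { cents += amountInCents; }

    // Throws IllegalArgumentException if the balance is too small
    // (an illustrative choice, not taken from the text).
    void withdraw(long amountInCents) {
        if (amountInCents > cents)
            throw new IllegalArgumentException("Account.withdraw: insufficient funds");
        cents -= amountInCents;
    }

    long balance() { return cents; }
}

class AccountUser {
    // Using code: a standalone procedure written purely in terms of the operations.
    static void transfer(Account from, Account to, long amountInCents) {
        from.withdraw(amountInCents);
        to.deposit(amountInCents);
    }

    public static void main(String[] args) {
        Account a = new Account();
        Account b = new Account();
        a.deposit(500);
        transfer(a, b, 200);
        System.out.println(a.balance() + " " + b.balance());   // prints "300 200"
    }
}
```

If the rep were later changed (say, to a list of transactions), only the bodies in Account would need to be rewritten; transfer would be unaffected.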
Data abstraction allows us to defer decisions about data structures until
the uses of the data are fully understood. Choosing the right data structures is
crucial to achieving an efficient program. In the absence of data abstraction,
data structures must be defined too early; they must be specified before the
implementations of using modules can be designed. At this point, however,
the uses of the data are typically not well understood. Therefore the chosen
structure may lack needed information or be organized in an inefficient way.
We use data abstraction to avoid defining the structure immediately: we
introduce the abstract type with its objects and operations. Implementations
of using modules can then be designed in terms of the abstract type. Decisions
about how to implement the type are made later, when all its uses are
understood.
Data abstraction is also valuable during program modification and maintenance.
In this phase, data structures are particularly likely to change, either
to improve performance or to accommodate changing requirements. Data abstraction
limits the changes to just the implementation of the type: none of
the using modules need be changed.
In this chapter, we describe how to specify and implement data abstractions
in Java. We also discuss ways to reason about the correctness of programs
that use and implement types, and we describe some issues that arise in designing
new types.
Specifications for Data Abstractions
Just as was the case for procedures, the meaning of a type should not be
given by any of its implementations. Instead, a specification should define its
behavior. Since objects of the type are used only by calling the operations,
most of the specification consists of explaining what the operations do.
In Java, new types are defined by classes or interfaces. For now, we will
consider only classes; interfaces will be discussed in Chapter 7.
Each class defines a type by defining a name for the type, a set of con-
structors, and a set of instance methods or methods. Constructors are used to
initialize new objects of the type; these are the instances. Once an object has
been created (and initialized by a constructor), users can access it by calling
its methods.
The form of a data abstraction specification is shown in Figure 5.1. The
header class dname indicates that a new data type called dname is being
defined. The header contains a declaration of the visibility of the class; almost
all classes have public visibility so that they can be used by code outside of
their containing package.
Figure 5.1 The form of a data abstraction specification
visibility class dname {
   // OVERVIEW: A brief description of the behavior of the type's objects goes here.
   // constructors
   // specs for constructors go here
   // methods
   // specs for methods go here
}
The specification has three parts. The overview gives a brief description of
the data abstraction, including a way of viewing the abstract objects in terms
of “well-understood” concepts. It usually presents a model for the objects;
that is, it describes the objects in terms of other objects that the reader of the
specification can be expected to understand. For example, stacks might be
defined in terms of mathematical sequences. The overview section also states
whether objects of the type are mutable, so that their state can change over
time, or immutable.
The constructors part of the specification defines the constructors that
initialize new objects, while the methods part defines the methods that allow
access to the objects once they have been created. All the constructors and
methods that appear in the specification will be public.
Constructors and methods are procedures, and they are specified using
the specification notation presented in Chapters 3 and 4, with the following
differences:
■ Methods and constructors both belong to objects, rather than to classes.
  Therefore, the keyword static will not appear in the methods' headers
  (since this keyword means that the method belongs to the class rather than
  to an object of the class).
■ The object a method or constructor belongs to is available to it as an implicit
  argument, and this object can be referred to in the method or constructor
  specification as this.
As was the case for specifications of procedures, specifications for data
abstractions take the form of comments in the code. When a data abstraction
is first invented, all that exists is the specification; almost all code in the class,
such as the bodies of the methods, is missing. Later, when the data abstraction
is implemented, this code is added.
Specification of IntSet
Figure 5.2 gives a specification for the IntSet data abstraction. IntSets are
unbounded sets of integers with operations to create a new, empty IntSet,
test whether a given integer is an element of an IntSet, and add or remove
elements. The overview indicates that IntSets are mutable. It also indicates
that we will model IntSets in terms of mathematical sets. In the rest of the
specification, we specify each operation using this model.
Figure 5.2 Specification of the IntSet data abstraction
public class IntSet {
   // OVERVIEW: IntSets are mutable, unbounded sets of integers.
   //   A typical IntSet is {x1, ..., xn}.
   // constructors
   public IntSet ( )
      // EFFECTS: Initializes this to be empty.
   // methods
   public void insert (int x)
      // MODIFIES: this
      // EFFECTS: Adds x to the elements of this, i.e., this_post = this + {x}.
   public void remove (int x)
      // MODIFIES: this
      // EFFECTS: Removes x from this, i.e., this_post = this - {x}.
   public boolean isIn (int x)
      // EFFECTS: If x is in this returns true else returns false.
   public int size ( )
      // EFFECTS: Returns the cardinality of this.
   public int choose ( ) throws EmptyException
      // EFFECTS: If this is empty, throws EmptyException else
      //   returns an arbitrary element of this.
}
Figure 5.2 uses set notation in the specifications of the methods. In particular,
it uses + for set union, and - for set difference. Figure 5.3 summarizes
the set notation used in this book.
The IntSet type has a single constructor that initializes the new set to be
empty; note that the specification refers to the new set object as this. Since a
constructor always modifies this (to initialize it), we do not bother to indicate
the modification in the modifies clause. In fact, this modification is invisible
to users: they do not have access to the constructor's object until after the
constructor runs, and therefore, they cannot observe the state change.
Figure 5.3 Set notation
A set is denoted as {x1, ..., xn}. The xi's are the elements of the set. There are
no duplicates in a set.
set union: t = s1 + s2 is the set containing all the elements of set s1 and all the
   elements of set s2. If s1 and s2 contain an element in common, there will be
   only one occurrence of that element in t.
set difference: t = s1 - s2 is the set containing all the elements of s1 that are not
   also elements of s2.
set intersection: t = s1 & s2 is the set containing all elements that are members of
   both s1 and s2.
cardinality: |s| stands for the size of set s.
set membership: x in s is true if x is an element of s.
set former: t = {x | p(x)} is the set of all elements x such that p(x) is true.
Once an IntSet object exists, elements can be added to it by calling its
insert method, and elements can be removed by calling remove; again, the
specifications refer to the object as this. These two methods are mutators
since they modify the state of their object; their specifications make it clear
that they are mutators because they contain a modifies clause stating that this
is modified. Note that the specifications of insert and remove use the notation
this_post to indicate the value of this when the operation returns. An input
argument name without the post qualifier always means the value when the
operation is called.
The remaining methods are observers: they return information about the
state of their object but do not change the state. Observers do not have a
modifies clause. (More accurately, an observer does not have a modifies clause
stating that this, or some argument object of its type, is modified; however,
observers typically don't modify anything.)
The choose method returns an arbitrary element of the IntSet; thus, it is
underdetermined. It throws an exception if the set is empty. This exception
can be unchecked since users can call the size method before calling choose
to cheaply and conveniently ensure that the set is nonempty.
Note that insert does not throw an exception if the integer is already
in the set, and similarly, remove does not throw an exception if the integer
is not in the set. These decisions are based on assumptions about how sets
will be used. We expect that users will add and remove set elements without
concern for whether they are already there. Therefore, the methods do
not throw exceptions. If we expected a different pattern of usage, we might
change the specifications and headers of these methods (to throw an exception),
or we might provide additional methods that throw an exception (e.g.,
insertNonDup and removeIfIn), so that users can choose the method that best
fits their needs.
In the IntSet specification, we are relying on the reader knowing what
mathematical sets are; otherwise, the specification would not be understandable.
In general, this reliance on informal description is a weakness of informal
specifications. It is probably reasonable to expect the reader to understand
a number of mathematical concepts, such as sets, sequences, and integers.
However, not all types can be described nicely in terms of such concepts. If
the concepts are inadequate, we must describe the type as best we can, even
by using pictures; but of course, there is always the danger that the reader
will not understand the description or will interpret it differently than we
intended. Techniques for writing understandable specifications will be discussed
in Chapter 9.
Note that the specification takes the form of a preliminary version of the
class. This code could be compiled if the methods and constructors were
given empty bodies (except that methods that return results will need a type-correct
return statement). This will allow you to compile code that uses the
abstraction, so that you'll be able to get rid of errors that the compiler catches,
such as type errors. You probably won't be able to run the using code, however,
until after the new type is implemented.
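A compilable preliminary version along these lines might look like the sketch below. The stub bodies and placeholder return values are assumptions added only to satisfy the compiler, IntSetClient is a hypothetical using class, and EmptyException is modeled as an unchecked exception.

```java
// Assumption: EmptyException is modeled here as an unchecked exception.
class EmptyException extends RuntimeException {
    EmptyException(String s) { super(s); }
}

class IntSet {
    // OVERVIEW: IntSets are mutable, unbounded sets of integers.
    public IntSet() { }                                      // stub
    public void insert(int x) { }                            // stub
    public void remove(int x) { }                            // stub
    public boolean isIn(int x) { return false; }             // placeholder result
    public int size() { return 0; }                          // placeholder result
    public int choose() throws EmptyException { return 0; }  // placeholder result
}

class IntSetClient {
    // Using code that the compiler can already type-check against the stubs.
    static IntSet getElements(int[] a) {
        IntSet s = new IntSet();
        for (int i = 0; i < a.length; i++) s.insert(a[i]);
        return s;
    }

    public static void main(String[] args) {
        IntSet s = getElements(new int[] {1, 2, 2});
        // Compiles and runs, but the answer is not meaningful yet.
        System.out.println(s.size());   // prints 0 with these stubs
    }
}
```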
The Poly Abstraction
A second example of a data abstraction specification is given in Figure 5.4.
Polys are polynomials with integer coefficients. Unlike IntSets, Polys are
immutable: once a Poly has been created (and initialized by a constructor) it
cannot be modified. Operations are provided to create a one-term Poly and to
add, subtract, and multiply Polys.
Figure 5.4 Specification of the Poly data abstraction
public class Poly {
   // OVERVIEW: Polys are immutable polynomials with integer coefficients.
   //   A typical Poly is c0 + c1x + ...
   // constructors
   public Poly ( )
      // EFFECTS: Initializes this to be the zero polynomial.
   public Poly (int c, int n) throws NegativeExponentException
      // EFFECTS: If n < 0 throws NegativeExponentException else
      //   initializes this to be the Poly cx^n.
   // methods
   public int degree ( )
      // EFFECTS: Returns the degree of this, i.e., the largest exponent
      //   with a non-zero coefficient. Returns 0 if this is the zero Poly.
   public int coeff (int d)
      // EFFECTS: Returns the coefficient of the term of this whose exponent is d.
   public Poly add (Poly q) throws NullPointerException
      // EFFECTS: If q is null throws NullPointerException else
      //   returns the Poly this + q.
   public Poly mul (Poly q) throws NullPointerException
      // EFFECTS: If q is null throws NullPointerException else
      //   returns the Poly this * q.
   public Poly sub (Poly q) throws NullPointerException
      // EFFECTS: If q is null throws NullPointerException else
      //   returns the Poly this - q.
   public Poly minus ( )
      // EFFECTS: Returns the Poly - this.
}
The Poly type has two constructors, one to create the zero polynomial,
and one to create an arbitrary monomial. In general, a type can have a number
of constructors. All constructors have the same name, the type name,
and therefore, if there is more than one constructor, this name is overloaded.
Java allows method names to be overloaded as well. Java requires that
overloaded definitions differ from one another in the number of arguments
and/or their types; otherwise, a compile-time error occurs. The two definitions
for the Poly constructor are legal since one has no arguments and the other
has two arguments.
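Both kinds of overloading can be seen in a small sketch. The Term class below is hypothetical (it is not the Poly type, and it does not check for a negative exponent); it exists only to show two constructors that differ in argument count and two methods that differ in argument type.

```java
class Term {
    private final int coeff;
    private final int exp;

    // Two constructors share the name Term; they differ in argument count.
    Term() { this(0, 0); }
    Term(int c, int n) { coeff = c; exp = n; }

    // Two methods share the name eval; they differ in argument type.
    int eval(int x) {
        int result = coeff;
        for (int i = 0; i < exp; i++) result *= x;
        return result;
    }
    double eval(double x) { return coeff * Math.pow(x, exp); }

    public static void main(String[] args) {
        Term t = new Term(3, 2);                 // 3x^2
        System.out.println(t.eval(2));           // int version, prints 12
        System.out.println(t.eval(0.5));         // double version, prints 0.75
        System.out.println(new Term().eval(5));  // zero term, prints 0
    }
}
```

The compiler picks the definition from the static types of the arguments at the call site, so t.eval(2) and t.eval(0.5) resolve to different methods.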
Poly has no mutator methods: no method has a modifies clause. This
is what we expect to see for an immutable data abstraction. Furthermore,
the method specifications do not use the post notation that was used in
the IntSet specification. This notation is not needed for immutable abstractions;
since object state doesn't change, the pre and post states of objects are
identical.
As part of defining Poly, we need to decide whether NegativeExponentException
is checked or unchecked. Since it seems likely that users will
avoid calls with a negative exponent, it is appropriate to make the exception
unchecked.
Using Data Abstractions
Figure 5.5 gives examples of procedures that use data abstractions. (The classes
of the procedures aren't shown in the figure.) The diff method returns a new
Poly that is the result of differentiating its argument Poly. The getElements
routine returns an IntSet containing the integers in its array argument a; there
are no duplicates in the returned set (since sets do not contain duplicates) even
if there are duplicates among the elements of a.
These routines are written based on the specifications of the used abstractions
and can use only what is described in the specifications. They are not
able to access the implementation details of the abstract objects since, as we
shall see, this access is limited to implementations of the objects' constructors
and methods. They can use methods to access object state and to modify that
state if the object is mutable, and they can use constructors to initialize new
objects.
Figure 5.5 Using abstract data types
public static Poly diff (Poly p) throws NullPointerException {
   // EFFECTS: If p is null throws NullPointerException
   //   else returns the Poly obtained by differentiating p.
   Poly q = new Poly ( );
   for (int i = 1; i <= p.degree( ); i++)
      q = q.add(new Poly(p.coeff(i)*i, i - 1));
   return q;
}
public static IntSet getElements (int[ ] a)
      throws NullPointerException {
   // EFFECTS: If a is null throws NullPointerException else returns a set
   //   containing an entry for each distinct element of a.
   IntSet s = new IntSet( );
   for (int i = 0; i < a.length; i++) s.insert(a[i]);
   return s;
}
Implementing Data Abstractions
A class both defines a new type and provides an implementation for it. The
specification constitutes the definition of the type. The remainder of the class
provides the implementation.
To implement a data abstraction, we select a representation, or rep, for
its objects and then implement the constructors to initialize the representation
properly and the methods to use/modify the representation properly. The
chosen representation must permit all operations to be implemented in a reasonably
simple and efficient manner. In addition, if some of the operations
must run quickly, the representation must make this possible. A representation
that is fast for some operations often will be slower for others. We might,
therefore, require multiple implementations of the same type; we will discuss
how to achieve this in Chapter 7.
For example, a plausible representation for an IntSet object is a vector,
where each integer in the IntSet occurs as an element of the vector. We could
choose to have each element of the set occur exactly once in the vector or allow
it to occur many times. The latter choice makes the implementation of insert
run faster but slows down remove and isIn. Since isIn is likely to be called
frequently, we will make the former choice, and therefore, there will be no
duplicate elements in the vector.
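The tradeoff between the two choices can be sketched with a pair of insert and remove routines. This is an illustration only: RepChoiceDemo and its method names are hypothetical, and java.util.ArrayList is used in place of the vector mentioned in the text.

```java
import java.util.ArrayList;
import java.util.List;

class RepChoiceDemo {
    // Choice 1: each element occurs exactly once.
    static void insertUnique(List<Integer> els, int x) {
        if (!els.contains(x)) els.add(x);       // must search before adding
    }
    static void removeUnique(List<Integer> els, int x) {
        els.remove(Integer.valueOf(x));         // removes the single copy, if any
    }

    // Choice 2: duplicates allowed.
    static void insertDup(List<Integer> els, int x) {
        els.add(x);                             // no search needed
    }
    static void removeDup(List<Integer> els, int x) {
        els.removeIf(e -> e == x);              // must delete every copy
    }

    public static void main(String[] args) {
        List<Integer> unique = new ArrayList<>();
        List<Integer> dup = new ArrayList<>();
        for (int x : new int[] {3, 3, 5}) { insertUnique(unique, x); insertDup(dup, x); }
        System.out.println(unique.size() + " " + dup.size());   // prints "2 3"
        removeUnique(unique, 3);
        removeDup(dup, 3);
        System.out.println(unique + " " + dup);                 // prints "[5] [5]"
    }
}
```

With duplicates allowed, the list can also grow well beyond the number of distinct elements, which lengthens every unsuccessful membership test.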
Implementing Data Abstractions in Java
A representation typically has a number of components; in Java, each of these
is an instance variable of the class implementing the data abstraction. The
implementations of the constructors and methods access and manipulate the
instance variables.
Thus, when considered from an implementation point of view, objects have
both methods and instance variables. To support abstraction, however, it is
important to restrict access to the instance variables to the implementation of
the methods and constructors; this allows you, for example, to reimplement
an abstract type without affecting any code that uses the type. Therefore, the
instance variables should not be visible to users; code that uses the objects
can refer only to their methods.
‘The instance variables are prevented from being visible to users by declar-
ing them to be private. Java allows instance variables to have other than
private visibility, It is generally not a good idea to have public instance vari-
ables: this point will be discussed in more detail in Sections 5.6.2 and 5.9. The
‘one exception to this rule occurs when defining record types; record types
are discussed in Section 5.3.4
Declarations of instance variables do not have the static qualifier. These
‘variables belong to objects: there isa separate set of them for each object. It
is also possible to declare static variables within a class. Such variables belong.
to the class itself, rather than to specific objects, just as static methods belong.
to the class. Static variables are not used very often in implementing data
abstractions; some examples oftheir use will be given in Chapter 15.
Implementation of IntSet
This section gives a first example of an implementation, for the IntSet data
abstraction. The implementation is given in Figure 5.6.
    The first point to note here is the definition of the IntSet rep, preceding
the implementations of the constructors and methods. In this case, the rep
Figure 5.6 Implementation of IntSet
    import java.util.Vector;

    public class IntSet {
      // OVERVIEW: IntSets are unbounded, mutable sets of integers.
      private Vector els; // the rep

      // constructors
      public IntSet ( ) {
        // EFFECTS: Initializes this to be empty.
        els = new Vector( ); }

      // methods
      public void insert (int x) {
        // MODIFIES: this
        // EFFECTS: Adds x to the elements of this.
        Integer y = new Integer(x);
        if (getIndex(y) < 0) els.add(y); }

      public void remove (int x) {
        // MODIFIES: this
        // EFFECTS: Removes x from this.
        int i = getIndex(new Integer(x));
        if (i < 0) return;
        els.set(i, els.lastElement( ));
        els.remove(els.size( ) - 1); }

      public boolean isIn (int x) {
        // EFFECTS: Returns true if x is in this else returns false.
        return getIndex(new Integer(x)) >= 0; }

      private int getIndex (Integer x) {
        // EFFECTS: If x is in this returns index where x appears else returns -1.
        for (int i = 0; i < els.size( ); i++)
          if (x.equals(els.get(i))) return i;
        return -1; }

      public int size ( ) {
        // EFFECTS: Returns the cardinality of this.
        return els.size( ); }

      public int choose ( ) throws EmptyException {
        // EFFECTS: If this is empty throws EmptyException else
        //   returns an arbitrary element of this.
        if (els.size( ) == 0) throw new EmptyException("IntSet.choose");
        // The vector holds Objects, so cast back to Integer before unwrapping.
        return ((Integer) els.lastElement( )).intValue( ); }
    }
consists of a single instance variable. Since this variable has private visibility,
it can be accessed only by code inside its class.
    The constructors and methods belong to a particular object of their type.
The object is passed as an additional, implicit argument to the constructors
and methods, and they can refer to it using the keyword this. For example,
the instance variable els can be accessed using the form this.els. (The code
cannot assign to this.) However, the prefix is not needed: the code can refer
to methods and instance variables of its own object by just using their names.
Thus, in the methods and constructors in the figure, els refers to the els
instance variable of this.
    The implementation of IntSet is straightforward. The constructor
initializes its object by creating the vector that will hold the elements and
assigning it to els; since the vector is empty, no more work need be done. The
insert, remove, and isIn methods all make use of the private method, getIndex,
to determine whether the element of interest is already in the set. Doing this
check allows insert to preserve the no-duplicates condition. This condition
is relied on in size (since otherwise the size of the vector would not be the
same as the size of the set) and in remove (since otherwise there might be other
occurrences of the element that would need to be removed).
    Note that getIndex has private visibility; therefore, it cannot be called
outside the class. The design takes advantage of this fact by having getIndex
return -1 when the element is not in the vector rather than using an exception.
As discussed in Chapter 4, this is a satisfactory approach here, since getIndex
is used only within this class.
    Since vectors cannot store ints, the methods use Integer objects instead to
contain the set elements. This approach is somewhat awkward. An alternative
is to use arrays of ints, but this has its own difficulties, since then the
implementation of IntSet would need to switch to bigger arrays as the set
grows. The implementation of Vector takes care of this problem in an efficient
way.
    getIndex uses the equals method to check for membership. This check is
correct because equals for Integer objects returns true only if the two objects
being compared are both Integers and both contain the same integer value.
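    The same design can be sketched with a generic list, which removes the
casts and explicit Integer wrapping described above. This is only an
illustrative variant, not the implementation of Figure 5.6: the class name
GenericIntSet is hypothetical, and NoSuchElementException stands in for
EmptyException, which is not defined in this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Sketch only: a generics-based variant of the IntSet in Figure 5.6.
class GenericIntSet {
    // The rep: a list that never contains duplicate elements.
    private final List<Integer> els = new ArrayList<>();

    // Adds x unless it is already present, preserving the no-duplicates condition.
    public void insert(int x) {
        if (getIndex(x) < 0) els.add(x);
    }

    // Removes x by overwriting its slot with the last element and dropping the last slot.
    public void remove(int x) {
        int i = getIndex(x);
        if (i < 0) return;
        int last = els.size() - 1;
        els.set(i, els.get(last));
        els.remove(last); // int argument selects remove(int index)
    }

    public boolean isIn(int x) { return getIndex(x) >= 0; }

    // Correct only because the rep holds no duplicates.
    public int size() { return els.size(); }

    // NoSuchElementException is an assumption standing in for EmptyException.
    public int choose() {
        if (els.isEmpty()) throw new NoSuchElementException("GenericIntSet.choose");
        return els.get(els.size() - 1);
    }

    // Private helper: returns the index of x, or -1 if x is absent.
    private int getIndex(int x) {
        for (int i = 0; i < els.size(); i++)
            if (els.get(i) == x) return i; // Integer is unboxed for the comparison
        return -1;
    }
}
```

Because getIndex is private, returning -1 instead of throwing an exception
remains an internal convention that users of the class never see.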
Implementation of Poly
Now we consider the implementation of the Poly data abstraction. Unlike
IntSets, Polys are immutable, and therefore, their size does not change over
time. Therefore, we can represent a Poly as an array rather than a vector.
The ith element of the array will contain the coefficient of the ith exponent;
this representation makes sense only if the Poly is dense. The zero Poly can
be represented either as an empty array or as a one-element array containing
zero; we will use the latter approach. In addition, we will have an instance
variable that keeps track of the degree of the Poly since this is convenient.
    Figures 5.7 and 5.8 show the implementation of Poly. The main point
to note here is that several of the methods (e.g., add and mul) make use of
instance variables of other Poly objects in addition to their own object. Code
in a method is allowed to access private information in other objects of its class
as well as private information in its own object.
    Note how sub and mul are implemented in terms of other Poly methods.
Another point is the use of the Poly constructor in the implementations of
add, mul, and minus. All of these methods actually initialize the new Poly
themselves; this is allowed since the new Poly is just another object of the
class, which can be accessed in the method. These methods create the new
Poly using the private constructor (which cannot be called by users) to get
an array of the right size. In the case of mul, we rely on the fact that the array
constructor initializes all elements of an array of ints to zero. Also, note the
care taken to ensure that the new Poly object is the right size. This requires
a precomputation in the add method to handle the case of trailing zeros.
Records
Suppose polynomials are going to be sparse rather than dense. In this case, the
previous implementation would not be a good one, since the array is likely to
be large and full of zeros. Instead, we would like to store information only for
the coefficients that are nonzero.
    This could be accomplished by using two vectors:
    private Vector coeffs; // the non-zero coefficients
    private Vector exps;   // the associated exponents
However, the implementation in this case must ensure that the two vectors
are lined up, so that the ith element of coeffs contains the coefficient that
goes with the exponent stored in the ith element of exps. It would be more
convenient if instead we could use just one vector, each of whose elements
contained both the coefficient and the associated exponent.
Figure 5.7 First part of Poly implementation
    public class Poly {
      // OVERVIEW:
      private int[ ] trms;
      private int deg;

      // constructors
      public Poly ( ) {
        // EFFECTS: Initializes this to be the zero polynomial.
        trms = new int[1]; deg = 0; }

      public Poly (int c, int n) throws NegativeExponentException {
        // EFFECTS: If n < 0 throws NegativeExponentException else
        //   initializes this to be the Poly cx^n.
        if (n < 0)
          throw new NegativeExponentException("Poly(int,int) constructor");
        if (c == 0) { trms = new int[1]; deg = 0; return; }
        trms = new int[n+1];
        for (int i = 0; i < n; i++) trms[i] = 0;
        trms[n] = c;
        deg = n; }

      private Poly (int n) { trms = new int[n+1]; deg = n; }

      // methods
      public int degree ( ) {
        // EFFECTS: Returns the degree of this, i.e., the largest exponent
        //   with a non-zero coefficient. Returns 0 if this is the zero Poly.
        return deg; }

      public int coeff (int d) {
        // EFFECTS: Returns the coefficient of the term of this whose exponent is d.
        if (d < 0 || d > deg) return 0; else return trms[d]; }

      public Poly sub (Poly q) throws NullPointerException {
        // EFFECTS: If q is null throws NullPointerException else
        //   returns the Poly this - q.
        return add (q.minus( )); }

      public Poly minus ( ) {
        // EFFECTS: Returns the Poly - this.
        Poly r = new Poly(deg);
        // The bound is <= deg so that the highest-order coefficient is negated too.
        for (int i = 0; i <= deg; i++) r.trms[i] = - trms[i];
        return r; }
Figure 5.8 Rest of the implementation of the Poly data abstraction
      public Poly add (Poly q) throws NullPointerException {
        // EFFECTS: If q is null throws NullPointerException else
        //   returns the Poly this + q.
        Poly la, sm;
        if (deg > q.deg) {la = this; sm = q;} else {la = q; sm = this;}
        int newdeg = la.deg; // new degree is the larger degree
        if (deg == q.deg)    // unless there are trailing zeros
          for (int k = deg; k > 0; k--)
            if (trms[k] + q.trms[k] != 0) break; else newdeg--;
        Poly r = new Poly(newdeg); // get a new Poly
        int i;
        for (i = 0; i <= sm.deg && i <= newdeg; i++)
          r.trms[i] = sm.trms[i] + la.trms[i];
        for (int j = i; j <= newdeg; j++) r.trms[j] = la.trms[j];
        return r; }

      public Poly mul (Poly q) throws NullPointerException {
        // EFFECTS: If q is null throws NullPointerException else
        //   returns the Poly this * q.
        if ((q.deg == 0 && q.trms[0] == 0) ||
            (deg == 0 && trms[0] == 0)) return new Poly( );
        Poly r = new Poly(deg+q.deg);
        r.trms[deg+q.deg] = 0; // prepare to compute coeffs
        for (int i = 0; i <= deg; i++)
          for (int j = 0; j <= q.deg; j++)
            r.trms[i+j] = r.trms[i+j] + trms[i]*q.trms[j];
        return r; }
    }
    This can be accomplished by using a record. Most languages provide
records as a built-in feature. For example, in C and C++, you can define a
struct with named fields of various types. Java, however, does not provide
this ability. Instead, record types must be defined using classes.
    A record is simply a collection of fields, each with a name and type.
The class implementing such a type has a public or package-visible instance
variable for each field; package visibility means the fields can be accessed
by other code in the same package but nowhere else. The class provides
Figure 5.9 A record type
    class Pair {
      // OVERVIEW: A record type
      int coeff;
      int exp;
      Pair(int c, int n) { coeff = c; exp = n; }
    }
a constructor for creating a new object of the type; the constructor takes
arguments to define the initial values of the fields. An example is given in
Figure 5.9. Since no visibility is explicitly indicated for the class and its
instance variables, they are package visible.
    Note that no specification is given for this class, other than to indicate that
it is a record type. Such a minimal specification is sufficient: knowing that the
class defines a record type indicates that the type simply provides the fields
defined by the instance variables.
    We can use Pair in an implementation of sparse polynomials:
    private Vector trms; // the terms with non-zero coefficients
Here each element of trms is a Pair. This representation is simpler than the
one using two vectors. An additional benefit is that it allows us to avoid the
use of the intValue method. For example, consider the implementation of the
coeff method. If we are using two vectors, we have:
    public int coeff (int x) {
      for (int i = 0; i < exps.size( ); i++)
        if (((Integer) exps.get(i)).intValue( ) == x)
          return ((Integer) coeffs.get(i)).intValue( );
      return 0; }
If we use the vector of pairs, however, we have
    public int coeff (int x) {
      for (int i = 0; i < trms.size( ); i++) {
        Pair p = (Pair) trms.get(i);
        if (p.exp == x) return p.coeff; }
      return 0; }
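    The record-based representation can be tried out in a small
self-contained sketch. The class SparsePoly and its addTerm helper are
hypothetical names introduced only for illustration; a generic list is
assumed in place of Vector so that no cast is needed.

```java
import java.util.ArrayList;
import java.util.List;

// A record type in the style of Figure 5.9: package-visible fields, no methods.
class Pair {
    int coeff;
    int exp;
    Pair(int c, int n) { coeff = c; exp = n; }
}

// Hypothetical sparse polynomial that stores only the non-zero terms.
class SparsePoly {
    private final List<Pair> trms = new ArrayList<>();

    // Hypothetical helper for building examples. Assumes n >= 0 and that
    // no term with exponent n has been stored already.
    void addTerm(int c, int n) {
        if (c != 0) trms.add(new Pair(c, n)); // zero coefficients are never stored
    }

    // Returns the coefficient of the term with exponent x, or 0 if none is stored.
    int coeff(int x) {
        for (Pair p : trms)
            if (p.exp == x) return p.coeff;
        return 0;
    }
}
```

A polynomial such as 3x^1000 - 1 then occupies two Pair objects rather than
an array of 1001 ints.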
5.4 Additional Methods
So far, we have ignored some additional methods that all objects have. These
are methods defined by Object. All classes define subtypes of Object, and
therefore, they must provide all the Object methods. Furthermore, classes
will inherit the implementations of those methods unless they implement
the methods explicitly. (Inheritance will be discussed in detail in a later
chapter.) Inheriting the Object methods is helpful if the inherited
implementation is correct for the new type; otherwise, the class must provide
an implementation.
    This section discusses some of these methods. Of particular interest are
the methods equals, clone, and toString. (See Sidebar 5.1.)
    Two objects should be equals if they are behaviorally equivalent. This
means that it is not possible to distinguish between them using any sequence
of calls to the objects' methods. In the case of mutable objects, two distinct
objects are distinguishable even if they currently have the same state. For
example, consider the following code:
    IntSet s = new IntSet( );
    IntSet t = new IntSet( );
    if (s.equals(t)) ...; else ...
Sidebar 5.1 equals, clone, and toString
    ■ Two objects are equals if they are behaviorally equivalent. Mutable
      objects are equals only if they are the same object; such types can
      inherit equals from Object. Immutable objects are equals if they have
      the same state; immutable types must implement equals themselves.
    ■ clone should return an object that has the same state as its object.
      Immutable types can inherit clone from Object, but mutable types must
      implement it themselves.
    ■ toString should return a string showing the type and current state of
      its object. All types must implement toString themselves.
    At the time the if is executed, both s and t have the same state (the empty
set). However, s and t are nevertheless distinguishable, because of mutations;
for example, if the code now does s.insert(3), s and t will have different
states. Therefore, the call to equals in the if statement must return false.
In other words, for mutable objects s and t, s.equals(t) (or t.equals(s))
should return false if s and t are different objects even when they have the
same state.
    On the other hand, if two immutable objects have the same state, they
should be considered equal because there will not be any way to distinguish
among them by calling their methods. For example, consider
    Poly p = new Poly(3, 4);
    Poly q = new Poly(3, 4);
    if (p.equals(q)) ...; else ...
When the if statement is executed, p and q have the same state (the polynomial
3x^4). Furthermore, because Polys are immutable, p and q will always have the
same state. Therefore, the call p.equals(q) in the if statement should return
true.
    The default implementation of equals provided by Object tests whether
the two objects have the same identity. This is the right test for IntSet: s and
t are not equivalent even though they have the same state. However, it is the
wrong test for Poly, and it will be the wrong test for any immutable type.
Therefore, when you define an immutable type, you need to provide your
own implementation of equals. However, you need not worry about equals
for mutable types; objects of these types will have an equals method, namely
the one they inherit from Object, that does the right thing.
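    As a rough illustration of providing equals for an immutable type, the
sketch below compares state rather than identity. The class DensePoly is a
hypothetical stand-in for Poly, and it assumes callers pass coefficient
arrays without trailing zeros. It also overrides hashCode, which is standard
Java practice whenever equals is overridden.

```java
import java.util.Arrays;

// Hypothetical immutable type: a dense polynomial given by its coefficient array.
final class DensePoly {
    private final int[] trms; // never modified after construction

    // Assumes coeffs is non-null and has no trailing zeros.
    // The array is copied so that callers cannot mutate the rep.
    DensePoly(int... coeffs) { trms = coeffs.clone(); }

    // State-based equality: two DensePolys are equal if their coefficients match.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof DensePoly)) return false;
        return Arrays.equals(trms, ((DensePoly) o).trms);
    }

    // Equal objects must produce equal hash codes.
    @Override
    public int hashCode() { return Arrays.hashCode(trms); }
}
```

With the inherited identity test, two separately constructed but identical
polynomials would compare unequal; with this version they compare equal.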
    Object also provides a hashCode method. The specification of hashCode
indicates that if two objects are equivalent according to the equals method,
hashCode should produce the same value for them. Yet the default
implementation for hashCode will not do this for immutable types. hashCode is
needed only for types that are intended to be keys in hash tables. If your
immutable type is one of these, you must implement hashCode in a way that
observes this constraint on its behavior.
    There is a weaker equality notion that we will call similarity. Two objects
are similar if it is not possible to distinguish between them using any observers
of their type. Just as it is useful to have a standard name equals for the method
that does equivalence testing, it is also useful to have a standard name for the
method that provides similarity testing. We will call this method similar.
There is no requirement to provide this method in a new type, but you can
do so if you wish.
    For immutable types, similar and equals are the same. However, for
mutable types, similarity is weaker than equivalence. For example, in
    IntSet s = new IntSet( );
    IntSet t = new IntSet( );
    if (s.similar(t)) ...; else ...
the call to similar should return true.
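    The difference between the two notions can be seen in a short sketch.
The class MutableIntSet below is hypothetical and uses a HashSet as its rep
purely for brevity; it inherits equals from Object and adds a similar method
that compares current state.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical mutable set; equals is inherited from Object (identity test).
class MutableIntSet {
    private final Set<Integer> els = new HashSet<>();

    void insert(int x) { els.add(x); }

    // Similarity: true if the two sets currently contain the same elements.
    boolean similar(MutableIntSet other) {
        return other != null && els.equals(other.els);
    }
}
```

Two freshly created sets are similar but not equals, and they stop being
similar as soon as one of them is mutated.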
    The clone method makes a copy of its object. The copy it produces should
have the same state as its object; that is, it should be similar to the object being
cloned. The default implementation provided by Object simply assigns from
the instance variables of the old object to those of the new one. This is often
not a correct implementation. For example, in the case of IntSet, it would
cause the els components of the two objects to share the same vector. Then,
when a modification is done to one of them (e.g., an insert), the state of
the other will also change, which is incorrect. On the other hand, the default
implementation is correct for Poly; again there is sharing (of the array that
is the trms component), but the sharing doesn't matter because that array is
never modified.
    If you want a type to provide a clone method, you must provide your
own implementation if the default implementation is not correct. In general,
the default implementation will be correct for immutable types and incorrect
for mutable ones. If the default implementation is correct, you can inherit
it by putting implements Cloneable in the class header. If a class neither
includes this clause in its header nor provides an implementation of clone,
then if the clone method is called on one of its objects, the code will throw
CloneNotSupportedException.
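    One common way to write clone for a mutable type is to start from the
default field-by-field copy and then replace any shared mutable component.
The sketch below shows this under assumed names (CloneableIntSet is
hypothetical); it is one possible approach, not the one used in the figures
of this chapter.

```java
import java.util.ArrayList;

// Hypothetical mutable type whose clone copies the rep instead of sharing it.
class CloneableIntSet implements Cloneable {
    private ArrayList<Integer> els = new ArrayList<>();

    void insert(int x) { if (!els.contains(x)) els.add(x); }

    int size() { return els.size(); }

    @Override
    public CloneableIntSet clone() {
        try {
            // Default behavior: the copy initially shares the same list object.
            CloneableIntSet copy = (CloneableIntSet) super.clone();
            // Give the copy its own list so later inserts do not affect the original.
            copy.els = new ArrayList<>(els);
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen here because the class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}
```

If the line that replaces copy.els were omitted, an insert into the copy
would also change the original, which is the sharing problem described above.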
    For example, the implementations of IntSet and Poly shown earlier do not
support clone and, therefore, should the clone method be called on an IntSet
or Poly object, it will raise CloneNotSupportedException. If we wanted
these types to provide clone, we would need to reimplement it for IntSet,
but we could inherit it for Poly. The situation is illustrated in Figure 5.10,
which shows how to provide clone and equals for Poly and IntSet. Note
that no specification is given for these methods since they have standard
meanings.
Figure 5.10 The clone and equals methods
    public class Poly implements Cloneable {
      // as given before, plus
      public boolean equals (Poly q) {
        if (q == null || deg != q.deg) return false;
        for (int i = 0; i <= deg; i++)
          if (trms[i] != q.trms[i]) return false;
        return true; }
      public boolean equals (Object z) {
        if (!(z instanceof Poly)) return false;
        return equals((Poly) z); }
    }
    public class IntSet {
      // as given before, plus
      private IntSet (Vector v) { els = v; }
      public Object clone ( ) {
        return new IntSet((Vector) els.clone( )); }
    }
    Poly implements equals but inherits clone from Object because of the
implements Cloneable in its header. Note that Poly provides two (overloaded)
definitions for equals, one overriding the Object method and an extra one:
    boolean equals (Object) // header of Object method
    boolean equals (Poly)   // header of Poly method
The second one is an optimization; it avoids the cast and the call on
instanceof, which are expensive, in contexts in which both the object and
the argument are known by the compiler to be Polys. For example, consider
    Poly x = new Poly(3, 7);
    Object y = new Poly(3, 7);
    if (x.equals(new Poly(3, 7))) ...
    if (x.equals(y)) ...
In the first if statement, the call will go to the optimized implementation
of equals because the compiler knows both x and the argument are Polys,
but the second call will go to the unoptimized implementation because the
compiler doesn't know that y is a Poly.
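    The way the compiler selects between the two definitions can be observed
directly. In the sketch below, Coeff is a hypothetical immutable class, and
the static counter objectCalls exists only to make the dispatch visible; it
would not appear in real code.

```java
// Hypothetical immutable type with an overloaded and an overriding equals.
final class Coeff {
    private final int value;
    static int objectCalls = 0; // counts calls to equals(Object); illustration only

    Coeff(int v) { value = v; }

    // Overload: selected when the argument's compile-time type is Coeff.
    public boolean equals(Coeff c) {
        return c != null && value == c.value;
    }

    // Override of Object.equals: selected when the argument's compile-time type is Object.
    @Override
    public boolean equals(Object o) {
        objectCalls++;
        if (!(o instanceof Coeff)) return false;
        return equals((Coeff) o); // the cast makes this call resolve to the overload
    }

    @Override
    public int hashCode() { return value; }
}
```

Overload resolution happens at compile time, so it depends on the declared
type of the argument and not on the class of the object it refers to at run
time.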
    IntSet implements clone but inherits equals from Object. Note that the
implementation uses an additional constructor.
Aids to Understanding Implementations
In this section, we discuss two pieces of information, the abstraction function
and the representation invariant, that are particularly useful in understanding
an implementation of a data abstraction.
    The abstraction function captures the designer's intent in choosing a
particular representation. It is the first thing you decide on when inventing the
rep: what instance variables to use and how they relate to the abstract object
they are intended to represent. The abstraction function simply describes this
decision.
    The rep invariant is invented as you investigate how to implement the
constructors and methods. It captures the common assumptions on which
these implementations are based; in doing so, it allows you to consider the
implementation of each operation in isolation of the others.
    The abstraction function and rep invariant together provide valuable
documentation, both to the original implementor and to others who read the code.
They capture the reason why the code is the way it is: for example, why the
implementation of choose can return the zero element of els (since the
elements of els represent the elements of the set), or why size can simply return
the size of els (because there are no duplicates in els).
    Because they are so useful, both the abstraction function and rep invariant
should be included as comments in the code. This section describes how to
define them and also how to provide them as methods.
The Abstraction Function
Any implementation of a data abstraction must define how objects belonging
to the type are represented. In choosing the representation, the implementor
has in mind a relationship between the rep and the abstract objects. For
example, in the implementation in Figure 5.6, IntSets are represented by
vectors, where the elements of the vector are the elements of the set.
Figure 5.12 An example of an abstraction function
    This relationship can be defined by a function called the abstraction
function that maps from the instance variables that make up the rep of an object
to the abstract object being represented:
    AF: C → A
Specifically, the abstraction function AF maps from a concrete state (i.e., the
state of an object of the class C) to an abstract state (i.e., the state of an abstract
object). For each object c belonging to C, AF(c) is the state of the abstract
object a ∈ A that c represents.
    For example, the abstraction function for the IntSet implementation maps
the instance variables of objects of the IntSet class to abstract IntSet states.
Figure 5.12 illustrates this function at some points; it shows how objects with
various els components map to IntSet states. This abstraction function is
many-to-one: many els components map to the same abstract element. For
example, the IntSet {1, 2} is represented by an object whose els vector
contains the Integer with value 1 followed by the Integer with value 2,
and also by an object whose els vector contains the two integers in the
opposite order. Since the process of abstraction involves forgetting irrelevant
information, it is not surprising that abstraction functions are often
many-to-one. In this example, the order in which the elements appear in the els
component is irrelevant.
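    The many-to-one property can be checked mechanically by writing the
abstraction function as code. In the sketch below, the method af and the
class AbstractionFunctionDemo are hypothetical names, and java.util.Set is
assumed as a stand-in for the mathematical set that an IntSet denotes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class AbstractionFunctionDemo {
    // AF: maps a rep (a duplicate-free list of Integers) to the abstract set it represents.
    static Set<Integer> af(List<Integer> els) {
        return new TreeSet<>(els); // element order in the rep is forgotten
    }

    public static void main(String[] args) {
        List<Integer> rep1 = Arrays.asList(1, 2);
        List<Integer> rep2 = Arrays.asList(2, 1);
        // The two reps differ as lists but map to the same abstract set {1, 2}.
        System.out.println(rep1.equals(rep2));         // prints false
        System.out.println(af(rep1).equals(af(rep2))); // prints true
    }
}
```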
    The abstraction function is a crucial piece of information about an
implementation. It defines the meaning of the representation, the way in which the
objects of the class are supposed to implement the abstract objects. It should
always be described in a comment in the implementation.
    In writing such a description, however, we are hampered by the fact
that if the specification of the type is informal, the range of the abstraction
function (the set A) is not really defined. We shall overcome this problem by
giving a description of a "typical" abstract object. This allows us to define
the abstraction function in terms of this typical object. The description of the
typical abstract object state is part of the specification: it is provided in the
overview section. For example, the overview for IntSet stated
    // A typical IntSet is {x1, ..., xn}
(Recall that we are using mathematical sets to denote IntSet states.) Then we
can say
    // The abstraction function is
    // AF(c) = { c.els[i].intValue | 0 <= i < c.els.size }
    The rep invariant is a predicate, J: C → boolean,
that is true of legitimate objects. For example, for IntSet, we might state the
following rep invariant:
    // The rep invariant is
    //   c.els ≠ null &&
    //   for all integers i . c.els[i] is an Integer &&
    //   for all integers i, j . (0 <= i < j < c.els.size =>
    //     c.els[i].intValue ≠ c.els[j].intValue)
Thus, J is false if els contains duplicates; additionally, it rules out a rep in
which els does not refer to a vector, as well as a rep in which the els vector
contains something other than an Integer. This rep invariant is written using
predicate calculus notation. Figure 5.13 summarizes the notation we will use
in this book.
    The rep invariant can also be given more informally:
    // The rep invariant is
    //   c.els ≠ null &&
    //   all elements of c.els are Integers &&
    //   there are no duplicates in c.els
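    A rep invariant of this kind translates almost line by line into a
checking method. The following is a minimal sketch, assuming the rep is
passed in as a java.util.List; the class name RepCheck is hypothetical, and
the method is static only so that the example stands alone.

```java
import java.util.List;

class RepCheck {
    // Returns true if els satisfies the three clauses of the rep invariant:
    // non-null, every element an Integer, and no duplicates.
    static boolean repOk(List<?> els) {
        if (els == null) return false;
        for (int i = 0; i < els.size(); i++) {
            Object x = els.get(i);
            if (!(x instanceof Integer)) return false; // also rejects null elements
            for (int j = i + 1; j < els.size(); j++)
                if (x.equals(els.get(j))) return false; // duplicate found
        }
        return true;
    }
}
```

The nested loop mirrors the quantifier over pairs i < j in the formal
version of the invariant.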
Figure 5.13 Predicate calculus notation
    ■ && will be used for conjunction: p && q is true if p is true and q is true.
    ■ || will be used for disjunction: p || q is true if either p is true or q is true.
    ■ => will be used for implication: p => q means that if p is true, then q is
      also true. Note that false => anything; i.e., if p is false, then we can
      deduce whatever we like.
    ■ iff (if and only if) will be used for double implication: p iff q means
      that p => q and q => p.
    ■ for all x in s . p(x) means that predicate p(x) is true for all x in set s.
    ■ there exists x in s . p(x) means that there is at least one x in set s for
      which the predicate p(x) is true.
    As a second example, consider an alternative representation of IntSets
that consists of an array of 100 booleans plus a vector:
    private boolean[ ] els = new boolean[100];
    private Vector otherEls;
    private int sz;
The idea here is that for an integer i in the range 0 ... 99, we record membership
in the set by storing true in els[i]. Integers outside this range will be stored
in otherEls in the same manner as in our previous implementation of IntSet.
Since it would be expensive to compute the size of the IntSet if we had to
examine every part of the els array, we also store the size explicitly in the
rep. This representation is a good one if almost all members of the set are in
the range 0 ... 99 and if we expect the set to have quite a few members in this
range. Otherwise, the space required for the els array will be wasted.
    For this representation we have
    // The abstraction function is
    // AF(c) = { c.otherEls[i].intValue | 0 <= i < c.otherEls.size }
    //         +
    //         { j | 0 <= j < 100 && c.els[j] }
In other words, the set is the union of the elements of otherEls and the indexes
of the true elements of els. Also, we have
    // The rep invariant is
    //   c.els ≠ null && c.otherEls ≠ null && c.els.size = 100 &&
    //   all elements in c.otherEls are Integers &&
    //   all elements in c.otherEls are not in the range 0 to 99 &&
    //   there are no duplicates in c.otherEls &&
    //   c.sz = c.otherEls.size + (count of true entries in c.els)
    Note that the sz instance variable of this rep is redundant; it holds
information that can be computed directly from the other instance variables.
Whenever there is redundant information in the rep, the relationship of this
information to the rest of the rep should be explained in the rep invariant (for
example, in the last line of this rep invariant).
    It is sometimes convenient to use a helping function in the rep invariant or
abstraction function. For example, the last line of the preceding rep invariant
could be rewritten
    // c.sz = c.otherEls.size + cnt(c.els, 0)
    //   where cnt(a, i) = if i >= a.size then 0
    //                     else if a[i] then 1 + cnt(a, i+1)
    //                     else cnt(a, i+1)
The helping function cnt is defined by a recurrence relation.
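    The recurrence carries over directly to a recursive Java method. The
sketch below is illustrative only; the class name CountHelper is
hypothetical, and a.length plays the role of a.size in the comment notation.

```java
class CountHelper {
    // cnt(a, i): the number of true entries of a at index i or later.
    static int cnt(boolean[] a, int i) {
        if (i >= a.length) return 0;          // past the end: nothing left to count
        if (a[i]) return 1 + cnt(a, i + 1);   // count this entry and continue
        return cnt(a, i + 1);                 // skip this entry and continue
    }
}
```

For a 100-element array the recursion depth is at most 101 calls, so the
direct recursive form is adequate here; a loop would compute the same value.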
    The implementation of Poly in Figure 5.7 has an interesting rep invariant.
Recall that we chose to store coefficients only up to the degree, without any
trailing zeros except in the case of the zero polynomial. Therefore, we do not
expect to find a zero in the high element of the trms component unless the
component has just one element. In addition, these arrays always have at least
one element. Furthermore, deg must be one less than the size of trms. Thus
we have
    // The rep invariant is
    //   c.trms ≠ null && c.trms.length >= 1
    //   && c.deg > 0 => c.trms[deg] ≠ 0
    //   && c.deg = c.trms.length-1
Recall that the implementation of the coeff operation depended on the length
of the array being one greater than the degree of the Poly; now we see this
requirement spelled out in the rep invariant.
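    Each clause of this invariant can be tested with one condition. The
sketch below assumes the two rep components are passed in as parameters so
that it stands alone; the class name PolyRep is hypothetical.

```java
class PolyRep {
    // Returns true if (trms, deg) satisfies the Poly rep invariant stated above.
    static boolean repOk(int[] trms, int deg) {
        if (trms == null || trms.length < 1) return false; // at least one element
        if (deg != trms.length - 1) return false;          // deg is one less than the length
        if (deg > 0 && trms[deg] == 0) return false;       // no trailing zero above degree 0
        return true;
    }
}
```

The length check comes before the trailing-zero check so that trms[deg] is
only read once deg is known to be a valid index.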
    Sometimes all concrete objects are legal representations. Then we have
simply
    // The rep invariant is
    //   true
This is what happens for record types: record objects are used by accessing
their fields directly. This means that using code will be able to modify the
fields, which in turn means that the class implementing the record cannot
constrain the rep in any way. Of course, there might be some constraints on
how the record objects are used that define a stronger relationship between
the fields, but these constraints would be ensured by the code that uses
the record objects and would show up in the rep invariant for that code.
For example, the rep invariant for the sparse polynomial implementation
discussed in Section 5.3.4 would include
    // for all elements e of c.trms .
    //   e is a Pair and e.exp >= 0 and e.coeff ≠ 0
    Rep invariants need not be given for record types because all these classes
have exactly the same rep invariant. They must be given for all other types,
even those for which the invariant is simply true. Giving the invariant may
prevent the implementor from depending on a stronger, unsatisfied invariant.
5.5.3 Implementing the Abstraction Function and Rep Invariant
In addition to providing the abstraction function and rep invariant as
comments in your code, you should also provide methods to implement them. (The
only exception to this rule is record types, which do not need these methods.)
These methods are useful for finding errors in your code; in addition, the
implementation of the abstraction function can be used to do output. Sidebar
5.2 summarizes the abstraction function and rep invariant.
    The toString method is used to implement the abstraction function. The
method that checks the rep invariant is called repOk. It has the following
specification:
    public boolean repOk( )
      // EFFECTS: Returns true if the rep invariant holds for this;
      //   otherwise returns false.
The method is public because we want it to be callable by code outside of its
class. Every type should provide this method, but a specification need not be
given for it since the specification is identical for every type.
    Figure 5.14 gives implementations of the repOk methods for the classes we
have seen so far. Note the use of the instanceof operator in repOk for IntSet
to check that the element is an Integer.
    The repOk method is used in two ways. Test programs can call it to check
whether an implementation is preserving the rep invariant. Or you can use