The Pragmatic Programmer PDF
“The cool thing about this book is that it’s great for keeping the
programming process fresh. [The book] helps you to continue to grow
and clearly comes from people who have been there.”
Kent Beck, author of Extreme Programming Explained:
Embrace Change
“I would buy a copy, read it twice, then tell all my colleagues to run
out and grab a copy. This is a book I would never loan because I would
worry about it being lost.”
Kevin Ruland, Management Science, MSG-Logistics
ADDISON–WESLEY
An imprint of Addison Wesley Longman, Inc.
Reading, Massachusetts Harlow, England Menlo Park, California
Berkeley, California Don Mills, Ontario Sydney
Bonn Amsterdam Tokyo Mexico City
Lyrics from the song “The Boxer” on page 157 are Copyright © 1968 Paul Simon. Used by
permission of the Publisher: Paul Simon Music. Lyrics from the song “Alice’s Restaurant”
on page 220 are by Arlo Guthrie, © 1966, 1967 (renewed) by Appleseed Music Inc. All
Rights Reserved. Used by Permission.
The authors and publisher have taken care in the preparation of this book, but make
no express or implied warranty of any kind and assume no responsibility for errors or
omissions. No liability is assumed for incidental or consequential damages in connection
with or arising out of the use of the information or programs contained herein.
The publisher offers discounts on this book when ordered in quantity for special sales.
For more information, please contact:
AWL Direct Sales
Addison Wesley Longman, Inc.
One Jacob Way
Reading, Massachusetts 01867
(781) 944-3700
All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photo-
copying, recording, or otherwise, without the prior written permission of the publisher.
Printed in the United States of America. Published simultaneously in Canada.
ISBN 0-201-61622-X
Text printed in the United States on recycled paper at Courier Stoughton in Stoughton, Massachusetts.
25th Printing February 2010
Preface
1 A Pragmatic Philosophy
1. The Cat Ate My Source Code
2. Software Entropy
3. Stone Soup and Boiled Frogs
4. Good-Enough Software
5. Your Knowledge Portfolio
6. Communicate!
2 A Pragmatic Approach
7. The Evils of Duplication
8. Orthogonality
9. Reversibility
10. Tracer Bullets
11. Prototypes and Post-it Notes
12. Domain Languages
13. Estimating
3 The Basic Tools
14. The Power of Plain Text
15. Shell Games
16. Power Editing
17. Source Code Control
18. Debugging
19. Text Manipulation
20. Code Generators
Appendices
A Resources
Professional Societies
Building a Library
Internet Resources
Bibliography
Index
Simply put, this book tells you how to program in a way that you can
follow. You wouldn’t think that that would be a hard thing to do, but it
is. Why? For one thing, not all programming books are written by pro-
grammers. Many are compiled by language designers, or the journalists
who work with them to promote their creations. Those books tell you
how to talk in a programming language—which is certainly important,
but that is only a small part of what a programmer does.
Imagine that you are sitting in a meeting. Maybe you are thinking
that the meeting could go on forever and that you would rather be
programming. Dave and Andy would be thinking about why they were
having the meeting, and wondering if there is something else they could
do that would take the place of the meeting, and deciding if that some-
thing could be automated so that the work of the meeting just happens
in the future. Then they would do it.
That is just the way Dave and Andy think. That meeting wasn’t some-
thing keeping them from programming. It was programming. And it
was programming that could be improved. I know they think this way
because it is tip number two: Think About Your Work.
So imagine that these guys are thinking this way for a few years.
Pretty soon they would have a collection of solutions. Now imagine
them using their solutions in their work for a few more years, and
discarding the ones that are too hard or don’t always produce results.
Well, that approach just about defines pragmatic. Now imagine them
taking a year or two more to write their solutions down. You might
think, That information would be a gold mine. And you would be right.
The authors tell us how they program. And they tell us in a way that we
can follow. But there is more to this second statement than you might
think. Let me explain.
I’ve studied this problem for a dozen years and found the most promise
in a device called a pattern language. In short, a pattern is a solution,
and a pattern language is a system of solutions that reinforce each
other. A whole community has formed around the search for these
systems.
You can follow the advice in this book because it is concrete. You won’t
find vague abstractions. Dave and Andy write directly for you, as if each
tip was a vital strategy for energizing your programming career. They
make it simple, they tell a story, they use a light touch, and then they
follow that up with answers to questions that will come up when you
try.
And there is more. After you read ten or fifteen tips you will begin to see
an extra dimension to the work. We sometimes call it QWAN, short for
the quality without a name. The book has a philosophy that will ooze
into your consciousness and mix with your own. It doesn’t preach. It
just tells what works. But in the telling more comes through. That’s the
beauty of the book: It embodies its philosophy, and it does so unpre-
tentiously.
—Ward Cunningham
There are many people offering you help. Tool vendors tout the mir-
acles their products perform. Methodology gurus promise that their
techniques guarantee results. Everyone claims that their programming
language is the best, and every operating system is the answer to all
conceivable ills.
You adjust your approach to suit the current circumstances and envi-
ronment. You judge the relative importance of all the factors affecting a
project and use your experience to produce appropriate solutions. And
you do this continuously as the work progresses. Pragmatic Program-
mers get the job done, and do it well.
We don’t pretend to have all (or even most) of the answers, nor are
all of our ideas applicable in all situations. All we can say is that if
you follow our approach, you’ll gain experience rapidly, your produc-
tivity will increase, and you’ll have a better understanding of the entire
development process. And you’ll write better software.
thing new, you can grasp it quickly and integrate it with the rest of
your knowledge. Your confidence is born of experience.
Critical thinker. You rarely take things as given without first get-
ting the facts. When colleagues say “because that’s the way it’s
done,” or a vendor promises the solution to all your problems, you
smell a challenge.
Jack of all trades. You try hard to be familiar with a broad range
of technologies and environments, and you work to keep abreast of
new developments. Although your current job may require you to
be a specialist, you will always be able to move on to new areas and
new challenges.
We’ve left the most basic characteristics until last. All Pragmatic Pro-
grammers share them. They’re basic enough to state as tips:
TIP 1
TIP 2
decision you make, every day, and on every development. Never run on
auto-pilot. Constantly be thinking, critiquing your work in real time.
The old IBM corporate motto, THINK!, is the Pragmatic Programmer’s
mantra.
If this sounds like hard work to you, then you’re exhibiting the realistic
characteristic. This is going to take up some of your valuable time—time
that is probably already under tremendous pressure. The reward is a
more active involvement with a job you love, a feeling of mastery over
an increasing range of subjects, and pleasure in a feeling of continuous
improvement. Over the long term, your time investment will be repaid
as you and your team become more efficient, write code that’s easier to
maintain, and spend less time in meetings.
We disagree.
Within the overall structure of a project there is always room for in-
dividuality and craftsmanship. This is particularly true given the cur-
rent state of software engineering. One hundred years from now, our
engineering may seem as archaic as the techniques used by medieval
“Absolutely,” replied the gardener. “Do that for 500 years and you’ll
have a nice lawn, too.”
Great lawns need small amounts of daily care, and so do great pro-
grammers. Management consultants like to drop the word kaizen in
conversations. “Kaizen” is a Japanese term that captures the concept
of continuously making many small improvements. It was considered
to be one of the main reasons for the dramatic gains in productivity and
quality in Japanese manufacturing and was widely copied throughout
the world. Kaizen applies to individuals, too. Every day, work to refine
the skills you have and to add new tools to your repertoire. Unlike the
Eton lawns, you’ll start seeing results in a matter of days. Over the
years, you’ll be amazed at how your experience has blossomed and
your skills have grown.
What’s in a Name?
“When I use a word,” Humpty Dumpty said, in rather a scornful tone, “it means
just what I choose it to mean—neither more nor less.”
Lewis Carroll, Through the Looking-Glass
Having said all this, we decided to get revenge against the computer sci-
entists. Sometimes, there are perfectly good jargon words for concepts,
words that we’ve decided to ignore. Why? Because the existing jargon
is normally restricted to a particular problem domain, or to a partic-
ular phase of development. However, one of the basic philosophies of
this book is that most of the techniques we’re recommending are uni-
versal: modularity applies to code, designs, documentation, and team
www.pragmaticprogrammer.com
There you’ll also find links to resources we find useful, along with
updates to the book and news of other Pragmatic Programmer devel-
opments.
Send Us Feedback
We’d appreciate hearing from you. Comments, suggestions, errors in
the text, and problems in the examples are all welcome. E-mail us at
Acknowledgments
When we started writing this book, we had no idea how much of a team
effort it would end up being.
Then there were the reviewers: Greg Andress, Mark Cheers, Chris Clee-
land, Alistair Cockburn, Ward Cunningham, Martin Fowler, Thanh
T. Giang, Robert L. Glass, Scott Henninger, Michael Hunter, Brian
The second printing of this book benefited greatly from the eagle eyes
of our readers. Many thanks to Brian Blank, Paul Boal, Tom Ekberg,
Brent Fulgham, Louis Paul Hebert, Henk-Jan Olde Loohuis, Alan Lund,
Gareth McCaughan, Yoshiki Shibata, and Volker Wurst, both for find-
ing the mistakes and for having the grace to point them out gently.
This book was produced using LaTeX, pic, Perl, dvips, ghostview, ispell,
GNU make, CVS, Emacs, XEmacs, EGCS, GCC, Java, iContract, and
SmallEiffel, using the Bash and zsh shells under Linux. The stagger-
ing thing is that all of this tremendous software is freely available. We
owe a huge “thank you” to the thousands of Pragmatic Programmers
worldwide who have contributed these and other works to us all. We’d
particularly like to thank Reto Kramer for his help with iContract.
Last, but in no way least, we owe a huge debt to our families. Not only
have they put up with late night typing, huge telephone bills, and our
permanent air of distraction, but they’ve had the grace to read what
we’ve written, time after time. Thank you for letting us dream.
Andy Hunt
Dave Thomas
A Pragmatic Philosophy
What distinguishes Pragmatic Programmers? We feel it’s an attitude, a
style, a philosophy of approaching problems and their solutions. They
think beyond the immediate problem, always trying to place it in its
larger context, always trying to be aware of the bigger picture. After all,
without this larger context, how can you be pragmatic? How can you
make intelligent compromises and informed decisions?
Another key to their success is that they take responsibility for every-
thing they do, which we discuss in The Cat Ate My Source Code. Being
responsible, Pragmatic Programmers won’t sit idly by and watch their
projects fall apart through neglect. In Software Entropy, we tell you how
to keep your projects pristine.
Most people find change difficult to accept, sometimes for good reasons,
sometimes because of plain old inertia. In Stone Soup and Boiled Frogs,
we look at a strategy for instigating change and (in the interests of
balance) present the cautionary tale of an amphibian that ignored the
dangers of gradual change.
Take Responsibility
Responsibility is something you actively agree to. You make a commit-
ment to ensure that something is done right, but you don’t necessarily
have direct control over every aspect of it. In addition to doing your own
personal best, you must analyze the situation for risks that are beyond
your control. You have the right not to take on a responsibility for an
impossible situation, or one in which the risks are too great. You’ll have
to make the call based on your own ethics and judgment.
When you do accept the responsibility for an outcome, you should ex-
pect to be held accountable for it. When you make a mistake (as we all
do) or an error in judgment, admit it honestly and try to offer options.
If there was a risk that the vendor wouldn’t come through for you, then
you should have had a contingency plan. If the disk crashes—taking
all of your source code with it—and you don’t have a backup, it’s your
fault. Telling your boss “the cat ate my source code” just won’t cut it.
TIP 3
Before you approach anyone to tell them why something can’t be done,
is late, or is broken, stop and listen to yourself. Talk to the rubber
duck on your monitor, or the cat. Does your excuse sound reasonable,
or stupid? How’s it going to sound to your boss?
Run through the conversation in your mind. What is the other person
likely to say? Will they ask, “Have you tried this ” or “Didn’t you con-
sider that?” How will you respond? Before you go and tell them the bad
news, is there anything else you can try? Sometimes, you just know
what they are going to say, so save them the trouble.
Try to flush out the lame excuses before voicing them aloud. If you
must, tell your cat first. After all, if little Tiddles is going to take the
blame. . . .
Challenges
How do you react when someone—such as a bank teller, an auto mechanic,
or a clerk—comes to you with a lame excuse? What do you think of them
and their company as a result?
2 Software Entropy
While software development is immune from almost all physical laws,
entropy hits us hard. Entropy is a term from physics that refers to the
amount of “disorder” in a system. Unfortunately, the laws of thermo-
dynamics guarantee that the entropy in the universe tends toward a
maximum. When disorder increases in software, programmers call it
“software rot.”
There are many factors that can contribute to software rot. The most
important one seems to be the psychology, or culture, at work on a
project. Even if you are a team of one, your project’s psychology can
be a very delicate thing. Despite the best laid plans and the best peo-
ple, a project can still experience ruin and decay during its lifetime. Yet
there are other projects that, despite enormous difficulties and con-
stant setbacks, successfully fight nature’s tendency toward disorder
and manage to come out pretty well.
In inner cities, some buildings are beautiful and clean, while others
are rotting hulks. Why? Researchers in the field of crime and urban
decay discovered a fascinating trigger mechanism, one that very quickly
turns a clean, intact, inhabited building into a smashed and abandoned
derelict [WK82].
A broken window.
One broken window, left unrepaired for any substantial length of time,
instills in the inhabitants of the building a sense of abandonment—a
sense that the powers that be don’t care about the building. So another
window gets broken. People start littering. Graffiti appears. Serious
structural damage begins. In a relatively short space of time, the build-
ing becomes damaged beyond the owner’s desire to fix it, and the sense
of abandonment becomes reality.
TIP 4
You may be thinking that no one has the time to go around cleaning
up all the broken glass of a project. If you continue to think like that,
then you’d better plan on getting a dumpster, or moving to another
neighborhood. Don’t let entropy win.
department rushed in to save the day—and his house. But before they
dragged their big, dirty hoses into the house, they stopped—with the
fire raging—to roll out a mat between the front door and the source of
the fire.
A pretty extreme case, to be sure, but that’s the way it must be with
software. One broken window—a badly designed piece of code, a poor
management decision that the team must live with for the duration
of the project—is all it takes to start the decline. If you find yourself
working on a project with quite a few broken windows, it’s all too easy
to slip into the mindset of “All the rest of this code is crap, I’ll just follow
suit.” It doesn’t matter if the project has been fine up to this point.
In the original experiment leading to the “Broken Window Theory,” an
abandoned car sat for a week untouched. But once a single window was
broken, the car was stripped and turned upside down within hours.
By the same token, if you find yourself on a team and a project where
the code is pristinely beautiful—cleanly written, well designed, and
elegant—you will likely take extra special care not to mess it up, just
like the firefighters. Even if there’s a fire raging (deadline, release date,
trade show demo, etc.), you don’t want to be the first one to make a
mess.
Challenges
Help strengthen your team by surveying your computing “neighborhood.”
Choose two or three “broken windows” and discuss with your colleagues
what the problems are and what could be done to fix them.
Can you tell when a window first gets broken? What is your reaction? If
it was the result of someone else’s decision, or a management edict, what
can you do about it?
Undeterred, the soldiers boiled a pot of water and carefully placed three stones
into it. The amazed villagers came out to watch.
“This is stone soup,” the soldiers explained. “Is that all you put in it?” asked
the villagers. “Absolutely—although some say it tastes even better with a few
carrots.” A villager ran off, returning in no time with a basket of carrots from
his hoard.
A couple of minutes later, the villagers again asked “Is that it?”
“Well,” said the soldiers, “a couple of potatoes give it body.” Off ran another
villager.
Over the next hour, the soldiers listed more ingredients that would enhance the
soup: beef, leeks, salt, and herbs. Each time a different villager would run off to
raid their personal stores.
Eventually they had produced a large pot of steaming soup. The soldiers removed
the stones, and they sat down with the entire village to enjoy the first square
meal any of them had eaten in months.
There are a couple of morals in the stone soup story. The villagers are
tricked by the soldiers, who use the villagers’ curiosity to get food from
them. But more importantly, the soldiers act as a catalyst, bringing
the village together so they can jointly produce something that they
couldn’t have done by themselves—a synergistic result. Eventually ev-
eryone wins.
Every now and then, you might want to emulate the soldiers.
You may be in a situation where you know exactly what needs doing
and how to do it. The entire system just appears before your eyes—you
know it’s right. But ask permission to tackle the whole thing and you’ll
be met with delays and blank stares. People will form committees, bud-
gets will need approval, and things will get complicated. Everyone will
guard their own resources. Sometimes this is called “start-up fatigue.”
It’s time to bring out the stones. Work out what you can reasonably
ask for. Develop it well. Once you’ve got it, show people, and let them
marvel. Then say “of course, it would be better if we added .” Pretend
it’s not important. Sit back and wait for them to start asking you to
add the functionality you originally wanted. People find it easier to join
an ongoing success. Show them a glimpse of the future and you’ll get
them to rally around.1
TIP 5
We’ve all seen the symptoms. Projects slowly and inexorably get totally
out of hand. Most software disasters start out too small to notice, and
most project overruns happen a day at a time. Systems drift from their
specifications feature by feature, while patch after patch gets added to
a piece of code until there’s nothing of the original left. It’s often the
accumulation of small things that breaks morale and teams.
TIP 6
We’ve never tried this—honest. But they say that if you take a frog and
drop it into boiling water, it will jump straight back out again. However,
if you place the frog in a pan of cold water, then gradually heat it, the
frog won’t notice the slow increase in temperature and will stay put
until cooked.
1. While doing this, you may be comforted by the line attributed to Rear Admiral Dr.
Grace Hopper: “It’s easier to ask forgiveness than it is to get permission.”
Note that the frog’s problem is different from the broken windows issue
discussed in Section 2. In the Broken Window Theory, people lose the
will to fight entropy because they perceive that no one else cares. The
frog just doesn’t notice the change.
Don’t be like the frog. Keep an eye on the big picture. Constantly review
what’s happening around you, not just what you personally are doing.
Challenges
While reviewing a draft of this book, John Lakos raised the following is-
sue: The soldiers progressively deceive the villagers, but the change they
catalyze does them all good. However, by progressively deceiving the frog,
you’re doing it harm. Can you determine whether you’re making stone
soup or frog soup when you try to catalyze change? Is the decision subjec-
tive or objective?
4 Good-Enough Software
Striving to better, oft we mar what’s well.
King Lear 1.4
There’s an old(ish) joke about a U.S. company that places an order for
100,000 integrated circuits with a Japanese manufacturer. Part of the
specification was the defect rate: one chip in 10,000. A few weeks later
the order arrived: one large box containing thousands of ICs, and a
small one containing just ten. Attached to the small box was a label
that read: “These are the faulty ones.”
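The arithmetic behind the punch line is easy to check. The following is a minimal Python sketch, not anything from the original text: the names ORDER_SIZE, DEFECT_RATE, and expected_defects are hypothetical, and it assumes the specified rate divides the order evenly.

```python
from fractions import Fraction

# Figures taken from the joke: the order size and the specified defect rate.
ORDER_SIZE = 100_000
DEFECT_RATE = Fraction(1, 10_000)  # one chip in 10,000


def expected_defects(order_size: int, defect_rate: Fraction) -> int:
    """Return the number of faulty parts implied by the specification."""
    defects = order_size * defect_rate
    # This helper is only meant for rates that divide the order evenly.
    if defects.denominator != 1:
        raise ValueError("defect rate does not divide the order evenly")
    return int(defects)


faulty = expected_defects(ORDER_SIZE, DEFECT_RATE)
good = ORDER_SIZE - faulty
print(f"faulty: {faulty}, good: {good}")
```

With the figures from the joke, the specification implies exactly ten faulty chips in the order of 100,000, which is what the small box contained.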
If only we really had this kind of control over quality. But the real
world just won’t let us produce much that’s truly perfect, particularly
not bug-free software. Time, technology, and temperament all conspire
against us.
The scope and quality of the system you produce should be specified
as part of that system’s requirements.
TIP 7
But artists will tell you that all the hard work is ruined if you don’t
know when to stop. If you add layer upon layer, detail over detail, the
painting becomes lost in the paint.
Challenges
Look at the manufacturers of the software tools and operating systems that
you use. Can you find any evidence that these companies are comfortable
shipping software they know is not perfect? As a user, would you rather
(1) wait for them to get all the bugs out, (2) have complex software and
accept some bugs, or (3) opt for simpler software with fewer defects?
Ah, good old Ben Franklin—never at a loss for a pithy homily. Why, if we
could just be early to bed and early to rise, we’d be great programmers—
right? The early bird might get the worm, but what happens to the early
worm?
In this case, though, Ben really hit the nail on the head. Your knowledge
and experience are your most important professional assets.
3. An expiring asset is something whose value diminishes over time. Examples include
a warehouse full of bananas and a ticket to a ball game.
4. Investors try to buy low and sell high for maximum return.
Diversify. The more different things you know, the more valuable
you are. As a baseline, you need to know the ins and outs of the
particular technology you are working with currently. But don’t
stop there. The face of computing changes rapidly—hot technology
today may well be close to useless (or at least not in demand) to-
morrow. The more technologies you are comfortable with, the better
you will be able to adjust to change.
Of all these guidelines, the most important one is the simplest to do:
TIP 8
Goals
Now that you have some guidelines on what and when to add to your
knowledge portfolio, what’s the best way to go about acquiring intellec-
tual capital with which to fund your portfolio? Here are a few sugges-
tions.
Get wired. Want to know the ins and outs of a new language or
other technology? Newsgroups are a great way to find out what
experiences other people are having with it, the particular jargon
they use, and so on. Surf the Web for papers, commercial sites,
and any other sources of information you can find.
Don’t let it stop there. Take it as a personal challenge to find the answer.
Ask a guru. (If you don’t have a guru in your office, you should be able
to find one on the Internet: see the box on the facing page.) Search
the Web. Go to the library.4
If you can’t find the answer yourself, find out who can. Don’t let it rest.
Talking to other people will help build your personal network, and you
may surprise yourself by finding solutions to other, unrelated problems
along the way. And that old portfolio just keeps getting bigger. . . .
All of this reading and researching takes time, and time is already in
short supply. So you need to plan ahead. Always have something to
read in an otherwise dead moment. Time spent waiting for doctors and
dentists can be a great opportunity to catch up on your reading—but be
sure to bring your own magazine with you, or you might find yourself
thumbing through a dog-eared 1973 article about Papua New Guinea.
Critical Thinking
The last important point is to think critically about what you read
and hear. You need to ensure that the knowledge in your portfolio is
accurate and unswayed by either vendor or media hype. Beware of the
zealots who insist that their dogma provides the only answer—it may
or may not be applicable to you and your project.
TIP 9
Unfortunately, there are very few simple answers anymore. But with
your extensive portfolio, and by applying some critical analysis to the
4. In this era of the Web, many people seem to have forgotten about real live libraries
filled with research material and staff.
torrent of technical publications you will read, you can understand the
complex answers.
Challenges
Start learning a new language this week. Always programmed in C++? Try
Smalltalk [URL 13] or Squeak [URL 14]. Doing Java? Try Eiffel [URL 10]
or TOM [URL 15]. See page 267 for sources of other free compilers and
environments.
Start reading a new book (but finish this one first!). If you are doing very
detailed implementation and coding, read a book on design and architec-
ture. If you are doing high-level design, read a book on coding techniques.
Get out and talk technology with people who aren’t involved in your cur-
rent project, or who don’t work for the same company. Network in your
company cafeteria, or maybe seek out fellow enthusiasts at a local user’s
group meeting.
6 Communicate!
I believe that it is better to be looked over than it is to be overlooked.
Mae West, Belle of the Nineties, 1934
Maybe we can learn a lesson from Ms. West. It’s not just what you’ve
got, but also how you package it. Having the best ideas, the finest code,
or the most pragmatic thinking is ultimately sterile unless you can com-
municate with other people. A good idea is an orphan without effective
communication.
Plan what you want to say. Write an outline. Then ask yourself, “Does
this get across whatever I’m trying to say?” Refine it until it does.
Say you want to suggest a Web-based system to allow your end users
to submit bug reports. You can present this system in many differ-
ent ways, depending on your audience. End users will appreciate that
they can submit bug reports 24 hours a day without waiting on the
phone. Your marketing department will be able to use this fact to boost
sales. Managers in the support department will have two reasons to
be happy: fewer staff will be needed, and problem reporting will be
automated. Finally, developers may enjoy getting experience with Web-
based client-server technologies and a new database engine. By making
the appropriate pitch to each group, you’ll get them all excited about
your project.
5. The word annoy comes from the Old French enui, which also means “to bore.”
you’ll have a more receptive listener to your ideas on source code repos-
itories. Make what you’re saying relevant in time, as well as in content.
Sometimes all it takes is the simple question “Is this a good time to talk
about...?”
Choose a Style
Adjust the style of your delivery to suit your audience. Some people
want a formal “just the facts” briefing. Others like a long, wide-ranging
chat before getting down to business. When it comes to written docu-
ments, some like to receive large bound reports, while others expect a
simple memo or e-mail. If in doubt, ask.
them. (Your company may already have defined style sheets that you
can use.) Learn how to set page headers and footers. Look at the sam-
ple documents included with your package to get ideas on style and
layout. Check the spelling, first automatically and then by hand. After
awl, their are spelling miss steaks that the chequer can knot ketch.
Be a Listener
There’s one technique that you must use if you want people to listen
to you: listen to them. Even if this is a situation where you have all the
information, even if this is a formal meeting with you standing in front
of 20 suits—if you don’t listen to them, they won’t listen to you.
TIP 10
It’s Both What You Say and the Way You Say It
E-Mail Communication
Everything we’ve said about communicating in writing applies equally
to electronic mail. E-mail has evolved to the point where it is a main-
stay of intra- and intercorporate communications. E-mail is used to
discuss contracts, to settle disputes, and as evidence in court. But
for some reason, people who would never send out a shabby paper
document are happy to fling nasty-looking e-mail around the world.
Our e-mail tips are simple:
Proofread before you hit SEND.
Summary
Know what you want to say.
Know your audience.
Choose your moment.
Choose a style.
Make it look good.
Involve your audience.
Be a listener.
Get back to people.
Challenges
There are several good books that contain sections on communications
within development teams [Bro95, McC95, DL99]. Make it a point to try
to read all three over the next 18 months. In addition, the book Dinosaur
Brains [Ber96] discusses the emotional baggage we all bring to the work
environment.
The next time you have to give a presentation, or write a memo advocating
some position, try working through the WISDOM acrostic on page 20 before
you start. See if it helps you understand how to position what you say. If
appropriate, talk to your audience afterward and see how accurate your
assessment of their needs was.
A Pragmatic Approach
There are certain tips and tricks that apply at all levels of software
development, ideas that are almost axiomatic, and processes that are
virtually universal. However, these approaches are rarely documented
as such; you’ll mostly find them written down as odd sentences in dis-
cussions of design, project management, or coding.
In this chapter we’ll bring these ideas and processes together. The first
two sections, The Evils of Duplication and Orthogonality, are closely
related. The first warns you not to duplicate knowledge throughout
your systems, the second not to split any one piece of knowledge across
multiple system components.
The next two sections are also related. In Tracer Bullets, we talk about
a style of development that allows you to gather requirements, test
designs, and implement code at the same time. If this sounds too good
to be true, it is: tracer bullet developments are not always applicable.
When they’re not, Prototypes and Post-it Notes shows you how to use
prototyping to test architectures, algorithms, interfaces, and ideas.
Finally, we all work in a world of limited time and resources. You can
survive both of these scarcities better (and keep your bosses happier) if
you get good at working out how long things will take, which we cover
in Estimating.
We feel that the only way to develop software reliably, and to make our
developments easier to understand and maintain, is to follow what we
call the DRY principle:
TIP 11
DRY: Don’t Repeat Yourself
You’ll find the DRY principle popping up time and time again through-
out this book, often in contexts that have nothing to do with coding.
We feel that it is one of the most important tools in the Pragmatic Pro-
grammer’s tool box.
Imposed Duplication
Sometimes, duplication seems to be forced on us. Project standards
may require documents that contain duplicated information, or docu-
ments that duplicate information in the code. Multiple target platforms
each require their own programming languages, libraries, and devel-
opment environments, which makes us duplicate shared definitions
and procedures. Programming languages themselves require certain
structures that duplicate information. We have all worked in situations
where we felt powerless to avoid duplication. And yet often there are
ways of keeping each piece of knowledge in one place, honoring the
DRY principle, and making our lives easier at the same time. Here are
some techniques:
With a bit of ingenuity you can normally remove the need for dupli-
cation. Often the answer is to write a simple filter or code generator.
Structures in multiple languages can be built from a common metadata
representation using a simple code generator each time the software is
built (an example of this is shown in Figure 3.4, page 106). Class defini-
tions can be generated automatically from the online database schema,
or from the metadata used to build the schema in the first place. The
code extracts in this book are inserted by a preprocessor each time we
format the text. The trick is to make the process active: this cannot be
a one-time conversion, or we’re back in a position of duplicating data.
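As a minimal sketch of an active code generator, assume a small metadata
table of field names and abstract types that is the only place the record
layout is written down. The names Field, generateCStruct, and
generateSqlTable, and the two-entry type mapping, are hypothetical and are
not the generator behind Figure 3.4.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One piece of metadata: the single authoritative description of a field.
struct Field {
    std::string name;
    std::string type;  // abstract type, "int" or "text" (hypothetical set)
};

// Map an abstract type to a C type; anything else is a programming error.
inline std::string cType(const std::string& t) {
    assert(t == "int" || t == "text");
    return t == "int" ? "int" : "char*";
}

// Map an abstract type to a SQL column type.
inline std::string sqlType(const std::string& t) {
    assert(t == "int" || t == "text");
    return t == "int" ? "INTEGER" : "TEXT";
}

// Generate a C struct definition from the metadata.
inline std::string generateCStruct(const std::string& name,
                                   const std::vector<Field>& fields) {
    std::string out = "struct " + name + " {\n";
    for (const Field& f : fields)
        out += "    " + cType(f.type) + " " + f.name + ";\n";
    return out + "};\n";
}

// Generate a SQL table definition from the same metadata.
inline std::string generateSqlTable(const std::string& name,
                                    const std::vector<Field>& fields) {
    std::string out = "CREATE TABLE " + name + " (\n";
    for (std::size_t i = 0; i < fields.size(); ++i) {
        out += "    " + fields[i].name + " " + sqlType(fields[i].type);
        out += (i + 1 < fields.size()) ? ",\n" : "\n";
    }
    return out + ");\n";
}
```

Run as a build step, both outputs are regenerated whenever the metadata
changes, so neither one is ever edited by hand.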
The DRY principle tells us to keep the low-level knowledge in the code,
where it belongs, and reserve the comments for other, high-level expla-
nations. Otherwise, we’re duplicating knowledge, and every change
means changing both the code and the comments. The comments will
inevitably become out of date, and untrustworthy comments are worse
than no comments at all. (See It’s All Writing, page 248, for more infor-
mation on comments.)
Inadvertent Duplication
Sometimes, duplication comes about as the result of mistakes in the
design.
Let’s look at an example from the distribution industry. Say our anal-
ysis reveals that, among other attributes, a truck has a type, a license
number, and a driver. Similarly, a delivery route is a combination of a
route, a truck, and a driver. We code up some classes based on this
understanding.
But what happens when Sally calls in sick and we have to change
drivers? Both Truck and DeliveryRoute contain a driver. Which one
do we change? Clearly this duplication is bad. Normalize it according
to the underlying business model—does a truck really have a driver
as part of its underlying attribute set? Does a route? Or maybe there
needs to be a third object that knits together a driver, a truck, and a
route. Whatever the eventual solution, avoid this kind of unnormalized
data.
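One possible normalization, sketched under the assumption that the business
model favors the third-object option: neither the truck nor the route
stores a driver, and a separate object ties the three together. The class
DeliveryAssignment and its members are hypothetical names chosen for
illustration.

```cpp
#include <cassert>
#include <string>

struct Truck {
    std::string type;
    std::string licenseNumber;  // no driver stored here
};

struct Route {
    std::string name;           // no driver stored here either
};

// The only place that records who drives which truck on which route.
class DeliveryAssignment {
public:
    DeliveryAssignment(const Route& r, const Truck& t,
                       const std::string& driver)
        : route_(r), truck_(t), driver_(driver) {}

    const Route& route() const { return route_; }
    const Truck& truck() const { return truck_; }
    const std::string& driver() const { return driver_; }

    // A sick call changes exactly one field in exactly one object.
    void reassignDriver(const std::string& newDriver) {
        assert(!newDriver.empty());
        driver_ = newDriver;
    }

private:
    Route route_;
    Truck truck_;
    std::string driver_;
};
```

With this shape, the question "which one do we change?" has only one
possible answer.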
Consider a Line class that stores a start point, an end point, and a
length as three separate fields. At first sight, this class might appear
reasonable. A line clearly has a start and end, and will always have a
length (even if it’s zero). But we have duplication. The length is
defined by the start and end points: change one of the points and the
length changes. It’s better to make the length a calculated field:
class Line {
 public:
  Point start;
  Point end;
  // Derived on demand, so it can never disagree with the end points.
  double length() { return start.distanceTo(end); }
};
Later on in the development process, you may choose to violate the DRY
principle for performance reasons. Frequently this occurs when you
need to cache data to avoid repeating expensive operations. The trick is
to localize the impact. The violation is not exposed to the outside world:
only the methods within the class have to worry about keeping things
straight.
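class Line {
 private:
  bool changed;    // true when the cached length is out of date
  double length;   // cached value, valid only while changed is false
  Point start;
  Point end;
 public:
  // Start with the cache marked stale so the first getLength() computes it.
  Line() : changed(true), length(0.0) {}
  void setStart(Point p) { start = p; changed = true; }
  void setEnd(Point p) { end = p; changed = true; }
  Point getStart(void) { return start; }
  Point getEnd(void) { return end; }
  double getLength() {
    if (changed) {
      length = start.distanceTo(end);
      changed = false;
    }
    return length;
  }
};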
1. The use of accessor functions ties in with Meyer’s Uniform Access principle [Mey97b],
which states that “All services offered by a module should be available through a uni-
form notation, which does not betray whether they are implemented through storage or
through computation.”
Impatient Duplication
Every project has time pressures—forces that can drive the best of us
to take shortcuts. Need a routine similar to one you’ve written? You’ll
be tempted to copy the original and make a few changes. Need a value
to represent the maximum number of points? If I change the header
file, the whole project will get rebuilt. Maybe I should just use a literal
number here; and here; and here. Need a class like one in the Java
runtime? The source is available, so why not just copy it and make the
changes you need (license provisions notwithstanding)?
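The literal-number shortcut can be avoided cheaply. In this minimal sketch
the limit is defined once and every use refers to it by name; the constant
kMaxPoints, its value, and both helper functions are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical limit, defined in exactly one place.
constexpr std::size_t kMaxPoints = 64;

// True while there is room for at least one more point.
inline bool canAddPoint(const std::vector<double>& points) {
    return points.size() < kMaxPoints;
}

// How many more points fit before the limit is reached.
inline std::size_t remainingCapacity(const std::vector<double>& points) {
    assert(points.size() <= kMaxPoints);
    return kMaxPoints - points.size();
}
```

Changing the limit is then a one-line edit, and no scattered copy of the
number is left behind to disagree with it.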
Interdeveloper Duplication
On the other hand, perhaps the hardest type of duplication to detect
and handle occurs between different developers on a project. Entire
sets of functionality may be inadvertently duplicated, and that duplica-
tion could go undetected for years, leading to maintenance problems.
We heard firsthand of a U.S. state whose governmental computer sys-
tems were surveyed for Y2K compliance. The audit turned up more
than 10,000 programs, each containing its own version of Social Secu-
rity number validation.
At a high level, deal with the problem by having a clear design, a strong
technical project leader (see page 228 in Pragmatic Teams), and a well-
understood division of responsibilities within the design. However, at
the module level, the problem is more insidious. Commonly needed
functionality or data that doesn’t fall into an obvious area of responsi-
bility can get implemented many times over.
We feel that the best way to deal with this is to encourage active and
frequent communication between developers. Set up forums to discuss
common problems. (On past projects, we have set up private Usenet
TIP 12
8 Orthogonality
Orthogonality is a critical concept if you want to produce systems that
are easy to design, build, test, and extend. However, the concept of
orthogonality is rarely taught directly. Often it is an implicit feature of
various other methods and techniques you learn. This is a mistake.
Once you learn to apply the principle of orthogonality directly, you’ll
notice an immediate improvement in the quality of systems you pro-
duce.
What Is Orthogonality?
“Orthogonality” is a term borrowed from geometry. Two lines are
orthogonal if they meet at right angles.
A Nonorthogonal System
You’re on a helicopter tour of the Grand Canyon when the pilot, who
made the obvious mistake of eating fish for lunch, suddenly groans and
faints. Fortunately, he left you hovering 100 feet above the ground. You
rationalize that the collective pitch lever 2 controls overall lift, so
lowering it slightly will start a gentle descent to the ground. However,
when you try it, you discover that life isn’t that simple. The
helicopter’s nose drops, and you start to spiral down to the left.
Suddenly you discover that you’re flying a system where every control
input has secondary effects. Lower the left-hand lever and you need to
add compensating backward movement to the right-hand stick and push the
right pedal. But then each of these changes affects all of the other
controls again. Suddenly you’re juggling an unbelievably complex system,
where every change impacts all the other inputs. Your workload is
phenomenal: your hands and feet are constantly moving, trying to balance
all the interacting forces.
2. Helicopters have four basic controls. The cyclic is the stick you hold in your right
hand. Move it, and the helicopter moves in the corresponding direction. Your left hand
holds the collective pitch lever. Pull up on this and you increase the pitch on all the
blades, generating lift. At the end of the pitch lever is the throttle. Finally you have two
foot pedals, which vary the amount of tail rotor thrust and so help turn the helicopter.
Benefits of Orthogonality
As the helicopter example illustrates, nonorthogonal systems are in-
herently more complex to change and control. When components of
any system are highly interdependent, there is no such thing as a local
fix.
TIP 13
You get two major benefits if you write orthogonal systems: increased
productivity and reduced risk.
Gain Productivity
Changes are localized, so development time and testing time are
reduced. It is easier to write relatively small, self-contained compo-
nents than a single large block of code. Simple components can be
Reduce Risk
An orthogonal approach reduces the risks inherent in any development.
The resulting system is less fragile. Make small changes and fixes to
a particular area, and any problems you generate will be restricted
to that area.
Let’s look at some of the ways you can apply the principle of orthogo-
nality to your work.
Project Teams
Have you noticed how some project teams are efficient, with everyone
knowing what to do and contributing fully, while the members of other
teams are constantly bickering and don’t seem able to get out of each
other’s way?
Often this is an orthogonality issue. When teams are organized with lots
of overlap, members are confused about responsibilities. Every change
needs a meeting of the entire team, because any one of them might be
affected.
Design
Most developers are familiar with the need to design orthogonal sys-
tems, although they may use words such as modular, component-based,
and layered to describe the process. Systems should be composed of
a set of cooperating modules, each of which implements functionality
independent of the others. Sometimes these components are organized
into layers, each providing a level of abstraction. This layered approach
is a powerful way to design orthogonal systems. Because each layer
uses only the abstractions provided by the layers below it, you have
great flexibility in changing underlying implementations without affect-
ing code. Layering also reduces the risk of runaway dependencies be-
tween modules. You’ll often see layering expressed in diagrams such as
Figure 2.1 on the next page.
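A minimal sketch of the layering idea, assuming a two-layer system: the
upper layer sees only an abstract interface, so the implementation beneath
it can be replaced without touching upper-layer code. KeyValueStore,
InMemoryStore, and UserSettings are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>

// Lower layer: an abstraction over storage.
class KeyValueStore {
public:
    virtual ~KeyValueStore() = default;
    virtual void put(const std::string& key, const std::string& value) = 0;
    virtual std::string get(const std::string& key) const = 0;
};

// One implementation; a file- or database-backed one could replace it.
class InMemoryStore : public KeyValueStore {
public:
    void put(const std::string& key, const std::string& value) override {
        data_[key] = value;
    }
    std::string get(const std::string& key) const override {
        auto it = data_.find(key);
        return it == data_.end() ? std::string() : it->second;
    }
private:
    std::map<std::string, std::string> data_;
};

// Upper layer: depends only on the KeyValueStore abstraction below it.
class UserSettings {
public:
    explicit UserSettings(KeyValueStore& store) : store_(store) {}
    void setTheme(const std::string& theme) {
        assert(!theme.empty());
        store_.put("theme", theme);
    }
    std::string theme() const {
        const std::string t = store_.get("theme");
        return t.empty() ? "default" : t;
    }
private:
    KeyValueStore& store_;
};
```

UserSettings never names InMemoryStore, which is what keeps the
dependency pointing in one direction only.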
There is an easy test for orthogonal design. Once you have your com-
ponents mapped out, ask yourself: If I dramatically change the require-
[Figure 2.1: layer diagram; labels, top to bottom: User Interface,
Application framework, Standard C library, Operating system]
3. In reality, this is naive. Unless you are remarkably lucky, most real-world require-
ments changes will affect multiple functions in the system. However, if you analyze the
change in terms of functions, each functional change should still ideally affect just one
module.
Also ask yourself how decoupled your design is from changes in the
real world. Are you using a telephone number as a customer identifier?
What happens when the phone company reassigns area codes? Don’t
rely on the properties of things you can’t control.
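One way to act on the telephone-number warning, sketched here as an
assumption rather than a prescription, is to identify customers by a key
the system generates itself and to treat the phone number as an ordinary
attribute. CustomerId, Customer, and CustomerDirectory are hypothetical
names.

```cpp
#include <cassert>
#include <map>
#include <string>

// Surrogate key that the system itself controls.
using CustomerId = unsigned long;

struct Customer {
    std::string name;
    std::string phone;  // an ordinary attribute, free to change
};

class CustomerDirectory {
public:
    // Register a customer and hand back the generated identifier.
    CustomerId add(const std::string& name, const std::string& phone) {
        const CustomerId id = nextId_++;
        customers_[id] = Customer{name, phone};
        return id;
    }

    // A renumbering touches one attribute; the identifier is unaffected.
    void updatePhone(CustomerId id, const std::string& newPhone) {
        auto it = customers_.find(id);
        assert(it != customers_.end());
        it->second.phone = newPhone;
    }

    const Customer& get(CustomerId id) const { return customers_.at(id); }

private:
    CustomerId nextId_ = 1;
    std::map<CustomerId, Customer> customers_;
};
```

When area codes are reassigned, every reference held elsewhere in the
system still points at the same customer.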
When you bring in a toolkit (or even a library from other members of
your team), ask yourself whether it imposes changes on your code that
shouldn’t be there. If an object persistence scheme is transparent, then
it’s orthogonal. If it requires you to create or access objects in a special
way, then it’s not. Keeping such details isolated from your code has the
added benefit of making it easier to change vendors in the future.
aspect Trace {
  advise * Fred.*(..) {
    static before {
      Log.write("-> Entering " + thisJoinPoint.methodName);
    }
  }
}
If you weave this aspect into your code, trace messages will be gen-
erated. If you don’t, you’ll see no messages. Either way, your original
source is unchanged.
Coding
Every time you write code you run the risk of reducing the orthogonality
of your application. Unless you constantly monitor not just what you
are doing but also the larger context of the application, you might un-
intentionally duplicate functionality in some other module, or express
existing knowledge twice.
Avoid global data. Every time your code references global data,
it ties itself into the other components that share that data. Even
globals that you intend only to read can lead to trouble (for exam-
ple, if you suddenly need to change your code to be multithreaded).
In general, your code is easier to understand and maintain if you
explicitly pass any required context into your modules. In object-
oriented applications, context is often passed as parameters to
Get into the habit of being constantly critical of your code. Look for any
opportunities to reorganize it to improve its structure and orthogonal-
ity. This process is called refactoring, and it’s so important that we’ve
dedicated a section to it (see Refactoring, page 184).
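The earlier advice to avoid global data and pass context explicitly can be
sketched as follows. The function receives everything it needs as a
parameter and reads no shared state, so it behaves the same from any
thread. RequestContext and greeting are hypothetical names, and the two
supported languages are an assumption made for the example.

```cpp
#include <cassert>
#include <string>

// Context the module needs, bundled and passed in explicitly
// instead of being read from global variables.
struct RequestContext {
    std::string userName;
    std::string language;  // "en" or "fr" in this sketch
};

// Depends only on its argument; there is no hidden shared state.
inline std::string greeting(const RequestContext& ctx) {
    assert(!ctx.userName.empty());
    const std::string hello = (ctx.language == "fr") ? "Bonjour" : "Hello";
    return hello + ", " + ctx.userName + "!";
}
```

A test can build a RequestContext directly, with no global setup or
teardown to arrange first.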
Testing
An orthogonally designed and implemented system is easier to test.
Because the interactions between the system’s components are formal-
ized and limited, more of the system testing can be performed at the
individual module level. This is good news, because module level (or
unit) testing is considerably easier to specify and perform than integra-
tion testing. In fact, we suggest that every module have its own unit
test built into its code, and that these tests be performed automatically
as part of the regular build process (see Code That’s Easy to Test, page
189).
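A minimal sketch of a module that carries its own unit test, assuming the
build script is able to call the test function after compiling. Both trim
and trimSelfTest are hypothetical names.

```cpp
#include <cassert>
#include <string>

// The module: strip leading and trailing spaces from a string.
inline std::string trim(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// The module's own unit test, kept next to the code it checks.
// Each assert aborts the run if the expectation does not hold.
inline void trimSelfTest() {
    assert(trim("  abc  ") == "abc");
    assert(trim("abc") == "abc");
    assert(trim("    ").empty());
    assert(trim("").empty());
    assert(trim(" a b ") == "a b");
}
```

Because trim depends on nothing outside itself, the test needs no
fixtures, which is the payoff of an orthogonal design.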
Bug fixing is also a good time to assess the orthogonality of the system
as a whole. When you come across a problem, assess how localized
the fix is. Do you change just one module, or are the changes scat-
tered throughout the entire system? When you make a change, does it
fix everything, or do other problems mysteriously arise? This is a good
opportunity to bring automation to bear. If you use a source code con-
trol system (and you will after reading Source Code Control, page 86),
tag bug fixes when you check the code back in after testing. You can
then run monthly reports analyzing trends in the number of source
files affected by each bug fix.
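The monthly report could be as simple as the following sketch. It assumes
the tagged fixes have already been exported from the source code control
system into records holding a month and a file count; the BugFix layout
and the function name are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One tagged bug fix, as exported from source control.
struct BugFix {
    std::string month;  // for example "1999-10"
    int filesChanged;   // source files touched by this fix
};

// Average number of files touched per bug fix, grouped by month.
inline std::map<std::string, double>
averageFilesPerFix(const std::vector<BugFix>& fixes) {
    std::map<std::string, int> totalFiles;
    std::map<std::string, int> fixCount;
    for (const BugFix& f : fixes) {
        assert(f.filesChanged >= 0);
        totalFiles[f.month] += f.filesChanged;
        fixCount[f.month] += 1;
    }
    std::map<std::string, double> average;
    for (const auto& entry : totalFiles)
        average[entry.first] =
            static_cast<double>(entry.second) / fixCount[entry.first];
    return average;
}
```

A rising average suggests that fixes are becoming less localized, which
is a hint that orthogonality is eroding.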
Documentation
Perhaps surprisingly, orthogonality also applies to documentation. The
axes are content and presentation. With truly orthogonal documenta-
tion, you should be able to change the appearance dramatically without
changing the content. Modern word processors provide style sheets and
macros that help (see It’s All Writing, page 248).
Challenges
Consider the difference between large GUI-oriented tools typically found on
Windows systems and small but combinable command line utilities used
at shell prompts. Which set is more orthogonal, and why? Which is easier
to use for exactly the purpose for which it was intended? Which set is
easier to combine with other tools to meet new challenges?
Exercises
1. (Answer on p. 279) You are writing a class called Split, which splits
input lines into fields. Which of the following two Java class signatures
is the more orthogonal design?
class Split1 {
  public Split1(InputStreamReader rdr) { ...
  public void readNextLine() throws IOException { ...
  public int numFields() { ...
  public String getField(int fieldNo) { ...
}
class Split2 {
  public Split2(String line) { ...
  public int numFields() { ...
  public String getField(int fieldNo) { ...
}
2. (Answer on p. 279) Which will lead to a more orthogonal design:
modeless or modal dialog boxes?
3. (Answer on p. 280) How about procedural languages versus object
technology? Which results in a more orthogonal system?
9 Reversibility
Nothing is more dangerous than an idea if it’s the only one you have.
Emil-Auguste Chartier, Propos sur la religion, 1938
There is always more than one way to implement something, and there
is usually more than one vendor available to provide a third-party prod-
uct. If you go into a project hampered by the myopic notion that there is
only one way to do it, you may be in for an unpleasant surprise. Many
project teams have their eyes forcibly opened as the future unfolds:
“But you said we’d use database XYZ! We are 85% done coding the
project, we can’t change now!” the programmer protested. “Sorry, but
our company decided to standardize on database PDQ instead—for all
projects. It’s out of my hands. We’ll just have to recode. All of you will
be working weekends until further notice.”
By the time many critical decisions have been made, the target becomes
so small that if it moves, or the wind changes direction, or a butterfly in
Tokyo flaps its wings, you miss.4 And you may miss by a huge amount.
4. Take a nonlinear, or chaotic, system and apply a small change to one of its inputs.
You may get a large and often unpredictable result. The clichéd butterfly flapping its
wings in Tokyo could be the start of a chain of events that ends up generating a tornado
in Texas. Does this sound like any projects you know?
Reversibility
Many of the topics in this book are geared to producing flexible, adapt-
able software. By sticking to their recommendations—especially the
DRY principle (page 26), decoupling (page 138), and use of metadata
(page 144)—we don’t have to make as many critical, irreversible de-
cisions. This is a good thing, because we don’t always make the best
decisions the first time around. We commit to a certain technology only
to discover we can’t hire enough people with the necessary skills. We
lock in a certain third-party vendor just before they get bought out
by their competitor. Requirements, users, and hardware change faster
than we can get the software developed.
TIP 14
Flexible Architecture
While many people try to keep their code flexible, you also need to think
about maintaining flexibility in the areas of architecture, deployment,
and vendor integration.
Are you developing for Unix? Which one? Do you have all of the porta-
bility concerns addressed? Are you developing for a particular version
of Windows? Which one—3.1, 95, 98, NT, CE, or 2000? How hard will
it be to support other versions? If you keep decisions soft and pliable,
it won’t be hard at all. If you have poor encapsulation, high coupling,
and hard-coded logic or parameters in the code, it might be impossible.
Not sure how marketing wants to deploy the system? Think about it up
front and you can support a stand-alone, client-server, or n-tier model
just by changing a configuration file. We’ve written programs that do
just that.
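A minimal sketch of choosing the deployment model from configuration
rather than from code. The key=value file format, the key name
deployment, and its three values are assumptions made for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

enum class Deployment { StandAlone, ClientServer, NTier };

// Scan "key=value" lines for the deployment setting. A missing key or an
// unrecognized value falls back to StandAlone.
inline Deployment deploymentFromConfig(const std::string& configText) {
    const std::string key = "deployment=";
    std::istringstream in(configText);
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, key.size(), key) != 0) continue;
        const std::string value = line.substr(key.size());
        if (value == "client-server") return Deployment::ClientServer;
        if (value == "n-tier") return Deployment::NTier;
        return Deployment::StandAlone;
    }
    return Deployment::StandAlone;
}

// The rest of the program asks how to reach its services, not where
// they happen to live.
inline std::string describe(Deployment d) {
    switch (d) {
        case Deployment::ClientServer: return "remote calls to one server";
        case Deployment::NTier: return "remote calls through a middle tier";
        case Deployment::StandAlone: break;
    }
    return "direct in-process calls";
}
```

Only the code behind describe (or its real equivalent) varies between
models; callers stay the same.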
No one knows what the future may hold, especially not us! So en-
able your code to rock-n-roll: to “rock on” when it can, to roll with
the punches when it must.
Challenges
Time for a little quantum mechanics with Schrödinger’s cat. Suppose you
have a cat in a closed box, along with a radioactive particle. The particle
has exactly a 50% chance of fissioning into two. If it does, the cat will
be killed. If it doesn’t, the cat will be okay. So, is the cat dead or alive?
According to Schrödinger, the correct answer is both. Every time a sub-
nuclear reaction takes place that has two possible outcomes, the universe
is cloned. In one, the event occurred, in the other it didn’t. The cat’s alive
in one universe, dead in another. Only when you open the box do you know
which universe you are in.
But think of code evolution along the same lines as a box full of Schrö-
dinger’s cats: every decision results in a different version of the future.
How many possible futures can your code support? Which ones are more
likely? How hard will it be to support them when the time comes?
10 Tracer Bullets
Ready, fire, aim
There are two ways to fire a machine gun in the dark.5 You can find
out exactly where your target is (range, elevation, and azimuth). You
can determine the environmental conditions (temperature, humidity,
air pressure, wind, and so on). You can determine the precise speci-
fications of the cartridges and bullets you are using, and their inter-
actions with the actual gun you are firing. You can then use tables or
a firing computer to calculate the exact bearing and elevation of the
barrel. If everything works exactly as specified, your tables are correct,
and the environment doesn’t change, your bullets should land close to
their target.
Tracer bullets are loaded at intervals on the ammo belt alongside reg-
ular ammunition. When they’re fired, their phosphorus ignites and
leaves a pyrotechnic trail from the gun to whatever they hit. If the trac-
ers are hitting the target, then so are the regular bullets.
5. To be pedantic, there are many ways of firing a machine gun in the dark, including
closing your eyes and spraying out bullets. But this is an analogy, and we’re allowed to
take liberties.
constraining the environment. Fire the gun using dead reckoning. One
big calculation up front, then shoot and hope.
To get the same effect in code, we’re looking for something that gets us
from a requirement to some aspect of the final system quickly, visibly,
and repeatably.
TIP 15
Tracer code is not disposable: you write it for keeps. It contains all the
error checking, structuring, documentation, and self-checking that any
piece of production code has. It simply is not fully functional. However,
once you have achieved an end-to-end connection among the compo-
nents of your system, you can check how close to the target you are,
adjusting if necessary. Once you’re on target, adding functionality is
easy.
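As a rough sketch of what tracer code can look like, assume a tiny order
system with an input layer, a business layer, and an output layer. Each
layer below is thin but real, checks its errors, and is wired end to end.
The Order type, the function names, and the flat price are hypothetical.

```cpp
#include <cassert>
#include <string>

struct Order {
    std::string item;
    int quantity;
};

// Input layer: accepts only the simplest form, "item,quantity".
inline bool parseOrder(const std::string& text, Order& out) {
    const std::string::size_type comma = text.find(',');
    if (comma == std::string::npos || comma == 0) return false;
    const std::string qty = text.substr(comma + 1);
    if (qty.empty() || qty.size() > 6) return false;
    if (qty.find_first_not_of("0123456789") != std::string::npos)
        return false;
    out.item = text.substr(0, comma);
    out.quantity = std::stoi(qty);
    return true;
}

// Business layer: a flat price for now; real pricing rules come later.
inline long priceInCents(const Order& order) {
    assert(order.quantity >= 0);
    const long flatUnitPrice = 250;
    return flatUnitPrice * order.quantity;
}

// Output layer: one plain-text line; richer formats come later.
inline std::string handleRequest(const std::string& text) {
    Order order;
    if (!parseOrder(text, order)) return "error: expected item,quantity";
    return order.item + " x" + std::to_string(order.quantity) + " = " +
           std::to_string(priceInCents(order)) + " cents";
}
```

Nothing here is thrown away. New behavior is added inside each layer
while the end-to-end path keeps working.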
It’s the same with tracer code. You use the technique in situations
where you’re not 100% certain of where you’re going. You shouldn’t
be surprised if your first couple of attempts miss: the user says “that’s
not what I meant,” or data you need isn’t available when you need it,
or performance problems seem likely. Work out how to change what
you’ve got to bring it nearer the target, and be thankful that you’ve
used a lean development methodology. A small body of code has low
inertia—it is easy and quick to change. You’ll be able to gather feed-
back on your application and generate a new, more accurate version
faster and at less cost than with any other method. And because every
major application component is represented in your tracer code, your
users can be confident that what they’re seeing is based on reality, not
just a paper specification.
You could prototype a user interface for your end users in a GUI tool.
You code only enough to make the interface responsive to user actions.
Once they’ve agreed to the layout, you might throw it away and recode
it, this time with the business logic behind it, using the target language.
Similarly, you might want to prototype a number of algorithms that
perform the actual packing. You might code functional tests in a high-
level, forgiving language such as Perl, and code low-level performance
tests in something closer to the machine. In any case, once you’d made
your decision, you’d start again and code the algorithms in their final
environment, interfacing to the real world. This is prototyping, and it is
very useful.
We build software prototypes in the same fashion, and for the same
reasons—to analyze and expose risk, and to offer chances for correction
at a greatly reduced cost. Like the car makers, we can target a prototype
to test one or more specific aspects of a project.
Things to Prototype
What sorts of things might you choose to investigate with a prototype?
Anything that carries risk. Anything that hasn’t been tried before, or
that is absolutely critical to the final system. Anything unproven, exper-
imental, or doubtful. Anything you aren’t comfortable with. You can
prototype
Architecture
New functionality in an existing system
Structure or contents of external data
Third-party tools or components
Performance issues
User interface design
Prototyping is a learning experience. Its value lies not in the code pro-
duced, but in the lessons learned. That’s really the point of prototyping.
TIP 16
Prototype to Learn
Prototyping Architecture
Many prototypes are constructed to model the entire system under con-
sideration. As opposed to tracer bullets, none of the individual modules
in the prototype system need to be particularly functional. In fact, you
may not even need to code in order to prototype architecture—you can
prototype on a whiteboard, with Post-it notes or index cards. What you
are looking for is how the system hangs together as a whole, again de-
ferring details. Here are some specific areas you may want to look for
in the architectural prototype:
Is coupling minimized?
Does every module have an access path to the data it needs during
execution? Does it have that access when it needs it?
This last item tends to generate the most surprises and the most valu-
able results from the prototyping experience.
6. If you are investigating absolute (instead of relative) performance, you will need to
stick to a language that is close in performance to the target language.
When used properly, a prototype can save you huge amounts of time,
money, pain, and suffering by identifying and correcting potential prob-
lem spots early in the development cycle—the time when fixing mis-
takes is both cheap and easy.
Exercises
4. (Answer on p. 280) Marketing would like to sit down and brainstorm a
few Web-page designs with you. They are thinking of clickable image maps
to take you to other pages, and so on. But they can’t decide on a model
for the image—maybe it’s a car, or a phone, or a house. You have a list
of target pages and content; they’d like to see a few prototypes. Oh, by
the way, you have 15 minutes. What tools might you use?
12 Domain Languages
The limits of language are the limits of one’s world.
Ludwig Wittgenstein
After you’ve written the application, the users give you a new require-
ment: transactions with negative balances shouldn’t be stored, and
should be sent back on the X.25 lines in the original format:
That was easy, wasn’t it? With the proper support in place, you can pro-
gram much closer to the application domain. We’re not suggesting that
your end users actually program in these languages. Instead, you’re
giving yourself a tool that lets you work closer to their domain.
TIP 17
Remember that there are many users of an application. There’s the end
user, who understands the business rules and the required outputs.
There are also secondary users: operations staff, configuration and test
managers, support and maintenance programmers, and future genera-
tions of developers. Each of these users has their own problem domain,
and you can generate mini-environments and languages for all of them.
Domain-Specific Errors
If you are writing in the problem domain, you can also perform
domain-specific validation, reporting problems in terms your users
can understand. Take our switching application on the facing page.
Suppose the user misspelled the format name:
From X25LINE1 (Format=AB123)
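A minimal sketch of domain-specific validation for a line like this one.
It assumes the application keeps a list of the format names it knows; the
function checkFormat, the message wording, and the sample format names in
the usage note are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Check a format name against the formats the application knows about and
// describe any problem in the user's terms. Returns an empty string when
// the name is valid.
inline std::string checkFormat(const std::string& name,
                               const std::vector<std::string>& known) {
    assert(!known.empty());
    for (const std::string& k : known)
        if (k == name) return "";
    std::string msg = "Format \"" + name + "\" not found. Expected one of: ";
    for (std::size_t i = 0; i < known.size(); ++i) {
        if (i > 0) msg += ", ";
        msg += known[i];
    }
    return msg + ".";
}
```

Given known formats ABC123 and XYZ43B, the misspelled AB123 yields a
message that names the bad format and lists the valid ones, which is far
more useful to a user than a generic parse error.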
Implementing a Mini-Language
At its simplest, a mini-language may be in a line-oriented, easily parsed
format. In practice, we probably use this form more than any other.
It can be parsed simply using switch statements, or using regular
expressions in scripting languages such as Perl. The answer to Exercise
5 on page 281 shows a simple implementation in C.
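To illustrate the switch-statement approach without giving away the
exercise, here is a sketch of an interpreter for a made-up language that
drives a counter. The command letters, the function name, and the error
wording are all hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Interpret a line-oriented language that drives a counter:
//   A n   add n
//   S n   subtract n
//   R     reset to zero
// Empty lines are skipped. On the first bad line, error is set to a
// message naming that line and false is returned.
inline bool runCounterScript(const std::string& script, long& result,
                             std::string& error) {
    std::istringstream in(script);
    std::string line;
    long total = 0;
    for (int lineNo = 1; std::getline(in, line); ++lineNo) {
        if (line.empty()) continue;
        std::istringstream fields(line);
        char command = 0;
        long arg = 0;
        fields >> command;
        switch (command) {
            case 'A':
            case 'S':
                if (!(fields >> arg)) {
                    error = "line " + std::to_string(lineNo) + ": " +
                            command + " needs a number";
                    return false;
                }
                total += (command == 'A') ? arg : -arg;
                break;
            case 'R':
                total = 0;
                break;
            default:
                error = "line " + std::to_string(lineNo) +
                        ": unknown command";
                return false;
        }
    }
    result = total;
    return true;
}
```

Adding a command means adding one case label, which is why this
line-oriented form is so cheap to grow.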
You can also implement a more complex language, with a more formal
syntax. The trick here is to define the syntax first using a notation such
as BNF.7 Once you have your grammar specified, it is normally trivial to
convert it into the input syntax for a parser generator. C and C++
programmers have been using yacc (or its freely available implementation,
bison [URL 27]) for years. These programs are documented in detail in
the book Lex and Yacc [LMB92]. Java programmers can try javaCC,
which can be found at [URL 26]. The answer to Exercise 7 on page 282
shows a parser written using bison. As it shows, once you know the
syntax, it’s really not a lot of work to write simple mini-languages.
7. BNF, or Backus-Naur Form, lets you specify context-free grammars recursively. Any
good book on compiler construction or parsing will cover BNF in (exhaustive) detail.
For example, the sendmail program is used throughout the world for
routing e-mail over the Internet. It has many excellent features and
benefits, which are controlled by a thousand-line configuration file,
written using sendmail’s own configuration language:
Mlocal, P=/usr/bin/procmail,
F=lsDFMAw5:/|@qSPfhn9,
S=10/30, R=20/40,
T=DNS/RFC822/X-Unix,
A=procmail -Y -a $h -d $u
For years, Microsoft has been using a data language that can describe
menus, widgets, dialog boxes, and other Windows resources. Figure 2.2
on the next page shows an excerpt from a typical resource file. This is
far easier to read than the sendmail example, but it is used in exactly
the same way—it is compiled to generate a data structure.
MAIN_MENU MENU
{
  POPUP "&File"
  {
    MENUITEM "&New", CM_FILENEW
    MENUITEM "&Open...", CM_FILEOPEN
    MENUITEM "&Save", CM_FILESAVE
  }
}
MY_DIALOG_BOX DIALOG 6, 15, 292, 287
STYLE DS_MODALFRAME | WS_POPUP | WS_VISIBLE |
      WS_CAPTION | WS_SYSMENU
CAPTION "My Dialog Box"
FONT 8, "MS Sans Serif"
{
  DEFPUSHBUTTON "OK", ID_OK, 232, 16, 50, 14
  PUSHBUTTON "Help", ID_HELP, 232, 52, 50, 14
  CONTROL "Edit Text Control", ID_EDIT1,
          "EDIT", WS_BORDER | WS_TABSTOP, 16, 16, 80, 56
  CHECKBOX "Checkbox", ID_CHECKBOX1, 153, 65, 42, 38,
           BS_AUTOCHECKBOX | WS_TABSTOP
}
You can also use your own imperative languages to ease program main-
tenance. For example, you may be asked to integrate information from
a legacy application into your new GUI development. A common way
of achieving this is by screen scraping; your application connects to
the mainframe application as if it were a regular human user, issuing
keystrokes and “reading” the responses it gets back. You could script
the interaction using a mini-language.9
9. In fact, you can buy tools that support just this kind of scripting. You can also inves-
tigate open-source packages such as Expect, which provide similar capabilities [URL 24].
The trade-off is extendibility and maintenance. While the code for pars-
ing a “real” language may be harder to write, it will be much easier for
people to understand, and to extend in the future with new features
and functionality. Languages that are too simple may be easy to parse,
but can be cryptic—much like the sendmail example on page 60.
Challenges
Could some of the requirements of your current project be expressed in
a domain-specific language? Would it be possible to write a compiler or
translator that could generate most of the code required?
Exercises
5. (Answer on p. 281) We want to implement a mini-language to control a
simple drawing package (perhaps a turtle-graphics system). The language
consists of single-letter commands. Some commands are followed by a single
number. For example, the following input would draw a rectangle.
P 2 # select pen 2
D # pen down
W 2 # draw west 2cm
N 1 # then north 1
E 2 # then east 2
S 1 # then back south
U # pen up
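One way to approach a language this small is a table-driven parser. The Python sketch below is only an illustration under our own assumptions, not the answer from the back of the book: the COMMANDS table, the parse function, and the decision to return (letter, argument) pairs are hypothetical names and choices made for this example.

```python
# A table-driven parser for the single-letter drawing language.
# COMMANDS and parse() are illustrative names, not part of any real package.

# command letter -> (description, takes a numeric argument?)
COMMANDS = {
    "P": ("select pen", True),
    "D": ("pen down", False),
    "U": ("pen up", False),
    "N": ("draw north", True),
    "E": ("draw east", True),
    "S": ("draw south", True),
    "W": ("draw west", True),
}


def parse(source):
    """Return a list of (letter, argument) pairs; argument is None if unused."""
    program = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        text = line.split("#", 1)[0].strip()  # strip the trailing comment
        if not text:
            continue  # blank or comment-only line
        letter, *rest = text.split()
        if letter not in COMMANDS:
            raise ValueError(f"line {lineno}: unknown command {letter!r}")
        takes_argument = COMMANDS[letter][1]
        if takes_argument:
            if len(rest) != 1:
                raise ValueError(f"line {lineno}: {letter} needs one number")
            program.append((letter, int(rest[0])))
        else:
            if rest:
                raise ValueError(f"line {lineno}: {letter} takes no argument")
            program.append((letter, None))
    return program


RECTANGLE = """\
P 2 # select pen 2
D   # pen down
W 2 # draw west 2cm
N 1 # then north 1
E 2 # then east 2
S 1 # then back south
U   # pen up
"""

for letter, argument in parse(RECTANGLE):
    print(COMMANDS[letter][0], "" if argument is None else argument)
```

Keeping the command set in a table means that adding a new command is a one-line change, and the parsing loop itself never needs to be touched.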
6. (Answer on p. 282) Design a BNF grammar to parse a time specification.
All of the following examples should be accepted.
7. (Answer on p. 282) Implement a parser for the BNF grammar in Exercise 6
using yacc, bison, or a similar parser-generator.
8. (Answer on p. 283) Implement the time parser using Perl. [Hint: Regular
expressions make good parsers.]
13 Estimating
Quick! How long will it take to send War and Peace over a 56k modem
line? How much disk space will you need for a million names and
addresses? How long does a 1,000-byte block take to pass through
a router? How many months will it take to deliver your project?
At one level, these are all meaningless questions—they are all missing
information. And yet they can all be answered, as long as you are com-
fortable estimating. And, in the process of producing an estimate, you’ll
come to understand more about the world your programs inhabit.
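To see how quickly one of these questions yields once you supply the missing information, here is a rough Python sketch of the modem question. Both inputs are assumptions on our part: we guess that the novel runs to roughly 3.2 MB of plain text, and we treat the line as delivering its nominal 56,000 bits per second with no protocol overhead.

```python
# Back-of-the-envelope estimate: sending a long novel over a 56k modem.
BOOK_BYTES = 3_200_000        # ASSUMPTION: about 3.2 MB of plain text
LINE_BITS_PER_SEC = 56_000    # ASSUMPTION: nominal rate, no overhead


def transfer_seconds(size_bytes, bits_per_second):
    """Time to push size_bytes through a line of the given bit rate."""
    return size_bytes * 8 / bits_per_second


seconds = transfer_seconds(BOOK_BYTES, LINE_BITS_PER_SEC)
print(f"roughly {seconds / 60:.0f} minutes")
```

The answer comes out in minutes rather than seconds or hours, and that order of magnitude is usually all the questioner needs. Change either assumption and the model tells you immediately how the answer moves.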
TIP 18
As a bonus, at the end of this section we’ll reveal the single correct
answer to give whenever anyone asks you for an estimate.
If your grandmother asks when you will arrive, she’s probably won-
dering whether to make you lunch or dinner. On the other hand, a
diver trapped underwater and running out of air is probably inter-
ested in an answer down to the second.
One of the interesting things about estimating is that the units you
use make a difference in the interpretation of the result. If you say
that something will take about 130 working days, then people will be
expecting it to come in pretty close. However, if you say “Oh, about
six months,” then they know to look for it any time between five and
seven months from now. Both numbers represent the same duration,
but “130 days” probably implies a higher degree of accuracy than you
feel. We recommend that you scale time estimates as follows:
So, if after doing all the necessary work, you decide that a project will
take 125 working days (25 weeks), you might want to deliver an esti-
mate of “about six months.”
The same concepts apply to estimates of any quantity: choose the units
of your answer to reflect the accuracy you intend to convey.
10. “3” is also apparently good enough if you are a legislator. In 1897, Indiana State
Legislature House Bill No. 246 attempted to decree that henceforth π should have the
value of “3”. The Bill was tabled indefinitely at its second reading when a mathematics
professor pointed out that their powers did not quite extend to passing laws of nature.
See how their problem got solved. It’s unlikely you’ll ever find an exact
match, but you’d be surprised how many times you can successfully
draw on others’ experiences.
Model building can be both creative and useful in the long term. Often,
the process of building the model leads to discoveries of underlying
patterns and processes that weren’t apparent on the surface. You may
even want to reexamine the original question: “You asked for an esti-
mate to do X. However, it looks like Y, a variant of X, could be done in
about half the time, and you lose only one feature.”
that is added into the result. Some components may supply multiply-
ing factors, while others may be more complicated (such as those that
simulate the arrival of traffic at a node).
You’ll find that each component will typically have parameters that af-
fect how it contributes to the overall model. At this stage, simply identify
each parameter.
During the calculation phase, you may start getting answers that seem
strange. Don’t be too quick to dismiss them. If your arithmetic is cor-
rect, your understanding of the problem or your model is probably
wrong. This is valuable information.
When an estimate turns out wrong, don’t just shrug and walk away.
Find out why it differed from your guess. Maybe you chose some param-
eters that didn’t match the reality of the problem. Maybe your model
was wrong. Whatever the reason, take some time to uncover what hap-
pened. If you do, your next estimate will be better.
Check requirements
Analyze risk
Design, implement, integrate
Validate with the users
Initially, you may have only a vague idea of how many iterations will be
required, or how long they may be. Some methods require you to nail
this down as part of the initial plan, but for all but the most trivial of
projects this is a mistake. Unless you are doing an application similar
to a previous one, with the same team and the same technology, you’d
just be guessing.
So you complete the coding and testing of the initial functionality and
mark this as the end of the first increment. Based on that experience,
you can refine your initial guess on the number of iterations and what
can be included in each. The refinement gets better and better each
time, and confidence in the schedule grows along with it.
TIP 19
This may not be popular with management, who typically want a sin-
gle, hard-and-fast number before the project even starts. You’ll have to
help them understand that the team, their productivity, and the envi-
ronment will determine the schedule. By formalizing this, and refining
the schedule as part of each iteration, you’ll be giving them the most
accurate scheduling estimates you can.
You almost always get better results if you slow the process down and
spend some time going through the steps we describe in this section.
Estimates given at the coffee machine will (like the coffee) come back to
haunt you.
Challenges
Start keeping a log of your estimates. For each, track how accurate you
turned out to be. If your error was greater than 50%, try to find out where
your estimate went wrong.
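A log like this needs very little machinery. The following Python sketch shows one possible shape for it; the Estimate record, its field names, and the sample entries are all invented for illustration, and we assume the error is measured relative to the original estimate.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """One line of a personal estimate log (hypothetical structure)."""
    task: str
    estimated_days: float
    actual_days: float

    @property
    def relative_error(self):
        # ASSUMPTION: error is measured against the estimate, not the actual.
        return abs(self.actual_days - self.estimated_days) / self.estimated_days


def needs_review(log, threshold=0.5):
    """Return the entries whose error exceeds the threshold (50% by default)."""
    return [entry for entry in log if entry.relative_error > threshold]


# Made-up sample data.
log = [
    Estimate("import tool", estimated_days=10, actual_days=12),
    Estimate("report screen", estimated_days=5, actual_days=9),
]

for entry in needs_review(log):
    print(f"{entry.task}: off by {entry.relative_error:.0%}")
```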
Exercises
9. (Answer on p. 283) You are asked “Which has a higher bandwidth: a 1Mbps
communications line or a person walking between two computers with a full
4GB tape in their pocket?” What constraints will you put on your answer to
ensure that the scope of your response is correct? (For example, you might
say that the time taken to access the tape is ignored.)
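Once the constraints are fixed, the arithmetic is short. The Python sketch below is one way to set it up, under assumptions we have chosen ourselves: 4GB is taken to mean 4 × 10^9 bytes, the line runs at exactly 10^6 bits per second, and reading and writing the tape cost nothing.

```python
# Comparing a 1Mbps line with a tape carried by hand.
TAPE_BYTES = 4 * 10**9          # ASSUMPTION: 4GB = 4 x 10^9 bytes
LINE_BITS_PER_SEC = 10**6       # ASSUMPTION: a full, uncontended 1Mbps


def line_seconds(payload_bytes, line_bps):
    """Time the line needs to move the same payload as the tape."""
    return payload_bytes * 8 / line_bps


def walker_bits_per_sec(payload_bytes, walk_seconds):
    """Effective bandwidth of carrying the payload for walk_seconds."""
    return payload_bytes * 8 / walk_seconds


break_even = line_seconds(TAPE_BYTES, LINE_BITS_PER_SEC)
print(f"the line needs about {break_even / 3600:.1f} hours")

# A hypothetical ten-minute walk between the two machines:
print(f"walker: {walker_bits_per_sec(TAPE_BYTES, 600) / 10**6:.0f} Mbps")
```

Under these assumptions the person wins whenever the walk takes less than the break-even time, which is why the constraints matter more than the arithmetic.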
Then begins a process of learning and adaptation. Each tool will have
its own personality and quirks, and will need its own special handling.
Each must be sharpened in a unique way, or held just so. Over time,
each will wear according to use, until the grip looks like a mold of the
woodworker’s hands and the cutting surface aligns perfectly with the
angle at which the tool is held. At this point, the tools become conduits
from the craftsman’s brain to the finished product—they have become
extensions of his or her hands. Over time, the woodworker will add new
tools, such as biscuit cutters, laser-guided miter saws, dovetail jigs—
all wonderful pieces of technology. But you can bet that he or she will
be happiest with one of those original tools in hand, feeling the plane
sing as it slides through the wood.
Tools amplify your talent. The better your tools, and the better you
know how to use them, the more productive you can be. Start with a
basic set of generally applicable tools. As you gain experience, and as
you come across special requirements, you’ll add to this basic set. Like
the craftsman, expect to add to your toolbox regularly. Always be on the
lookout for better ways of doing things. If you come across a situation
where you feel your current tools can’t cut it, make a note to look for
something different or more powerful that would have helped. Let need
drive your acquisitions.
In this chapter we’ll talk about investing in your own basic toolbox.
As with any good discussion on tools, we’ll start (in The Power of Plain
Text) by looking at your raw materials, the stuff you’ll be shaping. From
there we’ll move to the workbench, or in our case the computer. How
can you use your computer to get the most out of the tools you use?
We’ll discuss this in Shell Games. Now that we have material and a
bench to work on, we’ll turn to the tool you’ll probably use more than
any other, your editor. In Power Editing, we’ll suggest ways of making
you more efficient.
To ensure that we never lose any of our precious work, we should al-
ways use a Source Code Control system—even for things such as our
personal address book! And, since Mr. Murphy was really an optimist
after all, you can’t be a great programmer until you become highly
skilled at Debugging.
You’ll need some glue to bind much of the magic together. We discuss
some possibilities, such as awk, Perl, and Python, in Text Manipulation.
Spend time learning to use these tools, and at some point you’ll be sur-
prised to discover your fingers moving over the keyboard, manipulating
text without conscious thought. The tools will have become extensions
of your hands.
The reader has no idea what the significance of 467abe may be. A better
choice would be to make it understandable to humans.
DrawingType=UMLActivityDrawing
Plain text doesn’t mean that the text is unstructured; XML, SGML, and
HTML are great examples of plain text that has a well-defined structure.
You can do everything with plain text that you could do with some
binary format, including versioning.
The problem with most binary formats is that the context necessary to
understand the data is separate from the data itself. You are artificially
divorcing the data from its meaning. The data may as well be encrypted;
it is absolutely meaningless without the application logic to parse it.
With plain text, however, you can achieve a self-describing data stream
that is independent of the application that created it.
TIP 20
Drawbacks
There are two major drawbacks to using plain text: (1) It may take
more space to store than a compressed binary format, and (2) it may
be computationally more expensive to interpret and process a plain text
file.
1. MD5 is often used for this purpose. For an excellent introduction to the wonderful
world of cryptography, see [Sch95].
As long as the data survives, you will have a chance to be able to use
it—potentially long after the original application that wrote it is defunct.
You can parse such a file with only partial knowledge of its format; with
most binary files, you must know all the details of the entire format in
order to parse it successfully.
Consider a data file from some legacy system that you are given. You
know little about the original application; all that’s important to you is
that it maintained a list of clients’ Social Security numbers, which you
need to find and extract. Among the data, you see
<FIELD10>123-45-6789</FIELD10>
...
<FIELD10>567-89-0123</FIELD10>
...
<FIELD10>901-23-4567</FIELD10>
But imagine if the file had been formatted this way instead:
AC27123456789B11P
...
XY43567890123QTYL
...
6T2190123456788AM
You may not have recognized the significance of the numbers quite
as easily. This is the difference between human readable and human
understandable.
While we’re at it, FIELD10 doesn’t help much either. Something like
<SSNO>123-45-6789</SSNO>
makes the exercise a no-brainer—and ensures that the data will outlive
any project that created it.
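Parsing with only partial knowledge of the format can be as simple as looking for the pattern you care about. The Python sketch below illustrates the idea; the names SSN_ELEMENT and extract_ssns are our own, and the FIELD9 line in the sample is invented to stand in for data we know nothing about.

```python
import re

# Match any element whose content looks like NNN-NN-NNNN, whatever the tag
# happens to be called. We need to know nothing else about the file.
SSN_ELEMENT = re.compile(r"<(\w+)>(\d{3}-\d{2}-\d{4})</\1>")


def extract_ssns(text):
    """Return every value that looks like a Social Security number."""
    return [match.group(2) for match in SSN_ELEMENT.finditer(text)]


# Sample data; the FIELD9 line is a hypothetical unrelated field.
sample = """\
<FIELD9>Smith</FIELD9>
<FIELD10>123-45-6789</FIELD10>
<FIELD10>567-89-0123</FIELD10>
<FIELD10>901-23-4567</FIELD10>
"""

print(extract_ssns(sample))
```

No comparable one-liner exists for the packed format, where the digits are indistinguishable from the bytes around them unless you know the record layout.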
Leverage
Virtually every tool in the computing universe, from source code man-
agement systems to compiler environments to editors and stand-alone
filters, can operate on plain text.
Easier Testing
If you use plain text to create synthetic data to drive system tests, then
it is a simple matter to add, update, or modify the test data without
having to create any special tools to do so. Similarly, plain text output
from regression tests can be trivially analyzed (with diff, for instance)
or subjected to more thorough scrutiny with Perl, Python, or some other
scripting tool.
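As a small illustration of analyzing plain text test output, here is a sketch that compares an expected result with an actual one using Python's standard difflib module. The function name regression_diff and the two sample outputs are made up for this example.

```python
import difflib


def regression_diff(expected, actual):
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(difflib.unified_diff(
        expected.splitlines(),
        actual.splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    ))


# Hypothetical outputs from two runs of the same regression test.
expected = "total=3\nstatus=ok\n"
actual = "total=4\nstatus=ok\n"

for line in regression_diff(expected, actual):
    print(line)
```

Because both sides are plain text, the test harness needs no knowledge of what the program does; any change in output shows up as a changed line.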
Challenges
Design a small address book database (name, phone number, and so on)
using a straightforward binary representation in your language of choice.
Do this before reading the rest of this challenge.
15 Shell Games
Every woodworker needs a good, solid, reliable workbench, somewhere
to hold work pieces at a convenient height while he or she works them.
The workbench becomes the center of the wood shop, the craftsman
returning to it time and time again as a piece takes shape.
query the status of the system, and filter output. And by programming
the shell, you can build complex macro commands for activities you
perform often.
The simple answer is “no.” GUI interfaces are wonderful, and they can
be faster and more convenient for some simple operations. Moving files,
reading MIME-encoded e-mail, and typing letters are all things that
you might want to do in a graphical environment. But if you do all
your work using GUIs, you are missing out on the full capabilities of
your environment. You won’t be able to automate common tasks, or
use the full power of the tools available to you. And you won’t be able
to combine your tools to create customized macro tools. A benefit of
GUIs is WYSIWYG—what you see is what you get. The disadvantage is
WYSIAYG—what you see is all you get.
Which Java files have not been changed in the last week?
Shell:  find . -name '*.java' -mtime +7 -print
GUI:    Click and navigate to “Find files,” click the “Named” field
        and type in “*.java”, select the “Date Modified” tab. Then
        select “Between.” Click on the starting date and type in
        the starting date of the beginning of the project. Click on
        the ending date and type in the date of a week ago today
        (be sure to have a calendar handy). Click on “Find Now.”
GUI:    Load each file in the list from the previous example
        into an editor and search for the string “java.awt”. Write
        down the name of each file containing a match.
Clearly the list could go on. The shell commands may be obscure or
terse, but they are powerful and concise. And, because shell commands
can be combined into script files (or command files under Windows
TIP 21
Gain familiarity with the shell, and you’ll find your productivity soaring.
Need to create a list of all the unique package names explicitly imported
by your Java code? The following stores it in a file called “list.”
If you haven’t spent much time exploring the capabilities of the com-
mand shell on the systems you use, this might appear daunting. How-
ever, invest some energy in becoming familiar with your shell and things
will soon start falling into place. Play around with your command shell,
and you’ll be surprised at how much more productive it makes you.
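Tasks such as the import list mentioned above can also be handled by a short script when a pipeline feels too terse. The Python sketch below is one possible approach, not a transcription of any shell command: IMPORT_LINE, imports_in, and write_import_list are names we have made up, and the regular expression is a simplification that expects each import statement to sit on a line of its own.

```python
import re
from pathlib import Path

# Simplified pattern for a Java import statement on a single line.
IMPORT_LINE = re.compile(r"^\s*import\s+(?:static\s+)?([\w.]+(?:\.\*)?)\s*;")


def imports_in(java_source):
    """Return the set of names imported by one Java source text."""
    found = set()
    for line in java_source.splitlines():
        match = IMPORT_LINE.match(line)
        if match:
            found.add(match.group(1))
    return found


def write_import_list(root=".", out_file="list"):
    """Collect imports from every .java file under root into out_file."""
    names = set()
    for path in Path(root).rglob("*.java"):
        names |= imports_in(path.read_text(errors="replace"))
    Path(out_file).write_text("".join(name + "\n" for name in sorted(names)))


example = """\
package demo;
import java.util.List;
import java.awt.*;
import java.util.List;
class Demo {}
"""
print(sorted(imports_in(example)))
```

Calling write_import_list() from the top of a source tree would produce the sorted, duplicate-free file called “list”.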
3. The GNU General Public License [URL 57] is a kind of legal virus that Open Source
developers use to protect their (and your) rights. You should spend some time reading
it. In essence, it says that you can use and modify GPL’d software, but if you distribute
any modifications they must be licensed according to the GPL (and marked as such), and
you must make source available. That’s the virus part—whenever you derive a work from
a GPL’d work, your derived work must also be GPL’d. However, it does not limit you in
any way when simply using the tools—the ownership and licensing of software developed
using the tools are up to you.
Alternatively, David Korn (of Korn shell fame) has put together a pack-
age called UWIN. This has the same aims as the Cygwin distribution—it
is a Unix development environment under Windows. UWIN comes with
a version of the Korn shell. Commercial versions are available from
Global Technologies, Ltd. [URL 30]. In addition, AT&T allows free down-
loading of the package for evaluation and academic use. Again, read
their license before using.
Challenges
Are there things that you’re currently doing manually in a GUI? Do you
ever pass instructions to colleagues that involve a number of individual
“click this button,” “select this item” steps? Could these be automated?
16 Power Editing
We’ve talked before about tools being an extension of your hand. Well,
this applies to editors more than to any other software tool. You need
to be able to manipulate text as effortlessly as possible, because text
is the basic raw material of programming. Let’s look at some common
features and functions that help you get the most from your editing
environment.
One Editor
We think it is better to know one editor very well, and use it for all edit-
ing tasks: code, documentation, memos, system administration, and so
on. Without a single editor, you face a potential modern-day Babel of
confusion. You may have to use the built-in editor in each language’s
IDE for coding, and an all-in-one office product for documentation, and
maybe a different built-in editor for sending e-mail. Even the keystrokes
you use to edit command lines in the shell may be different.4 It is diffi-
cult to be proficient in any of these environments if you have a different
set of editing conventions and commands in each.
TIP 22
Choose an editor, know it thoroughly, and use it for all editing tasks.
If you use a single editor (or set of keybindings) across all text editing
activities, you don’t have to stop and think to accomplish text
manipulation: the necessary keystrokes will be a reflex. The editor will be
an extension of your hand; the keys will sing as they slice their way
through text and thought. That’s our goal.
4. Ideally, the shell you use should have keybindings that match the ones used by
your editor. Bash, for instance, supports both vi and emacs keybindings.
Make sure that the editor you choose is available on all platforms you
use. Emacs, vi, CRiSP, Brief, and others are available across multiple
platforms, often in both GUI and non-GUI (text screen) versions.
Editor Features
Beyond whatever features you find particularly useful and comfortable,
here are some basic abilities that we think every decent editor should
have. If your editor falls short in any of these areas, then this may be
the time to consider moving on to a more advanced one.
Syntax highlighting
Auto-completion
Auto-indentation
Initial code or document boilerplate
Tie-in to help systems
IDE-like features (compile, debug, and so on)
Having the ability to compile and navigate directly to errors within the
editor environment is very handy on big projects. Emacs in particular
is adept at this style of interaction.
Productivity
A surprising number of people we’ve met use the Windows notepad
utility to edit their source code. This is like using a teaspoon as a
shovel—simply typing and using basic mouse-based cut and paste is
not enough.
What sort of things will you need to do that can’t be done in this way?
Or suppose you are writing Java code. You like to keep your import
statements in alphabetical order, and someone else has checked in a
few files that don’t adhere to this standard (this may sound extreme,
but on a large project it can save you a lot of time scanning through a
long list of import statements). You’d like to go quickly through a few
files and sort a small section of them. In editors such as vi and Emacs
you can do this easily (see Figure 3.1). Try that in notepad.
I have a favorite editor, but I don’t use all of its features.
    Learn them. Cut down the number of keystrokes you need to type.
I have a favorite editor and use it where possible.
    Try to expand and use it for more tasks than you do already.
I think you are nuts. Notepad is the best editor ever made.
    As long as you are happy and productive, go for it! But if you find
    yourself subject to “editor envy,” you may need to reevaluate your
    position.
5. The Linux kernel is developed this way. Here you have geographically dispersed
developers, many working on the same pieces of code. There is a published list of settings
(in this case, for Emacs) that describes the required indentation style.
Challenges
Some editors use full-blown languages for customization and scripting.
Emacs, for example, uses Lisp. As one of the new languages you are going
to learn this year, learn the language your editor uses. For anything you
find yourself doing repeatedly, develop a set of macros (or equivalent) to
handle it.
Do you know everything your editor is capable of doing? Try to stump your
colleagues who use the same editor. Try to accomplish any given editing
task in as few keystrokes as possible.
One of the important things we look for in a user interface is the UNDO
key—a single button that forgives us our mistakes. It’s even better if the
environment supports multiple levels of undo and redo, so you can go
back and recover from something that happened a couple of minutes
ago. But what if the mistake happened last week, and you’ve turned
your computer on and off ten times since then? Well, that’s one of the
many benefits of using a source code control system: it’s a giant UNDO
key—a project-wide time machine that can return you to those halcyon
days of last week, when the code actually compiled and ran.
But a source code control system (SCCS6 ) does far more than undo
mistakes. A good SCCS will let you track changes, answering questions
such as: Who made changes in this line of code? What’s the difference
between the current version and last week’s? How many lines of code
did we change in this release? Which files get changed most often? This
kind of information is invaluable for bug-tracking, audit, performance,
and quality purposes.
An SCCS will also let you identify releases of your software. Once iden-
tified, you will always be able to go back and regenerate the release,
independent of changes that may have occurred later.
Source code control systems may keep the files they maintain in a cen-
tral repository—a great candidate for archiving.
6. We use the uppercase SCCS to refer to generic source code control systems. There
is also a specific system called “sccs,” originally released with AT&T System V Unix.
TIP 23
The project build mechanism can pull the latest source out of the repos-
itory automatically. It can run in the middle of the night after everyone’s
(hopefully) gone home. You can run automatic regression tests to en-
sure that the day’s coding didn’t break anything. The automation of
the build ensures consistency—there are no manual procedures, and
you won’t need developers remembering to copy code into some special
build area.
The build is repeatable because you can always rebuild the source as
it existed on a given date.
as “What did you do to the xyz module?” and “What broke the build?”
This approach may also help convince your management that source
code control really works.
Challenges
Even if you are not able to use an SCCS at work, install RCS or CVS on a
personal system. Use it to manage your pet projects, documents you write,
and (possibly) configuration changes applied to the computer system itself.
Take a look at some of the Open Source projects for which publicly ac-
cessible archives are available on the Web (such as Mozilla [URL 51], KDE
[URL 54], and the Gimp [URL 55]). How do you get updates of the source?
How do you make changes—does the project regulate access or arbitrate
the inclusion of changes?
18 Debugging
It is a painful thing
To look at your own trouble and know
That you yourself and no one else has made it
Sophocles, Ajax
The word bug has been used to describe an “object of terror” ever since
the fourteenth century. Rear Admiral Dr. Grace Hopper, the inventor
of COBOL, is credited with observing the first computer bug—literally,
a moth caught in a relay in an early computer system. When asked
to explain why the machine wasn’t behaving as intended, a technician
reported that there was “a bug in the system,” and dutifully taped it—
wings and all—into the log book.
Regrettably, we still have “bugs” in the system, albeit not the flying
kind. But the fourteenth century meaning—a bogeyman—is perhaps
even more applicable now than it was then. Software defects mani-
fest themselves in a variety of ways, from misunderstood requirements
to coding errors. Unfortunately, modern computer systems are still
limited to doing what you tell them to do, not necessarily what you
want them to do.
No one writes perfect software, so it’s a given that debugging will take
up a major portion of your day. Let’s look at some of the issues involved
in debugging and some general strategies for finding elusive bugs.
Psychology of Debugging
Debugging itself is a sensitive, emotional subject for many developers.
Instead of attacking it as a puzzle to be solved, you may encounter
denial, finger pointing, lame excuses, or just plain apathy.
Embrace the fact that debugging is just problem solving, and attack it
as such.
Having found someone else’s bug, you can spend time and energy lay-
ing blame on the filthy culprit who created it. In some workplaces this
is part of the culture, and may be cathartic. However, in the technical
arena, you want to concentrate on fixing the problem, not the blame.
TIP 24
It doesn’t really matter whether the bug is your fault or someone else’s.
It is still your problem.
A Debugging Mindset
The easiest person to deceive is one’s self.
Edward Bulwer-Lytton, The Disowned
Before you start debugging, it’s important to adopt the right mindset.
You need to turn off many of the defenses you use each day to protect
your ego, tune out any project pressures you may be under, and get
yourself comfortable. Above all, remember the first rule of debugging:
TIP 25
Don’t Panic
It’s easy to get into a panic, especially if you are facing a deadline, or
have a nervous boss or client breathing down your neck while you are
trying to find the cause of the bug. But it is very important to step back
a pace, and actually think about what could be causing the symptoms
that you believe indicate a bug.
Beware of myopia when debugging. Resist the urge to fix just the symp-
toms you see: it is more likely that the actual fault may be several steps
removed from what you are observing, and may involve a number of
other related things. Always try to discover the root cause of a problem,
not just this particular appearance of it.
Where to Start
Before you start to look at the bug, make sure that you are work-
ing on code that compiled cleanly—without warnings. We routinely set
When trying to solve any problem, you need to gather all the relevant
data. Unfortunately, bug reporting isn’t an exact science. It’s easy to be
misled by coincidences, and you can’t afford to waste time debugging
coincidences. You first need to be accurate in your observations.
Finally, we got them together in the same room. The tester selected
the brush tool and painted a stroke from the upper right corner to the
lower left corner. The application exploded. “Oh,” said the programmer,
in a small voice, who then sheepishly admitted that he had made test
strokes only from the lower left to the upper right, which did not expose
the bug.
You may need to interview the user who reported the bug in order
to gather more data than you were initially given.
Debugging Strategies
Once you think you know what is going on, it’s time to find out what
the program thinks is going on.
Bug Reproduction
No, our bugs aren’t really multiplying (although some of them are
probably old enough to do it legally). We’re talking about a different
kind of reproduction.
The best way to start fixing a bug is to make it reproducible. After all,
if you can’t reproduce it, how will you know if it is ever fixed?
But we want more than a bug that can be reproduced by following
some long series of steps; we want a bug that can be reproduced
with a single command. It’s a lot harder to fix a bug if you have to go
through 15 steps to get to the point where the bug shows up. Some-
times by forcing yourself to isolate the circumstances that display the
bug, you’ll even gain an insight on how to fix it.
See Ubiquitous Automation, page 230, for other ideas along these
lines.
But you can gain a much deeper insight into your data by using a
debugger that allows you to visualize your data and all of the inter-
relationships that exist. There are debuggers that can represent your
data as a 3D fly-over through a virtual reality landscape, or as a 3D
waveform plot, or just as simple structural diagrams, as shown in Fig-
ure 3.2 on the next page. As you single-step through your program,
pictures like these can be worth much more than a thousand words,
as the bug you’ve been hunting suddenly jumps out at you.
Even if your debugger has limited support for visualizing data, you
can still do it yourself—either by hand, with paper and pencil, or with
external plotting programs.
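Figure 3.2. Sample debugger diagram of a circular linked list. The arrows
represent pointers to nodes.
[Diagram: the variable list (List *) holds 0x804db40 and points to a node with
value = 85, self = 0x804db40, next = 0x804db50. That node’s next pointer leads
to a second node with value = 86, self = 0x804db50, next = 0x804db40, whose
next pointer leads back to the first.]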
Tracing
Debuggers generally focus on the state of the program now. Sometimes
you need more—you need to watch the state of a program or a data
structure over time. Seeing a stack trace can only tell you how you got
here directly. It can’t tell you what you were doing prior to this call
chain, especially in event-based systems.
You can use tracing statements to “drill down” into the code. That is,
you can add tracing statements as you descend the call tree.
Looks like someone sprayed a street address over our counter. Now
we know where to look.
file with Perl, you could easily identify where the offending open was
occurring.
Rubber Ducking
A very simple but particularly useful technique for finding the cause
of a problem is simply to explain it to someone else. The other person
should look over your shoulder at the screen, and nod his or her head
constantly (like a rubber duck bobbing up and down in a bathtub).
They do not need to say a word; the simple act of explaining, step by
step, what the code is supposed to do often causes the problem to leap
off the screen and announce itself.
Process of Elimination
In most projects, the code you are debugging may be a mixture of appli-
cation code written by you and others on your project team, third-party
products (database, connectivity, graphical libraries, specialized com-
munications or algorithms, and so on) and the platform environment
(operating system, system libraries, and compilers).
TIP 26
If you “changed only one thing” and the system stopped working, that
one thing was likely to be responsible, directly or indirectly, no matter
how farfetched it seems. Sometimes the thing that changed is outside of
your control: new versions of the OS, compiler, database, or other third-
party software can wreak havoc with previously correct code. New bugs
might show up. Bugs for which you had a work-around get fixed, break-
ing the work-around. APIs change, functionality changes; in short, it’s
a whole new ball game, and you must retest the system under these
If, however, you have no obvious place to start looking, you can always
rely on a good old-fashioned binary search. See if the symptoms are
present at either of two far away spots in the code. Then look in the
middle. If the problem is present, then the bug lies between the start
and the middle point; otherwise, it is between the middle point and the
end. You can continue in this fashion until you narrow down the spot
sufficiently to identify the problem.
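The search itself can be written down in a few lines. The Python sketch below assumes you can name an ordered series of checkpoints (positions in the code, processing steps, or input records) and that the symptom, once it appears, stays visible at every later checkpoint. The function name first_failing and the example with 100 steps are hypothetical.

```python
def first_failing(checkpoints, symptom_present):
    """Return the earliest checkpoint at which the symptom shows up.

    Assumes the symptom is absent before the bug, present at every
    checkpoint from the bug onward, and present at the last checkpoint.
    """
    low, high = 0, len(checkpoints) - 1
    while low < high:
        middle = (low + high) // 2
        if symptom_present(checkpoints[middle]):
            high = middle        # the bug is at or before the middle
        else:
            low = middle + 1     # the bug is after the middle
    return checkpoints[low]


probes = []


def symptom_present(step):
    """Hypothetical check: pretend the data goes bad at step 37."""
    probes.append(step)
    return step >= 37


culprit = first_failing(list(range(1, 101)), symptom_present)
print("first bad step:", culprit, "after", len(probes), "checks")
```

Each check halves the region still under suspicion, so even a hundred candidate spots need only a handful of looks.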
Of course it can. The amount of surprise you feel when something goes
wrong is directly proportional to the amount of trust and faith you have
in the code being run. That’s why, when faced with a “surprising” fail-
ure, you must realize that one or more of your assumptions is wrong.
Don’t gloss over a routine or piece of code involved in the bug because
you “know” it works. Prove it. Prove it in this context, with this data,
with these boundary conditions.
TIP 27
When you come across a surprise bug, beyond merely fixing it, you need
to determine why this failure wasn’t caught earlier. Consider whether
you need to amend the unit or other tests so that they would have
caught it.
Also, if the bug is the result of bad data that was propagated through
a couple of levels before causing the explosion, see if better parame-
ter checking in those routines would have isolated it earlier (see the
While you’re at it, are there any other places in the code that may be
susceptible to this same bug? Now is the time to find and fix them.
Make sure that whatever happened, you’ll know if it happens again.
If it took a long time to fix this bug, ask yourself why. Is there anything
you can do to make fixing this bug easier the next time around? Per-
haps you could build in better testing hooks, or write a log file analyzer.
Debugging Checklist
Is the problem being reported a direct result of the underlying bug,
or merely a symptom?
If the suspect code passes its unit tests, are the tests complete
enough? What happens if you run the unit test with this data?
Do the conditions that caused this bug exist anywhere else in the
system?
Challenges
Debugging is challenge enough.
19 Text Manipulation
Pragmatic Programmers manipulate text the same way woodworkers
shape wood. In previous sections we discussed some specific tools—
shells, editors, debuggers—that we use. These are similar to a wood-
worker’s chisels, saws, and planes—tools specialized to do one or two
jobs well. However, every now and then we need to perform some trans-
formation not readily handled by the basic tool set. We need a general-
purpose text manipulation tool.
8. Here router means the tool that spins cutting blades very, very fast, not a device for
interconnecting networks.
manipulate text, interact with programs, talk over networks, drive Web
pages, perform arbitrary precision arithmetic, and write programs that
look like Snoopy swearing.
TIP 28
book has been. However, using the DRY principle (see The Evils of
Duplication, page 26) we didn’t want to copy and paste lines of code
from the tested programs into the book. That would have meant
that the code was duplicated, virtually guaranteeing that we’d for-
get to update an example when the corresponding program was
changed. For some examples, we also didn’t want to bore you with
all the framework code needed to make our example compile and
run. We turned to Perl. A relatively simple script is invoked when
we format the book—it extracts a named segment of a source file,
does syntax highlighting, and converts the result into the typeset-
ting language we use.
Exercises
11. (Answer on p. 285) Your C program uses an enumerated type to represent one of 100 states.
You’d like to be able to print out the state as a string (as opposed to a
number) for debugging purposes. Write a script that reads from standard
input a file containing
    name
    state_a
    state_b
    :   :
12. (Answer on p. 286) Halfway through writing this book, we realized that we hadn’t put the use
strict directive into many of our Perl examples. Write a script that goes
through the .pl files in a directory and adds a use strict at the end of
the initial comment block to all files that don’t already have one. Remember
to keep a backup of all files you change.
20 Code Generators
When woodworkers are faced with the task of producing the same thing
over and over, they cheat. They build themselves a jig or a template. If
they get the jig right once, they can reproduce a piece of work time after
time. The jig takes away complexity and reduces the chances of making
mistakes, leaving the craftsman free to concentrate on quality.
TIP 29
2. Active code generators are used each time their results are required.
The result is a throw-away—it can always be reproduced by the
code generator. Often, active code generators read some form of
script or control file to produce their results.
Figure 3.3. Active code generator creates code from a database schema
[Figure: a schema (table employee, table employer, table benefit) is read by an active code generator, which produces struct EmployeeRow, struct EmployerRow, and struct BenefitRow.]
not in production. Of course, this scheme works only if you make the
code generation part of the build process itself.9
9. Just how do you go about building code from a database schema? There are several
ways. If the schema is held in a flat file (for example, as create table statements), then
a relatively simple script can parse it and generate the source. Alternatively, if you use a
tool to create the schema directly in the database, then you should be able to extract the
information you need directly from the database’s data dictionary. Perl provides libraries
that give you access to most major databases.
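As a rough illustration of the flat-file route, the sketch below turns a schema description into C structs. Everything about it is an assumption made for the example: the input format (a "table name" line followed by "column type" lines), the function name generateStructs, and the Row suffix borrowed from Figure 3.3. A real generator would parse actual create table statements.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical input: "table <name>" lines, each followed by "<column> <ctype>" lines.
// Output: one C struct per table, named <Name>Row.
std::string generateStructs(const std::string& schema) {
    std::istringstream in(schema);
    std::ostringstream out;
    std::string line;
    bool open = false;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string first, second;
        if (!(words >> first >> second)) continue;      // skip blank or short lines
        assert(!second.empty());
        if (first == "table") {
            if (open) out << "};\n";                    // close the previous struct
            second[0] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(second[0])));
            out << "struct " << second << "Row {\n";
            open = true;
        } else if (open) {
            out << "    " << second << " " << first << ";\n";
        }
    }
    if (open) out << "};\n";
    return out.str();
}
```

Because the output is regenerated on every build, nobody edits the structs by hand; a schema change flows into the code automatically.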
# Add a product
# to the ’on-order’ list
M AddProduct
F id int
F name char[30]
F order_code int
E
[Figure 3.4: code in C and in Pascal is generated from the specification above.]
simple. Have a look at the answer to Exercise 13 (page 286): the actual
code generation is basically print statements.
Exercises
13. (Answer on p. 286) Write a code generator that takes the input file in Figure 3.4, and generates
output in two languages of your choice. Try to make it easy to add new
languages.
Pragmatic Paranoia
TIP 30
Everyone knows that they personally are the only good driver on Earth.
The rest of the world is out there to get them, blowing through stop
signs, weaving between lanes, not indicating turns, talking on the
telephone, reading the paper, and just generally not living up to our
standards. So we drive defensively. We look out for trouble before it
happens, anticipate the unexpected, and never put ourselves into a
position from which we can’t extricate ourselves.
But Pragmatic Programmers take this a step further. They don’t trust
themselves, either. Knowing that no one writes perfect code, includ-
ing themselves, Pragmatic Programmers code in defenses against their
own mistakes. We describe the first defensive measure in Design by
Contract: clients and suppliers must agree on rights and responsibili-
ties.
Exceptions, like any other technique, can cause more harm than good
if not used properly. We’ll discuss the issues in When to Use Exceptions.
As your programs get more dynamic, you’ll find yourself juggling sys-
tem resources—memory, files, devices, and the like. In How to Balance
Resources, we’ll suggest ways of ensuring that you don’t drop any of
the balls.
When everybody actually is out to get you, paranoia is just good thinking.
Woody Allen
21 Design by Contract
Nothing astonishes men so much as common sense and plain dealing.
Ralph Waldo Emerson, Essays
Maybe you have an employment contract that specifies the hours you’ll
work and the rules of conduct you must follow. In return, the company
pays you a salary and other perks. Each party meets its obligations and
everyone benefits.
It’s an idea used the world over—both formally and informally—to help
humans interact. Can we use the same concept to help software mod-
ules interact? The answer is “yes.”
DBC
Bertrand Meyer [Mey97b] developed the concept of Design by Contract
for the language Eiffel.1 It is a simple yet powerful technique that
focuses on documenting (and agreeing to) the rights and responsibil-
ities of software modules to ensure program correctness. What is a
correct program? One that does no more and no less than it claims
to do. Documenting and verifying that claim is the heart of Design by
Contract (DBC, for short).
1. Based in part on earlier work by Dijkstra, Floyd, Hoare, Wirth, and others. For more
information on Eiffel itself, see [URL 10] and [URL 11].
Let’s look at the contract for a routine that inserts a data value into
a unique, ordered list. In iContract, a preprocessor for Java available
from [URL 17], you’d specify it as
/**
 * @invariant forall Node n in elements() |
 *    n.prev() != null
 *      implies
 *         n.value().compareTo(n.prev().value()) > 0
 */
public class dbc_list {
  /**
   * @pre contains(aNode) == false
   * @post contains(aNode) == true
   */
  public void insertNode(final Node aNode) {
    // ...
Here we are saying that nodes in this list must always be in increas-
ing order. When you insert a new node, it can’t exist already, and we
guarantee that the node will be found after you have inserted it.
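In a language without contract support, the same three clauses can be approximated with ordinary assertions. The sketch below is not iContract and not the book's listing: the class name OrderedList and its int payload are hypothetical, and the checks are simply written by hand at the top and bottom of the routine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class OrderedList {
    std::vector<int> items_;

    // Invariant: values are strictly increasing, hence also unique.
    bool invariant() const {
        return std::adjacent_find(items_.begin(), items_.end(),
                                  [](int a, int b) { return a >= b; }) == items_.end();
    }

public:
    bool contains(int value) const {
        return std::binary_search(items_.begin(), items_.end(), value);
    }

    void insert(int value) {
        assert(invariant());
        assert(!contains(value));   // precondition: the value is not there yet
        items_.insert(std::lower_bound(items_.begin(), items_.end(), value), value);
        assert(contains(value));    // postcondition: the value can now be found
        assert(invariant());
    }

    std::size_t size() const { return items_.size(); }
};
```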
The contract between a routine and any potential caller can thus be
read as
If all the routine’s preconditions are met by the caller, the routine shall
guarantee that all postconditions and invariants will be true when it
completes.
If either party fails to live up to the terms of the contract, then a rem-
edy (which was previously agreed to) is invoked—an exception is raised,
or the program terminates, for instance. Whatever happens, make no
mistake that failure to live up to the contract is a bug. It is not some-
thing that should ever happen, which is why preconditions should not
be used to perform things such as user-input validation.
TIP 31
Subclasses must be usable through the base class interface without the
need for the user to know the difference.
In other words, you want to make sure that the new subtype you have
created really “is-a-kind-of” the base type—that it supports the same
methods, and that the methods have the same meaning. We can do
this with contracts. We need to specify a contract only once, in the base
class, to have it applied to every future subclass automatically. A sub-
class may, optionally, accept a wider range of input, or make stronger
guarantees. But it must accept at least as much, and guarantee as
much, as its parent.
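C++ has no built-in way to inherit a contract, but one common approximation is to put the checks in a public, non-virtual base class method and let subclasses override only a private hook. The sketch below assumes that arrangement; Shape, Square, scale, and doScale are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cmath>

class Shape {
public:
    virtual ~Shape() = default;

    // The contract is written once, here, and applies to every subclass.
    void scale(double factor) {
        assert(factor > 0.0);                                   // precondition
        const double expected = area() * factor * factor;
        doScale(factor);
        assert(std::fabs(area() - expected) <= 1e-9 * (1.0 + expected));  // postcondition
    }

    virtual double area() const = 0;

private:
    virtual void doScale(double factor) = 0;   // subclasses override only this hook
};

class Square : public Shape {
    double side_;

public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }

private:
    void doScale(double factor) override { side_ *= factor; }
};
```

A subclass whose doScale broke the promise would trip the base class assertion the first time it was used through the Shape interface.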
/**
* @pre f != null
* @post getFont() == f
*/
public void setFont(final Font f) {
// ...
Implementing DBC
The greatest benefit of using DBC may be that it forces the issue of
requirements and guarantees to the forefront. Simply enumerating at
design time what the input domain range is, what the boundary con-
ditions are, and what the routine promises to deliver—or, more im-
Assertions
While documenting these assumptions is a great start, you can get
much greater benefit by having the compiler check your contract for
you. You can partially emulate this in some languages by using asser-
tions (see Assertive Programming, page 122). Why only partially? Can’t
you use assertions to do everything DBC can do?
Finally, the runtime system and libraries are not designed to support
contracts, so these calls are not checked. This is a big loss, because
it is often at the boundary between your code and the libraries it uses
that the most problems are detected (see Dead Programs Tell No Lies,
page 120 for a more detailed discussion).
Language Support
Languages that feature built-in support of DBC (such as Eiffel and
Sather [URL 12]) check pre- and postconditions automatically in the
compiler and runtime system. You get the greatest benefit in this case
because all of the code base (libraries, too) must honor their contracts.
But what about more popular languages such as C, C++, and Java?
For these languages, there are preprocessors that process contracts
embedded in the original source code as special comments. The pre-
processor expands these comments to code that verifies the assertions.
For C and C++, you may want to investigate Nana [URL 18]. Nana doesn’t
handle inheritance, but it does use the debugger at runtime to monitor
assertions in a novel way.
For Java, there is iContract [URL 17]. It takes comments (in JavaDoc
form) and generates a new source file with the assertion logic included.
sqrt: DOUBLE is
-- Square root routine
require
sqrt_arg_must_be_positive: Current >= 0;
--- ...
--- calculate square root here
--- ...
ensure
((Result*Result) - Current).abs <= epsilon*Current.abs;
-- Result should be within error tolerance
end;
Who’s Responsible?
Who is responsible for checking the precondition, the caller or the
routine being called? When implemented as part of the language, the
answer is neither: the precondition is tested behind the scenes after
the caller invokes the routine but before the routine itself is entered.
Thus if there is any explicit checking of parameters to be done, it
must be performed by the caller, because the routine itself will never
see parameters that violate its precondition. (For languages without
built-in support, you would need to bracket the called routine with a
preamble and/or postamble that checks these assertions.)
Consider a program that reads a number from the console, calcu-
lates its square root (by calling sqrt), and prints the result. The sqrt
function has a precondition—its argument must not be negative. If the
user enters a negative number at the console, it is up to the calling
code to ensure that it never gets passed to sqrt. This calling code
has many options: it could terminate, it could issue a warning and
read another number, or it could make the number positive and append
an “i” to the result returned by sqrt. Whatever its choice, this
is definitely not sqrt’s problem.
By expressing the domain of the square root function in the precon-
dition of the sqrt routine, you shift the burden of correctness to the
caller—where it belongs. You can then design the sqrt routine se-
cure in the knowledge that its input will be in range.
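The division of labor can be sketched as follows. The names checkedSqrt and sqrtOfUserInput are hypothetical, the assertions stand in for real contract support, and the tolerance is an arbitrary choice for the example. The point is only where each check lives: the routine asserts its domain, and the caller filters user input before the routine ever sees it.

```cpp
#include <cassert>
#include <cmath>
#include <exception>
#include <string>

// The routine states its domain and then relies on it.
double checkedSqrt(double x) {
    assert(x >= 0.0 && "sqrt_arg_must_be_positive");               // precondition
    const double r = std::sqrt(x);
    assert(x == 0.0 ? r == 0.0 : std::fabs(r - x / r) <= 1e-9 * r); // postcondition
    return r;
}

// The caller owns input validation: bad input never reaches checkedSqrt.
bool sqrtOfUserInput(const std::string& text, double& result) {
    double value = 0.0;
    try {
        value = std::stod(text);
    } catch (const std::exception&) {
        return false;                       // not a number at all
    }
    if (!std::isfinite(value) || value < 0.0) {
        return false;                       // outside the routine's domain
    }
    result = checkedSqrt(value);
    return true;
}
```

Here the caller chooses to reject bad input; warning and asking again would be an equally valid policy, and either way checkedSqrt stays unchanged.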
If your algorithm for calculating the square root fails (or isn’t within the
specified error tolerance), you get an error message and a stack trace
to show you the call chain.
If you pass sqrt a negative parameter, the Eiffel runtime prints the
error “sqrt_arg_must_be_positive,” along with a stack trace. This
is better than the alternative in languages such as Java, C, and C++,
where passing a negative number to sqrt returns the special value
NaN (Not a Number). It may be some time later in the program that you
attempt to do some math on NaN, with surprising results.
It’s much easier to find and diagnose the problem by crashing early, at
the site of the problem.
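A small sketch shows how quietly NaN travels. The function names are hypothetical, and the behavior assumes the usual IEEE floating-point arithmetic found on mainstream platforms.

```cpp
#include <cassert>
#include <cmath>

// No precondition: a negative argument silently yields NaN.
double uncheckedRoot(double x) { return std::sqrt(x); }

// With a precondition: a negative argument stops the program at the call site.
double checkedRoot(double x) {
    assert(x >= 0.0);
    return std::sqrt(x);
}

// The NaN survives further arithmetic and surfaces far from its origin.
double halfTotal(double a, double b) {
    return (uncheckedRoot(a) + uncheckedRoot(b)) / 2.0;
}
```

A NaN result is neither less than nor greater than or equal to any limit, so a later range check built on either comparison simply falls through.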
Loop Invariants
Getting the boundary conditions right on a nontrivial loop can be prob-
lematic. Loops are subject to the banana problem (I know how to spell
“banana,” but I don’t know when to stop), fencepost errors (not know-
ing whether to count the fenceposts or the spaces between them), and
the ubiquitous “off by one” error [URL 52].
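One way to guard against these mistakes is to state what must be true on every pass through the loop and check it. The sketch below is an illustration, not the book's listing: the function name maxElement is hypothetical, and the invariant is checked the expensive way, by recomputing the answer for the part of the data already seen.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the largest element of a non-empty vector.
int maxElement(const std::vector<int>& values) {
    assert(!values.empty());                         // precondition
    int best = values[0];
    std::size_t i = 1;
    while (i < values.size()) {
        // Loop invariant: best is the maximum of values[0 .. i-1].
        assert(best == *std::max_element(
                           values.begin(),
                           values.begin() + static_cast<std::ptrdiff_t>(i)));
        best = std::max(best, values[i]);
        ++i;
    }
    // At exit i == values.size(), so the invariant covers the whole vector.
    assert(best == *std::max_element(values.begin(), values.end()));
    return best;
}
```

The invariant must hold before the first iteration and after every one; together with the termination condition it yields the final result.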
Semantic Invariants
You can use semantic invariants to express inviolate requirements, a
kind of “philosophical contract.”
sort of failure mode might happen, the error should be on the side of
not processing a transaction rather than processing a duplicate trans-
action.
Be sure not to confuse requirements that are fixed, inviolate laws with
those that are merely policies that might change with a new manage-
ment regime. That’s why we use the term semantic invariants—it must
be central to the very meaning of a thing, and not subject to the whims
of policy (which is what more dynamic business rules are for).
Certainly any system that relies on agent technology has a critical de-
pendence on contractual arrangements—even if they are dynamically
generated.
Imagine: with enough components and agents that can negotiate their
own contracts among themselves to achieve a goal, we might just solve
the software productivity crisis by letting software solve it for us.
Challenges
Points to ponder: If DBC is so powerful, why isn’t it used more widely? Is
it hard to come up with the contract? Does it make you think about issues
you’d rather ignore for now? Does it force you to THINK!? Clearly, this is a
dangerous tool!
Exercises
14. (Answer on p. 288) What makes a good contract? Anyone can add preconditions and
postconditions, but will they do you any good? Worse yet, will they actually do
more harm than good? For the example below and for those in Exercises
15 and 16, decide whether the specified contract is good, bad, or ugly, and
explain why.
First, let’s look at an Eiffel example. Here we have a routine for adding a
STRING to a doubly linked, circular list (remember that preconditions are
labeled with require, and postconditions with ensure).
15. (Answer on p. 288) Next, let’s try an example in Java, somewhat similar to the example in
Exercise 14. insertNumber inserts an integer into an ordered list. Pre-
and postconditions are labeled as in iContract (see [URL 17]).
16. (Answer on p. 289) Here’s a fragment from a stack class in Java. Is this a good contract?
/**
* @pre anItem != null // Require real data
* @post pop() == anItem // Verify that it’s
* // on the stack
*/
public void push(final String anItem)
17. (Answer on p. 289) The classic examples of DBC (as in Exercises 14–16) show an
implementation of an ADT (Abstract Data Type), typically a stack or queue. But not
many people really write these kinds of low-level classes.
So, for this exercise, design an interface to a kitchen blender. It will even-
tually be a Web-based, Internet-enabled, CORBA-fied blender, but for now
we just need the interface to control it. It has ten speed settings (0 means
off). You can’t operate it empty, and you can change the speed only one
unit at a time (that is, from 0 to 1, and from 1 to 2, not from 0 to 2).
Here are the methods. Add appropriate pre- and postconditions and an
invariant.
int getSpeed()
void setSpeed(int x)
boolean isFull()
void fill()
void empty()
It’s easy to fall into the “it can’t happen” mentality. Most of us have
written code that didn’t check that a file closed successfully, or that a
trace statement got written as we expected. And all things being equal,
it’s likely that we didn’t need to—the code in question wouldn’t fail
under any normal conditions. But we’re coding defensively. We’re look-
ing for rogue pointers in other parts of our program trashing the stack.
We’re checking that the correct versions of shared libraries were actu-
ally loaded.
All errors give you information. You could convince yourself that the
error can’t happen, and choose to ignore it. Instead, Pragmatic Pro-
grammers tell themselves that if there is an error, something very, very
bad has happened.
TIP 32
Crash Early
The Java language and libraries have embraced this philosophy. When
something unexpected happens within the runtime system, it throws
a RuntimeException. If not caught, this will percolate up to the top
level of the program and cause it to halt, displaying a stack trace.
You can do the same in other languages. If you don’t have an exception
mechanism, or if your libraries don’t throw exceptions, then make sure
you handle the errors yourself. In C, macros can be very useful for this:
#include <stdio.h>
#include <stdlib.h>

/* Evaluate LINE once; abort with a diagnostic unless it returns EXPECTED. */
#define CHECK(LINE, EXPECTED) \
  { int rc = LINE; \
    if (rc != EXPECTED) \
      ut_abort(__FILE__, __LINE__, #LINE, rc, EXPECTED); }

void ut_abort(const char *file, int ln, const char *line, int rc, int exp) {
  fprintf(stderr, "%s line %d\n'%s': expected %d, got %d\n",
          file, ln, line, exp, rc);
  exit(1);
}
Then you can wrap calls that should never fail using
CHECK(stat("/tmp", &stat_buff), 0);
23 Assertive Programming
There is a luxury in self-reproach. When we blame ourselves we feel no one else
has a right to blame us.
Oscar Wilde, The Picture of Dorian Gray
“This code won’t be used 30 years from now, so two-digit dates are fine.”
“This application will never be used abroad, so why internationalize it?”
“count can’t be negative.” “This printf can’t fail.”
TIP 33
Whenever you find yourself thinking “but of course that could never
happen,” add code to check it. The easiest way to do this is with asser-
tions. In most C and C++ implementations, you’ll find some form of
assert or _assert macro that checks a Boolean condition. These
macros can be invaluable. If a pointer passed in to your procedure
should never be NULL, then check for it:
void writeString(char *string) {
assert(string != NULL);
...
And just because the supplied assert macros call exit when an as-
sertion fails, there’s no reason why versions you write should. If you
need to free resources, have an assertion failure generate an exception,
longjmp to an exit point, or call an error handler. Just make sure the
code you execute in those dying milliseconds doesn’t rely on the infor-
mation that triggered the assertion failure in the first place.
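As a sketch of an assertion that raises an exception instead of ending the program, consider the macro below. The names ENSURE, AssertionFailure, assertionFailed, and checkedLength are hypothetical. Throwing lets stack unwinding release resources on the way out, and the handler only formats text that was captured at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <stdexcept>
#include <string>

class AssertionFailure : public std::logic_error {
public:
    explicit AssertionFailure(const std::string& what) : std::logic_error(what) {}
};

[[noreturn]] void assertionFailed(const char* expr, const char* file, int line) {
    assert(expr != nullptr && file != nullptr);
    std::ostringstream msg;
    msg << file << ":" << line << ": assertion failed: " << expr;
    throw AssertionFailure(msg.str());
}

// Unlike the standard assert, this check throws rather than terminating.
#define ENSURE(cond) \
    ((cond) ? static_cast<void>(0) : assertionFailed(#cond, __FILE__, __LINE__))

std::size_t checkedLength(const char* text) {
    ENSURE(text != nullptr);
    return std::strlen(text);
}
```

Note that assertionFailed never touches the pointer that failed the check; it reports only the expression text, the file, and the line.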
There are two patently wrong assumptions here. First, they assume
that testing finds all the bugs. In reality, for any complex program you
are unlikely to test even a minuscule percentage of the permutations
your code will be put through (see Ruthless Testing, page 245). Second,
the optimists are forgetting that your program runs in a dangerous
world. During testing, rats probably won’t gnaw through a communi-
cations cable, someone playing a game won’t exhaust memory, and log
files won’t fill the hard drive. These things might happen when your
program runs in a production environment. Your first line of defense is
checking for any possible error, and your second is using assertions to
try to detect those you’ve missed.
Even if you do have performance issues, turn off only those assertions
that really hit you. The sort example above may be a critical part of
your application, and may need to be fast. Adding the check means
another pass through the data, which might be unacceptable. Make
that particular check optional,2 but leave the rest in.
2. In C-based languages, you can either use the preprocessor or use if statements to
make assertions optional. Many implementations turn off code generation for the assert
macro if a compile-time flag is set (or not set). Otherwise, you can place the code within an
if statement with a constant condition, which many compilers (including most common
Java systems) will optimize away.
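The constant-condition approach from the footnote might look like this in C++. The switch kCheckSortedResult and both function names are hypothetical; the sketch only shows one expensive check made optional while a cheap one stays on.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical switch for one expensive check; every other assertion stays on.
constexpr bool kCheckSortedResult = false;

void sortReadings(std::vector<int>& data) {
    std::sort(data.begin(), data.end());
    if (kCheckSortedResult) {
        // An extra pass over the data; with a constant false condition the
        // compiler is free to drop this branch entirely.
        assert(std::is_sorted(data.begin(), data.end()));
    }
}

int medianOfSorted(const std::vector<int>& data) {
    assert(!data.empty());            // cheap check: always left in
    return data[data.size() / 2];
}
```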
Exercises
19. (Answer on p. 290) A quick reality check. Which of these “impossible” things can happen?
1. A month with fewer than 28 days
2. stat(".",&sb) == -1 (that is, can’t access the current directory)
3. In C++: a = 2; b = 3; if (a + b != 5) exit(1);
4. A triangle with an interior angle sum ≠ 180°
5. A minute that doesn’t have 60 seconds
6. In Java: (a + 1) <= a
retcode = OK;
try {
  socket.read(name);
  process(name);
  socket.read(address);
  processAddress(address);
  socket.read(telNo);
  // etc, etc...
}
catch (IOException e) {
  retcode = BAD_READ;
  Logger.log("Error reading individual: " + e.getMessage());
}
return retcode;
The normal flow of control is now clear, with all the error handling
moved off to a single place.
What Is Exceptional?
One of the problems with exceptions is knowing when to use them. We
believe that exceptions should rarely be used as part of a program’s
normal flow; exceptions should be reserved for unexpected events.
Assume that an uncaught exception will terminate your program and
ask yourself, “Will this code still run if I remove all the exception han-
dlers?” If the answer is “no,” then maybe exceptions are being used in
nonexceptional circumstances.
For example, if your code tries to open a file for reading and that file
does not exist, should an exception be raised?
Our answer is, “It depends.” If the file should have been there, then an
exception is warranted. Something unexpected happened—a file you
were expecting to exist seems to have disappeared. On the other hand,
if you have no idea whether the file should exist or not, then it doesn’t
seem exceptional if you can’t find it, and an error return is appropriate.
Let’s look at an example of the first case. The following code opens the
file /etc/passwd, which should exist on all Unix systems. If it fails, it
passes on the FileNotFoundException to its caller.
However, the second case may involve opening a file specified by the
user on the command line. Here an exception isn’t warranted, and the
code looks different:
public boolean open_user_file(String name)
    throws FileNotFoundException {
  File f = new File(name);
  if (!f.exists()) {
    return false;
  }
  ipstream = new FileInputStream(f);
  return true;
}
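The same distinction can be drawn in C++. The sketch below is an assumption-laden illustration, with hypothetical function names openRequired and openOptional: the first treats a missing file as exceptional, the second reports it as an ordinary result.

```cpp
#include <cassert>
#include <fstream>
#include <stdexcept>
#include <string>

// Case 1: the file should be there, so its absence is exceptional.
std::ifstream openRequired(const std::string& path) {
    std::ifstream in(path);
    if (!in.is_open()) {
        throw std::runtime_error("required file missing: " + path);
    }
    return in;
}

// Case 2: the name came from the user, so "not there" is an expected outcome.
bool openOptional(const std::string& path, std::ifstream& in) {
    assert(!in.is_open());            // the caller passes an unopened stream
    in.open(path);
    return in.is_open();
}
```

The calling code for the second case reads as plain control flow: test the return value and prompt for another name if needed.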
TIP 34
There are times when you may want to use error handlers, either in-
stead of or alongside exceptions. Clearly, if you are using a language
such as C, which does not support exceptions, this is one of your few
other options (see the challenge on the next page). However, sometimes
error handlers can be used even in languages (such as Java) that have
a good exception handling scheme built in.
Challenges
Languages that do not support exceptions often have some other nonlocal
transfer of control mechanism (C has longjmp/setjmp, for example). Con-
sider how you could implement some kind of ersatz exception mechanism
using these facilities. What are the benefits and dangers? What special
steps do you need to take to ensure that resources are not orphaned?
Does it make sense to use this kind of solution whenever you code in C?
Exercises
21. (Answer on p. 292) While designing a new container class, you identify the following possible
error conditions:
However, many developers have no consistent plan for dealing with re-
source allocation and deallocation. So let us suggest a simple tip:
TIP 35
All seems fine during testing. However, when the code goes into produc-
tion, it collapses after several hours, complaining of too many open files.
Because writeCustomer is not getting called in some circumstances,
the file is not getting closed.
A very bad solution to this problem would be to deal with the special
case in updateCustomer:
void updateCustomer(const char *fName, double newBalance) {
  Customer cRec;
  readCustomer(fName, &cRec);
  if (newBalance >= 0.0) {
    cRec.balance = newBalance;
    writeCustomer(&cRec);
  }
  else
    fclose(cFile);
}
This will fix the problem—the file will now get closed regardless of the
new balance—but the fix now means that three routines are coupled
through the global cFile. We’re falling into a trap, and things are going
to start going downhill rapidly if we continue on this course.
3. For a discussion of the dangers of coupled code, see Decoupling and the Law of
Demeter, page 138.
The finish what you start tip tells us that, ideally, the routine that allo-
cates a resource should also free it. We can apply it here by refactoring
the code slightly:
Now all the responsibility for the file is in the updateCustomer routine.
It opens the file and (finishing what it starts) closes it before exiting. The
routine balances the use of the file: the open and close are in the same
place, and it is apparent that for every open there will be a correspond-
ing close. The refactoring also removes an ugly global variable.
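A minimal sketch of the refactoring just described might look like the following. It is a reconstruction under assumptions, not the original listing: the Customer layout, the binary file mode, and the bool return values are all choices made for the example. What matters is that the file handle is a parameter of the helpers and that updateCustomer both opens and closes it.

```cpp
#include <cassert>
#include <cstdio>

struct Customer {
    double balance;
};

// The helpers receive the open file instead of sharing a global.
bool readCustomer(std::FILE* cFile, Customer* cRec) {
    return std::fread(cRec, sizeof(*cRec), 1, cFile) == 1;
}

bool writeCustomer(std::FILE* cFile, const Customer* cRec) {
    std::rewind(cFile);
    return std::fwrite(cRec, sizeof(*cRec), 1, cFile) == 1;
}

bool updateCustomer(const char* fName, double newBalance) {
    assert(fName != nullptr);
    std::FILE* cFile = std::fopen(fName, "r+b");     // allocate the resource
    if (cFile == nullptr) return false;
    Customer cRec;
    bool ok = readCustomer(cFile, &cRec);
    if (ok && newBalance >= 0.0) {
        cRec.balance = newBalance;
        ok = writeCustomer(cFile, &cRec);
    }
    return std::fclose(cFile) == 0 && ok;            // release it on every path
}
```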
Nest Allocations
The basic pattern for resource allocation can be extended for routines
that need more than one resource at a time. There are just two more
suggestions:
This approach has particular benefits when you’re working with lan-
guages such as C++, where exceptions can interfere with resource deal-
location.
void doSomething(void) {
  Node *n = new Node;
  try {
    // do something
  }
  catch (...) {
    delete n;
    throw;
  }
  delete n;
}
Notice that the node we create is freed in two places—once in the rou-
tine’s normal exit path, and once in the exception handler. This is an
obvious violation of the DRY principle and a maintenance problem wait-
ing to happen.
Here we rely on C++ to handle the destruction of the Node object auto-
matically, whether an exception is thrown or not.
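A minimal sketch of the automatic-variable version follows. The Node type here is a hypothetical stand-in that merely counts live instances, and the fail flag stands in for work that may throw; neither comes from the original listing.

```cpp
#include <cassert>
#include <stdexcept>

struct Node {
    static int live;            // number of live nodes, for illustration only
    Node()  { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

void doSomethingOnStack(bool fail) {
    Node n;                     // automatic variable: no new, no delete
    assert(Node::live > 0);
    if (fail) {
        throw std::runtime_error("stand-in for work that fails");
    }
}                               // n is destroyed here on either exit path
```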
If the switch from a pointer is not possible, the same effect can be
achieved by wrapping the resource (in this case, a Node pointer) within
another class.
// Wrapper class for Node resources
class NodeResource {
  Node *n;
public:
  NodeResource() { n = new Node; }
  ~NodeResource() { delete n; }
  Node *operator->() { return n; }
};
void doSomething2(void) {
  NodeResource n;
  try {
    // do something
  }
  catch (...) {
    throw;
  }
}
Now the wrapper class, NodeResource, ensures that when its objects
are destroyed the corresponding nodes are also destroyed. For conve-
nience, the wrapper provides a dereferencing operator ->, so that its
users can access the fields in the contained Node object directly.
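Later versions of the C++ standard library (C++11 onward, so after this text was written) ship a general-purpose wrapper of this kind, std::unique_ptr. The sketch below swaps it in for the hand-written class; the Node type and the function name are hypothetical stand-ins.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

struct Node {
    static int live;            // number of live nodes, for illustration only
    int value = 0;
    Node()  { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

void doSomethingOwned(bool fail) {
    std::unique_ptr<Node> n(new Node);   // the wrapper owns the heap node
    assert(Node::live > 0);
    n->value = 7;                        // operator-> reaches the Node's fields
    if (fail) {
        throw std::runtime_error("stand-in for work that fails");
    }
}                                        // the wrapper's destructor deletes the node
```

Unlike a hand-rolled wrapper, std::unique_ptr also refuses to be copied, which rules out two wrappers deleting the same node.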
data structures. One routine will allocate an area of memory and link
it into some larger structure, where it may stay for some time.
Finally, if keeping track of resources gets tricky, you can write your
own form of limited automatic garbage collection by implementing a
reference counting scheme on your dynamically allocated objects. The
book More Effective C++ [Mey96] dedicates a section to this topic.
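For a feel of what reference counting buys you, the sketch below uses std::shared_ptr, a ready-made counting pointer in the C++11 standard library, rather than a hand-written scheme. The Buffer type and the function name are hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Buffer {
    static int live;            // number of live buffers, for illustration only
    Buffer()  { ++live; }
    ~Buffer() { --live; }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
};
int Buffer::live = 0;

// Returns true if the buffer survives until its last holder lets go.
bool lastHolderFreesBuffer() {
    std::shared_ptr<Buffer> first = std::make_shared<Buffer>();
    std::shared_ptr<Buffer> second = first;      // two holders, count is 2
    assert(second.use_count() == 2);
    first.reset();                               // count drops to 1
    const bool aliveWhileShared = (Buffer::live == 1);
    second.reset();                              // count drops to 0: freed
    return aliveWhileShared && Buffer::live == 0;
}
```

Reference counting does not reclaim cycles, which is one reason it counts only as limited garbage collection.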
At a lower, but no less useful level, you can invest in tools that (among
other things) check your running programs for memory leaks. Purify
(www.rational.com) and Insure++ (www.parasoft.com) are popular
choices.
Challenges
Although there are no guaranteed ways of ensuring that you always free
resources, certain design techniques, when applied consistently, will help.
In the text we discussed how establishing a semantic invariant for major
data structures could direct memory deallocation decisions. Consider how
Design by Contract, page 109, could help refine this idea.
Exercises
22. (Answer on p. 292) Some C and C++ developers make a point of setting a pointer to NULL after
they deallocate the memory it references. Why is this a good idea?
23. (Answer on p. 292) Some Java developers make a point of setting an object variable to NULL
after they have finished using the object. Why is this a good idea?
Bend, or Break
Life doesn’t stand still.
Neither can the code that we write. In order to keep up with today’s
near-frantic pace of change, we need to make every effort to write code
that’s as loose—as flexible—as possible. Otherwise we may find our
code quickly becoming outdated, or too brittle to fix, and may ultimately
be left behind in the mad dash toward the future.
A good way to stay flexible is to write less code. Changing code leaves
you open to the possibility of introducing new bugs. Metaprogramming
will explain how to move details out of the code completely, where they
can be changed more safely and easily.
Armed with these techniques, you can write code that will “roll with the
punches.”
Minimize Coupling
What’s wrong with having modules that know about each other? Noth-
ing in principle—we don’t need to be as paranoid as spies or dissidents.
However, you do need to be careful about how many other modules you
interact with and, more importantly, how you came to interact with
them.
We’d like to follow this same model in software. When we ask an object
for a particular service, we’d like the service to be performed on our
behalf. We do not want the object to give us a third-party object that we
have to deal with to get the required service.
For example, suppose you are writing a class that generates a graph
of scientific recorder data. You have data recorders spread around the
world; each recorder object contains a location object giving its position
and time zone. You want to let your users select a recorder and plot its
data, labeled with the correct time zone. You might write
Rather than digging through a hierarchy yourself, just ask for what you
need directly:
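The contrast can be sketched in C++ as follows. This is an illustration rather than the original listing: the class layout, the string time zone, and the two label functions are hypothetical. The first function walks through the recorder to its location; the second is handed exactly what it needs.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Location {
    std::string timeZone_;

public:
    explicit Location(std::string timeZone) : timeZone_(std::move(timeZone)) {
        assert(!timeZone_.empty());
    }
    const std::string& timeZone() const { return timeZone_; }
};

class Recorder {
    Location location_;

public:
    explicit Recorder(Location location) : location_(std::move(location)) {}
    const Location& location() const { return location_; }
    // Wrapper that forwards the request, so callers need not know about Location.
    const std::string& timeZone() const { return location_.timeZone(); }
};

// Coupled: the plotting code depends on Recorder and on Location.
std::string labelByDigging(const Recorder& recorder) {
    return "Readings (" + recorder.location().timeZone() + ")";
}

// Shy: the plotting code depends only on the value it needs.
std::string labelByAsking(const std::string& timeZone) {
    return "Readings (" + timeZone + ")";
}
```

If Location later stops storing the time zone directly, only the forwarding method on Recorder has to change; labelByAsking and its callers are untouched.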
3. Developers who are afraid to change code because they aren’t sure
what might be affected
Systems with many unnecessary dependencies are very hard (and ex-
pensive) to maintain, and tend to be highly unstable. In order to keep
the dependencies to a minimum, we’ll use the Law of Demeter to design
our methods and functions.
By writing “shy” code that honors the Law of Demeter as much as pos-
sible, we can achieve our objective:
TIP 36
Studies have shown [BBM96] that classes in C++ with larger response
sets are more prone to error than classes with smaller response sets (a
1. If objects all know about each other, then a change to just one object can result
in the other objects needing changes.
The Law of Demeter for functions states that any method of an
object should call only methods belonging to:

class Demeter {
private:
  A *a;
  int func();
public:
  //...
  void example(B& b);
}
void Demeter::example(B& b) {
  C c;
  int f = func();        // itself
Because following the Law of Demeter reduces the size of the response
set in the calling class, it follows that classes designed in this way
will also tend to have fewer errors (see [URL 56] for more papers and
information on the Demeter project).
Using the Law of Demeter will make your code more adaptable and
robust, but at a cost: as a “general contractor,” your module must dele-
gate and manage any and all subcontractors directly, without involving
clients of your module. In practice, this means that you will be writing
a large number of wrapper methods that simply forward the request on
to a delegate. These wrapper methods will impose both a runtime cost
and a space overhead, which may be significant—even prohibitive—in
some applications.
As with any technique, you must balance the pros and cons for your
particular application. In database schema design it is common prac-
tice to “denormalize” the schema for a performance improvement: to
Physical Decoupling
In this section we’re concerned largely with designing to keep things
logically decoupled within systems. However, there is another kind
of interdependence that becomes highly significant as systems grow
larger. In his book Large-Scale C++ Software Design [Lak96], John
Lakos addresses the issues surrounding the relationships among
the files, directories, and libraries that make up a system. Large
projects that ignore these physical design problems wind up with build
cycles that are measured in days and unit tests that may drag in the
entire system as support code, among other problems. Mr. Lakos
argues convincingly that logical and physical design must proceed in
tandem—that undoing the damage done to a large body of code by
cyclic dependencies is extremely difficult. We recommend this book
if you are involved in large-scale developments, even if C++ isn’t your
implementation language.
Challenges
We’ve discussed how using delegation makes it easier to obey the Law of
Demeter and hence reduce coupling. However, writing all of the methods
Exercises
24. (Answer on p. 293) We discussed the concept of physical decoupling in
the box on the facing page. Which of the following C++ header files is more
tightly coupled to the rest of the system?
person1.h:

#include "date.h"
class Person1 {
private:
  Date myBirthdate;
public:
  Person1(Date &birthDate);
  // ...
};

person2.h:

class Date;
class Person2 {
private:
  Date *myBirthdate;
public:
  Person2(Date &birthDate);
  // ...
};
25. (Answer on p. 293) For the example below and for those in Exercises 26
and 27, determine if the method calls shown are allowed according to the
Law of Demeter. This first one is in Java.
public void showBalance(BankAccount acct) {
Money amt = acct.getBalance();
printToScreen(amt.printFormat());
}
27 Metaprogramming
No amount of genius can overcome a preoccupation with detail.
Levy’s Eighth Law
So we say “out with the details!” Get them out of the code. While we’re
at it, we can make our code highly configurable and “soft”—that is,
easily adaptable to changes.
Dynamic Configuration
First, we want to make our systems highly configurable. Not just things
such as screen colors and prompt text, but deeply ingrained items such
as the choice of algorithms, database products, middleware technology,
and user-interface style. These items should be implemented as con-
figuration options, not through integration or engineering.
TIP 37
We use the term in its broadest sense. Metadata is any data that
describes the application—how it should run, what resources it should
use, and so on. Typically, metadata is accessed and used at runtime,
not at compile time. You use metadata all the time—at least your pro-
grams do. Suppose you click on an option to hide the toolbar on your
Metadata-Driven Applications
But we want to go beyond using metadata for simple preferences. We
want to configure and drive the application via metadata as much as
possible. Our goal is to think declaratively (specifying what is to be
done, not how) and create highly dynamic and adaptable programs. We
do this by adopting a general rule: program for the general case, and
put the specifics somewhere else—outside the compiled code base.
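As a small sketch of this rule, assume a hypothetical report engine whose sort order is named by a `report.order` property (the property name and the orderings are our own illustration). The compiled code handles the general case, and the specific choice lives in plain-text metadata.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class ReportEngine {
    // General case: the engine knows how to sort, not which order to use.
    private static final Map<String, Comparator<String>> ORDERINGS = new HashMap<>();
    static {
        ORDERINGS.put("alphabetical", Comparator.<String>naturalOrder());
        ORDERINGS.put("reverse", Comparator.<String>reverseOrder());
        ORDERINGS.put("by-length", Comparator.comparingInt(String::length));
    }

    // The specifics come from metadata; "report.order" is a hypothetical key.
    static String[] sorted(Properties config, String... items) {
        String key = config.getProperty("report.order", "alphabetical");
        Comparator<String> order = ORDERINGS.get(key);
        if (order == null) {
            throw new IllegalArgumentException("Unknown report.order: " + key);
        }
        String[] copy = items.clone();
        Arrays.sort(copy, order);
        return copy;
    }

    // Parses plain-text metadata in java.util.Properties format.
    static Properties parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }
}
```

Changing `report.order=by-length` to `report.order=reverse` in the configuration text changes the behavior without touching the compiled code.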
TIP 38
You can customize the application without recompiling it. You can
also use this level of customization to provide easy work-arounds
for critical bugs in live production systems.
We want to defer definition of most details until the last moment, and
leave the details as soft—as easy to change—as we can. By crafting
a solution that allows us to make changes quickly, we stand a better
chance of coping with the flood of directional shifts that swamp many
projects (see Reversibility, page 44).
Business Logic
So you’ve made the choice of database engine a configuration option,
and provided metadata to determine the user-interface style. Can we
do more? Definitely.
Because business policy and rules are more likely to change than any
other aspect of the project, it makes sense to maintain them in a very
flexible format.
When to Configure
As mentioned in The Power of Plain Text, page 73, we recommend
representing configuration metadata in plain text—it makes life that
much easier.
But when should a program read this configuration? Many programs
will scan such things only at startup, which is unfortunate. If you need
to change the configuration, this forces you to restart the application.
A more flexible approach is to write programs that can reload their
configuration while they’re running. This flexibility comes at a cost: it
is more complex to implement.
So consider how your application will be used: if it is a long-running
server process, you will want to provide some way to reread and apply
metadata while the program is running. For a small client GUI appli-
cation that restarts quickly, you may not need to.
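One way a long-running process might do this, sketched under the assumption that the configuration arrives as Properties-format text (how the process learns that the text has changed is left out). The new settings are parsed completely and then swapped in with a single atomic assignment, so a reader never sees a half-loaded configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder for settings that can be replaced while the program runs.
class LiveConfig {
    private final AtomicReference<Properties> current =
            new AtomicReference<>(new Properties());

    // Parse first, then publish the finished Properties object in one step.
    void reload(String text) {
        Properties fresh = new Properties();
        try {
            fresh.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        current.set(fresh);
    }

    // Readers always see either the old settings or the new ones, never a mix.
    String get(String key, String fallback) {
        return current.get().getProperty(key, fallback);
    }
}
```

Code that calls `get` on every use picks up a reload immediately; code that caches the value at startup would still need a restart, which is part of the extra complexity mentioned above.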
This phenomenon is not limited to application code. We’ve all been
annoyed at operating systems that force us to reboot when we install
some simple application or change an innocuous parameter.
Suppose you want to create some Java software that will participate
in transactions across different machines, between different database
vendors, and with different thread and load-balancing models.
The good news is, you don’t have to worry about all that. You write
a bean—a self-contained object that follows certain conventions—and
place it in a bean container that manages much of the low-level detail
on your behalf. You can write the code for a bean without including any
transaction operations or thread management; EJB uses metadata to
specify how transactions should be handled.
Distributed systems such as EJB are leading the way into a new world
of configurable, dynamic systems.
Cooperative Configuration
We’ve talked about users and developers configuring dynamic applica-
tions. But what happens if you let applications configure each other—
software that adapts itself to its environment? Unplanned, spur-of-the-
moment configuration of existing software is a powerful concept.
Your larger applications probably already have issues with handling dif-
ferent versions of data and different releases of libraries and operating
systems. Perhaps a more dynamic approach will help.
The dodo didn’t adapt to the presence of humans and their livestock
on the island of Mauritius, and quickly became extinct.2 It was the first
documented extinction of a species at the hand of man.
Don’t let your project (or your career) go the way of the dodo.
2. It didn’t help that the settlers beat the placid (read stupid) birds to death with clubs
for sport.
Challenges
For your current project, consider how much of the application might be
moved out of the program itself to metadata. What would the resultant
“engine” look like? Would you be able to reuse that engine in the context
of a different application?
Exercises
28. (Answer on p. 295) Which of the following things would be better
represented as code within a program, and which externally as metadata?
28 Temporal Coupling
What is temporal coupling all about, you may ask. It’s about time.
Workflow
On many projects, we need to model and analyze the users’ workflows
as part of requirements analysis. We’d like to find out what can happen
at the same time, and what must happen in a strict order. One way to
do this is to capture their description of workflow using a notation such
as the UML activity diagram.4
4. For more information on all of the UML diagram types, see [FS97].
TIP 39
For instance, in our blender project (Exercise 17, page 119), users may
initially describe their current workflow as follows.
1. Open blender
2. Open piña colada mix
3. Put mix in blender
4. Measure 1/2 cup white rum
5. Pour in rum
6. Add 2 cups of ice
7. Close blender
8. Liquefy for 2 minutes
9. Open blender
10. Get glasses
11. Get pink umbrellas
12. Serve
Even though they describe these actions serially, and may even per-
form them serially, we notice that many of them could be performed in
parallel, as we show in the activity diagram in Figure 5.2 on the next
page.
[Figure 5.2: UML activity diagram of the blender workflow. Nodes shown:
1. Open blender; 2. Open mix; 4. Measure rum; 7. Close blender; 8. Liquefy;
9. Open blender; 10. Get glasses; 11. Get pink umbrellas; 12. Serve.]
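To make the idea concrete, here is a rough sketch using `CompletableFuture`. The dependency structure is our own assumption about the workflow (the independent preparations start together, blending waits for the ingredients, serving waits for everything), and the step names are simplified.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

class PinaColada {
    // Returns the order in which the steps actually completed.
    static List<String> make() {
        List<String> log = new CopyOnWriteArrayList<>();

        // Independent preparations: none of these depends on another.
        CompletableFuture<Void> mix = CompletableFuture.runAsync(() -> log.add("open mix"));
        CompletableFuture<Void> blender = CompletableFuture.runAsync(() -> log.add("open blender"));
        CompletableFuture<Void> rum = CompletableFuture.runAsync(() -> log.add("measure rum"));
        CompletableFuture<Void> glasses = CompletableFuture.runAsync(() -> log.add("get glasses"));
        CompletableFuture<Void> umbrellas = CompletableFuture.runAsync(() -> log.add("get umbrellas"));

        // Blending must wait for the mix, the open blender, and the rum.
        CompletableFuture<Void> blended = CompletableFuture.allOf(mix, blender, rum)
                .thenRun(() -> log.add("blend"));

        // Serving must wait for the drink, the glasses, and the umbrellas.
        CompletableFuture.allOf(blended, glasses, umbrellas)
                .thenRun(() -> log.add("serve"))
                .join();
        return log;
    }
}
```

The first five entries may appear in any order from run to run, but "blend" always follows its three inputs and "serve" is always last.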
Architecture
We wrote an On-Line Transaction Processing (OLTP) system a few years
ago. At its simplest, all the system had to do was read a request and
process the transaction against the database. But we wrote a three-
tier, multiprocessing distributed application: each component was an
independent entity that ran concurrently with all other components.
While this sounds like more work, it wasn’t: taking advantage of tem-
poral decoupling made it easier to write. Let’s take a closer look at this
project.
[Figure 5.3: OLTP architecture. Components shown: input tasks #1 to #n,
application logic queue, application logic #1 to #n, database queue,
database handler, database.]
The solution that gave us the best performance and cleanest architec-
ture looked something like Figure 5.3.
This example also shows a way to get quick and dirty load balancing
among multiple consumer processes: the hungry consumer model.
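A minimal sketch of the hungry consumer idea, assuming threads within one process rather than separate processes, and a sentinel value of our own choosing to shut the consumers down. No scheduler assigns work: each consumer simply takes the next request from the shared queue whenever it is free.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class HungryConsumers {
    private static final int POISON = -1; // sentinel telling a consumer to stop

    // Returns how many requests were handled by the pool of consumers.
    static int processAll(int requestCount, int consumerCount) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        AtomicInteger handled = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(consumerCount);

        for (int c = 0; c < consumerCount; c++) {
            pool.execute(() -> {
                try {
                    // A free consumer grabs the next request; a busy one does not.
                    while (queue.take() != POISON) {
                        handled.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        for (int i = 0; i < requestCount; i++) queue.add(i);
        for (int c = 0; c < consumerCount; c++) queue.add(POISON); // one per consumer

        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled.get();
    }
}
```

If one consumer gets bogged down on a slow request, the others keep draining the queue, which is where the rough load balancing comes from.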
5. Even though we show the database as a single, monolithic entity, it is not. The
database software is partitioned into several processes and client threads, but this is
handled internally by the database software and isn’t part of our example.
TIP 40
With linear code, it’s easy to make assumptions that lead to sloppy
programming. But concurrency forces you to think through things a bit
more carefully—you’re not alone at the party anymore. Because things
can now happen at the “same time,” you may suddenly see some time-
based dependencies.
Suppose you have a windowing subsystem where the widgets are first
created and then shown on the display in two separate steps. You aren’t
allowed to set state in the widget until it is shown. Depending on how
the code is set up, you may be relying on the fact that no other object
can use the created widget until you’ve shown it on the screen.
But this may not be true in a concurrent system. Objects must always
be in a valid state when called, and they can be called at the most awk-
ward times. You must ensure that an object is in a valid state any time
it could possibly be called. Often this problem shows up with classes
that define separate constructor and initialization routines (where the
constructor doesn’t leave the object in an initialized state). Using class
invariants, discussed in Design by Contract, page 109, will help you
avoid this trap.
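As an illustration, consider a hypothetical Gauge class with no separate initialization routine. The constructor either establishes the invariant (min <= value <= max) or throws, and every later change goes through a method that preserves it, so a caller on any thread finds the object in a valid state.

```java
// Hypothetical class for illustration: there is no init() step to forget.
class Gauge {
    private final int min;
    private final int max;
    private int value;

    // The constructor leaves the object fully initialized or fails outright.
    Gauge(int min, int max, int initial) {
        if (min > max || initial < min || initial > max) {
            throw new IllegalArgumentException("need min <= initial <= max");
        }
        this.min = min;
        this.max = max;
        this.value = initial;
    }

    // Every mutation re-checks the invariant before changing state.
    synchronized void set(int newValue) {
        if (newValue < min || newValue > max) {
            throw new IllegalArgumentException("value out of range");
        }
        value = newValue;
    }

    synchronized int get() {
        return value;
    }
}
```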
Cleaner Interfaces
Thinking about concurrency and time-ordered dependencies can lead
you to design cleaner interfaces as well. Consider the C library routine
strtok, which breaks a string into tokens.
The design of strtok isn’t thread safe,6 but that isn’t the worst part:
look at the time dependency. You must make the first call to strtok
with the variable you want to parse, and all successive calls with a NULL
instead. If you pass in a non-NULL value, it restarts the parse on that
buffer instead. Without even considering threads, suppose you wanted
to use strtok to parse two separate strings at the same time:
char buf1[BUFSIZ];
char buf2[BUFSIZ];
char *p, *q;
strcpy(buf1, "this is a test");
strcpy(buf2, "this ain't gonna work");
p = strtok(buf1, " ");
q = strtok(buf2, " ");
while (p && q) {
  printf("%s %s\n", p, q);
  p = strtok(NULL, " ");
  q = strtok(NULL, " ");
}
6. It uses static data to maintain the current position in the buffer. The static data
isn’t protected against concurrent access, so it isn’t thread safe. In addition, it clobbers
the first argument you pass in, which can lead to some nasty surprises.
The code as shown will not work: there is implicit state retained in
strtok between calls. You have to use strtok on just one buffer at a
time.
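For contrast, here is a sketch of the same task in Java, where each `java.util.StringTokenizer` object carries its own position. This is an analogy rather than a fix to the C code (in C, the POSIX routine `strtok_r` takes a similar approach by making the caller hold the saved position). Because the state is explicit and per object, two parses can be interleaved freely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class PairedTokens {
    // Walks two strings in step, pairing up their tokens.
    static List<String> zip(String first, String second) {
        // Each tokenizer holds its own state, so neither disturbs the other.
        StringTokenizer p = new StringTokenizer(first, " ");
        StringTokenizer q = new StringTokenizer(second, " ");
        List<String> pairs = new ArrayList<>();
        while (p.hasMoreTokens() && q.hasMoreTokens()) {
            pairs.add(p.nextToken() + " " + q.nextToken());
        }
        return pairs;
    }
}
```

Calling `PairedTokens.zip("this is a test", "this ain't gonna work")` pairs the tokens position by position, which the `strtok` version above cannot do.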
TIP 41
Deployment
Once you’ve designed an architecture with an element of concurrency,
it becomes easier to think about handling many concurrent services:
the model becomes pervasive.
Challenges
How many tasks do you perform in parallel when you get ready for work in
the morning? Could you express this in a UML activity diagram? Can you
find some way to get ready more quickly by increasing concurrency?
Early on we are taught not to write a program as a single big chunk, but
that we should “divide and conquer” and separate a program into mod-
ules. Each module has its own responsibilities; in fact, a good definition
of a module (or class) is that it has a single, well-defined responsibility.
We’ll start off with the concept of an event. An event is simply a special
message that says “something interesting just happened” (interesting,
of course, lies in the eye of the beholder). We can use events to signal
changes in one object that some other object may be interested in.
the receiver. In fact, there could be multiple receivers, each one focused
on its own agenda (of which the sender is blissfully unaware).
Publish/Subscribe
Why is it bad to push all the events through a single routine? It vio-
lates object encapsulation—that one routine now has to have intimate
knowledge of the interactions among many objects. It also increases the
coupling—and we’re trying to decrease coupling. Because the objects
themselves have to have knowledge of these events as well, you are
probably going to violate the DRY principle, orthogonality, and perhaps
even sections of the Geneva Convention. You may have seen this kind
of code—it is usually dominated by a huge case statement or multiway
if-then. We can do better.
Objects should be able to register to receive only the events they need,
and should never be sent events they don’t need. We don’t want to
spam our objects! Instead, we can use a publish/subscribe protocol,
illustrated using the UML sequence diagram in Figure 5.4 on the next
page.7
[Figure 5.4: UML sequence diagram of the publish/subscribe protocol between
a Publisher, Subscriber one, and Subscriber two, showing register, notify*,
and unsubscribe messages.]
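A bare-bones sketch of such a protocol, assuming in-process subscribers modeled as `Consumer` callbacks (the class and method names are ours). The publisher knows nothing about its subscribers beyond the fact that they accept an event.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical publisher: subscribers register for the events they want.
class EventPublisher<E> {
    private final List<Consumer<E>> subscribers = new CopyOnWriteArrayList<>();

    void register(Consumer<E> subscriber) { subscribers.add(subscriber); }

    void unsubscribe(Consumer<E> subscriber) { subscribers.remove(subscriber); }

    // Notify every currently registered subscriber.
    void publish(E event) {
        for (Consumer<E> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}

class PubSubDemo {
    // Replays a sequence like the one in the diagram and reports who saw what.
    static String run() {
        EventPublisher<String> publisher = new EventPublisher<>();
        List<String> one = new ArrayList<>();
        List<String> two = new ArrayList<>();
        Consumer<String> subscriberOne = one::add;
        Consumer<String> subscriberTwo = two::add;

        publisher.register(subscriberOne);
        publisher.publish("a");          // only subscriber one is listening
        publisher.register(subscriberTwo);
        publisher.publish("b");          // both receive this
        publisher.unsubscribe(subscriberOne);
        publisher.publish("c");          // only subscriber two remains
        return "one=" + one + " two=" + two;
    }
}
```

`PubSubDemo.run()` returns `one=[a, b] two=[b, c]`: each subscriber receives only the events published while it was registered.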
Model-View-Controller
Suppose you have a spreadsheet application. In addition to the num-
bers in the spreadsheet itself, you also have a graph that displays the
numbers as a bar chart and a running total dialog box that shows the
sum of a column in the spreadsheet.
8. The view and controller are tightly coupled, and in some implementations of MVC
the view and controller are a single component.
TIP 42
Because we have decoupled the model from the view, we simplify the
programming a great deal. You don’t have to think about programming
a tree widget anymore. Instead, you just provide a data source.
Suppose the vice president comes up to you and wants a quick appli-
cation that lets her navigate the company’s organizational chart, which
is held in a legacy database on the mainframe. Just write a wrapper
that takes the mainframe data, presents it as a TreeModel, and voilà:
you have a fully navigable tree widget.
Now you can get fancy and start using the viewer classes; you can
change how nodes are rendered, and use special icons, fonts, or colors.
When the VP comes back and says the new corporate standards dictate
the use of a Skull and Crossbones icon for certain employees, you can
make the changes to TreeCellRenderer without touching any other
code.
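Such a wrapper might look like the following sketch. We assume the legacy data has already been fetched into a map from each manager's name to a list of direct reports, and that it contains no cycles; the OrgChartModel class is hypothetical, while DefaultTreeModel and DefaultMutableTreeNode are the standard Swing classes.

```java
import java.util.List;
import java.util.Map;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.DefaultTreeModel;
import javax.swing.tree.TreeModel;

class OrgChartModel {
    // Presents "manager -> direct reports" data as a Swing TreeModel.
    static TreeModel from(String root, Map<String, List<String>> reports) {
        return new DefaultTreeModel(build(root, reports));
    }

    // Recursively builds one node per person (assumes the data is acyclic).
    private static DefaultMutableTreeNode build(String name,
                                                Map<String, List<String>> reports) {
        DefaultMutableTreeNode node = new DefaultMutableTreeNode(name);
        for (String report : reports.getOrDefault(name, List.of())) {
            node.add(build(report, reports));
        }
        return node;
    }
}
```

Passing the result to `new JTree(model)` gives the navigable widget; nothing in the model code knows how, or whether, it will be displayed.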
Beyond GUIs
While MVC is typically taught in the context of GUI development, it is
really a general-purpose programming technique. The view is an inter-
pretation of the model (perhaps a subset)—it doesn’t need to be graph-
ical. The controller is more of a coordination mechanism, and doesn’t
have to be related to any sort of input device.
Baseball is a unique institution. Where else can you learn such gems of
trivia as “this has become the highest-scoring game played on a Tues-
day, in the rain, under artificial lights, between teams whose names
start with a vowel?” Suppose we were charged with developing software
to support those intrepid announcers who must dutifully report on the
scores, the statistics, and the trivia.
We’ll then have a number of view objects that use these models. One
view might look for runs so it can update the current score. Another
may receive notifications of new batters, and retrieve a brief summary
of their year-to-date statistics. A third viewer may look at the data
and check for new world records. We might even have a trivia viewer,
responsible for coming up with those weird and useless facts that thrill
the viewing public.
[Figure 5.5: Baseball reporting; viewers subscribe to models. Components
shown: Score collector, Scores, Batter stats, Records, Conditions, Trivia,
Display filter, TV feed generator, Web page formatter, Teleprompter.]
But we don’t want to flood the poor announcer with all of these views
directly. Instead, we’ll have each view generate notifications of “inter-
esting” events, and let some higher-level object schedule what gets
shown.9
These viewer objects have suddenly become models for the higher-level
object, which itself might then be a model for different formatting view-
ers. One formatting viewer might create the teleprompter script for the
announcer, another might generate video captions directly on the satel-
lite uplink, another might update the network’s or team’s Web pages
(see Figure 5.5).
9. The fact that a plane flies overhead probably isn’t interesting unless it’s the 100th
plane to fly overhead that night.
model may have many viewers, and one viewer may work with multiple
models.
In the next section, we’ll look at ways of reducing coupling even further
by using a form of publish and subscribe where none of the participants
need know about each other, or call each other directly.
Exercises
29. (Answer on p. 296) Suppose you have an airline reservation system that
includes the concept of a flight:
If you add a passenger to the wait list, they’ll be put on the flight automat-
ically when an opening becomes available.
There’s a massive reporting job that goes through looking for overbooked or
full flights to suggest when additional flights might be scheduled. It works
fine, but it takes hours to run.
30 Blackboards
The writing is on the wall. . .
You may not usually associate elegance with police detectives, pictur-
ing instead some sort of doughnut and coffee cliché. But consider how
detectives might use a blackboard to coordinate and solve a murder
investigation.
Did Humpty really fall, or was he pushed? Each detective may make
contributions to this potential murder mystery by adding facts, state-
ments from witnesses, any forensic evidence that might arise, and so
on. As the data accumulates, a detective might notice a connection and
post that observation or speculation as well. This process continues,
across all shifts, with many different people and agents, until the case
is closed. A sample blackboard is shown in Figure 5.6 on the next page.
Figure 5.6. Someone found a connection between Humpty’s gambling debts and
the phone logs. Perhaps he was getting threatening phone calls.
Different detectives may come and go during the course of the pro-
cess, and may work different shifts.
A blackboard system lets us decouple our objects from each other com-
pletely, providing a forum where knowledge consumers and producers
can exchange data anonymously and asynchronously. As you might
guess, it also cuts down on the amount of code we have to write.
Blackboard Implementations
Computer-based blackboard systems were originally invented for use
in artificial intelligence applications where the problems to be solved
were large and complex—speech recognition, knowledge-based reason-
ing systems, and so on.
With these systems, you can store active Java objects—not just data—
on the blackboard, and retrieve them by partial matching of fields (via
templates and wildcards) or by subtypes. For example, suppose you
had a type Author, which is a subtype of Person. You could search
a blackboard containing Person objects by using an Author template
with a lastName value of “Shakespeare.” You’d get Bill Shakespeare
the author, but not Fred Shakespeare the gardener.
Name     Function
read     Search for and retrieve data from the space.
write    Put an item into the space.
take     Similar to read, but removes the item from the space as well.
notify   Set up a notification to occur whenever an object is written
         that matches the template.
A big advantage of systems such as these is that you have a single, con-
sistent interface to the blackboard. When building a conventional dis-
tributed application, you can spend a great deal of time crafting unique
API calls for every distributed transaction and interaction in the sys-
tem. With the combinatorial explosion of interfaces and interactions,
the project can quickly become a nightmare.
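To show how small that interface can be, here is a toy in-memory blackboard. It is only a sketch: it is not the API of any real tuple-space product, the template is modeled as a type plus a predicate, and notify is left out. The Person and Author classes are the hypothetical types from the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Toy blackboard: one small interface for every producer and consumer.
class Blackboard {
    private final List<Object> items = new ArrayList<>();

    // write: put an item on the board.
    synchronized void write(Object item) { items.add(item); }

    // read: find the first item of the given type matching the template.
    synchronized <T> Optional<T> read(Class<T> type, Predicate<T> template) {
        for (Object item : items) {
            if (type.isInstance(item) && template.test(type.cast(item))) {
                return Optional.of(type.cast(item));
            }
        }
        return Optional.empty();
    }

    // take: like read, but also removes the item from the board.
    synchronized <T> Optional<T> take(Class<T> type, Predicate<T> template) {
        Optional<T> found = read(type, template);
        found.ifPresent(item -> items.remove(item));
        return found;
    }
}

class Person {
    final String firstName;
    final String lastName;
    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

class Author extends Person {
    Author(String firstName, String lastName) { super(firstName, lastName); }
}
```

With Fred the gardener written as a Person and Bill as an Author, `board.read(Author.class, a -> a.lastName.equals("Shakespeare"))` finds Bill and skips Fred, because matching is by subtype as well as by field.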
Application Example
Suppose we are writing a program to accept and process mortgage or
loan applications. The laws that govern this area are odiously com-
plex, with federal, state, and local governments all having their say.
The lender must prove they have disclosed certain things, and must
ask for certain information—but must not ask certain other questions,
and so on, and so on.
Beyond the miasma of applicable law, we also have the following prob-
lems to contend with.
Arrival of new data may raise new questions and policies. Suppose
the credit check comes back with a less than glowing report; now
you need these five extra forms and perhaps a blood sample.
TIP 43
You can accomplish the same results with more brute-force methods,
of course, but you’ll have a more brittle system. When it breaks, all
the king’s horses and all the king’s men might not get your program
working again.
Challenges
Do you use blackboard systems in the real world—the message board by
the refrigerator, or the big whiteboard at work? What makes them effective?
Are messages ever posted with a consistent format? Does it matter?
Exercises
30. (Answer on p. 297) For each of the following applications, would a
blackboard system be appropriate or not? Why?
Coding is not mechanical. If it were, all the CASE tools that people
pinned their hopes on in the early 1980s would have replaced program-
mers long ago. There are decisions to be made every minute—decisions
that require careful thought and judgment if the resulting program is
to enjoy a long, accurate, and productive life.
Developers who don’t actively think about their code are programming
by coincidence—the code might work, but there’s no particular rea-
son why. In Programming by Coincidence, we advocate a more positive
involvement with the coding process.
to test, and you’ll increase the likelihood that it will actually get tested,
a thought we develop in Code That’s Easy to Test.
31 Programming by Coincidence
Do you ever watch old black-and-white war movies? The weary sol-
dier advances cautiously out of the brush. There’s a clearing ahead:
are there any land mines, or is it safe to cross? There aren’t any indica-
tions that it’s a minefield—no signs, barbed wire, or craters. The soldier
pokes the ground ahead of him with his bayonet and winces, expecting
an explosion. There isn’t one. So he proceeds painstakingly through the
field for a while, prodding and poking as he goes. Eventually, convinced
that the field is safe, he straightens up and marches proudly forward,
only to be blown to pieces.
The soldier’s initial probes for mines revealed nothing, but this was
merely lucky. He was led to a false conclusion—with disastrous results.
Fred doesn’t know why the code is failing because he didn’t know why
it worked in the first place. It seemed to work, given the limited “testing”
that Fred did, but that was just a coincidence. Buoyed by false confi-
dence, Fred charged ahead into oblivion. Now, most intelligent people
may know someone like Fred, but we know better. We don’t rely on
coincidences—do we?
Accidents of Implementation
Accidents of implementation are things that happen simply because
that’s the way the code is currently written. You end up relying on
undocumented error or boundary conditions.
Suppose you call a routine with bad data. The routine responds in a
particular way, and you code based on that response. But the author
didn’t intend for the routine to work that way—it was never even con-
sidered. When the routine gets “fixed,” your code may break. In the
most extreme case, the routine you called may not even be designed
to do what you want, but it seems to work okay. Calling things in the
wrong order, or in the wrong context, is a related problem.
paint(g);
invalidate();
validate();
revalidate();
repaint();
paintImmediately(r);
Here it looks like Fred is desperately trying to get something out on the
screen. But these routines were never designed to be called this way;
although they seem to work, that’s really just a coincidence.
To add insult to injury, when the component finally does get drawn,
Fred won’t try to go back and take out the spurious calls. “It works
now, better leave well enough alone. . . .”
It’s easy to be fooled by this line of thought. Why should you take the
risk of messing with something that’s working? Well, we can think of
several reasons:
For code you write that others will call, the basic principles of good
modularization and of hiding implementation behind small, well-docu-
mented interfaces can all help. A well-specified contract (see Design by
Contract, page 109) can help eliminate misunderstandings.
For routines you call, rely only on documented behavior. If you can’t,
for whatever reason, then document your assumption well.
Accidents of Context
You can have “accidents of context” as well. Suppose you are writing a
utility module. Just because you are currently coding for a GUI envi-
ronment, does the module have to rely on a GUI being present? Are you
relying on English-speaking users? Literate users? What else are you
relying on that isn’t guaranteed?
Implicit Assumptions
Coincidences can mislead at all levels—from generating requirements
through to testing. Testing is particularly fraught with false causalities
and coincidental outcomes. It’s easy to assume that X causes Y, but as
we said in Debugging, page 90: don’t assume it, prove it.
TIP 44
Always be aware of what you are doing. Fred let things get slowly
out of hand, until he ended up boiled, like the frog in Stone Soup
and Boiled Frogs, page 7.
Proceed from a plan, whether that plan is in your head, on the back
of a cocktail napkin, or on a wall-sized printout from a CASE tool.
Don’t just test your code, but test your assumptions as well. Don’t
guess; actually try it. Write an assertion to test your assumptions
(see Assertive Programming, page 122). If your assertion is right,
you have improved the documentation in your code. If you discover
your assumption is wrong, then count yourself lucky.
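For instance, a routine that quietly relies on sorted input can state that assumption in code. This is a small sketch with a hypothetical Median class; note that Java checks `assert` statements only when the JVM is started with assertions enabled (the `-ea` flag).

```java
class Median {
    // Assumption made explicit: callers pass a non-empty, ascending array.
    static int of(int[] sorted) {
        assert sorted.length > 0 : "median of an empty array";
        assert isAscending(sorted) : "median requires sorted input";
        return sorted[sorted.length / 2];
    }

    // Helper used by the assertion; true if no element is smaller than its predecessor.
    static boolean isAscending(int[] values) {
        for (int i = 1; i < values.length; i++) {
            if (values[i - 1] > values[i]) {
                return false;
            }
        }
        return true;
    }
}
```

If some caller turns out to pass unsorted data, the assertion fails at the point of the wrong assumption instead of producing a plausible but wrong answer later.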
So next time something seems to work, but you don’t know why, make
sure it isn’t just a coincidence.
Exercises
31. (Answer on p. 298) Can you identify some coincidences in the following
C code fragment? Assume that this code is buried deep in a library routine.
fprintf(stderr,"Error, continue?");
gets(buf);
32. (Answer on p. 298) This piece of C code might work some of the time, on
some machines. Then again, it might not. What’s wrong?
1. You can also go too far here. We once knew a developer who rewrote all source he
was given because he had his own naming conventions.
33. (Answer on p. 299) This code comes from a general-purpose Java tracing
suite. The function writes a string to a log file. It passes its unit test,
but fails when one of the Web developers uses it. What coincidence does it
rely on?
32 Algorithm Speed
In Estimating, page 64, we talked about estimating things such as how
long it takes to walk across town, or how long a project will take to fin-
ish. However, there is another kind of estimating that Pragmatic Pro-
grammers use almost daily: estimating the resources that algorithms
use—time, processor, memory, and so on.
It turns out that these questions can often be answered using common
sense, some analysis, and a way of writing approximations called the
“big O” notation.
If the relationship were always linear (so that the time increased in
direct proportion to the value of $n$), this section wouldn’t be important.
However, most significant algorithms are not linear. The good news is
that many are sublinear. A binary search, for example, doesn’t need to
look at every candidate when finding a match. The bad news is that
The O() Notation
The $O()$ notation is a mathematical way of dealing with approximations.
When we write that a particular sort routine sorts $n$ records in $O(n^2)$
time, we are simply saying that the worst-case time taken will vary
as the square of $n$. Double the number of records, and the time will
increase roughly fourfold. Think of the $O$ as meaning on the order of.
The $O()$ notation puts an upper bound on the value of the thing we’re
measuring (time, memory, and so on). If we say a function takes $O(n^2)$
time, then we know that the upper bound of the time it takes will not
grow faster than $n^2$. Sometimes we come up with fairly complex $O()$
functions, but because the highest-order term will dominate the value
as $n$ increases, the convention is to remove all low-order terms, and not
to bother showing any constant multiplying factors. $O(n^2/2 + 3n)$ is the
same as $O(n^2/2)$, which is equivalent to $O(n^2)$. This is actually a
weakness of the $O()$ notation: one $O(n^2)$ algorithm may be 1,000 times
faster than another $O(n^2)$ algorithm, but you won’t know it from the
notation.
For example, suppose you’ve got a routine that takes 1 s to process 100
records. How long will it take to process 1,000? If your code is $O(1)$, then
it will still take 1 s. If it’s $O(\lg(n))$, then you’ll probably be waiting
about 3 s. $O(n)$ will show a linear increase to 10 s, while an $O(n \lg(n))$
will take some 33 s. If you’re unlucky enough to have an $O(n^2)$ routine,
then sit back for 100 s while it does its stuff. And if you’re using an
exponential
$O(n^2)$: selection sort
$O(n)$: sequential search
$O(\lg(n))$: binary search
$O(1)$: array access
The $O()$ notation doesn’t apply just to time; you can use it to represent
any other resources used by an algorithm. For example, it is often use-
ful to be able to model memory consumption (see Exercise 35 on page
183).
Nested loops. If you nest a loop inside another, then your algorithm
becomes $O(m \times n)$, where $m$ and $n$ are the two loops’ limits.
This commonly occurs in simple sorting algorithms, such as bubble
sort, where the outer loop scans each element in the array in
turn, and the inner loop works out where to place that element in
the sorted result. Such sorting algorithms tend to be $O(n^2)$.
algorithm for five elements: it will take six times longer to run it for
six, and 42 times longer for seven. Examples include algorithms for
many of the acknowledged hard problems—the traveling salesman
problem, optimally packing things into a container, partitioning a
set of numbers so that each set has the same total, and so on. Of-
ten, heuristics are used to reduce the running times of these types
of algorithms in particular problem domains.
TIP 45
There are some approaches you can take to address potential problems.
If you have an algorithm that is $O(n^2)$, try to find a divide and conquer
approach that will take you down to $O(n \lg(n))$.
If you’re not sure how long your code will take, or how much memory
it will use, try running it, varying the input record count or whatever is
likely to impact the runtime. Then plot the results. You should soon get
a good idea of the shape of the curve. Is it curving upward, a straight
line, or flattening off as the input size increases? Three or four points
should give you an idea.
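A minimal sketch of such an experiment, in Java, might look like the following. The class name GrowthTimer, the input sizes, and the choice of Arrays.sort as the routine under test are assumptions made purely for illustration; substitute your own routine and whatever input dimension is likely to matter.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.Consumer;

final class GrowthTimer {
    /** Times one run of the routine on the given input, in milliseconds. */
    static double timeMillis(Consumer<int[]> routine, int[] input) {
        long start = System.nanoTime();
        routine.accept(input);
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    /** Builds n pseudo-random ints; the seed is fixed so runs are repeatable. */
    static int[] randomInput(int n) {
        return new Random(42).ints(n).toArray();
    }

    public static void main(String[] args) {
        // Hypothetical sizes: each step doubles the record count.
        int[] sizes = {100_000, 200_000, 400_000, 800_000};
        for (int n : sizes) {
            double ms = timeMillis(Arrays::sort, randomInput(n));
            System.out.printf("%9d records: %8.2f ms%n", n, ms);
        }
    }
}
```

Plot the printed pairs and look at the shape. Single timings like these are noisy (the first runs include JIT warm-up), so repeat each size a few times before drawing conclusions.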
Also consider just what you’re doing in the code itself. A simple
$O(n^2)$ loop may well perform better than a complex $O(n \lg n)$ one for smaller
values of $n$.
In the middle of all this theory, don’t forget that there are practical
considerations as well. Runtime may look like it increases linearly for
small input sets. But feed the code millions of records and suddenly
the time degrades as the system starts to thrash. If you test a sort
routine with random input keys, you may be surprised the first time
it encounters ordered input. Pragmatic Programmers try to cover both
the theoretical and practical bases. After all this estimating, the only
timing that counts is the speed of your code, running in the production
environment, with real data.² This leads to our next tip.
TIP 46
If it’s tricky getting accurate timings, use code profilers to count the
number of times the different steps in your algorithm get executed,
and plot these figures against the size of the input.
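When no profiler is at hand, a counter placed in the inner step gives the same kind of figure. The sketch below is an illustration only: StepCounter and countComparisons are hypothetical names, and selection sort stands in for whatever algorithm you are studying.

```java
final class StepCounter {
    /** Selection-sorts a copy of the input and returns the number of comparisons made. */
    static long countComparisons(int[] input) {
        int[] a = input.clone();
        long comparisons = 0;
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                comparisons++;                  // the step being counted
                if (a[j] < a[min]) min = j;
            }
            int tmp = a[i]; a[i] = a[min]; a[min] = tmp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        // Print the count for several input sizes, ready for plotting.
        for (int n = 1000; n <= 8000; n *= 2) {
            System.out.println(n + " elements: "
                    + countComparisons(new int[n]) + " comparisons");
        }
    }
}
```

For this sort the count is n(n-1)/2 whatever the data, so doubling the input roughly quadruples the figure, which is the signature of an $O(n^2)$ routine. Counts, unlike timings, do not vary from run to run.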
2. In fact, while testing the sort algorithms used as an exercise for this section on
a 64MB Pentium, the authors ran out of real memory while running the radix sort with
more than seven million numbers. The sort started using swap space, and times degraded
dramatically.
Challenges
Every developer should have a feel for how algorithms are designed and
analyzed. Robert Sedgewick has written a series of accessible books on the
subject ([Sed83, SF96, Sed92] and others). We recommend adding one of
his books to your collection, and making a point of reading it.
For those who like more detail than Sedgewick provides, read Donald
Knuth’s definitive Art of Computer Programming books, which analyze a
wide range of algorithms [Knu97a, Knu97b, Knu98].
Exercises
34. We have coded a set of simple sort routines, which can be downloaded
from our Web site (www.pragmaticprogrammer.com). Run them on various
machines available to you. Do your figures follow the expected curves?
What can you deduce about the relative speeds of your machines? What
are the effects of various compiler optimization settings? Is the radix sort
indeed linear? (Answer on p. 299)
35. The routine below prints out the contents of a binary tree. Assuming the
tree is balanced, roughly how much stack space will the routine use while
printing a tree of 1,000,000 elements? (Assume that subroutine calls
impose no significant stack overhead.) (Answer on p. 300)
36. Can you see any way to reduce the stack requirements of the routine in
Exercise 35 (apart from reducing the size of the buffer)? (Answer on p. 300)
37. On page 180, we claimed that a binary chop is $O(\lg n)$. Can you prove
this? (Answer on p. 301)
33 Refactoring
Change and decay in all around I see . . .
H. F. Lyte, “Abide With Me”
3. The tenants move in and live happily ever after, calling building
maintenance to fix any problems.
Well, software doesn’t quite work that way. Rather than construction,
software is more like gardening—it is more organic than concrete. You
plant many things in a garden according to an initial plan and condi-
tions. Some thrive, others are destined to end up as compost. You may
move plantings relative to each other to take advantage of the inter-
play of light and shadow, wind and rain. Overgrown plants get split
or pruned, and colors that clash may get moved to more aesthetically
pleasing locations. You pull weeds, and you fertilize plantings that are
in need of some extra help. You constantly monitor the health of the
garden, and make adjustments (to the soil, the plants, the layout) as
needed.
to accomplish too much—it needs to be split into two. Things that don’t
work out as planned need to be weeded or pruned.
Real-World Complications
So you go to your boss or client and say, “This code works, but I need
another week to refactor it.”
Time pressure is often used as an excuse for not refactoring. But this
excuse just doesn’t hold up: fail to refactor now, and there’ll be a far
greater time investment to fix the problem down the road—when there
are more dependencies to reckon with. Will there be more time available
then? Not in our experience.
You might want to explain this principle to the boss by using a medical
analogy: think of the code that needs refactoring as a “growth.” Remov-
ing it requires invasive surgery. You can go in now, and take it out while
it is still small. Or, you could wait while it grows and spreads—but re-
moving it then will be both more expensive and more dangerous. Wait
even longer, and you may lose the patient entirely.
TIP 47
Keep track of the things that need to be refactored. If you can’t refactor
something immediately, make sure that it gets placed on the schedule.
Make sure that users of the affected code know that it is scheduled to
be refactored and how this might affect them.
2. Make sure you have good tests before you begin refactoring. Run
the tests as often as possible. That way you will know quickly if
your changes have broken anything.
Automatic Refactoring
Historically, Smalltalk users have always enjoyed a class browser
as part of the IDE. Not to be confused with Web browsers, class
browsers let users navigate through and examine class hierarchies
and methods.
Typically, class browsers allow you to edit code, create new methods
and classes, and so on. The next variation on this idea is the refac-
toring browser.
A refactoring browser can semiautomatically perform common refac-
toring operations for you: splitting up a long routine into smaller ones,
automatically propagating changes to method and variable names,
drag and drop to assist you in moving code, and so on.
As we write this book, this technology has yet to appear outside of
the Smalltalk world, but this is likely to change at the same speed
that Java changes—rapidly. In the meantime, the pioneering Small-
talk refactoring browser can be found online at [URL 20].
3. Take short, deliberate steps: move a field from one class to another,
fuse two similar methods into a superclass. Refactoring often in-
volves making many localized changes that result in a larger-scale
change. If you keep your steps small, and test after each step, you
will avoid prolonged debugging.
We’ll talk more about testing at this level in Code That’s Easy to Test,
page 189, and larger-scale testing in Ruthless Testing, page 237, but
Mr. Fowler’s point of maintaining good regression tests is the key to
refactoring with confidence.
So next time you see a piece of code that isn’t quite as it should be, fix
both it and everything that depends on it. Manage the pain: if it hurts
now, but is going to hurt even more later, you might as well get it over
with. Remember the lessons of Software Entropy, page 4: don’t live with
broken windows.
Exercises
38. The following code has obviously been updated several times over the
years, but the changes haven’t improved its structure. Refactor it.
(Answer on p. 302)
if (state == TEXAS) {
    rate = TX_RATE;
    amt = base * TX_RATE;
    calc = 2*basis(amt) + extra(amt)*1.05;
}
else if ((state == OHIO) || (state == MAINE)) {
    rate = (state == OHIO) ? OH_RATE : ME_RATE;
    amt = base * rate;
    calc = 2*basis(amt) + extra(amt)*1.05;
    if (state == OHIO)
        points = 2;
}
else {
    rate = 1;
    amt = base;
    calc = 2*basis(amt) + extra(amt)*1.05;
}
39. The following Java class needs to support a few more shapes. Refactor the
class to prepare it for the additions. (Answer on p. 303)
public class Shape {
    public static final int SQUARE = 1;
    public static final int CIRCLE = 2;
    public static final int RIGHT_TRIANGLE = 3;
    private int shapeType;
    private double size;
    public Shape(int shapeType, double size) {
        this.shapeType = shapeType;
        this.size = size;
    }
    // ... other methods ...
}
40. This Java code is part of a framework that will be used throughout your
project. Refactor it to be more general and easier to extend in the future.
(Answer on p. 303)
Chips are designed to be tested—not just at the factory, not just when
they are installed, but also in the field when they are deployed. More
complex chips and systems may have a full Built-In Self Test (BIST) fea-
ture that runs some base-level diagnostics internally, or a Test Access
Mechanism (TAM) that provides a test harness that allows the external
environment to provide stimuli and collect responses from the chip.
3. The term “Software IC” (Integrated Circuit) seems to have been invented in 1986 by
Cox and Novobilski in their Objective-C book Object-Oriented Programming [CN91].
Unit Testing
Chip-level testing for hardware is roughly equivalent to unit testing
in software—testing done on each module, in isolation, to verify its
behavior. We can get a better feeling for how a module will react in
the big wide world once we have tested it thoroughly under controlled
(even contrived) conditions.
A software unit test is code that exercises a module. Typically, the unit
test will establish some kind of artificial environment, then invoke rou-
tines in the module being tested. It then checks the results that are
returned, either against known values or against the results from pre-
vious runs of the same test (regression testing).
Before we get that far, however, we need to decide what to test at the
unit level. Typically, programmers throw a few random bits of data at
the code and call it tested. We can do much better, using the ideas
behind design by contract.
What does this mean in practice? Let’s look at the square root routine
we first encountered on page 114. Its contract is simple:
require
    argument >= 0;
ensure
    ((Result * Result) - argument).abs <= epsilon*argument;
Armed with this contract, and assuming that our routine does its own
pre- and postcondition checking, we can write a basic test script to
exercise the square root function.
public void testValue(double num, double expected) {
    double result = 0.0;
    try {                          // We may throw a
        result = mySqrt(num);      // precondition exception
    }
    catch (Throwable e) {
        if (num < 0.0)             // If input is < 0, then
            return;                // we're expecting the
        else                       // exception, otherwise
            assert(false);         // force a test failure
    }
    // Use <= (as the contract does) so that expected == 0.0 can pass
    assert(Math.abs(expected-result) <= epsilon*expected);
}
Then we can call this routine to test our square root function:
testValue(-4.0, 0.0);
testValue( 0.0, 0.0);
testValue( 2.0, 1.4142135624);
testValue(64.0, 8.0);
testValue(1.0e7, 3162.2776602);
This is a pretty simple test; in the real world, any nontrivial module is
likely to be dependent on a number of other modules, so how do we go
about testing the combination?
3. A’s contract, which relies on the other contracts but does not di-
rectly expose them
If LinkedList and Sort’s tests passed, but A’s test failed, we can be
pretty sure that the problem is in A, or in A’s use of one of those sub-
components. This technique is a great way to reduce debugging effort:
we can quickly concentrate on the likely source of the problem within
module A, and not waste time reexamining its subcomponents.
TIP 48
Design to Test
When you design a module, or even a single routine, you should design
both its contract and the code to test that contract. By designing code
to pass a test and fulfill its contract, you may well consider bound-
ary conditions and other issues that wouldn’t occur to you otherwise.
There’s no better way to fix errors than by avoiding them in the first
place. In fact, by building the tests before you implement the code, you
get to try out the interface before you commit to it.
By making the test code readily accessible, you are providing developers
who may use your code with two invaluable resources:
It’s convenient, but not always practical, for each class or module to
contain its own unit test. In Java, for example, every class can have its
own main. In all but the application’s main class file, the main routine
can be used to run unit tests; it will be ignored when the application
itself is run. This has the benefit that the code you ship still contains
the tests, which can be used to diagnose problems in the field.
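As a sketch of this arrangement, the class below carries its own tests in main. Everything here is illustrative: the class name Roots, the EPSILON value, and the Newton's-method body of mySqrt are assumptions, since the text never shows how the square root routine is implemented.

```java
final class Roots {
    static final double EPSILON = 1e-9;   // assumed relative tolerance

    /** Square root by Newton's method; a negative argument violates the precondition. */
    static double mySqrt(double x) {
        if (x < 0.0) throw new IllegalArgumentException("argument must be >= 0");
        if (x == 0.0) return 0.0;
        double guess = x >= 1.0 ? x : 1.0;
        for (int i = 0; i < 100; i++) {
            guess = 0.5 * (guess + x / guess);
        }
        return guess;
    }

    /** Returns true if mySqrt behaves as expected for this input. */
    static boolean testValue(double num, double expected) {
        double result;
        try {
            result = mySqrt(num);
        } catch (IllegalArgumentException e) {
            return num < 0.0;              // the exception is right only for negative input
        }
        return Math.abs(expected - result) <= EPSILON * expected;
    }

    /** Unit tests live with the class; an application with its own main never calls this. */
    public static void main(String[] args) {
        double[][] cases = {
            {-4.0, 0.0}, {0.0, 0.0}, {2.0, 1.4142135624}, {64.0, 8.0}, {1.0e7, 3162.2776602}
        };
        int failures = 0;
        for (double[] c : cases) {
            if (!testValue(c[0], c[1])) {
                failures++;
                System.out.println("FAILED: mySqrt(" + c[0] + ")");
            }
        }
        System.out.println(failures == 0 ? "all tests passed" : failures + " test(s) failed");
    }
}
```

Running the class directly executes the tests, while code that merely uses mySqrt is unaffected by the extra main.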
In C++ you can achieve the same effect (at compile time) by using
#ifdef to compile unit test code selectively. For example, here’s a
very simple unit test in C++, embedded in our module, that checks our
square root function using a testValue routine similar to the Java one
defined previously:
#ifdef __TEST__
int main(int argc, char **argv)
{
    argc--; argv++;                 // skip program name
    if (argc < 2) {                 // do standard tests if no args
        testValue(-4.0, 0.0);
        testValue( 0.0, 0.0);
        testValue( 2.0, 1.4142135624);
        testValue(64.0, 8.0);
        testValue(1.0e7, 3162.2776602);
    }
    else {                          // else use args
        double num, expected;
        while (argc >= 2) {
            num = atof(argv[0]);
            expected = atof(argv[1]);
            testValue(num,expected);
            argc -= 2;
            argv += 2;
        }
    }
    return 0;
}
#endif
This unit test will either run a minimal set of tests or, if given argu-
ments, allow you to pass data in from the outside world. A shell script
could use this ability to run a much more complete set of tests.
What do you do if the correct response for a unit test is to exit, or abort
the program? In that case, you need to be able to select the test to
run, perhaps by specifying an argument on the command line. You’ll
But providing unit tests isn’t enough. You must run them, and run
them often. It also helps if the class passes its tests once in a while.
Ad Hoc Testing
During debugging, we may end up creating some particular tests on-
the-fly. These may be as simple as a print statement, or a piece of
code entered interactively in a debugger or IDE environment.
At the end of the debugging session, you need to formalize the ad
hoc test. If the code broke once, it is likely to break again. Don’t just
throw away the test you created; add it to the existing unit test.
For example, using JUnit (the Java member of the xUnit family), we
might write our square root test as follows:
This means you’ll often need to test a piece of software once it has
been deployed, with real-world data flowing through its veins. Unlike a
circuit board or chip, we don’t have test pins in software, but we can
provide various views into the internal state of a module, without using
the debugger (which may be inconvenient or impossible in a production
application).
Log files containing trace messages are one such mechanism. Log mes-
sages should be in a regular, consistent format; you may want to parse
them automatically to deduce processing time or logic paths that the
program took. Poorly or inconsistently formatted diagnostics are just
so much “spew”—they are difficult to read and impractical to parse.
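One possible shape for such a format is sketched below. The field layout (timestamp, level, module, message, separated by vertical bars) and the names TraceLog, format, and parse are assumptions for illustration, not a format prescribed here.

```java
import java.time.Instant;

final class TraceLog {
    /** Formats one record as: timestamp|LEVEL|module|message */
    static String format(Instant when, String level, String module, String message) {
        return String.join("|", when.toString(), level, module, message);
    }

    /** Splits a record into its four fields; the message may itself contain '|'. */
    static String[] parse(String line) {
        return line.split("\\|", 4);
    }

    public static void main(String[] args) {
        String line = format(Instant.now(), "INFO", "billing", "invoice 1042 posted");
        System.out.println(line);
        System.out.println("module = " + parse(line)[2]);
    }
}
```

Because every line has the same leading fields, a script can compute elapsed times from the timestamps or trace the path through modules without guessing at the layout.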
For larger, more complex server code, a nifty technique for providing a
view into its operation is to include a built-in Web server. Anyone can
point a Web browser to the application’s HTTP port (which is usually
on a nonstandard number, such as 8080) and see internal status, log
entries, and possibly even some sort of a debug control panel. This may
sound difficult to implement, but it’s not. Freely available and embed-
dable HTTP Web servers are available in a variety of modern languages.
A good place to start looking is [URL 58].
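To give a feel for how little code this can take, here is a sketch that uses the com.sun.net.httpserver package shipped with the JDK. The class name StatusPage, the /status path, and the recordsProcessed counter are hypothetical; a real application would report whatever internal state matters to it.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

final class StatusPage {
    /** Internal state the application updates as it works. */
    static final AtomicLong recordsProcessed = new AtomicLong();

    /** Starts a status server on the given port (0 picks any free port). */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/status", exchange -> {
            byte[] body = ("records processed: " + recordsProcessed.get() + "\n")
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080);
        System.out.println("Status at http://localhost:"
                + server.getAddress().getPort() + "/status");
    }
}
```

A page like this exposes internal state to anyone who can reach the port, so in production it would normally be restricted to a trusted network or placed behind authentication.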
A Culture of Testing
All software you write will be tested—if not by you and your team,
then by the eventual users—so you might as well plan on testing it
thoroughly. A little forethought can go a long way toward minimizing
maintenance costs and help-desk calls.
Despite its hacker reputation, the Perl community has a very strong
commitment to unit and regression testing. The Perl standard module
installation procedure supports a regression test by invoking
% make test
There’s nothing magic about Perl itself in this regard. Perl makes it
easier to collate and analyze test results to ensure compliance, but the
big advantage is simply that it’s a standard—tests go in a particular
place, and have a certain expected output. Testing is more cultural than
technical; we can instill this testing culture in a project regardless of
the language being used.
TIP 49
Exercises
41. Design a test jig for the blender interface described in the answer to
Exercise 17 on page 289. Write a shell script that will perform a regression test
for the blender. You need to test basic functionality, error and boundary
conditions, and any contractual obligations. What restrictions are placed
on changing the speed? Are they being honored? (Answer on p. 305)
35 Evil Wizards
There’s no denying it—applications are getting harder and harder to
write. User interfaces in particular are becoming increasingly sophis-
ticated. Twenty years ago, the average application would have a glass
teletype interface (if it had an interface at all). Asynchronous terminals
would typically provide a character interactive display, while pollable
devices (such as the ubiquitous IBM 3270) would let you fill in an entire
screen before hitting SEND. Now, users expect graphical user interfaces,
with context-sensitive help, cut and paste, drag and drop, OLE integra-
tion, and MDI or SDI. Users are looking for Web-browser integration and
thin-client support.
All the time the applications themselves are getting more complex. Most
developments now use a multitier model, possibly with some middle-
ware layer or a transaction monitor. These programs are expected to be
dynamic and flexible, and to interoperate with applications written by
third parties.
Developers are struggling to keep up. If we were using the same kind
of tools that produced the basic dumb-terminal applications 20 years
ago, we’d never get anything done.
But using a wizard designed by a guru does not automatically make Joe
developer equally expert. Joe can feel pretty good—he’s just produced
a mass of code and a pretty spiffy-looking program. He just adds in the
specific application functionality and it’s ready to ship. But unless Joe
actually understands the code that has been produced on his behalf,
he’s fooling himself. He’s programming by coincidence. Wizards are a
one-way street—they cut the code for you, and then move on. If the
code they produce isn’t quite right, or if circumstances change and you
need to adapt the code, you’re on your own.
TIP 50
Some people feel that this is an extreme position. They say that develop-
ers routinely rely on things they don’t fully understand—the quantum
mechanics of integrated circuits, the interrupt structure of the proces-
sor, the algorithms used to schedule processes, the code in the supplied
libraries, and so on. We agree. And we’d feel the same about wizards
if they were simply a set of library calls or standard operating system
services that developers could rely on. But they’re not. Wizards gener-
ate code that becomes an integral part of Joe’s application. The wizard
code is not factored out behind a tidy interface—it is interwoven line by
line with functionality that Joe writes.⁴ Eventually, it stops being the
wizard’s code and starts being Joe’s. And no one should be producing
code they don’t fully understand.
Challenges
If you have a GUI-building wizard available, use it to generate a skeleton
application. Go through every line of code it produces. Do you understand
it all? Could you have produced it yourself? Would you have produced it
yourself, or is it doing things you don’t need?
4. However, there are other techniques that help manage complexity. We discuss two,
beans and AOP, in Orthogonality, page 34.
When you think you’ve got the problems solved, you may still not feel
comfortable with jumping in and starting. Is it simple procrastination,
or is it something more? Not Until You’re Ready offers advice on when it
may be prudent to listen to that cautionary voice inside your head.
Starting too soon is one problem, but waiting too long may be even
worse. In The Specification Trap, we’ll discuss the advantages of speci-
fication by example.
With these critical issues sorted out before the project gets under way,
you can be better positioned to avoid “analysis paralysis” and actually
begin your successful project.
It doesn’t quite work that way. Requirements rarely lie on the surface.
Normally, they’re buried deep beneath layers of assumptions, miscon-
ceptions, and politics.
TIP 51
However, very few requirements are as clear-cut, and that’s what makes
requirements analysis complex.
The first statement in the list above may have been stated by the users
as “Only an employee’s supervisors and the personnel department may
view that employee’s records.” Is this statement truly a requirement?
Perhaps today, but it embeds business policy in an absolute statement.
Policies change regularly, so we probably don’t want to hardwire them
into our requirements. Our recommendation is to document these poli-
cies separately from the requirement, and hyperlink the two. Make the
requirement the general statement, and give the developers the policy
information as an example of the type of thing they’ll need to support
in the implementation. Eventually, policy may end up as metadata in
the application.
This is a relatively subtle distinction, but it’s one that will have pro-
found implications for the developers. If the requirement is stated as
“Only personnel can view an employee record,” the developer may end
up coding an explicit test every time the application accesses these
files. However, if the statement is “Only authorized users may access
an employee record,” the developer will probably design and implement
some kind of access control system. When policy changes (and it will),
only the metadata for that system will need to be updated. In fact, gath-
ering requirements in this way naturally leads you to a system that is
well factored to support metadata.
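The difference shows up directly in code. In the sketch below the requirement ("only authorized users may access a record") is a single method, while today's policy is plain data handed to it. The names AccessPolicy and mayView, the role strings, and the simplification of "an employee's supervisors" to a flat role are all assumptions for illustration.

```java
import java.util.Map;
import java.util.Set;

final class AccessPolicy {
    // Resource type -> roles allowed to view it. This is the policy, held as data.
    private final Map<String, Set<String>> viewers;

    AccessPolicy(Map<String, Set<String>> viewers) {
        this.viewers = viewers;
    }

    /** The requirement: only authorized users may access a record. */
    boolean mayView(String role, String resourceType) {
        return viewers.getOrDefault(resourceType, Set.of()).contains(role);
    }

    public static void main(String[] args) {
        // Today's policy; in a real system this would be loaded from configuration.
        AccessPolicy today = new AccessPolicy(
                Map.of("employee-record", Set.of("supervisor", "personnel")));
        System.out.println(today.mayView("personnel", "employee-record"));
        System.out.println(today.mayView("payroll", "employee-record"));
    }
}
```

When the policy changes, only the map changes; no call site that asks mayView needs to be touched.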
for the help desk? Spend a couple of days monitoring the phones with
an experienced support person. Are you automating a manual stock
control system? Work in the warehouse for a week.¹ As well as giving
you insight into how the system will really be used, you’d be amazed at
how the request “May I sit in for a week while you do your job?” helps
build trust and establishes a basis for communication with your users.
Just remember not to get in the way!
TIP 52
The requirements mining process is also the time to start to build a rap-
port with your user base, learning their expectations and hopes for the
system you are building. See Great Expectations, page 255, for more.
Documenting Requirements
So you are sitting down with the users and prying genuine require-
ments from them. You come across a few likely scenarios that describe
what the application needs to do. Ever the professional, you want to
write these down and publish a document that everyone can use as a
basis for discussions—the developers, the end users, and the project
sponsors.
1. Does a week sound like a long time? It really isn’t, particularly when you’re looking
at processes in which management and workers occupy different worlds. Management
will give you one view of how things operate, but when you get down on the floor, you’ll
find a very different reality—one that will take time to assimilate.
A. CHARACTERISTIC INFORMATION
– Goal in context
– Scope
– Level
– Preconditions
– Success end condition
– Failed end condition
– Primary actor
– Trigger
B. MAIN SUCCESS SCENARIO
C. EXTENSIONS
D. VARIATIONS
E. RELATED INFORMATION
– Priority
– Performance target
– Frequency
– Superordinate use case
– Subordinate use cases
– Channel to primary actor
– Secondary actors
– Channel to secondary actors
F. SCHEDULE
G. OPEN ISSUES
A. CHARACTERISTIC INFORMATION
Goal in context: Buyer issues request directly to our company, expects
goods shipped and to be billed.
Scope: Company
Level: Summary
Preconditions: We know buyer, their address, etc.
Success end condition: Buyer has goods, we have money for the goods.
Failed end condition: We have not sent the goods, buyer has not sent
the money.
Primary actor: Buyer, any agent (or computer) acting for the customer
Trigger: Purchase request comes in.
B. MAIN SUCCESS SCENARIO
1. Buyer calls in with a purchase request.
2. Company captures buyer’s name, address, requested goods, etc.
3. Company gives buyer information on goods, prices, delivery dates, etc.
4. Buyer signs for order.
5. Company creates order, ships order to buyer.
6. Company ships invoice to buyer.
7. Buyer pays invoice.
C. EXTENSIONS
3a. Company is out of one of the ordered items: Renegotiate order.
4a. Buyer pays directly with credit card: Take payment by credit card (use
case 44).
7a. Buyer returns goods: Handle returned goods (use case 105).
D. VARIATIONS
1. Buyer may use phone in, fax in, Web order form, electronic interchange.
7. Buyer may pay by cash, money order, check, or credit card.
E. RELATED INFORMATION
Priority: Top
Performance target: 5 minutes for order, 45 days until paid
Frequency: 200/day
Superordinate use case: Manage customer relationship (use case 2).
Subordinate use cases: Create order (15). Take payment by credit card
(44). Handle returned goods (105).
Channel to primary actor: May be phone, file, or interactive
Secondary actors: Credit card company, bank, shipping service
F. SCHEDULE
Due date: Release 1.0
G. OPEN ISSUES
What happens if we have part of the order?
What happens if credit card is stolen?
at hand. But true use cases are textual descriptions, with a hierarchy
and cross-links. Use cases can contain hyperlinks to other use cases,
and they can be nested within each other.
Overspecifying
A big danger in producing a requirements document is being too spe-
cific. Good requirements documents remain abstract. Where require-
ments are concerned, the simplest statement that accurately reflects
the business need is best. This doesn’t mean you can be vague—you
must capture the underlying semantic invariants as requirements, and
document the specific or current work practices as policy.
Seeing Further
The Year 2000 problem is often blamed on short-sighted programmers,
desperate to save a few bytes in the days when mainframes had less
memory than a modern TV remote control.
TIP 53
Does “seeing further” require you to predict the future? No. It means
generating statements such as
The system makes active use of an abstraction of DATEs. The system
will implement DATE services, such as formatting, storage, and math
operations, consistently and universally.
The requirements will specify only that dates are used. It may hint
that some math may be done on dates. It may tell you that dates will
be stored on various forms of secondary storage. These are genuine
requirements for a DATE module or class.
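A DATE module built to that statement could be as small as the sketch below, which keeps formatting, storage, and math in one place. It leans on java.time from the Java standard library; the class name Dates, the display pattern, and the choice of ISO-8601 text for storage are assumptions, not something the requirement dictates.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Locale;

final class Dates {
    private static final DateTimeFormatter DISPLAY =
            DateTimeFormatter.ofPattern("d MMM uuuu", Locale.ENGLISH);

    /** Formatting: one display style for the whole system. */
    static String display(LocalDate date) {
        return date.format(DISPLAY);
    }

    /** Storage: ISO-8601 text, which always carries the full year. */
    static String store(LocalDate date) {
        return date.toString();
    }

    /** Storage: reads back what store() wrote. */
    static LocalDate load(String stored) {
        return LocalDate.parse(stored);
    }

    /** Math: whole days from one date to another. */
    static long daysBetween(LocalDate from, LocalDate to) {
        return ChronoUnit.DAYS.between(from, to);
    }

    public static void main(String[] args) {
        LocalDate eve = LocalDate.of(1999, 12, 31);
        LocalDate newYear = load("2000-01-01");
        System.out.println(display(eve) + " is stored as " + store(eve));
        System.out.println(daysBetween(eve, newYear) + " day(s) to " + store(newYear));
    }
}
```

Because callers never see how a date is stored, a decision such as the width of the year field stays inside one module instead of being scattered through the application.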
It’s easy to get sucked into the “just one more feature” maelstrom, but
by tracking requirements you can get a clearer picture that “just one
more feature” is really the fifteenth new feature added this month.
Maintain a Glossary
As soon as you start discussing requirements, users and domain ex-
perts will use certain terms that have specific meaning to them. They
may differentiate between a “client” and a “customer,” for example. It
would then be inappropriate to use either word casually in the system.
Create and maintain a project glossary—one place that defines all the
specific terms and vocabulary used in a project. All participants in the
project, from end users to support staff, should use the glossary to
ensure consistency. This implies that the glossary needs to be widely
accessible—a good argument for Web-based documentation (more on
that in a moment).
TIP 54
It’s very hard to succeed on a project where the users and develop-
ers refer to the same thing by different names or, even worse, refer to
different things by the same name.
they want. Project sponsors can cruise along at a high level of abstrac-
tion to ensure that business objectives are met. Programmers can use
hyperlinks to “drill down” to increasing levels of detail (even referencing
appropriate definitions or engineering specifications).
Challenges
Can you use the software you are writing? Is it possible to have a good feel
for requirements without being able to use the software yourself?
Exercises
42. Which of the following are probably genuine requirements? Restate those
that are not to make them more useful (if possible). (Answer on p. 307)
Every now and again, you will find yourself embroiled in the middle of a
project when a really tough puzzle comes up: some piece of engineering
that you just can’t get a handle on, or perhaps some bit of code that
is turning out to be much harder to write than you thought. Maybe it
looks impossible. But is it really as hard as it seems?
So you pull on the ring, or try to put the T’s in the box, and quickly
discover that the obvious solutions just don’t work. The puzzle can’t be
solved that way. But even though it’s obvious, that doesn’t stop people
from trying the same thing—over and over—thinking there must be a
way.
Of course, there isn’t. The solution lies elsewhere. The secret to solving
the puzzle is to identify the real (not imagined) constraints, and find
a solution therein. Some constraints are absolute; others are merely
preconceived notions. Absolute constraints must be honored, however
distasteful or stupid they may appear to be. On the other hand, some
apparent constraints may not be real constraints at all. For example,
there’s that old bar trick where you take a brand new, unopened cham-
pagne bottle and bet that you can drink beer out of it. The trick is to
turn the bottle upside down, and pour a small quantity of beer in the
hollow in the bottom of the bottle. Many software problems can be just
as sneaky.
Degrees of Freedom
The popular buzz-phrase “thinking outside the box” encourages us to
recognize constraints that might not be applicable and to ignore them.
But this phrase isn’t entirely accurate. If the “box” is the boundary of
constraints and conditions, then the trick is to find the box, which may
be considerably larger than you think.
For example, can you connect all of the dots in the following puzzle
and return to the starting point with just three straight lines—without
lifting your pen from the paper or retracing your steps [Hol78]?
It’s not whether you think inside the box or outside the box. The prob-
lem lies in finding the box—identifying the real constraints.
TIP 55
By the way, a solution to the Four Posts puzzle is shown on page 307.
That’s when you step back a pace and ask yourself these questions:
Are you trying to solve the right problem, or have you been dis-
tracted by a peripheral technicality?
Many times a surprising revelation will come to you as you try to answer
one of these questions. Many times a reinterpretation of the require-
ments can make a whole set of problems go away—just like the Gordian
knot.
All you need are the real constraints, the misleading constraints, and
the wisdom to know the difference.
Challenges
Take a hard look at whatever difficult problem you are embroiled in today.
Can you cut the Gordian knot? Ask yourself the key questions we outlined
above, especially “Does it have to be done this way?”
Were you handed a set of constraints when you signed on to your current
project? Are they all still applicable, and is the interpretation of them still
valid?
Great performers share a trait: they know when to start and when
to wait. The diver stands on the high-board, waiting for the perfect
moment to jump. The conductor stands before the orchestra, arms
raised, until she senses that the moment is right to start the piece.
You are a great performer. You too need to listen to the voice that whis-
pers “wait.” If you sit down to start typing and there’s some nagging
doubt in your mind, heed it.
TIP 56
As a developer, you’ve been doing the same kind of thing during your
entire career. You’ve been trying things and seeing which worked and
which didn’t. You’ve been accumulating experience and wisdom. When
you feel a nagging doubt, or experience some reluctance when faced
with a task, heed it. You may not be able to put your finger on exactly
what’s wrong, but give it time and your doubts will probably crystal-
lize into something more solid, something you can address. Software
development is still not a science. Let your instincts contribute to your
performance.
starting. So how can you tell when you’re simply procrastinating, rather
than responsibly waiting for all the pieces to fall into place?
On the other hand, as the prototype progresses you may have one of
those moments of revelation when you suddenly realize that some basic
premise was wrong. Not only that, but you’ll see clearly how you can put
it right. You’ll feel comfortable abandoning the prototype and launching
into the project proper. Your instincts were right, and you’ve just saved
yourself and your team a considerable amount of wasted effort.
Challenges
Discuss the fear-of-starting syndrome with your colleagues. Do others ex-
perience the same thing? Do they heed it? What tricks do they use to
overcome it? Can a group help overcome an individual’s reluctance, or is
that just peer pressure?
The problem is that many designers find it difficult to stop. They feel
that unless every little detail is pinned down in excruciating detail they
haven’t earned their daily dollar.
This is a mistake for several reasons. First, it’s naive to assume that a
specification will ever capture every detail and nuance of a system or its
requirement. In restricted problem domains, there are formal methods
that can describe a system, but they still require the designer to explain
the meaning of the notation to the end users—there is still a human
interpretation going on to mess things up. Even without the problems
inherent in this interpretation, it is very unlikely that the average user
knows going into a project exactly what they need. They may say they
have an understanding of the requirement, and they may sign off on
the 200-page document you produce, but you can guarantee that once
they see the running system you’ll be inundated with change requests.
Here’s a challenge for you. Write a short description that tells someone
how to tie bows in their shoelaces. Go on, try it!
If you are anything like us, you probably gave up somewhere around
“now roll your thumb and forefinger so that the free end passes under
and inside the left lace. . . .” It is a phenomenally difficult thing to do.
And yet most of us can tie our shoes without conscious thought.
TIP 57
Finally, there is the straitjacket effect. A design that leaves the coder
no room for interpretation robs the programming effort of any skill and
art. Some would say this is for the best, but they’re wrong. Often, it is
only during coding that certain options become apparent. While coding,
you may think “Look at that. Because of the particular way I coded this
routine, I could add this additional functionality with almost no effort” or
“The specification says to do this, but I could achieve an almost identical
result by doing it a different way, and I could do it in half the time.”
Clearly, you shouldn’t just hack in and make the changes, but you
wouldn’t even have spotted the opportunity if you were constrained by
an overly prescriptive design.
2. There are some formal techniques that attempt to express operations algebraically,
but these techniques are rarely used in practice. They still require that the analysts
explain the meaning to the end users.
flow directly into the next, with no artificial boundaries. You’ll find that
a healthy development process encourages feedback from implementa-
tion and testing into the specification process.
Challenges
The shoelace example mentioned in the text is an interesting illustration
of the problems of written descriptions. Did you consider describing the
process using diagrams rather than words? Photographs? Some formal
notation from topology? Models with wire laces? How would you teach a
toddler?
3. Detailed specifications are clearly appropriate for life-critical systems. We feel they
should also be produced for interfaces and libraries used by others. When your entire
output is seen as a set of routine calls, you’d better make sure those calls are well
specified.
Don’t get us wrong. We like (some) formal techniques and methods. But
we believe that blindly adopting any technique without putting it into
the context of your development practices and capabilities is a recipe
for disappointment.
TIP 58
Don’t give in to the false authority of a method. People may walk into
meetings with an acre of class diagrams and 150 use cases, but all
that paper is still just their fallible interpretation of requirements and
design. Try not to think about how much a tool cost when you look at
its output.
TIP 59
Challenges
Use case diagrams are part of the UML process for gathering requirements
(see The Requirements Pit, page 202). Are they an effective way of commu-
nicating with your users? If not, why are you using them?
How can you tell if a formal method is bringing your team benefits? What
can you measure? What constitutes an improvement? Can you distinguish
between benefits of the tool and increased experience on the part of team
members?
Where is the break-even point for introducing new methods to your team?
How do you evaluate the trade-off between future benefits and current
losses of productivity as the tool is introduced?
Are tools that work for large projects good for small ones? How about the
other way around?
Pragmatic Projects
As your project gets under way, we need to move away from issues
of individual philosophy and coding to talk about larger, project-sized
issues. We aren’t going to go into specifics of project management, but
we will talk about a handful of critical areas that can make or break
any project.
As soon as you have more than one person working on a project, you
need to establish some ground rules and delegate parts of the project
accordingly. In Pragmatic Teams, we’ll show how to do this while hon-
oring the pragmatic philosophy.
The only thing that developers dislike more than testing is documenta-
tion. Whether you have technical writers helping you or are doing it on
your own, we’ll show you how to make the chore less painful and more
productive in It’s All Writing.
The last tip in the book is a direct consequence of all the rest. In Pride
and Prejudice, we encourage you to sign your work, and to take pride
in what you do.
41 Pragmatic Teams
At Group L, Stoffel oversees six first-rate programmers, a managerial challenge
roughly comparable to herding cats.
The Washington Post Magazine, June 9, 1985
No Broken Windows
Quality is a team issue. The most diligent developer placed on a team
that just doesn’t care will find it difficult to maintain the enthusiasm
needed to fix niggling problems. The problem is further exacerbated
if the team actively discourages the developer from spending time on
these fixes.
Boiled Frogs
Remember the poor frog in the pan of water, back in Stone Soup and
Boiled Frogs, page 7? It doesn’t notice the gradual change in its en-
vironment, and ends up cooked. The same can happen to individuals
who aren’t vigilant. It can be difficult to keep an eye on your overall
environment in the heat of project development.
It’s even easier for teams as a whole to get boiled. People assume that
someone else is handling an issue, or that the team leader must have
OK’d a change that your user is requesting. Even the best-intentioned
teams can be oblivious to significant changes in their projects.
Fight this. Make sure everyone actively monitors the environment for
changes. Maybe appoint a chief water tester. Have this person check
constantly for increased scope, decreased time scales, additional fea-
tures, new environments—anything that wasn’t in the original agree-
ment. Keep metrics on new requirements (see page 209). The team
needn’t reject changes out of hand—you simply need to be aware that
they’re happening. Otherwise, it’ll be you in the hot water.
Communicate
It’s obvious that developers in a team must talk to each other. We gave
some suggestions to facilitate this in Communicate! on page 18. How-
ever, it’s easy to forget that the team itself has a presence within the
organization. The team as an entity needs to communicate clearly with
the rest of the world.
To outsiders, the worst project teams are those that appear sullen and
reticent. They hold meetings with no structure, where no one wants to
talk. Their documents are a mess: no two look the same, and each uses
different terminology.
When the project’s too big for one librarian (or when no one wants
to play the role), appoint people as focal points for various functional
aspects of the work. If people want to talk over date handling, they
should know to talk with Mary. If there’s a database schema issue, see
Fred.
And don’t forget the value of groupware systems and local Usenet news-
groups for communicating and archiving questions and answers.
1. The team speaks with one voice—externally. Internally, we strongly encourage lively,
robust debate. Good developers tend to be passionate about their work.
Orthogonality
Traditional team organization is based on the old-fashioned waterfall
method of software construction. Individuals are assigned roles based
on their job function. You’ll find business analysts, architects, design-
ers, programmers, testers, documenters, and the like.2 There is an
implicit hierarchy here—the closer to the user you’re allowed, the more
senior you are.
TIP 60
Functionality here does not necessarily mean end-user use cases. The
database access layer counts, as does the help subsystem. We’re look-
ing for cohesive, largely self-contained teams of people—exactly the
How does this functional style of organization help? Organize our re-
sources using the same techniques we use to organize code, using
techniques such as contracts (Design by Contract, page 109), decou-
pling (Decoupling and the Law of Demeter, page 138), and orthogonality
(Orthogonality, page 34), and we help isolate the team as a whole from
the effects of change. If the user suddenly decides to change database
vendors, only the database team should be affected. Should marketing
suddenly decide to use an off-the-shelf tool for the calendar function,
the calendar group takes a hit. Properly executed, this kind of group
approach can dramatically reduce the number of interactions between
individuals’ work, reducing time scales, increasing quality, and cutting
down on the number of defects. This approach can also lead to a more
committed set of developers. Each team knows that they alone are re-
sponsible for a particular function, so they feel more ownership of their
output.
This type of team organization is similar in spirit to the old chief pro-
grammer team concept, first documented in 1972 [Bak72].
Automation
A great way to ensure both consistency and accuracy is to automate
everything the team does. Why lay code out manually when your editor
can do it automatically as you type? Why complete test forms when the
overnight build can run tests automatically?
Challenges
Look around for successful teams outside the area of software develop-
ment. What makes them successful? Do they use any of the processes
discussed in this section?
Next time you start a project, try convincing people to brand it. Give your
organization time to become used to the idea, and then do a quick audit to
see what difference it made, both within the team and externally.
Team Algebra: In school, we are given problems such as “If it takes 4 work-
ers 6 hours to dig a ditch, how long would it take 8 workers?” In real life,
however, what factors affect the answer to: “If it takes 4 programmers 6
months to develop an application, how long would it take 8 programmers?”
In how many scenarios is the time actually reduced?
42 Ubiquitous Automation
Civilization advances by extending the number of important operations we can
perform without thinking.
Alfred North Whitehead
All on Automatic
We were once at a client site where all the developers were using the
same IDE. Their system administrator gave each developer a set of
instructions on installing add-on packages to the IDE. These instruc-
tions filled many pages—pages full of click here, scroll there, drag this,
double-click that, and do it again.
TIP 61
Using cron, we can schedule backups, the nightly build, Web site main-
tenance, and anything else that needs to be done—unattended, auto-
matically.
Generating Code
In The Evils of Duplication, page 26, we advocated generating code to
derive knowledge from common sources. We can exploit make’s depen-
dency analysis mechanism to make this process easy. It’s a pretty sim-
ple matter to add rules to a makefile to generate a file from some other
source automatically. For example, suppose we wanted to take an XML
file, generate a Java file from it, and compile the result.
Type make test.class, and make will automatically look for a file
named test.xml, build a .java file by running a Perl script, and then
compile that file to produce test.class.
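The timestamp comparison behind this kind of rule can be sketched in a few lines of Python. This is only an illustration of the idea, not how make itself is implemented. The names is_out_of_date and rebuild_if_needed, and the notion of passing the generator in as a callable, are our assumptions for the example.

```python
import os
from typing import Callable


def is_out_of_date(source: str, target: str) -> bool:
    """Return True when the target is missing or older than its source."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(target) < os.path.getmtime(source)


def rebuild_if_needed(source: str, target: str,
                      generate: Callable[[str, str], None]) -> bool:
    """Run generate(source, target) only when the target is stale.

    Returns True if the generator ran, False if the target was up to date.
    """
    if is_out_of_date(source, target):
        generate(source, target)
        return True
    return False
```

Chaining two such steps (XML to Java, then Java to class file) gives the behavior described above: nothing is regenerated unless something upstream has changed.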
We can use the same sort of rules to generate source code, header files,
or documentation automatically from some other form as well (see Code
Generators, page 102).
Regression Tests
You can also use the makefile to run regression tests for you, either for
an individual module or for an entire subsystem. You can easily test
the entire project with just one command at the top of the source tree,
or you can test an individual module by using the same command in a
single directory. See Ruthless Testing, page 237, for more on regression
testing.
Recursive make
Many projects set up recursive, hierarchical makefiles for project
builds and testing. But be aware of some potential problems.
make calculates dependencies between the various targets it has to
build. But it can analyze only the dependencies that exist within one
single make invocation. In particular, a recursive make has no knowl-
edge of dependencies that other invocations of make may have. If
you are careful and precise, you can get the proper results, but it’s
easy to cause extra work unnecessarily—or miss a dependency and
not recompile when it’s needed.
In addition, build dependencies may not be the same as test depen-
dencies, and you may need separate hierarchies.
Build Automation
A build is a procedure that takes an empty directory (and a known com-
pilation environment) and builds the project from scratch, producing
whatever you hope to produce as a final deliverable—a CD-ROM mas-
ter image or a self-extracting archive, for instance. Typically a project
build will encompass the following steps.
3. If you are producing a CD-ROM in ISO9660 format, for example, you would run the
program that produces a bit-for-bit image of the 9660 file system. Why wait until the
night before you ship to make sure it works?
For most projects, this level of build is run automatically every night.
In this nightly build, you will typically run more complete tests than an
individual might run while building some specific portion of the project.
The important point is to have the full build run all available tests. You
want to know if a regression test failed because of one of today’s code
changes. By identifying the problem close to the source, you stand a
better chance of finding and fixing it.
When you don’t run tests regularly, you may discover that the appli-
cation broke due to a code change made three months ago. Good luck
finding that one.
Final Builds
Final builds, which you intend to ship as products, may have different
requirements from the regular nightly build. A final build may require
that the repository be locked, or tagged with the release number, that
optimization and debug flags be set differently, and so on. We like to
use a separate make target (such as make final) that sets all of these
parameters at once.
Automatic Administrivia
Wouldn’t it be nice if programmers could actually devote all of their time
to programming? Unfortunately, this is rarely the case. There is e-mail
to be answered, paperwork to be filled out, documents to be posted to
the Web, and so on. You may decide to create a shell script to do some
of the dirty work, but you still have to remember to run the script when
needed.
Because memory is the second thing you lose as you age,4 we don’t
want to rely on it too heavily. We can run scripts to perform proce-
dures for us automatically, based on the content of source code and
documents. Our goal is to maintain an automatic, unattended, content-
driven workflow.
Approval Procedures
Some projects have various administrative workflows that must be fol-
lowed. For instance, code or design reviews need to be scheduled and
followed through, approvals may need to be granted, and so on. We can
use automation—and especially the Web site—to help ease the paper-
work burden.
A simple script could go through all of the source code and look for
all files that had a status of needs_review, indicating that they were
ready to be reviewed. You could then post a list of those files as a
You can set up a form on a Web page for the reviewers to register
approval or disapproval. After the review, the status can be automat-
ically changed to reviewed. Whether you have a code walk-through
with all the participants is up to you; you can still do the paperwork
automatically. (In an article in the April 1999 CACM, Robert Glass sum-
marizes research that seems to indicate that, while code inspection is
effective, conducting reviews in meetings is not [Gla99a].)
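A script of this kind might look something like the following Python sketch. The marker format is purely an assumption on our part: we pretend each source file carries a comment line such as "Status: needs_review". The function names are likewise hypothetical.

```python
import os
import re
from typing import List, Optional

# Hypothetical marker format: a comment line such as "// Status: needs_review".
STATUS_PATTERN = re.compile(r"Status:\s*(\w+)")


def file_status(path: str) -> Optional[str]:
    """Return the first status marker found in the file, or None."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            match = STATUS_PATTERN.search(line)
            if match:
                return match.group(1)
    return None


def files_needing_review(root: str) -> List[str]:
    """Walk the source tree and list files marked needs_review, sorted."""
    pending = []
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(directory, name)
            if file_status(path) == "needs_review":
                pending.append(path)
    return sorted(pending)
```

Run from cron, the resulting list could be turned into a Web page or a mail message without anyone having to remember to do it.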
But we have all the raw materials we need to craft better tools. We
have cron. We have make, Ant, and CruiseControl for automation (see
[Cla04]). And we have Ruby, Perl, and other high-level scripting lan-
guages for quickly developing custom tools, Web page generators, code
generators, test harnesses, and so on.
Challenges
Look at your habits throughout the workday. Do you see any repetitive
tasks? Do you type the same sequence of commands over and over again?
Try writing a few shell scripts to automate the process. Do you always click
on the same sequence of icons repeatedly? Can you create a macro to do
all that for you?
How much of your project paperwork can be automated? Given the high
expense of programming staff,5 determine how much of the project’s bud-
get is being wasted on administrative procedures. Can you justify the
amount of time it would take to craft an automated solution based on
the overall cost savings it would achieve?
43 Ruthless Testing
Most developers hate testing. They tend to test gently, subconsciously
knowing where the code will break and avoiding the weak spots. Prag-
matic Programmers are different. We are driven to find our bugs now,
so we don’t have to endure the shame of others finding our bugs later.
Finding bugs is somewhat like fishing with a net. We use fine, small
nets (unit tests) to catch the minnows, and big, coarse nets (integration
tests) to catch the killer sharks. Sometimes the fish manage to escape,
so we patch any holes that we find, in hopes of catching more and more
slippery defects that are swimming about in our project pool.
TIP 62
5. For estimating purposes, you can figure an industry average of about US$100,000
per head—that’s salary plus benefits, training, office space and overhead, and so on.
Many teams develop elaborate test plans for their projects. Sometimes
they will even use them. But we’ve found that teams that use auto-
mated tests have a much better chance of success. Tests that run with
every build are much more effective than test plans that sit on a shelf.
In fact, a good project may well have more test code than production
code. The time it takes to produce this test code is worth the effort. It
ends up being much cheaper in the long run, and you actually stand a
chance of producing a product with close to zero defects.
Additionally, knowing that you’ve passed the test gives you a high de-
gree of confidence that a piece of code is “done.”
TIP 63
Just because you have finished hacking out a piece of code doesn’t
mean you can go tell your boss or your client that it’s done. It’s not.
First of all, code is never really done. More importantly, you can’t claim
that it is usable by anyone until it passes all of the available tests.
What to Test
There are several major types of software testing that you need to per-
form:
Unit testing
Integration testing
Validation and verification
6. eXtreme Programming [URL 45] calls this concept “continuous integration, relent-
less testing.”
Unit Testing
A unit test is code that exercises a module. We covered this topic by
itself in Code That’s Easy to Test, page 189. Unit testing is the founda-
tion of all the other forms of testing that we’ll discuss in this section.
If the parts don’t work by themselves, they probably won’t work well
together. All of the modules you are using must pass their own unit
tests before you can proceed.
Once all of the pertinent modules have passed their individual tests,
you’re ready for the next stage. You need to test how all the modules
use and interact with each other throughout the system.
Integration Testing
Integration testing shows that the major subsystems that make up the
project work and play well with each other. With good contracts in place
and well tested, any integration issues can be detected easily. Other-
wise, integration becomes a fertile breeding ground for bugs. In fact, it
is often the single largest source of bugs in the system.
how they differ from developer test data (for an example, see the story
about brush strokes on page 92).
Memory
Disk space
CPU bandwidth
Wall-clock time
Disk bandwidth
Network bandwidth
Color palette
Video resolution
You might actually check for disk space or memory allocation failures,
but how often do you test for the others? Will your application fit on
a low-resolution screen with a limited color palette? Will it run on a
high-resolution screen with deep color without looking like a postage
stamp? Will the batch job finish before the archive starts?
When the system does fail,7 will it fail gracefully? Will it try, as best
it can, to save its state and prevent loss of work? Or will it “GPF” or
“core-dump” in the user’s face?
7. Our copy editor wanted us to change this sentence to “If the system does fail...”
We resisted.
Performance Testing
Performance testing, stress testing, or testing under load may be an
important aspect of the project as well.
Usability Testing
Usability testing is different from the types of testing discussed so far.
It is performed with real users, under real environmental conditions.
How to Test
We’ve looked at what to test. Now we’ll turn our attention to how to
test, including:
Regression testing
Test data
Exercising GUI systems
Testing the tests
Testing thoroughly
Design/Methodology Testing
Can you test the design of the code itself and the methodology you
used to build the software? After a fashion, yes you can. You do this
by analyzing metrics—measurements of various aspects of the code.
The simplest metric (and often the least interesting) is lines of code—
how big is the code itself?
There are a wide variety of other metrics you can use to examine
code, including:
McCabe Cyclomatic Complexity Metric (measures complexity of
decision structures)
Inheritance fan-in (number of base classes) and fan-out (number
of derived modules using this one as a parent)
Response set (see Decoupling and the Law of Demeter, page
138)
Class coupling ratios (see [URL 48])
Some metrics are designed to give you a “passing grade,” while oth-
ers are useful only by comparison. That is, you calculate these met-
rics for every module in the system and see how a particular module
relates to its brethren. Standard statistical techniques (such as mean
and standard deviation) are usually used here.
If you find a module whose metrics are markedly different from all
the rest, you need to ask yourself if that is appropriate. For some
modules, it may be okay to “blow the curve.” But for those that don’t
have a good excuse, it can indicate potential problems.
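As a rough sketch of the comparison described above, the following Python function flags modules whose metric sits far from the mean. The two-standard-deviation threshold and the name outlier_modules are our assumptions, not a recommended standard, and the metric values in the usage note are invented for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, List


def outlier_modules(metrics: Dict[str, float],
                    threshold: float = 2.0) -> List[str]:
    """Return modules whose metric lies more than `threshold`
    standard deviations away from the mean of all modules."""
    values = list(metrics.values())
    if len(values) < 2:
        return []
    average = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # Every module has the same value, so nothing stands out.
        return []
    return sorted(name for name, value in metrics.items()
                  if abs(value - average) / spread > threshold)
```

Given made-up complexity scores of 4, 5, 6, 5, 4, 5, and 6 for seven modules and 40 for an eighth, only the eighth is reported. Whether it has a good excuse is still a question for a human.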
Regression Testing
A regression test compares the output of the current test with previous
(or known) values. We can ensure that bugs we fixed today didn’t break
things that were working yesterday. This is an important safety net,
and it cuts down on unpleasant surprises.
All of the tests we’ve mentioned so far can be run as regression tests,
ensuring that we haven’t lost any ground as we develop new code. We
can run regressions to verify performance, contracts, validity, and so
on.
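One simple way to compare against known values is to keep a baseline file per test. The Python sketch below assumes the output can be serialized as JSON and that recording the first run as the baseline is acceptable. The name check_regression and the file layout are hypothetical.

```python
import json
import os
from typing import Any, Callable


def check_regression(name: str, produce: Callable[[], Any],
                     baseline_dir: str) -> bool:
    """Compare produce() against the stored baseline for `name`.

    On the first run no baseline exists, so the current output is
    recorded and the check passes. Later runs pass only if the
    output is unchanged.
    """
    path = os.path.join(baseline_dir, name + ".json")
    # Round-trip through JSON so tuples and lists compare consistently.
    current = json.loads(json.dumps(produce(), sort_keys=True))
    if not os.path.exists(path):
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(current, handle, sort_keys=True)
        return True
    with open(path, encoding="utf-8") as handle:
        expected = json.load(handle)
    return current == expected
```

A recorded baseline should be reviewed by a person before it is trusted. Otherwise the test only proves that the code is as wrong today as it was yesterday.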
Test Data
Where do we get the data to run all these tests? There are only two
kinds of data: real-world data and synthetic data. We actually need
to use both, because the different natures of these kinds of data will
expose different bugs in our software.
Real-world data comes from some actual source. Possibly it has been
collected from an existing system, a competitor’s system, or a prototype
of some sort. It represents typical user data. The big surprises come as
you discover what typical means. This is most likely to reveal defects
and misunderstandings in requirements analysis.
You need a lot of data, possibly more than any real-world sample
could provide. You might be able to use the real-world data as a
seed to generate a larger sample set, and tweak certain fields that
need to be unique.
You need data to stress the boundary conditions. This data may
be completely synthetic: date fields containing February 29, 1999,
huge record sizes, or addresses with foreign postal codes.
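Boundary data of this sort is easy to generate and check in code. In the Python sketch below, the list of dates and the names parse_date and rejected_dates are our own illustration. The calendar rules come from Python's standard datetime module.

```python
from datetime import date
from typing import List, Optional, Tuple


def parse_date(year: int, month: int, day: int) -> Optional[date]:
    """Return a date, or None when the combination is not a real day."""
    try:
        return date(year, month, day)
    except ValueError:
        return None


# Synthetic boundary cases, chosen to sit on the edges of the calendar rules.
BOUNDARY_DATES: List[Tuple[int, int, int]] = [
    (1999, 2, 29),   # 1999 is not a leap year
    (2000, 2, 29),   # divisible by 400, so a leap year
    (1900, 2, 29),   # divisible by 100 but not 400, so not a leap year
    (1999, 12, 31),  # last day of the year
    (1999, 4, 31),   # April has only 30 days
]


def rejected_dates(cases: List[Tuple[int, int, int]]) -> List[Tuple[int, int, int]]:
    """Return the cases that do not correspond to a real calendar day."""
    return [case for case in cases if parse_date(*case) is None]
```

Feeding the rejected cases to your own date-handling code shows whether it refuses them cleanly or quietly stores nonsense.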
find it, and may fail. Most modern GUI testing tools use a number of
different techniques to get around this problem, and try to adjust to
minor layout differences.
After you have written a test to detect a particular bug, cause the bug
deliberately and make sure the test complains. This ensures that the
test will catch the bug if it happens for real.
TIP 64
If you are really serious about testing, you might want to appoint a
project saboteur. The saboteur’s role is to take a separate copy of the
source tree, introduce bugs on purpose, and verify that the tests will
catch them.
When writing tests, make sure that alarms sound when they should.
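The saboteur's check can be reduced to a toy example. In this Python sketch, add stands in for the real code and sabotaged_add for the copy with a bug introduced on purpose. All of the names are hypothetical.

```python
from typing import Callable


def add(a: int, b: int) -> int:
    """The code under test."""
    return a + b


def sabotaged_add(a: int, b: int) -> int:
    """A deliberately broken copy, as a saboteur might introduce."""
    return a - b


def check_add(func: Callable[[int, int], int]) -> bool:
    """Return True when func behaves like addition on the sample cases."""
    cases = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
    return all(func(*args) == expected for args, expected in cases)


def alarm_sounds() -> bool:
    """The test earns our trust only if it passes the real code
    and fails the sabotaged copy."""
    return check_add(add) and not check_add(sabotaged_add)
```

Note that the (0, 0) case alone would not have caught the sabotage, since subtraction gives the same answer there. A test with only that case would stay silent.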
Testing Thoroughly
Once you are confident that your tests are correct, and are finding
bugs you create, how do you know if you have tested the code base
thoroughly enough?
The short answer is “you don’t,” and you never will. But there are prod-
ucts on the market that can help. These coverage analysis tools watch
your code during testing and keep track of which lines of code have
been executed and which haven’t. These tools help give you a general
feel for how comprehensive your testing is, but don’t expect to see 100%
coverage.
Even if you do happen to hit every line of code, that’s not the whole
picture. What is important is the number of states that your program
may have. States are not equivalent to lines of code. For instance, sup-
pose you have a function that takes two integers, each of which can be
a number from 0 to 999.
/* Deliberately unguarded: divides by zero when a + b == 0. */
int test(int a, int b) {
    return a / (a + b);
}
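To see how states differ from lines, we can enumerate every input pair for a function like this one. The Python sketch below is our own rendering of the same arithmetic. Integer division with // matches C's behavior here because both operands are non-negative. The names ratio and count_failing_states are assumptions for the example.

```python
from typing import Tuple


def ratio(a: int, b: int) -> int:
    """Same arithmetic as the C function, for non-negative inputs."""
    return a // (a + b)


def count_failing_states(limit: int = 1000) -> Tuple[int, int]:
    """Try every (a, b) pair with 0 <= a, b < limit.

    Returns (number of states tried, number that raised an error).
    """
    total = 0
    failures = 0
    for a in range(limit):
        for b in range(limit):
            total += 1
            try:
                ratio(a, b)
            except ZeroDivisionError:
                failures += 1
    return total, failures
```

With the default limit this tries 1,000,000 states and exactly one of them fails: a and b both zero. A single test call would execute every line of the function and could still miss that state.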
TIP 65
Even with good code coverage, the data you use for testing still has a
huge impact, and, more importantly, the order in which you traverse
code may have the largest impact of all.
When to Test
Many projects tend to leave testing to the last minute—right where it
will be cut against the sharp edge of a deadline.8 We need to start much
sooner than that. As soon as any production code exists, it needs to be
tested.
% make test
But some tests may not be easily run on such a frequent basis. Stress
tests, for instance, may require special setup or equipment, and some
hand holding. These tests may be run less often—weekly or monthly,
perhaps. But it is important that they be run on a regular, scheduled
basis. If it can’t be done automatically, then make sure it appears on
the schedule, with all the necessary resources allocated to the task.
If a bug slips through the net of existing tests, you need to add a new
test to trap it next time.
8. dead·line (ded-līn) n. (1864): a line drawn within or around a prison that a prisoner
passes at the risk of being shot. Webster’s Collegiate Dictionary.
TIP 66
Once a human tester finds a bug, it should be the last time a human
tester finds that bug. The automated tests should be modified to check
for that particular bug from then on, every time, with no exceptions, no
matter how trivial, and no matter how much the developer complains
and says, “Oh, that will never happen again.”
Because it will happen again. And we just don’t have the time to go
chasing after bugs that the automated tests could have found for us.
We have to spend our time writing new code—and new bugs.
Challenges
Can you automatically test your project? Many teams are forced to answer
“no.” Why? Is it too hard to define the acceptable results? Won’t this make
it hard to prove to the sponsors that the project is “done”?
Is it too hard to test the application logic independent of the GUI? What
does this say about the GUI? About coupling?
TIP 67
TIP 68
Comments in Code
Producing formatted documents from the comments and declarations
in source code is fairly straightforward, but first we have to ensure that
we actually have comments in the code. Code should have comments,
but too many comments can be just as bad as too few.
Even worse than meaningless names are misleading names. Have you
ever had someone explain inconsistencies in legacy code such as, “The
routine called getData really writes data to disk”? The human brain
will repeatedly foul this up—it’s called the Stroop Effect [Str35]. You
can try the following experiment yourself to see the effects of such in-
terference. Get some colored pens, and use them to write down the
names of colors. However, never write a color name using that color
pen. You could write the word “blue” in green, the word “brown” in red,
and so on. (Alternatively, we have a sample set of colors already drawn
on our Web site at www.pragmaticprogrammer.com.) Once you have
the color names drawn, try to say aloud the color with which each word
is drawn, as fast as you can. At some point you’ll trip up and start read-
ing the names of the colors, and not the colors themselves. Names are
/**
 * Find the peak (highest) value within a specified date
 * range of samples.
 *
 * @param aRange     Range of dates to search for data.
 * @param aThreshold Minimum value to consider.
 * @return the value, or <code>null</code> if no value found
 *         greater than or equal to the threshold.
 */
public Sample findPeak(DateRange aRange, double aThreshold);
A list of the functions exported by code in the file. There are pro-
grams that analyze source for you. Use them, and the list is guar-
anteed to be up to date.
A list of other files this file uses. This can be determined more
accurately using automatic tools.
The name of the file. If it must appear in the file, don’t maintain it
by hand. RCS and similar systems can keep this information up to
date automatically. If you move or rename the file, you don’t want
to have to remember to edit the header.
9. This kind of information, as well as the filename, is provided by the RCS $Id$ tag.
The project may also require certain copyright notices or other legal
boilerplate to appear in each source file. Get your editor to insert these
for you automatically.
Executable Documents
Suppose we have a specification that lists the columns in a database
table. We’ll then have a separate set of SQL commands to create the
actual table in the database, and probably some kind of programming
language record structure to hold the contents of a row in the table.
The same information is repeated three times. Change any one of these
three sources, and the other two are immediately out of date. This is a
clear violation of the DRY principle.
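One way out is to treat the specification as the single source and generate the other two forms from it. The Python sketch below is a minimal illustration under our own assumptions: the column list, the type names, and both generator functions are hypothetical, and a real project would read the specification from the document itself.

```python
from typing import List, Tuple

# Hypothetical specification: (column name, SQL type, Python type name).
Spec = List[Tuple[str, str, str]]

EMPLOYEE_SPEC: Spec = [
    ("id", "INTEGER", "int"),
    ("name", "VARCHAR(80)", "str"),
    ("salary", "DECIMAL(10,2)", "float"),
]


def create_table_sql(table: str, spec: Spec) -> str:
    """Generate the SQL command that creates the table."""
    columns = ",\n".join(f"    {name} {sql_type}"
                         for name, sql_type, _py_type in spec)
    return f"CREATE TABLE {table} (\n{columns}\n);"


def record_class_source(class_name: str, spec: Spec) -> str:
    """Generate source for a record structure that holds one row."""
    fields = "\n".join(f"    {name}: {py_type}"
                       for name, _sql_type, py_type in spec)
    return f"@dataclass\nclass {class_name}:\n{fields}\n"
```

Add a column to the specification, rerun the generators as part of the build, and the schema and the record structure stay in step without anyone editing them by hand.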
10. See It’s Just a View, page 157, for more on models and views.
Technical Writers
Up until now, we’ve talked only about internal documentation—written
by the programmers themselves. But what happens when you have
professional technical writers involved in the project? All too often, pro-
grammers just throw material “over the wall” to technical writers and
Print It or Weave It
One problem inherent with published, paper documentation is that it
can become out of date as soon as it’s printed. Documentation of any
form is just a snapshot.
If you are using a markup system, you have the flexibility to implement
as many different output formats as you need. You can choose to have
<H1>Chapter Title</H1>
generate a new chapter in the report version of the document and title
a new slide in the slide show. Technologies such as XSL and CSS (see note 11) can
be used to generate multiple output formats from this one markup.
11. eXtensible Style Language and Cascading Style Sheets, two technologies designed
to help separate presentation from content.
If you are using a word processor, you’ll probably have similar capa-
bilities. If you remembered to use styles to identify different document
elements, then by applying different style sheets you can drastically
alter the look of the final output. Most word processors now allow you
to convert your document to formats such as HTML for Web publishing.
Markup Languages
Finally, for large-scale documentation projects, we recommend looking
at some of the more modern schemes for marking up documentation.
As long as your original markup is rich enough to express all the con-
cepts you need (including hyperlinks), translation to any other pub-
lishable form can be both easy and automatic. You can produce online
help, published manuals, product highlights for the Web site, and even
a tip-a-day calendar, all from the same source—which of course is
under source control and is built along with the nightly build (see Ubiq-
uitous Automation, page 230).
Challenges
Did you write an explanatory comment for the source code you just wrote?
Why not? Pressed for time? Not sure if the code will really work—are you
just trying out an idea as a prototype? You’ll throw the code away after-
wards, right? It won’t make it into the project uncommented and experi-
mental, will it?
45 Great Expectations
Be astonished, O ye heavens, at this, and be horribly afraid...
Jeremiah 2:12
A company announces record profits, and its share price drops 20%.
The financial news that night explains that the company failed to meet
analysts’ expectations. A child opens an expensive Christmas present
and bursts into tears—it wasn’t the cheap doll the child was hoping for.
A project team works miracles to implement a phenomenally complex
application, only to have it shunned by its users because it doesn’t have
a help system.
TIP 69
Communicating Expectations
Users initially come to you with some vision of what they want. It may
be incomplete, inconsistent, or technically impossible, but it is theirs,
and, like the child at Christmas, they have some emotion invested in it.
You cannot just ignore it.
There are some important techniques that can be used to facilitate this
process. Of these, Tracer Bullets, page 48, and Prototypes and Post-
it Notes, page 53, are the most important. Both let the team construct
something that the user can see. Both are ideal ways of communicating
your understanding of their requirements. And both let you and your
users practice communicating with each other.
This is a BAD THING. Try to surprise your users. Not scare them, mind
you, but delight them.
Give them that little bit more than they were expecting. The extra bit of
effort it requires to add some user-oriented feature to the system will
pay for itself time and time again in goodwill.
Listen to your users as the project progresses for clues about what
features would really delight them. Some things you can add relatively
easily that look good to the average user include:
All of these things are relatively superficial, and don’t really overburden
the system with feature bloat. However, each tells your users that the
development team cared about producing a great system, one that was
intended for real use. Just remember not to break the system adding
these new features.
Challenges
Sometimes the toughest critics of a project are the people who worked on
it. Have you ever experienced disappointment that your own expectations
weren’t met by something you produced? How could that be? Maybe there’s
more than logic at work here.
What do your users comment on when you deliver software? Is their atten-
tion to the various areas of the application proportional to the effort you
invested in each? What delights them?
TIP 70
Craftsmen of an earlier age were proud to sign their work. You should
be, too.
Project teams are still made up of people, however, and this rule can
cause trouble. On some projects, the idea of code ownership can cause
cooperation problems. People may become territorial, or unwilling to
work on common foundation elements. The project may end up like a
bunch of insular little fiefdoms. You become prejudiced in favor of your
code and against your coworkers.
That’s not what we want. You shouldn’t jealously defend your code
against interlopers; by the same token, you should treat other peo-
ple’s code with respect. The Golden Rule (“Do unto others as you would
have them do unto you”) and a foundation of mutual respect among
the developers is critical to make this tip work.
A Pragmatic Programmer.
Resources
The only reason we were able to cover so much ground in this book is
that we viewed many of our subjects from a high altitude. If we’d given
them the in-depth coverage they deserved, the book would have been
ten times longer.
In the section Professional Societies, we give details of the IEEE and the
ACM. We recommend that Pragmatic Programmers join one (or both)
of these societies. Then, in Building a Library, we highlight periodicals,
books, and Web sites that we feel contain high-quality and pertinent
information (or that are just plain fun).
Professional Societies
There are two world-class professional societies for programmers: the
Association for Computing Machinery (ACM) and the IEEE Computer
Society. We recommend that all programmers belong to one (or both) of
these societies. In addition, developers outside the United States may
want to join their national societies, such as the BCS in the United
Kingdom.
Building a Library
We’re big on reading. As we noted in Your Knowledge Portfolio, page 12,
a good programmer is always learning. Keeping current with books and
periodicals can help. Here are some that we like.
Periodicals
If you’re like us, you’ll save old magazines and periodicals until they’re
piled high enough to turn the bottom ones to flat sheets of diamond.
This means it’s worth being fairly selective. Here are a few periodicals
we read.
The Perl Journal. If you like Perl, you should probably subscribe
to The Perl Journal (www.tpj.com).
Books
Computing books can be expensive, but choose carefully and they’re
a worthwhile investment. You may want to check out our Pragmatic
Bookshelf titles at http://pragmaticprogrammer.com. Additionally,
here are a handful of the many other books we like.
Specific Environments
Unix. W. Richard Stevens has several excellent books including
Advanced Programming in the Unix Environment and the Unix Net-
work Programming books [Ste92, Ste98, Ste99].
C++. As soon as you find yourself on a C++ project, run, don’t walk,
to the bookstore and get Scott Meyers’ Effective C++, and possibly
More Effective C++ [Mey97a, Mey96]. For building systems of any
appreciable size, you need John Lakos’ Large-Scale C++ Software
Design [Lak96]. For advanced techniques, turn to Jim Coplien’s
Advanced C++ Programming Styles and Idioms [Cop92].
The Web
Finding good content on the Web is hard. Here are several links that we
check at least once a week.
Internet Resources
The links below are to resources available on the Internet. They were
valid at the time of writing, but (the Net being what it is) they may well
be out of date by the time you read this. If so, you could try a general
search for the filenames, or come to the Pragmatic Programmer Web
site (www.pragmaticprogrammer.com) and follow our links.
Editors
Emacs and vi are not the only cross-platform editors, but they are freely
available and widely used. A quick scan through a magazine such as
Dr. Dobbs will turn up several commercial alternatives.
Emacs
Both Emacs and XEmacs are available on Unix and Windows platforms.
vi
There are at least 15 different vi clones available. Of these, vim is prob-
ably ported to the most platforms, and so would be a good choice of
editor if you find yourself working in many different environments.
Other Tools
[URL 41] WinZip—Archive Utility for Windows
www.winzip.com
Nico Mak Computing, Inc., Mansfield, CT
A Windows-based file archive utility. Supports both zip and tar formats.
Miscellaneous
[URL 57] The GNU Project
www.gnu.org
Free Software Foundation, Boston, MA
The Free Software Foundation is a tax-exempt charity that raises funds
for the GNU project. The GNU project’s goal is to produce a complete, free,
Unix-like system. Many of the tools they’ve developed along the way have
become industry standards.
Bibliography
[Bak72] F. T. Baker. Chief programmer team management of pro-
duction programming. IBM Systems Journal, 11(1):56–73,
1972.
[FBB+99] Martin Fowler, Kent Beck, John Brant, William Opdyke, and
Don Roberts. Refactoring: Improving the Design of Existing
Code. Addison Wesley Longman, Reading, MA, 1999.
[FS99] Martin Fowler and Kendall Scott. UML Distilled: Applying the
Standard Object Modeling Language. Addison Wesley Long-
man, Reading, MA, second edition, 1999.
[Hol78] Michael Holt. Math Puzzles and Games. Dorset Press, New
York, NY, 1978.
[LMB92] John R. Levine, Tony Mason, and Doug Brown. Lex and
Yacc. O’Reilly & Associates, Inc., Sebastopol, CA, second
edition, 1992.
[WK82] James Q. Wilson and George Kelling. The police and neigh-
borhood safety. The Atlantic Monthly, 249(3):29–38, March
1982.
Answers to Exercises
Exercise 1: from Orthogonality on page 43
You are writing a class called Split, which splits input lines into fields. Which
of the following two Java class signatures is the more orthogonal design?
class Split1 {
public Split1(InputStreamReader rdr) { ...
public void readNextLine() throws IOException { ...
public int numFields() { ...
public String getField(int fieldNo) { ...
}
class Split2 {
public Split2(String line) { ...
public int numFields() { ...
public String getField(int fieldNo) { ...
}
Answer 3: This is a little tricky. Object technology can provide a more orthog-
onal system, but because it has more features to abuse, it is actually easier to
create a nonorthogonal system using objects than it is using a procedural lan-
guage. Features such as multiple inheritance, exceptions, operator overload-
ing, and parent-method overriding (via subclassing) provide ample opportunity
to increase coupling in nonobvious ways.
With object technology and a little extra effort, you can achieve a much more
orthogonal system. But while you can always write “spaghetti code” in a pro-
cedural language, object-oriented languages used poorly can add meatballs to
your spaghetti.
P 2 # select pen 2
D # pen down
W 2 # draw west 2cm
N 1 # then north 1
E 2 # then east 2
S 1 # then back south
U # pen up
Implement the code that parses this language. It should be designed so that it
is simple to add new commands.
Answer 5: Because we want to make the language extendable, we’ll make the
parser table driven. Each entry in the table contains the command letter, a flag
to say whether an argument is required, and the name of the routine to call to
handle that particular command.
typedef struct {
char cmd; /* the command letter */
int hasArg; /* does it take an argument */
void (*func)(int, int); /* routine to call */
} Command;
static Command cmds[] = {
  { 'P', ARG,    doSelectPen },
  { 'U', NO_ARG, doPenUp },
  { 'D', NO_ARG, doPenDown },
  { 'N', ARG,    doPenDir },
  { 'E', ARG,    doPenDir },
  { 'S', ARG,    doPenDir },
  { 'W', ARG,    doPenDir }
};
The main program is pretty simple: read a line, look up the command, get the
argument if required, then call the handler function.
while (fgets(buff, sizeof(buff), stdin)) {
Command *cmd = findCommand(*buff);
if (cmd) {
int arg = 0;
if (cmd->hasArg && !getArg(buff+1, &arg)) {
fprintf(stderr, "'%c' needs an argument\n", *buff);
continue;
}
cmd->func(*buff, arg);
}
}
The function that looks up a command performs a linear search of the table,
returning either the matching entry or NULL.
Command *findCommand(int cmd) {
int i;
for (i = 0; i < ARRAY_SIZE(cmds); i++) {
if (cmds[i].cmd == cmd)
return cmds + i;
}
fprintf(stderr, "Unknown command '%c'\n", cmd);
return 0;
}
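The same table-driven idea can be sketched in Java, where a map plays the role of the command table. This is an illustrative sketch only, not the implementation above: the TurtleParser and Handler names are hypothetical, and the handlers merely record what they would do instead of driving a plotter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical Java rendering of the table-driven parser idea. */
final class TurtleParser {
    /** A handler receives the command letter and its (possibly unused) argument. */
    interface Handler {
        void handle(char cmd, int arg);
    }

    /** One table entry: whether an argument is required, and who handles it. */
    private static final class Entry {
        final boolean hasArg;
        final Handler handler;
        Entry(boolean hasArg, Handler handler) {
            this.hasArg = hasArg;
            this.handler = handler;
        }
    }

    private final Map<Character, Entry> table = new LinkedHashMap<>();
    final List<String> log = new ArrayList<>();  // records what was executed

    TurtleParser() {
        // Adding a new command means adding one entry here.
        table.put('P', new Entry(true,  (c, a) -> log.add("select pen " + a)));
        table.put('U', new Entry(false, (c, a) -> log.add("pen up")));
        table.put('D', new Entry(false, (c, a) -> log.add("pen down")));
        Handler dir = (c, a) -> log.add("draw " + c + " " + a);
        for (char c : "NESW".toCharArray()) {
            table.put(c, new Entry(true, dir));
        }
    }

    /** Parse one line such as "W 2 # draw west"; returns false on error. */
    boolean parseLine(String line) {
        String text = line.split("#", 2)[0].trim();  // strip trailing comment
        if (text.isEmpty()) return true;             // blank or comment-only line
        Entry e = table.get(text.charAt(0));
        if (e == null) return false;                 // unknown command
        int arg = 0;
        if (e.hasArg) {
            try {
                arg = Integer.parseInt(text.substring(1).trim());
            } catch (NumberFormatException ex) {
                return false;                        // missing or malformed argument
            }
        }
        e.handler.handle(text.charAt(0), arg);
        return true;
    }
}
```

As in the C version, the parsing loop never changes when a command is added; only the table grows.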
Answer 7: We coded our example using bison, the GNU version of yacc. For
clarity, we’re just showing the body of the parser here. Look at the source on
our Web site for the full implementation.
time: spec END_TOKEN
{ if ($1 >= 24*60) yyerror("Time is too large");
printf("%d minutes past midnight\n", $1);
exit(0);
}
;
spec: hour ':' minute
{ $$ = $1 + $3;
}
| hour ':' minute ampm
{ if ($1 > 11*60) yyerror("Hour out of range");
$$ = $1 + $3 + $4;
}
| hour ampm
{ if ($1 > 11*60) yyerror("Hour out of range");
$$ = $1 + $2;
}
;
hour: hour_num
{ if ($1 > 23) yyerror("Hour out of range");
$$ = $1 * 60;
};
Answer 8:
$_ = shift;
/^(\d\d?)(am|pm)$/ && doTime($1, 0, $2, 12);
/^(\d\d?):(\d\d)(am|pm)$/ && doTime($1, $2, $3, 12);
/^(\d\d?):(\d\d)$/ && doTime($1, $2, 0, 24);
die "Invalid time $_\n";
#
# doTime(hour, min, ampm, maxHour)
#
sub doTime($$$$) {
my ($hour, $min, $offset, $maxHour) = @_;
die "Invalid hour: $hour" if ($hour >= $maxHour);
$hour += 12 if ($offset eq "pm");
print $hour*60 + $min, " minutes past midnight\n";
exit(0);
}
my @consts;
my $name = <>;
die "Invalid format - missing name" unless defined($name);
chomp $name;
# Read in the rest of the file
while (<>) {
chomp;
s/^\s*//; s/\s*$//;
die "Invalid line: $_" unless /^(\w+)$/;
push @consts, $_;
}
# Now generate the file
open(HDR, ">$name.h") or die "Can't open $name.h: $!";
open(SRC, ">$name.c") or die "Can't open $name.c: $!";
my $uc_name = uc($name);
my $array_name = $uc_name . "_names";
print HDR "/* File generated automatically - do not edit */\n";
print HDR "extern const char *${array_name}[];";
print HDR "typedef enum {\n  ";
print HDR join ",\n  ", @consts;
print HDR "\n} $uc_name;\n\n";
print SRC "/* File generated automatically - do not edit */\n";
print SRC "const char *${array_name}[] = {\n  \"";
print SRC join "\",\n  \"", @consts;
print SRC "\"\n};\n";
close(SRC);
close(HDR);
Using the DRY principle, we won’t cut and paste this new file into our code.
Instead, we’ll #include it—the flat file is the master source of these constants.
This means that we’ll need a makefile to regenerate the header when the file
changes. The following extract is from the test bed in our source tree (available
on the Web site).
Writing a language back end is simple: provide a module that implements the
required six entry points. Here’s the C generator:
#!/usr/bin/perl -w
package CG;
use strict;
# Code generator for 'C' (see cg_base.pl)
sub blankLine() { print "\n"; }
sub comment() { print "/*$_[0] */\n"; }
sub startMsg() { print "typedef struct {\n"; }
sub endMsg() { print "} $_[0];\n\n"; }
sub arrayType() {
  my ($name, $type, $size) = @_;
  print "  $type $name\[$size];\n";
}
sub simpleType() {
  my ($name, $type) = @_;
  print "  $type $name;\n";
}
1;
First, let’s look at an Eiffel example. Here we have a routine for adding a STRING
to a doubly linked, circular list (remember that preconditions are labeled with
require, and postconditions with ensure).
Answer 15: This is bad. The math in the index clause (index-1) won’t work
on boundary conditions such as the first entry. The postcondition assumes a
particular implementation: we want contracts to be more abstract than that.
/**
* @pre anItem != null // Require real data
* @post pop() == anItem // Verify that it’s
* // on the stack
*/
public void push(final String anItem)
Answer 16: It’s a good contract, but a bad implementation. Here, the infa-
mous “Heisenbug” [URL 52] rears its ugly head. The programmer probably just
made a simple typo—pop instead of top. While this is a simple and contrived
example, side effects in assertions (or in any unexpected place in the code) can
be very difficult to diagnose.
So, for this exercise, design an interface to a kitchen blender. It will eventually
be a Web-based, Internet-enabled, CORBA-fied blender, but for now we just
need the interface to control it. It has ten speed settings (0 means off). You
can’t operate it empty, and you can change the speed only one unit at a time
(that is, from 0 to 1, and from 1 to 2, not from 0 to 2).
Here are the methods. Add appropriate pre- and postconditions and an invari-
ant.
int getSpeed()
void setSpeed(int x)
boolean isFull()
void fill()
void empty()
Answer 17: We’ll show the function signatures in Java, with the pre- and
postconditions labeled as in iContract.
/**
* @invariant getSpeed() > 0
* implies isFull() // Don’t run empty
* @invariant getSpeed() >= 0 &&
* getSpeed() < 10 // Range check
*/
/**
* @pre Math.abs(getSpeed() - x) <= 1 // Only change by one
* @pre x >= 0 && x < 10 // Range check
* @post getSpeed() == x // Honor requested speed
*/
public void setSpeed(final int x)
/**
* @pre !isFull() // Don’t fill it twice
* @post isFull() // Ensure it was done
*/
void fill()
/**
* @pre isFull() // Don’t empty it twice
* @post !isFull() // Ensure it was done
*/
void empty()
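iContract checks conditions like these through a preprocessor. Where no such tool is available, the same contract can be approximated with ordinary runtime checks. The sketch below is one hedged possibility, not the book's implementation: the Blender class, its fields, and the choice of IllegalStateException are assumptions, and it additionally assumes that emptying a running blender must be rejected so that the "don't run empty" invariant holds.

```java
/** Hypothetical blender whose contract is checked with plain exceptions. */
final class Blender {
    private int speed = 0;        // 0 means off; valid range is 0..9
    private boolean full = false;

    int getSpeed() { return speed; }
    boolean isFull() { return full; }

    void setSpeed(int x) {
        // Preconditions: range check, and change by at most one unit.
        require(x >= 0 && x < 10, "speed out of range");
        require(Math.abs(speed - x) <= 1, "speed may change by one unit only");
        // Extra precondition (assumption) that preserves "running implies full".
        require(x == 0 || full, "cannot run empty");
        speed = x;
        checkInvariant();
    }

    void fill() {
        require(!full, "already full");
        full = true;
        checkInvariant();
    }

    void empty() {
        require(full, "already empty");
        // Assumption: emptying while running would break the invariant.
        require(speed == 0, "cannot empty while running");
        full = false;
        checkInvariant();
    }

    private static void require(boolean condition, String message) {
        if (!condition) throw new IllegalStateException(message);
    }

    /** Class invariant: never run empty, and speed stays within 0..9. */
    private void checkInvariant() {
        if (speed > 0 && !full) throw new AssertionError("running empty");
        if (speed < 0 || speed >= 10) throw new AssertionError("speed out of range");
    }
}
```

A violated precondition signals a bug in the caller, while a failed invariant check signals a bug in the class itself, which is why the two use different error types here.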
Answer 18: There are 21 terms in the series. If you said 20, you just experi-
enced a fencepost error.
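The counting rule behind this answer can be written down directly. The sketch below assumes, purely for illustration, an evenly spaced series running from 0 to 100 in steps of 5; the exercise text itself is not reproduced here, and the Fencepost and countTerms names are hypothetical.

```java
/** Counting terms in an evenly spaced series without a fencepost error. */
final class Fencepost {
    /** Number of terms in first, first+step, ..., last (inclusive). */
    static int countTerms(int first, int last, int step) {
        if (step <= 0 || last < first) {
            throw new IllegalArgumentException("need step > 0 and last >= first");
        }
        // (last - first) / step counts the gaps between posts;
        // add one to count the posts themselves.
        return (last - first) / step + 1;
    }

    public static void main(String[] args) {
        // Assumed series 0, 5, 10, ..., 100: 20 gaps, 21 terms.
        System.out.println(countTerms(0, 100, 5));
    }
}
```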
Answer 19:
1. September, 1752 had only 19 days. This was done to synchronize calen-
dars as part of the Gregorian Reformation.
2. The directory could have been removed by another process, you might not
have permission to read it, &sb might be invalid—you get the picture.
3. We sneakily didn’t specify the types of a and b. Operator overloading might
have defined +, =, or != to have unexpected behavior. Also, a and b may
be aliases for the same variable, so the second assignment will overwrite
the value stored in the first.
4. In non-Euclidean geometry, the sum of the angles of a triangle will not add
up to 180°. Think of a triangle mapped on the surface of a sphere.
Answer 20: We chose to implement a very simple class with a single static
method, TEST, that prints a message and a stack trace if the passed condition
parameter is false.
package com.pragprog.util;
import java.lang.System; // for exit()
import java.lang.Thread; // for dumpStack()
public class Assert {
/** Write a message, print a stack trace and exit if
* our parameter is false.
*/
public static void TEST(boolean condition) {
if (!condition) {
System.out.println("==== Assertion Failed ====");
Thread.dumpStack();
System.exit(1);
}
}
// Testbed. If our argument is ’okay’, try an assertion that
// succeeds, if ’fail’ try one that fails
public static final void main(String args[]) {
if (args[0].compareTo("okay") == 0) {
TEST(1 == 1);
}
else if (args[0].compareTo("fail") == 0) {
TEST(1 == 2);
}
else {
throw new RuntimeException("Bad argument");
}
}
}
Case (3) is more problematic—if the value null is significant to the application,
then it may be justifiably added to the container. If, however, it makes no sense
to store null values, an exception should probably be thrown.
Answer 23: By setting the reference to NULL, you reduce the number of point-
ers to the referenced object by one. Once this count reaches zero, the object is
eligible for garbage collection. Setting the references to NULL can be signifi-
cant for long-running programs, where the programmers need to ensure that
memory utilization doesn’t increase over time.
Exercise 24: from Decoupling and the Law of Demeter on page 143
We discussed the concept of physical decoupling in the box on page 142. Which
of the following C++ header files is more tightly coupled to the rest of the
system?
person1.h:
#include "date.h"
class Person1 {
private:
  Date myBirthdate;
public:
  Person1(Date &birthDate);
  // ...

person2.h:
class Date;
class Person2 {
private:
  Date *myBirthdate;
public:
  Person2(Date &birthDate);
  // ...
Answer 24: A header file is supposed to define the interface between the
corresponding implementation and the rest of the world. The header file itself
has no need to know about the internals of the Date class—it merely needs to
tell the compiler that the constructor takes a Date as a parameter. So, unless
the header file uses Dates in inline functions, the second snippet will work
fine.
What’s wrong with the first snippet? On a small project, nothing, except that
you are unnecessarily making everything that uses a Person1 class also in-
clude the header file for Date. Once this kind of usage gets common in a
project, you soon find that including one header file ends up including most
of the rest of the system—a serious drag on compilation times.
Exercise 25: from Decoupling and the Law of Demeter on page 143
For the example below and for those in Exercises 26 and 27, determine if the
method calls shown are allowed according to the Law of Demeter. This first one
is in Java.
public void showBalance(BankAccount acct) {
Money amt = acct.getBalance();
printToScreen(amt.printFormat());
}
void showBalance(BankAccount b) {
b.printBalance();
}
Exercise 26: from Decoupling and the Law of Demeter on page 143
This example is also in Java.
Answer 26: Since Colada creates and owns both myBlender and myStuff,
the calls to addIngredients and elements are allowed.
Exercise 27: from Decoupling and the Law of Demeter on page 143
This example is in C++.
markWorkflow(acct.name(), SET_BALANCE);
If you add a passenger to the wait list, they’ll be put on the flight automatically
when an opening becomes available.
There’s a massive reporting job that goes through looking for overbooked or full
flights to suggest when additional flights might be scheduled. It works fine, but
it takes hours to run.
We’d like to have a little more flexibility in processing wait-list passengers, and
we’ve got to do something about that big report—it takes too long to run. Use
the ideas from this section to redesign this interface.
Answer 29: We’ll take Flight and add some additional methods for main-
taining two lists of listeners: one for wait-list notification, and the other for
full-flight notification.
If we try to add a Passenger and fail because the flight is full, we can, option-
ally, put the Passenger on the wait list. When a spot opens up, waitList-
Available will be called. This method can then choose to add the Passenger
automatically, or have a service representative call the customer to ask if they
are still interested, or whatever. We now have the flexibility to perform different
behaviors on a per-customer basis.
Next, we want to avoid having the BigReport troll through tons of records look-
ing for full flights. By having BigReport registered as a listener on Flights,
each individual Flight can report when it is full—or nearly full, if we want.
Now users can get live, up-to-the-minute reports from BigReport instantly,
without waiting hours for it to run as it did previously.
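A listener-based Flight along these lines might look like the sketch below. It is a hedged illustration rather than the interface the answer refers to: apart from Flight and waitListAvailable, every name (WaitListListener, FullListener, flightFull, the method signatures) is hypothetical, and passengers are reduced to plain strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical listener notified when a seat opens up on a full flight. */
interface WaitListListener {
    void waitListAvailable(Flight flight);
}

/** Hypothetical listener notified when a flight becomes full. */
interface FullListener {
    void flightFull(Flight flight);
}

final class Flight {
    private final int capacity;
    private final List<String> passengers = new ArrayList<>();
    private final List<WaitListListener> waitListeners = new ArrayList<>();
    private final List<FullListener> fullListeners = new ArrayList<>();

    Flight(int capacity) { this.capacity = capacity; }

    void addWaitListListener(WaitListListener l) { waitListeners.add(l); }
    void addFullListener(FullListener l) { fullListeners.add(l); }

    /** Returns false if the flight is full; the caller may then wait-list. */
    boolean addPassenger(String name) {
        if (passengers.size() >= capacity) return false;
        passengers.add(name);
        if (passengers.size() == capacity) {
            // Push the news instead of making a report poll for it.
            for (FullListener l : fullListeners) l.flightFull(this);
        }
        return true;
    }

    /** Removing a passenger from a full flight opens a seat. */
    boolean removePassenger(String name) {
        boolean wasFull = passengers.size() == capacity;
        boolean removed = passengers.remove(name);
        if (removed && wasFull) {
            for (WaitListListener l : waitListeners) l.waitListAvailable(this);
        }
        return removed;
    }

    int seatsTaken() { return passengers.size(); }
}
```

Each wait-list listener decides for itself what to do when a seat opens (rebook automatically, queue a call to the customer, and so on), and a reporting component registered as a FullListener learns about full flights as they happen.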
Answer 30:
1. Image processing. For simple scheduling of a workload among the paral-
lel processes, a shared work queue may be more than adequate. You might
want to consider a blackboard system if there is feedback involved—that
is, if the results of one processed chunk affect other chunks, as in machine
vision applications, or complex 3D image-warp transforms.
2. Group calendaring. This might be a good fit. You can post scheduled
meetings and availability to the blackboard. You have entities functioning
autonomously, feedback from decisions is important, and participants may
come and go.
You might want to consider partitioning this kind of blackboard system
depending on who is searching: junior staff may care about only the im-
mediate office, human resources may want only English-speaking offices
worldwide, and the CEO may want the whole enchilada.
There is also some flexibility on data formats: we are free to ignore formats
or languages we don’t understand. We have to understand different for-
mats only for those offices that have meetings with each other, and we do
not need to expose all participants to a full transitive closure of all possi-
ble formats. This reduces coupling to where it is necessary, and does not
constrain us artificially.
3. Network monitoring tool. This is very similar to the mortgage/loan appli-
cation program described on page 168. You’ve got trouble reports sent in
by users and statistics reported automatically, all posting to the black-
board. A human or software agent can analyze the blackboard to diagnose
network failures: two errors on a line might just be cosmic rays, but 20,000
errors and you’ve got a hardware problem. Just as the detectives solve the
murder mystery, you can have multiple entities analyzing and contributing
ideas to solve the network problems.
fprintf(stderr,"Error, continue?");
gets(buf);
Answer 31: There are several potential problems with this code. First, it
assumes a tty environment. That may be fine if the assumption is true, but
what if this code is called from a GUI environment where neither stderr nor
stdin is open?
Second, there is the problematic gets, which will write as many characters
as it receives into the buffer passed in. Malicious users have used this failing
to create buffer overrun security holes in many different systems. Never use
gets().
Finally, no one in their right mind would ever bury user interaction such as
this in a library routine.
Answer 32: POSIX strcpy isn’t guaranteed to work for overlapping strings.
It might happen to work on some architectures, but only by coincidence.
Answer 34: Clearly, we can’t give any absolute answers to this exercise. How-
ever, we can give you a couple of pointers.
If you find that your results don’t follow a smooth curve, you might want to
check to see if some other activity is using some of your processor’s power.
You probably won’t get good figures on a multiuser system, and even if you
are the only user you may find that background processes periodically take
cycles away from your programs. You might also want to check memory: if the
application starts using swap space, performance will nose dive.
Answer 35: The printTree routine uses about 1,000 bytes of stack space
for the buffer variable. It calls itself recursively to descend through the tree,
and each nested call adds another 1,000 bytes to the stack. It also calls itself
when it gets to the leaf nodes, but exits immediately when it discovers that the
pointer passed in is NULL. If the depth of the tree is $D$, the maximum stack
requirement is therefore roughly $1000 \times (D + 1)$ bytes.
A balanced binary tree holds twice as many elements at each level. A tree of
depth $D$ holds $1 + 2 + 4 + \cdots + 2^{D-1}$, or $2^D - 1$, elements. Our million-element
tree will therefore need $\lceil \lg(1{,}000{,}001) \rceil$, or 20 levels.
We’d therefore expect our routine to use roughly 21,000 bytes of stack.
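The arithmetic can be double-checked with a few lines of code. This sketch assumes a perfectly balanced tree and the 1,000-byte-per-call figure quoted above; the StackEstimate class and its method names are hypothetical.

```java
/** Rough stack estimate for a recursive walk of a balanced binary tree. */
final class StackEstimate {
    /** Smallest number of levels whose total capacity (2^levels - 1) holds n elements. */
    static int levelsFor(long n) {
        int levels = 0;
        long capacity = 0;              // always equals 2^levels - 1
        while (capacity < n) {
            capacity = capacity * 2 + 1;
            levels++;
        }
        return levels;
    }

    /** One frame per level, plus one extra call that sees the NULL child. */
    static long stackBytes(long n, long bytesPerCall) {
        return bytesPerCall * (levelsFor(n) + 1);
    }

    public static void main(String[] args) {
        System.out.println(levelsFor(1_000_000L));          // 20 levels
        System.out.println(stackBytes(1_000_000L, 1000L));  // 21000 bytes
    }
}
```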
while (node) {
if (node->left) printTree(node->left);
getNodeAsString(node, buffer);
puts(buffer);
node = node->right;
}
The biggest gain, however, comes from allocating just a single buffer, shared
by all invocations of printTree. Pass this buffer as a parameter to the recur-
sive calls, and only 1,000 bytes will be allocated, regardless of the depth of
recursion.
void printTreePrivate(const Node *node, char *buffer) {
if (node) {
printTreePrivate(node->left, buffer);
getNodeAsString(node, buffer);
puts(buffer);
printTreePrivate(node->right, buffer);
}
}
void newPrintTree(const Node *node) {
char buffer[1000];
printTreePrivate(node, buffer);
}
Answer 37: There are a couple of ways of getting there. One is to turn the
problem on its head. If the array has just one element, we don’t iterate around
the loop. Each additional iteration doubles the size of the array we can search.
The general formula for the array size is therefore $n = 2^m$, where $m$ is the
number of iterations. If you take logs to the base 2 of each side, you get
$\lg n = \lg 2^m$, which by the definition of logs becomes $\lg n = m$.
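The relationship between array size and iteration count is easy to confirm numerically. The sketch below simply halves the size until one element is left and counts the steps; the HalvingCount name is hypothetical, and integer division stands in for the "chop in half" step of a real binary search.

```java
/** Counts how many halvings reduce an array of size n to a single element. */
final class HalvingCount {
    static int iterations(int n) {
        if (n < 1) throw new IllegalArgumentException("n must be positive");
        int m = 0;
        while (n > 1) {   // a one-element array needs no iterations
            n /= 2;       // each iteration halves the remaining range
            m++;
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(iterations(1));     // 0
        System.out.println(iterations(1024));  // 10, since 1024 = 2^10
    }
}
```

For sizes that are not exact powers of two, the count is the base-2 logarithm rounded down.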
Answer 38: We might suggest a fairly mild restructuring here: make sure
that every test is performed just once, and make all the calculations common.
If the expression 2*basis(...)*1.05 appears in other places in the program,
we should probably make it a function. We haven’t bothered here.
We’ve added a rate_lookup array, initialized so that entries other than Texas,
Ohio, and Maine have a value of 1. This approach makes it easy to add values
for other states in the future. Depending on the expected usage pattern, we
might want to make the points field an array lookup as well.
rate = rate_lookup[state];
amt = base * rate;
calc = 2*basis(amt) + extra(amt)*1.05;
if (state == OHIO)
points = 2;
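In Java, the same lookup-with-default idea could be sketched with an EnumMap. This is an assumption-laden illustration: the State enum, the RateLookup class, and in particular the rate values are placeholders, since the actual rates are not given here.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical subset of states, for illustration only. */
enum State { TEXAS, OHIO, MAINE, IOWA }

final class RateLookup {
    private static final Map<State, Double> RATES = new EnumMap<>(State.class);
    static {
        // Placeholder rates: the real values are not stated in the text.
        RATES.put(State.TEXAS, 1.1);
        RATES.put(State.OHIO, 1.2);
        RATES.put(State.MAINE, 1.3);
    }

    /** States without an entry fall back to a rate of 1. */
    static double rateFor(State s) {
        return RATES.getOrDefault(s, 1.0);
    }

    public static void main(String[] args) {
        double base = 100.0;
        System.out.println(base * rateFor(State.IOWA));  // default rate applies
    }
}
```

Supporting another state then means adding one table entry rather than another branch in the calculation.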
Answer 39: When you see someone using enumerated types (or their equiv-
alent in Java) to distinguish between variants of a type, you can often improve
the code by subclassing:
public class Shape {
private double size;
public Shape(double size) {
this.size = size;
}
public double getSize() { return size; }
}
public class Square extends Shape {
public Square(double size) {
super(size);
}
public double area() {
double size = getSize();
return size*size;
}
}
public class Circle extends Shape {
public Circle(double size) {
super(size);
}
public double area() {
double size = getSize();
return Math.PI*size*size/4.0;
}
}
// etc...
Answer 40: This case is interesting. At first sight, it seems reasonable that a
window should have a width and a height. However, consider the future. Let’s
imagine that we want to support arbitrarily shaped windows (which will be
difficult if the Window class knows all about rectangles and their properties).
We’d suggest abstracting the shape of the window out of the Window class itself.
Note that in this approach we’ve used delegation rather than subclassing: a
window is not a “kind-of” shape—a window “has-a” shape. It uses a shape to
do its job. You’ll often find delegation useful when refactoring.
We could also have extended this example by introducing a Java interface that
specified the methods a class must support to support the shape functions.
This is a good idea. It means that when you extend the concept of a shape, the
compiler will warn you about classes that you have affected. We recommend
using interfaces this way when you delegate all the functions of some other
class.
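A hedged sketch of this delegation, with an interface for the shape, might look as follows. None of these names or methods are taken from the original listing: Shape here is a hypothetical interface, and Rectangle, Ellipse, area, contains, and isClickInside are illustrative choices.

```java
/** Hypothetical interface capturing what a window needs from its shape. */
interface Shape {
    double area();
    boolean contains(double x, double y);
}

/** Axis-aligned rectangle with one corner at the origin. */
final class Rectangle implements Shape {
    private final double width, height;
    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }
    public double area() { return width * height; }
    public boolean contains(double x, double y) {
        return x >= 0 && x <= width && y >= 0 && y <= height;
    }
}

/** Ellipse centered on the origin with radii rx and ry. */
final class Ellipse implements Shape {
    private final double rx, ry;
    Ellipse(double rx, double ry) {
        this.rx = rx;
        this.ry = ry;
    }
    public double area() { return Math.PI * rx * ry; }
    public boolean contains(double x, double y) {
        return (x * x) / (rx * rx) + (y * y) / (ry * ry) <= 1.0;
    }
}

/** A window "has-a" shape and delegates geometric questions to it. */
final class Window {
    private final Shape shape;
    Window(Shape shape) { this.shape = shape; }
    double area() { return shape.area(); }
    boolean isClickInside(double x, double y) { return shape.contains(x, y); }
}
```

If a method is later added to Shape, the compiler flags every implementing class that has not been updated, which is the safety net described above.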
Answer 41: First, we’ll add a main to act as a unit test driver. It will accept
a very small, simple language as an argument: “E” to empty the blender, “F” to
fill it, digits 0-9 to set the speed, and so on.
#!/bin/sh
CMD="java dbc.dbc_ex"
failcount=0
expect_okay() {
if echo "$*" | $CMD #>/dev/null 2>&1
then
:
else
echo "FAILED! $*"
failcount=`expr $failcount + 1`
fi
}
expect_fail() {
if echo "$*" | $CMD >/dev/null 2>&1
then
echo "FAILED! (Should have failed): $*"
failcount=`expr $failcount + 1`
fi
}
report() {
if [ $failcount -gt 0 ]
then
echo -e "\n\n*** FAILED $failcount TESTS\n"
exit 1 # In case we are part of something larger
else
exit 0 # In case we are part of something larger
fi
}
#
# Start the tests
#
expect_okay F123456789876543210E # Should run thru
expect_fail F5 # Fails, speed too high
expect_fail 1 # Fails, empty
expect_fail F10E1 # Fails, empty
expect_fail F1238 # Fails, skips
expect_okay FE # Never turn on
expect_fail F1E # Emptying while running
expect_okay F10E # Should be ok
report # Report results
The tests check to see if illegal speed changes are detected, if you try to empty
the blender while running, and so on. We put this in the makefile so we can
compile and run the regression test by simply typing
% make
% make test
Note that we have the test exit with 0 or 1 so we can use this as part of a larger
test as well.
There was nothing in the requirements that spoke of driving this component
via a script, or even using a language. End users will never see it. But we have
a powerful tool that we can use to test our code, quickly and exhaustively.
Answer 42:
1. This statement sounds like a real requirement: there may be constraints
placed on the application by its environment.
2. Even though this may be a corporate standard, it isn’t a requirement. It
would be better stated as “The dialog background must be configurable
by the end user. As shipped, the color will be gray.” Even better would be
the broader statement “All visual elements of the application (colors, fonts,
and languages) must be configurable by the end user.”
3. This statement is not a requirement; it’s architecture. When faced with
something like this, you have to dig deep to find out what the user is
thinking.
4. The underlying requirement is probably something closer to “The system
will prevent the user from making invalid entries in fields, and will warn
the user when these entries are made.”
5. This statement is probably a hard requirement.
TIPS 1 TO 22

2. Think! About Your Work (p. xix)
Turn off the autopilot and take control. Constantly critique and appraise your work.

3. Provide Options, Don’t Make Lame Excuses (p. 3)
Instead of excuses, provide options. Don’t say it can’t be done; explain what can be done.

4. Don’t Live with Broken Windows (p. 5)
Fix bad designs, wrong decisions, and poor code when you see them.

5. Be a Catalyst for Change (p. 8)
You can’t force change on people. Instead, show them how the future might be and help them participate in creating it.

6. Remember the Big Picture (p. 8)
Don’t get so engrossed in the details that you forget to check what’s happening around you.

8. Invest Regularly in Your Knowledge Portfolio (p. 14)
Make learning a habit.

9. Critically Analyze What You Read and Hear (p. 16)
Don’t be swayed by vendors, media hype, or dogma. Analyze information in terms of you and your project.

10. It’s Both What You Say and the Way You Say It (p. 21)
There’s no point in having great ideas if you don’t communicate them effectively.

11. DRY—Don’t Repeat Yourself (p. 27)
Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

13. Eliminate Effects Between Unrelated Things (p. 35)
Design components that are self-contained, independent, and have a single, well-defined purpose.

14. There Are No Final Decisions (p. 46)
No decision is cast in stone. Instead, consider each as being written in the sand at the beach, and plan for change.

15. Use Tracer Bullets to Find the Target (p. 49)
Tracer bullets let you home in on your target by trying things and seeing how close they land.

16. Prototype to Learn (p. 54)
Prototyping is a learning experience. Its value lies not in the code you produce, but in the lessons you learn.

17. Program Close to the Problem Domain (p. 58)
Design and code in your user’s language.

19. Iterate the Schedule with the Code (p. 69)
Use experience you gain as you implement to refine the project time scales.

20. Keep Knowledge in Plain Text (p. 74)
Plain text won’t become obsolete. It helps leverage your work and simplifies debugging and testing.

21. Use the Power of Command Shells (p. 80)
Use the shell when graphical user interfaces don’t cut it.

22. Use a Single Editor Well (p. 82)
The editor should be an extension of your hand; make sure your editor is configurable, extensible, and programmable.

TIPS 23 TO 46

23. Always Use Source Code Control (p. 88)
Source code control is a time machine for your work—you can go back.

24. Fix the Problem, Not the Blame (p. 91)
It doesn’t really matter whether the bug is your fault or someone else’s—it is still your problem, and it still needs to be fixed.

25. Don’t Panic When Debugging (p. 91)
Take a deep breath and THINK! about what could be causing the bug.

26. “select” Isn’t Broken (p. 96)
It is rare to find a bug in the OS or the compiler, or even a third-party product or library. The bug is most likely in the application.

27. Don’t Assume It—Prove It (p. 97)
Prove your assumptions in the actual environment—with real data and boundary conditions.

28. Learn a Text Manipulation Language (p. 100)
You spend a large part of each day working with text. Why not have the computer do some of it for you?

29. Write Code That Writes Code (p. 103)
Code generators increase your productivity and help avoid duplication.

30. You Can’t Write Perfect Software (p. 107)
Software can’t be perfect. Protect your code and users from the inevitable errors.

31. Design with Contracts (p. 111)
Use contracts to document and verify that code does no more and no less than it claims to do.

32. Crash Early (p. 120)
A dead program normally does a lot less damage than a crippled one.

33. Use Assertions to Prevent the Impossible (p. 122)
Assertions validate your assumptions. Use them to protect your code from an uncertain world.

34. Use Exceptions for Exceptional Problems (p. 127)
Exceptions can suffer from all the readability and maintainability problems of classic spaghetti code. Reserve exceptions for exceptional things.

35. Finish What You Start (p. 129)
Where possible, the routine or object that allocates a resource should be responsible for deallocating it.

36. Minimize Coupling Between Modules (p. 140)
Avoid coupling by writing “shy” code and applying the Law of Demeter.

37. Configure, Don’t Integrate (p. 144)
Implement technology choices for an application as configuration options, not through integration or engineering.

38. Put Abstractions in Code, Details in Metadata (p. 145)
Program for the general case, and put the specifics outside the compiled code base.

39. Analyze Workflow to Improve Concurrency (p. 151)
Exploit concurrency in your user’s workflow.

40. Design Using Services (p. 154)
Design in terms of services—independent, concurrent objects behind well-defined, consistent interfaces.

41. Always Design for Concurrency (p. 156)
Allow for concurrency, and you’ll design cleaner interfaces with fewer assumptions.

42. Separate Views from Models (p. 161)
Gain flexibility at low cost by designing your application in terms of models and views.

43. Use Blackboards to Coordinate Workflow (p. 169)
Use blackboards to coordinate disparate facts and agents, while maintaining independence and isolation among participants.

44. Don’t Program by Coincidence (p. 175)
Rely only on reliable things. Beware of accidental complexity, and don’t confuse a happy coincidence with a purposeful plan.

45. Estimate the Order of Your Algorithms (p. 181)
Get a feel for how long things are likely to take before you write code.

46. Test Your Estimates (p. 182)
Mathematical analysis of algorithms doesn’t tell you everything. Try timing your code in its target environment.

TIPS 47 TO 70

47. Refactor Early, Refactor Often (p. 186)
Just as you might weed and rearrange a garden, rewrite, rework, and re-architect code when it needs it. Fix the root of the problem.

48. Design to Test (p. 192)
Start thinking about testing before you write a line of code.

49. Test Your Software, or Your Users Will (p. 197)
Test ruthlessly. Don’t make your users find bugs for you.

50. Don’t Use Wizard Code You Don’t Understand (p. 199)
Wizards can generate reams of code. Make sure you understand all of it before you incorporate it into your project.

51. Don’t Gather Requirements—Dig for Them (p. 202)
Requirements rarely lie on the surface. They’re buried deep beneath layers of assumptions, misconceptions, and politics.

52. Work with a User to Think Like a User (p. 204)
It’s the best way to gain insight into how the system will really be used.

53. Abstractions Live Longer than Details (p. 209)
Invest in the abstraction, not the implementation. Abstractions can survive the barrage of changes from different implementations and new technologies.

54. Use a Project Glossary (p. 210)
Create and maintain a single source of all the specific terms and vocabulary for a project.

55. Don’t Think Outside the Box—Find the Box (p. 213)
When faced with an impossible problem, identify the real constraints. Ask yourself: “Does it have to be done this way? Does it have to be done at all?”

56. Start When You’re Ready (p. 215)
You’ve been building experience all your life. Don’t ignore niggling doubts.

57. Some Things Are Better Done than Described (p. 218)
Don’t fall into the specification spiral—at some point you need to start coding.

58. Don’t Be a Slave to Formal Methods (p. 220)
Don’t blindly adopt any technique without putting it into the context of your development practices and capabilities.

59. Costly Tools Don’t Produce Better Designs (p. 222)
Beware of vendor hype, industry dogma, and the aura of the price tag. Judge tools on their merits.

60. Organize Teams Around Functionality (p. 227)
Don’t separate designers from coders, testers from data modelers. Build teams the way you build code.

61. Don’t Use Manual Procedures (p. 231)
A shell script or batch file will execute the same instructions, in the same order, time after time.

62. Test Early. Test Often. Test Automatically. (p. 237)
Tests that run with every build are much more effective than test plans that sit on a shelf.

63. Coding Ain’t Done ’Til All the Tests Run (p. 238)
’Nuff said.

64. Use Saboteurs to Test Your Testing (p. 244)
Introduce bugs on purpose in a separate copy of the source to verify that testing will catch them.

65. Test State Coverage, Not Code Coverage (p. 245)
Identify and test significant program states. Just testing lines of code isn’t enough.

66. Find Bugs Once (p. 247)
Once a human tester finds a bug, it should be the last time a human tester finds that bug. Automatic tests should check for it from then on.

67. English is Just a Programming Language (p. 248)
Write documents as you would write code: honor the DRY principle, use metadata, MVC, automatic generation, and so on.

68. Build Documentation In, Don’t Bolt It On (p. 248)
Documentation created separately from code is less likely to be correct and up to date.

69. Gently Exceed Your Users’ Expectations (p. 255)
Come to understand your users’ expectations, then deliver just that little bit more.

70. Sign Your Work (p. 258)
Craftsmen of an earlier age were proud to sign their work. You should be, too.