Programming Pearls - Tricks of The Trade


programming

pearls
TRICKS OF THE TRADE
Communications of the ACM, February 1985, Volume 28, Number 2
© 1985 ACM 0001-0782/85/0200-0138 75¢

“Work smarter, not harder” is old but good advice. This column will discuss a few tricks of the trade that can help programmers to work smarter.

Some of the tricks may be worth a future column of their own; others may not. The dual purposes of this month’s column are therefore to encourage you to think about the tricks and to solicit your contributions for the future columns.

Computing Rules of Thumb

My dictionary defines a “rule of thumb” as “a judgment based on practical experience rather than on scientific knowledge.” Some programming rules are useful in long-term estimates:

    An average programmer produces 2,000 lines of debugged, documented code per year, regardless of the implementation language.

Other rules are more qualitative:

    To debug a program, try explaining it to a colleague; talking out loud is sometimes enough to help you spot a silly mistake. By playing as dumb as possible, your partner can force you to examine your basic suppositions.

Fred Brooks’s Mythical Man Month (published in 1975 by Addison-Wesley) is laden with insightful rules of thumb.¹ On page 20, for instance, he advises

    In scheduling a software task, allow 1/3 of the time for planning, 1/6 for coding, 1/4 for component test and early system test, and 1/4 for system test, all components in hand.

His discussion of prototypes on page 116 warns us to

    Plan to throw one away; you will, anyhow.

On page 121 he observes that

    The total cost of maintaining a widely used program is typically 40 percent or more of the cost of developing it.

And on page 179, he attributes the following rule to Vic Vyssotsky.

    A large software project can sustain a manpower buildup of 30 percent per year.

¹ Too many programmers who have spent a delightful evening with this book were so charmed by its easy reading that they failed to appreciate its wealth of factual material. If you’re in that category, go back and study the book with pencil in hand.

Many of Butler Lampson’s “Hints for Computer System Design” in IEEE Software 1, 1 (January 1984) are handy rules of thumb:

    Handle normal and worst cases separately.

    When in doubt, use brute force.

    In allocating resources, strive to avoid disaster rather than to attain an optimum.

These examples illustrate the kind of rules I’m looking for: brief statements (usually just a sentence or two) that are general enough to be broadly applicable yet specific enough to give real insight. “Be virtuous” is too general, while “Use a BXLE to do a compare and branch in a single IBM System/360 instruction” is too particular. On the other hand, the following rule from Guy Steele is just right when you’re trying to squeeze the last microsecond out of assembly code:

    Most computer architectures have a loop operation that does a compare and branch in a single machine instruction: although it was intended for loops, it can sometimes be used to do a general comparison very efficiently.

Please send me your favorite rules of thumb; jotting them down on a postcard will be convenient for both of us and will encourage you to keep them short. (For a collection of rules from everyday life, see Tom Parker’s Rules of Thumb, published by Houghton Mifflin in 1983.)

Debugging

The steps in debugging a program range from designing tests that will flush out the little critters to repairing the broken pieces. In this section we’ll focus on just one small part of the job: after we observe weird behavior, how do we identify the culprit who is causing the problem?

Rick Lemons of Cardinal Data Systems is the best debugger I’ve ever seen. Programmers describe to him a bug that they’ve been chasing for hours, Rick asks three or four questions, and three minutes later the programmer is pointing at the faulty code. His secret is his attitude: he never forgets that there has to be a logical
explanation, no matter how mysterious the behavior may seem at the time.

That attitude is illustrated in an anecdote from IBM’s Yorktown Heights Research Center. When a programmer used his new computer terminal, all was fine when he was sitting down, but he couldn’t log in to the system when he was standing up. That behavior was 100 percent repeatable: he could always log in when sitting and never when standing.

Most of us just sit back and marvel at such a story; how could that terminal know whether the poor guy was sitting or standing? Good debuggers, though, know that there has to be a reason. Electrical theories are the easiest to hypothesize: was there a loose wire under the carpet, or problems with static electricity? But electrical problems are rarely consistently reproducible. An alert IBMer finally noticed that the problem was in the terminal’s keyboard: the tops of two keys were switched. When the programmer was seated he was a touch typist and the problem went unnoticed, but when he stood he was led astray by hunting and pecking.

At an ACM Chapter meeting in Chicago, I heard the story of a banking system that had worked for quite some time, but halted the first time it was used on international data. Programmers spent days scouring the code, but they couldn’t find any stray command that would return control to the operating system. When they observed the behavior more closely, they found that the problem occurred as they entered data for the country of Ecuador: when the user typed the name of the capital city (Quito), the program interpreted that as a request to quit the run!

In both cases the right questions would have guided a wise programmer to the bug in short order: “What do you do differently sitting and standing? May I watch you logging in each way?” “Precisely what did you type before the program returned to the operating system?” I’m in the market for similar short stories on debugging that show the rewards of logical thinking.

Let’s Look at the Data

Statisticians often quote the rule of thumb that 20 percent of the population of the United States accounts for 80 percent of the beer consumed. The same rule still applies to the active 20 percent: four percent of the population accounts for 64 percent of the beer, and so on. This rule tells breweries to concentrate their marketing efforts on the high-volume customers.

Similar rules have been observed for various aspects of computing systems, with “80-20” sometimes replaced by “90-10” or “95-5.” A few such rules are summarized in Table I; they encourage programmers to focus their efforts to get maximal returns in functionality, performance, or robustness.

TABLE I. A Sampler of 80-20 (or 90-10) Rules

    20 (or 10) percent of the     account for 80 (or 90) percent of the
    lines of source code          run time of the program
    options in a system           functionality users exercise
    words in a language           words in a document
    transaction types             transactions processed
    records in a file             transactions on the file
    files in a system             files read by users

To locate the active 20 (or 10 or five) percent of a program, a programmer must usually gather data on how it is being used; a priori guesses are notoriously far from the truth. Don Knuth’s “Empirical Study of FORTRAN Programs” was published in Software - Practice and Experience in 1971. He found that “less than four percent of a program generally accounts for more than half of its running time,” which is quite close to an 80-20 rule. Programmers should therefore focus their attack on run-time efficiency on the inner loops: optimizing the critical four percent of the code could halve the run time, while changing any other code will have little effect.

TABLE II. Knuth’s Data on FORTRAN Programs (percent of source lines)

    Statement     Lockheed     Stanford
    Assignment    (illegible)  51.0
    IF            14.5          8.5
    GOTO          13.0          8.0
    CALL           8.0          4.0
    CONTINUE       5.0          3.0
    WRITE          4.0          5.0
    FORMAT         4.0          4.0
    DO             4.0          5.0
    Total         93.0%        88.5%

FIGURE 1. Knuth’s Data on FORTRAN Programs
[Chart: percent of source lines (log scale, 2 to 64) for each statement type in Table II, plotted separately for Lockheed (93.0%) and Stanford (88.5%).]

The above statistics deal with the dynamic behavior of a program; Knuth also gathered data on the source text. He studied over 250,000 lines of FORTRAN from the Lockheed Missiles and Space Corporation and 11,000 lines from Stanford University. Although FORTRAN has 40 different statement types, Knuth found that the four most popular statement types account for over 70 percent of the lines of source code. Data on statements used in more than three percent of the source lines is contained in Table II and Figure 1 (which present the same data in different forms). The
data is useful as compiler writers attempt to decrease compilation time.

Figure 2 combines graphical and tabular methods to display data from Ritchie and Thompson’s “UNIX* Time-Sharing System,” which appeared in Communications 17, 7 (July 1974). The table lists all commands that account for more than two percent of either CPU time (useful for reducing run time) or command invocations (useful for increasing ease of use). In both categories fewer than 10 commands accounted for over half the usage.

*UNIX is a trademark of AT&T Bell Laboratories

I experienced the value of looking at data when I was a member of the VLSI project at Carnegie-Mellon University. Dorothea Haken, Bob Hon, and I had developed several fancy algorithms for use in VLSI design systems. To decide which one to implement, we collected data about the size, shape, and placement of the geometric objects on 15 VLSI chips. We were surprised to find that the simplest algorithm was also the most efficient: the real designs didn’t display the perverse behavior that the sophisticated methods took great pains to handle efficiently. The time invested in studying the input yielded substantial dividends in simpler programs that were faster to implement and to execute.

A lot of data has been published in problem areas such as programming languages, computer architecture, and English text. In other areas, though, programmers are forced to code solutions before they have enough data to know what the problems really are. I’d appreciate hearing about statistical investigations of understudied problem domains.

Problem Definition

Problem 9 in the December column was contributed by Michael Shamos of UNILOGIC, Ltd.

    A promotional game consists of a card containing 10 spots, which hide a random permutation of the integers 1..10. The player rubs the dots off the card to expose the hidden integers. If the integer three is ever exposed then the card loses; if one and two (in either order) are revealed then the card wins. Describe the steps you would take to compute the probability that randomly choosing a sequence of spots wins the game; assume that you may use at most one hour of CPU time.

I gave this problem, exactly as stated, on a take-home examination in a course on “Applied Algorithm Design.” Several students described methods that could compute the answer in just a few minutes of CPU time; they were upset when their answers received zero points. The response “I’d talk to my statistics professor” was worth five points, and a perfect 10-point answer went like this.

    The numbers 4..10 have no impact on the game, so they can be ignored. The card wins if 1 and 2 are chosen (in either order) before 3. This happens when 3 is chosen last, which occurs one time out of three. The probability that a random sequence wins is therefore precisely 1/3.

The students who received zero points worked harder, not smarter: they allowed a minor issue in the problem specification to hide an elegant solution. They overlooked a four-sentence answer (which many students found in just a few minutes) and instead pursued an approach that required hours to formulate and would have required days to code. I hope that the experience of making this mistake on an examination might help them avoid making similar mistakes in later life.

Previous columns have described case studies in which finding the right problem to solve was the hardest part of the programmer’s job (see especially the August 1983 and December 1984 columns). I’d like to collect more stories in which careful problem definition paid off.

Principles

“Don’t get mad, get even” is another piece of old (but usually bad) advice. It does nicely summarize, though, the theme of this column. The next time you’re totally outclassed by a fellow programmer (or even by one of your own brilliant insights), don’t get mad. Instead, study the act of cleverness and try to extract an underlying principle. If you succeed you won’t get mad, you’ll get even smarter.
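The 1/3 answer in the Problem Definition section can be sanity-checked by brute force on small cards. The Python sketch below is only a check of the four-sentence argument, not the solution the exam asked for; the names `card_wins` and `win_probability` are illustrative and do not come from the column. It enumerates every reveal order for cards with 3, 5, and 7 spots, on the assumption that each order is equally likely.

```python
from fractions import Fraction
from itertools import permutations


def card_wins(order):
    """Reveal values in the given order: lose on 3, win once 1 and 2 are both seen."""
    seen = set()
    for value in order:
        if value == 3:
            return False
        seen.add(value)
        if seen.issuperset({1, 2}):
            return True
    return False  # only reached if the order lacks 1, 2, or 3


def win_probability(spots):
    """Exact win probability over all equally likely reveal orders of 1..spots."""
    outcomes = [card_wins(order) for order in permutations(range(1, spots + 1))]
    return Fraction(sum(outcomes), len(outcomes))


for spots in (3, 5, 7):
    print(spots, win_probability(spots))
```

Each card size gives the same fraction, which illustrates the point that the integers above 3 have no effect on the game.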

FIGURE 2. Ritchie and Thompson’s Data on the UNIX Operating System
[Two bar-chart panels with scales from 0 to 15 percent: Percent of CPU Usage (top 8 account for 62.7%) and Percent of Command Accesses (top 9 account for 59.1%). Commands shown: Editor, List directory, C Compiler, Remove file, User’s Programs, Print file, List users logged on, Rename/move file, File status.]
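The arithmetic behind “Let’s Look at the Data” is easy to reproduce. The sketch below, in Python with illustrative names (`nested_rule`, `top_share`), first applies the 80-20 rule to its own active group, then sums the most common statement types in the Stanford column of Table II. The Stanford figures are typed in as best read from the table, so treat them as an assumption; they do add up to the printed 88.5% total.

```python
from fractions import Fraction


def nested_rule(levels, active=Fraction(1, 5), share=Fraction(4, 5)):
    """Apply an 80-20 rule to its own active group `levels` times.

    Returns (fraction of the population, fraction of total consumption).
    """
    return active ** levels, share ** levels


def top_share(percentages, k):
    """Combined share of the k largest entries."""
    return sum(sorted(percentages.values(), reverse=True)[:k])


# Stanford column of Table II (percent of source lines), as read from the text.
stanford = {
    "Assignment": 51.0, "IF": 8.5, "GOTO": 8.0, "CALL": 4.0,
    "CONTINUE": 3.0, "WRITE": 5.0, "FORMAT": 4.0, "DO": 5.0,
}

population, beer = nested_rule(2)
print(f"{float(population):.0%} of the population, {float(beer):.0%} of the beer")
print(f"top four statement types: {top_share(stanford, 4)}% of source lines")
```

The same `top_share` helper works on any tally of counts or percentages, such as command invocations gathered from a running system.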


Solutions for December’s Problems

Several readers communicated several different algorithms for generating a sorted list of M distinct integers in the range 1..N in O(M) time and constant space. For a discussion of such algorithms, see J. S. Vitter’s “Faster Methods for Random Sampling” in Communications 27, 7 (July 1984, pp. 703-718).

1. To select M integers from the range 1..N, choose the number I at random in the range, and then report the numbers I, I + 1, ..., I + M - 1 (possibly wrapping around to 1). This method chooses each integer with probability M/N, but is strongly biased towards certain subsets.

2. When fewer than N/2 integers have been selected so far, the probability that a randomly chosen integer is unselected is greater than 1/2. That the average number of draws to get an unselected integer is less than two follows from the logic that one must toss a coin twice, on the average, to get heads.

3. Let’s view the set S in Program 2 as a collection of N initially empty urns. Each call to RandInt selects an urn into which we throw a ball; if it was previously occupied, the Member test is true. The number of balls required to ensure that each urn contains at least one ball is known to statisticians as the “Coupon Collector’s Problem” (how many baseball cards must I collect to make sure I have all N?); the answer is roughly N ln N. The algorithm makes M tests when all the balls go into different urns; determining when there are likely to be two balls in one urn is the “Birthday Paradox” (in any group of 23 or more people, two are likely to share a birthday). In general, two balls are likely to share one of N urns if there are at least O(√N) balls.

6. To print the values in increasing order one can place the print statement after the recursive call.

7. To print distinct integers in random order, print each one as it is first generated. To print duplicate integers in sorted order, remove the test of whether the integer is already in the set. To print duplicate integers in random order, use the trivial program

    for I := 1 to M do
        print RandInt(1, N)

9. This solution is discussed in the section on “Problem Definition.”

Further Reading

As important as tricks are, they are no substitute for good engineering. Structure and Interpretation of Computer Programs by Abelson and Sussman was published jointly by MIT Press and McGraw-Hill in 1985; it is designed for a first course in computer science. The book’s approach is nicely summarized in the preface.

    The techniques we teach and draw upon are common to all of engineering design. We control complexity by building abstractions that hide details when appropriate. We control complexity by establishing conventional interfaces that enable us to construct systems by combining standard, well-understood pieces in a “mix and match” way. We control complexity by establishing new languages for describing a design, each of which emphasizes particular aspects of the design and deemphasizes others.

The authors maintain a fine balance of building interesting programs and teaching rock-solid engineering principles.

My only problem with this book is the fact that it is far too fascinating for a first course in computing. I tried to browse it, but I was immediately sucked into reading every wonderful word. Be warned: don’t open a copy of this book until you are ready for a programming feast.

For Correspondence: Jon Bentley, AT&T Bell Laboratories, Room 2C-317, 600 Mountain Avenue, Murray Hill, NJ 07974.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
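The two urn estimates in Solution 3 can be checked numerically. The Python sketch below uses standard closed forms rather than anything given in the column: N times the N-th harmonic number for the expected number of balls needed to occupy every urn (which is N ln N plus lower-order terms), and a running product for the chance that all balls land in distinct urns. The function names are illustrative.

```python
import math


def expected_draws_to_fill(n):
    """Expected number of balls until all n urns are occupied: n * H_n."""
    return n * sum(1.0 / k for k in range(1, n + 1))


def collision_probability(balls, n):
    """Probability that at least one of n urns receives two or more balls."""
    p_distinct = 1.0
    for i in range(balls):
        p_distinct *= (n - i) / n
    return 1.0 - p_distinct


def balls_for_likely_collision(n):
    """Smallest number of balls that makes a shared urn more likely than not."""
    balls = 1
    while collision_probability(balls, n) <= 0.5:
        balls += 1
    return balls


for n in (365, 10_000):
    print(n,
          round(expected_draws_to_fill(n)), round(n * math.log(n)),
          balls_for_likely_collision(n), round(math.sqrt(n), 1))
```

For 365 urns the last function returns the familiar 23, and comparing the final two columns shows the collision threshold growing roughly like the square root of N.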
