CS 312 Lecture 26 Debugging Techniques: Bug Prevention and Defensive Programming
Debugging Techniques
Testing a program against a well-chosen set of input tests gives the programmer confidence that the
program is correct. During the testing process, the programmer observes input-output relationships,
that is, the output that the program produces for each input test case. If the program produces the
expected output and obeys the specification for each test case, then the program is successfully
tested.
But if the output for one of the test cases is not the one expected, then the program is incorrect -- it
contains errors (or defects, or "bugs"). In such situations, testing only reveals the presence of errors,
but doesn't tell us what the errors are, or how the code needs to be fixed. In other words, testing
reveals the effects (or symptoms) of errors, not the cause of errors. The programmer must then go
through a debugging process, to identify the causes and fix the errors.
Therefore, the best thing to do is to avoid introducing bugs when you write the program in the first place! It
is important to sit and think before you code: decide exactly what needs to be achieved and how you plan
to accomplish it, design the high-level algorithm cleanly, convince yourself it is correct, and decide which
concrete data structures you plan to use and which invariants you plan to maintain. All
the effort spent in designing and thinking about the code before you write it will pay off later. The
benefits are twofold. First, having a clean design will reduce the probability of defects in your program.
Second, even if a bug shows up during testing, a clean design with clear invariants will make it much
easier to track down and fix the bug.
It may be very tempting to write the program as fast as possible, leaving little or no time to think about
it beforehand. The programmer may be happy to see the program done in a short amount of time, but is
likely to get frustrated shortly afterwards: without careful thought, the program will be complex and unclear,
and maintenance and bug fixing will become an endless process.
Once the programmer starts coding, he should use defensive programming. This is similar to
defensive driving, which means driving under worst-case scenarios (e.g., other drivers violating traffic
laws, unexpected events or obstacles, etc.). Similarly, defensive programming means developing code
such that it works correctly under the worst-case scenarios from its environment. For instance, when
writing a function, one should assume worst-case inputs to that function, i.e., inputs that are too large,
too small, or inputs that violate some property, condition, or invariant; the code should deal with these
cases, even if the programmer doesn't expect them to happen under normal circumstances.
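As an illustration, here is a small SML sketch of a function that checks its own input rather than trusting its callers (the function and exception names are made up for this example):

    exception BadInput of string

    (* average of a list of readings; the empty list is a worst-case input
       that would otherwise cause a division by zero *)
    fun average (xs: real list): real =
        if null xs
        then raise BadInput "average: empty list"
        else foldl (op +) 0.0 xs / real (length xs)

Even if the rest of the program never calls average with an empty list, the check documents the assumption and produces a clear error close to the cause if the assumption is ever violated.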
Remember, the goal is not to become an expert at fixing bugs, but rather to get better at writing robust,
(mostly) error-free programs in the first place. As a matter of attitude, programmers should not feel
proud when they fix bugs, but rather embarrassed that their code had bugs. If there is a bug in the
program, it is only because the programmer made mistakes.
Classes of Defects
Even after careful thought and defensive programming, a program may still have defects. Generally
speaking, there are several kinds of errors one may run into:
Syntax or type errors. These are always caught by the compiler, and reported via error
messages. Typically, an error message clearly indicates the cause of error; for instance, the line
number, the incorrect piece of code, and an explanation. Such messages usually give enough
information about where the problem is and what needs to be done. In addition, editors with
syntax highlighting can give good indication about such errors even before compiling the
program.
Typos and other simple errors that pass undetected by the type-checker or the other
checks in the compiler. Once these are identified, they can easily be fixed. Here are a few
examples: missing parentheses, for instance writing x + y * z instead of (x + y) * z;
typos, for instance case t of ... | x::tl => contains(x,t) where contains(x,tl) was intended;
passing parameters in the incorrect order; or using the wrong element order in tuples.
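To make that typo concrete, here is a sketch of what such a contains function might look like; the recursive call mistakenly uses the whole list t instead of the tail tl, so the recursion never makes progress:

    fun contains (x, t) =
        case t of
            [] => false
          | y::tl => x = y orelse contains (x, t)   (* bug: should be contains (x, tl) *)

The type-checker happily accepts both versions, so only testing (or an infinite loop) reveals the mistake.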
Implementation errors. It may be the case that the logic in the high-level algorithm of a program is
correct, but some low-level, concrete data structures are being manipulated incorrectly, breaking
some internal representation invariants. For instance, a program that maintains a sorted list as
the underlying data structure may break the sorting invariant. Building separate ADTs to model
each data abstraction can help in such cases: it can separate the logic in the algorithm from the
manipulation of concrete structures; in this way, the problem is being isolated in the ADT. Calls to
repOK() can further point out what parts of the ADT cause the error.
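As a sketch of this idea, consider a hypothetical ADT that maintains a sorted list of integers; a repOK function checks the sorting invariant at the boundaries of each operation:

    type sortedList = int list   (* invariant: elements in increasing order *)

    fun sorted [] = true
      | sorted [_] = true
      | sorted (x::y::rest) = x <= y andalso sorted (y::rest)

    fun repOK (l: sortedList): sortedList =
        if sorted l then l else raise Fail "rep invariant violated: list not sorted"

    (* each operation checks the invariant on the way in and on the way out *)
    fun insert (x, l) =
        repOK (case repOK l of
                   [] => [x]
                 | y::rest => if x <= y then x::y::rest else y :: insert (x, rest))

If some other operation accidentally breaks the invariant, the very next call through repOK fails, pointing at the ADT rather than at some distant client code.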
Logical errors. If the algorithm is logically flawed, the programmer must re-think the algorithm.
Fixing such problems is more difficult, especially if the program fails on just a few corner cases.
One has to closely examine the algorithm, and try to come up with an argument why the
algorithm works. Trying to construct such an argument of correctness will probably reveal the
problem. A clean design can help a lot in figuring out and fixing such errors. In fact, in cases where
the algorithm is too difficult to understand, it may be a good idea to redo the algorithm from
scratch and aim for a cleaner formulation.
Difficulties
The debugging process usually consists of the following: examine the error symptoms, identify the
cause, and finally fix the error. This process may be quite difficult and require a large amount of work,
because of the following reasons:
The symptoms may not give clear indications about the cause. In particular, the cause and the
symptom may be remote, either in space (i.e., in the program code), or in time (i.e., during the
execution of the program), or both. Defensive programming can help reduce the distance
between the cause and the effect of an error.
Symptoms may be difficult to reproduce. Replay is needed to better understand the problem.
Being able to reproduce the same program execution is a standard obstacle in debugging
concurrent programs. An error may show up only in one particular interleaving of statements
from the parallel threads, and it may be almost impossible to reproduce that same, exact
interleaving.
Errors may be correlated. Therefore, symptoms may change during debugging, after fixing some
of the errors. The new symptoms need to be re-examined. The good part is that the same error
may have multiple symptoms; in that case, fixing the error will eliminate all of them.
Fixing an error may introduce new errors. Statistics indicate that in many cases fixing a bug
introduces a new one! This is the result of trying to do quick hacks to fix the error, without
understanding the overall design and the invariants that the program is supposed to maintain.
Once again, a clean design and careful thinking can avoid many of these cases.
Debugging strategies
Although there is no precise procedure for fixing all bugs, there are a number of useful strategies that
can reduce the debugging effort. A significant part (if not all) of this process is spent localizing the
error, that is, figuring out the cause from its symptoms. Below are several useful strategies to help with
this. Keep in mind that different techniques are better suited in different cases; there is no clear best
method. It is good to have knowledge and experience with all of these approaches. Sometimes, a
combination of one or more of these approaches will lead you to the error.
Incremental and bottom-up program development. One of the most effective ways to localize
errors is to develop the program incrementally, and test it often, after adding each piece of code.
It is highly likely that if there is an error, it occurs in the last piece of code that you wrote. With
incremental program development, the last portion of code is small; the search for bugs is
therefore limited to small code fragments. An added benefit is that small code increments will
likely lead to few errors, so the programmer is not overwhelmed with long lists of errors.
Instrument program to log information. Typically, print statements are inserted. Although the
printed information is effective in some cases, it can also become difficult to inspect when the
volume of logged information becomes huge. In those cases, automated scripts may be needed
to sift through the data and report the relevant parts in a more compact format. Visualization
tools can also help in understanding the printed data. For instance, to debug a program that
manipulates graphs, it may be useful to use a graph visualization tool (such as AT&T's graphviz)
and print information in the appropriate format (.dot files for graphviz).
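For example, a minimal print-based logging helper in SML might look like the following (the debug flag and the merge function are illustrative, not part of any library used in the course):

    val debug = ref true

    fun log msg = if !debug then print ("[debug] " ^ msg ^ "\n") else ()

    fun merge (xs: int list, ys: int list): int list =
        (log ("merge: " ^ Int.toString (length xs) ^ " and "
              ^ Int.toString (length ys) ^ " elements");
         case (xs, ys) of
             ([], _) => ys
           | (_, []) => xs
           | (x::xs', y::ys') =>
                 if x <= y then x :: merge (xs', ys) else y :: merge (xs, ys'))

Setting debug to false silences the output without deleting the instrumentation.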
Instrument program with assertions. Assertions check if the program indeed maintains the
properties or invariants that your code relies on. Because the program stops as soon as an
assertion fails, it's likely that the point where the program stops is much closer to the cause, and
is a good indicator of what the problem is. An example of assertion checking is the repOK()
function that verifies if the representation invariant holds at function boundaries. Note that
checking invariants or conditions is the basis of defensive programming. The difference is that
the number of checks is usually increased during debugging for those parts of the program that
are suspected to contain errors.
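SML has no built-in assert, but a two-line helper is enough; here is a sketch (removeMin is a made-up example function):

    fun assert (cond, msg) =
        if cond then () else raise Fail ("assertion failed: " ^ msg)

    (* removes the minimum element (and any duplicates of it) from a list *)
    fun removeMin (l: int list): int * int list =
        let
            val _ = assert (not (null l), "removeMin: empty list")
            val m = foldl Int.min (hd l) (tl l)
            val rest = List.filter (fn x => x <> m) l
            val _ = assert (length rest < length l, "removeMin: nothing removed")
        in
            (m, rest)
        end

During debugging, more assertions like these can be added temporarily in the suspect parts of the program and removed (or disabled) once the bug is found.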
Use debuggers. If a debugger is available, it can replace the manual instrumentation using print
statements or assertions. Setting breakpoints in the program, stepping into and over functions,
watching program expressions, and inspecting the memory contents at selected points during
the execution will give all the needed run-time information without generating large, hard-to-read
log files.
Backtracking. One option is to start from the point where the problem occurred and go back
through the code to see how that might have happened.
Binary search. The backtracking approach will fail if the error is far from the symptom. A better
approach is to explore the code using a divide-and-conquer approach, to quickly pin down the
bug. For example, starting from a large piece of code, place a check halfway through the code. If
the error doesn't show up at that point, it means the bug occurs in the second half; otherwise, it
is in the first half. Thus, the code that needs inspection has been reduced to half. Repeating the
process a few times will quickly lead to the actual problem.
Problem simplification. A similar approach is to gradually eliminate portions of the code that
are not relevant to the bug. For instance, if a function fun f() = (g();h();k()) yields an
error, try eliminating the calls to g, h, and k successively (by commenting them out), to determine
which is the erroneous one. Then simplify the code in the body of the buggy function, and so on.
Continuing this process, the code gets simpler and simpler. The bug will eventually become
evident. A similar technique can be applied to simplify data rather than code. If the size of the
input data is too large, repeatedly cut parts of it and check if the bug is still present. When the
data set is small enough, the cause may be easier to understand.
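For the fun f() = (g();h();k()) example above, the simplification might proceed as in this sketch, where g, h, and k are placeholders for the real functions in the buggy program:

    fun g () = print "g\n"          (* placeholders for the real code *)
    fun h () = print "h\n"
    fun k () = print "k\n"

    fun f () = (g (); h (); k ())           (* original: exhibits the error *)

    fun f1 () = (g (); h () (* ; k () *))   (* does the error survive without k? *)
    fun f2 () = (g () (* ; h (); k () *))   (* ... and without h as well? *)

Whichever removal makes the symptom disappear points to the call that triggers the bug; that call can then be simplified further in the same way.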
A scientific method: form hypotheses. A related approach is as follows: inspect the test case
results; form a hypothesis that is consistent with the observed data; and then design and run a
simple test to refute the hypothesis. If the hypothesis has been refuted, derive another
hypothesis and continue the process. In some sense, this is also a simplification process: it
reduces the number of possible hypotheses at each step. But unlike the above simplification
techniques, which are mostly mechanical, this process is driven by active thinking about an
explanation. A good approach is to try to come up with the simplest hypotheses and the simplest
corresponding test cases.
Bug clustering. If a large number of errors are being reported, it is useful to group them into
classes of related bugs (or similar bugs), and examine only one bug from each class. The
intuition is that bugs from each class have the same cause (or a similar cause). Therefore, fixing
a bug will automatically fix all the other bugs from the same class (or will make it obvious how to
fix them).
Error-detection tools. Such tools can help programmers quickly identify violations of certain
classes of errors. For instance, tools that check safety properties can verify that file accesses in
a program obey the open-read/write-close file sequence; that the code correctly manipulates
locks; or that the program always accesses valid memory. Such tools are either dynamic (they
instrument the program to find errors at run-time), or use static analysis (look for errors at
compile-time). For instance, Purify is a popular dynamic tool that instruments programs to
identify memory errors, such as invalid accesses or memory leaks. Examples of static tools
include ESC/Java and Spec#, which use theorem-proving approaches to check more general
user specifications (pre- and post-conditions, or invariants); or tools from the company
Coverity, which use dataflow analysis to detect violations of safety properties. Such tools can
dramatically increase productivity, but checking is restricted to a particular domain or class of
properties. There is also an associated learning curve, although that is usually low. Currently,
there are relatively few such tools and this is more an (active) area of research.
A number of other strategies can be viewed as a matter of attitude about where to expect the errors:
The bug may not be where you expect it. If a large amount of time has unsuccessfully been
spent inspecting a particular piece of code, the error may not be there. Keep an open mind and
start questioning the other parts of the program.
Ask yourself where the bug is not. Sometimes, looking at the problem upside-down gives a
different perspective. Often, trying to prove the absence of a bug in a certain place actually
reveals the bug in that place.
Explain to yourself or to somebody else why you believe there is no bug. Trying to articulate the
problem can lead to the discovery of the bug.
Inspect input data, test harness. The test case or the test harness itself may be broken. One has
to check these carefully, and make sure that the bug is in the actual program.
Make sure you have the right source code. One must ensure that the source code being
debugged corresponds to the actual program being run, and that the correct libraries are linked.
This usually requires a few simple checks; using makefiles and make programs (e.g., the
Compilation Manager and .cm files) can reduce this to just typing a single command.
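For instance, with SML/NJ's Compilation Manager, a small project might be described by a sources.cm file such as the sketch below (the file names are illustrative):

    Group is
        $/basis.cm
        queue.sml
        main.sml

Running CM.make "sources.cm" at the interactive prompt then recompiles exactly these files, so the code being run is the code being read.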
Take a break. If too much time is spent on a bug, the programmer becomes tired and debugging
may become counterproductive. Take a break, clear your mind; after some rest, try to think about
the problem from a different perspective.
All of the above are techniques for localizing errors. Once they have been identified, errors need to be
corrected. In some cases, this is trivial (e.g., for typos and simple errors). In other cases, the fix may
be fairly straightforward, but the change must maintain certain invariants. The programmer
must think carefully about how the fix is going to affect the rest of the code, and make sure no additional
problems are created by fixing the error. Of course, proper documentation of these invariants is
needed. Finally, bugs that represent conceptual errors in an algorithm are the most difficult to fix. The
programmer must re-think and fix the logic of the algorithm.