David Wagner CS 161 Computer Security Notes
2 Stack smashing
One powerful method for exploiting buffer overrun vulnerabilities takes advantage of the way
local variables are laid out on the stack.
We need to review some background material first. Recall C's memory layout: the program's code and static data occupy low addresses, the heap grows upward above them, and the stack grows downward from high addresses. Each function call pushes a new stack frame onto the stack, holding the function's local variables along with the saved stack pointer (SP) and the return address.
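As a running example, suppose the program contains a function along the following lines (a minimal sketch, assuming an 80-byte buffer filled by a call to gets(); this matches the 88-byte input discussed below):

#include <stdio.h>

void vulnerable(void) {
    char buf[80];
    gets(buf);   /* BAD: gets() performs no bounds checking on buf */
}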
If the input is too long, the code will write past the end of buf and the saved SP and return
address will be overwritten. This is a stack smashing attack.
Stack smashing can be used for malicious code injection. First, the attacker arranges to
infiltrate a malicious code sequence somewhere in the program’s address space, at a known
address (perhaps using techniques previously mentioned). Next, the attacker provides a
carefully-chosen 88-byte input, where the last four bytes hold the address of the malicious
code.2 The gets() call will overwrite the return address on the stack with the last 4 bytes
of the input—in other words, with the address of the malicious code. When vulnerable()
returns, the CPU will retrieve the return address stored on the stack and transfer control to
that address, handing control over to the attacker’s malicious code.
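To make the 88-byte input concrete, an attacker might generate it with a small helper program like the following (a sketch only; it assumes a little-endian 32-bit target, and 0xdeadbeef stands in for the hypothetical address of the malicious code):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    unsigned char payload[88];
    uint32_t addr = 0xdeadbeef;       /* hypothetical address of the injected code */
    memset(payload, 'A', 84);         /* filler: covers buf (80 bytes) and the saved SP (4 bytes) */
    memcpy(payload + 84, &addr, 4);   /* last 4 bytes land on the saved return address */
    fwrite(payload, 1, sizeof payload, stdout);
    return 0;
}

Feeding this output to the vulnerable program on its standard input causes gets() to copy all 88 bytes into the 80-byte buffer.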
The discussion above has barely scratched the surface of techniques for exploiting buffer
overrun bugs. Stack smashing was first introduced in 1996 (see “Smashing the Stack for
Fun and Profit” by Aleph One). Modern methods are considerably more sophisticated and
powerful. These attacks may seem esoteric, but attackers have become highly skilled at exploiting them.
Footnote 2: In this example, I am assuming a 32-bit architecture, so that SP and the return address are 4 bytes long. On a 64-bit architecture, SP and the return address would be 8 bytes long, and the attacker would need to provide a 96-byte input whose last 8 bytes were chosen carefully to contain the address of the malicious code.
Footnote 5: In an earlier draft of these notes, the conditional was written as “if (len > sizeof buf)”. A student astutely observed that that code would actually be okay, because sizeof buf is of type unsigned, and the rules of C would first convert len to also be unsigned. So a negative value of len would instead be treated as a very large unsigned value, and the test would succeed, leading to printing of the error message and a return before the call to memcpy. In the above version, we instead use the constant 80, which C treats as signed, so a negative value of len will sneak past the test.
Consider the following code:

void process_input(int fd) {    /* hypothetical signature, added for context */
    size_t len;
    char *buf;
    len = read_int_from_network();
    buf = malloc(len+5);
    read(fd, buf, len);
    ...
}
This code seems to avoid buffer overflow problems (indeed, it allocates 5 more bytes than
necessary). But, there is a subtle problem: len+5 can wrap around if len is too large. For
instance, if len = 0xFFFFFFFF, then the value of len+5 is 4 (on 32-bit platforms). In this
case, the code allocates a 4-byte buffer and then writes a lot more than 4 bytes into it: a
classic buffer overflow. You have to know the semantics of your programming language very
well to avoid all the pitfalls.
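One way to avoid this particular pitfall is to check for wraparound before allocating. A minimal sketch (reusing the hypothetical read_int_from_network() from the example above):

#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

size_t read_int_from_network(void);   /* assumed to be provided elsewhere, as above */

void process_input_safely(int fd) {   /* hypothetical name */
    size_t len = read_int_from_network();
    if (len > SIZE_MAX - 5)           /* len+5 would wrap around; reject the request */
        return;
    char *buf = malloc(len + 5);
    if (buf == NULL)                  /* allocation can still fail for very large len */
        return;
    read(fd, buf, len);
    /* ... process buf ... */
    free(buf);
}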
5 Memory safety
Buffer overflows, format string vulnerabilities, and the other examples above are all instances of memory safety
bugs: cases where an attacker can read or write beyond the valid range of a memory region.
Other examples of memory safety violations include using a dangling pointer (a pointer into
a memory region that has been freed and is no longer valid) and double-free bugs (where
a dynamically allocated object is explicitly freed multiple times). C and C++ rely upon
the programmer to preserve memory safety, but bugs in the code can lead to violations of
memory safety. History has taught us that memory safety violations often enable malicious
code injection and other kinds of attacks.
Some modern languages are designed to be intrinsically memory-safe, no matter what the
programmer does. Java is one example. Thus, memory-safe languages eliminate the opportunity
for one kind of programming mistake that has been known to cause serious security problems.
In general, before performing any potentially unsafe operation, we can write some code to
check (at runtime) whether the operation is safe to perform and abort if not. For instance,
instead of
char digit_to_char(int i) { // BAD
    char convert[] = "0123456789";
    return convert[i];
}
we can write
char digit_to_char(int i) { // BETTER
    char convert[] = "0123456789";
    if (i < 0 || i > 9)
        return '?'; // or, call exit()
    return convert[i];
}
This code ensures that the array access will be within bounds. Similarly, when calling library
functions, we can use a library function that incorporates these kinds of checks, rather than
one that does not. Instead of
char buf[512];
strcpy(buf, src); // BAD
we can write
char buf[512];
strlcpy(buf, src, sizeof buf); // BETTER
Languages and libraries can help avoid memory-safety vulnerabilities by eliminating the
opportunity for programmer mistakes. For instance, Java performs automatic bounds-checking
on every array access, so programmer error cannot lead to an array bounds violation. Also,
Java provides a String class with methods for many common string operations. Importantly,
every method on String is memory-safe: the method itself performs all necessary runtime
checks and resizes all buffers as needed to ensure there is enough space for strings. Similarly,
C++ provides a safe string class; using these libraries, instead of C’s standard library, reduces
the likelihood of buffer overflow bugs in string manipulation code.
Compilers and other tools can reduce the burden on programmers by automatically introducing
runtime checks at every potentially unsafe operation, so that programmers do not have to do
so explicitly. For instance, there has been a great deal of research on augmenting C compilers
so they automatically emit a bounds check at every array or pointer access.
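Conceptually, such a compiler rewrites each array access into a guarded access, roughly as follows (a sketch of the idea, not the output of any particular tool):

#include <stdlib.h>

int table[10];

int lookup(int i) {
    /* The check below is what the compiler inserts automatically;
       the programmer only writes "return table[i];". */
    if (i < 0 || i >= 10)
        abort();   /* runtime check fails: stop before the unsafe access */
    return table[i];
}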
One challenge is that automatic bounds-checking for C/C++ has a non-trivial performance
overhead. Even after decades of research, the best techniques still slow down computationally-
intensive code by 10%-150% or so, though the good news is that for I/O-bound code the
overheads can be smaller.6 Because many security-critical network servers are I/O-bound, the performance cost of such checks may be acceptable for them in practice.
Footnote 6: If you’re interested in learning more, you can read a research paper on this subject at http://www.usenix.org/events/sec09/tech/full_papers/akritidis.pdf.
Static analysis is a technique for scanning source code to try to automatically detect potential
bugs. You can think of static analysis as runtime checks, performed at compile time: the
static analysis tool attempts to predict whether there exists any program execution under
which a runtime check would fail, and if it finds any, it warns the programmer. Sophisticated
techniques are needed, and those techniques are beyond the scope of this class, but they build
on ideas from the compilers and programming language literature for automatic program
analysis (e.g., ideas initially developed for compiler optimization).
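For instance, a static analysis tool can warn about code like the following without ever running it (a small illustrative sketch; exact capabilities vary from tool to tool):

void zero_table(void) {
    int a[10];
    /* Off-by-one bug: when i == 10, a[i] writes one element past the end
       of the array. A bounds-checking static analyzer can flag this loop
       at compile time, before the code ever runs. */
    for (int i = 0; i <= 10; i++)
        a[i] = 0;
}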
The advantage of static analysis is that it can detect bugs proactively, at development time, so
that they can be fixed before the code has been shipped. Bugs in deployed code are expensive,
not only because customers don’t like it when they get hacked due to a bug in your code, but
also because fixes require extensive testing to ensure that the fix doesn’t make things worse.
Generally speaking, the earlier a bug is found, the cheaper it can be to fix, which makes static
analysis tools attractive.
One challenge with static analysis tools is that they make errors. This is fundamental:
detecting security bugs can be shown to be undecidable (like the Halting Problem), so it
follows that any static analysis tool will either miss some bugs (false negatives), or falsely
warn about code that is correct (false positives), or both. In practice, the effectiveness of a
static analysis tool is determined by its false negative rate and false positive rate; these two
can often be traded off against each other. At one extreme are verification tools, which are
guaranteed to be free of false negatives: if your code does not trigger any warnings, then it is
guaranteed to be free of bugs (at least, of the sort of bugs that the tool attempts to detect).
Footnote 7: For those interested in reading more, you can read about ASLR and the NX bit.
6.5 Testing
Another way to find security bugs proactively is by testing your code. A challenge with testing
for security, as opposed to testing for functionality, is that security is a negative property: we need to
prove that nothing bad happens, even in unusual circumstances; whereas standard testing
focuses on ensuring that something good does happen, under normal circumstances. It is a
lot easier to define test cases that reflect normal, expected inputs and check that the desired
behavior does occur, than to define test cases that represent the kinds of unusual inputs an
attacker might provide or to detect things that are not supposed to happen.
Generally, testing for security has two aspects:
1. Test generation. We need to find a way to generate test cases, so that we can run the
program on those test cases.
2. Bug detection. We need a way to detect whether a particular test case revealed a bug
in the program.
Fuzz testing is one simple form of security testing. Fuzz testing involves testing the program
with random inputs and seeing if the program exhibits any sign of failure. Generally, the
bug detection strategy is to check whether the program crashes (or throws an unexpected
exception). For greater bug detection power, we can enable runtime checks (e.g., automatic
array bounds-checking) and see whether any of the test cases triggers a failure of some runtime
check. There are three different approaches to test generation that are commonly used during fuzz testing:
• Random inputs. Construct a random input file, and run the program on that input. The
file is constructed by choosing a totally random sequence of bytes, with no structure.
• Mutated inputs. Start with a valid input file, randomly modify a few bits in the file,
and run the program on the mutated input (a small sketch of this approach appears after this list).
• Structure-driven input generation. Taking into account the intended format of the input,
devise a program to independently “fuzz” each field of the input file. For instance, if
we know that one part of the input is a string, generate random strings (of random
lengths, with random characters, some of them with % signs to try to trigger format
string bugs, some with funny Unicode characters, etc.). If another part of the input is a
length, try random integers, try a very small number, try a very large number, try a
negative number (or an integer whose binary representation has its high bit set).
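As promised above, here is a minimal mutation fuzzer (a sketch only; the mutation rate, file handling, and names are arbitrary choices). It copies a valid input file, flipping one random bit in roughly one byte out of every thousand; the mutated file is then fed to the program under test.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s valid-input mutated-output\n", argv[0]);
        return 1;
    }
    FILE *in = fopen(argv[1], "rb");
    FILE *out = fopen(argv[2], "wb");
    if (in == NULL || out == NULL)
        return 1;
    srand((unsigned) time(NULL));
    int c;
    while ((c = fgetc(in)) != EOF) {
        if (rand() % 1000 == 0)
            c ^= 1 << (rand() % 8);   /* flip one random bit in this byte */
        fputc(c, out);
    }
    fclose(in);
    fclose(out);
    return 0;
}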
One shortcoming of purely random inputs is that, if the input has a structured format, then
it is likely that a random input file will not have the proper format and thus will be quickly
rejected by the program, leaving much of the code uncovered and untested. The other two
approaches address this problem. Generally speaking, mutating a corpus of valid files is the simpler of the two, since it requires little knowledge of the input format, while structure-driven generation takes more effort but can exercise the program more thoroughly.
Footnote 8: For those interested in learning more, check out Real World Fuzzing, http://pages.cs.wisc.edu/~rist/642-fall-2012/toorcon.pdf. If you like to try things out, you could experiment with zzuf, an easy-to-use mutation fuzzer for Linux: http://caca.zoy.org/wiki/zzuf. For instance, on Linux you can have some fun with a command like zzuf -v -s 1:1000 valgrind -q --leak-check=no --error-exitcode=1 unzip -o foo.zip; look for error messages from Valgrind, prefixed with ==NNNNN==.