Buflab cs3214 Fall12
This project was developed by Randy Bryant; it was adapted for Virginia Tech by Godmar Back ([email protected]).
Minimum requirement: To obtain a passing grade in CS 3214, we expect that, by the end of the semester, your group will have succeeded in igniting at least Level 2, the Firecracker phase of the bomb.
Introduction
This assignment will help you develop a detailed understanding of IA-32 calling conventions and stack organization. It involves applying a series of buffer overflow attacks on an executable file bufbomb provided to you.

Note: In this project, you will gain firsthand experience with one of the methods commonly used to exploit security weaknesses in operating systems and network servers. Our purpose is to help you learn about the runtime operation of programs and to understand the nature of this form of security weakness so that you can avoid it when you write system or application code. We do not condone the use of this or any other form of attack to gain unauthorized access to any system resources. There are criminal statutes governing such activities.
Logistics
You may work in a group of up to two people in solving the problems for this project. The only hand-in will be an automated logging of your successful attacks. Any clarifications and revisions to the project will be posted on the course web page. We generated the buffer bomb executable using gcc's -m32 flag, so all code produced by the compiler follows IA-32 rules, even if the host is an x86-64 system.
In four of your five buffer attacks, your objective will be to make your cookie show up in places where it ordinarily would not.
The function Gets is similar to the (long-deprecated) standard library function gets: it reads a string from standard input (terminated by \n or end-of-file) and stores it, along with a null terminator, at the specified destination. In this code, the destination is an array buf with sufficient space for 32 characters. However, Gets (like gets) has no way of determining whether buf is large enough to store the whole input. It simply copies the entire input string, possibly overrunning the bounds of the storage allocated at the destination. If the string typed by the user to getbuf is no more than 31 characters long, getbuf will return 1, as shown by the following execution example:
unix> ./bufbomb -u rnikola+butta
Type string: I love CS3214.
Dud: getbuf returned 0x1
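To make the overrun concrete, here is a minimal Python sketch that models a stack frame as a byte array. The layout (a 32-byte buf immediately followed by a saved %ebp and a return address) and both stored values are assumptions made for illustration; the real placement of buf depends on the compiler. The function simulated_gets is a hypothetical stand-in for Gets, not its actual source.

```python
import struct

BUF_SIZE = 32        # size of buf, as described in the text
RETURN_OFFSET = 36   # assumed: buf, then saved %ebp (4 bytes), then return address


def make_frame() -> bytearray:
    """Build a toy frame: buf, saved %ebp, return address, some caller data."""
    frame = bytearray(BUF_SIZE)
    frame += struct.pack("<I", 0xFFFFB000)   # made-up saved %ebp
    frame += struct.pack("<I", 0x08048D00)   # made-up return address
    frame += bytes(8)                        # a little of the caller's frame
    return frame


def simulated_gets(frame: bytearray, line: bytes) -> None:
    """Copy line plus a null terminator to offset 0, never checking BUF_SIZE."""
    for i, byte in enumerate(line + b"\x00"):
        frame[i] = byte


def return_address(frame: bytearray) -> int:
    """Read the 4-byte little-endian return address slot of the toy frame."""
    return struct.unpack_from("<I", frame, RETURN_OFFSET)[0]


if __name__ == "__main__":
    short = make_frame()
    simulated_gets(short, b"I love CS3214.")
    print(hex(return_address(short)))   # return address is unchanged

    long = make_frame()
    simulated_gets(long, b"A" * 40)
    print(hex(return_address(long)))    # overwritten with 0x41414141
```

In this model, any input of at most 31 characters leaves the saved %ebp and the return address intact, while longer inputs spill into them.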
Overrunning the buffer typically causes the program state to be corrupted, leading to a memory access error. Your task is to be more clever with the strings you feed BUFBOMB so that it does more interesting things. These are called exploit strings.

BUFBOMB takes several different command line arguments:
  -u teamid: Operate the bomb for the indicated teamid. You should always provide this argument for several reasons:
    - It is required to submit your successful attacks to the grading server.
    - BUFBOMB determines the cookie you will be using based on your teamid, as does the program MAKECOOKIE.
    - We have built features into BUFBOMB so that some of the key stack addresses you will need to use depend on your teamid's cookie.
  -h: Print list of possible command line arguments.
  -n: Operate in Nitro mode, as is used in Level 4 below.
  -s: Submit your solution exploit string to the grading server.

At this point, you should think about the x86 stack structure a bit and figure out what entries of the stack you will be targeting. You may also want to think about exactly why overrunning the buffer can cause a segmentation fault, although this is less clear.

Your exploit strings will typically contain byte values that do not correspond to the ASCII values for printing characters. The program HEX2RAW can help you generate these raw strings. It takes as input a hex-formatted string, in which each byte value is represented by two hex digits. For example, the string 012345 could be entered in hex format as 30 31 32 33 34 35. (Recall that the ASCII code for decimal digit x is 0x3x.)

The hex characters you pass to HEX2RAW should be separated by whitespace (blanks or newlines). I recommend separating different parts of your exploit string with newlines while you're working on it. HEX2RAW also supports C-style block comments, so you can mark off sections of your exploit string. For example:
bf 66 7b 32 78 /* mov $0x78327b66,%edi */
Be sure to leave space around both the starting and ending comment strings (/*, */) so they will be properly ignored.

If you generate a hex-formatted exploit string in the file exploit.txt, you can apply the raw string to BUFBOMB in several different ways:

1. You can set up a series of pipes to pass the string through HEX2RAW.
unix> cat exploit.txt | ./hex2raw | ./bufbomb -u rnikola+butta
2. You can store the raw string in a file and use I/O redirection to supply it to BUFBOMB:
unix> ./hex2raw < exploit.txt > exploit-raw.txt
unix> ./bufbomb -u rnikola+butta < exploit-raw.txt
This approach can also be used when running BUFBOMB from within GDB:
unix> gdb bufbomb
(gdb) run -u rnikola+butta < exploit-raw.txt
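The conversion that HEX2RAW performs can be pictured with a short Python sketch. This is an illustrative stand-in written for this handout, not the HEX2RAW source: the function name hex_to_raw is hypothetical, and the real tool may differ in details such as how strictly it treats comments.

```python
import re


def hex_to_raw(text: str) -> bytes:
    """Turn whitespace-separated two-digit hex values into raw bytes.

    C-style block comments are dropped before the values are parsed.
    """
    # Remove /* ... */ comments; DOTALL lets a comment span several lines.
    without_comments = re.sub(r"/\*.*?\*/", " ", text, flags=re.DOTALL)
    out = bytearray()
    for token in without_comments.split():
        if not re.fullmatch(r"[0-9a-fA-F]{2}", token):
            raise ValueError(f"expected a two-digit hex value, got {token!r}")
        out.append(int(token, 16))
    return bytes(out)


if __name__ == "__main__":
    print(hex_to_raw("30 31 32 33 34 35"))   # the ASCII string 012345
    print(hex_to_raw("bf 66 7b 32 78 /* mov $0x78327b66,%edi */").hex())
```

A single digit such as 0 is rejected by this sketch, which mirrors the requirement to write a zero byte as 00.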
Important points:
- Your exploit string must not contain byte value 0x0A at any intermediate position, since this is the ASCII code for newline (\n). When Gets encounters this byte, it will assume you intended to terminate the string.
- HEX2RAW expects two-digit hex values separated by whitespace. So if you want to create a byte with a hex value of 0, you need to specify 00. To create the word 0xDEADBEEF you should pass EF BE AD DE to HEX2RAW, since a little-endian machine stores the least significant byte first.

When you have correctly solved one of the levels, say level 0:
../hex2raw < smoke-rnikola+butta.txt | ../bufbomb -u rnikola+butta
Userid: rnikola+butta
Cookie: 0x1005b2b7
Type string:Smoke!: You called smoke()
VALID
NICE JOB!
you can then submit your solution to the grading server using the -s option:
./hex2raw < smoke-rnikola+butta.txt | ./bufbomb -u rnikola+butta -s
Userid: rnikola+butta
Cookie: 0x1005b2b7
Type string:Smoke!: You called smoke()
VALID
Sent exploit string to server to be validated.
NICE JOB!
The server will test your exploit string to make sure it really works, and it will update the Buffer Lab scoreboard page indicating that your teamid (listed by your cookie for anonymity) has completed this level. You can view the scoreboard by pointing your Web browser at http://courses.cs.vt.edu/cs3214/fall2012/buflab/buflab-scoreboard.html

Unlike the Bomb Lab, there is no penalty for making mistakes in this project. Feel free to fire away at BUFBOMB with any string you like. Of course, you shouldn't brute force this project either, since it would take longer than you have to do the assignment.

IMPORTANT NOTE: You can work on your buffer bomb on any Linux machine, but in order to submit your solution, you will need to be running on one of the machines that are part of the rlogin.cs.vt.edu cluster.
    val = getbuf();
    /* Check for corrupted stack */
    if (local != 0xdeadbeef) {
        printf("Sabotaged!: the stack has been corrupted\n");
    }
    else if (val == cookie) {
        printf("Boom!: getbuf returned 0x%x\n", val);
        validate(3);
    }
    else {
        printf("Dud: getbuf returned 0x%x\n", val);
    }
}
When getbuf executes its return statement, the program ordinarily resumes execution within function test, at the code that follows the call to getbuf. We want to change this behavior. Within the file bufbomb, there is a function smoke having the following C code:
void smoke()
{
    printf("Smoke!: You called smoke()\n");
    validate(0);
    exit(0);
}
Your task is to get BUFBOMB to execute the code for smoke when getbuf executes its return statement, rather than returning to test. Note that your exploit string may also corrupt parts of the stack not directly related to this stage, but this will not cause a problem, since smoke causes the program to exit directly.

Some Advice:
- All the information you need to devise your exploit string for this level can be determined by examining a disassembled version of BUFBOMB. Use objdump -d to get this disassembled version.
- Be careful about byte ordering.
- You might want to use GDB to step the program through the last few instructions of getbuf to make sure it is doing the right thing.
- The placement of buf within the stack frame for getbuf depends on which version of GCC was used to compile bufbomb, so you will have to read some assembly to figure out its true location.
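The byte-ordering advice can be illustrated with a small Python sketch that lays out filler bytes followed by a replacement return address. Everything specific here is an assumption for illustration: the helper names are hypothetical, and the offset 44 and the address 0x08048b2e are made-up values, not the ones in your bufbomb. You still need to read the disassembly to find the real distance from buf to the saved return address and the real address of smoke.

```python
import struct


def build_exploit(offset_to_return: int, target_addr: int,
                  filler: int = 0x41) -> bytes:
    """Filler up to the saved return address, then the new address.

    struct.pack("<I", ...) emits the address least significant byte
    first, which is the order IA-32 expects in memory.
    """
    payload = bytes([filler]) * offset_to_return + struct.pack("<I", target_addr)
    if 0x0A in payload:
        raise ValueError("payload contains 0x0a, which Gets treats as end of input")
    return payload


def to_hex_format(raw: bytes) -> str:
    """Render bytes as two-digit hex values separated by spaces."""
    return " ".join(f"{b:02x}" for b in raw)


if __name__ == "__main__":
    OFFSET = 44               # hypothetical distance from buf to the return address
    SMOKE = 0x08048B2E        # hypothetical address of smoke
    print(to_hex_format(build_exploit(OFFSET, SMOKE)))
```

The printed text is in the hex format described earlier, so it can be saved to a file and passed through HEX2RAW.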
Similar to Level 0, your task is to get BUFBOMB to execute the code for fizz rather than returning to test. In this case, however, you must make it appear to fizz as if you have passed your cookie as its argument. How can you do this?

Some Advice:
- Note that the program won't really call fizz; it will simply execute its code. This has important implications for where on the stack you want to place your cookie.
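Why the missing call matters can be worked through with a small Python model of the stack pointer. The sketch assumes the target function begins with the standard prologue (push %ebp; mov %esp,%ebp) and reads its first argument at 8(%ebp); confirm this in the disassembly before relying on it. The helper names, the offset 44, and the address 0x08048b5b are hypothetical; the cookie value is the one shown in the sample transcript.

```python
import struct


def payload_with_argument(offset_to_return: int, func_addr: int,
                          argument: int, fake_return: int = 0x42424242) -> bytes:
    """Filler, new return address, a placeholder return slot, then the argument."""
    return (b"A" * offset_to_return
            + struct.pack("<III", func_addr, fake_return, argument))


def first_argument_seen(payload: bytes, offset_to_return: int) -> int:
    """Model ret followed by push %ebp / mov %esp,%ebp, then read 8(%ebp)."""
    esp = offset_to_return   # %esp points at the saved return address
    esp += 4                 # ret pops that address and jumps to the function
    esp -= 4                 # the function's own push %ebp
    ebp = esp                # mov %esp,%ebp
    return struct.unpack_from("<I", payload, ebp + 8)[0]


if __name__ == "__main__":
    OFFSET = 44              # hypothetical distance from buf to the return address
    FIZZ = 0x08048B5B        # hypothetical address of fizz
    COOKIE = 0x1005B2B7      # cookie from the sample transcript
    payload = payload_with_argument(OFFSET, FIZZ, COOKIE)
    print(hex(first_argument_seen(payload, OFFSET)))
```

Because no call instruction pushed a return address, the slot a real caller would have filled must be supplied as a placeholder, and the argument sits one word above it.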
Similar to Levels 0 and 1, your task is to get BUFBOMB to execute the code for bang rather than returning to test. Before this, however, you must set global variable global_value to your teamid's cookie. Your exploit code should set global_value, push the address of bang on the stack, and then execute a ret instruction to cause a jump to the code for bang.

Some Advice:
- You can use GDB to get the information you need to construct your exploit string. Set a breakpoint within getbuf or Gets and run to this breakpoint. Determine parameters such as the address of global_value and the location of the buffer.
- Determining the byte encoding of instruction sequences by hand is tedious and prone to errors. You can let tools do all of the work by writing an assembly code file containing the instructions and data you want to put on the stack. Assemble this file with gcc -m32 -c and disassemble it with objdump -d. You should be able to get the exact byte sequence that you will type at the prompt. (A brief example of how to do this is included at the end of this writeup.)
- Keep in mind that your exploit string depends on your machine, your compiler, and even your teamid's cookie. You must do this project on the machines of the rlogin cluster. Make sure you include the proper teamid on the command line to BUFBOMB.
- Watch your use of address modes when writing assembly code. Note that movl $0x4, %eax moves the value 0x00000004 into register %eax, whereas movl 0x4, %eax moves the value at memory location 0x00000004 into %eax. Since that memory location is usually undefined, the second instruction will cause a segfault!
- Do not attempt to use either a jmp or a call instruction to jump to the code for bang. These instructions use PC-relative addressing, which is very tricky to set up correctly. Instead, push an address on the stack and use the ret instruction.
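One way to avoid addressing-mode slips is to generate the assembly text from the values you found in GDB and leave the encoding to gcc and objdump, as advised above. The following Python sketch only produces source text; the function name and the two addresses in the usage example are hypothetical, and the cookie is the one from the sample transcript.

```python
def bang_exploit_source(cookie: int, global_addr: int, bang_addr: int) -> str:
    """Return AT&T-syntax assembly that stores the cookie and returns to bang.

    The cookie and the address of bang carry a $ prefix because they are
    immediates; the address of global_value has none because it names
    the memory location being written.
    """
    lines = [
        f"movl $0x{cookie:08x}, 0x{global_addr:08x}    # store cookie in global_value",
        f"pushl $0x{bang_addr:08x}    # address that ret will jump to",
        "ret",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Both addresses are made up; read the real ones from GDB or objdump.
    print(bang_exploit_source(0x1005B2B7, 0x0804A1C4, 0x08048D52), end="")
```

The output can be saved to a file and assembled with gcc -m32 -c, then disassembled with objdump -d to read off the byte codes.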
Your job for this level is to supply an exploit string that will cause getbuf to return your cookie back to test, rather than the value 1. You can see in the code for test that this will cause the program to go Boom!. Your exploit code should set your cookie as the return value, restore any corrupted state, push the correct return location on the stack, and execute a ret instruction to really return to test.

Some Advice:
- You can use GDB to get the information you need to construct your exploit string. Set a breakpoint within getbuf and run to this breakpoint. Determine parameters such as the saved return address.
- Determining the byte encoding of instruction sequences by hand is tedious and prone to errors. You can let tools do all of the work by writing an assembly code file containing the instructions and data you want to put on the stack. Assemble this file with GCC and disassemble it with OBJDUMP. You should be able to get the exact byte sequence that you will type at the prompt. (A brief example of how to do this is included at the end of this writeup.)
- Keep in mind that your exploit string depends on your machine, your compiler, and even your teamid's cookie. Do all of your work on the machines assigned by your instructor, and make sure you include the proper teamid on the command line to BUFBOMB.

Once you complete this level, pause to reflect on what you have accomplished. You caused a program to execute machine code of your own design. You have done so in a sufficiently stealthy way that the program did not realize that anything was amiss.
For this level, we have gone the opposite direction, making the stack positions even less stable than they normally are. Hence the name nitroglycerin, an explosive that is notoriously unstable. When you run BUFBOMB with the command line flag -n, it will run in Nitro mode. Rather than calling the function getbuf, the program calls a slightly different function getbufn:
int getbufn()
{
    char buf[KABOOM_BUFFER_SIZE];
    Gets(buf);
    return 1;
}
This function is similar to getbuf, except that it has a buffer of 512 characters. You will need this additional space to create a reliable exploit. The code that calls getbufn first allocates a random amount of storage on the stack (using library function alloca) that ranges between 0 and 255 bytes. Thus, if you were to sample the value of %ebp during two successive executions of getbufn, you would find they differ by as much as 127.

In addition, when run in Nitro mode, BUFBOMB requires you to supply your string 5 times, and it will execute getbufn 5 times, each with a different stack offset. Your exploit string must make it return your cookie each of these times.

Your task is identical to the task for the Dynamite level. Once again, your job for this level is to supply an exploit string that will cause getbufn to return your cookie back to test, rather than the value 1. You can see in the code for test that this will cause the program to go KABOOM!. Your exploit code should set your cookie as the return value, restore any corrupted state, push the correct return location on the stack, and execute a ret instruction to really return to testn.

Some Advice:
- You can use the program HEX2RAW to send multiple copies of your exploit string. If you have a single copy in the file exploit.txt, then you can use the following command:
unix> cat exploit.txt | ./hex2raw -n | ./bufbomb -n -u rnikola+butta
- You must use the same string for all 5 executions of getbufn. Otherwise it will fail the testing code used by our grading server.
- The trick is to make use of the nop instruction. It is encoded with a single byte (code 0x90). It may be useful to read about nop sleds on page 262 of the CS:APP2e textbook.
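The idea behind a nop sled can be sketched in Python as follows. This is a hedged illustration, not a solution: the helper names are hypothetical, the one-byte code in the usage example is only a placeholder for real exploit code, and every offset and address is made up. The sketch pads the front of the buffer with nop bytes, places the code just below the saved return address, and computes which landing addresses fall inside the sled for every buffer position you observed.

```python
import struct

NOP = 0x90   # single-byte nop instruction


def build_sled_payload(code: bytes, offset_to_return: int,
                       landing_addr: int) -> bytes:
    """nop sled, then the exploit code, then the overwritten return address."""
    if len(code) > offset_to_return:
        raise ValueError("exploit code does not fit below the return address")
    sled = bytes([NOP]) * (offset_to_return - len(code))
    return sled + code + struct.pack("<I", landing_addr)


def safe_landing_range(lowest_buf: int, highest_buf: int, sled_len: int) -> range:
    """Addresses inside the sled for every buffer start in [lowest, highest].

    The range is empty when the sled is not longer than the spread
    between the lowest and highest buffer start.
    """
    return range(highest_buf, lowest_buf + sled_len)


def pick_landing_addr(candidates: range) -> int:
    """First candidate whose little-endian bytes contain no newline (0x0a)."""
    for addr in candidates:
        if 0x0A not in struct.pack("<I", addr):
            return addr
    raise ValueError("no usable landing address")


if __name__ == "__main__":
    code = b"\xc3"                       # placeholder, not a working exploit
    offset = 524                         # hypothetical buf-to-return distance
    lo, hi = 0x55683000, 0x556830F0      # hypothetical observed buffer starts
    addr = pick_landing_addr(safe_landing_range(lo, hi, offset - len(code)))
    payload = build_sled_payload(code, offset, addr)
    print(len(payload), hex(addr))
```

Execution that lands anywhere in the sled slides through the nop bytes into the code, which is why one fixed string can work across all five stack offsets.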
Logistical Notes
Handin occurs to the grading server whenever you correctly solve a level and use the -s option. Upon receiving your solution, the server will validate your string and update the Buffer Lab scoreboard Web page, which you can view by pointing your Web browser at http://courses.cs.vt.edu/cs3214/fall2012/buflab/buflab-scoreboard.html

You should be sure to check this page after your submission to make sure your string has been validated. (If you really solved the level, your string should be valid.)

Note that each level is graded individually. You do not need to do them in the specified order, but you will get credit only for the levels for which the server receives a valid message. You can check the Buffer Lab scoreboard to see how far you've gotten. The grading server creates the scoreboard by using the latest results it has for each phase.

Good luck and have fun!
The code can contain a mixture of instructions and data. Anything to the right of a # character is a comment. We can now assemble and disassemble this file:
unix> gcc -m32 -c example.S
unix> objdump -d example.o > example.d
Each line shows a single instruction. The number on the left indicates the starting address (starting with 0), while the hex digits after the : character indicate the byte codes for the instruction. Thus, we can see that the instruction push $0xABCDEF has hex-formatted byte code 68 ef cd ab 00.
Starting at address 8, the disassembler gets confused. It tries to interpret the bytes in the file example.o as instructions, but these bytes actually correspond to data. Note, however, that if we read off the 4 bytes starting at address 8 we get: 98 ba dc fe. This is a byte-reversed version of the data word 0xFEDCBA98. This byte reversal represents the proper way to supply the bytes as a string, since a little-endian machine lists the least significant byte first. Finally, we can read off the byte sequence for our code as:
68 ef cd ab 00 83 c0 11 98 ba dc fe
This string can then be passed through HEX2RAW to generate a proper input string we can give to BUFBOMB. Alternatively, we can edit example.d to look like this:
68 ef cd ab 00 /* push $0xabcdef */
83 c0 11 /* add $0x11,%eax */
98 ba dc fe
which is also a valid input we can pass through HEX2RAW before sending to BUFBOMB.
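Reading the byte column off a listing by hand is easy to get wrong, so here is a minimal Python sketch that does it mechanically. It assumes the usual objdump -d line layout (address and colon, a tab, the byte codes, a tab, the instruction text); the sample listing is a reconstruction in that layout rather than output copied from a real run, and the function name is hypothetical. The data word is packed separately to show the little-endian byte reversal described above.

```python
import struct


def bytes_from_listing(listing: str) -> bytes:
    """Collect the byte-code column from objdump -d style lines."""
    out = bytearray()
    for line in listing.splitlines():
        fields = line.split("\t")
        # Instruction lines look like "   0:<TAB>68 ef ...<TAB>push ...".
        if len(fields) < 2 or not fields[0].strip().endswith(":"):
            continue   # skip headers, labels, and blank lines
        out += bytes(int(tok, 16) for tok in fields[1].split())
    return bytes(out)


if __name__ == "__main__":
    # Assumed layout of the two instruction lines of example.d.
    listing = (
        "   0:\t68 ef cd ab 00       \tpush   $0xabcdef\n"
        "   5:\t83 c0 11             \tadd    $0x11,%eax\n"
    )
    code = bytes_from_listing(listing)
    data = struct.pack("<I", 0xFEDCBA98)   # least significant byte first
    print(" ".join(f"{b:02x}" for b in code + data))
```

The printed sequence matches the one read off by hand above: 68 ef cd ab 00 83 c0 11 98 ba dc fe.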
Revision : 1.2