Book Sample Buffer
Buffer Overflows
public enemy number 1
Erik Poll
Digital Security
Radboud University Nijmegen
1
Overview
1. How do buffer overflows work?
Or, more generally, memory corruption errors
2. How can we spot such problems in C(++) code?
Next week: tool-support for this
Incl. static analysis tool for first exercise.
2
Reading material
• Runtime Countermeasures for Code Injection Attacks
Against C and C++ Programs
by Yves Younan, Wouter Joosen, and Frank Piessens
3
Essence of the problem
Suppose in a C program you have an array of length 4
char buffer[4];
What happens if the statement below is executed?
buffer[4] = 'a';
This is undefined behaviour: anything can happen
If an attacker can trigger this and control the value 'a',
anything that the attacker wants may happen
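One way to rule out the out-of-bounds write is to check the index against the array length before every store. A minimal sketch, assuming a hypothetical helper store_checked (not part of the slides or of any library):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: writes c to buf[idx] only if idx is in bounds.
 * len is the number of elements in buf. */
bool store_checked(char *buf, size_t len, size_t idx, char c)
{
    if (buf == NULL || idx >= len) {
        return false;   /* refuse the out-of-bounds write */
    }
    buf[idx] = c;
    return true;
}
```

With char buffer[4], the call store_checked(buffer, sizeof buffer, 4, 'a') is rejected instead of invoking undefined behaviour.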
8
Tony Hoare on design principles of ALGOL 60
10
Other memory corruption errors
Other memory bugs in C/C++ code, besides array access:
errors with pointers and with dynamic memory (the heap)
Who here has ever written C(++) programs with pointers?
Who ever had such a program crashing?
Or using dynamic memory, ie. using malloc & free?
Who ever had such a program crashing?
• In C/C++, the programmer is responsible for memory
management, and this is very error-prone
– Technical term: C and C++ do not offer memory-safety
(see lecture notes on language-based security, §3.1-3.2)
• Typical problems:
• dereferencing null, dangling pointers, use-after-free, double-free,
forgotten de-allocation (memory leaks), failed allocation, flawed
pointer arithmetic
11
How does classic buffer overflow work?
aka smashing the stack
12
Process memory layout
[Figure: process memory layout. Recoverable labels: unused memory (low addresses), program code (.text).]
13
Stack layout
The stack consists of Activation Records:
x
AR main() return address
AR f() buf[4..7]
buf[0..3]
void f(int x) {
char buf[8];
gets(buf);
}
void main() {
f(…); …
}
void format_hard_disk(){…}
15
Stack overflow attack (2)
What if gets() reads more than 8 bytes ?
Attacker can jump to his own code (aka shell code)
x
AR main() return address
AR f() /bin/sh
exec
void f(int x) {
char buf[8];
gets(buf);
}
void main() {
f(…); …
}
void format_hard_disk(){…}
16
Never use gets! gets has been removed from the C standard in 2011.
17
Code injection vs code reuse attacks
18
Other attacks using memory errors
Besides messing with the return address,
other ways to exploit buffer overflows & pointer bugs:
• corrupt some data
• illegally read some (confidential) data
20
What to attack? More fun on the stack
void f(void(*error_handler)(int),...) {
int diskquota = 200;
bool is_super_user = false;
char* filename = "/tmp/scratchpad";
char[] username;
int j = 12;
...
}
Suppose the attacker can overflow username
In addition to corrupting the return address, she might corrupt
• pointers, eg filename
• other data on the stack, eg is_super_user,diskquota
• function pointers, eg error_handler
But not j, unless the compiler chooses to allocate variables in a
different order, which the compiler is free to do.
21
What to attack? Fun on the heap
struct BankAccount {
int number;
char username[20];
int balance;
};
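In this struct, balance is laid out after username, so a write that runs past the 20 bytes of username can reach balance. A minimal sketch of a bounded setter; set_username is a hypothetical name introduced here for illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct BankAccount {
    int number;
    char username[20];
    int balance;
};

/* Hypothetical setter: copies name only if it fits, including the '\0'. */
bool set_username(struct BankAccount *acc, const char *name)
{
    if (acc == NULL || name == NULL) {
        return false;
    }
    size_t n = strlen(name);
    if (n >= sizeof acc->username) {
        return false;   /* would overflow towards balance */
    }
    memcpy(acc->username, name, n + 1);   /* n characters plus '\0' */
    return true;
}
```

Rejecting an over-long name is one design choice; truncating it would also keep the write inside username.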
22
Spotting the problem
Reminder: C strings
str strlen(str) = 5
h e l l o \0
24
Example: gets
char buf[20];
gets(buf); // read user input until
// first EoL or EoF character
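gets cannot be used safely because it has no way of knowing the size of buf. The standard function fgets takes the buffer size as a parameter. A minimal sketch, assuming a hypothetical wrapper read_line:

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads at most size-1 characters of one line from in
 * into buf, always null-terminates, and strips a trailing newline.
 * Returns false on end-of-file, read error, or invalid arguments. */
bool read_line(FILE *in, char *buf, size_t size)
{
    if (in == NULL || buf == NULL || size == 0 || size > INT_MAX) {
        return false;
    }
    if (fgets(buf, (int)size, in) == NULL) {
        return false;
    }
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if one was read */
    return true;
}
```

With char buf[20], a call read_line(stdin, buf, sizeof buf) stores at most 19 characters; the rest of a longer line stays in the input stream for the next read.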
25
Example: strcpy
char dest[20];
strcpy(dest, src); // copies string src to dest
26
Spot the defect!
char buf[20];
char prefix[] = "http://";
...
strcpy(buf, prefix);
// copies the string prefix to buf
strncat(buf, path, sizeof(buf));
// concatenates path to the string buf
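The defect here is the third argument of strncat: it is the maximum number of characters to append, not the total size of the destination, and strncat writes a terminating '\0' after them. A minimal sketch of a corrected version, wrapped in a hypothetical helper build_url:

```c
#include <string.h>

/* Hypothetical helper: builds "http://" + path in buf (size bytes),
 * truncating path if necessary. */
void build_url(char *buf, size_t size, const char *path)
{
    const char prefix[] = "http://";
    if (size < sizeof prefix) {        /* no room for the prefix and '\0' */
        if (size > 0) {
            buf[0] = '\0';
        }
        return;
    }
    strcpy(buf, prefix);               /* safe: size >= sizeof prefix */
    /* space left = total size - characters already used - 1 for '\0' */
    strncat(buf, path, size - strlen(buf) - 1);
}
```

With char buf[20], at most 12 characters of path are appended after the 7-character prefix.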
27
Spot the defect! (2)
char src[9];
char dest[9];
char* base_url = "www.ru.nl";
strncpy(src, base_url, 9);
// copies base_url to src
strcpy(dest, src);
// copies src to dest
base_url is 10 chars long, incl. its null terminator, so src will not
be null-terminated
30
Example: strcpy and strncpy
Don’t replace
strcpy(dest, src)
with
strncpy(dest, src, sizeof(dest))
but with
strncpy(dest, src, sizeof(dest)-1)
dest[sizeof(dest)-1] = '\0';
if dest should be null-terminated!
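The replacement pattern can be packaged so that the size and the terminator are never forgotten. A minimal sketch, assuming a hypothetical helper copy_string; note that sizeof(dest) only gives the buffer size when dest is an array, not a pointer, which is why the size is passed explicitly here:

```c
#include <string.h>

/* Hypothetical helper: copies src into dest (dest_size bytes), truncating
 * if needed, and always null-terminates when dest_size > 0. */
void copy_string(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0) {
        return;
    }
    strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';
}
```

Called as copy_string(dest, sizeof dest, "www.ru.nl") with char dest[9], this stores the first 8 characters followed by '\0'.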
32
Spot the defect! (3)
char *buf;
int len;
...
if (len < 0)
{error ("negative length"); return; }
buf = malloc(MAX(len,1024));
read(fd,buf,len);
35
Spot the defect! (3)
char *buf;
int len;
...
if (len < 0)
{error ("negative length"); return; }
buf = malloc(MAX(len,1024));
read(fd,buf,len);
What if the malloc() fails? (because we are out of memory)
36
Spot the defect! (3)
char *buf;
int len;
...
if (len < 0)
{error ("negative length"); return; }
buf = malloc(MAX(len,1024));
if (buf==NULL) { exit(-1);}
// or something a bit more graceful
read(fd,buf,len);
37
Better still
char *buf;
int len;
...
if (len < 0)
{error ("negative length"); return; }
buf = calloc(MAX(len,1024), 1);
// calloc takes an element count and an element size,
// and initialises the allocated memory to 0
if (buf==NULL) { exit(-1);}
// or something a bit more graceful
read(fd,buf,len);
38
NB absence of language-level security
39
Spot the defect!
#define MAX_BUF 256
len = strlen(in);
40
See https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=integer+overflow
41
Spot the defect!
bool CopyStructs(InputFile* f, long count)
{ structs = new Structs[count];
for (long i = 0; i < count; i++)
{ if (!(ReadFromFile(f,&structs[i])))
break;
}
}
new Structs[count] effectively does a
malloc(count*sizeof(type))
which may cause integer overflow
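The multiplication count*sizeof(type) can be checked before it is performed. A minimal sketch in C, assuming a hypothetical helper alloc_array; it is an illustration of the check, not the fix applied to the code on the slide:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates count elements of elem_size bytes each.
 * Returns NULL if count is negative, elem_size is 0, or the product
 * count * elem_size would not fit in a size_t. */
void *alloc_array(long count, size_t elem_size)
{
    if (count < 0 || elem_size == 0) {
        return NULL;
    }
    if ((uintmax_t)count > SIZE_MAX / elem_size) {
        return NULL;    /* count * elem_size would overflow */
    }
    return malloc((size_t)count * elem_size);
}
```

The division SIZE_MAX / elem_size cannot overflow, so the check itself is safe to evaluate.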
42
Spot the defect!
1. unsigned int tun_chr_poll( struct file *file,
2. poll_table *wait)
3. { ...
4. struct sock *sk = tun->sk; // take sk field of tun
5. if (!tun) return POLLERR; // return if tun is NULL
6. ...
7. }
If tun is a null pointer, then tun->sk is undefined
What this code does if tun is null is undefined:
ANYTHING may happen then.
So compiler can remove line 5, as the behaviour when tun is NULL
is undefined anyway, so this check is 'redundant'.
Standard compilers (gcc, Clang) do this 'optimisation'!
This is actually code from the Linux kernel, and removing line 5 led
to a security vulnerability [CVE-2009-1897]
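The safe ordering is to test the pointer before the first dereference, so that the compiler has no undefined behaviour to reason from. A minimal sketch with hypothetical stand-in types; the struct definitions and the error value below are illustrative and are not the kernel's:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the sk field matters. */
struct sock { int id; };
struct tun_struct { struct sock *sk; };

#define POLL_ERROR 0x8u   /* illustrative error value, not the kernel's POLLERR */

unsigned int poll_checked(struct tun_struct *tun)
{
    if (tun == NULL) {
        return POLL_ERROR;          /* check before any dereference */
    }
    struct sock *sk = tun->sk;      /* tun is known to be non-null here */
    return sk != NULL ? 0u : POLL_ERROR;
}
```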
43
Spot the defect!
1. void* f(int start) {
2. if (start+100 < start) return SOME_ERROR;
3. // checks for overflow
4. for (int i=start; i < start+100; i++) {
5. . . . // i will not overflow
6. } }
Integer overflow is undefined behaviour! This means
• We cannot assume that overflow produces a negative number;
so line 2 is not a good check for integer overflow.
• Worse still, if overflow occurs, behaviour is undefined, and ANY
compilation is ok
• so the compiled code is allowed to do anything in case
start+100 overflows
This also means the compiler may assume that start+100 < start
is always false (as it is always false when start+100 does not
overflow, and any behaviour is ok when it does overflow),
and remove line 2
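A check that avoids undefined behaviour compares against INT_MAX before doing the addition, so the overflowing expression is never evaluated. A minimal sketch, assuming a hypothetical helper add_would_overflow and a non-negative delta:

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if start + delta would exceed INT_MAX.
 * Assumes delta >= 0, so INT_MAX - delta cannot itself overflow. */
bool add_would_overflow(int start, int delta)
{
    return start > INT_MAX - delta;
}
```

In the function on the slide, the test on line 2 would become add_would_overflow(start, 100).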
44
Spot the defect!
// TCHAR is 1 byte ASCII or multiple byte UNICODE
#ifdef UNICODE
# define TCHAR wchar_t
# define _sntprintf _snwprintf
#else
# define TCHAR char
# define _sntprintf _snprintf
#endif
TCHAR buf[MAX_SIZE];
_sntprintf(buf, sizeof(buf), "%s\n", input);
sizeof(buf) is the size in bytes, but this parameter gives the number
of characters that will be printed
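The usual remedy is to pass the number of elements rather than the number of bytes. A minimal sketch in standard C, using wchar_t and swprintf in place of the Windows-specific TCHAR and _sntprintf; ARRAY_LEN and format_line are hypothetical names:

```c
#include <stddef.h>
#include <wchar.h>

/* Number of elements in an array (not its size in bytes).
 * Only valid for true arrays, not for pointers. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Formats input plus a newline into buf, which holds count wide characters.
 * Returns the number of wide characters written, or a negative value if
 * the output did not fit. */
int format_line(wchar_t *buf, size_t count, const wchar_t *input)
{
    return swprintf(buf, count, L"%ls\n", input);
}
```

For wchar_t buf[16], ARRAY_LEN(buf) is 16 while sizeof(buf) is 16 * sizeof(wchar_t), which is the over-estimate that causes the overflow on the slide.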
48
Format string attacks
New type of memory corruption invented/discovered in 2000
49
Format string attacks
Suppose attacker can feed malicious input string s to
printf(s). This can
• read the stack
%x reads and prints bytes from stack, so input
%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x
%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x
%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x...
dumps the stack, including passwords, keys,… stored on
the stack
• corrupt the stack
%n writes the number of characters printed to the stack,
so input 12345678%n writes value 8 to the stack
• read arbitrary memory
a carefully crafted format string of the form
\xEF\xCD\xCD\xAB %x%x...%x%s
prints the string at memory address 0xABCDCDEF
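The standard defence is never to pass untrusted input as the format string itself, but only as an argument to a constant format. A minimal sketch, assuming a hypothetical helper format_untrusted:

```c
#include <stdio.h>

/* Hypothetical helper: formats untrusted text into out (out_size bytes).
 * The format string is a constant; the untrusted text is only ever an
 * argument, so any % directives in it are copied literally. */
int format_untrusted(char *out, size_t out_size, const char *untrusted)
{
    return snprintf(out, out_size, "%s", untrusted);
}
```

The same applies to printf: printf("%s", s) prints s as data, whereas printf(s) interprets it.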
50
-Wformat-overflow
Eg gcc has (far too many?) command line options for this:
-Wformat -Wformat-nonliteral -Wformat-security ...
See https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=format+string
to see how common format string vulnerabilities still are
51
Recap: buffer overflows
• Buffer overflow is #1 weakness in C and C++ programs
– because these languages are not memory-safe
• Tricky to spot
• Typical cause: programming with arrays, pointers, and
strings
– esp. library functions for null-terminated strings
• Related attacks
• Format string attack: another way of corrupting stack
• Integer overflows: often a stepping stone to getting a
buffer to overflow
• but just the integer overflow can already have a
security impact; eg think of banking software
52
Platform-level defences
55
stack canaries
Stack without canary     Stack with canary
x                        x
return address           return address
buf[4..7]                canary value
buf[0..3]                buf[4..7]
                         buf[0..3]
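The idea can be simulated in plain C with a struct standing in for the activation record. This is only a sketch under stated assumptions: real canaries are inserted by the compiler, and the struct frame, the fixed CANARY_VALUE and the helper names below are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A simulated activation record: the canary sits between the buffer and
 * the stand-in for the return address. */
struct frame {
    char buf[8];
    uint32_t canary;
    uint32_t return_address;   /* stand-in, not a real address */
};

#define CANARY_VALUE 0xDEADBEEFu   /* illustrative fixed value */

void frame_init(struct frame *f)
{
    memset(f, 0, sizeof *f);
    f->canary = CANARY_VALUE;
    f->return_address = 0x1234u;
}

/* Simulates an unchecked copy into buf: bytes beyond buf spill into the
 * following fields. n is clamped to the frame size so that the simulation
 * itself stays within bounds. */
void frame_unchecked_write(struct frame *f, const char *data, size_t n)
{
    if (n > sizeof *f) {
        n = sizeof *f;
    }
    memcpy((unsigned char *)f, data, n);
}

/* The check performed before returning: has the canary been overwritten? */
bool frame_canary_intact(const struct frame *f)
{
    return f->canary == CANARY_VALUE;
}
```

A write of 16 bytes reaches the return address only by first overwriting the canary, which is what the check detects.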
56
Further improvements
• More variation in canary values: eg not a fixed value hardcoded
in the binary but a random value chosen for each execution
• Better still, XOR the return address into the canary value
• Include a null byte in the canary value, because C string
functions cannot write nulls inside strings
return return
eg changing to
canary value canary value
char* p char* p
buf[4..7] buf[4..7]
buf[0..3] buf[0..3]
57
Further improvements
• Re-order elements on the stack to reduce the potential impact of
overruns
• swapping parameters buf and fp on stack changes whether
overrunning buf can corrupt fp
• which is especially dangerous if fp is a function pointer
• hence it is safer to allocate array buffers ‘above’ all other
local variables
First introduced by IBM’s ProPolice.
58
Windows 2003 Stack Protection
Nice example of the ways in which things can go wrong...
• Enabled with /GS command line option in Visual Studio
• When canary is corrupted, control is transferred to an exception
handler
• Exception handler information is stored ...
on the stack!
• Attacker could corrupt the exception handler info on the stack, in
the process corrupt the canaries, and then let Stack Protection
mechanism transfer control for him
[http://www.securityfocus.com/bid/8522/info]
• Countermeasure: only allow transfer of control to registered
exception handlers
59
2. ASLR (Address Space Layout Randomisation)
• Attacker needs detailed info about memory layout
– eg to jump to specific piece of code
– or to corrupt a pointer at a known position on the stack
• Attacks become harder if we randomise the memory
layout every time we start a program
• ie. change the offset of the heap, stack, etc, in memory
by some random value
60
3. Non-eXecutable memory (NX, W⊕X)
Distinguish
• X: executable memory (for storing code)
• W: writeable, non-executable memory (for storing data)
and let the processor refuse to execute code in non-executable memory
61
Defeating NX: return-to-libc attacks
Code injection attacks no longer possible,
but code reuse attacks still are...
62
Return-Oriented Programming (ROP)
Next stage in evolution of attacks, as people removed or
protected dangerous libc calls such as system()
63
4. Control Flow Integrity (CFI)
Return-to-libc and ROP give rise to unusual control flow
jumps between code blocks
Eg a function f() never calls library routine exec(), because
exec()does not even occur in the code of f(), but when
supplied with malicious input f() suddenly does call exec()
Idea behind Control Flow Integrity:
determine the control flow graph (cfg) and monitor
execution to spot deviant behaviour
• Many variants, with different levels of precision, overhead,
...
• Of course, not all attacks result in unusual control flow. Eg
buffer overflows that only corrupt data will not, so cannot be
detected by CFI.
64
Control Flow Graph
65
Example code
66
Example control flow graph
[Figure: control flow graph with nodes gt(x,y), call sort and return, instrumented with labels 17, 23 and 55]
Additional checks ensure that function calls and returns
go to legal destinations
68
5. Bound checkers
Compiler
• records size information for all pointers (eg using
so-called fat pointers)
• adds runtime checks for pointer arithmetic & array
indexing
A pointer p
s o m e d a t a
A fat pointer p size
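What a bound checker inserts can be imitated by hand. A minimal sketch of a fat pointer with checked accesses; struct fat_ptr, fat_read and fat_write are hypothetical names, and a real bound checker would generate the equivalent checks automatically:

```c
#include <stdbool.h>
#include <stddef.h>

/* A fat pointer: the base address together with the size of the object. */
struct fat_ptr {
    char *base;
    size_t size;
};

/* Checked read: stores p.base[idx] in *out only if idx is in bounds. */
bool fat_read(struct fat_ptr p, size_t idx, char *out)
{
    if (p.base == NULL || out == NULL || idx >= p.size) {
        return false;
    }
    *out = p.base[idx];
    return true;
}

/* Checked write: stores c at p.base[idx] only if idx is in bounds. */
bool fat_write(struct fat_ptr p, size_t idx, char c)
{
    if (p.base == NULL || idx >= p.size) {
        return false;
    }
    p.base[idx] = c;
    return true;
}
```

The cost is visible in the sketch: every pointer carries an extra word and every access an extra comparison.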
69
6. Guard pages
Allocate heap chunks with the end at a page boundary
with a non-readable, non-writeable page between chunks
p
s o m e d a t a
q h e l l o \0
70
7. Pointer encryption
• Many buffer overflow attacks involve corrupting pointers,
pointers to data or function pointers
• To complicate this: Encrypt pointers in main memory,
unencrypted in registers
– simple & fast encryption scheme: XOR with a fixed value,
randomly chosen when a process starts
• Attacker can still corrupt encrypted pointers in memory,
but these will not decrypt to predictable values
– This uses encryption to ensure integrity.
Normally NOT a good idea, but here it works.
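The XOR scheme can be sketched in a few lines. The names encode_pointer and decode_pointer are hypothetical, and the fixed key below is for illustration only; as stated above, a real scheme would choose the key at random when the process starts.

```c
#include <stdint.h>

/* Per-process key. Fixed here for illustration only; a real scheme would
 * pick a random value at process start. */
static uintptr_t pointer_key = (uintptr_t)0x5A5A5A5Au;

/* Value stored in memory in place of the raw pointer. */
uintptr_t encode_pointer(void *p)
{
    return (uintptr_t)p ^ pointer_key;
}

/* Recovers the pointer just before use (conceptually, in a register). */
void *decode_pointer(uintptr_t stored)
{
    return (void *)(stored ^ pointer_key);
}
```

An attacker who overwrites the stored value with a raw address gets that address XORed with the key after decoding, which is not the address intended unless the key is known.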
71
8. Instruction set randomisation
Basically the same thing as pointer encryption,
but now for instructions (ie code in memory)
72
9. Execution-aware memory protection
• More fine-grained memory protection by OS & hardware:
– not just access control based on process id
– but also based on value of the program counter (PC)
• So some memory regions only accessible from specific
part of the program code
– eg. crypto keys only accessible if PC points to module with
the crypto code
• Does not prevent buffer overflow, but reduces the impact
– eg. only buffer overflows in the crypto code can leak keys
73