Security Core Lecture


08 Finding Bugs

Prof. Dr. Thorsten Holz | 21.11.2023



Further Reading
§ Shoshitaishvili et al.: "SoK: (State of) The Art of War: Offensive Techniques in
Binary Analysis", IEEE Symposium on Security and Privacy, 2016
§ Song et al.: "SoK: Sanitizing for Security", IEEE Symposium on Security and
Privacy, 2019
§ Stephens et al.: "Driller: Augmenting Fuzzing Through Selective Symbolic
Execution", NDSS, 2016
§ Schumilo et al.: "Nyx: Greybox Hypervisor Fuzzing using Fast Snapshots and
Affine Types", USENIX Security, 2021
§ Aschermann et al.: "NAUTILUS: Fishing for Deep Bugs with Grammars", NDSS,
2019


This Lecture
§ Last lectures
- Memory tagging
- Address Space Layout Randomization (ASLR)
- Control-Flow Integrity (CFI)
§ This lecture
- Finding bugs
- Fuzzing
- Symbolic execution


Recap: ARM Memory Tagging Extension (MTE)

§ First two pointer accesses are fine
§ Tag comparison for third pointer fails


Recap: Address Space Layout Randomization (ASLR)


§ ASLR randomizes (base) addresses of code and data
- Addresses of variables, shellcode or ROP gadgets are no longer deterministic
- Attacker does not know where to jump to or what to overwrite
§ Randomized parts are OS-dependent
- Windows: randomize base addr. of code, heap/stack and library locations
- Linux: randomize base addr. of code, heap/stack, library locations and vDSO
§ ASLR requires compiler and linker support
§ Brute-force attack feasible on 32-bit systems
1. Try an address (more or less) at random
2. Program jumps to the address
3. Program will (usually) just crash
4. Go back to step 1 and try again
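
The cost of the brute-force loop above can be estimated with a few lines of arithmetic. This is a minimal sketch under two assumptions that are not from the slides: the layout is re-randomized after every crash (so guesses are independent), and the entropy values 16 and 28 bits are purely illustrative stand-ins for a 32-bit and a 64-bit system.

```python
def expected_attempts(entropy_bits: int) -> float:
    """Mean number of guesses until the first hit.

    Each guess hits with probability p = 2**-entropy_bits, so the number of
    attempts is geometrically distributed with mean 1/p.
    """
    return float(2 ** entropy_bits)


def success_probability(entropy_bits: int, attempts: int) -> float:
    """Probability that at least one of `attempts` independent guesses hits."""
    p = 2.0 ** -entropy_bits
    return 1.0 - (1.0 - p) ** attempts


# Illustrative entropy values (assumptions, not measurements).
for bits in (16, 28):
    print(bits, expected_attempts(bits), success_probability(bits, 2 ** 16))
```

With few bits of entropy, tens of thousands of crashes are enough for a realistic chance of success; every additional bit doubles the expected effort.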

Recap: Control-Flow Integrity (CFI)


§ General defense against code-reuse attacks [Abadi et al., CCS 2005]
§ Many CFI checks are required if unique labels are assigned per node (high
overhead!)
§ Optimization: Merge labels to allow single CFI check, but this allows for
unintended control-flow paths and hence potential attacks

[Figure: control-flow graph with nodes A (Label_1), B (Label_2), C and D (both Label_3), E (Label_3) and F (Label_4); the CFI check at the exit of B is Exit(B) == Label_3]

Finding Bugs

General Thoughts
§ Human experts can find bugs by looking at source code
§ Static code analysis (SCA) methods find bugs by analyzing source code
§ Sanitizers add bug-finding code to the source code
§ git bisect can assist in finding bugs in source code (finds the exact commit
that introduced the change)
§ Explaining source code helps in finding bugs
§ print statements in source code can assist in finding bugs
§ Stepping through source code with a debugger reveals bugs
§ …


Program Analysis
§ Program analysis: the process of analyzing a program's behavior to
determine correctness, robustness, liveness, security, or other properties
§ Two main approaches
- Static analysis
• Analyze source code to find faults or check their absence
• Consider all possible inputs (in theory)
• Typically requires a lot of effort and generates false positives
- Dynamic analysis
• Run instrumented program to find problems (e.g., crashes)
• Need to select test inputs, only limited coverage (only executed code is
tested)
• Can find vulnerabilities, but cannot prove their absence

Static Analysis
§ A static analysis tool S analyzes the source code of a program P to determine
whether it satisfies a property φ, such as:
- “P never dereferences a null pointer”
- “P does not leak file handles”
- “No cast in P will lead to a ClassCastException”

§ Unfortunately, it is impossible to write such a tool!
- Rice's theorem states that all non-trivial, semantic properties of programs
are undecidable
- For any non-trivial property φ, there is no general automated method to
determine whether P satisfies φ
- So what can we do in practice?


Soundness vs. Completeness


§ An analysis tool S that analyzes the code of a program P to determine whether it
satisfies a property φ can be wrong in one of two ways:
- If S is sound, it will never miss violations, but it may say that P violates φ
even though it does not (resulting in false positives)
- If S is complete, it will never report false positives, but it may miss real
violations of φ (resulting in false negatives)
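
The difference can be made concrete with a toy property: "the divisor d(x) = x*x + c is never zero for integer x in [lo, hi]". The sketch below is only an illustration; the function names, the choice of interval arithmetic as the sound analysis, and the choice of a few sample inputs as the complete analysis are assumptions, not taken from the lecture.

```python
def interval_mul(a, b):
    """Multiply two intervals (lo, hi); the result contains every product."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def sound_may_be_zero(c, lo, hi):
    """Sound: over-approximates the range of x*x + c, so it never misses a zero.

    It treats both factors of x*x as unrelated intervals, which loses
    precision and can therefore report false positives.
    """
    square = interval_mul((lo, hi), (lo, hi))
    return square[0] + c <= 0 <= square[1] + c


def complete_finds_zero(c, samples):
    """Complete: reports only zeros it actually witnessed on the sample inputs."""
    return any(x * x + c == 0 for x in samples)


def ground_truth(c, lo, hi):
    """Exhaustive check, feasible only because the toy domain is tiny."""
    return any(x * x + c == 0 for x in range(lo, hi + 1))


for c in (1, -1, -4, 5):
    print(c, ground_truth(c, -2, 2), sound_may_be_zero(c, -2, 2),
          complete_finds_zero(c, (-2, 0, 2)))
```

For c = 1 the sound analysis raises a false alarm (the interval for x*x wrongly includes negative values), while for c = -1 the sample-based analysis misses the real violation at x = 1 and x = -1.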


Dynamic Analysis
§ Dynamic (program) analysis analyzes software while it is running
(in contrast to static analysis, which looks only at the code)
§ Unit tests, integration tests, system tests, and acceptance tests are all forms
of dynamic testing
§ However, the code typically needs to be instrumented to understand where a
problem occurred and what kind of problem it was
§ Closely related to debuggers
- Traditional debuggers typically focus on allowing programmers to find the
source of fatal errors (e.g., NULL pointer dereference)
- Not all bugs lead to crashes, especially for inputs that typically do not crash
- In contrast, security tools attempt to uncover non-fatal problems such as
potential race conditions or overflows

Google AddressSanitizer (ASan)


§ AddressSanitizer is a memory error detector for C/C++ that finds:
- Use after free (dangling pointer dereference)
- Heap buffer overflow
- Stack buffer overflow
- Global buffer overflow
- Use after return
- Use after scope
- Initialization order bugs
- Memory leaks


Google AddressSanitizer (ASan) Implementation


§ LLVM Pass
- Modifies the code to check the shadow state for each memory access and
creates poisoned redzones around stack and global objects to detect
overflows and underflows

§ Runtime library that replaces memory-related functions
- Replaces malloc, free and related functions
- Creates poisoned redzones around allocated heap regions
- Delays the reuse of freed heap regions
- Adds error reporting
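
The interplay of shadow state, redzones and delayed reuse can be sketched with a toy heap. This is a simplified model and not ASan's implementation: the class name `ToyASan`, the one-flag-per-byte shadow (real ASan uses a compact shadow encoding) and the bump allocator that never reuses freed memory are all assumptions made for illustration.

```python
REDZONE = 16  # poisoned bytes between neighbouring allocations (toy value)


class ToyASanError(Exception):
    """Raised when an access touches poisoned memory."""


class ToyASan:
    def __init__(self, size=256):
        self.memory = bytearray(size)
        self.poisoned = [True] * size  # toy shadow state: one flag per byte
        self.top = 0                   # bump pointer; freed memory is never reused
        self.live = {}                 # pointer -> size of live allocations

    def malloc(self, n):
        ptr = self.top + REDZONE       # skip a poisoned redzone before the chunk
        if ptr + n + REDZONE > len(self.memory):
            raise MemoryError("toy heap exhausted")
        for i in range(ptr, ptr + n):
            self.poisoned[i] = False   # unpoison only the user region
        self.live[ptr] = n
        self.top = ptr + n             # bytes after the chunk stay poisoned
        return ptr

    def free(self, ptr):
        n = self.live.pop(ptr)
        for i in range(ptr, ptr + n):
            self.poisoned[i] = True    # re-poison, so later accesses are caught

    def _check(self, addr, kind):
        if not 0 <= addr < len(self.memory) or self.poisoned[addr]:
            raise ToyASanError(f"{kind} of poisoned address {addr}")

    def load(self, addr):
        self._check(addr, "READ")
        return self.memory[addr]

    def store(self, addr, value):
        self._check(addr, "WRITE")
        self.memory[addr] = value
```

Reading one byte past a 10-byte chunk lands in the redzone and raises `ToyASanError`; so does any access to the chunk after `free`, because the region stays poisoned and is not handed out again.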


Example: ASan

==9901==ERROR: AddressSanitizer: heap-use-after-free on address 0x60700000dfb5 at pc 0x45917b bp 0x7fff4490c700 sp 0x7fff4490c6f8
READ of size 1 at 0x60700000dfb5 thread T0
    #0 0x45917a in main use-after-free.c:5
    #1 0x7fce9f25e76c in __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226
    #2 0x459074 in _start (a.out+0x459074)
0x60700000dfb5 is located 5 bytes inside of 80-byte region [0x60700000dfb0,0x60700000e000)
freed by thread T0 here:
    #0 0x4441ee in __interceptor_free projects/compiler-rt/lib/asan/asan_malloc_linux.cc:64
    #1 0x45914a in main use-after-free.c:4
    #2 0x7fce9f25e76c in __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226
previously allocated by thread T0 here:
    #0 0x44436e in __interceptor_malloc projects/compiler-rt/lib/asan/asan_malloc_linux.cc:74
    #1 0x45913f in main use-after-free.c:3
    #2 0x7fce9f25e76c in __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226
SUMMARY: AddressSanitizer: heap-use-after-free use-after-free.c:5 main


Summary

§ Static analysis
- Pros: enables quickly finding bugs at development time; can detect some
problems that dynamic analysis misses
- Cons: either over- or under-reports; misses complex bugs; typically requires code
§ Dynamic analysis
- Pros: may uncover complex behavior missed by static analysis; can run on a
black box
- Cons: depends on user input, only checks executed code


Fuzzing

Overview
§ Fuzzing (“fuzz testing”) is an automated software testing method
§ Provide invalid, unexpected, or random data as input
§ Closely monitor program behavior (e.g., crash, assertion, ...)


Fuzzing vs. Testing

§ Fuzzing
- Invalid, unexpected, or random data as input
- Automatically generated testcases
- Goal: Find exploitable errors
§ Testing
- Normal, valid, well-formed data as input
- Manually generated testcases
- Goal: Normal users should not get errors


Fuzzing History
§ Very old technique, was once considered the worst means of testing
§ Term coined in 1988 in a class assignment by Miller (motivated by noise over
fuzzy network connections)
§ Google has run ClusterFuzz since 2012 to fuzz Chrome, and OSS-Fuzz to test open
source software
- As of August 2023, OSS-Fuzz has helped identify and fix over 10,000
vulnerabilities and 36,000 bugs across 1,000 projects
- Meta, Microsoft, Oracle, … have fuzzing teams
§ Most teams used fuzzing to automatically detect bugs in the DARPA Cyber
Grand Challenge 2016 (CGC)
- Likely also applies to AIxCC next year
§ american fuzzy lop (AFL) found many bugs; its successor AFL++ is among the
most popular tools

Types of Fuzzing
§ Different types of fuzzing
§ Fuzzing can be somewhere between dumb and smart
§ The smarter the fuzzing, the harder the setup (typically)
- However, smarter fuzzing can potentially find more bugs
- But it might also find fewer bugs because too much time is spent in the
heuristics / optimizations
§ Typically all fuzzers rely on some kind of mutations
- Mutation might be completely random or follow some pattern


Mutational and Generational Fuzzers


§ Mutational fuzzers
- Require a good corpus of inputs to mutate, randomly try mutations
- Mutational fuzzers do not know which code regions depend on input file and
which inputs are necessary to reach more code regions (“dumb fuzzing”)
- Might fail for complicated protocols (e.g., challenge response)
- Can fail for complex file formats (e.g., checksum)
§ Generational fuzzers
- Require formal specification to define input format (e.g., grammar or RFC)
- Based on specification, fuzzers are able to produce many semi-valid inputs
- Knowledge of protocol gives better results, more targeted (i.e., does not fuzz
“uninteresting” data)
- Can handle complex dependencies, e.g., checksums
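
The checksum problem can be illustrated with a toy message format. Everything in this sketch is hypothetical: the format (1 length byte, payload, CRC32 of the payload) and the names `mutate`, `generate` and `parse` are made up for the example and do not describe any particular fuzzer.

```python
import random
import zlib


def generate(rng: random.Random) -> bytes:
    """Generational step: build a message from the format specification."""
    payload = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 16)))
    return bytes([len(payload)]) + payload + zlib.crc32(payload).to_bytes(4, "big")


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Mutational step: flip one random bit, knowing nothing about the format."""
    data = bytearray(seed)
    data[rng.randrange(len(data))] ^= 1 << rng.randrange(8)
    return bytes(data)


def parse(msg: bytes) -> bytes:
    """Toy target: rejects messages with a wrong length field or checksum."""
    if len(msg) < 5:
        raise ValueError("too short")
    payload, crc = msg[1:-4], msg[-4:]
    if msg[0] != len(payload) or zlib.crc32(payload).to_bytes(4, "big") != crc:
        raise ValueError("rejected")
    return payload  # the "interesting" code would start here


def accepted(msg: bytes) -> bool:
    try:
        parse(msg)
        return True
    except ValueError:
        return False


rng = random.Random(0)
seed = generate(rng)
print(sum(accepted(mutate(seed, rng)) for _ in range(1000)))  # mutational
print(sum(accepted(generate(rng)) for _ in range(1000)))      # generational
```

Every single-bit mutation of a valid message breaks either the length field or the CRC32, so the mutated inputs never get past the validity check, whereas every generated message does.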

Evolutionary Fuzzing (“Coverage-guided fuzzing”)


§ Generate inputs/mutations based on program behavior
§ Different metrics: code coverage, reaching potentially dangerous functions, ...
- Basic insight: a necessary precondition for finding a bug is the ability to
reach its code location
§ Dynamically “learns” protocol or file format, no configuration needed
§ Finds many bugs, empirically a very effective technique
- Robust and fast
§ Requires instrumentation to keep track of coverage (e.g., via compiler pass,
binary instrumentation, or other techniques)
§ Popular tools are AFL++ and LibAFL
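
A coverage-guided loop can be sketched in a few lines. This is a toy model under stated assumptions and not how AFL++ or LibAFL are implemented: the target reports its own coverage as a set of branch IDs (standing in for real instrumentation), and the names `target`, `mutate` and `fuzz` are hypothetical.

```python
import random


def target(data: bytes) -> set:
    """Toy instrumented target: returns the IDs of the branches it executed."""
    cov = {0}
    if len(data) > 0 and data[0] == ord("F"):
        cov.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            cov.add(2)
            if len(data) > 2 and data[2] == ord("Z"):
                cov.add(3)
                if len(data) > 3 and data[3] == ord("Z"):
                    raise RuntimeError("bug reached")
    return cov


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one random byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed: bytes, iterations: int, rng: random.Random):
    """Keep every input that reaches new coverage and mutate from that corpus."""
    corpus = [seed]
    seen = set(target(seed))
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            cov = target(candidate)
        except RuntimeError:
            return candidate, corpus      # crashing input found
        if not cov <= seen:               # new branch reached: keep the input
            seen |= cov
            corpus.append(candidate)
    return None, corpus


crash, corpus = fuzz(b"AAAA", 50_000, random.Random(0))
print(crash, corpus)
```

The feedback lets the fuzzer solve the four-byte check one byte at a time, because each partially correct input is kept in the corpus. Without feedback, guessing all four bytes at once succeeds with probability 1 in 256^4 (about 4.3 billion) per attempt.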


Evolutionary Fuzzing II


Evolutionary Fuzzing III


Example: Nyx
Code available at https://github.com/RUB-SysSec/

[Figure: three columns, each split into user space and kernel; labels: Nyx (USENIX’21) + Nyx-Net (EuroSys’22), Intel PT (Processor Trace)]


Example: Fuzzware / Hoedur


§ Fuzzing via rehosting

Papers published at USENIX Security’22 and USENIX Security’23, code available at https://github.com/fuzzware-fuzzer/

Symbolic Execution

Overview
§ Goal is to find the input required to reach a certain position in the program
§ Programs are interpreted, input is modelled using symbolic values
- Variables can be expressed using the symbolic values
- Symbolic state maps variables to symbolic values
- Path condition is a formula over symbolic values that encodes all branch
decisions taken so far (conditional jumps are constrained by the symbolic values)
- All paths in the program form an execution tree: some paths are feasible,
while some are infeasible
§ Expressions over symbolic values are solved using SAT/SMT solvers (Boolean
satisfiability, satisfiability modulo theories)


Example
§ Illustration of how symbolic execution works

Program:
    x = read();
    y = x * 2;
    z = y + 4;
    if (z == 12) {
        bug();
    } else {
        …
    }

Symbolic state (input modelled as λ):
    x = λ
    y = 2 * λ
    z = 2 * λ + 4

Execution tree at the branch:
    2·λ + 4 == 12, i.e., λ == 4  →  bug()
    2·λ + 4 != 12, i.e., λ != 4  →  …
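
The same example can be replayed in a few lines of Python. This is a minimal sketch, not a real symbolic execution engine: the class `Lin` and the function `solve_eq` are hypothetical helpers that handle only linear expressions over a single symbolic input λ, used here in place of a SAT/SMT solver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lin:
    """Symbolic value a*λ + b over the single symbolic input λ."""
    a: int
    b: int

    def __mul__(self, k: int) -> "Lin":
        return Lin(self.a * k, self.b * k)

    def __add__(self, k: int) -> "Lin":
        return Lin(self.a, self.b + k)

    def concretize(self, lam: int) -> int:
        return self.a * lam + self.b


def solve_eq(expr: Lin, target: int):
    """Return an integer λ with expr == target, or None if none exists."""
    if expr.a == 0:
        return 0 if expr.b == target else None
    quotient, remainder = divmod(target - expr.b, expr.a)
    return quotient if remainder == 0 else None


x = Lin(1, 0)   # x = read()  ->  λ
y = x * 2       # y = 2 * λ
z = y + 4       # z = 2 * λ + 4

lam = solve_eq(z, 12)           # path condition of the bug() branch
print(z, lam, z.concretize(lam))
```

Solving the path condition 2·λ + 4 == 12 yields λ = 4, and substituting that value back into z gives 12, so the concrete input 4 drives the program into bug().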


Symbolic Execution
§ Very powerful tool with several shortcomings
- Symbolic execution does not scale well to complex programs
• Possible paths grow exponentially
- Unbounded loops (i.e., iterations depend on user input) and recursion
• Only approximated
• Finitize paths by unrolling loops and recursion (bounded verification)
• Or finitize paths by limiting the size of path conditions (also bounded)
- Environment (e.g., syscalls, file system, ...) and heap are hard to model
- Possible solution: Concolic (concrete + symbolic) execution
• Run symbolic execution in parallel with real execution
• Take real values if symbolic expressions get too complicated
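
The concolic idea can be sketched as follows: run the program on a concrete input, record the branch decisions as a path condition, negate one decision, and ask a solver for an input that takes the other side. The sketch is a toy under stated assumptions: `program`, `solve` and `concolic` are hypothetical names, and a brute-force search over 0..255 stands in for a real constraint solver.

```python
def program(x, trace):
    """Toy target; every branch appends (predicate, outcome) to `trace`."""
    def branch(pred):
        outcome = pred(x)
        trace.append((pred, outcome))
        return outcome

    if branch(lambda v: v > 100):
        if branch(lambda v: v % 7 == 3):
            return "bug"
    return "ok"


def solve(constraints, domain=range(256)):
    """Stand-in for a solver: first value that satisfies all constraints."""
    for v in domain:
        if all(pred(v) == want for pred, want in constraints):
            return v
    return None


def concolic(start):
    """Explore paths by flipping one recorded branch decision at a time."""
    worklist, results, attempted = [start], {}, set()
    while worklist:
        x = worklist.pop()
        trace = []
        results[x] = program(x, trace)           # concrete execution
        outcomes = tuple(taken for _, taken in trace)
        for i in range(len(outcomes)):
            attempted.add(outcomes[: i + 1])     # prefixes already covered
        for i, (pred, taken) in enumerate(trace):
            flipped = outcomes[:i] + (not taken,)
            if flipped in attempted:
                continue
            attempted.add(flipped)
            new_input = solve(trace[:i] + [(pred, not taken)])
            if new_input is not None:
                worklist.append(new_input)
    return results


print(concolic(0))
```

Starting from the input 0, the loop needs only three concrete runs to cover all three paths of the toy program, including the one that reaches the bug.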


Practical Example: Symbolic Execution via angr

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Serial number check: checkSerial() */
bool checkSerial(const char *in) {
    int sum = 0;
    int digits = strlen(in);
    int parity = (digits - 1) % 2;

    for (int i = digits; i > 0; i--) {
        char current = in[i - 1];
        if (current < '0' || current > '9')
            return 0;
        int digit = current - '0';

        if (parity == i % 2)
            digit *= 2;

        sum += (digit / 10) + (digit % 10);
    }

    return 0 == sum % 10;
}

char input[256];  /* Input */

int main() {
    int i;

    puts("Enter verification number");
    fgets(input, 256, stdin);

    if (strlen(input) != 13)
        return 1;

    input[strlen(input) - 1] = 0;  /* strip the trailing newline */

    if (checkSerial(input)) {
        printf("Number validated!\n");  /* Go here */
    } else {
        printf("Invalid number\n");     /* Avoid this */
    }

    return 0;
}
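
For quick experiments, the check performed by checkSerial() can be ported to Python. This is a direct transliteration for illustration; the name `check_serial` is an assumption, and the port is not part of the original example.

```python
def check_serial(serial: str) -> bool:
    """Python port of checkSerial(): checksum over the decimal digits."""
    total = 0
    digits = len(serial)
    parity = (digits - 1) % 2

    for i in range(digits, 0, -1):
        current = serial[i - 1]
        if current < "0" or current > "9":
            return False              # only ASCII digits are allowed
        digit = ord(current) - ord("0")

        if parity == i % 2:
            digit *= 2                # double every second digit

        total += digit // 10 + digit % 10   # add the digit sum of the result

    return total % 10 == 0


print(check_serial("430009016964"))
print(check_serial("430009016965"))
```

The first number passes the check (its weighted digit sum is 40), while changing the last digit to 5 makes the sum 41 and the check fails.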


Practical Example: angr


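import angr

good = 0x8048630     # target address ("Go here")
avoid = (0x804862)   # address to avoid ("Avoid this"), as printed on the slide
length = 12

project = angr.Project('main.elf')
state = project.factory.full_init_state()
simgr = project.factory.simgr(state)   # start exploring from the initialized state
simgr.explore(find=good, avoid=avoid)

s = simgr.found[0]

# Constrain each byte at 0x0804a060 (assumed to be the global input buffer)
# to an ASCII digit
for i in range(length):
    b = s.memory.load(0x0804a060 + i, 1)
    s.add_constraints(b >= ord('0'), b <= ord('9'))

# Ask the solver for up to `length` concrete values of the buffer
s.solver.eval_upto(s.memory.load(0x0804a060, length), length, cast_to=bytes)

# Dump the stdin contents that lead to the found state
print("Valid number: %s" % s.posix.dumps(0)[0:length].decode())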

$ gcc -std=gnu99 -m32 -no-pie main.c -o main.elf
$ python solve.py
Valid number: 430009016964
python solve.py 25.58s user 0.88s system 100% cpu 26.204 total

% ./main.elf
Enter verification number
430009016964
Number validated!

Impact: Symbolic Execution


§ Possible to find input to get to certain location in binary
- Find reachable location although it should not be reachable
- Find flawed authentication (can it be bypassed?)

§ Idea: combine symbolic execution (SE) with fuzzing
- Start fuzzing
- If fuzzer is stuck, continue with symbolic execution
- Repeat until whole program is tested
§ Combines the strengths of both approaches
§ Open-source implementation Driller: uses AFL + angr
§ Nice concept, but does not scale well to complex programs


Reverse Engineering

Reverse Engineering
§ Reverse Engineering is the process of getting back source code from a binary
§ Identify bugs (or hidden features) if only the binary is available
§ Makes it possible to find compiler-introduced bugs
§ Re-engineering makes it possible to build a new binary from the
reverse-engineered binary
§ Many tools available
- IDA Pro
- Ghidra
- radare2
- …


Methods
§ A disassembler makes it possible to
- Disassemble code (get assembly code)
- Analyze binaries (dependencies, strings, control flow)
- Debug programs (see actual register values, step through code)
- However, a disassembler only returns assembly code, which is often hard to understand
§ Decompilers convert code to a high-level language (e.g., C)
- Decompilation output is often a lot easier to read
- However, decompilation relies on many heuristics and does not always work
- Highly dependent on architecture, compiler used, optimization level,
obfuscation, …
- If it works, it gives a quick overview for further investigations


Binary Diffing: Motivation


§ Security patches for closed-source products often have no (real) description
of the bug
§ The patch is usually available for download
§ Binary diffing is like normal diffing, except for binaries
§ Reveals differences in two binaries, i.e., the bug fix
§ Can also be used to find vulnerable functions by comparing binary with
known vulnerable functions
§ Popular tool: BinDiff


Binary Diffing
§ Diffing tools use different methods to find matching and non-matching blocks
- Same function name
- Same assembly, same decompiled code
- Equal number of calls to and from function
- Same referenced strings
- ...
§ Most diff tools rely on the control-flow graph from a disassembler
§ Basic blocks are then matched using some heuristic
§ Binary diffing is a way to reverse engineer patches
- If there are not many changes, vulnerability can be quickly spotted
§ Knowledge of the vulnerability allows attackers to craft exploits
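
The matching heuristics can be sketched on hand-written function summaries. This is a toy illustration and not how BinDiff works internally: the `Func` record, the two heuristics (same name, then same call counts and referenced strings) and the sample data are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Func:
    name: str
    calls_out: int       # number of calls made by the function
    calls_in: int        # number of calls to the function
    strings: frozenset   # referenced strings
    blocks: tuple        # instruction mnemonics per basic block


def signature(f: Func):
    return (f.calls_out, f.calls_in, f.strings)


def match_functions(old, new):
    """Pair functions by name first, then by structural signature."""
    remaining = list(new)
    pairs = []
    for f in old:
        by_name = [g for g in remaining if g.name == f.name]
        by_sig = [g for g in remaining if signature(g) == signature(f)]
        candidates = by_name or by_sig
        if candidates:
            pairs.append((f, candidates[0]))
            remaining.remove(candidates[0])
    return pairs


def changed_functions(pairs):
    """Names of matched functions whose basic blocks differ."""
    return [old.name for old, new in pairs if old.blocks != new.blocks]


OLD = [
    Func("parse_header", 2, 1, frozenset({"bad header"}),
         (("cmp", "jle"), ("call", "ret"))),
    Func("sub_401000", 0, 3, frozenset(), (("xor", "ret"),)),
]
NEW = [
    Func("sub_401200", 0, 3, frozenset(), (("xor", "ret"),)),
    Func("parse_header", 2, 1, frozenset({"bad header"}),
         (("cmp", "jle"), ("cmp", "ja"), ("call", "ret"))),  # added check
]

pairs = match_functions(OLD, NEW)
print(changed_functions(pairs))
```

In this sample, the unnamed function is matched through its signature despite the changed address-based name, and only parse_header is reported as changed: the patched version contains one additional comparison block, which is where an analyst would start looking for the fixed bug.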


Conclusion

§ You never know how long it takes to find bugs
§ Hard to figure out where the bug is and under which conditions it occurs
§ Many tools support developers in finding bugs
§ Some are completely automated: use them!
§ Incorporate bug finding tools into your development process
- It is easy to use sanitizers, run static code analysis, fuzz software, ...
- ... and it eliminates many bugs (for free)


Sources / References

§ Slides partly based on “Automated Security Testing”, CS155 Computer and
Network Security – Stanford University
