COMP3632 HW2 2022 F (With Answer)
Principles and Practice. This assignment counts for 10% of your final grade.
Written Assignment
1. (12pt) Reverse engineering.
(a) (3pt) What is the typical threat model for adversarial reverse engineering?
Solution: Linear disassembly starts from the first byte of the code section of
the executable file and decodes the bytes sequentially (mapping one byte or a
sequence of bytes to the corresponding assembly instruction) until the end of
the section.
(c) (3pt) Explain what recursive disassembly is, and what advantage it has
over linear disassembly.
Solution: It starts from the entry point of the code and disassembles the bytes
by following the control flow. As a result, it is not misled by data embedded in code.
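To make the contrast between the two strategies concrete, the sketch below disassembles a made-up instruction set with three opcodes (NOP, JMP with a one-byte forward offset, RET). The opcode values and both function names are assumptions for illustration and do not correspond to any real ISA or tool. Because this toy ISA has only unconditional jumps, the control-flow-following version walks a single path; a real recursive disassembler would also keep a worklist of branch targets.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy ISA: NOP and RET are 1 byte, JMP is 2 bytes
   (opcode + unsigned forward offset relative to the next instruction). */
enum { OP_NOP = 0x01, OP_JMP = 0x02, OP_RET = 0x03 };

static size_t insn_len(unsigned char op)
{
    return op == OP_JMP ? 2 : 1;
}

/* Linear sweep: decode byte after byte from offset 0 to the end.
   is_insn[off] is set to 1 for every offset decoded as an instruction start. */
size_t linear_sweep(const unsigned char *code, size_t n, unsigned char *is_insn)
{
    size_t count = 0, off = 0;
    memset(is_insn, 0, n);
    while (off < n) {
        is_insn[off] = 1;
        ++count;
        off += insn_len(code[off]);
    }
    return count;
}

/* Control-flow-following traversal: start at the entry (offset 0),
   follow JMP targets, and stop at RET or at an already visited offset. */
size_t recursive_traversal(const unsigned char *code, size_t n, unsigned char *is_insn)
{
    size_t count = 0, off = 0;
    memset(is_insn, 0, n);
    while (off < n && !is_insn[off]) {
        unsigned char op = code[off];
        is_insn[off] = 1;
        ++count;
        if (op == OP_RET)
            break;
        if (op == OP_JMP) {
            if (off + 1 >= n)
                break;                      /* truncated JMP, stop decoding */
            off = off + 2 + code[off + 1];  /* jump over the skipped bytes */
        } else {
            off += 1;
        }
    }
    return count;
}
```

For the byte sequence 02 02 02 01 01 03 (a JMP over two data bytes, followed by NOP and RET), linear_sweep marks the data byte at offset 2 as an instruction start, while recursive_traversal marks only offsets 0, 4 and 5.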
(d) (3pt) Briefly describe the procedure of control-flow recovery in modern C/C++
decompilers.
Solution: We talked about this in the lecture and my notes. See page 1 of
https://course.cse.ust.hk/comp3632/note8.pdf.
2. (21pt) The C function strcat appends a copy of the source string to the destination
string. The terminating null character in destination is overwritten by the first
character of source, and a null character is included at the end of the new string formed
by the concatenation of both in destination. destination is the return value. The
interface of strcat is defined as:
char * strcat ( char * destination, const char * source );
(a) (3pt) Is strcat safe? Why?
Solution: No. If destination or source lacks a terminating null character,
strcat reads or writes past the end of the buffer, which results in a buffer
overflow. In addition, the combined length of destination and source may
exceed the size of the destination buffer.
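The second issue can be avoided by checking the remaining capacity before copying. The helper below is a minimal sketch, not a standard library function: the name checked_concat and the dst_size parameter are assumptions introduced here, and source is still assumed to be null-terminated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to dst only if the result, including the
   terminating null character, fits into dst_size bytes.
   Returns 0 on success and -1 if dst is unterminated or the result is too long. */
int checked_concat(char *dst, size_t dst_size, const char *src)
{
    /* Look for the terminator inside the buffer only, never beyond it. */
    const char *end = memchr(dst, '\0', dst_size);
    if (end == NULL)
        return -1;

    size_t dst_len = (size_t)(end - dst);
    size_t src_len = strlen(src);       /* assumes src is null-terminated */
    if (src_len >= dst_size - dst_len)
        return -1;                      /* no room for src plus terminator */

    memcpy(dst + dst_len, src, src_len + 1);
    return 0;
}
```

With an 8-byte buffer holding "abc", appending "de" succeeds, while a further attempt to append "fgh" is rejected and leaves the buffer unchanged.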
(b) (10pt) strncat is a function with functionality similar to strcat. strncat
appends the first num characters of source to destination, plus a terminating
null character. destination is the return value. Please give an implementation of
strncat. Its interface is
char * strncat ( char * destination, const char * source, size_t num );
Solution:
char *strncat(char *destination, const char *source, size_t num)
{
    size_t dest_len = 0;
    while (destination[dest_len] != '\0') ++dest_len;
    size_t i = 0;
    for (; i < num && source[i] != '\0'; ++i)
        destination[dest_len + i] = source[i];
    destination[dest_len + i] = '\0';
    return destination;
}
Solution: When the source does not have a null character, strncat can still
work properly.
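As an illustration of this point, the sketch below passes a character array without a terminating null character to the standard library strncat, bounding num both by the length of the array and by the space left in the destination. The wrapper name append_field and its parameters are assumptions for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: appends at most field_len characters of a possibly
   unterminated field to dst without exceeding dst_size bytes in total. */
char *append_field(char *dst, size_t dst_size, const char *field, size_t field_len)
{
    size_t used = strlen(dst);           /* assumes dst is null-terminated */
    size_t room = dst_size - used - 1;   /* assumes used < dst_size */
    size_t n = field_len < room ? field_len : room;

    /* strncat reads at most n characters of field and always appends '\0'. */
    return strncat(dst, field, n);
}
```

Calling append_field twice on an 8-byte buffer holding "ab" with the unterminated field {'x', 'y', 'z'} first yields "abxyz" and then "abxyzxy", because only two characters still fit the second time.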
3. (20pt) In addition to stack-based buffer overflow attacks, heap overflows can also be
exploited. Consider the following C code, which illustrates a heap overflow.
Hint: try this online compiler to compile and run the code: https://www.onlinegdb.com/online_c_compiler.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    int diff, size = 8;
    char *buf1, *buf2;
    buf1 = (char *)malloc(size);
    buf2 = (char *)malloc(size);
    diff = buf2 - buf1;
    memset(buf2, '2', size);
    printf("BEFORE: buf2 = %s", buf2);
    memset(buf1, '1', diff);
    printf("AFTER: buf1 = %s", buf1);
    return 0;
}
(b) (8pt) memset and printf are each invoked twice. Please list all possible
security issues with these calls.
(c) (10pt) In terms of C/C++ memory management, what is the difference between
stack and heap? In particular, which one is allocated/deallocated automatically,
and which one needs programmers to take care of (you can search materials online
but shouldn’t directly copy)?
Solution: Stack memory is allocated and reclaimed automatically on function
entry and exit, while programmers have to explicitly use malloc and free to
manage memory in the heap region.
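The difference can be seen in a few lines of C. This is a minimal sketch; the function names stack_example and heap_example are made up for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stack: buf exists only while the function runs and is released
   automatically when the function returns. */
size_t stack_example(void)
{
    char buf[16];
    memset(buf, 'a', sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    return strlen(buf);
}

/* Heap: the block outlives the function call, and the caller is
   responsible for releasing it with free. Returns NULL on failure. */
char *heap_example(size_t n)
{
    char *p = malloc(n + 1);
    if (p == NULL)
        return NULL;
    memset(p, 'a', n);
    p[n] = '\0';
    return p;
}
```

A caller of heap_example must eventually call free on the returned pointer; forgetting to do so leaks the memory, whereas stack_example needs no cleanup.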
4. (10pt) Integer overflows can also be exploited. Consider the following C code, which
illustrates an integer overflow.
int get_item(int idx)
{
    int array[1000];
    // initialize array
    ...
    // end initialization
    if (idx >= 1000) return -1;
    return array[idx];
}
(a) (6pt) What is the potential problem with this code? Besides integer overflow,
which security issue is triggered, stack overflow or heap overflow?
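Solution: The argument idx is a signed int, but the check only rejects
idx >= 1000. A negative idx (for example, one produced by an integer overflow
in the caller) passes the check, so array[idx] accesses memory outside array.
Since array is a local variable, the out-of-bounds access happens on the
stack, which results in stack overflow.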
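One possible repair is to reject indices on both sides of the valid range. The sketch below assumes a placeholder initialization (array[i] = i) in place of the elided initialization code, and the names get_item_checked and ARRAY_LEN are introduced here for illustration.

```c
#define ARRAY_LEN 1000

/* Hypothetical fixed variant of get_item: returns -1 for any index
   outside [0, ARRAY_LEN). */
int get_item_checked(int idx)
{
    int array[ARRAY_LEN];

    /* Placeholder initialization (assumption): array[i] = i. */
    for (int i = 0; i < ARRAY_LEN; ++i)
        array[i] = i;

    /* Check the lower bound as well as the upper bound. */
    if (idx < 0 || idx >= ARRAY_LEN)
        return -1;
    return array[idx];
}
```

An alternative design is to declare the parameter as an unsigned type, so that a single upper-bound comparison covers both cases.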
5. (16pt) Recall that an opaque predicate is a “conditional” that is actually not a
conditional. That is, the conditional always evaluates to the same result, although it is not
obvious.
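(a) (6pt) Please provide two example opaque predicates and explain
how to use them. (Please do not use overly complicated conditions.)
Solution:
2 × rax + 1 ≡ 1 (mod 2).
rax >> 64 == 0 (treating >> as a mathematical shift of a 64-bit value; a hardware
shift on x86-64 masks the shift count to 6 bits, so a shift by 64 would leave rax unchanged).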
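A minimal sketch of how the first predicate can be used: the real code is placed in the branch that is always taken and decoy code in the branch that is never taken. The function names are assumptions for this example, and x stands in for an arbitrary register value such as rax. Note that an optimizing compiler may fold such a simple predicate away, so obfuscators typically insert it after optimization.

```c
#include <stdint.h>

/* Opaque predicate: 2*x + 1 is odd for every x, even when the
   multiplication wraps around modulo 2^64. */
static int opaquely_true(uint64_t x)
{
    return ((2u * x + 1u) % 2u) == 1u;
}

/* Hypothetical obfuscated addition. */
int add_obfuscated(int a, int b, uint64_t x)
{
    if (opaquely_true(x))
        return a + b;   /* real code, always executed */
    return a - b;       /* decoy code, never executed */
}
```

Whatever value is passed for x, add_obfuscated(2, 3, x) returns 5, yet a static analyzer that cannot prove the predicate must consider both branches.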
(b) (5pt) A side effect of inserting opaque predicates is that they can slow down
execution. Please explain the reason for this side effect and how to alleviate
it.
Solution: The inserted code that computes the branch conditions is executed
at run time, so the additional instructions slow down execution.
To alleviate this, we can insert less obfuscation code into the parts of the
original code that are executed frequently, such as loops and recursive functions.
(c) (5pt) A side effect of inserting opaque predicates is that they can increase the size
of the executable. Please explain the reason for this side effect and how to alleviate
it.
Solution: The additional code increases the size of the executable.
To alleviate this, insert less code in the never-taken branch and design short,
effective opaque conditions.
6. (21pt) Consider the following C++ code. Function number_ratio calculates the
ratio of digit characters in the input string s.
#include <string>
float number_ratio(string s)
{
    int n = 0;
    for (int i = 0; i < s.size(); ++i)
        if (s[i] >= '0' && s[i] <= '9')
            ++n;
    return n / s.size();
}
(a) (3pt) Describe how to launch fuzz testing against this function.
(b) (9pt) What bugs would you expect a fuzzer to identify in this function? Why?
And how would you fix them?
Solution: A fuzzer can generally find the division-by-zero bug easily,
because it causes a crash when s is empty.
To fix it, insert the following code at the beginning of the function:
    if (s.size() == 0)
        return 0.0;
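A naive random fuzzing loop for this function could look like the sketch below. To stay in C it uses a hypothetical port named number_ratio_c that takes a null-terminated string, includes the empty-input check, and divides in floating point; the port, the harness name fuzz_number_ratio and its parameters are all assumptions rather than part of the assignment. Besides waiting for crashes, the harness checks the property that a ratio must lie in [0, 1].

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical C port of number_ratio with the empty-input check applied. */
float number_ratio_c(const char *s)
{
    size_t len = strlen(s);
    size_t n = 0;
    if (len == 0)
        return 0.0f;
    for (size_t i = 0; i < len; ++i)
        if (s[i] >= '0' && s[i] <= '9')
            ++n;
    return (float)n / (float)len;
}

/* Feeds `rounds` random strings of length 0..max_len to the function.
   Returns the number of results outside [0, 1], or -1 if allocation fails. */
int fuzz_number_ratio(unsigned seed, int rounds, size_t max_len)
{
    int violations = 0;
    char *buf = malloc(max_len + 1);
    if (buf == NULL)
        return -1;

    srand(seed);
    for (int r = 0; r < rounds; ++r) {
        size_t len = (size_t)rand() % (max_len + 1);  /* includes length 0 */
        for (size_t i = 0; i < len; ++i)
            buf[i] = (char)(1 + rand() % 127);        /* never '\0' */
        buf[len] = '\0';

        float ratio = number_ratio_c(buf);
        if (!(ratio >= 0.0f && ratio <= 1.0f))
            ++violations;
    }
    free(buf);
    return violations;
}
```

Production fuzzers generate inputs with coverage feedback instead of plain rand, but the structure is the same: generate an input, run the target, and watch for crashes or violated properties.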
(c) (9pt) What bugs would be more difficult for a fuzzer to find in this function?
Why? And how to fix this bug?
Programming Assignment – Buffer Overflow
For this assignment, we provide three programs named login1.cpp, login2.cpp and login3.cpp.
These three programs check whether the user-provided username and password match the
information stored in password.txt.
Your task is to perform a buffer overflow attack against these three test programs by providing
a username and password that differ from the information in password.txt to bypass the
identity check. On a successful attack, you should see the message “Login successful!” (please
take a look at the code, whose output messages are self-explanatory). Also, you should
not use any information in the password.txt file: it should be deemed “secret”.
login1.cpp, login2.cpp and login3.cpp check your username/password against the secret
in password.txt. Note that login2.cpp uses a hard-coded canary to detect buffer overflows,
just like stack canaries. login3.cpp mimics a “random” canary computed at runtime
(but this one is still less challenging than real-life scenarios).
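The idea behind a hard-coded canary can be sketched as follows. This is not the code of login2.cpp: the struct layout, the constant CANARY_VALUE and the function name are assumptions made up for illustration, and the sketch assumes a 4-byte unsigned int placed directly after the buffer. The copy length is capped at the size of the struct so that the simulated overflow stays within one object.

```c
#include <stddef.h>
#include <string.h>

#define CANARY_VALUE 0x5A5A5A5Au   /* hypothetical hard-coded canary */

struct frame {
    char buf[8];            /* input buffer */
    unsigned int canary;    /* guard value placed right after the buffer */
};

/* Copies len bytes of input over the frame, as an unchecked copy into buf
   would, and reports whether the canary survived.
   Returns 1 if the canary is intact and 0 if an overflow was detected. */
int copy_and_check(const char *input, size_t len)
{
    struct frame f;
    memset(&f, 0, sizeof f);
    f.canary = CANARY_VALUE;

    if (len > sizeof f)
        len = sizeof f;         /* keep the demonstration inside the struct */
    memcpy(&f, input, len);     /* bytes beyond buf[7] spill into canary */

    return f.canary == CANARY_VALUE;
}
```

A short input leaves the canary intact, and twelve 'A' characters are detected. An input that places the known canary bytes ('Z' is 0x5A) at the right offset, such as "AAAAAAAAZZZZ", overflows the buffer without being detected, which is why a canary fixed at compile time is weaker than one chosen at runtime.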
To avoid plagiarism, the provided username must start with YOUR OWN student
ID. For example, if your student ID is XXX, then the username you provide
to trigger the buffer overflow must start with XXX.
The content of file password.txt will be changed during scoring.
• (15 pt) Using a buffer overflow attack to successfully exploit login1.cpp or explain
why that’s not feasible. If feasible, submit your username and password in a file called
login1.txt, with exactly two lines, the first line being the username, and the second
line being the password.
• (25 pt) Using a buffer overflow attack to successfully exploit login2.cpp or explain
why that’s not feasible. If feasible, submit your username and password in a file called
login2.txt, with exactly two lines, the first line being the username, and the second
line being the password.
• (50 pt) Using a buffer overflow attack to successfully exploit login3.cpp or explain
why that’s not feasible. If feasible, submit your username and password in a file called
login3.txt, with exactly two lines, the first line being the username, and the second
line being the password.
When grading, we will 1) manually check login1.txt, login2.txt and login3.txt,
and 2) try to reproduce the attack with your inputs on our end. On the other hand, if you
believe certain attacks are not feasible, we will read your answers and grade accordingly.
Submission Instructions
All submissions should be made through the Canvas system. You should submit a PDF
document with your answers for the written component, and three files, login1.txt,
login2.txt and login3.txt, for the programming component.
It is important to name your files correctly. Please check the late submission policies on
the course website (https://course.cse.ust.hk/comp3632) in case you didn’t attend the
first lecture.