
A Method For Detecting Buffer Overflow Vulnerabilities

The paper presents a method for detecting buffer overflow vulnerabilities by combining static analysis and dynamic testing on binary files. The proposed methodology involves converting binary files to assembly language, identifying potential overflow points through static analysis, and verifying these points with dynamic testing. Experimental results demonstrate the feasibility and effectiveness of the approach in enhancing network security.


A Method for Detecting Buffer Overflow Vulnerabilities

Jingbo Yuan, Shunli Ding


Institute of Information Management Technology and Application
Northeastern University at Qinhuangdao
Qinhuangdao, China
[email protected], [email protected]

978-1-61284-486-2/11/$26.00 ©2011 IEEE

Abstract—Buffer overflow vulnerabilities are currently the most prevalent class of security vulnerability. This paper presents a method that combines static analysis with dynamic testing to detect buffer overflow vulnerabilities. The method identifies potential weakness locations, and a buffer overflow vulnerability testing system was developed on this basis. The experimental results verify that the new methodology is feasible and effective.

Keywords-security vulnerability; buffer overflow; static analysis; dynamic test

I. INTRODUCTION
The buffer overflow vulnerability is one of the most dangerous and most widely distributed software vulnerabilities. Attackers frequently breach security, or simply crash computer systems, by exploiting buffer overflows. It is therefore important to find buffer overflows, especially those that have not been discovered, have not been reported, or have been reported but received little attention. Finding them plays a significant role in network security, and various solutions have been developed to address the buffer overflow problem [1-4].
Traditionally there are two main ways to find buffer overflows. One is to analyze the source code of a program, which is called static analysis. The other is to test a program while it is running, which is called dynamic analysis. However, apart from open-source programs, source code is rarely available. Dynamic analysis does not require source code, but it is inefficient because it has no knowledge of the program's internal structure.
This paper presents a method that balances static analysis and dynamic analysis to identify buffer overflow vulnerabilities in binary code without source code. The method consists of two steps. The first step finds potential weakness locations, and the second step tests every potential weakness location in order to eliminate false positives.

II. THE METHOD FOR DETECTING BUFFER OVERFLOW VULNERABILITIES

A. The detecting flow
This paper presents a method integrating static and dynamic analysis for detecting buffer overflow vulnerabilities in binary files. The detecting flow is as follows.
Step 1: Convert the binary file to assembly language;
Step 2: Statically analyze the assembly code and judge whether overflow points exist; if not, end;
Step 3: Save the potential overflow points;
Step 4: Dynamic testing. Judge whether each point can actually overflow; if not, end;
Step 5: Output the report.
First, the binary file is converted to an assembly language file. Then, by analyzing the assembly language syntax and control flow, information about overflow points that may have a buffer overflow vulnerability is extracted. An overflow point is the first address of the function or code segment in which the overflow occurs. The information includes the function call relations in the binary file, the addresses of potential overflow points, and the sizes of the source and destination buffers. Software programs have a standardized structure. A program is composed of code, stack and heap. The code is composed of many modules, and each module generally implements an independent function. Modules store local variables and can call each other. Although different programs differ in detail, the structured programming model is suitable for various platform architectures. Therefore, this method applies to all kinds of platform architectures and programming models. This paper uses the Intel 32-bit processor architecture.

B. Looking for potential overflow points
In this stage the binary file is analyzed to look for potential overflow points. The method needs neither additional information about the file nor debugging information generated by the compiler; it only needs the entry address of the program. The purpose is to understand the basic structure of the file under test, its stack usage, and its function call relations.
The analysis is not a simple byte-by-byte scan of the program. The binary file is first converted to assembly language, and its instruction sequences are analyzed to find code segments that copy data from a source buffer to a destination buffer. The analysis then judges whether the code lacks a buffer bounds check. Without bounds checking, the code may contain a buffer overflow vulnerability that can be attacked. The main tasks of this step are:
1) Determining the function call relations
All functions are identified by recursive analysis of the object code. For executable programs, the starting point of the analysis is the entry point of the program. For library files, each exported function is a starting point. The address behind each CALL instruction is added to the function analysis table unless that address has already been analyzed. In this way we obtain a list of all the functions in the program.
2) Analyzing stack space
The main task of this step is to determine whether a buffer exists in each function's stack frame. There are two judging conditions. One is the size and layout of the stack storage space, that is, the number of variables and the size of each variable. The other is the addressing type of the pointer used to access the stack.
3) Analyzing parameters
This step identifies the parameter types of a potential overflow function. Pointer-type parameters deserve particular attention. If a function has no local buffer but receives a pointer as an argument, the pointer likely points to a buffer in the calling function.
4) Use of local buffer
This step detects functions with a local buffer by analyzing the code in the function. On the one hand, it determines the instruction sequence that copies data to the local buffer by detecting a loop that includes a MOV instruction, or a MOV instruction with a special prefix. On the other hand, if PUSH REG and LEA REG, [EBP-OFFSET] appear before a CALL instruction, we can conclude that a buffer address is being passed as an argument of the call.

C. Determining the overflow function
In this phase the work is to identify and test potential overflow functions in order to exclude functions that cannot overflow. It is very difficult to use one program to judge the correctness of another, so special tests are needed to exclude the functions that cannot overflow.

III. IMPLEMENTATION OF BUFFER OVERFLOW VULNERABILITY DETECTION

A. The whole structure of system
The detection process is divided into two stages: static analysis and dynamic testing. Figure 1 shows the overall system structure.
Before testing, a script must be written for each type of potential overflow function. The database system is used to record information about potential overflow points. This paper uses built-in arrays to store the information.

B. Static analysis tool
This paper uses IDA Pro to disassemble the program to be analyzed. IDA Pro is a multi-processor disassembler and debugger hosted on Windows, Linux or Mac OS X [5]. It is a professional disassembly tool with very strong disassembly capabilities.
Because common buffer overflow vulnerabilities are caused by a lack of verification on memory copy operations and the like, they show some regularity. We disassemble the target code using IDA Pro and search the disassembly results for potential overflow risks, then analyze and discriminate among them to mine system-level security vulnerabilities effectively.

Figure 1. The whole structure of system (diagram elements: static analysis using the script; script 1, script 2, script 3; database system; potential overflow points; dynamic test)

C. Script Controller
In addition to its powerful disassembly functions and interactive user interface, IDA Pro provides IDC automatic scripting. Users can process the disassembly database by writing automated scripts for specific purposes. IDC is an embedded language. It greatly extends IDA, so that many complex tasks can be completed with IDC. At the same time, IDC can also handle some special circumstances automatically.

D. Generating function call diagram
In this paper the function dependencies in the executable binary file are extracted by an IDC script. We then mainly analyze the format-string library functions and the function calls that may result in buffer overflow vulnerabilities, in order to determine whether the program has security risks, thereby reducing false vulnerability reports and improving the quality of the vulnerability report. The IDC script traverses the functions depth-first, starting from the main function at the program entry of the target binary file, and simulates the first-in, last-out (FILO) behavior of a stack to enumerate all the functions. Library functions that are called are not traversed, which removes a great deal of interfering information and improves the efficiency of the analysis.

E. Detecting potential overflow function
A potential overflow function is a function that accesses a buffer without checking the buffer boundary. We use BugScam to detect the library functions. At present BugScam can detect functions such as strcpy, strcat, MultiByteToWideChar, wsprintfA, lstrcatA, sprintf and lstrcpyA. To facilitate dynamic detection, some information about each function needs to be recorded, so we modified the IDC scripts. For user-defined potential overflow functions there are mainly two cases.
1) Detecting string operating instruction


String operating instructions use the REP MOVS instruction to write continuously to a buffer, and the instruction and its surrounding instruction sequence have obvious characteristics. The detecting flow is shown in Figure 2, where register CX controls the number of iterations.

Figure 2. The flow of detecting string operating (flowchart boxes, in order: Start; Read the next instruction; Is it a REP MOVS instruction? (Y/N); Is CX a constant? (Y/N); Seek upward for a lea esi,xx or mov esi,xx instruction and calculate the size of the source buffer; Seek upward for a lea edi,xx or mov edi,xx instruction and calculate the size of the destination buffer; Save the initial address of the program segment and the size of the buffer; End)

2) Detecting MOV + LOOP instruction
This kind of potential overflow function writes continuously to a buffer using MOV instructions combined with a loop. In general, to write continuously to a buffer, this type of instruction sequence must meet the following conditions:
C1: It consists of two or more MOV instructions.
C2: The source operand of one MOV instruction is the destination operand of another MOV instruction, and it is a register.
C3: No other instruction takes that register as its destination operand between the two MOV instructions.
C4: The MOV instructions use index registers, which may be the same or different.
C5: There are instructions that adjust the index registers, and the registers may be incremented or decremented at the same time.
C6: These instructions are in the same loop.
Based on these conditions, the algorithm for recognizing this type of function is as follows.

Step 1: Read the instruction at the current address and analyze its type.
a) If it is a MOV instruction, record it in the MOV instruction queue, and set flag FOUNDMOVS when the number of MOV instructions is greater than or equal to 2;
b) If it is an INC, DEC, ADD or SUB instruction, record it in the index-address adjusting instruction queue, and set flag FOUNDINC or FOUNDDEC accordingly;
c) If it is a JMP or LOOP instruction, record the loop instruction and set flag FOUNDLOOP.
Step 2: When the flags FOUNDMOVS, FOUNDINC (or FOUNDDEC) and FOUNDLOOP are all set, perform the following checks.
a) Check whether two MOV instructions satisfy conditions C2, C3 and C4. If so, continue; otherwise go to Step 3.
b) Check whether the index-address adjusting instruction queue contains an instruction that adjusts the index registers. If so, continue; otherwise go to Step 3.
c) Check the address range covered by the loop instruction. If all these instructions lie within that range, continue; otherwise go to Step 3.
d) Check whether the MOV instructions, the index register adjusting instruction and the loop instruction are in the same loop. If so, go to Step 4; otherwise go to Step 3.
Step 3: Clear the FOUNDLOOP flag and read the next instruction. If the address is invalid, end; otherwise go to Step 1.
Step 4: Detection succeeds. Advance the address of the analyzed instruction. If the next address is invalid, end; otherwise go to Step 1.

F. The implementation of static analysis
Following the above analysis, the static analysis tool mainly analyzes five kinds of potential overflow functions and records the function information in the database for subsequent dynamic testing. We rewrote five IDC scripts, listed below, to identify and analyze the potential overflow functions.
Strcpy.idc: detects the string copy function strcpy( );
Strcat.idc: detects the string concatenation function strcat( );
Sprintf.idc: detects the formatted string output function sprintf( );
Repmov.idc: detects data copying by string operating instructions;
Movjmp.idc: detects data copying by MOV + loop instructions.

G. The implementation of dynamic test
Dynamic testing is implemented with the OllyDbg debugger [6]. OllyDbg is a user-mode analyzing debugger. It can recognize thousands of functions frequently used in C and Windows programs and annotate their arguments. This paper dynamically tests the statically analyzed program in OllyDbg, using the potential overflow point information obtained earlier, including addresses and buffer sizes. The testing process consists of setting targeted breakpoints, filling the destination buffer with data of different lengths, and judging from the run results whether each overflow point is genuine.
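
As an illustration of this testing idea (fill the destination buffer with inputs of different lengths and judge the outcome), the following Python sketch simulates it without a debugger. It is a simplified model under stated assumptions, not the OllyDbg procedure used in the paper: the stack frame is a hypothetical bytearray in which the destination buffer is followed by an 8-byte guard region standing in for the saved EBP and return address, and the names `make_frame`, `unchecked_copy` and `probe_overflow` are invented here.

```python
# Guard bytes stand in for the saved EBP and return address (assumption).
GUARD = b"\xAA" * 8


def make_frame(buf_size):
    """Build a toy stack frame: destination buffer followed by guard bytes."""
    return bytearray(buf_size) + bytearray(GUARD)


def unchecked_copy(frame, data):
    """Copy data plus a NUL terminator into the frame with no bounds check
    against the destination buffer, the way strcpy behaves."""
    payload = data + b"\x00"
    if len(payload) > len(frame):
        raise ValueError("payload larger than the simulated frame")
    frame[:len(payload)] = payload


def probe_overflow(buf_size, lengths):
    """Try one input per length and record whether the guard was corrupted."""
    results = {}
    for n in lengths:
        frame = make_frame(buf_size)
        unchecked_copy(frame, b"A" * n)
        results[n] = bytes(frame[buf_size:]) != GUARD
    return results


if __name__ == "__main__":
    for length, overflowed in probe_overflow(12, [4, 11, 12, 19]).items():
        print(length, "overflow" if overflowed else "ok")
```

In this model a 12-byte buffer survives inputs of up to 11 characters, because the terminator still fits, while an input of 12 characters or more corrupts the guard region and confirms the overflow point.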
TABLE I. TESTING RESULT OF FLAWFINDER

overflow point (line) | overflow point (column) | overflow function | description
30 | 4 | sprintf | does not check for buffer overflows. Use snprintf or vsnprintf.
46 | 4 | sprintf | does not check for buffer overflows. Use snprintf or vsnprintf.
62 | 4 | sprintf | does not check for buffer overflows. Use snprintf or vsnprintf.
19 | 2 | strcpy | does not check for buffer overflows. Consider using strncpy or strlcpy. Risk is low because the source is a constant string.
24 | 2 | strcat | does not check for buffer overflows. Consider using strncat or strlcat. Risk is low because the source is a constant string.
35 | 2 | strcpy | does not check for buffer overflows. Consider using strncpy or strlcpy. Risk is low because the source is a constant string.
40 | 2 | strcat | does not check for buffer overflows. Consider using strncat or strlcat. Risk is low because the source is a constant string.
51 | 2 | strcpy | does not check for buffer overflows. Consider using strncpy or strlcpy. Risk is low because the source is a constant string.
56 | 2 | strcat | does not check for buffer overflows. Consider using strncat or strlcat. Risk is low because the source is a constant string.
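
The rows of Table I have the shape of a lexical scan over source code: each call to a risky library function is reported by position, whether or not it can actually overflow. The Python sketch below imitates that kind of scan under loud assumptions. It is not Flawfinder's implementation; the rule set `RISKY`, the function `scan_source` and the C fragment `SAMPLE` are hypothetical. It also suggests why a hand-written copy loop goes unreported by such a scan: there is no function name to match.

```python
import re

# Hypothetical rule set: the function names that appear in Table I.
RISKY = ("sprintf", "strcpy", "strcat")
CALL = re.compile(r"\b(" + "|".join(RISKY) + r")\s*\(")


def scan_source(text):
    """Return (line, column, function) for each risky call, both 1-based."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in CALL.finditer(line):
            hits.append((lineno, match.start() + 1, match.group(1)))
    return hits


# Hypothetical C fragment: one library call and one hand-written copy loop.
SAMPLE = """\
void f(char *dst, const char *src) {
    char buf[12];
    strcpy(buf, src);
    while (*src) *dst++ = *src++;
}
"""

if __name__ == "__main__":
    for hit in scan_source(SAMPLE):
        print(hit)
```

On `SAMPLE` the scan reports only the strcpy call on line 3; the WHILE loop on line 4 copies data without any bounds check but matches no rule.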

TABLE II. TESTING RESULT OF TESTBUFFEROVERFLOW

call address | destination buffer | source buffer | overflow function | description
4011c6 | 12 | 40 | sprintf | The maximum expansion of the data appears to be larger than the target buffer, this might be the cause of a buffer overrun ! Maximum Expansion: 40 Target Size: 12
4012b6 | 20 | 40 | sprintf | The maximum expansion of the data appears to be larger than the target buffer, this might be the cause of a buffer overrun ! Maximum Expansion: 40 Target Size: 20
401121 | 12 | 37 | strcpy | The maximum possible size of the target buffer (12) is smaller than the minimum possible size of the source buffer (37). This is VERY likely to be a buffer overrun!
401211 | 20 | 37 | strcpy | The maximum possible size of the target buffer (20) is smaller than the minimum possible size of the source buffer (37). This is VERY likely to be a buffer overrun!
401aad | 4096 | 0 | strcpy | UNKNOWN_SOURCE_SIZE: The analyzer was unable to determine the size of the data source; This location should be investigated manually.
401171 | 12 | 37 | strcat | The maximum possible size of the target buffer (12) is smaller than the minimum possible size of the source buffer (37). This is VERY likely to be a buffer overrun!
401261 | 20 | 37 | strcat | The maximum possible size of the target buffer (20) is smaller than the minimum possible size of the source buffer (37). This is VERY likely to be a buffer overrun!
4048d9 | 160 | 0 | strcat | UNKNOWN_SOURCE_SIZE: The analyzer was unable to determine the size of the data source; This location should be investigated manually.
404907 | 160 | 0 | strcat | UNKNOWN_SOURCE_SIZE: The analyzer was unable to determine the size of the data source; This location should be investigated manually.
40120d | 16 | 40 | repmov | The maximum expansion of the data appears to be larger than the target buffer, this might be the cause of a buffer overrun ! Maximum Expansion: 40 Target Size: 16
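
The verdicts in Table II follow from comparing the two recorded buffer sizes. The Python sketch below restates that comparison over the table's rows. The names `classify` and `summarize`, and the reading of a source size of 0 as "size not recovered", are assumptions inferred from the table, not the tool's actual code.

```python
# (call address, destination size, source size, function), copied from Table II.
ROWS = [
    ("4011c6", 12, 40, "sprintf"),
    ("4012b6", 20, 40, "sprintf"),
    ("401121", 12, 37, "strcpy"),
    ("401211", 20, 37, "strcpy"),
    ("401aad", 4096, 0, "strcpy"),
    ("401171", 12, 37, "strcat"),
    ("401261", 20, 37, "strcat"),
    ("4048d9", 160, 0, "strcat"),
    ("404907", 160, 0, "strcat"),
    ("40120d", 16, 40, "repmov"),
]


def classify(dest_size, src_size):
    """Label one potential overflow point from its two buffer sizes."""
    if src_size == 0:  # assumption: 0 means the source size was not recovered
        return "unknown"
    return "overflow" if src_size > dest_size else "safe"


def summarize(rows):
    """Count the labels per function name."""
    summary = {}
    for _addr, dest, src, func in rows:
        label = classify(dest, src)
        counts = summary.setdefault(func, {})
        counts[label] = counts.get(label, 0) + 1
    return summary


if __name__ == "__main__":
    for func, counts in summarize(ROWS).items():
        print(func, counts)
```

With these rows, `summarize(ROWS)` counts two overflows each for sprintf, strcpy and strcat and one for repmov, in line with the expected results stated in Section IV.A; the three rows with an unknown source size are left for manual investigation.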


IV. TEST AND ANALYSIS FOR BUFFER OVERFLOW VULNERABILITY

A. Comparison test of the program with vulnerability
We first wrote a program with buffer overflow vulnerabilities, then tested it with our tool Testbufferoverflow and compared the results with those of Flawfinder. Flawfinder is a program that examines C source code and reports possible security weaknesses sorted by risk level. It is very useful for quickly finding and removing some security problems before a program is widely released [7]. The test results are shown in Table I and Table II respectively.
The tested program calls ten functions. The functions strcpy, strcat and sprintf are each called three times, and in two of the three calls the destination buffer is smaller than the source buffer. Another function uses a WHILE loop to copy data from the source buffer to the destination buffer. Therefore the correct result is that strcpy, strcat and sprintf each have two overflows and there is one overflow of the RepMOV type.
When Flawfinder is used on the program, it finds all the library function calls but does not find the overflow of the RepMOV type. Testbufferoverflow finds six addresses of possible overflow in the functions strcpy, strcat and sprintf and calculates the sizes of the source and destination buffers. For the third call of each function the destination buffer is large enough, so the call cannot overflow and is not taken as an overflow point. In addition, the tool also finds one overflow point of the RepMOV type.
The results show that, compared with Flawfinder, our tool is more accurate and effective. It can find all the problematic functions in the program and precisely locate all overflow points.

B. Dynamic Testing
Dynamic testing examines program execution to determine whether buffer overflows occur during that execution. Based on the static analysis results above, we take one address, for example 401121, as a potential overflow point for dynamic testing. Setting a breakpoint at 00401121 yields the following code at the breakpoint.
00401100 push ebp
00401101 mov ebp, esp
00401103 sub esp, 4C                     ; reserve 0x4C bytes for locals
00401106 push ebx
00401107 push esi
00401108 push edi
00401109 lea edi, dword ptr [ebp-4C]
0040110C mov ecx, 13                     ; 0x13 dwords = 0x4C bytes
00401111 mov eax, CCCCCCCC
00401116 rep stos dword ptr es:[edi]     ; fill the locals with 0xCC
00401118 push 0042201C                   ; source buffer address
0040111D lea eax, dword ptr [ebp-C]      ; destination buffer at ebp-0xC
00401120 push eax
00401121 call 00401450                   ; strcpy
The instruction "call 00401450" calls the function strcpy, and the two push instructions push the source buffer address and the destination buffer address onto the stack. The sizes of the source buffer and the destination buffer are 37 and 12 respectively. The content at address 00402201 is "this is a test,this is a test". Because the string is longer than 12 and the destination buffer size is 12, an overflow error occurs when the program runs. If the string is changed to "this", whose size is less than 12, the program runs normally without error. From this we conclude that the function strcpy lacks bounds checking, that is, it does not check the buffer size, so a buffer overflow vulnerability exists.
In the same way, we can dynamically test each potential overflow point to look for possible buffer overflow vulnerabilities.

V. CONCLUSION
Buffer overflow vulnerabilities are caused by programming errors. By exploiting a buffer overflow vulnerability, an attacker can cause the program to write beyond the bounds of an allocated memory block and thereby corrupt other data structures. This paper proposes a method for mining potential buffer overflow vulnerabilities in binary files when the source code is not available, and implements a detection system that combines static analysis and dynamic testing. Actual programs with buffer overflow vulnerabilities were tested, and the results were compared with those of the common buffer overrun detection tool Flawfinder. The test results show that the method improves the accuracy of static analysis and reduces the false alarm rate. The tool Testbufferoverflow also provides a basis for dynamic testing, which reduces dynamic test time and improves the efficiency of dynamic testing.

REFERENCES
[1] C. Cowan, P. Wagle, C. Pu, S. Beattie, and J. Walpole, "Buffer overflows: attacks and defenses for the vulnerability of the decade," DARPA Information Survivability Conference and Exposition, vol. 2, pp. 119–129, 2000.
[2] Seon-Ho Park, Young-Ju Han, and Soon-Jwa Hong, "The Dynamic Buffer Overflow Detection and Prevention Tool for Windows Executables Using Binary Rewriting," The 9th International Conference on Advanced Communication Technology, vol. 3, pp. 1776–1781, 2007.
[3] B. Salamat, A. Gal, T. Jackson, K. Manivannan, G. Wagner, and M. Franz, "Multi-variant Program Execution: Using Multi-core Systems to Defuse Buffer-Overflow Vulnerabilities," International Conference on Complex, Intelligent and Software Intensive Systems, pp. 843–848, 2008.
[4] M. Akbari, S. Berenji, and R. Azmi, "Vulnerability detector using parse tree annotation," 2nd International Conference on Education Technology and Computer (ICETC), vol. 4, pp. 254–257, 2010.
[5] http://www.hex-rays.com/idapro/
[6] http://www.ollydbg.de/
[7] http://sourceforge.net/projects/flawfinder/


