MemLock: Memory Usage Guided Fuzzing

Cheng Wen (CSSE, Shenzhen University, Shenzhen, China)
Haijun Wang∗ (Ant Financial Services Group, China; CSSE, Shenzhen University, China)
Yuekang Li (Nanyang Technological University, Singapore)
Shengchao Qin∗ (SCEDT, Teesside University, Tees Valley, UK)
Yang Liu (Nanyang Technological University, Singapore)
Zhiwu Xu (CSSE, Shenzhen University, Shenzhen, China)
Hongxu Chen, Xiaofei Xie (Nanyang Technological University, Singapore)
Geguang Pu (East China Normal University, Shanghai, China)
Ting Liu (Xi’an Jiaotong University, Xi’an, China)

ABSTRACT
Uncontrolled memory consumption is a critical class of software security weakness. It can also become a security-critical vulnerability when attackers can take control of the input to consume a large amount of memory and launch a Denial-of-Service attack. However, detecting such vulnerabilities is challenging, as the state-of-the-art fuzzing techniques focus on code coverage rather than memory consumption. To this end, we propose a memory usage guided fuzzing technique, named MemLock, to generate inputs that consume excessive memory and trigger uncontrolled memory consumption bugs. The fuzzing process is guided with memory consumption information so that our approach is general and does not require any domain knowledge. We perform a thorough evaluation of MemLock on 14 widely-used real-world programs. Our experimental results show that MemLock substantially outperforms the state-of-the-art fuzzing techniques, including AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM, in discovering memory consumption bugs. During the experiments, we discovered many previously unknown memory consumption bugs and received 15 new CVEs.

CCS CONCEPTS
• Security and privacy → Software security engineering.

KEYWORDS
Fuzz Testing, Software Vulnerability, Memory Consumption

ACM Reference Format:
Cheng Wen, Haijun Wang, Yuekang Li, Shengchao Qin, Yang Liu, Zhiwu Xu, Hongxu Chen, Xiaofei Xie, Geguang Pu, and Ting Liu. 2020. MemLock: Memory Usage Guided Fuzzing. In 42nd International Conference on Software Engineering (ICSE ’20), May 23–29, 2020, Seoul, Republic of Korea. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3377811.3380396

∗ Corresponding authors: Shengchao Qin and Haijun Wang

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
ICSE ’20, May 23–29, 2020, Seoul, Republic of Korea
© 2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-7121-6/20/05...$15.00
https://doi.org/10.1145/3377811.3380396

1 INTRODUCTION
Time and space complexities are two main concerns in software design and development. If they are not handled well, unexpected behaviors and even troublesome security issues can happen. In real-world programs, many such security vulnerabilities have been found (e.g., [17–23, 74]). For example, if the termination conditions of recursive functions are not implemented correctly, an infinite number of recursive function calls can occur and exhaust the stack memory. Adversaries can exploit this vulnerability to launch a Denial-of-Service (DoS) attack with some well-crafted inputs [18, 21]. Recently, researchers have started to pay attention to these issues. For example, SlowFuzz [58], PerfFuzz [37] and ReScue [63] were developed to generate pathological inputs to stress time complexity issues (i.e., algorithmic complexity vulnerabilities). However, automatically generating pathological inputs to stress space complexity issues (namely, memory consumption bugs) has remained unexplored thus far.

Although a number of works (e.g., the popular fuzzing techniques [11, 28, 45, 61, 84]) have been devoted to detecting memory issues, they mostly focus on memory corruption vulnerabilities such as buffer overflow and use-after-free. Memory corruption occurs in a program when the contents of the memory are modified due to some unexpected program behavior that exceeds the original intention of the program [65, 67, 72]. When the corrupted memory contents are used later by the program, it may lead to unexpected behaviors (e.g., program crash). However, memory consumption bugs are essentially different from memory corruption vulnerabilities. As defined by CWE-400 [49], the software does not properly control the allocation and maintenance of a limited resource, thereby enabling an actor to influence the amount of resources consumed, eventually leading to the exhaustion of available resources. To make it explicit, this paper focuses on three types of memory consumption bugs: uncontrolled-recursion [52], uncontrolled-memory-allocation [51], and memory leak [50]. Uncontrolled-recursion may exhaust stack memory when the program does not properly control the amount of recursion that takes place. Uncontrolled-memory-allocation refers to the situation whereby the program allocates memory based on an untrusted size value, but it does not validate or incorrectly validates

 1  struct demangle_component *
 2  cplus_demangle_type (struct d_info *di) {
 3
 4    // "peek" is a single character extracted from the input directly
 5    char peek = d_peek_char (di);
 6
 7    switch (peek){
 8      ...
 9      case 'P':
10        ret = d_make_comp (di,
11                           DEMANGLE_COMPONENT_POINTER,
12                           cplus_demangle_type (di), NULL);
13        break;
14      case 'C':
15        ...
16    }
17    ...
18  }

Figure 1: Code Snippet from cp-demangle.c in Binutils v2.31

 1  class EXIV2API DataBuf {
 2  public:
 3    // Constructor with an initial buffer size
 4    explicit DataBuf(long size): pData(new byte[size]), size(size) {}
 5    ...
 6    byte* pData;  // Pointer to the buffer
 7    size_t size;  // The current size of the buffer
 8  };
 9
10  void Jp2Image::readMetadata() {
11    while (io_->read((byte*)&subBox, sizeof(subBox)) == sizeof(subBox) && subBox.length ) {
12      subBox.length = getLong((byte*)&subBox.length, bigEndian);
13      DataBuf data(subBox.length);  // Allocation without checking
14      ...
15      io_->seek(position - sizeof(box) + box.length, BasicIo::beg);
16    }
17  }

Figure 2: Code Snippet from jp2image.cpp in Exiv2 v0.26
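The recursion pattern of Figure 1 can be mimicked with a toy parser. The following is a minimal sketch, not the Binutils code: `DepthProbe`, `parse_type`, and `peak_depth` are hypothetical names introduced only for illustration, and the sketch assumes that each 'P' in the input causes exactly one nested call. It counts live calls on entry and exit and reports the deepest nesting reached.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the recursive pattern in Figure 1.
// This is NOT the Binutils code; all names are illustrative assumptions.
struct DepthProbe {
    std::size_t current = 0;  // number of calls currently on the stack
    std::size_t peak = 0;     // deepest nesting observed so far
};

// Consumes one character; a 'P' triggers one more nested call,
// mirroring the self-call at line 12 of Figure 1.
void parse_type(const std::string& input, std::size_t pos, DepthProbe& probe) {
    ++probe.current;  // function entry
    if (probe.current > probe.peak) {
        probe.peak = probe.current;
    }
    if (pos < input.size() && input[pos] == 'P') {
        parse_type(input, pos + 1, probe);
    }
    --probe.current;  // function exit
}

// Returns the peak call depth reached while parsing `input`.
std::size_t peak_depth(const std::string& input) {
    DepthProbe probe;
    parse_type(input, 0, probe);
    return probe.peak;
}
```

Under these assumptions, `peak_depth("PPPP")` is 5 and `peak_depth("PPPPP")` is 6, so the peak depth grows by one for every additional 'P'. Both inputs take the same branch of the toy parser and differ only in how often they take it, which is why a depth counter can separate inputs that a coverage signal treats alike.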

the size, allowing arbitrary amounts of memory to be consumed. Moreover, if the software does not track and release allocated memory after it has been used, it causes a memory leak.

Existing detection techniques for memory consumption bugs usually use domain- or implementation-specific heuristics or rules [15, 24, 46, 70, 79]. For example, Radmin [24] learns and executes multiple probabilistic finite automata, and then confines the resource usage of target programs to the learned automata and detects resource usage anomalies at their early stages. Thus, their effectiveness heavily depends on the completeness of heuristics and rules. Creating and maintaining such rules requires substantial manual effort and expertise. In this paper, we employ the grey-box fuzzing [84] technique to develop an automated and general technique to detect memory consumption bugs.

Grey-box fuzzing is one of the most effective techniques to find vulnerabilities [39, 41], which typically adopts the coverage information as guidance to explore different program paths. However, existing grey-box fuzzing techniques are not designed for detecting memory consumption bugs, because such bugs often depend not only on the program path but also on some interesting program states in that path (i.e., amount of memory consumption). For example, the real-world program in Figure 2 allocates memory at Line 4; however, this memory allocation may fail if no additional memory can be allocated for use. To detect this bug, the grey-box fuzzer needs to execute a program path that touches Line 4, and also to produce a value for the variable size that is large enough to exceed the available heap memory. Existing coverage-based fuzzing techniques can easily cover Line 4, but it may be difficult for them to produce test cases that have a large value for the variable size.

To address the aforementioned challenges, we present MemLock to enhance grey-box fuzzing to find memory consumption bugs. MemLock works in two steps. Firstly, MemLock performs static analysis, which identifies the statements and operations relevant to memory consumption. We qualitatively analyze the call graph, which determines the stack memory usage, and quantitatively analyze memory usage operations, which determine the heap memory usage. Besides, we also analyze the control flow graph of the program, which provides branch coverage for guiding the exploration of different program paths. With the memory consumption analyzed, MemLock then employs branch coverage as well as memory consumption information to guide the fuzzing process. The branch coverage information guides the fuzzer to explore different program paths, and the memory consumption information guides it toward program paths that consume more and more memory. If an input covers a new branch compared to previous inputs, it is considered interesting and added into the seed queue. Besides, even if an input has no new branch coverage, if it leads to more memory consumption, we also retain it as an interesting input through a novel seed updating scheme. This input can be further mutated so that the newly generated input leads to more memory consumption. After some mutations, MemLock is expected to generate an input whose memory consumption exceeds the available memory.

We have evaluated MemLock’s effectiveness using a set of real-world open source programs. The experimental results show that MemLock substantially outperforms six state-of-the-art tools (i.e., AFL [84], AFLfast [8], PerfFuzz [37], FairFuzz [38], Angora [12] and QSYM [83]) in discovering memory consumption vulnerabilities. MemLock finds 40.5% more unique crashes and 17.9% more vulnerabilities than the second best counterpart. In particular, MemLock can discover a certain memory consumption vulnerability at least 2.07 times faster than the other baseline fuzzers. Besides, the test cases generated by MemLock usually lead to 150 times the memory consumption of those generated by the other state-of-the-art tools. In addition, we have responsibly disclosed several previously unknown memory consumption bugs, and received 15 new CVEs¹ for them, demonstrating MemLock’s effectiveness in practice.

¹ The Common Vulnerabilities and Exposures (CVE) system provides a reference for tracking publicly known information-security vulnerabilities and exposures.

In summary, this paper makes the following contributions:
• We present MemLock, the first, to the best of our knowledge, dedicated fuzzing technique to automatically discover memory consumption bugs without requiring any domain knowledge.
• We design a new dimension of guidance engine to deeply exploit the memory consumption in a program path, which is complementary to the coverage guidance.
• We have implemented and evaluated MemLock on various widely-used real-world programs. The experimental results have shown that MemLock substantially outperforms six state-of-the-art fuzzing techniques in discovering memory consumption bugs.

• We have discovered 15 security-critical memory consumption vulnerabilities in widely-used real-world programs, and most of these vulnerabilities have been patched by the developers.

2 OVERVIEW

2.1 Motivating Examples
We first illustrate the limitations of existing coverage-based grey-box fuzzing techniques for detecting memory consumption bugs with two examples summarized from real-world vulnerabilities. We use the vulnerability CVE-2018-17985 [18] in Figure 1 to demonstrate an uncontrolled-recursion bug and CVE-2018-4868 [19] in Figure 2 to demonstrate an uncontrolled-memory-allocation bug.

In Figure 1, the function cplus_demangle_type recursively calls itself in line 12 when the input contains the character ‘P’. The depth of recursion depends on the number of ‘P’ characters in the input. With a sufficiently large recursive depth, the execution would run out of stack memory, causing stack overflow. To trigger a stack overflow, the fuzzer would need to generate inputs containing a large number of ‘P’ characters.

However, existing coverage-based grey-box fuzzers do not have enough awareness about the change in recursive depth and solely use coverage information to retain interesting inputs. Take AFL as an example: it is aware of repeatedly executed CFG edges [71] but only in a coarse manner. To be specific, AFL adopts the concept of “loop bucket” to retain interesting inputs (see Section 3.1). The loop bucket cannot tell the fine-grained change in recursive depth. Specifically, it does not differentiate the change when the recursive depth is greater than 255. Nevertheless, this number is still very far from causing stack exhaustion, which normally requires a recursive depth in the tens of thousands.

Therefore, to expose uncontrolled-recursion effectively, grey-box fuzzers need to have precise awareness about the stack memory consumption of the target program when executing an input.

Figure 2 demonstrates an uncontrolled-memory-allocation problem in exiv2. At lines 11-12, when the program parses a subBox in readMetadata(), a length is extracted from the user inputs. Then the length is fed directly into DataBuf() at line 13. Finally, this value is used as the size of a memory allocation request at line 4. Note that the program does not check the size before allocating memory. By carefully handcrafting the input, an adversary can provide arbitrarily large values for subBox.length, leading to program crash (i.e., std::bad_alloc) or running out of memory. To trigger this problem, the fuzzer would need to generate inputs with a large subBox.length. For this purpose, the fuzzer needs to collect information about the value of subBox.length to retain the interesting inputs that can incur a large memory consumption.

However, existing coverage-based grey-box fuzzers lack awareness about the value of subBox.length. Therefore, they cannot effectively generate inputs causing subBox.length to become larger. Taking AFL as an example, let us assume AFL now holds a seed input a, which yields a subBox.length of 100 and causes the function to enter the while loop at line 11 and eventually return at line 16. After some mutations, AFL may generate another input b, which yields a subBox.length of 10000 and also causes the function to enter the while loop at line 11 and return at line 16. We can clearly see that, compared with a, b consumes much more memory and is closer to running out of memory. However, AFL will discard input b and will not retain it as a seed because b does not bring new branch coverage. Consequently, AFL cannot detect this uncontrolled-memory-allocation problem effectively.

Therefore, to expose uncontrolled-memory-allocation effectively, grey-box fuzzers also need to have precise awareness about the amount of consumed heap memory of the target program when executing an input.

2.2 Approach Overview
Figure 3 shows the workflow of MemLock, which contains two main components: static analysis and fuzzing loop. In particular, the static analysis takes the program source code as the input, and generates three kinds of information (see Section 3.1): control flow graph, call graph, and memory usage operations. The static analysis in MemLock helps to decide where to instrument and what to instrument. The control flow graph information is used to collect the branch coverage; the call graph information helps to instrument the function call entries and returns. Based on the memory usage operation statements, MemLock instruments the locations of memory allocation and free operations.

[Figure 3 diagram. Static analysis: source code → static analysis (control flow graph, call graph, memory usage operations) → instrumentation → instrumented program. Fuzzing loop: initial seeds → seed pool → seed selector → selected seed → seed mutator → test inputs → executor → feedback collector (branch coverage, memory consumption) → seed updater; the loop outputs proofs of crashes.]
Figure 3: The overview of the proposed approach; grey rectangles denote the new features of MemLock.

Once the program is instrumented, MemLock enters the continuous fuzzing loop to detect memory consumption bugs (see Section 3.2). Given the initial seeds, MemLock selects a seed s from the seed pool. For the seed s, MemLock generates new inputs (test cases) using different mutation strategies. MemLock then runs the generated inputs against the instrumented program, and collects their memory consumption information (see Section 3.2.1) and branch coverage information. If the generated inputs consume more memory or have new branch coverage, they are retained as interesting seeds. MemLock adds them into the seed pool through a seed updating scheme (see Section 3.2.2). MemLock repeats this process until reaching time or resource budget limits.

Example in Figure 1. We illustrate MemLock using the example in Figure 1. Suppose the initial value of peek (obtained from function parameter di by function d_peek_char at Line 5) is ‘a’. This value is general, unbiased for any special case. Through the coverage guidance, MemLock generates a new input i_1 that may produce the

value ‘P’ for peek as it covers a different branch. When i_1 is further mutated, it generates i_2, which may produce four consecutive ‘P’s for peek (i.e., “PPPP”) in its recursion. Since i_2 has different branch hits in the sense of “loop bucket” from i_1, it is added into the seed pool. When i_2 is selected for mutation, it generates i_3 that may produce five consecutive ‘P’s for peek (i.e., “PPPPP”) in its recursion. The coverage guidance uses the concept of “loop bucket”, and considers that i_3 does not offer new branch coverage compared to i_1 and i_2. In this case, existing coverage-based grey-box fuzzers would discard i_3, and thus miss the chance to generate an input that can produce more consecutive ‘P’s. On the other hand, MemLock introduces memory consumption as the guidance, under which i_3 is considered to cause more memory consumption (than i_1 or i_2). Thus, it retains i_3 as an interesting test case, and adds it into the seed pool. It can further mutate i_3, and generate inputs that may produce more consecutive ‘P’s. After some mutations, MemLock may generate an input that would produce a sufficiently large number of consecutive ‘P’s (i.e., “PPP...”) to exhaust the stack memory.

Example in Figure 2. For illustration, let us assume that the available heap memory is 10000 bytes. Suppose the initial value of subBox.length is 100, which is produced from user input at Lines 11-12. At Line 13 in Figure 2, the memory is allocated successfully, and the program executes the true branch of the while statement at Line 11. Based on the coverage guidance, MemLock performs the mutation and can generate a new input i_1 that produces a larger value for subBox.length. In this case, we assume the value is 150. The input i_1 still executes the true branch of the while statement, and thus there is no new branch coverage. At this point, the coverage-based grey-box fuzzers would discard i_1, therefore missing the chance to generate an input consuming more memory. On the other hand, MemLock’s memory consumption guidance considers that i_1 consumes more memory (i.e., 150 > 100), and keeps it as an interesting input. When i_1 is further mutated, MemLock can generate an input (e.g., len = 250) that consumes more memory. After some mutations, MemLock can generate an input (e.g., len = 11000) that runs out of memory.

Note that we have not elaborated memory leaks separately as MemLock deals with them in the same way as uncontrolled-memory-allocation, using the same memory usage guidance during fuzzing.

3 METHODOLOGY

3.1 Static Analysis
The static analysis in MemLock decides how to instrument the target program. Based on the instrumentation, MemLock collects the guidance information, and then uses it to drive the fuzzing process. After analyzing the control flow graph, MemLock instruments the target program to capture branch (edge) coverage, guiding program path explorations. Additionally, based on the qualitative and quantitative analysis of call graph and memory usage operations, it also instruments the target program to collect the memory consumption information, guiding the fuzzing process towards consuming more memory for each program path. To facilitate the description of our methodology, we define the following concepts.

3.1.1 Control Flow Graph. MemLock collects branch coverage information in the control flow graph (CFG) of the program to guide program path explorations as AFL [84] does. It inserts instrumentation into every branch of the program CFG, assigning a pseudo-unique ID to every branch. During program execution, the instrumentation uses an 8-bit counter to keep track of the number of times that a branch has been executed. MemLock groups the hit counts of each branch execution into several buckets to denote different magnitudes². Consequently, the branch coverage information in an executed program path can be defined as follows.

² In AFL, the hit counts of each branch execution are divided into 8 buckets: 1 time, 2 times, 3 times, 4-7 times, 8-15 times, 16-31 times, 32-127 times, and 128-255 times [78].

Definition 3.1 (Trace Bits [84]). For an executed program path, its trace bits are represented by an 8-bit array with size 2^K, and the value of the ID-th element is stored in an 8-bit counter (in AFL, K = 16).

The trace bits record the accumulated branches executed in a program path, and they can represent a program path roughly.

Definition 3.2 (Path-ID). For an executed program path, its path-ID is the hash value of its trace bits (see Definition 3.1).

3.1.2 Call Graph. In addition to branch coverage, MemLock also collects the memory consumption information. One important construct that may cause a large bulk of stack memory consumption is the recursive function call. When a function call occurs, the program automatically allocates the stack memory for use (e.g., local variables). On the other hand, when a function call is finished (returned), the program automatically reclaims the allocated stack memory for reuse. To monitor the stack memory consumption of function calls, MemLock injects the instrumentation into both the entry and the exit of the function call.

We use ft to denote the length (i.e., consumption) of the call stack during the program execution. This value changes with the execution of the program. When the program execution enters a function, the value ft is increased by one; likewise, when a function call returns, the value ft is decreased by one. In the following, we use fm to denote the peak value of ft during the program execution. The value fm thus qualitatively reflects the maximum (stack) memory consumption by recursive function calls during the program execution. We do not differentiate the memory consumption caused by different functions, because usually the stack memory can be exhausted only under infinite recursive function calls. Thus, we only need the peak length of the call stack to guide MemLock to approach infinite recursive function calls.

3.1.3 Memory Usage Operations. Memory usage operation statements (e.g., malloc and free) may also contribute to the consumption of a large bulk of memory. In a program path, the memory operation statements may be affected by the program inputs. When this happens, it is possible to guide this program path to consume more memory by controlling the program inputs. To this end, MemLock uses instrumentation to quantitatively obtain the size of the memory operation. Because deallocation statements do not carry the size of the freed memory, MemLock maps them to their corresponding allocation statements to obtain the size of the freed memory.

In particular, we insert instrumentation into the memory allocation/deallocation functions in the standard libraries, and obtain

its parameters and return value. The reason is that the memory is allocated by some standard library functions [1, 46], e.g., malloc, calloc, realloc, and new. On the other hand, the program may also free the memory using standard library functions such as free and delete. Even when the program uses a user-customized memory usage operation function [33], it still relies on standard library functions to operate a larger bulk of memory. Thus, we do not need to consider the user-customized memory usage operations in practice.

We use ot to denote the amount of memory consumed by memory operations in a program path. When the program allocates ot′ bytes of memory, the value ot is increased by ot′; likewise, if it frees ot′ bytes of memory, the value ot is decreased by ot′. In the following, we use om to represent the peak value of ot during the program execution. The value om evaluates the memory consumption in a program path by memory usage operation statements. By using om as the guidance, MemLock can mutate the program inputs and gradually increase the peak value of memory consumption in a program path.

3.2 Fuzzing Loop
Algorithm 1 shows the high-level procedures of MemLock. The intuition of the algorithm is that, for each input t in the seed pool, MemLock decides whether to mutate it based on a selection probability. If so, MemLock mutates t and generates a set of child inputs. Then, MemLock runs each child input and monitors their executions. If a child input has new coverage or consumes more memory (see Definitions 3.3 and 3.4), it is retained as an interesting input. While this process is similar to the process of traditional coverage-based grey-box fuzzers (e.g., AFL), the main difference is that MemLock additionally adopts memory consumption guidance to retain interesting inputs.

Algorithm 1: Memory Usage Guided Fuzzing
input : an instrumented program P, and set of initial seeds T
output: test cases S triggering memory consumption bugs
 1  S ← ∅;
 2  Queue ← T;
 3  while time and resource budget do not expire do
 4    for each input t in Queue do
 5      if with probability FuzzProb_t to select t then
 6        numChildren ← AssignEnergy(t);
 7        for 0 ≤ i < numChildren do
 8          child_i ← Mutate(t);
 9          (traceBits_i, fm_i, om_i) ← Run(child_i, P);
10          k = Hash(traceBits_i);
11          if it triggers memory consumption bugs then
12            S ← S ∪ child_i;
13          else
14            if NewCov(traceBits_i) then
15              Queue ← Queue ∪ child_i;
16            if NewMax(fm_i, om_i) then
17              Queue ← Update(child_i, fmMap[k], omMap[k]);
18  return S

The algorithm takes the instrumented program P (see Section 3.1) and a set of initial seeds T as the inputs, and outputs a set of test cases S that trigger the memory consumption bugs. The variable Queue represents the seed pool, and is initialized as the initial seeds T at Line 2. MemLock first selects an input t from the seed pool Queue (Line 4), and computes its probability on whether or not to be mutated at Line 5 (see Section 3.2.1). Upon deciding to mutate the input t, MemLock assigns the energy (i.e., numChildren) to it at Line 6, which determines the number of children to produce from t. MemLock uses the same heuristics to determine numChildren as AFL [84]. It produces more children for inputs that have wider code coverage or that are discovered later in the fuzzing process. At Lines 4-17, MemLock mutates the input t to generate numChildren children, monitors their executions, and determines their affiliations. MemLock first performs mutation to generate the new input child_i (Line 8). At Line 9, MemLock then runs the input child_i on the instrumented program P, and collects its branch coverage (i.e., traceBits_i), function memory consumption (i.e., fm), and operation memory consumption (i.e., om), respectively.

If the input child_i triggers memory consumption bugs (how to determine memory consumption bugs, see Section 4.1), it is added into the output S (Line 12). Otherwise, MemLock analyzes its branch coverage and memory consumption (Lines 14 and 16). If it has new branch coverage, it is added into the Queue for further mutation (Line 15). In addition, we further analyze its memory consumption. MemLock checks whether child_i leads to more memory consumption based on fmmap[k] and ommap[k] at Line 16 (see Section 3.2.1). If so, MemLock updates the value of fmmap[k] and ommap[k] using the function Update at Line 17 (see Section 3.2.2). This process is repeated until the given time or resource budget expires (Line 3).

3.2.1 Guidance Mechanisms. One of the most important components in grey-box fuzzing is its guidance mechanism (Lines 14 and 16 in Algorithm 1), which often dominates the capability of the fuzzing technique in finding bugs [11, 37]. For example, SlowFuzz [58] uses the number of executed instructions as guidance to stress algorithmic complexity vulnerabilities. To find memory consumption bugs effectively, MemLock uses branch coverage as well as memory consumption as the guidance. The branch coverage information guides MemLock to explore different program paths, while the memory consumption information can drive MemLock to focus on program paths with more memory consumption. To facilitate the description of our memory consumption guidance, we define the following concepts.

Definition 3.3 (Maximum Function Memory). Given a path k and a set I of inputs that all execute k, the maximum function memory consumption fmmap[k] in k is the maximum peak value of call stack, among all the inputs I:

$$\mathit{fmmap}[k] \leftarrow \max_{i \in I} \mathit{fm}_i$$

where fm_i represents the peak value of call stack during the execution of input i (see Section 3.1.2).
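The per-path maxima of Definition 3.3, and the way Lines 16-17 of Algorithm 1 consult and refresh them, can be sketched as follows. This is a minimal sketch under stated assumptions, not MemLock's implementation: the names `Feedback`, `PeakMaps`, and `update_if_new_max` are hypothetical, the path-ID is modeled as a plain 64-bit integer, and a path that has not been seen before is assumed to start with a recorded maximum of 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Simplified feedback from one execution (Line 9 of Algorithm 1).
// All names here are illustrative assumptions, not MemLock's API.
struct Feedback {
    std::uint64_t path_id;  // k = Hash(traceBits)
    std::size_t fm;         // peak call-stack length of this run
    std::size_t om;         // peak bytes held by memory operations in this run
};

class PeakMaps {
public:
    // Returns true if the run exceeds the recorded maximum of its path in
    // either dimension, and records the new maximum (Lines 16-17).
    // Assumption: an unseen path starts with a recorded maximum of 0.
    bool update_if_new_max(const Feedback& fb) {
        std::size_t& best_fm = fm_map_[fb.path_id];
        std::size_t& best_om = om_map_[fb.path_id];
        bool improved = false;
        if (fb.fm > best_fm) { best_fm = fb.fm; improved = true; }
        if (fb.om > best_om) { best_om = fb.om; improved = true; }
        return improved;
    }

    // Recorded maxima for a path; 0 if the path has never been recorded.
    std::size_t fm_max(std::uint64_t path_id) const { return lookup(fm_map_, path_id); }
    std::size_t om_max(std::uint64_t path_id) const { return lookup(om_map_, path_id); }

private:
    using Map = std::unordered_map<std::uint64_t, std::size_t>;

    static std::size_t lookup(const Map& map, std::uint64_t path_id) {
        auto it = map.find(path_id);
        return it == map.end() ? 0 : it->second;
    }

    Map fm_map_;  // path-ID -> maximum fm seen on that path
    Map om_map_;  // path-ID -> maximum om seen on that path
};
```

With the Figure 2 example in mind, a first run on path 1 with om = 100 is recorded, a second run on the same path with om = 150 returns true and raises the stored maximum to 150, and a later run with om = 120 returns false. Keying the maxima by path-ID keeps the comparison local to inputs that execute the same path, which is what the definition asks for.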

Figure 4: Dynamic Seed Updating
(Figure columns: Path 1, Path 2, Path 3, Path 4. Rows: Original Seed Queue with Seed 1, Seed 2, Seed 3; New Path with Seed 1, Seed 2, Seed 3, Seed 4; Larger Memory Consumption with Seed 1, Seed 2, Seed 3, Seed 4, plus Seed 5.)

Definition 3.4 (Maximum Operation Memory). Given a path k and a set I of inputs that all execute k, the maximum operation memory consumption ommap[k] in k is the maximum peak value of memory consumption by memory usage operations, among all the inputs I:

\[ \mathit{ommap}[k] \leftarrow \max_{i \in I} \mathit{om}_i \]

where om_i denotes the peak value of memory consumed by memory usage operations during the execution of input i (see Section 3.1.3).

Definition 3.5 (NewCov). Given a set I of inputs and an input t, we say t hits a new coverage, if it either (1) executes a branch that has not been touched by I; or (2) hits a branch touched by I but with a different bucket number.

The function NewCov (Line 14) checks whether a newly generated input child_i hits a new coverage with respect to the current Queue. That is, the function NewCov considers the branch coverage and guides MemLock to explore different program paths.

Definition 3.6 (NewMax). Given a set I of inputs and an input t that all execute k, we say t hits a new maximum memory consumption, if either fm_t > fmmap[k] or om_t > ommap[k].

The function NewMax (Line 16) determines whether the input child_i leads to the maximum memory consumption among the current seed set. It checks two kinds of memory consumption. It first determines whether child_i leads to the maximum function memory consumption (see Definition 3.3). It also considers whether child_i leads to the maximum operation memory consumption (see Definition 3.4). If the input child_i satisfies either of the above two cases, MemLock updates the seed queue with child_i at Line 17 (see Section 3.2.2).

3.2.2 Dynamic Seed Updating. In order to efficiently support retaining the most interesting input for each path, we propose a novel seed updating scheme. In MemLock, the seed queue is kept in a linked list, where each node represents a seed that explores a program path, as shown in Fig. 4. MemLock updates the seed queue in the following two cases. (1) New Path. If the test input results in new branch coverage, then it is added to the seed queue as a new node, as shown in the second row of Fig. 4. (2) Larger Memory Consumption. Suppose an input, e.g., seed2 in the third row of Fig. 4, generates an input seed5 that does not result in new branch coverage but leads to larger memory consumption than the corresponding input. When seed2 and seed5 execute the same path, seed2 is replaced with seed5. By replacing the original seed with the generated input child_i, we fully exploit the advantage of child_i, as it is better in terms of finding memory consumption bugs. This seed updating policy ensures that MemLock gradually increases the overall memory consumption; it could avoid getting stuck in local maxima like SlowFuzz [37], and brings long-term stable improvements.

To tailor for our guidance mechanism, MemLock also optimizes the seed selection probability (Line 5 in Algorithm 1) for the mutation as follows.

Definition 3.7 (Favored Input). An input t is favored for mutation, if t has new branch coverage (i.e., NewCov) or t leads to maximum memory consumption (i.e., NewMax).

Definition 3.8 (Selection Probability). An input t is selected for mutation with the following probability:

\[ \mathit{FuzzProb}_t = \begin{cases} 1 & \text{if } t \text{ is favored} \\ a & \text{otherwise} \end{cases} \]

That is, the favored inputs are always selected, and a is the probability of selecting a non-favored input. In our experiments we use a = 0.01 like PerfFuzz [37].

4 EVALUATION

We have built a prototype of MemLock. Our implementation adds around 1.6k lines of C/C++ code to the file containing AFL’s core implementation. In particular, the static analysis and instrumentation components are implemented based on the LLVM framework [36], and the fuzzer engine is implemented based on the AFL-2.52b framework [84]. We have conducted thorough experiments to evaluate MemLock with a set of real-world programs. More detailed experimental results can be found on our website [48]. With these experiments, we aim to answer the following research questions:

RQ1. How capable is MemLock in memory consumption crash detection?
RQ2. How capable is MemLock in memory consumption real-world vulnerability detection?
RQ3. Do the strategies of MemLock help to trigger memory leaks with more leakage?
RQ4. Do the strategies of MemLock help to generate inputs with more memory consumption?

4.1 Experiment Setup

Following the suggestions in [35], we conducted the experiments carefully, to draw conclusions as objective as possible.

Baseline Fuzzers to Compare against. We compare MemLock against six state-of-the-art fuzzers, namely AFL [84], AFLfast [8], PerfFuzz [37], FairFuzz [38], Angora [12] and QSYM [83]. The baseline fuzzers are selected based on the following considerations. AFL is the widely-used coverage-based greybox fuzzer, and is selected as a baseline fuzzer in most work. AFLfast is an advanced variant of AFL, specially equipped with a better power schedule [8]. PerfFuzz [37] stresses the time complexity issues in the program, while MemLock seeks to detect space complexity issues. FairFuzz [38] leverages a targeted mutation strategy to execute towards rare branches. Further, Angora [12] utilizes taint analysis to track information flow, and then uses gradient descent to
break through the hard branches. Lastly, QSYM [83] is a popular symbolic execution assisted fuzzer. Note that we have not selected MemFuzz [16] as a baseline fuzzer, because MemFuzz is not open source and it resorts to memory accesses (instead of memory consumption). In a word, we selected various kinds of representative state-of-the-art fuzzers as baseline fuzzers, and they are widely used to discover vulnerabilities in practice.

Evaluation Benchmarks. We select evaluation benchmarks considering several factors, e.g., popularity, frequency of being tested, development activeness, and functional diversity. Finally, we use 14 widely-used real-world programs, which all contain memory consumption bugs, to evaluate MemLock, including well-known development tools (e.g., nm, cxxfilt, readelf), code processing tools (e.g., nasm, flex, yaml-cpp, mjs), graphics processing libraries (e.g., openjpeg, jasper, exiv2), video processing tools (e.g., bento4 and libming), and data processing libraries (e.g., libsass and yara), etc. These programs have also been widely tested by existing state-of-the-art greybox fuzzers [28, 35, 38, 82].

Performance Metrics. To compare against state-of-the-art fuzzers, the most direct measurement is the capability to find the vulnerabilities. In this regard, we consider both unique bugs and unique crashes each fuzzer finds in the fuzzing process. Since MemLock aims to stress the space complexity issues of programs, we also measure the memory consumption of each seed in the pool.

Configuration Parameters. Since the fuzzers heavily rely on random mutation, there could be performance jitter during the fuzzing process. We took two actions to mitigate the randomness caused by the nature of fuzzing techniques. First, we test each program for a longer time, until the fuzzer reaches a relatively stable state. We run each fuzzer for 24 hours. Second, we perform each experiment 5 times, and evaluate their statistical performance. Besides, we run all the fuzzers with the -d option to skip the deterministic mutation stage, following the configuration of PerfFuzz [37].

Memory Consumption Bugs. The uncontrolled-recursion bug usually causes stack-overflow, thus we can directly use AddressSanitizer [62] to detect it. The uncontrolled-memory-allocation bug consumes a large amount of memory so that the program runs out of memory. Thus, we can detect it by setting the “allocator_may_return_null” [29] flag of AddressSanitizer. In addition, we use LeakSanitizer [60] to detect memory leakage.

Experiment Infrastructure. All our experiments have been performed on machines with an Intel (R) Xeon (R) E5-1650 v3 Processor (3.40GHz) and 16GB of RAM under 64-bit Ubuntu LTS 16.04.

4.2 Unique Crashes Evaluation (RQ1)

To evaluate the effectiveness of fuzzers, a direct measurement is the number of unique crashes found by different fuzzers. It is believed that more unique crashes usually indicate higher chances of covering more unique vulnerabilities.

Table 1 shows the number of unique crashes caused by memory consumption vulnerabilities that were found by 7 different fuzzers within 24 hours in the benchmark programs. It is worth noting that we identify unique crashes related to memory consumption bugs by reproducing the crashes and analyzing their crash stacks. We discuss other types of crashes in Section 4.6. Out of the 17 groups of experiments, MemLock performs best in 10 (58.8%) groups of experiments among 7 different fuzzers, as shown in column MemLock. In total, MemLock finds 2009 unique memory consumption crashes in the benchmark programs, improving by 59.2%, 70.5%, 76.9%, 98.1%, 40.5% and 66.7% respectively, compared to state-of-the-art fuzzers AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM. In particular, MemLock is able to find unique crashes in all benchmark programs, while the other 6 state-of-the-art fuzzers may find no crashes in some benchmark programs. For example, none of the other 6 state-of-the-art fuzzers could find any unique crashes in the program flex, but MemLock was able to find 61 unique crashes within 24 hours. To better compare different fuzzers, we also use plots to depict the performance over time in some benchmark programs, as shown in Figure 5. It shows that MemLock has a steady and strong growth trend in finding unique crashes, and MemLock is also the first fuzzer that reported crashes.

Following the recommendation of Klees et al. [35], we also conduct a statistical test for the results. The Â12 [68] statistic measures the probability that one fuzzer (in this case MemLock) outperforms another fuzzer. The value of Â12 means by what chance the result of MemLock is better than the competitor, as shown in columns with the heading Â12. Further, we apply the Mann-Whitney U-test [2] with a significance level of 0.05 to check the statistical significance of differences in the experimental results. A smaller p-value indicates a more significant difference between MemLock and the competitor. In Table 1, we mark the corresponding Â12 values in bold for those with a p-value smaller than the significance level (0.05) (for simplicity, we do not include p-values here but they are available at the companion website [48]). Out of 102 Â12 values in the table, 72 (70.6%) Â12 values exceed the conventionally large effect size (0.71) and are marked in bold. Thus, we can conclude that MemLock significantly outperforms the other 6 state-of-the-art fuzzers in most benchmark programs.

From the analysis of Table 1 and Figure 5, we can positively answer RQ1 that MemLock significantly outperforms the state-of-the-art fuzzers in terms of memory consumption crash detection.

4.3 Real-world Vulnerability Evaluation (RQ2)

In this section, we compare the capability of MemLock to find real-world known vulnerabilities against baseline fuzzers, as suggested by Klees et al. [35].

Table 2 shows the statistical results for MemLock as well as the other 6 state-of-the-art fuzzers. The benchmark programs contain 34 unique vulnerabilities in total, out of which MemLock performs best on 25 vulnerabilities among the other 6 state-of-the-art fuzzers, as shown in column MemLock. MemLock takes about 5.4 hours on average to find each unique vulnerability, which is 2.15, 2.15, 2.20, 2.69, 3.76, 2.07 times faster than the state-of-the-art fuzzers AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM respectively. In particular, MemLock finds 33 out of 34 unique vulnerabilities within 24 hours, while other fuzzers AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM only find 26, 28, 20, 17, 6 and 25, respectively. The three unique vulnerabilities (i.e., issue#106, CVE-2018-18701 and CVE-2019-6293) in mjs, nm and flex can be found only by
Table 1: Unique Crashes Evaluation

Program | Version | SLoC | Type | MemLock #Crashes | AFL #Crashes | AFL Â12 | AFLfast #Crashes | AFLfast Â12 | PerfFuzz #Crashes | PerfFuzz Â12 | FairFuzz #Crashes | FairFuzz Â12 | Angora #Crashes | Angora Â12 | QSYM #Crashes | QSYM Â12
mjs [53] | 1.20.1 | 40k | UR | 114 | 36 | 1.00 | 31 | 1.00 | 88 | 0.96 | 12 | 1.00 | 0 | 1.00 | 30 | 1.00
cxxfilt [5] | 2.31 | 1,757k | UR | 448 | 373 | 1.00 | 304 | 1.00 | 401 | 0.88 | 39 | 1.00 | 0 | 1.00 | 327 | 1.00
nm [5] | 2.31 | 1,757k | UR | 127 | 12 | 1.00 | 21 | 1.00 | 17 | 1.00 | 0 | 1.00 | 0 | 1.00 | 20 | 1.00
nasm [54] | 2.14.03 | 105k | UR | 132 | 6 | 1.00 | 4 | 1.00 | 40 | 1.00 | 0 | 1.00 | 0 | 1.00 | 4 | 1.00
flex [27] | 2.6.4 | 27k | UR | 61 | 0 | 1.00 | 0 | 1.00 | 0 | 1.00 | 0 | 1.00 | 0 | 1.00 | 0 | 1.00
yaml-cpp [80] | 0.6.2 | 58k | UR | 4 | 0 | 1.00 | 1 | 1.00 | 3 | 0.56 | 0 | 1.00 | 0 | 1.00 | 0 | 1.00
libsass [43] | 3.5.4 | 27k | UR | 23 | 6 | 1.00 | 4 | 1.00 | 23 | 0.53 | 11 | 0.88 | 26 | 0.25 | 7 | 1.00
yara [81] | 3.5.0 | 45k | UR | 156 | 34 | 1.00 | 33 | 1.00 | 65 | 0.94 | 13 | 1.00 | 0 | 1.00 | 31 | 1.00
readelf [5] | 2.28 | 1,844k | UA | 273 | 104 | 1.00 | 110 | 1.00 | 54 | 1.00 | 181 | 0.88 | 0 | 1.00 | 114 | 1.00
exiv2 [25] | 0.26 | 84k | UA | 10 | 11 | 0.14 | 11 | 0.20 | 6 | 0.90 | 15 | 0.00 | 13 | 0.16 | 8 | 0.52
openjpeg [55] | 2.3.0 | 243k | UA | 16 | 8 | 0.80 | 5 | 1.00 | 0 | 1.00 | 7 | 0.46 | 0 | 1.00 | 5 | 0.80
bento4 [4] | 1.5.1 | 78k | UA | 5 | 2 | 1.00 | 2 | 0.98 | 2 | 1.00 | 1 | 1.00 | 189 | 0.00 | 1 | 1.00
bento4 [4] | 1.5.1 | 78k | ML | 145 | 78 | 1.00 | 72 | 1.00 | 61 | 1.00 | 125 | 1.00 | 290 | 0.00 | 74 | 1.00
libming [42] | 0.4.8 | 92k | UA | 18 | 20 | 0.40 | 18 | 0.60 | 17 | 0.62 | 20 | 0.20 | 3 | 1.00 | 16 | 0.80
libming [42] | 0.4.8 | 92k | ML | 264 | 336 | 0.20 | 324 | 0.00 | 324 | 0.00 | 371 | 0.00 | 87 | 1.00 | 354 | 0.00
jasper [32] | 2.0.14 | 44k | UA | 3 | 2 | 0.84 | 3 | 0.56 | 0 | 1.00 | 3 | 0.56 | 2 | 1.00 | 2 | 0.92
jasper [32] | 2.0.14 | 44k | ML | 210 | 234 | 0.08 | 235 | 0.08 | 35 | 1.00 | 216 | 0.40 | 820 | 0.00 | 212 | 0.46

Total Unique Crashes (Improvement): MemLock 2009; AFL 1262 (+59.2%); AFLfast 1178 (+70.5%); PerfFuzz 1136 (+76.9%); FairFuzz 1014 (+98.1%); Angora 1430 (+40.5%); QSYM 1205 (+66.7%)

* UR means the uncontrolled-recursion bug, UA means the uncontrolled-memory-allocation bug, and ML means the memory leak. We highlight the Â12 values in bold if the corresponding Mann-Whitney U test is significant.
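The Â12 columns can be computed from per-run results with the standard Vargha-Delaney estimator. The sketch below is a minimal pure-Python version; the function name `a12` and the sample crash counts are hypothetical and are not taken from the experiments.

```python
def a12(treatment, control):
    """Vargha-Delaney effect size: P(X > Y) + 0.5 * P(X == Y).

    `treatment` and `control` hold one measurement per repetition
    (for example, unique crashes found in each of five runs).
    """
    if not treatment or not control:
        raise ValueError("both samples must be non-empty")
    wins = sum(1 for x in treatment for y in control if x > y)
    ties = sum(1 for x in treatment for y in control if x == y)
    return (wins + 0.5 * ties) / (len(treatment) * len(control))


# Hypothetical per-run crash counts for two fuzzers over five repetitions.
fuzzer_a = [61, 58, 64, 60, 59]
fuzzer_b = [12, 0, 7, 3, 9]
print(a12(fuzzer_a, fuzzer_b))  # 1.0: every run of A beats every run of B
print(a12(fuzzer_b, fuzzer_b))  # 0.5: identical samples are indistinguishable
```

A value of 0.5 means neither sample tends to be larger, and values above 0.71 correspond to what the text calls a conventionally large effect. Statistical significance would be checked separately with a Mann-Whitney U test, which this sketch does not implement.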

Figure 5: The growth trend of unique crashes found in different fuzzers; higher is better
(Figure panels: nasm, nm, readelf, openjpeg. x-axis: time (hour); y-axis: Number of Unique Crashes. Series: MemLock, AFL, AFLfast, PerfFuzz, FairFuzz, Angora, QSYM.)

MemLock within 24 hours. Therefore, our memory-consumption guided strategy is shown to be very effective in finding memory consumption bugs.

In addition, we also conduct a statistical test for the unique vulnerability evaluation. Out of 204 Â12 values in the table, 139 (68.1%) Â12 values are bold and exceed the conventionally large effect size (0.71). Thus, MemLock significantly outperforms the other 6 state-of-the-art fuzzers in finding unique vulnerabilities.

Case Study. To demonstrate the reason behind MemLock’s superiority, we present the case of CVE-2019-6293. It is an uncontrolled-recursion vulnerability in flex, which is a lexical analyzer generator. The lexical analyzer generated by flex has to provide “beginning” and “ending” states. The mark_beginning_as_normal function marks each “beginning” state in a machine as being a “normal” state, and the “beginning” states are the epsilon closure of the first state. The mark_beginning_as_normal function calls itself if there is a state reachable from the first state through epsilon. We investigate MemLock’s mutation history and identify a key mutation step. The test case triggers the mark_beginning_as_normal function calling itself multiple times, through the havoc mutation operation. Then, the recursive depth of this function is multiplied by the splice operation, finally leading to stack-overflow. More interestingly, MemLock takes only 5.4 hours on average to discover this vulnerability, while other fuzzers all fail. We can also see the peak length of call stack of flex in Figure 6. AFL does not retain any seed whose peak call stack length exceeds 5000, as those inputs do not increase coverage. Compared to AFL, MemLock intentionally keeps seeds that increase the peak length of call stack, finally triggering stack-overflow. This explains why MemLock can find the vulnerability, while AFL cannot detect it in all 5 runs.

New Vulnerabilities MemLock Found. With MemLock, we have discovered many previously unknown security-critical vulnerabilities. These vulnerabilities were not previously reported. We informed the maintainers, and MITRE assigned 15 CVEs. Among these 15 CVEs, 8 CVEs are uncontrolled-recursion vulnerabilities, 5 are vulnerabilities due to uncontrolled-memory-allocation issues, and 2 are about memory leak vulnerabilities. An attacker might leverage these vulnerabilities to launch an attack, by providing well-conceived inputs that trigger excessive memory consumption. The developers actively patched the vulnerabilities with our reports. At the time of writing, 12 of these vulnerabilities have been patched. Detailed information on our newly discovered vulnerabilities is available on our website [48]. We are confident that MemLock is effective and viable in practice.
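To illustrate why the recursion depth in the case study above grows with the input, the following Python sketch models a recursive routine that follows epsilon transitions, using one stack frame per hop. It is not flex's code: `peak_recursion_depth` and `epsilon_chain` are hypothetical helpers, and the automaton is a simple acyclic chain chosen for illustration.

```python
def peak_recursion_depth(epsilon_next, state, depth=1):
    """Return the deepest call reached while following epsilon transitions.

    `epsilon_next` maps a state to the states reachable from it through
    epsilon. The graph is assumed to be acyclic; each hop costs one extra
    stack frame, as in a recursive marking routine.
    """
    deepest = depth
    for nxt in epsilon_next.get(state, ()):
        deepest = max(deepest, peak_recursion_depth(epsilon_next, nxt, depth + 1))
    return deepest


def epsilon_chain(length):
    """Build a hypothetical chain 0 -> 1 -> ... -> length-1 of epsilon moves."""
    return {i: [i + 1] for i in range(length - 1)}


print(peak_recursion_depth(epsilon_chain(10), 0))   # 10
print(peak_recursion_depth(epsilon_chain(200), 0))  # 200
```

The peak depth equals the chain length, so an input that lengthens the chain raises the peak call stack linearly. A fuzzer that retains such inputs keeps pushing toward the stack limit even when coverage stays unchanged.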
Table 2: Time to expose real-world vulnerability

Program | Vulnerability | Type | MemLock Time(h) | AFL Time(h) | AFL Â12 | AFLfast Time(h) | AFLfast Â12 | PerfFuzz Time(h) | PerfFuzz Â12 | FairFuzz Time(h) | FairFuzz Â12 | Angora Time(h) | Angora Â12 | QSYM Time(h) | QSYM Â12
mjs | issue#58 | UR | 0.5 | 0.3 | 0.25 | 0.4 | 0.25 | 0.2 | 0.13 | 0.4 | 0.25 | T/O | 1.00 | 0.3 | 0.22
mjs | issue#106 | UR | 13.7 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00
cxxfilt | CVE-2018-9138 | UR | 0.3 | 7.2 | 1.00 | 10.1 | 1.00 | 0.5 | 0.81 | T/O | 1.00 | T/O | 1.00 | 3.3 | 1.00
cxxfilt | CVE-2018-9996 | UR | T/O | 16.5 | 0.00 | T/O | 0.50 | T/O | 0.50 | T/O | 0.50 | T/O | 0.50 | T/O | 0.50
cxxfilt | CVE-2018-17985 | UR | 0.2 | 1.1 | 1.00 | 4.5 | 1.00 | 0.2 | 0.63 | 1.9 | 1.00 | T/O | 1.00 | 1.4 | 1.00
cxxfilt | CVE-2018-18484 | UR | 0.2 | 1 | 1.00 | 4.5 | 1.00 | 0.2 | 0.63 | 8 | 1.00 | T/O | 1.00 | 1.4 | 1.00
cxxfilt | CVE-2018-18700 | UR | 0.2 | 1.2 | 1.00 | 4.6 | 1.00 | 0.3 | 0.75 | 12.6 | 1.00 | T/O | 1.00 | 1.4 | 1.00
nm | CVE-2018-12641 | UR | 2.6 | 19.1 | 1.00 | 12.6 | 1.00 | 12.2 | 0.88 | T/O | 1.00 | T/O | 1.00 | 12.8 | 0.88
nm | CVE-2018-17985 | UR | 10.4 | 18.2 | 0.81 | 11.9 | 0.56 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | 13.3 | 0.63
nm | CVE-2018-18484 | UR | 9.9 | 16.4 | 0.84 | 17.1 | 0.84 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | 14 | 0.75
nm | CVE-2018-18700 | UR | 9.6 | 14.9 | 0.63 | 17.8 | 0.88 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00
nm | CVE-2018-18701 | UR | 13.9 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00
nm | CVE-2019-9070 | UR | 18.4 | 15.6 | 0.56 | 13.9 | 0.44 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | 15.8 | 0.56
nm | CVE-2019-9071 | UR | 12.4 | T/O | 0.88 | 14 | 0.69 | T/O | 0.88 | T/O | 0.88 | T/O | 1.00 | T/O | 0.88
nasm | CVE-2019-6290 | UR | 0.9 | T/O | 1.00 | 19 | 1.00 | 9 | 1.00 | T/O | 1.00 | T/O | 1.00 | 17.6 | 1.00
nasm | CVE-2019-6291 | UR | 1.5 | 9 | 0.94 | 14 | 1.00 | 8.7 | 1.00 | T/O | 1.00 | T/O | 1.00 | 7.5 | 1.00
flex | CVE-2019-6293 | UR | 5.4 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00
yaml-cpp | CVE-2019-6292 | UR | 0.4 | T/O | 1.00 | 18.4 | 1.00 | 0.9 | 0.81 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00
yaml-cpp | CVE-2018-20573 | UR | 6.1 | T/O | 0.88 | T/O | 0.84 | 12.4 | 0.84 | T/O | 0.84 | T/O | 1.00 | T/O | 0.84
libsass | CVE-2018-19837 | UR | 1.6 | 13.3 | 0.88 | 10.5 | 0.88 | 1.8 | 0.63 | 8.5 | 0.88 | T/O | 1.00 | 5 | 0.81
libsass | CVE-2018-20821 | UR | 0.1 | 5.7 | 1.00 | 6.5 | 1.00 | 0.1 | 0.50 | 9.5 | 1.00 | T/O | 1.00 | 7.4 | 1.00
libsass | CVE-2018-20822 | UR | 15.6 | 14.3 | 0.50 | 19.5 | 0.56 | 14.6 | 0.47 | 11.3 | 0.56 | 0.92 | 0.00 | 10.5 | 0.44
yara | CVE-2017-9438 | UR | 0.2 | 0.9 | 1.00 | 4.3 | 1.00 | 0.61 | 0.91 | 5.3 | 1.00 | T/O | 1.00 | 0.8 | 1.00
readelf | CVE-2017-15996 | UA | 0.2 | 0.3 | 0.86 | 0.2 | 0.68 | 0.5 | 0.92 | 0.3 | 0.68 | T/O | 1.00 | 0.3 | 0.96
exiv2 | CVE-2018-4868 | UA | 0.1 | 0.1 | 0.50 | 0.1 | 0.50 | 0.1 | 0.50 | 0.1 | 0.50 | 0.1 | 0.50 | 0.1 | 0.50
bento4 | CVE-2018-20186 | UA | 0.4 | 0.4 | 0.50 | 0.4 | 0.50 | 0.4 | 0.50 | 0.4 | 0.50 | 0.1 | 0.00 | 0.4 | 0.50
bento4 | CVE-2019-7698 | UA | 14.6 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | 0.5 | 0.00 | T/O | 1.00
libming | CVE-2019-7581 | UA | 0.6 | 0.8 | 0.68 | 1.4 | 0.80 | 2 | 0.88 | 0.4 | 0.36 | T/O | 1.00 | 1.6 | 0.80
libming | CVE-2019-7582 | UA | 0.1 | 0.1 | 0.50 | 0.1 | 0.50 | 0.1 | 0.50 | 0.1 | 0.50 | 0.1 | 0.50 | 0.1 | 0.50
libming | issue#155 | UA | 1.4 | 1 | 0.30 | 1.3 | 0.36 | 1.4 | 0.40 | 1.2 | 0.42 | T/O | 1.00 | 1.6 | 0.64
openjpeg | CVE-2019-6988 | UA | 7.8 | 15.1 | 0.86 | 11.1 | 0.84 | T/O | 1.00 | T/O | 1.00 | T/O | 1.00 | 15.3 | 0.81
openjpeg | CVE-2017-12982 | UA | 4.5 | 11.4 | 0.72 | 10 | 0.60 | T/O | 1.00 | 11.9 | 0.64 | T/O | 1.00 | 10 | 0.50
jasper | CVE-2016-8886 | UA | 4.1 | 17 | 0.88 | 22.3 | 1.00 | T/O | 1.00 | 10.3 | 0.52 | T/O | 1.00 | 18.2 | 0.88
jasper | issue#207 | UA | 1.7 | 2.2 | 0.62 | 3.6 | 0.68 | T/O | 1.00 | 2.2 | 0.68 | 15.9 | 1.00 | 4 | 0.64

Average Time Usage (Improvement): MemLock 5.4; AFL 11.6 (2.15×); AFLfast 11.6 (2.15×); PerfFuzz 11.9 (2.20×); FairFuzz 14.5 (2.69×); Angora 20.3 (3.76×); QSYM 11.2 (2.07×)
Unique Vulnerabilities (Improvement): MemLock 33; AFL 26 (+26.9%); AFLfast 28 (+17.9%); PerfFuzz 20 (+65.0%); FairFuzz 17 (+94.1%); Angora 6 (+450.0%); QSYM 25 (+32.0%)

* UR means the uncontrolled-recursion bug, UA means the uncontrolled-memory-allocation bug. T/O means the fuzzer cannot find this vulnerability within 24 hours across 5 repetitions. When we calculate the average time usage, we replace T/O with 24 hours. We highlight the Â12 values in bold if the corresponding Mann-Whitney U test is significant.
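The summary rows of Table 2 follow from simple arithmetic, sketched below in Python. The helper names are hypothetical; the 24-hour substitution for T/O comes from the table note, and the sample inputs are the mjs rows and the summary values of Table 2.

```python
TIMEOUT_HOURS = 24.0  # the table note replaces T/O with 24 hours


def average_time(times):
    """Mean time-to-exposure in hours; None marks a timeout (T/O)."""
    filled = [TIMEOUT_HOURS if t is None else t for t in times]
    return sum(filled) / len(filled)


def speedup(baseline_avg, memlock_avg):
    """How many times faster MemLock is than a baseline, on average."""
    return baseline_avg / memlock_avg


def improvement_pct(memlock_count, baseline_count):
    """Relative gain in the number of vulnerabilities found, in percent."""
    return (memlock_count - baseline_count) / baseline_count * 100.0


# mjs rows of Table 2: MemLock (0.5 h, 13.7 h) versus AFL (0.3 h, T/O).
print(round(average_time([0.5, 13.7]), 2))  # 7.1
print(round(average_time([0.3, None]), 2))  # 12.15
# Summary rows: averages of 5.4 h vs 11.6 h, and 33 vs 26 vulnerabilities.
print(round(speedup(11.6, 5.4), 2))         # 2.15
print(round(improvement_pct(33, 26), 1))    # 26.9
```

Counting a timeout as 24 hours gives a lower bound on the true time-to-exposure, so the reported speedups are conservative for fuzzers that never find a vulnerability.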

From the analysis of Table 2, the case study and the new vulnerabilities MemLock found, we can positively answer RQ2 that MemLock significantly outperforms the state-of-the-art fuzzers in terms of real-world memory consumption vulnerability detection.

4.4 Memory Leakage Evaluation (RQ3)

Memory leak bugs are a little different from uncontrolled-recursion and uncontrolled-memory-allocation bugs, because they may not lead to program crashes immediately. Only when enough memory is leaked can they lead to a Denial-of-Service (DoS) attack, for example, in long-running programs (e.g., banking services). To evaluate the effectiveness of fuzzers in finding memory leaks, we look into the total number of bytes leaked by 7 different fuzzers within 24 hours.

Table 3 shows the amount of memory leak (in bytes) identified by each fuzzer that may occur in different programs. We can see that MemLock shows an obvious advantage over other baseline fuzzers. The number of bytes leaked is increased by 234% to 3753163%, compared to other baseline fuzzers. This is because MemLock tries to maximize each allocation and generates inputs with high memory consumption. When a memory leak happens, those memory-consuming inputs will often leak more bytes.

From the results in Table 3, we can answer RQ3 that MemLock significantly magnifies the memory leakage compared to the state-of-the-art fuzzing techniques, due to its memory consumption guidance.

4.5 Memory Consumption Evaluation (RQ4)

MemLock seeks to generate test inputs that consume more and more memory. In this experiment, we therefore evaluate the test input distribution according to memory consumption for MemLock, AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM. A fuzzer that maintains a seed pool with a larger proportion of high memory consumption inputs is considered to have a better chance of detecting memory consumption bugs.
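The comparison described here amounts to histogramming each fuzzer's seed pool by a memory metric, which the following Python sketch illustrates. The function names, bin width, threshold, and the two sample pools are hypothetical; they are not measurements from the experiments.

```python
from collections import Counter


def seed_histogram(peaks, bin_width):
    """Count seeds per bin; bin b covers [b * bin_width, (b + 1) * bin_width)."""
    return Counter(p // bin_width for p in peaks)


def high_memory_share(peaks, threshold):
    """Fraction of seeds whose peak memory metric is at least `threshold`."""
    if not peaks:
        return 0.0
    return sum(1 for p in peaks if p >= threshold) / len(peaks)


# Hypothetical peak call-stack lengths for the seeds of two fuzzers.
pool_a = [120, 900, 4800, 12000, 31000]
pool_b = [100, 450, 800, 2500, 4900]

print(dict(seed_histogram(pool_a, 5000)))  # {0: 3, 2: 1, 6: 1}
print(high_memory_share(pool_a, 10000))    # 0.4
print(high_memory_share(pool_b, 10000))    # 0.0
```

Under the criterion stated above, the pool with the larger share of high-consumption seeds (here `pool_a`) would be considered better positioned to expose memory consumption bugs.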
Figure 6: Seed distribution based on memory consumption. The larger the value on the right side is better.
(Figure panels: nm, nasm, flex, yara with x-axis "the peak length of call stack"; readelf, openjpeg, jasper, libming with x-axis "amount of consumed heap memory (bytes)", scaled by 1e9. y-axis: # of seeds in seed pool, logarithmic. Series: AFL, MemLock, AFLfast, PerfFuzz, FairFuzz, QSYM.)

Table 3: Total Leak Bytes

Program | Type | Tool | Leakage (Bytes) | Improve. | p-value | Â12
bento4 | memory leak | MemLock | 52,709,574 | - | - | -
bento4 | memory leak | AFL | 151,862 | +34609% | 0.0061 | 1.00
bento4 | memory leak | AFLfast | 1,233,255 | +4174% | 0.0061 | 1.00
bento4 | memory leak | PerfFuzz | 105,984 | +49633% | 0.0061 | 1.00
bento4 | memory leak | FairFuzz | 1,910,466 | +2659% | 0.0061 | 1.00
bento4 | memory leak | Angora | 141,512 | +37147% | 0.0060 | 1.00
bento4 | memory leak | QSYM | 15,784,847 | +234% | 0.0061 | 1.00
libming | memory leak | MemLock | 176,320,785 | - | - | -
libming | memory leak | AFL | 4,869,594 | +3521% | 0.0061 | 1.00
libming | memory leak | AFLfast | 2,535,212 | +6855% | 0.0061 | 1.00
libming | memory leak | PerfFuzz | 47,044,964 | +257% | 0.0061 | 1.00
libming | memory leak | FairFuzz | 828,742 | +21176% | 0.0061 | 1.00
libming | memory leak | Angora | 4,698 | +3753163% | 0.0060 | 1.00
libming | memory leak | QSYM | 1,219,093 | +14363% | 0.0061 | 1.00
jasper | memory leak | MemLock | 2,372,844,732 | - | - | -
jasper | memory leak | AFL | 56,018,839 | +4136% | 0.0061 | 1.00
jasper | memory leak | AFLfast | 48,403,244 | +4802% | 0.0061 | 1.00
jasper | memory leak | PerfFuzz | 6,229,898 | +37988% | 0.0061 | 1.00
jasper | memory leak | FairFuzz | 56,788,235 | +4096% | 0.0061 | 1.00
jasper | memory leak | Angora | 191,907,941 | +1136% | 0.0105 | 0.98
jasper | memory leak | QSYM | 38,244,568 | +6104% | 0.0061 | 1.00

Figure 6 shows the input distribution based on memory consumption. In general, we can clearly see that MemLock can generate more seeds with higher memory consumption. This is because the guidance mechanisms in MemLock help to gradually add more and more memory-consuming inputs into the seed pool. In particular, for the uncontrolled-recursion bugs (nm, nasm, flex and yara), MemLock generates a large number of inputs that hold more than 30,000 function calls in the call stack, while PerfFuzz generates only a few and AFL/AFLfast can hardly generate inputs that hold more than 10,000 function calls. The pattern is similar for uncontrolled-memory-allocation bugs (readelf, openjpeg, jasper and libming). MemLock can generate a considerable number of inputs with high memory consumption while the inputs of the other fuzzers concentrate on the low memory consumption region. The results clearly demonstrate the effectiveness of the strategies of MemLock in generating inputs with high memory consumption.

After analyzing Figure 6, we can answer RQ4 that the strategies of MemLock indeed help to generate inputs with high memory consumption.

4.6 Discussion

Additional Experiments. The above four groups of experiments show that MemLock is effective and efficient in finding memory consumption vulnerabilities. Since MemLock focuses on the space complexity issues, it may fall behind other baseline fuzzers in other performance metrics. For example, MemLock intentionally keeps seeds that increase memory consumption, which may degrade its capability of identifying other types of vulnerabilities. We have therefore evaluated the capability of finding other types of crashes. In the benchmark programs, MemLock, AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM find 77, 239, 228, 189, 276, 343 and 236 other types of unique crashes, respectively. Moreover, our approach may also incur some runtime overhead. Therefore, we compare the code coverage and execution speed for each baseline fuzzer. In total, the number of executed test inputs in MemLock ranges from 20% to 84% of those in AFL, AFLfast, FairFuzz and QSYM. Among all the fuzzers, PerfFuzz performs the worst, likely due to the fact that it prefers the test inputs that execute more instructions. Considering the code coverage, MemLock achieves comparable code coverage to the fuzzers AFL, AFLfast, FairFuzz and QSYM. PerfFuzz still performs the worst among those fuzzers, and in most cases it only achieves about 60% to 70% of the code coverage of the other fuzzers. All extra experimental results and data are available on our website [48] for interested readers.

Threats to Validity. We selected a variety of real-world programs to show the capabilities of MemLock, and compared it against other state-of-the-art fuzzers. However, our benchmarks may still include
a certain sample bias. Further studies on more real-world programs can help better evaluate MemLock. Besides, MemLock also suffers from the difficulty in breaking through hard comparisons (e.g., magic bytes), as most work does [7, 11, 28]. Adopting some program analysis techniques (e.g., symbolic execution) might help mitigate this threat.

5 RELATED WORK

Coverage-based Grey-box Fuzzing. Coverage-based grey-box fuzzing [3, 39, 41, 44, 47, 57, 66] is one of the most effective techniques to find vulnerabilities and bugs, and has attracted a great deal of attention from both academia and industry. Coverage-based grey-box fuzzers typically adopt the coverage information to guide different program path explorations. For example, Google has built an OSS-FUZZ platform [61] by incorporating several state-of-the-art coverage-based grey-box fuzzers: libFuzzer [45], honggfuzz [9], AFL [84] and ClusterFuzz [30].

Since a coverage guidance engine is a key component for the grey-box fuzzers, much effort has been devoted to improving their coverage. Steelix [40], Vuzzer [59] and REDQUEEN [3] use program-state analysis or taint analysis to penetrate some paths protected by magic bytes comparisons. QSYM [83], Driller [64] and SAFL [76] equip grey-box fuzzing with a symbolic execution engine to reach deeper program code. Angora [12] adopts a gradient descent technique to solve path constraints so as to break some hard comparisons. MemFuzz [16] augments evolutionary fuzzing by additionally leveraging information about memory accesses (instead of memory consumption) performed by the target program. ProFuzzer [82], GRIMOIRE [6], Superion [75] and Zest [56] leverage the knowledge in highly-structured files to generate syntactically and semantically valid test inputs, and are thus able to touch deeper program code. CollAFL [28] proposes a coverage sensitive fuzzing solution to mitigate path collisions. FairFuzz [38] leverages a targeted mutation strategy to execute towards rare branches. UAFL [73] incorporates typestate properties and information flow into its fuzzing engine to guide the detection of use-after-free vulnerabilities. Besides, AFLgo [7] and Hawkeye [11] use distance metrics to execute towards user-specified target sites in the program. The main difference between MemLock and these state-of-the-art fuzzers is that MemLock aims at memory consumption bugs while the others aim to find memory corruption vulnerabilities. Thus, MemLock is orthogonal to these state-of-the-art fuzzers.

Recently, researchers have paid attention to the algorithmic complexity vulnerabilities (i.e., time complexity issues) such as SlowFuzz [58], Singularity [77] and PerfFuzz [37]. They use the number of executed instructions as the guidance to explore the program path with a longer path length. In contrast with MemLock, they stress the time complexity issues while MemLock considers space complexity issues. The space complexity issues have their own unique characteristics: the amount of memory consumption can increase (e.g., function entry, memory allocation) and decrease (e.g., function exit, memory free), and MemLock takes both into consideration.

Static Analysis. Static analysis is also used to analyze memory consumption [1, 10, 13, 14, 31, 34, 70]. Wang et al. [70] present a type-guided worst-case input generation by using automatic amortized resource analysis to derive symbolic bounds on the resource usage of functions. Duc-Hiep et al. [15] present a worst-case memory consumption analysis, which uses symbolic execution to exhaustively unroll loops and compute the memory consumption of each iteration. He et al. [31] and Chin et al. [14] employ static verification to check that a program’s memory usage is within the memory bounds, while Chin et al. [13] use static analysis to compute the memory usage bounds for assembly level programs. These approaches rely on type theory or symbolic execution, thus they often suffer from scalability issues. SMOKE [26] is a path-sensitive memory leak detector for millions of lines of code. It first uses a scalable but imprecise analysis to compute a set of candidate memory leak paths and then verifies the feasibility of the candidates using a more precise analysis. While SMOKE can demonstrate the existence of a memory leak, MemLock can generate an input that produces the memory leak.

Dynamic Analysis. Yuku et al. [46] propose an improved real-time scheduling algorithm to reduce maximal heap memory consumption by controlling multitask scheduling. Different from MemLock, this technique aims at reducing memory consumption by dynamic online scheduling while MemLock aims to find memory consumption bugs. BLEAK [69] is a system to debug memory leaks in web applications. It leverages the observation that users often repeatedly return to the same visual state. Sustained growth between round trips is a strong indicator of a memory leak. BLEAK is only applicable to memory leaks of web applications, while MemLock can find several kinds of memory consumption bugs. Radmin [24] is a system for early detection of application-level resource exhaustion and starvation attacks. It first learns and executes multiple probabilistic finite automata from its benign executions. It then restricts the resource usage to the learned automata and detects resource usage anomalies. Radmin uses some heuristics to detect resource usage anomalies, while MemLock employs the fuzzing technique to automatically generate the inputs for memory consumption bugs.

6 CONCLUSION

In this paper, we propose MemLock, an enhanced grey-box fuzzing technique to find memory consumption bugs. MemLock employs both coverage and memory consumption information to guide the fuzzing process. The coverage information guides the exploration of different program paths, while the memory consumption information guides the search for those program paths that exhibit more and more memory consumption. Our experimental results have shown that MemLock outperforms state-of-the-art fuzzing techniques (i.e., AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM) in detecting memory consumption bugs. We also found 15 security-critical vulnerabilities in some real-world programs. At the time of writing, 12 of these vulnerabilities have been patched.

ACKNOWLEDGEMENTS

This work was supported in part by the National Natural Science Foundation of China under Grants No. 61772347, 61836005, 61972260, 61772408, 61721002, Ant Financial Services Group through Ant Financial Research Program, Guangdong Basic and Applied Basic Research Foundation under Grant No. 2019A1515011577, National Key R&D Program of China under Grant No. 2018YFB0803501.

REFERENCES

[1] Jeppe L Andersen, Mikkel Todberg, Andreas E Dalsgaard, and René Rydhof Hansen. 2013. Worst-case memory consumption analysis for SCJ. In Proceedings of the 11th International Workshop on Java Technologies for Real-time and Embedded Systems. ACM, 2–10.
[2] Andrea Arcuri and Lionel Briand. 2011. A practical guide for using statistical tests to assess randomized algorithms in software engineering. In Software Engineering, 2011 33rd International Conference on. IEEE, 1–10.
[3] Cornelius Aschermann, Sergej Schumilo, Tim Blazytko, Robert Gawlik, and Thorsten Holz. 2019. REDQUEEN: Fuzzing with Input-to-State Correspondence. In Proceedings of the Network and Distributed System Security Symposium.
[4] Bento4. 2019. Full-featured MP4 format and MPEG DASH library and tools. http://www.bento4.com. accessed: 2019-08-01.
[5] GNU binutils. 2019. a collection of binary tools. https://www.gnu.org/software/binutils/. accessed: 2019-08-01.
[6] Tim Blazytko, Cornelius Aschermann, Moritz Schlögel, Ali Abbasi, Sergej Schumilo, Simon Wörner, and Thorsten Holz. 2019. GRIMOIRE: Synthesizing Structure while Fuzzing. (2019).
[7] Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. 2017. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2329–2344.
[8] Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. 2017. Coverage-based greybox fuzzing as markov chain. IEEE Transactions on Software Engineering (2017).
[9] Maintained by Google. 2018. honggfuzz. http://honggfuzz.com/.
[10] Quentin Carbonneaux, Jan Hoffmann, Tahina Ramananandro, and Zhong Shao. 2014. End-to-end verification of stack-space bounds for C programs. In ACM SIGPLAN Notices, Vol. 49. ACM, 270–281.
[11] Hongxu Chen, Yinxing Xue, Yuekang Li, Bihuan Chen, Xiaofei Xie, Xiuheng Wu, and Yang Liu. 2018. Hawkeye: towards a desired directed grey-box fuzzer. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2095–2108.
[12] Peng Chen and Hao Chen. 2018. Angora: Efficient fuzzing by principled search. In 2018 IEEE Symposium on Security and Privacy (SP). IEEE, 711–725.
[13] Wei-Ngan Chin, Huu Hai Nguyen, Corneliu Popeea, and Shengchao Qin. 2008. Analysing memory resource bounds for low-level programs. In the 7th International Symposium on Memory Management, (ISMM 2008). 151–160. https://doi.org/10.1145/1375634.1375656
[14] Wei-Ngan Chin, Huu Hai Nguyen, Shengchao Qin, and Martin C. Rinard. 2005. Memory Usage Verification for OO Programs. In 12th International Symposium on Static Analysis (SAS 2005). 70–86. https://doi.org/10.1007/11547662_7
[15] Duc-Hiep Chu, Joxan Jaffar, and Rasool Maghareh. 2016. Symbolic execution for memory consumption analysis. ACM SIGPLAN Notices 51, 5 (2016), 62–71.
[16] Nicolas Coppik, Oliver Schwahn, and Neeraj Suri. 2019. MemFuzz: Using Memory Accesses to Guide Fuzzing. In 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST). IEEE, 48–58.
[17] CVE-2017-9804. 2017. Available from MITRE. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2017-9804.
[18] CVE-2018-17985. 2018. Available from MITRE. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-17985.
[19] CVE-2018-4868. 2019. Available from MITRE. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-4868.
[20] CVE-2019-6291. 2019. Available from MITRE. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-6291.
[21] CVE-2019-6292. 2019. Available from MITRE. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-6292.
[22] CVE-2019-7704. 2019. Available from MITRE. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-7704.
[23] CVE Details. accessed: 2019. The list of Vulnerabilities according to CWE-400: Uncontrolled Resource Consumption. https://www.cvedetails.com/cwe-details/400/Uncontrolled-Resource-Consumption-039-Resource-Exhaustion.html.
[24] Mohamed Elsabagh, Daniel Barbará, Dan Fleck, and Angelos Stavrou. 2018. On early detection of application-level resource exhaustion and starvation. Journal of Systems and Software 137 (2018), 430–447.
[25] Exiv2. 2019. Image metadata library and tools. http://www.exiv2.org/. accessed: 2019-08-01.
[26] Gang Fan, Rongxin Wu, Qingkai Shi, Xiao Xiao, Jinguo Zhou, and Charles Zhang. 2019. SMOKE: Scalable Path-Sensitive Memory Leak Detection for Millions of Lines of Code. In Proceedings of the 41st International Conference on Software Engineering, ICSE, Gothenburg, Sweden.
[27] Flex. 2019. The Fast Lexical Analyzer - scanner generator for lexing in C and C++. https://github.com/westes/flex. accessed: 2019-08-01.
[28] Shuitao Gan, Chao Zhang, Xiaojun Qin, Xuwen Tu, Kang Li, Zhongyu Pei, and Zuoning Chen. 2018. CollAFL: Path sensitive fuzzing. In 2018 IEEE Symposium on Security and Privacy. IEEE, 679–696.
[29] Google. 2018. The list of common sanitizer options. https://github.com/google/sanitizers/wiki/SanitizerCommonFlags.
[30] Google. 2019. ClusterFuzz. https://google.github.io/clusterfuzz/.
[31] Guanhua He, Shengchao Qin, Chenguang Luo, and Wei-Ngan Chin. 2009. Memory Usage Verification Using Hip/Sleek. In 7th International Symposium on Automated Technology for Verification and Analysis (ATVA 2009). 166–181. https://doi.org/10.1007/978-3-642-04761-9_14
[32] Jasper. 2019. Image Processing/Coding Tool Kit. https://www.ece.uvic.ca/~frodo/jasper/. accessed: 2019-08-01.
[33] Xiangkun Jia, Chao Zhang, Purui Su, Yi Yang, Huafeng Huang, and Dengguo Feng. 2017. Towards efficient heap overflow discovery. In 26th USENIX Security Symposium. 989–1006.
[34] Daniel Kästner and Christian Ferdinand. 2014. Proving the absence of stack overflows. In International Conference on Computer Safety, Reliability, and Security. Springer, 202–213.
[35] George Klees, Andrew Ruef, Benji Cooper, Shiyi Wei, and Michael Hicks. 2018. Evaluating Fuzz Testing. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2123–2138.
[36] Chris Lattner and Vikram Adve. 2004. LLVM: A compilation framework for lifelong program analysis & transformation. In Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization. IEEE Computer Society, 75.
[37] Caroline Lemieux, Rohan Padhye, Koushik Sen, and Dawn Song. 2018. PerfFuzz: automatically generating pathological inputs. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. ACM, 254–265.
[38] Caroline Lemieux and Koushik Sen. 2018. Fairfuzz: A targeted mutation strategy for increasing greybox fuzz testing coverage. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. ACM, 475–485.
[39] Jun Li, Bodong Zhao, and Chao Zhang. 2018. Fuzzing: a survey. Cybersecurity 1, 1 (2018), 6.
[40] Yuekang Li, Bihuan Chen, Mahinthan Chandramohan, Shang-Wei Lin, Yang Liu, and Alwen Tiu. 2017. Steelix: program-state based binary fuzzing. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. ACM, 627–637.
[41] Hongliang Liang, Xiaoxiao Pei, Xiaodong Jia, Wuwei Shen, and Jian Zhang. 2018. Fuzzing: State of the art. IEEE Transactions on Reliability 67, 3 (2018), 1199–1218.
[42] Libming. 2019. A library for generating Macromedia Flash files. http://www.libming.org/. accessed: 2019-08-01.
[43] Libsass. 2019. A C/C++ implementation of a Sass compiler. https://github.com/sass/libsass. accessed: 2019-08-01.
[44] Xiaolong Liu, Qiang Wei, Qingxian Wang, Zheng Zhao, and Zhongxu Yin. 2018. CAFA: A Checksum-Aware Fuzzing Assistant Tool for Coverage Improvement. Security and Communication Networks (2018).
[45] LLVM-Documentation. 2018. libFuzzer - a library for coverage-guided fuzz testing. http://llvm.org/docs/LibFuzzer.html.
[46] Yuki Machigashira and Akio Nakata. 2018. An Improved LLF Scheduling for Reducing Maximum Heap Memory Consumption by Considering Laxity Time. In 2018 International Symposium on Theoretical Aspects of Software Engineering. IEEE, 144–149.
[47] Valentin JM Manes, HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J Schwartz, and Maverick Woo. 2018. Fuzzing: Art, Science, and Engineering. arXiv preprint arXiv:1812.00140 (2018).
[48] MemLock. accessed: 2020-01-01. MemLock’s Home Page. https://icse2020-memlock.github.io/.
[49] MITRE. accessed: 2019. CWE-400: Uncontrolled Resource Consumption. https://cwe.mitre.org/data/definitions/400.html.
[50] MITRE. accessed: 2019. CWE-401: Missing Release of Memory after Effective Lifetime. https://cwe.mitre.org/data/definitions/401.html.
[51] MITRE. accessed: 2019. CWE-674: Uncontrolled Recursion. https://cwe.mitre.org/data/definitions/674.html.
[52] MITRE. accessed: 2019. CWE-789: Uncontrolled Memory Allocation. https://cwe.mitre.org/data/definitions/789.html.
[53] mjs. 2019. mjs: Restricted JavaScript engine. https://github.com/cesanta/mjs. accessed: 2019-08-01.
[54] Nasm. 2019. The Netwide Assembler. https://www.nasm.us. accessed: 2019-08-01.
[55] Openjpeg. 2019. An open-source JPEG 2000 codec written in C language. https://github.com/uclouvain/openjpeg. accessed: 2019-08-01.
[56] Rohan Padhye, Caroline Lemieux, Koushik Sen, Mike Papadakis, and Yves Le Traon. 2019. Semantic Fuzzing with Zest. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’19).
[57] Hui Peng, Yan Shoshitaishvili, and Mathias Payer. 2018. T-Fuzz: fuzzing by program transformation. In 2018 IEEE Symposium on Security and Privacy. IEEE, 697–710.
[58] Theofilos Petsios, Jason Zhao, Angelos D Keromytis, and Suman Jana. 2017. Slowfuzz: Automated domain-independent detection of algorithmic complexity vulnerabilities. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2155–2168.
[59] Sanjay Rawat, Vivek Jain, Ashish Kumar, Lucian Cojocar, Cristiano Giuffrida, and Herbert Bos. 2017. Vuzzer: Application-aware evolutionary fuzzing. In Proceedings of the Network and Distributed System Security Symposium.
[60] Alexey Samsonov and Kostya Serebryany. 2013. New features in addresssanitizer. (2013).
[61] Kostya Serebryany. 2017. OSS-Fuzz-Google's continuous fuzzing service for open source software. (2017).
[62] Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitriy Vyukov. 2012. AddressSanitizer: A fast address sanity checker. In Presented as part of the 2012 USENIX Annual Technical Conference. 309–318.
[63] Yuju Shen, Yanyan Jiang, Chang Xu, Ping Yu, Xiaoxing Ma, and Jian Lu. 2018. ReScue: crafting regular expression DoS attacks. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. ACM, 225–235.
[64] Nick Stephens, John Grosen, Christopher Salls, Andrew Dutcher, Ruoyu Wang, Jacopo Corbetta, Yan Shoshitaishvili, Christopher Kruegel, and Giovanni Vigna. 2016. Driller: Augmenting Fuzzing Through Selective Symbolic Execution. In NDSS, Vol. 16. 1–16.
[65] Laszlo Szekeres, Mathias Payer, Tao Wei, and Dawn Song. 2013. Sok: Eternal war in memory. In Security and Privacy, 2013 IEEE Symposium on. IEEE, 48–62.
[66] Ari Takanen, Jared D Demott, Charles Miller, and Atte Kettunen. 2018. Fuzzing for software security testing and quality assurance. Artech House.
[67] Victor Van der Veen, Lorenzo Cavallaro, Herbert Bos, et al. 2012. Memory errors: The past, the present, and the future. In International Workshop on Recent Advances in Intrusion Detection. Springer, 86–106.
[68] András Vargha and Harold D Delaney. 2000. A critique and improvement of the CL common language effect size statistics of McGraw and Wong. Journal of Educational and Behavioral Statistics 25, 2 (2000), 101–132.
[69] John Vilk and Emery D Berger. 2018. BLeak: automatically debugging memory leaks in web applications. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 15–29.
[70] Di Wang and Jan Hoffmann. 2019. Type-Guided Worst-Case Input Generation. Proceedings of the ACM on Programming Languages (2019).
[71] Haijun Wang, Yun Lin, Zijiang Yang, Jun Sun, Yang Liu, Jin Song Dong, Qinghua Zheng, and Ting Liu. 2019. Explaining Regressions via Alignment Slicing and Mending. IEEE Transactions on Software Engineering (2019), 1–1.
[72] Haijun Wang, Ting Liu, Xiaohong Guan, Chao Shen, Qinghua Zheng, and Zijiang Yang. 2016. Dependence guided symbolic execution. IEEE Transactions on Software Engineering 43, 3 (2016), 252–271.
[73] Haijun Wang, Xiaofei Xie, Yi Li, Cheng Wen, Yang Liu, Shengchao Qin, Hongxu Chen, and Yulei Sui. 2020. Typestate-Guided Fuzzer for Discovering Use-after-Free Vulnerabilities. In 2020 IEEE/ACM 42nd International Conference on Software Engineering. Seoul, South Korea.
[74] Haijun Wang, Xiaofei Xie, Shang-Wei Lin, Yun Lin, Yuekang Li, Shengchao Qin, Yang Liu, and Ting Liu. 2019. Locating vulnerabilities in binaries via memory layout recovering. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 718–728.
[75] Junjie Wang, Bihuan Chen, Lei Wei, and Yang Liu. 2019. Superion: Grammar-Aware Greybox Fuzzing. In Proceedings of the 41st International Conference on Software Engineering, ICSE, Gothenburg, Sweden.
[76] Mingzhe Wang, Jie Liang, Yuanliang Chen, Yu Jiang, Xun Jiao, Han Liu, Xibin Zhao, and Jiaguang Sun. 2018. SAFL: increasing and accelerating testing coverage with symbolic execution and guided fuzzing. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings. ACM, 61–64.
[77] Jiayi Wei, Jia Chen, Yu Feng, Kostas Ferles, and Isil Dillig. 2018. Singularity: Pattern fuzzing for worst case complexity. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. ACM, 213–223.
[78] Technical whitepaper for afl fuzz. 2019. american fuzzy lop. http://lcamtuf.coredump.cx/afl/technical_details.txt. accessed: 2019-08-01.
[79] Zhiwu Xu, Cheng Wen, and Shengchao Qin. 2018. State-taint analysis for detecting resource bugs. Science of Computer Programming 162 (2018), 93–109.
[80] yaml cpp. 2019. A YAML parser and emitter in C++. https://github.com/jbeder/yaml-cpp. accessed: 2019-08-01.
[81] Yara. 2019. The pattern matching swiss knife for malware researchers. http://virustotal.github.io/yara/. accessed: 2019-08-01.
[82] Wei You, Xueqiang Wang, Shiqing Ma, Jianjun Huang, Xiangyu Zhang, XiaoFeng Wang, and Bin Liang. 2019. Profuzzer: On-the-fly input type probing for better zero-day vulnerability discovery. In Security and Privacy, 2019 IEEE Symposium on. IEEE.
[83] Insu Yun, Sangho Lee, Meng Xu, Yeongjin Jang, and Taesoo Kim. 2018. QSYM: A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing. In 27th USENIX Security Symposium. 745–761.
[84] Michal Zalewski. 2017. American Fuzzy Lop 2.52b. http://lcamtuf.coredump.cx/afl/.
