Sec20fall Yun Prepub
tion, which is the unused memory (i.e., hole) among in-use memory blocks. Unfortunately, these two desirable properties are fundamentally conflicting; an allocator should minimize additional operations to achieve good performance, whereas it requires additional operations to minimize fragmentation. Therefore, the goal of an allocator is typically to find a good balance between these two goals for its workloads.

Common designs. In analyzing various heap allocators, we found their common design principles shown in Table 3: binning, in-place metadata, and cardinal data. Many allocators use size-based classification, known as binning. In particular, they partition a whole size range into multiple groups to manage memory blocks deliberately according to their size groups; small-size blocks focus on performance, and large-size blocks focus on memory usage of the allocators. Moreover, by dividing size groups, when they try to find the best-fit block (the smallest but sufficient block for a given request), they scan only blocks in the proper size group instead of scanning all memory blocks.

Moreover, many dynamic memory allocators place metadata near the payload, called in-place metadata, even though some allocators avoid this because of security problems from corrupted metadata in the presence of memory corruption bugs (see Table 3). To minimize memory fragmentation, a memory allocator should maintain information about allocated or freed memory in metadata. Even though the allocator can place metadata and payload in distinct locations, many allocators store the metadata near the payload (i.e., a head or a tail of a chunk) to increase locality. In particular, by connecting metadata and payload, an allocator can get benefits from the cache, resulting in performance improvement.

Further, memory allocators contain only cardinal data that are not encoded and are essential for fast lookup and memory usage. In particular, metadata are mostly pointers or size-related values that are used for their data structures. For example, ptmalloc2 stores a raw pointer for a linked list that is used to maintain freed memory blocks.

This observation has been leveraged to devise the universal method to test various allocators regardless of their implementations (see §5.2). First, our approach should consider binning to explore multiple size groups of an allocator. For example, if we just uniformly pick a size in the 2^64 space, the probability of choosing the smallest size group in ptmalloc2 (< 2^7) becomes nearly zero (2^-57). Thus, we need to use a better sampling method considering binning. Moreover, the other two design principles (in-place and cardinal metadata) limit the locations and domains of metadata, reducing the search space. Under these design principles, we only need to focus on metadata in the boundary of a chunk with specific forms (i.e., pointers or sizes).

[Figure 1: Metadata for a chunk in ptmalloc2 and memory layout for the in-use and freed chunks [23]. Panels: (a) allocated chunk; (b) free chunk (e.g., small bin).]

2.2 ptmalloc2: glibc's Heap Allocator

In this section, we discuss ptmalloc2 [22, 23, 27], the heap allocator used in glibc, whose exploitation techniques have been heavily studied because of its prevalence and its complexity of metadata [1, 3, 7, 18, 20, 25, 36, 38, 55]. Similar to other work [17, 58], we will use ptmalloc2 as our default allocator for further discussions.

Metadata. A chunk in ptmalloc2 is a memory region containing metadata and payload. A memory allocation API such as malloc() returns the address of the payload in the chunk. Figure 1 shows the metadata of a chunk and its memory layout for an in-use and a freed chunk. prev_size represents the size of a previous chunk if it is freed. Although the prev_size of a chunk overlaps with the payload of the previous chunk, this is legitimate since prev_size is considered only after the previous chunk is freed, i.e., the payload is no longer used. size represents the size of a current chunk. The real size of the chunk is 8-byte aligned, and the 3 LSBs of the size are
used for storing the state of the chunk. The last bit of size, called PREV_IN_USE (P), shows whether the previous chunk is in use. For example, in Figure 1, after the chunk is freed, the PREV_IN_USE in the next chunk is changed from 1 to 0. Other metadata, fd, bk, fd_nextsize, and bk_nextsize, are used to maintain linked lists that hold freed chunks.

Binning. ptmalloc2 has several types of bins: fast bin, small bin, large bin, unsorted bin, and tcache [15]. Each bin has its own characteristics to achieve its goal; a fast bin uses a singly linked list, giving up merging for performance, but a small bin merges its freed chunks to reduce fragmentation. Moreover, a large bin stores chunks that have different sizes to handle arbitrarily large chunks. To optimize scanning for the best-fit chunk, a large bin maintains another sorted, doubly linked list. The unsorted bin is a special bin that serves as a fast staging place for free chunks. If a chunk is freed, it first moves to the unsorted bin and is used to serve the subsequent allocation. If the chunk is not suitable for the request, it will move to a regular bin (i.e., a small bin or a large bin). Using the unsorted bin, ptmalloc2 can increase locality for performance by deferring the decision for the regular bins. The tcache, a per-thread cache, is enabled by default since glibc 2.26. It works similarly to a fast bin but requires no locking for threads, and therefore it can achieve significant performance improvements for multithreaded programs [15].

2.3 Complex Modern Heap Exploits

Heap exploit techniques have recently become much more subtle and sophisticated to bypass the new security checks introduced in the allocators. If an attacker finds a vulnerability that corrupts heap metadata (e.g., overflow) or improperly uses heap APIs (e.g., double free), the next step is to develop the bug into a more useful exploit primitive such as arbitrary write. To do so, attackers typically have to modify the heap metadata, craft a fake chunk, or call other heap APIs according to the implementation of the target heap allocator. This development was trivial in the good old days for attackers; they could use a universal technique for most allocators (e.g., unsafe unlink). However, it became complicated after many security checks were introduced to respond to such attacks. Therefore, researchers have studied and shared heap exploitation techniques that are reusable methods to develop vulnerabilities into useful attack primitives [1, 3, 7, 18, 20, 25, 36, 38, 55, 66, 72]. Table 4 summarizes modern heap exploitation techniques from previous work [17] and new ones that our tool, ArcHeap, found.

Example: Unsafe unlink. One of the most famous heap exploitation techniques is the unsafe unlink attack that abuses the unlink mechanism of doubly linked lists in heap allocators, as illustrated in Figure 2a. By modifying a forward pointer (P->fd) into a properly encoded location and a backward pointer (P->bk) into a desired value, attackers can achieve arbitrary writes (see P->fd->bk = P->bk). Due to the prevalence of doubly linked lists, this technique was used for many allocators, including dlmalloc, ptmalloc2, and even the Windows allocator [1].

    #define unlink(AV, P, BK, FD)                                        \
      /* (1) added check: size == the next chunk's prev_size */          \
      if (chunksize(P) != prev_size(next_chunk(P)))                      \
        malloc_printerr("corrupted size vs. prev_size");                 \
      FD = P->fd;                                                        \
      BK = P->bk;                                                        \
      /* (2) added check: prev/next chunks correctly point to P */       \
      if (FD->bk != P || BK->fd != P)                                    \
        malloc_printerr("corrupted double-linked list");                 \
      else {                                                             \
        FD->bk = BK;                                                     \
        BK->fd = FD;                                                     \
        ...                                                              \
      }

(a) Security checks introduced since glibc 2.3.4 and 2.26. Two security checks first validate two invariants (see the comments above) before unlinking the victim chunk (i.e., P).

    // [PRE-CONDITION]
    //   sz : any non-fast-bin size
    //   dst: where to write (void*)
    //   val: target value
    // [BUG] buffer overflow (p1)
    // [POST-CONDITION] *dst = val
    void *p1 = malloc(sz);
    void *p2 = malloc(sz);
    struct malloc_chunk *fake = p1;
    // bypassing (1): P->size == next_chunk(P)->prev_size.
    // If fake->size = 0, next_chunk(fake)->prev_size
    // will point to fake->prev_size. By setting both values
    // to zero, we can bypass the check. These assignments
    // can be omitted since heap memory is zeroed out at
    // first time of execution.
    fake->prev_size = fake->size = 0;
    // bypassing (2): P->fd->bk == P && P->bk->fd == P
    // (both lookups land on the variable p1, which holds P)
    fake->fd = (void*)&p1 - offsetof(struct malloc_chunk, bk);
    fake->bk = (void*)&p1 - offsetof(struct malloc_chunk, fd);
    struct malloc_chunk *c2 = raw_to_chunk(p2);
    // it shrinks the previous chunk's size,
    // tricking 'fake' as the previous chunk
    c2->prev_size = chunk_size(sz)
                    - offsetof(struct malloc_chunk, fd);
    // [BUG] overflowing p1 to modify c2's size:
    // tricking the previous chunk freed, P=0
    c2->size &= ~1;
    // triggering unlink(fake) via backward consolidation
    free(p2);
    assert(p1 == (void*)&p1 - offsetof(struct malloc_chunk, bk));
    // writing with p1: overwriting itself to dst
    *(void**)(p1 + offsetof(struct malloc_chunk, bk)) = dst;
    // writing with p1: overwriting *dst with val
    *(void**)p1 = (void*)val;
    assert(*(void**)dst == (void*)val);

(b) The unsafe unlink exploitation in glibc 2.26

Figure 2: The unlink macros and an exploit abusing the mechanism in glibc 2.26. Compared to old glibc, two security checks have been added in glibc 2.26. The first one hardens the off-by-one overflow, and the second one hardens unlinking abuse. Even though the security checks harden the attack, it is still avoidable.

Name | Abbr. | Description | New
Fast bin dup | FD | Corrupting a fast bin freelist (e.g., by double free or write-after-free) to return an arbitrary location |
Unsafe unlink | UU | Abusing unlinking in a freelist to get arbitrary write |
House of spirit | HS | Freeing a fake chunk of fast bin to return arbitrary location |
Poison null byte | PN | Corrupting heap chunk size to consolidate chunks even in the presence of allocated heap |
House of lore | HL | Abusing the small bin freelist to return an arbitrary location |
Overlapping chunks | OC | Corrupting a chunk size in the unsorted bin to overlap with an allocated heap |
House of force | HF | Corrupting the top chunk to return an arbitrary location |
Unsorted bin attack | UB | Corrupting a freed chunk in unsorted bin to write an uncontrollable value to arbitrary location |
House of einherjar | HE | Corrupting PREV_IN_USE to consolidate chunks to return an arbitrary location that requires a heap address |
Unsorted bin into stack | UBS | Abusing the unsorted freelist to return an arbitrary location | ✓
House of unsorted einherjar | HUE | A variant of house of einherjar that does not require a heap address | ✓
Unaligned double free | UDF | Corrupting a small bin freelist to return already allocated heap | ✓
Overlapping small chunks | OCS | Corrupting a chunk size in a small bin to overlap chunks | ✓
Fast bin into other bin | FDO | Corrupting a fast bin freelist and using malloc_consolidate() to return an arbitrary non-fast-bin chunk | ✓

Table 4: Modern heap exploitation techniques from recent work [17] including new ones found by ArcHeap in ptmalloc2 with abbreviations and brief descriptions. For brevity, we omitted tcache-related techniques.

To mitigate this attack, allocators have added the new security check shown in Figure 2a, which turns out to be insufficient to prevent more advanced attacks. The check verifies an invariant of a doubly linked list that a backward pointer of a forward pointer of a chunk should point to the chunk (i.e., P->fd->bk == P) and vice versa. Therefore, attackers cannot make the pointer directly refer to arbitrary locations as before since the pointer will not hold the invariant. Even though the check prevents the aforementioned attack, attackers can avoid this check by making a fake chunk to meet the condition, as
in Figure 2b. Compared to the previous one, the check makes the exploitation more complicated, but still feasible.

3 Heap Abstract Model

In this section, we discuss our heap abstract model, which enables us to describe a heap exploit technique independently of an underlying allocator. Here, we focus on an adversarial model, omitting obvious heap APIs (i.e., malloc and free) for brevity. Note that this abstraction is consistent with related work [17, 58].

3.1 Abstracting Heap Exploitation

Our model abstracts a heap technique in two aspects: 1) types of bugs (i.e., allowing an attacker to divert the program into unexpected states), and 2) impact of exploitation (i.e., describing what an attacker can achieve as a result). This section elaborates on each of these aspects.

1) Types of bugs. Four common types of heap-related bugs instantiate exploitation:
• Overflow (OF): Writing beyond an object boundary.
• Write-after-free (WF): Reusing a freed object.
• Arbitrary free (AF): Freeing an arbitrary pointer.
• Double free (DF): Freeing a reclaimed object.

Each of these developer mistakes allows attackers to divert the program into unexpected states in a certain way: overflow allows modification of all the metadata (e.g., struct malloc_chunk in Figure 1) of any subsequent chunks; write-after-free allows modification of the free metadata (e.g., fd/bk in Figure 1), which is similar in spirit to use-after-free; double free allows violation of the operational integrity of the internal heap metadata (e.g., multiple reclaimed pointers linked in the heap structure); and arbitrary free similarly breaks the operational integrity of the heap management but in a highly controlled manner: freeing an object with the crafted metadata. Since overflow enables a variety of paths for exploitation, we further characterize its types based on common mistakes and errors by developers.
• Off-by-one (O1): Overwriting the last byte of the next consecutive chunk (e.g., when making a mistake in size calculation, such as CVE-2016-5180 [31]).
• Off-by-one NULL (O1N): Similar to the previous type, but overwriting the NULL byte (e.g., when using string-related library functions such as sprintf).

It is worth noting that, unlike a typical exploit scenario that assumes arbitrary reads and writes, we exclude such primitives for two reasons: they are too specific to applications and execution contexts, and thus hardly meaningful for generalization, and they are so powerful that attackers can launch easier attacks, which removes the motivation for heap exploitation techniques. Therefore, such powerful primitives are considered one of the ultimate goals of heap exploitation.

2) Impact of exploitation. The goal of each heap exploitation technique is to develop common types of heap-related bugs into more powerful exploit primitives for full-fledged attacks. For the systematization of a heap exploit, we categorize its final impact (i.e., an achieved exploit primitive) into four classes:
• Arbitrary-chunk (AC): Hijacking the next malloc to return an arbitrary pointer of choice.
• Overlapping-chunk (OC): Hijacking the next malloc to return a chunk inside a chunk that is controllable (e.g., overwritable) by an attacker.
• Arbitrary-write (AW): Developing the heap vulnerability into an arbitrary write (a write-what-where primitive).
• Restricted-write (RW): Similar to arbitrary-write, but with various restrictions (e.g., non-controllable "what", such as a pointer to a global heap structure).

Attackers want to hijack control by using these exploit primitives combined with application-specific execution contexts. For example, in the unsafe unlink (see Figure 2), attackers can develop heap overflow into arbitrary writes and corrupt code pointers to hijack control.

[Figure: overview diagram. A model specification feeds the heap action generator (generate random heap actions, §5.2; execute actions and detect impacts, §5.3), whose output feeds the PoC generator (minimize actions using delta-debugging, §5.4; generate PoC exploit, §5.4), which produces a PoC exploit.]

3.2 Threat Model

To describe heap exploitation techniques in a common way, we clarify legitimate actions that an attacker can launch. First, an attacker can allocate an object with an arbitrary size, and free objects in an arbitrary order. This essentially means that the attacker can invoke an arbitrary number of malloc calls with an arbitrary size parameter and invoke free (or not) in whatever order the attacker wishes. Second, the attacker can write arbitrary data on legitimate memory regions (i.e., the payload in Figure 1 or global memory). Although such legitimate behaviors largely depend on applications in theory, assuming this powerful model lets us examine all potential opportunities for abuses. Third, the attacker can trigger only a single type of bug. This limits the capabilities of the adversary to the realistic setting. However, we allow multiple uses of
...
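The third rule of the threat model in §3.2 (a single type of bug, which may be triggered repeatedly) can be written down as a small check over a sequence of actions. The following is only a minimal sketch under stated assumptions: the enum `bug_type`, its constants, and the helper `obeys_threat_model` are hypothetical names introduced for illustration and are not taken from ArcHeap, and the sketch assumes that the overflow subtypes O1 and O1N count as bug types of their own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bug types named in the text. BUG_NONE marks a legitimate action
 * (malloc, free, or an in-bounds write). All names are hypothetical. */
enum bug_type { BUG_NONE, BUG_OF, BUG_WF, BUG_AF, BUG_DF, BUG_O1, BUG_O1N };

/* Returns true if the action sequence triggers at most one type of bug.
 * Repeating the same bug type is allowed; mixing two types is not. */
static bool obeys_threat_model(const enum bug_type *actions, size_t n)
{
    enum bug_type seen = BUG_NONE;
    for (size_t i = 0; i < n; i++) {
        if (actions[i] == BUG_NONE)
            continue;               /* legitimate actions are unrestricted */
        if (seen == BUG_NONE)
            seen = actions[i];      /* first buggy action fixes the type */
        else if (actions[i] != seen)
            return false;           /* a second bug type violates the model */
    }
    return true;
}
```

Under these assumptions, a sequence that overflows twice between legitimate actions passes the check, whereas a sequence that combines an overflow with a double free is rejected.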
it is not multiplied by a constant because it is the pointer type. Similar to deallocation, ArcHeap writes data only in a valid heap region (i.e., no overflow or underflow) to ensure legitimacy of an action (Line 33).

Bug invocation. To explore heap exploitation techniques in the presence of heap vulnerabilities, ArcHeap needs to conduct buggy actions. Currently, ArcHeap handles six bugs in heap: (1) overflow, (2) write-after-free, (3) off-by-one overflow, (4) off-by-one NULL overflow, (5) double free, and (6) arbitrary free. ArcHeap performs only one of these bugs for a technique to limit the power of an adversary as described in the threat model (see §3.2). Also, ArcHeap allows repetitive execution of the same bug to emulate the situation in which an attacker re-triggers the bug.

ArcHeap deliberately builds a buggy action to ensure its occurrence. For overflow and off-by-one, ArcHeap uses the malloc_usable_size API to get the actual heap size to ensure overflow. This is necessary since the request size could be smaller than the actual size due to alignment or the minimum size constraint. Particularly for ptmalloc2, ArcHeap uses a dedicated single-line routine to get the actual size since ptmalloc2's malloc_usable_size() is inaccurate in the presence of memory corruption bugs. Moreover, in double free and write-after-free bugs, ArcHeap checks whether a target chunk is already freed. If it is not freed yet, ArcHeap ignores this buggy action and waits for the next one.

Model specifications. A user can optionally provide a model specification either to direct ArcHeap to focus on a certain type of exploitation techniques or to restrict the conditions for a target environment. It accepts five types of model specification: chunk sizes, bugs, impacts, actions, and knowledge. The first four types are self-explanatory, and knowledge is about the ability of an attacker to break ASLR (i.e., prior knowledge of certain addresses). The user can specify three types of addresses that an attacker may know: a heap address, the global buffer address, and the container address. Such knowledge will affect future data generation by ArcHeap, as shown in Table 5.

5.3 Detecting Techniques by Impact

ArcHeap detects four types of impacts of exploitation that are the building blocks of a full chain exploit: arbitrary-chunk (AC), overlapping-chunk (OC), arbitrary-write (AW), and restricted-write (RW). This approach has two benefits, namely, expressiveness and performance. These types are useful in developing control-hijacking, the ultimate goal of an attacker. Thus, all existing techniques lead to one of these types, i.e., can be represented by these types. Also, it causes small performance overheads to detect the existence of these types with a simple data structure shadowing the heap space.

(1) To detect AC and OC, ArcHeap determines any overlapping chunks in each allocation (Line 18 in Figure 4). To make the check safe, it replicates the address and size of a chunk right after malloc since it could be corrupted when a buggy action is executed. Using the stored addresses and sizes, it can quickly check if a chunk overlaps with its data structure (AC) or other chunks (OC).

(2) To detect AW and RW, ArcHeap safely replicates its data structures, the containers and the global buffer, using the technique called shadow memory. During execution, ArcHeap synchronizes the state of the shadow memory whenever it performs actions that can modify its internal structures: allocations for the container and buffer writes for the global buffer (Line 21, 49). Then, ArcHeap checks the divergence of the shadow memory when performing any action (Line 17, 29, 43, 50). Because of the explicit consistency maintained by ArcHeap, divergence can only occur when previously executed actions modify ArcHeap's data structures via an internal operation of the heap allocator. Later, these actions can be reformulated to modify sensitive data of
an application instead of the data structure for exploitation.

ArcHeap's fuzzing strategies (Table 5) make this detection efficient by limiting its analysis scope to its data structures. In general, a heap exploit technique can corrupt any data, leading to scanning of the entire memory space. However, the technique found by ArcHeap can only modify heap or the data structures because these are the only visible addresses from its fuzzing strategies. ArcHeap checks only modification in its data structures, but ignores one in heap because it is hard to distinguish a legitimate one (e.g., modifying metadata in deallocation) from an abusing one (i.e., a heap exploit technique) without a deep understanding of an allocator. This is semantically equivalent to monitoring the violation of the implicit invariant of an allocator: it should not modify memory that is not under its control.

ArcHeap distinguishes AW from RW based on the heap actions that introduce divergence. If a divergence occurs in allocation or deallocation, it concludes RW; otherwise (i.e., in heap or buffer write), it concludes AW. The underlying intuition is that parameters in the former actions are hard to control arbitrarily, but not in the latter ones. After detecting divergence, ArcHeap copies the original memory to its shadow to stop repeated detections.

A running example. Figure 6 shows the state of the shadow memory when executing Figure 5. (1) After the first allocation, ArcHeap updates its heap container and corresponding shadow memory to maintain their consistency, which might be affected by the action. (2) It performs two more allocations and so updates the heap container and shadow memory accordingly. (3) After deallocation, p[1] is changed into ⋆ due to unlink() in ptmalloc2 (Figure 2a). At this point, ArcHeap detects divergence of the shadow memory from the original heap container. Since this divergence occurs during deallocation, the impact of exploitation is limited to restricted writes in the heap container. (4) In this case, since the heap write causes the divergence, the actions can trigger arbitrary writes in the heap container. (5) Since this heap write introduces divergence in the global buffer, the actions can lead to arbitrary write in the global buffer.

5.4 Generating PoC via Delta-Debugging

To find the root cause of exploitation, ArcHeap refines test cases using delta-debugging [76], as shown in Algorithm 1. The algorithm is simple in concept: for each action, ArcHeap re-evaluates the impact of exploitation of the test cases without it. If the impacts of the original and new test cases are equal, it considers the excluded action redundant (i.e., no meaningful effect on the exploitation). The intuition behind this decision is that many actions are independent (e.g., buffer writes and heap writes) so that the delta-debugging can clearly separate non-essential actions from the test case. Our current algorithm is limited to evaluating one individual action at a time. It can be easily extended to check with a sequence or a combination of heap actions together, but our evaluation shows that the current scheme using a single action is effective enough for practical uses: it eliminates 84.3% of non-essential actions on average (see §8.3).

Algorithm 1: Minimize actions that result in an impact of exploitation
    Input:  actions – actions that result in an impact
    origImpact ← GetImpact(actions)
    minActions ← actions
    for action ∈ actions do
        tempActions ← minActions − action
        tempImpact ← GetImpact(tempActions)
        if origImpact = tempImpact then
            minActions ← tempActions
        end
    end
    Output: minActions – minimized actions that result in the same impact

Once minimized, ArcHeap converts the encoded test case to a human-understandable PoC like that in Figure 5 using one-to-one mapping between each action and C code (e.g., an allocation action → malloc()).

6 Implementation

We extended American Fuzzy Lop (AFL) to run our heap action generator that randomly executes heap actions. The generator sends a user-defined signal, SIGUSR2, if it finds actions that result in an impact of exploitation. We also modified AFL to save crashes only when it gets SIGUSR2 and to ignore other signals (e.g., segmentation fault), which are not of interest for finding techniques. We carefully implemented the generator not to call heap APIs implicitly except for the pre-defined actions for reproducing the actions. For example, the generator uses the standard error for its logging instead of standard output, which calls malloc internally for buffering. To prevent the accidental corruption of internal data structures, the generator allocates its data structures at random addresses. Thus, the bug actions such as overflow cannot modify the data structures since they will not be adjacent to heap chunks.

7 Applications

7.1 New Heap Exploitation Techniques

This section discusses the new exploitation techniques in ptmalloc2 during our evaluation. Compared to the old techniques, we determine their uniqueness in two aspects: root causes and capabilities, as shown in Table 6. More information (e.g., elapsed time or models) can be found in §8. To share new attack vectors in ptmalloc2, the techniques are reported and under review in how2heap [61], the de facto standard for exploitation techniques. Most PoC code is available in Appendix A.

Unsorted bin into stack (UBS). This technique overwrites the unsorted bin to link a fake chunk so that it can return the address of the fake chunk (i.e., an arbitrary chunk). This is similar to house of lore [7], which corrupts a small bin to
achieve the same attack goal. However, the unsorted bin into stack technique requires only one kind of allocation, unlike house of lore, which requires two different allocations, to move a chunk into a small bin list. This technique has been added to how2heap [61].

New | Old | Root Causes | New Capability
UBS | HL | Unsorted vs. Small | Only need one size of an object
HUE | HE | Unsorted vs. Free | Does not require a heap address
UDF | FD | Small vs. Fast | Can abuse a small bin with more checks
OCS | OC | Small vs. Unsorted | Does not need a controllable allocation
FDO | FD | Consolidation vs. Fast | Can allocate a non-fast chunk

Table 6: New techniques found by ArcHeap in ptmalloc2, which have different root causes and capabilities from old ones.

House of unsorted einherjar (HUE). This is a variant of house of einherjar, which uses an off-by-one NULL byte overflow and returns an arbitrary chunk. In house of einherjar, attackers should have prior knowledge of a heap address to break ASLR. However, in house of unsorted einherjar, attackers can achieve the same effect without this pre-condition. We named this technique house of unsorted einherjar, as it interestingly combines two techniques, house of einherjar and unsorted bin into stack, to relax the requirement of the well-known exploitation technique.

Unaligned double free (UDF). This is an unconventional technique that abuses double free in a small bin, which is typically considered a weak attack surface thanks to comprehensive security checks. To avoid security checks, a victim chunk for double free should have proper metadata and is tricked into appearing in use (i.e., P bit of the next chunk is one). Since double free doesn't allow arbitrary modification of metadata, existing techniques only abuse a fast bin or tcache, which have weaker security checks than a small bin (e.g., fast-bin-dup in Table 4).

Interestingly, unaligned double free bypasses these security checks by abusing the implicit behaviors of malloc(). First, it reuses the old metadata in a chunk since malloc() does not initialize memory by default. Second, it fills freed space before the next chunk to make the P bit of the chunk one. As a result, the technique can bypass all security checks and can successfully craft a new chunk that overlaps with the old one.

Overlapping chunks using a small bin (OCS). This is a variant of overlapping-chunks (OC) that abuses the unsorted bin to generate an overlapping chunk, but this technique crafts the size of a chunk in a small bin. Unlike OC, it requires more actions (three more malloc() and one more free()) but doesn't require attackers to control the allocation size. When attackers cannot invoke malloc() with an arbitrary size, this technique can be effective in crafting an overlapping chunk for exploitation.

Fast bin into other bin (FDO). This is another interesting technique that allows attackers to return an arbitrary address: it abuses consolidation to convert the type of a victim chunk from the fast bin to another type. First, it corrupts a fast bin free list to insert a fake chunk. Then, it calls malloc_consolidate() to move the fake chunk into the unsorted bin during the deallocation process. Unlike other techniques related to the fast bin, this fake chunk does not have to be in the fast bin. We exclude this PoC due to space limits, but it is available in our repository.

7.2 Different Types of Heap Allocators

We also applied ArcHeap to 10 different allocators with various versions. First, we tested dlmalloc 2.7.2, dlmalloc 2.8.6 [41], and musl [59] 1.1.9, which were used in the related work, HeapHopper [17]. Moreover, we tested other real-world allocators: the latest version of musl (1.1.24), jemalloc [19], tcmalloc [26], Microsoft mimalloc [43] with its default and secure mode (noted as mimalloc-secure), and LLVM Scudo [45]. Furthermore, we evaluated allocators from academia: DieHarder [49], Mesh [56], FreeGuard [64], and Guarder [65]. Applying ArcHeap to other allocators was trivial; we leveraged LD_PRELOAD to use a new allocator. Under the assumption that internal details of the allocators are unknown, we ran ArcHeap with four models specifying each impact (i.e., OC, AC, RW, and AW) one by one to exhaustively explore possible techniques. After 24 hours of evaluation, it found several exploit techniques in seven out of the 10 allocators; the exceptions are Scudo, FreeGuard, and Guarder, owing to their secure design. We also tested ArcHeap with custom allocators from DARPA Cyber Grand Challenge, whose results can be found in §A.1.

Allocators | P I | Impacts of exploitation (OC; AC; RW; AW)
dlmalloc-2.7.2 | ✓ ✓ | OV, WF, DF (N); AF, OV, WF; AF, OV, WF; AF, OV, WF
dlmalloc-2.8.6 | ✓ ✓ | OV, WF, DF (N); OV (N); OV
musl-1.1.9 | ✓ ✓ | OV, WF, DF (N); AF, OV, WF; AF, OV, WF; AF, OV, WF
musl-1.1.24 | ✓ ✓ | OV, WF, DF; AF, OV, WF; AF, OV, WF; AF, OV, WF
jemalloc-5.2.1 | | DF
tcmalloc-2.7 | ✓ | OV, DF; OV, WF, DF; OV; OV
mimalloc-1.0.8 | ✓ | OV, WF, DF; OV, WF; WF
mimalloc-secure-1.0.8 | ✓ | DF
DieHarder-5a0f8a52 | | DF
mesh-a49b6134 | | DF, NO

N: New techniques compared to the related work, HeapHopper [17]; only top three allocators matter. NO: No bug is required, i.e., incorrect implementations. I: In-place metadata, P: ptmalloc2-related allocators.

Table 7: Summary of exploit techniques found by ArcHeap in real-world allocators with their version or commit hash.

As shown in Table 7, ArcHeap discovers various exploitation techniques for ptmalloc2-related allocators: dlmalloc, the ancestor of ptmalloc2, and musl, a libc implementation in embedded systems inspired by dlmalloc. In dlmalloc 2.7.2, dlmalloc 2.8.6, and musl 1.1.9, ArcHeap not only re-discovered all techniques found by HeapHopper, but also newly found the following facts: 1) these allocators are all vulnerable to double free, and 2) an arbitrary chunk is still achievable through overflow in dlmalloc-2.8.6. This was hidden in HeapHopper due to its limitation in handling symbolic-size allocation. Note that we merged special cases of overflow (O1, O1N) into OV to be consistent with HeapHopper [17], and our claims for new techniques are very conservative; we claim discovery of new techniques only when HeapHopper cannot find equivalent or more powerful ones (e.g., AC is more powerful than OC). We further compare ArcHeap with HeapHopper in §8.1. ArcHeap also found that musl
fully found this corner case without having any hint about the
internals of the allocators using its randomized exploration.
PoC is available in Figure A.3.
Mesh: memory duplication using allocations with nega-
tives sizes. A RC H EAP found that if an attacker allocates an
object with negative size, Mesh will return the same chunk
twice (i.e., duplication) instead of NULL.
Figure 7: The number of working PoCs from one source LTS in
various Ubuntu LTS. For example, 56 PoCs were generated from 7.3 Evolution of Security Features
precise, 49 of them work in trusty and xenial, and 45 of them
work in bionic. We applied A RC H EAP to four versions of ptmalloc2 dis-
tributed in Ubuntu LTS: precise (12.04, libc 2.15), trusty
(14.04, libc 2.19), xenial (16.04, libc 2.23), and bionic
has no security improvement in the latest version; all tech- (18.04, libc 2.27). In trusty and xenial, a new security
niques in musl 1.1.9 are still working in 1.1.24. check that checks the integrity of size metadata (refer (1) in
A RC H EAP also successfully found several heap exploit Figure 2a) is backported by the Ubuntu maintainers. To com-
techniques in allocators that are dissimilar to ptmalloc2 (see pare each version, we perform differential testing: we first
Table 7) for the following reasons. First, A RC H EAP’s model, apply A RC H EAP to each version and generate PoCs. Then,
which is based on the common designs in allocators (§2.1), we validate the generated PoCs from one version against other
is generic enough to cover non-ptmalloc allocators. For ex- versions. (see Figure 7).
ample, tcmalloc [26] is aiming at high performance comput- We identified three interesting trends that cannot be eas-
ing, resulting in very different design from ptmalloc2’s (e.g., ily obtained without A RC H EAP’s automation. First, a new
heavy use of thread-local cache). However, tcmalloc still security check successfully mitigates a few exploitation tech-
follows our model: its metadata are placed in the head of a niques found in an old version of ptmalloc2: likely, the libc
chunk (in-place metadata) and consist of linked list pointers maintainer reacts to a new, popular exploitation technique.
(cardinal data). Thus, A RC H EAP can find several techniques Second, an internal design change in bionic rendered the
in tcmalloc including one that can lead to an arbitrary chunk most PoCs generated from previous versions ineffective. This
using overflow (see Figure A.2). It is worth emphasizing that indicates the subtleties of the generated PoCs, requiring pre-
our model only depends on metadata’s appearance, not on cise parameters and the orders of API calls for successful
their generation or management, which introduce more vari- exploitation. However, this does not particularly mean that a
ety in design, making generalization difficult. Second, thanks new version, bionic, is secure; the new component, tcache,
to standardized APIs, A RC H EAP can find exploit techniques indeed makes exploitation much easier, as Figure 7 shows.
even in allocators that are deviant from our model (e.g., je- Third, this new component, tcache, which is designed to im-
malloc). In particular, A RC H EAP discovered techniques that prove the performance [15], weakens the security of the heap
are reachable only using APIs (e.g., double free) although the allocators, not just making it easy to attack but also introduc-
allocators have removed in-place metadata for security. ing new exploitation techniques. This is similarly observed
A RC H EAP helps to find implementation bugs in allocators by other researchers and communities [17, 37].
by showing unexpected exploit primitives in secure alloca-
tors or that can be invokable without a bug. Accordingly, 8 Evaluation
A RC H EAP found three bugs in mimalloc-secure, DieHarder, This section tries to answer the following questions:
and Mesh. We reported our findings to the developers; two of 1. How effective is A RC H EAP in finding new exploitation
them got acknowledged and are patched. It is worth mention- techniques compared to the state-of-the-art technique,
ing that our auto-generated PoC has been added to mimalloc HeapHopper?
as its regression test. In the following, we discuss each issue 2. How exhaustively can A RC H EAP explore the security-
that A RC H EAP found. critical state space?
DieHarder, mimalloc-secure: memory duplication in 3. How effective is delta-debugging in removing redundant
large chunks using double free. A RC H EAP found the heap actions?
technique that allows the duplication large chunks (more than Evaluation setup. We conducted all the experiments on
64K bytes) in the well-known secure allocators, DieHarder Intel Xeon E7-4820 with 256 GB RAM. For seeding, we used
and mimalloc-secure. Interestingly, even though the alloca- 256 random bytes that are used to indicate a starting point of
tors have no direct relationship according to the developer of the state exploration and are not critical, as A RC H EAP tends
mimalloc [43], A RC H EAP found that both allocators are vul- to converge during the state exploration.
nerable to this technique. Their root causes are also distinct:
DieHarder misses verifying its chunk’s status when allocat- 8.1 Comparison to HeapHopper
ing large chunks, unlike for smaller chunks, and mimalloc HeapHopper [17] was recently proposed to analyze existing
checked the status of an incorrect block. A RC H EAP success- exploitation techniques in varying implementations of an allo-
Name Bug Impact Chunks # Txn Size TxnList (A list of transactions) 1 New techniques
FD WF AC Fast 8 {8} M-M-F-WF-M-M
Name Bug Impact Chunks # Txn A RC H EAP HeapHopper
UU O1 AW,RW Small 6 {128} M-M-O1-F
HS AF AC Fast 4 {48} AF-M T F O µ σ T F O µ σ
PN O1N OC Small 12 {128,256,512} M-M-M-F-O1N-M-M-F-F-M FDO WF AC Fast, Large —
HL WF AC Small 9 {100,1000} M-M-F-M-WF-M-M
OC O1 OC Small 8 {120,248,376} M-M-M-F-O1-M UBS WF AC Small 6 3† 0 0 20.2m 5m 0 0 3 ∞ -
UB WF AW,RW Small 7 {400} M-M-F-WF-M
HE O1 AC Small 7 {56,248,512} M-M-O1-F-M HUE O1 AC Small 9 2‡ 0 1 14.4h 8.9h 0 0 3 ∞ -
OCS OV OC Small 9 3 0 0 17.3s 1.2s 0 0 3 ∞ -
# Txn: The number of transactions, M: malloc, F: free UDF DF OC Small 9 3 0 0 19.9s 5.2s 0 0 3 ∞ -
Found 11 0 1 ⇒ #4 0 0 12 ⇒ #0
Table 8: Exploit-specific models for known techniques from
T: True positives, F: False positives, O: Timeout,
HeapHopper. It is worth noting that the results of variants (i.e., µ: Average time, σ : Standard deviation of time
techniques have same prerequisites, but different root causes) are
identical for A RC H EAP with no specific model (marked with † and Table 9: The number of experiments (at most three) that discover
‡ in Table 9 and Table 10) since A RC H EAP neglects the number of new exploitation techniques, the number of found techniques — the
transactions (i.e., # Txn). number after hash (#) sign, elapsed time, and corresponding models.
Briefly, A RC H EAP discovered all four techniques, but HeapHopper
failed to. We omitted FDO, which has a superset model of FD;
cator. Because of its goal, HeapHopper emphasizes complete- therefore, it becomes indistinguishable to FD (see, Table 8).
ness and verifiability, differentiating its method (i.e., symbolic
execution) from A RC H EAP’s (i.e., fuzzing). To overcome the
periment, FDO is excluded because its model is a superset of
state explosion in symbolic execution, HeapHopper tightly
FD; having FDO simply makes A RC H EAP and HeapHopper
encodes the prior knowledge of exploit techniques into its
converge to FD.
models, e.g., the number of transactions (i.e., non-write ac-
HeapHopper fails to identify all unknown exploitation prim-
tions in A RC H EAP), allocation sizes (i.e., guiding the use of
itives with no exploit-specific models (see Table 9). In fact,
specific bins), and even a certain order of transactions. By
it encounters a few fundamental problems of symbolic ex-
relying on this model, it could incrementally perform the
ecution: 1) exponentially growing permutations of transac-
symbolic execution for all permutations of transactions. Un-
tions and 2) huge search spaces in selecting proper size and
fortunately, its key idea—guiding the state exploration with
orders to trigger exploitation. Although HeapHopper demon-
detailed models— limits its capability to only its original
strated a successful state exploration of seven transactions
purpose that validates known exploitation techniques, unlike
with three size parameters (§7.1 in [17]), the search space
our approach can find unknown techniques.
required for discovering new techniques is much larger, ren-
Despite their different purposes, their outputs are equiva- dering HeapHopper’s approach computationally infeasible.
lent to heap exploitation techniques; therefore, we need to On the contrary, A RC H EAP successfully explores the search
show the orthogonality of A RC H EAP and HeapHopper; nei- space using the random strategies, and indeed discovers un-
ther of them can replace the other. To objectively compare known techniques.
both approaches, we performed three experiments: 1 finding
unknown techniques with no exploit-specific model (i.e., ap- 2 Known techniques with partly specified models. We
plying HeapHopper to A RC H EAP’s task), 2 finding known also evaluate the role of exploit-specific models in both ap-
techniques with partly specified models (i.e., evaluating the proaches, which are unavailable in finding new techniques.
roles of specified models in each approach), and 3 finding In particular, we evaluated both systems with partial mod-
known techniques with exploit-specific models (i.e., applying els, namely, the size parameters (+Size) and a sequence of
A RC H EAP to HeapHopper’s task). In the experiments, we transactions (+TxnList), used in HeapHopper (see, Table 8).
considered variants of exploit techniques1 as an equal class To prevent each system from converging to easy-to-find tech-
since both systems cannot distinguish their subtle differences. niques, we tested each model on top of the baseline heap
We ran each experiment three times with a 24-hour timeout model (i.e., Bug+Impact+Chunks).
for proper statistical comparison [40]. We used the default This experiment (i.e., 2 in Table 10) shows that A RC H EAP
option for HeapHopper since it shows the best performance outperforms HeapHopper with no or partly specified models:
in the following experiments (see §A.2). A RC H EAP found five more known techniques than HeapHop-
per in both +Size and Bug+Impact+Chunks. Interestingly,
1 New techniques. We first check if HeapHopper’s ap-
A RC H EAP can operate worse with additional information;
proach can be used to find previously unknown exploita-
A RC H EAP found three fewer techniques in +TxnList. Un-
tion techniques that A RC H EAP found (see, §7.1). To apply
like A RC H EAP, exploit-specific models are beneficial to
HeapHopper, we provided models that specify all sizes for
HeapHopper, finding one more techniques when +TxnList
corresponding bins but limit the number of transactions fol-
is given. This result shows that a precise model plays an
lowing our PoCs, as shown in Table 9. Note that, in theory,
essential role in symbolic execution but not in fuzzing. In
such relaxation is general enough to discover new techniques
short, A RC H EAP is particularly preferable when exploring
given an infinite amount of computing resources. In the ex-
unknown search space, (i.e., finding new techniques), where
1 Exploit an accurate model is inaccessible.
techniques often have the same prerequisite but different root
causes such as UBS and HL. 3 Known techniques with exploit-specific models When
2 Known techniques with partly specified models 3 Known techniques with exploit-specific models.
Bug+Impact+Chunks +Size +TxnList +Size, TxnList
Name A RC H EAP HeapHopper A RC H EAP HeapHopper A RC H EAP HeapHopper A RC H EAP HeapHopper
T F O µ σ T F O µ σ T F O µ σ T F O µ σ T F O µ σ T F O µ σ T F O µ σ T F O µ σ
FD 3 0 0 2.7m 1.2m 3 0 0 3.8m 0.3s 3 0 0 57.1s 27.1s 3 0 0 3.8m 0.9s 3 0 0 14.2m 4.3m 3 0 0 10.7m 2.1m 3 0 0 10.2m 7.2m 3 0 0 23.5s 0.2s
UU 3 0 0 57.9m 40.4m 0 0 3 ∞ - 3 0 0 1.6h 1.1h 0 0 3 ∞ - 0 0 3 ∞ - 0 3 0 3.2h 26.3m 0 0 3 ∞ - 0 3 0 8.2h 13m
HS 3 0 0 2.7m 59.7s 3 0 0 31.4s 0.2s 3 0 0 9.3m 6.1m 3 0 0 31.1s 0.2s 0 0 3 ∞ - 3 0 0 56s 0.8s 0 0 3 ∞ - 3 0 0 28.6s 0.2s
PN 3 0 0 13.3m 24.4s 0 0 3 ∞ - 3 0 0 16.1m 14.9m 0 0 3 ∞ - 3 0 0 1.6h 57m 0 0 3 ∞ - 3 0 0 26m 12.6m 3 0 0 4.3m 1.6s
HL 3† 0 0 20.2m 5m 0 0 3 ∞ - 3 0 0 1.2m 47.3s 0 0 3 ∞ - 2 0 1 13.2h 8.5h 0 0 3 ∞ - 3 0 0 21m 9.4m 2 1 0 2.2m 8.2s
OC 3 0 0 7.1s 5.9s 0 0 3 ∞ - 3 0 0 20s 5.3s 0 0 3 ∞ - 3 0 0 6s 2.4s 3 0 0 22.1h 33.2m 3 0 0 26.6s 34s 3 0 0 3.2m 2s
UB 3 0 0 36.8s 22.8s 3 0 0 21.8s 0.2s 3 0 0 4.7s 3.1s 3 0 0 21.9s 0.3s 3 0 0 24.8s 14.9s 3 0 0 47.6s 0.3s 3 0 0 12.6s 9.5s 3 0 0 19.5s 0.7s
HE 2‡ 0 1 14.4h 8.9h 0 0 3 ∞ - 2 0 1 9.3h 10.4h 0 0 3 ∞ - 0 0 3 ∞ - 0 0 3 ∞ - 0 0 3 ∞ - 0 3 0 6.8m 6.4s
Found 23 0 1 ⇒ #8 9 0 15 ⇒ #3 23 0 1 ⇒ #8 9 0 15 ⇒ #3 14 0 10 ⇒ #5 12 3 9 ⇒ #4 15 0 9 ⇒ #5 17 7 0 ⇒ #6
Table 10: The number of discovered known exploitation techniques and elapsed time for discovery in A RC H EAP and HeapHopper with various
models. In summary, A RC H EAP outperforms HeapHopper with no or partly specified models, e.g., A RC H EAP found five more techniques
with no specific model (Bug+Impact+Chunks). Even though HeapHopper found one more technique than A RC H EAP if exploit-specific models
are available, it suffers from false positives (marked in gray).
exploit-specific models (+Size, TxnList) are provided, HeapHopper's approach works better: it found one more known technique and found four techniques more quickly than ArcHeap (as illustrated in (3) in Table 10). This shows the strength of HeapHopper in validating existing techniques, rendering orthogonality of both tools. We observed one interesting behavior of HeapHopper in this experiment. With more exploit models specified, HeapHopper tends to suffer from false positives because of its internal complexity, as noted in the paper [17]. Despite its small numbers (dozens in three experiments), this shows incorrectness in HeapHopper, resulting in failures to find UU and HE. We confirmed these false positives with HeapHopper's authors. On the contrary, ArcHeap's approach does not introduce false positives thanks to its straightforward analysis at runtime.
This experiment also highlights an interesting design decision of ArcHeap: separating the exploration and reducing phases. With no exploit-specific guidance, ArcHeap can freely explore the search space for finding heap exploitation techniques, and so increase the probability of satisfying the precondition of certain exploitation techniques. For example, if the sequence of transactions of UU (M-M-O1-F) is enforced, ArcHeap should craft a fake chunk within a relatively small period (i.e., between four actions) to trigger the exploit; otherwise, ArcHeap has a higher probability to formulate a fake chunk by executing more, perhaps redundant, actions. However, such redundancy is acceptable in ArcHeap thanks to our minimization phase that effectively reduces inessential actions from the found exploit.
We also confirmed that ArcHeap can find all tcache-related techniques [37] and house-of-force, which HeapHopper fails to find because of an arbitrary size allocation. ArcHeap can find these techniques within a few minutes, as they require fewer than five transactions.

Name  Error message  Version  Xenial  Bionic
C1   corrupted double-linked list  2.3.4  ✓ ✓
C2   corrupted double-linked list (not small)  2.21  ✓
C3   free(): corrupted unsorted chunks  2.11  ✓ ✓
C4   malloc(): corrupted unsorted chunks 1  2.11
C5   malloc(): corrupted unsorted chunks 2  2.11  ✓ ✓
C6   malloc(): smallbin double linked list corrupted  2.11  ✓ ✓
C7   free(): invalid next size (fast)  2.3.4  ✓ ✓
C8   free(): invalid next size (normal)  2.3.4  ✓ ✓
C9   free(): invalid size  2.4  ✓ ✓
C10  malloc(): memory corruption  2.3.4  ✓ ✓
C11  double free or corruption (!prev)  2.3.4  ✓ ✓
C12  double free or corruption (fasttop)  2.3.4  ✓ ✓
C13  double free or corruption (top)  2.3.4  ✓ ✓
C14  double free or corruption (out)  2.3.4  ✓ ✓
C15  malloc(): memory corruption (fast)  2.3.4  ✓ ✓
C16  malloc_consolidate(): invalid chunk size  2.27  - ✓
C17  break adjusted to free malloc space  2.10.1  ✓ ✓
C18  corrupted size vs. prev_size  2.26  ✓ ✓
C19  free(): invalid pointer  2.0.1  ✓ ✓
C20  munmap_chunk(): invalid pointer  2.4  ✓ ✓
C21  invalid fastbin entry (free)  2.12.1
Table 11: Security checks in ptmalloc2 covered by ArcHeap: a unique identifier for a check, an error message for its failure, the version in which the check was first introduced, and the checks covered by ArcHeap in Ubuntu versions.

bug, which is outside of the scope of this work. C2 and C4 require a strict relationship between large chunks (e.g., the sizes of two chunks are not equal but less than the minimum size), which is probably too stringent for any randomization-based strategies.

8.3 Delta-Debugging-Based Minimization
The minimization technique based on delta-debugging is effective in simplifying the generated PoCs for further analysis. It effectively reduces 84.3% of redundant actions from original PoCs (refer to §7.3) and emits small PoCs that contain 26.1 lines on average (see Table 12). Although our minimization is preliminary (i.e., eliminating one independent action per testing), the final PoC is sufficiently small for manual analysis to understand impacts of the found technique.
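The one-action-at-a-time elimination described in §8.3 can be sketched in C as follows. This is a minimal illustration under stated assumptions, not ArcHeap's implementation: the integer action encoding (ACT_MALLOC, ACT_FREE, ACT_WRITE), the oracle_fn callback type, and demo_oracle are hypothetical names introduced here. The oracle stands in for "re-run the candidate PoC and check whether the impact still occurs".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical action encoding, for illustration only. */
enum { ACT_MALLOC, ACT_FREE, ACT_WRITE };

/* Hypothetical oracle: true if the sequence still triggers the impact. */
typedef bool (*oracle_fn)(const int *actions, size_t n);

/* Toy oracle: the "impact" needs a free followed later by a write. */
bool demo_oracle(const int *actions, size_t n) {
    bool seen_free = false;
    for (size_t i = 0; i < n; i++) {
        if (actions[i] == ACT_FREE)
            seen_free = true;
        else if (actions[i] == ACT_WRITE && seen_free)
            return true;
    }
    return false;
}

/* Tries to drop each action in turn; keeps the removal only if the
 * oracle still reports the impact. Shrinks `actions` in place and
 * returns the new length. */
size_t minimize(int *actions, size_t n, oracle_fn still_triggers) {
    assert(actions != NULL && still_triggers != NULL);
    size_t i = 0;
    while (i < n) {
        int removed = actions[i];
        size_t tail = n - i - 1;
        /* Tentatively remove action i by shifting the tail left. */
        memmove(&actions[i], &actions[i + 1], tail * sizeof actions[0]);
        if (still_triggers(actions, n - 1)) {
            n--;                      /* redundant action: keep it removed */
        } else {
            /* Essential action: shift the tail back and restore it. */
            memmove(&actions[i + 1], &actions[i], tail * sizeof actions[0]);
            actions[i] = removed;
            i++;
        }
    }
    return n;
}
```

With the toy oracle, the sequence malloc, free, malloc, malloc, write, malloc shrinks to free, write. Each candidate costs one oracle call, so a sequence of n actions needs n re-executions in a single pass, which matches the "preliminary" one-action-per-test strategy the text describes.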
CROMU_00003  ✓ ✓ ✓ ✓
CROMU_00004  ✓ ✓ ✓ ✓
KPRCA_00002  ✓ ✓ ✓ ✓
KPRCA_00007  ✓ ✓ ✓ ✓
NRFIN_00007
NRFIN_00014  ✓ ✓ ✓ ✓
NRFIN_00024  ✓ ✓ ✓ ✓
NRFIN_00027  ✓ ✓ ✓ ✓
NRFIN_00032  ✓ ✓
Table 13: Exploitation techniques found by ArcHeap in custom allocators of CGC. Except for NRFIN_00007 that implements the page heap, ArcHeap successfully found exploitation techniques in the custom allocators.

Table 14: Results of §8.1 with various search heuristics supported by HeapHopper

A.1 Security of Custom Allocators
To further evaluate the generality of ArcHeap, we applied ArcHeap to all custom heap allocators implemented for the DARPA CGC competition. Since many challenges share the implementation, we selected nine unique ones for our evaluation (see Table 13). We implemented a missing API (i.e., malloc_usable_size()) to get the size of allocated objects and ran the experiment for 24 hours for each heap allocator. Similar to the previous one, no specific model is provided.
ArcHeap found exploitation primitives for all of the tested allocators, except for NRFIN_00007, which implements page heap. Such an allocator looks secure in terms of metadata corruption, but it is impractical due to its memory overheads causing internal fragmentation. During this evaluation, we found two interesting results. First, ArcHeap found exploitation techniques for NRFIN_00032, which has a heap cookie to detect overflows. Although this cookie-based protection is not bypassable via heap metadata corruption, ArcHeap found that the implementation is vulnerable to an integer overflow and could craft two overlapping chunks without corrupting the heap cookie. Second, ArcHeap found the incorrect implementation of the allocator in CROMU_00004, which returns a chunk that is free or whose size is larger than the request. ArcHeap successfully crafted a PoC code resulting in overlapping chunks by allocating a smaller chunk than the previous allocation. This experiment indicates that our common heap designs are indeed universal even in modern and custom heap allocators (§2.1).

A.2 Search Heuristics in HeapHopper
We also evaluated all search heuristics [63] supported by HeapHopper, which can be applied without exploit-specific information; for example, we exclude the strategy called ManualMergepoint, which requires an address in a binary to merge states. As a result, we collected five search heuristics: DFS, which is the default mode of HeapHopper; Concretizer, which aggressively concretizes symbolic values to reduce the number of paths; Unique, which selects states according to their uniqueness for better coverage; Stochastic, which randomly selects the next states to explore; and Veritesting [5], which merges states to suppress path explosion combining static and dynamic symbolic execution.
Unfortunately, as shown in Table 14, none of them was helpful in our evaluation; the default mode (DFS) shows the best performance. First, these heuristics only help to mitigate, but cannot solve the fundamental problems of HeapHopper: path explosion and exponentially growing combinations of transactions. More seriously, they cannot exploit a concrete model from HeapHopper to alleviate the aforementioned issues unlike DFS. This explains DFS's best performance and Stochastic's worst performance. Veritesting failed due to its incorrect handling of undefined behaviors (e.g., NULL dereference) in merged states, which are common in our task assuming memory corruptions.

    // [PRE-CONDITION]
    //   fsz: fast bin size
    //   sz: non-fast-bin size
    //   lsz: size larger than page (> 4096)
    //   xlsz: very large size that cannot be allocated
    // [BUG] buffer overflow
    // [POST-CONDITION]
    //   malloc(sz) == dst
    void* p0 = malloc(sz);
    void* p1 = malloc(xlsz);
    void* p2 = malloc(lsz);
    void* p3 = malloc(sz);

    // [BUG] overflowing p3 to overwrite top chunk
    struct malloc_chunk *tc = raw_to_chunk(p3 + chunk_size(sz));
    tc->size = 0;

    void* p4 = malloc(fsz);
    void* p5 = malloc(dst - p4 - chunk_size(fsz) \
                      - offsetof(struct malloc_chunk, fd));
    assert(dst == malloc(sz));

Figure A.1: An exploitation technique for dlmalloc-2.8.6 returning an arbitrary chunk using an overflow bug that was found by ArcHeap.

    // [PRE-CONDITION]
    //   sz : any size
    // [BUG] buffer overflow
    // [POST-CONDITION]
    //   malloc(sz) == dst
    void* p = malloc(sz);
    // [BUG] overflowing p
    // tcmalloc has a next chunk address at the end of a chunk
    *(void**)(p + malloc_usable_size(p)) = dst;

    // this malloc changes a next chunk address into dst
    malloc(sz);

    assert(malloc(sz) == dst);

Figure A.2: An exploitation technique for tcmalloc returning an arbitrary address that was found by ArcHeap.

    // [PRE-CONDITION]
    //   lsz : large size (> 64 KB)
    //   xlsz: even larger size (>= lsz + 4KB)
    // [BUG] double free
    // [POST-CONDITION]
    //   p2 == malloc(lsz);
    void* p0 = malloc(lsz);
    free(p0);
    void* p1 = malloc(xlsz);

    // [BUG] free 'p0' again
    free(p0);

    void* p2 = malloc(lsz);
    free(p1);

    assert(p2 == malloc(lsz));
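The post-conditions in these PoCs (for example, a second malloc() returning a pointer that is still live) amount to checking whether a newly returned chunk intersects a chunk that is already in use. A minimal sketch of such a check is shown below. It assumes a hypothetical table of live allocations; struct live_chunk and find_overlap are names introduced here for illustration and are not taken from ArcHeap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one live allocation (illustration only). */
struct live_chunk {
    const void *ptr;   /* start address returned by the allocator */
    size_t size;       /* usable size of the chunk in bytes */
};

/* Returns the index of the first live chunk whose byte range intersects
 * [ptr, ptr + size), or -1 if the new chunk is disjoint from all of them.
 * A duplicated chunk (same pointer returned twice) is a full overlap. */
int find_overlap(const struct live_chunk *live, size_t n,
                 const void *ptr, size_t size) {
    assert(live != NULL || n == 0);
    uintptr_t lo = (uintptr_t)ptr;
    uintptr_t hi = lo + size;
    for (size_t i = 0; i < n; i++) {
        uintptr_t clo = (uintptr_t)live[i].ptr;
        uintptr_t chi = clo + live[i].size;
        /* Two half-open ranges intersect iff each starts before the other ends. */
        if (lo < chi && clo < hi)
            return (int)i;
    }
    return -1;
}
```

In a harness, the table would be updated on every malloc() and free(), and find_overlap would be called on each newly returned pointer before it is recorded. The check only compares address ranges, so it does not dereference freed or corrupted memory.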