Patterns of Memory Inefficiency
1 Introduction
Fig. 1. The fraction of the heap that is overhead, including JVM object headers, pointers used to implement delegation, and null pointer slots, is often surprisingly high.
Sizing numbers alone, whether the memory consumption of the process, of the heap, or even of individual data structures [3,18,10], are not sufficient.
The test teams need a quick evaluation of whether deeper code inspections will
be a worthwhile investment of time.
If size alone does not indicate appropriateness or ease of remediation, then
perhaps measures of overhead can. Prior work infers an overhead measure, by
distinguishing the actual data of a data structure from the implementation costs
necessary for storing it in a Java heap [13]. Overheads come from Java Virtual
Machine (JVM) object headers, null pointers, and various overheads associated
with collections, such as the $Entry objects in a linked structure. Fig. 1 shows
this breakdown, of actual data versus overhead, for 34 heap snapshots from
34 real, deployed applications. This fraction is typically quite high, with most snapshots devoting 50% or more of their heap to implementation overheads.
Unfortunately, when presented with this figure, even on a per-structure basis,
these testing teams were left with only a modified form of their original dilemma.
Instead of wondering how large is too large, they now asked how much overhead is too much.
We have found that eleven patterns of memory inefficiency explain the majority
of overhead in Java applications. The patterns can be divided into two main
groups: problems with collections, and problems with the data model of con-
tained items. The goal of this section is to introduce the patterns. In the next sections, we introduce a set of data structure abstractions and an algorithm for detecting and aggregating the occurrences of these patterns in a given data structure.

Table 1. The analysis presented in this paper discovers easy-to-fix problems that quite often result in big gains. These numbers cover the heap snapshots in Fig. 1. The overhead is computed as described in Sect. 3.4.

Fig. 2. From the raw concrete input, a heap snapshot from a Java application, we compute a set of abstract representations. We compute one abstract form, called the ContainerOrContained Model, per data structure in the heap snapshot. The client analyses scan each data structure for problematic memory patterns, making use of this abstract form.
All of these patterns are common, and lead to high amounts of overhead.
Table 2 names the eleven patterns. We use a short identifier for each, e.g. P1
stands for the pattern of empty collections. Table 3 shows that these patterns do
indeed occur frequently across our sample heap snapshots, often multiple times
per snapshot. Sect. 5 gives detailed findings of our detection algorithm.
Each of these patterns has the general nature of a large number of collections
with only a few entries. This situation leads to a high amount of overhead due
to a lack of amortization of the fixed costs of a collection. The fixed cost of a HashSet in the Java standard library, which includes multiple Java objects and many field slots, is around 100 bytes (on a 32-bit JVM). This sounds like
an inconsequential number, but if that set contains only a few entries, then the
relative contribution of that fixed overhead to the total heap consumption is
high. The fixed cost of a ConcurrentHashMap in Java is 1600 bytes!
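To make the amortization argument concrete, the sketch below (our own illustration, not the paper's code) estimates the aggregate fixed cost when a hypothetical application keeps one small HashSet per business object; the per-instance figure is the approximate one quoted above and varies by JVM.

public class FixedCostEstimate {
    // Approximate per-instance fixed cost quoted in the text (32-bit JVM).
    static final long HASHSET_FIXED_BYTES = 100;

    public static void main(String[] args) {
        long instances = 1_000_000;   // hypothetical: one small set per business object
        long entriesPerSet = 2;       // only a few entries each
        long payloadPerEntry = 16;    // hypothetical bytes of actual data per entry

        long payload  = instances * entriesPerSet * payloadPerEntry;
        long overhead = instances * HASHSET_FIXED_BYTES;
        System.out.printf("payload = %d MB, fixed overhead = %d MB (%.0f%% of total)%n",
                payload >> 20, overhead >> 20,
                100.0 * overhead / (payload + overhead));
    }
}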
Two important special cases have very different remedies from the general case of small collections. The first is the fixed-size small collections pattern, where all instances of such collections always contain the same constant number of entries. These may benefit from using array structures rather than a general-purpose collection.
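As a hedged illustration (ours, with made-up names): if a collection is known to always hold exactly two elements, a plain array or a pair of fields avoids paying a general-purpose collection's fixed cost for every instance.

// Hypothetical record that always holds exactly two coordinate values.
class PointPair {
    // Before: List<Double> coords = new ArrayList<>(2);  // ArrayList + Object[] + boxed Doubles per instance
    final double[] coords = new double[2];                 // after: a single array object, no boxing
}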
It is common for data structures to have many small primitive arrays dangling at the leaves of the structure. Most commonly, these primitive arrays contain string data. Rather than storing all the characters once, in a single large array, the application stores each separate string in a separate String object, each of which has its own small primitive character array. The result is often that the overhead due to the header of the primitive character array (12 bytes, plus 4 bytes to store the array size) dominates the overall cost of the data structure. If this data is intended to be long-lived, then it is relatively easy to fix this problem. Java Strings already support this substring optimization.

Table 3. Across the 35 snapshots in Fig. 1, the memory patterns occur frequently. The patterns are also not spurious problems that show up in only one or two applications; many occur commonly, across applications.

                        P1  P2  P3  P4  P5  P6  P7  P8  P9  P10  P11
# pattern occurrences   37  16  45  11   7  19   2  111  5   5   46
# applications          18  12  20   8   6  13   2   29  3   4   19

Fig. 3. An example of the sparse collection pattern, in this case an ArrayList (backed by an Object[] holding two Strings) with eight null slots.
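If many short, long-lived strings must be kept, one hedged alternative (our illustration, not the paper's code) is to pack their characters into a single shared char[] and address each string by offset and length, so the per-array header is paid only once. Note that get() below copies characters on access; the point is compact long-lived storage.

class PackedStrings {
    private final char[] data;       // one shared character array
    private final int[] offsets;     // offsets[i] .. offsets[i+1] delimit string i

    PackedStrings(java.util.List<String> strings) {
        StringBuilder sb = new StringBuilder();
        offsets = new int[strings.size() + 1];
        for (int i = 0; i < strings.size(); i++) {
            offsets[i] = sb.length();
            sb.append(strings.get(i));
        }
        offsets[strings.size()] = sb.length();
        data = new char[sb.length()];
        sb.getChars(0, sb.length(), data, 0);
    }

    String get(int i) {
        return new String(data, offsets[i], offsets[i + 1] - offsets[i]);
    }
}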
The Java standard collections, unlike those of C++ for example, do not support collections with primitive keys, values, or entries. As a result, primitive data must be boxed into wrapper objects that cost more than the data being wrapped. This generally results in a huge overhead for storing such data.
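A minimal sketch of the boxing cost (ours; the Apache Commons and GNU Trove libraries cited later in the paper provide primitive-specialized collections that avoid it):

import java.util.HashMap;
import java.util.Map;

public class BoxedScalars {
    public static void main(String[] args) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int i = 0; i < 1_000_000; i++) {
            // Each entry needs an Integer wrapper for the key, one for the value,
            // and an entry object, so a pair of 4-byte ints can cost tens of bytes.
            counts.put(i, i * 2);
        }
        // A primitive-specialized map (e.g. GNU Trove's TIntIntHashMap) or a pair
        // of parallel int[] arrays stores the same data without wrapper objects.
    }
}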
The Java standard library requires the use of wrappers to modify the behavior of a collection. This includes, for example, making a collection synchronized
or unmodifiable. HashSet is implemented in this way, too: as a wrapper around
HashMap. This is another case of a cost that would be amortized, if the collections
had many entries, but one with a distinct remedy.
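For example (standard java.util calls; the sizing intuition is ours): each wrapper below adds one extra object per collection instance, which is negligible for one large collection but not for millions of tiny ones.

import java.util.*;

public class WrapperCost {
    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        Map<String, String> readOnly = Collections.unmodifiableMap(config); // +1 wrapper object
        Set<String> names = Collections.synchronizedSet(new HashSet<>());   // +1 wrapper object
        // HashSet itself is implemented as a wrapper around an internal HashMap,
        // so even the unwrapped set above already pays for two objects.
    }
}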
Java data models often require high degrees of delegation. For example, an em-
ployee has attributes, such as a name and email address. In Java, due to its
single-inheritance nature, one is often forced to delegate the attributes to side
objects; for example, the developer may wish to have these two attributes ex-
tend a common ContactInformation base class. The frequent result is a highly
delegated web of objects, and the repeated payment of the object “tax”: the
header, alignment, and pointer costs.
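A hypothetical sketch of the delegated model described above, next to a flattened alternative; the class and field names are illustrative only.

// Delegated: each Employee drags along two side objects, each paying its own
// object header and alignment, plus two pointer slots in Employee itself.
class ContactInformation { }
class EmailAddress extends ContactInformation { String address; }
class PhoneNumber   extends ContactInformation { String number;  }

class Employee {
    EmailAddress email;
    PhoneNumber  phone;
}

// Flattened: one object per employee, at the cost of losing the shared base type.
class FlatEmployee {
    String emailAddress;
    String phoneNumber;
}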
Fig. 4. An example of the nested collections pattern: a HashMap (with its HashMap$Entry[] and HashMap$Entry objects) whose String keys map to HashSet values, each inner HashSet wrapping its own HashMap.
This pattern covers the common case of nested collections. One can use a HashMap of HashSets to model a map with multiple values per key. Similarly, a HashMap of ArrayLists can be used to represent a map whose keys require multiple objects to implement. Fig. 4 portrays a HashMap of HashSets in which each String key maps to a set of values, implemented using a HashSet. For the current paper, we only cover these two important cases of nested collections: HashMaps with either HashSet or ArrayList keys or values.
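A small sketch (ours) of the first case: a multi-map built as a HashMap of HashSets, where every key pays the fixed cost of a whole inner HashSet, which itself wraps a HashMap.

import java.util.*;

public class NestedCollections {
    public static void main(String[] args) {
        Map<String, Set<String>> membersByGroup = new HashMap<>();

        String group = "group-42";               // illustrative key
        Set<String> members = membersByGroup.get(group);
        if (members == null) {                   // one inner HashSet per key, however small
            members = new HashSet<>();
            membersByGroup.put(group, members);
        }
        members.add("alice");
    }
}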
The last non-collection overhead pattern comes from wrappers around primitive arrays. These include the String and ByteArrayOutputStream objects whose main goal is to serve as containers for primitive data. This is a cost related to P5, small primitive arrays, but one that is outside of developer control; hence we treat it separately. The Java language does not allow primitive arrays to be stored inline with scalar primitive data.4 We include this pattern for completeness, even though in practice developers would have trouble implementing an easy fix to it.

4 Java supports neither structural composition nor value types, those features of C and C# that permit a developer to express that one object is wholly contained within another. At least it can be done manually, in the case of scalar data; this is simply not possible for array data.
Fig. 5. The dominator forest is unhelpful both for detecting and for sizing memory
problems. No traversal of a dominator tree (e.g. Data Structure 1 or 2) will encounter
shared sub-structures. A collection, even if it dominates none of its constituents, should
still be considered non-empty.
The bulk of objects in a data structure takes on one of six roles. These roles
are summarized in Table 4. Consider a common “chained” implementation of a
hashmap, one that uses linked list structures to handle collisions. Instances of
this hashmap, in Java, are stored in more than one object in the runtime heap.
One of these will be the entry point to the collection, e.g. of type HashMap, and
the rest of the objects will implement the details of the linking structure. These
two roles, the Head of Container, and Implementation Details, are common to
most implementations of structures that are intended to contain an indefinite
number of items.
Underneath the chains of this hashmap will be the contained data structures.
These constituent structures have a similar dichotomy of roles: there are the
Head of Contained structures, and, for each, the implementation details of that
contained structure. Consider the example from earlier (Sect. 2.6): an Employee
data structure that has been implemented to delegate some of its functionality
to other data types, such as PhoneNumber and EmailAddress. That these latter
two pieces of information have been encoded as data types and hence (in Java)
manifested as objects at runtime, is an implementation detail of the Employee
data structure. Another role comes at the interface between the container’s im-
plementation details and the head of the contained items. For example, in a
chained hashmap, the “entry” objects (HashMap$Entry in the Java standard collections library) will serve in the role of Container-Contained Transition objects. This role is crucial to correctly detect some of the patterns (shown in
Sect. 4). The final important role, Points to Primitive Array, corresponds to
those objects that serve as wrappers around primitive arrays.
We introduce the ContainerOrContained abstraction, which assigns each object in a data structure to at least one of these six roles. Objects not stored in a collection are unlikely to be the source of memory problems, and hence do not
receive a role in this model. Given an object that is at the root of a data struc-
Table 4. The six roles of the ContainerOrContained model, with examples.

Role                             Examples
Head of Container                HashMap, Vector
Head of Contained                keys and values of maps
Container-Contained Transition   HashMap$Entry
Points to Primitive Array        String
Collection Impl. Details         HashMap$Entry[]
Contained Impl. Details          everything else
!"
!#
!$
– The start and stop criteria. The boundaries of an occurrence, the details of
which vary from pattern to pattern, but can always be expressed in terms of
roles. For example, whether a HashMap instance is an occurrence of the empty
collection pattern depends on the objects seen in a traversal of the subgraph
bounded on top (as one traverses from the roots of the data structure being
scanned) by a Head of Container, and bounded on the bottom by the Heads
of Contained items.
– The accounting metrics. Each pattern differs in what it needs to count, as
the scan proceeds. The empty collection pattern counts the number of Heads
of Contained. The sparse references pattern counts a pair of numbers: the
number of valid references, and the number of null slots.
– The match criterion. The empty collections pattern matches that HashMap
if the number of Heads of Contained objects encountered is zero.
Observe that the empty collections pattern cannot count the number of Transitionary objects (HashMap$Entry in this case), for two important reasons: 1)
because some collections use these Transitionary objects as sentinels; and 2)
sharing may result in two Transitionary objects referencing a single Head of
Contained object.
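One possible way (our reconstruction, not the paper's code) to encode these criteria is a small interface over the roles of Table 4; HeapObject and Counters are hypothetical helper types, used again in the algorithm sketch below.

// The six roles of Table 4.
enum Role { HEAD_OF_CONTAINER, HEAD_OF_CONTAINED, CONTAINER_CONTAINED_TRANSITION,
            POINTS_TO_PRIMITIVE_ARRAY, COLLECTION_IMPL_DETAILS, CONTAINED_IMPL_DETAILS }

// Hypothetical view of one object in the heap snapshot.
interface HeapObject {
    Role role();
    long overheadBytes();              // header, alignment, and pointer costs
    long dataBytes();                  // actual primitive data held by this object, if any
    Iterable<HeapObject> children();   // outgoing references within the data structure
}

// Per-scan accounting state; each pattern uses only the fields it needs.
class Counters {
    long overheadBytes;
    long dataBytes;
    long headsOfContained;
    long nullSlots;
}

// A pattern expressed through the criteria described above.
interface MemoryPattern {
    boolean startsAt(Role role);               // start criterion
    boolean stopsAt(Role role);                // stop criterion
    void account(HeapObject obj, Counters c);  // accounting metrics
    boolean matches(Counters c);               // match criterion
}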
Each match of a pattern would, without any aggregation, result in one pattern occurrence. This would result in needlessly complicated and voluminous reports. Instead, as the algorithm traverses the data structure, it detects which ContainerOrContained region it is currently in. Any occurrences of a pattern while the traversal is in a particular region are aggregated into that region. The output is a set of encountered regions, each with a set of occurrences. Each occurrence will be sized according to the accounting metrics of the pattern. Fig. 7 gives Java-like pseudocode for the algorithm.

7 A set of patterns can be scanned for simultaneously. The description in this section can be generalized straightforwardly to handle this.
For example, a scan for occurrences of the empty collections pattern would count the total overhead of the collection’s Implementation Details, would match if the number of Heads of Contained is zero, and, upon a match, would accumulate that overhead into the ContainerOrContained region in which the empty collection is situated.
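Since Fig. 7 is not reproduced here, the following is a rough reconstruction (ours, under the assumptions of the interface sketched earlier) of the scan-and-aggregate step; each match is charged to the region, represented by its head object, in which it was found.

import java.util.HashMap;
import java.util.Map;

class PatternScanner {
    Map<HeapObject, Long> scan(HeapObject root, MemoryPattern p) {
        Map<HeapObject, Long> overheadByRegion = new HashMap<>();
        scan(root, root, p, overheadByRegion);
        return overheadByRegion;
    }

    private void scan(HeapObject obj, HeapObject regionHead, MemoryPattern p,
                      Map<HeapObject, Long> overheadByRegion) {
        // Entering a Head of Container or Head of Contained starts a new region.
        if (obj.role() == Role.HEAD_OF_CONTAINER || obj.role() == Role.HEAD_OF_CONTAINED) {
            regionHead = obj;
        }
        if (p.startsAt(obj.role())) {
            Counters c = new Counters();
            account(obj, p, c);                       // apply the accounting metrics
            if (p.matches(c)) {                       // aggregate into the enclosing region
                overheadByRegion.merge(regionHead, c.overheadBytes, Long::sum);
            }
        }
        for (HeapObject child : obj.children()) {
            scan(child, regionHead, p, overheadByRegion);
        }
        // (Visited-set handling for shared or cyclic structures is omitted from this sketch.)
    }

    // Walk downwards from a start object until the pattern's stop criterion is met.
    private void account(HeapObject obj, MemoryPattern p, Counters c) {
        p.account(obj, c);
        for (HeapObject child : obj.children()) {
            if (!p.stopsAt(child.role())) {
                account(child, p, c);
            }
        }
    }
}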
P5: Small Primitive Arrays. An occurrence of the small primitive arrays pattern starts at a Points to Primitive Array object, and the traversal stops when the primitive array is reached. The client counts the per-object overhead of the primitive array and the size of the actual primitive data. A match happens when the amount of data is small compared to the overhead.
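For instance, P5 might be expressed against the interface sketched earlier roughly as follows (our sketch; the match threshold and the Counters fields it uses are assumptions):

class SmallPrimitiveArrayPattern implements MemoryPattern {
    public boolean startsAt(Role r) { return r == Role.POINTS_TO_PRIMITIVE_ARRAY; }

    public boolean stopsAt(Role r)  { return false; }  // primitive arrays are leaves, so the walk ends there

    public void account(HeapObject obj, Counters c) {
        c.overheadBytes += obj.overheadBytes();        // wrapper header, array header, length slot
        c.dataBytes     += obj.dataBytes();            // the actual primitive data
    }

    public boolean matches(Counters c) {
        return c.dataBytes < c.overheadBytes;          // "small": the data is dwarfed by the overhead
    }
}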
Table 5 presents the time to detect8 all eleven pattern categories in the heap snapshots from Fig. 1. While the computation time is sometimes less than 2 minutes for heaps with tens of millions of objects (e.g. Applications S8 and S9, with 57 and 34 million objects, respectively), there are also extreme cases where the computation time is high. We are currently working on a few optimization possibilities to address the slow analysis time.

8 Using Sun's Linux JVM 1.6.0_13 (64-bit, with the -server flag) on a 2.66GHz Intel Core 2 Quad CPU.

Fig. 7. The algorithm that, given a pattern and a data structure, produces a set of aggregated pattern occurrences.
Table 5. The number of objects and computation time to detect all the patterns in
the heap snapshots from Fig. 1.
Application S7. This application has a heap size of 652MB of which 517MB
is overhead. The application suffers from three easy-to-fix problems in three collections. As shown in the S7 rows of Table 6, these belong to the sparse, small, and fixed-size collection patterns. One of the collections, a HashMap, suffers simultaneously from both the sparse (P4) and the small (P3) collection patterns.
The small collections are likely to be sparse, too. The tool has split out the costs
of these two separate problems, so that the user can gauge the benefit of tackling
these problems, one at a time: these two problems explain 92MB and 73MB of
overhead, respectively.
9 Their names are, unfortunately, confidential.
Table 6. Each row, a pattern occurrence, was selected as a case that would be easy
for the developers to fix. As Sect. 6 shows, the developer needn’t ever fix more than 20
separate problems (and often far fewer than that) to address overhead issues.
The tool also specifies the remedies available for each pattern. For example,
the small sparse HashMaps can be remedied by passing an appropriate number to
the constructor. In addition to reporting the occurrence’s overhead (as shown in
each row of Table 6), the tool (not shown) also reports the occurrence count, and
a distribution of the sizes of the collection instances that map to that pattern
occurrence. This data can be helpful in choosing a solution. For example, if 90%
of the HashMaps that map to that occurrence have only 6 entries, then this, plus
a small amount of slop, is a good figure to pass to the HashMap constructor. For
now, the tool gives these numbers, and a set of known general solutions to the pattern of each occurrence. Though the arithmetic necessary to gauge the right solution is straightforward, we feel that it is something the tool should do.
Work is in progress to do this arithmetic automatically.
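As a hedged example of that arithmetic: with HashMap's default load factor of 0.75, roughly ceil(expected / 0.75) table slots are needed to hold the expected entries without resizing, so for the 6-entry case above one might write:

import java.util.HashMap;
import java.util.Map;

class SizedMaps {
    static Map<String, String> newAttributeMap() {
        int expectedEntries = 6;                                  // from the reported size distribution
        int capacity = (int) Math.ceil(expectedEntries / 0.75);   // 8, for the default 0.75 load factor
        return new HashMap<>(capacity);                           // vs. the default capacity of 16
    }
}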
Application S3. The application uses 362MB for actual data and 2.26GB for overhead. This application suffers from several boxed scalar collection pattern occurrences in HashMaps, accounting for 573MB of overhead. There are easy-to-use, off-the-shelf solutions to this problem, including those from the Apache Commons [1] and GNU Trove [5] libraries.
The tool also finds a large occurrence of a wrapped collections pattern. This
region is headed by a collection of type Collections$UnmodifiableMap; in the
Java standard libraries, this is a type that wraps around a given Map, changing
its accessibility characteristics. The tool (not shown) reveals an occurrence count
of 2,030,732 that accounts for 108MB of overhead. The trivial solution to this
problem is to avoid, at deployment time, the use of the unmodifiable wrapper.
Application S4. The application spends 407MB on actual data and 2.47GB on overhead. The tool’s main finding is a HashMap of ArrayLists pattern occurrence which accounts for 422MB of overhead. In this case, the single outer map had many inner, but relatively small, lists. Though not small enough to fire the small collections pattern, this case fires the nested collections pattern. In general for this pattern, if the outer collection has a significantly larger number of elements than the inner collections, the memory overhead may be reduced by switching the order of the collections’ nesting. The benefit comes as a consequence of greatly reducing the total number of collection instances.
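A hypothetical illustration of that switch, assuming a data set with millions of users but only a handful of roles (names and types are ours):

import java.util.*;

class RoleIndex {
    // Before: one small ArrayList per user, i.e. millions of tiny collection instances.
    Map<Long, List<String>> rolesByUser = new HashMap<>();

    // After: one Set per role, i.e. only a handful of (large) collection instances,
    // since there are far fewer roles than users. Whether this fits the application's
    // access pattern is, of course, case-specific.
    Map<String, Set<Long>> usersByRole = new HashMap<>();
}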
This characterization can help the community better determine what sort of ad-
ditional tools, optimizations, or language features might lead to more efficient
memory usage. There are three broad questions we would like to assess: 1) Do
the patterns we have identified explain a significant amount of the overhead?
2) How often do these patterns occur in applications? and 3) Do the patterns
provide the best candidates for memory footprint reduction? To achieve this we introduce a set of metrics, and apply them to a set of fifteen real-world applications. In summary, applying the metrics to the application set shows that the memory patterns we detect do indeed explain large sources of overhead in each application.
In Fig. 1 we present examples of applications with large footprint overhead.
For our study, we select from these the fifteen applications which make the least
efficient use of memory, applications where more than 50% of the heap is over-
head. Note that in the reporting of results, we are careful to distinguish between a
pattern category, such as Empty Collections, and a pattern occurrence, an instan-
tiation of a pattern category at a particular context. In all of the computations
of overhead explained by the tool, we consider only pattern occurrences which
account for at least 1% of the application overhead. We choose this threshold
so that we report and ask the user to remediate only nontrivial issues. We do
not want to burden the user with lots of insignificant findings (i.e. of just a few
bytes or kilobytes). To ensure meaningful comparisons, the computation of total overhead in the heap is based on traversing the entire heap and tabulating lower-level overheads like object header costs, as described in Sect. 3.4 (i.e. it is not dependent on pattern detection). Thus it includes all sources of overhead, from trivial and nontrivial cases alike.
Table 1 gives a summary of the coverage across all the heaps we analyzed
(not just the subset of fifteen with high overhead). In almost half of the heaps
the tool is able to explain more than 80% of the overhead. Fig. 8 gives a more
detailed look at the fifteen highest overhead heaps. The third bar shows us the
percentage of overhead explained by 100% of the discovered pattern occurrences.
The remaining, unaccounted-for overhead can be useful as well in helping us identify important missing patterns. In continuing work we are looking at the unexplained part of the heap, and are identifying which portion is the result of detecting trivial occurrences of known patterns, and which portion represents new patterns that need to be encoded.
Fig. 8. (a) The percentage of total overhead explained by the top N% of pattern occurrences, for the fifteen highest-overhead heaps (bar chart omitted; application IDs on the x-axis, 0–100% on the y-axis). (b) Number of pattern occurrences in the top N%:

Application   1   2   3   4   6   7   8   9  10  11  12  14  17  18  29
top 10%       1   2   2   2   2   1   1   1   2   2   1   2   1   2   1
top 40%       2   6   6   8   8   3   2   2   5   6   2   8   4   8   4
top 100%      4  14  13  20  18   6   3   3  11  13   4  18   8  19  10
Fig. 9. (a) The percentage of overhead explained by the pattern categories represented in the top N=10% of pattern occurrences. (b) The percentage of overhead explained by the pattern categories represented in the top N=40% of pattern occurrences. (Bar charts omitted; application IDs on the x-axis, 0–100% on the y-axis, with categories P1–P6 in the legend.)
We have seen that the top findings reveal the main categories of memory inefficiency in a given application. The next question is how often the same patterns are seen across different applications. Does a small number of pattern categories explain the bulk of the overhead in most applications, or is there more variety among applications? To study this across multiple systems we introduce a pairwise similarity metric.
Definition 7. Pattern category similarity measures the degree to which two ap-
plications contain the same pattern categories. The similarity metric reports the
ratio of the number of pattern categories common to both applications to the total
number of pattern categories detected in the two applications:
CS = \frac{2\,|PC_1 \cap PC_2|}{|PC_1| + |PC_2|}
The value of the pattern category similarity metric lies in [0, 1]. A value of 1 means that the same pattern categories have been detected in both applications. The lower the value of the similarity metric, the greater the range of problems identified across the two applications.
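For concreteness, a small sketch (ours) computing CS for two applications, each represented by the set of pattern category identifiers detected in it:

import java.util.HashSet;
import java.util.Set;

class CategorySimilarity {
    // CS = 2 * |PC1 ∩ PC2| / (|PC1| + |PC2|); assumes at least one category was detected.
    static double similarity(Set<String> pc1, Set<String> pc2) {
        Set<String> common = new HashSet<>(pc1);
        common.retainAll(pc2);
        return 2.0 * common.size() / (pc1.size() + pc2.size());
    }
}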
Fig. 10 reports the similarity metric, computed pairwise for the heaps in our
experimental set. The darker the gray shade, the more common the problems
detected between the two applications. To understand how a given heap com-
pares to each of the other heaps, look at both the row and column labeled with
the given heap (i.e. an L-shaped region of cells).
Table 7. Number of pattern categories represented in the top 10% and the top 40% of pattern occurrences.

                          # categories
                         min   median   max
top 10% of occurrences    1      1       2
top 40% of occurrences    1      3       5
Fig. 10. Pattern category similarity. The darker the gray shade, the more common are the pattern categories found in the two applications. (Heatmap omitted; rows and columns are labeled with the application IDs.)
There is no single application that presents completely different pattern categories compared with all the other applications, though application 8 is the least similar to the others. Eleven applications out of fifteen have half of their categories in common with at least 9 other applications (i.e. CS ≥ 0.5). From the results we conclude that the same memory problems are frequently seen across multiple applications.
The current work addresses patterns of memory usage that have a high represen-
tational overhead, using a definition of overhead based on infrastructure costs
such as object headers and collection implementations. Developers also intro-
duce inefficiencies in the way they represent the data proper. For example, in
our experience we have seen many applications with large amounts of duplicate
immutable data, scalar data such as enumerated types that are represented in
text form, or classes carrying the cost of unused fields from overly general base
classes. In future work we would like to address these inefficiencies by encoding
data patterns into the same framework. Much of the existing analysis approach
can be easily extended to support recognition of these patterns.
7 Related Work
8 Conclusions
In Java, it is difficult to implement a data model with only a single data type.
The language’s lack of support for composition and unions forces us to delegate
functionality to side objects. The sometimes perverse focus on reuseability and
pluggability in our frameworks [11] encourages us to to favor delegation over
subclassing [4,8]. For these reasons, classes are a low level manifestation of intent.
In Java, even the most basic of data types, the string, requires two types of
objects and delegation: a String pointing to a character array.
There is a wide modeling gap between what programmers intend to repre-
sent, and the ways that the language and runtime encourage or force them to
store this information. As a consequence, most Java heaps suffer from excessive
implementation overhead. We have shown that it is possible to identify a small
set of semantic reasons for the majority of these overheads in Java heaps. In
the future, we would like to explore this modeling gap more thoroughly. It is possible that a more rigorous study of the gap will yield opportunities to close it, for important common cases. Why must we use collections explicitly, to express
concerns that are so highly stylized: relationships, long-lived repositories, and
transient views?
References
1. Apache: Commons library, http://commons.apache.org
2. De Pauw, W., Jensen, E., Mitchell, N., Sevitsky, G., Vlissides, J., Yang, J.: Visual-
izing the execution of Java programs. In: Software Visualization, State-of-the-art
Survey. Lecture Notes in Computer Science, vol. 2269. Springer-Verlag (2002)
3. Eclipse Project: Eclipse memory analyzer, http://www.eclipse.org/mat/
4. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns: Elements of
Reusable Object-Oriented Software. Addison-Wesley (1994)
5. GNU: Trove library, http://trove4j.sourceforge.net
6. Hill, T., Noble, J., Potter, J.: Scalable visualizations of object-oriented systems
with ownership trees. J. Vis. Lang. Comput. 13(3), 319–339 (2002)
7. Jump, M., McKinley, K.S.: Cork: dynamic memory leak detection for garbage-
collected languages. In: Symposium on Principles of Programming Languages
(2007)
8. Kegel, H., Steimann, F.: Systematically refactoring inheritance to delegation in
a class-based object-oriented programming language. In: International Conference
on Software Engineering (2008)
9. Lengauer, T., Tarjan, R.E.: A fast algorithm for finding dominators in a flowgraph.
ACM Trans. Program. Lang. Syst. 1(1), 121–141 (1979)
10. Mitchell, N., Schonberg, E., Sevitsky, G.: Making sense of large heaps. In: The
European Conference on Object-Oriented Programming. vol. 5653, pp. 77–97.
Springer-Verlag Berlin Heidelberg (2009)
11. Mitchell, N., Schonberg, E., Sevitsky, G.: Four trends leading to Java runtime bloat.
IEEE Software 27, 56–63 (2010)
12. Mitchell, N., Sevitsky, G.: Leakbot: An automated and lightweight tool for diag-
nosing memory leaks in large Java applications. In: The European Conference on
Object-Oriented Programming. vol. 2743. Springer, Heidelberg (2003)
13. Mitchell, N., Sevitsky, G.: The causes of bloat, the limits of health. In: Object-
oriented Programming, Systems, Languages, and Applications. pp. 245–260. ACM,
New York, NY, USA (2007)
14. Novark, G., Berger, E.D., Zorn, B.G.: Efficiently and precisely locating memory
leaks and bloat. In: Programming Language Design and Implementation. pp. 397–
407. ACM, New York, NY, USA (2009)
15. Printezis, T., Jones, R.: GCspy: an adaptable heap visualisation framework. In:
Object-Oriented Programming, Systems, Languages, and Applications. pp. 343–
358. ACM, New York, NY, USA (2002)
16. Shacham, O., Vechev, M., Yahav, E.: Chameleon: adaptive selection of collections.
In: Programming Language Design and Implementation. pp. 408–418. ACM, New
York, NY, USA (2009)
17. Xu, G., Rountev, A.: Precise memory leak detection for Java software using con-
tainer profiling. In: International Conference on Software Engineering. pp. 151–160.
ACM, New York, NY, USA (2008)
18. YourKit LLC: YourKit profiler, http://www.yourkit.com