0% found this document useful (0 votes)
45 views5 pages

A New Garbage Collection Technique For Server Environment: Xiting Xie

The document describes a new garbage collection technique called Two-Dimensional Generational Garbage Collection (2D-GC). 2D-GC aims to achieve low tenuring costs, low scavenging costs for medium-lived objects, and low generation size. It does this by arranging the heap into two dimensions - a young and old generation vertically, and within the young generation horizontally separating short-lived and medium/long-lived objects. New objects are allocated to the appropriate horizontal generation. When generations fill up it triggers minor collections, with survivors promoted to the old generation. This allows medium-lived objects more time in younger generations before premature promotion, improving costs.

Uploaded by

kruemeL1969
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
45 views5 pages

A New Garbage Collection Technique For Server Environment: Xiting Xie

The document describes a new garbage collection technique called Two-Dimensional Generational Garbage Collection (2D-GC). 2D-GC aims to achieve low tenuring costs, low scavenging costs for medium-lived objects, and low generation size. It does this by arranging the heap into two dimensions - a young and old generation vertically, and within the young generation horizontally separating short-lived and medium/long-lived objects. New objects are allocated to the appropriate horizontal generation. When generations fill up it triggers minor collections, with survivors promoted to the old generation. This allows medium-lived objects more time in younger generations before premature promotion, improving costs.

Uploaded by

kruemeL1969
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 5

2011 3rd International Conference on Computer Modeling and Simulation (ICCMS 2011)

A New Garbage Collection Technique for Server Environment

Xiting Xie
Zhong Xing Telecommunication Inc.
Nanjing, China

AbstractConventional generational garbage collection threshold for each object; (2) providing an extra generation
techniques cannot achieve low tenuring costs, low scavenging (e.g., aging space); (3) enlarging young generation size.
costs on the medium-lived objects and low generation size at the These approaches give medium-lived objects more time to
meantime. In this paper, we will propose a novel Two- die in the youngest generation, which avoid premature
Dimensional Generational Garbage Collector (2D-GC) which promotion problem to some extent. However, the first two
can meet these three requirements at once. By its name, the approaches will bring multiple-scavenging problem, that is,
2D-GC arranges the heap in two dimensions. Our the medium-lived object will be scavenged for multiple times
experimental results show that 2D-GC can achieve the lowest before its death. Therefore, this approach will increase
tenuring cost and scavenging cost without enlarging generation
scavenging cost of the medium-lived object. The last
size.
approach will cost a lot of space, which can translate to long
Keywords-Java Virtual Machine; Garbage Collection pause time of a minor collection.
In summary, conventional generational garbage collection
techniques can not achieve low tenuring costs, low
I. INTRODUCTION
scavenging costs on the medium-lived objects and low
One of the most useful language features of modern generation size at the meantime. In this paper, we will
Object-Oriented programming languages is garbage propose a novel Two-Dimensional Generational Garbage
collection (GC).GC improves programming productivity by Collector (2D-GC) which can meet these three requirements
reducing errors resulting from explicit memory management. at once. The 2D-GC leverages Xian et.al.s research on
Moreover, GC underpins sound software engineering remote/local objects [10][23] to dynamically identify
principles of abstraction and modularity. GC leads to clusters of similar-lifespan objects. Xian et.al defined key
cleaner code since memory management concerns are no objects and segregate objects into groups based on temporal
longer cluttered with the programming logic. locality and lifespan similarity. Similarly, the 2D-GC
To date, the best performing garbage collectors in wide arranges the heap in two dimensions: (1) heap is divided
use are generational collectors. These collectors rely on the into two vertical generations: a young generation and an old
fact that most objects die young. Generational GC partitions generation; (2) the young generation is divided into two
the heap into age-based generations of objects, where age is horizontal generations: one for accommodating short-lived
measured in the amount of allocation. New objects are 1 objects and the other for medium-lived and long-lived
allocated in the youngest generation. Rather than collecting objects. New objects are allocated in either one horizontal
the entire heap and incurring the cost of copying all live generation, depending on the estimated lifetimes of these
objects, generational collectors collect the youngest objects. Survivors of either horizontal generation will be
generation (i.e., minor collection or nursery collection), promoted to the old generation. The 2D-GC can be easily
place survivors in the next older generation, and only collect extended to supply more horizontal generations in the young
older generations (i.e., major collection or full collection) if generation, or tailored to a conventional generational
necessary. collector.
However, the generational garbage collection has In the remainder of the paper, we will present some
difficulty of treating medium-lived objects. Medium-lived related work and then describe our 2D-GC. Next we discuss
objects are the objects which survive at least one minor its design issues and its implementation in Sun's JVM. Then
collection but die before the next full collection. Tenuring we give experimental results and finally conclude this paper.
medium-lived objects too early will cause premature
promotion problem, which increases tenuring cost and also II. TWO DIMENSIONAL GENERATIONAL GARBAGE
causes the older generation to fill up too soon, resulting in a COLLECTION (2DGC)
full collection with a longer pause time. Also, premature
promotion has an adverse effect on the user program's Figure 1 illustrates the diagram of 2D-GC. 2D-GC
locality, since it is likely that most program accesses will be arranges the heap in two dimensions: (1) heap is divided into
to younger objects. The premature promotion problem can be a young generation and an old generation vertically; (2) the
alleviated by the following approaches: (1) setting a tenuring young generation is divided into two horizontal generations:

978-1-4244-9243-5/11/$26.00 C 2011 IEEE V2-630


2011 3rd International Conference on Computer Modeling and Simulation (ICCMS 2011)

Gen-s for accommodating short-lived objects and Gen-o for usually short-lived) and non-short-lived sites. During an
non-short-lived objects (i.e., medium-lived and long-lived actual run, the runtime system uses the prediction advice to
objects). New objects are allocated in either one horizontal identify short-lived objects. Thus, no monitoring overhead is
generation, depending on the estimated lifetimes of these incurred during the production run of the application. Online
objects. When the Gen-s is filled up, it will trigger a Gen-s profiling approaches can be accurate but they are tedious and
collection. When the Gen-o is filled up, it will trigger a Gen- inconsistent in different production runs (e.g., different
o collection. Survivors of either collection will be promoted input). In online sampling, we do not profile all object
to the old generation. lifetimes. Instead, we compute survival rate of allocation
sites during every collection. By the sample information, we
can guide collector to identity short-lived (or non-short-lived)
objects. The accuracy of online sampling approaches highly
depends on the sampling interval.
The other way, that we plan to use, is to segregate objects
based on type and connectivity information. Our segregation
scheme is based on the following two observations:
1. The type with higher allocation rate tends to
have shorter lifetime. This observation
conforms to the prolific hypothesis [20] that
prolific types (i.e., objects types that are
allocated most frequently) have short lifetimes.
An intuitive basis for this hypothesis is that if
this were not true, an application that
continuously allocates dynamic memory
would have unsustainable memory
requirements, as it would keep creating objects
Figure 1. 2D-GC Diagram
of prolific types at a fast pace without
reclaiming sufficient space. Put in another way,
III. DESIGN ISSUES OF 2D-GC the hypothesis predicts that the objects of
prolific types die younger than objects of non-
In the design of 2D-GC, several important and prolific types. Shuf et al. [20] validated this
interrelated challenges need to be solved: hypothesis and found that the relative survival
1. Object segregation. Object segregation is directly rates are usually lower for objects of prolific
related with object allocation. Objects are types than for objects of non-prolific types.
segregated based on lifetime classification. Short- We also found that most of the dead objects
lived objects are allocated in Gen-s and the rest and most of the garbage comes from short-
objects are allocated in Gen-o. We need a cost- lived objects of prolific types.
effective approach to classify objects into short- 2. Connected objects tend to die together. Hirzel
lived and non-short-lived. et al. [19] investigated the relationship
2. Heap organization. What is the optimal size ratio between lifetime and heap connectivity. They
between Gen-s and Gen-o? found that on average linked objects have a
3. Intergenerational pointers. Intergenerational very high probability (about 70-80.4%) of
pointers include pointers from the old generation dying at the same time.
to the young generation, and pointers between Our segregation scheme has two phases: the first phase is
Gen-s and Gen-o. How do we remember these coarse-grained classification phase; the second is refining
pointers? classification phase. The first phase use type-based lifetime
classification approach. Based on the first observation, we
A. Object Segregation can classify object types into short-lived types and non-short-
We can use several ways to differentiate short-lived and lived types based on their allocate rates. We compute
non-short-lived objects, varying in terms of accuracy and allocation rate of each object type periodically (here we
overhead. perform computation at each collection point). If the
One possible way is predicting object lifetimes based on allocation rate of an object type exceeds a certain threshold,
allocation-site by using online profiling [12][13] or online then this type will be treated as short-lived type, or else
sampling [21]. Most recent research indicated a strong treated as non-short-lived type. All objects of short-lived
correlation between allocation-sites and object lifetimes. In types will be assumed as short-lived objects and others will
online profiling, a runtime system monitors memory be non-short-lived objects.
activities at each allocation site. When an application exits, However, type-based lifetime classification is too strict
the collected profile is saved into a file. By analyzing the and error-prone. For example, objects of long-lived types
profile, we generate an advice file which classifies allocation might die young because they are linked to short-lived
sites into short-lived sites (i.e., the sites whose objects are objects. Therefore, we use connectivity-based lifetime

V2-631
2011 3rd International Conference on Computer Modeling and Simulation (ICCMS 2011)

classification to refine our prediction. By the second IV. EXPERIMENTAL RESULTS


observation, it is reasonable to assume that objects connected To simply discussion, we begin with a baseline
with other short-lived objects will be most likely short-lived. generational collector, Moon's Ephemeral Garbage Collector
Note that we can not apply this assumption directly to non- [11], in which all survival objects in the young generation
short-lived objects, because the second observation does not are promoted to the old generation. Then we discuss two
correlate object death-times with their lifetimes. Assume that optimized version of the baseline generational collectors: one
an object x is allocated and linked to a long-lived object y, using tenuring-threshold, the other which enlarges the
and then y dies immediately after the allocation. We can generational size. Next we present 2D-GC and finally
reasonably assume that x will die together with y at high compare the tenuring costs, scavenging cost and the young
probability, which implies that x is most likely short-lived generation size of these four collectors.
instead of long-lived. However, if an objects is connected to We summarize results in Table 1. As shown in the table,
a same-aged long-lived object (i.e., these two objects are 2D-GC can achieve the lowest tenuring cost and scavenging
allocated very closely in time), then it is highly possible that cost without enlarging generation size.
this object is long-lived because the two objects will have the
same death-time. Therefore, given two connected objects o1 TABLE I. Comparisons of 4 GC techniques
and o2 (o1 points to o2 directly or indirectly), we can decide
o2's lifetime by the following rules:
1. Rule 1: If o1 is short-lived, then o2 is most
likely short-lived; Young
Scavenging
2. Rule 2: If o1 is non-short-lived and o1 has the Tenuring cost generation
cost
same birth-time with o2, then o2 is most likely Garbage size
non-short-lived. collector

Note that these two rules cannot be applied at the same Ephemeral
10 0 20
GC
time because they might fail at the situation shown in Figure
4. In Figure 4, o1 and o2 are allocated at time 1. Assume we GC without
know o1 is short-lived and o2 is long-lived. Then o3 is tenuring 4 10 30
allocated at the same time and later both o1 and o2 point to threshold
o3. By rule 1, o3 should be most likely short-lived. By rule 2,
however, o3 should be most likely long-lived. Therefore, we GC with
set different priority levels to these two rules. By default, enlarged
4 10 20
rule 2 has higher priority than rule 1. generation
size

2D-GC 4 0 20

V. RELATED WORK
Blackburn et al. [15][16] used the profile-based approach
to select objects for pretenuring. They classified objects into
three different classes: short lived, long lived, and immortal
based on two measures: age and time of death. The proposed
method was to provide the statically-analyzed object
information to the compiler so that the optimal object
When using connectivity-based lifetime classification to
allocation site could be determined at runtime.
segregate objects, we encounter another difficulty: when an
Jump and Blackburn et al. [21] introduced a low-
object is allocated, its future connectivity information is not
overhead object sampling technique to dynamically
known a priori. This is a dilemma: we need the object's
pretenure long-lived objects. The sampling technique can
connectivity information to predict its lifetime and allocate it
accurately predict allocation site survival rates. Their
to a proper space (Gen-s or Gen-o); but connectivity can only
approach cannot attain the consistent performance
be established after its allocation. Fortunately, we can
improvement on SPECjvm98 benchmarks. So they leave
capture partial connectivity information from calling context
open optimization policies that can benefit from lifetime
before the object's allocation. For example, if o2 is allocated
sampling.
within o1's method call, then o1 will be connecting to o2
Huang et al. extended the aforementioned work by
indirectly. Actually, we do not need the thorough
maintaining online lifetime history based on the object type
connectivity information of an object, because it will be
to adaptively select which objects are to be pretenured. They
costly and have little help to improve the accuracy of lifetime
found a strong correlation between an object type and its
classification.
lifespan. Depending on the nature of the applications, some
object types are by default long-lived. They claimed that by

V2-632
2011 3rd International Conference on Computer Modeling and Simulation (ICCMS 2011)

pretenuring objects of long-lived types can reduce the REFERENCES


garbage collection time by up to 37%.
Xian et al. developed AS-GC [23][10] to solve long GC [1] M. Hertz, S.M. Blackburn, J. E. B. Moss, K. S. Mckliney, D.
pause time during heavy workload. Their approach classified Stefanovic. Generatingobject lifetime traces with Merlin.
objects into remote objects and local objects. Their approach Transactions on Programming Languages And Systems (TOPLAS)
is the first implementation of segregating objects of different 28, 3(May), 476516.
lifespans. In their study, they found that remote objects tend [2] M. Hertz, Y. Feng, E. D. Berger. Garbage collection without paging.
to live longer than local objects. Therefore, their approach In Proceedings of the 2005 ACM SIGPLAN Conference on
Programming Language Design and Implementation (PLDI 2005),
divided young generation into local space and remote space. Volume 40(7) of ACM SIGPLAN Notices (Chicago, IL, June 2005),
We extended their approach to a more general scenario. Our pp. 143153.
approach can intelligently distinguish long-lived objects and [3] M. Hertz, N. Immerman, J. E. B. Moss. Framework for analyzing
short-lived objects. garbage collection. In R. BAEZA-YATES, U. MONTANARI, AND
Optimizing JVM performance There are a plenty of N. SANTORO Eds., Foundations of Information Technology in the
literatures of optimizing Java virtual machine performance Era of Network and Mobile Computing: IFIP 17th World Computer
Congress - TC1 Stream (TCS 2002), Volume 223 of IFIP Conference
through OS support or hardware support. Proceedings (Montreal, Canada, 2002), pp. 230241.
Hertz et al. developed Bookmarking Collector (BC) [2]. [4] F. Xian, W. Srisa-an, H. Jiang. Allocation-phase aware thread
The BC is a garbage collection approach which is tightly scheduling policies to improve garbage collection performance. In
coupled with VMM of the operating system to minimize Proceedings of the 6th international Symposium on Memory
paging overhead when memory pressure is very high. Management (Montreal, Quebec, Canada, October 21 - 22, 2007).
Yang et al. developed CRAMM [9] to dynamically ACM, New York, NY, 79-90.
adjusts heap size through coordination between VMM and [5] F. Xian, W. Srisa-an, H. Jiang. Garbage collection: Java application
servers' Achilles heel. Sci. Comput. Program. 70, 2-3 (Feb. 2008),
Java virtual machine. In CRAMM, a new VMM is 89-110.
implemented to compute working-set size information of [6] F. Xian, W. Srisa-an, H. Jiang. Contention-aware scheduler:
each process. The information is passed to a heap sizing unlocking execution parallelism in multithreaded java programs. In
analytical model to predict an appropriate heap size that can Proceedings of the 23rd ACM SIGPLAN Conference on Object-
improve performance as well as minimizing page faults. Oriented Programming Systems Languages and Applications
Michael et al. proposed Yield Predictor (YP) [7] which (Nashville, TN, USA, October 19 - 23, 2008). ACM, New York, NY,
163-180.
provides a simple solution to improve GC efficiency by
[7] M. Wegiel, C. Krintz. Dynamic prediction of collection yield for
estimating GC yield (i.e., number of dead objects) using managed runtimes. In Proceeding of the 14th international
hardware page reference bits used by OS virtual memory Conference on Architectural Support For Programming Languages
manager. The YP exploits the prior empirical result that and Operating Systems (Washington, DC, USA, March 07 - 11, 2009).
objects with similar life spans tend to be clustered in the heap ACM, New York, NY, 289-300.
and tend to die together. [8] L. Zhang, C. Krintz, and P. Nagpurkar, Supporting Exception
Xian et al. implemented a Contention-Aware Scheduler Handling for Futures in Java, ACM International Conference on the
Principles and Practice on Programming in Java (PPPJ), Sep, 2007.
[6] which exploits lock contention information to reduce the
occurrences of lock convoys in large application servers [9] L. Zhang, C. Krintz, and P. Nagpurkar, Language and Virtual
Machine Support for Efficient Fine-Grained Futures in Java, The
running in multi-core systems. CA-Scheduler relies on International Conference on Parallel Architectures and Compilation
collaborations between operating system kernels and JVM to Techniques, (PACT) Sep, 2007.
improve the efficiency of Java runtime environments. [10] F. Xian, W. Srisa-an, H. Jiang. Service oriented garbage collection:
Xian and Srisa-an et al. proposed an allocation-phase improving performance and robustness of application servers. In
aware scheduler [4]. The allocation-phase aware scheduler Proceedings of ACM SIGPLAN Conference on Object-Oriented
Programming, Systems, Languages and Applications 2006: 661-662.
exploits object allocation pattern to guide thread scheduling
decisions. The principle of the approach is based on the [11] D.A. Moon. Garbage collection in a large LISP system. In
Conference Record of the 1994 ACM Symposium on LISP and
observations that objects are created and die in phases [22] Functional Programming (Austin, TX,Aug. 1994), pp. 235245.
and that current thread scheduling method is not friendly to [12] S. M. Blackburn, P. Cheng, and K.S. McKinley. Myths and reality:
garbage collection [24] [5]. The performance impact of garbage collection. In Proceedings of the
2004 Joint International Conference on Measurement and Modeling
of Computer Systems (New York, NY, June 2004), ACM, pp. 2536.
VI. CONSLUSIONS [13] S. M. Blackburn, P. Cheng, and K. S. McKinley, "Oil and Water?
High Performance Garbage Collection in Java with MMTk," in ICSE
Conventional generational garbage collection techniques 2004, 26th International Conference on Software Engineering,
cannot achieve low tenuring costs, low scavenging costs on Edinburgh, Scotland, May 23-28, 2004, 2004.
the medium-lived objects and low generation size at the [14] S. M. Blackburn, R. E. Jones, K. S. McKinley, and E. J. B. Moss,
meantime. In this paper, we will propose a novel Two- "Beltway: Getting Around Garbage Collection Gridlock," in
Dimensional Generational Garbage Collector (2D-GC) Proceedings of SIGPLAN 2002 Conference on Programming
which can meet these three requirements at once. By its Languages Design and Implementation, PLDI02, Berlin, June, 2002,
Berlin, Germany, 2002.
name, 2D-GC arranges the heap in two dimensions. Our
experimental results show that 2D-GC can achieve the [15] S. M. Blackburn, S. Singhai, M. Hertz, K.S. McKinley, and J. E. B.
Moss. Pretenuring for Java. In Proceedings of the 16th ACM
lowest tenuring cost and scavenging cost without enlarging Conference on Object-Oriented Programming, Systems, Languages,
generation size. & Applications (Tampa, FL, Oct. 2001), ACM, pp. 342352.

V2-633
2011 3rd International Conference on Computer Modeling and Simulation (ICCMS 2011)

[16] S. M. Blackburn, M. Hertz, K. S. McKinley, E. J. B. Moss, and T. [20] Y. Shuf, M. Gupta, R. Bordawekar, and J. Pal Singh. Exploiting
Yang, "Profile-based pretenuring," ACM Transactions on prolific types for memory management and optimizations. SIGPLAN
Programming Languages and Systems, vol. 27, iss. 1, 2007. Not. 37, 295-306.
[17] S. M. Blackburn, P. Cheng, and K. S. McKinley, "Myths and [21] M. Jump, S. M. Blackburn, and K. S. McKinley, "Dynamic Object
Realities: The Performance Impact of Garbage Collection," in Sampling for Pretenuring," in The 2004 International Symposium on
SIGMETRICS Performance 2004, Joint International Conference Memory Management (ISMM 2004), Vancouver, Canada, October
on Measurement and Modeling of Computer Systems, New York, NY, 24-25, 2004.
USA, June 1216, 2004, 2004. [22] F. Xian, W. Srisa-an, H. Jiang. MicroPhase: Proactively Invoking
[18] S. M. Blackburn, R. Garner, C. Hoffmann, A. M. Khang, K. S. Garbage Collection for Improved Performance. In Proceedings of
McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. ACM SIGPLAN Conference on Object-Oriented Programming,
Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. Eliot, B. Moss, A. Systems, Languages and Applications (OOPSLA'07), Montreal,
Phansalkar, D. Stefanovic, T. VanDrunen, D. von Dincklage, and B. Canada. Oct 21-25.
Wiedermann, "The DaCapo benchmarks: Java benchmarking [23] F. Xian, W. Srisa-an, C. Jia, H. Jiang. "AS-GC: An Efficient
development and analysis," in OOPSLA 06: Proceedings of the 21st Generational Garbage Collector for Java Application Servers". In
annual ACM SIGPLAN conference on Object-oriented programming Proceedings of 21st European Conference on Object-Oriented
systems, languages, and applications, New York, NY, USA, 2006, pp. Programming (ECOOP'07), Berlin, Germany. July 30-Aug 03, 2007.
169-190.
[24] F. Xian, W. Srisa-an, H. Jiang. "Investigating Throughput
[19] M. Hirzel, A. Diwan, and M. Hertz. Connectivity-based garbage Degradation Behavior of Java Application Servers: A View from
collection. In Proceedings of the 18th ACM Conference on Object- Inside a Virtual Machine". In Proceedings of ACM International
Oriented Programming, Systems, Languages, & Applications Conference on Principles and Practices of Programming In Java
(Anaheim, CA, Oct. 2003), pp. 359373. (PPPJ'06), Mannheim, Germany, Aug 30-Sep 1, 2006, Page 40-49.

V2-634

You might also like