
Java Performance Tuning (Full Presentation) by Ender


The Ultimate Guide to

Java Performance Tuning


Ender Aydin Orak
koders.co
1 INTRODUCTION

WHAT YOU WILL LEARN ?

• Application performance principles & methods

• JVM structure and internals regarding application performance

• Garbage Collection types and when to use which

• Monitoring, Profiling, Tuning, Troubleshooting JVM applications

• Using OS and JVM tools for better application performance

• Applying performance best practices

• Java language level tips & tricks

YOU WILL PRACTICE ON:
• Deadlocks

• Memory leaks

• Lock contention

• CPU utilization

• Collections

• Locks

• Multithreading

• Best practices


Performance Approaches

• Top-Down: Focus on the top-level application

• Application developers (our approach)

• Bottom-Up: Focus on the lowest level: the CPU

• Performance specialists
Performance Tuning Steps

1. Monitoring

2. Profiling

3. Tuning
2 JVM OVERVIEW & INTERNALS
Objectives
• JVM Runtime & Architecture

• Command Line Options

• VM Life Cycle

• Class Loading
JAVA PROGRAMMING LANGUAGE
• Object oriented, Garbage collected*

• Class based

• .java files (source) compiled into .class files (bytecode)

• JVM executes platform independent bytecodes


“All problems in computer science can be
solved by another level of indirection”

–DAVID WHEELER
JVM Overview
• JVM: Java Virtual Machine

• A specification (JCP, JSR)

• Can have multiple implementations

• OpenJDK, HotSpot*, JRockit (Oracle), IBM J9, and many more

• Platform independent: “Write once, run anywhere”


“All non-trivial abstractions, to some
degree, are leaky.”

–JOEL SPOLSKY
HOTSPOT VM ARCHITECTURE
COMMAND LINE OPTIONS
• Standard: Required by JVM specification, standard
on all implementations (-server, -classpath)

• Nonstandard: JVM implementation dependent (start with -X)

• Developer options: Unstable, JVM implementation dependent options for specific cases (start with -XX in HotSpot VM)
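The three option groups can be told apart by prefix alone. The sketch below lists the options the current JVM was started with and labels each one. The class name JvmOptions and the classify helper are hypothetical names used only for illustration; the lookup itself uses the standard RuntimeMXBean.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class JvmOptions {
    /** Returns the options passed to this JVM (for example -Xmx512m or -XX:+UseG1GC). */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Labels an option by its prefix, following the three groups on the slide. */
    static String classify(String option) {
        if (option.startsWith("-XX")) {
            return "developer";      // HotSpot-specific, unstable options
        }
        if (option.startsWith("-X")) {
            return "nonstandard";    // implementation dependent options
        }
        return "standard";           // everything else, e.g. -classpath
    }

    public static void main(String[] args) {
        for (String option : inputArguments()) {
            System.out.println(classify(option) + "\t" + option);
        }
    }
}
```

Running it with, say, `java -Xmx256m -XX:+UseSerialGC JvmOptions` should print one labeled line per option.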
JVM LIFE CYCLE
1. Parse command line options

2. Establish heap sizes and JIT compiler (if not specified)

3. Establish environment variables (CLASSPATH, etc.)

4. Fetch Main-Class from Manifest (if not specified)

5. Create HotSpot VM (JNI_CreateJavaVM)

6. Load Main-Class and get main method attributes

7. Invoke main method passing provided command line arguments


3 PERFORMANCE OVERVIEW
Objectives

• Key concepts regarding application performance

• Common performance problems and principles

• Methodology to follow in solving problems


QUESTIONS & Expectations
• Expected throughput ?

• Acceptable latency per request ?

• How many concurrent users/tasks ?

• Expected throughput and latency ?

• Acceptable garbage collection latency ?


TERMINOLOGY

• CPU Utilization: Percentage of CPU in use (user + kernel)

• User CPU Utilization: The percentage of time the application spends in application code

• Memory Utilization: Percentage of memory in use (RAM/swap)

• Swapping should be avoided at all times.

• Lock Contention: The case where a thread or process tries to acquire a lock held by another process or thread.

• Prevents concurrency and utilization. Should be avoided as much as possible.

• Network & Disk I/O Utilization: The amount of data sent and received via network and disk.

• Should be traced and used carefully.


Performance
• Aspects of performance:

• Responsiveness

• Throughput

• Memory Footprint

• Startup Time

• Scalability
RESPONSIVENESS
• Ability of a system to complete assigned tasks within a given time

• Critical for most modern software applications (Web, Desktop, CRUD apps, Web services)

• Long pause times are not acceptable

• The focus is on responding in short periods of time


THROUGHPUT
• The amount of work done in a specific period of time.

• Critical for some specific application types (e.g. Data analysis, Batch operations, Report generation)

• High pause times are acceptable

• Focus is on how much work is getting done over a longer period of time
Memory Footprint
• The amount of main memory used by the application

• How much memory ?

• How does the usage change ?

• Does the application use any swap space ?

• Dedicated or shared system ?


STARTUP TIME

• The time taken for an application to start

• Important for both the server and client applications

• “Time ‘till performance”


SCALABILITY
• How well an application performs as the load on it increases

• Huge topic that shapes modern software architectures

• Cost should grow linearly with load, not exponentially

• Can be measured on different layers in a complex system


Scalability
Focus areas

• Java application performance

• Tuning JVM for throughput or responsiveness

• Discovery, troubleshooting and tuning JVM


Performance Methodology
• Our steps to follow

1. Monitoring

2. Profiling

3. Tuning
Performance Monitoring
• Non-intrusively collecting and observing performance
data

• Early detection of possible problems

• Essential for production environments

• Early stage for troubleshooting problems

• OS and JVM tools


PERFORMANCE PROFILING
• Collecting and observing performance data using special tools

• More intrusive; has an effect on performance

• Narrower focus to find problems

• Not suitable for production environments


PERFORMANCE TUNING

• Changing configuration, parameters or even source


code for optimizing performance

• Follows monitoring and profiling

• Targets responsiveness or throughput


Development PROCESS
PERFORMANCE PROCESS
4 JVM AND GARBAGE COLLECTION
Objectives
• What garbage collection is and what it does

• Types of garbage collectors

• Differences and basic use cases of different garbage


collectors

• Garbage collection process


Garbage Collection

• In computer science, garbage collection (GC) is a form of automatic memory management.

• The garbage collector attempts to reclaim memory occupied by objects that are no longer in use by the program.
Garbage Collection
• Main tasks of GC

• Allocating memory for new objects

• Keeping live (referenced) objects in memory

• Removing dead (unreferenced) objects and reclaiming


memory used by them
GC Steps: MARKING
GC Steps: DELETION [normal]
GC Steps: DELETION [COMPACTING]
GENERATIONAL GC
• Hotspot JVM is split into generational spaces
WHY GENERATIONAL GC ?

• Object life patterns in OO languages:

• Most objects “die young”

• Older objects rarely reference young ones


GENERATIONAL GC
GC STEPS: YOUNG GC
OLD & PERMANENT GENERATIONS
5 GARBAGE COLLECTORS
Objectives
• Garbage collection performance metrics

• Garbage collection algorithms

• Types of garbage collectors

• JVM ergonomics
GC PERFORMANCE METRICS
• There are mainly 3 ways to measure GC
performance:

• Throughput

• Responsiveness

• Memory footprint
FOCUS: Throughput

• Mostly long-running, batch processes

• High pause times can be acceptable

• Responsiveness per process is not critical


FOCUS: RESPONSIVENESS

• Priority is on servicing all requests within a predefined


time interval

• High GC pause times are not acceptable

• Throughput is secondary
GC ALGORITHMS

• Serial vs Parallel

• Stop-the-world vs Concurrent

• Compacting vs Non-Compacting vs Copying


Serial vs Parallel
STOP-THE-WORLD vs CONCURRENT
• STW: Simpler, longer pause times, needs less memory, simpler to tune

• CC: More complicated, harder to tune, larger memory footprint, shorter pause times
COMPACTING vs NON-COMPACTING
TYPES OF GC
• Serial Collector

• Parallel Collector

• Young (Parallel Collector)

• Young & Old (Parallel Compacting Collector)

• Concurrent Mark-Sweep Collector

• G1 Collector
SERIAL / PARALLEL COLLECTOR
SERIAL COLLECTOR
• Serial collection for both young and old generations

• Default for client-style machines

• Suitable for:

• Applications that do not have low pause requirements

• Platforms that do not have many resources

• Can be explicitly enabled with: -XX:+UseSerialGC


PARALLEL COLLECTOR
• Two options with parallel collectors:

• Young (-XX:+UseParallelGC)

• Young and Old (-XX:+UseParallelOldGC - Compacting)

• Throughput is important

• Suitable for

• Machines with large memory, multiple processors & cores


CMS COLLECTOR

• Focus: Responsiveness

• Low pause times are required

• Concurrent collector
CMS COLLECTOR
G1 COLLECTOR
G1 COLLECTOR [REGIONS]
G1: YOUNG GC
G1: YOUNG GC [END]
G1: PHASES
1. Initial Mark (stop-the-world)

2. Root region scanning

3. Concurrent marking

4. Remark (stop-the-world)

5. Cleanup (stop-the-world & concurrent)

* Copying (stop-the-world)
G1: PHASES [INITIAL MARK]
G1: PHASES [CONCURRENT MARK]
G1: PHASES [REMARK]
G1: PHASES [COPYING/CLEANUP]
G1: PHASES [AFTER COPYING]
6 COMMAND LINE MONITORING
Objectives
• Using JVM command line tools

• jps, jcmd, jstat

• Monitor JVMs

• Identify running JVMs

• Monitor GC & JIT activity


MONITORING

• First step to observe & identify (possible) problems


MONITORING
WHAT TO MONITOR
• Parts of interest

• Heap usage & Garbage collection

• JIT compilation

• Data of interest

• Frequency and duration of GCs

• Java heap usage

• Thread counts & states
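The same data of interest can also be read from inside a running JVM through the standard java.lang.management API. The sketch below is a minimal illustration, not a replacement for jstat; the class name JvmSnapshot and its helper methods are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class JvmSnapshot {
    /** Bytes of Java heap currently in use. */
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Number of live threads, daemon and non-daemon. */
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        // One bean per collector; count and time may be -1 if the collector does not report them.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("heap used (bytes): " + heapUsedBytes());
        System.out.println("live threads: " + liveThreadCount());
    }
}
```

Sampling these values periodically gives the frequency and duration of GCs, heap usage and thread counts listed above.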


JDK COMMAND LINE TOOLS

• jps

• jcmd

• jstat
JIT COMPILATION
• JIT compiler: optimizer, just-in-time compiler

• Command line tools to monitor

• -XX:+PrintCompilation (~2% CPU)

• jstat

• Data of interest

• Frequency, duration, opt/de-opt cycles, failed compilations


INTERFERING WITH THE JIT COMPILER
• .hotspot_compiler file

• Turns off JIT compilation for specified methods/classes

• Very rarely used

• Opt/de-opt cycles, failure or possible bug in JVM


INTERFERING WITH THE JIT COMPILER
• Via .hotspot_compiler file:

• exclude Package/to/Class method

• exclude java/lang/String toString

• Via command line:

• -XX:CompileCommand=exclude,java/lang/String,toString
7 MONITORING OS PERFORMANCE
Objectives
• Monitor CPU usage

• Monitor processes

• Monitor network & disk & swap I/O

• On Linux (+Windows)
TERMINOLOGY

• Memory Utilization: Memory usage percentage, and whether all the memory used by a process resides in physical memory (RAM) or in virtual memory (swap).

• Swapping (using disk space as virtual memory) is expensive and should be avoided at all times.


Monitoring CPU Usage
• Monitor general and process based CPU usage

• Key definitions & metrics

• User (usr) time

• System (sys) time

• Voluntary context switch (VCX)

• Involuntary context switch (ICX)


MONITORING CPU
• Key points

• CPU utilization

• High sys/usr time

• CPU scheduler run queue


Monitoring CPU Usage
• Tools to use (Linux)

• top

• htop

• vmstat

• gnome-system-monitor

• prstat (Solaris)
MONITORING MEMORY
• Key points

• Memory footprint

• Change in usage of memory

• Virtual memory usage


MONITORING MEMORY

• Tools to use (Linux)

• free

• vmstat
MONITORING DISK I/O
• Key points

• Number of disk accesses

• Disk access latencies

• Virtual memory usage


MONITORING DISK I/O
• Tools to use (Linux)

• iostat

• lsof

• iotop
MONITORING NETWORK I/O
• Key points

• Connection count

• Connection statistics & states

• Total network traffic


MONITORING NETWORK I/O
• Tools to use (Linux)

• netstat • iftop

• iptraf • monitorix

• tcpdump
8 USING VISUAL TOOLS
Objectives
• Monitor Java applications using visual tools:

• JConsole

• VisualVM

• Mission Control
JConsole
• Ships with the JDK

• Lets you monitor and control the JVM

• CPU, Memory,
Classloading, Threads

• Demo
VISUALVM
• Graphical monitoring,
profiling, troubleshooting
tool

• Has Profiling and


Sampling capabilities

• Has plugin support


(Visualgc, btrace and
more)

• Demo
MISSION CONTROL
• Comprehensive
application

• Better UI

• Lots of useful information

• Monitor,
operate,manage, profile
Java applications

• Demo
JMX - MANAGED BEANS
• JMX: Java Management Extensions

• Used to monitor & manage JVM

• Managed Beans (MBeans)

• Objects used to manage Java resources

• Managed by JMX agents
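As a small illustration of MBeans in practice, the sketch below reads one attribute of a built-in platform MBean through the platform MBean server. The ObjectName "java.lang:type=Threading" and its ThreadCount attribute are part of the standard platform MBeans; the class name JmxRead is hypothetical.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class JmxRead {
    /** Reads the live thread count via JMX instead of calling the bean directly. */
    static int threadCountViaJmx() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName threading = new ObjectName("java.lang:type=Threading");
            // The attribute is an int, so the server returns it boxed as an Integer.
            return (Integer) server.getAttribute(threading, "ThreadCount");
        } catch (JMException e) {
            throw new IllegalStateException("could not read ThreadCount", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("threads (via JMX): " + threadCountViaJmx());
    }
}
```

Tools such as JConsole and VisualVM read the same attributes, typically over a remote JMX connection.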


9 PROFILING JAVA APPLICATIONS
Objectives
• Profiling Java applications using:

• jmap and jhat

• Java VisualVM

• Java Flight Recorder


JMAP and JHAT
• JVM command line tools

• jmap: Creates heap profile data

• jhat: Presents the data in a browser, in a rudimentary way

• Demo
VISUALVM

• Sampling & profiling abilities

• Sampling: less intrusive

• Demo
10 PROFILING PERFORMANCE ISSUES
Objectives
• Profiling Java applications to troubleshoot and
optimize

• Detecting memory leaks

• Detecting lock contentions

• Identifying anti-patterns in heap profiles


HEAP PROFILING
• Necessary when:

• Observing frequent garbage collections

• Need for a larger heap by application

• Tune application for better performance & hardware


utilization
HEAP PROFILING: TIPS
• What to look for ?
• Objects with
• a large amount of bytes being allocated
• a high number of object allocations
• Stack traces where
• large amounts of bytes are being allocated
• large number of objects are being allocated
HEAP PROFILING: TOOLS
• jmap and jhat

• Snapshot of the application

• Top consumers & Allocation stack traces

• Compare multiple snapshots


MEMORY LEAK
• Refers to the situation where an object unintentionally stays referenced in memory and thus cannot be collected by the GC.

• Frequent garbage collection

• Poor application performance

• Application failure (OutOfMemoryError)
MEMORY LEAK: TOOLS

• Visual VM

• Flight Recorder

• jmap and jhat


MEMORY LEAK: TIPS
• Monitor running application

• Look for memory changes, survivor generations

• Profile applications, compare snapshots

• Look for object count changes, top growers

• Always use the -XX:+HeapDumpOnOutOfMemoryError parameter in production
LOCK CONTENTION

• Usage of synchronization utilities (synchronized, locks, concurrent collections, etc.) causes threads to wait or perform worse.

• Should be kept to a minimum.


LOCK CONTENTION: MONITOR
• Things to observe:

• High number of voluntary context switches

• Thread states and state changes (Visual VM, Flight


Recorder)

• Possible deadlocks (jstack, Visual Tools)
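Besides jstack and the visual tools, a deadlock check can be run from inside the application with the standard ThreadMXBean. This is a minimal sketch; the class name DeadlockCheck is hypothetical, and it assumes a JVM that supports synchronizer monitoring, as HotSpot does.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class DeadlockCheck {
    /** Returns how many threads are currently deadlocked (0 when there is no deadlock). */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when nothing is deadlocked
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads();
        if (ids == null) {
            System.out.println("no deadlock detected");
            return;
        }
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info != null) { // a thread may have terminated in the meantime
                System.out.println(info.getThreadName() + " waits for " + info.getLockName());
            }
        }
    }
}
```

A check like this can run on a timer in production, since it is far less intrusive than a profiler.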


PROFILING ANTI-PATTERNS
• Frequent garbage collections

• Overallocation of objects

• High number of threads

• High volume of lock contention

• Large number of exception objects


11 GARBAGE COLLECTION TUNING
Objectives

• Learning to tune GC by setting generation sizes

• Comparing and selecting suitable GC for


performance requirements

• Monitor and understand GC outputs


Garbage Collection
• Main tasks of GC

• Allocating memory for new objects

• Keeping live (referenced) objects in memory

• Removing dead (unreferenced) objects and reclaiming


memory used by them
JVM Heap Size Options
-Xmx<size> : Maximum size of the Java heap
-Xms<size> : Initial heap size
-Xmn<size> : Sets initial and max size of the young (New) generation
-XX:MaxPermSize=<size> : Max Perm size
-XX:PermSize=<size> : Initial Perm size
-XX:MaxNewSize=<size> : Max New size
-XX:NewSize=<size> : Initial New size
-XX:NewRatio=<n> : Ratio of Tenured to Young space (e.g. 2 means Tenured is twice the size of Young)
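A short worked calculation shows how -XX:NewRatio splits the heap. It assumes the HotSpot definition of NewRatio as tenured size divided by young size, so the young generation gets heap / (NewRatio + 1). The class name GenerationSizes is hypothetical, and the real JVM additionally aligns and rounds these sizes, so treat the result as an approximation.

```java
final class GenerationSizes {
    /** Approximate young generation size for a given heap size and NewRatio. */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    /** Approximate tenured (old) generation size: whatever the young generation does not take. */
    static long oldBytes(long heapBytes, int newRatio) {
        return heapBytes - youngBytes(heapBytes, newRatio);
    }

    public static void main(String[] args) {
        long heap = 3072L * 1024 * 1024; // e.g. -Xmx3g
        int newRatio = 2;                // e.g. -XX:NewRatio=2
        System.out.println("young MB: " + youngBytes(heap, newRatio) / (1024 * 1024));
        System.out.println("old MB:   " + oldBytes(heap, newRatio) / (1024 * 1024));
    }
}
```

With a 3 GB heap and NewRatio=2, this gives roughly 1 GB for the young generation and 2 GB for the tenured generation.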
GARBAGE COLLECTORS
• Serial Collector

• Parallel (Throughput) Collector

• Concurrent Mark-Sweep (CMS) Collector

• Garbage First (G1) Collector


SERIAL COLLECTOR

• Single-threaded young generation collector

• Single-threaded old generation collector

• Parameter: -XX:+UseSerialGC
SERIAL COLLECTOR: TIPS
• Not suitable for applications with high performance
requirements

• Can be suitable for client applications with limited


hardware resources

• More suitable for platforms that have less than 256 MB of memory for the JVM and do not have multiple cores
PARALLEL COLLECTOR
• Multi-threaded young generation collector

• Multi-threaded old generation collector

• Parameters:

• -XX:+UseParallelGC (Parallel Young, Single-Threaded Old)

• -XX:+UseParallelOldGC (Young&Old BOTH MultiThreaded)


PARALLEL COLLECTOR: TIPS
• Suitable for applications that target throughput rather
than responsiveness

• Suitable for platforms that have multiple processors &


cores

• -XX:ParallelGCThreads=[N] can be used to specify GC thread count

• default = Runtime.getRuntime().availableProcessors() for up to 8 processors; roughly 5/8 of the processors beyond that

• Consider reducing it if multiple JVMs run on the same machine


CMS COLLECTOR

• Multi-threaded young generation collector

• Single-threaded concurrent old generation collector

• Parameter: -XX:+UseConcMarkSweepGC
CMS COLLECTOR: GOOD TO KNOW
• CMS targets responsiveness and runs concurrently.
And it doesn’t come for free.

• More memory (~20%) and CPU resources needed

• Memory fragmentation

• It can lose the race. (Concurrent mode failure)


CMS COLLECTOR: GOOD TO KNOW

• CMS has to start collecting early enough not to lose the race

• -XX:CMSInitiatingOccupancyFraction=n (default -1: the JVM computes the threshold, about 92% on J8)

• n: Percentage of tenured space size


CMS COLLECTOR: TIPS
• Size young generation as large as possible

• Small young generation puts pressure on old generation

• Consider heap profiling

• Consider tuning survivor spaces

• Enable class unloading if needed (appservers, etc.)


-XX:+CMSClassUnloadingEnabled, -XX:+CMSPermGenSweepingEnabled
CMS: TIPS

• TODO : CMS important parameters


G1 Collector
• Parallel and concurrent young generation collector

• Single-threaded old generation collector

• Parameter: -XX:+UseG1GC

• Expected to replace CMS (J9)


G1 Collector: GOOD TO KNOW
• Concurrent, responsiveness-oriented collector like CMS. Suitable for multiprocessor platforms and heap sizes of 6GB or more.

• Targets to stay within specified pause-time


requirements.

• Suitable for stable and predictable GC time 0.5 seconds or


below.
G1 COLLECTOR: TIPS
• G1 optimizes itself to meet pause-time requirements.

• Do not set the size of young generation space

• Use 90% goal instead of average response time (ART)

• A lower pause-time goal makes the GC work harder, so throughput decreases
12 LANGUAGE-LEVEL TIPS & TRICKS
Objectives
• Object allocation best practices

• Java reference types and differences between them

• Usage of finalizers

• Synchronization tips & tricks & best practices


OBJECTS: BEST PRACTICES

• The problem is not the object allocation, nor the


reclamation

• Not expensive: ~10 native instructions in common case

• Allocating small objects for intermediate results is fine


OBJECTS: BEST PRACTICES
• Use short-lived immutable objects instead of long-lived mutable objects.

• Functional Programming is rising !

• Use clearer, simpler code with more allocations instead of more obscure code with fewer allocations

• KISS: Keep It Simple, Stupid

• “Premature optimization is the root of all evil” - Donald Knuth


OBJECTS: BEST PRACTICES
• Large Objects are expensive !

• Allocation

• Initialization

• Different sized large objects can cause fragmentation

• Avoid creating large objects


JAVA REFERENCE TYPES
REFERENCES: SOFT REFERENCE
• “Clear this object if you don’t have enough memory, I
can handle that.”

• get() returns the object if it is not reclaimed by GC.

• -XX:SoftRefLRUPolicyMSPerMB=[n] can be used to


control lifetime of the reference (default 1000 ms)

• Use case: Caches


REFERENCES: WEAK REFERENCE

• “Consider this reference as if it doesn’t exist. Let me


access it if it is still available.”

• get() returns the object if it is not reclaimed by GC.

• Use case: Thread pools


REFERENCES: PHANTOM REFERENCE

• “I just want to know if you have deleted the object or


not”

• get() always returns null.

• Use Case: Finalize actions
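The difference between the three reference types shows up in what get() returns. The sketch below is a minimal illustration using the standard java.lang.ref classes; the class name ReferenceDemo is hypothetical. It keeps a strong reference to the payload so the outcome does not depend on GC timing.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class ReferenceDemo {
    /** Returns {soft hit, weak hit, phantom get() is null} while the payload is still strongly reachable. */
    static boolean[] observe() {
        Object payload = new Object(); // strong reference keeps the object alive below
        SoftReference<Object> soft = new SoftReference<>(payload);
        WeakReference<Object> weak = new WeakReference<>(payload);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(payload, queue);

        boolean softHit = soft.get() == payload;        // not reclaimed, so get() returns it
        boolean weakHit = weak.get() == payload;        // same while a strong reference exists
        boolean phantomNull = phantom.get() == null;    // phantom get() always returns null
        return new boolean[] {softHit, weakHit, phantomNull};
    }

    public static void main(String[] args) {
        boolean[] result = observe();
        System.out.println("soft hit: " + result[0]);
        System.out.println("weak hit: " + result[1]);
        System.out.println("phantom get() is null: " + result[2]);
    }
}
```

Once the last strong reference is dropped, soft and weak references may start returning null after a collection, and a phantom reference is only useful through its ReferenceQueue.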


FINALIZERS
• Finalizers are not equivalents of C++ destructors

• Finalize methods have almost no practical and meaningful use case

• Finalize methods are not run at a predictable time: in HotSpot the GC queues finalizable objects and a dedicated finalizer thread calls them.

• Handled differently than other objects, create pressure on GC

• Time-consuming finalizers delay memory reclamation

• Not guaranteed to be called


LANGUAGE TIPS: STRINGS

• Strings are immutable

• String “literals” are cached in String Pool

• Avoid creating Strings with “new”


LANGUAGE TIPS: STRINGS

• Avoid String concatenation

• Use StringBuilder with appropriate initial size

• Not StringBuffer (avoid synchronization)
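The two approaches can be compared side by side. In the sketch below, the loop with += creates a new String on every iteration, while the StringBuilder version sizes its buffer once and appends in place. The class name Joiner and both method names are hypothetical.

```java
final class Joiner {
    /** Concatenation in a loop: each += allocates a new String and copies the old contents. */
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    /** StringBuilder with an exact initial capacity: one buffer, no resizing. */
    static String joinWithBuilder(String[] parts) {
        int capacity = 0;
        for (String part : parts) {
            capacity += part.length();
        }
        StringBuilder builder = new StringBuilder(capacity);
        for (String part : parts) {
            builder.append(part);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"Java", " ", "Performance", " ", "Tuning"};
        System.out.println(joinWithPlus(parts));
        System.out.println(joinWithBuilder(parts));
    }
}
```

Both methods return the same text; the difference is only in how many intermediate objects are allocated along the way.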


LANGUAGE TIPS: USE PRIMITIVES

• Use primitives whenever possible, not wrapper


objects.

• Auto Boxing and Unboxing are not free of cost.
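A small sketch of where boxing cost hides: the two methods below compute the same sum, but the second one declares its accumulator as Long, so every addition unboxes the old value and boxes the new one. The class name BoxingCost is hypothetical.

```java
final class BoxingCost {
    /** Primitive accumulator: no object allocation inside the loop. */
    static long sumPrimitive(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    /** Boxed accumulator: each += unboxes, adds, and boxes a new Long. */
    static long sumBoxed(int n) {
        Long sum = 0L;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumPrimitive(1000));
        System.out.println(sumBoxed(1000));
    }
}
```

The results are identical; only the allocation behaviour differs, which a heap profile of a hot loop makes visible.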


LANGUAGE TIPS: AVOID EXCEPTIONS
• Exceptions are very expensive objects

• Avoid creating them for

• non-exceptional cases

• flow control
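One way to keep exceptions out of the normal path is to validate input before parsing. The sketch below contrasts the two styles; the class name SafeParse and both methods are hypothetical, and the pre-check version deliberately accepts only short strings of decimal digits, so it is narrower than Integer.parseInt (no sign, at most 9 digits).

```java
final class SafeParse {
    /** Exception-driven: every bad input creates an exception with a stack trace. */
    static int parseWithException(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    /** Pre-check: bad input is rejected by a plain loop, so no exception is created. */
    static int parseWithCheck(String text, int fallback) {
        if (text == null || text.isEmpty() || text.length() > 9) {
            return fallback; // up to 9 digits always fits in an int
        }
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < '0' || c > '9') {
                return fallback;
            }
        }
        return Integer.parseInt(text);
    }

    public static void main(String[] args) {
        System.out.println(parseWithException("4x2", -1));
        System.out.println(parseWithCheck("4x2", -1));
        System.out.println(parseWithCheck("42", -1));
    }
}
```

If malformed input is common rather than exceptional, the pre-check style avoids paying for an exception object on each rejected value.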
THREADS
• Avoid excessive use of synchronized

• Increases lock contention, leads to poor performance

• Can cause deadlocks

• Minimize the synchronization

• Only for the critical section

• As short as possible

• Use other locks, concurrent collections whenever suitable
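Keeping the synchronized block to the critical section can be sketched as follows. The class name WordStats is hypothetical; the point is that the per-call work (trimming the text) happens outside the lock and only the shared update is guarded.

```java
final class WordStats {
    private final Object lock = new Object();
    private long totalLength; // shared state, guarded by lock

    /** Does the thread-local work first, then holds the lock only for the shared update. */
    void record(String text) {
        int length = text.trim().length(); // no shared state touched, so no lock needed
        synchronized (lock) {
            totalLength += length;         // critical section: as short as possible
        }
    }

    long totalLength() {
        synchronized (lock) {
            return totalLength;
        }
    }

    public static void main(String[] args) {
        WordStats stats = new WordStats();
        stats.record("  ab ");
        stats.record("cde");
        System.out.println(stats.totalLength());
    }
}
```

Marking the whole record method as synchronized would also be correct, but every caller would then wait while another thread trims its own string.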


Threads: TIPS
• Favor immutable objects

• No need for synchronization

• Embrace functional paradigm

• Do not use threads directly

• Hard to maintain and program correctly

• Use Executors, thread pools

• Use concurrent collections and tune them properly
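These tips combine naturally: a thread pool from Executors runs the tasks, and a ConcurrentHashMap collects the results without explicit locking. The sketch below is a minimal illustration; the class name WordCount, the pool size parameter and the 10 second timeout are assumptions chosen for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class WordCount {
    /** Counts words using a fixed thread pool and a concurrent map. */
    static ConcurrentHashMap<String, Integer> count(List<String> words, int threads) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (String word : words) {
                // merge is atomic per key, so no synchronized block is needed
                pool.execute(() -> counts.merge(word, 1, Integer::sum));
            }
        } finally {
            pool.shutdown(); // stop accepting tasks; already submitted ones still run
        }
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("gc", "jit", "gc"), 2));
    }
}
```

No thread is created or joined by hand here; the executor owns the thread life cycle and the map handles concurrent updates.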


CACHING
• Caching is a common source of memory leaks

• Avoid when possible

• Avoid creating large objects in the first place

• Mind when to remove any object added to cache

• Make sure it happens, in any condition
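One way to make sure removal always happens is to bound the cache so that it evicts entries on its own. The sketch below uses LinkedHashMap in access order with removeEldestEntry, a standard JDK mechanism for a small least-recently-used cache. The class name BoundedCache is hypothetical, and the class is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, so reads refresh an entry
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, Integer> cache = new BoundedCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // evicts "b", the least recently used entry
        System.out.println(cache.keySet());
    }
}
```

Because the size limit is enforced on every insertion, the cache cannot grow without bound even if callers never remove anything explicitly.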


That’s all folks!
Congrats!
Ender Aydin Orak

koders.co
