Understanding Performance
TO UNDERSTANDING
PERFORMANCE PROBLEMS.
And many other marvelous tools
TABLE OF CONTENTS
An Introduction to Performance
Java Profilers
Summary
An Introduction to Performance
Java performance tools are all very different and have typically been created
for different reasons and to achieve different goals. Which Java performance
tools are you going to use in your next project, and why would you choose
one over the other? There are many aspects which may sway your decision,
and of course it will depend on the type of application you're building. This
report will cover Java performance considerations, Java monitoring tools,
Java profilers, and performance testing tools. We will also demonstrate a
few of our favorite Java performance tools on a reference application to
help you get the answers you seek.
If you are not actively looking for answers to your performance issues
yet, consider this simple fact: performance affects your bottom line. The
performance of your system often directly translates into its utility for
the end user. Keeping your end users happy can have a major effect on
your bottom line. For instance, you may lose business if your e-commerce
site cannot handle the Black Friday loads, or if your business is high
performance trading, a delay of a couple of milliseconds could be the
difference between covering your body in gold leaf or simply just leaves.
WHAT IS PERFORMANCE?
It might seem like an unusual question in a report about performance,
as most people already have a good idea of what the term
performance means. However, it's common for different people to describe
it from different perspectives. For instance, some
may say being performant is doing the same task with fewer resources.
This might be by design, by choosing a more lightweight stack rather
than just increasing system hardware. Others may approach the topic
differently, by trying to eliminate bottlenecks, i.e. the part of the system
which is performing least well. Others might say increasing performance
is to eliminate unnecessary actions. The truth is, well, all of these are
performance related actions. Ultimately, being performant is about
reducing user response times and latency in all parts of your
system and in the system as a whole, while being functionally accurate and
consistent to your end user. Now the question is how?
In this report, we will not take sides or focus on the scalability of the
system or try to make it run as fast as a caffeinated cheetah on a single-core
machine. Instead, we'll look at the different tools and techniques
that allow you to understand the balance between the resources your
system has and how it utilizes them: where does your system perform
most of the work, and where should you look first if you need to tweak the
balance?
OLEG ŠELAJEV
Head of RebelLabs,
Content Warlock at ZeroTurnaround
In the next chapter we'll look at a list of resources you have to take into
account when talking about performance and define the functional
requirements of their utilization.
Here's a small translation of common terms that your manager might use and what they actually
mean in real performance terms:
Speed
Latency
Scalability
Throughput
Startup time
User-perceived performance
In general, almost any question about performance can be postulated in terms of the above
mentioned resources and requirements.
DEMYSTIFYING PERFORMANCE.
IT'S ALL JUST DATA.
Gather data, stare at it intensely, then go and fix performance
problems. Sounds easy enough. Let's dig into the variety of tools
that help you on this path.
When it comes to Java performance tools that can help to optimize performance across
these areas, most fall into three major categories: monitoring, testing and profiling. Java
profiling and monitoring both help measure and optimize performance during runtime.
Performance testing helps to show where your development efforts were not sympathetic
to real life, heavily loaded production environments. In this chapter we'll look into the
tools that are available today, their strengths, the features they offer and also how they
find the culprits of any performance issues.
All rights reserved. 2014 ZeroTurnaround Inc.
URL: http://www.dynatrace.com/en/technology/java.html
Cost: $$$ [contact sales]
No gaps, no guessing, no boundaries - no problem.
URL: http://www.appdynamics.com/product/application-performance-management/
Cost: $$$ [contact sales]
Monitor end-to-end business transaction performance within minutes, with no overhead.
URL: http://newrelic.com/application-monitoring
Cost: $$$
Constantly monitoring your applications so you don't have to
URL: https://plumbr.eu/
Cost: $$$
Java Performance Monitoring: The only solution with automatic root cause detection
Threads
Threads can be blocked on locks for a number of
reasons, and locking is actually a good thing: when
implemented correctly, it guarantees the
integrity and consistency of your data access.
Locks are, however, one of the most expensive
operations since Hollywood made plastic surgery
a commodity. Plumbr detects which threads are
being locked, which locks are being contended
and can provide root causes for the contention.
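To make "contended locks" a little more concrete, here is a minimal sketch that uses the standard java.lang.management API to count how often a thread had to block while entering a monitor. It is not how Plumbr works internally; the class name, thread names and latch choreography are our own assumptions, chosen only to force one predictable contention.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: forces one lock contention and reads the JVM's own counter.
final class LockContentionDemo {
    private static final Object LOCK = new Object();

    /** Makes a "waiter" thread block on LOCK once and returns its blocked count. */
    static long blockedCountOfWaiter() {
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch releaseLock = new CountDownLatch(1);
        CountDownLatch waiterGotLock = new CountDownLatch(1);
        CountDownLatch waiterMayExit = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            synchronized (LOCK) {
                lockHeld.countDown();
                awaitQuietly(releaseLock);    // keep the monitor until told to let go
            }
        }, "holder");

        Thread waiter = new Thread(() -> {
            synchronized (LOCK) {             // contended: the holder owns LOCK
                waiterGotLock.countDown();
            }
            awaitQuietly(waiterMayExit);      // stay alive so its statistics can be read
        }, "waiter");

        holder.start();
        awaitQuietly(lockHeld);
        waiter.start();
        while (waiter.getState() != Thread.State.BLOCKED) {
            Thread.onSpinWait();              // wait until the waiter is really stuck
        }
        releaseLock.countDown();
        awaitQuietly(waiterGotLock);

        // ThreadInfo is only available for live threads, hence the extra latch above.
        ThreadInfo info = threadBean.getThreadInfo(waiter.getId());
        long blocked = info.getBlockedCount();
        waiterMayExit.countDown();
        return blocked;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("waiter blocked " + blockedCountOfWaiter() + " time(s)");
    }
}
```

A monitoring tool would collect this kind of information continuously and for every thread; the sketch only shows that the raw data is available from the JVM itself.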
URL: http://www.jclarity.com/illuminate/
Cost: $$$
Goodbye Java/JVM Performance Problems
illuminate
The areas which Illuminate monitors are pretty
wide, meaning that if the bottleneck does not
reside within your application, the tool is still
very useful in telling you where your problem
may exist, from heavy disk I/O to CPU context
switching. Illuminate is implemented as a
daemon on your server machine(s) that passes
detailed performance information back to an
aggregator, which collates the information and
makes it available via a dashboard UI.
URL: http://www.oracle.com/technetwork/java/javaseproducts/mission-control/index.html
Cost: $$$
Java Profilers
Application performance monitoring solutions focus on the high-level
picture of your heterogeneous production environments and mostly deal
with the question: are there any errors in the system's behaviour? Profilers
usually concentrate on a deeper aspect of the main performance questions,
i.e. what is actually happening in the system, under the covers?
High-level understanding of system components is great for the
overview, but when you really need to optimize something, you need
to have exact reports of where time is being spent and exactly what
is happening with code primitives such as threads, locks and memory
management components.
Naturally, you can often make code run faster by implementing a different,
superior algorithm. But how do you know that it's faster than before?
Gut instinct? Also, how can you be sure it will stay as fast as you've made
it? You'll only know if you truly understand what's going on underneath all of
the abstractions and layers of business logic.
Code profilers gather intelligence about low-level code events in your
application and present this information in a useful, actionable way.
There are two main kinds of metrics a profiler can gather: counts and distributions.
Counts cover events such as the number of times a thread locked on a certain
object or the number of database queries that were executed during
a period of time. Such counts have absolute meanings by
themselves. Distributions are more interesting: they can show where the
time is spent while executing a certain portion of code.
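The difference between the two kinds of metrics can be sketched in a few lines of Java. This is only an illustration under our own assumptions: the class ProfileMetrics, its method names and the section labels are hypothetical and do not correspond to the API of any profiler mentioned in this report.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the two kinds of profiler metrics described above.
final class ProfileMetrics {
    private final Map<String, Long> counts = new LinkedHashMap<>();
    private final Map<String, Long> nanosBySection = new LinkedHashMap<>();

    /** A count: one more occurrence of a named event (a query, a lock acquisition...). */
    void countEvent(String event) {
        counts.merge(event, 1L, Long::sum);
    }

    /** Input for a distribution: time attributed to a named section of code. */
    void recordTime(String section, long nanos) {
        nanosBySection.merge(section, nanos, Long::sum);
    }

    long count(String event) {
        return counts.getOrDefault(event, 0L);
    }

    /** Share of the total recorded time spent in each section, in percent. */
    Map<String, Double> timeDistribution() {
        long total = 0;
        for (long nanos : nanosBySection.values()) {
            total += nanos;
        }
        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : nanosBySection.entrySet()) {
            shares.put(entry.getKey(), total == 0 ? 0.0 : 100.0 * entry.getValue() / total);
        }
        return shares;
    }

    public static void main(String[] args) {
        ProfileMetrics metrics = new ProfileMetrics();
        metrics.countEvent("sql-query");
        metrics.countEvent("sql-query");
        metrics.recordTime("parse", 30);
        metrics.recordTime("query", 60);
        metrics.recordTime("render", 10);
        System.out.println("sql-query count: " + metrics.count("sql-query"));
        System.out.println("time distribution (%): " + metrics.timeDistribution());
    }
}
```

The count answers "how often?", while the distribution answers "where did the time go?", which is usually the more useful question when hunting for a bottleneck.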
URL: https://www.yourkit.com/features/
Cost: $$$
The Industry Leader in .NET & Java Profiling
URL: https://www.ej-technologies.com/products/jprofiler/overview.html
Cost: $$$
The Award-Winning All-in-One Java Profiler
JPROFILER
JProfiler can show a call graph view, where the
methods are represented by colored rectangles
that provide instant visual feedback about
where the slow code resides in the method call
chains, making bottlenecks easier to find.
URL: http://xrebel.com/
Cost: $$$
The Lightweight Java Profiler
URL: https://github.com/RichardWarburton/honest-profiler
Cost: FREE!
HONEST PROFILER
Honest Profiler has two parts. The first is a C++ JVMTI agent that
writes out a file containing all the profiling information about the
application to which the agent was attached. Did you shudder
when we mentioned a C++ JVMTI agent? It may mean your transition
to the JVM dark side is now complete. Congratulations, Darth
Developer, you may continue to sneer at C++ rebel code! Veering
back to the plot... The second part is a Java application that
renders a profile based on the log that the agent generated.
Honest Profiler gets around the problem
of being biased towards collecting sample
information at JVM safepoints by having its
own sampling agent that uses UNIX Operating
System signals.
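For contrast, here is a rough sketch of the conventional, pure-Java way of sampling that Honest Profiler avoids. It is not Honest Profiler's code, and the class and method names are hypothetical. It polls Thread.getStackTrace(), which, as far as we know, HotSpot can only answer once the target thread has reached a safepoint, so every sample is taken at such a point rather than at an arbitrary instruction.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical naive sampler: illustrates the safepoint-bound approach, nothing more.
final class NaiveSampler {
    /** Starts a daemon thread that burns CPU until it is interrupted. */
    static Thread startBusyWorker() {
        Thread worker = new Thread(NaiveSampler::spin, "busy-worker");
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    private static void spin() {
        double x = 0;
        while (!Thread.currentThread().isInterrupted()) {
            x += Math.sqrt(x + 1);
        }
    }

    /** Records the top stack frame of the target thread, `samples` times. */
    static Map<String, Integer> sample(Thread target, int samples, long intervalMillis) {
        Map<String, Integer> hits = new TreeMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                String frame = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(frame, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Thread worker = startBusyWorker();
        Map<String, Integer> hits = sample(worker, 50, 10);
        worker.interrupt();
        hits.forEach((frame, count) -> System.out.println(count + "\t" + frame));
    }
}
```

The resulting histogram looks like a profile, but it can only ever show the places where the JVM was willing to stop the thread, which is exactly the bias described above.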
URL: http://jmeter.apache.org/
Cost: FREE!
JMETER
You can use JMeter to create a graphical
analysis of the performance of your application
or to test your server behaviour under heavy
concurrent load. You won't replicate your
actual browser with JMeter: it won't evaluate
the JavaScript on your pages, so it might not
suit your needs. But it is one of the de facto
standard solutions for performance tests in the
Java world, so you ought to know how to use it.
URL: http://gatling.io/
Cost: FREE!
Arm yourself for Performance
GATLING
Gatling is an extremely usable open source load
testing framework. There are three main factors
that contribute to its success. First, the quality of the
reports that Gatling produces out of the box is
much higher than one might expect: they are
interactive and look good. Second, the tests make use of a
simple DSL which is, spoiler alert, written in Scala.
Third, Gatling was designed with real-world load generation as a goal: it was created
with highly concurrent test scenarios in mind.
In theory there's no difference between practice and theory.
In practice there is.
Well, there are several ways to fight application performance issues, and
they even share a crucial common trait. They all assume that you operate
on hard data and know what you're doing. While this may be very accurate,
you must take this next piece of advice very seriously:
Performance issues cannot be solved by shooting from the hip.
Measure, apply a fix, measure again!
At the same time, you can be sure that just fixing this issue once and
forever is not an option available to you. You'll have the same problem
after the next release, then again and again. What you really need is a
change of perspective. How should you treat performance issues and the
performance of your application in general?
The first thing we always need to do is figure out which resource is limited
in the application, you know, the bottleneck. There can be only one source
of every bottleneck, even though you might predict that your applications
are CPU bound or memory bound, or perform too much IO for every action.
There are simple actions that you can take to determine the culprit.
In this section, we'll try out a selection of the tools that we discussed before on
a sample application running locally. This is a somewhat unusual
setup for showcasing monitoring tools, which usually shine in more
complex environments, but since almost all of us are developers here we
really like to run things locally.
Our sample application is Confluence. We'll use this application to show
you how to build and run simple JMeter performance tests. Then we'll
obtain general information about the performance of the system, dig
into the performance of a single page with YourKit to analyze why it takes so much time
to load, and also show how you can set up and run XRebel
to profile the application and find the most outrageous performance issues
during development time. Does that sound exciting? Good, it should do,
because it is!
SETUP
We chose Atlassian Confluence as our reference application for
performance testing. Note, that this is not an application code optimisation
exercise, in fact we do not have access to the Confluence code base.
Instead we simply configured and ran a sample set of profiling tools and a
performance testing tool that we looked at earlier in this report.
JMETER
You'll need to download JMeter from its Apache
project page. You can also include it as a Maven
dependency from Maven Central, if you want to
use it programmatically.
Extracting the archive creates a directory, which
will be the home of all our JMeter experiments.
JMeter itself is a Java program, so you can easily
access it programmatically if you need to, or reuse
your prior knowledge of configuring various Java
programs.
JMeter happens to be quite memory intensive,
ironically, but that's understandable since it's
processing multiple results concurrently to
simulate sample load. So before starting the
JMeter GUI, configure the JMeter JVM to have a
slightly larger heap size than it has available by
default, as shown.
shelajev@shrimp ~/Downloads/apache-jmeter-2.13
The JMeter GUI is not the slickest and it contains a number of options that might confuse a newcomer
to the tool, but to begin we just need to configure a couple of basic elements, including the Thread Group
that will be used to simulate multiple users and the pages in Confluence that they will access.
First, click on the Thread Group in the tree view on the left, rename it as you like and configure 15
simultaneous users for this experiment. Also set the Loop count to some value, let's say 50. From the image
below, you can see I called my Thread Group "My precious users" and set 15 users with a loop count of 50.
This means I'll expect 15 threads to each perform an action (yet to be set up) 50 times before ending.
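If the Thread Group settings feel abstract, the following plain-Java sketch shows roughly what they amount to: a fixed number of "user" threads, each repeating the same action a fixed number of times while the latency of every call is recorded. This is not JMeter code; the class MiniLoadTest and the fake request in main are hypothetical stand-ins, and real JMeter adds ramp-up, samplers, listeners and much more.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical miniature of a thread group: users x loopCount timed actions.
final class MiniLoadTest {
    /** Runs `action` loopCount times on each of `users` threads; returns latencies in ms. */
    static List<Long> run(int users, int loopCount, Runnable action) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            List<Future<List<Long>>> futures = new ArrayList<>();
            for (int u = 0; u < users; u++) {
                Callable<List<Long>> user = () -> {
                    List<Long> latencies = new ArrayList<>(loopCount);
                    for (int i = 0; i < loopCount; i++) {
                        long start = System.nanoTime();
                        action.run();
                        latencies.add((System.nanoTime() - start) / 1_000_000);
                    }
                    return latencies;
                };
                futures.add(pool.submit(user));
            }
            List<Long> all = new ArrayList<>();
            for (Future<List<Long>> future : futures) {
                all.addAll(future.get());
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Stand-in for an HTTP request; a real test would call the application here.
    private static void fakeRequest() {
        try {
            Thread.sleep(2);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        List<Long> latencies = run(15, 50, MiniLoadTest::fakeRequest);
        System.out.println("recorded " + latencies.size() + " samples"); // 15 * 50 = 750
    }
}
```

With 15 users and a loop count of 50 the run produces 15 x 50 = 750 samples, which is the number of rows you should expect JMeter to aggregate for a single sampler.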
Back in JMeter, go to the tree view on the left side and click on the Aggregate Report, which we set up before we
ran the test. It provides us with valuable insight into how quickly the requests were served by Confluence.
We see that our average response time was 1281 ms, but when
measuring latency you should not worry about averages. The
information that is really valuable is the 95% or 99% line, which will show
how much time the majority of your end-users will have to wait in this
scenario to get their response. The average is too susceptible to the outliers
and many quick responses will lower the value significantly. On the other
hand if the functional requirements or SLA for the system is specified for all
the users, the 99% line will be much more helpful to determine if the system
under test is close to meeting those requirements.
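A small worked example shows why the percentile lines tell you more than the average. The sketch below uses the simple nearest-rank definition of a percentile, which is an assumption on our part (JMeter's exact calculation may differ slightly), and the sample latencies are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical helper comparing the mean with nearest-rank percentiles.
final class LatencyStats {
    static double mean(long[] millis) {
        return Arrays.stream(millis).average().orElse(0.0);
    }

    /** Nearest-rank percentile: the smallest sample with at least p% of samples <= it. */
    static long percentile(long[] millis, double p) {
        if (millis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = millis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up data: 90 fast responses of 100 ms and 10 slow ones of 3000 ms.
        long[] samples = new long[100];
        Arrays.fill(samples, 0, 90, 100L);
        Arrays.fill(samples, 90, 100, 3000L);

        System.out.println("mean = " + mean(samples) + " ms");
        System.out.println("50%  = " + percentile(samples, 50) + " ms");
        System.out.println("95%  = " + percentile(samples, 95) + " ms");
        System.out.println("99%  = " + percentile(samples, 99) + " ms");
    }
}
```

For this data the mean is 390 ms, which looks acceptable, yet the 95% and 99% lines are both 3000 ms: one user in ten waits three seconds, and only the percentiles reveal it.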
YOURKIT
First, download YourKit from the YourKit website. Since it is a native
application we don't need to configure anything; we can simply run the
profiler. We do of course need to register for an evaluation license that will
be delivered via email. Once this is done you can enter the license key into
YourKit when prompted and start using the tool.
In this case the throughput of the application, that is, the
number of requests it serves per unit of time, is 6.5 requests per second.
We have now established a rough baseline for our application performance
on this set of hardware in this particular environment. Of course the
approach we took here is simplistic for the sake of readability, but in real life
you can configure much more complex test cases in a very similar fashion.
The nuts and bolts all look very similar: you just need to add more HTTP
request samplers for other pages, make each user log in before starting to query
your application and so forth.
Let's move forward, with our new baseline, and look at some of the other
performance profiling tools discussed in this report. We'll run them against
Confluence and generate the load with the same JMeter test we used
initially, so our profiler results will provide more meaningful data.
The first time YourKit is launched, it offers to install an IDE plugin. YourKit knows that it offers the most value from the
IDE plugin, so we integrated it with a local instance of Eclipse. Note that it is not necessary to run the profiling itself from
within an IDE. In fact, the main YourKit application can connect to local and remote JVM processes utilising the javaagent
capabilities. So, we attach the profiler to our Confluence process, from within Eclipse as shown here:
Wow, we immediately see the CPU consumption by the app and some information about threads in the target JVM!
Now that we have the profile data, we can analyze exactly what takes Confluence so long to respond with the "Welcome to
Confluence" page. YourKit immediately found an unresponsive thread and showed us a notification, suggesting that it
might be deadlocked. However, a quick check of the top output suggests that my laptop is close to going into a coma,
so this is probably not an issue of locking but more likely just insufficient resources to run all these apps. We can save the
data which YourKit recorded as a snapshot and dig into this further.
Now we can perform some CPU profiling to find the source of any latency, by following
which methods had the most CPU time.
The Hot Spots view shows the most time consuming methods in all of our collected data.
This is a great place to start to find the most likely candidates which we could look at.
Surprisingly we can see the YourKit probe class right on the top, but we blame that on the issues with
the experimental setup. Other hot methods are legit.
Other views in YourKit show us a bunch of method calls which can either be grouped by thread
or not grouped at all. These results will require more time to analyze, however. Also, below the
CPU profiling views, there's a Java EE statistics window, where we can look at the SQL queries that
were executed from the Confluence process and other metrics. These are all nicely aggregated by
the consumed time and the query count and might be a source of interesting findings about your
application performance.
Another immediate thing to notice without much digging is that YourKit saw around 6000 Exceptions
being generated:
This might be alright, but then again, these can probably be avoided.
In addition to the comprehensive CPU profiling, YourKit offers other insights into application
performance. They are as intuitive and straightforward to start with as the CPU profiling was and
again, they're available right there in the UI:
XREBEL
XRebel is a lightweight Java profiler, and occupies a different niche in the
category of profiling tools. It is intended primarily as a developer profiler
to spot possible performance issues as soon as possible, in fact, while a
developer is coding them. When a component of the system is just being
developed and functionally tested, XRebel is on hand to give warnings and
helpful diagnostics.
First of all, the page took 1.8 seconds to load, which might be acceptable, but triggers a threshold
in the default configuration of XRebel. We can change it later to take account of the speed of
Confluence on this particular machine, but right now clicking on the Application profiling icon gives us
a list of HTTP requests that the page has initiated upon being loaded:
Clicking on the
GET /display/ds/Welcome+to+Confluence
request gives more detail and shows the code
execution path that handled the response. The
layout describes the total cumulative time
as well as the time spent in a particular method.
This gives valuable hints about the areas of the
code that behave unexpectedly.
We can see here that 36.5% of the total
request serving time was spent in the
BaseWebAppDecorator.render code,
renderTemplateWithoutSwallowingErrors.
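The distinction between cumulative time and the time spent in a method itself is easy to get wrong, so here is a small sketch of the arithmetic. The CallNode class and all timings are hypothetical; they are chosen so that one node accounts for 36.5% of the request, and they are not the numbers XRebel reported for Confluence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical call-tree node: cumulative time includes everything called from here.
final class CallNode {
    final String method;
    final long cumulativeMillis;
    final List<CallNode> children = new ArrayList<>();

    CallNode(String method, long cumulativeMillis) {
        this.method = method;
        this.cumulativeMillis = cumulativeMillis;
    }

    CallNode child(CallNode node) {
        children.add(node);
        return this;
    }

    /** Self time: cumulative time minus the time spent in direct callees. */
    long selfMillis() {
        long inChildren = 0;
        for (CallNode c : children) {
            inChildren += c.cumulativeMillis;
        }
        return cumulativeMillis - inChildren;
    }

    /** This node's cumulative time as a percentage of the whole request. */
    double percentOf(CallNode root) {
        return 100.0 * cumulativeMillis / root.cumulativeMillis;
    }

    public static void main(String[] args) {
        // Made-up timings for illustration only.
        CallNode template = new CallNode("renderTemplate", 300);
        CallNode render = new CallNode("render", 365).child(template);
        CallNode request = new CallNode("GET /some/page", 1000).child(render);

        System.out.println(render.method + ": " + render.percentOf(request) + "% cumulative, "
                + render.selfMillis() + " ms self");
        System.out.println(request.method + ": " + request.selfMillis() + " ms self");
    }
}
```

A method with a large cumulative share but a small self time, like the hypothetical render node here, is usually just a doorway: the real cost sits further down the call chain.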
THE PARADOX OF CHOICE
If you dig a hole and it's in the wrong place, digging it deeper is not going to help.
SEYMOUR CHWAST
Before you start tackling performance problems in
your project, be sure to recognise your goals and pick
the appropriate tool to help you accomplish them.
Summary
Well done for making it all the way to the summary! We hope you loved the
report... well, of course you loved the report! Oh, hang on, you skipped the
content and jumped straight to the summary? Lazy! ;)
In this report, we covered what we mean by the term performance and the
fact that it can mean different things to different people, from removing
unnecessary code, to redesigning your application, to using different
frameworks. Oh, and of course you can still update your actual source code
to be more performant, but that was an area we didn't cover this time.
Don't forget, your overall application performance will affect your
company's bottom line. Performance is important, but extremely hard to
grasp and master (if anyone even has mastered it). There are a myriad of
performance tools available and they all try to tackle performance problems
from different angles. We covered a number of them in this report but of
course there are many more out there too, most of which are established
and have similarly great features. But which one is for you? Well, as usual, the
answer is: it depends! There are various things which will affect your
decision, including the overall design of your application, the phase in which
you're testing, your personal preference and of course the type of issue
you're trying to track down, assuming you know what that issue is!
One of the fun parts in the report (certainly for us), which allowed us to
get our teeth into some tech, was the practical part. We're geeks too, you
know! We took the Confluence application and used JMeter to simulate load
across the application, which gave us our baseline throughput. From here
we profiled the application using YourKit and XRebel, which showed some
really interesting results, particularly when XRebel was enabled, as it showed
up potential latency issues and database IO issues before we even had time
to say "OMG, XRebel rocks!".
There is no silver bullet in solving performance problems, and you have
to choose your tools depending on the needs of your project. So, how do
you decide which tool is most relevant for your needs? Well, as we're such
giving people here at RebelLabs, we've created a small FAQ section that
summarizes some of the points from this report in a friendly, problem-oriented
manner. This will hopefully help you select the
right tool for you.
Q: The operations team say that our system is consuming too much
RAM and constantly crashes with OutOfMemory errors. How do I
find out which part of the system is saturating the heap?
A: There are several solutions that promise to find memory leaks.
The heavier APMs include New Relic, Dynatrace and AppDynamics.
Other, more specific tools that identify and handle performance issues
related to memory usage, like Plumbr or Illuminate, might be
much more straightforward at showing the root cause of the problem.
Q: I want to rigorously profile my code, because we have a very low-level
implementation of a queue interface. But I heard that profilers
are biased toward safepoints, whatever that means. Are they?
A: Some operations that the JVM performs, like rearranging objects
on the heap, require application threads to be paused. The points
in the code at which threads can be paused are called safepoints. The usual sampling approach to profiling Java
applications is indeed biased towards safepoints. You can enhance the
precision of the timings by using the tracing (instrumentation) profiling
mode. YourKit and JProfiler, for example, offer you that option. On
the other hand, you can try to profile your application using Honest
Profiler, which was created specifically to avoid the problems with the
usual sampling algorithms.
Q: My team are not all performance experts, but they all contribute
the same amount of code. They don't have time to sift through lots
of data. Is there a simple tool that gives simple feedback for regular
developers to understand?
A: XRebel is one of the easiest profilers for Java applications. It injects
itself right into your application and gives you a simple outline of
where your application spends time when serving requests. It can
also highlight typical performance problems such as excessive database
access, abnormally large sessions and so forth. The setup and ease
of use of XRebel are unmatched.
Q: I need to establish a baseline, so my aggressive refactoring won't
decrease application performance. How do I achieve this?
A: You want to look for load testing tools, like JMeter or Gatling. They
allow you to record the interaction with the application and rerun
it later using multiple concurrent users, thus simulating real-world
usage patterns.
Contact us
Twitter: @RebelLabs
Web: http://zeroturnaround.com/rebellabs
Email: [email protected]
Estonia
Ülikooli 2, 4th floor
Tartu, Estonia, 51003
Phone: +372 653 6099
USA
399 Boylston Street,
Suite 300, Boston,
MA, USA, 02116
Phone: 1 (857) 277-1199
All rights reserved. 2015 ZeroTurnaround Inc.
Czech Republic
Jankovcova 1037/49
Building C, 5th floor,
170 00 Prague 7, Czech Republic
Phone: +420 227 020 130
Written by:
Oleg Shelajev (@shelajev), Simon Maple (@sjmaple)
Designed by: Ladislava Bohacova (@ladislava)