2018 06 13 Aceu JavaThreadDumps
© 2018 kippdata informationstechnologie GmbH 1 Performance Troubleshooting using Java Thread Dumps
Introduction
Topics
Note
Which problem do we want to solve?
Methodology Pros
Methodology Cons
Most common root causes for performance problems
What is a Java thread dump and how does it help?
Pros of Java thread dumps?
Cons of Java thread dumps?
Disclaimer
How does one create thread dumps?
Real-world examples
What is this?
at java.net.SocketInputStream.socketRead0(Native Method)
...
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1000)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:373)
at com.provider.xyz.util.UserLogin.sendMessageToCallgate(UserLogin.java:381)
at com.provider.xyz.util.UserLogin.transferClientData(UserLogin.java:284)
...
at WICKET_com.provider.xyz.util.UserLogin$$EnhancerByCGLIB$$b620ce.loginUser(<generated>)
at com.provider.xyz.panels.account.PanelLogin$3.onSubmit(PanelLogin.java:231)
Real-world examples
Explanation
at java.net.SocketInputStream.socketRead0(Native Method)
Socket = network communication, here reading
...
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
Protocol is HTTP, we are the client reading the response from a remote
HTTP server
...
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:373)
We are actually waiting for the response code (first line)
at com.provider.xyz.util.UserLogin.sendMessageToCallgate(UserLogin.java:381)
The remote system seems to be known as “Callgate”
at com.provider.xyz.util.UserLogin.transferClientData(UserLogin.java:284)
The action seems to be triggered by a user login
So: the HTTP calls to Callgate during user logins are slow
Real-world examples
What is this?
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.net.SocketInputStream.read(SocketInputStream.java:182)
at net.xyz.util.InputStreamUtils.readLine(InputStreamUtils.java:74)
at net.xyz.rpc.RpcBase.readResponseHead(RpcBase.java:746)
...
at net.xyz.rpc.RpcService.invoke2008(RpcService.java:799)
at net.xyz.rpc.RpcService$$FastClassByCGLIB$$cc7d91e6.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:149)
at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:700)
...
at net.xyz.rpc.RpcServiceSkulD$$EnhancerByCGLIB$$93e8acf0.invoke2008(<generated>)
at com.provider.xyz.rpc.BasicWrapper.invoke(BasicWrapper.java:183)
at com.provider.xyz.rpc.MBoxDWrapper.getFolderTree(MBoxDWrapper.java:155)
at com.provider.xyz.util.BackendUtil.helpGetFolderList(BackendUtil.java:503)
at com.provider.xyz.util.BackendUtil.getFolderList(BackendUtil.java:438)
Real-world examples
Explanation
at java.net.SocketInputStream.socketRead0(Native Method)
again Socket = network communication, here reading
...
at net.xyz.rpc.RpcBase.readResponseHead(RpcBase.java:746)
No sign of HTTP, instead someone has named this protocol RPC
(remote procedure call), we are reading (waiting for) the response head
...
at com.provider.xyz.rpc.MBoxDWrapper.getFolderTree(MBoxDWrapper.java:155)
at com.provider.xyz.util.BackendUtil.helpGetFolderList(BackendUtil.java:503)
at com.provider.xyz.util.BackendUtil.getFolderList(BackendUtil.java:438)
The action seems to be triggered by the need for some mailbox
(mbox) folder list.
So: the RPC calls retrieving the mbox folder list are slow
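The way the two stacks above were read can be sketched in code: look at the topmost frame to see what the thread is doing, then walk down to find out why. The following is only a minimal sketch. The class name TopFrameHint and the hint texts are made up for illustration, and the frame names are the ones from the Java versions shown in these examples; other Java versions may use different internal class names.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: maps the topmost stack frame to a rough hint. */
final class TopFrameHint {

    /** Returns a hint for the first "at ..." line of one thread's stack. */
    static String hint(List<String> stackLines) {
        for (String raw : stackLines) {
            String line = raw.trim();
            if (!line.startsWith("at ")) {
                continue; // skip thread header, lock lines and "..." placeholders
            }
            // Only the topmost frame is inspected; deeper frames explain the "why".
            if (line.contains("SocketInputStream.socketRead0")) {
                return "reading from a socket: waiting for data from a remote system";
            }
            if (line.contains("PlainSocketImpl.socketConnect")) {
                return "connecting a socket: waiting for the connection to be established";
            }
            if (line.contains("PlainSocketImpl.socketAccept")) {
                return "accepting on a server socket: waiting for an incoming connection";
            }
            return "no network hint in the top frame";
        }
        return "no stack frames found";
    }

    public static void main(String[] args) {
        List<String> stack = Arrays.asList(
            "at java.net.SocketInputStream.socketRead0(Native Method)",
            "at net.xyz.rpc.RpcBase.readResponseHead(RpcBase.java:746)");
        System.out.println(hint(stack));
    }
}
```

Such a helper only automates the first step. Which remote system is slow, and which user action triggers the call, still has to be read from the application frames further down.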
Real-world examples
What is this?
at java.net.PlainSocketImpl.socketConnect(Native Method)
...
at java.net.Socket.connect(Socket.java:469)
...
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:729)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:654)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:977)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:373)
at com.provider.xyz.util.Utilities.isURLAccessible(Utilities.java:1011)
Real-world examples
Explanation
at java.net.PlainSocketImpl.socketConnect(Native Method)
Real-world examples
What is this?
at de.acme.lib.client.connect.RemoteLogin.callServerInThread(RemoteLogin.java:867)
- waiting to lock <0x00002aab2a245410> (a de.acme.to30.service.api.To30RemoteLogin)
at de.acme.lib.client.connect.RemoteLogin.callServer(RemoteLogin.java:804)
at com.ticketing.framework.client.business.bridge.StatelessConnector.sendRequest\
(StatelessConnector.java:58)
at com.ticketing.framework.client.business.bridge.DatasourceBridgeConnector.load\
(DatasourceBridgeConnector.java:30)
at com.ticketing.framework.business.datasource.DatasourcePipe.load\
(DatasourcePipe.java:30)
at com.ticketing.framework.business.datasource.CachedDatasource.load\
(CachedDatasource.java:56)
Real-world examples
Partial explanation
at de.acme.lib.client.connect.RemoteLogin.callServerInThread(RemoteLogin.java:867)
- waiting to lock <0x00002aab2a245410> (a de.acme.to30.service.api.To30RemoteLogin)
The thread is executing the method callServerInThread of a class named RemoteLogin.
That method uses a mutual exclusion lock to prevent concurrent execution
of parts of its code.
Every thread showing the above “waiting to lock” line waits for some
other thread to release the lock before it can acquire it and continue executing.
If this happens a lot, it results in a performance problem!
In this case it did happen a lot: dozens of threads showed this stack!
at de.acme.lib.client.connect.RemoteLogin.callServer(RemoteLogin.java:804)
at com.ticketing.framework.client.business.bridge.StatelessConnector.sendRequest\
(StatelessConnector.java:58)
at com.ticketing.framework.client.business.bridge.DatasourceBridgeConnector.load\
(DatasourceBridgeConnector.java:30)
at com.ticketing.framework.business.datasource.DatasourcePipe.load\
(DatasourcePipe.java:30)
at com.ticketing.framework.business.datasource.CachedDatasource.load\
(CachedDatasource.java:56)
Real-world examples
Remaining explanation
Remember: waiting to lock <0x00002aab2a245410>
Real-world examples
Remaining Explanation
Discussion with the developers reveals: access to the back
end server was limited on purpose to shield the back end
from overload
Why are only three threads allowed to access the back
end in parallel?
Configuration error!
The limit of three was the default setting, intended only for development.
Someone simply forgot to adjust it for production.
After this finding, they adjusted the value to 600 for production
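A limit on parallel back end calls can be sketched as follows. This is not the code of the application discussed above: the dump there shows “waiting to lock”, so the real implementation used a monitor. The sketch swaps in a java.util.concurrent.Semaphore instead, and the class name BackendLimiter and the parameter maxParallelCalls are made up for illustration. Threads waiting on a Semaphore would show “parking to wait for” in a dump rather than “waiting to lock”.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical sketch: allows at most maxParallelCalls concurrent back end calls. */
final class BackendLimiter {

    private final Semaphore permits;

    /** maxParallelCalls is the value that must be configured per environment. */
    BackendLimiter(int maxParallelCalls) {
        this.permits = new Semaphore(maxParallelCalls);
    }

    /** Runs the call once a permit is free; callers block here while the limit is reached. */
    <T> T call(Supplier<T> backendCall) {
        try {
            permits.acquire();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            throw new IllegalStateException("interrupted while waiting for a back end permit", e);
        }
        try {
            return backendCall.get();
        } finally {
            permits.release(); // always hand the permit back
        }
    }

    /** Number of calls that could start right now without waiting. */
    int freePermits() {
        return permits.availablePermits();
    }
}
```

With a limit of 3, every request thread beyond the third one queues up in call(), which is exactly the picture of dozens of waiting threads in the dump. Whatever mechanism is used, the limit should come from configuration and be checked for each environment.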
About locking
Special topic locking
Locking is important
Needed to ensure correctness
Not a performance problem if properly done
Locking can get problematic
If locked code is too big or, more precisely, takes too long to
execute
Especially problematic: running remote calls while holding a lock
(database request, HTTP request etc.)
If locked code is very hot, i.e. is executed extremely often
IMHO bad locking is the number one reason for local
performance problems
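The point about remote calls under a lock can be illustrated with a small cache. This is a minimal sketch under assumptions: the class ShortLockCache and the remoteLookup function are hypothetical stand-ins for any slow remote call (database request, HTTP request etc.).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache in front of a slow remote lookup. */
final class ShortLockCache {

    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> remoteLookup; // stand-in for a remote call

    ShortLockCache(Function<String, String> remoteLookup) {
        this.remoteLookup = remoteLookup;
    }

    /** Problematic: the lock is held for the whole duration of the remote call. */
    synchronized String getHoldingLock(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = remoteLookup.apply(key); // all other callers wait here
            cache.put(key, value);
        }
        return value;
    }

    /** Better: the lock only protects the short map accesses. */
    String getShortLock(String key) {
        synchronized (this) {
            String cached = cache.get(key);
            if (cached != null) {
                return cached;
            }
        }
        String fetched = remoteLookup.apply(key); // no lock held during the remote call
        synchronized (this) {
            // Another thread may have stored a value in the meantime; keep the first one.
            String earlier = cache.putIfAbsent(key, fetched);
            return earlier != null ? earlier : fetched;
        }
    }
}
```

The second variant trades lock time for possible duplicate work: two threads missing the same key at the same time will both call the remote system. Whether that is acceptable depends on the application.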
About locking
Locking in thread dumps
Threads which are blocked while waiting for a lock can
be found by searching for “- waiting to lock”,
“- waiting on” and “- parking to wait”.
Threads which hold a lock and prevent others from using the
same lock can be found by searching for “- locked”
Depending on the lock details, sometimes the thread holding
the lock does not have this text on its stack. Then you might
find it by comparing class and method names with the ones
where the other threads are blocked
All of these lines contain the unique address of the lock
object
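The search described above can be automated. The following is a minimal sketch, assuming the dump is available as one string and that lock lines look like the ones shown in this talk; the class name LockLineScanner is made up for illustration. It counts, per lock address, how many threads wait for the lock and how many hold it.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: groups lock lines of a thread dump by lock address. */
final class LockLineScanner {

    // Matches e.g. "- waiting to lock <0x00002aab2a245410> (a some.Class)"
    private static final Pattern LOCK_LINE = Pattern.compile(
        "^\\s*-\\s+(waiting to lock|waiting on|parking to wait for|locked)\\s+<(0x[0-9a-fA-F]+)>");

    /** Returns: lock address -> (kind of lock line -> number of occurrences). */
    static Map<String, Map<String, Integer>> countByAddress(String dump) {
        Map<String, Map<String, Integer>> result = new TreeMap<>();
        for (String line : dump.split("\\R")) {
            Matcher m = LOCK_LINE.matcher(line);
            if (m.find()) {
                String kind = m.group(1);
                String address = m.group(2);
                result.computeIfAbsent(address, a -> new TreeMap<>())
                      .merge(kind, 1, Integer::sum);
            }
        }
        return result;
    }
}
```

An address with many “waiting to lock” entries and one “locked” entry points to the contended lock and to the thread holding it. As noted above, the holder does not always show a “- locked” line, so a missing holder in the result is not proof that nobody holds the lock.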
Locks in thread dumps
"http-28380-Processor2" daemon prio=10 tid=0x00968800 nid=0x1a runnable ...
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
- locked <0xf4490718> (a java.net.SocksSocketImpl)
at java.net.ServerSocket.implAccept(ServerSocket.java:450)
at java.net.ServerSocket.accept(ServerSocket.java:421)
at java.lang.Thread.run(Thread.java:626)
...
Real-world examples
What is this?
I found many threads in this stack
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0xHEXADDR> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
So the threads wait for a lock ...
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await\
(AbstractQueuedSynchronizer.java:1925)
at java.util.concurrent.LinkedBlockingQueue.put\
(LinkedBlockingQueue.java:254)
… while trying to add something to a queue ...
at org.jboss.cache.RegionImpl.registerEvictionEvent(RegionImpl.java:249)
… called by a JBoss cache class RegionImpl, method registerEvictionEvent
I'm not a JBoss cache expert, but ...
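Independent of JBoss cache, the top of this stack can be reproduced with plain JDK classes. LinkedBlockingQueue.put only waits on a condition when the queue has a capacity limit and is currently full. The following minimal sketch fills a bounded queue that has no consumer and then tries one more insert with a timeout instead of blocking forever. The class name FullQueueDemo, the capacity and the wait time are chosen for illustration only.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo: what happens when producers hit a full bounded queue. */
final class FullQueueDemo {

    /** Fills a queue of the given capacity, then tries to insert one more element. */
    static boolean insertIntoFullQueue(int capacity, long waitMillis) {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(capacity);
        try {
            for (int i = 0; i < capacity; i++) {
                queue.put("event-" + i); // returns immediately while there is room
            }
            // The queue is full and nobody consumes from it. put() would park the
            // thread here until space becomes available; offer() gives up after
            // the timeout and reports the failure.
            return queue.offer("one-more", waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(insertIntoFullQueue(2, 100)); // prints false
    }
}
```

Many threads parked in put therefore suggest that the consumer side of the queue does not keep up with the producers, or is not running at all.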
Thread dump analysis methodology
What's next?
Create a few thread dumps of your favorite application
right now
Dare to do it even in production
Look at them and familiarize yourself with the contents,
even when there is no performance problem right now
Share your dumps and findings. Thread dumps enable
joint analysis between devs and ops.
Include taking thread dumps in the stop method of your
shutdown scripts. Thus you'll get good post-mortem
information in case of emergency restarts.
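For a first look at dump contents without any external tool, a dump can also be produced from inside the JVM via the standard java.lang.management API. The following is only a sketch: the class name InProcessThreadDump and the output layout are made up, and the result is less detailed than a dump written by the JVM itself. It does not replace taking dumps from outside the process, e.g. in shutdown scripts.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: renders a simple thread dump of the current JVM. */
final class InProcessThreadDump {

    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Only request lock details if this JVM supports collecting them.
        ThreadInfo[] infos = threads.dumpAllThreads(
            threads.isObjectMonitorUsageSupported(),
            threads.isSynchronizerUsageSupported());
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            if (info.getLockName() != null) {
                // The lock or condition this thread is currently blocked or waiting on.
                out.append("\t- blocked or waiting on ").append(info.getLockName()).append('\n');
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("\tat ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Note that this code needs a free thread inside the application to run, which may not be available exactly when the application is in trouble.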
Thread dumps and Apache Tomcat