Java EE Support Patterns: MAT

11.22.2012

Java Heap Dump: Are you up to the task?

If you are as enthusiastic as I am about Java performance, heap dump analysis should not be a mystery to you. If it is, the good news is that you have an opportunity to improve your Java troubleshooting skills and JVM knowledge.

The JVM has now evolved to the point where generating and analyzing a heap dump is much easier today than in the old JDK 1.0 – JDK 1.4 days.

That being said, JVM heap dump analysis should not be seen as a replacement for profiling & JVM analysis tools such as JProfiler or Plumbr, but as a complement to them. It is particularly useful when troubleshooting Java heap memory leaks and java.lang.OutOfMemoryError problems.

This post will provide you with an overview of a JVM heap dump and what to expect out of it. It will also provide recommendations on how and when you should spend time analyzing a heap dump. Future articles will include tutorials on the analysis process itself.

Java Heap Dump overview

A JVM heap dump is basically a “snapshot” of the Java heap memory at a given time. It is quite different from a JVM thread dump, which is a snapshot of the threads.

Such a snapshot contains low-level details about the Java objects and classes allocated on the Java heap, such as:

  • Java objects such as Class, fields, primitive values and references
  • Classloader related data including static fields (important for classloader leak problems)
  • Garbage collection roots or objects that are accessible from outside the heap (System classloader loaded resources such as rt.jar, JNI or native variables, Threads, Java Locals and more…)
  • Thread related data & stacks (very useful for sudden Java heap increase problems, especially when combined with thread dump analysis)
Please note that it is usually recommended to generate a heap dump following a full GC in order to eliminate unnecessary “noise” from non-referenced objects.
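For example (a hedged sketch, assuming a HotSpot JDK 6+ installation and a placeholder <PID>), the “live” option of jmap dumps only reachable objects, which typically forces a full GC right before the snapshot is written:

jmap -dump:live,format=b,file=/tmp/heap.hprof <PID>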

Analysis reserved for the Elite?

One common misconception I have noticed over the last 10 years of working with production support teams is the impression that deeper analysis tasks such as profiling, heap dump or thread dump analysis are reserved for the “elite” or the product vendor (Oracle, IBM…).

I could not disagree more.

As a Java developer, you write code that potentially runs in a highly concurrent thread environment, managing hundreds and hundreds of objects on the JVM. You have to worry not only about concurrency issues but also about garbage collection and the memory footprint of your application(s). You are in the best position to perform this analysis since you are the expert on the application.



Find below typical questions you should be able to answer:

  • How many concurrent threads are needed to run my application as per the load forecast? How much memory does each active thread consume before it completes its task?
  • What is the static memory footprint of my application? (libraries, classloader footprint, in-memory cache data structures etc.)
  • What is the dynamic memory footprint of my application under load? (sessions footprint etc.)
  • Did you profile your application for memory leaks?
Load testing, profiling your application and analyzing Java heap dumps (e.g. captured during a load test or a production problem) will allow you to answer the above questions. You will then be in a position to achieve the following goals:

  • Reduce risk of performance problems post production implementation
  • Add value to your work and your client by providing extra guidance & facts to the production and capacity management team; allowing them to take proper IT improvement actions
  • Analyze the root cause of memory leak(s) or footprint problem(s) affecting your client's IT production environment
  • Increase your technical skills by learning these performance analysis principles and techniques
  • Increase your JVM skills by improving your understanding of the JVM, garbage collection and Java object life cycles
The last thing you want is to reach a skill “plateau”. If you are not comfortable with this type of analysis then my recommendations are as per below:

  • Ask a more senior member of your team to perform the heap dump analysis and shadow their work and approach
  • Once you are more comfortable, volunteer to perform the same analysis (on a different problem case) and this time ask a more experienced member to shadow your analysis work
  • Eventually the student (you) will become the mentor
When to use

Analyzing JVM heap dumps should not be done every time you are facing a Java heap problem such as an OutOfMemoryError. Since this can be a time-consuming analysis process, I recommend it for the scenarios below:

  • The need to understand and tune the memory footprint of your application and/or the surrounding APIs or the Java EE container itself
  • Java heap memory leak troubleshooting
  • Java classloader memory leaks
  • Sudden Java heap increase problems or trigger events (has to be combined with thread dump analysis as a starting point)

 Now find below some limitations associated with heap dump analysis:

  • JVM heap dump generation is a computationally intensive task that will hang your JVM until it completes. Proper due diligence is required in order to reduce the impact on your production environment
  • Analyzing the heap dump will not give you the full Java process memory footprint, e.g. the native heap; you will need to rely on other tools and OS commands for that purpose (see the example right after this list)
  • You may face problems opening & parsing heap dumps generated from older JDK versions such as 1.4 or 1.5
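As a rough example (not part of the heap dump analysis itself, and the PID below is a placeholder), the resident and virtual size of the whole Java process can be checked with standard OS commands:

ps -o pid,vsz,rss -p <PID>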
Heap dump generation techniques

JVM heap dumps are typically generated as a result of two actions:

  • Auto-generated or triggered as a result of a java.lang.OutOfMemoryError (e.g. Java Heap, PermGen or native heap depletion)
  • Manually generated via the usage of tools such as jmap, VisualVM (via JMX) or OS-level commands
# Auto-triggered heap dumps

If you are using the HotSpot Java VM 1.5+ or JRockit R28+ then you will need to add the following parameter to your JVM start-up arguments:

-XX:+HeapDumpOnOutOfMemoryError

The above parameter will enable the HotSpot VM to automatically generate a heap dump following an OOM event. The heap dump format for those JVM types is HPROF (*.hprof).

If you are using the IBM JVM 1.4.2+, heap dump generation as a result of an OOM event is enabled by default. The heap dump format for the IBM JVM is PHD (*.phd).
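As a minimal sketch (the heap sizes, dump path and application jar below are examples only, not from a real configuration):

java -Xms1024m -Xmx1024m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/app/dumps -jar MyApp.jar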

# Manually triggered heap dumps

Manual JVM heap dump generation can be achieved as per below:

  • Usage of jmap for HotSpot 1.5+
  • Usage of VisualVM for HotSpot 1.6+ * recommended *
** Please do your proper due diligence for your production environment since JVM heap dump generation is an intrusive process which will hang your JVM process until completion **
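For reference, here is a hedged sketch of what these commands can look like on recent HotSpot JDKs; <PID> and the output path are placeholders:

# HotSpot JDK 6+: binary HPROF heap dump via jmap
jmap -dump:format=b,file=/tmp/heap.hprof <PID>

# HotSpot JDK 7+: same result via jcmd
jcmd <PID> GC.heap_dump /tmp/heap.hprof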

If you are using the IBM JVM 1.4.2, you will need to add the following environment variables to your JVM start-up environment:

export IBM_HEAPDUMP=true
export IBM_HEAP_DUMP=true


For the IBM JVM 1.5+ you will need to add the following argument to the Java start-up:

-Xdump:heap

EX:
java -Xdump:none -Xdump:heap:events=vmstop,opts=PHD+CLASSIC
JVMDUMP006I Processing Dump Event "vmstop", detail "#00000000" - Please Wait.
JVMDUMP007I JVM Requesting Heap Dump using
 'C:\sdk\jre\bin\heapdump.20050323.142011.3272.phd'
JVMDUMP010I Heap Dump written to
 C:\sdk\jre\bin\heapdump.20050323.142011.3272.phd
JVMDUMP007I JVM Requesting Heap Dump using
 'C:\sdk\jre\bin\heapdump.20050323.142011.3272.txt'
JVMDUMP010I Heap Dump written to
 C:\sdk\jre\bin\heapdump.20050323.142011.3272.txt
JVMDUMP013I Processed Dump Event "vmstop", detail "#00000000".


Please review the Xdump documentation for the IBM JVM 1.5+.

For Linux and AIX®, the IBM JVM heap dump signal is sent via kill -QUIT or kill -3. Provided heap dumps on user signals are enabled (e.g. via IBM_HEAPDUMP=true), this OS command will trigger JVM heap dump generation (PHD format).

I recommend that you review the MAT summary page on how to acquire a JVM heap dump for various JVM & OS combinations.

Heap dump analysis tools

My primary recommended tool for opening and analyzing a JVM heap dump is Eclipse Memory Analyzer (MAT). This is by far the best tool out there with contributors such as SAP & IBM. The tool provides a rich interface and advanced heap dump analysis capabilities, including a “leak suspect” report. MAT also supports both HPROF & PHD heap dump formats.

I recommend my earlier post for a quick tutorial on how to use MAT and analyze your first JVM heap dump. I also have a few heap dump analysis case studies that are useful for your learning process.



Final words

I really hope that you will enjoy JVM heap dump analysis as much as I do. Future articles will provide you with generic tutorials on how to analyze a JVM heap dump and where to start. Please feel free to provide your comments.

11.18.2011

java.lang.OutOfMemoryError – Weblogic Session size too big

A major problem was brought to our attention recently following a migration of a Java EE Portal application from Weblogic 8.1 to Weblogic 11g.

This case study will demonstrate how you can analyze an IBM JRE Heap Dump (.phd format) in order to determine the memory footprint of your application HttpSession objects.

Environment specifications (case study)

-         Java EE server: Oracle Weblogic Server 11g
-         Middleware OS: AIX 5.3
-         Java VM: IBM JRE 1.6.0
-         Platform type: Portal application

Monitoring and troubleshooting tools

-         Memory Analyzer 1.1 via IBM support assistant (IBM  JRE Heap Dump analysis)

Step #1 – Heap Dump generation

A Heap Dump file was generated following an OutOfMemoryError. The IBM JRE Heap Dump format is as per below (the phd extension stands for Portable Heap Dump).

* Please note that you can also manually generate an IBM JRE Heap Dump by either using the dumpHeap JMX operation via JConsole or by adding IBM_HEAPDUMP=true to your Java environment variables along with the kill -QUIT command *

// Portable Heap Dump generated on 2011-09-22
heapdump.20110922.003510.1028306.0007.phd
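A minimal sketch of the environment variable approach (the PID is a placeholder and the variable must be set before the JVM is started):

export IBM_HEAPDUMP=true
# start the Weblogic JVM, then trigger the dump on demand:
kill -QUIT <PID>
# a heapdump.<date>.<time>.<pid>.<seq>.phd file is written to the working directory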

Step #2 – Load the Heap Dump file in MAT

We used the Memory Analyzer (MAT) tool from IBM Support Assistant in order to load our generated Heap Dump. The Heap Dump parsing process can take quite some time. Once the processing is completed, you will be able to see a Java Heap overview along with a Leak Suspect report.




Step #3 – Locate the Weblogic Session Data objects

In order to understand the runtime footprint of your application HttpSession (user session) objects, you first need to locate the Weblogic HttpSession internal data structure. When using in-memory session persistence, this internal data structure is identified as:

weblogic.servlet.internal.session.MemorySessionData

Unless these objects show up in the Leak Suspects report, the best way to locate them is to simply load the Histogram as per below and sort the class names. You should be able to easily locate the MemorySessionData class name and determine how many object instances exist. One instance of MemorySessionData corresponds to one user session of one of your Java EE Web Applications.

Step #4 – HttpSession memory footprint analysis

It is now time to analyze the memory footprint of each of your HttpSession data objects. Simply right-click on weblogic.servlet.internal.session.MemorySessionData and select: List objects > with incoming references


For our case study, large HttpSession (MemorySessionData) objects were found using up to 52 MB for a single session object, which explained why our Java heap was depleted under heavy load.

At this point, you can dig within a single instance of MemorySessionData to explore the session data. This will allow you to look at all your application session data attributes and determine the source(s) of memory allocation. Simply right-click on one MemorySessionData instance and select: List objects > with outgoing references.


Using this approach, our development team was able to identify the source of high memory allocation within our application HttpSession data and fix the problem.

Conclusion

I hope this tutorial along with our case study has helped you understand how useful and powerful Heap Dump analysis can be for understanding and identifying your application HttpSession memory footprint.

Please don’t hesitate to post any comment or question.

11.05.2011

Memory Analyzer download

Memory Analyzer (MAT) is an extremely useful tool that analyzes Java Heap Dump files containing hundreds of millions of objects. It allows you to quickly calculate the retained sizes of objects, see what is preventing the Garbage Collector from collecting objects and run a report to automatically extract leak suspects.

Why do I need it?

OutOfMemoryError problems are quite complex and require proper analysis. The tool is a must for any Java EE production support person or developer who needs to investigate memory leaks and/or analyze their application memory footprint.

Where can I find tutorials and case studies?

You will find on this blog several case studies and tutorials on how to use this tool to pinpoint Java heap memory leaks. Find below a few examples:

http://javaeesupportpatterns.blogspot.com/2011/09/gc-overhead-limit-exceeded-java-heap.html
http://javaeesupportpatterns.blogspot.com/2011/02/ibm-sdk-heap-dump-httpsession-footprint.html

Where can I download it?

Memory Analyzer can be downloaded for free as a standalone software from the Eclipse site or integrated within the IBM Support Assistant tool.

## Eclipse MAT

## MAT via IBM Support Assistant tool (select Memory Analyzer from the Tools add-ons)

HPROF – Memory leak analysis tutorial

This article will provide you with a tutorial on how you can analyze a JVM memory leak problem by generating and analyzing a Sun HotSpot JVM HPROF Heap Dump file.

A real life case study will be used for that purpose: Weblogic 9.2 memory leak affecting the Weblogic Admin server.

Environment specifications

·         Java EE server: Oracle Weblogic Server 9.2 MP1
·         Middleware OS: Solaris 10
·         Java VM: Sun HotSpot 1.5.0_22
·         Platform type: Middle tier

Monitoring and troubleshooting tools

·         Quest Foglight (JVM and garbage collection monitoring)
·          jmap (hprof / Heap Dump generation tool)
·         Memory Analyzer 1.1 via IBM support assistant (hprof Heap Dump analysis)

Step #1 – WLS 9.2 Admin server JVM monitoring and leak confirmation

The Quest Foglight Java EE monitoring tool was quite useful to identify a Java Heap leak from our Weblogic Admin server. As you can see below, the Java Heap memory is growing over time.

If you are not using any monitoring tool for your Weblogic environment, my recommendation is to at least enable verbose:gc on your HotSpot VM (a sketch of typical flags follows below). Please visit my Java 7 verbose:gc tutorial on this subject for more detailed instructions.
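A hedged sketch of such start-up arguments for HotSpot 1.5/1.6 (the log file path is an example only):

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/app/logs/gc.log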


Step #2 – Generate a Heap Dump from your leaking JVM

Following the discovery of a JVM memory leak, the goal is to generate a Heap Dump file (binary format) by using the Sun JDK jmap utility.

** please note that jmap Heap Dump generation will cause your JVM to become unresponsive so please ensure that no more traffic is sent to your affected / leaking JVM before running the jmap utility **

<JDK HOME>/bin/jmap -heap:format=b <Java VM PID>


This command will generate a Heap Dump binary file (heap.bin) of your leaking JVM. The size of the file and the elapsed time of the generation process will depend on your JVM size and your machine specifications / speed.

For our case study, a binary Heap Dump file of ~ 2 GB was generated in about 1 hour elapsed time.

A Sun HotSpot 1.5/1.6/1.7 Heap Dump file will also be generated automatically as a result of an OutOfMemoryError if you add -XX:+HeapDumpOnOutOfMemoryError to your JVM start-up arguments.
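Please note that the jmap syntax shown in Step #2 applies to HotSpot 1.5; on HotSpot 1.6+ the equivalent command is slightly different (a hedged sketch, with <PID> and the file path as placeholders):

jmap -dump:format=b,file=/tmp/heap.hprof <PID>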


Step #3 – Load your Heap Dump file in Memory Analyzer tool

It is now time to load your Heap Dump file in the Memory Analyzer tool. The loading process will take several minutes depending on the size of your Heap Dump and the speed of your machine.




Step #4 – Analyze your Heap Dump

The Memory Analyzer provides you with many features, including a Leak Suspect report. For this case study, the Java Heap histogram was used as a starting point to analyze the leaking objects and the source.


For our case study, java.lang.String and char[] data were found to be the leaking objects. The question now is: what is the source of the leak, e.g. the references to those leaking objects? Simply right-click on your leaking objects and select: List objects > with incoming references



As you can see, javax.management.ObjectName objects were found to be the source of the leaking String & char[] data. The Weblogic Admin server communicates with and pulls stats from its managed servers via MBeans / JMX, which creates a javax.management.ObjectName for any MBean object type. The question now is why Weblogic 9.2 is not properly releasing such objects…

Root cause: Weblogic javax.management.ObjectName leak!

Following our Heap Dump analysis, a review of the Weblogic known issues was performed, which revealed the following Weblogic 9.2 bug:

·         Weblogic Bug ID: CR327368
·         Description: Memory leak of javax.management.ObjectName objects on the Administration Server used to cause OutOfMemory error on the Administration Server.
·         Affected Weblogic version(s): WLS 9.2
·         Fixed in: WLS 10 MP1

This finding was quite conclusive given the perfect match between our Heap Dump analysis, our WLS version and this known problem description.

Conclusion

I hope this tutorial along with the case study has helped you understand how you can pinpoint the source of a Java heap leak using jmap and the Memory Analyzer tool.
Please don’t hesitate to post any comment or question.
I also provide free Java EE consultation, so please simply email me with a download link to your Heap Dump file so I can analyze it for you and create an article on this blog describing your problem, root cause and resolution.

4.20.2011

Class loader memory leak debugging tutorial for IBM VM

This article will provide you with a step-by-step tutorial on how you can pinpoint the root cause of Java class loader memory leak problems.

A recent class loader leak problem found from a Weblogic Integration 9.2 production system on AIX 5.3 (using the IBM Java VM 1.5) will be used as a case study and will provide you with complete root cause analysis steps.

  
Environment specifications (case study)

·        Java EE server: Oracle Weblogic Integration 9.2 MP2
·        OS: AIX 5.3 TL9 64-bit
·        JDK: IBM VM 1.5.0 SR6 - 32-bit
·        Java VM arguments: -server -Xms1792m -Xmx1792m -Xss256k
·        Platform type: Middle tier - Order Provisioning System


Monitoring and troubleshooting tools

·        IBM Thread Dump (javacore.xyz.txt format)
·        IBM AIX 5.3 svmon command
·        IBM Java VM 1.5 Heap Dump file (heapdump.xyz.phd format)
·        IBM Support Assistant 4.1 - Memory Analyzer 0.6.0.2 (Heap Dump analysis)

Introduction

Java class loader memory leaks can be quite hard to identify. The first challenge is to determine whether you are really facing a class loader leak vs. other Java heap related memory problems. An OutOfMemoryError in your logs is often the first symptom, especially when the failing thread is involved in a class loading call, the creation of a Java thread, etc.

If you are reading this article, chances are that you have already done some analysis and suspect a class loader leak as the source of your problem. I will still show you how you can confirm that your problem is 100% due to a class loader leak.


Step #1 – AIX native memory monitoring and problem confirmation

Your first task is to determine if your memory problem and/or OutOfMemoryError is really caused by a depletion of your native memory segments. If you are not familiar with this, I suggest you first go through my other article, which explains how to monitor the native memory of your IBM Java VM process on AIX 5.3.

Using the AIX svmon command, the idea is to monitor and build a native memory comparison matrix on a regular basis, as per below (a sketch of the commands follows the comparison tables). In our case study production environment, the native memory capacity is 768 MB (3 segments of 256 MB).

As you can see below, the native memory is clearly leaking at a rate of 50-70MB daily.

Date: 20-Apr-11

Weblogic Instance Name    Native Memory (MB)    Native memory delta increase (MB)
Node1                     530                   +54
Node2                     490                   +65
Node3                     593                   +70
Node4                     512                   +50

Date: 19-Apr-11 (baseline)

Weblogic Instance Name    Native Memory (MB)
Node1                     476
Node2                     425
Node3                     523
Node4                     462

This approach will allow you to confirm that your problem is related to native memory and also understand the rate of the leak itself.
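A minimal sketch of the commands used for this monitoring (the PID is a placeholder):

# identify the Weblogic JVM process id, then dump its per-process memory segment usage
ps -ef | grep java
svmon -P <PID>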


Step #2 – Loaded classes and class loader monitoring

At this point, the next step is to determine if your native memory leak is due to a class loader leak. Artifacts like class descriptors, method names, threads, etc. are stored mainly in the native memory segments since these objects are more static in nature.

A simple way to keep track of your class loader stats is to generate a few IBM VM thread dumps on a daily basis, as per the explanations below:

First, identify the PID of your Java process and generate a Thread Dump via the kill -3 command.


** kill -3 does not actually kill the process; it simply generates a snapshot of the running JVM and is very low risk for your production environment **

This command will generate an IBM VM format Thread Dump at the root of your Weblogic domain.
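A minimal sketch, reusing the PID identified during the native memory monitoring in Step #1:

kill -3 <PID>
# a javacore.<date>.<time>.<pid>.<seq>.txt file is written at the root of the Weblogic domain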

Now open the Thread Dump file with an editor of your choice and look for the following keyword:

CLASSES subcomponent dump routine

This section provides full detail on the number of class loaders in your Java VM along with the number of class instances for each class loader.


You can keep track of the total count of class loader instances by running a quick grep command:

grep -c '2CLTEXTCLLOAD' <javacore.xyz.txt file>

You can keep track of the total count of class instances by running a quick grep command:

grep -c '3CLTEXTCLASS' <javacore.xyz.txt file>

For our case study, the number of class loaders and classes found was very high and showed an increase on a daily basis.

·        3CLTEXTCLASS : ~ 52000 class instances
·        2CLTEXTCLLOAD: ~ 21000 class loader instances


Step #3 – Loaded classes and class loader delta increase and culprit identification

This step requires you to identify the source of the increase. By looking at the class loader and class instances, you should be able to fairly easily identify a list of primary suspects that you can analyze further. These could be application class instances or even Weblogic classes. Leaking class instances could also be Java $Proxy instances created by dynamic class loading frameworks using the Java Reflection API.

In our scenario, we found an interesting increase in the number of $ProxyXYZ class instances referenced by the Weblogic Generic class loader.


This class type was by far the #1 contributor to all our class instances. Further monitoring of the native memory and class instances confirmed that the delta increase of class instances was due to a leak of $ProxyXYZ related instances.

Such $Proxy instances are created by the Weblogic Integration 9.2 BPM (business process management) engine during our application business processes; they are normally managed via java.lang.ref.SoftReference data structures and garbage collected when necessary. The Java VM is guaranteed to clear any SoftReference prior to an OutOfMemoryError, so any such leak could be a symptom of hard references still active on the associated temporary Weblogic Generic class loader.

The next step was to analyze the generated IBM VM Heap Dump file following an OutOfMemoryError condition.


Step #4 - Heap Dump analysis

A Heap Dump file is generated by default by the IBM Java VM 1.5 following an OutOfMemoryError. A Java VM heap dump file contains all the information on your Java heap memory, but it can also help you pinpoint class loader native memory leaks since it also provides detail on the class loader objects as pointers to the real native memory objects.

Find below a step by step Heap Dump analysis conducted using the ISA 4.1 tool (Memory Analyzer).

1) Open ISA and load the Heap Dump (heapdump.xyz.phd format) and select the Leak Suspects Report in order to have a look at the list of memory leak suspects

2) Once you find the source of the class instance leak, the easiest next step is to use the find by address function of the tool and deep dive further

** In our case, the key question was why the Weblogic class loader itself was still referenced and still keeping hard references to such $Proxy instances **


3) The final step was to deep dive within one sample Weblogic Generic class loader instance (0x73C641C8) and attempt to pinpoint the culprit parent referrer



As you can see from the snapshot, the inner class weblogic/controls/container/ConfiguredServiceProvider$ProviderKey was identified as the primary suspect and potential culprit of the problem. 


Potential root cause and conclusion

As per the Heap Dump analysis, this data structure appears to be maintaining a list of java.lang.ref.SoftReference objects for the generated class loader instances but also appears to be holding hard references, preventing the Java VM from garbage collecting the unused Weblogic Generic class loader instances and their associated $Proxy instances.

Further analysis of the Weblogic code will be required by Oracle support, along with some knowledge base research, as this could be a known issue with the WLI 9.2 BPM engine.

I hope this tutorial will help you in your class loader leak analysis when using an IBM Java VM. Please do not hesitate to post any comment or question on the subject.


Solution and next steps

We are discussing this problem with the Oracle support team right now and I will keep you informed of the solution as soon as possible, so please stay tuned for more updates on this post.