Enterprise Java Applications On VMware Best Practices Guide
© 2011 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws. This product is covered by one or more patents listed at http://www.vmware.com/download/patents.html. VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other jurisdictions. All other marks and names mentioned herein may be trademarks of their respective companies.
Contents
1. Introduction
1.1 Overview
1.2 Purpose
1.3 Target Audience
1.4 Scope
2. Enterprise Java Applications on vSphere Architecture
3. Enterprise Java Applications on vSphere Best Practices
3.1 VM Sizing and Configuration Best Practices Overview
3.2 vCPU for VMs Best Practices
3.3 VM Memory Size Best Practice
3.4 VM Timekeeping Best Practices
3.5 Vertical Scalability Best Practices
3.6 Horizontal Scalability, Clusters, and Pools Best Practices
3.7 Inter-tier Configuration Best Practices
3.8 High Level vSphere Best Practices
4. Troubleshooting Primer
5. FAQ: Enterprise Java Applications on vSphere
6. About the Author
1. Introduction
1.1 Overview
This Enterprise Java Applications on VMware Best Practices Guide provides information about best practices for deploying enterprise Java applications on VMware, including key best practice considerations for architecture, performance, design and sizing, and high availability. This information is intended to help IT Architects successfully deploy and run Java environments on VMware vSphere.
1.2 Purpose
This guide provides best practice guidelines for deploying enterprise Java applications on VMware vSphere. The recommendations in this guide are not specific to any particular set of hardware or to the size and scope of any particular implementation. The best practices in this document provide guidance only and do not represent strict design requirements because enterprise Java application requirements can vary from one implementation to another. However, the guidelines do form a good foundation on which you can build; many of our customers have used these guidelines to successfully virtualize their enterprise Java applications. This document focuses on details that pertain to the deployment of enterprise Java applications on VMware vSphere, and it is not necessarily a best practice guide for pure Java. For specific Java best practices, refer to the vendor documentation for the JVM you are using. Virtualizing enterprise Java applications does not require a change in your Java coding paradigm, and any performance enhancements that you have made on physical servers are transferable as is to the vSphere-deployed instance of your application.
1.3 Target Audience
This guide assumes a basic knowledge and understanding of VMware vSphere and enterprise Java applications. Architectural staff can use this document to gain an understanding of how the system will work as a whole as they design and implement various components. Engineers and administrators can use this document as a catalog of technical capabilities.
1.4 Scope
This guide covers the following topics:
Enterprise Java Applications on vSphere Architecture: This section provides a high level best practice architecture for running enterprise Java applications on vSphere.
Enterprise Java Applications on vSphere Best Practices: This section provides best practice guidelines for properly preparing the vSphere platform to run enterprise Java applications. Best practices for the design and sizing of VMs, guest OS tips, CPU, memory, storage, networking, and useful JVM tuning parameters are presented. Also covered are the various high availability features in vSphere, including VMware ESX host clusters and resource pools (horizontal scalability and vertical scalability), along with the VMware Distributed Resource Scheduler (DRS).
Enterprise Java on vSphere Troubleshooting Primer: There are times when you have to troubleshoot a particular Java application problem when running on vSphere. The vSphere esxtop utility is very informative when troubleshooting.
FAQ: Enterprise Java Applications on vSphere: In this section, we answer some frequently asked questions about the deployment of enterprise Java applications on vSphere.
2. Enterprise Java Applications on vSphere Architecture
A highly scalable and robust Java application has all of its tiers (Load Balancer, Web Server, Java Application Server, and DB Server) running in VMware vSphere in order to reap the full benefits of the scalability features offered by vSphere. Figure 1 shows a multi-tier, fully virtualized enterprise Java application architecture running on VMware vSphere.
Figure 1. Multi-tier Virtualized Enterprise Java Application Architecture
Each one of these tiers runs in a VM that is managed by VMware vSphere, which forms the key foundation. Best practices are discussed in this guide for vSphere features such as VMware HA, DRS, VMware vMotion, resource pools, hot plug/hot add, networking, and storage. The following are key architectural attributes of each tier:
Load Balancer tier: Increasingly feature-rich load balancers are available that provide various load balancing algorithms and API integration with VMware vSphere. This allows the enterprise Java application architecture to scale on demand as traffic bursts occur.
Web Server tier: The Web Server tier must be appropriately tuned, with the right number of HTTP threads to service your anticipated traffic demands.
Java Application Server tier: Many of the commonly used application servers have mechanisms to help you tune the Java engine to meet traffic demands. If you have already tuned the available Java threads, JDBC configurations, and various JVM and GC parameters on physical machines, the tuning information is transferable as is to the vSphere deployment of enterprise Java applications.
DB Server tier: Critical to meeting the uptime SLA of enterprise Java applications is having an appropriate high availability architecture for the DB server. DB servers can benefit from running on vSphere; see the best practices for your DB server on vSphere. This guide covers best practices for the JDBC connection pool tuning requirements at the Java application server level that the DB server needs to accommodate.
3. Enterprise Java Applications on vSphere Best Practices
3.2 vCPU for VMs Best Practices
Best Practice
BP1: VM sizing and VM-to-JVM ratio through a performance load test
BP2: VM vCPU CPU overcommit
BP3: VM vCPU: do not oversubscribe CPU cycles that you don't really need
3.3 VM Memory Size Best Practice
To understand how to size memory for a VM you must understand the memory requirements of Java and various segments within it. Figure 2 provides an illustration of these separate memory areas.
Figure 2. Single JVM deployed on One VM
[Figure: VM Memory contains Guest OS Memory and JVM Memory. JVM Memory contains the heap (initial size set by -Xms), Perm Gen (-XX:MaxPermSize), and one Java stack per thread (-Xss).]

VM Memory = Guest OS Memory + JVM Memory
JVM Memory = JVM Max Heap (-Xmx value) + JVM Perm Gen (-XX:MaxPermSize) + NumberOfConcurrentThreads * (-Xss)

where: Perm Gen is an area that is in addition to the -Xmx (Max Heap) value and is not garbage collected because it holds the class-level information about the code. IBM JVMs do not have a Perm Gen area.

The above VM Memory formula is an approximation of the main areas allocated. To size more accurately, you need to load test the Java application for additional memory requirements that may arise from NIO buffers, the JIT code cache, classloaders, and verifiers. In particular, some Java applications use NIO buffers, which can have huge additional memory demands. The contents of a direct buffer are allocated from the guest operating system memory instead of the Java heap, and non-direct buffers are copied into direct buffers for native I/O operations. Use load testing to appropriately size the effect of these buffers.

If you have multiple JVMs (N JVMs) on a VM, then:
VM Memory = Guest OS Memory + N * JVM Memory
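The sizing formula above can be sketched as a small calculation. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the input values (4096MB heap, 256MB Perm Gen, 100 threads with a 256KB stack each, 1024MB for the guest OS) are made-up examples, not recommendations. It does not account for NIO buffers, the JIT code cache, or other native allocations, which the text says must be found through load testing.

```java
/** Sketch of the VM memory sizing formula; all input values in main are hypothetical. */
final class VmMemorySizing {

    /** JVM Memory = max heap (-Xmx) + Perm Gen (-XX:MaxPermSize) + threads * stack size (-Xss), in MB. */
    static long jvmMemoryMb(long maxHeapMb, long permGenMb, int concurrentThreads, long stackKbPerThread) {
        // Convert the total thread stack size from KB to MB, rounding up.
        long stackMb = (concurrentThreads * stackKbPerThread + 1023) / 1024;
        return maxHeapMb + permGenMb + stackMb;
    }

    /** VM Memory = guest OS memory + N * JVM memory, in MB. */
    static long vmMemoryMb(long guestOsMb, int jvmCount, long perJvmMb) {
        return guestOsMb + jvmCount * perJvmMb;
    }

    public static void main(String[] args) {
        long jvm = jvmMemoryMb(4096, 256, 100, 256);
        System.out.println("JVM memory (MB): " + jvm);
        System.out.println("VM memory with 1 JVM (MB): " + vmMemoryMb(1024, 1, jvm));
        System.out.println("VM memory with 2 JVMs (MB): " + vmMemoryMb(1024, 2, jvm));
    }
}
```

Treat the result as a starting point for the VM's configured memory and then confirm it under load.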
Best Practice: BP4 VM memory sizing
Description: Whether you are using Windows or Linux as your guest OS, refer to the technical specifications of the various vendors for memory requirements. It is common to see the guest OS allocated about 1GB in addition to the JVM memory size. However, each installation may have additional processes running on it, for example monitoring agents, and you need to accommodate their memory requirements as well.

Figure 2 shows the various segments of JVM and VM memory, and the formula summarizes VM Memory as:
VM Memory (needed) = guest OS memory + JVM Memory
where JVM Memory = JVM Max Heap (-Xmx value) + Perm Gen (-XX:MaxPermSize) + NumberOfConcurrentThreads * (-Xss)

The -Xmx value is the value that you found during load testing for your application on physical servers. This value does not need to change when moving to a virtualized environment. Load testing your application when deployed on vSphere will help confirm the best -Xmx value.

It is recommended that you do not overcommit memory because the JVM memory is an active space where objects are constantly being created and garbage collected. Such an active memory space requires its memory to be available all the time. If you overcommit memory, ballooning or swapping may occur and impede performance.

An ESX host employs two distinct techniques for dynamically expanding or contracting the amount of memory allocated to virtual machines. The first is the memory balloon driver (vmmemctl), which is loaded from the VMware Tools package into the guest operating system running in a virtual machine. The second involves paging from a virtual machine to a server swap file, without any involvement by the guest operating system. In the page swapping method, when you power on a virtual machine, a corresponding swap file is created and placed in the same location as the virtual machine configuration file (VMX file). The virtual machine can power on only when the swap file is available.

ESX hosts use swapping to forcibly reclaim memory from a virtual machine when no balloon driver is available. The balloon driver may be unavailable either because VMware Tools is not installed, or because the driver is disabled or not running. For optimum performance, ESX uses the balloon approach whenever possible. However, swapping is used when the driver is temporarily unable to reclaim memory quickly enough to satisfy current system demands. Because the memory is being swapped out to disk, there is a significant performance penalty when the swapping technique is used. Therefore, it is recommended that the balloon driver is always enabled, but monitor it to verify that it is not being invoked, because invocation indicates that memory is overcommitted. Both ballooning and swapping should be prevented for Java applications. To prevent ballooning and swapping, refer to BP5, Set memory reservation for VM memory needs.
Best Practice: BP5 Set memory reservation for VM memory needs
Description: JVMs running on VMs have an active heap space requirement that must always be present in physical memory. Use the VMware vSphere Client to set the reservation equal to the needed VM memory.
Reservation Memory = VM Memory = guest OS Memory + JVM Memory
You may set this reservation to the active memory being used by the VM for a more efficient use of the amount of memory available. Alternatively, a simpler approach is to set the reservation equal to the total configured memory of the VM.

Best Practice: BP6 Use large memory pages
Description: Large memory pages help performance by optimizing the use of the Translation Look-aside Buffer (TLB), where virtual to physical address translations are performed. Use large memory pages as supported by your JVM and your guest operating system. The operating system and the JVM must be informed that you want to use large memory pages, as is the case when using large pages in physical systems. Set -XX:+UseLargePages at the JVM level for Sun HotSpot. On the IBM JVM the option is -Xlp, and on JRockit it is -XXlargePages. You also need to enable large pages at the guest OS level. For information, see Large Page Performance: ESX Server 3.5 and ESX Server 3i v3.5.
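As a quick sanity check, a running JVM can report the arguments it was started with through the standard java.lang.management API. The following is a minimal sketch, assuming a Sun HotSpot JVM and the -XX:+UseLargePages flag named above; the class and method names are hypothetical. It only confirms that the flag was passed on the command line. It does not confirm that the guest OS actually granted large pages.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Sketch: checks whether a JVM flag appears in the JVM's startup arguments. */
final class LargePageFlagCheck {

    /** Returns true if the given JVM arguments contain the flag exactly as written. */
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments passed to the running JVM (excludes arguments to the main method).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        // Flag name taken from the text for Sun HotSpot; other JVMs use different options.
        String flag = "-XX:+UseLargePages";
        System.out.println(flag + (hasFlag(jvmArgs, flag) ? " was passed" : " was not passed"));
    }
}
```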
3.4 VM Timekeeping Best Practices
Best Practice: BP7
Description: Lower the clock interrupt rate on the virtual CPUs in your virtual machines by using a guest operating system that allows lower timer interrupts. Examples of such operating systems are RHEL 4.7 and later, RHEL 5.2 and later, and SuSE Linux Enterprise Server 10 SP2. See Timekeeping best practices for Linux guests for more information on timekeeping best practices for Linux.

Use the Java features for lower resolution timing that are supplied by your JVM, such as the following option for the Sun JVM on Windows guest operating systems:
-XX:+ForceTimeHighResolution
You can also set the _JAVA_OPTIONS variable to this value on Windows operating systems using the technique given below (useful in cases where you cannot easily change the Java command line). To set the _JAVA_OPTIONS environment variable:
1. Click Start > Settings > Control Panel > System > Advanced > Environment Variables.
2. Click New under System Variables. The variable name is _JAVA_OPTIONS. The variable value is -XX:+ForceTimeHighResolution.
3. Reboot the guest operating system to properly propagate the variable.

Avoid using the /usepmtimer option in the boot.ini system configuration for Windows guest operating systems that use an SMP HAL.
3.5 Vertical Scalability Best Practices
If an enterprise Java application deployed on vSphere experiences heavy CPU utilization and you have determined that an increase in the vCPU count will help resolve the saturation, you can use vSphere hot add to add additional vCPUs.
Best Practice: BP8 Hot Add CPU/Memory
Description: VMs with a guest OS that supports hot add CPU and hot add memory can take advantage of the ability to change the VM configuration at runtime without any interruption to VM operations. This is particularly useful when you are trying to increase the ability of the VM to handle more traffic. Plan ahead and enable this feature. The VM must be powered off to enable the hot plug feature, but once it is enabled you can hot add CPU and memory at runtime without shutting down the VM (if the guest OS supports it). Refer to the VM memory size best practices in Section 3.3. An increase in Java heap space is usually accompanied by an increase in vCPU count to get the best GC cycle performance. Keep in mind that there are many other aspects to GC tuning, and you should refer to your JVM documentation.
3.6 Horizontal Scalability, Clusters, and Pools Best Practices
Enterprise Java applications deployed on VMware vSphere can benefit from using vSphere features for horizontal scalability using ESX host clusters, resource pools, host affinity, and DRS.
Best Practice: BP9 Use ESX host cluster
Description: To enable better scalability, use ESX host clusters. When creating clusters, enable VMware HA and VMware DRS:
o VMware HA: Detects failures and provides rapid recovery for the VMs running in a cluster. Core functionality includes host monitoring and VM monitoring to minimize downtime.
o VMware DRS: Enables vCenter Server to manage hosts as an aggregate pool of resources. Cluster resources can be divided into smaller pools for users, groups, and VMs. It enables vCenter to manage the assignment of VMs to hosts automatically, suggesting placement when VMs are powered on, and migrating running VMs to balance load and enforce allocation policies.
Enable EVC (for either Intel or AMD). EVC is Enhanced vMotion Compatibility; it configures a cluster and its hosts to maximize vMotion compatibility. When EVC is enabled, only hosts that are compatible with those in the cluster may be added to the cluster.
Multiple resource pools can be used within a cluster to manage compute resource consumption, either by reserving the needed memory for the VMs within a resource pool or by limiting it to a certain level. This feature also helps you meet quality of service requirements. For example, you can create a Tier-2 resource pool for the less critical applications and a Tier-1 resource pool for business critical applications.
In addition to existing anti-affinity rules, the VM-Host affinity rule was introduced in vSphere 4.1. The VM-Host affinity rule provides the ability to place VMs on a subset of hosts in a cluster. This is very useful in honoring ISV licensing requirements. Rules can be created so that VMs run on ESX hosts in different blades for higher availability. Conversely, you can limit the VMs to ESX hosts on one blade when network traffic between the VMs needs to be optimized by keeping them in one chassis location.
vSphere makes it easy to add resources such as hosts and VMs at runtime. It is possible to provision these ahead of time. However, it is simpler if you use a load balancer that is able to integrate with vSphere APIs to detect the newly added VMs and add them to its application load balancing pools without downtime.
3.7 Inter-tier Configuration Best Practices
As discussed in Section 2, there are four critical technology tiers that sit on top of vSphere. These tiers are the Load Balancer tier, the Web Server tier, the Java Application Server tier, and the DB Server tier. The configurations for compute resources at each tier must translate to an equitable configuration at the next tier. For example, if the Web Server tier is configured to handle 100 HTTP requests per second, then you must determine how many Java application server threads those requests need, and in turn how many DB connections are needed in the JDBC pool configuration.
Best Practice: BP13 Establish appropriate thread ratios that prevent bottlenecks (HTTP threads:Java threads:DB connections)
Description: This is the ratio of HTTP threads to Java threads to DB connections. Establish the initial setup by assuming that each layer requires a 1:1:1 ratio of HTTP threads:Java threads:DB connections, and then, based on the response time and throughput numbers, adjust each of these properties until you have satisfied your SLA objectives. For example, if you have 100 HTTP requests submitted to the Web Server initially, assume that all of these will interact with Java threads and, in turn, DB connections. In reality, your benchmark will show that not all HTTP threads are submitted to the Java application server, and not all Java application server threads require a DB connection. That is, you may find that your ratio for 100 requests translates to 100 HTTP threads:25 Java threads:10 DB connections. This depends on the behavior of your enterprise Java application, and benchmarking helps you establish the ratio.

Take into account the available algorithms of your load balancer. Make sure that when using the scale-out approach all of your VMs receive an equal share of the traffic. Some industry standard algorithms are Round Robin, Weighted Round Robin, Least Connections, and Least Response Time. You may want to initially default to Least Connections and then adjust as you see fit in your load test iterations.

Keep your VMs symmetrical in terms of the size of compute resources. For example, if you decide to use 2 vCPU VMs as a repeatable, horizontally scalable building block, your load balancing algorithm works more effectively than it would with a pool of non-symmetrical VMs for one particular application. That is, mixing 2 vCPU VMs with 4 vCPU VMs in one load balancer-facing pool is non-symmetrical, and the load balancer has no notion of weighting this unless you configure it at the Load Balancer level, which is time consuming.
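The thread ratio described in BP13 can be turned into a small calculation for deriving pool sizes from a benchmarked ratio. This is a sketch under a loud assumption: it scales the measured ratio linearly to a new HTTP thread count, which the guide says must be confirmed by load testing. The class name, method names, and the 400-thread target are hypothetical; the 100:25:10 ratio is the example from the text.

```java
/** Sketch: scales a benchmarked HTTP:Java:DB ratio to a target HTTP thread count. */
final class TierRatio {
    final int httpThreads;
    final int javaThreads;
    final int dbConnections;

    TierRatio(int httpThreads, int javaThreads, int dbConnections) {
        this.httpThreads = httpThreads;
        this.javaThreads = javaThreads;
        this.dbConnections = dbConnections;
    }

    /** Assumes linear scaling; rounds the downstream tiers up so they are not undersized. */
    TierRatio scaleTo(int targetHttpThreads) {
        return new TierRatio(
                targetHttpThreads,
                ceilDiv((long) javaThreads * targetHttpThreads, httpThreads),
                ceilDiv((long) dbConnections * targetHttpThreads, httpThreads));
    }

    private static int ceilDiv(long numerator, long denominator) {
        return (int) ((numerator + denominator - 1) / denominator);
    }

    public static void main(String[] args) {
        TierRatio measured = new TierRatio(100, 25, 10); // example ratio from the text
        TierRatio scaled = measured.scaleTo(400);        // 400 is a hypothetical target
        System.out.println(scaled.httpThreads + ":" + scaled.javaThreads + ":" + scaled.dbConnections);
    }
}
```

The scaled values are initial settings for the HTTP thread pool, the Java thread pool, and the JDBC connection pool, to be adjusted across load test iterations.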
3.8 High Level vSphere Best Practices
It is important to follow the general vSphere best practices; see Performance Best Practices for VMware vSphere 4.0. The following is a summary of some of the key networking, storage, and hardware-related best practices that are commonly used.
4. Troubleshooting Primer
Troubleshooting problems with enterprise Java applications involves investigating each of the tiers: the Load Balancer tier, the Web Server tier, the Java Application Server tier, the DB Server tier, and vSphere. VMware vSphere, in turn, has a dependency on networking and storage. The next few sections provide information about how to begin troubleshooting and the effective utilities you can use.
4.1
If you suspect that VMware vSphere is not configured optimally and is the cause of the bottleneck, file a support request (http://www.vmware.com/support/contacts/file-sr.html). In addition, you may want to:
Follow the troubleshooting steps outlined in the Performance troubleshooting guide for ESX 4.0.
Verify that you have applied all of the best practices discussed in this Best Practices Guide.
Run the vm-support utility. Execute the following command at the service console:
vm-support -s
This collects the necessary information so that VMware can help diagnose the problem. It is best to run this command at the time when the symptoms occur.
4.2
[Table: esxtop metrics. Most of the table was lost in extraction; the surviving entries include the DISK category and the RESET/s metric.]
Refer to your JVM documentation for troubleshooting guides. The following sections provide information about how to begin troubleshooting. Information is given about some severe Java application problems that are GC/memory leak related, and some that are caused by thread contention. For JDBC-based errors, refer to the documentation for the JDBC driver provided to you by the database vendor. Of particular importance for performance are errors leading to OutOfMemory, StackOverflow, and Thread Deadlock.
Re-inspect the -Xmx, -Xms, and -Xss settings.
If you're using JDK 6, you can use a tool called jmap on any platform. Running jmap may add additional load on your environment, so plan for the best time to run it.
If you're using JDK 5:
o If you're running Linux with JDK 5, you can use jmap.
o If you're using JDK 5 update 14 or later, you can use the -XX:+HeapDumpOnCtrlBreak option when starting the JVM, then use the Ctrl+Break key combination on Windows to dump the heap.
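In addition to external tools such as jmap, the JVM exposes heap usage and deadlock detection through the standard java.lang.management API (findDeadlockedThreads requires Java 6 or later). The following is a minimal sketch of an in-process check; the class and method names are hypothetical, and it is not a substitute for a heap dump or your JVM vendor's troubleshooting guide.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

/** Sketch: reports heap usage and deadlocked threads for the current JVM. */
final class JvmHealthCheck {

    /** Fraction of the maximum heap currently in use, or -1 if the maximum is undefined. */
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax(); // -1 when no maximum is defined
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    /** Number of threads currently in deadlock; 0 when none are found. */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // May throw UnsupportedOperationException on JVMs without synchronizer monitoring support.
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is found
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        double used = heapUsedFraction();
        System.out.println(used < 0 ? "Heap maximum is undefined"
                : String.format("Heap used: %.1f%% of max", used * 100));
        System.out.println("Deadlocked threads: " + deadlockedThreadCount());
    }
}
```

A heap fraction that stays close to 1.0 after garbage collection suggests re-inspecting -Xmx or looking for a leak, and a nonzero deadlock count points to the thread contention problems mentioned above.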
The following table provides a summary of maximums per VM, per host, and per vCenter.
vSphere Configuration Maximums
Per VM: 8 vCPUs; 255GB; 2TB of storage minus 512 bytes
Per Host: 512 vCPUs; 320 VMs; 25 vCPUs per core
Per vCenter: 1000 hosts; 10000 powered on VMs; 15000 registered VMs; 10 linked vCenter Servers; 3000 hosts in linked vCenter Servers; 30000 powered on VMs in linked vCenter Servers; 50000 registered VMs in linked vCenter Servers; 100 concurrent vSphere Clients; 400 hosts per datacenter
What decisions must be made due to virtualization?
You have to determine the size of the repeatable building block VM. This is established by benchmarking, along with the total scale-out factor. Determine how many concurrent users each single vCPU configuration of your application can handle, and extrapolate that to your production traffic to determine the overall compute resource requirement. Having a symmetrical building block (for example, every VM having the same number of vCPUs) helps keep load distribution from your load balancer even. Essentially, your benchmarking test helps you determine how large a single VM should be (vertical scalability) and how many of these VMs you will need (horizontal scalability).
You need to pay special attention to the scale-out factor and see up to what point it is linear within your application running on top of VMware. Enterprise Java applications are multi-tier, and bottlenecks can appear at any point along the scale-out performance line and quickly cause non-linear results. The assumption of linear scalability may not always be true, and it is essential to load test a pre-production replica (production to be) of your environment to accurately size for your traffic.

I have conducted extensive GC sizing and tuning for our current enterprise Java application running on physical. Do I have to adjust anything related to sizing when moving this Java application to a virtualized environment?
No. All tuning that you would perform for your Java application on physical is transferable to your virtual environment. However, because virtualization projects are typically about driving a high consolidation ratio, it is advisable that you conduct adequate load testing to establish your ideal compute resource configuration for individual VMs, the number of JVMs within a VM, and the overall number of VMs on the ESX host. Additionally, because this type of migration involves an OS/platform change as well as a JVM vendor change, it is advisable to read through Section 2 of this document along with your vendor's tuning advice for both OS and JVM.
How many and what size of virtual machines will I need?
This depends on the nature of your application. We most often see 2 vCPU VMs as a common building block for Java applications. One of the guidelines is to tune your system for more scale-out as opposed to scale-up. This rule is not absolute, as it depends on your organization's architectural best practices. Smaller, more scaled-out VMs may provide a better overall architecture, but you will incur additional guest OS licensing costs. If this is a constraint, you can tune towards larger 4 vCPU VMs and stack more JVMs on them.

What is the correct number of JVMs per virtual machine?
There is no one definite answer. This largely depends on the nature of your application. The benchmarking you conduct can determine the limit on the number of JVMs that can be stacked on a single VM. The more JVMs you put on a single VM, the more JVM overhead (the cost of initializing a JVM) is incurred. Alternatively, instead of stacking multiple JVMs within a VM, you can increase the JVM size vertically by adding more threads and heap. This can be achieved if your JVM runs within an application server such as Tomcat. Then, instead of increasing the number of JVMs, you can increase the number of concurrent threads and the resources that a single Tomcat JVM provides for the applications deployed on it and their concurrent requests per second. How many applications you can stack within a single application server instance/JVM is bounded by how large you can afford your JVM heap to be and by performance. The trade-off of a very large JVM heap, beyond 4GB, needs to be tested for performance and GC cycle impact. This concern is not specific to virtualization, as it applies equally to a physical server setup.
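The building block extrapolation described in this FAQ can be sketched as a simple estimate. This is an illustration only: it assumes that the number of concurrent users handled per vCPU scales linearly, which the guide warns may not hold and must be verified by load testing. The class name, method name, and the numbers used (250 users per vCPU, 3000 production users) are hypothetical.

```java
/** Sketch: estimates how many building block VMs are needed, assuming linear scale-out. */
final class ScaleOutEstimate {

    /** Number of VMs needed for the given load, rounded up to a whole VM. */
    static int vmsNeeded(int productionUsers, int usersPerVcpu, int vcpusPerVm) {
        int usersPerVm = usersPerVcpu * vcpusPerVm;
        return (productionUsers + usersPerVm - 1) / usersPerVm;
    }

    public static void main(String[] args) {
        // Hypothetical benchmark result: 250 concurrent users per vCPU, 2 vCPU building block.
        System.out.println("2 vCPU VMs needed: " + vmsNeeded(3000, 250, 2));
        // The same load on 4 vCPU VMs, for comparison with the scale-up option.
        System.out.println("4 vCPU VMs needed: " + vmsNeeded(3000, 250, 4));
    }
}
```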