VMware SharePoint 2010 Best Practices Guide
Best Practices
© 2011 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and
intellectual property laws. This product is covered by one or more patents listed at
http://www.vmware.com/download/patents.html.
VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other
jurisdictions. All other marks and names mentioned herein may be trademarks of their respective
companies.
VMware, Inc
3401 Hillview Ave
Palo Alto, CA 94304
www.vmware.com
Contents
1. Overview
1.1 Purpose
1.2 Target Audience
1.3 Scope
List of Figures
Figure 1. Virtual Machine Memory Settings
Figure 2. VMware Storage Virtualization Stack
Figure 3. Virtual Networking in vSphere
Figure 4. Percentage of Guest vCPUs Utilized in the Three Virtual Machine Configuration
Figure 5. Percentage of Guest vCPUs Utilized in the Four Virtual Machine Configuration
Figure 6. Percentage of Guest vCPUs Utilized in the Five Virtual Machine Configuration
Figure 7. Number of Heavy Users at 1% Concurrency
Figure 8. SharePoint 2010 Medium Topology Example
Figure 9. Enterprise Intranet Collaboration Environment Technical Case Study
Figure 10. Departmental Collaboration Environment Technical Case Study
Figure 11. Initial Virtual Machine Placement
Figure 12. Dynamic Scaling on VMware
Figure 13. Using VMware Hot-Add with an SQL Server 2008 Guest
Figure 14. Virtual Machine Templates
Figure 15. Monitor Multi-Tiered Application Performance with VMware AppSpeed
Figure 16. Right-Size Applications with vCenter CapacityIQ
List of Tables
Table 1. Esxtop CPU Performance Metrics
Table 2. Esxtop Memory Counters
Table 3. vSphere Performance Enhancements
Table 4. Virtual Machine Configurations for Each Use Case
Table 5. Partner Testing Summary
Table 6. Performance Counters of Interest to VMware and SharePoint Administrators
Table 7. SharePoint 2010 Performance Counters from TechNet (partial)
Table 8. Enterprise Intranet Collaboration Environment Technical Case Study Example
Table 9. SharePoint 2007 User Loads from Microsoft TechNet
Table 10. Sample Content Database Sizing Criteria
Table 11. SQL Server Memory Recommendations
Table 12. Web, App, and Single Server Minimum Requirements from Microsoft TechNet
Table 13. Database Server Minimum Requirements from Microsoft TechNet
Table 14. SharePoint Resource Requirements by Server Role
Table 15. Database Server Minimum Requirements from Microsoft TechNet
Table 16. Database Server Minimum Requirements from Microsoft TechNet
Table 17. Departmental Collaboration Case Study Processor Utilization
1. Overview
A Microsoft SharePoint infrastructure is frequently deployed first as a collaboration and document sharing
tool. SharePoint makes workgroups more efficient and reduces the cost of coordinated collaboration,
while allowing the flexibility needed by workgroups with diverse needs and goals. This flexibility and utility
often leads to rapid growth in demand for both capacity and bandwidth as more users leverage these
tools to coordinate workflows and manage more documents. More sophisticated SharePoint deployments
can optimize business workflow and communications, and can quickly become critical components in
everyday commercial operations.
While SharePoint offers great benefits to organizations, these benefits are only realized when the
underlying systems are available and perform at an acceptable level. Rapid growth and high availability
are difficult features to manage in a traditional IT environment. Accommodating both often requires the
high cost of over-designing and over-building at the earliest stages of deployment.
Because SharePoint encourages rapid growth and “viral” proliferation, user goals may conflict with the
ability of the IT staff to deliver the desired services when needed within budgetary and manpower
constraints. Flexibility is extremely valuable during this early period. If rapid growth and evolution can be
supported at realistic costs, SharePoint can become an important tool to rapidly increase everyday
productivity. VMware vSphere® can facilitate this capability, allowing organizations to leverage the
benefits of SharePoint on a pay-as-you-go basis. Because high availability features are inherent to the
vSphere platform, these can be leveraged on demand. By virtualizing SharePoint, the common problems
of deploying a complex, high-growth IT service are alleviated, allowing resources to be spent on
maximizing the value of the tool in everyday business practice.
Contrast the benefits of a virtual infrastructure with the limitations of a traditional deployment. Using
conventional physical infrastructure typically leads to over-provisioning. This creates significant resource
underutilization and high system power, cooling, and operating costs. In addition, for complex
architectures such as SharePoint 2010, using physical servers and infrastructure may have other
limitations such as the following:
Application delivery is traditionally gated by the need for manual configuration and provisioning for
each new application or configuration change on a specific hardware platform. This can be slow, and
in an existing infrastructure, lead to excessive downtime. It can also constrain growth to the
organization’s ability to purchase new hardware. Virtual deployments typically take minutes, can
share currently deployed hardware, and can be adjusted “on the fly” when more resources are
required.
Application architectures, such as those provided by SharePoint 2010, are rapidly evolving towards
highly distributed, loosely-coupled applications. The conventional x86 computing model, in which
applications are tightly coupled to physical servers, is too static and restrictive to efficiently support
these complex applications. With a virtual deployment, the architecture can be as modular as is
appropriate, without expanding the hardware footprint. The dynamic nature of virtual machines means
that the design can grow and adapt as required, and the need for a “perfect” initial design is
eliminated.
Availability becomes a critical factor. In a physical environment, the cost and complexity of server
clustering is required to increase availability. In a highly distributed environment, VMware vSphere
can increase application availability at a much lower cost than using traditional HA strategies.
Rising datacenter costs (for power, cooling, floor space, and so on), even while some server
computing resources go under-utilized. It is well understood that server consolidation through
virtualization is a significant factor in reducing cost. The higher density available from vSphere versus
other products can provide significant savings on hardware and Microsoft licenses.
This paper demonstrates that virtualization with VMware vSphere can minimize the challenges in a
SharePoint deployment so that maximum commercial value is realized from an investment in SharePoint
systems.
1.1 Purpose
This guide provides best practice guidelines for deploying SharePoint 2010 on vSphere. The
recommendations in this guide are not specific to any particular set of hardware or to the size and scope
of any particular SharePoint implementation. The examples and considerations in this document provide
guidance only, and do not represent strict design requirements, because the flexibility of both SharePoint
and vSphere allows for a wide variety of valid configurations.
1.3 Scope
The scope of this document is limited to the following topics:
VMware ESX®/ESXi™ Host Best Practices for SharePoint – This section provides best practice
guidelines for properly preparing the vSphere platform for running SharePoint 2010. This section
includes guidance in the areas of CPU, memory, storage, and networking.
SharePoint Performance on vSphere – This section provides background information on SharePoint
performance in a virtual machine. It also provides information on official VMware partner testing and
guidelines for conducting and measuring internal performance tests.
SharePoint 2010 Capacity Management Concepts and Reference – Sizing SharePoint 2010 to run in
a virtual machine follows many of the same best practices as sizing on physical servers. This section
walks through the high points of the capacity management process with references to deeper dive
material on Microsoft TechNet.
vSphere Enhancements for Deployment and Operations – This section provides a brief look at
vSphere features and add-ons that enhance deployment and management of SharePoint 2010.
The following topic is out of scope for this document, but may be addressed in other documentation in this
solution kit: Availability and Recovery Options – Although this document briefly covers VMware features
that can enhance availability and recovery, a more in-depth discussion of this subject is covered in the
Microsoft SharePoint 2010 on VMware Availability and Recovery Options document.
This and other guides in this solution kit are limited in focus to deploying SharePoint on vSphere.
SharePoint deployments cover a wide subject area, and SharePoint-specific design principles must
always follow Microsoft guidelines for best results.
VMware recommends the following practices when considering the allocation of vCPUs for SharePoint
2010:
Allocate the minimum requirement for production virtual machines based on Microsoft guidelines, the
role of the virtual machine, and the size of the environment. Additional vCPUs can be added later if
necessary.
Test, development, and proof-of-concept environments can get along with fewer vCPUs allocated to
virtual machines. These environments typically require a fraction of the resources needed to satisfy
user demand in production.
When overcommitting CPU resources (number of vCPUs allocated to running virtual machines is
greater than the number of physical cores on a host), monitor the responsiveness of SharePoint to
understand the level of overcommitment which can be provided while still performing at an acceptable
level.
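The overcommitment condition described above (vCPUs allocated to running virtual machines exceeding the physical cores on a host) can be sketched as a simple ratio check. This is a minimal illustration, not a VMware tool: the function names and the example host (8 physical cores, five 2-vCPU virtual machines) are hypothetical.

```python
def cpu_overcommit_ratio(vcpus_per_vm, physical_cores):
    """Return total allocated vCPUs divided by the host's physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


def is_overcommitted(vcpus_per_vm, physical_cores):
    """True when more vCPUs are allocated than physical cores exist."""
    return sum(vcpus_per_vm) > physical_cores


# Hypothetical host: 8 physical cores running five 2-vCPU virtual machines.
example_vms = [2, 2, 2, 2, 2]
example_ratio = cpu_overcommit_ratio(example_vms, 8)
print(f"{sum(example_vms)} vCPUs on 8 cores, ratio {example_ratio:.2f}")
```

A ratio above 1.0 only signals that overcommitment exists. Whether it is acceptable still depends on the monitored responsiveness of SharePoint.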
SharePoint 2010 minimum processor requirements recommended by Microsoft may be excessive in
some environments. For this reason, VMware recommends reducing the number of virtual CPUs if
monitoring of the actual workload shows that the virtual machine is not benefitting from the increased
virtual CPUs. Having virtual CPUs allocated but sitting idle reduces the consolidation level and efficiency
of the ESX/ESXi host. For more background, see the “ESX CPU Considerations” section in the white
paper Performance Best Practices for VMware vSphere 4 at
http://www.vmware.com/pdf/Perf_Best_Practices_vSphere4.0.pdf.
2.1.3 Overcommitment
VMware conducted tests on virtual CPU overcommitment with SAP and SQL, showing that the
performance degradation inside the virtual machines is linearly reciprocal to the overcommitment.
Because the performance degradation is “graceful,” any virtual CPU overcommitment can be effectively
managed by using VMware DRS and VMware vSphere VMotion to move virtual machines to other
ESX/ESXi hosts to obtain more processing power. By intelligently implementing CPU overcommitment,
consolidation ratios of SharePoint Web front-end and application servers can be driven higher while
maintaining acceptable performance. If a virtual machine should not participate in
overcommitment, setting a CPU reservation provides a guaranteed CPU allocation for the virtual
machine. This practice is generally not recommended because the reserved resources are not available
to other virtual machines and flexibility is often required to manage changing workloads. However, SLAs
and multi-tenancy may require a guaranteed amount of compute resources to be available. In these
cases, reservations make sure that these requirements are met.
When choosing to overcommit CPU resources, monitor vSphere and SharePoint to be sure
responsiveness is maintained at an acceptable level. The following table lists counters that can be
monitored to help drive consolidation numbers higher while maintaining performance.
Table 1. Esxtop CPU Performance Metrics

| Metric | Description | Implication |
| --- | --- | --- |
| %RDY | Percentage of time a vCPU in a run queue is waiting for the CPU scheduler to let it run on a physical CPU. | A high %RDY time (use 20% as a starting point) may indicate the virtual machine is under resource contention. Monitor this; if application speed is OK, a higher threshold may be tolerated. |
| %MLMTD | Percentage of time a vCPU was ready to run but was deliberately not scheduled due to CPU limits. | A high %MLMTD time may indicate a CPU limit is holding the VM in a ready-to-run state. If the application is running slowly, consider increasing or removing the CPU limit. |
| %CSTP | Percentage of time a vCPU spent in the ready, co-descheduled state. Only meaningful for SMP virtual machines. | A high %CSTP time usually means that vCPUs are not being used in a balanced fashion. Evaluate the necessity for multiple vCPUs. |
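As a rough illustration of how the 20% %RDY starting point might be applied, the sketch below averages ready-time samples per virtual machine and flags those above the threshold. It assumes the samples have already been collected by some other means (for example, exported from esxtop); the function name, the data layout, and the virtual machine names are hypothetical.

```python
RDY_STARTING_THRESHOLD = 20.0  # percent; the starting point suggested in the text


def flag_high_ready(samples, threshold=RDY_STARTING_THRESHOLD):
    """Return the sorted names of VMs whose average %RDY exceeds the threshold.

    `samples` maps a VM name to a list of %RDY readings (assumed layout).
    """
    flagged = []
    for vm_name, readings in samples.items():
        if not readings:
            continue  # no data collected for this VM
        average = sum(readings) / len(readings)
        if average > threshold:
            flagged.append(vm_name)
    return sorted(flagged)


# Hypothetical readings for a Web front-end and a SQL Server virtual machine.
example_samples = {
    "wfe-01": [5.0, 7.5, 6.0],
    "sql-01": [22.0, 31.0, 25.0],
}
print(flag_high_ready(example_samples))
```

The threshold is a parameter because the text treats 20% as a starting point only. If application response remains acceptable, a higher value may be tolerated.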
2.1.4 Hyper-threading
Hyper-threading technology (recent versions of which are called simultaneous multithreading, or SMT)
enables a single physical processor core to behave like two logical processors, essentially allowing two
independent threads to run simultaneously. Unlike having twice as many processor cores—which can
roughly double performance—hyper-threading can provide anywhere from a slight to a significant
increase in system performance by keeping the processor pipeline busier. For example, an ESX/ESXi
host system enabled for SMT on an 8-core server sees 16 threads that appear as 16 logical processors.
The vSphere memory settings for a virtual machine include the following parameters:
Configured memory – Memory size of virtual machine assigned at creation.
Touched memory – Memory actually used by the virtual machine. vSphere allocates guest
operating system memory only on demand.
Swappable – Virtual machine memory that can be reclaimed by the balloon driver or by vSphere
swapping. Ballooning occurs before vSphere swapping. If this memory is in use by the virtual
machine (that is, touched and in use), the balloon driver causes the guest operating system to swap.
Also, this value is the size of the per-virtual machine swap file that is created on the VMware Virtual
Machine File System (VMFS) file system (VSWP file).
If the balloon driver is unable to reclaim memory quickly enough, or is disabled or not installed,
vSphere forcibly reclaims memory from the virtual machine using the VMkernel swap file.
Table 2. Esxtop Memory Counters

| Counter | Description | Implication |
| --- | --- | --- |
| SWAP/MB: r/s, w/s | The rate at which machine memory is swapped in and out of disk. | High rates of swapping affect guest performance. If free memory is low, consider moving virtual machines to other hosts. If free memory is OK, check resource limits on the virtual machines. |
| MCTLSZ | The amount of guest physical memory reclaimed by the balloon driver. | If the guest working set is smaller than guest physical memory after ballooning, no performance degradation is observed. However, investigate the cause for ballooning. It could be due to low host memory or a memory limit on the virtual machine. |
Figure 2 shows that VMware storage virtualization can be categorized into three layers of storage
technology:
The Storage array is the bottom layer, consisting of physical disks presented as logical disks (storage
array volumes or LUNs) to the layer above, with the virtual environment occupied by vSphere.
Storage array LUNs that are formatted as VMFS datastores that provide storage for virtual disks.
Virtual disks that are presented to the virtual machine and guest operating system as SCSI attached
disks that can be partitioned and used in file systems.
As shown in the figure, the following components make up the virtual network:
Physical switch – vSphere host-facing edge of the local area network.
Physical network interface (vmnic) – Provides connectivity between the ESX host and the local area
network.
vSwitch – The virtual switch is created in software and provides connectivity between virtual
machines. Virtual switches must uplink to a physical NIC (also known as vmnic) to provide virtual
machines with connectivity to the LAN. Otherwise, virtual machine traffic is contained within the virtual
switch.
Port group – Used to create a logical boundary within a virtual switch. This boundary can provide
VLAN segmentation when 802.1q trunking is passed from the physical switch, or can create a
boundary for policy settings.
Virtual NIC (vNIC) – Provides connectivity between the virtual machine and the virtual switch.
VMkernel (vmknic) – Interface for hypervisor functions such as connectivity for NFS, iSCSI, vMotion,
and FT logging.
Service Console (vswif) – Interface for the service console present in ESX Classic. Not present in
ESXi.
Virtual Adapter – Provides Management, vMotion, and FT Logging when connected to a vNetwork
Distributed Switch.
NIC Team – Group of physical NICs connected to the same physical/logical networks providing
redundancy.
Table 3. vSphere Performance Enhancements

| Feature | Description | Reference |
| --- | --- | --- |
| NUMA support | ESX/ESXi uses a NUMA load-balancer to assign a home node to a virtual machine. Because memory for the virtual machine is allocated from the home node, memory access is local and provides the best performance possible. Even applications that do not directly support NUMA benefit from this feature. | See VMware vSphere 4: The CPU Scheduler in VMware ESX 4 at http://www.vmware.com/files/pdf/perf-vsphere-cpu_scheduler.pdf |
| Memory ballooning | By using a balloon driver loaded in the guest operating system, the hypervisor can reclaim host physical memory if memory resources are under contention. This is done with little to no impact to the performance of the application. | See Understanding Memory Resource Management in VMware ESX 4.1 at http://www.vmware.com/files/pdf/techpaper/vsp_41_perf_memory_mgmt.pdf |
| Large memory page support | An application that can benefit from large pages on native systems, such as MS SQL, can potentially achieve a similar performance improvement on a virtual machine backed with large memory pages. Enabling large pages increases the memory page size from 4KB to 2MB. | See Performance Best Practices for VMware vSphere 4.0 at http://www.vmware.com/pdf/Perf_Best_Practices_vSphere4.0.pdf and Performance and Scalability of Microsoft SQL Server on VMware vSphere 4 at http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf |
| Para-virtualized network and storage controllers | The PVSCSI virtual storage and VMXNet3 virtual network adapters, which are available to the guest operating system after installation of VMware Tools, are high-performance virtual I/O adapters that can provide greater throughput while requiring lower CPU utilization. | See PVSCSI Storage Performance at http://www.vmware.com/pdf/vsp_4_pvscsi_perf.pdf and the VMware Blog at http://blogs.vmware.com/performance/2009/09/vsphere-40-introduces-a-new-para-virtualized-network-device---vmxnet3-we-recently-published-a-paper-demonstrating-its-perfo.html |
| Distributed Resource Scheduler (DRS) and vMotion | DRS dynamically balances computing capacity across a vSphere cluster, creating an aggregate of compute power. As resource utilization fluctuates within a vSphere cluster, workloads are migrated with no impact to performance or uptime. | See the VMware vSphere product overview at http://www.vmware.com/products/drs/overview.html |
In each use case, the Web server front-ends were configured with the Web Server role as well as the
query server role using Windows Network Load Balancing to distribute the load. The application server
virtual machine was used as a dedicated indexing server, and the SQL Server virtual machine held the
200GB of SharePoint user data. Each virtual machine was configured with two virtual CPUs and 4GB of
memory; the SQL Server virtual machine was allocated 16GB of memory.
During the first round of testing, the single Web server was the bottleneck with 100% vCPU utilization.
Figure 4. Percentage of Guest vCPUs Utilized in the Three Virtual Machine Configuration
The second test consisted of two Web server virtual machines, which were also the bottlenecks at 95%
and 97% vCPU utilization. The SQL Server utilization increased to 78%.
Figure 5. Percentage of Guest vCPUs Utilized in the Four Virtual Machine Configuration
In the third test, a third Web server virtual machine was added and the Web servers were no longer the
bottleneck; however, the SQL Server was close to saturation at 96% vCPU utilization.
Figure 6. Percentage of Guest vCPUs Utilized in the Five Virtual Machine Configuration
Between the first and last experiment, the number of supported heavy SharePoint users increased from
72,600 users to 171,600 users, a 136% increase, by simply scaling out the Web server virtual machines
on the same physical server.
Figure 7. Number of Heavy Users at 1% Concurrency
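The 136% figure quoted above follows directly from the two user counts. The short sketch below reproduces the arithmetic; the function and variable names are illustrative only.

```python
def percent_increase(before, after):
    """Return the relative growth from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before * 100.0


users_one_web_server = 72_600      # first test, from the text
users_three_web_servers = 171_600  # third test, from the text

growth = percent_increase(users_one_web_server, users_three_web_servers)
print(f"Supported heavy users grew by about {growth:.0f}%")
```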
The testing concluded that vSphere could provide very good results (response times less than three
seconds) while providing consolidation of a SharePoint environment. Additionally, vSphere features, such
as cloning and templates, enable the quick deployment of additional virtual machines to meet increased
demand and increase server utilization. See Microsoft Office SharePoint Server 2007 Performance on
VMware vSphere 4.1 at: http://www.vmware.com/resources/techresources/10130.
Windows Performance Monitor counters can be correlated with esxtop counters for compute resources
such as CPU, memory, and disk latencies. When used in conjunction with esxtop counters, the PerfMon
counters can be evaluated for their accuracy and used to pinpoint any bottlenecks in the system. See
Table 7 (more information is available at http://technet.microsoft.com/en-us/library/ff758658.aspx).
Table 7. SharePoint 2010 Performance Counters from TechNet (partial)
| Category | Counter | Description |
| --- | --- | --- |
| Processor | % Processor Time | This shows processor usage over a period of time. If this is consistently too high, you may find performance is adversely affected. Remember to count "Total" in multiprocessor systems. You can measure the utilization on each processor as well, to achieve balanced performance between cores. |
| Disk | Avg. Disk Queue Length | This shows the average number of read and write requests that were queued for the selected disk during the sample interval. A bigger disk queue length may not be a problem as long as disk reads/writes are not suffering and the system is working in a steady state without expanding queuing. |
| Disk | Avg. Disk Read Queue Length | The average number of read requests that are queued. |
| Disk | Avg. Disk Write Queue Length | The average number of write requests that are queued. |
| Memory | Available MBytes | This shows the amount of physical memory available for allocation. Insufficient memory leads to excessive use of the page file and an increase in the number of page faults per second. |
| Memory | Cache Faults/sec | This counter shows the rate at which faults occur when a page is sought in the file system cache and is not found. This may be a soft fault, when the page is found in memory, or a hard fault, when the page is on disk. The effective use of the cache for read and write operations can have a significant impact on server performance. You must monitor for increased cache failures, indicated by a reduction in the Async Fast Reads/sec or Read Aheads/sec. |
| Memory | Pages/sec | This counter shows the rate at which pages are read from or written to disk to resolve hard page faults. If this rises, it indicates system-wide performance problems. |
| Paging File | % Used and % Used Peak | The server paging file, sometimes called the swap file, holds "virtual" memory addresses on disk. Page faults occur when a process has to stop and wait while required "virtual" resources are retrieved from disk into memory. These are more frequent if the physical memory is inadequate. |
| NIC | Total Bytes/sec | This is the rate at which data is sent and received by way of the network interface card. You may need to investigate further if this rate is over 40–50 percent network capacity. To fine-tune your investigation, monitor Bytes Received/sec and Bytes Sent/sec. |
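The 40–50 percent guideline for the NIC counter can be checked by converting Total Bytes/sec into a share of link capacity. The sketch below is a minimal illustration; the function names, the 40 percent default, and the example values (60 MB/s on a 1 Gbps link) are assumptions rather than figures from the guide.

```python
def nic_utilization_percent(total_bytes_per_sec, link_speed_bits_per_sec):
    """Convert a Total Bytes/sec reading into percent of link capacity."""
    if link_speed_bits_per_sec <= 0:
        raise ValueError("link speed must be positive")
    return total_bytes_per_sec * 8 * 100.0 / link_speed_bits_per_sec


def needs_investigation(total_bytes_per_sec, link_speed_bits_per_sec,
                        threshold_percent=40.0):
    """True when utilization exceeds the lower bound of the 40-50% guideline."""
    utilization = nic_utilization_percent(total_bytes_per_sec,
                                          link_speed_bits_per_sec)
    return utilization > threshold_percent


# Hypothetical reading: 60 MB/s observed on a 1 Gbps adapter.
example_utilization = nic_utilization_percent(60_000_000, 1_000_000_000)
print(f"NIC utilization: {example_utilization:.1f}%")
```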
Requests per second (RPS) – The demand on the server farm expressed in the number of requests
processed by the farm per second, but with no differentiation between the type and size of requests.
RPS is highly dependent on an organization's unique usage characteristics.
Concurrent users – The number of distinct users generating requests in a given time frame.
Total daily users – The actual number of unique users in a 24-hour period, not the total number of
employees in the organization.
Total daily requests – All requests (except authentication handshake requests) over a 24-hour period.
At this time there are no definitive guidelines on what might be considered “normal” usage for SharePoint
2010 users; however, there is some information in the SharePoint 2007 TechNet documentation
(http://technet.microsoft.com/en-us/library/cc261795%28office.12%29.aspx) that may give you an idea of
what to expect in terms of user load. You can use the Requests Per Second Per User value to calculate
the overall RPS that you must support. For example, 5,000 concurrent “heavy” users, each at 60 requests
per hour, would translate into 0.017 requests per second per user, or 85 requests per second (RPS) for the
system as a whole.
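The RPS estimate above can be reproduced with a few lines of arithmetic. This sketch uses only the figures from the example (5,000 concurrent users at 60 requests per hour); the function name is illustrative. Note that the unrounded result is about 83.3 RPS, while the figure of 85 comes from rounding the per-user rate to 0.017 before multiplying.

```python
SECONDS_PER_HOUR = 3600


def requests_per_second(concurrent_users, requests_per_user_per_hour):
    """Estimate farm-wide RPS from user count and per-user hourly request rate."""
    return concurrent_users * requests_per_user_per_hour / SECONDS_PER_HOUR


per_user_rps = 60 / SECONDS_PER_HOUR       # about 0.0167, rounded to 0.017 in the text
farm_rps = requests_per_second(5_000, 60)  # unrounded estimate
rounded_farm_rps = 0.017 * 5_000           # estimate using the rounded per-user rate

print(f"Per-user rate: {per_user_rps:.4f} requests/sec")
print(f"Farm estimate: {farm_rps:.1f} RPS (unrounded), "
      f"{rounded_farm_rps:.0f} RPS (rounded per-user rate)")
```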
Table 9. SharePoint 2007 User Loads from Microsoft TechNet
D = the number of documents you expect to host in the content database. This number should
include both user documents and documents generated by automated processes. If you are migrating
from a previous version of SharePoint, these statistics can be gathered from the current environment.
If you are installing SharePoint for the first time, you can estimate the number of documents based on
existing file shares or third-party document repositories.
S = the average size of each document that will be stored in the content database. If average file size
differs significantly between groups of users, it may be useful to estimate averages for different types
or groups of sites. For example, Marketing users may have to store larger media files whereas
Human Resources users might only need to store relatively small, static documents.
L = the number of list items in the environment. List items are more difficult to estimate than
documents. If you are migrating from a previous version of SharePoint, these statistics can be
gathered from the current environment. If you are installing SharePoint from scratch, Microsoft uses
an estimate of three times the number of documents (D), but this estimate may need adjustment
based on actual production workload.
V = the approximate number of document versions. This value is usually much lower than the
maximum allowed number of versions. For the purposes of the formula, the value should be greater
than zero.
10KB = a rough estimate for amount of metadata required by SharePoint Server 2010. If you expect
that your system will use a significant amount of metadata, you may want to increase this constant.
Microsoft provides an example that you can use to apply the formula (see Storage and SQL Server
capacity planning and configuration (SharePoint Server 2010) at
http://technet.microsoft.com/en-us/library/cc298801.aspx). Consider the data inputs and values shown in the following table.
Table 10. Sample Content Database Sizing Criteria
If you plug these values into the formula you get a value of 105GB for the content database.
Database size = ((200,000 × 2) × 250KB) + (10KB × (600,000 + (200,000 × 2))) =
110,000,000KB or approximately 105GB
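The sizing calculation above can be written as a small function over the variables D, V, S, and L defined earlier, with 10KB as the metadata constant. This is a sketch of the arithmetic only; the function and parameter names are illustrative, and the inputs are the sample values used in the worked example.

```python
METADATA_KB_PER_ITEM = 10  # rough per-item metadata estimate from the text


def content_db_size_kb(documents, versions, avg_doc_size_kb, list_items,
                       metadata_kb=METADATA_KB_PER_ITEM):
    """Estimate content database size in KB: ((D x V) x S) + (10KB x (L + (V x D)))."""
    document_storage = documents * versions * avg_doc_size_kb
    metadata_storage = metadata_kb * (list_items + versions * documents)
    return document_storage + metadata_storage


# Sample values from the worked example: D=200,000, V=2, S=250KB, L=600,000.
size_kb = content_db_size_kb(200_000, 2, 250, 600_000)
size_gb = size_kb / (1024 * 1024)
print(f"Estimated content database size: {size_kb:,} KB (about {size_gb:.0f} GB)")
```

Exposing the metadata constant as a parameter reflects the guidance that it may need to be increased for metadata-heavy systems.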
Up to 2TB 32GB
Table 12 shows the server requirements from Hardware and software requirements (SharePoint Server
2010) at http://technet.microsoft.com/en-us/library/cc262485.aspx.
Table 12. Web, App, and Single Server Minimum Requirements from Microsoft TechNet