Dell Compellent Best Practices with VMware vSphere 5.x
Revisions

Date             Description
September 2011   Initial Release
October 2012
January 2014
May 2014         Added SCOS 6.5 updates, including compression, multi-VLAN, Sync Live Volume
September 2014
THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND
TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF
ANY KIND.
© 2014 Dell Inc. All rights reserved. Reproduction of this material in any manner whatsoever without the express
written permission of Dell Inc. is strictly forbidden. For more information, contact Dell.
PRODUCT WARRANTIES APPLICABLE TO THE DELL PRODUCTS DESCRIBED IN THIS DOCUMENT MAY BE FOUND
AT: http://www.dell.com/learn/us/en/19/terms-of-sale-commercial-and-public-sector Performance of network
reference architectures discussed in this document may vary with differing deployment conditions, network loads, and
the like. Third party products may be included in reference architectures for the convenience of the reader. Inclusion
of such third party products does not necessarily constitute Dell's recommendation of those products. Please consult
your Dell representative for additional information.
Trademarks used in this text:
Dell, the Dell logo, Dell Boomi, Dell Precision, OptiPlex, Latitude, PowerEdge, PowerVault,
PowerConnect, OpenManage, EqualLogic, Compellent, KACE, FlexAddress, Force10 and Vostro are
trademarks of Dell Inc. Other Dell trademarks may be used in this document. Cisco Nexus, Cisco MDS, Cisco NX-OS, and Cisco Catalyst are registered trademarks of Cisco Systems Inc. EMC VNX and EMC Unisphere are
registered trademarks of EMC Corporation. Intel, Pentium, Xeon, Core and Celeron are registered trademarks of
Intel Corporation in the U.S. and other countries. AMD is a registered trademark and AMD Opteron, AMD
Phenom and AMD Sempron are trademarks of Advanced Micro Devices, Inc. Microsoft, Windows, Windows
Server, Internet Explorer, MS-DOS, Windows Vista and Active Directory are either trademarks or registered
trademarks of Microsoft Corporation in the United States and/or other countries. Red Hat and Red Hat Enterprise
Linux are registered trademarks of Red Hat, Inc. in the United States and/or other countries. Novell and SUSE are
registered trademarks of Novell Inc. in the United States and other countries. Oracle is a registered trademark of
Oracle Corporation and/or its affiliates. Citrix, Xen, XenServer and XenMotion are either registered trademarks or
trademarks of Citrix Systems, Inc. in the United States and/or other countries. VMware, Virtual SMP, vMotion,
vCenter and vSphere are registered trademarks or trademarks of VMware, Inc. in the United States or other
countries. IBM is a registered trademark of International Business Machines Corporation. Broadcom and
NetXtreme are registered trademarks of Broadcom Corporation. Qlogic is a registered trademark of QLogic
Corporation. Other trademarks and trade names may be used in this document to refer to either the entities claiming
the marks and/or names or their products and are the property of their respective owners. Dell disclaims proprietary
interest in the marks and names of others.
Table of contents
Revisions ............................................................................................................................................................................................. 2
Executive summary .......................................................................................................................................................................... 8
3.3    Modifying the VMFS queue depth for virtual machines (DSNRO) ........................................................................ 15
6.5    Configuring the VMware iSCSI software initiator for a single path ...................................................................... 25
8.4.1  VMFS-3 ............................................................................................................................................................................ 41
8.4.2  VMFS-5 ............................................................................................................................................................................ 41
11.1   Compression .................................................................................................................................................................. 50
Executive summary
This document provides configuration examples, tips, recommended settings, and other storage
guidelines that a user can follow while integrating VMware ESXi 5.x hosts with the Storage Center. This
document has been written to answer many frequently asked questions with regard to how VMware
interacts with the Storage Center's various features such as Dynamic Capacity (thin provisioning), Data
Progression (automated tiering), and Remote Instant Replay (replication).
Prerequisites
This document assumes the reader has had formal training or has advanced working knowledge of the
following:
Installation and configuration of VMware vSphere 5.x
Configuration and operation of the Dell Compellent Storage Center
Operating systems such as Windows or Linux
Dell Compellent advises customers to read the vSphere Storage Guide, which is publicly available on the
vSphere documentation pages, for additional important information about configuring ESXi hosts to
use the SAN.
Intended audience
This document is highly technical and intended for storage and server administrators, as well as other
information technology professionals interested in learning more about how VMware vSphere 5.x
integrates with Storage Center.
Note: The information contained within this document is intended only to be general recommendations
and may not be applicable to all configurations. There are certain circumstances and environments
where the configuration may vary based upon individual or business needs.
1.1
1.2   Port zoning
If the Storage Center front-end ports are plugged into switch ports 0, 1, 2, & 3, and the first ESXi HBA port
is plugged into switch port 10, the resulting zone should contain switch ports 0, 1, 2, 3, & 10.
Repeat this for each of the HBAs in the ESXi host. If the environment has multiple fabrics, the additional
HBA ports in the host should have separate unique zones created in their respective fabrics.
Caution: Due to the added difficulty of supporting port zoning, WWN zoning is preferred over port zoning.
1.3   WWN zoning
When zoning by WWN, the zone only needs to contain the host HBA port and the Storage Center front-end
primary ports. In most cases, it is not necessary to include the Storage Center front-end reserve ports
because they are not used for volume mappings. For example, if the host has two HBAs connected
to two disjoint fabrics, the Fibre Channel zones would look similar to this:
Name: ESX1-HBA1
WWN: 2100001B32017114
WWN: 5000D31000036001
WWN: 5000D31000036009
Name: ESX1-HBA2
WWN: 210000E08B930AA6
WWN: 5000D31000036002
WWN: 5000D3100003600A
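For reference, zones like the ones above can also be created from the Fibre Channel switch CLI. The following sketch assumes a Brocade FOS switch and an existing zone configuration named Fabric1_cfg; the syntax and configuration name are examples and will differ on other switch platforms.

zonecreate "ESX1-HBA1", "21:00:00:1b:32:01:71:14; 50:00:d3:10:00:03:60:01; 50:00:d3:10:00:03:60:09"
cfgadd "Fabric1_cfg", "ESX1-HBA1"
cfgsave
cfgenable "Fabric1_cfg"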
1.4   Virtual ports
If the Storage Center is configured to use Virtual Port Mode, all of the Front End virtual ports within each
Fault Domain should be included in the zone with each ESXi initiator.
Figure 1
2.1
2.2
The connection options field should be set to 1 for point to point only
The login retry count field should be set to 60 attempts
The port down retry count field should be set to 60 attempts
The link down timeout field should be set to 30 seconds
The queue depth (or Execution Throttle) field should be set to 255.
- This queue depth can be set to 255 because the ESXi VMkernel driver module and DSNRO can
more conveniently control the queue depth
2.3
2.4
(Figure: default queue depth values at each layer of the I/O stack: application dependent, 32, 32, 64, and varies by adapter)
The following sections explain how the queue depth is set in each of the layers in the event it needs to be
changed.
Caution: The appropriate queue depth for a host may vary due to a number of factors, so it is
recommended to only increase or decrease the queue depth if necessary. See Appendix A for more info
on determining the proper queue depth.
3.1
3.2
In addition to setting the queue depth in the driver module, the disk timeouts must also be set within the
same command. These timeouts need to be set in order for the ESXi host to properly survive a Storage
Center controller failover.
Please refer to the latest VMware documentation for instructions on how to configure these settings:
VMware document: vSphere Troubleshooting
- Section Title: Adjust Queue Depth for QLogic and Emulex HBAs
Caution: Before executing these commands, please refer to the latest documentation from VMware for
any last minute additions or changes.
For each of these adapters, the method to set the driver queue depth and timeouts uses the following
general steps:
1. Find the appropriate driver name for the module that is loaded:
   a. For QLogic: (i.e. -m=qla2xxx)
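As a minimal command-line sketch of this procedure, the example below assumes the legacy qla2xxx driver and its ql2xmaxqdepth and qlport_down_retry parameters; newer ESXi builds may load the qlnativefc driver instead, which uses different parameter names, so always confirm the exact parameters against the vSphere Troubleshooting document referenced above.

# Identify the loaded QLogic driver module
esxcli system module list | grep ql
# Set the queue depth and port-down timeout (example values), then reboot the host
esxcli system module parameters set -m qla2xxx -p "ql2xmaxqdepth=255 qlport_down_retry=60"
# After the reboot, verify the parameters
esxcli system module parameters list -m qla2xxx | grep -E "ql2xmaxqdepth|qlport_down_retry"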
Note: In certain earlier versions of ESXi 5.x, the option to set the Login Timeout parameter is not
available. In order to enable the login timeout parameter, it may require applying the patch as described
in VMware KB article 2007680.
3.3
Figure 2
In ESXi 5.5 and later, the DSNRO setting has been changed from a global value to a per LUN value set via
the command line:
esxcli storage core device set -d <naa.dev> -O <value of 1-256>
This allows fine-tuning of DSNRO on a per-volume basis; however, the drawback is that it must be set on
each datastore (and on each host) if the desired queue depth is greater than 32.
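As a sketch, the commands below list a device and set its DSNRO value to 64; the naa identifier shown is a made-up example, and the appropriate value depends on the workload (see Appendix A).

# Find the naa identifier of the Compellent volume backing the datastore
esxcli storage core device list | grep -i "naa.6000d31"
# Set DSNRO for that device (repeat on every host that accesses the datastore)
esxcli storage core device set -d naa.6000d310000360000000000000000001 -O 64
# Verify the "No of outstanding IOs with competing worlds" value
esxcli storage core device list -d naa.6000d310000360000000000000000001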
Figure 3
Note: The Disk.SchedNumReqOutstanding limit does not apply to LUNs mapped as Raw Device
Mappings (RDMs). Each RDM will have its own queue.
More information on the Disk.SchedNumReqOutstanding variable can be found in the following
documents:
VMware document: vSphere Troubleshooting
- Section: Change Maximum Outstanding Disk Requests in the vSphere Web Client
VMware KB Article: Setting the Maximum Outstanding Disk Requests for virtual machines
- Link: http://kb.vmware.com/kb/1268
3.4
3.5
Figure 4
Figure 5   Registry setting for the LSI Logic SAS vSCSI Adapter (Server 2008/R2)
1. Search for and download the following driver from the LSI Logic download page:
Filename: LSI_U320_W2003_IT_MID1011438.zip
Adapter: LSI20320-R
Driver: Windows Server 2003 (32-bit)
Version: WHQL 1.20.18 (Dated: 13-JUN-05)
2. Update the current LSI Logic PCI-X Ultra320 SCSI HBA driver to the newer WHQL driver version
1.20.18.
3. Using regedit, add the following keys: (Backup the registry first)
Windows Registry Editor Version 5.00
[HKLM\SYSTEM\CurrentControlSet\Services\symmpi\Parameters\Device]
"DriverParameter"="MaximumTargetQueueDepth=128;"
; The semicolon is required at the end of the queue depth value
"MaximumTargetQueueDepth"=dword:00000080
; 80 hex is equal to 128 in decimal
Figure 6   Registry setting for the LSI Logic Parallel vSCSI Adapter (Server 2003)
Using the registry editor, modify the following key: (Backup the registry first)
Figure 7
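Figure 7 is not reproduced here. Assuming it shows the standard Windows disk timeout value referenced by the note below, the registry setting typically looks like the following (60 seconds, expressed in hex):

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk]
"TimeOutValue"=dword:0000003c
; 3c hex is equal to 60 seconds in decimal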
Note: This registry value is automatically set when installing VMware Tools version 3.0.2 and later. For
more information, please see the VMware KB Article 1014
Linux
For more information about setting disk timeouts in Linux, please refer to the following VMware
Knowledge Base articles:
Increasing the disk timeout values for a Linux virtual machine
- http://kb.vmware.com/kb/1009465
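As a minimal example of the approach described in the KB article, the SCSI command timeout for a disk can be checked and raised from within the guest; the 180 second value is only an example, and the change does not persist across reboots unless it is also applied through a udev rule or by VMware Tools.

# Check the current timeout (in seconds) for /dev/sda
cat /sys/block/sda/device/timeout
# Temporarily raise it to 180 seconds
echo 180 > /sys/block/sda/device/timeout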
Figure 8
BusLogic Parallel
This vSCSI controller is used for older operating systems. Due to this controller's queue depth limitations,
its use is not recommended unless it is the only option available for that particular operating system. This
is because when using certain versions of Windows, the OS issues only enough I/O to fill a queue depth of
one.
LSI Logic Parallel
Since this vSCSI adapter is supported by many operating system versions, and is a good overall choice, it is
recommended for virtual machines that do not support the LSI Logic SAS adapter.
LSI Logic SAS
This vSCSI controller is available for virtual machines with hardware versions 7 and 8, and has performance
characteristics similar to the LSI Logic Parallel adapter. This adapter is required for MSCS Clustering in
Windows Server 2008 because SCSI-3 reservations are needed. Some operating system vendors are
gradually withdrawing support for parallel SCSI in favor of SAS, making the LSI Logic SAS controller a
good choice for future compatibility.
VMware Paravirtual
This vSCSI controller is a high-performance adapter that can result in greater throughput and lower CPU
utilization. Due to feature limitations when using this adapter, we recommend against using it unless the
virtual machine has very specific performance needs. More information about the limitations of this
adapter can be found in the vSphere Virtual Machine Administration Guide, in a section titled, About
VMware Paravirtual SCSI Controllers.
6.1
6.2
Figure 9
As an added benefit, when a new ESXi host is placed into the server cluster, all of the existing volume
mappings assigned to the cluster object will be applied to the new host. This means that if the cluster has
100 volumes mapped to it, presenting all of them to a newly created ESXi host is as simple as adding it to
the cluster object.
Similarly, if the host is removed from the server cluster, the cluster mappings will also be removed, so it is
important that those volumes are not being used by the host when they are removed. Only volumes that
are mapped to an individual host, such as the boot volume, will remain once a host is removed from the
server cluster.
Also in the mapping wizard, the system can auto select the LUN number, or a preferred LUN number can
be manually specified from the advanced settings screen as shown in the figure below.
6.3
Note: If the LUN number is not kept consistent between multiple hosts or multiple HBAs, VMFS
datastores may not be visible to all nodes, preventing use of vMotion, HA, DRS, or FT.
Keep in mind that when a volume uses multiple paths, the first ESXi initiator in each server will need to be
mapped to one front end port, while the second ESXi initiator will be mapped to the other front end port
in that same controller. For example:
"LUN10-vm-storage" Controller1/PrimaryPort1 FC-Switch-1 Mapped to ESX1/HBA1 as LUN 10
"LUN10-vm-storage" Controller1/PrimaryPort2 FC-Switch-2 Mapped to ESX1/HBA2 as LUN 10
Likewise, if a different volume is active on the second controller, it may be mapped as follows:
"LUN20-vm-storage" Controller2/PrimaryPort1 FC-Switch-1 Mapped to ESX1/HBA1 as LUN 20
"LUN20-vm-storage" Controller2/PrimaryPort2 FC-Switch-2 Mapped to ESX1/HBA2 as LUN 20
This means that when configuring multipathing in ESXi, a single volume cannot be mapped to both
controllers at the same time, because a volume can only be active on one controller at a time.
6.4
Table 1   Advanced mapping options

Function                  Description
Select LUN                This option is used to manually specify the LUN. If this box is not checked, the system will automatically assign the next available LUN.
Map to Controller
Configure Multipathing    This option designates how many of the Storage Center front-end ports the volume can be mapped through. For example, if each controller has 4 front-end ports, selecting unlimited will map the volume through all 4, whereas selecting 2 will use only 2 of the 4 front-end ports. The system automatically selects the 2 front-end ports that already have the fewest mappings.

6.5   Configuring the VMware iSCSI software initiator for a single path

From within the VMware vSphere Client:
Enable the "Software iSCSI Client" within the ESXi firewall (in the "Security Profile" of the ESXi host)
Add a "VMkernel port" to a virtual switch assigned to the physical NIC for iSCSI (See figures below)
From the Storage Adapters configuration screen, click Add
Select Add Software iSCSI Adapter, then click OK
From within the Storage Adapters, highlight the iSCSI Software Adapter (i.e. vmhba33), click
Properties
6. Under the Dynamic Discovery tab, add all of the Storage Center iSCSI IP addresses that are assigned
to the iSCSI cards in the Storage Center controller(s), or just the iSCSI Control Port IP address.
7. Rescan the iSCSI Initiator.
From Within the Storage Center GUI or Enterprise Manager:
8. Create a server object for the ESXi host using the IP Address previously specified for the VMkernel in
step 2 above
9. Map a volume to the ESXi host
From within the VMware vSphere Client:
10. Navigate to the Storage Adapters section, and rescan the iSCSI HBA for new LUNs.
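Most of the vSphere Client steps above can also be performed with esxcli. The sketch below assumes the software adapter is vmhba33 and uses a placeholder control port IP address; substitute the actual values for the environment.

# Enable the software iSCSI initiator
esxcli iscsi software set --enabled=true
# Add the Storage Center iSCSI control port as a dynamic discovery (send targets) address
esxcli iscsi adapter discovery sendtarget add -A vmhba33 -a 10.10.10.100:3260
# Rescan the adapter for new LUNs
esxcli storage core adapter rescan -A vmhba33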
6.6
Note: Using port binding is not recommended when the VMkernel ports are on different networks, as it
may cause very long rescan times and other storage management problems. Please see the following
VMware KB article for more information: Considerations for using software iSCSI port binding in
ESX/ESXi (2038869)
Note: Some older versions of Storage Center allowed adding iSCSI initiators by either IP address or by
iSCSI Names. If prompted with that choice, using iSCSI Names is recommended, as shown in the figure
below.
6.6.1
If the network is composed of multiple separate subnets for iSCSI, then use a separate vSwitch with a single
iSCSI virtual port and a single NIC. Once again, more iSCSI virtual ports can be used if there are more front-end
ports in the controller.
For example, if the subnets have different IP addressing schemes, create the vSwitches as shown below.
Note: Using port binding is not recommended when the VMkernel ports are on different networks, as it
may cause very long rescan times and other storage management problems. Please see the following
VMware KB article for more information: Considerations for using software iSCSI port binding in
ESX/ESXi (2038869)
6.7
Figure 18 VLANs assigned to vSwitches isolating in-guest iSCSI initiator traffic for customers
Since iSCSI traffic is not encrypted, the best practice is to isolate that traffic for security
purposes. A common misconception is that CHAP encrypts iSCSI traffic; in reality, it simply provides
authentication for the connection to prevent unauthorized access.
NOTE: The configuration of iSCSI VLANs is new functionality that can only be configured through the
Enterprise Manager Client, and is unavailable through the standard Storage Center System Manager Web
GUI.
6.8
6.9
6.9.1   Fixed policy
If the Fixed policy is used, it will give the greatest control over the flow of storage traffic. However, one
must be very careful to evenly distribute the load across all host HBAs, Front-End Ports, fabrics, and
Storage Center controllers.
When using the fixed policy, if a path fails, all of the LUNs using it as their preferred path will fail over to the
secondary path. When service resumes, the LUNs will resume I/O on their preferred path.
Fixed Example: (See figure below)
HBA1 loses connectivity; HBA2 takes over its connections.
HBA1 resumes connectivity; HBA2 will fail its connections back to HBA1.
6.9.2   Round robin
The round robin path selection policy uses automatic path selection and load balancing to rotate I/O
through all paths. It is important to note that round robin load balancing does not aggregate the storage
link bandwidth; it merely distributes the load for the volumes in bursts evenly and sequentially across paths
in an alternating fashion.
Using round robin reduces the management overhead of manually balancing the storage load across
all storage paths as is required with a fixed policy; however, there are certain situations where using round robin does
not make sense. For instance, it is generally not considered best practice to enable round robin between
an iSCSI path and a Fibre Channel path, or to use it to balance the load between an 8 Gb FC path and a 16 Gb
FC path. If round robin is enabled for one or more datastores/LUNs, care must be taken to ensure all the
paths included are identical in type and speed, and have the same queue depth settings.
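As a sketch, round robin can be set on an individual device, or made the default for all Compellent volumes with a SATP claim rule; the naa identifier below is a made-up example, and the COMPELNT vendor string and VMW_SATP_DEFAULT_AA module should be verified against the host's current claim rules before relying on them.

# Set round robin on a single device
esxcli storage nmp device set -d naa.6000d310000360000000000000000001 -P VMW_PSP_RR
# Or create a claim rule so newly claimed Compellent volumes default to round robin
esxcli storage nmp satp rule add -s VMW_SATP_DEFAULT_AA -V COMPELNT -P VMW_PSP_RR -e "Compellent round robin"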
Here is an example of what happens during a path failure using round robin.
Caution: The round robin path selection policy may be unsupported for use with Microsoft Clustering
Services dependent on the version of ESXi, so it is advised to check current support status.
6.9.3
6.10
6.11
6.12
6.13
1. Make note of the volume's naa identifier. This will be referenced later.
2. From the datastore view, right-click on the datastore and select Unmount.
6.14
7.1
Once the ESXi host is up and is running correctly, the second path can then be added to the boot volume
by modifying the mapping. To do this, right-click on the mapping and select Modify Mapping.
Uncheck: Only map using specified server ports
Maximum number of paths allowed: Unlimited
Once the second path has been added, the HBAs on the ESXi host can be rescanned.
8.1
Note: Within certain versions of ESXi 5.x, hosts may have a maximum addressable storage limit of 4 TB,
8 TB, or 60 TB due to the VMFS heap size. Please read the following VMware KB article for more info:
http://kb.vmware.com/kb/1004424
8.2
in some cases the appropriate number for high I/O load virtual machines may be less than 5, while the
number of virtual machines with low I/O disk requirements may be 25 or more.
Since the appropriate number of virtual machines that can be put onto each datastore is subjective and
dependent on the environment, a good recommendation is to start with 15 virtual machines, and
increase/decrease the number of virtual machines on each datastore as needed. Moving virtual machines
between datastores can even be done non-disruptively when licensed to use VMware's Storage vMotion
feature.
The most common indicator that the datastore has too many virtual machines placed on it is if the queue
depth of the datastore is regularly exceeding set limits, thereby increasing disk latency. Remember that if
the driver module is set to a 256 queue depth, the maximum queue depth of each datastore is also 256.
This means that if there are 16 virtual machines on a datastore all heavily driving a 32 queue depth (16 * 32
= 512), they are essentially overdriving the disk queues by double, and the resulting high latency will most
likely degrade performance. (See Appendix A for more information on determining if the queue depth of a
datastore is being correctly utilized.)
A second less common indicator that a datastore has too many virtual machines placed on it would be the
frequent occurrence of SCSI Reservation Conflicts as seen when monitoring with esxtop. New to ESXi
5.x is a field in the Disk Device screen for Reserve Stats (RESVSTATS). That said, when monitoring it is
normal to see a few reservations (RESV/s) entries and even a few conflicts (CONS/s) from time to time, but
when noticing conflicts (CONS/s) happening very frequently on a particular volume, it may be time to
move some of the virtual machines to a different datastore. The VAAI Hardware Assisted Locking primitive
will help to alleviate the performance degradation caused by these SCSI-2 reservations, so if datastores are
impacted, it is recommended to upgrade to a version of Storage Center firmware that supports this VAAI
primitive. (See the section below on VAAI for more information.)
Note: There are many resources available that discuss VMware infrastructure design and sizing, so this
should only be used as a general rule of thumb, and may vary based upon the needs of the environment.
8.3
whether or not to align VMFS partitions, it is recommended that testing is performed to determine the
impact that an aligned partition may have on particular applications because all workloads are different.
To manually align the VMFS block boundaries to the Storage Center page boundaries for performance
testing, the recommended offset when creating a new datastore is 8192 (or 4 MB).
Note: Using the Compellent vSphere Client Plug-in to create new datastores will automatically align
them to the recommended offset. Also note that guest partition alignment is generally not necessary
when using Windows Server 2008 and above, due to a change in partition offset defaults from previous
versions.
Figure 26 This is an example of a fully aligned partition in the Storage Center, where one guest I/O will
only access necessary physical disk sectors
Figure 27 This is an example of an unaligned partition in a traditional SAN where performance can be
improved by alignment
8.4
8.4.1   VMFS-3
With VMFS-3, choosing a block size for a datastore determines the maximum size of a VMDK file that can
be placed on it. In other words, the block size should be chosen based on the largest virtual disk planned to
be put on the datastore. Choose this datastore file system version if backwards compatibility with ESX 4.x
hosts is needed.
Table 2   VMFS-3 block sizes and maximum virtual disk sizes

Block size    Maximum VMDK size
1 MB          256 GB
2 MB          512 GB
4 MB          1024 GB
8 MB          2048 GB
The default block size is 1 MB, so if virtual disks need to be sized greater than 256 GB, this value will need
to be increased. For example, if the largest virtual disk to be placed on a datastore is 200 GB, then a 1 MB
block size should be sufficient, and similarly, if there is a virtual machine that will require a 400 GB virtual
disk, then the 2 MB block size should be sufficient.
One should also consider future growth of the virtual machine disks, and the future upgrade to VMFS-5
when choosing the block size. If a virtual machine resides on a datastore formatted with a 1 MB block size,
and in the future it needs one of its virtual disks extended beyond 256 GB, the virtual machine would have
to be relocated to a different datastore with a larger block size. Also remember that if a VMFS-3 datastore
with a 2 MB block size is upgraded to VMFS-5, the block size remains at 2 MB, which can cause VAAI
acceleration to be skipped and the host to use the default DataMover instead.
Note: Since certain VAAI offload operations require that the source and destination datastores have the
same VMFS block size, it is worth considering a standard block size for all datastores. Please consult the
vStorage APIs for Array Integration FAQ for more information.
8.4.2   VMFS-5
With VMFS-5, the only available block size is 1 MB, allowing for up to a 64 TB datastore and, if running ESXi
5.5, up to a 62 TB VMDK. This format also allows the VAAI Dead Space Reclamation primitive (SCSI
UNMAP) to reclaim storage after a VMDK is deleted (See section below on VAAI). Keep in mind, this file
system version is not backwards compatible with ESX 4.x hosts, so there are special considerations when
using or migrating to this format.
9.1
9.1.1
9.1.2
Create one paired datastore for the corresponding virtual machine page files
- This should contain virtual machine page files. Using Windows as an example, one would
create a 2GB - 16GB virtual disk (P:) on this volume to store the Windows paging file for each
virtual machine.
- This volume can be sized considerably smaller than the main datastore as it only needs
enough space to store page files.
Often the question is asked whether or not it is a good idea to place all of the operating system page files
on a single datastore. Generally speaking, this is not a good practice for a couple of reasons.
First, the page file datastore can also experience contention from queue depth utilization or disk I/O, so
too many VMDK files during a sudden memory swapping event could decrease performance even further.
For example, if a node in the ESXi HA cluster fails and the affected virtual machines are consolidated on
the remaining hosts, the sudden reduction in overall memory could cause a sudden increase in paging
activity that could overload the datastore, causing a storage performance decrease.
Second, that datastore becomes a single point of failure. Operating systems are usually not tolerant of disk
drives being unexpectedly removed. Keeping the page files split across multiple smaller paired datastores
also means that if an administrator were to accidentally unmap one of the page file volumes, the failure
domain would be isolated to a subset of the virtual machines instead of all of them.
9.1.3
9.1.4
Timesaver: To help organize the LUN layout for ESXi clusters, some administrators prefer to store their
layout in a spreadsheet. Not only does this help to design their LUN layout in advance, but it also helps
keep things straight as the clusters grow larger.
Note: There are many factors that may influence architecting storage with respect to the placement of
virtual machines. The method shown above is merely a suggestion, as business needs may dictate
different alternatives.
9.2
Statistical Reporting
- Storage usage and performance can be monitored for an individual virtual machine
Backup/Restore of an entire virtual machine is simplified
- If a VM needs to be restored, an administrator can just unmap/remap a replay in its place
Disadvantages
There will be a maximum of 256 virtual machines in the ESXi cluster
- The HBA has a maximum limit of 256 LUNs that can be mapped to the ESXi host, and since we
can only use each LUN number once when mapping across multiple ESXi hosts, it would
essentially have a 256 virtual machine limit (assuming that no extra LUNs would be needed for
recoveries).
Increased administrative overhead
- Managing a LUN for each virtual machine and all the corresponding mappings may get
challenging
10
11
As previously mentioned at the beginning of this section, unless there are specific business needs that
require a particular virtual machine or application to have a specific RAID type, our recommendation is to
keep the configuration simple. In most cases, the Data Progression Recommended setting can be used
to automatically classify and migrate data based upon usage.
Note: A replay schedule should be created for each volume that (at a minimum) takes one daily replay that
does not expire for 25 hours or more. This will have a dramatic effect on Data Progression behavior, which
will increase the overall system performance.
11.1   Compression
With the introduction of Storage Center 6.5, Data Progression has the ability to compress data as part of its
daily processing cycle. This block level compression uses the Dell LZPS algorithm to reduce the size of
Replay data stored on Tier 3 drives in a manner that has a low impact on performance.
It is important to understand that in the first release of this feature, Data Progression will only compress
pages contained in Tier 3 that are classified as inaccessible. In the example figure above, when Data Progression
runs, the inactive pages that are color coded grey are the only pages eligible to be compressed and
immediately migrated to Tier 3. This reduces any negative performance impact to the datastore because
the host does not have direct access to any of the blocks contained in the compressed pages.
The only time a compressed page could be accessed is as part of a View volume created from a Replay for
a recovery, in which case the data is simply decompressed on the fly.
Note: For those customers that are using Replays and View volumes for block level cloning of virtual
machine datastores, it is important to note that there may be read performance impacts if compression is
enabled on the cloned or master datastore volumes.
For more information about the compression feature, please read the Dell Compellent Storage Center
6.5.1 and Data Compression feature brief on Dell Tech Center.
11.2
12
12.1
12.1.1
12.1.2
12.1.3   Thin provisioned (a.k.a. thin)
The logical space required for the virtual disk is not allocated during creation; instead, it is allocated on
demand during the first write issued to each block. Just like thick disks, this format will also zero out the block
before writing data, inducing extra I/O and an additional amount of write latency.
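For illustration, a thin provisioned virtual disk can be created (or later inflated) with vmkfstools; the size and paths below are examples only.

# Create a 40 GB thin provisioned virtual disk
vmkfstools -c 40G -d thin /vmfs/volumes/datastore1/vm1/vm1_1.vmdk
# Inflate it to an eager zeroed thick disk later, if required
vmkfstools -j /vmfs/volumes/datastore1/vm1/vm1_1.vmdk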
12.1.4
12.2
12.3
12.4
12.5
example PowerShell script, which can be used to backup physical mode RDMs as part of
the pre-execution steps of the backup job.
The free space recovery agent will also work with volumes mapped directly to the virtual machine
via the Microsoft Software iSCSI initiator.
- Volumes mapped to the virtual machine through the Microsoft iSCSI initiator interact with the
SAN directly, and consequently, space recovery works as intended.
For more information on Windows free space recovery and compatible versions of Windows,
please consult the Dell Compellent Enterprise Manager User Guide.
13
13.1
13.1.1
Figure 33 Datastore2 and Datastore3 can be grown by 100GB, but Datastore1 cannot
To extend the space at the end of a Storage Center volume as shown above, use either the
Storage Center GUI or Enterprise Manager. After the volume has been extended and the host's HBA has
been rescanned, the properties of the datastore can be edited to grow it by clicking the Increase
button and then following the Increase Datastore Capacity wizard.
Be careful to select the volume that is Expandable; otherwise, the action will unintentionally add a
VMFS extent to the datastore (see the section below on VMFS extents).
Figure 34 Screenshot from the wizard after extending a 500GB datastore by 100 GB
Warning: If a VMFS-3 volume (or pRDM residing on a VMFS-3 datastore) is extended beyond the 2 TB
limits, that volume will become inaccessible by the ESXi host. If this happens, the most likely scenario
will result in recovering data from a replay or tape.
Note: As an alternative to extending a datastore volume when a virtual machine needs additional disk
space, consider creating a new datastore volume and migrating that virtual machine. This will help to
keep volume sizes manageable, as well as help to keep any single datastore from being overloaded due
to I/O contention.
Note: All of the above tasks can be automated by using the Dell Compellent vSphere Client Plug-in. This
can be downloaded from the Dell Compellent Knowledge Center.
13.1.2
13.2
Figure 35 Growing a virtual disk from the virtual machine properties screen
For Windows machines: After growing the virtual disk from the vSphere client, an administrator must log
into the virtual machine, rescan for new disks, and then use DISKPART or the disk management MMC
console to extend the drive.
Warning: Microsoft does not support extending the system partition (C: drive) of a machine in certain
versions of Windows.
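As a rough command-line equivalent of the procedure above (the paths, size, and volume number are examples only), the virtual disk can be grown with vmkfstools and then extended inside the guest with DISKPART:

# Grow the virtual disk to 60 GB (the VMDK must not have snapshots)
vmkfstools -X 60G /vmfs/volumes/datastore1/vm1/vm1.vmdk

Inside the Windows guest:
DISKPART> rescan
DISKPART> list volume
DISKPART> select volume 2
DISKPART> extend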
13.3
14
14.1
14.1.1
14.1.2
data recovered will be as if the virtual machine had simply lost power. Most modern journaling
file systems such as NTFS or EXT3 are designed to recover from such states.
Replays taken via Dell Compellent's Replay Manager software
- Since virtual machines running transactional databases are more sensitive to crash consistent
data, Dell Compellent has developed its Replay Manager software to utilize Microsoft's VSS
framework for taking replays of Microsoft Exchange and SQL databases. This software
package will ensure that the database is in a consistent state before executing the replay.
Replays taken via Dell Compellent's scripting tools
- For applications that need a custom method for taking consistent replays of the data, Dell
Compellent has developed two scripting tools:
o Dell Compellent Command Utility (CompCU): a Java-based scripting tool that
allows scripting for many of the Storage Center's tasks (such as taking replays).
o Storage Center Command Set for Windows PowerShell: this scripting tool also
allows scripting for many of the same storage tasks using Microsoft's PowerShell scripting
language.
- A good example of using one of these scripting utilities is writing a script to take a replay of an
Oracle database after it is put into hot backup mode (a rough sketch follows this list).
Replays used for Storage Center Replication and VMware Site Recovery Manager
- Replicating replays to a disaster recovery site not only ensures an off-site backup, but in
addition when using Site Recovery Manager, provides an automated recovery of the virtual
infrastructure in the event a disaster is declared.
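As a rough illustration of the scripted replay approach mentioned in the list above, the PowerShell sketch below puts an Oracle database into hot backup mode, takes a replay, and then ends backup mode. The Storage Center cmdlet names and parameters shown (Get-SCConnection, New-SCReplay) are illustrative placeholders only, as are the connection and database details; consult the Storage Center Command Set for Windows PowerShell documentation for the actual cmdlet names.

# Place the database in hot backup mode (assumes sqlplus is in the path)
"alter database begin backup;" | sqlplus -s sys/password@ORCL as sysdba

# Take a replay of the volume; cmdlet names are placeholders, not the documented API
$conn = Get-SCConnection -HostName storagecenter.local -User Admin
New-SCReplay -Connection $conn -VolumeName "LUN30-oracle-data" -Expire 1440

# Take the database out of hot backup mode
"alter database end backup;" | sqlplus -s sys/password@ORCL as sysdba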
14.2
1. From the Storage Center GUI or Enterprise Manager, select the replay to recover from and then
choose Local Recovery or Create Volume From Replay.
2. Continue through the local recovery wizard to create the view volume, and map it to the ESXi host
designated to recover the data.
a. Be sure to map the recovery view volume using a LUN which is not already in use.
3. Rescan the HBAs from the Storage Adapter section to detect the new LUN
4. From the vSphere client, highlight an ESXi host, then select the configuration tab
a. Select Storage
b. Click Add Storage
c. Select Disk/LUN and then click Next
d. Select the LUN for the view volume that was just mapped to the host and then click Next.
e. Three options are presented:
i. Keep the Existing Signature: this option should only be used if the original datastore
is not present on the host.
ii. Assign a New Signature: this option will regenerate the datastore signature so that it
can be accessed by the host. (Select this option if unsure of which option to use.)
iii. Format the disk: this option will format the view volume and create a new datastore
from it.
f. Finish through the wizard verifying all selections.
Once the datastore has been resignatured, the snap datastore will be accessible:
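The Add Storage wizard steps above can also be scripted. As a sketch, the view volume copy can be found and resignatured from the command line; the volume label below is an example only.

# List unresolved VMFS copies (snapshots/view volumes) visible to this host
esxcli storage vmfs snapshot list
# Resignature the copy by its original volume label so it can be mounted as a new datastore
esxcli storage vmfs snapshot resignature -l "LUN10-vm-storage"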
14.2.1
14.2.2
14.2.3
Note: Once the virtual machine has been recovered, it can be migrated back to the original datastore
using Storage vMotion, provided that the original datastore has enough space, or the original virtual
machine was deleted.
15
The benefit of an asynchronous replication is that the replays are transferred to the destination
volume, allowing for check-points at the source system as well as the destination system.
15.1
15.2
15.2.1
attempt to map the destination Live Volume as LUN 10. In some instances, the secondary site may
already have existing LUNs that conflict, thereby causing the Live Volume to not operate as expected.
Due to the asynchronous nature of the replication and how it proxies data, it is recommended that Async
Live Volume only be used for Disaster Avoidance or planned migration. In other words, Live Volume is
most practically used for planned maintenance operations such as migrating workloads between racks or
datacenters. Because of the nature of the proxied I/O, any disruption to the link or primary Live Volume
will cause the secondary Live Volume datastore to become unavailable as well. If for any reason the
primary Live Volume goes down permanently, this means that admins will need to perform a recovery on
the secondary Live Volume from the last known good Replay. The Enterprise Manager Disaster recovery
wizard is designed to assist in the recovery in such situations.
15.2.2
15.3
15.4
Caution: It is extremely important that the destination volume, usually denoted by "Repl of", never gets
directly mapped to an ESXi host while data is actively being replicated. Doing so will inevitably cause data
integrity issues in the destination volume, requiring the entire volume to be re-replicated from scratch. The
safest recovery method is to always restore the virtual machine from a local recovery or view volume as
shown in previous sections. Please see the Copilot Services Technical Alert (CSTA) titled, Mapping
Replicated Volumes at a DR Site available on Dell Compellent Knowledge Center for more info.
16
16.1
The default setting for the congestion threshold is 30 milliseconds of latency or 90% of peak throughput.
16.2
When a virtual machine moves between datastores, its actual location at the time the replay
was taken may make the virtual machine harder to find during a recovery. For example, if a
virtual machine moves twice in one week while daily replays are being taken, the appropriate
volume that contains the good replay of the virtual machine may be difficult to locate.
- If the Storage Center version in use does not support VAAI, the move process could be slow
(without Full Copy) or leave storage allocated on its originating datastore (without Dead Space
Reclamation). See the section on VAAI for more information.
Storage DRS could produce incorrect recommendations for virtual machine placement when I/O
metric inclusion is enabled on a Storage Center system using Data Progression. This is because
when a datastore is inactive, the SIOC injector will perform random read tests to determine
latency statistics of the datastore. With Data Progression enabled, the blocks that SIOC reads to
determine datastore performance, could potentially reside on SSD, 15K, or even 7K drives. This
could ultimately skew the latency results and decrease the effectiveness of the SRDS
recommendations.
-
16.3
16.3.1
16.3.2
16.3.3
16.3.4
16.3.5
For more important information about VAAI primitives please see the following resources:
VMware document: vSphere Storage Guide
- Subsection: Array Thin Provisioning and VMFS Datastores
- Subsection: Storage Hardware Acceleration
VMware KB: Disabling VAAI Thin Provisioning Block Space Reclamation (UNMAP) in ESXi 5.0
- Link: http://kb.vmware.com/kb/2007427
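As a quick check, the VAAI status of an individual device and the related host settings can be queried from the command line; the device identifier below is a made-up example.

# Show which VAAI primitives are supported for a device
esxcli storage core device vaai status get -d naa.6000d310000360000000000000000001
# Confirm that hardware accelerated move (Full Copy) is enabled on the host
esxcli system settings advanced list -o /DataMover/HardwareAcceleratedMove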
17
17.1   Prerequisite reading
To fully understand the recommendations made within this chapter, it is highly suggested to read the
following documents before proceeding:
VMware NFS Best Practices
- http://www.vmware.com/files/pdf/VMware_NFS_BestPractices_WP_EN.pdf
FS8600 Networking Best Practices
- http://en.community.dell.com/techcenter/extras/m/white_papers/20437940.aspx
17.2   FS8600 architecture
FS8600 scale-out NAS consists of one to four FS8600 appliances configured as a FluidFS cluster. Each NAS
appliance is a rack-mounted 2U chassis that contains two hot-swappable NAS controllers in an active-active
configuration. In each NAS appliance, the second NAS controller with which one NAS controller is
paired is called the peer controller. FS8600 scale-out NAS supports expansion; that is, one can start with a
single NAS appliance and add NAS appliances to the FluidFS cluster as needed to increase performance.
17.2.1
17.2.2
17.2.3
17.3
17.3.1
17.3.2
17.3.3   Additional resources
VMware has knowledge base articles available to further explain these settings:
VMware KB: Increasing the default value that defines the maximum number of NFS mounts on an
ESXi/ESX host
- Link: http://kb.vmware.com/kb/2239
VMware KB: Definition of the advanced NFS options
- Link: http://kb.vmware.com/kb/1007909
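As a sketch of adjusting these options from the command line (use the values recommended in the KB articles above, since the appropriate numbers depend on the ESXi version and available host memory):

# Raise the maximum number of NFS mounts per host
esxcli system settings advanced set -o /NFS/MaxVolumes -i 256
# Verify the new value (related TCP/IP heap settings may also need to be increased per the KB)
esxcli system settings advanced list -o /NFS/MaxVolumes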
17.4
17.4.1
17.4.2   Enabling deduplication
To use the FluidFS deduplication feature, it must first be enabled globally on the FluidFS cluster. Once it is
enabled globally, deduplication can be enabled on a per-volume basis. The following screenshots
demonstrate how to enable deduplication globally and also per volume.
1) To do this, right-click on the FluidFS cluster and then click Edit Settings.
3) Now that deduplication has been enabled globally, navigate back to FluidFS volumes and choose the
volume that contains the VMware export from the previous steps, right-click on that volume and
choose Edit Data Reduction Setting.
4) In the following screen, click on the Enable check box and create the deduplication policy.
17.4.3
1) From the VMware vSphere client, select an ESXi host and click on the Configuration tab
2) Click on Storage
3) Click on Add Storage
4) In the Add Storage wizard, choose Network File System and then click Next
9) Repeat the steps 1-8 above for each host within the cluster
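As an alternative to repeating the wizard on every host, the same NFS export can be mounted from the command line on each host; the IP address, export path, and datastore name below are examples only.

# Mount the FluidFS NFS export as a datastore on this host
esxcli storage nfs add -H 10.20.30.40 -s /vmware-export -v FS8600-NFS01
# Verify the mount
esxcli storage nfs list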
18   Conclusion
This document addresses many of the best practices scenarios an administrator may encounter while
implementing or upgrading VMware vSphere with the Dell Compellent Storage Center.
18.1   More information
If more information is needed, please review the following web sites:
Dell Compellent
- General Web Site
o http://www.dellstorage.com/compellent/
- Compellent Customer Portal
o http://customer.compellent.com
- Knowledge Center
o http://kc.compellent.com
- Dell Tech Center
o http://en.community.dell.com/techcenter/storage/w/wiki/5018.compellent-technical-
content.aspx
- Training
o http://www.dellstorage.com/storage-services/training-services/
VMware
- General Web Site
o http://www.vmware.com/
- VMware Education and Training
o http://www.vmware.com/education
- vSphere Online Documentation
o http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html
- VMware Communities
o http://communities.vmware.com
18.2   Getting help
Contacting Copilot Support
For customers in the United States:
Telephone: 866-EZ-STORE (866-397-8673)
- E-mail [email protected]
Additional Copilot Support contact information for other countries can be found at:
- Contacting Compellent Technical Support
ESXi
It is important to note that although the per LUN queue depth maximum is 256, the per adapter
maximum within ESXi is 4096. By increasing the per LUN queue depth from 64 to 128, it can take
fewer LUNs to saturate a port's queue. For example, 4096/64=64 LUNs, whereas 4096/128=32
LUNs.
The best way to determine the appropriate queue depth is by using the esxtop utility. This utility can be
executed from one of the following locations:
ESXi Shell via SSH
- Command: esxtop
vCLI 5.x or the vMA Virtual Appliance
- Command: resxtop (or resxtop.sh)
When opening the esxtop utility, the best place to monitor queue depth and performance is from the Disk
Device screen. Here is how to navigate to that screen:
1.
sense. However, if the LOAD is consistently less than 1.00 on a majority of the LUNs, and the performance
and latencies are acceptable, then there is usually no need to adjust the queue depth.
In the figure above, the device queue depth is set to 32. Please note that three of the four LUNs
consistently have a LOAD above 1.00. If the back end spindles are not maxed out, it may make sense to
increase the queue depth, as well as increase the DSNRO setting.
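For longer-term monitoring, esxtop can also be run in batch mode and the output reviewed offline; the interval, sample count, and output path below are examples.

# Capture twelve 5-second samples of all counters (including queue statistics) to a CSV file
esxtop -b -d 5 -n 12 > /tmp/esxtop-queue-stats.csv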