FDM TS
Second Way
The following key configuration settings are not carried forward during an upgrade:
New HA Agent – FDM (Fault Domain Manager) is the name of the agent
FDM replaces the AAM agent used in vSphere 4.1 and earlier
Supports management network partitions – Can have multiple “master nodes” when multiple
network partitions exist.
Enhanced isolation validation - Avoids false positives when the complete management
network has failed.
Datastore Heartbeating – This additional level of heart beating reduces chances of false
positives by using the storage layer to validate the state of the host and to avoid unnecessary
downtime when there’s a management network interruption.
FDM Master
---- The FDM master monitors all slave hosts and the protected VMs.
---- If a host failure happens, the master restarts the affected VMs on another host.
---- It keeps a list of protected VMs, which is updated after every power-off or power-on done by a user.
---- It keeps track of hosts being added to or removed from the cluster.
---- It periodically sends status reports to vCenter.
Slave
--- A slave monitors the VMs it is running and informs the master about any changes.
--- The slave also monitors the health of the master by monitoring heartbeats.
--- If the master fails, the slaves initiate and participate in the election process.
Master-election algorithm
Takes 15 to 25 s (depending on the reason for the election).
Elects the host that has access to the greatest number of datastores.
If there is a tie, the host with the highest Managed Object Id is chosen.
When the cluster is split across two sites due to a link failure, each “partition” gets its own
master. This allows workloads to be restarted even in a geographically dispersed
cluster when the network has failed.
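The election rule above (most datastores first, highest Managed Object Id as the tie-breaker) can be sketched in a few lines of Python. This is only an illustration of the rule as stated in these notes, not the FDM implementation. The Host class, the "host-NN" ID values, and the choice to compare IDs as plain strings are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    moid: str             # Managed Object Id (hypothetical values such as "host-42")
    datastore_count: int  # number of datastores the host can access


def elect_master(hosts):
    """Pick the host with the most datastores; break ties on the highest MOID."""
    if not hosts:
        raise ValueError("no candidate hosts")
    # Tuples compare element by element: datastore count first, then the MOID.
    # Assumption: MOIDs are compared as plain strings.
    return max(hosts, key=lambda h: (h.datastore_count, h.moid))


candidates = [
    Host("host-21", 4),
    Host("host-35", 6),
    Host("host-42", 6),
]
master = elect_master(candidates)
print(master.moid)  # host-35 and host-42 tie on datastores; host-42 wins on MOID
```

Because each partition runs its own election, the same function would simply be applied to the subset of hosts that can still reach each other.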
Datastore Heartbeat
--- If the primary network goes down, this secondary channel is used to find out
whether the host has failed or is only isolated or partitioned.
--- It prevents unnecessary restarts due to isolation.
Locking mechanism
To see which datastores have been selected for heartbeating, go to the Summary tab of your
cluster and click “Cluster Status”; the third tab, “Heartbeat Datastores”, lists them.
Isolated vs Partitioned
There are two different states:
Isolated
o Is not receiving heartbeats from the master
o Is not receiving any election traffic
o Cannot ping the isolation address
Partitioned
o Is not receiving heartbeats from the master
o Is receiving election traffic
o (At some point a new master will be elected, at which point the state will be
reported to vCenter.)
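The two state definitions above amount to a small decision table. The sketch below encodes it in Python purely as a memory aid. The function name, its parameters, and the "connected" and "undetermined" labels are assumptions for the example; the notes only define the isolated and partitioned states.

```python
def classify_host(master_heartbeats, election_traffic, isolation_address_reachable):
    """Classify a host from the three observations listed in the notes."""
    if master_heartbeats:
        # Still hearing the master: neither isolated nor partitioned.
        return "connected"
    if election_traffic:
        # No master heartbeats, but other hosts are holding an election.
        return "partitioned"
    if not isolation_address_reachable:
        # No heartbeats, no election traffic, isolation address unreachable.
        return "isolated"
    # Combination not covered by the notes (label chosen for this sketch).
    return "undetermined"


print(classify_host(False, False, False))  # isolated
print(classify_host(False, True, True))    # partitioned
```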
Network Partition
When multiple hosts are isolated but can still communicate with each other over the management
network, it is called a network partition. When a network partition exists, a master election
process is started so that a host failure or network isolation within this partition results in
appropriate action on the impacted virtual machine(s).
Restarting VMs
Restart priority changes
The order in which virtual machines will be restarted:
Agent virtual machines
FT secondary virtual machines
Virtual Machines configured with a high restart priority
Virtual Machines configured with a medium restart priority
Virtual Machines configured with a low restart priority
An agent virtual machine is a virtual machine that performs a specific function, such as the
vShield Endpoint appliance, which is an anti-virus solution. By "tagging" this virtual
machine as an agent VM we ensure that it gets powered on first in the
case of a host failure, before all other VMs start.
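The restart order listed above can be expressed as a simple sort key. The following Python sketch shows one way to order a list of VMs by that sequence. The category names and the VM names are hypothetical and exist only for the example.

```python
# Restart order from the notes, first to last (category names are assumptions).
RESTART_ORDER = ["agent", "ft_secondary", "high", "medium", "low"]


def restart_sequence(vms):
    """Return (name, category) pairs sorted by the restart order above."""
    rank = {category: position for position, category in enumerate(RESTART_ORDER)}
    # sorted() is stable, so VMs in the same category keep their input order.
    return sorted(vms, key=lambda vm: rank[vm[1]])


vms = [
    ("web01", "low"),
    ("db01", "high"),
    ("vshield-endpoint", "agent"),
    ("app01", "medium"),
    ("db01-ft", "ft_secondary"),
]
for name, category in restart_sequence(vms):
    print(name, category)
```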
HA checks whether the VM was cleanly powered off (initiated by the admin) or powered off due to a
failure/isolation. VMs that were cleanly powered off are not restarted; VMs that were not cleanly
powered off are powered on.
Storage Adapter:
By default it is not installed; you must add the adapter manually by clicking Configuration >
Storage Adapters > Add Storage Adapter
- Select: Add Software iSCSI Adapter
A new software iSCSI adapter will be added to the Storage Adapters list. After it has been
added, select the software iSCSI adapter in the list and click Properties to complete the
configuration.
Network label: iSCSI01 - Configure IP Address and Subnet Mask
- Finish the configuration, repeat this step to configure the 2nd iSCSI02 port
vSwitch2 result:
vCenter Crashed
What happens if the vCenter server loses its connection to the database server,
the HP blade center, or both the database server and the HP blade server?
Backup vCenter:
The vCenter Database
The Sysprep files, used for customizations during template deployments:
C:\ProgramData\VMware\VMware VirtualCenter\Sysprep
The SSL certificates. If the SSL certificates are not backed up, you have to disconnect and reconnect the ESXi hosts
once you have restored your vCenter server.
Hence it is better to back up the SSL certificates.
Restoring vCenter :
--- Reinstall your Windows Server
--- Reinstall SQL Server
--- Restore database
--- Create the ODBC connections and test it.
--- Install vCenter Server. Note: during installation you receive a message that a vCenter
database already exists; do not replace the database, use the existing one.
--- After Installation completes Restore SSL and Sysprep files.
To configure the TSM timeout value from the vSphere Client:
3. Click OK.
1. At the main DCUI screen, press ALT+F1 simultaneously. This opens a virtual console
window to the host.
2. Provide credentials when prompted.
Note: When typing the password, characters are not displayed on the console.
Notes:
o Directions may vary depending on what SSH client you are using. For more
information, consult vendor documentation and support.
o By default, SSH works on TCP port 22.
3. Provide credentials when prompted.
For ESXi 3.5, ESX/ESXi 4.x and 5.0, use the esxcfg-dumppart utility:
# esxcfg-dumppart -L <vmkernel-zdump-filename>
# ls /root/vmkernel* /var/core/vmkernel*
/var/core/vmkernel-zdump-073108.09.16.1
2. Use the vmkdump or esxcfg-dumppart utility to extract the log. For example:
# vmkdump -l /var/core/vmkernel-zdump-073108.09.16.1
created file vmkernel-log.1
# esxcfg-dumppart -L /var/core/vmkernel-zdump-073108.09.16.1
created file vmkernel-log.1
3. The vmkernel-log.1 file is plain text, though it may start with null characters. Focus on the
end of the log, which looks similar to:
Note: The file name created for the log in this example is vmkernel-log.1. If another file with
the same name already exists, the new file is created with the number suffix incremented.
It is caused by the host agent service (mgmt-vmware) failing due to a dead process.
# kill -9 23961
Now the hostd service will start, and after a few seconds the
host and virtual machines will be up in vCenter.
service mgmt-vmware restart (restarts the host agent (hostd) on the VMware ESX server)
service vmware-vpxa restart (restarts the vCenter agent service)
service network restart (restarts the management network on ESX)
Example:
For 250-DC01
Vmid: 7
Name: 250-DC01
File: located in datastore2 under 250-DC01
Guest OS: Windows 2008 64-bit, listed as windows7server64guest
Version: virtual hardware version 8
If you are unable to shut down the guest or VMware Tools is not installed, you will need
to use another command.
vim-cmd vmsvc/power.off VMID
Powering off a virtual machine fails with the error: Cannot power Off: Another task is
already in progress
To resolve this issue:
1. Open the .vmx file of the virtual machine using a text editor.
2. Comment this line:
#log.fileName =
"/vmfs/volumes/4b8bd18f-c1f89b5a-1914-002219c8e7a3/vmware.log"
Note: Instead of restarting the virtual machine, you can reload the .vmx file using
these commands:
# vmware-vim-cmd vmsvc/getallvms
This command returns the VMID.
# vim-cmd vmsvc/getallvms
This command returns the VMID.
In ESXi 3.5-5.0, use the kill command to terminate a running virtual machine process.
1. On the ESXi console, enter Tech Support mode and log in as root
To check the virtual machine process is running on the ESXi host, run this command:
ps | grep vmx
The first column contains the PID, and the second contains the parent's PID.
Kill only the parent process.
2. If the vmx process is listed, Kill the process using this command:
kill ProcessID
3. Wait 30 seconds and repeat step 2 to check for the process again.
4. If it is not terminated, run this command:
kill -9 ProcessID
SDelete can write zeros in the unused areas of your hard drive, overwriting any scraps and fragments of old
files there so that they can’t be easily recovered by snoops.
When files are normally deleted they are just removed from the filesystem. The file itself is still
on the hard drive though, which is how some programs are able to "recover" deleted files. For
security reasons you may want to zero out your free space, which will get rid of all chances of
recovering deleted files.
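c:\>sdelete -z c:
-c Zero free space (good for virtual disk optimization)
-z Clean free space
Note: the descriptions of -c and -z differ between SDelete versions; check the usage text
printed by your copy of sdelete before relying on either switch.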
To check the virtual disk size: ls -lh *.vmdk. In my example, I have a 40 GB virtual size.
Issue the command vmkfstools --punchzero VM.vmdk. Your VM must be powered off.
On a 40 GB VMDK it took 10 minutes, but this will depend mostly on your SAN speed.
Now it is actually 10.8 GB, which makes it a lot easier to transfer around.
Then go into Computer Management, under Start -> Administrative Tools. Here, under
Storage, click on Disk Management.
Right-click on the allocated piece of the disk and click Extend Volume
Recover Orphaned (left Alone) Virtual Machines
Virtual machines appear in the host inventory with (orphaned) appended to their name.
Cause
Virtual machines can become orphaned if a host failover is unsuccessful, or when the virtual
machine is unregistered directly on the host.
In this situation, move the orphaned virtual machine to another host.
Solution
1 Right-click the virtual machine and select Migrate.
2 Click Change Host and click Next.
3 Select the host on which to place the virtual machine.
If no hosts are available, add a host that can access the datastore on which the virtual machine's
files are stored.
The virtual machine is connected to the new host and appears in the inventory list.
Virtual Machine Does Not Power On After Cloning or Deploying from Template
Virtual machines do not power on after you complete the clone or deploy-from-template operation.
Cause
The swap file size is not reserved when the virtual machine disks are created.
Solution
Step 1: Reduce the size of the swap file of the VM. To do this, increase the VM memory reservation.
Right-click the virtual machine and select Edit Settings.
Select the Resources tab and click Memory.
Use the Reservation slider to increase the amount of memory reserved for the virtual
machine.
Click OK.
Step 2: Increase space for the swap file by moving other VM disks off the datastore that is
being used for the swap file.
Step 3: Increase space for the swap file by changing the swap file location to a datastore with
enough space.
Note
If the host is in a cluster, use the Cluster Settings to change the swap file location policy for the
cluster.
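Why raising the reservation shrinks the swap file can be shown with a little arithmetic. The sketch below assumes the commonly documented relationship that the swap file size equals the configured memory minus the memory reservation. That formula is not stated in these notes, and the function names are made up for the example.

```python
def swap_file_mb(configured_mb, reservation_mb):
    """Swap file size, assuming it equals configured memory minus reservation."""
    if not 0 <= reservation_mb <= configured_mb:
        raise ValueError("reservation must be between 0 and the configured memory")
    return configured_mb - reservation_mb


def can_power_on(configured_mb, reservation_mb, datastore_free_mb):
    """True if the swap file would fit in the free space of the swap datastore."""
    return swap_file_mb(configured_mb, reservation_mb) <= datastore_free_mb


# An 8 GB VM with no reservation needs an 8 GB swap file.
print(swap_file_mb(8192, 0))             # 8192
# Reserving 6 GB leaves a 2 GB swap file, which fits in 3 GB of free space.
print(can_power_on(8192, 6144, 3072))    # True
```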
Problem
vCenter Server cannot connect to managed hosts after server certificates are replaced and the
system is restarted.
Solution
Log into the host as the root user and reconnect the host to vCenter Server.
vCenter map
A vCenter map is a visual representation of your vCenter Server topology. Maps show the
relationships between the virtual and physical resources available to vCenter Server.
Maps are available only when the vSphere Client is connected to a vCenter Server system.
The maps help you check which clusters or hosts are most densely populated,
which networks are most critical, and which storage devices are being utilized.
Where server is the hostname or IP address of the server, and port is the port that you want to
connect to.
Connecting to port 902 on an ESXi/ESX host:
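The original example for this check is not reproduced in these notes. As a stand-in, here is a minimal Python sketch of a TCP reachability test that takes the same server and port values. The function name and the host name in the usage comment are hypothetical, and the sketch only confirms that a TCP connection can be opened, nothing more.

```python
import socket


def port_is_open(server, port, timeout=3.0):
    """Return True if a TCP connection to server:port can be established."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((server, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Usage (hypothetical host name):
#   port_is_open("esx01.example.com", 902)
```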
Root cause
The VM has a snapshot. When a snapshot is taken, the actual VMDK file
is locked and changes are written to a new delta file, so the VMDK file cannot
be resized.
Resolution
Remove the snapshot and then try to resize.
Esxtop
The esxtop command is used to check whether the ESX/ESXi server is overloaded.
1. Check the load average on the first line of the command output.
A load average of 1.00 means the ESXi server’s physical CPUs are fully utilized,
a load average of 0.5 means they are half utilized, and
a load average of 2.00 means the system is overloaded.
2. Check %READY for the percentage of time that the virtual machine was ready but could not
be scheduled to run on a physical CPU.
OR
Decrease the number of virtual CPUs allocated to the host.
Memory overcommitment
Use the esxtop command to check whether the ESX/ESXi server's memory is overcommitted.
Check the MEM overcommit avg on the first line of the command output. This value is the
ratio of the requested memory to the available memory, minus 1.
Examples:
- If the VMs require 4 GB of RAM, and the host has 4 GB of RAM, then there is a 1:1 ratio.
After subtracting 1 (from 1/1), the MEM overcommit avg field reads 0.
There is no overcommitment and no extra RAM is required.
- If the VMs require 6 GB of RAM, and the host has 4 GB of RAM, then there is a 1.5:1
ratio. After subtracting 1 (from 1.5/1), the MEM overcommit avg field reads 0.5. The
RAM is overcommitted by 50%, meaning that 50% more than the available RAM is
required.
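The two examples above follow directly from the formula "requested divided by available, minus 1". A short Python sketch makes the arithmetic explicit. The function name is an assumption for illustration and it is not part of esxtop.

```python
def mem_overcommit(requested_gb, available_gb):
    """Ratio of requested to available memory, minus 1 (as described above)."""
    if available_gb <= 0:
        raise ValueError("available memory must be positive")
    return requested_gb / available_gb - 1


print(mem_overcommit(4, 4))  # 0.0 -> no overcommitment
print(mem_overcommit(6, 4))  # 0.5 -> overcommitted by 50%
```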
If the memory is being overcommitted, adjust the memory load on the host.
To adjust the memory load, either:
Increase the amount of physical RAM on the host
OR
Decrease the amount of RAM allocated to the virtual machines.
OR
Reduce the total number of virtual machines on the host.
Determine whether the virtual machines are ballooning and/or swapping.
a. Run esxtop.
b. Type m for memory
c. Type f for fields
d. Select the letter J for Memory Ballooning Statistics (MCTL)
e. Look at the MCTLSZ value.
MCTLSZ (MB) displays the amount of guest physical memory reclaimed by the
balloon driver.
f. Type f for Field
g. Select the letter for Memory Swap Statistics (SWAP STATS).
h. Look at the SWCUR value.
esxtop – Provides real time CPU, memory, disk and network statistics for virtual
machines and their host. This utility is accessible from a direct connection to an ESXi
host. It does not report statistics for NFS datastores.
esxtop has 8 different "displays" that show CPU, interrupt, memory, network, disk adapter,
disk interface, disk VM, and power management
resxtop – A remote version of esxtop. It is included as part of the Linux vCLI (vSphere
Command Line) and the vMA (vSphere Management Assistant). Discussed below,
resxtop has three modes of operation: Interactive, Batch, and Replay. Both
esxtop and resxtop can connect to only a single ESXi host and require root-level access.
In ESXi 5.x and ESXi 4.1, to find the owner of a locked file of a VM:
# vmkvsitools lsof | grep Virtual_Machine_Name
Then run this command to get the PID of the process for the VM:
ps | grep Virtual_Machine_name
To generate a core dump while killing a running virtual machine that is hung and
nonresponsive, use the command kill -6 PID or kill -11 PID.
In ESXi 4.1 and ESXi 5.x, use the k command in esxtop to kill a running virtual machine
process. On the ESXi console, enter Tech Support mode and log in as root.
Next time when the VM boots we can see this screen (depending on your guest OS of course).
To Patch an ESXi 5.0 host from the command line:
The “install” command overwrites the existing packages. It may downgrade existing packages and
overwrite existing drivers.
The “update” command is recommended. It only updates packages from a lower version to a newer one.
Patching Steps
1. Patches can be downloaded from the VMware patch portal. Select ESXi (Embedded and
Installable) in the product dropdown and click Search.
2. Click the Download link below the patch Release Name to download the patch.
3. Upload the patch to a datastore using the Datastore Browser from vCenter or a direct
connection to the ESXi 5.0 host using the vSphere client.
Note: VMware recommends creating a new directory on the datastore and uploading the patch
file to this directory.
4. Log into the local Tech Support Mode console of the ESXi 5.0 host.
5. Migrate or power off the virtual machines and put the host into maintenance mode.
# vim-cmd hostsvc/maintenance_mode_enter
6. Navigate to the directory on the datastore where the patch file was uploaded
# cd /vmfs/volumes/Datastore/DirectoryName
# ls
Where Datastore is the datastore name where the patch file was uploaded to, and
DirectoryName is the directory you created on the datastore.
Where PatchName.zip is the name of the patch file you uploaded to the datastore.
# esxcli software vib install -d "/vmfs/volumes/datastore1/patch-directory/ESXi500-201111001.zip"
8. After the patch has been installed, reboot the ESX host:
# reboot
9. After the host has finished booting, exit maintenance mode and power on the VMs.
# vim-cmd hostsvc/maintenance_mode_exit
ethernetN.checkMACAddress = "false"
ethernetN.addressType = "static"
ethernetN.address = "XX:XX:XX:XX:XX:XX"
Where XX:XX:XX:XX:XX:XX is the new desired MAC address for the virtual machine.
5. Upload the new (modified) .vmx file back to the same location (datastore) using the
Datastore Browser.
6. Copy the original .vmx file (using a name similar to vmname.vmx.old) for backup
purposes.
7. Register the virtual machine back to the inventory.
8. Start the virtual machine.
ethernet0.address = 00:50:56:XX:YY:ZZ
Where XX is a valid hex number between 00 and 3F and YY and ZZ are valid hex numbers
between 00 and FF. The value for XX must not be greater than 3F in order to avoid
conflict with MAC addresses that are generated by the VMware
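The range rule above (prefix 00:50:56, fourth octet between 00 and 3F) is easy to check before editing the .vmx file. The Python sketch below is one way to do that. The function name is hypothetical, and the check only covers the rule as written in these notes.

```python
import re

# 00:50:56 prefix followed by three two-digit hex octets.
_STATIC_MAC = re.compile(r"00:50:56:([0-9A-Fa-f]{2}):[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}")


def is_valid_static_mac(mac):
    """True if mac is 00:50:56:XX:YY:ZZ with XX no greater than 3F."""
    match = _STATIC_MAC.fullmatch(mac)
    if match is None:
        return False
    # The fourth octet (XX) must stay within 00..3F.
    return int(match.group(1), 16) <= 0x3F


print(is_valid_static_mac("00:50:56:3F:FF:FF"))  # True
print(is_valid_static_mac("00:50:56:40:00:00"))  # False, XX is above 3F
```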
Storage vMotion is separated into two components: Data movers and Data mirroring.
Data movers read blocks from the original location and copy to a new destination.
Data mirroring writes data to both the original VMDK location and the new location.
The guest does not get a write confirm until the data has been written in both locations.
Data Movers
The ESX hypervisor uses one of three data movers, which affects the speed of Storage
vMotion. They fall into two kinds, software based and hardware based:
fsdm - software data mover
fs3dm - software data mover
fs3dm-hw - hardware offloading data mover
Storage IO Control
It dynamically allocates portions of the hosts’ I/O queues to VMs according to the shares assigned
to the VMs. Administrators can prioritize VMs with critical workloads during peak load
periods (by means of disk shares).
Setting I/O priorities for VMs results in better performance for those VMs during periods of congestion.
To customize the congestion threshold, click the Advanced button.
ESXi 5.0
Using shares, each network traffic type is allowed network bandwidth in proportion to its share
value. With shares, unused bandwidth is available for use by other traffic types.
Using limits, a maximum bandwidth utilization (in Mbps) is set for each traffic type.
Unused bandwidth is not available for other traffic types.
This feature can guarantee bandwidth for specific needs and can prevent any one traffic type
from impacting the others
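The difference between shares and limits can be illustrated with a simplified model: each traffic type is entitled to a slice of the link in proportion to its shares, and a limit caps that slice. The Python sketch below is only that model, not the actual scheduler. The traffic type names, the share values, and the function name are assumptions for the example.

```python
def share_bandwidth(link_mbps, shares, limits=None):
    """Split link_mbps in proportion to shares, then apply optional Mbps limits."""
    limits = limits or {}
    total_shares = sum(shares.values())
    if total_shares <= 0:
        raise ValueError("at least one traffic type needs a positive share value")
    result = {}
    for name, share in shares.items():
        entitled = link_mbps * share / total_shares
        cap = limits.get(name)
        # A limit is a hard ceiling; bandwidth above it is simply not used here.
        result[name] = entitled if cap is None else min(entitled, cap)
    return result


shares = {"vm": 100, "vmotion": 50, "iscsi": 50}
print(share_bandwidth(10000, shares))
print(share_bandwidth(10000, shares, limits={"vmotion": 1000}))
```

In this model only the active traffic types would be passed in, which is how unused bandwidth becomes available to the others when shares are used.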
Disk performance
2-5 ms latencies indicate healthy storage.
5-12 ms latencies reflect a healthy storage architecture where data is being randomly read across the disk.
15 ms latencies or greater possibly represent an over-utilized or misbehaving array.
To access a virtual machine's CPU performance statistics using esxtop/resxtop, perform the
following steps. Note: For more information on establishing an esxtop/resxtop session
CPU Scheduler
The scheduler is a VMkernel component. It schedules the VMs’ virtual CPU requests onto the
physical CPUs of the host server. When a VM uses its virtual CPU, the VMkernel has to find a
free physical CPU (or core).
The scheduler’s job is to find CPU time for all the VMs that are requesting it and to do it in a
balanced way, so that performance for any one VM does not suffer.
1. %USED - Percentage of the host’s physical CPU cycles being used by a VM. A high
value does not necessarily indicate an issue unless coupled with queueing. A high
%USED value in conjunction with a high %RDY value could indicate that the host is
overcommitted.
2. %WAIT - Amount of time the VM spent in a blocked or busy wait state. The VM is
likely waiting for a vmkernel operation to complete before it can be scheduled. Note that
this includes idle time.
3. %MLMTD – Idle time due to a configured CPU limit.
4. %CSTP - Amount of time an SMP VM was ready to run, but experienced a delay due to
vCPU scheduling contention.
Through console
Setting the number of cores per CPU in a virtual machine
Power off the virtual machine--Right-click on the virtual machine -- click Edit Settings.
Click Hardware and select CPUs. -- Choose the number of virtual processors.
Click the Options tab -- Click General, in the Advanced options section.
Click Configuration Parameters. Include cpuid.coresPerSocket in the Name column.
Enter a value (try 2, 4, or 8) in the Value column.
Notes: Ensure that the number of vCPUs is divisible by the number of cpuid.coresPerSocket in
the virtual machine. That is, when you divide the number of vCPUs by the number of
cpuid.coresPerSocket , it must return an integer value. For example, if your virtual machine is
created with 8 vCPUs, coresPerSocket can only be 1, 2, 4, or 8.
The virtual machine now appears to the operating system as having multi-core CPUs with the
number of cores per CPU given by the value that you provided for cpuid.coresPerSocket.
1. Click OK.
For example:
Using 4 vCPU
Configuration you want                               | Settings needed for this configuration
Number of sockets | Cores per socket | Total cores   | Set vCPU to | Set cpuid.coresPerSocket to
1                 | 4                | 4             | 4           | 4
2                 | 2                | 4             | 4           | 2
Using 8 vCPU
Configuration you want                               | Settings needed for this configuration
Number of sockets | Cores per socket | Total cores   | Set vCPU to | Set cpuid.coresPerSocket to
1                 | 8                | 8             | 8           | 8
2                 | 4                | 8             | 8           | 4
4                 | 2                | 8             | 8           | 2
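The divisibility rule behind these tables can be checked with a few lines of Python. The sketch below lists the cpuid.coresPerSocket values that divide a given vCPU count evenly, restricted to the values 1, 2, 4 and 8 mentioned in these notes, and derives the resulting socket count. The function names are assumptions for the example.

```python
SUPPORTED_CORES_PER_SOCKET = (1, 2, 4, 8)  # values named in these notes


def valid_cores_per_socket(vcpus):
    """coresPerSocket values that divide the vCPU count without a remainder."""
    return [c for c in SUPPORTED_CORES_PER_SOCKET if vcpus % c == 0]


def sockets_for(vcpus, cores_per_socket):
    """Number of virtual sockets the guest sees for this combination."""
    if vcpus % cores_per_socket != 0:
        raise ValueError("vCPU count must be divisible by coresPerSocket")
    return vcpus // cores_per_socket


print(valid_cores_per_socket(8))  # [1, 2, 4, 8]
print(sockets_for(8, 4))          # 2 sockets with 4 cores each
```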
Notes:
To assign more than 4 vCPUs to a VM, or if the host processor has more than 6 cores per
processor, you need an Enterprise Plus license, which supports up to 8 vCPUs and 12
cores per processor.
Only values of 1, 2, 4, 8 for cpuid.coresPerSocket are supported for the multi-core
vCPU feature in ESX 4.x.
In ESX 4.0, if multi-core vCPU is used, hot-plug vCPU is not permitted, even if it is
available in the UI.
Only hardware version 7 virtual machines support the multi-core vCPU feature.
http://www.petri.co.il/vsphere-hot-add-memory-and-cpu.htm
Click OK to proceed.
Check the “Enable” tickbox; in the iSCSI Name field you will see the name of the iSCSI
initiator (which you will use on your SAN for LUN masking). Click “OK”.
On the next tab (Dynamic Discovery), click “Add” and enter the IP address of one of the
iSCSI ports of your SAN; mine is 10.1.1.27.
On the first page of the Add Storage wizard, select “Disk/LUN” and click “Next”.
Your added LUN will be in the list; select it and click “Next”.
Next is a summary of your disk layout; click “Next”.
On the next page you will see the file system formatting options; I left the defaults and clicked “Next”.
Once you have finished the wizard, you will see your LUN storage ready in the storage list.
Deselect the Maximum Capacity check box and enter a size if you want the VMFS datastore
to be smaller than the full LUN.
Step 1
Select the ESX host and then click the "Configuration" tab.
Step 2
Select "Storage Adapters" from under Hardware. Click "Rescan."
Step 3
Confirm "Scan for New Storage Devices" is selected and click "OK" to make the assigned LUN visible to the host.
Step 4
On the Configuration tab, select "Storage" and click "Add Storage" to open the Add Storage Wizard.
Step 5
Select "Disk/LUN" click "Next." Select the LUN from the list of devices and then click "Next."
Click "Next" again.
Step 6
Give a name for the new volume. Click "Next." Choose a block size -- 1, 2, 4 or 8MB -- from the
drop-down menu or use the default setting.
Step 7
Deselect "Maximum Capacity" and then enter the desired storage size of the volume, if
preferred. Click "Next."
Step 8
Click "Finish" to add the new volume to VMware ESX.
Un-presenting LUN
ESX/ESXi 4.1
The best practices for removing a LUN from an ESX 4.1 host
1. Unregister all objects from the datastore including VMs and Templates
2. Ensure that no 3rd party tools are accessing the datastore
3. Ensure that no vSphere features, such as Storage I/O Control, are using the device
4. Mask the LUN from the ESX host by creating new rules in the PSA (Pluggable Storage
Architecture)
5. Physically unpresent the LUN from the ESX host using the appropriate array tools
6. Rescan the SAN
7. Clean up the rules created earlier to mask the LUN
8. Unclaim any paths left over after the LUN has been removed
ESXi 5.0
1. Unregister all objects from the datastore including VMs and Templates
2. Ensure that no 3rd party tools are accessing the datastore
3. Ensure that no vSphere features, such as Storage I/O Control or Storage DRS, are using
the device
4. Detach the device from the ESX host; this will also initiate an unmount operation
5. Physically unpresent the LUN from the ESX host using the appropriate array tools
6. Rescan the SAN
Migrate the virtual machines to another datastore(s), delete the existing datastore, and re-create
it using VMFS-5.
The main change with VMFS 5 is the unified block size, which is 1 MB.
VMFS 3 could be formatted with a block size of 1, 2, 4, or 8 MB.
With the unified 1 MB block size in VMFS 5, the maximum size of a Virtual Machine Disk
(VMDK) is no longer limited to 256 GB, as it was with the 1 MB block size in VMFS 3.
VMFS 3 datastores can be upgraded, but they retain their block size. So reformat the volume to
VMFS 5 at 1 MB (the default size).
Another important change is related to the sub-block allocation algorithm. VMFS implements a
unique sub-block algorithm that works well for the polar distribution of file types: large VMDKs
and small VMX and other files. The previous sub-block was 64 KB; VMFS 5 uses 8 KB sub-blocks.
Upgrading VMFS 3 to VMFS 5
You can migrate VMs between vSphere clusters (even between different versions) as long as
the conditions below are met:
LUN paths
vmhba0:C0:T0:L23
Adapter: vmhba0
Channel: 0
Target: 0
LUN: 23
esxcli storage core path list gives a list of all LUN paths currently connected to the ESXi host.
esxcli storage core device list gives a list of LUNs currently connected to the ESXi host.
For example, vmhba0:2:3:1 refers to the first partition on LUN 3, target 2, HBA 0.
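The vmhbaN:C:T:L runtime name shown above splits cleanly into its four fields. The Python sketch below parses that newer form only (with the C, T and L prefixes), not the older colon-only form in the last example. The LunPath type and the function name are hypothetical helpers for the illustration.

```python
import re
from typing import NamedTuple


class LunPath(NamedTuple):
    adapter: str
    channel: int
    target: int
    lun: int


_RUNTIME_NAME = re.compile(r"(vmhba\d+):C(\d+):T(\d+):L(\d+)")


def parse_runtime_name(name):
    """Split a runtime name such as vmhba0:C0:T0:L23 into its components."""
    match = _RUNTIME_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a vmhbaN:Cx:Ty:Lz runtime name: {name!r}")
    adapter, channel, target, lun = match.groups()
    return LunPath(adapter, int(channel), int(target), int(lun))


print(parse_runtime_name("vmhba0:C0:T0:L23"))
```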
The block size defines the maximum size of a file on the datastore (e.g. a virtual disk file, .vmdk).
If you select a 1MB block size on your datastore, the maximum file size is limited to 256GB. So
when you create a VM you cannot assign it a single virtual disk greater than 256GB.
1MB block size – 256GB maximum file size
2MB block size – 512GB maximum file size
4MB block size – 1024GB maximum file size
8MB block size – 2048GB maximum file size
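The four rows above follow one pattern: the maximum file size is 256 GB per MB of block size. The Python sketch below encodes that pattern and picks the smallest VMFS 3 block size that can hold a given virtual disk. The function names are assumptions, and the figures are the rounded ones from the list above.

```python
VMFS3_BLOCK_SIZES_MB = (1, 2, 4, 8)


def max_file_size_gb(block_size_mb):
    """Maximum file size on VMFS 3 for a block size, per the list above."""
    if block_size_mb not in VMFS3_BLOCK_SIZES_MB:
        raise ValueError("VMFS 3 block size must be 1, 2, 4 or 8 MB")
    return 256 * block_size_mb


def smallest_block_size_mb(vmdk_gb):
    """Smallest VMFS 3 block size that can hold a virtual disk of vmdk_gb."""
    for block_size in VMFS3_BLOCK_SIZES_MB:
        if vmdk_gb <= max_file_size_gb(block_size):
            return block_size
    raise ValueError("disk is larger than the 2048 GB VMFS 3 maximum")


print(max_file_size_gb(4))          # 1024
print(smallest_block_size_mb(300))  # 2
```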
To change the block size of a vmfs filesystem, use vmkfstools to reformat the partition.
The command is : vmkfstools --createfs vmfs3 --blocksize 8M vmhba0:0:0:3
This destroys any data on the partition, so make sure you move your data before you do this.
With VMFS-5 this is history and the unified block size for all file sizes is 1MB.