ESX2 1ExamCram-SysAdminIandII
This is based on the admin guide for 2.1 downloadable from VMware’s website.
Further material highlighted in green is from the authorised courseware.
If you do feel there are errors or something is not clear – then drop me an email,
and I will correct the original in due course
Chapter X: Installation
Starting the Installation (Boot from CD)
1. Insert the ESX 2.1 CD into the CD drive
2. Boot server
Note:
The server should boot from the CD-rom
Note:
You have the choice of GUI or text-mode installation. If you wait it
defaults to GUI
ESX will detect your host bus adapters (SCSI/SCSI RAID) and load the
appropriate driver to allow you to access the physical disk
Note:
Do not use automatic partitioning. Under 2.1 the installation is Linux based, and
it attempts to allocate a large amount of the disk space to a Linux swap
file. If the server has a lot of memory, which ESX boxes generally do,
this will consume a large amount of disk space unnecessarily
Keyboard Configuration/Mouse
• Adjust to suit your preferences
• In my case choose
• Model generic 105 key
• United Kingdom as the Layout
• Generic 2 Button Mouse
• Dead Keys – appears to be enabled by default and allows special
keys/characters
License Agreement
1. Enable the X for I accept the terms in the license agreement
Note:
Apparently – it won’t boot if you don’t accept this!
Serial Numbers
Note:
• You can bypass inputting serial numbers at this stage – but you would
have to input them in the MUI
• If you did bypass them and add them later – you have to do a reboot!
• It is better to type them in lower-case, as UPPERCASE tends to fill the edit
boxes, and it is hard to tell 0’s from O’s
Device Allocation
Note:
• The amount of memory you reserve for the Service Console (default
192MB) has a direct relationship to the number of VMs the VMkernel will
support
• Options here will vary based on your hardware configuration. Normally,
defaults are fine.
• You may find it useful to make a note of the PCI Bus Device Function
numbers that it detects; this can help you in future troubleshooting
• System finds the first network card in the PCI bus and allocates to the
Service Console
• Any subsequent network cards are used by the Virtual Machines
• If you only have one onboard Host Bus Adapter – then X Shared with
service console must be enabled for it to be used for both Service Console
and storage of Virtual Disks on VMFS partitions
• As most servers only have one SCSI adapter or RAID controller card this is
a good default, unless you are dealing with a system designed for Disk
Duplexing (fault tolerance on the controller card)
• In my case the Dell PowerEdge 1650 has this configuration
o Boot (ext3)
Used to boot the system – 50MB is all that is required. No big files
are stored here, because if you did store them you could fill the disk and
stop the server from booting correctly
o VMimages (ext3)
A storage point for your ISOs and template disks, sometimes
referred to as “Golden Masters”.
o Vmkcore (vmkcore)
The core dump partition where the VMkernel puts memory dump
information in the event of a “purple screen of death” or a “kernel
panic”
• Each time you use the partition tool to create a new partition – you
must select on which disk you would like it to be
Note:
Below is what VMware recommends as a partitioning strategy, although a lot
depends on your disk capacity, of course. The size of VMimages is a “Goldilocks”
value – it depends on how much space you want for your ISOs/Disk Templates.
To summarize:
Here we have 3 primary partitions, and one extended partition with 6 logical
partitions (/home, /tmp, /var, /vmimages, vmkcore, vmfs2)
Warning:
Make sure you have the correct disk selected, each and every time you
create a partition
1. Choose New
2. Change the File System type to be Swap
3. Change the Size in MB, to be 1024MB
4. Enable X Force to be a Primary Partition
5. Click OK
1. Choose New
2. Change the mount point to be /
3. Change the Size in MB, to be 1800MB
4. Enable X Force to be a Primary Partition
5. Click OK
1. Choose New
2. Change the mount point to be /home
3. Change the Size in MB, to be 1800MB
Note:
Do not check off the option to force to be a primary partition
4. Choose OK
1. Choose New
2. From the File System type choose vmkcore
3. Change the Size in MB, to be 100
Note:
Do not check off the option to force to be a primary partition
4. Choose OK
1. Choose New
2. From the File System Type choose VMFSv2
3. Enable Fill to Maximum Allowable Size
Note:
Do not check off the option to force to be a primary partition
4. Choose OK
CAUTION:
Review your partitioning and get a colleague to check your configuration
before clicking Next
Network Configuration
• Do not use DHCP – this can cause problems with virtual MAC addresses
within VMs
• Use static addresses at all times for ESX and the VMs within…
• As you enter an IP address, the installer will pre-fill some of the other fields.
Some it might get right, such as the Subnet Mask; others it gets wrong: it
assumes your router is w.x.y.254 and your primary DNS is w.x.y.1
• You must set a host name
1. Hostname: esxinstructor1.education.vmw
2. IP address: 192.168.2.101
3. Subnet mask: 255.255.255.0
4. Gateway: 192.168.2.1
5. Primary DNS: 217.79.96.163
6. Secondary DNS: 217.79.111.7
7. Choose Next
1. Choose X System Clock uses UTC if you set your time via the BIOS
2. Select your location, in my case Europe/London
3. Click the UTC Offset Tab
4. Click UTC (with no offset in my case)
5. and enable X System Clock uses UTC
6. Choose Next
Account Configuration
Note:
You must create at least one other account as well as ROOT. Never create VMs
while logged on as ROOT, as you will face ownership issues; besides which, it’s a
security issue. You should be able to carry out most of your tasks as an ordinary
VM administrator, only occasionally needing to log in as ROOT.
Post-Installation (MUI)
Note:
Post-configuration is done in the Management User Interface (MUI), a web-page
front-end powered by Apache running on the Service Console. It is also the
primary management front-end. Beware of some pop-up blockers: they can
stop some of the page transitions which occur when you log on to the MUI for the
very first time, so you miss your opportunity to be guided through the
post-configuration changes.
Note:
Unless you have DNS name resolution configured, it is recommended that you configure
a hosts file with the name and IP address of your ESX server. Internet Explorer
can be slow to connect if you just use an IP address. Additionally, you can have
proxy server issues associated with just using an IP address
Note:
The three main post-configuration tasks are:
Note:
This is a built-in server certificate used to verify the identity of the server to
which you are connecting, and also to encrypt any information sent from the
workstation to the MUI. You can use your own certificates or enrol with a
third-party Certificate Authority should you wish.
Note:
The system will create a swap file called SwapFile.vswp which is the same size
as the amount of RAM on the ESX server – this is the recommended size.
Activating it at start-up is a good default
3. Click OK
4. In the second window click Activate – this will save you from doing an
unnecessary reboot!
Note:
To see your volume label use “Manage Files”, select vmfs, select your
disk (vmhbaW:X:Y:Z). It should be displayed like so:
SwapFile.vswp
r w - : 2047MB : Aug 5 09:45
Meaning of vmhbaA:T:L:V
Notes
• Each letter refers in turn to
• A – is the Host Bus Adapter in the server – 0 for the first one, 1
for the second and so on
• T – is the SCSI or SP target. Most SANs will have two controllers for fault
tolerance, with HBA0 plugged into a Storage Port on the first controller
(0) and HBA1 plugged into a Storage Port on the second controller (1)
• L – is the LUN number (an area of logical free space, usually comprising
more than one disk in RAID 0, 1, 5 or a combination). Numbering begins 0, 1, 2, 3
– some SANs reserve LUN 0 as a “management LUN”
• V – is the partition number for primary or logical drives, or for the ESX VMFS
partition
• So vmhba1:1:7:1 would indicate the second HBA connected to the second
controller on the SAN, using LUN 7 with partition 1
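As a minimal sketch of the naming scheme above, the shell function below splits a vmhbaA:T:L:V name into its four fields. The name describe_vmhba is hypothetical (it is not a VMware tool) and it only does string handling; it does not query the server.

```shell
# Hypothetical helper (not part of ESX): decode a vmhbaA:T:L:V name.
describe_vmhba() {
    # Strip the literal "vmhba" prefix, leaving "A:T:L:V"
    rest="${1#vmhba}"
    # Split the remainder on colons into the positional parameters
    old_ifs="$IFS"
    IFS=:
    set -- $rest
    IFS="$old_ifs"
    echo "adapter=$1 target=$2 lun=$3 partition=$4"
}

describe_vmhba vmhba1:1:7:1
# prints: adapter=1 target=1 lun=7 partition=1
```

Remember the adapter and target numbers count from 0, so adapter=1 is the second HBA in the server.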
Note:
Post-configuration of the ESX server is now complete. We are now in a
position to consider creating virtual machines
Note:
At the top of this view the first 7 or so will be “system” partitions like boot,
root, home, vmimages and so on
3. Select the link that says Edit, next to Logical VMFS-2.11 Volume
4. In the Edit box type in a name such as: Internal
Note:
Volume labels are entirely configurable – set them to be whatever you
find useful
5. Choose OK
6. Choose Close
Note:
To see your volume label use “Manage Files”, select vmfs, then select the
volume by its label (“Internal” in this example)
Please note the ROOT user cannot directly FTP or Telnet into an ESX server. You
can, however, telnet in as a VM administrator, and use the su command to
switch user once connected
Note:
This lowers security to allow Telnet, FTP – it also starts the services for
you
4. Click OK
Chapter 1: Introduction to ESX
• Core Elements of Design:
• Virtualisation/Abstraction
o CPU – appears to have own CPU or multiple CPUs. Most non-
privileged instructions go direct to CPU, privileged ones go via the
Virtual Layer
o Memory – seen as contiguous – but underneath there is swapping
and non-contiguous areas of memory – Guest OS unaware of this
o Disk – Represented as a SCSI drive connected to a SCSI adapter
regardless of the physical layer, which supports SCSI, RAID, and SAN adapters
from QLogic and Emulex. A Virtual Disk (VD) is portable to any
system regardless of hardware
• Service Console
o First installed component
o Runs the whole system with http/snmp/api interfaces
o Boots the installation – using a modified Linux distribution
o After booting the Service Console initialises the Virtual layer and
Resource Manager
o Components:
Blank
o Disk Modes:
o Boot from ISO files and floppy disks as well – on Rconsole you
would have to go into Settings, Configuration Editor, Floppy Drive and enable X
Connected, X Connected on Power Up – or in Floppy or CD –
specify a path to an ISO image
o Booting from a WIN98 Floppy worked in my Virtual Machine but not
on the physical hardware – GOOD OLD VMWARE!!!
o You can run fdisk in a VMware environment to partition up the DSK
o Alternatively, you could use PXE and use something like RIS from
Microsoft or the Rapid Deployment Pack from HP
• If you want over the network installs you would need to boot from a floppy
disk or ISO image to get network connections to start a network install…
• You might get warnings about disk corruption – just Guest OS getting
confused about virtual disk – not to worry, code is written which
eventually makes it accepted
• Network Card appears as a single AMD Network Card – until you
switch to native networking using the Vmware Driver…
Note:
Without SP3 you get: “System Process — Driver Entry Point Not
Found; The \SystemRoot\System32\drivers\vmxnet.sys device
driver could not locate the entry point
NdisGetFirstBufferFromPacket in driver NDIS.SYS.”
Like NT4, if you are using the vmxnet driver you have to add it in manually…
logon as root
mount the CD
copy the installer file to tmp
unmount the cd
see below:
su
mount -t iso9660 /dev/cdrom /mnt
cp /mnt/vmware-linux-tools.tar.gz /tmp
umount /dev/cdrom
Untar the VMware Tools tar file in /tmp and install it.
See below:
cd /tmp
tar zxf vmware-linux-tools.tar.gz
cd vmware-tools-distrib
./vmware-install.pl
Follow the remaining steps. Choose directories for the various files.
Choose a display size for the virtual machine. Enter the number for the
choice and press Enter.
• If you wish, start X and your graphical environment and launch the
VMware Tools background application:
vmware-toolbox &
Note: If you created this virtual machine using the vmxnet driver, you
now need to run netconfig or another network configuration utility in the
virtual machine to set up the virtual network adapter.
On Novell Console:
load cdrom
load cd9660.nss
vmwtools:\setup.ncf
When the installation finishes, the message VMware Tools for NetWare are
now running appears in the Logger Screen (NetWare 6.5 guests) or the
Console Screen (NetWare 5.1 guests).
Restart the guest operating system. In the system console, type: restart
server
After you install VMware Tools, make sure the VMware Tools virtual CD-
ROM image (netware.iso) is not attached to the virtual machine. If it is,
disconnect it. Right-click the CD-ROM icon in the status bar of the console
window and select Disconnect.
• On all guest OSes the tools load automatically, except Linux. You have to add
vmware-toolbox to the start-up programs in the Gnome Control Center
(Linux)
• To configure options see /etc/vmware/vmware-guestd --help
Shutdown & Restarting a VM
• Stop, Go, Pause (suspends to a file) and Restart
• From the VMware Management Interface (Web)
• Or from Rconsole
vmware-cmd /root/vmware/win2000Serv/win2000Serv.vmx reset trysoft
Note:
This stuff is case-sensitive I believe
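To make the vmware-cmd example above easier to reuse, here is a minimal dry-run sketch. The function power_cmd is hypothetical: it only prints the command line so you can check the path and spelling (which matter, given the case-sensitivity) before running it on a real server. The reset operation and the trysoft mode come from the example above; treating stop the same way is an assumption from memory of the vmware-cmd usage text.

```shell
# Hypothetical wrapper: print the vmware-cmd line for a power operation
# instead of running it, so it can be checked first.
power_cmd() {
    cfg="$1"               # full path to the VM's .vmx file (case-sensitive)
    op="$2"                # reset (from the notes) or stop (assumed)
    mode="${3:-trysoft}"   # default to trysoft, as in the example above
    case "$op" in
        reset|stop) ;;
        *) echo "unknown operation: $op" >&2; return 1 ;;
    esac
    echo "vmware-cmd $cfg $op $mode"
}

power_cmd /root/vmware/win2000Serv/win2000Serv.vmx reset
# prints: vmware-cmd /root/vmware/win2000Serv/win2000Serv.vmx reset trysoft
```

Once the printed line looks right, paste it at the Service Console prompt.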
Using PXE
• Boot from PXE rather than CD/floppy/HD to:
o Remotely install an OS – as with a RIS server
o Deploy an image of a virtual disk to a server – using Ghost or Altiris
o Boot Linux disklessly – and run it across the network
o Windows XP not supported in this method
• Requirements
o Make sure VM setup to use a virtual adapter
o Either vmxnet or vlance is supported
o Virtual Disk but no OS installed
o VM boots according to its “BIOS” settings
o They recommend putting PXE at the top of the list for the boot
order
scsi<n>.virtualDev = "vmxbuslogic"
to
scsi<n>.virtualDev = "vmxlsilogic"
or unregister with:
Configuring VM’s
• To change hardware options on a machine, you must have the VM
stopped!
• Only one admin at a time should change the settings
• Can edit VMX file directly
• Some scary stuff here I’m avoiding!
Chapter 3: Using the Management Interface (Web)
Note:
Most hardware or resource settings can only be configured when the VM is
shutdown…
General
• After login you get the “Status Monitor” page – high level view of the
servers
• Refreshes every 90 seconds
o you may still need to do manual refreshes,
o especially if someone has used Rconsole to change the power
status of a machine
• Timeout of the console is 60 minutes
o You can change this by editing vmware_SESSION_LENGTH
o In the /home/vmware/mui/apache/conf/access.conf file
o Block access by making this entry 0 (zero)
o -1 sets an indefinite duration for the session
• Basic Stats get updated every 20 seconds
o Can be set to 1-15 to even out any spikes in performance – cough
and splutters the VM’s make
o Can be changed in the access.conf file
o PerlSetEnv vmware_STATS_PERIOD 15
o Restart Apache to allow change to take effect with:
o /etc/init.d/httpd.vmware restart
• You can launch an Rconsole session from the MI, but if you are using IE 6.0
review your security settings
o Change security options to: Do not save encrypted pages to disk
o Apparently your SSL connections can be cached and potentially
intercepted by someone else
o Switching this on is deeply annoying – it doesn’t open Rconsole
directly
• Via Proxy Servers/Security
o In W2K3 – make the VMware MI page a “Trusted Site”
o In others – make sure X Bypass Proxy server is enabled
o Perhaps don’t use the FQDN but the NetBIOS name – as those won’t get proxied…
o If no proxy – they recommend using the FQDN
o On first connection you get SSL errors until you add the certificate to
your store
• Browsers Supported
o IE 5.5. or higher
o NN 7.0 or higher
o Mozilla 1.x or higher
o If using NN or Moz - you need JavaScript and Stylesheets enabled
o If clicking at the Icon to start a Rconsole session – then NN/Moz
will need mime types defined. IE on the other hand prompts for a
warning about downloading the . file
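Going back to the access.conf settings in the list above (session length and stats period), the sketch below shows one way to preview a changed value before touching the real file. The function set_perl_env is hypothetical, and the sample file is made up: it assumes the vmware_SESSION_LENGTH line uses the same "PerlSetEnv NAME value" form as the vmware_STATS_PERIOD line quoted above, with the 60 and 20 taken from the defaults mentioned in these notes.

```shell
# Hypothetical helper (not a VMware tool): print a copy of an Apache-style
# config file with one "PerlSetEnv NAME value" line changed.
set_perl_env() {
    # $1 = variable name, $2 = new value, $3 = path to the config file
    sed "s/^\([[:space:]]*PerlSetEnv[[:space:]]\{1,\}$1\)[[:space:]].*/\1 $2/" "$3"
}

# Try it on a throwaway sample rather than the live access.conf.
sample=$(mktemp)
printf 'PerlSetEnv vmware_SESSION_LENGTH 60\nPerlSetEnv vmware_STATS_PERIOD 20\n' > "$sample"
set_perl_env vmware_STATS_PERIOD 15 "$sample"
rm -f "$sample"
```

The function writes the result to standard output and leaves the input file alone, so nothing changes until you deliberately save the output and restart Apache as described above.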
CPU Resources
• Tab shows
o Minimum – needed to run the machine – 0%
o Maximum – the highest amount can be 200% if you have two CPUs
and so on
o Shares
• represents a relative metric for allocating processor
capacity. The values low, normal, and high are compared to
the sum of all shares of all virtual machines on the server
and the service console. Share allocation symbolic values
can be used to configure their conversion into numeric
values.
o Isolated from Hyper-Threading
• represents the CPU operation state of the virtual machine.
Enabling this option prevents a virtual machine from sharing
a physical CPU with other virtual machines when Hyper-
Threading is enabled.
Memory Resources
• Similar to CPU in your options
• Except you have memory affinity – which relates to “Memory Affinity — if
displayed, this represents the NUMA nodes on the ESX Server system to
which the virtual machine can be bound, when the ESX Server system is a
NUMA system.”
Disk
• Read & Write bandwidth… you can reduce the Share number, but you cannot make
allocations like you can with Memory or CPU
• This part of the admin guide became quite repetitive – and kept on
referring to later parts of the guide
• Floppy
o Only one VM can connect to the physical floppy at one time.
o Virtual Floppies can be loaded from a ISO image file!
o Options
Connect – connects to the ESX servers floppy
Connect on – connects to ESX server during start-up –
required to boot from floppy?
• CD-ROM
o Same as Floppy drive really
• CPUs
o It is possible for a single CPU system to emulate a 2-cpu system –
like hyper-threading would
• Memory
o Number assigned in multiple of 4
o 128/256/512/768/1024
• Virtual Adapter
o Choose which PHYSICAL adapter to use
o Choose which DRIVER to use
VMnic – connects to a physical adapter – acts normally on the
network
VMnet – a virtual network which allows the VMs on the same
server to talk to each other
o Drivers: vlance – which installs automatically OR
o Vmxnet – provides better performance but only on a Gigabit Ethernet
card
o Vmxnet – requires VMware Tools to be installed in the Guest OS
• SCSI Adapter
o Virtual Device – As mentioned earlier – Buslogic or LSI
o Bus Sharing:
Physical – share disks with any virtual machines on any
server
Virtual – share disks with VM within a server
None – to stop sharing of disks
• V-Disks – nothing added here – beyond what was mentioned earlier
• Colour
o Increase network transfer between Rconsole and VM
o But might need to be increased due to app requirements such as
Citrix MC
• Virtual SCSI
o Possible to assign virtual scsi adapter
o If you move from default – 0:0 it warns you about possible boot
problems
• Add Device Wizard
o Allows you to configure more devices – looks straightforward
except for tape device
o Located on the Hardware Page of a VM machine as a link at the
bottom – Add Device
o Check the SCSI ID via file manager
o /proc/vmware/vmhda<x>/<y>:<z>
X = HBA id
Y = scsi id
Z = scsi lun id
Start-Up options
• You can set up ESX to start a VM automatically when it boots – by
default they don’t
• Shutdown – shutdown ESX defaults to shutting down the VM as well – can
set a delay…
• Appears to be unavailable in ESX2.0
Modifying Peripherals
• Stuff you can do!
o Adding more than 6 SCSI devices – as SCSI has been updated to
work beyond 6 devices
o Using a Raw Disk – use if you need to access resources on a
physical disk – like my images partition? Nothing new here…
o Parallel Ports (for Dongles)
o Serial Ports
o Disk Modes…
• Some of this can only be done via the VMX file
• Parallel Port Set-up
o Only one OS at a time – only one VM machine at a time
o Reboot ESX, BIOS, set Parallel Port to be PS/2 or Bi-Directional
o Log on to the Service Console as root and run these commands:
/sbin/insmod parport
/sbin/insmod parport_pc
/sbin/insmod ppdev
o Properties of VM, edit the VMX file in Verbose Options and add
these lines
o NOTE:
When the virtual machine starts after you update the virtual
hardware version, you see a dialog box with the message “The
CMOS of this virtual machine is incompatible with the current
version of VMware ESX Server. A new CMOS with default values will
be used instead.” Click OK.
o NOTE:
As you start the virtual machine, you may see a message warning
that the parallel port is starting disconnected. If you do, connect to
the virtual machine with a remote console and use the remote
console's Devices menu to connect the parallel port.
Deleting a VM
• Can only be done by:
o Root
o Creator/Owner of the VM
o If permissions have been set to allow you to do so
• Delete option appears in the pull down within the Web-Console
• Will ask you if you want to delete files as well…
• Choice of keeping the DSK file, while deleting the logs
• All files can be deleted except the Redo.log and Lock files
• DSK files NOT associated with any other registered VM can be deleted also
• Does not display DSK file associated with another VM
Managing ESX Resources
• This is the main options page on the web-console…
• Only viewable by root
• Logout button logs you out – clearing any cached credentials etc
• Web management console – runs as an Apache service – this can be
stopped/started at the console
o /etc/init.d/httpd.vmware stop or start or restart
• VM config file
• Ordinary File
• VMFS Volume
• Same as above with a collection of pages – is WS/GSX file
• VMFS allows for files larger than other File Systems – so if you copy DSK
file to another location – it will be CONVERTED and SPLIT into 2GB formats
–
• BEWARE IF YOU DO THIS. I DID IT BY ACCIDENT. IT FLUSHED THE DISK.
CRASHED THE BOX – AND I COULDN’T GET BACK IN!
• This is done by
• Always opens to /root/vmware within sub-dirs are created for VM which
contain the VMX files
• The DSK files are held on a separate partition formatted with VMFS
• You sometimes find it can’t display certain files within the manager (those over
2GB) – this stops some VMware Workstation files from being imported
• Can select multiple files/directories and set permissions
o Inherited File system
o R W X – read, write, execute
o – indicate setting is the same for all files – and nothing is granted
such as R W –
o A blank space indicates settings are NOT the same for all files
o R – to View the Virtual Machine (RConsole)
o W – To make changes to its configuration
o X – To change its power status
o RWX – Register and Un-register a machine
o These perms are used to control other peoples access…
• Permissions and Directories
o Previous versions – checked both FILE and DIR permission to the
VMX file
o Therefore you needed X on every directory to the file
o The remote console has this requirement still…
• Assigning Permissions – “Flagship User”
o One account that has rights to all machines
o Not tied to a particular person – so control is maintained regardless of
holidays and/or personnel changes
o Avoid problems with access privileges
o Can be quick way of assigning the X privileges
renice 0 -p <vmware-serverd_process_ID>
renice 0 -p <httpd_process_ID>
o Not sure why you have to put it back down to 0??? Wots the point
in that???
o Looks like a temporary change to gain access to start the VM’s off?
• Increasing memory resources to Apache Process
o Each VM gets 25MB from Apache to hold the VM’s data
o Sufficient for 80 VMs
o 200 is the maximum number of registered machines
o Apache may run out of memory
o You may get a “Panic out of memory” message in the
/usr/lib/vmware-mui/apache/logs/error_log file – and the Web
management interface closes down
o Solution – allocate more RAM by:
Edit the /etc/vmware/config file
Add this line mui.vmdb.shmSize = "37748736" which is 36MB
(36 x 1024 x 1024)
Restart Apache with /etc/rc.d/init.d/httpd.vmware restart
o Could slow VMs because it has eaten up more RAM
• Increase Authentication Time Value – to give the system more
time to validate you
o Edit the /etc/vmware/config file
o Set the value vmauthd.connectionSetupTimeout = 120 (the default is
30 seconds; this raises it to 2 minutes)
• Increase Memory Allocation to vmware-serverd process
o Edit the /etc/vmware/config file
o vmserverd.limits.memory = "49152"
o vmserverd.limits.memhard = "65536"
o These changes raise the soft memory limit for the vmware-serverd
process to 48 MB (48 multiplied by 1024) and the hard memory
limit to 64 MB (64 multiplied by 1024).
o Reboot to have these changes take effect, or use killall -HUP vmware-serverd
• Running Many Virtual Machine with a Significant CPU Load
o Increase the number of “shares” to 10,000
o Options, Service Console Settings, click the CPU tab,
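The memory values in the Apache and vmware-serverd bullets above are easy to get wrong by a factor of 1024, so here is a small worked check of the arithmetic. The function names mb_to_bytes and mb_to_kb are hypothetical; the units (bytes for mui.vmdb.shmSize, KB for the vmserverd limits) are inferred from the multiplications given in the notes.

```shell
# Convert megabytes to the units used by the /etc/vmware/config entries above.
mb_to_bytes() { echo $(( $1 * 1024 * 1024 )); }   # for mui.vmdb.shmSize
mb_to_kb()    { echo $(( $1 * 1024 )); }          # for vmserverd.limits.*

mb_to_bytes 36   # prints 37748736 (the shmSize value above)
mb_to_kb 48      # prints 49152    (vmserverd.limits.memory)
mb_to_kb 64      # prints 65536    (vmserverd.limits.memhard)
```

If you need a different size, change the number of megabytes and copy the printed value into the config line.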
Backing Up VMs
• Tape or Network
o Recommend a second SCSI controller for the tape device
o Separate from consoles controller
• Backup within the VM
o Run a back up to tape or network within the VM’s OS
o If failure occurs you still have VM to recreate & load recovery
software to then restore the Guest OS you backed up
• Backup from the Console
o Back up the VMX and DSK files
o You lose the advantage of the above – the ability to restore files within the
Guest OS
o Use VMware Scripting API with conventional backup software to
o Works with many disk modes
• Hardware/Software Disk snapshots
o Backup/Clones from HW vendor
o Disk subsystem, File System, Volume Manager
• Network based replication tools
o Synchronous or Asynchronous is supported
o But software remote mirroring can cause problems
May not recognise VMFS
Increase network load
CPU load
More common in Windows & Unix than in a Linux
environment
Chapter 4: Using Remote Console
• Up to 3 people can connect to the same VM within Rconsole at any one
time
• Start Rconsole from the telly icon in the Web Management Tool – or
running it individually – server admins get a choice of VM’s to connect to
on that ESX server
• In windows Rconsole is started from a shortcut once installed or in Linux
using:
vmware-console
• Power Settings
o Same Power Off, Power On, Suspend, Reset Buttons in Rconsole as
you see elsewhere
o With VMtools installed the restart options allow you to set scripts to
run
o Scripts work with – Power On, Suspend, Resume – BUT NO
OTHERS!
o RESET, Power Off – are not GRACEFUL shutdowns – it’s like whacking
the power button!
o You should use “Shutdown Guest OS” or “Reset Guest OS” from the
pull down-lists in Rconsole
o Similar Options appear in the Web console
o I THINK VMWARE NEEDS TO REVIEW THESE DEFAULTS
• VMTools Settings
o Synch time –
only if Guest OS is EARLIER than Service console
o Setup Device
such as floppy, CD, Ethernet –
can also be done from Devices, and Settings, Configuration
Editor
o Set scripts –
the default ones are named suspend-vm-default.bat –
all that changes is the first part of the filename – resume,
poweron, poweroff
Only the suspend script does anything – it contains the command
vmip.exe -release
o Shrink
Export an ESX disk to GSX using the fewest number of files
Optional, not required
In ESX the dsk allocation is total, not dynamic – it doesn’t
grow in size like a GSX disk does
The shrinking process is meant to address this difference in the
way ESX and GSX treat their dsk files
It could potentially split one big DSK file into multiple
smaller ones
Do the export with File Manager or the vmkfstools command
Second tab shows “unsupportable” partitions such as CD-
ROM and Floppy
o Cut and Paste
Apparently you can between Rconsole and Workstation –
Needs VMware Tools
Wouldn’t work for me!
Chapter 5: Using the Service Console
General Stuff:
• Based on a modified Red Hat Linux 7.2 distribution
• So it can be managed by the VMkernel
• Most services have been disabled, especially network ones
• SSH is enabled for remote access
• Don’t run heavy loads on the console – as it takes resources away from
the VMs
• Avoid DHCP – if you do use it you need Dynamic Update on the DNS server
• You can dedicate a Network Adapter to the Console – but if you share
it with VMs – you NEED a static IP to do this
• ALT+F2 at the Console to Logon…or SSH or Telnet in if your Security
Settings allow
• Most commands are the same as Linux ones – there are some specific to
Vmware
• They have their own help system based on “manuals” – you use the
command MAN plus the command you want help on to access them
• VMFS Volumes are automatically “mounted”
• Some Examples:
o findnic – used to identify network cards by observing their LEDs flashing –
nice method!!!
o vmkfstools – used to manipulate the file system – you specify
VMFS volume or SCSI id values
o vmkload_mod – loads device drivers, network shaper modules
o vdf – shows capacity of all volumes; replaces/supersedes df
• Common Linux commands
o cd – change directories
o cp – copy files
o ln – create links or shortcuts
o ls – to list files in the directory
o mkdir – to make directories
o mv – to move a file
o pwd – show path to current working dir
o rm – remove a file – delete
o rmdir – remove a directory
o cat – prints a file to the screen, like the DOS command type
o grep – search for a text string in a file
o less – to show only a screen’s worth of data in a file at one time
o more – to pause the scrolling of data
o apropos – to search for commands that contain a particular string
o du – display size of a file/directory
o fdformat – format a floppy disk
o groupadd – add a group
o hostname – show esx server name
o ifconfig – shows network configuration
o insmod – load a loadable module into the kernel
o kill – kill a process by its process number – kill -9 is the surest way
– but this doesn’t release editor buffers
o lsmod – list all loaded modules
o lspci – list all PCI devices; -v does a verbose listing
o mount – load a storage device at a specified location in the file
system
o umount – inverse of above
o passwd – change your password; root can change another person’s
password
o useradd – add a user to the system
o who – show names of users logged on to the system
o whoami – shows you who you have logged in as.
o su – switch user…
o exit – switches you back to your previous user name
o ps – show names, process ids and other info; -f and -e for full and
every process
o shutdown – to shut down the computer – with a delay of 5 mins
(shutdown -h 5) or immediately (shutdown -r now)
o chmod – used to change permissions
o chown – change ownership
o chgrp – change group setting for a file
• Some examples of these commands together
o man cat | less – to get help on the cat command with space to
scroll through and q to quit
• /proc/vmware is a “directory” of files loaded into RAM which provides the
virtualisation layer – it can be altered with the echo command – NOT
RECOMMENDED EXCEPT IN A SUPPORT CALL
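Going back to the command list above, here is one more small sketch of commands working together through a pipe, this time cat and grep. The function count_lines_with is hypothetical and the sample file is made up purely for illustration; nothing here touches /proc/vmware.

```shell
# Hypothetical helper: count the lines of a file that contain a string,
# chaining cat and grep from the list above with a pipe.
count_lines_with() {
    # $1 = text to look for, $2 = file to read
    cat "$2" | grep -c -- "$1"
}

# Build a throwaway sample file, search it, then tidy up.
sample=$(mktemp)
printf 'alpha\nbeta vmware\nvmware gamma\n' > "$sample"
count_lines_with vmware "$sample"   # prints 2
rm -f "$sample"
```

The same pattern works with ps -ef in place of cat when you want to find a running process by name.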
Adapter Bindings
• Virtual Stuff
o Allows you to make virtual adapters/switches
o You give it a label – and attach it to genuine physical adapter
o Changing this label can stop VMs from starting up – as they fail to
find the adapter
o If you don’t attach to a Physical Adapter – you end up with a virtual
adapter that can only communicate with other VMs on the same server
o Also supports Port Groups
Extensions of networks – using Virtual LANs
Requires a Vlan ID
Goes via vmkernel – you can make it go straight to the
adapter and on to the wider network
In the VMX verbose options
Net.VlanTrunking to 0 which disables this process
• Physical Adapters
o Set their speed or auto-negotiate
Configuring SANs
• Make sure ONLY ONE ESX server has access to the SAN when you are first
formatting the disk
• Set all partitions on the SAN for Public or Shared Access
Configuring Storage
• Create VMFS Partitions wherever you see Free Space
• Edit, Label, Remove, Change the Volume Label
• Span Partitions – like a volume set?
• You cannot alter any of the Linux partitions set up during the installation
• You create a core dump file – this should be stored on a local disk no
o Stores crash info
o Required for Debug and Support
• Rest of Partition stores the VM logs/dsk files
• You can convert from VMFS-1 to VMFS-2
o You have to deactivate the swap file
o The metadata overhead is greater on a VMFSv2 volume so you
have to concern yourself with whether you have enough space for
the conversion
• Access Modes
o Public – default, recommended
Multiple ESX servers access the same partition on the SAN
With Version 1 – only one at a time!
Version 2 – allows concurrent access
Automatic “locking” systems to ensure file integrity
o Shared
Used for fail-over clustering systems
Turns OFF the software/kernel VMFS SCSI 2 Reservations
and allows physical SCSI 2 Reservations to be allowed by
Hardware – on the complete LUN
Between two ESX servers with a VM on each
Or between a Physical Server and VM
• To convert from one mode to another – you have to “deactivate” the swap
file
• Default Maximum VM dsk size is 144GB
• Spanned Volumes – a single addressable space made of lots of volumes
o Each volume is referred to as an Extent
o You cannot alter the maximum file size
o Cannot be used to expand a drive that is getting full
o It deletes data when the Spanned Volume is being created
o Best done at the beginning or else backup and restore your VM’s
o If your volume is spanned you cannot simply remove it – there is a
procedure
• Adapter Bindings
o Allows you to see the SAN adapters
o Displays the World Wide Port Names (WWPNs)
o View the Persistent Binding Status
o Assigns specific target ID’s to SCSI devices
o ID is retained at reboot
o Useful for RAW disk setups (dsk is mapped directly to hard-drive
not to a dsk file)
• Fail-over Path Locations
o Paths to multiple fibre cards for redundancy
o Shows paths and preferred path (marked as preferred)
o Last path used to access the LUN
o Three colour codes
Green – Active and Data is being passed successfully
Orange – Path is disabled and available for activation
Red – Should be active, but system cannot connect to the
LUN
o Fail-over Policies
Choose how adapters will be selected
Two options
Fixed – always use preferred path where possible
Most Recently Used -
That is, the virtual machine has three times as much CPU time as
the service console, as long as the virtual machine’s CPU
percentage is between 20% and 50%. In actuality, the virtual
machine may only get twice the CPU time of the service console,
because three times the CPU time exceeds 50%, or the maximum
CPU percentage of the virtual machine.”
o If you are running VM on the same HD as the console – consider
increasing CPU
• Same for Disks – but no percentages just the share value
o “For example, the service console and 2 VMFS partitions, VMFS-A
and VMFS-B, are located on the same hard disk on the ESX Server
system. If the service console has 2000 shares and VMFS-A and
VMFS-B each have 1000 shares, then the service console has twice
the disk bandwidth of both VMFS-A and VMFS-B.”
o SWAP
Reserved – as above but with SWAP file instead
Unreserved
TOTAL
o Memory
Memory Available to Power On a Virtual Machine – does
what it says on the tin!
• Virtual Machines: Virtual Machine Summary
o RAM – allocated memory, often more than is really being used
o Private – memory used by VM only
o Shared – memory used between VMs
o Swapped – memory moved from physical RAM to HD (Swap,
rather than fault)
o Balloon Driver – memory reclaimed from the VM in conjunction with the
VMTools VMMEMCTL driver and the Guest OS
o Unused – memory never used by the VM, which therefore has not been
allocated
o Active – Memory recently used
o SWAP I/O – bytes per second of faults/swaps
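The disk shares quote earlier in this list (2000 shares for the service console, 1000 each for VMFS-A and VMFS-B) can be checked with a bit of arithmetic. This is only a sketch in Python, not something from the admin guide: the function name bandwidth_fractions is made up for illustration, and it assumes bandwidth is split strictly in proportion to shares while all three are contending for the disk.

```python
def bandwidth_fractions(shares):
    """Split a contended resource in proportion to each consumer's shares."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# Share values taken from the quoted example.
shares = {"service console": 2000, "VMFS-A": 1000, "VMFS-B": 1000}
fractions = bandwidth_fractions(shares)

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.0%} of disk bandwidth")
```

On these numbers the service console ends up with half the bandwidth and each VMFS partition with a quarter, which is the "twice the disk bandwidth" in the quote.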
Monitoring VM’s
• Can retrieve info if the VM dsk is stored on an ESX VMFS partition – but not if
it’s on an NFS mounted drive
• Kind of data you can retrieve
o On/Off
o Loss of Heartbeat
o Resumption of Heartbeat
o Requires VMtools
o Not generated immediately when a new VM is registered
o Must reboot or restart Vmware-ServerD with:
Volumes
• VMFS-2 volumes can span multiple partitions, multiple LUNS or Physical
Disks
• Volume is a “logical grouping of extents” – each extent is a portion of disk
space (a partition), and together they are addressable as a single “volume”
• VMFS-1 is limited to one extent – so can’t be used to soak up disk space
across multiple disks
• Use vmkfstools –p <volume label> in the /vmfs directory to see more
information
• Labels – can be set when you create and format the partition
o Can be useful – rather than using the proper SCSI id such as
vmhba0:3:0:1
o Acts like a shortcut effectively
o vmkfstools -S vms vmhba0:3:0:1
o Would allow you to refer to vms:w2k3.dsk in the command line and
in the VMX file
• Labels also useful for:
o Adding additional disks/scsi adaptors
o Useful for LUN ID between servers
o The LUN ID can change – as long as the servers are pointing to the
label not the LUN ID
VMFS Accessibility
• Public
o Default
o Version 1: Multiple ESX servers access same data on the SAN – one
at a time
o Version 2: Multiple ESX servers access same data on the SAN – at
the SAME TIME! – with locking to ensure file integrity
o Recommended especially on SAN based systems
• Shared
o Used for fail-over clustering based systems
o Among VM’s on the different ESX servers
o Or between Physical (NodeA) and Virtual Machines (NodeB)
• Private
o System used previously
o Still supported but recommend you convert them
o No performance overhead in doing so
o They recommend public access
• Changing Accessibility
o Done within the Options page in the web-based MUI
o Storage Configuration
o Cannot be committed if any files are open and in use on that VMFS
Volume
o Warnings and errors occur
Using vmkfstools
• Supports creating of dsk files on a SCSI disk
• You can do most of the file management tasks with the MUI
• If the command fails – check the /var/log/vmkernel log file or Options, System
Logs in the MUI
• Uses a special syntax to address adapter, target, LUN number and
partition (4 numbers all together) such as
vmhba1:2:0:3
• Syntax
• Such as:
vmkfstools /vmfs/vmhba1:2:0:3/rh9.dsk
vmkfstools /vmfs/lun1/rh9.dsk
• Options can be specified with long and short names such as:
Advanced Examples
• Creating a new VMFS2 partition
o vmkfstools -C vmfs2 -b 2m -n 32 vmhba1:3:0:1
• Extends an Existing VMFS Partition by spanning two partitions
o vmkfstools -Z vmhba0:1:2:4 vmhba1:3:0:1
• Names a VMFS volume
o vmkfstools -S mydisk vmhba1:3:0:1
• Creates a new VMFS virtual disk file
o vmkfstools -c 2000m mydisk:rh6.2.dsk
• Imports the contents of a virtual disk to the specified file on a
SCSI device
o vmkfstools -i ~/vms/nt4.dsk vmhba0:2:0:0:nt4.dsk
• Import a GSX or WS Virtual Disk into ESX
o vmkfstools -i winXP.vmdk vmhba0:6:0:1:winXP.dsk
cat /proc/vmware/scsi/vmhba0/1:0
or
o This tells the Qlogic driver to clear the cache of existing LUN’s
o Be sure to choose the RIGHT qlogic driver for IBM/HP/EMC storage
– it is 6.04 and is the default driver used
• DO NOT RUN vmkfstools -s (scan) ON NON-FIBRE CHANNEL ADAPTERS
Persistent Bindings
• You can hardcode an HBA with specific SCSI devices
• Esp useful if the server connects to RAW storage rather than a DSK file
• There is a Perl script called pbind.pl that allows you to configure this via the
service console rather than through the MUI
Multi-Pathing
• At least two routes to two different switches to two different controllers on
the SAN
• The most fault tolerant SAN solution currently available
• vmkmultipath -q is a command-line utility that displays the state of all or
selected paths
• Gives status info like: on, off, dead, preferred, active
• Policies – MRU (Most Recently Used) used for Active/Passive systems (the
default) and FIXED for Active/Active
• -s to set a path with -e to enable, -d to disable, -r to set the preferred
path
• Preferred path is ignored in MRU
• Syntax is like this: vmkmultipath -s vmhba0:0:1 -e vmhba1:0:1 with first
device being the controller and second device being disk
• In the event of a failure – it takes about 30-60 seconds for a SAN to detect
this – multiple failures show up as I/O errors on the VMFS partitions
• PortDownRetryCount – controls this fail-over period on a Qlogic HBA – the
wait is N x 2 seconds, so a value of 15 would mean a 30sec wait
• On Windows boxes increase the Disk “TimeOut” value in the Registry – so
that it is longer than the failover time
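The PortDownRetryCount rule and the Windows disk timeout advice above boil down to a simple comparison. A minimal sketch in Python, assuming the N x 2 seconds rule exactly as stated in these notes. The function names are invented for illustration and the 60 and 10 second guest values are just example numbers, not recommendations.

```python
def failover_wait_seconds(port_down_retry_count):
    """Fail-over wait implied by PortDownRetryCount (N x 2 seconds, per the notes)."""
    return port_down_retry_count * 2


def guest_timeout_is_safe(disk_timeout_seconds, port_down_retry_count):
    """True if the guest's disk timeout outlasts the HBA fail-over wait."""
    return disk_timeout_seconds > failover_wait_seconds(port_down_retry_count)


wait = failover_wait_seconds(15)
print(f"PortDownRetryCount=15 gives a {wait}sec wait")
print("60sec guest timeout safe?", guest_timeout_is_safe(60, 15))
print("10sec guest timeout safe?", guest_timeout_is_safe(10, 15))
```

The point is that the guest must be prepared to wait longer than the HBA takes to fail over, otherwise Windows reports disk errors before the alternative path is in use.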
Chapter 10: Configuration for Clustering
Warning:
From this point onwards, I had attended the Admin I and II course, and developed
the course to teach it – so I have only documented stuff I didn’t know
• If you accidentally share a VMFS that contains a boot disk this can stop the boot
process! It gets reserved by another machine so the VM cannot access it
to boot!
• Most Applications don’t do SCSI reservations – and this is what the clustering
service provides to all applications
• There are release and reset command switches you can use with
vmkpcidivy if this happens
• The reason LUN masking is useful – is that it reduces the chances of these
SCSI reservations occurring when you don’t want them
• NLB is not really touched on in the Admin Course – but I would recommend
boning up on it if you are not familiar…
Extra Clustering Info from Microsoft Documentation
• A cluster can contain more than one server – depending on your OS
o W2K Adv – 2 nodes
o W2K DataCenter – 4 nodes
o Win.NET Adv – 4 nodes
o Win.NET DataCenter – 8 nodes
o NLB is 32 nodes regardless of OS
o COM Load-balancing is 8 nodes regardless of OS
• Upgrades – move app to nodeb, upgrade package on nodea, switch back
to nodea, upgrade package on nodeb – perception of no down time
• Really for F&P, Database,– use NLB for Web and Terminal Service – e-
commerce sites use both NLB on the web-pages, and clustering on the
backend databases
• Cluster really only handles failure of the server and the services it
supports – its not intended to protect the users data in the event of a
failure – however you can have shared data disk – typically for data like
databases
• High Availability, Reliability, Scalability
• Limits – software compatibility? Virus? Software corruption? Human error?
• Sometimes referred to as a pack, rather than a cluster
• Whereas front-end NLB servers are referred to as Clones, rather than
Farms
• With SQL you would be likely to partition the database up into sections so
one cluster dealt with A-F, another G-M and so on – rather than all servers
responding to all queries
• Sometimes called a “shared nothing” cluster – ensures that two active
nodes in two different clusters could never write to the same db at the
same time – and cause corruption – supported by exchange/sql
• A “component routing cluster” handles comms between front-end servers
and Application Servers (also clustered)
• Scalability – a function of which Windows OS you choose to install on, and
the CPU/Mem capacity of that OS – so DataCenter supports more
CPU/Mem than Adv Server does
• Scale out – more servers in the pack/clone
• Scale Up – more hardware resources
• Active or Passive – with multi-node clusters you can have different combos
of both
• Nice thing about passive is that it has no load – therefore it can take a lot if
one node fails… but if you have an active/active system and one node fails
– are there enough resources to take the load of the lost node – the pack of
cards effect…
• Bad thing about passive – it is the “insurance policy”: you don’t feel the benefit
of your investment until a failure occurs
• Answer: make sure that the servers are loaded at 50% or less so
there are “free resources” should one of the nodes fail.
• You could have two clusters – which means you have two actives and two
passives
• Generally if you have all nodes active in a 4-node cluster each node only
takes 25% of load – making sure that it can easily take additional load
should 1 of the 4 fail – resources are at least doubled to cover for this
scenario
• Put simply a two-node cluster uses 50% of resources per server – leaving
50% spare. With a 4-node cluster – you reserve 25% of resources and
utilise 75% of resources on each server – a bit like RAID the cost of excess
capacity goes down as the number of servers in a cluster goes up
• DataCenter supports “cascading failover” where you set the “Preferred
Owner”, and then the order and names of the remaining Nodes 2, 3, 4
• Summary – active/passive is cheaper, but you waste money on an unused
server – active/active is more expensive, but you get more value for
money for your investment
• Site redundancy – what if you lose the whole location? Answer: duplicate
site – can go for full duplication or “partial implementation” – that only
duplicates for a limited period, peak traffic, limited services (not fully
functional)
• If you go for full duplication – you then face the challenge of keeping more
than one dataset in synch – and the possible loss/corruption of data this
entails
• Could have a “stretch cluster” where you have a LAN like link between two
sites (latency of less than 500ms or SAN with fibre which has longer cable
lengths)
• Windows.NET supports “majority node” clustering – changes the way the
quorum is used – instead of single quorum shared between the nodes –
each node has its own quorum which is kept in synch via the network (?)
and gets round the limits of SAN cables and so on – they call it a “Quorum
Set” looks like RAID5 on the network – if one of the quorums is lost – it
can find that info out from the remaining quorums in the set(?)
• Fibre channel is the preferred method, although SCSI is supported
• Two nodes (2xIP address each for Private, and Public nets) and one Virtual
Server IP – which is what the users connect to…
• Failback support is there – so if nodea fails it can failover to nodeb, if
nodea comes back online within a configurable time – it rolls back to nodea
• Resource Group – not all services/apps need to be clustered – so we just
add the apps that do – to the resource group. What cannot be clustered:
o Non-IP based apps using NetBEUI, IPX/SPX
o Apps where you cannot redirect their storage to a shared disk
o Client applications need to have some kind of retry method –
otherwise they will disconnect before the hand-over to the other
node is complete
• Might be able to get round the lack of cluster awareness of apps with
VBScript & JScript – but this is only supported on Windows.NET
• SQL – Lots of RAM, Fast HD’s, Plenty of CPU
• Disk partition – your standard RAID0, 1 and 5 are supported but you can
now get combinations – RAID5+1 is a stripe set that’s mirrored (6 volumes
or more – excellent FT but lots of overhead), RAID0+1 is a mirror that’s striped
(2 volumes or more – good FT with good r/w)
• Domainlet – nodes are DCs in their own domain – FT on the cluster service
account
• Two NICs – private and public, with the option of using Public should
private become unavailable. Private net for low contention on network
comms between the two nodes… alternatively you might have two private
nets – with the second being an interface to front-end servers – with the
back end cluster node never actually spoken to directly by client devices
• Alternatively – you can use direct SAN comms – using what MS call
“WinSock Direct” – SANs offer direct hardware support which removes
the burden on the OS, and two transfer modes, one for handshaking and
the other for data transfer – again with no burden on the Server OS
• The event logs are clustered so you only have to read one event log not
one per node on the cluster! – grows to 8MB and clears out in FIFO format
• Cluster disks are plug-and-play able
• You can stop chkdsk occurring which can slow reboots
• Network media failure detection part of W2K is picked up on by Cluster
Service
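The active/active capacity rule of thumb in the bullets above (50% per server on two nodes, 75% on four) can be written as a one-line formula. A rough sketch in Python, assuming identical nodes and that the load of a failed node spreads evenly over the survivors. The function name max_safe_utilisation is made up for illustration and is not from the Microsoft paper.

```python
def max_safe_utilisation(nodes, failed_nodes=1):
    """Highest per-node load that still lets the survivors absorb the failed nodes."""
    if nodes <= failed_nodes:
        raise ValueError("need at least one surviving node")
    return (nodes - failed_nodes) / nodes


for nodes in (2, 4, 8):
    usable = max_safe_utilisation(nodes)
    print(f"{nodes}-node cluster: utilise {usable:.0%}, reserve {1 - usable:.0%}")
```

This is the RAID-like effect mentioned above: the reserved fraction is 1/N, so it shrinks as the number of nodes in the cluster goes up.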
NLB
• Strictly not required for the exam - but it is in the same MS white paper –
so in it goes
• NLB for http/media/terminal services/ecommerce sites
• TCP/UDP but GRE is only supported in Windows.NET (Generic Routing
Encapsulation)
• Failover and Fail back – within 10 seconds… Load-balancing is endemic –
there isn’t a “passive” node scenario
• Clients connect to one Virtual IP – behind which are the real IPs
• Local Data on Local Drives – just synched by a central master – generally
static, non-volatile data
• Can have one NLB system with 10 servers in – and direct ftp to server1-5
and http to 6-10
• No special HW is really required as it is an IP driver
• Two NICs like a cluster – one onto the Virtual IP of the cluster and the other a
management IP
• Uses Uni or Multicast broadcast – which it proliferates on to the real server
– does nothing to the non-specified ports – just send them to a physical
server without passing through the virtual stack
• Can work with a single NIC – but limits
o Unicast only from one cluster to outside world – not NLB Cluster to
NLB cluster
o Multi-cast does support cluster-to-cluster comms, but not optimum
for heavy loads
• You tend not to buy RAID infrastructure with the servers as there are N
nodes in the NLB anyway, with the same content
• The real challenge is keeping all these local copies of data in synch and the
same – which also introduces an overhead and possibility of downtime
when the synch is occurring
Chapter 11: Networking
• Two MAC address OUI’s – one for auto and one for manual
• Auto uses UUID and path to the VMX – plus an offset in case the algorithm
generates the same MAC on two different Virtual Machines
• Auto MAC stays the same unless you move the VM to another ESX server
(Vmotion?) or relocate the VMX file
• Sometimes difficult to map the PHYSICAL NICs to the ESX names allocated to them
(vmnic0, vmnic1 and so on)
• There is a command called findnic – which can assist you in working out
which card is which – does a glorified ping test – look at LED’s to see
activity?
• findnic vmnic0 10.2.0.5 10.2.0.4 – the first IP binds the card to that IP, and
the second IP is the remote machine to ping – there is a -f switch to do a “flood”
ping – continuous ping?
• Default – everything is set to autonegotiate speed and duplex settings – if
you find things are slow or things misreport speed – switch to manual
settings – beware if you have installed the NATIVE network drivers to the
VM – you will get 10Mbps from AMD, and 1Gbps from the Native Driver
• You can allow promiscuous mode for all traffic within VM or restrict it by
MAC address
• Apparently this setting is VOLATILE and not retained after reboots
• Possible to allow the Service Console to interact directly with Virtual
Networks within the VMs – requires that you install the vmxnet_console
driver to the Service Console
• Once the driver is added and made aware of the NICs that provide the
Virtual network – you can bring up these interfaces with the command:
ifconfig eth1 up 10.2.0.4
• It is possible to have a single NIC ESX box and share the NIC between the
Service Console and the Virtual Machines
• Does support hardware acceleration features if the physical NICs have them
such as:
o VLAN tag handling
o Checksum Calculations
o TCP Segmentation Offloading
• Bind similar NICs in Bonds – otherwise you might not get all the features
because one adapter doesn’t support them
• To find out NIC information – names and PCI slots and so on you can use:
EXAMPLE:
Example: Web Server Consolidation
Suppose that you are using ESX Server to consolidate eight nearly-identical Web
servers running IIS on Windows 2000. Each Windows 2000 machine is configured
with 512MB of memory. The native memory requirement with eight physical servers
is 8 * 512MB = 4GB.
To consolidate these servers as virtual machines, 24MB is needed for the server
virtualization layer and 192MB is recommended for the service console. Each
virtual machine also requires an additional 54MB of overhead memory. An
additional 6 percent should be added to account for the minimum free memory
level. Assuming no overcommitment and no benefits from memory sharing, the
memory required for virtualizing the workload is
24MB + 192MB + (1.06 * 8 * (512MB + 54MB)) = 5016MB.
The total overhead for virtualization in this case is 920MB.
It may also make sense to overcommit memory. For example, suppose that on
average, two of the eight Web server virtual machines are typically idle and that
each Web server virtual machine requires only 256MB to provide minimally
acceptable service. In this case, the hardware memory size can be reduced safely
by an additional 2 * 256MB = 512MB. In the worst case where all virtual machines
happen to be active at the same time, the system may need to swap some virtual
machine memory to disk.
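The sizing sum in the example above is easy to redo for a different number of VMs. A small Python sketch of the same formula follows. The function and parameter names are my own, and the default figures (24MB, 192MB, 54MB, 6 percent) are simply the ones quoted in the example.

```python
def required_memory_mb(vm_count, vm_memory_mb, vm_overhead_mb=54,
                       virtualization_layer_mb=24, service_console_mb=192,
                       free_memory_factor=1.06):
    """Memory needed with no overcommitment and no page sharing (formula from the example)."""
    per_vm_mb = vm_memory_mb + vm_overhead_mb
    return (virtualization_layer_mb + service_console_mb
            + free_memory_factor * vm_count * per_vm_mb)


native_mb = 8 * 512                      # eight physical servers with 512MB each
total_mb = round(required_memory_mb(8, 512))
overhead_mb = total_mb - native_mb

print(f"Native requirement: {native_mb}MB")
print(f"Virtualised:        {total_mb}MB")
print(f"Overhead:           {overhead_mb}MB")
```

With the figures from the example this reproduces the 5016MB total and the 920MB overhead. The overcommitment saving (2 * 256MB) would just be subtracted from the total.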
Addendum
I passed and failed the exam. Although I got 81%, as an instructor I needed to get
more than 85%. So I did some additional study to bone up on the areas I wasn’t
sure on….
The 100MB of data from the core dump file gets copied to the /ROOT
partition
For those of you attending the Admin II authorised course the instructions
are in the first lab where you install ESX 2.1
• What is the correct use of vmkfstools, when you want to set the
file system mode – Public or shared?
The numbers for the Controller:SCSI Target: Lun: Partition – can start
with 0, but partitions always begin with 1…
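To make the four numbers concrete, here is a short Python sketch that splits a name such as vmhba1:2:0:3 into its parts. The VmhbaPath type and parse_vmhba function are invented for illustration. It only handles the bare adapter:target:LUN:partition form, not names with a file appended after a further colon.

```python
import re
from collections import namedtuple

VmhbaPath = namedtuple("VmhbaPath", "adapter target lun partition")

# vmhba<adapter>:<target>:<LUN>:<partition>
VMHBA_PATTERN = re.compile(r"^vmhba(\d+):(\d+):(\d+):(\d+)$")


def parse_vmhba(name):
    """Split a vmhba name into adapter, SCSI target, LUN and partition numbers."""
    match = VMHBA_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a vmhba name: {name!r}")
    return VmhbaPath(*(int(group) for group in match.groups()))


path = parse_vmhba("vmhba1:2:0:3")
print(path)
```

For vmhba1:2:0:3 that is adapter 1, target 2, LUN 0 and partition 3. The first three count from 0, while per the note above partitions count from 1.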
Yes, every time you shutdown or restart the VM – you get a new VMID –
ESX sees it like any other process
Bit tricky to find this out. The VM kernel log files in /var/log all start with
vmk – and there are three of them:
vmksummary
vmkernel
vmkwarning
However, if you look in the syslog.conf file which controls the logging
process – there is a boot.log and messages file – however, these appear to
contain information which covers the Service Console boot process – not the VM
Kernel boot process.
It does NOT look as if the log files in the /var/log/vmware directory which are
called vmware-serverd.log are controlled by syslog.conf
• A new SAN adapter is added to the server – and one of the VM’s
starts to exhibit boot problems – how would you troubleshoot this?