
CLOUD COMPUTING

(BCS601)
MODULE – 2
Virtual Machines and Virtualization
of Clusters and Data Centers
Introduction
• The reincarnation of virtual machines (VMs) presents a
great opportunity for parallel, cluster, grid, cloud, and
distributed computing.
• Virtualization technology benefits the computer and IT
industries by enabling users to share expensive hardware
resources through multiplexing of VMs on the same set of
hardware hosts.
3.1 Implementation Levels Of
Virtualization
• Virtualization is a computer architecture technology by
which multiple virtual machines (VMs) are multiplexed in
the same hardware machine.
• The idea of VMs can be dated back to the 1960s.
• The purpose of a VM is to enhance resource sharing by
many users and improve computer performance in terms
of resource utilization and application flexibility.
• According to a 2009 Gartner Report, virtualization was the
top strategic technology poised to change the computer
industry.
• With sufficient storage, any computer platform can be
installed inside another host computer, even if the two use
processors with different instruction sets and run distinct
operating systems on the same hardware.
3.1.1 Levels of Virtualization
Implementation
• A traditional computer runs with a host operating system
specially tailored for its hardware architecture, as shown
in Figure 3.1(a).
• After virtualization, different user applications managed by
their own operating systems (guest OS) can run on the
same hardware, independent of the host OS.
• This is often done by adding additional software, called a
virtualization layer as shown in Figure 3.1(b).
• This virtualization layer is known as hypervisor or virtual
machine monitor (VMM).
• The main function of the software layer for virtualization is
to virtualize the physical hardware of a host machine into
virtual resources to be used by the VMs, exclusively.
• The virtualization software creates the abstraction of VMs
by interposing a virtualization layer at various levels of a
computer system.
• Common virtualization layers include the instruction set
architecture (ISA) level, hardware level, operating system
level, library support level, and application level.
3.1.1.1 Instruction Set Architecture Level
• At the Instruction Set Architecture (ISA) level,
virtualization means making one computer system behave
like another by emulating its instructions.
• For example, a MIPS binary code can run on an x86-
based computer using ISA emulation. This helps to run
old software (legacy code) on new hardware.
• The basic method to do this is called code
interpretation, where an interpreter converts each
instruction from the source machine to the target
machine, but it is slow.
• To improve speed, dynamic binary translation is used. It
converts a block of instructions instead of one-by-one,
making the process faster.
• This process requires binary translation and
optimization to make old code work on new hardware.
• Virtual Instruction Set Architecture (V-ISA) is created
by adding a software translation layer that helps in
running different types of machine code on one hardware
platform.
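
To make the interpretation-versus-translation trade-off above concrete, here is a minimal Python sketch that contrasts per-instruction interpretation with translating a whole block once and caching the result. The tiny three-instruction "source ISA", the translate_block helper, and the register names are hypothetical illustrations, not part of any real binary translator.

# Toy contrast between per-instruction interpretation and block translation.
# The mini "source ISA" (MOV/ADD) and all names here are hypothetical.

SOURCE_BLOCK = [            # one basic block of "guest" instructions
    ("MOV", "r0", 5),
    ("MOV", "r1", 7),
    ("ADD", "r0", "r1"),    # r0 <- r0 + r1
]

def interpret(block, regs):
    """Code interpretation: decode and execute one instruction at a time."""
    for op, dst, src in block:
        if op == "MOV":
            regs[dst] = src
        elif op == "ADD":
            regs[dst] = regs[dst] + regs[src]
    return regs

_translation_cache = {}     # block id -> translated (Python) function

def translate_block(block):
    """Dynamic binary translation: convert the whole block once,
    then reuse the translated version on every later execution."""
    body = ["def _translated(regs):"]
    for op, dst, src in block:
        if op == "MOV":
            body.append(f"    regs['{dst}'] = {src!r}")
        elif op == "ADD":
            body.append(f"    regs['{dst}'] += regs['{src}']")
    body.append("    return regs")
    namespace = {}
    exec("\n".join(body), namespace)
    return namespace["_translated"]

def run_translated(block, regs):
    key = id(block)
    if key not in _translation_cache:           # translate only on first use
        _translation_cache[key] = translate_block(block)
    return _translation_cache[key](regs)

print(interpret(SOURCE_BLOCK, {"r0": 0, "r1": 0}))      # {'r0': 12, 'r1': 7}
print(run_translated(SOURCE_BLOCK, {"r0": 0, "r1": 0})) # same result, cached code
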
3.1.1.2 Hardware Abstraction Level
• In hardware-level virtualization, virtualization is done
directly on the physical hardware.
• It creates a virtual hardware environment for virtual
machines (VMs).
• It also manages the physical hardware resources like
CPU, memory, and I/O devices.
• The main purpose is to allow multiple users to share
hardware resources, improving hardware utilization.
• This concept was first introduced in the IBM VM/370
system in the 1960s.
• Nowadays, Xen hypervisor is commonly used to
virtualize x86-based machines to run Linux or other
operating systems.
3.1.1.3 Operating System Level
• In OS-level virtualization, a layer is created between the
operating system (OS) and user applications.
• It uses containers to isolate applications on a single
physical server, making each container act like a separate
server.
• This helps in sharing hardware and software resources
in data centers.
• It is mainly used for virtual hosting, where resources are
shared among multiple users.
• It is also useful for consolidating servers, meaning
moving services from different servers into containers or
VMs on one server to save resources.
3.1.1.4 Library Support Level
• In library support level virtualization, user-level
libraries are used instead of making direct system calls to
the operating system (OS).
• It works by virtualizing the API (Application
Programming Interface), which acts as a bridge
between the application and the system.
• This allows applications from one OS to run on another
OS.
• Example:
• WINE software helps run Windows applications on UNIX
systems.
• vCUDA allows VM applications to use GPU hardware for faster
performance.
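
The sketch below illustrates the API interception and remapping idea in Python: a call written against a foreign, WINE-style file API is intercepted and remapped onto the host's native open(). The function names (win_create_file, make_interceptor) and the path-remapping rule are invented for illustration only.

# Minimal sketch of API call interception and remapping, the core idea behind
# library-level virtualization (e.g., WINE). All names here are hypothetical.
import os

def win_create_file(path, mode="w"):
    """Stand-in for a 'foreign' API the application was written against
    (imagine a Windows-style CreateFile call with no native host support)."""
    raise NotImplementedError("no native implementation on this host")

def make_interceptor(foreign_name):
    """Return a wrapper that remaps the foreign call onto host facilities."""
    def intercepted(path, mode="w"):
        # Remap foreign path conventions (e.g., 'C:\\...') onto the host FS.
        host_path = path.replace("C:\\", "/tmp/").replace("\\", "/")
        os.makedirs(os.path.dirname(host_path), exist_ok=True)
        print(f"[intercept] {foreign_name}({path!r}) -> open({host_path!r})")
        return open(host_path, mode)
    return intercepted

# The virtualization layer replaces the application's API binding.
win_create_file = make_interceptor("CreateFile")

with win_create_file("C:\\demo\\hello.txt") as f:
    f.write("written through the remapped API\n")
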
3.1.1.5 User-Application Level
• In application-level virtualization, an application is
virtualized to run like a virtual machine (VM).
• In a normal OS, applications run as processes, but here
they run as virtualized processes, so it is also called
process-level virtualization.
• This is commonly done using High-Level Language
(HLL) Virtual Machines, like:
• Java Virtual Machine (JVM) – Runs Java programs.
• .NET CLR (Common Language Runtime) – Runs
.NET applications.
• Another form of application-level virtualization is
application isolation or sandboxing, where:
• The application is isolated from the host OS and other
applications.
• This makes it easier to install, distribute, and remove
applications without affecting the system.
• Example:
• LANDesk Application Virtualization allows running
applications as executable files without installation or
system changes.
3.1.1.6 Relative Merits of Different Approaches
• The column headings correspond to four technical merits.
• "Higher Performance" and "Application Flexibility" are self-explanatory.
• "Implementation Complexity" implies the cost to implement that
particular virtualization level.
• "Application Isolation" refers to the effort required to isolate resources
committed to different VMs.
• Each row corresponds to a particular level of virtualization.
3.1.2 VMM Design Requirements and
Providers
• In hardware-level virtualization, a layer called the
Virtual Machine Monitor (VMM) is placed between the
hardware and the operating system (OS).
• The VMM manages hardware resources like CPU,
memory, and storage.
• Whenever a program tries to access the hardware, the
VMM captures and controls the process, making it work
like a traditional OS.
• The CPU can be divided into multiple virtual CPUs,
allowing multiple operating systems (same or
different) to run on the same hardware.
• There are three requirements for a VMM:
• First, a VMM should provide an environment for programs
which is essentially identical to the original machine.
• Second, programs run in this environment should show,
at worst, only minor decreases in speed.
• Third, a VMM should be in complete control of the system
resources.
• Two possible exceptions in terms of differences are
permitted with this requirement:
• Differences caused by the availability of system
resources and differences caused by timing
dependencies.
• Each virtual machine (VM) uses fewer hardware
resources like memory, but when multiple VMs run at the
same time, their combined resource usage is higher than
the actual physical machine's capacity.
• This happens because there is an extra software layer
(VMM) between the hardware and VMs. Multiple VMs
running together affect overall performance.
• However, despite these performance differences, the
VMM still functions like a real machine.
• A VMM should demonstrate efficiency in using the VMs.
• Table 3.2 compares four hypervisors and VMMs that are
in use today.
• Complete control of the resources by a VMM includes the
following aspects:
1) The VMM is responsible for allocating hardware
resources for programs;
2) It is not possible for a program to access any resource
not explicitly allocated to it; and
3) It is possible under certain circumstances for a VMM to
regain control of resources already allocated.
3.1.3 Virtualization Support at the OS Level
Why OS-Level Virtualization?
• In a cloud computing environment, perhaps thousands of
VMs need to be initialized simultaneously.
• Besides slow operation, storing the VM images also
becomes an issue.
• Moreover, full virtualization at the hardware level also has
the disadvantages of slow performance and low density,
and the need for para-virtualization to modify the guest
OS.
• To reduce the performance overhead of hardware-level
virtualization, even hardware modification is needed.
• OS-level virtualization provides a feasible solution for
these hardware-level virtualization issues.
• Operating system virtualization inserts a virtualization layer
inside an operating system to partition a machine’s physical
resources.
• It enables multiple isolated VMs within a single operating
system kernel.
• This kind of VM is often called a virtual execution environment
(VE), Virtual Private System (VPS), or simply container.
• A VE has its own set of processes, file system, user accounts,
network interfaces with IP addresses, routing tables, firewall
rules, and other personal settings.
• Although VEs can be customized for different people, they
share the same operating system kernel.
• Therefore, OS-level virtualization is also called single-OS
image virtualization.
• Figure 3.3 illustrates operating system virtualization from the
point of view of a machine stack.
Advantages of OS Extensions

• The benefits of OS extensions are twofold:
1) VMs at the operating system level have minimal
startup/shutdown costs, low resource requirements, and
high scalability; and
2) for an OS-level VM, it is possible for a VM and its host
environment to synchronize state changes when
necessary.
Disadvantages of OS Extensions

• The main disadvantage of OS extensions is that all the VMs at the
operating system level on a single container must have the same
kind of guest operating system.
• That is, although different OS-level VMs may have
different operating system distributions, they must pertain
to the same operating system family.
• Figure 3.3 illustrates the concept of OS-level
virtualization.
• The virtualization layer is inserted inside the OS to
partition the hardware resources for multiple VMs to run
their applications in multiple virtual environments.
• To implement OS-level virtualization, isolated execution
environments (VMs) should be created based on a single
OS kernel.
• Furthermore, the access requests from a VM need to be
redirected to the VM’s local resource partition on the
physical machine.
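
As a rough illustration of how such a layer redirects requests, the Python sketch below maps each virtual environment's file accesses into its own resource partition on the shared host. The VE table, paths, and ve_open helper are hypothetical, not the mechanism of any particular container system.

# Sketch of the request-redirection idea behind OS-level virtualization:
# each virtual environment (VE) gets its own resource partition, and the
# virtualization layer rewrites access requests to land inside it.
# The VE table and paths below are hypothetical.
import os

VE_ROOTS = {
    "ve1": "/tmp/ve1_root",   # private partition for container ve1
    "ve2": "/tmp/ve2_root",   # private partition for container ve2
}

def ve_open(ve_id, path, mode="w"):
    """Redirect a VE's file request to its local partition on the host."""
    root = VE_ROOTS[ve_id]
    redirected = os.path.join(root, path.lstrip("/"))
    os.makedirs(os.path.dirname(redirected), exist_ok=True)
    print(f"[{ve_id}] request for {path} -> {redirected}")
    return open(redirected, mode)

# Both VEs think they write to /etc/app.conf, but each touches only its own tree.
with ve_open("ve1", "/etc/app.conf") as f:
    f.write("setting = 1\n")
with ve_open("ve2", "/etc/app.conf") as f:
    f.write("setting = 2\n")
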
Virtualization on Linux or Windows Platforms

• By far, most reported OS-level virtualization systems are Linux-based.
• Virtualization support on the Windows-based platform is
still in the research stage.
• The Linux kernel offers an abstraction layer to allow
software processes to work with and operate on
resources without knowing the hardware details.
• New hardware may require a new Linux kernel to support it.
Therefore, different Linux platforms use patched kernels
to provide special support for extended functionality.
• Table 3.3 summarizes several examples of OS level
virtualization tools that have been developed in recent
years.
3.1.4 Middleware Support for Virtualization
• Library-level virtualization is also known as user-level
Application Binary Interface (ABI) or API emulation.
• This type of virtualization can create execution
environments for running alien programs on a platform
rather than creating a VM to run the entire operating
system.
• API call interception and remapping are the key functions
performed.
• This section provides an overview of several library-level
virtualization systems: namely the Windows Application
Binary Interface (WABI), lxrun, WINE, Visual MainWin,
and vCUDA, which are summarized in Table 3.4.
3.2 VIRTUALIZATION STRUCTURES/TOOLS
AND MECHANISMS
• Before virtualization, the operating system manages the
hardware.
• After virtualization, a virtualization layer is inserted between the
hardware and the operating system.
• In such a case, the virtualization layer is responsible for
converting portions of the real hardware into virtual hardware.
• Therefore, different operating systems such as Linux and
Windows can run on the same physical machine,
simultaneously.
• Depending on the position of the virtualization layer, there are
several classes of VM architectures, namely the hypervisor
architecture, para-virtualization, and host-based virtualization.
• The hypervisor is also known as the VMM (Virtual Machine
Monitor). They both perform the same virtualization operations.
3.2.1 Hypervisor and Xen Architecture

• The hypervisor supports hardware-level virtualization (see
Figure 3.1(b)) on bare metal devices like CPU, memory,
disk, and network interfaces.
• The hypervisor software sits directly between the physical
hardware and its OS.
• This virtualization layer is referred to as either the VMM or
the hypervisor.
• The hypervisor provides hypercalls for the guest OSes
and applications. Depending on the functionality, a
hypervisor can assume a micro-kernel architecture like
the Microsoft Hyper-V. Or it can assume a monolithic
hypervisor architecture like the VMware ESX for server
virtualization.
The Xen Architecture
• Xen is an open source hypervisor program developed by
Cambridge University.
• Xen is a micro kernel hypervisor, which separates the policy
from the mechanism.
• The Xen hypervisor implements all the mechanisms, leaving
the policy to be handled by Domain 0, as shown in Figure 3.5.
• Xen does not include any device drivers natively. It just
provides a mechanism by which a guest OS can have direct
access to the physical devices.
• As a result, the size of the Xen hypervisor is kept rather small.
• Xen provides a virtual environment located between the
hardware and the OS.
• A number of vendors are in the process of developing
commercial Xen hypervisors, among them are Citrix XenServer
and Oracle VM.
• The core components of a Xen system are the hypervisor,
kernel, and applications.
• The organization of the three components is important.
• Like other virtualization systems, many guest OSes can run on
top of the hypervisor.
• However, not all guest OSes are created equal, and one in
particular controls the others.
• The guest OS, which has control ability, is called Domain 0,
and the others are called Domain U.
• Domain 0 is a privileged guest OS of Xen.
• It is first loaded when Xen boots without any file system drivers
being available.
• Domain 0 is designed to access hardware directly and manage
devices.
• Therefore, one of the responsibilities of Domain 0 is to allocate
and map hardware resources for the guest domains (the
Domain U domains).
• For example, Xen is based on Linux and its security level
is C2.
• Its management VM is named Domain 0, which has the
privilege to manage other VMs implemented on the same
host.
• If Domain 0 is compromised, the hacker can control the
entire system. So, in the VM system, security policies are
needed to improve the security of Domain 0.
• Domain 0, behaving as a VMM, allows users to create,
copy, save, read, modify, share, migrate, and roll back
VMs as easily as manipulating a file, which flexibly
provides tremendous benefits for users.
• Unfortunately, it also brings a series of security problems
during the software life cycle and data lifetime.
Binary Translation with Full Virtualization

• Depending on implementation technologies, hardware
virtualization can be classified into two categories: full
virtualization and host-based virtualization.
• Full virtualization does not need to modify the guest OS.
• It relies on binary translation to trap and virtualize the
execution of certain sensitive, nonvirtualizable instructions.
• The guest OSes and their applications consist of
noncritical and critical instructions.
• In a host-based system, both a host OS and a guest OS
are used. A virtualization software layer is built between
the host OS and guest OS.
Full Virtualization
• With full virtualization, noncritical instructions run on the
hardware directly while critical instructions are discovered
and replaced with traps into the VMM to be emulated by
software.
• Both the hypervisor and VMM approaches are considered
full virtualization.
Binary Translation of Guest OS Requests Using a VMM
• This approach was implemented by VMware and many
other software companies. As shown in Figure 3.6,
VMware puts the VMM at Ring 0 and the guest OS at
Ring 1.
• The VMM scans the instruction stream and identifies the
privileged, control- and behavior-sensitive instructions.
• When these instructions are identified, they are trapped
into the VMM, which emulates the behavior of these
instructions.
• The method used in this emulation is called binary
translation. Therefore, full virtualization combines binary
translation and direct execution.
• The guest OS is completely decoupled from the
underlying hardware. Consequently, the guest OS is
unaware that it is being virtualized.
• The performance of full virtualization may not be ideal,
because it involves binary translation which is rather time-
consuming.
• In particular, the full virtualization of I/O-intensive
applications is really a big challenge.
• Binary translation employs a code cache to store
translated hot instructions to improve performance, but it
increases the cost of memory usage.
• At the time of this writing, the performance of full
virtualization on the x86 architecture is typically 80
percent to 97 percent that of the host machine.
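
A minimal Python sketch of this scan-and-trap flow is shown below: noncritical instructions are marked for direct execution, sensitive ones are rewritten into traps that a VMM routine emulates, and translated blocks are kept in a code cache. The instruction names, the SENSITIVE set, and the VMM state are hypothetical.

# Sketch of the VMM-side binary translation described above: noncritical
# instructions are passed through for direct execution, while sensitive ones
# are rewritten into traps that the VMM emulates. Translated blocks are kept
# in a code cache. The instruction names and VMM state are hypothetical.

SENSITIVE = {"WRITE_CR3", "HLT"}          # instructions the VMM must emulate
code_cache = {}                           # block name -> translated sequence

def vmm_emulate(instr, vm_state):
    """Emulate a sensitive instruction on behalf of the guest."""
    if instr == "WRITE_CR3":
        vm_state["shadow_page_table"] = "updated by VMM"
    elif instr == "HLT":
        vm_state["halted"] = True

def translate(block_name, instructions):
    """Scan the block once; mark what runs directly and what traps."""
    if block_name not in code_cache:
        code_cache[block_name] = [
            ("trap", i) if i in SENSITIVE else ("direct", i)
            for i in instructions
        ]
    return code_cache[block_name]

def run(block_name, instructions, vm_state):
    for kind, instr in translate(block_name, instructions):
        if kind == "direct":
            pass                            # would execute on the real CPU
        else:
            vmm_emulate(instr, vm_state)    # trap into the VMM
    return vm_state

guest_block = ["ADD", "MOV", "WRITE_CR3", "SUB", "HLT"]
print(run("blk0", guest_block, {}))
# {'shadow_page_table': 'updated by VMM', 'halted': True}
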
Host-Based Virtualization
• An alternative virtual machine (VM) setup is to install a
virtualization layer on top of the host operating system (OS).
• In this setup, the host OS still manages the hardware.
• The guest operating systems run on the virtualization layer,
while some applications can run directly on the host OS.
• This host-based architecture has some advantages:
 Easy Installation: It can be installed without changing the
host OS.
 Simpler Design: The virtual machine software uses the host
OS's device drivers and low-level services, making it easier
to design and deploy.
 Flexible: It works on many different machine setups.
• However, there are drawbacks:
 Lower Performance: Accessing hardware involves four
layers, slowing things down.
 Binary Translation Needed: If the guest OS uses a
different instruction set than the hardware, the system
must translate the instructions, further reducing
performance.
• Overall, while this approach is flexible, it often runs too
slowly to be practical.
Para-Virtualization with Compiler Support

• Para-virtualization needs to modify the guest operating systems.
• A para-virtualized VM provides special APIs requiring
substantial OS modifications in user applications.
• Performance degradation is a critical issue of a virtualized
system.
• No one wants to use a VM if it is much slower than using
a physical machine.
• The virtualization layer can be inserted at different
positions in a machine software stack.
• However, para-virtualization attempts to reduce the
virtualization overhead, and thus improve performance by
modifying only the guest OS kernel.
• Figure 3.7 illustrates the concept of a para-virtualized VM
architecture.
• The guest operating systems are para-virtualized.
• They are assisted by an intelligent compiler to replace the
nonvirtualizable OS instructions by hypercalls as
illustrated in Figure 3.8.
• The traditional x86 processor offers four instruction
execution rings: Rings 0, 1, 2, and 3.
• The lower the ring number, the higher the privilege of
instruction being executed.
• The OS is responsible for managing the hardware and the
privileged instructions to execute at Ring 0, while user-
level applications run at Ring 3.
• The best example of para-virtualization is the KVM to be
described below.
Para-Virtualization Architecture

• When the x86 processor is virtualized, a virtualization
layer is inserted between the hardware and the OS.
• According to the x86 ring definition, the virtualization layer
should also be installed at Ring 0.
• Different instructions at Ring 0 may cause some
problems.
• In Figure 3.8, we show that para-virtualization replaces
nonvirtualizable instructions with hypercalls that
communicate directly with the hypervisor or VMM.
• However, when the guest OS kernel is modified for
virtualization, it can no longer run on the hardware
directly.
• Although para-virtualization reduces the overhead, it has
incurred other problems.
• First, its compatibility and portability may be in doubt,
because it must support the unmodified OS as well.
• Second, the cost of maintaining para-virtualized OSes is
high, because they may require deep OS kernel
modifications.
• Finally, the performance advantage of para-virtualization
varies greatly due to workload variations.
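
The Python sketch below mimics the compiler-assisted rewrite: nonvirtualizable guest-kernel instructions are replaced by hypercalls that the hypervisor services. The instruction-to-hypercall table and the hypervisor stub are invented for illustration and do not correspond to real Xen or KVM hypercall names.

# Sketch of the compiler-assisted rewrite used in para-virtualization:
# nonvirtualizable guest-kernel instructions are replaced at build time with
# hypercalls into the hypervisor. Instruction and hypercall names are
# hypothetical illustrations.

NONVIRTUALIZABLE = {          # guest instruction -> hypercall that replaces it
    "SGDT": "HYPERCALL_get_gdt",
    "SMSW": "HYPERCALL_get_machine_status",
}

def paravirtualize(guest_kernel_code):
    """Compiler pass: substitute hypercalls for nonvirtualizable instructions."""
    return [NONVIRTUALIZABLE.get(instr, instr) for instr in guest_kernel_code]

def hypervisor(hypercall):
    """The hypervisor services hypercalls on behalf of the modified guest."""
    return f"hypervisor handled {hypercall}"

original = ["MOV", "SGDT", "ADD", "SMSW"]
modified = paravirtualize(original)
print(modified)   # the guest kernel now calls the hypervisor instead

for instr in modified:
    if instr.startswith("HYPERCALL_"):
        print(hypervisor(instr))
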
KVM (Kernel-Based VM)

• This is a Linux para-virtualization system, a part of the
Linux version 2.6.20 kernel.
• Memory management and scheduling activities are
carried out by the existing Linux kernel.
• The KVM does the rest, which makes it simpler than the
hypervisor that controls the entire machine.
• KVM is a hardware-assisted para-virtualization tool, which
improves performance and supports unmodified guest
OSes such as Windows, Linux, Solaris, and other UNIX
variants.
3.3 VIRTUALIZATION OF CPU, MEMORY,
AND I/O DEVICES
• To support virtualization, processors such as the x86
employ a special running mode and instructions, known
as hardware-assisted virtualization.
• In this way, the VMM and guest OS run in different modes
and all sensitive instructions of the guest OS and its
applications are trapped in the VMM.
• To save processor states, mode switching is completed by
hardware.
• For the x86 architecture, Intel and AMD have proprietary
technologies for hardware-assisted virtualization.
Hardware Support for Virtualization

• Modern operating systems and processors allow multiple
processes to run at the same time.
• To prevent system crashes, processors have protection
mechanisms.
• They work in two modes: user mode (for normal
applications) and supervisor mode (for critical system
tasks). Only privileged instructions can run in
supervisor mode, while normal instructions are called
unprivileged.
• In virtual environments, running operating systems and
applications is more complex because of extra layers.
Hardware support helps manage this.
• For example, VMware Workstation is a popular
virtualization software for x86 and x86-64 computers. It
lets users create and run multiple virtual machines (VMs)
alongside the host operating system. VMware uses the
host-based virtualization approach.
• Xen is another type of virtualization tool known as a
hypervisor. It runs on systems like IA-32, x86-64,
Itanium, and PowerPC 970. Xen modifies the Linux kernel
to act as the most powerful layer (hypervisor), allowing
multiple guest operating systems to run on top of it.
• KVM (Kernel-based Virtual Machine) is built into the
Linux kernel. It supports hardware-assisted
virtualization (using Intel VT-x or AMD-V) and
paravirtualization (using the VirtIO framework). VirtIO
provides virtual devices like Ethernet, disk I/O, memory
management (balloon device), and graphics (VGA using
VMware drivers).
CPU Virtualization
• A Virtual Machine (VM) is like a copy of a computer
system. Most VM instructions run directly on the physical
CPU for better performance. However, some important
instructions must be carefully managed to ensure the
system runs correctly. These important instructions are
divided into three types:
 Privileged instructions: Only work in supervisor mode
(high-level access).
 Control-sensitive instructions: Change system
settings.
 Behavior-sensitive instructions: Behave differently
based on the system state.
• Privileged instructions trigger a trap if executed outside
supervisor mode.
• For CPU virtualization to work, the processor must allow
the VM’s normal instructions to run directly, while the
Virtual Machine Monitor (VMM) handles the critical
ones.
• When the VM runs privileged instructions, the VMM steps
in to manage them. This ensures the system stays stable
and correct.
• Not all CPUs support virtualization. For example, RISC
CPUs are easier to virtualize because most of their
sensitive instructions are privileged.
• However, x86 CPUs were not originally designed for
virtualization. Some important instructions like SGDT and
SMSW are not privileged, so they can’t be trapped by the
VMM, making x86 harder to virtualize.
• In normal UNIX-like systems, a program uses the 80h
interrupt to ask the OS for help (system call).
• In virtualization (like with Xen), the guest OS triggers this
interrupt. Instead of going straight to the real OS, the call
first goes to the hypervisor.
• The hypervisor then decides what to do. This setup allows
the guest OS to run as if it were on real hardware, even
though it’s virtualized.
Memory Virtualization
• Virtual memory virtualization is similar to the virtual
memory support provided by modern operating systems.
• In a traditional execution environment, the operating
system maintains mappings of virtual memory to machine
memory using page tables, which is a one-stage mapping
from virtual memory to machine memory.
• All modern x86 CPUs include a memory management unit
(MMU) and a translation lookaside buffer (TLB) to
optimize virtual memory performance.
• However, in a virtual execution environment, virtual
memory virtualization involves sharing the physical
system memory in RAM and dynamically allocating it to
the physical memory of the VMs.
• That means a two-stage mapping process should be
maintained by the guest OS and the VMM, respectively:
virtual memory to physical memory and physical memory
to machine memory.
• Furthermore, MMU virtualization should be supported,
which is transparent to the guest OS.
• The guest OS continues to control the mapping of virtual
addresses to the physical memory addresses of VMs.
• But the guest OS cannot directly access the actual
machine memory.
• The VMM is responsible for mapping the guest physical
memory to the actual machine memory. Figure 3.12
shows the two-level memory mapping procedure
• Since each page table of the guest OSes has a separate
page table in the VMM corresponding to it, the VMM page
table is called the shadow page table.
• Nested page tables add another layer of indirection to
virtual memory.
• The MMU already handles virtual-to-physical translations
as defined by the OS.
• Then the physical memory addresses are translated to
machine addresses using another set of page tables
defined by the hypervisor.
• Since modern operating systems maintain a set of page
tables for every process, the shadow page tables will get
flooded.
• Consequently, the performance overhead and cost of
memory will be very high.
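
The Python sketch below models the two-stage mapping: the guest page table maps virtual pages to guest-physical pages, the VMM page table maps those to machine pages, and a shadow table caches the composed virtual-to-machine mapping. All page numbers are made up for illustration.

# Sketch of the two-stage mapping described above: the guest OS maps virtual
# pages to "guest physical" pages, and the VMM maps those to machine pages.
# The shadow table caches the composed virtual-to-machine mapping so the MMU
# can use it directly. All page numbers are hypothetical.

guest_page_table = {0: 7, 1: 3, 2: 9}     # guest virtual page -> guest physical page
vmm_page_table   = {7: 42, 3: 13, 9: 77}  # guest physical page -> machine page

shadow_page_table = {}                    # composed mapping maintained by the VMM

def translate(virtual_page):
    """Resolve a guest virtual page to a machine page, filling the shadow table."""
    if virtual_page not in shadow_page_table:
        guest_physical = guest_page_table[virtual_page]       # stage 1: guest OS
        machine = vmm_page_table[guest_physical]              # stage 2: VMM
        shadow_page_table[virtual_page] = machine             # cache the composition
    return shadow_page_table[virtual_page]

print([translate(p) for p in (0, 1, 2)])   # [42, 13, 77]
print(shadow_page_table)                   # one shadow entry per guest mapping
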
I/O Virtualization
• I/O virtualization handles how virtual machines (VMs) send and
receive data through shared physical hardware devices. There
are three main methods to achieve this: full device emulation,
para-virtualization, and direct I/O.

Full Device Emulation:


• The virtual machine monitor (VMM) creates a software version
of a real hardware device.
• The guest operating system (OS) thinks it’s interacting with real
hardware, but the VMM is handling everything.
• Multiple VMs can share one physical device.
• Downside: It is slow because software emulation is much
slower than real hardware.
The full device emulation approach is shown in Figure 3.14.
Para-Virtualization (Split Driver Model):
• Used by platforms like Xen.
• Has two drivers: a frontend driver in the VM (Domain U)
and a backend driver in the main OS (Domain 0).
• Both communicate using shared memory.
• The backend driver manages the real device and shares it
between VMs.
• Advantage: Faster than full emulation.
• Downside: Higher CPU usage.

Direct I/O Virtualization:


• VMs access physical devices directly, leading to almost
native performance with low CPU overhead.
• Mostly used in large systems like mainframes.
• Challenges: Risky on regular hardware since devices
may be left in unpredictable states during migration,
causing crashes or errors.
• Solution: Hardware help is needed, like Intel VT-d, which
remaps device memory access and interrupts safely.

Another method is Self-Virtualized I/O (SV-IO):


• Uses the power of multi-core processors to handle I/O
virtualization.
• Creates virtual devices (VIFs) that act like real hardware
for the VMs (e.g., virtual networks, disks, cameras).
• Each VIF has:
• A unique ID.
• Two message queues: one for sending data and one for receiving
data.
• The guest OS communicates with VIFs through special
device drivers.
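
As a rough model of an SV-IO virtual interface, the Python sketch below gives each VIF a unique ID and two message queues, with the guest enqueuing requests and a polling routine (standing in for the dedicated I/O core) posting completions. The class and message formats are hypothetical.

# Sketch of the SV-IO virtual interface (VIF) idea described above: each VIF
# has a unique ID plus one queue for outgoing and one for incoming messages,
# and the guest talks to it through a thin device-driver-like wrapper.
# The class and message formats are hypothetical.
from collections import deque

class VIF:
    def __init__(self, vif_id):
        self.vif_id = vif_id
        self.outgoing = deque()   # guest -> device messages
        self.incoming = deque()   # device -> guest messages

    def guest_send(self, payload):
        """Guest-side driver enqueues a request for the I/O core to process."""
        self.outgoing.append(payload)

    def device_poll(self):
        """The dedicated I/O core drains requests and posts completions."""
        while self.outgoing:
            request = self.outgoing.popleft()
            self.incoming.append(f"completed: {request}")

    def guest_receive(self):
        return self.incoming.popleft() if self.incoming else None

nic = VIF(vif_id=1)               # a virtual network interface for one VM
nic.guest_send("tx packet #1")
nic.guest_send("tx packet #2")
nic.device_poll()                 # would run on a core dedicated to I/O
print(nic.guest_receive())        # completed: tx packet #1
print(nic.guest_receive())        # completed: tx packet #2
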
3.4 VIRTUAL CLUSTERS AND RESOURCE
MANAGEMENT
• A physical cluster is a collection of servers (physical
machines) interconnected by a physical network such as
a LAN.
• There are three critical design issues of virtual clusters: live
migration of VMs, memory and file migrations, and
dynamic deployment of virtual clusters.
• When a traditional VM is initialized, the administrator
needs to manually write configuration information or
specify the configuration sources.
• Amazon’s Elastic Compute Cloud (EC2) is a good
example of a web service that provides elastic computing
power in a cloud. EC2 permits customers to create VMs
and to manage user accounts over the time of their use.
Physical versus Virtual Clusters

• Virtual clusters are built with VMs installed at distributed
servers from one or more physical clusters.
• The VMs in a virtual cluster are interconnected logically
by a virtual network across several physical networks.
• Figure 3.18 illustrates the concepts of virtual clusters and
physical clusters.
• Each virtual cluster is formed with physical machines or a
VM hosted by multiple physical clusters.
• The virtual cluster boundaries are shown as distinct
boundaries.
• The provisioning of VMs to a virtual cluster is done dynamically
to have the following interesting properties:
1. The virtual cluster nodes can be either physical or virtual
machines. Multiple VMs running with different OSes can be
deployed on the same physical node.
2. A VM runs with a guest OS, which is often different from the
host OS, that manages the resources in the physical
machine, where the VM is implemented.
3. The purpose of using VMs is to consolidate multiple
functionalities on the same server. This will greatly enhance
server utilization and application flexibility.
4. VMs can be colonized (replicated) in multiple servers for the
purpose of promoting distributed parallelism, fault tolerance,
and disaster recovery.
5. The size (number of nodes) of a virtual cluster can grow or
shrink dynamically, similar to the way an overlay network
varies in size in a peer-to-peer (P2P) network.
6. The failure of any physical nodes may disable some
VMs installed on the failing nodes. But the failure of
VMs will not pull down the host system.

 The different node colors in Figure 3.18 refer to different
virtual clusters. In a virtual cluster system, it is quite
important to store the large number of VM images efficiently.
 Figure 3.19 shows the concept of a virtual cluster based
on application partitioning or customization.
 The different colors in the figure represent the nodes in
different virtual clusters.
 As a large number of VM images might be present, the
most important thing is to determine how to store those
images in the system efficiently.
• There are common installations for most users or
applications, such as operating systems or user-level
programming libraries.
• These software packages can be preinstalled as
templates (called template VMs).
• With these templates, users can build their own software
stacks.
• New OS instances can be copied from the template VM.
• User-specific components such as programming libraries
and applications can be installed to those instances.
Fast Deployment and Effective Scheduling

• The system should have the capability of fast deployment.


• Here, deployment means two things: to construct and
distribute software stacks (OS, libraries, applications) to a
physical node inside clusters as fast as possible, and to
quickly switch runtime environments from one user’s
virtual cluster to another user’s virtual cluster.
• If one user finishes using his system, the corresponding
virtual cluster should shut down or suspend quickly to
save the resources to run other VMs for other users.
Live VM Migration Steps and Performance Effects

• In a virtual cluster, a mix of physical (host) machines and
virtual machines (VMs) work together.
• Normally, tasks run on physical machines.
• If a VM fails, another VM with the same guest OS can
take over its role on a different physical node.
• This adds flexibility compared to traditional physical-to-
physical failover.
• However, if the host machine running a VM crashes, the
VM also stops.
• This problem can be reduced by live migration, where a
VM is moved from one physical machine to another while
it’s still running.
• The VM’s state is copied from storage to the new host.
• There are four ways to manage virtual clusters:
1. Guest-based Manager:
• The cluster manager runs inside a VM.
• Example: openMosix (Linux cluster) and Sun’s Oasis (Solaris
cluster on VMware).
2. Host-based Manager:
• The cluster manager runs on the physical host and controls VMs.
• Example: VMware HA, which restarts VMs if a failure happens.
3. Independent Managers:
• Separate managers on both the host and guest systems.
• This method is complex and harder to manage.
4. Integrated Manager:
• A smart manager that understands both virtual and physical
resources.
• Best option when combined with live migration for flexibility and
efficiency.
• Live migration allows moving VMs from one physical
machine to another with minimal interruption. If a failure
occurs, one VM can easily replace another. This approach
is useful in cloud computing, computational grids, and
high-performance computing (HPC).
• The biggest advantage of virtual clusters is that they offer
dynamic resources—machines can be added or
replaced as needed.
• When live migration is used, three things must be
balanced:
 Minimal downtime
 Low network usage
 Reasonable migration time
• It’s also important that migration does not slow down other
services running on the same host.
A VM can be in four different states:
• Inactive: The VM is not running.
• Active: The VM is running and doing tasks.
• Paused: The VM is loaded but not processing any task.
• Suspended: The VM is stopped and its state is saved to
disk.
• As shown in Figure 3.20, live migration of a VM consists
of the following six steps:
Steps 0 & 1: Start Migration
• Migration preparation begins.
• The system decides which VM to move and where to
move it (the destination host).
• This is usually automatic (like for load balancing or saving
energy), but users can also start it manually.
Step 2: Transfer VM Memory
• The VM’s running state is stored in its memory, so the
memory is sent to the new host.
• First, all memory is copied.
• Then, the system keeps copying only the memory that
changed (called "dirty memory") until very little is left.
• This process happens while the VM is still running, so
service is barely interrupted.
Step 3: Suspend the VM and Copy the Final Data
• The VM is briefly paused to send the last bit of memory
and other data like CPU state and network connections.
• This is the only time users might notice the service is
unavailable—called "downtime"—and it’s kept as short
as possible.
Steps 4 & 5: Commit and activate the new host.
• The new host loads the VM’s data and restarts it.
• Network connections switch to the new VM.
• The old VM is deleted from the original host.
• The VM service continues from the new host.
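
The Python sketch below simulates the iterative pre-copy of Steps 2 and 3: all memory is sent once, then only pages dirtied in the meantime are re-sent until the remainder is small enough for a brief stop-and-copy. The 30 percent dirty-rate model and the thresholds are purely illustrative.

# Sketch of the iterative pre-copy loop behind Steps 2 and 3 above: all memory
# is copied once, then only the pages dirtied in the meantime are re-sent,
# round after round, until a short stop-and-copy finishes the job.
# The dirty-rate model (about 30% per round) is purely illustrative.
import random

def live_migrate(total_pages=1000, stop_threshold=20, max_rounds=10):
    random.seed(1)
    to_send = set(range(total_pages))             # first round: all memory
    for round_no in range(max_rounds):
        if len(to_send) <= stop_threshold:
            break
        print(f"round {round_no}: sending {len(to_send)} pages while the VM runs")
        # While this round was being transferred, the still-running VM dirtied
        # roughly 30% as many pages again; those must be re-sent.
        to_send = {random.randrange(total_pages)
                   for _ in range(int(len(to_send) * 0.3))}
    # Downtime window: pause the VM, ship the last dirty pages and CPU state.
    print(f"stop-and-copy: pausing VM, sending final {len(to_send)} pages + CPU/network state")
    print("destination host activates the VM; the source copy is discarded")

live_migrate()
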
Migration of Memory, Files, and Network
Resources
Memory Migration
• One of the most important parts of VM (Virtual Machine)
migration is moving the memory from one physical server
to another. There are different ways to do this, but most
follow similar basic ideas. The method chosen often
depends on the type of applications or workloads running
inside the VM.
• In today’s systems, the memory being moved can range
from hundreds of megabytes to a few gigabytes, so it
needs to be done efficiently.
• The Internet Suspend-Resume (ISR) technique helps with
this by using a concept called temporal locality. This
means that when a VM is suspended and then resumed,
most of its memory content stays the same because not
much has changed in that time.
• In ISR, each file is split into smaller parts and organized
like a tree. Both the suspended and resumed VM have
copies of this tree. This design helps save time and
resources because only the parts of the files that have
changed need to be sent during migration.
• However, ISR is mostly used when it’s okay for the VM to
stop running during migration. Because of this, the
downtime (when the VM is not working) is higher
compared to other methods that allow the VM to stay live
while migrating.
File System Migration

• To successfully migrate a VM, the system must give the VM a
consistent view of its file system, no matter which machine it runs on.
• One simple way to do this is to give each VM a virtual disk
and move that disk along with the VM. However, since
disks are large nowadays, sending the entire disk over the
network isn’t practical.
• Another method is to use a global file system that is
shared across all machines.
• This way, the VM doesn’t need to move files because
everything is already accessible over the network.
• In the Internet Suspend-Resume (ISR) method, a
distributed file system helps move the VM’s suspended
state. However, the VM doesn’t directly use this
distributed file system.
• Instead, the Virtual Machine Monitor (VMM) uses its local
file system.
• When suspending the VM, important files are copied out
of the local file system. When resuming, the needed files
are copied back in.
• There’s also a method called smart copying, where the
VMM uses spatial locality. Since people often move
between the same places (like home and office), the
system only needs to send the differences between the
files at those locations. This reduces the amount of data
that has to be transferred.
Network Migration

• When a VM is moved to another machine, it should still keep all
its active network connections working without needing help from
the old host or extra tools for redirection.
• To do this, each VM is given a virtual IP address that
others use to find and connect to it. This virtual IP is
different from the physical machine’s IP.
• The VM can also have its own virtual MAC address. The
Virtual Machine Monitor (VMM) keeps track of which
virtual IP and MAC belong to which VM.
• When a VM moves, it takes its network settings and
connection states with it.
• If the source and destination machines are on the same
local network (LAN), the new machine can send an ARP
(Address Resolution Protocol) message to inform other
devices that the VM’s IP address is now at a new location.
• This helps update the network so that future data is sent
to the correct machine. A few packets might be lost during
the move, but this usually isn’t a big problem.
• Another way is for the VM to keep using its original MAC
address, and the network switch automatically detects
that the VM has moved to a different port.
Dynamic Deployment of Virtual Clusters
• Table 3.5 summarizes four virtual cluster research
projects.
• We briefly introduce them here just to identify their design
objectives and reported results.
• The Cellular Disco at Stanford is a virtual cluster built in a
shared-memory multiprocessor system.
• The INRIA virtual cluster was built to test parallel
algorithm performance.
• The COD and VIOLIN clusters are studied in forthcoming
examples.
The Cluster-on-Demand (COD) Project at Duke
University
The VIOLIN Project at Purdue University
• The Purdue VIOLIN Project applies live VM migration to
reconfigure a virtual cluster environment.
• Its purpose is to achieve better resource utilization in
executing multiple cluster jobs on multiple cluster
domains.
• The project leverages the maturity of VM migration and
environment adaptation technology.
• The approach is to enable mutually isolated virtual
environments for executing parallel applications on top of
a shared physical infrastructure consisting of multiple
domains.
• Figure 3.25 illustrates the idea with five concurrent virtual
environments, labeled as VIOLIN 1–5, sharing two
physical clusters.
• The squares of various shadings represent the VMs deployed
in the physical server nodes.
• The major contribution by the Purdue group is to achieve
autonomic adaptation of the virtual computation environments
as active, integrated entities.
• A virtual execution environment is able to relocate itself across
the infrastructure, and can scale its share of infrastructural
resources.
• The adaptation is transparent to both users of virtual
environments and administrators of the infrastructure.
• The adaptation overhead is maintained at 20 sec out of 1,200
sec in solving a large NEMO3D problem of 1 million particles.
• The message being conveyed here is that the virtual
environment adaptation can enhance resource utilization
significantly at the expense of less than 1 percent of an
increase in total execution time.
3.5 VIRTUALIZATION FOR DATA-CENTER
AUTOMATION
• Data centers have grown rapidly in recent years, and all
major IT companies are pouring their resources into
building new data centers.
• In addition, Google, Yahoo!, Amazon, Microsoft, HP,
Apple, and IBM are all in the game.
• All these companies have invested billions of dollars in
data center construction and automation.
• Data-center automation means that huge volumes of
hardware, software, and database resources in these
data centers can be allocated dynamically to millions of
Internet users simultaneously, with guaranteed QoS and
cost-effectiveness.
Server Consolidation in Data Centers

• In data centers, many different types of workloads run on
servers at different times. These workloads can be
grouped into two types: chatty workloads and non-
interactive workloads.
• Chatty workloads have periods of high activity followed
by quiet periods. For example, a video streaming service
is very busy at night but not so much during the day.
• Non-interactive workloads don’t need user interaction to
keep running, like high-performance computing tasks.
• Since the resource needs of these workloads change a lot
over time, data centers often assign enough resources to
handle the maximum load (peak demand), even if most of
the time those resources aren’t fully used.
• This leads to many servers being underutilized, wasting
hardware, space, power, and increasing management costs.
• Server consolidation helps solve this problem by reducing the
number of physical servers and using resources more
efficiently.
• Among various consolidation methods, virtualization-based
server consolidation is the most effective.
• It allows better use of CPU, memory, and network resources by
dividing a physical server into multiple virtual machines (VMs),
each getting just the resources it needs instead of a whole
server.
• However, while virtualization improves resource usage, it also
makes managing resources more complex.
• The challenge for data centers is balancing better resource use
with ensuring good performance and quality of service (QoS).
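
To illustrate the consolidation idea, the Python sketch below packs VMs onto as few hosts as their CPU demands allow using a simple first-fit-decreasing placement. The demand figures and the 100 percent host capacity are hypothetical, and real consolidation must also respect memory, I/O, and QoS constraints.

# Sketch of virtualization-based server consolidation: instead of one
# underutilized server per workload, VMs are packed onto as few hosts as
# demand allows (first-fit decreasing here). All numbers are illustrative.

def consolidate(vm_demands, host_capacity=100):
    """Place each VM (CPU % demand) on the first host with room, opening
    new hosts only when needed. Returns the per-host placements."""
    hosts = []                                          # list of [used, [vms]]
    for vm, demand in sorted(vm_demands.items(), key=lambda kv: -kv[1]):
        for host in hosts:
            if host[0] + demand <= host_capacity:
                host[0] += demand
                host[1].append(vm)
                break
        else:
            hosts.append([demand, [vm]])
    return hosts

demands = {"web": 30, "db": 45, "batch": 25, "cache": 20, "logs": 10}
for i, (used, vms) in enumerate(consolidate(demands)):
    print(f"host {i}: {used}% CPU used by {vms}")
# Five one-VM servers collapse onto two hosts.
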
Virtual Storage Management

• Virtual storage management has a different meaning in system
virtualization compared to traditional storage.
Earlier, it meant combining and dividing large physical
disks for physical machines. In system virtualization, it
refers to managing storage for virtual machines (VMs)
and their operating systems.
• There are two types of data in this environment:
 VM images (which are specific to virtualization)
 Application data (which is like data in normal operating
systems)
• Two key goals in virtualization are encapsulation
(wrapping the OS and apps inside VMs) and isolation
(keeping VMs separate).
• Hardware like CPUs and chipsets have improved to
support these goals, but storage systems haven’t kept up,
creating a bottleneck for VM deployments.
• In virtualization, a layer is added between the hardware
and the OS. This makes storage management harder
because:
 Each guest OS thinks it’s working with a real hard disk,
but it isn’t.
 Multiple VMs try to access the same physical storage at
the same time, causing competition.
• This makes the Virtual Machine Monitor (VMM)'s storage
management much more complex than in traditional
systems. Also, VMs use basic storage operations that are
not flexible, making tasks like moving volumes or saving
disk states difficult and sometimes impossible.
• With thousands of VMs in data centers, VM images
consume huge amounts of storage.
• Researchers are working on making storage management
easier, improving performance, and reducing storage
space used by VM images.
• One solution is Parallax, a distributed storage system
designed for virtual environments.
• Another solution is Content Addressable Storage
(CAS), which reduces the size of VM images.
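
The Python sketch below shows the CAS principle: VM image blocks are stored under a hash of their content, so blocks shared by many images (such as a common OS template) are kept only once. The block size and image contents are invented for illustration.

# Sketch of the content-addressable storage (CAS) idea mentioned above: VM
# image blocks are stored under the hash of their content, so blocks shared
# by many images (e.g., a common OS template) are kept only once.
# The block size and image contents are hypothetical.
import hashlib

block_store = {}                       # sha256 digest -> block bytes

def store_image(image_bytes, block_size=16):
    """Split an image into blocks and return the list of block digests."""
    recipe = []
    for i in range(0, len(image_bytes), block_size):
        block = image_bytes[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        block_store.setdefault(digest, block)   # identical blocks stored once
        recipe.append(digest)
    return recipe

base = b"LINUX-KERNEL----" * 4 + b"COMMON-LIBS-----" * 4
vm1 = store_image(base + b"APP-A-DATA------")
vm2 = store_image(base + b"APP-B-DATA------")

total_logical = 2 * (len(base) + 16)
total_stored = sum(len(b) for b in block_store.values())
print(f"logical bytes: {total_logical}, stored bytes: {total_stored}")
# The shared base blocks are deduplicated across the two VM images.
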
Example 3.11 Parallax Providing Virtual Disks to Client VMs
from a Large Common Shared Physical Disk

For the theory, refer to the textbook.


Cloud OS for Virtualized Data Centers

• Data centers must be virtualized to serve as cloud providers.
• Table 3.6 summarizes four virtual infrastructure (VI)
managers and OSes.
• These VI managers and OSes are specially tailored for
virtualizing data centers which often own a large number
of servers in clusters.
• Nimbus, Eucalyptus, and OpenNebula are all open source
software available to the general public.
• Only vSphere 4 is a proprietary OS for cloud resource
virtualization and management over data centers.
Example 3.12 Eucalyptus for Virtual Networking of
Private Cloud

For the theory, refer to the textbook.


Example 3.13 VMware vSphere 4 as a Commercial
Cloud OS

For the theory, refer to the textbook.
Trust Management in Virtualized Data Centers

• A VMM changes the computer architecture. It provides a layer of
software between the operating systems and system hardware to
create one or more VMs on a single physical platform.
• A VM entirely encapsulates the state of the guest operating
system running inside it.
• Encapsulated machine state can be copied and shared over
the network and removed like a normal file, which proposes a
challenge to VM security.
• In general, a VMM can provide secure isolation and a VM
accesses hardware resources through the control of the VMM,
so the VMM is the base of the security of a virtual system.
Normally, one VM is taken as a management VM to have some
privileges such as creating, suspending, resuming, or deleting
a VM.
VM-Based Intrusion Detection

• Intrusions happen when someone gains unauthorized access to a
computer, either from the network or locally.
Intrusion Detection Systems (IDS) are used to detect
these attacks. IDS works based on how intrusion actions
behave and is built into the operating system.
• There are two main types of IDS:
• Host-based IDS (HIDS): Runs directly on the computer
being monitored. But if the system is attacked, the HIDS
itself could also be at risk.
• Network-based IDS (NIDS): Monitors network traffic but
may not detect fake or disguised actions.
• In virtualization-based IDS, virtual machines (VMs) are
isolated from each other even if they share the same
hardware. If one VM is attacked, it doesn’t affect others —
similar to how NIDS works.
• Additionally, the Virtual Machine Monitor (VMM)
watches and controls hardware and software access
requests, preventing fake actions and offering protection
like HIDS.
• There are two ways to implement VM-based IDS:
• Run the IDS as a separate process in each VM or in a
high-privileged VM on top of the VMM.
• Integrate the IDS directly into the VMM, giving it full
access to hardware and system resources.
• The VM-based IDS includes:
• A policy engine and policy module to monitor events in
VMs.
• Tools like PTrace to track system actions based on
security policies.
• It’s hard to catch and stop every intrusion immediately, so
analyzing what happened after an attack is important.
Systems usually use logs to study attacks, but logs can
be changed or deleted by attackers.
Example 3.14 EMC Establishment of Trusted Zones for
Protection of Virtual Clusters Provided to Multiple Tenants

For the theory, refer to the textbook.
