Virtualization 02
Motivation
Three fundamental abstractions are necessary to describe the
operation of a computing system:
(1) interpreters/processors, (2) memory, (3) communication
links
As the scale of a system and the number of its users grow, it
becomes very challenging to manage its resources
Resource management issues:
provisioning for peak demand, which leads to overprovisioning
heterogeneity of hardware and software
machine failures
Virtualization is a basic enabler of Cloud Computing; it
simplifies the management of physical resources for
the three abstractions
Motivation (cont’d)
Layering and Interfaces
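[Figure: layers of a computer system, top to bottom: Applications,
Libraries, Operating System, Hardware. Interfaces between the layers:
API, ABI (system calls and user ISA), ISA (system ISA and user ISA).
The figure labels three interaction paths A1, A2, and A3.]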
Code portability
HLL Language Translations
[Figure: translation of high-level language (HLL) code. Labels: HLL code,
intermediate code, portable code, loader, VM compiler/interpreter (shown twice).]
Virtual Machine Monitor (VMM / Hypervisor)
A virtual machine monitor (VMM/hypervisor) partitions
the resources of a computer system into one or more virtual
machines (VMs). It allows several operating systems to run
concurrently on a single hardware platform
A VM is an execution environment that runs an OS
VM – an isolated environment that appears to be a whole
computer, but actually only has access to a portion of the
computer resources
A VMM allows:
Multiple services to share the same platform
Live migration - the movement of a server from one platform to another
System modification while maintaining backward compatibility with the original system
A VMM also enforces isolation among the systems, and thus security
A guest operating system is an OS that runs in a VM under the control of a VMM
VMM Virtualizes the CPU and the Memory
A VMM (also hypervisor):
Traps the privileged instructions executed by a guest OS
and enforces the correctness and safety of the operation
Traps interrupts and dispatches them to the individual guest
operating systems
Controls the virtual memory management
Maintains a shadow page table for each guest OS and
replicates any modification made by the guest OS in its own
shadow page table. This shadow page table points to the
actual page frame and it is used by the Memory
Management Unit (MMU) for dynamic address translation.
Monitors the system performance and takes corrective
actions to avoid performance degradation. For example,
the VMM may swap out a VM to avoid thrashing.
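The shadow page table bookkeeping described above can be sketched as a toy model. This is only an illustration under simplifying assumptions: single-level page tables held in dictionaries, and hypothetical names (`ShadowPager`, `guest_maps`, `translate`) that do not come from any real hypervisor.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ShadowPager:
    """Toy VMM-side state for one guest OS (all names are hypothetical)."""
    # Guest-physical page number -> host page frame number, owned by the VMM.
    gpa_to_frame: Dict[int, int]
    # Guest-virtual page -> guest-physical page, as written by the guest OS.
    guest_table: Dict[int, int] = field(default_factory=dict)
    # Guest-virtual page -> host page frame: the table the MMU would use.
    shadow_table: Dict[int, int] = field(default_factory=dict)

    def guest_maps(self, vpage: int, gpage: int) -> None:
        """Called when the VMM intercepts a guest page-table update."""
        if gpage not in self.gpa_to_frame:
            raise ValueError(f"guest page {gpage} is not backed by a frame")
        self.guest_table[vpage] = gpage
        # Replicate the change in the shadow table, pointing at the real frame.
        self.shadow_table[vpage] = self.gpa_to_frame[gpage]

    def guest_unmaps(self, vpage: int) -> None:
        """Remove a mapping from both the guest table and its shadow."""
        self.guest_table.pop(vpage, None)
        self.shadow_table.pop(vpage, None)

    def translate(self, vaddr: int, page_size: int = 4096) -> int:
        """Address translation through the shadow table only."""
        vpage, offset = divmod(vaddr, page_size)
        return self.shadow_table[vpage] * page_size + offset


pager = ShadowPager(gpa_to_frame={0: 7, 1: 3})
pager.guest_maps(vpage=5, gpage=1)
print(hex(pager.translate(5 * 4096 + 0x10)))  # prints 0x3010
```

The guest only ever sees its own table; the translation path consults the shadow table, which is why every guest update has to be replicated there.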
Type 1 and 2 Hypervisors
[Figure: Type 1 hypervisor and Type 2 hypervisor]
Taxonomy of VMMs:
1. Type 1 Hypervisor (bare metal, native): supports multiple virtual
machines and runs directly on the hardware (e.g., VMware ESX ,
Xen, Denali)
2. Type 2 Hypervisor (hosted): runs under a host operating
system (e.g., user-mode Linux)
Examples of Hypervisors
Performance and Security Isolation
The run-time behavior of an application is affected by other
applications running concurrently on the same platform and
competing for CPU cycles, cache, main memory, disk and
network access. Thus, it is difficult to predict the completion
time!
User-mode vs Kernel-mode
Techniques for Virtualizing CPU on x86
Full virtualization – a guest OS can run unchanged under the
VMM as if it were running directly on the hardware platform. Each
VM runs on an exact copy of the actual hardware.
Binary translation rewrites parts of the code on the fly to replace
sensitive but not privileged instructions with safe code to emulate the
original instruction
“The hypervisor translates all operating system instructions on the fly
and caches the results for future use, while user level instructions run
unmodified at native speed.”
Examples: VMware, Microsoft Virtual Server
Advantages:
No hardware assistance needed
No modifications of the guest OS
Isolation, Security
Disadvantages:
Speed of execution
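The idea of binary translation with a translation cache can be illustrated with a toy model. This is a sketch under loud assumptions: instructions are plain strings, the `vmm_emulate_*` targets are hypothetical, and real translators work on machine code rather than text. The opcodes listed are among the x86 instructions commonly cited as sensitive but not privileged.

```python
from typing import Dict, List, Tuple

# A few x86 instructions that are sensitive but do not trap in user mode.
SENSITIVE = {"popf", "pushf", "sgdt", "sidt", "smsw"}


class BinaryTranslator:
    """Toy translator: rewrites sensitive opcodes and caches each block."""

    def __init__(self) -> None:
        self.cache: Dict[int, Tuple[str, ...]] = {}
        self.translations = 0  # number of blocks actually translated

    def translate_block(self, address: int, block: List[str]) -> Tuple[str, ...]:
        # Reuse an earlier translation of the block at this address.
        if address in self.cache:
            return self.cache[address]
        self.translations += 1
        out = []
        for insn in block:
            opcode = insn.split()[0]
            if opcode in SENSITIVE:
                # Replace the instruction with a call into (hypothetical) safe code.
                out.append(f"call vmm_emulate_{opcode}")
            else:
                out.append(insn)  # harmless instructions pass through unchanged
        self.cache[address] = tuple(out)
        return self.cache[address]


bt = BinaryTranslator()
block = ["mov eax, ebx", "popf", "add eax, 1"]
first = bt.translate_block(0x1000, block)
second = bt.translate_block(0x1000, block)  # served from the cache
print(first, bt.translations)
```

The cache is what keeps the cost manageable: a block is rewritten once, and later executions of the same address reuse the result.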
Techniques for Virtualizing CPU on x86
Para-virtualization – “involves modifying the OS kernel to
replace non-virtualizable instructions with hypercalls that
communicate directly with the virtualization layer hypervisor. The
hypervisor also provides hypercall interfaces for other critical
kernel operations such as memory management, interrupt
handling and time keeping.” (from VMware paper)
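A hypercall interface of the kind described in the quote can be sketched as follows. Everything here is hypothetical (the `Hypervisor` and `ParavirtualizedKernel` classes, the `update_page_table` hypercall name, and the return codes); the point is only that the modified kernel asks the hypervisor to perform the operation, and the hypervisor validates the request first.

```python
from typing import Callable, Dict, Set


class Hypervisor:
    """Toy hypervisor exposing one hypercall (names are illustrative)."""

    def __init__(self, frames_owned: Dict[str, Set[int]]) -> None:
        self.frames_owned = frames_owned               # guest -> frames it may map
        self.page_tables: Dict[str, Dict[int, int]] = {}
        self.handlers: Dict[str, Callable[..., int]] = {
            "update_page_table": self._update_page_table,
        }

    def hypercall(self, guest: str, name: str, *args: int) -> int:
        handler = self.handlers.get(name)
        if handler is None:
            return -1  # unknown hypercall
        return handler(guest, *args)

    def _update_page_table(self, guest: str, vpage: int, frame: int) -> int:
        # Validate before acting: a guest may only map frames it owns.
        if frame not in self.frames_owned.get(guest, set()):
            return -1
        self.page_tables.setdefault(guest, {})[vpage] = frame
        return 0


class ParavirtualizedKernel:
    """Guest kernel modified to call the hypervisor instead of the hardware."""

    def __init__(self, name: str, hv: Hypervisor) -> None:
        self.name = name
        self.hv = hv

    def map_page(self, vpage: int, frame: int) -> bool:
        # An unmodified kernel would write the page table directly here.
        return self.hv.hypercall(self.name, "update_page_table", vpage, frame) == 0


hv = Hypervisor(frames_owned={"guest-a": {10, 11}})
kernel = ParavirtualizedKernel("guest-a", hv)
print(kernel.map_page(0, 10), kernel.map_page(1, 99))
```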
Full Virtualization and Paravirtualization
[Figure: two software stacks side by side, each showing, top to bottom:
Guest OS, hardware abstraction layer, Hypervisor, Hardware.]
Techniques for Virtualizing CPU on x86
Hardware Assisted Virtualization – “a new CPU execution
mode feature that allows the VMM to run in a new root mode
below ring 0. As depicted in Figure 7, privileged and sensitive
calls are set to automatically trap to the hypervisor, removing
the need for either binary translation or paravirtualization”
(from VMware paper)
VT-x, a Major Architectural Enhancement
In 2005 Intel released two Pentium 4 models supporting VT-x.
VT-x supports two modes of operation (Figure (a)):
1. VMX root - for VMM operations.
2. VMX non-root - supports a VM.
It also adds a new data structure, the Virtual Machine Control
Structure (VMCS), which includes host-state and guest-state areas (Figure (b)).
VM entry - the processor state is loaded from the guest-state area of
the VM scheduled to run; then control is transferred from the VMM
to the VM.
VM exit - the processor state is saved in the guest-state area of the
running VM; then the processor state is loaded from the host-state
area, and finally control is transferred to the VMM.
[Figure: (a) VM entry moves the processor from VMX root to VMX non-root
and VM exit moves it back; (b) the virtual-machine control structure with
its host-state and guest-state areas.]
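The VM entry and VM exit sequences can be traced with a small state model. This is a simplified sketch, not a description of the actual hardware: the register set is reduced to two hypothetical fields (`rip`, `rsp`), and the `VMCS` and `Processor` classes are illustrative names.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VMCS:
    """Toy virtual-machine control structure with its two state areas."""
    guest_state: Dict[str, int] = field(default_factory=dict)
    host_state: Dict[str, int] = field(default_factory=dict)


class Processor:
    def __init__(self) -> None:
        self.mode = "VMX root"
        self.state: Dict[str, int] = {"rip": 0, "rsp": 0}

    def vm_entry(self, vmcs: VMCS) -> None:
        if self.mode != "VMX root":
            raise RuntimeError("VM entry is only possible from VMX root")
        # Load the processor state from the guest-state area, then run the VM.
        self.state = dict(vmcs.guest_state)
        self.mode = "VMX non-root"

    def vm_exit(self, vmcs: VMCS) -> None:
        if self.mode != "VMX non-root":
            raise RuntimeError("VM exit is only possible from VMX non-root")
        # Save the running VM's state, then restore the VMM's state.
        vmcs.guest_state = dict(self.state)
        self.state = dict(vmcs.host_state)
        self.mode = "VMX root"


cpu = Processor()
vmcs = VMCS(guest_state={"rip": 0x1000, "rsp": 0x8000},
            host_state={"rip": 0xF000, "rsp": 0xFF00})
cpu.vm_entry(vmcs)
cpu.state["rip"] += 4          # the guest executes one (pretend) instruction
cpu.vm_exit(vmcs)
print(cpu.mode, hex(vmcs.guest_state["rip"]), hex(cpu.state["rip"]))
```

After the exit, the guest's progress is preserved in the guest-state area, so the next VM entry resumes where the guest left off.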
Xen - a VMM based on Paravirtualization
The goal of the Cambridge group was to design a VMM capable of
scaling to about 100 VMs running standard applications and
services without any modifications to the Application Binary
Interface (ABI). (2003, Computing Laboratory, Cambridge University)
Linux, Minix, NetBSD, FreeBSD and others can operate as
paravirtualized Xen guest OSes running on x86, x86-64, Itanium,
and ARM architectures.
Xen domain - ensemble of address spaces hosting a guest OS
and applications running under the guest OS. Runs on a virtual
CPU.
Dom0 - dedicated to execution of Xen control functions and
privileged instructions.
DomU - a user domain.
Applications make system calls using hypercalls processed by
Xen; privileged instructions issued by a guest OS are
paravirtualized and must be validated by Xen.
Xen
[Figure: Xen architecture. Top: a management OS and three applications.
Middle: Xen, exposing the Domain0 control interface, a virtual x86 CPU,
virtual physical memory, a virtual network, and virtual block devices.
Bottom: x86 hardware.]
Strategies for virtual memory management, CPU multiplexing, and I/O devices
Linux Containers
A Linux Container is a Linux process (or processes) that forms a
virtual environment with its own process and network space
(lightweight process virtualization).
Containers share portions of the host kernel
Containers use:
Namespaces: per-process isolation of OS resources (filesystem, network
and user ids)
Cgroups: resource management and accounting per process
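On a Linux host, the namespaces and cgroups a process belongs to are visible under `/proc`. The helper names below (`namespaces_of`, `cgroups_of`) are hypothetical, and the sketch assumes the standard `/proc/<pid>/ns` and `/proc/<pid>/cgroup` entries; on other systems it simply returns empty results.

```python
import os
from typing import Dict, List


def namespaces_of(pid: str = "self") -> Dict[str, str]:
    """Map namespace type (e.g. 'net', 'pid') to its identifier for a process."""
    ns_dir = f"/proc/{pid}/ns"
    result: Dict[str, str] = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result  # not Linux, no such process, or /proc unavailable
    for name in sorted(names):
        try:
            # Each entry is a symlink whose target identifies the namespace.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue
    return result


def cgroups_of(pid: str = "self") -> List[str]:
    """Return the raw cgroup membership lines for a process."""
    try:
        with open(f"/proc/{pid}/cgroup") as f:
            return [line.strip() for line in f if line.strip()]
    except OSError:
        return []


for ns_type, ident in namespaces_of().items():
    print(ns_type, ident)
for line in cgroups_of():
    print(line)
```

Two processes in the same container report the same namespace identifiers, while a process in a different container reports different ones for the namespaces that are not shared.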
Examples of container use:
https://www.dotcloud.com/
https://www.heroku.com/
Xen I/O
[Figure: (a) Xen I/O between a driver domain (bridge, backend) and a
guest domain (frontend), connected by an I/O channel with Xen zero-copy
transfer; (b) the response queue, with a response producer (shared pointer
updated by Xen) and a response consumer (private pointer maintained by
the guest OS).]
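The response queue in the Xen I/O figure, with a producer pointer updated by Xen and a consumer pointer kept by the guest OS, behaves like a circular buffer. The sketch below is a single-threaded toy model; the class name `ResponseRing` and the field names `rsp_prod` and `rsp_cons` are illustrative, and memory sharing, notifications, and the matching request queue are left out.

```python
from typing import Any, List, Optional


class ResponseRing:
    """Toy circular response queue shared between Xen and a guest OS."""

    def __init__(self, size: int = 8) -> None:
        self.size = size
        self.slots: List[Any] = [None] * size
        self.rsp_prod = 0  # shared pointer, advanced by Xen
        self.rsp_cons = 0  # private pointer, advanced by the guest OS

    def produce(self, response: Any) -> bool:
        """Xen side: place a response in the ring if there is room."""
        if self.rsp_prod - self.rsp_cons == self.size:
            return False  # ring is full; the guest has not caught up yet
        self.slots[self.rsp_prod % self.size] = response
        self.rsp_prod += 1
        return True

    def consume(self) -> Optional[Any]:
        """Guest side: take the next unread response, if any."""
        if self.rsp_cons == self.rsp_prod:
            return None  # nothing new
        response = self.slots[self.rsp_cons % self.size]
        self.rsp_cons += 1
        return response


ring = ResponseRing(size=2)
ring.produce("read done")
ring.produce("write done")
print(ring.produce("dropped?"))   # False: the ring holds only two entries
print(ring.consume(), ring.consume(), ring.consume())
```

Because each pointer is advanced by only one side, the two parties never write to the same field, which is what lets the ring be shared without a lock in this simplified picture.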
Xen Network Architecture
[Figure: (a) the original architecture: NIC driver, bridge, backend
interface, I/O channel, virtual interface; (b) the optimised architecture:
NIC driver, bridge, offload driver, backend interface, I/O channel,
high-level virtual interface.]
The Darker Side of Virtualization
[Figure: two software stacks, each with a malicious OS directly above the
hardware; (a) operating system (OS) and application on top; (b) virtual
machine monitor, guest OS, and application on top.]
The insertion of a Virtual-Machine Based Rootkit (VMBR) as the
lowest layer of the software stack running on the physical
hardware: (a) below an operating system; (b) below a legitimate
virtual machine monitor. The VMBR enables a malicious OS to
run surreptitiously and makes it invisible to the genuine or the
guest OS and to the application.