03 OVM3 Domain Concept

Uploaded by

Razu Mollah

1 | © 2012 Oracle Corporation – Proprietary and Confidential

Safe Harbor Statement


The following is intended to outline our general product
direction. It is intended for information purposes only, and
may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality,
and should not be relied upon in making purchasing
decisions.
The development, release, and timing of any features or
functionality described for Oracle’s products remains at the
sole discretion of Oracle.



Oracle Training Materials – Usage Agreement
Use of this Site (“Site”) or Materials constitutes agreement with the following terms and conditions:

1. Oracle Corporation (“Oracle”) is pleased to allow its business partner (“Partner”) to download and copy the information,
documents, and the online training courses (collectively, “Materials") found on this Site. The use of the Materials is restricted to
the non-commercial, internal training of the Partner’s employees only. The Materials may not be used for training, promotion, or
sales to customers or other partners or third parties.

2. All the Materials are trademarks of Oracle and are proprietary information of Oracle. Partner or other third party at no time has
any right to resell, redistribute or create derivative works from the Materials.

3. Oracle disclaims any warranties or representations as to the accuracy or completeness of any Materials. Materials are provided
"as is" without warranty of any kind, either express or implied, including without limitation warranties of merchantability, fitness
for a particular purpose, and non-infringement.

4. Under no circumstances shall Oracle or the Oracle Authorized Boot Camp Training Partner be liable for any loss, damage,
liability or expense incurred or suffered which is claimed to have resulted from use of this Site or Materials. As a condition of use
of the Materials, Partner agrees to indemnify Oracle from and against any and all actions, claims, losses, damages, liabilities
and expenses (including reasonable attorneys' fees) arising out of Partner’s use of the Materials.

5. Reference materials including but not limited to those identified in the Boot Camp manifest can not be redistributed in any format
without Oracle written consent.



Oracle VM for SPARC Domain Concept
Instructors Name



Module Objectives
• Introduction of Domain Concept in OVM for SPARC
• Resource virtualization in OVM for SPARC
– CPU Virtualization
– Memory Virtualization
– I/O Virtualization
– Platform Component Virtualization
• Domain Metadata and Platform Configuration
• Create OVM for SPARC Initial Configuration



Oracle VM for SPARC

Domain Concept



Domain Concept in Oracle VM for SPARC
• A domain is an isolated runtime environment for an independent
Solaris OS instance
• A domain environment is a collection of virtual resources
• Virtual resources in a domain are similar to the physical
components on bare metal
– CPU
– Memory
– I/O Components
– Platform Components



Domain Concept in Oracle VM for SPARC
Resource Provisioning
• Describe the resource requirements of a domain declaratively
– Stored as domain metadata
– The metadata must be kept in persistent storage
• Virtual resource provisioning
– Uses internal data structures in the sun4v platform
• Generated at domain initialization and on later resource DR
• Recognized by OBP and the Solaris OS in each domain
• Resource provisioning is dynamic
– Resource configuration can be changed by issuing a resource DR
(dynamic reconfiguration) request



Oracle VM for SPARC

Resource Virtualization



CPU Virtualization in OVM for SPARC
• CPU virtualization usually shares CPU resources among domains
• Legacy server hardware does not have enough CPU resources, so the
hypervisor (HV) has to play tricks to support a large number of guests
– By scheduling CPU resources among multiple guests
– Can support CPU over-commitment, but the added complexity causes higher
overhead
• SPARC processors are leading the multi-threaded CPU technology
– Support up to 3072 threads in M6-32 server
• OVM for SPARC chooses CPU resource hard-allocation for domains
– Each domain will get dedicated CPU resources
• Better isolation
• Bare-metal performance



CPU Virtualization in OVM for SPARC
CPU Resource Provisioning
• Virtual CPU Provisioning
– Virtual CPU is provisioned by Hypervisor
• HV exposes physical CPU thread directly to domains
– CPU Provisioning in OVM for SPARC is fine-grained
• 1 vCPU = 1 Thread
– Extreme Scalability
• Can allocate from 1 thread up to whole CPU resource for a single domain
• Crypto Acceleration Provisioning
– In S2 core, Crypto is implemented as a shared component in CPU Core
• Must assign it to a domain explicitly and exclusively
– In S3 core, Crypto is implemented as shared instructions in CPU Core
• Every domain can leverage the capabilities without need of assignment
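
The hard-allocation model above (1 vCPU = 1 thread, each thread dedicated to one domain) can be pictured with a small Python sketch. This is only an illustration of the idea, not how the hypervisor is implemented; the class name `ThreadPool`, its methods, and the domain name `guest1` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadPool:
    """Physical CPU threads of one server; each thread has at most one owner."""
    total_threads: int
    owner: dict = field(default_factory=dict)  # thread id -> domain name

    def free_threads(self):
        """Return the ids of threads not yet dedicated to any domain."""
        return [t for t in range(self.total_threads) if t not in self.owner]

    def allocate(self, domain, vcpus):
        """Dedicate `vcpus` threads to `domain` (1 vCPU = 1 thread)."""
        free = self.free_threads()
        if vcpus > len(free):
            # Hard allocation: no over-commitment, so the request fails.
            raise ValueError(f"only {len(free)} threads free, {vcpus} requested")
        chosen = free[:vcpus]
        for t in chosen:
            self.owner[t] = domain
        return chosen


# Thread count taken from the M6-32 figure quoted earlier in the deck.
pool = ThreadPool(total_threads=3072)
primary = pool.allocate("primary", 32)
guest = pool.allocate("guest1", 1)  # a single thread is a valid domain size
```

Because every thread has exactly one owner, a request larger than the free pool is rejected instead of being time-sliced, which is the trade-off the slides describe as better isolation in exchange for no over-commitment.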



CPU Virtualization in OVM for SPARC
Virtual CPU Architecture
• Virtual CPU Architecture
– Control the capabilities (available instruction sets) of the virtual processor
• Available Virtual CPU architecture
– native
• Same capabilities as native processors
– generic
• Only common SPARCv9 instruction sets
– migration-class1
• S3 Core capabilities, can use most of the native S3 core functions
– sparc64-class1
• Fujitsu SPARC 64 capabilities, can use most of the native SPARC64 core functions
• CPU Architecture other than native will impact domain performance
Memory Virtualization in OVM for SPARC
Memory Model in sun4v platform
• Addressing Model
– Solaris OS implements a hybrid (RAM, Storage etc.) virtual memory system
• Can support larger memory space than available physical memory
• Continuous Virtual Address (VA) space
– CPU can only access physical memory for data processing
• Physical memory for a domain is identified by Real Address (RA) space
– Memory space is divided into multiple pages
• Solaris supports varied page size from 8k to 2GB
• Address Translation
– Need translation from VA to RA and load data into
physical memory for processing
[Figure: CPU-0 and CPU-1, each with an MMU, translating VA 0x03000000 to RA 0x103000000]
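
A minimal sketch of the VA to RA step, assuming a flat page table keyed by virtual page number and the 8k page size named on the slide. The function name `translate` and the table layout are illustrative assumptions; the address pair is the one shown on the slide's figure.

```python
PAGE_SIZE = 8 * 1024  # smallest Solaris page size named on the slide


def translate(va, page_table):
    """Translate a virtual address to a real address via a page table.

    page_table maps virtual page number -> real page number.
    A missing entry raises KeyError (a page fault in a real system).
    """
    vpn, offset = divmod(va, PAGE_SIZE)
    rpn = page_table[vpn]
    return rpn * PAGE_SIZE + offset


# Mapping taken from the figure: VA 0x03000000 -> RA 0x103000000.
page_table = {0x03000000 // PAGE_SIZE: 0x103000000 // PAGE_SIZE}

ra = translate(0x03000000 + 0x10, page_table)
print(hex(ra))  # 0x103000010
```

The offset within the page is carried over unchanged; only the page number is remapped.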



Memory Virtualization in OVM for SPARC
Memory Virtualization Implementation
• Memory Provisioning
– Hypervisor will use memory hard-allocation for domains
• Only provision physical memory for guest domains and no over-commitment support
• Memory allocation is dedicated
– Hypervisor will allocate physical memory for domains by blocks
• One domain can have multiple memory blocks in different sizes
• Each memory block is given a non-overlapping RA range
• Address Translation
– TLB entries from MMU will be provisioned dedicated to each domain
• Enable fastest translation in hardware directly
– Solaris OS assists in translation on a TLB miss
• Use a software-based TSB
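
The two-level lookup described above (hardware TLB first, software TSB on a miss) can be sketched as follows. This is a simplified model under stated assumptions: the class name `Mmu`, the dictionary-based tables, and the oldest-entry eviction rule are all hypothetical and are not taken from the sun4v design.

```python
class Mmu:
    """Toy model: a small TLB backed by a software table (TSB)."""

    def __init__(self, tlb_capacity, tsb):
        self.tlb = {}                 # TLB entries dedicated to this domain
        self.tlb_capacity = tlb_capacity
        self.tsb = tsb                # software table maintained by the OS
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        """Return the real page number for a virtual page number."""
        if vpn in self.tlb:
            self.hits += 1            # fast path: translation in hardware
            return self.tlb[vpn]
        self.misses += 1
        rpn = self.tsb[vpn]           # slow path: consult the software TSB
        if len(self.tlb) >= self.tlb_capacity:
            # Assumed policy: evict the oldest entry (dicts keep insertion order).
            self.tlb.pop(next(iter(self.tlb)))
        self.tlb[vpn] = rpn
        return rpn


mmu = Mmu(tlb_capacity=2, tsb={1: 101, 2: 102, 3: 103})
first = mmu.lookup(1)   # miss, filled from the TSB
second = mmu.lookup(1)  # hit, served from the TLB
```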



Memory Virtualization in OVM for SPARC
Memory Affinity
• Why Memory Affinity
– Modern processors including SPARC use NUMA Architecture
• MMU in processor
– NUMA effect will have negative performance impact
• Remote memory access through other processors is slower than local and even worse in
a large server like M6-32 if via the Scalability Link
• Memory Affinity in OVM for SPARC
– Hypervisor makes a best effort to allocate memory local to the physical processors
used by a guest domain
• Currently memory affinity is only ensured at bind time
• CPU DR may break memory affinity
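
A best-effort placement of this kind could look like the sketch below: memory is taken first from the NUMA nodes that host the domain's CPUs, and remote nodes are used only for the remainder. The function name `place_memory`, the node numbering, and the per-node free-memory table are assumptions made for illustration.

```python
def place_memory(request_gb, cpu_nodes, free_gb):
    """Return {node: GB}, preferring the nodes that host the domain's CPUs.

    cpu_nodes: nodes whose processors the domain uses (local memory).
    free_gb:   free memory per node, in GB; it is not modified.
    """
    placement = {}
    remaining = request_gb
    remote = [n for n in sorted(free_gb) if n not in cpu_nodes]
    for node in list(cpu_nodes) + remote:
        if remaining == 0:
            break
        take = min(remaining, free_gb.get(node, 0))
        if take > 0:
            placement[node] = take
            remaining -= take
    if remaining > 0:
        raise ValueError("not enough free memory on the server")
    return placement


local_only = place_memory(8, [0], {0: 16, 1: 16})
spilled = place_memory(24, [0], {0: 16, 1: 16})
```

The second call shows how affinity degrades: once node 0 is exhausted, the remaining 8 GB comes from node 1 and is accessed remotely.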



I/O Virtualization in OVM for SPARC
I/O Hierarchy in Physical Server
• I/O Devices in SPARC Servers are in a tree-like topology
– PCI ports in processors are the root
– Switch chips in the middle of the path will create branches
– Various PCI devices are on the leaf
• Ethernet Card
• HBA
• …
• IO hierarchy is divided into multiple sub-trees
– Aka, Root Complex (RC)
– Each RC will have separate PCI resources
[Figure: CPU-0 and CPU-1, each with PCIe ports 0 and 1 (RC); PCIe Switch 0 and PCIe Switch 1; NIC chips; Slots 1 to 8]



I/O Virtualization in OVM for SPARC
Solaris I/O Architecture
• Solaris OS will help to initialize I/O resources
– For each probed RC and attached I/O resources
• Solaris OS manipulates all devices by drivers
– Nexus driver
• Represent bus, controller of various sub-systems
• Oracle is the only vendor of those nexus drivers
– Device driver
• Can be developed by device vendors
• Solaris OS will organize drivers as a device tree
– Nexus driver at root, Device driver at leaf
– Stored as a special devfs in /devices folder
• It can not be modified by any file operation but Solaris OS
[Figure: CPU-0 with PCIe ports 0 and 1, PCIe Switch 0, NIC chip, Slots 1 to 4; device tree nodes /devices, pci@300, pci@0, pci@1, pci@6, network@0:ixgbe0]



I/O Virtualization in OVM for SPARC
• OVM for SPARC adopts same I/O architecture
– Help simplify the virtualization implementation in HV and Solaris OS kernel
• I/O Virtualization is the combined effort from both HV and Solaris OS
– HV will present virtualized I/O resources for each domain
– Solaris OS will implement both virtualized I/O services and I/O devices as kernel
drivers
• Three Virtualized I/O models in OVM for SPARC
– Virtual I/O
• Assign simulated virtual devices to guest domain
– Physical I/O
• Assign physical I/O devices or functions to a domain
– Hybrid I/O
• Deprecated since OVM for SPARC 3, only support NIU on T2, T3, T4 platforms



Virtual I/O
Architecture
• Virtual I/O Architecture
– In Service Domain
• Virtual I/O Services as proxy to backend I/O
– Provision simulated devices to guest domains
• A virtual nexus device will be root of those virtual I/O services
– Mapped as /devices/virtual-devices@100/ in the device tree
– Guest Domain
• Virtual devices connect to backend
– A dedicated LDC for each virtual device to communicate with virtual I/O services in
Service Domain
• A virtual nexus device will be root of those virtual devices
– Mapped as /devices/virtual-devices@100/ in the device tree
[Figure: Service Domain (Virtual Nexus, VSW, VDS) and Guest Domain (Virtual Nexus) on top of the Hypervisor and the SPARC Server]



Virtual I/O
Service Domain
• Service Domain provides virtual devices to other domains
– Two types of virtual I/O device
• Virtual Disk
• Virtual Network
– Typically also an I/O Domain, so that it has backend devices to virtualize
• But can create a service domain without any physical I/O backend
– Can have multiple Service Domains on a single server
• For I/O Redundant deployment
– For performance and security reasons, plan the Solaris OS in the
Service Domain carefully
• Recommend using latest Solaris OS release
• Recommend not running applications in service domain



Physical I/O
Architecture
• Physical I/O Architecture
– In Root Domain
• Own whole RC and all attached PCI devices
– Has responsibility to initialize the RC and all attached PCI devices
– Some devices will not be visible after being assigned to an I/O domain
– I/O Domain
• Virtual RCs
– Same configuration as corresponding physical RC, however can not make modification
• Physical Devices
– Only presented devices are visible
– Data communication does not go through HV
– Configuration requests for all devices will be
forwarded back to real devices through a
single LDC channel
[Figure: Root Domain (physical RC, NIC chip, pciv) and I/O Domain (virtual RC, pciv) on top of the Hypervisor and the SPARC Server]



Physical I/O
Root Domain
• Root Domain is a special I/O Domain
– Own whole RC instead of individual PCI devices
– Can assign individual devices or functions to I/O domains using PCI standards
• Direct I/O
• SR-IOV
– Recommend running latest Solaris OS release in Root Domain
• Leverage latest implementation and features
• Root Domain Model
– A special “physical partition” like domain configuration on sun4v platform
• All domains are Root Domains
• All domains will work independently
• All domains will achieve bare-metal performance



Physical I/O
I/O Domain
• I/O Domain will use physical PCI resources from Root Domain
– What are PCI resources
• MSI and Event Queue within a physical RC
– PCI Resources Provisioning
• Root Domain will distribute those resources from a physical RC to I/O Domains
– Each I/O domain will get same amount of resources
– Virtual RC will be implemented to use those PCI resources
• Each Virtual RC will be created for a specified RC in Root Domain
– The interrupt handling will be implemented in RC with those resources
• Platform will limit maximum number of VFs from a single RC
– The limitation will be described in Physical Resource Inventory (PRI) populated to Control Domain
– LDM process will enforce the limitation



Platform Component Virtualization
OpenBoot PROM (OBP)
• What is OBP
– OpenBoot PROM (OBP) is the firmware that boots a Solaris OS
• Loaded by HV before the Solaris kernel is running
• Initialize all devices for the domain
• Locate kernel image on specified bootable device and start the Solaris OS
• Several parameters can be defined to control the boot process
• Will be unloaded from memory to save space for Solaris OS
• OBP Virtualization
– HV will help to load a copy of OBP for each domain from host flash
– HV will load parameters for each domain from host flash at boot time
– A virtual “flashprom” device will be loaded in each Solaris OS to manipulate OBP
parameters in domain configuration



Platform Component Virtualization
System Console
• Virtual Console Architecture
– Control Domain
• Virtual Console Concentrator (VCC)
– A virtual device collecting console data from running guest domains
– Console data will be grouped by name or domain
• Virtual Network Terminal Server (VNTSD)
– A Solaris service that converts console data to a
network accessible protocol
– A specified port for each group in VCC
• Virtual Console
– A virtual device that will write console data
directly to SP
– Guest Domain
• Virtual Console
– A virtual device that will generate console data
and send it to VCC via an LDC channel
[Figure: Control Domain (Solaris, Console, VNTSD, VCC) and two Guest Domains (Solaris, Console) on top of the Hypervisor, the SP, and the SPARC Server]



Oracle VM for SPARC

Domain Metadata &


Platform Configuration



Domain Metadata
• Why metadata
– Domain does not directly control physical resource
• Has to use a descriptive way for its resource requirement
– Benefits of using metadata
• Detach a domain from a physical server
– Enable features like Live Migration and HA
• Easier to store and parse

• Domain Metadata in OVM for SPARC


– Aka. Domain Constraints, including various sets of information
• Resource : CPU, Memory, I/O
• OBP Variables
• Other domain-wide configuration



Manage Domain Constraints
• Domain constraints are manipulated by a set of configuration
requests
• To learn constraints of a domain
– # ldm ls-constraints <domain>
• The constraints of all domains are stored in the
Constraint DB file
– /var/opt/SUNWldm/ldom-db.xml
– A successful configuration request will update the Constraints DB



Domain Resources
• A domain can not be started without real resources
• The allocated resources for a domain are recorded with a data
structure called Machine Description (MD)
– Will be used during the life cycle of a running domain
• MD management
– HV provisions the correct MD for each domain
– LDom Manager will update the MDs with resource reconfiguration requests
• Leverage the knowledge of following information
– Current available resources on the server hardware
• Aka Physical Resource Inventory (PRI)
– Allocated resources for each domain
• Aka Machine Description (MD)



Domain Resource Reconfiguration
• Domain resources can be changed by issuing a configuration request
– Use reconfiguration for reallocating resources
– The reconfiguration request will eventually affect Solaris OS in the domain
• If the request can take effect on the running Solaris, it is a dynamic reconfiguration
• If the request can not take effect on the running Solaris, it is a delayed
reconfiguration
– Delayed reconfiguration is only supported on control and root domain
– Possible Reasons to use delayed reconfiguration
• Wait for other domains to free the resources
• Will take too long to reallocate resources for a domain
• Solaris OS Kernel does not support DR
– Modifications are applied after the domain is restarted



Domain Resource Reconfiguration
Delayed Reconfiguration
• How to start a delayed reconfiguration
– Explicitly start a delayed reconfiguration
• # ldm start-reconf <domain>
– Implicitly start a delayed reconfiguration
• Some resource change requests will start a delayed reconfiguration automatically

• Resource requests in a delayed reconfiguration are handled as a transaction
– To roll back the modifications, use the following CLI to cancel the delayed
reconfiguration
• # ldm cancel-reconf <domain>



Platform Configuration
• Platform Configuration is the collection of configuration information
about OVM for SPARC
– PRI
– MDs of all bound domains
– Constraint DB
• Metadata for all domains, even those not bound
• Can have multiple platform configurations on a single server
– Only one can be set to current and active
– The default is “factory-default” and can not be removed
• A single domain with whole resources assigned
– Can switch to another one as current
• Need a power cycle



Platform Configuration Repository
• Platform Configuration has two repositories
– Flash in SP
• The persistent storage of platform configuration
– Control Domain
• Aka. BootSet
• When a DR request is completed, current configuration will be saved automatically
– /var/opt/SUNWldm/autosave-<configuration name>
• When syncing to SP, configuration will also be saved in
– /var/opt/SUNWldm/bootsets/<configuration name>

• Two repositories should be in sync


– Timestamp is used to mark the last-modified time of every configuration copy
• In a normal case, BootSet in Control Domain is considered master copy
– A mismatch may sometimes cause configuration loss



Platform Configuration Recovery
Manual Recovery
• Backup domain constraints manually
– Backup a single Domain
• # ldm ls-constraints -x <domain> > <constraint xml file>
– Backup all domains
• # ldm ls-constraints -x > <constraint xml file>

• Restore domain constraints manually


– Restore a single Domain
• # ldm add-domain -i <single domain constraint xml file>
– Restore all domains
• # ldm init-system -r -i <all domains constraint xml file>



Platform Configuration Recovery
Automatic Recovery
• LDom Manager will check timestamp of both SP config and BootSet
– If BootSet is newer than SP config, the recovery action depends on the
autorecovery_policy property value defined for the LDM service
• 1 (Default) - Log a warning message in LDM service log. Must update the SP config
manually
– # ldm add-config -r <old config> <new config>
• 2 - Show a notification in ldm CLI output. Must still update the SP config manually
• 3 - Update the SP config automatically
– To change the policy value
• # svccfg -s ldmd setprop ldmd/autorecovery_policy=<value>
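
The timestamp check and the three policy behaviours can be summarised in a short sketch. It models only the decision described on this slide; the `Policy` names, the function `recovery_action`, and the returned strings are hypothetical, and the numeric property values are deliberately left out.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical labels for the three autorecovery behaviours."""
    LOG_WARNING = "log-warning"
    NOTIFY = "notify"
    AUTO_UPDATE = "auto-update"


def recovery_action(bootset_mtime, sp_mtime, policy):
    """Decide what to do when the BootSet and SP config timestamps differ."""
    if bootset_mtime <= sp_mtime:
        return "nothing to do"  # the SP copy is already current
    if policy is Policy.AUTO_UPDATE:
        return "update SP config from BootSet"
    if policy is Policy.NOTIFY:
        return "warn in ldm CLI output; manual update needed"
    return "warn in ldmd log; manual update needed"


# Timestamps are plain numbers here (for example, seconds since the epoch).
action = recovery_action(bootset_mtime=200, sp_mtime=100, policy=Policy.AUTO_UPDATE)
```

Only the automatic policy changes the SP copy; the other two leave the manual `ldm add-config -r` step to the administrator.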



Manipulate SP Configuration
• Store the boot set into SP for persistent storage
– # ldm add-spconfig <config name>
• List stored config
– # ldm list-spconfig
• Remove a stored config
– # ldm remove-spconfig <config name>
• Switch to an existing config
– # ldm set-spconfig <config name>



Oracle VM for SPARC

Initial Configuration



Initial Configuration Tasks
Release Resource from Control Domain
• The factory-default configuration assigns whole resources to the control
domain
– Release at least some CPU & memory resources for other domains
– Control Domain does not need to take so much space
• Configure Control Domain Resources
– The name for control domain is always “primary”
– Release CPU resources by assigning a new core count, for example
• # ldm set-core 4 primary (assigns only 4 cores to the control domain)
– Release memory resources, for example
• # ldm set-mem 8G primary (assigns only 8GB of memory to the control domain)



Initial Configuration Tasks
Configure Virtual Console Services
• Virtual console service listens on a range of ports as the proxy for running
domains
– Must specify the port range reserved for Virtual Console
• How to configure Virtual Console Service
– # ldm add-vcc port-range=5000-5127 primary-vcc primary
• How to check Virtual Console Service config in primary domain
– # ldm ls-services primary
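
To make the port range concrete, the sketch below parses the `port-range=5000-5127` argument and hands out one port per console group. The first-free assignment order, the function names, and the group names `ldg1` and `ldg2` are assumptions for illustration; the actual assignment done by the virtual console service may differ.

```python
def parse_port_range(spec):
    """Parse 'port-range=LOW-HIGH' into an inclusive (low, high) tuple."""
    key, _, value = spec.partition("=")
    if key != "port-range":
        raise ValueError(f"unexpected option: {key!r}")
    low_text, _, high_text = value.partition("-")
    low, high = int(low_text), int(high_text)
    if low > high:
        raise ValueError("port range is empty")
    return low, high


def assign_ports(spec, groups):
    """Give each console group its own port, in order, from the range."""
    low, high = parse_port_range(spec)
    ports = range(low, high + 1)
    if len(groups) > len(ports):
        raise ValueError("more console groups than ports in the range")
    return dict(zip(groups, ports))


bounds = parse_port_range("port-range=5000-5127")
mapping = assign_ports("port-range=5000-5127", ["ldg1", "ldg2"])
```

The range in the example covers 128 ports, so under this one-port-per-group assumption it can serve up to 128 console groups.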



Initial Configuration Tasks
Store the initial configuration
• The factory-default configuration can not be modified
– The initial configuration is stored in Control Domain with an empty name
• Config file as : autosave-_Default_SP_Config
– The configuration file may be lost if not stored persistently in SP
• To store the configuration in SP
– # ldm add-config <config name>
– Configuration name could be any valid name
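
The initial configuration tasks can be collected into one ordered command list. The sketch below only builds and prints the `ldm` commands quoted on these slides; it does not run them. The function name and the configuration name `initial` are hypothetical, and any reboot or delayed reconfiguration the control domain may need between steps is not modelled.

```python
def initial_config_commands(cores, memory, port_range, config_name):
    """Return the ldm commands from the slides as argv lists, in order."""
    return [
        ["ldm", "set-core", str(cores), "primary"],
        ["ldm", "set-mem", memory, "primary"],
        ["ldm", "add-vcc", f"port-range={port_range}", "primary-vcc", "primary"],
        ["ldm", "add-config", config_name],
    ]


commands = initial_config_commands(4, "8G", "5000-5127", "initial")
for argv in commands:
    print(" ".join(argv))
```

Keeping the commands as argv lists makes it easy to review the sequence before passing it to a runner on the control domain.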



Initial Configuration Tasks
Rollback to factory-default
• Should first stop all running guest domains or migrate those domains to
other servers
– To stop all domains: # ldm stop-domain -a
• Choose factory-default as the configuration for the next power-on
– In control domain: # ldm set-config factory-default
– In iLOM CLI: -> set /HOST/bootmode config=factory-default
• Power off the machine
– In control domain: # shutdown -i5 -g0 -y OR
– In iLOM: -> stop /SYS
• Power on the machine
– In iLOM: -> start /SYS


