Workspaces for CE Management
Kate Keahey, Argonne National Laboratory
Why Workspaces?
We need to be able to dynamically create an execution environment on remote resources
Dream up an ideal environment
magic happens
Ideal environment is deployed in just the right place
run jobs
The aspects of workspaces:
Quality of Service: isolation and enforcement
Quality of Life: providing the right configuration at the right time
06/22/05
EGEE All Hands, Brno
What are Virtual Workspaces?
A description of an execution environment
Software configuration requirements
OSG worker node, submit node for a Grid3 cluster
Resource allocation requirements
Use exactly X memory, at least Y disk space, Z bandwidth
Sharing and isolation properties
Unix account, sandbox, various kinds of virtual machines
And others
Basic workspace example: a Unix account on a remote machine
Workspace can be managed and refined
In terms of lifetime, meta-data, access policies
A workspace can be deployed on a resource
Jobs can be deployed in a workspace
A workspace can have various implementations
Dynamic accounts & configuration tools
Pacman, SoftEnv, Softricity
Virtual Machines
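As a rough illustration of the description above, a workspace could be modeled as a small record that groups software, resource, and isolation requirements and can later be refined, for example by extending its lifetime. This is only a sketch: the class, field, and function names (`WorkspaceDescription`, `extend_lifetime`, and so on) and all values are hypothetical and not taken from any workspace service interface.

```python
from dataclasses import dataclass, field, replace
from datetime import timedelta


@dataclass(frozen=True)
class WorkspaceDescription:
    """Hypothetical record for a workspace; all field names are illustrative."""
    software: str        # e.g. "OSG worker node"
    memory_mb: int       # "use exactly X memory"
    min_disk_gb: int     # "at least Y disk space"
    isolation: str       # e.g. "unix-account", "sandbox", "vm"
    lifetime: timedelta = timedelta(hours=1)
    metadata: dict = field(default_factory=dict)


def extend_lifetime(ws: WorkspaceDescription, extra: timedelta) -> WorkspaceDescription:
    """Refine a workspace by returning a copy with a longer lifetime."""
    return replace(ws, lifetime=ws.lifetime + extra)


# The basic example from the slide: a Unix account on a remote machine.
# The concrete values below are made up for illustration.
basic = WorkspaceDescription(
    software="base Unix environment",
    memory_mb=256,
    min_disk_gb=1,
    isolation="unix-account",
)
refined = extend_lifetime(basic, timedelta(hours=2))
```

The record is immutable here, so each refinement yields a new description and the original stays available for comparison or rollback; a real service might instead update the workspace in place.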
Binding Workspaces to Resources
[Diagram: a client binds workspaces vw1 and vw2 to a resource]
Resource: CPU = Pentium, Memory = 1 GB, Disk = 20 GB
Steps for each workspace
Find a resource advertising the required capability
Reserve capability
Bind to reservation
Workspace description: CPU >= 60% Pentium, Memory = 256 MB, Software = OSG SE v1
Capability description: CPU >= 60% Pentium, Memory = 256 MB
Workspace description: CPU >= 30% Pentium, Memory = 512 MB, Software = exp
Capability description: CPU >= 30% Pentium, Memory = 512 MB
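The find, reserve, and bind sequence on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the actual workspace service: the `Resource` and `Request` classes, the function names, the node name `node1`, and the pairing of the names vw1 and vw2 with particular requests are all hypothetical. Only the CPU and memory figures come from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Resource:
    """A resource advertising spare capacity (hypothetical model)."""
    name: str
    cpu_pct_free: int    # unreserved share of the CPU, in percent
    memory_mb_free: int  # unreserved memory, in MB


@dataclass
class Request:
    """The resource part of a workspace description (hypothetical model)."""
    workspace: str
    cpu_pct: int
    memory_mb: int


def find_resource(resources: List[Resource], req: Request) -> Optional[Resource]:
    """Step 1: find a resource advertising the required capability."""
    for r in resources:
        if r.cpu_pct_free >= req.cpu_pct and r.memory_mb_free >= req.memory_mb:
            return r
    return None


def reserve_and_bind(resources: List[Resource], req: Request,
                     bindings: Dict[str, str]) -> bool:
    """Steps 2 and 3: reserve the capability, then bind the workspace to it."""
    r = find_resource(resources, req)
    if r is None:
        return False
    r.cpu_pct_free -= req.cpu_pct        # reserve the CPU share
    r.memory_mb_free -= req.memory_mb    # reserve the memory
    bindings[req.workspace] = r.name     # bind to the reservation
    return True


# Figures from the slide: one Pentium node with 1 GB and two requests.
nodes = [Resource("node1", cpu_pct_free=100, memory_mb_free=1024)]
bindings: Dict[str, str] = {}
ok_a = reserve_and_bind(nodes, Request("vw1", cpu_pct=60, memory_mb=256), bindings)
ok_b = reserve_and_bind(nodes, Request("vw2", cpu_pct=30, memory_mb=512), bindings)
```

After both calls the node has 10% CPU and 256 MB left unreserved, so a further request for more than that would be refused rather than overcommitting the node.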
VW Implementations: Virtual Machines
Advantages
Customizable software configuration
Library signature, OS, 64/32-bit architectures
Excellent enforcement potential
Enforcement on a sandbox rather than process level
Excellent isolation
Generally enhanced security, audit forensics
Pausing, serialization, and migration
VM images (including RAM) can be copied
Available implementations
Commercial (VMware)
Open source (Xen, UML)
Support for Xen is also available from XenSource and many Linux distributors
Xen is rapidly emerging as the most popular implementation
The fastest, the freest, the most open source, the best backed
The Need for Speed
[Figure: relative performance, on a scale from 0.0 to 1.1, for SPEC INT2000 (score), Linux build time (s), OSDB-OLTP (tup/s), and SPEC WEB99 (score)]
Benchmark suite running on Linux (L), Xen (X), VMware Workstation (V), and UML (U)
DRAG Benchmark Results
DRAG suite: FFT-based benchmark
Comparison (by Xuehai Zhang, UC):
Linux: machine runs native 2.6 Linux
Dom0: machine runs Xen and domain 0
DomU: machine runs Xen, domain 0 and a user domain
Similar performance to native Linux
<3% degradation, but sometimes actually better than native Linux
More details at http://people.cs.uchicago.edu/~hai/vm1/drag/.
Deployment Concerns
Distribution/Installation
Para-virtualization (Xen) requires kernel modifications
Yes, but everything else stays the same
Work in progress on making Xen part of the Linux kernel
False reports of its completion have been seen recently!
Support from many Linux distributors: Fedora, Debian, SUSE, Gentoo, Mandrake, etc.
Privilege level (Xen)
Domain0 is a privileged domain, not a good environment for sharing
If the Xen configuration is going to be permanent, using DomainU is recommended
Performance impact needs to be considered
The Xen of Enforcement
CPU
Schedulers: BVT, FBVT, Round Robin, Atropos/SEDF
May be selected at boot time; BVT is default
Borrowed Virtual Time (BVT)
Fair share of CPU based on weights assigned to the domains
Work-conserving
Simple Earliest Deadline First (SEDF)
Reserves absolute shares of CPU for domains
Memory
Memory size specified in a configuration file
Can be readjusted from domain0
Disk
Export partitions
Logical Volume Manager (LVM) allows growing and shrinking the disk size
Networking
Standard Linux deployment tools: Domain0 can do traffic shaping for user domains
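To make "fair share based on weights" and "work-conserving" concrete, the sketch below computes the CPU fraction each runnable domain would receive under plain proportional sharing. It is not the BVT algorithm itself, only the allocation idea behind it; the function name `cpu_shares`, the domain names, and the weights are all hypothetical.

```python
def cpu_shares(weights, runnable):
    """Split the CPU among runnable domains in proportion to their weights.

    Work-conserving: the share of an idle domain is redistributed to the
    runnable domains instead of being left unused.
    """
    active = {d: w for d, w in weights.items() if d in runnable}
    total = sum(active.values())
    if total == 0:
        return {}
    return {d: w / total for d, w in active.items()}


weights = {"dom0": 1, "vw1": 2, "vw2": 1}  # hypothetical domain weights

# All three domains have work: vw1 gets half, the others a quarter each.
all_busy = cpu_shares(weights, {"dom0", "vw1", "vw2"})

# vw2 goes idle: its quarter is split between dom0 and vw1 by weight.
vw2_idle = cpu_shares(weights, {"dom0", "vw1"})
```

A scheduler that reserves absolute shares behaves differently: a domain's reservation holds regardless of what the others do, which gives stronger guarantees but can leave reserved CPU time unused unless slack is explicitly handed out.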
Workspace as a CE Environment
[Diagram; labels: headnode, CE broker, VW Manager, and multiple CE instances, each shown with WSS, LCAS, LCMAPS, and GRAM]
Pros and Cons
Problems that VMs solve for us
Environment management
Configuration management
Running two different versions of CE software side-by-side
Enforcement and isolation
Graceful load management
Renegotiating the resource allocation
Live migration across nodes
Problems that VMs don't solve for us
Job management: jobs within an environment still need to be managed
Job throttling
Job persistence, restart, etc.
The cost of perfect enforcement
Each CE will run a copy of similar services, leading to potential inefficiencies
There has been some work on sharing (e.g. shared libraries) between VMs, but it is inconclusive right now
Meanwhile, in a galaxy far, far away
Similar ideas in US projects
Edge Services
Effort led by Frank Wuerthwein in the context of OSG/CMS
Management of submit nodes
Work with Rob Gardner and Mike Wilde in the context of Grid3/Atlas
Requirements:
Install and manage a complex configuration
Easy upgrades based on pre-configured images, consistent configuration across sites, version management, etc.
Control of resources
Guaranteed dedicated use of resources
Widening the interface to a cluster based on need
Flexible load balancing
Edge Services
Edge Services: Services executing on the edge of a private/public network boundary
Examples: CE, SE, GK, and others
Typical configuration of today's resources
Resources within a site are available only on a private network
Site can be accessed through a limited number of public addresses
Edge Services will be deployed in VM-based workspaces
Role-based deployment
Initially no advance reservations, no load balancing
A proof-of-concept activity
Draft document available: http://osg-docdb.opensciencegrid.org/cgibin/ShowDocument?docid=167
Submit Node Management
Similar to the Edge Service activity with particular emphasis on:
Configuration management
Configure once, copy and deploy many times
Load balancing: widening the submit bottleneck to clusters based on need
Conclusions
Workspaces solve environment management problems
Configuration management
Configure once, copy and deploy many times
Upgrading service versions
Running conflicting or incompatible services side-by-side
Enforcement
Guaranteed resources with respect to other users
There seems to be a confluence of ideas
Similar ideas in three different contexts
Coincidence?
Workspaces do not solve job management problems
Which leads into subjects we'll talk about later