Microsoft Cluster Service: A Retrospect

A retrospect on Microsoft Cluster Service

Uploaded by

tanmeya
Copyright
© Attribution Non-Commercial (BY-NC)
Microsoft Cluster Service

A Retrospect
What is a Cluster?

• A cluster is a collection of computer nodes that work in concert to provide a much more powerful system.
• To be effective, the cluster must be as easy to program and manage as a single large computer.
• Clusters have the advantage that they can grow much larger than the largest single node, they can tolerate node failures and continue to offer service, and they can be built from inexpensive components.
What is Clustering?

• Administrators cluster machines to provide services via a group of servers, with the goal of achieving high availability, scalability, or both.
Cluster Abstractions

• Node
• Resource
• Quorum Resource
• Resource Dependencies
• Resource Groups
• Cluster Configuration Database
Node

• A node is a self-contained Windows NT™ system that can run an instance of the Cluster Service.
• Groups of nodes implement a cluster.
• Nodes in a cluster communicate via messages over network interconnects and use communication timeouts to detect node failures.
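
Timeout-based failure detection, as described in the last bullet, can be illustrated with a minimal sketch. The class name, method names, and the timeout value are assumptions made for illustration; they are not the MSCS implementation.

```python
import time


class FailureDetector:
    """Track the last message heard from each peer and flag silent peers."""

    def __init__(self, timeout_seconds):
        # Hypothetical timeout; the slides do not state the value MSCS uses.
        self.timeout_seconds = timeout_seconds
        self.last_heard = {}

    def record_message(self, node, now=None):
        # Any message from a peer counts as evidence that it is alive.
        self.last_heard[node] = time.monotonic() if now is None else now

    def suspected_failed(self, now=None):
        # A peer is suspected once it has been silent longer than the timeout.
        now = time.monotonic() if now is None else now
        return sorted(
            node for node, seen in self.last_heard.items()
            if now - seen > self.timeout_seconds
        )
```

The optional `now` argument exists only so the behaviour can be checked with fixed timestamps instead of the real clock.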
Resource

• A resource represents certain functionality offered at a node.
• It may be physical or logical.
• Resources may, under control of the Cluster Service, migrate to another node.
Quorum Resource

• The Quorum Resource provides an arbitration mechanism to control membership.
• The Quorum Resource also implements persistent storage where the Cluster Service can store the Cluster Configuration Database and change log.
• The Quorum Resource must be available when the cluster is formed, and whenever the Cluster Configuration Database is changed.
• It is desirable that the Quorum Resource be highly available and not depend on the availability of a single node.
Resource Dependencies

• Resources often depend on the availability of other resources.
• These dependencies are declared and recorded in a dependency tree, which describes the sequence in which the resources should be brought online and which resources need to migrate together.
• If a resource is restarted, all resources that depend on it are also restarted.
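
The two uses of the dependency tree, ordering and restart propagation, can be sketched as follows. The resource names and the dictionary layout are hypothetical; only the ordering rule and the restart rule come from the slide.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency tree: each resource maps to the resources it depends on.
DEPENDS_ON = {
    "disk": set(),
    "ip_address": set(),
    "network_name": {"ip_address"},
    "database": {"disk", "network_name"},
}


def online_order(depends_on):
    # Dependencies are listed before the resources that need them.
    return list(TopologicalSorter(depends_on).static_order())


def restart_set(depends_on, restarted):
    # Collect the restarted resource plus everything that depends on it,
    # directly or transitively.
    affected = {restarted}
    changed = True
    while changed:
        changed = False
        for resource, deps in depends_on.items():
            if resource not in affected and deps & affected:
                affected.add(resource)
                changed = True
    return affected
```

Restarting `ip_address` in this example also restarts `network_name` and `database`, while restarting `database` affects nothing else.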
Resource Groups

• A Resource Group is the unit of migration (failover).
• Although a resource dependency tree describes the resources which must fail over together, there may be additional considerations for grouping resources into migration units.
• The cluster administrator can assign a collection of independent resource dependency trees to a single resource group.
• When the group needs to migrate to another node in the cluster, all the resources in the group will move to the new location.
Cluster Configuration Database

• All configuration data necessary to start the cluster is kept in the Cluster Configuration Database.
• The database, replicated at each node in the cluster, is accessed through the Registry (the standard Windows NT configuration database).
• The Cluster Service ensures that the replica of the configuration database is correct at each active node.
• When a node joins the cluster, it contacts an active member to determine the current version of the database and to synchronize its local replica of the configuration database.
• Updates to the database during regular operation are applied to the master copy and to all the replicas.
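
The join-time synchronization and the update rule can be illustrated with a small sketch. It assumes a simple integer version counter and an in-memory dictionary; the master copy on the Quorum Resource and the change log are left out, and none of the names below come from MSCS.

```python
class Replica:
    """A node's local copy of the configuration database (illustrative only)."""

    def __init__(self):
        self.version = 0
        self.data = {}

    def apply_update(self, key, value):
        # Every applied update advances the version by one.
        self.data[key] = value
        self.version += 1

    def sync_from(self, active_member):
        # A joining node adopts the active member's state if its own copy is stale.
        if active_member.version > self.version:
            self.data = dict(active_member.data)
            self.version = active_member.version


def update_all(replicas, key, value):
    # During regular operation every replica receives the same update.
    for replica in replicas:
        replica.apply_update(key, value)
```

A node that was offline for two updates calls `sync_from` on an active member and then receives later updates through `update_all` like every other replica.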
Clustering Technologies

• Microsoft servers provide three technologies to support clustering:
   ◦ Network Load Balancing (NLB)
   ◦ Component Load Balancing (CLB)
   ◦ Microsoft Cluster Service (MSCS)
Network Load Balancing

• Network Load Balancing (NLB) acts as a front-end cluster, distributing incoming IP traffic across a cluster of servers, and is well suited to providing incremental scalability and high availability for e-commerce Web sites.
• NLB enhances scalability by distributing client requests across multiple servers within the cluster.
• NLB also provides high availability by automatically detecting the failure of a server and repartitioning client traffic among the remaining servers within 10 seconds, so that users continue to receive service.
Component Load Balancing

• Component Load Balancing (CLB) distributes workload across multiple servers running a site's business logic.
• It provides for dynamic balancing of COM+ components across a set of up to eight identical servers.
• CLB complements both NLB and Cluster Service by acting on the middle tier of a multi-tiered clustered network.
• Both CLB and Microsoft Cluster Service can run on the same group of machines.
Microsoft Cluster Service

• Microsoft Cluster Server (MSCS) is software designed to allow servers to work together as a computer cluster, providing failover and increased availability of applications, or parallel computing power in the case of high-performance computing clusters.
• MSCS acts as a back-end cluster, providing high availability for applications such as databases, messaging, and file and print services.
• MSCS attempts to minimize the effect on the system when any node (a server in the cluster) fails or is taken offline.
MSCS goals

• Commodity
• Scalability
• Transparency
• Reliability
Commodity

• The cluster runs on a collection of off-the-shelf computer nodes interconnected by a generic network.
• The operating system is a standard commercial version of Windows NT Server, and network communication uses the standard Internet protocols.
Scalability

• Adding applications, nodes, peripherals, and network interconnects is possible without interrupting the availability of the services at the cluster.
Transparency

• The cluster, built out of a group of loosely coupled, independent computer nodes, appears as a single system to clients outside the cluster.
• Client applications interact with the cluster as if it were a single high-performance, highly reliable server.
• The clients are not affected by interaction with the cluster and do not need modification.
• System management tools access and manage the services at the cluster as if it were one single server.
Reliability

• The Cluster Service is able to detect failures of the hardware and software resources it manages.
• In case of failure, the Cluster Service can restart failed applications on other nodes in the cluster.
• The restart policy is part of the cluster configuration.
• A failure can also cause ownership of other resources (shared disks, network names, etc.) to migrate to other nodes in the system.
• Hardware and software can be upgraded in a phased manner without interrupting the availability of the services in the cluster.
Cluster Operation

• Four areas of particular interest in an MSCS cluster:
   ◦ Cluster Membership Activities
   ◦ Resource Management and Resource Failure handling
   ◦ Application state failover
   ◦ Cluster Management
Cluster Membership Activity

• When a cluster node restarts, it can take one of two distinct paths:
   ◦ If there are already active nodes in the cluster, the new node will synchronize with these nodes and join the cluster (i.e. become active).
   ◦ If the node cannot discover any other active cluster nodes, it will try to form a cluster by itself. It will assume it is the first node to start and that other nodes will join later.
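
The two start-up paths can be expressed as a single decision. This is a minimal sketch: `discover_active_nodes` is a hypothetical callback supplied by the caller, and the arbitration through the Quorum Resource that a real cluster formation requires is omitted.

```python
def start_node(node, discover_active_nodes):
    """Return the path a restarting node takes: 'join' or 'form'."""
    active = discover_active_nodes()  # expected to return a set of node names
    if active:
        # Other members exist: synchronize with them and become active.
        return {"action": "join", "members": sorted(active | {node})}
    # Nobody answered: assume this node is first and form a cluster alone.
    return {"action": "form", "members": [node]}
```

For example, `start_node("n2", lambda: {"n1"})` takes the join path, while `start_node("n1", lambda: set())` takes the form path.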
Resource Management

• The Cluster Service manages resources by invoking a pre-defined set of calls to a resource control program library, provided when the resource type is defined.
• Central to this is the means by which the Cluster Service can monitor the state of the resource.
• The resource control libraries present a polymorphic state transition mechanism.
• The resource control libraries for each type hide the complexity of managing state changes for that resource type.
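
The idea of a polymorphic state transition mechanism can be sketched with an abstract base class: the caller uses the same calls for every resource type, and each type hides its own details. The method names and the toy resource type below are assumptions for illustration, not the actual resource library entry points.

```python
from abc import ABC, abstractmethod


class ResourceControl(ABC):
    """Uniform calls a cluster service could make; names are hypothetical."""

    def __init__(self):
        self.state = "offline"

    @abstractmethod
    def _start(self): ...

    @abstractmethod
    def _stop(self): ...

    @abstractmethod
    def is_alive(self): ...

    def online(self):
        # Type-specific work happens in _start; the state change is uniform.
        self._start()
        self.state = "online"

    def offline(self):
        self._stop()
        self.state = "offline"


class InMemoryService(ResourceControl):
    """Toy resource type standing in for a disk, IP address, or application."""

    def __init__(self):
        super().__init__()
        self.running = False

    def _start(self):
        self.running = True

    def _stop(self):
        self.running = False

    def is_alive(self):
        return self.running
```

A monitoring loop would call `is_alive()` on every resource without knowing its concrete type.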
Resource Migration

• A resource group may migrate to another node for several reasons:
   ◦ the original node fails,
   ◦ the resource fails at the original node,
   ◦ the resource group prefers to execute at the other node, or
   ◦ the operator requests the group to move.
• In the first case, the Cluster Services on the surviving nodes pull the resource groups over.
• In the other cases, the owning Cluster Service pushes the resource group to the other node.
Resource Failover
[Figure: resource failover]
Node Failover
[Figure: node failover]
Pushing a group

• If a resource fails, the local Cluster Service repeatedly tries to bring the resource online.
• Failing that, the Cluster Service will optionally move the containing resource group to another node.
• First, all resources in the resource group are taken to the offline state.
• A new active host node is selected, and the resource group is brought online at the new hosting node by its local Cluster Service.
• This process is called pushing a group to another node.
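
The push sequence above can be sketched as a single function. The group layout, the `try_online` callback, the retry count, and the choice of the first candidate node are all assumptions made for illustration; the real selection and restart policy live in the cluster configuration.

```python
def push_on_failure(group, failed, try_online, other_nodes,
                    max_retries=3, move_allowed=True):
    """Retry a failed resource locally, then push its group; return the host.

    group: dict with 'owner' (node name) and 'resources' (name -> state).
    try_online: callable(resource_name, node) -> bool, supplied by the caller.
    """
    # Step 1: repeatedly try to bring the failed resource online in place.
    for _ in range(max_retries):
        if try_online(failed, group["owner"]):
            group["resources"][failed] = "online"
            return group["owner"]
    # Moving the group is optional and needs somewhere to go.
    if not move_allowed or not other_nodes:
        return None
    # Step 2: take every resource in the group offline.
    for name in group["resources"]:
        group["resources"][name] = "offline"
    # Step 3: select a new host (here simply the first candidate) and
    # bring the whole group online there.
    group["owner"] = other_nodes[0]
    for name in group["resources"]:
        ok = try_online(name, group["owner"])
        group["resources"][name] = "online" if ok else "failed"
    return group["owner"]
```

The function returns `None` when the local retries fail and the group cannot or may not be moved, which leaves the decision about what to do next to the caller.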
Pulling a group (1)

• When an active node fails, its resource groups must be pulled to the other active nodes.
• This process is similar to pushing a resource group, but without the shutdown phase on the failed node.
• The complication here is determining which groups were running on the failed node and which node should take ownership of the various groups.
• The selection is based on node capabilities, the group's preferred owner list, and a simple tie-breaker rule in case the nodes cannot decide which node should be the new host.
Pulling a group (2)

• The replicated cluster database gives all nodes full knowledge of the resource groups on the failed node.
• Hence, the nodes can determine the new hosts without communicating with one another.
• Each active node pulls (brings online) the resource groups it now owns.
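
Because every survivor holds the same replicated data, each can run the same deterministic selection and reach the same answer without exchanging messages. The sketch below assumes hypothetical field names (`possible_owners`, `preferred_owners`) and uses alphabetical node order as the tie-breaker, which is an assumption; the slides only say the rule is simple.

```python
def new_owner(group, survivors):
    """Pick the new host for one group using only replicated data."""
    # Keep only survivors that are capable of hosting this group.
    capable = [n for n in survivors if n in group["possible_owners"]]
    if not capable:
        return None
    preferred = group["preferred_owners"]

    def rank(node):
        # Earlier position in the preferred owner list wins; ties are
        # broken by node name (assumed tie-breaker).
        position = preferred.index(node) if node in preferred else len(preferred)
        return (position, node)

    return min(capable, key=rank)


def groups_to_pull(me, failed_node, groups, survivors):
    # Each survivor evaluates the same function on the same data and
    # pulls only the groups assigned to itself.
    return sorted(
        name for name, g in groups.items()
        if g["owner"] == failed_node and new_owner(g, survivors) == me
    )
```

The result does not depend on the order in which `survivors` is listed, so two nodes with differently ordered membership lists still agree.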
Fail-back

• A resource group that migrated from its preferred owner is not automatically migrated back when the preferred owner rejoins the cluster.
• Migration back is constrained by the resource group failback window described in the Cluster Configuration Database.
• The failback window indicates how long the new node must be up and running before the resource group is migrated back to its preferred owner.
• It also indicates blackout periods when failbacks are deferred for cost or availability reasons (migration causes temporary service outage).
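
The two constraints of the failback window, minimum uptime and blackout periods, can be checked as in the sketch below. The parameter names and the use of whole hours for the blackout period are assumptions; the slides do not describe how the window is stored.

```python
def may_fail_back(owner_uptime_s, min_uptime_s, hour_now, blackout_hours):
    """Decide whether a group may return to its preferred owner right now."""
    # The rejoined preferred owner must have been up long enough.
    if owner_uptime_s < min_uptime_s:
        return False
    # Failback is deferred during the blackout period, e.g. business hours.
    start, end = blackout_hours
    if start <= end:
        in_blackout = start <= hour_now < end
    else:
        # The blackout period wraps around midnight.
        in_blackout = hour_now >= start or hour_now < end
    return not in_blackout
```

With a blackout of 08:00 to 18:00 and a five-minute minimum uptime, a group would move back at 22:00 once the preferred owner has been up for ten minutes, but not at 09:00.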
SAP R/3 and MSCS

• SAP supports a configuration based on a two-node cluster in an active–active mode.
SAP R/3 and MSCS

• A supported SAP cluster configuration must meet the following conditions:
   ◦ The hardware must be certified for SAP and must be approved by Microsoft for MSCS.
   ◦ The operating system must be approved for MSCS.
   ◦ The DBMS must be cluster-aware.
• Two Windows DLLs were developed to make SAP R/3 a cluster-aware application:
   ◦ SAPRC.DLL, which allows MSCS to check the status of the SAP R/3 system and to start and stop the SAP services in case of failover;
   ◦ SAPRCEX.DLL, which allows MSCS to manage the SAP resources.
SAP integration with MSCS

• By grouping various resources together using the MSCS group feature, a virtual machine is created that can move among all the nodes in the cluster.
• If any resource fails, the whole group fails over to the other machine seamlessly.
SAP integration with MSCS

[Figures: cluster state initially and after failure]
Hardware Configuration
[Figure: hardware configuration]
Thank You

Have a nice day!!!
