
Oracle 11g Release 2 Clusterware Stack

Introduction

Before we start, let's refresh our memory and recall what Oracle Clusterware is.

Oracle Clusterware is the software that enables the nodes to communicate with each other, allowing them to form a cluster of nodes that behaves as a single logical server.

Oracle 11g Release 2 Clusterware consists of two separate stacks:

• Oracle High Availability Services stack - the lower stack, anchored by the Oracle High Availability Services daemon (ohasd).
• Cluster Ready Services stack - the upper stack, anchored by the Cluster Ready Services (CRS) daemon (crsd).

These two stacks have several processes that facilitate cluster operations.

Oracle Clusterware Component          Linux/UNIX Process
CRS                                   crsd.bin (r)
CSS                                   ocssd.bin, cssdmonitor, cssdagent
CTSS                                  octssd.bin (r)
EVM                                   evmd.bin, evmlogger.bin
GIPC                                  gipcd.bin
GNS                                   gnsd (r)
Grid Plug and Play                    gpnpd.bin
LOGGER                                ologgerd.bin (r)
Master Diskmon                        diskmon.bin
mDNS                                  mdnsd.bin
Oracle agent                          oraagent.bin (11.2), or racgmain and racgimon (11.1)
Oracle High Availability Services     ohasd.bin (r)
ONS                                   ons
Oracle root agent                     orarootagent (r)
SYSMON                                osysmond.bin (r)

(r) indicates a process that runs as the root user.

The following sections describe these stacks in more detail:


1. Oracle High Availability Services (OHAS)

Starting with Oracle 11g Release 2 Real Application Clusters (RAC), a new process called Oracle High Availability Services (OHAS) was introduced.

At the OS layer, Oracle High Availability Services is implemented via a new daemon process called ohasd. As the 11g documentation notes, this Oracle High Availability Services daemon (ohasd) anchors the lower part of the Oracle Clusterware stack, which consists of processes that facilitate cluster operations in RAC databases. This includes the GPNPD, GIPC, MDNS, and GNS background processes on Linux and UNIX operating systems, or services on Windows. We will discuss each of them in more detail later in this post.

In a cluster, the OHAS daemon runs as the root user. In an Oracle Restart environment it runs as
the oracle user.

[root@node1 ~]# ps -ef | grep ohas | grep -v grep


root 3226 1 0 May18 ? 00:02:12 /u01/app/11.2.0/grid/bin/ohasd.bin reboot
root 3345 1 0 May18 ? 00:00:00 /bin/sh /etc/init.d/init.ohasd run

The high availability stack is based on the Oracle High Availability Services daemon (ohasd.bin). This daemon is responsible for starting all other Oracle Clusterware processes, and for managing and maintaining the OLR (Oracle Local Registry).
There are no direct commands for Oracle High Availability Services; continue to use the traditional crsctl commands for cluster management.
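For example, a quick health check with crsctl (the message numbers below are typical of 11.2; exact output may vary by version):

[root@node1 ~]# crsctl check has
CRS-4638: Oracle High Availability Services is online
[root@node1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online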

The background processes that are indirectly controlled by OHAS are the CRS daemon, the CSS daemon, and the EVM daemon; we will discuss each of them in more detail later in this post. The advantage of having the OHAS daemon do this is that the administrator can issue cluster-wide commands. The OHAS daemon will start even if Grid Infrastructure has been explicitly disabled.

According to Oracle 11g Documentation, processes that comprise the Oracle High Availability
Services stack are:

• Grid Plug and Play (GPNPD): Provides access to the Grid Plug and Play profile, and coordinates updates to the profile among the nodes of the cluster to ensure that all of the nodes have the most recent profile.
• Grid Interprocess Communication (GIPC): A helper daemon for the communications infrastructure. In the initial 11.2 release it had no functionality of its own; from 11.2.0.2 onward it supports Redundant Interconnect Usage.
• Multicast Domain Name Service (mDNS): Allows DNS requests to be resolved within the cluster using multicast. The mDNS process is a background process on Linux and UNIX, and a service on Windows.
• Oracle Grid Naming Service (GNS): A gateway between the cluster mDNS service and external DNS servers. The gnsd process performs name resolution within the cluster.

Enabling Oracle High Availability Services

To enable OHAS on each RAC node, issue the crsctl enable crs command; this causes OHAS to autostart when each node reboots. To verify that OHAS is running, check for the CRS-4123 message in your alert log:

CRS-4123: Oracle High Availability Services has been started.


ohasd is starting
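A minimal sketch of enabling autostart and verifying the setting (the exact CRS message number may differ by version):

[root@node1 ~]# crsctl enable crs
CRS-4622: Oracle High Availability Services autostart is enabled.
[root@node1 ~]# crsctl config crs
CRS-4622: Oracle High Availability Services autostart is enabled.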

Oracle High Availability Services (OHAS) Processes.

1. Cluster Logger Service (ologgerd)

[root@node1 ~]# ps -ef | grep -v grep | grep ologgerd


root 4207 1 0 May18 ? 00:03:19 /u01/app/11.2.0/grid/bin/ologgerd -M -d
/u01/app/11.2.0/grid/crf/db/node1

ologgerd - This service runs on only two nodes in a cluster: the master cluster logger service (ologgerd) runs on one node, and the master chooses another node to host the standby for the master cluster logger service. If the master cluster logger service fails (because the service is not able to come up after a fixed number of retries, or because the node where the master was running is down), the node where the standby resides takes over as master and selects a new node for the standby. The master manages the operating system metric database in the CHM (Cluster Health Monitor) repository and interacts with the standby to maintain a replica of the master operating system metrics database.
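To find out which node currently hosts the master cluster logger service, you can query CHM with the oclumon utility (a sketch; the node name is an example):

[root@node1 ~]# oclumon manage -get MASTER
Master = node1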

2. System Monitor Service (sysmond)

[root@node1 ~]# ps -ef | grep -v grep | grep osysmond


root 3576 1 0 May18 ? 00:04:19 /u01/app/11.2.0/grid/bin/osysmond.bin

osysmond.bin - There is one system monitor service on every node. The system monitor service (osysmond) is the monitoring and operating system metric collection service that sends its data to the cluster logger service. The cluster logger service receives the information from all the nodes and persists it in the CHM repository database.
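The metrics that osysmond collects can be browsed from the CHM repository with oclumon; for example, to dump the last five minutes of node metrics (a sketch; the output is verbose and omitted here):

[root@node1 ~]# oclumon dumpnodeview -n node1 -last "00:05:00"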

3. Grid Plug and Play (GPNP)

[root@node1 ~]# ps -ef | grep gpn | grep -v grep


oracle 3548 1 0 May18 ? 00:00:23 /u01/app/11.2.0/grid/bin/gpnpd.bin
gpnpd.bin - Provides access to the Grid Plug and Play profile, and coordinates updates to the profile
among the nodes of the cluster to ensure that all of the nodes have the most recent profile.
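The profile itself is a small XML file holding cluster-wide bootstrap settings (for example, network classifications and the ASM discovery string). It can be dumped with the gpnptool utility (a sketch; run from the Grid home as the Grid software owner):

[oracle@node1 ~]$ gpnptool get
(prints the GPnP profile XML to standard output)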

4. Grid Interprocess Communication (GIPC)

[root@node1 ~]# ps -ef | grep -v grep | grep gipc


oracle 3562 1 0 May18 ? 00:02:55 /u01/app/11.2.0/grid/bin/gipcd.bin

gipcd.bin - A support daemon that enables Redundant Interconnect Usage.

5. Multicast Domain Name Service (MDNS)

[root@node1 ~]# ps -ef | grep -v grep | grep dns


oracle 3538 1 0 May18 ? 00:00:01 /u01/app/11.2.0/grid/bin/mdnsd.bin

mdnsd.bin - Used by Grid Plug and Play to locate profiles in the cluster, as well as by GNS to
perform name resolution.

6. Oracle Grid Naming Service (GNS)

gnsd.bin - Handles requests sent by external DNS servers, performing name resolution for names defined by the cluster. Note that gnsd runs only in clusters where GNS is configured, which is why it may not appear in a process listing.

Cluster Ready Services Stack


Oracle Clusterware is run by Cluster Ready Services (CRS), which relies on two key components:

1. Oracle Cluster Registry (OCR) - records and maintains the cluster and node membership information;
2. Voting disk - polls for consistent heartbeat information from all the nodes while the cluster is running, and acts as a tie-breaker during communication failures (see the command sketch after this list).
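Both components can be inspected from the command line (a sketch; locations and contents depend on your installation):

[root@node1 ~]# ocrcheck
(reports the OCR version, total and used space, and the OCR location)
[root@node1 ~]# crsctl query css votedisk
(lists each voting disk with its identifier and location)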

The CRS service itself has three main components, each handling a variety of functions:

1. Cluster Ready Services daemon (CRSd) - The primary program for managing high availability
operations in a cluster.
2. Cluster Synchronization Services daemon (CSSd)
3. Event Management daemon (EVMd)

Failure or death of the CRS daemon does not bring a node down; crsd is simply restarted automatically. It is the failure of the CSS daemon that triggers an automatic reboot of the node to avoid data corruption (due to the possible failure of communication between the nodes), a mechanism also known as fencing.
The CRS daemon runs as "root" (super user) on UNIX platforms and runs as a service on Windows platforms.

Cluster Ready Services daemon (CRSd)

The following functions are provided by the Oracle Cluster Ready Services daemon (CRSd):

• CRS is installed in, and runs from, its own home directory, known as ORA_CRS_HOME or the Grid home, which is independent of ORACLE_HOME.
• CRSd manages resources: it starts and stops services and fails over application resources, spawning separate processes to manage application resources.
• The CRS daemon has two startup modes. After a planned Clusterware start it runs in 'reboot' mode, in which it auto-starts all the resources under its management. After an unplanned shutdown it runs in 'restart' mode, in which it reads the previously recorded state and returns the resources to the state they were in before the shutdown.
• CRSd manages the Oracle Cluster Registry and stores the currently known state of the cluster in it.
• CRSd runs as 'root' on UNIX and as 'LocalSystem' on Windows, and restarts automatically on failure.
• CRS requires the public interface, the private interface, and the Virtual IP (VIP) for its operation. All of these interfaces should be up, running, and able to ping each other before the CRS installation is started; without this network infrastructure CRS cannot be installed (see the oifcfg sketch after this list).
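The network interface configuration that Clusterware knows about can be checked with the oifcfg utility (a sketch; interface names and subnets are examples):

[root@node1 ~]# oifcfg getif
eth0  192.168.56.0  global  public
eth1  10.0.0.0  global  cluster_interconnect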

[root@node1 ~]# ps -ef | grep crs | grep -v grep

root 4330 1 0 May18 ? 00:00:51 /u01/app/11.2.0/grid/bin/crsd.bin reboot

The crsd process is responsible for starting, stopping, monitoring, and failing over resources. It maintains the OCR and restarts resources when a failure occurs. This applies to RAC systems; for Oracle Restart and ASM, ohasd is used instead.

The CRS daemon (crsd) manages cluster resources based on the configuration information that is
stored in OCR for each resource. This includes start, stop, monitor, and failover operations. The crsd
process generates events when the status of a resource changes. When you have Oracle RAC
installed, the crsd process monitors the Oracle database instance, listener, and so on, and
automatically restarts these components when a failure occurs.
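To see which resources crsd is currently managing and their states, you can use crsctl (a sketch; the resource list depends on what is configured in your cluster):

[root@node1 ~]# crsctl stat res -t
(shows the ora.* resources, such as the ASM, network, VIP, listener, and database resources, with their TARGET and STATE on each node)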

Cluster Synchronization Services daemon (CSSd)

Cluster Synchronization Services daemon (CSSd) provides basic 'group services' support. Group Services is a distributed group membership system that allows applications to coordinate activities to achieve a common result. As such, CSSd provides synchronization services between nodes and access to node membership information, and it enables basic cluster services, including cluster group services and cluster locking. CSSd also manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then CSS processes interface with your clusterware to manage node membership information.

Failure of CSSd causes the machine to reboot to avoid a split-brain situation. This is also required in
a single instance configuration if Automatic Storage Management (ASM) is used. OCSSd runs as the
"oracle" user.

The following functions are provided by the Oracle Cluster Synchronization Services daemon
(OCSSd):

 ‘Lock Services’ provides the basic cluster-wide serialization locking functions, and uses a
FIFO mechanism to manage locking
 'Node Services' uses OCR to store state data, and updates the information during
reconfiguration. It also manages the OCR data, which is static otherwise.

[root@node1 ~]# ps -ef | grep -v grep | grep css


root 3699 1 0 May18 ? 00:00:07 /u01/app/11.2.0/grid/bin/cssdmonitor
root 3747 1 0 May18 ? 00:00:21 /u01/app/11.2.0/grid/bin/cssdagent
oracle 3840 1 0 May18 ? 00:11:40 /u01/app/11.2.0/grid/bin/ocssd.bin

cssdmonitor - Monitors the node for hangs, monitors the OCSSD process, and monitors vendor clusterware. It is a multithreaded process that runs with elevated priority.

Startup sequence: INIT => init.ohasd => ohasd => ohasd.bin => cssdmonitor

cssdagent - Spawned by the OHASD process, the cssdagent process monitors the cluster and provides I/O fencing. This service was formerly provided by the Oracle Process Monitor daemon (oprocd in 10g), also known as OraFenceService on Windows. A cssdagent failure may result in Oracle Clusterware restarting the node. It also stops, starts, and checks the status of the ocssd.bin daemon.

Startup sequence: INIT => init.ohasd => ohasd => ohasd.bin => cssdagent

ocssd.bin - Manages cluster node membership and runs as the Grid Infrastructure owner (the oracle user in the listing above). Failure of this process results in a node restart.

Startup sequence: INIT => init.ohasd => ohasd => ohasd.bin => cssdagent => ocssd =>
ocssd.bin
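The node membership that CSS maintains can be listed with the olsnodes utility (a sketch; -n shows node numbers and -s shows node status):

[oracle@node1 ~]$ olsnodes -n -s
node1   1       Active
node2   2       Active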

Oracle ASM:

Provides disk management for Oracle Clusterware and Oracle Database.


Event Management daemon (EVMd)

The third component in the CRS stack is the Event Management daemon (EVMd). EVMd spawns a permanent child process called evmlogger and generates events. The evmlogger child process spawns new child processes on demand and scans the callout directory to invoke callouts. EVMd restarts automatically on failure, and the death of the EVMd process does not halt the instance. EVMd runs as the "oracle" user.

[root@node1 ~]# ps -ef | grep evm | grep -v grep


oracle 4049 1 0 May18 ? 00:00:18 /u01/app/11.2.0/grid/bin/evmd.bin
oracle 4399 4049 0 May18 ? 00:00:00 /u01/app/11.2.0/grid/bin/evmlogger.bin -o
/u01/app/11.2.0/grid/evm/log/evmlogger.info -l /u01/app/11.2.0/grid/evm/log/evmlogger.log

evmd.bin - Distributes and communicates some cluster events to all of the cluster members so that
they are aware of the cluster changes.

A background process that publishes events that Oracle Clusterware creates.

evmlogger.bin - Started by evmd.bin; it reads the configuration files, determines which events to subscribe to from EVMd, and runs user-defined actions for those events.
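Events published through EVM can be observed from the command line with the evmwatch and evmshow utilities shipped in the Grid home (a sketch; the format string is an example from the Clusterware documentation):

[oracle@node1 ~]$ evmwatch -A -t "@timestamp @@"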

Oracle Root Agent

A specialized agent process (orarootagent.bin) that helps crsd manage resources owned by root, such as the network and the Grid Virtual IP address.

[root@node1 ~]# ps -ef | grep orarootagent | grep -v grep


root 3559 1 0 May18 ? 00:03:40 /u01/app/11.2.0/grid/bin/orarootagent.bin
root 4441 1 0 May18 ? 00:07:09 /u01/app/11.2.0/grid/bin/orarootagent.bin

The two processes above are actually threads that appear as processes; this is Linux-specific behavior.

Cluster Time Synchronization Service daemon (CTSSd)

octssd.bin - Provides Time Management in a cluster for Oracle Clusterware

[root@node1 ~]# ps -ef | grep ctss | grep -v grep


root 4015 1 0 May18 ? 00:00:47 /u01/app/11.2.0/grid/bin/octssd.bin reboot
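CTSS runs in observer mode when another time synchronization service (such as NTP) is active on the cluster nodes, and in active mode otherwise. You can check the mode with crsctl (a sketch; the message is typical of 11.2):

[root@node1 ~]# crsctl check ctss
CRS-4700: The Cluster Time Synchronization Service is in Observer mode.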

Oracle Notification Service (ONS):

A publish and subscribe service for communicating Fast Application Notification (FAN) events.
[oracle@node1 ~]$ ps -ef | grep ons | grep -v grep
oracle 4656 1 0 15:50 ? 00:00:00 /u01/app/11.2.0/grid/opmn/bin/ons -d
oracle 4657 4656 0 15:50 ? 00:00:00 /u01/app/11.2.0/grid/opmn/bin/ons -d
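ONS can be checked with the onsctl utility from the Grid home (a sketch; the reported details will differ):

[oracle@node1 ~]$ /u01/app/11.2.0/grid/opmn/bin/onsctl ping
ons is running ...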

Oracle Agent (oraagent)

Extends clusterware to support Oracle-specific requirements and complex resources. This process
runs server callout scripts when FAN events occur. This process was known as RACG in Oracle
Clusterware 11g release 1 (11.1).

[oracle@node1 ~]$ ps -ef | grep oraagent | grep -v grep


oracle 3648 1 0 15:49 ? 00:00:00 /u01/app/11.2.0/grid/bin/oraagent.bin
oracle 4543 1 0 15:50 ? 00:00:01 /u01/app/11.2.0/grid/bin/oraagent.bin

The Cluster Synchronization Service (CSS), Event Management (EVM), and Oracle Notification
Services (ONS) components communicate with other cluster component layers on other nodes in
the same cluster database environment. These components are also the main communication links
between Oracle Database, applications, and the Oracle Clusterware high availability components. In
addition, these background processes monitor and manage database operations.

Oracle 11g Release 2 Clusterware startup and process stacks


Master Diskmon

The master diskmon process (diskmon.bin) can be seen running in all Grid Infrastructure installations, but only on Exadata is it actually doing any work. On every compute node there is one master diskmon process and one slave diskmon process (DSKM) per Oracle instance (including ASM).
In Exadata, diskmon is responsible for:
• Handling storage cell failures and I/O fencing
• Monitoring the Exadata Server state on all storage cells in the cluster (heartbeat)
• Broadcasting intra-database IORM (I/O Resource Manager) plans from databases to storage cells
• Monitoring the control messages from database and ASM instances to storage cells
• Communicating with other diskmons in the cluster

References:
https://en.wikipedia.org/wiki/Oracle_Clusterware
Oracle Clusterware Administration and Deployment Guide
https://blogs.oracle.com/myoraclediary/entry/clusterware_processes_in_11g_rac
http://logicalread.solarwinds.com/oracle-crs-in-11g-mc04/#.Vz6nwOSUSec
https://docs.oracle.com/cd/E11882_01/rac.112/e41959/intro.htm#CWADD1111
http://www.dba-oracle.com/t_oracle_high_availability_services_ohas.htm

Best regards: Javid Hasanov.


javidhasanov.wordpress.com
