Sun Cluster 3.1 Cheat Sheet



Daemons

clexecd        Used by cluster kernel threads to execute userland commands (such as
               the run_reserve and dofsck commands). It is also used to run cluster
               commands remotely (such as the cluster shutdown command). This daemon
               registers with failfastd so that a failfast device driver will panic
               the kernel if the daemon is killed and not restarted in 30 seconds.

cl_ccrad       Provides access from userland management applications to the CCR.
               It is automatically restarted if it is stopped.

cl_eventd      The cluster event daemon registers and forwards cluster events (such
               as nodes entering and leaving the cluster). There is also a protocol
               whereby user applications can register themselves to receive cluster
               events. The daemon is automatically respawned if it is killed.

cl_eventlogd   The cluster event log daemon logs cluster events into a binary log
               file. At the time of writing there is no published interface to this
               log. It is automatically restarted if it is stopped.

failfastd      The failfast proxy server. The failfast daemon allows the kernel to
               panic if certain essential daemons have failed.

rgmd           The resource group management daemon, which manages the state of all
               cluster-unaware applications. A failfast driver panics the kernel if
               this daemon is killed and not restarted in 30 seconds.

rpc.fed        The fork-and-exec daemon, which handles requests from rgmd to spawn
               methods for specific data services. A failfast driver panics the
               kernel if this daemon is killed and not restarted in 30 seconds.

rpc.pmfd       The process monitoring facility. It is used as a general mechanism to
               initiate restarts and failure action scripts for some cluster
               framework daemons (in the Solaris 9 OS), and for most application
               daemons and application fault monitors (in the Solaris 9 and 10 OS).
               A failfast driver panics the kernel if this daemon is stopped and not
               restarted in 30 seconds.

pnmd           The public management network service daemon manages network status
               information received from the local IPMP daemon running on each node
               and facilitates application failovers caused by complete public
               network failures on nodes. It is automatically restarted if it is
               stopped.

scdpmd         The disk path monitoring daemon monitors the status of disk paths, so
               that they can be reported in the output of the cldev status command.
               It is automatically restarted if it is stopped.
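
A quick way to confirm that these framework daemons are running on a node is to check the process table (an illustrative check only, assuming standard Solaris ps and egrep; the daemon list is taken from the table above):

# ps -ef | egrep 'clexecd|cl_ccrad|cl_eventd|cl_eventlogd|rgmd|rpc.fed|rpc.pmfd|pnmd|scdpmd'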

File locations

man pages                      /usr/cluster/man
log files                      /var/cluster/logs, /var/adm/messages
sccheck logs                   /var/cluster/sccheck/report.<date>
CCR files                      /etc/cluster/ccr
Cluster infrastructure file    /etc/cluster/ccr/infrastructure
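
For example, to follow cluster messages on a node and list the CCR tables (illustrative commands built from the default locations above):

# tail -f /var/adm/messages
# ls /etc/cluster/ccr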

SCSI Reservations

Display reservation keys      scsi2: /usr/cluster/lib/sc/pgre -c pgre_inkeys -d /dev/did/rdsk/d4s2
                              scsi3: /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d4s2

Determine the device owner    scsi2: /usr/cluster/lib/sc/pgre -c pgre_inresv -d /dev/did/rdsk/d4s2
                              scsi3: /usr/cluster/lib/sc/scsi -c inresv -d /dev/did/rdsk/d4s2
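
A worked sequence might first map a shared disk to its DID instance and then query the SCSI-3 keys and current owner (d4 is only an illustrative DID device; substitute your own):

# scdidadm -L | grep d4
# /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d4s2
# /usr/cluster/lib/sc/scsi -c inresv -d /dev/did/rdsk/d4s2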

Cluster information

Quorum info                                        scstat -q
Cluster components                                 scstat -pv
Resource/Resource group status                     scstat -g
IP Networking Multipathing                         scstat -i
Status of all nodes                                scstat -n
Disk device groups                                 scstat -D
Transport info                                     scstat -W
Detailed resource/resource group info              scrgadm -pv
Cluster configuration info                         scconf -p
Installation info (prints packages and version)    scinstall -pv
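
A quick overall health check can simply chain a few of these together, for example:

# scstat -n; scstat -q; scstat -g; scstat -D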

Cluster Configuration

Integrity check                                    sccheck
Configure the cluster (add nodes, add data
services, etc.)                                    scinstall
Cluster configuration utility (quorum, data
services, resource groups, etc.)                   scsetup
Add a node                                         scconf -a -T node=<host>
Remove a node                                      scconf -r -T node=<host>
Prevent new nodes from entering                    scconf -a -T node=.
Put a node into maintenance state                  scconf -c -q node=<node>,maintstate

                                                   Note: use the scstat -q command to verify that the node
                                                   is in maintenance mode; the vote count should be zero
                                                   for that node.

Get a node out of maintenance state                scconf -c -q node=<node>,reset

                                                   Note: use the scstat -q command to verify that the node
                                                   is out of maintenance mode; the vote count should be one
                                                   for that node.
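
For example, to place a node called node2 (a hypothetical name) into maintenance state from another cluster member, verify its vote count, and then bring it back:

# scconf -c -q node=node2,maintstate
# scstat -q
# scconf -c -q node=node2,reset
# scstat -q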

Admin Quorum Device

Quorum devices are nodes and disk devices, so the total quorum will be all nodes and devices added together.
You can use the menu-driven scsetup utility to add/remove quorum devices or use the commands below.

Adding a device to the quorum                      scconf -a -q globaldev=d11

                                                   Note: if you get the error message "unable to scrub
                                                   device", use scgdevs to add the device to the global
                                                   device namespace.

Removing a device from the quorum                  scconf -r -q globaldev=d11

Remove the last quorum device                      Evacuate all nodes, then put the cluster into install
                                                   (maintenance) mode:
                                                     # scconf -c -q installmode
                                                   Remove the quorum device:
                                                     # scconf -r -q globaldev=d11
                                                   Check the quorum devices:
                                                     # scstat -q

Resetting quorum info                              scconf -c -q reset

                                                   Note: this will bring all offline quorum devices online

Bring a quorum device into maintenance mode        Obtain the device number:
                                                     # scdidadm -L
                                                   Then place it in maintenance state:
                                                     # scconf -c -q globaldev=<device>,maintstate

Bring a quorum device out of maintenance mode      scconf -c -q globaldev=<device>,reset
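
For example, to pick a shared DID device, add it as a quorum device, and verify the new vote counts (d11 follows the example above; substitute your own device):

# scdidadm -L
# scconf -a -q globaldev=d11
# scstat -q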

Device Configuration
Lists all the configured devices including
paths across all nodes                             scdidadm -L

Lists all the configured devices including
paths on the local node only                       scdidadm -l

Reconfigure the device database, creating
new instance numbers if required                   scdidadm -r

Perform the repair procedure for a particular
path (use when a disk has been replaced)           scdidadm -R <c0t0d0s0>   (device path)
                                                   scdidadm -R 2            (device id)

Configure the global device namespace              scgdevs

Status of all disk paths                           scdpm -p all:all

                                                   Note: arguments take the form <host>:<disk>

Monitor device path                                scdpm -m <node:disk path>
Unmonitor device path                              scdpm -u <node:disk path>
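
For example, after physically replacing a failed disk you might re-run the repair procedure for its path and then confirm that all disk paths are healthy (c1t3d0 is a hypothetical device path):

# scdidadm -L | grep c1t3d0
# scdidadm -R c1t3d0
# scdpm -p all:all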

Disk groups

Adding/Registering                                 scconf -a -D type=vxvm,name=appdg,nodelist=<host>:<host>,preferenced=true
Removing                                           scconf -r -D name=<disk group>
Adding a single node                               scconf -a -D type=vxvm,name=appdg,nodelist=<host>
Removing a single node                             scconf -r -D name=<disk group>,nodelist=<host>
Switching                                          scswitch -z -D <disk group> -h <host>
Put into maintenance mode                          scswitch -m -D <disk group>
Take out of maintenance mode                       scswitch -z -D <disk group> -h <host>
Onlining a disk group                              scswitch -z -D <disk group> -h <host>
Offlining a disk group                             scswitch -F -D <disk group>
Resync a disk group                                scconf -c -D name=appdg,sync
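
For example, to register a VxVM disk group called appdg across two nodes and bring it online on node1 (disk group and node names are hypothetical):

# scconf -a -D type=vxvm,name=appdg,nodelist=node1:node2,preferenced=true
# scswitch -z -D appdg -h node1
# scstat -D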

Transport cable

Enable                                             scconf -c -m endpoint=<host>:qfe1,state=enabled

Disable                                            scconf -c -m endpoint=<host>:qfe1,state=disabled

                                                   Note: it gets deleted
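
For example, to check transport status and cycle one endpoint (node1 and adapter qfe1 are hypothetical):

# scstat -W
# scconf -c -m endpoint=node1:qfe1,state=disabled
# scconf -c -m endpoint=node1:qfe1,state=enabled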

Resource Groups

Adding                                             scrgadm -a -g <res_group> -h <host>,<host>
Removing                                           scrgadm -r -g <group>
Changing properties                                scrgadm -c -g <res_group> -y <property=value>
Listing                                            scstat -g
Detailed List                                      scrgadm -pv -g <res_group>
Display mode type (failover or scalable)           scrgadm -pv -g <res_group> | grep 'Res Group mode'
Offlining                                          scswitch -F -g <res_group>
Onlining                                           scswitch -Z -g <res_group>
Unmanaging                                         scswitch -u -g <res_group>

                                                   Note: all resources in the group must be disabled first

Managing                                           scswitch -o -g <res_group>
Switching                                          scswitch -z -g <res_group> -h <host>
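
For example, to create a two-node failover resource group, bring it online, and then switch it to the other node (group and node names are hypothetical):

# scrgadm -a -g app-rg -h node1,node2
# scswitch -Z -g app-rg
# scswitch -z -g app-rg -h node2
# scstat -g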

Resources

Adding a failover network resource                 scrgadm -a -L -g <res_group> -l <logicalhost>

Adding a shared network resource                   scrgadm -a -S -g <res_group> -l <logicalhost>

Adding a failover apache application and           scrgadm -a -j apache_res -g <res_group> \
attaching the network resource                       -t SUNW.apache -y Network_resources_used=<logicalhost> \
                                                     -y Scalable=False -y Port_list=80/tcp \
                                                     -x Bin_dir=/usr/apache/bin

Adding a shared apache application and             scrgadm -a -j apache_res -g <res_group> \
attaching the network resource                       -t SUNW.apache -y Network_resources_used=<logicalhost> \
                                                     -y Scalable=True -y Port_list=80/tcp \
                                                     -x Bin_dir=/usr/apache/bin

Create a HAStoragePlus failover resource           scrgadm -a -g rg_oracle -j hasp_data01 -t SUNW.HAStoragePlus \
                                                     -x FileSystemMountPoints=/oracle/data01 \
                                                     -x AffinityOn=true

Removing                                           scrgadm -r -j res-ip

                                                   Note: must disable the resource first

Changing properties                                scrgadm -c -j <resource> -y <property=value>

List                                               scstat -g

Detailed List                                      scrgadm -pv -j res-ip
                                                   scrgadm -pvv -j res-ip

Disable resource monitor                           scrgadm -n -M -j res-ip

Enable resource monitor                            scrgadm -e -M -j res-ip

Disabling                                          scswitch -n -j res-ip

Enabling                                           scswitch -e -j res-ip

Clearing a failed resource                         scswitch -c -h <host>,<host> -j <resource> -f STOP_FAILED

Find the network of a resource                     scrgadm -pvv -j <resource> | grep -i network

Removing a resource and resource group             Offline the group:
                                                     # scswitch -F -g rgroup-1
                                                   Remove the resource:
                                                     # scrgadm -r -j res-ip
                                                   Remove the resource group:
                                                     # scrgadm -r -g rgroup-1
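
Putting several of the rows above together, a sketch of building a simple failover web service (all resource, group, host, and mount-point names are hypothetical, and the SUNW.HAStoragePlus and SUNW.apache resource types must already be registered, see Resource Types below):

# scrgadm -a -g web-rg -h node1,node2
# scrgadm -a -L -g web-rg -l web-lh
# scrgadm -a -g web-rg -j web-hasp -t SUNW.HAStoragePlus \
    -x FileSystemMountPoints=/web/data -x AffinityOn=true
# scrgadm -a -j web-apache -g web-rg -t SUNW.apache \
    -y Network_resources_used=web-lh -y Scalable=False \
    -y Port_list=80/tcp -x Bin_dir=/usr/apache/bin
# scswitch -Z -g web-rg
# scstat -g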

Resource Types

Adding                                             scrgadm -a -t <resource type>    e.g. scrgadm -a -t SUNW.HAStoragePlus
Deleting                                           scrgadm -r -t <resource type>
Listing                                            scrgadm -pv | grep 'Res Type name'
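
For example, to register the SUNW.HAStoragePlus type used in the Resources section above and confirm that the cluster now knows about it:

# scrgadm -a -t SUNW.HAStoragePlus
# scrgadm -pv | grep 'Res Type name'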
