Managing HP Serviceguard for Linux,

Tenth Edition

HP Part Number: B9903-90073


Published: July 2009

Legal Notices

The information in this document is subject to change without notice.

Hewlett-Packard makes no warranty of any kind with regard to this manual, including, but not limited to, the implied warranties of
merchantability and fitness for a particular purpose. Hewlett-Packard shall not be held liable for errors contained herein or for direct,
indirect, special, incidental, or consequential damages in connection with the furnishing, performance, or use of this material.

Warranty. A copy of the specific warranty terms applicable to your Hewlett-Packard product and replacement parts can be
obtained from your local Sales and Service Office.

Restricted Rights Legend. Use, duplication or disclosure by the U.S. Government is subject to restrictions as set forth in
subparagraph (c) (1) (ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013 for DOD agencies,
and subparagraphs (c) (1) and (c) (2) of the Commercial Computer Software Restricted Rights clause at FAR 52.227-19 for other
agencies.

Hewlett-Packard Company
19420 Homestead Road
Cupertino, California 95014 U.S.A.

Use of this manual and flexible disk(s) or tape cartridge(s) supplied for this pack is restricted to this product only. Additional
copies of the programs may be made for security and back-up purposes only. Resale of the programs, in their present form or
with alterations, is expressly prohibited.

Copyright Notices

© Copyright 2001–2009 Hewlett-Packard Development Company, L.P.

Reproduction, adaptation, or translation of this document without prior written permission is prohibited, except as allowed under
copyright laws.

Trademark Notices

HP Serviceguard® is a registered trademark of Hewlett-Packard Company, and is protected by copyright.

NIS™ is a trademark of Sun Microsystems, Inc.

UNIX® is a registered trademark of The Open Group.

Linux® is a registered trademark of Linus Torvalds.

Red Hat® is a registered trademark of Red Hat Software, Inc.

SUSE® is a registered trademark of SUSE AG, a Novell Business.


Table of Contents
Printing History ...........................................................................................................................19

Preface.......................................................................................................................................21

1 Serviceguard for Linux at a Glance............................................................................................23


What is Serviceguard for Linux? .......................................................................................23
Failover..........................................................................................................................25
Using Serviceguard Manager.............................................................................................26
Monitoring Clusters with Serviceguard Manager........................................................26
Administering Clusters with Serviceguard Manager...................................................27
Configuring Clusters with Serviceguard Manager.......................................................27
Starting Serviceguard Manager.....................................................................................27
Configuration Roadmap.....................................................................................................27

2 Understanding Hardware Configurations for Serviceguard for Linux...............................................29


Redundant Cluster Components........................................................................................29
Redundant Network Components ....................................................................................30
Rules and Restrictions...................................................................................................30
Redundant Ethernet Configuration .............................................................................31
Cross-Subnet Configurations........................................................................................32
Configuration Tasks.................................................................................................33
Restrictions...............................................................................................................33
For More Information..............................................................................................34
Redundant Disk Storage.....................................................................................................34
Supported Disk Interfaces ............................................................................................35
Disk Monitoring............................................................................................................35
Sample Disk Configurations ........................................................................................35
Redundant Power Supplies ...............................................................................................36

3 Understanding Serviceguard Software Components.....................................................................37


Serviceguard Architecture..................................................................................................37
Serviceguard Daemons.................................................................................................38
Configuration Daemon: cmclconfd..........................................................................39
Cluster Daemon: cmcld............................................................................................39
Log Daemon: cmlogd...............................................................................................40
Network Manager Daemon: cmnetd.......................................................................40
Lock LUN Daemon: cmdisklockd............................................................................40
Cluster Object Manager Daemon: cmomd..............................................................40
Service Assistant Daemon: cmserviced...................................................................40
Quorum Server Daemon: qs....................................................................................41
Utility Daemon: cmlockd.........................................................................................41
Cluster SNMP Agent Daemon: cmsnmpd...............................................................41
Cluster WBEM Agent Daemon: cmwbemd.............................................................42
Proxy Daemon: cmproxyd.......................................................................................42
How the Cluster Manager Works ......................................................................................42
Configuration of the Cluster ........................................................................................42
Heartbeat Messages ......................................................................................................43
Manual Startup of Entire Cluster..................................................................................43
Automatic Cluster Startup ...........................................................................................44
Dynamic Cluster Re-formation ....................................................................................44
Cluster Quorum to Prevent Split-Brain Syndrome.......................................................44
Cluster Lock...................................................................................................................45
Use of a Lock LUN as the Cluster Lock........................................................................45
Use of the Quorum Server as a Cluster Lock................................................................46
No Cluster Lock ............................................................................................................48
What Happens when You Change the Quorum Configuration Online.......................48
How the Package Manager Works.....................................................................................49
Package Types...............................................................................................................49
Non-failover Packages.............................................................................................49
Failover Packages.....................................................................................................50
Configuring Failover Packages ..........................................................................50
Deciding When and Where to Run and Halt Failover Packages .......................51
Failover Packages’ Switching Behavior..............................................................51
Failover Policy....................................................................................................53
Automatic Rotating Standby..............................................................................54
Failback Policy....................................................................................................57
On Combining Failover and Failback Policies...................................................60
Using Older Package Configuration Files.....................................................................60
How Packages Run.............................................................................................................61
What Makes a Package Run?.........................................................................................61
Before the Control Script Starts.....................................................................................64
During Run Script Execution........................................................................................64
Normal and Abnormal Exits from the Run Script........................................................66
Service Startup with cmrunserv..................................................................................66
While Services are Running..........................................................................................67
When a Service or Subnet Fails, or a Dependency is Not Met......................................67
When a Package is Halted with a Command................................................................67
During Halt Script Execution........................................................................................68
Normal and Abnormal Exits from the Halt Script........................................................69
Package Control Script Error and Exit Conditions..................................................70
How the Network Manager Works ...................................................................................71
Stationary and Relocatable IP Addresses and Monitored Subnets...............................71
Types of IP Addresses...................................................................................................73
Adding and Deleting Relocatable IP Addresses ..........................................................73
Load Sharing ...........................................................................................................73
Bonding of LAN Interfaces ...........................................................................................74
Bonding for Load Balancing..........................................................................................77
Monitoring LAN Interfaces and Detecting Failure: Link Level....................................78
Monitoring LAN Interfaces and Detecting Failure: IP Level........................................78
Reasons To Use IP Monitoring.................................................................................79
How the IP Monitor Works......................................................................................79
Failure and Recovery Detection Times...............................................................81
Constraints and Limitations.....................................................................................81
Reporting Link-Level and IP-Level Failures.................................................................82
Package Switching and Relocatable IP Addresses........................................................82
Address Resolution Messages after Switching on the Same Subnet ...........................83
VLAN Configurations...................................................................................................83
What is VLAN?........................................................................................................83
Support for Linux VLAN.........................................................................................83
Configuration Restrictions.......................................................................................84
Additional Heartbeat Requirements........................................................................84
Volume Managers for Data Storage....................................................................................84
Storage on Arrays..........................................................................................................85
Monitoring Disks...........................................................................................................86
More Information on LVM............................................................................................86
About Persistent Reservations............................................................................................86
Rules and Limitations....................................................................................................87
How Persistent Reservations Work...............................................................................88
Responses to Failures .........................................................................................................89
Reboot When a Node Fails ...........................................................................................89
What Happens when a Node Times Out.................................................................90
Example .............................................................................................................90
Responses to Hardware Failures ..................................................................................91
Responses to Package and Service Failures .................................................................92
Service Restarts .......................................................................................................92
Network Communication Failure ...........................................................................92

4 Planning and Documenting an HA Cluster .................................................................................93


General Planning ...............................................................................................................93
Serviceguard Memory Requirements...........................................................................94
Planning for Expansion ................................................................................................94
Hardware Planning ...........................................................................................................94
SPU Information ...........................................................................................................94
LAN Information ..........................................................................................................95
Shared Storage...............................................................................................................95
FibreChannel............................................................................................................95
Multipath for Storage ..............................................................................................96
Disk I/O Information ....................................................................................................96
Hardware Configuration Worksheet ............................................................................97
Power Supply Planning .....................................................................................................97
Power Supply Configuration Worksheet .....................................................................98
Cluster Lock Planning........................................................................................................98
Cluster Lock Requirements...........................................................................................98
Planning for Expansion.................................................................................................99
Using a Quorum Server.................................................................................................99
Quorum Server Worksheet .....................................................................................99
Volume Manager Planning ................................................................................................99
Volume Groups and Physical Volume Worksheet......................................................100
Cluster Configuration Planning .......................................................................................100
Heartbeat Subnet and Cluster Re-formation Time .....................................................100
About Hostname Address Families: IPv4-Only, IPv6-Only, and Mixed Mode..........101
What Is IPv4-Only Mode?......................................................................102
What Is IPv6-Only Mode?......................................................................................102
Rules and Restrictions for IPv6-Only Mode.....................................................102
Recommendations for IPv6-Only Mode...........................................................104
What Is Mixed Mode?............................................................................................104
Rules and Restrictions for Mixed Mode...........................................................104
Cluster Configuration Parameters ..............................................................................105
Cluster Configuration: Next Step ...............................................................................123
Package Configuration Planning .....................................................................................123
Logical Volume and File System Planning .................................................................123
Planning for Expansion...............................................................................................125
Choosing Switching and Failover Behavior................................................................125
About Package Dependencies.....................................................................................126
Simple Dependencies.............................................................................................126
Rules for Simple Dependencies.............................................................................127
Dragging Rules for Simple Dependencies........................................................128
Guidelines for Simple Dependencies.....................................................................131
Extended Dependencies.........................................................................................132
Rules for Exclusionary Dependencies..............................................................133
Rules for different_node and any_node Dependencies...................................134
About Package Weights...............................................................................................134
Package Weights and Node Capacities..................................................................134
Configuring Weights and Capacities.....................................................................135
Simple Method.......................................................................................................135
Example 1..........................................................................................................135
Points to Keep in Mind.....................................................................................136
Comprehensive Method.........................................................................................137
Defining Capacities...........................................................................................137
Defining Weights..............................................................................................139
Rules and Guidelines.............................................................................................142
For More Information.............................................................................................142
How Package Weights Interact with Package Priorities and Dependencies.........143
Example 1..........................................................................................................143
Example 2..........................................................................................................143
About External Scripts.................................................................................................143
Using Serviceguard Commands in an External Script..........................................146
Determining Why a Package Has Shut Down.......................................................147
last_halt_failed Flag..........................................................................................147
About Cross-Subnet Failover......................................................................................147
Implications for Application Deployment.............................................................148
Configuring a Package to Fail Over across Subnets: Example..............................149
Configuring node_name...................................................................................149
Configuring monitored_subnet_access............................................................149
Configuring ip_subnet_node............................................................................150
Configuring a Package: Next Steps.............................................................................150
Planning for Changes in Cluster Size...............................................................................150

5 Building an HA Cluster Configuration.......................................................................................153


Preparing Your Systems ...................................................................................................153
Installing and Updating Serviceguard .......................................................................153
Understanding the Location of Serviceguard Files.....................................................153
Enabling Serviceguard Command Access..................................................................154
Configuring Root-Level Access...................................................................................155
Allowing Root Access to an Unconfigured Node..................................................155
Ensuring that the Root User on Another Node Is Recognized..............................156
About identd.....................................................................................................156
Configuring Name Resolution....................................................................................156
Safeguarding against Loss of Name Resolution Services......................................158
Ensuring Consistency of Kernel Configuration .........................................................159
Enabling the Network Time Protocol .........................................................................159
Implementing Channel Bonding (Red Hat)................................................................159
Sample Configuration............................................................................................160
Restarting Networking...........................................................................................161
Viewing the Configuration....................................................................................161
Implementing Channel Bonding (SUSE).....................................................................162
Restarting Networking...........................................................................................163
Setting up a Lock LUN................................................................................................164
Setting Up and Running the Quorum Server..............................................................165
Creating the Logical Volume Infrastructure ...............................................................165
Displaying Disk Information.................................................................................166
Creating Partitions.................................................................................................167
Enabling Volume Group Activation Protection.....................................................169
Building Volume Groups: Example for Smart Array Cluster Storage (MSA 2000 Series)..............................170
Building Volume Groups and Logical Volumes....................................................171
Distributing the Shared Configuration to all Nodes..............................................172
Testing the Shared Configuration..........................................................................173
Storing Volume Group Configuration Data ..........................................................175
Preventing Boot-Time vgscan and Ensuring Serviceguard Volume Groups Are Deactivated........................176
Setting up Disk Monitoring...................................................................................176
Configuring the Cluster....................................................................................................177
cmquerycl Options......................................................................................................177
Speeding up the Process........................................................................................177
Specifying the Address Family for the Cluster Hostnames...................................178
Specifying the Address Family for the Heartbeat .................................................178
Full Network Probing............................................................................................179
Specifying a Lock LUN................................................................................................179
Specifying a Quorum Server.......................................................................................179
Obtaining Cross-Subnet Information..........................................................................180
Identifying Heartbeat Subnets....................................................................................182
Specifying Maximum Number of Configured Packages ...........................................182
Modifying the MEMBER_TIMEOUT Parameter.........................................................182
Controlling Access to the Cluster................................................................................183
A Note about Terminology....................................................................................183
How Access Roles Work........................................................................................183
Levels of Access......................................................................................................184
Setting up Access-Control Policies.........................................................................185
Role Conflicts....................................................................................................188
Package versus Cluster Roles.................................................................................189
Verifying the Cluster Configuration ...........................................................................189
Cluster Lock Configuration Messages........................................................................190
Distributing the Binary Configuration File ................................................................190
Managing the Running Cluster........................................................................................191
Checking Cluster Operation with Serviceguard Commands.....................................191
Setting up Autostart Features .....................................................................................192
Changing the System Message ...................................................................................193
Managing a Single-Node Cluster................................................................................193
Single-Node Operation..........................................................................................193
Disabling identd..........................................................................................................194
Deleting the Cluster Configuration ............................................................................195

6 Configuring Packages and Their Services .................................................................................197


Choosing Package Modules..............................................................................................198
Types of Package: Failover, Multi-Node, System Multi-Node....................................198
Package Modules and Parameters...............................................................................199
Base Package Modules...........................................................................................200
Optional Package Modules....................................................................................202
Package Parameter Explanations.................................................................................204
package_name...........................................................................................................204
module_name...........................................................................................................205
module_version.........................................................................................................205
package_type............................................................................................................205
package_description..................................................................................................205
node_name...............................................................................................................205
auto_run..................................................................................................................206
node_fail_fast_enabled...............................................................................................206
run_script_timeout...................................................................................................207
halt_script_timeout...................................................................................................207
successor_halt_timeout.............................................................................................208
script_log_file...........................................................................................................208
operation_sequence...................................................................................................208
log_level...................................................................................................................208
failover_policy..........................................................................................................209
failback_policy..........................................................................................................209
priority.....................................................................................................................209
dependency_name.....................................................................................................210
dependency_condition...............................................................................................210
dependency_location.................................................................................................211
weight_name, weight_value.......................................................................................211
monitored_subnet.....................................................................................................212
monitored_subnet_access...........................................................................................212
ip_subnet.................................................................................................................213
ip_subnet_node ........................................................................................................214
ip_address................................................................................................................214
service_name............................................................................................................214
service_cmd..............................................................................................................215
service_restart..........................................................................................................215
service_fail_fast_enabled...........................................................................................216
service_halt_timeout.................................................................................................216
vgchange_cmd..........................................................................................................216
vg............................................................................................................................216
File system parameters...........................................................................................216
concurrent_fsck_operations.......................................................................................217
concurrent_mount_and_umount_operations..............................................................217
fs_mount_retry_count...............................................................................................217
fs_umount_retry_count ............................................................................................218
fs_name....................................................................................................................218
fs_directory..............................................................................................................218
fs_type.....................................................................................................................218
fs_mount_opt...........................................................................................................219
fs_umount_opt.........................................................................................................219
fs_fsck_opt................................................................................................................219
pv............................................................................................................................219
pev_.........................................................................................................................219
external_pre_script...................................................................................................220
external_script..........................................................................................................220
user_host..................................................................................................................220
user_name................................................................................................................221
user_role..................................................................................................................221
Additional Parameters Used Only by Legacy Packages........................................221
Generating the Package Configuration File......................................................................222
Before You Start...........................................................................................................222
cmmakepkg Examples.................................................................................................222
Next Step.....................................................................................................................223
Editing the Configuration File..........................................................................................223
Verifying and Applying the Package Configuration........................................................227
Adding the Package to the Cluster...................................................................................228
Creating a Disk Monitor Configuration...........................................................................228

7 Cluster and Package Maintenance...........................................................................................229


Reviewing Cluster and Package Status ............................................................................229
Reviewing Cluster and Package Status with the cmviewcl Command....................229
Viewing Package Dependencies..................................................................................230
Cluster Status ..............................................................................................................230
Node Status and State .................................................................................................230
Package Status and State.............................................................................................230
Package Switching Attributes......................................................................................232
Service Status ..............................................................................................................233
Network Status............................................................................................................233
Failover and Failback Policies.....................................................................................233
Examples of Cluster and Package States ....................................................................233
Normal Running Status.........................................................................................233
Quorum Server Status............................................................................................235
Status After Halting a Package...............................................................................235
Status After Moving the Package to Another Node..............................................236
Status After Package Switching is Enabled............................................................238
Status After Halting a Node...................................................................................238
Viewing Information about Unowned Packages...................................................238
Managing the Cluster and Nodes ....................................................................................239
Starting the Cluster When all Nodes are Down..........................................................240
Adding Previously Configured Nodes to a Running Cluster.....................................240
Removing Nodes from Participation in a Running Cluster........................................241
Using Serviceguard Commands to Remove a Node from Participation in a Running Cluster........................241
Halting the Entire Cluster ...........................................................................................242
Automatically Restarting the Cluster .........................................................................242
Managing Packages and Services ....................................................................................242
Starting a Package .......................................................................................................242
Starting a Package that Has Dependencies............................................................243
Halting a Package .......................................................................................................243
Halting a Package that Has Dependencies............................................................243
Moving a Failover Package .........................................................................................244
Changing Package Switching Behavior ......................................................................244
Maintaining a Package: Maintenance Mode.....................................................................245
Characteristics of a Package Running in Maintenance Mode or Partial-Startup Maintenance Mode..............246
Rules for a Package in Maintenance Mode or Partial-Startup Maintenance Mode.....................................247
Additional Rules for Partial-Startup Maintenance Mode.................................247
Dependency Rules for a Package in Maintenance Mode or Partial-Startup Maintenance Mode.....................248
Performing Maintenance Using Maintenance Mode..................................................248
Procedure...............................................................................................................248
Performing Maintenance Using Partial-Startup Maintenance Mode..........................249
Procedure...............................................................................................................249
Excluding Modules in Partial-Startup Maintenance Mode...................................250
Reconfiguring a Cluster....................................................................................................251
Previewing the Effect of Cluster Changes...................................................................252
What You Can Preview..........................................................................................252
Using Preview mode for Commands and in Serviceguard Manager....................253
Using cmeval..........................................................................................................254
Reconfiguring a Halted Cluster ..................................................................................255
Reconfiguring a Running Cluster................................................................................255
Adding Nodes to the Configuration While the Cluster is Running .....................256
Removing Nodes from the Cluster while the Cluster Is Running ........................256
Changing the Cluster Networking Configuration while the Cluster Is Running.......257
What You Can Do...................................................................................................257
What You Must Keep in Mind...............................................................................258
Example: Adding a Heartbeat LAN.......................................................................259
Example: Deleting a Subnet Used by a Package....................................................260
Updating the Cluster Lock LUN Configuration Online.............................................261
Changing MAX_CONFIGURED_PACKAGES.............................................................261
Configuring a Legacy Package.........................................................................................262
Creating the Legacy Package Configuration ..............................................................262
Using Serviceguard Manager to Configure a Package .........................................262
Using Serviceguard Commands to Configure a Package .....................................262
Configuring a Package in Stages......................................................................263
Editing the Package Configuration File............................................................263
Creating the Package Control Script...........................................................................265
Customizing the Package Control Script ..............................................................266
Adding Customer Defined Functions to the Package Control Script ...................267
Adding Serviceguard Commands in Customer Defined Functions ...............267
Support for Additional Products...........................................................................268
Verifying the Package Configuration..........................................................................268
Distributing the Configuration....................................................................................268
Distributing the Configuration And Control Script with Serviceguard Manager.........................................269
Copying Package Control Scripts with Linux commands.....................................269
Distributing the Binary Cluster Configuration File with Linux Commands ........269
Configuring Cross-Subnet Failover.............................................................................269
Configuring node_name........................................................................................270
Configuring monitored_subnet_access..................................................................270
Creating Subnet-Specific Package Control Scripts.................................................270
Control-script entries for nodeA and nodeB....................................................271
Control-script entries for nodeC and nodeD....................................................271
Reconfiguring a Package...................................................................................................271
Migrating a Legacy Package to a Modular Package....................................................272
Reconfiguring a Package on a Running Cluster .........................................................272
Reconfiguring a Package on a Halted Cluster ............................................................273
Adding a Package to a Running Cluster.....................................................................273
Deleting a Package from a Running Cluster ..............................................................273
Resetting the Service Restart Counter.........................................................................274
Allowable Package States During Reconfiguration ....................................................274
Changes that Will Trigger Warnings......................................................................278
Responding to Cluster Events ..........................................................................................278
Single-Node Operation ....................................................................................................279
Removing Serviceguard from a System...........................................................................279

8 Troubleshooting Your Cluster....................................................................................................281


Testing Cluster Operation ................................................................................................281
Testing the Package Manager .....................................................................................281
Testing the Cluster Manager .......................................................................................282
Monitoring Hardware ......................................................................................................282
Replacing Disks.................................................................................................................283
Replacing a Faulty Mechanism in a Disk Array..........................................................283
Replacing a Lock LUN.................................................................................................283
Revoking Persistent Reservations after a Catastrophic Failure........................................284
Examples......................................................................................................................285
Replacing LAN Cards.......................................................................................................285
Replacing a Failed Quorum Server System......................................................................286
Troubleshooting Approaches ...........................................................................................288
Reviewing Package IP Addresses ...............................................................................288
Reviewing the System Log File ..................................................................................289
Sample System Log Entries ...................................................................................289
Reviewing Object Manager Log Files .........................................................................290
Reviewing Configuration Files ...................................................................................290
Reviewing the Package Control Script .......................................................................290
Using the cmquerycl and cmcheckconf Commands.............................................291
Reviewing the LAN Configuration ............................................................................291
Solving Problems .............................................................................................................291
Name Resolution Problems.........................................................................................292
Networking and Security Configuration Errors....................................................292
Cluster Re-formations Caused by Temporary Conditions..........................................292
Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too Low.............292
System Administration Errors ....................................................................................293
Package Control Script Hangs or Failures ............................................................294
Package Movement Errors ..........................................................................................295
Node and Network Failures .......................................................................................296
Troubleshooting the Quorum Server...........................................................................296
Authorization File Problems..................................................................................296
Timeout Problems..................................................................................................297
Messages................................................................................................................297
Lock LUN Messages....................................................................................................297

A Designing Highly Available Cluster Applications ......................................................................299


Automating Application Operation ................................................................................299
Insulate Users from Outages ......................................................................................300
Define Application Startup and Shutdown ................................................................300
Controlling the Speed of Application Failover ................................................................301
Replicate Non-Data File Systems ...............................................................................301
Evaluate the Use of a Journaled Filesystem (JFS)........................................................301
Minimize Data Loss ....................................................................................................301
Minimize the Use and Amount of Memory-Based Data ......................................302
Keep Logs Small ....................................................................................................302
Eliminate Need for Local Data ..............................................................................302
Use Restartable Transactions ......................................................................................302
Use Checkpoints .........................................................................................................303
Balance Checkpoint Frequency with Performance ...............................................303
Design for Multiple Servers ........................................................................................303
Design for Replicated Data Sites ................................................................................304
Designing Applications to Run on Multiple Systems .....................................................304
Avoid Node Specific Information ...............................................................................304
Obtain Enough IP Addresses ................................................................................305
Allow Multiple Instances on Same System ...........................................................305
Avoid Using SPU IDs or MAC Addresses .................................................................305
Assign Unique Names to Applications ......................................................................306
Use DNS ................................................................................................................306
Use uname(2) With Care ............................................................................................307
Bind to a Fixed Port ....................................................................................................307
Bind to Relocatable IP Addresses ...............................................................................307
Call bind() before connect() ...................................................................................308
Give Each Application its Own Volume Group .........................................................308
Use Multiple Destinations for SNA Applications ......................................................308
Avoid File Locking ......................................................................................................309
Restoring Client Connections ..........................................................................................309
Handling Application Failures ........................................................................................310
Create Applications to be Failure Tolerant .................................................................310
Be Able to Monitor Applications ................................................................................311
Minimizing Planned Downtime ......................................................................................311
Reducing Time Needed for Application Upgrades and Patches ...............................311
Provide for Rolling Upgrades ...............................................................................312
Do Not Change the Data Layout Between Releases .............................................312
Providing Online Application Reconfiguration .........................................................312
Documenting Maintenance Operations .....................................................................312

B Integrating HA Applications with Serviceguard..........................................................................315


Checklist for Integrating HA Applications ......................................................................315
Defining Baseline Application Behavior on a Single System .....................................316
Integrating HA Applications in Multiple Systems .....................................................316
Testing the Cluster ......................................................................................................316

C Blank Planning Worksheets ....................................................................................................319


Hardware Worksheet .......................................................................................................320
Power Supply Worksheet .................................................................................................321
Quorum Server Worksheet ..............................................................................................322
Volume Group and Physical Volume Worksheet ............................................................323
Cluster Configuration Worksheet ....................................................................................324
Package Configuration Worksheet ..................................................................................325
Package Control Script Worksheet (Legacy).....................................................................326

D IPv6 Network Support............................................................................................................327


IPv6 Address Types..........................................................................................................327

Textual Representation of IPv6 Addresses..................................................................327
IPv6 Address Prefix.....................................................................................................328
Unicast Addresses.......................................................................................................328
IPv4 and IPv6 Compatibility.......................................................................................328
IPv4 Compatible IPv6 Addresses...........................................................................329
IPv4 Mapped IPv6 Address...................................................................................329
Aggregatable Global Unicast Addresses...............................................................329
Link-Local Addresses.............................................................................................330
Site-Local Addresses..............................................................................................330
Multicast Addresses...............................................................................................330
Network Configuration Restrictions................................................................................331
Configuring IPv6 on Linux...............................................................................................332
Enabling IPv6 on Red Hat Linux.................................................................................332
Adding persistent IPv6 Addresses on Red Hat Linux................................................332
Configuring a Channel Bonding Interface with Persistent IPv6 Addresses on Red
Hat Linux.....................................................................................................................332
Adding Persistent IPv6 Addresses on SUSE...............................................................333
Configuring a Channel Bonding Interface with Persistent IPv6 Addresses on
SUSE............................................................................................................................333

E Using Serviceguard Manager..................................................................................................335


About the Online Help System.........................................................................................335
Launching Serviceguard Manager....................................................................................335
Scenario 1 - Single cluster management......................................................................335
Scenario 2- Multi-Cluster Management......................................................................337

Index........................................................................................................................................341

List of Figures
1-1 Typical Cluster Configuration ....................................................................................24
1-2 Typical Cluster After Failover ....................................................................................25
1-3 Tasks in Configuring a Serviceguard Cluster ............................................................28
2-1 Redundant LANs .......................................................................................................32
2-2 Mirrored Disks Connected for High Availability ......................................................36
3-1 Serviceguard Software Components on Linux...........................................................38
3-2 Lock LUN Operation...................................................................................................46
3-3 Quorum Server Operation..........................................................................................47
3-4 Quorum Server to Cluster Distribution......................................................................47
3-5 Package Moving During Failover...............................................................................50
3-6 Before Package Switching...........................................................................................52
3-7 After Package Switching.............................................................................................53
3-8 Rotating Standby Configuration before Failover........................................................55
3-9 Rotating Standby Configuration after Failover..........................................................56
3-10 configured_node Policy Packages after Failover...................................................57
3-11 Automatic Failback Configuration before Failover....................................................58
3-12 Automatic Failback Configuration After Failover......................................................59
3-13 Automatic Failback Configuration After Restart of node1........................................60
3-14 Legacy Package Time Line Showing Important Events..............................................63
3-15 Legacy Package Time Line .........................................................................................65
3-16 Legacy Package Time Line for Halt Script Execution.................................................69
3-17 Bonded Network Interfaces........................................................................................75
3-18 Bonded NICs...............................................................................................................76
3-19 Bonded NICs After Failure.........................................................................................77
3-20 Bonded NICs Configured for Load Balancing............................................................78
3-21 Physical Disks Combined into LUNs..........................................................................85
5-1 Access Roles..............................................................................................................184
E-1 System Management Homepage with Serviceguard Manager................................337
E-2 Cluster by Type.........................................................................................................339

List of Tables
1 .....................................................................................................................................19
3-1 Package Configuration Data.......................................................................................54
3-2 Node Lists in Sample Cluster......................................................................................58
3-3 Error Conditions and Package Movement for Failover Packages..............................70
4-1 Package Failover Behavior .......................................................................................126
5-1 Changing Linux Partition Types...............................................................................164
6-1 Base Modules.............................................................................................................201
6-2 Optional Modules......................................................................................................202
7-1 Types of Changes to the Cluster Configuration .......................................................251
7-2 Types of Changes to Packages ..................................................................................275
D-1 IPv6 Address Types...................................................................................................327
D-2 ...................................................................................................................................328
D-3 ...................................................................................................................................329
D-4 ...................................................................................................................................329
D-5 ...................................................................................................................................329
D-6 ...................................................................................................................................330
D-7 ...................................................................................................................................330
D-8 ...................................................................................................................................330

Printing History
Table 1
Printing Date Part Number Edition

November 2001 B9903-90005 First

November 2002 B9903-90012 First

December 2002 B9903-90012 Second

November 2003 B9903-90033 Third

February 2005 B9903-90043 Fourth

June 2005 B9903-90046 Fifth

August 2006 B9903-90050 Sixth

July 2007 B9903-90054 Seventh

March 2008 B9903-90060 Eighth

April 2009 B9903-90068 Ninth

July 2009 B9903-90073 Tenth

The last printing date and part number indicate the current edition, which applies to
the A.11.19 version of HP Serviceguard for Linux.
The printing date changes when a new edition is printed. (Minor corrections and
updates which are incorporated at reprint do not cause the date to change.) The part
number is revised when extensive technical changes are incorporated.
New editions of this manual will incorporate all material updated since the previous
edition.
HP Printing Division:
Business Critical Computing
Hewlett-Packard Co.
19111 Pruneridge Ave.
Cupertino, CA 95014

Preface
This guide describes how to configure and manage Serviceguard for Linux on HP
ProLiant and HP Integrity servers under the Linux operating system. It is intended for
experienced Linux system administrators. (For Linux system administration tasks that
are not specific to Serviceguard, use the system administration documentation and
manpages for your distribution of Linux.)
The contents are as follows:
• Chapter 1 (page 23) describes a Serviceguard cluster and provides a roadmap for
using this guide.
• Chapter 2 (page 29) provides a general view of the hardware configurations used
by Serviceguard.
• Chapter 3 (page 37) describes the software components of Serviceguard and shows
how they function within the Linux operating system.
• Chapter 4 (page 93) steps through the planning process.
• Chapter 5 (page 153) describes the creation of the cluster configuration.
• Chapter 6 (page 197) describes the creation of high availability packages.
• Chapter 7 (page 229) presents the basic cluster administration tasks.
• Chapter 8 (page 281) explains cluster testing and troubleshooting strategies.
• Appendix A (page 299) gives guidelines for creating cluster-aware applications
that provide optimal performance in a Serviceguard environment.
• Appendix B (page 315) provides suggestions for integrating your existing
applications with Serviceguard for Linux.
• Appendix C (page 319) contains a set of empty worksheets for preparing a
Serviceguard configuration.
• Appendix D (page 327) provides information about IPv6.
• Appendix E (page 335) is an introduction to Serviceguard Manager.

Related Publications
The following documents contain additional useful information:
• HP Serviceguard for Linux Version A.11.19 Release Notes
• HP Serviceguard Quorum Server Version A.04.00 Release Notes
• Clusters for High Availability: a Primer of HP Solutions. Second Edition. HP Press,
2001 (ISBN 0-13-089355-2)
Use the following URL to access HP’s high availability documentation web page:
http://docs.hp.com/hpux/ha
Information about supported configurations is in the HP Serviceguard for Linux
Configuration Guide. For updated information on supported hardware and Linux

distributions refer to the HP Serviceguard for Linux Certification Matrix. Both documents
are available at:
http://www.hp.com/info/sglx

Problem Reporting
If you have any problems with the software or documentation, please contact your
local Hewlett-Packard Sales Office or Customer Service Center.

1 Serviceguard for Linux at a Glance
This chapter introduces Serviceguard for Linux and shows where to find different
kinds of information in this book. It includes the following topics:
• What is Serviceguard for Linux?
• Using Serviceguard Manager (page 26)
• Configuration Roadmap (page 27)
If you are ready to start setting up Serviceguard clusters, skip ahead to Chapter 4
(page 93). Specific steps for setup are in Chapter 5 (page 153).

What is Serviceguard for Linux?


Serviceguard for Linux allows you to create high availability clusters of HP ProLiant
and HP Integrity servers. A high availability computer system allows application
services to continue in spite of a hardware or software failure. Highly available systems
protect users from software failures as well as from failure of a system processing unit
(SPU), disk, or local area network (LAN) component. In the event that one component
fails, the redundant component takes over. Serviceguard and other high availability
subsystems coordinate the transfer between components.
A Serviceguard cluster is a networked grouping of HP ProLiant and HP Integrity
servers (host systems known as nodes) having sufficient redundancy of software and
hardware that a single point of failure will not significantly disrupt service. Application
services (individual Linux processes) are grouped together in packages; in the event
of a single service, node, network, or other resource failure, Serviceguard can
automatically transfer control of the package to another node within the cluster, allowing
services to remain available with minimal interruption.



Figure 1-1 Typical Cluster Configuration

In the figure, node 1 (one of two SPUs) is running package A, and node 2 is running
package B. Each package has a separate group of disks associated with it, containing
data needed by the package's applications, and a copy of the data. Note that both nodes
are physically connected to disk arrays. However, only one node at a time may access
the data for a given group of disks. In the figure, node 1 is shown with exclusive access
to the top two disks (solid line), and node 2 is shown as connected without access to
the top disks (dotted line). Similarly, node 2 is shown with exclusive access to the
bottom two disks (solid line), and node 1 is shown as connected without access to the
bottom disks (dotted line).
Disk arrays provide redundancy in case of disk failures. In addition, a total of four data
buses are shown for the disks that are connected to node 1 and node 2. This
configuration provides the maximum redundancy and also gives optimal I/O
performance, since each package is using different buses.
Note that the network hardware is cabled to provide redundant LAN interfaces on
each node. Serviceguard uses TCP/IP network services for reliable communication
among nodes in the cluster, including the transmission of heartbeat messages, signals
from each functioning node which are central to the operation of the cluster. TCP/IP
services also are used for other types of inter-node communication. (The heartbeat is
explained in more detail in the chapter “Understanding Serviceguard Software.”)
Failover
Under normal conditions, a fully operating Serviceguard cluster simply monitors the
health of the cluster's components while the packages are running on individual nodes.
Any host system running in the Serviceguard cluster is called an active node. When
you create the package, you specify a primary node and one or more adoptive nodes.
When a node or its network communications fails, Serviceguard can transfer control
of the package to the next available adoptive node. This situation is shown in Figure
1-2.

Figure 1-2 Typical Cluster After Failover

After this transfer, the package typically remains on the adoptive node as long as the
adoptive node continues running. If you wish, however, you can configure the package
to return to its primary node as soon as the primary node comes back online.
Alternatively, you may manually transfer control of the package back to the primary
node at the appropriate time.
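For example, once the primary node is back up, you might move a failover package back to it manually with a sequence along these lines (the package and node names are illustrative; see Chapter 7 for details of these commands):
    cmhaltpkg pkg1              # halt the package on the adoptive node
    cmrunpkg -n node1 pkg1      # run the package on its primary node
    cmmodpkg -e pkg1            # re-enable package switching for pkg1
Automatic return to the primary node is governed by the package's failover and failback policies, described in Chapter 3.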
Figure 1-2 does not show the power connections to the cluster, but these are important
as well. In order to remove all single points of failure from the cluster, you should
provide as many separate power circuits as needed to prevent a single point of failure
of your nodes, disks and disk mirrors. Each power circuit should be protected by an



uninterruptible power source. For more details, refer to the section on “Power Supply
Planning” in Chapter 4, “Planning and Documenting an HA Cluster.”
Serviceguard is designed to work in conjunction with other high availability products,
such as disk arrays, which use various RAID levels for data protection; and
HP-supported uninterruptible power supplies (UPS), which eliminate failures related
to power outage. HP recommends these products; in conjunction with Serviceguard
they provide the highest degree of availability.

Using Serviceguard Manager

NOTE: For more-detailed information, see Appendix E (page 335), and the section on
Serviceguard Manager in the latest version of the Serviceguard Release Notes. Check
the Serviceguard/SGeRAC/SMS/Serviceguard Manager Plug-in Compatibility and Feature
Matrix and the latest Release Notes for up-to-date information about Serviceguard
Manager compatibility. You can find both documents at http://www.docs.hp.com
-> High Availability -> Serviceguard.
Serviceguard Manager is the graphical user interface for Serviceguard. It is available
as a “plug-in” to the web-based HP System Management Homepage (HP SMH).
You can use Serviceguard Manager to monitor, administer, and configure Serviceguard
clusters.
• You can see properties, status, and alerts of cluster, nodes, and packages.
• You can do administrative tasks such as run or halt clusters, cluster nodes, and
packages.
• You can create or modify a cluster and its packages.
See the latest Release Notes for your version of Serviceguard for Linux for an
introduction to using Serviceguard Manager, and the
Serviceguard/SGeRAC/SMS/Serviceguard Manager Plug-in Compatibility and Feature Matrix
for up-to-date information about Serviceguard Manager compatibility (http://
www.docs.hp.com -> High Availability -> Serviceguard for Linux).

Monitoring Clusters with Serviceguard Manager


From the main page of Serviceguard Manager, you can see status and alerts for the
cluster, nodes, and packages. You can also drill down to see the configuration and
alerts of the cluster, nodes, and packages.



Administering Clusters with Serviceguard Manager
You can administer clusters, nodes, and packages if access control policies permit:
• Cluster: halt, run
• Cluster nodes: halt, run
• Package: halt, run, move from one node to another, reset node- and
package-switching flags

Configuring Clusters with Serviceguard Manager


You can configure clusters and packages in Serviceguard Manager. You must have
root (UID=0) access to the cluster nodes.

Starting Serviceguard Manager


Follow the directions in the Release Notes for your version of Serviceguard for Linux
to start the Serviceguard Manager. Then select a cluster, node, or package, and use the
drop-down menus below the “Serviceguard Manager” banner to navigate to the task
you need to do.
Use Serviceguard Manager’s built-in help to guide you through the tasks; this manual
will tell you if a task can be done in Serviceguard Manager but does not duplicate the
help.

Configuration Roadmap
This manual presents the tasks you need to perform in order to create a functioning
HA cluster using Serviceguard. These tasks are shown in Figure 1-3.

Figure 1-3 Tasks in Configuring a Serviceguard Cluster

HP recommends that you gather all the data that is needed for configuration before you
start. See Chapter 4 (page 93) for tips on gathering data.



2 Understanding Hardware Configurations for Serviceguard
for Linux
This chapter gives a broad overview of how the server hardware components operate
with Serviceguard for Linux. The following topics are presented:
• Redundant Cluster Components
• Redundant Network Components (page 30)
• Redundant Disk Storage (page 34)
• Redundant Power Supplies (page 36)
Refer to the next chapter for information about Serviceguard software components.

Redundant Cluster Components


In order to provide a high level of availability, a typical cluster uses redundant system
components, for example two or more SPUs and two or more independent disks.
Redundancy eliminates single points of failure. In general, the more redundancy, the
greater your access to applications, data, and supportive services in the event of a
failure. In addition to hardware redundancy, you need software support to enable and
control the transfer of your applications to another SPU or network after a failure.
Serviceguard provides this support as follows:
• In the case of LAN failure, the Linux bonding facility provides a standby LAN, or
Serviceguard moves packages to another node.
• In the case of SPU failure, your application is transferred from a failed SPU to a
functioning SPU automatically and in a minimal amount of time.
• For software failures, an application can be restarted on the same node or another
node with minimum disruption.
Serviceguard also gives you the advantage of easily transferring control of your
application to another SPU in order to bring the original SPU down for system
administration, maintenance, or version upgrades.
The maximum number of nodes supported in a Serviceguard Linux cluster is 16; the
actual number depends on the storage configuration. For example, a package that
accesses data over a FibreChannel connection can be configured to fail over among 16
nodes, while SCSI disk arrays are typically limited to four nodes.
A package that does not use data from shared storage can be configured to fail over to
as many nodes as you have configured in the cluster (up to the maximum of 16),
regardless of disk technology. For instance, a package that runs only local executables,
and uses only local data, can be configured to fail over to all nodes in the cluster.



Redundant Network Components
To eliminate single points of failure for networking, each subnet accessed by a cluster
node is required to have redundant network interfaces. Redundant cables are also
needed to protect against cable failures. Each interface card is connected to a different
cable and hub or switch.
Network interfaces are allowed to share IP addresses through a process known as
channel bonding. See “Implementing Channel Bonding (Red Hat)” (page 159) or
“Implementing Channel Bonding (SUSE)” (page 162).
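As a preview only — the device names, addresses, and options below are illustrative, and the exact file contents depend on your distribution and release — an active-backup bond on Red Hat is typically defined with interface configuration files along these lines (on some releases the bonding options are set in /etc/modprobe.conf instead):
    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    IPADDR=192.168.1.10
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none
    BONDING_OPTS="mode=active-backup miimon=100"

    # /etc/sysconfig/network-scripts/ifcfg-eth1 (repeated for each slave NIC)
    DEVICE=eth1
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none
The full procedures, including the SUSE equivalents, are in the sections cited above.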
Serviceguard supports a maximum of 30 network interfaces per node. For this purpose
an interface is defined as anything represented as a primary interface in the output of
ifconfig, so the total of 30 can comprise any combination of physical LAN interfaces
or bonding interfaces. (A node can have more than 30 such interfaces, but only 30 can
be part of the cluster configuration.)

Rules and Restrictions


• A single subnet cannot be configured on different network interfaces (NICs) on
the same node.
• In the case of subnets that can be used for communication between cluster nodes,
the same network interface must not be used to route more than one subnet
configured on the same node.
• For IPv4 subnets, Serviceguard does not support different subnets on the same
LAN interface.
— For IPv6, Serviceguard supports up to two subnets per LAN interface (site-local
and global).
• Serviceguard does support different subnets on the same bridged network (this
applies at both the node and the cluster level).
• Serviceguard does not support using networking tools such as ifconfig to add
IP addresses to network interfaces that are configured into the Serviceguard cluster,



unless those IP addresses themselves will be immediately configured into the
cluster as stationary IP addresses.

CAUTION: If you configure any address other than a stationary IP address on a


Serviceguard network interface, it could collide with a relocatable package IP
address assigned by Serviceguard. See “Stationary and Relocatable IP Addresses
and Monitored Subnets” (page 71).
— Similarly, Serviceguard does not support using networking tools to move or
reconfigure any IP addresses configured into the cluster.
Doing so leads to unpredictable results because the Serviceguard view of the
configuration is different from the reality.

NOTE: If you will be using a cross-subnet configuration, see also the Restrictions
(page 33) that apply specifically to such configurations.

Redundant Ethernet Configuration


The use of redundant network components is shown in Figure 2-1, which is an Ethernet
configuration.



Figure 2-1 Redundant LANs

In Linux configurations, the use of symmetrical LAN configurations is strongly
recommended, with redundant hubs or switches used to connect the Ethernet segments.
The software bonding configuration should be identical on each node, with the active
interfaces connected to the same hub or switch.

Cross-Subnet Configurations
As of Serviceguard A.11.18 it is possible to configure multiple subnets, joined by a
router, both for the cluster heartbeat and for data, with some nodes using one subnet
and some another.
A cross-subnet configuration allows:
• Automatic package failover from a node on one subnet to a node on another
• A cluster heartbeat that spans subnets.



Configuration Tasks
Cluster and package configuration tasks are affected as follows:
• You must use the -w full option to cmquerycl to discover actual or potential
nodes and subnets across routers (see the example following this list).
• You must configure two new parameters in the package configuration file to allow
packages to fail over across subnets:
— ip_subnet_node - to indicate which nodes the subnet is configured on
— monitored_subnet_access - to indicate whether the subnet is configured on all
nodes (FULL) or only some (PARTIAL)
(For legacy packages, see “Configuring Cross-Subnet Failover” (page 269).)
• You should not use the wildcard (*) for node_name in the package configuration
file, as this could allow the package to fail over across subnets when a node on the
same subnet is eligible; failing over across subnets can take longer than failing
over on the same subnet. List the nodes in order of preference instead of using the
wildcard.
• You should configure IP monitoring for each subnet; see “Monitoring LAN
Interfaces and Detecting Failure: IP Level” (page 78).
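For example, a command along the following lines (the node names and output file are illustrative) uses -w full to probe for nodes and subnets across routers and writes a cluster configuration template:
    cmquerycl -v -w full -n node1 -n node2 -C $SGCONF/cluster.conf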

Restrictions
The following restrictions apply:
• All nodes in the cluster must belong to the same network domain (that is, the
domain portion of the fully-qualified domain name must be the same).
• The nodes must be fully connected at the IP level.
• A minimum of two heartbeat paths must be configured for each cluster node.
• There must be less than 200 milliseconds of latency in the heartbeat network.
• Each heartbeat subnet on each node must be physically routed separately to the
heartbeat subnet on another node; that is, each heartbeat path must be physically
separate:
— The heartbeats must be statically routed; static route entries must be configured
on each node to route the heartbeats through different paths.
— Failure of a single router must not affect both heartbeats at the same time.
• IPv6 heartbeat subnets are not supported in a cross-subnet configuration.
• IPv6–only and mixed modes are not supported in a cross-subnet configuration.
For more information about these modes, see “About Hostname Address Families:
IPv4-Only, IPv6-Only, and Mixed Mode” (page 101).
• Deploying applications in this environment requires careful consideration; see
“Implications for Application Deployment” (page 148).



• cmrunnode will fail if the “hostname LAN” is down on the node in question.
(“Hostname LAN” refers to the public LAN on which the IP address that the node’s
hostname resolves to is configured).
• If a monitored_subnet is configured for PARTIAL monitored_subnet_access in a
package’s configuration file, it must be configured on at least one of the nodes on
the node_name list for that package. Conversely, if all of the subnets that are being
monitored for this package are configured for PARTIAL access, each node on the
node_name list must have at least one of these subnets configured.
— As in other configurations, a package will not start on a node unless the subnets
configured on that node, and specified in the package configuration file as
monitored subnets, are up.

NOTE: See also the Rules and Restrictions (page 30) that apply to all cluster
networking configurations.

For More Information


For more information on the details of configuring the cluster and packages in a
cross-subnet context, see “About Cross-Subnet Failover” (page 147), “Obtaining
Cross-Subnet Information” (page 180), and (for legacy packages only) “Configuring
Cross-Subnet Failover” (page 269).
See also the white paper Technical Considerations for Creating a Serviceguard Cluster that
Spans Multiple IP Subnets, which you can find at the address below. This paper discusses
and illustrates supported configurations, and also potential mis-configurations.

IMPORTANT: Although cross-subnet topology can be implemented on a single site,


it is most commonly used by extended-distance clusters. For more information about
such clusters, see the latest edition of the HP Serviceguard Extended Distance Cluster for
Linux Deployment Guide on docs.hp.com under High Availability —>
Serviceguard for Linux.

Redundant Disk Storage


Each node in a cluster has its own root disk, but each node may also be physically
connected to several other disks in such a way that more than one node can obtain
access to the data and programs associated with a package it is configured for. This
access is provided by the Logical Volume Manager (LVM). A volume group must be
activated by no more than one node at a time, but when the package is moved, the
volume group can be activated by the adoptive node.



NOTE: As of release A.11.16.07, Serviceguard for Linux provides functionality similar
to HP-UX exclusive activation. This feature is based on LVM2 hosttags, and is available
only for Linux distributions that officially support LVM2.
All of the disks in the volume group owned by a package must be connected to the
original node and to all possible adoptive nodes for that package.
Shared disk storage in Serviceguard Linux clusters is provided by disk arrays, which
have redundant power and the capability for connections to multiple nodes. Disk arrays
use RAID modes to provide redundancy.

Supported Disk Interfaces


The following interfaces are supported by Serviceguard for disks that are connected
to two or more nodes (shared data disks):
• MSA (Modular Smart Array) 2000 family
• FibreChannel.
For information on configuring multipathing, see “Multipath for Storage ” (page 96).

Disk Monitoring
You can configure monitoring for disks and configure packages to be dependent on
the monitor. For each package, you define a package service that monitors the disks
that are activated by that package. If a disk failure occurs on one node, the monitor
will cause the package to fail, with the potential to fail over to a different node on which
the same disks are available.
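As an illustrative sketch only — the service name and the monitoring script shown here are hypothetical, and the actual package parameters are described in Chapter 6 — such a service might be defined in a modular package configuration file as follows:
    service_name                pkg1_disk_monitor
    service_cmd                 "/usr/local/cmcluster/conf/pkg1/monitor_disks.sh"   # hypothetical site-specific script
    service_restart             none
    service_fail_fast_enabled   no
    service_halt_timeout        300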

Sample Disk Configurations


Figure 2-2 shows a two node cluster. Each node has one root disk which is mirrored
and one package for which it is the primary node. Resources have been allocated to
each node so that each node can adopt the package from the other node. Each package
has one disk volume group assigned to it and the logical volumes in that volume group
are mirrored.



Figure 2-2 Mirrored Disks Connected for High Availability

Redundant Power Supplies


You can extend the availability of your hardware by providing battery backup to your
nodes and disks. HP-supported uninterruptible power supplies (UPS) can provide this
protection from momentary power loss.
Disks should be attached to power circuits in such a way that disk array copies are
attached to different power sources. The boot disk should be powered from the same
circuit as its corresponding node. Quorum server systems should be powered separately
from cluster nodes. Your HP representative can provide more details about the layout
of power supplies, disks, and LAN hardware for clusters.



3 Understanding Serviceguard Software Components
This chapter gives a broad overview of how the Serviceguard software components
work. It includes the following topics:
• Serviceguard Architecture
• How the Cluster Manager Works (page 42)
• How the Package Manager Works (page 49)
• How Packages Run (page 61)
• How the Network Manager Works (page 71)
• Volume Managers for Data Storage (page 84)
• Responses to Failures (page 89)
If you are ready to start setting up Serviceguard clusters, skip ahead to Chapter 4,
“Planning and Documenting an HA Cluster.”

Serviceguard Architecture
The following figure shows the main software components used by Serviceguard for
Linux. This chapter discusses these components in some detail.

Figure 3-1 Serviceguard Software Components on Linux

Serviceguard Daemons
Serviceguard for Linux uses the following daemons:
• cmclconfd—configuration daemon
• cmcld—cluster daemon
• cmnetd—Network Manager daemon
• cmlogd—cluster system log daemon
• cmdisklockd—cluster lock LUN daemon
• cmomd—Cluster Object Manager daemon
• cmserviced—Service Assistant daemon
• qs—Quorum Server daemon
• cmlockd—utility daemon
• cmsnmpd—cluster SNMP subagent (optionally running)
• cmwbemd—WBEM daemon
• cmproxyd—proxy daemon
Each of these daemons logs to the Linux system logging files. The quorum server
daemon logs to a user-specified log file, such as /usr/local/qs/log/qs.log
on Red Hat or /var/log/qs/qs.log on SUSE, and cmomd logs to /usr/local/
cmom/log/cmomd.log on Red Hat or /var/log/cmom/log/cmomd.log on SUSE.

NOTE: The file cmcluster.conf contains the mappings that resolve symbolic
references to $SGCONF, $SGROOT, $SGLBIN, etc, used in the pathnames in the
subsections that follow. See “Understanding the Location of Serviceguard Files”
(page 153) for details.

Configuration Daemon: cmclconfd


This daemon is used by the Serviceguard commands to gather information from all
the nodes within the cluster. It gathers configuration information such as information
on networks and volume groups. It also distributes the cluster binary configuration
file to all nodes in the cluster. This daemon is started by the internet daemon,
xinetd(1M).
Parameters are in the /etc/xinetd.d/hacl-cfg and /etc/xinetd.d/
hacl-cfgudp files. The path for this daemon is $SGLBIN/cmclconfd.
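For example, you can confirm that these services are enabled under xinetd by checking the disable directive in those files (a setting of disable = no means the service is enabled):
    grep -i disable /etc/xinetd.d/hacl-cfg /etc/xinetd.d/hacl-cfgudp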

Cluster Daemon: cmcld


This daemon determines cluster membership by sending heartbeat messages to cmcld
daemons on other nodes in the Serviceguard cluster. It runs at a real-time priority and
is locked in memory. The cmcld daemon sets a safety timer in the kernel that is
used to detect kernel hangs. If this timer is not reset periodically by cmcld, the kernel
will cause a system reboot. This could occur because cmcld could not communicate
with the majority of the cluster's members, or because cmcld exited unexpectedly,
aborted, or was unable to run for a significant amount of time and so could not update
the kernel timer, indicating a kernel hang. Before a system reset resulting from the
expiration of the safety timer, messages are written to syslog and, if possible, to the
kernel's message buffer, and a system dump is performed.
The duration of the safety timer depends on the cluster configuration parameter
MEMBER_TIMEOUT, and also on the characteristics of the cluster configuration, such
as whether it uses a quorum server or a cluster lock (and what type of lock) and whether
or not standby LANs are configured.
For further discussion, see “What Happens when a Node Times Out” (page 90). For
advice on setting MEMBER_TIMEOUT, see Cluster Configuration Parameters (page 105).
For troubleshooting, see “Cluster Re-formations Caused by MEMBER_TIMEOUT Being
Set too Low” (page 292).
cmcld also manages Serviceguard packages, determining where to run them and
when to start them. The path for this daemon is: $SGLBIN/cmcld.

NOTE: Two of the central components of Serviceguard—Package Manager, and
Cluster Manager—run as parts of the cmcld daemon. This daemon runs at priority 94
and is in the SCHED_RR class. No other process is allowed a higher real-time priority.

Log Daemon: cmlogd


cmlogd is used by cmcld to write messages to the system log file. Any message written
to the system log by cmcld is written through cmlogd. This is to prevent any delays
in writing to syslog from impacting the timing of cmcld. The path for this daemon
is $SGLBIN/cmlogd.

Network Manager Daemon: cmnetd


This daemon monitors the health of cluster networks. It also handles the addition and
deletion of relocatable package IPs, for both IPv4 and IPv6 addresses.

Lock LUN Daemon: cmdisklockd


If a lock LUN is being used, cmdisklockd runs on each node in the cluster, providing
tie-breaking services when needed during cluster re-formation. It is started by cmcld
when the node joins the cluster. The path for this daemon is $SGLBIN/cmdisklockd.

Cluster Object Manager Daemon: cmomd


This daemon is responsible for providing information about the cluster to
clients—external products or tools that depend on knowledge of the state of cluster
objects.
Clients send queries to the object manager and receive responses from it (this
communication is done indirectly, through a Serviceguard API). The queries are
decomposed into categories (of classes) which are serviced by various providers. The
providers gather data from various sources, including, commonly, the cmclconfd
daemons on all connected nodes, returning data to a central assimilation point where
it is filtered to meet the needs of a particular query.
This daemon is started by xinetd. Parameters are in the /etc/xinetd.d/
hacl-probe file. The path for this daemon is $SGLBIN/cmomd.
This daemon may not be running on your system; it is used only by clients of the object
manager.

Service Assistant Daemon: cmserviced


This daemon forks and execs any scripts or processes required by the cluster daemon,
cmcld. There are two types of forks that this daemon carries out:
• Executing package run and halt scripts
• Launching services



For services, cmcld monitors the service process and, depending on the number of
service retries, cmcld either restarts the service through cmserviced or causes
the package to halt and moves the package to an available alternate node. The path for
this daemon is: $SGLBIN/cmserviced.

Quorum Server Daemon: qs


Using a quorum server is one way to break a tie and establish a quorum when the
cluster is re-forming; the other way is to use a Lock LUN. See “Cluster Quorum to
Prevent Split-Brain Syndrome” (page 44) and the sections that follow it.
The quorum server, if used, runs on a system external to the cluster. It is normally
started from /etc/inittab with the respawn option, which means that it
automatically restarts if it fails or is killed. It can also be configured as a Serviceguard
package in a cluster other than the one(s) it serves; see Figure 3-4 (page 47).
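For illustration, an /etc/inittab entry on a Red Hat quorum server system might look something like the following (the identifier, run levels, and log redirection are examples only; see the Quorum Server release notes for the exact entry to use, and note the SUSE path given at the end of this section):
    qs:345:respawn:/usr/local/qs/bin/qs >> /usr/local/qs/log/qs.log 2>&1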
All members of the cluster initiate and maintain a connection to the quorum server; if
it dies, the Serviceguard nodes will detect this and then periodically try to reconnect
to it. If there is a cluster re-formation while the quorum server is down and tie-breaking
is needed, the re-formation will fail and all the nodes will halt (system reset). For this
reason it is important to bring the quorum server back up as soon as possible.
For more information about the Quorum Server software and how it works, including
instructions for configuring the Quorum Server as a Serviceguard package, see the
latest version of the HP Serviceguard Quorum Server release notes at http://
docs.hp.com -> High Availability -> Quorum Server. See also “Use of
the Quorum Server as a Cluster Lock” (page 46).
The path for this daemon is:
• For SUSE: /opt/qs/bin/qs
• For Red Hat: /usr/local/qs/bin/qs

Utility Daemon: cmlockd


Runs on every node on which cmcld is running. It maintains the active and pending
cluster resource locks.

Cluster SNMP Agent Daemon: cmsnmpd


This daemon collaborates with the SNMP Master Agent to provide instrumentation
for the cluster Management Information Base (MIB).
The SNMP Master Agent and the cmsnmpd provide notification (traps) for
cluster-related events. For example, a trap is sent when the cluster configuration changes,
or when a Serviceguard package has failed. To configure the agent to send traps to one
or more specific destinations, add the trap destinations to /etc/snmp/
snmptrapd.conf (SUSE and Red Hat). Make sure traps are turned on with trap2sink
in /etc/snmp/snmpd.conf (SUSE and Red Hat).
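For example, to send traps to a management station (the hostname and community string below are illustrative), you might add a line such as the following to /etc/snmp/snmpd.conf:
    trap2sink nms.example.com public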

The installation of the cmsnmpd rpm configures snmpd and cmsnmpd to start up
automatically. Their startup scripts are in /etc/init.d/. The scripts can be run manually
to start and stop the daemons.
For more information, see the cmsnmpd (1)manpage.

Cluster WBEM Agent Daemon: cmwbemd


This daemon collaborates with the Serviceguard WBEM provider plug-in module
(SGProviders) and WBEM services cimserver to provide notification (WBEM
Indications) of Serviceguard cluster events to Serviceguard WBEM Indication
subscribers that have registered a subscription with the cimserver. For example, an
Indication is sent when the cluster configuration changes, or when a Serviceguard
package has failed.
You can start and stop cmwbemd with the commands /sbin/init.d/cmwbemagt
start and /sbin/init.d/cmwbemagt stop.

Proxy Daemon: cmproxyd


This daemon is used to proxy or cache Serviceguard configuration data for use by
certain Serviceguard commands running on the local node. This allows these commands
to get the data more quickly and relieves cmcld of the burden of responding to certain
requests.

How the Cluster Manager Works


The cluster manager is used to initialize a cluster, to monitor the health of the cluster,
to recognize node failure if it should occur, and to regulate the re-formation of the
cluster when a node joins or leaves the cluster. The cluster manager operates as a
daemon process that runs on each node. During cluster startup and re-formation
activities, one node is selected to act as the cluster coordinator. Although all nodes
perform some cluster management functions, the cluster coordinator is the central point
for inter-node communication.

Configuration of the Cluster


The system administrator sets up cluster configuration parameters and does an initial
cluster startup; thereafter, the cluster regulates itself without manual intervention in
normal operation. Configuration parameters for the cluster include the cluster name
and nodes, networking parameters for the cluster heartbeat, cluster lock information,
and timing parameters (discussed in detail in Chapter 4 (page 93) ). Cluster parameters
are entered by editing the cluster configuration file (see “Configuring the Cluster”
(page 177)). The parameters you enter are used to build a binary configuration file which
is propagated to all nodes in the cluster. This binary cluster configuration file must be
the same on all the nodes in the cluster.



Heartbeat Messages
Central to the operation of the cluster manager is the sending and receiving of heartbeat
messages among the nodes in the cluster. Each node in the cluster exchanges UDP
heartbeat messages with every other node over each IP network configured as a
heartbeat device.
If a cluster node does not receive heartbeat messages from all other cluster nodes within
the prescribed time, a cluster re-formation is initiated; see “What Happens when a
Node Times Out” (page 90). At the end of the re-formation, if a new set of nodes form
a cluster, that information is passed to the package coordinator (described later in this
chapter, under “How the Package Manager Works” (page 49)). Failover packages that
were running on nodes that are no longer in the new cluster are transferred to their
adoptive nodes.
If heartbeat and data are sent over the same LAN subnet, data congestion may cause
Serviceguard to miss heartbeats and initiate a cluster re-formation that would not
otherwise have been needed. For this reason, HP recommends that you dedicate a LAN
for the heartbeat in addition to configuring the heartbeat over the data network.
Each node sends its heartbeat message at a rate calculated by Serviceguard on the basis
of the value of the MEMBER_TIMEOUT parameter, set in the cluster configuration
file, which you create as a part of cluster configuration.
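For example, the cluster configuration file contains an entry of the following form (the value shown, corresponding to 14 seconds expressed in microseconds, is illustrative only; see the MEMBER_TIMEOUT discussion under “Cluster Configuration Parameters” (page 105) before choosing a value):
    MEMBER_TIMEOUT 14000000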

IMPORTANT: When multiple heartbeats are configured, heartbeats are sent in parallel;
Serviceguard must receive at least one heartbeat to establish the health of a node. HP
recommends that you configure all subnets that interconnect cluster nodes as heartbeat
networks; this increases protection against multiple faults at no additional cost.
Heartbeat IP addresses must be on the same subnet on each node, but it is possible to
configure a cluster that spans subnets; see “Cross-Subnet Configurations” (page 32).
See HEARTBEAT_IP, under “Cluster Configuration Parameters ” (page 105), for more
information about heartbeat requirements. For timeout requirements and
recommendations, see the MEMBER_TIMEOUT parameter description in the same
section. For troubleshooting information, see “Cluster Re-formations Caused by
MEMBER_TIMEOUT Being Set too Low” (page 292). See also “Cluster Daemon: cmcld”
(page 39).

Manual Startup of Entire Cluster


A manual startup forms a cluster out of all the nodes in the cluster configuration.
Manual startup is normally done the first time you bring up the cluster, after
cluster-wide maintenance or upgrade, or after reconfiguration.
Before startup, the same binary cluster configuration file must exist on all nodes in the
cluster. The system administrator starts the cluster with the cmruncl command issued
from one node. The cmruncl command can only be used when the cluster is not
running, that is, when none of the nodes is running the cmcld daemon.
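For example (node names are illustrative):
    cmruncl -v                      # form the cluster from all configured nodes
    cmruncl -v -n node1 -n node2    # form the cluster from a subset of the configured nodes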



During startup, the cluster manager software checks to see if all nodes specified in the
startup command are valid members of the cluster, are up and running, are attempting
to form a cluster, and can communicate with each other. If they can, then the cluster
manager forms the cluster.

Automatic Cluster Startup


An automatic cluster startup occurs any time a node reboots and joins the cluster. This
can follow the reboot of an individual node, or it may be when all nodes in a cluster
have failed, as when there has been an extended power failure and all SPUs went down.
Automatic cluster startup will take place if the flag AUTOSTART_CMCLD is set to 1 in
the $SGCONF/cmcluster.rc file. When any node reboots with this parameter set to
1, it will rejoin an existing cluster, or if none exists it will attempt to form a new cluster.
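For example, the relevant line in $SGCONF/cmcluster.rc would typically read:
    AUTOSTART_CMCLD=1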

Dynamic Cluster Re-formation


A dynamic re-formation is a temporary change in cluster membership that takes place
as nodes join or leave a running cluster. Re-formation differs from reconfiguration,
which is a permanent modification of the configuration files. Re-formation of the cluster
occurs under the following conditions (not a complete list):
• An SPU or network failure was detected on an active node.
• An inactive node wants to join the cluster. The cluster manager daemon has been
started on that node.
• A node has been added to or deleted from the cluster configuration.
• The system administrator halted a node.
• A node halts because of a package failure.
• A node halts because of a service failure.
• Heavy network traffic prohibited the heartbeat signal from being received by the
cluster.
• The heartbeat network failed, and another network is not configured to carry
heartbeat.
Typically, re-formation results in a cluster with a different composition. The new cluster
may contain fewer or more nodes than in the previous incarnation of the cluster.

Cluster Quorum to Prevent Split-Brain Syndrome


In general, the algorithm for cluster re-formation requires a cluster quorum of a strict
majority (that is, more than 50%) of the nodes previously running. If both halves (exactly
50%) of a previously running cluster were allowed to re-form, there would be a
split-brain situation in which two instances of the same cluster were running. In a
split-brain scenario, different incarnations of an application could end up simultaneously
accessing the same disks. One incarnation might well be initiating recovery activity
while the other is modifying the state of the disks. Serviceguard’s quorum requirement
is designed to prevent a split-brain situation.



Cluster Lock
Although a cluster quorum of more than 50% is generally required, exactly 50% of the
previously running nodes may re-form as a new cluster provided that the other 50% of
the previously running nodes do not also re-form. This is guaranteed by the use of a
tie-breaker to choose between the two equal-sized node groups, allowing one group
to form the cluster and forcing the other group to shut down. This tie-breaker is known
as a cluster lock. The cluster lock is implemented either by means of a lock LUN or a
quorum server. A cluster lock is required on two-node clusters.
The cluster lock is used as a tie-breaker only for situations in which a running cluster
fails and, as Serviceguard attempts to form a new cluster, the cluster is split into two
sub-clusters of equal size. Each sub-cluster will attempt to acquire the cluster lock. The
sub-cluster which gets the cluster lock will form the new cluster, preventing the
possibility of two sub-clusters running at the same time. If the two sub-clusters are of
unequal size, the sub-cluster with greater than 50% of the nodes will form the new
cluster, and the cluster lock is not used.
If you have a two-node cluster, you are required to configure a cluster lock. If
communications are lost between these two nodes, the node that obtains the cluster
lock will take over the cluster and the other node will halt (system reset). Without a
cluster lock, a failure of either node in the cluster will cause the other node, and therefore
the cluster, to halt. Note also that if the cluster lock fails during an attempt to acquire
it, the cluster will halt.

Use of a Lock LUN as the Cluster Lock


A lock LUN can be used for clusters up to and including four nodes in size. The cluster
lock LUN is a special piece of storage (known as a partition) that is shareable by all
nodes in the cluster. When a node obtains the cluster lock, this partition is marked so
that other nodes will recognize the lock as “taken.”

NOTE: The lock LUN is dedicated for use as the cluster lock, and, in addition, HP
recommends that this LUN comprise the entire disk; that is, the partition should take
up the entire disk.
The complete path name of the lock LUN is identified in the cluster configuration file.
The operation of the lock LUN is shown in Figure 3-2.



Figure 3-2 Lock LUN Operation

Serviceguard periodically checks the health of the lock LUN and writes messages to
the syslog file if the disk fails the health check. This file should be monitored for early
detection of lock disk problems.
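For illustration, the lock LUN is specified on a per-node basis in the cluster configuration file, with entries that look something like the following (the device names are examples only and may differ from node to node; see Chapter 5 for the configuration procedure):
    NODE_NAME node1
      CLUSTER_LOCK_LUN /dev/sdb1
    NODE_NAME node2
      CLUSTER_LOCK_LUN /dev/sdb1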

Use of the Quorum Server as a Cluster Lock


The cluster lock in Linux can also be implemented by means of a quorum server. A
quorum server can be used in clusters of any size. The quorum server software can be
configured as a Serviceguard package, or standalone, but in either case it must run on
a system outside of the cluster for which it is providing quorum services.
The quorum server listens to connection requests from the Serviceguard nodes on a
known port. The server maintains a special area in memory for each cluster, and when
a node obtains the cluster lock, this area is marked so that other nodes will recognize
the lock as “taken.”
If the quorum server is not available when its tie-breaking services are needed during
a cluster re-formation, the cluster will halt.
The operation of the quorum server is shown in Figure 3-3. When there is a loss of
communication between node 1 and node 2, the quorum server chooses one node (in
this example, node 2) to continue running in the cluster. The other node halts.
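In the cluster configuration file, the quorum server is identified by the QS_HOST parameter together with its timing parameters. The following is a sketch only, with an illustrative host name and values; see “Cluster Configuration Parameters ” (page 105) for the exact parameter names, units, and defaults:

  QS_HOST qshost.example.com
  QS_POLLING_INTERVAL 300000000
  QS_TIMEOUT_EXTENSION 2000000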



Figure 3-3 Quorum Server Operation

A quorum server can provide quorum services for multiple clusters. Figure 3-4 illustrates
quorum server use across four clusters.

Figure 3-4 Quorum Server to Cluster Distribution



IMPORTANT: For more information about the quorum server, see the latest version
of the HP Serviceguard Quorum Server release notes at http://docs.hp.com ->
High Availability -> Quorum Server.

No Cluster Lock
Normally, you should not configure a cluster of three or fewer nodes without a cluster
lock. In two-node clusters, a cluster lock is required. You may consider using no cluster
lock with configurations of three or more nodes, although the decision should be
affected by the fact that any cluster may require tie-breaking. For example, if one node
in a three-node cluster is removed for maintenance, the cluster re-forms as a two-node
cluster. If a tie-breaking scenario later occurs due to a node or communication failure,
the entire cluster will become unavailable.
In a cluster with four or more nodes, you may not need a cluster lock since the chance
of the cluster being split into two halves of equal size is very small. However, be sure
to configure your cluster to prevent the failure of exactly half the nodes at one time.
For example, make sure there is no potential single point of failure such as a single
LAN between equal numbers of nodes, and that you don’t have exactly half of the
nodes on a single power circuit.

What Happens when You Change the Quorum Configuration Online


You can change the quorum configuration while the cluster is up and running. This
includes changes to the quorum method (for example from a lock disk to a quorum
server), the quorum device (for example from one quorum server to another), and the
parameters that govern them (for example the quorum server polling interval). For
more information about the quorum server and lock parameters, see “Cluster
Configuration Parameters ” (page 105).
When you make quorum configuration changes, Serviceguard goes through a two-step
process:
1. All nodes switch to a strict majority quorum (turning off any existing quorum
devices).
2. All nodes switch to the newly configured quorum method, device and parameters.



IMPORTANT: During Step 1, while the nodes are using a strict majority quorum, node
failures can cause the cluster to go down unexpectedly if the cluster has been using a
quorum device before the configuration change. For example, suppose you change the
quorum server polling interval while a two-node cluster is running. If a node fails
during Step 1, the cluster will lose quorum and go down, because a strict majority of
prior cluster members (two out of two in this case) is required. The duration of Step 1
is typically around a second, so the chance of a node failure occurring during that time
is very small.
In order to keep the time interval as short as possible, make sure you are changing only
the quorum configuration, and nothing else, when you apply the change.
If this slight risk of a node failure leading to cluster failure is unacceptable, halt the
cluster before you make the quorum configuration change.
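As a sketch of the procedure (the file name is illustrative, and you should change only the quorum-related parameters in the edited file), an online change from one quorum method or device to another might look like this:

  cmgetconf -c cluster1 clconfig.ascii   # capture the current cluster configuration
  vi clconfig.ascii                      # modify only the quorum parameters
  cmcheckconf -C clconfig.ascii          # verify the proposed change
  cmapplyconf -C clconfig.ascii          # apply it while the cluster is running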

How the Package Manager Works


Packages are the means by which Serviceguard starts and halts configured applications.
A package is a collection of services, disk volumes and IP addresses that are managed
by Serviceguard to ensure they are available.
Each node in the cluster runs an instance of the package manager; the package manager
residing on the cluster coordinator is known as the package coordinator.
The package coordinator does the following:
• Decides when and where to run, halt, or move packages.
The package manager on all nodes does the following:
• Executes the control scripts that run and halt packages and their services.
• Reacts to changes in the status of monitored resources.

Package Types
Three different types of packages can run in the cluster; the most common is the failover
package. There are also special-purpose packages that run on more than one node at
a time, and so do not fail over. They are typically used to manage resources of certain
failover packages.

Non-failover Packages
There are two types of special-purpose packages that do not fail over and that can run
on more than one node at the same time: the system multi-node package, which runs
on all nodes in the cluster, and the multi-node package, which can be configured to
run on all or some of the nodes in the cluster. System multi-node packages are reserved
for use by HP-supplied applications.
The rest of this section describes failover packages.



Failover Packages
A failover package starts up on an appropriate node (see node_name (page 205)) when
the cluster starts. In the case of a service, network, or other resource or dependency
failure, package failover takes place. A package failover involves both halting the
existing package and starting the new instance of the package on a new node.
Failover is shown in the following figure:

Figure 3-5 Package Moving During Failover

Configuring Failover Packages


You configure each package separately. You create a failover package by generating
and editing a package configuration file template, then adding the package to the
cluster configuration database; details are in Chapter 6: “Configuring Packages and
Their Services ” (page 197).
For legacy packages (packages created by the method used on versions of Serviceguard
earlier than A.11.18), you must also create a package control script for each package,
to manage the execution of the package’s services. See “Configuring a Legacy Package”
(page 262) for detailed information.
Customized package control scripts are not needed for modular packages (packages
created by the method introduced in Serviceguard A.11.18). These packages are managed
by a master control script that is installed with Serviceguard; see Chapter 6:
“Configuring Packages and Their Services ” (page 197), for instructions for creating
modular packages.

Deciding When and Where to Run and Halt Failover Packages


The package configuration file assigns a name to the package and includes a list of the
nodes on which the package can run.
Failover packages list the nodes in order of priority (i.e., the first node in the list is the
highest priority node). In addition, failover packages’ files contain three parameters
that determine failover behavior. These are the auto_run parameter, the failover_policy
parameter, and the failback_policy parameter.

Failover Packages’ Switching Behavior


The auto_run parameter (known in earlier versions of Serviceguard as the
PKG_SWITCHING_ENABLED parameter) defines the default global switching attribute
for a failover package at cluster startup: that is, whether Serviceguard can automatically
start the package when the cluster is started, and whether Serviceguard should
automatically restart the package on a new node in response to a failure. Once the
cluster is running, the package switching attribute of each package can be temporarily
set with the cmmodpkg command; at reboot, the configured value will be restored.
The auto_run parameter is set in the package configuration file.
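For example, the following commands (the package and node names are illustrative) adjust switching attributes on a running cluster:

  cmmodpkg -e pkg1             # enable global switching for pkg1
  cmmodpkg -d pkg1             # disable global switching for pkg1
  cmmodpkg -e -n node2 pkg1    # allow pkg1 to run on node2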
A package switch normally involves moving failover packages and their associated IP
addresses to a new system. The new system must already have the same subnet
configured and working properly, otherwise the packages will not be started.

NOTE: It is possible to configure a cluster that spans subnets joined by a router, with
some nodes using one subnet and some another. This is known as a cross-subnet
configuration. In this context, you can configure packages to fail over from a node on
one subnet to a node on another, and you will need to configure a relocatable IP address
for each subnet the package is configured to start on; see “About Cross-Subnet Failover”
(page 147), and in particular the subsection “Implications for Application Deployment”
(page 148).
When a package fails over, TCP connections are lost. TCP applications must reconnect
to regain connectivity; this is not handled automatically. Note that if the package is
dependent on multiple subnets, normally all of them must be available on the target
node before the package will be started. (In a cross-subnet configuration, all the
monitored subnets that are specified for this package, and configured on the target
node, must be up.)
If the package has a dependency on a resource or another package, the dependency
must be met on the target node before the package can start.



The switching of relocatable IP addresses is shown in the figures that follow. Users
connect to each node with the IP address of the package they wish to use. Each node
has a stationary IP address associated with it, and each package has an IP address
associated with it.

Figure 3-6 Before Package Switching

In Figure 3-7, node1 has failed and pkg1 has been transferred to node2. pkg1's IP
address was transferred to node2 along with the package. pkg1 continues to be
available and is now running on node2. Also note that node2 now has access both to
pkg1's disk and pkg2's disk.



NOTE: For design and configuration information about clusters that span subnets,
see the documents listed under “Cross-Subnet Configurations” (page 32).

Figure 3-7 After Package Switching

Failover Policy
The Package Manager selects a node for a failover package to run on based on the
priority list included in the package configuration file together with the failover_policy
parameter, also in the configuration file. The failover policy governs how the package
manager selects which node to run a package on when a specific node has not been
identified and the package needs to be started. This applies not only to failovers but
also to startup for the package, including the initial startup. The two failover policies
are configured_node (the default) and min_package_node. The parameter is set
in the package configuration file.



If you use configured_node as the value for the failover policy, the package will
start up on the highest priority node available in the node list. When a failover occurs,
the package will move to the next highest priority node in the list that is available.
If you use min_package_node as the value for the failover policy, the package will
start up on the node that is currently running the fewest other packages. (Note that
this does not mean the lightest load; the only thing that is checked is the number of
packages currently running on the node.)

Automatic Rotating Standby


Using the min_package_node failover policy, it is possible to configure a cluster that
lets you use one node as an automatic rotating standby node for the cluster. Consider
the following package configuration for a four node cluster. Note that all packages can
run on all nodes and have the same node_name lists. Although the example shows
the node names in a different order for each package, this is not required.
Table 3-1 Package Configuration Data

Package Name    NODE_NAME List                 FAILOVER_POLICY
pkgA            node1, node2, node3, node4     min_package_node
pkgB            node2, node3, node4, node1     min_package_node
pkgC            node3, node4, node1, node2     min_package_node

When the cluster starts, each package starts as shown in Figure 3-8.



Figure 3-8 Rotating Standby Configuration before Failover

If a failure occurs, the failing package would fail over to the node containing fewest
running packages:



Figure 3-9 Rotating Standby Configuration after Failover

NOTE: Under the min_package_node policy, when node2 is repaired and brought
back into the cluster, it will then be running the fewest packages, and thus will become
the new standby node.
If these packages had been set up using the configured_node failover policy, they
would start initially as in Figure 3-8, but the failure of node2 would cause the package
to start on node3, as shown in Figure 3-10.



Figure 3-10 configured_node Policy Packages after Failover

If you use configured_node as the failover policy, the package will start up on the
highest-priority eligible node in its node list. When a failover occurs, the package will
move to the next eligible node in the list, in the configured order of priority.

Failback Policy
The use of the failback_policy parameter allows you to decide whether a package will
return to its primary node if the primary node becomes available and the package is
not currently running on the primary node. The configured primary node is the first
node listed in the package’s node list.
The two possible values for this policy are automatic and manual. The parameter is
set in the package configuration file.
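In a modular package configuration file, the two policies appear as simple parameter settings; for example (one possible combination of values):

  failover_policy    configured_node
  failback_policy    automatic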
As an example, consider the following four-node configuration, in which failover_policy
is set to configured_node and failback_policy is automatic:



Figure 3-11 Automatic Failback Configuration before Failover

Table 3-2 Node Lists in Sample Cluster

Package Name    NODE_NAME List    FAILOVER POLICY    FAILBACK POLICY
pkgA            node1, node4      configured_node    automatic
pkgB            node2, node4      configured_node    automatic
pkgC            node3, node4      configured_node    automatic

node1 panics, and after the cluster reforms, pkgA starts running on node4:



Figure 3-12 Automatic Failback Configuration After Failover

After rebooting, node1 rejoins the cluster. At that point, pkgA will be automatically
stopped on node4 and restarted on node1.



Figure 3-13 Automatic Failback Configuration After Restart of node1

NOTE: Setting the failback_policy to automatic can result in a package failback and
application outage during a critical production period. If you are using automatic
failback, you may want to wait to add the package’s primary node back into the cluster
until you can allow the package to be taken out of service temporarily while it switches
back to the primary node.

On Combining Failover and Failback Policies


Combining a failover_policy of min_package_node with a failback_policy of automatic
can result in a package’s running on a node where you did not expect it to run, since
the node running the fewest packages will probably not be the same host every time
a failover occurs.

Using Older Package Configuration Files


If you are using package configuration files that were generated using a previous
version of Serviceguard, HP recommends you use the cmmakepkg command to open
a new template, and then copy the parameter values into it. In the new template, read
the descriptions and defaults of the choices that did not exist when the original
configuration was made. For example, the default for failover_policy is now
configured_node and the default for failback_policy is now manual.
For full details of the current parameters and their default values, see Chapter 6:
“Configuring Packages and Their Services ” (page 197), and the package configuration
file template itself.

How Packages Run


Packages are the means by which Serviceguard starts and halts configured applications.
Failover packages are also units of failover behavior in Serviceguard. A package is a
collection of services, disk volumes and IP addresses that are managed by Serviceguard
to ensure they are available. There can be a maximum of 300 packages per cluster and
a total of 900 services per cluster.

What Makes a Package Run?


There are three types of packages:
• The failover package is the most common type of package. It runs on one node at
a time. If a failure occurs, it can switch to another node listed in its configuration
file. If switching is enabled for several nodes, the package manager will use the
failover policy to determine where to start the package.
• A system multi-node package runs on all the active cluster nodes at the same time.
It can be started or halted on all nodes, but not on individual nodes.
• A multi-node package can run on several nodes at the same time. If auto_run is set
to yes, Serviceguard starts the multi-node package on all the nodes listed in its
configuration file. It can be started or halted on all nodes, or on individual nodes,
either by user command (cmhaltpkg) or automatically by Serviceguard in response
to a failure of a package component, such as service or subnet.
System multi-node packages are supported only for use by applications supplied by
Hewlett-Packard.
A failover package can be configured to have a dependency on a multi-node or system
multi-node package. The package manager cannot start a package on a node unless
the package it depends on is already up and running on that node.
The package manager will always try to keep a failover package running unless there
is something preventing it from running on any node. The most common reasons for
a failover package not being able to run are that auto_run is disabled so Serviceguard
is not allowed to start the package, that node switching is disabled for the package on
particular nodes, or that the package has a dependency that is not being met. When a
package has failed on one node and is enabled to switch to another node, it will start
up automatically in a new location where its dependencies are met. This process is
known as package switching, or remote switching.



A failover package starts on the first available node in its configuration file; by default,
it fails over to the next available one in the list. Note that you do not necessarily have
to use a cmrunpkg command to restart a failed failover package; in many cases, the
best way is to enable package and/or node switching with the cmmodpkg command.
When you create the package, you indicate the list of nodes on which it is allowed to
run. System multi-node packages must list all cluster nodes in their cluster. Multi-node
packages and failover packages can name some subset of the cluster’s nodes or all of
them.
If the auto_run parameter is set to yes in a package’s configuration file Serviceguard
automatically starts the package when the cluster starts. System multi-node packages
are required to have auto_run set to yes. If a failover package has auto_run set to no,
Serviceguard cannot start it automatically at cluster startup time; you must explicitly
enable this kind of package using the cmmodpkg command.

NOTE: If you configure the package while the cluster is running, the package does
not start up immediately after the cmapplyconf command completes. To start the
package without halting and restarting the cluster, issue the cmrunpkg or cmmodpkg
command.
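For example (the package and node names are illustrative):

  cmrunpkg -n node1 pkg1    # start the package on a specific node
  cmmodpkg -e pkg1          # then re-enable switching so it can fail over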
How does a failover package start up, and what is its behavior while it is running?
Some of the many phases of package life are shown in Figure 3-14.



NOTE: This diagram applies specifically to legacy packages. Differences for modular
scripts are called out below.

Figure 3-14 Legacy Package Time Line Showing Important Events

The following are the most important moments in a package’s life:


1. Before the control script starts. (For modular packages, this is the master control
script.)
2. During run script execution. (For modular packages, during control script execution
to start the package.)
3. While services are running.
4. When a service or subnet fails, or a dependency is not met.
5. During halt script execution. (For modular packages, during control script execution
to halt the package.)
6. When the package or the node is halted with a command.
7. When the node fails.



Before the Control Script Starts
First, a node is selected. This node must be in the package’s node list, it must conform
to the package’s failover policy, and any resources required by the package must be
available on the chosen node. One resource is the subnet that is monitored for the
package. If the subnet is not available, the package cannot start on this node. Another
type of resource is a dependency on another package. If monitoring shows a value for
a configured resource that is outside the permitted range, the package cannot start.
Once a node is selected, a check is then done to make sure the node allows the package
to start on it. Then services are started up for a package by the control script on the
selected node. Strictly speaking, the run script on the selected node is used to start a
legacy package; the master control script starts a modular package.

During Run Script Execution


Once the package manager has determined that the package can start on a particular
node, it launches the script that starts the package (that is, a package’s control script
or master control script is executed with the start parameter). This script carries out the
following steps:
1. Executes any external_pre_scripts (modular packages only; see “About External
Scripts” (page 143))
2. Activates volume groups or disk groups.
3. Mounts file systems.
4. Assigns package IP addresses to the LAN card on the node (failover packages
only).
5. Executes any customer-defined run commands (legacy packages only; see “Adding
Customer Defined Functions to the Package Control Script ” (page 267)) or
external_scripts (modular packages only; see “About External Scripts” (page 143)).
6. Starts each package service.
7. Exits with an exit code of zero (0).



Figure 3-15 Legacy Package Time Line

At any step along the way, an error will result in the script exiting abnormally (with
an exit code of 1). For example, if a package service is unable to be started, the control
script will exit with an error.

NOTE: This diagram is specific to legacy packages. Modular packages also run external
scripts and “pre-scripts” as explained above.
If the run script execution is not complete before the time specified in the
run_script_timeout parameter (page 207), the package manager will kill the script. During
run script execution, messages are written to a log file. For legacy packages, this is in
the same directory as the run script, and has the same name as the run script with the
extension .log. For modular packages, the pathname is determined by the script_log_file
parameter in the package configuration file (page 208)). Normal starts are recorded in
the log, together with error messages or warnings related to starting the package.



NOTE: After the package run script has finished its work, it exits, which means that
the script is no longer executing once the package is running normally. After the script
exits, the PIDs of the services started by the script are monitored by the package manager
directly. If the service dies, the package manager will then run the package halt script
or, if service_fail_fast_enabled (page 216) is set to yes, it will halt the node on which the
package is running. If a number of restarts is specified for a service in the package
control script, the service may be restarted if the restart count allows it, without
re-running the package run script.

Normal and Abnormal Exits from the Run Script


Exit codes on leaving the run script determine what happens to the package next. A
normal exit means the package startup was successful, but all other exits mean that the
start operation did not complete successfully.
• 0—normal exit. The package started normally, so all services are up on this node.
• 1—abnormal exit, also known as no_restart exit. The package did not complete
all startup steps normally. Services are killed, and the package is disabled from
failing over to other nodes.
• 2—alternative exit, also known as restart exit. There was an error, but the
package is allowed to start up on another node. You might use this kind of exit
from a customer defined procedure if there was an error, but starting the package
on another node might succeed. A package with a restart exit is disabled from
running on the local node, but can still run on other nodes.
• Timeout—Another type of exit occurs when the run_script_timeout is
exceeded. In this scenario, the package is killed and disabled globally. It is not
disabled on the current node, however. The package script may not have been
able to clean up some of its resources such as LVM volume groups or package
mount points, so before attempting to start up the package on any node, be sure
to check whether any resources for the package need to be cleaned up.

Service Startup with cmrunserv


Within the package control script, the cmrunserv command starts up the individual
services. This command is executed once for each service that is coded in the file. You
can configure a number of restarts for each service. The cmrunserv command passes
this number to the package manager, which will restart the service the appropriate
number of times if the service should fail. The following are some typical settings in a
legacy package; for more information about configuring services in modular packages,
see the discussion starting with “service_name” (page 214) in Chapter 6, and the comments
in the package configuration template file.
SERVICE_RESTART[0]=" " ; do not restart
SERVICE_RESTART[0]="-r <n>" ; restart as many as <n> times
SERVICE_RESTART[0]="-R" ; restart indefinitely
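For modular packages, the same information is expressed with service_* parameters in the package configuration file rather than in a control script. The following is a sketch only; the service name, command path, and values are illustrative, and the authoritative list of parameters and accepted values is in the package configuration template and in Chapter 6:

  service_name                app_srv
  service_cmd                 "/opt/app/bin/app_daemon"
  service_restart             3
  service_fail_fast_enabled   no
  service_halt_timeout        300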



NOTE: If you set <n> restarts and also set service_fail_fast_enabled to yes, the failfast
will take place after <n> restart attempts have failed. It does not make sense to set
service_restart to “-R” for a service and also set service_fail_fast_enabled to yes.

While Services are Running


During the normal operation of cluster services, the package manager continuously
monitors the following:
• Process IDs of the services
• Subnets configured for monitoring in the package configuration file
If a service fails but the restart parameter for that service is set to a value greater than
0, the service will restart, up to the configured number of restarts, without halting the
package.
During normal operation, while all services are running, you can see the status of the
services in the “Script Parameters” section of the output of the cmviewcl command.

When a Service or Subnet Fails, or a Dependency is Not Met


What happens when something goes wrong? If a service fails and there are no more
restarts, or if a configured dependency on another package is not met, then a failover
package will halt on its current node and, depending on the setting of the package
switching flags, may be restarted on another node. If a multi-node or system multi-node
package fails, all of the packages that have configured a dependency on it will also fail.
Package halting normally means that the package halt script executes (see the next
section). However, if a failover package’s configuration has the service_fail_fast_enabled
flag (page 216) set to yes for the service that fails, then the node will halt as soon as the
failure is detected. If this flag is not set, the loss of a service will result in halting the
package gracefully by running the halt script.
If auto_run (page 206) is set to yes, the package will start up on another eligible node,
if it meets all the requirements for startup. If auto_run is set to no, then the package
simply halts without starting up anywhere else.

NOTE: If a package is dependent on a subnet, and the subnet on the primary node
fails, the package will start to shut down. If the subnet recovers immediately (before
the package is restarted on an adoptive node), the package manager restarts the package
on the same node; no package switch occurs.

When a Package is Halted with a Command


The Serviceguard cmhaltpkg command has the effect of executing the package halt
script, which halts the services that are running for a specific package. This provides
a graceful shutdown of the package that is followed by disabling automatic package
startup (see “auto_run” (page 206)).
You cannot halt a multi-node or system multi-node package unless all the packages
that have a configured dependency on it are down. Use cmviewcl to check the status
of dependents. For example, if pkg1 and pkg2 depend on PKGa, both pkg1 and pkg2
must be halted before you can halt PKGa.

NOTE: If you use the cmhaltpkg command with the -n <nodename> option, the package
is halted only if it is running on that node.
The cmmodpkg command cannot be used to halt a package, but it can disable switching
either on particular nodes or on all nodes. A package can continue running when its
switching has been disabled, but it will not be able to start on other nodes if it stops
running on its current node.

During Halt Script Execution


Once the package manager has detected the failure of a service or package that a failover
package depends on, or when the cmhaltpkg command has been issued for a particular
package, the package manager launches the halt script. That is, a package’s control
script or master control script is executed with the stop parameter. This script carries
out the following steps (also shown in Figure 3-16):
1. Halts all package services.
2. Executes any customer-defined halt commands (legacy packages only) or
external_scripts (modular packages only; see “external_script” (page 220)).
3. Removes package IP addresses from the LAN card on the node.
4. Unmounts file systems.
5. Deactivates volume groups.
6. Exits with an exit code of zero (0).
7. Executes any external_pre_scripts (modular packages only; see “external_pre_script”
(page 220)).



Figure 3-16 Legacy Package Time Line for Halt Script Execution

At any step along the way, an error will result in the script exiting abnormally (with
an exit code of 1). If the halt script execution is not complete before the time specified
in the halt_script_timeout (page 207), the package manager will kill the script. During
halt script execution, messages are written to a log file. For legacy packages, this is in
the same directory as the run script, and has the same name as the run script with the
extension .log. For modular packages, the pathname is determined by the script_log_file
parameter in the package configuration file (page 208). Normal halts are recorded in
the log, together with error messages or warnings related to halting the package.

NOTE: This diagram applies specifically to legacy packages. Differences for modular
scripts are called out above.

Normal and Abnormal Exits from the Halt Script


The package’s ability to move to other nodes is affected by the exit conditions on leaving
the halt script. The following are the possible exit codes:



• 0—normal exit. The package halted normally, so all services are down on this
node.
• 1—abnormal exit, also known as no_restart exit. The package did not halt
normally. Services are killed, and the package is disabled globally. It is not disabled
on the current node, however.
• Timeout—Another type of exit occurs when the halt_script_timeout is exceeded. In
this scenario, the package is killed and disabled globally. It is not disabled on the
current node, however.

Package Control Script Error and Exit Conditions


Table 3-3 shows the possible combinations of error condition, failfast setting and package
movement for failover packages.
Table 3-3 Error Conditions and Package Movement for Failover Packages

Error or Exit       Node Failfast   Service Failfast  Linux Status on   Halt Script Runs  Package Allowed to     Package Allowed to
Code                Enabled         Enabled           Primary after     after Error or    Run on Primary Node    Run on Alternate
                                                      Error             Exit              after Error            Node

Service Failure     Either Setting  YES               system reset      No                N/A (system reset)     Yes
Service Failure     Either Setting  NO                Running           Yes               No                     Yes
Run Script Exit 1   Either Setting  Either Setting    Running           No                Not changed            No
Run Script Exit 2   YES             Either Setting    system reset      No                N/A (system reset)     Yes
Run Script Exit 2   NO              Either Setting    Running           No                No                     Yes
Run Script Timeout  YES             Either Setting    system reset      No                N/A (system reset)     Yes
Run Script Timeout  NO              Either Setting    Running           No                Not changed            No
Halt Script Exit 1  YES             Either Setting    Running           N/A               Yes                    No
Halt Script Exit 1  NO              Either Setting    Running           N/A               Yes                    No
Halt Script         YES             Either Setting    system reset      N/A               N/A (system reset)     Yes, unless the
Timeout                                                                                                          timeout happened
                                                                                                                 after the cmhaltpkg
                                                                                                                 command was executed
Halt Script         NO              Either Setting    Running           N/A               Yes                    No
Timeout
Service Failure     Either Setting  YES               system reset      No                N/A (system reset)     Yes
Service Failure     Either Setting  NO                Running           Yes               No                     Yes
Loss of Network     YES             Either Setting    system reset      No                N/A (system reset)     Yes
Loss of Network     NO              Either Setting    Running           Yes               Yes                    Yes
Package depended    Either Setting  Either Setting    Running           Yes               Yes, when dependency   Yes, if dependency
on failed                                                                                 is again met           is met

How the Network Manager Works


The purpose of the network manager is to detect and recover from network card failures
so that network services remain highly available to clients. In practice, this means
assigning IP addresses for each package to LAN interfaces on the node where the
package is running and monitoring the health of all interfaces, switching them when
necessary.

NOTE: Serviceguard monitors the health of the network interfaces (NICs) and can
monitor the IP level (layer 3) network.

Stationary and Relocatable IP Addresses and Monitored Subnets


Each node (host system) should have an IP address for each active network interface.
This address, known as a stationary IP address, is configured in the file
/etc/sysconfig/network-scripts/ifcfg-<interface> on Red Hat or
/etc/sysconfig/network/ifcfg-<mac_address> on SUSE. The stationary IP
address is not associated with packages, and it is not transferable to another node.
Stationary IP addresses are used to transmit data, heartbeat messages (described under
“How the Cluster Manager Works ” (page 42)), or both. They are configured into the
cluster via the cluster configuration file; see the entries for HEARTBEAT_IP and
STATIONARY_IP under “Cluster Configuration Parameters ” (page 105).
Serviceguard monitors the subnets represented by these IP addresses. They are referred
to as monitored subnets, and you can see their status at any time in the output of the
cmviewcl command; see “Network Status” (page 233) for an example.
You can also configure these subnets to be monitored for packages, using the
monitored_subnet parameter in the package configuration file (page 212). A package will
not start on a node unless the subnet(s) identified by monitored_subnet in its package
configuration file are up and reachable from that node.

IMPORTANT: Any subnet identified as a monitored_subnet in the package configuration
file must be configured into the cluster via NETWORK_INTERFACE and either
STATIONARY_IP or HEARTBEAT_IP in the cluster configuration file. See “Cluster
Configuration Parameters ” (page 105) and “Package Parameter Explanations” (page 204).
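For example, a node’s networking entries in the cluster configuration file might look similar to the following; the interface names and addresses are illustrative:

  NODE_NAME node1
    NETWORK_INTERFACE eth0
    HEARTBEAT_IP 192.168.1.1
    NETWORK_INTERFACE eth1
    STATIONARY_IP 10.10.1.1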
In addition to the stationary IP address, you normally assign one or more unique IP
addresses to each package. The package IP address is assigned to a LAN interface when
the package starts up.
The IP addresses associated with a package are called relocatable IP addresses (also
known as IP aliases, package IP addresses or floating IP addresses) because the
addresses can actually move from one cluster node to another. You can use up to 200
relocatable IP addresses in a cluster spread over as many as 300 packages. These
addresses can be IPv4, IPv6, or a combination of both address families.
Because system multi-node and multi-node packages do not fail over, they do not have
relocatable IP addresses.
A relocatable IP address is like a virtual host IP address that is assigned to a package.
HP recommends that you configure names for each package through DNS (Domain
Name System). A program then can use the package’s name like a host name as the
input to gethostbyname(3), which will return the package’s relocatable IP address.
Relocatable addresses (but not stationary addresses) can be taken over by an adoptive
node if control of the package is transferred. This means that applications can access
the package via its relocatable address without knowing which node the package
currently resides on.



IMPORTANT: Any subnet that is used by a package for relocatable addresses should
be configured into the cluster via NETWORK_INTERFACE and either STATIONARY_IP
or HEARTBEAT_IP in the cluster configuration file. For more information about those
parameters, see “Cluster Configuration Parameters ” (page 105). For more information
about configuring relocatable addresses, see the descriptions of the package ip_
parameters (page 213).

NOTE: It is possible to configure a cluster that spans subnets joined by a router, with
some nodes using one subnet and some another. This is called a cross-subnet
configuration. In this context, you can configure packages to fail over from a node on
one subnet to a node on another, and you will need to configure a relocatable address
for each subnet the package is configured to start on; see “About Cross-Subnet Failover”
(page 147), and in particular the subsection “Implications for Application Deployment”
(page 148).

Types of IP Addresses
Both IPv4 and IPv6 address types are supported in Serviceguard. IPv4 addresses are
the traditional addresses of the form n.n.n.n where n is a decimal digit between 0
and 255. IPv6 addresses have the form x:x:x:x:x:x:x:x where x is the hexadecimal
value of each of eight 16-bit pieces of the 128-bit address. You can define heartbeat IPs,
stationary IPs, and relocatable (package) IPs as IPv4 or IPv6 addresses (or certain
combinations of both).

Adding and Deleting Relocatable IP Addresses


When a package is started, any relocatable IP addresses configured for that package
are added to the specified IP subnet. When the package is stopped, the relocatable IP
address is deleted from the subnet. These functions are performed by the cmmodnet
command in the package master control script (package control script for legacy
packages).
IP addresses are configured only on each primary network interface card. Multiple
IPv4 addresses on the same network card must belong to the same IP subnet.

CAUTION: HP strongly recommends that you add relocatable addresses to packages
only by editing ip_address (page 214) in the package configuration file (or IP [] entries in
the control script of a legacy package) and running cmapplyconf (1m).
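For example, a modular package that starts a relocatable address might contain entries similar to the following sketch (the subnet and address are illustrative); after editing, apply the change with cmapplyconf -P <package_config_file>:

  ip_subnet    192.168.1.0
  ip_address   192.168.1.20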

Load Sharing
Serviceguard allows you to configure several services into a single package, sharing a
single IP address; in that case all those services will fail over when the package does.
If you want to be able to load-balance services (that is, move a specific service to a less
loaded system when necessary) you can do so by putting each service in its own package
and giving it a unique IP address.

Bonding of LAN Interfaces


Several LAN interfaces on a node can be grouped together in a process known in Linux
as channel bonding. In the bonded group, typically one interface is used to transmit
and receive data, while the others are available as backups. If one interface fails, another
interface in the bonded group takes over. HP strongly recommends you use channel
bonding in each critical IP subnet to achieve highly available network services.
Host Bus Adapters (HBAs) do not have to be identical. Ethernet LANs must be the
same type, but can be of different bandwidth (for example 1 Gb and 100 Mb).
Serviceguard for Linux supports the use of bonding of LAN interfaces at the driver
level. The Ethernet driver is configured to employ a group of interfaces.
Once bonding is enabled, the bonded interface can be viewed as a single logical link of multiple
physical ports with only one IP and MAC address. There is no limit to the number of
slaves (ports) per bond, and the number of bonds per system is limited to the number
of Linux modules you can load.
You can bond the ports within a multi-ported networking card (cards with up to four
ports are currently available). Alternatively, you can bond ports from different cards;
HP recommends that you use different cards. Figure 3-17 shows an example of four
separate interfaces bonded into one aggregate.



Figure 3-17 Bonded Network Interfaces

The LANs in the non-bonded configuration have four LAN cards, each associated with
a separate non-aggregated IP address and MAC address, and each with its own LAN
name (eth1, eth2, eth3, eth4). When these ports are aggregated, all four ports are
associated with a single IP address and MAC address. In this example, the aggregated
ports are collectively known as bond0, and this is the name by which the bond is known
during cluster configuration.
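On a Red Hat system, for example, an active-backup (mode 1) bond might be defined with interface configuration files similar to the following sketch. The addresses are illustrative, and the exact file locations and option syntax vary by distribution and release; on some releases the bonding options are set in /etc/modprobe.conf rather than with BONDING_OPTS.

  # /etc/sysconfig/network-scripts/ifcfg-bond0
  DEVICE=bond0
  IPADDR=192.168.1.10
  NETMASK=255.255.255.0
  ONBOOT=yes
  BOOTPROTO=none
  BONDING_OPTS="mode=1 miimon=100"

  # /etc/sysconfig/network-scripts/ifcfg-eth0 (repeat for each slave interface)
  DEVICE=eth0
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes
  BOOTPROTO=none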
Figure 3-18 shows a bonded configuration using redundant hubs with a crossover
cable.



Figure 3-18 Bonded NICs


In the bonding model, individual Ethernet interfaces are slaves, and the bond is the
master. In the basic high availability configuration (mode 1), one slave in a bond assumes
an active role, while the others remain inactive until a failure is detected. (In Figure
3-18, both eth0 slave interfaces are active.) It is important that during configuration,
the active slave interfaces on all nodes are connected to the same hub. If this were not
the case, then normal operation of the LAN would require the use of the crossover
between the hubs and the crossover would become a single point of failure.
After the failure of a card, messages are still carried on the bonded LAN and are received
on the other node, but now eth1 has become active in bond0 on node1. This situation
is shown in Figure 3-19.



Figure 3-19 Bonded NICs After Failure

Various combinations of Ethernet card types (single or dual-ported) and bond groups
are possible, but it is vitally important to remember that at least two physical cards (or
physically separate on-board LAN interfaces) must be used in any combination of
channel bonds to avoid a single point of failure for heartbeat connections.

Bonding for Load Balancing


It is also possible to configure bonds in load balancing mode, which allows all slaves
to transmit data in parallel, in an active/active arrangement. In this case, high availability
is provided by the fact that the bond still continues to function (with less throughput)
if one of the component LANs should fail. Check the bonding documentation to
determine whether your hardware configuration requires Ethernet switches that support
trunking of switch ports, such as the HP ProCurve switch. The bonding driver
configuration must specify mode 0 for the bond type.
An example of this type of configuration is shown in Figure 3-20.



Figure 3-20 Bonded NICs Configured for Load Balancing

Monitoring LAN Interfaces and Detecting Failure: Link Level


At regular intervals, determined by the NETWORK_POLLING_INTERVAL (see “Cluster
Configuration Parameters ” (page 105)), Serviceguard polls all the network interface
cards specified in the cluster configuration file (both bonded and non-bonded). If the
link status of an interface is down, Serviceguard marks the interface, and all subnets
running on it, as down; this is shown in the output of cmviewcl (1m); see “Reporting
Link-Level and IP-Level Failures” (page 82). When the link comes back up, Serviceguard
marks the interface, and all subnets running on it, as up.

Monitoring LAN Interfaces and Detecting Failure: IP Level


Serviceguard can also monitor the IP level, checking Layer 3 health and connectivity
for both IPv4 and IPv6 subnets. This is done by the IP Monitor, which is configurable:
you can enable IP monitoring for any subnet configured into the cluster, but you do
not have to monitor any. You can configure IP monitoring for a subnet, or turn off
monitoring, while the cluster is running.
The IP Monitor:
• Detects when a network interface fails to send or receive IP messages, even though
it is still up at the link level.
• Handles the failure, failover, recovery, and failback.



Reasons To Use IP Monitoring
Beyond the capabilities already provided by link-level monitoring, IP monitoring can:
• Monitor network status beyond the first level of switches; see “How the IP Monitor
Works” (page 79)
• Detect and handle errors such as:
— IP packet corruption on the router or switch
— Link failure between switches and a first-level router
— Inbound failures
— Errors that prevent packets from being received but do not affect the link-level
health of an interface

IMPORTANT: You should configure the IP Monitor in a cross-subnet configuration,
because IP monitoring will detect some errors that link-level monitoring will not. See
also “Cross-Subnet Configurations” (page 32).

How the IP Monitor Works


Using Internet Control Message Protocol (ICMP) and ICMPv6, the IP Monitor sends
polling messages to target IP addresses and verifies that responses are received. When
the IP Monitor detects a failure, it marks the network interface down at the IP level, as
shown in the output of cmviewcl (1m); see “Reporting Link-Level and IP-Level
Failures” (page 82) and “Failure and Recovery Detection Times” (page 81).
The monitor can perform two types of polling:
• Peer polling.
In this case the IP Monitor sends ICMP ECHO messages from each IP address on
a subnet to all other IP addresses on the same subnet on other nodes in the cluster.
• Target polling.
In this case the IP Monitor sends ICMP ECHO messages from each IP address on
a subnet to an external IP address specified in the cluster configuration file; see
POLLING_TARGET under “Cluster Configuration Parameters ” (page 105).
cmquerycl (1m) will detect gateways available for use as polling targets, as
shown in the example below.
Target polling enables monitoring beyond the first level of switches, allowing you
to detect if the route is broken anywhere between each monitored IP address and
the target.

NOTE: In a cross-subnet configuration, nodes can configure peer interfaces on
nodes on the other routed subnet as polling targets.

HP recommends that you configure target polling if the subnet is not private to the
cluster.
The IP Monitor section of the cmquerycl output looks similar to this:

Route Connectivity (no probing was performed):

IPv4:

1 16.89.143.192
16.89.120.0

Possible IP Monitor Subnets:

IPv4:

16.89.112.0 Polling Target 16.89.112.1

IPv6:

3ffe:1000:0:a801:: Polling Target 3ffe:1000:0:a801::254


The IP Monitor section of the cluster configuration file will look similar to the following
for a subnet on which IP monitoring is configured with target polling.

NOTE: This is the default if cmquerycl detects a gateway for the subnet in question;
see SUBNET under “Cluster Configuration Parameters ” (page 105) for more information.

IMPORTANT: By default, cmquerycl does not verify that the gateways it detects will
work correctly for monitoring. But if you use the -w full option, cmquerycl will
validate them as polling targets.
SUBNET 192.168.1.0
IP_MONITOR ON
POLLING_TARGET 192.168.1.254
To configure a subnet for IP monitoring with peer polling, edit the IP Monitor section
of the cluster configuration file to look similar to this:
SUBNET 192.168.2.0
IP_MONITOR ON



NOTE: This is not the default. If cmquerycl does not detect a gateway for the subnet
in question, it sets IP_MONITOR to OFF, disabling IP-level polling for this subnet; if it
does detect a gateway, it populates POLLING_TARGET, enabling target polling. See
SUBNET under “Cluster Configuration Parameters ” (page 105) for more information.
The IP Monitor section of the cluster configuration file will look similar to the following
in the case of a subnet on which IP monitoring is disabled:
SUBNET 192.168.3.0
IP_MONITOR OFF

NOTE: This is the default if cmquerycl does not detect a gateway for the subnet in
question; it is equivalent to having no SUBNET entry for the subnet. See SUBNET under
“Cluster Configuration Parameters ” (page 105) for more information.

Failure and Recovery Detection Times


With the default NETWORK_POLLING_INTERVAL of 2 seconds (see “Cluster
Configuration Parameters ” (page 105)), the IP monitor will detect IP failures typically
within 8–10 seconds for Ethernet and within 16–18 seconds for InfiniBand. Similarly,
with the default NETWORK_POLLING_INTERVAL, the IP monitor will detect the
recovery of an IP address typically within 8–10 seconds for Ethernet and within 16–18
seconds for InfiniBand.

IMPORTANT: HP strongly recommends that you do not change the default
NETWORK_POLLING_INTERVAL value of 2 seconds.
See also “Reporting Link-Level and IP-Level Failures” (page 82).

Constraints and Limitations


• A subnet must be configured into the cluster in order to be monitored.
• Polling targets are not detected beyond the first-level router.
• Polling targets must accept and respond to ICMP (or ICMPv6) ECHO messages.
• A peer IP on the same subnet should not be a polling target because a node can
always ping itself.
The following constraints apply to peer polling when there are only two interfaces on
a subnet:
• If one interface fails, both interfaces and the entire subnet will be marked down
on each node, unless bonding is configured and there is a working standby.
• If the node that has one of the interfaces goes down, the subnet on the other node
will be marked down.



Reporting Link-Level and IP-Level Failures
Any given failure may occur at the link level or the IP level; a failure is reported slightly
differently in the output of cmviewcl (1m) depending on whether link-level or IP
monitoring detects the failure.
If a failure is detected at the link level, output from cmviewcl -v will look
something like this:
Network_Parameters:
INTERFACE STATUS PATH NAME
PRIMARY down (Link and IP) 0/3/1/0 eth2
PRIMARY up 0/5/1/0 eth3
cmviewcl -v -f line will report the same failure like this:
node:gary|interface:eth2|status=down
node:gary|interface:eth2|disabled=false
node:gary|interface:eth2|failure_type=link+ip
If a failure is detected by IP monitoring, output from cmviewcl -v will look
something like this:
Network_Parameters:
INTERFACE STATUS PATH NAME
PRIMARY down (IP only) 0/3/1/0 eth2
PRIMARY up 0/5/1/0 eth3
cmviewcl -v -f line will report the same failure like this:
node:gary|interface:eth2|status=down
node:gary|interface:eth2|disabled=false
node:gary|interface:eth2|failure_type=ip_only

Package Switching and Relocatable IP Addresses


A package switch involves moving the package to a new system. In the most common
configuration, in which all nodes are on the same subnet(s), the package IP (relocatable
IP; see “Stationary and Relocatable IP Addresses and Monitored Subnets” (page 71))
moves as well, and the new system must already have the subnet configured and
working properly, otherwise the packages will not be started.

NOTE: It is possible to configure a cluster that spans subnets joined by a router, with
some nodes using one subnet and some another. This is called a cross-subnet
configuration. In this context, you can configure packages to fail over from a node on
one subnet to a node on another, and you will need to configure a relocatable address
for each subnet the package is configured to start on; see “About Cross-Subnet Failover”
(page 147), and in particular the subsection“Implications for Application Deployment”
(page 148).
When a package switch occurs, TCP connections are lost. TCP applications must
reconnect to regain connectivity; this is not handled automatically. Note that if the
package is dependent on multiple subnets (specified as monitored_subnets in the package
configuration file), all those subnets must normally be available on the target node
before the package will be started. (In a cross-subnet configuration, all subnets
configured on that node, and identified as monitored subnets in the package
configuration file, must be available.)
The switching of relocatable IP addresses is shown in Figure 3-6 and Figure 3-7 .

Address Resolution Messages after Switching on the Same Subnet


When a relocatable IP address is moved to a new interface, either locally or remotely,
an ARP message is broadcast to indicate the new mapping between IP address and
link layer address. An ARP message is sent for each IP address that has been moved.
All systems receiving the broadcast should update the associated ARP cache entry to
reflect the change. Currently, the ARP messages are sent at the time the IP address is
added to the new system. An ARP message is sent in the form of an ARP request. The
sender and receiver protocol address fields of the ARP request message are both set to
the same relocatable IP address. This ensures that nodes receiving the message will
not send replies.
Unlike IPv4, IPv6 addresses use Neighbor Discovery Protocol (NDP) messages to
determine the link-layer addresses of their neighbors.

VLAN Configurations
Virtual LAN configuration (VLAN) is supported in Serviceguard clusters.

What is VLAN?
VLAN is a technology that allows logical grouping of network nodes, regardless of
their physical locations.
VLAN can be used to divide a physical LAN into multiple logical LAN segments or
broadcast domains, helping to reduce broadcast traffic, increase network performance
and security, and improve manageability.
Multiple VLAN interfaces, each with its own IP address, can be configured from a
physical LAN interface; these VLAN interfaces appear to applications as ordinary
network interfaces (NICs). See the documentation for your Linux distribution for more
information on configuring VLAN interfaces.

Support for Linux VLAN


VLAN interfaces can be used as heartbeat as well as data networks in the cluster. The
Network Manager monitors the health of VLAN interfaces configured in the cluster,
and performs remote failover of VLAN interfaces when failure is detected. Failure of
a VLAN interface is typically the result of the failure of the underlying physical NIC
port or Channel Bond interface.



Configuration Restrictions
Linux allows up to 1024 VLANs to be created from a physical NIC port. A large pool
of system resources is required to accommodate such a configuration; Serviceguard
could suffer performance degradation if many network interfaces are configured in
each cluster node. To prevent this and other problems, Serviceguard imposes the
following restrictions:
• A maximum of 30 network interfaces per node is supported. The interfaces can be
physical NIC ports, VLAN interfaces, Channel Bonds, or any combination of these.
• Only port-based and IP-subnet-based VLANs are supported. Protocol-based VLAN
is not supported because Serviceguard does not support any transport protocols
other than TCP/IP.
• Each VLAN interface must be assigned an IP address in a unique subnet.
• Using VLAN in a Wide Area Network cluster is not supported.

Additional Heartbeat Requirements


VLAN technology allows great flexibility in network configuration. To maintain
Serviceguard’s reliability and availability in such an environment, the heartbeat rules
are tightened as follows when the cluster is using VLANs:
1. VLAN heartbeat networks must be configured on separate physical NICs or
Channel Bonds, to avoid single points of failure.
2. Heartbeats are still recommended on all cluster networks, including VLANs.
3. If you are using VLANs, but decide not to use VLANs for heartbeat networks,
heartbeats are recommended for all other physical networks or Channel Bonds
specified in the cluster configuration file.

Volume Managers for Data Storage


A volume manager lets you create units of disk storage that are more flexible than
individual disk partitions. These units can be used on single systems or in
high-availability clusters. HP Serviceguard for Linux uses the Linux Logical Volume
Manager (LVM) which creates redundant storage groups. This section provides an
overview of volume management with LVM. See “Creating the Logical Volume
Infrastructure ” (page 165) in Chapter 5 for information about configuring volume
groups, logical volumes, and file systems for use in Serviceguard packages.
In HP Serviceguard for Linux, the supported shared data storage type is disk arrays
which configure redundant storage in hardware.
In a disk array, the basic element of storage is a LUN, which already provides storage
redundancy via RAID1 or RAID5. Before you can use the LUNs, you must partition
them using fdisk.
In LVM, you manipulate storage in one or more volume groups. A volume group is
built by grouping individual physical volumes. Physical volumes can be disk partitions
or LUNs that have been marked as physical volumes as described below.



You use the pvcreate command to mark each LUN or partition as a physical volume. Then you use
the vgcreate command to create volume groups out of one or more physical volumes.
Once configured, a volume group can be subdivided into logical volumes of different
sizes and types. File systems or databases used by the applications in the cluster are
mounted on these logical volumes. In Serviceguard clusters, volume groups are activated
by package control scripts when an application starts up, and they are deactivated by
package control scripts when the application halts.
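As an illustrative sketch only (device names, volume group and logical volume names,
sizes, and the file system type are examples; see Chapter 5 for the supported
procedure), the sequence might look like this:
fdisk /dev/sdc                       # create a partition, for example /dev/sdc1
pvcreate /dev/sdc1                   # mark the partition as a physical volume
vgcreate vgdatabase /dev/sdc1        # build a volume group from the physical volume
lvcreate -L 10G -n lvol1 vgdatabase  # carve out a logical volume
mkfs -t ext3 /dev/vgdatabase/lvol1   # create a file system for the package to mount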

Storage on Arrays
Figure 3-21 shows LUNs configured on a storage array. Physical disks are configured
by an array utility program into logical units, or LUNs, which are seen by the operating
system.

Figure 3-21 Physical Disks Combined into LUNs

NOTE: LUN definition is normally done using utility programs provided by the disk
array manufacturer. Since arrays vary considerably, you should refer to the
documentation that accompanies your storage unit.
For information about configuring multipathing, see “Multipath for Storage ” (page 96).



Monitoring Disks
Each package configuration includes information about the disks that are to be activated
by the package at startup. If monitoring is used, the health of the disks is checked at
package startup. The package will fail if the disks are not available.
When this happens, the package may be restarted on another node. If auto_run is set
to yes, the package will start up on another eligible node, if it meets all the requirements
for startup. If auto_run is set to no, then the package simply halts without starting up
anywhere else.
The process for configuring disk monitoring is described in “Creating a Disk Monitor
Configuration” (page 228).

More Information on LVM


Refer to the section “Creating the Logical Volume Infrastructure” in Chapter 5 for
details about configuring volume groups, logical volumes, and file systems for use in
Serviceguard packages.
Refer to the article, Logical Volume Manager HOWTO on the Linux Documentation
Project page at http://www.tldp.org for a basic description of Linux LVM.

About Persistent Reservations


As of A.11.19, Serviceguard for Linux packages use persistent reservations (PR)
wherever possible to control access to LUNs. Persistent Reservations, defined by the
SCSI Primary Commands version 3 (SPC-3) standard, provide a means to register I/O
initiators and specify who can access LUN devices (anyone, all registrants, only one
registrant) and how (read-only, write-only).
Unlike exclusive activation for volume groups, which does not prevent unauthorized
access to the underlying LUNs, PR controls access at the LUN level. Registration and
reservation information is stored on the device and enforced by its firmware; this
information persists across device resets and system reboots.

NOTE: Persistent Reservations coexist with, and are independent of, activation
protection of volume groups. You should continue to configure activation protection
as instructed under Enabling Volume Group Activation Protection. Subject to the Rules
and Limitations spelled out below, Persistent Reservations will be applied to the cluster's
LUNs, whether or not the LUNs are configured into volume groups.
Advantages of PR are:



• Consistent behavior.
Whereas different volume managers may implement exclusive activation differently
(or not at all), PR is implemented at the device level and does not depend on
volume-manager support for exclusive activation.
• Packages can control access to LUN devices independently of a volume manager.
Serviceguard's support for the OCFS2 file system and ASM manager allows packages
whose applications use these protocols to access storage devices directly, without
using a volume manager.

Rules and Limitations


Serviceguard automatically implements PR for packages that use LUN storage, subject
to the following constraints:
• The LUN device must support PR and be consistent with the SPC-3 specification.
• PR is not available in legacy multi-node packages.
PR is available in modular multi-node packages, and in both modular and legacy
failover packages.
— All instances of a modular multi-node package must be able to use PR; otherwise
it will be turned off for all instances.
• The package must have access to real devices, not only virtualized ones.
This means that HPVM guests running as cluster nodes cannot use PR.



NOTE: PR is turned off at the cluster level if any node is an HPVM guest.

• Clusters that have nodes that are VMware guests can use PR, with the following
restrictions:
— Two or more VMware guests acting as nodes in the same cluster cannot run
on the same host.
(A cluster can have multiple VMware guests if each is on a separate host; and
a host can have multiple guests if each is in a different cluster.)
— Packages running on VMware guests must use Raw Device Mapping to access
the underlying physical LUNs.

CAUTION: Serviceguard makes and revokes registrations and reservations during
normal package startup and shutdown, or package failover. Serviceguard also provides
a script to clear reservations in the event of a catastrophic cluster failure. You need to
make sure that this script is run in that case; the LUN devices could become unusable
otherwise. See “Revoking Persistent Reservations after a Catastrophic Failure” (page 284)
for more information.

How Persistent Reservations Work


You do not need to do any configuration to enable or activate PR, and in fact you cannot
enable it or disable it, either at the cluster or the package level; Serviceguard makes the
decision for each cluster and package on the basis of the Rules and Limitations described
above.
When you run cmapplyconf (1m) to configure a new cluster, or add a new node,
Serviceguard sets the variable cluster_pr_mode to either pr_enabled or pr_disabled.
• pr_enabled means that packages can in principle use PR, but in practice will do so
only if they meet the conditions spelled out under “Rules and Limitations”.
• pr_disabled means that no packages can use PR because at least one node is an
HPVM guest.
You can see the setting of cluster_pr_mode in the output of cmviewcl -f line; for
example:
...
cluster_pr_mode: pr_enabled

NOTE: You cannot change the setting of cluster_pr_mode.


If a package is qualified to use PR, Serviceguard automatically makes registrations
and reservations for the package's LUNs during package startup, and revokes them
during package shutdown, using the sg_persist command. This
command is available, and has a manpage, on both Red Hat 5 and SUSE SLES 10/11.



Serviceguard makes a PR of type Write Exclusive Registrants Only (WERO) on the
package's LUN devices. This gives read access to any initiator regardless of whether
the initiator is registered or not, but grants write access only to those initiators who are
registered. (WERO is defined in the SPC-3 standard.)
All initiators on each node running the package register with LUN devices using the
same PR Key, known as the node_pr_key. Each node in the cluster has a unique
node_pr_key, which you can see in the output of cmviewcl -f line; for example:
...
node:bla2|node_pr_key=10001
When a failover package starts up, any existing PR keys and reservations are cleared
from the underlying LUN devices first; then the node_pr_key of the node that the package
is starting on is registered with each LUN.
In the case of a multi-node package, the PR reservation is made for the underlying
LUNs by the first instance of the package, and the appropriate node_pr_key is registered
each time the package starts on a new node. If a node fails, the instances of the package
running on other nodes will remove the registrations of the failed node.
You can use cmgetpkgenv (1m) to see whether PR is enabled for a given package;
for example:
cmgetpkgenv pkg1
...
PKG_PR_MODE="pr_enabled"
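If you need to examine the registrations and reservation on a LUN yourself, the
sg_persist command can read them back. This is a sketch only; the device name is an
example:
sg_persist --in --read-keys --device=/dev/sdc
sg_persist --in --read-reservation --device=/dev/sdc
The first command lists the registered keys (the node_pr_key values), and the second
shows the current reservation, which for a Serviceguard package should be of type
Write Exclusive Registrants Only.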

Responses to Failures
Serviceguard responds to different kinds of failures in specific ways. For most hardware
failures, the response is not user-configurable, but for package and service failures,
you can choose the system’s response, within limits.

Reboot When a Node Fails


The most dramatic response to a failure in a Serviceguard cluster is a system reboot.
This allows packages to move quickly to another node, protecting the integrity of the
data.
A reboot is done if a cluster node cannot communicate with the majority of cluster
members for the pre-determined time, or under other circumstances such as a kernel
hang or failure of the cluster daemon (cmcld). When this happens, you may see the
following message on the console:
DEADMAN: Time expired, initiating system restart.
The case is covered in more detail under “What Happens when a Node Times Out”.
See also “Cluster Daemon: cmcld” (page 39).

A reboot is also initiated by Serviceguard itself under specific circumstances; see
“Responses to Package and Service Failures ” (page 92).

What Happens when a Node Times Out


Each node sends a heartbeat message to all other nodes at an interval equal to one-fourth
of the value of the configured MEMBER_TIMEOUT or 1 second, whichever is less. You
configure MEMBER_TIMEOUT in the cluster configuration file; see “Cluster
Configuration Parameters ” (page 105). The heartbeat interval is not directly configurable.
If a node fails to send a heartbeat message within the time set by MEMBER_TIMEOUT,
the cluster is reformed minus the node no longer sending heartbeat messages.
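For example, with a MEMBER_TIMEOUT of 14,000,000 microseconds (14 seconds), one-fourth
of the timeout is 3.5 seconds, so heartbeats are sent every 1 second, the lesser of the
two values.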
When a node detects that another node has failed (that is, no heartbeat message has
arrived within MEMBER_TIMEOUT microseconds), the following sequence of events
occurs:
1. The node contacts the other nodes and tries to re-form the cluster without the
failed node.
2. If the remaining nodes are a majority or can obtain the cluster lock, they form a
new cluster without the failed node.
3. If the remaining nodes are not a majority or cannot get the cluster lock, they halt
(system reset).

Example
Situation. Assume a two-node cluster, with Package1 running on SystemA and
Package2 running on SystemB. Volume group vg01 is exclusively activated on
SystemA; volume group vg02 is exclusively activated on SystemB. Package IP
addresses are assigned to SystemA and SystemB respectively.
Failure. Only one LAN has been configured for both heartbeat and data traffic. During
the course of operations, heavy application traffic monopolizes the bandwidth of the
network, preventing heartbeat packets from getting through.
Since SystemA does not receive heartbeat messages from SystemB, SystemA attempts
to re-form as a one-node cluster. Likewise, since SystemB does not receive heartbeat
messages from SystemA, SystemB also attempts to reform as a one-node cluster.
During the election protocol, each node votes for itself, giving both nodes 50 percent
of the vote. Because both nodes have 50 percent of the vote, both nodes now vie for the
cluster lock. Only one node will get the lock.
Outcome. Assume SystemA gets the cluster lock. SystemA re-forms as a one-node
cluster. After re-formation, SystemA will make sure all applications configured to run
on an existing clustered node are running. When SystemA discovers Package2 is not
running in the cluster it will try to start Package2 if Package2 is configured to run
on SystemA.
SystemB recognizes that it has failed to get the cluster lock and so cannot re-form the
cluster. To release all resources related toPackage2 (such as exclusive access to volume

90 Understanding Serviceguard Software Components


group vg02 and the Package2 IP address) as quickly as possible, SystemB halts
(system reset).

NOTE: If AUTOSTART_CMCLD in /etc/rc.config.d/cmcluster
($SGAUTOSTART) is set to zero, the node will not attempt to join the cluster when it
comes back up.
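As a sketch of what this looks like, the relevant line in the $SGAUTOSTART file would
read as follows if you want the node to rejoin the cluster automatically at boot (set
it to 0 to disable automatic startup):
AUTOSTART_CMCLD=1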
For more information on cluster failover, see the white paper Optimizing Failover Time
in a Serviceguard Environment (version A.11.19 and later) at http://www.docs.hp.com
-> High Availability -> Serviceguard -> White Papers. For
troubleshooting information, see “Cluster Re-formations Caused by
MEMBER_TIMEOUT Being Set too Low” (page 292).

Responses to Hardware Failures


If a serious system problem occurs, such as a system panic or physical disruption of
the SPU's circuits, Serviceguard recognizes a node failure and transfers the packages
currently running on that node to an adoptive node elsewhere in the cluster. (System
multi-node and multi-node packages do not fail over.)
The new location for each package is determined by that package's configuration file,
which lists primary and alternate nodes for the package. Transfer of a package to
another node does not transfer the program counter. Processes in a transferred package
will restart from the beginning. In order for an application to be expeditiously restarted
after a failure, it must be “crash-tolerant”; that is, all processes in the package must be
written so that they can detect such a restart. This is the same application design required
for restart after a normal system crash.
In the event of a LAN interface failure, bonding provides a backup path for IP messages.
If a heartbeat LAN interface fails and no redundant heartbeat is configured, the node
fails with a reboot. If a monitored data LAN interface fails, the node fails with a reboot
only if node_fail_fast_enabled (described further under “Configuring a Package: Next
Steps” (page 150)) is set to yes for the package. Otherwise any packages using that
LAN interface will be halted and moved to another node if possible (unless the LAN
recovers immediately; see “When a Service or Subnet Fails, or a Dependency is Not
Met” (page 67)).
Disk monitoring provides additional protection. You can configure packages to be
dependent on the health of disks, so that when a disk monitor reports a problem, the
package can fail over to another node. See “Creating a Disk Monitor Configuration”
(page 228).
Serviceguard does not respond directly to power failures, although a loss of power to
an individual cluster component may appear to Serviceguard like the failure of that
component, and will result in the appropriate switching behavior. Power protection is
provided by HP-supported uninterruptible power supplies (UPS).

Responses to Package and Service Failures
In the default case, the failure of the package or of a service within a package causes
the package to shut down by running the control script with the stop parameter, and
then restarting the package on an alternate node. A package will also fail if it is
configured to have a dependency on another package, and that package fails.
You can modify this default behavior by specifying that the node should halt (system
reset) before the transfer takes place. You do this by setting failfast parameters in the
package configuration file.
In cases in which package shutdown might hang, leaving the node in an unknown
state, failfast options can provide a quick failover, after which the node will be cleaned
up on reboot. Remember, however, that a system reset causes all packages on the node
to halt abruptly.
The settings of the failfast parameters in the package configuration file determine the
behavior of the package and the node in the event of a package or resource failure:
• If service_fail_fast_enabled (page 216) is set to yes in the package configuration file,
Serviceguard will reboot the node if there is a failure of that specific service.
• If node_fail_fast_enabled (page 206) is set to yes in the package configuration file,
and the package fails, Serviceguard will halt (reboot) the node on which the package
is running.
For more information, see “Package Configuration Planning ” (page 123) and Chapter 6
(page 197).
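For illustration, the corresponding entries in a modular package configuration file
might look like the excerpt below. The values and the service name are examples only;
setting either parameter to yes produces the system reset behavior described above.
node_fail_fast_enabled        no
service_name                  pkg1_service
service_fail_fast_enabled     no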

Service Restarts
You can allow a service to restart locally following a failure. To do this, you indicate a
number of restarts for each service in the package control script. When a service starts,
the variable service_restart is set in the service’s environment. The service, as it executes,
can examine this variable to see whether it has been restarted after a failure, and if so,
it can take appropriate action such as cleanup.

Network Communication Failure


An important element in the cluster is the health of the network itself. As it continuously
monitors the cluster, each node listens for heartbeat messages from the other nodes
confirming that all nodes are able to communicate with each other. If a node does not
hear these messages within the configured amount of time, a node timeout occurs,
resulting in a cluster re-formation and later, if there are still no heartbeat messages
received, a reboot. See “What Happens when a Node Times Out” (page 90).



4 Planning and Documenting an HA Cluster
Building a Serviceguard cluster begins with a planning phase in which you gather and
record information about all the hardware and software components of the
configuration.
This chapter assists you in the following planning areas:
• General Planning
• Hardware Planning (page 94)
• Power Supply Planning (page 97)
• Cluster Lock Planning (page 98)
• Volume Manager Planning (page 99)
• Cluster Configuration Planning (page 100)
• Package Configuration Planning (page 123)
Appendix C (page 319) contains a set of blank worksheets which you may find useful
as an offline record of important details of the configuration.

NOTE: Planning and installation overlap considerably, so you may not be able to
complete the worksheets before you proceed to the actual configuration. In that case,
fill in the missing elements to document the system as you proceed with the
configuration.
Subsequent chapters describe configuration and maintenance tasks in detail.

General Planning
A clear understanding of your high availability objectives will quickly help you to
define your hardware requirements and design your system. Use the following questions
as a guide for general planning:
1. What applications must continue to be available in the event of a failure?
2. What system resources (processing power, networking, SPU, memory, disk space)
are needed to support these applications?
3. How will these resources be distributed among the nodes in the cluster during
normal operation?
4. How will these resources be distributed among the nodes of the cluster in all
possible combinations of failures, especially node failures?
5. How will resources be distributed during routine maintenance of the cluster?
6. What are the networking requirements? Are all networks and subnets available?
7. Have you eliminated all single points of failure? For example:
• network points of failure.
• disk points of failure.

• electrical points of failure.
• application points of failure.

Serviceguard Memory Requirements


Serviceguard requires approximately 15.5 MB of lockable memory.

Planning for Expansion


When you first set up the cluster, you indicate a set of nodes and define a group of
packages for the initial configuration. At a later time, you may wish to add additional
nodes and packages, or you may wish to use additional disk hardware for shared data
storage. If you intend to expand your cluster without having to bring it down, you need
to plan the initial configuration carefully. Use the following guidelines:
• Set the Maximum Configured Packages parameter (described later in this chapter
under “Cluster Configuration Planning ” (page 100)) high enough to accommodate
the additional packages you plan to add.
• Networks should be pre-configured into the cluster configuration if they will be
needed for packages you will add later while the cluster is running. See “LAN
Information ” (page 95).
See Chapter 7: “Cluster and Package Maintenance” (page 229), for more information
about changing the cluster configuration dynamically, that is, while the cluster is
running.

Hardware Planning
Hardware planning requires examining the physical hardware itself. One useful
procedure is to sketch the hardware configuration in a diagram that shows adapter
cards and buses, cabling, disks and peripherals.
You may also find it useful to record the information on the Hardware worksheet
(page 320) indicating which device adapters occupy which slots and updating the details
as you create the cluster configuration. Use one form for each node (server).

SPU Information
SPU information includes the basic characteristics of the server systems you are using
in the cluster.
You may want to record the following on the Hardware worksheet (page 320) :
Server Series Number Enter the series number, for example, DL380 G5.
Host Name Enter the name to be used on the system as the host
name.
Memory Capacity Enter the memory in MB.
Number of I/O slots Indicate the number of slots.



LAN Information
While a minimum of one LAN interface per subnet is required, at least two LAN
interfaces are needed to eliminate single points of network failure.
HP recommends that you configure heartbeats on all subnets, including those to be
used for client data.
Collect the following information for each LAN interface:
Subnet Name The IP address for the subnet. Note that heartbeat IP
addresses must be on the same subnet on each node.
Interface Name The name of the LAN card as used by this node to access
the subnet. This name is shown by ifconfig after you
install the card.
IP Address The IP address to be used on this interface.
An IPv4 address consists of four decimal numbers (octets)
separated by periods, in this form:
nnn.nnn.nnn.nnn
An IPv6 address consists of eight groups of hexadecimal
digits separated by colons, in this form:
xxx:xxx:xxx:xxx:xxx:xxx:xxx:xxx
For more details of IPv6 address format, see Appendix D
(page 327).
Kind of LAN Traffic The purpose of the subnet. Valid types include the
following:
• Heartbeat
• Client Traffic
Label the list to show the subnets that belong to a bridged net.
This information is used in creating the subnet groupings and identifying the IP
addresses used in the cluster and package configuration files.

Shared Storage
SCSI can be used for up to four-node clusters; FibreChannel can be used for clusters
of up to 16 nodes.

FibreChannel
FibreChannel cards can be used to connect up to 16 nodes to a disk array containing
storage. After installation of the cards and the appropriate driver, the LUNs configured
on the storage unit are presented to the operating system as device files, which can be
used to build LVM volume groups.

NOTE: Multipath capabilities are supported by FibreChannel HBA device drivers
and the Linux Device Mapper. Check with the storage device documentation for details.
See also “Multipath for Storage ”.

You can use the worksheet to record the names of the device files that correspond to
each LUN for the Fibre-Channel-attached storage unit.

Multipath for Storage


The method for achieving a multipath solution is dependent on the storage sub-system
attached to the cluster and the Host Bus Adapters (HBAs) in the servers. Please check
the documentation that accompanied your storage sub-system and HBA.
For fibre-channel-attached storage, the multipath function within the HBA driver
should be used, if it is supported by HP. For the QLogic driver, see “Using the QLogic
HBA driver for single-path or multipath failover mode on Linux systems application
note”, which you can find by entering the terms qlogic multipath application
into the search box of www.hp.com.

NOTE: With the rapid evolution of Linux, the multipath mechanisms may change,
or new ones may be added. Serviceguard for Linux supports DeviceMapper multipath
(DM-MPIO) with some restrictions; see the Serviceguard for Linux Certification Matrix
at the address provided in the Preface to this manual for up-to-date information.

NOTE: md also supports software RAID, but this configuration is not currently
supported with Serviceguard for Linux.

Disk I/O Information


You may want to use the Hardware worksheet in Appendix C to record the following
information for each disk connected to each disk device adapter on the node:
Bus Type Indicate the type of bus. Supported buses are SAS (Serial
Attached SCSI) and FibreChannel.
LUN Number Indicate the number of the LUN as defined in the storage
unit.
Slot Number Indicate the slot number(s) into which the SCSI or
FibreChannel interface card(s) are inserted in the backplane
of the computer.
Address Enter the bus hardware path number, which is the numeric
part of the host parameter, which can be seen on the system
by using the following command:
cat /proc/scsi/scsi



Disk Device File Enter the disk device file name for each SCSI disk or LUN.
This information is needed when you create the mirrored disk configuration using
LVM. In addition, it is useful to gather as much information as possible about your
disk configuration.
You can obtain information about available disks by using the following commands;
your system may provide other utilities as well.
• ls /dev/sd* (Smart Array cluster storage)
• ls /dev/hd* (non-SCSI/FibreChannel disks)
• ls /dev/sd* (SCSI and FibreChannel disks)
• du
• df
• mount
• vgdisplay -v
• lvdisplay -v
See the manpages for these commands for information about specific usage. The
commands should be issued from all nodes after installing the hardware and rebooting
the system. The information will be useful when doing LVM and cluster configuration.

Hardware Configuration Worksheet


The hardware configuration worksheet (page 320) will help you organize and record
your specific cluster hardware configuration. Make as many copies as you need.

Power Supply Planning


There are two sources of power for your cluster which you will have to consider in
your design: line power and uninterruptible power supplies (UPS). Loss of a power
circuit should not bring down the cluster.
Frequently, servers, mass storage devices, and other hardware have two or three
separate power supplies, so they can survive the loss of power to one or more power
supplies or power circuits. If a device has redundant power supplies, connect each
power supply to a separate power circuit. This way the failure of a single power circuit
will not cause the complete failure of any critical device in the cluster. For example, if
each device in a cluster has three power supplies, you will need a minimum of three
separate power circuits to eliminate electrical power as a single point of failure for the
cluster. In the case of hardware with only one power supply, no more than half of the
nodes should be on a single power source. If a power source supplies exactly half of
the nodes, it must not also supply the cluster lock LUN or quorum server, or the cluster
will not be able to re-form after a failure. See “Cluster Lock Planning” (page 98) for
more information.
To provide a high degree of availability in the event of power failure, use a separate
UPS at least for each node’s SPU and for the cluster lock disk (if any). If you use a
quorum server, or quorum server cluster, make sure each quorum server node has a
power source separate from that of every cluster it serves. If you use software mirroring,
make sure power supplies are not shared among different physical volume groups;
this allows you to set up mirroring between physical disks that are not only on different
I/O buses, but also connected to different power supplies.
To prevent confusion, label each hardware unit and power supply unit clearly with a
different unit number. Indicate on the Power Supply Worksheet the specific hardware
units you are using and the power supply to which they will be connected. Enter the
following label information on the worksheet:
Host Name Enter the host name for each SPU.
Disk Unit Enter the disk drive unit number for each disk.
Tape Unit Enter the tape unit number for each backup device.
Other Unit Enter the number of any other unit.
Power Supply Enter the power supply unit number of the UPS to which the host
or other device is connected.
Be sure to follow UPS, power circuit, and cabinet power limits as well as SPU power
limits.

Power Supply Configuration Worksheet


The Power Supply Planning worksheet (page 321) will help you organize and record
your specific power supply configuration. Make as many copies as you need.

Cluster Lock Planning


The purpose of the cluster lock is to ensure that only one new cluster is formed in the
event that exactly half of the previously clustered nodes try to form a new cluster. It is
critical that only one new cluster is formed and that it alone has access to the disks
specified in its packages. You can specify a lock LUN or a quorum server as the cluster
lock. For more information about the cluster lock, see“Cluster Lock” (page 45).

NOTE: You cannot use more than one type of lock in the same cluster.

Cluster Lock Requirements


A one-node cluster does not require a lock. Two-node clusters require the use of a
cluster lock, and a lock is recommended for larger clusters as well. Clusters larger than
four nodes can use only a quorum server as the cluster lock.
For information on configuring lock LUNs and the Quorum Server, see “Setting up a
Lock LUN” (page 164), “Specifying a Lock LUN” (page 179), and the HP Serviceguard
Quorum Server Version A.04.00 Release Notes at http://www.docs.hp.com -> High
Availability -> Quorum Server.



Planning for Expansion
Bear in mind that a cluster with more than 4 nodes cannot use a lock LUN. So if you
plan to add enough nodes to bring the total to more than 4, you should use a quorum
server.

Using a Quorum Server


The Quorum Server is described under “Use of the Quorum Server as a Cluster Lock”
(page 46). See also “Cluster Lock” (page 45).
A quorum server:
• Can be used with up to 150 clusters, not exceeding 300 nodes total.
• Can support a cluster with any supported number of nodes.
• Can communicate with the cluster on up to two subnets (a primary and an
alternate).

IMPORTANT: If you plan to use a Quorum Server, make sure you read the HP
Serviceguard Quorum Server Version A.04.00 Release Notes before you proceed. You can
find them at: http://www.docs.hp.com -> High Availability -> Quorum
Server. You should also consult the Quorum Server white papers at the same location.

Quorum Server Worksheet


You can use the Quorum Server Worksheet (page 322) to identify a quorum server for
use with one or more clusters. You may want to record the following:
Quorum Server Host The host name for the quorum server.
IP Address The IP address(es) by which the quorum server will
communicate with the cluster nodes.
Supported Node Names The name (39 characters or fewer) of each cluster node
that will be supported by this quorum server. These
entries will be entered into qs_authfile on the
system that is running the quorum server process.

Volume Manager Planning


When designing your disk layout using LVM, you should consider the following:
• The volume groups that contain high availability applications, services, or data
must be on a bus or buses available to the primary node and all adoptive nodes.
• High availability applications, services, and data should be placed in volume
groups that are separate from non-high availability applications, services, and
data.
• You must group high availability applications, services, and data, whose control
needs to be transferred together, on a single volume group or a series of volume
groups.
• You must not group two different high availability applications, services, or data,
whose control needs to be transferred independently, on the same volume group.
• Your root disk must not belong to a volume group that can be activated on another
node.

Volume Groups and Physical Volume Worksheet


You can organize and record your physical disk configuration by identifying which
physical disks, LUNs, or disk array groups will be used in building each volume group
for use with high availability applications. Use the Volume Group and Physical Volume
worksheet (page 323).

NOTE: HP recommends that you use volume group names other than the default
volume group names (vg01, vg02, etc.). Choosing volume group names that represent
the high availability applications they are associated with (e.g., /dev/vgdatabase)
will simplify cluster administration.

Cluster Configuration Planning


A cluster should be designed to provide the quickest possible recovery from failures.
The actual time required to recover from a failure depends on several factors:
• The length of the MEMBER_TIMEOUT; see the description of this parameter under
“Cluster Configuration Parameters ” for recommendations.
• The design of the run and halt instructions in the package control script. They
should be written for fast execution.
• The application and database recovery time. They should be designed for the
shortest recovery time.
In addition, you must provide consistency across the cluster so that:
• User names are the same on all nodes.
• UIDs are the same on all nodes.
• GIDs are the same on all nodes.
• Applications in the system area are the same on all nodes.
• System time is consistent across the cluster.
• Files that could be used by more than one node, such as /usr or /opt files, must
be the same on all nodes.

Heartbeat Subnet and Cluster Re-formation Time


The speed of cluster re-formation depends on the number of heartbeat subnets.



If the cluster has only a single heartbeat network, and a network card on that network
fails, heartbeats will be lost while the failure is being detected and the IP address is
being switched to a standby interface. The cluster may treat these lost heartbeats as a
failure and re-form without one or more nodes. To prevent this, a minimum
MEMBER_TIMEOUT value of 14 seconds is required for clusters with a single heartbeat
network.
If there is more than one heartbeat subnet, and there is a failure on one of them,
heartbeats will go through another, so you can configure a smaller MEMBER_TIMEOUT
value.

NOTE: For heartbeat configuration requirements, see the discussion of the
HEARTBEAT_IP parameter later in this chapter. For more information about managing
the speed of cluster re-formation, see the discussion of the MEMBER_TIMEOUT
parameter, and further discussion under “What Happens when a Node Times Out”
(page 90), and, for troubleshooting, “Cluster Re-formations Caused by
MEMBER_TIMEOUT Being Set too Low” (page 292).

About Hostname Address Families: IPv4-Only, IPv6-Only, and Mixed Mode


As of A.11.19, Serviceguard supports three possibilities for resolving the nodes'
hostnames (and Quorum Server hostnames, if any) to network address families:
• IPv4-only
• IPv6-only
• Mixed
IPv4-only means that Serviceguard will try to resolve the hostnames to IPv4 addresses
only.

IMPORTANT: You can configure an IPv6 heartbeat, or stationary or relocatable IP
address, in any mode: IPv4-only, IPv6-only, or mixed. You can configure an IPv4
heartbeat, or stationary or relocatable IP address, in IPv4-only or mixed mode.
IPv6-only means that Serviceguard will try to resolve the hostnames to IPv6 addresses
only.
Mixed means that when resolving the hostnames, Serviceguard will try both IPv4 and
IPv6 address families.
You specify the address family the cluster will use in the cluster configuration file (by
setting HOSTNAME_ADDRESS_FAMILY to IPV4, IPV6, or ANY), or by means of the
-a option of cmquerycl (1m); see “Specifying the Address Family for the Cluster
Hostnames” (page 178). The default is IPV4. See the subsections that follow for more
information and important rules and restrictions.
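For example (a sketch only; the node names and output file path are illustrative), you
could generate a configuration template for an IPv6-only cluster with:
cmquerycl -v -a ipv6 -n ftsys9 -n ftsys10 -C $SGCONF/cluster.conf
Omitting -a produces the default IPv4-only behavior; -a any selects mixed mode.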



What Is IPv4–only Mode?
IPv4 is the default mode: unless you specify IPV6 or ANY (either in the cluster
configuration file or via cmquerycl -a) Serviceguard will always try to resolve the
nodes' hostnames (and the Quorum Server's, if any) to IPv4 addresses, and will not try
to resolve them to IPv6 addresses. This means that you must ensure that each hostname
can be resolved to at least one IPv4 address.

NOTE: This applies only to hostname resolution. You can have IPv6 heartbeat and
data LANs no matter what the HOSTNAME_ADDRESS_FAMILY parameter is set to.
(IPv4 heartbeat and data LANs are allowed in IPv4 and mixed mode.)

What Is IPv6-Only Mode?


If you configure IPv6-only mode (HOSTNAME_ADDRESS_FAMILY set to IPV6, or
cmquerycl -a ipv6), then all the hostnames and addresses used by the cluster —
including the heartbeat and stationary and relocatable IP addresses, and Quorum Server
addresses if any — must be or resolve to IPv6 addresses. The single exception to this
is each node's IPv4 loopback address, which cannot be removed from /etc/hosts.

NOTE: How the clients of IPv6-only cluster applications handle hostname resolution
is a matter for the discretion of the system or network administrator; there are no HP
requirements or recommendations specific to this case.
In IPv6-only mode, all Serviceguard daemons will normally use IPv6 addresses for
communication among the nodes, although local (intra-node) communication may
occur on the IPv4 loopback address.
For more information about IPv6, see Appendix D (page 327).

Rules and Restrictions for IPv6-Only Mode

IMPORTANT: See the latest version of the Serviceguard for Linux release notes for
the most current information on these and other restrictions.
• Red Hat 5 clusters are not supported.

NOTE: This also applies if HOSTNAME_ADDRESS_FAMILY is set to ANY; Red
Hat 5 supports only IPv4-only clusters.

• All addresses used by the cluster must be in each node's /etc/hosts file. In
addition, the file must contain the following entry:
::1 localhost ipv6-localhost ipv6-loopback
For more information and recommendations about hostname resolution, see
“Configuring Name Resolution” (page 156).

• All addresses must be IPv6, apart from the node's IPv4 loopback address, which
cannot be removed from /etc/hosts.
• The node's public LAN address (by which it is known to the outside world) must
be the last address listed in /etc/hosts.
Otherwise there is a possibility of the address being used even when it is not
configured into the cluster.
• You must use $SGCONF/cmclnodelist, not ~/.rhosts or /etc/hosts.equiv,
to provide root access to an unconfigured node.

NOTE: This also applies if HOSTNAME_ADDRESS_FAMILY is set to ANY. See
“Allowing Root Access to an Unconfigured Node” (page 155) for more information.

• If you use a Quorum Server, you must make sure that the Quorum Server hostname
(and the alternate Quorum Server address specified by QS_ADDR, if any) resolve
to IPv6 addresses, and you must use Quorum Server version A.04.00 or later. See
the latest Quorum Server release notes for more information; you can find them
at docs.hp.com under High Availability —> Quorum Server.

NOTE: The Quorum Server itself can be an IPv6–only system; in that case it can
serve IPv6–only and mixed-mode clusters, but not IPv4–only clusters.

• If you use a Quorum Server, and the Quorum Server is on a different subnet from
the cluster, you must use an IPv6-capable router.
• Hostname aliases are not supported for IPv6 addresses, because of operating
system limitations.

NOTE: This applies to all IPv6 addresses, whether
HOSTNAME_ADDRESS_FAMILY is set to IPV6 or ANY.

• Cross-subnet configurations are not supported in IPv6-only mode.


• Virtual machines are not supported.
You cannot have a virtual machine that is either a node or a package if
HOSTNAME_ADDRESS_FAMILY is set to ANY or IPV6.



Recommendations for IPv6-Only Mode

IMPORTANT: Check the latest Serviceguard for Linux release notes for the latest
instructions and recommendations.
• If you decide to migrate the cluster to IPv6-only mode, you should plan to do so
while the cluster is down.

What Is Mixed Mode?


If you configure mixed mode (HOSTNAME_ADDRESS_FAMILY set to ANY, or
cmquerycl -a any) then the addresses used by the cluster, including the heartbeat,
and Quorum Server addresses if any, can be IPv4 or IPv6 addresses. Serviceguard will
first try to resolve a node's hostname to an IPv4 address, then, if that fails, will try IPv6.

Rules and Restrictions for Mixed Mode

IMPORTANT: See the latest version of the Serviceguard release notes for the most
current information on these and other restrictions.
• Red Hat 5 clusters are not supported.

NOTE: This also applies if HOSTNAME_ADDRESS_FAMILY is set to IPV6; Red
Hat 5 supports only IPv4-only clusters.

• The hostname resolution file on each node (for example, /etc/hosts) must
contain entries for all the IPv4 and IPv6 addresses used throughout the cluster,
including all STATIONARY_IP and HEARTBEAT_IP addresses as well any private
addresses. There must be at least one IPv4 address in this file (in the case of /etc/
hosts, the IPv4 loopback address cannot be removed). In addition, the file must
contain the following entry:
::1 localhost ipv6-localhost ipv6-loopback
For more information and recommendations about hostname resolution, see
“Configuring Name Resolution” (page 156).
• You must use $SGCONF/cmclnodelist, not ~/.rhosts or /etc/hosts.equiv,
to provide root access to an unconfigured node.
See “Allowing Root Access to an Unconfigured Node” (page 155) for more
information.
• Hostname aliases are not supported for IPv6 addresses, because of operating
system limitations.

NOTE: This applies to all IPv6 addresses, whether
HOSTNAME_ADDRESS_FAMILY is set to IPV6 or ANY.

• Cross-subnet configurations are not supported.
This also applies if HOSTNAME_ADDRESS_FAMILY is set to IPV6. See
“Cross-Subnet Configurations” (page 32) for more information about such
configurations.
• Virtual machines are not supported.
You cannot have a virtual machine that is either a node or a package if
HOSTNAME_ADDRESS_FAMILY is set to ANY or IPV6.

Cluster Configuration Parameters


You need to define a set of cluster parameters. These are stored in the binary cluster
configuration file, which is distributed to each node in the cluster. You configure these
parameters by editing the cluster configuration template file created by means of the
cmquerycl command, as described under “Configuring the Cluster” (page 177).
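To give a sense of what these entries look like, here is an illustrative fragment for
one node. The cluster name, node name, interface names, and addresses are examples
only; the individual parameters are described below.
CLUSTER_NAME            cluster1
NODE_NAME               ftsys9
  NETWORK_INTERFACE     eth0
    HEARTBEAT_IP        192.168.1.9
  NETWORK_INTERFACE     eth1
    STATIONARY_IP       10.10.1.9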

NOTE: See “Reconfiguring a Cluster” (page 251) for a summary of changes you can
make while the cluster is running.
The following parameters must be configured:
CLUSTER_NAME The name of the cluster as it will appear in the
output of cmviewcl and other commands, and
as it appears in the cluster configuration file.
The cluster name must not contain any of the
following characters: space, slash (/), backslash
(\), and asterisk (*).



NOTE: In addition, the following characters
must not be used in the cluster name if you are
using the Quorum Server: at-sign (@), equal-sign
(=), or-sign (|), semicolon (;).
These characters are deprecated, meaning that
you should not use them, even if you are not
using the Quorum Server.

All other characters are legal. The cluster name
can contain up to 39 characters.

CAUTION: Make sure that the cluster name is
unique within the subnets configured on the
cluster nodes; under some circumstances
Serviceguard may not be able to detect a
duplicate name and unexpected problems may
result.
In particular make sure that two clusters with
the same name do not use the same quorum
server; this could result in one of the clusters
failing to obtain the quorum server’s arbitration
services when it needs them, and thus failing to
re-form.

HOSTNAME_ADDRESS_FAMILY Specifies the Internet Protocol address family to
which Serviceguard will try to resolve cluster
node names and Quorum Server host names.
Valid values are IPV4, IPV6, and ANY. The
default is IPV4.
• IPV4 means Serviceguard will try to resolve
the names to IPv4 addresses only.
• IPV6 means Serviceguard will try to resolve
the names to IPv6 addresses only.
• ANY means Serviceguard will try to resolve
the names to both IPv4 and IPv6 addresses.



IMPORTANT: See “About Hostname Address
Families: IPv4-Only, IPv6-Only, and Mixed
Mode” (page 101) for important information. See
also the latest Serviceguard release notes at
docs.hp.com under High Availability
—> Serviceguard for Linux.

QS_HOST The fully-qualified hostname or IP address of a
host system outside the current cluster that is
providing quorum server functionality. It must
be (or resolve to) an IPv4 address on Red Hat 5.
On SLES 10 and 11, it can be (or resolve to) either
an IPv4 or an IPv6 address if
HOSTNAME_ADDRESS_FAMILY is set to ANY,
but otherwise must match the setting of
HOSTNAME_ADDRESS_FAMILY. This
parameter is used only when you employ a
quorum server for tie-breaking services in the
cluster. You can also specify an alternate address
(QS_ADDR) by which the cluster nodes can reach
the quorum server.
For more information, see “Cluster Lock
Planning” (page 98) and “Specifying a Quorum
Server” (page 179). See also “Configuring
Serviceguard to Use the Quorum Server” in the
latest version HP Serviceguard Quorum Server
Version A.04.00 Release Notes, at http://
www.docs.hp.com -> High Availability
-> Quorum Server.

IMPORTANT: See also “About Hostname
Address Families: IPv4-Only, IPv6-Only, and
Mixed Mode” (page 101) for important
information about requirements and restrictions
in an IPv6–only cluster.
Can be changed while the cluster is running; see
“What Happens when You Change the Quorum
Configuration Online” (page 48) for important
information.
QS_ADDR An alternate fully-qualified hostname or IP
address for the quorum server. It must be (or
resolve to) an IPv4 address on Red Hat 5. On
SLES 10 and 11, it can be (or resolve to) either an
IPv4 or an IPv6 address if
HOSTNAME_ADDRESS_FAMILY is set to ANY,
but otherwise must match the setting of
HOSTNAME_ADDRESS_FAMILY. This
parameter is used only if you use a quorum
server and want to specify an address on an
alternate subnet by which it can be reached. On
SLES 10 and 11, the alternate subnet need not
use the same address family as QS_HOST if
HOSTNAME_ADDRESS_FAMILY is set to ANY.
For more information, see “Cluster Lock
Planning” (page 98) and “Specifying a Quorum
Server” (page 179).

IMPORTANT: For special instructions that may
apply to your version of Serviceguard and the
Quorum Server see “Configuring Serviceguard
to Use the Quorum Server” in the latest version
HP Serviceguard Quorum Server Version A.04.00
Release Notes, at http://www.docs.hp.com
-> High Availability -> Quorum
Server.
Can be changed while the cluster is running; see
“What Happens when You Change the Quorum
Configuration Online” (page 48) for important
information.
QS_POLLING_INTERVAL The time (in microseconds) between attempts to
contact the quorum server to make sure it is
running. Default is 300,000,000 microseconds (5
minutes). Minimum is 10,000,000 (10 seconds).
Maximum is 2,147,483,647 (approximately 35
minutes).
Can be changed while the cluster is running; see
“What Happens when You Change the Quorum
Configuration Online” (page 48) for important
information.
QS_TIMEOUT_EXTENSION You can use the QS_TIMEOUT_EXTENSION to
increase the time interval after which the current
connection (or attempt to connect) to the quorum
server is deemed to have failed; but do not do so
until you have read the HP Serviceguard Quorum
Server Version A.04.00 Release Notes, and in
particular the following sections in that
document: “About the QS Polling Interval and
Timeout Extension”, “Network
Recommendations”, and “Setting Quorum Server
Parameters in the Cluster Configuration File”.
Can be changed while the cluster is running; see
“What Happens when You Change the Quorum
Configuration Online” (page 48) for important
information.



NODE_NAME The hostname of each system that will be a node
in the cluster.

CAUTION: Make sure that the node name is
unique within the subnets configured on the
cluster nodes; under some circumstances
Serviceguard may not be able to detect a
duplicate name and unexpected problems may
result.
Do not use the full domain name. For example,
enter ftsys9, not ftsys9.cup.hp.com. A
cluster can contain up to 16 nodes.

IMPORTANT: Node names must be 39
characters or less, and are case-sensitive; for each
node, the node_name in the cluster configuration
file must exactly match the corresponding
node_name in the package configuration file (see
Chapter 6: “Configuring Packages and Their
Services ” (page 197)) and these in turn must
exactly match the hostname portion of the name
specified in the node’s networking configuration.
(Using the above example, ftsys9 must appear
in exactly that form in the cluster configuration
and package configuration files, and as
ftsys9.cup.hp.com in the DNS database).
The parameters immediately following
NODE_NAME in this list
(NETWORK_INTERFACE, HEARTBEAT_IP,
STATIONARY_IP, CLUSTER_LOCK_LUN,
CAPACITY_NAME, and CAPACITY_VALUE)
apply specifically to the node identified by the
preceding NODE_NAME entry.
CLUSTER_LOCK_LUN The pathname of the device file to be used for
the lock LUN on each node. The pathname can
contain up to 39 characters.
See “Setting up a Lock LUN” (page 164) and
“Specifying a Lock LUN” (page 179)
Can be changed while the cluster is running; see
“Updating the Cluster Lock LUN Configuration
Online” (page 261). See also “What Happens
when You Change the Quorum Configuration
Online” (page 48) for important information.
NETWORK_INTERFACE The name of each LAN that will be used for
heartbeats or for user data on the node identified
by the preceding NODE_NAME. An example is
eth0. See also HEARTBEAT_IP,
STATIONARY_IP, and “About Hostname
Address Families: IPv4-Only, IPv6-Only, and
Mixed Mode” (page 101).

NOTE: Any subnet that is configured in this
cluster configuration file as a SUBNET for IP
monitoring purposes, or as a monitored_subnet in
a package configuration file (or SUBNET in a
legacy package; see “Package Configuration
Planning ” (page 123)) must be specified in the
cluster configuration file via
NETWORK_INTERFACE and either
STATIONARY_IP or HEARTBEAT_IP. Similarly,
any subnet that is used by a package for
relocatable addresses should be configured into
the cluster via NETWORK_INTERFACE and
either STATIONARY_IP or HEARTBEAT_IP. For
more information about relocatable addresses,
see “Stationary and Relocatable IP Addresses
and Monitored Subnets” (page 71) and the
descriptions of the package ip_ parameters
(page 213).
For information about changing the configuration
online, see “Changing the Cluster Networking
Configuration while the Cluster Is Running”
(page 257).
HEARTBEAT_IP IP notation indicating this node's connection to
a subnet that will carry the cluster heartbeat.



NOTE: Any subnet that is configured in this
cluster configuration file as a SUBNET for IP
monitoring purposes, or as a monitored_subnet in
a package configuration file (or SUBNET in a
legacy package; see “Package Configuration
Planning ” (page 123)) must be specified in the
cluster configuration file via
NETWORK_INTERFACE and either
STATIONARY_IP or HEARTBEAT_IP. Similarly,
any subnet that is used by a package for
relocatable addresses should be configured into
the cluster via NETWORK_INTERFACE and
either STATIONARY_IP or HEARTBEAT_IP. For
more information about relocatable addresses,
see “Stationary and Relocatable IP Addresses
and Monitored Subnets” (page 71) and the
descriptions of the package ip_ parameters
(page 213).
If HOSTNAME_ADDRESS_FAMILY is set to
IPV4 or ANY, a heartbeat IP address can be either
an IPv4 or an IPv6 address, with the exceptions
noted below. If
HOSTNAME_ADDRESS_FAMILY is set to IPV6,
all heartbeat IP addresses must be IPv6
addresses.
For more details of the IPv6 address format, see
“IPv6 Address Types” (page 327). Heartbeat IP
addresses on a given subnet must all be of the
same type: IPv4 or IPv6 site-local or IPv6 global.
For information about changing the configuration
online, see “Changing the Cluster Networking
Configuration while the Cluster Is Running”
(page 257).
Heartbeat configuration requirements:
The cluster needs at least two network interfaces
for the heartbeat in all cases, using one of the
following minimum configurations:



• two heartbeat subnets;
or
• one heartbeat subnet using bonding in high
availability mode (or mode 1) with two
slaves.
You cannot configure more than one heartbeat
IP address on an interface; only one
HEARTBEAT_IP is allowed for each
NETWORK_INTERFACE.

NOTE: The Serviceguard cmapplyconf,
cmcheckconf, and cmquerycl commands
check that these minimum requirements are met,
and produce a warning if they are not met at the
immediate network level. If you see this warning,
you need to check that the requirements are met
in your overall network configuration.
If you are using virtual machine guests as nodes,
you have a valid configuration (and can ignore
the warning) if there is one heartbeat network
on the guest, backed by a network on the host
using APA with two trunk members (HPVM),
or using NIC bonding as in the second bullet
above (VMware ESX Server).

Considerations for cross-subnet:


IP addresses for a given heartbeat path are
usually on the same subnet on each node, but it
is possible to configure the heartbeat on multiple
subnets such that the heartbeat is carried on one
subnet for one set of nodes and another subnet
for others, with the subnets joined by a router.
This is called a cross-subnet configuration, and
in this case at least two heartbeat paths must be
configured for each cluster node, and each
heartbeat subnet on each node must be physically
routed separately to the heartbeat subnet on
another node (that is, each heartbeat path must
be physically separate). See “Cross-Subnet
Configurations” (page 32).



NOTE: IPv6 heartbeat subnets are not
supported in a cross-subnet configuration.
NOTE: The use of a private heartbeat network
is not advisable if you plan to use Remote
Procedure Call (RPC) protocols and services. RPC
assumes that each network adapter device or I/O
card is connected to a route-able network. An
isolated or private heartbeat LAN is not
route-able, and could cause an RPC
request-reply, directed to that LAN, to time out
without being serviced.
NFS, NIS and NIS+, and CDE are examples of
RPC based applications that are frequently used.
Other third party and home-grown applications
may also use RPC services through the RPC API
libraries. If necessary, consult with the
application vendor.

STATIONARY_IP This node's IP address on each subnet that does
not carry the cluster heartbeat, but is monitored
for packages.



NOTE: Any subnet that is configured in this
cluster configuration file as a SUBNET for IP
monitoring purposes, or as a monitored_subnet in
a package configuration file (or SUBNET in a
legacy package; see “Package Configuration
Planning ” (page 123)) must be specified in the
cluster configuration file via
NETWORK_INTERFACE and either
STATIONARY_IP or HEARTBEAT_IP. Similarly,
any subnet that is used by a package for
relocatable addresses should be configured into
the cluster via NETWORK_INTERFACE and
either STATIONARY_IP or HEARTBEAT_IP. For
more information about relocatable addresses,
see “Stationary and Relocatable IP Addresses
and Monitored Subnets” (page 71) and the
descriptions of the package ip_ parameters
(page 213).
If HOSTNAME_ADDRESS_FAMILY is set to
IPV4 or ANY, a stationary IP address can be either
an IPv4 or an IPv6 address, with the exceptions
noted below. If
HOSTNAME_ADDRESS_FAMILY is set to IPV6,
all the IP addresses used by the cluster must be
IPv6 addresses.
If you want to separate application data from
heartbeat messages, define one or more
monitored non-heartbeat subnets here. You can
identify any number of subnets to be monitored.
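For example, the following sketch adds a monitored non-heartbeat subnet to a node entry in the cluster configuration file; the interface name and address are hypothetical placeholders:
NETWORK_INTERFACE eth3
STATIONARY_IP 10.10.5.1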
A stationary IP address can be either an IPv4 or
an IPv6 address. For more information about
IPv6 addresses, see “IPv6 Address Types”
(page 327).
For information about changing the configuration
online, see “Changing the Cluster Networking
Configuration while the Cluster Is Running”
(page 257).
CAPACITY_NAME, CAPACITY_VALUE Node capacity parameters. Use the CAPACITY_NAME and CAPACITY_VALUE parameters to define a capacity for this node.

Node capacities correspond to package weights;
node capacity is checked against the
corresponding package weight to determine if
the package can run on that node.
CAPACITY_NAME name can be any string that
starts and ends with an alphanumeric character,
and otherwise contains only alphanumeric
characters, dot (.), dash (-), or underscore (_).
Maximum length is 39 characters.
CAPACITY_NAME must be unique in the cluster.
CAPACITY_VALUE specifies a value for the
CAPACITY_NAME that precedes it. It must be a
floating-point value between 0 and 1000000.
Capacity values are arbitrary as far as
Serviceguard is concerned; they have meaning
only in relation to the corresponding package
weights.
Capacity definition is optional, but if
CAPACITY_NAME is specified,
CAPACITY_VALUE must also be specified;
CAPACITY_NAME must come first.

NOTE: cmapplyconf will fail if any node defines a capacity and any package has
min_package_node as its failover_policy
(page 209) or automatic as its failback_policy
(page 209).
To specify more than one capacity for a node,
repeat these parameters for each capacity. You
can specify a maximum of four capacities per
cluster, unless you use the reserved
CAPACITY_NAME package_limit; in that
case, you can use only that capacity throughout
the cluster.
For all capacities other than package_limit,
the default weight for all packages is zero, though
you can specify a different default weight for any
capacity other than package_limit; see the
entry for WEIGHT_NAME and
WEIGHT_DEFAULT later in this list.

See “About Package Weights” (page 134) for more
information.
Can be changed while the cluster is running; will
trigger a warning if the change would cause a
running package to fail.
MEMBER_TIMEOUT The amount of time, in microseconds, after which
Serviceguard declares that the node has failed
and begins re-forming the cluster without this
node.
Default value: 14 seconds (14,000,000
microseconds).
This value leads to a failover time of between approximately 18 and 22 seconds if you are using a quorum server, a Fibre Channel cluster lock, or no cluster lock. Increasing the value to 25 seconds increases the failover time to between approximately 29 and 39 seconds. The time will increase by between 5 and 13 seconds if you are using a SCSI cluster lock or dual Fibre Channel cluster lock.
Maximum supported value: 300 seconds
(300,000,000 microseconds).
If you enter a value greater than 60 seconds
(60,000,000 microseconds), cmcheckconf and
cmapplyconf will note the fact, as confirmation
that you intend to use a large value.
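For example, to set the timeout to 25 seconds, remembering that the value is expressed in microseconds, you would enter:
MEMBER_TIMEOUT 25000000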
Minimum supported values:
• 3 seconds for a cluster with more than one
heartbeat subnet.
• 14 seconds for a cluster that has only one heartbeat LAN.
With the lowest supported value of 3 seconds, a
failover time of 4 to 5 seconds can be achieved.

NOTE: The failover estimates provided here
apply to the Serviceguard component of failover;
that is, the package is expected to be up and
running on the adoptive node in this time, but
the application that the package runs may take
more time to start.
Keep the following guidelines in mind when
deciding how to set the value.
Guidelines: You need to decide whether it's more
important for your installation to have fewer (but
slower) cluster re-formations, or faster (but
possibly more frequent) re-formations:
• To ensure the fastest cluster re-formations,
use the minimum value applicable to your
cluster. But keep in mind that this setting
will lead to a cluster re-formation, and to the
node being removed from the cluster and
rebooted, if a system hang or network load
spike prevents the node from sending a
heartbeat signal within the
MEMBER_TIMEOUT value. More than one
node could be affected if, for example, a
network event such as a broadcast storm
caused kernel interrupts to be turned off on
some or all nodes while the packets are
being processed, preventing the nodes from
sending and processing heartbeat messages.
See “Cluster Re-formations Caused by
MEMBER_TIMEOUT Being Set too Low”
(page 292) for troubleshooting information.
• For fewer re-formations, use a setting in the
range of 10 to 25 seconds (10,000,000 to
25,000,000 microseconds), keeping in mind
that a value larger than the default will lead
to slower re-formations than the default. A
value in this range is appropriate for most
installations.
See also “What Happens when a Node Times
Out” (page 90), “Cluster Daemon: cmcld”
(page 39), and the white paper Optimizing

Failover Time in a Serviceguard Environment (version
A.11.19 and later) on docs.hp.com under High
Availability —> Serviceguard —>
White Papers.
Can be changed while the cluster is running.
AUTO_START_TIMEOUT The amount of time a node waits before it stops
trying to join a cluster during automatic cluster
startup. All nodes wait this amount of time for
other nodes to begin startup before the cluster
completes the operation. The time should be
selected based on the slowest boot time in the
cluster. Enter a value equal to the boot time of
the slowest booting node minus the boot time of
the fastest booting node plus 600 seconds (ten
minutes).
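For example, if the slowest node takes about 420 seconds to boot and the fastest about 180 seconds (hypothetical figures), you would enter (420 - 180) + 600 = 840 seconds, that is, 840,000,000 microseconds.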
Default is 600,000,000 microseconds.
Can be changed while the cluster is running.
NETWORK_POLLING_INTERVAL Specifies how frequently the networks configured
for Serviceguard are checked.
Default is 2,000,000 microseconds (2 seconds).
This means that the network manager will poll
each network interface every 2 seconds, to make
sure it can still send and receive information.
The minimum supported value is 1,000,000 microseconds (1 second) and the maximum supported value is 30,000,000 microseconds (30 seconds).

IMPORTANT: HP strongly recommends using the default. Changing this value can affect how
quickly the link-level and IP-level monitors detect
a network failure. See “Monitoring LAN
Interfaces and Detecting Failure: Link Level”
(page 78).
Can be changed while the cluster is running.

SUBNET IP address of a cluster subnet for which IP
Monitoring can be turned on or off (see
IP_MONITOR). The subnet must be configured
into the cluster, via NETWORK_INTERFACE and
either HEARTBEAT_IP or STATIONARY_IP. All
entries for IP_MONITOR and POLLING_TARGET
apply to this subnet until the next SUBNET entry;
SUBNET must be the first of each trio.
By default, each of the cluster subnets is listed
under SUBNET, and, if at least one gateway is
detected for that subnet, IP_MONITOR is set to
ON and POLLING_TARGET entries are populated
with the gateway addresses, enabling target
polling; otherwise the subnet is listed with
IP_MONITOR set to OFF.
See “Monitoring LAN Interfaces and Detecting
Failure: IP Level” (page 78) for more information.
Can be changed while the cluster is running;
must be removed, with its accompanying
IP_MONITOR and POLLING_TARGET entries,
if the subnet in question is removed from the
cluster configuration.
IP_MONITOR Specifies whether or not the subnet specified in
the preceding SUBNET entry will be monitored
at the IP layer.
To enable IP monitoring for the subnet, set
IP_MONITOR to ON; to disable it, set it to OFF.
By default IP_MONITOR is set to ON if a gateway
is detected for the SUBNET in question, and
POLLING_TARGET entries are populated with
the gateway addresses, enabling target polling.
See the POLLING_TARGET description that
follows for more information.
HP recommends you use target polling because
it enables monitoring beyond the first level of
switches, but if you want to use peer polling
instead, set IP_MONITOR to ON for this SUBNET,
but do not use POLLING_TARGET (comment
out or delete any POLLING_TARGET entries that
are already there).
If a network interface in this subnet fails at the

IP level and IP_MONITOR is set to ON, the
interface will be marked down. If it is set to OFF,
failures that occur only at the IP-level will not be
detected.
Can be changed while the cluster is running;
must be removed if the preceding SUBNET entry
is removed.
POLLING_TARGET The IP address to which polling messages will
be sent from all network interfaces on the subnet
specified in the preceding SUBNET entry, if
IP_MONITOR is set to ON. This is called target
polling.
Each subnet can have multiple polling targets;
repeat POLLING_TARGET entries as needed.
If IP_MONITOR is set to ON, but no
POLLING_TARGET is specified, polling messages
are sent between network interfaces on the same
subnet (peer polling). HP recommends you use
target polling; see “How the IP Monitor Works”
(page 79) for more information.

NOTE: cmquerycl (1m) detects first-level routers in the cluster (by looking for gateways
in each node's routing table) and lists them here
as polling targets. If you run cmquerycl with
the -w full option (for full network probing)
it will also verify that the gateways will work
correctly for monitoring purposes.
Can be changed while the cluster is running;
must be removed if the preceding SUBNET entry
is removed.
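For example, a SUBNET/IP_MONITOR/POLLING_TARGET trio in the cluster configuration file might look like the following sketch, which enables target polling toward a first-level router; the subnet and gateway addresses are hypothetical:
SUBNET 192.168.1.0
IP_MONITOR ON
POLLING_TARGET 192.168.1.254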
WEIGHT_NAME, WEIGHT_DEFAULT Default value for this weight for all packages that can have weight; see “Rules and Guidelines” (page 142) under “About Package Weights” (page 134). WEIGHT_NAME specifies a name for
a weight that exactly corresponds to a
CAPACITY_NAME specified earlier in the cluster
configuration file. (A package has weight; a node
has capacity.) The rules for forming

WEIGHT_NAME are the same as those spelled
out for CAPACITY_NAME earlier in this list.
These parameters are optional, but if they are
defined, WEIGHT_DEFAULT must follow
WEIGHT_NAME, and must be set to a
floating-point value between 0 and 1000000. If
they are not specified for a given weight,
Serviceguard will assume a default value of zero
for that weight. In either case, the default can be
overridden for an individual package via the
weight_name and weight_value parameters in the
package configuration file.
For more information and examples, see
“Defining Weights” (page 139).

IMPORTANT: CAPACITY_NAME,
WEIGHT_NAME, and weight_value must all
match exactly.

NOTE: A weight (WEIGHT_NAME, WEIGHT_DEFAULT) has no meaning on a node
unless a corresponding capacity
(CAPACITY_NAME, CAPACITY_VALUE) is
defined for that node.
For the reserved weight and capacity
package_limit, the default weight is always
one. This default cannot be changed in the cluster
configuration file, but it can be overridden for
an individual package in the package
configuration file.
cmapplyconf will fail if you define a default
for a weight but do not specify a capacity of the
same name for at least one node in the cluster.
You can define a maximum of four
WEIGHT_DEFAULTs per cluster.
Can be changed while the cluster is running.
(Access Control Policies) Specify three things for each policy:
USER_NAME, USER_HOST, and USER_ROLE.
Policies set in the configuration file of a cluster
and its packages must not be conflicting or

redundant. For more information, see
“Controlling Access to the Cluster” (page 183).
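For example, a policy granting read-only (MONITOR) access to a hypothetical user operator1 from any Serviceguard node might look like the following sketch; see “Controlling Access to the Cluster” (page 183) for the legal USER_HOST and USER_ROLE values:
USER_NAME operator1
USER_HOST ANY_SERVICEGUARD_NODE
USER_ROLE MONITOR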
MAX_CONFIGURED_PACKAGES This parameter sets the maximum number of
packages that can be configured in the cluster.
The minimum value is 0, and the maximum
value, which is also the default, is 300.
Can be changed while the cluster is running.

Cluster Configuration: Next Step


When you are ready to configure the cluster, proceed to “Configuring the Cluster”
(page 177). If you find it useful to record your configuration ahead of time, use the
Cluster Configuration worksheet (page 324).

Package Configuration Planning


Planning for packages involves assembling information about each group of highly
available services.

NOTE: As of Serviceguard A.11.18, there is a new and simpler way to configure packages. This method allows you to build packages from smaller modules, and
eliminates the separate package control script and the need to distribute it manually;
see Chapter 6: “Configuring Packages and Their Services ” (page 197), for complete
instructions.
This manual refers to packages created by the newer method as modular packages,
and to packages created by the older method as legacy packages.
The discussion that follows assumes you will be using the modular method. For
information and instructions on creating and maintaining legacy packages, see
“Configuring a Legacy Package” (page 262).

The document Framework for HP Serviceguard Toolkits provides a guide to integrating an application with Serviceguard, and includes a suite of customizable scripts intended
for use with legacy packages. This document is included in the Serviceguard Developer’s
Toolbox, which you can download free of charge from
http://www.hp.com/go/softwaredepot.

NOTE: As of the date of this manual, the Framework for HP Serviceguard Toolkits deals
specifically with legacy packages.

Logical Volume and File System Planning


Use logical volumes in volume groups as the storage infrastructure for package
operations on a cluster. When the package moves from one node to another, it must
still be able to access the same data on the same disk as it did when it was running on

the previous node. This is accomplished by activating the volume group and mounting
the file system that resides on it.
In Serviceguard, high availability applications, services, and data are located in volume
groups that are on a shared bus. When a node fails, the volume groups containing the
applications, services, and data of the failed node are deactivated on the failed node
and activated on the adoptive node (the node the packages move to). In order for this
to happen, you must configure the volume groups so that they can be transferred from
the failed node to the adoptive node.

NOTE: To prevent an operator from accidentally activating volume groups on other nodes in the cluster, versions A.11.16.07 and later of Serviceguard for Linux include a
type of VG activation protection. This is based on the “hosttags” feature of LVM2.
This feature is not mandatory, but HP strongly recommends you implement it as you
upgrade existing clusters and create new ones. See “Enabling Volume Group Activation
Protection” (page 169) for instructions.

As part of planning, you need to decide the following:


• What volume groups are needed?
• How much disk space is required, and how should this be allocated in logical
volumes?
• What file systems need to be mounted for each package?
• Which nodes need to import which logical volume configurations?
• If a package moves to an adoptive node, what effect will its presence have on
performance?
Create a list by package of volume groups, logical volumes, and file systems. Indicate
which nodes need to have access to common file systems at different times.
HP recommends that you use customized logical volume names that are different from
the default logical volume names (lvol1, lvol2, etc.). Choosing logical volume names
that represent the high availability applications that they are associated with (for
example, lvoldatabase) will simplify cluster administration.
To further document your package-related volume groups, logical volumes, and file
systems on each node, you can add commented lines to the /etc/fstab file. The
following is an example for a database application:
# /dev/vg01/lvoldb1 /applic1 ext2 defaults 0 1 # These six entries are
# /dev/vg01/lvoldb2 /applic2 ext2 defaults 0 1 # for information purposes
# /dev/vg01/lvoldb3 raw_tables ignore ignore 0 0 # only. They record the
# /dev/vg01/lvoldb4 /general ext2 defaults 0 2 # logical volumes that
# /dev/vg01/lvoldb5 raw_free ignore ignore 0 0 # exist for Serviceguard's
# /dev/vg01/lvoldb6 raw_free ignore ignore 0 0 # HA package. Do not uncomment.

Create an entry for each logical volume, indicating its use for a file system or for a raw
device.

CAUTION: Do not use /etc/fstab to mount file systems that are used by
Serviceguard packages.
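For modular packages, the volume groups and file systems are instead specified in the package configuration file. A hypothetical sketch, reusing the first volume group and logical volume from the example above (verify the parameter names against the cmmakepkg template for your release):
vg vg01
fs_name /dev/vg01/lvoldb1
fs_directory /applic1
fs_type ext2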
For information about creating, exporting, and importing volume groups, see “Creating
the Logical Volume Infrastructure ” (page 165).

Planning for Expansion


You can add packages to a running cluster. This process is described in Chapter 7:
“Cluster and Package Maintenance” (page 229).
When adding packages, be sure not to exceed the value of max_configured_packages as
defined in the cluster configuration file (see “Cluster Configuration Parameters ”
(page 105)). You can modify this parameter while the cluster is running if you need to.

Choosing Switching and Failover Behavior


To determine the failover behavior of a failover package (see “Package Types”
(page 49)), you define the policy that governs where Serviceguard will automatically
start up a package that is not running. In addition, you define a failback policy that
determines whether a package will be automatically returned to its primary node when
that is possible.
The following table describes different types of failover behavior and the settings in
the package configuration file that determine each behavior. See “Package Parameter
Explanations” (page 204) for more information.

Table 4-1 Package Failover Behavior

Switching Behavior: Package switches normally after detection of service or network failure, or when a configured dependency is not met. Halt script runs before switch takes place. (Default)
Parameters in Configuration File:
• node_fail_fast_enabled set to no. (Default)
• service_fail_fast_enabled set to no for all services. (Default)
• auto_run set to yes for the package. (Default)

Switching Behavior: Package fails over to the node with the fewest active packages.
Parameters in Configuration File:
• failover_policy set to min_package_node.

Switching Behavior: Package fails over to the node that is next on the list of nodes. (Default)
Parameters in Configuration File:
• failover_policy set to configured_node. (Default)

Switching Behavior: Package is automatically halted and restarted on its primary node if the primary node is available and the package is running on a non-primary node.
Parameters in Configuration File:
• failback_policy set to automatic.

Switching Behavior: Package can be manually returned to its primary node if it is running on a non-primary node, but this does not happen automatically.
Parameters in Configuration File:
• failback_policy set to manual. (Default)
• failover_policy set to configured_node. (Default)

Switching Behavior: All packages switch following a system reboot on the node when a specific service fails. Halt scripts are not run.
Parameters in Configuration File:
• service_fail_fast_enabled set to yes for a specific service.
• auto_run set to yes for all packages.

Switching Behavior: All packages switch following a system reboot on the node when any service fails.
Parameters in Configuration File:
• service_fail_fast_enabled set to yes for all services.
• auto_run set to yes for all packages.

About Package Dependencies


A package can have dependencies on other packages, meaning the package will not
start on a node unless the packages it depends on are running on that node.
You can make a package dependent on any other package or packages running on the
same cluster node, subject to the restrictions spelled out in Chapter 6, under
“dependency_condition” (page 210).
As of A.11.19, Serviceguard adds two new capabilities: you can specify broadly where
the package depended on must be running, and you can specify that it must be down.
These capabilities are discussed later in this section under “Extended Dependencies”
(page 132). You should read the next section, “Simple Dependencies” (page 126), first.

Simple Dependencies
A simple dependency occurs when one package requires another to be running on the
same node. You define these conditions by means of the parameters dependency_condition
and dependency_location, using the literal values UP and same_node, respectively. (For

detailed configuration information, see the package parameter definitions starting with
“dependency_name” (page 210). For a discussion of complex dependencies, see “Extended Dependencies” (page 132).)
Make a package dependent on another package if the first package cannot (or should
not) function without the services provided by the second. For example, pkg1 might
run a real-time web interface to a database managed by pkg2. In this case it might
make sense to make pkg1 dependent on pkg2.
In considering whether or not to create a dependency between packages, use the Rules
for Simple Dependencies and Guidelines for Simple Dependencies (page 131) that
follow.
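For example, a sketch of the entries in pkg1's package configuration file that create such a dependency on pkg2 (the dependency_name label is arbitrary; verify the exact condition syntax against the dependency_condition definition (page 210) and the cmmakepkg template):
dependency_name pkg2_up
dependency_condition pkg2 = up
dependency_location same_node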

Rules for Simple Dependencies


Assume that we want to make pkg1 depend on pkg2.

NOTE: pkg1 can depend on more than one other package, and pkg2 can depend on
another package or packages; we are assuming only two packages in order to make
the rules as clear as possible.
• pkg1 will not start on any node unless pkg2 is running on that node.
• pkg1’s package_type (page 205) and failover_policy (page 209) constrain the type and
characteristics of pkg2, as follows:
— If pkg1 is a multi-node package, pkg2 must be a multi-node or system
multi-node package. (Note that system multi-node packages are not supported
for general use.)
— If pkg1 is a failover package and its failover_policy is min_package_node,
pkg2 must be a multi-node or system multi-node package.
— If pkg1 is a failover package and its failover_policy is configured_node, pkg2
must be:
◦ a multi-node or system multi-node package, or
◦ a failover package whose failover_policy is configured_node.
• pkg2 cannot be a failover package whose failover_policy is min_package_node.
• pkg2’s node_name list (page 205) must contain all of the nodes on pkg1’s.
— This means that if pkg1 is configured to run on any node in the cluster (*),
pkg2 must also be configured to run on any node.

NOTE: If pkg1 lists all the nodes, rather than using the asterisk (*), pkg2
must also list them.

— Preferably the nodes should be listed in the same order if the dependency is
between packages whose failover_policy is configured_node; cmcheckconf
and cmapplyconf will warn you if they are not.

• A package cannot depend on itself, directly or indirectly.
That is, not only must pkg1 not specify itself in the dependency_condition (page 210),
but pkg1 must not specify a dependency on pkg2 if pkg2 depends on pkg1, or
if pkg2 depends on pkg3 which depends on pkg1, etc.
• If pkg1 is a failover package and pkg2 is a multi-node or system multi-node
package, and pkg2 fails, pkg1 will halt and fail over to the next node on its
node_name list on which pkg2 is running (and any other dependencies, such as
resource dependencies or a dependency on a third package, are met).
• In the case of failover packages with a configured_node failover_policy, a set of
rules governs under what circumstances pkg1 can force pkg2 to start on a given
node. This is called dragging and is determined by each package’s priority (page 209).
See “Dragging Rules for Simple Dependencies” (page 128).
• If pkg2 fails, Serviceguard will halt pkg1 and any other packages that depend
directly or indirectly on pkg2.
By default, Serviceguard halts packages in dependency order, the dependent
package(s) first, then the package depended on. In our example, pkg1 would be
halted first, then pkg2. If there were a third package, pkg3, that depended on
pkg1, pkg3 would be halted first, then pkg1, then pkg2.
If the halt script for any dependent package hangs, by default the package depended
on will wait forever (pkg2 will wait forever for pkg1, and if there is a pkg3 that depends on pkg1, pkg1 will wait forever for pkg3). You can modify this behavior by means of the successor_halt_timeout parameter (page 208). (The successor of a
package depends on that package; in our example, pkg1 is a successor of pkg2;
conversely pkg2 can be referred to as a predecessor of pkg1.)

Dragging Rules for Simple Dependencies


The priority parameter (page 209) gives you a way to influence the startup, failover, and
failback behavior of a set of failover packages that have a configured_node
failover_policy, when one or more of those packages depend on another or others.
The broad rule is that a higher-priority package can drag a lower-priority package,
forcing it to start on, or move to, a node that suits the higher-priority package.

NOTE: This applies only when the packages are automatically started (package
switching enabled); cmrunpkg will never force a package to halt.
Keep in mind that you do not have to set priority, even when one or more packages
depend on another. The default value, no_priority, may often result in the behavior
you want. For example, if pkg1 depends on pkg2, and priority is set to no_priority
for both packages, and other parameters such as node_name and auto_run are set as
recommended in this section, then pkg1 will normally follow pkg2 to wherever both
can run, and this is the common-sense (and may be the most desirable) outcome.

The following examples express the rules as they apply to two failover packages whose
failover_policy (page 209) is configured_node. Assume pkg1 depends on pkg2, that
node1, node2 and node3 are all specified (in some order) under node_name (page 205)
in the configuration file for each package, and that failback_policy (page 209) is set to
automatic for each package.

NOTE: Keep the following in mind when reading the examples that follow, and when
actually configuring priorities:
1. auto_run (page 206) should be set to yes for all the packages involved; the examples
assume that it is.
2. Priorities express a ranking order, so a lower number means a higher priority (10
is a higher priority than 30, for example).
HP recommends assigning values in increments of 20 so as to leave gaps in the
sequence; otherwise you may have to shuffle all the existing priorities when
assigning priority to a new package.
no_priority, the default, is treated as a lower priority than any numerical value.
3. All packages with no_priority are by definition of equal priority, and there is
no other way to assign equal priorities; a numerical priority must be unique within
the cluster. See “priority” (page 209) for more information.
If pkg1 depends on pkg2, and pkg1’s priority is lower than or equal to pkg2’s, pkg2’s
node order dominates. Assuming pkg2’s node order is node1, node2, node3, then:
• On startup:
— pkg2 will start on node1, or node2 if node1 is not available or does not at
present meet all of its dependencies, etc.
◦ pkg1 will start on whatever node pkg2 has started on (no matter where
that node appears on pkg1’s node_name list) provided all of pkg1’s other
dependencies are met there.
◦ If the node where pkg2 has started does not meet all pkg1’s dependencies,
pkg1 will not start.
• On failover:
— If pkg2 fails on node1, pkg2 will fail over to node2 (or node3 if node2 is not
available or does not currently meet all of its dependencies, etc.)
◦ pkg1 will fail over to whatever node pkg2 has restarted on (no matter where
that node appears on pkg1’s node_name list) provided all of pkg1’s
dependencies are met there.
– If the node where pkg2 has restarted does not meet all pkg1’s
dependencies, pkg1 will not restart.
— If pkg1 fails, pkg1 will not fail over.
This is because pkg1 cannot restart on any adoptive node until pkg2 is running
there, and pkg2 is still running on the original node. pkg1 cannot drag pkg2
because it has insufficient priority to do so.
• On failback:

— If both packages have moved from node1 to node2 and node1 becomes
available, pkg2 will fail back to node1 only if pkg2’s priority is higher than
pkg1’s:
◦ If the priorities are equal, neither package will fail back (unless pkg1 is not
running; in that case pkg2 can fail back).
◦ If pkg2’s priority is higher than pkg1’s, pkg2 will fail back to node1;
pkg1 will fail back to node1 provided all of pkg1’s other dependencies are
met there;
– if pkg2 has failed back to node1 and node1 does not meet all of pkg1’s
dependencies, pkg1 will halt.

If pkg1 depends on pkg2, and pkg1’s priority is higher than pkg2’s, pkg1’s node
order dominates. Assuming pkg1’s node order is node1, node2, node3, then:
• On startup:
— pkg1 will select node1 to start on.
— pkg2 will start on node1, provided it can run there (no matter where node1
appears on pkg2’s node_name list).
◦ If pkg2 is already running on another node, it will be dragged to node1,
provided it can run there.
— If pkg2 cannot start on node1, then both packages will attempt to start on
node2 (and so on).
Note that the nodes will be tried in the order of pkg1’s node_name list, and pkg2
will be dragged to the first suitable node on that list whether or not it is currently
running on another node.
• On failover:
— If pkg1 fails on node1, pkg1 will select node2 to fail over to (or node3 if it
can run there and node2 is not available or does not meet all of its dependencies, etc.)
— pkg2 will be dragged to whatever node pkg1 has selected, and restart there;
then pkg1 will restart there.
• On failback:
— If both packages have moved to node2 and node1 becomes available, pkg1
will fail back to node1 if both packages can run there;
◦ otherwise, neither package will fail back.

Guidelines for Simple Dependencies


As you can see from the above Dragging Rules for Simple Dependencies, if pkg1
depends on pkg2, it can sometimes be a good idea to assign a higher priority to pkg1,
because that provides the best chance for a successful failover (and failback) if pkg1
fails.

But you also need to weigh the relative importance of the packages. If pkg2 runs a
database that is central to your business, you probably want it to run undisturbed, no
matter what happens to application packages that depend on it. In this case, the database
package should have the highest priority.
Note that, if no priorities are set, the dragging rules favor a package that is depended
on over a package that depends on it.
Consider assigning a higher priority to a dependent package if it is about equal in
real-world importance to the package it depends on; otherwise assign the higher priority
to the more important package, or let the priorities of both packages default.
You also need to think about what happens when a package fails. If other packages
depend on it, Serviceguard will halt those packages (and any packages that depend
on them, etc.) This happens regardless of the priority of the failed package.
By default the packages are halted in the reverse of the order in which they were started;
and if the halt script for any of the dependent packages hangs, the failed package will
wait indefinitely to complete its own halt process. This provides the best chance for all
the dependent packages to halt cleanly, but it may not be the behavior you want. You
can change it by means of the successor_halt_timeout parameter (page 208).
If you set successor_halt_timeout to zero, Serviceguard will halt the dependent packages
in parallel with the failed package; if you set it to a positive number, Serviceguard will
halt the packages in the reverse of the start order, but will allow the failed package to
halt after the successor_halt_timeout number of seconds whether or not the dependent
packages have completed their halt scripts.
If you decide to create dependencies between packages, it is a good idea to test
thoroughly, before putting the packages into production, to make sure that package
startup, halt, failover, and failback behavior is what you expect.

Extended Dependencies
To the capabilities provided by Simple Dependencies (page 126), extended dependencies
add the following:
• You can specify whether the package depended on must be running or must be
down.
You define this condition by means of the dependency_condition, using one of the
literals UP or DOWN (the literals can be upper or lower case). We'll refer to the
requirement that another package be down as an exclusionary dependency; see
“Rules for Exclusionary Dependencies” (page 133).
• You can specify where the dependency_condition must be satisfied: on the same
node, a different node, all nodes, or any node in the cluster.
You define this by means of the dependency_location parameter (page 211), using
one of the literals same_node, different_node, all_nodes, or any_node.

different_node and any_node are allowed only if dependency_condition is UP.
all_nodes is allowed only if dependency_condition is DOWN.
See “Rules for different_node and any_node Dependencies” (page 134).
For more information about the dependency_ parameters, see the definitions starting
with “dependency_name” (page 210), and the cmmakepkg (1m) manpage.

IMPORTANT: If you have not already done so, read the discussion of Simple
Dependencies (page 126) before you go on.
The interaction of the legal values of dependency_location and dependency_condition creates
the following possibilities:
• Same-node dependency: a package can require that another package be UP on the
same node.
This is the case covered in the section on Simple Dependencies (page 126).
• Different-node dependency: a package can require that another package be UP
on a different node.
• Any-node dependency: a package can require that another package be UP on any
node in the cluster.
• Same-node exclusion: a package can require that another package be DOWN on
the same node. (But this does not prevent that package from being UP on another
node.)
• All-nodes exclusion: a package can require that another package be DOWN on all
nodes in the cluster.

Rules for Exclusionary Dependencies


• All exclusions must be mutual.
That is, if pkg1 requires pkg2 to be DOWN, pkg2 must also require pkg1 to be
DOWN.
By creating an exclusionary relationship between any two packages, you ensure
that only one of them can be running at any time — either on a given node
(same-node exclusion) or throughout the cluster (all-nodes exclusion). A package
can have an exclusionary relationship with any number of other packages, but
each such relationship must be mutual.
• Priority (discussed in detail under “Dragging Rules for Simple Dependencies”
(page 128)) must be set for at least one of the packages in an exclusionary
relationship.
The higher-priority package can force the lower-priority package to halt or (in the
case of a same-node exclusion) move to another eligible node, if any.

• dependency_location must be either same_node or all_nodes, and must be the
same for both packages.
• Both packages must be failover packages whose failover_policy (page 209) is
configured_node.
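For example, a sketch of a mutual all-nodes exclusion between pkg1 and pkg2; the dependency_name labels are arbitrary, the condition syntax should be verified against the cmmakepkg template, and at least one of the two packages must also be assigned a numerical priority:
In pkg1's package configuration file:
dependency_name pkg2_down
dependency_condition pkg2 = down
dependency_location all_nodes
In pkg2's package configuration file:
dependency_name pkg1_down
dependency_condition pkg1 = down
dependency_location all_nodes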

Rules for different_node and any_node Dependencies


These rules apply to packages whose dependency_condition is UP and whose
dependency_location is different_node or any_node. For same-node dependencies,
see Simple Dependencies (page 126); for exclusionary dependencies, see “Rules for
Exclusionary Dependencies” (page 133).
• Both packages must be failover packages whose failover_policy (page 209) is
configured_node.
• The priority (page 209) of the package depended on must be higher than or equal
to the priority of the dependent package and the priorities of that package's
dependents.
— For example, if pkg1 has a different_node or any_node dependency on
pkg2, pkg2's priority must be higher than or equal to pkg1's priority and the
priority of any package that depends on pkg1 to be UP. pkg2's node order
dominates when Serviceguard is placing the packages.
• A package cannot depend on itself, directly or indirectly.
For example, not only must pkg1 not specify itself in the dependency_condition
(page 210), but pkg1 must not specify a dependency on pkg2 if pkg2 depends on
pkg1, or if pkg2 depends on pkg3 which depends on pkg1, etc.
• “Dragging” rules apply. See “Dragging Rules for Simple Dependencies” (page 128).

About Package Weights


Package weights and node capacities allow you to restrict the number of packages that
can run concurrently on a given node, or, alternatively, to limit the total package
“weight” (in terms of resource consumption) that a node can bear.
For example, suppose you have a two-node cluster consisting of a large system and a
smaller system. You want all your packages to be able to run on the large system at
the same time, but, if the large node fails, you want only the critical packages to run
on the smaller system. Package weights allow you to configure Serviceguard to enforce
this behavior.

Package Weights and Node Capacities


You define a capacity, or capacities, for a node (in the cluster configuration file), and
corresponding weights for packages (in the package configuration file).
Node capacity is consumed by package weights. Serviceguard ensures that the capacity
limit you set for a node is never exceeded by the combined weight of packages running

on it; if a node's available capacity will be exceeded by a package that wants to run on
that node, the package will not run there. This means, for example, that a package
cannot fail over to a node if that node does not currently have available capacity for it,
even if the node is otherwise eligible to run the package — unless the package that
wants to run has sufficient priority to force one of the packages that are currently
running to move; see “How Package Weights Interact with Package Priorities and
Dependencies” (page 143).

Configuring Weights and Capacities


You can configure multiple capacities for nodes, and multiple corresponding weights
for packages, up to four capacity/weight pairs per cluster. This allows you considerable
flexibility in managing package use of each node's resources — but it may be more
flexibility than you need. For this reason Serviceguard provides two methods for
configuring capacities and weights: a simple method and a comprehensive method.
The subsections that follow explain each of these methods.

Simple Method
Use this method if you simply want to control the number of packages that can run on
a given node at any given time. This method works best if all the packages consume
about the same amount of computing resources.
If you need to make finer distinctions between packages in terms of their resource
consumption, use the Comprehensive Method (page 137) instead.
To implement the simple method, use the reserved keyword package_limit to define
each node's capacity. In this case, Serviceguard will allow you to define only this single
type of capacity, and corresponding package weight, in this cluster. Defining package
weight is optional; for package_limit it will default to 1 for all packages, unless you
change it in the package configuration file.

Example 1
For example, to configure a node to run a maximum of ten packages at any one time,
make the following entry under the node's NODE_NAME entry in the cluster
configuration file:
NODE_NAME node1
...
CAPACITY_NAME package_limit
CAPACITY_VALUE 10
Now all packages will be considered equal in terms of their resource consumption,
and this node will never run more than ten packages at one time. (You can change this
behavior if you need to by modifying the weight for some or all packages, as the next
example shows.) Next, define the CAPACITY_NAME and CAPACITY_VALUE
parameters for the remaining nodes, setting CAPACITY_NAME to package_limit

in each case. You may want to set CAPACITY_VALUE to different values for different
nodes. A ten-package capacity might represent the most powerful node, for example,
while the least powerful has a capacity of only two or three.

NOTE: Serviceguard does not require you to define a capacity for each node. If you
define the CAPACITY_NAME and CAPACITY_VALUE parameters for some nodes but
not for others, the nodes for which these parameters are not defined are assumed to
have limitless capacity; in this case, those nodes would be able to run any number of
eligible packages at any given time.
If some packages consume more resources than others, you can use the weight_name
and weight_value parameters to override the default value (1) for some or all packages.
For example, suppose you have three packages, pkg1, pkg2, and pkg3. pkg2 is about
twice as resource-intensive as pkg3 which in turn is about one-and-a-half times as
resource-intensive as pkg1. You could represent this in the package configuration files
as follows:
• For pkg1:
weight_name package_limit
weight_value 2
• For pkg2:
weight_name package_limit
weight_value 6
• For pkg3:
weight_name package_limit
weight_value 3
Now node1, which has a CAPACITY_VALUE of 10 for the reserved CAPACITY_NAME
package_limit, can run any two of the packages at one time, but not all three. If in
addition you wanted to ensure that the larger packages, pkg2 and pkg3, did not run
on node1 at the same time, you could raise the weight_value of one or both so that the
combination exceeded 10 (or reduce node1's capacity to 8).

Points to Keep in Mind


The following points apply specifically to the Simple Method (page 135). Read them in
conjunction with the Rules and Guidelines (page 142), which apply to all weights and
capacities.

• If you use the reserved CAPACITY_NAME package_limit, then this is the only
type of capacity and weight you can define in this cluster.
• If you use the reserved CAPACITY_NAME package_limit, the default weight
for all packages is 1. You can override this default in the package configuration file,
via the weight_name and weight_value parameters, as in the example above.
(The default weight remains 1 for any package to which you do not explicitly
assign a different weight in the package configuration file.)
• If you use the reserved CAPACITY_NAME package_limit, weight_name, if used,
must also be package_limit.
• You do not have to define a capacity for every node; if you don't, the node is
assumed to have unlimited capacity and will be able to run any number of eligible
packages at the same time.
• If you want to define only a single capacity, but you want the default weight to
be zero rather than 1, do not use the reserved name package_limit. Use another
name (for example resource_quantity) and follow the Comprehensive Method.
This is also a good idea if you think you may want to use more than one capacity
in the future.
To learn more about configuring weights and capacities, see the documents listed under
For More Information (page 142).

Comprehensive Method
Use this method if the Simple Method (page 135) does not meet your needs. (Make sure
you have read that section before you proceed.) The comprehensive method works
best if packages consume differing amounts of computing resources, so that simple
one-to-one comparisons between packages are not useful.

IMPORTANT: You cannot combine the two methods. If you use the reserved capacity
package_limit for any node, Serviceguard will not allow you to define any other
type of capacity and weight in this cluster; so you are restricted to the Simple Method
in that case.

Defining Capacities
Begin by deciding what capacities you want to define; you can define up to four different
capacities for the cluster.
You may want to choose names that have common-sense meanings, such as “processor”,
“memory”, or “IO”, to identify the capacities, but you do not have to do so. In fact it
could be misleading to identify single resources, such as “processor”, if packages really
contend for sets of interacting resources that are hard to characterize with a single
name. In any case, the real-world meanings of the names you assign to node capacities
and package weights are outside the scope of Serviceguard. Serviceguard simply

ensures that for each capacity configured for a node, the combined weight of packages
currently running on that node does not exceed that capacity.
For example, if you define a CAPACITY_NAME and weight_name processor, and a
CAPACITY_NAME and weight_name memory, and a node has a processor capacity
of 10 and a memory capacity of 1000, Serviceguard ensures that the combined
processor weight of packages running on the node at any one time does not exceed
10, and that the combined memory weight does not exceed 1000. But Serviceguard has
no knowledge of the real-world meanings of the names processor and memory; there
is no mapping to actual processor and memory usage and you would get exactly the
same results if you used the names apples and oranges, for example.
Suppose you have the following configuration:
• A two-node cluster running four packages. These packages contend for resources we'll simply call A and B.
• node1 has a capacity of 80 for A and capacity of 50 for B.
• node2 has a capacity of 60 for A and capacity of 70 for B.
• pkg1 uses 60 of the A capacity and 15 of the B capacity.
• pkg2 uses 40 of the A capacity and 15 of the B capacity.
• pkg3 uses an insignificant amount (zero) of the A capacity and 35 of the B capacity.
• pkg4 uses 20 of the A capacity and 40 of the B capacity.
pkg1 and pkg2 together require 100 of the A capacity and 30 of the B capacity. This
means pkg1 and pkg2 cannot run together on either of the nodes. While both nodes
have sufficient B capacity to run both packages at the same time, they do not have
sufficient A capacity.
pkg3 and pkg4 together require 20 of the A capacity and 75 of the B capacity. This
means pkg3 and pkg4 cannot run together on either of the nodes. While both nodes
have sufficient A capacity to run both packages at the same time, they do not have
sufficient B capacity.

Example 2
To define these capacities, and set limits for individual nodes, make entries such as the
following in the cluster configuration file:
CLUSTER_NAME cluster_23
...
NODE_NAME node1
...
CAPACITY_NAME A
CAPACITY_VALUE 80
CAPACITY_NAME B

CAPACITY_VALUE 50
NODE_NAME node2
CAPACITY_NAME A
CAPACITY_VALUE 60
CAPACITY_NAME B
CAPACITY_VALUE 70
...

NOTE: You do not have to define capacities for every node in the cluster. If any
capacity is not defined for any node, Serviceguard assumes that node has an infinite
amount of that capacity. In our example, not defining capacity A for a given node would
automatically mean that node could run pkg1 and pkg2 at the same time no matter
what A weights you assign those packages; not defining capacity B would mean the
node could run pkg3 and pkg4 at the same time; and not defining either one would
mean the node could run all four packages simultaneously.
When you have defined the nodes' capacities, the next step is to configure the package
weights; see “Defining Weights”.

Defining Weights
Package weights correspond to node capacities, and for any capacity/weight pair,
CAPACITY_NAME and weight_name must be identical.
You define weights for individual packages in the package configuration file, but you
can also define a cluster-wide default value for a given weight, and, if you do, this
default will specify the weight of all packages that do not explicitly override it in their
package configuration file.

NOTE: There is one exception: system multi-node packages cannot have weight, so
a cluster-wide default weight does not apply to them.

Defining Default Weights


To pursue the example begun under “Defining Capacities” (page 137), let's assume that
all packages other than pkg1 and pkg2 use about the same amount of capacity A, and
all packages other than pkg3 and pkg4 use about the same amount of capacity B. You
can use the WEIGHT_DEFAULT parameter in the cluster configuration file to set defaults
for both weights, as follows.

Example 3
WEIGHT_NAME A
WEIGHT_DEFAULT 20

WEIGHT_NAME B
WEIGHT_DEFAULT 15
This means that any package for which weight A is not defined in its package
configuration file will have a weight A of 20, and any package for which weight B is
not defined in its package configuration file will have a weight B of 15.
Given the capacities we defined in the cluster configuration file (see “Defining
Capacities”), node1 can run any three packages that use the default for both A and B.
This would leave 20 units of spare A capacity on this node, and 5 units of spare B
capacity.

Defining Weights for Individual Packages


For each capacity you define in the cluster configuration file (see “Defining Capacities”)
you have the following choices when it comes to assigning a corresponding weight to
a given package:
1. Configure a cluster-wide default weight and let the package use that default.
2. Configure a cluster-wide default weight but override it for this package in its
package configuration file.
3. Do not configure a cluster-wide default weight, but assign a weight to this package
in its package configuration file.
4. Do not configure a cluster-wide default weight and do not assign a weight for this
package in its package configuration file.

NOTE: Option 4 means that the package is “weightless” as far as this particular
capacity is concerned, and can run even on a node on which this capacity is completely
consumed by other packages.
(You can make a package “weightless” for a given capacity even if you have defined
a cluster-wide default weight; simply set the corresponding weight to zero in the package configuration file.)

Pursuing the example started under “Defining Capacities” (page 137), we can now use
options 1 and 2 to set weights for pkg1 through pkg4.

Example 4
In pkg1's package configuration file:
weight_name A
weight_value 60
In pkg2's package configuration file:
weight_name A
weight_value 40
In pkg3's package configuration file:

weight_name B
weight_value 35
weight_name A
weight_value 0
In pkg4's package configuration file:
weight_name B
weight_value 40

IMPORTANT: weight_name in the package configuration file must exactly match the
corresponding CAPACITY_NAME in the cluster configuration file. This applies to case
as well as spelling: weight_name a would not match CAPACITY_NAME A.
You cannot define a weight unless the corresponding capacity is defined: cmapplyconf
will fail if you define a weight in the package configuration file and no node in the
package's node_name list (page 205) has specified a corresponding capacity in the cluster
configuration file; or if you define a default weight in the cluster configuration file and
no node in the cluster specifies a capacity of the same name.

Some points to notice about this example:


• Since we did not configure a B weight for pkg1 or pkg2, these packages have the
default B weight (15) that we set in the cluster configuration file in Example 3
(page 139). Similarly, pkg4 has the default A weight (20).
• We have configured pkg3 to have a B weight of 35, but no A weight.
• pkg1 will consume all of node2's A capacity; no other package that has A weight
can run on this node while pkg1 is running there.
But node2 could still run pkg3 while running pkg1, because pkg3 has no A weight, and pkg1 is consuming only 15 units (the default) of node2's B capacity, leaving 55 units available to pkg3, which needs only 35 (assuming no other package that has B weight is already running there).
• Similarly, if any package that has A weight is already running on node2, pkg1
will not be able to start there (unless pkg1 has sufficient priority to force another
package or packages to move; see “How Package Weights Interact with Package
Priorities and Dependencies” (page 143)). This is true whenever a package has a
weight that exceeds the available amount of the corresponding capacity on the
node.

Rules and Guidelines
The following rules and guidelines apply to both the Simple Method (page 135) and
the Comprehensive Method (page 137) of configuring capacities and weights.
• You can define a maximum of four capacities, and corresponding weights,
throughout the cluster.

NOTE: But if you use the reserved CAPACITY_NAME package_limit, you can define only that single capacity and corresponding weight. See “Simple
Method” (page 135).

• Node capacity is defined in the cluster configuration file, via the CAPACITY_NAME
and CAPACITY_VALUE parameters.
• Capacities can be added, changed, and deleted while the cluster is running. This
can cause some packages to be moved, or even halted and not restarted.
• Package weight can be defined in the cluster configuration file, via the WEIGHT_NAME and WEIGHT_DEFAULT parameters, or in the package configuration file, via the weight_name and weight_value parameters, or both.
• Weights can be assigned (and WEIGHT_DEFAULTs apply) only to multi-node packages and to failover packages whose failover_policy (page 209) is configured_node and whose failback_policy (page 209) is manual.
• If you define weight (weight_name and weight_value) for a package, make sure you
define the corresponding capacity (CAPACITY_NAME and CAPACITY_VALUE)
in the cluster configuration file for at least one node on the package's node_name
list (page 205). Otherwise cmapplyconf will fail when you try to apply the package.
• Weights (both cluster-wide WEIGHT_DEFAULTs, and weights defined in the
package configuration files) can be changed while the cluster is up and the packages
are running. This can cause some packages to be moved, or even halted and not
restarted.

For More Information


For more information about capacities, see the comments under CAPACITY_NAME
and CAPACITY_VALUE in:
• the cluster configuration file
• the cmquerycl (1m) manpage
• the section “Cluster Configuration Parameters ” (page 105) in this manual.
For more information about weights, see the comments under weight_name and
weight_value in:
• the package configuration file
• the cmmakepkg (1m) manpage
• the section “Package Parameter Explanations” (page 204) in this manual.

For further discussion and use cases, see the white paper Using Serviceguard’s Node
Capacity and Package Weight Feature on docs.hp.com under High Availability
—> Serviceguard —> White Papers.

How Package Weights Interact with Package Priorities and Dependencies


If necessary, Serviceguard will halt a running lower-priority package that has weight
to make room for a higher-priority package that has weight. But a running package
that has no priority (that is, its priority is set to the default, no_priority) will not be
halted to make room for a down package that has no priority. Between two down
packages without priority, Serviceguard will decide which package to start if it cannot
start them both because there is not enough node capacity to support their weight.

Example 1
• pkg1 is configured to run on nodes turkey and griffon. It has a weight of 1
and a priority of 10. It is down and has switching disabled.
• pkg2 is configured to run on nodes turkey and griffon. It has a weight of 1
and a priority of 20. It is running on node turkey and has switching enabled.
• turkey and griffon can run one package each (package_limit is set to 1).
If you enable switching for pkg1, Serviceguard will halt the lower-priority pkg2 on
turkey. It will then start pkg1 on turkey and restart pkg2 on griffon.
If neither pkg1 nor pkg2 had priority, pkg2 would continue running on turkey and
pkg1 would run on griffon.

Example 2
• pkg1 is configured to run on nodes turkey and griffon. It has a weight of 1
and a priority of 10. It is running on node turkey and has switching enabled.
• pkg2 is configured to run on nodes turkey and griffon. It has a weight of 1
and a priority of 20. It is running on node turkey and has switching enabled.
• pkg3 is configured to run on nodes turkey and griffon. It has a weight of 1
and a priority of 30. It is down and has switching disabled.
• pkg3 has a same_node dependency on pkg2
• turkey and griffon can run two packages each (package_limit is set to 2).
If you enable switching for pkg3, it will stay down because pkg2, the package it depends
on, is running on node turkey, which is already running two packages (its capacity
limit). pkg3 has a lower priority than pkg2, so it cannot drag it to griffon where
they both can run.

About External Scripts


As of Serviceguard A.11.18, the package configuration template for modular packages
explicitly provides for external scripts. These replace the CUSTOMER DEFINED
FUNCTIONS in legacy scripts and can be run either:



• On package startup and shutdown, as essentially the first and last functions the
package performs. These scripts are invoked by means of the parameter
external_pre_script (page 220); or
• During package execution, after volume-groups and file systems are activated,
and IP addresses are assigned, and before the service and resource functions are
executed; and again, in the reverse order, on package shutdown. These scripts are
invoked by means of the parameter external_script (page 220).
The scripts are also run when the package is validated by cmcheckconf and
cmapplyconf.
A package can make use of both kinds of script, and can launch more than one of each
kind; in that case the scripts will be executed in the order they are listed in the package
configuration file (and in the reverse order when the package shuts down).
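For example, a package that uses one of each kind might contain entries like the following in its configuration file (the script pathnames are illustrative):
external_pre_script $SGCONF/pkg1/pre_app.sh
external_script $SGCONF/pkg1/app_setup.sh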
Each external script must have three entry points: start, stop, and validate, and
should exit with one of the following values:
• 0 - indicating success.
• 1 - indicating the package will be halted, and should not be restarted, as a result
of failure in this script.
• 2 - indicating the package will be restarted on another node, or halted if no other
node is available.

NOTE: In the case of the validate entry point, exit values 1 and 2 are treated the
same; you can use either to indicate that validation failed.
The script can make use of a standard set of environment variables (including the
package name, SG_PACKAGE, and the name of the local node, SG_NODE) exported by
the package manager or the master control script that runs the package; and can also
call a function to source in a logging function and other utility functions. One of these
functions, sg_source_pkg_env(), provides access to all the parameters configured
for this package, including package-specific environment variables configured via the
pev_ parameter (page 219).

NOTE: Some variables, including SG_PACKAGE, and SG_NODE, are available only at
package run and halt time, not when the package is validated. You can use
SG_PACKAGE_NAME at validation time as a substitute for SG_PACKAGE.
For more information, see the template in $SGCONF/examples/
external_script.template.
A sample script follows. It assumes there is another script called monitor.sh, which
will be configured as a Serviceguard service to monitor some application. The
monitor.sh script (not included here) uses a parameter
PEV_MONITORING_INTERVAL, defined in the package configuration file, to
periodically poll the application it wants to monitor; for example:



PEV_MONITORING_INTERVAL 60
At validation time, the sample script makes sure the PEV_MONITORING_INTERVAL
and the monitoring service are configured properly; at start and stop time it prints out
the interval to the log file.
#!/bin/sh
# Source utility functions.
if [[ -z $SG_UTILS ]]
then
. /etc/cmcluster.conf
SG_UTILS=$SGCONF/scripts/mscripts/utils.sh
fi

if [[ -f ${SG_UTILS} ]]; then
. ${SG_UTILS}
if (( $? != 0 ))
then
echo "ERROR: Unable to source package utility functions file: ${SG_UTILS}"
exit 1
fi
else
echo "ERROR: Unable to find package utility functions file: ${SG_UTILS}"
exit 1
fi

# Get the environment for this package through utility function
# sg_source_pkg_env().
sg_source_pkg_env $*

function validate_command
{

typeset -i ret=0
typeset -i i=0
typeset -i found=0
# check PEV_ attribute is configured and within limits
if [[ -z $PEV_MONITORING_INTERVAL ]]
then
sg_log 0 "ERROR: PEV_MONITORING_INTERVAL attribute not configured!"
ret=1
elif (( PEV_MONITORING_INTERVAL < 1 ))
then
sg_log 0 "ERROR: PEV_MONITORING_INTERVAL value ($PEV_MONITORING_INTERVAL) not within legal
limits!"
ret=1
fi
# check monitoring service we are expecting for this package is configured
while (( i < ${#SG_SERVICE_NAME[*]} ))
do
case ${SG_SERVICE_CMD[i]} in
*monitor.sh*) # found our script
found=1
break
;;
*)
;;
esac
(( i = i + 1 ))
done
if (( found == 0 ))
then
sg_log 0 "ERROR: monitoring service not configured!"
ret=1
fi
if (( ret == 1 ))
then
sg_log 0 "Script validation for $SG_PACKAGE_NAME failed!"
fi
return $ret
}



function start_command
{
sg_log 5 "start_command"
# log current PEV_MONITORING_INTERVAL value, PEV_ attribute can be changed
# while the package is running
sg_log 0 "PEV_MONITORING_INTERVAL for $SG_PACKAGE_NAME is $PEV_MONITORING_INTERVAL"
return 0
}

function stop_command
{

sg_log 5 "stop_command"
# log current PEV_MONITORING_INTERVAL value, PEV_ attribute can be changed
# while the package is running
sg_log 0 "PEV_MONITORING_INTERVAL for $SG_PACKAGE_NAME is $PEV_MONITORING_INTERVAL"
return 0
}
typeset -i exit_val=0
case ${1} in
start)
start_command $*
exit_val=$?
;;
stop)
stop_command $*
exit_val=$?
;;
validate)
validate_command $*
exit_val=$?
;;
*)
sg_log 0 "Unknown entry point $1"
;;
esac
exit $exit_val

Using Serviceguard Commands in an External Script


You can use Serviceguard commands (such as cmmodpkg) in an external script. These
commands must not interact with the package itself (that is, the package that runs the
external script) but can interact with other packages. But be careful how you code these
interactions.
If a Serviceguard command interacts with another package, be careful to avoid command
loops. For instance, a command loop might occur under the following circumstances.
Suppose a pkg1 script does a cmmodpkg -d of pkg2, and a pkg2 script does a
cmmodpkg -d of pkg1. If both pkg1 and pkg2 start at the same time, the pkg1 script
now tries to cmmodpkg pkg2. But that cmmodpkg command has to wait for pkg2
startup to complete. The pkg2 script tries to cmmodpkg pkg1, but pkg2 has to wait for
pkg1 startup to complete, thereby causing a command loop.
To avoid this situation, it is a good idea to specify a run_script_timeout and
halt_script_timeout for all packages, especially packages that use Serviceguard
commands in their external scripts. If a timeout is not specified and your package has
a command loop as described above, inconsistent results can occur, including a hung
cluster.
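For example, a sketch of the relevant entries in a modular package configuration file (the values, in seconds, are illustrative; choose timeouts appropriate to your application):
run_script_timeout 600
halt_script_timeout 600
With timeouts in place, a Serviceguard command that blocks inside an external script cannot hang package startup or shutdown indefinitely.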



Determining Why a Package Has Shut Down
You can use an external script (or CUSTOMER DEFINED FUNCTIONS area of a legacy
package control script) to find out why a package has shut down.
Serviceguard sets the environment variable SG_HALT_REASON in the package control
script to one of the following values when the package halts:
• failure - set if the package halts because of the failure of a subnet, resource, or
service it depends on
• user_halt - set if the package is halted by a cmhaltpkg or cmhaltnode
command, or by corresponding actions in Serviceguard Manager
• automatic_halt - set if the package is failed over automatically because of the
failure of a package it depends on, or is failed back to its primary node
automatically (failback_policy = automatic)
You can add custom code to the package to interrogate this variable, determine why
the package halted, and take appropriate action. For legacy packages, put the code in
the customer_defined_halt_cmds() function in the CUSTOMER DEFINED
FUNCTIONS area of the package control script (see “Adding Customer Defined Functions
to the Package Control Script ” (page 267)); for modular packages, put the code in the
package’s external script (see “About External Scripts” (page 143)).
For example, if a database package is being halted by an administrator
(SG_HALT_REASON set to user_halt) you would probably want the custom code
to perform an orderly shutdown of the database; on the other hand, a forced shutdown
might be needed if SG_HALT_REASON is set to failure, indicating that the
package is halting abnormally (for example because of the failure of a service it depends
on).
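For example, here is a minimal sketch of logic you might add to an external script's stop_command function (the shutdown commands are placeholders for your own procedures):
case ${SG_HALT_REASON} in
user_halt)
    sg_log 0 "Administrative halt: shutting the database down cleanly"
    # run your orderly database shutdown command here
    ;;
failure)
    sg_log 0 "Halt after a failure: forcing database shutdown"
    # run your forced database shutdown command here
    ;;
*)
    ;;
esac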

last_halt_failed Flag
cmviewcl -v -f line displays a last_halt_failed flag.

NOTE: last_halt_failed appears only in the line output of cmviewcl, not the
default tabular format; you must use the -f line option to see it.
The value of last_halt_failed is no if the halt script ran successfully, or has not
run since the node joined the cluster, or has not run since the package was configured
to run on the node; otherwise it is yes.
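For example, to check the flag for a package named pkg1 (the package name is illustrative, and the exact field layout of the line-format output may vary between releases):
cmviewcl -v -f line -p pkg1 | grep last_halt_failed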

About Cross-Subnet Failover


It is possible to configure a cluster that spans subnets joined by a router, with some
nodes using one subnet and some another. This is known as a cross-subnet
configuration; see “Cross-Subnet Configurations” (page 32). In this context, you can
configure packages to fail over from a node on one subnet to a node on another.



The implications for configuring a package for cross-subnet failover are as follows:
• For modular packages, you must configure two new parameters in the package
configuration file to allow packages to fail over across subnets:
— ip_subnet_node (page 214) - to indicate which nodes a subnet is configured on
— monitored_subnet_access (page 212) - to indicate whether a monitored subnet is
configured on all nodes (FULL) or only some (PARTIAL). (Leaving
monitored_subnet_access unconfigured for a monitored subnet is equivalent to
FULL.)
(For legacy packages, see “Configuring Cross-Subnet Failover” (page 269).)
• You should not use the wildcard (*) for node_name in the package configuration
file, as this could allow the package to fail over across subnets when a node on the
same subnet is eligible; failing over across subnets can take longer than failing
over on the same subnet. List the nodes in order of preference instead of using the
wildcard.
• Deploying applications in this environment requires careful consideration; see
“Implications for Application Deployment” (page 148).
• If a monitored_subnet is configured for PARTIAL monitored_subnet_access in a
package’s configuration file, it must be configured on at least one of the nodes on
the node_name list (page 205) for that package.
Conversely, if all of the subnets that are being monitored for this package are
configured for PARTIAL access, each node on the node_name list must have at least
one of these subnets configured.
— As in other cluster configurations, a package will not start on a node unless the
subnets configured on that node, and specified in the package configuration
file as monitored subnets, are up.

Implications for Application Deployment


Because the relocatable IP address will change when a package fails over to a node on
another subnet, you need to make sure of the following:
• The hostname used by the package is correctly remapped to the new relocatable
IP address.
• The application that the package runs must be configured so that the clients can
reconnect to the package’s new relocatable IP address.
In the worst case (when the server where the application was running is down),
the client may continue to retry the old IP address until TCP’s tcp_timeout is reached
(typically about ten minutes), at which point it will detect the failure and reset the
connection.
For more information, see the white paper Technical Considerations for Creating a
Serviceguard Cluster that Spans Multiple IP Subnets, at http://docs.hp.com -> High
Availability.
Configuring a Package to Fail Over across Subnets: Example
To configure a package to fail over across subnets, you need to make some additional
edits to the package configuration file.

NOTE: This section provides an example for a modular package; for legacy packages,
see “Configuring Cross-Subnet Failover” (page 269).
Suppose that you want to configure a package, pkg1, so that it can fail over among all
the nodes in a cluster comprising NodeA, NodeB, NodeC, and NodeD.
NodeA and NodeB use subnet 15.244.65.0, which is not used by NodeC and NodeD;
and NodeC and NodeD use subnet 15.244.56.0, which is not used by NodeA and
NodeB. (See “Obtaining Cross-Subnet Information” (page 180) for sample cmquerycl
output).

Configuring node_name
First you need to make sure that pkg1 will fail over to a node on another subnet only
if it has to. For example, if it is running on NodeA and needs to fail over, you want it
to try NodeB, on the same subnet, before incurring the cross-subnet overhead of failing
over to NodeC or NodeD.
Assuming nodeA is pkg1’s primary node (where it normally starts), create node_name
entries in the package configuration file as follows:
node_name nodeA
node_name nodeB
node_name nodeC
node_name nodeD

Configuring monitored_subnet_access
In order to monitor subnet 15.244.65.0 or 15.244.56.0, depending on where
pkg1 is running, you would configure monitored_subnet and monitored_subnet_access
in pkg1’s package configuration file as follows:
monitored_subnet 15.244.65.0
monitored_subnet_access PARTIAL
monitored_subnet 15.244.56.0
monitored_subnet_access PARTIAL



NOTE: Configuring monitored_subnet_access as FULL (or not configuring
monitored_subnet_access) for either of these subnets will cause the package configuration
to fail, because neither subnet is available on all the nodes.

Configuring ip_subnet_node
Now you need to specify which subnet is configured on which nodes. In our example,
you would do this by means of entries such as the following in the package configuration
file:
ip_subnet 15.244.65.0
ip_subnet_node nodeA
ip_subnet_node nodeB
ip_address 15.244.65.82
ip_address 15.244.65.83
ip_subnet 15.244.56.0
ip_subnet_node nodeC
ip_subnet_node nodeD
ip_address 15.244.56.100
ip_address 15.244.56.101

Configuring a Package: Next Steps


When you are ready to start configuring a package, proceed to Chapter 6: “Configuring
Packages and Their Services ” (page 197); start with “Choosing Package Modules”
(page 198). (If you find it helpful, you can assemble your package configuration data
ahead of time on a separate worksheet for each package; blank worksheets are in
Appendix C.)

Planning for Changes in Cluster Size


If you intend to add additional nodes to the cluster online (while it is running) ensure
that they are connected to the same heartbeat subnets and to the same lock disks as the
other cluster nodes.
In selecting a cluster lock configuration, be careful to anticipate any potential need for
additional cluster nodes. Remember that while a two-node cluster must use a cluster
lock, a cluster of more than four nodes must not use a lock LUN, but can use a quorum
server. So if you will eventually need five nodes, you should build an initial
configuration that uses a quorum server.
If you intend to remove a node from the cluster configuration while the cluster is
running, ensure that the resulting cluster configuration will still conform to the rules



for cluster locks described above. See “Cluster Lock Planning” (page 98) for more
information.
If you are planning to add a node online, and a package will run on the new node,
ensure that any existing cluster-bound volume groups for the package have been
imported to the new node. Also, ensure that the MAX_CONFIGURED_PACKAGES
parameter is set high enough to accommodate the total number of packages you will
be using; see “Cluster Configuration Parameters ” (page 105).
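For example, a cluster configuration file entry such as the following (the value is illustrative) allows up to 20 packages:
MAX_CONFIGURED_PACKAGES 20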



5 Building an HA Cluster Configuration
This chapter and the next take you through the configuration tasks required to set up
a Serviceguard cluster. You carry out these procedures on one node, called the
configuration node, and Serviceguard distributes the resulting binary file to all the
nodes in the cluster. In the examples in this chapter, the configuration node is named
ftsys9, and the sample target node is called ftsys10.
This chapter covers the following major topics:
• Preparing Your Systems
• Configuring the Cluster (page 177)
• Managing the Running Cluster (page 191)
Configuring packages is described in the next chapter.
Use the Serviceguard manpages for each command to obtain full information about
syntax and usage.

Preparing Your Systems


Before configuring your cluster, ensure that Serviceguard is installed on all cluster
nodes, and that all nodes have the appropriate security files, kernel configuration and
NTP (network time protocol) configuration.

Installing and Updating Serviceguard


For information about installing and updating Serviceguard, see the Release Notes for
your version at http://docs.hp.com -> High Availability ->
Serviceguard for Linux -> Release Notes.

Understanding the Location of Serviceguard Files


Serviceguard uses a special file, /etc/cmcluster.conf, to define the locations for
configuration and log files within the Linux file system. The different distributions
may use different locations. The following are example locations for a Red Hat
distribution:
############################## cmcluster.conf ###########################
#
# Highly Available Cluster file locations
#
# This file must not be edited
#########################################################################
SGROOT=/usr/local/cmcluster # SG root directory
SGCONF=/usr/local/cmcluster/conf # configuration files
SGSBIN=/usr/local/cmcluster/bin # binaries
SGLBIN=/usr/local/cmcluster/bin # binaries
SGLIB=/usr/local/cmcluster/lib # libraries
SGRUN=/usr/local/cmcluster/run # location of core dumps from daemons
SGAUTOSTART=/usr/local/cmcluster/conf/cmcluster.rc # SG Autostart file



The following are example locations for a SUSE distribution:
############################## cmcluster.conf ###########################
#
# Highly Available Cluster file locations
#
# This file must not be edited
#########################################################################
SGROOT=/opt/cmcluster # SG root directory
SGCONF=/opt/cmcluster/conf # configuration files
SGSBIN=/opt/cmcluster/bin # binaries
SGLBIN=/opt/cmcluster/bin # binaries
SGLIB=/opt/cmcluster/lib # libraries
SGRUN=/opt/cmcluster/run # location of core dumps from daemons
SGAUTOSTART=/opt/cmcluster/conf/cmcluster.rc # SG Autostart file
Throughout this document, system filenames are usually given with one of these
location prefixes. Thus, references to $SGCONF/<FileName> can be resolved by
supplying the definition of the prefix that is found in this file. For example, if SGCONF
is /usr/local/cmcluster/conf, then the complete pathname for file
$SGCONF/cmclconfig would be /usr/local/cmcluster/conf/cmclconfig.

Enabling Serviceguard Command Access


To allow the creation of a Serviceguard configuration, you should complete the following
steps on all cluster nodes before running any Serviceguard commands:
1. Make sure the root user’s path includes the Serviceguard executables. This can be
done by adding the following environment variable definition to the root user’s
profile for Red Hat:
PATH=$PATH:/usr/local/cmcluster/bin
For SUSE:
PATH=$PATH:/opt/cmcluster/bin
2. Edit the /etc/man.config file to include the following line for Red Hat:
MANPATH /usr/local/cmcluster/doc/man
For SUSE:
MANPATH /opt/cmcluster/doc/man
This will allow use of the Serviceguard man pages.
3. Enable use of Serviceguard variables.
If the Serviceguard variables are not defined on your system, then include the file
/etc/cmcluster.conf in your login profile for user root:
. /etc/cmcluster.conf
You can confirm access to one of the variables as follows:
cd $SGCONF



Configuring Root-Level Access
The subsections that follow explain how to set up root access between the nodes in the
prospective cluster. (When you proceed to configuring the cluster, you will define
various levels of non-root access as well; see “Controlling Access to the Cluster”
(page 183).)

NOTE: For more information and advice, see the white paper Securing Serviceguard
at http://docs.hp.com -> High Availability -> Serviceguard ->
White Papers.

Allowing Root Access to an Unconfigured Node


To enable a system to be included in a cluster, you must enable Linux root access to
the system by the root user of every other potential cluster node. The Serviceguard
mechanism for doing this is the file $SGCONF/cmclnodelist. This is sometimes
referred to as a “bootstrap” file because Serviceguard consults it only when configuring
a node into a cluster for the first time; it is ignored after that. It does not exist by default,
but you will need to create it.
You may want to add a comment such as the following at the top of the file:
###########################################################
# Do not edit this file!
# Serviceguard uses this file only to authorize access to an
# unconfigured node. Once the node is configured,
# Serviceguard will not consult this file.
###########################################################
The format for entries in cmclnodelist is as follows:
[hostname] [user] [#Comment]
For example:
gryf root #cluster1, node1
sly root #cluster1, node2
bit root #cluster1, node3
This example grants root access to the node on which this cmclnodelist file resides
to root users on the nodes gryf, sly, and bit.
Serviceguard also accepts the use of a “+” in the cmclnodelist file; this indicates that
the root user on any Serviceguard node can configure Serviceguard on this node.



IMPORTANT: If $SGCONF/cmclnodelist does not exist, Serviceguard will look at
~/.rhosts. HP strongly recommends that you use cmclnodelist.

NOTE: When you upgrade a cluster from Version A.11.15 or earlier, entries in
$SGCONF/cmclnodelist are automatically updated to Access Control Policies in the
cluster configuration file. All non-root user-hostname pairs are assigned the role of
Monitor.

Ensuring that the Root User on Another Node Is Recognized


The Linux root user on any cluster node can configure the cluster. This requires that
Serviceguard on one node be able to recognize the root user on another.
Serviceguard uses the identd daemon to verify user names, and, in the case of a root
user, verification succeeds only if identd returns the username root. Because identd
may return the username for the first match on UID 0, you must check /etc/passwd
on each node you intend to configure into the cluster, and ensure that the entry for the
root user comes before any other entry with a UID of 0.
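For example, the following lists the UID-0 entries in the order in which they appear; the first name printed should be root:
awk -F: '$3 == 0 {print $1}' /etc/passwd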

About identd
HP strongly recommends that you use identd for user verification, so you should
make sure that each prospective cluster node is configured to run it. identd is usually
started from /etc/init.d/xinetd.
(It is possible to disable identd, though HP recommends against doing so. If for some
reason you have to disable identd, see “Disabling identd” (page 194).)
For more information about identd, see the white paper Securing Serviceguard at
http://docs.hp.com -> High Availability -> Serviceguard -> White
Papers, and the identd manpage.

Configuring Name Resolution


Serviceguard uses the name resolution services built into Linux.
Serviceguard nodes can communicate over any of the cluster’s shared networks, so the
network resolution service you are using (such as DNS, NIS, or LDAP) must be able
to resolve each of their primary addresses on each of those networks to the primary
hostname of the node in question.
In addition, HP recommends that you define name resolution in each node’s
/etc/hosts file, rather than rely solely on a service such as DNS. Configure the name
service switch to consult the /etc/hosts file before other services. See “Safeguarding
against Loss of Name Resolution Services” (page 158) for instructions.



NOTE: If you are using private IP addresses for communication within the cluster,
and these addresses are not known to DNS (or the name resolution service you use)
these addresses must be listed in /etc/hosts.
For requirements and restrictions that apply to IPv6–only clusters and mixed-mode
clusters, see “Rules and Restrictions for IPv6-Only Mode” (page 102) and “Rules and
Restrictions for Mixed Mode” (page 104), respectively, and the latest version of the
Serviceguard release notes.

For example, consider a two-node cluster (gryf and sly) with two private subnets
and a public subnet. These nodes will be granting access to a non-cluster node (bit)
that does not share the private subnets. The /etc/hosts file on both cluster nodes
should contain:
15.145.162.131 gryf.uksr.hp.com gryf
10.8.0.131 gryf.uksr.hp.com gryf
10.8.1.131 gryf.uksr.hp.com gryf

15.145.162.132 sly.uksr.hp.com sly


10.8.0.132 sly.uksr.hp.com sly
10.8.1.132 sly.uksr.hp.com sly

15.145.162.150 bit.uksr.hp.com bit

NOTE: Serviceguard recognizes only the hostname (the first element) in a fully
qualified domain name (a name like those in the example above). This means, for
example, that gryf.uksr.hp.com and gryf.cup.hp.com cannot be nodes in the
same cluster, as Serviceguard would see them as the same host gryf.
If applications require the use of hostname aliases, the Serviceguard hostname must
be one of the aliases in all the entries for that host. For example, if the two-node cluster
in the previous example were configured to use the alias hostnames alias-node1
and alias-node2, then the entries in /etc/hosts should look something like this:
15.145.162.131 gryf.uksr.hp.com gryf1 alias-node1
10.8.0.131 gryf2.uksr.hp.com gryf2 alias-node1
10.8.1.131 gryf3.uksr.hp.com gryf3 alias-node1
15.145.162.132 sly.uksr.hp.com sly1 alias-node2
10.8.0.132 sly2.uksr.hp.com sly2 alias-node2
10.8.1.132 sly3.uksr.hp.com sly3 alias-node2



IMPORTANT: Serviceguard does not support aliases for IPv6 addresses.
For information about configuring an IPv6–only cluster, or a cluster that uses a
combination of IPv6 and IPv4 addresses for the nodes' hostnames, see “About Hostname
Address Families: IPv4-Only, IPv6-Only, and Mixed Mode” (page 101).

Safeguarding against Loss of Name Resolution Services


When you employ any user-level Serviceguard command (including cmviewcl), the
command uses the name service you have configured (such as DNS) to obtain the
addresses of all the cluster nodes. If the name service is not available, the command
could hang or return an unexpected networking error message.

NOTE: If such a hang or error occurs, Serviceguard and all protected applications
will continue working even though the command you issued does not. That is, only
the Serviceguard configuration commands (and corresponding Serviceguard Manager
functions) are affected, not the cluster daemon or package services.
The procedure that follows shows how to create a robust name-resolution configuration
that will allow cluster nodes to continue communicating with one another if a name
service fails.
1. Edit the /etc/hosts file on all nodes in the cluster. Add name resolution for all
heartbeat IP addresses, and other IP addresses from all the cluster nodes; see
“Configuring Name Resolution” (page 156) for discussion and examples.

NOTE: For each cluster node, the public-network IP address must be the first
address listed. This enables other applications to talk to other nodes on public
networks.

2. If you are using DNS, make sure your name servers are configured in /etc/
resolv.conf, for example:
domain cup.hp.com
search cup.hp.com hp.com
nameserver 15.243.128.51
nameserver 15.243.160.51
3. Edit or create the /etc/nsswitch.conf file on all nodes and add the following
text, if it does not already exist:
• for DNS, enter (two lines):
hosts: files [NOTFOUND=continue UNAVAIL=continue] dns [NOTFOUND=return UNAVAIL=return]
ipnodes: files [NOTFOUND=continue UNAVAIL=continue] dns [NOTFOUND=return
UNAVAIL=return]

• for NIS, enter (two lines):



hosts: files [NOTFOUND=continue UNAVAIL=continue] nis [NOTFOUND=return UNAVAIL=return]
ipnodes: files [NOTFOUND=continue UNAVAIL=continue] nis [NOTFOUND=return
UNAVAIL=return]

If a line beginning with the string hosts: or ipnodes: already exists, then make
sure that the text immediately to the right of this string is (on one line):
files [NOTFOUND=continue UNAVAIL=continue] dns [NOTFOUND=return UNAVAIL=return]
or
files [NOTFOUND=continue UNAVAIL=continue] nis [NOTFOUND=return UNAVAIL=return]
This step is critical, allowing the cluster nodes to resolve hostnames to IP addresses
while DNS, NIS, or the primary LAN is down.
4. Create a $SGCONF/cmclnodelist file on all nodes that you intend to configure
into the cluster, and allow access by all cluster nodes. See “Allowing Root Access
to an Unconfigured Node” (page 155).

NOTE: HP recommends that you also make the name service itself highly available,
either by using multiple name servers or by configuring the name service into a
Serviceguard package.

Ensuring Consistency of Kernel Configuration


Make sure that the kernel configurations of all cluster nodes are consistent with the
expected behavior of the cluster during failover. In particular, if you change any kernel
parameters on one cluster node, they may also need to be changed on other cluster
nodes that can run the same packages.

Enabling the Network Time Protocol


HP strongly recommends that you enable network time protocol (NTP) services on
each node in the cluster. The use of NTP, which runs as a daemon process on each
system, ensures that the system time on all nodes is consistent, resulting in consistent
timestamps in log files and consistent behavior of message services. This ensures that
applications running in the cluster are correctly synchronized. The NTP services daemon,
xntpd, should be running on all nodes before you begin cluster configuration. The
NTP configuration file is /etc/ntp.conf.
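A minimal sketch of /etc/ntp.conf follows; the server names are placeholders for your site's time sources, and the driftfile location can vary by distribution:
server timeserver1.example.com
server timeserver2.example.com
driftfile /var/lib/ntp/drift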

Implementing Channel Bonding (Red Hat)


This section applies to Red Hat installations. If you are using a SUSE distribution, skip
ahead to the next section.
Channel bonding of LAN interfaces is implemented by the use of the bonding driver,
which is installed in the kernel at boot time. With this driver installed, the networking
software recognizes bonding definitions that are created in the /etc/sysconfig/
network-scripts directory for each bond. For example, the file named ifcfg-bond0
defines bond0 as the master bonding unit, and the ifcfg-eth0 and ifcfg-eth1
scripts define each individual interface as a slave.



Bonding can be defined in different modes. Mode 0, which is used for load balancing,
uses all slave devices within the bond in parallel for data transmission. This can be
done when the LAN interface cards are connected to an Ethernet switch, with the ports
on the switch configured as Fast EtherChannel trunks. Two switches should be cabled
together as an HA grouping to allow package failover.
For high availability, in which one slave serves as a standby for the bond and the other
slave transmits data, install the bonding module in mode 1. This is most appropriate
for dedicated heartbeat connections that are cabled through redundant network hubs
or switches that are cabled together.
For more information on networking bonding, make sure you have installed the
kernel-doc rpm, and see:
/usr/share/doc/kernel-doc-<version>/Documentation/networking/bonding.txt

NOTE: HP recommends that you do the bonding configuration from the system
console, because you will need to restart networking from the console when the
configuration is done.

Sample Configuration
Configure the following files to support LAN redundancy. For a single failover only
one bond is needed.
1. Create a bond0 file, ifcfg-bond0.
Create the configuration in the /etc/sysconfig/network-scripts directory.
For example, in the file, ifcfg-bond0, bond0 is defined as the master (for your
installation, substitute the appropriate values for your network instead of
192.168.1.1).
Include the following information in the ifcfg-bond0 file:
DEVICE=bond0
IPADDR=192.168.1.1
NETMASK=255.255.255.0
NETWORK=192.168.1.0
BROADCAST=192.168.1.255
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
For Red Hat 5 only, add the following line to the ifcfg-bond0 file:
BONDING_OPTS='miimon=100 mode=1'
2. Create an ifcfg-ethn file for each interface in the bond. All interfaces should
have SLAVE and MASTER definitions. For example, in a bond that uses eth0 and
eth1, edit the ifcfg-eth0 file to appear as follows:
DEVICE=eth0
USERCTL=no



ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
Edit the ifcfg-eth1 file to appear as follows:
DEVICE=eth1
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
For Red Hat 5 only, add a line containing the hardware (MAC) address of the
interface to the corresponding ifcfg-ethn slave file, for example:
HWADDR=00:12:79:43:5b:f4
3. Add the following lines to /etc/modprobe.conf:
alias bond0 bonding
options bond0 miimon=100 mode=1
Use MASTER=bond1 for bond1 if you have configured a second bonding interface,
then add the following after the entries for the first bond (bond0):
options bond1 -o bonding1 miimon=100 mode=1

NOTE: During configuration, you need to make sure that the active slaves for the
same bond on each node are connected the same hub or switch. You can check on this
by examining the file /proc/net/bonding/bond<x>/info on each node. This file
will show the active slave for bond x.

Restarting Networking
Restart the networking subsystem. From the console of either node in the cluster, execute
the following command on a Red Hat system:
/etc/rc.d/init.d/network restart

NOTE: It is better not to restart the network from outside the cluster subnet, as there
is a chance the network could go down before the command can complete.
The command prints bringing up network statements.
If there was an error in any of the bonding configuration files, the network might not
function properly. If this occurs, check each configuration file for errors, then try to
restart the network again.

Viewing the Configuration


You can test the configuration and transmit policy with ifconfig. For the configuration
created above, the display should look like this:



/sbin/ifconfig
bond0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
collisions:0 txqueuelen:0

eth0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4


inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0
TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0
collisions:0 txqueuelen:100
Interrupt:10 Base address:0x1080

eth1 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4


inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0
TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
Interrupt:9 Base address:0x1400

Implementing Channel Bonding (SUSE)


If you are using a Red Hat distribution, use the procedures described in the previous
section. The following applies only to the SUSE distributions.
First run yast/yast2 and configure Ethernet devices as DHCP so they create the
ifcfg-eth-id-<mac> files.
Next, modify each of the ifcfg-eth-id-<mac> files that you want to bond (they are
located in /etc/sysconfig/network), changing them from:
BOOTPROTO='dhcp'
MTU=''
REMOTE_IPADDR=''
STARTMODE='onboot'
UNIQUE='gZD2.ZqnB7JKTdX0'
_nm_name='bus-pci-0000:00:0b.0'
to:
BOOTPROTO='none'
STARTMODE='onboot'
UNIQUE='gZD2.ZqnB7JKTdX0'
_nm_name='bus-pci-0000:00:0b.0'



NOTE: Do not change the UNIQUE and _nm_name parameters. You can leave MTU
and REMOTE_IPADDR in the file as long as they are not set.
Next, in /etc/sysconfig/network, edit your ifcfg-bond0 file so it looks like
this:
BROADCAST='172.16.0.255'
BOOTPROTO='static'
IPADDR='172.16.0.1'
MTU=''
NETMASK='255.255.255.0'
NETWORK='172.16.0.0'
REMOTE_IPADDR=''
STARTMODE='onboot'
BONDING_MASTER='yes'
BONDING_MODULE_OPTS='miimon=100 mode=1'
BONDING_SLAVE0='eth0'
BONDING_SLAVE1='eth1'
The above example configures bond0 with mii monitor equal to 100 and
active-backup mode. Adjust the IP, BROADCAST, NETMASK, and NETWORK
parameters to correspond to your configuration.
As you can see, you are adding the configuration options BONDING_MASTER,
BONDING_MODULE_OPTS, and BONDING_SLAVE. BONDING_MODULE_OPTS specifies
the additional options you want to pass to the bonding module. You cannot pass
max_bonds as an option, and you do not need to because the ifup script will load
the module for each bond needed.
BONDING_SLAVE tells ifup which Ethernet devices to enslave to bond0. So if you
wanted to bond four Ethernet devices you would add:
BONDING_SLAVE2='eth2'
BONDING_SLAVE3='eth3'

NOTE: Use ifconfig to find the relationship between eth IDs and the MAC
addresses.
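For example, with the classic ifconfig output shown earlier in this chapter, the following lists each interface name together with its MAC (HWaddr) value:
/sbin/ifconfig -a | grep HWaddr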
For more networking information on bonding, see
/usr/src/linux<kernel_version>/Documentation/networking/bonding.txt.

Restarting Networking
Restart the networking subsystem. From the console of any node in the cluster, execute
the following command on a SUSE system:



/etc/init.d/network restart

NOTE: It is better not to restart the network from outside the cluster subnet, as there
is a chance the network could go down before the command can complete.
If there is an error in any of the bonding configuration files, the network may not
function properly. If this occurs, check each configuration file for errors, then try to
start the network again.

Setting up a Lock LUN


The lock LUN requires a partition of one cylinder (at least 100 KB), defined via the
fdisk command as type Linux (83). You will need the pathname of the lock LUN
as it is seen on each cluster node. On one node, use the fdisk command to define a
partition of 1 cylinder, type 83, on this LUN. Here is an example:
Respond to the prompts as shown in the following table to set up the lock LUN partition:
fdisk <Lock LUN Device File>
Table 5-1 Changing Linux Partition Types
Prompt Response Action Performed

1. Command (m for help): n Create new partition

2. Partition number (1-4): 1 Partition affected

3. Hex code (L to list codes): 83 Set partition type to Linux (the default)

Command (m for help): 1 Define first partition

Command (m for help): 1 Set size to 1 cylinder

Command (m for help): p Display partition data

Command (m for help): w Write data to the partition table

The following example of the fdisk dialog shows the disk on the device file /dev/
sdc being set up with a one-cylinder partition of type Linux (83):
fdisk /dev/sdc
Command (m for help): n
Partition number (1-4): 1
HEX code (type L to list codes): 83
Command (m for help): 1
Command (m for help): 1

Command (m for help): p


Disk /dev/sdc: 64 heads, 32 sectors, 4067 cylinders
Units = cylinders of 2048 * 512 bytes



Device Boot Start End Blocks Id System
/dev/sdc 1 1 1008 83 Linux

Command (m for help): w


The partition table has been altered!

NOTE: Follow these rules:


• Do not try to use LVM to configure the lock LUN.
• The partition type must be 83.
• Do not create any filesystem on the partition used for the lock LUN.
• Do not use md to configure multiple paths to the lock LUN.

To make the other cluster nodes recognize the new partitioning, have each of them re-read
the partition table with the command:
sfdisk -R <device>
where <device> corresponds to the same physical device as on the first node. For
example, if /dev/sdc is the device name on the other nodes, use the command:
sfdisk -R /dev/sdc
You can check the partition table by using the command:
fdisk -l /dev/sdc

NOTE: fdisk may not be available for SUSE on all platforms. In this case, using
YAST2 to set up the partitions is acceptable.

Setting Up and Running the Quorum Server


If you will be using a quorum server rather than a lock LUN, the Quorum Server
software must be installed on a system other than the nodes on which your cluster will
be running, and must be running during cluster configuration.
For detailed discussion, recommendations, and instructions for installing, updating,
configuring, and running the Quorum Server, see the HP Serviceguard Quorum Server
Version A.04.00 Release Notes at http://www.docs.hp.com -> High Availability
-> Quorum Server. See also the discussion of the QS_HOST and QS_ADDR
parameters under “Cluster Configuration Parameters ” (page 105).

Creating the Logical Volume Infrastructure


Serviceguard makes use of shared disk storage. This is set up to provide high availability
by using redundant data storage and redundant paths to the shared devices. Storage
for a Serviceguard package is logically composed of LVM Volume Groups that are
activated on a node as part of starting a package on that node. Storage is generally
configured on logical units (LUNs).



Disk storage for Serviceguard packages is built on shared disks that are cabled to
multiple cluster nodes. These are separate from the private Linux root disks, which
include the boot partition and root file systems. To provide space for application data
on shared disks, create disk partitions using the fdisk command, and build logical volumes with
LVM.
You can build a cluster (next section) before or after defining volume groups for shared
data storage. If you create the cluster first, information about storage can be added to
the cluster and package configuration files after the volume groups are created.
See “Volume Managers for Data Storage” (page 84) for an overview of volume
management in HP Serviceguard for Linux. The sections that follow explain how to
do the following tasks:
• Displaying Disk Information (page 166)
• Creating Partitions (page 167)
• Enabling Volume Group Activation Protection (page 169)
• Building Volume Groups: Example for Smart Array Cluster Storage (MSA 2000
Series) (page 170)
• Building Volume Groups and Logical Volumes (page 171)
• Distributing the Shared Configuration to all Nodes (page 172)
• Testing the Shared Configuration (page 173)
• Storing Volume Group Configuration Data (page 175)
• Setting up Disk Monitoring (page 176)

CAUTION: The minor numbers used by the LVM volume groups must be the same
on all cluster nodes. This means that if there are any non-shared volume groups in the
cluster, create the same number of them on all nodes, and create them before you define
the shared storage. If possible, avoid using private volume groups, especially LVM
boot volumes. Minor numbers increment with each logical volume, and mismatched
numbers of logical volumes between nodes can cause a failure of LVM (and boot, if
you are using an LVM boot volume).

NOTE: Except as noted in the sections that follow, you perform the LVM configuration
of shared storage on only one node. The disk partitions will be visible on other nodes
as soon as you reboot those nodes. After you’ve distributed the LVM configuration to
all the cluster nodes, you will be able to use LVM commands to switch volume groups
between nodes. (To avoid data corruption, a given volume group must be active on
only one node at a time).
For multipath information, see “Multipath for Storage ” (page 96).

Displaying Disk Information


To display a list of configured disks, use the following command:



fdisk -l
You will see output such as the following:
Disk /dev/sda: 64 heads, 32 sectors, 8678 cylinders
Units = cylinders of 2048 * 512 bytes

Device Boot Start End Blocks Id System


/dev/sda1 * 1 1001 1025008 83 Linux
/dev/sda2 1002 8678 7861248 5 Extended
/dev/sda5 1002 4002 3073008 83 Linux
/dev/sda6 4003 5003 1025008 82 Linux swap
/dev/sda7 5004 8678 3763184 83 Linux

Disk /dev/sdb: 64 heads, 32 sectors, 8678 cylinders


Units = cylinders of 2048 * 512 bytes

Device Boot Start End Blocks Id System

Disk /dev/sdc: 255 heads, 63 sectors, 1106 cylinders


Units = cylinders of 16065 * 512 bytes

Disk /dev/sdd: 255 heads, 63 sectors, 1106 cylinders
Units = cylinders of 16065 * 512 bytes

In this example, the disk described by device file /dev/sda has already been partitioned
for Linux, into partitions named /dev/sda1 - /dev/sda7. The second internal device
/dev/sdb and the two external devices /dev/sdc and /dev/sdd have not been
partitioned.

NOTE: fdisk may not be available for SUSE on all platforms. In this case, using
YAST2 to set up the partitions is acceptable.

Creating Partitions
You must define a partition on each disk device (individual disk or LUN in an array)
that you want to use for your shared storage. Use the fdisk command for this.
The following steps create the new partition:
1. Run fdisk, specifying your device file name in place of <DeviceName>:
# fdisk <DeviceName>
Respond to the prompts as shown in the following table, to define a partition:

Prompt Response Action Performed

1. Command (m for help): n Create a new partition

2. Command action e extended / p primary partition (1-4): p Create a primary partition

3. Partition number (1-4): 1 Create partition 1

4. First cylinder (1-nn, default 1): Enter Accept the default starting cylinder 1

5. Last cylinder or +size or +sizeM or +sizeK (1-nn, default nn): Enter Accept the default, which is the last cylinder number



Prompt Response Action Performed

Command (m for help): p Display partition data

Command (m for help): w Write data to the partition table

The following example of the fdisk dialog shows that the disk on the device file
/dev/sdc is configured as one partition, and appears as follows:
fdisk /dev/sdc
Command (m for help): n
Command action
e extended
p primary partition (1-4) p
Partition number (1-4): 1
First cylinder (1-4067, default 1): Enter
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-4067, default 4067): Enter
Using default value 4067

Command (m for help): p


Disk /dev/sdc: 64 heads, 32 sectors, 4067 cylinders
Units = cylinders of 2048 * 512 bytes

Device Boot Start End Blocks Id System


/dev/sdc 1 4067 4164592 83 Linux

Command (m for help): w


The partition table has been altered!

2. Respond to the prompts as shown in the following table to set a partition type:

Prompt Response Action Performed

1. Command (m for help): t Set the partition type

2. Partition number (1-4): 1 Partition affected

3. Hex code (L to list codes): 8e Set partition type to Linux LVM

Command (m for help): p Display partition data

Command (m for help): w Write data to the partition table

The following example of the fdisk dialog shows the partition on the device
file /dev/sdc being set to type Linux LVM (8e):
fdisk /dev/sdc
Command (m for help): t
Partition number (1-4): 1
HEX code (type L to list codes): 8e

Command (m for help): p


Disk /dev/sdc: 64 heads, 32 sectors, 4067 cylinders
Units = cylinders of 2048 * 512 bytes



Device Boot Start End Blocks Id System
/dev/sdc 1 4067 4164592 8e Linux LVM

Command (m for help): w


The partition table has been altered!
3. Repeat this process for each device file that you will use for shared storage.
fdisk /dev/sdd
fdisk /dev/sdf
fdisk /dev/sdg
4. If you will be creating volume groups for internal storage, make sure to create
those partitions as well, and create those volume groups before you define the
shared storage.
fdisk /dev/sdb

NOTE: fdisk may not be available for SUSE on all platforms. In this case, using
YAST2 to set up the partitions is acceptable.

Enabling Volume Group Activation Protection


As of Serviceguard for Linux A.11.16.07, you can enable activation protection for logical
volume groups, preventing the volume group from being activated by more than one
node at the same time. Activation protection, if used, must be enabled on each cluster
node.
Follow these steps to enable activation protection for volume groups on Red Hat and
SUSE systems:

IMPORTANT: Perform this procedure on each node.


1. Edit /etc/lvm/lvm.conf and add the following line:
tags { hosttags = 1 }
2. Uncomment the line in /etc/lvm/lvm.conf that begins # volume_list =,
and edit it to include all of the node's "private" volume groups (those not shared
with the other cluster nodes), including the root volume group.
For example if the root volume group is vg00 and the node also uses vg01 and
vg02 as private volume groups, the line should look like this:
volume_list = [ "vg00", "vg01", "vg02" ]
3. Create the file /etc/lvm/lvm_$(uname -n).conf
4. Add the following line to the file you created in step 3:
activation { volume_list=["@node"] }



where node is the value of uname -n.
5. Run vgscan:
vgscan

NOTE: At this point, the setup for volume-group activation protection is complete.
Serviceguard adds a tag matching the uname -n value of the owning node to
each volume group defined for a package when the package runs and deletes the
tag when the package halts. The command vgs -o +tags vgname will display
any tags that are set for a volume group.
The sections that follow take you through the process of configuring volume groups
and logical volumes, and distributing the shared configuration. When you have
finished that process, use the procedure under “Testing the Shared Configuration”
(page 173) to verify that the setup has been done correctly.

Building Volume Groups: Example for Smart Array Cluster Storage (MSA 2000 Series)

NOTE: For information about setting up and configuring the MSA 2000 for use with
Serviceguard, see the HP Serviceguard for Linux Version A.11.19 Deployment Guide on
docs.hp.com under High Availability —> Serviceguard for Linux —>
White Papers.
Use Logical Volume Manager (LVM) on your system to create volume groups that can
be activated by Serviceguard packages. This section provides an example of creating
Volume Groups on LUNs created on MSA 2000 Series storage. For more information
on LVM, see the Logical Volume Manager How To, which you can find at http://
tldp.org/HOWTO/HOWTO-INDEX/howtos.html
Before you start, partition your LUNs and label them with a partition type of 8e (Linux
LVM). Use the type t parameter of the fdisk command to change from the default
of 83 (Linux).
Do the following on one node:
1. Update the LVM configuration and create the /etc/lvmtab file. You can omit
this step if you have previously created volume groups on this node.
vgscan

NOTE: The files /etc/lvmtab and /etc/lvmtab.d may not exist on some
distributions. In that case, ignore references to these files.

2. Create LVM physical volumes on each LUN. For example:


pvcreate -f /dev/sda1
pvcreate -f /dev/sdb1



pvcreate -f /dev/sdc1
3. Check whether there are already volume groups defined on this node. Be sure to
give each volume group a unique name.
vgdisplay
4. Create separate volume groups for each Serviceguard package you will define. In
the following example, we add the LUNs /dev/sda1 and /dev/sdb1 to volume
group vgpkgA, and /dev/sdc1 to vgpkgB:
vgcreate --addtag $(uname -n) /dev/vgpkgA /dev/sda1 /dev/sdb1
vgcreate --addtag $(uname -n) /dev/vgpkgB /dev/sdc1

NOTE: Use the --addtag option only if you are implementing volume-group
activation protection. Remember that volume-group activation protection, if used,
must be implemented on each node.

Building Volume Groups and Logical Volumes


1. Use Logical Volume Manager (LVM) to create volume groups that can be activated
by Serviceguard packages.
For an example showing volume-group creation on LUNs, see “Building Volume
Groups: Example for Smart Array Cluster Storage (MSA 2000 Series)” (page 170).
(For Fibre Channel storage you would use device-file names such as those used
in the section “Creating Partitions” (page 167).)
2. On Linux distributions that support it, enable activation protection for volume
groups. See “Enabling Volume Group Activation Protection” (page 169).
3. To store data on these volume groups you must create logical volumes. The
following creates a 500 Megabyte logical volume named /dev/vgpkgA/lvol1
and a one Gigabyte logical volume named /dev/vgpkgA/lvol2 in volume group
vgpkgA:
lvcreate -L 500M vgpkgA
lvcreate -L 1G vgpkgA
4. Create a file system on one of these logical volumes, and mount it in a newly
created directory:
mke2fs -j /dev/vgpkgA/lvol1
mkdir /extra
mount -t ext3 /dev/vgpkgA/lvol1 /extra



NOTE: For information about supported filesystem types, see the fs_type
discussion (page 218).

5. To test that the file system /extra was created correctly and with high availability,
you can create a file on it, and read it.
echo "Test of LVM" >> /extra/LVM-test.conf
cat /extra/LVM-test.conf

NOTE: Be careful if you use YAST or YAST2 to configure volume groups, as that
may cause all volume groups on that system to be activated. After running YAST
or YAST2, check to make sure that volume groups for Serviceguard packages not
currently running have not been activated, and use LVM commands to deactivate
any that have. For example, use the command vgchange -a n /dev/sgvg00
to deactivate the volume group sgvg00.

Distributing the Shared Configuration to all Nodes


The goal in setting up a logical volume infrastructure is to build a set of volume groups
that can be activated on multiple nodes in the cluster. To do this, you need to build the
same LVM volume groups on any nodes that will be running the same package.

NOTE: The minor numbers used by the LVM volume groups must be the same on
all cluster nodes. They will if all the nodes have the same number of unshared volume
groups.
To distribute the shared configuration, follow these steps:
1. Unmount and deactivate the volume group, and remove the tag if necessary. For
example, to deactivate only vgpkgA:
umount /extra
vgchange -a n vgpkgA
vgchange --deltag $(uname -n) vgpkgA

NOTE: Use vgchange --deltag only if you are implementing volume-group
activation protection. Remember that volume-group activation protection, if used,
must be implemented on each node.

2. To get the node ftsys10 to see the new disk partitioning that was done on
ftsys9, reboot:
reboot



The partition table on the rebooted node is then rebuilt using the information
placed on the disks when they were partitioned on the other node.

NOTE: You must reboot at this time.

3. Run vgscan to make the LVM configuration visible on the new node and to create
the LVM database in /etc/lvmtab and /etc/lvmtab.d. For example, on
ftsys10:
vgscan

Testing the Shared Configuration


When you have finished the shared volume group configuration, you can test that the
storage is correctly sharable as follows:



1. On ftsys9, activate the volume group, mount the file system that was built on
it, write a file in the shared file system and look at the result:
vgchange --addtag $(uname -n) vgpkgB

NOTE: If you are using the volume-group activation protection feature of
Serviceguard for Linux, you must use vgchange --addtag to add a tag when
you manually activate a volume group. Similarly, you must remove the tag when
you deactivate a volume group that will be used in a package (as shown at the
end of each step).
Use vgchange --addtag and vgchange --deltag only if you are
implementing volume-group activation protection. Remember that volume-group
activation protection, if used, must be implemented on each node.
Serviceguard adds a tag matching the uname -n value of the owning node to
each volume group defined for a package when the package runs; the tag is deleted
when the package is halted. The command vgs -o +tags vgname will display
any tags that are set for a volume group.

vgchange -a y vgpkgB
mount /dev/vgpkgB/lvol1 /extra
echo 'Written by' `hostname` 'on' `date` > /extra/datestamp
cat /extra/datestamp
You should see something like the following, showing the date stamp that was just
written:
Written by ftsys9.mydomain on Mon Jan 22 14:23:44 PST 2006
Now unmount the volume group again:
umount /extra
vgchange -a n vgpkgB
vgchange --deltag $(uname -n) vgpkgB



2. On ftsys10, activate the volume group, mount the file system, write a date stamp
on to the shared file, and then look at the content of the file:
vgchange --addtag $(uname -n) vgpkgB
vgchange -a y vgpkgB
mount /dev/vgpkgB/lvol1 /extra
echo 'Written by' `hostname` 'on' `date` >> /extra/datestamp
cat /extra/datestamp
You should see something like the following, including the date stamp written by
the other node:
Written by ftsys9.mydomain on Mon Jan 22 14:23:44 PST 2006
Written by ftsys10.mydomain on Mon Jan 22 14:25:27 PST 2006
Now unmount the volume group again, and remove the tag you added in step 1:
umount /extra
vgchange -a n vgpkgB
vgchange --deltag $(uname -n) vgpkgB

NOTE: The volume activation protection feature of Serviceguard for Linux


requires that you add the tag as shown at the beginning of the above steps when
you manually activate a volume group. Similarly, you must remove the tag when
you deactivate a volume group that will be used in a package (as shown at the
end of each step). As of Serviceguard for Linux A.11.16.07, a tag matching the
uname -n value of the owning node is automatically added to each volume group
defined for a package when the package runs; the tag is deleted when the package
is halted. The command vgs -o +tags vgname will display any tags that are
set for a volume group.

Storing Volume Group Configuration Data


When you create volume groups, LVM creates a backup copy of the volume group
configuration on the configuration node. In addition, you should create a backup of
configuration data on all other nodes where the volume group might be activated by
using the vgcfgbackup command:
vgcfgbackup vgpkgA vgpkgB
If a disk in a volume group must be replaced, you can restore the old disk’s metadata
on the new disk by using the vgcfgrestore command. See “Replacing Disks” in the
“Troubleshooting” chapter.
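For example, with LVM2 you could restore the saved metadata for vgpkgA as follows.
This is only a sketch of the final step; the full procedure in “Replacing Disks”
includes additional steps, such as recreating the physical volume on the new disk:
vgcfgrestore vgpkgA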



Preventing Boot-Time vgscan and Ensuring Serviceguard Volume Groups Are Deactivated
By default, Linux will perform LVM startup actions whenever the system is rebooted.
These include a vgscan (on some Linux distributions) and volume group activation.
This can cause problems for volumes used in a Serviceguard environment (for example,
a volume group for a Serviceguard package that is not currently running may be
activated). To prevent such problems, proceed as follows on the various Linux versions.

NOTE: You do not need to perform these actions if you have implemented
volume-group activation protection as described under “Enabling Volume Group
Activation Protection” (page 169).
SUSE Linux Enterprise Server
Prevent a vgscan at boot time by removing the /etc/rc.d/boot.d/S07boot.lvm
file from all cluster nodes.
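For example, as root on each node (verify the exact file name on your SUSE release
before removing it):
rm /etc/rc.d/boot.d/S07boot.lvm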

NOTE: Be careful if you use YAST or YAST2 to configure volume groups, as that may
cause all volume groups to be activated. After running YAST or YAST2, check that
volume groups for Serviceguard packages not currently running have not been activated,
and use LVM commands to deactivate any that have. For example, use the command
vgchange -a n /dev/sgvg00 to deactivate the volume group sgvg00.
Red Hat
It is not necessary to prevent vgscan on Red Hat.
To deactivate any volume groups that will be under Serviceguard control, add
vgchange commands to the end of /etc/rc.d/rc.sysinit; for example, if volume
groups sgvg00 and sgvg01 are under Serviceguard control, add the following lines
to the end of the file:
vgchange -a n /dev/sgvg00
vgchange -a n /dev/sgvg01
During boot, the volume groups will be activated temporarily and then deactivated
by these vgchange commands; this is expected behavior.

Setting up Disk Monitoring


HP Serviceguard for Linux includes a Disk Monitor which you can use to detect
problems in disk connectivity. This lets you fail a package over from one node to another
in the event of a disk link failure.
See “Creating a Disk Monitor Configuration” (page 228) for instructions on configuring
disk monitoring.



Configuring the Cluster
This section describes how to define the basic cluster configuration. This must be done
on a system that is not part of a Serviceguard cluster (that is, on which Serviceguard
is installed but not configured). You can do this in Serviceguard Manager, or from the
command line as described below.
Use the cmquerycl command to specify a set of nodes to be included in the cluster
and to generate a template for the cluster configuration file.

IMPORTANT: See NODE_NAME under “Cluster Configuration Parameters ” (page 105)


for important information about restrictions on the node name.
Here is an example of the command (enter it all on one line):
cmquerycl -v -C $SGCONF/clust1.conf -n ftsys9 -n ftsys10
This creates a template file, by default /etc/cmcluster/clust1.conf. In this output
file, keywords are separated from definitions by white space. Comments are permitted,
and must be preceded by a pound sign (#) in the far left column.

NOTE: HP strongly recommends that you modify the file so as to send heartbeat over
all possible networks.
The manpage for the cmquerycl command further explains the parameters that appear
in this file. Many are also described in Chapter 4: “Planning and Documenting an HA
Cluster ” (page 93). Modify your /etc/cmcluster/clust1.conf file as needed.

cmquerycl Options
Speeding up the Process
In a larger or more complex cluster with many nodes, networks or disks, the cmquerycl
command may take several minutes to complete. To speed up the configuration process,
you can direct the command to return selected information only by using the -k and
-w options:
-k eliminates some disk probing, and does not return information about potential
cluster lock volume groups and lock physical volumes.
-w local lets you specify local network probing, in which LAN connectivity is verified
between interfaces within each node only. This is the default when you use cmquerycl
with the-C option.
(Do not use -w local if you need to discover nodes and subnets for a cross-subnet
configuration; see “Full Network Probing”.)
-w none skips network querying. If you have recently checked the networks, this
option will save time.
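For example, to combine these options and skip both lock-disk probing and network
querying, you could enter a command such as the following (all on one line); use it
only if you have recently verified the networks:
cmquerycl -v -k -w none -C $SGCONF/clust1.conf -n ftsys9 -n ftsys10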



Specifying the Address Family for the Cluster Hostnames
You can use the -a option to tell Serviceguard to resolve cluster node names (as well
as Quorum Server hostnames, if any) to IPv4 addresses only (-a ipv4), to IPv6 addresses
only (-a ipv6), or to both (-a any). You can also configure the address family by means
of the HOSTNAME_ADDRESS_FAMILY parameter in the cluster configuration file.

IMPORTANT: See “About Hostname Address Families: IPv4-Only, IPv6-Only, and


Mixed Mode” (page 101) for a full discussion, including important restrictions for
IPv6–only and mixed modes.
If you use the -a option, Serviceguard will ignore the value of the
HOSTNAME_ADDRESS_FAMILY parameter in the existing cluster configuration, if
any, and attempt to resolve the cluster and Quorum Server hostnames as specified by
the -a option:
• If you specify -a ipv4, each of the hostnames must resolve to at least one IPv4
address; otherwise the command will fail.
• Similarly, if you specify -a ipv6, each of the hostnames must resolve to at least
one IPv6 address; otherwise the command will fail.
• If you specify -a any, Serviceguard will attempt to resolve each hostname to an
IPv4 address, then, if that fails, to an IPv6 address.
If you do not use the -a option:
• If a cluster is already configured, Serviceguard will use the value configured for
HOSTNAME_ADDRESS_FAMILY, which defaults to IPv4.
• If no cluster is configured, and Serviceguard finds at least one IPv4 address that
corresponds to the local node's hostname (that is, the node on which you are
running cmquerycl), Serviceguard will attempt to resolve all hostnames to IPv4
addresses. If no IPv4 address is found for a given hostname, Serviceguard will
look for an IPv6 address. (This is the same behavior as if you had specified -a
any.)
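For example, to resolve the cluster hostnames to IPv4 addresses only, you could enter
a command such as the following (all on one line):
cmquerycl -v -a ipv4 -C $SGCONF/clust1.conf -n ftsys9 -n ftsys10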

Specifying the Address Family for the Heartbeat


To tell Serviceguard to use only IPv4, or only IPv6, addresses for the heartbeat, use the
-h option. For example, to use only IPv6 addresses:
cmquerycl -v -h ipv6 -C $SGCONF/clust1.conf -n ftsys9 -n ftsys10
• -h ipv4 tells Serviceguard to discover and configure only IPv4 subnets. If it does
not find any eligible subnets, the command will fail.
• -h ipv6 tells Serviceguard to discover and configure only IPv6 subnets. If it does
not find any eligible subnets, the command will fail.
• If you don't use the -h option, Serviceguard will choose the best available
configuration to meet minimum requirements, preferring an IPv4 LAN over IPv6
where both are available. The resulting configuration could be IPv4 only, IPv6

only, or a mix of both. You can override Serviceguard's default choices by means
of the HEARTBEAT_IP parameter, discussed under “Cluster Configuration
Parameters ” (page 105); that discussion also spells out the heartbeat requirements.
• The -h and -c options are mutually exclusive.

Full Network Probing


-w full lets you specify full network probing, in which actual connectivity is verified
among all LAN interfaces on all nodes in the cluster, whether or not they are all on the
same subnet.

NOTE: This option must be used to discover actual or potential nodes and subnets
in a cross-subnet configuration. See “Obtaining Cross-Subnet Information” (page 180).
It will also validate IP Monitor polling targets; see “Monitoring LAN Interfaces and
Detecting Failure: IP Level” (page 78), and POLLING_TARGET under “Cluster
Configuration Parameters ” (page 105).

Specifying a Lock LUN


A cluster lock LUN or quorum server is required for two-node clusters. If you will be
using a lock LUN, be sure to specify the -L lock_lun_device option with the
cmquerycl command. If the name of the device is the same on all nodes, enter the
option before the node names, as in the following example (all on one line):
cmquerycl -v -L /dev/sda1 -n lp01 -n lp02 -C
$SGCONF/lpcluster.conf
If the name of the device is different on the different nodes, specify each device file
following each node name, as in the following example (all on one line):
cmquerycl -v -n node1 -L /dev/sda1 -n node2 -L /dev/sda2 -C
$SGCONF/lpcluster.conf

Specifying a Quorum Server

IMPORTANT: The following are standard instructions. For special instructions that
may apply to your version of Serviceguard and the Quorum Server, see “Configuring
Serviceguard to Use the Quorum Server” in the latest version of the HP Serviceguard Quorum
Server Version A.04.00 Release Notes, at http://www.docs.hp.com -> High
Availability -> Quorum Server.
A cluster lock LUN or quorum server is required for two-node clusters. To obtain a
cluster configuration file that includes Quorum Server parameters, use the -q option
of the cmquerycl command, specifying a Quorum Server hostname or IP address, for
example (all on one line):
cmquerycl -q <QS_Host> -n ftsys9 -n ftsys10 -C <ClusterName>.conf



To specify an alternate hostname or IP address by which the Quorum Server can be
reached, use a command such as (all on one line):
cmquerycl -q <QS_Host> <QS_Addr> -n ftsys9 -n ftsys10 -C
<ClusterName>.conf
Enter the QS_HOST (IPv4 or IPv6 on SLES 10 and 11; IPv4 only on Red Hat 5), optional
QS_ADDR (IPv4 or IPv6 on SLES 10 and 11; IPv4 only on Red Hat 5),
QS_POLLING_INTERVAL, and optionally a QS_TIMEOUT_EXTENSION; and also
check the HOSTNAME_ADDRESS_FAMILY setting, which defaults to IPv4. See the
parameter descriptions under Cluster Configuration Parameters (page 105).
For important information, see also “About Hostname Address Families: IPv4-Only,
IPv6-Only, and Mixed Mode” (page 101); and “What Happens when You Change the
Quorum Configuration Online” (page 48)

Obtaining Cross-Subnet Information


As of Serviceguard A.11.18 it is possible to configure multiple IPv4 subnets, joined by
a router, both for the cluster heartbeat and for data, with some nodes using one subnet
and some another. See “Cross-Subnet Configurations” (page 32) for rules and
definitions.
You must use the -w full option to cmquerycl to discover the available subnets.
For example, assume that you are planning to configure four nodes, NodeA, NodeB,
NodeC, and NodeD, into a cluster that uses the subnets 15.13.164.0, 15.13.172.0,
15.13.165.0, 15.13.182.0, 15.244.65.0, and 15.244.56.0.
The following command
cmquerycl -w full -n nodeA -n nodeB -n nodeC -n nodeD
will produce output such as the following:
Node Names: nodeA
nodeB
nodeC
nodeD

Bridged networks (full probing performed):


1 lan3 (nodeA)
lan4 (nodeA)
lan3 (nodeB)
lan4 (nodeB)
2 lan1 (nodeA)
lan1 (nodeB)
3 lan2 (nodeA)
lan2 (nodeB)
4 lan3 (nodeC)
lan4 (nodeC)
lan3 (nodeD)
lan4 (nodeD)

5 lan1 (nodeC)
lan1 (nodeD)
6 lan2 (nodeC)
lan2 (nodeD)

IP subnets:
IPv4:

15.13.164.0 lan1 (nodeA)


lan1 (nodeB)
15.13.172.0 lan1 (nodeC)
lan1 (nodeD)
15.13.165.0 lan2 (nodeA)
lan2 (nodeB)
15.13.182.0 lan2 (nodeC)
lan2 (nodeD)
15.244.65.0 lan3 (nodeA)
lan3 (nodeB)
15.244.56.0 lan4 (nodeC)
lan4 (nodeD)

IPv6:

3ffe:1111::/64 lan3 (nodeA)


lan3 (nodeB)
3ffe:2222::/64 lan3 (nodeC)
lan3 (nodeD)

Possible Heartbeat IPs:


15.13.164.0    15.13.164.1 (nodeA)
               15.13.164.2 (nodeB)
15.13.172.0    15.13.172.158 (nodeC)
               15.13.172.159 (nodeD)
15.13.165.0    15.13.165.1 (nodeA)
               15.13.165.2 (nodeB)
15.13.182.0    15.13.182.158 (nodeC)
               15.13.182.159 (nodeD)
Route connectivity (full probing performed):

1 15.13.164.0
15.13.172.0
2 15.13.165.0
15.13.182.0
3 15.244.65.0
4 15.244.56.0
In the Route connectivity section, the numbers on the left (1-4) identify which
subnets are routed to each other (for example 15.13.164.0 and 15.13.172.0).



IMPORTANT: Note that in this example subnet 15.244.65.0, used by NodeA and
NodeB, is not routed to 15.244.56.0, used by NodeC and NodeD.
But subnets 15.13.164.0 and 15.13.165.0, used by NodeA and NodeB, are routed
respectively to subnets 15.13.172.0 and 15.13.182.0, used by NodeC and NodeD.
At least one such routing among all the nodes must exist for cmquerycl to succeed.

For information about configuring the heartbeat in a cross-subnet configuration, see


the HEARTBEAT_IP parameter discussion under “Cluster Configuration Parameters
” (page 105).

Identifying Heartbeat Subnets


The cluster configuration file includes entries for IP addresses on the heartbeat subnet.
HP recommends that you use a dedicated heartbeat subnet, and configure heartbeat
on other subnets as well, including the data subnet.
The heartbeat can be on an IPv4 or an IPv6 subnet.
The heartbeat can comprise multiple IPv4 subnets joined by a router. In this case at
least two heartbeat paths must be configured for each cluster node. See also the
discussion of HEARTBEAT_IP (page 111), and “Cross-Subnet Configurations” (page 32).
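For illustration only (the interface names and addresses are placeholders, not
recommendations), heartbeat entries for a node in the cluster configuration file take
a form like this:
NODE_NAME ftsys9
  NETWORK_INTERFACE eth1
  HEARTBEAT_IP 192.168.1.1
  NETWORK_INTERFACE eth2
  HEARTBEAT_IP 10.10.10.1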

Specifying Maximum Number of Configured Packages


This value must be equal to or greater than the number of packages currently configured
in the cluster. The count includes all types of packages: failover, multi-node, and system
multi-node. The maximum number of packages per cluster is 300. The default is the
maximum.

NOTE: Remember to tune kernel parameters on each node to ensure that they are set
high enough for the largest number of packages that will ever run concurrently on that
node.
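For example, assuming the MAX_CONFIGURED_PACKAGES parameter name used in the cluster
configuration file, you could reserve room for up to 150 packages with an entry such
as:
MAX_CONFIGURED_PACKAGES 150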

Modifying the MEMBER_TIMEOUT Parameter


The cmquerycl command supplies a default value of 14 seconds for the
MEMBER_TIMEOUT parameter. Changing this value will directly affect the cluster’s
re-formation and failover times. You may need to increase the value if you are
experiencing cluster node failures as a result of heavy system load or heavy network
traffic; or you may need to decrease it if cluster re-formations are taking a long time.
You can change MEMBER_TIMEOUT while the cluster is running.
For more information about node timeouts, see “What Happens when a Node Times
Out” (page 90) and the MEMBER_TIMEOUT parameter discussions under “Cluster
Configuration Parameters ” (page 105), and “Cluster Re-formations Caused by
MEMBER_TIMEOUT Being Set too Low” (page 292).
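For example, to raise the timeout, you could edit the MEMBER_TIMEOUT entry in the
cluster configuration file and re-apply the configuration. The value in the file is
expressed in microseconds on this release; confirm the units against the comments in
the template generated by cmquerycl:
MEMBER_TIMEOUT 25000000
cmapplyconf -v -C $SGCONF/clust1.conf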



Controlling Access to the Cluster
Serviceguard access-control policies define cluster users’ administrative or monitoring
capabilities.

A Note about Terminology


Although you will also sometimes see the term role-based access (RBA) in the output
of Serviceguard commands, the preferred set of terms, always used in this manual, is
as follows:
• Access-control policies - the set of rules defining user access to the cluster.
— Access-control policy - one of these rules, comprising the three parameters
USER_NAME, USER_HOST, USER_ROLE. See “Setting up Access-Control
Policies” (page 185).
• Access roles - the set of roles that can be defined for cluster users (Monitor, Package
Admin, Full Admin).
— Access role - one of these roles (for example, Monitor).

How Access Roles Work


Serviceguard daemons grant access to Serviceguard commands by matching the
command user’s hostname and username against the access control policies you define.
Each user can execute only the commands allowed by his or her role.
Figure 5-1 shows the access roles and their capabilities. The innermost circle is
the most trusted; the outermost the least. Each role can perform its own functions and
the functions in all of the circles outside it. For example Serviceguard Root can perform
its own functions plus all the functions of Full Admin, Package Admin and Monitor;
Full Admin can perform its own functions plus the functions of Package Admin and
Monitor; and so on.



Figure 5-1 Access Roles

Levels of Access
Serviceguard recognizes two levels of access, root and non-root:
• Root access: Full capabilities; only role allowed to configure the cluster.
As Figure 5-1 shows, users with root access have complete control over the
configuration of the cluster and its packages. This is the only role allowed to use
the cmcheckconf, cmapplyconf, cmdeleteconf, and cmmodnet -a
commands.
In order to exercise this Serviceguard role, you must log in as the root user
(superuser) on a node in the cluster you want to administer. Conversely, the root
user on any node in the cluster always has full Serviceguard root access privileges
for that cluster; no additional Serviceguard configuration is needed to grant these
privileges.



IMPORTANT: Users on systems outside the cluster can gain Serviceguard root
access privileges to configure the cluster only via a secure connection (rsh or ssh).

• Non-root access: Other users can be assigned one of four roles:


— Full Admin: Allowed to perform cluster administration, package administration, and
cluster and package view operations.
These users can administer the cluster, but cannot configure or create a cluster.
Full Admin includes the privileges of the Package Admin role.
— (all-packages) Package Admin: Allowed to perform package administration, and
use cluster and package view commands.
These users can run and halt any package in the cluster, and change its switching
behavior, but cannot configure or create packages. Unlike single-package
Package Admin, this role is defined in the cluster configuration file. Package
Admin includes the cluster-wide privileges of the Monitor role.
— (single-package) Package Admin: Allowed to perform package administration for
a specified package, and use cluster and package view commands.
These users can run and halt a specified package, and change its switching
behavior, but cannot configure or create packages. This is the only access role
defined in the package configuration file; the others are defined in the cluster
configuration file. Single-package Package Admin also includes the cluster-wide
privileges of the Monitor role.
— Monitor: Allowed to perform cluster and package view operations.
These users have read-only access to the cluster and its packages.

IMPORTANT: A remote user (one who is not logged in to a node in the cluster,
and is not connecting via rsh or ssh) can have only Monitor access to the cluster.
(Full Admin and Package Admin can be configured for such a user, but this usage
is deprecated. As of Serviceguard A.11.18 configuring Full Admin or Package
Admin for remote users gives them Monitor capabilities. See “Setting up
Access-Control Policies” (page 185) for more information.)

Setting up Access-Control Policies


The root user on each cluster node is automatically granted the Serviceguard root access
role on all nodes. (See “Configuring Root-Level Access” (page 155) for more information.)
Access-control policies define non-root roles for other cluster users.



NOTE: For more information and advice, see the white paper Securing Serviceguard
at http://docs.hp.com -> High Availability -> Serviceguard ->
White Papers.
Define access-control policies for a cluster in the cluster configuration file; see “Cluster
Configuration Parameters ” (page 105). To define access control for a specific package,
use user_host (page 220) and related parameters in the package configuration file. You
can define up to 200 access policies for each cluster. A root user can create or modify
access control policies while the cluster is running.

NOTE: Once nodes are configured into a cluster, the access-control policies you set
in the cluster and package configuration files govern cluster-wide security; changes to
the “bootstrap” cmclnodelist file are ignored (see “Allowing Root Access to an
Unconfigured Node” (page 155)).
Access control policies are defined by three parameters in the configuration file:
• Each USER_NAME can consist either of the literal ANY_USER, or a maximum of
8 login names from the /etc/passwd file on USER_HOST. The names must be
separated by spaces or tabs, for example:
# Policy 1:
USER_NAME john fred patrick
USER_HOST bit
USER_ROLE PACKAGE_ADMIN
• USER_HOST is the node where USER_NAME will issue Serviceguard commands.

NOTE: The commands must be issued on USER_HOST but can take effect on
other nodes; for example patrick can use bit’s command line to start a package
on gryf (assuming bit and gryf are in the same cluster).
Choose one of these three values for USER_HOST:
— ANY_SERVICEGUARD_NODE - any node on which Serviceguard is configured,
and which is on a subnet with which nodes in this cluster can communicate
(as reported by cmquerycl -w full).



NOTE: If you set USER_HOST to ANY_SERVICEGUARD_NODE, set
USER_ROLE to MONITOR; users connecting from outside the cluster cannot
have any higher privileges (unless they are connecting via rsh or ssh; this is
treated as a local connection).
Depending on your network configuration, ANY_SERVICEGUARD_NODE can
provide wide-ranging read-only access to the cluster.

— CLUSTER_MEMBER_NODE - any node in the cluster


— A specific node name - Use the hostname portion (the first part) of a
fully-qualified domain name that can be resolved by the name service you are
using; it should also be in each node’s /etc/hosts. Do not use an IP address
or the fully-qualified domain name. If there are multiple hostnames (aliases)
for an IP address, one of those must match USER_HOST. See “Configuring
Name Resolution” (page 156) for more information.
• USER_ROLE must be one of these three values:
— MONITOR
— FULL_ADMIN
— PACKAGE_ADMIN
MONITOR and FULL_ADMIN can be set only in the cluster configuration file and
they apply to the entire cluster. PACKAGE_ADMIN can be set in the cluster
configuration file or a package configuration file. If it is set in the cluster
configuration file, PACKAGE_ADMIN applies to all configured packages; if it is set
in a package configuration file, it applies to that package only. These roles are not
exclusive; for example, more than one user can have the PACKAGE_ADMIN role for
the same package.

NOTE: You do not have to halt the cluster or package to configure or modify access
control policies.
Here is an example of an access control policy:
USER_NAME john
USER_HOST bit
USER_ROLE PACKAGE_ADMIN
If this policy is defined in the cluster configuration file, it grants user john the
PACKAGE_ADMIN role for any package on node bit. User john also has the MONITOR
role for the entire cluster, because PACKAGE_ADMIN includes MONITOR. If the policy
is defined in the package configuration file for PackageA, then user john on node bit
has the PACKAGE_ADMIN role only for PackageA.



Plan the cluster’s roles and validate them as soon as possible. If your organization’s
security policies allow it, you may find it easiest to create group logins. For example,
you could create a MONITOR role for user operator1 from CLUSTER_MEMBER_NODE
(that is, from any node in the cluster). Then you could give this login name and
password to everyone who will need to monitor your clusters.
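For example, such a policy would appear in the cluster configuration file as:
USER_NAME operator1
USER_HOST CLUSTER_MEMBER_NODE
USER_ROLE MONITOR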

Role Conflicts
Do not configure different roles for the same user and host; Serviceguard treats this as
a conflict and will fail with an error when applying the configuration. “Wildcards”,
such as ANY_USER and ANY_SERVICEGUARD_NODE, are an exception: it is acceptable
for ANY_USER and john to be given different roles.

IMPORTANT: Wildcards do not degrade higher-level roles that have been granted to
individual members of the class specified by the wildcard. For example, you might set
up the following policy to allow root users on remote systems access to the cluster:
USER_NAME root
USER_HOST ANY_SERVICEGUARD_NODE
USER_ROLE MONITOR
This does not reduce the access level of users who are logged in as root on nodes in this
cluster; they will always have full Serviceguard root-access capabilities.

Consider what would happen if these entries were in the cluster configuration file:
# Policy 1:
USER_NAME john
USER_HOST bit
USER_ROLE PACKAGE_ADMIN

# Policy 2:
USER_NAME john
USER_HOST bit
USER_ROLE MONITOR

# Policy 3:
USER_NAME ANY_USER
USER_HOST ANY_SERVICEGUARD_NODE
USER_ROLE MONITOR
In the above example, the configuration would fail because user john is assigned two
roles. (In any case, Policy 2 is unnecessary, because PACKAGE_ADMIN includes the role
of MONITOR.)
Policy 3 does not conflict with any other policies, even though the wildcard ANY_USER
includes the individual user john.



NOTE: Check spelling especially carefully when typing wildcards, such as ANY_USER
and ANY_SERVICEGUARD_NODE. If they are misspelled, Serviceguard will assume they
are specific users or nodes.

Package versus Cluster Roles


Package configuration will fail if there is any conflict in roles between the package
configuration and the cluster configuration, so it is a good idea to have the cluster
configuration file in front of you when you create roles for a package; use cmgetconf
to get a listing of the cluster configuration file.
If a role is configured for a username/hostname in the cluster configuration file, do not
specify a role for the same username/hostname in the package configuration file; and
note that there is no point in assigning a package administration role to a user who is
root on any node in the cluster; this user already has complete control over the
administration of the cluster and its packages.

Verifying the Cluster Configuration


If you have edited a cluster configuration template file, use the following command to
verify the content of the file:
cmcheckconf -v -C $SGCONF/clust1.conf
This command checks the following:
• Network addresses and connections.
• Quorum server connection.
• All lock LUN device names on all nodes refer to the same physical disk area.
• One and only one lock LUN device is specified per node.
• A quorum server or lock LUN is configured, but not both.
• Uniqueness of names.
• Existence and permission of scripts specified in the command line.
• If all nodes specified are in the same heartbeat subnet.
• Correct configuration filename.
• All nodes can be accessed.
• No more than one CLUSTER_NAME, HEARTBEAT_INTERVAL, and
AUTO_START_TIMEOUT are specified.
• The value for package run and halt script timeouts does not exceed the maximum.
• The value for HEARTBEAT_INTERVAL is at least one second.
• The value for NODE_TIMEOUT is at least twice the value of
HEARTBEAT_INTERVAL.
• The value for AUTO_START_TIMEOUT variables is greater than zero.



• Heartbeat network minimum requirement. See HEARTBEAT_IP under “Cluster
Configuration Parameters ” (page 105).
• At least one NODE_NAME is specified.
• Each node is connected to each heartbeat network.
• All heartbeat networks are of the same type of LAN.
• The network interface device files specified are valid LAN device files.
• Other configuration parameters for the cluster and packages are valid.
If the cluster is online the cmcheckconf command also verifies that all the conditions
for the specific change in configuration have been met.

NOTE: Using the -k option means that cmcheckconf only checks disk connectivity
to the LVM disks that are identified in the cluster configuration file. Omitting the -k
option (the default behavior) means that cmcheckconf tests the connectivity of all
LVM disks on all nodes. Using -k can result in significantly faster operation of the
command.

Cluster Lock Configuration Messages


The cmquerycl, cmcheckconf and cmapplyconf commands will return errors if
the cluster lock is not correctly configured. If there is no cluster lock in a cluster with
two nodes, the following message is displayed in the cluster configuration file:
# Warning: Neither a quorum server nor a lock lun was specificed.
# A Quorum Server or a lock lun is required for clusters of only two nodes.
If you attempt to configure both a quorum server and a lock LUN, the following message
appears on standard output when issuing the cmcheckconf or cmapplyconf command:
Duplicate cluster lock, line 55. Quorum Server already specified.

Distributing the Binary Configuration File


After specifying all cluster parameters, use the cmapplyconf command to apply the
configuration. This action distributes the binary configuration file to all the nodes in
the cluster. HP recommends doing this separately before you configure packages
(described in the next chapter). In this way, you can verify the quorum server, heartbeat
networks, and other cluster-level operations by using the cmviewcl command on the
running cluster. Before distributing the configuration, ensure that your security files
permit copying among the cluster nodes. See “Configuring Root-Level Access”
(page 155).
The following command distributes the binary configuration file:
cmapplyconf -k -v -C $SGCONF/clust1.conf



Managing the Running Cluster
This section describes some approaches to routine management of the cluster. For more
information, see Chapter 7: “Cluster and Package Maintenance” (page 229). You can
manage the cluster from Serviceguard Manager, or by means of Serviceguard commands
as described below.

Checking Cluster Operation with Serviceguard Commands


• cmviewcl checks the status of the cluster and many of its components. A non-root
user with the role of Monitor can run this command from a cluster node or see
status information in Serviceguard Manager.
• cmrunnode is used to start a node. A non-root user with the role of Full Admin
can run this command from a cluster node or through Serviceguard Manager.
• cmhaltnode is used to manually stop a running node. (This command is also
used by shutdown(1m).) A non-root user with the role of Full Admin can run
this command from a cluster node or through Serviceguard Manager.
• cmruncl is used to manually start a stopped cluster. A non-root user with Full
Admin access can run this command from a cluster node, or through Serviceguard
Manager.
• cmhaltcl is used to manually stop a cluster. A non-root user with Full Admin
access can run this command from a cluster node or through Serviceguard
Manager.
You can use these commands to test cluster operation, as in the following:
1. If the cluster is not already running, start it:
cmruncl -v
By default, cmruncl will check the networks: Serviceguard will compare the actual
network configuration with the network information in the cluster configuration.
If you do not need this validation, use cmruncl -v -w none instead to turn
off validation and save time.
2. When the cluster has started, make sure that cluster components are operating
correctly:
cmviewcl -v
Make sure that all nodes and networks are functioning as expected. For more
information, refer to the chapter on “Cluster and Package Maintenance.”
3. Verify that nodes leave and enter the cluster as expected using the following steps:
• Halt the node. You can use Serviceguard Manager or the cmhaltnode
command.
• Check the cluster membership to verify that the node has left the cluster. You
can use the Serviceguard Manager main page or the cmviewcl command.

• Start the node. You can use Serviceguard Manager or the cmrunnode
command.
• Verify that the node has returned to operation. You can use Serviceguard
Manager or the cmviewcl command again.
4. Bring down the cluster. You can use Serviceguard Manager or the cmhaltcl -v
-f command.
See the manpages for more information about these commands. See Chapter 8:
“Troubleshooting Your Cluster” (page 281) for more information about cluster testing.

Setting up Autostart Features


Automatic startup is the process in which each node individually joins a cluster;
Serviceguard provides a startup script to control the startup process. If a cluster already
exists, the node attempts to join it; if no cluster is running, the node attempts to form
a cluster consisting of all configured nodes. Automatic cluster start is the preferred
way to start a cluster. No action is required by the system administrator.
There are three cases:
• The cluster is not running on any node, all cluster nodes are reachable, and
all are attempting to start up. In this case, the node attempts to form a cluster
consisting of all configured nodes.
• The cluster is already running on at least one node. In this case, the node attempts
to join that cluster.
• Neither is true: the cluster is not running on any node, and not all the nodes are
reachable and trying to start. In this case, the node will attempt to start for the
AUTO_START_TIMEOUT period. If neither of these things becomes true in that
time, startup will fail.
To enable automatic cluster start, set the flag AUTOSTART_CMCLD to 1 in the
$SGAUTOSTART file ($SGCONF/cmcluster.rc) on each node in the cluster; the nodes
will then join the cluster at boot time.
Here is an example of the $SGAUTOSTART file:
SGAUTOSTART=/usr/local/cmcluster/conf/cmcluster.rc
#*************************** CMCLUSTER *************************

# Highly Available Cluster configuration


#
# @(#) $Revision: 82.2 $
#
#
# AUTOSTART_CMCLD
#
# Automatic startup is the process in which each node individually
# joins a cluster. If a cluster already exists, the node attempts
# to join it; if no cluster is running, the node attempts to form
# a cluster consisting of all configured nodes. Automatic cluster

# start is the preferred way to start a cluster. No action is
# required by the system administrator. If set to 1, the node will
# attempt to join/form its CM cluster automatically as described
# above. If set to 0, the node will not attempt to join its CM
# cluster.

AUTOSTART_CMCLD=1

NOTE: The /sbin/init.d/cmcluster file may call files that Serviceguard stores
in $SGCONF/rc. (See “Understanding the Location of Serviceguard Files” (page 153)
for information about Serviceguard directories on different Linux distributions.) This
directory is for Serviceguard use only! Do not move, delete, modify, or add files in this
directory.

Changing the System Message


You may find it useful to modify the system's login message to include a statement
such as the following:
This system is a node in a high availability cluster.
Halting this system may cause applications and services to
start up on another node in the cluster.
You may want to include a list of all cluster nodes in this message, together with
additional cluster-specific information.
The /etc/motd file may be customized to include cluster-related information.
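For example, you could append the statement shown above to /etc/motd on each node:
cat >> /etc/motd <<EOF
This system is a node in a high availability cluster.
Halting this system may cause applications and services to
start up on another node in the cluster.
EOF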

Managing a Single-Node Cluster


The number of nodes you will need for your cluster depends on the processing
requirements of the applications you want to protect.
In a single-node cluster, a quorum server is not required, since there is no other node
in the cluster. The output from the cmquerycl command omits the quorum server
information area if there is only one node.
You still need to have redundant networks, but you do not need to specify any heartbeat
LANs, since there is no other node to send heartbeats to. In the cluster configuration
file, specify all LANs that you want Serviceguard to monitor. For LANs that already
have IP addresses, specify them with the STATIONARY_IP parameter, rather than the
HEARTBEAT_IP parameter.
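For illustration only (the interface name and address are placeholders), such a LAN
would be specified in the cluster configuration file like this:
NODE_NAME ftsys9
  NETWORK_INTERFACE eth0
  STATIONARY_IP 192.168.2.10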

Single-Node Operation
Single-node operation occurs in a single-node cluster, or in a multi-node cluster in
which all but one node has failed, or in which you have shut down all but one node,
which will probably have applications running. As long as the Serviceguard daemon
cmcld is active, other nodes can rejoin the cluster at a later time.



If the cmcld daemon fails during single-node operation, it will leave the single node
up and your applications running. (This is different from the failure of cmcld in a
multi-node cluster, which causes the node to halt with a reboot, and packages to be
switched to adoptive nodes.)
It is not necessary to halt the single node in this case, since the applications are still
running, and no other node is currently available for package switching.

CAUTION: But you should not try to restart Serviceguard; data corruption might
occur if another node were to attempt to start up a new instance of an application that
is still running on the single node. Instead, choose an appropriate time to shut down
and reboot the node. This will allow the applications to shut down and Serviceguard
to restart the cluster after the reboot.

Disabling identd
Ignore this section unless you have a particular need to disable identd.
You can configure Serviceguard not to use identd.

CAUTION: This is not recommended. Consult the white paper Securing Serviceguard
at http://docs.hp.com -> High Availability -> Serviceguard ->
White Papers for more information.
If you must disable identd, do the following on each node after installing Serviceguard
but before each node rejoins the cluster (e.g. before issuing a cmrunnode or cmruncl).
For Red Hat and SUSE:
1. Change the value of the server_args parameter in the file
/etc/xinetd.d/hacl-cfg from -c to -c -i
2. Change the server_args parameter in the /etc/xinetd.d/hacl-probe file to
include the value -i
On a SUSE system, change
server_args = -f /opt/cmom/log/cmomd.log -r /opt/cmom/run
to
server_args = -i -f /opt/cmom/log/cmomd.log -r /opt/cmom/run
On a Red Hat system, change
server_args = -f /usr/local/cmom/log/cmomd.log -r /usr/local/cmom/run
to
server_args = -i -f /usr/local/cmom/log/cmomd.log -r /usr/local/cmom/run
3. Restart xinetd:
/etc/init.d/xinetd restart

Deleting the Cluster Configuration


You can delete a cluster configuration by means of the cmdeleteconf command. The
command prompts for a verification before deleting the files unless you use the -f
option. You can delete the configuration only when the cluster is down. The action
removes the binary configuration file from all the nodes in the cluster and resets all
cluster-aware volume groups to be no longer cluster-aware.
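For example, assuming the cluster name lpcluster used in the example below, you could
delete the configuration without being prompted as follows (omit -f if you want to be
prompted for confirmation):
cmdeleteconf -f -c lpcluster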

NOTE: The cmdeleteconf command removes only the cluster binary file $SGCONF/
cmclconfig. It does not remove any other files from the $SGCONF directory.
Although the cluster must be halted, all nodes in the cluster should be powered up
and accessible before you use the cmdeleteconf command. If a node is powered
down, power it up and allow it to boot. If a node is inaccessible, you will see a list of
inaccessible nodes and the following message:
Checking current status
cmdeleteconf: Unable to reach node lptest1.
WARNING: Once the unreachable node is up, cmdeleteconf
should be executed on the node to remove the configuration.

Delete cluster lpcluster anyway (y/[n])?


Reply Yes to remove the configuration. Later, if the inaccessible node becomes available,
run cmdeleteconf on that node to remove the configuration file.

6 Configuring Packages and Their Services
Serviceguard packages group together applications and the services and resources they
depend on.
The typical Serviceguard package is a failover package that starts on one node but can
be moved (“failed over”) to another if necessary. For more information, see “What is
Serviceguard for Linux? ” (page 23), “How the Package Manager Works” (page 49),
and“Package Configuration Planning ” (page 123).
You can also create multi-node packages, which run on more than one node at the
same time.
System multi-node packages, which run on all the nodes in the cluster, are supported
only for applications supplied by HP.
Creating or modifying a package requires the following broad steps, each of which is
described in the sections that follow:
1. Decide on the package’s major characteristics and choose the modules you need
to include (page 198).
2. Generate the package configuration file (page 222).
3. Edit the configuration file (page 223).
4. Verify and apply the package configuration (page 227).
5. Add the package to the cluster (page 228).

NOTE: This is a new process for configuring packages, as of Serviceguard A.11.18.
This manual refers to packages created by this method as modular packages, and
assumes that you will use it to create new packages.
Packages created using Serviceguard A.11.16 or earlier are referred to as legacy
packages. If you need to reconfigure a legacy package (rather than create a new package),
see “Configuring a Legacy Package” (page 262).
It is also still possible to create new legacy packages by the method described in
“Configuring a Legacy Package”. If you are using a Serviceguard Toolkit, consult the
documentation for that product.
If you decide to convert a legacy package to a modular package, see “Migrating a
Legacy Package to a Modular Package” (page 272). Do not attempt to convert
Serviceguard Toolkit packages.
(Parameters that are in the package control script for legacy packages, but in the package
configuration file instead for modular packages, are indicated by (S) in the tables under
“Optional Package Modules” (page 202).)

Choosing Package Modules

IMPORTANT: Before you start, you need to do the package-planning tasks described
under “Package Configuration Planning ” (page 123).
To choose the right package modules, you need to decide the following things about
the package you are creating:
• What type of package it is; see “Types of Package: Failover, Multi-Node, System
Multi-Node” (page 198).
• Which parameters need to be specified for the package (beyond those included in
the base type, which is normally failover, multi-node, or system-multi-node).
See “Package Modules and Parameters” (page 199).
When you have made these decisions, you are ready to generate the package
configuration file; see “Generating the Package Configuration File” (page 222).

Types of Package: Failover, Multi-Node, System Multi-Node


There are three types of packages:
• Failover packages. This is the most common type of package. Failover packages
run on one node at a time. If there is a failure, Serviceguard (or a user) can halt

them, and then start them up on another node selected from the package’s
configuration list; see node_name (page 205).
To generate a package configuration file that creates a failover package, include -m
sg/failover on the cmmakepkg command line. See “Generating the Package
Configuration File” (page 222).
• Multi-node packages. These packages run simultaneously on more than one node
in the cluster. Failures of package components such as applications, services, or
subnets, will cause the package to be halted only on the node on which the failure
occurred.
Relocatable IP addresses cannot be assigned to multi-node packages.

IMPORTANT: Multi-node packages must either use a clustered file system such
as Red Hat GFS, or not use shared storage.
To generate a package configuration file that creates a multi-node package,
include -m sg/multi_node on the cmmakepkg command line. See “Generating
the Package Configuration File” (page 222).
• System multi-node packages. System multi-node packages are supported only
for applications supplied by HP.

NOTE: The following parameters cannot be configured for multi-node packages:


• failover_policy
• failback_policy
• ip_subnet
• ip_address
Volume groups configured for packages of this type must be activated in shared mode.

For more information about types of packages and how they work, see “How the
Package Manager Works” (page 49). For information on planning a package, see
“Package Configuration Planning ” (page 123).
When you have decided on the type of package you want to create, the next step is to
decide what additional package-configuration modules you need to include; see
“Package Modules and Parameters” (page 199).

Package Modules and Parameters


The table that follows shows the package modules and the configuration parameters
each module includes. Read this section in conjunction with the discussion under
“Package Configuration Planning ” (page 123).
Use this information, and the parameter explanations that follow (page 204) to decide
which modules (if any) you need to add to the failover, multi-node, or system multi-node
module, to create your package. If you are used to creating legacy packages, you will
notice that parameters from the package control script (or their equivalents) are now
in the package configuration file; these parameters are marked (S) in the table.
You can use cmmakepkg -l (letter “l”) to see a list of all available modules, including
non-Serviceguard modules such as those supplied in the HP Toolkits.

NOTE: If you are going to create a complex package that contains many modules,
you may want to skip the process of selecting modules, and simply create a configuration
file that contains all the modules:
cmmakepkg -m sg/all $SGCONF/pkg_sg_complex
(The output will be written to $SGCONF/pkg_sg_complex.)

Base Package Modules


At least one base module (or default or all, which include the base module) must
be specified on the cmmakepkg command line. Parameters marked with an asterisk
(*) are new or changed as of Serviceguard A.11.18 or A.11.19. (S) indicates that the
parameter (or its equivalent) has moved from the package control script to the package
configuration file for modular packages. See the “Package Parameter Explanations”
(page 204) for more information.



Table 6-1 Base Modules

failover
Parameters: package_name (page 204)*, module_name (page 205)*,
module_version (page 205)*, package_type (page 205), package_description (page 205)*,
node_name (page 205), auto_run (page 206), node_fail_fast_enabled (page 206),
run_script_timeout (page 207), halt_script_timeout (page 207),
successor_halt_script_timeout (page 208)*, script_log_file (page 208),
operation_sequence (page 208)*, log_level (page 208)*, failover_policy (page 209),
failback_policy (page 209), priority (page 209)*
Comments: Base module. Use as primary building block for failover packages.

multi_node
Parameters: package_name (page 204)*, module_name (page 205)*,
module_version (page 205)*, package_type (page 205), node_name (page 205),
auto_run (page 206), node_fail_fast_enabled (page 206), run_script_timeout (page 207),
halt_script_timeout (page 207), successor_halt_timeout (page 208)*,
script_log_file (page 208), operation_sequence (page 208)*, log_level (page 208)*,
priority (page 209)*
Comments: Base module. Use as primary building block for multi-node packages.

system_multi_node
Parameters: package_name (page 204)*, module_name (page 205)*,
module_version (page 205)*, package_type (page 205), node_name (page 205),
auto_run (page 206), node_fail_fast_enabled (page 206), run_script_timeout (page 207),
halt_script_timeout (page 207), successor_halt_timeout (page 208)*,
script_log_file (page 208)*, operation_sequence (page 208)*, log_level (page 208)*,
priority (page 209)*
Comments: Base module. Primary building block for system multi-node packages.
System multi-node packages are supported only for applications supplied by HP.



Optional Package Modules
Add optional modules to a base module if you need to configure the functions in
question. Parameters marked with an asterisk (*) are new or changed as of Serviceguard
A.11.18 or A.11.19. (S) indicates that the parameter (or its equivalent) has moved from
the package control script to the package configuration file for modular packages. See
the “Package Parameter Explanations” (page 204) for more information.
Table 6-2 Optional Modules

dependency
Parameters: dependency_name (page 210)*, dependency_condition (page 210),
dependency_location (page 211)
Comments: Add to a base module to create a package that depends on one or more
other packages.

weight
Parameters: weight_name (page 211)*, weight_value (page 211)*
Comments: Add to a base module to create a package that has weight that will be
counted against a node's capacity.

monitor_subnet
Parameters: monitored_subnet (page 212)*, monitored_subnet_access (page 212)*
Comments: Add to a base module to configure subnet monitoring for the package.

package_ip
Parameters: ip_subnet (page 213)* (S), ip_subnet_node (page 214)*,
ip_address (page 214)* (S)
Comments: Add to a failover module to assign relocatable IP addresses to a failover
package.

service
Parameters: service_name (page 214)* (S), service_cmd (page 215) (S),
service_restart (page 215)* (S), service_fail_fast_enabled (page 216),
service_halt_timeout (page 216)
Comments: Add to a base module to create a package that runs an application or
service.

volume_group
Parameters: vgchange_cmd (page 216)* (S), vg (page 216) (S)
Comments: Add to a base module if the package needs to mount file systems (other
than Red Hat GFS) on LVM volumes.

filesystem
Parameters: concurrent_fsck_operations (page 217) (S),
concurrent_mount_and_umount_operations (page 217) (S),
fs_mount_retry_count (page 217) (S), fs_umount_retry_count (page 218)* (S),
fs_name (page 218)* (S), fs_directory (page 218)* (S), fs_type (page 218) (S),
fs_mount_opt (page 219) (S), fs_umount_opt (page 219) (S), fs_fsck_opt (page 219) (S)
Comments: Add to a base module to configure filesystem options for the package.

pev
Parameters: pev_ (page 219)*
Comments: Add to a base module to configure environment variables to be passed to
an external script.

external_pre
Parameters: external_pre_script (page 220)*
Comments: Add to a base module to specify additional programs to be run before
volume groups are activated while the package is starting and after they are
deactivated while the package is halting.

external
Parameters: external_script (page 220)*
Comments: Add to a base module to specify additional programs to be run during
package start and halt time.

acp
Parameters: user_name (page 221), user_host (page 220), user_role (page 221)
Comments: Add to a base module to configure Access Control Policies for the package.

all
Parameters: all parameters
Comments: Use if you are creating a complex package that requires most or all of
the optional parameters; or if you want to see the specifications and comments for
all available parameters.

multi_node_all
Parameters: all parameters that can be used by a multi-node package; includes the
multi_node, dependency, monitor_subnet, service, volume_group, filesystem, pev,
external_pre, external, and acp modules
Comments: Use if you are creating a multi-node package that requires most or all of
the optional parameters that are available for this type of package.

default
Parameters: (all parameters)
Comments: A symbolic link to the all module; used if a base module is not specified
on the cmmakepkg command line; see “cmmakepkg Examples” (page 222).


NOTE: The default form for parameter names in the modular package configuration
file is lower case; for legacy packages the default is upper case. There are no
compatibility issues; Serviceguard is case-insensitive as far as the parameter names are
concerned. This manual uses lower case, unless the parameter in question is used only
in legacy packages, or the context refers exclusively to such a package.

Package Parameter Explanations


Brief descriptions of the package configuration parameters follow.

NOTE: For more information, see the comments in the editable configuration file
output by the cmmakepkg command, and the cmmakepkg (1m) manpage.
If you are going to browse these explanations deciding which parameters you need,
you may want to generate and print out a configuration file that has the comments for
all of the parameters; you can create such a file as follows:
cmmakepkg -m sg/all $SGCONF/sg-all
or simply
cmmakepkg $SGCONF/sg-all
This creates a file $SGCONF/sg-all that contains all the parameters and comments.
(See “Understanding the Location of Serviceguard Files” (page 153) for the location of
$SGCONF on your version of Linux.)
More detailed instructions for running cmmakepkg are in the next section, “Generating
the Package Configuration File” (page 222).
See also “Package Configuration Planning ” (page 123).

package_name
Any name, up to a maximum of 39 characters, that:
• starts and ends with an alphanumeric character
• otherwise contains only alphanumeric characters or dot (.), dash (-), or underscore
(_)
• is unique among package names in this cluster



IMPORTANT: Restrictions on package names in previous Serviceguard releases
were less stringent. Packages whose names do not conform to the above rules will
continue to run, but if you reconfigure them, you will need to change the name;
cmcheckconf and cmapplyconf will enforce the new rules.

module_name
The module name. Do not change it. Used in the form of a relative path (for example
sg/failover) as a parameter to cmmakepkg to specify modules to be used in
configuring the package. (The files reside in the $SGCONF/modules directory; see
“Understanding the Location of Serviceguard Files” (page 153) for the location of
$SGCONF on your version of Linux.)
New for modular packages.

module_version
The module version. Do not change it.
New for modular packages.

package_type
The type can be failover, multi_node, or system_multi_node. You can configure
only failover or multi-node packages; see “Types of Package: Failover, Multi-Node,
System Multi-Node” (page 198).

package_description
The application that the package runs. This is a descriptive parameter that can be set
to any value you choose, up to a maximum of 80 characters. Default value is
Serviceguard Package. New for A.11.19.

node_name
The node on which this package can run, or a list of nodes in order of priority, or an
asterisk (*) to indicate all nodes. The default is *. For system multi-node packages, you
must specify node_name *.
If you use a list, specify each node on a new line, preceded by the literal node_name,
for example:
node_name <node1>
node_name <node2>
node_name <node3>
The order in which you specify the node names is important. First list the primary
node name (the node where you normally want the package to start), then the first

adoptive node name (the best candidate for failover), then the second adoptive node
name, followed by additional node names in order of preference.
In case of a failover, control of the package will be transferred to the next adoptive
node name listed in the package configuration file, or (if that node is not available or
cannot run the package at that time) to the next node in the list, and so on.

IMPORTANT: See “Cluster Configuration Parameters” (page 105) for important
information about node names.
See “About Cross-Subnet Failover” (page 147) for considerations when configuring
cross-subnet packages, which are further explained under “Cross-Subnet
Configurations” (page 32).

auto_run
Can be set to yes or no. The default is yes.
For failover packages, yes allows Serviceguard to start the package (on the first available
node listed under node_name) on cluster start-up, and to automatically restart it on an
adoptive node if it fails. no prevents Serviceguard from automatically starting the
package, and from restarting it on another node.
This is also referred to as package switching, and can be enabled or disabled while the
package is running, by means of the cmmodpkg command.
auto_run should be set to yes if the package depends on another package, or is depended
on; see “About Package Dependencies” (page 126).
For system multi-node packages, auto_run must be set to yes. In the case of a multi-node
package, setting auto_run to yes allows an instance to start on a new node joining the
cluster; no means it will not.
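For example, for a running failover package named pkg1 (the package name is illustrative), you could disable and later re-enable package switching as follows:
cmmodpkg -d pkg1
cmmodpkg -e pkg1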

node_fail_fast_enabled
Can be set to yes or no. The default is no.
yes means the node on which the package is running will be halted (rebooted) if the
package fails; no means Serviceguard will not halt the system.
If this parameter is set to yes and one of the following events occurs, Serviceguard
will halt the system (reboot) on the node where the control script fails:
• A package subnet fails and no backup network is available
• Serviceguard is unable to execute the halt function
• The start or halt function times out



NOTE: If the package halt function fails with “exit 1”, Serviceguard does not halt
the node, but sets no_restart for the package, which disables package switching,
setting auto_run (page 206) to no and thereby preventing the package from starting on
any adoptive node.
Setting node_fail_fast_enabled to yes prevents Serviceguard from repeatedly trying (and
failing) to start the package on the same node.
For system multi-node packages, node_fail_fast_enabled must be set to yes.

run_script_timeout
The amount of time, in seconds, allowed for the package to start; or no_timeout. The
default is no_timeout. The maximum is 4294.
If the package does not complete its startup in the time specified by run_script_timeout,
Serviceguard will terminate it and prevent it from switching to another node. In this
case, if node_fail_fast_enabled is set to yes, the node will be halted (rebooted).
If no timeout is specified (no_timeout), Serviceguard will wait indefinitely for the
package to start.
If a timeout occurs:
• Switching will be disabled.
• The current node will be disabled from running the package.

NOTE: If no_timeout is specified, and the script hangs, or takes a very long time
to complete, during the validation step (cmcheckconf (1m)), cmcheckconf will
wait 20 minutes to allow the validation to complete before giving up.

halt_script_timeout
The amount of time, in seconds, allowed for the package to halt; or no_timeout. The
default is no_timeout. The maximum is 4294.
If the package’s halt process does not complete in the time specified by
halt_script_timeout, Serviceguard will terminate the package and prevent it from
switching to another node. In this case, if node_fail_fast_enabled (page 206) is set to yes,
the node will be halted (rebooted).
If a halt_script_timeout is specified, it should be greater than the sum of all the values
set for service_halt_timeout (page 216) for this package.
If a timeout occurs:
• Switching will be disabled.
• The current node will be disabled from running the package.
If a halt-script timeout occurs, you may need to perform manual cleanup. See Chapter 8:
“Troubleshooting Your Cluster” (page 281).
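For example (the values are illustrative), you might allow ten minutes for package startup and set a halt timeout that exceeds the sum of the package's service_halt_timeout values:
run_script_timeout 600
halt_script_timeout 700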
successor_halt_timeout
Specifies how long, in seconds, Serviceguard will wait for packages that depend on
this package to halt, before halting this package. Can be 0 through 4294, or
no_timeout. The default is no_timeout.
• no_timeout means that Serviceguard will wait indefinitely for the dependent
packages to halt.
• 0 means Serviceguard will not wait for the dependent packages to halt before
halting this package.
New as of A.11.18 (for both modular and legacy packages). See also “About Package
Dependencies” (page 126).

script_log_file
The full pathname of the package’s log file. The default
is $SGRUN/log/<package_name>.log. (See “Understanding the Location of
Serviceguard Files” (page 153) for more information about Serviceguard pathnames.)
See also log_level.

operation_sequence
Defines the order in which the scripts defined by the package’s component modules
will start up. See the package configuration file for details.
This parameter is not configurable; do not change the entries in the configuration file.
New for modular packages.

log_level
Determines the amount of information printed to stdout when the package is validated,
and to the script_log_file when the package is started and halted. Valid values are 0
through 5, but you should normally use only the first two (0 or 1); the remainder (2
through 5) are intended for use by HP Support.
• 0 - informative messages
• 1 - informative messages with slightly more detail
• 2 - messages showing logic flow
• 3 - messages showing detailed data structure information
• 4 - detailed debugging information
• 5 - function call flow
New for modular packages.



failover_policy
Specifies how Serviceguard decides where to start the package, or restart it if it fails.
Can be set to configured_node or min_package_node. The default is
configured_node.
• configured_node means Serviceguard will attempt to start the package on the
first available node in the list you provide under node_name (page 205).
• min_package_node means Serviceguard will start the package on whichever
node in the node_name list has the fewest packages running at the time.
This parameter can be set for failover packages only. If this package will depend on
another package or vice versa, see also “About Package Dependencies” (page 126).

failback_policy
Specifies whether or not Serviceguard will automatically move a package that is not
running on its primary node (the first node on its node_name list) when the primary
node is once again available. Can be set to automatic or manual. The default is
manual.
• manual means the package will continue to run on the current node.
• automatic means Serviceguard will move the package to the primary node as
soon as that node becomes available, unless doing so would also force a package
with a higher priority to move.
This parameter can be set for failover packages only. If this package will depend on
another package or vice versa, see also “About Package Dependencies” (page 126).
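For example, the following (illustrative) settings start the package on the first available node in its node_name list, and move it back to its primary node automatically when that node rejoins the cluster:
failover_policy configured_node
failback_policy automatic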

priority
Assigns a priority to a failover package whose failover_policy is configured_node.
Valid values are 1 through 3000, or no_priority. The default is no_priority. See
also the dependency_ parameter descriptions (page 210).
priority can be used to satisfy dependencies when a package starts, or needs to fail over
or fail back: a package with a higher priority than the packages it depends on can force
those packages to start or restart on the node it chooses, so that its dependencies are
met.
If you assign a priority, it must be unique in this cluster. A lower number indicates a
higher priority, and a numerical priority is higher than no_priority. HP recommends
assigning values in increments of 20 so as to leave gaps in the sequence; otherwise you
may have to shuffle all the existing priorities when assigning priority to a new package.



IMPORTANT: Because priority is a matter of ranking, a lower number indicates a
higher priority (20 is a higher priority than 40). A numerical priority is higher than
no_priority.
New as of A.11.18 (for both modular and legacy packages). See “About Package
Dependencies” (page 126) for more information.
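For example (the package names are hypothetical), assigning priorities in increments of 20 leaves room to insert new packages later without renumbering:
# in pkg1's configuration file
priority 20
# in pkg2's configuration file
priority 40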

dependency_name
A unique identifier for a particular dependency (see dependency_condition) that must
be met in order for this package to run (or keep running). It must be unique among
this package's dependency_names. The length and formal restrictions for the name are
the same as for package_name (page 204).

IMPORTANT: Restrictions on dependency names in previous Serviceguard releases
were less stringent. Packages that specify dependency_names that do not conform to the
above rules will continue to run, but if you reconfigure them, you will need to change
the dependency_name; cmcheckconf and cmapplyconf will enforce the new rules.
Configure this parameter, along with dependency_condition and dependency_location, and
optionally priority (page 209), if this package depends on another package; for example,
if this package depends on a package named pkg2:
dependency_name pkg2dep
dependency_condition pkg2 = UP
dependency_location same_node
For more information about package dependencies, see “About Package Dependencies”
(page 126).

dependency_condition
The condition that must be met for this dependency to be satisfied. As of Serviceguard
A.11.18, the only condition that can be set is that another package must be running.
The syntax is: <package_name> = UP, where <package_name> is the name of the
package depended on. The type and characteristics of the current package (the one we
are configuring) impose the following restrictions on the type of package it can depend
on:



• If the current package is a multi-node package, <package_name> must identify
a multi-node or system multi-node package.
• If the current package is a failover package and its failover_policy (page 209) is
min_package_node, <package_name> must identify a multi-node or system
multi-node package.
• If the current package is a failover package and configured_node is its
failover_policy, <package_name> must identify a multi-node or system multi-node
package, or a failover package whose failover_policy is configured_node.
See also “About Package Dependencies” (page 126).

dependency_location
Specifies where the dependency_condition must be met. The only legal value is
same_node.

weight_name, weight_value
These parameters specify a weight for a package; this weight is compared to a node's
available capacity (defined by the CAPACITY_NAME and CAPACITY_VALUE
parameters in the cluster configuration file) to determine whether the package can run
there.
Both parameters are optional, but if weight_value is specified, weight_name must also
be specified, and must come first. You can define up to four weights, corresponding
to four different capacities, per cluster. To specify more than one weight for this package,
repeat weight_name and weight_value.

NOTE: If weight_name is package_limit, however, you can use only that one weight and
capacity throughout the cluster. package_limit is a reserved value, which, if used,
must be entered exactly in that form. It provides the simplest way of managing weights
and capacities; see “Simple Method” (page 135) for more information.
The rules for forming weight_name are the same as those for forming package_name
(page 204). weight_name must exactly match the corresponding CAPACITY_NAME.
weight_value is an unsigned floating-point value between 0 and 1000000 with at most
three digits after the decimal point.
You can use these parameters to override the cluster-wide default package weight that
corresponds to a given node capacity. You can define that cluster-wide default package
weight by means of the WEIGHT_NAME and WEIGHT_DEFAULT parameters in the
cluster configuration file (explicit default). If you do not define an explicit default (that
is, if you define a CAPACITY_NAME in the cluster configuration file with no
corresponding WEIGHT_NAME and WEIGHT_DEFAULT), the default weight is
assumed to be zero (implicit default). Configuring weight_name and weight_value here
in the package configuration file overrides the cluster-wide default (implicit or explicit),
and assigns a particular weight to this package.



For more information, see “About Package Weights” (page 134). See also the discussion
of the relevant parameters under “Cluster Configuration Parameters ” (page 105), in
the cmmakepkg (1m) and cmquerycl (1m) manpages, and in the cluster
configuration and package configuration template files.
New for 11.19.
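For example, assuming the cluster configuration file defines a capacity (the name and values here are hypothetical) on each node:
CAPACITY_NAME processor
CAPACITY_VALUE 10
you could assign this package a weight against that capacity as follows:
weight_name processor
weight_value 2.5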

monitored_subnet
The LAN subnet that is to be monitored for this package. Replaces legacy SUBNET
which is still supported in the package configuration file for legacy packages; see
“Configuring a Legacy Package” (page 262).
You can specify multiple subnets; use a separate line for each.
If you specify a subnet as a monitored_subnet, the package will not run on any node not
reachable via that subnet. This normally means that if the subnet is not up, the package
will not run. (For cross-subnet configurations, in which a subnet may be configured
on some nodes and not on others, see monitored_subnet_access below, ip_subnet_node
(page 214), and “About Cross-Subnet Failover” (page 147).)
Typically you would monitor the ip_subnet, specifying it here as well as in the ip_subnet
parameter (page 213), but you may want to monitor other subnets as well; you can
specify any subnet that is configured into the cluster (via the STATIONARY_IP
parameter in the cluster configuration file). See “Stationary and Relocatable IP Addresses
and Monitored Subnets” (page 71) for more information.
If any monitored_subnet fails, Serviceguard will switch the package to any other node
specified by node_name (page 205) which can communicate on all the monitored_subnets
defined for this package. See the comments in the configuration file for more information
and examples.
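For example, to monitor two subnets (the addresses are illustrative):
monitored_subnet 192.10.25.0
monitored_subnet 192.10.26.0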

monitored_subnet_access
In cross-subnet configurations, specifies whether each monitored_subnet is accessible
on all nodes in the package’s node_name list (page 205), or only some. Valid values are
PARTIAL, meaning that at least one of the nodes has access to the subnet, but not all;
and FULL, meaning that all nodes have access to the subnet. The default is FULL, and
it is in effect if monitored_subnet_access is not specified.
See also ip_subnet_node (page 214) and “About Cross-Subnet Failover” (page 147).
New for modular packages. For legacy packages, see “Configuring Cross-Subnet
Failover” (page 269).



ip_subnet
Specifies an IP subnet used by the package. Replaces SUBNET, which is still supported
in the package control script for legacy packages.

CAUTION: HP recommends that this subnet be configured into the cluster. You do
this in the cluster configuration file by specifying a HEARTBEAT_IP or STATIONARY_IP
under a NETWORK_INTERFACE on the same subnet, for each node in this package's
NODE_NAME list. For example, an entry such as the following in the cluster
configuration file configures subnet 192.10.25.0 (lan1) on node ftsys9:
NODE_NAME ftsys9
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.10.25.18
See “Cluster Configuration Parameters” (page 105) for more information.
If the subnet is not configured into the cluster, Serviceguard cannot manage or monitor
it, and in fact cannot guarantee that it is available on all nodes in the package's node_name
list (page 205). Such a subnet is referred to as an external subnet, and relocatable
addresses on that subnet are known as external addresses. If you use an external subnet,
you risk the following consequences:
• If the subnet fails, the package will not fail over to an alternate node.
• Even if the subnet remains intact, if the package needs to fail over because of some
other type of failure, it could fail to start on an adoptive node because the subnet
is not available on that node.
For these reasons, configure all ip_subnets into the cluster, unless you are using a
networking technology that does not support DLPI. In such cases, follow instructions
in the networking product's documentation to integrate the product with Serviceguard.

For each subnet used, specify the subnet address on one line and, on the following
lines, the relocatable IP addresses that the package uses on that subnet. These will be
configured when the package starts and unconfigured when it halts.
For example, if this package uses subnet 192.10.25.0 and the relocatable IP addresses
192.10.25.12 and 192.10.25.13, enter:
ip_subnet 192.10.25.0
ip_address 192.10.25.12
ip_address 192.10.25.13
If you want the subnet to be monitored, specify it in the monitored_subnet parameter
(page 212) as well.
In a cross-subnet configuration, you also need to specify which nodes the subnet is
configured on; see ip_subnet_node below. See also monitored_subnet_access (page 212)
and “About Cross-Subnet Failover” (page 147).



This parameter can be set for failover packages only.

ip_subnet_node
In a cross-subnet configuration, specifies which nodes an ip_subnet is configured on.
If no ip_subnet_nodes are listed under an ip_subnet, it is assumed to be configured on
all nodes in this package’s node_name list (page 205).
Can be added or deleted while the package is running, with these restrictions:
• The package must not be running on the node that is being added or deleted.
• The node must not be the first to be added to, or the last deleted from, the list of
ip_subnet_nodes for this ip_subnet.
See also monitored_subnet_access (page 212) and “About Cross-Subnet Failover” (page 147).
New for modular packages. For legacy packages, see “Configuring Cross-Subnet
Failover” (page 269).
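For example, in a cross-subnet configuration you might indicate (subnet and node names are illustrative) that a subnet is configured on only two of the package's nodes:
ip_subnet 192.10.25.0
ip_subnet_node ftsys9
ip_subnet_node ftsys10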

ip_address
A relocatable IP address on a specified ip_subnet. Replaces IP, which is still supported
in the package control script for legacy packages.
For more information about relocatable IP addresses, see “Stationary and Relocatable
IP Addresses and Monitored Subnets” (page 71).
This parameter can be set for failover packages only.

service_name
A service is a program or function which Serviceguard monitors as long the package
is up. service_name identifies this function and is used by the cmrunserv and
cmhaltserv commands. You can configure a maximum of 30 services per package
and 900 services per cluster.
The length and formal restrictions for the name are the same as for package_name
(page 204). service_name must be unique among all packages in the cluster.

IMPORTANT: Restrictions on service names in previous Serviceguard releases were
less stringent. Packages that specify services whose names do not conform to the above
rules will continue to run, but if you reconfigure them, you will need to change the
name; cmcheckconf and cmapplyconf will enforce the new rules.
Each service is defined by five parameters: service_name, service_cmd, service_restart,
service_fail_fast_enabled, and service_halt_timeout. See the descriptions that follow.
The following is an example of fully defined service:
service_name patricks-package4-ping
service_cmd "/usr/sbin/ping hasupt22"
service_restart unlimited



service_fail_fast_enabled no
service_halt_timeout 300
See the package configuration template file for more examples.
For legacy packages, this parameter is in the package control script as well as the
package configuration file.

service_cmd
The command that runs the program or function for this service_name, for example,
/usr/bin/X11/xclock -display 15.244.58.208:0
An absolute pathname is required; neither the PATH variable nor any other environment
variable is passed to the command. The default shell is /bin/sh.

NOTE: Be careful when defining service run commands. Each run command is
executed in the following way:
• The cmrunserv command executes the run command.
• Serviceguard monitors the process ID (PID) of the process the run command
creates.
• When the command exits, Serviceguard determines that a failure has occurred
and takes appropriate action, which may include transferring the package to an
adoptive node.
• If a run command is a shell script that runs some other command and then exits,
Serviceguard will consider this normal exit as a failure.
Make sure that each run command is the name of an actual service and that its process
remains alive until the actual service stops. One way to manage this is to configure a
package such that the service is actually a monitoring program that checks the health
of the application that constitutes the main function of the package, and exits if it finds
the application has failed. The application itself can be started by an external_script
(page 220).

This parameter is in the package control script for legacy packages.
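The following is a minimal sketch of such a monitoring service. The application name myapp, the polling interval, and the health check itself are placeholders; the application is assumed to be started separately (for example, by an external_script). The script stays alive while the application process exists and exits non-zero as soon as it disappears, so that Serviceguard treats the service as failed:
#!/bin/sh
# Hypothetical monitor for an application process named "myapp".
# While the process exists, keep running; when it disappears, exit non-zero.
while pgrep -x myapp > /dev/null
do
    sleep 5
done
exit 1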

service_restart
The number of times Serviceguard will attempt to re-run the service_cmd. Valid values
are unlimited, none or any positive integer value. Default is none.
If the value is unlimited, the service will be restarted an infinite number of times. If
the value is none, the service will not be restarted.
This parameter is in the package control script for legacy packages.



service_fail_fast_enabled
Specifies whether or not Serviceguard will halt the node (reboot) on which the package
is running if the service identified by service_name fails. Valid values are yes and no.
Default is no, meaning that failure of this service will not cause the node to halt.

service_halt_timeout
The length of time, in seconds, Serviceguard will wait for the service to halt before
forcing termination of the service’s process. The maximum value is 4294.
The value should be large enough to allow any cleanup required by the service to
complete.
If no value is specified, a zero timeout will be assumed, meaning that Serviceguard
will not wait any time before terminating the process.

vgchange_cmd
Replaces VGCHANGE, which is still supported for legacy packages; see “Configuring
a Legacy Package” (page 262). Specifies the method of activation for each Logical Volume
Manager (LVM) volume group identified by a vg entry.
The default is vgchange -a y.

vg
Specifies an LVM volume group (one per vg, each on a new line) on which a file system
(other than Red Hat GFS; see fs_type) needs to be mounted. A corresponding
vgchange_cmd (see above) specifies how the volume group is to be activated. The package
script generates the necessary filesystem commands on the basis of the fs_ parameters
(see “File system parameters” ).

File system parameters


A package can activate one or more storage groups on startup, and mount logical
volumes to file systems. At halt time, the package script unmounts the file systems and
deactivates each storage group. All storage groups must be accessible on each target
node.
For each file system (fs_name) you specify in the package configuration file, you must
identify a logical volume, the mount point, the mount, umount and fsck options, and
the type of the file system; for example:
fs_name /dev/vg01/lvol1
fs_directory /pkg01aa
fs_mount_opt "-o rw"
fs_umount_opt ""
fs_fsck_opt ""



fs_type "ext2"
A logical volume must be built on an LVM volume group. Logical volumes can be
entered in any order.
A gfs file system can be configured using only the fs_name, fs_directory, and fs_mount_opt
parameters; see the configuration file for an example. Additional rules apply for gfs
as explained under fs_type.
The parameter explanations that follow provide more detail.

concurrent_fsck_operations
The number of concurrent fsck operations allowed on file systems being mounted
during package startup. Not used for Red Hat GFS (see fs_type).
Legal value is any number greater than zero. The default is 1.
If the package needs to run fsck on a large number of file systems, you can improve
performance by carefully tuning this parameter during testing (increase it a little at
a time and monitor performance each time).

concurrent_mount_and_umount_operations
The number of concurrent mounts and umounts to allow during package startup or
shutdown.
Legal value is any number greater than zero. The default is 1.
If the package needs to mount and unmount a large number of file systems, you can
improve performance by carefully tuning this parameter during testing (increase it a
little at a time and monitor performance each time).
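For example (the values are illustrative and should be arrived at by testing), a package that mounts many file systems might use:
concurrent_fsck_operations 4
concurrent_mount_and_umount_operations 4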

fs_mount_retry_count
The number of mount retries for each file system. Legal value is zero or any greater
number. The default is zero. The only valid value for Red Hat GFS (see fs_type) is zero.
If the mount point is busy at package startup and fs_mount_retry_count is set to zero,
package startup will fail.
If the mount point is busy and fs_mount_retry_count is greater than zero, the startup
script will attempt to kill the user process responsible for the busy mount point (fuser
-ku) and then try to mount the file system again. It will do this the number of times
specified by fs_mount_retry_count.
If the mount still fails after the number of attempts specified by fs_mount_retry_count,
package startup will fail.
This parameter is in the package control script for legacy packages.



fs_umount_retry_count
The number of umount retries for each file system. Replaces FS_UMOUNT_COUNT,
which is still supported in the package control script for legacy packages; see
“Configuring a Legacy Package” (page 262).
Legal value is 1 or (for filesystem types other than Red Hat GFS) any greater number.
The default is 1. Operates in the same way as fs_mount_retry_count.

fs_name
This parameter, in conjunction with fs_directory, fs_type, fs_mount_opt, fs_umount_opt,
and fs_fsck_opt, specifies a filesystem that is to be mounted by the package. Replaces
LV, which is still supported in the package control script for legacy packages.
fs_name must specify the block device file for a logical volume.
File systems are mounted in the order you specify in the package configuration file,
and unmounted in the reverse order.
See “File system parameters” (page 216) and the comments in the FILESYSTEMS section
of the configuration file for more information and examples. See also “Volume Manager
Planning ” (page 99), and the mount manpage.

NOTE: For filesystem types other than Red Hat GFS (see fs_type), a volume group
must be defined in this file (using vg; see (page 216)) for each logical volume specified
by an fs_name entry.

fs_directory
The root of the file system specified by fs_name. Replaces FS, which is still supported
in the package control script for legacy packages; see “Configuring a Legacy Package”
(page 262).
See the mount manpage and the comments in the configuration file for more
information.

fs_type
The type of the file system specified by fs_name. This parameter is in the package control
script for legacy packages.
Supported types are ext2, ext3, reiserfs, and gfs.



NOTE: A package using gfs (Red Hat Global File System, or GFS) cannot use file
systems of any other type. vg and vgchange_cmd (page 216) are not valid for
GFS file systems. For more information about using GFS with Serviceguard, see
Clustering Linux Servers with the Concurrent Deployment of HP Serviceguard for Linux and
Red Hat Global File Systems for RHEL5 on docs.hp.com under High Availability
—> Serviceguard for Linux —> White Papers
See also concurrent_fsck_operations (page 217), fs_mount_retry_count and
fs_umount_retry_count (page 218), and fs_fsck_opt (page 219).

See the comments in the package configuration file template for more information.

fs_mount_opt
The mount options for the file system specified by fs_name. See the comments in the
configuration file for more information. This parameter is in the package control script
for legacy packages.

fs_umount_opt
The umount options for the file system specified by fs_name. See the comments in the
configuration file for more information. This parameter is in the package control script
for legacy packages.

fs_fsck_opt
The fsck options for the file system specified by fs_name. Not used for Red Hat GFS
(see fs_type). This parameter is in the package control script for legacy packages.
See the fsck manpage, and the comments in the configuration file, for more information.

pv
Physical volume on which persistent reservations (PR) will be made if the device
supports it. New for 11.19.

IMPORTANT: This parameter is for use only by HP partners, who should follow the
instructions in the package configuration file.
For information about Serviceguard's implementation of PR, see “About Persistent
Reservations” (page 86).

pev_
Specifies a package environment variable that can be passed to external_pre_script,
external_script, or both, by means of the cmgetpkgenv command. New for modular
packages.



The variable name must be in the form pev_<variable_name> and contain only
alphanumeric characters and underscores. The letters pev (upper-case or lower-case)
followed by the underscore (_) are required.
The variable name and value can each consist of a maximum of MAXPATHLEN
characters (4096 on Linux systems).
You can define more than one variable. See “About External Scripts” (page 143), as well
as the comments in the configuration file, for more information.
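For example, you could define a variable (the name and value are hypothetical) for an external script to read at run time:
pev_monitoring_interval 10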

external_pre_script
The full pathname of an external script to be executed before volume groups and disk
groups are activated during package startup, and after they have been deactivated
during package shutdown; that is, effectively the first step in package startup and last
step in package shutdown. New for modular packages.
If more than one external_pre_script is specified, the scripts will be executed on package
startup in the order they are entered into the package configuration file, and in the
reverse order during package shutdown.
See “About External Scripts” (page 143), as well as the comments in the configuration
file, for more information and examples.

external_script
The full pathname of an external script. This script is often the means of launching and
halting the application that constitutes the main function of the package. New for
modular packages.
The script is executed on package startup after volume groups and file systems are
activated and IP addresses are assigned, but before services are started; and during
package shutdown after services are halted but before IP addresses are removed and
volume groups and file systems deactivated.
If more than one external_script is specified, the scripts will be executed on package
startup in the order they are entered into this file, and in the reverse order during
package shutdown.
See “About External Scripts” (page 143), as well as the comments in the configuration
file, for more information and examples. See also service_cmd (page 215).

user_host
The system from which a user specified by user_name (page 221) can execute
package-administration commands.
Legal values are any_serviceguard_node, cluster_member_node, or a specific
cluster node. If you specify a specific node it must be the official hostname (the
hostname portion, and only the hostname portion, of the fully qualified domain
name). As with user_name, be careful to spell the keywords exactly as given.



user_name
Specifies the name of a user who has permission to administer this package. See also
user_host (page 220) and user_role; these three parameters together define the access
control policy for this package (see “Controlling Access to the Cluster” (page 183)).
These parameters must be defined in this order: user_name, user_host, user_role.
Legal values for user_name are any_user or a maximum of eight login names from
/etc/passwd on user_host.

NOTE: Be careful to spell any_user exactly as given; otherwise Serviceguard will
interpret it as a user name.
Note that the only user_role that can be granted in the package configuration file is
package_admin for this particular package; you grant other roles in the cluster
configuration file. See “Setting up Access-Control Policies” (page 185) for further
discussion and examples.

user_role
Must be package_admin, allowing the user access to the cmrunpkg, cmhaltpkg,
and cmmodpkg commands (and the equivalent functions in Serviceguard Manager)
and to the monitor role for the cluster. See “Controlling Access to the Cluster”
(page 183) for more information.
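For example, to allow a user (the login name admin1 is hypothetical) to administer this package from any node running Serviceguard:
user_name admin1
user_host any_serviceguard_node
user_role package_admin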

Additional Parameters Used Only by Legacy Packages

IMPORTANT: The following parameters are used only by legacy packages. Do not
try to use them in modular packages. See “Creating the Legacy Package Configuration
” (page 262) for more information.
PATH Specifies the path to be used by the script.
SUBNET Specifies the IP subnets that are to be monitored for the package.
RUN_SCRIPT and HALT_SCRIPT Use the full pathname of each script. These two
parameters allow you to separate package run instructions and package halt
instructions for legacy packages into separate scripts if you need to. In this
case, make sure you include identical configuration information (such as node
names, IP addresses, etc.) in both scripts. In most cases, though, HP
recommends that you use the same script for both run and halt instructions.
(When the package starts, the script is passed the parameter start; when it
halts, it is passed the parameter stop.)
LV The name of a logical volume hosting a file system that will be mounted by
the package.
FS The name of the mount point for a file system to be mounted by the package.
VGCHANGE As vgchange_cmd (page 216).

Generating the Package Configuration File


When you have chosen the configuration modules your package needs (see “Choosing
Package Modules” (page 198)), you are ready to generate a package configuration file
that contains those modules. This file will consist of a base module (failover, multi-node
or system multi-node) plus the modules that contain the additional parameters you
have decided to include.

Before You Start


Before you start building a package, create a subdirectory for it in the $SGCONF
directory, for example:
mkdir $SGCONF/pkg1
(See “Understanding the Location of Serviceguard Files” (page 153) for information
about Serviceguard pathnames.)

cmmakepkg Examples
The cmmakepkg command generates a package configuration file. Some examples
follow; see the cmmakepkg (1m) manpage for complete information. All the examples
create an editable configuration file pkg1.conf in the $SGCONF/pkg1 directory.



NOTE: If you do not include a base module (or default or all) on the cmmakepkg
command line, cmmakepkg will ignore the modules you specify and generate a default
configuration file containing all the parameters.
For a complex package, or if you are not yet sure which parameters you will need to
set, the default may be the best choice; see the first example below.
You can use the -v option with cmmakepkg to control how much information is
displayed online or included in the configuration file. Valid values are 0, 1 and 2. -v
0 removes all comments; -v 1 includes a brief heading for each parameter; -v 2 provides
a full description of each parameter. The default is level 2.

• To generate a configuration file that contains all the optional modules:
cmmakepkg $SGCONF/pkg1/pkg1.conf
• To create a generic failover package (that could be applied without editing):
cmmakepkg -n pkg1 -m sg/failover $SGCONF/pkg1/pkg1.conf
• To generate a configuration file for a failover package that uses relocatable IP
addresses and runs an application that requires file systems to be mounted at run
time (enter the command all on one line):
cmmakepkg -m sg/failover -m sg/package_ip -m sg/service -m
sg/filesystem -m sg/volume_group $SGCONF/pkg1/pkg1.conf
• To generate a configuration file for a failover package that runs an application that
requires another package to be up (enter the command all on one line):
cmmakepkg -m sg/failover -m sg/dependency -m sg/service
$SGCONF/pkg1/pkg1.conf
• To generate a configuration file adding the services module to an existing
package (enter the command all on one line):
cmmakepkg -i $SGCONF/pkg1/pkg1.conf -m sg/service
$SGCONF/pkg1/pkg1_v2.conf

NOTE: You can add more than one module at a time.

Next Step
The next step is to edit the configuration file you have generated; see “Editing the
Configuration File” (page 223).

Editing the Configuration File


When you have generated the configuration file that contains the modules your package
needs (see “Generating the Package Configuration File” (page 222)), you need to edit



the file to set the package parameters to the values that will make the package function
as you intend.
It is a good idea to configure complex failover packages in stages, as follows:
1. Configure volume groups and mount points only.
2. Check and apply the configuration; see “Verifying and Applying the Package
Configuration” (page 227).
3. Run the package and ensure that it can be moved from node to node (see the
example commands after this list).

NOTE: cmcheckconf and cmapplyconf check for missing mount points, volume
groups, etc.

4. Halt the package.
5. Configure package IP addresses and application services.
6. Run the package and ensure that applications run as expected and that the package
fails over correctly when services are disrupted. See “Testing the Package Manager
” (page 281).
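For steps 3 and 4, for example, you might use commands such as the following (the package and node names are illustrative):
cmrunpkg -n ftsys9 pkg1
cmhaltpkg pkg1
cmrunpkg -n ftsys10 pkg1
After starting a package manually with cmrunpkg, you typically re-enable package switching with cmmodpkg -e pkg1 so that the package can fail over automatically.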
Use the following bullet points as a checklist, referring to the “Package Parameter
Explanations” (page 204), and the comments in the configuration file itself, for detailed
specifications for each parameter.

NOTE: Optional parameters are commented out in the configuration file (with a # at
the beginning of the line). In some cases these parameters have default values that will
take effect unless you uncomment the parameter (remove the #) and enter a valid value
different from the default. Read the surrounding comments in the file, and the
explanations in this chapter, to make sure you understand the implications both of
accepting and of changing a given default.
In all cases, be careful to uncomment each parameter you intend to use and assign it
the value you want it to have.

• package_name. Enter a unique name for this package. Note that there are stricter
formal requirements for the name as of A.11.18.
• package_type. Enter failover or multi_node. (system_multi_node is
reserved for special-purpose packages supplied by HP.) Note
that there are restrictions if another package depends on this package; see “About
Package Dependencies” (page 126).
See “Types of Package: Failover, Multi-Node, System Multi-Node” (page 198) for
more information.
• node_name. Enter the name of each cluster node on which this package can run,
with a separate entry on a separate line for each node.
• auto_run. For failover packages, enter yes to allow Serviceguard to start the package
on the first available node specified by node_name, and to automatically restart it



later if it fails. Enter no to keep Serviceguard from automatically starting the
package.
• node_fail_fast_enabled. Enter yes to cause the node to be halted (system halt) if the
package fails; otherwise enter no.
• run_script_timeout and halt_script_timeout. Enter the number of seconds Serviceguard
should wait for package startup or shutdown, respectively, to complete; or leave
the default, no_timeout. See (page 207).
• successor_halt_timeout. Used if other packages depend on this package; see “About
Package Dependencies” (page 126).
• script_log_file (page 208).
• log_level (page 208).
• failover_policy (page 209). Enter configured_node or min_package_node.
(This parameter can be set for failover packages only.)
• failback_policy (page 209). Enter automatic or manual.
(This parameter can be set for failover packages only.)
• If this package will depend on another package or packages, enter values for
dependency_name, dependency_condition, dependency_location, and optionally priority.
See “About Package Dependencies” (page 126) for more information.

NOTE: The package(s) this package depends on must already be part of the
cluster configuration by the time you validate this package (via cmcheckconf;
see “Verifying and Applying the Package Configuration” (page 227)); otherwise
validation will fail.

• To configure package weights, use the weight_name and weight_value parameters
(page 211). See “About Package Weights” (page 134) for more information.
• Use monitored_subnet to specify a subnet to be monitored for this package. If there
are multiple subnets, repeat the parameter as many times as needed, on a new line
each time.
In a cross-subnet configuration, configure the additional monitored_subnet_access
parameter for each monitored_subnet as necessary; see “About Cross-Subnet
Failover” (page 147) for more information.
• If your package will use relocatable IP addresses, enter the ip_subnet and ip_address
addresses. See the parameter descriptions (page 213) for rules and restrictions.
In a cross-subnet configuration, configure the additional ip_subnet_node parameter
for each ip_subnet as necessary; see “About Cross-Subnet Failover” (page 147) for
more information.



• For each service the package will run:
— enter the service_name (for example, a daemon or long-running process)
— enter the service_cmd (for example, the command that starts the process)
— enter values for service_fail_fast_enabled and service_halt_timeout if you need to
change them from their defaults.
— service_restart if you want the package to restart the service if it exits. (A value
of unlimited can be useful if you want the service to execute in a loop, rather
than exit and halt the package.)
Include a service entry for disk monitoring if the package depends on monitored
disks. Use entries similar to the following:
service_name cmresserviced_Pkg1
service_cmd "$SGBIN/cmresserviced /dev/sdd1"
service_restart none
See “Creating a Disk Monitor Configuration” (page 228) for more information.
• If the package needs to activate LVM volume groups, configure vgchange_cmd, or
leave the default.
• If the package needs to mount LVM volumes to file systems (other than Red Hat
GFS; see fs_type (page 218)), use the vg parameters to specify the names of the
volume groups to be activated, and select the appropriate vgchange_cmd.
Use the fs_ parameters (page 218) to specify the characteristics of file systems and
how and where to mount them. See the comments in the FILESYSTEMS section
of the configuration file for more information and examples.
Enter each volume group on a separate line, for example:
vg vg01
vg vg02
• If your package uses a large number of volume groups or disk groups, or mounts
a large number of file systems, consider increasing the values of the following
parameters:
— concurrent_fsck_operations—specifies the number of parallel fsck operations
that will be allowed at package startup (not used for Red Hat GFS).
— concurrent_mount_and_umount_operations—specifies the number of parallel
mount operations allowed during package startup and unmount operations
during package shutdown.
• Specify the filesystem mount and unmount retry options. For Red Hat GFS (see
fs_type (page 218)), use the default (zero).
• You can use the pev_ parameter to specify a variable to be passed to external scripts.
Make sure the variable name begins with the upper-case or lower-case letters pev
and an underscore ( _). You can specify more than one variable. See “About External
Scripts” (page 143), and the comments in the configuration file, for more information.



• If you want the package to run an external “pre-script” during startup and
shutdown, use the external_pre_script parameter (see (page 220)) to specify the full
pathname of the script, for example $SGCONF/pkg1/pre_script1.
• If the package will run an external script, use the external_script parameter (see
(page 220)) to specify the full pathname of the script, for example $SGCONF/pkg1/
script1.
See “About External Scripts” (page 143), and the comments in the configuration
file, for more information.
• Configure the Access Control Policy for up to eight specific users or any_user.
The only user role you can configure in the package configuration file is
package_admin for the package in question. Cluster-wide roles are defined in
the cluster configuration file. See “Setting up Access-Control Policies” (page 185)
for more information.

Verifying and Applying the Package Configuration


Serviceguard checks the configuration you enter and reports any errors.
Use a command such as the following to verify the content of the package configuration
file you have created, for example:
cmcheckconf -v -P $SGCONF/pkg1/pkg1.conf
Errors are displayed on the standard output. If necessary, re-edit the file to correct any
errors, then run cmcheckconf again until it completes without errors.
The following items are checked:
• Package name is valid, and at least one node_name entry is included.
• There are no duplicate parameter entries (except as permitted for multiple volume
groups, etc).
• Values for all parameters are within permitted ranges.
• Configured resources are available on cluster nodes.
• File systems and volume groups are valid.
• Services are executable.
• Any package that this package depends on is already part of the cluster
configuration.
When cmcheckconf has completed without errors, apply the package configuration,
for example:
cmapplyconf -P $SGCONF/pkg1/pkg1.conf
This adds the package configuration information to the binary cluster configuration
file in the $SGCONF directory and distributes it to all the cluster nodes.



NOTE: For modular packages, you now need to distribute any external scripts
identified by the external_pre_script and external_script parameters.
If you are accustomed to configuring legacy packages, note that you do not have
to create a separate package control script for a modular package, or distribute it
manually. (You do still have to do this for legacy packages; see “Configuring a Legacy
Package” (page 262).)

Adding the Package to the Cluster


You can add the new package to the cluster while the cluster is running, subject to the
value of max_configured_packages in the cluster configuration file. See “Adding a Package
to a Running Cluster” (page 273).

Creating a Disk Monitor Configuration


Serviceguard provides disk monitoring for the shared storage that is activated by
packages in the cluster. The monitor daemon on each node tracks the status of all the
disks on that node that you have configured for monitoring.
The configuration must be done separately for each node in the cluster, because each
node monitors only the group of disks that can be activated on that node, and that
depends on which packages are allowed to run on the node.
To set up monitoring, include a monitoring service in each package that uses disks you
want to track. Remember that service names must be unique across the cluster; you
can use the package name in combination with the string cmresserviced. The
following shows an entry in the package configuration file for pkg1:
service_name cmresserviced_pkg1
service_fail_fast_enabled yes
service_halt_timeout 300
service_cmd "cmresserviced /dev/sdd1 /dsv/sde1"
service_restart none

CAUTION: Because of a limitation in LVM, service_fail_fast_enabled must be set to yes,
forcing the package to fail over to another node if it loses its storage.

NOTE: The service_cmd entry must include the cmresserviced command. It is also
important to set service_restart to none.



7 Cluster and Package Maintenance
This chapter describes the cmviewcl command, then shows how to start and halt a
cluster or an individual node, how to perform permanent reconfiguration, and how to
start, halt, move, and modify packages during routine maintenance of the cluster.
Topics are as follows:
• Reviewing Cluster and Package Status
• Managing the Cluster and Nodes (page 239)
• Managing Packages and Services (page 242)
• Reconfiguring a Cluster (page 251)
• Configuring a Legacy Package (page 262)
• Reconfiguring a Package (page 271)
• Responding to Cluster Events (page 278)
• Single-Node Operation (page 279)
• Removing Serviceguard from a System (page 279)

Reviewing Cluster and Package Status


You can check status using Serviceguard Manager, or from a cluster node’s command
line.

Reviewing Cluster and Package Status with the cmviewcl Command


Information about cluster status is stored in the status database, which is maintained
on each individual node in the cluster. You can display information contained in this
database by means of the cmviewcl command:
cmviewcl -v
You can use the cmviewcl command without root access; in clusters running
Serviceguard version A.11.16 or later, grant access by assigning the Monitor role to the
users in question. In earlier versions, allow access by adding <nodename>
<nonrootuser> to the cmclnodelist file.
cmviewcl -v displays information about all the nodes and packages in a running
cluster, together with the settings of parameters that determine failover behavior.

TIP: Some commands take longer to complete in large configurations. In particular,
you can expect Serviceguard’s CPU usage to increase during cmviewcl -v as the
number of packages and services increases.
See the manpage for a detailed description of other cmviewcl options.



Viewing Package Dependencies
The cmviewcl -v command output lists dependencies throughout the cluster. For a
specific package’s dependencies, use the -p<pkgname> option.
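For example, to view full details, including dependencies, for a single package named pkg1 (the name is illustrative):
cmviewcl -v -p pkg1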

Cluster Status
The status of a cluster, as shown by cmviewcl, can be one of the following:
• up - At least one node has a running cluster daemon, and reconfiguration is not
taking place.
• down - No cluster daemons are running on any cluster node.
• starting - The cluster is in the process of determining its active membership.
At least one cluster daemon is running.
• unknown - The node on which the cmviewcl command is issued cannot
communicate with other nodes in the cluster.

Node Status and State


The status of a node is either up (active as a member of the cluster) or down (inactive in
the cluster), depending on whether its cluster daemon is running or not. Note that a
node might be down from the cluster perspective, but still up and running Linux.
A node may also be in one of the following states:
• Failed. A node never sees itself in this state. Other active members of the cluster
will see a node in this state if the node is no longer active in the cluster, but is not
shut down.
• Reforming. A node is in this state when the cluster is re-forming. The node is
currently running the protocols which ensure that all nodes agree to the new
membership of an active cluster. If agreement is reached, the status database is
updated to reflect the new cluster membership.
• Running. A node in this state has completed all required activity for the last
re-formation and is operating normally.
• Halted. A node never sees itself in this state. Other nodes will see it in this state
after the node has gracefully left the active cluster, for instance with a cmhaltnode
command.
• Unknown. A node never sees itself in this state. Other nodes assign a node this
state if it has never been an active cluster member.

Package Status and State


The status of a package can be one of the following:
• up - The package master control script is active.
• down - The package master control script is not active.



• start_wait - A cmrunpkg command is in progress for this package. The package
is waiting for packages it depends on (predecessors) to start before it can start.
• starting - The package is starting. The package master control script is running.
• halting - A cmhaltpkg command is in progress for this package and the halt
script is running.
• halt_wait - A cmhaltpkg command is in progress for this package. The package
is waiting to be halted, but the halt script cannot start because the package is
waiting for packages that depend on it (successors) to halt. The parameter
description for successor_halt_timeout (page 208) provides more information.
• failing - The package is halting because it, or a package it depends on, has failed.
• fail_wait - The package is waiting to be halted because the package or a package
it depends on has failed, but must wait for a package that depends on it to halt
before it can halt.
• relocate_wait - The package’s halt script has completed or Serviceguard is still
trying to place the package.
• reconfiguring — The node where this package is running is adjusting the
package configuration to reflect the latest changes that have been applied.
• reconfigure_wait — The node where this package is running is waiting to
adjust the package configuration to reflect the latest changes that have been applied.
• unknown - Serviceguard could not determine the status at the time cmviewcl
was run.
A system multi-node package is up when it is running on all the active cluster nodes.
A multi-node package is up if it is running on any of its configured nodes.
A system multi-node package can have a status of changing, meaning the package is
in transition on one or more active nodes.
The state of a package can be one of the following:
• starting - The package is starting. The package master control script is running.
• start_wait - A cmrunpkg command is in progress for this package. The package
is waiting for packages it depends on (predecessors) to start before it can start.
• running - Services are active and being monitored.
• halting - A cmhaltpkg command is in progress for this package and the halt
script is running.
• halt_wait - A cmhaltpkg command is in progress for this package. The package
is waiting to be halted, but the halt script cannot start because the package is
waiting for packages that depend on it (successors) to halt. The parameter
description for successor_halt_timeout (page 208) provides more information.
• halted - The package is down and halted.
• failing - The package is halting because it, or a package it depends on, has failed.



• fail_wait - The package is waiting to be halted because the package or a package
it depends on has failed, but must wait for a package it depends on to halt before
it can halt.
• failed - The package is down and failed.
• relocate_wait - The package’s halt script has completed or Serviceguard is still
trying to place the package.
• maintenance — The package is in maintenance mode; see “Maintaining a Package:
Maintenance Mode” (page 245).
• reconfiguring — The node where this package is running is adjusting the
package configuration to reflect the latest changes that have been applied.
• reconfigure_wait — The node where this package is running is waiting to
adjust the package configuration to reflect the latest changes that have been applied.
• unknown - Serviceguard could not determine the state at the time cmviewcl was
run.
The following states are possible only for multi-node packages:
• blocked - The package has never run on this node, either because a dependency
has not been met, or because auto_run is set to no.
• changing - The package is in a transient state, different from the status shown,
on some nodes. For example, a status of starting with a state of changing
would mean that the package was starting on at least one node, but in some other,
transitory condition (for example, failing) on at least one other node.

Package Switching Attributes


cmviewcl shows the following package switching information:
• AUTO_RUN: Can be enabled or disabled. For failover packages, enabled means
that the package starts when the cluster starts, and Serviceguard can switch the
package to another node in the event of failure.
For system multi-node packages, enabled means an instance of the package can
start on a new node joining the cluster (disabled means it will not).
• Switching Enabled for a Node: For failover packages, enabled means that the
package can switch to the specified node. disabled means that the package cannot
switch to the specified node until the node is enabled to run the package via the
cmmodpkg command.
Every failover package is marked enabled or disabled for each node that is
either a primary or adoptive node for the package.
For multi-node packages, node switching disabled means the package cannot
start on that node.



Service Status
Services have only status, as follows:
• Up. The service is being monitored.
• Down. The service is not running. It may not have started, or have halted or failed.
• Unknown. Serviceguard cannot determine the status.

Network Status
The network interfaces have only status, as follows:
• Up.
• Down.
• Unknown. Serviceguard cannot determine whether the interface is up or down.

Failover and Failback Policies


Failover packages can be configured with one of two values for the failover_policy
parameter (page 209), as displayed in the output of cmviewcl -v:
• configured_node. The package fails over to the next node in the node_name list
in the package configuration file (page 205).
• min_package_node. The package fails over to the node in the cluster with the
fewest running packages on it.
Failover packages can also be configured with one of two values for the failback_policy
parameter (page 209), and these are also displayed in the output of cmviewcl -v:
• automatic: Following a failover, a package returns to its primary node when the
primary node becomes available again.
• manual: Following a failover, a package will run on the adoptive node until moved
back to its original node by a system administrator.

Examples of Cluster and Package States


The following sample output from the cmviewcl -v command shows status for the
cluster in the sample configuration.

Normal Running Status


Everything is running normally; both nodes in the cluster are running, and the packages
are in their primary locations.
CLUSTER STATUS
example up
NODE STATUS STATE
ftsys9 up running

Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0



PRIMARY up eth1

PACKAGE STATUS STATE AUTO_RUN NODE


pkg1 up running enabled ftsys9

Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual

Script_Parameters:
ITEM STATUS MAX_RESTARTS RESTARTS NAME
Service up 0 0 service1
Subnet up 0 0 15.13.168.0

Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys9 (current)
Alternate up enabled ftsys10

NODE STATUS STATE


ftsys10 up running

Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1

PACKAGE STATUS STATE AUTO_RUN NODE


pkg2 up running enabled ftsys10

Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual

Script_Parameters:
ITEM STATUS MAX_RESTARTS RESTARTS NAME
Service up 0 0 service2
Subnet up 0 0 15.13.168.0

Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys10 (current)
Alternate up enabled ftsys9



NOTE: The Script_Parameters section of the PACKAGE output of cmviewcl
shows the Subnet status only for the node that the package is running on. In a
cross-subnet configuration, in which the package may be able to fail over to a node on
another subnet, that other subnet is not shown (see “Cross-Subnet Configurations”
(page 32)).

Quorum Server Status


If the cluster is using a quorum server for tie-breaking services, the display shows the
server name, state and status following the entry for each node, as in the following
excerpt from the output of cmviewcl -v:
CLUSTER STATUS
example up

NODE STATUS STATE


ftsys9 up running

Quorum Server Status:


NAME STATUS STATE
lp-qs up running
...

NODE STATUS STATE


ftsys10 up running

Quorum Server Status:


NAME STATUS STATE
lp-qs up running

Status After Halting a Package


After we halt pkg2 with the cmhaltpkg command, the output of cmviewcl -v is as
follows:
CLUSTER STATUS
example up

NODE STATUS STATE


ftsys9 up running

Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1

PACKAGE STATUS STATE AUTO_RUN NODE


pkg1 up running enabled ftsys9

Policy_Parameters:



POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual

Script_Parameters:
ITEM STATUS MAX_RESTARTS RESTARTS NAME
Service up 0 0 service1
Subnet up 0 0 15.13.168.0

Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys9 (current)
Alternate up enabled ftsys10

NODE STATUS STATE


ftsys10 up running

Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1

UNOWNED_PACKAGES

PACKAGE STATUS STATE AUTO_RUN NODE


pkg2 down unowned disabled unowned

Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual
Script_Parameters:
ITEM STATUS NODE_NAME NAME
Service down service2
Subnet up 15.13.168.0
Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys10
Alternate up enabled ftsys9
pkg2 now has the status down, and it is shown as unowned, with package switching
disabled. Note that switching is enabled for both nodes, however. This means that once
global switching is re-enabled for the package, it will attempt to start up on the primary
node.

Status After Moving the Package to Another Node


If we use the following command:
cmrunpkg -n ftsys9 pkg2



the output of the cmviewcl -v command is as follows:
CLUSTER STATUS
example up

NODE STATUS STATE


ftsys9 up running

Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1

PACKAGE STATUS STATE AUTO_RUN NODE


pkg1 up running enabled ftsys9

Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual

Script_Parameters:
ITEM STATUS MAX_RESTARTS RESTARTS NAME
Service up 0 0 service1
Subnet up 0 0 15.13.168.0

Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys9 (current)
Alternate up enabled ftsys10

PACKAGE STATUS STATE AUTO_RUN NODE


pkg2 up running disabled ftsys9

Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual

Script_Parameters:
ITEM STATUS MAX_RESTARTS RESTARTS NAME
Service up 0 0 service2
Subnet up 0 0 15.13.168.0

Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys10
Alternate up enabled ftsys9 (current)

NODE STATUS STATE


ftsys10 up running

Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1



Status After Package Switching is Enabled
The following command sets the package's AUTO_RUN setting back to enabled:
cmmodpkg -e pkg2
The output of the cmviewcl command is now as follows:
CLUSTER STATUS
example up

NODE STATUS STATE


ftsys9 up running

PACKAGE STATUS STATE AUTO_RUN NODE


pkg1 up running enabled ftsys9
pkg2 up running enabled ftsys9

NODE STATUS STATE


ftsys10 up running
Both packages are now running on ftsys9, and pkg2 is enabled for switching. ftsys10
is still running the cluster daemon, but no packages are running on it.

Status After Halting a Node


After halting ftsys10, with the following command:
cmhaltnode ftsys10
the output of cmviewcl is as follows on ftsys9:
CLUSTER STATUS
example up

NODE STATUS STATE


ftsys9 up running

PACKAGE STATUS STATE AUTO_RUN NODE


pkg1 up running enabled ftsys9
pkg2 up running enabled ftsys9

NODE STATUS STATE


ftsys10 down halted
This output can be seen on both ftsys9 and ftsys10.

Viewing Information about Unowned Packages


The following example shows packages that are currently unowned, that is, not running
on any configured node.
UNOWNED_PACKAGES

PACKAGE STATUS STATE AUTO_RUN NODE


PKG3 down halted enabled unowned



Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover min_package_node
Failback automatic

Script_Parameters:
ITEM STATUS NODE_NAME NAME
Subnet up manx 192.8.15.0
Subnet up burmese 192.8.15.0
Subnet up tabby 192.8.15.0
Subnet up persian 192.8.15.0

Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled manx
Alternate up enabled burmese
Alternate up enabled tabby
Alternate up enabled persian

Managing the Cluster and Nodes


This section describes the following tasks:
• Starting the Cluster When all Nodes are Down (page 240)
• Adding Previously Configured Nodes to a Running Cluster (page 240)
• Removing Nodes from Participation in a Running Cluster (page 241)
• Halting the Entire Cluster (page 242)
• Automatically Restarting the Cluster (page 242)
In Serviceguard A.11.16 and later, these tasks can be performed by non-root users with
the appropriate privileges. See Controlling Access to the Cluster (page 183) for more
information about configuring access.
You can use Serviceguard Manager or the Serviceguard command line to start or stop
the cluster, or to add or halt nodes. Starting the cluster means running the cluster
daemon on one or more of the nodes in a cluster. You use different Serviceguard
commands to start the cluster depending on whether all nodes are currently down
(that is, no cluster daemons are running), or whether you are starting the cluster daemon
on an individual node.
Note the distinction that is made in this chapter between adding an already configured
node to the cluster and adding a new node to the cluster configuration. An already
configured node is one that is already entered in the cluster configuration file; a new
node is added to the cluster by modifying the cluster configuration file.



NOTE: Manually starting or halting the cluster or individual nodes does not require
access to the quorum server, if one is configured. The quorum server is only used when
tie-breaking is needed following a cluster partition.

Starting the Cluster When all Nodes are Down


You can use Serviceguard Manager, or the cmruncl command as described in this
section, to start the cluster when all cluster nodes are down. Particular command options
can be used to start the cluster under specific circumstances.
The -v option produces the most informative output. The following starts all nodes
configured in the cluster without a connectivity check:
cmruncl -v
The -w option causes cmruncl to perform a full check of LAN connectivity among all
the nodes of the cluster. Omitting this option will allow the cluster to start more quickly
but will not test connectivity. The following starts all nodes configured in the cluster
with a connectivity check:
cmruncl -v -w
The -n option specifies a particular group of nodes. Without this option, all nodes will
be started. The following example starts up the locally configured cluster only on ftsys9
and ftsys10. (This form of the command should only be used when you are sure that
the cluster is not already running on any node.)
cmruncl -v -n ftsys9 -n ftsys10

CAUTION: HP Serviceguard cannot guarantee data integrity if you try to start a cluster
with the cmruncl -n command while a subset of the cluster's nodes are already
running a cluster. If the network connection is down between nodes, using cmruncl
-n might result in a second cluster forming, and this second cluster might start up the
same applications that are already running on the other cluster. The result could be
two applications overwriting each other's data on the disks.

Adding Previously Configured Nodes to a Running Cluster


You can use Serviceguard Manager, or HP Serviceguard commands as shown, to bring
a configured node up within a running cluster.
Use the cmrunnode command to add one or more nodes to an already running cluster.
Any node you add must already be a part of the cluster configuration. The following
example adds node ftsys8 to the cluster that was just started with only nodes ftsys9
and ftsys10. The -v (verbose) option prints out all the messages:
cmrunnode -v ftsys8



By default, cmrunnode will do network validation, making sure the actual network
setup matches the configured network setup. This is the recommended method. If you
have recently checked the network and find the check takes a very long time, you can
use the -w none option to bypass the validation.
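For example, the following sketch adds ftsys8 while skipping the validation with the -w none option described above (only do this if you have recently verified the network):
cmrunnode -v -w none ftsys8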
Since the node's cluster is already running, the node joins the cluster and packages
may be started, depending on the package configuration (see node_name (page 205)). If
the node does not find its cluster running, or the node is not part of the cluster
configuration, the command fails.

Removing Nodes from Participation in a Running Cluster


You can use Serviceguard Manager, or Serviceguard commands as shown below, to
remove nodes from operation in a cluster. This operation removes the node from cluster
operation by halting the cluster daemon, but it does not modify the cluster configuration.
To remove a node from the cluster configuration permanently, you must recreate the
cluster configuration file. See the next section.
Halting a node is a convenient way of bringing it down for system maintenance while
keeping its packages available on other nodes. After maintenance, the package can be
returned to its primary node. See “Moving a Failover Package ” (page 244).
To return a node to the cluster, use cmrunnode.

NOTE: HP recommends that you remove a node from participation in the cluster (by
running cmhaltnode as shown below, or Halt Node in Serviceguard Manager) before
running the Linux shutdown command, especially in cases in which a packaged
application might have trouble during shutdown and not halt cleanly.

Using Serviceguard Commands to Remove a Node from Participation in a Running Cluster


Use the cmhaltnode command to halt one or more nodes in a cluster. The cluster
daemon on the specified node stops, and the node is removed from active participation
in the cluster.
To halt a node with a running package, use the -f option. If a package was running
that can be switched to an adoptive node, the switch takes place and the package starts
on the adoptive node. For example, the following command causes the Serviceguard
daemon running on node ftsys9 in the sample configuration to halt and the package
running on ftsys9 to move to ftsys10:
cmhaltnode -f -v ftsys9
This halts any packages running on the node ftsys9 by executing the halt instructions
in each package's master control script. ftsys9 is halted and the packages start on the
adoptive node, ftsys10.
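After maintenance is complete, the node can be returned to the cluster with cmrunnode, as noted above; for example:
cmrunnode -v ftsys9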



Halting the Entire Cluster
You can use Serviceguard Manager, or Serviceguard commands as shown below, to
halt a running cluster.
The cmhaltcl command can be used to halt the entire cluster. This command causes
all nodes in a configured cluster to halt their HP Serviceguard daemons. You can use
the -f option to force the cluster to halt even when packages are running. This command
can be issued from any running node. Example:
cmhaltcl -f -v
This halts all the cluster nodes.

Automatically Restarting the Cluster


You can configure your cluster to automatically restart after an event, such as a
long-term power failure, which brought down all nodes in the cluster. This is done by
setting AUTOSTART_CMCLD to 1 in the $SGAUTOSTART file (see “Understanding the
Location of Serviceguard Files” (page 153)).
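As a minimal sketch, assuming the $SGAUTOSTART file uses simple shell-style variable assignments (check the file on your distribution for the exact form), the entry would be:
AUTOSTART_CMCLD=1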

Managing Packages and Services


This section describes the following tasks:
• Starting a Package (page 242)
• Halting a Package (page 243)
• Moving a Failover Package (page 244)
• Changing Package Switching Behavior (page 244)
Non-root users with the appropriate privileges can perform these tasks. See Controlling
Access to the Cluster (page 183) for information about configuring access.
You can use Serviceguard Manager or the Serviceguard command line to perform these
tasks.

Starting a Package
Ordinarily, a package configured as part of the cluster will start up on its primary node
when the cluster starts up. You may need to start a package manually after it has been
halted manually. You can do this either in Serviceguard Manager, or with Serviceguard
commands as described below.
The cluster must be running, and if the package is dependent on other packages, those
packages must be either already running, or started by the same command that starts
this package (see the subsection that follows, and “About Package Dependencies”
(page 126).)
You can use Serviceguard Manager to start a package, or Serviceguard commands as
shown below.



Use the cmrunpkg command to run the package on a particular node, then use the
cmmodpkg command to enable switching for the package; for example:
cmrunpkg -n ftsys9 pkg1
cmmodpkg -e pkg1
This starts up the package on ftsys9, then enables package switching. This sequence
is necessary when a package has previously been halted on some node, since halting
the package disables switching.

Starting a Package that Has Dependencies


Before starting a package, it is a good idea to use the cmviewcl command to check for
package dependencies.
You cannot start a package unless all the packages that it depends on are running. If
you try, you’ll see a Serviceguard message telling you why the operation failed, and
the package will not start.
If this happens, you can repeat the run command, this time including the package(s)
this package depends on; Serviceguard will start all the packages in the correct order.
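For example, if pkg1 depends on pkg2 (hypothetical package names), you could name both in one command, a sketch assuming both packages can run on ftsys9; Serviceguard starts them in the correct order:
cmrunpkg -n ftsys9 pkg2 pkg1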

Halting a Package
You halt a package when you want to stop the package but leave the node running.
Halting a package has a different effect from halting the node. When you halt the node,
its packages may switch to adoptive nodes (assuming that switching is enabled for
them); when you halt the package, it is disabled from switching to another node, and
must be restarted manually on another node or on the same node.
System multi-node packages run on all cluster nodes simultaneously; halting these
packages stops them running on all nodes. A multi-node package can run on several
nodes simultaneously; you can halt it on all the nodes it is running on, or you can
specify individual nodes.
You can use Serviceguard Manager to halt a package, or cmhaltpkg; for example:
cmhaltpkg pkg1
This halts pkg1 and disables it from switching to another node.

Halting a Package that Has Dependencies


Before halting a package, it is a good idea to use the cmviewcl command to check for
package dependencies.
You cannot halt a package unless all the packages that depend on it are down. If you
try, you’ll see a Serviceguard message telling you why the operation failed, and the
package will remain up.
If this happens, you can repeat the halt command, this time including the dependent
package(s); Serviceguard will halt all the packages in the correct order. First, use



cmviewcl to be sure that no other running package has a dependency on any of the
packages you are halting.
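For example, if pkg3 depends on pkg1 (hypothetical package names), a sketch of halting both with a single command, letting Serviceguard order the halts:
cmhaltpkg pkg3 pkg1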

Moving a Failover Package


You can use Serviceguard Manager to move a failover package from one node to
another, or Serviceguard commands as shown below.
Before you move a failover package to a new node, it is a good idea to run cmviewcl
-v -l package and look at dependencies. If the package has dependencies, be sure
they can be met on the new node.
To move the package, first halt it where it is running using the cmhaltpkg command.
This action not only halts the package, but also disables package switching.
After it halts, run the package on the new node using the cmrunpkg command, then
re-enable switching as described below.
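For example, the following sequence (a sketch using the commands shown elsewhere in this chapter) moves pkg1 from ftsys9 to ftsys10 and then re-enables switching:
cmhaltpkg pkg1
cmrunpkg -n ftsys10 pkg1
cmmodpkg -e pkg1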

Changing Package Switching Behavior


There are two options to consider:
• Whether the package can switch (fail over) or not.
• Whether the package can switch to a particular node or not.
For failover packages, if package switching is NO the package cannot move to any other
node; if node switching is NO, the package cannot move to that particular node. For
multi-node packages, if package switching is set to NO, the package cannot start on a
new node joining the cluster; if node switching is set to NO, the package cannot start
on that node.
Both node switching and package switching can be changed dynamically while the
cluster is running. The initial setting for package switching is determined by the auto_run
parameter, which is set in the package configuration file (page 206). If auto_run
is set to yes, then package switching is enabled when the package first starts. The initial
setting for node switching is to allow switching to all nodes that are configured to run
the package.
You can use Serviceguard Manager to change package switching behavior, or
Serviceguard commands as shown below.
You can change package switching behavior either temporarily or permanently using
Serviceguard commands.
To temporarily disable switching to other nodes for a running package, use the
cmmodpkg command. For example, if pkg1 is currently running, and you want to
prevent it from starting up on another node, enter the following:
cmmodpkg -d pkg1
This does not halt the package, but will prevent it from starting up elsewhere.



You can disable package switching to particular nodes by using the -n option of the
cmmodpkg command. The following prevents pkg1 from switching to node lptest3:
cmmodpkg -d -n lptest3 pkg1
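To allow the package to switch to that node again later, re-enable it with the -e option; for example:
cmmodpkg -e -n lptest3 pkg1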
To permanently disable switching so that the next time the cluster restarts, the change
you made in package switching is still in effect, change the auto_run flag in the package
configuration file, then re-apply the configuration. (See “Reconfiguring a Package on
a Running Cluster ” (page 272).)

Maintaining a Package: Maintenance Mode


Serviceguard A.11.19 provides two ways to perform maintenance on components of a
modular, failover package while the package is running. (See Chapter 6 (page 197) for
information about package types and modules.) These two methods are called
maintenance mode and partial-startup maintenance mode.
• Maintenance mode is chiefly useful for modifying networks while the package is
running.
See “Performing Maintenance Using Maintenance Mode” (page 248).
• Partial-startup maintenance mode allows you to work on package services, file
systems, and volume groups.
See “Performing Maintenance Using Partial-Startup Maintenance Mode” (page 249).
• Neither maintenance mode nor partial-startup maintenance mode can be used for
legacy packages, multi-node packages, or system multi-node packages.
• Package maintenance does not alter the configuration of the package, as specified
in the package configuration file.
For information about reconfiguring a package, see “Reconfiguring a Package”
(page 271).



NOTE: In order to run a package in partial-startup maintenance mode, you must first
put it in maintenance mode. This means that packages in partial-startup maintenance
mode share the characteristics described below for packages in maintenance mode,
and the same rules and dependency rules apply. Additional rules apply to partial-startup
maintenance mode, and the procedure involves more steps, as explained
under “Performing Maintenance Using Partial-Startup Maintenance Mode” (page 249).

Characteristics of a Package Running in Maintenance Mode or Partial-Startup Maintenance Mode

Serviceguard treats a package in maintenance mode differently from other packages
in important ways. The following points apply to a package running in maintenance
mode:
• Serviceguard ignores failures reported by package services, subnets, and file
systems; these will not cause the package to fail.

NOTE: But a failure in the package control script will cause the package to fail.
The package will also fail if an external script (or pre-script) cannot be executed
or does not exist.

• The package will not be automatically failed over, halted, or started.


• A package in maintenance mode still has its configured (or default) weight,
meaning that its weight, if any, is counted against the node's capacity; this applies
whether the package is up or down. (See “About Package Weights” (page 134) for
a discussion of weights and capacities.)
• Node-wide and cluster-wide events affect the package as follows:
— If the node the package is running on is halted or crashes, the package will no
longer be in maintenance mode but will not be automatically started.
— If the cluster is halted or crashes, the package will not be in maintenance mode
when the cluster comes back up. Serviceguard will attempt to start it if auto_run
is set to yes in the package configuration file.
• If node_fail_fast_enabled (page 206) is set to yes, Serviceguard will not halt the node
under any of the following conditions:
— Subnet failure
— A script does not exist or cannot run because of file permissions
— A script times out
— The restart count limit for a service is exceeded



Rules for a Package in Maintenance Mode or Partial-Startup Maintenance Mode

IMPORTANT: See the latest Serviceguard release notes for important information
about version requirements for package maintenance.
• The package must have package switching disabled before you can put it in
maintenance mode.
• You can put a package in maintenance mode only on one node.
— The node must be active in the cluster and must be eligible to run the package
(on the package's node_name list).
— If the package is not running, you must specify the node name when you run
cmmodpkg (1m) to put the package in maintenance mode.
— If the package is running, you can put it into maintenance only on the node on
which it is running.
— While the package is in maintenance mode on a node, you can run the package
only on that node.
• You cannot put a package in maintenance mode, or take it out of maintenance mode,
if doing so will cause another running package to halt.
• Since package failures are ignored while in maintenance mode, you can take a
running package out of maintenance mode only if the package is healthy.
Serviceguard checks the state of the package’s services and subnets to determine
if the package is healthy. If it is not, you must halt the package before taking it out
of maintenance mode.
• You cannot do online configuration as described under “Reconfiguring a Package”
(page 271).
• You cannot configure new dependencies involving this package; that is, you cannot
make it dependent on another package, or make another package depend on it.
See also “Dependency Rules for a Package in Maintenance Mode or Partial-Startup
Maintenance Mode ” (page 248).
• You cannot use the -t option of any command that operates on a package that is
in maintenance mode; see “Previewing the Effect of Cluster Changes” (page 252)
for information about the -t option.

Additional Rules for Partial-Startup Maintenance Mode


• You must halt the package before taking it out of partial-startup maintenance
mode.
• To run a package normally after running it in partial-startup maintenance mode,
you must take it out of maintenance mode, and then restart it.



Dependency Rules for a Package in Maintenance Mode or Partial-Startup Maintenance Mode
You cannot configure new dependencies involving a package running in maintenance
mode, and in addition the following rules apply (we'll call the package in maintenance
mode pkgA).
• The packages that depend on pkgA must be down and disabled when you place
pkgA in maintenance mode. This applies to all types of dependency (including
exclusionary dependencies) as described under “About Package Dependencies”
(page 126).
— You cannot enable a package that depends on pkgA.
— You cannot run a package that depends on pkgA, unless the dependent package
itself is in maintenance mode.
• Dependency rules governing packages that pkgA depends on to be UP are bypassed
so that these packages can halt and fail over as necessary while pkgA is in
maintenance mode.
• If both packages in a dependency relationship are in maintenance mode,
dependency rules are ignored for those two packages.
For example, both packages in an exclusionary dependency can be run and halted
in maintenance mode at the same time.

Performing Maintenance Using Maintenance Mode


You can put a package in maintenance mode, perform maintenance, and take it out of
maintenance mode, whether the package is down or running.
This mode is mainly useful for making modifications to networking components. To
modify other components of the package, such as services or storage, follow the
additional rules and instructions under “Performing Maintenance Using Partial-Startup
Maintenance Mode” (page 249).
If you want to reconfigure the package (using cmapplyconf (1m)) see “Reconfiguring
a Package” (page 271) and “Allowable Package States During Reconfiguration ”
(page 274).

Procedure
Follow these steps to perform maintenance on a package's networking components.
In this example, we'll call the package pkg1 and assume it is running on node1.
1. Place the package in maintenance mode:
cmmodpkg -m on -n node1 pkg1
2. Perform maintenance on the networks or resources and test manually that they
are working correctly.



NOTE: If you now run cmviewcl, you'll see that the STATUS of pkg1 is up and
its STATE is maintenance.

3. If everything is working as expected, take the package out of maintenance mode:


cmmodpkg -m off pkg1

Performing Maintenance Using Partial-Startup Maintenance Mode


To put a package in partial-startup maintenance mode, you put it in maintenance mode,
then restart it, running only those modules that you will not be working on.

Procedure
Follow this procedure to perform maintenance on a package. In this example, we'll
assume a package pkg1 is running on node1, and that we want to do maintenance on
the package's services.
1. Halt the package:
cmhaltpkg pkg1
2. Place the package in maintenance mode:
cmmodpkg -m on -n node1 pkg1

NOTE: The order of the first two steps can be reversed.

3. Run the package in maintenance mode.


In this example, we'll start pkg1 such that only the modules up to and including
the package_ip module are started. (See “Package Modules and Parameters”
(page 199) for a list of package modules. The modules used by a package are started
in the order shown near the top of its package configuration file.)
cmrunpkg -m sg/package_ip pkg1
4. Perform maintenance on the services and test manually that they are working
correctly.

NOTE: If you now run cmviewcl, you'll see that the STATUS of pkg1 is up and
its STATE is maintenance.

5. Halt the package:


cmhaltpkg pkg1



NOTE: You can also use cmhaltpkg -s, which stops the modules started by
cmrunpkg -m — in this case, all the modules up to and including package_ip.

6. Run the package to ensure everything is working correctly:


cmrunpkg pkg1

NOTE: The package is still in maintenance mode.

7. If everything is working as expected, bring the package out of maintenance mode:


cmmodpkg -m off pkg1
8. Restart the package:
cmrunpkg pkg1

Excluding Modules in Partial-Startup Maintenance Mode


In the example above, we used cmrunpkg -m to run all the modules up to and including
package_ip, but none of those after it. But you might want to run the entire package
apart from the module whose components you are going to work on. In this case you
can use the -e option:
cmrunpkg -e sg/service pkg1
This runs all the package's modules except the services module.
You can also use -e in combination with -m. This has the effect of starting all modules
up to and including the module identified by -m, except the module identified by -e.
In this case the excluded (-e) module must be earlier in the execution sequence (as
listed near the top of the package's configuration file) than the -m module. For example:
cmrunpkg -m sg/services -e sg/package_ip pkg1



NOTE: The full execution sequence for starting a package is:
1. The master control script itself
2. External pre-scripts
3. Volume groups
4. File systems
5. Package IPs
6. External scripts
7. Services

Reconfiguring a Cluster
You can reconfigure a cluster either when it is halted or while it is still running. Some
operations can only be done when the cluster is halted. The table that follows shows
the required cluster state for many kinds of changes.
Table 7-1 Types of Changes to the Cluster Configuration

Change to the Cluster Configuration: Required Cluster State

• Add a new node: All cluster nodes must be running.
• Delete a node: A node can be deleted even though it is unavailable or unreachable.
• Change Maximum Configured Packages: Cluster can be running.
• Change Quorum Server Configuration: Cluster can be running; see “What Happens when You Change the Quorum Configuration Online” (page 48).
• Change Cluster Lock Configuration (lock LUN): Cluster can be running. See “Updating the Cluster Lock LUN Configuration Online” (page 261) and “What Happens when You Change the Quorum Configuration Online” (page 48).
• Add NICs and their IP addresses to the cluster configuration: Cluster can be running. See “Changing the Cluster Networking Configuration while the Cluster Is Running” (page 257).
• Delete NICs and their IP addresses from the cluster configuration: Cluster can be running. See “Changing the Cluster Networking Configuration while the Cluster Is Running” (page 257).
• Change the designation of an existing interface from HEARTBEAT_IP to STATIONARY_IP, or vice versa: Cluster can be running. See “Changing the Cluster Networking Configuration while the Cluster Is Running” (page 257).
• Change an interface from IPv4 to IPv6, or vice versa: Cluster can be running. See “Changing the Cluster Networking Configuration while the Cluster Is Running” (page 257).
• Reconfigure IP addresses for a NIC used by the cluster: Must delete the interface from the cluster configuration, reconfigure it, then add it back into the cluster configuration. See “What You Must Keep in Mind” (page 258). Cluster can be running throughout.
• Change NETWORK_POLLING_INTERVAL: Cluster can be running.
• Change IP Monitor parameters (SUBNET, IP_MONITOR, POLLING TARGET): Cluster can be running. See the entries for these parameters under “Cluster Configuration Parameters” (page 105) for more information.
• Change MEMBER_TIMEOUT and AUTO_START_TIMEOUT: Cluster can be running.
• Change Access Control Policy: Cluster and package can be running.

Previewing the Effect of Cluster Changes


Many variables affect package placement, including the availability of cluster nodes;
the availability of networks and other resources on those nodes; failover and failback
policies; and package weights, dependencies, and priorities, if you have configured
them. You can preview the effect on packages of certain actions or events before they
actually occur.
For example, you might want to check to see if the packages are placed as you expect
when the cluster first comes up; or preview what happens to the packages running on
a given node if the node halts, or if the node is then restarted; or you might want to
see the effect on other packages if another, currently disabled, package is enabled, or
if a package halts and cannot restart because none of the nodes on its node_list is
available.
Serviceguard provides two ways to do this: you can use the preview mode of
Serviceguard commands, or you can use the cmeval (1m) command to simulate
different cluster states.
Alternatively, you might want to model changes to the cluster as a whole; cmeval
allows you to do this; see “Using cmeval” (page 254).

What You Can Preview


You can preview any of the following, or all of them simultaneously:
• Cluster bring-up (cmruncl)
• Cluster node state changes (cmrunnode, cmhaltnode)
• Package state changes (cmrunpkg, cmhaltpkg)
• Package movement from one node to another
• Package switching changes (cmmodpkg -e)



• Availability of package subnets, resources, and storage
• Changes in package priority, node order, dependency, failover and failback policy,
node capacity and package weight

Using Preview mode for Commands and in Serviceguard Manager


The following commands support the -t option, which allows you to run the command
in preview mode:
• cmhaltnode [–t] [–f] <node_name>
• cmrunnode [–t] <node_name>
• cmhaltpkg [–t] <package_name>
• cmrunpkg [–t] [-n node_name] <package_name>
• cmmodpkg { -e [-t] | -d } [-n node_name] <package_name>
• cmruncl –v [–t]

NOTE: You cannot use the -t option with any command operating on a package in
maintenance mode; see “Maintaining a Package: Maintenance Mode” (page 245).
For more information about these commands, see their respective manpages. You can
also perform these preview functions in Serviceguard Manager: check the Preview
[...] box for the action in question.
When you use the -t option, the command, rather than executing as usual, predicts
the results that would occur, sending a summary to $stdout. For example, assume
that pkg1 is a high-priority package whose primary node is node1, and which depends
on pkg2 and pkg3 to run on the same node. These are lower-priority packages which
are currently running on node2. pkg1 is down and disabled, and you want to see the
effect of enabling it:
cmmodpkg -e -t pkg1
You will see output something like this:
package:pkg3|node:node2|action:failing
package:pkg2|node:node2|action:failing
package:pkg2|node:node1|action:starting
package:pkg3|node:node1|action:starting
package:pkg1|node:node1|action:starting
cmmodpkg: Command preview completed successfully
This shows that pkg1, when enabled, will “drag” pkg2 and pkg3 to its primary node,
node1. It can do this because of its higher priority; see “Dragging Rules for Simple
Dependencies” (page 128). Running the preview confirms that all three packages will
successfully start on node1 (assuming conditions do not change between now and
when you actually enable pkg1, and there are no failures in the run scripts).



NOTE: The preview cannot predict run and halt script failures.
For more information about package dependencies and priorities, see “About Package
Dependencies” (page 126).

Using cmeval
You can use cmeval to evaluate the effect of cluster changes on Serviceguard packages.
You can also use it simply to preview changes you are considering making to the cluster
as a whole.
You can use cmeval safely in a production environment; it does not affect the state of
the cluster or packages. Unlike command preview mode (the -t discussed above)
cmeval does not require you to be logged in to the cluster being evaluated, and in fact
that cluster does not have to be running, though it must use the same Serviceguard
release and patch version as the system on which you run cmeval.
Use cmeval rather than command preview mode when you want to see more than
the effect of a single command, and especially when you want to see the results of
large-scale changes, or changes that may interact in complex ways, such as changes to
package priorities, node order, dependencies and so on.
Using cmeval involves three major steps:
1. Use cmviewcl -v -f line to write the current cluster configuration out to a
file.
2. Edit the file to include the events or changes you want to preview
3. Using the file from Step 2 as input, run cmeval to preview the results of the
changes.
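As a compact sketch of these three steps (newstate.in is just an example file name; edit the file between the first and last commands to reflect the changes you want to evaluate):
cmviewcl -v -f line > newstate.in
cmeval -v newstate.in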
For example, assume that pkg1 is a high-priority package whose primary node is
node1, and which depends on pkg2 and pkg3 to be running on the same node. These
lower-priority packages are currently running on node2. pkg1 is down and disabled,
and you want to see the effect of enabling it.
In the output of cmviewcl -v -f line, you would find the line
package:pkg1|autorun=disabled and change it to
package:pkg1|autorun=enabled. You should also make sure that the nodes the
package is configured to run on are shown as available; for example:
package:pkg1|node:node1|available=yes. Then save the file (for example as
newstate.in) and run cmeval:
cmeval -v newstate.in
You would see output something like this:
package:pkg3|node:node2|action:failing
package:pkg2|node:node2|action:failing
package:pkg2|node:node1|action:starting



package:pkg3|node:node1|action:starting
package:pkg1|node:node1|action:starting
This shows that pkg1, when enabled, will “drag” pkg2 and pkg3 to its primary node,
node1. It can do this because of its higher priority; see “Dragging Rules for Simple
Dependencies” (page 128). Running cmeval confirms that all three packages will
successfully start on node1 (assuming conditions do not change between now and
when you actually enable pkg1, and there are no failures in the run scripts.)

NOTE: cmeval cannot predict run and halt script failures.


This is a simple example; you can use cmeval for much more complex scenarios; see
“What You Can Preview” (page 252).

IMPORTANT: For detailed information and examples, see the cmeval (1m) manpage.

Reconfiguring a Halted Cluster


You can make a permanent change in cluster configuration when the cluster is halted.
This procedure must be used for changes marked “Cluster must not be running” in
Table 7-1, but it can be used for any other cluster configuration changes as well.
Use the following steps:
1. Halt the cluster on all nodes.
2. On one node, reconfigure the cluster as described in “Building an HA Cluster
Configuration” (page 153). You can use cmgetconf to generate a template file,
which you then edit.
3. Make sure that all nodes listed in the cluster configuration file are powered up
and accessible. Use cmapplyconf to copy the binary cluster configuration file to
all nodes. This file overwrites any previous version of the binary cluster
configuration file.
4. Use cmruncl to start the cluster on all nodes, or on a subset of nodes.
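A command sketch of these steps follows (cluster and file names are examples; edit clconfig.conf after generating it, and note that cmcheckconf is an optional verification step before applying):
cmhaltcl -f -v
cmgetconf -c cluster1 clconfig.conf
cmcheckconf -C clconfig.conf
cmapplyconf -C clconfig.conf
cmruncl -v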

Reconfiguring a Running Cluster


You can add new nodes to the cluster configuration or delete nodes from the cluster
configuration while the cluster is up and running. Note the following, however:
• You cannot remove an active node from the cluster. You must halt the node first.
• The only configuration change allowed while a node is unreachable (for example,
completely disconnected from the network) is to delete the unreachable node from
the cluster configuration. If there are also packages that depend upon that node,
the package configuration must also be modified to delete the node. This all must
be done in one configuration request (cmapplyconf command).
• The access control list for the cluster can be changed while the cluster is running.
Changes to the package configuration are described in a later section.



The following sections describe how to perform dynamic reconfiguration tasks.

Adding Nodes to the Configuration While the Cluster is Running


Use the following procedure to add a node. For this example, nodes ftsys8 and
ftsys9 are already configured in a running cluster named cluster1, and you are
adding node ftsys10.

NOTE: Before you start, make sure you have configured access to ftsys10 as
described under “Configuring Root-Level Access” (page 155).
1. Use the following command to store a current copy of the existing cluster
configuration in a temporary file in case you need to revert to it:
cmgetconf -C temp.conf
2. Specify a new set of nodes to be configured and generate a template of the new
configuration (all on one line):
cmquerycl -C clconfig.conf -c cluster1 -n ftsys8 -n ftsys9
-n ftsys10
3. Edit clconfig.conf to check the information about the new node.
4. Verify the new configuration:
cmcheckconf -C clconfig.conf
5. Apply the changes to the configuration and send the new binary configuration
file to all cluster nodes:
cmapplyconf -C clconfig.conf
Use cmrunnode to start the new node, and, if you so decide, set the
AUTOSTART_CMCLD parameter to 1 in the $SGAUTOSTART file (see “Understanding
the Location of Serviceguard Files” (page 153)) to enable the new node to join the cluster
automatically each time it reboots.

Removing Nodes from the Cluster while the Cluster Is Running


You can use Serviceguard Manager to delete nodes, or Serviceguard commands as
shown below. The following restrictions apply:
• The node must be halted. See “Removing Nodes from Participation in a Running
Cluster” (page 241).
• If the node you want to delete is unreachable (disconnected from the LAN, for
example), you can delete the node only if there are no packages which specify the
unreachable node. If there are packages that depend on the unreachable node, halt
the cluster; see “Halting the Entire Cluster ” (page 242).



Use the following procedure to delete a node with Serviceguard commands. In this
example, nodes ftsys8, ftsys9 and ftsys10 are already configured in a running
cluster named cluster1, and you are deleting node ftsys10.

NOTE: If you want to remove a node from the cluster, run the cmapplyconf
command from another node in the same cluster. If you try to issue the command on
the node you want removed, you will get an error message.
1. Use the following command to store a current copy of the existing cluster
configuration in a temporary file:
cmgetconf -c cluster1 temp.conf
2. Specify the new set of nodes to be configured (omitting ftsys10) and generate
a template of the new configuration:
cmquerycl -C clconfig.conf -c cluster1 -n ftsys8 -n ftsys9
3. Edit the file clconfig.conf to check the information about the nodes that remain
in the cluster.
4. Halt the node you are going to remove (ftsys10 in this example):
cmhaltnode -f -v ftsys10
5. Verify the new configuration:
cmcheckconf -C clconfig.conf
6. From ftsys8 or ftsys9, apply the changes to the configuration and distribute
the new binary configuration file to all cluster nodes:
cmapplyconf -C clconfig.conf

NOTE: If you are trying to remove an unreachable node on which many packages
are configured to run, you may see the following message:
The configuration change is too large to process while the cluster is running.
Split the configuration change into multiple requests or halt the cluster.

In this situation, you must halt the cluster to remove the node.

Changing the Cluster Networking Configuration while the Cluster Is Running


What You Can Do
Online operations you can perform include:
• Add a network interface and its HEARTBEAT_IP or STATIONARY_IP.
• Delete a network interface and its HEARTBEAT_IP or STATIONARY_IP.
• Change a HEARTBEAT_IP or STATIONARY_IP interface from IPv4 to IPv6, or
vice versa.



• Change the designation of an existing interface from HEARTBEAT_IP to
STATIONARY_IP, or vice versa.
• Change the NETWORK_POLLING_INTERVAL.
• Change IP Monitor parameters: SUBNET, IP_MONITOR, POLLING TARGET; see
the entries for these parameters under “Cluster Configuration Parameters”
(page 105) for more information.
• A combination of any of these in one transaction (cmapplyconf), given the
restrictions below.

What You Must Keep in Mind


The following restrictions apply:
• You must not change the configuration of all heartbeats at one time, or change or
delete the only configured heartbeat.
At least one working heartbeat must remain unchanged.
• You cannot add interfaces or modify their characteristics unless those interfaces,
and all other interfaces in the cluster configuration, are healthy.
There must be no bad NICs or non-functional or locally switched subnets in the
configuration, unless you are deleting those components in the same operation.
• You cannot change the designation of an existing interface from HEARTBEAT_IP
to STATIONARY_IP, or vice versa, without also making the same change to all peer
network interfaces on the same subnet on all other nodes in the cluster.
Similarly, you cannot change an interface from IPv4 to IPv6 without also making
the same change to all peer network interfaces on the same subnet on all other
nodes in the cluster.
• You cannot change the designation of an interface from STATIONARY_IP to
HEARTBEAT_IP unless the subnet is common to all nodes.
Remember that the HEARTBEAT_IP must be an IPv4 address, and must be on the
same subnet on all nodes, except in cross-subnet configurations; see “Cross-Subnet
Configurations” (page 32)).
• You cannot delete a subnet or IP address from a node while a package that uses
it (as a monitored_subnet, ip_subnet, or ip_address) is configured to run on that node.
Information about these parameters begins at monitored_subnet (page 212).
• You cannot change the IP configuration of an interface (NIC) used by the cluster
in a single transaction (cmapplyconf).
You must first delete the NIC from the cluster configuration, then reconfigure the
NIC (using ifconfig, for example), then add the NIC back into the cluster.



Examples of when you must do this include:
— moving a NIC from one subnet to another
— adding an IP address to a NIC
— removing an IP address from a NIC

CAUTION: Do not add IP addresses to network interfaces that are configured into
the Serviceguard cluster, unless those IP addresses themselves will be immediately
configured into the cluster as stationary IP addresses. If you configure any address
other than a stationary IP address on a Serviceguard network interface, it could collide
with a relocatable package address assigned by Serviceguard.
Some sample procedures follow.

Example: Adding a Heartbeat LAN


Suppose that a subnet 15.13.170.0 is shared by nodes ftsys9 and ftsys10 in a
two-node cluster cluster1, and you want to add it to the cluster configuration as a
heartbeat subnet. Proceed as follows.
1. Run cmquerycl to get a cluster configuration template file that includes
networking information for interfaces that are available to be added to the cluster
configuration:
cmquerycl -c cluster1 -C clconfig.conf

NOTE: As of Serviceguard A.11.18, cmquerycl -c produces output that includes
commented-out entries for interfaces that are not currently part of the cluster
configuration, but are available.
The networking portion of the resulting clconfig.conf file looks something
like this:
NODE_NAME ftsys9
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.18
#NETWORK_INTERFACE lan0
#STATIONARY_IP 15.13.170.18
NETWORK_INTERFACE lan3
NODE_NAME ftsys10
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.19
#NETWORK_INTERFACE lan0
#STATIONARY_IP 15.13.170.19
NETWORK_INTERFACE lan3
2. Edit the file to uncomment the entries for the subnet that is being added (lan0 in
this example), and change STATIONARY_IP to HEARTBEAT_IP:



NODE_NAME ftsys9
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.18
NETWORK_INTERFACE lan0
HEARTBEAT_IP 15.13.170.18
NETWORK_INTERFACE lan3
NODE_NAME ftsys10
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.19
NETWORK_INTERFACE lan0
HEARTBEAT_IP 15.13.170.19
NETWORK_INTERFACE lan3
3. Verify the new configuration:
cmcheckconf -C clconfig.conf
4. Apply the changes to the configuration and distribute the new binary configuration
file to all cluster nodes:
cmapplyconf -C clconfig.conf
If you were configuring the subnet for data instead, and wanted to add it to a package
configuration, you would now need to:
1. Halt the package
2. Add the new networking information to the package configuration file
3. In the case of a legacy package, add the new networking information to the package
control script if necessary
4. Apply the new package configuration, and redistribute the control script if
necessary.
For more information, see “Reconfiguring a Package on a Running Cluster ” (page 272).

Example: Deleting a Subnet Used by a Package


In this example, we are deleting subnet 15.13.170.0 (lan0). Proceed as follows.
1. Halt any package that uses this subnet and delete the corresponding networking
information (monitored_subnet, ip_subnet, ip_address; see the descriptions for these
parameters starting with monitored_subnet (page 212)).
See “Reconfiguring a Package on a Running Cluster ” (page 272) for more
information.
2. Run cmquerycl to get the cluster configuration file:
cmquerycl -c cluster1 -C clconfig.conf
3. Comment out the network interfaces lan0 and lan3 and their IP addresses,
if any, on all affected nodes. The networking portion of the resulting file looks
something like this:



NODE_NAME ftsys9
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.18
# NETWORK_INTERFACE lan0
# STATIONARY_IP 15.13.170.18
# NETWORK_INTERFACE lan3
NODE_NAME ftsys10
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.19
# NETWORK_INTERFACE lan0
# STATIONARY_IP 15.13.170.19
# NETWORK_INTERFACE lan3
4. Verify the new configuration:
cmcheckconf -C clconfig.conf
5. Apply the changes to the configuration and distribute the new binary configuration
file to all cluster nodes:
cmapplyconf -C clconfig.conf

Updating the Cluster Lock LUN Configuration Online


Proceed as follows.

IMPORTANT: See “What Happens when You Change the Quorum Configuration
Online” (page 48) for important information.
1. In the cluster configuration file, modify the value of CLUSTER_LOCK_LUN for
each node.
2. Run cmcheckconf to check the configuration.
3. Run cmapplyconf to apply the configuration.
If you need to replace the physical device, see “Replacing a Lock LUN” (page 283).
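As a minimal sketch, assume the edited cluster configuration file is clconfig.conf and the new lock LUN is /dev/sdc1 (a hypothetical device path). Each node's entry in the file would contain a line such as:
CLUSTER_LOCK_LUN /dev/sdc1
Then verify and apply the configuration:
cmcheckconf -C clconfig.conf
cmapplyconf -C clconfig.conf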

Changing MAX_CONFIGURED_PACKAGES
As of Serviceguard A.11.18, you can change MAX_CONFIGURED_PACKAGES while
the cluster is running. The default for MAX_CONFIGURED_PACKAGES is the maximum
number allowed in the cluster. You can use Serviceguard Manager to change
MAX_CONFIGURED_PACKAGES, or Serviceguard commands as shown below.
Use the cmgetconf command to obtain a current copy of the cluster's existing
configuration, for example:
cmgetconf -C <cluster_name> clconfig.conf
Edit the clconfig.conf file to include the new value for
MAX_CONFIGURED_PACKAGES. Then use the cmcheckconf command to verify
the new configuration. Using the -k or -K option can significantly reduce the response
time.
Use the cmapplyconf command to apply the changes to the configuration and send
the new configuration file to all cluster nodes. Using -k or -K can significantly reduce
the response time.
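A command sketch follows (cluster and file names are examples; edit MAX_CONFIGURED_PACKAGES in clconfig.conf between the first and second commands, and note that -k is the response-time option mentioned above):
cmgetconf -c cluster1 clconfig.conf
cmcheckconf -k -C clconfig.conf
cmapplyconf -k -C clconfig.conf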

Configuring a Legacy Package

IMPORTANT: You can still create a new legacy package. If you are using a Serviceguard
Toolkit such as Serviceguard NFS Toolkit, consult the documentation for that product.
Otherwise, use this section to maintain and re-work existing legacy packages rather
than to create new ones. The method described in Chapter 6: “Configuring Packages
and Their Services ” (page 197), is simpler and more efficient for creating new packages,
allowing packages to be built from smaller modules, and eliminating the separate
package control script and the need to distribute it manually.
If you decide to convert a legacy package to a modular package, see “Migrating a
Legacy Package to a Modular Package” (page 272). Do not attempt to convert
Serviceguard Toolkit packages.

Creating or modifying a legacy package requires the following broad steps:


1. Generate the package configuration file
2. Edit the package configuration file
3. Generate the package control script
4. Edit the package control script
5. Distribute the control script to the cluster nodes
6. Apply the package configuration file
Each of these tasks is described in the sub-sections that follow.

Creating the Legacy Package Configuration


The package configuration process defines a set of application services that are run by
the package manager when a package starts up on a node in the cluster. The
configuration also includes a prioritized list of cluster nodes on which the package can
run together with definitions of the acceptable types of failover allowed for the package.

Using Serviceguard Manager to Configure a Package


You can create a legacy package and its control script in Serviceguard Manager; use
the Help for detailed instructions.

Using Serviceguard Commands to Configure a Package


Use the following procedure to create a legacy package.



1. Create a subdirectory for each package you are configuring in the $SGCONF
directory:
mkdir $SGCONF/pkg1
You can use any directory names you like. (See “Understanding the Location of
Serviceguard Files” (page 153) for the name of Serviceguard directories on your
version of Linux.)
2. Generate a package configuration file for each package, for example:
cmmakepkg -p $SGCONF/pkg1/pkg1.conf
You can use any file name you like for the configuration file.
3. Edit each configuration file to specify package name, prioritized list of nodes (with
39 bytes or less in the name), the location of the control script, and failover
parameters for each package. Include the data you recorded on the Package
Configuration Worksheet.

Configuring a Package in Stages


It is a good idea to configure failover packages in stages, as follows:
1. Configure volume groups and mount points only.
2. Distribute the control script to all nodes.
3. Apply the configuration.
4. Run the package and ensure that it can be moved from node to node.
5. Halt the package.
6. Configure package IP addresses and application services in the control script.
7. Distribute the control script to all nodes.
8. Run the package and ensure that applications run as expected and that the package
fails over correctly when services are disrupted.
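
For example, steps 2 through 5 above might use commands such as the following; the package, file, and node names are hypothetical:
scp $SGCONF/pkg1/control.sh ftsys10:$SGCONF/pkg1/control.sh
cmapplyconf -P $SGCONF/pkg1/pkg1.conf
cmrunpkg -n ftsys9 pkg1
cmviewcl -v
cmhaltpkg pkg1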

Editing the Package Configuration File


Edit the file you generated in step 2 of “Using Serviceguard Commands to Configure
a Package ” (page 262). Use the bullet points that follow as a checklist.
• PACKAGE_TYPE. Enter the package type; see “Types of Package: Failover,
Multi-Node, System Multi-Node” (page 198) and “package_type” (page 205).



NOTE: For modular packages, the default form for parameter names and literal
values in the package configuration file is lower case; for legacy packages the
default is upper case. There are no compatibility issues; Serviceguard is
case-insensitive as far as the parameter names are concerned.
Because this section is intended to be used primarily when you are reconfiguring an
existing legacy package, we use the legacy parameter names (in upper case)
for the sake of continuity. But if you generate the configuration file using cmmakepkg
or cmgetconf, you will see the parameter names as they appear in modular
packages; see the notes below and the “Package Parameter Explanations” (page 204)
for details of the name changes.

• FAILOVER_POLICY. For failover packages, enter the failover_policy (page 209).


• FAILBACK_POLICY. For failover packages, enter the failback_policy (page 209).
• NODE_NAME. Enter the node or nodes on which the package can run; as described
under node_name (page 205).
• AUTO_RUN. Configure the package to start up automatically or manually; as
described under auto_run (page 206).
• NODE_FAIL_FAST_ENABLED. Enter the policy as described under
node_fail_fast_enabled (page 206).
• RUN_SCRIPT and HALT_SCRIPT. Specify the pathname of the package control
script (described in the next section). No default is provided. Permissions on the
file and directory should be set to rwxr-xr-x or r-xr-xr-x (755 or 555).
(Script timeouts): Enter the run_script_timeout (page 207) and halt_script_timeout
(page 207).
SCRIPT_LOG_FILE (optional). Specify the full pathname of the file where the
RUN_SCRIPT and HALT_SCRIPT will log messages. If you do not specify a path,
Serviceguard will create a file with “.log” appended to each script path, and put
the messages in that file.
• If your package has relocatable IP addresses, enter the SUBNET if you want it to
be monitored (this means the package will stop if the subnet fails).
This must be a subnet that is already specified in the cluster configuration, and it
can be either an IPv4 or an IPv6 subnet. It must not be a link-local subnet (link-local
package IPs are not allowed). See monitored_subnet (page 212).



IMPORTANT: Each subnet specified here must already be specified in the cluster
configuration file via the NETWORK_INTERFACE parameter and either the
HEARTBEAT_IP or STATIONARY_IP parameter. See “Cluster Configuration
Parameters ” (page 105) for more information.
See also “Stationary and Relocatable IP Addresses and Monitored Subnets”
(page 71) and monitored_subnet (page 212).
IMPORTANT: For cross-subnet configurations, see “Configuring Cross-Subnet
Failover” (page 269).

• If your package runs services, enter the SERVICE_NAME as described under


service_name (page 214) and values for SERVICE_FAIL_FAST_ENABLED as described
under service_fail_fast_enabled (page 216) and SERVICE_HALT_TIMEOUT as
described under service_halt_timeout (page 216). Enter a group of these three for
each service.

IMPORTANT: Note that the rules for valid SERVICE_NAMEs are more restrictive
as of Serviceguard A.11.18.

• ACCESS_CONTROL_POLICY. You can grant a non-root user PACKAGE_ADMIN


privileges for this package.
See the entries for user_name, user_host, and user_role, starting with user_name (page 221), and
“Controlling Access to the Cluster” (page 183), for more information.
• If the package will depend on another package, enter values for
DEPENDENCY_NAME, DEPENDENCY_CONDITION, and
DEPENDENCY_LOCATION.
For more information, see the corresponding parameter descriptions starting on
(page 210), and “About Package Dependencies” (page 126).
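
To illustrate several of the items above (monitored subnet, service definitions, access control, and dependencies), the corresponding entries in a legacy configuration file might look like the following; the subnet, service, user, and package names are hypothetical:
SUBNET 192.168.1.0
SERVICE_NAME pkg1_service
SERVICE_FAIL_FAST_ENABLED NO
SERVICE_HALT_TIMEOUT 300
USER_NAME user1
USER_HOST ftsys9
USER_ROLE PACKAGE_ADMIN
DEPENDENCY_NAME pkg2_dep
DEPENDENCY_CONDITION pkg2 = UP
DEPENDENCY_LOCATION SAME_NODE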

Creating the Package Control Script


For legacy packages, the package control script contains all the information necessary
to run all the services in the package, monitor them during operation, react to a failure,
and halt the package when necessary. You can use Serviceguard Manager, Serviceguard
commands, or a combination of both, to create or modify the package control script.
Each package must have a separate control script, which must be executable.
For security reasons, the control script must reside in a directory with the string
cmcluster in the path. The control script is placed in the package directory and is
given the same name as specified in the RUN_SCRIPT and HALT_SCRIPT parameters
in the package configuration file. The package control script template contains both
the run instructions and the halt instructions for the package. You can use a single
script for both run and halt operations, or, if you wish, you can create separate scripts.



Use cmmakepkg to create the control script, then edit the control script. Use the
following procedure to create the template for the sample failover package pkg1.
First, generate a control script template, for example:
cmmakepkg -s $SGCONF/pkg1/pkg1.sh
Next, customize the script; see “Customizing the Package Control Script ”.

Customizing the Package Control Script


You need to customize as follows; see the relevant entries under “Package Parameter
Explanations” (page 204) for more discussion.
• Update the PATH statement to reflect any required paths needed to start your
services.
• Specify the Remote Data Replication Method and Software RAID Data Replication
method if necessary.

CAUTION: If you are not using the XDC or CLX products, do not modify the
REMOTE DATA REPLICATION DEFINITION section. If you are using one of these
products, consult the product’s documentation.

• If you are using LVM, enter the names of volume groups to be activated using the
VG[] array parameters, and select the appropriate options for the storage activation
command, including options for mounting and unmounting file systems, if
necessary. Specify the file system type (ext2 is the default; ext3, reiserfs, or
gfs can also be used; see the fs_ parameter descriptions starting with
fs_mount_retry_count (page 217) for more information).
• Add the names of logical volumes and the file system that will be mounted on
them.
• Specify the filesystem mount and unmount retry options.
• If your package uses a large number of volume groups or disk groups or mounts
a large number of file systems, consider increasing the number of concurrent
vgchange, mount/umount, and fsck operations.
• Define IP subnet and IP address pairs for your package. IPv4 or IPv6 addresses
are allowed.
• Add service name(s).
• Add service command(s).
• Add a service restart parameter, if you so decide.
For more information about services, see the discussion of the service_ parameters
starting with service_name (page 214).
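
As an illustration, the storage, networking, and service entries described above might look like the following in the control script; the volume group, mount point, address, and service names are hypothetical, and you should check the template generated by cmmakepkg -s for the exact variable names on your release:
VG[0]=vg01
LV[0]=/dev/vg01/lvol1
FS[0]=/mnt/pkg1data
FS_TYPE[0]="ext3"
IP[0]=192.168.1.200
SUBNET[0]=192.168.1.0
SERVICE_NAME[0]="pkg1_service"
SERVICE_CMD[0]="/usr/local/bin/pkg1_monitor"
SERVICE_RESTART[0]="-r 2"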



Adding Customer Defined Functions to the Package Control Script
You can add additional shell commands to the package control script to be executed
whenever the package starts or stops. Enter these commands in the CUSTOMER DEFINED
FUNCTIONS area of the script.
If your package needs to run short-lived processes, such as commands to initialize or
halt a packaged application, you can also run these from the CUSTOMER DEFINED
FUNCTIONS.
You can also use the CUSTOMER DEFINED FUNCTIONS to determine why a package
has shut down; see “Determining Why a Package Has Shut Down” (page 147).
An example of this portion of the script follows, showing the date and echo commands
logging starts and halts of the package to a file.
# START OF CUSTOMER DEFINED FUNCTIONS

# This function is a place holder for customer defined functions.


# You should define all actions you want to happen here, before the service is
# started. You can create as many functions as you need.

function customer_defined_run_cmds
{
# ADD customer defined run commands.
: # do nothing instruction, because a function must contain some command.
date >> /tmp/pkg1.datelog
echo 'Starting pkg1' >> /tmp/pkg1.datelog
test_return 51
}

# This function is a place holder for customer defined functions.


# You should define all actions you want to happen here, before the service is
# halted.

function customer_defined_halt_cmds
{
# ADD customer defined halt commands.
: # do nothing instruction, because a function must contain some command.
date >> /tmp/pkg1.datelog
echo 'Halting pkg1' >> /tmp/pkg1.datelog
test_return 52
}

# END OF CUSTOMER DEFINED FUNCTIONS

Adding Serviceguard Commands in Customer Defined Functions


You can add Serviceguard commands (such as cmmodpkg) in the Customer Defined
Functions section of a package control script. These commands must not interact with
the package itself.
If a Serviceguard command interacts with another package, be careful to avoid command
loops. For instance, a command loop might occur under the following circumstances.
Suppose pkg1 does a cmmodpkg -d of pkg2, and pkg2 does a cmmodpkg -d of pkg1.
If both pkg1 and pkg2 start at the same time, pkg1 tries to cmmodpkg pkg2. However,
that cmmodpkg command has to wait for pkg2 startup to complete. pkg2 tries to



cmmodpkg pkg1, but pkg2 has to wait for pkg1 startup to complete, thereby causing
a command loop.
To avoid this situation, it is a good idea to always specify a RUN_SCRIPT_TIMEOUT
and a HALT_SCRIPT_TIMEOUT for all packages, especially packages that use
Serviceguard commands in their control scripts. If a timeout is not specified and your
configuration has a command loop as described above, inconsistent results can occur,
including a hung cluster.
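For example, you might set explicit timeouts in the package configuration file as follows; the values are illustrative only and should be chosen to suit your applications:
RUN_SCRIPT_TIMEOUT 600
HALT_SCRIPT_TIMEOUT 600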

Support for Additional Products


The package control script template provides exits for use with additional products,
including Serviceguard Extended Distance Cluster (XDC) for Linux. Refer to the
additional product’s documentation for details about how to create a package using
the hooks that are provided in the control script.

Verifying the Package Configuration


Serviceguard checks the configuration you create and reports any errors.
For legacy packages, you can do this in Serviceguard Manager: click Check to verify
the package configuration you have done under any package configuration tab, or to
check changes you have made to the control script. Click Apply to verify the package
as a whole. See the local Help for more details.
If you are using the command line, use the following command to verify the content
of the package configuration you have created:
cmcheckconf -v -P $SGCONF/pkg1/pkg1.conf
Errors are displayed on the standard output. If necessary, edit the file to correct any
errors, then run the command again until it completes without errors.
The following items are checked (whether you use Serviceguard Manager or
cmcheckconf command):
• Package name is valid, and at least one NODE_NAME entry is included.
• There are no duplicate parameter entries.
• Values for parameters are within permitted ranges.
• Run and halt scripts exist on all nodes in the cluster and are executable.
• Run and halt script timeouts are less than 4294 seconds.
• Configured resources are available on cluster nodes.
• If a dependency is configured, the dependency package must already be configured
in the cluster.

Distributing the Configuration


You can use Serviceguard Manager or Linux commands to distribute the binary cluster
configuration file among the nodes of the cluster.



Distributing the Configuration And Control Script with Serviceguard Manager
When you have finished creating a package in Serviceguard Manager, click Apply
Configuration. If the package configuration has no errors, it is converted to a binary
file and distributed to the cluster nodes.

Copying Package Control Scripts with Linux commands

IMPORTANT: In a cross-subnet configuration, you cannot use the same package


control script on all nodes if the package uses relocatable IP addresses. See “Configuring
Cross-Subnet Failover” (page 269).
Use Linux commands to copy package control scripts from the node where you created
the files, to the same pathname on all nodes which can possibly run the package. Use
your favorite method of file transfer (e.g., scp or ftp). For example, from ftsys9,
you can issue the scp command to copy the package control script to ftsys10:
scp $SGCONF/pkg1/control.sh ftsys10:$SGCONF/pkg1/control.sh

Distributing the Binary Cluster Configuration File with Linux Commands


Use the following steps from the node on which you created the cluster and package
configuration files:
• Verify that the configuration file is correct. Use the following command:
cmcheckconf -C $SGCONF/cmcl.conf -P $SGCONF/pkg1/pkg1.conf
• Generate the binary configuration file and distribute it across the nodes.
cmapplyconf -v -C $SGCONF/cmcl.conf -P $SGCONF/pkg1/pkg1.conf
The cmapplyconf command creates a binary version of the cluster configuration file
and distributes it to all nodes in the cluster. This action ensures that the contents of the
file are consistent across all nodes.

NOTE: You must use cmcheckconf and cmapplyconf again any time you make
changes to the cluster and package configuration files.

Configuring Cross-Subnet Failover


To configure a legacy package to fail over across subnets (see “Cross-Subnet
Configurations” (page 32)), you need to do some additional configuration.

NOTE: You cannot use Serviceguard Manager to configure cross-subnet packages.


Suppose that you want to configure a package, pkg1, so that it can fail over among all
the nodes in a cluster comprising NodeA, NodeB, NodeC, and NodeD.



NodeA and NodeB use subnet 15.244.65.0, which is not used by NodeC and NodeD;
and NodeC and NodeD use subnet 15.244.56.0, which is not used by NodeA and
NodeB. (See “Obtaining Cross-Subnet Information” (page 180) for sample cmquerycl
output).

Configuring node_name
First you need to make sure that pkg1 will fail over to a node on another subnet only
if it has to. For example, if it is running on NodeA and needs to fail over, you want it
to try NodeB, on the same subnet, before incurring the cross-subnet overhead of failing
over to NodeC or NodeD.
Assuming nodeA is pkg1’s primary node (where it normally starts), create node_name
entries in the package configuration file as follows:
node_name nodeA
node_name nodeB
node_name nodeC
node_name nodeD

Configuring monitored_subnet_access
In order to monitor subnet 15.244.65.0 or 15.244.56.0, you would configure
monitored_subnet and monitored_subnet_access in pkg1’s package configuration file as
follows:
monitored_subnet 15.244.65.0
monitored_subnet_access PARTIAL
monitored_subnet 15.244.56.0
monitored_subnet_access PARTIAL

NOTE: Configuring monitored_subnet_access as FULL (or not configuring


monitored_subnet_access) for either of these subnets will cause the package configuration
to fail, because neither subnet is available on all the nodes.

Creating Subnet-Specific Package Control Scripts


Now you need to create control scripts to run the package on the four nodes.



IMPORTANT: In a cross-subnet configuration, you cannot share a single package
control script among nodes on different subnets if you are using relocatable IP addresses.
In this case you will need to create a separate control script to be used by the nodes on
each subnet.
In our example, you would create two copies of pkg1’s package control script, add
entries to customize it for subnet 15.244.65.0 or 15.244.56.0, and copy one of
the resulting scripts to each node, as follows.

Control-script entries for nodeA and nodeB


IP[0]=15.244.65.82
SUBNET[0]=15.244.65.0
IP[1]=15.244.65.83
SUBNET[1]=15.244.65.0

Control-script entries for nodeC and nodeD


IP[0]=15.244.56.100
SUBNET[0]=15.244.56.0
IP[1]=15.244.56.101
SUBNET[1]=15.244.56.0

Reconfiguring a Package
You reconfigure a package in much the same way as you originally configured it; for
modular packages, see Chapter 6: “Configuring Packages and Their Services ” (page 197);
for older packages, see “Configuring a Legacy Package” (page 262).
The cluster can be either halted or running during package reconfiguration, and in
some cases the package itself can be running; the types of change you can make and
the times when they take effect depend on whether the package is running or not.
If you reconfigure a package while it is running, it is possible that the package could
fail later, even if the cmapplyconf succeeded.
For example, consider a package with two volume groups. When this package started,
it activated both volume groups. While the package is running, you could change its
configuration to list only one of the volume groups, and cmapplyconf would succeed.
If you then issue the cmhaltpkg command, however, the halt would fail: the modified package
would not deactivate both of the volume groups it had activated at startup, because
it would see only the one volume group in its current configuration file.
For more information, see “Allowable Package States During Reconfiguration ”
(page 274).



Migrating a Legacy Package to a Modular Package
The Serviceguard command cmmigratepkg automates the process of migrating legacy
packages to modular packages as far as possible. Many, but not all, packages can be
migrated in this way; for details, see the white paper Package Migration from Legacy Style
to Modular Style at http://docs.hp.com -> High Availability ->
Serviceguard -> White papers.
Do not attempt to convert Serviceguard Toolkit packages.

NOTE: The cmmigratepkg command requires Perl version 5.8.3 or higher on the
system on which you run the command.
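
As a sketch only (the option letters shown here are assumptions; check the cmmigratepkg (1m) manpage for the exact syntax on your release), migrating a legacy package pkg1 might look like this:
cmmigratepkg -p pkg1 -o $SGCONF/pkg1/pkg1_modular.conf
cmcheckconf -P $SGCONF/pkg1/pkg1_modular.conf
cmapplyconf -P $SGCONF/pkg1/pkg1_modular.conf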

Reconfiguring a Package on a Running Cluster


You can reconfigure a package while the cluster is running, and in some cases you can
reconfigure the package while the package itself is running; see “Allowable Package
States During Reconfiguration ” (page 274). You can do this in Serviceguard Manager
(for legacy packages), or use Serviceguard commands.
To modify the package with Serviceguard commands, use the following procedure
(pkg1 is used as an example):
1. Halt the package if necessary:
cmhaltpkg pkg1
See “Allowable Package States During Reconfiguration” (page 274) to determine whether
this step is needed.
2. If it is not already available, you can obtain a copy of the package's configuration
file by using the cmgetconf command, specifying the package name.
cmgetconf -p pkg1 pkg1.conf
3. Edit the package configuration file.

IMPORTANT: Restrictions on package names, dependency names, and service


names have become more stringent as of A.11.18. Packages that have or contain
names that do not conform to the new rules (spelled out under “package_name”
(page 204)) will continue to run, but if you reconfigure these packages, you will
need to change the names that do not conform; cmcheckconf and cmapplyconf
will enforce the new rules.

4. Verify your changes as follows:


cmcheckconf -v -P pkg1.conf



5. Distribute your changes to all nodes:
cmapplyconf -v -P pkg1.conf
6. If this is a legacy package, copy the package control script to all nodes that can run
the package.

Reconfiguring a Package on a Halted Cluster


You can also make permanent changes in the package configuration while the cluster
is not running. Use the same steps as in “Reconfiguring a Package on a Running Cluster
”.

Adding a Package to a Running Cluster


You can create a new package and add it to the cluster configuration while the cluster
is up and while other packages are running. The number of packages you can add is
subject to the value of MAX_CONFIGURED_PACKAGES in the cluster configuration
file.
To create the package, follow the steps in the chapter Chapter 6: “Configuring Packages
and Their Services ” (page 197). Then use a command such as the following to verify
the configuration of the newly created pkg1 on a running cluster:
cmcheckconf -P $SGCONF/pkg1/pkg1conf.conf
Use a command such as the following to distribute the new package configuration to
all nodes in the cluster:
cmapplyconf -P $SGCONF/pkg1/pkg1conf.conf
If this is a legacy package, remember to copy the control script to the $SGCONF/pkg1
directory on all nodes that can run the package.

Deleting a Package from a Running Cluster


Serviceguard will not allow you to delete a package if any other package is dependent
on it. To check for dependencies, use cmviewcl -v -l <package>. System
multi-node packages cannot be deleted from a running cluster.
You can use Serviceguard Manager to delete the package.
On the Serviceguard command line, you can (in most cases) delete a package from all
cluster nodes by using the cmdeleteconf command. This removes the package
information from the binary configuration file on all the nodes in the cluster. The
command can only be executed when the package is down; the cluster can be up.
The following example halts the failover package mypkg and removes the package
configuration from the cluster:
cmhaltpkg mypkg
cmdeleteconf -p mypkg
The command prompts for a verification before deleting the files unless you use the
-f option. The directory $SGCONF/mypkg is not deleted by this command.



Resetting the Service Restart Counter
The service restart counter tracks the number of times a package service has been
automatically restarted. This value is used to determine when the package service has
exceeded its maximum number of allowable automatic restarts.
When a package service successfully restarts after several attempts, the package manager
does not automatically reset the restart count. You can reset the counter online using
cmmodpkg -R -s, for example:
cmmodpkg -R -s myservice pkg1
This sets the counter back to zero. The current value of the restart counter appears in
the output of cmviewcl -v.

Allowable Package States During Reconfiguration


In many cases, you can make changes to a package’s configuration while the package
is running. The table that follows shows exceptions — cases in which the package must
not be running, or in which the results might not be what you expect — as well as
differences between modular and legacy packages.
In general, you have greater scope for online changes to a modular than to a legacy
package. In some cases, though, the capability of legacy packages has been upgraded
to match that of modular packages as far as possible; these cases are shown in the table.
For more information about legacy and modular packages, see Chapter 6 (page 197).

NOTE: If neither legacy nor modular is called out under “Change to the Package”, the
“Required Package State” applies to both types of package. Changes that are allowed,
but which HP does not recommend, are labeled “should not be running”.

IMPORTANT: Actions not listed in the table can be performed for both types of package
while the package is running.
In all cases the cluster can be running, and packages other than the one being
reconfigured can be running. And remember too that you can make changes to package
configuration files at any time; but do not apply them (using cmapplyconf or
Serviceguard Manager) to a running package in the cases indicated in the table.



NOTE: All the nodes in the cluster must be powered up and accessible when you
make package configuration changes.
Table 7-2 Types of Changes to Packages
(Each entry shows the change to the package, followed by the required package state.)

• Delete a package
  Package must not be running.
  NOTE: You cannot delete a package if another package has a dependency on it.

• Change package type
  Package must not be running.

• Add or delete a module: modular package
  Package can be running.

• Change run script contents: legacy package
  Package can be running, but should not be starting. Timing problems may occur
  if the script is changed while the package is starting.

• Change halt script contents: legacy package
  Package can be running, but should not be halting. Timing problems may occur
  if the script is changed while the package is halting.

• Add or delete a service: modular package
  Package can be running. Serviceguard treats any change to service_name or
  service_cmd as deleting the existing service and adding a new one, meaning
  that the existing service is halted.

• Add or delete a service: legacy package
  Package must not be running.

• Change service_restart: modular package
  Package can be running. Serviceguard will not allow the change if the new
  value is less than the current restart count. (You can use cmmodpkg -R -s
  <service_name> <package> to reset the restart count if you need to.)

• Change SERVICE_RESTART: legacy package
  Package must not be running.

• Add or remove a SUBNET (in control script): legacy package
  Package must not be running. (Also applies to cross-subnet configurations.)
  The subnet must already be configured into the cluster.

• Add or remove an ip_subnet: modular package
  Package can be running. See “ip_subnet” (page 213) for important information.
  Serviceguard will reject the change if you are trying to add an ip_subnet that
  is not configured on all the nodes on the package's node_name list.

• Add or remove an ip_address: modular package
  Package can be running. See “ip_subnet” (page 213) and “ip_address” (page 214)
  for important information. Serviceguard will reject the change if you are
  trying to add an ip_address that cannot be configured on the specified
  ip_subnet, or is on a subnet that is not configured on all the nodes on the
  package's node_name list.

• Add or remove an IP (in control script): legacy package
  Package must not be running. (Also applies to cross-subnet configurations.)

• Add or delete nodes from package's ip_subnet_node list (page 214) in
  cross-subnet configurations
  Package can be running. Serviceguard will reject the change if you are trying
  to add a node on which the specified ip_subnet is not configured.

• Add or remove monitoring for a subnet: monitored_subnet for a modular package
  or SUBNET (in the package configuration file) for a legacy package
  Package can be running. Serviceguard will not allow the change if the subnet
  being added is down, as that would cause the running package to fail.

• Add, change, or delete a pv: modular package
  Package must not be running.
  NOTE: pv (page 219) is for use by HP partners only.

• Add a volume group: modular package
  Package can be running.

• Add a volume group: legacy package
  Package must not be running.

• Remove a volume group: modular package
  Package should not be running.
  CAUTION: Removing a volume group may cause problems if file systems,
  applications or other resources are using it. In addition, the CAUTION under
  “Remove a file system: modular package” applies to any file systems using the
  volume group.

• Remove a volume group: legacy package
  Package must not be running.

• Change a file system: modular package
  Package should not be running (unless you are only changing fs_umount_opt).
  Changing file-system options other than fs_umount_opt may cause problems
  because the file system must be unmounted (using the existing fs_umount_opt)
  and remounted with the new options; the CAUTION under “Remove a file system:
  modular package” applies in this case as well. If only fs_umount_opt is being
  changed, the file system will not be unmounted; the new option will take
  effect when the package is halted or the file system is unmounted for some
  other reason.

• Add a file system: modular package
  Package can be running.

• Add or change a file system: legacy package
  Package must not be running.

• Remove a file system: modular package
  Package should not be running.
  CAUTION: Removing a file system may cause problems if the file system cannot
  be unmounted because it's in use by a running process. In this case
  Serviceguard kills the process; this could cause the package to fail.

• Remove a file system: legacy package
  Package must not be running.

• Change concurrent_fsck_operations, concurrent_mount_and_umount_operations,
  fs_mount_retry_count, fs_umount_retry_count: modular package
  Package can be running. These changes in themselves will not cause any file
  system to be unmounted.

• Change concurrent_fsck_operations, concurrent_mount_and_umount_operations,
  fs_mount_retry_count, fs_umount_retry_count: legacy package
  Package must not be running.

• Add, change, or delete external scripts and pre-scripts: modular package
  Package can be running. Changes take effect when applied, whether or not the
  package is running. If you add a script, Serviceguard validates it and then
  (if there are no errors) runs it when you apply the change. If you delete a
  script, Serviceguard stops it when you apply the change.

• Change package auto_run
  Package can be either running or halted. See “Choosing Switching and Failover
  Behavior” (page 125).

• Add or delete a configured dependency
  Both packages can be either running or halted. Special rules apply to packages
  in maintenance mode; see “Dependency Rules for a Package in Maintenance Mode
  or Partial-Startup Maintenance Mode” (page 248).
  For dependency purposes, a package being reconfigured is considered to be UP.
  This means that if pkgA depends on pkgB, and pkgA is down and pkgB is being
  reconfigured, pkgA will run if it becomes eligible to do so, even if pkgB's
  reconfiguration is not yet complete.
  HP recommends that you separate package dependency changes from changes that
  affect resources and services that the newly dependent package will also
  depend on; reconfigure the resources and services first and apply the changes,
  then configure the package dependency.
  For more information see “About Package Dependencies” (page 126).

Changes that Will Trigger Warnings


Changes to the following will trigger warnings, giving you a chance to cancel, if the
change would cause the package to fail.

NOTE: You will not be able to cancel if you use cmapplyconf -f.
• Package nodes
• Package dependencies
• Package weights (and also node capacity, defined in the cluster configuration file)
• Package priority
• auto_run
• failback_policy

Responding to Cluster Events


Serviceguard does not require much ongoing system administration intervention. As
long as there are no failures, your cluster will be monitored and protected. In the event
of a failure, those packages that you have designated to be transferred to another node
will be transferred automatically. Your ongoing responsibility as the system
administrator will be to monitor the cluster and determine whether a package transfer has
occurred. If a transfer has occurred, you must determine the cause and take corrective
action.



The typical corrective actions to take in the event of a package transfer include:
• Determining when a transfer has occurred.
• Determining the cause of a transfer.
• Repairing any hardware failures.
• Correcting any software problems.
• Restarting nodes.
• Transferring packages back to their original nodes.
• Enabling package switching.

Single-Node Operation
In a multi-node cluster, you could have a situation in which all but one node has failed,
or you have shut down all but one node, leaving your cluster in single-node operation.
This remaining node will probably have applications running on it. As long as the
Serviceguard daemon cmcld is active, other nodes can rejoin the cluster.
If the Serviceguard daemon fails when the cluster is in single-node operation, it will
leave the single node up and your applications running.

NOTE: This means that Serviceguard itself is no longer running.


It is not necessary to halt the single node in this scenario, since the application is still
running, and no other node is currently available for package switching. (This is different
from the loss of the Serviceguard daemon in a multi-node cluster, which halts the node
(system reset), and causes packages to be switched to adoptive nodes.)
You should not try to restart Serviceguard, since data corruption might occur if another
node were to attempt to start up a new instance of the application that is still running
on the single node.
Instead of restarting the cluster, choose an appropriate time to shut down the
applications and reboot the node; this will allow Serviceguard to restart the cluster
after the reboot.

Removing Serviceguard from a System


If you want to remove Serviceguard from a node permanently, use the rpm -e
command to delete the software.

CAUTION: Remove the node from the cluster first. If you run the rpm -e command
on a server that is still a member of a cluster, it will cause that cluster to halt and
the cluster configuration to be deleted.
To remove Serviceguard:



1. If the node is an active member of a cluster, halt the node first.
2. If the node is included in a cluster configuration, remove the node from the
configuration.
3. If you are removing Serviceguard from more than one node, run rpm -e on one
node at a time.
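
A sketch of this sequence on one node follows; the node name is hypothetical, and the exact RPM package name depends on your Serviceguard release (you can check it with rpm -qa | grep -i serviceguard):
cmhaltnode -f ftsys10
# Remove ftsys10 from the cluster configuration file and re-apply it, then:
rpm -e serviceguard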



8 Troubleshooting Your Cluster
This chapter describes how to verify cluster operation, how to review cluster status,
how to add and replace hardware, and how to solve some typical cluster problems.
Topics are as follows:
• Testing Cluster Operation
• Monitoring Hardware (page 282)
• Replacing Disks (page 283)
• Replacing LAN Cards (page 285)
• Replacing a Failed Quorum Server System (page 286)
• Troubleshooting Approaches (page 288)
• Solving Problems (page 291)

Testing Cluster Operation


Once you have configured your Serviceguard cluster, you should verify that the various
components of the cluster behave correctly in case of a failure. In this section, the
following procedures test that the cluster responds properly in the event of a package
failure, a node failure, or a LAN failure.

CAUTION: In testing the cluster in the following procedures, be aware that you are
causing various components of the cluster to fail, so that you can determine that the
cluster responds correctly to failure situations. As a result, the availability of nodes and
applications may be disrupted.

Testing the Package Manager


To test that the package manager is operating correctly, perform the following procedure
for each package on the cluster:
1. Obtain the PID number of a service in the package by entering
ps -ef | grep <service_cmd>
where service_cmd is the executable specified in the package configuration file
(or legacy control script) by means of the service_cmd parameter (page 215). The
service selected must have the default service_restart value (none).
2. To kill the service_cmd PID, enter
kill <PID>
3. To view the package status, enter
cmviewcl -v



The package should be running on the specified adoptive node.
4. Halt the package, then move it back to the primary node using the cmhaltpkg,
cmmodpkg, and cmrunpkg commands:
cmhaltpkg <PackageName>
cmmodpkg -e -n <PrimaryNode> <PackageName>
cmrunpkg -v <PackageName>
Depending on the specific databases you are running, perform the appropriate
database recovery.

Testing the Cluster Manager


To test that the cluster manager is operating correctly, perform the following steps for
each node on the cluster:
1. Turn off the power to the node.
2. To observe the cluster reforming, enter the following command on some other
configured node:
cmviewcl -v
You should be able to observe that the powered down node is halted, and that its
packages have been correctly switched to other nodes.
3. Turn on the power to the node.
4. To verify that the node is rejoining the cluster, enter the following command on
any configured node:
cmviewcl -v
The node should be recognized by the cluster, but its packages should not be
running.
5. Move the packages back to the original node:
cmhaltpkg <pkgname>
cmmodpkg -e -n <originalnode> <pkgname>
cmrunpkg <pkgname>
Depending on the specific databases you are running, perform the appropriate
database recovery.
6. Repeat this procedure for all nodes in the cluster one at a time.

Monitoring Hardware
Good standard practice in handling a high availability system includes careful fault
monitoring so as to prevent failures if possible or at least to react to them swiftly when
they occur. For information about disk monitoring, see “Creating a Disk Monitor



Configuration” (page 228). In addition, the following should be monitored for errors
or warnings of all kinds:
• CPUs
• Memory
• LAN cards
• Power sources
• All cables
• Disk interface cards
Some monitoring can be done through simple physical inspection, but for the most
comprehensive monitoring, you should examine the system log file
(/var/log/messages) periodically for reports on all configured HA devices. The
presence of errors relating to a device will show the need for maintenance.
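
For example, a simple periodic check of the system log for error and warning messages might look like the following; adjust the patterns and log location to your distribution:
grep -iE 'error|warn' /var/log/messages | tail -50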

Replacing Disks
The procedure for replacing a faulty disk mechanism depends on the type of disk
configuration you are using. Refer to your Smart Array documentation for issues related
to your Smart Array.

Replacing a Faulty Mechanism in a Disk Array


You can replace a failed disk mechanism by simply removing it from the array and
replacing it with a new mechanism of the same type. The resynchronization is handled
by the array itself. There may be some impact on disk performance until the
resynchronization is complete. For details on the process of hot plugging disk
mechanisms, refer to your disk array documentation.

Replacing a Lock LUN


You can replace an unusable lock LUN while the cluster is running. You can do this
without any cluster reconfiguration if you do not change the devicefile name; or, if you
do need to change the devicefile, you can do the necessary reconfiguration while the
cluster is running.



If you need to use a different devicefile, you must change the name of the devicefile in
the cluster configuration file; see “Updating the Cluster Lock LUN Configuration
Online” (page 261).

CAUTION: Before you start, make sure that all nodes have logged a message such as
the following in syslog:
WARNING: Cluster lock LUN /dev/sda1 is corrupt: bad label. Until
this situation is corrected, a single failure could cause all
nodes in the cluster to crash.

Once all nodes have logged this message, use a command such as the following to
specify the new cluster lock LUN:
cmdisklock reset /dev/sda1

CAUTION: You are responsible for determining that the device is not being used by
LVM or any other subsystem on any node connected to the device before using
cmdisklock. If you use cmdisklock without taking this precaution, you could lose
data.

NOTE: cmdisklock is needed only when you are repairing or replacing a lock LUN;
see the cmdisklock (1m) manpage for more information.
Serviceguard checks the lock LUN every 75 seconds. After using the cmdisklock
command, check the syslog file of an active cluster node; within 75 seconds
you should see a message showing that the lock disk is healthy again.

Revoking Persistent Reservations after a Catastrophic Failure


For information about persistent reservations (PR) and how they work, see “About
Persistent Reservations” (page 86).
Under normal circumstances, Serviceguard clears all persistent reservations when a
package halts. In the case of a catastrophic cluster failure however, you may need to
do the cleanup yourself as part of the recovery. Use the
$SGCONF/scripts/sg/pr_cleanup script to do this. (The script is also in
$SGCONF/bin/. See “Understanding the Location of Serviceguard Files” (page 153)
for the locations of Serviceguard directories on various Linux distributions.)
Invoke the script as follows, specifying either the device special file (DSF) of a LUN,
or a file containing a list of DSF names:



pr_cleanup lun -v -k <key> [-f <filename_path> | <list of DSFs>]
• lun, if used, specifies that a LUN, rather than a volume group, is to be operated
on.
• -v, if used, specifies verbose output detailing the actions the script performs and
their status.
• -k <key>, if used, specifies the key to be used in the clear operation.
• -f <filename_path>, if used, specifies that the name of the DSFs to be operated
on are listed in the file specified by <filename_path>. Each DSF must be listed
on a separate line.
• <list of DSFs> specifies one or more DSFs on the command line, if -f
<filename_path> is not used.

Examples
The following command clears all the PR reservations registered with the key abc12
on the set of LUNs listed in the file /tmp/pr_device_list:
pr_cleanup -k abc12 lun -f /tmp/pr_device_list
pr_device_list contains entries such as the following:
/dev/sdb1
/dev/sdb2
Alternatively you could enter the device-file names on the command line:
pr_cleanup -k abc12 lun /dev/sdb1 /dev/sdb2
The next command clears all the PR reservations registered with the PR key abcde on
the underlying LUNs of the volume group vg01:
pr_cleanup -k abcde vg01

NOTE: Because the keyword lun is not included, the device is assumed to be a volume
group.

Replacing LAN Cards


If you need to replace a LAN card, use the following steps. It is not necessary to bring
the cluster down to do this.
1. Halt the node using the cmhaltnode command.
2. Shut down the system:
shutdown -h
Then power off the system.
3. Remove the defective LAN card.
4. Install the new LAN card. The new card must be exactly the same card type, and
it must be installed in the same slot as the card you removed.



5. Power up the system.
6. As the system comes up, the kudzu program on Red Hat systems will detect and
report the hardware changes. Accept the changes and add any information needed
for the new LAN card. On SUSE systems, run YAST2 after the system boots and
make adjustments to the NIC setting of the new LAN card. If the old LAN card
was part of a “bond”, the new LAN card needs to be made part of the bond. See
“Implementing Channel Bonding (Red Hat)” (page 159) or “Implementing Channel
Bonding (SUSE)” (page 162).
7. If necessary, add the node back into the cluster using the cmrunnode command.
(You can omit this step if the node is configured to join the cluster automatically.)
Now Serviceguard will detect that the MAC address (LLA) of the card has changed
from the value stored in the cluster binary configuration file, and it will notify the other
nodes in the cluster of the new MAC address. The cluster will operate normally after
this.
HP recommends that you update the new MAC address in the cluster binary
configuration file by re-applying the cluster configuration. Use the following steps for
online reconfiguration:
1. Use the cmgetconf command to obtain a fresh ASCII configuration file, as follows:
cmgetconf config.conf

2. Use the cmapplyconf command to apply the configuration and copy the new
binary file to all cluster nodes:
cmapplyconf -C config.conf

This procedure updates the binary file with the new MAC address and thus avoids
data inconsistency between the outputs of the cmviewconf and ifconfig commands.

Replacing a Failed Quorum Server System


When a quorum server fails or becomes unavailable to the clusters it is providing
quorum services for, this does not in itself cause any cluster to fail. However, the loss of
the quorum server does increase the vulnerability of the clusters in case there is an
additional failure. Use the following procedure to replace a defective quorum server
system. If you use this procedure, you do not need to change the configuration of any
cluster nodes.



IMPORTANT: Make sure you read the latest version of the HP Serviceguard Quorum
Server Release Notes before you proceed. You can find them at:
http://www.docs.hp.com -> High Availability -> Quorum Server. You
should also consult the Quorum Server white papers at the same location.
1. Remove the old quorum server system from the network.
2. Set up the new system and configure it with the old quorum server’s IP address
and hostname.
3. Install and configure the quorum server software on the new system. Be sure to
include in the new QS authorization file (for example, /usr/local/qs/conf/
qs_authfile) all of the nodes that were configured for the old quorum server;
see the sketch following this procedure. Refer to the qs(1) man page for details
about configuring the QS authorization file.

NOTE: The quorum server reads the authorization file at startup. Whenever you
modify the file qs_authfile, run the following command to force a re-read of
the file. For example on a Red Hat distribution:
/usr/local/qs/bin/qs -update
On a SUSE distribution:
/opt/qs/bin/qs -update

4. Start the quorum server as follows:


• Edit the /etc/inittab file to add the quorum server entries, as shown in
the latest version of the HP Serviceguard Quorum Server Version Release Notes.
• Use the init q command to run the quorum server.
Or
• Create a package in another cluster for the Quorum Server, as described in
the Release Notes for your version of Quorum Server. They can be found at
http://docs.hp.com -> High Availability -> Quorum Server.
5. All nodes in all clusters that were using the old quorum server will connect to the
new quorum server. Use the cmviewcl -v command from any cluster that is
using the quorum server to verify that the nodes in that cluster have connected to
the QS.
6. The quorum server log file on the new quorum server will show a log message
like the following for each cluster that uses the quorum server:
Request for lock /sg/<ClusterName> succeeded. New lock owners: N1, N2
7. To check that the quorum server has been correctly configured and to verify the
connectivity of a node to the quorum server, you can execute the following
command from your cluster nodes as follows:



cmquerycl -q <QSHostName> -n <Node1> -n <Node2> ...
The command will output an error message if the specified nodes cannot
communicate with the quorum server.
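
As noted in step 3, the QS authorization file is simply a list of the hosts that are allowed to obtain quorum services. A sketch of its contents (the host names are hypothetical) might be:
ftsys9.cup.hp.com
ftsys10.cup.hp.com
See the HP Serviceguard Quorum Server Release Notes and the qs(1) man page for the exact format and location on your distribution.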

CAUTION: Make sure that the old system does not rejoin the network with the old
IP address.

NOTE: While the old quorum server is down and the new one is being set up:
• The cmquerycl, cmcheckconf and cmapplyconf commands will not work
• The cmruncl, cmhaltcl, cmrunnode, and cmhaltnode commands will work
• If there is a node or network failure that creates a 50-50 membership split, the
quorum server will not be available as a tie-breaker, and the cluster will fail.

Troubleshooting Approaches
The following sections offer a few suggestions for troubleshooting by reviewing the
state of the running system and by examining cluster status data, log files, and
configuration files. Topics include:
• Reviewing Package IP Addresses
• Reviewing the System Log File
• Reviewing Configuration Files
• Reviewing the Package Control Script
• Using cmquerycl and cmcheckconf
• Using cmviewcl
• Reviewing the LAN Configuration

Reviewing Package IP Addresses


The ifconfig command can be used to examine the LAN configuration. The command,
if executed on ftsys9 after the halting of node ftsys10, shows that the package IP
addresses are assigned to eth1:1 and eth1:2 along with the heartbeat IP address on eth1.
eth0 Link encap:Ethernet HWaddr 00:01:02:77:82:75
inet addr:15.13.169.106 Bcast:15.13.175.255 Mask:255.255.248.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:70826196 errors:0 dropped:0 overruns:1 frame:0
TX packets:5741486 errors:1 dropped:0 overruns:1 carrier:896
collisions:26706 txqueuelen:100
Interrupt:9 Base address:0xdc00

eth1 Link encap:Ethernet HWaddr 00:50:DA:64:8A:7C


inet addr:192.168.1.106 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2337841 errors:0 dropped:0 overruns:0 frame:0
TX packets:1171966 errors:0 dropped:0 overruns:0 carrier:0
collisions:6 txqueuelen:100



Interrupt:9 Base address:0xda00

eth1:1 Link encap:Ethernet HWaddr 00:50:DA:64:8A:7C


inet addr:192.168.1.200 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:9 Base address:0xda00

eth1:2 Link encap:Ethernet HWaddr 00:50:DA:64:8A:7C


inet addr:192.168.1.201 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:9 Base address:0xda00

lo Link encap:Local Loopback


inet addr:127.0.0.1 Bcast:192.168.1.255 Mask:255.255.255.0
UP LOOPBACK RUNNING MULTICAST MTU:3924 Metric:1
RX packets:2562940 errors:0 dropped:0 overruns:1 frame:0
TX packets:2562940 errors:1 dropped:0 overruns:1 carrier:896
collisions:0 txqueuelen:0

Reviewing the System Log File


Messages from the Cluster Manager and Package Manager are written to the system
log file. The default location of the log file may vary according to Linux distribution;
the Red Hat default is /var/log/messages. You can use a text editor, such as vi,
or the more command to view the log file for historical information on your cluster.
This log provides information on the following:
• Commands executed and their outcome.
• Major cluster events which may, or may not, be errors.
• Cluster status information.

NOTE: Many other products running on Linux in addition to Serviceguard use the
syslog file to save messages. Refer to your Linux documentation for additional
information on using the system log.

Sample System Log Entries


The following sample entries from the syslog file show a package that failed to run
because of a problem in the pkg5_run script. You would look at the pkg5_run.log
for details.
Dec 14 14:33:48 star04 cmcld[2048]: Starting cluster management protocols.
Dec 14 14:33:48 star04 cmcld[2048]: Attempting to form a new cluster
Dec 14 14:33:53 star04 cmcld[2048]: 3 nodes have formed a new cluster
Dec 14 14:33:53 star04 cmcld[2048]: The new active cluster membership is:
star04(id=1) , star05(id=2), star06(id=3)
Dec 14 17:33:53 star04 cmlvmd[2049]: Clvmd initialized successfully.
Dec 14 14:34:44 star04 CM-CMD[2054]: cmrunpkg -v pkg5
Dec 14 14:34:44 star04 cmcld[2048]: Request from node star04 to start
package pkg5 on node star04.
Dec 14 14:34:44 star04 cmcld[2048]: Executing '/usr/local/cmcluster/conf/pkg5/pkg5_run
start' for package pkg5.
Dec 14 14:34:45 star04 LVM[2066]: vgchange -a n /dev/vg02
Dec 14 14:34:45 star04 cmcld[2048]: Package pkg5 run script exited with



NO_RESTART.
Dec 14 14:34:45 star04 cmcld[2048]: Examine the file
/usr/local/cmcluster/pkg5/pkg5_run.log for more details.
The following is an example of a successful package starting:
Dec 14 14:39:27 star04 CM-CMD[2096]: cmruncl
Dec 14 14:39:27 star04 cmcld[2098]: Starting cluster management protocols.
Dec 14 14:39:27 star04 cmcld[2098]: Attempting to form a new cluster
Dec 14 14:39:27 star04 cmclconfd[2097]: Command execution message
Dec 14 14:39:33 star04 cmcld[2098]: 3 nodes have formed a new cluster
Dec 14 14:39:33 star04 cmcld[2098]: The new active cluster membership is:
star04(id=1), star05(id=2), star06(id=3)
Dec 14 17:39:33 star04 cmlvmd[2099]: Clvmd initialized successfully.
Dec 14 14:39:34 star04 cmcld[2098]: Executing '/usr/local/cmcluster/conf/pkg4/pkg4_run
start' for package pkg4.
Dec 14 14:39:34 star04 LVM[2107]: vgchange /dev/vg01
Dec 14 14:39:35 star04 CM-pkg4[2124]: cmmodnet -a -i 15.13.168.0 15.13.168.4
Dec 14 14:39:36 star04 CM-pkg4[2127]: cmrunserv Service4 /vg01/MyPing 127.0.0.1
>>/dev/null
Dec 14 14:39:36 star04 cmcld[2098]: Started package pkg4 on node star04.

Reviewing Object Manager Log Files


The Serviceguard Object Manager daemon cmomd logs messages to the file
/usr/local/cmom/cmomd.log on Red Hat and /var/log/cmom/cmomd.log on
SUSE. You can review these messages using the cmreadlog command, for example:
/usr/local/cmom/bin/cmreadlog /usr/local/cmom/log/cmomd.log
Messages from cmomd include information about the processes that request data from
the Object Manager, including type of data, timestamp, etc.

Reviewing Configuration Files


Review the following ASCII configuration files:
• Cluster configuration file.
• Package configuration files.
Ensure that the files are complete and correct according to your configuration planning
worksheets.

Reviewing the Package Control Script


For legacy packages, ensure that the package control script is found on all nodes where
the package can run and that the file is identical on all nodes. Ensure that the script is
executable on all nodes. Ensure that the name of the control script appears in the package
configuration file, and ensure that all services named in the package configuration file
also appear in the package control script.
Information about the starting and halting of each package can be found in the package’s
control script log. This log provides the history of the operation of the package, including
all package run and halt activities. The location of the file is determined by the
script_log_file parameter (page 208) in the package configuration file. If you have written
a separate run and halt script for a legacy package, each script will have its own log.
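
For example, a quick way to confirm that a legacy control script is present, executable, and identical on two nodes might be the following; the path and host name are hypothetical, and the Red Hat location of $SGCONF is assumed:
ls -l /usr/local/cmcluster/conf/pkg1/control.sh
md5sum /usr/local/cmcluster/conf/pkg1/control.sh
ssh ftsys10 md5sum /usr/local/cmcluster/conf/pkg1/control.sh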



Using the cmquerycl and cmcheckconf Commands
In addition, cmquerycl and cmcheckconf can be used to troubleshoot your cluster
just as they were used to verify its configuration. The following example shows the
commands used to verify the existing cluster configuration on ftsys9 and ftsys10:
cmquerycl -v -C $SGCONF/verify.conf -n ftsys9 -n ftsys10
cmcheckconf -v -C $SGCONF/verify.conf
cmcheckconf checks:
• The network addresses and connections.
• Quorum Server connectivity, if a quorum server is configured.
• Lock LUN connectivity, if a lock LUN is used.
• The validity of configuration parameters of the cluster and packages for:
— The uniqueness of names.
— The existence and permission of scripts.
It doesn't check:
• The correct setup of the power circuits.
• The correctness of the package configuration script.

Reviewing the LAN Configuration


The following networking commands can be used to diagnose problems:
• ifconfig can be used to examine the LAN configuration. This command lists all
IP addresses assigned to each LAN interface card.
• arp -a can be used to check the arp tables.
• cmscancl can be used to test IP-level connectivity between network interfaces in
the cluster.
• cmviewcl -v shows the status of primary LANs.
Use these commands on all nodes.

Solving Problems
Problems with Serviceguard may be of several types. The following is a list of common
categories of problem:
• Serviceguard Command Hangs.
• Cluster Re-formations.
• System Administration Errors.
• Package Control Script Hangs.
• Package Movement Errors.
• Node and Network Failures.
• Quorum Server Messages.



Name Resolution Problems
Many Serviceguard commands, including cmviewcl, depend on name resolution
services to look up the addresses of cluster nodes. When name services are not available
(for example, if a name server is down), Serviceguard commands may hang, or may
return a network-related error message. If this happens, use the host command on
each cluster node to see whether name resolution is correct. For example:
host ftsys9
ftsys9.cup.hp.com has address 15.13.172.229
If the output of this command does not include the correct IP address of the node, then
check your name resolution services further.

Networking and Security Configuration Errors


In many cases, a symptom such as Permission denied... or Connection
refused... is the result of an error in the networking or security configuration. Most
such problems can be resolved by correcting the entries in /etc/hosts. See
“Configuring Name Resolution” (page 156) for more information.

Cluster Re-formations Caused by Temporary Conditions


You may see Serviceguard error messages, such as the following, which indicate that
a node is having problems:
Member node_name seems unhealthy, not receiving heartbeats from
it.
This may indicate a serious problem, such as a node failure, whose underlying cause
is probably a too-aggressive setting for the MEMBER_TIMEOUT parameter; see the
next section, “Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too
Low”. Or it may be a transitory problem, such as excessive network traffic or system
load.
What to do: If you find that cluster nodes are failing because of temporary network or
system-load problems (which in turn cause heartbeat messages to be delayed in network
or during processing), you should solve the networking or load problem if you can.
Failing that, you can increase the value of MEMBER_TIMEOUT, as described in the
next section.

Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too Low


If you have set the MEMBER_TIMEOUT parameter too low, the cluster daemon, cmcld,
will write warnings to syslog that indicate the problem. There are three in particular
that you should watch for:



1. Warning: cmcld was unable to run for the last <n.n> seconds.
Consult the Managing Serviceguard manual for guidance on
setting MEMBER_TIMEOUT, and information on cmcld.
This means that cmcld was unable to get access to a CPU for a significant amount
of time. If this occurred while the cluster was re-forming, one or more nodes could
have failed. Some commands (such as cmhaltnode(1m), cmrunnode(1m), and
cmapplyconf(1m)) cause the cluster to re-form, so there's a chance that running
one of these commands could precipitate a node failure; that chance is greater the
longer the hang.
What to do: If this message appears once a month or more often, increase
MEMBER_TIMEOUT to more than 10 times the largest reported delay. For example,
if the message that reports the largest number says that cmcld was unable to run
for the last 1.6 seconds, increase MEMBER_TIMEOUT to more than 16 seconds. (A
script sketch that scans syslog for these warnings appears at the end of this section.)
2. This node is at risk of being evicted from the running
cluster. Increase MEMBER_TIMEOUT.
This means that the hang was long enough for other nodes to have noticed the
delay in receiving heartbeats and marked the node “unhealthy”. This is the
beginning of the process of evicting the node from the cluster; see “What Happens
when a Node Times Out” (page 90) for an explanation of that process.
What to do: In isolation, this could indicate a transitory problem, as described in
the previous section. If you have diagnosed and fixed such a problem and are
confident that it won't recur, you need take no further action; otherwise you should
increase MEMBER_TIMEOUT as instructed in item 1.
3. Member node_name seems unhealthy, not receiving heartbeats
from it.
This is the message that indicates that the node has been found “unhealthy”, as
described in item 2.
What to do: See item 2.
For more information, including requirements and recommendations, see the
MEMBER_TIMEOUT discussion under “Cluster Configuration Parameters ” (page 105).
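To see whether the first of these warnings is occurring, and how large the reported delays are, you can scan syslog for the message. The following is only a sketch; it assumes the system log is /var/log/messages and that the message text matches the format shown in item 1 above.

#!/bin/bash
# Sketch: report the largest cmcld scheduling delay found in syslog, and the
# MEMBER_TIMEOUT value implied by the "more than 10 times" rule above.
# Assumes the system log is /var/log/messages.
LOG=/var/log/messages

grep "cmcld was unable to run for the last" "$LOG" |
awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "last") delay = $(i + 1) + 0
        if (delay > max) max = delay
     }
     END {
        if (max > 0)
            printf "Largest reported delay: %.1f seconds; set MEMBER_TIMEOUT > %.0f seconds\n", max, max * 10
        else
            print "No cmcld delay warnings found."
     }'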

System Administration Errors


There are a number of errors you can make when configuring Serviceguard that will
not show up when you start the cluster. Your cluster can be running, and everything
can appear to be fine, until there is a hardware or software failure and control of your
packages is not transferred to another node as you would have expected.



These are errors caused specifically by errors in the cluster configuration file and
package configuration scripts. Examples of these errors include:
• Volume groups not defined on adoptive node.
• Mount point does not exist on adoptive node.
• Network errors on adoptive node (configuration errors).
• User information not correct on adoptive node.
You can use the following commands to check the status of your disks (a short check
sketch follows the list):
• df - to see whether the file systems used by your package are mounted.
• vgdisplay -v - to see if all volumes are present.
• strings /etc/lvmconf/*.conf - to ensure that the configuration is correct.
• fdisk -l /dev/sdx - to display partition information for a disk.
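For example, to check one package's storage on an adoptive node you could run something like the following sketch there; the volume group and mount point names are placeholders for your package's actual values.

#!/bin/bash
# Sketch: verify that a package's storage is usable on this node.
# VG and MOUNTPOINT are placeholders; take the real values from the package.
VG=vgpkg1
MOUNTPOINT=/pkg1/data

vgdisplay -v "$VG"               # is the volume group (and all its volumes) visible?
df | grep "$MOUNTPOINT"          # is the package file system currently mounted?
fdisk -l 2>/dev/null | head -20  # summary of the disks this node can see (run as root)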

Package Control Script Hangs or Failures


When a RUN_SCRIPT_TIMEOUT or HALT_SCRIPT_TIMEOUT value is set, and the
control script hangs, causing the timeout to be exceeded, Serviceguard kills the script
and marks the package “Halted.” Similarly, when a package control script fails,
Serviceguard kills the script and marks the package “Halted.” In both cases, the
following also take place:
• Control of the package will not be transferred.
• The run or halt instructions may not run to completion.
• Global switching will be disabled.
• The current node will be disabled from running the package.
Following such a failure, since the control script is terminated, some of the package’s
resources may be left activated. Specifically:
• Volume groups may be left active.
• File systems may still be mounted.
• IP addresses may still be installed.
• Services may still be running.
In this kind of situation, Serviceguard will not restart the package without manual
intervention. You must clean up manually before restarting the package. Use the
following steps as guidelines:
1. Perform application specific cleanup. Any application specific actions the control
script might have taken should be undone to ensure successfully starting the
package on an alternate node. This might include such things as shutting down
application processes, removing lock files, and removing temporary files.
2. Ensure that package IP addresses are removed from the system. This step is
accomplished via the cmmodnet(1m) command. First determine which package
IP addresses are installed by inspecting the output resulting from running the
ifconfig command. If any of the IP addresses specified in the package control



script appear in the ifconfig output under the inet addr: in the ethX:Y
block, use cmmodnet to remove them:
cmmodnet -r -i <ip-address> <subnet>
where <ip-address> is the address indicated above and <subnet> is the result
of masking the <ip-address> with the mask found in the same line as the inet
address in the ifconfig output.
3. Ensure that package volume groups are deactivated. First unmount any package
logical volumes which are being used for file systems. This is determined by
inspecting the output resulting from running the command df -l. If any package
logical volumes, as specified by the LV[] array variables in the package control
script, appear under the “Filesystem” column, use umount to unmount them:
fuser -ku <logical-volume>
umount <logical-volume>
Next, deactivate the package volume groups. These are specified by the VG[]
array entries in the package control script.
vgchange -a n <volume-group>
4. Finally, re-enable the package for switching.
cmmodpkg -e <package-name>
If after cleaning up the node on which the timeout occurred it is desirable to have
that node as an alternate for running the package, remember to re-enable the
package to run on the node:
cmmodpkg -e -n <node-name> <package-name>
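The steps above can be collected into a small cleanup sketch such as the following. All of the names and addresses are placeholders that must be replaced with the values used in your package control script, and any application-specific cleanup (step 1) still has to be filled in by hand.

#!/bin/bash
# Sketch: manual cleanup after a package control script timeout or failure.
# Every value below is a placeholder; take the real values from the package
# control script and from the ifconfig and df output described above.
PKG=pkg1
NODE=ftsys9
IP=192.168.1.10            # relocatable IP address still shown by ifconfig
SUBNET=192.168.1.0         # subnet obtained by masking the IP as described in step 2
LV=/dev/vgpkg1/lvol1       # logical volume still mounted (from the LV[] array)
VG=vgpkg1                  # volume group still active (from the VG[] array)

# Step 1: application-specific cleanup goes here.

# Step 2: remove the package IP address if it is still installed.
cmmodnet -r -i "$IP" "$SUBNET"

# Step 3: unmount package file systems and deactivate the volume group.
fuser -ku "$LV"
umount "$LV"
vgchange -a n "$VG"

# Step 4: re-enable switching for the package, and for this node if desired.
cmmodpkg -e "$PKG"
cmmodpkg -e -n "$NODE" "$PKG"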

The default Serviceguard control scripts are designed to take the straightforward steps
needed to get an application running or stopped. If the package administrator specifies
a time limit within which these steps need to occur and that limit is subsequently
exceeded for any reason, Serviceguard takes the conservative approach that the control
script logic must either be hung or defective in some way. At that point the control
script cannot be trusted to perform cleanup actions correctly, thus the script is terminated
and the package administrator is given the opportunity to assess what cleanup steps
must be taken.
If you want the package to switch automatically in the event of a control script timeout,
set the node_fail_fast_enabled parameter (page 206) to YES. In this case, Serviceguard will
cause a reboot on the node where the control script timed out. This effectively cleans
up any side effects of the package’s run or halt attempt. In this case the package will
be automatically restarted on any available alternate node for which it is configured.

Package Movement Errors


These errors are similar to the system administration errors except they are caused
specifically by errors in the package control script. The best way to prevent these errors



is to test your package control script before putting your high availability application
on line.
Adding a set -x statement in the second line of your control script will give you
details on where your script may be failing.
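For example, while you are debugging, the top of the control script might look like this (remove the trace line again when you are finished):

#!/bin/bash    # (or whatever interpreter line the script already has)
set -x         # trace each command as it runs, so the failing step is visible in the log
# ... the rest of the package control script follows unchanged ...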

Node and Network Failures


These failures cause Serviceguard to transfer control of a package to another node. This
is the normal action of Serviceguard, but you have to be able to recognize when a
transfer has taken place and decide to leave the cluster in its current condition or to
restore it to its original condition.
Possible node failures can be caused by the following conditions:
• reboot
• Kernel Oops
• Hangs
• Power failures
You can use the following commands to check the status of your network and subnets:
• ifconfig - to display LAN status and check to see if the package IP is stacked
on the LAN card.
• arp -a - to check the arp tables.
Since your cluster is unique, there are no cookbook solutions to all possible problems.
But if you apply these checks and commands and work your way through the log files,
you will be successful in identifying and solving problems.

Troubleshooting the Quorum Server

NOTE: See the HP Serviceguard Quorum Server Version A.04.00 Release Notes for
information about configuring the Quorum Server. Do not proceed without reading
the Release Notes for your version.

Authorization File Problems


The following kind of message in a Serviceguard node’s syslog file or in the output
of cmviewcl -v may indicate an authorization problem:
Access denied to quorum server 192.6.7.4
The reason may be that you have not updated the authorization file. Verify that the
node is included in the file, and try using /usr/lbin/qs -update to re-read the
quorum server authorization file.



Timeout Problems
The following kinds of message in a Serviceguard node’s syslog file may indicate
timeout problems:
Unable to set client version at quorum server 192.6.7.2: reply
timed out
Probe of quorum server 192.6.7.2 timed out
These messages could be an indication of an intermittent network problem; or the
default quorum server timeout may not be sufficient. You can set the
QS_TIMEOUT_EXTENSION to increase the timeout, or you can increase the
MEMBER_TIMEOUT value. See “Cluster Configuration Parameters” (page 105) for
more information about these parameters.
A message such as the following in a Serviceguard node’s syslog file indicates that
the node did not receive a reply to its lock request on time. This could be because of
delay in communication between the node and the Quorum Server or between the
Quorum Server and other nodes in the cluster:
Attempt to get lock /sg/cluser1 unsuccessful. Reason:
request_timedout

Messages
The coordinator node in Serviceguard sometimes sends a request to the quorum server
to set the lock state. (This is different from a request to obtain the lock in tie-breaking.)
If the quorum server’s connection to one of the cluster nodes has not completed, the
request to set may fail with a two-line message like the following in the quorum server’s
log file:
Oct 008 16:10:05:0: There is no connection to the applicant
2 for lock /sg/lockTest1
Oct 08 16:10:05:0:Request for lock /sg/lockTest1 from
applicant 1 failed: not connected to all applicants.
This condition can be ignored. The request will be retried a few seconds later and will
succeed. The following message is logged:
Oct 008 16:10:06:0: Request for lock /sg/lockTest1
succeeded. New lock owners: 1,2.

Lock LUN Messages


If the lock LUN device fails, the following message will be entered in the syslog file:
Oct 008 16:10:05:0: WARNING: Cluster lock lun /dev/sdc1 has failed.



A Designing Highly Available Cluster Applications
This appendix describes how to create or port applications for high availability, with
emphasis on the following topics:
• Automating Application Operation
• Controlling the Speed of Application Failover (page 301)
• Designing Applications to Run on Multiple Systems (page 304)
• Restoring Client Connections (page 309)
• Handling Application Failures (page 310)
• Minimizing Planned Downtime (page 311)
Designing for high availability means reducing the amount of unplanned and planned
downtime that users will experience. Unplanned downtime includes unscheduled
events such as power outages, system failures, network failures, disk crashes, or
application failures. Planned downtime includes scheduled events such as scheduled
backups, system upgrades to new OS revisions, or hardware replacements.
Two key strategies should be kept in mind:
1. Design the application to handle a system reboot or panic. If you are modifying
an existing application for a highly available environment, determine what happens
currently with the application after a system panic. In a highly available
environment there should be defined (and scripted) procedures for restarting the
application. Procedures for starting and stopping the application should be
automatic, with no user intervention required.
2. The application should not use any system-specific information such as the
following if such use would prevent it from failing over to another system and
running properly:
• The application should not refer to uname() or gethostname().
• The application should not refer to the SPU ID.
• The application should not refer to the MAC (link-level) address.

Automating Application Operation


Can the application be started and stopped automatically or does it require operator
intervention?
This section describes how to automate application operations to avoid the need for
user intervention. One of the first rules of high availability is to avoid manual
intervention. If it takes a user at a terminal, console or GUI interface to enter commands
to bring up a subsystem, the user becomes a key part of the system. It may take hours
before a user can get to a system console to do the work necessary. The hardware in
question may be located in a far-off area where no trained users are available, the
systems may be located in a secure datacenter, or in off hours someone may have to
connect via modem.



There are two principles to keep in mind for automating application relocation:
• Insulate users from outages.
• Applications must have defined startup and shutdown procedures.
You need to be aware of what happens currently when the system your application is
running on is rebooted, and whether changes need to be made in the application's
response for high availability.

Insulate Users from Outages


Wherever possible, insulate your end users from outages. Issues include the following:
• Do not require user intervention to reconnect when a connection is lost due to a
failed server.
• Where possible, warn users of slight delays due to a failover in progress.
• Minimize the reentry of data.
• Engineer the system for reserve capacity to minimize the performance degradation
experienced by users.

Define Application Startup and Shutdown


Applications must be restartable without manual intervention. If the application requires
a switch to be flipped on a piece of hardware, then automated restart is impossible.
Procedures for application startup, shutdown and monitoring must be created so that
the HA software can perform these functions automatically.
To ensure automated response, there should be defined procedures for starting up the
application and stopping the application. In Serviceguard these procedures are placed
in the package control script. These procedures must check for errors and return status
to the HA control software. The startup and shutdown should be command-line driven
and not interactive unless all of the answers can be predetermined and scripted.
In an HA failover environment, HA software restarts the application on a surviving
system in the cluster that has the necessary resources, such as access to the necessary
disk drives. The application must be restartable in two aspects:
• It must be able to restart and recover on the backup system (or on the same system
if the application restart option is chosen).
• It must be able to restart if it fails during the startup and the cause of the failure
is resolved.
Application administrators need to learn to start up and shut down applications using
the appropriate HA commands. Inadvertently shutting down the application directly
will initiate an unwanted failover. Application administrators also need to be careful
that they don't accidentally shut down a production instance of an application rather
than a test instance in a development environment.
A mechanism to monitor whether the application is active is necessary so that the HA
software knows when the application has failed. This may be as simple as a script that
issues the command ps -ef | grep xxx for all the processes belonging to the
application.
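A minimal monitor along those lines might look like the following sketch. The process name is a placeholder, and in practice such a monitor would normally be configured as a Serviceguard service so that its exit causes the package to fail over.

#!/bin/bash
# Sketch: simple application monitor.
# APP_PROC is a placeholder for a process name unique to your application.
APP_PROC=myapp_server
INTERVAL=30    # seconds between checks

while true; do
    if ! ps -ef | grep "$APP_PROC" | grep -v grep > /dev/null; then
        echo "$(date): $APP_PROC is not running" >&2
        exit 1    # exiting lets the HA software detect the failure and react
    fi
    sleep "$INTERVAL"
done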



To reduce the impact on users, the application should not simply abort in case of error,
since aborting would cause an unneeded failover to a backup system. Applications
should determine the exact error and take specific action to recover from the error
rather than, for example, aborting upon receipt of any error.

Controlling the Speed of Application Failover


What steps can be taken to ensure the fastest failover?
If a failure does occur causing the application to be moved (failed over) to another
node, there are many things the application can do to reduce the amount of time it
takes to get the application back up and running. The topics covered are as follows:
• Replicate Non-Data File Systems
• Use Raw Volumes
• Evaluate the Use of a Journaled File System
• Minimize Data Loss
• Use Restartable Transactions
• Use Checkpoints
• Design for Multiple Servers
• Design for Replicated Data Sites

Replicate Non-Data File Systems


Non-data file systems should be replicated rather than shared. There can only be one
copy of the application data itself. It will be located on a set of disks that is accessed
by the system that is running the application. After failover, if these data disks contain
file systems, those file systems must go through recovery (fsck) before the data can be
accessed. To help reduce this recovery time, the smaller these file systems are, the faster
the recovery will be. Therefore, it is best to keep anything that can be replicated off the
data filesystem. For example, there should be a copy of the application executables on
each system rather than having one copy of the executables on a shared filesystem.
Additionally, replicating the application executables makes them subject to a rolling
upgrade if this is desired.

Evaluate the Use of a Journaled Filesystem (JFS)


If a file system must be used, a journaled file system offers significantly faster file
system recovery than a non-journaled file system. However, performance of the JFS may
vary with the application. Examples of appropriate journaled file systems on Linux are
ReiserFS and ext3.

Minimize Data Loss


Minimize the amount of data that might be lost at the time of an unplanned outage. It
is impossible to prevent some data from being lost when a failure occurs. However, it
is advisable to take certain actions to minimize the amount of data that will be lost, as
explained in the following discussion.



Minimize the Use and Amount of Memory-Based Data
Any in-memory data (the in-memory context) will be lost when a failure occurs. The
application should be designed to minimize the amount of in-memory data that exists
unless this data can be easily recalculated. When the application restarts on the standby
node, it must recalculate or reread from disk any information it needs to have in
memory.
One way to measure the speed of failover is to calculate how long it takes the application
to start up on a normal system after a reboot. Does the application start up immediately?
Or are there a number of steps the application must go through before an end-user can
connect to it? Ideally, the application can start up quickly without having to reinitialize
in-memory data structures or tables.
Performance concerns might dictate that data be kept in memory rather than written
to the disk. However, the risk associated with the loss of this data should be weighed
against the performance impact of posting the data to the disk.
Data that is read from a shared disk into memory, and then used as read-only data can
be kept in memory without concern.

Keep Logs Small


Some databases permit logs to be buffered in memory to increase online performance.
Of course, when a failure occurs, any in-flight transaction will be lost. However,
minimizing the size of this in-memory log will reduce the amount of completed
transaction data that would be lost in case of failure.
Keeping the size of the on-disk log small allows the log to be archived or replicated
more frequently, reducing the risk of data loss if a disaster were to occur. There is, of
course, a trade-off between online performance and the size of the log.

Eliminate Need for Local Data


When possible, eliminate the need for local data. In a three-tier, client/server
environment, the middle tier can often be dataless (i.e., there is no local data that is
client specific or needs to be modified). This “application server” tier can then provide
additional levels of availability, load-balancing, and failover. However, this scenario
requires that all data be stored either on the client (tier 1) or on the database server (tier
3).

Use Restartable Transactions


Transactions need to be restartable so that the client does not need to re-enter or back
out of the transaction when a server fails, and the application is restarted on another
system. In other words, if a failure occurs in the middle of a transaction, there should
be no need to start over again from the beginning. This capability makes the application
more robust and reduces the visibility of a failover to the user.
A common example is a print job. Printer applications typically schedule jobs. When
that job completes, the scheduler goes on to the next job. If, however, the system dies
in the middle of a long job (say it is printing paychecks for 3 hours), what happens
when the system comes back up again? Does the job restart from the beginning,
reprinting all the paychecks? Does it start from where it left off? Or does the scheduler
assume that the job was done and not print the last hour's worth of paychecks?
The correct behavior in a highly available environment is to restart where it left off,
ensuring that everyone gets one and only one paycheck.
Another example is an application where a clerk is entering data about a new employee.
Suppose this application requires that employee numbers be unique, and that after the
name and number of the new employee is entered, a failure occurs. Since the employee
number had been entered before the failure, does the application refuse to allow it to
be re-entered? Does it require that the partially entered information be deleted first?
More appropriately, in a highly available environment the application will allow the
clerk to easily restart the entry or to continue at the next data item.

Use Checkpoints
Design applications to checkpoint complex transactions. A single transaction from the
user's perspective may result in several actual database transactions. Although this
issue is related to restartable transactions, here it is advisable to record progress locally
on the client so that a transaction that was interrupted by a system failure can be
completed after the failover occurs.
For example, suppose the application being used is calculating PI. On the original
system, the application has gotten to the 1,000th decimal point, but the application has
not yet written anything to disk. At that moment in time, the node crashes. The
application is restarted on the second node, but the application is started up from
scratch. The application must recalculate those 1,000 decimal points. However, if the
application had written to disk the decimal points on a regular basis, the application
could have restarted from where it left off.
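The idea can be illustrated with a sketch like the following, in which a long-running job records its progress in a checkpoint file on the package's shared storage; the file name and the work function are placeholders.

#!/bin/bash
# Sketch: resume a long-running job from its last checkpoint.
# CKPT is a placeholder path on the package's shared file system.
CKPT=/pkg1/data/job.checkpoint

process_step() { :; }    # placeholder for one unit of real work

START=0
[ -f "$CKPT" ] && START=$(cat "$CKPT")

i=$START
while [ "$i" -lt 1000 ]; do
    process_step "$i"
    i=$((i + 1))
    # Checkpoint every 50 steps; tune this interval against performance.
    [ $((i % 50)) -eq 0 ] && echo "$i" > "$CKPT"
done
rm -f "$CKPT"

If the node fails partway through, the restarted job re-reads the checkpoint file and continues from the last recorded step instead of starting over.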

Balance Checkpoint Frequency with Performance


It is important to balance checkpoint frequency with performance. The trade-off with
checkpointing to disk is the impact of this checkpointing on performance. Obviously
if you checkpoint too often the application slows; if you don't checkpoint often enough,
it will take longer to get the application back to its current state after a failover. Ideally,
the end-user should be able to decide how often to checkpoint. Applications should
provide customizable parameters so the end-user can tune the checkpoint frequency.

Design for Multiple Servers


If you use multiple active servers, multiple service points can provide relatively
transparent service to a client. However, this capability requires that the client be smart
enough to have knowledge about the multiple servers and the priority for addressing
them. It also requires access to the data of the failed server or replicated data.
For example, rather than having a single application which fails over to a second system,
consider having both systems running the application. After a failure of the first system,
the second system simply takes over the load of the first system. This eliminates the
start up time of the application. There are many ways to design this sort of architecture,
and there are also many issues with this sort of design. This discussion will not go into
details other than to give a few examples.
The simplest method is to have two applications running in a master/slave relationship
where the slave is simply a hot standby application for the master. When the master
fails, the slave on the second system would still need to figure out what state the data
was in (i.e., data recovery would still take place). However, the time to fork the
application and do the initial startup is saved.
Another possibility is having two applications that are both active. An example might
be two application servers which feed a database. Half of the clients connect to one
application server and half of the clients connect to the second application server. If
one server fails, then all the clients connect to the remaining application server.

Design for Replicated Data Sites


Replicated data sites are a benefit for both fast failover and disaster recovery. With
replicated data, data disks are not shared between systems. There is no data recovery
that has to take place. This makes the recovery time faster. However, there may be
performance trade-offs associated with replicating data. There are a number of ways
to perform data replication, which should be fully investigated by the application
designer.
Many of the standard database products provide for data replication transparent to
the client application. By designing your application to use a standard database, the
end-user can determine if data replication is desired.

Designing Applications to Run on Multiple Systems


If an application can be failed to a backup node, how will it work on that different
system?
The previous sections discussed methods to ensure that an application can be
automatically restarted. This section will discuss some ways to ensure the application
can run on multiple systems. Topics are as follows:
• Avoid Node Specific Information
• Assign Unique Names to Applications
• Use uname(2) With Care
• Bind to a Fixed Port
• Bind to Relocatable IP Addresses
• Give Each Application its Own Volume Group
• Use Multiple Destinations for SNA Applications
• Avoid File Locking

Avoid Node Specific Information


Typically, when a new system is installed, an IP address must be assigned to each active
network interface. This IP address is always associated with the node and is called a
stationary IP address.
The use of packages containing highly available applications adds the requirement for
an additional set of IP addresses, which are assigned to the applications themselves.
These are known as relocatable application IP addresses. Serviceguard’s network
sensor monitors the node’s access to the subnet on which these relocatable application
IP addresses reside. When packages are configured in Serviceguard, the associated
subnetwork address is specified as a package dependency, and a list of nodes on which
the package can run is also provided. When failing a package over to a remote node,
the subnetwork must already be active on the target node.
Each application or package should be given a unique name as well as a relocatable IP
address. Following this rule separates the application from the system on which it runs,
thus removing the need for user knowledge of which system the application runs on.
It also makes it easier to move the application among different systems in a cluster
for load balancing or other reasons. If two applications share a single IP address, they
must move together. Instead, using independent names and addresses allows them to
move separately.
For external access to the cluster, clients must know how to refer to the application.
One option is to tell the client which relocatable IP address is associated with the
application. Another option is to think of the application name as a host, and configure
a name-to-address mapping in the Domain Name System (DNS). In either case, the
client will ultimately be communicating via the application’s relocatable IP address. If
the application moves to another node, the IP address will move with it, allowing the
client to use the application without knowing its current location. Remember that each
network interface must have a stationary IP address associated with it. This IP address
does not move to a remote system in the event of a network failure.

Obtain Enough IP Addresses


Each application receives a relocatable IP address that is separate from the stationary
IP address assigned to the system itself. Therefore, a single system might have many
IP addresses, one for itself and one for each of the applications that it normally runs.
Therefore, IP addresses in a given subnet range will be consumed faster than without
high availability. It might be necessary to acquire additional IP addresses.
Multiple IP addresses on the same network interface are supported only if they are on
the same subnetwork.

Allow Multiple Instances on Same System


Applications should be written so that multiple instances, each with its own application
name and IP address, can run on a single system. It might be necessary to invoke the
application with a parameter showing which instance is running. This allows
distributing the users among several systems under normal circumstances, but it also
allows all of the users to be serviced in the case of a failure on a single system.

Avoid Using SPU IDs or MAC Addresses


Design the application so that it does not rely on the SPU ID or MAC (link-level)
addresses. The SPU ID is a unique hardware ID contained in non-volatile memory,
which cannot be changed. A MAC address (also known as a NIC id) is a link-specific
address associated with the LAN hardware. The use of these addresses is a common
problem for license servers, since for security reasons they want to use hardware-specific
identification to ensure the license isn't copied to multiple nodes. One workaround is



to have multiple licenses; one for each node the application will run on. Another way
is to have a cluster-wide mechanism that lists a set of SPU IDs or node names. If your
application is running on a system in the specified set, then the license is approved.
Previous generation HA software would move the MAC address of the network card
along with the IP address when services were moved to a backup system. This is no
longer allowed in Serviceguard.
There were a couple of reasons for using a MAC address, which have been addressed
below:
• Old network devices between the source and the destination such as routers had
to be manually programmed with MAC and IP address pairs. The solution to this
problem is to move the MAC address along with the IP address in case of failover.
• Up to 20 minute delays could occur while network device caches were updated
due to timeouts associated with systems going down. This is dealt with in current
HA software by broadcasting a new ARP translation of the old IP address with
the new MAC address.

Assign Unique Names to Applications


A unique name should be assigned to each application. This name should then be
configured in DNS so that the name can be used as input to gethostbyname(3), as
described in the following discussion.

Use DNS
DNS provides an API which can be used to map hostnames to IP addresses and vice
versa. This is useful for BSD socket applications such as telnet which are first told the
target system name. The application must then map the name to an IP address in order
to establish a connection. However, some calls should be used with caution.
Applications should not reference official hostnames or IP addresses. The official
hostname and corresponding IP address for the hostname refer to the primary LAN
card and the stationary IP address for that card. Therefore, any application that refers
to, or requires the hostname or primary IP address may not work in an HA environment
where the network identity of the system that supports a given application moves from
one system to another, but the hostname does not move.
One way to look for problems in this area is to look for calls to gethostname(2) in
the application. HA services should use gethostname() with caution, since the
response may change over time if the application migrates. Applications that use
gethostname() to determine the name for a call to gethostbyname(3) should also
be avoided for the same reason. Also, the gethostbyaddr() call may return different
answers over time if called with a stationary IP address.
Instead, the application should always refer to the application name and relocatable
IP address rather than the hostname and stationary IP address. It is appropriate for the
application to call gethostbyname(3), specifying the application name rather than
the hostname. gethostbyname(3) will then return the relocatable IP address of the
application, and this IP address will move with the application to the new node.



However, gethostbyname(3) should be used to locate the IP address of an application
only if the application name is configured in DNS. It is probably best to associate a
different application name with each independent HA service. This allows each
application and its IP address to be moved to another node without affecting other
applications. Only the stationary IP addresses should be associated with the hostname
in DNS.
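A simple command-line check of this setup is sketched below; the application name and relocatable address are examples only.

#!/bin/bash
# Sketch: confirm that the application name resolves to its relocatable IP
# address, not to a node's stationary address. Name and address are examples.
APP_NAME=payroll-app
RELOC_IP=192.168.1.10

RESOLVED=$(host "$APP_NAME" | awk '/has address/ { print $NF; exit }')
if [ "$RESOLVED" = "$RELOC_IP" ]; then
    echo "$APP_NAME resolves to its relocatable address $RELOC_IP"
else
    echo "$APP_NAME resolves to '$RESOLVED'; expected $RELOC_IP" >&2
    exit 1
fi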

Use uname(2) With Care


Related to the hostname issue discussed in the previous section is the application's use
of uname(2), which returns the official system name. The system name is unique to
a given system whatever the number of LAN cards in the system. By convention, the
uname and hostname are the same, but they do not have to be. Some applications,
after connection to a system, might call uname(2) to validate for security purposes
that they are really on the correct system. This is not appropriate in an HA environment,
since the service is moved from one system to another, and neither the uname nor the
hostname are moved. Applications should develop alternate means of verifying where
they are running. For example, an application might check a list of hostnames that have
been provided in a configuration file.

Bind to a Fixed Port


When binding a socket, a port address can be specified or one can be assigned
dynamically. One issue with binding to random ports is that a different port may be
assigned if the application is later restarted on another cluster node. This may be
confusing to clients accessing the application.
The recommended method is using fixed ports that are the same on all nodes where
the application will run, instead of assigning port numbers dynamically. The application
will then always return the same port number regardless of which node is currently
running the application. Application port assignments should be put in
/etc/services to keep track of them and to help ensure that someone will not choose
the same port number.
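For example, an entry like the following (the service name and port number are made up for illustration) would be added identically to /etc/services on every node, and can then be checked by name:

# Example /etc/services entry, added identically on every cluster node:
#   myapp    5500/tcp    # MyApp application server

# Verify that the entry is present and consistent on a node:
getent services myapp
grep -w myapp /etc/services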

Bind to Relocatable IP Addresses


When sockets are bound, an IP address is specified in addition to the port number.
This indicates the IP address to use for communication and is meant to allow
applications to limit which interfaces can communicate with clients. An application
can bind to INADDR_ANY as an indication that messages can arrive on any interface.
Network applications can bind to a stationary IP address, a relocatable IP address, or
INADDR_ANY. If the stationary IP address is specified, then the application may fail
when restarted on another node, because the stationary IP address is not moved to the
new system. If an application binds to the relocatable IP address, then the application
will behave correctly when moved to another system.
Many server-style applications will bind to INADDR_ANY, meaning that they will receive
requests on any interface. This allows clients to send to the stationary or relocatable IP
addresses. However, in this case the networking code cannot determine which source



IP address is most appropriate for responses, so it will always pick the stationary IP
address.
For TCP stream sockets, the TCP level of the protocol stack resolves this problem for
the client since it is a connection-based protocol. On the client, TCP ignores the stationary
IP address and continues to use the previously bound relocatable IP address originally
used by the client.
With UDP datagram sockets, however, there is a problem. The client may connect to
multiple servers utilizing the relocatable IP address and sort out the replies based on
the source IP address in the server’s response message. However, the source IP address
given in this response will be the stationary IP address rather than the relocatable
application IP address. Therefore, when creating a UDP socket for listening, the
application must always call bind(2) with the appropriate relocatable application IP
address rather than INADDR_ANY.
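From the administration side, you can check which address a running daemon has actually bound to; the port number below is an example.

# Port 5500 is an example. A local address of 0.0.0.0:5500 means the daemon
# bound to INADDR_ANY; a specific address means it bound to that IP only.
netstat -an | grep ':5500'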

Call bind() before connect()


When an application initiates its own connection, it should first call bind(2), specifying
the application IP address before calling connect(2). Otherwise the connect request
will be sent using the stationary IP address of the system's outbound LAN interface
rather than the desired relocatable application IP address. The client will receive this
IP address from the accept(2) call, possibly confusing the client software and
preventing it from working correctly.

Give Each Application its Own Volume Group


Use separate volume groups for each application that uses data. If the application
doesn't use disk, it is not necessary to assign it a separate volume group. A volume
group (group of disks) is the unit of storage that can move between nodes. The greatest
flexibility for load balancing exists when each application is confined to its own volume
group, i.e., two applications do not share the same set of disk drives. If two applications
do use the same volume group to store their data, then the applications must move
together. If the applications’ data stores are in separate volume groups, they can switch
to different nodes in the event of a failover.
The application data should be set up on different disk drives and if applicable, different
mount points. The application should be designed to allow for different disks and
separate mount points. If possible, the application should not assume a specific mount
point.

Use Multiple Destinations for SNA Applications


SNA is point-to-point link-oriented; that is, the services cannot simply be moved to
another system, since that system has a different point-to-point link which originates
in the mainframe. Therefore, backup links in a node and/or backup links in other nodes
should be configured so that SNA does not become a single point of failure. Note that
only one configuration for an SNA link can be active at a time. Therefore, backup links
that are used for other purposes should be reconfigured for the primary mission-critical
purpose upon failover.



Avoid File Locking
In an NFS environment, applications should avoid using file-locking mechanisms,
where the file to be locked is on an NFS Server. File locking should be avoided in an
application both on local and remote systems. If local file locking is employed and the
system fails, the system acting as the backup system will not have any knowledge of
the locks maintained by the failed system. This may or may not cause problems when
the application restarts.
Remote file locking is the worst of the two situations, since the system doing the locking
may be the system that fails. Then, the lock might never be released, and other parts
of the application will be unable to access that data. In an NFS environment, file locking
can cause long delays in case of NFS client system failure and might even delay the
failover itself.

Restoring Client Connections


How does a client reconnect to the server after a failure?
It is important to write client applications to specifically differentiate between the loss
of a connection to the server and other application-oriented errors that might be
returned. The application should take special action in case of connection loss.
One question to consider is how a client knows after a failure when to reconnect to the
newly started server. The typical scenario is that the client must simply restart its
session, or log in again. However, this method is not very automated. For example, a
well-tuned hardware and application system may fail over in 5 minutes. But if users,
after experiencing no response during the failure, give up after 2 minutes and go for
coffee and don't come back for 28 minutes, the perceived downtime is actually 30
minutes, not 5. Factors to consider are the number of reconnection attempts to make,
the frequency of reconnection attempts, and whether or not to notify the user of
connection loss.
There are a number of strategies to use for client reconnection:
• Design clients which continue to try to reconnect to their failed server.
Put the work into the client application rather than relying on the user to reconnect.
If the server is back up and running in 5 minutes, and the client is continually
retrying, then after 5 minutes, the client application will reestablish the link with
the server and either restart or continue the transaction. No intervention from the
user is required.
• Design clients to reconnect to a different server.
If you have a server design which includes multiple active servers, the client could
connect to the second server, and the user would only experience a brief delay.
The problem with this design is knowing when the client should switch to the
second server. How long does a client retry to the first server before giving up and
going to the second server? There are no definitive answers for this. The answer
depends on the design of the server application. If the application can be restarted
on the same node after a failure (see “Handling Application Failures ” following),
the retry to the current server should continue for the amount of time it takes to
restart the server locally. This will keep the client from having to switch to the
second server in the event of an application failure.
• Use a transaction processing monitor or message queueing software to increase
robustness.
Use transaction processing monitors such as Tuxedo or DCE/Encina, which provide
an interface between the server and the client. Transaction processing monitors
(TPMs) can be useful in creating a more highly available application. Transactions
can be queued such that the client does not detect a server failure. Many TPMs
provide for the optional automatic rerouting to alternate servers or for the automatic
retry of a transaction. TPMs also provide for ensuring the reliable completion of
transactions, although they are not the only mechanism for doing this. After the
server is back online, the transaction monitor reconnects to the new server and
continues routing transactions to it.
• Queue Up Requests
As an alternative to using a TPM, queue up requests when the server is unavailable.
Rather than notifying the user when a server is unavailable, the user request is
queued up and transmitted later when the server becomes available again. Message
queueing software ensures that messages of any kind, not necessarily just
transactions, are delivered and acknowledged.
Message queueing is useful only when the user does not need or expect confirmation
that the request has been completed (i.e., the application is not interactive).

Handling Application Failures


What happens if part or all of an application fails?
All of the preceding sections have assumed the failure in question was not a failure of
the application, but of another component of the cluster. This section deals specifically
with application problems. For instance, software bugs may cause an application to
fail, or system resource issues (such as low swap/memory space) may cause an
application to die. This section deals with how to design your application to recover
from these types of failures.

Create Applications to be Failure Tolerant


An application should be tolerant of the failure of a single component. Many applications
have multiple processes running on a single node. If one process fails, what happens
to the other processes? Do they also fail? Can the failed process be restarted on the
same node without affecting the remaining pieces of the application?
Ideally, if one process fails, the other processes can wait a period of time for that
component to come back online. This is true whether the component is on the same
system or a remote system. The failed component can be restarted automatically on
the same system, rejoin the waiting processes, and continue. This type of failure
can be detected and restarted within a few seconds, so the end user would never know
a failure occurred.



Another alternative is for the failure of one component to still allow bringing down
the other components cleanly. If a database SQL server fails, the database should still
be able to be brought down cleanly so that no database recovery is necessary.
The worst case is for a failure of one component to cause the entire system to fail. If
one component fails and all other components need to be restarted, the downtime will
be high.

Be Able to Monitor Applications


All components in a system, including applications, should be able to be monitored
for their health. A monitor might be as simple as a display command or as complicated
as a SQL query. There must be a way to ensure that the application is behaving correctly.
If the application fails and it is not detected automatically, it might take hours for a
user to determine the cause of the downtime and recover from it.

Minimizing Planned Downtime


Planned downtime (as opposed to unplanned downtime) is scheduled; examples
include backups, systems upgrades to new operating system revisions, or hardware
replacements. For planned downtime, application designers should consider:
• Reducing the time needed for application upgrades/patches.
Can an administrator install a new version of the application without scheduling
downtime? Can different revisions of an application operate within a system? Can
different revisions of a client and server operate within a system?
• Providing for online application reconfiguration.
Can the configuration information used by the application be changed without
bringing down the application?
• Documenting maintenance operations.
Does an operator know how to handle maintenance operations?
When discussing highly available systems, unplanned failures are often the main point
of discussion. However, if it takes 2 weeks to upgrade a system to a new revision of
software, there are bound to be a large number of complaints.
The following sections discuss ways of handling the different types of planned
downtime.

Reducing Time Needed for Application Upgrades and Patches


Once a year or so, a new revision of an application is released. How long does it take
for the end-user to upgrade to this new revision? This answer is the amount of planned
downtime a user must take to upgrade their application. The following guidelines
reduce this time.



Provide for Rolling Upgrades
Provide for a “rolling upgrade” in a client/server environment. For a system with many
components, the typical scenario is to bring down the entire system, upgrade every
node to the new version of the software, and then restart the application on all the
affected nodes. For large systems, this could result in a long downtime. An alternative
is to provide for a rolling upgrade. A rolling upgrade rolls out the new software in a
phased approach by upgrading only one component at a time. For example, the database
server is upgraded on Monday, causing a 15 minute downtime. Then on Tuesday, the
application server on two of the nodes is upgraded, which leaves the application servers
on the remaining nodes online and causes no downtime. On Wednesday, two more
application servers are upgraded, and so on. With this approach, you avoid the problem
where everything changes at once, plus you minimize long outages.
The trade-off is that the application software must operate with different revisions of
the software. In the above example, the database server might be at revision 5.0 while
some of the application servers are at revision 4.0. The application must be designed
to handle this type of situation.

Do Not Change the Data Layout Between Releases


Migration of the data to a new format can be very time-intensive. It also almost
guarantees that a rolling upgrade will not be possible. For example, if a database is
running on the first node, ideally, the second node could be upgraded to the new
revision of the database. When that upgrade is completed, a brief downtime could be
scheduled to move the database server from the first node to the newly upgraded
second node. The database server would then be restarted, while the first node is idle
and ready to be upgraded itself. However, if the new database revision requires a
different database layout, the old data will not be readable by the newly updated
database. The downtime will be longer as the data is migrated to the new layout.

Providing Online Application Reconfiguration


Most applications have some sort of configuration information that is read when the
application is started. If the application must be halted and a new configuration file
read in order to make a configuration change, downtime is incurred.
To avoid this downtime, use configuration tools that interact with the application and
make dynamic changes online, with little or no interruption to the end-user. Such a
tool must be able to do everything online, such as expanding the size of the data,
adding new users to the system, adding new users to the application, etc. Every task
that an administrator needs to do to the application system should be made available
online.

Documenting Maintenance Operations


Standard procedures are important. An application designer should make every effort
to make tasks common for both the highly available environment and the normal
environment. If an administrator is accustomed to bringing down the entire system



after a failure, he or she will continue to do so even if the application has been
redesigned to handle a single failure. It is important that application documentation
discuss alternatives with regard to high availability for typical maintenance operations.



B Integrating HA Applications with Serviceguard
The following is a summary of the steps you should follow to integrate an application
into the Serviceguard environment:
1. Read the rest of this book, including the chapters on cluster and package
configuration, and the appendix “Designing Highly Available Cluster
Applications.”
2. Define the cluster’s behavior for normal operations:
• What should the cluster look like during normal operation?
• What is the standard configuration most people will use? (Is there any data
available about user requirements?)
• Can you separate out functions such as database or application server onto
separate machines, or does everything run on one machine?
3. Define the cluster’s behavior for failover operations:
• Does everything fail over together to the adoptive node?
• Can separate applications fail over to the same node?
• Is there already a high availability mechanism within the application other
than the features provided by Serviceguard?
4. Identify problem areas
• What does the application do today to handle a system reboot or panic?
• Does the application use any system-specific information such as uname()
or gethostname(), SPU_ID or MAC address which would prevent it from
failing over to another system?

Checklist for Integrating HA Applications


This section contains a checklist for integrating HA applications in both single and
multiple systems.



Defining Baseline Application Behavior on a Single System
1. Define a baseline behavior for the application on a standalone system:
• Install the application, database, and other required resources on one of the
systems. Be sure to follow Serviceguard rules in doing this:
— Install all shared data on separate external volume groups.
— Use a journaled filesystem (JFS) as appropriate.
• Perform some sort of standard test to ensure the application is running
correctly. This test can be used later in testing with Serviceguard. If possible,
try to connect to the application through a client.
• Crash the standalone system, reboot it, and test how the application starts up
again. Note the following:
— Are there any manual procedures? If so, document them.
— Can everything start up from rc scripts?
• Try to write a simple script which brings everything up without having to do
any keyboard typing. Figure out what the administrator would do at the
keyboard, then put that into the script.
• Try to write a simple script to bring down the application. Again, figure out
what the administrator would do at the keyboard, then put that into the script.
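A skeleton for such start and stop scripts might look like the following sketch. Every command in it is a placeholder for your application's real startup and shutdown steps; the same logic later becomes the customer-defined functions in the package control script.

#!/bin/bash
# Sketch: standalone start/stop wrapper for the application.
# All paths and commands are placeholders for the real procedures you
# worked out at the keyboard.
case "$1" in
    start)
        /opt/myapp/bin/start_db        # placeholder: bring up the database
        /opt/myapp/bin/start_server    # placeholder: bring up the application
        ;;
    stop)
        /opt/myapp/bin/stop_server     # placeholder: stop the application
        /opt/myapp/bin/stop_db         # placeholder: stop the database
        ;;
    *)
        echo "Usage: $0 {start|stop}" >&2
        exit 1
        ;;
esac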

Integrating HA Applications in Multiple Systems


1. Install the application on a second system.
• Create the LVM infrastructure on the second system.
• Add the appropriate users to the system.
• Install the appropriate executables.
• With the application not running on the first system, try to bring it up on the
second system. You might use the script you created in the step above. Is there
anything different that you must do? Does it run?
• Repeat this process until you can get the application to run on the second
system.
2. Configure the Serviceguard cluster:
• Create the cluster configuration.
• Create a package.
• Create the package script.
• Use the simple scripts you created in earlier steps as the customer defined
functions in the package control script.
3. Start the cluster and verify that applications run as planned.

Testing the Cluster


1. Test the cluster:
• Have clients connect.
• Provide a normal system load.



• Halt the package on the first node and move it to the second node:
# cmhaltpkg pkg1
# cmrunpkg -n node2 pkg1
# cmmodpkg -e pkg1
• Move it back.
# cmhaltpkg pkg1
# cmrunpkg -n node1 pkg1
# cmmodpkg -e pkg1
• Fail one of the systems. For example, turn off the power on node 1. Make sure
the package starts up on node 2.
• Repeat failover from node 2 back to node 1.
2. Be sure to test all combinations of application load during the testing. Repeat the
failover processes under different application states, such as heavy user load versus
no user load, batch jobs versus online transactions, and so on.
3. Record timelines of the amount of time spent during the failover for each
application state. A sample timeline might be 45 seconds to reconfigure the cluster,
15 seconds to run fsck on the filesystems, 30 seconds to start the application, and
3 minutes to recover the database. (See the timing sketch after this list.)
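One simple way to capture such a timeline is to timestamp each step of a failover test and watch the package come up on the adoptive node. The commands below are a sketch; the package and node names are examples only.

# date                      # record the start of the failover test
# cmhaltpkg pkg1
# cmrunpkg -n node2 pkg1
# cmmodpkg -e pkg1
# cmviewcl -v               # confirm the package and its services are running on node2
# date                      # record when the application is available again

Repeat the sequence for each application state you are testing and note the elapsed times.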

Checklist for Integrating HA Applications 317


318
C Blank Planning Worksheets
This appendix reprints blank versions of the planning worksheets described in the
“Planning” chapter. You can duplicate any of these worksheets that you find useful
and fill them in as a part of the planning process. The worksheets included in this
appendix are as follows:
• Hardware Worksheet (page 320)
• Power Supply Worksheet (page 321)
• Quorum Server Worksheet (page 322)
• Volume Group and Physical Volume Worksheet (page 323)
• Cluster Configuration Worksheet (page 324)
• Package Configuration Worksheet (page 325)
• Package Control Script Worksheet (Legacy) (page 326)

319
Hardware Worksheet
=============================================================================
SPU Information:

Host Name ____________________ Server Series____________

Memory Capacity ____________ Number of I/O Slots ____________


=============================================================================
LAN Information:

Name of Name of Node IP Traffic


Master _________ Interface __________ Addr________________ Type ________

Name of Name of Node IP Traffic


Master __________ Interface __________ Addr________________ Type ________

Name of Name of Node IP Traffic


Master _________ Interface __________ Addr_______________ Type __________

===============================================================================

Quorum Server Name: __________________ IP Address: ____________________

=============================================================================

Disk I/O Information for Shared Disks:

Bus Type ______ Slot Number ____ Address ____ Disk Device File _________

Bus Type ______ Slot Number ___ Address ____ Disk Device File __________

Bus Type ______ Slot Number ___ Address ____ Disk Device File _________

Bus Type ______ Slot Number ___ Address ____ Disk Device File _________

320 Blank Planning Worksheets


Power Supply Worksheet
============================================================================
SPU Power:

Host Name ____________________ Power Supply _____________________

Host Name ____________________ Power Supply _____________________

============================================================================
Disk Power:

Disk Unit __________________________ Power Supply _______________________

Disk Unit __________________________ Power Supply _______________________

Disk Unit __________________________ Power Supply _______________________

Disk Unit __________________________ Power Supply _______________________

Disk Unit __________________________ Power Supply _______________________

Disk Unit __________________________ Power Supply _______________________

============================================================================
Tape Backup Power:

Tape Unit __________________________ Power Supply _______________________

Tape Unit __________________________ Power Supply _______________________

============================================================================
Other Power:

Unit Name __________________________ Power Supply _______________________

Unit Name __________________________ Power Supply _______________________

Power Supply Worksheet 321


Quorum Server Worksheet
Quorum Server Data:
==============================================================================

QS Hostname: _________________IP Address: ______________________

OR

Cluster Name: _________________

Package Name: ____________ Package IP Address: ___________________

Hostname Given to Package by Network Administrator: _________________

==============================================================================

Quorum Services are Provided for:

Cluster Name: ___________________________________________________________

Host Names ____________________________________________

Host Names ____________________________________________

Cluster Name: ___________________________________________________________

Host Names ____________________________________________

Host Names ____________________________________________

Cluster Name: ___________________________________________________________

Host Names ____________________________________________

Host Names ____________________________________________

322 Blank Planning Worksheets


Volume Group and Physical Volume Worksheet
==============================================================================

Volume Group Name: ___________________________________

Physical Volume Name: _________________

Physical Volume Name: _________________

Physical Volume Name: _________________

=============================================================================

Volume Group Name: ___________________________________

Physical Volume Name: _________________

Physical Volume Name: _________________

Physical Volume Name: _________________

Volume Group and Physical Volume Worksheet 323


Cluster Configuration Worksheet
===============================================================================
Name and Nodes:
===============================================================================
Cluster Name: ______________________________

Node Names: ________________________________________________

Maximum Configured Packages: ______________


===============================================================================
Cluster Lock Data:
================================================================================
If using a quorum server:
Quorum Server Host Name or IP Address: ____________________

Quorum Server Polling Interval: ______________ microseconds

Quorum Server Timeout Extension: _______________ microseconds


==============================================================================
If using a lock lun:
Lock LUN Name on Node 1: __________________
Lock LUN Name on Node 2: __________________
Lock LUN Name on Node 3: __________________
Lock LUN Name on Node 4: __________________
===============================================================================
Subnets:
===============================================================================
Heartbeat Subnet: __________________________

Monitored Non-heartbeat Subnet: __________________

Monitored Non-heartbeat Subnet: ___________________


===============================================================================
Timing Parameters:
===============================================================================
Heartbeat Interval: __________
===============================================================================
Node Timeout: ______________
===============================================================================
Network Polling Interval: __________
===============================================================================
Autostart Delay: _____________
Access Policies
User: ________ Host: ________ Role: ________
User: _________ Host: _________ Role: __________

324 Blank Planning Worksheets


Package Configuration Worksheet
=============================================================================
Package Configuration File Data:
==========================================================================
Package Name: __________________Package Type:______________
Primary Node: ____________________ First Failover Node:__________________
Additional Failover Nodes:__________________________________
Run Script Timeout: _____ Halt Script Timeout: _____________
Package AutoRun Enabled? ______
Node Failfast Enabled? ________
Failover Policy:_____________ Failback_policy:___________________________________
Access Policies:
User:_________________ From node:_______ Role:_____________________________
User:_________________ From node:_______ Role:______________________________________________
Log level____ Log file:_______________________________________________________________________________________
Priority_____________ Successor_halt_timeout____________
dependency_name _____ dependency_condition _____
dependency_location _______
==========================================================================
LVM Volume Groups:
vg____vg01___________vg________________vg________________vg________________
vgchange_cmd:
__________________________________________________________________________________________________
Logical Volumes and File Systems:
fs_name___________________ fs_directory________________
fs_mount_opt_______________fs_umount_opt______________
fs_fsck_opt________________fs_type_________________
fs_name____________________fs_directory________________
fs_mount_opt_______________fs_umount_opt_____________
fs_fsck_opt________________fs_type_________________
fs_name____________________fs_directory________________
fs_mount_opt_______________fs_umount_opt_____________
fs_fsck_opt________________fs_type_________________
fs_mount_retry_count: ____________
fs_umount_retry_count:___________________
Concurrent mount/umount operations: ______________________________________
Concurrent fsck operations: ______________________________________
===============================================================================
Network Information:
IP ________ IP__________IP___________subnet __________
IP__________IP__________IP___________subnet___________
Monitored subnet:_______________________________________________________________
===============================================================================
Service Name: _______ Command: _________ Restart:___ Fail Fast enabled:____
Service Name: _______ Command: _________ Restart: __ Fail Fast enabled:_____
Service Name: _______ Command: _________ Restart: __ Fail Fast enabled:_____
================================================================================
Package environment variable:________________________________________________
Package environment variable:________________________________________________
External pre-script:_________________________________________________________
External script:_____________________________________________________________
================================================================================

Package Configuration Worksheet 325


Package Control Script Worksheet (Legacy)
PACKAGE CONTROL SCRIPT WORKSHEET Page ___ of ___
================================================================================
Package Control Script Data:
================================================================================

PATH______________________________________________________________
VGCHANGE_________________________________

VG[0]__________________LV[0]______________________FS[0]____________________

VG[1]__________________LV[1]______________________FS[1]____________________

VG[2]__________________LV[2]______________________FS[2]____________________

FS Umount Count: ____________FS Mount Retry Count:_________________________

IP[0] ______________________________ SUBNET ________________________

IP[1] ______________________________ SUBNET ________________________

Service Name: __________ Command: ______________________ Restart: ________

Service Name: __________ Command: ______________________ Restart: ________

NOTE: MD, RAIDTAB, and RAIDSTART are deprecated and should not be used. See
“Multipath for Storage” (page 96).

326 Blank Planning Worksheets


D IPv6 Network Support
This appendix describes some of the characteristics of IPv6 network addresses,
specifically:
• IPv6 Address Types
• Network Configuration Restrictions (page 331)
• Configuring IPv6 on Linux (page 332)

IPv6 Address Types


Several types of IPv6 addressing schemes are specified in RFC 2373 (IPv6 Addressing
Architecture). IPv6 addresses are 128-bit identifiers for interfaces and sets of interfaces.
RFC 2373 defines various address formats for IPv6. IPv6 addresses
are broadly classified as unicast, anycast, and multicast.
The following table explains the three types.
Table D-1 IPv6 Address Types
Unicast An address for a single interface. A packet sent to a unicast address is delivered to the
interface identified by that address.

Anycast An address for a set of interfaces. In most cases these interfaces belong to different
nodes. A packet sent to an anycast address is delivered to one of these interfaces
identified by the address. Since the standards for using anycast addresses are still
evolving, they are not supported in Linux at present.

Multicast An address for a set of interfaces (typically belonging to different nodes). A packet
sent to a multicast address will be delivered to all interfaces identified by that address.

Unlike IPv4, IPv6 has no broadcast addresses; their functions are superseded by
multicast.

Textual Representation of IPv6 Addresses


There are three conventional forms for representing IPv6 addresses as text strings:
• The first form is x:x:x:x:x:x:x:x, where the x’s are the hexadecimal values
of the eight 16-bit pieces of the 128-bit address. Example:
2001:fecd:ba23:cd1f:dcb1:1010:9234:4088.
• Some IPv6 addresses may contain long strings of zero bits. To make such
addresses easier to represent textually, a special syntax is available: the use
of “::” indicates one or more groups of sixteen zero bits. The “::” can appear
only once in an address, and it can be used to compress the leading, trailing,
or contiguous sixteen-bit groups of zeros in an address. Example:
fec0:1:0:0:0:0:0:1234 can be represented as fec0:1::1234.
• In a mixed environment of IPv4 and IPv6 nodes, an alternative form of IPv6 address
is used. It is x:x:x:x:x:x:d.d.d.d, where the x’s are the hexadecimal

IPv6 Address Types 327


values of the higher-order 96 bits of the IPv6 address and the d’s are the decimal
values of the lower-order 32 bits. Typically, IPv4 Mapped IPv6 addresses and IPv4
Compatible IPv6 addresses are represented in this notation. These addresses
are discussed in later sections.
Examples:
0:0:0:0:0:0:10.1.2.3
and
::10.11.3.123

IPv6 Address Prefix


IPv6 Address Prefix is similar to CIDR in IPv4 and is written in CIDR notation. An
IPv6 address prefix is represented by the notation:
IPv6-address/prefix-length where ipv6-address is an IPv6 address in any
notation listed above and prefix-length is a decimal value representing how many
of the leftmost contiguous bits of the address comprise the prefix. Example:
fec0:0:0:1::1234/64
The first 64 bits of the address, fec0:0:0:1, form the address prefix. An address
prefix is used in IPv6 addresses to denote how many bits of the IPv6 address represent
the subnet.

Unicast Addresses
IPv6 unicast addresses are classified into different types: aggregatable global unicast
addresses, site-local addresses, and link-local addresses. Typically a unicast address
is logically divided as follows:
Table D-2
n bits 128-n bits

Subnet prefix Interface ID

Interface identifiers in an IPv6 unicast address are used to identify the interfaces on a
link. Interface identifiers are required to be unique on that link. The link is generally
identified by the subnet prefix.
A unicast address is called an unspecified address if all the bits in the address are zero.
Textually it is represented as “::”.
The unicast address ::1 or 0:0:0:0:0:0:0:1 is called the loopback address. It is
used by a node to send packets to itself.

IPv4 and IPv6 Compatibility


There are a number of techniques for using IPv4 addresses within the framework of
IPv6 addressing.

328 IPv6 Network Support


IPv4 Compatible IPv6 Addresses
The IPv6 transition mechanisms use a technique for tunneling IPv6 packets over the
existing IPv4 infrastructure. IPv6 nodes that support such mechanisms use a special
kind of IPv6 addresses that carry IPv4 addresses in their lower order 32-bits. These
addresses are called IPv4 Compatible IPv6 addresses. They are represented as follows:
Table D-3
80 bits 16 bits 32 bits

zeros 0000 IPv4 address

Example:
::192.168.0.1

IPv4 Mapped IPv6 Address


There is a special type of IPv6 address that holds an embedded IPv4 address. This
address is used to represent the addresses of IPv4-only nodes as IPv6 addresses. These
addresses are used especially by applications that support both IPv6 and IPv4. These
addresses are called IPv4 Mapped IPv6 Addresses. The format of these addresses is
as follows:
Table D-4
80 bits 16 bits 32 bits

zeros FFFF IPv4 address

Example:
::ffff:192.168.0.1

Aggregatable Global Unicast Addresses


The global unicast addresses are globally unique IPv6 addresses. This address format
is defined in RFC 2374 (An IPv6 Aggregatable Global Unicast Address Format).
The format is:
Table D-5
3 13 8 24 16 64 bits

FP TLA ID RES NLA ID SLA ID Interface ID

where
FP = Format prefix. Value of this is “001” for Aggregatable Global unicast addresses.
TLA ID = Top-level Aggregation Identifier.
RES = Reserved for future use.
NLA ID = Next-Level Aggregation Identifier.

IPv6 Address Types 329


SLA ID = Site-Level Aggregation Identifier.
Interface ID = Interface Identifier.

Link-Local Addresses
Link-local addresses have the following format:
Table D-6
10 bits 54 bits 64 bits

1111111010 0 interface ID

Link-local addresses are intended for addressing nodes on a single link. Packets
originating from or destined to a link-local address are not forwarded by a router.

Site-Local Addresses
Site-local addresses have the following format:
Table D-7
10 bits 38 bits 16 bits 64 bits

1111111011 0 subnet ID interface ID

Site-local addresses are intended to be used within a site. Routers do not forward any
packet with a site-local source or destination address outside the site.

Multicast Addresses
A multicast address is an identifier for a group of nodes. Multicast addresses have the
following format:
Table D-8
8 bits 4 bits 4 bits 112 bits

11111111 flags scop group ID

“FF” at the beginning of the address identifies the address as a multicast address.
The “flags” field is a set of 4 flags, “000T”. The higher-order 3 bits are reserved and
must be zero. The last bit, ‘T’, indicates whether the address is permanently assigned:
a value of zero indicates a permanent assignment; otherwise it is a temporary
assignment.
The “scop” field is a 4-bit field which is used to limit the scope of the multicast group.
For example, a value of ‘1’ indicates that it is a node-local multicast group. A value of
‘2’ indicates that the scope is link-local. A value of “5” indicates that the scope is
site-local.

330 IPv6 Network Support


The “group ID” field identifies the multicast group. Some frequently used multicast
groups are the following:
All Node Addresses = FF02:0:0:0:0:0:0:1 (link-local)
All Router Addresses = FF02:0:0:0:0:0:0:2 (link-local)
All Router Addresses = FF05:0:0:0:0:0:0:2 (site-local)

Network Configuration Restrictions


Serviceguard supports IPv6 for data and heartbeat IP.
The restrictions on support for IPv6 in Serviceguard for Linux are:
• Auto-configured IPv6 addresses are not supported in Serviceguard as
HEARTBEAT_IP or STATIONARY_IP addresses. IPv6 addresses that are part of
a Serviceguard cluster configuration must not be auto-configured through router
advertisements, for example. Instead, they must be manually configured in
/etc/sysconfig/network-scripts/ifcfg-<eth-ID> on Red Hat or
/etc/sysconfig/network/ifcfg-<eth-ID> on SUSE. See “Configuring
IPv6 on Linux” (page 332) for instructions and examples.
• Link-local IP addresses are not supported, as package IPs, HEARTBEAT_IPs, or
STATIONARY_IPs. Depending on the requirements, the package IP address could
be of type site-local or global.
• Serviceguard supports only one IPv6 address belonging to each scope type
(site-local and global) on each network interface (that is, restricted multi-netting).
This means that a maximum of two IPv6 HEARTBEAT_IP or STATIONARY_IP
addresses can be listed in the cluster configuration file for a
NETWORK_INTERFACE: one being the site-local IPv6 address and the other
being the global IPv6 address (see the example after this list).

NOTE: This restriction applies to cluster configuration, not package configuration:


it does not affect the number of IPv6 relocatable addresses of the same scope type
(site-local or global) that a package can use on an interface.

• Bonding is supported for IPv6 addresses, but only in active-backup mode.


• Serviceguard supports IPv6 only on Ethernet networks, including 10BT, 100BT,
and Gigabit Ethernet.
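For example, an excerpt from a cluster configuration file that follows the restricted multi-netting rule above might look like the lines below; the interface name and IPv6 addresses are examples only, not values from your cluster.

NETWORK_INTERFACE    eth1
HEARTBEAT_IP         fec0:0:0:1::10
HEARTBEAT_IP         3ffe:ffff:0:f101::10

Here the first HEARTBEAT_IP is a site-local address and the second is a global address, so the limit of one address per scope type on the interface is respected.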

Network Configuration Restrictions 331


IMPORTANT: For important information, see also “Cross-Subnet Configurations”
(page 32), the description of the HOSTNAME_ADDRESS_FAMILY, QS_HOST and
QS_ADDR parameters under Cluster Configuration Parameters (page 105), “Configuring
Name Resolution” (page 156), and the Release Notes for your version of Serviceguard
for Linux.
For special instructions that may apply to using IPv6 addresses to connect your version
of Serviceguard for Linux and the Quorum Server, see “Configuring Serviceguard to
Use the Quorum Server" in the latest version of the HP Serviceguard Quorum Server Version
A.04.00 Release Notes, at http://www.docs.hp.com -> High Availability ->
Quorum Server.

Configuring IPv6 on Linux


Red Hat Enterprise Linux and SUSE Linux Enterprise Server already have the proper
IPv6 tools installed, including the /sbin/ip command. This section explains how to
configure IPv6 stationary IP addresses on these systems.

Enabling IPv6 on Red Hat Linux


Add the following lines to /etc/sysconfig/network:
NETWORKING_IPV6=yes # Enable global IPv6 initialization
IPV6FORWARDING=no # Disable global IPv6 forwarding
IPV6_AUTOCONF=no # Disable global IPv6 autoconfiguration
IPV6_AUTOTUNNEL=no # Disable automatic IPv6 tunneling

Adding persistent IPv6 Addresses on Red Hat Linux


This can be done by modifying the system configuration script, for example,
/etc/sysconfig/network-scripts/ifcfg-eth1:
DEVICE=eth1
BOOTPROTO=static
BROADCAST=192.168.1.255
IPADDR=192.168.1.10
NETMASK=255.255.255.0
NETWORK=192.168.1.0
ONBOOT=yes
IPV6INIT=yes
IPV6ADDR=3ffe:ffff:0000:f101::10/64
IPV6ADDR_SECONDARIES=fec0:0:0:1::10/64
IPV6_MTU=1280
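After editing the file, one way to activate and verify the new addresses (assuming the interface is eth1, as in the example above) is:

# ifdown eth1
# ifup eth1
# /sbin/ip -6 addr show dev eth1

The output should list the IPV6ADDR and IPV6ADDR_SECONDARIES addresses you configured.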

Configuring a Channel Bonding Interface with Persistent IPv6 Addresses on Red Hat
Linux
Configure the following parameters in
/etc/sysconfig/network-scripts/ifcfg-bond0:
DEVICE=bond0
IPADDR=12.12.12.12

332 IPv6 Network Support


NETMASK=255.255.255.0
NETWORK=12.12.12.0
BROADCAST=12.12.12.255
IPV6INIT=yes
IPV6ADDR=3ffe:ffff:0000:f101::10/64
IPV6ADDR_SECONDARIES=fec0:0:0:1::10/64
IPV6_MTU=1280
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
Add the following two lines to /etc/modprobe.conf to cause the bonding driver
to be loaded on reboot:
alias bond0 bonding
options bond0 miimon=100 mode=1 # active-backup mode

Adding Persistent IPv6 Addresses on SUSE


This can be done by modifying the system configuration script, for example,
/etc/sysconfig/network/ifcfg-eth1:
BOOTPROTO=static
BROADCAST=10.10.18.255
IPADDR=10.10.18.18
MTU=""
NETMASK=255.255.255.0
NETWORK=10.10.18.0
REMOTE_IPADDR=""
STARTMODE=onboot
IPADDR1=3ffe::f101:10/64
IPADDR2=fec0:0:0:1::10/64

Configuring a Channel Bonding Interface with Persistent IPv6 Addresses on SUSE


Configure the following parameters in /etc/sysconfig/network/ifcfg-bond0:
BOOTPROTO=static
BROADCAST=10.0.2.255
IPADDR=10.0.2.10
NETMASK=255.255.0.0
NETWORK=10.0.2.0
REMOTE_IPADDR=""
STARTMODE=onboot
IPADDR1=3ffe::f101:10/64
IPADDR2=fec0:0:0:1::10/64
BONDING_MASTER=yes
BONDING_MODULE_OPTS="mode=active-backup miimon=100"
BONDING_SLAVE0=eth1
BONDING_SLAVE1=eth2
For each additional IPv6 address, specify an additional parameter with IPADDR<num>
in the configuration file.
Bonding module options are specified in each of the bond device files, so nothing needs
to be specified in /etc/modprobe.conf.
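To confirm that the bond has come up in active-backup mode with both slaves, and that the IPv6 addresses are assigned, you can typically check the bonding driver's status file and the interface addresses; bond0 is the example device name used above.

# cat /proc/net/bonding/bond0
# /sbin/ip -6 addr show dev bond0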

Configuring IPv6 on Linux 333


334
E Using Serviceguard Manager
HP Serviceguard Manager is a web-based, HP System Management Homepage (HP
SMH) tool that replaces the functionality of the earlier Serviceguard management tools.
Serviceguard Manager allows you to monitor, administer and configure a Serviceguard
cluster from any system with a supported web browser.
The Serviceguard Manager Main Page provides you with a summary of the health of
the cluster including the status of each node and its packages.
See the Release Notes for your version of Serviceguard for Linux for information about
the latest release of Serviceguard Manager, as well as installation and configuration
instructions.

About the Online Help System


Once Serviceguard Manager is running, you can view tooltips by moving your mouse
over a field on the read-only property pages to see a brief definition of that field. You
can also open the online help by clicking the help button in the upper-right corner of
the screen. Start with the help topic "Understanding the HP Serviceguard Manager
Main Page". You should also read the help topic "About Security", as it explains HP
Serviceguard Manager Access Control Policies, as well as root privileges.

Launching Serviceguard Manager


This section provides information about two common scenarios.

TIP: To prevent an Out of Memory error reported by Tomcat (Exception in


thread "main" java.lang.OutOfMemoryError: Java heap space), which
may occur especially if the server is under heavy load or Serviceguard Manager is
managing a large cluster (16 nodes with 300 packages), do the following from the
command line:
1. Stop hpsmhd
/etc/init.d/hpsmhd -stop
2. Modify the /opt/hp/hpsmh/tomcat/bin/startup.sh file, and add the
following line in the export statements section:
export CATALINA_OPTS="-Xms512m -Xmx512m"
3. Save the file and restart hpsmhd
/etc/init.d/hpsmhd -start

Scenario 1 - Single cluster management


Scenario 1 applies if you are:

About the Online Help System 335


• Managing a single cluster, and
• Have installed Serviceguard version A.11.19.

NOTE: SMH access roles constrain a user's cluster-management capabilities:


• A user with HP SMH Administrator access has full cluster management capabilities.
• A user with HP SMH Operator access can monitor the cluster and has restricted
cluster management capabilities as defined by the user’s Serviceguard role-based
access configuration.
• A user with HP SMH User access does not have any cluster management
capabilities.
1. Enter the standard URL “http://<full hostname of server>:2301/”
For example, http://clusternode1.cup.hp.com:2301/
2. When the System Management Homepage login screen appears, enter your login
credentials and click Sign In.
The System Management Homepage for the selected server appears.
3. From the Serviceguard Cluster box, click the name of the cluster.

NOTE: If a cluster is not yet configured, you will not see the Serviceguard Cluster
section on this screen. To create a cluster, from the SMH Tools menu, click the
Serviceguard Manager link in the Serviceguard box first, then click Create Cluster.
The figure below shows a browser session at the HP Serviceguard Manager Main Page.

336 Using Serviceguard Manager


Figure E-1 System Management Homepage with Serviceguard Manager

Number  What is it?                Description

1       Cluster and Overall        Displays information about the Cluster status, alerts and
        status and alerts          general information.
                                   NOTE: The System Tools menu item is not available in this
                                   version of Serviceguard Manager.

2       Menu tool bar              The menu tool bar is available from the HP Serviceguard
                                   Manager Homepage, and from any cluster, node or package
                                   view-only property page. Menu option availability depends
                                   on which type of property page (cluster, node or package)
                                   you are currently viewing.

3       Tab bar                    The default Tab bar allows you to view additional
                                   cluster-related information. The Tab bar displays different
                                   content when you click on a specific node or package.

4       Node information           Displays information about the Node status, alerts and
                                   general information.

5       Package information        Displays information about the Package status, alerts and
                                   general information.

Scenario 2 - Multi-Cluster Management


Open a separate browser session to administer, manage or monitor multiple clusters.
Scenario 2 applies if you have:

Launching Serviceguard Manager 337


• One or more clusters with Serviceguard version A.11.15.01 through A.11.19
• Serviceguard Manager version A.05.01 with HP SIM 5.10 or later installed on a
server

NOTE: Serviceguard Manager can be launched by HP Systems Insight Manager


version 5.10 or later if Serviceguard Manager is installed on an HP Systems Insight
Manager Central Management Server.
For a Serviceguard A.11.19 cluster, Systems Insight Manager will attempt to launch
Serviceguard Manager B.02.00 from one of the nodes in the cluster; for a
Serviceguard A.11.18 cluster, Systems Insight Manager will attempt to launch
Serviceguard Manager B.01.01 from one of the nodes in the cluster.
For a Serviceguard A.11.16 cluster or earlier, Serviceguard Manager A.05.01 will
be launched via Java Web Start. You must ensure that the hostname for each
Serviceguard node can be resolved by DNS. For more information about this older
version of Serviceguard Manager, see the Serviceguard Manager Version A.05.01
Release Notes at http://docs.hp.com -> High Availability ->
Serviceguard Manager.

1. Enter the standard URL, https://<full hostname of SIM server>:50000/


For example, https://SIMserver.cup.hp.com:50000/
2. When the Systems Insight Manager login screen appears, enter your login credentials
and click Sign In.
3. From the left-hand panel, expand Cluster by Type.

338 Using Serviceguard Manager


Figure E-2 Cluster by Type

4. Expand HP Serviceguard, and click on a Serviceguard cluster.

NOTE: If you click on a cluster running an earlier Serviceguard release, the page will
display a link that will launch Serviceguard Manager A.05.01 (if installed) via Java
Web Start.

Launching Serviceguard Manager 339


340
Index
defined, 30
A broadcast storm
and possible TOC, 118
Access Control Policies, 183
building a cluster
active node, 25
identifying heartbeat subnets, 182
adding a package to a running cluster, 273
identifying quorum server, 179
adding cluster nodes
logical volume infrastructure, 165
advance planning, 150
verifying the cluster configuration, 189
adding nodes to a running cluster, 240
bus type
adding packages on a running cluster, 228
hardware planning, 96
administration
adding nodes to a running cluster, 240
halting a package, 243 C
halting the entire cluster, 242 CAPACITY_NAME
moving a package, 244 defined, 115
of packages and services, 242 CAPACITY_VALUE
of the cluster, 239 definedr, 115
reconfiguring a package while the cluster is running, 272 changes in cluster membership, 44
reconfiguring a package with the cluster offline, 273 changes to cluster allowed while the cluster is running, 255
reconfiguring the cluster, 255 changes to packages allowed while the cluster is running, 275
removing nodes from operation in a running cluster, 241 checkpoints, 303
responding to cluster events, 278 client connections
reviewing configuration files, 290 restoring in applications, 309
starting a package, 242 cluster
troubleshooting, 288 configuring with commands, 177
adoptive node, 25 redundancy of components, 29
applications Serviceguard, 23
automating, 299 typical configuration, 23
checklist of steps for integrating with Serviceguard, 315 understanding components, 29
handling failures, 310 cluster administration, 239
writing HA services for networks, 301 solving problems, 291
ARP messages cluster and package maintenance, 229
after switching, 83 cluster configuration
AUTO_START file on all nodes, 42
effect of default value, 91 identifying cluster-aware volume groups, 182
AUTO_START_TIMEOUT planning, 100
parameter in the cluster configuration file, 119 planning worksheet, 123
AUTO_START_TIMEOUT (autostart delay) verifying the cluster configuration, 189
parameter in cluster manager configuration, 119 cluster configuration file
automatic failback Autostart Delay parameter (AUTO_START_TIMEOUT), 119
configuring with failover policies, 57 cluster coordinator
automatic restart of cluster, 44 defined, 42
automatically restarting the cluster, 242 cluster lock
automating application operation, 299 4 or more nodes, 47, 48
autostart delay and cluster reformation, example, 90
parameter in the cluster configuration file, 119 and power supplies, 36
autostart for clusters identifying in configuration file, 179
setting up, 192 no lock, 48
two nodes, 45, 46
B use in re-forming a cluster, 45, 46
cluster manager
binding
automatic restart of cluster, 44
in network applications, 307
blank planning worksheet, 324
bridged net
cluster node parameter, 105, 108

341
defined, 42 support for additional productss, 268
dynamic re-formation, 44 troubleshooting, 290
heartbeat subnet parameter, 111 controlling the speed of application failover, 301
initial configuration of the cluster, 42 creating the package configuration, 262
main functions, 42 customer defined functions
maximum configured packages parameter, 123 adding to the control script, 267
member timeout parameter, 118
monitored non-heartbeat subnet, 114 D
network polling interval parameter, 119, 122 data
planning the configuration, 105
disks, 35
quorum server parameter, 108 data congestion, 43
testing, 282 deciding when and where to run packages, 50, 51
cluster node deleting a package configuration
parameter in cluster manager configuration, 105, 108 using cmdeleteconf, 273
cluster parameters deleting a package from a running cluster, 273
initial configuration, 42 deleting nodes while the cluster is running, 256
cluster re-formation deleting the cluster configuration
scenario, 90 using cmdeleteconf, 195
cluster startup dependencies
manual, 43 configuring, 126
cmapplyconf, 255, 269 designing applications to run on multiple systems, 304
cmapplyconf command, 227 disk
cmcheckconf, 189, 227, 268 data, 35
troubleshooting, 291 interfaces, 35
cmcheckconf command, 227 root, 35
cmcld daemon sample configurations, 35
and node reboot, 39 disk I/O
and node TOC, 39 hardware planning, 96
and safety timer, 39 disk layout
cmclnodelist bootstrap file, 155 planning, 99
cmdeleteconf disk logical units
deleting a package configuration, 273 hardware planning, 97
deleting the cluster configuration, 195 disk monitoring
cmmakepkg configuring, 228
examples, 222 disks
cmmodnet in Serviceguard, 34
assigning IP addresses in control scripts, 72 replacing, 283
cmnetassist daemon, 40 supported types in Serviceguard, 35
cmnetd daemon, 38 distributing the cluster and package configuration, 227, 268
cmquerycl DNS services, 158
troubleshooting, 291 down time
cmsnmpd daemon, 38 minimizing planned, 311
configuration dynamic cluster re-formation, 44
basic tasks and steps, 28
cluster planning, 100
of the cluster, 42 E
package, 197 enclosure for disks
package planning, 123 replacing a faulty mechanism, 283
service, 197 Ethernet
configuration file redundant configuration, 31
for cluster manager, 42 exclusive access
troubleshooting, 290 relinquishing via TOC, 90
configuring packages and their services, 197 expanding the cluster
control script planning ahead, 94
adding customer defined functions, 267 expansion
in package configuration, 265 planning for, 125
pathname parameter in package configuration, 221 explanations

342 Index
package parameters, 204 halting a package, 243
halting the entire cluster, 242
F handling application failures, 310
hardware
failback policy
monitoring, 282
used by package manager, 57
power supplies, 36
FAILBACK_POLICY parameter
hardware failures
used by package manager, 57
response to, 91
failover
hardware planning
controlling the speed in applications, 301
blank planning worksheet, 319
defined, 25
Disk I/O Bus Type, 96
failover behavior
disk I/O information for shared disks, 96
in packages, 125
host IP address, 95, 99
failover package, 49, 198
host name, 94
failover policy
I/O bus addresses, 96
used by package manager, 53
I/O slot numbers, 96
FAILOVER_POLICY parameter
LAN interface name, 95, 99
used by package manager, 53
LAN traffic type, 95
failure
memory capacity, 94
kinds of responses, 89
number of I/O slots, 94
network communication, 92
planning the configuration, 94
response to hardware failures, 91
S800 series number, 94
responses to package and service failures, 92
SPU information, 94
restarting a service after failure, 92
subnet, 95, 99
failures
worksheet, 97
of applications, 310
heartbeat messages, 24
FibreChannel, 35
defined, 43
figures
heartbeat subnet address
mirrored disks connected for high availability, 36
parameter in cluster configuration, 111
redundant LANs, 32
HEARTBEAT_IP
Serviceguard software components, 38
parameter in cluster configuration, 111
tasks in configuring an Serviceguard cluster, 28
high availability, 23
typical cluster after failover, 25
HA cluster defined, 29
typical cluster configuration, 24
objectives in planning, 93
file locking, 309
host IP address
file system name parameter in package control script, 222
hardware planning, 95, 99
file systems
host name
planning, 99
hardware planning, 94
floating IP address
HOSTNAME_ADDRESS_FAMILY
defined, 71
defined, 106
floating IP addresses
discussion and restrictions, 101
in Serviceguard packages, 72
how the cluster manager works, 42
FS, 222
how the network manager works, 71
in sample package control script, 266
FS_MOUNT_OPT
in sample package control script, 266 I
I/O bus addresses
G hardware planning, 96
general planning, 93 I/O slots
hardware planning, 94, 96
gethostbyname(), 306
identifying cluster-aware volume groups, 182
Installing Serviceguard, 153
H installing software
HALT_SCRIPT quorum server, 165
parameter in package configuration, 221 integrating HA applications with Serviceguard, 315
HALT_SCRIPT_TIMEOUT (halt script timeout) introduction
parameter in package configuration, 222 Serviceguard at a glance, 23
halting a cluster, 242 understanding Serviceguard hardware, 29

343
understanding Serviceguard software, 37 M
IP MAC addresses, 305
in sample package control script, 266 managing the cluster and nodes, 239
IP address manual cluster startup, 43
adding and deleting in packages, 73 MAX_CONFIGURED_PACKAGES
for nodes and packages, 71 parameter in cluster manager configuration, 123
hardware planning, 95, 99 maximum number of nodes, 29
portable, 72 MEMBER_TIMEOUT
reviewing for packages, 288 and safety timer, 39
switching, 52, 53, 83 configuring, 118
IP_MONITOR defined, 117
defined, 120 maximum and minimum values , 117
membership change
J reasons for, 44
JFS, 301 memory capacity
hardware planning, 94
memory requirements
K lockable memory for Serviceguard, 94
kernel minimizing planned down time, 311
hang, and TOC, 89 mirrored disks connected for high availability
safety timer, 39 figure, 36
kernel consistency monitor cluster with Serviceguard commands, 191
in cluster configuration, 159 monitored non-heartbeat subnet
kernel interrupts parameter in cluster configuration, 114
and possible TOC, 118 monitored resource failure
Serviceguard behavior, 29
L monitoring disks, 228
LAN monitoring hardware, 282
heartbeat, 43 moving a package, 244
interface name, 95, 99 multi-node package, 49, 198
LAN failure multiple systems
Serviceguard behavior, 29 designing applications for, 304
LAN interfaces
primary and secondary, 30 N
LAN planning name resolution services, 158
host IP address, 95, 99 network
traffic type, 95 adding and deleting package IP addresses, 73
link-level addresses, 305 load sharing with IP addresses, 73
load sharing with IP addresses, 73 local interface switching, 74
local switching, 74 redundancy, 31
lock remote system switching, 82
cluster locks and power supplies, 36 network communication failure, 92
use of the cluster lock, 46 network components
use of the cluster lock disk, 45 in Serviceguard, 30
lock volume group, reconfiguring, 255 network manager
logical volume parameter in package control script, 222 adding and deleting package IP addresses, 73
logical volumes main functions, 71
creating the infrastructure, 165 network planning
planning, 99 subnet, 95, 99
LV, 222 network polling interval (NETWORK_POLLING_INTERVAL)
in sample package control script, 266 parameter in cluster manager configuration, 119, 122
LVM network time protocol (NTP)
commands for cluster use, 165 for clusters, 159
disks, 35 networking
planning, 99 redundant subnets, 95
networks

344 Index
binding to IP addresses, 307 subnet parameter, 221
binding to port addresses, 307 using Serviceguard commands, 262
IP addresses and naming, 304 verifying, 227
node and package IP addresses, 71 verifying the configuration, 227, 268
packages using IP addresses, 306 writing the package control script, 265
supported types in Serviceguard, 30 package configuration file, 204
writing network applications as HA services, 301 editing, 223
no cluster lock generating, 222
choosing, 48 package dependency paramters, 210
node successor_halt_timeout, 208
basic concepts, 29 package configuration parameters, 204
halt (TOC), 90 package control script
in Serviceguard cluster, 23 FS parameter, 222
IP addresses, 71 LV parameter, 222
timeout and TOC example, 90 package coordinator
node types defined, 43
active, 25 package dependency
primary, 25 parameters, 210
NODE_FAIL_FAST_ENABLED successor_halt_timeou, 208
effect of setting, 92 package failover behavior, 125
NODE_NAME package failures
parameter in cluster configuration, 110 responses, 92
parameter in cluster manager configuration, 105, 108 package IP address
nodetypes defined, 71
primary, 25 package IP addresses
NTP defined, 72
time protocol for clusters, 159 reviewing, 288
package manager
O blank planning worksheet, 325, 326
testing, 281
Object Manager, 290
package modules, 199
outages
base, 200
insulating users from, 300
optional, 202
package switching behavior
P changing, 244
package packages
adding and deleting package IP addresses, 73 deciding where and when to run, 50, 51
basic concepts, 29 managed by cmcld, 39
blank planning worksheet, 325, 326 parameter explanations, 204
changes allowed while the cluster is running, 275 parameters, 204
halting, 243 types, 198
in Serviceguard cluster, 23 parameters
local interface switching, 74 for failover, 125
moving, 244 pacakge configuration, 204
reconfiguring while the cluster is running, 272 parameters for cluster manager
reconfiguring with the cluster offline, 273 initial configuration, 42
remote switching, 82 PATH, 221
starting, 242 physical volume
package administration, 242 for cluster lock, 45, 46
solving problems, 291 physical volumes
package and cluster maintenance, 229 blank planning worksheet, 323
package configuration planning, 99
applying, 227 planning
distributing the configuration file, 227, 268 cluster configuration, 100
planning, 123 cluster lock and cluster expansion, 99
run and halt script timeout parameters, 222 cluster manager configuration, 105
step by step, 197 disk I/O information, 96

345
for expansion, 125 redundancy in network interfaces, 30
hardware configuration, 94 redundant Ethernet configuration, 31
high availability objectives, 93 redundant LANS
overview, 93 figure, 32
package configuration, 123 redundant networks
power, 97 for heartbeat, 24
quorum server, 99 relocatable IP address
SPU information, 94 defined, 71
volume groups and physical volumes, 99 relocatable IP addresses
worksheets, 97 in Serviceguard packages, 72
planning and documenting an HA cluster, 93 remote switching, 82
planning for cluster expansion, 94 removing nodes from operation in a running cluster, 241
planning worksheets removing packages on a running cluster, 228
blanks, 319 removing Serviceguard from a system, 279
point of failure replacing disks, 283
in networking, 31 resources
POLLING_TARGET disks, 35
defined, 121 responses
ports to cluster events, 278
dual and single aggregated, 76 to package and service failures, 92
power planning responses to failures, 89
power sources, 97 responses to hardware failures, 91
worksheet, 98, 322 restart
power supplies automatic restart of cluster, 44
blank planning worksheet, 321 following failure, 92
power supply restartable transactions, 302
and cluster lock, 36 restarting the cluster automatically, 242
UPS, 36 restoring client connections in applications, 309
primary LAN interfaces rotating standby
defined, 30 configuring with failover policies, 54
primary node, 25 setting package policies, 54
RUN_SCRIPT
Q parameter in package configuration, 221
RUN_SCRIPT_TIMEOUT (run script timeout)
QS_ADDR
parameter in package configuration, 222
parameter in cluster manager configuration, 108
running cluster
quorum
adding or removing packages, 228
and cluster reformation, 90
quorum server
and safety timer, 39 S
installing, 165 S800 series number
parameters in cluster manager configuration, 108 hardware planning, 94
planning, 99 safety timer
and node TOC, 39
R and syslog, 39
duration, 39
re-formation
sample disk configurations, 35
of cluster, 44
service administration, 242
reconfiguring a package
while the cluster is running, 272 service configuration
step by step, 197
reconfiguring a package with the cluster offline, 273
service failures
reconfiguring a running cluster, 255
responses, 92
reconfiguring the entire cluster, 255
service restarts, 92
reconfiguring the lock volume group, 255
SERVICE_CMD
recovery time, 100
in sample package control script, 266
redundancy
SERVICE_FAIL_FAST_ENABLED
in networking, 31
and node TOC, 92
of cluster components, 29
SERVICE_NAME

346 Index
in sample package control script, 266 defined, 120
SERVICE_RESTART successor_halt_timeout parameter, 208
in sample package control script, 266 supported disks in Serviceguard, 35
Serviceguard supported networks in Serviceguard, 30
install, 153 switching
introduction, 23 ARP messages after switching, 83
Serviceguard at a Glance, 23 local interface switching, 74
Serviceguard behavior remote system switching, 82
in LAN failure, 29 switching IP addresses, 52, 53, 83
in monitored resource failure, 29 system log, 283
in software failure, 29 system log file
Serviceguard commands troubleshooting, 289
to configure a package, 262 system message
Serviceguard Manager, 27 changing for clusters, 193
overview, 26 system multi-node package, 49, 198
Serviceguard software components
figure, 38 T
shared disks tasks in Serviceguard configuration
planning, 96 figure, 28
shutdown and startup testing
defined for applications, 300 cluster manager, 282
single point of failure package manager, 281
avoiding, 23 testing cluster operation, 281
single-node operation, 193, 279 time protocol (NTP)
size of cluster for clusters, 159
preparing for changes, 150 TOC
SMN package, 49 and package availability, 90
SNA applications, 308 and safety timer, 118
software failure and the safety timer, 39
Serviceguard behavior, 29 when a node fails, 89
software planning traffic type
LVM, 99 LAN hardware planning, 95
solving problems, 291 troubleshooting
SPU information approaches, 288
planning, 94 monitoring hardware, 282
standby LAN interfaces replacing disks, 283
defined, 30 reviewing control scripts, 290
starting a package, 242 reviewing package IP addresses, 288
startup and shutdown reviewing system log file, 289
defined for applications, 300 using cmquerycl and cmcheckconf, 291
startup of cluster troubleshooting your cluster, 281
manual, 43 typical cluster after failover
stationary IP addresses, 71 figure, 25
STATIONARY_IP typical cluster configuration
parameter in cluster configuration, 114 figure, 24
status
cmviewcl, 229
package IP address, 288 U
system log file, 289 uname(2), 307
stopping a cluster, 242 understanding network components in Serviceguard, 30
SUBNET UPS
in sample package control script, 266 in power planning, 97
parameter in package configuration, 221 power supply, 36
subnet use of the cluster lock, 45, 46
hardware planning, 95, 99
parameter in package configuration, 221 V
SUBNET (for IP Monitor) verifying cluster configuration, 189

347
verifying the cluster and package configuration, 227, 268
VG
in sample package control script, 266
vgcfgbackup
using to back up volume group configuration, 175
VGCHANGE
in package control script, 266
VGChange, 222
volume group
for cluster lock, 45, 46
planning, 99
volume group and physical volume planning, 99

W
WEIGHT_DEFAULT
defined, 121
WEIGHT_NAME
defined, 121
What is Serviceguard?, 23
worksheet
blanks, 319
cluster configuration, 123, 324
hardware configuration, 97, 319
package configuration, 325, 326
power supply configuration, 98, 321, 322
use in planning, 93

348 Index
