IBM Power Systems Technical University

Session Title: Best Practices for Designing a PowerHA SystemMirror for AIX High Availability Solution
Session ID: HA17 (AIX)
Michael Herrera ([email protected]) Advanced Technical Skills (ATS) Certified IT Specialist
Agenda
- Common Misconceptions & Mistakes
- Infrastructure Considerations
- Differences in 7.1
- Virtualization & PowerHA SystemMirror
- Licensing Scenarios
- Cluster Management & Testing
- Summary
Packaging Changes:
- Standard Edition - Local Availability
- Enterprise Edition - Local & Disaster Recovery
Licensing Changes:
- Small, Medium, Large Server Class
Product Lifecycle:
Version                        Release Date     End of Support Date
HACMP 5.4.1                    Nov 6, 2007      Sept 2011
PowerHA 5.5.0                  Nov 14, 2008     N/A
PowerHA SystemMirror 6.1.0     Oct 20, 2009     N/A
PowerHA SystemMirror 7.1.0     Sept 10, 2010    N/A
Current fix levels: 5.4.1.8 (May 13), 5.5.0.6 (June 7), 6.1.0.2 (May 21), 7.1.0.1 (Sep)
Common Misconceptions
PowerHA SystemMirror is an out of the box solution
- Scripting & testing of application start / stop scripts is required
- Application monitors will also require scripting & testing
Fact: Clustering will highlight what you are & are NOT doing right in your environment
- Lack of education / experience
- Not knowing expected fallover behaviors
- Lack of application monitoring
- Not knowing what to monitor or check (CLI, logs)
- I/O pacing enabled (old values)
- HBA levels at GA code
- Fibre Channel tunable settings not enabled (see the sketch below)
- Interim fixes not loaded on all cluster nodes
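A minimal sketch of enabling the Fibre Channel tunables called out above on the fscsi protocol devices; the adapter names are assumptions, and -P defers the change until the next boot:

# lsdev -C | grep fscsi                                        # list the FC protocol devices
# chdev -l fscsi0 -a fc_err_recov=fast_fail -a dyn_trk=yes -P
# chdev -l fscsi1 -a fc_err_recov=fast_fail -a dyn_trk=yes -P

Apply the same change on every cluster node and reboot (or reconfigure the devices) so the deferred attributes take effect.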
LAN Infrastructure: redundant switches
SAN Infrastructure: redundant fabric
Application Availability: application monitoring, availability reports
Infrastructure Considerations
All links through one pipe:
[Diagram: Node A at Site A and Node B at Site B share LAN and SAN connectivity over a single DWDM link, with mirrored 50 GB LUNs in SITEAMETROVG]
Important: Identify & eliminate single points of failure!
Infrastructure Considerations
[Diagram: the same two sites designed around the single points of failure - redundant XD_ip and XD_rs232 networks across the WAN, net_ether_0 on the LANs, a disk heartbeat volume group (diskhb_vg1, hdisk2/hdisk3) shared over the SAN/DWDM links, and mirrored 50 GB LUNs in SITEAMETROVG between Node A and Node B]
Important: Identify single points of failure & design the solution around them
Infrastructure Considerations
Power Redundancy
Real Customer Scenarios:
- I/O drawers
- SCSI backplane
- SAN HBAs
- Virtualized environments
- Application fallover protection
Moral of the story: high availability goes beyond just installing the cluster software
[Diagram: with 6.1 and below, a four-node cluster needs point-to-point disk heartbeat networks (diskhb_net1 through diskhb_net4) between LPAR pairs; with 7.1, heartbeating uses multicasting across LPAR 1 through LPAR 4]
[Diagram: heartbeat rings in 6.1 & below - each node's en0/en1 interfaces carry a base address (e.g. 192.168.100.2) and a persistent IP (9.19.51.11) on separate subnets within the VLAN]
Traditional heartbeating rules no longer apply. However, route striping is still a potential issue: when two interfaces have routable IPs on the same subnet, AIX will send half the traffic out of either interface.
Methods to circumvent this (see the sketch below):
- Link Aggregation / EtherChannel
- Virtualized interfaces with dual VIO servers
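A minimal sketch of the Link Aggregation / EtherChannel option from the AIX command line (smitty etherchannel is the usual path); the adapter names and the 802.3ad mode are assumptions to match to your switch configuration:

# mkdev -c adapter -s pseudo -t ibm_ech -a adapter_names=ent0,ent1 -a mode=8023ad
# lsattr -El ent2                                   # the new pseudo adapter (ent2 here is hypothetical)

Because the aggregate carries a single IP, only one routable interface exists per subnet and the striping issue goes away.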
[Diagrams: with 7.1 the service IPs (9.19.51.20, 9.19.51.21) and base addresses (9.19.51.10, 9.19.51.11) can live on the same subnet, hosted on aggregated or virtualized interfaces (en2/en3 over ent0/ent1) within the VLAN]
What is it:
- A set of services/tools embedded in AIX to help manage a cluster of AIX nodes and/or help run cluster software on AIX
- IBM cluster products (including RSCT, PowerHA, and the VIOS) will use and/or call CAA services/tools
- CAA services can assist in the management and monitoring of an arbitrary set of nodes and/or a third-party cluster
- CAA does not form a cluster by itself; it is a tool set. There is no notion of quorum (if 20 nodes of a 21-node cluster are down, CAA still runs on the remaining node)
- CAA does not eject nodes from a cluster. CAA provides tools to fence a node, but never fences a node itself and will continue to run on a fenced node
Major Benefits:
- Enhanced health management (integrated health monitoring)
- Cluster-wide device naming
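Once a cluster exists, the CAA services listed above can be inspected directly from AIX; a minimal sketch using lscluster (run from any node; verify the flags on your AIX level):

# lscluster -m        # cluster node configuration and state
# lscluster -i        # network interfaces known to CAA
# lscluster -d        # cluster storage, including the repository disk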
RSCT Consumers: IBM Storage, HPC, PowerHA SystemMirror, VIOS
[Diagram: legacy RSCT - bundled resource managers, group services, messaging API, cluster messaging, resource manager services, monitoring API, cluster monitoring, cluster admin UI, cluster configuration repository - alongside the redesigned cluster layers that integrate with CAA for messaging, monitoring, and the configuration repository]
- RSCT and Cluster Aware AIX together provide the foundation of strategic Power Systems software
- RSCT-CAA integration enables compatibility with a diverse set of dependent IBM products
- RSCT integration with CAA extends simplified cluster management, along with optimized and robust cluster monitoring, failure detection, and recovery, to RSCT exploiters on Power / AIX
Direction:
- In the first release, support is confined to shared storage
- Will eventually evolve into a general AIX device rename interface
- Future direction is to enable cluster-wide storage policy settings
- The PowerHA ODM will eventually also move entirely to the repository disk
PowerHA SystemMirror 6.1 & prior: each node keeps its own HA ODM, kept consistent through cluster synchronization.
PowerHA SystemMirror 7.1: the cluster configuration is held on a central repository shared by all nodes.
PowerHA SystemMirror will continue to run if the central repository disk goes away; however, no changes may take place within the cluster.
[Diagram: heartbeats and reliable messaging between LPAR 1 and LPAR 2 flow over the network, the SAN, and the repository disk]
Highlights:
- RSCT Topology Services is no longer used for cluster heartbeating
- All customers now have multiple communication paths by default
[Diagram: cluster spanning two sites, each with its own SAN network, with disk replication between them]
Capabilities:
- Crossover connections
- Virtualized resources
- Multiple resource groups and mutual takeover
- Custom resource groups and adaptive fallover
- NFS cross-mounts
- File collections
- Dependencies: parent / child, location, start after, stop after
- Smart Assists
- Multiple sites: cross-site LVM configs, storage replication, IP replication
- Application monitoring
- Pager events
- DLPAR integration (grow LPAR on fallover)
Resource group fallover policies: one-to-one, one-to-any, any-to-one, any-to-any
Virtualization
[Diagram: two frames, each with dual VIO servers virtualizing NIC and HBA resources; Node A and Node B boot from VIO-served rootvg disks (hdisk4) and share the oracle_vg1 LUNs (hdisk1, hdisk2) presented from the storage subsystem across the SAN]
PowerHA Cluster:
- LPAR / DLPAR
- Micro-partitioning & shared processor pools
- Virtual I/O Server: Virtual Ethernet, Virtual SCSI, Virtual Fibre Channel
[Diagram: PowerHA node LPARs with virtual Ethernet (en0) and virtual Fibre Channel (vfc0/vfc1) adapters for rootvg and data LUNs, connected to the LAN and SAN through the VIO servers]
Storage Virtualization
Both methods of virtualizing storage are supported: VSCSI and virtual Fibre Channel (NPIV). In DR implementations that leverage disk replication, consider the implications of using either option.
Benefits of virtualization:
- Maximize utilization of resources
- Fewer PCI slots & physical adapters
- Foundation for advanced functions like Live Partition Mobility
- Migrations to newer Power hardware are simplified
* Live Partition Mobility & PowerHA SystemMirror complement each other: maintenance (non-reactive) vs. high availability (reactive)
[Diagram: SEA fallover configuration on Frame 1 - PowerHA LPAR 1 and LPAR 2 each use a virtual en0 (PVID 10); each VIO server bridges a physical ent0 and a virtual trunk adapter ent2 through ent4 (SEA) to redundant Ethernet switches, with additional virtual adapters (ent5/ent6, PVID 99) forming the control channel between the VIO servers]
This is a diagram of the configuration required for SEA fallover across VIO Servers. Note that Ethernet traffic will not be load balanced across the VIO Servers. The lower trunk priority on the ent2 virtual adapter would designate the primary VIO Server to use.
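A minimal sketch of how the SEA in this diagram might be created on each VIO server (run as padmin); the adapter numbers and PVID follow the diagram, and the ha_mode/ctl_chan attributes are what enable the fallover behavior - treat the exact names as assumptions for your own configuration:

$ mkvdev -sea ent0 -vadapter ent2 -default ent2 -defaultid 10 -attr ha_mode=auto ctl_chan=ent5
$ lsdev -dev ent4 -attr                           # confirm the SEA attributes

The primary/backup roles come from the trunk priority set on the ent2 virtual adapters in the VIO server profiles, as noted above.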
[Diagram: the same SEA fallover design extended across Frame 1 and Frame 2 - each frame has dual VIO servers with physical ent0/ent1 aggregated into ent3 (LA), bridged through ent4 (SEA) with ent2 trunk adapters and control channels, serving en0 in PowerHA LPAR 1 and PowerHA LPAR 2 through redundant Ethernet switches]
[Diagram: in 6.1 and below, Topology Services (topsvcs) heartbeating runs over the en0 interfaces and a serial_net_0 network between the nodes, on top of the same virtualized SEA configuration]
Differences:
- RSCT Topology Services is no longer used for heartbeat monitoring
- Subnet requirements no longer need to be followed
- The netmon.cf file is no longer required or used
- All interfaces are used for monitoring even if they are not in an HA network (this may be tunable in a future release)
- IGMP snooping must be enabled on the switches (see the multicast check sketched below)
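One way to sanity-check the multicast path that 7.1 heartbeating relies on is the mping utility shipped with CAA; a minimal sketch where the multicast group address is hypothetical and the flags should be verified on your AIX level:

# mping -v -r -a 228.168.101.43        # start the receiver on one node first
# mping -v -s -a 228.168.101.43        # then send from the other node

If packets are not reported on both sides, revisit the IGMP snooping / multicast configuration on the switches.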
[Diagram: VSCSI vs. NPIV on Frame 1 and Frame 2 - each node boots rootvg from a VSCSI disk (vhost0 / vscsi adapters) and sees the vscsi_vg disks (hdisk1, hdisk2) with MPIO through both VIO servers, while the npiv_vg disks (hdisk3, hdisk4) are mapped with MPIO through virtual Fibre Channel adapters (fcs0, fcs1); all LUNs come from the storage subsystem]
Considerations:
- This is a planned move
- It assumes that all resources are virtualized through VIO (storage & Ethernet connections)
- PowerHA should only experience a minor disruption to the heartbeats during a move
- IVE / HEA virtual Ethernet is not supported for LPM
- VSCSI & NPIV virtual Fibre Channel mappings are supported
[Diagram: Live Partition Mobility moves the LPAR between Frame 1 and Frame 2; each frame has dual VIO servers (VIOS 1, VIOS 2) and the partition's rootvg and datavg stay on the SAN]
The two solutions complement each other by providing the ability to perform non-disruptive maintenance while retaining the ability to fallover in the event of a system or application outage.
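A minimal sketch of driving an LPM move from the HMC command line; the managed system and partition names are hypothetical, and a validation pass normally precedes the actual migration:

# migrlpar -o v -m Frame1 -t Frame2 -p powerha_node1        # validate the move
# migrlpar -o m -m Frame1 -t Frame2 -p powerha_node1        # perform the migration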
* ~2 seconds of total interruption time
** Requires free system resources on the target system
Results:
- 120 GB DLPAR add took 1 min 55 sec
- 246 GB DLPAR add took 4 min 25 sec
- At 30% busy running an artificial load, the add took 4 min 36 sec
[Diagram: the cluster nodes use ssh communication to the HMCs - LPAR A hosts the application server with the DLPAR-acquired CPU count, while LPAR B (backup) runs at its minimal CPU count]
[Diagram: four clusters spread across System A and System B - the production Oracle DB and Banner DB LPARs run with 1 CPU and acquire +1 / +2 CPUs via DLPAR along with the application, while their standby LPARs on the other system stay at 1 CPU]
Applications                       CPU    Memory
Production Oracle DB               2      16 GB
Production PeopleSoft              2      8 GB
AIX Print Server                   2      4 GB
Banner Financial DB                3      32 GB
Production Financial DB            3      32 GB
Tivoli Storage Manager 5.5.2.0     2      8 GB
The actual application requirements are stored in the PowerHA SystemMirror definitions and enforced during the acquisition or release of application server resources
During the acquisition of resources at cluster startup, the host will ssh to the pre-defined HMC(s) to perform the DLPAR operation automatically.
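A minimal sketch of the kind of call issued over ssh for a dedicated-processor DLPAR add; the HMC user, HMC name, managed system, and partition names are hypothetical (shared-processor LPARs would use --procunits instead of --procs):

# ssh hscroot@hmc01 'chhwres -r proc -m System_A -o a -p oracle_node1 --procs 2'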
[Diagram: DLPAR flow through the HMC - 1. the LPARs are activated at their profile minimum/desired settings; 2. starting PowerHA reads the application server requirements (e.g. min 1 / desired 2 / max 2 and min 1 / desired 3 / max 3) and DLPAR-adds the CPUs where the resource groups come online; 3. a fallover or rg_move releases the CPUs on the source system and acquires them on the target; 4. stopping the cluster without takeover releases the resources]
Take Aways:
- CPU allocations follow the application server wherever it is being hosted (this model allows you to lower the HA license count)
- DLPAR resources will only get processed during the acquisition or release of cluster resources
- PowerHA 6.1+ provides micro-partitioning support and the ability to also alter virtual processor counts
- DLPAR resources can come from free CPUs in the shared processor pool or from CoD resources
Without the DLPAR model (standby LPARs sized at 2 and 3 CPUs on System B), PowerHA license counts: Cluster 1: 4 CPUs, Cluster 2: 6 CPUs, Cluster 3: 4 CPUs, Cluster 4: 6 CPUs - Total: 20 licenses
With the DLPAR model (standby LPARs kept at 1 CPU and grown only on fallover), PowerHA license counts: Cluster 1: 3 CPUs, Cluster 2: 4 CPUs, Cluster 3: 3 CPUs, Cluster 4: 4 CPUs - Total: 14 licenses
* Consolidated both production LPARs into one LPAR; control is separated by resource groups
[Diagram: cluster services run on both nodes; the shared LUNs in the enhanced concurrent mode (ECM) data VG are varied on in passive (read-only) mode on the standby node, Node B]
IBM Systems Director Plug-in:
- New for PowerHA SystemMirror 7.1
- Only for management of 7.1 & above
- Same look and feel as the IBM suite of products
- Will leverage an existing Director implementation
- Uses the clvt & clmgr CLIs behind the covers
Multiple WebSMIT users accessing multiple clusters through *one* WebSMIT server
Three-tier architecture provides scalability: user interface, management server, Director agent
Director Agent:
- Automatically installed on AIX 7.1 & AIX V6.1 TL06
- Communicates securely with the Director server
Director Server:
- Central point of control
- Supported on AIX, Linux, and Windows
- Agent manager
From this release forward, only clmgr is supported for customer use; clvt is strictly for use by the Smart Assists.
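A few illustrative clmgr invocations; the resource group and node names are hypothetical, and the verb/class syntax should be confirmed against the clmgr documentation on your level:

# clmgr query cluster                                  # show the cluster definition
# clmgr online cluster                                 # start cluster services on all nodes
# clmgr move resource_group rg_db node=node_b          # move a resource group to another node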
bos.cluster.rte (the Cluster Aware AIX fileset):
# clcmd lssrc -g caa
-------------------------------
NODE mutiny.dfw.ibm.com
-------------------------------
Subsystem         Group       PID
clcomd            caa         9502848
cld               caa         10551448
clconfd           caa         10092716
solid             caa         7143642
solidhac          caa         7340248
-------------------------------
NODE munited.dfw.ibm.com
-------------------------------
Subsystem         Group       PID
cld               caa         4390916
clcomd            caa         4587668
clconfd           caa         6357196
solidhac          caa         6094862
solid             caa         6553698
# clcmd lspv
-------------------------------
NODE mutiny.dfw.ibm.com
-------------------------------
hdisk0          0004a99c161a7e45
caa_private0    0004a99cd90dba78
hdisk2          0004a99c3b06bf99
hdisk3          0004a99c3b076c86
hdisk4          0004a99c3b076ce3
hdisk5          0004a99c3b076d2d
-------------------------------
NODE munited.dfw.ibm.com
-------------------------------
hdisk0          0004a99c15ecf25d
caa_private0    0004a99cd90dba78
hdisk2          0004a99c3b06bf99
hdisk3          0004a99c3b076c86
hdisk4          0004a99c3b076ce3
hdisk5          0004a99c3b076d2d
Attention: Sendmail must be working and accessible via the firewall to receive notifications
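A quick way to confirm the mail path before relying on cluster notifications; the recipient address is hypothetical:

# echo "PowerHA notification test" | mail -s "cluster mail test" [email protected]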
Syncd Setting
- Default value is 60; the recommended change is to 10 (see the sketch below)
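A minimal sketch of where the syncd interval typically lives, assuming it is started from /sbin/rc.boot on your AIX level; check first, then edit the value from 60 to 10 with your editor of choice (picked up at the next boot):

# grep syncd /sbin/rc.boot
        nohup /usr/sbin/syncd 60 > /dev/null 2>&1 &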
Failure Detection Rate (FDR) - only for version 6.1 & below:
- Normal settings should suffice in most environments (note that it can be tuned further)
- Remember to enable fast failure detection (FFD) when using disk heartbeating
Test Cluster
- LPARs within the same frame
- Virtual resources
Utilize available tools:
- Cluster Test Tool
- Testing upgrades: alternate disk install is your friend (see the sketch below)
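A minimal sketch of the alternate disk install approach for upgrade testing, assuming a free disk (hdisk1 is hypothetical) and the bos.alt_disk_install filesets are installed:

# alt_disk_copy -d hdisk1           # clone the running rootvg to hdisk1
# lspv | grep altinst_rootvg        # the clone appears as altinst_rootvg

Upgrade and test one copy while the other remains bootable as a fallback; the bootlist controls which image is used at the next restart.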
Best Practice: Testing should be the foundation for your documentation, in case someone who is not PowerHA savvy is on hand when a failure occurs.
munited /# cltopinfo -m
Interface Name    Adapter Address    Total Missed Heartbeats    Current Missed Heartbeats
------------------------------------------------------------------------------------------
en0               192.168.1.103      0                          0
rhdisk1           255.255.10.0       1                          1
Cluster Services Uptime: 30 days 0 hours 31 minutes
Summary
Review your infrastructure for potential single points of failure
Be aware of the potential pitfalls listed in the common mistakes slide
Popular Topics:
- Frequently Asked Questions
- Customer References
- Documentation
- White Papers
http://www-03.ibm.com/systems/power/software/availability/aix/index.html
(or Google "PowerHA SystemMirror" and click I'm Feeling Lucky)
Questions?
Additional Resources
New - Disaster Recovery Redbook
SG24-7841 - Exploiting PowerHA SystemMirror Enterprise Edition for AIX
http://www.redbooks.ibm.com/abstracts/sg247841.html?Open
Online Documentation
http://www-03.ibm.com/systems/p/library/hacmp_docs.html