2.4 Active-Active DR Solution
Foreword
● Information system interruptions can cause economic losses, damage to brand image,
and loss of important data. Disaster recovery (DR) solutions are adopted to limit the
impact of errors, faults, and disasters. For example,
the local high availability (HA) solution is designed to mitigate device faults,
the intra-city DR data center (DC) ensures continuity if a data center disaster
occurs, and the remote DR data center offers availability in the event of
regional disasters.
● This course focuses on the active-active DR solution, including its overview,
architecture, and key technologies.
3 Huawei Confidential
Objectives
Contents
1. Solution Overview
2. Solution Architecture
3. Key Technologies
Introduction to HyperMetro
● HyperMetro is Huawei's active-active storage solution.
● It adopts a high-reliability architecture and the data dual-write technology to ensure storage
redundancy and achieve service continuity and zero data loss.
[Figure: application layer spanning DC A and DC B, with image synchronization between the two DCs]
Basic Concepts of HyperMetro
● Protected object: Includes LUNs and protection groups (PGs), for which storage features (such as HyperMetro) are used to back up data and implement DR.
● HyperMetro domain: Consists of the local and remote storage systems and the quorum server. Application servers can access data across data centers using a HyperMetro domain.
● HyperMetro pair: Created between a local and a remote LUN within a HyperMetro domain. A HyperMetro pair comprises a local LUN (local storage) and a remote LUN (remote storage).
● HyperMetro CG: A collection of HyperMetro pairs that have a service relationship with each other.
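The relationships among these concepts can be sketched as a small data model. This is only an illustration: the class names, field names, and sample values below are hypothetical and do not correspond to any OceanStor API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lun:
    """A LUN identified by the storage system that owns it (hypothetical model)."""
    storage: str
    name: str


@dataclass
class HyperMetroPair:
    """One local LUN paired with one remote LUN."""
    local: Lun
    remote: Lun


@dataclass
class ConsistencyGroup:
    """HyperMetro pairs that have a service relationship with each other."""
    name: str
    pairs: list = field(default_factory=list)


@dataclass
class HyperMetroDomain:
    """Local storage system, remote storage system, and quorum server."""
    local_storage: str
    remote_storage: str
    quorum_server: str

    def create_pair(self, local: Lun, remote: Lun) -> HyperMetroPair:
        # A pair is only valid between the domain's local and remote storage systems.
        if local.storage != self.local_storage or remote.storage != self.remote_storage:
            raise ValueError("LUNs must belong to the domain's local and remote storage")
        return HyperMetroPair(local, remote)


domain = HyperMetroDomain("storage-A", "storage-B", "quorum-1")
pair = domain.create_pair(Lun("storage-A", "db-data"), Lun("storage-B", "db-data"))
cg = ConsistencyGroup("db-cg", [pair])
```

Grouping pairs in a CG reflects the definition above: pairs that serve the same application are managed together.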
Active-Passive DR and Active-Active DR
[Figure: active-passive DR and active-active DR, each shown across DC A and DC B]
Active-Active DR Solution
● Local HA solution
  ▫ Local HA solution
● Intra-city DR solution (≤ 100 km)
  ▫ Active-active DC solution
  ▫ Active-passive DR solution
● Remote DR solution (> 100 km)
  ▫ Geo-redundant 3DC DR solution
  ▫ Active-passive DR solution
Architecture of the Local HA Solution
[Figure: an application host cluster (Windows, Linux) connected through Ethernet switches in dual-redundancy networking to storage systems that use HyperMetro and heterogeneous virtualization, with a quorum server]
HyperMetro DC Architecture
[Figure: DC A and DC B connected by bare optical fiber, with active-active DR at each layer]
● Active-active DR at the network layer: highly reliable and optimal Layer 2 connection.
● Active-active DR at the application layer: cross-DC HA, load balancing, and migration scheduling supported for Oracle RAC, VMware, and FusionSphere.
● Active-active DR at the storage layer: active-active access and zero data loss, with OceanStor storage on the SAN in each DC.
HyperMetro DC Solution Modules
Solution module and design:
Storage layer
● Gateway-free active-active architecture.
● HyperMetro is an OceanStor Dorado feature that implements active-active at the storage layer, reducing points of failure and preventing I/O bottlenecks caused by storage virtualization gateways.
● The FastWrite feature reduces a standard write I/O from two round trips to one, improving write performance.
Network layer
● The Ethernet Virtual Network (EVN) and Virtual Extensible Local Area Network (VXLAN) technologies of Huawei CloudEngine series DC switches are used.
● The streamlined Layer 2 network with EVN and VXLAN allows Layer 2 network protocols to run on a Layer 3 network, ensuring cross-DC service interconnection and communication.
Computing layer
● Virtualization platforms such as Huawei FusionSphere and VMware allow for cross-DC clustering, meeting the active-active requirements of enterprises' mission-critical services.
Application layer
● Virtual clusters provide higher reliability for web services and applications, and achieve automatic service switchover based on load balancing.
● Databases are deployed on active-active LUNs across sites.
Transmission layer
● Huawei OptiX OSN series dense wavelength division multiplexing (DWDM) devices are used for active-active DCs.
● 1+1 protection schemes for links, boards, and devices meet the reliability requirements of various levels.
● Optimization methods such as dispersion compensation minimize the transmission-layer latency.
Contents
1. Solution Overview
2. Solution Architecture
3. Key Technologies
▫ Storage-Layer HyperMetro
▫ Computing-Layer HyperMetro
▫ Application-Layer HyperMetro
Arbitration Mechanism of HyperMetro
[Figure: the link between storage array A and storage array B is down; arbitration of the preferred site]
Mode 2: Static priority mode
● Working principle: In the event of disconnection between
Single Quorum Server
[Figure: storage A, storage B, and a single quorum server under different link and quorum server failures]
⮚ Without arbitration, interruptions between HyperMetro storage systems cause the following:
1. If storage A and storage B both keep providing services, split-brain occurs.
2. If storage A and storage B both stop providing services, services are interrupted.
⮚ As a component of the active-active solution, the quorum server is itself subject to reliability degradation.
⮚ If the quorum server fails, split-brain may occur or services may be interrupted because there is no arbitration after the two storage systems are disconnected.
Two Quorum Servers
[Figure: storage A and storage B with an active quorum server and a standby quorum server, shown in four failure scenarios]
⮚ If the active quorum server fails, storage A and storage B negotiate to switch arbitration to the
standby quorum server. If storage A fails, the standby quorum server executes arbitration.
⮚ If the link between the active quorum server and storage B is down, storage A and storage B
negotiate to switch arbitration to the standby quorum server. If storage A fails, the standby quorum
server executes arbitration.
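The two switchover cases above can be sketched as a simple selection function. This is a minimal illustration under stated assumptions: the function and parameter names are hypothetical, and the sketch assumes that losing either storage system's link to the active quorum server is enough to trigger the negotiated switch.

```python
def select_quorum_server(active_up: bool, link_a_ok: bool, link_b_ok: bool) -> str:
    """Return which quorum server should arbitrate (hypothetical helper).

    active_up: the active quorum server is running.
    link_a_ok / link_b_ok: storage A / storage B can reach the active server.
    """
    if active_up and link_a_ok and link_b_ok:
        return "active"
    # The active server failed, or one storage system lost its link to it:
    # the two storage systems negotiate a switch to the standby quorum server.
    return "standby"
```

For example, `select_quorum_server(True, True, False)` models the case in which the link between the active quorum server and storage B is down.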
Arbitration in Static Priority Mode
No. | Fault Type | Pair Running Status | Result
1 | The link between two DCs is down. | To be synchronized | The LUN in DC A continues providing services while the LUN in DC B stops.
2 | DC B is faulty. | To be synchronized | The LUN in DC A continues providing services while the LUN in DC B stops.
3 | DC A is faulty. | To be synchronized | The LUNs in both DCs stop. You must forcibly start the HyperMetro pair to enable the LUN in DC B to provide services.
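The three cases in this table can be expressed as a small decision function. The sketch below assumes that DC A is the preferred site, which is what the results imply; the enum, function name, and return format are hypothetical and only illustrate the rule.

```python
from enum import Enum


class Fault(Enum):
    INTER_DC_LINK_DOWN = "the link between the two DCs is down"
    DC_B_FAULTY = "DC B is faulty"
    DC_A_FAULTY = "DC A is faulty"


def arbitrate_static_priority(fault: Fault) -> dict:
    """Return which LUNs keep serving I/O, assuming DC A is the preferred site."""
    if fault is Fault.DC_A_FAULTY:
        # The preferred site is gone and DC B does not take over automatically;
        # an administrator must forcibly start the pair on DC B.
        return {"DC A": False, "DC B": False, "force_start_needed": True}
    # Link failure or DC B failure: the preferred site keeps providing services.
    return {"DC A": True, "DC B": False, "force_start_needed": False}
```

The function makes the asymmetry of static priority mode visible: only a fault at the preferred site leads to a service interruption that needs manual recovery.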
Arbitration in Quorum Server Mode (1)
[Table: arbitration results in quorum server mode, with columns No., Diagram, Pair Running Status, and Result]
Arbitration in Quorum Server Mode (2)
No. | Pair Running Status | Result
1 | To be synchronized | Simultaneous failure: The LUN in DC A stops while the LUN in DC B continues providing services.
2 | To be synchronized | Simultaneous failure: The LUNs in both DCs stop. You must forcibly start the HyperMetro pair to enable the LUN in DC B to provide services.
3 | To be synchronized | Simultaneous failure: The LUNs in both DCs stop. You must forcibly start the HyperMetro pair to enable the LUN in DC A or B to provide services.
4 | Normal | The LUNs in both DCs continue providing services.
HyperMetro Write I/O Process
[Figure: write I/O steps 1 to 5 between the host, the DCL log, and the caches of the local LUN (local storage system) and the remote LUN (remote storage system), which are connected through DWDM over the intra-city network]
HyperMetro Read I/O Process
[Figure: read I/O steps 1 to 5 between the application server, the HyperMetro management module, the local LUN in DC A, and the remote LUN in DC B]
Dual-Write Performance Optimization - FastWrite
[Figure: write sequence between a host and two Huawei storage systems 100 km apart, connected over 8 Gbit/s FC/10GE]
General solution:
1. Write Command
2. Transfer Ready
3. Data Transfer
4. Write Command
5. Transfer Ready (RTT-1)
6. Data Transfer
7. Status Good (RTT-2)
8. Status Good
FastWrite:
1. Command
2. Ready
3. Data Transfer
4. Write Command & Data Transfer
5. Status Good (RTT-1)
6. Status Good
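The benefit of cutting the inter-site exchange from two round trips to one can be estimated with simple arithmetic. The sketch below is a rough estimate under stated assumptions: it uses a one-way propagation delay of about 5 µs per km in optical fiber and ignores device processing and queuing time; the function name and constant are hypothetical.

```python
FIBER_DELAY_US_PER_KM = 5.0  # assumed one-way propagation delay in optical fiber


def link_latency_ms(distance_km: float, round_trips: int) -> float:
    """Propagation latency that the inter-site link adds to one write."""
    rtt_us = 2 * distance_km * FIBER_DELAY_US_PER_KM
    return round_trips * rtt_us / 1000.0


general = link_latency_ms(100, round_trips=2)    # RTT-1 + RTT-2
fastwrite = link_latency_ms(100, round_trips=1)  # RTT-1 only
print(f"general: {general} ms, FastWrite: {fastwrite} ms")
# prints "general: 2.0 ms, FastWrite: 1.0 ms"
```

Under these assumptions, each write at 100 km spends about 1 ms less on the link, and the saving grows linearly with distance.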
Host Access Optimization
[Figure: host access to HyperMetro LUNs in a short-distance deployment (local HA) and in a long-distance deployment (site A and site B)]
Data Zero Copy
[Figure: full copy of non-zero data blocks compared with zero copy of zero-page data blocks]
Status Changing of a HyperMetro Pair/Consistency Group
[Figure: running status transitions of a HyperMetro pair or consistency group among Normal, Paused, Synchronizing, To be synchronized, and Force start, triggered by Pause, Synchronization, Synchronized, Fault, and Force start]
HyperMetro Management
Operation | Prerequisite
Synchronizing a HyperMetro pair | The pair/CG running status is Paused, To be synchronized, or Force Start and the links between devices are normal.
Pausing a HyperMetro pair | The pair/CG running status is Normal or Synchronizing and the links between devices are normal.
Switching the preferred site for a HyperMetro pair | The pair/CG running status is Normal, Synchronizing, Paused, To be synchronized, or Force Start and the links between devices are normal.
Forcibly starting a HyperMetro pair | The pair/CG running status is Paused or To be synchronized, and the local resource data status is unreadable and unwritable. Links between storage devices are disconnected. The host access status of the remote LUN is Access denied. Note: To ensure data security, stop service hosts before forcibly starting a HyperMetro pair. Start hosts and services after the HyperMetro pair is started.
Deleting a HyperMetro pair | The pair/CG running status is Paused, To be synchronized, or Force Start.
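The prerequisites for most of these operations reduce to a lookup on the running status and the link state. The sketch below encodes four of the rows as a table; the operation keys and the function name are hypothetical, and forcible start is left out because it also depends on the local data status and the remote LUN's host access status.

```python
ALL_LINKS_NORMAL = True  # the row requires normal links between devices
NO_LINK_RULE = None      # the row states no link condition

PREREQUISITES = {
    # operation: (allowed pair/CG running statuses, link requirement)
    "synchronize": ({"Paused", "To be synchronized", "Force Start"}, ALL_LINKS_NORMAL),
    "pause": ({"Normal", "Synchronizing"}, ALL_LINKS_NORMAL),
    "switch_preferred_site": (
        {"Normal", "Synchronizing", "Paused", "To be synchronized", "Force Start"},
        ALL_LINKS_NORMAL,
    ),
    "delete": ({"Paused", "To be synchronized", "Force Start"}, NO_LINK_RULE),
}


def can_run(operation: str, status: str, links_normal: bool) -> bool:
    """Check whether an operation's prerequisites are met."""
    allowed_statuses, link_rule = PREREQUISITES[operation]
    if status not in allowed_statuses:
        return False
    return link_rule is NO_LINK_RULE or links_normal == link_rule
```

For instance, `can_run("pause", "Paused", True)` is false because a pair that is already paused cannot be paused again.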
What Is a Cluster?
● A cluster is a parallel or distributed system consisting of interconnected computers.
[Figure: a user accessing VMs on a cluster of four nodes (Node 01 to Node 04) that provides high scalability, high availability, and high manageability]
Cluster Technology
● According to the cluster technology, clusters can be classified into high-performance,
high-availability, and high-scalability clusters.
[Figure: a client reaching a server cluster of application servers through a load balancer]
● Client: 4.3.2.1
● Load balancer virtual IP address (VIP): 6.6.6.100
● Application servers: 192.168.1.10, 192.168.1.11, 192.168.1.12
● Load balancer → Server: source IP address = 4.3.2.1, destination IP address = 192.168.1.10
● Server → Client: source IP address = 192.168.1.10, destination IP address = 4.3.2.1
● Load balancer → Client: source IP address = VIP (6.6.6.100), destination IP address = 4.3.2.1
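The address rewriting performed by the load balancer in this example can be sketched as two small functions. The IP addresses come from the slide; the `Packet` class and function names are hypothetical, and server selection is reduced to picking the first server.

```python
from dataclasses import dataclass, replace

VIP = "6.6.6.100"
SERVERS = ["192.168.1.10", "192.168.1.11", "192.168.1.12"]


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


def to_server(packet: Packet, server: str) -> Packet:
    """Inbound: rewrite the destination from the VIP to the chosen server."""
    if packet.dst != VIP:
        raise ValueError("inbound packets must be addressed to the VIP")
    return replace(packet, dst=server)


def to_client(packet: Packet) -> Packet:
    """Outbound: rewrite the source from the server's address back to the VIP."""
    if packet.src not in SERVERS:
        raise ValueError("outbound packets must come from a cluster server")
    return replace(packet, src=VIP)


request = Packet(src="4.3.2.1", dst=VIP)
forwarded = to_server(request, SERVERS[0])            # 4.3.2.1 -> 192.168.1.10
reply = Packet(src=forwarded.dst, dst=forwarded.src)  # 192.168.1.10 -> 4.3.2.1
returned = to_client(reply)                           # 6.6.6.100 -> 4.3.2.1
```

The client only ever sees the VIP, so servers can be added to or removed from the cluster without the client noticing.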
Virtualization Cluster
[Figure: a FusionSphere cluster spanning DC A and DC B; the storage systems in the two DCs run HyperMetro over DWDM devices and connect over IP to a quorum server]
Introduction to HA
● High availability (HA), also known as cluster high availability service, consists of host HA and VM HA.
When a server becomes faulty, services running on the faulty server will restart on another server if HA is
enabled.
Server HA
• The HA subsystem monitors the running status of hosts in the cluster in real time. If a host is abnormal, the subsystem uses multiple methods (checking network heartbeat, storage heartbeat, and network ping) to confirm faults on the host, ensuring fault detection accuracy and reducing misjudgment.
• If the host is confirmed to be faulty, the HA subsystem restarts the services of the faulty host on another suitable host in the cluster.
VM HA
• The HA subsystem monitors the running status of VMs that have been registered for protection in the cluster in real time. If a VM is abnormal, the subsystem uses multiple methods (checking VM heartbeat, disk I/O, and network I/O) to confirm the VM fault, ensuring fault detection accuracy and reducing misjudgment.
• If the VM is confirmed to be faulty, the HA subsystem restarts the faulty VM on another suitable host in the cluster.
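The host fault confirmation and restart steps can be sketched as follows. This is a minimal illustration, not the HA subsystem's actual logic: it assumes a host is confirmed faulty only when all three checks fail, and it picks the first healthy host by name as the restart target. Both rules and all names are hypothetical.

```python
from typing import Optional


def host_confirmed_faulty(network_heartbeat_ok: bool,
                          storage_heartbeat_ok: bool,
                          ping_ok: bool) -> bool:
    """Assumed rule: any passing check means the host is still alive."""
    return not (network_heartbeat_ok or storage_heartbeat_ok or ping_ok)


def pick_restart_host(host_health: dict, faulty_host: str) -> Optional[str]:
    """Return a healthy host other than the faulty one, or None if there is none."""
    candidates = sorted(
        name for name, healthy in host_health.items()
        if healthy and name != faulty_host
    )
    return candidates[0] if candidates else None
```

Requiring several independent checks to fail is what reduces misjudgment: a lost network heartbeat alone does not trigger a restart if the storage heartbeat is still present.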
VM HA
[Figure: VM HA coordinated by the VRM across hosts in RACK01]
Management Node HA
[Figure: active and standby VRM nodes detect heartbeats and execute the synchronization policy; each node runs a data synchronization service and connects through a vSwitch]
Dynamic Resource Scheduling (DRS)
[Figure: CPU and memory usage of hosts A, B, and C, imbalanced before scheduling and balanced after scheduling]
● DRS characteristics: automatic migration, intelligent algorithms, and flexible policies.
● Scheduling options: load balancing, adjustable threshold, and periodic scheduling.
● Custom cluster scheduling rules:
  ▫ Keeping VMs together: VMs must run on the same host.
  ▫ Keeping VMs mutually exclusive: VMs must run on different hosts.
  ▫ VMs to hosts: A VM group can be forcibly or non-forcibly clustered or mutually exclusive in a host group.
Load balancing and custom cluster scheduling rules are used to improve the running efficiency of service VMs.
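Checking a VM placement against the two VM-to-VM rules, together with a threshold-based imbalance test, can be sketched as below. The function names are hypothetical, and the imbalance test (gap between the busiest and the least busy host compared with a threshold) is an assumed simplification, not the DRS algorithm itself.

```python
def rule_violations(placement: dict, keep_together: list, keep_apart: list) -> list:
    """placement maps VM name to host name; return the rules that are violated."""
    problems = []
    for group in keep_together:
        # VMs kept together must all share one host.
        if len({placement[vm] for vm in group}) > 1:
            problems.append(("keep_together", tuple(group)))
    for group in keep_apart:
        # Mutually exclusive VMs must each be on a different host.
        hosts = [placement[vm] for vm in group]
        if len(set(hosts)) < len(hosts):
            problems.append(("keep_apart", tuple(group)))
    return problems


def needs_balancing(host_load_percent: dict, threshold: int) -> bool:
    """Assumed test: load gap between busiest and idlest host exceeds the threshold."""
    loads = host_load_percent.values()
    return max(loads) - min(loads) > threshold
```

A scheduler built on these checks would only accept a migration plan that removes the imbalance without introducing a rule violation.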
B/S and C/S Architectures
[Figure: C/S architecture: the client (client program) accesses the server, which queries the database server. B/S architecture: the browser accesses the web server, which queries the APP server (middleware).]
Working Principles of Active-Active B/S Applications
[Figure: a client accessing a WebLogic cluster]
Active-Active B/S Applications — Load Balancing
[Figure: clients reach DC A and DC B through GSLB and SLB devices; each DC has a resource pool of web servers, the APP servers form an active-active cluster, and the DB servers form a cross-DC database cluster; primary and backup paths are marked]
Active-Active C/S Applications — Non-Distributed
[Figure: clients connect to APP servers and DB servers in DC A and DC B; primary and backup paths are marked]
Active-Active C/S Applications — Distributed
[Figure: clients send DNS requests and reach APP servers and DB servers in DC A and DC B through GSLB and SLB devices; primary and backup paths are marked]
Active-Active Databases
[Figure: APP servers in DC A and DC B access a cross-DC database cluster of DB servers; primary and backup paths are marked]
Quiz
1. (True or false) The quorum server software in the HyperMetro solution can be
deployed on a virtualization platform. ( )
A. True
B. False
2. (Single-answer question) In static priority mode, if the link between two storage
systems goes down and the HyperMetro pair status is To be synchronized, which of
the following is the arbitration result? ( )
A. The LUN in DC A continues providing services while the LUN in DC B stops.
B. The LUN in DC B continues providing services while the LUN in DC A stops.
C. The LUNs in both DCs stop. You must forcibly start the HyperMetro pair to enable the LUN
in DC B to provide services.
D. The LUNs in both DCs continue providing services.
Summary
Active-Active DR Solution
● Solution Overview
● Solution Architecture
● Key Technologies
  ▫ Storage-Layer HyperMetro
  ▫ Computing-Layer HyperMetro
  ▫ Application-Layer HyperMetro
More Information
Acronyms and Abbreviations
Acronym Full Name
ROW Redirect-on-write
FC Fibre Channel
IP Internet Protocol
Thank you.
Bring digital to every person, home, and
organization for a fully connected,
intelligent world.
Copyright©2023 Huawei Technologies Co., Ltd.
All Rights Reserved.