2.4 Active-Active DR Solution

huawei

Uploaded by

luis ardila
Active-Active DR Solution

Foreword
● Interruptions in the information system can cause economic losses,
damage to brand image, and loss of important data. Disaster recovery (DR)
solutions are adopted to limit the impact of errors, faults, and disasters. For example,
the local high availability (HA) solution is designed to mitigate device faults,
the intra-city DR data center (DC) ensures continuity if a data center disaster
occurs, and the remote DR data center offers availability in the event of
regional disasters.
● This course focuses on the active-active DR solution, including its overview,
architecture, and key technologies.
3 Huawei Confidential
Objectives

● Upon completion of this course, you will understand:

● Concept of the active-active mode

● Architecture of the active-active solution

● Key technologies of active-active DR

Contents

1. Solution Overview
2. Solution Architecture
3. Key Technologies

Introduction to HyperMetro
● HyperMetro is Huawei's active-active storage solution.
● It adopts a high-reliability architecture and the data dual-write technology to ensure storage
redundancy and achieve service continuity and zero data loss.

[Figure: at the application layer, application center A and application center B run in DC A and DC B, with image synchronization between the two DCs.]
Basic Concepts of HyperMetro

Basic Concept: Description

● Protected object: Includes LUNs and protection groups (PGs), for which storage features (such as HyperMetro) are used to back up data and implement DR.

● HyperMetro domain: Consists of the local and remote storage systems and the quorum server. Application servers can access data across data centers using a HyperMetro domain.

● HyperMetro pair: Created between a local and a remote LUN within a HyperMetro domain. A HyperMetro pair comprises a local LUN (local storage) and a remote LUN (remote storage).

● HyperMetro CG: A consistency group (CG), that is, a collection of HyperMetro pairs that have a service relationship with each other.
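
The relationship between these concepts can be sketched as a small data model. This is an illustration only: the class names, field names, and the array and LUN names are hypothetical and do not correspond to any Huawei API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lun:
    array: str  # storage system that owns the LUN
    name: str


@dataclass(frozen=True)
class HyperMetroPair:
    local: Lun   # LUN on the local storage system
    remote: Lun  # LUN on the remote storage system


@dataclass
class HyperMetroDomain:
    local_array: str
    remote_array: str
    quorum_server: str
    pairs: list = field(default_factory=list)

    def create_pair(self, local: Lun, remote: Lun) -> HyperMetroPair:
        # A pair is created between a local and a remote LUN of this domain.
        if local.array != self.local_array or remote.array != self.remote_array:
            raise ValueError("LUNs must belong to the domain's local and remote arrays")
        pair = HyperMetroPair(local, remote)
        self.pairs.append(pair)
        return pair


@dataclass
class ConsistencyGroup:
    name: str
    pairs: list = field(default_factory=list)  # pairs that serve the same application


domain = HyperMetroDomain("array-A", "array-B", "quorum-1")
data = domain.create_pair(Lun("array-A", "db_data"), Lun("array-B", "db_data"))
logs = domain.create_pair(Lun("array-A", "db_log"), Lun("array-B", "db_log"))
cg = ConsistencyGroup("database_cg", [data, logs])
```

Grouping the data and log pairs in one CG reflects the "service relationship" in the definition: pairs that belong to the same application are managed together.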

Active-Passive DR and Active-Active DR

Traditional active-passive DR (DC A and DC B)
• Only one DC is active.
• Data is synchronized periodically or in real time.
• If a disaster occurs, services are manually switched over to the DR system.

Active-active DR (DC A and DC B)
• Two DCs provide external services simultaneously.
• Data is synchronized in real time. The service host can access resources in any DC.
• Services are automatically switched in the event of a disaster.
Active-Active DR Solution
[Figure: DR solutions by distance from the local production center]
● Local production center: local HA solution
● Intra-city DR center (≤ 100 km): intra-city DR solution, comprising the active-active DC solution and the active-passive DR solution
● Remote DR center (> 100 km): remote DR solution, comprising the geo-redundant 3DC DR solution and the active-passive DR solution
Contents

1. Solution Overview
2. Solution Architecture
3. Key Technologies

Architecture of the Local HA Solution
[Figure: an application host cluster (Windows and Linux) connects through Fibre Channel switches (dual-redundancy networking) to two OceanStor storage systems running HyperMetro. Third-party storage is attached through heterogeneous virtualization. The storage systems connect through Ethernet switches (dual-redundancy networking) to the quorum server.]
HyperMetro DC Architecture
[Figure: DC A and DC B connected over bare optical fiber, each with hosts labeled ESXi, RAC, and CNA above a SAN and OceanStor storage.]
● Active-active DR at the network layer: highly reliable and optimal Layer 2 connection
● Active-active DR at the application layer: cross-DC HA, load balancing, and migration scheduling supported for Oracle RAC, VMware, and FusionSphere
● Active-active DR at the storage layer: active-active access and zero data loss
HyperMetro DC Solution Modules
Solution Module: Design

Storage layer
● Gateway-free active-active architecture
● HyperMetro is an OceanStor Dorado feature that implements active-active at the storage layer, reducing points of failure and preventing I/O bottlenecks caused by storage virtualization gateways.
● The FastWrite feature halves a standard write I/O process (two round trips to one) to improve the write performance.

Network layer
● The Ethernet Virtual Network (EVN) and Virtual Extensible Local Area Network (VXLAN) technologies of Huawei CloudEngine series DC switches are used.
● The streamlined layer-2 network with EVN and VXLAN allows layer-2 network protocols to run on a layer-3 network, ensuring the cross-DC service interconnection and communication.

Computing layer
● Virtualization platforms such as Huawei FusionSphere and VMware allow for cross-DC clustering, meeting the active-active requirements of enterprises' mission-critical services.

Application layer
● Virtual clusters provide higher reliability for web services and applications, and achieve automatic service switchover based on load balancing.
● Databases are deployed on active-active LUNs across sites.

Transmission layer
● Huawei OptiX OSN series dense wavelength division multiplexing (DWDM) devices are used for active-active DCs.
● 1+1 protection schemes for links, boards, and devices meet the reliability requirements of various levels.
● Optimization methods such as dispersion compensation minimize the transmission-layer latency.
Contents

1. Solution Overview
2. Solution Architecture
3. Key Technologies
▫ Storage-Layer HyperMetro

▫ Computing-Layer HyperMetro

▫ Application-Layer HyperMetro

Arbitration Mechanism of HyperMetro

[Figure: arbitration deployment. Storage array A and storage array B form a storage resource pool and both connect to a quorum server. When the link between the arrays fails, the preferred site wins arbitration.]

Mode 1: Quorum server mode
● Working principle: In the event of disconnection between HyperMetro storage systems, each storage system sends an arbitration request to the quorum server, and only the winner continues providing services. The preferred site takes precedence in arbitration.
● Application scenario: A quorum server is deployed at a third site.

Mode 2: Static priority mode
● Working principle: In the event of disconnection between HyperMetro storage systems, the preset preferred site wins the arbitration and provides services.
● Application scenario: The third-site quorum server is faulty.
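
The two arbitration modes can be sketched as a single decision function. This is a simplified illustration, not the product's algorithm: the function name, the site labels, and the `reachable_quorum` parameter are hypothetical, and returning no winner when no request reaches the quorum server is an assumption.

```python
from enum import Enum


class Mode(Enum):
    QUORUM_SERVER = "quorum server mode"
    STATIC_PRIORITY = "static priority mode"


def arbitrate(mode, preferred, reachable_quorum):
    """Return the site that keeps providing services after the
    link between the two storage systems is disconnected.

    mode: arbitration mode in effect when the link drops.
    preferred: label of the preset preferred site, for example "A".
    reachable_quorum: set of sites whose arbitration request reaches
        the quorum server (ignored in static priority mode).
    """
    if mode is Mode.STATIC_PRIORITY:
        # The preset preferred site always wins.
        return preferred
    if not reachable_quorum:
        # Assumption: without any request at the quorum server there is no winner.
        return None
    if preferred in reachable_quorum:
        # The preferred site takes precedence in arbitration.
        return preferred
    return sorted(reachable_quorum)[0]


winner = arbitrate(Mode.QUORUM_SERVER, "A", {"A", "B"})
```

Only the returned site continues providing services, which is what prevents both storage systems from accepting writes independently.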

Single Quorum Server

[Figure: fault diagrams for storage A, storage B, and the quorum server, labeled "HyperMetro link fault" and "Device fault".]

No quorum server
⮚ Without arbitration, interruptions between HyperMetro storage systems cause the following:
1. If storage A and storage B both keep providing services, split-brain occurs.
2. If storage A and storage B both stop providing services, services are interrupted.

Single quorum server
⮚ As a component of the active-active solution, the quorum server can itself fail, which degrades the reliability of the solution.
⮚ If the quorum server fails, split-brain may occur or services may be interrupted because there is no arbitration after the two storage systems are disconnected.
Two Quorum Servers
[Figure: four fault scenarios, each showing storage A and storage B with an active quorum server and a standby quorum server.]

⮚ If the active quorum server fails, storage A and storage B negotiate to switch arbitration to the standby quorum server. If storage A subsequently fails, the standby quorum server executes arbitration.
⮚ If the link between the active quorum server and storage B is down, storage A and storage B negotiate to switch arbitration to the standby quorum server. If storage A subsequently fails, the standby quorum server executes arbitration.
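
The switchover rule described above can be sketched as follows. The function and parameter names are hypothetical, and the sketch assumes that both storage systems can reach the standby quorum server whenever it is healthy.

```python
def select_quorum_server(active_ok, standby_ok, a_to_active_ok, b_to_active_ok):
    """Return which quorum server should execute arbitration.

    active_ok / standby_ok: health of the two quorum servers.
    a_to_active_ok / b_to_active_ok: state of the links from storage A
        and storage B to the active quorum server.
    """
    if active_ok and a_to_active_ok and b_to_active_ok:
        return "active"
    if standby_ok:
        # The storage systems negotiate to switch arbitration to the standby.
        return "standby"
    return None  # no quorum server is usable


choice = select_quorum_server(True, True, True, False)
```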

Arbitration in Static Priority Mode

[Diagram column: each row shows a HyperMetro pair between the LUN in DC A and the LUN in DC B, with the fault marked.]

1. Fault type: The link between two DCs is down.
Pair running status: To be synchronized
Result: The LUN in DC A continues providing services while the LUN in DC B stops.

2. Fault type: DC B is faulty.
Pair running status: To be synchronized
Result: The LUN in DC A continues providing services while the LUN in DC B stops.

3. Fault type: DC A is faulty.
Pair running status: To be synchronized
Result: The LUNs in both DCs stop. You must forcibly start the HyperMetro pair to enable the LUN in DC B to provide services.
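
The three rows above can be captured in a small lookup. This sketch assumes that DC A is the preset preferred site, which the results imply but the table does not state; the fault labels and the function name are hypothetical.

```python
# (state of LUN in DC A, state of LUN in DC B) per fault type,
# assuming DC A is the preferred site.
STATIC_PRIORITY_RESULTS = {
    "link between DCs down": ("serving", "stopped"),
    "DC B faulty": ("serving", "stopped"),
    "DC A faulty": ("stopped", "stopped"),
}


def static_priority_result(fault):
    lun_a, lun_b = STATIC_PRIORITY_RESULTS[fault]
    return {
        "DC A": lun_a,
        "DC B": lun_b,
        # When both LUNs stop, the pair must be forcibly started
        # before the LUN in DC B can provide services.
        "force_start_needed": lun_a == "stopped" and lun_b == "stopped",
    }


result = static_priority_result("DC A faulty")
```

The third row shows the cost of static priority mode: a failure of the preferred site stops services until an administrator intervenes.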

Arbitration in Quorum Server Mode (1)
[Diagram column: each row shows the quorum server and a HyperMetro pair between the LUN in DC A and the LUN in DC B, with the failed component or link marked.]

1. Pair running status: Normal
Result: The LUNs in both DCs continue providing services. HyperMetro automatically switches to static priority mode.

2. Pair running status: Normal
Result: The LUNs in both DCs continue providing services. HyperMetro automatically switches to static priority mode.

3. Pair running status: To be synchronized
Result: The LUN in DC A stops while the LUN in DC B continues providing services.

4. Pair running status: To be synchronized
Result: If DC A is the preferred site, the LUN in DC A continues providing services while the LUN in DC B stops.

5. Pair running status: To be synchronized
Result: Simultaneous failure: The LUNs in both DCs stop. You must forcibly start the HyperMetro pair to enable the LUN in DC B to provide services.
Arbitration in Quorum Server Mode (2)
[Diagram column: each row shows the quorum server and a HyperMetro pair between the LUN in DC A and the LUN in DC B, with the failed components or links marked.]

1. Pair running status: To be synchronized
Result: Simultaneous failure: The LUN in DC A stops while the LUN in DC B continues providing services.

2. Pair running status: To be synchronized
Result: Simultaneous failure: The LUNs in both DCs stop. You must forcibly start the HyperMetro pair to enable the LUN in DC B to provide services.

3. Pair running status: To be synchronized
Result: Simultaneous failure: The LUNs in both DCs stop. You must forcibly start the HyperMetro pair to enable the LUN in DC A or B to provide services.

4. Pair running status: Normal
Result: The LUNs in both DCs continue providing services.
HyperMetro Write I/O Process

[Figure: write I/O flow in six numbered steps among the host, the HyperMetro management module (with its DCL and log), the caches of the local LUN and the remote LUN, and the intra-city DWDM network that links the local and remote storage systems.]
HyperMetro Read I/O Process

[Figure: read I/O flow in five numbered steps among the application server, the HyperMetro management module, the local LUN in DC A, and the remote LUN in DC B.]
Dual-Write Performance Optimization - FastWrite
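[Figure: write sequences between a host and Huawei storage at site A and Huawei storage at site B, 100 km apart over 8 Gbit/s FC/10GE links.]

General solution
1. Write Command
2. Transfer Ready
3. Data Transfer
4. Write Command
5. Transfer Ready (completes RTT-1)
6. Data Transfer
7. Status Good (completes RTT-2)
8. Status Good

FastWrite
1. Command
2. Ready
3. Data Transfer
4. Write Command & Data Transfer
5. Status Good (completes RTT-1)
6. Status Good

In the general solution, steps 4 to 7 cross the inter-site link and take two round trips (RTT-1 and RTT-2). FastWrite combines the write command and the data transfer into step 4, so steps 4 and 5 take a single round trip (RTT-1).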
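
A rough calculation shows what saving one round trip means at the 100 km distance in the figure. The propagation delay of about 5 microseconds per kilometre in optical fiber is a general approximation, not a figure from this course, and the calculation ignores switch, DWDM, and array processing time.

```python
FIBER_DELAY_US_PER_KM = 5.0  # assumption: one-way propagation delay in fiber


def replication_link_latency_ms(distance_km, round_trips):
    """Propagation latency added by the inter-site link for one write."""
    rtt_us = 2 * distance_km * FIBER_DELAY_US_PER_KM
    return round_trips * rtt_us / 1000.0


general_ms = replication_link_latency_ms(100, round_trips=2)
fastwrite_ms = replication_link_latency_ms(100, round_trips=1)
print(f"general: {general_ms} ms, FastWrite: {fastwrite_ms} ms")
```

Under these assumptions each write spends about 2 ms on the link in the general solution and about 1 ms with FastWrite.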

Host Access Optimization
[Figure: HyperMetro LUNs in a local HA (short-distance) deployment, and HyperMetro LUNs at site A and site B in a long-distance deployment.]

Load balancing mode
● This mode achieves I/O load balancing across storage systems.
● This mode applies to short-distance deployments, such as in the same equipment room.
● I/Os are evenly distributed to two storage systems to maximize resource utilization and improve performance.

Local preferred mode
● This mode greatly reduces data access across sites to shorten the transmission latency.
● This mode applies to long-distance deployments.
● The hosts at site A preferentially access the storage system at site A, and the hosts at site B preferentially access the storage system at site B. I/Os are delivered only to the preferred storage system.
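
The difference between the two host access modes can be sketched as a target selector. The class name and mode strings are hypothetical, and round-robin is only one possible way to distribute I/Os evenly; the multipathing software's real policy is not described here.

```python
import itertools


class PathSelector:
    """Choose the storage system that receives the next host I/O."""

    def __init__(self, mode, host_site, sites=("A", "B")):
        if mode not in ("load_balancing", "local_preferred"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.host_site = host_site
        self._round_robin = itertools.cycle(sites)

    def next_target(self):
        if self.mode == "local_preferred":
            # I/Os go only to the storage system at the host's own site.
            return self.host_site
        # Assumption: even distribution is approximated by round-robin.
        return next(self._round_robin)


balanced = PathSelector("load_balancing", host_site="A")
local = PathSelector("local_preferred", host_site="B")
balanced_targets = [balanced.next_target() for _ in range(4)]
local_targets = [local.next_target() for _ in range(4)]
```

Local preferred mode keeps every I/O off the inter-site link, which is why it suits long-distance deployments.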

Data Zero Copy

[Figure: site A storage holds 24 blocks: A to H (8 non-zero blocks), 12 all-zero blocks, and I to L (4 non-zero blocks). The general solution makes a full copy of the 8, 12, and 4 blocks to site B storage. The thin copy solution makes a full copy of the 8 and 4 non-zero blocks and sends one command for the 12 zero-page blocks.]

General data synchronization solution
● General solution: When data is being synchronized, all-zero data is not identified and all data blocks are copied one by one.
● The initial data synchronization requires a large bandwidth and takes a long time.

Huawei thin copy solution
● Thin copy solution: When data is being synchronized, all-zero data is intelligently identified. Only an identifier is transferred, but data is not.
● The initial data synchronization duration is shortened by 90%, and the initial data synchronization link consumes 90% less bandwidth resources.
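
The idea behind thin copy can be sketched in a few lines: detect all-zero blocks and send a marker for a whole run of them instead of the data. The function name, the operation tuples, and the tiny block size are hypothetical and chosen only to mirror the 24-block figure.

```python
BLOCK_SIZE = 4  # hypothetical, tiny for illustration


def plan_sync(blocks):
    """Return the operations needed to synchronize `blocks`.

    ("data", index, payload) copies one non-zero block.
    ("zero", start, count) marks a run of all-zero blocks without sending data.
    """
    ops = []
    for index, block in enumerate(blocks):
        if not any(block):  # every byte is zero
            last = ops[-1] if ops else None
            if last and last[0] == "zero" and last[1] + last[2] == index:
                ops[-1] = ("zero", last[1], last[2] + 1)  # extend the current run
            else:
                ops.append(("zero", index, 1))
        else:
            ops.append(("data", index, block))
    return ops


nonzero = b"\x01" * BLOCK_SIZE
zero = bytes(BLOCK_SIZE)
blocks = [nonzero] * 8 + [zero] * 12 + [nonzero] * 4

ops = plan_sync(blocks)
sent = sum(len(op[2]) for op in ops if op[0] == "data")
total = sum(len(b) for b in blocks)
```

For the layout in the figure, 12 of the 24 blocks are zero, so this sketch transfers half of the data. The actual saving depends on how much of the LUN is all-zero.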

Status Changing of a HyperMetro Pair/Consistency Group
[Figure: state transitions of a HyperMetro member among Normal, Paused, Synchronizing, To be synchronized, and Force start, triggered by Pause, Synchronization, Synchronized, Fault, and Force start.]

HyperMetro Management Operation: Prerequisite

● Synchronizing a HyperMetro pair: The pair/CG running status is Paused, To be synchronized, or Force Start and the links between devices are normal.

● Pausing a HyperMetro pair: The pair/CG running status is Normal or Synchronizing and the links between devices are normal.

● Switching the preferred site for a HyperMetro pair: The pair/CG running status is Normal, Synchronizing, Paused, To be synchronized, or Force Start and the links between devices are normal.

● Forcibly starting a HyperMetro pair: The pair/CG running status is Paused or To be synchronized, and the local resource data status is unreadable and unwritable. Links between storage devices are disconnected. The host access status of the remote LUN is Access denied.
Note: To ensure data security, stop service hosts before forcibly starting a HyperMetro pair. Start hosts and services after the HyperMetro pair is started.

● Deleting a HyperMetro pair: The pair/CG running status is Paused, To be synchronized, or Force Start.
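
The prerequisites above lend themselves to a simple pre-check. This sketch covers only the running status and link state; the additional conditions for a forced start (local resource data status and remote LUN host access status) are left out, and the operation keys and function name are hypothetical.

```python
# operation -> (allowed running statuses, required link state)
# Required link state: True = links normal, False = links disconnected,
# None = the table states no link requirement.
PREREQUISITES = {
    "synchronize": ({"Paused", "To be synchronized", "Force Start"}, True),
    "pause": ({"Normal", "Synchronizing"}, True),
    "switch_preferred_site": (
        {"Normal", "Synchronizing", "Paused", "To be synchronized", "Force Start"},
        True,
    ),
    "force_start": ({"Paused", "To be synchronized"}, False),
    "delete": ({"Paused", "To be synchronized", "Force Start"}, None),
}


def is_allowed(operation, running_status, links_normal):
    statuses, required_links = PREREQUISITES[operation]
    if running_status not in statuses:
        return False
    return required_links is None or required_links == links_normal


can_pause = is_allowed("pause", "Normal", links_normal=True)
can_force = is_allowed("force_start", "To be synchronized", links_normal=False)
```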

Contents

1. Solution Overview
2. Solution Architecture
3. Key Technologies
▫ Storage-Layer HyperMetro

▫ Computing-Layer HyperMetro

▫ Application-Layer HyperMetro

What Is a Cluster?
● A cluster is a parallel or distributed system consisting of interconnected computers.

[Figure: a user accesses VMs running on a cluster of four nodes (Node 01 to Node 04). The cluster provides high scalability, high availability, and high manageability.]
Cluster Technology
● Clusters can be classified by purpose into high-performance, high-availability, and high-scalability clusters.

[Figure: High-scalability cluster: web cluster. A client (IP address 4.3.2.1) reaches a server cluster of three application servers (192.168.1.10, 192.168.1.11, and 192.168.1.12) through a load balancer with the virtual IP address (VIP) 6.6.6.100.]

1. User request: Source IP address = 4.3.2.1, Destination IP address = VIP (6.6.6.100)
2. Load balancer → Server: Source IP address = 4.3.2.1, Destination IP address = 192.168.1.10
3. Server → Client: Source IP address = 192.168.1.10, Destination IP address = 4.3.2.1
4. Load balancer → Client: Source IP address = VIP (6.6.6.100), Destination IP address = 4.3.2.1
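
The address rewriting in this flow can be reproduced with a minimal model. The `Packet` and `LoadBalancer` names are hypothetical, and picking servers in round-robin order is an assumption; the figure only shows the request landing on 192.168.1.10.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


class LoadBalancer:
    def __init__(self, vip, servers):
        self.vip = vip
        self._servers = cycle(servers)  # assumption: round-robin selection

    def forward(self, request):
        # Inbound: keep the client source, replace the VIP with a real server.
        if request.dst != self.vip:
            raise ValueError("request is not addressed to the VIP")
        return Packet(request.src, next(self._servers))

    def reply(self, response):
        # Outbound: hide the real server behind the VIP.
        return Packet(self.vip, response.dst)


lb = LoadBalancer("6.6.6.100", ["192.168.1.10", "192.168.1.11", "192.168.1.12"])
to_server = lb.forward(Packet("4.3.2.1", "6.6.6.100"))
to_client = lb.reply(Packet(to_server.dst, "4.3.2.1"))
```

The client only ever sees the VIP, so servers can be added to or removed from the cluster without any change on the client side.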

Virtualization Cluster
[Figure: a FusionSphere cluster spans DC A and DC B. The hosts in each DC attach over FC SAN or IP SAN to the storage system in that DC. The two storage systems are joined by a HyperMetro replication link (Fibre Channel or IP) carried over DWDM devices, and each connects over IP to the quorum server.]
Introduction to HA
● High availability (HA), also known as cluster high availability service, consists of host HA and VM HA.
When a server becomes faulty, services running on the faulty server will restart on another server if HA is
enabled.

Server HA
• The HA subsystem monitors the running status of hosts in the cluster in real time. If a host is abnormal, the subsystem uses multiple methods (checking network heartbeat, storage heartbeat, and network ping) to confirm faults on the host, ensuring fault detection accuracy and reducing misjudgment.
• If the host is confirmed to be faulty, the HA subsystem restarts the services of the faulty host on another healthy host in the cluster.

VM HA
• The HA subsystem monitors the running status of VMs that have been registered for protection in the cluster in real time. If a VM is abnormal, the subsystem uses multiple methods (checking VM heartbeat, disk I/O, and network I/O) to confirm the VM fault, ensuring fault detection accuracy and reducing misjudgment.
• If the VM is confirmed to be faulty, the HA subsystem restarts the faulty VM on another healthy host in the cluster.
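
Confirming a fault through several independent checks can be sketched as follows. The rule used here, that a host counts as faulty only when all three checks fail, is an assumption made for illustration; the course does not state how the checks are combined. All names are hypothetical.

```python
def host_confirmed_faulty(network_heartbeat_ok, storage_heartbeat_ok, ping_ok):
    # Assumption: any passing check means the host may still be alive,
    # which avoids restarting services because of a single lost signal.
    return not (network_heartbeat_ok or storage_heartbeat_ok or ping_ok)


def pick_restart_host(host_health, faulty_host):
    """Return another healthy host in the cluster, or None if there is none."""
    for name, healthy in host_health.items():
        if healthy and name != faulty_host:
            return name
    return None


cluster = {"host-1": False, "host-2": True, "host-3": True}
faulty = host_confirmed_faulty(False, False, False)
target = pick_restart_host(cluster, "host-1") if faulty else None
```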

VM HA
[Figure: the VRM manages compute nodes in two racks and restarts a VM from a faulty node on another node.]

● After detecting that a compute node or VM is faulty, the VRM restarts the faulty VM on a normal compute node based on the recorded VM information.
Management Node HA

[Figure: an active VRM node and a standby VRM node, each running a data synchronization service and connected through vSwitches.]

● The active and standby VRM nodes detect heartbeats and execute the synchronization policy.
Dynamic Resource Scheduling (DRS)
[Figure: CPU and memory usage bars for Host A, Host B, and Host C, shown for an imbalanced service load in a cluster and for the balanced service load after scheduling.]

● Automatic migration, intelligent algorithms, and flexible policies
● Load balancing, adjustable threshold, and periodic scheduling

Custom cluster scheduling rules
● Keeping VMs together: VMs must run on the same host.
● Keeping VMs mutually exclusive: VMs must run on different hosts.
● VMs to hosts: A VM group can be forcibly or non-forcibly clustered or mutually exclusive in a host group.

Load balancing and custom cluster scheduling rules are used to improve the running efficiency of service VMs.
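
The first two custom rules can be checked against a VM placement with a short function. The function name, rule labels, and VM and host names are hypothetical; the VM-to-host-group rule is not modeled.

```python
def rule_violations(placement, keep_together=(), keep_apart=()):
    """Return the scheduling rules that `placement` breaks.

    placement: dict mapping VM name to host name.
    keep_together: groups of VMs that must run on the same host.
    keep_apart: groups of VMs that must run on different hosts.
    """
    problems = []
    for group in keep_together:
        if len({placement[vm] for vm in group}) > 1:
            problems.append(("keep_together", tuple(group)))
    for group in keep_apart:
        hosts = [placement[vm] for vm in group]
        if len(set(hosts)) < len(hosts):
            problems.append(("keep_apart", tuple(group)))
    return problems


placement = {"VM1": "Host 1", "VM2": "Host 1", "VM3": "Host 2"}
ok = rule_violations(placement, keep_together=[("VM1", "VM2")], keep_apart=[("VM1", "VM3")])
bad = rule_violations(placement, keep_apart=[("VM1", "VM2")])
```

A scheduler would run a check like this before each migration so that load balancing never breaks a rule the administrator has set.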

Contents

1. Solution Overview
2. Solution Architecture
3. Key Technologies
▫ Storage-Layer HyperMetro

▫ Computing-Layer HyperMetro

▫ Application-Layer HyperMetro

B/S and C/S Architectures

[Figure: In the C/S architecture, a client (client program) accesses a server, the server queries a database server, and each returns the result. In the B/S architecture, a browser accesses a web server, the web server queries an APP server (middleware), and each returns the result.]
Working Principles of Active-Active B/S Applications

[Figure: a client reaches two web servers (Apache), each in front of an app server (WebLogic). The two app servers form a WebLogic cluster.]
Active-Active B/S Applications — Load Balancing
[Figure: clients reach DC A and DC B through GSLB and SLB devices. Each DC has a resource pool of Web servers, APP servers that form an active-active cluster across the DCs, and DB servers that form a cross-DC database cluster. Primary and backup paths are marked.]
Active-Active C/S Applications — Non-Distributed
[Figure: clients reach DC A and DC B. The APP servers form a cross-DC physical machine/VM cluster, and the DB servers form a cross-DC database cluster. Primary and backup paths are marked.]
Active-Active C/S Applications — Distributed
[Figure: clients send DNS requests and reach DC A and DC B through GSLB and SLB devices. The APP servers form a cross-DC physical machine/VM cluster, and the DB servers form a cross-DC database cluster. Primary and backup paths are marked.]
Active-Active Databases

[Figure: DC A and DC B. The APP servers form a cross-DC physical machine/VM cluster, and the DB servers form a cross-DC database cluster. Primary and backup paths are marked.]
Quiz
1. (True or false) The quorum server software in the HyperMetro solution can be
deployed on a virtualization platform. ( )
A. True
B. False
2. (Single-answer question) In static priority mode, if the link between two storage
systems breaks down and the HyperMetro pair status is To be synchronized, which of
the following is the arbitration result? ( )
A. The LUN in DC A continues providing services while the LUN in DC B stops.
B. The LUN in DC B continues providing services while the LUN in DC A stops.
C. The LUNs in both DCs stop. You must forcibly start the HyperMetro pair to enable the LUN
in DC B to provide services.
D. The LUNs in both DCs continue providing services.

Summary

Active-Active DR Solution
● Solution Overview
● Solution Architecture
● Key Technologies
▫ Storage-Layer HyperMetro
▫ Computing-Layer HyperMetro
▫ Application-Layer HyperMetro
More Information

● Huawei Data Storage Infocenter: https://info.support.huawei.com/storage/#/home

● Huawei enterprise website: https://e.huawei.com/en/

● Technical support: https://support.huawei.com/enterprise

● Online learning: https://e.huawei.com/en/talent/portal/#/

● Huawei support knowledge base: https://support.huawei.com/enterprise/en/knowledge?lang=en
Acronyms and Abbreviations
Acronym Full Name

IDC Internet Data Center

ROW Redirect-on-write

FC Fibre Channel

IP Internet Protocol

DWDM Dense wavelength division multiplexing

RAC Real application cluster

RTO Recovery Time Objective

RPO Recovery Point Objective

TCO Total cost of ownership

DCL Data change log

LUN Logical unit number

SAN Storage area network

Thank you.
Bring digital to every person, home, and
organization for a fully connected,
intelligent world.
Copyright©2023 Huawei Technologies Co., Ltd.
All Rights Reserved.

The information in this document may contain predictive
statements including, without limitation, statements regarding
the future financial and operating results, future product
portfolio, new technology, etc. There are a number of factors that
could cause actual results and developments to differ materially
from those expressed or implied in the predictive statements.
Therefore, such information is provided for reference purpose only
and constitutes neither an offer nor an acceptance. Huawei may
change the information at any time without notice.
