
153 Data Guard PDF

Data Guard is Oracle software that protects data by maintaining a standby copy of a production Oracle database. It automates the process of data replication, detection and resolution of data failures or corruptions. The key aspects are redo transport services to ship transaction logs from the primary database to the standby, and log apply services to maintain a physical or logical copy of the primary database. This provides high availability, disaster recovery capabilities and flexible options to balance data protection and performance.

Uploaded by

Sangeeth Talluri
Copyright
© © All Rights Reserved

Daniela Milanova

Senior Sales Consultant


Oracle Disaster
Recovery Solution
What is Data Guard?

Management, monitoring and automation software infrastructure that protects data against failures, errors, and corruption of the database
Automates the process of maintaining a copy of an Oracle production database (a standby database)
Data Guard Architecture
[Diagram: clients connect to the primary site and to the standby site; data changes flow from the primary database to the standby database]
Services types:
Log transport services
Log apply services
Role-management services
Software Data Guard
Requirements

Same release of Oracle Database Enterprise Edition must be installed for all databases
When using ASM/OMF, all databases should use the same combination
Hardware and OS Data Guard
Requirements
The hardware can be different for the primary and standby database
The operating system and platform architecture for the primary and standby databases must be the same
The operating system version for the primary and standby databases can be different
If all databases are on the same system, the OS must allow mounting more than one database with the same name.
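The software and hardware requirements above can be read as a checklist for a primary/standby pair. The following is a minimal sketch of such a check; the `DbHost` fields and the `check_pair` function are hypothetical names made up for illustration and are not part of any Oracle tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbHost:
    """Illustrative description of one database site (field names are assumptions)."""
    name: str
    release: str      # Oracle Database release string
    edition: str      # e.g. "Enterprise"
    os_platform: str  # operating system plus platform architecture
    os_version: str   # allowed to differ between primary and standby
    storage: str      # ASM/OMF combination in use


def check_pair(primary: DbHost, standby: DbHost) -> list:
    """Return the list of requirement violations for a primary/standby pair."""
    problems = []
    if primary.release != standby.release:
        problems.append("release mismatch")
    if primary.edition != "Enterprise" or standby.edition != "Enterprise":
        problems.append("Enterprise Edition required on all databases")
    if primary.os_platform != standby.os_platform:
        problems.append("OS/platform architecture mismatch")
    if primary.storage != standby.storage:
        problems.append("ASM/OMF combination mismatch")
    # Hardware and OS version are deliberately not compared: they may differ.
    return problems
```

An empty list means the pair satisfies the checks listed on these two slides; the sketch does not cover the same-system naming case.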
Data Guard
At the Highest Level
Data Guard comprises two parts
REDO APPLY
Maintains a physical, block-for-block copy of the Production (also called Primary) database.
Can be opened in read-only mode for short-term reporting
SQL APPLY
Maintains a logical, transaction-for-transaction copy of the Production database.
Can be opened read-write for reporting purposes and cloning activities
REDO Apply Architecture
[Diagram: the primary database ships redo, asynchronously or synchronously, over the network to the physical standby database, where MRP performs Redo Apply; backups are taken from the standby]


Maintains a Physical block for block copy of the Primary Database


SQL Apply Architecture

[Diagram: the primary database ships redo, asynchronously or synchronously, over the network to the logical standby database, which transforms redo to SQL and runs SQL Apply; the standby is continuously open for reports and can carry additional indexes and materialized views]
Maintains a Logical transactional copy of the Primary Database
Data Protection & Disaster Recovery
Solution with Reporting Capability
[Diagram: Data Guard sends data changes from the primary database at the primary site to a physical standby database and to a logical standby database, each at a standby site; clients use the primary, and reporting clients use the standby]
Data Guard Data Protection
Modes
Maximum protection
No data loss
If a remote write fails, the primary database shuts down
Maximum availability
No data loss
If a remote write fails, the primary database continues in maximum performance mode
Maximum performance
Highest possible level of data protection without affecting the performance of the primary database
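The difference between the three modes comes down to what the primary does when a write to the standby fails. A minimal sketch of that decision, following the slide only; the `Mode` enum and both function names are hypothetical and do not correspond to an Oracle API.

```python
from enum import Enum


class Mode(Enum):
    MAX_PROTECTION = "maximum protection"
    MAX_AVAILABILITY = "maximum availability"
    MAX_PERFORMANCE = "maximum performance"


def on_remote_write_failure(mode: Mode) -> str:
    """Describe the primary's reaction when redo cannot be written remotely."""
    if mode is Mode.MAX_PROTECTION:
        # Protecting data takes priority over keeping the primary running.
        return "primary shuts down"
    if mode is Mode.MAX_AVAILABILITY:
        # Availability takes priority: fall back to maximum performance behavior.
        return "primary continues in maximum performance mode"
    return "primary continues unaffected"


def promises_no_data_loss(mode: Mode) -> bool:
    """True for the two modes the slide labels 'No data loss'."""
    return mode in (Mode.MAX_PROTECTION, Mode.MAX_AVAILABILITY)
```

The sketch only models the slide's summary; it says nothing about how the mode is configured on a real database.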
Data Guard Role Transition
Oracle Data Guard supports two role-transition operations
Switchover
Planned role reversal
Used for OS or hardware maintenance
No data loss
Failover
Unplanned role reversal
Used in emergencies
Zero or minimal data loss, depending on the chosen data protection mode
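The two role transitions differ in what is left afterwards: a switchover swaps roles and keeps both databases, while a failover promotes the standby and leaves the configuration without one until the old primary is brought back. A minimal sketch under that reading; `Roles`, `switchover` and `failover` are illustrative names, not Data Guard commands.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Roles:
    primary: str
    standby: Optional[str]  # None when no standby is currently available


def switchover(roles: Roles) -> Roles:
    """Planned role reversal: the two databases exchange roles."""
    if roles.standby is None:
        raise ValueError("switchover needs a standby database")
    return Roles(primary=roles.standby, standby=roles.primary)


def failover(roles: Roles) -> Roles:
    """Unplanned transition: the standby becomes primary, the old primary is lost."""
    if roles.standby is None:
        raise ValueError("failover needs a standby database")
    return Roles(primary=roles.standby, standby=None)
```

Two switchovers in a row return the original configuration, which is why the operation suits planned maintenance.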
Existing Site Recovery Tradeoffs
[Diagram: the primary database ships redo to the standby database, where apply is delayed, so reporting runs on delayed data]

Log apply may be delayed to protect from user errors but:


Switchover/Failover gets delayed
Reports run on old data
After failing over to standby, production DB must be rebuilt
Enhanced DR with Flashback Database
[Diagram: the primary database ships redo in real time to the standby database, which applies it in real time with no delay and serves real-time reporting; both databases keep a flashback log, so the primary needs no reinstantiation after failover]

Flashback DB removes the need to delay application of logs


Flashback DB removes the need to reinstantiate primary after
failover
Real-time apply enables real-time reporting on standby
Rolling Database Upgrades
In Oracle Database, SQL Apply provides the starting point for performing rolling upgrades of the Oracle RDBMS software and database with minimal interruption of service.
By using a logical standby database, customers can upgrade one database while running on the original production database, then run in a mixed-version environment before returning to the original, but upgraded, configuration!
SQL Apply Rolling Database Upgrades
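[Diagram: SQL Apply rolling upgrade in four steps, used for patch set upgrades, major release upgrades, and cluster software and hardware upgrades]
1. Initial SQL Apply configuration: clients use A; A and B both run version X, with redo flowing
2. Upgrade node B to X+1 while A stays on version X and logs queue
3. Run in mixed mode to test: A on version X, B on X+1, with redo flowing
4. Switchover to B, then upgrade A so that both run X+1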
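The four numbered steps of the rolling upgrade can be traced as a small state sequence. The sketch below only tracks which database serves clients and which version each one runs; the function name and tuple layout are assumptions made for illustration.

```python
def rolling_upgrade(version_x: str, version_next: str) -> list:
    """Trace (step, serving database, version of A, version of B) for the four steps."""
    a, b = version_x, version_x
    trace = [("1. initial SQL Apply configuration", "A", a, b)]

    # Step 2: only the logical standby B is upgraded; A keeps serving clients.
    b = version_next
    trace.append(("2. upgrade B", "A", a, b))

    # Step 3: both versions run side by side so the new release can be tested.
    trace.append(("3. run in mixed mode to test", "A", a, b))

    # Step 4: clients move to B, then A is upgraded as the new standby.
    a = version_next
    trace.append(("4. switchover to B, upgrade A", "B", a, b))
    return trace


for step in rolling_upgrade("X", "X+1"):
    print(step)
```

At no point in the trace are both databases out of service at once, which is the source of the minimal interruption the slides describe.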
Benefits of Oracle Disaster
Recovery Solution
Disaster recovery and high availability
Complete data protection
Efficient utilization of system resources
Flexibility in data protection to balance
availability against performance
requirements
Automatic gap detection and resolution
Centralized and simple management
Integrated with Oracle database
Ease of Use
New and Improved Data Guard Manager!
Monitoring SQL Apply
Unsupported Storage Attributes
Applied Logs and Apply Progress
Managing the Logical Standby
Bypassing the Guard
Skipping Table Redo
Skipping Failed (and subsequently
fixed) Transactions
New Data Guard Feature:
Fast-Start Failover
Automatic and fast
Physical and Logical standby each complete failover in less than 20 seconds
Old primary is reinstated automatically once connectivity is re-established between Observer and primary database
Data Guard Best Practices:
Switchover for Planned Maintenance
For fastest switchover (< 1 minute)
Prior to switchover
A physical standby transitioning from read-only back to Redo
Apply should be restarted
Disconnect all sessions and stop job processing
Shutdown abort for all secondary RAC instances on both primary
and standby databases
Enable real-time apply on the standby database and ensure the
standby is synchronized with the primary database
For switchovers using SQL or command line interface,
open the new primary directly from the mount state
Or, simulate a Fast-Start Failover - complete transactions
and shutdown abort all primary instances
Data Guard Best Practices:
Faster Redo Transport
Set SDU=32K
Tune network parameters that affect network
buffer sizes and queue lengths
Ensure sufficient network bandwidth for peak
database redo generation rate + other
activities
http://www.oracle.com/technology/deploy/availability/pdf/MAA_DG_NetBestPrac.pdf
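To judge whether a link has enough bandwidth for the peak redo rate, the redo rate in MB/sec has to be converted to megabits per second. A minimal sketch of that conversion, assuming 1 MB = 1,000,000 bytes and 1 Mbps = 1,000,000 bits/sec; the `headroom` parameter for "other activities" is a hypothetical knob, not a documented recommendation.

```python
def required_mbps(redo_mb_per_sec: float, headroom: float = 0.0) -> float:
    """Network bandwidth in Mbps needed to ship redo at the given rate.

    headroom is an assumed fraction reserved for other traffic,
    e.g. 0.25 for 25% extra.
    """
    return redo_mb_per_sec * 8 * (1 + headroom)


# Peak redo rate quoted later in this deck for Oracle's Global Single Instance.
peak = required_mbps(8.2)
print(f"{peak:.1f} Mbps needed for 8.2 MB/sec of redo")
```

The result is a lower bound on the link capacity, since it ignores protocol overhead.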
Data Guard Best Practices:
Tune Network Parameters
Send and receive buffer size = 3 x bandwidth delay
product (BDP)
BDP = the product of the estimated minimum
bandwidth and the round trip time between the
primary and standby server
BDP = 1,000 Mbps * 25 ms (.025 secs)
= 1,000,000,000 bits/sec * .025 sec
= 25,000,000 bits / 8 = 3,125,000 bytes
Tune network device queues to eliminate packet
losses and waits. Set device queues to a minimum
of 10,000 (default 100)
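The buffer sizing rule on this slide can be checked with a few lines of arithmetic. The sketch below uses integer bits per second and milliseconds to avoid rounding; the function names are made up for illustration.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth delay product in bytes: bandwidth times round trip time."""
    bits_in_flight = bandwidth_bps * rtt_ms // 1000
    return bits_in_flight // 8


def socket_buffer_bytes(bandwidth_bps: int, rtt_ms: int, multiplier: int = 3) -> int:
    """Send and receive buffer size as a multiple of the BDP (3x per the slide)."""
    return multiplier * bdp_bytes(bandwidth_bps, rtt_ms)


# The slide's example: 1,000 Mbps with a 25 ms round trip time.
print(bdp_bytes(1_000_000_000, 25))            # 3125000
print(socket_buffer_bytes(1_000_000_000, 25))  # 9375000
```

For the slide's example the rule therefore gives buffers of 9,375,000 bytes, three times the 3,125,000-byte BDP.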
Impact of Network Tuning

Test Results: Oracle Database 10g Release 1 & 2
Data Guard or Remote Mirroring
Remote Mirroring (host-based and storage-based)
is another way to protect enterprise data

However:
What about Data Reliability?
What about Data Recoverability?
What about Data Availability?
What about Cost?

A well-designed Business Continuity Plan must consider these critical issues in addition to simple data protection
Data Guard is the Preferred Solution
1. Better Network Efficiency
- Transmits only redo data
- Remote mirroring solutions: datafiles, archivelog files and redolog files must all be mirrored
2. Better suited for WANs
Fibre/ESCON-based mirroring solutions have an intrinsic distance limitation
The protocol converters they need add cost, complexity and latency
Data Guard is based on standard TCP/IP
Data Guard doesn't have to deal with protocol converters, extra cost and latency issues
3. Better Data Protection
Data Guard enables zero data loss
Preserves write-order consistency
Avoids logical and physical corruptions
Both SQL Apply and Redo Apply validate redo data before applying it
Data Guard is the Preferred Solution
4. Higher Flexibility
Data Guard based on commodity hardware
Does not force lock-in with storage vendors
Remote mirroring solutions typically need identically configured
storage from the same vendor
5. Better Functionality
Data Guard is a comprehensive DR solution:
Redo Apply/SQL Apply
Flexible protection modes
Push-button switchover/failover
Graceful handling of network connectivity problems
6. Higher ROI
Provides more value for DR investment
Standby database can be opened read-only or read-write
Allow backups to be offloaded on the standby database
Allows reporting/queries using the standby database
Integrated natively with other HA features (RAC, RMAN, etc.)
No extra cost
Data Guard and Remote Mirroring -
Summary
For protecting Oracle data, Oracle Data Guard's integrated disaster recovery solution involving standby databases is preferred to remote disk mirroring:
For technical reasons
For business reasons

Remote mirroring may be used to protect non-Oracle database data that are changing frequently:
File system data
Data in databases that are not Oracle
Competitive Strengths vs. SharePlex
SharePlex
Redo log-based replication tool from Quest software
Heavy front-end processing to extract transaction information from the primary redo
logs
Somewhat similar to Data Guard SQL Apply

It doesn't make sense for customers to use SharePlex:

Criterion | Data Guard | SharePlex
Cost | Free | Expensive
Feature support | Native feature of the database | Based on unpublished and unsupported interface [1]
DR | Comprehensive and integrated DR solution | At best a replication solution
Zero Data Loss | Supported | No support because of architecture limitations
Primary system overhead | Minimal | Much more
Integration with HA features | Integrated with RAC, RMAN, Flashback, | Limited integration

[1] See MetaLink Note 97080.1


10g New Features and Best
Practices
Data Guard Release 10.2
Redo Transport Improvements
Increased network write sizes to 10 MB to better
utilize network capacity for both ARCH and LNS
LNS can potentially write 10MB or less
Full decoupling of LGWR and LNS processes
No more waits during log switches
No more waits when LNS buffer is full
Intra-file parallelism support for ARCH
Up to 29 parallel remote archive processes
[Chart labels: 1GB / 100Mbps / 25ms RTT]
Data Guard Best Practices:
Gap Resolution and Data Loss
For fastest gap resolution
Leverage intra-file archive parallelism (MAX_CONNECTIONS attr)
Follow tips for tuning redo transport to improve network
utilization
To minimize data loss
For a low latency, high bandwidth network, use SYNC transport
For high latency or low bandwidth networks, use ASYNC to
minimize primary database performance impact
Follow tips for tuning redo transport
Example: Less than 7 seconds of data loss exposure for
high redo rates of 2-12 MB/sec with <=25 ms latency in our
tests
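The exposure figure in the example is given in seconds; multiplying by the redo rate turns it into an upper bound on the volume of redo at risk. A minimal worked sketch, with a hypothetical function name:

```python
def max_redo_at_risk_mb(redo_mb_per_sec: float, exposure_seconds: float) -> float:
    """Upper bound on unshipped redo, in MB, for a given exposure window."""
    return redo_mb_per_sec * exposure_seconds


# The slide's range: 2 to 12 MB/sec with less than 7 seconds of exposure.
for rate in (2, 12):
    print(f"{rate} MB/sec -> under {max_redo_at_risk_mb(rate, 7):.0f} MB at risk")
```

Under these figures the exposure stays below 14 MB at the low end of the range and below 84 MB at the high end.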
Data Guard Best Practices:
Reduce Overhead on Primary
Performance Gains with 10g Release 2 ASYNC
Transport
For redo rates less than 2 MB/sec, there is less than 5%
impact on the primary database across different
latencies
For very high redo rates of 20 MB/sec, less than 10%
impact on primary database even with latencies of 50
and 100 ms
Primary database performance impact was 2-3 times less
with the new ASYNC transport compared to previous
releases
Best Practice
Allocate additional I/O bandwidth for Online Redo Log
Files
Data Guard Best Practices:
Using Standby for Backups
Offload Backups to Physical Standby Database
Eliminate backup overhead on primary database
RMAN allows for backup operations while Redo Apply is
in progress
Best Practices
For simplicity, use identical directory structures on the
primary and standby databases
Use RMAN Recovery Catalog so that backups taken on
one database server can be restored on another
Use a catalog server physically separate from primary and
standby sites
Reference MAA RMAN/Data Guard best practices paper
http://www.oracle.com/technology/deploy/availability/pdf/RMAN_DataGuard_10g_wp.pdf
Data Guard or Remote Mirroring?
Load 200txns/sec & Redo rate 1.1 MB/sec
Data Guard SYNC transport has less overhead
on the primary database
Data Guard Advantage Because
Data Guard only transmits redo. A remote
mirroring solution must transmit all database
writes
A remote mirroring solution needs to transmit the following writes: LGWR (log writer), DBWR (database writer), ARCH (archiver), RVWR (flashback log writer), and foreground direct writes
Both DBWR and LGWR are affected by network latency in a remote mirroring solution. In contrast, only LGWR is impacted by network latency in a Data Guard solution
Higher wait times for DBWR can be very detrimental to performance, causing contention for free buffers and an increase in buffer busy waits
Some customer references
First American Real Estate Solutions

Nation's largest source of Real Estate data
100 million properties
Online services for 50,000 clients
Lenders, Information Resellers, Government,
Utilities, Corporations, Appraisers, Agents & Title
Companies
Thousands of concurrent online users at peak
www.firstamres.com
HA/DR Requirements
High Availability: 24x7 - 365 days/year
Limited instances of planned downtime once/quarter

Recovery Point Objective (RPO) - maximum data loss
Oracle9i: 10MB for computer failure, 200MB for site failure

Recovery Time Objective (RTO) for Oracle Database
Oracle9i: 10 minutes for computer failure, 1 hour for site failure
Oracle Database 10g goals
RPO: zero data loss for computer failure, 10MB for site
failure
RTO: zero downtime for computer failure, 10 minutes for
site failure
First American
Oracle 9i HA/DR Architecture
[Diagram: the primary database at the primary production site feeds three standbys. Local Standby #1: Data Guard, LGWR asynchronous redo shipping. Local Standby #2: Data Guard, delayed apply (30 minutes), LGWR asynchronous redo shipping. Remote Standby #3 at the remote disaster recovery site, 1500 miles away: Data Guard, archive log shipping (ARCH)]
Looking Ahead to Oracle Database 10g

Real Application Clusters


Transparent failover on node failure, zero data loss
Flashback Technologies
Flashback Database & Flashback Table
Protect/repair for logical corruptions
Enhanced LGWR ASYNC redo transport
Improve RPO for remote DR site
Real Time Apply
Improve RTO
First American
Oracle Database 10g Architecture - Plan
[Diagram: the primary database (Real Application Cluster) at the primary production site uses Data Guard LGWR asynchronous redo shipping to the standby database (Data Guard) at the remote disaster recovery site, 1500 miles away]
First American
Oracle Database 10g Benefits
Higher availability: transparent node failover
RAC for HA, Data Guard for DR
Better remote data protection
ASYNC enhancements = less compromise on WAN
Better protection against logical corruption
Fewer databases; surgical repair vs. full point-in-time recovery
Less downtime
Faster failover, quicker repair of logical corruptions
Oracle Corporation
Global Single Instance (GSI)
A key enabler in Oracle saving $1 billion annually
Consolidation: 1 is the magic number
Versus 75 separate implementations of Oracle Apps
Versus 100s of Oracle databases world wide
Oracle E-Business Suite
7,000 concurrent users
5.5TB Oracle database
www.oracle.com
Oracle Global Single Instance
HA/DR Requirements
HA requirement
Continuous operation regardless of component failure
DR requirement
Protect against site failure, physical & logical corruption
RPO: 5 minutes of transactions
RTO: database failover in less than 1 hour
High workload OLTP system
8.2MB/sec redo generation at peak, 2.5MB/sec sustained
WAN, dual OC12
1,000 miles of separation, 25-35ms RTT network latency
Oracle Global Single Instance
HA/DR Architecture

[Diagram: the GSI production site runs (4) SUN F12Ks with 36 CPUs each; the disaster recovery site runs (4) SUN F12Ks split into a DR domain of 8 CPUs each and a Development & Test domain of 28 CPUs each. Data Guard LGWR asynchronous redo shipping carries redo 1,000 miles from the primary database to the standby database (4 hour delayed apply)]
Utilization of Standby Resources
Four node Standby Cluster
2 domains: DR, Development & Test
DR domain has sufficient capacity to maintain standby
database and execute failover
At Failover time:
Failover is executed, standby assumes primary role
Development & Test is stopped
CPUs are re-allocated to the new production domain
Nodes are upgraded in a rolling fashion with no
application downtime
Delayed Apply: Downtime Avoided
Human error caused logical corruption on primary
160,000 row table updated by mistake
Standby database configured with 4 hour delayed
apply
Instead of 10 hours of downtime, just 30 minutes
Cancel recovery on standby and open read only
Stop the affected application on primary
Export data from standby
Recreate table on primary, import data to primary db after
disabling triggers
Restart application on primary
Restart recovery on standby
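The recovery above works only if the mistake is noticed before the delayed standby applies the bad change. A minimal sketch of that timing check, assuming the standby applies each change exactly one apply delay after it happened on the primary; the function name and the example timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def standby_still_clean(error_time: datetime,
                        detected_time: datetime,
                        apply_delay: timedelta) -> bool:
    """True if the erroneous change has not yet been applied on the standby."""
    return detected_time < error_time + apply_delay


delay = timedelta(hours=4)  # the 4 hour delayed apply from the slide
mistake = datetime(2005, 1, 10, 9, 0)  # made-up example time

print(standby_still_clean(mistake, mistake + timedelta(hours=1), delay))
print(standby_still_clean(mistake, mistake + timedelta(hours=5), delay))
```

With a 4 hour delay, an error caught after one hour can still be repaired from the standby, while one caught after five hours has already been applied there.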
Oracle Global Single Instance
Oracle Database 10g Feature Adoption
Flashback Technologies
Flashback Table
Flashback Database
Data Guard 10g
Real Time Apply
Asynchronous Redo Transport enhancements
Redo Apply performance enhancements
Benefits
Faster failover, better data protection
Ohio Savings Bank
Founded in 1899
In Top 20 of all US Mortgage Lenders
Provide mortgage services to independent brokers
nationwide via Web
$13 billion in assets
Reputation for Innovation
2002 Web Site of the Year (Mortgage Technology Magazine)
www.ohiosavings.com
HA/DR Requirements

24 x 7 - 365 days/year

Recovery Point Objective: zero data loss

Recovery Time Objective: 30 minutes

Planned maintenance windows: Sunday mornings


Ohio Savings Bank
Oracle9i Architecture
Online Mortgage Services
[Diagram: the primary production site (2-node RAC cluster, HP N-Class PA-RISC, EMC Symmetrix, SAN attached, HP-UX v11.0) and the remote DR site (HP N-Class PA-RISC, EMC Symmetrix, SAN attached, HP-UX v11.0) are 15 miles apart. Data Guard uses archive log shipping (ARCH) from the primary database; 3rd party storage based synchronous disk mirroring is used for online logs]
Ohio Savings Bank
Oracle Database 10g Architecture
Customer Call Center
[Diagram: the primary production site and the remote DR site, 15 miles apart, each run a 3-node RAC cluster (HP DL-380, 2 Xeon CPUs/node; EMC Symmetrix & Clariion; SAN attached; Red Hat Linux). Data Guard runs in Maximum Availability mode with synchronous redo shipping from the primary database to the standby database, for zero data loss]
Ohio Savings Bank
Oracle Database 10g Features Deployed
Automatic Storage Management
Reduces time spent managing storage
RMAN Flash Recovery Area
Fully automates disk-based backup & recovery
Oracle Data Guard
Zero Data Loss
Replaces 3rd party remote mirroring
Standby DB also used for daily exports
Ohio Savings Bank
Automatic Storage Management
Automatically spreads database files across
all available storage
Automatic rebalancing of used disk space
when disks are added or removed
Increases I/O distribution beyond disk array
striping
Reduces DBA workload
Ohio Savings Bank, Future Plans
GRID: from concept to reality

Add nodes to the existing RAC 10g cluster
Manage cluster via a single system view
Add mortgage database, and potentially the OSB
Data Warehouse to same RAC 10g cluster
Define application workloads as services
Establish rules to dynamically allocate processing
resources to services
Maximize the utilization of resources while
meeting changing business needs
Oracle Disaster Recovery
Solution
Includes the following Oracle products:
Oracle Database Enterprise Edition
on both sites
Oracle Maximum Availability
Architecture
Oracle Maximum Availability
Architecture
[Diagram: clients reach the application servers at the primary site and the secondary site through a WAN traffic manager; each site runs a RAC database (Instance1 and Instance2, joined by hb links), and Data Guard connects the two sites over a dedicated network]
Resources
Maximum Availability Architecture white papers:
http://otn.oracle.com/deploy/availability/htdocs/maa.html
New SQL Apply Best Practices Paper now available!

HA Portal on OTN: http://otn.oracle.com/deploy/availability

Data Guard home page on OTN:
http://otn.oracle.com/deploy/availability/htdocs/odg_overview.html
