Oracle - Day 22

Uploaded by

suresh

Oracle Database Administration Training Series – Day 22

Oracle RAC Backup and Recovery

Oracle Real Application Clusters (RAC) presents unique challenges and opportunities in backup and
recovery due to its multi-instance architecture. Proper strategies ensure data consistency, minimize
downtime, and protect against data loss in a clustered environment.

1. Backup Strategies for Oracle RAC Databases

a. Types of Backups

1. Full Database Backup:

o A comprehensive backup of all datafiles, control files, and optionally, archived redo
logs.

o Typically scheduled during low-usage windows.

2. Incremental Backup:

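o A level 0 backup copies all used blocks and serves as the baseline. A level 1 differential
backup copies only blocks changed since the most recent level 0 or level 1 backup; a level 1
cumulative backup copies all blocks changed since the most recent level 0.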

o Efficient for large RAC databases with frequent changes.

3. Archived Log Backup:

o Backs up archived redo logs to ensure no committed transaction is lost.

4. Control File and SPFILE Backup:

o Essential for recovery. SPFILE and control files should be backed up frequently.
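
To check which of these backup types actually exist for a database, the RMAN metadata kept in the control file can be queried from SQL*Plus. The following is a minimal sketch, assuming the standard V$BACKUP_SET view; the 30-day window is an arbitrary value chosen for illustration.

```sql
-- List recent backup sets with their type and incremental level.
-- BACKUP_TYPE: D = full or level 0, I = incremental level 1, L = archived redo logs.
-- INCREMENTAL_LEVEL is NULL for a plain full backup.
SELECT recid,
       backup_type,
       incremental_level,
       controlfile_included,
       completion_time
FROM   v$backup_set
WHERE  completion_time > SYSDATE - 30   -- illustrative window
ORDER  BY completion_time DESC;
```

A schedule of weekly level 0 and daily level 1 backups would show one D row with level 0 followed by several I rows with level 1.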

b. Backup Locations

• Shared Storage: Using shared ASM disk groups or network-mounted drives.

• Separate Storage: Storing backups on external storage systems for redundancy.

c. Backup Frequency

• Depends on RTO (Recovery Time Objective) and RPO (Recovery Point Objective).

• Daily incremental and weekly full backups are common practices.

2. Using RMAN with Oracle RAC

a. RMAN Features in RAC

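• RMAN is RAC-aware: it backs up the single shared database regardless of which instance it
connects through.

• Channel Allocation: Channels run on the instance RMAN connects to unless they are given their
own CONNECT strings or the connection goes through a load-balanced service. Spreading channels
across instances adds parallelism.

• Work Distribution: RMAN divides datafile backups among the allocated channels, and therefore
among instances when the channels span them, to optimize performance.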

b. Example RMAN Configuration

• Configure RMAN for parallel backups:

CONFIGURE DEVICE TYPE DISK PARALLELISM 4 BACKUP TYPE TO BACKUPSET;

CONFIGURE CONTROLFILE AUTOBACKUP ON;

• Backing up a RAC database:

RUN {
ALLOCATE CHANNEL c1 DEVICE TYPE DISK;
ALLOCATE CHANNEL c2 DEVICE TYPE DISK;
BACKUP DATABASE PLUS ARCHIVELOG;
}

c. Benefits of Using RMAN in RAC

• Centralized management of backups.

• Consistency across multiple instances.

• Seamless integration with Data Guard for disaster recovery.

3. Recovery Techniques and Challenges in a RAC Environment

a. Instance Recovery

• Triggered When: An individual instance in the cluster fails.

• Recovery Process:

o A surviving instance reads the failed instance's redo thread and rolls forward its changes,
then rolls back the transactions that were uncommitted at the time of the failure.

o Automatic process handled by RAC, requiring no DBA intervention.
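
Although instance recovery needs no intervention, it helps to confirm which instances are still up afterwards. A minimal sketch, assuming it is run from any surviving instance:

```sql
-- One row per running instance; an instance that has failed
-- and not yet restarted does not appear in GV$INSTANCE.
SELECT inst_id,
       instance_name,
       host_name,
       status,
       startup_time
FROM   gv$instance
ORDER  BY inst_id;
```

A recent STARTUP_TIME on one row usually indicates that Clusterware has already restarted the failed instance.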

b. Media Recovery

• Triggered When: A datafile is lost or corrupted.

• Recovery Process:

o Use RMAN to restore the affected datafile from backup.

o Apply archived redo logs to recover transactions.

RESTORE DATAFILE 3;
RECOVER DATAFILE 3;

c. RAC-Specific Challenges

• Redo Log Coordination: Each instance writes to its own redo thread on shared storage, and
media recovery must merge the archived logs from all threads.

• Clusterwide Data Consistency: Recovery must ensure consistency across all nodes.

• Parallelism: Coordinating recovery in a multi-instance environment requires careful
management of RMAN channels and resources.
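
In RAC, every instance owns a separate redo thread, and recovery needs the logs of all threads. The queries below are a small sketch for viewing that layout, using the standard V$THREAD and V$LOG views.

```sql
-- Which redo thread belongs to which instance.
SELECT thread#, status, enabled, instance
FROM   v$thread
ORDER  BY thread#;

-- Online redo log groups per thread, with archival status.
SELECT thread#, group#, sequence#, archived, status
FROM   v$log
ORDER  BY thread#, group#;
```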

4. Flashback Technology and Its Usage in RAC

Flashback features allow for fast recovery from logical errors, such as accidental deletes or updates.

a. Flashback Database

• Rolls back the entire database to a previous point in time.

• Requires flashback logging to be enabled and a Fast Recovery Area (FRA, formerly called the
Flash Recovery Area) with sufficient disk space.

• Command:

FLASHBACK DATABASE TO SCN 123456;
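
Before relying on this command, it is worth checking that flashback logging is on and how far back the flashback logs reach. A minimal sketch using standard views (the target SCN must not be older than OLDEST_FLASHBACK_SCN):

```sql
-- YES means flashback logging is enabled for the database.
SELECT flashback_on
FROM   v$database;

-- Oldest point the database can currently be flashed back to,
-- and the configured retention target in minutes.
SELECT oldest_flashback_scn,
       oldest_flashback_time,
       retention_target
FROM   v$flashback_database_log;
```

FLASHBACK DATABASE itself is issued with the database mounted but not open, and the database is then opened with RESETLOGS.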

b. Flashback Query

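• Retrieves a table's data as it existed at an earlier point in time. The rows are reconstructed
from undo data, so how far back a query can reach depends on undo retention.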

• Example:

SELECT * FROM bookings AS OF TIMESTAMP SYSDATE - 1;

c. Flashback Table

• Recovers a specific table to a prior state.

• Example:

FLASHBACK TABLE bookings TO TIMESTAMP SYSDATE - 1;
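
Flashback Table requires row movement to be enabled on the table first; without it the statement fails with an error. A short sketch, reusing the bookings table from the example above and a one-day interval as an illustrative value:

```sql
-- Row movement must be enabled before a table can be flashed back.
ALTER TABLE bookings ENABLE ROW MOVEMENT;

-- Rewind the table to its state one day ago.
FLASHBACK TABLE bookings
  TO TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' DAY);
```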

d. Flashback Challenges in RAC

• Shared Storage Dependency: Requires sufficient shared storage for the FRA.

• Cluster Coordination: Flashback operations need to be synchronized across all nodes.
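
Because flashback logs, archived logs, and often backups share the FRA, its free space is worth monitoring. The query below is a minimal sketch against the standard V$RECOVERY_FILE_DEST view; the conversion to gigabytes is only for readability.

```sql
-- FRA location, size limit, current usage, and reclaimable space.
SELECT name,
       ROUND(space_limit       / 1024 / 1024 / 1024, 1) AS limit_gb,
       ROUND(space_used        / 1024 / 1024 / 1024, 1) AS used_gb,
       ROUND(space_reclaimable / 1024 / 1024 / 1024, 1) AS reclaimable_gb,
       number_of_files
FROM   v$recovery_file_dest;
```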

5. Disaster Recovery Considerations for RAC

a. Using Data Guard

• RAC and Data Guard integration provides robust disaster recovery.

• Physical Standby: Mirrors the primary RAC database for failover.

• Snapshot Standby: Allows testing without affecting the primary database.
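
To see which role a given database currently plays in a Data Guard configuration, V$DATABASE can be queried on the primary or on the standby. A minimal sketch:

```sql
-- DATABASE_ROLE shows PRIMARY, PHYSICAL STANDBY, SNAPSHOT STANDBY, and so on.
SELECT name,
       db_unique_name,
       database_role,
       open_mode,
       protection_mode,
       switchover_status
FROM   v$database;
```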

b. Cross-Site Clusters

• Extended RAC clusters span multiple data centers for site-level redundancy.

• Challenges include latency and interconnect performance.

c. Backup Validation

• Periodic validation of backups ensures recoverability.

• Command:

RESTORE DATABASE VALIDATE;

d. Testing Recovery Scenarios

• Regular disaster recovery drills:

o Node failure simulation.

o Media failure simulation.

o Site-wide disaster simulation.

Real-Time Scenarios

Scenario 1: Node Failure During Backup

• If a node fails during an RMAN backup:

o Channels connected to surviving instances continue the backup.

o RMAN reassigns the unfinished work of the failed channel to the remaining channels.

Scenario 2: Corrupted Datafile in a Shared Disk Group

• Detect the corruption:

SELECT * FROM V$DATABASE_BLOCK_CORRUPTION;

• Restore and recover the datafile:

RESTORE DATAFILE 5;
RECOVER DATAFILE 5;

Scenario 3: Logical Data Error

• Accidentally deleted records in the bookings table:

FLASHBACK TABLE bookings TO TIMESTAMP SYSDATE - 1;

Summary

Oracle RAC provides robust mechanisms for backup and recovery, ensuring high availability and data
integrity in a clustered environment. Key features like RMAN, Flashback Technology, and Data Guard
integration enable efficient handling of failures and logical errors. Proper planning, frequent
validations, and regular disaster recovery drills are essential for maintaining a resilient RAC
environment.

Oracle RAC Performance Tuning

Oracle RAC (Real Application Clusters) environments offer high availability and scalability by running
a single database across multiple nodes. However, due to the distributed architecture, tuning RAC for
optimal performance requires attention to specific components such as interconnects, global cache
management, memory, CPU, and storage.

1. RAC Performance Bottlenecks

a. Common Bottlenecks

1. Interconnect Latency:

o High latency or packet loss can degrade cluster communication.

2. Global Cache Contention:

o Excessive block transfers between nodes can increase waits.

3. I/O Bottlenecks:

o Slow or misconfigured shared storage can impact performance.

4. CPU Starvation:

o Uneven workload distribution across nodes leads to CPU bottlenecks.

5. Memory Issues:

o Inefficient memory allocation can lead to paging or swapping.

b. Symptoms of Bottlenecks

• High gc buffer busy or gc cr request wait events.

• Increased interconnect traffic or retransmissions.

• Elevated disk I/O latency.

• Uneven CPU utilization among nodes.

2. Global Cache Management and Global Resource Management

a. Global Cache Service (GCS)

• Manages cache coherency across nodes.

• Tracks ownership and access permissions for database blocks in the buffer cache.

b. Global Resource Directory (GRD)

• A metadata structure maintained in memory to track resource information such as block
ownership and state.

c. Key Wait Events

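1. gc buffer busy acquire:

o A session waits for a block because another session on the same instance has already
requested that block from a remote instance and the request is still outstanding.

2. gc buffer busy release:

o A session waits for a block that its own instance is in the process of releasing to a
remote instance that requested it.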

3. gc cr request:

o Reflects waits for consistent-read blocks from another instance.
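
Raw wait counts are hard to compare between instances, so the average wait per event is often more telling. The query below is a minimal sketch using GV$SYSTEM_EVENT; the values are cumulative since instance startup.

```sql
-- Average wait in milliseconds for each global cache event, per instance.
SELECT inst_id,
       event,
       total_waits,
       ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 2) AS avg_wait_ms
FROM   gv$system_event
WHERE  event LIKE 'gc%'
ORDER  BY time_waited_micro DESC;
```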

d. Tuning Strategies

• Object Partitioning:

o Partition frequently accessed tables to minimize block sharing between nodes.

• Affinity-based Services:

o Assign specific services to nodes to localize workload and reduce inter-node traffic.

• Proper Indexing:

o Ensure indexes reduce the number of rows scanned and thus minimize block
transfers.
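
Before partitioning or re-assigning services, it helps to know which objects cause the block transfers. The sketch below assumes the segment-level statistics exposed in GV$SEGMENT_STATISTICS and the FETCH FIRST clause, which is available from Oracle Database 12c onward; the limit of 20 rows is arbitrary.

```sql
-- Segments with the most global cache activity across all instances.
SELECT owner,
       object_name,
       object_type,
       statistic_name,
       SUM(value) AS total
FROM   gv$segment_statistics
WHERE  statistic_name IN ('gc buffer busy',
                          'gc cr blocks received',
                          'gc current blocks received')
GROUP  BY owner, object_name, object_type, statistic_name
HAVING SUM(value) > 0
ORDER  BY total DESC
FETCH  FIRST 20 ROWS ONLY;   -- 12c or later
```

Objects at the top of this list are the natural candidates for partitioning or for a service that keeps their workload on one node.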

3. Memory and CPU Tuning in RAC

a. Memory Tuning

• Monitor and adjust the size of the SGA (System Global Area) and PGA (Program Global
Area).

• Enable Automatic Memory Management (AMM) or Automatic Shared Memory Management (ASMM)
for dynamic memory tuning.

• Check and tune buffer cache usage to minimize physical I/O:

SELECT * FROM V$BUFFER_POOL_STATISTICS;
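
The columns of that view can be combined into the classic buffer cache hit ratio. A minimal sketch; the ratio is only a rough indicator and should be read alongside wait events rather than on its own.

```sql
-- Hit ratio per buffer pool: share of logical reads served without physical I/O.
SELECT name,
       physical_reads,
       db_block_gets,
       consistent_gets,
       ROUND(1 - physical_reads /
                 NULLIF(db_block_gets + consistent_gets, 0), 4) AS hit_ratio
FROM   v$buffer_pool_statistics;
```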

b. CPU Tuning

• Use Oracle Resource Manager to prioritize CPU usage among sessions.

• Distribute workload evenly across all RAC nodes.

• Monitor CPU usage:

SELECT inst_id, stat_name, value FROM gv$osstat
WHERE stat_name IN ('NUM_CPUS', 'BUSY_TIME', 'IDLE_TIME', 'LOAD');

4. Interconnect Tuning and Troubleshooting

The interconnect is a critical component of RAC performance as it handles communication between
nodes.

a. Monitoring Interconnect

• Use the oradebug ipc command (as SYSDBA) to dump IPC diagnostics, including the interconnect
address in use, to a trace file.

• Monitor interconnect wait events:

SELECT event, total_waits, time_waited
FROM gv$system_event
WHERE event LIKE 'gc%';
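
It is also worth confirming that every instance really uses the private network. The query below is a minimal sketch using the standard GV$CLUSTER_INTERCONNECTS view.

```sql
-- Interface and address each instance uses for cluster traffic.
-- IS_PUBLIC should be NO for a dedicated private interconnect.
SELECT inst_id,
       name,
       ip_address,
       is_public,
       source
FROM   gv$cluster_interconnects
ORDER  BY inst_id;
```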

b. Tuning Interconnect

1. Hardware:

o Use low-latency, high-bandwidth network hardware (e.g., InfiniBand or 10/25/40 Gbps
Ethernet).

o Configure Jumbo Frames to reduce CPU overhead.

2. Network Configuration:

o Ensure the private interconnect is on a dedicated network.

o Use multiple NICs for redundancy and load balancing.

3. Oracle Configuration:

o Set the appropriate MTU size to match the network's capability.

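o Adjust RAC parameters only when diagnostics justify it. Hidden (underscore) parameters such
as _gc_policy_minimum, which influences dynamic resource remastering, must be double-quoted,
take numeric values, and should be changed only on the advice of Oracle Support:

ALTER SYSTEM SET "_gc_policy_minimum"=<value> SCOPE=SPFILE SID='*';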

5. Optimizing I/O and Storage in RAC Environments

a. Shared Storage

• Use ASM (Automatic Storage Management) for better performance and management.

• Balance ASM disk groups to avoid hot spots:

SELECT name, total_mb, free_mb, usable_file_mb
FROM v$asm_diskgroup;
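
Free space alone does not reveal hot spots; per-disk I/O counters do. A minimal sketch, assuming it is run in the ASM instance (counters are cumulative since the disk was mounted):

```sql
-- Per-disk I/O counts and times; a disk with far more reads or writes
-- than its peers in the same group points to an imbalance.
SELECT group_number,
       disk_number,
       name,
       reads,
       writes,
       read_time,
       write_time
FROM   v$asm_disk
WHERE  group_number > 0          -- skip disks not in a mounted disk group
ORDER  BY group_number, disk_number;
```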

b. I/O Distribution

• Distribute high-read/write tablespaces across multiple disks or ASM disk groups.

• Enable Oracle Smart Scan on Exadata for RAC environments.

c. Monitoring I/O

• Check I/O performance metrics using GV$SYSMETRIC:

SELECT inst_id, metric_name, value
FROM gv$sysmetric
WHERE metric_name LIKE 'I/O%'
   OR metric_name LIKE '%IO Requests Per Sec';

d. Tuning I/O

1. Redo Log Tuning:

o Use faster storage for redo logs to minimize write latencies.

2. TEMP Tablespace Tuning:

o Use multiple TEMP tablespaces to distribute temporary I/O load.

3. ASM Striping:

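o ASM stripes every file across the disks of its disk group automatically. Choose coarse or
fine-grained striping through file templates, and pick the redundancy level (normal or high)
for protection rather than speed.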

Real-Time Scenarios

Scenario 1: Global Cache Contention

• Problem: High gc buffer busy acquire wait events.

• Solution:

o Partition the table causing contention.

o Reconfigure services to localize access to the blocks.

Scenario 2: Uneven CPU Utilization

• Problem: One RAC node shows higher CPU usage than others.

• Solution:

o Rebalance the workload using Oracle RAC services.

o Monitor and adjust CPU affinity for sessions.
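
A quick way to see whether the imbalance comes from how sessions are spread is to count user sessions per service and instance. A minimal sketch using GV$SESSION:

```sql
-- User sessions per service on each instance.
SELECT inst_id,
       service_name,
       COUNT(*) AS sessions
FROM   gv$session
WHERE  type = 'USER'
GROUP  BY inst_id, service_name
ORDER  BY service_name, inst_id;
```

A service whose sessions all land on one instance is a candidate for additional preferred instances.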

Scenario 3: High Interconnect Wait Times

• Problem: gc cr request and gc buffer busy release events are high.

• Solution:

o Check and optimize the private interconnect.

o Verify Jumbo Frames configuration and interconnect bandwidth.

Summary

Oracle RAC performance tuning requires a holistic approach to optimize interconnects, global cache
management, memory, CPU, and storage. By addressing bottlenecks, ensuring efficient resource
distribution, and using features such as Oracle Resource Manager, affinity-based services,
and ASM, you can achieve a balanced and highly performant RAC environment.

Oracle RAC Administration and Maintenance

Administering and maintaining an Oracle RAC (Real Application Clusters) environment involves tasks
related to managing instances, clusterware, storage, patching, service management, and monitoring.
RAC environments are more complex than single-instance databases due to their distributed nature,
requiring specialized tools and techniques.

1. Managing RAC Instances with Enterprise Manager (OEM)

a. Overview of OEM in RAC

• Oracle Enterprise Manager provides a centralized console to manage RAC databases and
infrastructure.

• Key RAC-specific features include:

o Cluster performance monitoring.

o Service-level monitoring for individual RAC services.

o Real-time session monitoring across instances.

b. Common Tasks in OEM

1. Monitoring Cluster Performance:

o Navigate to the RAC database target in OEM and select "Performance".

o View Cluster Cache Coherency to analyze interconnect efficiency.

2. Session and Query Monitoring:

o Use "Top Sessions" to identify session-level bottlenecks.

o Analyze queries causing high wait events.

3. Instance Management:

o Start/stop instances across the RAC cluster using the "Instances" tab.

2. Clusterware and ASM Maintenance

a. Oracle Clusterware Components

1. Oracle Local Registry (OLR):

o A local registry on each node storing node-specific configuration.

2. Oracle Cluster Registry (OCR):

o Stores cluster configuration details, including resources and services.

3. Voting Disks:

o Facilitate node membership and quorum decisions.

b. Maintenance Tasks

1. Verifying Clusterware Health:

crsctl check cluster

crsctl stat res -t

2. Backup and Restore of OCR and Voting Disks:

o Backup OCR:

ocrconfig -manualbackup

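o Restore OCR (run as root with Clusterware stopped on all nodes):

ocrconfig -restore <backup_location>

o Voting disks: from 11g Release 2 onward, voting disk data is backed up automatically as
part of the OCR, so no separate backup command is needed.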

3. ASM Disk Maintenance:

o View disk group details:

SELECT name, state, total_mb, free_mb FROM v$asm_diskgroup;

o Add a disk to an ASM disk group:

ALTER DISKGROUP DATA ADD DISK '/dev/oracleasm/disks/DISK1';
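
Adding a disk triggers a rebalance of the disk group, and its progress can be followed from the ASM instance. A minimal sketch using the standard V$ASM_OPERATION view:

```sql
-- Running ASM operations; no rows means no rebalance is in progress.
SELECT group_number,
       operation,
       state,
       power,
       sofar,
       est_work,
       est_minutes
FROM   v$asm_operation;
```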

3. Patch Management and Upgrades for Oracle RAC

a. Patching RAC with OPatch and OPatchauto

1. Patch Compatibility:

o Verify patch compatibility with the current RAC version.

o Use opatch lsinventory to list installed patches.

2. Applying a Patch:

o Apply patches node by node or use opatchauto for automatic patching.

opatchauto apply <patch_location>

3. Rolling vs. Non-Rolling Patches:

o Rolling patches allow patching one node at a time, ensuring database availability.

o Non-rolling patches require downtime.

b. RAC Database Upgrades

• Use the DBUA (Database Upgrade Assistant) for guided upgrades.

• Upgrade Oracle Grid Infrastructure before upgrading the database.

4. Managing Services and Instances with SRVCTL

The SRVCTL (Server Control Utility) is used to manage Oracle RAC services and instances.

a. Common SRVCTL Commands

1. Start/Stop RAC Instances:

srvctl start instance -d <db_name> -i <instance_name>

srvctl stop instance -d <db_name> -i <instance_name>

2. Manage Services:

o Add a service:

srvctl add service -d <db_name> -s <service_name> -r <preferred_instances>

o Start a service:

srvctl start service -d <db_name> -s <service_name>

3. View Resource Status:

srvctl status database -d <db_name>

4. Enable/Disable Services:

srvctl enable service -d <db_name> -s <service_name>

srvctl disable service -d <db_name> -s <service_name>
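
The effect of these SRVCTL commands can be cross-checked from inside the database. The query below is a minimal sketch that lists the services currently running on each instance.

```sql
-- Services active on each instance right now.
SELECT inst_id,
       name,
       network_name
FROM   gv$active_services
ORDER  BY name, inst_id;
```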

5. RAC-Specific Alert and Log Management

a. Key Logs in RAC

1. Clusterware Logs:

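o Clusterware alert log (11.2 layout; from 12c onward Clusterware logs are kept in the ADR
under <ORACLE_BASE>/diag/crs/<hostname>/crs/trace/):

/u01/app/11.2.0/grid/log/<hostname>/alert<hostname>.log

o CRS daemon logs:

/u01/app/11.2.0/grid/log/<hostname>/crsd/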

2. ASM Logs:

/u01/app/oracle/diag/asm/+asm/<instance_name>/trace/

3. Database Alert Logs:

o For each instance:

/u01/app/oracle/diag/rdbms/<db_name>/<instance_name>/trace/alert_<instance_name>.log

b. Monitoring Alerts in OEM

• Set up custom alert thresholds in OEM for metrics like interconnect latency, I/O
performance, and instance availability.

• Configure email or SNMP notifications for proactive alerts.

c. Common Scenarios

1. Node Eviction:

o Check logs for eviction details:

/u01/app/grid/log/<node>/alert<node>.log

o Resolve issues with interconnect or voting disk availability.

2. Interconnect Latency:

o Troubleshoot using traceroute or ping between nodes.

o Validate network configurations.

Best Practices for RAC Administration

1. Automate Maintenance:

o Schedule regular backups of OCR and ASM metadata.

o Automate RAC instance startups using SRVCTL scripts.

2. Use Redundancy:

o Ensure redundancy for voting disks, OCR, and ASM disks.

3. Regular Patch Management:

o Stay updated with the latest Oracle patch sets.

o Use rolling patches to minimize downtime.

4. Monitor Proactively:

o Use tools like OEM and crsctl for continuous monitoring.

5. Test Before Production:

o Always test patches and upgrades in a non-production environment.

Oracle RAC administration and maintenance require a combination of tools (OEM, SRVCTL, CRSCTL)
and best practices to ensure high availability, performance, and scalability. Proper patch
management, monitoring, and service configuration are essential to maintain a robust RAC
environment.
