Oracle Cache Fusion - in Operation

Agenda
Cache Fusion
What is it?
Cache Coherency Vs. Cache Fusion
Key Components and terminology
Cache Fusion in operation
Lock Mastering & Resource Affinity
Type of Contentions
Cache Fusion I
Cache Fusion II
Examples
Instance Crash Recovery in RAC
Key Components in an Instance Crash
I Pass recovery
II Pass recovery
Cache Fusion: What is it?
Oracle introduced a framework for sharing data over the private interconnect
between the nodes, which previous versions used only for messaging. This
protocol is Cache Fusion. Data blocks are shipped across the network much like
messages, replacing the most expensive component of data transfer, disk I/O,
with data sharing.
According to the manual:
Global Cache Service (GCS): the process that implements Cache Fusion. It
maintains the block mode for blocks in the global role and is responsible for
block transfers between instances. The Global Cache Service employs various
background processes such as the Global Cache Service Processes (LMSn) and
the Global Enqueue Service Daemon (LMD).
Cache Fusion: a diskless cache coherency mechanism in Oracle Real Application
Clusters that provides copies of blocks directly from a holding instance's
memory cache to a requesting instance's memory cache.
Cache Coherency
According to the manual:
The synchronization of data in multiple caches so that reading a memory
location through any cache will return the most recent data written to that
location through any other cache. Sometimes called cache consistency.
Can we say it is something that maintains the resource (block) status? If so,
the following two services together provide it for us:
GCS (Global Cache Services)
GES (Global Enqueue Services)
Together they maintain the Global Resource Directory (GRD).
Now both together

The GCS manages all types of data blocks. Cache coherency is maintained through the GCS by
requiring that instances acquire a resource (lock or enqueue on a block) cluster-wide before
modifying or reading a database block. The GCS is used to synchronize global cache access,
allowing only one instance to modify a block at any single point in time. The GCS, through the
RAC-wide Global Resource Directory, ensures that the status of data blocks cached in any mode in the
cluster is globally visible and maintained.
Oracle RAC has a multi-versioning architecture. This multi-versioning architecture distinguishes
between current data blocks and one or more consistent read (CR) versions of a block. A current
block contains changes for all committed and yet-to-be-committed transactions. A consistent read
(CR) version of a block represents a consistent snapshot of the data at a previous point in time. A
data block can reside in many buffer caches under the auspices of shared resources.
In Oracle9i RAC, applying rollback segment information to current blocks produces consistent read
versions of a block. Both the current and consistent read blocks are managed by the GCS.
To transfer data blocks among database caches, buffers are shipped by means of the high speed
IPC interconnect. Disk writes are only required for cache replacement. A past image (PI) of a block
is kept in memory before the block is sent if it is a dirty (modified) block. In the event of failure,
Oracle reconstructs the current version of the block by reading the PI blocks.
Background Processes and Their Roles

LMSn Global Cache Service Process (GCS)
Primarily responsible for shipping blocks between the instances' buffer caches.
Creates a CR image whenever there is a cross-instance request for a dirty block.
LMS must also check constantly with the LMD background process (the GES process) to pick up the lock
requests placed by LMD.
Parameter: GCS_SERVER_PROCESSES, up to 36 as of 10.2, min. cpu_count/2
LMON Lock Monitor Process (GES)
LMON manages the global locks and resources.
LMON handles the reconfiguration of locks and resources when an instance joins or leaves the cluster
(LMON generates trace files during reconfiguration).
LMON also provides cluster group services.
LMD Lock Manager Daemon (GES)
LMD performs global lock deadlock detection, both local and remote.
Also monitors for lock conversion timeouts.
Basically maintains the lock queues and traverses the GES structures.
LCK Lock Process
Manages instance resource requests and cross-instance calls for shared resources.
During instance recovery, it builds a list of invalid lock elements and validates lock elements.
DIAG Diagnostic Daemon
A new background process in Oracle 10g, part of the new enhanced diagnosability framework.
Regularly monitors the health of the instance.
Also checks for instance hangs and deadlocks.
History of Cache Fusion

Oracle Release   Feature                                      Description
Prior to 8.1.5   OPS                                          OPS used disk-based pings
8.1.5            Cache Fusion I or Consistent Read Server     Consistent read version of the block is transferred over the interconnect
9i               Cache Fusion II (write/write cache fusion)   Current version of the block is transferred over the interconnect
10g R1           Oracle Cluster Ready Services (CRS)          CRS eliminates the need for third-party clusterware, though it can be used
10g R2           Oracle CRS for High Availability             CRS provides high availability for non-Oracle applications
Key Components in Cache Fusion

Ping
The transfer of a data block from one instance's buffer cache to another instance's buffer cache is known as a ping.
Whenever an instance needs a block, it sends a request to the lock master to obtain a lock in the desired mode. If
another lock resides on the same block, the master asks the current holder to downgrade or release its lock;
this notification is known as a blocking asynchronous trap (BAST). When an instance receives a BAST, it downgrades the
lock as soon as possible. However, before downgrading the lock, it might have to write the corresponding block to disk.
This operation sequence is known as a disk ping or a hard ping.
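The request, BAST and ping sequence described above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the `LockMaster` class, its message strings and the instance names are hypothetical, a conflicting holder simply releases its lock rather than downgrading it, and no disk write is modeled.

```python
class LockMaster:
    """Toy lock master for one block resource (hypothetical model, not Oracle code)."""

    def __init__(self):
        self.holders = {}   # instance name -> held mode, "S" (shared) or "X" (exclusive)
        self.log = []       # ordered record of messages, for illustration only

    def request(self, instance, wanted):
        """Grant `wanted` to `instance`, sending a BAST to each conflicting holder first."""
        # Two locks conflict when they belong to different instances and
        # at least one of them is exclusive; S with S is compatible.
        conflicting = [
            holder for holder, mode in self.holders.items()
            if holder != instance and "X" in (mode, wanted)
        ]
        for holder in conflicting:
            self.log.append(f"BAST to {holder}")
            del self.holders[holder]   # simplification: the holder releases its lock
        if conflicting:
            # The block travels over the interconnect instead of through disk.
            self.log.append(f"ping {conflicting[0]} -> {instance}")
        self.holders[instance] = wanted
        self.log.append(f"grant {wanted} to {instance}")


master = LockMaster()
master.request("A", "S")
master.request("B", "S")   # compatible with A's shared lock, so no BAST
master.request("C", "X")   # conflicts with both shared holders
print(master.log)
```

The two shared requests are granted without any BAST, while the exclusive request forces both holders to give up their locks before the grant.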
CR Fabrication
Whenever there is a consistent read request from another instance, the holding instance (LMS) has to create a
consistent read image by applying the undo information to the current block. CR fabrication can be I/O
expensive, because the undo blocks may have to be read into the buffer cache before the undo image can be applied.
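A minimal sketch of the idea behind CR fabrication, assuming a single column value and an undo chain ordered newest first. The function name `fabricate_cr`, the SCN numbers and the values are illustrative assumptions, not Oracle internals.

```python
def fabricate_cr(current_value, undo_chain, snapshot_scn):
    """Roll a value back to how it looked at `snapshot_scn`.

    `undo_chain` is a list of (change_scn, before_value) pairs, newest first.
    Every change made after the reader's snapshot is undone in turn.
    """
    value = current_value
    for change_scn, before_value in undo_chain:
        if change_scn <= snapshot_scn:
            break               # this change is already visible to the reader
        value = before_value    # apply the undo record
    return value


# Hypothetical history: 340 -> 344 (SCN 11) -> 350 (SCN 12) -> 352 (SCN 13)
undo_chain = [(13, 350), (12, 344), (11, 340)]
print(fabricate_cr(352, undo_chain, snapshot_scn=10))
print(fabricate_cr(352, undo_chain, snapshot_scn=12))
```

A reader whose snapshot predates all three changes sees the oldest value, while a reader at SCN 12 only needs the last change undone. The current block itself is never altered; the CR image is a separate copy.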
Past Image (PI) Blocks
PI blocks are copies of blocks in the local buffer cache. Whenever an instance has to send a block it has recently
modified to another instance, it preserves a copy of that block, marking it as a PI. An instance is obliged to keep PIs until
that block is written to disk by the current owner of the block. PIs are discarded after the latest version of the block is
written to disk. When a block is written to disk and is known to have a global role, indicating the presence of PIs in other
instances' buffer caches, Global Cache Services (GCS) informs the instances holding the PIs to discard them. With
Cache Fusion, a block is written to disk to satisfy checkpoint requests and so on, not to transfer the block from one
instance to another via disk.
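The PI lifecycle described above can be followed in a small sketch. The `InstanceCache` class, the `write_block_to_disk` helper and the integer version numbers are hypothetical names introduced for illustration; the real GCS messaging is not modeled.

```python
class InstanceCache:
    """Toy buffer cache for one RAC instance (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.current = {}       # block id -> version held as the current block
        self.past_images = {}   # block id -> list of versions kept as PIs

    def modify(self, block_id):
        """Dirty the block by bumping its version number."""
        self.current[block_id] = self.current.get(block_id, 0) + 1

    def ship_dirty_block(self, block_id, target):
        """Send the current block to `target`, keeping a past image locally."""
        version = self.current.pop(block_id)
        self.past_images.setdefault(block_id, []).append(version)
        target.current[block_id] = version


def write_block_to_disk(owner, block_id, instances, disk):
    """The current owner writes the block; every instance then drops its PIs."""
    disk[block_id] = owner.current[block_id]
    for inst in instances:
        inst.past_images.pop(block_id, None)


a, b = InstanceCache("A"), InstanceCache("B")
disk = {}
a.modify(42)                  # A dirties block 42 (version 1)
a.ship_dirty_block(42, b)     # A keeps version 1 as a PI, B holds the current block
b.modify(42)                  # B changes it again (version 2)
pis_before_write = dict(a.past_images)
write_block_to_disk(b, 42, [a, b], disk)
print(pis_before_write, a.past_images, disk)
```

No disk write happens when the block moves from A to B; the only write is the later one by the current owner, after which A's past image is no longer needed.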
Lock Mastering
The memory structure where GCS keeps information about the usage of a data block (and other sharable resources) is known
as the lock resource. The responsibility of tracking locks is distributed among all the instances, and the required memory
also comes from the participating instances' System Global Area (SGA). Because of this distributed ownership of the
resources, a master node exists for each lock resource. The master node maintains complete information about current
users and requestors for the lock resource. The master node also contains information about the PIs of the block.
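One way to picture distributed mastering is sketched below. The text above does not say how a master is chosen, so the hash-based assignment here is an assumption made for illustration, as are the `master_of` function, the `ResourceEntry` class and the resource names.

```python
import zlib


def master_of(resource_name, instances):
    """Pick the mastering instance for a resource by hashing its name (assumed scheme)."""
    bucket = zlib.crc32(resource_name.encode("utf-8")) % len(instances)
    return instances[bucket]


class ResourceEntry:
    """What the master tracks for one lock resource, following the description above."""

    def __init__(self):
        self.holders = {}        # instance -> mode currently held
        self.requestors = []     # queued (instance, wanted mode) requests
        self.pi_holders = set()  # instances that keep past images of the block


instances = ["inst1", "inst2", "inst3"]
resources = [f"file4.block{n}" for n in range(8)]   # hypothetical resource names
masters = {name: master_of(name, instances) for name in resources}

# If an instance leaves, resources are mastered among the survivors.
survivors = ["inst1", "inst3"]
remastered = {name: master_of(name, survivors) for name in resources}
print(masters)
print(remastered)
```

Each resource has exactly one master at a time, and the directory memory is spread across the instances rather than held by a single node.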
Resource Affinity and Dynamic Remastering

Each block is mastered by exactly one instance at any given point in time.
The resource master can change based on how frequently other instances request the block.
If an instance requests a particular resource 50 times within a period of 10 minutes, the requesting
instance becomes the master. This is called resource affinity.
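The affinity rule above (50 requests within 10 minutes) can be sketched with a sliding window. The `AffinityTracker` class and its bookkeeping are hypothetical and only illustrate the rule as stated on the slide; they do not reflect how Oracle actually counts requests.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 10 * 60   # period stated above
THRESHOLD = 50             # request count stated above


class AffinityTracker:
    """Toy tracker that remasters one resource to a frequent requester."""

    def __init__(self, master):
        self.master = master
        self.requests = defaultdict(deque)   # instance -> recent request times

    def record_request(self, instance, now):
        """Record a request at time `now` (seconds) and return the current master."""
        times = self.requests[instance]
        times.append(now)
        # Forget requests that fell out of the 10-minute window.
        while times and now - times[0] > WINDOW_SECONDS:
            times.popleft()
        if instance != self.master and len(times) >= THRESHOLD:
            self.master = instance           # resource affinity: remaster
            times.clear()
        return self.master


busy = AffinityTracker(master="A")
for second in range(50):                     # 50 requests in under a minute
    busy_master = busy.record_request("B", second)

slow = AffinityTracker(master="A")
for minute in range(60):                     # one request per minute for an hour
    slow_master = slow.record_request("B", minute * 60)

print(busy_master, slow_master)
```

The burst of requests moves mastership to B, while the same instance asking once a minute never accumulates enough requests inside one window.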
Block Mastering
In Oracle 9.2
documentation describes dynamic remastering
not implemented in code
In Oracle 10.1
works at data file level
very high threshold, so difficult to test
does occur on some customer sites
may cause LMON process to crash in 10.1.0.4
bug 3659289 - patch available
fixed in 10.1.0.5/10.2.0.1
In Oracle 10.2
works at object level
thresholds are relatively low
object remastering is recorded in V$GCSPFMASTER_INFO
Cache Fusion: Possible Types of Contention

Contention for a resource occurs when two or more instances want the same
resource. If a resource such as a data block is being used by one instance and is
needed by another instance at the same time, contention occurs. There are three
types of contention for data blocks:
Read/Read contention: Read/read contention is never a problem because of the shared disk system. A block read by one
instance can be read by other instances without the intervention of GCS.
Write/Read contention: Write/read contention was addressed in Oracle 8i by the consistent read server. The holding
instance constructs the CR block and ships it to the requesting instance over the interconnect.
Write/Write contention: Write/write contention is addressed by the Cache Fusion technology. Since Oracle 9i, the cluster
interconnect is used in some cases to ship data blocks among the instances that need to modify the same data block
simultaneously.
Prior to Cache Fusion
(before 8.1.5)
Write/read contention before Cache Fusion
Cache Fusion I aka Consistent Read Server
Write/Read contention - CR Block Transfer in Cache Fusion

Oracle introduced a background process called BSP (Block Server Process), which performs the CR fabrication in the
holder's cache and ships the CR version of the block across the interconnect.
Still need to address Write/Write Contention
Write / Write Contention before Cache Fusion II (before 9i)
So now Cache Fusion II or Write/Write Cache Fusion
Cache Fusion current block transfer (from 9i R2)
Buffer States In Cache Fusion

Mode/Role      Local   Global
Null: N        NL      NG
Shared: S      SL      SG
Exclusive: X   XL      XG
SL: When an instance has a resource in SL form, it can serve a copy of the block to other instances and it can read
the block from disk. Since the block is not modified, there is no need to write it to disk.
XL: When an instance has a resource in XL form, it has sole ownership of and interest in that resource. It also has the
exclusive right to modify the block. All changes to the block are in its local buffer cache, and it can write the block to
disk. If another instance wants the block, it will contact the instance via GCS.
NL: The NL form is used to protect consistent read blocks. If a block is held in SL mode and another instance wants it in
X mode, the current instance will send the block to the requesting instance and downgrade its role to NL.
SG: In SG form, a block is present in one or more instances. An instance can read the block from disk and serve it to
other instances.
XG: In XG form, a block can have one or more PIs, indicating multiple copies of the block in several instances' buffer
caches. The instance with the XG role has the latest copy of the block and is the most likely candidate to write the
block to disk. GCS can ask the instance with the XG role to write the block to disk or to serve it to another instance.
NG: After discarding PIs when instructed by GCS, the block is kept in the buffer cache with the NG role. This serves only
as the CR copy of the block.
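The six buffer states above are simply every combination of a lock mode and a role. A small sketch, with hypothetical `Mode`, `Role`, `state_code` and `may_modify` names, makes the naming scheme explicit:

```python
from enum import Enum
from itertools import product


class Mode(Enum):
    NULL = "N"
    SHARED = "S"
    EXCLUSIVE = "X"


class Role(Enum):
    LOCAL = "L"
    GLOBAL = "G"


def state_code(mode, role):
    """Two-letter state name: mode letter followed by role letter."""
    return mode.value + role.value


def may_modify(mode):
    """Only an exclusive holder has the right to modify the block."""
    return mode is Mode.EXCLUSIVE


codes = [state_code(mode, role) for mode, role in product(Mode, Role)]
print(codes)
```

Keeping mode and role as separate values mirrors the table: the mode says what the instance may do with the block, and the role says whether other instances may also hold copies or past images of it.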
Example 1: Reading a Block from Disk
Example 2: Reading a Block from the Cache
Example 3: Getting a (Cached) Clean Block for Update
Example 4: Getting a (Cached) Modified Block for Update and Commit
Example 5: Commit the Previously Modified Block and Select the Data
Example 6: Write the Dirty Buffers to Disk Due to Checkpoint
Example 7: Master Instance Crash
Example 7: What the alert log says about reconfiguration
List of nodes:
012
Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
Communication channels reestablished
* domain 0 valid = 0 according to instance 0
Wed Jun 21 23:22:22 2006
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Wed Jun 21 23:22:22 2006
LMS 0: 0 GCS shadows cancelled, 0 closed
Wed Jun 21 23:22:22 2006
Wed Jun 21 23:22:22 2006
Wed Jun 21 23:22:22 2006
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Wed Jun 21 23:22:22 2006
LMS 0: 2189 GCS shadows traversed, 332 replayed
Wed Jun 21 23:22:22 2006
Wed Jun 21 23:22:22 2006
Wed Jun 21 23:22:22 2006
Wed Jun 21 23:22:22 2006
Submitted all GCS remote-cache requests
Fix write in gcs resources
Reconfiguration complete
Crash Recovery Key Components

Redo Threads and Streams
Redo Records and Change Vectors

Checkpoints
Thread Checkpoint or Local Checkpoint
Database Checkpoint or Global Checkpoint
Incremental Checkpoint
Bounded Recovery
Block Written Record (BWR)
Past Image (PI)
Checkpoints and PI
I Pass Recovery
II Pass Recovery
Merge Threads
Cache Fusion - Crash Instance Recovery
The steps for GRD reconfiguration are as follows:
Instance death is detected by the cluster manager.
Requests for PCM locks are frozen.
Enqueues are reconfigured and made available.
DLM recovery.
GCS (PCM lock) is remastered.
Pending writes and notifications are processed.
The steps for I Pass recovery are as follows:
The instance recovery (IR) lock is acquired by SMON.
The recovery set is prepared and built.
Memory space is allocated in the SMON Program Global Area (PGA).
SMON acquires locks on buffers that need recovery.
II Pass recovery steps are as follows:
II Pass is initiated. The database is partially available.
Blocks are made available as they are recovered.
The IR lock is released by SMON. Recovery is complete.
The system is available.
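The split between the two passes can be sketched as follows. This is a heavily simplified illustration under stated assumptions: redo is a flat list of records, a Block Written Record (BWR) is taken to mean that earlier changes to that block are already on disk, and the record layout and the `first_pass` and `second_pass` names are hypothetical.

```python
def first_pass(redo_records):
    """Build the recovery set: block id -> changes that still need to be applied."""
    recovery_set = {}
    for rec in redo_records:
        if rec["type"] == "BWR":
            # Assumed meaning: the block reached disk, so changes logged
            # before this record need no recovery.
            recovery_set.pop(rec["block"], None)
        else:
            recovery_set.setdefault(rec["block"], []).append(rec["value"])
    return recovery_set


def second_pass(recovery_set, disk):
    """Apply the remaining changes, block by block, in redo order."""
    for block, values in recovery_set.items():
        for value in values:
            disk[block] = value
    return disk


redo = [
    {"type": "change", "block": 1, "value": 1},
    {"type": "change", "block": 2, "value": 5},
    {"type": "BWR", "block": 1},               # block 1 was written to disk
    {"type": "change", "block": 2, "value": 6},
    {"type": "change", "block": 3, "value": 9},
]
disk = {1: 1, 2: 0, 3: 0}                      # state of the datafiles at the crash
recovery_set = first_pass(redo)
recovered = second_pass(recovery_set, disk)
print(recovery_set, recovered)
```

The first pass only decides which blocks need work, which is why the rest of the database can become partially available before the second pass has applied every change.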
Example 8: Select the Rows from Instance A
Just for a clear understanding

It's time to play
Cross Instance Consistent Read

[Figure: animated walk-through of a cross-instance consistent read on data block 42, involving Instance 1, Instance 2, Session 15, Session 27 and LMS0.]
Session 15:
SELECT runs,wickets
FROM score
WHERE team = 'ENG';
Session 27 (the animation cycles the increment through 6, 4 and 2):
UPDATE score
SET runs = runs + 6
WHERE team = 'ENG';
LMS0: builds a read consistent version of block 42 from a copy of the block.
Undo header: segment 5 slot 18, state: 10, wrap#: 4E7, dba: 00800777, seq: 530, irb 12.
ITL1: xid 0005.018.4E7, uba moving through 800777.530.12, 800777.530.13 and 800777.530.14.
Data block 42, slot 0: col1 ENG, col2 taking the values 340, 344, 350 and 352, col3 1.
Data block 42, slot 1: col1 AUS, col2 99, col3 10.
Undo block 800777: records 12, 13 and 14 for block 42 slot 0 hold the values 340, 344 and 350.
Committed Block, Block on Disk
[Figure: Instance 1 and Instance 2 with Session1, Session2 and LMS0. Block 42 shows ENG 205 and AUS 99; the undo block shows ENG 199, 200 and 204.]
Query:
SELECT runs
FROM score
WHERE team = 'ENG';
Updates:
UPDATE score
SET runs = 200
WHERE team = 'ENG';
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
COMMIT;
Committed Block, Block in Buffer Cache
[Figure: Instance 1 and Instance 2 with Session1, Session2 and LMS0. Block 42 shows ENG 205 and AUS 99; the undo block shows ENG 199, 200 and 204; the query panel shows ENG 199 and AUS 99.]
Query:
SELECT runs
FROM score
WHERE team = 'ENG';
Updates:
UPDATE score
SET runs = 200
WHERE team = 'ENG';
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
COMMIT;
Uncommitted Block, Block in Buffer Cache
[Figure: Instance 1 and Instance 2 with Session1, Session2 and LMS0. Block 42 and a copy of block 42 are shown alongside the undo block holding ENG 199, 200 and 204; the query panel shows ENG 199 and AUS 99.]
Query:
SELECT runs
FROM score
WHERE team = 'ENG';
Updates (not committed):
UPDATE score
SET runs = 200
WHERE team = 'ENG';
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
Uncommitted Block, Block on Disk
[Figure: Instance 1 and Instance 2 with Session1, Session2 and LMS0. Block 42 shows ENG 205 and AUS 99; the undo block shows ENG 199, 200 and 204. See slide notes for additional information.]
Query:
SELECT runs
FROM score
WHERE team = 'ENG';
Updates (not committed):
UPDATE score
SET runs = 200
WHERE team = 'ENG';
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
Q&A
References:
K Gopalakrishnan, Oracle 10g Real Application Clusters Handbook
Julian Dyke, RAC presentation
Oracle 10g RAC Administrator's Guide
