The ADABAS Buffer Pool Manager

Harald Schöning
Software AG, Uhlandstr. 12, 64297 Darmstadt
hs@software-ag.de

Permission to copy without fee all or part of this material is granted
provided that the copies are not made or distributed for direct
commercial advantage, the VLDB copyright notice and the title of the
publication and its date appear, and notice is given that copying is by
permission of the Very Large Data Base Endowment. To copy otherwise, or
to republish, requires a fee and/or special permission from the
Endowment.

Proceedings of the 24th VLDB Conference
New York, USA, 1998

Abstract

The buffer pool manager is a central component of ADABAS, a
high-performance, scalable database system for OLTP processing. High
efficiency and scalability of the buffer pool manager are mandatory for
ADABAS on all supported platforms. In order to allow a maximum of
parallelism without facing the danger of deadlocks, a multi-version
locking method is used. Partitioning of central data structures is
another key to performance. Variable page sizes allow for flexible
tuning, but make the buffer pool logic more sophisticated, in particular
concerning parallelism.

1 Introduction

SOFTWARE AG's ADABAS is a database system with a very long history. It
is successfully used in business-critical OLTP applications which depend
on the robustness and the high performance of the underlying database
system, such as flight reservation systems, emergency management, etc.
Over the years it has been ported to all major operating systems. Today,
ADABAS databases can operate on UNIX machines running various flavors of
UNIX, WINDOWS PCs, various mainframe operating systems such as MVS, VSE,
and BS2000, and other platforms.

To cope with the evolving world of environments, continuous
re-engineering of ADABAS has been necessary. In particular,
multi-processing architectures (symmetric multi-processing) have caused
considerable changes to the whole system. One component which has
recently been completely redesigned during this process is the ADABAS
buffer pool manager.

The primary task of the buffer pool manager is to cache database pages
which have been read from disk, in order to save I/Os if those pages are
re-referenced. Changes to database pages are performed in the buffer
pool only and not immediately written to disk. Of course, adequate
logging is needed to guarantee the persistence of committed
transactions, but logging algorithms are beyond the scope of this paper.

The buffer pool also handles temporary information which is written to
disk only if there is a lack of space.

In the following, we briefly mention some characteristics of ADABAS
which are important for the buffer pool design.

1.1 Container Types

ADABAS data on disk are organized in two so-called container files. The
DATA container stores the actual data of a database in a compressed
form, while the ASSO container stores the schema information, the
database translation table (called Address Converter in ADABAS), and the
indexes. Each record in the DATA container has a unique identifier. The
Address Converter maps the unique ID to a physical location. The indexes
contain logical identifiers only. There is a third container file called
WORK which is used to store temporary data and log information.

Container files can be distributed over an arbitrary number of disks,
using raw device or file system access (or mixing both). A container
consists of pages, which are numbered in ascending sequence. These pages
are the unit of input and output. The buffer pool stores pages from each
of the three container types.

1.2 Varying page sizes

The pages within a container may have different sizes. Depending on the
size of the data of a database table and the typical access pattern, the
database administrator (DBA) can adjust the page size (in a range of 1
to 32 Kilobytes). For example, if a table is typically accessed with
exact match queries via an index, the DBA might choose a page size for
the data storage such that a page contains only a few data records. The
page size for the
index, and for the address converter, can be chosen independently. This
flexibility in page size can also be used for an adaptation of the
database to changing storage medium characteristics, as pointed out in
[GG97].

As a consequence, the ADABAS buffer pool manager must cope with pages of
various sizes from all three container types. In particular, the varying
page sizes affect the buffer replacement algorithms.

Other commercial database systems do not have this freedom in
configuration, and, as a consequence, do not need to handle such complex
replacement problems. The research database system PRIMA [HMMS87],
developed at the University of Kaiserslautern, also supports multiple
page sizes. The replacement algorithm implemented there [Si88], however,
differs from the one used in ADABAS. There, a free list is kept separate
from the LRU chain. This list is then copied and the buffers contained
in the LRU chain are consecutively marked as replaceable. When an area
has been found which is large enough for the new buffer, the process
stops. This algorithm, however, lacks the flexibility of the ADABAS
algorithm described below.

1.3 Parallel access

The buffer pool manager has to ensure proper synchronization of the
accesses to the pages in the buffer pool. Several threads which execute
in parallel on multiple CPUs may want to access the same database page,
and hence the same buffer. Of course, the synchronization has to be very
efficient, not only avoiding deadlocks, but also keeping waiting times
as short as possible.

The following sections discuss the architecture and the algorithms
chosen for the new ADABAS buffer pool manager.

2 Architecture of the buffer pool manager

The buffer pool is allocated as a contiguous piece of memory. In order
to avoid double page faults [EH84], i.e. page faults in the operating
system's virtual memory, the whole buffer pool can be pinned in the
physical memory. When a block from disk is read into the buffer pool, a
header structure is assigned to it, which stores all information needed
for the management of the block, including, of course, the
identification of the database pages this block corresponds to. These
headers themselves are allocated in the buffer pool in contiguous areas.
Note that the variable page size in ADABAS makes it impossible to
predict the number of headers needed (of course, the least possible page
size determines an upper limit to the number of headers, but allocating
that many headers in advance could waste a lot of space).

ADABAS directly references the pages in the buffer pool. Therefore, the
address of a page must not change and the page must not be removed from
the buffer pool while a command is still working on it. To guarantee
this, database management systems usually FIX and UNFIX the pages in a
buffer pool explicitly [EH84]. In the ADABAS buffer pool manager, this
functionality is combined with the synchronization of page accesses by
the ADABAS commands. For this purpose, each header contains a
readers/writer lock.

The headers are linked in physical sequence (so-called physical chain)
and in LRU sequence (LRU chain). Furthermore, to enable an efficient
search for a specific database page, a hash structure is allocated. Each
hash bucket contains the pointers to the corresponding headers and is
protected by its own latch. Hence, lock conflicts on the hash structure
are rare. The overflow of a hash bucket is organized as an AVL tree.
Thus, even in the case where a bucket has a large amount of overflow
information the access remains fast - another prerequisite for low lock
contention on the hash structure. The buffer pool architecture is
depicted in Figure 1 (LRU chain not shown).

3 Page Access Synchronization

The access to a database page works as follows: The page is searched in
the hash structure. The hash bucket is protected by a latch (a
short-time mutual exclusion lock). If the page is not found, a header
for the database page is allocated, exclusively locked, and entered into
the hash structure such that other tasks have a reference to it. Then
the physical I/O is started. When it is finished, the exclusive lock can
be downgraded to a shared lock if the database page was needed for
reading only.

If the page has been found in the hash structure, the buffer pool tries
to acquire a lock of the requested quality (shared or exclusive) and, if
successful, returns a pointer to the corresponding location in the
buffer pool.

Locks on database pages are held until the command has performed its
changes to the page, i.e. for a very short time only.

When database pages are logically linked (e.g. nodes in an index tree),
and updates affecting this link have to be performed, more than one page
has to be exclusively locked at a time. While deadlocks in such
situations often can be prevented by enforcing a certain sequence of
locking, this is not possible in all cases. For example, positioning in
an index is done from root to leaf, while index updates occur from the
leaf to the root. Hence, positioning and updating simultaneously could
lead to a deadlock. Locking the whole index would lead to unacceptable
waiting situations. To cope with this situation, the ADABAS buffer pool
manager uses a multi-version locking scheme: when an exclusive lock is
not granted, the buffer pool tries to acquire a shared lock. If this is
granted, a copy of the database page is generated in
the buffer pool, and the pointer in the hash structure is set to this
copy. Hence, all threads that subsequently search for this page will
find the copy. The block containing the original page remains in the
buffer pool, but is placed at the end of the LRU chain, thus being a
prime candidate for replacement once all (shared) locks on it are
released. Its access time stamp is set to zero.

[Figure 1: The architecture of the ADABAS buffer pool (hash structure,
buffer headers, and a contiguous buffer area containing database pages
of varying sizes)]

A consequence of this locking scheme is that shared locks do not protect
a logical link between pages against changes. Therefore, pages found by
following a link always have to be re-evaluated before they can be used.
Example 1 illustrates this effect.

Note that in high-load situations, more than one thread could copy the
same database page. Exactly one of these copies must survive. For this
purpose, the exchange in the hash structure for search must be atomic.
It fails if the address to be replaced is not the expected one.

4 Saving Changes to Disk

The ADABAS buffer pool manager saves changed pages to disk in an
asynchronous manner. When a certain (configurable) percentage of all
pages has been modified, an asynchronous thread (called the buffer flush
thread) starts to write all changed pages to the disk. Of course, the
pages must not be modified while they are written to disk. On the other
hand, it is not acceptable to defer changes to those pages until the
writing has been done. If a page which is locked by the buffer flush
thread is to be changed by another thread, the same multi-version
locking as described above is applied. The pages involved in the buffer
flush are locked by the buffer flush thread using a privileged read
lock. If a page is currently write locked, it is entered into a
refused-lock list and skipped. After all other pages are locked, the
buffer flush thread blocks on the pages in the refused-lock list if
necessary. Typically, the updating command that had held a lock on those
pages has meanwhile released the lock.

The asynchronous writing of changed pages has some consequences:

• A page which has been chosen for replacement need not be written to
  disk before it is replaced, allowing a fast replacement.
• Pages cannot be replaced if they have been changed after the last
  buffer flush. Therefore, a buffer flush must occur before too many
  pages are "dirty". On the other hand, flushing too early destroys the
  caching effect for updates because pages are written to disk after
  fewer updates per page. Obviously, finding a reasonable percentage of
  dirty pages for the start of a buffer flush is not trivial. In order
  to relieve the DBA from this task, ADABAS can choose a useful
  percentage and internally adapt it to the current situation.
• The crash recovery algorithms are tightly coupled to the asynchronous
  writing. The start and the end of the buffer flush are logged. From
  this logging information the crash recovery algorithm can infer which
  database changes are already reflected on disk and which are
  (possibly) not. Therefore it is essential that all changed pages are
  covered by the buffer flush. On the other hand, pages which are very
  frequently updated could defer the whole buffer flush considerably. To
  cope with such situations, the buffer flush can be split into a first
  part which contains all pages which could be locked without blocking
  on the lock, and a second part which flushes all the other pages (and
  usually handles very few pages).

5 Buffer Replacement Handling

As pointed out earlier, buffer replacement in ADABAS is quite
sophisticated. If a page of a certain page size is to be read into the
buffer pool, and the buffer pool is filled up (which is the normal case
after an initial filling phase), the necessary space must be provided by
selecting another buffer which can be overwritten. However, because of
the varying page sizes in an ADABAS database, it might be necessary to
overwrite several other pages of smaller page size. In databases with
one fixed page size, the first available page when searching from the
end of the LRU chain can be chosen for replacement. This is not true for
ADABAS.

Consider the following case: A page with size 4 KB has to be read into
the buffer pool. At the end of the LRU chain, only 2 KB pages can be
found. The next 4 KB page is close to the beginning of the LRU chain,
i.e., it is a fairly new page. In this case, one of the 2 KB pages and
its physical neighbor should be replaced. However, the physical neighbor
might also be a very new page. To find good replacement candidates, the
following handling is applied. The LRU chain is searched from its end.
When a page is found which is available for replacement (i.e. contains
no unwritten changes and is not locked), ADABAS searches for the
necessary space starting from this page. The left neighbors are
considered, as long as
they can be replaced and the necessary space is not yet gathered. Then,
the right neighbors are checked. The space between the left-most
neighbor found and the right-most one usually leaves several choices for
so-called overlay sets, i.e. sets of buffers which could be replaced to
gain the needed space (cf. Figure 2). The overlay set with the lowest
cost is stored.

[Figure 2: Two overlay sets based on page 1567 (LRU chain and buffer
headers)]

Then the LRU search is continued, until an upper limit of found pages is
reached, or until a single page is found which is large enough to
provide the necessary space. This page is a singleton overlay set. The
cheapest overlay set is chosen for replacement.

The cost of an overlay set is determined by applying a function to the
access time stamps of the pages in the set. The choice of this function
heavily influences the replacement algorithm. If the function is MIN,
for example, the first overlay set found would be selected. If the
function is SUM, the likelihood of a multi-buffer replacement decreases
rapidly with the number of pages. In the case of MAX, an overlay set is
chosen only if all its pages are older than the oldest replaceable
single page of sufficient size.

Obviously, the search for replacement candidates is an operation which
takes quite long due to the need to cope with different page sizes.
Unfortunately, the LRU chain cannot be changed while such a search is
performed. Since every access to a database page should update the LRU
chain (placing the accessed page in front of the LRU chain), there is a
considerable bottleneck. To avoid lock contention on the LRU chain,
ADABAS splits the buffer pool into several physical regions, where each
region has its own LRU chain. The number of regions depends on the size
of the buffer pool, the maximum parallelism allowed by the DBA, and
other criteria.

These physical regions are chosen for replacement in a round-robin
manner. Only the affected LRU chain is locked. Furthermore, the updates
to the LRU chain are deferred. Obviously, the updates need not be done
before a replacement search is performed. Hence, the access to pages is
memorized, but it is reflected in the LRU chain only when the lock for
replacement search is required anyway. Note that, in contrast to the
procedure used in ORACLE [Br97], the ADABAS algorithm reflects the
correct sequence of accesses in the LRU chain.

6 Prefetching

Sequential operations which scan the data of multiple adjacent database
pages are quite common. In ADABAS, the DBA can optimize for such
operations, e.g. by choosing large page sizes. However, such
optimizations are complex and might decrease the performance of commands
with different access patterns. Therefore, ADABAS recognizes sequential
access, and can read several pages in one I/O into a contiguous buffer
pool area. The replacement algorithm described above obviously covers
this case without adaptation. Although read with one I/O, all pages have
their own header and are managed by the buffer pool manager as if they
had been read separately. The number of pages to be read in one I/O is
dynamically determined according to the following criteria:

• Maximum number of pages needed by the current command
• Number of pages which fit into a single physical I/O. This is platform
  dependent, but it also depends on the distribution of the container
  files over disks.
• Next page which is already in the buffer pool. In order to avoid
  inconsistencies with updates on this page, the page must not be
  re-read into the buffer pool.
• Available space in the buffer pool

7 Summary

The ADABAS buffer pool manager has been completely redesigned for the
latest parallel version of ADABAS. In particular, the locking of the LRU
chain had been a bottleneck, mainly because the replacement algorithm is
very complex due to the different page sizes used in a database. To
prevent lock contention on the LRU chain, the chain has been split into
several areas. Furthermore, the update of the LRU chain is done lazily.
In order to increase parallelism and avoid deadlocks, in particular in
index operations, dedicated multi-version locking protocols have been
introduced. Various dynamic optimizations relieve the DBA from overly
sophisticated tuning. The use of further self-tuning algorithms such as
LRU-2 [OOW93] is constantly investigated.
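
The overlay set search of Section 5 can be illustrated with a small
sketch. It is not the ADABAS implementation: the `Buf` record, the
function names, and the way candidate windows are enumerated are
assumptions made for illustration. Only the gathering of replaceable
physical neighbors to the left and right of a page found in the LRU
search, and the costing of each overlay set by a function (MIN, SUM, or
MAX) over access time stamps, are taken from the text. Larger time
stamps are assumed to mean more recent access.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Buf:
    size_kb: int       # page size of the buffer in KB
    stamp: int         # access time stamp (assumed: larger = more recent)
    replaceable: bool  # no unwritten changes and not locked


def overlay_sets(bufs: Sequence[Buf], start: int, need_kb: int) -> List[List[int]]:
    """Enumerate overlay sets around the physical position `start`.

    Each set is a list of contiguous buffer indexes that contains `start`
    and provides at least `need_kb` of space.
    """
    if not bufs[start].replaceable:
        return []
    # Consider left neighbors while they are replaceable and space is missing.
    lo, left_kb = start, bufs[start].size_kb
    while left_kb < need_kb and lo > 0 and bufs[lo - 1].replaceable:
        lo -= 1
        left_kb += bufs[lo].size_kb
    # Then check the right neighbors in the same way.
    hi, right_kb = start, bufs[start].size_kb
    while right_kb < need_kb and hi + 1 < len(bufs) and bufs[hi + 1].replaceable:
        hi += 1
        right_kb += bufs[hi].size_kb
    # Every smallest window inside [lo, hi] that covers `start` is a candidate.
    sets = []
    for first in range(lo, start + 1):
        total = 0
        for last in range(first, hi + 1):
            total += bufs[last].size_kb
            if last >= start and total >= need_kb:
                sets.append(list(range(first, last + 1)))
                break
    return sets


def cheapest(bufs: Sequence[Buf], candidates: List[List[int]],
             cost: Callable[[List[int]], int]) -> List[int]:
    """Pick the candidate whose time stamps give the lowest cost."""
    return min(candidates, key=lambda s: cost([bufs[i].stamp for i in s]))


# Physical order of buffers; a 4 KB page is to be read and buffer 1 was
# found at the end of the LRU chain. Buffer 3 is a singleton overlay set.
bufs = [Buf(2, 5, True), Buf(2, 1, True), Buf(2, 90, True), Buf(4, 40, True)]
sets = overlay_sets(bufs, start=1, need_kb=4)   # [[0, 1], [1, 2]]
best = cheapest(bufs, sets + [[3]], cost=max)   # [0, 1]
```

With `max` as cost function, the set containing the very new neighbor
(time stamp 90) is more expensive than the single 4 KB page, whereas the
set with the older left neighbor is cheaper than both.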

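The region splitting and the deferred LRU update described in Section 5
might look roughly as follows. This is a hypothetical sketch: the
`Region` and `Pool` classes, the separate short latch guarding the list
of memorized accesses, and the use of Python's `OrderedDict` as LRU
chain are assumptions, not details of ADABAS. The sketch shows only that
page accesses are memorized without locking the LRU chain and are
applied, in their original sequence, when a replacement search locks the
chain of the region whose turn it is.

```python
import itertools
import threading
from collections import OrderedDict


class Region:
    """One physical region of the buffer pool with its own LRU chain."""

    def __init__(self):
        self.lru = OrderedDict()               # front = most recently used
        self.lru_lock = threading.Lock()       # held during replacement search
        self.pending = []                      # memorized accesses, in sequence
        self.pending_latch = threading.Lock()  # short latch (assumption)

    def touch(self, page):
        """Memorize an access without locking the LRU chain."""
        with self.pending_latch:
            self.pending.append(page)

    def search_order(self):
        """Lock the chain, apply deferred accesses, list pages from the LRU end."""
        with self.lru_lock:
            with self.pending_latch:
                batch, self.pending = self.pending, []
            for page in batch:
                self.lru[page] = None
                self.lru.move_to_end(page, last=False)  # move to the front
            return list(reversed(self.lru))


class Pool:
    """Buffer pool split into regions that are searched round-robin."""

    def __init__(self, n_regions):
        self.regions = [Region() for _ in range(n_regions)]
        self._turn = itertools.cycle(range(n_regions))

    def next_region(self):
        return self.regions[next(self._turn)]


region = Region()
for page in (1, 2, 1):
    region.touch(page)          # the LRU chain itself is not touched here
order = region.search_order()   # [2, 1]: page 2 is the least recently used
```

Because the memorized accesses are replayed in the order in which they
occurred, the chain ends up in the same state as if every access had
updated it immediately.
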
a) Thread 1 wants to insert value 20 into the index. It locks leaf 3
   exclusively.

b) Thread 2 wants to read value 30. It acquires a shared lock on node 1.
   Before it gives up the lock again, thread 1 tries to lock node 1
   because leaf 3 has to be split due to lack of space for value 20. It
   does not get the lock and creates a copy of node 1. Then it creates a
   new leaf 5.

c) Thread 2 had blocked on the shared lock on leaf 3. After thread 1 has
   finished, thread 2 gets the shared lock now. The value 30, however,
   cannot be found in leaf 3 any longer. Thread 2 must check whether the
   version of node 1 that it had seen is still the current one. If so,
   the value 30 is not in the index, otherwise thread 2 must repeat the
   positioning.

Example 1: The need for repositioning
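
The version check in step c) relies on the atomic exchange of the
pointer in the hash structure described in Section 3. The following
sketch is an illustration under assumptions, not ADABAS code:
`PageTable`, its method names, and the single Python lock standing in
for a hash bucket latch are hypothetical. It shows that a copy is
installed only if the expected version is still the current one, so that
only one of several concurrent copies survives, and that a reader can
detect that the version it has seen was replaced.

```python
import threading


class PageTable:
    """Maps page ids to the current in-memory version of each page."""

    def __init__(self):
        self._current = {}
        self._latch = threading.Lock()  # stands in for the hash bucket latch

    def put(self, page_id, version):
        with self._latch:
            self._current[page_id] = version

    def lookup(self, page_id):
        with self._latch:
            return self._current[page_id]

    def exchange(self, page_id, expected, replacement):
        """Install `replacement` only if `expected` is still the current version."""
        with self._latch:
            if self._current.get(page_id) is not expected:
                return False
            self._current[page_id] = replacement
            return True

    def is_current(self, page_id, version):
        with self._latch:
            return self._current.get(page_id) is version


table = PageTable()
node1 = {"keys": [30]}
table.put(1, node1)

seen_by_reader = table.lookup(1)   # thread 2 reads node 1 under a shared lock
copy_a = dict(node1)               # thread 1 gets no exclusive lock and copies
copy_b = dict(node1)               # another writer copies the same page
won_a = table.exchange(1, node1, copy_a)   # succeeds
won_b = table.exchange(1, node1, copy_b)   # fails: node1 is no longer current
must_reposition = not table.is_current(1, seen_by_reader)
```

Versions are compared by identity rather than by content, since a fresh
copy initially has the same content as the original page.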

References

Br97    Bridge, W., et al.: The Oracle Universal Server Buffer Manager,
        in: Proc. 23rd Int. Conference On Very Large Data Bases,
        VLDB 97, pp. 590-594.
EH84    Effelsberg, W., Härder, T.: ACM Transactions on Database
        Systems, Vol. 9, No. 4, Dec. 1984, pp. 560-595.
GG97    Gray, J., Graefe, G.: The Five-Minute Rule Ten Years Later, and
        Other Computer Storage Rules of Thumb, in: SIGMOD RECORD
        Vol. 26, No. 4, Dec. 1997, pp. 63-68.
HMMS87  Härder, T., Meyer-Wegener, K., Mitschang, B., Sikeler, A.:
        PRIMA - A DBMS Prototype Supporting Engineering Applications,
        in: Proc. 13th VLDB, 1987, pp. 433-442.
OOW93   O'Neil, E.J., O'Neil, P.E., Weikum, G.: The LRU-K Page
        Replacement Algorithm for database disk buffering, in: Proc.
        ACM SIGMOD Int. Conf. On Management of Data, 1993, pp. 297-306.
Si88    Sikeler, A.: VAR-PAGE-LRU: A Buffer Replacement Algorithm
        Supporting Different Page Sizes, in: Proc. Int. Conf. On
        Extending Database Technology, EDBT88, Venice, Italy,
        Springer-Verlag, Berlin, 1988, pp. 336-351.
