Topic 8 - OO Database Systems
The Object-Oriented Paradigm
05/03/24 1
OO Database Systems:
Recommended References
Date C. J. (1990) An Introduction to Database Systems. Volume 1, 5th Edition
Addison Wesley
Korth HF & Silberschatz A (1991) Database System Concepts Second Edition
McGraw-Hill
Cattell, R.G.G. (1991) Object Data Management. OO & Extended Relational
Database Systems, Addison-Wesley Publishing
Atkinson, M., DeWitt, D., Maier, D., Bancilhon, F., Dittrich, K., Zdonik, S. (1990) The
Object-Oriented Database System Manifesto, In: “Deductive and Object-Oriented
Databases”, Elsevier Science Publishers
Zdonik, S., Maier, D. Eds. (1990) Readings in Object-Oriented Database Systems
Morgan Kaufmann Publishers
Won Kim (1991) Introduction to Object-Oriented Databases, The MIT Press,
Cambridge (Massachusetts)
Unland, R., Schlageter, G. (1992) Object-Oriented Database Systems: State of the Art and
Research Problems. In: Expert Database Systems, Academic Press Ltd
“Old” Database Applications:
Features
Uniformity
Record orientation
Small data items
Atomic fields
Short transactions
Static conceptual schemes
Who Needs Object-Oriented
Databases ?
Computer-aided design (CAD)
Computer-aided software engineering (CASE)
Multimedia databases
Office information systems (OIS)
Expert database systems
Conventional DBMS for New
Applications: Disadvantages
Artificial representation of complex structured objects.
Properties of objects cannot be modeled in an
appropriate way
Operational semantics of complex structured objects is
not expressible
Different levels of abstraction are not supported
Lack of trigger, constraint and event mechanisms
Clumsy database access (impedance mismatch)
Object-Oriented Paradigm
Our environment exclusively consists of objects
Together with the objects comes their behavior; i.e. objects are described by
their functionality
Mostly we know the functionality of objects but we don't know how this
functionality is realized (encapsulation)
Objects react to messages
Only the objects themselves decide in which way they react to a message
Different objects may react to the same message in different ways
(polymorphism)
Objects inherit characteristics and abilities as a result of their membership of a
special class or category
The concepts of inheritance and polymorphism can best be exploited if the
underlying system supports run-time type checking and late binding - the
operation to be used is chosen at run time and converted to a program address
Object-Oriented Databases:
Basic Definitions
In an object-oriented system everything is an object
Objects are encapsulated which means that there is no way to access an object
except through the public interface - a set of operations - specified for it
To emphasize object independence, objects communicate by message passing
A class is a template that special operations (like new) can use to create new
objects
Inheritance makes it possible to declare a class as a specialization of another
class, thus supporting the management of (hierarchical) relationships among
classes as well as the reusability of software
The ability of different objects to respond differently to the same message is
known as polymorphism
The concepts of inheritance and polymorphism can best be exploited if the
underlying system supports run-time type checking and late binding - operation
names are translated into program addresses at run-time
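A minimal Python sketch of encapsulation and message passing, assuming that a method call looked up by name stands in for a message. The classes `Counter` and `Basket` and the helper `send` are hypothetical names used only for illustration. Python marks private state by convention (a leading underscore) rather than enforcing it.

```python
class Counter:
    """Keeps only a count; the representation is hidden behind methods."""

    def __init__(self):
        self._count = 0              # private by convention

    def add(self, item):
        self._count += 1

    def report(self):
        return f"{self._count} items counted"


class Basket:
    """Keeps the items themselves; same public interface as Counter."""

    def __init__(self):
        self._items = []             # private by convention

    def add(self, item):
        self._items.append(item)

    def report(self):
        return ", ".join(self._items)


def send(receiver, message, *args):
    """Deliver a message by name; the receiver decides how to react."""
    handler = getattr(receiver, message)   # looked up at run time
    return handler(*args)


receivers = [Counter(), Basket()]
for r in receivers:
    send(r, "add", "apple")
    send(r, "add", "pear")

# The same message produces a different reaction per object.
reports = [send(r, "report") for r in receivers]
```

The sender never touches `_count` or `_items`; it only knows the message names `add` and `report`.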
O-O Databases: Top-Down &
Bottom-Up Approach
Historical roots in
– database technology, including semantic data models
– abstraction-based programming languages, e.g., Simula-67, Smalltalk
– knowledge representation in AI
Bottom-Up approach - by database community
– Structurally object-oriented
Top-Down approach - by programming languages people
– behavioral object-orientation
– persistence - objects remain in existence even if an application program
ends
– object sharing - independent applications may use the same data
concurrently
Origins of Object-Oriented
Database Concepts
Traditional database systems
– Persistence
– Sharing
– Query Language
– Concurrency control
– Transaction Management
Semantic data models
– Aggregation
– Generalization
Object-Oriented Programming
– Complex objects
– Object Identity
– Classes and Methods
– Encapsulation
– Inheritance
– Extensibility
Object-Oriented Data Model
The Object-Oriented Database
System Manifesto
An object-oriented database system must satisfy two criteria: it should be a DBMS,
and it should be an object-oriented system, i.e., to the extent possible, it should be
consistent with the current crop of object-oriented programming languages. The first
criterion translates into five features: persistence, secondary storage management,
concurrency, recovery and an ad hoc query facility. The second one translates into
eight features: complex objects, object identity, encapsulation, types or classes,
inheritance, overriding combined with late binding, extensibility and computational
completeness.
– The above listed features are classified as the mandatory characteristics
– Optional features include the ones that can be added to make the system better,
but which are not mandatory. These are multiple inheritance, type checking and
inferencing, distribution, design transactions and versions.
– Open features are the points where the designer can make a number of choices.
These are the programming paradigm, the representation system, the type
system, and uniformity.
OODBS Golden Rules: Support
for Complex Objects (1)
Complex objects are built from simpler ones by applying constructors to them
The simplest objects are objects such as integers, characters, byte strings of any
length, booleans and floats
There are various complex object constructors: tuples, sets, bags, lists, and arrays
The minimal set of constructors that the system should have are set, list and tuple
Sets are critical because they are a natural way of representing collections from the
real world
Tuples are critical because they are a natural way of representing properties of an
entity
Lists or arrays are important because they capture order, which occurs in the real
world
The object constructors must be orthogonal: any constructor should apply to any
object
Operations on a complex object must propagate transitively to all its components
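The constructor and propagation rules above can be sketched in Python, assuming that a `dict` plays the role of a tuple with named fields and that copying is the operation being propagated. The `design` data is invented for illustration. Python's built-in `set` only accepts hashable members, so it is less orthogonal than the rule demands.

```python
import copy

# Constructors applied to one another: a list of tuples that contain sets.
inventory = [("shaft", {8, 10}), ("housing", {12})]

# A complex object: a record whose field holds a list of records.
design = {
    "name": "gearbox",
    "parts": [
        {"name": "shaft", "bolts": [8, 8, 10]},
        {"name": "housing", "bolts": [12]},
    ],
}

shallow = copy.copy(design)       # copies the top-level object only
deep = copy.deepcopy(design)      # propagates transitively to all components

# Change a component of the original after copying.
design["parts"][0]["bolts"].append(6)

shallow_bolts = shallow["parts"][0]["bolts"]   # still shared with design
deep_bolts = deep["parts"][0]["bolts"]         # independent component
```

Only the deep copy behaves like an operation on the whole complex object; the shallow copy still shares its components with the original.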
OODBS Golden Rules: Support for
Object Identity (2)
The idea of object identity is the following: in a model with object identity, an object has an existence
which is independent of its value
Each object has a unique, unchangeable (immutable) ID independent of its current state or behaviour
Logical ID - implementation free (PK) - value oriented
OID - system generated - unique within an environment, not between environments
Thus two notions of object equivalence exist: two objects can be identical or they can be equal. This has
two implications: object sharing and object updates
In an identity-based model, two objects can share a component
– Example: A Person has a name, an age and a set of children. Assume Peter and Susan both have a
15-year-old child named John. In real life, two situations may arise: Susan and Peter are parents of
the same child or there are two children involved. In a system without identity, Peter is represented
by:
(peter, 40, {(john, 15, {})})
and Susan is represented by:
(susan, 41, {(john, 15, {})})
so there is no way to express whether they are parents of the same child.
In an identity-based model, these two structures can share the common part (john, 15, {}) or
not, thus capturing either situation. (SETS in CODASYL)
– Support for many-to-many relationships?
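The difference between identical and equal objects can be sketched in Python, where `is` tests identity and `==` tests value equality. The `Person` class is a hypothetical illustration of the Peter and Susan example, and Susan's age is an assumed value.

```python
class Person:
    def __init__(self, name, age, children=None):
        self.name = name
        self.age = age
        self.children = children if children is not None else []

    def __eq__(self, other):
        # Value equality: same state, compared recursively via the lists.
        return (isinstance(other, Person)
                and (self.name, self.age, self.children)
                == (other.name, other.age, other.children))


# Situation 1: one child, shared by both parents (identical).
john = Person("john", 15)
peter = Person("peter", 40, [john])
susan = Person("susan", 41, [john])
shared = peter.children[0] is susan.children[0]

# Object update: a change made through one parent is seen through the other.
peter.children[0].age = 16
age_seen_by_susan = susan.children[0].age

# Situation 2: two different children who happen to have equal values.
paul = Person("paul", 40, [Person("john", 15)])
sara = Person("sara", 41, [Person("john", 15)])
equal_children = paul.children[0] == sara.children[0]
same_child = paul.children[0] is sara.children[0]
```

With identity the model can represent both situations; with values alone, situation 2 is the only one that can be written down.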
OODBS Golden Rules:
Encapsulation (3)
The idea of encapsulation comes from:
– the need to cleanly distinguish between the specification and the
implementation of an operation and
– the need for modularity
There are two views of encapsulation
– the programming language view - abstract data types
– the database adaptation of that view - encapsulate both program and
data
OODBS Golden Rules: Support for
Types or Classes (4)
There are two main categories of object-oriented systems, those supporting the
notion of class and those supporting the notion of type
A type, in an object-oriented system, summarizes the common features of a set of
objects with the same characteristics. It has two parts: the interface and the
implementation (or implementations).
– The interface consists of a list of operations together with their signatures (i.e., the type of
the input parameters and the type of the result)
– The type implementation consists of a data part and an operation part
The notion of class is different from that of type. Its specification is the same as that
of a type, but it is more of a run-time notion. It contains two aspects: an object
factory and an object warehouse.
– The object factory can be used to create new objects, by performing the operation new on
the class, or by cloning some prototype object representative of the class
– The object warehouse means that attached to the class is its extension, i.e., the set of objects
that are instances of the class
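A small Python sketch of the two run-time aspects of a class, assuming the extension is kept as a plain list on the class. The class `Part` and the methods `new`, `clone` and `extent` are hypothetical names chosen to mirror the wording above.

```python
class Part:
    _extent = []                      # object warehouse: all instances so far

    def __init__(self, name):
        self.name = name
        Part._extent.append(self)     # every new object joins the extension

    @classmethod
    def new(cls, name):
        """Object factory: create an instance via the operation new."""
        return cls(name)

    def clone(self):
        """Object factory: create an instance from a prototype object."""
        return type(self)(self.name)

    @classmethod
    def extent(cls):
        """Return the current extension of the class."""
        return list(cls._extent)


prototype = Part.new("bolt")
copy_of_prototype = prototype.clone()
names = [p.name for p in Part.extent()]
```

A real system would also remove deleted objects from the extension, which this sketch omits.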
OODBS Golden Rules: Class or Type
Hierarchies (5)
Inheritance has two advantages:
– it is a powerful modeling tool, because it gives a concise and precise description of the
world and
– it helps in factoring out shared specifications and implementations in applications
Example
– Employee(name, age, salary) - can die, get married, be paid
– Student(name, age, set-of-grades) - can die, get married, have GPA computed
– Person(name, age) - can die, get married
Employee(+salary) - pay
Student(+set-of-grades) - GPA computation
There are at least four types of inheritance:
– substitution inheritance
– inclusion inheritance
– constraint inheritance
– specialization inheritance
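The Person, Employee and Student example above can be written as a Python class hierarchy. This is a sketch only: the method names `marry`, `pay` and `gpa` are assumptions, and the operation "can die" is left out for brevity.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
        self.married = False

    def marry(self):                  # shared behaviour, written once
        self.married = True


class Employee(Person):
    def __init__(self, name, age, salary):
        super().__init__(name, age)
        self.salary = salary          # Employee(+salary)

    def pay(self):
        return self.salary


class Student(Person):
    def __init__(self, name, age, grades):
        super().__init__(name, age)
        self.grades = set(grades)     # Student(+set-of-grades)

    def gpa(self):
        return sum(self.grades) / len(self.grades)


def wedding(person):
    """Written against Person; works for any subclass instance."""
    person.marry()
    return person.married


emma = Employee("emma", 30, 4200)
sam = Student("sam", 20, {3.0, 4.0})
both_married = wedding(emma) and wedding(sam)
```

`wedding` shows the substitution idea: an Employee or Student can be used wherever a Person is expected.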
OODBS Golden Rules:
Overriding, Overloading and
Late Binding (6)
Example: the display operation
In a conventional system:
for x in X do
begin
  case of type(x)
    person: display(x);
    bitmap: display-bitmap(x);
    graph: display-graph(x);
  end
end
In an object-oriented system:
for x in X do display(x)
In an object-oriented system, we define the display operation at the object type level (the most general
type in the system). Thus, display has a single name and can be used indifferently on graphs, persons
and pictures. However, we redefine the implementation of the operation for each of the types according
to the type (this redefinition is called overriding). This results in a single name (display) denoting three
different programs (this is called overloading). To display the set of elements, we simply apply the display
operations to each one of them, and let the system pick the appropriate implementation at run-time.
In order to provide this new functionality, the system cannot bind operation names to programs at
compile time. Therefore, operation names must be resolved (translated into program addresses) at run-
time. This delayed translation is called late binding.
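The display example can be sketched in Python, where method lookup always happens at run time. The class names and the returned strings are illustrative assumptions; a real system would draw on a screen instead of returning text.

```python
class DatabaseObject:
    """Most general type: declares the single operation name display."""

    def display(self):
        raise NotImplementedError


class Person(DatabaseObject):
    def __init__(self, name):
        self.name = name

    def display(self):                # overriding for persons
        return f"person {self.name}"


class Bitmap(DatabaseObject):
    def __init__(self, width, height):
        self.width, self.height = width, height

    def display(self):                # overriding for bitmaps
        return f"bitmap {self.width}x{self.height}"


class Graph(DatabaseObject):
    def __init__(self, nodes):
        self.nodes = nodes

    def display(self):                # overriding for graphs
        return f"graph with {self.nodes} nodes"


X = [Person("ann"), Bitmap(2, 2), Graph(3)]

# No case analysis on the type: the implementation is picked per object
# when the call executes (late binding).
shown = [x.display() for x in X]
```

Adding a new type only requires a new class with its own `display`; the loop over `X` stays unchanged.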
OODBS Golden Rules:
Computational
Completeness (7)
From a programming language point of view, computational completeness means
that one can express any computable function using the DML of the database system
From a database point of view this is a novelty, since SQL, for instance, is not complete.
Computational completeness can be introduced through a reasonable connection to
existing programming languages
Note that this is different from being "resource complete", i.e., being able to access all
resources of the system (e.g. screen and remote communication) from within the
language
Database query languages usually impose severe restrictions on the kind of
computations that can be performed. As a result application programs must be
implemented in general-purpose languages while access to data is realized via
declarative data sublanguages, like SQL. As a consequence data has to be passed
between these two languages. Since both languages are usually semantically as well as
structurally different such transformations may lead to a loss of information. This
problem is known as impedance mismatch.
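The impedance mismatch can be illustrated with Python's standard `sqlite3` module. The `Student` class, the table layout and the helpers `store` and `load` are assumptions made for this sketch; the point is the translation code that sits between the two languages.

```python
import sqlite3


class Student:
    def __init__(self, name, grades):
        self.name = name
        self.grades = set(grades)     # set-valued attribute in the host language


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE grade (student_id INTEGER, value REAL)")


def store(conn, student):
    """Flatten one object into rows of two tables."""
    cur = conn.execute("INSERT INTO student (name) VALUES (?)",
                       (student.name,))
    sid = cur.lastrowid
    conn.executemany("INSERT INTO grade (student_id, value) VALUES (?, ?)",
                     [(sid, g) for g in student.grades])
    return sid


def load(conn, sid):
    """Reassemble the object from rows; the set structure is rebuilt by hand."""
    (name,) = conn.execute("SELECT name FROM student WHERE id = ?",
                           (sid,)).fetchone()
    rows = conn.execute("SELECT value FROM grade WHERE student_id = ?",
                        (sid,)).fetchall()
    return Student(name, [value for (value,) in rows])


sid = store(conn, Student("ann", {3.0, 4.0}))
ann = load(conn, sid)
```

Every class with structured attributes needs its own `store` and `load` pair in this style, which is the overhead the slide refers to.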
OODBS Golden Rules: Extensibility
(8)
The database system comes with a set of predefined types. These types can be
used at will by programmers to write their applications. This set of types must be
extensible in the following sense:
– there is a means to define new types and
– there is no distinction in usage between system defined and user defined types
Relational database systems rely on a simple data model. It offers a fixed number of
predefined atomic data types and very few type constructors.
In order to permit the integration of arbitrary data types into a data model, it must be
possible to define new basic data types, to use previously defined data types as
building blocks for other data types, or to nest data structures to arbitrary levels
Since object-oriented database systems are meant to offer an interface which
integrates a programming language and data management facilities they must allow
the programmer to define and execute any kind of operations
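A short Python sketch of the "no distinction in usage" requirement, assuming a user-defined type `Money` (a hypothetical name). Once defined, it is sorted, collected into sets and summed with the same built-in operations that work on integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Money:
    """User-defined type built from a predefined one (int)."""
    cents: int

    def __add__(self, other):
        return Money(self.cents + other.cents)


prices = [Money(250), Money(100), Money(250)]

cheapest_first = sorted(prices)       # ordering, as for built-in types
distinct = set(prices)                # set membership, as for built-in types
total = sum(prices, Money(0))         # addition, as for built-in types

# A user-defined type used as a building block for another type.
@dataclass(frozen=True)
class LineItem:
    label: str
    price: Money


item = LineItem("bolt", Money(100))
```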
OODBS Golden Rules:
Persistence (9)
Persistence is the ability of the programmer to have her/his data survive the
execution of a process, in order to eventually reuse it in another process
Persistence should be orthogonal, i.e., each object, independent of its type,
is allowed to become persistent as such
Persistence should also be implicit: the user should not have to explicitly
move or copy data to make it persistent
Persistence can be achieved in several ways:
– A first solution is to define the persistence property in the methods by using
operations which make an object persistent
– Another straightforward solution is to make everything persistent
– A different approach is to let the system designer specify which classes
are to be persistent or to store only data which is explicitly identified by
a unique name
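The last approach, storing only data identified by a unique name, can be sketched with Python's standard `pickle` module. `NamedRootStore` is a hypothetical class, not the interface of any product. Everything reachable from a named root is saved regardless of its type, and opening the file a second time stands in for a later process.

```python
import os
import pickle
import tempfile


class NamedRootStore:
    """Hypothetical store: objects reachable from a named root persist."""

    def __init__(self, path):
        self.path = path
        self.roots = {}
        if os.path.exists(path):
            with open(path, "rb") as f:
                self.roots = pickle.load(f)

    def commit(self):
        # Simplification: one explicit commit writes all roots at once.
        with open(self.path, "wb") as f:
            pickle.dump(self.roots, f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "objects.db")

    first = NamedRootStore(path)
    first.roots["gearbox"] = {"parts": [{"name": "shaft"},
                                        {"name": "housing"}]}
    first.commit()

    # Simulates another process opening the same database later.
    second = NamedRootStore(path)
    restored = second.roots["gearbox"]
```

The programmer names the root once and never copies the nested parts individually, which is the implicit behaviour described above.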
OODBS Golden Rules: Secondary
Storage Management (10)
Secondary storage management is a classical feature of database
management systems. It is usually supported through a set of mechanisms.
These include index management, data clustering, data buffering, access
path selection and query optimization.
None of these is visible to the user: they are simply performance features
The application programmer should not have to write code to maintain
indices, to allocate disk storage, or to move data between disk and main
memory
Since objects in object-oriented database systems may be rather complex
as well as extremely large it seems to be necessary to support access
strategies within objects (e.g. indexes on attribute types which consist of
collections of values or instances of other types)
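An index on a collection-valued attribute can be sketched as an inverted index in Python. `Document` and `KeywordIndex` are hypothetical names. In a real OODBMS this structure would be maintained by the system, not written by the application programmer.

```python
from collections import defaultdict


class Document:
    def __init__(self, title, keywords):
        self.title = title
        self.keywords = set(keywords)     # attribute holding a collection


class KeywordIndex:
    """Maps each element of the collection to the objects containing it."""

    def __init__(self):
        self._entries = defaultdict(list)

    def add(self, doc):
        for keyword in doc.keywords:      # one index entry per element
            self._entries[keyword].append(doc)

    def lookup(self, keyword):
        return list(self._entries.get(keyword, []))


index = KeywordIndex()
for doc in (Document("Manifesto", {"oodb", "rules"}),
            Document("ORION", {"oodb", "lisp"}),
            Document("SQL3", {"relational"})):
    index.add(doc)

oodb_titles = sorted(d.title for d in index.lookup("oodb"))
```

A lookup touches only the matching objects instead of scanning the keyword set of every document.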
OODBS Golden Rules: Support for
Concurrent Users (11)
With respect to the management of multiple users concurrently interacting with the system, the
system should offer the same level of service as current database systems provide
It should therefore ensure harmonious coexistence among users working simultaneously on the
database
The system should therefore support the standard notion of atomicity of a sequence of
operations and of controlled sharing
Serializability of operations should at least be offered, although less strict alternatives may be
offered
The existence of a class hierarchy and the fact that complex objects may contain other objects
need to be considered especially by the concurrency-control mechanism for the following
reasons:
– If the scope of a query is not a single class but the complete subclass hierarchy the set of objects of the subclass
hierarchy must not be modified by a conflicting transaction
– If a complete composite object O is accessed by a transaction it has to be ensured that shared subobjects of O
cannot be accessed in an incompatible way by concurrent transactions
– Since a subclass S inherits structure and behavior from its superclasses it must be forbidden that a concurrent
transaction modifies the definition of any of the superclasses of S while a transaction T accesses instances of S
(otherwise inherited attributes or methods may be deleted from S or new attributes may be added to S while
instances of S are accessed by T)
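The first point above, protecting a whole subclass hierarchy during a query, can be sketched as a lock table in Python. `ClassNode` and `LockTable` are hypothetical names, the lock modes are limited to shared ("S") and exclusive ("X"), and a refused request simply returns False where a real system would block or abort.

```python
class ClassNode:
    """A node in a hypothetical class hierarchy."""

    def __init__(self, name, parent=None):
        self.name = name
        self.subclasses = []
        if parent is not None:
            parent.subclasses.append(self)

    def hierarchy(self):
        """Yield this class and, transitively, all of its subclasses."""
        yield self
        for sub in self.subclasses:
            yield from sub.hierarchy()


class LockTable:
    def __init__(self):
        self._locks = {}      # class name -> (mode, set of transactions)

    def _compatible(self, name, mode, tx):
        if name not in self._locks:
            return True
        held_mode, holders = self._locks[name]
        if holders == {tx}:
            return True       # the sole holder may re-lock or upgrade
        return mode == "S" and held_mode == "S"

    def lock_hierarchy(self, root, mode, tx):
        """Lock a class and all its subclasses, or nothing at all."""
        nodes = list(root.hierarchy())
        if not all(self._compatible(n.name, mode, tx) for n in nodes):
            return False
        for n in nodes:
            held_mode, holders = self._locks.get(n.name, (mode, set()))
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self._locks[n.name] = (new_mode, holders | {tx})
        return True


person = ClassNode("Person")
employee = ClassNode("Employee", parent=person)
student = ClassNode("Student", parent=person)

locks = LockTable()
query_ok = locks.lock_hierarchy(person, "S", tx="T1")     # query over hierarchy
update_ok = locks.lock_hierarchy(employee, "X", tx="T2")  # conflicting writer
read_ok = locks.lock_hierarchy(student, "S", tx="T2")     # compatible reader
```

While T1 queries Person and its subclasses, T2 cannot take an exclusive lock on Employee but may still read Student.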
OODBS Golden Rules: Recovery (12)
OODBS Golden Rules: Ad Hoc
Query Facility (13)
A query language should be:
– application-independent
– high-level (descriptive)
– optimizable
– complete
– generic
– powerful
– adequate
– extensible
– efficient
OODBS Manifesto: Optional
Features
Multiple inheritance
Type checking and type inferencing
Distribution
Design transactions
– inheritance - lock sub- and superclasses
– support for long transactions
– soft locks and notification
– choose your own concurrency-control mechanism
Versions
– when to create a new version?
– how to represent versions
Change Management
– How do we handle schema changes?
– Write once classes, immediate update, lazy update, schema mapping
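Of the schema-change strategies listed above, lazy update can be sketched as follows, assuming each stored object carries the schema version it was written under. The names `StoredObject`, `UPGRADES` and `fetch`, and the example change from one `name` field to two, are invented for illustration.

```python
SCHEMA_VERSION = 2


def _v1_to_v2(state):
    """Schema change: split the single name field into two fields."""
    first, _, last = state.pop("name").partition(" ")
    state["first_name"] = first
    state["last_name"] = last
    return state


UPGRADES = {1: _v1_to_v2}         # version n -> function producing n + 1


class StoredObject:
    def __init__(self, version, state):
        self.version = version    # schema version the object was written with
        self.state = state


def fetch(obj):
    """Lazy update: convert an object only when it is next accessed."""
    while obj.version < SCHEMA_VERSION:
        obj.state = UPGRADES[obj.version](dict(obj.state))
        obj.version += 1
    return obj.state


old = StoredObject(1, {"name": "Ada Lovelace"})
current = fetch(old)
```

Immediate update would instead run the same conversion over every stored object at the moment the schema changes.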
OODBS: Products & Prototypes
All the systems we categorize as object-oriented databases have the common thread
of objects as the basic data structure, public methods (and, in some cases, attributes)
associated with objects as the mechanism to operate on objects, private attributes (and, in
some cases, procedures) associated with objects as the underlying representation, inheritance
of attributes and procedures from supertypes, the ability to define dynamically new simple
attribute types as well as new object types, and the representation of relationships by
attributes.
ONTOS is based on C++, and stores the binary code for methods associated with objects in
C++ binaries rather than the database, linking the methods with object data when objects are
accessed. ONTOS operations are invoked by calls on a run-time library, and persistent
application classes inherit methods from an ONTOS-supplied persistent object class.
GemStone was designed based on a derivative of Smalltalk called OPAL. However, it has
subsequently been integrated with C++ as well.
ORION was designed around extensions of Common LISP, as is the product follow-on,
ITASCA. LISP methods are stored in the database, and are invoked by the system.
Typical Architecture of an
Object-Oriented DBMS
OODBS: ONTOS
ONTOS [Ontologic 1990], a product of Ontologic of Billerica, Massachusetts, is
one of several new commercial products based on making C++ into a database
programming language
ONTOS provides a persistent object store for C++ programs
Objects exist in two states: deactivated, stored on the disk, and activated, as
ordinary C++ objects in an application program
Concurrency in ONTOS is provided by conventional transactions
ONTOS incorporates a variant of SQL as a query language, making attributes of
objects visible to the query-language user
Other ONTOS features include the following:
– Inverse-attribute pairs
– Binary Large Objects (BLOBs)
– Versions
– Administration tools
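The general idea behind inverse-attribute pairs can be sketched in Python: setting one side of a relationship keeps the other side consistent. This is an illustration of the concept only, not the ONTOS interface; `Department`, `Employee` and the attribute names are assumptions.

```python
class Department:
    def __init__(self, name):
        self.name = name
        self.members = set()          # inverse of Employee.department


class Employee:
    def __init__(self, name):
        self.name = name
        self._department = None

    @property
    def department(self):
        return self._department

    @department.setter
    def department(self, dept):
        # Keep both directions of the relationship in step.
        if self._department is not None:
            self._department.members.discard(self)
        self._department = dept
        if dept is not None:
            dept.members.add(self)


design = Department("design")
testing = Department("testing")
eve = Employee("eve")

eve.department = design
in_design_first = eve in design.members

eve.department = testing              # moving updates both departments
in_design_after = eve in design.members
in_testing_after = eve in testing.members
```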
OODBS: GemStone
GemStone is a product of Servio Corporation of Alameda, California and Beaverton,
Oregon. GemStone was one of the first object-oriented database system products
GemStone originally evolved in an effort to make the Smalltalk programming language
and system into a DBMS
GemStone has subsequently been integrated with C++; it is the first database
programming language product with close integration with two languages
Data may also be fetched and stored from GemStone using procedure calls from C or
Pascal application programs
Other GemStone features include:
– Relational gateway
– BLOBs
– Multiple name spaces
– Garbage collection
– Schema evolution
– User interfaces
OODBS: ORION
ORION is an object-oriented ODMS built at MCC in Austin, Texas. A commercial product
version of ORION is being marketed as ITASCA, from Itasca Systems of Minneapolis,
Minnesota.
ORION incorporates object identifiers, multiple inheritance, composite objects, versions,
indexing, queries, transactions, distributed databases, dynamic schema evolution, and access
authorization. It is implemented in LISP.
The version of ORION extends LISP with object-oriented capabilities, with database calls for
navigational or query access, and for definition of data types
ORION allows BLOBs, like most other object-oriented database systems, but provides an
additional capability for "multimedia" values such as images, audio, or text: the values may
be accessed and manipulated in either one or two dimensions.
An area where ORION has put more emphasis than other systems is distributed databases.
ORION implements distributed two-phase commit, and can process queries that span
databases.
Another area that ORION has emphasized more than have other systems is schema evolution
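The distributed two-phase commit mentioned above follows a general protocol that can be sketched in a few lines of Python. This shows the textbook voting and decision phases only, not ORION's implementation; `Participant` and `two_phase_commit` are hypothetical names, and logging, timeouts and failure recovery are omitted.

```python
class Participant:
    """One database taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (voting): ask every participant to prepare.
    votes = [p.prepare() for p in participants]

    # Phase 2 (decision): commit only if every vote was yes.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


healthy = [Participant("austin"), Participant("minneapolis")]
outcome_ok = two_phase_commit(healthy)

mixed = [Participant("austin"), Participant("minneapolis", can_commit=False)]
outcome_failed = two_phase_commit(mixed)
```

A single no vote aborts the transaction at every site, so the databases never disagree about the outcome.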
OODBS: Future Directions &
Research
Conceptual
– Query languages for OODBs are still rough
– An active area in database-systems research is deductive or logic databases
System Engineering
– Optimization technology for object-oriented systems
– Languages in OODBs are currently more procedural than declarative
– Encapsulation hides the big picture for what is going on in a query
– Storage management for OODBs is in its infancy
– Parallelism has been a major topic in database research throughout the 1980s
– Emergence of nonstandard architectures for database systems
Applications
– Object-oriented schema design is more complex than is design of record-based databases, because
behaviour must be modelled, and its partitioning between classes must be decided
– Hypertext systems share features with OODBs
– OODBs are good candidates for supporting environments for cooperative work
OODBS: What have we
covered ?
The Object-Oriented Paradigm
SQL3
Most relevant to extended relational approach
ANSI and ISO draft standard
“Work on SQL3 is well underway, but the final
standard is several years away”
Most important new features:
– Abstract data types w. attributes and routines
– Inheritance of ADTs and tables
– Collections, parameterized types
– Procedures, functions: computational completeness
STRATEGIES FOR DEVELOPING
AN OODBMS
Extend an Existing OOP Language with Database Capabilities:
Add traditional database capabilities to an existing OOP language e.g.
Smalltalk, C++ or Java. Used in GemStone.
Provide Extensible OODBMS Libraries: Similar to the above.
However, rather than extending the language, class libraries are
provided that support persistence, aggregation, data types, transactions,
concurrency, etc. Used in Ontos, Versant, ObjectStore.
Embed OO Database Language Constructs in a Conventional Host
Language: used in O2, which supports embedded extensions for C.
Extend an Existing Database Language with OO Capabilities: e.g.
extending SQL to provide OO constructs (SQL 3).
Develop a New Database Data Model/Data Language: develop an
entirely new database language and DBMS with OO capabilities.
OBJECT DB STANDARDS
OBJECT MANAGEMENT GROUP