Advanced Database
(MSIT 610)
Chapter One
Content
Overview of O-O Concepts
Object-oriented data model, Complex data types
Object-oriented languages, Persistent programming languages
Object relational databases
Nested relations, Complex types, Inheritance, Reference types,
Querying with complex types, Functions and procedures,
Object-oriented versus object-relational.
Introduction
Traditional Data Models:
Hierarchical
Network (since mid-60’s)
Relational (since 1970 and commercially since
1982)
Object Oriented (OO) Data Models since mid-90’s
Reasons for creation of Object Oriented Databases
Need for more complex applications
Need for additional data modeling features
Increased use of object-oriented programming
languages
History of OO Models and Systems
Languages:
Simula (1960’s)
Smalltalk (1970’s)
C++ (late 1980’s)
Java (1990’s and 2000’s)
Overview of Object-Oriented Concepts
Main Claim:
OO databases try to maintain a direct correspondence
between real-world and database objects so that objects
do not lose their integrity and identity and can easily be
identified and operated upon
Object:
Two components:
state (value) and behavior (operations)
Similar to a program variable in a programming language,
except that it will typically have a complex data structure
as well as specific operations defined by the programmer
Cont.
The internal structure of an object in OOPLs includes
the specification of instance variables, which hold
the values that define the internal state of the object.
Object Identity, Object Structure,
and Type Constructors
Object Identity:
An OO database system provides a unique identity to
each independent object stored in the database.
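The difference between an object's identity and its state can be sketched in Python. This is only an illustration: the `Employee` class and the counter-based OID generator are assumptions made for the example, not part of any particular OODBMS.

```python
import itertools

# Hypothetical OID generator: a system-wide counter that never reuses a value.
_next_oid = itertools.count(1)


class Employee:
    """Illustrative object with a fixed identity and a changeable state."""

    def __init__(self, name, salary):
        self.oid = next(_next_oid)   # identity: assigned once at creation
        self.name = name             # state: may change over time
        self.salary = salary

    def same_state(self, other):
        """Compare two objects by value (state), ignoring identity."""
        return (self.name, self.salary) == (other.name, other.salary)


a = Employee("Smith", 30000)
b = Employee("Smith", 30000)

equal_state = a.same_state(b)      # the two objects hold the same values ...
same_identity = a.oid == b.oid     # ... but they are two distinct objects

oid_before_update = a.oid
a.salary = 35000                   # changing the state does not touch the OID
oid_after_update = a.oid
```

Two objects with identical values remain distinguishable through their OIDs, and an update to the state leaves the identity unchanged.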
Cont.
Type Constructors:
In OO databases, the state (current value) of a complex
object may be constructed from other objects (or other
values) by using certain type constructors.
The three most basic constructors are atom, tuple, and
set.
Other commonly used constructors include list, bag, and
array.
The atom constructor is used to represent all basic atomic
values, such as integers, real numbers, character strings,
Booleans, and any other basic data types that the system
supports directly.
Cont.
Example 1
One possible relational database state
corresponding to COMPANY schema
Cont.
Example 1 (contd.)
The first six objects listed in this example
represent atomic values.
Object 7 is a set-valued object that represents
the set of locations for department 5; the set refers
to the atomic objects with values {‘Houston’,
‘Bellaire’, ‘Sugarland’}.
Object 8 is a tuple-valued object that represents
department 5 itself, and has the attributes DNAME,
DNUMBER, MGR, LOCATIONS, and so on.
Encapsulation of Operations,
Methods, and Persistence
Encapsulation
One of the main characteristics of OO languages and systems
Related to the concepts of abstract data types and information
hiding in programming languages
Keeps the data of an object and the operations on it together.
Object-Oriented Languages
Object-oriented concepts can be used in different ways
Object-orientation can be used as a design tool, and be
encoded into, for example, a relational database
(analogous to modeling data with an E-R diagram and then
converting to a set of relations)
The concepts of object orientation can be incorporated into
a programming language that is used to manipulate the
database.
Object-relational systems – add complex types and
object-orientation to a relational language.
Persistent programming languages – extend object-oriented
programming language to deal with databases by adding concepts
such as persistence and collections.
Persistent Programming
Languages
Persistent programming languages allow objects to be created and
stored in a database, and used directly from a programming language
Allow data to be manipulated directly from the programming language
No need to go through SQL.
No need for explicit format (type) changes
Format changes are carried out transparently by the system
Without a persistent programming language, format changes
become a burden on the programmer
More code to be written
More chance of bugs
Allow objects to be manipulated in-memory
No need to explicitly load from or store to the database
Saves code, and saves the overhead of loading/storing
large amounts of data
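As a rough illustration of storing program objects without going through SQL, the sketch below uses Python's standard `shelve` module. It is only an approximation of a persistent programming language: `shelve` still needs an explicit write-back of a changed object, and the store path, the key `dept5`, and the sample values are assumptions made for this example.

```python
import os
import shelve
import tempfile

# Hypothetical store location; a temporary directory keeps the sketch self-contained.
store_path = os.path.join(tempfile.mkdtemp(), "company_store")

# First "program run": create an object and make it persistent.
with shelve.open(store_path) as db:
    db["dept5"] = {"dname": "Research", "locations": ["Houston", "Bellaire"]}

# Second "program run": the object comes back with its structure intact,
# with no SQL statement and no manual conversion between formats.
with shelve.open(store_path) as db:
    dept = db["dept5"]
    dept["locations"].append("Sugarland")   # manipulate it in memory
    db["dept5"] = dept                      # shelve needs this explicit write-back

# Third "program run": read the updated object.
with shelve.open(store_path) as db:
    restored = db["dept5"]
```

The nested list inside the dictionary is saved and restored as it is, which is the kind of transparent format handling described above.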
Object Database
Standards, Languages,
and Design
Outline
Overview of the ODMG Object Model
The Object Definition Language ODL
The Object Query Language OQL
Object Database Conceptual Model
The Object Model of ODMG
Provides a standard model for object databases
Supports object definition via ODL
Supports object querying via OQL
Supports a variety of data types and type constructors
ODMG Objects and Literals
The basic building blocks of the object model are
Objects
Literals
An object has four characteristics
1. Identifier: unique system-wide identifier
2. Name: unique within a particular database and/or program;
it is optional
3. Lifetime: persistent vs. transient
4. Structure: specifies how object is constructed by the type
constructor and whether it is an atomic object
ODMG Literals
A literal has a current value but not an identifier
Three types of literals
1. atomic: predefined; basic data type values (e.g.,
short, float, boolean, char)
2. structured: values that are constructed by type
constructors (e.g., date, struct variables)
3. collection: a collection (e.g., array) of values or
objects
ODMG Interface Definition:
An Example
Note: interface is ODMG’s keyword for class/type
interface Date:Object {
enum weekday{sun,mon,tue,wed,thu,fri,sat};
enum Month{jan,feb,mar,…,dec};
unsigned short year();
unsigned short month();
unsigned short day();
…
boolean is_equal(in Date other_date);
};
Built-in Interfaces for
Collection Objects
A collection object inherits the basic collection
interface, for example:
cardinality()
is_empty()
insert_element()
remove_element()
contains_element()
create_iterator()
Collection Types
Collection objects are further specialized into types
like a set, list, bag, array, and dictionary
Each collection type may provide additional interfaces,
for example, a set provides:
create_union()
create_difference()
is_subset_of()
is_superset_of()
is_proper_subset_of()
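The collection and set operations listed above have close counterparts in Python's built-in `set` type. The sketch below uses a Python set as a stand-in for an ODMG Set object; the mapping in the comments is an informal analogy, and the sample names are made up for the example.

```python
# Python's built-in set, used here as a stand-in for an ODMG Set object.
houston_staff = {"Smith", "Wong", "Zelaya"}      # hypothetical sample data
research_staff = {"Smith", "Wong"}

# Operations every collection offers
cardinality = len(houston_staff)                  # cardinality()
is_empty = len(research_staff) == 0               # is_empty()
contains = "Wong" in houston_staff                # contains_element()

# Operations specific to sets
union = houston_staff | research_staff            # create_union()
difference = houston_staff - research_staff       # create_difference()
is_subset = research_staff <= houston_staff       # is_subset_of()
is_superset = houston_staff >= research_staff     # is_superset_of()
is_proper_subset = research_staff < houston_staff # is_proper_subset_of()
```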
Object Inheritance Hierarchy
Atomic Objects
Atomic objects are user-defined objects and are defined via
keyword class
An example:
class Employee (extent all_employees key ssn) {
attribute string name;
attribute string ssn;
attribute short age;
relationship Dept works_for;
void reassign(in string new_name);
}
Class Extents
An ODMG object can have an extent defined via a
class declaration
Each extent is given a name and will contain all
persistent objects of that class
For Employee class, for example, the extent is called
all_employees
This is similar to creating an object of type
Set<Employee> and making it persistent
Object Factory
An object factory is used to generate individual
objects via its operations
An example:
interface ObjectFactory {
Object new ();
};
new() returns new objects with an object_id
Interface and Class Definition
ODMG supports two concepts for specifying object
types:
Interface
Class
There are similarities and differences between
interfaces and classes
Both have behaviors (operations) and state (attributes
and relationships)
ODMG Interface
An interface is a specification of the abstract behavior
of an object type
State properties of an interface (i.e., its attributes and
relationships) cannot be inherited
Objects cannot be instantiated from an interface
ODMG Class
A class is a specification of abstract behavior and
state of an object type
A class is instantiable
Supports “extends” inheritance to allow both state and
behavior inheritance among classes
Multiple inheritance via “extends” is not allowed
Object Definition Language
ODL supports the semantic constructs of ODMG
ODL is independent of any programming language
ODL is used to create object specification (classes
and interfaces)
ODL is not used for database manipulation
ODL Examples (1)
A Very Simple Class
A very simple, straightforward class definition
(all examples are based on the university schema
presented in Chapter 4):
class Degree {
attribute string college;
attribute string degree;
attribute string year;
};
ODL Examples (2)
A Class With Key and Extent
A class definition with “extent”, “key”, and more
elaborate attributes; still relatively straightforward
ODL Examples (3)
A Class With Relationships
Note extends (inheritance) relationship
Also note “inverse” relationship
Inheritance via “:” – An Example
interface Shape {
attribute struct point {…} reference_point;
float perimeter ();
…
};
Object Query Language
OQL is ODMG’s query language
OQL works closely with programming languages such
as C++
Embedded OQL statements return objects that are
compatible with the type system of the host language
OQL’s syntax is similar to SQL with additional features
for objects
Simple OQL Queries
Basic syntax: select…from…where…
SELECT d.name
FROM d in departments
WHERE d.college = 'Engineering';
An entry point to the database is needed for each
query
An extent name (e.g., departments in the above
example) may serve as an entry point
Iterator Variables
Iterator variables are defined whenever a collection is
referenced in an OQL query
Variable d in the previous example serves as an
iterator and ranges over each object in the collection
Syntactical options for specifying an iterator:
d in departments
departments d
departments as d
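The role of an iterator variable can be pictured with a Python list comprehension, in which `d` ranges over each object in a collection just as it does in the OQL query. This is an analogy rather than OQL itself; the `Department` class and the sample extent are assumptions made for the example.

```python
class Department:
    """Minimal stand-in for a persistent Department object."""

    def __init__(self, name, college):
        self.name = name
        self.college = college


# Hypothetical extent: the entry point that the query ranges over.
departments = [
    Department("Computer Science", "Engineering"),
    Department("Mathematics", "Science"),
    Department("Electrical Engineering", "Engineering"),
]

# d is the iterator variable: it is bound to each object of the collection in turn.
engineering_names = [d.name for d in departments if d.college == "Engineering"]
```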
Data Type of Query Results
The data type of a query result can be any type
defined in the ODMG model
A query does not have to follow the select…from…
where… format
A persistent name on its own can serve as a query
whose result is a reference to the persistent object.
For example,
departments; whose type is set<Departments>
Path Expressions
A path expression is used to specify a path to
attributes and objects in an entry point
A path expression starts at a persistent object name
(or its iterator variable)
The name will be followed by zero or more dot
connected relationship or attribute names
E.g., departments.chair;
An Example of OQL View
A view to include students in a department who have
a minor:
define has_minor(dept_name) as
select s
from s in students
where s.minor_in.dname=dept_name
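A named view with a parameter behaves much like a function over an extent, and the path expression `s.minor_in.dname` corresponds to following an object reference and then reading an attribute. The Python sketch below mirrors the view under those assumptions; the classes and sample students are hypothetical.

```python
class Department:
    def __init__(self, dname):
        self.dname = dname


class Student:
    def __init__(self, name, minor_in=None):
        self.name = name
        self.minor_in = minor_in    # reference to a Department, or None


math_dept = Department("Mathematics")
physics_dept = Department("Physics")

# Hypothetical extent of all students.
students = [
    Student("Ana", math_dept),
    Student("Ben", physics_dept),
    Student("Cho"),                 # no minor
]


def has_minor(dept_name):
    """Python analog of the view: students whose minor is in dept_name."""
    # The None check is a Python necessity for students without a minor.
    return [s for s in students
            if s.minor_in is not None and s.minor_in.dname == dept_name]


math_minors = [s.name for s in has_minor("Mathematics")]
```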
Single Elements from
Collections
An OQL query returns a collection
OQL’s element operator can be used to return the
single element of a singleton collection (a collection that
contains exactly one element):
element (select d from d in departments
where d.dname = 'Software Engineering');
If the collection is empty or has more than one element, an
exception is raised
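The behavior of the element operator can be sketched as a small Python function. The function name matches the OQL operator, but the choice of `ValueError` and the sample department names are assumptions made for this illustration.

```python
def element(collection):
    """Return the only member of a singleton collection, or raise an error."""
    items = list(collection)
    if len(items) != 1:
        raise ValueError(f"expected exactly one element, got {len(items)}")
    return items[0]


# Hypothetical department names standing in for the departments extent.
dnames = ["Software Engineering", "Computer Science"]

# The inner generator plays the role of the select ... where ... query.
only = element(d for d in dnames if d == "Software Engineering")
```

Calling `element` on an empty result, or on a result with two or more members, raises the exception instead of returning a value.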
Object Database
Conceptual Design
Object Database (ODB) vs. Relational Database
(RDB)
Relationships are handled differently
Inheritance is handled differently
Operations in ODB are expressed early on since they
are a part of the class specification
Relationships: ODB vs. RDB (1)
Relationships in ODB:
relationships are handled by reference attributes that
include OIDs of related objects
single and collection of references are allowed
references for binary relationships can be expressed in
single direction or both directions via inverse operator
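A relationship held as object references, with its inverse kept consistent, might look like the following Python sketch. The name `works_for` comes from the Employee class shown earlier; `has_emps`, the `assign_to` method, and the sample objects are hypothetical. In an ODB the system maintains the inverse, whereas here the method does it by hand.

```python
class Department:
    def __init__(self, dname):
        self.dname = dname
        self.has_emps = set()       # collection of references (1:N direction)


class Employee:
    def __init__(self, name):
        self.name = name
        self.works_for = None       # single reference (N:1 direction)

    def assign_to(self, dept):
        """Set works_for and keep the inverse reference has_emps consistent."""
        if self.works_for is not None:
            self.works_for.has_emps.discard(self)
        self.works_for = dept
        dept.has_emps.add(self)


research = Department("Research")
admin = Department("Administration")
smith = Employee("Smith")

smith.assign_to(research)
smith.assign_to(admin)              # moving Smith updates both directions
```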
Relationships: ODB vs. RDB (2)
Relationships in RDB:
Relationships among tuples are specified by attributes
with matching values (via foreign keys)
Foreign keys are single-valued
M:N relationships must be represented via a separate
relation (table)
Inheritance Relationship
in ODB vs. RDB
Inheritance structures are built in ODB (and achieved
via “:” and extends operators)
RDB has no built-in support for inheritance
relationships; there are several options for mapping
inheritance relationships in an RDB
Early Specification of
Operations
Another major difference between ODB and RDB is
the specification of operations
ODB:
Operations specified during design (as part of class
specification)
RDB:
Operations specification may be delayed until
implementation
Mapping EER Schemas
to ODB Schemas
Mapping EER schemas into ODB schemas is
relatively simple especially since ODB schemas
provide support for inheritance relationships
Once mapping has been completed, operations must
be added to ODB schemas since EER schemas do
not include a specification of operations
Mapping EER to ODB Schemas
Step 1
Create an ODL class for each EER entity type or
subclass
Multi-valued attributes are declared using set, bag, or list
constructors
Composite attributes are mapped into tuple constructors
Mapping EER to ODB Schemas
Step 2
Add relationship properties or reference attributes for
each binary relationship into the ODL classes
participating in the relationship
Relationship cardinality: single-valued for 1:1 and N:1
directions; set-valued for 1:N and M:N directions
Relationship attributes: create via tuple constructors
Mapping EER to ODB Schemas
Step 3
Add appropriate operations for each class
Operations are not available from the EER schemas;
original requirements must be reviewed
Corresponding constructor and destructor operations
must also be added
Mapping EER to ODB Schemas
Step 4
Specify inheritance relationships via extends clause
An ODL class that corresponds to a sub-class in the
EER schema inherits the types and methods of its super-
class in the ODL schemas
Other attributes of a sub-class are added by following
Steps 1-3
Mapping EER to ODB Schemas
Step 5
Map weak entity types in the same way as regular
entities
Weak entities that do not participate in any relationships
may alternatively be represented as a composite multi-valued
attribute of the owner entity type
Mapping EER to ODB Schemas
Step 6
Map categories (union types) to ODL
The process is not straightforward
May follow the same mapping used for EER-to-relational
mapping:
Declare a class to represent the category
Define 1:1 relationships between the category and
each of its super-classes
Mapping EER to ODB Schemas
Step 7
Map n-ary relationships whose degree is greater
than 2
Each relationship is mapped into a separate class with
appropriate reference to each participating class
OO database concept
Representing complex object
Encapsulation
Class
Inheritance
OO database concept
Association: the link between entities in an
application. It is represented by means of references
between objects. It can be binary, ternary, or reverse
Complex object model
Allows
Sets of atomic values
Tuple-valued attributes
Sets of tuples (nested relations)
General set and tuple constructors
Object identity
Thus, formally
Every atomic value in A is an object.
If a1, ..., an are attribute names in N, and O1, ..., On are
objects, then T = [a1:O1, ..., an:On] is also an object, and
T.ai retrieves the value Oi.
If O1, ..., On are objects, then S = {O1, ..., On} is an object.
Object Model
An object is defined by a triple (OID, type constructor,
state)
where OID is the unique object identifier,
type constructor is its type (such as atom, tuple, set, list, array,
bag, etc.) and state is its actual value.
Example:
(i1, atom, 'John')
(i2, atom, 30)
(i3, atom, 'Mary')
(i4, atom, 'Mark')
(i5, atom, 'Vicki')
(i6, tuple, [Name:i1, Age:i2])
(i7, set, {i4, i5})
(i8, tuple, [Name:i3, Friends:i7])
(i9, set, {i6, i8})
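The triples in this example can be held in a small Python dictionary, with a helper that follows OID references until it reaches atomic values. This is a sketch of the idea only: the `objects` dictionary layout and the `resolve` function are assumptions made for the illustration.

```python
# Each entry maps an OID to (type constructor, state), following the example above.
objects = {
    "i1": ("atom", "John"),
    "i2": ("atom", 30),
    "i3": ("atom", "Mary"),
    "i4": ("atom", "Mark"),
    "i5": ("atom", "Vicki"),
    "i6": ("tuple", {"Name": "i1", "Age": "i2"}),
    "i7": ("set", {"i4", "i5"}),
    "i8": ("tuple", {"Name": "i3", "Friends": "i7"}),
    "i9": ("set", {"i6", "i8"}),
}


def resolve(oid):
    """Recursively replace OID references with the values they denote."""
    constructor, state = objects[oid]
    if constructor == "atom":
        return state
    if constructor == "tuple":
        return {attr: resolve(ref) for attr, ref in state.items()}
    if constructor == "set":
        # Resolved members may be dictionaries, which Python cannot place in a
        # set, so the members are returned as a list ordered by OID.
        return [resolve(ref) for ref in sorted(state)]
    raise ValueError(f"unknown type constructor: {constructor}")


john = resolve("i6")
mary = resolve("i8")
```

Here `mary` expands to a tuple value whose Friends attribute is the resolved content of the set object i7.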
OBJECT-ORIENTED
DATABASES
OODB = Object Orientation + Database Capabilities
OODB
RESEARCH PROTOTYPES
ORION: Lisp-based system
IRIS: Functional data model, version control, object-SQL.
Galileo: Strongly typed language, complex objects.
PROBE.
POSTGRES: Extended relational database supporting objects.
COMMERCIAL OODB
O2: O2 Technology. Language O2C to define classes, methods and types. Supports
multiple inheritance. C++ compatible. Supports an extended SQL language O2SQL
which can refer to complex objects.
G-Base: Lisp-based system, supports ADT, multiple inheritance of classes.
CORBA: Standards for distributed objects.
GemStone: Earliest OODB supporting object identity, inheritance, encapsulation.
Language OPAL is based upon Smalltalk.
Ontos: C++ based system, supports encapsulation, inheritance, ability to construct
complex objects.
ObjectStore: C++ based system. A good feature is that it supports the creation of
indexes.
Statice: Supports entity types, set valued attributes, and inheritance of entity types
and methods.
OODB
COMMERCIAL OODB
Relational DB Extensions: Many relational systems
support OODB extensions.
User-defined functions (dBase).
User-defined ADTs (POSTGRES)
Very-long multimedia fields (BLOB or Binary Large
Object). (DB2 from IBM, SQL from SYBASE, Informix,
Interbase)
OODB Implementation
Strategies
Develop novel database data model or data language
(SIM)
Extend an existing database language with object-
oriented capabilities. (IRIS, O2 and VBASE/ONTOS
extended SQL)
Extend existing object-oriented programming
language with database capabilities (GemStone
OPAL extended Smalltalk)
Extendable object-oriented DBMS library (ONTOS)
ODL A Class With Key and Extent
A class definition with “extent”, “key”, and more
elaborate attributes; still relatively straightforward
class Person (extent persons key ssn) {
attribute struct Pname {string fname …} name;
attribute string ssn;
attribute date birthdate;
…
short age();
}
SELECT d.name
FROM departments d
WHERE d.college = 'Engineering';
Object-Relational Data Models
Extend the relational data model by including object
orientation and constructs to deal with added data types.
Allow attributes of tuples to have complex types, including
non-atomic values such as nested relations.
Preserve relational foundations, in particular the
declarative access to data, while extending modeling
power.
Upward compatibility with existing relational languages.
Nested Relations
Motivation:
Permit non-atomic domains (atomic means indivisible)
Example of non-atomic domain: set of integers, or set of
tuples
Allows more intuitive modeling for applications with complex
data
Intuitive definition:
allow relations wherever we allow atomic (scalar) values -
relations within relations
Retains mathematical foundation of relational model
Violates first normal form.
Example of a Nested Relation
Example: library information system
Each book has
title,
a set of authors,
publisher, and
a set of keywords
Non-1NF relation books
1NF Version of Nested Relation
1NF version of books
flat-books
4NF Decomposition of Nested
Relation
Remove awkwardness of flat-books by assuming that the
following multi-valued dependencies hold:
title →→ author
title →→ keyword
title → pub-name, pub-branch
Decompose flat-books into 4NF using the schemas:
(title, author)
(title, keyword)
(title, pub-name, pub-branch)
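The decomposition can be tried out on a few rows in Python. The sample rows of flat-books and the helper `natural_join` are hypothetical; the sketch only shows that projecting onto the three schemas and joining them again on title gives back the original rows.

```python
# Hypothetical flat-books rows: (title, author, pub_name, pub_branch, keyword)
flat_books = [
    ("Compilers", "Smith", "McGraw-Hill", "New York", "parsing"),
    ("Compilers", "Smith", "McGraw-Hill", "New York", "analysis"),
    ("Compilers", "Jones", "McGraw-Hill", "New York", "parsing"),
    ("Compilers", "Jones", "McGraw-Hill", "New York", "analysis"),
]

# Project onto the three 4NF schemas (sets remove duplicate tuples).
authors = {(t, a) for t, a, _, _, _ in flat_books}
keywords = {(t, k) for t, _, _, _, k in flat_books}
publishers = {(t, p, b) for t, _, p, b, _ in flat_books}


def natural_join(authors, keywords, publishers):
    """Join the three relations on title to rebuild flat-books rows."""
    return {
        (t, a, p, b, k)
        for t, a in authors
        for t2, k in keywords if t2 == t
        for t3, p, b in publishers if t3 == t
    }


rejoined = natural_join(authors, keywords, publishers)
lossless = rejoined == set(flat_books)
```

With these rows the publisher is stored once instead of four times, and the join that users must write to see the full picture is the one in `natural_join`.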
4NF Decomposition of flat-books
Problems with 4NF Schema
4NF design requires users to include joins in their
queries.
1NF relational view flat-books defined by join of 4NF
relations:
eliminates the need for users to perform joins,
but loses the one-to-one correspondence between tuples
and documents,
and has a large amount of redundancy
Nested relations representation is much more natural
here.
Complex Types and SQL
Extensions to SQL to support complex types include:
Collection and large object types
Nested relations are an example of collection types
Structured types
Nested record structures like composite attributes
Inheritance
Object orientation
Including object identifiers and references
Collection Types
Set type (not in SQL:1999)
create table books (
…..
keyword-set setof(varchar(20))
……
)
Sets are an instance of collection types. Other
instances include
Arrays (are supported in SQL:1999)
E.g. author-array varchar(20) array[10]
Can access elements of array in usual fashion:
E.g. author-array[1]
Multisets (not supported in SQL:1999)
I.e., unordered collections, where an element may occur
multiple times
Nested relations are sets of tuples
SQL:1999 supports arrays of tuples
Large Object Types
Large object types
clob: Character large objects
book-review clob(10KB)
blob: binary large objects
image blob(10MB)
movie blob (2GB)
Structured and Collection Types
(Oracle)
Structured types can be declared and used in SQL
CREATE OR REPLACE TYPE Publisher as Object (name varchar(20),
branch varchar(20));
/
CREATE OR REPLACE TYPE VA as VARRAY (5) of VARCHAR(30);
/
CREATE OR REPLACE TYPE Book AS OBJECT (title varchar(20), authors
VA, pub_date date, pub Publisher, keywords VA);
/
Structured types can be used to create tables
Structured Types (Cont.)
Creating tables without creating an intermediate type
For example, the table books could also be defined as follows:
CREATE TABLE books
(title varchar(20), authors VA,
pub_date date, pub Publisher, keywords VA);
Methods can be part of the type definition of a structured
type:
CREATE OR REPLACE TYPE Employee_Ty AS OBJECT
(name varchar(20), salary int,
MEMBER FUNCTION giveraise (percent IN int) RETURN NUMBER);
/
Method body is created separately
CREATE OR REPLACE TYPE BODY Employee_Ty AS
MEMBER Function giveraise(percent IN int ) return NUMBER IS
begin
RETURN (salary + ( salary * percent) / 100);
end giveraise;
END;
/
Creation of Values of Complex
Types
Values of structured types are created using
constructor functions
E.g. Publisher('McGraw-Hill', 'New York')
Note: a value is not an object
Creation of Values of Complex
Types
To insert the preceding tuple into the relation
books
Inheritance
Suppose that we have the following type definition for people:
create or replace type Person_typ as Object
(name varchar(20),
address varchar(20)) not final;
/
Using inheritance to define a student type (SQL:1999 style):
create type Student under Person
(degree varchar(20),
department varchar(20))
The corresponding Oracle definition:
create or replace type Student_typ UNDER Person_typ
(degree varchar(20),
department varchar(20)) not final;
/
Reference Types
Object-oriented languages provide the ability to create
and refer to objects.
In SQL:1999
References are to tuples, and
References must be scoped,
I.e., can only point to tuples in one specified table
Reference Declaration in
SQL:1999
E.g. define a type Department with a field name and a
field head that is a reference to the type Person, with
table people as scope
create type Department as Object
(name varchar(20), head ref Person_typ )
Initializing Reference Typed
Values
In Oracle, to create a tuple with a reference value, first
create the tuple with a null reference and then set the
reference separately using the function ref(p) applied to a
tuple variable
Querying with Structured Types
Find the title and the name of the publisher of each book.
select b.title, b.pub.name
from books b;
Nested Table
CREATE TYPE animal_ty AS OBJECT (breed
VARCHAR(25), name VARCHAR(25), birthdate DATE);
/
CREATE TYPE animals_nt AS TABLE OF animal_ty;
/
CREATE TABLE breeder (breederName VARCHAR(25),
animals animals_nt)
nested table animals store as animals_nt_tab;
Comparison of O-O and O-R
Databases
Relational systems
simple data types, powerful query languages, high
protection.
Persistent-programming-language-based OODBs
complex data types, integration with programming language,
high performance.
Object-relational systems
complex data types, powerful query languages, high
protection.
Note: Many real systems blur these boundaries
E.g. persistent programming language built as a wrapper on
a relational database offers first two benefits, but may have
poor performance.
Storage and Access of Persistent
Objects
Naming and Reachability:
Naming Mechanism:
Assign an object a unique persistent name through which
it can be retrieved by this and other programs.
Reachability Mechanism:
Make the object reachable from some persistent object.
An object B is said to be reachable from an object A if a
sequence of references in the object graph leads from
object A to object B.
Persistence of Objects
Approaches to make transient objects persistent include establishing
Persistence by Class – declare all objects of a class to be persistent;
simple but inflexible.
Persistence by Creation – extend the syntax for creating objects to
specify that an object is persistent.
Persistence by Marking – an object that is to persist beyond program
execution is marked as persistent before program termination.
Persistence by Reachability - declare (root) persistent objects; objects are
persistent if they are referred to (directly or indirectly) from a root object.
Easier for programmer, but more overhead for database system
Similar to garbage collection used e.g. in Java, which
also performs reachability tests
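Persistence by reachability amounts to a graph traversal from the root objects. The Python sketch below assumes the object graph is given as a dictionary from each object name to the names it references; the function `reachable_from` and the sample graph are hypothetical.

```python
def reachable_from(roots, references):
    """Return every object reachable from the roots by following references."""
    persistent = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in persistent:
            continue                    # already visited; also handles cycles
        persistent.add(obj)
        stack.extend(references.get(obj, ()))
    return persistent


# Hypothetical object graph: object name -> names of the objects it refers to.
references = {
    "all_departments": ["dept5"],
    "dept5": ["emp1", "emp2"],
    "emp1": ["dept5"],                  # cycle back to the department
    "temp_report": ["emp2"],            # nothing persistent refers to this object
}

persistent = reachable_from(["all_departments"], references)
```

In this graph `temp_report` stays transient because no chain of references leads to it from the root, even though it refers to a persistent object itself.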
Object Identity and Pointers
A persistent object is assigned a persistent object identifier.
Degrees of permanence of identity:
Intraprocedure – identity persists only during the
execution of a single procedure
Intraprogram – identity persists only during execution of a
single program or query.
Interprogram – identity persists from one program
execution to another, but may change if the storage
organization is changed
Persistent – identity persists throughout program
executions and structural reorganizations of data; required
for object-oriented systems.
Type and Class Hierarchies and
Inheritance
Type (class) Hierarchy
A type in its simplest form can be defined by giving
it a type name and then listing the names of its
visible (public) functions
When specifying a type in this section, we use the
following format, which does not specify arguments
of functions, to simplify the discussion:
TYPE_NAME: function, function, . . . , function
Example:
PERSON: Name, Address, Birthdate, Age, SSN
Cont.
Super type:
A super type is an object type that has a parent-to-child
relationship with one or more subtypes
and contains the attributes that are common to its
subtypes
Sub type:
New type that is similar but not identical to an already
defined type (Super type)
Cont.
Example (1):
PERSON: Name, Address, Birthdate, Age, SSN
EMPLOYEE: Name, Address, Birthdate, Age, SSN,
Salary, HireDate, Seniority
STUDENT: Name, Address, Birthdate, Age, SSN,
Major, GPA
OR:
EMPLOYEE subtype-of PERSON: Salary,
HireDate, Seniority
STUDENT subtype-of PERSON: Major, GPA
Cont.
Example (2):
Consider a type that describes objects in plane geometry,
which may be defined as follows:
GEOMETRY_OBJECT: Shape, Area, ReferencePoint
Now suppose that we want to define a number of subtypes for
the GEOMETRY_OBJECT type, as follows:
RECTANGLE subtype-of GEOMETRY_OBJECT: Width,
Height
TRIANGLE subtype-of GEOMETRY_OBJECT: Side1,
Side2, Angle
CIRCLE subtype-of GEOMETRY_OBJECT: Radius
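The geometry hierarchy maps naturally onto class inheritance. The Python sketch below is one possible rendering: the area formulas, and the reading of Angle as the angle in radians between Side1 and Side2, are assumptions made for the example.

```python
import math


class GeometryObject:
    """Supertype: functions shared by all geometry objects."""

    def __init__(self, reference_point):
        self.reference_point = reference_point

    def shape(self):
        return type(self).__name__

    def area(self):
        raise NotImplementedError("each subtype supplies its own area")


class Rectangle(GeometryObject):
    def __init__(self, reference_point, width, height):
        super().__init__(reference_point)
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Triangle(GeometryObject):
    def __init__(self, reference_point, side1, side2, angle):
        super().__init__(reference_point)
        self.side1 = side1
        self.side2 = side2
        self.angle = angle              # assumed: radians, between side1 and side2

    def area(self):
        return 0.5 * self.side1 * self.side2 * math.sin(self.angle)


class Circle(GeometryObject):
    def __init__(self, reference_point, radius):
        super().__init__(reference_point)
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


shapes = [
    Rectangle((0, 0), 2, 3),
    Triangle((0, 0), 3, 4, math.pi / 2),
    Circle((1, 1), 1),
]
areas = [s.area() for s in shapes]
names = [s.shape() for s in shapes]
```

Each subtype inherits Shape and ReferencePoint from the supertype and adds its own attributes, while Area is implemented differently in each subtype.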
Example of Multiple Inheritance
Complex Objects
Can be
Unstructured complex object:
This is provided by a DBMS and permits the storage
and retrieval of large objects that are needed by the
database application.
Typical examples of such objects are bitmap images and
long text strings (such as documents); they are also known
as binary large objects (BLOBs)
This has been the standard way by which Relational
DBMSs have dealt with supporting complex objects, leaving
the operations on those objects outside the RDBMS.
Cont.
Structured complex object:
This differs from an unstructured complex object in
that the object’s structure is defined by repeated
application of the type constructors provided by the
OODBMS.
Hence, the object structure is defined and known to
the OODBMS.
The OODBMS also defines methods or operations
on it.
Other Object-Oriented Concepts
Polymorphism
is the provision of a single interface to entities of
different types
Operator Overloading
This concept allows the same operator name or
symbol to be bound to two or more different
implementations of the operator, depending on the
type of objects to which the operator is applied
For example, addition can be:
Addition in integers
Concatenation in strings (of characters)
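Operator overloading can be seen directly in Python, where `+` is bound to a different implementation depending on the operand types. The `Money` class below is a hypothetical user-defined type added to show that a programmer can supply a further implementation.

```python
class Money:
    """Hypothetical user-defined type with its own implementation of +."""

    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        # Called when the + operator is applied to two Money objects.
        return Money(self.amount + other.amount)


int_sum = 2 + 3                      # integer addition
str_sum = "data" + "base"            # string concatenation
money_sum = Money(5) + Money(7)      # user-defined addition
```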
Cont.
Selective Inheritance
a subtype inherits only some of the functions of a
super type
an EXCEPT clause may be used to list the functions
in a super type that are not to be inherited by the
subtype
Cont.
Versions
Many database applications that use OO systems
require the existence of several versions of the
same object
There may be more than two versions of an object.
Configuration:
A configuration of the complex object is a
collection consisting of one version of each module
arranged in such a way that the module versions in
the configuration are compatible and together form
a valid version of the complex object.