
Wollega University

Department of Computer Science


Advanced Database Management System
BY Tadele D.
November 16, 2024

ADBMS_Notes 1
CHAPTER ONE
OBJECT-ORIENTED DATABASE CONCEPTS
Object-oriented (OO) systems are efficient for complex main-memory
operations on persistent data, but they are susceptible to data corruption.

Object-oriented concepts can be used in different ways:

Object orientation can be used as a design tool.

Object-Relational (OR)
OR systems are declarative: (extended) SQL has limited power compared
with a programming language (PL), but in return it offers protection of
the data and good optimization.

The OR approach extends the relational model to make modeling and
querying easier.

Cont…
Object orientation can be incorporated into a programming
language that is used to manipulate the database:
• Object-relational systems – add complex types and object
orientation to a relational language
• Persistent programming languages – extend an object-oriented
programming language to deal with databases by adding
concepts such as persistence and collections.

Persistent Programming Languages
• Allow objects to be created and stored in a database, and
used directly from a programming language

• Allow data to be manipulated directly from the programming
language

• No need to go through SQL

Cont…
o A Database Management System (DBMS) contains information
about a particular enterprise
o It consists of a collection of interrelated data, a set of programs to access the data,
and an environment that is both convenient and efficient to use
o The primary goal of a DBMS is to provide an environment that
makes it both convenient and efficient to retrieve and store
database information.
Database Applications:
• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
Drawbacks of File System
• Data redundancy and inconsistency - multiple file formats,
duplication of information in different files
• Difficulty in accessing data - need to write a new program
to carry out each new task
• Data isolation - multiple files and formats
• Integrity problems - integrity constraints (e.g. account
balance > 0) become "buried" in program code rather than
being stated explicitly; it is hard to add new constraints or
change existing ones

Object Database Standards, Languages, and Design

A potential advantage of having and adhering to standards for a
type of database system is that it helps in achieving:
• Portability: the ability to execute a
particular application program on different systems
• Interoperability: the ability of
an application to access multiple distinct systems
• With object database standards, the same application
program may access some data stored under one ODBMS
package, and other data stored under another package.

Overview of ODMG

• ODMG stands for Object Database Management Group
• The ODMG object model is the data model upon which the object definition
language (ODL) and object query language (OQL) are based.
• It provides the data types, type constructors, and other
concepts that can be used in the ODL to specify object database
schemas.
• In short, ODMG provides a standard data model for OODBs

Object-oriented DBMS
• Object-oriented data model (OODM)
o A logical data model of objects supported in object-oriented
programming
• Object-oriented database (OODB)
o A persistent and sharable collection of objects defined by an
OODM
• Object-oriented DBMS (OODBMS)
o The manager of an OODB

Objects and Literals
o Objects and literals are the basic building blocks of the object model
o The main difference between the two is that an object has both an
object identifier and a current value, whereas a literal has only a
value and no object identifier
o Objects are user-defined complex data types
o An object has structure or state (variables) and methods
(behavior/operations)

Cont…
An object is described by four characteristics:
• Identifier: a system-wide unique ID for the object
• Name: an object may also have a unique name in the database
(optional)
• Lifetime: determines whether the object is persistent or transient
• Structure: how the object is constructed using type constructors
o The object identifier (Object_Id) is a unique system-wide identifier.
o Every object must have an object identifier. It may optionally be
given a unique name within a particular database; that name is used to refer to the
object in a program, and the system should be able to locate
the object given that name

Literals
In the object model, a literal is a value that does not have an
object identifier
• There are three types of literals:
o atomic,
o collection, and
o structured.
• Atomic literals correspond to the values of basic data types
and are predefined.
• Collection literals specify a value that is a collection of
objects or values, but the collection itself does not have an
Object_Id.
• Structured literals correspond roughly to values that are
constructed using the tuple constructor. They include Date,
Interval, Time, and Timestamp as built-in structures.

Atomic Objects
• In the object model, any user-defined object that is not
a collection object is called an atomic object.
• An atomic object type is defined as a class by specifying
its properties and operations.
• E.g., in a UNIVERSITY database application, the user can
specify an object type (class) for Student objects.
• Most such objects will be structured objects.

Cont…

• E.g., a Student object has a complex structure, with
many attributes, relationships, and operations, but it is still considered atomic
because it is not a collection.
• An attribute is a property that describes some aspect of
an object.
• A relationship is a property that specifies that two
objects in the database are related.

DBMS Languages
Two classes of languages:
• Procedural – the user specifies what data is required and
how to get that data
• Declarative (nonprocedural) – the user specifies what data
is required without specifying how to get it

Data Definition Language (DDL)

o DDL is a specification notation for defining the database schema

Example: create table account (account_number char(10),
branch_name char(10), balance integer)

o The DDL compiler generates a set of tables stored in a data dictionary

• The main SQL data definition language statements are:

CREATE, ALTER, DROP
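
The three DDL statements can be tried out with a minimal sketch like the one below. It assumes Python's built-in sqlite3 module as a stand-in for the course's RDBMS; the added column name opened_on is hypothetical and is not part of the slide's schema.

```python
import sqlite3

# In-memory SQLite database, used here only as a convenient stand-in RDBMS.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE: define the account schema from the slide.
cur.execute(
    "CREATE TABLE account ("
    " account_number CHAR(10),"
    " branch_name CHAR(10),"
    " balance INTEGER)"
)

# ALTER: change the schema by adding a (hypothetical) column.
cur.execute("ALTER TABLE account ADD COLUMN opened_on TEXT")

# The schema is stored as metadata; SQLite exposes it via PRAGMA table_info,
# where the second field of each row is the column name.
columns = [row[1] for row in cur.execute("PRAGMA table_info(account)")]

# DROP: remove the table together with its definition.
cur.execute("DROP TABLE account")
tables_after_drop = cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(columns)
print(tables_after_drop)
```

None of these statements touches rows of data; they only change what the data dictionary says about the schema.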

Cont…
o The data dictionary contains metadata (i.e., data about data):
Database schema
Data storage and definition language
• Specifies the storage structure and access
methods used
Integrity constraints
• Domain constraints
• Referential integrity
Authorization

Data Manipulation Language (DML)
• A language for accessing and manipulating the data organized by the
appropriate data model

• DML is also known as a query language

• The main SQL data manipulation language statements are:

SELECT, INSERT INTO, UPDATE, and DELETE FROM
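
The four DML statements can be exercised against the account table with a short sketch. It again assumes Python's sqlite3 module as the database, and the account numbers, branch names, and balances are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE account ("
    " account_number CHAR(10), branch_name CHAR(10), balance INTEGER)"
)

# INSERT INTO: add rows (sample data, purely illustrative).
cur.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-101", "Downtown", 500), ("A-102", "Uptown", 400), ("A-103", "Downtown", 900)],
)

# UPDATE: change existing rows that satisfy a condition.
cur.execute(
    "UPDATE account SET balance = balance + 100 WHERE account_number = ?",
    ("A-102",),
)

# DELETE FROM: remove rows that satisfy a condition.
cur.execute("DELETE FROM account WHERE account_number = ?", ("A-101",))

# SELECT: retrieve data declaratively (what is wanted, not how to get it).
rows = cur.execute(
    "SELECT account_number, balance FROM account ORDER BY account_number"
).fetchall()

print(rows)
```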

Data Models

o A data model is a collection of tools for describing data, data relationships,
data semantics, and data constraints.
• Relational model
• Entity-Relationship data model
• Object-based data models (object-oriented and object-relational)
• Semi-structured data model (XML)
A schema is a description of a particular collection of data, using a
given data model.
The main concept of the relational model is the relation, basically a table with rows and columns.
Every relation has a schema, which describes the columns, or fields.

Object Persistence
• An OODBMS is often closely coupled with an object-oriented programming language (OOPL)
• The OOPL is used to specify the method implementations and
application code
Objects may have different lifetimes in the database:
a. Transient: memory is allocated and managed by the programming
language run-time system
• Transient objects exist in the executing program and
disappear once the program terminates.

Cont…
b. Persistent: memory and storage are allocated and managed by the
ODBMS run-time system
• Persistent objects are stored in the database and persist after
program termination.
• They are stored permanently
in the database and hence can be accessed and shared by
multiple programs
• Persistence is the storage of data from working memory so
that it can be restored when the application is run again
• Persistent objects are accessed
from the programming language
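
The difference between transient and persistent lifetimes can be illustrated with a rough sketch. It assumes Python's standard shelve module as a very simplified stand-in for an ODBMS, and the Student class, the names, and the file location are hypothetical. Two function calls play the role of two separate program runs.

```python
import os
import shelve
import tempfile


class Student:
    """A hypothetical application class with state and behavior."""

    def __init__(self, name, gpa):
        self.name = name
        self.gpa = gpa

    def on_honor_roll(self):
        return self.gpa >= 3.5


def first_run(path):
    transient = Student("Abebe", 3.1)   # never stored: gone when this run ends
    persistent = Student("Sara", 3.8)
    with shelve.open(path) as db:
        # Store the object's state under a name. A real ODBMS would store the
        # object itself and assign it a system-wide object identifier.
        db["sara"] = vars(persistent)
    return transient.name


def second_run(path):
    # A later "run" finds only what was made persistent.
    with shelve.open(path) as db:
        names = sorted(db.keys())
        restored = Student(**db["sara"])
    return names, restored


store = os.path.join(tempfile.mkdtemp(), "students")
first_run(store)
names, restored = second_run(store)
print(names, restored.name, restored.on_honor_roll())
```

Only the object stored under a name survives the first run; the transient object leaves no trace in the store.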

CHAPTER TWO
Query Processing and Optimization
• Query processing is the process by which the results of a
high-level query, such as one written in SQL, are retrieved.
☺ A query expressed in a high-level query language must first
be scanned, parsed, and validated.
☺ The scanner identifies the language tokens, such as keywords,
attribute names, and relation names, in the text of the query.
☺ The parser checks the scanned query to determine whether
it is formulated according to the syntax rules of the query language.
☺ The query must also be validated by checking that all attribute and
relation names are valid and semantically meaningful names in the
schema of the database.
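
The scanning and validation steps can be sketched for a tiny SQL subset. This is an illustrative toy, not how any particular DBMS implements its scanner: the keyword list, the token categories, and the Employee schema are all assumptions made for the example.

```python
import re

# A deliberately small keyword set; real SQL has many more keywords.
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}

TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<number>\d+)"
    r"|(?P<name>[A-Za-z_][A-Za-z_0-9]*)"
    r"|(?P<op>>=|<=|<>|[=<>*,;()]))"
)


def scan(query):
    """Scanner: split the query text into (kind, text) tokens."""
    tokens = []
    pos = 0
    query = query.rstrip()
    while pos < len(query):
        match = TOKEN_PATTERN.match(query, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        pos = match.end()
        if match.group("number") is not None:
            tokens.append(("NUMBER", match.group("number")))
        elif match.group("name") is not None:
            word = match.group("name")
            kind = "KEYWORD" if word.upper() in KEYWORDS else "NAME"
            tokens.append((kind, word))
        else:
            tokens.append(("OP", match.group("op")))
    return tokens


def validate(tokens, schema):
    """Validation: return the names that are not relations or attributes in the schema."""
    known = set(schema) | {col for cols in schema.values() for col in cols}
    return [text for kind, text in tokens if kind == "NAME" and text not in known]


schema = {"Employee": ["emp_name", "salary"]}
tokens = scan("select emp_name from Employee where salary>10000;")
unknown = validate(tokens, schema)
print(tokens)
print(unknown)
```

A parser would sit between these two functions and check that the token sequence follows the grammar; that step is omitted here to keep the sketch short.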

Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

Processing a Query
Typical steps in processing a high-level query

Parsing and Translating the Query
☺ Parsing and translation convert the query into a form usable by the query
processing engine.
☺ High-level query languages such as SQL represent a query as a
string, i.e. a sequence of tokens such as keywords, operators, operands, literal
strings, etc.
☺ The primary job of the parser is to extract the tokens from the
raw string of characters and translate them into the
corresponding internal data elements (i.e. relational algebra
operations and operands) and structures (i.e. query tree, query
graph).
☺ The parser also verifies the validity and syntax of the
original query string.

Example
• In SQL, a user wants to fetch the names of the employees
whose salary is greater than 10000.

• To do this, the following query is issued:

select emp_name from Employee where salary > 10000;

• For the system to process the user query, it must be
translated into relational algebra. This
query can be written in relational algebra as:

π_{emp_name} (σ_{salary>10000} (Employee))

or, equivalently, with an early projection that keeps only the needed attributes:

π_{emp_name} (σ_{salary>10000} (π_{emp_name, salary} (Employee)))
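
That one query can have several equivalent relational algebra forms can be checked with a small sketch. The select and project helpers and the Employee rows below are hypothetical. Unlike true relational algebra, this project keeps duplicates, which does not matter for the sample data.

```python
def select(rows, predicate):
    """σ: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attributes):
    """π: keep only the listed attributes (duplicates are not removed here)."""
    return [{a: row[a] for a in attributes} for row in rows]


# Made-up sample relation.
employee = [
    {"emp_name": "Abel", "salary": 8000},
    {"emp_name": "Bontu", "salary": 12000},
    {"emp_name": "Chaltu", "salary": 10000},
    {"emp_name": "Dawit", "salary": 15000},
]


def high_salary(row):
    return row["salary"] > 10000


# Plan A: select first, then project the name.
plan_a = project(select(employee, high_salary), ["emp_name"])

# Plan B: project early, keeping salary because the selection still needs it.
plan_b = project(
    select(project(employee, ["emp_name", "salary"]), high_salary),
    ["emp_name"],
)

print(plan_a)
print(plan_a == plan_b)
```

Both plans return the same rows; the optimizer's job is to pick the one that is cheaper to execute. Note that an early projection must keep every attribute that a later selection refers to.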


Optimizing the Query

☺ The DBMS must then select an execution strategy for
retrieving the result of the query from the database files.
☺ Query optimization is the process of choosing a suitable
execution strategy for processing a query.
☺ The query optimizer module has the task of producing an
execution plan, and the code generator generates the code to
execute that plan.
☺ The runtime database processor has the task of running the
query code, whether in compiled or interpreted mode, to
produce the query result.
☺ If a runtime error results, an error message is generated by the
runtime database processor.
☺ In the optimization stage, the query processor applies rules to
the internal data structures of the query to transform these
structures into equivalent but more efficient representations.

☺ The rules can be based on the relational algebra
expression and tree (heuristics), or on cost estimates of
the different algorithms applied to the operations and the relations
involved.

☺ Selecting the proper rules to apply, when to apply them, and
how they are applied is the function of the query optimization engine.
Evaluating the Query
☺ This is the final step in processing a query. The best evaluation
plan candidate generated by the optimization engine is selected
and then executed.
☺ Note that there can be multiple methods of executing a
query.
☺ Regardless of the method chosen, the actual results should be
the same.
☺ Finding the optimal strategy is usually too time-consuming
except for the simplest of queries, and may require information
on how the files are implemented.
☺ Hence, planning of an execution strategy may be a more
accurate description than query optimization.

Translating SQL Queries into Relational Algebra
☺ SQL (non-procedural) is the query language used in
most commercial RDBMSs.

☺ An SQL query is first translated into an equivalent extended
relational algebra expression, represented as a query tree data structure, and
then optimized.

☺ Relational algebra is a procedural language that takes
relations as input and produces a relation as output.

Cont…
☺ SQL queries are decomposed into query blocks, which form
the basic units that can be translated into algebraic operators
and optimized.
☺ A query block contains a single SELECT-FROM-WHERE
expression, as well as GROUP BY and HAVING clauses if these
are part of the block.
☺ Nested queries within a query are identified as
separate query blocks.
☺ The query optimizer then chooses an execution plan
for each block.
☺ If an inner block does not reference the outer block (for example, one that
computes a maximum salary), it needs to be evaluated only once. Such a query is called an uncorrelated nested query.
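
The idea of evaluating an uncorrelated inner block once can be shown with a small sketch. It assumes Python's sqlite3 module; the employee table, its columns (name, dno, salary), and the rows are hypothetical and chosen only to illustrate the two query blocks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dno INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Abel", 5, 30000),
        ("Bontu", 5, 40000),
        ("Chaltu", 4, 43000),
        ("Dawit", 1, 55000),
        ("Eyob", 4, 25000),
    ],
)

# Inner block: it does not refer to the outer query, so one evaluation is enough.
(max_salary,) = conn.execute(
    "SELECT MAX(salary) FROM employee WHERE dno = 5"
).fetchone()

# Outer block: the inner result is now just a constant.
two_step = conn.execute(
    "SELECT name FROM employee WHERE salary > ? ORDER BY name", (max_salary,)
).fetchall()

# The same query written as a single nested statement.
nested = conn.execute(
    "SELECT name FROM employee"
    " WHERE salary > (SELECT MAX(salary) FROM employee WHERE dno = 5)"
    " ORDER BY name"
).fetchall()

print(max_salary, two_step, two_step == nested)
```

A correlated nested query, by contrast, mentions an attribute of the outer block inside the inner block, so the inner block cannot simply be computed once up front.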

Exercise
Using the following tables, work through the query processing steps for each query:
instructor(ID, name, dept_name, salary)
teaches(ID, course_id, sec_id, semester, year)
course(course_id, title, dept_name, credits)
• Query 1: Find the names of all instructors in the Music department,
along with the titles of the courses that they teach
• Query 2: Find the names of all instructors in the CSE department
who have taught a course in 2009, along with the titles of the
courses that they taught

Implementing Basic Query Operations
• An RDBMS must provide implementation(s) for all the required
operations, including the relational operators
Internal Sorting
• Each record contains a field called the key.
• The keys can be placed in a linear order.
External Sorting
• Refers to sorting algorithms that are suitable for large files of records
stored on disk that do not fit entirely in main memory, such as most
database files.
Sort-merge strategy
o It starts by sorting small subfiles (runs) of the main file and then
merges the sorted runs, creating larger sorted subfiles that are
merged in turn.
1. Sorting phase
2. Merging phase
Cont…

1. Sorting phase
• Number of file blocks: b
• Number of available buffers: nB
• Number of initial runs: nR = ⌈b / nB⌉
2. Merging phase (one or more merge passes)
• Degree of merging: the number of runs that are merged together in each pass
• Analysis of the algorithm
Number of file blocks = b
Number of initial runs = nR
Available buffer space = nB
Sorting phase: nR = ⌈b / nB⌉
Degree of merging: dM = min(nB - 1, nR)
Number of merge passes: nP = ⌈log_dM(nR)⌉
Total number of passes (sorting pass plus merge passes): nP + 1
Number of block accesses: (2 * b) + (2 * b * nP) = 2 * b * (nP + 1)
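
The analysis above can be turned into a small calculator. In this sketch nP counts merge passes only, so the sorting pass is the separate 2 * b term; the function names and the sample values b = 1024 and nB = 5 are assumptions chosen for illustration.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b."""
    return -(-a // b)


def merge_passes(runs, degree):
    """Number of merge passes, i.e. ceil(log_degree(runs)), computed without floats."""
    passes = 0
    while runs > 1:
        runs = ceil_div(runs, degree)
        passes += 1
    return passes


def sort_merge_cost(b, nB):
    """Cost figures for sorting b blocks with nB buffers (nB >= 3 assumed)."""
    if nB < 3:
        raise ValueError("need at least 3 buffers: 2 for input runs, 1 for output")
    nR = ceil_div(b, nB)          # initial runs produced by the sorting phase
    dM = min(nB - 1, nR)          # degree of merging
    nP = merge_passes(nR, dM)     # merge passes only
    block_accesses = 2 * b + 2 * b * nP
    return {"nR": nR, "dM": dM, "nP": nP, "block_accesses": block_accesses}


example = sort_merge_cost(b=1024, nB=5)
print(example)
```

For these sample values the 205 initial runs shrink to 52, 13, 4, and finally 1 run, which is why four merge passes are needed.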
To sort a file with N pages using B buffer pages:
• Pass 0: use B buffer pages. Produce ⌈N/B⌉ sorted runs of B
pages each.
• Pass 1, 2, …, etc.: merge B-1 runs at a time.
Number of passes = 1 + ⌈log_{B-1} ⌈N/B⌉⌉
Total cost of external sort-merge = 2N * (# of passes)
E.g., with 4 buffer pages, to sort a 36-page file:
• Pass 0: ⌈36/4⌉ = 9 sorted runs of 4 pages each
• Pass 1: ⌈9/3⌉ = 3 sorted runs of 12 pages each
• Pass 2: ⌈3/3⌉ = 1 sorted run of 36 pages
• Number of passes = 3 (pass 0, pass 1, and pass 2)
• Total cost of external sort-merge = 2(36) * 3 = 216 I/Os
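
The pass structure of the 36-page example can be simulated in memory. This is only a sketch: a real external sort reads and writes disk pages, whereas here each "page" is a short Python list, and the page size of 2 keys and the random keys are arbitrary assumptions.

```python
import heapq
import random


def external_sort(keys, page_size, B):
    """Simulate external merge sort; returns (sorted keys, runs after each pass, I/O cost)."""
    if B < 3:
        raise ValueError("need at least 3 buffer pages")
    pages = [keys[i:i + page_size] for i in range(0, len(keys), page_size)]
    N = len(pages)

    # Pass 0: read B pages at a time, sort them in memory, write one run.
    runs = [
        sorted(k for page in pages[i:i + B] for k in page)
        for i in range(0, N, B)
    ]
    run_counts = [len(runs)]

    # Pass 1, 2, ...: merge B-1 runs at a time (one buffer is kept for output).
    while len(runs) > 1:
        runs = [
            list(heapq.merge(*runs[i:i + B - 1]))
            for i in range(0, len(runs), B - 1)
        ]
        run_counts.append(len(runs))

    # Every pass reads and writes each of the N pages once.
    io_cost = 2 * N * len(run_counts)
    return runs[0], run_counts, io_cost


random.seed(0)
keys = random.sample(range(1000), 36 * 2)   # 36 pages of 2 keys each
result, run_counts, io_cost = external_sort(keys, page_size=2, B=4)
print(run_counts, io_cost)
```

The list of run counts after each pass has one entry per pass, so its length is the number of passes.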

Query Tree Optimization
Query optimization is categorized into two types:
1. Heuristic (rule-based): the optimizer chooses execution plans
based on heuristically ranked operations.
2. Systematic (cost-based): the optimizer examines alternative
access paths and operator algorithms and chooses the execution
plan with the lowest estimated cost.
• Cost-based optimization is expensive, even with dynamic
programming.
Steps in cost-based query optimization:
• Generate logically equivalent expressions using equivalence rules
• Annotate the resulting expressions to get alternative query plans
• Choose the cheapest plan based on estimated cost

Heuristic-Based Query Tree Optimization
• Heuristics are experience-based techniques for problem solving, learning, and discovery
• Systems may use heuristics to reduce the number of choices that must be
made in a cost-based fashion

• Heuristic optimization transforms the query tree by using a set of rules that typically (but not
in all cases) improve execution performance:

o Perform selection early (reduces the number of tuples)

o Perform projection early (reduces the number of attributes)

o Perform the most restrictive selection and join operations (i.e. those with the smallest
result size) before other similar operations

o Some systems use only heuristics; others combine heuristics with partial
cost-based optimization
The steps of the heuristic optimization algorithm are:

1. Break up SELECT operations with conjunctive conditions into
a cascade of SELECT operations
2. Using the commutativity of SELECT with other operations, move each SELECT
operation as far down the query tree as is permitted by the attributes involved in the select
condition
3. Using the commutativity and associativity of binary operations,
rearrange the leaf nodes of the tree so that the most restrictive SELECT operations are executed first
4. Combine a CARTESIAN PRODUCT operation with a
subsequent SELECT operation in the tree into a JOIN operation,
if the condition allows
5. Using the cascading of PROJECT and the commuting of
PROJECT with other operations, break down and move lists of projection
attributes down the tree as far as possible by creating new
PROJECT operations as needed
6. Identify subtrees that represent groups of operations that can be
executed by a single algorithm
Example 3: Find the last names of employees born after 1957 who
work on a project named 'Aquarius'.
SELECT LNAME
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE PNAME = 'Aquarius' AND PNUMBER = PNO AND
ESSN = SSN AND BDATE > '1957-12-31';
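
The effect of the heuristic rules on this query can be sketched by comparing two plans over tiny in-memory relations. The rows below are made up, the date condition is assumed to be BDATE > '1957-12-31' (matching "born after 1957"), and the plans are written by hand rather than produced by an optimizer.

```python
# Made-up sample relations using the attribute names from the query.
employee = [
    {"SSN": 1, "LNAME": "Smith", "BDATE": "1965-01-09"},
    {"SSN": 2, "LNAME": "Wong", "BDATE": "1955-12-08"},
    {"SSN": 3, "LNAME": "Zelaya", "BDATE": "1968-01-19"},
    {"SSN": 4, "LNAME": "Narayan", "BDATE": "1962-09-15"},
]
project = [
    {"PNUMBER": 10, "PNAME": "Aquarius"},
    {"PNUMBER": 20, "PNAME": "ProductX"},
]
works_on = [
    {"ESSN": 1, "PNO": 10},
    {"ESSN": 2, "PNO": 10},
    {"ESSN": 3, "PNO": 20},
    {"ESSN": 4, "PNO": 10},
    {"ESSN": 4, "PNO": 20},
]


def initial_plan():
    """Cartesian product of all three relations, then one big selection."""
    product = [(e, w, p) for e in employee for w in works_on for p in project]
    selected = [
        e for e, w, p in product
        if p["PNAME"] == "Aquarius" and p["PNUMBER"] == w["PNO"]
        and w["ESSN"] == e["SSN"] and e["BDATE"] > "1957-12-31"
    ]
    return sorted(e["LNAME"] for e in selected), len(product)


def optimized_plan():
    """Selections pushed down; products plus selections turned into joins."""
    aquarius = [p for p in project if p["PNAME"] == "Aquarius"]
    born_after_1957 = [e for e in employee if e["BDATE"] > "1957-12-31"]
    pw = [w for p in aquarius for w in works_on if p["PNUMBER"] == w["PNO"]]
    joined = [e for w in pw for e in born_after_1957 if w["ESSN"] == e["SSN"]]
    largest = max(len(aquarius), len(born_after_1957), len(pw), len(joined))
    return sorted(e["LNAME"] for e in joined), largest


names_initial, size_initial = initial_plan()
names_optimized, size_optimized = optimized_plan()
print(names_initial, size_initial)
print(names_optimized, size_optimized)
```

Both plans give the same answer, but the largest intermediate result shrinks from the full Cartesian product to a handful of tuples.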

Cont…

• The main goal of query optimization is to reduce the size of intermediate
results:
o by performing SELECT operations early to reduce the number
of tuples, and
o by performing PROJECT operations early to reduce the number of attributes.

Assignment 1 (10%)
1. Describe object-oriented concepts. Give examples.
2. Describe the advantages and disadvantages of OODBMSs.
3. Describe how to create objects in a DBMS.
4. Discuss the different data models.
5. Describe 2-way external sorting and B-way external sorting.
6. Explain external sorting vs. internal sorting.
7. Describe the different types of indexes in a DBMS.
