ADB Notes 2021

Uploaded by

kaffe

Department of Computer Science

Advanced Database Management System

04/02/2024 1
CHAPTER ONE
OBJECT-ORIENTED DATABASE CONCEPTS

A Database Management System (DBMS) is a collection of interrelated
data together with a set of programs to access, monitor, and manage
an enterprise's data.
The primary goal of a DBMS is to provide an environment that makes it
both convenient and efficient to store and retrieve the enterprise's
database information.
Database application areas:
• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations

Object Database Standards
Having a standard for a type of database system, and adhering to it,
helps to achieve:
• portability: the ability to execute a particular database application
program on different systems
• interoperability: the ability of an application to access multiple
distinct systems
ODMG (Object Database Management Group)
The ODMG object model is the data model on which the object definition
language (ODL) and the object query language (OQL) are based.
It provides the data types, type constructors, and other concepts that
can be utilized in the ODL to specify object database schemas.
• An object-oriented database (OODB) is a persistent and sharable
collection of objects defined by an object-oriented data model (OODM).

Exercise
1. Describe the object-oriented data model (OODM).
2. Describe the object-oriented DBMS (OODBMS).

DBMS Languages
There are different types of DBMS languages:
a) Data Definition Language (DDL)
b) Data Manipulation Language (DML)
c) Data Control Language (DCL)
d) Transaction Control Language (TCL)

Data Definition Language (DDL)

o DDL is a specification notation for defining the database schema;
the definitions are recorded in the data dictionary.

Example: create table account (account_number char(10),
branch_name char(10), balance integer)

o The DDL compiler generates a set of tables stored in a data dictionary.

Example DDL statements: CREATE, ALTER, DROP
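The three DDL statements can be tried out with a short script. This is only a sketch: it assumes Python's built-in sqlite3 module as the DBMS, and the added column customer_name is a hypothetical name chosen for illustration.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# CREATE: define the schema of the account table from the example above.
conn.execute(
    "CREATE TABLE account ("
    " account_number CHAR(10),"
    " branch_name CHAR(10),"
    " balance INTEGER)"
)

# ALTER: change the schema by adding a column (customer_name is hypothetical).
conn.execute("ALTER TABLE account ADD COLUMN customer_name CHAR(20)")

# Read the column names back from the system's own metadata.
columns = [row[1] for row in conn.execute("PRAGMA table_info(account)")]

# DROP: remove the table definition together with its data.
conn.execute("DROP TABLE account")
tables_after_drop = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(columns)
print(tables_after_drop)
```

In SQLite the sqlite_master table plays the role of the data dictionary: it stores the schema definitions that the DDL statements create and remove.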

Cont…
o The data dictionary contains metadata (i.e., data about data):
  Database schema
  Integrity constraints
    • Domain constraints
    • Referential integrity
  Authorization

Data Manipulation Language (DML)
• A language for accessing and manipulating the data organized by the
appropriate data model

• DML is also known as a query language

Example: SELECT, INSERT INTO, UPDATE, and DELETE FROM
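A minimal sketch of the four DML statements, again assuming Python's sqlite3 module. The account numbers, branch names, and balances are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_number CHAR(10), branch_name CHAR(10), balance INTEGER)"
)

# INSERT INTO: add new rows (sample data, not from the notes).
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-101", "Downtown", 500), ("A-102", "Uptown", 400)],
)

# UPDATE: change existing rows.
conn.execute(
    "UPDATE account SET balance = balance + 100 WHERE account_number = ?",
    ("A-101",),
)

# DELETE FROM: remove rows.
conn.execute("DELETE FROM account WHERE account_number = ?", ("A-102",))

# SELECT: retrieve rows.
rows = conn.execute(
    "SELECT account_number, balance FROM account ORDER BY account_number"
).fetchall()
print(rows)
```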

Object Consistency
An OODBMS is often closely coupled with an object-oriented programming
language (OOPL).
• The OOPL is used to specify the method implementations and
application code.
Objects may have different lifetimes in the database:
a. Transient:
• Memory is allocated and managed by the programming language
run-time system.
• Transient objects exist only temporarily in the executing program;
they are not kept and disappear once the program terminates.

Cont…
b. Persistent:
• Memory is allocated and storage is managed by the ODBMS run-time
system.
• Persistent objects are stored in the database and persist after
program termination.
• A persistent collection holds objects that are stored permanently in
the database and hence can be accessed and shared by multiple
programs.
• Persistence is the storage of data from working memory so that
it can be restored when the application is run again.
• Persistent objects are accessed from the programming language.
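The difference between the two lifetimes can be sketched as follows. This is not an OODBMS: a JSON file stands in for the database, and the Account class, the run_program function, and the file name are all hypothetical.

```python
import json
import os
import tempfile


class Account:
    def __init__(self, number, balance):
        self.number = number
        self.balance = balance


def run_program(store_path):
    """Simulate one program run against a store that outlives the run."""
    # Transient object: created in memory and gone when the run ends.
    scratch = Account("TEMP", 0)
    scratch.balance += 100

    # Persistent object: restored from storage if an earlier run saved it.
    if os.path.exists(store_path):
        with open(store_path) as f:
            account = Account(**json.load(f))
    else:
        account = Account("A-101", 500)
    account.balance += 100

    # Save the persistent object so that the next run can restore it.
    with open(store_path, "w") as f:
        json.dump({"number": account.number, "balance": account.balance}, f)
    return account.balance, scratch.balance


store = os.path.join(tempfile.mkdtemp(), "objects.json")
first_run = run_program(store)
second_run = run_program(store)
print(first_run, second_run)
```

The persistent balance keeps growing from one run to the next, while the transient object starts from zero every time.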

CHAPTER TWO
Query Processing and Optimization
• Query processing is the process by which the results of a high-level
query, such as one written in SQL, are retrieved.
☺ A query expressed in a high-level query language must first
be scanned, parsed, and validated.
☺ The scanner identifies the language tokens, such as keywords,
attribute names, and relation names, in the text of the query.
☺ The parser checks the scanned query to determine whether
it is formulated according to the syntax rules of the query language.
☺ The query must also be validated by checking that all attribute and
relation names are valid and semantically meaningful in the schema of
the database being queried.
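The scanning and validation steps can be sketched in a few lines. This is a toy under stated assumptions: the keyword set, the token pattern, and the SCHEMA dictionary are hypothetical and cover only a tiny subset of SQL.

```python
import re

KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}  # hypothetical subset
SCHEMA = {"account": {"account_number", "branch_name", "balance"}}  # assumed

TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<number>\d+)"
    r"|(?P<name>[A-Za-z_][A-Za-z_0-9]*)"
    r"|(?P<string>'[^']*')"
    r"|(?P<op><=|>=|<>|[=<>*,().]))"
)


def scan(query):
    """Scanner: split the query text into (kind, text) tokens."""
    query = query.rstrip()
    tokens, pos = [], 0
    while pos < len(query):
        match = TOKEN_PATTERN.match(query, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near position {pos}")
        kind = match.lastgroup
        text = match.group(kind)
        if kind == "name" and text.upper() in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, text))
        pos = match.end()
    return tokens


def unknown_names(tokens):
    """Validation: report names that are neither relations nor attributes."""
    known = set(SCHEMA)
    for attributes in SCHEMA.values():
        known |= attributes
    return [text for kind, text in tokens if kind == "name" and text not in known]


tokens = scan("SELECT balance FROM account WHERE balance >= 500")
bad = unknown_names(scan("SELECT salary FROM account"))
print(tokens)
print(bad)
```

A real parser would also check the token sequence against the grammar of the language. That step is omitted here to keep the sketch short.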

Processing a Query
[Figure: Typical steps in processing a high-level query]

Cont…
☺ The DBMS must then devise an execution strategy for
retrieving the result of the query from the database files.
☺ Query optimization is the process of choosing a suitable
execution strategy for processing a query.
☺ The query optimizer module has the task of producing an
execution plan, and the code generator generates the code to
execute that plan.
☺ The runtime database processor has the task of running the
query code, whether in compiled or interpreted mode, to
produce the query result.
☺ If a runtime error results, an error message is generated by the
runtime database processor.

Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

Parsing and Translating the Query
☺ This step converts the query into a form usable by the query
processing engine.
☺ High-level query languages such as SQL represent a query as a
string, or sequence, of tokens such as keywords, operators, operands,
and literal strings.
☺ The primary job of the parser is to extract the tokens from the
raw string of characters and translate them into the
corresponding internal data elements (i.e., relational algebra
operations and operands) and structures (i.e., query tree, query
graph).
☺ The last job of the parser is to verify the validity and syntax of the
original query string.

Optimizing the Query
☺ In this stage, the query processor applies rules to the internal data
structures of the query to transform them into equivalent
but more efficient representations.

☺ The rules can be based on the relational algebra expression
and tree (heuristics) or on cost estimates of the different algorithms
that can be applied to the operations.

☺ Selecting the proper rules to apply, when to apply them, and how
they are applied is the function of the query optimizer.

Evaluating the Query
☺ This is the final step in processing a query. The best evaluation plan
candidate generated by the optimization engine is selected and then
executed.
☺ Note that there can be multiple methods of executing a query.
☺ Regardless of the method chosen, the actual results should be the
same.
☺ Finding the optimal strategy is usually too time-consuming except
for the simplest of queries and may require information on how
the files are implemented.
☺ Hence, planning of an execution strategy may be a more
accurate description than query optimization.

Translating SQL Queries into Relational Algebra

☺ SQL (a non-procedural language) is the query language used in most
commercial RDBMSs.

☺ An SQL query is first translated into an equivalent extended
relational algebra expression, represented as a query tree data
structure, and then optimized.

☺ Relational algebra is a procedural language that takes relations
as input and produces a relation as output.

Cont…
☺ SQL queries are decomposed into query blocks, which form
the basic units that can be translated into algebraic operators
and optimized.
☺ A query block contains a single SELECT-FROM-WHERE
expression, as well as GROUP BY and HAVING clauses if these
are part of the block.
☺ Nested queries within a query are identified as
separate query blocks.
☺ The query optimizer then chooses an execution plan
for each block.
☺ In an uncorrelated nested query, for example an inner block that
computes a maximum salary, the inner block needs to be evaluated
only once.
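A sketch of what "evaluated only once" means for an uncorrelated nested query. It assumes Python's sqlite3 module; the employee table, its columns, and its rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dno INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Abebe", 5, 3000), ("Sara", 5, 4000), ("Kebede", 4, 4500), ("Hana", 1, 5500)],
)

# Inner block: it does not refer to the outer query, so one evaluation suffices.
max_salary = conn.execute(
    "SELECT MAX(salary) FROM employee WHERE dno = 5"
).fetchone()[0]

# Outer block: uses the inner result as a constant.
two_step = conn.execute(
    "SELECT name FROM employee WHERE salary > ? ORDER BY name", (max_salary,)
).fetchall()

# The same query written as a single nested statement.
nested = conn.execute(
    "SELECT name FROM employee"
    " WHERE salary > (SELECT MAX(salary) FROM employee WHERE dno = 5)"
    " ORDER BY name"
).fetchall()

print(max_salary, two_step, nested)
```

Both forms return the same rows, which is why the two blocks can be planned separately.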

Implementing Basic Query Operations
• An RDBMS must provide implementation(s) for all the required
operations, including the relational operators.
Internal Sorting
• Sorts records that fit entirely in main memory.
• Each record contains a field called the key.
• The keys can be placed in a linear order.
External Sorting
• Refers to sorting algorithms that are suitable for large files of
records stored on disk that do not fit entirely in main memory, such as
most database files.
Sort-merge strategy
o It starts by sorting small subfiles (runs) of the main file and then
merges the sorted runs, creating larger sorted subfiles that are
merged in turn.
1. Sorting phase
2. Merging phase

Example 1: External Sorting Using Sort-Merge
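The two phases can be sketched in memory. In this simplified model a file is a list of blocks and each block is a list of keys. The function names and the sample file are assumptions made for illustration, and nothing is actually read from or written to disk.

```python
import heapq


def sort_phase(blocks, n_buffers):
    """Sorting phase: sort n_buffers blocks at a time; each result is one run."""
    runs = []
    for start in range(0, len(blocks), n_buffers):
        chunk = blocks[start:start + n_buffers]
        runs.append(sorted(key for block in chunk for key in block))
    return runs


def merge_phase(runs, n_buffers):
    """Merging phase: merge up to n_buffers - 1 runs per step until one is left."""
    if n_buffers < 3:
        raise ValueError("need at least 3 buffers to merge 2 runs")
    degree = n_buffers - 1  # one buffer is reserved for the output
    passes = 0
    while len(runs) > 1:
        runs = [
            list(heapq.merge(*runs[i:i + degree]))
            for i in range(0, len(runs), degree)
        ]
        passes += 1
    return (runs[0] if runs else []), passes


# Sample file: 7 blocks of 2 keys each, sorted with 3 buffers.
file_blocks = [[9, 4], [7, 1], [3, 8], [2, 6], [5, 0], [13, 11], [12, 10]]
initial_runs = sort_phase(file_blocks, n_buffers=3)
sorted_keys, merge_passes = merge_phase(initial_runs, n_buffers=3)
print(len(initial_runs), merge_passes, sorted_keys)
```

With 3 buffers the 7 blocks give 3 initial runs, which are merged 2 at a time in 2 passes.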

1. Sorting phase
• Number of file blocks: b
• Number of available buffers: nB
• Number of initial runs: nR = ⌈b / nB⌉
2. Merging phase (one or more passes)
• Degree of merging (dM): the number of runs that are merged
together in each pass
Analysis of the algorithm
Number of file blocks = b
Number of initial runs = nR
Available buffer space = nB
Sorting phase: nR = ⌈b / nB⌉
Degree of merging: dM = min(nB - 1, nR)
Number of merge passes: nP = ⌈log_dM(nR)⌉
Number of block accesses: (2 × b) + (2 × b × nP)

Number of passes = 1 + ⌈log_(B-1) ⌈N / B⌉⌉
Total cost of external sort-merge = 2N × (number of passes)
To sort a file with N pages using B buffer pages:
• Pass 0: use B buffer pages. Produce ⌈N / B⌉ sorted runs of B pages each.
• Pass 1, 2, …, etc.: merge B - 1 runs at a time.
E.g., with 5 buffer pages, to sort a 108-page file:
• Pass 0: ⌈108 / 5⌉ = 22 sorted runs of 5 pages each (last run is
only 3 pages)
• Pass 1: ⌈22 / 4⌉ = 6 sorted runs of 20 pages each (last run is
only 8 pages)
• Pass 2: ⌈6 / 4⌉ = 2 sorted runs, 80 pages and 28 pages
• Pass 3: ⌈2 / 4⌉ = 1 sorted file of 108 pages
• Number of passes = 4 (pass 0, pass 1, pass 2, and pass 3)
• Total cost of external sort-merge = 2 × 108 × 4 = 864 page I/Os
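The worked example can be checked with a short function. The name external_sort_cost is hypothetical. It counts the passes by repeated division instead of a logarithm, which avoids floating-point rounding.

```python
import math


def external_sort_cost(n_pages, n_buffers):
    """Return (number of passes, total page I/Os) for external sort-merge."""
    runs = math.ceil(n_pages / n_buffers)  # pass 0 produces the initial runs
    passes = 1
    while runs > 1:
        runs = math.ceil(runs / (n_buffers - 1))  # merge B - 1 runs at a time
        passes += 1
    # Every pass reads and writes each page once: 2N I/Os per pass.
    return passes, 2 * n_pages * passes


passes, cost = external_sort_cost(n_pages=108, n_buffers=5)
print(passes, cost)
```

For N = 108 and B = 5 the run counts go 22, 6, 2, 1, which gives 4 passes and 864 page I/Os, as in the example.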

Query Tree Optimization
Query optimization is categorized into two types:
1. Heuristic (rule based):
• The optimizer chooses execution plans based on heuristically ranked
operations.
• Heuristics are experience-based techniques for problem solving,
learning, and discovery.
2. Systematic (cost based): the optimizer examines alternative access
paths and operator algorithms and chooses the execution plan with
the lowest estimated cost.
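A toy illustration of the cost-based idea: estimate a cost for each candidate plan and pick the cheapest. The plan names and the simple cost estimates (a linear search reads up to b blocks, a binary search on a sorted file reads about log2(b) blocks) are assumptions for illustration, not the optimizer of any particular DBMS.

```python
import math


def estimate_costs(n_blocks, file_is_sorted):
    """Estimated block accesses for each applicable search algorithm."""
    costs = {"linear search": n_blocks}
    if file_is_sorted:
        # Binary search is applicable only when the file is ordered on the key.
        costs["binary search"] = math.ceil(math.log2(n_blocks))
    return costs


def choose_plan(costs):
    """Cost-based choice: the plan with the lowest estimated cost."""
    return min(costs, key=costs.get)


sorted_choice = choose_plan(estimate_costs(1024, file_is_sorted=True))
unsorted_choice = choose_plan(estimate_costs(1024, file_is_sorted=False))
print(sorted_choice, unsorted_choice)
```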

CHAPTER THREE
Transaction Processing Concepts
A transaction is an action, carried out by a single user or an
application program, which reads or updates the contents of a
database.
It is a logical unit of database processing that includes one or more
access operations (read: retrieval; write: insert, update, or delete).
In transaction processing, an application program may contain
several transactions separated by Begin and End transaction
boundaries.
• If the database operations in a transaction do not update the
database, it is called a “read-only transaction”.

Transaction User
• According to the number of users that can connect to the system
concurrently, a transaction system is classified as:
o Single-user: only one user can use the system at a time
o Multi-user: many users can use the system at the same time
Two modes of concurrency
o Interleaved processing: concurrent execution of processes is
interleaved on a single CPU
o Parallel processing: processes are concurrently executed on
multiple CPUs

Transaction properties
To ensure the integrity of data, the database system is required to
maintain the ACID properties of transactions:
a. Atomicity
b. Consistency preservation
c. Isolation
d. Durability or permanency

Atomicity
Either all operations of the transaction are reflected properly in
the database, or none are.
To ensure atomicity, if the transaction does not complete, the
DBMS restores the old values to make it appear as though the
transaction had never executed.
Consistency
If the database is consistent before the execution of a
transaction, it remains consistent after the execution of
the transaction.
For example, when funds are transferred from account A to account B,
the consistency requirement is that the sum of A and B be
unchanged by the execution of the transaction.

Isolation
Even though multiple transactions may execute concurrently, the
system guarantees that:
o for every pair of transactions Ti and Tj, it appears to Ti that either
Tj finished execution before Ti started, or Tj started execution
after Ti finished.
o the execution of a transaction is not interfered with by
any other transactions executing concurrently.

Durability or permanency
After a transaction completes successfully, the changes it has made to
the database persist, even if there are system failures.
These changes must not be lost because of any failure.
Durability ensures that once a transaction has been committed, its
updates are not lost, even if there is a system failure.

Example 1)
• Transfer £50 from account A to account B

Transaction:
Read(A)
A = A - 50
Write(A)
Read(B)
B = B + 50
Write(B)

Atomicity - shouldn’t take money from A without giving it to B
Consistency - money isn’t lost or gained
Isolation - other queries shouldn’t see A or B change until completion
Durability - the money does not go back to A

Cont…
Example 2) Let Ti be a transaction that transfers $5000 from Bona’s
account (5000) to Beshatu’s account. The transaction can be
defined as follows:

• Ti: read (Bona) (withdraw from Bona)
Bona := Bona - 5000
write (Bona) (update Bona)
____________________________________________
read (Beshatu) (deposit to Beshatu)
Beshatu := Beshatu + 5000
write (Beshatu) (update Beshatu)
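The transfer can be sketched as one atomic unit. The sketch assumes Python's sqlite3 module. The table layout, the transfer function, and the account name "Nobody" are hypothetical, and Beshatu's starting balance of 0 is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " owner TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("Bona", 5000), ("Beshatu", 0)])
conn.commit()


def transfer(conn, source, target, amount):
    """Withdraw and deposit as one transaction: both happen or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE owner = ?",
                (amount, source),
            )
            deposit = conn.execute(
                "UPDATE account SET balance = balance + ? WHERE owner = ?",
                (amount, target),
            )
            if deposit.rowcount != 1:
                raise LookupError(f"no account for {target}")
        return True
    except (sqlite3.IntegrityError, LookupError):
        return False


ok = transfer(conn, "Bona", "Beshatu", 5000)
failed = transfer(conn, "Beshatu", "Nobody", 1000)  # deposit fails: rolled back
balances = dict(conn.execute("SELECT owner, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the withdrawal had already been applied when the deposit failed. The rollback restores Beshatu's balance, so the sum of the two accounts stays at 5000.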

Transaction Failure
o The DBMS is responsible for making sure that either:
o all operations in the transaction are completed successfully and the
changes are recorded permanently in the database, or
o the transaction has no effect on the database at all.
o The DBMS must not permit some operations of a transaction T to
be applied to the DB while others of T are not.
Transactions should be durable, but we cannot prevent all sorts of
transaction failures in the middle of execution, which may be due to:
• A computer failure (system crash), e.g., media failures
• A transaction or system error, e.g., a logical program error
• Local errors or exception conditions detected by the transaction,
e.g., no data for the transaction
• Concurrency control enforcement: the transaction is aborted by the
concurrency control method
• Disk failure
• Physical problems and catastrophes, e.g., power failure, fire,
overwritten disk

The Transaction Manager
• The transaction manager enforces the ACID properties.
• It schedules the operations of transactions.
• COMMIT and ROLLBACK are used to ensure atomicity.
• Locks or timestamps are used to ensure consistency and
isolation for concurrent transactions.
• A log is kept to ensure durability in the event of system
failure.

The Transaction Log
• Log: an ordered list of REDO/UNDO actions
• The system maintains the log to keep track of all transactions that
affect the database
• The transaction log records the details of all transactions:
  o Any changes the transaction makes to the database
  o How to undo these changes
  o When transactions complete and how
• It keeps information about the operations made by transactions
  o The log is kept on disk
  o It is affected only by disk or catastrophic failure
  o The log file keeps track of each transaction from its start to its
  completion
• Each log record has a unique Log Sequence Number (LSN).
• The log is stored on disk, not in memory
  o If the system crashes, it is preserved
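A minimal sketch of an undo log with log sequence numbers. The TransactionLog class and the write and undo helpers are hypothetical names. The log is kept in a Python list here, whereas a real DBMS writes it to disk.

```python
class TransactionLog:
    """Ordered list of log records, each with a unique LSN."""

    def __init__(self):
        self.records = []  # kept in memory only for this sketch

    def append(self, txn, action, item=None, old=None, new=None):
        lsn = len(self.records) + 1  # LSNs increase with every record
        self.records.append(
            {"lsn": lsn, "txn": txn, "action": action,
             "item": item, "old": old, "new": new}
        )
        return lsn


def write(db, log, txn, item, value):
    """Log the old and new value first, then change the database item."""
    log.append(txn, "write", item, old=db.get(item), new=value)
    db[item] = value


def undo(db, log, txn):
    """Roll back txn by restoring old values in reverse log order."""
    for record in reversed(list(log.records)):
        if record["txn"] == txn and record["action"] == "write":
            db[record["item"]] = record["old"]
    log.append(txn, "abort")


db = {"A": 100, "B": 50}
log = TransactionLog()
log.append("T1", "start")
write(db, log, "T1", "A", 50)
write(db, log, "T1", "B", 100)
undo(db, log, "T1")
lsns = [record["lsn"] for record in log.records]
print(db, lsns)
```

Because each write record stores the old value, the log alone is enough to undo the transaction.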

State of transaction
For recovery purposes, the system needs to keep track of which of the
following states a transaction is in:
• Active: the initial state; the transaction stays in this state while it
is executing.
• Partially committed: after the final statement has been executed.
• Failed: after the discovery that normal execution can no longer
proceed.
• Aborted: after the transaction has been rolled back and the
database has been restored to its state prior to the start of the
transaction.
• Committed: after successful completion.
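The five states can be sketched as a small state machine. The allowed transitions below follow the commonly drawn transaction state diagram and are an assumption here, since the notes list only the states. The advance function is a hypothetical helper.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"


# Assumed transitions, following the usual textbook state diagram.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: set(),    # terminal state
    State.COMMITTED: set(),  # terminal state
}


def advance(current, target):
    """Move to target if the transition is allowed, else raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = State.ACTIVE
state = advance(state, State.PARTIALLY_COMMITTED)
state = advance(state, State.COMMITTED)
print(state)
```

A committed transaction cannot be moved to any other state, which matches the durability property.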

[Figure: States of Transaction]

Cont…
The recovery manager keeps track of the following operations during
transaction processing:
• Begin_transaction: marks the beginning of transaction execution
• Read or write: specifies operations on database items that
are executed as part of a transaction
• End_transaction: specifies that the transaction's read and write
operations have ended and marks the end of execution
o At this point, either the changes can be committed or the
transaction has to be aborted
• Commit_transaction: signals a successful end of the transaction
• Rollback: signals an unsuccessful end; the transaction's changes are
undone

Concurrency Control and Concurrent Executions
• Transaction processing systems permit:
o multiple transactions to run concurrently
o multiple transactions to update data concurrently
• This causes:
o complications with the consistency of data
Why is concurrency allowed?
o Improved throughput of transactions and system resource
utilization
o Reduced waiting time of transactions

Scheduling Transactions
• A schedule is a sequence of operations from a set of transactions.
• It preserves the order of the operations within each
individual transaction.
Serial schedule:
• A schedule that does not interleave the actions of different
transactions.
• One transaction is executed completely before another transaction
starts.
• In a serial schedule, when the first transaction completes its cycle,
the next transaction is executed.
Non-serial schedule:
• If interleaving of operations is allowed, the result is a non-serial
schedule.
• There are many possible orders in which the system can execute the
individual operations of the transactions.

Cont…
Serializable schedule:
• Serializability is used to find non-serial schedules that allow
transactions to execute concurrently without interfering with one
another.
• It also identifies which schedules are correct when the operations of
the transactions are interleaved.
• A non-serial schedule is serializable if its result is equal to the
result of its transactions executed serially.
• A serializable schedule always leaves the database in a consistent
state.
Note: If each transaction preserves consistency, every serializable
schedule preserves consistency.
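The definition above can be tried on a small example: run a schedule, then compare its result with the result of every serial order. The transactions T1 and T2, the step format, and the function names are assumptions for illustration. The check uses one initial database state only, so it is a simplification of the tests a real DBMS uses.

```python
from itertools import permutations


def run(schedule, db):
    """Execute (txn, op, item, fn) steps; reads go to a per-transaction copy."""
    db = dict(db)
    local = {}
    for txn, op, item, fn in schedule:
        if op == "read":
            local[(txn, item)] = db[item]
        elif op == "update":
            local[(txn, item)] = fn(local[(txn, item)])
        elif op == "write":
            db[item] = local[(txn, item)]
    return db


def is_serializable(schedule, transactions, db):
    """True if the result equals the result of some serial order."""
    outcome = run(schedule, db)
    return any(
        outcome == run([step for txn in order for step in txn], db)
        for order in permutations(transactions)
    )


# T1 transfers 50 from A to B; T2 deposits 10 into A.
T1 = [
    ("T1", "read", "A", None), ("T1", "update", "A", lambda a: a - 50),
    ("T1", "write", "A", None),
    ("T1", "read", "B", None), ("T1", "update", "B", lambda b: b + 50),
    ("T1", "write", "B", None),
]
T2 = [
    ("T2", "read", "A", None), ("T2", "update", "A", lambda a: a + 10),
    ("T2", "write", "A", None),
]
db = {"A": 100, "B": 100}

# Interleaved, but T2 reads A only after T1 has written it.
good = T1[:3] + T2 + T1[3:]
# Interleaved so that T2 reads A before T1 writes it: T1's update is lost.
bad = [T1[0], T2[0], T1[1], T1[2], T2[1], T2[2]] + T1[3:]

good_ok = is_serializable(good, [T1, T2], db)
bad_ok = is_serializable(bad, [T1, T2], db)
print(good_ok, bad_ok, run(bad, db))
```

The second schedule ends with A = 110, a value that no serial order produces, so it is not serializable.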

[Figure: Serial Schedule]

[Figure: Non-serial schedule]
