ADB Notes 2021
04/02/2024 1
CHAPTER ONE
OBJECT ORIENTED DATABASE Concepts
Object Database Standards
Adhering to standards for a type of database system helps achieve:
portability: the ability to execute a particular application
program on different systems
interoperability: the ability of an application to access multiple
distinct systems
ODMG (Object Database Management Group)
The ODMG object model is the data model upon which the object
definition language (ODL) and object query language (OQL) are based.
It provides the data types, type constructors, and other concepts
that can be used in the ODL to specify object database schemas.
• An object-oriented database (OODB) is a persistent and sharable
collection of objects defined by an object-oriented data model (OODM).
Exercise
1. Describe Object-oriented data model (OODM)
2. Describe Object-oriented DBMS (OODBMS)
DBMS Languages
There are different types of DBMS languages:
a) Data Definition Language (DDL)
b) Data Manipulation Language (DML)
c) Data Control Language (DCL)
d) Transaction Control Language (TCL)
Data Definition Language (DDL)
o The data dictionary contains metadata (i.e., data about data):
  Database schema
  Integrity constraints
    • Domain constraints
    • Referential integrity
  Authorization
Data Manipulation Language (DML)
Language for accessing and manipulating the data organized by the
appropriate data model
Object Consistency
An OODBMS is often closely coupled with an object-oriented
programming language (OOPL).
The OOPL is used to specify the method implementations and
application code.
Objects created by a program may have different lifetimes:
a. Transient:
Memory is allocated and managed by the programming language
run-time system.
Transient objects exist only in the executing program and disappear
once the program terminates.
b. Persistent:
Memory is allocated and storage is managed by the ODBMS runtime
system.
Persistent objects are stored permanently in the database and persist
after program termination, so they can be accessed and shared by
multiple programs.
Persistence is the storage of data from working memory so that
it can be restored when the application is run again.
Persistent objects are accessed from the programming language.
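The difference between the two lifetimes can be sketched in Python. This is only an illustration, not an OODBMS: it assumes the standard-library `shelve` module as a stand-in for the persistent store, and the `Account` class, the key `acct:Bona`, and the two "program run" functions are hypothetical names chosen for the example.

```python
import os
import shelve
import tempfile


class Account:
    """A plain object; nothing in the class itself makes it persistent."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance


def run_program(db_path):
    """Simulate one program run that creates a transient and a persistent object."""
    transient = Account("temp", 10)        # lives only in this run's memory
    persistent = Account("Bona", 5000)     # its state is handed to the store
    with shelve.open(db_path) as db:       # shelve writes values to a file on disk
        db["acct:Bona"] = vars(persistent)
    # On return, `transient` is discarded; only the stored state survives.


def later_program(db_path):
    """Simulate a second run that restores whatever was made persistent."""
    with shelve.open(db_path) as db:
        restored = Account(**db["acct:Bona"])
        return sorted(db.keys()), restored.balance


db_path = os.path.join(tempfile.mkdtemp(), "objects")
run_program(db_path)
keys, balance = later_program(db_path)
print(keys, balance)
```

Only the object whose state was written to the store is visible to the second run; the transient one left no trace.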
CHAPTER TWO
Query Processing and Optimization
Query processing is the process by which query results are
retrieved from a high-level query, such as one written in SQL.
• A query expressed in a high-level query language must first
be scanned, parsed, and validated.
• The scanner identifies the language tokens, such as keywords,
attribute names, and relation names, in the text of the query.
• The parser checks the query syntax to determine whether it is
formulated according to the syntax rules of the query language.
• The query must also be validated by checking that all attribute and
relation names are valid and semantically meaningful names in the
schema of the database.
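The scanning and validation steps above can be sketched as follows. This is a toy illustration under stated assumptions: the keyword set, the token pattern, and the `SCHEMA` catalog (one `EMPLOYEE` relation) are all hypothetical, and the syntax-checking parser step is left out.

```python
import re

KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}
# Hypothetical catalog: relation name -> set of attribute names.
SCHEMA = {"EMPLOYEE": {"NAME", "SALARY", "DNO"}}

# A number, an identifier/keyword, or an operator, with leading spaces skipped.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(<=|>=|<>|[=<>,*]))")


def scan(query):
    """Scanner: split the raw query string into (kind, text) tokens."""
    tokens, pos = [], 0
    query = query.rstrip()
    while pos < len(query):
        m = TOKEN_RE.match(query, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        number, word, op = m.groups()
        if number:
            tokens.append(("NUMBER", number))
        elif word:
            kind = "KEYWORD" if word.upper() in KEYWORDS else "IDENT"
            tokens.append((kind, word.upper()))
        else:
            tokens.append(("OP", op))
        pos = m.end()
    return tokens


def validate(tokens):
    """Validator: every identifier must name a relation or attribute in SCHEMA."""
    known = set(SCHEMA) | set().union(*SCHEMA.values())
    unknown = [text for kind, text in tokens if kind == "IDENT" and text not in known]
    if unknown:
        raise NameError(f"unknown names: {unknown}")
    return True


tokens = scan("SELECT name FROM employee WHERE salary > 3000")
print(tokens)
print(validate(tokens))
```

A query that mentions a name missing from the catalog, such as `SELECT age FROM employee`, scans without error but is rejected by `validate`.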
Processing a Query
Typical steps in processing a high-level query
• The DBMS must then devise an execution strategy for
retrieving the result of the query from the database files.
• Query optimization is the process of choosing a suitable
execution strategy for processing a query.
• The query optimizer module has the task of producing an
execution plan, and the code generator generates the code to
execute that plan.
• The runtime database processor has the task of running the
query code, whether in compiled or interpreted mode, to
produce the query result.
• If a runtime error results, an error message is generated by the
runtime database processor.
Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
Parsing and Translating the Query
• This step converts the query into a form usable by the query
processing engine.
• High-level query languages such as SQL represent a query as a
string, or sequence, of tokens such as keywords, operators,
operands, and literal strings.
• The primary job of the parser is to extract the tokens from the
raw string of characters and translate them into the
corresponding internal data elements (i.e., relational algebra
operations and operands) and structures (i.e., query tree, query
graph).
• The last job of the parser is to verify the validity and syntax of the
original query string.
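As a rough sketch of what "translating into internal structures" can mean, the snippet below turns a single-relation SELECT-FROM-WHERE string into a small query tree of relational algebra nodes. The node classes, the regular expression, and the `pi[...]`/`sigma[...]` text rendering are assumptions made for illustration; a real parser uses a full grammar.

```python
import re
from dataclasses import dataclass


@dataclass
class Relation:            # leaf of the query tree
    name: str


@dataclass
class Select:              # relational algebra selection (sigma)
    condition: str
    child: object


@dataclass
class Project:             # relational algebra projection (pi)
    attributes: list
    child: object


PATTERN = re.compile(
    r"^\s*SELECT\s+(?P<attrs>.+?)\s+FROM\s+(?P<rel>\w+)"
    r"(?:\s+WHERE\s+(?P<cond>.+?))?\s*$",
    re.IGNORECASE,
)


def parse(query):
    """Build Project(Select(Relation)) from a simple one-relation query."""
    m = PATTERN.match(query)
    if m is None:
        raise SyntaxError("not a single-relation SELECT-FROM-WHERE query")
    tree = Relation(m.group("rel"))
    if m.group("cond"):
        tree = Select(m.group("cond"), tree)
    attrs = [a.strip() for a in m.group("attrs").split(",")]
    return Project(attrs, tree)


def to_algebra(node):
    """Render the query tree as a relational algebra expression string."""
    if isinstance(node, Relation):
        return node.name
    if isinstance(node, Select):
        return f"sigma[{node.condition}]({to_algebra(node.child)})"
    return f"pi[{', '.join(node.attributes)}]({to_algebra(node.child)})"


tree = parse("SELECT name, salary FROM employee WHERE salary > 3000")
print(to_algebra(tree))
```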
Optimizing the Query
• In this stage, the query processor applies rules to the internal data
structures of the query to transform these structures into equivalent,
but more efficient, representations.
• Selecting the proper rules to apply, when to apply them, and how
they are applied is the function of the query optimization engine.
Evaluating the Query
• This is the final step in processing a query. The best evaluation plan
candidate generated by the optimization engine is selected and then
executed.
• Note that there can be multiple methods of executing a query.
• Regardless of the method chosen, the actual results should be the
same.
• Finding the optimal strategy is usually too time-consuming except
for the simplest of queries, and may require information on how
the files are implemented.
• Hence, planning of an execution strategy may be a more
accurate description than query optimization.
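The point that different execution methods must return the same result, while differing in cost, can be illustrated with two hand-written plans for one query ("names of employees in the Research department"). The relations, their contents, and the use of intermediate result size as a cost proxy are assumptions made for this sketch.

```python
# Hypothetical relations: EMPLOYEE(name, dno, salary) and DEPARTMENT(dno, dname).
EMPLOYEE = [("Abebe", 5, 4000), ("Sara", 4, 3000), ("Kebede", 5, 2500), ("Hana", 1, 3500)]
DEPARTMENT = [(5, "Research"), (4, "Admin"), (1, "HQ")]


def plan_join_then_select():
    """Plan A: join everything first, then filter on the department name."""
    joined = [(e, d) for e in EMPLOYEE for d in DEPARTMENT if e[1] == d[0]]
    result = sorted(e[0] for e, d in joined if d[1] == "Research")
    return result, len(joined)          # second value: intermediate result size


def plan_select_then_join():
    """Plan B: filter DEPARTMENT first, then join only the surviving tuples."""
    research = [d for d in DEPARTMENT if d[1] == "Research"]
    result = sorted(e[0] for e in EMPLOYEE for d in research if e[1] == d[0])
    return result, len(research)


result_a, size_a = plan_join_then_select()
result_b, size_b = plan_select_then_join()
print(result_a, size_a)
print(result_b, size_b)
```

Both plans return the same names, but Plan B carries a smaller intermediate result into the join, which is the kind of difference an optimizer tries to exploit.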
Translating SQL Queries into Relational Algebra
• SQL queries are decomposed into query blocks, which form
the basic units that can be translated into algebraic operators
and optimized.
• A query block contains a single SELECT-FROM-WHERE
expression, as well as GROUP BY and HAVING clauses if these
are part of the block.
• Nested queries within a query are identified as
separate query blocks.
• The query optimizer then chooses an execution plan
for each block.
• In an uncorrelated nested query, the inner block needs to be
evaluated only once (for example, to produce a maximum salary),
and its result is then used as a constant by the outer block.
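A small sketch of the two query blocks, using Python's standard `sqlite3` module with an in-memory database. The `employee` table, its rows, and the department number 5 are made up for illustration; the point is only that the inner block can be run once and its result reused as a constant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER, dno INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Abebe", 4000, 5), ("Kebede", 2500, 5), ("Sara", 4500, 4), ("Hana", 3500, 1)],
)

# Inner block: does not refer to the outer query, so it is evaluated once.
(max_salary,) = conn.execute(
    "SELECT MAX(salary) FROM employee WHERE dno = 5"
).fetchone()

# Outer block: uses the inner block's result as a constant.
two_step = conn.execute(
    "SELECT name FROM employee WHERE salary > ? ORDER BY name", (max_salary,)
).fetchall()

# The same query written as one nested statement.
nested = conn.execute(
    "SELECT name FROM employee "
    "WHERE salary > (SELECT MAX(salary) FROM employee WHERE dno = 5) "
    "ORDER BY name"
).fetchall()

print(max_salary, two_step, nested)
```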
Implementing Basic Query Operations
An RDBMS must provide implementation(s) for all the required
operations, including the relational operators.
Internal Sorting
Each record contains a field called the key.
The keys can be placed in a linear order.
External Sorting
Refers to sorting algorithms that are suitable for large files of
records stored on disk that do not fit entirely in main memory, such
as most database files.
Sort-merge strategy
o It starts by sorting small subfiles (runs) of the main file and then
merges the sorted runs, creating larger sorted subfiles that are
merged in turn.
1. Sorting phase
2. Merging phase
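The two phases can be sketched in Python as below. This is a simplified in-memory simulation, not a disk-based implementation: the "file" is a list of records, `block_size` and `n_buffers` are hypothetical parameters, and one buffer is assumed to be reserved for output during merging.

```python
import heapq


def sort_phase(records, block_size, n_buffers):
    """Sorting phase: build sorted runs of at most n_buffers blocks each."""
    run_len = block_size * n_buffers
    return [sorted(records[i:i + run_len]) for i in range(0, len(records), run_len)]


def merge_phase(runs, n_buffers):
    """Merging phase: merge (n_buffers - 1) runs at a time until one run is left."""
    if n_buffers < 3:
        raise ValueError("need at least 3 buffers to merge 2 runs at a time")
    degree = n_buffers - 1              # one buffer is kept for the output
    passes = 0
    while len(runs) > 1:
        runs = [
            list(heapq.merge(*runs[i:i + degree]))
            for i in range(0, len(runs), degree)
        ]
        passes += 1
    return (runs[0] if runs else []), passes


def external_sort(records, block_size=2, n_buffers=3):
    runs = sort_phase(records, block_size, n_buffers)
    return merge_phase(runs, n_buffers)


records = [9, 3, 7, 1, 8, 2, 6, 4, 5, 0, 11, 10, 13, 12]
result, merge_passes = external_sort(records)
print(result, merge_passes)
```

With 14 records, 2 records per block, and 3 buffers, the sorting phase yields 3 runs, and merging 2 runs at a time needs 2 merge passes.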
Example 1: External Sorting Using Sort-Merge
1. Sorting phase
Number of file blocks: b
Number of available buffers: nB
Number of initial runs: $n_R = \lceil b / n_B \rceil$
2. Merging phase (one or more passes)
Degree of merging: the number of runs that are merged
together in each pass
Analysis of the algorithm
Number of file blocks = b
Number of initial runs = nR
Available buffer space = nB
Sorting phase: $n_R = \lceil b / n_B \rceil$
Degree of merging: $d_M = \min(n_B - 1,\ n_R)$
Number of merge passes: $n_P = \lceil \log_{d_M}(n_R) \rceil$
Number of block accesses: $(2 \cdot b) + (2 \cdot b \cdot n_P)$
To sort a file with N pages using B buffer pages:
Pass 0: use B buffer pages. Produce $\lceil N/B \rceil$ sorted runs of B pages each.
Pass 1, 2, …: merge B-1 runs at a time.
Number of passes = $1 + \lceil \log_{B-1} \lceil N/B \rceil \rceil$
Total cost of external sort-merge = 2N * (# of passes)
E.g., with 5 buffer pages, to sort a 108-page file:
Pass 0: $\lceil 108/5 \rceil$ = 22 sorted runs of 5 pages each (last run is
only 3 pages)
Pass 1: $\lceil 22/4 \rceil$ = 6 sorted runs of 20 pages each (last run is
only 8 pages)
Pass 2: $\lceil 6/4 \rceil$ = 2 sorted runs, 80 pages and 28 pages
Pass 3: $\lceil 2/4 \rceil$ = 1 sorted file of 108 pages
Number of passes = 4 (pass 0, pass 1, pass 2, and pass 3)
Total cost of external sort-merge = 2(108) * 4 = 864 page I/Os
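The worked example can be checked with a few lines of Python. The function name and its return shape are assumptions for this sketch; it simply counts runs pass by pass and applies the cost rule 2N * (# of passes).

```python
import math


def external_sort_cost(n_pages, n_buffers):
    """Return (runs after each pass, number of passes, total page I/Os)."""
    if n_buffers < 3:
        raise ValueError("need at least 3 buffer pages to merge")
    runs = math.ceil(n_pages / n_buffers)            # pass 0: initial sorted runs
    trace = [runs]
    while runs > 1:
        runs = math.ceil(runs / (n_buffers - 1))     # each later pass merges B-1 runs
        trace.append(runs)
    passes = len(trace)
    return trace, passes, 2 * n_pages * passes       # every pass reads and writes each page


trace, passes, cost = external_sort_cost(108, 5)
print(trace, passes, cost)
```

For N = 108 and B = 5 this reproduces the run counts 22, 6, 2, 1 over 4 passes and the total of 864. The same total follows from the block-access form 2b + 2b * nP with nP = 3 merge passes.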
Query Tree Optimization
Query optimization is categorized into two types:
1. Heuristic (rule based): the optimizer chooses execution plans based
on heuristically ranked operations. Heuristics are experience-based
techniques for problem solving, learning, and discovery.
2. Systematic (cost based): the optimizer examines alternative access
paths and operator algorithms and chooses the execution plan with
the lowest estimated cost.
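The contrast between the two approaches can be shown with a toy plan chooser. Everything here is hypothetical: the candidate plans, their estimated costs, and the single ranking rule "prefer an index over a full scan" are invented to show how a fixed rule and a cost estimate can disagree.

```python
# Hypothetical candidate plans with made-up estimated costs (block accesses).
PLANS = [
    {"name": "full table scan", "access": "scan", "est_cost": 1000},
    {"name": "index lookup on salary", "access": "index", "est_cost": 1200},
    {"name": "index lookup on dno", "access": "index", "est_cost": 40},
]

# Heuristic ranking: lower rank is preferred, regardless of statistics.
HEURISTIC_RANK = {"index": 0, "scan": 1}


def choose_heuristic(plans):
    """Rule based: take the first plan with the best-ranked access method."""
    return min(plans, key=lambda p: HEURISTIC_RANK[p["access"]])


def choose_cost_based(plans):
    """Cost based: compare estimated costs and take the cheapest plan."""
    return min(plans, key=lambda p: p["est_cost"])


heuristic_choice = choose_heuristic(PLANS)["name"]
cost_choice = choose_cost_based(PLANS)["name"]
print(heuristic_choice, "|", cost_choice)
```

In this made-up case the rule picks an index plan that is more expensive than a full scan, while the cost-based choice finds the cheapest plan at the price of having to estimate every alternative.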
Chapter 3
Transaction Processing Concepts
A transaction is an action, carried out by a single user or an
application program, which reads or updates the contents of a
database.
It is a logical unit of database processing that includes one or more
access operations (read: retrieval; write: insert, update, or delete).
An application program may contain several transactions separated
by Begin and End transaction boundaries.
If the database operations in a transaction do not update the
database, it is called a “read-only transaction”.
Transaction User
According to the number of users that can connect to the system
concurrently, a transaction system is classified as:
o Single-user: only one user can use the system at a time
o Multi-user: many users can use the system at the same time
Two modes of concurrency
o Interleaved processing: concurrent execution of processes is
interleaved on a single CPU
o Parallel processing: processes are executed concurrently on
multiple CPUs
Transaction properties
To ensure the integrity of data, the database system is required to
maintain the ACID properties of transactions:
a. Atomicity.
b. Consistency preservation.
c. Isolation.
d. Durability or permanency.
Atomicity
Either all operations of the transaction are reflected properly in
the database, or none are.
To ensure atomicity, if the transaction does not complete, the
DBMS restores the old values to make it appear as though the
transaction had never executed.
Consistency
For a transfer of funds from account A to account B, the consistency
requirement is that the sum of A and B be unchanged by the
execution of the transaction.
If the database is consistent before the execution of the
transaction, it remains consistent after the execution of
the transaction.
Isolation
Even though multiple transactions may execute concurrently, the
system guarantees that,
o for every pair of transactions Ti and Tj, it appears to Ti that either
Tj finished execution before Ti started, or Tj started execution
after Ti finished.
o (The execution of a transaction should not be interfered with by
other transactions executing concurrently.)
Example 1)
• Transfer £50 from account A to account B. The transaction is:
Read(A)
A = A - 50
Write(A)
Read(B)
B = B + 50
Write(B)
Atomicity - shouldn’t take money from A without giving it to B
Consistency - money isn’t lost or gained
Isolation - other queries shouldn’t see A or B change until completion
Durability - the money does not go back to A
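The transfer can be sketched with Python's standard `sqlite3` module, whose connection context manager commits on success and rolls back when an exception is raised. The `account` table, the starting balances (A = 100, B = 20), and the "insufficient funds" check are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()  # commit the starting balances before any transfer runs


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; undo everything on error."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            (bal,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if bal < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False


total_before = sum(balances(conn).values())
ok = transfer(conn, "A", "B", 50)        # succeeds: A = 50, B = 70
failed = transfer(conn, "A", "B", 500)   # fails midway: the debit of A is rolled back
print(ok, failed, balances(conn))
```

The failed transfer leaves both balances untouched (atomicity), and the sum of A and B is the same before and after (consistency).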
Example 2) Let Ti be a transaction that transfers $5000 from Bona’s
account (5000) to Beshatu’s account. The transaction can be
defined as follows:
Transaction Failure
o The DBMS is responsible for making sure that either:
o all operations in the transaction are completed successfully and the
changes are recorded permanently in the database, or
o the transaction has no effect on the database. The DBMS must not
permit some operations of a transaction T to be applied to the DB
while others of T are not.
Transactions should be durable, but we cannot prevent all sorts of
transaction failures in the middle of execution, due to:
A computer failure (system crash), e.g., media failures
A transaction or system error, e.g., a logical program error
Local errors or exception conditions detected by the transaction, e.g.,
no data found for the transaction
Concurrency control enforcement: the transaction is aborted by the
concurrency control method
Disk failure
Physical problems and catastrophes, e.g., power failure, fire,
overwriting of disks
The Transaction Manager
The transaction manager enforces the ACID properties
It schedules the operations of transactions
COMMIT and ROLLBACK are used to ensure atomicity
Locks or timestamps are used to ensure consistency and
isolation for concurrent transactions
A log is kept to ensure durability in the event of system
failure
The Transaction Log
Log: an ordered list of REDO/UNDO actions
The system maintains a log to keep track of all transactions that affect
the database
The transaction log records the details of all transactions:
Any changes the transaction makes to the database
How to undo these changes
When transactions complete and how
The log file keeps track of each transaction from its start to its
completion
Each log record has a unique Log Sequence Number (LSN).
The log is stored on disk, not in memory
If the system crashes, the log is preserved
It is affected only by disk or catastrophic failure
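A minimal sketch of such a log is shown below. It is kept in memory for brevity, whereas the notes stress that a real log lives on disk; the record layout (before and after values per write) and the helper names `write` and `undo` are assumptions for illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class LogRecord:
    lsn: int                 # unique Log Sequence Number
    txn: str                 # transaction identifier
    kind: str                # "BEGIN", "WRITE", "COMMIT" or "ABORT"
    item: str = None
    before: object = None    # old value, used for UNDO
    after: object = None     # new value, used for REDO


class Log:
    def __init__(self):
        self.records = []
        self._next_lsn = count(1)

    def append(self, txn, kind, item=None, before=None, after=None):
        rec = LogRecord(next(self._next_lsn), txn, kind, item, before, after)
        self.records.append(rec)
        return rec.lsn


def write(db, log, txn, item, value):
    """Log the change first, then apply it to the database."""
    log.append(txn, "WRITE", item, before=db.get(item), after=value)
    db[item] = value


def undo(db, log, txn):
    """Roll back txn by restoring before-values in reverse log order."""
    for rec in reversed(log.records):
        if rec.txn == txn and rec.kind == "WRITE":
            db[rec.item] = rec.before
    log.append(txn, "ABORT")


db = {"A": 100, "B": 20}
log = Log()
log.append("T1", "BEGIN")
write(db, log, "T1", "A", 50)
write(db, log, "T1", "B", 70)
undo(db, log, "T1")
print(db, [(r.lsn, r.kind) for r in log.records])
```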
State of transaction
For recovery purposes, the system needs to keep track of which state
each transaction is in:
Active: the initial state; the transaction stays in this state while it is
executing.
Partially committed: after the final statement has been executed.
Failed: after the discovery that normal execution can no longer
proceed.
Aborted: after the transaction has been rolled back and the
database has been restored to its state prior to the start of the
transaction.
Committed: after successful completion.
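The states and the moves between them can be written as a small state machine. The transition table below follows the usual textbook state diagram and is an assumption of this sketch, as are the class and method names.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"


# Allowed moves, following the usual transaction state diagram.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),     # terminal state
    State.ABORTED: set(),       # terminal state
}


class Transaction:
    def __init__(self):
        self.state = State.ACTIVE           # every transaction starts as active

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal move: {self.state.value} -> {new_state.value}")
        self.state = new_state


t_ok = Transaction()
t_ok.move_to(State.PARTIALLY_COMMITTED)
t_ok.move_to(State.COMMITTED)

t_bad = Transaction()
t_bad.move_to(State.FAILED)
t_bad.move_to(State.ABORTED)
print(t_ok.state.value, t_bad.state.value)
```

A committed transaction cannot move to any other state in this model, which matches the idea that committed changes are permanent.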
States of Transaction
The recovery manager keeps track of the following operations during
transaction processing:
Begin_transaction: marks the beginning of transaction execution
Read or write: specifies operations on database items that are
executed as part of the transaction
End_transaction: specifies that the read and write operations have
ended and marks the end of transaction execution
o At this point, either the changes can be committed or the
transaction has to be aborted
Commit_transaction: signals a successful end
Rollback: signals an unsuccessful end (changes are undone)
Concurrency Control and Concurrent Executions
Transaction processing permits:
Multiple transactions to run concurrently
Multiple transactions to update data concurrently
This causes:
• Complications with the consistency of data
Why is concurrency allowed?
Improved throughput of transactions and system resource
utilization
Reduced waiting time of transactions
Scheduling Transactions
• A schedule is a sequence of operations from a set of transactions
• It preserves the order of the operations within each individual
transaction
Serial schedule:
A schedule that does not interleave the actions of different
transactions.
One transaction is executed completely before another transaction
starts.
Non-serial schedule:
If interleaving of operations is allowed, the result is a non-serial
schedule.
There are many possible orders in which the system can execute the
individual operations of the transactions.
Serializable schedule:
Serializability is used to find non-serial schedules that allow
transactions to execute concurrently without interfering with one
another.
It identifies which schedules are correct when the operations of the
transactions are interleaved.
A non-serial schedule is serializable if its result is equal to the
result of its transactions executed serially.
Note: if each transaction preserves consistency, every serializable
schedule preserves consistency, so a serializable schedule always
leaves the database in a consistent state.
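The result-based definition above can be tried out on two small transactions. In this sketch a schedule is a list of (transaction, step) pairs; the step encoding ("R" read, "C" compute, "W" write), the transactions T1 and T2, and the starting values are all assumptions. Comparing results on one starting state is only an illustration and is weaker than a formal serializability test.

```python
from itertools import permutations

# T1 moves 50 from A to B; T2 withdraws 10 from A.
T1 = [("R", "A"), ("C", "A", lambda v: v - 50), ("W", "A"),
      ("R", "B"), ("C", "B", lambda v: v + 50), ("W", "B")]
T2 = [("R", "A"), ("C", "A", lambda v: v - 10), ("W", "A")]
TXNS = {"T1": T1, "T2": T2}


def run(schedule, db):
    """Execute a schedule on a copy of db, with one private buffer per transaction."""
    db, local = dict(db), {}
    for txn, step in schedule:
        kind, item = step[0], step[1]
        buf = local.setdefault(txn, {})
        if kind == "R":
            buf[item] = db[item]
        elif kind == "C":
            buf[item] = step[2](buf[item])
        else:
            db[item] = buf[item]
    return db


def serial_schedules(txns):
    for order in permutations(txns):
        yield [(name, step) for name in order for step in txns[name]]


def matches_some_serial_result(schedule, txns, db):
    outcome = run(schedule, db)
    return any(run(s, db) == outcome for s in serial_schedules(txns))


db = {"A": 100, "B": 20}
# T2 runs between T1's work on A and T1's work on B.
good = [("T1", s) for s in T1[:3]] + [("T2", s) for s in T2] + [("T1", s) for s in T1[3:]]
# T2 reads A before T1 writes it, so T1's update of A is lost.
bad = ([("T1", T1[0]), ("T2", T2[0]), ("T1", T1[1]), ("T1", T1[2]),
        ("T2", T2[1]), ("T2", T2[2])] + [("T1", s) for s in T1[3:]])

print(run(good, db), matches_some_serial_result(good, TXNS, db))
print(run(bad, db), matches_some_serial_result(bad, TXNS, db))
```

The first interleaving gives the same final values as running T1 and then T2; the second loses T1's update to A and matches no serial order.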
Serial Schedule
Non-serial schedule