Database System
May 2024
Addis Ababa,
Ethiopia
Contents: Fundamentals of Database

Chapter One – Introduction to Database Systems
    1.1 Introduction
    1.2 Database System versus File System
    1.3 Characteristics of the Database Approach
    1.4 Actors on the Scene
Chapter Two – Database System Architecture
    2.1 Data Models
        Types of Data Models
    2.2 Schemas and Instances
    2.3 Three-Schema Architecture and Data Independence
    2.4 Data Independence
Chapter Three – Database Modeling
    3.1 Database Modeling
    3.2 Phases of Database Design
        Requirement Analysis
        Conceptual Database Design
        Logical Database Design (Data Model Mapping)
        Physical Design
    3.3 ERD (Entity Relationship Diagram)
        Entity and Attribute
        Types of Attributes
    3.4 Mapping ER Models to Relational Tables
    3.5 Enhanced Entity Relationship (EER) Model
    3.6 The Relational Database Model
    3.7 Relational Model Concepts in DBMS
    3.8 Relational Constraints
Chapter Four – Functional Dependency and Normalization
    4.1 Functional Dependency
    4.2 Types of Functional Dependency
    4.3 Database Anomalies
    4.4 Normalization
        Normal Forms
Chapter Five – Record Storage and Primary File Organization
    5.1 Introduction
    5.2 Choosing Attribute Data Types
    5.3 Data/File Storage: Categories
    5.4 File Organizations
    5.5 Index
        Types of Indexes
Chapter Six – The Relational Algebra and Relational Calculus
    6.1 Relational Algebra
    6.2 Relational Calculus
        Tuple Relational Calculus
Chapter Seven – SQL
    7.1 SQL
        7.1.1 DDL (Data Definition Language)
        7.1.2 DML (Data Manipulation Language)
    2.3 Backup and Recovery
Chapter 3: Transaction Processing Concepts
    3.1 Introduction to Transaction
    3.2 Transaction and System Concepts
    3.3 Transaction Processing
    3.4 Concept of Schedules and Serializability
    3.5 Transaction Support in SQL
Chapter 4: Concurrency Controlling Techniques
    4.1 Database Concurrency Control
    4.2 Concurrency Control Techniques
        4.2.1 Locking
        4.2.2 Two-Phase Locking Techniques: The Algorithm
        4.2.3 Timestamp-Based Concurrency Control Algorithm
        4.2.4 Multiversion Concurrency Control Techniques
        4.2.5 Validation (Optimistic) Concurrency Control Schemes
        4.2.6 Multiple Granularity Locking
Chapter 5: Database Recovery Techniques
    5.1 Database Recovery
    5.2 Transaction and Recovery
Chapter 6: Distributed Databases and Client-Server Architectures
    6.1 Distributed Database Concepts
    6.2 Data Replication and Fragmentation: Distributed Data Storage
    6.3 Types of Distributed Database Systems
    6.4 Query Processing in Distributed Databases
    6.5 Concurrency Control and Recovery
        6.5.1 Distributed Concurrency Control
    6.6 Client-Server Database Architecture
Chapter One
A database can be of any size and complexity. A personal database is designed for use by a single
person on a single computer. Such a database usually has a rather simple structure and a relatively
small size. An enterprise database can be huge. Enterprise databases may model the information
flow of entire large organizations.
A Database Management System (DBMS) is then a tool for creating and managing databases. A
DBMS is software that facilitates the processes of defining, constructing, manipulating, and
sharing databases. Examples: MySQL Server, MS Access, Microsoft SQL Server.
Data abstraction and insulation between programs and data - Data abstraction is the process of
hiding unwanted or irrelevant details from the end user. It provides different views of the data and
helps in achieving data independence, which in turn enhances the security of the data.
Support of multiple views of the data - A view may be a subset of the database or it may contain
virtual data that is derived from the database files but is not explicitly stored.
Sharing of data and multiuser transaction processing - A multiuser DBMS must allow multiple
users to access the database at the same time. This is essential if data for multiple applications is
to be integrated and maintained in a single database.
Chapter Two – Database System Architecture
2.1 Data models
A data model is a collection of concepts that can be used to describe the structure of a database and
provides the necessary means to achieve data abstraction. By structure of a database we mean the
data types, relationships, and constraints that apply to the data. Most data models also include a
set of basic operations for specifying retrievals and updates on the database.
2. The conceptual level has a conceptual schema, which describes the structure of the whole
database for users. The conceptual schema hides the details of physical storage structures and
concentrates on describing entities, data types, relationships, user operations, and constraints.
3. The external or view level includes a number of external schemas or user views. Each external
schema describes the part of the database that a particular user group is interested in and hides the
rest of the database from that user group.
Chapter Three – Database Modeling
3.1 Database Modeling
High-level conceptual data models provide concepts for presenting data in ways that are close to
the way people perceive data. A typical example is the entity relationship model, which uses main
concepts like entities, attributes and relationships.
3.2 Phases of database design
The phases of database development life cycle (DDLC) in the Database Management System
(DBMS) are explained below.
Requirement Analysis
The most important step in implementing a database system is to find out what is needed, i.e., what
type of database is required for the business organization, the daily volume of data, how much data
needs to be stored in the master files, and so on. In order to collect all this information, a database
analyst spends a lot of time within the business organization talking to people and end users and
getting acquainted with the day-to-day processes.
Physical Design
Physical design or Database implementation needs the formation of special storage related
constructs. These constructs consist of storage groups, table spaces, data files, tables etc.
Entity and Attribute
An entity may be an object with a physical existence (for example, a particular person, car, house,
or employee) or it may be an object with a conceptual existence (for instance, a company, a job,
or a university course). In ER diagrams, an entity is placed in a rectangle.
Each entity has attributes, the particular properties that describe it. Entity types that do not have key
attributes of their own are called weak entity types. The regular entity types that have a key attribute are
called strong entity types. Weak entities are identified by being related to specific strong entities
in combination with one of their attribute values. A weak entity type has a partial key, which is
the attribute that can uniquely identify weak entities that are related to the same owner entity.
In ER diagrams, both a weak entity type and its identifying relationship are distinguished by
surrounding their boxes and diamonds with double lines. The partial key attribute is underlined
with a dashed or dotted line.
Types of Attributes
Composite and Simple (Atomic) Attributes
Composite attributes can be divided into smaller subparts, which represent more basic attributes
with independent meanings. Attributes that are not divisible are called simple or atomic attributes.
Multivalued Attributes
Most attributes have a single value for a particular entity; such attributes are called single-valued.
For example, Age is a single-valued attribute of a person. In some cases an attribute can have a set
of values for the same entity. For instance, a College_degrees attribute for a person. Similarly, one
person may not have any college degrees, another person may have one, and a third person may
have two or more degrees; therefore, different people can have different numbers of values for the
College_degrees attribute. Such attributes are called multivalued.
Derived Attributes
In some cases, two (or more) attribute values are related. For example, the Age and Birth_date
attributes of a person. For a particular person entity, the value of Age can be determined from the
current (today's) date and the value of that person's Birth_date. The Age attribute is hence called
a derived attribute and is said to be derivable from the Birth_date attribute, which is called a
stored attribute.
Key Attribute
An important constraint on the entities of an entity type is the key or uniqueness constraint on
attributes. An entity type usually has one or more attributes whose values are distinct for each
individual entity in the entity set. Such an attribute is called a key attribute, and its values can be
used to identify each entity uniquely.
Key
Super Key − A set of attributes (one or more) that collectively identifies an entity in an entity set.
Candidate Key − A minimal super key is called a candidate key.
An entity set may have more than one candidate key.
Primary Key − A primary key is one of the candidate keys chosen by the database designer to
uniquely identify the entity set.
Relationship
A relationship type represents an association between entity types. For example, 'Enrolled in' is
a relationship type that exists between the entities Student and Course. In an ER diagram, a relationship
is represented by a diamond connected to the participating entities with lines.
Degree of a relationship
The number of different entity sets participating in a relationship set is called the degree of the
relationship set.
Unary relationship - When only ONE entity set participates in a relationship, the relationship is
called a unary relationship. For example, an employee can manage another employee.
Binary relationship - When TWO entity sets participate in a relationship, the relationship is
called a binary relationship. For example, a Student is enrolled in a Course.
If three entity sets participate, the relationship is called a ternary (degree 3) relationship.
Cardinality
The number of times an entity of an entity set participates in a relationship set is known as
cardinality. Cardinality can be of different types:
One to one (1-1) – When each entity in each entity set can take part only once in the relationship,
the cardinality is one to one.
One to many (1-M) – When each entity in one entity set can take part only once in the relationship set
and entities in the other entity set can take part more than once in the relationship set, the cardinality is
one to many.
Many to many (M-M) – When entities in all participating entity sets can take part more than once in the
relationship, the cardinality is many to many.
The participation constraint specifies whether the existence of an entity depends on its being
related to another entity via the relationship type. This constraint specifies the minimum number
of relationship instances that each entity can participate in and is sometimes called the minimum
cardinality constraint. There are two types of participation constraints, total and partial.
Enhanced ER (EER) model is a database model that incorporates the concepts of class/subclass
relationships and type inheritance into the ER model. The EER model includes all the modeling
concepts of the ER model. In addition, it includes the concepts of subclass and superclass,
categories (UNION types), attribute and relationship inheritance. EER is created to design more
accurate database schemas.
Subclasses and superclasses can be thought of as useful ways to describe similarities in the entities
represented. Because subclasses inherit their attributes and relationships from the superclass, the
database designer can represent complex, hierarchical interdependencies (inheritance) within the
system being described.
Union
A category represents a single superclass/subclass relationship with more than one superclass.
Participation in a category can be total or partial.
Tables – In the relational model, relations are stored in table format, together with their
entities. A table has two components, rows and columns: rows represent records (tuples) and
columns represent attributes.
Column: The column represents the set of values for a specific attribute.
Relation Schema: A relation schema represents the name of the relation with its attributes
(columns).
Properties of Relations
Domain Constraints - Every domain (cell in the table or attribute value) must contain atomic
values (smallest indivisible units). We perform data type check here, which means we need to
assign a data type to a column and limit the values that it can contain.
Key Constraints - These are also called uniqueness constraints, since they ensure that every tuple in the
relation is unique. An attribute that can uniquely identify a tuple in a relation is called the
key or primary key of the table. The value of the attribute for different tuples in the relation has to
be unique. Primary keys must be unique and cannot be null.
Referential Integrity Constraints - A referential integrity constraint is specified between two
relations (tables) and is used to maintain the consistency among the tuples of the two relations.
Referential integrity constraints in DBMS are based on the concept of foreign keys. A foreign
key is an attribute (or set of attributes) of a relation that refers to a key attribute of a different (or the
same) relation. A referential integrity constraint requires that the value of a foreign key either matches
an existing key value in the referenced relation or is null.
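As an illustration, the three kinds of constraints can be declared in SQL roughly as follows (a sketch only; the table and column names are hypothetical, not taken from this text):

CREATE TABLE Department (
  DeptId    INT          PRIMARY KEY,            -- key constraint: unique and not null
  DeptName  VARCHAR(40)  NOT NULL
);

CREATE TABLE Employee (
  EmpId     INT           PRIMARY KEY,           -- key constraint
  Name      VARCHAR(40)   NOT NULL,
  Salary    DECIMAL(10,2) CHECK (Salary >= 0),   -- domain constraint on the attribute values
  DeptId    INT,
  FOREIGN KEY (DeptId) REFERENCES Department(DeptId)  -- referential integrity constraint
);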
Chapter Four
Let {A, B} be the key and C a non-key attribute. If {A, B} → C holds and, in addition, A → C or
B → C holds, we say C is partially functionally dependent on {A, B}.
Let {A, B} be the key and C a non-key attribute. If {A, B} → C holds but neither A → C nor B → C
holds (i.e., neither A alone nor B alone can determine C), then C is fully functionally
dependent on {A, B}.
A functional dependency is said to be transitive if it is formed indirectly through two functional
dependencies: if A functionally determines B, and B functionally determines C, then A functionally
determines C, provided that neither C nor B determines A.
Insertion Anomaly - occurs when the entire primary key is not known and the database cannot
insert a new record properly. This would violate entity integrity. It can be avoided by using a
sequence number for the primary key.
Deletion Anomaly - happens when a record is deleted that results in the loss of other related data.
It can be avoided by normalizing tables in a database to minimize their dependency.
Update Anomaly - occurs when a change to one attribute value causes the database to either
contain inconsistent data or causes multiple records to need changing. It may be prevented by
making sure tables are in third normal form.
4.4 Normalization
Normalization is a process of organizing the data in a database to avoid data redundancy, insertion
anomaly, update anomaly & deletion anomaly. It is a technique of organizing the data in the
database. It is a systematic approach of decomposing tables to eliminate data redundancy
(repetition) and undesirable characteristics like Insertion, Update and Deletion Anomalies. It is a
multi-step process that puts data into tabular form, removing duplicated data from the relation
tables.
Normal forms
Redundancy in a relation may cause insertion, deletion, and update anomalies. Normalization is a
database design technique that minimizes redundancy in a relation or a set of relations.
Normalization rules divide larger tables into smaller tables and link them using relationships.
Normal forms are rules that are used to eliminate or reduce redundancy in
database relations. These normal forms are 1NF (First Normal Form), 2NF (Second Normal Form)
and 3NF (Third Normal Form).
First normal form (1NF)
A relation (table) R is in 1NF if and only if (iff) all underlying domains of attributes contain only
atomic (simple or indivisible) values, i.e., the value of any attribute in a tuple (row) must be a
single value from the domain of that attribute. 1NF disallows multivalued attributes,
composite attributes, and their combinations. In other words, 1NF disallows relations within
relations or relations as attribute values within tuples. Repeating groups (values) in a single field
must be removed; the repeating-group attributes, together with the primary key, are moved into a new
table. When a relation contains no repeating groups (values), it is in first normal form (1NF).
Second normal form (2NF) is based on the concept of full functional dependency. A relation
schema R is in 2NF if it is in 1NF and every non-prime attribute A in R is fully functionally
dependent on the primary key (i.e. the non-prime attributes are not partially dependent on key
attributes). Remove any partially dependent attributes and place them in another relation. A partial
dependency is when the attribute(s) is/are dependent on a part of a key.
Example: Consider a relation schema that stores Employees and Teams in a single relation as follows
Emp-Teams(EmpId, Name, BDate, Gender, TeamId, Project, TeamName)
Then upon decomposition we will have
EmpId → {Name, BDate, Gender}
TeamId → {Project, TeamName}
Employees(EmpId, Name, BDate, Gender)
Teams(TeamId, Project, TeamName)
Emp-Teams(EmpId, TeamId)
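A hedged SQL sketch of this decomposition follows; only the relation and attribute names come from the example above, while the data types (and the underscore in Emp_Teams, needed for a valid identifier) are assumptions added for illustration:

CREATE TABLE Employees (
  EmpId   INT PRIMARY KEY,       -- EmpId -> {Name, BDate, Gender}
  Name    VARCHAR(40),
  BDate   DATE,
  Gender  CHAR(1)
);

CREATE TABLE Teams (
  TeamId    INT PRIMARY KEY,     -- TeamId -> {Project, TeamName}
  Project   VARCHAR(40),
  TeamName  VARCHAR(40)
);

CREATE TABLE Emp_Teams (
  EmpId   INT REFERENCES Employees(EmpId),
  TeamId  INT REFERENCES Teams(TeamId),
  PRIMARY KEY (EmpId, TeamId)    -- no non-prime attribute depends on only part of this key
);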
A relation schema R is in third normal form (3NF) if it is in 2NF and every non-prime attribute of
R is fully functionally dependent on every key of R and non-transitively dependent on every
key of R.
Chapter Five
Record Storage and Primary File Organization
5.1 Introduction
To translate the logical description of data into technical specifications for storing and retrieving
data, a physical database design is needed. The goal is to create a design for storing and retrieving
data that will provide adequate performance and ensure database integrity, security, and
availability.
Several decisions need to be made during physical database design. They concern attribute data
types, data/file storage, file organizations, and indexes.
BLOB – binary large object (good for graphics images, audio, sound clips, etc.)
Exact
SMALLINT → 16 bit
INT → 32 bit
LONG → 64 bit
DECIMAL → 128 bit ...
Approximate
FLOAT, REAL, DOUBLE PRECISION (floating-point types)
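For illustration, a table definition choosing among these types might look as follows (a sketch; the table and column names are hypothetical, and the exact type names available vary by DBMS):

CREATE TABLE MediaItem (
  ItemId      INT           PRIMARY KEY,   -- exact numeric, 32 bit
  Quantity    SMALLINT,                    -- exact numeric, 16 bit
  UnitPrice   DECIMAL(10,2),               -- exact numeric with fixed precision and scale
  Rating      REAL,                        -- approximate (floating-point) numeric
  CoverImage  BLOB                         -- binary large object for images, audio, etc.
);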
5.3 Data/File Storage: Categories
Three main storage categories: Primary storage, Secondary storage, and Tertiary storage.
Primary storage - includes storage media that store data that can be operated on directly by
the computer's central processing unit (CPU). Primary storage includes the computer's main
memory (RAM) and smaller but faster cache memories.
Primary storage usually provides fast read and write access to data, but it has limited storage
capacity and is more expensive. The contents of main memory are lost when a power failure,
a system crash, or another issue occurs.
Secondary storage - includes large storage devices such as computer hard disk (HDD). These
devices usually have a larger capacity and lower cost, but provide slower access to data than primary
storage devices.
Data in secondary storage cannot be processed directly by the CPU; it must first be copied into
primary storage. Such devices are called online storage devices because they can be accessed in a
short period of time whenever needed.
Tertiary storage - optical disks (CD-ROMs, DVDs, and other similar storage media) and
magnetic tapes, which are removable media, are used in today's systems as offline storage for
archiving databases.
Heap file (unordered file) - places the records on disk in no particular order → by appending
new records at the end of the file.
Sorted file (sequential file) - keeps the records ordered by the value of a particular field (called
the sorting key).
Hashed file - uses a hash function applied to a particular field (called the hash key) to
determine a record‘s placement on disk.
A secondary organization or auxiliary access structure allows efficient access to and storage of file
records based on alternate fields, other than those that have been used for the primary file
organization.
5.5 Index
Indexes are additional auxiliary access structures which are used to speed up the retrieval of
records in response to certain search conditions. Index structures are additional files, which
provide alternative ways to access the records without affecting the physical placement of
records on disk.
Types of Indexes
Dense Index
An index entry is created for every search key value (for each record) in each block. This
index contains the search key value and a pointer to the actual record. It has a large index size but
needs less time to locate arbitrary data.
Sparse Index
One index entry is created for each block; index entries exist only for some of the data records. It has
a small index size but needs more time to locate arbitrary data. Records must be clustered or
arranged in blocks.
Here it is assumed that a file already exists with some primary organization like unordered,
ordered, or hashed organizations. For a file with a given record structure consisting of several
fields, an index access structure is usually defined on a single field of a file, called an indexing
field (or indexing attribute). Types of Single Level Ordered Indexes are Primary, Clustering
and Secondary.
Primary Index
A primary index is specified on the ordering key field of an ordered file of records. An ordering
key field is used to physically order the file records on disk, and every record has a unique
value for that field. Primary index is an ordered file whose records are of fixed length with two
fields. The first field is of the same data type as the ordering key field called the PK of the data
file, and the second field is a pointer to a disk block (block address).
There is one index entry (or index record) in the index file for each block in the data file. Each
index entry has the value of the primary key field for the first record in a block and a pointer
to that block as its two field values. It is sparse.
The index file for a primary index occupies a much smaller space than the data file does,
because there are fewer index entries than the records in the data file and each index entry is
typically smaller in size than a data record because it has only two fields, both of which tend
to be short in size. Locating a record by binary search on the index requires approximately log2(bi) + 1
block accesses, where bi is the number of index blocks (the extra access retrieves the data block itself).
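As a worked illustration (the figures are hypothetical, chosen only to show the effect): suppose a data file occupies b = 3,000 blocks and its primary index fits in bi = 45 blocks.
Binary search on the data file itself would need about log2(3,000), roughly 12 block accesses.
Binary search on the index needs about log2(45), roughly 6 block accesses, plus 1 access to fetch the data block,
so log2(bi) + 1 gives roughly 7 block accesses in total, less than two thirds of the cost without the index.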
Clustering Indexes
Here the file is assumed to be ordered on a non-key field. If file records are physically ordered
on a non-key field which does not have a distinct value for each record, that field is called the
clustering field and the data file is called a clustered file.
We can create a different type of index, called a clustering index, to speed up retrieval of all
records that have the same value for the clustering field. If the ordering field is not a key field,
i.e., if numerous records in the file can have the same value for the ordering field, a clustering
index can be used.
A clustering index is also an ordered file with two fields. The first field is of the same type as
the clustering field of the data file, and the second field is a disk block/cluster pointer.
There is one entry in the clustering index for each distinct value of the clustering field, and it
contains the value and a pointer to each block in the data file that has a record with that value
for its clustering field. It is a non-dense indexing mechanism.
Secondary index
Provides a secondary means of accessing a data file for which some primary access already
exists. The data file records could be ordered, unordered, or hashed. Secondary index may be
created on a field that is a candidate key and has a unique value in every record, or on a non-
key field with duplicate values.
A data file can have several secondary indexes in addition to its primary access method. The
index is again an ordered file with two fields. The first field is of the same data type as some
non-ordering field of the data file that is an indexing field. The second field is either a block
or a record pointer.
Many secondary indexes can be created for the same file; each represents an additional means
of accessing that file based on some specific field.
Because a secondary index has one entry for every record in the data file, such an index is dense. A
secondary index usually needs more storage space and a longer search time than a primary index,
because of its larger number of entries. However, the improvement in search time for an arbitrary
record is much greater for a secondary index than for a primary index, since we would otherwise
have to do a linear search on the data file.
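In SQL, secondary indexes are typically created explicitly. A sketch follows (the table and column names are hypothetical, and syntax details vary slightly by DBMS):

-- an index on the primary key is usually created automatically for the PRIMARY KEY;
-- a secondary index on a non-ordering field must be requested explicitly:
CREATE INDEX idx_employee_lname ON Employee (LName);

-- a secondary index on a candidate key can be declared unique:
CREATE UNIQUE INDEX idx_employee_ssn ON Employee (SSN);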
Chapter Six
Basic operations:
Intersection (∩) - Only tuples in relation 1 and in relation 2 (only the common tuples).
In a calculus expression, there is no order of operations to specify how to retrieve the query
result. A calculus expression specifies only what information the result should contain. This is
the main distinguishing feature between relational algebra and relational calculus. Relational
calculus is considered to be a nonprocedural language. This differs from relational algebra,
where we must write a sequence of operations to specify a retrieval request; hence relational
algebra can be considered as a procedural way of stating a query.
Tuple Relational Calculus
The tuple relational calculus is based on specifying a number of tuple variables. Each tuple
variable usually ranges over a particular database relation, meaning that the variable may take
as its value any individual tuple from that relation.
{t | COND(t)} where t is a tuple variable and COND (t) is a conditional expression involving
t. The result of such a query is the set of all tuples t that satisfy COND (t).
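For example (using a hypothetical EMPLOYEE relation with a Salary attribute, introduced here only for illustration), the query "retrieve all employees whose salary is above 30,000" can be written as:

{t | EMPLOYEE(t) AND t.Salary > 30000}

Here EMPLOYEE(t) specifies the range relation of the tuple variable t, and the condition t.Salary > 30000 selects the tuples to be returned.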
Chapter Seven - SQL
7.1 SQL
SQL stands for Structured Query Language. SQL is a comprehensive database language: it
has statements for data definition, query, and update. Hence, it is both a DDL (Data Definition
Language) and a DML (Data Manipulation Language).
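A brief sketch of both roles (the table and the data values are hypothetical, used only for illustration):

-- DDL: define a schema object
CREATE TABLE Student (
  StudId  INT          PRIMARY KEY,
  Name    VARCHAR(40)  NOT NULL,
  Dept    VARCHAR(20)
);

-- DML: manipulate and query the data
INSERT INTO Student (StudId, Name, Dept) VALUES (1, 'Abebe', 'CS');
UPDATE Student SET Dept = 'IS' WHERE StudId = 1;
SELECT Name, Dept FROM Student WHERE Dept = 'IS';
DELETE FROM Student WHERE StudId = 1;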
Chapter 1: Query Processing and Optimization
1.5 Query Processing and Optimization: Why?
What is Query Processing?
Steps required to transform a high-level SQL query into a correct and "efficient" strategy
for execution and retrieval.
Example:
Identify all managers who work in a London branch
SELECT *
FROM Staff s, Branch b
WHERE s.branchNo = b.branchNo AND s.position = 'Manager' AND
b.city = 'London';
– No indexes or sort keys
– All temporary results are written back to disk (memory is small)
– Tuples are accessed one at a time (not in blocks)
Query 1 (Bad)
σ(position='Manager') ∧ (city='London') ∧ (Staff.branchNo=Branch.branchNo) (Staff × Branch)
◦ Requires (1000+50) disk accesses to read from Staff and Branch relations
◦ Creates temporary relation of Cartesian Product (1000*50) tuples
◦ Requires (1000*50) disk access to read in temporary relation and test predicate
Total Work = (1000+50) + 2*(1000*50) = 101,050 I/O operations
Query 2 (Better)
– Again requires (1000+50) disk accesses to read from Staff and Branch
– Joins Staff and Branch on branchNo with 1000 tuples
(1 employee : 1 branch )
– Requires (1000) disk access to read in joined relation and check predicate
Total Work = (1000+50) + 2*(1000) =
3050 I/O operations
Query 3 (Best)
◦ Read Staff relation to determine 'Manager' staff (1000 reads)
🞄 Create 50 tuple relation (50 writes)
◦ Read Branch relation to determine 'London' branches (50 reads)
🞄 Create 5 tuple relation (5 writes)
◦ Join reduced relations and check predicate (50 + 5 reads)
Total Work = 1000 + 2*(50) + 5 + (50 + 5) =
1160 I/O operations
Roughly 8700% improvement over Query 1
1.6 Query Processing Steps
i. Analysis: lexical and syntactical analysis of the query (correctness) based on attributes,
data types, etc. A query tree will be built for the query, containing leaf nodes for the base relations,
one or more non-leaf nodes for relations produced by relational algebra operations, and a
root node for the result of the query. The sequence of operations is from the leaves to the
root. (Example: SELECT * FROM Catalog c, Author a WHERE a.authorid = c.authorid AND
c.price > 200 AND a.country = 'USA'.)
– Key activities
• Verify operations
ii. Normalization: convert the query into a normalized form. The WHERE predicate will be
converted to conjunctive (∧) or disjunctive (∨) normal form.
iii. Semantic Analysis: to reject normalized queries that are not correctly formulated or are
contradictory. A query is incorrect if its components do not contribute to generating the result.
It is contradictory if its predicate cannot be satisfied by any tuple. For example, the predicate
(Catalog = 'BS' AND Catalog = 'CS') is contradictory, since a given book can only be classified
in one of the categories at a time.
Problem of query optimization is to find the sequence of steps that produces the answer
to user request in the most efficient manner, given the database structure.
The performance of a query is affected by the tables or views that underlies the query and
by the complexity of the query.
Given a request for data manipulation or retrieval, an optimizer will choose an optimal
plan for evaluating the request from among the manifold alternative strategies. i.e. there
are many ways (access paths) for accessing desired file/record.
Hence, the DBMS is responsible for picking the best execution strategy based on various
considerations (least amount of I/O and CPU resources).
Method 1 :
c. All records of r concatenated?
NO: goto a.
Method 2: Improvement
Performance: Reduces the number of times the blocks of s are loaded, by a factor equal to the
number of r records that can fit in main memory.
A. Heuristics Approach
🞄 The leaves: the base relations used for processing the query /
extracting the required information
🞄 Nodes: intermediate results or relations before reaching the final
result.
Using Heuristics in Query Optimization
The main heuristic is to apply first the operations that reduce the size of intermediate
results.
1. E.g. Apply SELECT and PROJECT operations before applying the JOIN or other
binary operations.
Query block: The basic unit that can be translated into the algebraic operators and
optimized.
Query tree:
An execution of the query tree consists of executing an internal node operation whenever
its operands are available and then replacing that internal node by the relation that results
from executing the operation.
Query graph:
◦ π PNUMBER, DNUM, LNAME, ADDRESS, BDATE (((σ PLOCATION='STAFFORD' (PROJECT))
⋈ DNUM=DNUMBER (DEPARTMENT)) ⋈ MGRSSN=SSN (EMPLOYEE))
SQL query:
Q2: SELECT P.PNUMBER, P.DNUM, E.LNAME, E.ADDRESS, E.BDATE
FROM PROJECT AS P, DEPARTMENT AS D, EMPLOYEE AS E
WHERE P.DNUM = D.DNUMBER AND
D.MGRSSN = E.SSN AND
P.PLOCATION = 'STAFFORD';
⚫ Heuristic Optimization of Query Trees:
◦ The same query could correspond to many different relational algebra expressions
— and hence many different query trees.
◦ The task of heuristic optimization of query trees is to find a final query tree
that is efficient to execute.
⚫ Example:
Q: SELECT LNAME
Transformation Rules for RA Operations
Use of Transformation Rules
For prospective renters of flats, find the properties that match their requirements and are owned by
owner CO93.
p.ownerNo = 'CO93';
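Only the predicate p.ownerNo = 'CO93' survives from the original example, so the full query below is a hedged sketch; the table and column names (Client, Viewing, PropertyForRent, prefType, maxRent, and so on) are assumptions added for illustration:

SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.clientNo = v.clientNo          -- client viewed the property
  AND v.propertyNo = p.propertyNo
  AND c.prefType = p.type              -- property matches the client's requirements
  AND c.maxRent >= p.rent
  AND p.ownerNo = 'CO93';              -- owned by owner CO93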
Heuristical Processing Strategies
⚫ Combine Cartesian product with subsequent Selection whose predicate represents join
condition into a Join operation.
⚫ Use associativity of binary operations to rearrange leaf nodes so that the leaf nodes with the most
restrictive Selection operations are executed first.
⚫ If common expression appears more than once, and result not too large, store result and
reuse it when required.
⚫ Useful when querying views, as same expression is used to construct view each time.
Summary of Heuristics for Algebraic Optimization:
1. The main heuristic is to apply first the operations that reduce the size of
intermediate results.
3. The select and join operations that are most restrictive should be executed
before other similar operations. (This is done by reordering the leaf nodes of the
tree among themselves and adjusting the rest of the tree appropriately.)
⚫ The main idea is to minimize the cost of processing a query. The cost function is
comprised of:
⚫ The DBMS will use information stored in the system catalog for the purpose of
estimating cost.
⚫ The main target of query optimization is to minimize the size of the intermediate
relation. The size will have effect in the cost of:
◦ Disk Access
◦ Data Transportation
◦ Writing on Disk
⚫ Use formulae that estimate costs for a number of options, and select one with lowest cost.
⚫ Consider only cost of disk access, which is usually dominant cost in QP.
⚫ Many estimates are based on cardinality of the relation, so need to be able to estimate
this.
◦ Estimate and compare the costs of executing a query using different execution
strategies and choose the strategy with the lowest cost estimate. (Compare to
heuristic query optimization)
⚫ Issues
◦ Cost function
1. Access cost (to secondary storage)
2. Storage cost
3. Computation cost
4. Memory usage cost
5. Communication cost
⚫ Data is going to be accessed from secondary storage, as a query will need some
part of the data stored in the database. The disk access cost can again be analyzed in
terms of:
◦ Searching
◦ Reading, and
◦ The file organization used and the access method implemented for the file
organization.
◦ Whether the data is stored contiguously or in a scattered manner will also affect the disk
access cost.
2. Storage Cost
• While processing a query, as any query would be composed of many database operations,
there could be one or more intermediate results before reaching the final output. These
intermediate results should be stored in primary memory for further processing. The
bigger the intermediate relation, the larger the memory requirement, which will have
impact on the limited available space. This will be considered as a cost of storage.
3. Computation Cost
⚫ Query is composed of many operations. The operations could be database operations like
reading and writing to a disk, or mathematical and other operations like:
◦ Searching
◦ Sorting
◦ Merging
4. Communication Cost
o In most database systems the database resides at one station and the various queries
originate from different terminals. This has an impact on the performance of the
system, adding cost to query processing. Thus, the cost of transporting data between the
database site and the terminal from which the query originates should be analyzed.
◦ The methods to be used in computing the relational operators stored in the tree.
1.9 Pipelining
⚫ Pipelined evaluation: evaluate several operations simultaneously, passing the results of
one operation on to the next.
◦ Instead, pass tuples directly to the join. Similarly, don't store the result of the join; pass
tuples directly to the projection.
⚫ For pipelining to be effective, use evaluation algorithms that generate output tuples even
as tuples are received for inputs to the operation.
⚫ Pipelines can be executed in two ways: demand driven and producer driven
◦ Each operation requests next tuple from children operations as required, in order
to output its next tuple
◦ In between calls, operation has to maintain ―state‖ so it knows what to return next
⚫ Buffer maintained between operators, child puts tuples in buffer, parent
removes tuples from buffer
⚫ if buffer is full, child waits till there is space in the buffer, and then
generates more tuples
◦ System schedules operations that have space in output buffer and can process
more input tuples
⚫ open()
⚫ next()
⚫ E.g. for file scan: Output next tuple, and advance and store file
pointer
⚫ E.g. for merge join: continue with merge from earlier state till
next output tuple is found. Save pointers as iterator state.
⚫ close()
Evaluation Algorithms for Pipelining
⚫ Some algorithms are not able to output results even as they get input tuples
⚫ Algorithm variants to generate (at least some) results on the fly, as input tuples are read
in
◦ E.g. hybrid hash join generates output tuples even as probe relation tuples in the
in-memory partition (partition 0) are read in
◦ Pipelined join technique: Hybrid hash join, modified to buffer partition 0 tuples
of both relations in-memory, reading them as they become available, and output
results of any matches between partition 0 tuples
Chapter 2: Database Security and Authorization
2.1 Data Integrity
⚫ Security vs Integrity
◦ Database security makes sure that the user is authorised to access information
◦ Database integrity makes sure that (authorised) users use that information
correctly
⚫ Integrity constraints
⚫ The constraints we wish to impose in order to protect the database from becoming
inconsistent.
⚫ Five types
i. Required data
ii. Attribute domain constraints
iii. Entity integrity
iv. Referential integrity
v. Enterprise constraints
i. Required Data
⚫ Some attributes must always contain a value -- they cannot have a NULL value
⚫ For example:
◦ Every employee must have a job title.
◦ Every diveshop diveitem must have an order number and an item number
⚫ Every attribute has a domain, that is a set of values that are legal for it to use
⚫ For example:
◦ Must be Unique
◦ Cannot be NULL
⚫ A "foreign key" links each occurrence in a relation representing a child entity to the
occurrence of the parent entity containing the matching candidate (usually primary) key
⚫ Referential Integrity means that if the foreign key contains a value, that value must refer
to an existing occurrence in the parent entity
⚫ For example:
◦ Since the Order ID in the diveitem relation refers to a particular diveords item,
that item must exist for referential integrity to be satisfied.
⚫ Referential integrity options are declared when tables are defined (in most systems)
⚫ There are many issues having to do with how particular referential integrity constraints
are to be implemented to deal with insertions and deletions of data from the parent and
child tables.
Insertion rules
⚫ A row should not be inserted in the referencing (child) table unless there already exists a
matching entry in the referenced table
⚫ Inserting into the parent table should not cause referential integrity problems
⚫ Sometimes a special NULL value may be used to create child entries without a parent or
with a "dummy" parent
Deletion rules
⚫ A row should not be deleted from the referenced table (parent) if there are matching rows
in the referencing table (child)
◦ Nullify -- reset the foreign keys in the child to some NULL or dummy value
◦ Cascade -- Delete all rows in the child where there is a foreign key matching the
key in the parent row being deleted
v. Enterprise Constraints
⚫ These are business rules that may affect the database and the data in it
⚫ This is now increasingly handled by the database. In Oracle, for example, when defining a
table you can specify:
attrN attr-type CHECK (attrN = UPPER(attrN)), which verifies that the data meets certain criteria
Referential Integrity
⚫ Ensures that dependent relationships in the data are maintained. In Oracle, for example:
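The original Oracle example is not reproduced in this text, so the following is a hedged sketch of how such a constraint is commonly declared; it reuses the diveshop table names mentioned earlier (diveitem, diveords), and the column names and types are assumptions:

CREATE TABLE diveitem (
  orderno  NUMBER,
  itemno   NUMBER,
  qty      NUMBER,
  PRIMARY KEY (orderno, itemno),
  -- the foreign key enforces referential integrity against the parent (diveords) table;
  -- ON DELETE CASCADE removes child rows when the parent order is deleted
  FOREIGN KEY (orderno) REFERENCES diveords(orderno) ON DELETE CASCADE
);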
Concurrency Control
⚫ The goal is to support access by multiple users to the same data, at the same time
⚫ It must assure that the transactions are serializable and that they are isolated
⚫ Specifically:
◦ Lost updates
John                                         Marsha
Read account balance (balance = $1000)       Read account balance (balance = $1000)
Withdraw $200 (balance = $800)               Withdraw $300 (balance = $700)
Write account balance (balance = $800)       Write account balance (balance = $700)
⚫ Locking levels
◦ Database
◦ Table
◦ Block or page
◦ Record
◦ Field
⚫ Types
◦ Shared (S locks)
◦ Exclusive (X locks)
John                                         Marsha
Lock account balance
Read account balance (balance = $1000)       Read account balance (DENIED)
Withdraw $200 (balance = $800)               Lock account balance
Write account balance (balance = $800)       Read account balance (balance = $800)
Unlock account balance                       etc...
⚫ Avoiding deadlocks by maintaining tables of potential deadlocks and "backing out" one
side of a conflicting transaction
◦ From the user‘s point of view a private copy of the database is created for the
duration of the transaction
⚫ Transactions are started with SET TRANSACTION, followed by the SQL statements
⚫ Part or all of a transaction can be undone using ROLLBACK
⚫ COMMIT;
⚫ COMMIT;
⚫ Freezes the data for the user in both tables before either select retrieves any rows, so that
changes that occur concurrently will not show up
⚫ Commits before and after ensure any uncompleted transactions are finished, and then
release the frozen data when done
⚫ Savepoints are places in a transaction that you may ROLLBACK to (called checkpoints
in other DBMS)
◦ SET TRANSACTION…;
◦ SAVEPOINT ALPHA;
◦ SQL STATEMENTS…
◦ SAVEPOINT BETA;
◦ SQL STATEMENTS…
◦ IF …;
◦ COMMIT;
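Filling in the outline above, a small self-contained sketch follows (the Account table, its columns, and the account numbers reused from the transaction chapter are assumptions; exact SET TRANSACTION options vary by DBMS):

SET TRANSACTION READ WRITE;
UPDATE Account SET Balance = Balance - 200 WHERE AccNo = 'CA2090';
SAVEPOINT alpha;
UPDATE Account SET Balance = Balance + 200 WHERE AccNo = 'SB2359';
-- if a later check fails, undo only the work done since the savepoint
ROLLBACK TO SAVEPOINT alpha;
COMMIT;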
⚫ Authorization rules to identify users and the actions they can perform
⚫ Authentication schemes to positively identify a person attempting to gain access to the
database
⚫ Legal issues
⚫ Physical security
⚫ OS/Network security
⚫ DBMS security
◦ These are used to identify a user and control their access to information
⚫ DBMS verifies password and checks a user's permissions when they try to
◦ Retrieve data
◦ Modify data
◦ SELECT privilege
◦ INSERT privilege
◦ UPDATE privilege
◦ DELETE privilege
⚫ The owner (creator) of a database has all privileges on all objects in the database, and can
grant these to others
⚫ The owner (creator) of an object has all privileges on that object and can pass them on to
others
⚫ Some systems permit finer-grained authorization (most use GRANT and REVOKE on
variant views)
GRANT <privileges>
ON <object>
TO <users>
• <privileges> is a list of the privileges to be granted (e.g. SELECT, INSERT, UPDATE, DELETE, or ALL)
• WITH GRANT OPTION means that the users can pass their privileges on to others
Privileges Examples
GRANT ALL
ON Employee
TO Manager
WITH GRANT OPTION
The user 'Manager' can do anything to the Employee table, and can allow other users to do the
same (by using GRANT statements)
GRANT SELECT,
UPDATE(Salary)
ON Employee
TO Finance
The user 'Finance' can view the entire Employee table and can change Salary values, but cannot
change other values or pass on their privileges
Removing Privileges
REVOKE <privileges>
ON <object>
FROM <users>
⚫ If a user has the same privilege from other users then they keep it
⚫ 'Manager' revokes ALL from 'Personnel'
Views
◦ SQL:
CREATE VIEW viewname AS SELECT field1, field2, field3, … FROM table1, table2
WHERE <where clause>;
Restricted Views
S-view of the data
◦ You can SELECT from (and sometimes UPDATE etc) views just like tables
Creating Views
CREATE VIEW <view name>
AS <select stmt>
⚫ <select stmt> is a query that returns the rows and columns of the view
⚫ Example
⚫ We want each user to be able to view the names and phone numbers (only) of
those employees in their own department
Employee
ID Name Phone Department Salary
E158 Mark x6387 Accounts £15,000
E159 Mary x6387 Marketing £15,000
E160 Jane x6387 Marketing £15,000
View example
WHERE Department = … (the view's defining query includes only rows for the relevant department)
◦ Privileges are granted to that view, rather than the underlying tables
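A hedged sketch of such a view for the Marketing department follows; the view name and the grantee are hypothetical, while the column names come from the Employee table above:

CREATE VIEW MarketingStaff AS
  SELECT Name, Phone
  FROM Employee
  WHERE Department = 'Marketing';

-- users in Marketing are then granted access to the view only, not the base table:
GRANT SELECT ON MarketingStaff TO Marketing;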
View Updating
◦ Their value depends on the 'base' tables that they are defined from
⚫ Updating views
◦ It is often not clear how to change the base tables to make the desired change to
the view
⚫ For a view to be updatable, the defining query of the view should satisfy certain
conditions:
◦ View should be defined on a single table (no join, union, etc. used in FROM)
◦ Create a view of that table that shows only the information they need to see
◦ Grant them privileges on the view
◦ We want to let the user 'John' read the department and name, and be able to
update the department (only)
◦ Create a view
CREATE VIEW forJohn
AS SELECT Name,
Department
FROM Employee
GRANT SELECT,
UPDATE (Department)
ON forJohn
TO John
⚫ Checkpoint facility
⚫ Recovery manager
Disaster Recovery Planning
⚫ Water
⚫ Fire
⚫ Power Failure
Kinds of Records
⚫ Class I: VITAL
◦ Records whose loss would be inconvenient, but which are replaceable
⚫ Early offsite storage facilities were often intended to survive atomic explosions
◦ http://www.prismintl.org/
Chapter 3: Transaction Processing Concepts
3.1 Introduction to Transaction
⚫ A Transaction:
◦ Logical unit of database processing that includes one or more access operations
(read -retrieval, write - insert or update, delete).
⚫ Examples include ATM transactions, credit card approvals, flight reservations, hotel
check-in, phone calls, supermarket scanning, academic registration and billing.
⚫ Transaction boundaries:
◦ Any single transaction in an application program is bounded with Begin and End
statements.
⚫ An application program may contain several transactions separated by the Begin and
End transaction boundaries.
⚫ Granularity of data - a field, a record, or a whole disk block; it measures the size of the
data item
⚫ Basic operations that a transaction can perform are read and write
◦ write_item(X): Writes the value of program variable X into the database item
named X.
⚫ Basic unit of data transfer from the disk to the computer main memory is one block.
◦ Copy that disk block into a buffer in main memory (if that disk block is not
already in some main memory buffer).
◦ Find the address of the disk block that contains item X.
◦ Copy that disk block into a buffer in main memory (if that disk block is not
already in some main memory buffer).
◦ Copy item X from the program variable named X into its correct location in the
buffer.
◦ Store the updated block from the buffer back to disk (either immediately or at
some later point in time).
⚫ The DBMS maintains a number of buffers in the main memory that holds database disk
blocks which contains the database items being processed.
◦ if there is a need for an additional database block to be copied into main memory and no buffer is free;
⚫ Some buffer management policy is used to choose a buffer for replacement, but if the chosen
buffer has been modified, it must be written back to disk before it is reused.
◦ For recovery purposes, the system needs to keep track of when the transaction
starts, terminates, and commits or aborts.
⚫ Transaction states:
◦ Partially committed state shows the end of the read/write operations, but this does not
ensure permanent modification of the database
◦ Committed state - ensures that all the changes done on a record by a transaction
were made persistent
◦ Failed state happens when a transaction is aborted during its active state or if one
of the checks fails
🞄 [read_item,T,X]: Records that transaction T has read the value of database item X.
🞄 [commit,T]: Records that transaction T has completed successfully, and affirms that its
effect can be committed (recorded permanently) to the database.
Desirable Properties of Transactions
To ensure data integrity, the DBMS should maintain the following ACID properties:
⚫ Atomicity: A transaction is an atomic unit of processing; it is either performed in its entirety
or not performed at all.
⚫ Consistency preservation: A correct execution of the transaction must take the database
from one consistent state to another.
⚫ Isolation: A transaction should not make its updates visible to other transactions until it
is committed; this property, when enforced strictly, solves the temporary update problem
and makes cascading rollbacks of transactions unnecessary
⚫ Durability or permanency: Once a transaction changes the database and the changes are
committed, these changes must never be lost because of subsequent failure.
Example
⚫ Suppose that Ti is a transaction that transfers 200 Birr from account CA2090 (which has
5,000 Birr) to SB2359 (which has 3,500 Birr), as follows
🞄 Read(CA2090)
🞄 CA2090= CA2090-200
🞄 Write(CA2090)
🞄 Read(SB2359)
🞄 SB2359= SB2359+200
🞄 Write(SB2359)
⚫ Atomicity- either all or none of the above operation will be done – this is materialized by
transaction management component of DBMS
⚫ Isolation - when several transactions are being processed concurrently on a data item, they
may create many inconsistency problems. Handling such cases is the responsibility of the
concurrency control component of the DBMS
⚫ Durability - once Ti's updates are committed, they will remain in place even when the database
is restarted after a failure. This is the responsibility of the recovery management components of
the DBMS (a sketch of this transfer as an SQL transaction is shown below)
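Expressed in SQL, the transfer above might look roughly as follows (a sketch only; the Account table and its column names are assumptions, and the statement that starts a transaction varies by DBMS):

-- all statements between the start of the transaction and COMMIT
-- succeed or fail together (atomicity)
START TRANSACTION;   -- BEGIN TRANSACTION in some DBMSs
UPDATE Account SET Balance = Balance - 200 WHERE AccNo = 'CA2090';
UPDATE Account SET Balance = Balance + 200 WHERE AccNo = 'SB2359';
COMMIT;              -- durability: once committed, the changes survive failures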
3.3 Transaction Processing
⚫ Single-User System:
◦ At most one user at a time can use the database management system.
⚫ Multiuser System:
◦ E.g., airline reservation, banking, and similar systems are operated by many users
who submit transactions concurrently to the system
⚫ Concurrency
◦ Interleaved processing:
Advantages:
🞄 keeps the CPU busy when the process requires I/O by switching to execute
another process rather than remaining idle during I/O time and hence this
will increase system throughput (average no. of transactions completed
within a given time)
◦ Parallel processing:
Problems of Concurrent Sharing
i. The Lost Update Problem
◦ This occurs when two transactions that access the same database items have their
operations interleaved in a way that makes the value of some database item
incorrect.
Suppose A initially holds 100. T1 withdraws 10 from A; T2 adds 100 to A.
If the two transactions are done one after the other (serially), there is no problem.
With the interleaving below, however, after both complete the final value of A is 200, which
overrides the update made by the first transaction that had changed the value from 100 to 90:
T1's update is lost.
T1: read_item(A)
T1: A = A - 10
T2: read_item(A)
T2: A = A + 100
T1: write_item(A)        -- A is now 90
T2: write_item(A)        -- A is now 200; T1's update is lost
ii. The Temporary Update (or Dirty Read) Problem
◦ This occurs when one transaction updates a database item and then the transaction
fails for some reason .
Example: T1 would like to add the values A = 10, B = 20 and C = 30. After the values A and B are
read by T1 and before T1 completes, T2 updates the value of B to 50. At the end of the execution
of the two transactions, T1 comes up with a sum of 60, while it should be 90 since B was updated
to 50.
◦ Note, however, that operations from other transactions Tj can be interleaved with
the operations of Ti in a schedule S. E.g., consider the following schedule:
Sa: r2(X); w2(X); r1(X); w1(X); a2;
⚫ Two operations in a schedule are said to conflict if they satisfy all of the following
conditions:
◦ they belong to different transactions,
◦ they access the same item X, and
◦ at least one of the operations is a write_item(X).
⚫ Recoverable schedule:
◦ One where a transaction T commits only after every transaction that wrote an item
read by T has committed.
⚫ Cascadeless schedule:
◦ One where every transaction reads only items that were written by committed
transactions.
⚫ Strict schedules:
◦ A schedule in which a transaction can neither read nor write an item X until the last
transaction that wrote X has committed (or aborted).
ii. Characterizing Schedules based on Serializability
– The concept of serializability of schedules is used to identify which schedules are correct
when concurrently executing transactions have their operations interleaved in the
schedule.
⚫ Serial schedule:
◦ A schedule S is serial if, for every transaction Ti participating in the schedule, all
the operations of Ti are executed consecutively in the schedule. Otherwise, the
schedule is called a nonserial schedule.
◦ For example, in the banking setting, suppose there are two transactions, where
one calculates the interest on an account and the other deposits some money
into the account; the order in which they execute matters for the final
result.
⚫ Serializable schedule:
◦ A schedule S of n transactions is serializable if it is equivalent to some serial
schedule of the same n transactions.
⚫ Result equivalent:
◦ Two schedules are called result equivalent if they produce the same final state of
the database
i. Conflict equivalent:
◦ Two schedules are said to be conflict equivalent if the order of any two
conflicting operations is the same in both schedules.
◦ Conflict serializable: a schedule S is conflict serializable if it is conflict
equivalent to some serial schedule.
ii. Two schedules S and S' are said to be view equivalent if the following three conditions hold:
1. The same set of transactions participates in S and S', and S and S' include the
same operations of those transactions.
2. If a transaction reads a data item A in S (either its initial value or a value written
by some transaction), it must read the value of A produced by the same write (or
the initial value) in S'.
3. For each data item A, the transaction that performs the final write on A in S
must also perform the final write on A in S'.
– An edge is created from Ti to Tj if one of the operations in Ti appears before a
conflicting operation in Tj
– The schedule is serializable if and only if the precedence graph has no cycles.
⚫ Example: Constructing the precedence graphs for schedules A to D from (Slide No.
23) to test for conflict serializability.
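To make the test concrete, here is a small, hedged Python sketch that builds a precedence graph
from a schedule of read/write operations and checks it for cycles; the schedule encoding
(transaction, operation, item) is invented for illustration.

# Hedged sketch: conflict-serializability test via a precedence graph.
# A schedule is a list of (transaction, op, item) with op in {"r", "w"}.

def precedence_graph(schedule):
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            # conflicting ops: same item, different transactions, at least one write
            if x == y and ti != tj and ("w" in (op_i, op_j)):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
    def visit(node, path, seen):
        if node in path:
            return True
        if node in seen:
            return False
        seen.add(node)
        return any(visit(nxt, path | {node}, seen) for nxt in graph.get(node, ()))
    seen = set()
    return any(visit(n, set(), seen) for n in graph)

# Lost-update interleaving from the earlier example: not conflict serializable.
s = [("T1", "r", "A"), ("T2", "r", "A"), ("T1", "w", "A"), ("T2", "w", "A")]
print(has_cycle(precedence_graph(s)))   # True -> cycle -> not serializable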
Summary of Schedule types
◦ Either the statement completes execution without error or it fails and leaves the
database unchanged.
⚫ Every SQL transaction has three characteristics: access mode, diagnostic area size and
isolation level.
◦ Access mode:
◦ Isolation level can be
🞄 READ UNCOMMITTED,
🞄 READ COMMITTED,
🞄 REPEATABLE READ, or
🞄 SERIALIZABLE.
◦ These characteristics are specified with a SET TRANSACTION statement; a
transaction is then bracketed as follows:
BEGIN TRANSACTION;
...
COMMIT;
BEGIN TRANSACTION;
...
ROLLBACK;
• However, if any transaction executes at a lower level, then serializability may be violated.
Potential problems with lower isolation levels: four types
• Dirty read: a transaction T2 could read a database object A that has been modified by another
transaction T1, which has not yet committed.
Chapter 4: Concurrency Controlling Techniques
4.1 Database Concurrency Control
⚫ Transaction Processor is divided into:
⚫ The scheduler (concurrency-control manager) must assure that the individual actions of
multiple transactions are executed in such an order that the net effect is the same as if the
transactions had in fact executed one-at-a-time.
⚫ A typical scheduler does its work by maintaining locks on certain pieces of the database.
These locks prevent two transactions from accessing the same piece of data at the same
time. Example:
4.2 Concurrency Control Techniques
⚫ Basic concurrency control techniques:
◦ Locking,
◦ Timestamping
◦ Optimistic methods
⚫ The first two are conservative approaches: they delay transactions in case they conflict with
other transactions.
⚫ Optimistic methods assume conflict is rare and only check for conflicts at commit.
4.2.1 Locking
• Lock is a variable associated with a data item that describes the status of the data item
with respect to the possible operations that can be applied to it.
• Generally, a transaction must claim a shared (read) or exclusive (write) lock on a data
item before read or write.
• Lock prevents another transaction from modifying item or even reading it, in the case of a
write lock.
• Example:
• Unlocking is an operation which removes these permissions from the data item.
• Example:
• More than one transaction can apply a shared lock on X for reading its value,
but no write lock can then be applied on X by any other transaction.
• Only one write lock on X can exist at any time, and no shared lock can be
applied on X by any other transaction while it is held.
Conflict (compatibility) matrix:
                Shared    Exclusive
   Shared        yes         no
   Exclusive     no          no
Lock Manager:
Lock table:
🞄 It must not lock an already locked data items and it must not try to unlock
a free data item.
When a transaction finishes operating on X, it issues an unlock_item(X) operation, which sets
LOCK(X) to 0 (unlocked) so that X may be accessed by other transactions.
If a transaction has a shared lock on an item, it can read but not update the item.
If a transaction has an exclusive lock on an item, it can both read and update the item.
Reads cannot conflict, so more than one transaction can hold shared locks simultaneously
on the same item.
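A minimal sketch of how a lock manager might apply this compatibility rule is given below; the
LockTable class and its methods are illustrative, not a real DBMS API.

# Hedged sketch of a lock manager's compatibility check (names are illustrative).

COMPATIBLE = {                       # (held, requested) -> can they coexist?
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

class LockTable:
    def __init__(self):
        self.locks = {}              # item -> list of (transaction, mode)

    def request(self, txn, item, mode):
        held = self.locks.get(item, [])
        if all(COMPATIBLE[(m, mode)] for t, m in held if t != txn):
            self.locks.setdefault(item, []).append((txn, mode))
            return "granted"
        return "wait"                # caller blocks until the item is unlocked

    def release(self, txn, item):
        self.locks[item] = [(t, m) for t, m in self.locks.get(item, []) if t != txn]

lt = LockTable()
print(lt.request("T1", "X", "S"))    # granted
print(lt.request("T2", "X", "S"))    # granted (shared locks are compatible)
print(lt.request("T3", "X", "X"))    # wait (write lock conflicts with readers)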
read_lock(X):
B: if LOCK(X) = "unlocked"
   then begin LOCK(X) := "read-locked";
              no_of_reads(X) := 1
        end
   else if LOCK(X) = "read-locked"
   then no_of_reads(X) := no_of_reads(X) + 1
   else begin
        wait (until LOCK(X) = "unlocked" and the lock manager wakes up the transaction);
        go to B
   end;
⚫ Lock conversion
◦ Upgrade: if Ti is the only transaction holding a read-lock on X, convert
read-lock(X) to write-lock(X); otherwise force Ti to wait until the other
transactions unlock X.
◦ Downgrade: if Ti has a write-lock(X) (no other transaction can have any lock on X),
convert write-lock(X) to read-lock(X).
⚫ Using such locks in transactions does not, on its own, guarantee serializability of the
schedule. Example:
Schedule S (time runs downward; T1 adds 100 to X and subtracts 100 from Y, T2 increases both by 10%):
    T1: write_lock(X); read(X); X = X + 100; write(X); unlock(X)
    T2: write_lock(X); read(X); X = X * 1.1; write(X); unlock(X)
    T2: write_lock(Y); read(Y); Y = Y * 1.1; write(Y); unlock(Y); commit
    T1: write_lock(Y); read(Y); Y = Y - 100; write(Y); unlock(Y); commit
• If at the start X = 100 and Y = 400, the result should be:
– X = 220, Y = 330, if T1 executes before T2
– X = 210, Y = 340, if T2 executes before T1
Schedule S instead produces X = 220 and Y = 340, which matches neither serial order.
• Problem is that transactions release locks too soon, resulting in loss of total isolation and
atomicity.
⚫ Every transaction can be divided into Two Phases: Locking (Growing) & Unlocking
(Shrinking)
⚫ Requirement:
◦ For a transaction these two phases must be mutually exclusive, that is, during the
locking phase the unlocking phase must not start, and during the unlocking phase the
locking phase must not begin.
(Figure: number of locks held by Ti over time – the count grows during the locking phase and
shrinks during the unlocking phase.)
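A hedged sketch of how one might check that a transaction's sequence of lock and unlock
operations obeys the two-phase rule (all locks acquired before the first unlock) follows; the
operation encoding is invented for illustration.

# Hedged sketch: check whether a transaction's operation sequence obeys
# two-phase locking (all lock ops before the first unlock).

def is_two_phase(ops):
    """ops: list of ("lock", item) / ("unlock", item) in execution order."""
    unlocking_started = False
    for action, _item in ops:
        if action == "unlock":
            unlocking_started = True
        elif action == "lock" and unlocking_started:
            return False            # lock requested after the shrinking phase began
    return True

print(is_two_phase([("lock", "X"), ("lock", "Y"), ("unlock", "X"), ("unlock", "Y")]))  # True
print(is_two_phase([("lock", "X"), ("unlock", "X"), ("lock", "Y"), ("unlock", "Y")]))  # False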
⚫ Deadlock
◦ It is a state that may result when two or more transactions are each waiting for
locks held by the other to be released.
◦ Example:
T1: read_lock(Y); read_item(Y)
T2: read_lock(X); read_item(X)
T1: write_lock(X)      -- waits for T2 to release its lock on X
T2: write_lock(Y)      -- waits for T1 to release its lock on Y: deadlock
⚫ So the DBMS must either prevent or detect and resolve such deadlock situations
− This way of locking prevents deadlock since a transaction never waits for
a data item.
− The lower the timestamp, the higher the transaction's priority, that is, the
oldest transaction has the highest priority.
⚫ Wait-die
◦ If TS(Ti) < TS(Tj) (Ti is older than Tj), then Ti is allowed to wait.
◦ Otherwise (Ti is younger than Tj), abort Ti (Ti dies) and restart it later with the same
timestamp.
⚫ Wound-wait
◦ If TS(Ti) < TS(Tj) (Ti is older than Tj), abort Tj (Ti wounds Tj) and restart Tj
later with the same timestamp.
◦ Otherwise (Ti is younger than Tj), Ti is allowed to wait.
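The two rules can be summarized as small decision functions; the following hedged Python sketch
is illustrative only (the timestamp table and names are assumed).

# Hedged sketch of the wait-die and wound-wait decisions. ts maps a
# transaction to its (smaller = older) timestamp.

def wait_die(ts, ti, tj):
    """Ti requests an item held by Tj."""
    return "Ti waits" if ts[ti] < ts[tj] else "abort Ti (dies), restart later"

def wound_wait(ts, ti, tj):
    """Ti requests an item held by Tj."""
    return "abort Tj (wounded), restart later" if ts[ti] < ts[tj] else "Ti waits"

ts = {"T1": 25, "T2": 20}            # T2 is older than T1
print(wait_die(ts, "T1", "T2"))      # abort Ti (dies), restart later
print(wound_wait(ts, "T1", "T2"))    # Ti waits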
(Figure: wait-for graph in which T1 (ts = 25), T2 (ts = 20) and T3 (ts = 10) are waiting on one
another.)
◦ When a chain such as "Ti waits for Tj, Tj waits for Tk, and Tk waits for Ti" occurs,
this creates a cycle, i.e., a deadlock.
◦ When the system is in a state of deadlock, some of the transactions should be
selected as victims, aborted and rolled back.
◦ Victims can be chosen among the transactions that have done the least work, that
hold the fewest locks, or that have been aborted the fewest times, and so on.
iii. Timeouts
◦ This approach uses the period of time that transactions have been waiting to lock items.
◦ If a transaction waits longer than a predefined timeout period, the system assumes
that it may be deadlocked and aborts it.
Starvation
◦ Solutions
🞄 FIFO (first-come, first-served) handling of lock requests
🞄 Give higher priority to transactions that have been aborted many times
◦ But here, timestamp values are assigned based on the time at which transactions
are submitted to the system, using the system's current date and time.
◦ A monotonically increasing variable (integer) indicating the age of an operation
or a transaction.
◦ This can be achieved by associating a timestamp value (TS) with each database item,
denoted as follows:
a) read_TS(X): the read timestamp of X – the largest among all the timestamps of
transactions that have successfully read item X.
b) write_TS(X): the largest among all the timestamps of transactions that have successfully
written item X.
The concurrency control algorithm checks whether conflicting operations violate the
timestamp ordering in the following manner:
🞄 If the condition in part (a) does not exist, then execute write_item(X) of T
and set write_TS(X) to TS(T).
🞄 If write_TS(X) ≤ TS(T), then execute read_item(X) of T and set
read_TS(X) to the larger of TS(T) and the current read_TS(X).
🞄 If TS(T) > read_TS(X), then delay T until the transaction T' that wrote or
read X has terminated (committed or aborted).
🞄 If TS(T) > write_TS(X), then delay T until the transaction T' that wrote X
has terminated (committed or aborted).
🞄 If read_TS(X) > TS(T), then abort and roll back T and reject the operation.
🞄 If write_TS(X) > TS(T), then just ignore the write operation and continue
execution, because in the case of two consecutive writes only the most
recent write counts. Example
🞄 If the conditions given in 1 and 2 above do not occur, then execute write_item(X)
of T and set write_TS(X) to TS(T).
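A hedged Python sketch of the basic timestamp-ordering checks follows; the read_ts/write_ts
dictionaries and the returned strings are illustrative, not a specific DBMS's implementation.

# Hedged sketch of the basic timestamp-ordering checks. read_ts / write_ts
# hold the largest timestamps that have read / written each item.

read_ts, write_ts = {}, {}

def to_read(item, ts):
    if write_ts.get(item, 0) > ts:            # a younger transaction already wrote X
        return "abort and roll back T"
    read_ts[item] = max(read_ts.get(item, 0), ts)
    return "execute read_item(X)"

def to_write(item, ts):
    if read_ts.get(item, 0) > ts or write_ts.get(item, 0) > ts:
        return "abort and roll back T"        # the operation arrives too late
    write_ts[item] = ts
    return "execute write_item(X)"

print(to_write("X", ts=5))    # execute write_item(X)
print(to_read("X", ts=3))     # abort and roll back T (write_TS(X) = 5 > 3)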
◦ This algorithm uses the concept of view serializability rather than conflict serializability.
◦ Side effect:
◦ Two schemes: one based on timestamp ordering and one based on 2PL.
◦ Assume X1, X2, …, Xn are the versions of a data item X created by the write
operations of transactions. With each version Xi a read_TS (read timestamp) and a
write_TS (write timestamp) are associated.
◦ If transaction T issues read_item (X), find the version i of X that has the highest
write_TS(Xi) of all versions of X that is also less than or equal to TS(T), then
return the value of Xi to T, and set the value of read _TS(Xi) to the largest of
TS(T) and the current read_TS(Xi).
◦ Note that rule two indicates that a read request will never be rejected.
– This is accomplished by maintaining two versions of each data item X where one
version must always have been written by some committed transaction.
Steps
3. Other transactions continue to read X.
Note:
– In multiversion 2PL read and write operations from conflicting transactions can
be processed concurrently.
– It avoids cascading abort but like strict two phase locking scheme conflicting
transactions may get deadlocked.
◦ A transaction can read values of committed data items. However, updates are
applied only to local copies (versions) of the data items (in database cache).
− If the transaction Ti decides that it wants to commit, the DBMS checks whether
the transaction could possibly have conflicted with any other concurrently
executing transaction.
− While one transaction, Ti, is being validated, no other transaction can be allowed
to commit.
− This phase for Ti checks that, for each transaction Tj that is either committed or is
in its validation phase, one of the following conditions holds:
(1) Tj completes its write phase before Ti starts its read phase.
(2) Ti starts its write phase after Tj completes its write phase, and the read set of Ti
has no items in common with the write set of Tj.
(3) Both the read set and the write set of Ti have no items in common with the write set
of Tj, and Tj completes its read phase before Ti completes its read phase.
– When validating Ti, condition (1) is checked first for each transaction Tj, since it is
the simplest condition to check. If (1) is false then (2) is checked, and if (2) is false
then (3) is checked.
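A hedged sketch of this validation test is shown below; the representation of a transaction's
read/write sets and phase boundaries is assumed for illustration.

# Hedged sketch of the optimistic validation test for transaction Ti against an
# already validated/committed Tj. Phase timestamps and sets are illustrative.

def validate(ti, tj):
    """Each transaction: dict with read_set, write_set, and phase boundary times."""
    if tj["write_end"] < ti["read_start"]:                     # condition (1)
        return True
    if tj["write_end"] < ti["write_start"] and \
       not (ti["read_set"] & tj["write_set"]):                 # condition (2)
        return True
    if not (ti["read_set"] & tj["write_set"]) and \
       not (ti["write_set"] & tj["write_set"]) and \
       tj["read_end"] < ti["read_end"]:                        # condition (3)
        return True
    return False                                               # abort and restart Ti

ti = {"read_set": {"X"}, "write_set": {"X"},
      "read_start": 10, "read_end": 14, "write_start": 15}
tj = {"read_set": {"Y"}, "write_set": {"Y"},
      "read_end": 8, "write_end": 9}
print(validate(ti, tj))   # True: Tj finished writing before Ti started reading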
Granularity can be coarse (the entire database) or fine (an attribute of a relation). Intermediate
granularities include:
A database record
An entire file
Thus, the degree of concurrency is low for coarse granularity and high for fine
granularity.
Example:
A transaction that expects to access most of the pages in a file should probably
set a lock on the entire file, rather than locking individual pages or records.
If a transaction needs to access relatively few pages of the file, it is better
to lock just those pages.
Similarly, if a transaction accesses several records on a page, it should lock the
entire page, and if it accesses just a few records, it should lock only those records.
This example holds true if a lock on a node locks that node and, implicitly, all of its
descendants.
The set of rules that must be followed to produce serializable schedules is:
T can lock a node only if it has not yet unlocked any node (to enforce the 2PL policy).
T can unlock a node N only if none of the children of N are currently locked by
T.
To lock a node in S mode, a transaction must first lock all of its ancestors.
Chapter 5: Database Recovery Techniques
5.1 Database Recovery
i. Purpose of Database Recovery
◦ To bring the database into the last consistent state, which existed prior to the
failure.
Whenever a transaction is submitted to the DBMS for execution, the system is responsible for
making sure that either all the operations in the transaction are completed successfully and their
effect is recorded permanently in the database, or the transaction has no effect on the database
or on any other transaction.
The DBMS may permit some operations of a transaction T to be applied to the database, yet the
transaction may fail after executing only some of its operations. Typical types of failure are:
1. A computer (system) failure:
A hardware or software error occurs in the computer system during transaction execution. If
the hardware crashes, the contents of the computer's internal memory may be lost.
2. A transaction or system error:
Some operation in the transaction may cause it to fail, such as integer overflow or division by
zero. Transaction failure may also occur because of erroneous parameter values or because of a
logical programming error. In addition, the user may interrupt the transaction during its
execution.
3. Local errors or exception conditions detected by the transaction:
◦ For example, data needed by the transaction may not be found, or an exception
condition, such as an insufficient account balance in a banking database, may cause a
transaction, such as a fund withdrawal from that account, to be canceled.
4. Concurrency control enforcement:
The concurrency control method may decide to abort the transaction, to be restarted later,
because it violates serializability or because several transactions are in a state of deadlock (see
Chapter 2).
5. Disk failure:
Some disk blocks may lose their data because of a read or write malfunction or because of a
disk read/write head crash. This may happen during a read or a write operation of the transaction.
6. Physical problems and catastrophes:
This refers to an endless list of problems that includes power or air-conditioning failure, fire,
theft, and overwriting disks or tapes by mistake.
Transaction Manager : Accepts transaction commands from an application, which tell the
transaction manager when transactions begin and end, as well as information about the
expectations of the application.
Logging: In order to assure durability, every change in the database is logged separately
on disk.
Log manager initially writes the log in buffers and negotiates with the buffer manager to
make sure that buffers are written to disk at appropriate times.
Recovery Manager: will be able to examine the log of changes and restore the database
to some consistent state.
(Figure: the buffer manager and the recovery manager operating over the data and the log on
disk.)
Recovery Algorithms
◦ Actions taken during normal transaction processing to ensure enough information
exists to recover from failures
◦ Actions taken after a failure to recover the database contents to a state that ensures
atomicity, consistency and durability
Storage Structure
⚫ Volatile storage:
⚫ Nonvolatile storage:
⚫ Stable storage:
Stable-Storage Implementation
◦ copies can be at remote sites to protect against disasters such as fire or flooding.
⚫ Failure during data transfer can still result in inconsistent copies. A block transfer can
result in:
◦ successful completion,
◦ partial failure (the destination block has incorrect information), or
◦ total failure (the destination block was never updated).
⚫ Protecting storage media from failure during data transfer (one solution):
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the same information
onto the second physical block.
⚫ Copies of a block may differ due to failure during output operation. To recover from
failure:
2. Better solution:
Data Access
⚫ Block movements between disk and main memory are initiated through the following
two operations:
◦ input(B) transfers the physical block B to a buffer in main memory.
◦ output(B) transfers the buffer block B to the disk, and replaces the appropriate
physical block there.
⚫ Each transaction Ti has its private work-area in which local copies of all data items
accessed and updated by it are kept.
◦ Ti's local copy of a data item X is called xi.
⚫ We assume, for simplicity, that each data item fits in, and is stored inside, a single block.
⚫ Transaction transfers data items between system buffer blocks and its private work-area
using the following operations :
◦ read(X) assigns the value of data item X to the local variable xi.
◦ write(X) assigns the value of local variable xi to data item {X} in the buffer block.
◦ both these commands may necessitate the issue of an input(BX) instruction before
the assignment, if the block BX in which X resides is not already in memory.
⚫ Transactions
⚫ output(BX) need not immediately follow write(X). System can perform the output
operation when it deems fit.
⚫ Modifying the database without ensuring that the transaction will commit may leave the
database in an inconsistent state.
⚫ Consider transaction Ti that transfers $50 from account A to account B; goal is either to
perform all database modifications made by Ti or none at all.
⚫ Several output operations may be required for Ti (to output A and B). A failure may
occur after one of these modifications have been made but before all of them are made.
⚫ shadow-paging
⚫ We assume (initially) that transactions run serially, that is, one after the other.
System log
To recover from a system failure, the system keeps information about the changes made by
transactions in the system log. Two broad strategies are used:
One strategy restores a past copy of the database from backup storage
and redoes the operations of committed transactions, using the backed-up
log, up to the time of failure.
The other strategy undoes and redoes some operations, using the current log, in order to
restore the database to a consistent state. For instance,
If a failure occurs between commit and the database buffers being flushed to secondary
storage, then, to ensure durability, the recovery manager has to redo (roll forward) the
transaction's updates.
If the transaction had not committed at failure time, the recovery manager has to undo
(roll back) any effects of that transaction, for atomicity.
Transaction Log
◦ For recovery from any type of failure data values prior to modification (BFIM -
BeFore Image) and the new value after modification (AFIM – AFter Image) are
required.
◦ These values and other information are stored in a sequential (append-only) file
called the transaction log.
◦ These log files become very useful in bringing the system back to a stable state
after a system crash.
◦ A sample log is given below. Back P and Next P point to the previous and next
log records of the same transaction.
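A hedged Python sketch of what such log records might look like (BFIM/AFIM carried in each
update record) is given below; the record layout is illustrative, not a specific DBMS's log format.

# Hedged sketch of transaction-log records carrying BFIM/AFIM values.

from dataclasses import dataclass
from typing import Optional

@dataclass
class LogRecord:
    lsn: int                     # log sequence number (monotonically increasing)
    txn: str                     # transaction id
    kind: str                    # "start", "update", "commit", "abort"
    item: Optional[str] = None   # database item, for update records
    bfim: Optional[int] = None   # BeFore IMage (old value)
    afim: Optional[int] = None   # AFter IMage (new value)

log = [
    LogRecord(1, "T1", "start"),
    LogRecord(2, "T1", "update", item="X", bfim=100, afim=90),
    LogRecord(3, "T1", "commit"),
]

# Undo uses the BFIM, redo uses the AFIM:
for rec in reversed(log):
    if rec.kind == "update":
        print(f"undo would restore {rec.item} to {rec.bfim}; redo would set it to {rec.afim}")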
Data Caching
Data items to be modified are first stored into database cache by the Cache
Manager (CM) and after modification they are flushed (written) to the disk
If the cache is already full, some buffer replacement policy, such as FIFO, can be
used.
Before a buffer is replaced, the updated value it holds must first be saved to the
appropriate block in the database.
Write-Ahead Logging
When in-place update (immediate or deferred) is used then log is necessary for
recovery
For Undo: before a data item's AFIM is flushed to the database disk
(overwriting the BFIM), its BFIM must be written to the log and the log
must be saved on stable storage (the log disk).
For Redo: Before a transaction executes its commit operation, all its
AFIMs must be written to the log and the log must be saved on a stable
store.
i. Steal: a cache page updated by a transaction can be written (stolen) to disk before the
transaction commits.
Advantage: avoids the need for a very large buffer space to store all updated
pages in memory.
ii. Force: all cache updates are immediately flushed (forced) to disk when
a transaction commits – force writing.
iii. No-Force: cached pages are flushed to disk only when the need arises, after a
transaction has committed.
Advantage: an updated page of a committed transaction may still be in
the buffer when another transaction needs to update it;
if the page is updated by multiple transactions, this eliminates
the I/O cost of reading that page from disk again.
These choices give four possible combinations:
🞄 Steal/No-Force (Undo/Redo)
🞄 Steal/Force (Undo/No-redo)
🞄 No-Steal/No-Force (No-undo/Redo)
🞄 No-Steal/Force (No-undo/No-redo)
Check pointing
The log file is used to recover a failed database, but we may not know how far back in the
log to search. Thus:
From time to time (randomly or under some criterion), the database flushes its buffers to
the database disk and records a checkpoint, to minimize the task of recovery.
When failure occurs, redo all transactions that committed since the checkpoint
and undo all transactions active at time of crash.
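A hedged sketch of this redo/undo decision over a tiny, made-up log follows; the log record layout
and values are assumed for illustration.

# Hedged sketch of checkpoint-based recovery: redo transactions that committed
# after the last checkpoint, undo those still active at the time of the crash.

log = [
    {"kind": "checkpoint", "active": ["T1"]},
    {"kind": "update", "txn": "T1", "item": "X", "bfim": 100, "afim": 90},
    {"kind": "commit", "txn": "T1"},
    {"kind": "update", "txn": "T2", "item": "Y", "bfim": 400, "afim": 340},
    # crash here: T2 never committed
]

db = {"X": 100, "Y": 340}                     # Y was already flushed before the crash

committed = {r["txn"] for r in log if r["kind"] == "commit"}

for rec in log:                               # redo committed transactions' AFIMs
    if rec["kind"] == "update" and rec["txn"] in committed:
        db[rec["item"]] = rec["afim"]

for rec in reversed(log):                     # undo uncommitted transactions' updates
    if rec["kind"] == "update" and rec["txn"] not in committed:
        db[rec["item"]] = rec["bfim"]

print(db)   # {'X': 90, 'Y': 400}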
The number of committed transactions since the last checkpoint.
If a transaction fails, for whatever reason, after updating the database, it must be
rolled back.
If a transaction T is rolled back, any transaction S that has read the value of some
data item X written by T must also be rolled back.
Similarly, once S is rolled back, any transaction R that has read the value of some
data item Y written by S must also be rolled back, and so on (cascading rollback).
At commit point under WAL scheme these updates are saved on database
disk.
After reboot from a failure the log is used to redo all the transactions
affected by this failure.
In a system recovery, transactions that were recorded in the log after the last
checkpoint are redone.
The recovery manager may scan some of the transactions recorded before the checkpoint
to get the AFIMs.
During recovery, all transactions of the commit table are redone and all transactions of
active tables are ignored since none of their AFIMs reached the database.
It is possible that a commit-table transaction may be redone twice, but this does not
create any inconsistency because redo is "idempotent",
that is, applying a redo for an AFIM once is equivalent to applying it multiple
times for the same AFIM.
ii. Recovery Techniques Based on Immediate Update
Undo/No-redo Algorithm
◦ In this algorithm AFIMs of a transaction are flushed to the database disk under
WAL before it commits.
◦ For this reason the recovery manager undoes all transactions during recovery.
◦ No transaction is redone.
Maintain two page tables during the life of a transaction: the current page table and the
shadow page table.
The shadow page table is never changed thereafter and is used to restore the database in the
event of failure.
When the transaction completes, the current page table becomes the shadow page table.
(Figure: shadow paging – the shadow page table still points to the old pages X and Y, while the
current page table points to the new versions X' and Y' in the database.)
🞄 Any change to a database object is first recorded in the log.
🞄 The log record must first be saved to disk before the change is applied to
the database.
🞄 ARIES will retrace all actions of the database system prior to the crash to
reconstruct the database state when the crash occurred.
◦ Deferred Update:
All modified data items in the cache are written out after a transaction ends
its execution or after a fixed number of transactions have completed their
execution.
During commit, the updates are first recorded on the log and then on the
database.
If a transaction fails before reaching its commit point, no undo is needed
because it has not yet changed the database.
If a transaction fails after commit (writing to the log) but before finishing
saving to the database, redoing from the log is needed.
◦ Immediate Update:
These updates are first recorded in the log on disk by force-writing,
before the database is updated.
◦ Shadow update:
The modified version of a data item does not overwrite its disk copy but
is written at a separate disk location.
Thus the old value (before image, BFIM) and the new value (AFIM) are
both kept on disk.
◦ In-place update: The disk version of the data item is overwritten by the cache
version.
Log
◦ The log is a history of the actions executed by the DBMS, kept as a file of records
stored on disk for recovery purposes.
🞄 LSN increases monotonically and indicates the disk address of the log
record it is associated with.
🞄 In addition, each data page stores the LSN of the latest log record
corresponding to a change for that page.
🞄 Updating a page:
i. An update log record describing the change is first written to the log.
ii. The LSN of the page is then set to the LSN of that update log
record.
🞄 commit – when a transaction decides to commit, it force-writes a commit-
type log record containing the transaction ID.
🞄 undo update – when a transaction is rolled back, its updates are undone.
🞄 the transaction ID
◦ For efficient recovery following tables are also stored in the log during
checkpointing:
🞄 Dirty Page table: Contains an entry for each dirty page in the buffer,
which includes the page ID and the LSN corresponding to the earliest
update to that page.
🞄 Writes a begin_checkpoint record in the log
🞄 Writes an end_checkpoint record in the log. With this record the contents
of transaction table and dirty page table are appended to the end of the log.
◦ To reduce the cost of checkpointing and allow the system to continue executing
transactions, ARIES uses "fuzzy checkpointing".
Chapter 6: Distributed Databases and Client-Server Architectures
6.1 Distributed Database Concepts
A transaction can be executed by multiple networked computers in a unified manner.
– The physical placement of data (files, relations, etc.) which is not known to the
user (distribution transparency).
Remark:
• Each site has a DBMS.
– Fragments (replicated or unique).
– Linked by a network.
Advantages of DDB :
Reliability refers to system live time, that is, system is running efficiently most of
the time. Availability is the probability that the system is continuously available
(usable or accessible) during a time interval.
A distributed database system has multiple nodes (computers) and if one fails then
others are available to do the job.
v. Improved performance:
vi. Easier expansion (scalability):
Allows new nodes (computers) to be added anytime without changing the entire
configuration.
i. Complexity – data replication, failure recovery, network management, etc. make the
system more complex than a centralized DBMS.
ii. Cost – since a DDBMS needs more people and more hardware, maintaining and running the
system can be more expensive than a centralized system.
iv. Data integrity and security problem - Because data maintained by distributed systems can
be accessed at locations in the network, controlling the integrity of a database can be
difficult.
There are two approaches to storing a relation in a distributed database: replication
and fragmentation.
I. Data Replication
The system maintains several identical copies (replicas) of the relation and stores each copy
at a different site.
II. Data Fragmentation
• Efficiency – data that is not needed by the local applications is not stored;
– but reconstruction of the whole relation requires accessing data from all sites
containing a part of the relation.
• A relation can be fragmented in two ways:
Horizontal fragmentation
• Consider the Employee relation with the selection condition (DNO = 5). All
rows satisfying this condition form a subset, which is a horizontal
fragment of the Employee relation.
Vertical fragmentation
– Because there is no selection condition for creating a vertical fragment, each fragment
must include the primary key attribute of the parent relation Employee. In this way all
vertical fragments of a relation are connected.
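A hedged Python sketch contrasting the two kinds of fragments over a tiny, made-up Employee
relation is shown below; the attribute names and values are assumed for illustration.

# Hedged sketch: horizontal vs. vertical fragmentation of an Employee relation,
# represented as a list of dicts.

employee = [
    {"ssn": "111", "fname": "Abebe", "lname": "Kebede", "dno": 5},
    {"ssn": "222", "fname": "Sara",  "lname": "Tesfaye", "dno": 4},
]

# Horizontal fragment: the rows satisfying DNO = 5.
emp_d5 = [row for row in employee if row["dno"] == 5]

# Vertical fragment: a subset of columns; the primary key (ssn) is kept in
# every fragment so the original relation can be reconstructed by a join.
emp_names = [{"ssn": r["ssn"], "fname": r["fname"], "lname": r["lname"]} for r in employee]

print(emp_d5)
print(emp_names)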
Representation
• Homogeneous
– All sites of the database system have an identical setup, i.e., the same database system
software.
– The underlying operating systems can be a mixture of Linux, Windows, Unix, etc.
(Figure: a homogeneous distributed database – five sites, each running Oracle on Windows, Unix,
or Linux, connected by a communications network.)
• Heterogeneous
– At least one of the databases is from a different vendor; two variants:
– Federated: Each site may run different database system but the data access is
managed through a single conceptual schema.
• This implies that the degree of local autonomy is minimum. Each site
must adhere to a centralized access policy. There may be a global schema.
(Figure: a heterogeneous distributed database – sites running object-oriented, relational,
hierarchical, and network DBMSs on Unix, Windows, and Linux, connected by a communications
network.)
6.4 Query Processing in Distributed Databases
Issues
• Example: suppose there are three sites, where the Employee relation is at
site 1, the Department relation is at site 2, and no relation is at site 3.
– Employee at site 1: 10,000 rows, row size = 100 bytes, table size
= 10^6 bytes.
– A query is initiated from site 3 to retrieve, for each employee, the first name (15 bytes),
the last name (15 bytes) and the department name (10 bytes), a total of 40 bytes per row.
Assumption
– The result of this query will have 10,000 rows, assuming that every employee is
related to a department.
– Suppose each result row 40 bytes long. The query is submitted at site 3 and the
result is sent to this site.
What is the best strategy to minimize the data transfer cost?
– Transfer Employee to site 2, execute join at site 2 and send the result to site 3.
– Transfer Department relation to site 1, execute the join at site 1, and send the
result to site 3.
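A hedged back-of-the-envelope comparison of the transfer costs can make the choice concrete; the
Employee and result sizes come from the example above, while the Department relation size used
below (100 rows of 35 bytes) is an assumed figure for illustration only.

# Hedged comparison of the data-transfer costs (in bytes) for three strategies.
employee_size = 10_000 * 100        # 1,000,000 bytes at site 1
department_size = 100 * 35          # 3,500 bytes at site 2 (assumed figure)
result_size = 10_000 * 40           # 400,000 bytes, must end up at site 3

strategies = {
    "ship both relations to site 3": employee_size + department_size,
    "ship Employee to site 2, join there, ship result to site 3": employee_size + result_size,
    "ship Department to site 1, join there, ship result to site 3": department_size + result_size,
}

for name, cost in sorted(strategies.items(), key=lambda kv: kv[1]):
    print(f"{cost:>9,d} bytes  {name}")
# Under these assumptions, shipping the small Department relation to site 1 is cheapest.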
– Q': For each department, retrieve the department name and the first name and last name of
the department manager.
• The result of this query will have 100 tuples, assuming that every department has a
manager, the execution strategies are:
– Transfer Employee and Department to the result site and perform the join at site
3.
– Transfer Employee to site 2, execute join at site 2 and send the result to site 3.
– Transfer Department relation to site 1, execute join at site 1 and send the result to
site 3.
1. Transfer Employee relation to site 2, execute the query and present the result to
the user at site 2.
2. Transfer Department relation to site 1, execute join at site 1 and send the result
back to site 2.
Database availability must not be affected due to the failure of one or two
sites and the recovery scheme must recover them before they are available
for use.
Communication link failure:
This failure may create network partition which would affect database
availability even though all database sites may be running.
Distributed commit:
A transaction may be fragmented into sub-transactions that are executed at a number
of sites. This requires a two-phase or three-phase commit approach for
transaction commit.
Distributed deadlock:
Since transactions are processed at multiple sites, two or more sites may
get involved in deadlock. This must be resolved in a distributed manner.
6.5.1 Distributed Concurrency control
i. Primary site technique: A single site is designated as a primary site which serves as
a coordinator for transaction management.
(Figure: a primary site acting as coordinator for the other sites.)
• Transaction management:
– In two phase locking, this site manages locking and releasing data items. If all
transactions follow two-phase policy at all sites, then serializability is guaranteed.
– Advantages:
• An extension to the centralized two phase locking so implementation and
management is simple.
• Data items are locked only at one site but they can be accessed at any site.
– Disadvantages:
• Advantages:
– Since primary copies are distributed at various sites, a single site is not overloaded
with locking and unlocking requests.
• Disadvantages:
• In both approaches a coordinator site or copy may become unavailable. This will require
the selection of a new coordinator.
• Aborts and restarts all active transactions at all sites. Elects a new
coordinator and initiates transaction processing.
• Suspends all active transactions, designates the backup site as the primary
site and identifies a new back up site.
– Primary and backup sites fail or no backup site:
– If majority of sites grant lock then the requesting transaction gets the data item.
(Figure: client-server architecture – multiple servers (server 1 … server n) serving multiple
clients (client 1 … client n) over a network.)
Many Web applications use an architecture called the three-tier architecture, which adds
an intermediate layer between the client and the database server. This intermediate layer
is called the Web server. The Web server plays an intermediary role by storing business
rules (constraints) that are used to access data from the database server.
It can also improve database security by checking a client's credentials before
forwarding a request to the database server. The intermediate server accepts requests
from the client, processes the request and sends database commands to the database
server, and then acts as a conduit for passing (partially) processed data from the database
server to the clients
• Clients contact the server for a desired service, but the server does not contact clients.
• The server software is responsible for local data management at a site, much like
centralized DBMS software.
• Client parses a user query and decomposes it into a number of independent sub-
queries. Each subquery is sent to appropriate site for execution.
• Each server processes its query and sends the result to the client.
• The client combines the results of subqueries and produces the final result.