
DBMS

ENTITY-RELATIONSHIP (E/R) MODEL

• Widely used conceptual-level data model
  - proposed by Peter P. Chen in the 1970s
• Data model used to describe the database system at the requirements-collection stage
  - high-level description
  - easy for enterprise managers to understand
  - rigorous enough to be used for system building
• Concepts available in the model
  - entities and attributes of entities
  - relationships between entities
  - diagrammatic notation
ENTITIES

• Entity: a distinguishable thing (animate or inanimate) of independent physical or conceptual existence.
  - In the university database context, an individual student, a faculty member, a classroom and a course are entities.
• Entity set (or entity type): a collection of entities that all have the same properties.
  - Student entity set: the collection of all student entities.
  - Course entity set: the collection of all course entities.
ATTRIBUTES

• Each entity is described by a set of attributes (properties) that have associated values.
• Example: attributes of the Student entity
  - StudName: name of the student.
  - RollNumber: the roll number of the student.
  - Sex: the gender of the student, etc.
• All entities in an entity set/type have the same set of attributes. The chosen set of attributes determines the amount of detail in the model.
TYPES OF ATTRIBUTES (1/2)

• Simple attributes
  - have atomic (indivisible) values.
  - Example: Dept (a string), PhoneNumber (a ten-digit number).
• Composite attributes
  - have several components in the value.
  - Example: Qualification with components (DegreeName, Year, UniversityName).
• Derived attributes
  - the attribute value depends on some other attribute.
  - Example: Age depends on DateOfBirth, so Age is a derived attribute.
TYPES OF ATTRIBUTES (2/2)

• Single-valued
  - has only one value rather than a set of values.
  - For instance, PlaceOfBirth: a single string value.
• Multi-valued
  - has a set of values rather than a single value.
  - For instance, the CoursesEnrolled, EmailAddress and PreviousDegree attributes of a student.
• Attributes can be:
  - simple single-valued or simple multi-valued,
  - composite single-valued or composite multi-valued.
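The attribute kinds above can be sketched in code. This is only an illustration: the class layout, the field names in snake_case and the sample values are assumptions, not part of the E/R model itself.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Qualification:
    """Composite attribute: one value made of several components."""
    degree_name: str
    year: int
    university_name: str


@dataclass
class Student:
    roll_number: str                                      # simple, single-valued
    place_of_birth: str                                   # simple, single-valued
    date_of_birth: date                                   # stored; Age is derived from it
    email_addresses: set = field(default_factory=set)     # simple, multi-valued
    previous_degrees: list = field(default_factory=list)  # composite, multi-valued

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


# Hypothetical sample data.
ramesh = Student("CS04B125", "Chennai", date(2000, 6, 15))
ramesh.email_addresses.update({"ramesh@example.org", "r.kumar@example.org"})
ramesh.previous_degrees.append(Qualification("B Tech", 2021, "Example University"))
print(ramesh.age(date(2024, 6, 14)))
```

Keeping Age as a method rather than a stored field mirrors the idea that a derived attribute is recomputed from the attribute it depends on.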
DIAGRAMMATIC NOTATION FOR ENTITIES

• entity - rectangle
• attribute - ellipse connected to rectangle
• multi-valued attribute - double ellipse
• composite attribute - ellipse connected to ellipse
• derived attribute - dashed ellipse
DOMAINS OF ATTRIBUTES

• Each attribute takes values from a set called its domain.
  - For instance, studentAge: {17, 18, …, 55}
  - HomeAddress: character strings of length 35
• Domain of a composite attribute:
  - cross product of the domains of its component attributes.
• Domain of a multi-valued attribute:
  - set of subsets of values from the basic domain.
ENTITY SETS AND KEY ATTRIBUTES

• Key: an attribute or a collection of attributes whose value(s) uniquely identify an entity in the entity set.
• For instance,
  - RollNumber: key for the Student entity set
  - EmpID: key for the Faculty entity set
  - HostelName, RoomNo: key for the Student entity set (assuming that each student has a room to themselves)
• A key for an entity set may have more than one attribute.
• An entity set may have more than one key.
• Keys can be determined only from the meaning of the attributes in the entity type.
  - They are determined by the designers.
RELATIONSHIPS

• When two or more entities are associated with each other, we have an instance of a relationship.
• E.g.: student Ramesh enrolls in the Discrete Mathematics course.
• The relationship enrolls has Student and Course as the participating entity sets.
• Formally, enrolls ⊆ Student × Course
  - (s, c) ∈ enrolls ⇔ Student s has enrolled in Course c
  - Tuples in enrolls are relationship instances.
  - enrolls is called a relationship type/set.
DEGREE OF A RELATIONSHIP

• Degree: the number of participating entity sets.
  - Degree 2: binary
  - Degree 3: ternary
  - Degree n: n-ary
• Binary relationships are very common and widely used.
DIAGRAMMATIC NOTATION FOR RELATIONSHIPS

• Relationship: diamond-shaped box
  - The rectangle of each participating entity is connected by a line to this diamond. The name of the relationship is written in the box.
BINARY RELATIONSHIPS AND CARDINALITY RATIO

• The maximum number of entities from E2 that an entity from E1 can possibly be associated with through R (and vice versa) determines the cardinality ratio of R.

• Four possibilities are usually specified:
• one-to-one (1:1)
• one-to-many (1:N)
• many-to-one (N:1)
• many-to-many (M:N)
CARDINALITY RATIOS

• One-to-one: An E1 entity may be associated with at most one E2 entity and, similarly, an E2 entity may be associated with at most one E1 entity.
• One-to-many: An E1 entity may be associated with many E2 entities, whereas an E2 entity may be associated with at most one E1 entity.
• Many-to-one: An E2 entity may be associated with many E1 entities, whereas an E1 entity may be associated with at most one E2 entity.
• Many-to-many: Many E1 entities may be associated with a single E2 entity, and a single E1 entity may be associated with many E2 entities.
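As a rough illustration of these four cases, the sketch below inspects one instance of a binary relationship, given as a set of (E1, E2) pairs, and reports the tightest ratio that instance is consistent with. The function name and sample data are hypothetical. A cardinality ratio is a constraint on the schema, so an instance can only show which ratios are ruled out, not which one the designer intended.

```python
def observed_ratio(pairs):
    """Return the tightest cardinality ratio consistent with this instance."""
    e1_partners, e2_partners = {}, {}
    for e1, e2 in pairs:
        e1_partners.setdefault(e1, set()).add(e2)
        e2_partners.setdefault(e2, set()).add(e1)
    # Does some E1 entity have many E2 partners, and vice versa?
    e1_has_many = any(len(p) > 1 for p in e1_partners.values())
    e2_has_many = any(len(p) > 1 for p in e2_partners.values())
    if e1_has_many and e2_has_many:
        return "M:N"
    if e1_has_many:
        return "1:N"
    if e2_has_many:
        return "N:1"
    return "1:1"


# Hypothetical instances.
enrolls = {("Ramesh", "DiscreteMaths"), ("Ramesh", "Databases"), ("Suresh", "Databases")}
heads = {("Prof1", "CSE"), ("Prof2", "EE")}
print(observed_ratio(enrolls), observed_ratio(heads))
```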
CARDINALITY RATIO – EXAMPLE (ONE-TO-ONE)
CARDINALITY RATIO – EXAMPLE (MANY-TO-ONE/ONE-TO-MANY)
CARDINALITY RATIO – EXAMPLE (MANY-TO-MANY)
PARTICIPATION CONSTRAINTS

• An entity set may participate in a relationship either totally or partially.
• Total participation: every entity in the set is involved in some association (tuple) of the relationship.
• Partial participation: not all entities in the set are involved in associations (tuples) of the relationship.
EXAMPLE OF TOTAL/PARTIAL PARTICIPATION
STRUCTURAL CONSTRAINTS

• Cardinality ratio and participation constraints are together called structural constraints.
• They are called constraints because the data must satisfy them to be consistent with the requirements.
• Min-max notation: a pair of numbers (m, n) placed on the line connecting an entity to the relationship.
  - m: the minimum number of times a particular entity must appear in the relationship tuples at any point of time
    - 0: partial participation
    - ≥ 1: total participation
  - n: similarly, the maximum number of times a particular entity can appear in the relationship tuples at any point of time
COMPARING THE NOTATIONS
ATTRIBUTES FOR RELATIONSHIP TYPES

• Relationship types can also have attributes.
  - These are properties of the association between entities.
  - Example: grade gives the letter grade (S, A, B, etc.) earned by the student for a course.
  - grade is neither an attribute of the student nor of the course.
ATTRIBUTES FOR RELATIONSHIP TYPES – MORE EXAMPLES
RECURSIVE RELATIONSHIPS AND ROLE NAMES

• Recursive relationship: An entity set relating to itself gives rise to a recursive relationship
• E.g., the relationship prereqOf is an example of a recursive relationship on the entity Course
• Role Names – used to specify the exact role in which the entity participates in the relationships
• Essential in case of recursive relationships
• Can be optionally specified in non-recursive cases
WEAK ENTITY SETS

• Weak entity set: an entity set whose members owe their existence to some entity in a strong entity set.
  - Weak entities are not of independent existence.
  - Each weak entity is associated with some entity of the owner entity set through a special relationship.
  - A weak entity set may not have a key attribute.
  - A weak entity type always has total participation (existence dependency) in the relationship, because a weak entity cannot be identified without an owner entity.
  - A weak entity type may have more than one identifying entity type, and an identifying relationship type of degree higher than two.
EXAMPLES…

A popular course may have several sections, each taught by a different professor and having its own classroom and meeting times.
EXAMPLES…

An institute has many pieces of equipment and we would like to keep track of their utilization. Keeping track of the usage is relevant only when the equipment exists!
COMPLETE EXAMPLE FOR E/R SCHEMA:
SPECIFICATIONS (1/2)
• In an educational institute, there are several departments and each student belongs to one of them. Each department has a unique department number, a name, a location and a phone number, and is headed by a professor. Professors have a unique employee Id, a name and a phone number. A professor works for exactly one department.
• We would like to keep track of the following details regarding students: name, unique roll number, sex, phone number, date of birth, age and one or more email addresses. Students have a local address consisting of the hostel name and the room number. They also have a home address consisting of house number, street, city and PIN. It is assumed that all students reside in the hostels.
COMPLETE EXAMPLE FOR E/R SCHEMA:
SPECIFICATIONS (2/2)
• A course taught in a semester of the year is called a section. There can be several sections of the same course in a semester; these are identified by the section number. Each section is taught by a professor and has its own timings and a room to meet in. Students enroll for several sections in a semester.

• Each course has a name, a number of credits and the department that offers it. A course may have other courses as pre-requisites, i.e. courses to be completed before it can be enrolled in.

• Professors also undertake research projects. These are sponsored by funding agencies and have a specific start date, end date and amount of money given. More than one professor can be involved in a project. Also, a professor may be working on several projects simultaneously. A project has a unique projectId.
ENTITIES - STUDENT
ENTITIES – DEPARTMENT AND COURSE
ENTITIES – PROFESSOR, PROJECT AND SECTIONS
E/R DIAGRAM SHOWING RELATIONSHIPS
DESIGN CHOICES: ATTRIBUTE VERSUS RELATIONSHIP
• Should the offering department be an attribute of a course, or should we create a relationship between the Course and Dept entities called, say, offers?
  - The latter approach is preferable when the necessary entity, in this case the Department, already exists.
DESIGN CHOICES: ATTRIBUTE VERSUS RELATIONSHIP
• Should classroom be an attribute of Section, or should we create an entity called ClassRoom and have a relationship, say meetsIn, connecting Section and ClassRoom?
  - In this case, making classRoom an attribute of Section is better, as we do not want to give the classroom so much importance that it becomes an entity.
DESIGN CHOICES

• Weak entity versus composite multi-valued attributes
  - Note that a section could also be modeled as a composite multi-valued attribute of the Course entity.
  - However, if so, a section cannot participate in relationships, such as enrolls with the Student entity.
• In general, if a thing, even though not of independent existence, participates in other relationships on its own, it is best captured as a weak entity.
• If that is not the case, a composite multi-valued attribute may be enough.
TERNARY RELATIONSHIPS

Relationship instance (c, p, j) in supply indicates that company c supplies a component p that is made
use of by the project j
TERNARY RELATIONSHIP

• (c, p) in canSupply, (j, p) in uses and (c, j) in serves do not together imply that (c, p, j) is in supply. The converse, of course, is true: (c, p, j) in supply implies all three binary facts.
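A small counterexample makes the point concrete. The instance below is invented for illustration; the three binary relationships are taken to be the projections of supply, which is an assumption about how they relate.

```python
# Hypothetical instance of the ternary relationship supply(company, part, project).
supply = {
    ("c1", "p1", "j2"),
    ("c1", "p2", "j1"),
    ("c2", "p1", "j1"),
}

# Binary relationships obtained by projecting the ternary one.
can_supply = {(c, p) for c, p, j in supply}
uses = {(j, p) for c, p, j in supply}
serves = {(c, j) for c, p, j in supply}

# (c1, p1, j1) satisfies all three binary relationships...
c, p, j = candidate = ("c1", "p1", "j1")
in_all_binaries = (c, p) in can_supply and (j, p) in uses and (c, j) in serves

# ...yet it is not a tuple of supply.
print(in_all_binaries, candidate in supply)
```

Here c1 can supply p1, project j1 uses p1, and c1 serves j1, but c1 never supplies p1 to j1. The binary relationships lose that information.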
RELATIONAL DATA MODEL & NOTION OF KEYS
RELATIONAL DATA MODEL

• Introduction
  - Proposed by Edgar F. Codd (1923-2003) in the early seventies [Turing Award, 1981]
  - Most modern DBMSs use the relational data model.
  - Simple and elegant model with a mathematical basis.
  - Led to the development of a theory of data dependencies and database design.
• Relational algebra operations
  - play a crucial role in query optimization and execution.
• Laid the foundation for the development of
  - tuple relational calculus and then
  - the database standard SQL.
RELATION SCHEME

• Consists of a relation name and a set of attributes (also called field names or column names). Each attribute has an associated domain.
• Example:
• Domain: a set of atomic (indivisible) values, i.e. a data type.

RELATION INSTANCE

• No duplicate tuples (rows) in a relation instance.
• We shall see later that in SQL, duplicate rows are allowed in tables.
ANOTHER RELATION EXAMPLE

• enrollment (studentName, rollNo, courseNo, sectionNo)

studentName  rollNo    courseNo  sectionNo
Rajesh       CS04B125  CS3200    2
Rajesh       CS04B135  CS3700    1
Suresh       CS04B130  CS3200    2
KEYS FOR A RELATION (1/2)

• Key: a set of attributes K whose values uniquely identify a tuple in any instance, such that none of the proper subsets of K has this property.
• Example: {rollNumber} is a key for the student relation.
• {rollNumber, name}: its values can uniquely identify a tuple,
  - but the set is not minimal,
  - so it is not a key.
• A key cannot be determined from any particular instance data:
  - it is an intrinsic property of a scheme;
  - it can only be determined from the meaning of the attributes.
KEYS FOR A RELATION (2/2)

• A relation can have more than one key.
  - Each of the keys is called a candidate key.
  - Example: book (isbnNo, authorName, title, publisher, year) (assumption: books have only one author)
  - Keys: {isbnNo}, {authorName, title}
• A relation has at least one key:
  - the set of all attributes, in case no proper subset is a key.
• Superkey: a set of attributes that contains a key as a subset.
  - A key can also be defined as a minimal superkey.
• Primary key: one of the candidate keys chosen for indexing purposes (more details later…)
RELATIONAL DATABASE SCHEME AND INSTANCE

• Relational database scheme: D consists of a finite number of relation schemes and a set I of integrity constraints.
• Integrity constraints: necessary conditions to be satisfied by the data values in the relation instances so that the set of data values constitutes a meaningful database
  - domain constraints
  - key constraints
  - referential integrity constraints
• Database instance: a collection of relation instances satisfying the integrity constraints.
DOMAIN AND KEY CONSTRAINTS

• Domain constraints: attributes have associated domains.
  - Domain: a set of atomic data values of a specific type.
  - Constraint: the actual values of an attribute in any tuple must belong to the declared domain.
• Key constraint: a relation scheme has associated keys.
  - Constraint: if K is supposed to be a key for scheme R, no relation instance r on R may have two tuples with identical values for the attributes in K.
  - Also, none of the key attributes can have a null value.
FOREIGN KEYS

• Tuples in one relation, say r1(R1), often need to refer to tuples in another relation, say r2(R2),
  - to capture relationships between entities.
• Let the primary key of R2 be K = {B1, B2, …, Bj}.
• A set of attributes F = {A1, A2, …, Aj} of R1 such that dom(Ai) = dom(Bi), 1 ≤ i ≤ j, and whose values are used to refer to tuples in r2, is called a foreign key in R1 referring to R2.
• R1 and R2 can also be the same scheme.
• There can be more than one foreign key in a relation scheme.
FOREIGN KEY – EXAMPLES (1/2)

Foreign key attribute deptNo of the course relation refers to the primary key attribute deptID of the department relation.
FOREIGN KEY – EXAMPLES(2/2)

• It is possible for a foreign key in a relation to refer to the primary key of the relation itself.
• An example:
  - univEmployee (empNo, name, sex, salary, dept, reportsTo)
  - reportsTo is a foreign key referring to empNo of the same relation.
  - Every employee in the university reports to some other employee for administrative purposes
  - except the vice-chancellor, of course!
REFERENTIAL INTEGRITY CONSTRAINT (RIC)

• Let F be a foreign key in scheme R1 referring to scheme R2, and let K be the primary key of R2.
• RIC: any relation instances r1 on R1 and r2 on R2 must be such that, for any tuple t in r1, either its F-attribute values are all null or they are identical to the K-attribute values of some tuple in r2.
• RIC ensures that references to tuples in r2 are to currently existing tuples.
  - That is, there are no dangling references.
REFERENTIAL INTEGRITY CONSTRAINT (RIC) - EXAMPLE

The new course refers to a non-existent department and thus violates the RIC
EXAMPLE RELATIONAL SCHEME

• student (rollNo, name, degree, year, sex, deptNo, advisor)
  - degree is the program (B Tech, M Tech, M S, Ph D, etc.) that the student has joined.
  - year is the year of admission.
  - advisor is the empId of a faculty member identified as the student’s advisor.
• department (deptId, name, hod, phone)
  - phone is that of the department’s office.
• professor (empId, name, sex, startYear, deptNo, phone)
  - startYear is the year when the faculty member joined the department deptNo.
EXAMPLE RELATIONAL SCHEME

• course (courseId, cname, credits, deptNo)
  - deptNo indicates the department that offers the course.
• enrollment (rollNo, courseId, sem, year, grade)
  - sem can be either “odd” or “even”, indicating the two semesters of an academic year.
  - The value of grade will be null for the current semester and non-null for past semesters.
• teaching (empId, courseId, sem, year, classRoom)
• preRequisite (preReqCourse, courseId)
  - Here, if (c1, c2) is a tuple, it indicates that c1 should be successfully completed before enrolling for c2.
EXAMPLE RELATIONAL SCHEME

• student (rollNo, name, degree, year, sex, deptNo, advisor)


• department (deptId, name, hod, phone)
• professor (empId, name, sex, startYear, deptNo, phone)
• course (courseId, cname, credits, deptNo)
• enrollment (rollNo, courseId, sem, year, grade)
• teaching (empId, courseId, sem, year, classRoom)
• preRequisite (preReqCourse, courseId)
EXAMPLE RELATIONAL SCHEME WITH RICS SHOWN
RELATIONAL ALGEBRA

• A set of operators (unary and binary) that take relation instances as arguments and return new relations.
• Gives a procedural method of specifying a retrieval query.
• Forms the core component of a relational query engine.
• SQL queries are internally translated into RA expressions.
• Provides a framework for query optimization.
• RA operations: select (σ), project (π), cross product (×), union (∪), intersection (∩), difference (−), join (⋈)

THE SELECT OPERATOR
SELECTION CONDITION
EXAMPLES OF SELECT EXPRESSIONS
THE PROJECT OPERATOR
EXAMPLES OF PROJECT EXPRESSIONS
SIZE OF PROJECT EXPRESSION RESULT
SET OPERATORS ON RELATIONS
SET OPERATIONS
CROSS PRODUCT OPERATION
EXAMPLE QUERY USING CROSS PRODUCT
QUERY USING CROSS PRODUCT – USE OF RENAMING
USE OF RENAMING OPERATOR Ρ
JOIN OPERATION
THETA JOIN
EXAMPLES..
EXAMPLE
EQUI-JOIN AND NATURAL JOIN
EXAMPLE – EQUI-JOIN
NATURAL JOIN
DIVISION OPERATOR
EXAMPLES..
QUERY USING DIVISION OPERATION
EXAMPLES..
COMPLETE SET OF OPERATORS
• Retrieve the list of female PhD students
• Obtain the name and rollNo of all female BTech students
• Obtain the rollNo of students who never obtained an ‘E’ grade
EXAMPLE QUERIES
EXAMPLES…
OUTER JOIN OPERATION (1/2)
OUTER JOIN OPERATION (2/2)
LEFT OUTER JOIN
RIGHT OUTER JOIN
FULL OUTER JOIN
• Obtain the names, roll numbers of students who have got S grade in the CS3700 course
offered in 2017 odd semester along with his/her advisor name.
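One way to sanity-check a query like this is to run an SQL equivalent on a tiny database. The sketch below uses Python's built-in sqlite3 module. The tables are cut down to the columns the query needs, and every row is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE professor (empId TEXT PRIMARY KEY, name TEXT);
CREATE TABLE student (rollNo TEXT PRIMARY KEY, name TEXT, advisor TEXT);
CREATE TABLE enrollment (rollNo TEXT, courseId TEXT, sem TEXT, year INTEGER, grade TEXT);
-- Hypothetical sample rows.
INSERT INTO professor VALUES ('E1', 'Prof. Rao'), ('E2', 'Prof. Iyer');
INSERT INTO student VALUES ('CS04B125', 'Rajesh', 'E1'), ('CS04B130', 'Suresh', 'E2');
INSERT INTO enrollment VALUES
  ('CS04B125', 'CS3700', 'odd', 2017, 'S'),
  ('CS04B130', 'CS3700', 'odd', 2017, 'A'),
  ('CS04B130', 'CS3200', 'odd', 2017, 'S');
""")

# Join student with enrollment (on rollNo) and with professor (advisor = empId),
# select the S-grade CS3700 enrollments of the 2017 odd semester, then project.
rows = conn.execute("""
SELECT s.name, s.rollNo, p.name
FROM student s
JOIN enrollment e ON e.rollNo = s.rollNo
JOIN professor p ON p.empId = s.advisor
WHERE e.courseId = 'CS3700' AND e.sem = 'odd' AND e.year = 2017 AND e.grade = 'S'
""").fetchall()
print(rows)
```

In relational algebra terms, the WHERE clause plays the role of the selection, the two JOINs are equi-joins, and the SELECT list is the final projection.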
THE SQL STANDARD
SQL – STRUCTURED QUERY LANGUAGE

• An international standard (ANSI, ISO) that specifies how
  - a relational schema is created
  - data is inserted / updated in the relations
  - data is queried
  - transactions are started and stopped
  - programs access data in the relations
  - and a host of other things are done.
• Every relational database management system (RDBMS) is required to support / implement the SQL standard.
  - RDBMS vendors may provide additional features.
  - Downside of using vendor-specific features: reduced portability.
SEQUEL

• Developed by IBM in the early 70s
  - relational query language developed as part of the System-R project at the IBM San Jose Research Lab
  - the earliest version of SQL
• SQL evolution
  - SQL-86/89
  - SQL-92 (SQL2)
  - SQL-99/03 (SQL3; includes object-relational features)
  - and the evolution continues.
• Disclaimer: this module covers only the important principles of SQL.
COMPONENTS OF SQL STANDARD(1/2)

Data Definition Language (DDL)


• Specifies constructs for schema definition, relation definition, integrity constraints, views
and schema modification.
Data Manipulation Language (DML)
• Specifies constructs for inserting, updating and querying the data in the relational
instances ( or tables ).
Embedded SQL and Dynamic SQL
• Specifies how SQL commands can be embedded in a high-level host language such as C,
C++ or Java for programmatic access to the data.
COMPONENTS OF SQL STANDARD(2/2)

Transaction Control
• Specifies how transactions can be started / stopped, and how a set of concurrently executing transactions can be managed.
Authorization
• Specifies how to restrict a user / set of users to access only certain parts of the data, perform only certain types of queries, etc.
DATA DEFINITION IN SQL
DOMAIN TYPES IN SQL-92 (1/2)
DOMAIN TYPES IN SQL-92 (2/2)

• Date data type
  - DATE type has a 10-position format: YYYY-MM-DD
• Time data type
  - TIME type has an 8-position format: HH:MM:SS
• Others
  - There are several more data types, whose details are available in SQL reference books.
SPECIFYING INTEGRITY CONSTRAINTS IN SQL
SPECIFYING REFERENTIAL INTEGRITY CONSTRAINTS

• FOREIGN KEY (A1) REFERENCES r2 (B1)
  - specifies that attribute A1 of the table being defined, say r1, is a foreign key referring to attribute B1 of table r2
  - recall that this means: each value of column A1 is either null or is one of the values appearing in column B1 of r2
SPECIFYING WHAT TO DO IF RIC VIOLATION OCCURS
• RIC violation
  - can occur if a referenced tuple is deleted or modified
  - an action can be specified for each case using the qualifiers ON DELETE or ON UPDATE
• Actions
  - three possibilities can be specified: SET NULL, SET DEFAULT, CASCADE
  - these are actions to be taken on the referencing tuple
  - SET NULL: the foreign key attribute value is set to null
  - SET DEFAULT: the foreign key attribute value is set to its default value
  - CASCADE: delete the referencing tuple if the referenced tuple is deleted, or update the FK attribute if the referenced tuple is updated
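These actions can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the tables are cut down to two columns each, the rows are invented, and SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on. Other systems may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (deptId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course (
    courseId TEXT PRIMARY KEY,
    deptNo INTEGER,
    FOREIGN KEY (deptNo) REFERENCES department (deptId) ON DELETE SET NULL
);
CREATE TABLE professor (
    empId TEXT PRIMARY KEY,
    deptNo INTEGER,
    FOREIGN KEY (deptNo) REFERENCES department (deptId) ON DELETE CASCADE
);
-- Hypothetical sample rows.
INSERT INTO department VALUES (1, 'CSE'), (2, 'EE');
INSERT INTO course VALUES ('CS3700', 1), ('EE2001', 2);
INSERT INTO professor VALUES ('E1', 1), ('E2', 2);
""")

# A course that refers to a non-existent department violates the RIC.
try:
    conn.execute("INSERT INTO course VALUES ('XX1000', 99)")
    violation_rejected = False
except sqlite3.IntegrityError:
    violation_rejected = True

# Deleting a referenced department triggers the declared actions.
conn.execute("DELETE FROM department WHERE deptId = 1")
courses = conn.execute("SELECT courseId, deptNo FROM course ORDER BY courseId").fetchall()
professors = conn.execute("SELECT empId FROM professor ORDER BY empId").fetchall()
print(violation_rejected, courses, professors)
```

After the delete, the CSE course keeps its row with a null deptNo (SET NULL), while the CSE professor row disappears (CASCADE).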
TABLE DEFINITION EXAMPLE
MODIFYING A DEFINED SCHEMA

• The ALTER TABLE command can be used to modify a schema.
• Adding a new attribute
  - ALTER TABLE student ADD address VARCHAR(30);
• Deleting an attribute
  - need to specify what should be done about views or constraints that refer to the attribute being dropped
  - two possibilities:
    - CASCADE: delete the views/constraints also
    - RESTRICT: do not delete the attribute if there are views/constraints that refer to it
  - ALTER TABLE student DROP degree RESTRICT;
• Similarly, an entire table definition can be deleted.
DATA MANIPULATION IN SQL
MEANING OF THE BASIC QUERY BLOCK

• The cross product M of the tables in the from clause is considered.
• Tuples in M that satisfy the condition q are selected.
• For each such tuple, values for the attributes A1, A2, …, Am (mentioned in the select clause) are projected.
• This is a conceptual description
  - in practice, more efficient methods are employed for evaluation.
• The word select in SQL should not be confused with the select operation of relational algebra.
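The conceptual evaluation described above can be written out literally. The sketch below is a naive illustration, not how a real engine evaluates queries. The helper name, the dictionary representation of tuples and the sample rows are all assumptions, and attribute names are kept distinct across tables so that merged rows do not clash.

```python
from itertools import product

# Hypothetical table instances; each tuple is a dict from attribute name to value.
student = [
    {"rollNo": "CS04B125", "name": "Rajesh", "deptNo": 1},
    {"rollNo": "CS04B130", "name": "Suresh", "deptNo": 2},
]
department = [
    {"deptId": 1, "dname": "CSE"},
    {"deptId": 2, "dname": "EE"},
]


def basic_query_block(tables, condition, attributes):
    """Cross product of the tables, then selection on the condition, then projection."""
    result = []
    for combination in product(*tables):   # cross product M
        row = {}
        for part in combination:
            row.update(part)
        if condition(row):                  # keep tuples satisfying q
            result.append(tuple(row[a] for a in attributes))  # project A1..Am
    return result                           # a list: duplicates are kept, as in SQL


rows = basic_query_block(
    [student, department],
    lambda r: r["deptNo"] == r["deptId"],
    ["name", "dname"],
)
print(rows)
```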
SQL QUERY RESULT
EXAMPLE QUERIES INVOLVING A SINGLE TABLE
EXAMPLES INVOLVING TWO OR MORE RELATIONS (1/2)
EXAMPLES INVOLVING TWO OR MORE RELATIONS (2/2)
NESTED QUERIES OR SUBQUERIES
NESTED QUERY EXAMPLE
SET COMPARISON OPERATORS
SEMANTICS OF SET COMPARISON OPERATORS
CORRELATED AND UNCORRELATED NESTED QUERIES
EXAMPLE OF A CORRELATED SUBQUERY
THE EXISTS OPERATOR
THE NOT EXISTS OPERATOR
Functional Dependency

Definition
A functional dependency is a constraint between two sets of attributes in a relation from a database.

Given a relation R, a set of attributes X in R is said to functionally determine another attribute Y, also in R (written X → Y), if and only if each X value is associated with at most one Y value.
In other words….

X is the determinant set and Y is the dependent attribute. Thus, given a tuple and the values of the attributes in X, one can determine the corresponding value of the Y attribute.
Relation vs. Function
• A function is a relation in which every input corresponds to a single output. This is best explained visually. In the figure, you see two relations, expressed as diagrams called relation maps.
• Both have the same domain, {0, 1, 2}, and range, {1, 2, 4}, but relation 1 is a function, while relation 2 is not.
Functional Dependency

A  B
1  a
2  b
3  c
4  c

Functional Dependency
Which of the following is true?

1 -> a
1 -> b
2 -> a

If there is a functional dependency A -> B, then if A is known, we can find B.
FD Definition

• R(A, B), with A ⊆ R, B ⊆ R
• Let A -> B hold.
• If t1(A) = t2(A), then t1(B) = t2(B).

Tuple  A  B
t1     a  b
t2     a  a

Note that this sample instance violates A -> B: t1(A) = t2(A) = a, but t1(B) = b while t2(B) = a.
FD Types

For an FD A -> B:
• Trivial (no new element derived): B ⊆ A
  - αβ -> α
  - {Name, Roll No, Course No} -> {Course No}
• Non-trivial (new element derived): B ⊄ A
  - αβ -> µ
Functional Dependency
Example

A  B  C  D  E
a  2  3  4  5
2  a  3  4  5
a  2  3  6  5
a  2  3  6  6

Find the true FDs:

A -> BC
DE -> C
C -> DE
BC -> A
Functional Dependency
Example

A  B  C  D  E
a  2  3  4  5
2  a  3  4  5
a  2  3  6  5
a  2  3  6  6

A -> BC (Yes)
DE -> C (Yes)
C -> DE (No)
BC -> A (Yes)

For α -> β: if the values of α are unique, then the dependency always holds.
Other FDs that hold in this instance: A -> C, BD -> C, ABDE -> C.
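The answers above can be checked mechanically. The helper below (its name is an assumption) tests whether an FD holds in one given instance. Passing this test only shows that the instance does not contradict the FD. Whether the FD holds on the schema depends on the meaning of the attributes.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value.
        if seen.setdefault(x, y) != y:
            return False
    return True


# The instance from the example, one dict per tuple.
lines = ["a 2 3 4 5", "2 a 3 4 5", "a 2 3 6 5", "a 2 3 6 6"]
rows = [dict(zip("ABCDE", line.split())) for line in lines]

for lhs, rhs in [("A", "BC"), ("DE", "C"), ("C", "DE"), ("BC", "A")]:
    print(f"{lhs} -> {rhs}:", fd_holds(rows, lhs, rhs))
```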
FDs – Examples
Consider the scheme preRequisite(preReqCourse, courseId)

Does preReqCourse → courseId ?

No, as a course might be a pre-requisite for many courses.

Does courseId → preReqCourse ?

No, a course may have many pre-requisite courses.

So, it is possible that no non-trivial FDs hold on some schema.


Closure

• R(A B C)
• A -> B
• A -> BC
• B -> C
• (A)+ = (A B C)
• (B)+ = (B C)
• (C)+ = (C)
Closure

Let a relation R have some functional dependencies F specified. The closure of F (usually written as F+) is the set of all functional dependencies that may be logically derived from F.
Often F is the set of the most obvious and important functional dependencies, and F+, the closure, is the set of all the functional dependencies, including F and those that can be deduced from F.
The closure is important and may, for example, be needed in finding one or more candidate keys of the relation.
More Closure Examples

• R(A B C D E F G)
  - 1: A -> B
  - 2: BC -> DE
  - 3: AEG -> G
• Find (AC)+
  - Using rule 1: (AC)+ = ABC
  - Using rule 2: (AC)+ = ABCDE
  - Rule 3 is not applicable.
• R(A B C D E F)
  - AB -> C
  - BC -> AD
  - D -> E
  - CF -> B
• (AB)+ = ?

Repeat the process until two consecutive iterations give the same result.
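The iteration can be sketched as a short fixed-point loop. The function name and the encoding of each FD as a (left-hand side, right-hand side) pair of strings are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Attribute closure: apply FDs until a full pass adds nothing new."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # An FD applies when its whole left-hand side is already in the result.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds_first = [("A", "B"), ("BC", "DE"), ("AEG", "G")]
fds_second = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]

print("".join(sorted(closure("AC", fds_first))))
print("".join(sorted(closure("AB", fds_second))))
```

For the second example, AB -> C adds C, BC -> AD adds D, and D -> E adds E. CF -> B never applies because F is not reachable.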


Armstrong's Axioms
Developed by Armstrong in 1974, these are six rules (axioms) from which all possible functional dependencies may be derived.
Axioms Cont.

1. Reflexivity Rule --- If X is a set of attributes and Y is a subset of X, then X → Y holds.
   Each subset of X is functionally dependent on X.
   Example: ABC → AB
2. Augmentation Rule --- If X → Y holds and W is a set of attributes, then WX → WY holds.
3. Transitivity Rule --- If X → Y and Y → Z hold, then X → Z holds.
Derived Theorems from Axioms
4. Union Rule --- If X → Y and X → Z hold, then X → YZ holds.
5. Decomposition Rule --- If X → YZ holds, then so do X → Y and X → Z.
6. Pseudo-transitivity Rule --- If X → Y and WY → Z hold, then so does WX → Z.
Check whether the following is true?
Given: AB->C
Can we write:
A->C ?
B->C ?
7: Composition*
If X->Y and Z->W hold, then XZ->YW holds.
• R(A B C D E G H)
  - A -> BC
  - CD -> E
  - E -> C
  - D -> AEH
  - ABH -> BD
  - DH -> BC
• Check whether BCD -> H holds.
  - (BCD)+ = ABCDEH, which contains H, so BCD -> H holds.
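Equivalence of FD
• R(A C D E H)

F: A -> C, AC -> D, E -> AD, E -> H
G: A -> CD, E -> AH

Find (F)+ using G (closures of the left-hand sides of F, computed with G):
(A)+ = ACD
(AC)+ = ACD
(E)+ = EAHCD

Find (G)+ using F (closures of the left-hand sides of G, computed with F):
(A)+ = ACD
(E)+ = EAHCD

Which of the following holds?
(a) F ⊆ G
(b) G ⊆ F
(c) F = G
(d) F ≠ G

Every FD in F follows from G and every FD in G follows from F, so the two sets are equivalent: option (c).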
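The same check can be automated: one FD set covers another if, for every FD X -> Y in the second set, Y is contained in the closure of X under the first. The helper names and the string encoding of FDs below are assumptions for this sketch.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD in g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]

equivalent = covers(F, G) and covers(G, F)
print(covers(G, F), covers(F, G), equivalent)
```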
KEYS in DBMS

• Keys are one of the basic requirements of a relational database model.
• They are widely used to identify the tuples (rows) of a table uniquely. We also use keys to set up relationships among the various columns and tables of a relational database.

A  B  C
1  a  p
2  b  q
3  c  q
4  c  r

• Key = A? This means A -> BC, i.e. (A)+ = R.
• Key = BC? This means (BC)+ = R.
Example
R (A B C D) R (A B C D E)
R (A B C D) R (A B C D)
AB->CD AB->CD
A->B AB->CD
C->A D->A
B->C D->A
D->B BC->DE
C->A

(D)+=D R (A B C D E F)
(AD)+= ABCD AB->C
(BD)+= DC->AE
(CD)+ E->F

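1: Find the essential attributes (those that do not appear on the right-hand side of any FD).
2: Form combinations of the essential attributes with the remaining attributes.
3: Find the combinations whose closure generates R.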
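The three steps can be sketched as follows. The function names are assumptions. The first FD set is a hypothetical one chosen for illustration, and the second is R(A B C D E F) with AB -> C, DC -> AE, E -> F as read from the example list above. The search tries combinations in order of increasing size, so it is exponential in the worst case and meant only for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    all_attrs = set(attributes)
    # Step 1: essential attributes never appear on a right-hand side.
    on_rhs = set().union(*(set(rhs) for _, rhs in fds))
    essential = all_attrs - on_rhs
    others = sorted(all_attrs - essential)
    keys = []
    # Step 2: extend the essential attributes with combinations of the rest.
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            candidate = essential | set(extra)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so not minimal
            # Step 3: keep the combinations whose closure is all of R.
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return sorted("".join(sorted(key)) for key in keys)


print(candidate_keys("ABCD", [("A", "B"), ("B", "C"), ("C", "A")]))
print(candidate_keys("ABCDEF", [("AB", "C"), ("DC", "AE"), ("E", "F")]))
```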
Normalization
• Anomalies are problems that can occur in poorly planned, un-normalised databases where all the data is stored in one table, so that database operations such as insert, delete and update result in errors or discrepancies in the database.
• Types of anomalies:
  - Insert anomaly
  - Delete anomaly
  - Update anomaly
Schema Analysis: EmpDept

• Insert anomaly:
  - The first problem students usually identify with the EmpDept schema is that it combines two different ideas: employee information and department information.
  - But what is wrong with this?
  - Suppose we want to open a new department and thus want to insert a DeptName, but the department does not have any employees yet, so the EID (primary key) and Name fields will have no values.
  - Without a value in the primary key, the record cannot exist.
  - So we cannot insert a new DeptName in this scheme.
  - This is an insert anomaly.

EID  Name   DeptID  DeptName
A01  Ali    12      CSE
A12  Eric   10      ECE
A13  Jack   12      CSE
A03  Tyler  12      ECE
Insert Anomaly
• An insert anomaly occurs when certain attributes cannot be inserted into the database without the presence of other attributes.
• Example: a schema for courses and room-teacher allotment.

Course_no  Tutor   Room  Room_size  En_limit
353        Smith   A532  45         40
351        Smith   C320  100        60
355        Clark   H940  400        300
456        Turner  H940  400        45
Insert Anomaly

Course_no  Tutor   Room  Room_size  En_limit
353        Smith   A532  45         40
351        Smith   C320  100        60
355        Clark   C320  400        300
456        Turner  H940  400        45

E.g. we have built a new room (C321), but it has not yet been timetabled for any courses or members of staff.
>> Inserting C321 is a problem here!
Delete Anomaly
A delete anomaly exists when certain attributes are lost because of the deletion of other attributes.
The presence of such an anomaly clearly suggests that we have not designed our database properly.
Delete Anomaly

Course_no  Tutor   Room  Room_size  En_limit
353        Smith   A532  45         40
351        Smith   C320  100        60
355        Clark   H940  400        300
456        Turner  H940  400        45

E.g. if we remove the row for course_no 351 from the above table, the details of room C320 get deleted too.
Conversely, the details of room C320 cannot be removed without also deleting the corresponding course.
Update Anomaly
 An Update Anomaly exists when one or more instances of duplicated
data is updated, but not all.
Update Anomaly

Course_no Tutor Room Room_size En_limit
353 Smith A532 45 40
351 Smith C320 100 60
355 Clark H940 400 300
456 Turner H940 400 45

e.g. Room H940 has been improved and now has Room_size =
500. To record this single change, we have to update every
row where Room = H940.

*If some of those rows are missed, the data becomes inconsistent.


Normalization & Design objectives
 The basic objective of normalization is to reduce the various anomalies in the
database.
 Database design & Normalization can be looked upon as a process of analyzing the
given relation schemas based on their FDs and primary keys to achieve the desirable
properties of:

Minimizing redundancy
Minimizing the insertion, deletion, and update
anomalies.
Normalization is Principle of Decomposition

Original table:

S_id Name Age BranchCode B_Name HoD
1 A 18 101 CSE XYZ
2 B 19 102 ECE PQR
3 C 19 101 CSE XYZ
4 D 18 101 CSE XYZ
5 E 20 102 ECE PQR
6 F 19 102 ECE PQR

Decomposed into two tables:

S_id Name Age BranchCode
1 A 18 101
2 B 19 102
3 C 19 101
4 D 18 101
5 E 20 102
6 F 19 102

BCode B_Name HoD
101 CSE XYZ
102 ECE PQR
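The decomposition above can be checked with a short sketch. It projects the student rows onto the two smaller schemas and then joins them back on the branch code. The variable names are illustrative assumptions, not part of the slides.

```python
# Rows of the original table: (S_id, Name, Age, BranchCode, B_Name, HoD)
students_full = [
    (1, "A", 18, 101, "CSE", "XYZ"),
    (2, "B", 19, 102, "ECE", "PQR"),
    (3, "C", 19, 101, "CSE", "XYZ"),
    (4, "D", 18, 101, "CSE", "XYZ"),
    (5, "E", 20, 102, "ECE", "PQR"),
    (6, "F", 19, 102, "ECE", "PQR"),
]

# Project onto the two smaller schemas; sets drop the repeated branch rows.
student = sorted({(s, n, a, b) for (s, n, a, b, _, _) in students_full})
branch = sorted({(b, bn, h) for (_, _, _, b, bn, h) in students_full})

# A natural join on the branch code rebuilds the original rows.
rejoined = sorted(
    (s, n, a, b, bn, h)
    for (s, n, a, b) in student
    for (b2, bn, h) in branch
    if b == b2
)

print(len(branch), "branch rows instead of", len(students_full))
print("join restores the original table:", rejoined == sorted(students_full))
```

Each branch name and HoD is now stored once, so a change of HoD touches a single row.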
Definition
 This is the process which allows you to remove
redundant data from your database.
 This involves restructuring the tables so that they
successively meet higher forms of
Normalization.
 A properly normalized database should have
the following characteristics
 Scalar values in each field
 Absence of redundancy.
 Minimal use of null values.
 Minimal loss of information.
Levels of Normalization
 Levels of normalization are based on the amount of
redundancy in the database.
 Various levels of normalization are:
 First Normal Form (1NF)
 Second Normal Form (2NF)
 Third Normal Form (3NF)
 Boyce-Codd Normal Form (BCNF)
 Fourth Normal Form (4NF)

[Figure: arrows beside the list labeled Redundancy, Number of Tables, and Complexity]

Most databases should be in 3NF or BCNF in order to avoid
the database anomalies.
Functional Dependencies

EmpNum EmpEmail EmpFname EmpLname


123 [email protected] John Doe
456 [email protected] Peter Smith
555 [email protected] Alan Lee
633 [email protected] Peter Doe
787 [email protected] Alan Lee

If EmpNum is the PK then the FDs:

EmpNum -> EmpEmail
EmpNum -> EmpFname
EmpNum -> EmpLname
must hold.

Functional Dependencies
Three different ways you might see FDs depicted:
1. As a list:
EmpNum -> EmpEmail
EmpNum -> EmpFname
EmpNum -> EmpLname
2. As a diagram with one arrow from EmpNum to each of EmpEmail, EmpFname, and EmpLname.
3. As arrows drawn over the table heading: EmpNum EmpEmail EmpFname EmpLname
Determinant
Functional Dependency

EmpNum -> EmpEmail

The attribute on the LHS is known as the determinant.
• EmpNum is a determinant of EmpEmail
Transitive dependency
Consider attributes A, B, and C, where
A -> B and B -> C.
Functional dependencies are transitive, which
means that we also have the functional dependency
A -> C
We say that C is transitively dependent on A
through B.
Transitive dependency
EmpNum EmpEmail DeptNum DeptName

EmpNum -> DeptNum
DeptNum -> DeptName

DeptName is transitively dependent on EmpNum via DeptNum:
EmpNum -> DeptName
Partial dependency
A partial dependency exists when a non-prime attribute B is
functionally dependent on an attribute A, and A is a proper
component of a multipart candidate key.

InvNum LineNum Qty InvDate

Candidate key: {InvNum, LineNum}
InvDate is partially dependent on {InvNum, LineNum}, as
InvNum is a determinant of InvDate and InvNum is
part of a candidate key.
First Normal Form
 Some definitions of 1NF, most notably that of Edgar F. Codd, make
reference to the concept of atomicity.

 Codd states that the "values in the domains on which each relation is
defined are required to be atomic with respect to the DBMS."

 Codd defines an atomic value as one that "cannot be decomposed into
smaller pieces by the DBMS (excluding certain special functions)."

 Meaning a field should not be divided into parts with more than one kind
of data in it such that what one part means to the DBMS depends on
another part of the same field.
First Normal Form
 Domain is atomic if its elements are considered to be indivisible units
 Examples of non-atomic domains:
 Set of names, composite attributes
 Identification numbers like CS101 that can be broken up into parts
 A relational schema R is in first normal form if the domains of all
attributes of R are atomic.
Not in 1NF (Course holds multiple values):
Roll Number  Name   Course
101          Modi   CN, OS
102          Sonia  DBMS, CAO

In 1NF (one value per field):
Roll Number  Name   Course
101          Modi   CN
101          Modi   OS
102          Sonia  DBMS
102          Sonia  CAO
First Normal Form (1NF)
A table is considered to be in 1NF if all the fields contain
only scalar values (as opposed to list of values).
Example (Not 1NF)

ISBN           Title         AuName                  AuPhone                                   PubName      PubPhone      Price
0-321-32132-1  Balloon       Sleepy, Snoopy, Grumpy  321-321-1111, 232-234-1234, 665-235-6532  Small House  714-000-0000  $34.00
0-55-123456-9  Main Street   Jones, Smith            123-333-3333, 654-223-3455                Small House  714-000-0000  $22.95
0-123-45678-0  Ulysses       Joyce                   666-666-6666                              Alpha Press  999-999-9999  $34.00
1-22-233700-0  Visual Basic  Roman                   444-444-4444                              Big House    123-456-7890  $25.00

The AuName and AuPhone columns are not scalar.


1NF - Decomposition
1. Place all items that appear in the repeating group in a new
table
2. Designate a primary key for each new table produced.
3. Duplicate in the new table the primary key of the table from
which the repeating group was extracted or vice versa.
Example (1NF)
ISBN           Title         PubName      PubPhone      Price
0-321-32132-1  Balloon       Small House  714-000-0000  $34.00
0-55-123456-9  Main Street   Small House  714-000-0000  $22.95
0-123-45678-0  Ulysses       Alpha Press  999-999-9999  $34.00
1-22-233700-0  Visual Basic  Big House    123-456-7890  $25.00

ISBN           AuName  AuPhone
0-321-32132-1  Sleepy  321-321-1111
0-321-32132-1  Snoopy  232-234-1234
0-321-32132-1  Grumpy  665-235-6532
0-55-123456-9  Jones   123-333-3333
0-55-123456-9  Smith   654-223-3455
0-123-45678-0  Joyce   666-666-6666
1-22-233700-0  Roman   444-444-4444
Functional Dependencies
1. If one set of attributes in a table determines another
set of attributes in the table, then the second set of
attributes is said to be functionally dependent on the
first set of attributes.

Example 1
Table Scheme: {ISBN, Title, Price}
Functional Dependencies: {ISBN} -> {Title}
{ISBN} -> {Price}

ISBN           Title         Price
0-321-32132-1  Balloon       $34.00
0-55-123456-9  Main Street   $22.95
0-123-45678-0  Ulysses       $34.00
1-22-233700-0  Visual Basic  $25.00
Functional Dependencies
Example 2
Table Scheme: {PubID, PubName, PubPhone}
Functional Dependencies: {PubId} -> {PubPhone}
{PubId} -> {PubName}
{PubName, PubPhone} -> {PubID}

PubID  PubName      PubPhone
1      Big House    999-999-9999
2      Small House  123-456-7890
3      Alpha Press  111-111-1111

Example 3
Table Scheme: {AuID, AuName, AuPhone}
Functional Dependencies: {AuId} -> {AuPhone}
{AuId} -> {AuName}
{AuName, AuPhone} -> {AuID}

AuID  AuName  AuPhone
1     Sleepy  321-321-1111
2     Snoopy  232-234-1234
3     Grumpy  665-235-6532
4     Jones   123-333-3333
5     Smith   654-223-3455
6     Joyce   666-666-6666
7     Roman   444-444-4444
FD – Example
Database to track reviews of papers submitted to an academic
conference. Prospective authors submit papers for review and
possible acceptance in the published conference proceedings.
Details of the entities
 Author information includes a unique author number, a name, a mailing
address, and a unique (optional) email address.
 Paper information includes the primary author, the paper number, the
title, the abstract, and review status (pending, accepted,rejected)
 Reviewer information includes the reviewer number, the name, the
mailing address, and a unique (optional) email address
 A completed review includes the reviewer number, the date, the paper
number, comments to the authors, comments to the program
chairperson, and ratings (overall, originality, correctness, style, clarity)
FD – Example
Functional Dependencies
 AuthNo -> AuthName, AuthEmail, AuthAddress
 AuthEmail -> AuthNo
 PaperNo -> Primary-AuthNo, Title, Abstract, Status
 RevNo -> RevName, RevEmail, RevAddress
 RevEmail -> RevNo
 RevNo, PaperNo -> AuthComm, Prog-Comm, Date,
Rating1, Rating2, Rating3, Rating4, Rating5
R( A B C D)
FD:
AB->D
B->C

(AB)+ = ABCD
AB: Candidate Key
Prime Attributes: A & B
Non-Prime Attributes: C & D

B->C is a partial dependency: C depends only on B,
not on the entire candidate key AB.
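The closure (AB)+ used above can be computed mechanically. The sketch below is a minimal version of the standard attribute-closure procedure; the function name `closure` and the representation of FDs as (lhs, rhs) string pairs are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is an iterable of
    single-letter attribute names, e.g. ("AB", "D") for AB -> D.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [("AB", "D"), ("B", "C")]      # AB -> D, B -> C
ab_closure = closure("AB", fds)      # all four attributes, so AB is a key
b_closure = closure("B", fds)        # B alone already determines C

print(sorted(ab_closure))
print(sorted(b_closure))
```

Because B on its own reaches C, the dependency B->C involves only part of the key AB, which is what makes it partial.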
Conversion to 2 NF

R( A B C D)
FD:
AB->D
B->C

R1 (A B D), FD: AB->D
R2 (B C), FD: B->C
A B C
a 1 x
b 2 y
a 3 z
c 3 z
d 3 z
e 3 z

R (A B C)
FD: B->C

Candidate Key: ?
Decomposition Example
A B C
a 1 x
b 2 y
a 3 z
c 3 z
d 3 z
e 3 z

R (A B C)
FD: B->C
Candidate Key: AB, since (AB)+ = ABC
Decomposition: R1 (A B) and R2 (B C), with FD: B->C

R1 (A B)
A B
a 1
b 2
a 3
c 3
d 3
e 3

R2 (B C)
B C
1 x
2 y
3 z
Second Normal Form (2NF)
For a table to be in 2NF, there are two requirements
 The table is in first normal form
 All non-key attributes in the table must be fully functionally dependent on the entire
primary key
Note: Remember that we are dealing with non-key attributes

Example 1 (Not 2NF)
Scheme -> {Title, PubId, AuId, Price, AuAddress}
1. Key -> {Title, PubId, AuId}
2. {Title, PubId, AuID} -> {Price}
3. {AuID} -> {AuAddress}
4. AuAddress does not belong to a key
5. AuAddress functionally depends on AuId which is a proper subset of the key
Second Normal Form (2NF)
Example 2 (Not 2NF)
Scheme -> {City, Street, HouseNumber, HouseColor, CityPopulation}
1. Key -> {City, Street, HouseNumber}
2. {City, Street, HouseNumber} -> {HouseColor}
3. {City} -> {CityPopulation}
4. CityPopulation does not belong to any key.
5. CityPopulation is functionally dependent on City which is a proper subset of the key

Example 3 (Not 2NF)
Scheme -> {studio, movie, budget, studio_city}
1. Key -> {studio, movie}
2. {studio, movie} -> {budget}
3. {studio} -> {studio_city}
4. studio_city is not a part of a key
5. studio_city functionally depends on studio which is a proper subset of the key
2NF - Decomposition
1. If a data item is fully functionally dependent on only a part of the
primary key, move that data item and that part of the primary key
to a new table.
2. If other data items are functionally dependent on the same part of
the key, place them in the new table also
3. Make the partial primary key copied from the original table the
primary key for the new table.
Example 1 (Convert to 2NF)
Old Scheme -> {Title, PubId, AuId, Price, AuAddress}
New Scheme -> {Title, PubId, AuId, Price}
New Scheme -> {AuId, AuAddress}
2NF - Decomposition
Example 2 (Convert to 2NF)
Old Scheme -> {Studio, Movie, Budget, StudioCity}
New Scheme -> {Movie, Studio, Budget}
New Scheme -> {Studio, StudioCity}

Example 3 (Convert to 2NF)
Old Scheme -> {City, Street, HouseNumber, HouseColor, CityPopulation}
New Scheme -> {City, Street, HouseNumber, HouseColor}
New Scheme -> {City, CityPopulation}
Transitive Dependency
A B C
a 1 x
b 1 x
c 1 x
d 2 y
e 2 y
f 3 z
g 3 z

R (A B C)
FD:
A->B
B->C

Let's find the candidate key:
(A)+ = ABC
Candidate Key: A
Decomposition to 3 NF
A B C
a 1 x
b 1 x
c 1 x
d 2 y
e 2 y
f 3 z
g 3 z

Decomposed tables:

A B
a 1
b 1
c 1
d 2
e 2
f 3
g 3

B C
1 x
2 y
3 z

To rule out a transitive dependency, every FD α -> β must satisfy:
Either α is a Super key
Or β is a Prime Attribute
To Check 3 NF

P = prime attribute, NP = non-prime attribute
P.D. (partial dependency): P->NP
T.D. (transitive dependency): NP->NP

For 3 NF, every FD α -> β must satisfy:
Either α is a Super key
Or β is a Prime Attribute

R (A B C D E)
FD:
A->B
B->E
C->D
Find Candidate Key:
(AC)+ = ABCDE
Candidate Key: AC

Decomposition:
R1 (A B E) -> R11 (A B), R12 (B E)
R2 (C D)
R3 (A C)

Final Decomposition:
R11 (A B)
R12 (B E)
R2 (C D)
R3 (A C)
Check for 3NF

R (A B C D E F G H I J)
FD:
AB->C
A->DE
B->F
F->GH
D->IJ

Candidate Key: AB

Decomposition:
R1 (A D E I J) -> R11 (A D E), R12 (D I J)
R2 (B F G H) -> R21 (B F), R22 (F G H)
R3 (A B C)

Final Decomposition
R11 (A D E)
R12 (D I J)
R21 (B F)
R22 (F G H)
R3 (A B C)
Check for 3NF

R ( A B C D E F G H I J)
FD:
AB->C
A->DE
B->F
F->GH
D->IJ

Candidate Key: AB
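The candidate key of this relation can be found by brute force: try attribute sets from smallest to largest and keep those whose closure covers the whole relation. This is only a sketch that is practical for small schemas; the helper names `closure` and `candidate_keys` are illustrative assumptions.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the full relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


fds = [("AB", "C"), ("A", "DE"), ("B", "F"), ("F", "GH"), ("D", "IJ")]
keys = candidate_keys("ABCDEFGHIJ", fds)
print([sorted(key) for key in keys])
```

Neither A nor B appears on the right-hand side of any FD, so both must be in every key, which is why the search returns AB alone.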
Third Normal Form (3NF)
This form dictates that all non-key attributes of a table must be functionally
dependent on a candidate key and on nothing else, i.e. there can be no
interdependencies among non-key attributes.

For a table to be in 3NF, there are two requirements
 The table should be in second normal form
 No non-key attribute is transitively dependent on the primary key

Example (Not in 3NF)
Scheme -> {Title, PubID, PageCount, Price}
1. Key -> {Title, PubId}
2. {Title, PubId} -> {PageCount}
3. {PageCount} -> {Price}
4. Both Price and PageCount depend on the entire key, hence 2NF
5. Transitively {Title, PubID} -> {Price}, hence not in 3NF
Third Normal Form (3NF)
Example 2 (Not in 3NF)
Scheme -> {Studio, StudioCity, CityTemp}
1. Primary Key -> {Studio}
2. {Studio} -> {StudioCity}
3. {StudioCity} -> {CityTemp}
4. {Studio} -> {CityTemp}
5. Both StudioCity and CityTemp depend on the entire key hence 2NF
6. CityTemp transitively depends on Studio hence violates 3NF

Example 3 (Not in 3NF)
Scheme -> {BuildingID, Contractor, Fee}
1. Primary Key -> {BuildingID}
2. {BuildingID} -> {Contractor}
3. {Contractor} -> {Fee}
4. {BuildingID} -> {Fee}
5. Fee transitively depends on the BuildingID
6. Both Contractor and Fee depend on the entire key hence 2NF

BuildingID Contractor Fee
100 Randolph 1200
150 Ingersoll 1100
200 Randolph 1200
250 Pitkin 1100
300 Randolph 1200
3NF - Decomposition
1. Move all items involved in transitive dependencies to a new entity.
2. Identify a primary key for the new entity.
3. Place the primary key for the new entity as a foreign key on the
original entity.

Example 1 (Convert to 3NF)
Old Scheme -> {Title, PubID, PageCount, Price}
New Scheme -> {PageCount, Price}
New Scheme -> {Title, PubID, PageCount}
3NF - Decomposition
Example 2 (Convert to 3NF)
Old Scheme -> {Studio, StudioCity, CityTemp}
New Scheme -> {Studio, StudioCity}
New Scheme -> {StudioCity, CityTemp}

Example 3 (Convert to 3NF)
Old Scheme -> {BuildingID, Contractor, Fee}
New Scheme -> {BuildingID, Contractor}
New Scheme -> {Contractor, Fee}

BuildingID Contractor
100 Randolph
150 Ingersoll
200 Randolph
250 Pitkin
300 Randolph

Contractor Fee
Randolph 1200
Ingersoll 1100
Pitkin 1100
BCNF

 Consider a Functional Dependency α -> β, where P is a prime attribute and NP is a non-prime attribute:
 P->NP (Not allowed in 2NF)
 NP->NP (Not allowed in 3 NF)
 P/NP->P (Not Covered in 2 NF and 3 NF)
 BCNF Says:
For every non-trivial Functional Dependency α -> β,
α must be a Super Key
R ( A B C)
FD: AB -> C
C->B

Candidate Key: AB , AC

The Table is not in BCNF as C is not a Super Key


Check for BCNF and Decompose

A B C
a 1 x
b 2 y
c 2 z
c 3 w
d 3 w
e 3 w

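R (A B C)
FD: AB->C
C->B

Decomposition: R1 (C B), R2 (A C)

The BCNF decomposition is lossless, but it does not preserve dependencies:
AB->C can no longer be checked within a single table.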
Check for BCNF and Decompose

A B C
a 1 x
b 2 y
c 2 z
c 3 w
d 3 w
e 3 w

R2 (A C)
A C
a x
b y
c z
c w
d w
e w

R1 (C B)
C B
x 1
y 2
z 2
w 3
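A short sketch can confirm two properties of this decomposition on the sample rows: joining the two tables on C returns exactly the original rows, while AB->C is no longer enforced by either table on its own. The extra rows `("a", "v")` and `("v", 1)` are hypothetical values added only to show the second point.

```python
rows = [("a", 1, "x"), ("b", 2, "y"), ("c", 2, "z"),
        ("c", 3, "w"), ("d", 3, "w"), ("e", 3, "w")]   # (A, B, C)

r_ac = {(a, c) for (a, b, c) in rows}   # projection onto (A, C)
r_cb = {(c, b) for (a, b, c) in rows}   # projection onto (C, B)


def join(ac, cb):
    """Natural join of (A, C) and (C, B) tuples on C, giving (A, B, C)."""
    return {(a, b, c) for (a, c) in ac for (c2, b) in cb if c == c2}


def ab_determines_c(tuples):
    """True if no two tuples share (A, B) but differ in C."""
    seen = {}
    for a, b, c in tuples:
        if seen.setdefault((a, b), c) != c:
            return False
    return True


lossless = join(r_ac, r_cb) == set(rows)

# Hypothetical inserts: each table stays valid on its own (C -> B still holds),
# but the joined result now violates AB -> C.
joined_after_insert = join(r_ac | {("a", "v")}, r_cb | {("v", 1)})

print("lossless:", lossless)
print("AB -> C still holds after inserts:", ab_determines_c(joined_after_insert))
```

The join is lossless because C, the common attribute, is a key of the (C, B) table.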
Levels of Normalization

Each higher level is a subset of the lower level


DBMS - Transaction
Transaction Definition
• A transaction can be defined as a group of
tasks. A single task is the minimum processing
unit which cannot be divided further.

• Let’s take an example of a simple transaction.


Suppose a bank employee transfers Rs 500
from A's account to B's account. This very
simple and small transaction involves several
low-level tasks.
Banking Transaction Example
• A’s Account
Open_Account(A)
Old_Balance = A.balance
New_Balance = Old_Balance - 500
A.balance = New_Balance
Close_Account(A)

B’s Account
Open_Account(B)
Old_Balance = B.balance
New_Balance = Old_Balance + 500
B.balance = New_Balance
Close_Account(B)
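The two halves of the transfer only make sense together. The sketch below runs them as one transaction using Python's built-in sqlite3 module, so that a failure in the credit step also undoes the debit. The table name `account`, the starting balances, and the `transfer` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 200)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move `amount` from source to target as one transaction.

    Returns True if the transaction committed, False if it was rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
            credited = conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            ).rowcount
            if credited != 1:
                raise LookupError(f"unknown account {target!r}")
        return True
    except (sqlite3.IntegrityError, LookupError):
        return False


ok = transfer(conn, "A", "B", 500)        # both updates are applied
missing = transfer(conn, "A", "C", 100)   # credit fails, so the debit is undone
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, missing, balances)
```

The second call debits A before discovering that C does not exist; the rollback restores A's balance, so no money disappears.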
Transaction Concept
• A transaction is a unit of program execution
that accesses and possibly updates various
data items.

• Two main issues to deal with:


– Failures of various kinds, such as hardware failures
and system crashes
– Concurrent execution of multiple transactions
Transaction Properties
• A transaction is a very small unit of a program
and it may contain several low-level tasks.

• A transaction in a database system must
maintain Atomicity, Consistency, Isolation, and
Durability − commonly known as ACID
properties − in order to ensure accuracy,
completeness, and data integrity.
Atomicity
• This property states that a transaction must
be treated as an atomic unit, that is, either all
of its operations are executed or none.
• There must be no state in a database where a
transaction is left partially completed.
• States should be defined either before the
execution of the transaction or after the
execution/abortion/failure of the transaction.
Isolation
• In a database system where more than one
transaction is being executed simultaneously
and in parallel, the property of isolation states
that each transaction will be carried out
and executed as if it is the only transaction in
the system.
• No transaction will affect the existence of any
other transaction.
Durability
• The database should be durable enough to hold
all its latest updates even if the system fails or
restarts.
• If a transaction updates a chunk of data in a
database and commits, then the database will
hold the modified data.
• If a transaction commits but the system fails
before the data could be written on to the disk,
then that data will be updated once the system
springs back into action.
Consistency
• The database must remain in a consistent state
after any transaction.
• No transaction should have any adverse effect on
the data residing in the database.
• If the database was in a consistent state before
the execution of a transaction, it must remain
consistent after the execution of the transaction
as well.
[Figure: transaction T1 takes the database (DB) from one state to another]
States of Transactions
• Active − In this state, the transaction is being
executed. This is the initial state of every
transaction.

• Partially Committed − When a transaction


executes its final operation, it is said to be in a
partially committed state.

• Failed − A transaction is said to be in a failed state


if any of the checks made by the database
recovery system fails. A failed transaction can no
longer proceed further.
• Aborted − If any of the checks fails and the
transaction has reached a failed state, then the
recovery manager rolls back all its write
operations on the database to bring the database
back to its original state where it was prior to the
execution of the transaction.
• Transactions in this state are called aborted. The
database recovery module can select one of the
two operations after a transaction aborts −

• Re-start the transaction


• Kill the transaction
• Committed − If a transaction executes all its
operations successfully, it is said to be
committed.

• All its effects are now permanently


established on the database system.
Advantages of Concurrency in Database and System Operations

• Definition: Concurrency in computing allows


multiple processes to operate simultaneously,
sharing resources to optimize performance.
• Purpose: To improve efficiency, resource
utilization, and overall system performance.
Advantage 1: Reduced Waiting Time
• Waiting Time ↓: Concurrency decreases the
amount of time each process spends waiting
for resources.
• Benefit: Reduces delays, ensuring faster
execution of tasks and improved user
experience.
Advantage 2: Decreased Response
Time
• Response Time ↓: The time taken to respond
to a process or transaction is shortened.
• Benefit: Boosts system responsiveness and
allows for faster data processing, critical in
real-time applications.
Advantage 3: Improved Resource
Utilization
• Resource Utilization ↑: Concurrency increases
the efficient use of resources like CPU,
memory, and I/O devices.
• Benefit: Maximizes the output from available
resources, reducing idle time and minimizing
waste.
Advantage 4: Enhanced System
Efficiency
• Efficiency ↑: Allows multiple tasks to be
executed simultaneously, improving overall
system throughput.
• Benefit: Faster processing and reduced
completion time for tasks, leading to greater
productivity.
Concurrency Control Problems
• Concurrency introduces complexities in
maintaining data integrity. Common issues
include:
– Dirty Reads
– Unrepeatable Reads
– Phantom Reads
– Lost Updates
Dirty Read Problem
• Definition: When a transaction reads data
that has been modified by another transaction
that hasn’t yet been committed.
• Example: Transaction A reads uncommitted
data from Transaction B. If Transaction B rolls
back, Transaction A has incorrect data.
Unrepeatable Read Problem
• Definition: A transaction reads the same data
twice but gets different values because
another transaction modified it in the
meantime.
• Example: Transaction A reads a value, and
before it reads it again, Transaction B updates
the value.
Phantom Read Problem
Lost Update Problem
• Definition: When two transactions update the
same data item simultaneously, one update
overwrites the other, causing data loss.
• Example: Transactions A and B read the same
value, both update it, but only one update is
retained.
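The lost update can be reproduced with plain variables. In this sketch the starting value 100 and the two changes (+50 and -30) are made-up numbers; the point is only the order of reads and writes.

```python
# Interleaved execution: both transactions read before either one writes.
balance = {"X": 100}
a_read = balance["X"]          # Transaction A reads 100
b_read = balance["X"]          # Transaction B reads 100
balance["X"] = a_read + 50     # A writes 150
balance["X"] = b_read - 30     # B writes 70 and overwrites A's update
interleaved_result = balance["X"]

# Serial execution for comparison: B starts only after A has finished.
balance = {"X": 100}
balance["X"] = balance["X"] + 50   # A
balance["X"] = balance["X"] - 30   # B
serial_result = balance["X"]

print("interleaved:", interleaved_result, "serial:", serial_result)
```

In the interleaved run A's +50 is lost, because B computed its new value from a copy read before A wrote.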
Schedule
• Schedule – a sequence of instructions that
specifies the chronological order in which
instructions of concurrent transactions are
executed
– a schedule for a set of transactions must consist of
all instructions of those transactions
– must preserve the order in which the instructions
appear in each individual transaction.
Serial Schedule and Non Serial Schedule in Transaction DBMS
Serializability
• Serial Schedule − It is a schedule in which
transactions are aligned in such a way that
one transaction is executed first.
• When the first transaction completes its cycle,
then the next transaction is executed.
Transactions are ordered one after the other.
• This type of schedule is called a serial
schedule, as transactions are executed in a
serial manner.
• Let T1 transfer $50 from A to B, and T2 transfer 10% of
the balance from A to B.
• A serial schedule in which T1 is followed by T2 :
• In a multi-transaction environment, serial schedules are
considered as a benchmark.

• The execution sequence of an instruction in a transaction cannot be
changed, but two transactions can have their instructions executed
in a random fashion.

• This execution does no harm if two transactions are mutually


independent and working on different segments of data; but in case
these two transactions are working on the same data, then the
results may vary. This ever-varying result may bring the database to
an inconsistent state.

• To resolve this problem, we allow parallel execution of a


transaction schedule, if its transactions are either serializable or
have some equivalence relation among them.
Equivalence Schedules
• An equivalence schedule can be of the following types

Result Equivalence
• If two schedules produce the same result after
execution, they are said to be result equivalent.

• They may yield the same result for some value and
different results for another set of values. That's why
this equivalence is not generally considered significant.
T1 Followed by T2 T2 Followed by T1
Conflict Equivalence

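• Two operations are said to conflict if they
have the following properties −

– Both belong to separate transactions.
– Both access the same data item.
– At least one of them is a "write" operation.

• Two schedules are conflict equivalent if they contain the same
operations and order every pair of conflicting operations in the same way.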
Conflicting Instructions
• Instructions li and lj of transactions Ti and Tj
respectively, conflict if and only if there exists some
item Q accessed by both li and lj, and at least one of
these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict
4. li = write(Q), lj = write(Q). They conflict
• Intuitively, a conflict between li and lj forces a (logical)
temporal order between them.
– If li and lj are consecutive in a schedule and they do not
conflict, their results would remain the same even if they
had been interchanged in the schedule.
Conflict Serializability
• If a schedule S can be transformed into a
schedule S´ by a series of swaps of non-
conflicting instructions, we say that S and S´
are conflict equivalent.

• We say that a schedule S is conflict serializable


if it is conflict equivalent to a serial schedule.
Conflict Serializability (Cont.)
• Schedule 3 can be transformed into Schedule 6, a serial schedule
where T2 follows T1, by series of swaps of non-conflicting
instructions.
– Therefore Schedule 3 is conflict serializable.

Schedule 3 Schedule 6
Conflict Serializability (Cont.)
• Example of a schedule that is not conflict
serializable:

• We are unable to swap instructions in the above


schedule to obtain either the serial schedule < T3,
T4 >, or the serial schedule < T4, T3 >.
Check Conflict Serializability [R: Read, W: Write]
T1 T2 T3
R (X)
R(Z)

W(Z)
R(Y)
R(Y)
W(Y)
W(X)
W(Z)
W(X)

Check the following conflict conditions for each pair of operations:
▪ Both belong to separate transactions.
▪ Both access the same data item.
▪ At least one of them is a "write" operation.

Then build the precedence graph:
– For each transaction, create a distinct node.
– If all the conflict conditions are satisfied, draw a
link to the dependent transaction.
– If there is a cycle, the schedule is not conflict
serializable.

[Figure: precedence graph with nodes T1, T2, and T3]
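The precedence-graph test can be sketched in a few lines. A schedule is written here as a list of (transaction, action, item) steps with action "R" or "W". That representation, the function names, and the two sample schedules are assumptions for illustration and are not the schedules drawn on the slides.

```python
def precedence_graph(schedule):
    """Return edges (Ti, Tj): an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges.add((ti, tj))
    return edges


def has_cycle(nodes, edges):
    """Depth-first search for a cycle in a directed graph."""
    adjacency = {node: [] for node in nodes}
    for u, v in edges:
        adjacency[u].append(v)
    state = {node: 0 for node in nodes}   # 0 = unvisited, 1 = in progress, 2 = done

    def visit(u):
        state[u] = 1
        for v in adjacency[u]:
            if state[v] == 1 or (state[v] == 0 and visit(v)):
                return True
        state[u] = 2
        return False

    return any(state[node] == 0 and visit(node) for node in nodes)


def is_conflict_serializable(schedule):
    nodes = sorted({txn for txn, _, _ in schedule})
    return not has_cycle(nodes, precedence_graph(schedule))


# Hypothetical schedules: T2 touches X only after T1 is done with it ...
s_ok = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "X"), ("T2", "W", "X")]
# ... versus T2 writing X between T1's read and T1's write.
s_bad = [("T1", "R", "X"), ("T2", "W", "X"), ("T1", "W", "X")]

print(precedence_graph(s_ok), is_conflict_serializable(s_ok))
print(precedence_graph(s_bad), is_conflict_serializable(s_bad))
```

In the second schedule the graph has edges in both directions between T1 and T2, so no serial order can respect both.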
Another Example
T1 T2 T3
R (X)
R(Y)
R(Y)
W(Y)
W(X)
W(X)
R(X)
W(Y)
As there is no cycle in the precedence graph, the schedule is conflict serializable.
Equivalent serial order: T1 – T3 – T2
View Serializability
• A schedule is view serializable if it is view
equivalent to a serial schedule.
• If a schedule is conflict serializable, then it is
also view serializable.
• A schedule that is view serializable but not conflict
serializable contains blind writes.
View Equivalent

• Two schedules S1 and S2 are said to be view
equivalent if they satisfy the following
conditions:
• Initial Read
• The initial read of each data item must be the
same in both schedules. If transaction T1 reads the
initial value of data item A in schedule S1, then in S2
transaction T1 should also read the initial value of A.
The two schedules above are view equivalent because the initial read
operation in S1 is done by T1 and in S2 it is also done by T1.
• Updated Read
• In schedule S1, if Ti is reading A which is
updated by Tj then in S2 also, Ti should read A
which is updated by Tj.

The two schedules above are not view
equivalent because, in S1, T3 reads A
updated by T2 and, in S2, T3 reads
A updated by T1.
• Final Write
• The final write on each data item must be the same in both
schedules. If transaction T1 performs the last write on A
in schedule S1, then in S2 the final write
on A should also be done by T1.

The two schedules above are view equivalent because
the final write operation in S1 is done by T3 and in
S2, the final write operation is also done by T3.
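The three conditions (initial read, updated read, final write) can be checked together by recording, for every read, which transaction wrote the value being read. The sketch below does that for schedules given as (transaction, action, item) steps. The function names and the sample schedule with blind writes are illustrative assumptions.

```python
def view_profile(schedule):
    """Return (reads, final_writer) for a list of (txn, action, item) steps.

    reads[txn] lists (item, writer) for each read in order; writer is None
    when the read sees the initial value of the item.
    """
    last_writer = {}
    reads = {}
    for txn, action, item in schedule:
        if action == "R":
            reads.setdefault(txn, []).append((item, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads, last_writer


def view_equivalent(s1, s2):
    """Same operations, same read sources, and same final writers."""
    return sorted(s1) == sorted(s2) and view_profile(s1) == view_profile(s2)


# Hypothetical schedule in which T2 and T3 write A without reading it first.
s = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T3", "W", "A")]
serial_t1_t2_t3 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A"), ("T3", "W", "A")]
serial_t2_t1_t3 = [("T2", "W", "A"), ("T1", "R", "A"), ("T1", "W", "A"), ("T3", "W", "A")]

print(view_equivalent(s, serial_t1_t2_t3))
print(view_equivalent(s, serial_t2_t1_t3))
```

The schedule `s` matches the serial order T1, T2, T3 because T1 still reads the initial value and T3 still writes last; in the order T2, T1, T3 the read by T1 would see T2's write instead.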
Example

Find the View Equivalent Schedule…

Schedule S
With 3 transactions, the total number of
possible serial schedule: = 3! = 6
S1 = <T1 T2 T3>
S2 = <T1 T3 T2>
S3 = <T2 T3 T1>
S4 = <T2 T1 T3>
S5 = <T3 T1 T2>
S6 = <T3 T2 T1>
Taking first schedule S1:
Step 1: Updated Read
In both schedules S and S1, there is no
read except the initial read, so
we don't need to check that
condition.
Step 2: Initial Read
The initial read operation in S is done
by T1 and in S1, it is also done by T1.
Step 3: Final Write
The final write operation in S is done
by T3 and in S1, it is also done by T3.
So, S and S1 are view equivalent.

The first schedule S1 satisfies all three
conditions, so we don't need to check
another schedule.

Hence, the view equivalent serial
schedule is:
T1 → T2 → T3
View Serializability in DBMS
• The View serializability is a concept that is
used to compute whether schedules are View-
Serializable or not. A schedule is said to be
View-Serializable if it is view equivalent to a
Serial Schedule.
View Serializability in DBMS
• Some schedules are not Conflict-Serializable but still give a
consistent result. The precedence-graph test is limited here: when the
Precedence Graph of a schedule contains a loop/cycle, the schedule is not
Conflict-Serializable, yet it may still be equivalent to a serial schedule.

• As per the concept of Conflict-Serializability, a schedule is
Conflict-Serializable (equivalent to a serial schedule, and therefore consistent)
if its precedence graph does not have any loop/cycle.

• But what if a schedule’s precedence graph contains a cycle/loop and the schedule
still gives the same consistent result as a conflict serializable schedule?

• View-Serializability addresses such cases, so that the concept of
serializability is not confined to Conflict-Serializability.
Recoverability of Schedule

• Sometimes a transaction may not execute
completely due to a software issue, system
crash or hardware failure.
• In that case, the failed transaction has to be
rolled back.
• But some other transaction may also have
used a value produced by the failed transaction.
So we also have to roll back those transactions.
• Table 1 shows a schedule which has two transactions.
• T1 reads and writes the value of A, and that value is read and written by T2.
• T2 commits, but later on T1 fails.
• Due to the failure, we have to roll back T1.
• T2 should also be rolled back because it read the value written by T1, but T2 can't
be rolled back because it has already committed.
• So this type of schedule is known as an irrecoverable schedule.
• Irrecoverable schedule: The
schedule is irrecoverable if Tj
reads the value updated by Ti and Tj
commits before Ti commits.
• Table 2 shows a schedule with two transactions.
• Transaction T1 reads and writes A, and that value is read and written by
transaction T2. But later on, T1 fails.
• Due to this, we have to roll back T1.
• T2 should be rolled back because T2 has read the value written by T1.
• As T2 has not committed before T1 commits, we can roll back transaction
T2 as well.
• So it is recoverable with cascading rollback.
• Recoverable with cascading rollback: The schedule is recoverable with
cascading rollback if Tj reads the value updated by Ti and the commit of Tj is
delayed till the commit of Ti.
Cascadeless recoverable
schedule.

• Table 3 shows a schedule with two transactions.
Transaction T1 reads and writes A and commits, and only then is that value
read and written by T2.
• So this is a cascadeless recoverable schedule.
Schedule Based on Recoverability
Recoverable Schedule:
• A schedule is recoverable if it allows for the recovery of the database to a
consistent state after a transaction failure.
• In a recoverable schedule, a transaction that has read data written by another
transaction commits only after that other transaction commits.
• If a transaction fails before committing, its updates must be rolled back, and any
transactions that have read its uncommitted data must also be rolled back.
Cascadeless Schedule
• A schedule is cascadeless if it does not result in a cascading rollback of
transactions after a failure.
• In a cascadeless schedule, a transaction reads only data written by
transactions that have already committed.
• If a transaction fails before committing, its updates must be rolled back,
but no other transaction has read its uncommitted data, so no other
transaction needs to be rolled back.
Strict Schedule
• If no transaction reads or writes a data item written by another transaction
until that transaction commits or aborts, the schedule is known as a strict schedule.
• A schedule is strict if, for any two transactions Ti, Tj, whenever a write operation of Ti
precedes a conflicting operation of Tj (either read or write),
the commit or abort event of Ti also precedes that conflicting operation of
Tj.
• In other words, Tj can read or overwrite a value written by Ti only after Ti
commits/aborts.
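The schedule classes described in this part (irrecoverable, recoverable, cascadeless, strict) can be told apart with one pass over a schedule. The sketch below assumes a schedule is a list of (transaction, action, item) steps with actions "R", "W", and "C" for commit, and that every transaction shown eventually commits. The function name `classify` and the sample schedules are illustrative.

```python
def classify(schedule):
    """Return the strongest class the schedule satisfies."""
    committed = set()
    last_writer = {}     # item -> transaction that wrote it most recently
    reads_from = {}      # reader -> set of other transactions it read from
    recoverable = cascadeless = strict = True

    for txn, action, item in schedule:
        if action == "C":
            # Recoverable: everyone this transaction read from has committed.
            if any(w not in committed for w in reads_from.get(txn, ())):
                recoverable = False
            committed.add(txn)
            continue
        writer = last_writer.get(item)
        other = writer is not None and writer != txn
        if other and writer not in committed:
            strict = False                 # touched an uncommitted write
            if action == "R":
                cascadeless = False        # dirty read
        if action == "R" and other:
            reads_from.setdefault(txn, set()).add(writer)
        if action == "W":
            last_writer[item] = txn

    if strict:
        return "strict"
    if cascadeless:
        return "cascadeless"
    return "recoverable" if recoverable else "irrecoverable"


# T2 reads T1's uncommitted write and commits first.
table1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
          ("T2", "W", "A"), ("T2", "C", None), ("T1", "C", None)]
# Same reads and writes, but T2's commit waits for T1's commit.
table2 = table1[:4] + [("T1", "C", None), ("T2", "C", None)]
# T1 commits before T2 touches A.
table3 = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "C", None),
          ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "C", None)]
# T2 overwrites T1's uncommitted write without reading it.
blind = [("T1", "W", "A"), ("T2", "W", "A"), ("T1", "C", None), ("T2", "C", None)]

for name, sched in [("table1", table1), ("table2", table2),
                    ("table3", table3), ("blind", blind)]:
    print(name, classify(sched))
```

The first three schedules follow the descriptions of Tables 1 to 3 in the text; the fourth shows a schedule that avoids dirty reads yet is still not strict.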
Relation between various types of
schedules
Summary: Type of Schedule
Concurrency Control
• Whenever multiple transactions are working
on the same database concurrently, it may lead
to data inconsistency.
• Let us consider this scenario:

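[Figure: transactions T1, T2, and T3 reading (R) and writing (W) data items A and B in a shared database; simultaneous reads are allowed (√), simultaneous writes are not (X)]

▪ Multiple transactions may be allowed to perform read operations simultaneously.
▪ But only one transaction should perform a write operation at a time.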
• We need to control concurrent execution of
transactions. We need to make them
serializable.
• One method is to lock the data item accessed
by a transaction until the transaction is completed.
This ensures serializability.
• Another method is to assign time
stamps and compare the time stamps. This
ensures serializability.

[Figure: the two approaches side by side. Locking: T1 and T2 contend for data item A. Time Stamps: T1 (Start Time=2) and T2 (Start Time=4) access data item A, which carries TS = 3.]
Lock-Based Protocols
• A lock is a mechanism to control concurrent access
to a data item.
• Data items can be locked in two modes:
1. exclusive (X) mode. Data item can be both read
as well as written. X-lock is requested using the lock-X
instruction.
2. shared (S) mode. Data item can only be read.
S-lock is requested using the lock-S instruction.
• Lock requests are made to the concurrency-control
manager. A transaction can proceed only after its
request is granted.

Lock-Based Protocols (Cont.)
Lock-compatibility matrix
        S       X
S       true    false
X       false   false
• A transaction may be granted a lock on an item if the requested lock is
compatible with locks already held on the item by other transactions.
• Any number of transactions can hold shared locks on an item,
• But if any transaction holds an exclusive lock on the item, no other
transaction may hold any lock on the item.
• If a lock cannot be granted, the requesting transaction is made to wait
till all incompatible locks held by other transactions have been
released. The lock is then granted.
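The compatibility rule above fits in a few lines of code. This is a minimal sketch; the names COMPATIBLE and can_grant are illustrative and not taken from any particular DBMS.

```python
# (held mode, requested mode) -> may both locks exist on the same item?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_modes):
    """Return True if `requested` ('S' or 'X') is compatible with every
    lock mode currently held on the item by other transactions."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)


print(can_grant("S", ["S", "S"]))  # shared alongside shared holders
print(can_grant("X", ["S"]))       # exclusive while a shared lock is held
```

An item with no holders accepts either mode, since the check over an empty list succeeds.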
Lock-Based Protocol
1. Shared lock:
•It is also known as a read-only lock. Under a shared lock, the data
item can only be read by the transaction.
•It can be shared between transactions because a transaction
holding a shared lock cannot update the data item.
2. Exclusive lock:
•Under an exclusive lock, the data item can be both read and
written by the transaction.
•This lock is exclusive: only one transaction can hold it at a time, so multiple
transactions cannot modify the same data simultaneously.
Scenario: Lock-Based Protocol
Lock Manager (Lock Request and Unlock)
[Figure: 1. T1 sends a request to the Concurrency Control Component. 2. The component checks the Lock Compatibility Table. 3. The table answers T/F. 4. The component replies Granted/Denied to T1.]

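• Examples:
Possible (both transactions request shared locks on A):
T1: Lock-S (A), Read (A), Unlock (A)
T2: Lock-S (A), Read (A), Unlock (A)
Not Possible (T1 holds an exclusive lock on A when T2 requests a shared lock):
T1: Lock-X(A), Read (P), P=P+10, Write (P), Unlock (A)
T2: Lock-S(A), Read (P), Unlock (A)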


Locking Protocols
• A locking protocol is a set of rules followed by all
transactions while requesting and releasing locks.
Locking protocols restrict the set of possible schedules.

• Simple Locking.

• 2 Phase Locking.
– Basic 2 Phase Locking (B2PL)
– Conservative 2 Phase Locking (C2PL)
– Strict 2 Phase Locking(S2PL)
– Rigorous 2 Phase Locking (R2PL)
Simple Locking
• Lock the data item before accessing it.
• Release the lock as soon as the action is over.
• Example:
1. Lock-X (A)
2. Read (A)
3. A=A+10
4. Write (A)
5. Unlock (A)
• When transactions execute serially, there is no issue.
• But interleaved execution may lead to inconsistent results.

T1: transfer 100 from A to B        T2: Display (A+B)
Lock-X (A)                          Lock-S (A)
R(A)                                R(A)
A=A-100;                            Unlock (A)
W(A)                                Lock-S(B)
Unlock(A)                           R(B)
Lock-X(B)                           Unlock(B)
R(B)                                Display (A+B)
B=B+100
W(B)
Unlock (B)
T1                  T2
Lock-X (A)
R(A)
A=A-100;
W(A)
Unlock(A)
                    Lock-S (A)
                    R(A)
                    Unlock (A)
Lock-X(B)
R(B)
B=B+100
W(B)
Unlock (B)
                    Lock-S(B)
                    R(B)
                    Unlock(B)
                    Display (A+B)
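To see how simple locking can go wrong, the sketch below replays one interleaving that the protocol permits: T2 reads A after T1 has debited it, but reads B before T1 has credited it. The starting balances of 500 and the helper names lock and unlock are assumptions made for illustration, and the lock table is simplified so that any held lock blocks every other request.

```python
db = {"A": 500, "B": 500}   # hypothetical starting balances
locks = {}                  # item -> transaction currently holding a lock


def lock(txn, item):
    # Simplification: any held lock blocks every other request.
    assert item not in locks, f"{txn} would have to wait for {item}"
    locks[item] = txn


def unlock(txn, item):
    assert locks.get(item) == txn
    del locks[item]


# T1, first half: debit A, then release A immediately (simple locking).
lock("T1", "A")
db["A"] = db["A"] - 100
unlock("T1", "A")

# T2 runs completely in between, never having to wait.
lock("T2", "A")
t2_a = db["A"]
unlock("T2", "A")
lock("T2", "B")
t2_b = db["B"]
unlock("T2", "B")
displayed = t2_a + t2_b

# T1, second half: credit B.
lock("T1", "B")
db["B"] = db["B"] + 100
unlock("T1", "B")

print("T2 displayed:", displayed)          # 900
print("actual total:", db["A"] + db["B"])  # 1000
```

No lock request ever had to wait, yet T2 displayed a total that never existed in any consistent state of the database. Holding locks longer, as two-phase locking does, rules this interleaving out.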
Pitfalls of Lock-Based Protocols (Cont.)
• The potential for deadlock exists in most locking
protocols. Deadlocks are a necessary evil.
• Starvation is also possible if concurrency control
manager is badly designed. For example:
– A transaction may be waiting for an X-lock on an item,
while a sequence of other transactions request and
are granted an S-lock on the same item.
– The same transaction is repeatedly rolled back due to
deadlocks.
• Concurrency control manager can be designed to
prevent starvation.
The Two-Phase Locking Protocol
• This is a protocol which ensures conflict-
serializable schedules.
• Phase 1: Growing Phase
– transaction may obtain locks
– transaction may not release locks
• Phase 2: Shrinking Phase
– transaction may release locks
– transaction may not obtain locks
• The protocol assures serializability. It can be
proved that the transactions can be serialized in
the order of their lock points (i.e. the point
where a transaction acquired its final lock).
Simple 2 Phase Locking Protocol
[Figure: timeline from Start Transaction to End Transaction. Locks are acquired during the Growing Phase up to the Lock Point and released during the Shrinking Phase, with R/W operations in between.]
• Basic 2 Phase Locking Protocol Disadvantages:
– Unnecessary wait due to early access
– Deadlock
– Cascading rollback
• Unnecessary wait due to early access
T1                  T2
Lock-X(A)
R(A)
W(A)
Lock-X(B)
R(B)
.                   Lock-S(A): has to wait
.
.
(T1 is still in its Growing Phase, so it keeps the lock on A even though it has finished using A.)
• Deadlock:
T1                  T2
Lock-X(B)
R(B)
W(B)
                    Lock-S(A)
                    R(A)
                    Lock-X(B)
Lock-X(A)
[Figure: T1 holds B and waits for A; T2 holds A and waits for B.]
• Cascading Rollback:
T1              T2              T3
Lock-X (A)
R(A)
W(A)
U(A)
.               Lock-X(A)
.               R(A)
Failure         W(A)
                U(A)
                                Lock-X(A)
                                R(A)
.               .               .
T2 read the value written by T1, and T3 read the value written by T2, so when T1 fails, T1, T2 and T3 must all roll back.
Conservative 2 Phase Locking Protocol
• Lock all the data items before the transaction starts.
• Needs early prediction of which data
items the transaction will require.
– T1 : [A B C D]
– Request lock on {A B C D}
– If all locks are granted, the operation starts.
– Otherwise the transaction waits until the request is
granted.
Conservative 2 Phase Locking Protocol
[Figure: all locks are granted at the Lock Point, when the transaction starts, so there is no Growing Phase. R/W operations follow, and locks are released during the Shrinking Phase until End Transaction.]
• Deadlock never occurs.
• Early prediction is difficult in practical
implementations.
• Cascading rollback may occur.
Strict 2 Phase Locking Protocol
• A transaction does not release any of its exclusive
locks until it is committed/aborted.
[Figure: Lock-S and Lock-X are acquired during the Growing Phase up to the Lock Point. Unlock-S may happen during the Shrinking Phase, but Unlock-X happens only after the transaction has committed (End Transaction).]
• S2PL is the most popular.
• Does not have cascading rollback.
• Deadlock may occur.
• Generates strict schedules (both recoverable
and cascadeless).
• Easy recovery.
Recoverable Schedule
• For each pair of transactions (Ti, Tj), if Tj reads an
item that was previously written by Ti, then the commit
operation of Ti should appear before the commit
operation of Tj.
T1                  T2
R1(A)
A= A+150
W1(A)
                    R2(A)
                    A= A-50;
                    W2(A)
Commit;
                    Commit;
Strict Schedule
• A schedule is strict if a value written by a transaction cannot be
read or overwritten by another transaction
until the writing transaction is either aborted or
committed.
• Every strict schedule is recoverable and
cascadeless.
T1                  T2
R1 (A)
W1(A)
Commit;
                    W2(A)
                    R2(A)
Rigorous 2 Phase Locking Protocol
• A transaction does not release any of its locks (shared
or exclusive) until it is committed or aborted.
[Figure: Lock-S and Lock-X are acquired during the Growing Phase, from Start Transaction up to the Lock Point. There is no Shrinking Phase: Unlock-S and Unlock-X happen only after the transaction has committed (End Transaction).]
• Does not have cascading rollback.
• Deadlock may occur.
[Figure: the four variants compared (C2PL, B2PL, S2PL, R2PL), each shown with its Growing Phase, Lock Point and Shrinking Phase.]
Examples
For each schedule of T1 below, check which of B2PL, C2PL, S2PL and R2PL it follows.
Example 1 (T1):
Lock-S(A)
R(A)
Lock-X(B)
R(A)
R(B)
B=A+B
Unlock(A)
W(B)
Unlock(B)
Example 2 (T1):
Lock-S(A)
R(A)
Lock-X(B)
Unlock(A)
R(B)
W(B)
Commit
Unlock(B)
Example 3 (T1):
Lock-S(A)
R(A)
Unlock(A)
Lock-X(B)
R(B)
W(B)
Unlock(B)
Commit
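The four variants differ only in when locks may be taken and released, so a single pass over one transaction's operations can classify it. The sketch below is one way to do that under simplifying assumptions: the input is a list of (action, item) pairs for a single transaction, "conservative" is approximated as "every lock is taken before the first read or write", and the function name classify_locking is illustrative.

```python
def classify_locking(ops):
    """ops: list of (action, item) for ONE transaction.

    action is 'lock-S', 'lock-X', 'read', 'write', 'unlock'
    or 'commit' (item is None for commit).
    """
    mode = {}            # item -> 'S' or 'X' for locks currently held
    unlocked = False     # has the shrinking phase started?
    accessed = False     # has any read/write happened yet?
    committed = False
    two_phase = conservative = strict = rigorous = True

    for action, item in ops:
        if action in ("lock-S", "lock-X"):
            if unlocked:
                two_phase = False      # lock after an unlock
            if accessed:
                conservative = False   # lock taken after work began
            mode[item] = action[-1]
        elif action in ("read", "write"):
            accessed = True
        elif action == "unlock":
            unlocked = True
            if not committed:
                rigorous = False       # any lock released before commit
                if mode.get(item) == "X":
                    strict = False     # exclusive lock released before commit
            mode.pop(item, None)
        elif action == "commit":
            committed = True

    return {"B2PL": two_phase,
            "C2PL": two_phase and conservative,
            "S2PL": two_phase and strict,
            "R2PL": two_phase and rigorous}


# A transaction that releases its shared lock early but keeps the
# exclusive lock until after commit.
t1 = [("lock-S", "A"), ("read", "A"), ("lock-X", "B"), ("unlock", "A"),
      ("read", "B"), ("write", "B"), ("commit", None), ("unlock", "B")]
print(classify_locking(t1))
```

The sample transaction is reported as following B2PL and S2PL but not C2PL (it locks B after reading A) and not R2PL (it releases the shared lock on A before commit).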
Write Schedules Using B2PL
• Transaction 1:
– Read (A)
– Read (B)
– If A=0, B=B+1;
– Write (B);
• Schedule for T1:
Lock-S (A)
R(A)
Lock-X (B)
R(B)
If A=0; B=B+1;
Unlock (A)
W(B)
Unlock(B)
Write Schedules Using C2PL
• Transaction 1:
– Read (A)
– Read (B)
– If A=0, B=B+1;
– Write (B);
• Schedule for T1:
Lock-S (A), Lock-X(B)
R(A)
R(B)
If A=0; B=B+1;
Unlock (A)
W(B)
Unlock(B)
Write Schedules Using S2PL
• Transaction 1:
– Read (A)
– Read (B)
– If A=0, B=B+1;
– Write (B);
• Schedule for T1:
Lock-S (A)
R(A)
Lock-X(B)
R(B)
If A=0; B=B+1;
Unlock (A)
W(B)
Commit
Unlock(B)
Write Schedules Using R2PL
• Transaction 1:
– Read (A)
– Read (B)
– If A=0, B=B+1;
– Write (B);
• Schedule for T1:
Lock-S (A)
R(A)
Lock-X(B)
R(B)
If A=0; B=B+1;
W(B)
Commit
Unlock(B)
Unlock (A)
Solve This
• Transaction 2:
– Read (B)
– Read (A)
– If B=0, A =A-B
– Write(A)

• Find schedules using: B2PL, C2PL, S2PL and R2PL.
Implementation of Locking
• A lock manager can be implemented as a separate
process to which transactions send lock and unlock
requests
• The lock manager replies to a lock request by sending a
lock grant message (or a message asking the transaction
to roll back, in case of a deadlock)
• The requesting transaction waits until its request is
answered
• The lock manager maintains a data-structure called a lock
table to record granted locks and pending requests
• The lock table is usually implemented as an in-memory
hash table indexed on the name of the data item being
locked
Lock Table
[Figure: lock table with one queue of requests per data item. Legend: Granted, Waiting.]
• Black rectangles indicate granted
locks, white ones indicate waiting
requests
• Lock table also records the type of
lock granted or requested
• New request is added to the end of
the queue of requests for the data
item, and granted if it is
compatible with all earlier locks
• Unlock requests result in the
request being deleted, and later
requests are checked to see if they
can now be granted
• If transaction aborts, all waiting or
granted requests of the transaction
are deleted
– lock manager may keep a list of
locks held by each transaction,
to implement this efficiently
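The lock table described above can be sketched as a dictionary of per-item queues. This is a minimal single-threaded sketch, not a production lock manager: the class name LockManager and its methods are hypothetical, a request is compared against every earlier request in the queue (granted or waiting), and abort simply scans all items instead of keeping a per-transaction list.

```python
from collections import defaultdict, deque


class LockManager:
    def __init__(self):
        # item -> queue of [txn, mode, granted] entries, oldest first
        self.table = defaultdict(deque)

    @staticmethod
    def _compatible(a, b):
        return a == "S" and b == "S"

    def request(self, txn, item, mode):
        """Append the request; grant it only if it is compatible with
        every earlier request on the item. Returns True if granted."""
        queue = self.table[item]
        granted = all(self._compatible(mode, m) for _, m, _ in queue)
        queue.append([txn, mode, granted])
        return granted

    def release(self, txn, item):
        """Delete txn's request on item, then re-check later requests."""
        queue = deque(e for e in self.table[item] if e[0] != txn)
        self.table[item] = queue
        earlier = []
        for entry in queue:
            if not entry[2]:
                if all(self._compatible(entry[1], m) for m in earlier):
                    entry[2] = True
                else:
                    break          # keep queue order, no overtaking
            earlier.append(entry[1])

    def abort(self, txn):
        """Delete all waiting or granted requests of the transaction."""
        for item in list(self.table):
            self.release(txn, item)

    def holders(self, item):
        return [t for t, _, g in self.table[item] if g]


lm = LockManager()
print(lm.request("T1", "A", "S"))   # granted
print(lm.request("T2", "A", "S"))   # granted, shared with T1
print(lm.request("T3", "A", "X"))   # must wait
lm.release("T1", "A")
lm.release("T2", "A")
print(lm.holders("A"))              # T3 now holds the lock
```

Checking a new request against waiting requests as well as granted ones means a waiting X-lock request cannot be overtaken by a stream of later S-lock requests.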
DBMS TIMESTAMP PROTOCOLS
INTRODUCTION
•A timestamp is a unique identifier used in a DBMS to identify a transaction.
•Typically, timestamp values are assigned in the order in which the transactions
are submitted to the system, so a timestamp can be thought of as the transaction
start time.
•We will refer to the timestamp of transaction T as TS(T).
•Concurrency control techniques based on timestamp ordering do not use locks;
hence, deadlocks cannot occur.
GENERATION OF TIMESTAMPS
•Timestamps can be generated in several ways.
•One possibility is to use a counter that is incremented each time its value is
assigned to a transaction. The transaction timestamps are numbered 1, 2, 3, . . .
in this scheme.
•A computer counter has a finite maximum value, so the system must periodically
reset the counter to zero when no transactions are executing for some short
period of time.
•Another way to implement timestamps is to use the current date/time value of
the system clock and ensure that no two timestamp values are generated during
the same tick of the clock.
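Both generation schemes can be sketched briefly. The class names CounterTimestamps and ClockTimestamps are illustrative, the sketch is not thread-safe, and it does not model the counter reset, since Python integers do not overflow.

```python
import itertools
import time


class CounterTimestamps:
    """Counter scheme: transactions are numbered 1, 2, 3, ..."""

    def __init__(self):
        self._counter = itertools.count(1)

    def new(self):
        return next(self._counter)


class ClockTimestamps:
    """Clock scheme: use the system clock, but never hand out the
    same value twice, even if two requests fall in the same tick."""

    def __init__(self):
        self._last = 0

    def new(self):
        now = time.time_ns()
        self._last = max(now, self._last + 1)
        return self._last


counter = CounterTimestamps()
print(counter.new(), counter.new(), counter.new())

clock = ClockTimestamps()
first, second = clock.new(), clock.new()
print(second > first)
```

The clock variant bumps the value by one whenever the clock has not advanced, which keeps timestamps unique and strictly increasing.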
TIMESTAMP ORDERING
•A schedule in which the transactions participate is then serializable, and the
equivalent serial schedule has the transactions in order of their timestamp
values. This is called timestamp ordering (TO).
•Notice how this differs from 2PL, where a schedule is serializable by being
equivalent to some serial schedule allowed by the locking protocols.
•In timestamp ordering, however, the schedule is equivalent to the particular
serial order corresponding to the order of the transaction timestamps.
•The algorithm must ensure that, for each item accessed by conflicting
operations in the schedule, the order in which the item is accessed does not
violate the serializability order.
•To do this, the algorithm associates with each database item X two timestamp
(TS) values:
1.Read_TS(X): The read timestamp of item X; this is the largest timestamp
among all the timestamps of transactions that have successfully read item X,
that is, read_TS(X) = TS(T), where T is the youngest transaction that has
read X successfully.
2.Write_TS(X): The write timestamp of item X; this is the largest of all the
timestamps of transactions that have successfully written item X, that is,
write_TS(X) = TS(T), where T is the youngest transaction that has
written X successfully.
T1 (TS=10)    T2 (TS=20)    T3 (TS=30)    T4 (TS=40)
R1(A)
W1(A)
                            R3(A)
              R2(A)
                                          W4(A)
Read Time Stamp:
RTS(A) = 0 initially
Read by T1 (TS=10): 0 < 10, so RTS(A) = 10
Read by T3 (TS=30): 10 < 30, so RTS(A) = 30
Read by T2 (TS=20): 30 < 20 is false, so RTS(A) stays 30
Write Time Stamp:
WTS(A) = 0 initially
Write by T1 (TS=10): 0 < 10, so WTS(A) = 10
Write by T4 (TS=40): 10 < 40, so WTS(A) = 40
BASIC TIMESTAMP ORDERING
•Whenever some transaction T tries to issue a read_item(X) or a write_item(X)
operation, the basic TO algorithm compares the timestamp of T with read_TS(X)
and write_TS(X) to ensure that the timestamp order of transaction execution is
not violated.
•The concurrency control algorithm must check whether conflicting operations
violate the timestamp ordering in the following two cases:
1. Transaction T issues a read_item(X) operation:
a.If write_TS(X) > TS(T), then abort and roll back T and reject the operation.
b.This should be done because some younger transaction with timestamp greater
than TS(T), and hence after T in the timestamp ordering, has already written the
value of item X before T had a chance to read X.
c.If write_TS(X) <= TS(T), then execute the read_item(X) operation of T and set
read_TS(X) to the larger of TS(T) and the current read_TS(X).
Algorithm:
If (write_TS(X) > TS(T))
{
Abort T ;
Rollback;
}
Else
{
Read (X);
read_TS(X) = max(read_TS(X), TS(T))
}
2. Transaction T issues a write_item(X) operation:
a.If read_TS(X) > TS(T) or if write_TS(X) > TS(T), then abort and roll back T and reject
the operation.
b.This should be done because some younger transaction with a timestamp greater
than TS(T), and hence after T in the timestamp ordering, has already read or
written the value of item X before T had a chance to write X, thus violating the
timestamp ordering.
c.If the condition in part (a) does not occur, then execute the write_item(X)
operation of T and set write_TS(X) to TS(T).
Algorithm:
If (read_TS(X) > TS(T) Or write_TS(X) > TS(T))
{
Abort T ;
Rollback;
}
Else
{
Write (X);
write_TS(X) = TS(T)
}
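The two rules translate almost line for line into code. The following is a minimal sketch of the checks only: the names Item, Abort, read_item and write_item are illustrative, and undoing the aborted transaction's earlier writes (the actual rollback) is not modeled.

```python
class Abort(Exception):
    """Raised when an operation would violate the timestamp order."""


class Item:
    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0     # read_TS(X)
        self.write_ts = 0    # write_TS(X)


def read_item(ts, item):
    """Read rule: reject if a younger transaction already wrote X."""
    if item.write_ts > ts:
        raise Abort(f"read by TS={ts} rejected, write_TS={item.write_ts}")
    item.read_ts = max(item.read_ts, ts)
    return item.value


def write_item(ts, item, value):
    """Write rule: reject if a younger transaction already read or wrote X."""
    if item.read_ts > ts or item.write_ts > ts:
        raise Abort(f"write by TS={ts} rejected")
    item.value = value
    item.write_ts = ts


x = Item(value=0)
read_item(1, x)            # TS=1 reads: allowed, read_TS becomes 1
write_item(2, x, 5)        # TS=2 writes: allowed, write_TS becomes 2
try:
    read_item(1, x)        # TS=1 reads again: write_TS (2) > 1
except Abort as exc:
    print("aborted:", exc)
```

Note that the read rule takes the maximum of the old read timestamp and TS(T), so a read by an older transaction never lowers read_TS(X).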
T1 (TS=1)           T2 (TS=2)
R1(X)
                    W2(X)
R1(X)  (rolled back)
Initial RTS(X) = 0; WTS(X) = 0;
Step 1: T1 wants to read:
Check: If (WTS(X) > TS(T)): (0 > 1) is false
T1 is allowed to read.
RTS(X) = TS(T), so RTS(X) = 1
Step 2: T2 wants to write:
Check: If (RTS(X) > TS(T)): (1 > 2) is false
Or (WTS(X) > TS(T)): (0 > 2) is false
T2 is allowed to write.
WTS(X) = TS(T), so WTS(X) = 2
Step 3: T1 wants to read:
Check: If (WTS(X) > TS(T)): (2 > 1) is true
T1 is aborted and rolled back.
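T1 (TS=1)           T2 (TS=2)
R1(X)
                    W2(X)
R1(X)
[Figure: precedence graph with edges T1 → T2 and T2 → T1]
As a cycle is present, this schedule is not conflict serializable.
The timestamp ordering protocol ensures conflict serializability by rejecting such schedules: it rolls back T1 at its second read.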
Correctness of Timestamp-Ordering
• The timestamp-ordering protocol guarantees serializability
since all the arcs in the precedence graph are of the form:
transaction with smaller timestamp → transaction with larger timestamp
Thus, there will be no cycles in the precedence graph.
• Timestamp protocol ensures freedom from deadlock as no
transaction ever waits.
• But the schedule may not be cascade-free, and may not even
be recoverable.
Deadlock
A System is in a deadlock state if there exists a set of
transactions such that every transaction in the set is waiting
for another transaction in the set.
Deadlock Detection
Deadlocks can be described as a wait-for graph, which consists
of a pair G = (V,E),
 V is a set of vertices (all the transactions in the system)
 E is a set of edges; each element is an ordered pair Ti → Tj.
If Ti → Tj is in E, then there is a directed edge from Ti to Tj,
implying that Ti is waiting for Tj to release a data item.
When Ti requests a data item currently being held by Tj, then the
edge Ti → Tj is inserted in the wait-for graph. This edge is removed
only when Tj is no longer holding a data item needed by Ti.
The system is in a deadlock state if and only if the wait-for graph
has a cycle. The system must invoke a deadlock-detection algorithm
periodically to look for cycles.
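Looking for a cycle in the wait-for graph is a standard depth-first search. The sketch below assumes the graph is given as a dictionary mapping each transaction to the set of transactions it is waiting for. The function name has_deadlock and the sample graphs are illustrative.

```python
def has_deadlock(wait_for):
    """wait_for: dict mapping a transaction to the set of transactions
    it waits for. Returns True if the graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on the stack, finished
    color = {}

    def visit(t):
        color[t] = GREY
        for u in wait_for.get(t, ()):
            state = color.get(u, WHITE)
            if state == GREY:         # back edge: a cycle through u
                return True
            if state == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t)
               for t in list(wait_for))


# T2 waits for T1 and T3, T4 waits for T2: no cycle.
print(has_deadlock({"T2": {"T1", "T3"}, "T4": {"T2"}}))
# T1 waits for T2 and T2 waits for T1: cycle.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))
```

A transaction that appears only as a target of edges is not waiting for anything, so it never needs its own entry in the dictionary.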
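Examples.
[Figure: two wait-for graphs. Without cycle: no deadlock. With cycle: deadlock.]
Transactions    Data Item                   Lock
T1              Q                           Shared
T2              P                           Exclusive
                Q (Lock Not Granted)        Exclusive
T3              Q                           Shared
T4              P (Lock Not Granted)        Exclusive
[Figure: wait-for graph over T1, T2, T3 and T4. T2 waits for T1 and T3 (item Q); T4 waits for T2 (item P).]
As there is no cycle, there is no deadlock.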
Deadlock Prevention
(Using Time Stamp Ordering)
To prevent any deadlock situation in the system, the DBMS
aggressively inspects all the operations that transactions are
about to execute.
The DBMS inspects the operations and analyzes if they can
create a deadlock situation.
If it finds that a deadlock situation might occur, then that
transaction is never allowed to be executed.
Wait-Die Scheme
In this scheme, if a transaction requests to lock a resource (data
item) which is already held with a conflicting lock by another
transaction, then one of two possibilities may occur:
➢ If TS(Ti) < TS(Tj), that is, Ti, which is requesting a
conflicting lock, is older than Tj, then Ti is allowed to wait
until the data item is available.
➢ If TS(Ti) > TS(Tj), that is, Ti is younger than Tj, then Ti dies.
Ti is restarted later with a random delay but with the same
timestamp.
Wound-Wait Scheme
In this scheme, if a transaction requests to lock a resource (data
item) which is already held with a conflicting lock by another
transaction, one of two possibilities may occur:
➢ If TS(Ti) < TS(Tj), then Ti forces Tj to be rolled back, that is, Ti
wounds Tj. Tj is restarted later with a random delay but with
the same timestamp.
➢ If TS(Ti) > TS(Tj), then Ti is forced to wait until the resource is
available.
This scheme allows the younger transaction to wait; but
when an older transaction requests an item held by a
younger one, the older transaction forces the younger one to
abort and release the item.
In both schemes (wait-die and wound-wait), the transaction that is
aborted is always the one that entered the system at a
later stage (the younger transaction).
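Both schemes reduce to a single timestamp comparison between the requesting transaction Ti and the lock holder Tj. The sketch below returns the decision as a string. The function names and the return values are illustrative choices.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: an older requester waits, a younger requester dies."""
    return "wait" if ts_requester < ts_holder else "die"


def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds (aborts) the holder,
    a younger requester waits."""
    return "wound holder" if ts_requester < ts_holder else "wait"


# Ti has TS=5 (older), Tj has TS=9 (younger).
print(wait_die(5, 9), "|", wound_wait(5, 9))   # older requests from younger
print(wait_die(9, 5), "|", wound_wait(9, 5))   # younger requests from older
```

In every outcome that aborts a transaction, the one with the larger timestamp is aborted. Because it restarts with its original timestamp, it eventually becomes the oldest transaction and can no longer be aborted, which prevents starvation.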
Timeout Scheme
A transaction waits only for a limited period of time for a
particular resource.
If the transaction does not get that resource within the time limit,
the transaction dies (is aborted).
Example: Bank OTP Transaction.
[Figure: T1 waiting for a resource with a 10 sec time limit]
