20it007-Database Management Systems
SYLLABUS
Introduction to DBMS
Overview of DBMS- Data Models- Database Languages- Database
Administrator- Database Users- Three Schema architecture of DBMS: Basic
concepts- Mapping Constraints- Keys. Relational Algebra – Relational Calculus:
Domain relational Calculus –Tuple Relational Calculus.
Database Design and SQL
Entity-Relationship Diagram-Design Issues- Weak Entity Sets- and
Extended E-R features - Structure of relational Databases- Views- Modifications of
the Database- Concept of DDL- DML- TCL - DCL: Basic Structure- Set
Operations- Aggregate Functions- Null Values- Domain Constraints- Referential
Integrity Constraints- Assertions- Views- Nested Sub Queries- Stored Procedures.
Functional Dependency- Different Anomalies in designing a Database.-
Normalization using Functional Dependencies- Decomposition- Boyce-Codd
Normal Form- 3NF- Normalization using Multi-Valued Dependencies- 4NF- 5NF.
• Query Processing and Transactions
Database Query Processing - Transactions- Concurrency Control – Recovery
System- State Serializability- Lock Based Protocols- Two Phase Locking.
• Storage Management and Indexing
Physical Storage Systems: Storage Interfaces – Magnetic Disks – Flash Memory -
RAID – Disk block access. Data Storage Structures: Database Storage Architecture - File
Organization- Organization of Records in Files – Data Dictionary Storage - Indexing.
• Advances in Database
Database System Architectures – Parallel and Distributed Transaction Processing –
Complex Data types: Semi structured Data – Spatial Data – Textual Data Big Data – Data
Analytics – Blockchain Databases.
References
• Abraham Silberschatz, Henry F. Korth, S. Sudarshan, “Database System Concepts”,
Tata McGraw Hill, 7th Edition, 2019.
• Ramez Elmasri, Shamkant B. Navathe, “Fundamentals of Database Systems”, Pearson
Education, 7th Edition, 2015.
• C.J. Date, A.Kannan, S.Swamynathan, “An Introduction to Database Systems”,
Pearson Education, 8th Edition, 2006.
• Raghu Ramakrishnan, “Database Management Systems”, McGraw-Hill College
Publications, 4th Edition, 2015.
• G.K. Gupta, “Database Management Systems”, Tata McGraw Hill, 1st Edition, 2018.
• Atul Kahate, “Introduction to Database Management Systems”, Pearson Education, 1st
Edition, 2004.
• Ivan Bayross, “SQL, PL/SQL the Programming Language of Oracle”, BPB
Publications, 2010.
20IT009-DATABASE MANAGEMENT SYSTEMS LABORATORY
• Creation of a database and write SQL queries to retrieve information from the database.
• Perform Insertion, Deletion, Modifying, Altering, Updating and Viewing records based on conditions.
• Creation of a database using views, synonyms, sequences and indexes
• Creation of a database using Commit, Rollback and Save point.
• Creation of a database to set various constraints.
• Creating relationship between the databases.
• Write a PL/SQL block that accepts input from the user and handles exceptions.
• Creation of Procedures.
• Creation of functions.
• Mini project (Application Development using Oracle/ MySQL)
a) Inventory Control System.
b) Material Requirement Processing.
c) Hospital Management System.
d) Railway Reservation System.
e) Personal Information System.
f) Web Based User Identification System.
g) Timetable Management System.
h) Hotel Management System.
Introduction to DBMS
Data: Data can be facts related to any object in consideration.
Information: Information is data that has been converted into a more
useful form.
For example, marks obtained by students and their roll numbers form
data, the report card/sheet is the information.
• name, age, height, weight, etc are some data related to you.
What is a Database?
• Database is a systematic collection of data.
• Databases support storage and manipulation of data.
• Databases make data management easy.
Examples
• An online telephone directory would use a database to store data
pertaining to people, phone numbers, other contact details, etc.
• Your electricity service provider uses a database to manage billing,
client-related issues, fault data, etc.
• Facebook- It needs to store, manipulate and present data related to
members, their friends, member activities, messages, advertisements
etc.
What is DBMS?
• DBMS contains information about one particular enterprise
• Collection of interrelated data
• Set of programs to access the data
• An environment that is both convenient and efficient to use
• Database Management Systems (DBMS) are software systems used to
store, retrieve, and run queries on data.
• A DBMS serves as an interface between an end-user and a database,
allowing users to create, read, update, and delete data in the database.
Database Applications
• Banking: transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions
• Telecommunication: monthly bills, keeping records of calls made,
maintaining balances on prepaid calling cards and storing information about
the communication networks.
• Databases can be very large.
• Databases touch all aspects of our lives.
Drawbacks of using file systems to store data
• Before DBMSs were introduced, organizations usually stored information
in file processing systems.
• File Management /processing System/File System is the traditional and
popular way to keep your data files organized on your drives.
• Keeping organizational information in a file processing system has a
number of major disadvantages.
Drawbacks of using file systems /
purpose of database systems
• Atomicity of updates
• Failures may leave database in an inconsistent state with partial updates carried out
• Example: Transfer of funds from one account to another account should either
complete or not happen at all.
• Concurrent access by multiple users
• Multiple users must be allowed to update the data simultaneously, but uncontrolled concurrent updates can lead to inconsistencies.
• Example: Two people read a balance (say $500) at the same time and each updates it by withdrawing money (say
$50 and $100). The concurrent executions may leave the account in an incorrect state: the account may
contain either $450 or $400, rather than $350.
• Security problems
• Not every user of the database system should be able to access all the data.
• Database systems offer solutions to all the above problems
Three Schema architecture of DBMS/
Views of Data
5. Hierarchical Model
• This database model organizes data into a tree-like-structure, with a single root, to which all the other data is
linked.
• The hierarchy starts from the Root data, and expands like a tree, adding child nodes to the parent nodes.
• In this model, a child node will only have a single parent node.
• In the hierarchical model, data is organized into a tree-like structure with a one-to-many relationship between two
different types of data. For example, one department can have many courses, many professors and, of course,
many students.
TYPES OF DATA MODELS
6. Network Model
• This is an extension of the Hierarchical model.
• In this model data is organized more like a graph, and a child node is allowed to have more than one
parent node.
• In this database model, data is more interrelated, hence accessing the data is also easier and faster.
This database model was used to map many-to-many data relationships.
• This was the most widely used database model, before Relational Model was introduced.
Database Language
• A DBMS has appropriate languages and interfaces to express database queries and
updates.
• Database languages can be used to read, store and update the data in the database.
Types of Database Language
1. Data Definition Language
• DDL stands for Data Definition Language.
• It is used to define database structure.
• It is used to create schema, tables, indexes, constraints, etc. in the database.
• Using the DDL statements, you can create the skeleton of the database.
• Data definition language is used to store metadata such as the number of
tables and schemas, their names, indexes, columns in each table, constraints, etc.
Here are some tasks that come under DDL:
Create: It is used to create objects in the database.
Alter: It is used to alter the structure of the database.
Drop: It is used to delete objects from the database.
Truncate: It is used to remove all records from a table.
Rename: It is used to rename an object.
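The DDL tasks listed above can be tried out with a minimal sketch such as the one below. It assumes Python's built-in sqlite3 module and an in-memory database; the table and column names (student, learner, rollno, name, age) are hypothetical. SQLite has no TRUNCATE statement, so a DELETE without a WHERE clause stands in for it here; other systems such as Oracle or MySQL provide TRUNCATE TABLE directly.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE: define a new table (the skeleton of the database).
cur.execute("CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# ALTER: change the structure by adding a column.
cur.execute("ALTER TABLE student ADD COLUMN age INTEGER")

# RENAME: give the table a new name.
cur.execute("ALTER TABLE student RENAME TO learner")

# One row of data (this INSERT is DML, used only to have something to remove).
cur.execute("INSERT INTO learner (rollno, name, age) VALUES (1, 'Asha', 19)")

# TRUNCATE stand-in: SQLite has no TRUNCATE, so delete every row instead.
cur.execute("DELETE FROM learner")
rows_after_truncate = cur.execute("SELECT COUNT(*) FROM learner").fetchone()[0]

# Read back the column names to confirm the ALTER took effect.
columns = [row[1] for row in cur.execute("PRAGMA table_info(learner)")]

# DROP: remove the table itself.
cur.execute("DROP TABLE learner")
tables_after_drop = cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
conn.close()
```

After the script runs, the structure changes are visible in columns, while the final DROP leaves the database without any tables.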
2. Data Manipulation Language
• All attributes have values. For example, a student entity may have name, class, and age as
attributes.
• There exists a domain or range of values that can be assigned to attributes. For example, a
student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be
negative, etc.
Types of Attributes
• Simple attribute − Simple attributes are atomic values, which cannot be divided further.
For example, a student's phone number is an atomic value of 10 digits.
• Multi-value attribute − Multi-value attributes may contain more than one value.
For example, a person can have more than one phone number, email_address, hobby, etc.
• Composite attribute − Composite attributes are made of more than one simple attribute.
For example, a student's complete name may have first_name, middle_name and
last_name.
• Derived attribute − Derived attributes are the attributes that do not exist in the physical
database, but their values are derived from other attributes present in the database. For
example, age can be derived from date_of_birth.
• Binary Relationship –
When there are two entity sets participating in a relationship, the relationship is called a
binary relationship. For example, Student is enrolled in Course.
n-ary Relationship
• When there are n entity sets participating in a relationship, the relationship
is called an n-ary relationship.
Degree of Relationship
• The number of participating entities in a relationship defines the degree
of the relationship.
• Binary = degree 2
• Ternary = degree 3
• n-ary = degree n
Ternary and Quaternary relationship
Constraints
• An E-R enterprise schema may define certain constraints to which the contents of
database system must conform.
• Two types of constraints are
1.Mapping cardinalities
One-one
One-many
Many-one
Many-many
2.Participation constraints
Total participation
Partial participation
Mapping Cardinalities
• Cardinality defines the number of entities in one entity set, which can be associated with the
number of entities of other set via relationship set.
1. One-to-one − When each entity in each entity set can take part only once in the
relationship, the cardinality is one-to-one.
Eg: A male can marry one female and a female can marry one male. So the relationship
is one-to-one.
2. One-to-many − One entity from entity set A can be associated with more than
one entity of entity set B; however, an entity from entity set B can be associated
with at most one entity of entity set A.
3. Many-to-one − When entities in one entity set can take part only once in the relationship set
and entities in the other entity set can take part more than once in the relationship set, the cardinality is
many-to-one.
Eg: A student can take only one course but one course can be taken by many students. So the
cardinality is n to 1. It means that for one course there can be n students but for one student
there will be only one course.
4. Many-to-many − When entities in all entity sets can take part more than once in the relationship, the cardinality
is many-to-many.
Eg: A student can take more than one course and one course can be taken by many students. So the relationship
is many-to-many.
Participation Constraint
ID
Name
Age
Salary.
Attribute Domain
• When an attribute is defined in a relation(table), it is defined to hold only a certain type
of values, which is known as Attribute Domain.
• The attribute Name will hold the name of employee for every tuple.
• If we save employee's address there, it will be violation of the Relational database
model.
What is a Relation Schema?
A relation schema describes the structure of the relation, with the name of the relation(name of table), its
attributes and their names and type.
What is a Relation instance − A finite set of tuples in the relational database system represents relation
instance. Relation instances do not have duplicate tuples.
For example,
∏rollno,name (Student)
It will show only the rollno and name columns for all the rows in the Student table.
3. Union Operation (∪)
• This operation is used to fetch data from two relations(tables).
• The relations(tables) specified should have same number of attributes(columns) and same
attribute domain. Also the duplicate tuples are automatically eliminated from the result.
Syntax: r ∪ s
• where r and s are relations.
4. Set Difference (-)
• This operation is used to find data present in one(first) relation and not present in the
second relation. This operation is also applicable on two relations, just like Union
operation.
Syntax: r - s
• where r and s are relations.
5.Cartesian Product (X)
• This is used to combine data from two different relations(tables) into one and fetch data
from the combined relation.
Syntax: A X B
6.Rename Operation (ρ)-(rho)
• This operation is used to rename either the relation or the attributes.
Syntax: ρs(R)
ρ(RelationNew, RelationOld)
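The basic operators above map closely onto set operations. The following is a minimal sketch, assuming each relation is modelled as a Python set of tuples; the sample data and the helper name project are hypothetical and only illustrate the idea.

```python
from itertools import product

# A relation is modelled as a tuple of attribute names plus a set of rows.
student_attrs = ("rollno", "name", "age")
student = {(1, "Asha", 19), (2, "Ravi", 20), (3, "Asha", 21)}

def project(attrs, rows, wanted):
    """Projection: keep only the wanted columns; the set removes duplicates."""
    idx = [attrs.index(a) for a in wanted]
    return {tuple(row[i] for i in idx) for row in rows}

# Two union-compatible relations (same number of attributes, same domains).
r = {(1, "a"), (2, "b")}
s = {(2, "b"), (3, "c")}

union = r | s                                     # r ∪ s
difference = r - s                                # r - s
cartesian = {x + y for x, y in product(r, s)}     # r X s
names = project(student_attrs, student, ("name",))
```

Because the results are sets, duplicate tuples disappear automatically: projecting student on name yields two rows, not three.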
Additional Operations in Relational Algebra
• Set Intersection
• Natural join
• Division operation
• Assignment operation
Set Intersection
• The intersection operator gives the common data values between the two
data sets/tables/relations that are intersected.
Natural join operation
• Natural join is a binary operation that is used to combine certain selections and a Cartesian
product into one operation.
• It is denoted by the join symbol ⋈ .
Division operation
• The division is a binary operation that is written as R ÷ S.
• Suited to queries that include the phrase ‘for all’.
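A "for all" query can be sketched as follows. The example assumes a binary relation of (student, course) pairs divided by a unary relation of courses; the data and the function name divide are hypothetical.

```python
def divide(r, s):
    """R ÷ S for r = set of (x, y) pairs and s = set of y values.

    Returns every x that is paired in r with all of the y values in s.
    """
    candidates = {x for x, _ in r}
    return {x for x in candidates if all((x, y) in r for y in s)}

enrolled = {
    ("Asha", "DBMS"), ("Asha", "OS"),
    ("Ravi", "DBMS"),
    ("Mina", "DBMS"), ("Mina", "OS"), ("Mina", "CN"),
}
required = {"DBMS", "OS"}

# Students who are enrolled in all of the required courses.
took_all_required = divide(enrolled, required)
```

Ravi is excluded because he is missing OS, which is the behaviour the phrase "for all" asks for.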
Assignment Operation
Extended Relational Algebra Operations
• In Relational Algebra, Extended Operators are those operators that are
derived from the basic operators.
Generalized Projection
Outer Join
Aggregate Functions
Aggregation function takes a collection of values and returns a
single value as a result.
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
Aggregate operation in relational algebra
Relation r has attributes A, B and C; its four tuples have the C values 7, 7, 3 and 10.
g sum(C) (r)
sum-C
27
Relation account grouped by branch-name:
branch-name   account-number   balance
Perryridge    A-102            400
Perryridge    A-201            900
Brighton      A-217            750
Brighton      A-215            750
Redwood       A-222            700
branch-name g sum(balance) (account)
branch-name   balance
Perryridge    1300
Brighton      1500
Redwood       700
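The grouped sum can be reproduced with a short sketch. It uses the branch names and balances of the account example; the pairing of account numbers with rows and the helper name group_sum are assumptions made for illustration.

```python
from collections import defaultdict

# (branch-name, account-number, balance)
account = [
    ("Perryridge", "A-102", 400),
    ("Perryridge", "A-201", 900),
    ("Brighton", "A-217", 750),
    ("Brighton", "A-215", 750),
    ("Redwood", "A-222", 700),
]

def group_sum(rows, key_index, value_index):
    """Group rows by one column and sum another column within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[value_index]
    return dict(totals)

# branch-name g sum(balance) (account)
balance_by_branch = group_sum(account, 0, 2)

# Without grouping, the aggregate collapses the whole relation to one value.
total_balance = sum(row[2] for row in account)
```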
Outer Join
• An extension of the join operation that avoids loss of information.
• Computes the join and then adds tuples from one relation that do
not match tuples in the other relation to the result of the join.
• Uses null values:
• null signifies that the value is unknown or does not exist.
• All comparisons involving null are false by definition.
The content of the database may be modified using the following
operations:
Deletion
Insertion
Updating
All these operations are expressed using the assignment
operator.
Deletion
A delete request is expressed similarly to a query, except instead of
displaying tuples to the user, the selected tuples are removed from
the database.
Can delete only whole tuples; cannot delete values on only
particular attributes
A deletion is expressed in relational algebra by:
r ← r − E
where r is a relation and E is a relational algebra query.
Example: Delete all account records in the Perryridge branch.
account ← account − σ branch-name = “Perryridge” (account)
Insertion example: Provide as a gift for all loan customers in the Perryridge branch a $200 savings
account.
Let the loan number serve as the account number for the new savings account.
r1 ← σ branch-name = “Perryridge” (borrower ⋈ loan)
• Relational Algebra - procedural query language to fetch data and which also explains
how it is done.
• Relational Calculus - non-procedural query language and has no description about how
the query will work or the data will be fetched. It only focusses on what to do, and not
on how to do it.
For example,
{< name, age > | < name, age > ∈ Student ∧ age > 17}
• The above query will return the names and ages of the students in the table
Student who are older than 17.
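A relational calculus expression states what is wanted, much like a set comprehension. The sketch below assumes a small hypothetical Student relation held as a list of dictionaries.

```python
# Hypothetical Student relation with only the two attributes used in the query.
student = [
    {"name": "Asha", "age": 19},
    {"name": "Ravi", "age": 17},
    {"name": "Mina", "age": 21},
]

# {<name, age> | <name, age> ∈ Student ∧ age > 17}
older_than_17 = {(t["name"], t["age"]) for t in student if t["age"] > 17}
```

The comprehension only describes the condition each result tuple must satisfy; it says nothing about how the rows are scanned.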
Integrity Constraints
• Integrity constraints are a set of rules.
• It is used to maintain the quality of information.
• Integrity constraints ensure that the data insertion, updating, and other
processes have to be performed in such a way that data integrity is not
affected.
Types of Integrity Constraint
1. Domain constraints
• Domain constraints can be defined as the definition of a valid set of
values for an attribute.
• The data type of domain includes string, character, integer, time, date,
currency, etc. The value of the attribute must be available in the
corresponding domain.
• Example:
2. Entity integrity constraints
• The entity integrity constraint states that primary key value can't be null.
• This is because the primary key value is used to identify individual rows in
relation and if the primary key has a null value, then we can't identify those
rows.
• A table can contain a null value other than the primary key field.
Example:
3. Referential Integrity Constraints
4. Key constraints
• A key is a set of attributes that is used to identify an entity uniquely within its entity set.
• An entity set can have multiple keys, out of which one key will be the primary
key. A primary key value must be unique in the relational table and cannot be null.
• Example:
ATTRIBUTE CLOSURE
• USING ATTRIBUTE CLOSURE, WE CAN CHECK WHETHER A GIVEN KEY IS A CANDIDATE
KEY OR NOT.
• X = set of attributes
• X+ = set of all attributes determined by X.
• (here the ‘+’ symbol indicates the attribute closure of X)
Eg.1: R(A,B,C,D,E) and FD { A->B,B->C,C->D,D->E}
A+={A,B,C,D,E} ----------------SUPER KEY
AB+={A,B,C,D,E} ----------------SUPER KEY
AC+={A,C,B,D,E} ----------------SUPER KEY
AD+={A,D,B,C,E} ----------------SUPER KEY
AE+={A,E,B,C,D} --------------SUPER KEY
ABC+={A,B,C,D,E} ----------------SUPER KEY
.etc….,
CHECK FOR ‘B’:
• BC+={B,C,D,E}
• BD+={B,D,C,E}
• BE+={B,E,C,D}
• BDC+={B,D,C,E}
……..etc.,
But we cannot determine ‘A’ here. None of the subsets gave SUPER KEY.
CHECK FOR ‘C’:
CD+={C,D,E}
CE+={C,E,D}
But we cannot determine ‘A’ and ‘B’ here. None of the subsets gave SUPER KEY.
Similarly check for ‘D’ and ‘E’.
ATTRIBUTE CLOSURE
AC+:
• PROPER SUBSETS = {A}, {C}
Check whether ‘A’ or ‘C’ is a Super Key.
Here ‘A’ is a Super Key, so ‘AC’ is not a Candidate Key.
ABC+:
• PROPER SUBSETS = {A}, {B}, {C}, {AB}, {AC}, {BC}
Here A, AB, AC are Super Keys.
So ABC is not a Candidate Key.
BC:
• PROPER SUBSETS = {B}, {C}
Here B and C are not Super Keys.
So BC is a CK.
In this relation,
5 Super keys: A, AB, AC, ABC, BC
2 Candidate Keys: A, BC
1 Primary Key: A
1 Alternate Key: BC
A B C D
1 1 5 1
2 1 7 1
3 1 7 1
4 2 7 1
5 2 5 1
6 2 5 2
SUPER KEYS: A, AB, AC, AD, ABC, ABD, ACD, ABCD
Check the proper subsets:
• A: the only proper subset is the empty set, so A is a CK.
• AB = {A}, {B}
Already A is a SK, so AB is not a CK.
• AC = {A}, {C}
Already A is a SK, so AC is not a CK.
• AD = {A}, {D}
Already A is a SK, so AD is not a CK.
• ABC = {A}, {B}, {C}, {AB}, {AC}, {BC}
Already A, AB, AC are super keys, so ABC is not a CK.
• ABD = {A}, {B}, {D}, {AB}, {AD}, {BD}
Already A, AB, AD are super keys, so ABD is not a CK.
• ACD = {A}, {C}, {D}, {AC}, {AD}, {CD}
Already A, AC, AD are super keys, so ACD is not a CK.
• ABCD = {A}, {B}, {C}, {D}, {AB}, {AC}, {AD}, {BC}, {BD}, {CD}, {ABC}, {ABD}, {BCD}, {ACD}
Already A, AB, AC, AD, ABC, ABD, ACD are SKs, so ABCD is not a CK.
So in this relation,
8 Super keys are
A, AB, AC, AD, ABC, ABD, ACD, ABCD
1 Candidate key = A
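The same check can be automated for the table above. This is a minimal sketch that treats an attribute set as a super key when its values are unique across the rows shown; uniqueness in one sample instance is only evidence about that instance, not a guarantee for every possible state of the relation. The names is_super_key, super_keys and candidate_keys are chosen for illustration.

```python
from itertools import combinations

attrs = ("A", "B", "C", "D")
rows = [
    (1, 1, 5, 1),
    (2, 1, 7, 1),
    (3, 1, 7, 1),
    (4, 2, 7, 1),
    (5, 2, 5, 1),
    (6, 2, 5, 2),
]

def is_super_key(subset):
    """True if no two rows agree on every attribute in subset."""
    idx = [attrs.index(a) for a in subset]
    seen = {tuple(row[i] for i in idx) for row in rows}
    return len(seen) == len(rows)

# Every non-empty subset of the attributes, smallest first.
all_subsets = [c for n in range(1, len(attrs) + 1) for c in combinations(attrs, n)]
super_keys = ["".join(s) for s in all_subsets if is_super_key(s)]

# A candidate key is a super key with no proper subset that is also a super key.
candidate_keys = [
    k for k in super_keys
    if not any(set(other) < set(k) for other in super_keys)
]
```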
FUNCTIONAL DEPENDENCY
• Functional Dependency (FD) is a constraint that determines the relation of one attribute to
another attribute in a Database Management System, i.e., one attribute is dependent on
another attribute.
• Functional Dependency helps to maintain the quality of data in the database.
• A functional dependency is denoted by an arrow “→”. The functional dependency of Y on X
is represented by X → Y (X: determinant, Y: dependent).
• Eg: If we know the value of Employee number, we can obtain Employee Name, city, salary,
etc. Here city, Employee Name, and salary are functionally dependent on Employee number.
FD, X → Y:
If t1.x = t2.x
then t1.y = t2.y
X Y
1 1
2 1
3 2
4 3
2 5
Here X = 2 appears with both Y = 1 and Y = 5, so X → Y does not hold.
• If every value of X is unique,
then X → Y is a functional
dependency.
X Y
1 1
2 1
3 2
4 3
Eg: rollno is unique for each
student, so we can identify the
name of the student using rollno.
FD, X → Y: if t1.x = t2.x then t1.y = t2.y
R.NO.  NAME  MARKS  DEPT  COURSE
1      a     78     CS    C1
2      b     60     EE    C1
3      a     78     CS    C2
4      b     60     EE    C3
5      c     80     IT    C3
6      d     80     EC    C2
• Here X and Y are single attributes.
• R.NO → NAME
(all R.NO values are unique, so it is an FD; no two rows have the same X value, so there is no need to
check the Y values)
• NAME → R.NO
(a = a but 1 ≠ 3, not the same, so it is not an FD)
• R.NO → MARKS
(all roll numbers are unique, so it is an FD)
• DEPT → COURSE
(CS = CS but C1 ≠ C2, not the same, so it is not an FD)
• COURSE → DEPT
(C1 = C1 but CS ≠ EE, not the same, so it is not an FD)
• MARKS → DEPT
(78 = 78 gives CS = CS and 60 = 60 gives EE = EE, but 80 = 80 gives IT ≠ EC, so it is not an FD)
• Here X and Y are multiple attributes/sets of attributes.
• (R.NO, NAME) → MARKS
(R.NO is unique, so every (R.NO, NAME) combination is different; it is an FD)
• NAME → MARKS
(a = a gives 78 = 78 and b = b gives 60 = 60, so it is an FD)
• (NAME, MARKS, DEPT) → R.NO
(rows 1 and 3 agree on (a, 78, CS) but 1 ≠ 3, so it is not an FD)
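The rule "if t1.x = t2.x then t1.y = t2.y" can be checked mechanically against the student table. The sketch below is one possible way to do it; the function name holds and the attribute spelling RNO are choices made for this example.

```python
attrs = ("RNO", "NAME", "MARKS", "DEPT", "COURSE")
rows = [
    (1, "a", 78, "CS", "C1"),
    (2, "b", 60, "EE", "C1"),
    (3, "a", 78, "CS", "C2"),
    (4, "b", 60, "EE", "C3"),
    (5, "c", 80, "IT", "C3"),
    (6, "d", 80, "EC", "C2"),
]

def holds(lhs, rhs):
    """Return True if lhs -> rhs is satisfied by the rows of this table."""
    li = [attrs.index(a) for a in lhs]
    ri = [attrs.index(a) for a in rhs]
    seen = {}  # X value -> the Y value first seen with it
    for row in rows:
        x = tuple(row[i] for i in li)
        y = tuple(row[i] for i in ri)
        # Same X with a different Y violates the dependency.
        if seen.setdefault(x, y) != y:
            return False
    return True

rno_to_name = holds(["RNO"], ["NAME"])        # unique X, so it holds
name_to_rno = holds(["NAME"], ["RNO"])        # a appears with 1 and 3
marks_to_dept = holds(["MARKS"], ["DEPT"])    # 80 appears with IT and EC
```

A dependency that holds in this sample may still fail once more rows are added, so a check like this can only refute an FD, not prove it for the relation in general.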
TYPES OF FUNCTIONAL DEPENDENCY
• TRIVIAL (this dependency is always valid)
• NON-TRIVIAL
• MULTI-VALUED
• TRANSITIVE
TRIVIAL:
1) X → Y is trivial if Y is a subset of X (eg. (R.NO, NAME) → NAME)
2) X → X (eg. R.NO → R.NO) (always valid)
NON-TRIVIAL: (this dependency may or may not be valid, depending on the data in the table)
X → Y where X ∩ Y = empty, i.e., nothing is common to X and Y
(eg. R.NO → NAME)
SEMI-TRIVIAL:
eg. (R.NO, NAME) → (NAME, MARKS)
ARMSTRONG'S AXIOMS/INFERENCE RULES
• Using these rules, we can find all the functional dependencies that exist on a given
relation/table. The examples use the relation (R.NO, NAME, MARKS, DEPT, COURSE).
• 7 rules (3 primary rules and 4 secondary rules)
1) REFLEXIVITY:
X → X
X → Y, if Y is a subset of X.
2) TRANSITIVITY:
If X → Y and Y → Z, then X → Z.
Eg. If NAME → MARKS and MARKS → DEPT, then NAME → DEPT.
3) AUGMENTATION:
If X → Y, then XA → YA (add the same attribute to the left and right sides).
Eg. R.NO → NAME gives
(R.NO, MARKS) → (NAME, MARKS)
• (4 secondary rules)
4) UNION:
If X → Y and X → Z, then X → YZ.
Eg. If R.NO → NAME and R.NO → MARKS, then R.NO → (NAME, MARKS).
5) DECOMPOSITION/SPLITTING:
(the left side (determinant) cannot be split; split only the right side)
If X → YZ, then X → Y and X → Z.
Eg. If (NAME, MARKS) → (DEPT, COURSE), then
(NAME, MARKS) → DEPT and (NAME, MARKS) → COURSE.
6) PSEUDO-TRANSITIVITY:
If X → Y and YZ → A, then XZ → A.
Eg. If R.NO → NAME and (NAME, MARKS) → DEPT, then (R.NO, MARKS) → DEPT.
7) COMPOSITION:
If X → Y and A → B, then XA → YB.
ATTRIBUTE CLOSURE/CLOSURE SET
Eg.R(A,B,C,D,E)
• FD { A->B,B->C,C->D,D->E}
• A->B, B->C then A->C (transitivity)
• A->A (reflexivity)
• A->C ,C->D then A->D (transitivity)
• A->D,D->E then A->E (transitivity)
• A->ABCDE (union)
We can also write,
• B->C,C->D then B->D (transitivity)
• B->D,D->E then B->E (transitivity)
• B->B (reflexivity)
• B->BCDE(union),but we cannot determine A here.
We can also write
C->D,D->E then C->E (transitivity)
C->C (reflexivity)
C->CDE (union), but we cannot determine A and B here.
We can also write
E->E (reflexivity)
• A->B, so we can write AD->BD (augmentation)
• AD->BD then AD->B and AD->D (splitting/decomposition)
Closure of A: A+ = {A,B,C,D,E} -> SUPER KEY
Closure of AD: AD+ = {A,D,B,C,E} -> SUPER KEY
Closure of B: B+ = {B,C,D,E}
Closure of CD: CD+ = {C,D,E}
Closure of AB: AB+ = {A,B,C,D,E} -> SUPER KEY
Super key: a set of attributes whose closure contains all attributes of the given relation.
Number of super keys in the relation = 16: in R(A,B,C,D,E) every super key consists of A together with any subset of (B,C,D,E),
so the possibilities are 2 power 4 = 16 super keys.
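The closure computation for R(A,B,C,D,E) with FD {A->B, B->C, C->D, D->E} can be sketched as a simple fixed-point loop. The function and variable names below are illustrative, and the loop is a straightforward version rather than an optimised algorithm.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: keep applying FDs until no new attribute is added."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already covered, its dependents are too.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

R = "ABCDE"
FDS = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]

def is_super_key(attrs):
    """A super key is an attribute set whose closure covers the whole relation."""
    return closure(attrs, FDS) == set(R)

super_keys = [
    "".join(c)
    for n in range(1, len(R) + 1)
    for c in combinations(R, n)
    if is_super_key(c)
]
# Candidate keys are the super keys with no smaller super key inside them.
candidate_keys = [
    k for k in super_keys
    if not any(set(other) < set(k) for other in super_keys)
]
```

Enumerating all subsets is fine for five attributes but grows exponentially, so it is only suitable for small classroom examples.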
Normalization
• Delete anomaly: Suppose, if at a point of time the company closes the department
D890 then deleting the rows that are having emp_dept as D890 would also delete
the information of employee Maggie since she is assigned only to this department.
• Update anomaly: Two rows for employee Rick as he belongs to two departments
of the company. If we want to update the address of Rick then we have to update
the same in two rows or the data will become inconsistent.
OUTPUT TABLE:
Second Normal Form (2NF)
• To be in second normal form, a relation must be in first normal form(1NF) and
relation must not contain any partial dependency.
• A relation is in 2NF if it has No Partial Dependency, i.e., no non-prime
attribute (An attribute that is not part of any candidate key is known as non-
prime attribute) is dependent on any proper subset of any candidate key of the
table.
• Partial Dependency – If the proper subset of candidate key determines non-
prime attribute, it is called partial dependency.
Eg:1
• There are many courses having the same course fee.
• COURSE_FEE cannot alone decide the value of COURSE_NO or STUD_NO
• COURSE_FEE together with STUD_NO cannot decide the value of COURSE_NO
• COURSE_FEE together with COURSE_NO cannot decide the value of STUD_NO
• Hence, COURSE_FEE is a non-prime attribute, as it does not belong to the only candidate key
{STUD_NO, COURSE_NO}.
• But, COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is dependent on COURSE_NO, which is a proper
subset of the candidate key. Non-prime attribute COURSE_FEE is dependent on a proper subset of the
candidate key, which is a partial dependency and so this relation is not in 2NF.
• To convert the above relation to 2NF, we need to split the table into two tables:
Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE
• After the split, all non-key attributes are fully functionally dependent on the primary key of their table.
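The 2NF check and the split into two tables can be sketched as below. The attribute names come from the example; the sample rows, the fee values and the helper name partial_dependencies are hypothetical.

```python
candidate_key = {"STUD_NO", "COURSE_NO"}
non_prime = {"COURSE_FEE"}
fds = [({"COURSE_NO"}, {"COURSE_FEE"})]

def partial_dependencies(key, fds, non_prime):
    """List FDs whose left side is a proper subset of the key and whose
    right side contains a non-prime attribute."""
    found = []
    for lhs, rhs in fds:
        if lhs < key:  # proper subset of the candidate key
            for attr in sorted(rhs & non_prime):
                found.append((tuple(sorted(lhs)), attr))
    return found

violations = partial_dependencies(candidate_key, fds, non_prime)

# Hypothetical rows of the original relation (STUD_NO, COURSE_NO, COURSE_FEE).
enrolment = [
    (1, "C1", 1000),
    (2, "C2", 1500),
    (1, "C4", 2000),
    (4, "C3", 1000),
    (2, "C5", 2000),
]

# Decomposition into Table 1 (STUD_NO, COURSE_NO) and Table 2 (COURSE_NO, COURSE_FEE).
table1 = sorted({(s, c) for s, c, _ in enrolment})
table2 = sorted({(c, f) for _, c, f in enrolment})

# Joining the two tables on COURSE_NO gives back the original rows.
fee_of = dict(table2)
rejoined = sorted((s, c, fee_of[c]) for s, c in table1)
```

The join on COURSE_NO recovers every original row, which illustrates that this particular split does not lose information.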
Third Normal Form (3NF)
• A relation is in 3NF if it is in 2NF and does not contain any transitive dependency of a
non-prime attribute on a candidate key.
• 3NF is used to reduce the data duplication.
TRANSACTION
• A transaction can be defined as a group of tasks. A single task is the minimum
processing unit which cannot be divided further.
• One example is a transfer from one bank account to another:
the complete transaction requires subtracting the amount to be transferred from
one account and adding that same amount to the other.
E.g. transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
ACID PROPERTIES
• To preserve the integrity of data, the database system must ensure the following properties: atomicity, consistency, isolation and durability.
Atomicity
• Either all operations of the transaction are properly reflected in the database
or none. Atomicity is also known as the ‘All or nothing rule’.
It involves the following two operations.
• Abort: If a transaction aborts, changes made to database are not visible.
• Commit: If a transaction commits, changes made are visible.
Eg: Transaction to transfer $50 from account A to account B:
1.read(A)
2.A:= A –50
3.write(A)
4.read(B)
5.B:= B + 50
6.write(B)
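The all-or-nothing behaviour of the transfer can be sketched with Python's built-in sqlite3 module. The table name account, the function name transfer and the overdraft check are assumptions for this example; the starting balances A = 500 and B = 400 follow the figures used in these notes.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; commit on success, roll back on any error."""
    try:
        cur = conn.cursor()
        cur.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        # Simulated integrity check: refuse to leave the source overdrawn.
        (balance,) = cur.execute(
            "SELECT balance FROM account WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        cur.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.commit()    # Commit: both updates become visible together.
        return True
    except Exception:
        conn.rollback()  # Abort: the partial update is undone.
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 400)])
conn.commit()

ok = transfer(conn, "A", "B", 50)         # succeeds
after_ok = balances(conn)
failed = transfer(conn, "A", "B", 1000)   # fails halfway and is rolled back
after_fail = balances(conn)
```

In the failing call the first UPDATE has already run when the check fails, yet the rollback leaves both balances exactly as they were, so the sum of A and B is unchanged.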
Consistency
• Execution of a transaction in isolation preserves the consistency of the database.
• This means that integrity constraints must be maintained so that the database is
consistent before and after the transaction. It refers to the correctness of a database.
Eg: Transaction to transfer $50 from account A (A=500) to account B (B=400):
1.read(A)
2.A:= A –50
3.write(A) ----------------- A=450
4.read(B)
5.B:= B + 50
6.write(B) -----------------B=450
• Consistency requirement –the sum of A and B is unchanged by the execution of
the transaction.
Isolation
• Every transaction is independent: one transaction cannot access the results of another
transaction until that transaction has completed.
• Although multiple transactions may execute concurrently, each transaction must be
unaware of other concurrently executing transactions.
• If several transactions are executed concurrently, their operations may interleave in
some undesirable way, resulting in an inconsistent state.
• To avoid the problems of concurrent execution, the result must be the same as if the transactions had been executed in
isolation (serially).
• Example: If two operations are running concurrently on two different accounts,
then the values of both accounts should not be affected by each other. The values should remain
consistent. For instance, if account A makes transactions T1 and T2
to accounts B and C, both execute independently without
affecting each other. This is known as isolation.
Durability
• Once a transaction has completed, the changes it has made to the database
are permanent.
• Even if there is a system failure or any other abnormal event, the committed data
is safeguarded.
Transaction states
In a database, the transaction can be in one of the following states -
1.Active state
The active state is the first state of every transaction. In this state, the transaction is
being executed.
For example: Insertion or deletion or updating a record is done here. But all the
records are still not saved to the database.
2. Partially committed
In the partially committed state, a transaction executes its final operation, but the
data is still not saved to the database.
3. Committed
A transaction is said to be in a committed state if it executes all its operations
successfully. In this state, all the effects are now permanently saved on the database
system.
4. Failed state
If any of the checks made by the database recovery system fails, then the
transaction is said to be in the failed state.
In the example of total mark calculation, if the database is not able to fire a
query to fetch the marks, then the transaction will fail to execute.
5.Aborted
If a transaction fails partway through, all the changes it has made are rolled back
so that the database returns to the consistent state it was in before the
transaction started.
After aborting the transaction, the database recovery module will select one of
the two operations:
• Re-start the transaction- only if no internal logical error
• Kill the transaction
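The five states and the moves between them can be written down as a small lookup table. This is a sketch of the usual state diagram; the names TRANSITIONS and is_valid_history are chosen for illustration.

```python
# Allowed moves between the five transaction states described above.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": set(),   # terminal state
    "aborted": set(),     # terminal state
}

def is_valid_history(states):
    """Check that a sequence of states starts at 'active' and only uses allowed moves."""
    if not states or states[0] != "active":
        return False
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(states, states[1:]))

successful = is_valid_history(["active", "partially committed", "committed"])
rolled_back = is_valid_history(["active", "failed", "aborted"])
impossible = is_valid_history(["active", "committed"])
```

The last history is rejected because a transaction has to pass through the partially committed state before it can commit.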
Concurrent Executions
• Multiple transactions are allowed to run concurrently in the system.
Advantages are:
• increased processor and disk utilization, leading to better transaction
throughput
• E.g. one transaction can be using the CPU while another is reading from or
writing to the disk.
• reduced average response time for transactions: short transactions need not
wait behind long ones.
Schedule 5
Schedule 6
Conflict Equivalent
View Serializability
• A schedule is view serializable if it is view equivalent to a serial schedule.
• If a schedule is conflict serializable, then it is also view serializable.
• A schedule that is view serializable but not conflict serializable contains blind writes. A blind write occurs when
a transaction writes without reading.
• The transaction has a WRITE(Q) but no READ(Q) before it, so it writes to the database
"blindly", without reading the previous value.
View Equivalent
• Two schedules S1 and S2 are said to be view equivalent if they satisfy the following conditions:
1. Initial Read
• The initial read of both schedules must be the same. Suppose there are two schedules S1 and S2. If, in schedule S1, a
transaction T1 reads the initial value of data item A, then in S2 transaction T1 should also read the initial value of A.
Recoverability of Schedule
What is recoverability?
• Sometimes a transaction may not execute completely due to a software
issue, system crash or hardware failure. In that case, the failed
transaction has to be rolled back. But some other transaction may also
have used a value produced by the failed transaction.
What is a non-recoverable schedule?
• A non-recoverable schedule is one where, after a system failure,
we may not be able to recover to a consistent database state.
A cascading rollback occurs in database systems when a transaction (T1) fails
and must be rolled back. Other transactions that depend on T1's actions must also be
rolled back due to T1's failure, thus causing a cascading effect. That is, one transaction's
failure causes many to fail.
Concurrency Control
• With concurrency control, multiple transactions can be executed
simultaneously.
• Concurrent execution may affect the transaction results, so it is highly important to maintain the order
of execution of those transactions.
Problems of concurrency control
Several problems can occur when concurrent transactions are executed in an
uncontrolled manner. Following are the three problems in concurrency control.
• Lost updates
• Dirty read
• Unrepeatable /Nonrepeatable read
Lost update problem
• In the lost update problem, an update done to a data item by one transaction is lost because it is
overwritten by the update done by another transaction.
• In the example of Figure 1, the final value is incorrect; the correct result is 12 − 3 − 2 = 7.
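A lost update can be reproduced with a tiny schedule simulator. The sketch assumes, in line with the arithmetic 12 − 3 − 2 = 7, a data item X that starts at 12, a transaction T1 that subtracts 3 and a transaction T2 that subtracts 2; the exact interleaving used here is an assumption and may differ from the original figure.

```python
def run_schedule(schedule, initial):
    """Execute (transaction, operation, argument) steps against one data item X."""
    db = {"X": initial}
    local = {}  # each transaction's private copy of X
    for txn, op, arg in schedule:
        if op == "read":
            local[txn] = db["X"]
        elif op == "sub":
            local[txn] -= arg
        elif op == "write":
            db["X"] = local[txn]
    return db["X"]

# T1 runs completely before T2.
serial = [
    ("T1", "read", None), ("T1", "sub", 3), ("T1", "write", None),
    ("T2", "read", None), ("T2", "sub", 2), ("T2", "write", None),
]

# Both transactions read X before either one writes it back.
interleaved = [
    ("T1", "read", None), ("T2", "read", None),
    ("T1", "sub", 3), ("T2", "sub", 2),
    ("T1", "write", None), ("T2", "write", None),
]

serial_result = run_schedule(serial, 12)
lost_update_result = run_schedule(interleaved, 12)
```

In the interleaved schedule T2 overwrites the value written by T1, so T1's subtraction disappears from the final result.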
Dirty read
• A dirty read is the situation in which a transaction reads data that has not yet
been committed.
• For example, Let’s say transaction 1 updates a row and leaves it
uncommitted, meanwhile, Transaction 2 reads the updated row. If transaction
1 rolls back the change, transaction 2 will have read data that is considered
never to have existed.
Unrepeatable /Non repeatable read
• A non-repeatable read occurs when a transaction reads the same row twice and gets a different value
each time.
• For example, suppose transaction T1 reads data. Due to concurrency, another transaction T2
updates the same data and commits. Now if transaction T1 rereads the same data, it will retrieve a
different value.
Concurrency Control Protocol
• In a lock-based protocol, a transaction cannot read or write a data item until it acquires an appropriate
lock on it.
There are two types of lock:
1. Shared lock:
• It is also known as a read-only lock. Under a shared lock, the data item can only be read by the
transaction.
• It can be shared between transactions because a transaction that holds a shared lock
cannot update the data item.
2. Exclusive lock:
• Under an exclusive lock, the data item can be both read and written by the transaction.
• This lock is exclusive: only one transaction can hold it at a time, so multiple transactions cannot modify the same data
simultaneously.
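The difference between the two lock types comes down to a compatibility rule: shared locks can coexist, anything involving an exclusive lock cannot. The class below is a minimal sketch of such a lock table; the names LockTable, request and release are hypothetical, and a refused request simply returns False instead of making the transaction wait.

```python
# (mode already held, mode requested) -> can both be held at once?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

class LockTable:
    def __init__(self):
        self.held = {}  # data item -> list of (transaction, mode)

    def request(self, txn, item, mode):
        """Grant the lock only if it is compatible with locks held by other transactions."""
        holders = self.held.setdefault(item, [])
        for other, held_mode in holders:
            if other != txn and not COMPATIBLE[(held_mode, mode)]:
                return False
        holders.append((txn, mode))
        return True

    def release(self, txn, item):
        """Drop every lock that txn holds on item."""
        self.held[item] = [(t, m) for t, m in self.held.get(item, []) if t != txn]

locks = LockTable()
t1_shared = locks.request("T1", "Q", "S")      # granted
t2_shared = locks.request("T2", "Q", "S")      # granted: shared locks coexist
t3_exclusive = locks.request("T3", "Q", "X")   # refused while readers hold Q
locks.release("T1", "Q")
locks.release("T2", "Q")
t3_retry = locks.request("T3", "Q", "X")       # granted once Q is free
t1_blocked = locks.request("T1", "Q", "S")     # refused: T3 holds Q exclusively
```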
Lock-Based Protocol