5-8 DBMS

Uploaded by

Arnav Kumar
Copyright
© © All Rights Reserved

Database Management Systems
Module 21: Relational Database Design/1

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur


Week Recap PPD

• Discussed relational algebra with examples
• Introduced tuple relational and domain relational calculus
• Illustrated equivalence of algebra and calculus
• Introduced the Design Process for Database Systems
• Elucidated the E-R Model for real world representation with entities, entity sets, attributes, and relationships
• Illustrated ER Diagram notation for ER Models
• Discussed translation of ER Models to Relational Schema and extended features of ER Model
• Deliberated on various design issues


Module Objectives PPD

• To identify the features of good relational design
• To become familiar with the First Normal Form


Module Outline PPD

• Features of Good Relational Design
• Atomic Domains and First Normal Form


Features of Good Relational Design PPD


Good Relational Design

• Reflects real-world structure of the problem
• Can represent all expected data over time
• Avoids redundant storage of data items
• Provides efficient access to data
• Supports the maintenance of data integrity over time
• Clean, consistent, and easy to understand
• Note: These objectives are sometimes contradictory!


What is a Good Schema?

[Figure: instance tables for the relations instructor, department, and instructor_with_department]

• ID: Key
• building, budget: Redundant Information
• name, salary, dept_name: No Redundant Information


What is a Good Schema? (2)

• Consider combining relations
  ◦ sec_class(sec_id, building, room_number) and
  ◦ section(course_id, sec_id, semester, year)
  into one relation
  ◦ section(course_id, sec_id, semester, year, building, room_number)
• No repetition in this case


Redundancy and Anomaly

• Redundancy: having multiple copies of the same data in the database
  ◦ This problem arises when a database is not normalized
  ◦ It leads to anomalies
• Anomaly: an inconsistency that can arise when data in a database is changed through insertion, deletion, or update
  ◦ These problems occur in poorly planned, un-normalized databases where all the data is stored in one table (a flat-file database)
• There can be three kinds of anomalies
  ◦ Insertion Anomaly
  ◦ Deletion Anomaly
  ◦ Update Anomaly


Redundancy and Anomaly (2)

• Insertion Anomaly
  ◦ When the insertion of a data record is not possible without adding some additional unrelated data to the record
  ◦ We cannot add an instructor to instructor_with_department if the department does not have a building or budget
• Deletion Anomaly
  ◦ When the deletion of a data record results in losing some unrelated information that was stored as part of the record that was deleted from a table
  ◦ If we delete the last instructor of a department from instructor_with_department, we lose the building and budget information
• Update Anomaly
  ◦ When a data item is changed, many records may have to be changed, leading to the possibility of some changes being made incorrectly
  ◦ When the budget changes for a department with a large number of instructors in instructor_with_department, the application may miss some of them
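The update and deletion anomalies can be reproduced on a tiny in-memory table. The sketch below is only an illustration: the rows, the values, and the helper name budgets_by_department are hypothetical and are not taken from the slides.

```python
# Hypothetical rows of instructor_with_department:
# (ID, name, salary, dept_name, building, budget)
rows = [
    ("10101", "Srinivasan", 65000, "Comp. Sci.", "Taylor", 100000),
    ("45565", "Katz", 75000, "Comp. Sci.", "Taylor", 100000),
    ("12121", "Wu", 90000, "Finance", "Painter", 120000),
]

def budgets_by_department(table):
    """Collect every distinct budget value stored for each department."""
    seen = {}
    for _id, _name, _salary, dept, _building, budget in table:
        seen.setdefault(dept, set()).add(budget)
    return seen

# Update anomaly: the new budget reaches only one of the two stored copies.
updated = [rows[0][:5] + (110000,)] + rows[1:]
inconsistent = {
    dept: budgets
    for dept, budgets in budgets_by_department(updated).items()
    if len(budgets) > 1
}

# Deletion anomaly: removing the last Finance instructor also removes
# the only record of the Finance building and budget.
after_delete = [r for r in rows if r[0] != "12121"]
remaining_departments = {r[3] for r in after_delete}
```

With separate instructor and department relations, the budget would be stored once per department, so neither problem could occur.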

Redundancy and Anomaly (3)

• We have observed the following:
  ◦ Redundancy ⇒ Anomaly
  ◦ The relations instructor and department are better than instructor_with_department
• What causes redundancy?
  ◦ Dependency ⇒ Redundancy
  ◦ dept_name uniquely decides building and budget. A department cannot have two different budgets or buildings. So building and budget depend on dept_name
• How to remove, or at least minimize, redundancy?
  ◦ Decompose (partition) the relation into smaller relations
  ◦ instructor_with_department can be decomposed into instructor and department
  ◦ Good Decomposition ⇒ Minimization of Dependency
• Is every decomposition good?
  ◦ No. It needs to preserve information, honour the dependencies, be efficient, etc.
  ◦ Various schemes of normalization ensure good decomposition
  ◦ Normalization ⇒ Good Decomposition

Decomposition

• Suppose we had started with inst_dept. How would we know to split it up (decompose it) into instructor and department?
• Write a rule “if there were a schema (dept_name, building, budget), then dept_name would be a candidate key”
• Denote as a functional dependency: dept_name → building, budget
• In inst_dept, because dept_name is not a candidate key, the building and budget of a department may have to be repeated
  ◦ This indicates the need to decompose inst_dept


Decomposition (2)

• Not all decompositions are good
• Suppose we decompose
  employee(ID, name, street, city, salary) into
  employee1(ID, name)
  employee2(name, street, city, salary)
• Note that if name can be duplicated, then employee2 is a weak entity set and cannot exist without an identifying relationship
• Consequently, this decomposition cannot preserve the information
• The next slide shows how we lose information: we cannot reconstruct the original employee relation, so this is a lossy decomposition


Decomposition (3): Lossy Decomposition PPD

[Figure: example instance showing that joining employee1 and employee2 does not reconstruct the original employee relation]
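Since the figure is not available in this text, the loss of information can be sketched in code instead. The instance below is hypothetical (two employees who share a name), and project and natural_join are illustrative helpers written for this sketch, not library functions.

```python
def project(relation, attrs):
    """Project a list of dict-rows onto attrs, removing duplicate tuples."""
    seen = {tuple(row[a] for a in attrs) for row in relation}
    return [dict(zip(attrs, values)) for values in sorted(seen)]

def natural_join(left, right):
    """Join rows that agree on every attribute the two relations share."""
    result = []
    for l in left:
        for r in right:
            common = l.keys() & r.keys()
            if all(l[a] == r[a] for a in common):
                result.append({**l, **r})
    return result

# Hypothetical instance: two different employees are both named "Kim".
employee = [
    {"ID": 57766, "name": "Kim", "street": "Main", "city": "Perryridge", "salary": 75000},
    {"ID": 98776, "name": "Kim", "street": "North", "city": "Hampton", "salary": 67000},
]

employee1 = project(employee, ["ID", "name"])
employee2 = project(employee, ["name", "street", "city", "salary"])
rejoined = natural_join(employee1, employee2)

# Tuples produced by the join that were never in the original relation.
spurious = [row for row in rejoined if row not in employee]
```

The join returns every original tuple plus spurious ones that pair each ID with the other employee's address and salary. The decomposition is lossy in the sense that we can no longer tell which tuples were real.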


Decomposition (4): Lossless-Join Decomposition

• Lossless Join Decomposition
• Decomposition of R = (A, B, C)
  R1 = (A, B), R2 = (B, C)


Decomposition (5): Lossless-Join Decomposition

• Lossless Join Decomposition is a decomposition of a relation R into relations R1, R2 such that if we perform a natural join of the two smaller relations it will return the original relation

  R1 ∪ R2 = R, R1 ∩ R2 ≠ ∅
  ∀r ∈ R, r1 = Π_R1(r), r2 = Π_R2(r)
  r1 ⋈ r2 = r

• This is effective in removing redundancy from databases while preserving the original data
• In other words, by lossless decomposition it becomes feasible to reconstruct the relation R from decomposed tables R1 and R2 by using joins


Atomic Domains and First Normal Form PPD


First Normal Form (1NF) PPD

• A domain is atomic if its elements are considered to be indivisible units
  ◦ Examples of non-atomic domains:
    . Set of names, composite attributes
    . Identification numbers like CS101 that can be broken up into parts
• A relational schema R is in First Normal Form (1NF) if
  ◦ the domains of all attributes of R are atomic
  ◦ the value of each attribute contains only a single value from that domain
• Non-atomic values complicate storage and encourage redundant (repeated) storage of data
  ◦ Example: Set of accounts stored with each customer, and set of owners stored with each account
  ◦ We assume all relations are in first normal form


First Normal Form (2)

• Atomicity is actually a property of how the elements of the domain are used
  ◦ Strings would normally be considered indivisible
  ◦ Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127
  ◦ If the first two characters are extracted to find the department, the domain of roll numbers is not atomic
  ◦ Doing so is a bad idea
    . Leads to encoding of information in the application program rather than in the database


First Normal Form (3) PPD

• The following is not in 1NF

[Table: example relation, not recoverable from the source]

  ◦ A telephone number is composite
  ◦ Telephone number is multi-valued


First Normal Form (4) PPD

• Consider:

[Table: example relation, not recoverable from the source]

  ◦ is in 1NF if telephone number is not considered composite
  ◦ However, conceptually, we have two attributes for the same concept
    . Arbitrary and meaningless ordering of attributes
    . How to search telephone numbers
    . Why only two numbers?


First Normal Form (5) PPD

• Is the following in 1NF?

[Table: example relation, not recoverable from the source]

  ◦ Duplicated information
  ◦ ID is no longer the key. Key is (ID, Telephone Number)


First Normal Form (6) PPD

• Better to have 2 relations:

[Tables: parent and child relations, not recoverable from the source]

  ◦ One-to-Many relationship between parent and child relations
  ◦ Incidentally, satisfies 2NF and 3NF
• Decomposition helps to attain 1NF for the embedded one-to-many relationship
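The move from a multi-valued telephone attribute to two relations can be sketched as follows. The customer data, the attribute names, and the function to_1nf are hypothetical; the original tables are not reproduced in this text.

```python
# Hypothetical non-1NF data: each row holds a list of telephone numbers.
customers = [
    {"ID": 123, "name": "Rob", "telephones": ["555-861-2025", "192-122-1111"]},
    {"ID": 456, "name": "Sally", "telephones": ["555-403-1659"]},
    {"ID": 789, "name": "Maria", "telephones": []},
]

def to_1nf(rows):
    """Split rows into a parent relation (ID, name) and a child relation
    (ID, telephone) whose key is the pair (ID, telephone)."""
    parent = [(row["ID"], row["name"]) for row in rows]
    child = [(row["ID"], phone) for row in rows for phone in row["telephones"]]
    return parent, child

customer, customer_phone = to_1nf(customers)
```

Each customer appears once in the parent relation, and a customer may have zero, one, or many rows in the child relation. This avoids both the fixed "two telephone columns" limit and the duplicated names.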


Module Summary

• Identified the features of good relational design
• Became familiar with the First Normal Form

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.


Database Management Systems
Module 22: Relational Database Design/2

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur


Module Recap PPD

• Identified the features of good relational design
• Familiarized with the First Normal Form


Module Objectives PPD

• To introduce Functional Dependencies


Module Outline PPD

• Functional Dependencies


Functional Dependencies PPD


Goal: Devise a Theory for Good Relations

• Decide whether a particular relation R is in “good” form.
• In the case that a relation R is not in “good” form, decompose it into a set of relations {R1, R2, ..., Rn} such that
  ◦ each relation is in good form
  ◦ the decomposition is a lossless-join decomposition
• The theory is based on:
  ◦ Functional dependencies
  ◦ Multivalued dependencies
  ◦ Other dependencies


Functional Dependencies

• Constraints on the set of legal relations
• Require that the value for a certain set of attributes determines uniquely the value for another set of attributes
• A functional dependency is a generalization of the notion of a key


Functional Dependencies (2)

• Let R be a relation schema
  α ⊆ R and β ⊆ R
• The functional dependency or FD
  α → β
  holds on R if and only if for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β. That is,
  t1[α] = t2[α] ⇒ t1[β] = t2[β]
• Example: Consider r(A, B) with the following instance of r.

[Table: instance of r, not recoverable from the source]

• On this instance, A → B does NOT hold, but B → A does hold. So tuples like (2, 4), (3, 5), or (4, 7) cannot be added to the current instance without violating B → A.
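The definition translates directly into a pairwise check over an instance. In the sketch below, the function satisfies is an illustrative helper, and the instance of r(A, B) is an assumption chosen to be consistent with the statements on the slide, because the slide's own table is not reproduced in this text.

```python
from itertools import combinations

def satisfies(instance, attrs, lhs, rhs):
    """Return True if every pair of tuples that agrees on lhs also agrees on rhs."""
    index = {a: i for i, a in enumerate(attrs)}
    for t1, t2 in combinations(instance, 2):
        agree_on_lhs = all(t1[index[a]] == t2[index[a]] for a in lhs)
        agree_on_rhs = all(t1[index[a]] == t2[index[a]] for a in rhs)
        if agree_on_lhs and not agree_on_rhs:
            return False
    return True

# Assumed instance of r(A, B); not taken from the slide's table.
ATTRS = ("A", "B")
r = [(1, 4), (1, 5), (3, 7)]

a_determines_b = satisfies(r, ATTRS, ("A",), ("B",))   # False: A = 1 maps to 4 and 5
b_determines_a = satisfies(r, ATTRS, ("B",), ("A",))   # True on this instance

# Adding (2, 4) gives B = 4 two different A values, so B -> A stops holding.
still_holds = satisfies(r + [(2, 4)], ATTRS, ("B",), ("A",))
```

A check like this only tells us whether one particular instance satisfies the dependency. Whether the dependency holds on the schema is a statement about all legal instances.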

Functional Dependencies (3)

• K is a superkey for relation schema R if and only if K → R
• K is a candidate key for R if and only if
  ◦ K → R and
  ◦ for no α ⊂ K, α → R
• Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema:
  inst_dept(ID, name, salary, dept_name, building, budget)
• We expect these functional dependencies to hold:
  dept_name → building
  dept_name → budget
  ID → budget
  but would not expect the following to hold:
  dept_name → salary


Functional Dependencies (4)

• We use functional dependencies to:
  ◦ test relations to see if they are legal under a given set of functional dependencies.
    . If a relation r is legal under a set F of functional dependencies, we say that r satisfies F
  ◦ specify constraints on the set of legal relations
    . We say that F holds on R if all legal relations on R satisfy the set of functional dependencies F
• Note: A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances
  ◦ For example, a specific instance of instructor may, by chance, satisfy
    name → ID
  ◦ In such cases we do not say that F holds on R


Functional Dependencies (5)

• A functional dependency is trivial if it is satisfied by all instances of a relation
  ◦ Example:
    . ID, name → ID
    . name → name
• In general, α → β is trivial if β ⊆ α.


Functional Dependencies (6) PPD

[Table: example relation over StudentID, Semester, Lecture, and TA, not recoverable from the source]

• Functional dependencies are:
  ◦ StudentID → Semester
  ◦ StudentID, Lecture → TA
  ◦ {StudentID, Lecture} → {TA, Semester}


Functional Dependencies (7) PPD

[Table: example relation over EmployeeID, EmployeeName, DepartmentID, and DepartmentName, not recoverable from the source]

• Functional dependencies are:
  ◦ EmployeeID → EmployeeName
  ◦ EmployeeID → DepartmentID
  ◦ DepartmentID → DepartmentName


Functional Dependencies (9): Closure of a Set of FDs

• F = {A → B, B → C}
• F+ = {A → B, B → C, A → C, ...}
  (trivial and augmented dependencies such as A → A and AB → C are also in F+ but are omitted here)


Module Summary

• Introduced the notion of Functional Dependencies

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.


Database Management Systems
Module 23: Relational Database Design/3

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur


Module Recap PPD

• Introduced the notion of Functional Dependencies


Module Objectives PPD

• To develop the theory of functional dependencies
• To understand how a schema can be decomposed for a ‘good’ design using functional dependencies


Module Outline PPD

• Functional Dependency Theory
• Decomposition Using Functional Dependencies


Functional Dependency Theory PPD


Functional Dependencies: Armstrong’s Axioms

• Given a set of Functional Dependencies F, we can infer new dependencies by Armstrong’s Axioms:
  ◦ Reflexivity: if β ⊆ α, then α → β
  ◦ Augmentation: if α → β, then γα → γβ
  ◦ Transitivity: if α → β and β → γ, then α → γ
• These axioms can be applied repeatedly to generate new FDs, which are added to F
• A new FD obtained by applying the axioms is said to be logically implied by F
• The process of generating FDs terminates after a finite number of steps, and the resulting set is called the closure F+ of F. This is the set of all FDs logically implied by F
• Clearly, F ⊆ F+
• These axioms are
  ◦ Sound (generate only functional dependencies that actually hold), and
  ◦ Complete (eventually generate all functional dependencies that hold)
• Prove the axioms from the definitions of FDs
• Prove the soundness and completeness of the axioms

Functional Dependencies (2): Closure of a Set of FDs

• F = {A → B, B → C}
• F+ = {A → B, B → C, A → C, ...}
  (trivial and augmented dependencies such as A → A and AB → C are also in F+ but are omitted here)


Functional Dependencies (3): Closure of a Set of FDs

• R = (A, B, C, G, H, I)
  F = {A → B,
       A → C,
       CG → H,
       CG → I,
       B → H}
• Some members of F+
  ◦ A → H
    . by transitivity from A → B and B → H
  ◦ AG → I
    . by augmenting A → C with G, to get AG → CG and then transitivity with CG → I
  ◦ CG → HI
    . by augmenting CG → I with CG to infer CG → CGI, and augmenting CG → H with I to infer CGI → HI, and then transitivity

Functional Dependencies (4): Closure of a Set of FDs: Computing F+

• To compute the closure of a set of functional dependencies F:

  F+ ← F
  repeat
    for each functional dependency f in F+
      apply reflexivity and augmentation rules on f
      add the resulting functional dependencies to F+
    for each pair of functional dependencies f1 and f2 in F+
      if f1 and f2 can be combined using transitivity
        then add the resulting functional dependency to F+
  until F+ does not change any further

• Note: We shall see an alternative procedure for this task later


Functional Dependencies (5): Armstrong’s Axioms: Derived Rules

• Additional Derived Rules:
  ◦ Union: if α → β holds and α → γ holds, then α → βγ holds
  ◦ Decomposition: if α → βγ holds, then α → β holds and α → γ holds
  ◦ Pseudotransitivity: if α → β holds and γβ → δ holds, then αγ → δ holds
• The above rules can be inferred from the basic Armstrong’s axioms (and hence are not included in the basic set). They can be proven independently too
  ◦ Reflexivity: if β ⊆ α, then α → β
  ◦ Augmentation: if α → β, then γα → γβ
  ◦ Transitivity: if α → β and β → γ, then α → γ
• Prove the Rules from:
  ◦ Basic Axioms
  ◦ The definitions of FDs
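One possible derivation of the three derived rules from the basic axioms is sketched below. It is offered as a worked example, not as the proof intended by the slides. Attribute sets are written by juxtaposition, so αα = α.

```latex
\begin{align*}
\textbf{Union.}\quad & \text{Given } \alpha \to \beta \text{ and } \alpha \to \gamma: \\
& \alpha \to \alpha\beta      && \text{augment } \alpha \to \beta \text{ with } \alpha \\
& \alpha\beta \to \beta\gamma && \text{augment } \alpha \to \gamma \text{ with } \beta \\
& \alpha \to \beta\gamma      && \text{transitivity} \\[6pt]
\textbf{Decomposition.}\quad & \text{Given } \alpha \to \beta\gamma: \\
& \beta\gamma \to \beta       && \text{reflexivity, since } \beta \subseteq \beta\gamma \\
& \alpha \to \beta            && \text{transitivity (and symmetrically } \alpha \to \gamma\text{)} \\[6pt]
\textbf{Pseudotransitivity.}\quad & \text{Given } \alpha \to \beta \text{ and } \gamma\beta \to \delta: \\
& \gamma\alpha \to \gamma\beta && \text{augment } \alpha \to \beta \text{ with } \gamma \\
& \alpha\gamma \to \delta      && \text{transitivity with } \gamma\beta \to \delta
\end{align*}
```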


Functional Dependencies (6): Closure of Attribute Sets

• Given a set of attributes α, define the closure of α under F (denoted by α+) as the set of attributes that are functionally determined by α under F
• Algorithm to compute α+, the closure of α under F

  result ← α
  while (changes to result) do
    for each β → γ in F do
      begin
        if β ⊆ result then result ← result ∪ γ
      end
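The algorithm above can be sketched in a few lines of Python. The representation is an assumption of this sketch: an FD is a pair (lhs, rhs) of Python sets, and the name attribute_closure is illustrative. The set F is the one used in the closure examples of this module.

```python
def attribute_closure(attrs, fds):
    """Return the closure of attrs under fds.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:                      # "while (changes to result)"
        changed = False
        for lhs, rhs in fds:            # "for each beta -> gamma in F"
            if lhs <= result and not rhs <= result:
                result |= rhs           # "result <- result U gamma"
                changed = True
    return result

# F = {A -> B, A -> C, CG -> H, CG -> I, B -> H}
F = [
    ({"A"}, {"B"}),
    ({"A"}, {"C"}),
    ({"C", "G"}, {"H"}),
    ({"C", "G"}, {"I"}),
    ({"B"}, {"H"}),
]

ag_closure = attribute_closure({"A", "G"}, F)
```

The loop stops as soon as a full pass over F adds nothing. Since each productive pass adds at least one attribute, the number of passes is bounded by the number of attributes.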


Functional Dependencies (7): Closure of Attribute Sets: Example

• R = (A, B, C, G, H, I)
• F = {A → B, A → C, CG → H, CG → I, B → H}
• (AG)+
  a) result = AG
  b) result = ABCG (A → C and A → B)
  c) result = ABCGH (CG → H and CG ⊆ AGBC)
  d) result = ABCGHI (CG → I and CG ⊆ AGBCH)
• Is AG a candidate key?
  a) Is AG a super key?
     i) Does AG → R? == Is (AG)+ ⊇ R
  b) Is any subset of AG a superkey?
     i) Does A → R? == Is (A)+ ⊇ R
     ii) Does G → R? == Is (G)+ ⊇ R
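The candidate-key questions above can be answered mechanically with attribute closure. The following is a minimal sketch. The function names and the representation of FDs as pairs of sets are assumptions of the sketch.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_superkey(attrs, schema, fds):
    """attrs is a superkey when its closure contains every attribute of the schema."""
    return closure(attrs, fds) >= set(schema)

def is_candidate_key(attrs, schema, fds):
    """A candidate key is a superkey none of whose proper subsets is a superkey."""
    attrs = set(attrs)
    if not is_superkey(attrs, schema, fds):
        return False
    return not any(
        is_superkey(subset, schema, fds)
        for size in range(len(attrs))              # sizes 0 .. len-1: proper subsets
        for subset in combinations(sorted(attrs), size)
    )

R = set("ABCGHI")
F = [
    ({"A"}, {"B"}),
    ({"A"}, {"C"}),
    ({"C", "G"}, {"H"}),
    ({"C", "G"}, {"I"}),
    ({"B"}, {"H"}),
]

ag_is_candidate_key = is_candidate_key("AG", R, F)
```

Here (A)+ = ABCH and (G)+ = G, so neither subset is a superkey, and AG is a candidate key.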


Functional Dependencies (7): Closure of Attribute Sets: Use

There are several uses of the attribute closure algorithm:
• Testing for superkey:
  ◦ To test if α is a superkey, we compute α+, and check if α+ contains all attributes of R.
• Testing functional dependencies
  ◦ To check if a functional dependency α → β holds (or, in other words, is in F+), just check if β ⊆ α+
  ◦ That is, we compute α+ by using attribute closure, and then check if it contains β.
  ◦ This is a simple and cheap test, and very useful
• Computing closure of F
  ◦ For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a functional dependency γ → S.
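The last two uses can be sketched as follows for a small schema. The helper names implies and fd_closure, and the choice of R = (A, B, C) with F = {A → B, B → C}, are assumptions made for illustration.

```python
from itertools import chain, combinations

def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def subsets(attrs):
    """All subsets of attrs, including the empty set, as tuples."""
    items = sorted(attrs)
    return chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))

def implies(fds, lhs, rhs):
    """Test whether lhs -> rhs is in F+ by checking rhs against the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)

def fd_closure(schema, fds):
    """Enumerate F+ as a set of (lhs, rhs) pairs of frozensets."""
    result = set()
    for gamma in subsets(schema):
        for s in subsets(closure(gamma, fds)):
            result.add((frozenset(gamma), frozenset(s)))
    return result

F = [({"A"}, {"B"}), ({"B"}, {"C"})]
f_plus = fd_closure("ABC", F)
```

The enumeration includes every trivial dependency, such as A → A or any set determining the empty set, so F+ is much larger than the three dependencies usually listed. Its size grows exponentially with the number of attributes, which is why the single-dependency test in implies is preferred in practice.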


Decomposition using Functional Dependencies PPD


BCNF: Boyce-Codd Normal Form

• A relation schema R is in BCNF with respect to a set F of FDs if for all FDs in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:
  ◦ α → β is trivial (that is, β ⊆ α)
  ◦ α is a superkey for R
• Example schema not in BCNF:
  inst_dept(ID, name, salary, dept_name, building, budget)
  because the non-trivial dependency dept_name → building, budget holds on inst_dept, but dept_name is not a superkey


BCNF (2): Decomposition

• If a non-trivial dependency α → β on schema R causes a violation of BCNF, we decompose R into:
  ◦ α ∪ β
  ◦ (R − (β − α))
• In our example,
  ◦ α = dept_name
  ◦ β = building, budget
  ◦ dept_name → building, budget
  inst_dept is replaced by
  ◦ (α ∪ β) = (dept_name, building, budget)
    . dept_name → building, budget
  ◦ (R − (β − α)) = (ID, name, salary, dept_name)
    . ID → name, salary, dept_name
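A single decomposition step can be sketched as below. The function names are hypothetical. The sketch inspects only the dependencies listed in F, which is enough to test the original schema R but is a simplification for the relations produced by a decomposition, where dependencies in F+ may also matter.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violation(schema, fds):
    """Return the first listed FD that is non-trivial and whose left side
    is not a superkey of schema, or None if there is no such FD."""
    schema = set(schema)
    for lhs, rhs in fds:
        if not rhs <= lhs and not closure(lhs, fds) >= schema:
            return lhs, rhs
    return None

def decompose_once(schema, fds):
    """Apply one step: replace R by (alpha U beta) and (R - (beta - alpha))."""
    schema = set(schema)
    violation = bcnf_violation(schema, fds)
    if violation is None:
        return [schema]
    alpha, beta = violation
    return [alpha | beta, schema - (beta - alpha)]

inst_dept = {"ID", "name", "salary", "dept_name", "building", "budget"}
F = [
    ({"ID"}, {"name", "salary", "dept_name"}),
    ({"dept_name"}, {"building", "budget"}),
]

parts = decompose_once(inst_dept, F)
```

On inst_dept the step produces exactly the two schemas shown above: (dept_name, building, budget) and (ID, name, salary, dept_name).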


BCNF (3): Lossless Join

• If we decompose a relation R into relations R1 and R2:
  ◦ Decomposition is lossy if R1 ⋈ R2 ⊃ R
  ◦ Decomposition is lossless if R1 ⋈ R2 = R
• To check for lossless join decomposition using the FD set, the following must hold:
  ◦ Union of the attributes of R1 and R2 must be equal to the attributes of R

    R1 ∪ R2 = R

  ◦ Intersection of the attributes of R1 and R2 must not be empty

    R1 ∩ R2 ≠ ∅

  ◦ The common attributes must be a superkey for at least one relation (R1 or R2)

    R1 ∩ R2 → R1 or R1 ∩ R2 → R2

• Prove that BCNF ensures Lossless Join
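The three conditions can be checked with attribute closure. In this sketch the function is_lossless is an illustrative name, and the dependency ID → name, street, city, salary on employee is an assumption, since the slides do not list the FDs of that relation.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless(r1, r2, schema, fds):
    """Binary lossless-join test following the three conditions above."""
    r1, r2, schema = set(r1), set(r2), set(schema)
    common = r1 & r2
    if r1 | r2 != schema or not common:
        return False
    determined = closure(common, fds)
    # The common attributes must determine all of R1 or all of R2.
    return r1 <= determined or r2 <= determined

# inst_dept split on dept_name -> building, budget
inst_dept = {"ID", "name", "salary", "dept_name", "building", "budget"}
inst_fds = [
    ({"ID"}, {"name", "salary", "dept_name"}),
    ({"dept_name"}, {"building", "budget"}),
]
good = is_lossless(
    {"dept_name", "building", "budget"},
    {"ID", "name", "salary", "dept_name"},
    inst_dept,
    inst_fds,
)

# employee split on name, assuming only ID determines the other attributes
employee = {"ID", "name", "street", "city", "salary"}
emp_fds = [({"ID"}, {"name", "street", "city", "salary"})]
bad = is_lossless({"ID", "name"}, {"name", "street", "city", "salary"}, employee, emp_fds)
```

The inst_dept split passes because dept_name determines all of (dept_name, building, budget). The employee split fails because name determines neither side.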

BCNF (4): Dependency Preservation

• Constraints, including FDs, are costly to check in practice unless they pertain to only one relation
• If it is sufficient to test only those dependencies on each individual relation of a decomposition in order to ensure that all functional dependencies hold, then that decomposition is dependency preserving.
• It is not always possible to achieve both BCNF and dependency preservation. Consider:
  ◦ R = CSZ, F = {CS → Z, Z → C}
  ◦ Key = CS
  ◦ CS → Z satisfies BCNF, but Z → C violates it
  ◦ Decompose as: R1 = ZC, R2 = CSZ − (C − Z) = SZ
  ◦ R1 ∪ R2 = CSZ = R, R1 ∩ R2 = Z ≠ ∅, and R1 ∩ R2 = Z → ZC = R1. So it has lossless join
  ◦ However, we cannot check CS → Z without doing a join. Hence it is not dependency preserving
• We consider a weaker normal form, known as Third Normal Form (3NF)
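Dependency preservation can be tested without computing F+. The sketch below uses a standard closure-based procedure that is not stated on the slide: for each α → β in F, grow a result set from α using only what each Ri can see, and check whether β is reached. The function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def preserves(fd, decomposition, fds):
    """Check whether a single FD can be enforced using the relations separately."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # Attributes of Ri that are determined by the part of result inside Ri.
            gained = closure(result & ri, fds) & ri
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result

def is_dependency_preserving(decomposition, fds):
    """A decomposition is dependency preserving when every FD in F is preserved."""
    return all(preserves(fd, decomposition, fds) for fd in fds)

# R = CSZ, F = {CS -> Z, Z -> C}, decomposed into R1 = ZC and R2 = SZ
F = [({"C", "S"}, {"Z"}), ({"Z"}, {"C"})]
decomposition = [{"Z", "C"}, {"S", "Z"}]

cs_z_preserved = preserves(F[0], decomposition, F)
z_c_preserved = preserves(F[1], decomposition, F)
```

Z → C survives because both attributes sit in R1. CS → Z does not, because C and S never appear together in one relation of the decomposition.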
3NF: Third Normal Form

• A relation schema R is in third normal form (3NF) if for all
α → β ∈ F+
at least one of the following holds:
◦ α → β is trivial (that is, β ⊆ α)
◦ α is a superkey for R
◦ Each attribute A in β − α is contained in a candidate key for R
(Note: Each attribute may be in a different candidate key)
• If a relation is in BCNF it is in 3NF (since in BCNF one of the first two conditions above must hold)
• The third condition is a minimal relaxation of BCNF to ensure dependency preservation (will see why later)
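The difference between the BCNF and 3NF conditions can be seen on the schema R = CSZ with F = {CS → Z, Z → C} discussed earlier. The sketch below is illustrative only: the function names are assumptions, candidate keys are found by brute force, and only the dependencies listed in F are tested (the usual shortcut when checking R itself rather than a decomposed relation).

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Brute-force search for minimal attribute sets whose closure is all of R."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attrs:
                keys.append(candidate)
    return keys


def violations(attrs, fds):
    """Return (FDs violating BCNF, FDs violating 3NF) among the FDs in `fds`."""
    prime = set().union(*candidate_keys(attrs, fds))
    bcnf_bad, nf3_bad = [], []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs) or closure(lhs, fds) == set(attrs):
            continue  # trivial, or lhs is a superkey
        bcnf_bad.append((lhs, rhs))
        if not (set(rhs) - set(lhs)) <= prime:
            nf3_bad.append((lhs, rhs))  # third 3NF condition fails as well
    return bcnf_bad, nf3_bad


fds = [("CS", "Z"), ("Z", "C")]
print(candidate_keys("CSZ", fds))  # two keys: CS and SZ
print(violations("CSZ", fds))      # ([('Z', 'C')], [])
```

Z → C breaks BCNF because Z is not a superkey, but it is allowed in 3NF because C belongs to the candidate key CS.
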


Goals of Normalization

• Let R be a relation scheme with a set F of functional dependencies
• Decide whether a relation scheme R is in “good” form
• In the case that a relation scheme R is not in “good” form, decompose it into a set of relation schemes {R1, R2, ..., Rn} such that
◦ each relation scheme is in good form
◦ the decomposition is a lossless-join decomposition
◦ Preferably, the decomposition should be dependency preserving


Problems with Decomposition

There are three potential problems to consider:
• May be impossible to reconstruct the original relation! (Lossiness)
• Dependency checking may require joins
• Some queries become more expensive
◦ What is the building for an instructor?
Tradeoff: Must consider these issues vs. redundancy


How good is BCNF?

• There are database schemas in BCNF that do not seem to be sufficiently normalized
• Consider a relation
inst_info (ID, child_name, phone)
◦ where an instructor may have more than one phone and can have multiple children
[Table: inst_info]


How good is BCNF? (2)

• There are no non-trivial functional dependencies and therefore the relation is in BCNF
• Insertion anomalies – that is, if we add a phone 981-992-3443 to 99999, we need to add two tuples
(99999, David, 981-992-3443)
(99999, William, 981-992-3443)


How good is BCNF? (3)

• Therefore, it is better to decompose inst_info into:
[Table: inst_child]
[Table: inst_phone]
• This suggests the need for higher normal forms, such as the Fourth Normal Form (4NF)


Module Summary

• Introduced the theory of functional dependencies
• Discussed issues in “good” design in the context of functional dependencies

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.


Database Management Systems
Module 24: Relational Database Design/4

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur


Module Recap PPD

• Introduced the theory of functional dependencies
• Discussed issues in “good” design in the context of functional dependencies


Module Objectives PPD

• To learn algorithms for properties of functional dependencies


Module Outline PPD

• Algorithms for Functional Dependencies


Algorithms for Functional Dependencies PPD



Attribute Set Closure

• R = (A, B, C, G, H, I)
• F = {A → B, A → C, CG → H, CG → I, B → H}
• (AG)+
  a) result = AG
  b) result = ABCG (A → C and A → B)
  c) result = ABCGH (CG → H and CG ⊆ AGBC)
  d) result = ABCGHI (CG → I and CG ⊆ AGBCH)
• Is AG a candidate key?
  a) Is AG a superkey?
     i) Does AG → R? == Is (AG)+ ⊇ R
  b) Is any proper subset of AG a superkey?
     i) Does A → R? == Is (A)+ ⊇ R
     ii) Does G → R? == Is (G)+ ⊇ R
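The closure computation traced above can be sketched as a fixed-point loop. This is an illustrative sketch; the function name `closure` and the encoding of FDs as pairs of strings (one character per attribute) are assumptions, not notation from the slides.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs once lhs is inside the current result.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


R = set("ABCGHI")
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(sorted(closure("AG", F)))  # ['A', 'B', 'C', 'G', 'H', 'I']
print(closure("AG", F) >= R)     # True: AG is a superkey
print(closure("A", F) >= R)      # False: A+ = ABCH
print(closure("G", F) >= R)      # False: G+ = G
```

Since AG is a superkey and neither A nor G alone is one, AG is a candidate key.
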


Attribute Set Closure: Uses

There are several uses of the attribute closure algorithm:
• Testing for superkey:
◦ To test if α is a superkey, we compute α+, and check if α+ contains all attributes of R.
• Testing functional dependencies
◦ To check if a functional dependency α → β holds (or, in other words, is in F+), just check if β ⊆ α+.
◦ That is, we compute α+ by using attribute closure, and then check if it contains β.
◦ Is a simple and cheap test, and very useful
• Computing closure of F
◦ For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a functional dependency γ → S.
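The first two uses reduce to one closure call each. A minimal sketch follows; the helper names `is_superkey` and `implies` are assumptions introduced here, and the schema is the R = (A, B, C, G, H, I) example used for attribute closure.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(alpha, schema, fds):
    """alpha is a superkey if alpha+ contains every attribute of the schema."""
    return closure(alpha, fds) >= set(schema)


def implies(fds, alpha, beta):
    """alpha -> beta is in F+ exactly when beta is a subset of alpha+."""
    return set(beta) <= closure(alpha, fds)


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print(is_superkey("AG", "ABCGHI", F))  # True
print(implies(F, "AG", "I"))           # True
print(implies(F, "A", "I"))            # False: I needs G as well
```
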


Extraneous Attributes

• Consider a set F of FDs and the FD α → β in F.
◦ Attribute A is extraneous in α if A ∈ α and F logically implies (F − {α → β}) ∪ {(α − A) → β}.
◦ Attribute A is extraneous in β if A ∈ β and the set of FDs (F − {α → β}) ∪ {α → (β − A)} logically implies F.
• Note: Implication in the opposite direction is trivial in each of the cases above, since a “stronger” functional dependency always implies a weaker one
• Example: Given F = {A → C, AB → C}
◦ B is extraneous in AB → C because {A → C, AB → C} logically implies A → C (that is, the result of dropping B from AB → C).
◦ A+ = AC in {A → C, AB → C}
• Example: Given F = {A → C, AB → CD}
◦ C is extraneous in AB → CD since AB → C can be inferred even after deleting C
◦ (AB)+ = ABCD in {A → C, AB → D}


Extraneous Attributes (2): Tests

• Consider a set F of functional dependencies and the functional dependency α → β in F.
• To test if attribute A ∈ α is extraneous in α
  a) Compute (α − {A})+ using the dependencies in F
  b) Check that (α − {A})+ contains β; if it does, A is extraneous in α
• To test if attribute A ∈ β is extraneous in β
  a) Compute α+ using only the dependencies in
     F′ = (F − {α → β}) ∪ {α → (β − A)}
  b) Check that α+ contains A; if it does, A is extraneous in β
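Both tests translate directly into closure computations. The sketch below is one possible encoding, with assumed function names `extraneous_in_lhs` and `extraneous_in_rhs`; FDs are (lhs, rhs) pairs of strings with one character per attribute.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def extraneous_in_lhs(attr, lhs, rhs, fds):
    """Test a): does (lhs - attr)+ under F already contain rhs?"""
    return set(rhs) <= closure(set(lhs) - {attr}, fds)


def extraneous_in_rhs(attr, lhs, rhs, fds):
    """Test b): does lhs+ under F' = (F - {lhs->rhs}) U {lhs->(rhs - attr)} contain attr?"""
    f_prime = [fd for fd in fds if fd != (lhs, rhs)]
    f_prime.append((lhs, "".join(sorted(set(rhs) - {attr}))))
    return attr in closure(lhs, f_prime)


F1 = [("A", "C"), ("AB", "C")]
print(extraneous_in_lhs("B", "AB", "C", F1))   # True: A+ = AC already contains C

F2 = [("A", "C"), ("AB", "CD")]
print(extraneous_in_rhs("C", "AB", "CD", F2))  # True: (AB)+ = ABCD under F'
print(extraneous_in_rhs("D", "AB", "CD", F2))  # False: nothing else yields D
```
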


Equivalence of Sets of Functional Dependencies PPD

• Let F and G be two sets of functional dependencies.
◦ F and G are equivalent if F+ = G+. That is:
(F+ = G+) ⇔ (F+ ⊇ G and G+ ⊇ F)
◦ Equivalence means that every functional dependency in F can be inferred from G, and every functional dependency in G can be inferred from F
• F and G are equivalent if and only if
◦ F covers G: every functional dependency of G is logically implied by F, that is, F+ ⊇ G.
◦ G covers F: every functional dependency of F is logically implied by G, that is, G+ ⊇ F.
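Because F+ itself is expensive to enumerate, the "covers" relation is normally tested one dependency at a time with attribute closure. A minimal sketch, with assumed helper names `covers` and `equivalent` and small FD sets chosen for illustration:

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """F covers G if every lhs -> rhs in G satisfies rhs ⊆ lhs+ computed under F."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """F and G are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)


F = [("A", "B"), ("B", "C"), ("A", "CD")]
G = [("A", "B"), ("B", "C"), ("A", "D")]
print(equivalent(F, G))                              # True
print(covers([("A", "B")], [("A", "B"), ("B", "C")]))  # False: B -> C is not implied
```
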


Canonical Cover

• Sets of FDs may have redundant dependencies that can be inferred from the others
• Can we have some kind of “optimal” or “minimal” set of FDs to work with?
• A Canonical Cover for F is a set of dependencies Fc such that ALL the following properties are satisfied:
◦ F+ = Fc+. Or,
  . F logically implies all dependencies in Fc
  . Fc logically implies all dependencies in F
◦ No functional dependency in Fc contains an extraneous attribute
◦ Each left side of a functional dependency in Fc is unique. That is, there are no two dependencies α1 → β1 and α2 → β2 in Fc such that α1 = α2
• Intuitively, a canonical cover of F is a minimal set of FDs
◦ Equivalent to F
◦ Having no redundant FDs
◦ No redundant parts of FDs
• Also called a Minimal / Irreducible Set of Functional Dependencies

Canonical Cover (2): Example

• For example: A → C is redundant in: {A → B, B → C, A → C}
• Parts of a functional dependency may be redundant
◦ For example: on RHS: {A → B, B → C, A → CD} can be simplified to {A → B, B → C, A → D}
  – In the forward: (1) A → CD ⇒ A → C and A → D
    (2) A → B, B → C ⇒ A → C
  – In the reverse: (1) A → B, B → C ⇒ A → C
    (2) A → C, A → D ⇒ A → CD
◦ For example: on LHS: {A → B, B → C, AC → D} can be simplified to {A → B, B → C, A → D}
  – In the forward: (1) A → B, B → C ⇒ A → C ⇒ A → AC
    (2) A → AC, AC → D ⇒ A → D
  – In the reverse: A → D ⇒ AC → D


Canonical Cover (3): RHS PPD

• {A → B, B → C, A → CD} ⇒ {A → B, B → C, A → D}
◦ (1) A → CD ⇒ A → C and A → D
  (2) A → B, B → C ⇒ A → C
◦ A+ = ABCD
• {A → B, B → C, A → D} ⇒ {A → B, B → C, A → CD}
◦ A → B, B → C ⇒ A → C
◦ A → C, A → D ⇒ A → CD
◦ A+ = ABCD


Canonical Cover (4): LHS PPD

• {A → B, B → C, AC → D} ⇒ {A → B, B → C, A → D}
◦ A → B, B → C ⇒ A → C ⇒ A → AC
◦ A → AC, AC → D ⇒ A → D
◦ A+ = ABCD
• {A → B, B → C, A → D} ⇒ {A → B, B → C, AC → D}
◦ A → D ⇒ AC → D
◦ (AC)+ = ABCD


Canonical Cover (5) PPD

• To compute a canonical cover for F:
  Fc = F
  repeat
      Use the union rule to replace any dependencies in Fc of the form
          α1 → β1 and α1 → β2 with α1 → β1 β2
      Find a functional dependency α → β in Fc with an extraneous attribute either in α or in β
          /* Note: test for extraneous attributes done using Fc, not F */
      If an extraneous attribute is found, delete it from α → β
  until Fc does not change
• Note: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied
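The repeat-until loop can be sketched in Python as follows. This is one possible rendering under stated assumptions, not a definitive implementation: the function name `canonical_cover` is introduced here, attributes are single characters, attributes are tried in alphabetical order, and a dependency whose right side becomes empty is dropped. A canonical cover is not unique in general, so a different order of deletions may give a different but equivalent result.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, an iterable of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def canonical_cover(fds):
    fc = [(frozenset(lhs), frozenset(rhs)) for lhs, rhs in fds]
    changed = True
    while changed:
        changed = False
        # Union rule: merge dependencies sharing a left side; drop empty right sides.
        merged = {}
        for lhs, rhs in fc:
            if rhs:
                merged[lhs] = merged.get(lhs, frozenset()) | rhs
        fc = list(merged.items())
        # Find one extraneous attribute, always testing against the current Fc.
        for i, (lhs, rhs) in enumerate(fc):
            reduced = None
            for a in sorted(lhs):
                if len(lhs) > 1 and rhs <= closure(lhs - {a}, fc):
                    reduced = (lhs - {a}, rhs)  # a is extraneous on the left
                    break
            if reduced is None:
                for a in sorted(rhs):
                    rest = fc[:i] + [(lhs, rhs - {a})] + fc[i + 1:]
                    if a in closure(lhs, rest):
                        reduced = (lhs, rhs - {a})  # a is extraneous on the right
                        break
            if reduced is not None:
                fc[i] = reduced
                changed = True
                break  # go back and re-apply the union rule
    return sorted(("".join(sorted(l)), "".join(sorted(r))) for l, r in fc)


print(canonical_cover([("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]))
# [('A', 'B'), ('B', 'C')]
```
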


Canonical Cover (6): Example

• R = (A, B, C)
F = {A → BC, B → C, A → B, AB → C}
• Combine A → BC and A → B into A → BC
◦ Set is now {A → BC, B → C, AB → C}
• A is extraneous in AB → C
◦ Check if the result of deleting A from AB → C is implied by the other dependencies
  . Yes: in fact, B → C is already present!
◦ Set is now {A → BC, B → C}
• C is extraneous in A → BC
◦ Check if A → C is logically implied by A → B and the other dependencies
  . Yes: using transitivity on A → B and B → C.
  – Can use attribute closure of A in more complex cases
• The canonical cover is: A → B, B → C

Practice Problems on Functional Dependencies PPD

• Find if a given functional dependency is implied from a set of Functional Dependencies:
  a) For: A → BC, CD → E, E → C, D → AEH, ABH → BD, DH → BC
     i) Check: BCD → H
     ii) Check: AED → C
  b) For: AB → CD, AF → D, DE → F, C → G, F → E, G → A
     i) Check: CF → DF
     ii) Check: BG → E
     iii) Check: AF → G
     iv) Check: AB → EF
  c) For: A → BC, B → E, CD → EF
     i) Check: AD → F


Practice Problems on Functional Dependencies (2) PPD

• Find Super Key using Functional Dependencies:
  a) Relational Schema R(ABCDE). Functional dependencies: AB → C, DE → B, CD → E
  b) Relational Schema R(ABCDE). Functional dependencies: AB → C, C → D, B → EA


Practice Problems on Functional Dependencies (3) PPD

• Find Candidate Key using Functional Dependencies:
  a) Relational Schema R(ABCDE). Functional dependencies: AB → C, DE → B, CD → E
  b) Relational Schema R(ABCDE). Functional dependencies: AB → C, C → D, B → EA


Practice Problems on Functional Dependencies (4) PPD

• Find Prime and Non Prime Attributes using Functional Dependencies:
  a) R(ABCDEF) having FDs {AB → C, C → D, D → E, F → B, E → F}
  b) R(ABCDEF) having FDs {AB → C, C → DE, E → F, C → B}
  c) R(ABCDEFGHIJ) having FDs {AB → C, A → DE, B → F, F → GH, D → IJ}
  d) R(ABDLPT) having FDs {B → PT, A → D, T → L}
  e) R(ABCDEFGH) having FDs {E → G, AB → C, AC → B, AD → E, B → D, BC → A}
  f) R(ABCDE) having FDs {A → BC, CD → E, B → D, E → A}
  g) R(ABCDEH) having FDs {A → B, BC → D, E → C, D → A}
• Prime Attributes: Attributes that belong to any candidate key are called Prime Attributes
◦ The set of prime attributes is the union of all the candidate keys: {CK1 ∪ CK2 ∪ CK3 ∪ · · · }
◦ If a prime attribute is determined by another attribute set, then more than one candidate key is possible.
◦ For example, if A is a candidate key and X → A, then X is a superkey, so X (or some subset of X) is also a candidate key.
• Non Prime Attributes: Attributes that do not belong to any candidate key are called Non Prime Attributes
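For small schemas, candidate keys and prime attributes can be found by brute force over attribute subsets. The sketch below is illustrative only: the function names are assumptions, the search is exponential in the number of attributes, and it is run on the R = (A, B, C, G, H, I) schema from the attribute closure slide rather than on the practice problems.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):  # smaller sets first, so keys stay minimal
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == schema:
                keys.append(candidate)
    return keys


def prime_attributes(schema, fds):
    """Union of all candidate keys."""
    return set().union(*candidate_keys(schema, fds))


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
keys = candidate_keys("ABCGHI", F)
prime = prime_attributes("ABCGHI", F)
print(keys)                            # one key: A and G together
print(sorted(prime))                   # ['A', 'G']
print(sorted(set("ABCGHI") - prime))   # ['B', 'C', 'H', 'I']
```
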


Practice Problems on Functional Dependencies (5) PPD

• Check the Equivalence of a Pair of Sets of Functional Dependencies:
  a) Consider the two sets F and G with their FDs as below:
     i) F: A → C, AC → D, E → AD, E → H
     ii) G: A → CD, E → AH
  b) Consider the two sets P and Q with their FDs as below:
     i) P: A → B, AB → C, D → ACE
     ii) Q: A → BC, D → AE


Practice Problems on Functional Dependencies (6) PPD

• Find the Minimal Cover or Irreducible Sets or Canonical Cover of a Set of Functional Dependencies:
  a) AB → CD, BC → D
  b) ABCD → E, E → D, AC → D, A → B


Module Summary

• Studied Algorithms for Properties of Functional Dependencies

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.


Database Management Systems
Module 25: Relational Database Design/5

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur


Module Recap PPD

• Studied Algorithms for Properties of Functional Dependencies


Module Objectives PPD

• To Understand the Characterizations for Lossless Join Decomposition
• To Understand the Characterizations for Dependency Preservation


Module Outline PPD

• Lossless Join Decomposition
• Dependency Preservation


Lossless join Decomposition PPD



Lossless Join Decomposition

• For the case of R = (R1, R2), we require that for all possible relations r on schema R
r = πR1(r) ⋈ πR2(r)
• A decomposition of R into R1 and R2 is lossless join if at least one of the following dependencies is in F+:
◦ R1 ∩ R2 → R1
◦ R1 ∩ R2 → R2
• The above functional dependencies are a sufficient condition for lossless join decomposition; the dependencies are a necessary condition only if all constraints are functional dependencies

A decomposition is lossless if it satisfies all of the following conditions:
• R1 ∪ R2 = R
• R1 ∩ R2 ≠ Φ and
• R1 ∩ R2 → R1 or R1 ∩ R2 → R2

Lossless Join Decomposition (2): Example PPD

• Consider Supplier_Parts schema: Supplier_Parts(S#, Sname, City, P#, Qty)
• Having dependencies: S# → Sname, S# → City, (S#, P#) → Qty
• Decompose as: Supplier(S#, Sname, City, Qty) and Parts(P#, Qty)
• Take Natural Join to reconstruct: Supplier ⋈ Parts
• We get extra tuples! Join is Lossy!
• Common attribute Qty is not a superkey in Supplier or in Parts
• Does not preserve (S#, P#) → Qty

Lossless Join Decomposition (3): Example PPD

• Consider Supplier_Parts schema: Supplier_Parts(S#, Sname, City, P#, Qty)
• Having dependencies: S# → Sname, S# → City, (S#, P#) → Qty
• Decompose as: Supplier(S#, Sname, City) and Parts(S#, P#, Qty)
• Take Natural Join to reconstruct: Supplier ⋈ Parts
• We get back the original relation. Join is Lossless.
• Common attribute S# is a superkey in Supplier
• Preserves all dependencies

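The two Supplier_Parts decompositions can be compared on actual tuples. The instance below is hypothetical (the tables on the slides were not recovered), and the helpers `project` and `natural_join` are assumptions written for this sketch; it only illustrates how the join on Qty produces spurious tuples while the join on S# does not.

```python
def project(rows, attrs):
    """Projection: keep only `attrs`, dropping duplicate tuples."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}


def natural_join(r1, r2):
    """Natural join of two relations given as sets of (attribute, value) tuples."""
    joined = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            # Tuples combine only if they agree on every shared attribute.
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(tuple(sorted({**d1, **d2}.items())))
    return joined


# Hypothetical instance of Supplier_Parts (not taken from the slides).
rows = [
    {"S#": "S1", "Sname": "Smith", "City": "London", "P#": "P1", "Qty": 100},
    {"S#": "S1", "Sname": "Smith", "City": "London", "P#": "P2", "Qty": 200},
    {"S#": "S2", "Sname": "Jones", "City": "Paris", "P#": "P2", "Qty": 100},
]
original = {tuple(sorted(row.items())) for row in rows}

lossy = natural_join(project(rows, ["S#", "Sname", "City", "Qty"]),
                     project(rows, ["P#", "Qty"]))
lossless = natural_join(project(rows, ["S#", "Sname", "City"]),
                        project(rows, ["S#", "P#", "Qty"]))

print(len(original), len(lossy), len(lossless))  # 3 5 3
print(lossless == original)                      # True
```

In the lossy case the join still contains every original tuple; the loss is that two extra tuples appear and can no longer be told apart from the real ones.
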
Lossless Join Decomposition (4): Example

• R = (A, B, C)
F = {A → B, B → C}
◦ Can be decomposed in two different ways
• R1 = (A, B), R2 = (B, C)
◦ Lossless-join decomposition:
R1 ∩ R2 = {B} and B → BC
◦ Dependency preserving
• R1 = (A, B), R2 = (A, C)
◦ Lossless-join decomposition:
R1 ∩ R2 = {A} and A → AB
◦ Not dependency preserving
(cannot check B → C without computing R1 ⋈ R2)


Practice Problems on Lossless Join PPD

• Check if the decomposition of R into D is lossless:
  a) R(ABC): F = {A → B, A → C}. D = R1(AB), R2(BC)
  b) R(ABCDEF): F = {A → B, B → C, C → D, E → F}. D = R1(AB), R2(BCD), R3(DEF)
  c) R(ABCDEF): F = {A → B, C → DE, AC → F}. D = R1(BE), R2(ACDEF)
  d) R(ABCDEG): F = {AB → C, AC → B, AD → E, B → D, BC → A, E → G}
     i) D1 = R1(AB), R2(BC), R3(ABDE), R4(EG)
     ii) D2 = R1(ABC), R2(ACDE), R3(ADG)
  e) R(ABCDEFGHIJ): F = {AB → C, B → F, D → IJ, A → DE, F → GH}
     i) D1 = R1(ABC), R2(ADE), R3(BF), R4(FGH), R5(DIJ)
     ii) D2 = R1(ABCDE), R2(BFGH), R3(DIJ)
     iii) D3 = R1(ABCD), R2(DE), R3(BF), R4(FGH), R5(DIJ)


Dependency Preservation PPD



Dependency Preservation

• Let Fi be the set of dependencies in F+ that include only attributes in Ri
◦ A decomposition is dependency preserving if
(F1 ∪ F2 ∪ · · · ∪ Fn)+ = F+
◦ If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive
• Let R be the original relational schema with FD set F, and let R1 and R2, with FD sets F1 and F2 respectively, be the decomposed sub-relations of R. Then:
◦ (F1 ∪ F2)+ = F+, that is, F1 ∪ F2 ≡ F: the decomposition preserves dependencies
◦ (F1 ∪ F2)+ ⊂ F+: the decomposition does NOT preserve dependencies
◦ (F1 ∪ F2)+ ⊃ F+: this is not possible, since F1 and F2 are subsets of F+

Dependency Preservation (2): Testing

• To check whether a decomposition D = {R1, R2, . . . , Rn} of R is dependency preserving, we can apply the following procedure
• The restriction of F+ to Ri is the set of all functional dependencies in F+ that include only attributes of Ri.
  compute F+;
  for each schema Ri in D do
  begin
      Fi = the restriction of F+ to Ri;
  end
  F′ = ∅
  for each restriction Fi do
  begin
      F′ = F′ ∪ Fi
  end
  compute F′+;
  if (F′+ = F+) then return (true)
  else return (false);
• The procedure for checking dependency preservation takes exponential time to compute F+ and (F1 ∪ F2 ∪ · · · ∪ Fn)+

Dependency Preservation (3): Example

• R(A, B, C, D, E, F)
F = {A → BCD, A → EF, BC → AD, BC → E, BC → F, B → F, D → E}
• Decomposition: R1(A, B, C, D), R2(B, F), R3(D, E)
◦ A → BCD, BC → AD are preserved on table R1
◦ B → F is preserved on table R2
◦ D → E is preserved on table R3
◦ We have to check whether the remaining FDs: A → E, A → F, BC → E, BC → F are preserved or not.
◦ Restrictions to each table:
  R1: F1 = {A → ABCD, B → B, C → C, D → D, AB → ABCD, BC → ABCD, CD → CD, AD → ABCD, ABC → ABCD, ABD → ABCD, ACD → ABCD, BCD → ABCD}
  R2: F2 = {B → BF, F → F}
  R3: F3 = {D → DE, E → E}
◦ F′ = F1 ∪ F2 ∪ F3.
◦ Checking for: A → E, A → F in F′+
  . A → D (from R1), D → E (from R3): A → E (By Transitivity)
  . A → B (from R1), B → F (from R2): A → F (By Transitivity)
◦ Checking for: BC → E, BC → F in F′+
  . BC → D (from R1), D → E (from R3): BC → E (By Transitivity)
  . B → F (from R2): BC → F (By Augmentation)
• Hence all dependencies are preserved.


Dependency Preservation (4): Example

• R(A, B, C, D)
F = {A → B, B → C, C → D, D → A}
• Decomposition: R1(A, B), R2(B, C), R3(C, D)
◦ A → B is preserved on table R1
◦ B → C is preserved on table R2
◦ C → D is preserved on table R3
◦ We have to check whether the one remaining FD: D → A is preserved or not.
◦ Restrictions to each table:
  R1: F1 = {A → AB, B → BA}
  R2: F2 = {B → BC, C → CB}
  R3: F3 = {C → CD, D → DC}
◦ F′ = F1 ∪ F2 ∪ F3.
◦ Checking for: D → A in F′+
  . D → C (from R3), C → B (from R2), B → A (from R1): D → A (By Transitivity)
• Hence all dependencies are preserved.


Dependency Preservation (5): Testing

• To check if a dependency α → β is preserved in a decomposition of R into R1, R2, · · · , Rn we apply the following test (with attribute closure done with respect to F)
  result = α
  while (changes to result) do
      for each Ri in the decomposition
          t = (result ∩ Ri)+ ∩ Ri
          result = result ∪ t
◦ If result contains all attributes in β, then the functional dependency α → β is preserved.
• We apply the test on all dependencies in F to check if a decomposition is dependency preserving
• This procedure takes polynomial time, instead of the exponential time required to compute F+ and (F1 ∪ F2 ∪ · · · ∪ Fn)+
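The polynomial-time test can be sketched as below. The names `is_preserved` and `preserves_all` are assumptions made for this sketch; attribute sets are strings with one character per attribute, and closures are always taken with respect to the full F.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_preserved(lhs, rhs, decomposition, fds):
    """Grow `result` from lhs using only what each Ri can see, then look for rhs."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            ri = set(ri)
            t = closure(result & ri, fds) & ri  # t = (result ∩ Ri)+ ∩ Ri
            if not t <= result:
                result |= t
                changed = True
    return set(rhs) <= result


def preserves_all(decomposition, fds):
    """A decomposition is dependency preserving if every FD in F passes the test."""
    return all(is_preserved(lhs, rhs, decomposition, fds) for lhs, rhs in fds)


F = [("A", "B"), ("B", "C")]
print(preserves_all(["AB", "BC"], F))  # True
print(preserves_all(["AB", "AC"], F))  # False: B -> C cannot be checked locally

G = [("CS", "Z"), ("Z", "C")]
print(preserves_all(["ZC", "SZ"], G))  # False: CS -> Z is lost
```
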


Dependency Preservation (6): Example PPD

• R(ABCDEF): F = {A → BCD, A → EF, BC → AD, BC → E, BC → F, B → F, D → E}
• Decomp = {ABCD, BF, DE}
• On projections: [table of the projected dependencies F1, F2, F3 not recovered]
• Need to check for: A → EF, BC → E, BC → F (A → BCD, BC → AD, B → F and D → E are struck out on the slide, since each holds within a single projection)
• (BC)+/F1 = ABCD. (ABCD)+/F2 = ABCDF. (ABCDF)+/F3 = ABCDEF. Preserves BC → E, BC → F
  BC → AD (R1), D → E (R3) implies BC → E
  B → F (R2) implies BC → F
• (A)+/F1 = ABCD. (ABCD)+/F2 = ABCDF. (ABCDF)+/F3 = ABCDEF. Preserves A → EF
  A → B (R1), B → F (R2) implies A → F
  A → D (R1), D → E (R3) implies A → E


Dependency Preservation (7): Example PPD

• R(ABCDEF): F = {A → BCD, A → EF, BC → AD, BC → E, BC → F, B → F, D → E}. Decomp = {ABCD, BF, DE}
• On projections: [table of the projected dependencies not recovered]
• Infer reverse FD’s:
◦ B+/F = BF: B → A cannot be inferred
◦ C+/F = C: C → A cannot be inferred
◦ D+/F = DE: D → A and D → BC cannot be inferred
◦ A+/F = ABCDEF: A → BC can be inferred, but it is equal to A → B and A → C
◦ F+/F = F: F → B cannot be inferred
◦ E+/F = E: E → D cannot be inferred
• Need to check for: A → EF, BC → E, BC → F (A → BCD, BC → AD, B → F and D → E are struck out on the slide, since each holds within a single projection)
◦ (BC)+/F = ABCDEF. Preserves BC → E, BC → F
◦ (A)+/F = ABCDEF. Preserves A → EF


Practice Problems on Dependency Preservation PPD

• Check whether the decomposition of R into D is preserving dependency:
  a) R(ABCD): F = {A → B, B → C, C → D, D → A}. D = {AB, BC, CD}
  b) R(ABCDEF): F = {AB → CD, C → D, D → E, E → F}. D = {AB, CDE, EF}
  c) R(ABCDEG): F = {AB → C, AC → B, BC → A, AD → E, B → D, E → G}. D = {ABC, ACDE, ADG}
  d) R(ABCD): F = {A → B, B → C, C → D, D → B}. D = {AB, BC, BD}
  e) R(ABCDE): F = {A → BC, CD → E, B → D, E → A}. D = {ABCE, BD}


Module Summary

• Understood the Characterization for and Determination of Lossless Join
• Understood the Characterization for and Determination of Dependency Preservation

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.


Database Management Systems
Module 26: Relational Database Design/6: Normal Forms

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur


Week Recap PPD

• Identified the features of good relational design
• Familiarized with the First Normal Form
• Introduced the notion and the theory of functional dependencies
• Discussed issues in “good” design in the context of functional dependencies
• Studied Algorithms for Properties of Functional Dependencies
• Understood the Characterization for and Determination of Lossless Join and Determination of Dependency Preservation


Module Objectives PPD

• To Understand the Normal Forms and their Importance in Relational Design

Database Management Systems Partha Pratim Das 26.3


Module Outline PPD

• Normal Forms

Database Management Systems Partha Pratim Das 26.4


Normal Forms PPD


Database Management Systems Partha Pratim Das 26.5


Normalization or Schema Refinement PPD

• Normalization or Schema Refinement is a technique of organizing the data in the database
• A systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics
◦ Insertion Anomaly
◦ Update Anomaly
◦ Deletion Anomaly
• The most common technique for Schema Refinement is decomposition
◦ Goal of Normalization: Eliminate Redundancy
• Redundancy refers to repetition of the same data, or duplicate copies of the same data stored in different locations
• Normalization is used mainly for two purposes:
◦ Eliminating redundant (useless) data
◦ Ensuring data dependencies make sense, that is, data is logically stored
Database Management Systems Partha Pratim Das 26.6
Anomalies PPD

a) Update Anomaly: Employee 519 is shown as having different addresses on different records
b) Insertion Anomaly: Until the new faculty member, Dr. Newsome, is assigned to teach at least one course, his details cannot be recorded
c) Deletion Anomaly: All information about Dr. Giddens is lost if he temporarily ceases to be assigned to any courses

Resolution: Decompose the Schema
a) Update: (ID, Address), (ID, Skill)
b) Insert: (ID, Name, Hire Date), (ID, Code)
c) Delete: (ID, Name, Hire Date), (ID, Code)

Database Management Systems Partha Pratim Das 26.7


Desirable Properties of Decomposition PPD

• Lossless Join Decomposition Property
◦ It should be possible to reconstruct the original table
• Dependency Preserving Property
◦ No functional dependency (or other constraint) should get violated
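For a decomposition into exactly two relations, the lossless join property can be tested with attribute closure: the split is lossless if the common attributes R1 ∩ R2 functionally determine all of R1 or all of R2. The sketch below assumes single-letter attributes and uses hypothetical helper names; it is an illustration of the test, not a general n-way algorithm.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 must hold."""
    r1, r2 = set(r1), set(r2)
    reachable = closure(r1 & r2, fds)
    return r1 <= reachable or r2 <= reachable

# R = (A, B, C) with A -> B and B -> C
fds = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(is_lossless_binary("AB", "BC", fds))  # common attribute B determines BC
print(is_lossless_binary("AC", "BC", fds))  # common attribute C determines neither side
```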

Database Management Systems Partha Pratim Das 26.8


Normalization and Normal Forms PPD

• A normal form specifies a set of conditions that the relational schema must satisfy in terms of its constraints; they offer varied levels of guarantee for the design
• Normalization rules are divided into various normal forms. Most common normal forms are:
◦ First Normal Form (1NF)
◦ Second Normal Form (2NF)
◦ Third Normal Form (3NF)
• Informally, a relational database relation is often described as "normalized" if it meets third normal form. Most 3NF relations are free of insertion, update, and deletion anomalies

Database Management Systems Partha Pratim Das 26.9


Normalization and Normal Forms PPD

• Additional Normal Forms
◦ Elementary Key Normal Form (EKNF)
◦ Boyce-Codd Normal Form (BCNF)
◦ Multivalued Dependencies and Fourth Normal Form (4NF)
◦ Essential Tuple Normal Form (ETNF)
◦ Join Dependencies and Fifth Normal Form (5NF)
◦ Sixth Normal Form (6NF)
◦ Domain/Key Normal Form (DKNF)

Database Management Systems Partha Pratim Das 26.10


1NF: First Normal Form PPD

• A relation is in First Normal Form if and only if all underlying domains contain atomic values only (that is, it has no multivalued attributes (MVA))
• STUDENT(Sid, Sname, Cname)

Source: http://www.edugrabs.com/normal-forms/#fnf
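As a small illustration of removing a multivalued attribute, the sketch below flattens a STUDENT relation whose Cname field holds several course names into one row per (student, course). The rows are made-up sample data and the function name `to_1nf` is hypothetical; neither comes from the slide.

```python
# Hypothetical sample rows: Cname packs several course names into one field,
# so the relation is not in 1NF.
unnormalized = [
    {"Sid": 1, "Sname": "Asha", "Cname": "C, C++"},
    {"Sid": 2, "Sname": "Ravi", "Cname": "Java"},
]

def to_1nf(rows, multivalued, sep=","):
    """Emit one row per atomic value of the multivalued attribute."""
    flat = []
    for row in rows:
        for value in row[multivalued].split(sep):
            new_row = dict(row)
            new_row[multivalued] = value.strip()
            flat.append(new_row)
    return flat

student_1nf = to_1nf(unnormalized, "Cname")
for row in student_1nf:
    print(row)
```

Note that Sname is now repeated for every course a student takes, which is the kind of redundancy the following slides attribute to 1NF.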


Database Management Systems Partha Pratim Das 26.11
1NF (2): Possible Redundancy PPD

• Example: Supplier(SID, Status, City, PID, Qty)

Drawbacks:
• Deletion Anomaly: If we delete <S3,40,Rohtak,P1,245>, then we lose the information that S3 lives in Rohtak.
• Insertion Anomaly: We cannot insert a Supplier S5 located in Karnal, until S5 supplies at least one part.
• Update Anomaly: If Supplier S1 moves from Delhi to Kanpur, then every tuple having SID as S1 and City as Delhi must be updated, which is difficult to do consistently.

Normalization is a method to reduce redundancy. However, sometimes 1NF increases redundancy.


Database Management Systems Partha Pratim Das 26.12
1NF (3): Possible Redundancy PPD

• When LHS is not a Superkey:
◦ Let X → Y be a non-trivial FD over R where X is not a superkey of R; then redundancy exists between the X and Y attribute sets.
◦ Hence, in order to identify the redundancy, we need not look at the actual data; it can be identified from the given functional dependency.
◦ Example: X → Y and X is not a Candidate Key
⇒ X can duplicate
⇒ Corresponding Y value would duplicate also.
• When LHS is a Superkey:
◦ If X → Y is a non-trivial FD over R where X is a superkey of R, then redundancy does not exist between the X and Y attribute sets.
◦ Example: X → Y and X is a Candidate Key
⇒ X cannot duplicate
⇒ Corresponding Y value may or may not duplicate.

Database Management Systems Partha Pratim Das 26.13


2NF: Second Normal Form PPD

• Relation R is in Second Normal Form (2NF) iff:
◦ R is in 1NF and
◦ R contains no Partial Dependency

Partial Dependency:
Let R be a relational schema and X, Y, A be attribute sets over R where X: any Candidate Key, Y: a Proper Subset of a Candidate Key, and A: a Non Prime Attribute

If Y → A exists in R, then R is not in 2NF.

(Y → A) is a Partial Dependency only if
• Y: Proper subset of Candidate Key
• A: Non Prime Attribute

A prime attribute of a relation is an attribute that is a part of a candidate key of the relation
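The definition above can be turned into a brute-force check: enumerate the candidate keys, collect the prime attributes, and look for a proper subset Y of some candidate key that determines a non-prime attribute. The sketch below is exponential in the number of attributes and uses hypothetical function names; the STUDENT schema and the FD SID → Sname are taken from the 2NF example in this module.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema (brute force)."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) >= schema:
                keys.append(attrs)
    return keys

def partial_dependencies(schema, fds):
    """Pairs (Y, A): Y a proper subset of a candidate key, A the non-prime attributes it determines."""
    keys = candidate_keys(schema, fds)
    prime = set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for combo in combinations(sorted(key), size):
                y = set(combo)
                dependents = (closure(y, fds) & set(schema)) - y - prime
                if dependents and (y, dependents) not in found:
                    found.append((y, dependents))
    return found

schema = {"SID", "Sname", "Cname"}
fds = [(frozenset({"SID"}), frozenset({"Sname"}))]
print(candidate_keys(schema, fds))
print(partial_dependencies(schema, fds))
```

An empty result from `partial_dependencies` means a relation that is already in 1NF is also in 2NF with respect to the given FDs.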

Database Management Systems Partha Pratim Das 26.14


2NF (2) PPD

• STUDENT(Sid, Sname, Cname) (already in 1NF)
[Figure: STUDENT table and its decomposition into R1 and R2, labelled "Key" and "Normalization" on the slide]
• Redundancy?
◦ Sname
• Anomaly?
◦ Yes

Functional Dependencies:
{SID, Cname} → Sname
SID → Sname

Partial Dependencies:
SID → Sname (as SID is a Proper Subset of Candidate Key {SID, Cname})

The above two relations R1 and R2 are
1. Lossless Join
2. 2NF
3. Dependency Preserving

Source: http://www.edugrabs.com/2nf-second-normal-form/

Database Management Systems Partha Pratim Das 26.15


2NF (3): Possible Redundancy PPD

• Supplier(SID, Status, City, PID, Qty)

Partial Dependencies:
SID → Status
SID → City

Drawbacks (post normalization):
• Deletion Anomaly: If we delete a tuple in Sup_City, then we not only lose the information about a supplier, but also lose the status value of a particular city.
• Insertion Anomaly: We cannot insert a City and its status until a supplier supplies at least one part.
• Update Anomaly: If the status value for a city is changed, then we will face the problem of searching every tuple for that city.

Source: http://www.edugrabs.com/2nf-second-normal-form/
Database Management Systems Partha Pratim Das 26.16
3NF: Third Normal Form PPD

Let R be the relational schema.
• [E. F. Codd, 1971] R is in 3NF only if:
◦ R is in 2NF
◦ R does not contain transitive dependencies (or: every non-prime attribute of R is non-transitively dependent on every key of R)
• [Carlo Zaniolo, 1982] Alternately, R is in 3NF iff for each of its functional dependencies X → A, at least one of the following conditions holds:
◦ X contains A (that is, A is a subset of X, meaning X → A is a trivial functional dependency), or
◦ X is a superkey, or
◦ Every element of A − X, the set difference between A and X, is a prime attribute (i.e., each attribute in A − X is contained in some candidate key)
• [Simple Statement] A relational schema R is in 3NF if for every FD X → A associated with R either
◦ A ⊆ X (that is, the FD is trivial) or
◦ X is a superkey of R or
◦ A is part of some candidate key (not just superkey!)
• A relation in 3NF is naturally in 2NF
Database Management Systems Partha Pratim Das 26.17
3NF (2): Transitive Dependency

• A transitive dependency is a functional dependency which holds by virtue of transitivity. A transitive dependency can occur only in a relation that has three or more attributes.
• Let A, B, and C designate three distinct attributes (or distinct collections of attributes) in the relation. Suppose all three of the following conditions hold:
◦ A → B
◦ It is not the case that B → A
◦ B → C
• Then the functional dependency A → C (which follows from 1 and 3 by the axiom of transitivity) is a transitive dependency

Database Management Systems Partha Pratim Das 26.18


3NF (3): Transitive Dependency

• Example of transitive dependency
• The functional dependency {Book} → {Author Nationality} applies; that is, if we know the book, we know the author’s nationality. Furthermore:
◦ {Book} → {Author}
◦ It is not the case that {Author} → {Book}
◦ {Author} → {Author Nationality}
• Therefore {Book} → {Author Nationality} is a transitive dependency.
• Transitive dependency occurred because a non-key attribute (Author) was determining another non-key attribute (Author Nationality).
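The three conditions for a transitive dependency can be checked directly with attribute closure. The sketch below encodes the Book example; the function name `is_transitive` and the FD encoding as pairs of frozensets are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_transitive(a, b, c, fds):
    """True if A -> C is transitive via B: A -> B, not B -> A, and B -> C."""
    a, b, c = set(a), set(b), set(c)
    return (b <= closure(a, fds)            # A -> B
            and not a <= closure(b, fds)    # it is not the case that B -> A
            and c <= closure(b, fds))       # B -> C

fds = [
    (frozenset({"Book"}), frozenset({"Author"})),
    (frozenset({"Author"}), frozenset({"Author Nationality"})),
]
print(is_transitive({"Book"}, {"Author"}, {"Author Nationality"}, fds))
print(is_transitive({"Author"}, {"Book"}, {"Author Nationality"}, fds))
```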

Database Management Systems Partha Pratim Das 26.19


3NF (4): Example PPD

• Example: Sup_City(SID, Status, City) (already in 2NF)
• Redundancy?
◦ Status
• Anomaly?
◦ Yes

Functional Dependencies:
SID → Status,
SID → City,
City → Status

Transitive Dependency:
SID → Status {As SID → City and City → Status}

Post Normalization
[Figure: Sup_City and its decomposition into SC and CS]
The above two relations SC and CS are
• Lossless Join
• 3NF
• Dependency Preserving

Database Management Systems Partha Pratim Das 26.20


3NF (5): Example

• Relation dept_advisor(s_ID, i_ID, dept_name)
• F = {s_ID, dept_name → i_ID; i_ID → dept_name}
• Two candidate keys: {s_ID, dept_name} and {i_ID, s_ID}
• R is in 3NF
◦ s_ID, dept_name → i_ID
. {s_ID, dept_name} is a superkey
◦ i_ID → dept_name
. dept_name is contained in a candidate key

A relational schema R is in 3NF if for every FD X → A associated with R either
• A ⊆ X (i.e., the FD is trivial) or
• X is a superkey of R or
• A is part of some key (not just superkey!)
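The three-way test above can be sketched in code as follows. The sketch checks only the FDs in the given set F, finds candidate keys by brute force (exponential in the number of attributes), and reports the FDs that violate 3NF. The function names are hypothetical; the two schemas and their FDs come from the examples in this module.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema (brute force)."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) >= schema:
                keys.append(attrs)
    return keys

def third_nf_violations(schema, fds):
    """FDs X -> A in fds that are non-trivial, have a non-superkey X, and a non-prime attribute in A - X."""
    schema = set(schema)
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        if rhs <= lhs:                       # trivial FD
            continue
        if closure(lhs, fds) >= schema:      # X is a superkey
            continue
        if (rhs - lhs) <= prime:             # every attribute of A - X is prime
            continue
        bad.append((set(lhs), set(rhs)))
    return bad

dept_advisor = {"s_ID", "i_ID", "dept_name"}
f_advisor = [
    (frozenset({"s_ID", "dept_name"}), frozenset({"i_ID"})),
    (frozenset({"i_ID"}), frozenset({"dept_name"})),
]
print(third_nf_violations(dept_advisor, f_advisor))   # no violations: in 3NF

sup_city = {"SID", "Status", "City"}
f_sup = [
    (frozenset({"SID"}), frozenset({"Status"})),
    (frozenset({"SID"}), frozenset({"City"})),
    (frozenset({"City"}), frozenset({"Status"})),
]
print(third_nf_violations(sup_city, f_sup))           # City -> Status violates 3NF
```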

Database Management Systems Partha Pratim Das 26.21


3NF (6): Redundancy

• There is some redundancy in this schema
• Example of problems due to redundancy in 3NF (J: s_ID, L: i_ID, K: dept_name)
◦ R = (J, L, K). F = {JK → L, L → K}
• Repetition of information (for example, the relationship l1, k1)
◦ (i_ID, dept_name)
• Need to use null values (for example, to represent the relationship l2, k2 where there is no corresponding value for J).
◦ (i_ID, dept_name) if there is no separate relation mapping instructors to departments
Database Management Systems Partha Pratim Das 26.22
Module Summary

• Studied the Normal Forms and their Importance in Relational Design: how a progressive increase of constraints can minimize redundancy in a schema

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.

Database Management Systems Partha Pratim Das 26.23


Database Management Systems
Module 27: Relational Database Design/7: Normal Forms

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
[email protected]

Database Management Systems Partha Pratim Das 27.1


Module Recap PPD

• Studied the Normal Forms and their Importance in Relational Design: how a progressive increase of constraints can minimize redundancy in a schema

Database Management Systems Partha Pratim Das 27.2


Module Objectives PPD

• To Learn the Decomposition Algorithm for a Relation to 3NF
• To Learn the Decomposition Algorithm for a Relation to BCNF

Database Management Systems Partha Pratim Das 27.3


Module Outline PPD

• Decomposition to 3NF
• Decomposition to BCNF

Database Management Systems Partha Pratim Das 27.4


Decomposition to 3NF PPD


Database Management Systems Partha Pratim Das 27.5


3NF Decomposition: Motivation

• There are some situations where
◦ BCNF is not dependency preserving, and
◦ Efficient checking for FD violation on updates is important
• Solution: define a weaker normal form, called Third Normal Form (3NF)
◦ Allows some redundancy (with resultant problems; as seen above)
◦ But functional dependencies can be checked on individual relations without computing a join
◦ There is always a lossless-join, dependency-preserving decomposition into 3NF

Database Management Systems Partha Pratim Das 27.6


3NF Decomposition (2): 3NF Definition PPD

• A relational schema R is in 3NF if for every FD X → A associated with R either
◦ A ⊆ X (that is, the FD is trivial) or
◦ X is a superkey of R or
◦ A is part of some candidate key (not just superkey!)
• A relation in 3NF is naturally in 2NF

Database Management Systems Partha Pratim Das 27.7


3NF Decomposition (3): Testing for 3NF

• Optimization: Need to check only FDs in F, need not check all FDs in F+.
• Use attribute closure to check, for each dependency α → β, if α is a superkey.
• If α is not a superkey, we have to verify if each attribute in β is contained in a candidate key of R
◦ This test is rather more expensive, since it involves finding candidate keys
◦ Testing for 3NF has been shown to be NP-hard
◦ Decomposition into 3NF can be done in polynomial time

Database Management Systems Partha Pratim Das 27.8


3NF Decomposition (4): Algorithm PPD

• Given: relation R, set F of functional dependencies
• Find: decomposition of R into a set of 3NF relations Ri
• Algorithm:
a) Eliminate redundant FDs, resulting in a canonical cover Fc of F
b) Create a relation Ri = XY for each FD X → Y in Fc
c) If the key K of R does not occur in any relation Ri, create one more relation Ri = K

Database Management Systems Partha Pratim Das 27.9


3NF Decomposition (5): Algorithm

Let Fc be a canonical cover for F;
i := 0;
for each functional dependency α → β in Fc do
    if none of the schemas Rj, 1 ≤ j ≤ i contains αβ
        then begin
            i := i + 1;
            Ri := αβ
        end
if none of the schemas Rj, 1 ≤ j ≤ i contains a candidate key for R
    then begin
        i := i + 1;
        Ri := any candidate key for R;
    end
/* Optionally, remove redundant relations */
repeat
    if any schema Rj is contained in another schema Rk
        then /* delete Rj */
            Rj := Ri;
            i := i − 1;
until no more Rj can be deleted
return (R1, R2, ..., Ri)
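The pseudocode above can be sketched in Python as follows. The sketch assumes the canonical cover Fc has already been computed and is passed in; it does not compute Fc itself. The function name `synthesize_3nf` is hypothetical, and the input is the canonical cover of the cust_banker_branch example from this module.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def synthesize_3nf(schema, canonical_cover):
    """3NF synthesis; canonical_cover is assumed to be a canonical cover of F."""
    schema = set(schema)
    relations = []
    # One schema αβ per FD α -> β, unless an earlier schema already contains it.
    for lhs, rhs in canonical_cover:
        candidate = set(lhs) | set(rhs)
        if not any(candidate <= r for r in relations):
            relations.append(candidate)
    # A schema contains a candidate key exactly when it is a superkey of R.
    if not any(closure(r, canonical_cover) >= schema for r in relations):
        key = set(schema)
        for attr in sorted(schema):
            # Shrink the full attribute set down to one candidate key.
            if closure(key - {attr}, canonical_cover) >= schema:
                key.discard(attr)
        relations.append(key)
    # Optionally remove schemas that are proper subsets of other schemas.
    return [r for r in relations if not any(r < other for other in relations)]

schema = {"customer_id", "employee_id", "branch_name", "type"}
fc = [
    (frozenset({"customer_id", "employee_id"}), frozenset({"type"})),
    (frozenset({"employee_id"}), frozenset({"branch_name"})),
    (frozenset({"customer_id", "branch_name"}), frozenset({"employee_id"})),
]
for relation in synthesize_3nf(schema, fc):
    print(sorted(relation))
```

The subset filter at the end plays the role of the optional repeat loop in the pseudocode.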
Database Management Systems Partha Pratim Das 27.10
3NF Decomposition (6): Algorithm

• Upon decomposition:
◦ Each relation schema Ri is in 3NF
◦ Decomposition is
. Dependency Preserving
. Lossless Join
• Prove these properties

Database Management Systems Partha Pratim Das 27.11


3NF Decomposition (7): Example

• Relation schema:
cust_banker_branch = (customer_id, employee_id, branch_name, type)
• The functional dependencies for this relation schema are:
a) customer_id, employee_id → branch_name, type
b) employee_id → branch_name
c) customer_id, branch_name → employee_id
• We first compute a canonical cover
◦ branch_name is extraneous in the RHS of the 1st dependency
◦ No other attribute is extraneous, so we get Fc =
customer_id, employee_id → type
employee_id → branch_name
customer_id, branch_name → employee_id
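The extraneous-attribute claim above can be verified with attribute closure: an attribute A on the right-hand side of α → β is extraneous if A is still in α+ after A has been removed from that dependency. The sketch below is a minimal check for right-hand-side attributes only; the function name `extraneous_in_rhs` is a hypothetical helper.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def extraneous_in_rhs(attr, fd_index, fds):
    """True if attr can be dropped from the RHS of fds[fd_index] without changing F+."""
    lhs, rhs = fds[fd_index]
    reduced = list(fds)
    reduced[fd_index] = (lhs, rhs - {attr})   # the same FD with attr removed from its RHS
    return attr in closure(lhs, reduced)

F = [
    (frozenset({"customer_id", "employee_id"}), frozenset({"branch_name", "type"})),
    (frozenset({"employee_id"}), frozenset({"branch_name"})),
    (frozenset({"customer_id", "branch_name"}), frozenset({"employee_id"})),
]
print(extraneous_in_rhs("branch_name", 0, F))  # still derivable via employee_id -> branch_name
print(extraneous_in_rhs("type", 0, F))         # no other FD yields type
```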

Database Management Systems Partha Pratim Das 27.12


3NF Decomposition (8): Example

• The for loop generates the following 3NF schema:
(customer_id, employee_id, type)
(employee_id, branch_name)
(customer_id, branch_name, employee_id)
◦ Observe that (customer_id, employee_id, type) contains a candidate key of the original schema, so no further relation schema needs to be added
• At the end of the for loop, detect and delete schemas, such as (employee_id, branch_name), which are subsets of other schemas
◦ The result will not depend on the order in which FDs are considered
• The resultant simplified 3NF schema is:
(customer_id, employee_id, type)
(customer_id, branch_name, employee_id)

Database Management Systems Partha Pratim Das 27.13


Practice Problem for 3NF Decomposition (1) PPD

• R = ABCDEFGH
• FDs = {A → B, ABCD → E, EF → GH, ACDF → EG}

Solution is given in the next slide (hidden from presentation; check after you have solved)

Database Management Systems Partha Pratim Das 27.14


Practice Problem for 3NF Decomposition (2) PPD

• R = CSJDPQV
• FDs = {C → CSJDPQV, SD → P, JP → C, J → S}

Solution is given in the next slide (hidden from presentation; check after you have solved)

Database Management Systems Partha Pratim Das 27.15


Practice Problem for 3NF Decomposition (3) PPD

Decompose the following schema to 3NF in the following steps
• Compute all keys for R
• Compute a Canonical Cover Fc for F. Put the FDs into alphabetical order.
• Using Fc, employ the 3NF decomposition algorithm to obtain a lossless and dependency preserving decomposition of relation R into a collection of relations that are in 3NF
• Does your schema allow redundancy?

• R(ABCDEFGH):
F = {A → CD, ACF → G, AD → BEF, BCG → D, CF → AH, CH → G, D → B, H → DEG}
• R(ABCDE):
F = {A → B, A → C, C → D, A → E}
• R(ABCDE):
F = {A → BC, CD → E, B → D, E → A}
• R(ABCD):
F = {A → D, AB → C, AD → C, B → C, D → AB}

Database Management Systems Partha Pratim Das 27.16


Decomposition to BCNF PPD


Database Management Systems Partha Pratim Das 27.17


BCNF Decomposition: BCNF Definition

• A relation schema R is in BCNF with respect to a set F of FDs if for all FDs in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:
◦ α → β is trivial (that is, β ⊆ α)
◦ α is a superkey for R

Database Management Systems Partha Pratim Das 27.18


BCNF Decomposition (2): Testing for BCNF

• To check if a non-trivial dependency α → β causes a violation of BCNF
a) Compute α+ (the attribute closure of α), and
b) Verify that it includes all attributes of R, that is, α is a superkey of R.
• Simplified test: To check if a relation schema R is in BCNF, it suffices to check only the dependencies in the given set F for violation of BCNF, rather than checking all dependencies in F+.
◦ If none of the dependencies in F causes a violation of BCNF, then none of the dependencies in F+ will cause a violation of BCNF either.
• However, the simplified test using only F is incorrect when testing a relation in a decomposition of R
◦ Consider R = (A, B, C, D, E), with F = {A → B, BC → D}
. Decompose R into R1 = (A, B) and R2 = (A, C, D, E)
. Neither of the dependencies in F contains only attributes from (A, C, D, E), so we might be misled into thinking R2 satisfies BCNF.
. In fact, dependency AC → D in F+ shows R2 is not in BCNF.
Database Management Systems Partha Pratim Das 27.19
BCNF Decomposition (3): Testing for BCNF Decomposition

• To check if a relation Ri in a decomposition of R is in BCNF,
◦ Either test Ri for BCNF with respect to the restriction of F to Ri (that is, all FDs in F+ that contain only attributes from Ri)
◦ Or use the original set of dependencies F that hold on R, but with the following test:
. for every set of attributes α ⊆ Ri, check that α+ (the attribute closure of α) either includes no attribute of Ri − α, or includes all attributes of Ri.
. If the condition is violated by some α → β in F, the dependency
α → (α+ − α) ∩ Ri
can be shown to hold on Ri, and Ri violates BCNF.
. We use the above dependency to decompose Ri
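The second test can be sketched as a search over all attribute subsets α of Ri, using the original F to compute α+. The sketch returns the first witness dependency α → (α+ − α) ∩ Ri it finds, or None if Ri is in BCNF. It is exponential in the size of Ri, assumes single-letter attributes, and the function name `bcnf_violation` is hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violation(ri, fds):
    """Return (α, (α+ − α) ∩ Ri) for some α that violates BCNF on Ri, else None."""
    ri = set(ri)
    for size in range(1, len(ri)):
        for combo in combinations(sorted(ri), size):
            alpha = set(combo)
            gained = (closure(alpha, fds) & ri) - alpha
            # Allowed outcomes: α+ adds nothing from Ri − α, or it adds all of Ri − α.
            if gained and gained != ri - alpha:
                return alpha, gained
    return None

# R = (A, B, C, D, E), F = {A -> B, BC -> D}, decomposed into R1 = (A, B), R2 = (A, C, D, E)
F = [(frozenset("A"), frozenset("B")), (frozenset("BC"), frozenset("D"))]
print(bcnf_violation("AB", F))     # R1 is in BCNF
print(bcnf_violation("ACDE", F))   # finds AC -> D, so R2 is not in BCNF
```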

Database Management Systems Partha Pratim Das 27.20


BCNF Decomposition (4): Testing Dependency Preservation:
Using Closure Set of FD (Exp. Algo.): Module 25
Consider the example given below. We will apply both algorithms to check dependency preservation and discuss the results.
• R(A, B, C, D)
F = {A → B, B → C, C → D, D → A}
• Decomposition: R1(A, B), R2(B, C), R3(C, D)
◦ A → B is preserved on table R1
◦ B → C is preserved on table R2
◦ C → D is preserved on table R3
◦ We have to check whether the one remaining FD, D → A, is preserved or not.

R1: F1 = {A → AB, B → BA}
R2: F2 = {B → BC, C → CB}
R3: F3 = {C → CD, D → DC}

◦ F′ = F1 ∪ F2 ∪ F3.
◦ Checking for: D → A in F′+
. D → C (from R3), C → B (from R2), B → A (from R1): D → A (by transitivity)
Hence all dependencies are preserved.
Database Management Systems Partha Pratim Das 27.21
BCNF Decomposition (4): Testing Dependency Preservation:
Using Closure of Attributes (Poly. Algo.): Module 25
• R(ABCD): F = {A → B, B → C, C → D, D → A}
• Decomp = {AB, BC, CD}
• On projections:
R1: F1 = {A → B}
R2: F2 = {B → C}
R3: F3 = {C → D}
In this algorithm, F1, F2, F3 are not the closure sets but the sets of dependencies directly applicable on R1, R2, R3 respectively.
• Need to check for: A → B, B → C, C → D, D → A
• (D)+ /F1 = D. (D)+ /F2 = D. (D)+ /F3 = D. So, D → A could not be shown to be preserved.
• In the previous method we saw that the dependency was preserved, and in reality it is preserved. Therefore this polynomial-time check may not work for all examples. To prove preservation, Algo 2 is sufficient but not necessary, whereas Algo 1 is both sufficient and necessary.

Note: This difference in result can occur in any example where a functional dependency of one decomposed table uses, in its closure, another functional dependency that is not applicable on any of the decomposed tables because no table contains all of its attributes.
Database Management Systems Partha Pratim Das 27.22
BCNF Decomposition (4): Algorithm PPD

a) For all dependencies A → B in F+, check if A is a superkey
• By using attribute closure
b) If not, then
• Choose a dependency in F+ that breaks the BCNF rules, say A → B
• Create R1 = AB
• Create R2 = (R − (B − A))
• Note that: R1 ∩ R2 = A and A → AB (= R1), so this is a lossless decomposition
c) Repeat for R1 and R2
• By defining F1+ to be all dependencies in F+ that contain only attributes in R1
• Similarly F2+

Database Management Systems Partha Pratim Das 27.23


BCNF Decomposition (5): Algorithm

result := {R};
done := false;
compute F+;
while (not done) do
    if (there is a schema Ri in result that is not in BCNF)
        then begin
            let α → β be a nontrivial functional dependency that
            holds on Ri such that α → Ri is not in F+,
            and α ∩ β = ∅;
            result := (result − Ri) ∪ (Ri − β) ∪ (α, β);
        end
    else done := true;

Note: each Ri is in BCNF, and decomposition is lossless-join.
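A minimal Python sketch of the loop above follows. Instead of materializing F+, it finds a violating dependency on each Ri by trying every attribute subset α and computing α+ under the original F, which is exponential in the schema size. Function names are hypothetical and attributes are assumed to be single letters.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violation(ri, fds):
    """Return (α, β) with β = (α+ − α) ∩ Ri for some α violating BCNF on Ri, else None."""
    ri = set(ri)
    for size in range(1, len(ri)):
        for combo in combinations(sorted(ri), size):
            alpha = set(combo)
            gained = (closure(alpha, fds) & ri) - alpha
            if gained and gained != ri - alpha:
                return alpha, gained
    return None

def decompose_bcnf(schema, fds):
    """Split schemas on a violating α -> β into (α, β) and (Ri − β) until all are in BCNF."""
    result, todo = [], [set(schema)]
    while todo:
        ri = todo.pop()
        violation = bcnf_violation(ri, fds)
        if violation is None:
            result.append(ri)
        else:
            alpha, beta = violation
            todo.append(alpha | beta)   # (α, β)
            todo.append(ri - beta)      # Ri − β
    return result

# R = (A, B, C), F = {A -> B, B -> C}
F = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(decompose_bcnf("ABC", F))

# R = (J, K, L), F = {JK -> L, L -> K}
G = [(frozenset("JK"), frozenset("L")), (frozenset("L"), frozenset("K"))]
print(decompose_bcnf("JKL", G))
```

Each split is lossless because α is common to both parts and determines (α, β). The result may depend on which violating dependency is found first.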


Database Management Systems Partha Pratim Das 27.24
BCNF Decomposition (6): Example

• R = (A, B, C)
F = {A → B, B → C}
Key = {A}
• R is not in BCNF (B → C but B is not a superkey)
• Decomposition
◦ R1 = (B, C)
◦ R2 = (A, B)
Database Management Systems Partha Pratim Das 27.25


BCNF Decomposition (7): Example

• class (course_id, title, dept_name, credits, sec_id, semester, year, building, room_number, capacity, time_slot_id)
• Functional dependencies:
◦ course_id → title, dept_name, credits
◦ building, room_number → capacity
◦ course_id, sec_id, semester, year → building, room_number, time_slot_id
• A candidate key: {course_id, sec_id, semester, year}.
• BCNF Decomposition:
◦ course_id → title, dept_name, credits holds
. but course_id is not a superkey
◦ We replace class by:
. course(course_id, title, dept_name, credits)
. class-1 (course_id, sec_id, semester, year, building, room_number, capacity, time_slot_id)
Database Management Systems Partha Pratim Das 27.26
BCNF Decomposition (8): Example

• course is in BCNF
◦ How do we know this?
• building, room_number → capacity holds on
class-1(course_id, sec_id, semester, year, building, room_number, capacity, time_slot_id)
◦ But {building, room_number} is not a superkey for class-1.
◦ We replace class-1 by:
. classroom (building, room_number, capacity)
. section (course_id, sec_id, semester, year, building, room_number, time_slot_id)
• classroom and section are in BCNF.

Database Management Systems Partha Pratim Das 27.27


BCNF Decomposition (8): Dependency Preservation

• It is not always possible to get a BCNF decomposition that is dependency preserving
• R = (J, K, L)
F = {JK → L, L → K}
Two candidate keys = JK and JL
• R is not in BCNF
• Any decomposition of R will fail to preserve
JK → L
This implies that testing for JK → L requires a join

Database Management Systems Partha Pratim Das 27.28


Practice Problem for BCNF Decomposition PPD

Decompose the following schema to BCNF
• R = ABCDE. F = {A → B, BC → D}
• R = ABCDEH. F = {A → BC, E → HA}
• R = CSJDPQV. F = {C → CSJDPQV, SD → P, JP → C, J → S}
• R = ABCD. F = {C → D, C → A, B → C}

Database Management Systems Partha Pratim Das 27.29


Comparison of BCNF and 3NF

Module 27
• It is always possible to decompose a relation into a set of relations that are in 3NF such that:
  ◦ the decomposition is lossless
  ◦ the dependencies are preserved
• It is always possible to decompose a relation into a set of relations that are in BCNF such that:
  ◦ the decomposition is lossless
  ◦ it may not be possible to preserve dependencies.

S# | 3NF | BCNF
1. | It concentrates on Primary Key | It concentrates on Candidate Key
2. | Redundancy is high as compared to BCNF | No redundancy arising from functional dependencies
3. | It preserves all the dependencies | It may not preserve the dependencies
4. | A dependency X → Y is allowed in 3NF if X is a super key or Y is a part of some key | A dependency X → Y is allowed if X is a super key
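Row 4 of the comparison can be illustrated on the schema R = (J, K, L) with F = {JK → L, L → K} from the dependency-preservation slide. The following is a rough sketch under stated assumptions: the helper names are hypothetical, only the given dependencies are tested (not the full closure F+), and a dependency counts as a 3NF violation only when its right-hand side contains an attribute that belongs to no candidate key.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """Enumerate minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if closure(s, fds) >= schema and not any(k <= s for k in keys):
                keys.append(s)
    return keys


def violations(schema, fds):
    """Return (BCNF violations, 3NF violations) among the given dependencies."""
    prime = set().union(*candidate_keys(schema, fds))
    bcnf, third = [], []
    for lhs, rhs in fds:
        if rhs <= lhs or closure(lhs, fds) >= schema:
            continue  # trivial dependency, or lhs is a superkey
        bcnf.append((lhs, rhs))
        if not (rhs - lhs) <= prime:
            third.append((lhs, rhs))  # some rhs attribute is in no candidate key
    return bcnf, third


R = {"J", "K", "L"}
F = [({"J", "K"}, {"L"}), ({"L"}, {"K"})]
bcnf_violations, third_nf_violations = violations(R, F)
```

L → K is reported as a BCNF violation but not as a 3NF violation, because K is part of the candidate key JK. The schema is therefore in 3NF but not in BCNF.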

Database Management Systems Partha Pratim Das 27.30


Module Summary

Module 27

• Learnt how to decompose a schema into 3NF while preserving dependency and lossless join
• Learnt how to decompose a schema into BCNF with lossless join

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.

Database Management Systems Partha Pratim Das 27.31


Module 28

Partha Pratim
Das

Objectives &
Outline Database Management Systems
Library
Information Module 28: Relational Database Design/8: Case Study
System
Specification
Entity Sets
Relationships
Relational Schema
Schema Refinement
Partha Pratim Das
Final Schema

Module Summary Department of Computer Science and Engineering


Indian Institute of Technology, Kharagpur

[email protected]

Database Management Systems Partha Pratim Das 28.1


Module Recap PPD

Module 28

Partha Pratim • Learnt how to decompose a schema into 3NF while preserving dependency and lossless
Das
join
Objectives &
Outline • Learnt how to decompose a schema into BCNF with lossless join
Library
Information
System
Specification
Entity Sets
Relationships
Relational Schema
Schema Refinement
Final Schema

Module Summary

Database Management Systems Partha Pratim Das 28.2


Module Objectives PPD

Module 28

• To design the schema for a Library Information System

Database Management Systems Partha Pratim Das 28.3


Module Outline PPD

Module 28

• Library Information System

Database Management Systems Partha Pratim Das 28.4


Library Information System (LIS) PPD

Module 28

We are asked to design a relational database schema for a Library Information System (LIS) of an Institute
• The specification document of the LIS has already been shared with you
• In this presentation, we include the key points from the Specs; but the actual document must be referred to
• We carry out the following tasks in the module:
  ◦ Identify the Entity Sets with attributes
  ◦ Identify the Relationships
  ◦ Build the initial set of relational schema
  ◦ Refine the set of schema with FDs that hold on them
  ◦ Finalize the design of the schema
• The coding of various queries in SQL, based on these schema, is left as an exercise

Database Management Systems Partha Pratim Das 28.5


LIS Specs Excerpts

Module 28
• An institute library has 200000+ books and 10000+ members
• Books are regularly issued by members on loan and returned after a period.
• The library needs an LIS to manage the books, the members and the issue-return process
• Every book has
  ◦ title
  ◦ author (in case of multiple authors, only the first author is maintained)
  ◦ publisher
  ◦ year of publication
  ◦ ISBN number (which is unique for the publication), and
  ◦ accession number (which is the unique number of the copy of the book in the library)
• There may be multiple copies of the same book in the library

Database Management Systems Partha Pratim Das 28.6


LIS Specs Excerpts (2)

• There are four categories of members of the library:
  ◦ undergraduate students
  ◦ post graduate students
  ◦ research scholars, and
  ◦ faculty members
• Every student has
  ◦ name
  ◦ roll number
  ◦ department
  ◦ gender
  ◦ mobile number
  ◦ date of birth, and
  ◦ degree
    . undergrad
    . grad
    . doctoral
Database Management Systems Partha Pratim Das 28.7
LIS Specs Excerpts (3)

Module 28
• Every faculty has
  ◦ name
  ◦ employee id
  ◦ department
  ◦ gender
  ◦ mobile number, and
  ◦ date of joining
• Library also issues a unique membership number to every member. Every member has a maximum quota for the number of books she / he can issue for the maximum duration allowed to her / him. Currently these are set as:
  ◦ Each undergraduate student can issue up to 2 books for 1 month duration
  ◦ Each postgraduate student can issue up to 4 books for 1 month duration
  ◦ Each research scholar can issue up to 6 books for 3 months duration
  ◦ Each faculty member can issue up to 10 books for six months duration

Database Management Systems Partha Pratim Das 28.8


LIS Specs Excerpts (4)

Module 28
• The library has the following rules for issue:
  ◦ A book may be issued to a member if it is not already issued to someone else (trivial)
  ◦ A book may not be issued to a member if another copy of the same book is already issued to the same member
  ◦ No issue will be done to a member if at the time of issue one or more of the books issued by the member has already exceeded its duration of issue
  ◦ No issue will be allowed also if the quota is exceeded for the member
  ◦ It is assumed that the name of every author or member has two parts
    . first name
    . last name

Database Management Systems Partha Pratim Das 28.9


LIS Specs Excerpts (5): Queries PPD

LIS should support the following operations / queries:
• Add / Remove members, categories of members, books.
• Add / Remove / Edit quota for a category of member, duration for a category of member.
• Check if the library has a book given its title (part of title should match). If yes: title, author, publisher, year and ISBN should be listed.
• Check if the library has a book given its author. If yes: title, author, publisher, year and ISBN should be listed.
• Check if a copy of a book (given its ISBN) is available with the library for issue. All accession numbers should be listed with issued or available information.
• Check the available (free) quota of a member.
• Issue a book to a member. This should check for the rules of the library.
• Return a book from a member.
• and so on
Database Management Systems Partha Pratim Das 28.10
LIS Entity Sets: books PPD

Module 28

• Every book has title, author (in case of multiple authors, only the first author is maintained), publisher, year of publication, ISBN number (which is unique for the publication), and accession number (which is the unique number of the copy of the book in the library). There may be multiple copies of the same book in the library
• Entity Set:
  ◦ books
• Attributes:
  ◦ title
  ◦ author name (composite)
  ◦ publisher
  ◦ year
  ◦ ISBN no
  ◦ accession no

Database Management Systems Partha Pratim Das 28.11


LIS Entity Sets (2): students PPD

Module 28

• Every student has name, roll number, department, gender, mobile number, date of birth, and degree (undergrad, grad, doctoral)
• Entity Set:
  ◦ students
• Attributes:
  ◦ member no – is unique
  ◦ name (composite)
  ◦ roll no – is unique
  ◦ department
  ◦ gender
  ◦ mobile no – may be null
  ◦ dob
  ◦ degree

Database Management Systems Partha Pratim Das 28.12


LIS Entity Sets (3): faculty PPD

Module 28

• Every faculty has name, employee id, department, gender, mobile number, and date of joining
• Entity Set:
  ◦ faculty
• Attributes:
  ◦ member no – is unique
  ◦ name (composite)
  ◦ id – is unique
  ◦ department
  ◦ gender
  ◦ mobile no – may be null
  ◦ doj

Database Management Systems Partha Pratim Das 28.13


LIS Entity Sets (4): members PPD

Module 28

• Library also issues a unique membership number to every member. There are four categories of members of the library: undergraduate students, post graduate students, research scholars, and faculty members
• Entity Set:
  ◦ members
• Attributes:
  ◦ member no
  ◦ member type (takes a value in ug, pg, rs or fc)

Database Management Systems Partha Pratim Das 28.14


LIS Entity Sets (5): quota PPD

Module 28

• Every member has a maximum quota for the number of books she / he can issue for the maximum duration allowed to her / him. Currently these are set as:
  ◦ Each undergraduate student can issue up to 2 books for 1 month duration
  ◦ Each postgraduate student can issue up to 4 books for 1 month duration
  ◦ Each research scholar can issue up to 6 books for 3 months duration
  ◦ Each faculty member can issue up to 10 books for six months duration
• Entity Set:
  ◦ quota
• Attributes:
  ◦ member type
  ◦ max books
  ◦ max duration

Database Management Systems Partha Pratim Das 28.15


LIS Entity Sets (6): staff PPD

Module 28

• Though not explicitly stated, the library would have staff to manage the LIS
• Entity Set:
  ◦ staff
• Attributes: (speculated – to be ratified by the customer)
  ◦ name (composite)
  ◦ id – is unique
  ◦ gender
  ◦ mobile no
  ◦ doj

Database Management Systems Partha Pratim Das 28.16


LIS Relationships PPD

Module 28

• Books are regularly issued by members on loan and returned after a period. The library needs an LIS to manage the books, the members and the issue-return process
• Relationship
  ◦ book issue
• Involved Entity Sets
  ◦ students / faculty / members
    . member no
  ◦ books
    . accession no
• Relationship Attribute
  ◦ doi – date of issue
• Type of relationship
  ◦ Many-to-one from books
Database Management Systems Partha Pratim Das 28.17
LIS Relational Schema PPD

Module 28

• books(title, author fname, author lname, publisher, year, ISBN no, accession no)
• book issue(member no, accession no, doi)
• members(member no, member type)
• quota(member type, max books, max duration)
• students(member no, student fname, student lname, roll no, department, gender, mobile no, dob, degree)
• faculty(member no, faculty fname, faculty lname, id, department, gender, mobile no, doj)
• staff(staff fname, staff lname, id, gender, mobile no, doj)

Database Management Systems Partha Pratim Das 28.18


LIS Schema Refinement: books PPD

Module 28

• books(title, author fname, author lname, publisher, year, ISBN no, accession no)
  ◦ ISBN no → title, author fname, author lname, publisher, year
  ◦ accession no → ISBN no
  ◦ Key: accession no
• Redundancy of book information across copies
• Good to normalize:
  ◦ book catalogue(title, author fname, author lname, publisher, year, ISBN no)
    . ISBN no → title, author fname, author lname, publisher, year
    . Key: ISBN no
  ◦ book copies(ISBN no, accession no)
    . accession no → ISBN no
    . Key: accession no
• Both in BCNF. Decomposition is lossless join and dependency preserving
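The lossless-join claim can be tried out on a toy instance with Python's built-in `sqlite3` module. This is only a sketch: the three rows are made-up placeholder data, and the column names use underscores where the slides show spaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE books (
    title TEXT, author_fname TEXT, author_lname TEXT, publisher TEXT,
    year INTEGER, isbn_no TEXT, accession_no INTEGER PRIMARY KEY)""")

# Placeholder data: two copies of one book and one copy of another.
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("Title A", "First", "AuthorA", "Publisher X", 2001, "ISBN-A", 1001),
        ("Title A", "First", "AuthorA", "Publisher X", 2001, "ISBN-A", 1002),
        ("Title B", "Second", "AuthorB", "Publisher Y", 2005, "ISBN-B", 1003),
    ],
)

# Decompose by projection, as on the slide.
conn.execute("""CREATE TABLE book_catalogue AS
    SELECT DISTINCT title, author_fname, author_lname, publisher, year, isbn_no
    FROM books""")
conn.execute("CREATE TABLE book_copies AS SELECT isbn_no, accession_no FROM books")

original = conn.execute("SELECT * FROM books ORDER BY accession_no").fetchall()

# Joining the two projections on isbn_no should give back the original rows.
rejoined = conn.execute("""
    SELECT c.title, c.author_fname, c.author_lname, c.publisher, c.year,
           c.isbn_no, p.accession_no
    FROM book_catalogue c JOIN book_copies p ON c.isbn_no = p.isbn_no
    ORDER BY p.accession_no""").fetchall()

catalogue_rows = conn.execute("SELECT COUNT(*) FROM book_catalogue").fetchone()[0]
```

The catalogue holds one row per ISBN rather than one per copy, which removes the repeated book details, and the join reproduces the original relation exactly.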

Database Management Systems Partha Pratim Das 28.19


LIS Schema Refinement (2): book issue PPD

Module 28

• book issue(member no, accession no, doi)
  ◦ member no, accession no → doi
  ◦ Key: member no, accession no
• In BCNF

Database Management Systems Partha Pratim Das 28.20


LIS Schema Refinement (3): quota

Module 28

• quota(member type, max books, max duration)
  ◦ member type → max books, max duration
  ◦ Key: member type
• In BCNF

Database Management Systems Partha Pratim Das 28.21


LIS Schema Refinement (4): members

Module 28

• members(member no, member type)
  ◦ member no → member type
  ◦ Key: member no
  ◦ Value constraint on member type
    . ug, pg, or rs: if the member is a student
    . fc: if the member is a faculty
  ◦ In BCNF
  ◦ How to determine the member type?

Database Management Systems Partha Pratim Das 28.22


LIS Schema Refinement (5): students

Module 28

• students(member no, student fname, student lname, roll no, department, gender, mobile no, dob, degree)
  ◦ roll no → student fname, student lname, department, gender, mobile no, dob, degree
  ◦ member no → roll no
  ◦ roll no → member no
  ◦ 2 Keys: roll no | member no
• In BCNF
• Issues:
  ◦ member no is needed for issue / return queries. It is unnecessary to have student’s details with that.
  ◦ member no may also come from faculty relation.
  ◦ member type is needed for issue / return queries. This is implicit in degree – not explicitly given.
Database Management Systems Partha Pratim Das 28.23
LIS Schema Refinement (6): faculty

Module 28

• faculty(member no, faculty fname, faculty lname, id, department, gender, mobile no, doj)
  ◦ id → faculty fname, faculty lname, department, gender, mobile no, doj
  ◦ id → member no
  ◦ member no → id
  ◦ 2 Keys: id | member no
• In BCNF
• Issues:
  ◦ member no is needed for issue / return queries. It is unnecessary to have faculty details with that.
  ◦ member no may also come from students relation.
  ◦ member type is needed for issue / return queries. This is implicit by the fact that we are in faculty relation.

Database Management Systems Partha Pratim Das 28.24


LIS Schema Refinement (7): Query

Module 28

• Consider a query:
  ◦ Get the name of the member who has issued the book having accession number = 162715
    . If the member is a student,
        SELECT student_fname AS first_name, student_lname AS last_name
        FROM students, book_issue
        WHERE accession_no = 162715 AND book_issue.member_no = students.member_no;
    . If the member is a faculty,
        SELECT faculty_fname AS first_name, faculty_lname AS last_name
        FROM faculty, book_issue
        WHERE accession_no = 162715 AND book_issue.member_no = faculty.member_no;
  ◦ Which query should be fired?

Database Management Systems Partha Pratim Das 28.25


LIS Schema Refinement (8): members

There are 4 categories of members: ug students, grad students, research scholars, and faculty members. This leads to the following specialization relationships:
• Consider the entity set members of a library and refine:
  ◦ Attributes:
    . member no
    . member class – ‘student’ or ‘faculty’, used to choose table
    . member type – ug, pg, rs, fc, ...
    . roll no (if member class = ‘student’, else null)
    . id (if member class = ‘faculty’, else null)
• We can then exploit some hidden relationship:
  ◦ students IS A members
  ◦ faculty IS A members
• Type of relationship
  ◦ One-to-one

Database Management Systems Partha Pratim Das 28.26


LIS Schema Refinement (9): Query

Module 28

• Consider the access query again:
  ◦ Get the name of the member who has issued the book having accession number = 162715

(SELECT faculty_fname AS first_name, faculty_lname AS last_name
 FROM faculty, members, book_issue
 WHERE accession_no = 162715 AND book_issue.member_no = members.member_no
   AND member_class = 'faculty' AND members.id = faculty.id)
UNION
(SELECT student_fname AS first_name, student_lname AS last_name
 FROM students, members, book_issue
 WHERE accession_no = 162715 AND book_issue.member_no = members.member_no
   AND member_class = 'student' AND members.roll_no = students.roll_no);
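A single query over the refined members relation can be exercised end to end with Python's built-in `sqlite3` module. The sketch below is an assumption-laden illustration: the table definitions keep only the columns the query needs, the member names and dates are invented, the `borrower` helper is hypothetical, and the UNION is written without parentheses around each branch because SQLite does not accept them there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (student_fname TEXT, student_lname TEXT, roll_no TEXT PRIMARY KEY);
CREATE TABLE faculty  (faculty_fname TEXT, faculty_lname TEXT, id TEXT PRIMARY KEY);
CREATE TABLE members (
    member_no    INTEGER PRIMARY KEY,
    member_class TEXT CHECK (member_class IN ('student', 'faculty')),
    member_type  TEXT CHECK (member_type IN ('ug', 'pg', 'rs', 'fc')),
    roll_no      TEXT REFERENCES students(roll_no),
    id           TEXT REFERENCES faculty(id)
);
CREATE TABLE book_issue (
    member_no    INTEGER REFERENCES members(member_no),
    accession_no INTEGER,
    doi          TEXT,
    PRIMARY KEY (member_no, accession_no)
);
-- Invented sample rows, for illustration only.
INSERT INTO students VALUES ('Asha', 'Rao', 'R001');
INSERT INTO faculty  VALUES ('Ravi', 'Sen', 'F001');
INSERT INTO members  VALUES (1, 'student', 'ug', 'R001', NULL);
INSERT INTO members  VALUES (2, 'faculty', 'fc', NULL, 'F001');
INSERT INTO book_issue VALUES (1, 162715, '2024-01-10');
INSERT INTO book_issue VALUES (2, 162716, '2024-01-11');
""")

QUERY = """
SELECT f.faculty_fname AS first_name, f.faculty_lname AS last_name
FROM faculty f, members m, book_issue b
WHERE b.accession_no = :acc AND b.member_no = m.member_no
  AND m.member_class = 'faculty' AND m.id = f.id
UNION
SELECT s.student_fname, s.student_lname
FROM students s, members m, book_issue b
WHERE b.accession_no = :acc AND b.member_no = m.member_no
  AND m.member_class = 'student' AND m.roll_no = s.roll_no
"""


def borrower(accession_no):
    """Return (first_name, last_name) rows for whoever holds the given copy."""
    return conn.execute(QUERY, {"acc": accession_no}).fetchall()
```

The same call works whether the copy is held by a student or by a faculty member, so the caller no longer has to decide which query to fire.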

Database Management Systems Partha Pratim Das 28.27


LIS Schema Refinement (10): members

Module 28

• members(member no, member class, member type, roll no, id)
  ◦ member no → member type, member class, roll no, id
  ◦ member type → member class
  ◦ Key: member no

Database Management Systems Partha Pratim Das 28.28


LIS Schema Refinement (11): students

Module 28
• students(student fname, student lname, roll no, department, gender, mobile no, dob, degree)
  ◦ roll no → student fname, student lname, department, gender, mobile no, dob, degree
  ◦ Keys: roll no
  ◦ Note:
    . member no is no longer used
    . member type and member class are set in members from degree at the time of creation of a new record.

Database Management Systems Partha Pratim Das 28.29


LIS Schema Refinement (12): faculty

Module 28
• faculty(faculty fname, faculty lname, id, department, gender, mobile no, doj)
  ◦ id → faculty fname, faculty lname, department, gender, mobile no, doj
  ◦ Keys: id
  ◦ Note:
    . member no is no longer used
    . member type and member class are set in members at the time of creation of a new record

Database Management Systems Partha Pratim Das 28.30


LIS Schema Refinement (13): Final

Module 28

• book catalogue(title, author fname, author lname, publisher, year, ISBN no)
• book copies(ISBN no, accession no)
• book issue(member no, accession no, doi)
• quota(member type, max books, max duration)
• members(member no, member class, member type, roll no, id)
• students(student fname, student lname, roll no, department, gender, mobile no, dob, degree)
• faculty(faculty fname, faculty lname, id, department, gender, mobile no, doj)
• staff(staff fname, staff lname, id, gender, mobile no, doj)

Database Management Systems Partha Pratim Das 28.31


Module Summary

Module 28

• Using the specification for a Library Information System, we have illustrated how a schema can be designed and then refined for finalization
• Coding of various queries based on these schema is left as an exercise

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.

Database Management Systems Partha Pratim Das 28.32


Module 29

Partha Pratim
Das

Objectives &
Outline Database Management Systems
Multivalued
Dependency Module 29: Relational Database Design/9: MVD and 4NF
Definition
Example
Use
Theory

Decomposition to Partha Pratim Das


4NF

Module Summary
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur

[email protected]

Database Management Systems Partha Pratim Das 29.1


Module Recap PPD

Module 29

Partha Pratim • Using the specification for a Library Information System, we have illustrated how a
Das
schema can be designed and then refined for finalization
Objectives &
Outline

Multivalued
Dependency
Definition
Example
Use
Theory

Decomposition to
4NF

Module Summary

Database Management Systems Partha Pratim Das 29.2


Module Objectives PPD

Module 29

Partha Pratim • To understand multi-valued dependencies arising out of attributes that can have
Das
multiple values
Objectives &
Outline • To define Fourth Normal Form and learn the decomposition algorithm to 4NF
Multivalued
Dependency
Definition
Example
Use
Theory

Decomposition to
4NF

Module Summary

Database Management Systems Partha Pratim Das 29.3


Module Outline PPD

Module 29

Partha Pratim • Multivalued Dependencies


Das
• Decomposition to 4NF
Objectives &
Outline

Multivalued
Dependency
Definition
Example
Use
Theory

Decomposition to
4NF

Module Summary

Database Management Systems Partha Pratim Das 29.4


MVD

Multivalued Dependency

Database Management Systems Partha Pratim Das 29.5


MVD: Multivalued Dependency PPD

• Persons(Man, Phones, Dogs Like)

There are no nontrivial FDs: all three attributes together form the only candidate key, MPD. In the above relation, two multivalued dependencies exist:
• Man ↠ Phones
• Man ↠ Dogs Like
A man’s phones are independent of the dogs he likes. But after converting the above relation to single-valued attributes, each of a man’s phones appears with each of the dogs he likes, in all combinations.
Source: http://www.edugrabs.com/multivalued-dependency-mvd/

Database Management Systems Partha Pratim Das 29.6


MVD (2) PPD

Module 29
• If two or more independent relations are kept in a single relation, then a multivalued dependency is possible. For example, let there be two relations:
  ◦ Student(SID, Sname) where (SID → Sname)
  ◦ Course(CID, Cname) where (CID → Cname)
• There is no relationship defined between Student and Course. If we keep them in a single relation named Student Course, then an MVD will exist because of the m:n cardinality
• If two or more multivalued attributes exist in a relation, then MVDs arise when they are converted into single-valued attributes (SVAs).

Source: http://www.edugrabs.com/multivalued-dependency-mvd/

Database Management Systems Partha Pratim Das 29.7


MVD (3)

Module 29

Partha Pratim
• Suppose we record names of children, and phone numbers for instructors:
  ◦ inst child(ID, child name)
  ◦ inst phone(ID, phone number)
• If we were to combine these schema to get
  ◦ inst info(ID, child name, phone number)
  ◦ Example data:
      (99999, David, 512-555-1234)
      (99999, David, 512-555-4321)
      (99999, William, 512-555-1234)
      (99999, William, 512-555-4321)
• This relation is in BCNF
  ◦ Why?

Database Management Systems Partha Pratim Das 29.8


MVD: Definition PPD

Module 29
• Let R be a relation schema and let α ⊆ R and β ⊆ R. The multivalued dependency
      α ↠ β
  holds on R if in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], there exist tuples t3 and t4 in r such that:
      t1[α] = t2[α] = t3[α] = t4[α]
      t3[β] = t1[β]
      t3[R – β] = t2[R – β]
      t4[β] = t2[β]
      t4[R – β] = t1[R – β]
• Example: A relation of university courses, the books recommended for the course, and the lecturers who will be teaching the course:
  ◦ course ↠ book
  ◦ course ↠ lecturer
  ◦ Test: course ↠ book
Database Management Systems Partha Pratim Das 29.9
MVD: Example

Module 29

Partha Pratim
• Let R be a relation schema with a set of attributes that are partitioned into 3 nonempty subsets Y, Z, W
• We say that Y ↠ Z (Y multidetermines Z) if and only if for all possible relations r(R)
      <y1, z1, w1> ∈ r and <y1, z2, w2> ∈ r
  then
      <y1, z1, w2> ∈ r and <y1, z2, w1> ∈ r
• Note that since the behavior of Z and W are identical it follows that
      Y ↠ Z if Y ↠ W
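The swap condition in this definition can be tested directly on a relation instance. The following is a small sketch, assuming a hypothetical helper `satisfies_mvd` and using the inst info example data from the earlier slide (with underscores in the attribute names). Checking every ordered pair (t1, t2) covers both required tuples t3 and t4, since t4 is the t3 of the reversed pair.

```python
def satisfies_mvd(rows, attrs, alpha, beta):
    """Check alpha ->> beta on one relation instance (a list of tuples over attrs)."""
    idx = {a: i for i, a in enumerate(attrs)}
    present = set(rows)
    for t1 in rows:
        for t2 in rows:
            if any(t1[idx[a]] != t2[idx[a]] for a in alpha):
                continue  # the pair does not agree on alpha
            # t3 takes alpha and beta from t1 and every other attribute from t2.
            t3 = tuple(
                t1[idx[a]] if (a in alpha or a in beta) else t2[idx[a]]
                for a in attrs
            )
            if t3 not in present:
                return False
    return True


attrs = ["ID", "child_name", "phone_number"]
inst_info = [
    ("99999", "David", "512-555-1234"),
    ("99999", "David", "512-555-4321"),
    ("99999", "William", "512-555-1234"),
    ("99999", "William", "512-555-4321"),
]

holds_child = satisfies_mvd(inst_info, attrs, ["ID"], ["child_name"])
holds_phone = satisfies_mvd(inst_info, attrs, ["ID"], ["phone_number"])

# Dropping one combination breaks the independence of children and phones.
broken = satisfies_mvd(inst_info[:3], attrs, ["ID"], ["child_name"])
```

Note that such a check only shows whether one instance satisfies the dependency. Whether the dependency holds on the schema is a statement about all legal instances.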
Decomposition to
4NF

Module Summary

Database Management Systems Partha Pratim Das 29.10


MVD: Example (2)

Module 29

• In our example:
      ID ↠ child name
      ID ↠ phone number
• The above formal definition is supposed to formalize the notion that given a particular value of Y (ID) it has associated with it a set of values of Z (child name) and a set of values of W (phone number), and these two sets are in some sense independent of each other.
• Note:
  ◦ If Y → Z then Y ↠ Z
  ◦ Indeed we have (in above notation) z1 = z2. The claim follows.

Database Management Systems Partha Pratim Das 29.11


MVD: Use

Module 29

• We use multivalued dependencies in two ways:
  a) To test relations to determine whether they are legal under a given set of functional and multivalued dependencies
  b) To specify constraints on the set of legal relations. We shall thus concern ourselves only with relations that satisfy a given set of functional and multivalued dependencies.
• If a relation r fails to satisfy a given multivalued dependency, we can construct a relation r′ that does satisfy the multivalued dependency by adding tuples to r.

Database Management Systems Partha Pratim Das 29.12


MVD: Theory PPD

Module 29

Name | Rule
C- Complementation | If X ↠ Y, then X ↠ (R − (X ∪ Y)).
A- Augmentation | If X ↠ Y and W ⊇ Z, then WX ↠ YZ.
T- Transitivity | If X ↠ Y and Y ↠ Z, then X ↠ (Z − Y).
Replication | If X → Y, then X ↠ Y, but the reverse is not true.
Coalescence | If X ↠ Y and there is a W such that W ∩ Y is empty, W → Z and Y ⊇ Z, then X → Z.

• An MVD X ↠ Y in R is called a trivial MVD if
  ◦ Y is a subset of X (X ⊇ Y) or
  ◦ X ∪ Y = R.
  Otherwise, it is a nontrivial MVD and we have to repeat values redundantly in the tuples.

Database Management Systems Partha Pratim Das 29.13


MVD: Theory (2)

Module 29
• From the definition of multivalued dependency, we can derive the following rule:
  ◦ If α → β, then α ↠ β
  That is, every functional dependency is also a multivalued dependency
• The closure D+ of D is the set of all functional and multivalued dependencies logically implied by D.
  ◦ We can compute D+ from D, using the formal definitions of functional dependencies and multivalued dependencies.
  ◦ We can manage with such reasoning for very simple multivalued dependencies, which seem to be most common in practice
  ◦ For complex dependencies, it is better to reason about sets of dependencies using a system of inference rules

Database Management Systems Partha Pratim Das 29.14


Decomposition to 4NF PPD

Decomposition to 4NF

Database Management Systems Partha Pratim Das 29.15


Fourth Normal Form

Module 29
• A relation schema R is in 4NF with respect to a set D of functional and multivalued dependencies if for all multivalued dependencies in D+ of the form α ↠ β, where α ⊆ R and β ⊆ R, at least one of the following holds:
  ◦ α ↠ β is trivial (that is, β ⊆ α or α ∪ β = R)
  ◦ α is a superkey for schema R
• If a relation is in 4NF it is in BCNF

Database Management Systems Partha Pratim Das 29.16


Restriction of Multivalued Dependencies

Module 29
• The restriction of D to Ri is the set Di consisting of
  ◦ All functional dependencies in D+ that include only attributes of Ri
  ◦ All multivalued dependencies of the form
        α ↠ (β ∩ Ri)
    where α ⊆ Ri and α ↠ β is in D+

Database Management Systems Partha Pratim Das 29.17


4NF Decomposition Algorithm PPD

Module 29

a) For all nontrivial dependencies A ↠ B in D+, check if A is a superkey
   • By using attribute closure
b) If not, then
   • Choose a dependency in D+ that breaks the 4NF rules, say A ↠ B
   • Create R1 = AB
   • Create R2 = (R – (B – A))
   • Note that: R1 ∩ R2 = A and A ↠ AB (= R1), so this is a lossless decomposition
c) Repeat for R1 and R2
   • By defining D1 to be the restriction of D+ to R1
   • Similarly D2

Database Management Systems Partha Pratim Das 29.18


4NF Decomposition Algorithm

Module 29
result := {R};
done := false;
compute D+;
Let Di denote the restriction of D+ to Ri
while (not done)
    if (there is a schema Ri in result that is not in 4NF) then
    begin
        let α ↠ β be a nontrivial multivalued dependency that holds
            on Ri such that α → Ri is not in Di, and α ∩ β = ∅;
        result := (result − Ri) ∪ (Ri − β) ∪ (α, β);
    end
    else done := true;
Note: each Ri is in 4NF, and decomposition is lossless-join

Database Management Systems Partha Pratim Das 29.19
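The loop above can be sketched in Python. This is a simplified illustration, not the full algorithm: it does not compute D+, so any derived dependency that is needed (for example one obtained by MVD transitivity) has to be supplied explicitly. MVDs are restricted to a schema Ri by intersecting their right-hand side with Ri. All function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under functional dependencies (lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def violating_mvd(schema, fds, mvds):
    """Return (alpha, beta) for an MVD that breaks 4NF on schema, or None."""
    for lhs, rhs in mvds:
        if not lhs <= schema:
            continue
        beta = (rhs & schema) - lhs        # restrict to schema; keep alpha and beta disjoint
        if not beta or (lhs | beta) == schema:
            continue                       # trivial on this schema
        if schema <= closure(lhs, fds):
            continue                       # alpha is a superkey
        return lhs, beta
    return None


def decompose_4nf(schema, fds, mvds):
    """Split schema on violating MVDs until no listed MVD violates 4NF."""
    result = [frozenset(schema)]
    done = False
    while not done:
        done = True
        for ri in result:
            found = violating_mvd(ri, fds, mvds)
            if found:
                alpha, beta = found
                result.remove(ri)
                result += [ri - beta, frozenset(alpha | beta)]
                done = False
                break                      # restart the scan over the new result
    return set(result)


book = decompose_4nf(
    {"book_title", "author_fname", "author_lname", "edition"},
    fds=[],
    mvds=[(frozenset({"book_title"}), frozenset({"author_fname", "author_lname"})),
          (frozenset({"book_title"}), frozenset({"edition"}))],
)
print(book)
```

Each split yields two strictly smaller schemas, so the loop terminates. Because only the listed dependencies are consulted, the result is guaranteed to be in 4NF only with respect to those dependencies.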


Example of 4NF Decomposition PPD

• Example:
   ◦ Person_Modify(Man(M), Phones(P), Dog_Likes(D), Address(A))
   ◦ FDs:
      . FD1 : Man ↠ Phones
      . FD2 : Man ↠ Dog_Likes
      . FD3 : Man → Address
   ◦ Key = MPD
   ◦ All dependencies violate 4NF
Figure: Post Normalization

Database Management Systems Partha Pratim Das 29.20


Example of 4NF Decomposition

• R = (A, B, C, G, H, I)
  F = A ↠ B
      B ↠ HI
      CG ↠ H
• R is not in 4NF since A ↠ B and A is not a superkey for R
• Decomposition
   a) R1 = (A, B) (R1 is in 4NF)
   b) R2 = (A, C, G, H, I) (R2 is not in 4NF, decompose into R3 and R4)
   c) R3 = (C, G, H) (R3 is in 4NF)
   d) R4 = (A, C, G, I) (R4 is not in 4NF, decompose into R5 and R6)
      ◦ A ↠ B and B ↠ HI imply A ↠ HI (MVD transitivity),
      ◦ and hence A ↠ I (MVD restriction to R4)
   e) R5 = (A, I) (R5 is in 4NF)
   f) R6 = (A, C, G) (R6 is in 4NF)

Database Management Systems Partha Pratim Das 29.21


Module Summary

• Understood multi-valued dependencies to handle attributes that can have multiple values
• Learnt Fourth Normal Form and decomposition to 4NF

Slides used in this presentation are borrowed from http://db-book.com/ with kind
permission of the authors.
Edited and new slides are marked with “PPD”.

Database Management Systems Partha Pratim Das 29.22


Database Management Systems
Module 30: Relational Database Design/10: Design Summary and Temporal Data

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
[email protected]

Module 30 sections: Objectives & Outline; Database Design Process; Normal Forms; Normalization & De-Normalization; Bad Design; LIS Example; Temporal Databases; Temporal Data; Uni / Bi Temporal; Example; Module Summary

Database Management Systems Partha Pratim Das 30.1


Module Recap PPD

• Understood multi-valued dependencies to handle attributes that can have multiple values
• Learnt Fourth Normal Form and decomposition to 4NF

Database Management Systems Partha Pratim Das 30.2


Module Objectives PPD

• To summarize the database design process
• To explore the issues with temporal data

Database Management Systems Partha Pratim Das 30.3


Module Outline PPD

• Database-Design Process
• Modeling Temporal Data

Database Management Systems Partha Pratim Das 30.4


Database Design Process PPD

Database Management Systems Partha Pratim Das 30.5


Design Goals

• Goal for a relational database design is:
   ◦ BCNF / 4NF
   ◦ Lossless join
   ◦ Dependency preservation
• If we cannot achieve this, we accept one of
   ◦ Lack of dependency preservation
   ◦ Redundancy due to use of 3NF
• Interestingly, SQL does not provide a direct way of specifying functional dependencies
  other than superkeys.
• Can specify FDs using assertions, but they are expensive to test (and currently not
  supported by any of the widely used databases!)
• Even if we had a dependency preserving decomposition, using SQL we would not be
  able to efficiently test a functional dependency whose left hand side is not a key

Database Management Systems Partha Pratim Das 30.6
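Since SQL cannot declare a functional dependency whose left-hand side is not a key, one workaround is to audit the data with a query. The sketch below uses Python's sqlite3 module with a made-up employee table and sample rows; fd_violations is a hypothetical helper, and the table and column names are inserted into the SQL text directly, so they must come from trusted code.

```python
import sqlite3


def fd_violations(conn, table, lhs, rhs):
    """Return (lhs value, number of distinct rhs values) for every lhs value
    that is paired with more than one rhs value, i.e. witnesses that
    lhs -> rhs does not hold in the current data."""
    sql = (
        f"SELECT {lhs}, COUNT(DISTINCT {rhs}) FROM {table} "
        f"GROUP BY {lhs} HAVING COUNT(DISTINCT {rhs}) > 1"
    )
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, department_name TEXT, building TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Physics", "Watson"), (2, "Physics", "Watson"),
     (3, "Music", "Packard"), (4, "Music", "Taylor")],   # illustrative rows only
)
violations = fd_violations(conn, "employee", "department_name", "building")
print(violations)
```

The query scans and groups the whole table, which illustrates why such checks are expensive compared with a key constraint that an index can enforce on every update.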


Further Normal Forms

• Further NFs
   ◦ Elementary Key Normal Form (EKNF)
   ◦ Essential Tuple Normal Form (ETNF)
   ◦ Join Dependencies and Fifth Normal Form (5NF)
   ◦ Sixth Normal Form (6NF)
   ◦ Domain/Key Normal Form (DKNF)
• Join dependencies generalize multivalued dependencies
   ◦ lead to project-join normal form (PJNF) (also called fifth normal form)
• A class of even more general constraints leads to a normal form called domain-key
  normal form.
• Problem with these generalized constraints: they are hard to reason with, and no
  sound and complete set of inference rules exists.
• Hence rarely used

Database Management Systems Partha Pratim Das 30.7


Overall Database Design Process

• We have assumed schema R is given
   ◦ R could have been generated when converting E-R diagram to a set of tables
   ◦ R could have been a single relation containing all attributes that are of interest
     (universal relation)
   ◦ Normalization breaks R into smaller relations
   ◦ R could have been the result of some ad hoc design of relations, which we then
     test/convert to normal form

Database Management Systems Partha Pratim Das 30.8


ER Model and Normalization

• When an E-R diagram is carefully designed, identifying all entities correctly, the tables
  generated from the E-R diagram should not need further normalization
• However, in a real (imperfect) design, there can be functional dependencies from
  non-key attributes of an entity to other attributes of the entity
   ◦ Example: an employee entity with attributes department_name and building,
     and a functional dependency department_name → building
   ◦ Good design would have made department an entity
• Functional dependencies from non-key attributes of a relationship set are possible, but
  rare; most relationships are binary

Database Management Systems Partha Pratim Das 30.9


Denormalization for Performance

• May want to use non-normalized schema for performance
• For example, displaying prereqs along with course_id and title requires join of course
  with prereq
   ◦ Course(course_id, title, . . . )
   ◦ Prerequisite(course_id, prereq)
• Alternative 1: Use denormalized relation containing attributes of course as well as
  prereq with all above attributes: Course(course_id, title, prereq, . . . )
   ◦ faster lookup
   ◦ extra space and extra execution time for updates
   ◦ extra coding work for programmer and possibility of error in extra code
• Alternative 2: Use a materialized view defined as Course ⋈ Prerequisite
   ◦ Benefits and drawbacks same as above, except no extra coding work for
     programmer and avoids possible errors

Database Management Systems Partha Pratim Das 30.10
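The view-based alternative can be tried out with Python's sqlite3 module. One caveat: SQLite offers only ordinary views, so the join below is recomputed on every query rather than stored. A system with materialized views would keep the joined rows on disk, which is where the faster lookup and the extra update cost come from. The sample rows and the view name course_prereq are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Course (course_id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE Prerequisite (course_id TEXT, prereq TEXT);
    INSERT INTO Course VALUES ('CS-101', 'Intro to CS');
    INSERT INTO Course VALUES ('CS-190', 'Game Design');
    INSERT INTO Prerequisite VALUES ('CS-190', 'CS-101');

    -- Ordinary (non-materialized) view standing in for Course JOIN Prerequisite
    CREATE VIEW course_prereq AS
        SELECT c.course_id, c.title, p.prereq
        FROM Course c JOIN Prerequisite p ON c.course_id = p.course_id;
    """
)

# The application queries the view as if it were the denormalized relation.
rows = conn.execute("SELECT course_id, title, prereq FROM course_prereq").fetchall()
print(rows)
```

The application code stays the same as with a denormalized table, but the base relations remain normalized and the database system, not the programmer, keeps the joined result consistent.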


Other Design Issues

• Some aspects of database design are not caught by normalization
• Examples of bad database design, to be avoided:
  Instead of earnings (company_id, year, amount), use
   ◦ earnings_2004, earnings_2005, earnings_2006, etc., all on the schema (company_id,
     earnings).
      . Above are in BCNF, but make querying across years difficult and need a new
        table each year
   ◦ company_year (company_id, earnings_2004, earnings_2005, earnings_2006)
      . Also in BCNF, but also makes querying across years difficult and requires a new
        attribute each year.
      . Is an example of a crosstab, where values for one attribute become column
        names
      . Used in spreadsheets, and in data analysis tools

Database Management Systems Partha Pratim Das 30.11
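To make the crosstab problem concrete, the sketch below converts rows shaped like company_year back into the recommended earnings (company_id, year, amount) form. The function name, the column prefix convention and the sample figures are assumptions made for this illustration.

```python
def unpivot(rows, id_col="company_id", prefix="earnings_"):
    """Turn crosstab rows (one column per year) into (company_id, year, amount) tuples."""
    out = []
    for row in rows:
        for col, value in row.items():
            if col.startswith(prefix):
                year = int(col[len(prefix):])   # the year is encoded in the column name
                out.append((row[id_col], year, value))
    return sorted(out)


crosstab = [
    {"company_id": "C1", "earnings_2004": 10, "earnings_2005": 12},
    {"company_id": "C2", "earnings_2004": 7, "earnings_2005": 9},
]
earnings = unpivot(crosstab)
print(earnings)
```

In the unpivoted form a new year is just a new row, and a query across years is an ordinary selection on the year attribute instead of a change to the schema.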


LIS Example for 4NF

• Consider a different version of relation book_catalogue having the following attributes:
   ◦ book_title
   ◦ author_fname, author_lname: A book_title may be associated with more than one
     author.
• book_catalogue {book_title, author_fname, author_lname, edition}

Figure: book_catalogue


Database Management Systems Partha Pratim Das 30.12
LIS Example 4NF (2)

Figure: book_catalogue

• Since the relation has no FDs, it is already in BCNF.
• However, the relation has two nontrivial MVDs,
  book_title ↠ {author_fname, author_lname} and book_title ↠ edition.
  Thus, it is not in 4NF.
• The relation must be decomposed on its nontrivial MVDs to convert it into a set of
  relations in 4NF.
Database Management Systems Partha Pratim Das 30.13
LIS Example 4NF (3)

• We decompose book_catalogue into book_author and book_edition because:
   ◦ book_author has trivial MVD book_title ↠ {author_fname, author_lname}
   ◦ book_edition has trivial MVD book_title ↠ edition.

Figure: book_author

Figure: book_edition

Database Management Systems Partha Pratim Das 30.14


Temporal Databases PPD


Database Management Systems Partha Pratim Das 30.15


Temporal Databases

• Some data may be inherently historical because they include time-dependent /
  time-varying data, such as:
   ◦ Medical Records
   ◦ Judicial records
   ◦ Share prices
   ◦ Exchange rates
   ◦ Interest rates
   ◦ Company profits
   ◦ etc.
• The desire to model such data means that we need to store not only the respective
  value but also an associated date or a time period for which the value is valid. Typical
  queries expressed informally might include:
   ◦ Give me last month’s history of the Dollar-Pound Sterling exchange rate.
   ◦ Give me the share prices of the NYSE on October 17, 1996.
• Temporal databases provide a uniform and systematic way of dealing with historical data
Source: https://www.cs.uct.ac.za/mit_notes/database/htmls/chp18.html
Database Management Systems Partha Pratim Das 30.16
Temporal Data

• Temporal data have an associated time interval during which the data are valid.
• A snapshot is the value of the data at a particular point in time
• In practice, database designers may add start and end time attributes to relations
• For example, course(course_id, course_title) is replaced by
  course(course_id, course_title, start, end)
   ◦ Constraint: no two tuples for the same course can have overlapping valid times;
     this is hard to enforce efficiently
   ◦ Foreign key references may be to current version of data, or to data at a point in
     time
      . For example, student transcript should refer to course information at the time
        the course was taken

Database Management Systems Partha Pratim Das 30.17
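The non-overlap constraint on valid times can be checked outside the database with a sort-and-scan pass. The sketch below assumes half-open intervals [start, end) and uses plain integers (years) as timestamps; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def overlapping_keys(versions):
    """Return the keys that have at least two versions with overlapping valid times.

    versions is an iterable of (key, start, end) with half-open intervals [start, end).
    """
    by_key = defaultdict(list)
    for key, start, end in versions:
        by_key[key].append((start, end))

    bad = set()
    for key, spans in by_key.items():
        spans.sort()
        # After sorting by start, any overlap shows up between neighbours.
        for (_, end1), (start2, _) in zip(spans, spans[1:]):
            if start2 < end1:
                bad.add(key)
    return bad


course_versions = [
    ("CS-101", 2010, 2015),
    ("CS-101", 2014, 2020),   # overlaps the previous version
    ("CS-190", 2010, 2012),
    ("CS-190", 2012, 2020),   # starts exactly where the previous one ends
]
print(overlapping_keys(course_versions))
```

A check like this has to look at every version of a key whenever one of them changes, which gives a sense of why the constraint is costly to enforce on each update.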


Temporal Database Theory

• Model of Temporal Domain: Single-dimensional, linearly ordered, which may be
   ◦ Discrete or dense
   ◦ Bounded or unbounded
   ◦ Single dimensional or multi-dimensional
   ◦ Linear or non-linear
• Timestamp Model
• Temporal ER model by adding valid time to
   ◦ Attributes: address of an instructor at different points in time
   ◦ Entities: time duration when a student entity exists
   ◦ Relationships: time during which a student attended a course
   ◦ But no accepted standard
• Temporal Functional Dependency Theory
• Temporal Logic
• Temporal Query Language: TQuel [1987], TSQL2 [1995], SQL/Temporal [1996],
  SQL/TP [1997]
Database Management Systems Partha Pratim Das 30.18
Modeling Temporal Data: Uni / Bi Temporal

• There are two different aspects of time in temporal databases.
   ◦ Valid Time: Time period during which a fact is true in the real world, provided to the
     system.
   ◦ Transaction Time: Time period during which a fact is stored in the database, based
     on transaction serialization order; it is the timestamp generated automatically by
     the system.
• A Temporal Relation is one where each tuple has an associated time: either valid time or
  transaction time or both.
   ◦ Uni-Temporal Relations: Have one axis of time, either Valid Time or Transaction
     Time.
   ◦ Bi-Temporal Relations: Have both axes of time, Valid Time and Transaction Time.
     They include Valid Start Time, Valid End Time, Transaction Start Time, Transaction
     End Time.

Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
Database Management Systems Partha Pratim Das 30.19
Modeling Temporal Data: Example (1)

• Example.
   ◦ Let’s see an example of a person, John:
      . John was born on April 3, 1992 in Chennai.
      . His father registered his birth after three days on April 6, 1992.
      . John did his entire schooling and college in Chennai.
      . He got a job in Mumbai and shifted to Mumbai on June 21, 2015.
      . He registered his change of address only on Jan 10, 2016.

Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database

Database Management Systems Partha Pratim Das 30.20


Modeling Temporal Data: Example (2)

• John’s Data In Non-Temporal Database

In a non-temporal database, John’s address is entered as Chennai from 1992. When he
registers his new address in 2016, the database gets updated and the address field now
shows his Mumbai address. The previous Chennai address details will not be available.
So, it will be difficult to find out exactly when he was living in Chennai and when he
moved to Mumbai.

Database Management Systems Partha Pratim Das 30.21


Modeling Temporal Data: Example (3)

• Uni-Temporal Relation (Adding Valid Time To John’s Data)
• The valid time temporal database contents look like this:
  Name, City, Valid From, Valid Till
• John’s father registers his birth on 6th April 1992, and a new database entry is made:
  Person(John, Chennai, 3-Apr-1992, ∞).
• On January 10, 2016 John reports his new address in Mumbai:
  Person(John, Mumbai, 21-June-2015, ∞).
   ◦ The original entry is updated:
     Person(John, Chennai, 3-Apr-1992, 20-June-2015).

Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
Database Management Systems Partha Pratim Das 30.22
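The valid-time bookkeeping for John's record can be sketched as follows. This is an illustration under stated assumptions: rows are (name, city, valid_from, valid_till) tuples with inclusive end dates as in the entries above, date.max stands in for ∞, and the helper names move and city_on are hypothetical.

```python
from datetime import date, timedelta

INF = date.max  # stands in for the open-ended "∞" valid-till value


def move(history, name, city, valid_from):
    """Close the person's open-ended row the day before valid_from and add the new row."""
    updated = []
    for n, c, start, end in history:
        if n == name and end == INF:
            end = valid_from - timedelta(days=1)
        updated.append((n, c, start, end))
    updated.append((name, city, valid_from, INF))
    return updated


def city_on(history, name, day):
    """Return the city whose valid period contains day, or None if unknown."""
    for n, c, start, end in history:
        if n == name and start <= day <= end:
            return c
    return None


history = [("John", "Chennai", date(1992, 4, 3), INF)]            # registered 6-Apr-1992
history = move(history, "John", "Mumbai", date(2015, 6, 21))      # reported 10-Jan-2016

print(city_on(history, "John", date(2001, 1, 1)))
print(city_on(history, "John", date(2015, 6, 21)))
```

Note that the dates on which the facts were recorded (6-Apr-1992 and 10-Jan-2016) appear only in comments here; storing them would require the transaction-time axis of a bi-temporal relation.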
Modeling Temporal Data: Example (4)

• Bi-Temporal Relation (John’s Data Using Both Valid And Transaction Time)
• The database contents look like this:
  Name, City, Valid From, Valid Till, Entered, Superseded
• John’s father registers his birth on 6th April 1992:
  Person(John, Chennai, 3-Apr-1992, ∞, 6-Apr-1992, ∞).
• On January 10, 2016 John reports his new address in Mumbai:
  Person(John, Mumbai, 21-June-2015, ∞, 10-Jan-2016, ∞).
   ◦ The original entry is updated as:
     Person(John, Chennai, 3-Apr-1992, 20-June-2015, 6-Apr-1992, 10-Jan-2016).

Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database

Database Management Systems Partha Pratim Das 30.23


Modeling Temporal Data: Summary

• Advantages
   ◦ The main advantage of bi-temporal relations is that they provide historical and
     rollback information.
      . Historical Information – Valid Time.
      . Rollback Information – Transaction Time.
   ◦ For example, you can get the result for a query on John’s history, like: Where did
     John live in the year 2001? The result for this query can be obtained from the valid
     time entry. The transaction time entry is important to get the rollback information.
• Disadvantages
   ◦ More storage
   ◦ Complex query processing
   ◦ Complex maintenance including backup and recovery

Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
Database Management Systems Partha Pratim Das 30.24
Module Summary

• Discussed aspects of the database design process
• Studied the issues with temporal data

Slides used in this presentation are borrowed from http://db-book.com/ with kind
permission of the authors.
Edited and new slides are marked with “PPD”.

Database Management Systems Partha Pratim Das 30.25


Database Management Systems
Module 31: Application Design and Development/1: Architecture

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
[email protected]

Module 31 sections: Week Recap; Objectives & Outline; Application Programs & Architecture; Architectures; Classification; 1-Tier; 2-Tier; 3-Tier; n-Tier; Sample Applications; Module Summary

Database Management Systems Partha Pratim Das 31.1




Week Recap PPD

• Studied the Normal Forms and their Importance in Relational Design – how progressive
  increase of constraints can minimize redundancy in a schema
• Learnt how to decompose a schema into 3NF while preserving dependency and lossless
  join
• Learnt how to decompose a schema into BCNF with lossless join
• Using the specification for a Library Information System, we have illustrated how a
  schema can be designed and then refined for finalization
• Coding of various queries based on these schemas is left as an exercise
• Understood multi-valued dependencies to handle attributes that can have multiple
  values
• Learnt Fourth Normal Form and decomposition to 4NF
• Discussed aspects of the database design process
• Studied the issues with temporal data
Database Management Systems Partha Pratim Das 31.3
Module Objectives PPD

• What are the Application Programs across various sectors?
• Commonality of architecture across applications
• Understanding the classification and evolution of the architectures
• A look at the architecture for a few sample applications

Database Management Systems Partha Pratim Das 31.4


Module Outline PPD

• Application Programs
• Application Architecture with classification and evolution
• Sample application architectures

Database Management Systems Partha Pratim Das 31.5


Application Programs and User Interfaces PPD

Application Programs and Architecture

Database Management Systems Partha Pratim Das 31.6


Application Programs: Internet / Web or Mobile PPD

• Financial:
   ◦ Netbanking: SBI, PNB, BoB, Canara, HDFC, ICICI
   ◦ Share Market: ICICIDirect, Sharekhan, HDFCDirect
   ◦ Insurance & Investment: LICI, PolicyBazaar, NSDL, NPS
   ◦ Payment Gateway: Paytm, GPay, BHIM UPI, PhonePe
   ◦ e-Commerce: Amazon, Flipkart, eBay, BigBazaar, BigBasket
• Travel & Tourism:
   ◦ Travel Reservations: IRCTC, Airlines, MakeMyTrip, Yatra
   ◦ Accommodation: Booking, OYO, Airbnb, Fabhotels, Treebo
   ◦ Transportation: Uber, Ola Cab, Mega Cab, Meru Cab
   ◦ Navigation: Google Maps, MapQuest, Apple Maps
   ◦ Food & Delivery: Zomato, Swiggy, UberEats, Dunzo
• Communication:
   ◦ Live Interaction: Zoom, Google Meet, Teams, Webex, Skype
   ◦ Intermittent Interaction: WhatsApp, Telegram, Signal, Skype
   ◦ Mail: Gmail, Yahoo, Hotmail, Rediffmail, Enterprise Mail
   ◦ Social Media: Facebook, Instagram, Twitter, YouTube
• Knowledge Discovery:
   ◦ Static: Google, Yahoo, Bing, Wikipedia, Encyclopedia.com
   ◦ Q&A: Quora, ASKfm, Yahoo Answers, Reddit, Digg
• Sports:
   ◦ Cricket: Cricbuzz, CricViz, Cricket-21, Cricket Exchange
   ◦ Tennis: ATP, ITF, SwingVision, TennisPAL, Tennis Clash
• Software Engineering:
   ◦ Issue Tracking: JIRA, Bugzilla, GitHub, GitLab
   ◦ VCS: GitHub, GitLab, BitBucket, SourceForge
   ◦ Online IDE: OnlineGDB, Codechef, Ideone
• Library:
   ◦ Digital Library: National Digital Library of India
   ◦ Archives: Internet Archive, arXiv, Nextpoint
• Education:
   ◦ eLearning: BYJU’s, IGNOU, NIIT, Edukart
   ◦ MOOCs: SWAYAM, edX, Coursera, Udemy
• Document Processing:
   ◦ Editing: Overleaf, Google Docs, Spreadsheet
   ◦ Website, Blog: Google Sites, WordPress, Weebly
• Health:
   ◦ Telemedicine: MDLIVE, Doctor on Demand
   ◦ National: Aarogya Setu, CoWin, NACO App
• Organizational ERP: (Intranet)
   ◦ Institutions: Students, Faculty, Course
   ◦ Hospital: Patient, Doctor, OPD, IPD, Pharmacy
   ◦ Manufacturing: Suppliers, Inventory, Customers
   ◦ Bank: Customers, Accounts, Locker, Deposits
   ◦ Courier: Customers, Parcels, Delivery Agents
Database Management Systems Partha Pratim Das 31.7
Characteristic of Application Programs PPD

• Diversity: These applications widely differ in their
   ◦ Domain, functionality, user base, response time, scale, daily hits and many more
• Unity: Yet, these have a lot in common
   ◦ Most use an RDBMS like Oracle, DB2, MySQL, PostgreSQL, etc. for managing data
   ◦ Applications are functionally split into frontend layer, middle layer, backend layer
      . Frontend or Presentation Layer / Tier
         − Interacts with the user: Display / View, Input / Output
         − Choose item, Add to cart, Checkout, Pay, Track order
         − Interfaces may be Browser-based, Mobile App, or Custom
      . Middle or Application / Business Logic Layer / Tier
         − Implements the Functionality of the Application: Links front and backend
         − Authentication, Search / Browse logic, Pricing, Cart management, Payment handling
           (gateway), Order management (mail / SMS / internal actions), Delivery management
         − Support functionality based on frontend interface
      . Backend or Data Access Layer / Tier
         − Manages persistent data, large volume, efficient access, security
         − User, Cart, Inventory, Order, Vendor databases
Database Management Systems Partha Pratim Das 31.8
Characteristic of Application Programs (2): Architecture PPD

Figure: Multitier architecture

Source: https://en.wikipedia.org/wiki/Multitier_architecture


Database Management Systems Partha Pratim Das 31.9
Application Architectures: Layers

• Presentation Layer / Tier
   ◦ Model-View-Controller (MVC) architecture
      . model: business logic
      . view: presentation of data, depends on display device
      . controller: receives events, executes actions, and returns a view to the user
• Business Logic Layer / Tier
   ◦ provides high level view of data and actions on data
      . often using an object data model
   ◦ hides details of data storage schema
• Data Access Layer / Tier
   ◦ interfaces between business logic layer and the underlying database
   ◦ provides mapping from object model of business layer to relational model of
     database
   ◦ Already discussed and studied in depth
Database Management Systems Partha Pratim Das 31.10
Application Architecture (2): MVC


Database Management Systems Partha Pratim Das 31.11


Application Architecture (3): User Interface

• Web browsers have become the de-facto standard user interface to databases
   ◦ Enable large numbers of users to access databases from anywhere
   ◦ Avoid the need for downloading / installing specialized code, while providing a good
     graphical user interface
      . JavaScript, Flash and other scripting languages run in browser, but are
        downloaded transparently
   ◦ Examples: banks, airline and rental car reservations, university course registration
     and grading, and so on.
• Use on mobile devices is becoming popular
   ◦ Mobile Apps or Browser in Mobile
   ◦ These are similar in architecture and workflow to the web, but differ significantly
     because of their smaller (but widely varying) form factors and extremely low
     resources
   ◦ Will be discussed later
Database Management Systems Partha Pratim Das 31.12
Application Architecture (4): Business Logic Layer

• Provides abstractions of entities
   ◦ For example, students, instructors, courses, etc.
• Enforces business rules for carrying out actions
   ◦ For example, student can enroll in a class only if she has completed prerequisites,
     and has paid her tuition fees
• Supports workflows which define how a task involving multiple participants is to be
  carried out
   ◦ For example, how to process application by a student applying to a university
   ◦ Sequence of steps to carry out task
   ◦ Error handling
      . For example, what to do if recommendation letters not received on time

Database Management Systems Partha Pratim Das 31.13


Application Architecture (5): Object-Relational Mapping

• Allows application code to be written on top of object-oriented data model, while
  storing data in a traditional relational database
   ◦ alternative: implement object-oriented or object-relational database to store object
     model
      . has not been commercially successful
• Schema designer has to provide a mapping between object data and relational schema
   ◦ For example, Java class Student mapped to relation student, with corresponding
     mapping of attributes
   ◦ An object can map to multiple tuples in multiple relations
• Application opens a session, which connects to the database
• Objects can be created and saved to the database using session.save(object)
   ◦ mapping used to create appropriate tuples in the database
• Query can be run to retrieve objects satisfying specified predicates
Database Management Systems Partha Pratim Das 31.14
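The session-and-mapping idea can be sketched as a toy in Python. This is not a real ORM library: the Session class, its save and query methods, and the use of an in-memory SQLite database are all assumptions chosen so that the example is self-contained.

```python
import sqlite3
from dataclasses import dataclass, astuple, fields


@dataclass
class Student:
    # Mapped to relation "student"; attribute names double as column names.
    id: str
    name: str
    dept_name: str


class Session:
    """Toy session: connects to the database and maps objects to tuples."""

    def __init__(self, database=":memory:"):
        self.conn = sqlite3.connect(database)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS student "
            "(id TEXT PRIMARY KEY, name TEXT, dept_name TEXT)"
        )

    def save(self, obj: Student) -> None:
        # The mapping turns one object into one tuple of the student relation.
        columns = ", ".join(f.name for f in fields(obj))
        marks = ", ".join("?" for _ in fields(obj))
        self.conn.execute(
            f"INSERT INTO student ({columns}) VALUES ({marks})", astuple(obj)
        )
        self.conn.commit()

    def query(self, dept_name: str) -> list:
        # Retrieve objects satisfying a predicate on dept_name.
        rows = self.conn.execute(
            "SELECT id, name, dept_name FROM student "
            "WHERE dept_name = ? ORDER BY id",
            (dept_name,),
        )
        return [Student(*row) for row in rows]
```

A caller would write session.save(Student("00128", "Zhang", "Comp. Sci.")) and later session.query("Comp. Sci.") without touching SQL directly.
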
Application Architecture (6): Data Access Layer

• Issues of modeling and design of databases have already been discussed in depth in the previous modules
• Issues of accessing and updating data from applications will be discussed later, through the interactions of native languages and SQL


Architecture Classification

• Database architecture uses programming languages to design a particular type of software for businesses or organizations.
• Database architecture focuses on the design, development, implementation and maintenance of computer programs that store and organize information for businesses, agencies and institutions.
• A database architect develops and implements software to meet the needs of users.
• The design of a DBMS depends on its architecture. It can be
◦ centralized
◦ decentralized
◦ hierarchical
• The architecture of a DBMS can be seen as either single tier or multi-tier:
◦ 1-tier architecture
◦ 2-tier architecture
◦ 3-tier architecture
◦ n-tier architecture

Architecture Evolution

• Three distinct eras of application architecture
◦ Mainframe (1960’s and 70’s)
◦ Personal computer era (1980’s)
◦ Web / Mobile era (1990’s onwards)


1-tier Architecture

• One-tier architecture involves putting all of the required components for a software application or technology on a single server or platform
• Basically, a one-tier architecture keeps all of the elements of an application, including the interface, middleware and back-end data, in one place
• Developers see these types of systems as the simplest and most direct way
Source: Concepts of Database Architecture


2-tier Architecture

• The two-tier architecture is based on the client-server model
• Direct communication takes place between client and server
• There is no intermediate layer between client and server
Source: Concepts of Database Architecture


3-tier Architecture

• A 3-tier architecture separates its tiers - Presentation, Logic and Data Access - from each other based on the complexity of the users and how they use the data present in the database
• It is the most widely used architecture to design a DBMS
Source: Concepts of Database Architecture

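The separation into Presentation, Logic and Data Access tiers can be sketched with one function per tier. This is only a minimal sketch: the function names, the student table, and the use of SQLite are assumptions for the example, and in a real deployment each tier would typically run on a separate machine.

```python
import sqlite3


# Data access tier: the only code that talks to the database.
def fetch_students(conn, dept_name):
    return conn.execute(
        "SELECT id, name FROM student WHERE dept_name = ? ORDER BY id",
        (dept_name,),
    ).fetchall()


# Logic tier: applies rules, knows nothing about SQL or display format.
def department_roster(conn, dept_name):
    if not dept_name.strip():
        raise ValueError("department name is required")
    return [{"id": sid, "name": name} for sid, name in fetch_students(conn, dept_name)]


# Presentation tier: formats results for the user.
def render_roster(roster):
    return "\n".join(f"{s['id']}  {s['name']}" for s in roster) or "(no students)"
```

Because each tier only calls the one below it, the storage engine or the display format can be replaced without touching the other two.
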
n-tier Architecture

• An n-tier architecture distributes different components of the 3 tiers between different servers and adds interface tiers for interactions and workload balancing
Source: Concepts of Database Architecture


Sample Applications in Multiple Tiers PPD



Module Summary

• Had a glimpse of Application Programs across various sectors
• Understood the typical architecture for an application
• Studied the classification and evolution of the architectures
• Glimpsed at architecture for a few sample applications

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.

Database Management Systems
Module 32: Application Design and Development/2: Web Applications

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
[email protected]


Module Recap PPD

• Had a glimpse of Application Programs across various sectors
• Understood the typical architecture for an application
• Studied the classification and evolution of the architectures
• Glimpsed at architecture for a few sample applications


Module Objectives PPD

• To familiarize with the fundamental notions and technologies of the Web
• To learn about scripting
• To learn about Servlets


Module Outline PPD

• Web Fundamentals and Scripting
• Servlets


Web Fundamentals PPD



The World Wide Web

• The Web is a distributed information system based on hypertext
• Most Web documents are hypertext documents formatted via the HyperText Markup Language (HTML)
• HTML documents contain
◦ text along with font specifications, and other formatting instructions
◦ hypertext links to other documents, which can be associated with regions of the text
◦ forms, enabling users to enter data which can then be sent back to the Web server


Uniform Resource Locators PPD

• On the Web, functionality of pointers is provided by Uniform Resource Locators (URLs).
• URL example: http://www.acm.org/sigmod
◦ The first part indicates how the document is to be accessed (protocol)
. “http” indicates that the document is to be accessed using the HyperText Transfer Protocol.
◦ The second part gives the unique name of a machine on the Internet
◦ The rest of the URL identifies the document within the machine
• The local identification can be:
◦ The path name of a file on the machine: a file at C:/WINDOWS/media/Alarm01.wav on the local machine can be accessed as:
. file:///C:/WINDOWS/media/Alarm01.wav
. file://localhost/c:/WINDOWS/media/Alarm01.wav
◦ An identifier (path name) of a program, plus arguments to be passed to the program: searching google.com for ‘silberschatz’ has the URL:
. http://www.google.com/search?q=silberschatz

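The three parts of a URL described above can be pulled apart with Python's standard urllib.parse module. The helper name describe_url and the dictionary keys are assumptions chosen to mirror the slide's wording; only urlsplit and parse_qs are library calls.

```python
from urllib.parse import urlsplit, parse_qs


def describe_url(url):
    """Split a URL into the parts named on the slide."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,             # how the document is accessed
        "machine": parts.netloc,              # name of the machine
        "document": parts.path,               # document (or program) on it
        "arguments": parse_qs(parts.query),   # arguments passed to a program
    }
```

For the search example, the program is /search and its single argument q carries the search term.
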
URI, URL, and URN PPD

• Uniform Resource Identifier (URI)
• Uniform Resource Locator (URL)
• Uniform Resource Name (URN)
• Relationships:
◦ URIs can be classified as locators (URLs), or as names (URNs), or as both.
◦ A URN functions like a person’s name
◦ A URL resembles that person’s street address.
◦ A URN defines an item’s identity, while the URL provides a method for finding it


HTML and HTTP

• HTML provides formatting, hypertext link, and image display features
◦ including tables, stylesheets (to alter default formatting), etc.
• HTML also provides input features
◦ Select from a set of options
. Pop-up menus, radio buttons, check lists
◦ Enter values
. Text boxes
◦ Filled-in input is sent back to the server, to be acted upon by an executable at the server
• HyperText Transfer Protocol (HTTP) is used for communication with the Web server


Sample HTML

<html>
<body>
  <table border>
    <tr> <th>ID</th> <th>Name</th> <th>Department</th> </tr>
    <tr> <td>00128</td> <td>Zhang</td> <td>Comp. Sci.</td> </tr>
    ···
  </table>
  <form action="PersonQuery" method=get>
    Search for:
    <select name="persontype">
      <option value="student" selected>Student</option>
      <option value="instructor">Instructor</option>
    </select> <br>
    Name: <input type=text size=20 name="name">
    <input type=submit value="submit">
  </form>
</body>
</html>


HTTP and Sessions

• The HTTP protocol is connectionless
◦ That is, once the server replies to a request, the server closes the connection with the client, and forgets all about the request
◦ In contrast, Unix logins and JDBC/ODBC connections stay connected until the client disconnects
. retaining user authentication and other information
◦ Motivation: reduces load on server
. operating systems have tight limits on number of open connections on a machine
• Information services need session information
◦ For example, user authentication should be done only once per session
• Solution: use a cookie


Sessions and Cookies

• A cookie is a small piece of text containing identifying information
◦ Sent by server to browser
. Sent on first interaction, to identify session
◦ Sent by browser to the server that created the cookie on further interactions
. part of the HTTP protocol
◦ Server saves information about cookies it issued, and can use it when serving a request
. For example, authentication information, and user preferences
• Cookies can be stored permanently or for a limited time
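The cookie exchange can be sketched with Python's standard http.cookies module. This is an illustration only: the cookie name sessionid, the in-memory SESSIONS table, and the two helper functions are assumptions for the example, and a production server would also handle expiry and secure transport.

```python
import secrets
from http.cookies import SimpleCookie

# In-memory session table kept by the server: session id -> session data.
SESSIONS = {}


def start_session(userid):
    """First interaction: create a session, return the Set-Cookie header value."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"userid": userid}
    cookie = SimpleCookie()
    cookie["sessionid"] = session_id
    cookie["sessionid"]["httponly"] = True
    return cookie["sessionid"].OutputString()


def lookup_session(cookie_header):
    """Later interactions: the browser sends the cookie back; find its session."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("sessionid")
    return SESSIONS.get(morsel.value) if morsel else None
```

The server stores the authentication result once under the session id, so later requests that carry the cookie need not log in again.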


Web Browser

• A web browser is application software for accessing the World Wide Web
• The purpose of a web browser is to fetch content from the Web and display it on a user’s device
• This process begins when the user inputs a URL into the browser starting with either http: or https:
• Once a web page has been retrieved, the rendering engine displays it on the user’s device
◦ A browser engine (or rendering engine) is a core software component of a web browser
◦ The primary job of a browser engine is to transform HTML documents and other resources of a web page into an interactive visual representation on a user’s device
◦ This includes image and video formats supported by the browser
• Web pages usually contain hyperlinks to other pages and resources. Each link contains a URL, and when it is clicked or tapped, the browser navigates to the new resource
• Web browsers are used on a range of devices, including desktops, laptops, tablets, and smartphones. In 2020, an estimated 4.9 billion people used a browser. The most used browser is Google Chrome, with a 64% global market share on all devices, followed by Safari with 19%

Web Servers

• A web server is software and underlying hardware that accepts requests via HTTP or its secure variant HTTPS
• A user agent, such as a web browser or crawler, requests a specific resource using HTTP, and the server responds with the content of that resource or an error message
• The server can also accept and store resources sent from the user agent
• The document name in a URL may identify an executable program that, when run, generates an HTML document
◦ When an HTTP server receives a request for such a document, it executes the program, and sends back the HTML document that is generated
◦ The Web client can pass extra arguments with the name of the document
• To install a new service on the Web, one simply needs to create and install an executable that provides that service
◦ The Web browser provides a graphical user interface to the information service
• Common Gateway Interface (CGI): a standard interface between web and application server

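The idea of a document name that runs a program and returns generated HTML can be sketched with Python's built-in http.server module. This is not CGI itself, only an illustration of the same request-to-generated-page flow; the names build_page and HelloHandler and the name argument are assumptions for the example.

```python
from html import escape
from http.server import BaseHTTPRequestHandler
from urllib.parse import urlsplit, parse_qs


def build_page(query_string):
    """Generate an HTML document from the arguments passed with the request."""
    params = parse_qs(query_string)
    name = params.get("name", ["World"])[0]
    # escape() keeps user-supplied text from being interpreted as markup.
    return f"<html><body><p>Hello, {escape(name)}</p></body></html>"


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Run the "program" for this request and send back the generated page.
        body = build_page(urlsplit(self.path).query).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, one could run http.server.HTTPServer(("localhost", 8000), HelloHandler).serve_forever() and open http://localhost:8000/?name=Zhang in a browser; the port number is arbitrary.
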
Web Services

• Allow data on the Web to be accessed using a remote procedure call mechanism
• Two approaches are widely used
◦ Representational State Transfer (REST): allows use of a standard HTTP request to a URL to execute a request and return data
. returned data is encoded either in XML, or in JavaScript Object Notation (JSON)
◦ Big Web Services:
. uses XML representation for sending request data, as well as for returning results
. standard protocol layer built on top of HTTP
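A REST-style exchange with JSON-encoded results can be sketched as follows. The sketch stays offline: it builds the request URL and encodes and decodes the payload, but sends nothing over the network. The function names, the base URL http://example.org/api, and the shape of the JSON document are all assumptions for illustration.

```python
import json
from urllib.parse import urlencode


def rest_url(base, resource, **params):
    """Build the URL a client would request with a standard HTTP GET."""
    query = f"?{urlencode(params)}" if params else ""
    return f"{base}/{resource}{query}"


def encode_result(rows):
    """Server side: encode the returned data as JSON."""
    return json.dumps({"count": len(rows), "rows": rows})


def decode_result(payload):
    """Client side: turn the JSON response back into native objects."""
    return json.loads(payload)["rows"]
```

The request is an ordinary URL and the response is plain text, which is why any HTTP client can call such a service.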


Web Architecture

Source: Web Application Architecture: A Comprehensive Guide On The What, Why And How

Scripting for Web Applications PPD



Scripting for Web Applications

• A script is a list of (text) commands that are embedded in a web page or in the server
• They are interpreted and executed by a certain program or scripting engine
• Scripts may be written for a variety of purposes, such as automating processes on a local computer or generating web pages.
• The programming languages in which scripts are written are called scripting languages
• Common scripting languages are VBScript, JavaScript, ASP, PHP, PERL, JSP etc.


Scripting for Web Applications (2)

Scripting is of two types:
• Client Side: Client-side scripting is responsible for interaction within a web page. The client-side scripts are first downloaded at the client end and then interpreted and executed by the browser
• Server Side: Server-side scripting is responsible for carrying out a task at the server end and then sending the result to the client end.
Source: Web Scripting and its Types


Client Side Scripting

• Browsers can fetch certain scripts (client-side scripts) or programs along with documents, and execute them in “safe mode” at the client site
◦ Javascript
◦ Macromedia Flash and Shockwave for animation/games
◦ VRML
◦ Applets
• Client-side scripts/programs allow documents to be active
◦ For example, animation by executing programs at the local site
◦ For example, ensure that values entered by users satisfy some correctness checks
◦ Permit flexible interaction with the user.
. Executing programs at the client site speeds up interaction by avoiding many round trips to server


Client Side Scripting (2): Security

• Security mechanisms needed to ensure that malicious scripts do not cause damage to the client machine
◦ Easy for limited capability scripting languages, harder for general purpose programming languages like Java
• For example, Java’s security system ensures that the Java applet code does not make any system calls directly
◦ Disallows dangerous actions such as file writes
◦ Notifies the user about potentially dangerous actions, and allows the option to abort the program or to continue execution.


Javascript

• Javascript is very widely used
◦ forms basis of new generation of Web applications (called Web 2.0 applications) offering rich user interfaces
• Javascript functions can
◦ check input for validity
◦ modify the displayed Web page, by altering the underlying document object model (DOM) tree representation of the displayed HTML text
◦ communicate with a Web server to fetch data and modify the current page using fetched data, without needing to reload/refresh the page
. forms basis of AJAX technology used widely in Web 2.0 applications
. For example, on selecting a country in a drop-down menu, the list of states in that country is automatically populated in a linked drop-down menu


Javascript (2): Example

• Example of Javascript used to validate form input

<html> <head>
<script type="text/javascript">
  // Reject the form unless credits is a number strictly between 0 and 16.
  function validate() {
    var credits = document.getElementById("credits").value;
    if (isNaN(credits) || credits <= 0 || credits >= 16) {
      alert("Credits must be a number greater than 0 and less than 16");
      return false;  // returning false cancels the submission
    }
  }
</script>
</head> <body>
<form action="createCourse" onsubmit="return validate()">
  Title: <input type="text" id="title" size="20"><br />
  Credits: <input type="text" id="credits" size="2"><br />
  <input type="submit" value="Submit">
</form>
</body> </html>

Server-Side Scripting

• Server-side scripting simplifies the task of connecting a database to the Web
◦ Define an HTML document with embedded executable code/SQL queries.
◦ Input values from HTML forms can be used directly in the embedded code/SQL queries.
◦ When the document is requested, the Web server executes the embedded code/SQL queries to generate the actual HTML document.
• Numerous server-side scripting languages
◦ JSP, PHP
◦ General purpose scripting languages: VBScript, Perl, Python


Servlets

• Java Servlet specification defines an API for communication between the Web / application server and application program running in the server
◦ For example, methods to get parameter values from Web forms, and to send HTML text back to client
• Application program (also called a servlet) is loaded into the server
◦ Each request spawns a new thread in the server
. thread is closed once the request is serviced


Servlet (2): Example

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
public class PersonQueryServlet extends HttpServlet {
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException
    {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<HEAD><TITLE> Query Result</TITLE></HEAD>");
        out.println("<BODY>");
        · · · BODY OF SERVLET (next slide) · · ·
        out.println("</BODY>");
        out.close();
    }
}


Servlet (3): Example

String persontype = request.getParameter("persontype");
String number = request.getParameter("name");
if (persontype.equals("student")) {
    · · · code to find students with the specified name · · ·
    · · · using JDBC to communicate with the database · · ·
    out.println("<table BORDER COLS=3>");
    out.println("<tr> <td>ID</td><td>Name: </td>" + "<td> Department </td> </tr>");
    for(· · · each result · · · ){
        · · · retrieve ID, name and dept name
        · · · into variables ID, name and deptname
        · · · out.println("<tr> <td>" + ID + "</td>" + "<td>" + name
            + "</td>" + "<td>" + deptname + "</td></tr>");
    }
    out.println("</table>");
}
else {
    · · · as above, but for instructors · · ·
}


Servlet (4): Sessions

• Servlet API supports handling of sessions
◦ Sets a cookie on first interaction with browser, and uses it to identify session on further interactions
• To check if session is already active:
◦ if (request.getSession(false) != null)
. .. then existing session
. else .. redirect to authentication page
◦ authentication page
. check login/password
. request.getSession(true): creates new session
• Store/retrieve attribute value pairs for a particular session
◦ session.setAttribute("userid", userid)
◦ session.getAttribute("userid")


Servlet (5): Support

• Servlets run inside application servers such as
◦ Apache Tomcat, Glassfish, JBoss
◦ BEA Weblogic, IBM WebSphere and Oracle Application Servers
• Application servers support
◦ deployment and monitoring of servlets
◦ Java 2 Enterprise Edition (J2EE) platform supporting objects, parallel processing across multiple application servers, etc


Java Server Pages (JSP)

• A JSP page with embedded Java code
<html>
<head> <title> Hello </title> </head>
<body>
<% if (request.getParameter("name") == null)
   { out.println("Hello World"); }
   else { out.println("Hello, " + request.getParameter("name")); }
%>
</body>
</html>
• JSP is compiled into Java + Servlets
• JSP allows new tags to be defined, in tag libraries
◦ such tags are like library functions, and are used for example to build rich user interfaces such as paginated display of large datasets


PHP

• PHP is widely used for Web server scripting
• Extensive libraries including for database access using ODBC
<html>
<head> <title> Hello </title> </head>
<body>
<?php if (!isset($_REQUEST['name']))
   { echo "Hello World"; }
   else { echo "Hello, " . $_REQUEST['name']; }
?>
</body>
</html>


USP of JSP

• JSP vs. Active Server Pages (ASP)
◦ ASP is a similar technology from Microsoft and is proprietary (uses VB).
◦ JSP is platform independent and portable.
• JSP vs. Pure Servlets
◦ JSP is a servlet but it is more convenient to write and to modify regular HTML than to have a million println statements that generate the HTML.
◦ The Web page design experts can build the HTML, leaving places for the servlet programmers to insert the dynamic content.
• JSP vs. JavaScript: JavaScript can generate HTML dynamically on the client.
◦ “Client Side”: JavaScript code is executed by the browser after the web server sends the HTTP response. With the exception of cookies, HTTP and form submission data is not available to JavaScript.
◦ “Server Side”: Java Server Pages are executed by the web server before the web server sends the HTTP response. It can access server-side resources like databases, catalogs.
• JSP vs. Static HTML: Regular HTML cannot contain dynamic information.

Module Summary

• Familiarized with the fundamental notions and technologies of the Web
• Learnt about Scripting
• Learnt the notions of Servlets

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.


Database Management Systems
Module 33: Application Design and Development/3: SQL and Native Language

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
[email protected]


Module Recap PPD

• Familiarized with the fundamental notions and technologies of the Web
• Learnt about Scripting
• Learnt the notions of Servlets


Module Objectives PPD

• To understand how to use SQL from a programming language


Module Outline PPD

• Accessing SQL From a Programming Language




Working with SQL and Native Language

• Applications use an Application Programming / Program Interface (API) to interact with a database server
• Applications make calls to
◦ Connect with the database server
◦ Send SQL commands to the database server
◦ Fetch tuples of result one-by-one into program variables
• Frameworks
◦ Connectionist
. Open Database Connectivity (ODBC) works with C, C++, C#, Visual Basic, and Python. Other data APIs include
− OLEDB
− ADO.NET
. Java Database Connectivity (JDBC) works with Java
◦ Embedding
. Embedded SQL works with C, C++, Java, COBOL, FORTRAN and Pascal

Native Language ⇐⇒ Query Language: Connectionist PPD



ODBC

• Open Database Connectivity (ODBC) is a standard API for accessing DBMS
• It aims to be independent of database systems and operating systems
• An application written using ODBC can be ported to other platforms, both on the client and server side, with few changes to the data access code
• ODBC is
◦ A standard for an application program to communicate with a database server
◦ An application program interface (API) to
. Open a connection with a database
. Send queries and updates
. Get back results
• Applications such as GUI, Spreadsheets, etc. can use ODBC
• ODBC was originally developed by Microsoft and Simba Technologies during the early 1990s, and became the basis for the Call Level Interface (CLI) standardized by SQL Access Group in the Unix and mainframe field.

ODBC (2): Python Example PPD

• The code uses a data source named “SQLS” from the odbc.ini file to connect and issue a query
• It creates a table, inserts data using literal and parameterized statements, and fetches the data

Source: https://dzone.com/articles/tutorial-connecting-to-odbc-data-sources-with-pyth
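
The listing described on this slide is not reproduced in this text, so the following is only a minimal sketch of the same sequence of steps, not the original code. It assumes the third-party pyodbc package and a data source named SQLS; the table name odbc_demo and the helper names connect_via_odbc and run_demo are hypothetical. To keep the sketch runnable without an ODBC driver, the final lines use Python's built-in sqlite3 module as a stand-in, since both drivers accept "?" placeholders.

```python
import sqlite3


def connect_via_odbc(dsn="SQLS"):
    """Open an ODBC connection through pyodbc (imported lazily).

    Assumes pyodbc is installed and a data source named `dsn` exists in
    odbc.ini; neither is checked here.
    """
    import pyodbc  # third-party package, not part of the standard library
    return pyodbc.connect(f"DSN={dsn}")


def run_demo(conn):
    """Create a table, insert two rows, and fetch them back.

    Works on any DB-API connection whose driver uses '?' placeholders
    (pyodbc and sqlite3 both do).
    """
    cur = conn.cursor()
    # Hypothetical table name, chosen only for this sketch.
    cur.execute("CREATE TABLE odbc_demo (id INTEGER, name VARCHAR(30))")
    # Literal statement: the values are written directly into the SQL text.
    cur.execute("INSERT INTO odbc_demo VALUES (1, 'literal row')")
    # Parameterized statement: the driver binds the values to the '?' markers.
    cur.execute("INSERT INTO odbc_demo VALUES (?, ?)", (2, "parameterized row"))
    conn.commit()
    cur.execute("SELECT id, name FROM odbc_demo ORDER BY id")
    rows = [tuple(r) for r in cur.fetchall()]
    cur.close()
    return rows


# Stand-in run against an in-memory SQLite database so the sketch executes
# without an ODBC driver; with ODBC, pass connect_via_odbc("SQLS") instead.
rows = run_demo(sqlite3.connect(":memory:"))
print(rows)
```

Passing the connection in as an argument keeps the SQL logic independent of the driver, which is the portability argument made for ODBC itself.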


JDBC

• Java Database Connectivity (JDBC) is an API for the programming language Java, which defines how a client may access a database
• It is a Java-based data access technology used for Java database connectivity
• JDBC supports a variety of features for querying and updating data, and for retrieving query results; it also supports metadata retrieval, such as querying about relations present in the database and the names and types of relation attributes
• Model for communicating with the database:
  ◦ Open a connection
  ◦ Create a “statement” object
  ◦ Execute queries using the Statement object to send queries and fetch results
  ◦ Exception mechanism to handle errors
• JDBC, originally released by Sun Microsystems as part of Java Development Kit (JDK) 1.1 in 1997, is part of the Java Standard Edition platform from Oracle Corporation


JDBC: Example (1) PPD

• We show a simple example here to connect to SQL Server from Java using JDBC to execute database commands
• In the example, the sample code makes a connection to the sample database
• Then, using an SQL statement with the SQLServerStatement object, it runs the SQL statement and places the data that it returns into a SQLServerResultSet object
• Next, the sample code calls the custom displayRow method to iterate through the rows of data that are in the result set, and uses the getString method to display some of the data
• Complete example can be found at: Retrieving result set data sample


JDBC: Example (2) PPD

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class RetrieveResultSet {
    public static void main(String[] args) {
        // Create a variable for the connection string.
        String connectionUrl = "jdbc:sqlserver://<server>:<port>;databaseName=AdventureWorks;";
        connectionUrl += "user=<user>;password=<password>";

        // try-with-resources closes the statement and the connection automatically.
        try (Connection con = DriverManager.getConnection(connectionUrl);
             Statement stmt = con.createStatement()) {
            createTable(stmt);
            String SQL = "SELECT * FROM Production.Product;";
            ResultSet rs = stmt.executeQuery(SQL);
            displayRow("PRODUCTS", rs);
        }
        // Handle any errors that may have occurred.
        catch (SQLException e) {
            e.printStackTrace();
        }
    }

JDBC: Example (3) PPD

    private static void displayRow(String title, ResultSet rs) throws SQLException {
        System.out.println(title);
        while (rs.next()) { // Iterate on Table("ProductID", "Name")
            System.out.println(rs.getString("ProductID") + " : " + rs.getString("Name"));
        }
    }

    private static void createTable(Statement stmt) throws SQLException {
        stmt.execute("if exists (select * from sys.objects where name = 'Product_JDBC_Sample') "
                + "drop table Product_JDBC_Sample");
        String sql = "CREATE TABLE [Product_JDBC_Sample]("      // Table Name
                + "[ProductID] [int] IDENTITY(1,1) NOT NULL,"   // Attribute 1
                + "[Name] [varchar](30) NOT NULL,"              // Attribute 2
                + "[ProductNumber] [nvarchar](25) NOT NULL)";   // Attribute 3: second value of each INSERT below
        stmt.execute(sql);

        sql = "INSERT Product_JDBC_Sample VALUES ('Adjustable Time','AR-5381')"; // Add Product 1
        stmt.execute(sql);
        sql = "INSERT Product_JDBC_Sample VALUES ('ML Bottom Bracket','BB-8107')"; // Add Product 2
        stmt.execute(sql);
        sql = "INSERT Product_JDBC_Sample VALUES ('Mountain-500 Black','BK-M18B-44')"; // Add Product 3
        stmt.execute(sql);
    }
}

Connectionist Bridge Configurations

• A bridge is a special kind of driver that uses another driver-based technology
• This driver translates source function-calls into target function-calls
• Programmers usually use such a bridge when they lack a source driver for some database but have access to a target driver
• Common bridges are:
  ◦ ODBC-to-JDBC (ODBC-JDBC) bridges: An ODBC-JDBC bridge consists of an ODBC driver which uses the services of a JDBC driver to connect to a database. Examples: OpenLink ODBC-JDBC Bridge, SequeLink ODBC-JDBC Bridge
  ◦ JDBC-to-ODBC (JDBC-ODBC) bridges: A JDBC-ODBC bridge consists of a JDBC driver which employs an ODBC driver to connect to a target database. Examples: OpenLink JDBC-ODBC Bridge, SequeLink JDBC-ODBC Bridge
  ◦ OLE DB-to-ODBC bridges: An OLE DB-ODBC bridge consists of an OLE DB Provider which uses the services of an ODBC driver to connect to a target database. This provider translates OLE DB method calls into ODBC function calls. Examples: OpenLink OLEDB-ODBC Bridge, SequeLink OLEDB-ODBC Bridge
  ◦ ADO.NET-to-ODBC bridges: An ADO.NET-ODBC bridge consists of an ADO.NET Provider which uses the services of an ODBC driver to connect to a target database. Examples: OpenLink ADO.NET-ODBC Bridge, SequeLink ADO.NET-ODBC Bridge

Native Language ⇐⇒ Query Language: Embedded SQL PPD



Embedded SQL

• The SQL standard defines embedding of SQL in a variety of programming languages such as C, C++, Java, FORTRAN, and PL/1
• A language in which SQL queries are embedded is referred to as a host language, and the SQL structures permitted in the host language comprise embedded SQL
• The basic form of these languages follows that of the System R embedding of SQL into PL/1
• An EXEC SQL statement (or a similar alternative such as #sql) is used to identify an embedded SQL request to the pre-processor
  EXEC SQL <embedded SQL statement>;
  Note: this varies by language:
  ◦ In some languages, like COBOL, the semicolon is replaced with END-EXEC
  ◦ In Java, embedding uses #sql { .... };


Embedded SQL (2)

• Before executing any SQL statements, the program must first connect to the database. This is done using:
  EXEC SQL connect to server user user-name using password;
  Here, server identifies the server to which a connection is to be established
• Variables of the host language can be used within embedded SQL statements. They are preceded by a colon (:) to distinguish them from SQL variables (for example, :credit_amount)
• Variables used as above must be declared within a DECLARE section, as illustrated below. The syntax for declaring the variables, however, follows the usual host language syntax
  EXEC SQL BEGIN DECLARE SECTION;
    int credit_amount;
  EXEC SQL END DECLARE SECTION;


Embedded SQL (3)

• To write an embedded SQL query, we use the
  declare c cursor for <SQL query>
  statement. The variable c is used to identify the query
• Example:
  ◦ From within a host language, find the ID and name of students who have completed more than the number of credits stored in variable credit_amount in the host language
  ◦ Specify the query in SQL as follows:
    EXEC SQL
      declare c cursor for
      select ID, name
      from student
      where tot_cred > :credit_amount
    END_EXEC




Embedded SQL (5)

• The open statement for our example is as follows:
  EXEC SQL open c;
  This statement causes the database system to execute the query and to save the results within a temporary relation. The query uses the value of the host-language variable credit_amount at the time the open statement is executed.
• The fetch statement causes the values of one tuple in the query result to be placed in host language variables.
  EXEC SQL fetch c into :si, :sn END_EXEC
  Repeated calls to fetch get successive tuples in the query result


Embedded SQL (6)

• A variable called SQLSTATE in the SQL communication area (SQLCA) gets set to '02000' to indicate no more data is available
• The close statement causes the database system to delete the temporary relation that holds the result of the query.
  EXEC SQL close c;
  Note: the above details vary with language. For example, the Java embedding defines Java iterators to step through result tuples.
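
The declare / open / fetch / close life cycle has a close counterpart in Python's DB-API, which may make the sequence easier to follow. The sketch below is an analogy rather than embedded SQL: it uses the built-in sqlite3 module, the three student rows are made up for illustration, and an ordinary Python variable plays the role of the host variable :credit_amount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (ID TEXT, name TEXT, tot_cred INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("S1", "Asha", 120), ("S2", "Ben", 40), ("S3", "Chen", 95)],  # made-up rows
)

credit_amount = 90  # plays the role of the host variable :credit_amount

# "declare c cursor for ..." plus "open c": the query runs with the value
# that credit_amount holds at this moment.
c = conn.cursor()
c.execute(
    "SELECT ID, name FROM student WHERE tot_cred > ? ORDER BY ID",
    (credit_amount,),
)

# "fetch c into :si, :sn", repeated until no tuple is left. fetchone()
# returning None is the DB-API counterpart of SQLSTATE '02000'.
result = []
while True:
    row = c.fetchone()
    if row is None:
        break
    si, sn = row
    result.append((si, sn))

c.close()  # "close c"
conn.close()
print(result)
```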


Embedded SQL (7): Updates

• Embedded SQL expressions for database modification (update, insert, and delete)
• Can update tuples fetched by cursor by declaring that the cursor is for update
  EXEC SQL
    declare c cursor for
    select *
    from instructor
    where dept_name = 'Music'
    for update
• We then iterate through the tuples by performing fetch operations on the cursor (as illustrated earlier), and after fetching each tuple we execute the following code:
  update instructor
  set salary = salary + 1000
  where current of c
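
Python's DB-API has no direct equivalent of "where current of c", so the sketch below swaps in a different technique: it fetches the keys of the matching tuples first and then updates each tuple by its primary key. It uses the built-in sqlite3 module, and the instructor rows are invented for illustration; treat it as an emulation of the effect, not as a positioned update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor "
    "(ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [  # made-up rows
        ("I1", "Mozart", "Music", 40000),
        ("I2", "Einstein", "Physics", 95000),
        ("I3", "Bach", "Music", 42000),
    ],
)

# Step 1: the "cursor" part. Collect the keys of the tuples to be updated.
reader = conn.cursor()
reader.execute("SELECT ID FROM instructor WHERE dept_name = ?", ("Music",))
music_ids = [row[0] for row in reader.fetchall()]
reader.close()

# Step 2: emulate "update ... where current of c" by addressing each
# fetched tuple through its primary key.
for instructor_id in music_ids:
    conn.execute(
        "UPDATE instructor SET salary = salary + 1000 WHERE ID = ?",
        (instructor_id,),
    )
conn.commit()

salaries = dict(conn.execute("SELECT ID, salary FROM instructor"))
conn.close()
print(salaries)
```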


Embedded SQL: C Example PPD

• Here is an example embedded SQL C program from DB2: Embedded SQL for C and C++ (by P. Godfrey, Nov 2002)
• It does not do much, but is instructive
• The application queries a table sailor in schema one
• User one has granted select privileges to all on table sailor, so the bind step will be legal
• The application takes one argument on the command line, a sailor’s SID. It then looks up that sailor’s age in the table ONE.SAILOR and reports it
• Try pre-compiling / compiling it. Connect to database c341f02 for this.


Embedded SQL: C Example (2) PPD

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sqlenv.h>
#include <sqlcodes.h>
#include <sys/time.h>

#define EXIT 0
#define NOEXIT 1

EXEC SQL INCLUDE SQLCA; // Include DB2's SQL error reporting facility.

EXEC SQL BEGIN DECLARE SECTION; // Declare the SQL interface variables.
    short sage, sid; char sname[16];
EXEC SQL END DECLARE SECTION;

// Declare variables to be used in the following C program
char msg[1025]; int rc, errcount;

// This macro prints the message in the SQLCA if the return code is 0 and the SQLCODE is not 0
#define PRINT_MESSAGE() { \
    if (rc == 0 && sqlca.sqlcode != 0) { \
        sqlaintp(msg, 1024, 0, &sqlca); \
        printf("%s\n",msg); \
    } \
}

Embedded SQL: C Example (3) PPD

// The macro prints out all fields in the SQLCA
#define DUMP_SQLCA() { \
    printf("********** DUMP OF SQLCA **********************\n"); \
    printf("SQLCAID: %s\n", sqlca.sqlcaid); printf("SQLCABC: %d\n", sqlca.sqlcabc); \
    printf("SQLCODE: %d\n", sqlca.sqlcode); printf("SQLERRML: %d\n", sqlca.sqlerrml); \
    printf("SQLERRD[0]: %d\n", sqlca.sqlerrd[0]); printf("SQLERRD[1]: %d\n", sqlca.sqlerrd[1]); \
    printf("SQLERRD[2]: %d\n", sqlca.sqlerrd[2]); printf("SQLERRD[3]: %d\n", sqlca.sqlerrd[3]); \
    printf("SQLERRD[4]: %d\n", sqlca.sqlerrd[4]); printf("SQLERRD[5]: %d\n", sqlca.sqlerrd[5]); \
    printf("SQLWARN: %s\n", sqlca.sqlwarn); printf("SQLSTATE: %s\n", sqlca.sqlstate); \
    printf("********** END OF SQLCA DUMP *******************\n"); \
}

// This macro prints the message in the SQLCA if one exists
// If the return code is not 0 or the SQLCODE is not expected, an error occurred and must be recorded.
// The if/else branches are braced so that the expanded DUMP_SQLCA(); does not detach the else.
#define CHECK_SQL(code,text_string,eExit) { \
    PRINT_MESSAGE(); \
    if (rc != 0 || sqlca.sqlcode != code) { \
        printf("%s\n",text_string); printf("Expected code = %d\n",code); \
        if (rc == 0) { DUMP_SQLCA(); } \
        else { printf("RC: %d\n",rc); } \
        errcount += 1; \
        if (eExit == EXIT) goto errorexit; \
    } \
}

Embedded SQL: C Example (4) PPD

int main(int argc, char *argv[]) { // The PROGRAM
    // Grab the first command argument. This is the SID
    if (argc > 1) {
        sid = atoi(argv[1]);
        printf("SID requested is %d.\n", sid);
    } else { // If there is no argument, bail
        printf("Which SID?\n");
        exit(0);
    }

    EXEC SQL CONNECT TO C3421M;
    CHECK_SQL(0, "Connect failed", EXIT);

    // Find the name and age of sailor SID
    EXEC SQL SELECT SNAME, AGE into :sname, :sage
             FROM ONE.SAILOR
             WHERE sid = :sid;
    CHECK_SQL(0, "The SELECT query failed.", EXIT);

    // Report the age
    printf("Sailor %s's age is %d.\nExecuted Successfully\nBye\n", sname, sage);

errorexit:
    EXEC SQL CONNECT RESET;
    return 0;
}

Embedded SQL: C Example (5) PPD

• The instance of the table sailor:

  SNAME           SID         RATING AGE
  --------------- ----------- ------ ------
  yuppy           22          1      20
  lubber          31          1      25
  guppy           44          2      31
  rusty           58          3      47

• If the name of the executable is sage, and if you ask:

  % sage 44

• The output should be:

  SID requested is 44.
  Sailor guppy's age is 31.
  Executed Successfully
  Bye

Embedded SQL: C Example (6) PPD

• The program prompts the user for an order number, retrieves the customer number, salesperson, and status of the order, and displays the retrieved information on the screen
• The statement used to return the data is a singleton SELECT statement; that is, it returns only a single row of data. So, the code example does not declare or use cursors

Source: https://docs.microsoft.com/en-us/sql/odbc/reference/embedded-sql-example

Embedded SQL: Java Example PPD

• The following example SQLJ application, App.sqlj, uses static SQL to retrieve and update data from the EMPLOYEE table of the sample database
• Complete example can be found at: Example: Embedding SQL Statements in your Java™ application


Embedded SQL: Java Example (2) PPD

import java.sql.*;
import sqlj.runtime.*;
import sqlj.runtime.ref.*;

#sql iterator App_Cursor1 (String empno, String firstnme) ; // 1
#sql iterator App_Cursor2 (String) ;

class App {
    // Register Driver
    static {
        try { Class.forName("com.ibm.db2.jdbc.app.DB2Driver").newInstance(); }
        catch (Exception e) { e.printStackTrace(); }
    }

    public static void main(String argv[]) {
        try {
            App_Cursor1 cursor1; App_Cursor2 cursor2;
            String str1 = null, str2 = null; long count1;
            String url = "jdbc:db2:sample"; // URL is jdbc:db2:dbname

            DefaultContext ctx = DefaultContext.getDefaultContext();
            if (ctx == null) {
                try { // connect with default id / password
                    Connection con = DriverManager.getConnection(url);
                    con.setAutoCommit(false); ctx = new DefaultContext(con);
                }
                catch (SQLException e) {
                    System.out.println("Error: could not get a default context");
                    System.err.println(e); System.exit(1);
                }
                DefaultContext.setDefaultContext(ctx);
            }

Embedded SQL: Java Example (3): Notes PPD

1 Declare iterators. This section declares two types of iterators:
  ◦ App_Cursor1: Declares column data types and names, and returns the values of the columns according to column name (Named binding to columns)
  ◦ App_Cursor2: Declares column data types, and returns the values of the columns by column position (Positional binding to columns)


Embedded SQL: Java Example (4) PPD

            // retrieve data from the database
            System.out.println("Retrieve some data from the database.");
            #sql cursor1 = {SELECT empno, firstnme FROM employee}; // 2

            // display the result set. cursor1.next() returns false when there are no more rows
            System.out.println("Received results:");
            while (cursor1.next()) { // 3
                str1 = cursor1.empno(); str2 = cursor1.firstnme(); // 4
                System.out.println(" empno= " + str1 + " firstname= " + str2 + "");
            }
            cursor1.close(); // 9

            // retrieve number of employee from the database
            #sql { SELECT count(*) into :count1 FROM employee }; // 5
            if (1 == count1)
                System.out.println("There is 1 row in employee table");
            else
                System.out.println("There are " + count1 + " rows in employee table");

            // update the database
            System.out.println("Update the database.");
            #sql { UPDATE employee SET firstnme = 'SHILI' WHERE empno = '000010' };


Embedded SQL: Java Example (5): Notes PPD

2 Initialize the iterator. The iterator object cursor1 is initialized using the result of a query. The query stores the result in cursor1.
3 Advance the iterator to the next row. The cursor1.next() method returns a Boolean false if there are no more rows to retrieve.
4 Move the data. The named accessor method empno() returns the value of the column named empno on the current row. The named accessor method firstnme() returns the value of the column named firstnme on the current row.
5 SELECT data into a host variable. The SELECT statement passes the number of rows in the table into the host variable count1.
9 Close the iterators. The close() method releases any resources held by the iterators. You should explicitly close iterators to ensure that system resources are released in a timely fashion.


Embedded SQL: Java Example (6) PPD

            // retrieve the updated data from the database
            System.out.println("Retrieve the updated data from the database.");
            str1 = "000010";
            #sql cursor2 = {SELECT firstnme FROM employee WHERE empno = :str1}; // 6

            // display the result set. cursor2.endFetch() returns true when there are no more rows
            System.out.println("Received results:");
            while (true) {
                #sql { FETCH :cursor2 INTO :str2 }; // 7
                if (cursor2.endFetch()) break; // 8
                System.out.println(" empno= " + str1 + " firstname= " + str2 + "");
            }
            cursor2.close(); // 9

            // rollback the update
            System.out.println("Rollback the update.");
            #sql { ROLLBACK work };
            System.out.println("Rollback done.");
        } // try
        catch( Exception e ) {
            e.printStackTrace();
        }
    } // main
} // class App

Embedded SQL: Java Example (7): Notes PPD

6 Initialize the iterator. The iterator object cursor2 is initialized using the result of a query. The query stores the result in cursor2.
7 Retrieve the data. The FETCH statement returns the current value of the first column declared in the ByPos cursor from the result table into the host variable str2.
8 Check the success of a FETCH...INTO statement. The endFetch() method returns a Boolean true if the iterator is not positioned on a row, that is, if the last attempt to fetch a row failed. The endFetch() method returns false if the last attempt to fetch a row was successful. DB2 attempts to fetch a row when the next() method is called. A FETCH...INTO statement implicitly calls the next() method.
9 Close the iterators. The close() method releases any resources held by the iterators. You should explicitly close iterators to ensure that system resources are released in a timely fashion.


Module Summary

• Introduced the use of SQL from a programming language

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.


Database Management Systems
Module 34: Application Design and Development/4: Python and PostgreSQL

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur


Module Recap PPD

• Introduced the use of SQL from a programming language


Module Objectives

• To understand how to access a PostgreSQL database from Python
• To understand how to build a Python web application with PostgreSQL


Module Outline

• Accessing PostgreSQL from Python
• Developing Web Application with Python


Working with PostgreSQL and Python


Python Modules for PostgreSQL

The following Python modules can be used to work with a PostgreSQL database server:
• psycopg2
• pg8000
• py-postgresql
• PyGreSQL
• ocpgdb
• bpgsql
• SQLAlchemy (needs any of the above to be installed separately)

Source: https://pynative.com/python-postgresql-tutorial/


Package psycopg2

Advantages of psycopg2
• Most popular and stable module to work with PostgreSQL
• Used in most of the Python and Postgres frameworks
• An actively maintained package that supports Python 2.x and 3.x
• Thread-safe and designed for heavily multi-threaded applications

Installing psycopg2 using the pip command
• The following pip command installs psycopg2 on different operating systems including Windows, MacOS, Linux, and Unix
  pip install psycopg2
• To install a specific version, the following command can be used
  pip install psycopg2==2.8.6

Source: https://pynative.com/python-postgresql-tutorial/


Steps to access PostgreSQL from Python

Steps to access PostgreSQL from Python using psycopg2
a) Create connection
b) Create cursor
c) Execute the query
d) Commit/rollback
e) Close the cursor
f) Close the connection

Source: https://pynative.com/python-postgresql-tutorial/


Python psycopg2 Module APIs: connection objects

• psycopg2.connect(database="mydb", user="myuser", password="mypass", host="127.0.0.1", port="5432")
  This API opens a connection to the PostgreSQL database. If the database is opened successfully, it returns a connection object.
• connection.close()
  This method closes the database connection.

Important psycopg2 module routines for managing the cursor object:
• connection.cursor()
  This routine creates a cursor which will be used throughout the program.
• cursor.close()
  This method closes the cursor.

Source: https://www.tutorialspoint.com/postgresql/postgresql_python.htm

Python psycopg2 Module APIs: insert, delete, update & stored procedures

• cursor.execute(sql [, optional parameters])
  This routine executes an SQL statement. The SQL statement may be parameterized (i.e., placeholders instead of SQL literals). The psycopg2 module supports placeholders using the %s sign. For example: cursor.execute("insert into people values (%s, %s)", (who, age))
• cursor.executemany(sql, seq_of_parameters)
  This routine executes an SQL command against all parameter sequences or mappings found in the sequence seq_of_parameters.
• cursor.callproc(procname[, parameters])
  This routine executes a stored database procedure with the given name. The sequence of parameters must contain one entry for each argument that the procedure expects.
• cursor.rowcount
  This is a read-only attribute which returns the total number of database rows that have been modified, inserted, or deleted by the last execute().

Source: https://www.tutorialspoint.com/postgresql/postgresql_python.htm

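
A short sketch of execute(), executemany(), and rowcount may help here. So that it runs without a PostgreSQL server, it is exercised against the built-in sqlite3 module, which expects "?" rather than "%s" as its placeholder; the helper marker() reads the DB-API paramstyle attribute to pick the right one. The helper names and the people rows are hypothetical, and the value of rowcount after executemany() is what sqlite3 reports; other drivers may report it differently.

```python
import sqlite3


def marker(module):
    """Return the positional placeholder used by a DB-API module.

    paramstyle is a module-level attribute defined by the DB-API (PEP 249):
    sqlite3 reports "qmark", psycopg2 reports "pyformat".
    """
    return {"qmark": "?", "format": "%s", "pyformat": "%s"}[module.paramstyle]


def load_people(conn, module, people):
    """Insert the first person with execute() and the rest with executemany()."""
    ph = marker(module)
    cur = conn.cursor()
    cur.execute("CREATE TABLE people (name VARCHAR(30), age INTEGER)")
    # Only the placeholder marker is formatted into the SQL text; the values
    # themselves are always passed separately and bound by the driver.
    insert_sql = f"INSERT INTO people VALUES ({ph}, {ph})"
    cur.execute(insert_sql, people[0])
    after_execute = cur.rowcount
    cur.executemany(insert_sql, people[1:])
    after_executemany = cur.rowcount
    conn.commit()
    cur.close()
    return after_execute, after_executemany


one, many = load_people(
    sqlite3.connect(":memory:"),
    sqlite3,
    [("Asha", 31), ("Ben", 45), ("Chen", 28)],  # made-up rows
)
print(one, many)
```

With psycopg2 installed, the same function could be called as load_people(psycopg2.connect(...), psycopg2, rows); that call is not exercised here.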
Python psycopg2 Module APIs: select

• cursor.fetchone()
  This method fetches the next row of a query result set, returning a single sequence, or None when no more data is available.
• cursor.fetchmany([size=cursor.arraysize])
  This routine fetches the next set of rows of a query result, returning a list. An empty list is returned when no more rows are available. The method tries to fetch as many rows as indicated by the size parameter.
• cursor.fetchall()
  This routine fetches all (remaining) rows of a query result, returning a list. An empty list is returned when no rows are available.

Source: https://www.tutorialspoint.com/postgresql/postgresql_python.htm

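
The three fetch methods share one position in the result set, which is easiest to see when they are called one after another on the same cursor. The sketch below uses the built-in sqlite3 module as a stand-in for psycopg2 (both follow the DB-API for these methods); the table t and its six rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (n INTEGER)")
cur.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1, 7)])

cur.execute("SELECT n FROM t ORDER BY n")
first = cur.fetchone()    # next single row, as a tuple
batch = cur.fetchmany(2)  # next two rows, as a list of tuples
rest = cur.fetchall()     # every remaining row
done = cur.fetchone()     # result set exhausted: None
empty = cur.fetchall()    # result set exhausted: empty list

cur.close()
conn.close()
print(first, batch, rest, done, empty)
```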
Python psycopg2 Module APIs: commit & rollback

• connection.commit()
  This method commits the current transaction. If you do not call this method, anything you did since the last call to commit() is not visible to other database connections.
• connection.rollback()
  This method rolls back any changes to the database since the last call to commit().

Source: https://www.tutorialspoint.com/postgresql/postgresql_python.htm

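
The effect of commit() and rollback() can be observed with a small experiment. The sketch below again uses the built-in sqlite3 module as a stand-in for psycopg2 and relies on sqlite3's default transaction handling, in which a transaction is opened implicitly before an UPDATE; the account table and the helper balance() are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE account (id INTEGER, balance INTEGER)")
cur.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()  # the row with balance 100 is now permanent


def balance():
    """Read the current balance of account 1 on this connection."""
    cur.execute("SELECT balance FROM account WHERE id = 1")
    return cur.fetchall()[0][0]


# An uncommitted change is visible inside the transaction that made it ...
cur.execute("UPDATE account SET balance = balance - 500 WHERE id = 1")
inside_transaction = balance()

# ... and disappears when the transaction is rolled back.
conn.rollback()
after_rollback = balance()

conn.close()
print(inside_transaction, after_rollback)
```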
Connect to a PostgreSQL Database Server

import psycopg2

def connectDb(dbname, usrname, pwd, address, portnum):
    conn = None
    try:
        # connect to the PostgreSQL database
        conn = psycopg2.connect(database = dbname, user = usrname, \
                                password = pwd, host = address, port = portnum)
        print ("Database connected successfully")
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:  # conn is still None if connect() failed
            conn.close() # close the connection

connectDb("mydb", "myuser", "mypass", "127.0.0.1", "5432") # function call

Output (when the connection succeeds):
Database connected successfully

psycopg2.DatabaseError: Exception raised for errors that are related to the PostgreSQL database.

We assume the following for all the programs in this module:
• Database Name: mydb
• Username: myuser
• Password: mypass
• Host Name: localhost or IP address 127.0.0.1

Steps to execute SQL commands

1. Use the psycopg2.connect() method with the required arguments to connect to PostgreSQL. It returns a connection object if the connection is established successfully.
2. Create a cursor object using the cursor() method of the connection object.
3. Use the cursor's execute() method to run the SQL commands; the results are then read through the cursor.
4. Use cursor.fetchall(), fetchone(), or fetchmany() to read the query result.
5. Use commit() to make the changes in the database persistent, or use rollback() to revert the database changes.
6. Use the cursor.close() and connection.close() methods to close the cursor and the PostgreSQL connection.

Source: https://pynative.com/python-postgresql-tutorial/


CREATE new PostgreSQL tables

import psycopg2

def createTable():
    conn = None
    try:
        # connect to the database
        conn = psycopg2.connect(database="mydb", user="myuser",
                                password="mypass", host="127.0.0.1", port="5432")
        cur = conn.cursor()  # create a new cursor
        # execute the CREATE TABLE statement
        cur.execute('''CREATE TABLE EMPLOYEE
            (emp_num INT PRIMARY KEY NOT NULL,
             emp_name VARCHAR(40) NOT NULL,
             department VARCHAR(40) NOT NULL)''')
        conn.commit()  # commit the changes to the database
        print("Table created successfully")
        cur.close()  # close the cursor
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:
            conn.close()  # close the connection

createTable()  # function call

Output (if table EMPLOYEE does not exist):
Table created successfully

Output (if table EMPLOYEE already exists):
The error reported by PostgreSQL (the relation already exists) is printed by the except block.


Executing INSERT statement from Python
import psycopg2

def insertRecord(num, name, dept):
    conn = None
    try:
        # connect to the PostgreSQL database
        conn = psycopg2.connect(database="mydb", user="myuser",
                                password="mypass", host="127.0.0.1", port="5432")
        cur = conn.cursor()  # create a new cursor
        # execute the INSERT statement
        cur.execute("INSERT INTO EMPLOYEE (emp_num, emp_name, department) "
                    "VALUES (%s, %s, %s)", (num, name, dept))
        conn.commit()  # commit the changes to the database
        print("Total number of rows inserted :", cur.rowcount)
        cur.close()  # close the cursor
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:
            conn.close()  # close the connection

insertRecord(110, 'Bhaskar', 'HR')  # function call

Output:
Total number of rows inserted : 1

Output (if a row already exists with emp_num = 110):
The primary-key violation reported by PostgreSQL is printed by the except block.

Executing DELETE statement from Python

import psycopg2

def deleteRecord(num):
    conn = None
    try:
        # connect to the PostgreSQL database
        conn = psycopg2.connect(database="mydb", user="myuser",
                                password="mypass", host="127.0.0.1", port="5432")
        cur = conn.cursor()  # create a new cursor
        # execute the DELETE statement
        cur.execute("DELETE FROM EMPLOYEE WHERE emp_num = %s", (num,))
        conn.commit()  # commit the changes to the database
        print("Total number of rows deleted :", cur.rowcount)
        cur.close()  # close the cursor
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:  # conn stays None if the connection attempt failed
            conn.close()  # close the connection

deleteRecord(110)  # function call

Output:
Total number of rows deleted : 1

Output (if the row does not exist):
Total number of rows deleted : 0


Executing UPDATE statement from Python

import psycopg2

def updateRecord(num, dept):
    conn = None
    try:
        # connect to the PostgreSQL database
        conn = psycopg2.connect(database="mydb", user="myuser",
                                password="mypass", host="127.0.0.1", port="5432")
        cur = conn.cursor()  # create a new cursor
        # execute the UPDATE statement
        cur.execute("UPDATE EMPLOYEE SET department = %s WHERE emp_num = %s",
                    (dept, num))
        conn.commit()  # commit the changes to the database
        print("Total number of rows updated :", cur.rowcount)
        cur.close()  # close the cursor
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:  # conn stays None if the connection attempt failed
            conn.close()  # close the connection

updateRecord(110, "Finance")  # function call

Output:
Total number of rows updated : 1

Output (if the row does not exist):
Total number of rows updated : 0


Executing SELECT statement from Python

import psycopg2

def selectAll():
    conn = None
    try:
        # connect to the PostgreSQL database
        conn = psycopg2.connect(database="mydb", user="myuser",
                                password="mypass", host="127.0.0.1", port="5432")
        cur = conn.cursor()  # create a new cursor
        # execute the SELECT statement
        cur.execute("SELECT emp_num, emp_name, department FROM EMPLOYEE")
        rows = cur.fetchall()  # fetches all rows of the query result set
        for row in rows:
            print("Employee ID = ", row[0], ", NAME = ", row[1],
                  ", DEPARTMENT = ", row[2])
        cur.close()  # close the cursor
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:  # conn stays None if the connection attempt failed
            conn.close()  # close the connection

selectAll()  # function call

Output:
One line per row of EMPLOYEE, showing its Employee ID, NAME, and DEPARTMENT.

Python Frameworks for PostgreSQL



Web and Internet Development using Python

Python offers several frameworks such as bottle.py, Flask, CherryPy, Pyramid, Django and web2py for web development.

• Python offers many choices for web development
  ◦ Frameworks such as Django and Pyramid.
  ◦ Micro-frameworks such as Flask and Bottle.
  ◦ Advanced content management systems such as Plone and django CMS.
• Python's standard library supports many internet protocols
  ◦ HTML and XML
  ◦ JSON
  ◦ E-mail processing
  ◦ Support for FTP, IMAP, and other Internet protocols
  ◦ Easy-to-use socket interface
• The Package Index has more libraries
  ◦ Requests, a powerful HTTP client library.
  ◦ Beautiful Soup, an HTML parser that can handle all sorts of HTML.
  ◦ Feedparser for parsing RSS/Atom feeds.
  ◦ Paramiko, implementing the SSH2 protocol.
  ◦ Twisted Python, a framework for asynchronous network programming.

Source: https://www.python.org/about/apps/

Flask Web Application Framework

• Flask is a lightweight WSGI (Web Server Gateway Interface) web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications.
• It began as a simple wrapper around Werkzeug (Werkzeug WSGI toolkit) and Jinja (Jinja template engine) and has since then become one of the most popular Python web application frameworks.
• Flask offers suggestions, but does not enforce any dependencies or project layouts. It is up to the developer to choose the tools and libraries they want to use.
• There are many extensions provided by the community that make adding new functionality easy.

Installing Flask using pip command
• pip install -U Flask

Source: https://pypi.org/project/Flask/

A Simple Example

from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    return "Hello World"

if __name__ == '__main__':
    app.run()

• Importing the flask module in the project is mandatory. Our WSGI application is an object of the Flask class.
• The Flask constructor takes the name of the current module (__name__) as argument.
• The route() function of the Flask class is a decorator, which tells the application which URL should call the associated function.
  app.route(rule, options)
  ◦ The rule parameter represents the URL binding with the function.
  ◦ The options parameter is a list of parameters to be forwarded to the underlying Rule object.
• In the above example, the '/' URL is bound to the hello_world() function. Hence, when the home page of the web server is opened in a browser, the output of this function will be rendered.
• Finally, the run() method of the Flask class runs the application on the local development server.

Source: https://www.tutorialspoint.com/flask/flask_application.htm
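To make the idea of a route decorator concrete, here is a toy sketch that binds URL rules to functions in a dictionary. It is not Flask and not how Flask is implemented internally; MiniApp, url_map, and dispatch are hypothetical names chosen for illustration, and the only point is to show what "binding a URL to a function" can mean.

```python
# Not Flask itself: a tiny stand-in that mimics how a route() decorator can
# register view functions against URL rules.
class MiniApp:
    def __init__(self):
        self.url_map = {}                     # URL rule -> view function

    def route(self, rule):
        def decorator(view_func):
            self.url_map[rule] = view_func    # bind the URL to the function
            return view_func                  # leave the function unchanged
        return decorator

    def dispatch(self, path):
        view_func = self.url_map.get(path)
        if view_func is None:
            return "404 Not Found"
        return view_func()                    # call the function bound to the URL


app = MiniApp()

@app.route("/")
def hello_world():
    return "Hello World"

print(app.dispatch("/"))
print(app.dispatch("/missing"))
```

A real framework does much more (HTTP methods, URL parameters, response objects), but the registration step follows the same decorator pattern.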


A Simple Example (2)

app.run(host, port, debug, options)

• host: Hostname to listen on. Defaults to 127.0.0.1 (localhost). Set to '0.0.0.0' to make the server available externally
• port: Defaults to 5000
• debug: Defaults to False. If set to True, provides debug information
• options: To be forwarded to the underlying Werkzeug server.

Executing the code:
• python Hello.py

Output:
A message in the Python shell:
• * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
Open the above URL (127.0.0.1:5000) in the browser


Python: Flask

• Consider the table Candidate (in PostgreSQL). The table figure is not reproduced here; the code in this module uses its columns cno, name, and email.
• Code segment in Python:

from flask import Flask, render_template, request
import psycopg2

app = Flask(__name__, template_folder='templates')

# functions to be added here for different actions

if __name__ == '__main__':
    # Run the Flask app
    app.run(host='127.0.0.1', debug=True, port=5000)

Python: Flask (2)

• Source code for index.html:

<!DOCTYPE html>
<html>
<head>
    <title>Candidate Email Database</title>
</head>
<body>
    <h2>Candidate Email Database</h2>
    <a href="/add">Add Email</a><br><br>
    <a href="/viewall">View Email</a>
</body>
</html>

• Source code for rendering index.html and add.html pages:

@app.route("/")
def index():
    return render_template("index.html")

@app.route("/add")
def add():
    return render_template("add.html")


Python: Flask (3)

http://127.0.0.1:5000/ (screenshot of the home page rendered from index.html; not reproduced)


Python: Flask (4)

• Source code for add.html (in HTML):

<!DOCTYPE html>
<html>
<head>
    <title>Add Email</title>
</head>
<body>
    <h2>Email Information</h2>
    <form action="/savedetails" method="post">
        <table>
            <tr><td>CNO</td><td><input type="text" name="cno" required></td></tr>
            <tr><td>Name</td><td><input type="text" name="name" required></td></tr>
            <tr><td>Email</td><td><input type="text" name="email" required></td></tr>
            <tr><td><input type="submit" value="Submit"></td></tr>
        </table>
    </form>
</body>
</html>


Python: Flask (5)

• saveDetails() function for add.html (in Python):

@app.route("/savedetails", methods=["POST"])
def saveDetails():
    cno = request.form["cno"]
    name = request.form["name"]
    email = request.form["email"]
    conn = None
    try:
        # connect to the PostgreSQL database
        conn = psycopg2.connect(database="mydb", user="myuser",
                                password="mypass", host="127.0.0.1", port="5432")
        cur = conn.cursor()  # create a new cursor
        # execute the INSERT statement
        cur.execute("INSERT INTO Candidate (cno, name, email) "
                    "VALUES (%s, %s, %s)", (cno, name, email))
        conn.commit()  # commit the changes to the database
        cur.close()  # close the cursor
    except (Exception, psycopg2.DatabaseError) as error:
        # return the failure page; without "return" the success page would be shown
        return render_template("fail.html")
    finally:
        if conn is not None:
            conn.close()  # close the connection
    return render_template("success.html")


Python: Flask (6)

http://127.0.0.1:5000/add and http://127.0.0.1:5000/savedetails (screenshots of the add form and of the page shown after submission; not reproduced)


Python: Flask (7)

• viewAll() function for viewall.html (in Python):

@app.route("/viewall")
def viewAll():
    conn = None
    try:
        # connect to the PostgreSQL database
        conn = psycopg2.connect(database="mydb", user="myuser",
                                password="mypass", host="127.0.0.1", port="5432")
        cur = conn.cursor()  # create a new cursor
        # execute the SELECT statement
        cur.execute("SELECT cno, name, email FROM Candidate")
        results = cur.fetchall()  # fetches all rows of the query result set
        cur.close()  # close the cursor
    except (Exception, psycopg2.DatabaseError) as error:
        # return the failure page; results is undefined if the query failed
        return render_template("fail.html")
    finally:
        if conn is not None:  # conn stays None if the connection attempt failed
            conn.close()  # close the connection
    return render_template("viewall.html", rows=results)


Python: Flask (8)

• Source code for viewall.html (in HTML):

<!DOCTYPE html>
<html>
<head>
    <title>Email List</title>
</head>
<body>
    <h3>Email List</h3>
    <table border=5>
        <tr>
            <th>CNO</th><th>Name</th><th>Email</th>
        </tr>
        {% for row in rows %}
        <tr>
            <td>{{row[0]}}</td> <td>{{row[1]}}</td> <td>{{row[2]}}</td>
        </tr>
        {% endfor %}
    </table>
    <br><br>
    <a href="/">Go Home</a>
</body>
</html>

Python: Flask (9)

http://127.0.0.1:5000/viewall (screenshot of the rendered email list; not reproduced)


Module Summary

• Learnt how to access PostgreSQL database from Python
• Learnt to build Python Web Application with PostgreSQL and Flask

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with "PPD".


Database Management Systems
Module 35: Application Design and Development/5: Application Development and Mobile

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur

[email protected]


Module Recap PPD

• Learnt how to access PostgreSQL database from Python
• Learnt to build Python Web Application with PostgreSQL and Flask


Module Objectives PPD

• To explore the Rapid Application Development Process
• To understand the issues in Application Performance and Security
• To understand the similarities and differences between Mobile Apps and Web Applications


Module Outline PPD

• Rapid Application Development
• Application Performance and Security
• Mobile Apps


Rapid Application Development PPD



Rapid Application Development

• A lot of effort is required to develop Web application interfaces, especially the rich interaction functionality associated with Web 2.0 applications
• Several approaches to speed up application development
  ◦ Function library to generate user-interface elements
  ◦ Drag-and-drop features in an IDE to create user-interface elements
  ◦ Automatically generate code for user interface from a declarative specification
• These approaches were used as part of Rapid Application Development (RAD) tools even before the Web
• RAD is an agile software model that focuses on fast prototyping and quick feedback in app development to ensure speedier delivery and an efficient result
  ◦ The RAD model is described in terms of business modeling, data modeling, process modeling, and testing & turnover
  ◦ App development has 4 phases: defining the requirements, prototyping, receiving feedback, and finalizing the software
  ◦ With RAD, the time between prototypes and iterations is short, and integration occurs from inception.


Rapid Application Development (2)

• Web application development frameworks
  ◦ Java Server Faces (JSF)
    . A set of APIs for representing UI components and managing their state, handling events and input validation, defining page navigation, and supporting internationalization and accessibility
    . JSP custom tag library for expressing a JSF interface within a JSP page
  ◦ Ruby on Rails
    . Allows easy creation of simple CRUD (create, read, update and delete) interfaces by code generation from database schema or object model
• RAD Platforms and Tools
  ◦ G Suite
  ◦ Google App Engine
  ◦ Microsoft Azure
  ◦ Amazon Elastic Compute Cloud (EC2)
  ◦ AWS Elastic Beanstalk
  ◦ ...

ASP.NET and Visual Studio

• ASP.NET provides a variety of controls that are interpreted at the server, and generate HTML code
• Visual Studio provides drag-and-drop development using these controls
  ◦ For example, menus and list boxes can be associated with a DataSet object
  ◦ Validator controls (constraints) can be added to form input fields
    . JavaScript to enforce constraints at client, and separately enforced at server
  ◦ User actions such as selecting a value from a menu can be associated with actions at server
  ◦ DataGrid provides a convenient way of displaying SQL query results in tabular format


Application Performance and Security PPD



Application Performance

• Performance is an issue for popular Web sites
  ◦ May be accessed by millions of users every day, thousands of requests per second at peak time
• Caching techniques used to reduce cost of serving pages by exploiting commonalities between requests
  ◦ At the server site:
    . Caching of JDBC connections between servlet requests
      – a.k.a. connection pooling
    . Caching results of database queries
      – Cached results must be updated if underlying database changes
    . Caching of generated HTML
  ◦ At the client's network
    . Caching of pages by Web proxy
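The point that cached query results must be updated when the underlying database changes can be sketched as follows. This is an illustrative assumption, not a technique prescribed by the slides: CachedQueries is a hypothetical helper, the built-in sqlite3 module stands in for the application's database, and the whole cache is simply cleared on every write, which is the crudest correct invalidation policy.

```python
import sqlite3

class CachedQueries:
    """Caches SELECT results; any write through this object clears the cache."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}          # (sql, params) -> list of rows
        self.db_hits = 0         # how many SELECTs actually reached the database

    def query(self, sql, params=()):
        key = (sql, params)
        if key not in self.cache:
            self.db_hits += 1
            self.cache[key] = self.conn.execute(sql, params).fetchall()
        return self.cache[key]

    def update(self, sql, params=()):
        self.conn.execute(sql, params)
        self.conn.commit()
        self.cache.clear()       # underlying data changed: drop stale results


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name TEXT, salary INTEGER)")
db = CachedQueries(conn)
db.update("INSERT INTO instructor VALUES (?, ?)", ("Katz", 75000))

q = "SELECT salary FROM instructor WHERE name = ?"
before = db.query(q, ("Katz",))          # reaches the database
again = db.query(q, ("Katz",))           # served from the cache
db.update("UPDATE instructor SET salary = ? WHERE name = ?", (80000, "Katz"))
after = db.query(q, ("Katz",))           # cache was cleared, so this re-runs
print(before, again, after, db.db_hits)
```

Clearing everything is wasteful on a busy site; a finer-grained policy would invalidate only the entries that depend on the modified table, at the cost of tracking those dependencies.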


Application Security: SQL Injection

• Suppose query is constructed using
  ◦ "select * from instructor where name = '" + name + "'"
• Suppose the user, instead of entering a name, enters:
  ◦ X' or 'Y' = 'Y
• then the resulting statement becomes:
  ◦ "select * from instructor where name = '" + "X' or 'Y' = 'Y" + "'"
  ◦ which is:
    . select * from instructor where name = 'X' or 'Y' = 'Y'
  ◦ User could have even used
    . X'; update instructor set salary = salary + 10000; --
• Prepared statement internally uses:
  "select * from instructor where name = 'X\' or \'Y\' = \'Y'"
• Always use prepared statements, with user inputs as parameters
• Is the following prepared statement secure?
  ◦ conn.prepareStatement("select * from instructor where name = '" + name + "'")

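The difference between concatenating user input and passing it as a parameter can be tried out directly. The sketch below assumes the built-in sqlite3 module as a stand-in database and two hypothetical instructor rows; the malicious input is the one from the slide. With psycopg2 the placeholder would be %s instead of ?.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO instructor VALUES (?, ?)",
                 [("Katz", 75000), ("Kim", 80000)])   # hypothetical rows

name = "X' or 'Y' = 'Y"   # the malicious input from the slide

# Unsafe: the input is spliced into the SQL text and changes its meaning.
unsafe_sql = "select * from instructor where name = '" + name + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input travels as a parameter and is only ever compared as a value.
safe_rows = conn.execute(
    "select * from instructor where name = ?", (name,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
conn.close()
```

The concatenated query returns every row because the injected condition is always true, while the parameterized query looks for an instructor literally named X' or 'Y' = 'Y and finds none. Building the query text by concatenation and only then handing it to a prepare call gives no protection, since the damage is done before the statement is prepared.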
Application Security (2): Password Leakage

• Never store passwords, such as database passwords, in clear text in scripts that may be accessible to users
  ◦ For example, in files in a directory accessible to a web server
    . Normally, web server will execute, but not provide source of script files such as file.jsp or file.php, but source of editor backup files such as file.jsp~, or .file.jsp.swp may be served
• Restrict access to database server from IPs of machines running application servers
  ◦ Most databases allow restriction of access by source IP address


Application Security (3): Authentication

• Single factor authentication such as passwords too risky for critical applications
  ◦ guessing of passwords, sniffing of packets if passwords are not encrypted
  ◦ passwords reused by user across sites
  ◦ spyware which captures password
• Two-factor authentication
  ◦ For example, password plus one-time password sent by SMS
  ◦ For example, password plus one-time password devices
    . device generates a new pseudo-random number every minute, and displays to user
    . user enters the current number as password
    . application server generates same sequence of pseudo-random numbers to check that the number is correct.
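One way a device and a server can generate the same sequence of numbers is to derive each number from a shared secret and the current minute. The slides do not name an algorithm, so the sketch below is an assumption: it follows the general HMAC-and-truncate idea used by time-based one-time password schemes, with a hypothetical secret and a fixed example timestamp. It is not a vetted implementation for production use.

```python
import hashlib
import hmac
import struct

def one_time_code(secret: bytes, now: float, step: int = 60, digits: int = 6) -> str:
    """Derive a short numeric code from a shared secret and the current minute."""
    counter = int(now // step)                         # same value for a whole minute
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                            # pick 4 bytes based on the MAC
    number = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)

def verify(secret: bytes, submitted: str, now: float) -> bool:
    """Server side: recompute the code for the current minute and compare."""
    return hmac.compare_digest(one_time_code(secret, now), submitted)

shared_secret = b"hypothetical-shared-secret"   # provisioned on device and server
t = 1_700_000_000.0                             # a fixed example timestamp
code_on_device = one_time_code(shared_secret, t)
print(verify(shared_secret, code_on_device, t + 5))    # checked within the same minute
print(verify(shared_secret, code_on_device, t + 120))  # checked two minutes later
```

The first check succeeds because both sides compute the code for the same minute. The second is expected to fail, since the counter has moved on and the server now derives a different code. Real deployments usually also accept the neighbouring time window to tolerate clock drift.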


Application Security (4): Application-Level Authorization

• Current SQL standard does not allow fine-grained authorization such as "students can see their own grades, but not other's grades"
  ◦ Problem 1: Database has no idea who are application users
  ◦ Problem 2: SQL authorization is at the level of tables, or columns of tables, but not to specific rows of a table
• One workaround: use views such as
    create view studentTakes as
    select *
    from takes
    where takes.ID = syscontext.user_id()
  ◦ where syscontext.user_id() provides end user identity
    . end user identity must be provided to the database by the application
  ◦ Having multiple such views is cumbersome
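When the database cannot identify the end user, the application has to apply the row filter itself. The sketch below shows that idea under stated assumptions: sqlite3 stands in for the database, the takes rows are hypothetical, and student_takes is a made-up helper that plays the role of the studentTakes view by always restricting rows to the authenticated user's ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE takes (ID TEXT, course_id TEXT, grade TEXT)")
conn.executemany("INSERT INTO takes VALUES (?, ?, ?)", [
    ("S1", "CS-101", "A"),     # hypothetical rows
    ("S1", "CS-190", "B"),
    ("S2", "CS-101", "C"),
])

def student_takes(conn, end_user_id):
    """Return only the rows of takes that belong to the authenticated end user.

    end_user_id must come from the application's own login/session handling,
    never from a value the user can edit freely.
    """
    return conn.execute(
        "SELECT course_id, grade FROM takes WHERE ID = ? ORDER BY course_id",
        (end_user_id,),
    ).fetchall()

print(student_takes(conn, "S1"))
print(student_takes(conn, "S2"))
```

Every code path that reads takes has to go through such a helper for the restriction to hold, which is exactly the weakness of enforcing authorization in the application.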


Application Security (5): Application-Level Authorization

• Currently, authorization is done entirely in application
• Entire application code has access to entire database
  ◦ large surface area, making protection harder
• Alternative: fine-grained (row-level) authorization schemes
  ◦ extensions to SQL authorization proposed but not currently implemented
  ◦ Oracle Virtual Private Database (VPD) allows predicates to be added transparently to all SQL queries, to enforce fine-grained authorization
    . For example, add ID = sys_context.user_id() to all queries on student relation if user is a student


Application Security (6): Audit Trails

• Applications must log actions to an audit trail, to detect who carried out an update, or accessed some sensitive data
• Audit trails used after-the-fact to
  ◦ detect security breaches
  ◦ repair damage caused by security breach
  ◦ trace who carried out the breach
• Audit trails needed at
  ◦ Database level, and at
  ◦ Application level
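An application-level audit trail can be as simple as a table that records who did what, written in the same transaction as the change itself. The following is a minimal sketch under assumptions: sqlite3 stands in for the database, and the audit_trail table, its columns, the update_salary helper, and the user name admin42 are all hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name TEXT PRIMARY KEY, salary INTEGER)")
conn.execute("CREATE TABLE audit_trail (at TEXT, who TEXT, action TEXT, detail TEXT)")
conn.execute("INSERT INTO instructor VALUES ('Katz', 75000)")
conn.commit()

def update_salary(conn, end_user, name, new_salary):
    """Apply the update and record who did it in the same transaction."""
    conn.execute("UPDATE instructor SET salary = ? WHERE name = ?", (new_salary, name))
    conn.execute(
        "INSERT INTO audit_trail VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), end_user,
         "update_salary", f"{name} -> {new_salary}"),
    )
    conn.commit()      # the change and its audit record become durable together

update_salary(conn, "admin42", "Katz", 80000)
trail = conn.execute("SELECT who, action, detail FROM audit_trail").fetchall()
print(trail)
```

Writing both rows before a single commit() means an update can never be stored without its audit record. A database-level trail, for example one maintained by triggers, would additionally catch changes made outside the application.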


Challenges in Web Application Development PPD



Challenges in Web Application Development PPD

• User Interface and User Experience
• Scalability
• Performance
• Knowledge of Framework and Platforms
• Security

Source: 5 Challenges in Web Application Development


Mobile Apps PPD



What is a Mobile App? PPD

• A type of application software designed to run on a mobile device, such as a smartphone or tablet computer
• Developed specifically for use on small, wireless computing devices, such as smartphones and tablets
• Designed with consideration for the demands and constraints of the devices and also to take advantage of any specialized capabilities
  – Form Factor – influences display and navigation
  – Limited Memory
  – Limited Computing Power
  – Limited Power
  – Limited Bandwidth
  – ···
  + Availability of sensors like accelerometer
  + Availability of touchscreen – Gesture-based Navigation
  + ···

Source: https://www.slideshare.net/hassandar18/architecture-of-mobile-software-applications?from_action=save

Mobile Website vis-à-vis Mobile App PPD

• Mobile Website
  ◦ Similar to any other website in that it consists of browser-based HTML pages
  ◦ Can display text content, data, images and video
  ◦ Typically accessed over WiFi or 3G or 4G networks
  ◦ Designed for the smaller handheld display and touch-screen interface
  ◦ Can also access mobile-specific features such as click-to-call (to dial a phone number) or location-based mapping
• Mobile Apps
  ◦ Actual applications that are downloaded and installed on the mobile device
  ◦ Users download apps from device-specific portals such as App Store, Google Play Store
  ◦ The app may
    . pull content and data from the Internet, in similar fashion to a website, or
    . download the content so that it can be accessed without an Internet connection

Source: https://www.slideshare.net/hassandar18/architecture-of-mobile-software-applications?from_action=save

Architecture of Mobile App PPD

• Typically 3 tier
  ◦ Presentation
  ◦ Business
  ◦ Data
• Data Layer is often split between:
  ◦ Local Data
  ◦ Remote Data
• Needs customization for platform
  ◦ Android
  ◦ iOS
  ◦ Windows

Source: https://www.slideshare.net/hassandar18/architecture-of-mobile-software-applications?from_action=save

Types of Mobile Apps PPD

• Native Apps: Completely written in the native language of a platform
  ◦ iOS → Objective-C; Android → Java or C/C++
  ◦ Platform specific (heavily dependent on OS)
• Web Apps: Run completely inside of a Web browser.
  ◦ Features interfaces built with HTML or CSS
  ◦ Powered via Web programming languages → Ruby on Rails, JavaScript, PHP, or Python
  ◦ Portable across any phone, tablet, or computer
• Hybrid Apps: Combines attributes of both native and Web apps.
  ◦ Attempts to use redundant, common code that can be used across platforms, and
  ◦ Tailors required attributes to the native system

Source: https://www.slideshare.net/hassandar18/architecture-of-mobile-software-applications?from_action=save


Design Issues PPD

• Determine Device
• Note Device Resources – memory, power, speed
• Consider Bandwidth
• Decide on Architecture Layers
• Select Technology
• Define User Interface
• Select Navigation
• Maintain Flow

Source: https://www.slideshare.net/hassandar18/architecture-of-mobile-software-applications?from_action=save


Module Summary

Module 35

• Understood the steps in the Rapid Application Development Process
• Exposed to the issues in Application Performance and Application Security
• Learnt the distinctive features of Mobile Apps

Slides used in this presentation are borrowed from http://db-book.com/ with kind
permission of the authors.
Edited and new slides are marked with “PPD”.



Week-5

Module 21

Module 22

Module 23

Module 24 Database Management Systems


Module 25
Summary : Week-5
Module 21 Recap

Redundancy: having multiple copies of the same data in the database.
This problem arises when a database is not normalized
It leads to anomalies
Anomaly: an inconsistency that can arise when data in a database is changed through insertion, deletion, or update
These problems occur in poorly planned, un-normalised databases where all the data is stored in one table (a flat-file database)
There can be three kinds of anomalies
Insertion Anomaly
Deletion Anomaly
Update Anomaly
Redundancy and Anomaly

Insertion Anomaly
When the insertion of a data record is not possible without adding some additional unrelated data to the record
We cannot add an Instructor in instructor_with_department if the department does not have a building or budget
Deletion Anomaly
When deletion of a data record results in losing some unrelated information that was stored as part of the record that was deleted from a table
If we delete the last Instructor of a Department from instructor_with_department, we lose building and budget information
Update Anomaly
When a data item is changed, which could involve many records having to be changed, leading to the possibility of some changes being made incorrectly
When the budget changes for a Department having a large number of Instructors in instructor_with_department, the application may miss some of them
First Normal Form (1NF) PPD

A domain is atomic if its elements are considered to be indivisible units
Examples of non-atomic domains:
Set of names, composite attributes
Identification numbers like CS101 that can be broken up into parts
A relational schema R is in First Normal Form (1NF) if
the domains of all attributes of R are atomic
the value of each attribute contains only a single value from that domain
Non-atomic values complicate storage and encourage redundant (repeated) storage of data
Example: Set of accounts stored with each customer, and set of owners stored with each account
We assume all relations are in first normal form
Functional Dependencies

Let R be a relation schema
α ⊆ R and β ⊆ R
The functional dependency or FD
α → β
holds on R if and only if for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β. That is,

t1[α] = t2[α] ⇒ t1[β] = t2[β]

Example: Consider r(A, B) with the following instance of r.

On this instance, A → B does NOT hold, but B → A does hold. So we cannot have tuples like (2, 4), or (3, 5), or (4, 7) added to the current instance.
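The definition can be checked mechanically on a given instance. The sketch below is illustrative only: the function name `fd_holds` and the three-tuple instance are assumptions made for this example (the instance is hypothetical, chosen so that B → A holds and A → B does not; it is not taken from the slide).

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds on this particular instance.

    rows is a list of dicts (attribute name -> value); lhs and rhs are
    lists of attribute names. Holding on one instance does not prove
    that the FD holds on the schema.
    """
    for t1, t2 in combinations(rows, 2):
        agree_lhs = all(t1[a] == t2[a] for a in lhs)
        agree_rhs = all(t1[b] == t2[b] for b in rhs)
        if agree_lhs and not agree_rhs:
            return False
    return True


# Hypothetical instance of r(A, B), not the one from the slide.
r = [{"A": 1, "B": 4}, {"A": 1, "B": 5}, {"A": 3, "B": 7}]

a_to_b = fd_holds(r, ["A"], ["B"])  # A = 1 appears with B = 4 and B = 5
b_to_a = fd_holds(r, ["B"], ["A"])
# Adding (2, 4) would give B = 4 two different A values.
b_to_a_after_insert = fd_holds(r + [{"A": 2, "B": 4}], ["B"], ["A"])
```

Comparing every pair of tuples is quadratic in the size of the instance, which is acceptable for a classroom-sized example.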
Functional Dependencies : Armstrong’s Axioms

Given a set of Functional Dependencies F, we can infer new dependencies by Armstrong's Axioms:
Reflexivity: if β ⊆ α, then α → β
Augmentation: if α → β, then γα → γβ
Transitivity: if α → β and β → γ, then α → γ
These axioms can be repeatedly applied to generate new FDs, which are added to F
A new FD obtained by applying the axioms is said to be logically implied by F
The process of generating FDs terminates after a finite number of steps, and the result is called the closure F+ of F. This is the set of all FDs logically implied by F
Clearly, F ⊆ F+
These axioms are
Sound (generate only functional dependencies that actually hold), and
Complete (eventually generate all functional dependencies that hold)
Prove the axioms from definitions of FDs
Prove the soundness and completeness of the axioms
Functional Dependencies : Armstrong’s Axioms: Derived Rules

Additional Derived Rules:
Union: if α → β holds and α → γ holds, then α → βγ holds
Decomposition: if α → βγ holds, then α → β holds and α → γ holds
Pseudotransitivity: if α → β holds and γβ → δ holds, then αγ → δ holds
The above rules can be inferred from basic Armstrong's axioms (and hence are not included in the basic set). They can be proven independently too
Reflexivity: if β ⊆ α, then α → β
Augmentation: if α → β, then γα → γβ
Transitivity: if α → β and β → γ, then α → γ
Prove the Rules from:
Basic Axioms
The definitions of FDs
Functional Dependencies : Closure of Attribute Sets: Example

R = (A, B, C, G, H, I)
F = {A → B, A → C, CG → H, CG → I, B → H}
(AG)+
1 result = AG
2 result = ABCG (A → C and A → B)
3 result = ABCGH (CG → H and CG ⊆ AGBC)
4 result = ABCGHI (CG → I and CG ⊆ AGBCH)
Is AG a candidate key?
1 Is AG a super key?
1 Does AG → R? == Is (AG)+ ⊇ R
2 Is any subset of AG a superkey?
1 Does A → R? == Is (A)+ ⊇ R
2 Does G → R? == Is (G)+ ⊇ R
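The closure computation and the candidate-key check above can be sketched in a few lines. This is a minimal illustration, assuming single-letter attribute names and FDs given as (lhs, rhs) string pairs; the function name `closure` is a choice made for this example.

```python
def closure(attrs, fds):
    """Closure of the attribute set attrs under the FDs in fds.

    fds is a list of (lhs, rhs) pairs, each a string of single-letter
    attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD when its left side is already in the result.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


R = set("ABCGHI")
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

ag_plus = closure("AG", F)
is_superkey = ag_plus >= R
# AG is a candidate key only if neither A nor G alone is a superkey.
is_candidate_key = is_superkey and not any(closure(x, F) >= R for x in "AG")
```

Here (A)+ stops at ABCH and (G)+ is just G, so neither proper subset determines all of R.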
BCNF: Boyce-Codd Normal Form

A relation schema R is in BCNF with respect to a set F of FDs if for all FDs in F+ of the form
α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:
α → β is trivial (that is, β ⊆ α)
α is a superkey for R
Example schema not in BCNF:
instr_dept (ID, name, salary, dept_name, building, budget)
because the non-trivial dependency dept_name → building, budget holds on instr_dept, but dept_name is not a superkey
BCNF (2): Decomposition

If, in schema R, a non-trivial dependency α → β causes a violation of BCNF, we decompose R into:
α ∪ β
(R − (β − α))
In our example,
α = dept_name
β = building, budget
dept_name → building, budget
instr_dept is replaced by
(α ∪ β) = (dept_name, building, budget)
dept_name → building, budget
(R − (β − α)) = (ID, name, salary, dept_name)
ID → name, salary, dept_name
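One decomposition step can be sketched as follows. This is a simplified illustration, not a full BCNF algorithm: it inspects only the FDs listed in F rather than all of F+, and the names `closure` and `bcnf_split` are assumptions made for this example.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) tuples of names."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_split(schema, fds):
    """Split schema on the first listed FD that violates BCNF.

    Returns (alpha ∪ beta, R − (beta − alpha)), or None when no listed
    FD is a violation.
    """
    for lhs, rhs in fds:
        alpha, beta = set(lhs), set(rhs)
        trivial = beta <= alpha
        superkey = closure(alpha, fds) >= schema
        if not trivial and not superkey:
            return alpha | beta, schema - (beta - alpha)
    return None


instr_dept = {"ID", "name", "salary", "dept_name", "building", "budget"}
F = [
    (("ID",), ("name", "salary", "dept_name")),
    (("dept_name",), ("building", "budget")),
]
r1, r2 = bcnf_split(instr_dept, F)
```

A complete procedure would repeat the step on each resulting schema using the FDs that apply to it.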
3NF: Third Normal Form

A relation schema R is in third normal form (3NF) if for all:
α → β ∈ F+
at least one of the following holds:
α → β is trivial (that is, β ⊆ α)
α is a superkey for R
Each attribute A in β − α is contained in a candidate key for R
(Note: Each attribute may be in a different candidate key)
If a relation is in BCNF it is in 3NF (since in BCNF one of the first two conditions above must hold)
Third condition is a minimal relaxation of BCNF to ensure dependency preservation (will see why later)
Extraneous Attributes

Consider a set F of FDs and the FD α → β in F.
Attribute A is extraneous in α if A ∈ α and F logically implies (F − {α → β}) ∪ {(α − A) → β}.
Attribute A is extraneous in β if A ∈ β and the set of FDs (F − {α → β}) ∪ {α → (β − A)} logically implies F.
Note: Implication in the opposite direction is trivial in each of the cases above, since a “stronger” functional dependency always implies a weaker one
Example: Given F = {A → C, AB → C}
B is extraneous in AB → C because {A → C, AB → C} logically implies A → C (that is, the result of dropping B from AB → C).
A+ = AC in {A → C, AB → C}
Example: Given F = {A → C, AB → CD}
C is extraneous in AB → CD since AB → C can be inferred even after deleting C
AB+ = ABCD in {A → C, AB → D}
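Both tests reduce to an attribute-closure computation, as the two examples suggest. The sketch below assumes single-letter attributes and FDs as (lhs, rhs) string pairs; the function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def extraneous_in_lhs(a, lhs, rhs, fds):
    """A is extraneous in lhs if (lhs − A)+ under F already contains rhs."""
    return set(rhs) <= closure(set(lhs) - {a}, fds)


def extraneous_in_rhs(a, lhs, rhs, fds):
    """A is extraneous in rhs if lhs+ still contains A after A is dropped
    from the right side of lhs → rhs."""
    reduced = [fd for fd in fds if fd != (lhs, rhs)]
    reduced.append((lhs, "".join(c for c in rhs if c != a)))
    return a in closure(lhs, reduced)


F1 = [("A", "C"), ("AB", "C")]
F2 = [("A", "C"), ("AB", "CD")]

b_extraneous = extraneous_in_lhs("B", "AB", "C", F1)
c_extraneous = extraneous_in_rhs("C", "AB", "CD", F2)
d_extraneous = extraneous_in_rhs("D", "AB", "CD", F2)
```

D is not extraneous in AB → CD: once it is dropped, nothing in the remaining set derives D from AB.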
Equivalence of Sets of Functional Dependencies PPD

Let F and G be two sets of functional dependencies.
F and G are equivalent if F+ = G+. That is:
(F+ = G+) ⇔ (F+ ⇒ G and G+ ⇒ F)
Equivalence means that every functional dependency in F can be inferred from G, and every functional dependency in G can be inferred from F
F and G are equivalent only if
F covers G: all functional dependencies of G are logically members of the functional dependency set F, that is, F+ ⊇ G.
G covers F: all functional dependencies of F are logically members of the functional dependency set G, that is, G+ ⊇ F.
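The two cover checks can be done with attribute closures instead of enumerating F+ and G+. A minimal sketch, assuming single-letter attributes; `covers` and `equivalent` are names chosen for this example.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD in g is implied by f (that is, F+ ⊇ G)."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """F and G are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)


F = [("A", "B"), ("B", "C"), ("A", "C")]
G = [("A", "B"), ("B", "C")]
H = [("A", "B")]

f_equiv_g = equivalent(F, G)  # A → C follows from G by transitivity
f_equiv_h = equivalent(F, H)  # H cannot derive B → C
```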
Canonical Cover

A Canonical Cover for F is a set of dependencies Fc such that ALL the following properties are satisfied:
F+ = Fc+. Or,
F logically implies all dependencies in Fc
Fc logically implies all dependencies in F
No functional dependency in Fc contains an extraneous attribute
Each left side of a functional dependency in Fc is unique. That is, there are no two dependencies α1 → β1 and α2 → β2 in Fc such that α1 = α2
Intuitively, a Canonical cover of F is a minimal set of FDs
Equivalent to F
Having no redundant FDs
No redundant parts of FDs
Minimal / Irreducible Set of Functional Dependencies
Canonical Cover : Example

For example: A → C is redundant in: {A → B, B → C, A → C}
Parts of a functional dependency may be redundant
For example: on RHS: {A → B, B → C, A → CD} can be simplified to {A → B, B → C, A → D}
– In the forward: (1) A → CD ⇒ A → C and A → D
(2) A → B, B → C ⇒ A → C
– In the reverse: (1) A → B, B → C ⇒ A → C
(2) A → C, A → D ⇒ A → CD
For example: on LHS: {A → B, B → C, AC → D} can be simplified to {A → B, B → C, A → D}
– In the forward: (1) A → B, B → C ⇒ A → C ⇒ A → AC
(2) A → AC, AC → D ⇒ A → D
– In the reverse: A → D ⇒ AC → D
Lossless Join Decomposition

For the case of R = (R1, R2), we require that for all possible relations r on schema R

r = πR1(r) ⋈ πR2(r)

A decomposition of R into R1 and R2 is lossless join if at least one of the following dependencies is in F+:
R1 ∩ R2 → R1
R1 ∩ R2 → R2
The above functional dependencies are a sufficient condition for lossless join decomposition; the dependencies are a necessary condition only if all constraints are functional dependencies

A decomposition is identified as lossless if it satisfies all of the following conditions:
R1 ∪ R2 = R
R1 ∩ R2 ≠ ∅ and
R1 ∩ R2 → R1 or R1 ∩ R2 → R2
Lossless Join Decomposition : Example

R = (A, B, C)
F = {A → B, B → C}
Can be decomposed in two different ways
R1 = (A, B), R2 = (B, C)
Lossless-join decomposition:
R1 ∩ R2 = {B} and B → BC
Dependency preserving
R1 = (A, B), R2 = (A, C)
Lossless-join decomposition:
R1 ∩ R2 = {A} and A → AB
Not dependency preserving
(cannot check B → C without computing R1 ⋈ R2)
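The lossless-join condition for a two-way decomposition can be tested with one closure computation on R1 ∩ R2. This is a sketch under the assumption that all constraints are functional dependencies; `is_lossless` is a name chosen for this example.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if R1 ∩ R2 → R1 or R1 ∩ R2 → R2 is in F+."""
    common = set(r1) & set(r2)
    if not common:
        return False
    reachable = closure(common, fds)
    return set(r1) <= reachable or set(r2) <= reachable


F = [("A", "B"), ("B", "C")]

split_on_b = is_lossless("AB", "BC", F)  # B → BC
split_on_a = is_lossless("AB", "AC", F)  # A → AB
split_on_c = is_lossless("AC", "BC", F)  # C determines nothing else
```

Both decompositions from the slide pass the test; the third split, on C, is added here only as a contrasting lossy case.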
Dependency Preservation

Let Fi be the set of dependencies in F+ that include only attributes in Ri
A decomposition is dependency preserving, if

(F1 ∪ F2 ∪ · · · ∪ Fn)+ = F+

If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive

Let R be the original relational schema having FD set F. Let R1 and R2, having FD sets F1 and F2 respectively, be the decomposed sub-relations of R. The decomposition of R is said to be dependency preserving if
F1 ∪ F2 ≡ F {Decomposition Preserving Dependency}
If F1 ∪ F2 ⊂ F {Decomposition NOT Preserving Dependency} and
F1 ∪ F2 ⊃ F {this is not possible}
Dependency Preservation : Example

R (A, B, C, D)
F = {A → B, B → C, C → D, D → A}
Decomposition: R1(A, B) R2(B, C) R3(C, D)
A → B is preserved on table R1
B → C is preserved on table R2
C → D is preserved on table R3
We have to check whether the one remaining FD: D → A is preserved or not.
R1: F1 = {A → AB, B → BA}
R2: F2 = {B → BC, C → CB}
R3: F3 = {C → CD, D → DC}

F′ = F1 ∪ F2 ∪ F3.
Checking for: D → A in F′+
D → C (from R3), C → B (from R2), B → A (from R1) : D → A (By Transitivity)
Hence all dependencies are preserved.
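The same check can be automated without materialising F1, F2, F3. The sketch below uses the standard closure-based test (grow a result set by taking closures restricted to each Ri); it is offered as one way to verify the example, with single-letter attributes and the function name `preserved` as assumptions.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserved(lhs, rhs, schemas, fds):
    """True if lhs → rhs can be derived using only FDs that are
    checkable inside individual schemas of the decomposition."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in schemas:
            ri = set(schema)
            # Attributes of Ri determined by what we already have in Ri.
            gained = closure(result & ri, fds) & ri
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


F = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
all_preserved = all(preserved(l, r, ["AB", "BC", "CD"], F) for l, r in F)

# Contrast: R = (A, B, C) split into (A, B) and (A, C) loses B → C.
G = [("A", "B"), ("B", "C")]
b_to_c_preserved = preserved("B", "C", ["AB", "AC"], G)
```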
Week-6

Module 26

Module 27

Module 29
Database Management Systems
Summary : Week-6
Normalization or Schema Refinement PPD

Normalization or Schema Refinement is a technique of organizing the data in the database
A systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics
Insertion Anomaly
Update Anomaly
Deletion Anomaly
The most common technique for Schema Refinement is decomposition.
Goal of Normalization: Eliminate Redundancy
Redundancy refers to repetition of the same data or duplicate copies of the same data stored in different locations
Normalization is mainly used for two purposes:
Eliminating redundant (useless) data
Ensuring data dependencies make sense, that is, data is logically stored
Normalization and Normal Forms PPD

A normal form specifies a set of conditions that the relational schema must satisfy in terms of its constraints – they offer varied levels of guarantee for the design
Normalization rules are divided into various normal forms. Most common normal forms are:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Informally, a relational database relation is often described as "normalized" if it meets third normal form. Most 3NF relations are free of insertion, update, and deletion anomalies
1NF: First Normal Form PPD

A relation is in First Normal Form if and only if all underlying domains contain atomic values only (it has no multivalued attributes (MVA))
STUDENT(Sid, Sname, Cname)

Source: http://www.edugrabs.com/normal-forms/#fnf


2NF: Second Normal Form PPD

Relation R is in Second Normal Form (2NF) if and only if:
R is in 1NF and
R contains no Partial Dependency

Partial Dependency:
Let R be a relational schema and X, Y, A be attribute sets over R where X: any Candidate Key, Y: proper subset of a Candidate Key, and A: Non-Prime Attribute
If Y → A exists in R, then R is not in 2NF.

(Y → A) is a Partial dependency only if
Y: Proper subset of a Candidate Key
A: Non-Prime Attribute
A prime attribute of a relation is an attribute that is a part of a candidate key of the relation
3NF: Third Normal Form PPD

Let R be the relational schema.
[E. F. Codd, 1971] R is in 3NF only if:
R should be in 2NF
R should not contain transitive dependencies (OR, Every non-prime attribute of R is non-transitively dependent on every key of R)
[Carlo Zaniolo, 1982] Alternately, R is in 3NF iff for each of its functional dependencies X → A, at least one of the following conditions holds:
X contains A (that is, A is a subset of X, meaning X → A is a trivial functional dependency), or
X is a superkey, or
Every element of A − X, the set difference between A and X, is a prime attribute (i.e., each attribute in A − X is contained in some candidate key)
[Simple Statement] A relational schema R is in 3NF if for every FD X → A associated with R either
A ⊆ X (that is, the FD is trivial) or
X is a superkey of R or
A is part of some candidate key (not just superkey!)
A relation in 3NF is naturally in 2NF
Module 27 Recap

Week-6

Module 26

Module 27

Module 29

Decomposition to 3NF
3NF Decomposition: Motivation

There are some situations where
BCNF is not dependency preserving, and
Efficient checking for FD violation on updates is important
Solution: define a weaker normal form, called Third Normal Form (3NF)
Allows some redundancy (with resultant problems; as seen above)
But functional dependencies can be checked on individual relations without computing a join
There is always a lossless-join, dependency-preserving decomposition into 3NF
3NF Decomposition : Testing for 3NF

Optimization: Need to check only FDs in F, need not check all FDs in F+.
Use attribute closure to check for each dependency α → β, if α is a superkey.
If α is not a superkey, we have to verify if each attribute in β is contained in a candidate key of R
This test is rather more expensive, since it involves finding candidate keys
Testing for 3NF has been shown to be NP-hard
Decomposition into 3NF can be done in polynomial time
3NF Decomposition : Algorithm PPD

Given: relation R, set F of functional dependencies
Find: decomposition of R into a set of 3NF relations Ri
Algorithm:
1 Eliminate redundant FDs, resulting in a canonical cover Fc of F
2 Create a relation Ri = XY for each FD X → Y in Fc
3 If the key K of R does not occur in any relation Ri, create one more relation Ri = K
3NF Decomposition : Example

Relation schema:
cust_banker_branch = (customer_id, employee_id, branch_name, type)
The functional dependencies for this relation schema are:
1 customer_id, employee_id → branch_name, type
2 employee_id → branch_name
3 customer_id, branch_name → employee_id
We first compute a canonical cover
branch_name is extraneous in the RHS of the 1st dependency
No other attribute is extraneous, so we get Fc =
customer_id, employee_id → type
employee_id → branch_name
customer_id, branch_name → employee_id
3NF Decomposition : Example

The for loop generates the following 3NF schema:
(customer_id, employee_id, type)
(employee_id, branch_name)
(customer_id, branch_name, employee_id)
Observe that (customer_id, employee_id, type) contains a candidate key of the original schema, so no further relation schema needs to be added
At end of for loop, detect and delete schemas, such as (employee_id, branch_name), which are subsets of other schemas
result will not depend on the order in which FDs are considered
The resultant simplified 3NF schema is:
(customer_id, employee_id, type)
(customer_id, branch_name, employee_id)
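The synthesis steps can be sketched as below, starting from an already computed canonical cover. The function names and the greedy way a key is found in step 3 are assumptions of this example, not the slides' exact procedure.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) tuples of names."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def synthesize_3nf(schema, fc):
    """Build a 3NF decomposition of schema from a canonical cover fc."""
    schemas = []
    for lhs, rhs in fc:  # one schema per FD in the cover
        s = set(lhs) | set(rhs)
        if s not in schemas:
            schemas.append(s)
    # Add a key schema only if no schema so far contains a key of R.
    if not any(closure(s, fc) >= schema for s in schemas):
        key = set(schema)
        for a in sorted(schema):  # greedily shrink R down to a key
            if closure(key - {a}, fc) >= schema:
                key -= {a}
        schemas.append(key)
    # Drop schemas that are proper subsets of other schemas.
    return [s for s in schemas if not any(s < t for t in schemas)]


R = {"customer_id", "employee_id", "branch_name", "type"}
Fc = [
    (("customer_id", "employee_id"), ("type",)),
    (("employee_id",), ("branch_name",)),
    (("customer_id", "branch_name"), ("employee_id",)),
]
result = synthesize_3nf(R, Fc)
```

On this input the (employee_id, branch_name) schema is removed as a subset, leaving the two schemas listed above.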
BCNF Decomposition: BCNF Definition

A relation schema R is in BCNF with respect to a set F of FDs if for all FDs in F+ of the form
α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:
α → β is trivial (that is, β ⊆ α)
α is a superkey for R
BCNF Decomposition : Algorithm PPD

1 For all dependencies A → B in F+, check if A is a superkey
By using attribute closure
2 If not, then
Choose a dependency in F+ that breaks the BCNF rules, say A → B
Create R1 = AB
Create R2 = (R − (B − A))
Note that: R1 ∩ R2 = A and A → AB (= R1), so this is a lossless decomposition
3 Repeat for R1, and R2
By defining F1+ to be all dependencies in F+ that contain only attributes in R1
Similarly F2+
BCNF Decomposition (4): Testing Dependency Preservation:
Using Closure Set of FD
Consider the example given below, we will apply both the algorithms to check dependency preservation and will discuss the results.
R (A, B, C, D)
F = {A → B, B → C, C → D, D → A}
Decomposition: R1(A, B) R2(B, C) R3(C, D)
A → B is preserved on table R1
B → C is preserved on table R2
C → D is preserved on table R3
We have to check whether the one remaining FD: D → A is preserved or not.

R1: F1 = {A → AB, B → BA}
R2: F2 = {B → BC, C → CB}
R3: F3 = {C → CD, D → DC}

F′ = F1 ∪ F2 ∪ F3.
Checking for: D → A in F′+
D → C (from R3), C → B (from R2), B → A (from R1) : D → A (By Transitivity)
Hence all dependencies are preserved.
MVD: Definition PPD

Let R be a relation schema and let α ⊆ R and β ⊆ R. The multivalued dependency
α ↠ β
holds on R if in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], there exist tuples t3 and t4 in r such that:

t1[α] = t2[α] = t3[α] = t4[α]
t3[β] = t1[β]
t3[R – β] = t2[R – β]
t4[β] = t2[β]
t4[R – β] = t1[R – β]

Example: A relation of university courses, the books recommended for the course, and the lecturers who will be teaching the course:
course ↠ book
course ↠ lecturer
Test: course ↠ book
Fourth Normal Form

A relation schema R is in 4NF with respect to a set D of functional and multivalued dependencies if for all multivalued dependencies in D+ of the form α ↠ β, where α ⊆ R and β ⊆ R, at least one of the following holds:
α ↠ β is trivial (that is, β ⊆ α or α ∪ β = R)
α is a superkey for schema R
If a relation is in 4NF, then it is in BCNF
Week-7

Module 31

Module 32

Module 33

Module 34
Database Management Systems
Module 35 Summary : Week-7

March 7, 2022
Module 31 Recap

Characteristic of Application Programs - Diversity and Unity
Applications are functionally split into:
Frontend or Presentation Layer / Tier
Middle or Application / Business Logic Layer / Tier
Backend or Data Access Layer / Tier
Application Architectures: Layers
Presentation Layer / Tier
Model-View-Controller (MVC) architecture
model - business logic
view - presentation of data, depends on display device
controller - receives events, executes actions, and returns a view to the user
Business Logic Layer / Tier - provides high level view of data and actions on data
Data Access Layer / Tier - interfaces between business logic layer and the underlying database
Module 31 Recap (Cont.)

Architecture Classification
The design of a DBMS depends on its architecture. It can be
centralized
decentralized
hierarchical
The architecture of a DBMS can be seen as either single tier or multi-tier:
1-tier architecture
2-tier architecture
3-tier architecture
n-tier architecture
Module 32 Recap

Web Fundamentals
The World Wide Web
Hypertext Markup Language (HTML)
Uniform Resource Locators (URLs)
Uniform Resource Identifier (URI)
Uniform Resource Locator (URL)
Uniform Resource Name (URN)
Hypertext Transfer Protocol (HTTP)
HTTP and Sessions
Sessions and Cookies
Web Browser
Web Servers
Web Services - Representational State Transfer (REST), XML, JavaScript Object Notation (JSON), Big Web Services
Module 32 Recap (Cont.)

Scripting for Web Applications
Client-side scripting - scripts are first downloaded at the client end and then interpreted and executed by the browser
JavaScript
Server-side scripting - is responsible for carrying out a task at the server end and then sending the result to the client end.
Servlets
Java Server Pages (JSP)
PHP
Module 33 Recap

Working with SQL and Native Language
Connectionist
Open Database Connectivity (ODBC)
Java Database Connectivity (JDBC)
JDBC example
Connectionist Bridge Configurations
ODBC-to-JDBC bridges, JDBC-to-ODBC bridges, OLE DB-to-ODBC bridges, ADO.NET-to-ODBC bridges
Embedded SQL
Examples with C, Java


Module 34 Recap

Python Modules for PostgreSQL
Package psycopg2
Steps to access PostgreSQL from Python using psycopg2
1 Create connection
2 Create cursor
3 Execute the query
4 Commit/rollback
5 Close the cursor
6 Close the connection
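The six steps follow the general Python DB-API pattern. The sketch below deliberately uses the standard-library sqlite3 module as a stand-in so that it runs without a PostgreSQL server; with psycopg2 the connection would instead come from `psycopg2.connect(...)` and the placeholders would be `%s` rather than `?`. The table columns and the sample row are hypothetical.

```python
import sqlite3

# 1 Create connection (in-memory SQLite database as a stand-in)
conn = sqlite3.connect(":memory:")
# 2 Create cursor
cur = conn.cursor()
try:
    # 3 Execute the queries
    cur.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute(
        "INSERT INTO instructor (id, name) VALUES (?, ?)", (1, "Einstein")
    )
    # 4 Commit on success ...
    conn.commit()
    cur.execute("SELECT name FROM instructor WHERE id = ?", (1,))
    rows = cur.fetchall()
except sqlite3.Error:
    # ... or roll back on failure
    conn.rollback()
    raise
finally:
    # 5 Close the cursor
    cur.close()
    # 6 Close the connection
    conn.close()
```

Putting the two close calls in `finally` keeps steps 5 and 6 from being skipped when a query fails.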

Python psycopg2 Module APIs: insert, delete, update, stored procedures


Python psycopg2 Module APIs: select
Web and Internet Development using Python
Module 35 Recap

Rapid Application Development (RAD) is an agile software development model that focuses on fast prototyping and quick feedback in app development to ensure speedier delivery and an efficient result
Several approaches to speed up application development
Web application development frameworks
1. Java Server Faces (JSF) 2. Ruby on Rails
RAD Platforms and Tools
ASP.NET and Visual Studio
Application Performance
Application Security
SQL Injection: e.g. select * from instructor where name = 'X' or 'Y' = 'Y'
1. Password Leakage 2. Authentication 3. Application-Level Authorization 4. Audit Trails
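The injection query above arises when user input is concatenated into the SQL text. The sketch below reproduces it and contrasts it with a parameterized query; it uses the standard-library sqlite3 module so that it runs standalone, and the table contents are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name TEXT)")
conn.executemany(
    "INSERT INTO instructor VALUES (?)", [("Einstein",), ("Wu",)]
)

# Input crafted by an attacker for a "name" form field.
user_input = "X' or 'Y' = 'Y"

# Unsafe: the input becomes part of the SQL text, so the WHERE clause
# turns into  name = 'X' or 'Y' = 'Y'  and matches every row.
unsafe_sql = "select * from instructor where name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the input is passed as a parameter and is treated purely as a
# value to compare against, never as SQL.
safe_rows = conn.execute(
    "select * from instructor where name = ?", (user_input,)
).fetchall()

conn.close()
```

The concatenated query returns the whole table, while the parameterized one finds no instructor with that literal name.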
Module 35 Recap (Cont.)

Challenges in Web Application Development - User Interface and User Experience, Scalability, Performance, Knowledge of Framework and Platforms, Security
Mobile Apps - A type of application software designed to run on a mobile device, such as a smartphone or tablet computer
Mobile Website
Mobile Apps
Architecture of Mobile App - Typically 3 tier: Presentation, Business, and Data
Types of Mobile Apps
Native Apps
Web Apps
Hybrid Apps
Design Issues
Week-8

Module 36

Module 37

Module 38

Module 39 Database Management Systems


Module 40
Summary : Week-8
Module 36 Recap

Algorithms and Programs
Analysis of Algorithms
Why analyze?
What to analyze?
How to analyze?
Counting Models
Asymptotic Analysis
Generating Functions
Master Theorem
Where to analyze?
When to analyze?
Complexity Chart
Module 37 Recap

Linear data structures: A Linear data structure has data elements arranged in linear or sequential manner such that each member element is connected to its previous and next element.
Array: The data elements are stored at contiguous locations in memory.
Linked List: The data elements are not required to be stored at contiguous locations in memory. Rather each element stores a link (a pointer or a reference) to the location of the next element.
Queue: It is a FIFO (First In First Out) data structure.
Stack: It is a LIFO (Last In First Out) data structure.
Search
Linear
Binary
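The two search strategies named above can be sketched on a Python list standing in for an array. The function names and sample data are illustrative; binary search assumes the input is already sorted.

```python
def linear_search(items, target):
    """Scan left to right; works on unordered data, O(n) comparisons."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1


def binary_search(sorted_items, target):
    """Halve the search range each step; needs sorted data, O(lg n)."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


unordered = [15, 10, 20, 8, 12]
ordered = sorted(unordered)  # [8, 10, 12, 15, 20]

pos_linear = linear_search(unordered, 20)
pos_binary = binary_search(ordered, 20)
missing = binary_search(ordered, 11)
```

Binary search relies on constant-time access by index, which is why it suits ordered arrays but not linked lists.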
Module 37 Recap (Cont..)

From the study of Linear data structures, we can make the following summary observations:
All of them have the space complexity O(n), which is optimal. However, the actual used space may be lower in array while linked list has an overhead of 100% (double)
All of them have complexities that are identical for Worst as well as Average case
All of them offer satisfactory complexity for some operations while being unsatisfactory on the others

         Array                  Linked List
         Unordered  Ordered     Unordered  Ordered
Access   O(1)       O(1)        O(n)       O(n)
Insert   O(n)       O(n)        O(1)       O(1)
Delete   O(n)       O(n)        O(1)       O(1)
Search   O(n)       O(lg n)     O(n)       O(n)
Module 38 Recap

Non-Linear data structures are those data structures in which data items are not arranged in a sequence and each element may have multiple paths to connect to other elements.
Graph: Undirected or Directed, Unweighted or Weighted, and variants
Tree: Rooted or Unrooted, Binary or n-ary, Balanced or Unbalanced, and variants
Hash Table: Array with lists (coalesced chains) and one or more hash functions
Skip List: Multi-layered interconnected linked lists
Binary Search Tree: A binary tree in which every node satisfies the following:
The value of each node in the left sub-tree is less than the value of its root
The value of each node in the right sub-tree is greater than the value of its root
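The two ordering rules translate directly into an insertion routine. The sketch below is a plain, unbalanced BST; the class and function names, the sample keys, and the convention that height counts edges (an empty tree has height -1) are all choices made for this example.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key, keeping smaller keys to the left and larger to the right."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicate keys are ignored


def inorder(root):
    """Left, root, right: visits the keys in ascending order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


def height(root):
    """Number of edges on the longest root-to-leaf path."""
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))


root = None
for key in [50, 30, 70, 20, 40]:  # hypothetical keys
    root = insert(root, key)

keys_in_order = inorder(root)
tree_height = height(root)
```

The shape of the tree, and therefore its height, depends on the order in which the keys arrive.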
Binary Search Tree

Practice Question: Construct the binary search tree for each of the following sequences:
1 15,10,20,8,12,27,23,2,6,11,14,17
2 15,10,6,20,27,2,23,17,8,14,11,12
3 15,23,6,20,12,2,10,17,8,14,11,27
For each BST, find out the number of leaf nodes, height of BST and number of elements at level 2.
Comparison of Linear and Non-Linear Data Structures

Linear Data Structure
• Data elements are arranged in a linear order where each element is attached to its previous and next adjacent elements
• Single level is involved
• Implementation is easy in comparison to non-linear data structure
• Data elements can be traversed in one way only
• Examples: array, stack, queue, linked list, and their variants

Non-Linear Data Structure
• Data elements are arranged in hierarchical or networked manner
• Multiple levels are involved
• Implementation is complex in comparison to linear data structure
• Data elements can be traversed in multiple ways. Various traversals may be defined to linearize the data: Depth-First, Breadth-First, Inorder, Preorder, Postorder, etc.
• Examples: trees, graphs, skip list, hash map, and several variants
Module 39 Recap

Physical Storage Media
Magnetic Disks
(Go through the slides for theoretical part and refer to practice and graded assignment questions)
Magnetic Tape
Cloud Storage
Cloud Storage vs. Traditional Storage
Other Storage
Optical Disks
Flash Drives
Secure Digital Cards (SD cards)
Flash Storage
Solid-State Drives (SSD)
Future of Storage
DNA Digital Storage
Quantum Memory
Module 40 Recap

File Organization
Organization of Records in Files
Heap: A record can be placed anywhere in the file where there is space
Sequential: Store records in sequential order, based on the value of the search key of each record.
Suitable for applications that require sequential processing of the entire file
The records in the file are ordered by a search-key.
It will work more efficiently when working on search-key (primary key) of the table.
Hashing: A hash function is computed on some attribute of each record; the result specifies in which block of the file the record should be placed
In a multitable clustering file organization records of several different relations can be stored in the same file.
good for queries involving department ⋈ instructor, and for queries involving one single department and its instructors
bad for queries involving only department
results in variable size records
Can add pointer chains to link records of a particular relation
Module 40 Cont..

Data Dictionary (also, System Catalog) stores metadata (data about data) such as:
Information about relations
User and accounting information, including passwords
Statistical and descriptive data
Physical file organization information
Information about indices
Buffer: portion of main memory available to store copies of disk blocks
Buffer Manager: subsystem responsible for allocating buffer space in main memory
Buffer Replacement Policies:
Least recently used (LRU strategy)
Most recently used (MRU strategy)
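The LRU policy can be sketched with an ordered dictionary acting as the buffer pool. This is a toy model under stated assumptions: the class name `LRUBuffer`, the `read_block` callback, and the block contents are hypothetical, and pinning, dirty pages, and write-back are ignored.

```python
from collections import OrderedDict


class LRUBuffer:
    """Toy buffer pool that evicts the least recently used block."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block  # function: block id -> contents
        self.frames = OrderedDict()   # oldest use first, newest last

    def get(self, block_id):
        if block_id in self.frames:
            # Hit: mark the block as most recently used.
            self.frames.move_to_end(block_id)
            return self.frames[block_id]
        if len(self.frames) >= self.capacity:
            # Miss with a full buffer: evict the least recently used block.
            self.frames.popitem(last=False)
        data = self.read_block(block_id)
        self.frames[block_id] = data
        return data


disk_reads = []


def read_from_disk(block_id):
    """Stand-in for a disk read; records which blocks were fetched."""
    disk_reads.append(block_id)
    return "block-%d" % block_id


buf = LRUBuffer(capacity=2, read_block=read_from_disk)
for block in [1, 2, 1, 3, 2]:
    buf.get(block)
```

With two frames, the access to block 3 evicts block 2 (block 1 was used more recently), so block 2 has to be read again afterwards. An MRU variant would evict from the other end with `popitem(last=True)`.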
Module 36

Partha Pratim
Das

Week Recap

Objectives &
Database Management Systems
Outline
Module 36: Algorithms and Data Structures/1: Algorithms and Complexity Analysis
Algorithms

Analysis of
Algorithms
Why?
What?
How?
Partha Pratim Das
Counting Models
Asymptotic Analysis
Department of Computer Science and Engineering
Where?
Indian Institute of Technology, Kharagpur
Complexity Chart

Module Summary [email protected]

Database Management Systems Partha Pratim Das 36.1


Week Recap PPD

Module 36

• Had a glimpse of Application Programs across various sectors
• Understood the architectures for an application and their classification and evolution
• Glimpsed at architecture for a few sample applications
• Familiarized with the fundamental notions and technologies of the Web
• Learnt about Scripting and the notions of Servlets
• Learnt to use SQL from a programming language
• Learnt to build Python Web Applications with PostgreSQL using psycopg2 and Flask
• Understood the steps in the Rapid Application Development Process
• Exposed to the issues in Application Performance and Security
• Learnt the distinctive features of Mobile Apps

Database Management Systems Partha Pratim Das 36.2


Module Objectives PPD

• Define algorithms and how they differ from programs
• Analyze algorithms for performance in time, space, power, etc.
• Introduce asymptotic notation for the representation of complexity
• Consider the complexity of common algorithms


Module Outline PPD

• Algorithms and Programs
• Analysis of Algorithms
• Complexity Chart


Algorithms and Programs PPD

• Algorithm
◦ An algorithm is a finite sequence of well-defined instructions (optionally computer-implementable), typically to solve a class of specific problems or to perform a computation.
◦ Algorithms are always unambiguous and are used as specifications for performing calculations, data processing, automated reasoning, and other tasks.
◦ An algorithm must terminate.
• Program
◦ A computer program is a collection of instructions that can be executed by a computer to perform a specific task.
◦ A computer program is usually written by a computer programmer in a programming language.
◦ A program implements an algorithm.
◦ A program may or may not terminate. For example, an operating system (OS) keeps running until it is shut down.


Analysis of Algorithms PPD



Analysis of Algorithms

• Why?
◦ Set the motivation for algorithm analysis: Why analyze?
• What?
◦ Identify what needs to be analyzed: What to analyze?
• How?
◦ Learn the techniques for analysis: How to analyze?
• Where?
◦ Understand the scenarios for application: Where to analyze?
• When?
◦ Realize your position for seeking the analysis: When to analyze?

Why analyze?

Practical reasons:
• Resources are scarce
• Greed to do more with less
• Avoid performance bugs

Core Issues:
• Predict performance
◦ How much time does binary search take?
• Compare algorithms
◦ How quick is Quicksort?
• Provide guarantees
◦ Size notwithstanding, Red-Black tree inserts in O(log n)
• Understand theoretical basis
◦ Sorting by comparison cannot do better than Ω(n log n)

What to analyze?

Core Issue: Cannot control what we cannot measure
• Time
◦ Story starts here with Analytical Engine
◦ Most common analysis factor
◦ Representative of various related analysis factors like Power, Bandwidth, Processors
◦ Supported by Complexity Classes
• Space
◦ Widely explored
◦ Important for hand-held devices
◦ Supported by Complexity Classes


What to analyze?

• Sum of Natural Numbers

int sum(int n) {
    int s = 0;
    for (; n > 0; --n)
        s = s + n;
    return s;
}

• Time T(n) = n (additions)
• Space S(n) = 2 (n, s)
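The count T(n) = n can be checked by instrumenting the loop. The following is a minimal Python sketch of the same summation; the function name `sum_with_count` and the `additions` counter are illustrative additions, not part of the slides.

```python
def sum_with_count(n):
    """Return (sum of 1..n, number of additions performed on s)."""
    s = 0
    additions = 0
    while n > 0:
        s = s + n          # the operation being counted
        additions += 1
        n -= 1
    return s, additions


total, additions = sum_with_count(10)
print(total, additions)    # the addition count equals n
```

Only two working variables (`n` and `s`) are needed regardless of the input, which matches S(n) = 2; the counter is bookkeeping for the experiment.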


What to analyze?

• Find a character in a string

#include <string.h>

int find(char *str, char c) {
    for (int i = 0; i < strlen(str); ++i)   /* strlen() is re-evaluated on every iteration */
        if (str[i] == c)
            return i;
    return -1;   /* not found; returning 0 would be confused with a match at index 0 */
}

n = strlen(str)
• Time T(n) = n (compare) + n ∗ T(strlen(str)) ≈ n + n² ≈ n²
• Space S(n) = 3 (str, c, i)
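The quadratic term comes only from recomputing the length in the loop condition. As a rough illustration, the Python sketch below mimics a C-style `strlen` that scans the whole string and counts the characters it touches; `c_strlen`, `find_rescan` and `find_once` are hypothetical names introduced here.

```python
def c_strlen(s, counter):
    """Mimic C strlen: walk the whole string, counting each character scanned."""
    n = 0
    for _ in s:
        n += 1
        counter[0] += 1
    return n


def find_rescan(s, c):
    """Length is recomputed at every loop test, as in the C code above."""
    scanned = [0]
    i = 0
    while i < c_strlen(s, scanned):
        if s[i] == c:
            return i, scanned[0]
        i += 1
    return -1, scanned[0]


def find_once(s, c):
    """Length is computed a single time before the loop."""
    scanned = [0]
    n = c_strlen(s, scanned)
    for i in range(n):
        if s[i] == c:
            return i, scanned[0]
    return -1, scanned[0]


text = "a" * 100
print(find_rescan(text, "z"))   # scan count grows like n * n
print(find_once(text, "z"))     # scan count grows like n
```

Hoisting the length computation out of the loop brings the search back to linear time without changing its result.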


What to analyze?

• Minimum of a Sequence of Numbers

#include <iostream>
using namespace std;

int min(int a[], int n) {
    for (int i = 0; i < n; ++i)
        cin >> a[i];

    int t = a[--n];          // start with the last element; n now counts the rest

    for (; n > 0; --n)
        if (a[n - 1] < t)    // one comparison per remaining element
            t = a[n - 1];

    return t;
}

• Time T(n) = n − 1 (comparison of value)
• Space S(n) = n + 3 (a[]’s, n, i, t)


What to analyze?

• Minimum of a Sequence of Numbers

#include <iostream>
using namespace std;

int min(int n) {
    int x;
    cin >> x;
    int t = x;               // the first value is the minimum so far
    for (; n > 1; --n) {
        cin >> x;
        if (x < t)           // keep the smaller value
            t = x;
    }
    return t;
}

• Time T(n) = n − 1 (comparison of value)
• Space S(n) = 3 (n, x, t)
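The two versions perform the same number of comparisons but differ in space: the second never stores the sequence. A minimal Python sketch of the constant-space idea follows, assuming the values arrive as any non-empty iterable; `stream_min` is an illustrative name.

```python
def stream_min(values):
    """Minimum of a non-empty iterable, keeping only the current value and the best so far."""
    it = iter(values)
    t = next(it)              # first value read becomes the running minimum
    comparisons = 0
    for x in it:
        comparisons += 1
        if x < t:
            t = x
    return t, comparisons


print(stream_min([7, 3, 9, 1, 4]))   # n - 1 comparisons for n values
```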


How to analyze?

• Counting Models
• Asymptotic Analysis
• Generating Functions
• Master Theorem


How to analyze?: Counting Models

Counting Models
• Core Idea: Total running time = Sum of cost × frequency for all operations
◦ Need to analyze program to determine set of operations
◦ Cost depends on machine, compiler
◦ Frequency depends on algorithm, input data
• Machine Model: Random Access Machine (RAM) Computing Model
◦ Input data & size
◦ Operations
◦ Intermediate Stages
◦ Output data & size


How to analyze?: Counting Models

• Factorial (Recursive)

int fact(int n) {
    if (0 != n) return n*fact(n-1);
    return 1;
}

◦ Time T(n) = n (multiplication: one per call with n ≠ 0)
◦ Space S(n) = n + 1 (n’s in recursive calls)

• Factorial (Iterative)

int fact(int n) {
    int t = 1;
    for (; n > 0; --n)
        t = t * n;
    return t;
}

◦ Time T(n) = n (multiplication)
◦ Space S(n) = 2 (n, t)
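The counting model can be applied mechanically by instrumenting both versions. The Python sketch below is illustrative only: `fact_recursive` and `fact_iterative` are hypothetical names, and the extra return values exist just to expose the counts. For the code as written, each call with n ≠ 0 performs one multiplication, and the recursion holds one copy of n per active call.

```python
def fact_recursive(n, depth=1):
    """Return (n!, multiplications performed, maximum number of active calls)."""
    if n == 0:
        return 1, 0, depth
    value, mults, max_depth = fact_recursive(n - 1, depth + 1)
    return n * value, mults + 1, max_depth


def fact_iterative(n):
    """Return (n!, multiplications performed) using two working variables."""
    t = 1
    mults = 0
    while n > 0:
        t = t * n
        mults += 1
        n -= 1
    return t, mults


print(fact_recursive(5))   # value, multiplications, active calls at the deepest point
print(fact_iterative(5))   # value, multiplications
```

Both versions do the same arithmetic; the difference the counting model exposes is the linear stack usage of the recursive form.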


How to analyze?: Asymptotic Analysis

Asymptotic Analysis
• Core Idea: Cannot compare actual times; hence compare Growth, or how time increases with input size
◦ Function Approximation (tilde (~) notation)
◦ Common Growth Functions
◦ Big-Oh (O(.)), Big-Omega (Ω(.)), and Big-Theta (Θ(.)) Notations
◦ Solve recurrence with Growth Functions


How to analyze?: Asymptotic Analysis

int count = 0;
for (int i = 0; i < N; i++)
    for (int j = i+1; j < N; j++)
        if (a[i] + a[j] == 0)
            count++;

Function Approximation (tilde (~) notation)

Operation               Frequency                     Approximation
variable declaration    N + 2                         ~ N
assignment statement    N + 2                         ~ N
less than compare       (1/2)(N + 1)(N + 2)           ~ (1/2) N²
equal to compare        (1/2) N(N − 1)                ~ (1/2) N²
array access            N(N − 1)                      ~ N²
increment               (1/2) N(N − 1) to N(N − 1)    ~ (1/2) N² to ~ N²

• Estimate running time (or memory) as a function of input size N. Ignore lower order terms
◦ when N is large, terms are negligible
◦ when N is small, we don’t care

f(N) ~ g(N) means $\lim_{N \to \infty} \frac{f(N)}{g(N)} = 1$

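The tilde approximation can be checked numerically for one row of the table. The Python sketch below counts how often the equality test in the double loop runs and compares it with (1/2) N²; the function name `count_equal_compares` is illustrative.

```python
def count_equal_compares(n):
    """Number of times the test a[i] + a[j] == 0 is evaluated in the double loop."""
    compares = 0
    for i in range(n):
        for j in range(i + 1, n):
            compares += 1
    return compares


for n in (10, 100, 1000):
    exact = count_equal_compares(n)          # equals n * (n - 1) / 2
    print(n, exact, exact / (0.5 * n * n))   # the ratio approaches 1 as n grows
```

The ratio tends to 1 because the discarded term, N/2, becomes negligible next to N²/2.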
How to analyze?: Asymptotic Analysis

Courtesy: Algorithms by Robert Sedgewick & Kevin Wayne

How to analyze?: Asymptotic Analysis



How to analyze?: Asymptotic Analysis

Courtesy: Algorithms by Robert Sedgewick & Kevin Wayne

Asymptotic notation PPD

For a given function g(n), we denote by O(g(n)) the set of functions:

$O(g(n)) = \{\, f(n) : \text{there exist positive constants } c \text{ and } n_0 \text{ such that } 0 \le f(n) \le c\,g(n) \text{ for all } n > n_0 \,\}$

• We use O-notation to give an upper bound on a function, to within a constant factor.
• When we say that the running time of A is O(n²), we mean that there is a function f(n) that is O(n²) such that for any value of n, no matter what particular input of size n is chosen, the running time on that input is bounded from above by the value f(n).
• Equivalently, we mean that the worst-case running time is O(n²).
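To make the definition concrete, take the hypothetical function f(n) = 3n² + 10n and g(n) = n². Choosing c = 4 and n0 = 10 works because 10n ≤ n² whenever n ≥ 10. The Python sketch below checks the inequality over a finite range; this is a sanity check, not a proof, and all names in it are illustrative.

```python
def f(n):
    return 3 * n * n + 10 * n   # hypothetical running-time function


def g(n):
    return n * n


def witnesses_hold(c, n0, upto=10_000):
    """Check 0 <= f(n) <= c * g(n) for every n with n0 < n <= upto."""
    return all(0 <= f(n) <= c * g(n) for n in range(n0 + 1, upto + 1))


print(witnesses_hold(4, 10))   # c = 4, n0 = 10 satisfy the definition on this range
print(witnesses_hold(3, 10))   # c = 3 fails: 3n^2 + 10n always exceeds 3n^2
```

A failing pair such as c = 3 does not show that f is not O(n²); it only shows that this particular constant is too small.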


Where to analyze?

Algorithmic Situation
• Core Idea: Identify data configurations or scenarios for analysis
◦ Best Case
▷ Minimum running time on an input
◦ Worst Case
▷ Running time guarantee for any input of size n
◦ Average Case
▷ Expected running time for a random input of size n
◦ Probabilistic Case
▷ Expected running time of a randomized algorithm
◦ Amortized Case
▷ Worst case running time for any sequence of n operations
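The first three cases can be illustrated on a sequential scan. The Python sketch below counts comparisons when looking for a key in a list of 100 distinct values; the average assumes the key is present and equally likely to be at any position, which is an assumption made here for illustration.

```python
def comparisons_to_find(items, key):
    """Number of element comparisons a sequential scan makes before stopping."""
    count = 0
    for item in items:
        count += 1
        if item == key:
            break
    return count


items = list(range(1, 101))
best = comparisons_to_find(items, 1)        # key is the first element
worst = comparisons_to_find(items, 101)     # key is absent: every element is checked
average = sum(comparisons_to_find(items, k) for k in items) / len(items)
print(best, worst, average)
```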


Analysis of Algorithms PPD

Complexity Chart


Big-O Algorithm Complexity Cheat Sheet

Source: Know Thy Complexities! (06-Apr-2021)


Big-O Algorithm Complexity Cheat Sheet

Source: Know Thy Complexities! (06-Apr-2021)


Module Summary

• Need for analyzing the running-time and space requirements of a program
• Asymptotic growth rate or order of the complexity of different algorithms
• Worst-case, average-case and best-case analysis

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.


Database Management Systems
Module 37: Algorithms and Data Structures/2: Data Structures

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
[email protected]


Module Recap PPD

• Need for analyzing the running-time and space requirements of a program
• Asymptotic growth rate or order of the complexity of different algorithms
• Worst-case, average-case and best-case analysis


Module Objectives PPD

• Introduction to Data Structures
• Review of linear data structures - array, list, stack, queue
• Review of search - linear and binary


Module Outline PPD

• Linear data structures - array, list, stack, queue
• Search - linear and binary


Data Structure PPD

• Data structure: A data structure specifies the way of organizing and storing in-memory data that enables efficient access and modification of the data.
◦ Linear Data Structures
◦ Non-linear Data Structures
• Most data structures have a container for the data and a set of typical operations that they need to perform
• For applications relating to data management, the key operations are:
◦ Create
◦ Insert
◦ Delete
◦ Find / Search
◦ Close
• Efficiency is measured in terms of time and space taken for these operations


Linear Data Structures PPD



Linear Data Structures PPD

• A linear data structure has data elements arranged in a linear or sequential manner such that each member element is connected to its previous and next element.
• Since data elements are sequentially connected, all elements can be traversed in a single run.
• Examples of linear data structures are Array, Linked List, Queue, Stack, etc.


Linear Data Structures (2) PPD

Different examples of linear data structure:
• Array: The data elements are stored at contiguous locations in memory.
• Linked List: The data elements are not required to be stored at contiguous locations in memory. Rather, each element stores a link (a pointer or a reference) to the location of the next element.
• Queue: It is a FIFO (First In First Out) data structure. The element that has been inserted first in the queue would be removed first. Thus, insertion and removal of the elements take place in the same order.
• Stack: It is a LIFO (Last In First Out) data structure. The element that has been inserted last in the stack would be removed first. Thus, insertion and removal of the elements take place in the reverse order.
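The FIFO and LIFO orders are easy to see side by side. The following is a minimal Python sketch, assuming `collections.deque` from the standard library as the queue and a plain list as the stack; the variable names are illustrative.

```python
from collections import deque

# Queue: elements leave in the order they arrived (FIFO).
queue = deque()
for x in (1, 2, 3):
    queue.append(x)                                    # insert at the rear
fifo_order = [queue.popleft() for _ in range(3)]       # remove from the front

# Stack: the most recently inserted element leaves first (LIFO).
stack = []
for x in (1, 2, 3):
    stack.append(x)                                    # push on top
lifo_order = [stack.pop() for _ in range(3)]           # pop from the top

print(fifo_order)   # same order as insertion
print(lifo_order)   # reverse order of insertion
```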


Linear Data Structures (3): Array PPD

• The elements are stored in contiguous memory locations.
• Simple access using indices. For example, if the array name is arr, we can access the element at index 5 as arr[5].
• An array allows random access using its index, which is fast (cost of O(1)). This is useful for operations like sorting and searching.


Linear Data Structures (4): Array PPD

• Arrays have fixed sizes and are not flexible. Since the number of elements to be stored is not known until runtime, an array that is created too large wastes memory, while an array that is created too small cannot accommodate all the elements.
◦ For example, suppose we create an array to store 8 elements. However, during execution of the program only 5 elements are available, which results in wastage of memory space.


Linear Data Structures (5): Array PPD

• Insertion and removal of elements from an array are costlier since the memory locations have to be consecutive.
◦ Insertion or removal of an element from the end of an array is easy.
▷ Insert at end:
▷ Remove from end:


Linear Data Structures (6): Array PPD

• Insertion and removal of elements from an array are costlier since the memory locations have to be consecutive.
◦ Inserting or removing an element at an arbitrary position is costly (cost is O(n))
▷ Insert at any arbitrary position:
▷ Remove from any arbitrary position:
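The O(n) cost comes from shifting elements to keep the storage contiguous. The Python sketch below models a fixed-capacity array as a list with a separate count of used slots and reports how many elements are moved; `insert_at`, `remove_at` and the use of `None` for empty slots are illustrative choices, not taken from the slides.

```python
def insert_at(arr, size, pos, value):
    """Insert value at index pos in a fixed-capacity array; return (new size, elements moved)."""
    if size >= len(arr):
        raise OverflowError("array is full")
    if not 0 <= pos <= size:
        raise IndexError("position out of range")
    moves = 0
    for i in range(size, pos, -1):   # shift the tail one slot to the right
        arr[i] = arr[i - 1]
        moves += 1
    arr[pos] = value
    return size + 1, moves


def remove_at(arr, size, pos):
    """Remove the element at index pos; return (new size, elements moved)."""
    if not 0 <= pos < size:
        raise IndexError("position out of range")
    moves = 0
    for i in range(pos, size - 1):   # shift the tail one slot to the left
        arr[i] = arr[i + 1]
        moves += 1
    arr[size - 1] = None             # mark the freed slot as empty
    return size - 1, moves


arr = [10, 20, 30, 40, None, None, None, None]
size = 4
size, moved_mid = insert_at(arr, size, 1, 15)     # shifts 20, 30, 40
size, moved_end = insert_at(arr, size, size, 50)  # nothing to shift at the end
size, moved_front = remove_at(arr, size, 0)       # shifts every remaining element
print(arr, size, moved_mid, moved_end, moved_front)
```

Operations at the end move nothing, which is why they are cheap, while operations near the front move almost every element.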


Linear Data Structures (7): Linked List PPD

• Elements are not required to be stored at contiguous memory locations. A new element can be stored anywhere in the memory where free space is available. Thus, it provides better memory usage than arrays.
• For each new element allocated, a link (a pointer or a reference) is created for the new element, using which the element can be added to the linked list.
• Each element is stored in a node. A node has two parts:
◦ Info: stores the element.
◦ Link: stores the location of the next node.
• Header is a link to the first node of the linked list.


Linear Data Structures (8): Linked List PPD

• Flexible in size. Size of a linked list grows or shrinks as and when new elements are inserted or deleted.
• Random access is not possible in linked lists. The elements have to be accessed sequentially.
• Insertion or removal of an element at/from any arbitrary position is efficient, as none of the elements needs to be moved to a new location.


Linear Data Structures (9): Linked List PPD

• Insertion or removal of an element at/from any arbitrary position is efficient.
◦ Insertion at front:
1. NewNode.Link = Header
2. Header = NewNode


Linear Data Structures (10): Linked List PPD

• Insertion or removal of an element at/from any arbitrary position is efficient.
◦ Remove from front:
1. Temp = Header
2. Header = Header.Link
3. Delete(Temp)


Linear Data Structures (11): Linked List PPD

• Insertion or removal of an element at/from any arbitrary position is efficient.
◦ Insertion at end (Node is the current last node):
1. Node.Link = NewNode


Linear Data Structures (12): Linked List PPD

• Insertion or removal of an element at/from any arbitrary position is efficient.
◦ Remove from end (Node is the second-to-last node):
1. Temp = Node.Link
2. Node.Link = NULL
3. Delete(Temp)


Linear Data Structures (13): Linked List PPD

• Insertion or removal of an element at/from any arbitrary position is efficient.
◦ Insertion at any intermediate position (after Node):
1. NewNode.Link = Node.Link
2. Node.Link = NewNode


Linear Data Structures (14): Linked List PPD

• Insertion or removal of an element at/from any arbitrary position is efficient.
◦ Remove from any intermediate position (the node after Node):
1. Temp = Node.Link
2. Node.Link = Node.Link.Link
3. Delete(Temp)
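The pointer manipulations on the preceding slides translate almost line by line into code. The following is a minimal Python sketch of a singly linked list; the class and method names are illustrative, and Python's garbage collector plays the role of Delete(Temp).

```python
class Node:
    def __init__(self, info, link=None):
        self.info = info    # Info: stores the element
        self.link = link    # Link: reference to the next node


class LinkedList:
    def __init__(self):
        self.header = None  # Header: link to the first node

    def insert_front(self, info):
        new_node = Node(info)
        new_node.link = self.header       # 1. NewNode.Link = Header
        self.header = new_node            # 2. Header = NewNode
        return new_node

    def remove_front(self):
        if self.header is None:
            raise IndexError("list is empty")
        temp = self.header                # 1. Temp = Header
        self.header = self.header.link    # 2. Header = Header.Link
        return temp.info                  # 3. temp is reclaimed once unreferenced

    def insert_after(self, node, info):
        new_node = Node(info)
        new_node.link = node.link         # 1. NewNode.Link = Node.Link
        node.link = new_node              # 2. Node.Link = NewNode
        return new_node

    def remove_after(self, node):
        if node.link is None:
            raise IndexError("no node after the given node")
        temp = node.link                  # 1. Temp = Node.Link
        node.link = node.link.link        # 2. Node.Link = Node.Link.Link
        return temp.info

    def to_list(self):
        out, cur = [], self.header
        while cur is not None:
            out.append(cur.info)
            cur = cur.link
        return out


lst = LinkedList()
a = lst.insert_front("a")
c = lst.insert_after(a, "c")     # insertion at end: a is the last node here
lst.insert_after(a, "b")         # insertion at an intermediate position
snapshot = lst.to_list()
removed_mid = lst.remove_after(a)
removed_front = lst.remove_front()
print(snapshot, removed_mid, removed_front, lst.to_list())
```

Insertion at the end and removal from the end are the same two methods applied to the last and second-to-last node respectively. Each operation touches a constant number of links, but reaching the right node in the first place still requires a sequential walk.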


Search PPD



Linear Search PPD

• The algorithm starts with the first element, compares it with the given key value, and reports a match if they are equal.
• If it does not match, then it proceeds sequentially, comparing each element of the list with the given key until a match is found or the full list is traversed.
Let the given input list be inputArr = [‘a’,‘c’,‘a’,‘d’,‘e’,‘m’,‘i’,‘c’,‘s’] and the search key be ‘i’.

Figure: Linear Search Example

Linear Search (2) PPD

Python Code for Linear Search:
------------------------------------------------------
def linSearch(inputArr, k):
    # Compare k with each element in turn; return the first matching index.
    for i in range(len(inputArr)):
        if inputArr[i] == k:
            return i
    return -1  # k is not present

inputArr = ['a', 'c', 'a', 'd', 'e', 'm', 'i', 'c', 's']
k = 'i'
index = linSearch(inputArr, k)
if index != -1:
    print("Element found at " + str(index))


Binary Search PPD

• The input for the algorithm is a sorted list.
• The algorithm compares the key k with the middle element in the list.
• If the key matches, then it returns the index.
• If the key does not match and is greater than the middle element, then the new list is the list to the right of the middle element.
• If the key does not match and is less than the middle element, then the new list is the list to the left of the middle element.
Let the given input list be inputArr = [‘a’,‘a’,‘c’,‘c’,‘d’,‘e’,‘i’,‘m’,‘s’] and the search key be ‘i’.

Figure: Binary Search Example

Binary Search PPD

Python Code for Binary Search:
------------------------------------------------------
def binSearch(inputArr, k):
    low = 0
    high = len(inputArr) - 1
    while low <= high:
        mid = (high + low) // 2        # floor division
        if inputArr[mid] < k:          # k can only be to the right of mid
            low = mid + 1
        elif inputArr[mid] > k:        # k can only be to the left of mid
            high = mid - 1
        else:                          # k is present at mid
            return mid
    return -1                          # the element is not present

inputArr = ['a', 'a', 'c', 'c', 'd', 'e', 'i', 'm', 's']
k = 'i'
index = binSearch(inputArr, k)
if index != -1:
    print("Element found at position " + str(index + 1))
else:
    print("Not found")
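As a cross-check, the same lookup can be sketched with Python's standard `bisect` module, which performs the halving internally. This is an alternative to the hand-written loop rather than the method of the slides; `bin_search_bisect` is an illustrative name.

```python
from bisect import bisect_left


def bin_search_bisect(sorted_arr, k):
    """Return the index of the leftmost occurrence of k in sorted_arr, or -1 if absent."""
    i = bisect_left(sorted_arr, k)    # first position where k could be inserted
    if i < len(sorted_arr) and sorted_arr[i] == k:
        return i
    return -1


input_arr = ['a', 'a', 'c', 'c', 'd', 'e', 'i', 'm', 's']
print(bin_search_bisect(input_arr, 'i'))
print(bin_search_bisect(input_arr, 'b'))
```

When the key occurs more than once, this variant always reports the leftmost position, whereas the hand-written loop returns whichever matching position the halving reaches first.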


Common Data Structure Operations

Source: Know Thy Complexities! (06-Apr-2021)


Module Summary

• Introduced Data Structures
• Reviewed array, list, stack, queue
• Reviewed linear and binary search

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.


Database Management Systems
Module 38: Algorithms and Data Structures/3: Data Structures

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
[email protected]


Module Recap PPD

• Introduced Data Structures
• Defined Linear Data Structure
• Reviewed array, list, stack, queue
• Reviewed linear and binary search


Module Objectives PPD

• Introducing Non-linear Data Structures - graph, tree, hash table
• Exploring Binary Search Tree
• Comparing Linear and Non-Linear Data Structures


Module Outline PPD

• Non-linear Data Structures
• Binary Search Trees
• Comparison of Linear and Non-Linear Data Structures


Data Structure PPD

• Data structure: A data structure specifies the way of organizing and storing in-memory data that enables efficient access and modification of the data.
◦ Linear Data Structures
◦ Non-linear Data Structures
• Most data structures have a container for the data and a set of typical operations that they need to perform
• For applications relating to data management, the key operations are:
◦ Create
◦ Insert
◦ Delete
◦ Find / Search
◦ Close
• Efficiency is measured in terms of time and space taken for these operations


Non-linear Data Structures PPD



Non-linear Data Structures: Why? PPD

• From the study of Linear data structures in the last module, we can make the following summary observations:
◦ All of them have the space complexity O(n), which is optimal. However, the actual used space may be lower in an array, while a linked list has an overhead of 100% (double)
◦ All of them have complexities that are identical for Worst as well as Average case
◦ All of them offer satisfactory complexity for some operations while being unsatisfactory on the others

                 Array                     Linked List
           Unordered    Ordered      Unordered    Ordered
Access     O(1)         O(1)         O(n)         O(n)
Insert     O(n)         O(n)         O(1)         O(1)
Delete     O(n)         O(n)         O(1)         O(1)
Search     O(n)         O(lg n)      O(n)         O(n)

• Non-Linear data structures can be used to trade off between these extremes and achieve a balanced, good performance for all operations

Non-linear Data Structures (2) PPD

• Non-linear data structures are those data structures in which data items are not arranged in a sequence and each element may have multiple paths to connect to other elements.
• Unlike linear data structures, in which each element is directly connected with at most two neighbouring elements (previous and next elements), an element of a non-linear data structure may be connected with more than two elements.
• The elements don’t have a single path to connect to the other elements but have multiple paths. Traversing through the elements is not possible in one run as the data is non-linearly arranged.
• Common Non-Linear Data Structures include:
◦ Graph: Undirected or Directed, Unweighted or Weighted, and variants
◦ Tree: Rooted or Unrooted, Binary or n-ary, Balanced or Unbalanced, and variants
◦ Hash Table: Array with lists (coalesced chains) and one or more hash functions
◦ Skip List: Multi-layered interconnected linked lists
◦ and so on

Non-linear Data Structures (3): Graph PPD

• Graph: A graph G is a collection of vertices V (which store the elements) and connecting
edges (links) E between vertices: G = ⟨V, E⟩, where E ⊆ V × V

• A graph may be:
◦ Undirected or Directed
◦ Unweighted or Weighted
◦ Cyclic or Acyclic
◦ Disconnected or Connected
◦ and so on
• Examples of a graph include:
◦ ER Diagram
◦ Network: Electrical, Water
◦ Friendships in Facebook
◦ Knowledge Graph
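
The definition G = ⟨V, E⟩ with E ⊆ V × V can be sketched in Python as follows. The sample vertices and edges and the helper name `adjacency` are illustrative assumptions, not part of the slides.

```python
from collections import defaultdict

# Hypothetical sample graph: V holds the elements, E is a set of ordered pairs from V x V.
V = {"A", "B", "C", "D"}
E = {("A", "B"), ("A", "C"), ("C", "D")}


def adjacency(vertices, edges, directed=True):
    """Build an adjacency-list view of the graph <V, E>."""
    adj = defaultdict(set)
    for v in vertices:
        adj[v]  # touch every vertex so that isolated vertices also appear
    for u, w in edges:
        if u not in vertices or w not in vertices:
            raise ValueError("every edge must be a pair from V x V")
        adj[u].add(w)
        if not directed:
            adj[w].add(u)  # an undirected edge can be followed in both directions
    return dict(adj)


directed_view = adjacency(V, E, directed=True)
undirected_view = adjacency(V, E, directed=False)
```

The same edge set serves both the directed and the undirected reading; only the way the adjacency lists are filled differs.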

Database Management Systems Partha Pratim Das 38.9


Non-linear Data Structures (4): Tree PPD

• Tree: A connected acyclic graph representing a hierarchical relationship

• A tree may be:
◦ Rooted or Unrooted
◦ Binary or n-ary
◦ Balanced or Unbalanced
◦ Disconnected (forest) or Connected
◦ and so on
• Examples of a tree include:
◦ Composite Attributes
◦ Family Genealogy
◦ Search Trees
◦ and so on

Database Management Systems Partha Pratim Das 38.10


Non-linear Data Structures (5): Tree PPD

• Root: The node at the top of the tree is called the root. There is only one root per tree
and exactly one path from the root node to any node. A is the root node.
• Parent: The node which is the immediate predecessor of a node is called its parent node. In the
given tree, B is the parent of E. Every node, except the root, has a unique parent
• Child: A node which is the immediate descendant of a node: D, E and F are the child nodes of B
• Leaf: A node which does not have any child node: I, J and K are leaf nodes

Database Management Systems Partha Pratim Das 38.11


Non-linear Data Structures (6): Tree PPD

• Internal Node: A node which has at least one child is called an internal node
• Subtree: The subtree of a node is the tree rooted at that node
• Path: A path is a sequence of nodes along the edges of a tree
• Siblings: Nodes having the same parent: D, E and F are siblings
• Arity: Number of children of a node. B has arity 3, E has arity 2, G has arity 1, and D
has arity 0 (Leaf)

The maximum arity of a node is defined as the arity of the tree.


Database Management Systems Partha Pratim Das 38.12
Non-linear Data Structures (7): Tree PPD

• Levels: The root node is said to be at Level 0, the children of the root node are at
Level 1, the children of the nodes at Level 1 are at Level 2, and so on.
The level of a node is the length of the path (number of links), or distance, of the node from the root
node. So, the level of A is 0, the level of C is 1, the level of G is 2, and the level of J is 3.
• Height: Maximum level in a tree
• Binary Tree: A tree in which each node can have at most 2 children. It has arity 2.

Database Management Systems Partha Pratim Das 38.13


Non-linear Data Structures (8): Tree PPD

• Fact 1: A tree with n nodes has n − 1 edges
• Fact 2: The maximum number of nodes at level l of a binary tree is $2^l$
• Fact 3: If h is the height of a binary tree of n nodes, then:
◦ $h + 1 \le n \le 2^{h+1} - 1$
◦ $\lceil \lg(n + 1) \rceil - 1 \le h \le n - 1$
◦ $O(\lg n) \le h \le O(n)$
◦ For a k-ary tree, $O(\log_k n) \le h \le O(n)$
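
Facts 2 and 3 can be checked numerically with a small sketch like the one below. The function names are hypothetical; the integer identity used for the lower bound is noted in the comments.

```python
def max_nodes_at_level(level):
    """Fact 2: a binary tree has at most 2^level nodes at the given level."""
    return 2 ** level


def height_bounds(n):
    """Fact 3: (minimum, maximum) height of a binary tree with n >= 1 nodes."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # For integers n >= 1, ceil(lg(n + 1)) - 1 equals floor(lg n) = n.bit_length() - 1,
    # which avoids floating-point rounding.
    min_height = n.bit_length() - 1
    max_height = n - 1  # fully skewed tree: one node per level
    return min_height, max_height
```

For example, 7 nodes fit in a tree of height 2 (levels of 1, 2 and 4 nodes), while a chain of 7 nodes has height 6.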

Database Management Systems Partha Pratim Das 38.14


Non-linear Data Structures (9): Hash Table (Module 44) PPD

• Hash Table (Hash Map): implements an associative array abstract data type, a
structure that can map keys to values by using a hash function to compute an index
(hash code) into an array of buckets or slots, from which the desired value can be found

• A hash table may be using:
◦ Static or Dynamic Schemes
◦ Open Addressing
◦ 2-Choice Hashing
◦ and so on
• Examples of a hash table include:
◦ Associative arrays
◦ Database indexing
◦ Caches
◦ and so on
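
A minimal sketch of the idea, assuming separate chaining (one list per bucket) and Python's built-in `hash` as the hash function. The class name and the sample keys are illustrative; the other schemes listed above (open addressing, 2-choice hashing) would resolve collisions differently.

```python
class ChainedHashTable:
    """Toy hash map: an array of buckets, each bucket a list of (key, value) pairs."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # The hash function maps a key to a bucket index (the hash code).
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default


table = ChainedHashTable()
table.put("CS101", "Databases")
table.put("CS102", "Algorithms")
table.put("CS101", "DBMS")  # same key: the value is replaced
```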
Database Management Systems Partha Pratim Das 38.15
Binary Search Tree PPD

Binary Search Tree

Database Management Systems Partha Pratim Das 38.16


Binary Search and Binary Search Tree PPD

• During the study of linear data structures, we observed that
◦ Binary search is efficient in searching for a key: O(lg n). However,
▷ it needs to be performed on a sorted array, and
▷ the array makes insertion and deletion expensive at O(n)
◦ The linked list, on the other hand, is efficient in insertion and deletion at O(1), while
it makes the search expensive at O(n).
▷ O(1) insert / delete is possible because we just need to manipulate pointers and
not physically move data
• Using non-linearity, specifically (binary) trees, we can combine the benefits of both
• Note that once an array is sorted, we know the order in which its elements may be
checked (for any key) during a search
• As the binary search splits the array, we can conceptually consider the Middle Element
to be the Root of a tree and the left (right) sub-array to be its left (right) sub-tree
• Progressing recursively, we have a Binary Search Tree
Database Management Systems Partha Pratim Das 38.17
Binary Search and Binary Search Tree PPD

• Consider the data set:

10   12   14   17   19   22   25
LL   L    LR   M    RL   R    RR

• Search order is:
◦ First: M
◦ Second: L or R
◦ Third:
▷ For L: LL or LR
▷ For R: RL or RR
◦ Recur ...
• Put as a tree:
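
The search order above can be turned into a tree mechanically: take the middle element as the root and recur on the two halves. The sketch below does this for the data set of this slide; the `Node` class and the function name `build_from_sorted` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_from_sorted(keys: Sequence[int]) -> Optional[Node]:
    """Make the middle element the root and recur on the two sub-arrays."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid],
                build_from_sorted(keys[:mid]),
                build_from_sorted(keys[mid + 1:]))


root = build_from_sorted([10, 12, 14, 17, 19, 22, 25])
```

Here the root is M = 17, its children are L = 12 and R = 22, and the leaves are LL = 10, LR = 14, RL = 19 and RR = 25.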

Database Management Systems Partha Pratim Das 38.18


Binary Search Tree PPD

• Binary Search Tree (BST): A binary tree in which every node satisfies the following:
◦ The value of each node in its left sub-tree is less than the value of the node
◦ The value of each node in its right sub-tree is greater than the value of the node

• Structure of a BST node: Each node consists of an element (X), a link to the left
child or the left subtree (LC), and a link to the right child or the right subtree (RC)

Database Management Systems Partha Pratim Das 38.19


Binary Search Tree (2) PPD

• Example: Obtain the BST by inserting the following values:
E, L, P, H, A, N, T.
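
A sketch of the insertions for this example, assuming the node layout described earlier (element X, links LC and RC) and ordinary alphabetical comparison of the letters. Ignoring duplicate keys is a choice made for this sketch only.

```python
class Node:
    """BST node: element x, left-child link lc, right-child link rc."""

    def __init__(self, x):
        self.x = x
        self.lc = None
        self.rc = None


def insert(root, x):
    """Insert x into the BST rooted at root and return the (possibly new) root."""
    if root is None:
        return Node(x)
    if x < root.x:
        root.lc = insert(root.lc, x)
    elif x > root.x:
        root.rc = insert(root.rc, x)
    # equal keys are ignored in this sketch
    return root


def inorder(root):
    """Left subtree, node, right subtree: returns the keys in sorted order."""
    if root is None:
        return []
    return inorder(root.lc) + [root.x] + inorder(root.rc)


root = None
for value in ["E", "L", "P", "H", "A", "N", "T"]:
    root = insert(root, value)
```

With this insertion order E becomes the root, A its left child and L its right child; H and P hang below L, and N and T below P.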

Database Management Systems Partha Pratim Das 38.20




Searching a key in BST PPD

search(root, key)
0. If root is empty then
   0.1 Key not found and return
1. Compare the key with the element at root.
   1.1 If the key is equal to root’s element then
       1.1.1 Element found and return
   1.2 else if the key is less than the root’s element
       1.2.1 search(root.lc, key) #search on the left subtree
   1.3 else: #if the key is greater than the root’s element
       1.3.1 search(root.rc, key) #search on the right subtree
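
The steps above can be sketched in Python as follows. The `Node` class and the sample tree are assumptions for illustration; the field names `x`, `lc` and `rc` follow the node structure used in this module.

```python
class Node:
    def __init__(self, x, lc=None, rc=None):
        self.x, self.lc, self.rc = x, lc, rc


def search(root, key):
    """Return the node holding key, or None if the key is absent."""
    if root is None:
        return None                   # empty subtree: key not found
    if key == root.x:
        return root                   # element found
    if key < root.x:
        return search(root.lc, key)   # search on the left subtree
    return search(root.rc, key)       # search on the right subtree


# Hypothetical sample tree: 17 at the root, 12 and 22 as its children.
tree = Node(17, Node(12, Node(10), Node(14)), Node(22, Node(19), Node(25)))
```

Each call descends one level, so the number of comparisons is bounded by the height of the tree plus one.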

Database Management Systems Partha Pratim Das 38.22


Searching a key in BST (2) PPD

• Searching a key in a BST is O(h), where h is the height of the tree
• Worst Case
◦ The BST is a skewed binary search tree (all the nodes except the leaf have
only one child)
◦ This can happen if keys are inserted in sorted order
◦ Height (h) of the BST having n elements becomes n − 1
◦ Time complexity of search in BST becomes O(n)
• Best Case
◦ The BST is a balanced binary search tree
◦ This is possible
▷ if keys are inserted in purely randomized order (balanced in expectation), or
▷ if the tree is explicitly balanced after every insertion
◦ Height (h) of the binary search tree becomes O(lg n)
◦ Time complexity of search in BST becomes O(lg n)
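
The two cases can be observed directly by measuring the height after different insertion orders. This is a sketch under assumptions: the helper names are hypothetical, n = 15 is an arbitrary sample size, and the "balanced" order is simply the level-by-level order of a perfectly balanced tree rather than the result of a rebalancing algorithm.

```python
class Node:
    def __init__(self, x):
        self.x, self.lc, self.rc = x, None, None


def insert(root, x):
    """Iterative BST insertion; returns the root."""
    if root is None:
        return Node(x)
    cur = root
    while True:
        if x < cur.x:
            if cur.lc is None:
                cur.lc = Node(x)
                return root
            cur = cur.lc
        else:
            if cur.rc is None:
                cur.rc = Node(x)
                return root
            cur = cur.rc


def height(root):
    """Height = maximum level; an empty tree is given height -1 by convention."""
    if root is None:
        return -1
    return 1 + max(height(root.lc), height(root.rc))


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


n = 15
skewed = build(range(1, n + 1))  # keys inserted in sorted order
balanced = build([8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15])
```

The sorted order yields a chain of height n − 1 = 14, while the second order yields height 3 for the same 15 keys.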

Database Management Systems Partha Pratim Das 38.23


Comparison of Linear and Non-Linear Data Structures PPD

Comparison of Linear and Non-Linear Data
Structures

Database Management Systems Partha Pratim Das 38.24


Linear and Non-Linear Data Structures


Source: Difference between Linear and Non-linear Data Structures (11-Aug-2021)

Database Management Systems Partha Pratim Das 38.25


Comparison of Linear and Non-Linear Data Structures

Linear Data Structure:
• Data elements are arranged in a linear order, where each element is attached to its previous and next adjacent elements
• A single level is involved
• Implementation is easy in comparison to non-linear data structures
• Data elements can be traversed in one way only
• Examples: array, stack, queue, linked list, and their variants

Non-Linear Data Structure:
• Data elements are arranged in a hierarchical or networked manner
• Multiple levels are involved
• Implementation is complex in comparison to linear data structures
• Data elements can be traversed in multiple ways. Various traversals may be defined to linearize the data: Depth-First, Breadth-First, Inorder, Preorder, Postorder, etc.
• Examples: trees, graphs, skip list, hash map, and several variants

Database Management Systems Partha Pratim Das 38.26


Complexity of Common Data Structure Operations


Source: Know Thy Complexities! (06-Apr-2021)

Database Management Systems Partha Pratim Das 38.27


Module Summary

• Introduced Non-linear Data Structures - graph, tree, hash table
• Studied Binary Search Tree as an adaptation of binary search
• Compared Linear and Non-Linear Data Structures

Slides used in this presentation are borrowed from http://db-book.com/ with kind
permission of the authors.
Edited and new slides are marked with “PPD”.

Database Management Systems Partha Pratim Das 38.28


Module 39

Partha Pratim
Das

Objectives &
Outline Database Management Systems
Physical Storage
Flash memory
Module 39: Storage and File Structure/1: Physical Storage
Magnetic Disk
Optical Storage
Tape Storage
Storage Hierarchy

Magnetic Disk Partha Pratim Das


Magnetic Tapes

Cloud Storage Department of Computer Science and Engineering


Cloud vs. Storage Indian Institute of Technology, Kharagpur
Other Storage
[email protected]
Optical Disk
Flash Drives
SD & SSD

Future of Storage
DNA Digital
Quantum Memory

Module Summary Database Management Systems Partha Pratim Das 39.1


Module Recap PPD

• Introduced Non-linear Data Structures - graph, tree, hash table
• Studied Binary Search Tree as an adaptation of binary search
• Compared Linear and Non-Linear Data Structures

Database Management Systems Partha Pratim Das 39.2


Module Objectives PPD

• To introduce various Physical Storage Media as high-volume, fast, reliable and inexpensive
options for data storage for databases
• To understand the options of Tertiary Storage for high-volume, inexpensive backup

Database Management Systems Partha Pratim Das 39.3


Module Outline PPD

• Physical Storage Media
• Magnetic Disks
• Magnetic Tape
• Other Storage
• Future of Storage

Database Management Systems Partha Pratim Das 39.4


Physical Storage Media PPD

Physical Storage Media

Database Management Systems Partha Pratim Das 39.5


Classification of Physical Storage Media

• Speed with which data can be accessed
• Cost per unit of data
• Reliability
◦ data loss on power failure or system crash
◦ physical failure of the storage device
• Can differentiate storage into:
◦ volatile storage: loses contents when power is switched off
◦ non-volatile storage:
▷ Contents persist even when power is switched off
▷ Includes secondary and tertiary storage, as well as battery-backed-up
main memory

Database Management Systems Partha Pratim Das 39.6


Physical Storage Media

• Cache
◦ fastest and most costly form of storage
◦ volatile
◦ managed by the computer system hardware
• Main memory
◦ fast access (10s to 100s of nanoseconds (ns); 1 ns = $10^{-9}$ seconds)
◦ generally too small (or too expensive) to store the entire database
▷ capacities of up to a few gigabytes widely used currently
▷ capacities have gone up and per-byte costs have decreased steadily and rapidly
(roughly a factor of 2 every 2 to 3 years)
◦ volatile
▷ contents of main memory are usually lost if a power failure or system crash occurs

Database Management Systems Partha Pratim Das 39.7


Physical Storage Media (2): Flash memory

• Data survives power failure
• Data can be written at a location only once, but the location can be erased and written to
again
◦ Can support only a limited number (10K – 1M) of write/erase cycles
◦ Erasing of memory has to be done to an entire bank of memory
• Reads are roughly as fast as main memory
• But writes are slow (few microseconds), and erase is slower
• Widely used in embedded devices such as digital cameras, phones, and USB keys

Database Management Systems Partha Pratim Das 39.8


Physical Storage Media (3): Magnetic Disk

• Data is stored on a spinning disk, and read/written magnetically
• Primary medium for the long-term storage of data
◦ typically stores the entire database
• Data must be moved from disk to main memory for access, and written back for
storage - much slower access than main memory
• Direct-access
◦ possible to read data on disk in any order, unlike magnetic tape
• Capacities range up to roughly 16–32 TB
◦ Much larger capacity and much lower cost/byte than main memory/flash memory
◦ Growing constantly and rapidly with technology improvements (factor of 2 to 3
every 2 years)
• Survives power failures and system crashes
◦ disk failure can destroy data, but is rare

Database Management Systems Partha Pratim Das 39.9


Physical Storage Media (4): Optical Storage

• Non-volatile; data is read optically from a spinning disk using a laser
• CD-ROM (640 MB) and DVD (4.7 to 17 GB) are the most popular forms
• Blu-ray disks: 27 GB to 54 GB
• Write-once, read-many (WORM) optical disks are used for archival storage (CD-R, DVD-R,
DVD+R)
• Multiple-write versions are also available (CD-RW, DVD-RW, DVD+RW, and DVD-RAM)
• Reads and writes are slower than with magnetic disk
• Juke-box systems, with large numbers of removable disks, a few drives, and a
mechanism for automatic loading/unloading of disks, are available for storing large volumes
of data

Database Management Systems Partha Pratim Das 39.10


Physical Storage Media (5): Tape Storage

• Non-volatile; used primarily for backup (to recover from disk failure), and for archival
data
• Sequential-access
◦ much slower than disk
• Very high capacity (40 to 300 TB tapes available)
• Tape can be removed from the drive, so storage costs are much cheaper than disk, but drives are
expensive
• Tape jukeboxes are available for storing massive amounts of data
◦ hundreds of terabytes (TB) (1 TB = $10^{12}$ bytes) to even multiple petabytes (PB)
(1 PB = $10^{15}$ bytes)

Database Management Systems Partha Pratim Das 39.11


Storage Hierarchy PPD

Database Management Systems Partha Pratim Das 39.12


Magnetic Disk PPD

Magnetic Disk

Database Management Systems Partha Pratim Das 39.13


Magnetic Disk: Mechanism PPD

NOTE: Diagram is schematic, and simplifies the structure of actual disk drives

Database Management Systems Partha Pratim Das 39.14


Magnetic Disk (2): Mechanism

• Read-write head
◦ Positioned very close to the platter surface
◦ Reads or writes magnetically encoded information
• Surface of platter divided into circular tracks
◦ Over 50K-100K tracks per platter on typical hard disks
• Each track is divided into sectors
◦ A sector is the smallest unit of data read or written
◦ Sector size typically 512 bytes
◦ Sectors per track: 500 to 1K (inner tracks) to 1K to 2K (outer tracks)
• To read/write a sector
◦ disk arm swings to position head on right track
◦ platter spins: Read/Write as sector passes under head
• Head-disk assemblies
◦ multiple disk platters on a single spindle (1 to 5 usually)
◦ one head per platter, mounted on a common arm.
• Cylinder i consists of the i-th track of all the platters

Database Management Systems Partha Pratim Das 39.15


Magnetic Disks (3): Disk Controller, Subsystems, and Interfaces

• Disk Controller: interfaces between the computer system and the disk drive hardware
◦ Accepts high-level commands to read or write a sector
◦ Initiates actions such as moving the disk arm to the right track and reading or writing the data
◦ Computes and attaches checksums to each sector to verify that data is read back correctly
◦ Ensures successful writing by reading back the sector after writing it
◦ Performs remapping of bad sectors
• Disk Subsystem:

• Disk Interface Standards Families: ATA, SATA, SCSI, SAS, and several variants
• Storage Area Networks (SAN) connect disks by a high-speed network to a number of servers
• Network Attached Storage (NAS) provides a file system interface using a networked file system protocol

Database Management Systems Partha Pratim Das 39.16
Magnetic Disks (4): Performance Measures

• Access Time: time from the issue of a read or write request to the start of data transfer:
◦ Seek Time: time to reposition the arm over the correct track
▷ Avg. seek time is 1/2 the worst-case seek time; it would be 1/3 if all tracks had the same number of sectors
▷ 4 to 10 milliseconds on typical disks
◦ Rotational Latency: time for the sector to be accessed to appear under the head
▷ Average latency is 1/2 of the worst-case latency
▷ Worst case (one full rotation) is 4 to 11 milliseconds on typical disks (5400 to 15000 rpm)
• Data-transfer Rate: the rate at which data can be retrieved from or stored to the disk
◦ 25 to 100 MB per second max rate, lower for inner tracks
◦ Multiple disks may share a controller, so the rate that the controller can handle is also important
• Mean Time To Failure (MTTF): Avg. time the disk is expected to run continuously without any failure
◦ Typically 3 to 5 years
◦ Probability of failure of new disks is quite low, corresponding to a theoretical MTTF of 500,000 to
1,200,000 hours for a new disk. For example, an MTTF of 1,200,000 hours for a new disk means
that given 1000 relatively new disks, on average one will fail every 1200 hours
◦ MTTF decreases as the disk ages
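
The arithmetic behind these measures can be sketched as follows. The function names are hypothetical, and the model is deliberately simple: it assumes one full rotation as the worst-case rotational latency and independent, identical disks for the MTTF example.

```python
def rotation_time_ms(rpm):
    """Time for one full rotation, i.e. the worst-case rotational latency."""
    return 60_000.0 / rpm  # 60,000 ms per minute divided by rotations per minute


def avg_rotational_latency_ms(rpm):
    """On average the target sector is half a rotation away from the head."""
    return rotation_time_ms(rpm) / 2


def avg_access_time_ms(avg_seek_ms, rpm):
    """Access time = seek time + rotational latency (both taken as averages)."""
    return avg_seek_ms + avg_rotational_latency_ms(rpm)


def hours_between_failures(mttf_hours, num_disks):
    """Expected time between failures somewhere in a pool of identical disks."""
    return mttf_hours / num_disks
```

For instance, a 15000 rpm disk completes a rotation in 4 ms, and 1000 disks with an MTTF of 1,200,000 hours each give one expected failure every 1200 hours, matching the figures quoted above.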

Database Management Systems Partha Pratim Das 39.17


Magnetic Tapes PPD

Magnetic Tapes

Database Management Systems Partha Pratim Das 39.18


Magnetic Tapes

• Hold large volumes of data and provide high transfer rates
◦ Tape Formats
▷ Few GB for DAT (Digital Audio Tape) format
▷ 10-40 GB with DLT (Digital Linear Tape) format
▷ 100 GB+ with Ultrium format, and
▷ 330 GB with Ampex helical scan format
◦ Transfer rates from a few to 10s of MB/s
• Tapes are cheap, but the cost of drives is very high
• Very slow access time in comparison to magnetic and optical disks
◦ Limited to sequential access
◦ Some formats (Accelis) provide faster seek (10s of seconds) at the cost of lower
capacity
• Used mainly for backup, for storage of infrequently used information, and as an off-line
medium for transferring information from one system to another.
• Tape jukeboxes used for very large capacity storage
◦ Multiple petabytes ($10^{15}$ bytes)

Database Management Systems Partha Pratim Das 39.19


Cloud Storage PPD

Cloud Storage

Database Management Systems Partha Pratim Das 39.20


Cloud Storage

• Cloud storage is purchased from a third-party cloud vendor who owns and operates data
storage capacity and delivers it over the Internet in a pay-as-you-go model
• These cloud storage vendors manage capacity, security and durability to make data
accessible to applications all around the world
• Applications access cloud storage through traditional storage protocols or directly via an
API
• Many vendors offer complementary services designed to help collect, manage, secure
and analyze data at massive scale. Various available options for cloud storage are:
◦ Google Drive
◦ Amazon Drive
◦ OneDrive by Microsoft
◦ Evernote
◦ Dropbox
◦ and so on

Database Management Systems Partha Pratim Das 39.21


Cloud Storage vs. Traditional Storage

Parameters | Cloud Storage | Traditional Storage
Cost | Cloud storage is cheaper per GB than using external drives. | The hardware and infrastructure costs are high, and adding more space and upgrading only adds extra costs.
Reliability | Cloud storage is highly reliable, as it takes less time to become functional. | Traditional storage requires high initial effort and is less reliable.
File Sharing | Cloud storage supports file sharing dynamically, as files can be shared anywhere with network access. | Traditional storage requires physical drives to share data, and a network has to be established between both parties.
Accessibility | Cloud storage gives you access to your files from anywhere. | Restricted to local access.
Backup/Recovery | Very safe from on-site disaster. In case of a hard drive failure or other hardware malfunction, you can access your files on the cloud, which acts as a backup solution for your local storage on physical drives. | Data that is stored locally is much more susceptible to unexpected events, and local storage and local backups could be easily lost.

Database Management Systems Partha Pratim Das 39.22


Other Storage PPD

Other Storage

Database Management Systems Partha Pratim Das 39.23


Optical Disks

• Compact disk-read only memory (CD-ROM)
◦ Removable disks, 640 MB per disk
◦ Seek time about 100 msec (optical read head is heavier and slower)
◦ Higher latency (3000 RPM) and lower data-transfer rates (3-6 MB/s) compared to
magnetic disks
• Digital Video Disk (DVD)
◦ DVD-5 holds 4.7 GB, and DVD-9 holds 8.5 GB
◦ DVD-10 and DVD-18 are double-sided formats with capacities of 9.4 GB and 17 GB
◦ Blu-ray DVD: 27 GB (54 GB for double-sided disk)
◦ Slow seek time, for same reasons as CD-ROM
• Record-once versions (CD-R and DVD-R) are popular
◦ data can only be written once, and cannot be erased.
◦ high capacity and long lifetime; used for archival storage
◦ Multi-write versions (CD-RW, DVD-RW, DVD+RW and DVD-RAM) also available

Database Management Systems Partha Pratim Das 39.24


Flash Drives

• Flash drives are often referred to as pen drives, thumb drives, or jump drives. They
have completely replaced floppy drives for portable storage. Considering how large and
inexpensive they have become, they have nearly replaced CDs and DVDs for data
storage purposes.
• USB flash drives are removable and rewritable storage devices that, as the name
suggests, require a USB port for connection and utilize non-volatile flash memory
technology.
• The storage space in USB flash drives is quite large, with sizes ranging from 128 MB to
2 TB.
• The USB standard a flash drive is built around determines a number of things
about its potential performance, including the maximum transfer rate.

Database Management Systems Partha Pratim Das 39.25


Secure Digital Cards (SD cards)

• A Secure Digital (SD, in short) card is a type of removable memory card used to read and write large quantities of data.
• Due to their relatively small size, SD cards are widely used in mobile electronics, cameras, smart devices, video game consoles, and more.
• There are several types of SD cards sold and used today:

Card Type   Year of Debut   Capacity       Supported Devices
SD          1999            128MB to 2GB   All host devices that support SD, SDHC, SDXC
SDHC        2006            4GB to 32GB    All host devices that support SDHC, SDXC
SDXC        2009            64GB to 2TB    All host devices that support SDXC

Card Type   Capacity       File System   Remarks
SD          128MB to 2GB   FAT16         FAT16 supports 16 MB to 2 GB
SDHC        4GB to 32GB    FAT32         FAT32 can support up to 16 TB
SDXC        64GB to 2TB    exFAT         exFAT is non-standard, supports files larger than 4 GB

Source: CARDS - WHAT ARE THE DIFFERENCES BETWEEN FAT16, FAT32 AND EXFAT FILE SYSTEMS?

Flash Storage

• NOR flash vs NAND flash
• NAND flash
  ◦ used widely for storage, since it is much cheaper than NOR flash
  ◦ requires page-at-a-time read (page: 512 bytes to 4 KB)
  ◦ transfer rate around 20 MB/sec
  ◦ solid state disks: use multiple flash storage devices to provide a higher transfer rate of 100 to 200 MB/sec
  ◦ erase is very slow (1 to 2 ms)
    ▷ erase block contains multiple pages
    ▷ remapping of logical page addresses to physical page addresses avoids waiting for erase
      − translation table tracks mapping
      − also stored in a label field of flash page
      − remapping carried out by flash translation layer
    ▷ after 100,000 to 1,000,000 erases, erase block becomes unreliable and cannot be used
      − wear leveling

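The remapping and wear-leveling ideas above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class name FlashTranslationLayer, its methods, and the page-granularity erase are all hypothetical (real devices erase whole blocks of pages, and the slides do not prescribe any particular design).

```python
class FlashTranslationLayer:
    """Toy FTL: every write of a logical page goes to a fresh physical page."""

    def __init__(self, num_physical_pages):
        self.mapping = {}                            # translation table: logical -> physical
        self.free = list(range(num_physical_pages))  # erased pages ready for writing
        self.stale = set()                           # invalidated pages awaiting a slow erase
        self.data = {}                               # physical page -> contents
        self.erase_count = [0] * num_physical_pages  # wear per page (assumption: per-page erase)

    def write(self, logical, payload):
        if not self.free:
            raise RuntimeError("no erased page left; call garbage_collect() first")
        physical = self.free.pop(0)
        self.data[physical] = payload
        old = self.mapping.get(logical)
        if old is not None:
            self.stale.add(old)                      # old copy is invalidated, not erased now
        self.mapping[logical] = physical

    def read(self, logical):
        return self.data[self.mapping[logical]]

    def garbage_collect(self):
        """Erase stale pages in the background and return them to the free pool."""
        for physical in sorted(self.stale):
            self.erase_count[physical] += 1
            del self.data[physical]
            self.free.append(physical)
        self.stale.clear()
        # Wear leveling: hand out the least-erased pages first.
        self.free.sort(key=lambda p: self.erase_count[p])


ftl = FlashTranslationLayer(4)
ftl.write(7, b"a")
ftl.write(7, b"b")        # rewrite lands on a new physical page, no erase needed
ftl.garbage_collect()
```

The writer never waits for an erase: the rewrite of logical page 7 simply updates the translation table, and the slow erase happens later in garbage_collect().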
Solid-State Drives (SSD)

• SSDs replace traditional mechanical hard disks by using flash-based memory, which is significantly faster.
• SSDs speed up computers significantly due to their low read-access times and fast throughput.
• The idea of SSDs was introduced in 1978. An SSD is implemented using semiconductors and retains its data even when no power is supplied.
• SSDs are much faster than HDDs because they sustain far more input/output operations per second.
• Unlike HDDs, SSDs do not include any moving parts. SSDs can resist vibrations and high temperatures.


SSD vs. HDD

Parameters               SSD                                     HDD
Technology               Integrated circuit using flash memory   Mechanical parts, including spinning disks or platters
Access Time              0.1 ms                                  5.5-8.0 ms
Average Seek Time        0.08-0.16 ms                            < 10 ms
Speed (SATA II)          80-250 MB/sec                           65-85 MB/sec
Random I/O Performance   6000 io/s                               400 io/s
Backup rates             6 hours                                 20-24 hours
Reliability              Failure rate of less than 0.5%          Failure rate fluctuates between 2-5%
Energy Consumption       2 to 5 watts                            6 to 15 watts


Future of Storage PPD


Future of Storage: DNA Digital Storage

• DNA digital data storage is the process of encoding and decoding binary data to and from synthesized strands of DNA.
• While DNA as a storage medium has enormous potential because of its high storage density, its practical use is currently severely limited because of its high cost and very slow read and write times.
• Digital storage systems encode text, photos, or any other kind of information as a series of 0s and 1s. This same information can be encoded in DNA using the four nucleotides that make up the genetic code: A, T, G, and C. For example, G and C could be used to represent 0 while A and T represent 1.
• DNA has several other features that make it desirable as a storage medium; it is extremely stable and is fairly easy (but expensive) to synthesize and sequence.
• Also, because of its high density (each nucleotide, equivalent to up to two bits, is about 1 cubic nanometer), an exabyte (10^18 bytes) of data stored as DNA could fit in the palm of your hand.
• DNA Synthesis: A DNA synthesizer machine builds synthetic DNA strands matching the sequence of digital code.
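A small Python sketch makes the "up to two bits per nucleotide" figure concrete. The particular bit-pair-to-base mapping below is an arbitrary assumption chosen for illustration; practical schemes add error correction and avoid long repeated runs. The last lines redo the density arithmetic from the slide.

```python
# Hypothetical mapping of bit pairs to bases (two bits per nucleotide).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a nucleotide string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Inverse of encode(): recover the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"DB")           # 'D' = 01000100, 'B' = 01000010

# Density check: one exabyte at 2 bits per nucleotide, 1 nm^3 per nucleotide.
nucleotides = (10**18 * 8) // 2  # 4 * 10^18 nucleotides
volume_mm3 = nucleotides / 10**18  # 1 mm^3 = 10^18 nm^3
```

Under these idealized numbers an exabyte occupies about 4 cubic millimetres, which is consistent with the "palm of your hand" claim.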


Future of Storage: Quantum Memory

• Quantum memory is the quantum-mechanical version of ordinary computer memory
• Whereas ordinary memory stores information as binary states (represented by "1"s and "0"s), quantum memory stores a quantum state for later retrieval
• These states hold useful computational information known as qubits
• Quantum memory is essential for the development of many devices in quantum information processing applications such as quantum networks, quantum repeaters, linear optical quantum computation, or long-distance quantum communication
• Unlike the classical memory of everyday computers, the states stored in quantum memory can be in a quantum superposition, giving much more practical flexibility in quantum algorithms than classical information storage


Module Summary

• Understood the range of Physical Storage Media
• Studied the mechanism and performance of the Magnetic Disks
• Looked at the features of Magnetic Tape as tertiary storage
• Glimpsed through Other Storage including Optical Disk, Flash and SSD
• Considered the Future of Storage in terms of DNA and Quantum

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.


Database Management Systems
Module 40: Storage and File Structure/2: File Structure

Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
[email protected]

Module 40 sections:
• Objectives & Outline
• File Organization
  ◦ Fixed-Length Records
  ◦ Free Lists
  ◦ Variable-Length Records
• Organization of Records in Files
  ◦ Sequential
  ◦ Multi-Table
• Data Dictionary Storage
• Storage Access
  ◦ Buffer Manager
  ◦ Buffer Replacement Policy
• Module Summary


Module Recap PPD

• Understood the range of Physical Storage Media
• Studied the mechanism and performance of the Magnetic Disks
• Looked at the features of Magnetic Tape as tertiary storage
• Glimpsed through Other Storage including Optical Disk, Flash and SSD
• Considered the Future of Storage in terms of DNA and Quantum


Module Objectives PPD

• To become familiar with the organization of database files
• To understand how records and relations are organized in files
• To learn how databases keep their own information in Data-Dictionary Storage – the metadata database of a database
• To understand the mechanisms for fast access to a database store


Module Outline PPD

• File Organization
• Organization of Records in Files
• Data-Dictionary Storage
• Storage Access


File Organization PPD


File Organization

• A database is
  ◦ A collection of files. A file is
    ▷ A sequence of records. A record is
      − A sequence of fields
• One approach:
  ◦ assume record size is fixed
  ◦ each file has records of one particular type only
  ◦ different files are used for different relations
  ◦ This case is easiest to implement; will consider variable length records later
• A database file is partitioned into fixed-length storage units called blocks
  ◦ Blocks are units of both storage allocation and data transfer


Fixed-Length Records

• Simple approach:
  ◦ Store record i starting from byte n ∗ (i − 1), where n is the size of each record.
  ◦ Record access is simple but records may cross blocks
    ▷ Modification: do not allow records to cross block boundaries
• Deletion of record i from a file of N records: Alternatives:
  ◦ move records i + 1, · · · , N to i, · · · , N − 1
  ◦ move record N to i
  ◦ do not move records, but link all free records on a free list
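The placement rule and its block-boundary modification can be checked with a little arithmetic. The following Python sketch uses hypothetical function names and made-up sizes (40-byte records, 100-byte blocks) purely for illustration.

```python
def naive_offset(i, n):
    """Byte offset of record i (1-based) when records are packed back to back."""
    return n * (i - 1)


def block_aligned_location(i, n, block_size):
    """(block number, offset inside block) when records may not cross blocks."""
    per_block = block_size // n          # whole records that fit in one block
    block, slot = divmod(i - 1, per_block)
    return block, slot * n


# Record 3 would occupy bytes 80..119 and straddle the boundary at byte 100.
offset = naive_offset(3, 40)
location = block_aligned_location(3, 40, 100)
```

With the modification, record 3 starts at the beginning of block 1 instead; the price is that the last 20 bytes of every block stay unused.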


Deleting Record 3 with Compaction

[Figure: the file before deletion, and after deleting record 3 with compaction]


Deleting Record 3 with Moving last record

[Figure: the file before deletion, and after deleting record 3 and moving the last record into its place]


Free Lists

• Store the address of the first deleted record in the file header
• Use this first record to store the address of the second deleted record, and so on
• Consider these stored addresses as pointers since they point to the location of a record
• More space efficient representation: reuse space for normal attributes of free records to store pointers (No pointers stored in in-use records)
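A minimal in-memory sketch of a free list follows, assuming a hypothetical FixedLengthFile class in which each slot either holds a record or, once deleted, reuses its space to hold the address of the next free slot. Error handling (for example deleting the same slot twice) is deliberately left out.

```python
class FixedLengthFile:
    """Toy file of fixed-length slots with a free list threaded through deleted slots."""

    def __init__(self):
        self.slots = []        # ("rec", value) or ("free", address of next free slot)
        self.free_head = None  # file header: address of the first deleted record

    def insert(self, value):
        if self.free_head is not None:
            idx = self.free_head
            _, next_free = self.slots[idx]
            self.free_head = next_free       # unlink the slot from the free list
            self.slots[idx] = ("rec", value)
        else:
            idx = len(self.slots)            # no hole available: append at the end
            self.slots.append(("rec", value))
        return idx

    def delete(self, idx):
        # The freed slot stores the old head, so no extra space is needed.
        self.slots[idx] = ("free", self.free_head)
        self.free_head = idx


f = FixedLengthFile()
for name in "abcde":
    f.insert(name)
f.delete(1)
f.delete(3)                  # free list is now: header -> 3 -> 1 -> end
first = f.insert("x")        # reuses slot 3
second = f.insert("y")       # reuses slot 1
third = f.insert("z")        # list is empty again, so the file grows
```

No records are moved on deletion, and in-use records carry no pointer overhead.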


Variable-Length Records

• Variable-length records arise in database systems in several ways:
  ◦ Storage of multiple record types in a file
  ◦ Record types that allow variable lengths for one or more fields such as strings (varchar)
  ◦ Record types that allow repeating fields (used in some older data models)
• Attributes are stored in order
• Variable length attributes represented by fixed size (offset, length), with actual data stored after all fixed length attributes
• Null values represented by null-value bitmap
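One possible byte layout for such a record can be sketched with Python's struct module. The concrete choices here are assumptions made for illustration only: 16-bit offsets and lengths, a single 8-byte floating-point fixed attribute called salary, a one-byte bitmap (so at most eight variable-length fields), and the function names pack_record and unpack_record.

```python
import struct


def pack_record(var_fields, salary):
    """Layout: (offset, length) pairs | fixed attribute | null bitmap | variable data."""
    header_size = 4 * len(var_fields) + 8 + 1
    null_bitmap = 0
    pairs, payload = [], b""
    for pos, value in enumerate(var_fields):
        if value is None:
            null_bitmap |= 1 << pos          # mark the field as null
            pairs.append((0, 0))
        else:
            raw = value.encode("utf-8")
            pairs.append((header_size + len(payload), len(raw)))
            payload += raw
    head = b"".join(struct.pack("<HH", off, ln) for off, ln in pairs)
    return head + struct.pack("<d", salary) + bytes([null_bitmap]) + payload


def unpack_record(record, num_var):
    pairs = [struct.unpack_from("<HH", record, 4 * k) for k in range(num_var)]
    (salary,) = struct.unpack_from("<d", record, 4 * num_var)
    null_bitmap = record[4 * num_var + 8]
    fields = [
        None if (null_bitmap >> k) & 1 else record[off:off + ln].decode("utf-8")
        for k, (off, ln) in enumerate(pairs)
    ]
    return fields, salary


rec = pack_record(["10101", "Srinivasan", None], 65000.0)
fields, salary = unpack_record(rec, 3)
```

Because the (offset, length) pairs have a fixed size, any attribute can be located without scanning the ones before it.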


Variable-Length Records (2)

• Slotted Page header contains:
  ◦ number of record entries
  ◦ end of free space in the block
  ◦ location and size of each record
• Records can be moved around within a page to keep them contiguous with no empty space between them; entry in the header must be updated
• Pointers should not point directly to record - instead they should point to the entry for the record in header

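The slotted-page idea can be sketched as follows. This is a simplified illustration, not a faithful page format: the SlottedPage class is hypothetical, the header is kept as a Python list rather than inside the page bytes, and the space the header itself would occupy is ignored.

```python
class SlottedPage:
    """Toy slotted page: records are packed at the end, the header lists (offset, length)."""

    def __init__(self, size=64):
        self.size = size
        self.data = bytearray(size)
        self.slots = []          # header entries: [offset, length], or None if deleted
        self.free_end = size     # end of free space; records grow down from the page end

    def insert(self, record: bytes) -> int:
        if len(record) > self.free_end:
            raise ValueError("page full")
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append([self.free_end, len(record)])
        return len(self.slots) - 1           # slot number is the stable external pointer

    def read(self, slot: int) -> bytes:
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])

    def delete(self, slot: int) -> None:
        self.slots[slot] = None
        # Compact: copy out the survivors, then repack them against the page end.
        live = [(s, self.read(s)) for s, entry in enumerate(self.slots) if entry is not None]
        self.free_end = self.size
        for s, record in live:
            self.free_end -= len(record)
            self.data[self.free_end:self.free_end + len(record)] = record
            self.slots[s] = [self.free_end, len(record)]


page = SlottedPage(32)
a = page.insert(b"aaaa")
b = page.insert(b"bb")
c = page.insert(b"cccccc")
page.delete(b)               # record c moves, but its slot number stays valid
```

After the deletion, record c sits at a different byte offset, yet anything that refers to it by slot number c still finds it. That is why external pointers target the header entry rather than the record itself.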
Organization of Records in Files PPD


Organization of Records in Files

• Heap: A record can be placed anywhere in the file where there is space
• Sequential: Store records in sequential order, based on the value of the search key of each record
• Hashing: A hash function computed on some attribute of each record; the result specifies in which block of the file the record should be placed
• Records of each relation may be stored in a separate file. In a multitable clustering file organization records of several different relations can be stored in the same file
  ◦ Motivation: store related records on the same block to minimize I/O


Sequential File Organization

• Suitable for applications that require sequential processing of the entire file
• The records in the file are ordered by a search-key


Sequential File Organization (2)

• Deletion: Use pointer chains
• Insertion: Locate the position where the record is to be inserted
  ◦ if there is free space insert there
  ◦ if no free space, insert the record in an overflow block
  ◦ In either case, pointer chain must be updated
• Need to reorganize the file from time to time to restore sequential order


Multitable Clustering File Organization

Store several relations in one file using a multitable clustering file organization

[Figure: the department relation, the instructor relation, and the multitable clustering of department and instructor]

Multitable Clustering File Organization (2)

• good for queries involving department ⋈ instructor, and for queries involving one single department and its instructors
• bad for queries involving only department
• results in variable size records
• Can add pointer chains to link records of a particular relation


Data Dictionary Storage PPD


Data Dictionary Storage

Data Dictionary (also, System Catalog) stores metadata (data about data) such as:
• Information about relations
  ◦ names of relations
  ◦ names, types and lengths of attributes of each relation
  ◦ names and definitions of views
  ◦ integrity constraints
• User and accounting information, including passwords
• Statistical and descriptive data
  ◦ number of tuples in each relation
• Physical file organization information
  ◦ How relation is stored (sequential/hash/· · · )
  ◦ Physical location of relation
• Information about indices


Relational Representation of System Metadata

• Relational representation on disk
• Specialized data structures designed for efficient access, in memory
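As one concrete illustration of a catalog that is itself stored relationally, SQLite exposes its schema through the sqlite_master table and the table_info pragma. The sketch below uses Python's built-in sqlite3 module; the department table and the rich_dept view are made-up examples, and other database systems organize their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE department (dept_name TEXT PRIMARY KEY, building TEXT, budget NUMERIC)"
)
conn.execute(
    "CREATE VIEW rich_dept AS SELECT dept_name FROM department WHERE budget > 100000"
)

# The catalog is queried with ordinary SQL, like any other relation.
catalog = conn.execute("SELECT type, name FROM sqlite_master ORDER BY name").fetchall()

# Attribute names and declared types of one relation.
# Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(department)")]
```

The catalog rows include the table and the view, and also the index that SQLite creates automatically for the primary key, which corresponds to the "information about indices" item in the metadata list.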


Storage Access PPD


Storage Access

• A database file is partitioned into fixed-length storage units called blocks
  ◦ Blocks are units of both storage allocation and data transfer
• Database system seeks to minimize the number of block transfers between the disk and memory
  ◦ We can reduce the number of disk accesses by keeping as many blocks as possible in main memory
• Buffer: portion of main memory available to store copies of disk blocks
• Buffer Manager: subsystem responsible for allocating buffer space in main memory


Buffer Manager

• Programs call on the buffer manager when they need a block from disk
  ◦ If the block is already in the buffer, buffer manager returns the address of the block in main memory
  ◦ If the block is not in the buffer, the buffer manager
    ▷ Allocates space in the buffer for the block
      − Replacing (throwing out) some other block, if required, to make space for the new block
      − Replaced block written back to disk only if it was modified since the most recent time that it was written to / fetched from the disk
    ▷ Reads the block from the disk to the buffer, and returns the address of the block in main memory to requester
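The request-handling steps above can be sketched in Python. This is a minimal illustration under stated assumptions: the BufferManager class and its method names are hypothetical, a dictionary stands in for the disk, and LRU is used as the replacement policy simply because it is easy to express with an OrderedDict.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer manager with LRU replacement and write-back of modified blocks."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk                 # dict: block id -> contents (stand-in for the disk)
        self.frames = OrderedDict()      # block id -> [contents, dirty]; order = recency
        self.reads = 0
        self.writes = 0

    def get(self, block_id):
        if block_id in self.frames:
            self.frames.move_to_end(block_id)        # now the most recently used
        else:
            if len(self.frames) >= self.capacity:
                victim, (contents, dirty) = self.frames.popitem(last=False)
                if dirty:                            # write back only if modified
                    self.disk[victim] = contents
                    self.writes += 1
            self.frames[block_id] = [self.disk[block_id], False]
            self.reads += 1
        return self.frames[block_id]

    def update(self, block_id, contents):
        frame = self.get(block_id)
        frame[0], frame[1] = contents, True          # mark the buffered copy as modified


disk = {1: "a", 2: "b", 3: "c"}
bm = BufferManager(2, disk)
bm.get(1)
bm.update(2, "B")
bm.get(1)        # hit: block 1 becomes most recently used
bm.get(3)        # miss: evicts block 2, which is dirty and is written back
bm.get(2)        # miss: evicts block 1, which is clean, so no write
```

Only one of the two evictions costs a disk write, because only block 2 was modified while it was in the buffer.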


Buffer Replacement Policies

• Most operating systems replace the block least recently used (LRU strategy)
• Idea behind LRU – use past pattern of block references as a predictor of future references
• Queries have well-defined access patterns (such as sequential scans), and a database system can use the information in a user’s query to predict future references
  ◦ LRU may be a bad strategy for certain access patterns involving repeated scans of data
    ▷ For example: when computing the join of 2 relations r and s by a nested loop
        for each tuple tr of r do
            for each tuple ts of s do
                if the tuples tr and ts match ...
  ◦ Mixed strategy with hints on replacement strategy provided by the query optimizer is preferable


Buffer Replacement Policies (2)

• Pinned block: memory block that is not allowed to be written back to disk
• Toss-immediate strategy: frees the space occupied by a block as soon as the final tuple of that block has been processed
• Most recently used (MRU) strategy: system must pin the block currently being processed. After the final tuple of that block has been processed, the block is unpinned, and it becomes the most recently used block.
• Buffer manager can use statistical information regarding the probability that a request will reference a particular relation
  ◦ For example, the data dictionary is frequently accessed. Heuristic: keep data-dictionary blocks in main memory buffer
• Buffer managers also support forced output of blocks for the purpose of recovery
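A short simulation shows why LRU behaves badly on the repeated scans of a nested-loop join and why evicting the most recently used block helps. The function count_misses and the sizes used (an inner relation of 4 blocks, scanned 3 times, with a 3-block buffer) are illustrative assumptions; pinning is not modelled.

```python
def count_misses(references, capacity, policy):
    """Count buffer misses for a block reference string under "LRU" or "MRU"."""
    buffer = []                  # ordered from least to most recently used
    misses = 0
    for block in references:
        if block in buffer:
            buffer.remove(block)             # hit: refreshed below
        else:
            misses += 1
            if len(buffer) == capacity:
                # LRU evicts the front of the list, MRU evicts the back.
                buffer.pop(0 if policy == "LRU" else -1)
        buffer.append(block)
    return misses


# Inner relation s occupies 4 blocks and is scanned once per tuple of r (3 tuples).
scan = list(range(4)) * 3
lru_misses = count_misses(scan, 3, "LRU")
mru_misses = count_misses(scan, 3, "MRU")
```

With a buffer one block smaller than the inner relation, LRU always evicts exactly the block that will be needed soonest, so all 12 references miss, whereas MRU misses only 6 times.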


Module Summary

• Became familiar with the organization of database files
• Understood how records and relations are organized in files
• Learnt how databases keep their own information in Data-Dictionary Storage – the metadata database of a database
• Understood the mechanisms for fast access to a database store

Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.
