5-8 DBMS
Database Management Systems
Module 21: Relational Database Design/1
Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
Week Recap
• Illustrated equivalence of algebra and calculus
• Introduced the Design Process for Database Systems
• Elucidated the E-R Model for real world representation with entities, entity sets, attributes, and relationships
• Illustrated ER Diagram notation for ER Models
• Discussed translation of ER Models to Relational Schema and extended features of ER Model
• Deliberated on various design issues
Features of Good Relational Design
• Avoids redundant storage of data items
• Provides efficient access to data
• Supports the maintenance of data integrity over time
• Clean, consistent, and easy to understand
• Note: These objectives are sometimes contradictory!
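Redundancy and Anomaly
In the relation instructor_with_department (ID, name, salary, dept_name, building, budget):
• ID: Key
• building, budget: Redundant Information
• name, salary, dept_name: No Redundant Information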
• Redundancy: having multiple copies of the same data in the database.
◦ This problem arises when a database is not normalized
◦ It leads to anomalies
• Anomaly: inconsistencies that can arise due to data changes in a database with insertion, deletion, and update
◦ These problems occur in poorly planned, un-normalised databases where all the data is stored in one table (a flat-file database)
There can be three kinds of anomalies
• Insertion Anomaly
◦ When the insertion of a data record is not possible without adding some additional unrelated data to the record
◦ We cannot add an Instructor in instructor_with_department if the department does not have a building or budget
• Deletion Anomaly
◦ When deletion of a data record results in losing some unrelated information that was stored as part of the record that was deleted from a table
◦ If we delete the last Instructor of a Department from instructor_with_department, we lose the building and budget information
• Update Anomaly
◦ When a data item is changed, which could involve many records having to be changed, leading to the possibility of some changes being made incorrectly
◦ When the budget changes for a Department having a large number of Instructors in instructor_with_department, the application may miss some of them
Redundancy and Anomaly (3)
• What causes redundancy?
◦ Dependency ⇒ Redundancy
◦ dept_name uniquely decides building and budget. A department cannot have two different budgets or buildings. So building and budget depend on dept_name
• How to remove, or at least minimize, redundancy?
◦ Decompose (partition) the relation into smaller relations
◦ instructor_with_department can be decomposed into instructor and department
◦ Good Decomposition ⇒ Minimization of Dependency
• Is every decomposition good?
◦ No. It needs to preserve information, honour the dependencies, be efficient etc.
◦ Various schemes of normalization ensure good decomposition
◦ Normalization ⇒ Good Decomposition
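The decomposition of instructor_with_department into instructor and department can be sketched in Python. This is a minimal illustration, not part of the original slides: the sample rows and the helper names `decompose` and `natural_join` are assumptions made for the example.

```python
# Hypothetical rows of instructor_with_department:
# (ID, name, salary, dept_name, building, budget)
rows = [
    ("101", "Asha", 65000, "Physics", "Watson", 70000),
    ("102", "Ravi", 72000, "Physics", "Watson", 70000),
    ("103", "Meera", 80000, "Comp. Sci.", "Taylor", 100000),
]

def decompose(rows):
    """Split into instructor(ID, name, salary, dept_name) and
    department(dept_name, building, budget)."""
    instructor = {(i, n, s, d) for (i, n, s, d, _, _) in rows}
    department = {(d, b, g) for (_, _, _, d, b, g) in rows}
    return instructor, department

def natural_join(instructor, department):
    """Join the two relations on their common attribute dept_name."""
    return {
        (i, n, s, d, b, g)
        for (i, n, s, d) in instructor
        for (d2, b, g) in department
        if d == d2
    }

instructor, department = decompose(rows)
```

After the split, the building and budget of Physics are stored once in `department` rather than once per instructor, and joining the two relations gives back the original rows.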
Decomposition
• Suppose we had started with inst_dept. How would we know to split up (decompose) it into instructor and department?
• Write a rule “if there were a schema (dept_name, building, budget), then dept_name would be a candidate key”
• Denote as a functional dependency: dept_name → building, budget
• In inst_dept, because dept_name is not a candidate key, the building and budget of a department may have to be repeated.
◦ This indicates the need to decompose inst_dept
• The next slide shows how we lose information: we cannot reconstruct the original employee relation, and so this is a lossy decomposition.
Atomic Domains and First Normal Form
• A domain is atomic if its elements are considered to be indivisible units
◦ Examples of non-atomic domains:
. Set of names, composite attributes
. Identification numbers like CS101 that can be broken up into parts
• A relational schema R is in First Normal Form (1NF) if
◦ the domains of all attributes of R are atomic
◦ the value of each attribute contains only a single value from that domain
• Non-atomic values complicate storage and encourage redundant (repeated) storage of data
◦ Example: Set of accounts stored with each customer, and set of owners stored with each account
◦ We assume all relations are in first normal form
• Atomicity is actually a property of how the elements of the domain are used
◦ Strings would normally be considered indivisible
◦ Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127
◦ If the first two characters are extracted to find the department, the domain of roll numbers is not atomic
◦ Doing so is a bad idea
. Leads to encoding of information in the application program rather than in the database
◦ A telephone number is composite
◦ Telephone number is multi-valued
◦ is in 1NF if telephone number is not considered composite
◦ However, conceptually, we have two attributes for the same concept
. Arbitrary and meaningless ordering of attributes
. How to search telephone numbers
. Why only two numbers?
Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.
Database Management Systems
Module 22: Relational Database Design/2
Partha Pratim Das
Functional Dependencies
• On this instance, A → B does NOT hold, but B → A does hold. So we cannot have
tuples like (2, 4), or (3, 5), or (4, 7) added to the current instance.
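Whether an FD holds on a given instance can be checked mechanically. The sketch below is illustrative only: the function name `fd_holds` is hypothetical, and the instance `r` is an assumed example (the slide's own table is not reproduced in this text), chosen to be consistent with the statement above.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds on this instance.

    rows is a list of dicts; lhs and rhs are tuples of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Assumed instance r(A, B), used only for illustration.
r = [{"A": 1, "B": 4}, {"A": 1, "B": 5}, {"A": 3, "B": 7}]
```

Note that an instance can only refute an FD or be consistent with it; whether the FD is a constraint of the schema is a design decision.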
Functional Dependencies (3)
◦ StudentID → Semester
◦ {StudentID, Lecture} → TA
◦ {StudentID, Lecture} → {TA, Semester}
◦ EmployeeID → EmployeeName
EmployeeID → DepartmentID
DepartmentID → DepartmentName
Closure of FDs
• F = {A → B, B → C}
• F+ = {A → B, B → C, A → C} (listing only the non-trivial dependencies with a single attribute on each side)
Database Management Systems
Module 23: Relational Database Design/3
Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
Armstrong’s Axioms
• Given a set of Functional Dependencies F, we can infer new dependencies by Armstrong’s Axioms:
◦ Reflexivity: if β ⊆ α, then α → β
◦ Augmentation: if α → β, then γα → γβ
◦ Transitivity: if α → β and β → γ, then α → γ
• These axioms can be repeatedly applied to generate new FDs, which are added to F
• A new FD obtained by applying the axioms is said to be logically implied by F
• The process of generating FDs terminates after a finite number of steps, and we call the result the Closure Set F+ for FDs F. This is the set of all FDs logically implied by F
• F = {A → B, B → C}
• F+ = {A → B, B → C, A → C}
• R = (A, B, C, G, H, I)
F = {A → B, A → C, CG → H, CG → I, B → H}
• Some members of F+
◦ A → H
. by transitivity from A → B and B → H
◦ AG → I
. by augmenting A → C with G, to get AG → CG and then transitivity with CG → I
◦ CG → HI
. by augmenting CG → I with CG to infer CG → CGI, and augmenting CG → H with I to infer CGI → HI, and then transitivity
Functional Dependencies (4): Closure of a Set of FDs: Computing F+
Closure of Attributes
• Given a set of attributes α, define the closure of α under F (denoted by α+) as the set of attributes that are functionally determined by α under F
• Algorithm to compute α+, the closure of α under F
result ← α
while (changes to result) do
    for each β → γ in F do
    begin
        if β ⊆ result then result ← result ∪ γ
    end
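The attribute closure algorithm above translates almost line for line into Python. The sketch below assumes single-letter attribute names and represents each FD as a hypothetical `(lhs, rhs)` pair of strings; it is applied to the schema R = (A, B, C, G, H, I) and the set F used earlier.

```python
def closure(attrs, fds):
    """Compute the closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names, e.g. ("CG", "H") for CG -> H.
    """
    result = set(attrs)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print("".join(sorted(closure("AG", F))))  # ABCGHI
```

Since (AG)+ contains every attribute of R, AG is a superkey of R under F.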
There are several uses of the attribute closure algorithm:
• Testing for superkey:
◦ To test if α is a superkey, we compute α+, and check if α+ contains all attributes of R.
• Testing functional dependencies
◦ To check if a functional dependency α → β holds (or, in other words, is in F+), just check if β ⊆ α+
◦ That is, we compute α+ by using attribute closure, and then check if it contains β.
◦ This is a simple and cheap test, and very useful
• Computing closure of F
◦ For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a functional dependency γ → S.
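The first two uses listed above reduce to one closure computation each. The following sketch is illustrative; the names `is_superkey` and `implies` are hypothetical, and attributes are assumed to be single letters.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure contains all attributes of the schema."""
    return closure(attrs, fds) >= set(schema)

def implies(fds, lhs, rhs):
    """lhs -> rhs is in F+ exactly when rhs is a subset of the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)

F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
```

With the set F from the earlier example, `implies(F, "CG", "HI")` confirms the derivation CG → HI without applying the axioms by hand.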
BCNF: Boyce-Codd Normal Form
• A relation schema R is in BCNF with respect to a set F of FDs if for all FDs in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:
◦ α → β is trivial (that is, β ⊆ α)
◦ α is a superkey for R
• Example schema not in BCNF:
instr_dept (ID, name, salary, dept_name, building, budget)
• because the non-trivial dependency dept_name → building, budget holds on instr_dept, but dept_name is not a superkey
• When a non-trivial dependency α → β violates BCNF on R, we decompose R into (α ∪ β) and (R − (β − α)). In our example:
◦ α = dept_name
◦ β = building, budget
◦ dept_name → building, budget
inst_dept is replaced by
◦ (α ∪ β) = (dept_name, building, budget)
. dept_name → building, budget
◦ (R − (β − α)) = (ID, name, salary, dept_name)
. ID → name, salary, dept_name
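One BCNF decomposition step can be sketched as follows. This is a minimal illustration, not a full BCNF algorithm: the function names are hypothetical, and the check inspects only the FDs given in F, which is enough to find a violation on the original schema but not on the decomposed relations.

```python
def closure(attrs, fds):
    """Attribute closure; FDs are (lhs, rhs) pairs of attribute-name tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violation(schema, fds):
    """Return the first given FD that is non-trivial and whose left side
    is not a superkey of `schema`, or None if there is no such FD."""
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        if not trivial and not closure(lhs, fds) >= set(schema):
            return lhs, rhs
    return None

def decompose_once(schema, lhs, rhs):
    """Split R into (alpha ∪ beta) and (R − (beta − alpha))."""
    r1 = set(lhs) | set(rhs)
    r2 = set(schema) - (set(rhs) - set(lhs))
    return r1, r2

schema = ("ID", "name", "salary", "dept_name", "building", "budget")
fds = [
    (("ID",), ("name", "salary", "dept_name")),
    (("dept_name",), ("building", "budget")),
]
alpha, beta = bcnf_violation(schema, fds)
r1, r2 = decompose_once(schema, alpha, beta)
```

On this input the violating dependency is dept_name → building, budget, and the two resulting schemas match the ones listed above.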
R1 ∩ R2 ≠ Φ
R1 ∩ R2 → R1 or R1 ∩ R2 → R2
• Constraints, including FDs, are costly to check in practice unless they pertain to only one relation
• If it is sufficient to test only those dependencies on each individual relation of a decomposition in order to ensure that all functional dependencies hold, then that decomposition is dependency preserving.
• It is not always possible to achieve both BCNF and dependency preservation. Consider:
◦ R = CSZ, F = {CS → Z, Z → C}
◦ Key = CS
◦ CS → Z satisfies BCNF, but Z → C violates it
◦ Decompose as: R1 = ZC, R2 = CSZ − (C − Z) = SZ
◦ R1 ∪ R2 = CSZ = R, R1 ∩ R2 = Z ≠ Φ, and R1 ∩ R2 = Z → ZC = R1. So it has lossless join
◦ However, we cannot check CS → Z without doing a join. Hence it is not dependency preserving
• We consider a weaker normal form, known as Third Normal Form (3NF)
3NF: Third Normal Form
• A relation schema R is in third normal form (3NF) if for all:
α → β ∈ F+
at least one of the following holds:
◦ α → β is trivial (that is, β ⊆ α)
◦ α is a superkey for R
◦ Each attribute A in β − α is contained in a candidate key for R
(Note: Each attribute may be in a different candidate key)
• If a relation is in BCNF it is in 3NF (since in BCNF one of the first two conditions above must hold)
• Third condition is a minimal relaxation of BCNF to ensure dependency preservation (will see why later)
There are three potential problems to consider:
• May be impossible to reconstruct the original relation! (Lossiness)
• Dependency checking may require joins
• Some queries become more expensive
◦ What is the building for an instructor?
Tradeoff: Must consider these issues vs. redundancy
• There are database schemas in BCNF that do not seem to be sufficiently normalized
• Consider a relation
inst_info (ID, child_name, phone)
◦ where an instructor may have more than one phone and can have multiple children
[Table: inst_info]
• There are no non-trivial functional dependencies and therefore the relation is in BCNF
• Insertion anomalies: that is, if we add a phone 981-992-3443 to 99999, we need to add two tuples
(99999, David, 981-992-3443)
(99999, William, 981-992-3443)
[Table: inst_phone]
• This suggests the need for higher normal forms, such as the Fourth Normal Form (4NF)
Database Management Systems
Module 24: Relational Database Design/4
Partha Pratim Das
Algorithms for Functional Dependencies
There are several uses of the attribute closure algorithm:
• Testing for superkey:
◦ To test if α is a superkey, we compute α+, and check if α+ contains all attributes of R.
• Testing functional dependencies
◦ To check if a functional dependency α → β holds (or, in other words, is in F+), just check if β ⊆ α+.
◦ That is, we compute α+ by using attribute closure, and then check if it contains β.
◦ This is a simple and cheap test, and very useful
• Computing closure of F
◦ For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a functional dependency γ → S.
Extraneous Attributes
• Consider a set F of functional dependencies and the functional dependency α → β in F.
• To test if attribute A ∈ α is extraneous in α
a) Compute (α − {A})+ using the dependencies in F
b) Check that (α − {A})+ contains β; if it does, A is extraneous in α
• To test if attribute A ∈ β is extraneous in β
a) Compute α+ using only the dependencies in F′ = (F − {α → β}) ∪ {α → (β − {A})}
b) Check that α+ contains A; if it does, A is extraneous in β
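Both extraneous-attribute tests can be sketched in Python on top of attribute closure. The function names are hypothetical and attributes are assumed to be single letters. The first example set F is the one used in these slides ({A → B, B → C, AC → D}); the second set G is an assumed example added for the right-side test.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def extraneous_in_lhs(a, lhs, rhs, fds):
    """A in lhs is extraneous if (lhs - {A})+ under F contains rhs."""
    return set(rhs) <= closure(set(lhs) - {a}, fds)

def extraneous_in_rhs(a, lhs, rhs, fds):
    """A in rhs is extraneous if lhs+ under F' contains A, where F'
    replaces lhs -> rhs by lhs -> (rhs - {A})."""
    f_prime = [fd for fd in fds if fd != (lhs, rhs)]
    f_prime.append((lhs, "".join(sorted(set(rhs) - {a}))))
    return a in closure(lhs, f_prime)

F = [("A", "B"), ("B", "C"), ("AC", "D")]
G = [("A", "BC"), ("B", "C")]  # assumed example for the right-side test
```

Here C is extraneous in AC → D under F because A+ = ABCD already contains D, and C is extraneous in A → BC under G because A → B and B → C still yield C.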
Canonical Cover of FDs
• Sets of FDs may have redundant dependencies that can be inferred from the others
• Can we have some kind of “optimal” or “minimal” set of FDs to work with?
• A Canonical Cover for F is a set of dependencies Fc such that ALL the following properties are satisfied:
◦ F+ = Fc+. Or,
. F logically implies all dependencies in Fc
. Fc logically implies all dependencies in F
◦ No functional dependency in Fc contains an extraneous attribute
◦ Each left side of a functional dependency in Fc is unique. That is, there are no two dependencies α1 → β1 and α2 → β2 in Fc such that α1 = α2
• Intuitively, a Canonical cover of F is a minimal set of FDs
◦ Equivalent to F
◦ Having no redundant FDs
◦ No redundant parts of FDs
• Minimal / Irreducible Set of Functional Dependencies
Canonical Cover (2): Example
• {A → B, B → C, AC → D} ⇒ {A → B, B → C, A → D}
◦ A → B, B → C ⇒ A → C ⇒ A → AC
◦ A → AC, AC → D ⇒ A → D
◦ A+ = ABCD
• {A → B, B → C, A → D} ⇒ {A → B, B → C, AC → D}
◦ A → D ⇒ AC → D
◦ (AC)+ = ABCD
• To compute a canonical cover for F:
repeat
    Use the union rule to replace any dependencies in F
        α1 → β1 and α1 → β2 with α1 → β1 β2
    Find a functional dependency α → β with an extraneous attribute either in α or in β
        /* Note: test for extraneous attributes done using Fc, not F */
    If an extraneous attribute is found, delete it from α → β
until F does not change
• Note: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied
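The repeat-until loop above can be sketched in Python as follows. This is one possible reading of the procedure, not a definitive implementation: the helper names are hypothetical, attributes are assumed to be single letters, and one extraneous attribute is removed per iteration before the union rule is re-applied. A canonical cover is not unique in general; the result depends on the order in which attributes are tested.

```python
def closure(attrs, fds):
    """Attribute closure; `fds` holds (lhs, rhs) pairs of frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def merge(fds):
    """Union rule: combine FDs with the same left side; drop empty right sides."""
    merged = {}
    for lhs, rhs in fds:
        merged.setdefault(frozenset(lhs), set()).update(rhs)
    return [(l, frozenset(r)) for l, r in merged.items() if r]

def remove_one_extraneous(fc):
    """Return a copy of fc with one extraneous attribute deleted, or None."""
    for i, (lhs, rhs) in enumerate(fc):
        for a in sorted(lhs):  # test the left side against the current set
            if rhs <= closure(lhs - {a}, fc):
                return fc[:i] + [(lhs - {a}, rhs)] + fc[i + 1:]
        for a in sorted(rhs):  # test the right side against F'
            f_prime = fc[:i] + [(lhs, rhs - {a})] + fc[i + 1:]
            if a in closure(lhs, f_prime):
                return f_prime
    return None

def canonical_cover(fds):
    fc = merge(fds)
    while True:
        reduced = remove_one_extraneous(fc)
        if reduced is None:
            return fc
        fc = merge(reduced)  # union rule may apply again after a deletion

cover = canonical_cover([("A", "B"), ("B", "C"), ("AC", "D")])
```

For the set {A → B, B → C, AC → D} discussed earlier, the attribute C is found extraneous in AC → D, and after re-applying the union rule the result is {A → BD, B → C}.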
Practice Problems
• Find if a given functional dependency is implied from a set of Functional Dependencies:
a) For: A → BC, CD → E, E → C, D → AEH, ABH → BD, DH → BC
i) Check: BCD → H
ii) Check: AED → C
b) For: AB → CD, AF → D, DE → F, C → G, F → E, G → A
i) Check: CF → DF
ii) Check: BG → E
iii) Check: AF → G
iv) Check: AB → EF
c) For: A → BC, B → E, CD → EF
i) Check: AD → F
• Find Prime and Non Prime Attributes using Functional Dependencies:
a) R(ABCDEF) having FDs {AB → C, C → D, D → E, F → B, E → F}
b) R(ABCDEF) having FDs {AB → C, C → DE, E → F, C → B}
c) R(ABCDEFGHIJ) having FDs {AB → C, A → DE, B → F, F → GH, D → IJ}
d) R(ABDLPT) having FDs {B → PT, A → D, T → L}
e) R(ABCDEFGH) having FDs {E → G, AB → C, AC → B, AD → E, B → D, BC → A}
f) R(ABCDE) having FDs {A → BC, CD → E, B → D, E → A}
g) R(ABCDEH) having FDs {A → B, BC → D, E → C, D → A}
• Prime Attributes: Attributes that belong to any candidate key are called Prime Attributes
◦ It is the union of all the candidate key attributes: {CK1 ∪ CK2 ∪ CK3 ∪ · · · }
◦ If a Prime attribute is determined by another attribute set, then more than one candidate key is possible.
◦ For example, if A is a Candidate Key, and X → A, then X is a superkey (and a Candidate Key if no proper subset of X is a superkey).
• Non Prime Attributes: Attributes that do not belong to any candidate key are called Non Prime Attributes
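Candidate keys, and from them the prime attributes, can be found by brute force over attribute subsets. The sketch below is illustrative only: the function names are hypothetical, attributes are assumed to be single letters, and the search is exponential in the number of attributes, so it suits small exercises like the ones above. It is shown on the schema R = (A, B, C, G, H, I) from Module 23 rather than on a practice problem.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Enumerate subsets by increasing size; keep the minimal superkeys."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(k <= attrs for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) >= set(schema):
                keys.append(attrs)
    return keys

def prime_attributes(schema, fds):
    """Union of all candidate keys."""
    return set().union(*candidate_keys(schema, fds))

F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
```

For this F the only candidate key is AG, since neither A nor G appears on any right side, so the prime attributes are A and G and the rest are non-prime.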
• Find the Minimal Cover or Irreducible Sets or Canonical Cover of a Set of Functional Dependencies:
a) AB → CD, BC → D
b) ABCD → E, E → D, AC → D, A → B
Database Management Systems
Module 25: Relational Database Design/5
Partha Pratim Das
Lossless Join Decomposition
• For the case of R = (R1, R2), we require that for all possible relations r on schema R
r = πR1(r) ⋈ πR2(r)
• A decomposition of R into R1 and R2 is lossless join if at least one of the following dependencies is in F+:
◦ R1 ∩ R2 → R1
◦ R1 ∩ R2 → R2
• The above functional dependencies are a sufficient condition for lossless join decomposition; the dependencies are a necessary condition only if all constraints are functional dependencies
A decomposition is lossless if it satisfies all of the following conditions:
• R1 ∪ R2 = R
• R1 ∩ R2 ≠ φ and
• R1 ∩ R2 → R1 or R1 ∩ R2 → R2
Lossless Join Decomposition (2): Example PPD
• Consider Supplier_Parts schema: Supplier_Parts(S#, Sname, City, P#, Qty)
• Having dependencies: S# → Sname, S# → City, (S#, P#) → Qty
• Decompose as: Supplier(S#, Sname, City, Qty): Parts(P#, Qty)
• Take Natural Join to reconstruct: Supplier ⋈ Parts
◦ The common attribute is Qty, which determines neither Supplier nor Parts, so the join can produce spurious tuples (lossy)
• Consider Supplier_Parts schema: Supplier_Parts(S#, Sname, City, P#, Qty)
• Having dependencies: S# → Sname, S# → City, (S#, P#) → Qty
• Decompose as: Supplier(S#, Sname, City): Parts(S#, P#, Qty)
• Take Natural Join to reconstruct: Supplier ⋈ Parts
◦ The common attribute is S#, and S# → Sname, City, so S# determines Supplier and the decomposition is lossless
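The three lossless-join conditions can be checked with attribute closure. The sketch below applies them to the two decompositions of Supplier_Parts discussed above; the function name `is_lossless` is hypothetical, and the test assumes that all constraints are functional dependencies (otherwise a failed test does not prove lossiness).

```python
def closure(attrs, fds):
    """Attribute closure; FDs are (lhs, rhs) pairs of attribute-name tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(r, r1, r2, fds):
    """Binary lossless-join test: R1 ∪ R2 = R, R1 ∩ R2 non-empty, and
    R1 ∩ R2 functionally determines R1 or R2."""
    r, r1, r2 = set(r), set(r1), set(r2)
    common = r1 & r2
    if r1 | r2 != r or not common:
        return False
    reach = closure(common, fds)
    return r1 <= reach or r2 <= reach

R = ("S#", "Sname", "City", "P#", "Qty")
F = [(("S#",), ("Sname",)), (("S#",), ("City",)), (("S#", "P#"), ("Qty",))]
first = is_lossless(R, ("S#", "Sname", "City", "Qty"), ("P#", "Qty"), F)
second = is_lossless(R, ("S#", "Sname", "City"), ("S#", "P#", "Qty"), F)
```

The first decomposition fails the test because Qty+ = {Qty}; the second passes because S#+ contains every attribute of Supplier.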
Dependency Preservation
• Let Fi be the set of dependencies in F+ that include only attributes in Ri
◦ A decomposition is dependency preserving, if
(F1 ∪ F2 ∪ · · · ∪ Fn)+ = F+
◦ If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive
Let R be the original relational schema having FD set F. Let R1 and R2, having FD sets F1 and F2 respectively, be the decomposed sub-relations of R. The decomposition of R is said to be dependency preserving if
• F1 ∪ F2 ≡ F {Decomposition Preserving Dependency}
• If F1 ∪ F2 ⊂ F {Decomposition NOT Preserving Dependency} and
• F1 ∪ F2 ⊃ F {this is not possible}
Dependency Preservation (2): Testing
• To check if a dependency α → β is preserved in a decomposition of R into D = {R1, R2, . . . , Rn} we apply the following test (with attribute closure done with respect to F)
• The restriction of F+ to Ri is the set of all functional dependencies in F+ that include only attributes of Ri.
• R (A, B, C, D, E, F)
F = {A → BCD, A → EF, BC → AD, BC → E, BC → F, B → F, D → E}
• Decomposition: R1(A, B, C, D) R2(B, F) R3(D, E)
◦ A → BCD, BC → AD are preserved on table R1
◦ B → F is preserved on table R2
◦ D → E is preserved on table R3
◦ We have to check whether the remaining FDs: A → E, A → F, BC → E, BC → F are preserved or not.
R1: F1 = {A → ABCD, B → B, C → C, D → D, AB → ABCD, BC → ABCD, CD → CD, AD → ABCD, ABC → ABCD, ABD → ABCD, ACD → ABCD, BCD → ABCD}
R2: F2 = {B → BF, F → F}
R3: F3 = {D → DE, E → E}
◦ F′ = F1 ∪ F2 ∪ F3.
◦ Checking for: A → E, A → F in F′+
. A → D (from R1), D → E (from R3) : A → E (By Transitivity)
. A → B (from R1), B → F (from R2) : A → F (By Transitivity)
◦ Checking for: BC → E, BC → F in F′+
. BC → D (from R1), D → E (from R3) : BC → E (By Transitivity)
. B → F (from R2) : BC → F (By Augmentation)
Hence all dependencies are preserved.
• R (A, B, C, D)
F = {A → B, B → C, C → D, D → A}
• Decomposition: R1(A, B) R2(B, C) R3(C, D)
◦ A → B is preserved on table R1
◦ B → C is preserved on table R2
◦ C → D is preserved on table R3
◦ We have to check whether the one remaining FD: D → A is preserved or not.
R1: F1 = {A → AB, B → BA}
R2: F2 = {B → BC, C → CB}
R3: F3 = {C → CD, D → DC}
◦ F′ = F1 ∪ F2 ∪ F3.
◦ Checking for: D → A in F′+
. D → C (from R3), C → B (from R2), B → A (from R1) : D → A (By Transitivity)
Hence all dependencies are preserved.
result = α
while (changes to result) do
    for each Ri in the decomposition
        t = (result ∩ Ri)+ ∩ Ri
        result = result ∪ t
◦ If result contains all attributes in β, then the functional dependency α → β is preserved.
• We apply the test on all dependencies in F to check if a decomposition is dependency preserving
• This procedure takes polynomial time, instead of the exponential time required to compute F+ and (F1 ∪ F2 ∪ · · · ∪ Fn)+
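The polynomial-time test can be sketched in Python as below. The function names are hypothetical and attributes are assumed to be single letters; closures are always taken with respect to the full set F, as the test requires. The sketch is run on the two example decompositions from this module and on the R = CSZ example from Module 23.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserved(lhs, rhs, decomposition, fds):
    """Test whether lhs -> rhs is preserved by the decomposition."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            ri = set(ri)
            t = closure(result & ri, fds) & ri  # t = (result ∩ Ri)+ ∩ Ri
            if not t <= result:
                result |= t
                changed = True
    return set(rhs) <= result

def dependency_preserving(decomposition, fds):
    """Apply the test to every dependency in F."""
    return all(preserved(l, r, decomposition, fds) for l, r in fds)

F1 = [("A", "BCD"), ("A", "EF"), ("BC", "AD"), ("BC", "E"),
      ("BC", "F"), ("B", "F"), ("D", "E")]
F2 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
F3 = [("CS", "Z"), ("Z", "C")]
```

The CSZ case shows the failure mode: starting from {C, S}, neither R1 = ZC nor R2 = SZ adds anything, so Z is never reached and CS → Z is not preserved.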
• Need to check for: A → EF, BC → E, BC → F (A → BCD, BC → AD, B → F, and D → E are already preserved within R1, R2, and R3)
• (BC)+/F1 = ABCD. (ABCD)+/F2 = ABCDF. (ABCDF)+/F3 = ABCDEF. Preserves BC → E, BC → F
BC → AD (R1), AD → E (R3) implies BC → E
B → F (R2) implies BC → F
• (A)+/F1 = ABCD. (ABCD)+/F2 = ABCDF. (ABCDF)+/F3 = ABCDEF. Preserves A → EF
A → B (R1), B → F (R2) implies A → F
A → D (R1), D → E (R3) implies A → E
• Check whether the decomposition of R into D is preserving dependency:
Module Summary
• Understood the Characterization for and Determination of Lossless Join
• Understood the Characterization for and Determination of Dependency Preservation
Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.
Database Management Systems
Module 26: Relational Database Design/6: Normal Forms
Partha Pratim Das
Week Recap
• Introduced the notion and the theory of functional dependencies
• Discussed issues in "good" design in the context of functional dependencies
• Studied Algorithms for Properties of Functional Dependencies
• Understood the Characterization for and Determination of Lossless Join and Dependency Preservation
Objectives & Outline
• To understand the Normal Forms and their importance in Relational Design
• Outline: Normal Forms, 1NF, 2NF, 3NF
Normal Forms
• Normalization or Schema Refinement is a technique of organizing the data in the database
• A systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics
◦ Insertion Anomaly
◦ Update Anomaly
◦ Deletion Anomaly
• The most common technique for Schema Refinement is decomposition.
◦ Goal of Normalization: Eliminate Redundancy
• Redundancy refers to repetition of the same data, or duplicate copies of the same data stored in different locations
• Normalization is used mainly for two purposes:
◦ Eliminating redundant (useless) data
◦ Ensuring data dependencies make sense, that is, data is logically stored
Anomalies PPD
a) Update Anomaly: Employee 519 is shown as having different addresses on different records
b) Insertion Anomaly: Until the new faculty member, Dr. Newsome, is assigned to teach at least one course, his details cannot be recorded
c) Deletion Anomaly: All information about Dr. Giddens is lost if he temporarily ceases to be assigned to any courses.
Resolution: Decompose the Schema
a) Update: (ID, Address), (ID, Skill)
b) Insert: (ID, Name, Hire Date), (ID, Code)
c) Delete: (ID, Name, Hire Date), (ID, Code)
• Dependency Preserving Property
◦ No functional dependency (or other constraint) should get violated
• A normal form specifies a set of conditions that the relational schema must satisfy in terms of its constraints; they offer varied levels of guarantee for the design
• Normalization rules are divided into various normal forms. Most common normal forms are:
◦ First Normal Form (1NF)
◦ Second Normal Form (2NF)
◦ Third Normal Form (3NF)
• Informally, a relational database relation is often described as "normalized" if it meets third normal form. Most 3NF relations are free of insertion, update, and deletion anomalies
1NF: First Normal Form
• A relation is in First Normal Form if and only if all underlying domains contain atomic values only (that is, it has no multivalued attributes (MVA))
• STUDENT(Sid, Sname, Cname)
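As a small illustration of the atomic-value requirement, the sketch below flattens a multivalued Cname attribute into one row per course. The sample rows and the function name `to_first_normal_form` are hypothetical; the slides give only the schema STUDENT(Sid, Sname, Cname).

```python
def to_first_normal_form(rows, multivalued_attr):
    """Emit one row per value of the multivalued attribute."""
    flat_rows = []
    for row in rows:
        for value in row[multivalued_attr]:
            flat_row = dict(row)              # copy the atomic attributes
            flat_row[multivalued_attr] = value
            flat_rows.append(flat_row)
    return flat_rows


# Hypothetical sample data (not from the slides): Cname holds a list of courses.
student = [
    {"Sid": 1, "Sname": "Asha", "Cname": ["C", "C++"]},
    {"Sid": 2, "Sname": "Ravi", "Cname": ["DBMS"]},
]
student_1nf = to_first_normal_form(student, "Cname")
for row in student_1nf:
    print(row)
```

After flattening, Sname repeats for every course of a student, which is the redundancy the later normal forms address.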
• Example: Supplier(SID, Status, City, PID, Qty)
Drawbacks:
• Deletion Anomaly: If we delete <S3,40,Rohtak,P1,245>, then we lose the information that S3 lives in Rohtak.
• Insertion Anomaly: We cannot insert a Supplier S5 located in Karnal, until S5 supplies at least one part.
• Update Anomaly: If Supplier S1 moves from Delhi to Kanpur, then it is difficult to update all the tuples having SID
as S1 and City as Delhi.
• When LHS is not a Superkey:
◦ Let X → Y be a non-trivial FD over R where X is not a superkey of R; then redundancy exists between the X and Y attribute sets.
◦ Hence, in order to identify the redundancy, we need not look at the actual data; it can be identified from the given functional dependencies.
◦ Example: X → Y and X is not a Candidate Key
• When LHS is a Superkey:
◦ If X → Y is a non-trivial FD over R where X is a superkey of R, then redundancy does not exist between the X and Y attribute sets.
◦ Example: X → Y and X is a Candidate Key
⇒ X cannot duplicate
⇒ Corresponding Y value may or may not duplicate.
2NF: Second Normal Form
Partial Dependency:
Let R be a relational schema and X, Y, A be attribute sets over R, where X: any Candidate Key, Y: Proper Subset of a Candidate Key, and A: Non-Prime Attribute. A functional dependency Y → A is then a partial dependency.
• STUDENT(Sid, Sname, Cname) (already in 1NF)
Functional Dependencies:
{SID, Cname} → Sname
SID → Sname
Partial Dependencies:
SID → Sname (as SID is a Proper Subset of Candidate Key {SID, Cname})
• Redundancy?
◦ Sname
• Anomaly?
◦ Yes
Post Normalization
The two relations R1 and R2 obtained are
1. Lossless Join
2. 2NF
3. Dependency Preserving
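A minimal sketch of how partial dependencies could be detected mechanically is given below. The function names are assumptions introduced for illustration, and the candidate keys are supplied by hand rather than computed. For STUDENT it reports SID → Sname.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(attributes, candidate_keys, fds):
    """List (Y, A) where Y is a proper subset of a candidate key and A is non-prime."""
    prime = set().union(*candidate_keys)
    non_prime = set(attributes) - prime
    found = []
    for key in candidate_keys:
        for size in range(1, len(key)):          # proper, non-empty subsets only
            for subset in combinations(sorted(key), size):
                for attr in sorted(closure(subset, fds) & non_prime):
                    found.append((frozenset(subset), attr))
    return found


attributes = {"SID", "Sname", "Cname"}
candidate_keys = [{"SID", "Cname"}]
fds = [({"SID", "Cname"}, {"Sname"}), ({"SID"}, {"Sname"})]
violations = partial_dependencies(attributes, candidate_keys, fds)
print(violations)
```

An empty result would mean the relation has no partial dependency with respect to the supplied keys, which together with 1NF is the 2NF condition.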
• Supplier(SID, Status, City, PID, Qty)
Partial Dependencies:
SID → Status
SID → City
Post Normalization
Drawbacks:
• Deletion Anomaly: If we delete a tuple in Sup_City, then we not only lose the information about a supplier, but also lose the status value of a particular city.
• Insertion Anomaly: We cannot insert a City and its status until a supplier supplies at least one part.
• Update Anomaly: If the status value for a city is changed, then we will face the problem of searching every tuple for that city.
Source: http://www.edugrabs.com/2nf-second-normal-form/
3NF: Third Normal Form PPD
Let R be the relational schema.
• [E. F. Codd, 1971] R is in 3NF only if:
◦ R should be in 2NF
◦ R should not contain transitive dependencies (OR, every non-prime attribute of R is non-transitively dependent on every key of R)
• [Carlo Zaniolo, 1982] Alternately, R is in 3NF iff for each of its functional dependencies X → A, at least one of the following conditions holds:
◦ X contains A (that is, A is a subset of X, meaning X → A is a trivial functional dependency), or
◦ X is a superkey, or
◦ Every element of A − X, the set difference between A and X, is a prime attribute (i.e., each attribute in A − X is contained in some candidate key)
• [Simple Statement] A relational schema R is in 3NF if for every FD X → A associated with R either
◦ A ⊆ X (that is, the FD is trivial) or
◦ X is a superkey of R or
◦ A is part of some candidate key (not just superkey!)
• A relation in 3NF is naturally in 2NF
3NF (2): Transitive Dependency
• Example of transitive dependency
• The functional dependency {Book} → {Author Nationality} applies; that is, if we know the book, we know the author's nationality. Furthermore:
◦ {Book} → {Author}
◦ {Author} → {Book} does not hold
◦ {Author} → {Author Nationality}
• Therefore {Book} → {Author Nationality} is a transitive dependency.
• Transitive dependency occurred because a non-key attribute (Author) was determining another non-key attribute (Author Nationality).
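The Book example can be checked with attribute closures. The sketch below is illustrative only: `transitive_via` is a hypothetical helper that looks for an intermediate attribute set Y among the left-hand sides of the given FDs such that X → Y and Y → Z hold while Y → X does not.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def transitive_via(x, z, fds):
    """Return the sets Y (taken from FD left-hand sides) that make X -> Z transitive."""
    x = frozenset(x)
    x_closure = closure(x, fds)
    witnesses = []
    for lhs, _ in fds:
        if lhs == x or z in lhs:
            continue
        y_closure = closure(lhs, fds)
        # X -> Y holds, Y -> Z holds, and Y does not determine X
        if lhs <= x_closure and z in y_closure and not x <= y_closure:
            witnesses.append(set(lhs))
    return witnesses


fds = [
    ({"Book"}, {"Author"}),
    ({"Author"}, {"Author Nationality"}),
]
witnesses = transitive_via({"Book"}, "Author Nationality", fds)
print(witnesses)
```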
• Example:
Sup_City(SID, Status, City) (already in 2NF)
Post Normalization
◦ s_ID, dept_name → i_ID
. s_ID, dept_name is a superkey
◦ i_ID → dept_name
. dept_name is contained in a candidate key
• There is some redundancy in this schema
• Example of problems due to redundancy in 3NF (J: s_ID, L: i_ID, K: dept_name)
◦ R = (J, L, K). F = {JK → L, L → K}
Module Summary
• Studied the Normal Forms and their importance in Relational Design: how progressive increase of constraints can minimize redundancy in a schema
Database Management Systems
Module 27: Relational Database Design/7: Normal Forms
Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
Module Recap
• Studied the Normal Forms and their importance in Relational Design: how progressive increase of constraints can minimize redundancy in a schema
Decomposition to 3NF
• A relational schema R is in 3NF if for every FD X → A associated with R either
◦ A ⊆ X (that is, the FD is trivial) or
◦ X is a superkey of R or
◦ A is part of some candidate key (not just superkey!)
• A relation in 3NF is naturally in 2NF
• Optimization: Need to check only FDs in F, need not check all FDs in F+.
• Use attribute closure to check, for each dependency α → β, if α is a superkey.
• If α is not a superkey, we have to verify if each attribute in β is contained in a candidate key of R
◦ This test is rather more expensive, since it involves finding candidate keys
◦ Testing for 3NF has been shown to be NP-hard
◦ Decomposition into 3NF can be done in polynomial time
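A brute-force sketch of this 3NF test follows. It enumerates attribute subsets to find candidate keys, which is exponential and in line with the NP-hardness remark; the function names are assumptions made for illustration. The first example is R = (J, K, L) with F = {JK → L, L → K}. The second uses Sup_City with SID → City and City → Status, an FD set assumed here from the drawbacks described for Sup_City rather than stated on a slide.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal superkeys, found by trying subsets in increasing size."""
    attributes = frozenset(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            combo = frozenset(combo)
            if any(key <= combo for key in keys):
                continue                      # contains a smaller key: not minimal
            if closure(combo, fds) >= attributes:
                keys.append(combo)
    return keys


def is_3nf(attributes, fds):
    """Check every FD in F: trivial, or LHS a superkey, or RHS - LHS all prime."""
    prime = set().union(*candidate_keys(attributes, fds))
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue
        if closure(lhs, fds) >= set(attributes):
            continue
        if (rhs - lhs) <= prime:
            continue
        return False
    return True


jkl_fds = [(frozenset("JK"), frozenset("L")), (frozenset("L"), frozenset("K"))]
sup_city_fds = [({"SID"}, {"City"}), ({"City"}, {"Status"})]   # assumed FDs
print(is_3nf("JKL", jkl_fds))
print(is_3nf({"SID", "Status", "City"}, sup_city_fds))
```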
Solution is given in the next slide (hidden from presentation; check after you have solved)
Decompose the following schema to 3NF in the following steps
• Compute all keys for R
• Compute a Canonical Cover Fc for F. Put the FDs into alphabetical order.
• Using Fc, employ the 3NFdecom algorithm to obtain a lossless and dependency preserving decomposition of relation R into a collection of relations that are in 3NF
• Does your schema allow redundancy?
• R(ABCDEFGH):
F = {A → CD, ACF → G, AD → BEF, BCG → D, CF → AH, CH → G, D → B, H → DEG}
• R(ABCDE):
F = {A → B, A → C, C → D, A → E}
• R(ABCDE):
F = {A → BC, CD → E, B → D, E → A}
• R(ABCD):
F = {A → D, AB → C, AD → C, B → C, D → AB}
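The last step of the exercise can be sketched as below, under stated assumptions: the canonical cover is supplied by hand (for the second problem, Fc = {A → BCE, C → D} is assumed here, obtained by merging the FDs with left-hand side A), and the function names `find_key` and `synthesize_3nf` are introduced only for illustration. The sketch follows the usual synthesis outline: one schema per FD in Fc, add a key schema if no schema is a superkey, then drop schemas contained in others.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_key(attributes, fds):
    """Shrink the full attribute set to one candidate key."""
    key = set(attributes)
    for attr in sorted(attributes):
        if closure(key - {attr}, fds) >= set(attributes):
            key.discard(attr)
    return key


def synthesize_3nf(attributes, canonical_cover):
    """3NF synthesis from a canonical cover given as (lhs, rhs) set pairs."""
    schemas = [set(lhs) | set(rhs) for lhs, rhs in canonical_cover]
    # Ensure some schema contains a candidate key (keeps the join lossless).
    if not any(closure(s, canonical_cover) >= set(attributes) for s in schemas):
        schemas.append(find_key(attributes, canonical_cover))
    # Remove duplicates and schemas strictly contained in another schema.
    result = []
    for s in schemas:
        if s not in result and not any(s < other for other in schemas):
            result.append(s)
    return result


Fc = [(frozenset("A"), frozenset("BCE")), (frozenset("C"), frozenset("D"))]
schemas = synthesize_3nf("ABCDE", Fc)
print(schemas)
```

Because the schema built from A → BCE already contains the key A, no separate key schema is added in this instance.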
Decomposition to BCNF
• A relation schema R is in BCNF with respect to a set F of FDs if for all FDs in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:
◦ α → β is trivial (that is, β ⊆ α)
◦ α is a superkey for R
• To check if a non-trivial dependency α → β causes a violation of BCNF
a) Compute α+ (the attribute closure of α), and
b) Verify that it includes all attributes of R, that is, it is a superkey of R.
• Simplified test: To check if a relation schema R is in BCNF, it suffices to check only the dependencies in the given set F for violation of BCNF, rather than checking all dependencies in F+.
◦ If none of the dependencies in F causes a violation of BCNF, then none of the dependencies in F+ will cause a violation of BCNF either.
• However, the simplified test using only F is incorrect when testing a relation in a decomposition of R
◦ Consider R = (A, B, C, D, E), with F = {A → B, BC → D}
. Decompose R into R1 = (A, B) and R2 = (A, C, D, E)
. Neither of the dependencies in F contains only attributes from (A, C, D, E), so we might be misled into thinking R2 satisfies BCNF.
. In fact, dependency AC → D in F+ shows R2 is not in BCNF.
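One way to test a decomposed schema Ri without computing F+ is to check every subset α of Ri: α+ (under F) should either add no attribute of Ri − α or add all of them. The sketch below applies this to the example above; `bcnf_violation` is a hypothetical name, and the subset enumeration is exponential in the size of Ri.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violation(ri, fds):
    """Return (alpha, beta) witnessing a BCNF violation on ri, or None."""
    ri = frozenset(ri)
    for size in range(1, len(ri)):
        for combo in combinations(sorted(ri), size):
            alpha = frozenset(combo)
            # Attributes of ri determined by alpha, other than alpha itself.
            gained = (closure(alpha, fds) & ri) - alpha
            if gained and gained != ri - alpha:
                return alpha, gained          # alpha -> gained holds, alpha not a superkey of ri
    return None


F = [(frozenset("A"), frozenset("B")), (frozenset("BC"), frozenset("D"))]
print(bcnf_violation("AB", F))      # R1 = (A, B)
print(bcnf_violation("ACDE", F))    # R2 = (A, C, D, E)
```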
BCNF Decomposition (3): Testing for BCNF Decomposition
◦ F′ = F1 ∪ F2 ∪ F3.
◦ Checking for: D → A in F′+
. D → C (from R3), C → B (from R2), B → A (from R1) : D → A (By Transitivity)
Hence all dependencies are preserved.
BCNF Decomposition (4): Testing Dependency Preservation:
Using Closure of Attributes (Poly. Algo.): Module 25
• R(ABCD): F = {A → B, B → C, C → D, D → A}
• Decomp = {AB, BC, CD}
• On projections (F1, F2, F3 taken on R1 = AB, R2 = BC, R3 = CD):
In this algorithm F1, F2, F3 are not the closure sets, but the sets of dependencies directly applicable on R1, R2, R3 respectively.
• Need to check for: A → B, B → C, C → D, D → A
• (D)+/F1 = D. (D)+/F2 = D. (D)+/F3 = D. So, D → A could not be shown to be preserved.
• In the previous method we saw that the dependency was preserved, and in reality it is preserved. Therefore the polynomial-time procedure, when run with only the directly applicable dependencies, may not work for all examples. To prove preservation, Algo 2 is sufficient but not necessary, whereas Algo 1 is both sufficient and necessary. If each closure is instead taken with respect to the full F, as in t = (result ∩ Ri)+ ∩ Ri, the test does find D → A preserved.
Note: This difference in result can occur in any example where a functional dependency of one decomposed table uses, in its closure, another functional dependency that is not applicable on any of the decomposed tables because not all of its attributes are present in a single table.
BCNF Decomposition (4): Algorithm PPD
• Similarly F2+
result := {R};
done := false;
compute F+;
while (not done) do
    if (there is a schema Ri in result that is not in BCNF)
    then begin
        let α → β be a nontrivial functional dependency that
        holds on Ri such that α → Ri is not in F+,
        and α ∩ β = ∅;
        result := (result − Ri) ∪ (Ri − β) ∪ (α, β);
    end
    else done := true;
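The loop above might be sketched in Python as follows. This is an illustrative version under assumptions: instead of computing F+, a violating α → β on Ri is found by closing subsets of Ri under F (exponential in the size of Ri), and the helper names are not from the slides. The example is R = ABCDE with F = {A → B, BC → D}.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violation(ri, fds):
    """Return (alpha, beta) with alpha -> beta on ri and alpha not a superkey of ri."""
    ri = frozenset(ri)
    for size in range(1, len(ri)):
        for combo in combinations(sorted(ri), size):
            alpha = frozenset(combo)
            beta = (closure(alpha, fds) & ri) - alpha
            if beta and beta != ri - alpha:
                return alpha, beta
    return None


def bcnf_decompose(attributes, fds):
    """Repeatedly split a violating schema Ri into (Ri - beta) and (alpha, beta)."""
    result = [frozenset(attributes)]
    done = False
    while not done:
        done = True
        for ri in result:
            violation = bcnf_violation(ri, fds)
            if violation:
                alpha, beta = violation
                result.remove(ri)
                result.append(ri - beta)
                result.append(alpha | beta)
                done = False
                break                         # restart the scan on the updated list
    return result


F = [(frozenset("A"), frozenset("B")), (frozenset("BC"), frozenset("D"))]
decomposition = bcnf_decompose("ABCDE", F)
print(decomposition)
```

Each split replaces a schema by two strictly smaller ones, so the loop terminates; the result depends on which violation is picked first, so other valid BCNF decompositions exist.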
• class (course_id, title, dept_name, credits, sec_id, semester, year, building, room_number, capacity, time_slot_id)
• Functional dependencies:
◦ course_id → title, dept_name, credits
◦ building, room_number → capacity
◦ course_id, sec_id, semester, year → building, room_number, time_slot_id
• A candidate key: {course_id, sec_id, semester, year}.
• BCNF Decomposition:
◦ course_id → title, dept_name, credits holds
. but course_id is not a superkey
◦ We replace class by:
. course(course_id, title, dept_name, credits)
. class-1 (course_id, sec_id, semester, year, building, room_number, capacity, time_slot_id)
BCNF Decomposition (8): Example
• It is not always possible to get a BCNF decomposition that is dependency preserving
• R = (J, K, L)
F = {JK → L, L → K}
Two candidate keys: JK and JL
• R is not in BCNF
• Any decomposition of R will fail to preserve JK → L
This implies that testing for JK → L requires a join
Decompose the following schema to BCNF
• R = ABCDE. F = {A → B, BC → D}
• R = ABCDEH. F = {A → BC, E → HA}
• R = CSJDPQV. F = {C → CSJDPQV, SD → P, JP → C, J → S}
• R = ABCD. F = {C → D, C → A, B → C}
Comparison
• It is always possible to decompose a relation into a set of relations that are in 3NF such that:
◦ the decomposition is lossless
◦ the dependencies are preserved
• It is always possible to decompose a relation into a set of relations that are in BCNF such that:
◦ the decomposition is lossless
◦ it may not be possible to preserve dependencies.

S#  | 3NF                                                                                | BCNF
1.  | It concentrates on Primary Key                                                     | It concentrates on Candidate Key
2.  | Redundancy is high as compared to BCNF                                             | No redundancy arising from functional dependencies
3.  | It preserves all the dependencies                                                  | It may not preserve the dependencies
4.  | A dependency X → Y is allowed in 3NF if X is a super key or Y is a part of some key | A dependency X → Y is allowed if X is a super key
Module Summary
• Learnt how to decompose a schema into 3NF while preserving dependency and lossless join
• Learnt how to decompose a schema into BCNF with lossless join
Database Management Systems
Module 28: Relational Database Design/8: Case Study
Partha Pratim Das
Module Recap
• Learnt how to decompose a schema into 3NF while preserving dependency and lossless join
• Learnt how to decompose a schema into BCNF with lossless join
Library Information System
We are asked to design a relational database schema for a Library Information System (LIS) of an Institute
• The specification document of the LIS has already been shared with you
• In this presentation, we include the key points from the Specs; but the actual document must be referred to
• We carry out the following tasks in the module:
◦ Identify the Entity Sets with attributes
◦ Identify the Relationships
◦ Build the initial set of relational schema
◦ Refine the set of schema with FDs that hold on them
◦ Finalize the design of the schema
• The coding of various queries in SQL, based on these schema, is left as an exercise
LIS Specification
• An institute library has 200000+ books and 10000+ members
• Books are regularly issued by members on loan and returned after a period.
• The library needs an LIS to manage the books, the members and the issue-return process
• Every book has
◦ title
◦ author (in case of multiple authors, only the first author is maintained)
◦ publisher
◦ year of publication
◦ ISBN number (which is unique for the publication), and
◦ accession number (which is the unique number of the copy of the book in the library)
◦ There may be multiple copies of the same book in the library
• Every faculty has
◦ name
◦ employee id
◦ department
◦ gender
◦ mobile number, and
◦ date of joining
• Library also issues a unique membership number to every member. Every member has a maximum quota for the number of books she / he can issue for the maximum duration allowed to her / him. Currently these are set as:
◦ Each undergraduate student can issue up to 2 books for 1 month duration
◦ Each postgraduate student can issue up to 4 books for 1 month duration
◦ Each research scholar can issue up to 6 books for 3 months duration
◦ Each faculty member can issue up to 10 books for six months duration
• The library has the following rules for issue:
◦ A book may be issued to a member if it is not already issued to someone else (trivial)
◦ A book may not be issued to a member if another copy of the same book is already issued to the same member
◦ No issue will be done to a member if at the time of issue one or more of the books issued by the member has already exceeded its duration of issue
◦ No issue will be allowed also if the quota is exceeded for the member
◦ It is assumed that the name of every author or member has two parts
. first name
. last name
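The issue rules and quotas can be expressed as a single check, sketched below. Everything named here is an assumption for illustration: the function `can_issue`, the `QUOTA` table, the member type codes ug / pg / rs / fc, and the approximation of one month as 30 days (the specification does not define month length).

```python
from datetime import date, timedelta

# member type -> (max books, max duration in days); a month is taken as 30 days (assumption)
QUOTA = {"ug": (2, 30), "pg": (4, 30), "rs": (6, 90), "fc": (10, 180)}


def can_issue(member_type, member_loans, isbn, copy_is_issued, today):
    """Apply the four issue rules.

    member_loans: list of (isbn, date_of_issue) for books the member currently holds.
    copy_is_issued: True if the requested copy is already out on loan.
    """
    max_books, max_days = QUOTA[member_type]
    if copy_is_issued:
        return False                                  # copy already issued to someone
    if any(loan_isbn == isbn for loan_isbn, _ in member_loans):
        return False                                  # member already holds a copy of this book
    if any(today - doi > timedelta(days=max_days) for _, doi in member_loans):
        return False                                  # member has an overdue book
    if len(member_loans) >= max_books:
        return False                                  # quota would be exceeded
    return True


today = date(2024, 3, 1)
loans = [("978-0-07-802215-9", date(2024, 2, 20))]    # hypothetical loan record
print(can_issue("ug", loans, "978-0-13-468599-1", False, today))
```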
LIS Entity Sets
• Every book has title, author (in case of multiple authors, only the first author is maintained), publisher, year of publication, ISBN number (which is unique for the publication), and accession number (which is the unique number of the copy of the book in the library). There may be multiple copies of the same book in the library
• Entity Set:
◦ books
• Attributes:
◦ title
◦ author_name (composite)
◦ publisher
◦ year
◦ ISBN_no
◦ accession_no
• Every student has name, roll number, department, gender, mobile number, date of birth, and degree (undergrad, grad, doctoral)
• Entity Set:
◦ students
• Attributes:
◦ member_no (unique)
◦ name (composite)
◦ roll_no (unique)
◦ department
◦ gender
◦ mobile_no (may be null)
◦ dob
◦ degree
• Every faculty has name, employee id, department, gender, mobile number, and date of joining
• Entity Set:
◦ faculty
• Attributes:
◦ member_no (unique)
◦ name (composite)
◦ id (unique)
◦ department
◦ gender
◦ mobile_no (may be null)
◦ doj
• Library also issues a unique membership number to every member. There are four categories of members of the library: undergraduate students, postgraduate students, research scholars, and faculty members
• Entity Set:
◦ members
• Attributes:
◦ member_no
◦ member_type (takes a value in ug, pg, rs or fc)
• Every member has a maximum quota for the number of books she / he can issue for the maximum duration allowed to her / him. Currently these are set as:
◦ Each undergraduate student can issue up to 2 books for 1 month duration
◦ Each postgraduate student can issue up to 4 books for 1 month duration
◦ Each research scholar can issue up to 6 books for 3 months duration
◦ Each faculty member can issue up to 10 books for six months duration
• Entity Set:
◦ quota
• Attributes:
◦ member_type
◦ max_books
◦ max_duration
• Though not explicitly stated, the library would have staff to manage the LIS
• Entity Set:
◦ staff
• Attributes: (speculated; to be ratified by the customer)
◦ name (composite)
◦ id (unique)
◦ gender
◦ mobile_no
◦ doj
LIS Relationships
• Books are regularly issued by members on loan and returned after a period. The library needs an LIS to manage the books, the members and the issue-return process
• Relationship
◦ book_issue
• Involved Entity Sets
◦ students / faculty / members
. member_no
◦ books
. accession_no
• Relationship Attribute
◦ doi (date of issue)
• Type of relationship
◦ Many-to-one from books
LIS Relational Schema PPD
• books(title, author_fname, author_lname, publisher, year, ISBN_no, accession_no)
• book_issue(member_no, accession_no, doi)
• members(member_no, member_type)
• quota(member_type, max_books, max_duration)
• students(member_no, student_fname, student_lname, roll_no, department, gender, mobile_no, dob, degree)
• faculty(member_no, faculty_fname, faculty_lname, id, department, gender, mobile_no, doj)
• staff(staff_fname, staff_lname, id, gender, mobile_no, doj)
LIS Schema Refinement
• books(title, author_fname, author_lname, publisher, year, ISBN_no, accession_no)
◦ ISBN_no → title, author_fname, author_lname, publisher, year
◦ accession_no → ISBN_no
◦ Key: accession_no
• Redundancy of book information across copies
• Good to normalize:
◦ book_catalogue(title, author_fname, author_lname, publisher, year, ISBN_no)
. ISBN_no → title, author_fname, author_lname, publisher, year
. Key: ISBN_no
◦ book_copies(ISBN_no, accession_no)
. accession_no → ISBN_no
. Key: accession_no
• Both in BCNF. Decomposition is lossless join and dependency preserving
• students(member_no, student_fname, student_lname, roll_no, department, gender, mobile_no, dob, degree)
◦ roll_no → student_fname, student_lname, department, gender, mobile_no, dob, degree
◦ member_no → roll_no
◦ roll_no → member_no
◦ 2 Keys: roll_no | member_no
• In BCNF
• faculty(member_no, faculty_fname, faculty_lname, id, department, gender, mobile_no, doj)
◦ id → faculty_fname, faculty_lname, department, gender, mobile_no, doj
◦ id → member_no
◦ member_no → id
◦ 2 Keys: id | member_no
• In BCNF
• Issues:
◦ member_no is needed for issue / return queries. It is unnecessary to have faculty details with that.
◦ member_no may also come from the students relation.
◦ member_type is needed for issue / return queries. This is implicit in the fact that we are in the faculty relation.
There are 4 categories of members: ug students, grad students, research scholars, and faculty members. This leads to the following specialization relationships:
• Consider the entity set members of a library and refine:
◦ Attributes:
. member_no
. member_class: 'student' or 'faculty', used to choose table
. member_type: ug, pg, rs, fc, ...
. roll_no (if member_class is 'student'; else null)
. id (if member_class is 'faculty'; else null)
• We can then exploit some hidden relationship:
◦ students IS A members
◦ faculty IS A members
• Type of relationship
◦ One-to-one
• members(member_no, member_class, member_type, roll_no, id)
◦ member_no → member_type, member_class, roll_no, id
◦ member_type → member_class
◦ Key: member_no
• students(student_fname, student_lname, roll_no, department, gender, mobile_no, dob, degree)
◦ roll_no → student_fname, student_lname, department, gender, mobile_no, dob, degree
◦ Keys: roll_no
◦ Note:
. member_no is no longer used
. member_type and member_class are set in members from degree at the time of creation of a new record.
Module 28
• faculty(faculty fname, faculty lname, id, department, gender, mobile no, doj)
◦ id → faculty fname, faculty lname, department, gender, mobile no, doj
◦ Keys: id
◦ Note:
. member no is no longer used
. member type and member class are set in members at the time of creation of a new record
• book catalogue(title, author fname, author lname, publisher, year, ISBN no)
• book copies(ISBN no, accession no)
• book issue(member no, accession no, doi)
• quota(member type, max books, max duration)
• members(member no, member class, member type, roll no, id)
• students(student fname, student lname, roll no, department, gender, mobile no, dob, degree)
• faculty(faculty fname, faculty lname, id, department, gender, mobile no, doj)
• staff(staff fname, staff lname, id, gender, mobile no, doj)
• Using the specification for a Library Information System, we have illustrated how a schema can be designed and then refined for finalization
• Coding of various queries based on these schemas is left as an exercise
Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.
Database Management Systems
Module 29: Relational Database Design/9: MVD and 4NF
Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
• Using the specification for a Library Information System, we have illustrated how a schema can be designed and then refined for finalization
• To understand multi-valued dependencies arising out of attributes that can have multiple values
• To define Fourth Normal Form and learn the decomposition algorithm to 4NF
Multivalued Dependency
There are no non-trivial FDs because all attributes together form the candidate key, that is, MDP. In the above relation, two multivalued dependencies exist:
• Man ↠ Phones
• Man ↠ Dogs Like
A man’s phones are independent of the dogs they like. But after converting the above relation into single-valued attributes, each of a man’s phones appears with each of the dogs they like in all combinations.
Source: http://www.edugrabs.com/multivalued-dependency-mvd/
• If two or more independent relations are kept in a single relation, then a multivalued dependency is possible. For example, let there be two relations:
◦ Student(SID, Sname) where (SID → Sname)
◦ Course(CID, Cname) where (CID → Cname)
• There is no relation defined between Student and Course. If we keep them in a single relation named Student Course, then an MVD will exist because of the m:n cardinality
• If two or more multivalued attributes exist in a relation, then MVDs arise when they are converted into single-valued attributes (SVAs).
Source: http://www.edugrabs.com/multivalued-dependency-mvd/
• Suppose we record names of children, and phone numbers for instructors:
◦ inst child(ID, child name)
◦ inst phone(ID, phone number)
• If we were to combine these schema to get
◦ inst info(ID, child name, phone number)
◦ Example data:
(99999, David, 512-555-1234)
(99999, David, 512-555-4321)
(99999, William, 512-555-1234)
(99999, William, 512-555-4321)
• This relation is in BCNF
◦ Why?
• Let R be a relation schema and let α ⊆ R and β ⊆ R. The multivalued dependency
α ↠ β
holds on R if in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], there exist tuples t3 and t4 in r such that:
t1[α] = t2[α] = t3[α] = t4[α]
t3[β] = t1[β]
t3[R – β] = t2[R – β]
t4[β] = t2[β]
t4[R – β] = t1[R – β]
• Test: course ↠ book
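The definition of α ↠ β can be tested directly on a relation instance. The sketch below is one possible brute-force check, assuming a relation stored as a list of tuples and using the inst info example data; the function name `satisfies_mvd` and the underscore-style attribute names are illustrative. It only tests one instance, whereas the definition quantifies over all legal relations.

```python
from itertools import product


def satisfies_mvd(rows, attrs, alpha, beta):
    """Check whether alpha ->> beta holds on one relation instance.

    rows  : iterable of tuples
    attrs : attribute names, in column order
    alpha, beta : tuples of attribute names
    """
    idx = {a: i for i, a in enumerate(attrs)}
    rest = [a for a in attrs if a not in alpha and a not in beta]

    def proj(t, names):
        return tuple(t[idx[a]] for a in names)

    rel = set(rows)
    # Ordered pairs cover both t3 (for t1, t2) and t4 (for the swapped pair).
    for t1, t2 in product(rel, repeat=2):
        if proj(t1, alpha) != proj(t2, alpha):
            continue
        # t3 agrees with t1 on beta and with t2 on the remaining attributes.
        found = any(
            proj(t3, alpha) == proj(t1, alpha)
            and proj(t3, beta) == proj(t1, beta)
            and proj(t3, rest) == proj(t2, rest)
            for t3 in rel
        )
        if not found:
            return False
    return True


ATTRS = ("ID", "child_name", "phone_number")
INST_INFO = [
    (99999, "David", "512-555-1234"),
    (99999, "David", "512-555-4321"),
    (99999, "William", "512-555-1234"),
    (99999, "William", "512-555-4321"),
]
```

With all four tuples present, `satisfies_mvd(INST_INFO, ATTRS, ("ID",), ("child_name",))` holds; dropping any one tuple breaks the dependency, which is exactly the redundancy that an MVD forces.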
• Let R be a relation schema with a set of attributes that are partitioned into 3 nonempty subsets Y, Z, W.
• We say that Y ↠ Z (Y multidetermines Z) if and only if for all possible relations r(R)
< y1, z1, w1 > ∈ r and < y1, z2, w2 > ∈ r
then
< y1, z1, w2 > ∈ r and < y1, z2, w1 > ∈ r
• Note that since the behavior of Z and W are identical it follows that
Y ↠ Z if Y ↠ W
• An MVD X ↠ Y on R is trivial if
◦ Y is a subset of X (X ⊇ Y) or
◦ X ∪ Y = R. Otherwise, it is a non-trivial MVD and we have to repeat values redundantly in the tuples.
• From the definition of multivalued dependency, we can derive the following rule:
◦ If α → β, then α ↠ β
That is, every functional dependency is also a multivalued dependency
• The closure D+ of D is the set of all functional and multivalued dependencies logically implied by D.
◦ We can compute D+ from D, using the formal definitions of functional dependencies and multivalued dependencies.
◦ We can manage with such reasoning for very simple multivalued dependencies, which seem to be most common in practice
◦ For complex dependencies, it is better to reason about sets of dependencies using a system of inference rules
Decomposition to 4NF
• A relation schema R is in 4NF with respect to a set D of functional and multivalued dependencies if for all multivalued dependencies in D+ of the form α ↠ β, where α ⊆ R and β ⊆ R, at least one of the following holds:
◦ α ↠ β is trivial (that is, β ⊆ α or α ∪ β = R)
◦ α is a superkey for schema R
• If a relation is in 4NF it is in BCNF
• The restriction of D to Ri is the set Di consisting of
◦ All functional dependencies in D+ that include only attributes of Ri
◦ All multivalued dependencies of the form
α ↠ (β ∩ Ri)
where α ⊆ Ri and α ↠ β is in D+
result := {R};
done := false;
compute D+;
Let Di denote the restriction of D+ to Ri
while (not done)
    if (there is a schema Ri in result that is not in 4NF) then
    begin
        let α ↠ β be a nontrivial multivalued dependency that holds
        on Ri such that α → Ri is not in Di, and α ∩ β = ∅;
        result := (result − Ri) ∪ (Ri − β) ∪ (α, β);
    end
    else done := true;
Note: each Ri is in 4NF, and decomposition is lossless-join
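One step of the algorithm replaces Ri by (α, β) and (Ri − β). The sketch below applies such a step to a small instance and rejoins the pieces, to illustrate the lossless-join note. It is an instance-level illustration only, not the schema-level algorithm (it does not compute D+); the sample men, phones and dogs, and the helper names `project`, `natural_join` and `split_on_mvd`, are hypothetical.

```python
def project(rows, attrs, keep):
    """Projection of a relation instance onto the attributes in `keep`."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(r[i] for i in idx) for r in rows}


def natural_join(r1, attrs1, r2, attrs2):
    """Natural join on the attributes common to both inputs."""
    common = [a for a in attrs1 if a in attrs2]
    extra = [a for a in attrs2 if a not in attrs1]
    out = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[attrs1.index(a)] == t2[attrs2.index(a)] for a in common):
                out.add(tuple(t1) + tuple(t2[attrs2.index(a)] for a in extra))
    return out, list(attrs1) + extra


def split_on_mvd(rows, attrs, alpha, beta):
    """One decomposition step: R becomes (alpha, beta) and (R - beta)."""
    a1 = list(alpha) + [b for b in beta if b not in alpha]
    a2 = [a for a in attrs if a not in beta]
    return (project(rows, attrs, a1), a1), (project(rows, attrs, a2), a2)


# Hypothetical instance where man ->> phone and man ->> dog both hold.
PERSON_ATTRS = ["man", "phone", "dog"]
PERSON = [
    ("M1", "P1", "D1"), ("M1", "P1", "D2"),
    ("M1", "P2", "D1"), ("M1", "P2", "D2"),
    ("M2", "P3", "D2"),
]

(man_phone, mp_attrs), (man_dog, md_attrs) = split_on_mvd(
    PERSON, PERSON_ATTRS, ["man"], ["phone"]
)
rejoined, rejoined_attrs = natural_join(man_phone, mp_attrs, man_dog, md_attrs)
```

Here `rejoined` contains the same tuples as `PERSON`, while the two projections together store each phone and each dog only once per man.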
• Example:
◦ Person Modify(Man(M), Phones(P), Dog Likes(D), Address(A))
◦ FDs:
. FD1 : Man ↠ Phones
. FD2 : Man ↠ Dogs Like
. FD3 : Man → Address
◦ Key = MPD
◦ All dependencies violate 4NF
Figure: Post Normalization
• R = (A, B, C, G, H, I)
F = { A ↠ B
      B ↠ HI
      CG ↠ H }
• R is not in 4NF since A ↠ B and A is not a superkey for R
• Decomposition
a) R1 = (A, B) (R1 is in 4NF)
b) R2 = (A, C, G, H, I) (R2 is not in 4NF, decompose into R3 and R4)
c) R3 = (C, G, H) (R3 is in 4NF)
d) R4 = (A, C, G, I) (R4 is not in 4NF, decompose into R5 and R6)
◦ A ↠ B and B ↠ HI imply A ↠ HI (MVD transitivity),
◦ and hence A ↠ I (MVD restriction to R4)
e) R5 = (A, I) (R5 is in 4NF)
f) R6 = (A, C, G) (R6 is in 4NF)
• Understood multi-valued dependencies to handle attributes that can have multiple values
• Learnt Fourth Normal Form and decomposition to 4NF
Database Management Systems
Module 30: Relational Database Design/10: Design Summary and Temporal Data
Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
• Understood multi-valued dependencies to handle attributes that can have multiple values
• Learnt Fourth Normal Form and decomposition to 4NF
Database Design Process
• Goal for a relational database design is:
◦ BCNF / 4NF
◦ Lossless join
◦ Dependency preservation
• If we cannot achieve this, we accept one of
◦ Lack of dependency preservation
◦ Redundancy due to use of 3NF
• Interestingly, SQL does not provide a direct way of specifying functional dependencies other than superkeys.
• Can specify FDs using assertions, but they are expensive to test (and currently not supported by any of the widely used databases!)
• Even if we had a dependency preserving decomposition, using SQL we would not be able to efficiently test a functional dependency whose left hand side is not a key
• Further NFs
◦ Elementary Key Normal Form (EKNF)
◦ Essential Tuple Normal Form (ETNF)
◦ Join Dependencies and Fifth Normal Form (5NF)
◦ Sixth Normal Form (6NF)
◦ Domain/Key Normal Form (DKNF)
• Join dependencies generalize multivalued dependencies
◦ lead to project-join normal form (PJNF) (also called fifth normal form)
• A class of even more general constraints leads to a normal form called domain-key normal form.
• Problem with these generalized constraints: they are hard to reason with, and no sound and complete set of inference rules exists.
• Hence rarely used
• We have assumed schema R is given
◦ R could have been generated when converting E-R diagram to a set of tables
◦ R could have been a single relation containing all attributes that are of interest (universal relation)
◦ Normalization breaks R into smaller relations
◦ R could have been the result of some ad hoc design of relations, which we then test/convert to normal form
• When an E-R diagram is carefully designed, identifying all entities correctly, the tables generated from the E-R diagram should not need further normalization
• However, in a real (imperfect) design, there can be functional dependencies from non-key attributes of an entity to other attributes of the entity
◦ Example: an employee entity with attributes department name and building, and a functional dependency department name → building
◦ Good design would have made department an entity
• Functional dependencies from non-key attributes of a relationship set are possible, but rare (most relationships are binary)
• May want to use non-normalized schema for performance
• For example, displaying prereqs along with course id, and title requires join of course with prereq
◦ Course(course id, title, . . . )
◦ Prerequisite(course id, prereq)
• Alternative 1: Use denormalized relation containing attributes of course as well as prereq with all above attributes: Course(course id, title, prereq, . . . )
◦ faster lookup
◦ extra space and extra execution time for updates
◦ extra coding work for programmer and possibility of error in extra code
• Alternative 2: Use a materialized view defined as Course ⋈ Prerequisite
◦ Benefits and drawbacks same as above, except no extra coding work for programmer and avoids possible errors
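The trade-off of Alternative 1 can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table and column names (`course`, `prereq`, `prereq_id`, `course_denorm`) and the sample rows are assumptions made for illustration. SQLite has no materialized views, so the denormalized copy is built with CREATE TABLE ... AS and is not refreshed automatically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE prereq (course_id TEXT, prereq_id TEXT,
                     PRIMARY KEY (course_id, prereq_id));
INSERT INTO course VALUES ('CS-101', 'Intro to CS'), ('CS-190', 'Game Design');
INSERT INTO prereq VALUES ('CS-190', 'CS-101');

-- Alternative 1: a denormalized copy that application code must keep in sync.
CREATE TABLE course_denorm AS
    SELECT c.course_id, c.title, p.prereq_id
    FROM course c JOIN prereq p ON p.course_id = c.course_id;
""")

# Normalized schema: every lookup pays for the join.
joined = conn.execute("""
    SELECT c.course_id, c.title, p.prereq_id
    FROM course c JOIN prereq p ON p.course_id = c.course_id
    ORDER BY c.course_id
""").fetchall()

# Denormalized schema: a single-table lookup returns the same rows.
denorm = conn.execute(
    "SELECT course_id, title, prereq_id FROM course_denorm ORDER BY course_id"
).fetchall()

# An update to the base table is not reflected in the copy unless the
# programmer writes extra code to propagate it.
conn.execute("UPDATE course SET title = 'Game Design I' WHERE course_id = 'CS-190'")
fresh = conn.execute(
    "SELECT title FROM course WHERE course_id = 'CS-190'").fetchone()[0]
stale = conn.execute(
    "SELECT title FROM course_denorm WHERE course_id = 'CS-190'").fetchone()[0]
```

After the update, `fresh` and `stale` differ, which is the "extra coding work and possibility of error" that a system-maintained materialized view is meant to remove.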
• Some aspects of database design are not caught by normalization
• Examples of bad database design, to be avoided:
Instead of earnings(company id, year, amount), use
◦ earnings 2004, earnings 2005, earnings 2006, etc., all on the schema (company id, earnings).
. Above are in BCNF, but make querying across years difficult and need a new table each year
◦ company year(company id, earnings 2004, earnings 2005, earnings 2006)
. Also in BCNF, but also makes querying across years difficult and requires a new attribute each year.
. Is an example of a crosstab, where values for one attribute become column names
. Used in spreadsheets, and in data analysis tools
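To see why the crosstab form makes queries across years awkward, the sketch below converts company year rows into the recommended earnings(company id, year, amount) form, after which a range query over years is a simple filter. The sample figures and the helper names `unpivot` and `total_between` are hypothetical.

```python
# Crosstab form: one column per year (hypothetical sample figures).
company_year = [
    {"company_id": "C1", "earnings_2004": 10, "earnings_2005": 12, "earnings_2006": 15},
    {"company_id": "C2", "earnings_2004": 7, "earnings_2005": 9, "earnings_2006": 8},
]


def unpivot(rows, prefix="earnings_"):
    """Turn each year column into its own (company_id, year, amount) row."""
    out = []
    for row in rows:
        for col, value in row.items():
            if col.startswith(prefix):
                out.append((row["company_id"], int(col[len(prefix):]), value))
    return out


earnings = unpivot(company_year)


def total_between(rows, company_id, first_year, last_year):
    """Sum of earnings over a range of years; the year is ordinary data here."""
    return sum(
        amount
        for cid, year, amount in rows
        if cid == company_id and first_year <= year <= last_year
    )
```

In the crosstab form the same query would have to name each year column explicitly and be rewritten whenever a new year is added.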
• Consider a different version of relation book catalogue having the following attributes:
◦ book title
◦ author fname, author lname: A book title may be associated with more than one author.
◦ edition
• book catalogue {book title, author fname, author lname, edition}
Figure: book catalogue
• Since the relation has no FDs, it is already in BCNF.
• However, the relation has two nontrivial MVDs book title ↠ {author fname, author lname} and book title ↠ edition. Thus, it is not in 4NF.
• Nontrivial MVDs must be decomposed to convert it into a set of relations in 4NF.
LIS Example 4NF (3)
• We decompose book catalogue into book author and book edition because:
◦ book author has trivial MVD book title ↠ {author fname, author lname}
◦ book edition has trivial MVD book title ↠ edition.
Figure: book author
Temporal Databases
• Some data may be inherently historical because they include time-dependent / time-varying data, such as:
◦ Medical Records
◦ Judicial records
◦ Share prices
◦ Exchange rates
◦ Interest rates
◦ Company profits
◦ etc.
• The desire to model such data means that we need to store not only the respective value but also an associated date or a time period for which the value is valid. Typical queries expressed informally might include:
◦ Give me last month’s history of the Dollar-Pound Sterling exchange rate.
◦ Give me the share prices of the NYSE on October 17, 1996.
• Temporal databases provide a uniform and systematic way of dealing with historical data
Source: https://www.cs.uct.ac.za/mit_notes/database/htmls/chp18.html
Temporal Data
• Temporal data have an associated time interval during which the data are valid.
• A snapshot is the value of the data at a particular point in time
• In practice, database designers may add start and end time attributes to relations
• For example, course(course id, course title) is replaced by course(course id, course title, start, end)
◦ Constraint: no two tuples can have overlapping valid times. This is hard to enforce efficiently
◦ Foreign key references may be to current version of data, or to data at a point in time
. For example, student transcript should refer to course information at the time the course was taken
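The non-overlap constraint on course(course id, course title, start, end) can be checked in application code. The sketch below assumes half-open validity intervals [start, end) and uses `date.max` for an open-ended version; these conventions, the sample rows, and the helper names are assumptions, not part of the slides. It reports overlapping neighbouring versions after sorting by start date, which is enough to detect whether any overlap exists.

```python
from datetime import date


def overlaps(a_start, a_end, b_start, b_end):
    """Half-open intervals overlap iff each one starts before the other ends."""
    return a_start < b_end and b_start < a_end


def find_overlaps(rows):
    """rows: (course_id, title, start, end) tuples.

    Returns pairs of neighbouring versions of the same course whose valid
    times overlap. If any two versions overlap, at least one pair is reported.
    """
    by_course = {}
    for row in rows:
        by_course.setdefault(row[0], []).append(row)
    conflicts = []
    for versions in by_course.values():
        versions.sort(key=lambda r: r[2])  # order versions by start date
        for prev, cur in zip(versions, versions[1:]):
            if overlaps(prev[2], prev[3], cur[2], cur[3]):
                conflicts.append((prev, cur))
    return conflicts


# Hypothetical history of one course: two consecutive, non-overlapping versions.
COURSE = [
    ("CS-101", "Intro to CS", date(2019, 1, 1), date(2021, 1, 1)),
    ("CS-101", "Intro to Computing", date(2021, 1, 1), date.max),
]

# A third version that straddles the other two violates the constraint.
BAD_ROW = ("CS-101", "Computing Basics", date(2020, 6, 1), date(2022, 1, 1))
```

Running such a scan on every insert is costly on a large table, which is the sense in which the constraint is hard to enforce efficiently.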
• There are two different aspects of time in temporal databases.
◦ Valid Time: Time period during which a fact is true in the real world, provided to the system.
◦ Transaction Time: Time period during which a fact is stored in the database, based on transaction serialization order; it is the timestamp generated automatically by the system.
• A Temporal Relation is one where each tuple has an associated time: either valid time or transaction time or both.
◦ Uni-Temporal Relations: Have one axis of time, either Valid Time or Transaction Time.
◦ Bi-Temporal Relations: Have both axes of time, Valid Time and Transaction Time. They include Valid Start Time, Valid End Time, Transaction Start Time, Transaction End Time.
Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
Modeling Temporal Data: Example (1)
• Example.
◦ Let’s see an example of a person, John:
. John was born on April 3, 1992 in Chennai.
. His father registered his birth after three days on April 6, 1992.
. John did his entire schooling and college in Chennai.
. He got a job in Mumbai and shifted to Mumbai on June 21, 2015.
. He registered his change of address only on Jan 10, 2016.
Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
• John’s Data In Non-Temporal Database
In a non-temporal database, John’s address is entered as Chennai from 1992. When he registers his new address in 2016, the database gets updated and the address field now shows his Mumbai address. The previous Chennai address details will not be available. So, it will be difficult to find out exactly when he was living in Chennai and when he moved to Mumbai.
• Uni-Temporal Relation (Adding Valid Time To John’s Data)
Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
Modeling Temporal Data: Example (4)
• Bi-Temporal Relation (John’s Data Using Both Valid And Transaction Time)
• The database contents look like this:
Name, City, Valid From, Valid Till, Entered, Superseded
• John’s father registers his birth on 6th April 1992:
Person(John, Chennai, 3-Apr-1992, ∞, 6-Apr-1992, ∞).
• On January 10, 2016 John reports his new address in Mumbai:
Person(John, Mumbai, 21-June-2015, ∞, 10-Jan-2016, ∞).
◦ The original entry is updated as:
Person(John, Chennai, 3-Apr-1992, 20-June-2015, 6-Apr-1992, 10-Jan-2016).
Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
• Advantages
◦ The main advantage of bi-temporal relations is that they provide historical and rollback information.
. Historical Information – Valid Time.
. Rollback Information – Transaction Time.
◦ For example, you can get the result for a query on John’s history, like: Where did John live in the year 2001? The result for this query can be obtained from the valid time entry. The transaction time entry is important to get the rollback information.
• Disadvantages
◦ More storage
◦ Complex query processing
◦ Complex maintenance including backup and recovery
Source: https://www.mytecbits.com/oracle/oracle-database/what-is-temporal-database
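The two kinds of query can be sketched over the two Person tuples from the bi-temporal example. This is a minimal illustration, assuming ∞ is represented by `date.max`, valid-time intervals are inclusive at both ends, and a row stops being current on its Superseded date; the function names `lived_in` and `recorded_as_of` are hypothetical.

```python
from datetime import date

FOREVER = date.max  # stands in for the ∞ end point

# (name, city, valid_from, valid_till, entered, superseded)
PERSON = [
    ("John", "Chennai", date(1992, 4, 3), date(2015, 6, 20),
     date(1992, 4, 6), date(2016, 1, 10)),
    ("John", "Mumbai", date(2015, 6, 21), FOREVER,
     date(2016, 1, 10), FOREVER),
]


def lived_in(rows, name, on):
    """Valid-time query: where was the person actually living on `on`?"""
    return [
        city
        for n, city, valid_from, valid_till, _, _ in rows
        if n == name and valid_from <= on <= valid_till
    ]


def recorded_as_of(rows, name, as_of):
    """Transaction-time query: which city did the database hold on `as_of`?"""
    return [
        city
        for n, city, _, _, entered, superseded in rows
        if n == name and entered <= as_of < superseded
    ]
```

For a date such as 1 December 2015 the two queries give different answers: John was already living in Mumbai (valid time), but the database still recorded Chennai (transaction time), because the change was only registered in January 2016.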
Module Summary
Database Management Systems
Module 31: Application Design and Development/1: Architecture
Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
• Studied the Normal Forms and their Importance in Relational Design – how progressive increase of constraints can minimize redundancy in a schema
• Learnt how to decompose a schema into 3NF while preserving dependency and lossless join
• Learnt how to decompose a schema into BCNF with lossless join
• Using the specification for a Library Information System, we have illustrated how a schema can be designed and then refined for finalization
• Coding of various queries based on these schemas is left as an exercise
• Understood multi-valued dependencies to handle attributes that can have multiple values
• Learnt Fourth Normal Form and decomposition to 4NF
• Discussed aspects of the database design process
• Studied the issues with temporal data
Module Objectives PPD
• What are the Application Programs across various sectors?
• Commonality of architecture across applications
• Understanding the classification and evolution of the architectures
• A look at the architecture for a few sample applications
• Sample application architectures
◦ Most use an RDBMS like Oracle, DB2, MySQL, PostgreSQL, etc. for managing data
◦ Applications are functionally split into frontend layer, middle layer, backend layer
. Frontend or Presentation Layer / Tier
− Interacts with the user: Display / View, Input / Output
− Choose item, Add to cart, Checkout, Pay, Track order
− Interfaces may be Browser-based, Mobile App, or Custom
. Middle or Application / Business Logic Layer / Tier
− Implements the Functionality of the Application: Links front and backend
− Authentication, Search / Browse logic, Pricing, Cart management, Payment handling (gateway), Order management (mail / SMS / internal actions), Delivery management
− Support functionality based on frontend interface
. Backend or Data Access Layer / Tier
− Manages persistent data, large volume, efficient access, security
− User, Cart, Inventory, Order, Vendor databases
Characteristic of Application Programs (2): Architecture PPD
• Presentation Layer / Tier
◦ Model-View-Controller (MVC) architecture
. model: business logic
. view: presentation of data, depends on display device
. controller: receives events, executes actions, and returns a view to the user
• Business Logic Layer / Tier
◦ provides high level view of data and actions on data
. often using an object data model
◦ hides details of data storage schema
• Data Access Layer / Tier
◦ interfaces between business logic layer and the underlying database
◦ provides mapping from object model of business layer to relational model of database
◦ Already discussed and studied in depth
Application Architecture (2): MVC
Module 31
Partha Pratim • Web browsers have become the de-facto standard user interface to databases
Das
◦ Enable large numbers of users to access databases from anywhere
Week Recap
◦ Avoid the need for downloading / installing specialized code, while providing a good
Objectives &
Outline graphical user interface
Application
Programs &
. Javascript, Flash and other scripting languages run in browser, but are
Architecture downloaded transparently
Architectures
Classification ◦ Examples: banks, airline and rental car reservations, university course registration
1-Tier
2-Tier and grading, and so on.
3-Tier
n-Tier • Use in Mobile Devices are getting popular
Sample Applications
Module Summary
◦ Mobile Apps or Browser in Mobile
◦ These are similar in architecture and workflow with web, but have significant
differences with their smaller (but wide range of) form factor, and extremely low
resources
◦ Will be discussed later
Database Management Systems Partha Pratim Das 31.12
Application Architecture (4): Business Logic Layer
• Enforces business rules for carrying out actions
◦ For example, student can enroll in a class only if she has completed prerequisites, and has paid her tuition fees
• Supports workflows which define how a task involving multiple participants is to be carried out
◦ For example, how to process application by a student applying to a university
◦ Sequence of steps to carry out task
◦ Error handling
. For example, what to do if recommendation letters not received on time
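The enrollment rule above can be written as a small piece of business-logic code that sits between the user interface and the database. The following is a minimal sketch; the `Student` class, the `PREREQS` table, the course identifiers and the `EnrollmentError` exception are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Student:
    name: str
    completed: set = field(default_factory=set)  # course ids already passed
    fees_paid: bool = False


# Hypothetical prerequisite table: course id -> set of required course ids.
PREREQS = {"CS-301": {"CS-101", "CS-201"}}


class EnrollmentError(Exception):
    """Raised when a business rule blocks an enrollment."""


def enroll(student, course_id, prereqs=PREREQS):
    """Apply both rules before the enrollment would be written to the database."""
    missing = prereqs.get(course_id, set()) - student.completed
    if missing:
        raise EnrollmentError(f"missing prerequisites: {sorted(missing)}")
    if not student.fees_paid:
        raise EnrollmentError("tuition fees not paid")
    return (student.name, course_id)  # the tuple a data access layer would store
```

Keeping the rule in one function means that every frontend (browser, mobile app) gets the same check, and a failed rule is reported as an error rather than silently producing an invalid row.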
Partha Pratim • Allows application code to be written on top of object-oriented data model, while
Das
storing data in a traditional relational database
Week Recap
◦ alternative: implement object-oriented or object-relational database to store object
Objectives &
Outline model
Application
Programs &
. has not been commercially successful
Architecture
Architectures
• Schema designer has to provide a mapping between object data and relational schema
Classification
1-Tier
◦ For example, Java class Student mapped to relation student, with corresponding
2-Tier mapping of attributes
3-Tier
n-Tier ◦ An object can map to multiple tuples in multiple relations
Sample Applications
Module Summary
• Application opens a session, which connects to the database
• Objects can be created and saved to the database using session.save(object)
◦ mapping used to create appropriate tuples in the database
• Query can be run to retrieve objects satisfying specified predicates
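The session workflow above (open a session, save objects, query by predicate) can be sketched as follows. This is a minimal sketch under stated assumptions, not the API of any real object-relational mapping library: the Session class, its save and query methods, and the sample rows are hypothetical, and Python's built-in sqlite3 module stands in for the relational database.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Student:
    ID: str
    name: str
    dept_name: str


class Session:
    """Hypothetical session mapping Student objects to tuples of relation student."""

    def __init__(self, path=":memory:"):
        # Opening the session connects to the database.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS student "
            "(ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT)"
        )

    def save(self, obj: Student) -> None:
        # The mapping turns one object into one tuple of the relation.
        self.conn.execute(
            "INSERT INTO student (ID, name, dept_name) VALUES (?, ?, ?)",
            (obj.ID, obj.name, obj.dept_name),
        )
        self.conn.commit()

    def query(self, dept_name: str) -> list:
        # Retrieve the objects satisfying a predicate (equality on dept_name).
        rows = self.conn.execute(
            "SELECT ID, name, dept_name FROM student "
            "WHERE dept_name = ? ORDER BY ID",
            (dept_name,),
        )
        return [Student(*row) for row in rows]


session = Session()
session.save(Student("00128", "Zhang", "Comp. Sci."))
session.save(Student("19991", "Brandt", "History"))
found = session.query("Comp. Sci.")
print(found)
```

Application code works only with Student objects; the SQL stays inside the session.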
Application Architecture (6): Data Access Layer
• Issues of modeling and design of databases have already been discussed in depth in the previous modules
• Issues of accessing and updating data from applications will be discussed later through the interactions of native languages and SQL
• Database architecture uses programming languages to design a particular type of software for businesses or organizations.
• Database architecture focuses on the design, development, implementation and maintenance of computer programs that store and organize information for businesses, agencies and institutions.
• A database architect develops and implements software to meet the needs of users.
• The design of a DBMS depends on its architecture. It can be
◦ centralized
◦ decentralized
◦ hierarchical
• The architecture of a DBMS can be seen as either single tier or multi-tier:
◦ 1-tier architecture
◦ 2-tier architecture
◦ 3-tier architecture
◦ n-tier architecture
Architecture Evolution
• Three distinct eras of application architecture
◦ Mainframe (1960s and 70s)
◦ Personal computer era (1980s)
◦ Web / Mobile era (1990s onwards)
• One-tier architecture involves putting all of the required components for a software application or technology on a single server or platform
• The two-tier architecture is based on the client-server architecture
• It behaves like a client-server application
• A 3-tier architecture separates its tiers - Presentation, Logic and Data Access - from each other based on the complexity of the users and how they use the data present in the database
• An n-tier architecture distributes different components of the 3 tiers between different servers and adds interface tiers for interactions and workload balancing
• Studied the classification and evolution of the architectures
• Glimpsed at the architecture of a few sample applications
Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.
Database Management Systems
Module 32: Application Design and Development/2: Web Applications
Partha Pratim Das
• To familiarize with the fundamental notions and technologies of the Web
• To learn about scripting
• To learn about Servlets
Web Fundamentals
◦ forms, enabling users to enter data which can then be sent back to the Web server
• On the Web, functionality of pointers is provided by Uniform Resource Locators (URLs).
• URL example: http://www.acm.org/sigmod
◦ The first part indicates how the document is to be accessed (protocol)
. “http” indicates that the document is to be accessed using the Hyper Text Transfer Protocol.
◦ The second part gives the unique name of a machine on the Internet
◦ The rest of the URL identifies the document within the machine
• The local identification can be:
◦ The path name of a file on the machine: a file at C:/WINDOWS/media/Alarm01.wav on the local machine can be accessed as:
. file:///C:/WINDOWS/media/Alarm01.wav
. file://localhost/c:/WINDOWS/media/Alarm01.wav
◦ An identifier (path name) of a program, plus arguments to be passed to the program: searching google.com for ‘silberschatz’ has the URI:
. http://www.google.com/search?q=silberschatz
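The parts of a URL described above can be pulled apart programmatically. A minimal sketch, assuming Python's standard urllib.parse module (the choice of Python is ours, not the slides'):

```python
from urllib.parse import urlsplit, parse_qs

parts = urlsplit("http://www.google.com/search?q=silberschatz")

print(parts.scheme)           # first part: the access protocol
print(parts.netloc)           # second part: the machine name
print(parts.path)             # path name of the program on that machine
print(parse_qs(parts.query))  # arguments passed to the program
```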
URI, URL, and URN PPD
• Uniform Resource Identifier (URI)
• Uniform Resource Locator (URL)
• Uniform Resource Name (URN)
• Relationships:
◦ URIs can be classified as locators (URLs), or as names (URNs), or as both.
◦ A URN functions like a person’s name
◦ A URL resembles that person’s street address.
◦ A URN defines an item’s identity, while the URL provides a method for finding it
• HTML provides formatting, hypertext link, and image display features
◦ including tables, stylesheets (to alter default formatting), etc.
• HTML also provides input features
◦ Select from a set of options
. Pop-up menus, radio buttons, check lists
◦ Enter values
. Text boxes
◦ Filled-in input is sent back to the server, to be acted upon by an executable at the server
• HyperText Transfer Protocol (HTTP) is used for communication with the Web server
<html>
<body>
<table border>
<tr> <th>ID</th> <th>Name</th> <th>Department</th> </tr>
<tr> <td>00128</td> <td>Zhang</td> <td>Comp. Sci.</td> </tr>
···
</table>
<form action="PersonQuery" method=get>
Search for:
<select name="persontype">
<option value="student" selected>Student</option>
<option value="instructor">Instructor</option>
</select> <br>
Name: <input type=text size=20 name="name">
<input type=submit value="submit">
</form>
</body>
</html>
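Because the form above uses method=get, the filled-in values travel as arguments appended to the action name. A minimal sketch of that encoding, assuming Python's standard urllib.parse module and a user who keeps the default option and types the sample name Zhang:

```python
from urllib.parse import urlencode

# Values of the two named form controls; "Zhang" is just sample input.
form_values = {"persontype": "student", "name": "Zhang"}

# The browser appends the encoded name=value pairs to the form's action.
request_target = "PersonQuery?" + urlencode(form_values)
print(request_target)
```

The browser resolves the relative name PersonQuery against the URL of the page that contains the form.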
• Solution: use a cookie
• A web browser is application software for accessing the World Wide Web
• The purpose of a web browser is to fetch content from the Web and display it on a user’s device
• This process begins when the user inputs a URL into the browser starting with either http: or https:
• Once a web page has been retrieved, the rendering engine displays it on the user’s device
◦ A browser engine (or rendering engine) is a core software component of a web browser
◦ The primary job of a browser engine is to transform HTML documents and other resources of a web page into an interactive visual representation on a user’s device
◦ This includes image and video formats supported by the browser
• Web pages usually contain hyperlinks to other pages and resources. Each link contains a URL, and when it is clicked or tapped, the browser navigates to the new resource
• Web browsers are used on a range of devices, including desktops, laptops, tablets, and smartphones. In 2020, an estimated 4.9 billion people used a browser. The most used browser is Google Chrome, with a 64% global market share on all devices, followed by Safari with 19%
Web Servers
• A web server is software and underlying hardware that accepts requests via HTTP or its secure variant HTTPS
• A user agent, such as a web browser or crawler, requests a specific resource using HTTP, and the server responds with the content of that resource or an error message
• The server can also accept and store resources sent from the user agent
• The document name in a URL may identify an executable program that, when run, generates an HTML document
◦ When an HTTP server receives a request for such a document, it executes the program, and sends back the HTML document that is generated
◦ The Web client can pass extra arguments with the name of the document
• To install a new service on the Web, one simply needs to create and install an executable that provides that service
◦ The Web browser provides a graphical user interface to the information service
• Common Gateway Interface (CGI): a standard interface between web and application server
Web Services
• Allow data on the Web to be accessed using a remote procedure call mechanism
• Two approaches are widely used
◦ Representational State Transfer (REST): allows use of a standard HTTP request to a URL to execute a request and return data
. returned data is encoded either in XML, or in JavaScript Object Notation (JSON)
◦ Big Web Services:
. use XML representation for sending request data, as well as for returning results
. standard protocol layer built on top of HTTP
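To make the JSON encoding of returned data concrete, here is a small sketch using Python's standard json module. The record and its field names are sample values chosen for illustration; no particular web service is implied.

```python
import json

# A sample tuple that a REST request might return (illustrative values).
record = {"ID": "00128", "name": "Zhang", "dept_name": "Comp. Sci."}

payload = json.dumps(record)   # what the server would send in the HTTP response body
print(payload)

decoded = json.loads(payload)  # what the client recovers on its side
print(decoded["name"])
```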
Source: Web Application Architecture: A Comprehensive Guide On The What, Why And How
Scripting for Web Applications PPD
• A script is a list of (text) commands that are embedded in a web page or in the server
• They are interpreted and executed by a certain program or scripting engine
• Scripts may be written for a variety of purposes, such as automating processes on a local computer or generating web pages.
• The programming languages in which scripts are written are called scripting languages
• Common scripting languages are VBScript, JavaScript, ASP, PHP, Perl, JSP, etc.
• Browsers can fetch certain scripts (client-side scripts) or programs along with documents, and execute them in “safe mode” at the client site
◦ JavaScript
◦ Macromedia Flash and Shockwave for animation/games
◦ VRML
◦ Applets
• Client-side scripts/programs allow documents to be active
◦ For example, animation by executing programs at the local site
◦ For example, ensure that values entered by users satisfy some correctness checks
◦ Permit flexible interaction with the user.
. Executing programs at the client site speeds up interaction by avoiding many round trips to the server
• Security mechanisms are needed to ensure that malicious scripts do not cause damage to the client machine
◦ Easy for limited capability scripting languages, harder for general purpose programming languages like Java
• For example, Java’s security system ensures that the Java applet code does not make any system calls directly
◦ Disallows dangerous actions such as file writes
◦ Notifies the user about potentially dangerous actions, and allows the option to abort the program or to continue execution.
model (DOM) tree representation of the displayed HTML text
◦ communicate with a Web server to fetch data and modify the current page using fetched data, without needing to reload/refresh the page
. forms the basis of AJAX technology used widely in Web 2.0 applications
. For example, on selecting a country in a drop-down menu, the list of states in that country is automatically populated in a linked drop-down menu
<html> <head>
<script type="text/javascript">
function validate() {
    var credits = document.getElementById("credits").value;
    if (isNaN(credits) || credits <= 0 || credits >= 16) {
        alert("Credits must be a number greater than 0 and less than 16");
        return false;
    }
}
</script>
</head> <body>
<form action="createCourse" onsubmit="return validate()">
Title: <input type="text" id="title" size="20"><br />
Credits: <input type="text" id="credits" size="2"><br />
<input type="submit" value="Submit">
</form>
</body> </html>
Server-Side Scripting
• Server-side scripting simplifies the task of connecting a database to the Web
◦ Define an HTML document with embedded executable code/SQL queries.
◦ Input values from HTML forms can be used directly in the embedded code/SQL queries.
◦ When the document is requested, the Web server executes the embedded code/SQL queries to generate the actual HTML document.
• Numerous server-side scripting languages
◦ JSP, PHP
◦ General purpose scripting languages: VBScript, Perl, Python
• The Java Servlet specification defines an API for communication between the Web / application server and the application program running in the server
◦ For example, methods to get parameter values from Web forms, and to send HTML text back to the client
• The application program (also called a servlet) is loaded into the server
◦ Each request spawns a new thread in the server
. the thread is closed once the request is serviced
• Store/retrieve attribute value pairs for a particular session
◦ session.setAttribute("userid", userid)
◦ session.getAttribute("userid")
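The idea behind per-session attribute storage can be sketched without a servlet container. The following is an illustrative stand-in only, not the Servlet API: the SessionStore class and its method names are hypothetical, and a real server would also send the session identifier to the browser (for example in a cookie).

```python
import uuid


class SessionStore:
    """Hypothetical in-memory store: one attribute dictionary per session ID."""

    def __init__(self):
        self._sessions = {}

    def new_session(self) -> str:
        sid = uuid.uuid4().hex  # random identifier handed to the client
        self._sessions[sid] = {}
        return sid

    def set_attribute(self, sid: str, key: str, value) -> None:
        self._sessions[sid][key] = value

    def get_attribute(self, sid: str, key: str):
        # Returns None when the attribute was never set for this session.
        return self._sessions[sid].get(key)


store = SessionStore()
sid = store.new_session()
store.set_attribute(sid, "userid", "00128")
print(store.get_attribute(sid, "userid"))
```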
across multiple application servers, etc.
• Familiarized with the fundamental notions and technologies of the Web
• Learnt about scripting
• Learnt the notions of Servlets
Database Management Systems
Module 33: Application Design and Development/3: SQL and Native Language
Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
• Familiarized with the fundamental notions and technologies of the Web
• Learnt about scripting
• Learnt the notions of Servlets
Working with SQL and Native Language
• Open Database Connectivity (ODBC) is a standard API for accessing DBMS
• It aims to be independent of database systems and operating systems
• An application written using ODBC can be ported to other platforms, both on the client and server side, with few changes to the data access code
• ODBC is
◦ A standard for application programs to communicate with a database server
◦ An application program interface (API) to
. Open a connection with a database
. Send queries and updates
. Get back results
• Applications such as GUIs, spreadsheets, etc. can use ODBC
• ODBC was originally developed by Microsoft and Simba Technologies during the early 1990s, and became the basis for the Call Level Interface (CLI) standardized by SQL Access Group in the Unix and mainframe field.
ODBC (2): Python Example PPD
• The code uses a data source named “SQLS” from the odbc.ini file to connect and issue a query.
• It creates a table, inserts data using literal and parameterized statements and fetches the data
Source: https://dzone.com/articles/tutorial-connecting-to-odbc-data-sources-with-pyth
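The sequence of steps described above (connect, create a table, insert with a literal and with a parameterized statement, fetch) can be sketched as follows. This is not the code from the cited article. It assumes Python's built-in sqlite3 module as a stand-in so that it runs without an ODBC driver; the table name product and its rows are made up for illustration. With the third-party pyodbc package, the connection step would instead be pyodbc.connect("DSN=SQLS"), and the cursor calls would have the same shape.

```python
import sqlite3

# Stand-in for the ODBC connection (see the note above about pyodbc).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create a table (hypothetical schema).
cur.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")

# Insert using a literal statement ...
cur.execute("INSERT INTO product (id, name) VALUES (1, 'widget')")
# ... and using a parameterized statement with ? placeholders.
cur.execute("INSERT INTO product (id, name) VALUES (?, ?)", (2, "gadget"))
conn.commit()

# Fetch the data back.
cur.execute("SELECT id, name FROM product ORDER BY id")
rows = cur.fetchall()
print(rows)

conn.close()
```

Parameterized statements keep user-supplied values out of the SQL text, which is why they are preferred over building literals by string concatenation.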
• Java Database Connectivity (JDBC) is an API for the programming language Java, which defines how a client may access a database
• It is a Java-based data access technology used for Java database connectivity
• JDBC supports a variety of features for querying and updating data, and for retrieving query results; it also supports metadata retrieval, such as querying about relations present in the database and the names and types of relation attributes
• Model for communicating with the database:
◦ Open a connection
◦ Create a “statement” object
◦ Execute queries using the Statement object to send queries and fetch results
◦ Exception mechanism to handle errors
• JDBC, originally released by Sun Microsystems as part of Java Development Kit (JDK) 1.1 in 1997, is part of the Java Standard Edition platform, from Oracle Corporation
• We show a simple example here to connect to SQL Server from Java using JDBC to execute database commands
• In the example, the sample code makes a connection to the sample database
• Then, using an SQL statement with the SQLServerStatement object, it runs the SQL statement and places the data that it returns into a SQLServerResultSet object
• Next, the sample code calls the custom displayRow method to iterate through the rows of data that are in the result set, and uses the getString method to display some of the data
• Complete example can be found at: Retrieving result set data sample
private static void displayRow(String title, ResultSet rs) throws SQLException {
    System.out.println(title);
    while (rs.next()) { // Iterate on the rows, showing "ProductID" and "Name"
        System.out.println(rs.getString("ProductID") + " : " + rs.getString("Name"));
    }
}
private static void createTable(Statement stmt) throws SQLException {
    // Drop the sample table if it is left over from an earlier run
    stmt.execute("if exists (select * from sys.objects where name = 'Product_JDBC_Sample') "
            + "drop table Product_JDBC_Sample");
    String sql = "CREATE TABLE [Product_JDBC_Sample]("     // Table Name
            + "[ProductID] [int] IDENTITY(1,1) NOT NULL,"  // Attribute 1 (generated automatically)
            + "[Name] [varchar](30) NOT NULL,"             // Attribute 2
            + "[ProductNumber] [nvarchar](25) NOT NULL)";  // Attribute 3
    stmt.execute(sql);
    sql = "INSERT Product_JDBC_Sample VALUES ('Adjustable Time','AR-5381')"; // Add Product 1
    stmt.execute(sql);
    sql = "INSERT Product_JDBC_Sample VALUES ('ML Bottom Bracket','BB-8107')"; // Add Product 2
    stmt.execute(sql);
}
• A bridge is a special kind of driver that uses another driver-based technology
• This driver translates source function-calls into target function-calls
• Programmers usually use such a bridge when they lack a source driver for some database but have access to a target driver
• Common bridges are:
◦ ODBC-to-JDBC (ODBC-JDBC) bridges: An ODBC-JDBC bridge consists of an ODBC driver which uses the services of a JDBC driver to connect to a database. Examples: OpenLink ODBC-JDBC Bridge, SequeLink ODBC-JDBC Bridge
◦ JDBC-to-ODBC (JDBC-ODBC) bridges: A JDBC-ODBC bridge consists of a JDBC driver which employs an ODBC driver to connect to a target database. Examples: OpenLink JDBC-ODBC Bridge, SequeLink JDBC-ODBC Bridge
◦ OLE DB-to-ODBC bridges: An OLE DB-ODBC bridge consists of an OLE DB Provider which uses the services of an ODBC driver to connect to a target database. This provider translates OLE DB method calls into ODBC function calls. Examples: OpenLink OLEDB-ODBC Bridge, SequeLink OLEDB-ODBC Bridge
◦ ADO.NET-to-ODBC bridges: An ADO.NET-ODBC bridge consists of an ADO.NET Provider which uses the services of an ODBC driver to connect to a target database. Examples: OpenLink ADO.NET-ODBC Bridge, SequeLink ADO.NET-ODBC Bridge
Native Language ⇐⇒ Query Language: Embedded SQL PPD
• The SQL standard defines embedding of SQL in a variety of programming languages such as C, C++, Java, FORTRAN, and PL/1
• A language in which SQL queries are embedded is referred to as a host language, and the SQL structures permitted in the host language comprise embedded SQL
• The basic form of these languages follows that of the System R embedding of SQL into PL/1
• An EXEC SQL statement (or a similar alternative like #sql) is used to identify an embedded SQL request to the pre-processor
EXEC SQL <embedded SQL statement>;
Note: this varies by language:
◦ In some languages, like COBOL, the semicolon is replaced with END-EXEC
◦ In Java, embedding uses #sql { .... };
• Before executing any SQL statements, the program must first connect to the database. This is done using:
EXEC SQL connect to server user user-name using password;
Here, server identifies the server to which a connection is to be established
• Variables of the host language can be used within embedded SQL statements. They are preceded by a colon (:) to distinguish them from SQL variables (for example, :credit_amount)
• Variables used as above must be declared within a DECLARE section, as illustrated below. The syntax for declaring the variables, however, follows the usual host language syntax
EXEC SQL BEGIN DECLARE SECTION;
int credit_amount;
EXEC SQL END DECLARE SECTION;
◦ From within a host language, find the ID and name of students who have completed more than the number of credits stored in variable credit_amount in the host language
◦ Specify the query in SQL as follows:
EXEC SQL
    declare c cursor for
    select ID, name
    from student
    where tot_cred > :credit_amount
END_EXEC
• The variable c (used in the cursor declaration) is used to identify the query
• The fetch statement causes the values of one tuple in the query result to be placed in host language variables.
EXEC SQL fetch c into :si, :sn END_EXEC
Repeated calls to fetch get successive tuples in the query result
• A variable called SQLSTATE in the SQL communication area (SQLCA) gets set to ‘02000’ to indicate no more data is available
• The close statement causes the database system to delete the temporary relation that holds the result of the query.
EXEC SQL close c;
Note: above details vary with language. For example, the Java embedding defines Java iterators to step through result tuples.
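The cursor life cycle described above (bind a host variable, fetch one tuple at a time until no more data is available, then close) has a close analogue in Python's DB-API. The following sketch is not embedded SQL: it assumes the built-in sqlite3 module, a student table with made-up rows, and the end of data is signalled by fetchone() returning None rather than by SQLSTATE ‘02000’.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (ID TEXT, name TEXT, tot_cred INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("00128", "Zhang", 102), ("12345", "Shankar", 32), ("19991", "Brandt", 80)],
)

credit_amount = 50  # host-language variable, bound to the ? placeholder below

c = conn.cursor()   # plays the role of cursor c
c.execute(
    "SELECT ID, name FROM student WHERE tot_cred > ? ORDER BY ID",
    (credit_amount,),
)

results = []
while True:
    row = c.fetchone()   # one tuple per call, like fetch c into :si, :sn
    if row is None:      # no more data available
        break
    si, sn = row
    results.append((si, sn))

c.close()                # like close c
conn.close()
print(results)
```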
• Embedded SQL expressions for database modification (update, insert, and delete)
• Can update tuples fetched by cursor by declaring that the cursor is for update
EXEC SQL
    declare c cursor for
    select *
    from instructor
    where dept_name = 'Music'
    for update
• We then iterate through the tuples by performing fetch operations on the cursor (as illustrated earlier), and after fetching each tuple we execute the following code:
update instructor
set salary = salary + 1000
where current of c
• Here is an example embedded SQL C program from DB2: Embedded SQL for C and C++ (by P. Godfrey, Nov 2002)
• It does not do much, but is instructive
• The app queries a table sailor in schema one.
• User one has granted select privileges to all on table sailor, so the bind step will be legal
• This app takes one argument on the command line, a sailor’s SID. It then finds the sailor SID’s age out of the table ONE.SAILOR and reports it
• Try pre-compiling / compiling it. Connect to database c341f02 for this.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sqlenv.h>
#include <sqlcodes.h>
#include <sys/time.h>
int main (int argc, char *argv[]) { // The PROGRAM
    // Grab the first command argument. This is the SID
    if (argc > 1) {
        sid = atoi(argv[1]);
        printf("SID requested is %d.\n", sid);
    } else { // If there is no argument, bail
        printf("Which SID?\n");
        exit(0);
    }
    EXEC SQL CONNECT TO C3421M;
    CHECK_SQL(0, "Connect failed", EXIT);
    // Find the name and age of sailor SID
    EXEC SQL SELECT SNAME, AGE into :sname, :sage
        FROM ONE.SAILOR
        WHERE sid = :sid;
    CHECK_SQL(0, "The SELECT query failed.", EXIT);
    // Report the age
    printf("Sailor %s's age is %d.\nExecuted Successfully\nBye\n", sname, sage);
errorexit:
    EXEC SQL CONNECT RESET;
}
Embedded SQL: C Example (5) PPD
% sage 44
• The output should be:
SID requested is 44.
Sailor guppy's age is 31.
Executed Successfully
Bye
Embedded SQL: C Example (6) PPD
• The program prompts the user for an order number, retrieves the customer number, salesperson, and status of the order, and displays the retrieved information on the screen
• The statement used to return the data is a singleton SELECT statement; that is, it returns only a single row of data. So, the code example does not declare or use cursors
Source: https://docs.microsoft.com/en-us/sql/odbc/reference/embedded-sql-example
Embedded SQL: Java Example PPD
• The following example SQLJ application, App.sqlj, uses static SQL to retrieve and update data from the EMPLOYEE table of the sample database
• Complete example can be found at: Example: Embedding SQL Statements in your Java™ application
import java.sql.*;
import sqlj.runtime.*;
import sqlj.runtime.ref.*;

#sql iterator App_Cursor1 (String empno, String firstnme) ; // 1
#sql iterator App_Cursor2 (String) ;

class App { // Register Driver
    static { try { Class.forName("com.ibm.db2.jdbc.app.DB2Driver").newInstance(); }
             catch (Exception e) { e.printStackTrace(); }
    }
    public static void main(String argv[]) {
        try { App_Cursor1 cursor1; App_Cursor2 cursor2; String str1 = null, str2 = null; long count1;
            String url = "jdbc:db2:sample"; // URL is jdbc:db2:dbname
1 Declare iterators. This section declares two types of iterators:
◦ App_Cursor1: Declares column data types and names, and returns the values of the columns according to column name (Named binding to columns)
◦ App_Cursor2: Declares column data types, and returns the values of the columns by column position (Positional binding to columns)
// retrieve data from the database
System.out.println("Retrieve some data from the database.");
#sql cursor1 = {SELECT empno, firstnme FROM employee}; // 2

// display the result set. cursor1.next() returns false when there are no more rows
System.out.println("Received results:");
while (cursor1.next()) { // 3
    str1 = cursor1.empno(); str2 = cursor1.firstnme(); // 4
    System.out.println(" empno= " + str1 + " firstname= " + str2 + "");
}
cursor1.close(); // 9

// retrieve number of employee from the database
#sql { SELECT count(*) into :count1 FROM employee }; // 5
if (1 == count1)
    System.out.println("There is 1 row in employee table");
else
    System.out.println("There are " + count1 + " rows in employee table");

// update the database
System.out.println("Update the database.");
#sql { UPDATE employee SET firstnme = 'SHILI' WHERE empno = '000010' };
2 Initialize the iterator. The iterator object cursor1 is initialized using the result of a query. The query stores the result in cursor1.
3 Advance the iterator to the next row. The cursor1.next() method returns a Boolean false if there are no more rows to retrieve.
4 Move the data. The named accessor method empno() returns the value of the column named empno on the current row. The named accessor method firstnme() returns the value of the column named firstnme on the current row.
5 SELECT data into a host variable. The SELECT statement passes the number of rows in the table into the host variable count1.
9 Close the iterators. The close() method releases any resources held by the iterators. You should explicitly close iterators to ensure that system resources are released in a timely fashion.
    System.out.println(" empno= " + str1 + " firstname= " + str2 + "");
}
cursor2.close(); // 9
6 Initialize the iterator. The iterator object cursor2 is initialized using the result of a query. The query stores the result in cursor2.
7 Retrieve the data. The FETCH statement returns the current value of the first column declared in the ByPos cursor from the result table into the host variable str2.
8 Check the success of a FETCH...INTO statement. The endFetch() method returns a Boolean true if the iterator is not positioned on a row, that is, if the last attempt to fetch a row failed. The endFetch() method returns false if the last attempt to fetch a row was successful. DB2 attempts to fetch a row when the next() method is called. A FETCH...INTO statement implicitly calls the next() method.
9 Close the iterators. The close() method releases any resources held by the iterators. You should explicitly close iterators to ensure that system resources are released in a timely fashion.
Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.
Database Management Systems
Module 34: Application Design and Development/4: Python and PostgreSQL
Partha Pratim Das
The following Python modules can be used to work with a PostgreSQL database server:
• psycopg2
• pg8000
• py-postgresql
• PyGreSQL
• ocpgdb
Advantages of psycopg2
• Most popular and stable module to work with PostgreSQL
• Used in most of the Python and Postgres frameworks
• An actively maintained package and supports Python 2.x and 3.x
• Thread-safe and designed for heavily multi-threaded applications.
Steps to access PostgreSQL from Python using Psycopg
a) Create connection
b) Create cursor
c) Execute the query
d) Commit/rollback
• psycopg2.connect(database="mydb", user="myuser", password="mypass", host="127.0.0.1", port="5432")
This API opens a connection to the PostgreSQL database. If the database is opened successfully, it returns a connection object.
• connection.close()
This method closes the database connection.
Important psycopg2 module routines for managing cursor object:
• connection.cursor()
This routine creates a cursor which will be used throughout the program.
• cursor.close()
This method closes the cursor.
• cursor.fetchone()
This method fetches the next row of a query result set, returning a single sequence, or None when no more data is available.
• cursor.fetchmany([size=cursor.arraysize])
This routine fetches the next set of rows of a query result, returning a list. An empty list is returned when no more rows are available. The method tries to fetch as many rows as indicated by the size parameter.
• cursor.fetchall()
This routine fetches all (remaining) rows of a query result, returning a list. An empty list is returned when no rows are available.
• connection.commit()
This method commits the current transaction. If you do not call this method, anything you did since the last call to commit() is not visible to other database connections.
• connection.rollback()
This method rolls back any changes to the database since the last call to commit().
psycopg2.DatabaseError: Exception raised for errors that are related to the PostgreSQL database.
We assume the following for all the programs in this module:
• Database Name: mydb
• Username: myuser
• Password: mypass
• Host Name: localhost or IP address 127.0.0.1
Steps to execute SQL commands
1. Use the psycopg2.connect() method with the required arguments to connect to PostgreSQL. It returns a Connection object if the connection is established successfully.
2. Create a cursor object using the cursor() method of the connection object.
3. The cursor's execute() method runs the SQL command; any result rows are then available through the cursor.
4. Use cursor.fetchall() or fetchone() or fetchmany() to read the query result.
5. Use commit() to make the changes in the database persistent, or use rollback() to revert the database changes.
6. Use the cursor.close() and connection.close() methods to close the cursor and the PostgreSQL connection.
Source: https://pynative.com/python-postgresql-tutorial/
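The six steps can be sketched as follows. psycopg2 follows the Python DB-API 2.0 interface, so this sketch deliberately swaps in the standard-library sqlite3 driver (same connect/cursor/execute/fetch/commit/close calls) so that it runs without a PostgreSQL server. The table employee, the column emp_num, and the sample row are assumptions for illustration. With psycopg2 the connect call would take the arguments listed earlier and the placeholder would be %s rather than ?.

```python
import sqlite3  # stand-in DB-API driver; psycopg2 exposes the same call pattern


def run_steps(conn):
    """Walk through cursor / execute / fetch / commit / close on a DB-API connection."""
    cur = conn.cursor()                                   # step 2: create a cursor
    try:
        # step 3: run SQL commands (hypothetical table and data)
        cur.execute("CREATE TABLE employee (emp_num INTEGER PRIMARY KEY, name TEXT)")
        cur.execute("INSERT INTO employee VALUES (?, ?)", (110, "Asha"))
        conn.commit()                                     # step 5: make changes persistent
        cur.execute("SELECT emp_num, name FROM employee")
        rows = cur.fetchall()                             # step 4: read the query result
    except sqlite3.DatabaseError:
        conn.rollback()                                   # step 5: revert on failure
        raise
    finally:
        cur.close()                                       # step 6: close the cursor
    return rows


conn = sqlite3.connect(":memory:")                        # step 1: open a connection
rows = run_steps(conn)
conn.close()                                              # step 6: close the connection
print(rows)  # [(110, 'Asha')]
```

Passing values as parameters instead of formatting them into the SQL string keeps the same code safe against SQL injection.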
Output: (screenshot not recovered)
Output, if a row already exists with emp num = 110: (screenshot not recovered)
Executing DELETE statement from Python
Output: (screenshot not recovered)
Output, if the row does not exist: (screenshot not recovered)
Python offers several frameworks such as bottle.py, Flask, CherryPy, Pyramid, Django and web2py for web development.
• Python offers many choices for web development
◦ Frameworks such as Django and Pyramid.
◦ Micro-frameworks such as Flask and Bottle.
◦ Advanced content management systems such as Plone and django CMS.
• Python's standard library supports many internet protocols
◦ HTML and XML
◦ JSON
◦ E-mail processing
◦ Support for FTP, IMAP, and other Internet protocols
◦ Easy-to-use socket interface
• The Package Index has more libraries
◦ Requests, a powerful HTTP client library.
◦ Beautiful Soup, an HTML parser that can handle all sorts of HTML.
◦ Feedparser for parsing RSS/Atom feeds.
◦ Paramiko, implementing the SSH2 protocol.
◦ Twisted Python, a framework for asynchronous network programming.
Source: https://www.python.org/about/apps/
Flask Web Application Framework
• Flask is a lightweight WSGI (Web Server Gateway Interface) web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications.
• It began as a simple wrapper around Werkzeug (Werkzeug WSGI toolkit) and Jinja (Jinja template engine) and has since then become one of the most popular Python web application frameworks.
• Flask offers suggestions, but does not enforce any dependencies or project layouts. It is up to the developer to choose the tools and libraries they want to use.
• There are many extensions provided by the community that make adding new functionality easy.
• The route() function of the Flask class is a decorator, which tells the application which URL should call the associated function.
app.route(rule, options)
◦ The rule parameter represents URL binding with the function.
◦ The options is a list of parameters to be forwarded to the underlying Rule object.
• In the above example, '/' URL is bound with hello_world() function. Hence, when the home page of web server is opened in browser, the output of this function will be rendered.
• Finally the run() method of Flask class runs the application on the local development server.
Source: https://www.tutorialspoint.com/flask/flask_application.htm
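A minimal sketch of an application of this shape, assuming Flask is installed. The function name hello_world matches the text; the returned string is an assumption. Instead of calling run(), the sketch exercises the route through Flask's built-in test client so that it finishes without starting a server.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/")              # bind the URL '/' to the function below
def hello_world():
    return "Hello World"     # assumed response body, for illustration only


# Exercise the route in-process; app.run() would instead start the
# local development server and block until it is stopped.
with app.test_client() as client:
    response = client.get("/")
    print(response.status_code, response.get_data(as_text=True))  # 200 Hello World
```

In a real script the last block would typically be replaced by a call to app.run() guarded by the usual main check.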
• Consider the table Candidate (in PostgreSQL) as shown below:
• Code segment in Python:
from flask import Flask, \
    render_template, request
import psycopg2

app = Flask(
    __name__,
    template_folder='templates'
)

#functions to be added here for
#different actions

if __name__ == '__main__':
    # Run the Flask app
    app.run(
        host='127.0.0.1',
        debug=True,
        port=5000
    )
Python: Flask (2)
@app.route("/")
def index():
    # Serve the landing page
    return render_template("index.html")

@app.route("/add")
def add():
    # Serve the form for adding a record
    return render_template("add.html")
[Screenshot: page served at http://127.0.0.1:5000/]
• Source code for add.html (in HTML):
<!DOCTYPE html>
<html>
<head>
    <title>Add Email</title>
</head>
<body>
    <h2>Email Information</h2>
    <form action = "/savedetails" method="post">
    <table>
        <tr><td>CNO</td><td><input type="text" name="cno" required></td></tr>
        <tr><td>Name</td><td><input type="text" name="name" required></td></tr>
        <tr><td>Email</td><td><input type="text" name="email" required></td></tr>
        <tr><td><input type="submit" value="Submit"></td></tr>
    </table>
    </form>
</body>
</html>
[Screenshots: pages served at http://127.0.0.1:5000/add, http://127.0.0.1:5000/savedetails, and http://127.0.0.1:5000/viewall]
Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.
Database Management Systems
Module 35: Application Design and Development/5: Application Development and Mobile
Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
• A lot of effort is required to develop Web application interfaces, especially the rich interaction functionality associated with Web 2.0 applications
• Several approaches to speed up application development
◦ Function library to generate user-interface elements
◦ Drag-and-drop features in an IDE to create user-interface elements
◦ Automatically generate code for user interface from a declarative specification
• Used as part of Rapid Application Development (RAD) tools even before Web
• RAD Software is an agile model that focuses on fast prototyping and quick feedback in app development to ensure speedier delivery and an efficient result
◦ App development has 4 phases: business modeling, data modeling, process modeling, and testing & turnover: Defining the requirements, Prototyping, Receiving feedback and Finalizing the software
◦ With RAD, the time between prototypes and iterations is short, and integration occurs since inception.
. JSP custom tag library for expressing a JSF interface within a JSP page
◦ Ruby on Rails
. Allows easy creation of simple CRUD (create, read, update and delete) interfaces by code generation from database schema or object model
• RAD Platforms and Tools
◦ G Suite
◦ Google App Engine
◦ Microsoft Azure
◦ Amazon Elastic Compute Cloud (EC2)
◦ AWS Elastic Beanstalk
◦ ...
ASP.NET and Visual Studio
• ASP.NET provides a variety of controls that are interpreted at server, and generate HTML code
• Visual Studio provides drag-and-drop development using these controls
◦ For example, menus and list boxes can be associated with DataSet object
◦ Validator controls (constraints) can be added to form input fields
. JavaScript to enforce constraints at client, and separately enforced at server
◦ User actions such as selecting a value from a menu can be associated with actions at server
◦ DataGrid provides convenient way of displaying SQL query results in tabular format
◦ User could have even used
. X'; update instructor set salary = salary + 10000; --
• Prepared statement internally uses:
"select * from instructor where name = 'X\' or \'Y\' = \'Y'"
• Always use prepared statements, with user inputs as parameters
• Is the following prepared statement secure?
◦ conn.prepareStatement("select * from instructor where name = '" + name + "'")
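The difference between splicing user input into the SQL text and passing it as a parameter can be sketched as below. This is an illustration only: it uses Python's standard-library sqlite3 with an in-memory database rather than JDBC, and the two sample rows are assumptions. The malicious input is the one implied by the escaped string shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE instructor (name TEXT, salary INTEGER)")
cur.executemany("INSERT INTO instructor VALUES (?, ?)",
                [("X", 50000), ("Z", 60000)])      # hypothetical sample rows

name = "X' or 'Y' = 'Y"                            # malicious user input

# Unsafe: the input becomes part of the SQL text, so its quote ends the
# string literal early and the trailing "or 'Y' = 'Y'" matches every row.
unsafe_sql = "select * from instructor where name = '" + name + "'"
unsafe_rows = cur.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a parameter and is only compared as a value.
safe_rows = cur.execute(
    "select * from instructor where name = ?", (name,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))            # 2 0
conn.close()
```

This also answers the question on the slide: a prepared statement built by string concatenation is not secure, because the damage is done before the statement is prepared.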
Application Security (2): Password Leakage
• Never store passwords, such as database passwords, in clear text in scripts that may be accessible to users
◦ For example, in files in a directory accessible to a web server
. Normally, web server will execute, but not provide source of script files such as file.jsp or file.php, but source of editor backup files such as file.jsp~, or .file.jsp.swp may be served
• Restrict access to database server from IPs of machines running application servers
◦ Most databases allow restriction of access by source IP address
• Single factor authentication such as passwords too risky for critical applications
◦ guessing of passwords, sniffing of packets if passwords are not encrypted
◦ passwords reused by user across sites
◦ spyware which captures password
• Two-factor authentication
◦ For example, password plus one-time password sent by SMS
◦ For example, password plus one-time password devices
. device generates a new pseudo-random number every minute, and displays to user
. user enters the current number as password
. application server generates same sequence of pseudo-random numbers to check that the number is correct.
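The one-time password device can be sketched as a function of a shared secret and the current one-minute time window, so that the device and the server compute the same number independently. The scheme below is a simplified illustration in the spirit of time-based OTP, not a standard algorithm; the function names, the secret, and the 6-digit length are assumptions, and a real deployment should use a vetted standard and library.

```python
import hashlib
import hmac
import struct


def one_time_password(secret: bytes, unix_time: int,
                      step: int = 60, digits: int = 6) -> str:
    """Derive a short numeric code from a shared secret and the current time window."""
    counter = unix_time // step                    # constant for a whole minute
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    number = int.from_bytes(mac[-4:], "big") % (10 ** digits)
    return str(number).zfill(digits)


def verify(secret: bytes, submitted: str, unix_time: int) -> bool:
    """Server side: recompute the code for the current window and compare."""
    expected = one_time_password(secret, unix_time)
    return hmac.compare_digest(expected, submitted)


secret = b"shared-secret"                          # hypothetical secret known to both sides
code = one_time_password(secret, 1_700_000_000)    # what the device would display
accepted = verify(secret, code, 1_700_000_030)     # server checks 30 s later, same window
print(len(code), accepted)
```

Because the code depends on the time window, a captured code stops being useful once the minute ends.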
• Current SQL standard does not allow fine-grained authorization such as “students can see their own grades, but not others’ grades”
◦ Problem 1: Database has no idea who are application users
◦ Problem 2: SQL authorization is at the level of tables, or columns of tables, but not to specific rows of a table
• One workaround: use views such as
create view studentTakes as
select *
from takes
where takes.ID = syscontext.user_id()
◦ where syscontext.user_id() provides end user identity
. end user identity must be provided to the database by the application
◦ Having multiple such views is cumbersome
. For example, add ID = sys_context.user_id() to all queries on student relation if user is a student
• Applications must log actions to an audit trail, to detect who carried out an update, or accessed some sensitive data
• Audit trails used after-the-fact to
◦ detect security breaches
◦ repair damage caused by security breach
◦ trace who carried out the breach
• Audit trails needed at
◦ Database level, and at
◦ Application level
Mobile Apps
• A type of application software designed to run on a mobile device, such as a smartphone or tablet computer
• Developed specifically for use on small, wireless computing devices, such as smartphones and tablets
• Designed with consideration for the demands and constraints of the devices and also to take advantage of any specialized capabilities
– Form Factor – influences display and navigation
– Limited Memory
– Limited Computing Power
– Limited Power
– Limited Bandwidth
– ···
+ Availability of sensors like accelerometer
+ Availability of touchscreen – Gesture-based Navigation
+ ···
Source: https://www.slideshare.net/hassandar18/architecture-of-mobile-software-applications?from_action=save
Mobile Website vis-à-vis Mobile App PPD
• Typically 3 tier
◦ Presentation
◦ Business
◦ Data
• Data Layer is often split between:
◦ Local Data
◦ Remote Data
• Native Apps: Completely written in the native language of a platform
◦ iOS → Objective-C; Android → Java or C/C++
◦ Platform specific (heavily dependent on OS)
• Web Apps: Run completely inside of a Web browser.
◦ Features interfaces built with HTML or CSS
◦ Powered via Web programming languages → Ruby on Rails, JavaScript, PHP, or Python
◦ Portable across any phone, tablet, or computer
• Hybrid Apps: Combines attributes of both native and Web apps.
◦ Attempts to use redundant, common code that can be used across platforms, and
◦ Tailors required attributes to the native system
Source: https://www.slideshare.net/hassandar18/architecture-of-mobile-software-applications?from_action=save
• Define User Interface
• Select Navigation
• Maintain Flow
Source: https://www.slideshare.net/hassandar18/architecture-of-mobile-software-applications?from_action=save
• Understood the steps in the Rapid Application Development Process
• Exposed to the issues in Application Performance and Application Security
• Learnt the distinctive features of Mobile Apps
Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.
Week-5
Insertion Anomaly
When the insertion of a data record is not possible without adding some additional unrelated data to the record
We cannot add an Instructor in instructor_with_department if the department does not have a building or budget
Deletion Anomaly
When deletion of a data record results in losing some unrelated information that was stored as part of the record that was deleted from a table
If we delete the last Instructor of a Department from instructor_with_department, we lose the building and budget information
Update Anomaly
When a data item is changed, which could involve many records having to be changed, leading to the possibility of some changes being made incorrectly
When the budget changes for a Department having a large number of Instructors in instructor_with_department, the application may miss some of them
First Normal Form (1NF) PPD
A relational schema R is in First Normal Form (1NF) if
the domains of all attributes of R are atomic
the value of each attribute contains only a single value from that domain
Non-atomic values complicate storage and encourage redundant (repeated) storage of data
Example: Set of accounts stored with each customer, and set of owners stored with each account
We assume all relations are in first normal form
Functional Dependencies
On this instance, A → B does NOT hold, but B → A does hold. So we cannot have
tuples like (2, 4), or (3, 5), or (4, 7) added to the current instance.
Functional Dependencies : Armstrong’s Axioms
Given a set of Functional Dependencies F, we can infer new dependencies by Armstrong's Axioms:
Reflexivity: if β ⊆ α, then α → β
Augmentation: if α → β, then γα → γβ
Transitivity: if α → β and β → γ, then α → γ
These axioms can be repeatedly applied to generate new FDs, which are added to F
A new FD obtained by applying the axioms is said to be logically implied by F
The process of generating FDs terminates after a finite number of steps, and the resulting set is called the closure F+ of F. This is the set of all FDs logically implied by F
Clearly, F ⊆ F+
These axioms are
Sound (generate only functional dependencies that actually hold), and
Complete (eventually generate all functional dependencies that hold)
Prove the axioms from definitions of FDs
Prove the soundness and completeness of the axioms
Functional Dependencies : Armstrong’s Axioms: Derived Rules
R = (A, B, C, G, H, I)
F = {A → B, A → C, CG → H, CG → I, B → H}
(AG)+
1 result = AG
2 result = ABCG (A → C and A → B)
3 result = ABCGH (CG → H and CG ⊆ AGBC)
4 result = ABCGHI (CG → I and CG ⊆ AGBCH)
Is AG a candidate key?
1 Is AG a super key?
  1 Does AG → R? == Is (AG)+ ⊇ R
2 Is any subset of AG a superkey?
  1 Does A → R? == Is (A)+ ⊇ R
  2 Does G → R? == Is (G)+ ⊇ R
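The computation of (AG)+ and the candidate-key check can be sketched in Python. The function name closure and the encoding of each FD as a pair of attribute strings are choices made for this illustration.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:                      # repeat until no FD adds anything new
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


R = set("ABCGHI")
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

ag_plus = closure("AG", F)
is_superkey = ag_plus >= R              # does AG determine all of R?
# AG is a candidate key if no proper subset (A or G alone) is a superkey
is_candidate_key = is_superkey and not any(closure(x, F) >= R for x in ("A", "G"))

print("".join(sorted(ag_plus)), is_superkey, is_candidate_key)  # ABCGHI True True
```

Here (A)+ = ABCH and (G)+ = G, so neither proper subset is a superkey and AG is a candidate key.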
BCNF: Boyce-Codd Normal Form
A relation schema R is in BCNF with respect to a set F of FDs if for all FDs in F+ of the form
α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:
α → β is trivial (that is, β ⊆ α)
α is a superkey for R
Example schema not in BCNF:
instr_dept (ID, name, salary, dept_name, building, budget)
because the non-trivial dependency dept_name → building, budget holds on instr_dept, but dept_name is not a superkey
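The BCNF test on this example can be sketched with attribute closure. The dependency ID → name, salary, dept_name is an assumption added here (ID being the key of an instructor); only dept_name → building, budget is stated above. To test whether R itself is in BCNF, checking the dependencies in F is enough.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the FDs that are non-trivial and whose left side is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        superkey = closure(lhs, fds) >= set(schema)
        if not trivial and not superkey:
            bad.append((lhs, rhs))
    return bad


schema = ("ID", "name", "salary", "dept_name", "building", "budget")
F = [
    (("ID",), ("name", "salary", "dept_name")),    # assumed: ID is the key
    (("dept_name",), ("building", "budget")),      # from the example
]

violations = bcnf_violations(schema, F)
print(violations)  # [(('dept_name',), ('building', 'budget'))]
```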
BCNF (2): Decomposition
Let F and G be two sets of functional dependencies.
F and G are equivalent if F+ = G+. That is:
(F+ = G+) ⇔ (F+ ⊇ G and G+ ⊇ F)
Equivalence means that every functional dependency in F can be inferred from G, and every functional dependency in G can be inferred from F
F and G are equivalent only if
F covers G: all functional dependencies of G are logically members of functional dependency set F, that is, F+ ⊇ G.
G covers F: all functional dependencies of F are logically members of functional dependency set G, that is, G+ ⊇ F.
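The cover test never needs F+ in full: F covers G exactly when, for every α → β in G, β is contained in the closure of α under F. A small sketch, where the sets F, G, and H are hypothetical examples chosen for illustration:

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD in g is logically implied by f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """True if f covers g and g covers f."""
    return covers(f, g) and covers(g, f)


F = [("A", "B"), ("B", "C")]
G = [("A", "BC"), ("B", "C")]    # same information, written differently
H = [("A", "B")]                 # loses B -> C

print(equivalent(F, G), equivalent(F, H))  # True False
```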
Canonical Cover
A Canonical Cover for F is a set of dependencies Fc such that ALL the following properties are satisfied:
F+ = Fc+. Or,
F logically implies all dependencies in Fc
Fc logically implies all dependencies in F
No functional dependency in Fc contains an extraneous attribute
Each left side of a functional dependency in Fc is unique. That is, there are no two dependencies α1 → β1 and α2 → β2 in Fc such that α1 = α2
Intuitively, a Canonical cover of F is a minimal set of FDs
Equivalent to F
Having no redundant FDs
No redundant parts of FDs
Minimal / Irreducible Set of Functional Dependencies
Canonical Cover : Example
For the case of R = (R1, R2), we require that for all possible relations r on schema R
r = πR1(r) ⋈ πR2(r)
A decomposition of R into R1 and R2 is lossless join if at least one of the following dependencies is in F+:
R1 ∩ R2 → R1
R1 ∩ R2 → R2
The above functional dependencies are a sufficient condition for lossless join decomposition; the dependencies are a necessary condition only if all constraints are functional dependencies
A decomposition is lossless if it satisfies all of the following conditions:
R1 ∪ R2 = R
R1 ∩ R2 ≠ ∅ and
R1 ∩ R2 → R1 or R1 ∩ R2 → R2
Lossless Join Decomposition : Example
R = (A, B, C)
F = {A → B, B → C}
Can be decomposed in two different ways
R1 = (A, B), R2 = (B, C)
Lossless-join decomposition:
R1 ∩ R2 = {B} and B → BC
Dependency preserving
R1 = (A, B), R2 = (A, C)
Lossless-join decomposition:
R1 ∩ R2 = {A} and A → AB
Not dependency preserving
(cannot check B → C without computing R1 ⋈ R2)
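The lossless-join test for a two-way decomposition reduces to one closure computation on the common attributes. The sketch below checks both decompositions from the example and adds a third, (A, C) and (B, C), as a hypothetical counterexample; the function names are choices made for this illustration.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if (R1 ∩ R2) → R1 or (R1 ∩ R2) → R2 follows from `fds`."""
    common_plus = closure(set(r1) & set(r2), fds)
    return common_plus >= set(r1) or common_plus >= set(r2)


F = [("A", "B"), ("B", "C")]

print(is_lossless("AB", "BC", F),   # common attribute B, and B -> BC
      is_lossless("AB", "AC", F),   # common attribute A, and A -> AB
      is_lossless("AC", "BC", F))   # common attribute C determines nothing else
# True True False
```

Since all constraints here are functional dependencies, the condition is also necessary, so the third decomposition is lossy.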
Dependency Preservation
A decomposition is dependency preserving if
(F1 ∪ F2 ∪ · · · ∪ Fn)+ = F+
If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive
Let R be the original relational schema having FD set F. Let R1 and R2, having FD sets F1 and F2 respectively, be the decomposed sub-relations of R. The decomposition of R is said to be dependency preserving if
F1 ∪ F2 ≡ F {Decomposition Preserving Dependency}
If F1 ∪ F2 ⊂ F {Decomposition NOT Preserving Dependency} and
F1 ∪ F2 ⊃ F {this is not possible}
Dependency Preservation : Example
R(A, B, C, D)
F = {A → B, B → C, C → D, D → A}
Decomposition: R1(A, B) R2(B, C) R3(C, D)
A → B is preserved on table R1
B → C is preserved on table R2
C → D is preserved on table R3
We have to check whether the one remaining FD: D → A is preserved or not.
R1: F1 = {A → AB, B → BA}
R2: F2 = {B → BC, C → CB}
R3: F3 = {C → CD, D → DC}
F′ = F1 ∪ F2 ∪ F3.
Checking for: D → A in F′+
D → C (from R3), C → B (from R2), B → A (from R1) ∴ D → A (By Transitivity)
Hence all dependencies are preserved.
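The same check can be done mechanically without writing out F1, F2, F3. The sketch below uses the standard textbook procedure, which is not shown on the slide: start from the left side of the FD, repeatedly add whatever each Ri can derive from the part of the current result that lies inside Ri, and see whether the right side is reached. Function names are choices made for this illustration.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves(fd, decomposition, fds):
    """True if the FD can be checked using only the relations in the decomposition."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # what Ri alone lets us derive from the attributes it shares with result
            t = closure(result & set(ri), fds) & set(ri)
            if not t <= result:
                result |= t
                changed = True
    return set(rhs) <= result


F = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
checks = [preserves(fd, ["AB", "BC", "CD"], F) for fd in F]
print(checks)  # [True, True, True, True]

# Earlier example: R1 = (A, B), R2 = (A, C) does not preserve B -> C
lost = preserves(("B", "C"), ["AB", "AC"], [("A", "B"), ("B", "C")])
print(lost)    # False
```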
Database Management Systems
Summary : Week-6
Normalization or Schema Refinement PPD
Normalization or Schema Refinement is a technique of organizing the data in the database
A systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics
Insertion Anomaly
Update Anomaly
Deletion Anomaly
Most common technique for the Schema Refinement is decomposition.
Goal of Normalization: Eliminate Redundancy
Redundancy refers to repetition of same data or duplicate copies of same data stored in different locations
Normalization is used for mainly two purposes:
Eliminating redundant (useless) data
Ensuring data dependencies make sense, that is, data is logically stored
Normalization and Normal Forms PPD
A normal form specifies a set of conditions that the relational schema must satisfy in terms of its constraints – they offer varied levels of guarantee for the design
Normalization rules are divided into various normal forms. Most common normal forms are:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Informally, a relational database relation is often described as "normalized" if it meets third normal form. Most 3NF relations are free of insertion, update, and deletion anomalies
1NF: First Normal Form PPD
A relation is in First Normal Form if and only if all underlying domains contain atomic values only (that is, it has no multivalued attributes (MVA))
STUDENT(Sid, Sname, Cname)
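As an illustration of the 1NF condition, the sketch below takes a STUDENT relation whose Cname attribute holds several course names per student and flattens it into atomic rows. The sample rows and the function name to_first_normal_form are hypothetical, chosen only for this example.

```python
def to_first_normal_form(rows):
    """Expand each multivalued Cname list into one row per atomic value."""
    return [(sid, sname, cname)
            for sid, sname, cnames in rows
            for cname in cnames]


# Hypothetical data: Cname is multivalued, so this relation is not in 1NF
unnormalized = [
    (1, "Asha", ["DBMS", "OS"]),
    (2, "Ravi", ["DBMS"]),
]

# Each resulting tuple carries exactly one course name
normalized = to_first_normal_form(unnormalized)
```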
Partial Dependency:
Let R be a relational schema and X, Y, A be attribute sets over R where X: any candidate key, Y: proper subset of a candidate key, and A: non-prime attribute
If Y → A exists in R, then R is not in 2NF.
Let R be the relational schema.
[E. F. Codd, 1971] R is in 3NF only if:
R should be in 2NF
R should not contain transitive dependencies (OR, every non-prime attribute of R is non-transitively dependent on every key of R)
[Carlo Zaniolo, 1982] Alternately, R is in 3NF iff for each of its functional dependencies X → A, at least one of the following conditions holds:
X contains A (that is, A is a subset of X, meaning X → A is a trivial functional dependency), or
X is a superkey, or
Every element of A − X, the set difference between A and X, is a prime attribute (i.e., each attribute in A − X is contained in some candidate key)
[Simple Statement] A relational schema R is in 3NF if for every FD X → A associated with R either
A ⊆ X (that is, the FD is trivial) or
X is a superkey of R or
A is part of some candidate key (not just superkey!)
A relation in 3NF is naturally in 2NF
Module 27 Recap
Decomposition to 3NF
3NF Decomposition: Motivation
Optimization: Need to check only FDs in F, need not check all FDs in F+.
Use attribute closure to check, for each dependency α → β, if α is a superkey.
If α is not a superkey, we have to verify if each attribute in β is contained in a candidate key of R
This test is rather more expensive, since it involves finding candidate keys
Testing for 3NF has been shown to be NP-hard
Decomposition into 3NF can be done in polynomial time
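The test described above can be sketched in a few lines. This is a brute-force illustration, assuming FDs are (lhs, rhs) pairs of Python sets; the helper names are made up for this example, and the candidate-key search enumerates attribute subsets, which is exactly the expensive step the slide warns about. The sample schema reuses the cust_banker_branch attributes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema (exponential)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if closure(attrs, fds) == schema and not any(k <= attrs for k in keys):
                keys.append(attrs)
    return keys


def is_3nf(schema, fds):
    """Check each FD in F: trivial, or lhs is a superkey, or rhs - lhs is prime."""
    prime = set().union(*candidate_keys(schema, fds))
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue                      # trivial FD
        if closure(lhs, fds) == schema:
            continue                      # lhs is a superkey
        if not (rhs - lhs) <= prime:
            return False                  # a non-prime attribute depends on a non-key
    return True


schema = {"customer_id", "employee_id", "branch_name", "type"}
fds = [
    ({"customer_id", "employee_id"}, {"branch_name", "type"}),
    ({"employee_id"}, {"branch_name"}),
    ({"customer_id", "branch_name"}, {"employee_id"}),
]
schema_is_3nf = is_3nf(schema, fds)
```

For this schema the check succeeds because branch_name, the right-hand side of the only FD whose left-hand side is not a superkey, belongs to the candidate key {customer_id, branch_name}.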
3NF Decomposition : Algorithm PPD
Relation schema:
cust_banker_branch = (customer_id, employee_id, branch_name, type)
The functional dependencies for this relation schema are:
1 customer_id, employee_id → branch_name, type
2 employee_id → branch_name
3 customer_id, branch_name → employee_id
We first compute a canonical cover
branch_name is extraneous in the RHS of the 1st dependency
No other attribute is extraneous, so we get Fc =
customer_id, employee_id → type
employee_id → branch_name
customer_id, branch_name → employee_id
3NF Decomposition : Example
A relation schema R is in BCNF with respect to a set F of FDs if for all FDs in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:
α → β is trivial (that is, β ⊆ α)
α is a superkey for R
BCNF Decomposition : Algorithm PPD
1 For all dependencies A → B in F+, check if A is a superkey
By using attribute closure
2 If not, then
Choose a dependency in F+ that breaks the BCNF rules, say A → B
Create R1 = AB
Create R2 = (R − (B − A))
Note that: R1 ∩ R2 = A and A → AB (= R1), so this is lossless decomposition
3 Repeat for R1, and R2
By defining F1+ to be all dependencies in F that contain only attributes in R1
Similarly F2+
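The steps above can be sketched as a short recursive routine. This is one possible reading of the algorithm, not the slides' own code: it looks for a violating A by enumerating attribute subsets of the current schema (exponential, fine for small examples), takes B to be everything A determines inside that schema, and splits into R1 = AB and R2 = R − B. The names find_violation and bcnf_decompose are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(schema, fds):
    """Return (A, B) with A -> B non-trivial on schema and A not a superkey, else None."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            lhs = set(combo)
            rhs = (closure(lhs, fds) & schema) - lhs
            if rhs and lhs | rhs != schema:
                return lhs, rhs
    return None


def bcnf_decompose(schema, fds):
    """Split schema on a violating FD until every piece is in BCNF."""
    violation = find_violation(schema, fds)
    if violation is None:
        return [set(schema)]
    lhs, rhs = violation
    r1 = lhs | rhs            # R1 = AB
    r2 = schema - rhs         # R2 = R - (B - A); here B is already disjoint from A
    return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)


schema = {"customer_id", "employee_id", "branch_name", "type"}
fds = [
    ({"customer_id", "employee_id"}, {"branch_name", "type"}),
    ({"employee_id"}, {"branch_name"}),
    ({"customer_id", "branch_name"}, {"employee_id"}),
]
pieces = bcnf_decompose(schema, fds)
```

On this schema the violating dependency is employee_id → branch_name, so the result is (employee_id, branch_name) and (customer_id, employee_id, type). Note that the dependency customer_id, branch_name → employee_id can no longer be checked on a single table, which is the usual trade-off of BCNF.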
BCNF Decomposition (4): Testing Dependency Preservation:
Using Closure Set of FD
Consider the example given below. We will apply both the algorithms to check dependency preservation and will discuss the results.
R (A, B, C, D)
F = {A → B, B → C, C → D, D → A}
Decomposition: R1(A, B) R2(B, C) R3(C, D)
A → B is preserved on table R1
B → C is preserved on table R2
C → D is preserved on table R3
We have to check whether the one remaining FD, D → A, is preserved or not.
R1: F1 = {A → AB, B → BA}
R2: F2 = {B → BC, C → CB}
R3: F3 = {C → CD, D → DC}
F′ = F1 ∪ F2 ∪ F3
Checking for D → A in F′+:
D → C (from R3), C → B (from R2), B → A (from R1), so D → A (by transitivity)
Hence all dependencies are preserved.
MVD: Definition PPD
Let R be a relation schema and let α ⊆ R and β ⊆ R. The multivalued dependency
α ↠ β
holds on R if in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], there exist tuples t3 and t4 in r such that:
t1[α] = t2[α] = t3[α] = t4[α]
t3[β] = t1[β]
t3[R − β] = t2[R − β]
t4[β] = t2[β]
t4[R − β] = t1[R − β]
A relation schema R is in 4NF with respect to a set D of functional and multivalued dependencies if for all multivalued dependencies in D+ of the form α ↠ β, where α ⊆ R and β ⊆ R, at least one of the following holds:
α ↠ β is trivial (that is, β ⊆ α or α ∪ β = R)
α is a superkey for schema R
If a relation is in 4NF, then it is in BCNF
Database Management Systems
Summary : Week-7
March 7, 2022
Module 31 Recap
Architecture Classification
The design of a DBMS depends on its architecture. It can be
centralized
decentralized
hierarchical
The architecture of a DBMS can be seen as either single tier or multi-tier:
1-tier architecture
2-tier architecture
3-tier architecture
n-tier architecture
Module 32 Recap
Web Fundamentals
The World Wide Web
Hypertext Markup Language (HTML)
Uniform Resource Locators (URLs)
Uniform Resource Identifier (URI)
Uniform Resource Locator (URL)
Uniform Resource Name (URN)
Hypertext Transfer Protocol (HTTP)
HTTP and Sessions
Sessions and Cookies
Web Browser
Web Servers
Web Services - Representational State Transfer (REST), XML, JavaScript Object Notation (JSON), Big Web Services
Module 32 Recap (Cont.)
Client side scripting - scripts are first downloaded at the client-end and then interpreted and executed by the browser
JavaScript
Server side scripting - is responsible for carrying out a task at the server-end and then sending the result to the client-end.
Servlets
Java Server Pages (JSP)
PHP
Module 33 Recap
Connectionist
Open Database Connectivity (ODBC)
Java Database Connectivity (JDBC)
JDBC example
Connectionist Bridge Configurations
ODBC-to-JDBC bridges, JDBC-to-ODBC bridges, OLE DB-to-ODBC bridges, ADO.NET-to-ODBC
bridges
Embedded SQL
Steps to access PostgreSQL from Python using psycopg2
1 Create connection
2 Create cursor
3 Execute the query
4 Commit/rollback
5 Close the cursor
6 Close the connection
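The six steps can be sketched as follows. To keep the example runnable without a PostgreSQL server, this sketch substitutes Python's built-in sqlite3 module for psycopg2; both follow the Python DB-API, so the order of calls is the same. With psycopg2 the connection would instead come from psycopg2.connect(...) with the server's credentials, and query placeholders would be written %s rather than ?. The table and the row inserted here are hypothetical.

```python
import sqlite3


def run_six_steps():
    conn = sqlite3.connect(":memory:")        # 1. create connection
    cur = conn.cursor()                       # 2. create cursor
    try:
        # 3. execute the queries
        cur.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT)")
        cur.execute("INSERT INTO instructor (name) VALUES (?)", ("Einstein",))
        conn.commit()                         # 4. commit on success
        cur.execute("SELECT name FROM instructor")
        rows = cur.fetchall()
    except sqlite3.Error:
        conn.rollback()                       # 4. roll back on failure
        raise
    finally:
        cur.close()                           # 5. close the cursor
        conn.close()                          # 6. close the connection
    return rows


rows = run_six_steps()
```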
Rapid Application Development - RAD is an agile model that focuses on fast prototyping and quick feedback in app development to ensure speedier delivery and an efficient result
Several approaches to speed up application development
Application Performance
Application Security
SQL Injection: e.g. select * from instructor where name = 'X' or 'Y' = 'Y'
1. Password Leakage 2. Authentication 3. Application-Level Authorization 4. Audit Trails
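The SQL injection example above arises when user input is pasted directly into the query text. The sketch below reproduces it on a small in-memory table and contrasts it with a parameterized query. It uses Python's built-in sqlite3 module as a stand-in database; the table contents and function names are hypothetical.

```python
import sqlite3


def find_instructor_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL text
    query = "select * from instructor where name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_instructor_safe(conn, name):
    # Parameterized: the input is passed as a value, never parsed as SQL
    return conn.execute("select * from instructor where name = ?", (name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (id integer, name text)")
conn.executemany("insert into instructor values (?, ?)", [(1, "Katz"), (2, "Kim")])

malicious = "X' or 'Y' = 'Y"      # turns the predicate into: name = 'X' or 'Y' = 'Y'
leaked = find_instructor_unsafe(conn, malicious)
blocked = find_instructor_safe(conn, malicious)
```

The unsafe version returns every row because 'Y' = 'Y' is always true, while the parameterized version looks for an instructor literally named X' or 'Y' = 'Y and finds none.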
Module 35 Recap (Cont.)
Design Issues
Week-8
Linear data structures: A linear data structure has data elements arranged in a linear or sequential manner such that each member element is connected to its previous and next element.
Array: The data elements are stored at contiguous locations in memory.
Linked List: The data elements are not required to be stored at contiguous locations in memory. Rather each element stores a link (a pointer or a reference) to the location of the next element.
Queue: It is a FIFO (First In First Out) data structure.
Stack: It is a LIFO (Last In First Out) data structure.
Search
Linear
Binary
Module 37 Recap (Cont..)
From the study of Linear data structures, we can make the following summary observations:
All of them have the space complexity O(n), which is optimal. However, the actual used space may be lower in an array, while a linked list has an overhead of 100% (double)
All of them have complexities that are identical for Worst as well as Average case
All of them offer satisfactory complexity for some operations while being unsatisfactory on the others
Non-Linear data structures are those data structures in which data items are not arranged in a sequence and each element may have multiple paths to connect to other elements.
Graph: Undirected or Directed, Unweighted or Weighted, and variants
Tree: Rooted or Unrooted, Binary or n-ary, Balanced or Unbalanced, and variants
Hash Table: Array with lists (coalesced chains) and one or more hash functions
Skip List: Multi-layered interconnected linked lists
Binary Search Tree: A tree in which every node satisfies the following:
The value of each node in the left sub-tree is less than the value of its root
The value of each node in the right sub-tree is greater than the value of its root
Binary Search Tree
Practice Question: Construct the binary search tree for the following sequence:
1 15,10,20,8,12,27,23,2,6,11,14,17
2 15,10,6,20,27,2,23,17,8,14,11,12
3 15,23,6,20,12,2,10,17,8,14,11,27
For each BST, find out the number of leaf nodes, height of BST and number of elements at level 2.
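A small helper like the one below can be used to check answers to the practice question. It is a sketch under stated assumptions: the root is taken to be at level 0, height is counted in edges (a single node has height 0), and keys are inserted in the order given. If your course counts levels or height from 1, adjust accordingly. The short sequence used here is a made-up example, not one of the three above.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the BST rooted at root and return the (possibly new) root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def build_bst(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def count_leaves(node):
    if node is None:
        return 0
    if node.left is None and node.right is None:
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


def height(node):
    """Height in edges; an empty tree has height -1 (assumed convention)."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def nodes_at_level(node, level):
    """Number of nodes at the given level, with the root at level 0 (assumed)."""
    if node is None:
        return 0
    if level == 0:
        return 1
    return nodes_at_level(node.left, level - 1) + nodes_at_level(node.right, level - 1)


root = build_bst([15, 10, 20, 8, 12])
```

For this five-key tree, 15 is the root, 10 and 20 are its children, and 8 and 12 hang under 10.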
Comparison of Linear and Non-Linear Data Structures
Linear Data Structure | Non-Linear Data Structure
• Data elements are arranged in a linear order where each and every element is attached to its previous and next adjacent | • Data elements are arranged in hierarchical or networked manner
• Single level is involved | • Multiple levels are involved
• Implementation is easy in comparison to non-linear data structure | • Implementation is complex in comparison to linear data structure
• Data elements can be traversed in one way only | • Data elements can be traversed in multiple ways. Various traversals may be defined to linearize the data: Depth-First, Breadth-First, Inorder, Preorder, Postorder, etc.
• Examples: array, stack, queue, linked list, and their variants | • Examples: trees, graphs, skip list, hash map, and several variants
Module 39 Recap
Physical Storage Media
Magnetic Disks
(Go through the slides for the theoretical part and refer to practice and graded assignment questions)
Magnetic Tape
Cloud Storage
Cloud Storage vs. Traditional Storage
Other Storage
Optical Disks
Flash Drives
Secure Digital Cards (SD cards)
Flash Storage
Solid-State Drives (SSD)
Future of Storage
DNA Digital Storage
Quantum Memory
Module 40 Recap
File Organization
Organization of Records in Files
Heap: A record can be placed anywhere in the file where there is space
Sequential: Store records in sequential order, based on the value of the search key of each record.
Suitable for applications that require sequential processing of the entire file
The records in the file are ordered by a search-key.
It will work more efficiently when working on search-key (primary key) of the table.
Hashing: A hash function is computed on some attribute of each record; the result specifies in which block of the file the record should be placed
In a multitable clustering file organization records of several different relations can be
stored in the same file.
good for queries involving department ⋈ instructor, and for queries involving one single department and its instructors
bad for queries involving only department
results in variable size records
Can add pointer chains to link records of a particular relation
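To make the hashing organization listed above concrete, here is a toy sketch of how a hash of one attribute selects the block for each record. The hash function (sum of the key's bytes), the number of blocks, and the sample records are all assumptions for illustration; a real DBMS uses its own hash functions and handles bucket overflow.

```python
def block_for(key, num_blocks):
    """Map a search-key value to a block number (toy hash: sum of bytes)."""
    return sum(str(key).encode("utf-8")) % num_blocks


def place_records(records, key_of, num_blocks):
    """Distribute records over blocks according to the hash of their key."""
    blocks = [[] for _ in range(num_blocks)]
    for record in records:
        blocks[block_for(key_of(record), num_blocks)].append(record)
    return blocks


# Hypothetical records: (dept_name, instructor name), hashed on dept_name
records = [("Comp. Sci.", "Taylor"), ("Physics", "Watson"), ("Music", "Packard")]
blocks = place_records(records, key_of=lambda r: r[0], num_blocks=4)
```

A lookup by dept_name then only needs to read the single block returned by block_for.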
Module 40 Cont..
Data Dictionary (also, System Catalog) stores metadata (data about data) such as:
Information about relations
User and accounting information, including passwords
Statistical and descriptive data
Physical file organization information
Information about indices
Buffer: portion of main memory available to store copies of disk blocks
Buffer Manager: subsystem responsible for allocating buffer space in main memory
Buffer Replacement Policies:
Least recently used (LRU strategy)
Most recently used (MRU strategy)
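The two replacement policies can be sketched with a small buffer pool. This is a simplified illustration, assuming a fixed number of frames and a caller-supplied read_block function that stands in for a disk read; the class name BufferPool and its interface are hypothetical, and pinning and write-back of dirty blocks are left out.

```python
from collections import OrderedDict


class BufferPool:
    def __init__(self, capacity, policy="LRU"):
        self.capacity = capacity
        self.policy = policy
        self.frames = OrderedDict()   # block_id -> contents, least recently used first

    def get(self, block_id, read_block):
        if block_id in self.frames:
            self.frames.move_to_end(block_id)    # mark as most recently used
            return self.frames[block_id]
        if len(self.frames) >= self.capacity:
            # LRU evicts from the front (oldest use), MRU from the back (newest use)
            self.frames.popitem(last=(self.policy == "MRU"))
        self.frames[block_id] = read_block(block_id)
        return self.frames[block_id]


def fake_read(block_id):
    """Stand-in for reading a block from disk."""
    return "block-%d" % block_id


lru = BufferPool(capacity=2, policy="LRU")
mru = BufferPool(capacity=2, policy="MRU")
for pool in (lru, mru):
    for block_id in (1, 2, 1, 3):
        pool.get(block_id, fake_read)
```

After the access sequence 1, 2, 1, 3 the LRU pool has evicted block 2, while the MRU pool has evicted block 1.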
Database Management Systems
Module 36: Algorithms and Data Structures/1: Algorithms and Complexity Analysis
Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
• Glimpsed at architecture for a few sample applications
• Familiarized with the fundamental notions and technologies of the Web
• Learnt about Scripting and the notions of Servlets
• Learnt to use SQL from a programming language
• Learnt to build Python Web Applications with PostgreSQL using psycopg2 and Flask
• Understood the steps in the Rapid Application Development Process
• Introduce Asymptotic notation for representation of complexity
• Consider complexity of common algorithms
• Complexity Chart
• Algorithm
◦ An algorithm is a finite sequence of well-defined, computer-implementable (optional) instructions, typically to solve a class of specific problems or to perform a computation.
◦ Algorithms are always unambiguous and are used as specifications for performing calculations, data processing, automated reasoning, and other tasks.
◦ An algorithm must terminate
• Program
◦ A computer program is a collection of instructions that can be executed by a computer to perform a specific task
◦ A computer program is usually written by a computer programmer in a programming language.
◦ A program implements an algorithm
◦ A program may or may not terminate. For example, an OS
Analysis of Algorithms
• Why?
◦ Set the motivation for algorithm analysis:
◦ Why analyze?
• What?
◦ Identify what all need to be analyzed:
◦ What to analyze?
• How?
◦ Learn the techniques for analysis:
◦ How to analyze?
• Where?
◦ Understand the scenarios for application:
◦ Where to analyze?
• When?
◦ Realize your position for seeking the analysis:
◦ When to analyze?
Why analyze?
Practical reasons:
• Resources are scarce
• Greed to do more with less
• Avoid performance bugs
Core Issues:
• Predict performance
◦ How much time does binary search take?
• Compare algorithms
◦ How quick is Quicksort?
• Time
◦ Most common analysis factor
◦ Representative of various related analysis factors like Power, Bandwidth, Processors
◦ Supported by Complexity Classes
• Space
◦ Widely explored
◦ Important for hand-held devices
◦ Supported by Complexity Classes
• Generating Functions
• Master Theorem
Counting Models
• Core Idea: Total running time = Sum of cost × frequency for all operations
◦ Need to analyze program to determine set of operations
◦ Cost depends on machine, compiler
◦ Frequency depends on algorithm, input data
• Machine Model: Random Access Machine (RAM) Computing Model
◦ Input data & size
◦ Operations
◦ Intermediate Stages
◦ Output data & size
• Factorial (Recursive)
int fact(int n) {
    if (0 != n) return n*fact(n-1);
    return 1;
}
◦ Time T(n) = n (multiplications, one for each call with n ≠ 0)
◦ Space S(n) = n + 1 (n's in recursive calls)
• Factorial (Iterative)
int fact(int n) {
    int t = 1;
    for(; n > 0; --n)
        t = t * n;
    return t;
}
◦ Time T(n) = n (multiplications)
◦ Space S(n) = 2 (n, t)
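The counting model can be tried out directly by instrumenting both versions. The sketch below is a Python transcription of the two C functions above with an added counter for multiplications; the counts dictionary is an illustrative device, not part of the original code.

```python
def fact_recursive(n, counts):
    if n != 0:
        counts["mul"] += 1            # one multiplication per call with n != 0
        return n * fact_recursive(n - 1, counts)
    return 1


def fact_iterative(n, counts):
    t = 1
    while n > 0:
        counts["mul"] += 1            # one multiplication per loop iteration
        t = t * n
        n -= 1
    return t


rec_counts = {"mul": 0}
it_counts = {"mul": 0}
rec_value = fact_recursive(5, rec_counts)
it_value = fact_iterative(5, it_counts)
```

Comparing rec_counts and it_counts for a few values of n shows how the frequency of the multiplication grows with the input, which is the quantity T(n) measures.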
Asymptotic Analysis
• Core Idea: Cannot compare actual times; hence compare Growth or how time increases with input size
◦ Function Approximation (tilde (~) notation)
◦ Common Growth Functions
◦ Big-Oh (O(.)), Big-Omega (Ω(.)), and Big-Theta (Θ(.)) Notations
◦ Solve recurrence with Growth Functions
For a given function g(n), we denote by O(g(n)) the set of functions:
$$O(g(n)) = \{\, f(n) : \text{there exist positive constants } c \text{ and } n_0 \text{ such that } 0 \le f(n) \le c\,g(n) \text{ for all } n > n_0 \,\}$$
• We use O-notation to give an upper bound on a function, to within a constant factor.
• When we say that the running time of A is O(n^2), we mean that there is a function f(n) that is O(n^2) such that for any value of n, no matter what particular input of size n is chosen, the running time on that input is bounded from above by the value f(n).
• Equivalently, we mean that the worst-case running time is O(n^2).
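As a short worked example of the definition (the function f(n) = 3n^2 + 5n is chosen here only for illustration), the constants c and n0 can be exhibited explicitly:

```latex
\begin{align*}
f(n) &= 3n^2 + 5n, \qquad g(n) = n^2 \\
3n^2 + 5n \le 4n^2 &\iff 5n \le n^2 \iff n \ge 5 \quad (\text{for } n > 0) \\
\Rightarrow\ 0 \le f(n) &\le 4\,g(n) \ \text{for all } n > 5,
\ \text{so } f(n) \in O(n^2) \text{ with } c = 4,\ n_0 = 5.
\end{align*}
```

Other choices work too, for example c = 8 and n0 = 1; the definition only asks that some pair of constants exists.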
Algorithmic Situation
• Core Idea: Identify data configurations or scenarios for analysis
◦ Best Case
▷ Minimum running time on an input
◦ Worst Case
▷ Running time guarantee for any input of size n
◦ Average Case
▷ Expected running time for a random input of size n
◦ Probabilistic Case
▷ Expected running time of a randomized algorithm
◦ Amortized Case
▷ Worst case running time for any sequence of n operations
Complexity Chart
Module Summary
• Need for analyzing the running-time and space requirements of a program
• Asymptotic growth rate or order of the complexity of different algorithms
• Worst-case, average-case and best-case analysis
Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with "PPD".
Database Management Systems
Module 37: Algorithms and Data Structures/2: Data Structures
Partha Pratim Das
• Need for analyzing the running-time and space requirements of a program
• Asymptotic growth rate or order of the complexity of different algorithms
• Worst-case, average-case and best-case analysis
• Data structure: A data structure specifies the way of organizing and storing in-memory data that enables efficient access and modification of the data.
◦ Linear Data Structures
◦ Non-linear Data Structures
• Most data structures have a container for the data and typical operations that they need to perform
• For applications relating to data management, the key operations are:
◦ Create
◦ Insert
◦ Delete
◦ Find / Search
◦ Close
• Efficiency is measured in terms of time and space taken for these operations
• A linear data structure has data elements arranged in a linear or sequential manner such that each member element is connected to its previous and next element.
• Since data elements are sequentially connected, each element is traversable through a single run.
• Examples of linear data structures are Array, Linked List, Queue, Stack, etc.
Different examples of linear data structure:
• Array: The data elements are stored at contiguous locations in memory.
• Linked List: The data elements are not required to be stored at contiguous locations in memory. Rather each element stores a link (a pointer or a reference) to the location of the next element.
• Queue: It is a FIFO (First In First Out) data structure. The element that has been inserted first in the queue would be removed first. Thus, insertion and removal of the elements take place in the same order.
• Stack: It is a LIFO (Last In First Out) data structure. The element that has been inserted last in the stack would be removed first. Thus, insertion and removal of the elements take place in the reverse order.
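The FIFO and LIFO orders described above can be seen in a few lines. This sketch uses collections.deque for the queue and a plain Python list for the stack, which is one common way to model them and not the only one.

```python
from collections import deque

items = ["a", "b", "c"]

# Queue: insert at the back, remove from the front (FIFO)
queue = deque()
for item in items:
    queue.append(item)
queue_order = [queue.popleft() for _ in range(len(items))]

# Stack: insert and remove at the same end (LIFO)
stack = []
for item in items:
    stack.append(item)
stack_order = [stack.pop() for _ in range(len(items))]
```

The queue returns the items in insertion order, the stack in reverse insertion order.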
• Simple access using indices. For example, if the array name is arr, we can access the element at position 5 as arr[5].
• Array allows random access using its index which is fast (cost of O(1)). Useful for operations like sorting, searching.
• Have fixed sizes, not flexible. Since we do not know in advance the number of elements to be stored at runtime, if we create it too large then it can be a waste of memory; if we create it too small then some elements may not be accommodated in the array.
◦ For example, suppose we create an array to store 8 elements. However, during execution of the program only 5 elements are available, which results in wastage of memory space.
• Insertion and removal of elements from an array are costlier since the memory locations have to be consecutive.
◦ Insertion or removal of an element from the end of an array is easy.
▷ Insert at end:
▷ Remove from end:
◦ Inserting or removing an element at any arbitrary position is costly (cost is O(n))
▷ Insert at any arbitrary position:
▷ Remove from any arbitrary position:
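The cost difference between the end and an arbitrary position comes from the elements that have to be shifted. The sketch below models a fixed-capacity array with a Python list padded with None and counts the shifts; the function names insert_at and remove_at and the sample values are illustrative.

```python
def insert_at(arr, size, pos, value):
    """Insert value at index pos; arr has fixed capacity and holds `size` elements.
    Returns (new_size, number_of_shifts)."""
    if size >= len(arr):
        raise OverflowError("array is full")
    if not 0 <= pos <= size:
        raise IndexError("position out of range")
    shifts = 0
    for i in range(size, pos, -1):    # move elements one slot to the right
        arr[i] = arr[i - 1]
        shifts += 1
    arr[pos] = value
    return size + 1, shifts


def remove_at(arr, size, pos):
    """Remove the element at index pos. Returns (new_size, number_of_shifts)."""
    if not 0 <= pos < size:
        raise IndexError("position out of range")
    shifts = 0
    for i in range(pos, size - 1):    # move elements one slot to the left
        arr[i] = arr[i + 1]
        shifts += 1
    arr[size - 1] = None
    return size - 1, shifts


arr = [10, 20, 30, 40, 50, None, None, None]   # capacity 8, 5 elements in use
size = 5
size, shifts_mid = insert_at(arr, size, 1, 15)      # arbitrary position
size, shifts_end = insert_at(arr, size, size, 60)   # at the end
```

Inserting at index 1 shifts four elements, while inserting at the end shifts none.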
• Elements are not required to be stored at contiguous memory locations. A new element can be stored anywhere in the memory where free space is available. Thus, it provides better memory usage than arrays.
• For each new element allocated, a link (a pointer or a reference) is created for the new element using which the element can be added to the linked list.
• Each element is stored in a node. A node has two parts:
◦ Info: stores the element.
◦ Link: stores the location of the next node.
• Header is a link to the first node of the linked list.
• Flexible in size. Size of a linked list grows or shrinks as and when new elements are inserted or deleted.
• Random access is not possible in linked lists. The elements will have to be accessed sequentially.
• Insertion or removal of an element at/from any arbitrary position is efficient as none of the elements are required to be moved to new locations.
• Insertion or removal of an element at/from any arbitrary position is efficient.
◦ Insertion at front:
1. NewNode.Link = Header
2. Header = NewNode
◦ Insertion at end:
1. Node.Link = NewNode (Node is the current last node)
◦ Insertion at any intermediate position:
1. NewNode.Link = Node.Link
2. Node.Link = NewNode
◦ Remove from any intermediate position:
1. Temp = Node.Link
2. Node.Link = Node.Link.Link
3. Delete(Temp)
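The pointer updates listed for the linked list can be sketched in Python as follows. The class and method names are illustrative; Info and Link from the slides become the attributes info and link. Python reclaims unreferenced nodes automatically, so the Delete(Temp) step has no explicit counterpart here.

```python
class Node:
    def __init__(self, info, link=None):
        self.info = info      # Info: stores the element
        self.link = link      # Link: reference to the next node


class LinkedList:
    def __init__(self):
        self.header = None    # Header: link to the first node

    def insert_front(self, info):
        new_node = Node(info)
        new_node.link = self.header     # 1. NewNode.Link = Header
        self.header = new_node          # 2. Header = NewNode
        return new_node

    def insert_after(self, node, info):
        new_node = Node(info)
        new_node.link = node.link       # 1. NewNode.Link = Node.Link
        node.link = new_node            # 2. Node.Link = NewNode
        return new_node

    def remove_after(self, node):
        temp = node.link                # 1. Temp = Node.Link
        if temp is not None:
            node.link = temp.link       # 2. Node.Link = Node.Link.Link
        return temp                     # 3. the removed node, if any

    def to_list(self):
        items, node = [], self.header
        while node is not None:
            items.append(node.info)
            node = node.link
        return items


lst = LinkedList()
lst.insert_front("c")
first = lst.insert_front("a")
lst.insert_after(first, "b")
after_inserts = lst.to_list()
lst.remove_after(first)
after_removal = lst.to_list()
```

Each operation touches only one or two links, regardless of how long the list is.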
Search
• The algorithm starts with the first element, compares it with the given key value, and returns yes if they match.
• If it does not match, then it proceeds sequentially, comparing each element of the list with the given key until a match is found or the full list is traversed.
Let the given input list be inputArr = ['a','c','a','d','e','m','i','c','s'] and the search key be 'i'.
Python Code for Linear Search:
------------------------------------------------------
def linSearch(inputArr, k):
    # Compare each element with the key, left to right
    for i in range(len(inputArr)):
        if inputArr[i] == k:
            return i    # index of the first match
    return -1           # key is not present
inputArr = ['a','c','a','d','e','m','i','c','s']
k = 'i'
index = linSearch(inputArr, k)
if index != -1:
    print("Element found at " + str(index))
• The input for the algorithm is a sorted list.
• The algorithm compares the key k with the middle element in the list.
• If the key matches, then it returns the index.
• If the key does not match and is greater than the middle element, then the new list is the list to the right of the middle element.
• If the key does not match and is less than the middle element, then the new list is the list to the left of the middle element.
Let the given input list be inputArr = ['a','a','c','c','d','e','i','m','s'] and the search key be 'i'.
Python Code for Binary Search:
------------------------------------------------------
def binSearch(inputArr, k):
    low = 0
    high = len(inputArr) - 1
    mid = 0
    while low <= high:
        mid = (high + low) // 2       # Division (floor)
        if inputArr[mid] < k:         # new list is to the right of mid
            low = mid + 1
        elif inputArr[mid] > k:       # new list is to the left of mid
            high = mid - 1
        else:                         # means k is present at mid
            return mid
    return -1                         # The element is not present
inputArr = ['a','a','c','c','d','e','i','m','s']
k = 'i'
index = binSearch(inputArr, k)
if index != -1:
    print("Element found at position " + str(index+1))
else:
    print("Not found")
Database Management Systems
Module 38: Algorithms and Data Structures/3: Data Structures
Partha Pratim Das
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
• Reviewed linear and binary search
• Introducing Non-linear Data Structures - graph, tree, hash table
• Exploring Binary Search Tree
• Comparing Linear and Non-Linear Data Structures
Module 38
Non-linear Data
Structures
Graph
Tree
Hash Table
Binary Search
Tree
Build a BST
Search a Key
Comparison
Module Summary
Module 38
• Data structure: A data structure specifies the way of organizing and storing in-memory data that enables efficient access and modification of the data.
  ◦ Linear Data Structures
  ◦ Non-linear Data Structures
• Most data structures have a container for the data and typical operations that they need to perform
• For applications relating to data management, the key operations are:
  ◦ Create
  ◦ Insert
  ◦ Delete
  ◦ Find / Search
  ◦ Close
• Efficiency is measured in terms of time and space taken for these operations
Non-linear Data Structures
• From the study of linear data structures in the last module, we can make the following summary observations:
  ◦ All of them have the space complexity O(n), which is optimal. However, the actual used space may be lower in an array, while a linked list has an overhead of 100% (double)
  ◦ All of them have complexities that are identical for Worst as well as Average case
  ◦ All of them offer satisfactory complexity for some operations while being unsatisfactory on the others
• Nonlinear data structures are those data structures in which data items are not arranged in a sequence and each element may have multiple paths to connect to other elements.
• Unlike linear data structures, in which each element is directly connected with at most two neighbouring elements (previous and next elements), an element of a non-linear data structure may be connected with more than two elements.
• The elements don't have a single path to connect to the other elements but have multiple paths. Traversing through the elements is not possible in one run as the data is non-linearly arranged.
• Common Non-Linear Data Structures include:
  ◦ Graph: Undirected or Directed, Unweighted or Weighted, and variants
  ◦ Tree: Rooted or Unrooted, Binary or n-ary, Balanced or Unbalanced, and variants
  ◦ Hash Table: Array with lists (coalesced chains) and one or more hash functions
  ◦ Skip List: Multi-layered interconnected linked lists
  ◦ and so on
Non-linear Data Structures (3): Graph PPD
• Graphs: Graph G is a collection of vertices V (store the elements) and connecting edges (links) E between vertices: G = <V, E> where E ⊆ V × V
• A graph may be:
  ◦ Undirected or Directed
  ◦ Unweighted or Weighted
  ◦ Cyclic or Acyclic
  ◦ Disconnected or Connected
  ◦ and so on
• Examples of a graph include:
  ◦ ER Diagram
  ◦ Network: Electrical, Water
  ◦ Friendships in Facebook
  ◦ Knowledge Graph
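One common way to hold a graph G = <V, E> in memory is an adjacency list. The sketch below is only an illustration of that idea: the class name `Graph` and its methods are assumptions made for this example and do not come from the slides.

```python
from collections import defaultdict


class Graph:
    """Adjacency-list graph; all names here are illustrative assumptions."""

    def __init__(self, directed=False):
        self.directed = directed
        self.adj = defaultdict(set)   # vertex -> set of neighbouring vertices

    def add_edge(self, u, v):
        self.adj[u].add(v)
        self.adj[v]                   # touch v so it is registered as a vertex
        if not self.directed:
            self.adj[v].add(u)        # an undirected edge is stored both ways

    def vertices(self):
        return set(self.adj)

    def edges(self):
        # Every stored edge is a pair from V x V.
        return {(u, v) for u in self.adj for v in self.adj[u]}


g = Graph(directed=True)
g.add_edge("A", "B")
g.add_edge("B", "C")
print(sorted(g.vertices()))
print(sorted(g.edges()))
```

A weighted variant could map each neighbour to a weight (a dict instead of a set); that extension is not shown here.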
• Tree: Is a connected acyclic graph representing hierarchical relationship
• A tree may be:
  ◦ Rooted or Unrooted
  ◦ Binary or n-ary
  ◦ Balanced or Unbalanced
  ◦ Disconnected (forest) or Connected
  ◦ and so on
• Examples of a tree include:
  ◦ Composite Attributes
  ◦ Family Genealogy
  ◦ Search Trees
  ◦ and so on
• Root: The node at the top of the tree is called root. There is only one root per tree and one path from the root node to any node. A is the root node.
• Parent: The node which is the immediate predecessor of a node is called its parent node. In the given tree, B is the parent of E. Every node, except the Root, has a unique parent
• Child: A node which is the immediate descendant of a node: D, E and F are the child nodes of B
• Leaf: A node which does not have any child node: I, J and K are leaf nodes
• Internal Nodes: A node which has at least one child is called an internal node
• Subtree: The subtree of a node is the tree rooted at that node
• Path: Path refers to the sequence of nodes along the edges of a tree
• Siblings: Nodes having the same parent: D, E and F are siblings.
• Arity: Number of children of a node. B has arity 3, E has arity 2, G has arity 1, and D has arity 0 (Leaf)
• Levels: The root node is said to be at Level 0 and the children of the root node are at Level 1 and the children of the nodes which are at Level 1 will be at Level 2 and so on. Level is the length of the path (number of links) or distance of a node from the root node. So, level of A is 0, level of C is 1, level of G is 2, and level of J is 3.
• Height: Maximum level in a tree
• Binary Tree: is a tree, where each node can have at most 2 children. It has arity 2.
• Fact 1: A tree with n nodes has n − 1 edges
• Fact 2: The maximum number of nodes at level l of a binary tree is $2^{l}$.
• Fact 3: If h is the height of a binary tree of n nodes, then:
  ◦ $h + 1 \le n \le 2^{h+1} - 1$
  ◦ $\lceil \lg(n + 1) \rceil - 1 \le h \le n - 1$
  ◦ $O(\lg n) \le h \le O(n)$
  ◦ For a k-ary tree, $O(\log_k n) \le h \le O(n)$
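The bounds in Fact 3 can be checked numerically. The following is a small sketch, with the helper names `height_bounds` and `max_nodes` chosen only for this illustration; it computes the smallest and largest possible height of a binary tree with n nodes, and the largest node count for a given height.

```python
import math


def height_bounds(n):
    """Return (min_height, max_height) of a binary tree with n >= 1 nodes."""
    min_h = math.ceil(math.log2(n + 1)) - 1   # every level filled as far as possible
    max_h = n - 1                             # one node per level (skewed tree)
    return min_h, max_h


def max_nodes(h):
    """Maximum number of nodes in a binary tree of height h."""
    return 2 ** (h + 1) - 1


for n in (1, 7, 8, 1000):
    print(n, height_bounds(n))
```

For example, 7 nodes fit in a tree of height 2, while an eighth node forces the minimum height up to 3.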
• Hash Table (Hash Map): implements an associative array abstract data type, a structure that can map keys to values by using a hash function to compute an index (hash code) into an array of buckets or slots, from which the desired value can be found
• A hash table may be using:
  ◦ Static or Dynamic Schemes
  ◦ Open Addressing
  ◦ 2-Choice Hashing
  ◦ and so on
• Examples of a hash table include:
  ◦ Associative arrays
  ◦ Database indexing
  ◦ Caches
  ◦ and so on
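To make the key-to-bucket mapping concrete, here is a minimal sketch of a hash table. It uses separate chaining (a list per bucket) purely because it is short to write; it is not one of the schemes named on the slide, and the class name `ChainedHashTable` and the sample keys are assumptions for this example.

```python
class ChainedHashTable:
    """Toy hash table using separate chaining (a list per bucket)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # The hash function maps a key to a bucket index (the hash code).
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default


table = ChainedHashTable()
table.put("ID", 101)
table.put("dept", "Physics")
table.put("ID", 102)                     # overwrites the earlier value
print(table.get("ID"), table.get("dept"), table.get("salary"))
```

With a good hash function and enough buckets, each chain stays short, which is what makes lookups fast on average.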
Binary Search Tree
• During the study of linear data structures, we observed that
  ◦ Binary search is efficient in search of a key: O(lg n). However,
    ▷ it needs to be performed on a sorted array, and
    ▷ the array makes insertion and deletion expensive at O(n)
  ◦ The linked list, on the other hand, is efficient in insertion and deletion at O(1), while it makes the search expensive at O(n).
    ▷ O(1) insert / delete is possible because we just need to manipulate pointers and not physically move data
• Using the non-linearity, specifically (binary) trees, we can combine the benefits of both
• Note that once an array is sorted, we know the order in which its elements may be checked (for any key) during a search
• As the binary search splits the array, we can conceptually consider the Middle Element to be the Root of a tree and the left (right) sub-array to be its left (right) sub-tree
• Progressing recursively, we have a Binary Search Tree
Binary Search and Binary Search Tree PPD
• Consider the data set:
    10  12  14  17  19  22  25
    LL  L   LR  M   RL  R   RR
• Search order is:
  ◦ First: M
  ◦ Second: L or R
  ◦ Third:
    ▷ For L: LL or LR
    ▷ For R: RL or RR
  ◦ Recur ...
• Put as a tree: M (17) is the root; L (12) and R (22) are its children; LL (10) and LR (14) are the children of L; RL (19) and RR (25) are the children of R
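The construction described here (middle element as root, left and right sub-arrays as sub-trees) can be sketched recursively. This is only an illustration: the function name `build` and the nested-tuple representation `(left, key, right)` are assumptions chosen to keep the example short.

```python
def build(sorted_keys):
    """Return a nested tuple (left, key, right), or None for an empty range."""
    if not sorted_keys:
        return None
    mid = (len(sorted_keys) - 1) // 2          # the element binary search probes first
    return (build(sorted_keys[:mid]),          # left sub-array becomes the left sub-tree
            sorted_keys[mid],                  # middle element becomes the root
            build(sorted_keys[mid + 1:]))      # right sub-array becomes the right sub-tree


tree = build([10, 12, 14, 17, 19, 22, 25])
left, root, right = tree
print(root, left[1], right[1])
```

The root of each sub-tree is exactly the element that binary search would inspect first in the corresponding sub-array.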
• Binary Search Tree (BST): Is a binary tree in which every node satisfies the following:
  ◦ The value of each node in its left sub-tree is less than the value of the node
  ◦ The value of each node in its right sub-tree is greater than the value of the node
• Structure of BST node: Each node consists of an element (X), a link to the left child or the left subtree (LC), and a link to the right child or the right subtree (RC)
• Example: Obtain the BST by inserting the following values:
  E, L, P, H, A, N, T.
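The insertion example can be traced with a short sketch. The node fields `x`, `lc` and `rc` follow the X / LC / RC structure described for a BST node; the function names `insert` and `inorder`, and the choice to ignore duplicate keys, are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    x: str                          # element (X)
    lc: Optional["Node"] = None     # left child / left subtree (LC)
    rc: Optional["Node"] = None     # right child / right subtree (RC)


def insert(root, key):
    """Insert key and return the (possibly new) root; duplicates are ignored."""
    if root is None:
        return Node(key)
    if key < root.x:
        root.lc = insert(root.lc, key)
    elif key > root.x:
        root.rc = insert(root.rc, key)
    return root


def inorder(root):
    """Left, node, right: lists the keys of a BST in sorted order."""
    return [] if root is None else inorder(root.lc) + [root.x] + inorder(root.rc)


root = None
for key in ["E", "L", "P", "H", "A", "N", "T"]:
    root = insert(root, key)

print(root.x, root.lc.x, root.rc.x)
print(inorder(root))
```

The first key inserted, E, stays at the root; A goes to its left, and L, with H, P, N and T below it, goes to its right.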
search(root, key)
1. If root is empty, the key is not present; return.
2. Compare the key with the element at root.
   2.1. If the key is equal to root's element then
        2.1.1. Element found and return
   2.2. else if the key is less than the root's element
        2.2.1. search(root.lc, key)   # search on the left subtree
   2.3. else:                         # the key is greater than the root's element
        2.3.1. search(root.rc, key)   # search on the right subtree
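The search steps translate almost directly into Python. The sketch below assumes a node with fields `x`, `lc` and `rc`, and adds an explicit check for an empty subtree so that a missing key terminates the recursion; the hand-built tree is just sample data.

```python
class Node:
    def __init__(self, x, lc=None, rc=None):
        self.x, self.lc, self.rc = x, lc, rc


def search(root, key):
    """Return the node holding key, or None if the key is absent."""
    if root is None:                  # empty subtree: key is not present
        return None
    if key == root.x:                 # element found
        return root
    if key < root.x:
        return search(root.lc, key)   # continue in the left subtree
    return search(root.rc, key)       # continue in the right subtree


# Hand-built BST over the keys 10 12 14 17 19 22 25
tree = Node(17,
            Node(12, Node(10), Node(14)),
            Node(22, Node(19), Node(25)))

print(search(tree, 19) is not None)
print(search(tree, 13) is not None)
```

Each call descends one level, so the number of comparisons is bounded by the height of the tree plus one.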
• Searching a key in a BST is O(h), where h is the height of the tree
• Worst Case
  ◦ The BST is a skewed binary search tree (all the nodes except the leaf would have only one child)
  ◦ This can happen if keys are inserted in sorted order
  ◦ Height (h) of the BST having n elements becomes n − 1
  ◦ Time complexity of search in BST becomes O(n)
• Best Case
  ◦ The BST is a balanced binary search tree
  ◦ This is possible if
    ▷ keys are inserted in purely randomized order (the tree is then balanced in expectation), or
    ▷ the tree is explicitly balanced after every insertion
  ◦ Height (h) of the binary search tree becomes O(lg n)
  ◦ Time complexity of search in BST becomes O(lg n)
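The effect of insertion order on height can be observed with a small experiment. This is a sketch under simple assumptions: the tree is stored as nested lists `[left, key, right]`, the helper names are illustrative, and the shuffle uses a fixed seed only to make the run repeatable.

```python
import random


def insert(tree, key):
    """tree is None or a list [left, key, right]; returns the updated tree."""
    if tree is None:
        return [None, key, None]
    if key < tree[1]:
        tree[0] = insert(tree[0], key)
    elif key > tree[1]:
        tree[2] = insert(tree[2], key)
    return tree


def height(tree):
    """Height as the maximum level; an empty tree has height -1."""
    return -1 if tree is None else 1 + max(height(tree[0]), height(tree[2]))


def build(keys):
    tree = None
    for k in keys:
        tree = insert(tree, k)
    return tree


n = 255
sorted_keys = list(range(n))
shuffled = sorted_keys[:]
random.Random(0).shuffle(shuffled)

print(height(build(sorted_keys)))   # skewed tree: height is n - 1
print(height(build(shuffled)))      # far smaller; at least 7 for 255 keys
```

Sorted input yields the worst case exactly, while a random order usually gives a height within a small constant factor of lg n, though not necessarily the minimum.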
Comparison of Linear and Non-Linear Data Structures
Linear Data Structure
• Data elements are arranged in a linear order where each element is attached to its previous and next adjacent elements
• Single level is involved
• Implementation is easy in comparison to non-linear data structure
• Data elements can be traversed in one way only
• Examples: array, stack, queue, linked list, and their variants
Non-Linear Data Structure
• Data elements are arranged in hierarchical or networked manner
• Multiple levels are involved
• Implementation is complex in comparison to linear data structure
• Data elements can be traversed in multiple ways. Various traversals may be defined to linearize the data: Depth-First, Breadth-First, Inorder, Preorder, Postorder, etc.
• Examples: trees, graphs, skip list, hash map, and several variants
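The traversals named in the comparison each linearize the same tree in a different order. The sketch below illustrates four of them on a small binary tree held as nested tuples `(left, key, right)`; the representation and function names are assumptions made for this example.

```python
from collections import deque

# Tree as nested tuples (left, key, right); None marks an empty subtree.
TREE = (((None, 10, None), 12, (None, 14, None)),
        17,
        ((None, 19, None), 22, (None, 25, None)))


def inorder(t):
    return [] if t is None else inorder(t[0]) + [t[1]] + inorder(t[2])


def preorder(t):
    return [] if t is None else [t[1]] + preorder(t[0]) + preorder(t[2])


def postorder(t):
    return [] if t is None else postorder(t[0]) + postorder(t[2]) + [t[1]]


def breadth_first(t):
    order, queue = [], deque([t])
    while queue:
        node = queue.popleft()
        if node is not None:
            order.append(node[1])
            queue.extend((node[0], node[2]))   # visit children level by level
    return order


print(inorder(TREE))
print(preorder(TREE))
print(postorder(TREE))
print(breadth_first(TREE))
```

Inorder, preorder and postorder are all depth-first traversals; only the position of the node relative to its subtrees differs.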
Module Summary
• Introduced Non-linear Data Structures - graph, tree, hash table
• Studied Binary Search Tree as an adaptation of binary search
• Compared Linear and Non-Linear Data Structures
Database Management Systems
Module 39: Storage and File Structure/1: Physical Storage
• Introduced Non-linear Data Structures - graph, tree, hash table
• Studied Binary Search Tree as an adaptation of binary search
• Compared Linear and Non-Linear Data Structures
• Introduce various Physical Storage Media for high volume, fast, reliable and inexpensive options for data storage for databases
• To understand the options of Tertiary Storage for high volume, inexpensive backup options
    ▷ capacities of up to a few Gigabytes widely used currently
    ▷ Capacities have gone up and per-byte costs have decreased steadily and rapidly (roughly factor of 2 every 2 to 3 years)
• Volatile
  ◦ contents of main memory are usually lost if a power failure or system crash occurs
• But writes are slow (few microseconds), erase is slower
• Widely used in embedded devices such as digital cameras, phones, and USB keys
• Data is stored on spinning disk, and read/written magnetically
• Primary medium for the long-term storage of data
  ◦ typically stores entire database
• Data must be moved from disk to main memory for access, and written back for storage - much slower access than main memory
• direct-access
  ◦ possible to read data on disk in any order, unlike magnetic tape
• Capacities range up to roughly 16–32 TB
  ◦ Much larger capacity and much lower cost/byte than main memory/flash memory
  ◦ Growing constantly and rapidly with technology improvements (factor of 2 to 3 every 2 years)
• Survives power failures and system crashes
  ◦ disk failure can destroy data, but is rare
• non-volatile, data is read optically from a spinning disk using a laser
• CD-ROM (640 MB) and DVD (4.7 to 17 GB) most popular forms
• Blu-ray disks: 27 GB to 54 GB
• Write-once, read-many (WORM) optical disks used for archival storage (CD-R, DVD-R, DVD+R)
• Multiple write versions also available (CD-RW, DVD-RW, DVD+RW, and DVD-RAM)
• Reads and writes are slower than with magnetic disk
• Juke-box systems, with large numbers of removable disks, a few drives, and a mechanism for automatic loading/unloading of disks, are available for storing large volumes of data
• non-volatile, used primarily for backup (to recover from disk failure), and for archival data
• sequential-access
  ◦ much slower than disk
• very high capacity (40 to 300 TB tapes available)
• tape can be removed from the drive, so storage costs are much cheaper than disk, but drives are expensive
• Tape jukeboxes available for storing massive amounts of data
  ◦ hundreds of terabytes (TB) (1 TB = $10^{12}$ bytes) to even multiple petabytes (PB) (1 PB = $10^{15}$ bytes)
NOTE: Diagram is schematic, and simplifies the structure of actual disk drives
• Read-write head
  ◦ Positioned very close to the platter surface
  ◦ Reads or writes magnetically encoded information
• Surface of platter divided into circular tracks
  ◦ 50K to 100K tracks per platter on typical hard disks
• Each track is divided into sectors
  ◦ A sector is the smallest unit of data read or written
  ◦ Sector size typically 512 bytes
  ◦ Sectors per track: 500 to 1,000 (inner tracks) up to 1,000 to 2,000 (outer tracks)
• To read/write a sector
  ◦ disk arm swings to position head on right track
  ◦ platter spins: Read/Write as sector passes under head
• Head-disk assemblies
  ◦ multiple disk platters on a single spindle (1 to 5 usually)
  ◦ one head per platter, mounted on a common arm.
• Cylinder i consists of the i-th track of all the platters
• Disk Controller: interfaces between the computer system and the disk drive hardware
  ◦ Accepts high-level commands to read or write a sector
  ◦ Initiates actions such as moving the disk arm to the right track and reading or writing the data
  ◦ Computes and attaches checksums to each sector to verify that data is read back correctly
  ◦ Ensures successful writing by reading back sector after writing it
  ◦ Performs remapping of bad sectors
• Disk Subsystem:
• Disk Interface Standards Families: ATA, SATA, SCSI, SAS, several variants
• Storage Area Networks (SAN) connect disks by a high-speed network to a number of servers
• Network Attached Storage (NAS) provides a file system interface using networked file system protocol
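The checksum idea used by the disk controller can be illustrated in software. The sketch below is not how a real controller is implemented (drives use hardware error-correcting codes); it simply assumes a CRC-32 checksum from Python's `zlib` module appended to each 512-byte sector, with the function names `write_sector` and `read_sector` invented for this example.

```python
import zlib

SECTOR_SIZE = 512          # bytes of payload per sector (typical sector size)


def write_sector(data: bytes) -> bytes:
    """Return the stored form: payload followed by a 4-byte CRC-32 checksum."""
    assert len(data) == SECTOR_SIZE
    return data + zlib.crc32(data).to_bytes(4, "big")


def read_sector(stored: bytes) -> bytes:
    """Return the payload, or raise if the checksum does not match."""
    data, checksum = stored[:-4], stored[-4:]
    if zlib.crc32(data).to_bytes(4, "big") != checksum:
        raise IOError("checksum mismatch: sector is corrupted")
    return data


stored = write_sector(bytes(SECTOR_SIZE))
assert read_sector(stored) == bytes(SECTOR_SIZE)    # read back after writing

corrupted = bytes([stored[0] ^ 0x01]) + stored[1:]  # flip one bit of the payload
try:
    read_sector(corrupted)
except IOError as err:
    print(err)
```

Reading the sector back immediately after writing it, as the first assertion does, mirrors the controller's read-after-write check.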
Magnetic Disks (4): Performance Measures
• Access Time: time from a read or write request issue to start of data transfer:
  ◦ Seek Time: time to reposition the arm over the correct track
    ▷ Avg. seek time is 1/2 the worst case seek time; it would be 1/3 if all tracks had the same number of sectors
    ▷ 4 to 10 milliseconds on typical disks
  ◦ Rotational Latency: time for the sector to be accessed to appear under the head
    ▷ Average latency is 1/2 of the worst case latency
    ▷ Worst case (one full rotation) is 4 to 11 milliseconds on typical disks (15000 down to 5400 rpm)
• Data-transfer Rate: the rate at which data can be retrieved from or stored to the disk
  ◦ 25 to 100 MB per second max rate, lower for inner tracks
  ◦ Multiple disks may share a controller, so the rate that the controller can handle is also important
• Mean Time To Failure (MTTF): Avg. time the disk is expected to run continuously without any failure
  ◦ Typically 3 to 5 years
  ◦ Probability of failure of new disks is quite low, corresponding to a theoretical MTTF of 500,000 to 1,200,000 hours for a new disk. For example, an MTTF of 1,200,000 hours for a new disk means that given 1000 relatively new disks, on an average one will fail every 1200 hours
  ◦ MTTF decreases as disk ages
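The arithmetic behind these performance measures is simple enough to sketch. The helper names below are assumptions for this illustration; the figures plugged in (15000 and 5400 rpm, a 4 ms average seek, an MTTF of 1,200,000 hours across 1000 disks) are taken from the ranges quoted for typical disks.

```python
def rotation_time_ms(rpm):
    """Time for one full rotation, i.e. the worst-case rotational latency."""
    return 60_000 / rpm


def avg_access_time_ms(avg_seek_ms, rpm):
    """Average access time = average seek time + average rotational latency."""
    return avg_seek_ms + rotation_time_ms(rpm) / 2


def hours_between_failures(mttf_hours, num_disks):
    """Expected hours between failures across a pool of identical disks."""
    return mttf_hours / num_disks


print(rotation_time_ms(15000))                    # 4.0
print(round(rotation_time_ms(5400), 1))           # 11.1
print(avg_access_time_ms(4, 15000))               # 6.0
print(hours_between_failures(1_200_000, 1000))    # 1200.0
```

The first two values reproduce the 4 to 11 millisecond range for a full rotation, and the last reproduces the one-failure-every-1200-hours example.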
• Hold large volumes of data and provide high transfer rates
  ◦ Tape Formats
    ▷ Few GB for DAT (Digital Audio Tape) format
    ▷ 10-40 GB with DLT (Digital Linear Tape) format
    ▷ 100 GB+ with Ultrium format, and
    ▷ 330 GB with Ampex helical scan format
  ◦ Transfer rates from a few to 10s of MB/s
• Tapes are cheap, but cost of drives is very high
• Very slow access time in comparison to magnetic and optical disks
  ◦ Limited to sequential access
  ◦ Some formats (Accelis) provide faster seek (10s of seconds) at cost of lower capacity
• Used mainly for backup, for storage of infrequently used information, and as an off-line medium for transferring information from one system to another.
• Tape jukeboxes used for very large capacity storage
  ◦ Multiple petabytes ($10^{15}$ bytes)
• Cloud storage is purchased from a third-party cloud vendor who owns and operates data storage capacity and delivers it over the Internet in a pay-as-you-go model
• These cloud storage vendors manage capacity, security and durability to make data accessible to applications all around the world
• Applications access cloud storage through traditional storage protocols or directly via an API
• Many vendors offer complementary services designed to help collect, manage, secure and analyze data at massive scale. Various available options for cloud storage are:
  ◦ Google Drive
  ◦ Amazon Drive
  ◦ OneDrive by Microsoft
  ◦ Evernote
  ◦ Dropbox
  ◦ and so on
  ◦ Blu-ray DVD: 27 GB (54 GB for double sided disk)
  ◦ Slow seek time, for same reasons as CD-ROM
• Record once versions (CD-R and DVD-R) are popular
  ◦ data can only be written once, and cannot be erased.
  ◦ high capacity and long lifetime; used for archival storage
  ◦ Multi-write versions (CD-RW, DVD-RW, DVD+RW and DVD-RAM) also available
• Flash drives are often referred to as pen drives, thumb drives, or jump drives. They have completely replaced floppy drives for portable storage. Considering how large and inexpensive they have become, they have nearly replaced CDs and DVDs for data storage purposes.
• USB flash drives are removable and rewritable storage devices that, as the name suggests, require a USB port for connection and utilize non-volatile flash memory technology.
• The storage space in USB flash drives is quite large with sizes ranging from 128MB to 2TB.
• The USB standard a flash drive is built around will determine a number of things about its potential performance, including maximum transfer rate.
• A Secure Digital (SD, in short) card is a type of removable memory card used to read and write large quantities of data.
• Due to their relatively small size, SD cards are widely used in mobile electronics, cameras, smart devices, video game consoles, and more.
• There are several types of SD cards sold and used today:

  Card Type   Year of Debut   Capacity        Supported Devices
  SD          1999            128MB to 2GB    All host devices that support SD, SDHC, SDXC
  SDHC        2006            4GB to 32GB     All host devices that support SDHC, SDXC
  SDXC        2009            64GB to 2TB     All host devices that support SDXC

  Card Type   Capacity        File System     Remarks
  SD          128MB to 2GB    FAT16           FAT16 supports 16 MB to 2 GB
  SDHC        4GB to 32GB     FAT32           FAT32 can support up to 16 TB
  SDXC        64GB to 2TB     exFAT           exFAT is non-standard; supports files larger than 4 GB

Source: CARDS - WHAT ARE THE DIFFERENCES BETWEEN FAT16, FAT32 AND EXFAT FILE SYSTEMS?
Flash Storage
      − translation table tracks mapping
      − also stored in a label field of flash page
      − remapping carried out by flash translation layer
    ▷ after 100,000 to 1,000,000 erases, erase block becomes unreliable and cannot be used
      − wear leveling
Solid-State Drives (SSD)
• SSDs replace traditional mechanical hard disks by using flash-based memory, which is significantly faster.
• SSDs speed up computers significantly due to their low read-access times and fast throughput.
• The idea of SSDs was introduced in 1978. It was implemented using semiconductors. An SSD keeps its data in a persistent state even when no power is supplied.
• SSDs are much faster than HDDs as they read/write data at a higher number of input/output operations per second.
• Unlike HDDs, SSDs do not include any moving parts. SSDs can resist vibrations and high temperatures.
• DNA digital data storage is the process of encoding and decoding binary data to and from synthesized strands of DNA.
• While DNA as a storage medium has enormous potential because of its high storage density, its practical use is currently severely limited because of its high cost and very slow read and write times.
• Digital storage systems encode the text, photos, or any other kind of information as a series of 0s and 1s. This same information can be encoded in DNA using the four nucleotides that make up the genetic code: A, T, G, and C. For example, G and C could be used to represent 0 while A and T represent 1.
• DNA has several other features that make it desirable as a storage medium; it is extremely stable and is fairly easy (but expensive) to synthesize and sequence.
• Also, because of its high density - each nucleotide, equivalent to up to two bits, is about 1 cubic nanometer - an exabyte ($10^{18}$ bytes) of data stored as DNA could fit in the palm of your hand
• DNA Synthesis: A DNA synthesizer machine builds synthetic DNA strands matching the sequence of digital code.
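The example mapping (G or C for 0, A or T for 1) can be sketched as a pair of encode and decode functions. This is a toy illustration only: it stores one bit per nucleotide, alternates between the two letters available for each bit as an arbitrary choice, and ignores the constraints of real DNA synthesis. The function names are assumptions made for this example.

```python
ZERO, ONE = "GC", "AT"      # G or C stands for 0; A or T stands for 1


def encode(data: bytes) -> str:
    """Encode bytes as a nucleotide string, one bit per nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in data)
    # Alternate between the two letters available for each bit value
    # (an arbitrary illustrative choice; either letter decodes the same way).
    return "".join((ONE if b == "1" else ZERO)[i % 2] for i, b in enumerate(bits))


def decode(strand: str) -> bytes:
    """Recover the original bytes from a nucleotide string."""
    bits = "".join("1" if base in ONE else "0" for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"DB")
print(strand)
print(decode(strand))
```

A denser scheme would map each nucleotide to two bits, which is the upper limit mentioned for the density estimate.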
Module 39
Partha Pratim • Quantum memory is the quantum-mechanical version of ordinary computer memory
Das
• Whereas ordinary memory stores information as binary states (represented by ”1”s and
Objectives &
Outline ”0”s), quantum memory stores a quantum state for later retrieval
Physical Storage
Flash memory
• These states hold useful computational information known as qubits
Magnetic Disk
Optical Storage
• Quantum memory is essential for the development of many devices in quantum
Tape Storage information processing applications such as quantum network, quantum repeater, linear
Storage Hierarchy
optical quantum computation or long-distance quantum communication
Magnetic Disk
Magnetic Tapes • Unlike the classical memory of everyday computers, the states stored in quantum
Cloud Storage memory can be in a quantum superposition, giving much more practical flexibility in
Cloud vs. Storage
quantum algorithms than classical information storage
Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.
Database Management Systems
Module 40: Storage and File Structure/2: File Structure
Partha Pratim Das
File Organization
• A database file is partitioned into fixed-length storage units called blocks
◦ Blocks are units of both storage allocation and data transfer
• Deletion of record i: Alternatives:
  ◦ move records i + 1, ..., n to i, ..., n − 1
  ◦ move record n to i
  ◦ do not move records, but link all free records on a free list
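The first two alternatives can be sketched on an in-memory list standing in for the file. This is only an illustration: the function names are hypothetical, and slots are numbered from 0 here rather than from 1 as on the slide.

```python
def delete_shift(records, i):
    """Alternative 1: move records i+1..n down by one slot (order is preserved)."""
    for j in range(i, len(records) - 1):
        records[j] = records[j + 1]
    records.pop()  # the last slot is now a duplicate and is dropped


def delete_move_last(records, i):
    """Alternative 2: move the last record into slot i (one move, order is lost)."""
    last = records.pop()
    if i < len(records):  # nothing to move if slot i was itself the last slot
        records[i] = last
```

Shifting touches every record after slot i, while moving the last record touches only one, at the price of disturbing the record order.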
[Figure: before deletion; after deletion and compaction]
[Figure: before deletion; after deletion and movement]
• Store the address of the first deleted record in the file header
• Use this first record to store the address of the second deleted record, and so on
• Consider these stored addresses as pointers since they point to the location of a record
• More space efficient representation: reuse space for normal attributes of free records to store pointers (No pointers stored in in-use records)
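The free-list scheme above can be sketched as follows. This is a toy in-memory model, not an on-disk format: the class name, the `in_use` set, and the use of list indices as "addresses" are all assumptions made for illustration.

```python
class FixedLengthFile:
    """Toy file of fixed-length record slots with a free list (illustrative only)."""

    def __init__(self):
        self.slots = []        # each slot holds a record, or a free-list pointer
        self.free_head = None  # "file header": slot number of the first free slot
        self.in_use = set()    # bookkeeping for this sketch only

    def insert(self, record):
        if self.free_head is not None:
            slot = self.free_head
            self.free_head = self.slots[slot]  # next free slot was stored in the freed slot
            self.slots[slot] = record
        else:
            slot = len(self.slots)             # no free slot: extend the file
            self.slots.append(record)
        self.in_use.add(slot)
        return slot

    def delete(self, slot):
        if slot not in self.in_use:
            raise KeyError(slot)
        self.in_use.remove(slot)
        self.slots[slot] = self.free_head      # reuse the record's space for the pointer
        self.free_head = slot
```

Deleted slots form a chain starting at the header, so an insert reuses the most recently freed slot before the file grows.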
• Variable-length records arise in database systems in several ways:
  ◦ Storage of multiple record types in a file
  ◦ Record types that allow variable lengths for one or more fields such as strings (varchar)
  ◦ Record types that allow repeating fields (used in some older data models)
• Attributes are stored in order
• Variable length attributes represented by fixed size (offset, length), with actual data stored after all fixed length attributes
• Null values represented by null-value bitmap
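One way to picture this representation is the sketch below. The exact byte layout is an assumption for illustration: 2-byte offset and 2-byte length per variable-length attribute, one 8-byte fixed-length attribute (called `salary` here), and a 1-byte null bitmap, which limits the sketch to eight variable-length attributes.

```python
import struct


def pack_record(salary, strings):
    """Pack one fixed-length attribute and several variable-length strings.

    Assumed layout: [(offset, length) per string][salary][null bitmap][string bytes].
    """
    header_size = 4 * len(strings) + 8 + 1  # pairs + 8-byte salary + 1-byte bitmap
    pairs, data, bitmap = b"", b"", 0
    for i, s in enumerate(strings):
        if s is None:
            bitmap |= 1 << i                # bit i set means attribute i is null
            pairs += struct.pack("<HH", 0, 0)
        else:
            raw = s.encode("utf-8")
            pairs += struct.pack("<HH", header_size + len(data), len(raw))
            data += raw
    return pairs + struct.pack("<d", salary) + bytes([bitmap]) + data


def unpack_string(record, i, n_strings):
    """Return the i-th variable-length attribute, or None if its null bit is set."""
    bitmap = record[4 * n_strings + 8]
    if bitmap & (1 << i):
        return None
    offset, length = struct.unpack_from("<HH", record, 4 * i)
    return record[offset:offset + length].decode("utf-8")
```

Because every (offset, length) pair sits at a fixed position, any attribute can be located without scanning the attributes before it.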
• Slotted Page header contains:
  ◦ number of record entries
  ◦ end of free space in the block
  ◦ location and size of each record
• Records can be moved around within a page to keep them contiguous with no empty space between them; entry in the header must be updated
• Pointers should not point directly to record; instead they should point to the entry for the record in header
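A slotted page along these lines might be sketched as below. It is a simplified model under stated assumptions: the header is kept as a Python list rather than inside the block, so header space is not counted, and the class and method names are hypothetical.

```python
class SlottedPage:
    """Toy slotted page: header entries point at records packed at the end of the block."""

    def __init__(self, size=64):
        self.data = bytearray(size)
        self.free_end = size  # end of free space; records grow downward from here
        self.slots = []       # header entries: (offset, length), or None if deleted

    def insert(self, record: bytes) -> int:
        if len(record) > self.free_end:
            raise ValueError("page full")
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1  # callers keep the slot number, not the offset

    def read(self, slot: int) -> bytes:
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])

    def delete(self, slot: int) -> None:
        self.slots[slot] = None
        live = [(s, self.read(s)) for s, e in enumerate(self.slots) if e is not None]
        self.free_end = len(self.data)
        for s, rec in live:  # repack live records contiguously
            self.free_end -= len(rec)
            self.data[self.free_end:self.free_end + len(rec)] = rec
            self.slots[s] = (self.free_end, len(rec))  # only the header entry changes
```

After a deletion the surviving records move, but their slot numbers stay valid, which is why external pointers refer to header entries rather than to byte offsets.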
Organization of Records in Files
• Heap: A record can be placed anywhere in the file where there is space
• Sequential: Store records in sequential order, based on the value of the search key of each record
• Hashing: A hash function computed on some attribute of each record; the result specifies in which block of the file the record should be placed
• Records of each relation may be stored in a separate file. In a multitable clustering file organization records of several different relations can be stored in the same file
  ◦ Motivation: store related records on the same block to minimize I/O
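As a rough illustration of the hashing organization, the sketch below places each record in a block chosen by a hash of one attribute. The number of blocks, the choice of CRC-32 as the hash function, and the function names are assumptions for this example; real systems also have to handle blocks that fill up.

```python
import zlib

N_BLOCKS = 4  # hypothetical number of blocks in the file


def block_for(key: str) -> int:
    """Map a search-key value to a block number with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % N_BLOCKS


def build_file(records, key):
    """Place every record in the block selected by hashing its key attribute."""
    blocks = [[] for _ in range(N_BLOCKS)]
    for rec in records:
        blocks[block_for(rec[key])].append(rec)
    return blocks


def lookup(blocks, key, value):
    """Only the one block chosen by the hash has to be examined."""
    return [rec for rec in blocks[block_for(value)] if rec[key] == value]
```

A lookup on the hashed attribute reads a single block, whereas a heap file would in general need a scan of the whole file.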
• Suitable for applications that require sequential processing of the entire file
• The records in the file are ordered by a search-key
• Deletion: Use pointer chains
• Insertion: Locate the position where the record is to be inserted
  ◦ if there is free space insert there
  ◦ if no free space, insert the record in an overflow block
  ◦ In either case, pointer chain must be updated
• Need to reorganize the file from time to time to restore sequential order
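The overflow case and the pointer chain can be sketched as follows. The model is deliberately simplified and its names are hypothetical: the main area starts out full, so every insertion goes to the overflow area, and each record is a `[key, next_slot]` pair.

```python
class SequentialFile:
    """Toy sequential file: records chained in search-key order by 'next' pointers."""

    def __init__(self, sorted_keys):
        # Main area: records physically in key order, each pointing at the next slot.
        n = len(sorted_keys)
        self.records = [[k, i + 1 if i + 1 < n else None] for i, k in enumerate(sorted_keys)]
        self.head = 0 if n else None
        self.main_size = n  # slots at or beyond this index are overflow

    def insert(self, key):
        prev, cur = None, self.head
        while cur is not None and self.records[cur][0] <= key:
            prev, cur = cur, self.records[cur][1]
        slot = len(self.records)
        self.records.append([key, cur])  # overflow record points at its successor
        if prev is None:
            self.head = slot
        else:
            self.records[prev][1] = slot  # predecessor now points at the new record
        return slot

    def scan(self):
        """Follow the pointer chain to read keys in search-key order."""
        cur, keys = self.head, []
        while cur is not None:
            keys.append(self.records[cur][0])
            cur = self.records[cur][1]
        return keys

    def reorganize(self):
        """Rewrite the file so that physical order matches key order again."""
        self.__init__(self.scan())
```

The chain keeps logical order correct after every insertion, but a scan then jumps between main and overflow slots, which is the cost that periodic reorganization removes.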
Store several relations in one file using a multitable clustering file organization
[Figure: the department and instructor relations]
• good for queries involving department ⋈ instructor, and for queries involving one single department and its instructors
• bad for queries involving only department
• results in variable size records
• Can add pointer chains to link records of a particular relation
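A small sketch may make the trade-off concrete. It assumes records are Python dicts with a `dept_name` attribute and that a `next_dept` field plays the role of the pointer chain; both the layout and the function names are illustrative assumptions.

```python
def build_clustered_file(departments, instructors):
    """Store each department record followed by the instructor records of that department."""
    file, prev_dept = [], None
    for dept in departments:
        slot = len(file)
        file.append({"rel": "department", "row": dept, "next_dept": None})
        if prev_dept is not None:
            file[prev_dept]["next_dept"] = slot  # pointer chain over department records
        prev_dept = slot
        for inst in instructors:
            if inst["dept_name"] == dept["dept_name"]:
                file.append({"rel": "instructor", "row": inst})
    return file


def scan_departments(file):
    """Follow the pointer chain instead of reading every record in the file."""
    slot, names = (0 if file else None), []
    while slot is not None:
        names.append(file[slot]["row"]["dept_name"])
        slot = file[slot]["next_dept"]
    return names
```

A department and its instructors sit next to each other, which suits the join, while a query on department alone has to step over instructor records unless it follows the chain.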
Data Dictionary Storage
Data Dictionary (also, System Catalog) stores metadata (data about data) such as:
• Information about relations
  ◦ names of relations
  ◦ names, types and lengths of attributes of each relation
  ◦ names and definitions of views
  ◦ integrity constraints
• User and accounting information, including passwords
• Statistical and descriptive data
• Physical file organization information
  ◦ How relation is stored (sequential/hash/...)
  ◦ Physical location of relation
• Information about indices
• Relational representation on disk
Storage Access
• A database file is partitioned into fixed-length storage units called blocks
  ◦ Blocks are units of both storage allocation and data transfer
• Database system seeks to minimize the number of block transfers between the disk and memory
  ◦ We can reduce the number of disk accesses by keeping as many blocks as possible in main memory
• Buffer: portion of main memory available to store copies of disk blocks
• Buffer Manager: subsystem responsible for allocating buffer space in main memory
• Programs call on the buffer manager when they need a block from disk
  ◦ If the block is already in the buffer, buffer manager returns the address of the block in main memory
  ◦ If the block is not in the buffer, the buffer manager
    ▷ Allocates space in the buffer for the block
      − Replacing (throwing out) some other block, if required, to make space for the new block
      − Replaced block written back to disk only if it was modified since the most recent time that it was written to / fetched from the disk
    ▷ Reads the block from the disk to the buffer, and returns the address of the block in main memory to requester
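The request-handling steps above can be sketched as follows. This is a toy model under stated assumptions: a Python dict stands in for the disk, LRU is used as the replacement policy purely for concreteness, and the class, method, and counter names are hypothetical.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer manager with LRU replacement and write-back of modified blocks."""

    def __init__(self, disk, capacity):
        self.disk = disk             # dict: block id -> contents (stands in for the disk)
        self.capacity = capacity
        self.buffer = OrderedDict()  # block id -> [contents, dirty flag], oldest first
        self.disk_reads = 0
        self.disk_writes = 0

    def get(self, block_id):
        if block_id in self.buffer:  # already buffered: no disk access
            self.buffer.move_to_end(block_id)
            return self.buffer[block_id]
        if len(self.buffer) >= self.capacity:  # make space by evicting the LRU block
            victim, (contents, dirty) = self.buffer.popitem(last=False)
            if dirty:  # write back only if modified
                self.disk[victim] = contents
                self.disk_writes += 1
        self.disk_reads += 1
        self.buffer[block_id] = [self.disk[block_id], False]
        return self.buffer[block_id]

    def write(self, block_id, contents):
        """Modify a block in the buffer and mark it dirty; the disk is not touched yet."""
        frame = self.get(block_id)
        frame[0], frame[1] = contents, True
```

A clean block is simply dropped on eviction, while a dirty block costs one extra disk write, which is the distinction the last sub-bullet draws.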
• Most operating systems replace the block least recently used (LRU strategy)
• Idea behind LRU: use past pattern of block references as a predictor of future references
• Queries have well-defined access patterns (such as sequential scans), and a database system can use the information in a user’s query to predict future references
  ◦ LRU may be a bad strategy for certain access patterns involving repeated scans of data
    ▷ For example: when computing the join of 2 relations r and s by a nested loop
        for each tuple t_r of r do
          for each tuple t_s of s do
            if the tuples t_r and t_s match ...
  ◦ Mixed strategy with hints on replacement strategy provided by the query optimizer is preferable
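To see why repeated scans hurt LRU, the sketch below counts buffer misses for the inner relation of a nested-loop join and compares LRU with evicting the most recently used block instead. The block counts, the number of scans, and the function name are assumptions for illustration; pinning is not modeled.

```python
def count_misses(references, capacity, policy):
    """Count buffer misses for a reference string under 'LRU' or 'MRU' replacement."""
    buffer, misses = [], 0  # list ordered from least to most recently used
    for block in references:
        if block in buffer:
            buffer.remove(block)  # hit: re-appended below as most recent
        else:
            misses += 1
            if len(buffer) >= capacity:
                buffer.pop(0 if policy == "LRU" else -1)  # evict oldest or newest
        buffer.append(block)
    return misses


# Inner relation s occupies 4 blocks and is scanned 3 times (once per outer tuple),
# but only 3 buffer frames are available for it.
scans = [0, 1, 2, 3] * 3
lru_misses = count_misses(scans, 3, "LRU")
mru_misses = count_misses(scans, 3, "MRU")
```

With these numbers LRU misses on all 12 references, because each block is evicted just before it is needed again, while the most-recently-used policy misses on only 6.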
• Pinned block: memory block that is not allowed to be written back to disk
• Toss-immediate strategy: frees the space occupied by a block as soon as the final tuple of that block has been processed
• Most recently used (MRU) strategy: system must pin the block currently being processed. After the final tuple of that block has been processed, the block is unpinned, and it becomes the most recently used block.
• Buffer manager can use statistical information regarding the probability that a request will reference a particular relation
  ◦ For example, the data dictionary is frequently accessed. Heuristic: keep data-dictionary blocks in main memory buffer
• Buffer managers also support forced output of blocks for the purpose of recovery
Slides used in this presentation are borrowed from http://db-book.com/ with kind permission of the authors.
Edited and new slides are marked with “PPD”.