Dbms Notes Be Sem 5wbeb
Semester : V
Module – I
Introduction: Purpose of Database System; View of Data, Data Models, Database
Languages, Transaction Management, Database Architecture, Database Users
Administrator
Database Design and Entity-Relationship Model: Overview of Design Process, E-R
Model, Constraints, E-R Diagram, Weak Entity Sets, Extended E-R Features
Module – II
Relational Model: Structure of Relational Database, Fundamental Relational-Algebra
Operations, Additional Operations, Tuple Relational
Module – III
SQL & Advanced SQL: Data Definition, Basic Structure of SQL Queries, Set
Operations, Aggregate Functions, Null Values, Nested Sub – Queries, Complex Queries,
Views, Modification of Database, SQL data types & schemas, Integrity Constraints,
Authorization, Embedded SQL
Module – IV
Relational Database Design: Atomic domains and First Normal Form, Decomposition
using Functional Dependencies, Decomposition using Multivalued Dependencies , more
normal forms
Module – V
Indexing and Hashing: Basic Concepts, Ordered Indices, B+ Tree Index Files, B Tree
Index Files, Multiple Key Access, Hashing, Comparison of Ordered Indexing and
Hashing
Module – VI
Query Processing: Overview, Measure of Query Cost, Selection Operation, Sorting, Join
Operations
Module – VII
Transaction & Concurrency Control: Transaction Concepts & ACID Properties,
Transaction States, Concurrent Executions, Serializability & Its Testing, Recoverability,
Introduction to Concurrency Control, Lock-Based Protocols & Deadlock Handling,
Timestamp-Based Protocols, Validation-Based Protocols, Multiple Granularity.
Text Book:
1. A.Silberschatz et.al - Database System Concepts, 5th Edn, Tata Mc-Graw Hill,
New Delhi – 2000.
Reference Books:
1. Date C.J. - An Introduction to Database System, Pearson Education, New Delhi-
2005
2. R.Elmasri, Fundamentals of Database Systems, Pearson Education, New Delhi,
COURSE PLAN
Department : Computer Science
Subject : DATABASE MANAGEMENT SYSTEM
Semester & branch : III
No. of periods hours/week : 3   Theory: yes   Labs:
Total No. of lectures : 40
Recommended Course Books
Text Book:
1. A.Silberschatz et.al - Database System Concepts, 5th Edn, Tata Mc-Graw Hill,
New Delhi – 2000.
Reference Books:
Lecture No.   Topic(s) to be covered
1             Introduction: Purpose of Database System
2             View of Data
3             Data Models
4             Data Models
5             Database Languages
6             Transaction Management
7             Storage Management
12            Keys, E-R Diagram
13            Keys, E-R Diagram
21            Functional Dependencies
23            Decomposition
24            Desirable Properties of Decomposition
25            Normalization
26            Normalization
27            BCNF & Its Comparison with 3NF
30            Evaluation of Expressions
31            Evaluation of Expressions
32            Selection Operation
Test Coverage :
Test1:
Introduction: Purpose of Database System; View of Data, Data Models, Database
Languages, Transaction Management, Storage Management, Database Users
Administrator, History of Database Systems.
Database Design and Entity-Relationship Model: Basic Concepts, Design Issues,
Mapping Constraints, Keys, E-R Diagram, Weak Entity Sets, Extended E-R Features,
Design of an E-R Database Schema, Reduction of an E-R Schema to Tables.
Test2:
After this course, students should understand the internal details of relational algebra
and be able to write efficient queries using the algorithms covered. The course helps
students develop software and shows how to avoid deadlock in real-life problems such as
a banking system or an airline management system. It covers the concepts of transaction
processing and the operations relevant to it, the types of failures that may occur
during transaction execution, concurrency control, and distributed and centralized
databases. Students should be able to design and build a relational database for a small
business application from given user requirements.
This course presents problem-solving logic and skills for analyzing and developing a
database. Topics covered include database design, administration and application
development. The course is designed to give the student the understanding necessary to
work efficiently within the database environment and to optimize a database for
commercial operation.
Unit I
Introduction: Purpose of Database System; View of Data, Data Models, Database
Languages, Transaction Management, Storage Management, Database Users
Administrator, History of Database Systems.
Data Independence:
Data independence can be defined as the capacity to change
the schema at one level of a database system without having to
change the schema at the next higher level.
Examples of changes that should not affect the higher levels:
method of representation of alphanumeric data (e.g.,
changing the date format to avoid the Year 2000 problem)
method of representation of numeric data (e.g., integer vs.
long integer or floating-point)
units (e.g., metric vs. furlongs)
We can define two types of data independence:
Logical Data Independence:
Capacity to change the conceptual schema without having to
change external schemas or application programs. We may change
the conceptual schema to expand the database (by adding a record
type or data item), or to reduce the database (by removing a record
type or data item).
Physical Data Independence:
Capacity to change the internal schema without having to
change the conceptual or external schemas. For example, by
creating additional access structures to improve the performance
of retrieval or update.
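Physical data independence can be observed in a small experiment. The sketch below uses Python's built-in sqlite3 module; the sample rows and the index name idx_account_branch are assumptions made for illustration. Adding an access structure changes the internal level only, so the query text and its result stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_number TEXT, branch_name TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [
        ("A-102", "Perryridge", 400),
        ("A-201", "Perryridge", 900),
        ("A-217", "Brighton", 750),
    ],
)

# The application-level query; it never mentions how the data is stored.
QUERY = (
    "SELECT account_number FROM account "
    "WHERE branch_name = ? ORDER BY account_number"
)

before = conn.execute(QUERY, ("Perryridge",)).fetchall()

# Change at the internal level: add an access structure (hypothetical index name).
conn.execute("CREATE INDEX idx_account_branch ON account (branch_name)")

after = conn.execute(QUERY, ("Perryridge",)).fetchall()
print(before == after)
```

The same query runs unchanged before and after the index is created; only the way the system may locate the rows differs.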
View of Data :
A major purpose of a database system is to provide users with an abstract
view of the data. That is, the system hides certain details of how
the data are stored and maintained.
Data Abstraction-
Developers hide the complexity from users through several levels of
abstraction, to simplify users' interactions with the system:
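[Figure: levels of abstraction. End users (User 1, User 2, ...) work with External View 1
and External View 2 at the external level; the external/conceptual mapping connects the
views to the conceptual schema at the conceptual level; the conceptual/internal mapping
connects the conceptual schema to the stored database.]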
Database Users
Naive users interact with the system by invoking
one of the application programs that have been written
previously. For example, a bank teller who needs to
1) Entity-Relationship model
2) Relational model
3) Network model
4) Hierarchical model
Advantages
1980s:
Research relational prototypes evolve into commercial systems
SQL becomes the industry standard
Parallel and distributed database systems
Object-oriented database systems
1990s:
Large decision support and data-mining applications
Large multi-terabyte data warehouses
Emergence of Web commerce
2000s:
XML and XQuery standards
Automated database administration
Unit II
Database Design and Entity-Relationship Model: Basic Concepts, Design Issues,
Mapping Constraints, Keys, E-R Diagram, Weak Entity Sets, Extended E-R Features,
Design of an E-R Database Schema, Reduction of an E-R Schema to Tables.
DATA MODEL:
ER Data model:
The ER data model is a high-level data model. It perceives the real
world as consisting of basic objects, called entities, and relationships among
them.
An entity is an object that exists and is distinguishable from other objects. For
instance, John Harris with S.I.N. 890-12-3456 is an entity, as he can be uniquely
identified as one particular person in the universe.
An entity may be concrete (a person or a book, for example) or abstract (like a
holiday or a concept).
An entity set is a set of entities of the same type (e.g., all persons having an
account at a bank).
Entity sets need not be disjoint. For example, the entity set employee (all
employees of a bank) and the entity set customer (all customers of the bank) may
have members in common.
An entity is represented by a set of attributes.
o E.g. name, S.I.N., street, city for the "customer" entity.
o The domain of the attribute is the set of permitted values (e.g. a
telephone number must consist of seven digits).
Formally, an attribute is a function which maps an entity set into a domain.
o Every entity is described by a set of (attribute, data value) pairs.
o There is one pair for each attribute of the entity set.
o E.g. a particular customer entity is described by the set {(name, Harris),
(S.I.N., 890-12-3456), (street, North), (city, Georgetown)}.
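The description of an entity as a set of (attribute, value) pairs drawn from domains can be sketched in a few lines of Python. The domain tests below (for instance, treating an S.I.N. as nine digits) and the helper name make_entity are assumptions made for illustration, not part of the E-R model itself.

```python
# Hypothetical domains: each attribute maps to a test for its permitted values.
DOMAINS = {
    "name": lambda v: isinstance(v, str) and len(v) > 0,
    "S.I.N.": lambda v: isinstance(v, str)
    and v.replace("-", "").isdigit()
    and len(v.replace("-", "")) == 9,
    "street": lambda v: isinstance(v, str),
    "city": lambda v: isinstance(v, str),
}


def make_entity(values):
    """Return an entity as a frozenset of (attribute, value) pairs."""
    if set(values) != set(DOMAINS):
        raise ValueError("an entity needs exactly one value per attribute")
    for attribute, value in values.items():
        if not DOMAINS[attribute](value):
            raise ValueError(f"{value!r} is outside the domain of {attribute}")
    return frozenset(values.items())


harris = make_entity(
    {"name": "Harris", "S.I.N.": "890-12-3456", "street": "North", "city": "Georgetown"}
)

# An entity set is a set: adding the same entity twice leaves one member.
customer = {harris}
customer.add(
    make_entity(
        {"name": "Harris", "S.I.N.": "890-12-3456", "street": "North", "city": "Georgetown"}
    )
)
print(len(customer))
```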
An analogy can be made with the programming language notion of type definition.
The banking examples use the following entity sets:
branch, the set of all branches of a particular bank. Each branch is described by
the attributes branch-name, branch-city and assets.
customer, the set of all people having an account at the bank. Attributes are
customer-name, S.I.N., street and customer-city.
employee, with attributes employee-name and phone-number.
account, the set of all accounts created and maintained in the bank. Attributes are
account-number and balance.
transaction, the set of all account transactions executed in the bank. Attributes are
transaction-number, date and amount.
Customer
Loan
Loan-no Amount
L-17 1000
L-23 2000
L-15 1500
Attributes
Consider the entity set employee with attributes employee-name and phone-number.
We could argue that the phone be treated as an entity itself, with attributes phone-
number and location.
Then we have two entity sets, and the relationship set EmpPhn defining the
association between employees and their phones.
This new definition allows employees to have several (or zero) phones.
New definition may more accurately reflect the real world.
We cannot extend this argument easily to making employee-name an entity.
Mapping Constraints
An E-R scheme may define certain constraints to which the contents of a database must
conform.
For example,
Keys
An entity set that does not possess sufficient attributes to form a primary key is called a
weak entity set. One that does have a primary key is called a strong entity set.
For example,
The entity set transaction has attributes transaction-number, date and amount.
Different transactions on different accounts could share the same number.
These are not sufficient to form a primary key (uniquely identify a transaction).
Thus transaction is a weak entity set.
Relationship Sets
Example:
Hayes (customer entity)   depositor (relationship set)   A-102 (account entity)
A relationship set is a mathematical relation among n ≥ 2 entities, each taken from entity
sets:
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
ER DIAGRAMS
We can express the overall logical structure of a database graphically with an E-R
diagram.
a) Simple Attributes: An attribute that cannot be divided into smaller parts
and represents a basic meaning is called a simple attribute. E.g. the “First
name” and “Last Name” attributes of an EMPLOYEE entity are simple
attributes.
Composite Attributes: Attributes that can be divided into smaller
units, each of which has a specific meaning. For example, an
attribute name of an entity set EMPLOYEE can be sub-divided into First-name,
Middle-initial, and Last-name.
b) Single-Valued Attributes: An attribute having a single value for a particular entity
is called a single-valued attribute. E.g. age is a single-valued attribute of an
EMPLOYEE entity.
Multi-Valued Attributes: An attribute that has more than one value for a
particular entity is called a multi-valued attribute. Different entities may have
different numbers of values for this kind of attribute. For multi-valued
attributes we must also specify the minimum and maximum number of values
that can be attached. E.g. phone-number for an EMPLOYEE entity is a multi-valued
attribute.
c) Derived Attributes: Attributes that are not stored directly but can be derived
from stored attributes are called derived attributes. E.g. total-salary of an entity
EMPLOYEE can be calculated from the basic-salary attribute.
d) Null Value: An attribute takes a null value when an entity does not have a value
for it. The null value may indicate “not applicable”, that is, that the value does not
exist for the entity. For example, one may have no middle name. Null can also
designate that an attribute value is unknown. An unknown value may be either
missing (the value does exist, but we do not have that information) or not known
(we do not know whether or not the value actually exists).
One to one
One to many
Many to one
Many to many
The function that an entity plays in a relationship is called its role. Roles are normally
implicit and are not specified.
They are useful when the meaning of a relationship set needs clarification.
For example, the entity sets of a relationship may not be distinct. The relationship
works-for might be ordered pairs of employees (first is manager, second is worker).
In the E-R diagram, this can be shown by labelling the lines connecting entities
(rectangles) to relationships (diamonds).
Total participation (indicated by double line): every entity in the entity set
participates in at least one relationship in the relationship set.
Partial participation: some entities may not participate in any relationship in the
relationship set.
An entity set that does not have a primary key is referred to as a weak entity set.
The existence of a weak entity set depends on the existence of an identifying entity set.
It must relate to the identifying entity set via a total, one-to-many relationship set from
the identifying to the weak entity set.
The discriminator (or partial key) of a weak entity set is the set of attributes that
distinguishes among all the entities of a weak entity set that depend on the same
identifying entity.
The primary key of a weak entity set is formed by the primary key of the strong entity
set on which the weak entity set is existence dependent, plus the weak entity set’s
discriminator.
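The way a weak entity set borrows its key can be sketched with Python's built-in sqlite3 module. The payment and loan tables follow the example used in these notes, but the column names, the payment_amount attribute and the sample rows are assumptions made for illustration. The primary key of payment is the loan's key plus the discriminator payment_number, and the foreign key models existence dependence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount INTEGER)")
conn.execute(
    """
    CREATE TABLE payment (
        loan_number    TEXT    NOT NULL REFERENCES loan (loan_number),
        payment_number INTEGER NOT NULL,   -- discriminator (partial key)
        payment_amount INTEGER,
        PRIMARY KEY (loan_number, payment_number)
    )
    """
)
conn.executemany("INSERT INTO loan VALUES (?, ?)", [("L-17", 1000), ("L-23", 2000)])

# The same payment_number may appear under different loans.
conn.executemany(
    "INSERT INTO payment VALUES (?, ?, ?)", [("L-17", 1, 100), ("L-23", 1, 250)]
)

# Repeating a (loan_number, payment_number) pair violates the primary key.
try:
    conn.execute("INSERT INTO payment VALUES ('L-17', 1, 50)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A payment cannot exist without its identifying loan.
try:
    conn.execute("INSERT INTO payment VALUES ('L-99', 1, 50)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

payment_count = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
print(duplicate_rejected, orphan_rejected, payment_count)
```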
Note: the primary key of the strong entity set is not explicitly stored with the weak
entity set, since it is implicit in the identifying relationship.
If loan-number were explicitly stored, payment could be made a strong entity, but
then the relationship between payment and loan would be duplicated by an implicit
relationship defined by the attribute loan-number common to payment and loan.
Then the relationship with course would be implicit in the course-number attribute
Nonbinary Relationships
This E-R diagram says that a customer may have several accounts, each located in a
specific bank branch, and that an account may belong to several different customers.
We allow at most one arrow out of a ternary (or greater degree) relationship to
indicate a cardinality constraint
E.g. an arrow from works-on to job indicates each employee works on at most one job
at any branch.
If there is more than one arrow, there are two ways of defining the meaning.
2. add (ei , ai ) to RA
3. add (ei , bi ) to RB
4. add (ei , ci ) to RC
The E-R data model provides a wide range of choice in designing a database scheme to
accurately model some real-world situation.
We have seen weak entity sets, generalization and aggregation. Designers must decide
when these features are appropriate.
Strong entity sets and their dependent weak entity sets may be regarded as a
single ``object'' in the database, as weak entities are existence-dependent on a
strong entity.
It is possible to treat an aggregated entity set as a single unit without concern for
its inner structure details.
Generalization contributes to modularity by allowing common attributes of
similar entity sets to be represented in one place in an E-R diagram.
Specialization
Top-down design process; we designate subgroupings within an entity set that are
distinctive from other entities in the set.
These subgroupings become lower-level entity sets that have attributes or participate
in relationships that do not apply to the higher-level entity set.
Attribute inheritance – a lower-level entity set inherits all the attributes and
relationship participation of the higher-level entity set to which it is linked.
Generalization
A bottom-up design process – combine a number of entity sets that share the same
features into a higher-level entity set.
Specialization and generalization are simple inversions of each other; they are
represented in an E-R diagram in the same way.
Aggregation
We use a table with one column for each attribute of the set. Each row in the table
corresponds to one entity of the entity set. For the entity set account, the table has
the columns account-number and balance. We can add, delete
and modify rows (to reflect changes in the real world).
Actually, the table contains a subset of the set of all possible rows. We refer to the set of
all possible rows as the cartesian product of the sets of all attribute values:
D1 × D2
for the account table, where D1 and D2 denote the set of all account numbers and all
account balances, respectively.
In general, for a table of n columns, we may denote the cartesian product of D1, D2,
…, Dn by
D1 × D2 × … × Dn-1 × Dn
Customer Table
Loan-number amount
L-11 900
L-14 1500
L-15 1500
L-16 1300
L-17 1000
L-23 2000
L-93 500
Loan table
Composite attributes are flattened out by creating a separate attribute for each
component attribute
E.g. given entity set customer with composite attribute name with component
attributes first-name and last-name the table corresponding to the entity set
has two attributes
name.first-name and name.last-name
A multivalued attribute M of an entity E is represented by a separate table EM
Table EM has attributes corresponding to the primary key of E and an
attribute corresponding to multivalued attribute M
E.g. Multivalued attribute dependent-names of employee is represented by a
table
employee-dependent-names( employee-id, dname)
Each value of the multivalued attribute maps to a separate row of the table EM
E.g., an employee entity with primary key John and
dependents Johnson and Johndotir maps to two rows:
(John, Johnson) and (John, Johndotir)
For a weak entity set, we add columns to the table corresponding to the primary key
of the strong entity set on which the weak set is dependent
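The mapping of a multivalued attribute to its own table EM can be sketched as follows. The function name flatten_multivalued and the second employee (Mary, with no dependents) are assumptions added for illustration; the John example is the one given above.

```python
def flatten_multivalued(entities, key, attribute):
    """Build the rows of table EM: one (key value, attribute value) pair per value."""
    return sorted({(e[key], value) for e in entities for value in e[attribute]})


employees = [
    {"employee_id": "John", "dependent_names": ["Johnson", "Johndotir"]},
    {"employee_id": "Mary", "dependent_names": []},  # hypothetical: no dependents
]

employee_dependent_names = flatten_multivalued(
    employees, "employee_id", "dependent_names"
)
print(employee_dependent_names)
```

An entity with no values for the attribute contributes no rows, which is one reason the multivalued attribute is kept out of the main employee table.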
A weak entity set becomes a table that includes a column for the primary key of the
identifying strong entity set
The table corresponding to the relationship set R has the following attributes:
Redundancy of Tables
Many-to-one and one-to-many relationship sets that are total on the many-side can be
represented by adding an extra attribute to the many side, containing the primary key
of the one side
E.g.: Instead of creating a table for relationship account-branch, add an attribute
branch to the entity set account
For one-to-one relationship sets, either side can be chosen to act as the “many” side
That is, extra attribute can be added to either of the tables corresponding to the
two entity sets
If participation is partial on the many side, replacing a table by an extra attribute in
the relation corresponding to the “many” side could result in null values
The table corresponding to a relationship set linking a weak entity set to its
identifying strong entity set is redundant.
E.g. The payment table already contains the information that would appear in
the loan-payment table (i.e., the columns loan-number and payment-number).
Method 1:
Form a table for the higher level entity
Form a table for each lower level entity set, include primary key of higher
level entity set and local attributes
Method 2:
Form a table for each entity set with all local and inherited attributes
table      table attributes
person     name, street, city
customer   name, street, city, credit-rating
employee   name, street, city, salary
UNIT III
Relational Algebra
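Select Operation – Example
Relation r:
A  B  C   D
α  α  1   7
α  β  5   7
β  β  12  3
β  β  23  10
σ A=B ∧ D>5 (r):
A  B  C   D
α  α  1   7
β  β  23  10
Select Operation
Notation: σ p (r)
p is called the selection predicate
Defined as:
σ p (r) = {t | t ∈ r and p(t)}
Example of selection:
σ branch_name=“Perryridge” (account)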
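A minimal Python sketch of selection, assuming a relation is held as a list of dictionaries (one per tuple); this representation and the sample account rows are choices made here for illustration.

```python
account = [
    {"account_number": "A-102", "branch_name": "Perryridge", "balance": 400},
    {"account_number": "A-201", "branch_name": "Perryridge", "balance": 900},
    {"account_number": "A-217", "branch_name": "Brighton", "balance": 750},
    {"account_number": "A-222", "branch_name": "Redwood", "balance": 700},
]


def select(predicate, relation):
    """Keep exactly the tuples of the relation for which the predicate holds."""
    return [t for t in relation if predicate(t)]


# Selection with branch_name = "Perryridge" applied to account
perryridge_accounts = select(lambda t: t["branch_name"] == "Perryridge", account)
print([t["account_number"] for t in perryridge_accounts])
```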
Project Operation – Example
Relation r:
A  B   C
α  10  1
α  20  1
β  30  1
β  40  2
∏ A,C (r):
A  C        A  C
α  1        α  1
α  1   =    β  1
β  1        β  2
β  2
Project Operation
Notation: ∏ A1, A2, …, Ak (r)
where A1, A2 are attribute names and r is a relation name.
The result is defined as the relation of k columns obtained by erasing the columns that
are not listed
Duplicate rows removed from result, since relations are sets
Example: To eliminate the branch_name attribute of account
∏ account_number, balance (account)
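Projection with duplicate removal can be sketched in Python as below, again assuming a list-of-dictionaries representation and sample account rows chosen for illustration.

```python
account = [
    {"account_number": "A-102", "branch_name": "Perryridge", "balance": 400},
    {"account_number": "A-201", "branch_name": "Perryridge", "balance": 900},
    {"account_number": "A-217", "branch_name": "Brighton", "balance": 750},
    {"account_number": "A-215", "branch_name": "Brighton", "balance": 750},
]


def project(attributes, relation):
    """Keep only the listed columns and drop rows that become duplicates."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # relations are sets, so duplicates are removed
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


without_branch = project(["account_number", "balance"], account)
branches = project(["branch_name"], account)
print(branches)
```

Projecting on branch_name alone collapses four tuples into two, which is the duplicate elimination described above.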
Union Operation – Example
Relations r, s:
r:          s:
A  B        A  B
α  1        α  2
α  2        β  3
β  1
r ∪ s:
A  B
α  1
α  2
β  1
β  3
Notation: r ∪ s
Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
For r ∪ s to be valid:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: 2nd column
of r deals with the same type of values as does the 2nd column of s)
Example: to find all customers with either an account or a loan
∏ customer_name (depositor) ∪ ∏ customer_name (borrower)
Set Difference Operation – Example
Relations r, s:
r:          s:
A  B        A  B
α  1        α  2
α  2        β  3
β  1
r – s:
A  B
α  1
β  1
Notation: r – s
Defined as:
r – s = {t | t ∈ r and t ∉ s}
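Union and set difference map directly onto Python sets of tuples. The sketch below adds an arity check as a stand-in for the compatibility conditions; the customer names are hypothetical sample data.

```python
def check_same_arity(r, s):
    """Raise ValueError unless every tuple in r and s has the same length."""
    if len({len(t) for t in r | s}) > 1:
        raise ValueError("relations must have the same arity")


def union(r, s):
    check_same_arity(r, s)
    return r | s


def difference(r, s):
    check_same_arity(r, s)
    return r - s


# Hypothetical results of projecting customer_name from depositor and borrower
depositor_names = {("Hayes",), ("Johnson",), ("Jones",)}
borrower_names = {("Hayes",), ("Jones",), ("Smith",)}

either = union(depositor_names, borrower_names)
account_but_no_loan = difference(depositor_names, borrower_names)
print(sorted(either), sorted(account_but_no_loan))
```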
Cartesian-Product Operation – Example
Relations r, s:
r:          s:
A  B        C  D   E
α  1        α  10  a
β  2        β  10  a
            β  20  b
            γ  10  b
r × s:
A  B  C  D   E
α  1  α  10  a
α  1  β  10  a
α  1  β  20  b
α  1  γ  10  b
β  2  α  10  a
β  2  β  10  a
β  2  β  20  b
β  2  γ  10  b
Cartesian-Product Operation
Notation: r × s
Defined as:
r × s = {t q | t ∈ r and q ∈ s}
Assume that attributes of r(R) and s(S) are disjoint. (That is, R ∩ S = ∅.)
If attributes of r(R) and s(S) are not disjoint, then renaming must be used.
Composition of Operations
Can build expressions using multiple operations
Example: σ A=C (r × s)
r × s:
A  B  C  D   E
α  1  α  10  a
α  1  β  10  a
α  1  β  20  b
α  1  γ  10  b
β  2  α  10  a
β  2  β  10  a
β  2  β  20  b
β  2  γ  10  b
σ A=C (r × s):
A  B  C  D   E
α  1  α  10  a
β  2  β  10  a
β  2  β  20  b
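The composition of a selection with a Cartesian product can be checked with a short Python sketch. Tuples are plain Python tuples in column order (A, B) for r and (C, D, E) for s; the attribute values are sample values chosen for illustration.

```python
from itertools import product

r = [("α", 1), ("β", 2)]                                          # schema (A, B)
s = [("α", 10, "a"), ("β", 10, "a"), ("β", 20, "b"), ("γ", 10, "b")]  # schema (C, D, E)


def cartesian_product(r, s):
    """Concatenate every tuple of r with every tuple of s."""
    return [t + q for t, q in product(r, s)]


r_x_s = cartesian_product(r, s)            # schema (A, B, C, D, E)

# Selection A = C applied to the product: compare column 0 (A) with column 2 (C)
a_equals_c = [t for t in r_x_s if t[0] == t[2]]
print(len(r_x_s), a_equals_c)
```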
Rename Operation
Allows us to name, and therefore to refer to, the results of relational-algebra
expressions.
Allows us to refer to a relation by more than one name.
Example:
ρ X (E)
returns the expression E under the name X
If a relational-algebra expression E has arity n, then
ρ X(A1, A2, …, An) (E)
returns the result of expression E under the name X, and with the
attributes renamed to A1, A2, …, An.
Banking Example
branch (branch_name, branch_city, assets)
customer (customer_name, customer_street, customer_city)
Find all loans of over $1200
σ amount > 1200 (loan)
Find the loan number for each loan of an amount greater than $1200
∏ loan_number (σ amount > 1200 (loan))
Find the names of all customers who have a loan, an account, or both, from the bank
∏ customer_name (borrower) ∪ ∏ customer_name (depositor)
Find the names of all customers who have a loan at the Perryridge branch.
∏ customer_name (σ branch_name = “Perryridge”
(σ borrower.loan_number = loan.loan_number (borrower × loan)))
Find the names of all customers who have a loan at the Perryridge branch but do not
have an account at any branch of the bank.
∏ customer_name (σ branch_name = “Perryridge”
(σ borrower.loan_number = loan.loan_number (borrower × loan)))
– ∏ customer_name (depositor)
Find the names of all customers who have a loan at the Perryridge branch.
Query 1
∏ customer_name (σ branch_name = “Perryridge” (
σ borrower.loan_number = loan.loan_number (borrower × loan)))
Query 2
∏ customer_name (σ loan.loan_number = borrower.loan_number (
(σ branch_name = “Perryridge” (loan)) × borrower))
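The following expression yields the largest account balance; the second copy of
account is renamed d:
∏ balance (account) – ∏ account.balance
(σ account.balance < d.balance (account × ρ d (account)))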
Set-Intersection Operation
Notation: r ∩ s
Defined as:
r ∩ s = { t | t ∈ r and t ∈ s }
Assume:
r, s have the same arity
attributes of r and s are compatible
Note: r ∩ s = r – (r – s)
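Set-Intersection Operation – Example
Relations r, s:
r:          s:
A  B        A  B
α  1        α  2
α  2        β  3
β  1
r ∩ s:
A  B
α  2
Natural-Join Operation
Notation: r ⋈ s
Let r and s be relations on schemas R and S respectively.
Example: with R = (A, B, C, D) and S = (E, B, D), the result schema is (A, B, C, D, E)
and r ⋈ s is defined as:
∏ r.A, r.B, r.C, r.D, s.E (σ r.B = s.B ∧ r.D = s.D (r × s))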
Natural Join Operation – Example
Relations r, s:
r:                s:
A  B  C  D        B  D  E
α  1  α  a        1  a  α
β  2  γ  a        3  a  β
γ  4  β  b        1  a  γ
α  1  γ  a        2  b  δ
δ  2  β  b        3  b  ε
r ⋈ s:
A  B  C  D  E
α  1  α  a  α
α  1  α  a  γ
α  1  γ  a  α
α  1  γ  a  γ
δ  2  β  b  δ
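A natural join can be sketched in Python by pairing tuples that agree on all common attributes. The list-of-dictionaries representation is an assumption made here, and the loan and borrower rows are sample data in the style of the banking example.

```python
def natural_join(r, s):
    """Combine tuples of r and s that agree on every attribute they share."""
    result = []
    for t in r:
        for q in s:
            common = t.keys() & q.keys()
            if all(t[a] == q[a] for a in common):
                joined = {**t, **q}
                if joined not in result:  # keep set semantics
                    result.append(joined)
    return result


loan = [
    {"loan_number": "L-170", "branch_name": "Downtown", "amount": 3000},
    {"loan_number": "L-230", "branch_name": "Redwood", "amount": 4000},
    {"loan_number": "L-260", "branch_name": "Perryridge", "amount": 1700},
]
borrower = [
    {"customer_name": "Jones", "loan_number": "L-170"},
    {"customer_name": "Smith", "loan_number": "L-230"},
    {"customer_name": "Hayes", "loan_number": "L-155"},
]

loan_borrower = natural_join(loan, borrower)
print(loan_borrower)
```

Loan L-260 and borrower Hayes have no partner on loan_number, so neither appears in the result.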
Division Operation – Example
Relations r, s:
r:          s:
A  B        B
α  1        1
α  2        2
α  3
β  1
γ  1
δ  1
δ  3
δ  4
ε  6
ε  1
β  2
r ÷ s:
A
α
β
Another Division Example
Relations r, s:
r:                   s:
A  B  C  D  E        D  E
α  a  α  a  1        a  1
α  a  γ  a  1        b  1
α  a  γ  b  1
β  a  γ  a  1
β  a  γ  b  3
γ  a  γ  a  1
γ  a  γ  b  1
γ  a  β  b  1
r ÷ s:
A  B  C
α  a  γ
γ  a  γ
Property
Let q = r ÷ s
Then q is the largest relation satisfying q × s ⊆ r
Definition in terms of the basic algebra operation
Let r(R) and s(S) be relations, and let S ⊆ R
r ÷ s = ∏ R-S (r) – ∏ R-S ((∏ R-S (r) × s) – ∏ R-S,S (r))
To see why
∏ R-S,S (r) simply reorders attributes of r
∏ R-S ((∏ R-S (r) × s) – ∏ R-S,S (r)) gives those tuples t in
∏ R-S (r) such that for some tuple u ∈ s, tu ∉ r.
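The definition of division in terms of the basic operations can be followed step by step in Python. This sketch is restricted to a binary relation r(A, B) and a unary relation s(B), held as a set of pairs and a set of values; that restriction and the sample data are assumptions made for illustration.

```python
from itertools import product


def divide(r, s):
    """Return the A values that are paired in r with every B value in s."""
    candidates = {a for a, _ in r}               # projection of r on R-S
    all_pairs = set(product(candidates, s))      # candidates × s
    missing = all_pairs - r                      # pairs that r does not contain
    disqualified = {a for a, _ in missing}       # projection of those on R-S
    return candidates - disqualified


r = {
    ("α", 1), ("α", 2), ("α", 3), ("β", 1), ("γ", 1), ("δ", 1),
    ("δ", 3), ("δ", 4), ("ε", 6), ("ε", 1), ("β", 2),
}
s = {1, 2}

q = divide(r, s)
print(sorted(q))
```

The result also satisfies the stated property: every pair in q × s is present in r.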
Bank Example Queries
Find the names of all customers who have a loan and an account at bank.
∏ customer_name (borrower) ∩ ∏ customer_name (depositor)
Find the name of all customers who have a loan at the bank and the loan amount
∏ customer_name, loan_number, amount (borrower ⋈ loan)
Find all customers who have an account from at least the “Downtown” and the
“Uptown” branches.
Query 1
∏ customer_name (σ branch_name = “Downtown” (depositor ⋈ account)) ∩
∏ customer_name (σ branch_name = “Uptown” (depositor ⋈ account))
Query 2
∏ customer_name, branch_name (depositor ⋈ account)
÷ ρ temp(branch_name) ({(“Downtown”), (“Uptown”)})
Note that Query 2 uses a constant relation.
Find all customers who have an account at all branches located in Brooklyn city.
∏ customer_name, branch_name (depositor ⋈ account)
÷ ∏ branch_name (σ branch_city = “Brooklyn” (branch))
Extended Relational-Algebra-Operations
Generalized Projection
Aggregate Functions
Outer Join
Generalized Projection
∏ F1, F2, …, Fn (E)
E is any relational-algebra expression
Each of F1, F2, …, Fn is an arithmetic expression involving constants and
attributes in the schema of E.
Given relation credit_info(customer_name, limit, credit_balance), find how much
more each person can spend:
∏ customer_name, limit – credit_balance (credit_info)
Aggregate Operation – Example
Relation r:
A  B  C
α  α  7
α  β  7
β  β  3
β  β  10
g sum(c) (r):
sum(c)
27
Relation account grouped by branch-name:
branch_name   account_number   balance
Perryridge    A-102            400
Perryridge    A-201            900
Brighton      A-217            750
Brighton      A-215            750
Redwood       A-222            700
branch_name g sum(balance) (account):
branch_name   sum(balance)
Perryridge    1300
Brighton      1500
Redwood       700
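Grouping followed by an aggregate function can be sketched in Python as follows. The function name group_sum and the list-of-dictionaries representation are assumptions made for illustration; the rows are those of the account example.

```python
from collections import defaultdict

account = [
    {"branch_name": "Perryridge", "account_number": "A-102", "balance": 400},
    {"branch_name": "Perryridge", "account_number": "A-201", "balance": 900},
    {"branch_name": "Brighton", "account_number": "A-217", "balance": 750},
    {"branch_name": "Brighton", "account_number": "A-215", "balance": 750},
    {"branch_name": "Redwood", "account_number": "A-222", "balance": 700},
]


def group_sum(relation, group_attribute, sum_attribute):
    """Partition the tuples by group_attribute and sum sum_attribute per group."""
    totals = defaultdict(int)
    for t in relation:
        totals[t[group_attribute]] += t[sum_attribute]
    return dict(totals)


balance_per_branch = group_sum(account, "branch_name", "balance")
total_balance = sum(t["balance"] for t in account)  # aggregate with no grouping
print(balance_per_branch, total_balance)
```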
Outer Join
Relation loan
loan_number   branch_name   amount
L-170         Downtown      3000
L-230         Redwood       4000
L-260         Perryridge    1700
Relation borrower
customer_name   loan_number
Jones           L-170
Smith           L-230
Hayes           L-155
Join
loan ⋈ borrower
loan_number   branch_name   amount   customer_name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith
Left Outer Join
loan ⟕ borrower
loan_number   branch_name   amount   customer_name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith
L-260         Perryridge    1700     null
Right Outer Join
loan ⟖ borrower
loan_number   branch_name   amount   customer_name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith
L-155         null          null     Hayes
Full Outer Join
loan ⟗ borrower
loan_number   branch_name   amount   customer_name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith
L-260         Perryridge    1700     null
L-155         null          null     Hayes
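The outer joins can be sketched in Python by padding unmatched tuples with None, which stands in for null here. The helper names and the explicit attribute lists are assumptions made for illustration.

```python
LOAN_ATTRS = ("loan_number", "branch_name", "amount")
BORROWER_ATTRS = ("customer_name", "loan_number")

loan = [
    {"loan_number": "L-170", "branch_name": "Downtown", "amount": 3000},
    {"loan_number": "L-230", "branch_name": "Redwood", "amount": 4000},
    {"loan_number": "L-260", "branch_name": "Perryridge", "amount": 1700},
]
borrower = [
    {"customer_name": "Jones", "loan_number": "L-170"},
    {"customer_name": "Smith", "loan_number": "L-230"},
    {"customer_name": "Hayes", "loan_number": "L-155"},
]


def matches(t, q):
    """True when t and q agree on all attributes they have in common."""
    return all(t[a] == q[a] for a in t.keys() & q.keys())


def left_outer_join(r, s, s_attrs):
    """Natural join of r and s, keeping unmatched tuples of r padded with None."""
    result = []
    for t in r:
        partners = [q for q in s if matches(t, q)]
        if partners:
            result.extend({**t, **q} for q in partners)
        else:
            result.append({**t, **{a: None for a in s_attrs if a not in t}})
    return result


def full_outer_join(r, s, r_attrs, s_attrs):
    """Left outer join plus the unmatched tuples of s padded with None."""
    unmatched_s = [q for q in s if not any(matches(t, q) for t in r)]
    padded = [{**{a: None for a in r_attrs if a not in q}, **q} for q in unmatched_s]
    return left_outer_join(r, s, s_attrs) + padded


left = left_outer_join(loan, borrower, BORROWER_ATTRS)
right = left_outer_join(borrower, loan, LOAN_ATTRS)  # right outer join by swapping roles
full = full_outer_join(loan, borrower, LOAN_ATTRS, BORROWER_ATTRS)
print(len(left), len(right), len(full))
```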
Null Values
null signifies an unknown value or that a value does not exist.
The result of any arithmetic expression involving null is null.
Aggregate functions simply ignore null values (as in SQL)
For duplicate elimination and grouping, null is treated like any other value, and two
nulls are assumed to be the same (as in SQL)
Comparisons with null values return the special truth value: unknown
If false was used instead of unknown, then not (A < 5)
would not be equivalent to A >= 5
Three-valued logic using the truth value unknown:
OR: (unknown or true) = true,
(unknown or false) = unknown
(unknown or unknown) = unknown
AND: (true and unknown) = unknown,
(false and unknown) = false,
(unknown and unknown) = unknown
NOT: (not unknown) = unknown
In SQL “P is unknown” evaluates to true if predicate P evaluates to unknown
Result of select predicate is treated as false if it evaluates to unknown
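The three-valued logic above can be written out in Python, using None to stand for both a null value and the truth value unknown. The function names are chosen here for illustration.

```python
UNKNOWN = None  # stands for the third truth value


def and3(a, b):
    if a is False or b is False:
        return False
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return True


def or3(a, b):
    if a is True or b is True:
        return True
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return False


def not3(a):
    return UNKNOWN if a is UNKNOWN else (not a)


def less_than(x, y):
    """A comparison involving null yields unknown."""
    return UNKNOWN if x is None or y is None else x < y


def select(predicate, values):
    """Selection treats unknown as false: keep only values where the predicate is True."""
    return [v for v in values if predicate(v) is True]


a_values = [1, None, 7]
below_five = select(lambda a: less_than(a, 5), a_values)
not_below_five = select(lambda a: not3(less_than(a, 5)), a_values)
print(below_five, not_below_five)
```

The null value is returned by neither query, which is why not (A < 5) and A >= 5 stay equivalent under the unknown truth value.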
The content of the database may be modified using the following operations:
Deletion
Insertion
Updating
All these operations are expressed using the assignment operator.
Deletion
A delete request is expressed similarly to a query, except instead of displaying
tuples to the user, the selected tuples are removed from the database.
Can delete only whole tuples; cannot delete values on only particular attributes
A deletion is expressed in relational algebra by:
r ← r – E
where r is a relation and E is a relational algebra query.
Deletion Examples
Delete all account records in the Perryridge branch.
account ← account – σ branch_name = “Perryridge” (account)
Delete all loan records with amount in the range of 0 to 50
loan ← loan – σ amount ≥ 0 and amount ≤ 50 (loan)
Delete all accounts at branches located in Needham.
r1 ← σ branch_city = “Needham” (account ⋈ branch)
r2 ← ∏ account_number, branch_name, balance (r1)
r3 ← ∏ customer_name, account_number (r2 ⋈ depositor)
account ← account – r2
depositor ← depositor – r3
Insertion
To insert data into a relation, we either:
specify a tuple to be inserted
write a query whose result is a set of tuples to be inserted
In relational algebra, an insertion is expressed by:
r ← r ∪ E
where r is a relation and E is a relational algebra expression.
The insertion of a single tuple is expressed by letting E be a constant relation
containing one tuple.
Example
Insert information in the database specifying that Smith has $1200 in account
A-973 at the Perryridge branch
account ← account ∪ {(“A-973”, “Perryridge”, 1200)}
depositor ← depositor ∪ {(“Smith”, “A-973”)}
Example
Provide as a gift for all loan customers in the Perryridge
branch, a $200 savings account. Let the loan number serve
as the account number for the new savings account
Updating
A mechanism to change a value in a tuple without changing all values in the tuple
r ← ∏ F1, F2, …, Fn (r)
Each Fi is either
the i-th attribute of r, if the i-th attribute is not updated, or,
if the attribute is to be updated, Fi is an expression, involving only constants and
the attributes of r, which gives the new value for the attribute
Example
account ← ∏ account_number, branch_name, balance * 1.05 (account)
Pay all accounts with balances over $10,000 6 percent interest
and pay all others 5 percent
account ← ∏ account_number, branch_name, balance * 1.06 (σ balance > 10000 (account))
∪ ∏ account_number, branch_name, balance * 1.05 (σ balance ≤ 10000 (account))
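Deletion, insertion and updating as assignments can be imitated with Python sets of tuples, where each statement rebinds the relation name to a new set. The sample rows (including the Round Hill account) are hypothetical, and round is used only to keep the floating-point balances tidy.

```python
# Tuples are (account_number, branch_name, balance) and (customer_name, account_number).
account = {
    ("A-102", "Perryridge", 400),
    ("A-217", "Brighton", 750),
    ("A-305", "Round Hill", 12000),
}
depositor = {("Hayes", "A-102"), ("Johnson", "A-217"), ("Turner", "A-305")}

# Deletion: account <- account - (selection of the Perryridge accounts)
account = account - {t for t in account if t[1] == "Perryridge"}

# Insertion: r <- r union E, with E a constant relation holding one tuple
account = account | {("A-973", "Perryridge", 1200)}
depositor = depositor | {("Smith", "A-973")}

# Updating: generalized projection, 6 percent over $10,000 and 5 percent otherwise
account = {
    (number, branch, round(balance * 1.06, 2))
    for number, branch, balance in account
    if balance > 10000
} | {
    (number, branch, round(balance * 1.05, 2))
    for number, branch, balance in account
    if balance <= 10000
}

print(sorted(account))
```

Both branches of the update read the old value of account before the name is rebound, which mirrors the way the right-hand side of an assignment is evaluated before the relation is replaced.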
Views
In some cases, it is not desirable for all users to see the entire logical model (that is,
all the actual relations stored in the database).
Consider a person who needs to know a customer’s name, loan number and branch
name, but has no need to see the loan amount. This person should see a relation
described by:
∏ customer-name, loan-number, branch-name (borrower ⋈ loan)
A view provides a mechanism to hide certain data from the view of certain users.
Any relation that is not part of the conceptual model but is made visible to a user as a
“virtual relation” is called a view.
View Definition
A view is defined using the create view statement, which has the form:
create view v as <query expression>
where <query expression> is any legal relational algebra query expression. The view
name is represented by v.
Example
Given a view all-customer, the customers of the Perryridge branch are found by:
∏ customer-name (σ branch-name = “Perryridge” (all-customer))
• Consider the person who needs to see all loan data in the loan relation except
amount. The view given to the person, branch-loan, is defined as:
create view branch-loan as ∏ loan-number, branch-name (loan)
• Since we allow a view name to appear wherever a relation name is allowed, the person
may write:
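A view that hides the loan amount can be tried out with Python's built-in sqlite3 module. The view name customer_loan and the sample rows are assumptions made for illustration; SQL is used here in place of the relational-algebra notation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE loan (loan_number TEXT PRIMARY KEY, branch_name TEXT, amount INTEGER);
    CREATE TABLE borrower (customer_name TEXT, loan_number TEXT);
    INSERT INTO loan VALUES ('L-170', 'Downtown', 3000), ('L-230', 'Redwood', 4000);
    INSERT INTO borrower VALUES ('Jones', 'L-170'), ('Smith', 'L-230');

    -- Hypothetical view: customer name, loan number and branch name, but no amount.
    CREATE VIEW customer_loan AS
        SELECT borrower.customer_name AS customer_name,
               loan.loan_number       AS loan_number,
               loan.branch_name       AS branch_name
        FROM borrower JOIN loan ON borrower.loan_number = loan.loan_number;
    """
)

# The view name is used wherever a table name could appear.
cursor = conn.execute("SELECT * FROM customer_loan ORDER BY customer_name")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```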
Banking Example
branch (branch_name, branch_city, assets )
customer (customer_name, customer_street, customer_city )
account (account_number, branch_name, balance )
loan (loan_number, branch_name, amount )
depositor (customer_name, account_number )
borrower (customer_name, loan_number )
Find the loan_number, branch_name, and amount for loans of over $1200
Answer
{t | t ∈ loan ∧ t[amount] > 1200}
Find the loan number for each loan of an amount greater than $1200
Find the names of all customers having a loan, an account, or both at the bank
Find the names of all customers who have a loan and an account at the bank
{t | ∃ s ∈ borrower (t[customer_name] = s[customer_name])
∧ ∃ u ∈ depositor (t[customer_name] = u[customer_name])}
Find the names of all customers having a loan at the Perryridge branch
Find the loan_number, branch_name, and amount for loans of over $1200
Find the names of all customers who have a loan of over $1200
{ ⟨c⟩ | ∃ l, b, a (⟨c, l⟩ ∈ borrower ∧ ⟨l, b, a⟩ ∈ loan ∧ a > 1200)}
Find the names of all customers who have a loan from the Perryridge branch
and the loan amount:
o { ⟨c, a⟩ | ∃ l (⟨c, l⟩ ∈ borrower ∧ ∃ b (⟨l, b, a⟩ ∈ loan ∧
b = “Perryridge”))}
o { ⟨c, a⟩ | ∃ l (⟨c, l⟩ ∈ borrower ∧ ⟨l, “Perryridge”, a⟩ ∈ loan)}
Find the names of all customers having a loan, an account, or both at the
Perryridge branch:
{ ⟨c⟩ | ∃ l (⟨c, l⟩ ∈ borrower
∧ ∃ b,a (⟨l, b, a⟩ ∈ loan ∧ b = “Perryridge”))
∨ ∃ a (⟨c, a⟩ ∈ depositor
∧ ∃ b,n (⟨a, b, n⟩ ∈ account ∧ b = “Perryridge”))}
Find the names of all customers who have an account at all branches located in
Brooklyn:
{ ⟨c⟩ | ∃ s,n (⟨c, s, n⟩ ∈ customer) ∧
∀ x,y,z (⟨x, y, z⟩ ∈ branch ∧ y = “Brooklyn” ⇒
∃ a,b (⟨a, x, b⟩ ∈ account ∧ ⟨c, a⟩ ∈ depositor))}
Safety of Expressions
The expression:
{⟨x1, x2, …, xn⟩ | P(x1, x2, …, xn)}
is safe if all of the following hold:
1. All values that appear in tuples of the expression are values from dom(P) (that is,
the values appear either in P or in a tuple of a relation mentioned in P).
2. For every “there exists” subformula of the form ∃ x (P1(x)), the subformula is
true if and only if there is a value of x in dom(P1) such that P1(x) is true.
3. For every “for all” subformula of the form ∀ x (P1(x)), the subformula is true if
and only if P1(x) is true for all values x from dom(P1).
Query-by-Example (QBE)
Method 1: P._x  P._y  P._z
Method 2: Shorthand notation
Find the loan number of all loans with a loan amount of more than $700
Find names of all branches that are not located in Brooklyn
Find the loan numbers of all loans made jointly to Smith and Jones.
Find all customers who live in the same city as Jones
Find the names of all customers who have a loan from the Perryridge branch.
Find the names of all customers who have both an account and a loan at the bank.
Negation in QBE
Find the names of all customers who have an account at the bank, but do not have a loan from the bank.
¬ means “there does not exist”
Find all customers who have at least two accounts.
¬ means “not equal to”
The Condition Box
Allows the expression of constraints on domain variables that are either inconvenient or impossible to express within the skeleton tables.
Complex conditions can be used in condition boxes
Example: Find the loan numbers of all loans made to Smith, to Jones, or to both jointly
QBE supports an interesting syntax for expressing alternative values
Find all account numbers with a balance greater than $1,300 and less than $1,500
Find all account numbers with a balance greater than $1,300 and less than $2,000 but not exact
Find all branches that have assets greater than those of at least one branch located in Brooklyn
Find the customer_name, account_number, and balance for all customers who have an
account at the Perryridge branch.
Ordering the Display of Tuples
AO = ascending order; DO = descending order.
Example: list in ascending alphabetical order all customers who have an account at the bank
When sorting on multiple attributes, the sorting order is specified by including with each sort operator (AO or DO) an integer surrounded by parentheses.
Example: List all account numbers at the Perryridge branch in ascending alphabetic order with their respective account balances in descending order.
Aggregate Operations
The aggregate operators are AVG, MAX, MIN, SUM, and CNT
The above operators must be postfixed with “ALL” (e.g., SUM.ALL. or AVG.ALL._x) to ensure that duplicates are not eliminated.
Example: Find the total balance of all the accounts maintained at the Perryridge branch.
Query Examples
Find the average balance at each branch.
The “G” in “P.G” is analogous to SQL’s group by construct
The “ALL” in the “P.AVG.ALL” entry in the balance column ensures that all balances are considered
To find the average account balance at only those branches where the average account balance is more than $1,200, we add a condition box.
Find all customers who have an account at all branches located in Brooklyn
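Modification of the Database – Deletion
Deletion of tuples from a relation is expressed by use of a D. command. In the case where we delete information in only some of the columns, null values, specified by –, are inserted.
Delete customer Smith
Delete the branch_city value of the branch whose name is “Perryridge”.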
Delete all loans with a loan amount greater than $1300 and less than $1500.
For consistency, we have to delete information from the loan and borrower tables
Delete all accounts at branches located in Brooklyn.
Modification of the Database – Insertion
Insertion is done by placing the I. operator in the query expression.
Insert the fact that account A-9732 at the Perryridge branch has a balance of $700.
Provide as a gift for all loan customers of the Perryridge branch, a new $200 savings account for every loan account they have, with the loan number serving as the account number for the new savings account.
Modification of the Database – Updates
Use the U. operator to change a value in a tuple without changing all values in the tuple. QBE does not allow users to update the primary key fields.
Update the asset value of the Perryridge branch to $10,000,000.
Increase all balances by 5 percent.
UNIT IV
SQL & Other Relational Languages: Structures, Set Operations, Aggregate Functions,
Null Values, Nested Sub – Queries, Derived Relations, Joined Relations, DDL, Other
SQL features.
Example:
create table branch
(branch_name char(15) not null,
branch_city char(30),
assets integer)
The alter table command is used to add attributes to an existing relation:
alter table r add A D
where A is the name of the attribute to be added to relation r and D is the domain of
A.
All tuples in the relation are assigned null as the value for the new attribute.
The alter table command can also be used to drop attributes of a relation:
alter table r drop A
where A is the name of an attribute of relation r
Dropping of attributes is not supported by many databases.
The select clause can contain arithmetic expressions involving the operations +, –, *,
and /, and operating on constants or attributes of tuples.
The query:
select loan_number, branch_name, amount * 100
from loan
would return a relation that is the same as the loan relation, except that the value
of the attribute amount is multiplied by 100.
select loan_number
from loan
where amount between 90000 and 100000
Tuple Variables
Tuple variables are defined in the from clause via the use of the as clause.
Find the customer names, their loan numbers and the loan amounts for all customers
having a loan at some branch.
select customer_name, T.loan_number, S.amount
from borrower as T, loan as S
where T.loan_number = S.loan_number
String Operations
SQL includes a string-matching operator for comparisons on character strings. The
operator “like” uses patterns that are described using two special characters:
percent (%). The % character matches any substring.
underscore (_). The _ character matches any character.
Find the names of all customers whose street includes the substring “Main”.
select customer_name
from customer
where customer_street like '%Main%'
Match the name “Main%”
like 'Main\%' escape '\'
SQL supports a variety of string operations such as
concatenation (using “||”)
converting from upper to lower case (and vice versa)
finding string length, extracting substrings, etc.
Set Operations
Find all customers who have a loan, an account, or both:
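The set operations can be tried out with a small in-memory database. The sketch below uses Python's built-in sqlite3 module; the sample rows are invented for illustration, and SQLite is only one possible SQL dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table depositor (customer_name text, account_number text);
    create table borrower  (customer_name text, loan_number text);
    -- Hypothetical sample rows
    insert into depositor values ('Hayes', 'A-102'), ('Jones', 'A-217');
    insert into borrower  values ('Jones', 'L-17'), ('Smith', 'L-23');
""")

# union: a loan, an account, or both (duplicates are eliminated, so Jones appears once)
either = [row[0] for row in conn.execute(
    "select customer_name from depositor "
    "union select customer_name from borrower order by customer_name")]

# intersect: both a loan and an account
both = [row[0] for row in conn.execute(
    "select customer_name from depositor "
    "intersect select customer_name from borrower")]

# except: an account but no loan
only_account = [row[0] for row in conn.execute(
    "select customer_name from depositor "
    "except select customer_name from borrower")]
```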
Aggregate Functions
These functions operate on the multiset of values of a column of a relation, and return
a value
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
Null Values
It is possible for tuples to have a null value, denoted by null, for some of their
attributes
null signifies an unknown value or that a value does not exist.
The predicate is null can be used to check for null values.
Example: Find all loan numbers that appear in the loan relation with null
values for amount.
select loan_number
from loan
where amount is null
The result of any arithmetic expression involving null is null
Example: 5 + null returns null
Any comparison with null returns unknown
Example: 5 < null or null <> null or null = null
Three-valued logic using the truth value unknown:
OR: (unknown or true) = true,
(unknown or false) = unknown
(unknown or unknown) = unknown
AND: (true and unknown) = unknown,
(false and unknown) = false,
(unknown and unknown) = unknown
NOT: (not unknown) = unknown
“P is unknown” evaluates to true if predicate P evaluates to unknown
Result of where clause predicate is treated as false if it evaluates to unknown
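The truth tables above can be sketched in a few lines of Python, using None to stand for unknown. The function names are made up for this illustration; real SQL engines implement this logic internally.

```python
def and3(a, b):
    """Three-valued AND; None represents unknown."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a, b):
    """Three-valued OR; None represents unknown."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a):
    """Three-valued NOT: (not unknown) = unknown."""
    return None if a is None else (not a)


def less_than(x, y):
    """Any comparison involving null (None) yields unknown."""
    return None if x is None or y is None else x < y


def where_keeps(predicate_value):
    """A where clause keeps a tuple only if the predicate is true;
    unknown is treated as false."""
    return predicate_value is True
```

For example, `where_keeps(less_than(5, None))` is False, which is why a row with a null amount never satisfies a condition such as amount > 1200.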
Nested Subqueries
A subquery is a select-from-where expression that is nested within another query.
A common use of subqueries is to perform tests for set membership, set comparisons,
and set cardinality
Find all customers who have both an account and a loan at the bank.
select distinct customer_name
from borrower
where customer_name in (select customer_name
from depositor )
Find all customers who have a loan at the bank but do not have
an account at the bank
select distinct customer_name
from borrower
where customer_name not in (select customer_name
from depositor )
Find all customers who have both an account and a loan at the Perryridge branch
select distinct customer_name
from borrower, loan
where borrower.loan_number = loan.loan_number and
branch_name = 'Perryridge' and
(branch_name, customer_name ) in
(select branch_name, customer_name
from depositor, account
where depositor.account_number =
account.account_number )
Set Comparison
Find all branches that have greater assets than some branch located in Brooklyn.
select distinct T.branch_name
from branch as T, branch as S
where T.assets > S.assets and
S.branch_city = 'Brooklyn'
Find the names of all branches that have greater assets than all branches located in
Brooklyn.
select branch_name
from branch
where assets > all (select assets
from branch
where branch_city = 'Brooklyn')
Derived Relations
Find the average account balance of those branches where the average account
balance is greater than $1200.
select branch_name, avg_balance
from (select branch_name, avg (balance)
from account
group by branch_name )
as branch_avg ( branch_name, avg_balance )
where avg_balance > 1200
Note that we do not need to use the having clause, since we compute the
temporary (view) relation branch_avg in the from clause, and the attributes of
branch_avg can be used directly in the where clause.
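A minimal sketch of the derived-relation query, run against SQLite through Python's sqlite3 module. The account rows are invented. One assumption to note: SQLite does not accept the column list after the alias (`as branch_avg (branch_name, avg_balance)`), so the sketch names the column inside the subquery instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table account (account_number text, branch_name text, balance integer);
    -- Hypothetical sample rows
    insert into account values
        ('A-101', 'Downtown', 500), ('A-102', 'Perryridge', 1400),
        ('A-201', 'Perryridge', 1600), ('A-215', 'Mianus', 700);
""")

# The subquery in the from clause acts as a temporary relation branch_avg,
# so its avg_balance attribute can be tested in the outer where clause.
rows = conn.execute("""
    select branch_name, avg_balance
    from (select branch_name, avg(balance) as avg_balance
          from account
          group by branch_name) as branch_avg
    where avg_balance > 1200
""").fetchall()
```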
Views
In some cases, it is not desirable for all users to see the entire logical model (that is,
all the actual relations stored in the database.)
Consider a person who needs to know a customer’s name, loan number and branch
name, but has no need to see the loan amount. This person should see a relation
described, in SQL, by
A view provides a mechanism to hide certain data from the view of certain users.
Any relation that is not of the conceptual model but is made visible to a user as a
“virtual relation” is called a view.
A view is defined using the create view statement which has the form
create view v as <query expression>
where <query expression> is any legal SQL expression. The view name is
represented by v.
Once a view is defined, the view name can be used to refer to the virtual relation that
the view generates.
When a view is created, the query expression is stored in the database; the expression
is substituted into queries using the view.
Delete the record of all accounts with balances below the average at the bank.
delete from account
where balance < (select avg (balance )
from account )
Problem: as we delete tuples from account, the average balance changes
Solution used in SQL:
First, compute avg balance and find all tuples to delete
Next, delete all tuples found above (without recomputing avg or retesting the
tuples)
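The two-step behaviour can be observed with a small experiment. The sketch below uses SQLite through Python's sqlite3 module with invented balances chosen so that recomputing the average after each deletion would give a different answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table account (account_number text, branch_name text, balance integer);
    -- Hypothetical balances: the initial average is (100 + 500 + 600) / 3 = 400
    insert into account values
        ('A-1', 'Downtown', 100), ('A-2', 'Downtown', 500), ('A-3', 'Mianus', 600);
""")

# Only A-1 is below the initial average of 400. If the average were recomputed
# after deleting A-1 it would rise to 550 and A-2 would wrongly be deleted too.
conn.execute(
    "delete from account where balance < (select avg(balance) from account)")

remaining = [row[0] for row in conn.execute(
    "select account_number from account order by account_number")]
```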
update account
set balance = balance * 1.05
where balance <= 10000
Same query as before: Increase all accounts with balances over $10,000 by 6%, all
other accounts receive 5%.
update account
set balance = case
when balance <= 10000 then balance * 1.05
else balance * 1.06
end
Update of a View
Create a view of all loan data in the loan relation, hiding the amount attribute
create view branch_loan as
select loan_number, branch_name
from loan
Add a new tuple to branch_loan
insert into branch_loan
values ('L-37', 'Perryridge')
This insertion must be represented by the insertion of the tuple
('L-37', 'Perryridge', null )
into the loan relation
Some updates through views are impossible to translate into updates on the database
relations
create view v as
select loan_number, branch_name, amount
from loan
where branch_name = 'Perryridge'
insert into v values ('L-99', 'Downtown', '23')
Find all customers who have either an account or a loan (but not both) at the bank.
select customer_name
from (depositor natural full outer join borrower )
where account_number is null or loan_number is null
Database Schema
UNIT V
Relational Database Design: Pitfalls in Relational-Database Design, Functional
Dependencies, Decomposition, Desirable Properties of Decomposition, Normalization
(1NF – DKNF), BCNF & Its Comparison with 3NF.
Normalization
This process of reducing a relation into simpler structures is the process of
Normalisation.
Normalisation may be defined as a step by step reversible process of transforming an
unnormalised relation into relations with progressively simpler structures. Since the
process is reversible, no information is lost in the transformation.
Normalisation removes (or more accurately, minimises) the undesirable properties by
working through a series of stages called Normal Forms. Originally, Codd defined three
types of undesirable properties:
Data aggregates
Partial key dependency
Indirect key dependency
Stages of Normalisation
Unnormalised (UDF)
    ↓ Remove repeating groups
First normal form (1NF)
    ↓ Remove partial dependencies
Second normal form (2NF)
    ↓ Remove transitive dependencies
Third normal form (3NF)
    ↓ Remove remaining functional dependency anomalies
Boyce-Codd normal form (BCNF)
    ↓ Remove multivalued dependencies
Fourth normal form (4NF)
    ↓ Remove remaining anomalies
Fifth normal form (5NF)
Goals of Normalisation
• Eliminate certain kinds of redundancy
• avoid certain update anomalies
• good representation of real world
• simplify enforcement of DB integrity
Bad Design
Alternatively, why don't we re-structure our relation such that we do not restrict the
number of transactions per customer. We can do this with the following structure:
This way, a customer can have just any number of Part transactions without worrying
about any upper limit or wasted space through null values (as it was with the previous
structure).
It seems a waste of storage to keep repeated values of Cname, Ccity and Cphone.
If C# 1 were to change his telephone number, we would have to ensure that we
update ALL occurrences of C# 1's Cphone values. This means updating tuple 1,
tuple 2 and all other tuples where there is an occurrence of C# 1. Otherwise, our
database would be left in an inconsistent state.
Suppose we now have a new customer with C# 4. However, there is no part
transaction yet with the customer as he has not ordered anything yet. We may find
that we cannot insert this new information because we do not have a P# which
serves as part of the 'primary key' of a tuple. (A primary key cannot have null
values).
Suppose the third transaction has been canceled, i.e. we no longer need information about
25 of P# 1 being ordered on 26 Jan. We thus delete the third tuple. We are then left with
the following relation:
But then, suppose we need information about the customer "Martin", say the city he is
located in. Unfortunately as information about Martin was held in only that tuple and
having the entire tuple deleted because of its P# transaction, meant also that we have lost
all information about Martin from the relation.
As illustrated in the above instances, we note that badly designed, unnormalised relations
waste storage space. Worse, they give rise to the following storage irregularities:
Update anomaly: Data inconsistency or loss of data integrity can arise from data
redundancy/repetition and partial update.
Insertion anomaly: Data cannot be added because some other data is absent.
Deletion anomaly: Data may be unintentionally lost through the deletion of other
data.
Let us see if this new design will alleviate the above storage anomalies:
Update anomaly
If C# 1 were to change his telephone number, as there is only one occurrence of the tuple
in the Customer relation, we need to update only that one tuple as there are no
redundant/duplicate tuples.
Addition anomaly
Adding a new customer with C# 4 can be easily done in the Customer relation of which
C# serves as the primary key. With no P# yet, a tuple in Transaction need not be created.
Deletion anomaly
Canceling the third transaction about 25 of P# 1 being ordered on 26 Jan would now
mean deleting only the third tuple of the new Transaction relation above. This leaves
information about Martin still intact in the new Customer relation.
We shall now show a more formal process on how we can decompose relations into
multiple relations by using the Normal Form rules for structuring.
This is thus not in 1NF. It must be "flattened". This can be achieved by ensuring that
every tuple defines a single entity by containing only atomic values. One can either re-
organise into one relation as in:
Note that earlier we defined 1NF as one of the characteristics of a relation . Thus we
consider that every relation is at least in the first normal form (thus the Figure 4-3 is not
even a relation). The Transaction relation of Figure 4-2 is however a 1NF relation.
We may thus generalise by saying that "A relation is in the 1NF if the values in the
relation are atomic for every single attribute of the relation".
Before we can look into the next two normal forms, 2NF and 3NF, we need to first
explain the notion of "functional dependency" as these two forms are constrained by
functional dependencies.
Functional Dependencies
Determinant
The value of an attribute can uniquely determine the value in another attribute.
C# → Cname
C# → Ccity
C# → Cphone
(C#, P#, Date) → Qnt
The value of the attribute on the left-hand side of the arrow is the determinant because its
value uniquely determines the value of the attribute on the right.
(Ccity, Cphone) → Cname
(Ccity, Cphone) → C#
Similarly, "(C#, P#, Date) is a determinant of Qnt" is thus also "Qnt is functionally
dependent on the set of attributes (C#, P#, Date)". The set of attributes is also known as a
composite attribute.
and likewise
1. Update
What happens if Codd's telephone number changes and we update only the first tuple (but
not the second)?
2. Insertion
If we wish to introduce a new customer, we cannot do so unless an appropriate
transaction is effected.
3. Deletion
If the data about a transaction is deleted, the information about the customer is also
deleted. If this happens to the last transaction for that customer the information about the
customer will be lost.
Clearly, the Transaction relation, although it is normalised to 1NF, still has storage
anomalies. The reason for such violations of the database's integrity and consistency rules
is the partial dependency on the primary key.
The determinant (C#, P#, Date) is the composite key of the Transaction relation - its
value will uniquely determine the value of every other non-key attribute in a tuple of the
relation. Note that whilst Qnt is fully functionally dependent on all of (C#, P#, Date),
Cname, Ccity and Cphone are only partially functionally dependent on the composite key
(as they each depend only on the C# part of the key only but not on P# or Date).
The problems are avoided by eliminating partial key dependence in favour of full
functional dependence, and we can do so by separating the dependencies as follows:
The source relation is thus split into two (or more) relations whereby each resultant
relation no longer has any partial key dependencies:
We now have two relations, both of which are in the second normal form.
Update anomaly
There are no redundant/duplicate tuples in the relation, thus updates are done just at one
place without any worry for database inconsistencies.
Addition anomaly
Adding a new customer can be done in the Customer relation without concern whether
there is a transaction for a part or not.
Deletion anomaly
Deleting a tuple in Transaction does not cause loss of information about Customer details.
where the composite attribute (K1, K2) is the Primary Key. Suppose also that there exist
the following functional dependencies:
(K1, K2) → I1
i.e. a full functional dependency on the composite key (K1, K2).
K2 → I2
i.e. a partial functional dependency on the composite key (K1, K2).
The partial dependencies on the primary key must be eliminated. The reduction of 1NF
into 2NF consists of replacing the 1NF relation by appropriate "projections" such that
every non-key attribute in the relations are fully functionally dependent on the primary
key of the respective relation. The steps are:
If a relation has the same determinant as another relation, place the dependent attributes
of the relation to be non-key attributes in the other relation for which the determinant is a
key.
Thus, "A relation R is in 2NF if it is in 1NF and every non-key attribute is fully
functionally dependent on the primary key".
Given a set of attributes a, define the closure of a under F (denoted by a+) as the set
of attributes that are functionally determined by a under F
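The closure definition translates directly into a fixed-point loop. The sketch below is one straightforward way to compute it in Python, assuming single-letter attribute names so that a string such as "CD" can stand for the set {C, D}; the dependency set F is the one used in the exercise in this section, and the helper names are made up for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of attrs under the functional dependencies fds.

    fds is a list of (lhs, rhs) pairs; each side is an iterable of attributes.
    """
    result = set(attrs)
    changed = True
    while changed:  # "repeat again" until a full pass adds nothing
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Brute-force search for minimal attribute sets whose closure is the schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # a superset of a known key is not minimal
            if closure(combo, fds) == set(schema):
                keys.append(combo)
    return keys


# F = {A -> BC, CD -> E, B -> D, E -> A} over R = (A, B, C, D, E)
F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
b_plus = closure("B", F)
keys = candidate_keys("ABCDE", F)
```

The brute-force key search is exponential in the number of attributes, which is acceptable only for small textbook schemas like this one.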
Q Compute the closure of the following set F of functional dependencies for relation
schema R = (A,B,C,D,E)
A →BC
CD →E
B →D
E →A
List the candidate keys for R.
(b) Using the functional dependencies in F, compute B+.
Result ={A}+
Repeat again
Result ={CD}+
B →D Result = { C,D,E }
E →A Result = { A,C,D,E }
Repeat again
Result ={B}+
A →BC Result = { B }
CD →E Result = { B }
B →D Result = { B,D}
E →A Result = { B,D}
Repeat again
4) We check about E
Result ={E}+
A →BC Result = { E }
CD →E Result = { E }
B →D Result = { E}
E →A Result = { E,A}
Repeat again
CD →E Result = {A,B,C,D,E}
A B C
a1 b1 c1
a1 b1 c2
a2 b1 c1
a2 b1 c2
Ans
A->B is satisfied since t3(a2,b1) and t4(a2,b1)
A->C is not satisfied since t1(a1,c1) and t2(a1,c2)
B->A is not satisfied since t1(b1,a1) and t3(b1,a2)
C->B is satisfied since t1(c1,b1) and t3(c1,b1)
AB->C is not satisfied since t1(a1,b1,c1) and t2(a1,b1,c2)
BC->A is not satisfied since t1(b1,c1,a1) and t3(b1,c1,a2)
Now trivial functional Dependencies
AB-> A is satisfied since t1(a1,b1,a1) and t2(a1,b1,a1)
BC->B is satisfied
CA->B is satisfied
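Checks like these can be automated: a dependency X -> Y holds on a relation instance if no two tuples agree on X but differ on Y. The sketch below is a minimal Python version using the four tuples of the table above; the function name is made up for illustration.

```python
def satisfies(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds on rows.

    rows is a list of dicts; lhs and rhs are iterables of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key; any later
        # tuple with the same key must carry the same value.
        if seen.setdefault(key, value) != value:
            return False
    return True


r = [
    {"A": "a1", "B": "b1", "C": "c1"},  # t1
    {"A": "a1", "B": "b1", "C": "c2"},  # t2
    {"A": "a2", "B": "b1", "C": "c1"},  # t3
    {"A": "a2", "B": "b1", "C": "c2"},  # t4
]

results = {
    "A->B": satisfies(r, "A", "B"),
    "A->C": satisfies(r, "A", "C"),
    "B->A": satisfies(r, "B", "A"),
    "C->B": satisfies(r, "C", "B"),
    "AB->C": satisfies(r, "AB", "C"),
    "BC->A": satisfies(r, "BC", "A"),
}
```

Note that such a check only shows that a dependency holds on one instance; whether it holds on the schema is a statement about all legal instances.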
Example BCNF
• R = (A, B, C)
• F = (A==> B,
B==> C)
• R is not in BCNF
• Decomposition R1 = (A, B), R2 = (B, C)
– R1 and R2 are in BCNF
– Lossless-join decomposition
– Dependency preserving
BCNF and Dependency Preservation
Constraints, including functional dependencies, are costly to check in practice unless
they pertain to only one relation
If it is sufficient to test only those dependencies on each individual relation of a
decomposition in order to ensure that all functional dependencies hold, then that
decomposition is dependency preserving.
Because it is not always possible to achieve both BCNF and dependency
preservation, we consider a weaker normal form, known as third normal form.
Example
• R = (J, K, L)
• F = (JK==> L,
L==> K)
• Two candidate keys: JK and JL
• R is in 3NF
– JK==>L JK is a superkey
– L==>K K is contained in a candidate key
• BCNF decomposition has R1 = (J, L), R2 = (L, K)
– testing for JK==>L requires a join
• There is some redundancy in this schema
Extraneous Attributes
Consider a set F of functional dependencies and the functional dependency α → β in F.
Attribute A is extraneous in α if A ∈ α
and F logically implies (F – {α → β}) ∪ {(α – A) → β}.
Attribute A is extraneous in β if A ∈ β
and the set of functional dependencies
(F – {α → β}) ∪ {α → (β – A)} logically implies F.
Note: implication in the opposite direction is trivial in each of the cases above, since a
“stronger” functional dependency always implies a weaker one
Example: Given F = {A → C, AB → C}
B is extraneous in AB → C because {A → C, AB → C} logically implies A → C (i.e. the
result of dropping B from AB → C).
Example: Given F = {A → C, AB → CD}
C is extraneous in AB → CD since AB → C can be inferred even after deleting C
Testing if an Attribute is Extraneous
Consider a set F of functional dependencies and the functional dependency α → β in F.
To test if attribute A ∈ α is extraneous in α:
1. compute (α – A)+ using the dependencies in F
2. check that (α – A)+ contains β; if it does, A is extraneous in α
To test if attribute A ∈ β is extraneous in β:
1. compute α+ using only the dependencies in F’ = (F – {α → β}) ∪ {α → (β – A)}
2. check that α+ contains A; if it does, A is extraneous in β
Canonical Cover
A canonical cover for F is a set of dependencies Fc such that F logically implies all
dependencies in Fc, and Fc logically implies all dependencies in F, and
No functional dependency in Fc contains an extraneous attribute, and
Each left side of functional dependency in Fc is unique.
To compute a canonical cover for F:
repeat
Use the union rule to replace any dependencies in F
α1 → β1 and α1 → β2 with α1 → β1 β2
Find a functional dependency α → β with an
extraneous attribute either in α or in β
If an extraneous attribute is found, delete it from α → β
until F does not change
Note: Union rule may become applicable after some extraneous attributes have been
deleted, so it has to be re-applied
Computing a Canonical Cover
R = (A, B, C)
F = {A → BC
B → C
A → B
AB → C}
Combine A → BC and A → B into A → BC
Set is now {A → BC, B → C, AB → C}
A is extraneous in AB → C
Check if the result of deleting A from AB → C is implied by the other dependencies
Yes: in fact, B → C is already present!
Set is now {A → BC, B → C}
C is extraneous in A → BC
Check if A → C is logically implied by A → B and the other dependencies
Yes: using transitivity on A → B and B → C.
Can use attribute closure of A in more complex cases
The canonical cover is: A → B
B → C
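The procedure can be sketched in Python as a loop that alternates the union rule with the removal of one extraneous attribute at a time. This is an illustrative implementation under the assumption of single-letter attribute names; the helper names are made up, and the order in which extraneous attributes are found may differ from the hand computation above even though the final cover is the same for this F.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def merge(fds):
    """Union rule: combine dependencies with the same left side."""
    merged = {}
    for lhs, rhs in fds:
        merged.setdefault(frozenset(lhs), set()).update(rhs)
    # Dependencies whose right side has become empty are dropped.
    return [(lhs, frozenset(rhs)) for lhs, rhs in merged.items() if rhs]


def remove_one_extraneous(fds):
    """Return fds with one extraneous attribute removed, or None if there is none."""
    for i, (lhs, rhs) in enumerate(fds):
        for a in sorted(lhs):
            reduced = lhs - {a}
            # a is extraneous in lhs if (lhs - a)+ under F already contains rhs
            if reduced and rhs <= closure(reduced, fds):
                return fds[:i] + [(reduced, rhs)] + fds[i + 1:]
        for a in sorted(rhs):
            # a is extraneous in rhs if lhs+ under F' still contains a,
            # where F' is F with a dropped from this right side
            trial = fds[:i] + [(lhs, rhs - {a})] + fds[i + 1:]
            if a in closure(lhs, trial):
                return trial
    return None


def canonical_cover(fds):
    fc = merge(fds)
    while True:
        reduced = remove_one_extraneous(fc)
        if reduced is None:
            return fc
        fc = merge(reduced)  # the union rule may apply again after a deletion


F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
cover = sorted(
    "".join(sorted(lhs)) + "->" + "".join(sorted(rhs))
    for lhs, rhs in canonical_cover(F)
)
```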
Fourth Normal Form
BOOK_NO
AUTHOR_NOAnomaly SUBJECT
because the same set of SUBJECT is associated with each AUTHOR
A1 B1 Comp. Sc.
A1 B1 Maths
multidetermines
A2 B1 Comp. Sc.
BOOK_NO AUTHOR_NO
A2 B1 Maths
A1 B1 Comp. Sc.
A1 B1 B1 Comp. Sc.
A2 B1 B1 Maths
A3 B2 B2 Maths
UNIT VI
Integrity Constraints
Integrity constraints guard against accidental damage to the database, by ensuring that
authorized changes to the database do not result in a loss of data consistency.
Example: a checking account must have a balance greater than $10,000.00
Example: Declare branch_name as the primary key for branch and ensure that the
values of assets are non-negative.
create table branch
(branch_name char(15),
branch_city char(30),
assets integer,
primary key (branch_name),
check (assets >= 0))
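The effect of the two constraints can be seen by trying to violate them. The sketch below runs the same table definition in SQLite through Python's sqlite3 module; the inserted rows are invented, and the exact error reported will differ between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table branch
        (branch_name char(15),
         branch_city char(30),
         assets integer,
         primary key (branch_name),
         check (assets >= 0))
""")
conn.execute("insert into branch values ('Perryridge', 'Horseneck', 1700000)")


def try_insert(row):
    """Attempt an insert and report whether the constraints allowed it."""
    try:
        conn.execute("insert into branch values (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


negative_assets = try_insert(("Mianus", "Horseneck", -5))      # violates the check clause
duplicate_key = try_insert(("Perryridge", "Brooklyn", 10))     # violates the primary key
valid_row = try_insert(("Downtown", "Brooklyn", 9000000))
row_count = conn.execute("select count(*) from branch").fetchone()[0]
```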
Use check clause to ensure that an hourly_wage domain allows only values
greater than a specified value.
create domain hourly_wage numeric(5,2)
constraint value_test check(value >= 4.00)
The domain has a constraint that ensures that the hourly_wage is at least
4.00
The clause constraint value_test is optional; it is useful to indicate which
constraint an update violated
Assertions
An assertion is a predicate expressing a condition that we wish the database always to
satisfy.
An assertion in SQL takes the form
create assertion <assertion-name> check <predicate>
Every loan has at least one borrower who maintains an account with a minimum
balance of $1000.00
Referential Integrity
The foreign key clause lists the attributes that comprise the foreign key and
the name of the relation referenced by the foreign key. By default, a foreign
key references the primary key attributes of the referenced table.
Solution
create table product_master(
product_no varchar2(6) primary key check(product_no like 'P%'),
description varchar2(15),
profit_percent number(4,2),
unit_mesure varchar2(10),
qty_on_hand number(8),
record_lvl number(8),
sell_price number(8,2) check(sell_price>0),
cost_price number(8,2) check(cost_price>0)
);
Q. Table Name: salesman_master
Description: Used to store information about salesmen working in the company.
Column Name     Data Type   Size   Attribute
salesman_no     Varchar2    6      Primary key / First letter must start with ‘S’
salesman_name   Varchar2    20     Not null
address1        Varchar2    30
address2        Varchar2    30
city            Varchar2    20
pincode         Varchar2    8
state           Varchar2    20
sal_amt         number      8,2    Not null, Cannot be 0
tgt_to_get      number      6,2    Not null, Cannot be 0
ytd_sales       number      6,2    Not null
remarks         Varchar2    60
Solution
create table salesman_master(
salesman_no varchar2(6) primary key check(salesman_no like 'S%'),
salesman_name varchar2(20) not null,
address1 varchar2(30),
address2 varchar2(30),
city varchar2(20),
pincode varchar2(8),
state varchar2(20),
sal_amt number(8,2) not null check (sal_amt<>0),
tgt_to_get number(6,2) not null check(tgt_to_get <>0),
ytd_sales number(6,2) not null,
remarks varchar2(60)
);
Q. Table Name: salesman_order
Description: Used to store information about clients' orders
Column Name     Data Type   Size   Attribute
Order_no        Varchar2    6      Primary key / First letter must start with ‘O’
Order_date      date
);
Triggers
A trigger is a statement that is executed automatically by the system as a side effect of a
modification to the database.
To design a trigger mechanism, we must:
Specify the conditions under which the trigger is to be executed.
Specify the actions to be taken when the trigger executes.
Trigger Example
Suppose that instead of allowing negative account balances, the bank deals with
overdrafts by setting the account balance to zero, creating a loan in the amount of the
overdraft, and giving this loan a loan number identical to the account number of the
overdrawn account.
The condition for executing the trigger is an update to the account relation that results
in a negative balance value.
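A minimal sketch of such an overdraft trigger, written in SQLite's trigger dialect and driven from Python's sqlite3 module. The trigger name, the table layouts and the sample rows are assumptions made for this illustration; trigger syntax varies considerably between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table account (account_number text primary key, branch_name text, balance integer);
    create table loan (loan_number text primary key, branch_name text, amount integer);
    create table depositor (customer_name text, account_number text);
    create table borrower (customer_name text, loan_number text);

    -- Condition: an update to account that leaves a negative balance.
    create trigger overdraft_trigger after update of balance on account
    for each row when new.balance < 0
    begin
        -- Actions: record the account holders as borrowers, create a loan for the
        -- overdraft amount numbered like the account, and reset the balance to zero.
        insert into borrower
            select customer_name, account_number from depositor
            where depositor.account_number = new.account_number;
        insert into loan values (new.account_number, new.branch_name, -new.balance);
        update account set balance = 0 where account_number = new.account_number;
    end;

    -- Hypothetical sample data and an overdrawing withdrawal of 500
    insert into account values ('A-102', 'Perryridge', 400);
    insert into depositor values ('Hayes', 'A-102');
    update account set balance = balance - 500 where account_number = 'A-102';
""")

loans = conn.execute("select * from loan").fetchall()
borrowers = conn.execute("select * from borrower").fetchall()
balance = conn.execute("select balance from account").fetchone()[0]
```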
UNIT VII
fi: average fan-out of internal nodes of index i, for tree-structured indices such as B+-trees.
HTi: number of levels in index i, i.e., the height of i.
For a balanced tree index (such as a B+-tree) on attribute A of relation r, HTi = ⌈log_fi(V(A, r))⌉.
For a hash index, HTi is 1.
LBi: number of lowest-level index blocks in i, i.e., the number of blocks at the leaf level of the index.
Measures of Query Cost
Recall that
Typically disk access is the predominant cost, and is also relatively easy to estimate.
The number of block transfers from disk is used as a measure of the actual cost of evaluation.
It is assumed that all transfers of blocks have the same cost.
Real-life optimizers do not make this assumption, and distinguish between sequential and random disk access.
We do not include the cost of writing output to disk.
We refer to the cost estimate of algorithm A as EA.
Selection Size Estimation
Equality selection σ A=v (r)
SC(A, r): number of records that will satisfy the selection
⌈SC(A, r)/fr⌉: number of blocks that these records will occupy
E.g. the binary search cost estimate becomes
Ea2 = ⌈log2(br)⌉ + ⌈SC(A, r)/fr⌉ − 1
Equality condition on a key attribute: SC(A, r) = 1
ati
sti
calI
nfor
mat
ionf
orExampl
es
facc
ount=20 ( 20t uplesofaccountfitinonebl ock)
V( branch- name,account )=50 ( 50br anches)
V( bal ance,account )=500 ( 500di fferentbalancev
alues)
account=10000 ( accounthas10, 000t uples)
Assumet hef ol
lowi ngindi
cesexistonaccount :
Apr imar y +
,B- t
reei ndexforat
tr
ibutebr anch-name
As econdar y,B-+
t
reeindexforattr
ibut ebalance
Sel
ect
ionsI
nvol
vingCompar
isons
Select
ionsofthef
orm AV()(
r caseofA V()i
r ssymmet
ri
c)
Letcdenotet heesti
matednumberoft uplessati
sfyi
ngt
hecondi
t
ion.
Ifmin(
A,r)andmax(A,r)areavail
abl
eincat al
og
C=0i fv<mi n(A,r
)
C=nrifv>=max(a,
r)
And
nr*(
v-mi
n(A,
r)
)/(
max(
A,r
)–mi
n(A,
r)
) ot
her
wise
I
nabs
enceofst
ati
st
icali
nfor
mat
i
onci
sassumedt
obenr/2.
I
mpl
ement
ati
onofCompl
exSel
ect
ions
Theselect
ivi
tyofacondi t
ioniist
heprobabi
l
it
ythatatupl
einthe
rel
ati
onrsatsfiesi.I
i fsi i
sthenumberofsati
sfy
ingt
uplesi
nr,the
sel
ecti
vi
tyofii sgi
venbysi/nr.
177
Conjunction: σ θ1∧θ2∧...∧θn (r). The estimate for the number of tuples in the result is:
nr ∗ (s1 ∗ s2 ∗ … ∗ sn) / nr^n
Disjunction: σ θ1∨θ2∨...∨θn (r). Estimated number of tuples:
nr ∗ (1 – (1 – s1/nr) ∗ (1 – s2/nr) ∗ … ∗ (1 – sn/nr))
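Both estimates treat the conditions as independent: a conjunction multiplies the individual selectivities si/nr, and a disjunction takes one minus the probability that a tuple fails every condition, scaled by nr. A minimal Python sketch with invented figures (the function names are made up for illustration):

```python
from math import prod


def conjunction_size(n_r, sizes):
    """Estimated result size of a conjunction of conditions.

    sizes holds s1..sn, the number of tuples satisfying each condition alone.
    Equivalent to nr * (s1 * s2 * ... * sn) / nr^n.
    """
    return n_r * prod(s / n_r for s in sizes)


def disjunction_size(n_r, sizes):
    """Estimated result size of a disjunction of conditions:
    nr * (1 - (1 - s1/nr) * ... * (1 - sn/nr))."""
    return n_r * (1 - prod(1 - s / n_r for s in sizes))


# Hypothetical figures: 10,000 tuples; the conditions match 200 and 5,000 tuples
both = conjunction_size(10000, [200, 5000])     # about 100
either = disjunction_size(10000, [200, 5000])   # about 5100
```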
Join Operation: Running Example
Estimation of the Size of Joins
The Cartesian product r × s contains nr ∗ ns tuples; each tuple occupies sr + ss bytes.
If R ∩ S = ∅, then r ⋈ s is the same as r × s.
If R ∩ S is a key for R, then a tuple of s will join with at most one tuple from r;
therefore, the number of tuples in r ⋈ s is no greater than the number of tuples in s.
If R ∩ S in S is a foreign key in S referencing R, then the number of tuples in r ⋈ s is exactly the same as the number of tuples in s.
The case for R ∩ S being a foreign key referencing S is symmetric.
In the example query depositor ⋈ customer, customer-name in depositor is a foreign key of customer;
hence, the result has exactly n_depositor tuples, which is 5000.
If R ∩ S = {A} is not a key for R or S:
If we assume that every tuple t in R produces tuples in R ⋈ S, the number of tuples in R ⋈ S is estimated to be:
nr ∗ ns / V(A, s)
If the reverse is true, the estimate obtained will be:
nr ∗ ns / V(A, r)
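The two estimates are easy to compare numerically. In the sketch below only n_depositor = 5000 comes from these notes; the customer count and the distinct-value counts are assumed figures chosen for illustration, and the function name is made up.

```python
def join_size_estimates(n_r, n_s, v_a_r, v_a_s):
    """Return the two estimates for the size of r join s on a common attribute A.

    The first assumes every tuple of r joins with ns / V(A, s) tuples of s;
    the second is the symmetric estimate using V(A, r).
    """
    return n_r * n_s / v_a_s, n_r * n_s / v_a_r


n_depositor = 5000        # from the running example
n_customer = 10000        # assumed for illustration
v_name_depositor = 2500   # assumed: distinct customer names in depositor
v_name_customer = 10000   # assumed: customer-name is a key of customer

estimate_1, estimate_2 = join_size_estimates(
    n_depositor, n_customer, v_name_depositor, v_name_customer)
```

Under these assumed figures the first estimate (5000) agrees with the foreign-key argument given earlier, while the second (20000) overshoots.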
Size Estimation for Other Operations
Projection: estimated size of Π_A(r) = V(A, r)
Aggregation: estimated size of _A 𝒢_F(r) = V(A, r)
Set operations
For unions/intersections of selections on the same relation: rewrite and use the
size estimate for selections.
E.g. σ_θ1(r) ∪ σ_θ2(r) can be rewritten as σ_{θ1 ∨ θ2}(r)
For operations on different relations:
estimated size of r ∪ s = size of r + size of s.
estimated size of r ∩ s = minimum of size of r and size of s.
estimated size of r – s = size of r.
All three estimates may be quite inaccurate, but provide upper bounds on the sizes.
Outer join:
Estimated size of r ⟕ s (left outer join) = size of r ⋈ s + size of r
Estimation of Number of Distinct Values
Selections: σ_θ(r)
If θ forces A to take a specified value: V(A, σ_θ(r)) = 1.
e.g., A = 3
If θ forces A to take on one of a specified set of values:
V(A, σ_θ(r)) = number of specified values.
(e.g., (A = 1 ∨ A = 3 ∨ A = 4))
If the selection condition θ is of the form A op v:
estimated V(A, σ_θ(r)) = V(A, r) * s
where s is the selectivity of the selection.
In all the other cases: use the approximate estimate
min(V(A, r), n_{σθ(r)})
A more accurate estimate can be obtained using probability theory, but this one works
fine generally.
Joins: r ⋈ s
If all attributes in A are from r:
estimated V(A, r ⋈ s) = min(V(A, r), n_{r ⋈ s})
If A contains attributes A1 from r and A2 from s, then estimated
V(A, r ⋈ s) =
min(V(A1, r) * V(A2 – A1, s), V(A1 – A2, r) * V(A2, s), n_{r ⋈ s})
A more accurate estimate can be obtained using probability theory, but this one
works fine generally.
Query: Find the names of all customers who have an account at some branch located
in Brooklyn.
Π_{customer-name}(σ_{branch-city = “Brooklyn”}
(branch ⋈ (account ⋈ depositor)))
Transformation using rule 7a.
Π_{customer-name}
((σ_{branch-city = “Brooklyn”}(branch))
⋈ (account ⋈ depositor))
Performing the selection as early as possible reduces the size of the relation to be
joined.
Query: Find the names of all customers with an account at a Brooklyn branch whose
account balance is over $1000.
Π_{customer-name}(σ_{branch-city = “Brooklyn” ∧ balance > 1000}
(branch ⋈ (account ⋈ depositor)))
When we compute
(σ_{branch-city = “Brooklyn”}(branch) ⋈ account)
we obtain a relation whose schema is:
(branch-name, branch-city, assets, account-number, balance)
Push projections using equivalence rules 8a and 8b; eliminate unneeded attributes
from intermediate results to get:
Π_{customer-name}((
Π_{account-number}((σ_{branch-city = “Brooklyn”}(branch)) ⋈ account))
⋈ depositor)
When r1 ⋈ r2 is small compared with r2 ⋈ r3, we choose the join order
(r1 ⋈ r2) ⋈ r3
so that we compute and store a smaller temporary relation.
Since it is more likely that only a small fraction of the bank’s customers have
accounts in branches located in Brooklyn, it is better to compute
σ_{branch-city = “Brooklyn”}(branch) ⋈ account
first.
Evaluation Plan
An evaluation plan defines exactly what algorithm is used for each operation, and
how the execution of the operations is coordinated.
Cost-Based Optimization
Consider finding the best join-order for r1 ⋈ r2 ⋈ . . . ⋈ rn.
There are (2(n – 1))!/(n – 1)! different join orders for the above expression. With n = 7,
the number is 665280; with n = 10, the number is greater than 17.6 billion!
No need to generate all the join orders. Using dynamic programming, the least-cost
join order for any subset of
{r1, r2, . . . rn} is computed only once and stored for future use.
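The count of join orders and the dynamic-programming idea can both be sketched in a few lines. The cost model here (sum of estimated intermediate result sizes), the relation sizes, and the pairwise selectivities are all assumptions made for illustration; a real optimizer uses far more detailed cost formulas.

```python
from functools import lru_cache
from itertools import combinations
from math import factorial, inf


def join_orders(n):
    """Number of join orders for n relations: (2(n - 1))! / (n - 1)!."""
    return factorial(2 * (n - 1)) // factorial(n - 1)


def best_join_order(sizes, selectivity):
    """Least-cost bushy join order, memoized per subset of relations.

    sizes maps relation name -> tuple count; selectivity maps a sorted
    pair of names -> assumed join selectivity (1.0 when absent).
    """
    def est_size(subset):
        size = 1
        for name in subset:
            size *= sizes[name]
        for pair in combinations(sorted(subset), 2):
            size *= selectivity.get(pair, 1.0)
        return size

    @lru_cache(maxsize=None)          # each subset is solved only once
    def best(subset):
        if len(subset) == 1:
            (name,) = subset
            return 0.0, name
        best_cost, best_plan = inf, None
        members = sorted(subset)
        for k in range(1, len(members)):
            for left in combinations(members, k):
                left = frozenset(left)
                right = subset - left
                left_cost, left_plan = best(left)
                right_cost, right_plan = best(right)
                cost = left_cost + right_cost + est_size(subset)
                if cost < best_cost:
                    best_cost, best_plan = cost, (left_plan, right_plan)
        return best_cost, best_plan

    return best(frozenset(sizes))


print(join_orders(7), join_orders(10))
# Hypothetical statistics for three relations.
sizes = {"r1": 1000, "r2": 10, "r3": 100}
selectivity = {("r1", "r2"): 0.01, ("r2", "r3"): 0.5}
print(best_join_order(sizes, selectivity))
```

The memoization on subsets is what keeps the search far below the factorial number of orders: each subset's best plan is reused by every larger subset that contains it.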
Cost of Optimization
With dynamic programming, the time complexity of optimization with bushy trees is
O(3^n).
o With n = 10, this number is about 59000 instead of 17.6 billion!
Space complexity is O(2^n)
To find the best left-deep join tree for a set of n relations:
o Consider n alternatives with one relation as right-hand side input and the
other relations as left-hand side input.
o Using (recursively computed and stored) least-cost join order for each
alternative on left-hand-side, choose the cheapest of the n alternatives.
If only left-deep trees are considered, time complexity of finding best join order is
O(n 2^n)
o Space complexity remains at O(2^n)
Cost-based optimization is expensive, but worthwhile for queries on large datasets
(typical queries have small n, generally < 10)
Consider the expression (r1 ⋈ r2 ⋈ r3) ⋈ r4 ⋈ r5
An interesting sort order is a particular sort order of tuples that could be useful for
a later operation.
o Generating the result of r1 ⋈ r2 ⋈ r3 sorted on the attributes common with
r4 or r5 may be useful, but generating it sorted on the attributes common
only to r1 and r2 is not useful.
o Using merge-join to compute r1 ⋈ r2 ⋈ r3 may be costlier, but may
provide an output sorted in an interesting order.
Not sufficient to find the best join order for each subset of the set of n given relations;
must find the best join order for each subset, for each interesting sort order
o Simple extension of earlier dynamic programming algorithms
o Usually, number of interesting orders is quite small and doesn’t affect
time/space complexity significantly
Heuristic Optimization
Cost-based optimization is expensive, even with dynamic programming.
Systems may use heuristics to reduce the number of choices that must be made in a
cost-based fashion.
Heuristic optimization transforms the query-tree by using a set of rules that typically
(but not in all cases) improve execution performance:
o Perform selection early (reduces the number of tuples)
o Perform projection early (reduces the number of attributes)
o Perform most restrictive selection and join operations before other similar
operations.
o Some systems use only heuristics, others combine heuristics with partial
cost-based optimization.
UNIT VIII
Transaction & Concurrency Control: Transaction Concepts & ACID Properties,
Transaction States, Concurrent Executions, Serializability & Its Testing, Guarantee
Serializability, Recoverability, Introduction to Concurrency Control, Locked Base
Protocol & Deadlock Handling.
Transaction Concept
ACID Properties
To preserve integrity of data, the database system
must ensure:
6. write(B)
Consistency requirement – the sum of A and B is
unchanged by the execution of the transaction.
Atomicity requirement — if the transaction fails
after step 3 and before step 6, the system should
ensure that its updates are not reflected in the
database, else an inconsistency will result.
Durability requirement — once the user has been
notified that the transaction has completed (i.e.,
the transfer of the $50 has taken place), the
updates to the database by the transaction must
persist despite failures.
Isolation requirement — if between steps 3 and 6,
another transaction is allowed to access the
partially updated database, it will see an
inconsistent database
(the sum A + B will be less than it should be).
Can be ensured trivially by running transactions
serially, that is one after the other. However,
executing multiple transactions concurrently has
significant benefits, as we will see.
Transaction State
Implementation of Atomicity and Durability
The shadow-database scheme:
Assumes disks do not fail
Useful for text editors, but extremely inefficient for large databases
Concurrent Executions
Multiple transactions are allowed to run concurrently in the system. Advantages are:
o increased processor and disk utilization, leading to better transaction
throughput: one transaction can be using the CPU while another is reading
from or writing to the disk
o reduced average response time for transactions: short transactions need
not wait behind long ones.
Concurrency control schemes – mechanisms to achieve isolation, i.e., to control the
interaction among the concurrent transactions in order to prevent them from
destroying the consistency of the database
Schedules
Schedules – sequences that indicate the chronological order in which instructions of
concurrent transactions are executed
o a schedule for a set of transactions must consist of all instructions of those
transactions
o must preserve the order in which the instructions appear in each individual
transaction.
Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B. The
following is a serial schedule (Schedule 1 in the text), in which T1 is followed by T2.
T1              T2
Read(A)
A=A-50
Write(A)
Read(B)
B=B+50
Write(B)
                Read(A)
                Temp=A*0.1
                A=A-temp
                Write(A)
                Read(B)
                B=B+temp
                Write(B)
The serial schedule in which T2 is followed by T1:
T1              T2
                Read(A)
                Temp=A*0.1
                A=A-temp
                Write(A)
                Read(B)
                B=B+temp
                Write(B)
Read(A)
A=A-50
Write(A)
Read(B)
B=B+50
Write(B)
Schedule 3, a concurrent schedule that is equivalent to Schedule 1:
T1              T2
Read(A)
A=A-50
Write(A)
                Read(A)
                Temp=A*0.1
                A=A-temp
                Write(A)
Read(B)
B=B+50
Write(B)
                Read(B)
                B=B+temp
                Write(B)
The following concurrent schedule (Schedule 4 in the text) does not preserve the value of
the sum A + B.
Schedule 4
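The effect of interleaving on the sum A + B can be checked with a small simulation. This is a sketch under stated assumptions: the starting balances (A = 1000, B = 2000) are invented, 10% is computed with integer division, and the last interleaving is a hypothetical lost-update ordering written for this example, since the Schedule 4 figure itself is not reproduced in these notes.

```python
def execute(schedule, a=1000, b=2000):
    """Run (transaction, action, argument) steps against a two-item database."""
    db = {"A": a, "B": b}
    local = {"T1": {}, "T2": {}}        # each transaction's private buffer
    for txn, action, arg in schedule:
        buf = local[txn]
        if action == "read":
            buf[arg] = db[arg]
        elif action == "write":
            db[arg] = buf[arg]
        else:                           # "compute": arg is a function on the buffer
            arg(buf)
    return db


def t1_debit(buf):
    buf["A"] -= 50


def t1_credit(buf):
    buf["B"] += 50


def t2_debit(buf):
    buf["temp"] = buf["A"] // 10        # 10% of A, assuming A is a multiple of 10
    buf["A"] -= buf["temp"]


def t2_credit(buf):
    buf["B"] += buf["temp"]


T1 = [("T1", "read", "A"), ("T1", "compute", t1_debit), ("T1", "write", "A"),
      ("T1", "read", "B"), ("T1", "compute", t1_credit), ("T1", "write", "B")]
T2 = [("T2", "read", "A"), ("T2", "compute", t2_debit), ("T2", "write", "A"),
      ("T2", "read", "B"), ("T2", "compute", t2_credit), ("T2", "write", "B")]

schedule_1 = T1 + T2                               # serial: T1 then T2
schedule_3 = T1[:3] + T2[:3] + T1[3:] + T2[3:]     # interleaved, same effect
lost_update = T1[:2] + T2[:4] + T1[2:] + T2[4:]    # hypothetical bad interleaving

for name, s in [("schedule 1", schedule_1), ("schedule 3", schedule_3),
                ("lost update", lost_update)]:
    result = execute(s)
    print(name, result, "sum =", result["A"] + result["B"])
```

In the last interleaving T1 overwrites the value of A that T2 has just written, so T2's debit is lost while its credit to B survives, and the sum grows.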
Serializability
We ignore operations other than read and write instructions, and we assume that
transactions may perform arbitrary computations on data in local buffers in between
reads and writes. Our simplified schedules consist of only read and write
instructions.
Conflict Serializability
Instructions li and lj of transactions Ti and Tj respectively, conflict if and only if there
exists some item Q accessed by both li and lj, and at least one of these instructions
wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict
4. li = write(Q), lj = write(Q). They conflict
Intuitively, a conflict between li and lj forces a (logical) temporal order between
them. If li and lj are consecutive in a schedule and they do not conflict, their results
would remain the same even if they had been interchanged in the schedule.
T3              T4
Read(Q)
                Write(Q)
Write(Q)
We are unable to swap instructions in the above schedule to obtain either the serial
schedule < T3, T4 >, or the serial schedule < T4, T3 >.
T1              T2
Read(A)
A=A-50
Write(A)
                Read(A)
                Temp=A*0.1
                A=A-temp
                Write(A)
Read(B)
B=B+50
Write(B)
                Read(B)
                B=B+temp
                Write(B)
Schedule 3
Schedule 3 above can be transformed into Schedule 1, a serial schedule where T2
follows T1, by a series of swaps of non-conflicting instructions. Therefore Schedule 3
is conflict serializable.
T1              T2
Read(A)
A=A-50
Write(A)
                Read(A)
                Temp=A*0.1
                A=A-temp
Read(B)
                Write(A)
B=B+50
Write(B)
                Read(B)
                B=B+temp
                Write(B)
Schedule 3_a (after swapping Read(B) of T1 and Write(A) of T2)
T1              T2
Read(A)
A=A-50
Write(A)
Read(B)
                Read(A)
                Temp=A*0.1
                A=A-temp
                Write(A)
B=B+50
Write(B)
                Read(B)
                B=B+temp
                Write(B)
Schedule 3-b
View Serializability
Let S and S´ be two schedules with the same set of transactions. S and S´ are view
equivalent if the following three conditions are met:
1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S,
then transaction Ti must, in schedule S´, also read the initial value of Q.
2. For each data item Q if transaction Ti executes read(Q) in schedule S, and that
value was produced by transaction Tj (if any), then transaction Ti must in schedule S´
also read the value of Q that was produced by transaction Tj .
3. For each data item Q, the transaction (if any) that performs the final write(Q)
operation in schedule S must perform the final write(Q) operation in schedule S´.
As can be seen, view equivalence is also based purely on reads and writes alone.
Every view serializable schedule that is not conflict serializable has blind writes.
Schedule 8 (from text) given below produces same outcome as the serial schedule <
T1, T5 >, yet is not conflict equivalent or view equivalent to it.
Determining such equivalence requires analysis of operations other than read and write
Recoverability
If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent
database state. Hence database must ensure that schedules are recoverable.
Recoverable schedules are desirable because failure of a transaction might otherwise
bring the system into an irreversibly inconsistent state. Non recoverable schedules may
sometimes be needed when updates must be made visible early due to time constraints,
even if they have not yet been committed, which may be required for very long duration
transactions.
If T10 fails, T11 and T12 must also be rolled back. Can lead to the undoing of a
significant amount of work
Strict
A schedule is strict if for any two transactions T1, T2, if a write operation of T1 precedes
a conflicting operation of T2 (either read or write), then the commit event of T1 also
precedes that conflicting operation of T2.
Implementation of Isolation
Schedules must be conflict or view serializable, and recoverable, for the sake of
database consistency, and preferably cascadeless.
A policy in which only one transaction can execute at a time generates serial
schedules, but provides a poor degree of concurrency.
Concurrency-control schemes trade off the amount of concurrency they allow
and the amount of overhead that they incur.
Some schemes allow only conflict-serializable schedules to be generated, while
others allow view-serializable schedules that are not conflict-serializable
Serial Schedule
T1              T2
R(X)
X=X-N
W(X)
R(Y)
Y=Y+N
W(Y)
                R(X)
                X=X+M
                W(X)
Schedule A
T1              T2
                R(X)
                X=X+M
                W(X)
R(X)
X=X-N
W(X)
R(Y)
Y=Y+N
W(Y)
Schedule B
T1              T2
R(X)
X=X-N
                R(X)
                X=X+M
W(X)
R(Y)
                W(X)
Y=Y+N
W(Y)
Schedule C
T1              T2
R(X)
X=X-N
W(X)
                R(X)
                X=X+M
                W(X)
R(Y)
Y=Y+N
W(Y)
Schedule D
Example
(a) Schedule A: serial schedule, T1 followed by T2
(b) Schedule B: serial schedule, T2 followed by T1
(c) Schedule C: non-serial schedule
(d) Schedule D: non-serial schedule
Precedence graph for Schedule A: T1 → T2
Precedence graph for Schedule B: T2 → T1
Precedence graph for Schedule C (not serializable): a cycle exists, with edges T1 → T2 and T2 → T1
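The precedence-graph test can be sketched directly from the conflict rules: add an edge Ti → Tj whenever an operation of Ti precedes a conflicting operation of Tj, then look for a cycle. The encoding of Schedules A to D as (transaction, operation, item) triples is an assumption of this sketch; local computations such as X=X-N are omitted because they do not touch the database.

```python
def precedence_edges(schedule):
    """Edges Ti -> Tj for each pair of conflicting operations, Ti's first."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def dfs(node):
        visiting.add(node)
        for nxt in graph[node]:
            if nxt in visiting or (nxt not in done and dfs(nxt)):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(node not in done and dfs(node) for node in graph)


def conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


schedule_a = [("T1", "R", "X"), ("T1", "W", "X"), ("T1", "R", "Y"),
              ("T1", "W", "Y"), ("T2", "R", "X"), ("T2", "W", "X")]
schedule_b = schedule_a[4:] + schedule_a[:4]
schedule_c = [("T1", "R", "X"), ("T2", "R", "X"), ("T1", "W", "X"),
              ("T1", "R", "Y"), ("T2", "W", "X"), ("T1", "W", "Y")]
schedule_d = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "X"),
              ("T2", "W", "X"), ("T1", "R", "Y"), ("T1", "W", "Y")]

for name, s in [("A", schedule_a), ("B", schedule_b),
                ("C", schedule_c), ("D", schedule_d)]:
    print(name, sorted(precedence_edges(s)), conflict_serializable(s))
```

Schedule D is not serial, yet its graph has the single edge T1 → T2, so it is conflict serializable and equivalent to Schedule A.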
T1 T2 T3
Read(Z)
Read(Y)
Write(Y)
Read(Y)
Read(Z)
Read(X)
Write(X)
Write(Y)
Write(Z)
Read(X)
Read(Y)
Write(Y)
Schedule E
T1 T2
T3
Lock-Compatibility Matrix
        S       X
S       True    False
X       False   False
T1: Lock-X(B)
    Read(B)
    B=B-50
    Write(B)
    Unlock(B)
    Lock-X(A)
    Read(A)
    A=A+50
    Write(A)
    Unlock(A)
T2: Lock-S(A)
    Read(A)
    Unlock(A)
    Lock-S(B)
    Read(B)
    Unlock(B)
    Display(A+B)
Figure 16.3 Transaction T2
T1              T2              Concurrency-Control Manager
Lock-X(B)
                                Grant-X(B,T1)
Read(B)
B=B-50
Write(B)
Unlock(B)
                Lock-S(A)
                                Grant-S(A,T2)
                Read(A)
                Unlock(A)
                Lock-S(B)
                                Grant-S(B,T2)
                Read(B)
                Unlock(B)
                Display(A+B)
Lock-X(A)
                                Grant-X(A,T1)
Read(A)
A=A+50
Write(A)
Unlock(A)
Figure 16.4 Schedule 1
T3: Lock-X(B)
    Read(B)
    B=B-50
    Write(B)
    Lock-X(A)
    Read(A)
    A=A+50
    Write(A)
    Unlock(B)
    Unlock(A)
T4: Lock-S(A)
    Read(A)
    Lock-S(B)
    Read(B)
    Display(A+B)
    Unlock(A)
    Unlock(B)
Figure 16.6 Transaction T4
T3              T4
Lock-X(B)
Read(B)
B=B-50
Write(B)
                Lock-S(A)
                Read(A)
                Lock-S(B)
Lock-X(A)
Figure 16.7 Schedule 2 (Deadlock)
Neither T3 nor T4 can make progress — executing lock-S(B) causes T4 to wait for T3 to
release its lock on B, while executing lock-X(A) causes T3 to wait for T4 to release its
lock on A.
Such a situation is called a Deadlock
o To handle a deadlock one of T3 or T4 must be rolled back
The potential for deadlock exists in most locking protocols. Deadlocks are a
necessary evil.
T5              T6              T7
Lock-X(A)
Read(A)
Lock-S(B)
Read(B)
Write(A)
Unlock(A)
                Lock-X(A)
                Read(A)
                Write(A)
                Unlock(A)
                                Lock-S(A)
                                Read(A)
Figure 16.8 Partial schedule under two-phase locking
Rigorous two-phase locking is even stricter: here all locks are held till
commit/abort. In this protocol transactions can be serialized in the order in which they
commit.
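Whether a transaction is two-phase can be checked mechanically: once it has released any lock it must not acquire another. The sketch below applies that standard definition to the lock sequences of T1 and T3 given earlier; representing operations as plain strings is an assumption made for brevity.

```python
def is_two_phase(operations):
    """True if no lock (or upgrade) request follows the first unlock."""
    unlocked = False
    for op in operations:
        name = op.lower()
        if name.startswith("unlock"):
            unlocked = True                       # shrinking phase has begun
        elif name.startswith(("lock", "upgrade")) and unlocked:
            return False                          # lock acquired after an unlock
    return True


T1 = ["Lock-X(B)", "Read(B)", "Write(B)", "Unlock(B)",
      "Lock-X(A)", "Read(A)", "Write(A)", "Unlock(A)"]
T3 = ["Lock-X(B)", "Read(B)", "Write(B)",
      "Lock-X(A)", "Read(A)", "Write(A)", "Unlock(B)", "Unlock(A)"]

print(is_two_phase(T1), is_two_phase(T3))
```

T1 unlocks B before locking A, so it is not two-phase; T3 performs the same work with the unlocks delayed to the end, which satisfies the rule. Checking the stricter rigorous variant would additionally require that every unlock comes after commit or abort.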
T8: Read(a1)
    Read(a2)
    Read(a3)
    ….
    ….
    Read(an)
    Write(a1)
T9: Read(a1)
    Read(a2)
    Display(a1+a2)
T8              T9
Lock-S(a1)
                Lock-S(a1)
Lock-S(a2)
                Lock-S(a2)
Lock-S(a3)
Lock-S(a4)
                Unlock(a1)
                Unlock(a2)
Lock-S(an)
Upgrade(a1)
Figure 16.9 Incomplete Schedule with a lock conversion
Implementation of Locking
Lock Table
Black rectangles indicate granted locks, white ones indicate waiting requests
Lock table also records the type of lock granted or requested
New request is added to the end of the queue of requests for the data item, and granted if it is
compatible with all earlier locks
Unlock requests result in the request being deleted, and later requests are checked to see if they
can now be granted
If transaction aborts, all waiting or granted requests of the transaction are deleted
lock manager may keep a list of locks held by each transaction, to implement this efficiently
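A minimal in-memory sketch of such a lock table follows. The class and method names are invented for illustration, the per-transaction list of held locks is left out, and a real lock manager would also block and wake up waiting transactions rather than just record a flag.

```python
from collections import defaultdict

# Lock-compatibility matrix: only a shared lock is compatible with a shared lock.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}


class LockTable:
    def __init__(self):
        # data item -> queue of [transaction, mode, granted] in arrival order
        self.queues = defaultdict(list)

    def request(self, txn, item, mode):
        """Append the request; grant it if compatible with all earlier requests."""
        queue = self.queues[item]
        granted = all(COMPATIBLE[(m, mode)] for _, m, _ in queue)
        queue.append([txn, mode, granted])
        return granted

    def unlock(self, txn, item):
        """Delete txn's request on item, then re-check the later requests."""
        queue = [e for e in self.queues[item] if e[0] != txn]
        self.queues[item] = queue
        earlier = []
        for entry in queue:
            if not entry[2] and all(COMPATIBLE[(m, entry[1])] for m in earlier):
                entry[2] = True
            earlier.append(entry[1])

    def abort(self, txn):
        """Delete all waiting or granted requests of txn."""
        for item in list(self.queues):
            self.unlock(txn, item)

    def holders(self, item):
        return [txn for txn, _, granted in self.queues[item] if granted]


table = LockTable()
print(table.request("T3", "B", "X"))   # granted: queue for B was empty
print(table.request("T4", "B", "S"))   # must wait behind the exclusive lock
table.unlock("T3", "B")
print(table.holders("B"))
```

Checking a new request against every earlier entry, including waiting ones, keeps a stream of shared requests from overtaking a waiting exclusive request.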
Tree Protocol
The tree protocol ensures conflict serializability as well as freedom from deadlock.
Unlocking may occur earlier in the tree-locking protocol than in the two-phase
locking protocol.
o shorter waiting times, and increase in concurrency
o protocol is deadlock-free, no rollbacks are required
o the abort of a transaction can still lead to cascading rollbacks.
(this correction has to be made in the book also.)
However, in the tree-locking protocol, a transaction may have to lock data items that
it does not access.
o increased locking overhead, and additional waiting time
o potential decrease in concurrency
Schedules not possible under two-phase locking are possible under tree protocol, and
vice versa.
Each transaction is issued a timestamp when it enters the system. If an old transaction
Ti has time-stamp TS(Ti), a new transaction Tj is assigned time-stamp TS(Tj) such
that TS(Ti) <TS(Tj).
The protocol manages concurrent execution such that the time-stamps determine the
serializability order.
In order to assure such behavior, the protocol maintains for each data Q two
timestamp values:
o W-timestamp(Q) is the largest time-stamp of any transaction that
executed write(Q) successfully.
o R-timestamp(Q) is the largest time-stamp of any transaction that executed
read(Q) successfully.
The timestamp ordering protocol ensures that any conflicting read and write
operations are executed in timestamp order.
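1. Suppose a transaction Ti issues a read(Q).
a. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already
overwritten. Hence, the read operation is rejected, and Ti is rolled back.
b. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and
R-timestamp(Q) is set to the maximum of R-timestamp(Q) and TS(Ti).
2. Suppose that transaction Ti issues write(Q).
a. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed
previously, and the system assumed that that value would never be produced. Hence,
the write operation is rejected, and Ti is rolled back.
b. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q.
Hence, this write operation is rejected, and Ti is rolled back.
c. Otherwise, the write operation is executed, and W-timestamp(Q) is set to TS(Ti).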
T14: Read(B)
     Read(A)
     Display(A+B)
T15: Read(B)
     B=B-50
     Write(B)
     Read(A)
     A=A+50
     Write(A)
     Display(A+B)
T14             T15
Read(B)
                Read(B)
                B=B-50
                Write(B)
Read(A)
                Read(A)
Display(A+B)
                A=A+50
                Write(A)
                Display(A+B)
Figure 16.13 Schedule 3
In Schedule 3 of Figure 16.13, TS(T14) < TS(T15)
Example
Consider the following schedule consisting of three transactions T1, T2, and T3:
S: r3(Z), w3(Z), r1(X), r2(Y), w2(Y), w1(X), r1(Y), r3(X), c1, c2, c3
Assume that transactions have timestamp values:
TS(T1)= 2 TS(T2)= 1 TS(T3)= 3
Question:
1. Rewrite S using Timestamp ordering algorithm
2. Rewrite S using Timestamp ordering algorithm with the following timestamps
TS(T1)=3, TS(T2)=2, TS(T3)=1
Show the point where some transaction gets aborted
3. Rewrite 2) with multiversion Timestamp ordering algorithm
The initial values of the items are X0,Y0,Z0
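Parts 1 and 2 of this exercise can be checked with a small simulation of the basic timestamp-ordering tests. This is a simplified sketch, not a full protocol: it skips the remaining operations of a rolled-back transaction instead of undoing its writes and restarting it with a new timestamp, it ignores the commit operations, and it assumes that every item starts with read and write timestamps of 0.

```python
def timestamp_ordering(schedule, ts):
    """Apply the basic timestamp-ordering tests to (txn, op, item) steps.

    Returns a log of (txn, op, item, outcome) and the set of rolled-back
    transactions.
    """
    r_ts, w_ts = {}, {}          # R-timestamp(Q) and W-timestamp(Q), default 0
    aborted, log = set(), []
    for txn, op, item in schedule:
        if txn in aborted:
            continue             # simplification: no restart of the transaction
        t = ts[txn]
        if op == "r":
            ok = t >= w_ts.get(item, 0)
            if ok:
                r_ts[item] = max(r_ts.get(item, 0), t)
        else:
            ok = t >= r_ts.get(item, 0) and t >= w_ts.get(item, 0)
            if ok:
                w_ts[item] = t
        if not ok:
            aborted.add(txn)
        log.append((txn, op, item, "ok" if ok else "rollback"))
    return log, aborted


# S: r3(Z), w3(Z), r1(X), r2(Y), w2(Y), w1(X), r1(Y), r3(X)
S = [("T3", "r", "Z"), ("T3", "w", "Z"), ("T1", "r", "X"), ("T2", "r", "Y"),
     ("T2", "w", "Y"), ("T1", "w", "X"), ("T1", "r", "Y"), ("T3", "r", "X")]

for stamps in ({"T1": 2, "T2": 1, "T3": 3}, {"T1": 3, "T2": 2, "T3": 1}):
    log, aborted = timestamp_ordering(S, stamps)
    print(stamps, "rolled back:", sorted(aborted))
```

Under these simplifications the first assignment lets every operation through, while with TS(T3) = 1 the final r3(X) arrives after T1, which has timestamp 3, has already written X, so T3 is the transaction that gets rolled back.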
Validation-Based Protocol
T14             T15
Read(B)
                Read(B)
                B=B-50
                Read(A)
                A=A+50
Read(A)
<Validate>
Display(A+B)
                <Validate>
                Write(B)
                Write(A)
Multiple Granularity
If a transaction Ti needs to access the entire database and a locking protocol is used ,
then Ti must lock each item in the database . It is time consuming. On the other hand ,
if transaction Tj needs to access only a few data items, it should not be required to
lock the entire database , since otherwise concurrency is lost.
Solution
Multiple Granularity
Figure 16.16
The highest level in the example hierarchy is the entire database.
The levels below are of type area, file and record in that order.
Intention Lock Modes
In addition to S and X lock modes, there are three additional lock modes with
multiple granularity:
o intention-shared (IS): indicates explicit locking at a lower level of the tree
but only with shared locks.
o intention-exclusive (IX): indicates explicit locking at a lower level with
exclusive or shared locks
o shared and intention-exclusive (SIX): the subtree rooted by that node is
locked explicitly in shared mode and explicit locking is being done at a
lower level with exclusive-mode locks
Suppose the transaction Tj wishes to lock record rb6 of file Fb. Since Ti has locked Fb
explicitly, it follows that rb6 is also locked (implicitly). But when Tj issues a lock
request for rb6, rb6 is not explicitly locked! How does the system determine whether Tj
can lock rb6? Tj must traverse the tree from the root to record rb6.
        IS      IX      S       SIX     X
IS      True    True    True    True    False
IX      True    True    False   False   False
S       True    False   True    False   False
SIX     True    False   False   False   False
X       False   False   False   False   False
Multiple Granularity Locking Scheme
Transaction Ti can lock a node Q, using the following rules:
1. The lock compatibility matrix must be observed.
2. The root of the tree must be locked first, and may be locked in
any mode.
3. A node Q can be locked by Ti in S or IS mode only if the parent
of Q is currently locked by Ti in either IX or IS
mode.
4. A node Q can be locked by Ti in X, SIX, or IX mode only if the
parent of Q is currently locked by Ti in either IX
or SIX mode.
5. Ti can lock a node only if it has not previously unlocked any node
(that is, Ti is two-phase).
6. Ti can unlock a node Q only if none of the children of Q are
currently locked by Ti.
Observe that locks are acquired in root-to-leaf order,
whereas they are released in leaf-to-root order.
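The compatibility matrix and rules 3 and 4 can be encoded compactly. The sketch below is illustrative only: the function names are invented, and the node names "DB" and "A1" on the path to record rb6 are assumed labels for the database and area levels.

```python
# For each held mode, the set of modes another transaction may be granted.
COMPATIBLE_WITH = {
    "IS": {"IS", "IX", "S", "SIX"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "SIX": {"IS"},
    "X": set(),
}

# Modes the transaction itself must hold on the parent (rules 3 and 4).
PARENT_MODES = {
    "IS": {"IS", "IX"}, "S": {"IS", "IX"},
    "IX": {"IX", "SIX"}, "SIX": {"IX", "SIX"}, "X": {"IX", "SIX"},
}


def compatible(held, requested):
    return requested in COMPATIBLE_WITH[held]


def may_lock(mode, parent_mode):
    """parent_mode is the mode held on the parent, or None for the root."""
    return parent_mode is None or parent_mode in PARENT_MODES[mode]


def path_modes(path, mode):
    """Locks to take from root to target: intention locks, then the real lock."""
    intention = "IS" if mode in ("S", "IS") else "IX"
    return [(node, intention) for node in path[:-1]] + [(path[-1], mode)]


# Locking record rb6 of file Fb exclusively (path labels partly assumed).
print(path_modes(["DB", "A1", "Fb", "rb6"], "X"))
```

Taking the intention locks from the root downward is what lets another transaction discover, at the file or area level, that something beneath that node is already locked.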
Each data item Q has a sequence of versions <Q1, Q2,...., Qm>. Each version Qk contains
three data fields:
o Content -- the value of version Qk.
o W-timestamp(Qk) -- timestamp of the transaction that created (wrote)
version Qk
o R-timestamp(Qk) -- largest timestamp of a transaction that successfully
read version Qk
Deadlock Handling
T3              T4
Lock-X(B)
Read(B)
B=B-50
Write(B)
                Lock-S(A)
                Read(A)
                Lock-S(B)
Lock-X(A)
Figure 16.7 Schedule 2
System is deadlocked if there is a set of transactions such that every transaction in the
set is waiting for another transaction in the set.
Deadlock prevention protocols ensure that the system will never enter into a deadlock
state. Some prevention strategies :
o Require that each transaction locks all its data items before it begins
execution (predeclaration).
o Impose partial ordering of all data items and require that a transaction can
lock data items only in the order specified by the partial order (graph-
based protocol).
Following schemes use transaction timestamps for the sake of deadlock prevention
alone.
wait-die scheme — non-preemptive
o older transaction may wait for younger one to release data item. Younger
transactions never wait for older ones; they are rolled back instead.
o a transaction may die several times before acquiring needed data item
wound-wait scheme — preemptive
o older transaction wounds (forces rollback) of younger transaction instead
of waiting for it. Younger transactions may wait for older ones.
o may be fewer rollbacks than wait-die scheme.
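Both schemes reduce to a comparison of two timestamps when a transaction requests an item held by another. A minimal sketch, assuming that a smaller timestamp means an older transaction; the function names and returned strings are illustrative.

```python
def wait_die(ts_requester, ts_holder):
    """Non-preemptive: an older requester waits, a younger requester dies."""
    return "wait" if ts_requester < ts_holder else "rollback requester"


def wound_wait(ts_requester, ts_holder):
    """Preemptive: an older requester wounds the holder, a younger one waits."""
    return "rollback holder" if ts_requester < ts_holder else "wait"


# A transaction with timestamp 5 requests an item held by one with timestamp 10,
# then the reverse situation.
for requester, holder in [(5, 10), (10, 5)]:
    print(requester, holder, wait_die(requester, holder), wound_wait(requester, holder))
```

In both schemes it is always the younger transaction that is rolled back, so if it keeps its original timestamp on restart it eventually becomes the oldest and cannot starve.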
Deadlock Recovery
When deadlock is detected :
o Some transaction will have to be rolled back (made a victim) to break the
deadlock. Select as victim the transaction that will incur minimum cost.
o Rollback -- determine how far to roll back transaction
Total rollback: Abort the transaction and then restart it.
More effective to roll back transaction only as far as necessary to
break deadlock.
o Starvation happens if same transaction is always chosen as victim. Include
the number of rollbacks in the cost factor to avoid starvation
Dirty read
T16             T17
Read(Q)
                Write(Q)
Write(Q)
Figure 16.14 Schedule 4
T31:
Read(A)
Read(B)
If A=0 then B=B+1
Write(B)
T32:
Read(B)
Read(A)
If B=0 then A=A+1
Write(A)
Add lock and unlock instructions to transactions T31 and T32, so that they observe
the two phase locking protocol. Can the execution of these transactions result in a
deadlock?
Solution
T31:
Lock-S(A)
Read(A)
Lock-X(B)
Read(B)
If A=0 then B=B+1
Write(B)
Unlock(A)
Unlock(B)
T32:
Lock-S(B)
Read(B)
Lock-X(A)
Read(A)
If B=0 then A=A+1
Write(A)
Unlock(B)
Unlock(A)
Deadlock
Yes. In the following partial schedule each transaction waits for a lock held by the other:
T31             T32
Lock-S(A)
                Lock-S(B)
                Read(B)
Read(A)
Lock-X(B)
                Lock-X(A)
References
QUESTION BANK
Q1 What are the main differences between a file-processing system and a database
management system?
Q2 What do you mean by data independence? Explain the differences between
physical and logical data independences.
Q3 What are the advantages and disadvantages of using a DBMS.
Q4 Briefly explain the different views of database.
Q5 Explain three-level schema architecture?
Q6 Define database management system, data model, schema, DDL, DML.
Q7 Explain the different DBMS languages.
Q8 Explain the component modules of a DBMS.
Q9 Explain different types of data models.
Q10 Explain the main phases of database design.
Q11 What are the main functions of a database administrator(DBA)?
Q12 List the responsibilities of a database manager?
Q13 Briefly explain the different types of users of database management system.
Q14 Explain the following terms:
a. Composite and Simple attributes
b. Single values and multi-valued attributes
c. Derived attributes
Q15 Explain the following terms:
a. Primary key, candidate key and Super key
b. instance
c. Null value
Q16 Explain different symbols of E-R diagram
Q17 Explain the following
a. mapping constraint
b. Entity , attribute and relationship set
Q18 Explain the difference between weak and strong entity with the help of diagram.
Q19 Explain the concept of generalization and specialization with the help of an
example.
Q21 How can database design be performed from an E-R diagram containing many-to-
many relationships, weak and strong entity sets, generalization and specialization?
Q22 Explain Design Constraints on a Specialization and Generalization
Q23 Construct an ER diagram for a car insurance company with a set of customers
each of which owns a number of cars. Each car has a number of recorded
accidents associated with it.
Q24 Construct an ER diagram for a hospital with a set of patients and a set of medical
doctors. With each patient a log of various conducted tests is also associated
Q25 Explain what is meant by repetition of information and inability to represent
information. Explain why each of these properties may indicate a bad relational
database design.
Q26 Define relations.What are five basic relational algebraic operators?
Q27 What is relational algebra? Explain different type of joins operation with
Examples
Q28 How can database design be performed from an E-R diagram containing many-to-
many relationships, weak and strong entity sets, generalization and specialization?
Q29 Explain the UNION, SET DIFFERENCE and SET INTERSECTION operators
with suitable example.
Q30 Consider the following relations and write the relational-algebra expression
equivalent to the following queries:
BRANCH (Branch_Name, Branch_City, Assets)
CUSTOMER (Customer_Name, Customer_Street, Customer_City)
ACCOUNT (Account_No., Branch_Name, Balance)
LOAN (Loan_No., Branch_Name, Amount)
BORROWER (Customer_Name, Loan_No.)
DEPOSITOR (Customer_Name, Account_No.)
a) Find the names of all customers, who live in “MUSCAT”.
b) Find the names of all customers, who have a loan at the “MUSCAT” branch.
c) Find the names of all customers along with their loan numbers, who have a
loan at the bank.
d) Find the names of all customers, who have either a loan or an account at the
bank.
e) Find the names of all branches with customers who have an account in the
bank and who live in “MUSCAT”.
A B C
a1 b1 c1
a1 b1 c2
a2 b1 c1
a2 b1 c2
Q33 Compute the closure of the following set F of functional dependencies for relation
schema R = (A,B,C,D,E)
A →BC
CD →E
B →D
E →A
List the candidate keys for R.
Q34 How do you find Closure of attribute sets? Explain with the help of example
Q35 Explain Functional Dependencies with the help of example.
Q36 Explain canonical cover with the help of example.
Q37 Explain the concept of Normalization briefly with the help of Example.
Q38. Discuss Armstrong’s Axiom’s. Give examples for each rule
Q39 Explain Codd’s Rules.
Q40. Show that every BCNF schema is in 3NF but 3NF is not in BCNF.
1) Sales_order
Column Name     Data type   Size   Attribute
Order_no        Varchar2    6      Primary key / first letter must start with ‘O’
Order_date      Date
Client_no       Varchar2    6
Dely_addr       Varchar2    25
Salesman_no     Varchar2    6
Dely_type       Char        1
Billed_yn       Char        1
Dely_date       Date
Order_status    Varchar2    10     Values in (‘In Process’, ‘Fullfilled’, ‘Back order’, ‘Cancelled’)
2) Employees
Column Name     Data type   Size   Attribute
Emp_name        Varchar2    32     Primary key
Street          Varchar2    15
City            Varchar2    15
3) Works
Column Name     Data type   Size   Attribute
Emp_name        Varchar2    32     Primary key
Company_name    Varchar2    32
Salary          Number
4) Company
Column Name     Data type   Size   Attribute
Company_name    Varchar2    32     Primary key
City            Varchar2    15
5) Manages
Column Name     Data type   Size   Attribute
Emp_name        Varchar2    32     Primary key
Manager_name    Varchar2    32
Q43. What do you mean by Application Security? Explain Data Encryption and Data
Decryption technique in Database.
Q44. Explain the following
a) Digital certificates
b) Triggers
c) Privileges with Grant option
Q45. What is statistical database? What are the techniques used to prevent users
from using some methods for accessing private information?
Q61 List the cost function for the select and join method.
These two styles, used alone, have in common that both the data
employed and the results produced do not survive the run of the program. When
the program is next run, the data must be obtained again (even if it is the same
data, or only slightly changed), and the results cannot be fed forward to become
the input for some other program. In order to work around these obstacles, a
means of storing data outside programs is required. Files serve not only this
purpose, but also provide a way of storing very large data collections, for which
individual entry to every program would be impractical. Indeed, it is often the
case in such instances that the data is the central theme to an entire symphony
of programs operating on it, and that no one of the programs in the collection is
nearly as important as the data itself.
A file resembles a book. Its structure (plot) is created in the mind of an
author and it must be written (encoded) on some medium. Once this has been
done, others can read it. However, in order to read it intelligibly, they must
follow the structure created by the author.
A file is a source or a sink for a collection of data.
Just as data must be structured or arranged in such a way as to represent
some real life problem, so also files must be structured (the plot, again) so as to
represent the data they are intended to store. There are as many ways to do this
as there are programmers, computers, operating systems, programs, and
problems. The definition of a file has been expressed in a broad and general
form for this very reason--the meaning must cover a lot of ground. In fact, by
this definition, the batched data within a program and the data input
interactively at a keyboard by the user of a program are both files--at least
conceptually.
At the highest and most abstract level, a book can be thought of in terms
of its plot, characters, and events. At a middle level, the book is perhaps
structured by named chapters. On a more detailed level, the book is a collection
of words on a page. That is, there are degrees of abstraction to the
concept of a book, as there are for many other commonly used ideas. This
observation leads to the first classification of files, by the degree of their
abstraction:
them. Thus, there have been many attempts to provide universal file handling
interfaces, and these differ widely. Indeed, perhaps the most troublesome area
for both the designers of a computing notation and for those who program
using it is how to deal with input and output, as they are on the one hand
essential to any substantial programming activity, and on the other closely tied
to a particular system.
The problem for a language designer is the necessity to find common
ground ahead of time for all possible external devices and operating systems so
that I/O routines can be universally applicable. In Modula-2, this problem has
been partially avoided by placing such matters outside the purview of the
language proper, and by assuming instead that any device needing
communications links with a program will have these facilities supplied in a
particular implementation by appropriate library modules.
This keeps the Modula-2 notation itself small and versatile, but it meant
somewhat more work for the programmer, who often had a large number of
library routines to keep straight--especially when using more than one
version, for then there was no guarantee that such libraries would correspond.
In spite of this deliberate lack of pre-specification by Wirth (he required
no particular I/O routines or modules, and only made a few suggestions of
modules he had found generally useful), much can still be said about such
functions. Although operating systems differ widely, there are many things that
they do have in common. Moreover, there are not many kinds of logical file in
common use, even though the recordings of such files may be very different. As
a result, many vendors of Modula-2 products produced very similar libraries for
I/O and, to some extent, this tended to create a de facto or marketplace standard.
One section of this chapter is devoted to outlining the most common I/O and
file handling routines used by commercial vendors in the years when no official
standards existed.
Even the ubiquitous classical high-level modules InOut and RealInOut
have many variations, however, and lower-level modules often have more
differences than similarities between implementations. This was one of the
major reasons for convening a working group (WG13) of the International
Organization for Standardization (ISO) in April of 1987 to produce a standard
definition of both Modula-2 and its libraries. This standard will be the focus
for most of the rest of this chapter and will be used subsequently when a
sample solution happens to call for the use of files.
Before looking at the specifics of handling the program/logical file
communication, however, some additional attention needs to be paid to the
logical view of data storage.