Unit-6 Note
For Example:
Consider the relational schema:
Branch_loan = ( branch_name, branch_city, assets, customer_name, loan_no, amount)
Redundancy:
o Data for branch_name, branch_city and assets are repeated for each loan that the bank provides.
o Storing the same information several times wastes storage space and time.
o Data redundancy leads to problems in insertion, deletion and update.
Insertion Problem:
We cannot store information about a branch unless it has at least one loan. We could use null values for the loan
information, but nulls are difficult to handle.
Deletion Problem:
In this example, if we delete the information of Manoj (i.e. DELETE FROM branch_loan WHERE customer_name =
'Manoj';) and his loan is the only row stored for the Pokhara branch, we can no longer obtain the information of the
Pokhara branch (i.e. we no longer know the branch city of the Pokhara branch, its total assets, etc.).
Update problem:
Since data are duplicated, multiple copies of the same fact must be updated whenever one of them changes. This increases
the possibility of data inconsistency: when we update one copy, it is possible that only some of the copies are
updated but not all, which leaves the database in an inconsistent state.
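The update and deletion problems can be reproduced with a small in-memory database. The following is a minimal sketch using Python's sqlite3 module. The sample rows, loan numbers and amounts are hypothetical; only the customer 'Manoj' and the Pokhara branch come from the example above.

```python
import sqlite3

# Hypothetical sample rows; only 'Manoj' and the Pokhara branch come from the notes.
rows = [
    ("Pokhara", "Pokhara", 5000000, "Manoj", "L-10", 20000),
    ("Kathmandu", "Kathmandu", 9000000, "Sita", "L-11", 15000),
    ("Kathmandu", "Kathmandu", 9000000, "Hari", "L-12", 30000),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE branch_loan ("
    "branch_name TEXT, branch_city TEXT, assets INTEGER, "
    "customer_name TEXT, loan_no TEXT, amount INTEGER)"
)
conn.executemany("INSERT INTO branch_loan VALUES (?, ?, ?, ?, ?, ?)", rows)

# Update problem: the assets are changed in only one of the two Kathmandu rows.
conn.execute("UPDATE branch_loan SET assets = 9500000 WHERE loan_no = 'L-11'")
kathmandu_assets = conn.execute(
    "SELECT DISTINCT assets FROM branch_loan "
    "WHERE branch_name = 'Kathmandu' ORDER BY assets"
).fetchall()
print(kathmandu_assets)  # two different asset values for the same branch

# Deletion problem: removing Manoj's loan also removes all Pokhara branch data.
conn.execute("DELETE FROM branch_loan WHERE customer_name = 'Manoj'")
pokhara_rows = conn.execute(
    "SELECT branch_city, assets FROM branch_loan WHERE branch_name = 'Pokhara'"
).fetchall()
print(pokhara_rows)  # empty: the branch city and assets are no longer stored
```

After the partial update the Kathmandu branch has two conflicting asset values, and after the delete no row describes the Pokhara branch at all.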
6.2 Functional dependency
A functional dependency (FD) is a constraint between two sets of attributes in a relation. It says that if
two tuples have the same values for attributes A1, A2, …, An, then those two tuples must also have the same values for
attributes B1, B2, …, Bn.
A functional dependency is represented by an arrow sign (→), that is, X → Y, where X
functionally determines Y. The left-hand side attributes determine the values of the attributes on the right-hand side.
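The definition can be checked mechanically on a given set of tuples. The sketch below is one possible way to do it; the function name fd_holds and the sample rows are illustrative assumptions, not part of the notes.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # setdefault stores the first value seen for this key and returns it;
        # a later, different value for the same key violates the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical rows over part of the Branch_loan schema.
sample = [
    {"branch_name": "Pokhara", "branch_city": "Pokhara", "loan_no": "L-10"},
    {"branch_name": "Kathmandu", "branch_city": "Kathmandu", "loan_no": "L-11"},
    {"branch_name": "Kathmandu", "branch_city": "Kathmandu", "loan_no": "L-12"},
]

print(fd_holds(sample, ["branch_name"], ["branch_city"]))  # True
print(fd_holds(sample, ["branch_name"], ["loan_no"]))      # False
```

Note that such a check can only refute an FD. An FD is a constraint on every legal instance of the relation, so passing the check on one sample does not prove that the FD holds in general.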
Armstrong's Axioms:
If F is a set of functional dependencies, then the closure of F, denoted F+, is the set of all functional dependencies
logically implied by F. Armstrong's Axioms are a set of rules that, when applied repeatedly, generate the closure of a
set of functional dependencies.
1. Reflexive rule: If alpha is a set of attributes and beta is a subset of alpha, then alpha → beta holds.
2. Augmentation rule: If a → b holds and y is a set of attributes, then ay → by also holds.
3. Transitivity rule: Same as the transitive rule in algebra: if a → b holds and b → c holds, then a → c also holds. a → b is
read as "a functionally determines b".
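In practice the full closure of F is rarely enumerated. A common shortcut is to compute the closure of a set of attributes X, that is, every attribute that X determines under F; X → Y is implied by F exactly when Y is contained in that closure. The sketch below shows this standard attribute-closure loop. The function name and the FDs chosen for Branch_loan are assumptions for illustration and are not stated in the notes.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds, given as a list of (lhs, rhs) pairs of sets."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the closure already contains the left side, it also
            # determines the right side, so add any missing attributes.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


# Assumed FDs for the Branch_loan schema (not given in the notes).
fds = [
    ({"branch_name"}, {"branch_city", "assets"}),
    ({"loan_no"}, {"amount", "branch_name"}),
]

print(sorted(attribute_closure({"loan_no"}, fds)))
# ['amount', 'assets', 'branch_city', 'branch_name', 'loan_no']
```

Under these assumed FDs, loan_no → branch_city is implied (via loan_no → branch_name and branch_name → branch_city), while branch_name → amount is not.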
Trivial Functional Dependency
1. Trivial: If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a trivial FD. Trivial
FDs always hold.
2. Non-trivial: If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial FD.
3. Completely non-trivial: If an FD X → Y holds, where X and Y have no attributes in common (X ∩ Y = ∅), then it is called
a completely non-trivial FD.
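The three cases can be told apart with simple set operations. The following sketch assumes the usual definition that an FD is completely non-trivial when its two sides share no attributes; the function name classify_fd is hypothetical.

```python
def classify_fd(lhs, rhs):
    """Classify the FD lhs -> rhs by comparing its two attribute sets."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                 # Y is a subset of X
    if lhs & rhs:
        return "non-trivial"             # Y is not a subset of X, but they overlap
    return "completely non-trivial"      # X and Y share no attributes


print(classify_fd({"A", "B"}, {"A"}))       # trivial
print(classify_fd({"A", "B"}, {"B", "C"}))  # non-trivial
print(classify_fd({"A"}, {"B"}))            # completely non-trivial
```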
Normalization, its need and objectives
Database normalization is the process of organizing data into tables in such a way that the results of using the database
are always unambiguous and as intended. Such normalization is intrinsic to relational database theory. It may have the
effect of duplicating data within the database and often results in the creation of additional tables.
The concept of database normalization is generally traced back to E.F. Codd, an IBM researcher who, in 1970, published
a paper describing the relational database model. What Codd describes as "a normal form for database relations" was an
essential element of the relational technique. While data normalization rules tend to increase the duplication of data, they
do not introduce data redundancy, which is unnecessary duplication. Database normalization is typically a refinement
process after the initial exercise of identifying the data objects that should be in the relational database, identifying their
relationships and defining the tables required and the columns within each table.
Objectives of Normalization:
A basic objective of the first normal form defined by Codd in 1970 was to permit data to be queried and manipulated using
a "universal data sub-language" grounded in first-order logic. The objectives of normalization beyond 1NF (First Normal
Form) were stated as follows by Codd:
1. To free the collection of relations from undesirable insertion, update and deletion dependencies;
2. To reduce the need for restructuring the collection of relations, as new types of data are introduced, and thus increase
the life span of application programs;
3. To make the relational model more informative to users;
4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time
goes by.
The following example illustrates the first of these objectives:
Free the database of modification anomalies:
Employee Skill
Employee ID   Employee Address         Skill
426           Chhoprak-7, Gorkha       Typing
426           Chhoprak-7, Gorkha       Shorthand
519           Bharatpur-10, Chitwan    Public Speaking
519           Bharatpur-12, Chitwan    Carpentry
An update anomaly. Employee 519 is shown as having different addresses on different records.
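An anomaly of this kind can be detected by grouping the rows by Employee ID and looking for IDs with more than one address. The sketch below uses the four rows of the Employee Skill table; the variable names are illustrative.

```python
from collections import defaultdict

# Rows copied from the Employee Skill table.
employee_skill = [
    (426, "Chhoprak-7, Gorkha", "Typing"),
    (426, "Chhoprak-7, Gorkha", "Shorthand"),
    (519, "Bharatpur-10, Chitwan", "Public Speaking"),
    (519, "Bharatpur-12, Chitwan", "Carpentry"),
]

# Collect the distinct addresses recorded for each employee.
addresses = defaultdict(set)
for emp_id, address, _skill in employee_skill:
    addresses[emp_id].add(address)

# An employee with more than one recorded address is inconsistent.
inconsistent = {e: sorted(a) for e, a in addresses.items() if len(a) > 1}
print(inconsistent)
# {519: ['Bharatpur-10, Chitwan', 'Bharatpur-12, Chitwan']}
```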
Normal Form:
Student Age
Adam 15
Alex 14
Stuart 17
In the Student table the candidate key is the Student column, because the only other column, Age, depends on it.
The new Subject table introduced for 2NF will be:
Student Subject
Adam Biology
Adam Maths
Alex Maths
Stuart Maths
In the Subject table the candidate key is the pair of columns {Student, Subject}. Both of the above tables now qualify
for Second Normal Form and avoid the update anomalies caused by partial dependencies. There are, however, a few complex
cases in which a table in Second Normal Form still suffers from update anomalies; Third Normal Form handles those cases.
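The 2NF condition can be sketched as a check for partial dependencies, that is, FDs whose left side is only part of the candidate key. The example assumes that the table before the split had the columns (Student, Age, Subject) with candidate key {Student, Subject} and the FD Student → Age; that unsplit table is an assumption and is not shown in these notes. The function name is hypothetical, and the check is simplified to a relation with a single candidate key.

```python
def partial_dependencies(candidate_key, fds):
    """Return the FDs that violate 2NF for a relation with one candidate key.

    An FD violates 2NF here if its left side is a proper subset of the key
    and its right side contains at least one attribute outside the key.
    """
    key = set(candidate_key)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if set(lhs) < key and set(rhs) - key
    ]


# Assumed unsplit table: key {Student, Subject}, FD Student -> Age.
before_split = partial_dependencies({"Student", "Subject"}, [({"Student"}, {"Age"})])
print(before_split)  # [({'Student'}, {'Age'})]

# After the split: Student(Student, Age) and Subject(Student, Subject).
student_table = partial_dependencies({"Student"}, [({"Student"}, {"Age"})])
subject_table = partial_dependencies({"Student", "Subject"}, [])
print(student_table, subject_table)  # [] []
```

In the assumed unsplit table, Age depends on only part of the key, so it is repeated for every subject a student takes. After the split neither table has a partial dependency.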
Student_Detail Table:
Solution:
First Normal Form (1NF):
In the above table, the field Tphone holds multiple values in one cell. In this case, the primary key is RN. With a design
like this, we can have insert, update, delete and select anomalies. Because the table stores more than one value in a single
cell, it should be corrected as:
Student1 table:
RN DOB
1 2051/11/12
2 2051/3/13
3 2052/3/5
4 2051/4/4
Now this schema is free from insert, update, delete and select anomalies. Finally, the given table is normalized