Unit-6 Note

This is a computer note of BBA 2nd semester

Uploaded by

osialamansari148
Copyright
© © All Rights Reserved

Unit -6

Relational Database Design

Pitfalls of Relational Model: Redundancy and Anomalies


The main goal of relational database design is to find a good collection of relation schemas, so that the database stores
information without unnecessary redundancy and allows information to be retrieved easily.
Relational database design should therefore concentrate on the following goals:
• To avoid redundant data in the database.
• To ensure that relationships among attributes are represented.
• To facilitate the checking of updates for violations of database integrity constraints.
A bad relational database design may lead to:
• Repetition of information. That is, it leads to data redundancy in the database, which wastes storage space.
• Inability to represent certain information.

For example, consider the relational schema:
Branch_loan = (branch_name, branch_city, assets, customer_name, loan_no, amount)

Redundancy:
o The data for branch_name, branch_city and assets are repeated for each loan that the bank provides.
o Storing the same information several times wastes storage space and time.
o Data redundancy leads to problems in insertion, deletion and update.
Insertion Problem:
• We cannot store information about a branch that has no loans. We could use null values for the loan information,
but nulls are difficult to handle.
Deletion Problem:

In this example, if Manoj holds the only loan of the Pokhara branch and we delete his information (i.e. DELETE FROM
branch_loan WHERE customer_name = 'Manoj';), we can no longer obtain the information about the Pokhara branch (i.e. we
no longer know the branch city of the Pokhara branch, its total assets, etc.).
Update problem:
Since data are duplicated, multiple copies of the same fact must all be changed when one is updated. This increases the
possibility of data inconsistency: if an update reaches only some of the copies but not all, the database is left in an
inconsistent state.
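The deletion problem described above can be reproduced with a short sketch that uses Python's built-in sqlite3 module. The table follows the Branch_loan schema, but every row value (the Kathmandu branch, the asset figures, the loan numbers and the customers other than Manoj) is hypothetical sample data chosen only for illustration.

```python
import sqlite3

# In-memory database; all row values below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE branch_loan (
           branch_name TEXT, branch_city TEXT, assets INTEGER,
           customer_name TEXT, loan_no TEXT, amount INTEGER)"""
)
conn.executemany(
    "INSERT INTO branch_loan VALUES (?, ?, ?, ?, ?, ?)",
    [
        # The Kathmandu branch data is stored twice (redundancy).
        ("Kathmandu", "Kathmandu", 9000000, "Sita", "L-11", 50000),
        ("Kathmandu", "Kathmandu", 9000000, "Hari", "L-12", 75000),
        # Manoj holds the only loan of the Pokhara branch.
        ("Pokhara", "Pokhara", 4000000, "Manoj", "L-21", 30000),
    ],
)


def branch_names(connection):
    """Return the sorted list of branches the table still knows about."""
    rows = connection.execute("SELECT DISTINCT branch_name FROM branch_loan")
    return sorted(name for (name,) in rows)


before = branch_names(conn)

# Deleting Manoj's loan also deletes the only row that describes Pokhara.
conn.execute("DELETE FROM branch_loan WHERE customer_name = 'Manoj'")
after = branch_names(conn)

print(before)  # ['Kathmandu', 'Pokhara']
print(after)   # ['Kathmandu']
```

After the delete, no query against branch_loan can recover the city or assets of the Pokhara branch, which is exactly the deletion problem.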
6.2 Functional dependency
A functional dependency (FD) is a constraint between two sets of attributes in a relation. A functional dependency says
that if two tuples have the same values for attributes A1, A2, …, An, then those two tuples must also have the same
values for attributes B1, B2, …, Bn.
A functional dependency is represented by an arrow sign (→), that is, X → Y, where X functionally determines Y. The
left-hand side attributes determine the values of the attributes on the right-hand side.
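The definition can be tested against a concrete set of tuples. The sketch below is a minimal illustration, not a standard library routine: the function name fd_holds and the sample rows are assumptions made for this example. Note that a check like this can only show that an FD is violated by the data at hand; an FD is a constraint on the schema and must hold for every legal set of tuples.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it;
        # a later row with the same key but a different value breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample tuples.
rows = [
    {"branch_name": "Pokhara", "branch_city": "Pokhara", "loan_no": "L-21"},
    {"branch_name": "Kathmandu", "branch_city": "Kathmandu", "loan_no": "L-11"},
    {"branch_name": "Kathmandu", "branch_city": "Kathmandu", "loan_no": "L-12"},
]

print(fd_holds(rows, ["branch_name"], ["branch_city"]))  # True
print(fd_holds(rows, ["branch_name"], ["loan_no"]))      # False
```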
Armstrong's Axioms:
If F is a set of functional dependencies, then the closure of F, denoted as F+, is the set of all functional dependencies
logically implied by F. Armstrong's Axioms are a set of rules that, when applied repeatedly, generate the closure of a
set of functional dependencies.
1. Reflexive rule: If alpha is a set of attributes and beta is a subset of alpha, then alpha → beta holds.
2. Augmentation rule: If a → b holds and y is a set of attributes, then ay → by also holds.
3. Transitivity rule: Same as the transitive rule in algebra: if a → b holds and b → c holds, then a → c also holds.
a → b is read as "a functionally determines b".
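In practice the closure of F is rarely written out in full. A common shortcut is to compute the closure of a set of attributes instead: X → Y is implied by F exactly when Y is contained in the attribute closure of X. The sketch below shows that technique. The function name closure and the two dependencies assumed for Branch_loan are illustrative assumptions, not part of the schema given earlier.

```python
def closure(attributes, fds):
    """Return the set of attributes determined by `attributes` under `fds`.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we may add rhs (transitivity).
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Assumed dependencies for Branch_loan, for illustration only.
fds = [
    ({"loan_no"}, {"branch_name", "amount"}),
    ({"branch_name"}, {"branch_city", "assets"}),
]

print(sorted(closure({"branch_name"}, fds)))
# ['assets', 'branch_city', 'branch_name']

# loan_no -> assets is implied by F, although it is not listed in F.
print("assets" in closure({"loan_no"}, fds))  # True
```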
Trivial Functional Dependency
1. Trivial: If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a trivial FD. Trivial
FDs always hold.
2. Non-trivial: If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial FD.
3. Completely non-trivial: If an FD X → Y holds, where X and Y have no attributes in common (X ∩ Y is empty), then it is
called a completely non-trivial FD.
Normalization, its need and objectives
Database normalization is the process of organizing data into tables in such a way that the results of using the database
are always unambiguous and as intended. Such normalization is intrinsic to relational database theory. It may have the
effect of duplicating data within the database and often results in the creation of additional tables.
The concept of database normalization is generally traced back to E.F. Codd, an IBM researcher who, in 1970, published
a paper describing the relational database model. What Codd describes as "a normal form for database relations" was an
essential element of the relational technique. While data normalization rules tend to increase the duplication of data, they
do not introduce data redundancy, which is unnecessary duplication. Database normalization is typically a refinement
process after the initial exercise of identifying the data objects that should be in the relational database, identifying their
relationships and defining the tables required and the columns within each table.
Objectives of Normalization:
A basic objective of the first normal form defined by Codd in 1970 was to permit data to be queried and manipulated using
a "universal data sub-language" grounded in first-order-logic. The objectives of normalization beyond 1NF (First Normal
Form) were stated as follows by Codd:
1. To free the collection of relations from undesirable insertion, update and deletion dependencies;
2. To reduce the need for restructuring the collection of relations, as new types of data are introduced, and thus increase
the life span of application programs;
3. To make the relational model more informative to users;
4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time
goes by.
The following examples give details of these objectives:
Free the database of modification anomalies:
Employee Skill
Employee ID | Employee Address | Skill
426 | Chhoprak-7, Gorkha | Typing
426 | Chhoprak-7, Gorkha | Shorthand
519 | Bharatpur-10, Chitwan | Public Speaking
519 | Bharatpur-12, Chitwan | Carpentry
An update anomaly. Employee 519 is shown as having different addresses on different records.
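An inconsistency of this kind can be detected mechanically by grouping the records by employee and collecting the distinct addresses. The following sketch uses the rows of the Employee Skill table; the function name conflicting_addresses is a hypothetical helper introduced only for this illustration.

```python
from collections import defaultdict

# Rows of the Employee Skill table: (Employee ID, Employee Address, Skill).
employee_skill = [
    (426, "Chhoprak-7, Gorkha", "Typing"),
    (426, "Chhoprak-7, Gorkha", "Shorthand"),
    (519, "Bharatpur-10, Chitwan", "Public Speaking"),
    (519, "Bharatpur-12, Chitwan", "Carpentry"),
]


def conflicting_addresses(rows):
    """Return {employee_id: [addresses]} for employees with more than one address."""
    addresses = defaultdict(set)
    for employee_id, address, _skill in rows:
        addresses[employee_id].add(address)
    return {
        employee_id: sorted(found)
        for employee_id, found in addresses.items()
        if len(found) > 1
    }


print(conflicting_addresses(employee_skill))
# {519: ['Bharatpur-10, Chitwan', 'Bharatpur-12, Chitwan']}
```

If the address were stored once per employee in a separate table, this conflict could not arise.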

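Faculty and their Courses

Faculty ID | Faculty Name | Faculty Hire Date | Course Code
389 | Mr. Sushil Bhattarai | 10-Feb-2012 | MGT-111
407 | Mr. Suman Poudel | 19-Apr-2010 | CMP-101
407 | Mr. Suman Poudel | 19-Apr-2010 | CMP-201
424 | Mr. Ujjwal Marahatta | 29-Mar-2016 | ?
A deletion anomaly. All information about Mr. Sushil Bhattarai is lost if he temporarily ceases to be assigned to any
courses. The row for Mr. Ujjwal Marahatta, whose course code is unknown (?), shows the matching insertion anomaly: a
faculty member who has not yet been assigned a course cannot be recorded without a null course code.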

4 1NF, 2NF, 3NF, BCNF and 4NF


First Normal Form (1NF):
As per First Normal Form, a table must not contain repeating groups of information, i.e. each column must hold a single
value in every row. Each table should be organized into rows, and each row should have a primary key that distinguishes
it as unique.
The Primary Key is usually a single column, but sometimes more than one column can be combined to create a single
primary key. For example, consider a table which is not in First Normal Form:

Student table (not in First Normal Form):

Student | Age | Subject
Adam | 15 | Biology, Maths
Alex | 14 | Maths
Stuart | 17 | Maths
In First Normal Form, no row may have a column in which more than one value is saved, for example values separated with
commas. Instead, we must separate such data into multiple rows.
Student table following 1NF will be:

Student | Age | Subject
Adam | 15 | Biology
Adam | 15 | Maths
Alex | 14 | Maths
Stuart | 17 | Maths
Using the First Normal Form, data redundancy increases, as the same values are repeated in several columns across
multiple rows, but each row as a whole will be unique.
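The conversion to First Normal Form shown above can be sketched in a few lines. The function name to_first_normal_form is an assumption made for this example, and the sketch assumes the multiple values are always separated by commas, as in the Student table.

```python
# The Student table before normalization: (Student, Age, Subject).
unnormalized = [
    ("Adam", 15, "Biology, Maths"),
    ("Alex", 14, "Maths"),
    ("Stuart", 17, "Maths"),
]


def to_first_normal_form(rows):
    """Emit one row per subject so that every column holds a single value."""
    result = []
    for student, age, subjects in rows:
        for subject in subjects.split(","):
            result.append((student, age, subject.strip()))
    return result


for row in to_first_normal_form(unnormalized):
    print(row)
# ('Adam', 15, 'Biology')
# ('Adam', 15, 'Maths')
# ('Alex', 14, 'Maths')
# ('Stuart', 17, 'Maths')
```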
Second Normal Form (2NF):
As per Second Normal Form, there must not be any partial dependency of any column on the primary key. This means that for
a table that has a concatenated primary key, each column that is not part of the primary key must depend upon the entire
concatenated key. If any column depends on only one part of the concatenated key, then the table fails Second Normal
Form.
In the example of First Normal Form there are two rows for Adam, to include the multiple subjects that he has opted for.
While this is searchable and follows First Normal Form, it is an inefficient use of space. Also, in the above table in
First Normal Form, the candidate key is {Student, Subject}, but the Age of a student depends only on the Student column,
which violates Second Normal Form. To achieve Second Normal Form, it helps to split the subjects out into an independent
table and match them up using the student names as foreign keys.
New Student Table following 2NF will be:

Student | Age
Adam | 15
Alex | 14
Stuart | 17
In the Student table the candidate key will be the Student column, because the only other column, Age, is dependent on it.
New Subject Table introduced for 2NF will be:

Student | Subject
Adam | Biology
Adam | Maths
Alex | Maths
Stuart | Maths
In the Subject table the candidate key will be {Student, Subject}. Now both of the above tables qualify for Second
Normal Form and no longer suffer from the update anomalies caused by partial dependencies. There are, however, a few
complex cases in which a table in Second Normal Form still suffers from update anomalies; Third Normal Form exists to
handle those scenarios.
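The split into a Student table and a Subject table is a pair of projections, and joining the two tables on Student gives back the original rows, so no information is lost. The sketch below illustrates this with the data from the example; the helper name project is an assumption introduced here.

```python
# The Student table in 1NF: (Student, Age, Subject).
first_nf = [
    ("Adam", 15, "Biology"),
    ("Adam", 15, "Maths"),
    ("Alex", 14, "Maths"),
    ("Stuart", 17, "Maths"),
]


def project(rows, indexes):
    """Project rows onto the given column positions, dropping duplicates."""
    seen = []
    for row in rows:
        projected = tuple(row[i] for i in indexes)
        if projected not in seen:
            seen.append(projected)
    return seen


student = project(first_nf, [0, 1])  # (Student, Age): Age is stored once per student
subject = project(first_nf, [0, 2])  # (Student, Subject)

# Join the two tables on Student to rebuild the original rows.
ages = dict(student)
rejoined = [(name, ages[name], subj) for name, subj in subject]

print(student)               # [('Adam', 15), ('Alex', 14), ('Stuart', 17)]
print(rejoined == first_nf)  # True
```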

Third Normal Form (3NF):


Third Normal Form requires that every non-prime attribute of a table depend directly on the primary key; in other words,
no non-prime attribute may be determined by another non-prime attribute. Such a transitive functional dependency should
be removed from the table, and the table must also be in Second Normal Form. For example, consider a table with the
following fields:

Student_Detail Table:

Student_ID | Student_name | DOB | Street | City | State | Zip

In this table Student_ID is the primary key, but Street, City and State depend upon Zip. Because these fields depend on
Student_ID only through Zip, the dependency between Zip and the other fields is called a transitive dependency. Hence,
to apply 3NF, we need to move Street, City and State to a new table, with Zip as the primary key.
New Student_Detail Table:

Student_ID | Student_name | DOB | Zip

Address Table:

Zip | Street | City | State
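The decomposition above can be sketched as follows. The source gives only the column names, so every row value below is hypothetical sample data, and the function name split_on_zip is an assumption made for this illustration.

```python
# Hypothetical rows for the Student_Detail table (values invented for illustration).
student_detail = [
    {"Student_ID": 1, "Student_name": "Adam", "DOB": "2001-04-12",
     "Street": "Lakeside Road", "City": "Pokhara", "State": "Gandaki", "Zip": "33700"},
    {"Student_ID": 2, "Student_name": "Alex", "DOB": "2002-09-30",
     "Street": "Lakeside Road", "City": "Pokhara", "State": "Gandaki", "Zip": "33700"},
    {"Student_ID": 3, "Student_name": "Stuart", "DOB": "1999-01-05",
     "Street": "Main Road", "City": "Bharatpur", "State": "Bagmati", "Zip": "44200"},
]


def split_on_zip(rows):
    """Split rows into a student table and an address table keyed by Zip."""
    students = [
        {k: row[k] for k in ("Student_ID", "Student_name", "DOB", "Zip")}
        for row in rows
    ]
    addresses = {}
    for row in rows:
        # One address row per Zip; repeated Zips overwrite with identical data.
        addresses[row["Zip"]] = {k: row[k] for k in ("Zip", "Street", "City", "State")}
    return students, list(addresses.values())


students, addresses = split_on_zip(student_detail)
print(len(students), len(addresses))  # 3 2
```

The address of Zip 33700 is now stored once instead of once per student, so changing it requires a single update.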


The advantage of removing transitive dependency is,

Boyce and Codd Normal Form (BCNF):


Boyce and Codd Normal Form is a stricter version of the Third Normal Form. It deals with a certain type of anomaly that
is not handled by 3NF. A 3NF table which does not have multiple overlapping candidate keys is guaranteed to be in BCNF.
For a table to be in BCNF, the following conditions must be satisfied:
• The table must be in 3rd Normal Form.
• For each non-trivial functional dependency (X → Y), X should be a super key.
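The super key condition can be checked by computing the attribute closure of each left-hand side: X is a super key when its closure covers every attribute of the table. The sketch below is a minimal illustration; the function names closure and bcnf_violations are assumptions, and the dependencies are the ones described for the Student_Detail example (Student_ID determines every other column, and Zip determines Street, City and State).

```python
def closure(attributes, fds):
    """Return the set of attributes determined by `attributes` under `fds`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the non-trivial FDs whose left-hand side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial FDs never violate BCNF
        if not set(schema) <= closure(lhs, fds):
            violations.append((lhs, rhs))
    return violations


student_detail = {"Student_ID", "Student_name", "DOB", "Street", "City", "State", "Zip"}
fds = [
    ({"Student_ID"}, {"Student_name", "DOB", "Street", "City", "State", "Zip"}),
    ({"Zip"}, {"Street", "City", "State"}),
]
# Zip is not a super key of Student_Detail, so Zip -> Street, City, State is reported.
print(len(bcnf_violations(student_detail, fds)))  # 1

# In the separate Address table, Zip is the key and nothing is reported.
address = {"Zip", "Street", "City", "State"}
print(bcnf_violations(address, [({"Zip"}, {"Street", "City", "State"})]))  # []
```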
Question: Normalize the following table up to 3NF.

RN | Name | Address | Phone | DOB | Tid | Dept | Tname | Tphone
1 | Ram | Chitwan | 056540123 | 2051/11/12 | 12 | Management | Shankar | 9845111111, 056540111
2 | Hari | Kathmandu | 01444444 | 2051/3/13 | 11 | Science | Manish | 984111111, 01411111
3 | Sita | Bhaktapur | 01611111 | 2052/3/5 | 10 | Humanities | Smita | 984111811
4 | Alisha | Lalitput | 01511111 | 2051/4/4 | 10 | Humanities | Smita | 984111811

Solution:
First Normal Form (1NF):
In the above table, the field Tphone holds multiple values in one cell. In this case, the primary key is RN. With a design
like this table, we can have insert, update, delete and select anomalies. Because this table has more than one value in
one cell, it should be corrected as:

RN | Name | Address | Phone | DOB | Tid | Dept | Tname | Tphone
1 | Ram | Chitwan | 056540123 | 2051/11/12 | 12 | Management | Shankar | 9845111111
1 | Ram | Chitwan | 056540123 | 2051/11/12 | 12 | Management | Shankar | 056540111
2 | Hari | Kathmandu | 01444444 | 2051/3/13 | 11 | Science | Manish | 984111111
2 | Hari | Kathmandu | 01444444 | 2051/3/13 | 11 | Science | Manish | 01411111
3 | Sita | Bhaktapur | 01611111 | 2052/3/5 | 10 | Humanities | Smita | 984111811
4 | Alisha | Lalitput | 01511111 | 2051/4/4 | 10 | Humanities | Smita | 984111811
This table is in 1NF because every field of the table is atomic, i.e. each field holds a single, indivisible piece of
data. Further, there are no duplicate rows or columns.
Second Normal Form (2NF):
In the above table, RN and the combination of Tid with Tphone can be used as keys. Each remaining field depends on only
one of them: Name, Address, Phone and DOB depend on RN, while Dept, Tname and Tphone depend on Tid. To bring the table
into 2NF, we need to decompose it into multiple tables as:
Student's table:

RN | Name | Address | Phone | DOB
1 | Ram | Chitwan | 056540123 | 2051/11/12
2 | Hari | Kathmandu | 01444444 | 2051/3/13
3 | Sita | Bhaktapur | 01611111 | 2052/3/5
4 | Alisha | Lalitput | 01511111 | 2051/4/4
Teacher's table:

Tid | Tname | Tphone
12 | Shankar | 9845111111
12 | Shankar | 056540111
11 | Manish | 984111111
11 | Manish | 01411111
10 | Smita | 984111811
Here, Tid and Tphone jointly work as the primary key.
Relation between teacher's and student's table:

RN | Tid
1 | 12
2 | 11
3 | 10
4 | 10
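Whether a set of columns can serve as a key for the stored rows can be checked by testing that no two rows agree on all of those columns. The sketch below applies this to the teacher's table and to the relation table from the solution; the function name is_unique is a hypothetical helper. Uniqueness in the current rows is a necessary condition for a key, not a proof that the columns form a key for all future data.

```python
# Teacher's table: (Tid, Tname, Tphone), with phone numbers kept as text.
teacher = [
    (12, "Shankar", "9845111111"),
    (12, "Shankar", "056540111"),
    (11, "Manish", "984111111"),
    (11, "Manish", "01411111"),
    (10, "Smita", "984111811"),
]

# Relation between student's and teacher's table: (RN, Tid).
student_teacher = [(1, 12), (2, 11), (3, 10), (4, 10)]


def is_unique(rows, indexes):
    """Return True if no two rows share the same values in the given columns."""
    keys = [tuple(row[i] for i in indexes) for row in rows]
    return len(keys) == len(set(keys))


print(is_unique(teacher, [0]))          # False: Tid alone repeats
print(is_unique(teacher, [0, 2]))       # True: Tid and Tphone together are unique
print(is_unique(student_teacher, [0]))  # True: RN identifies each row
```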
Third Normal Form (3NF):
The above tables are in 2NF, but we still have insert and delete anomalies. To reduce these anomalies, the tables should
be changed into 3NF. In the student table (RN, Name, Address, Phone, DOB), Name, Address and Phone fully depend on its
primary key RN. The date of birth DOB, however, is treated here as depending on the student name Name, which is not a
primary key. Applying the rule, the student table is decomposed into the following tables; the other tables remain the
same because they already satisfy the rule of 3NF.

Student1 table:

RN | Name | Address | Phone
1 | Ram | Chitwan | 056540123
2 | Hari | Kathmandu | 01444444
3 | Sita | Bhaktapur | 01611111
4 | Alisha | Lalitput | 01511111
Student2 table:

RN | DOB
1 | 2051/11/12
2 | 2051/3/13
3 | 2052/3/5
4 | 2051/4/4
Now, this scheme is free from insert, update, delete and select anomalies. Finally, the given table is normalized.
