Unit 3 Notes
Example:
In this example, if we know the value of Employee number, we can obtain Employee Name,
city, salary, etc. We can therefore say that city, Employee Name, and salary are
functionally dependent on Employee number.
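A functional dependency can be tested against sample rows. The sketch below is only
an illustration: the function name holds_fd and the employee rows are assumptions made
up for this example, not data from these notes. It reports whether every value of the
determinant maps to exactly one value of the dependent attributes.

```python
def holds_fd(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (assumed for illustration)
employees = [
    {"emp_no": 1, "name": "Dana", "salary": 50000, "city": "San Francisco"},
    {"emp_no": 2, "name": "Francis", "salary": 38000, "city": "London"},
    {"emp_no": 3, "name": "Andrew", "salary": 25000, "city": "Tokyo"},
    {"emp_no": 4, "name": "Lee", "salary": 38000, "city": "London"},
]

# emp_no determines the other attributes
print(holds_fd(employees, ["emp_no"], ["name", "city", "salary"]))
# city does not determine name: London appears with two different names
print(holds_fd(employees, ["city"], ["name"]))
```

A check like this can only refute a dependency. Whether a dependency really holds is a
business rule, not something sample data can prove.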
Use an entity relationship diagram (ERD) to provide the big picture, or macro view, of an
organization’s data requirements and operations. This is created through an iterative process
that involves identifying relevant entities, their attributes, and their relationships.
Example:
In this example, maf_year and color are independent of each other but dependent on
car_model. These two columns are therefore said to be multivalue dependent on
car_model.
car_model ->-> maf_year
car_model ->-> color
For example:
Emp_id Emp_name
AS555 Harry
AS811 George
AS999 Kevin
Example:
{Company} -> {CEO} (if we know the Company, we know the CEO's name)
But CEO is not a subset of Company, and hence it is a non-trivial functional dependency.
Transitive dependency:
A transitive dependency is a type of functional dependency which happens when it is
indirectly formed by two functional dependencies.
Example:
Alibaba Jack Ma 54
{Company} -> {CEO} (if we know the company, we know its CEO's name)
Note: You need to remember that transitive dependency can only occur in a relation of
three or more attributes.
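One way to see a transitive dependency is to compute the closure of an attribute set
under the known functional dependencies. The sketch below assumes a second dependency
{CEO} -> {Age}, which is suggested by the sample row but not stated in these notes. The
function name closure is also an assumption made for this example.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we also know rhs
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [
    ({"Company"}, {"CEO"}),
    ({"CEO"}, {"Age"}),  # assumed dependency, not stated in the notes
]

company_closure = closure({"Company"}, fds)
print(sorted(company_closure))
# Age is reached only through CEO, so Company -> Age is transitive
print("Age" in company_closure)
```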
What is Normalization?
Normalization is a method of organizing the data in the database which helps you to avoid
data redundancy and insertion, update, and deletion anomalies. It is a process of analyzing
the relation schemas based on their different functional dependencies and primary key.
Summary
Functional Dependency is when one attribute determines another attribute in a
DBMS system.
Axiom, Decomposition, Dependent, Determinant, Union are key terms for functional
dependency
Four types of functional dependency are 1) Multivalued 2) Trivial 3) Non-trivial 4)
Transitive
Multivalued dependency occurs in the situation where there are multiple
independent multivalued attributes in a single table
A trivial dependency A->B occurs when B is a subset of A, that is, when the
dependent attributes are already included in the determinant
Nontrivial dependency occurs when A->B holds true where B is not a subset of A
A transitive dependency is a type of functional dependency which happens when it is
indirectly formed by two functional dependencies
Normalization is a method of organizing the data in the database which helps you to
avoid data redundancy
What Is Normalization?
Normalization is the branch of relational theory that provides design insights. It is the process
of determining how much redundancy exists in a table. The goals of normalization are to:
Normalization may also be defined as a database design technique that reduces
data redundancy and eliminates undesirable characteristics like insertion, update, and
deletion anomalies. Normalization rules divide larger tables into smaller tables and link
them using relationships. The purpose of normalization in SQL is to eliminate redundant
(repetitive) data and ensure data is stored logically.
The inventor of the relational model, Edgar Codd, proposed the theory of normalization with
the introduction of the First Normal Form, and he continued to extend the theory with Second
and Third Normal Form. Later he joined Raymond F. Boyce to develop the theory of
Boyce-Codd Normal Form.
Normal Forms
All the tables in any database can be in one of the normal forms we will discuss next. Ideally
we only want minimal redundancy for PK to FK. Everything else should be derived from other
tables. There are five normal forms.
To normalize a relation that contains a repeating group, remove the repeating group and form
two new relations.
The PK of the new relation is a combination of the PK of the original relation plus an attribute
from the newly created relation for unique identification.
In the Student Grade Report table, the repeating group is the course information. A student can
take many courses.
Remove the repeating group. In this case, it’s the course information for each student.
Identify the PK for your new table.
The PK must uniquely identify the attribute value (StudentNo and CourseNo).
After removing all the attributes related to the course and student, you are left with the student
course table (StudentCourse).
The Student table (Student) is now in first normal form with the repeating group removed.
The two new tables are shown below.
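The removal of a repeating group can be sketched in Python. The record layout, the
sample values, and the function name to_1nf are assumptions made for illustration.
Only the table names Student and StudentCourse and the key attributes StudentNo and
CourseNo come from the text above.

```python
# Hypothetical un-normalized report: each student row carries a
# repeating group of course entries.
report = [
    {"StudentNo": 1, "StudentName": "Ann",
     "courses": [{"CourseNo": "C10", "Grade": "A"},
                 {"CourseNo": "C20", "Grade": "B"}]},
    {"StudentNo": 2, "StudentName": "Bob",
     "courses": [{"CourseNo": "C10", "Grade": "C"}]},
]


def to_1nf(report):
    """Split the report into Student and StudentCourse relations."""
    student, student_course = [], []
    for rec in report:
        # Student keeps everything except the repeating group
        student.append({k: v for k, v in rec.items() if k != "courses"})
        for course in rec["courses"]:
            # The new PK is the original PK plus CourseNo
            student_course.append({"StudentNo": rec["StudentNo"], **course})
    return student, student_course


student, student_course = to_1nf(report)
for row in student:
    print(row)
for row in student_course:
    print(row)
```

Every row now holds a single value per column, and (StudentNo, CourseNo) uniquely
identifies each StudentCourse row.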
If the relation has a composite PK, then each non-key attribute must be fully dependent on the
entire PK and not on a subset of the PK (i.e., there must be no partial dependency or
augmentation).
At this stage, there should be no anomalies in third normal form. Let’s look at the dependency
diagram (Figure 12.1) for this example. The first step is to remove repeating groups, as
discussed above.
To recap the normalization process for the School database, review the dependencies shown in
Figure 12.1.
BCNF Example 1
Consider the following table (St_Maj_Adv).
The semantic rules (business rules applied to the database) for this table are:
The functional dependencies for this table are listed below. The first one is a candidate key;
the second is not.
To reduce the St_Maj_Adv relation to BCNF, you create two new tables:
St_Adv table
Student_id Advisor
111 Smith
111 Chan
320 Dobbs
671 White
803 Smith
Adv_Maj table
Advisor Major
Smith Physics
Chan Music
Dobbs Math
White Physics
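The reason St_Maj_Adv needs decomposing can be checked mechanically: a relation
violates BCNF when some non-trivial dependency has a determinant that is not a
superkey. The sketch below assumes the two dependencies {Student_id, Major} ->
{Advisor} and {Advisor} -> {Major}. They are inferred from the two tables above and
are not listed in these notes. The function names are also illustrative.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(attributes)
    ]


# Assumed dependencies for St_Maj_Adv (inferred, not stated in the notes)
st_maj_adv = {"Student_id", "Major", "Advisor"}
fds = [
    ({"Student_id", "Major"}, {"Advisor"}),
    ({"Advisor"}, {"Major"}),
]
print(bcnf_violations(st_maj_adv, fds))

# After decomposition, Adv_Maj keeps Advisor -> Major with Advisor as its key
adv_maj = {"Advisor", "Major"}
print(bcnf_violations(adv_maj, [({"Advisor"}, {"Major"})]))
```

Under these assumptions only Advisor -> Major is reported for the original relation,
and nothing is reported for Adv_Maj.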
BCNF Example 2
Consider the following table (Client_Interview).
A relation is in BCNF if, and only if, every determinant is a candidate key. We need to create
a table that incorporates the first three FDs (Client_Interview2 table) and another table
(StaffRoom table) for the fourth FD.
Client_Interview2 table
StaffRoom table
Join Dependency:
A join dependency is a generalization of a multivalued dependency. A JD {R1, R2, ..., Rn}
is said to hold over a relation R if R1, R2, ..., Rn is a lossless-join decomposition
of R. There is no set of sound and complete inference rules for JDs.
Inclusion Dependency:
An Inclusion Dependency is a statement of the form that some columns of a relation
are contained in other columns. A foreign key constraint is an example of inclusion
dependency.
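An inclusion dependency can be checked by testing whether every value in the
referencing columns also appears in the referenced columns. The sketch below reuses
the St_Adv and Adv_Maj rows from the BCNF example. The function name inclusion_holds
and the extra advisor "Lee" are assumptions added for illustration.

```python
def inclusion_holds(child_rows, child_cols, parent_rows, parent_cols):
    """Return True if child[child_cols] is contained in parent[parent_cols]."""
    parent_values = {tuple(r[c] for c in parent_cols) for r in parent_rows}
    return all(
        tuple(r[c] for c in child_cols) in parent_values for r in child_rows
    )


st_adv = [
    {"Student_id": 111, "Advisor": "Smith"},
    {"Student_id": 111, "Advisor": "Chan"},
    {"Student_id": 320, "Advisor": "Dobbs"},
    {"Student_id": 671, "Advisor": "White"},
    {"Student_id": 803, "Advisor": "Smith"},
]
adv_maj = [
    {"Advisor": "Smith", "Major": "Physics"},
    {"Advisor": "Chan", "Major": "Music"},
    {"Advisor": "Dobbs", "Major": "Math"},
    {"Advisor": "White", "Major": "Physics"},
]

# Every advisor in St_Adv exists in Adv_Maj, as a foreign key would require
print(inclusion_holds(st_adv, ["Advisor"], adv_maj, ["Advisor"]))

# A hypothetical row with an unknown advisor breaks the dependency
broken = st_adv + [{"Student_id": 900, "Advisor": "Lee"}]
print(inclusion_holds(broken, ["Advisor"], adv_maj, ["Advisor"]))
```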
first normal form (1NF): only single values are permitted at the intersection of each row and
column so there are no repeating groups
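second normal form (2NF): the relation must be in 1NF and every non-key attribute must be
fully dependent on the entire PK; a relation in 1NF whose PK comprises a single attribute is
automatically in 2NF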
third normal form (3NF): the relation must be in 2NF and all transitive dependencies must be
removed; a non-key attribute may not be functionally dependent on another non-key attribute
4NF (Fourth Normal Form) and 5NF (Fifth Normal Form)
A multivalued dependency arises when two or more independent relations are kept in a single
relation. In other words, a multivalued dependency occurs when the presence of one or more
rows in a table implies the presence of one or more other rows in that same table. Put
another way, two attributes (or columns) in a table are independent of one another, but both
depend on a third attribute. A multivalued dependency always requires at least three
attributes because it consists of at least two attributes that are dependent on a third.
For a dependency A -> B, if multiple values of B exist for a single value of A, then the table
may have a multivalued dependency. The table should have at least 3 attributes, and B and C
should be independent of each other, for the multivalued dependency A ->-> B to hold. For example,
Person->-> mobile,
Person ->-> food_likes
This is read as “person multidetermines mobile” and “person multidetermines food_likes.”
Note that a functional dependency is a special case of multivalued dependency. In a functional
dependency X -> Y, every x determines exactly one y, never more than one.
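The definition can be turned into a direct check: X ->-> Y holds when, for each value
of X, every observed Y value is paired with every observed value of the remaining
attributes. The sketch below is illustrative. The function name holds_mvd and the
sample rows are assumptions, and only the attribute names person, mobile, and
food_likes come from the example above.

```python
def holds_mvd(rows, x, y):
    """Check the multivalued dependency x ->-> y on a list of dict rows."""
    if not rows:
        return True
    z = [a for a in rows[0] if a not in x and a not in y]
    groups = {}
    for row in rows:
        xv = tuple(row[a] for a in x)
        yv = tuple(row[a] for a in y)
        zv = tuple(row[a] for a in z)
        g = groups.setdefault(xv, {"y": set(), "z": set(), "yz": set()})
        g["y"].add(yv)
        g["z"].add(zv)
        g["yz"].add((yv, zv))
    # Within each X group, all Y values must combine with all Z values
    return all(
        len(g["yz"]) == len(g["y"]) * len(g["z"]) for g in groups.values()
    )


# Hypothetical rows (assumed for illustration)
rows = [
    {"person": "Asha", "mobile": "111", "food_likes": "burger"},
    {"person": "Asha", "mobile": "111", "food_likes": "pizza"},
    {"person": "Asha", "mobile": "222", "food_likes": "burger"},
    {"person": "Asha", "mobile": "222", "food_likes": "pizza"},
    {"person": "Ravi", "mobile": "333", "food_likes": "pizza"},
]

print(holds_mvd(rows, ["person"], ["mobile"]))
# Dropping one combination for Asha breaks the dependency
print(holds_mvd(rows[:3] + rows[4:], ["person"], ["mobile"]))
```

The four rows for one person with two mobiles and two food preferences show the
redundancy that 4NF removes by splitting the table in two.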
Fourth normal form (4NF) is a level of database normalization where there are no non-trivial
multivalued dependencies other than a candidate key. It builds on the first three normal forms
(1NF, 2NF and 3NF) and the Boyce-Codd Normal Form (BCNF). It states that, in addition to a
database meeting the requirements of BCNF, it must not contain more than one multivalued
dependency.
Properties – A relation R is in 4NF if and only if the following conditions are satisfied:
1. It should be in the Boyce-Codd Normal Form (BCNF).
2. The table should not have any non-trivial multivalued dependency.
A table with a multivalued dependency violates the normalization standard of Fourth Normal
Form (4NF) because it creates unnecessary redundancies and can contribute to inconsistent
data. To bring this up to 4NF, it is necessary to break this information into two tables.
Example – Consider the database of a class which has two relations: R1 contains student
ID (SID) and student name (SNAME), and R2 contains course ID (CID) and course name
(CNAME).
Table – R1
SID SNAME
S1 A
S2 B
Table – R2
CID CNAME
C1 C
C2 D
Table – R1 X R2
SID SNAME CID CNAME
S1 A C1 C
S1 A C2 D
S2 B C1 C
S2 B C2 D
Example –
Table – R1
COMPANY PRODUCT
C1 pendrive
C1 mic
C2 speaker
C1 speaker
Company->->Product
Table – R2
AGENT COMPANY
Aman C1
Aman C2
Mohan C1
Agent->->Company
Table – R3
AGENT PRODUCT
Aman pendrive
Aman mic
Aman speaker
Mohan speaker
Agent->->Product
Table – R1⋈R2⋈R3
COMPANY PRODUCT AGENT
C1 pendrive Aman
C1 mic Aman
C2 speaker Aman
C1 speaker Aman
Agent->->Product
A relation R is in 5NF if and only if every join dependency in R is implied by the candidate keys
of R. A relation decomposed into two relations must have the lossless-join property, which ensures
that no spurious or extra tuples are generated when the relations are reunited through a natural
join.
Properties – A relation R is in 5NF if and only if it satisfies following conditions:
1. R should be already in 4NF.
2. It cannot be further decomposed without loss (join dependency).
Example – Consider the above schema, with a case as “if a company makes a product and an
agent is an agent for that company, then he always sells that product for the company”. Under
these circumstances, the ACP table is shown as:
Table – ACP
AGENT COMPANY PRODUCT
A1 PQR Nut
A1 PQR Bolt
A1 XYZ Nut
A1 XYZ Bolt
A2 PQR Nut
The relation ACP is again decomposed into 3 relations. The three relations and their natural
join are as follows:
Table – R1
AGENT COMPANY
A1 PQR
A1 XYZ
A2 PQR
Table – R2
AGENT PRODUCT
A1 Nut
A1 Bolt
A2 Nut
Table – R3
COMPANY PRODUCT
PQR Nut
PQR Bolt
XYZ Nut
XYZ Bolt
The result of the natural join of R1 and R3 over ‘Company’, followed by the natural join of R13
and R2 over ‘Agent’ and ‘Product’, is the table ACP.
Hence, in this example, all the redundancies are eliminated, and the decomposition of ACP is a
lossless-join decomposition. Therefore, the three decomposed relations are in 5NF, as the
decomposition does not violate the lossless-join property.
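The lossless-join claim for ACP can be verified with a small sketch. The helper names
project, natural_join, and as_set are assumptions made for this example. The rows are
the ACP rows listed above.

```python
def project(rel, cols):
    """Project a relation (list of dicts) onto cols, dropping duplicates."""
    out = []
    for row in rel:
        p = {c: row[c] for c in cols}
        if p not in out:
            out.append(p)
    return out


def natural_join(r, s):
    """Natural join of two relations on their common attribute names."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    out = []
    for a in r:
        for b in s:
            if all(a[c] == b[c] for c in common):
                merged = {**a, **b}
                if merged not in out:
                    out.append(merged)
    return out


def as_set(rel):
    """Order-independent form of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rel}


acp = [
    {"AGENT": "A1", "COMPANY": "PQR", "PRODUCT": "Nut"},
    {"AGENT": "A1", "COMPANY": "PQR", "PRODUCT": "Bolt"},
    {"AGENT": "A1", "COMPANY": "XYZ", "PRODUCT": "Nut"},
    {"AGENT": "A1", "COMPANY": "XYZ", "PRODUCT": "Bolt"},
    {"AGENT": "A2", "COMPANY": "PQR", "PRODUCT": "Nut"},
]

r1 = project(acp, ["AGENT", "COMPANY"])
r2 = project(acp, ["AGENT", "PRODUCT"])
r3 = project(acp, ["COMPANY", "PRODUCT"])

r13 = natural_join(r1, r3)        # joined over COMPANY
rejoined = natural_join(r13, r2)  # joined over AGENT and PRODUCT

print(len(r13))                         # 6: includes a spurious (A2, PQR, Bolt)
print(as_set(rejoined) == as_set(acp))  # True: all three together are lossless
```

Joining only two of the three projections produces a spurious tuple. The third
projection removes it, which is why ACP needs a three-way decomposition.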
Summary
Database designing is critical to the successful implementation of a database
management system that meets the data requirements of an enterprise system.
Normalization in DBMS helps produce database systems that are cost-effective and
have better security models.
Functional dependencies are a very important component of the data normalization
process
Most database systems are normalized up to the third normal form
A primary key uniquely identifies a record in a table and cannot be null
A foreign key helps connect tables and references a primary key