
Chapter 4-Functional Dependency

Chapter 4 of 'Fundamentals of Database Systems' by Amsalu Dinote focuses on functional dependency and normalization in relational database design. It discusses the importance of analyzing attribute groupings for quality design, introduces functional dependencies as constraints between attributes, and explains the normalization process to minimize redundancy and anomalies. The chapter outlines various normal forms (1NF, 2NF, 3NF, BCNF) and their criteria, emphasizing the significance of functional dependencies in achieving a well-structured database.

Uploaded by

danielarega25

Fundamentals of Database Systems (CoSc2071)
Chapter 4: Functional Dependency and Normalization

By Amsalu Dinote

03/19/2025 Fundamentals of Database Systems
Introduction
• So far, we have assumed that attributes are grouped to
form a relation schema by using the common sense of
the database designer or by mapping a database
schema design from a conceptual data model such as
the ER or Enhanced-ER (EER) data model.
• These models make the designer identify entity types
and relationship types and their respective attributes,
which leads to a natural and logical grouping of the
attributes into relations when the mapping procedures
are followed.
• However, we still need some formal way of analyzing
why one grouping of attributes into a relation schema
may be better than another.
Introduction(cont..)
• While discussing database design in earlier chapters, we did not
develop any measure of appropriateness or goodness
to assess the quality of the design, other than the
intuition of the designer.
• In this chapter we discuss some of the theory that
has been developed with the goal of evaluating
relational schemas for design quality—that is, to
measure formally why one set of groupings of
attributes into relation schemas is better than
another.
• Relational database design ultimately produces a set
of relations. The implicit goals of the design activity
are information preservation and minimum
redundancy.
Introduction(cont..)
• Information is very hard to quantify, so we consider
information preservation in terms of maintaining all concepts,
including attribute types, entity types, and relationship types as
well as generalization/specialization relationships, which are
described using a model such as the EER model.
• Thus, the relational design must preserve all of these concepts,
which are originally captured in the conceptual design, after the
conceptual-to-logical design mapping.
• Minimizing redundancy implies minimizing redundant storage
of the same information and reducing the need for multiple
updates to maintain consistency across multiple copies of the
same information in response to real-world events that require
making an update.
Functional Dependency
A functional dependency is a constraint between two sets
of attributes from the database.
Suppose that our relational database schema has n attributes
A1, A2,...,An; let us think of the whole database as being
described by a single universal relation schema R = {A1,
A2,..., An}.
We do not imply that we will actually store the database as a
single universal table; we use this concept only in developing
the formal theory of data dependencies.
A functional dependency, denoted by X → Y, between two
sets of attributes X and Y that are subsets of R specifies a
constraint on the possible tuples that can form a relation state
r of R. The constraint is that, for any two tuples t1 and t2 in r
that have t1[X] = t2[X], they must also have t1[Y] = t2[Y].
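The definition above can be checked mechanically against one relation state. The following is a minimal Python sketch, not part of the original slides: the function name fd_holds and the sample rows are made up for illustration. Note that a single state can only refute an FD; an FD is a constraint on every legal state of R.

```python
from typing import Dict, Hashable, Iterable, Sequence

Row = Dict[str, Hashable]


def fd_holds(rows: Iterable[Row], X: Sequence[str], Y: Sequence[str]) -> bool:
    """Return True if the given relation state satisfies X -> Y."""
    seen = {}  # maps each X-value to the Y-value first seen with it
    for t in rows:
        x_val = tuple(t[a] for a in X)
        y_val = tuple(t[a] for a in Y)
        if seen.setdefault(x_val, y_val) != y_val:
            return False  # two tuples agree on X but differ on Y
    return True


# Hypothetical sample state (made-up values, for illustration only).
r = [
    {"Ssn": 1, "Ename": "Abebe", "Pnumber": 10, "Hours": 5},
    {"Ssn": 1, "Ename": "Abebe", "Pnumber": 20, "Hours": 8},
    {"Ssn": 2, "Ename": "Sara", "Pnumber": 10, "Hours": 5},
]

print(fd_holds(r, ["Ssn"], ["Ename"]))  # True
print(fd_holds(r, ["Ssn"], ["Hours"]))  # False: Ssn 1 appears with Hours 5 and 8
```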
Functional Dependency(cont..)
This means that the values of the Y component of a
tuple in r depend on, or are determined by, the values
of the X component; alternatively, the values of the X
component of a tuple uniquely (or functionally)
determine the values of the Y component.
We also say that there is a functional dependency
from X to Y, or that Y is functionally dependent on X.
The abbreviation for functional dependency is FD or
f.d. The set of attributes X is called the left-hand
side of the FD, and Y is called the right-hand side.

Functional Dependency(cont..)
In simple terms, a functional dependency describes
the relationship between attributes (fields) in a
relation (table). Suppose we have two fields A and B
in a table T. If each value of A in T is associated
with exactly one value of B in the same table, we say
that B is functionally dependent on A, written:
A → B
A is called the determinant of the functional
dependency A → B. Each of A and B can be a single
field or a group of fields.
Consider the following staff_branch table as an
example.
Functional Dependency(cont..)

[Table: staff_branch]
Normalization
• The normalization process, as first proposed by Codd (1972), takes a
relation schema through a series of tests to certify whether it satisfies a
certain normal form.
• The process, which proceeds in a top-down fashion by evaluating each
relation against the criteria for normal forms and decomposing relations
as necessary, can thus be considered relational design by analysis.
• Initially, Codd proposed three normal forms, which he called first (1NF),
second (2NF), and third (3NF) normal form.
• A stronger definition of 3NF, called Boyce-Codd normal form (BCNF),
was proposed later by Boyce and Codd.
• All these normal forms are based on a single analytical tool: the
functional dependencies among the attributes of a relation.
• Normalization of data can be considered a process of analyzing the
given relation schemas based on their FDs and primary keys to achieve
the desirable properties of (1) minimizing redundancy and (2)
minimizing the insertion, deletion, and update anomalies.
Normalization(cont..)
• It can be considered a “filtering” or “purification” process to make
the design have successively better quality. Unsatisfactory relation
schemas that do not meet certain conditions (the normal form tests)
are decomposed into smaller relation schemas that meet the tests and
hence possess the desirable properties. Thus, the normalization
procedure provides database designers with the following:
  • A formal framework for analyzing relation schemas based on their keys
  and on the functional dependencies among their attributes
  • A series of normal form tests that can be carried out on individual
  relation schemas so that the relational database can be normalized to any
  desired degree
• The normal form of a relation refers to the highest normal form
condition that it meets, and hence indicates the degree to which it has
been normalized.
• Database design as practiced in industry today pays particular attention
to normalization only up to 3NF, BCNF, or at most 4NF.
Definitions of Keys and Attributes Participating in Keys
• A superkey of a relation schema R = {A1, A2,..., An} is a set
of attributes S ⊆ R with the property that no two tuples t1
and t2 in any legal relation state r of R will have t1[S] = t2[S].
A key K is a superkey with the additional property that
removal of any attribute from K will cause K not to be a
superkey any more.
• The difference between a key and a superkey is that a key
has to be minimal; that is, if we have a key K = {A1, A2,...,
Ak} of R, then K – {Ai} is not a key of R for any Ai, 1 ≤ i ≤ k.
• In the EMPLOYEE relation from the earlier ER example, {Ssn} is a key,
whereas {Ssn}, {Ssn, Ename}, {Ssn, Ename, Bdate}, and any set of
attributes that includes Ssn are all superkeys.
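The superkey and key definitions can be tried out on a sample state. This is a minimal sketch under stated assumptions: the helper names is_superkey and is_key and the EMPLOYEE rows are hypothetical, and the check runs against one state only, whereas a true key must be unique in every legal state of R.

```python
def is_superkey(rows, S):
    """True if no two tuples in this state agree on all attributes of S."""
    projected = [tuple(t[a] for a in S) for t in rows]
    return len(projected) == len(set(projected))


def is_key(rows, K):
    """True if K is a superkey and removing any one attribute breaks that."""
    if not is_superkey(rows, K):
        return False
    return all(
        not is_superkey(rows, [a for a in K if a != removed]) for removed in K
    )


# Hypothetical EMPLOYEE state (made-up values, for illustration only).
employee = [
    {"Ssn": "111", "Ename": "Abebe", "Bdate": "1990-01-01"},
    {"Ssn": "222", "Ename": "Abebe", "Bdate": "1992-05-17"},
    {"Ssn": "333", "Ename": "Sara", "Bdate": "1990-01-01"},
]

print(is_superkey(employee, ["Ssn", "Ename"]))  # True: it contains Ssn
print(is_key(employee, ["Ssn", "Ename"]))       # False: not minimal
print(is_key(employee, ["Ssn"]))                # True
```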
Definitions of Keys and Attributes
Participating in Keys(cont..)
• If a relation schema has more than one key, each is called a
candidate key. One of the candidate keys is arbitrarily
designated to be the primary key, and the others are called
secondary keys (alternate keys).
• In a practical relational database, each relation schema
must have a primary key. If no candidate key is known for
a relation, the entire relation can be treated as a default
superkey.
• An attribute of relation schema R is called a prime
attribute of R if it is a member of some candidate key of
R. An attribute is called nonprime if it is not a prime
attribute, that is, if it is not a member of any candidate
key.
First Normal Form(1NF)
It states that the domain of an attribute must include
only atomic (simple, indivisible) values and that the
value of any attribute in a tuple must be a single value
from the domain of that attribute.
Hence, 1NF disallows having a set of values, a tuple of
values, or a combination of both as an attribute value
for a single tuple.
Consider the DEPARTMENT relation. It is not in 1NF
because Dlocations is not an atomic attribute: a
department can have several locations.
First Normal Form(cont..)
• There are three main techniques to achieve first normal form for
such a relation:
1. Remove the attribute Dlocations that violates 1NF and place it in
a separate relation DEPT_LOCATIONS along with the primary
key Dnumber of DEPARTMENT. The primary key of this
relation is the combination {Dnumber, Dlocation}.
2. Expand the key so that there will be a separate tuple in the
original DEPARTMENT relation for each location of a
department. In this case, the primary key becomes the
combination {Dnumber, Dlocation}. (This introduces redundancy.)
3. If a maximum number of values is known for the attribute (for
example, if it is known that at most three locations can exist for a
department), replace the Dlocations attribute by three atomic
attributes: Dlocation1, Dlocation2, and Dlocation3. (This introduces
NULL values.)
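The first technique can be sketched in a few lines of Python. This is an illustration only: the function split_multivalued and the DEPARTMENT rows are hypothetical, and the set-valued Dlocations field stands in for the non-atomic attribute.

```python
def split_multivalued(rows, key, multi_attr, new_attr):
    """Move a set-valued attribute into its own relation, keyed by (key, new_attr)."""
    base, detail = [], []
    for t in rows:
        # The original relation keeps every attribute except the multivalued one.
        base.append({a: v for a, v in t.items() if a != multi_attr})
        # The new relation gets one tuple per (key, single value) pair.
        for value in sorted(t[multi_attr]):
            detail.append({key: t[key], new_attr: value})
    return base, detail


# Hypothetical DEPARTMENT state that is not in 1NF (made-up values).
department = [
    {"Dname": "Research", "Dnumber": 5, "Dlocations": {"Bellaire", "Houston"}},
    {"Dname": "Headquarters", "Dnumber": 1, "Dlocations": {"Houston"}},
]

department_1nf, dept_locations = split_multivalued(
    department, "Dnumber", "Dlocations", "Dlocation"
)
print(department_1nf)   # DEPARTMENT without Dlocations
print(dept_locations)   # DEPT_LOCATIONS with key {Dnumber, Dlocation}
```

Every value in both resulting relations is atomic, and Dnumber in DEPT_LOCATIONS refers back to DEPARTMENT.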
First Normal Form(cont..)
[Figure: DEPARTMENT relation not in 1NF]
[Figure: DEPARTMENT relation in 1NF with redundancy]
Second Normal Form(2NF)
• 2NF is based on the concept of full functional dependency.
• A functional dependency X → Y is a full functional dependency if
removal of any attribute A from X means that the dependency does
not hold any more; that is, for any attribute A ∈ X, (X – {A}) does not
functionally determine Y.
• A functional dependency X → Y is a partial dependency if some
attribute A ∈ X can be removed from X and the dependency still
holds; that is, for some A ∈ X, (X – {A}) → Y.
• For example, in the EMP_PRO table, {Ssn, Pnumber} → Hours is a full
dependency (neither Ssn → Hours nor Pnumber → Hours
holds). However, the dependency {Ssn, Pnumber} → Ename is partial
because Ssn → Ename holds.
• Accordingly, a relation schema R is in 2NF if it is in 1NF and
every nonprime attribute A in R is fully functionally dependent on
the primary key of R.
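The full versus partial distinction can be illustrated by dropping one attribute at a time from the left-hand side. The sketch below is an assumption-laden illustration: the helper names and the EMP_PRO rows are made up, and the test is run against one sample state rather than against the declared FDs of the schema.

```python
def fd_holds(rows, X, Y):
    """True if this relation state satisfies X -> Y."""
    seen = {}
    for t in rows:
        x_val = tuple(t[a] for a in X)
        y_val = tuple(t[a] for a in Y)
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


def removable_attributes(rows, X, Y):
    """Attributes A in X such that (X - {A}) -> Y still holds in this state."""
    return [A for A in X if fd_holds(rows, [b for b in X if b != A], Y)]


def is_full_dependency(rows, X, Y):
    """X -> Y holds and no single attribute can be dropped from X."""
    return fd_holds(rows, X, Y) and not removable_attributes(rows, X, Y)


# Hypothetical EMP_PRO state (made-up values, for illustration only).
emp_pro = [
    {"Ssn": 1, "Pnumber": 10, "Hours": 5, "Ename": "Abebe"},
    {"Ssn": 1, "Pnumber": 20, "Hours": 8, "Ename": "Abebe"},
    {"Ssn": 2, "Pnumber": 10, "Hours": 12, "Ename": "Sara"},
]

print(is_full_dependency(emp_pro, ["Ssn", "Pnumber"], ["Hours"]))    # True
print(is_full_dependency(emp_pro, ["Ssn", "Pnumber"], ["Ename"]))    # False
print(removable_attributes(emp_pro, ["Ssn", "Pnumber"], ["Ename"]))  # ['Pnumber']
```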
Second Normal Form(cont..)
• The test for 2NF involves testing for functional dependencies
whose left-hand side attributes are part of the primary key. If the
primary key contains a single attribute, the test need not be applied
at all.
• As another example, consider the Patients table below. This table
is in 1NF, and its primary key is the composite key
(PatientId, RelativeId).
[Table: Patients (PatientId, RelativeId, Relationship, Patient_tel)]
Second Normal Form(cont..)
To determine whether it satisfies 2NF, check whether
every other field depends fully on both PatientId and
RelativeId; that is, decide whether both of the
following are full dependencies:
(PatientId, RelativeId) → Relationship; and
(PatientId, RelativeId) → Patient_tel.
However, the dependencies that actually hold in the
Patients table are:
(PatientId, RelativeId) → Relationship; and
(PatientId) → Patient_tel
Patient_tel depends on only part of the primary key.
Therefore, the Patients table is not in 2NF.
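The 2NF test on the Patients table can also be expressed at the schema level, working from the declared dependencies instead of from data. This is a simplified sketch: the function name partial_dependencies is hypothetical, it assumes the primary key is the only candidate key, and it looks only at the FDs as listed (no inference of further FDs).

```python
def partial_dependencies(primary_key, fds):
    """Return the FDs whose left side is a proper part of the primary key
    and whose right side contains attributes outside the key."""
    pk = frozenset(primary_key)
    violations = []
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        nonprime_rhs = set(rhs) - pk  # assumes pk is the only candidate key
        if lhs < pk and nonprime_rhs:  # "<" means proper subset
            violations.append((sorted(lhs), sorted(nonprime_rhs)))
    return violations


patients_pk = ["PatientId", "RelativeId"]
patients_fds = [
    (["PatientId", "RelativeId"], ["Relationship"]),
    (["PatientId"], ["Patient_tel"]),
]

print(partial_dependencies(patients_pk, patients_fds))
# [(['PatientId'], ['Patient_tel'])]  -> Patients is not in 2NF
```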
Second Normal Form(cont..)
• To normalize the Patients table to 2NF, we can break it
into two tables. The Patient_tel field does not belong in the
Patients table because a patient's telephone number has
nothing to do with the patient's relatives and should be
associated with the patient only.
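One way to picture this decomposition is as two projections of the original table. The sketch below is illustrative only: the project helper, the names patient_relatives and patient_phones, and the sample rows are all assumptions, since the slides do not give the resulting table names or data.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (relational projection)."""
    seen, out = set(), []
    for t in rows:
        values = tuple(t[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


# Hypothetical Patients state (made-up values, for illustration only).
patients = [
    {"PatientId": 1, "RelativeId": 10, "Relationship": "Mother", "Patient_tel": "0911-000001"},
    {"PatientId": 1, "RelativeId": 11, "Relationship": "Brother", "Patient_tel": "0911-000001"},
    {"PatientId": 2, "RelativeId": 12, "Relationship": "Father", "Patient_tel": "0911-000002"},
]

# Relatives stay with the composite key; the phone number moves to a
# table keyed by PatientId alone.
patient_relatives = project(patients, ["PatientId", "RelativeId", "Relationship"])
patient_phones = project(patients, ["PatientId", "Patient_tel"])

print(len(patient_relatives))  # 3
print(len(patient_phones))     # 2: each phone number is now stored once
```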
Third Normal Form(3NF)
• Third normal form (3NF) is based on the concept of transitive
dependency.
• A functional dependency X → Y in a relation schema R is a
transitive dependency if there exists a set of attributes Z in R
that is neither a candidate key nor a subset of any key of R, and
both X → Z and Z → Y hold.
• The dependency Ssn → Dmgr_ssn is transitive through Dnumber
in EMP_DEPT, because both the dependencies Ssn → Dnumber
and Dnumber → Dmgr_ssn hold and Dnumber is neither a key
itself nor a subset of the key of EMP_DEPT. Intuitively, we can
see that the dependency of Dmgr_ssn on Dnumber is undesirable
in EMP_DEPT since Dnumber is not a key of EMP_DEPT.
• According to Codd's original definition, a relation schema R is
in 3NF if it satisfies 2NF and no nonprime attribute of R is
transitively dependent on the primary key.
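The transitive-dependency definition can be turned into a small search over a list of declared FDs. This is a simplified sketch, not a full algorithm: the function name transitive_dependencies is hypothetical, and it only chains FDs exactly as listed (X → Z followed by Z → Y) without deriving further FDs.

```python
def transitive_dependencies(key, fds, candidate_keys):
    """Find chains key -> Z -> Y where Z is neither a candidate key
    nor a subset of any key."""
    key = frozenset(key)
    keys = [frozenset(k) for k in candidate_keys]
    found = []
    for x, z in fds:
        if frozenset(x) != key:
            continue
        z = frozenset(z)
        if any(z <= k for k in keys):  # Z is a key or part of a key: allowed
            continue
        for z2, y in fds:
            if frozenset(z2) == z:
                found.append((sorted(key), sorted(z), sorted(y)))
    return found


# Dependencies of EMP_DEPT named in the text.
emp_dept_fds = [
    (["Ssn"], ["Dnumber"]),
    (["Dnumber"], ["Dmgr_ssn"]),
]

print(transitive_dependencies(["Ssn"], emp_dept_fds, [["Ssn"]]))
# [(['Ssn'], ['Dnumber'], ['Dmgr_ssn'])]
```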
Third Normal Form(cont..)
• Let us consider the following table.
[Table: Employee1 (EmpId, Empname, Department, Dept_tel)]
• The primary key of this table is EmpId. Assuming that Empname holds scalar
(atomic) values, this table is in 1NF, and because its primary key is a single
attribute it is also in 2NF.
• Moreover, the fields Empname and Department are both directly associated
with EmpId, the primary key. The last field, Dept_tel, however, contains
the telephone number of the department and is therefore determined by
Department, which is not part of the primary key. In short, the following hold
in this table: EmpId → Department and Department → Dept_tel
• Together, these dependencies give the transitive dependency
EmpId → Department → Dept_tel
Third Normal Form(cont..)
• The normalization of 2NF tables to 3NF involves the
removal of transitive dependencies. We remove the
transitively dependent field(s) from the table by placing the
field(s) in a new table along with a copy of the determinant(s).
Therefore, the above table can be decomposed into the two 3NF
tables shown below.
[Tables: 3NF decomposition of Employee1]
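The removal step described above (move the dependent fields to a new table together with a copy of the determinant) might look like this in Python. This is a sketch under assumptions: decompose_on_fd is a hypothetical helper, and the Employee1 rows are made up except for the names that appear in the slides. The final check rejoins the two tables to confirm that no information was lost.

```python
def decompose_on_fd(rows, determinant, dependents):
    """Split rows on the FD determinant -> dependents.

    Returns (rest, new_table): rest keeps everything except the dependents,
    new_table holds one tuple per distinct determinant value."""
    rest, new_table, seen = [], [], set()
    for t in rows:
        rest.append({a: v for a, v in t.items() if a not in dependents})
        det_val = tuple(t[a] for a in determinant)
        if det_val not in seen:
            seen.add(det_val)
            new_table.append({a: t[a] for a in list(determinant) + list(dependents)})
    return rest, new_table


# Hypothetical Employee1 state (made-up values, for illustration only).
employee1 = [
    {"EmpId": 1, "Empname": "Zahara Hagos", "Department": "Administration", "Dept_tel": "011-100"},
    {"EmpId": 2, "Empname": "Employee B", "Department": "Finance", "Dept_tel": "011-200"},
    {"EmpId": 3, "Empname": "Employee C", "Department": "Finance", "Dept_tel": "011-200"},
    {"EmpId": 4, "Empname": "Employee D", "Department": "Finance", "Dept_tel": "011-200"},
]

employee, department = decompose_on_fd(employee1, ["Department"], ["Dept_tel"])
print(len(department))  # 2: one row per department

# Joining the two tables on Department reconstructs the original rows.
tel = {d["Department"]: d["Dept_tel"] for d in department}
rejoined = [{**e, "Dept_tel": tel[e["Department"]]} for e in employee]
print(rejoined == employee1)  # True
```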
Transitive Dependency Anomalies
• Transitive dependencies can result in insertion, update, or deletion
anomalies.
• Insertion anomalies: Suppose a new department has just been created, but the
company hasn't hired anyone for this new department. An error would occur if
you attempted to add its data to the Employee1 table because there is no EmpId
associated with the new department. Since EmpId is the primary key, you can't
add a new record to the table with EmpId being null. In the normalized design,
the new department data can be inserted into the Department table with no problem.
• Deletion anomalies: These can occur in the Employee1 table if, for example, Zahara
Hagos leaves the company and her record is deleted from the Employee1 table.
Because Zahara Hagos is the only member of the Administration department,
all information associated with the Administration department will be wiped out
even though the department itself has not been eliminated.
• Update anomalies: These can occur in the unnormalized employee table
(Employee1) if the Finance department changes its telephone number.
You would have to update three records in the table even
though only one piece of information about a department has changed. However,
in the normalized table, Department, you need to make the change only once,
i.e., in the Dept_tel field of the record for the Finance department.
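The update and deletion anomalies can be made concrete by counting what each operation touches. The rows below are hypothetical (only the names Zahara Hagos, Administration, and Finance come from the slides), and set_dept_tel is an illustrative helper, not anything defined in the chapter.

```python
# Unnormalized table: the department phone is repeated on every employee row.
employee1 = [
    {"EmpId": 1, "Empname": "Zahara Hagos", "Department": "Administration", "Dept_tel": "011-100"},
    {"EmpId": 2, "Empname": "Employee B", "Department": "Finance", "Dept_tel": "011-200"},
    {"EmpId": 3, "Empname": "Employee C", "Department": "Finance", "Dept_tel": "011-200"},
    {"EmpId": 4, "Empname": "Employee D", "Department": "Finance", "Dept_tel": "011-200"},
]
# Normalized Department table: one row per department.
department = [
    {"Department": "Administration", "Dept_tel": "011-100"},
    {"Department": "Finance", "Dept_tel": "011-200"},
]


def set_dept_tel(rows, dept, new_tel):
    """Change a department's phone in place; return how many rows were touched."""
    touched = 0
    for t in rows:
        if t["Department"] == dept:
            t["Dept_tel"] = new_tel
            touched += 1
    return touched


# Update anomaly: the same fact must be changed in several places.
touched_unnormalized = set_dept_tel(employee1, "Finance", "011-999")
touched_normalized = set_dept_tel(department, "Finance", "011-999")
print(touched_unnormalized, touched_normalized)  # 3 1

# Deletion anomaly: removing the only Administration employee also removes
# every trace of that department from Employee1.
remaining = [t for t in employee1 if t["Empname"] != "Zahara Hagos"]
print({t["Department"] for t in remaining})  # {'Finance'}
print(len(department))                       # 2: Department still lists both
```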
Boyce-Codd Normal Form (BCNF)
• The Boyce-Codd Normal Form is an extension to the 3NF for the
special case where:
  • There are at least two candidate keys in the table,
  • All the candidate keys are composite keys, and
  • There are overlapping field(s) in the candidate keys (there is
  at least one field in common).
• When a table satisfies all these conditions, 3NF cannot eliminate
all forms of transitive dependency. For a table to be in BCNF, it
must be in 3NF, and all fields in all its candidate keys must be
functionally independent. Violation of BCNF is quite rare; it
may only happen under the above conditions.
• The following table (a) shows a case in which BCNF is not
fulfilled. Suppose that the following conditions apply to table (a):
for each subject, each student of that subject is taught by
only one teacher, and each teacher teaches only one subject (but
each subject is taught by several teachers).
Boyce-Codd Normal Form (BCNF)

[Tables: (a) original Student/Subject/Teacher table; (b) and (c) its BCNF decomposition]

• The candidate keys for table (a) are (Student, Subject) and (Student, Teacher).
Both candidate keys are composite, and Student is the field they have in
common. Because Teacher → Subject holds but Teacher is not a candidate key,
the table is not in BCNF.
• Moreover, table (a) suffers from a deletion anomaly. For example, if we
wish to delete the information that Zahara is studying Physics, we cannot do so
without at the same time losing the information that Dr. Jemal teaches
Physics.
• Normalizing this table to BCNF requires decomposing
table (a) into the two tables (b) and (c).
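The BCNF violation in table (a) can be found mechanically. The sketch below uses the usual textbook statement of BCNF (the left-hand side of every nontrivial FD must be a superkey), which the slides do not spell out in this form; the function name bcnf_violations is hypothetical, and only the FDs as listed are examined. The second check assumes that one of the decomposed tables holds Teacher and Subject with Teacher as its key, since the figure is not reproduced here.

```python
def bcnf_violations(fds, candidate_keys):
    """Return the nontrivial FDs whose left side contains no candidate key."""
    keys = [frozenset(k) for k in candidate_keys]
    bad = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        if not any(k <= lhs for k in keys):  # lhs is not a superkey
            bad.append((sorted(lhs), sorted(rhs)))
    return bad


# Table (a): dependencies and candidate keys as described in the text.
fds_a = [
    (["Student", "Subject"], ["Teacher"]),
    (["Teacher"], ["Subject"]),
]
keys_a = [["Student", "Subject"], ["Student", "Teacher"]]
print(bcnf_violations(fds_a, keys_a))
# [(['Teacher'], ['Subject'])]

# Assumed decomposed table (Teacher, Subject) with key Teacher: no violation.
print(bcnf_violations([(["Teacher"], ["Subject"])], [["Teacher"]]))
# []
```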
Thank you!!
