Chapter 14

Chapter 14 discusses the purpose and process of normalization in database design, emphasizing the importance of reducing redundancy and preventing update anomalies through functional dependencies. It outlines the steps of normalization, from First Normal Form (1NF) to Third Normal Form (3NF), and highlights the significance of understanding attribute relationships for effective database structuring. The chapter also addresses the implications of data anomalies and the necessity of decomposing relations to maintain data integrity.
Purpose of Normalization
Definition: A technique for producing a set of relations with desirable properties,
given the data requirements of an enterprise.
Purpose: To identify a suitable set of relations that supports the data
requirements of an enterprise. Such a set of relations has the following
characteristics:
The minimal number of attributes necessary to support the data requirements.
Attributes with a close logical relationship (a functional dependency) are found
in the same relation.
Functional dependency: A relationship between two sets of attributes in a relation
(table). It describes the dependency of one set of attributes on another. More
specifically, if a relation has attributes A and B, B is functionally dependent on
A if every value of A is associated with exactly one value of B.
Minimal redundancy: Each attribute is represented only once, so values are not
duplicated unnecessarily. The important exception is an attribute that forms all
or part of a foreign key. Such attributes are repeated because they are essential
for joining related relations.
How normalization supports database design
Normalization is a formal technique that can be used at any stage of database
design. Approach 1 shows how normalization can be used as a bottom-up
standalone database design technique, and Approach 2 shows how normalization
can be used as a validation technique to check the structure of relations that
may have been created using a top-down approach such as ER modeling. No
matter which approach is used, the goal is the same: creating a set of
well-designed relations that meet the data requirements of the enterprise.
The users' requirements specification is the preferred data source. However, it is
also possible to design a database based on information taken directly from other
data sources, such as forms and reports.

Normalization as a bottom-up standalone technique (Approach 1) is often limited
by the level of detail that the database designer is reasonably expected to
manage. However, this limitation is not applicable when normalization is used as
a validation technique (Approach 2), as the database designer focuses on only
part of the database, such as a single relation, at any one time. Therefore, no
matter what the size or complexity of the database, normalization can be usefully
applied.
Data redundancy and update anomalies
If data redundancy is kept to a minimum, the benefits are as follows:
1: Updates to the data stored in the database are achieved with a minimal
number of operations, thus reducing the opportunities for data inconsistencies
occurring in the database.
2: The file storage space required by the base relations is reduced, thus
minimizing costs.
Relational databases still rely on a limited amount of redundancy for a specific
reason: relationships between relations are represented by foreign keys, which
are copies of primary key (or candidate key) values.

In the StaffBranch relation there is redundant data: the details of a branch are
repeated for every member of staff located at that branch. In contrast, in the
Branch relation the details of each branch appear only once, and only the branch
number (branchNo) is repeated in the Staff relation to represent where each
member of staff is located. Relations that have redundant data may have
problems called update anomalies.
Update anomalies are classified as insertion, deletion, or modification anomalies.
More generally, data anomalies are irregularities, errors, or inconsistencies in a
dataset. They can occur when there are problems with how data is stored,
updated, or retrieved.
Insertion anomalies
1: To insert the details of new members of staff into the StaffBranch relation, we
must include the details of the branch at which the staff are to be located. For
example, to insert the details of new staff located at branch number B007, we
must enter the correct details of branch number B007 so that the branch details
are consistent with values for branch B007 in other tuples of the StaffBranch
relation.
2: If we want to add details for a new branch without any staff in the StaffBranch
relation, we'd need to put nulls in staff-related attributes (like staffNo). However,
this violates the primary key rule (entity integrity) and isn't allowed.
Deletion Anomalies
If we remove a record (tuple) from the StaffBranch relation for the last staff
member in a branch, the branch details are lost too. For example, deleting the
record for SA9 (Mary Howe) erases the details of branch B007 from the database.
The Staff and Branch relations avoid this problem because branch details are
stored separately from staff details, and only the branch number links the two
relations. If we delete SA9 from the Staff relation, B007's details in the Branch
relation are unaffected.
Modification Anomalies
If we want to change the value of one of the attributes of a particular branch in
the StaffBranch relation—for example, the address for branch number B003—we
must update the tuples of all staff located at that branch. If this modification is
not carried out on all the appropriate tuples of the StaffBranch relation, the
database will become inconsistent. In this example, branch number B003 may
appear to have different addresses in different staff tuples.
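The modification anomaly can be reproduced in a few lines. The following is a minimal sketch, assuming a simplified StaffBranch relation held as Python tuples. The staff numbers, names, and addresses are made up for illustration, and `addresses_by_branch` is a hypothetical helper.

```python
# Illustrative StaffBranch tuples: (staffNo, name, branchNo, branchAddress).
# All values here are made up for the sketch.
staff_branch = [
    ("S1", "Ann", "B003", "1 Old Street"),
    ("S2", "Bob", "B003", "1 Old Street"),
    ("S3", "Cy", "B007", "9 High Road"),
]

def addresses_by_branch(rows):
    """Map each branchNo to the set of addresses recorded for it."""
    result = {}
    for _staff_no, _name, branch_no, address in rows:
        result.setdefault(branch_no, set()).add(address)
    return result

# Change the address of B003 in only ONE of its tuples (an incomplete update).
partially_updated = list(staff_branch)
partially_updated[0] = ("S1", "Ann", "B003", "5 New Avenue")

# A branch with more than one recorded address is in an inconsistent state.
inconsistent = {
    branch_no: addresses
    for branch_no, addresses in addresses_by_branch(partially_updated).items()
    if len(addresses) > 1
}
print(inconsistent)
```

Only B003 is reported, because its address now differs between the two staff tuples that repeat it.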
The StaffBranch relation is subject to update anomalies. We can avoid these
anomalies by decomposing the original relation into the Staff and Branch
relations. There are two important properties associated with the decomposition
of a larger relation into smaller relations:
1: The lossless-join property ensures that any instance of the original relation can
be identified from corresponding instances in the smaller relations.
2: The dependency preservation property ensures that a constraint on the original
relation can be maintained by simply enforcing some constraint on each of the
smaller relations. In other words, we do not need to perform joins on the smaller
relations to check whether a constraint on the original relation is violated.
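The lossless-join property can be checked on a small instance. The sketch below assumes a simplified StaffBranch relation with made-up values and the attribute names sName and bAddress, which are assumptions of this example. It projects the relation onto Staff and Branch, then joins the two projections on branchNo.

```python
# Illustrative StaffBranch tuples; every value below is made up for this sketch.
ATTRS = ("staffNo", "sName", "branchNo", "bAddress")
staff_branch = {
    ("S1", "Ann", "B003", "1 Old Street"),
    ("S2", "Bob", "B003", "1 Old Street"),
    ("S3", "Cy", "B007", "9 High Road"),
}

def project(rows, attrs, keep):
    """Relational projection: keep only the named columns and drop duplicates."""
    positions = [attrs.index(a) for a in keep]
    return {tuple(row[p] for p in positions) for row in rows}

STAFF_ATTRS = ("staffNo", "sName", "branchNo")
BRANCH_ATTRS = ("branchNo", "bAddress")
staff = project(staff_branch, ATTRS, STAFF_ATTRS)
branch = project(staff_branch, ATTRS, BRANCH_ATTRS)

# Natural join of Staff and Branch on their shared attribute, branchNo.
rejoined = {
    s + (b_address,)
    for s in staff
    for (b_no, b_address) in branch
    if s[2] == b_no
}

print(rejoined == staff_branch)  # the original instance is recovered
print(len(branch))               # each branch is now stored once
```

The join returns exactly the original tuples, with no tuples lost and no spurious tuples added, which is what a lossless decomposition requires for this instance.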
Functional dependencies
An important concept associated with normalization is functional dependency,
which describes the relationship between attributes (Maier, 1983).
Characteristics of Functional Dependencies
Assume that a relational schema has attributes (A, B, C, . . . , Z) and that the
database is described by a single universal relation called R = (A, B, C, . . . , Z).
This assumption means that every attribute in the database has a unique name.
Functional dependency definition:
Describes the relationship between attributes in a relation. For example, if A and
B are attributes of relation R, B is functionally dependent on A (denoted A → B) if
each value of A is associated with exactly one value of B. (A and B may each
consist of one or more attributes.)
Functional dependency is a property of the meaning or semantics of the
attributes in a relation.
The semantics indicate how attributes relate to one another and specify the
functional dependencies between them. When a functional dependency is present,
it is specified as a constraint between the attributes.

Consider a relation with attributes A and B, where attribute B is functionally
dependent on attribute A. If we know the value of A and we examine the relation
that holds this dependency, we find only one value of B in all the tuples that have
a given value of A, at any moment in time. Thus, when two tuples have the same
value of A, they also have the same value of B. However, for a given value of B,
there may be several different values of A.
The dependency A → B is read as "A functionally determines B."
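The rule that two tuples with the same value of A must have the same value of B can be tested against sample data. The following is a minimal sketch. The function name `holds` and the sample rows are assumptions of this example. Because a functional dependency is a statement about the meaning of the attributes, a sample can only refute a dependency, never prove it.

```python
def holds(rows, lhs, rhs):
    """Return True if no pair of rows in this sample violates lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a determinant is remembered; a later,
        # different value for the same determinant violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Made-up sample tuples for illustration.
staff = [
    {"staffNo": "S1", "branchNo": "B003"},
    {"staffNo": "S2", "branchNo": "B003"},
    {"staffNo": "S3", "branchNo": "B007"},
]

print(holds(staff, ["staffNo"], ["branchNo"]))  # each staffNo has one branchNo
print(holds(staff, ["branchNo"], ["staffNo"]))  # B003 has two staffNo values
```

The first check passes and the second fails, matching the observation that for a given value of B there may be several different values of A.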
Determinant
Refers to the attribute, or group of attributes, on the left-hand side of the arrow
of a functional dependency.
An additional characteristic of functional dependencies that is useful for
normalization is that their determinants should have the minimal number of
attributes necessary to maintain the functional dependency with the attribute(s)
on the right-hand side. This requirement is called full functional dependency.
Full functional dependency
If A and B are attributes of a relation, B is fully functionally dependent on A if B
is functionally dependent on A but not on any proper subset of A. In
normalization this matters most for a composite primary key: an attribute is fully
functionally dependent on the key when its value is determined by all the
attributes of the key together, and removing any attribute from the key would
break the dependency.

For example, if a table has a composite primary key (A, B) and attribute C is fully
functionally dependent on (A, B), the value of C is uniquely determined by A and
B together, and neither A alone nor B alone determines C. If C were determined
by A alone, the dependency of C on (A, B) would be a partial dependency rather
than a full one. Full functional dependencies are important in the process of
normalizing a database to reduce redundancy and improve data integrity.
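The difference between a full and a partial dependency can be sketched by testing every proper subset of the determinant. This is an illustration on made-up sample rows, not a proof, since dependencies come from the meaning of the attributes. The names `holds` and `is_full_dependency` and the attribute names rentStart and sName are assumptions of this example.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """Return True if no pair of rows in this sample violates lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def is_full_dependency(rows, lhs, rhs):
    """lhs -> rhs holds, and no non-empty proper subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                return False  # a smaller determinant suffices: partial dependency
    return True

# Made-up rows: rentStart needs both clientNo and propertyNo.
rentals = [
    {"clientNo": "C1", "propertyNo": "P1", "rentStart": "2020-01-01"},
    {"clientNo": "C1", "propertyNo": "P2", "rentStart": "2021-01-01"},
    {"clientNo": "C2", "propertyNo": "P1", "rentStart": "2022-01-01"},
]
# Made-up rows: branchNo is already determined by staffNo alone.
staff = [
    {"staffNo": "S1", "sName": "Ann", "branchNo": "B003"},
    {"staffNo": "S2", "sName": "Bob", "branchNo": "B003"},
    {"staffNo": "S3", "sName": "Cy", "branchNo": "B007"},
]

print(is_full_dependency(rentals, ("clientNo", "propertyNo"), ("rentStart",)))
print(is_full_dependency(staff, ("staffNo", "sName"), ("branchNo",)))
```

The first dependency is full. The second is partial, because dropping sName from the determinant leaves a dependency that still holds.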

Characteristics of functional dependencies used in normalization
1: In a functional dependency, there's a one-to-one relationship from the left-side
attributes (determinant) to the right-side attributes. This means each unique
value on the left uniquely determines one value on the right. However, the
reverse relationship (from right to left) can be one-to-one or one-to-many,
meaning one value on the right might correspond to one or multiple values on the
left.
2: They hold for all time.
3: The determinant has the minimal number of attributes necessary to maintain
the dependency with the attribute(s) on the right-hand side. In other words, there
must be a full functional dependency between the attribute(s) on the left-hand
and right-hand sides of the dependency.
Transitive dependency
A condition where A, B, and C are attributes of a relation such that if A → B and
B → C, then C is transitively dependent on A via B (provided that A is not
functionally dependent on B or C).
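The definition can be applied mechanically to a list of known dependencies. The sketch below is a simplification: it assumes every dependency has a single attribute on each side and inspects only the dependencies that are listed, not ones that could be inferred. The function name and the attribute bAddress are assumptions of this example.

```python
def transitive_dependencies(fds):
    """Find (A, B, C) triples where A -> B and B -> C are both listed.

    fds is a set of (determinant, dependent) pairs of single attribute names.
    The proviso from the definition is enforced: A must not be functionally
    dependent on B or on C.
    """
    found = []
    for a, b in fds:
        for b2, c in fds:
            if b == b2 and c != a and (b, a) not in fds and (c, a) not in fds:
                found.append((a, b, c))
    return sorted(found)

# Illustrative dependencies: staffNo -> branchNo and branchNo -> bAddress.
fds = {("staffNo", "branchNo"), ("branchNo", "bAddress")}
print(transitive_dependencies(fds))
```

Here bAddress is reported as transitively dependent on staffNo via branchNo.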

Identifying Functional Dependencies


Identifying all functional dependencies between a set of attributes should be
quite simple if the meaning of each attribute and the relationships between the
attributes are well understood. This type of information may be provided by the
enterprise in the form of discussions with users and/or appropriate
documentation, such as the users’ requirements specification. However, if the
users are unavailable for consultation and/or the documentation is incomplete,
then—depending on the database application—it may be necessary for the
database designer to use their common sense and/or experience to provide the
missing information.
Example

Identifying the Primary Key for a Relation Using Functional Dependencies


The main purpose of identifying a set of functional dependencies for a relation is
to specify the set of integrity constraints that must hold on a relation. An
important integrity constraint to consider first is the identification of candidate
keys, one of which is selected to be the primary key for the relation.
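One common way to find candidate keys from a set of functional dependencies is to compute attribute closures, a standard technique that these notes do not describe. The sketch below assumes an illustrative StaffBranch schema and dependency set. The function names and the attributes sName and bAddress are assumptions of this example.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure covers the whole relation."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(all_attrs):
                keys.append(combo)
    return keys

# Illustrative schema and dependencies (assumed for this sketch).
attrs = {"staffNo", "sName", "branchNo", "bAddress"}
fds = [
    (("staffNo",), ("sName", "branchNo")),
    (("branchNo",), ("bAddress",)),
]
print(candidate_keys(attrs, fds))
```

With these dependencies the only candidate key is staffNo, so it would be selected as the primary key. The search tries every subset of attributes, which is acceptable for a handful of attributes but grows exponentially with more.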

The process of normalization


Normalization is a formal technique for analyzing relations based on their primary
key (or candidate keys) and functional dependencies. The technique involves a
series of rules that can be used to test individual relations so that a database can
be normalized to any degree. When a requirement is not met, the relation
violating the requirement must be decomposed into relations that individually
meet the requirements of normalization.
Three normal forms were initially proposed called First Normal Form (1NF),
Second Normal Form (2NF), and Third Normal Form (3NF). Subsequently, R. Boyce
and E. F. Codd introduced a stronger definition of third normal form called
Boyce–Codd Normal Form (BCNF).
Higher normal forms that go beyond BCNF were introduced later such as Fourth
Normal Form (4NF) and Fifth Normal Form (5NF) (Fagin, 1977, 1979). However,
these later normal forms deal with situations that are very rare.
Normalization is often executed as a series of steps, and every normal form has
its own properties. As normalization proceeds, the relations become progressively
more restricted (stronger) in format and also less vulnerable to update anomalies.
It is important to recognize that only First Normal Form (1NF) is critical in
creating relations; all subsequent normal forms are optional. However, to avoid
update anomalies, it is generally recommended to proceed to at least 3NF.
Normalization is described here as a bottom-up technique: information about
attributes is extracted from sample forms that are first transformed into table
format, which is described as being in Unnormalized Form (UNF). This table is then
subjected progressively to the different requirements associated with each
normal form until ultimately the attributes shown in the original sample forms are
represented as a set of 3NF relations. Normalizing a 1NF relation may produce
2NF relations or, in some cases, 3NF relations directly in one step. Each relation
has a primary key. It is essential that the meaning of the attributes and their
relationships is well understood before beginning the process of normalization.
This information is fundamental to normalization and is used to test whether a
relation is in a particular normal form. We begin by describing First Normal Form
(1NF) and then describe Second Normal Form (2NF) and Third Normal Form (3NF).
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and
only one value.
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
We begin the process of normalization by first transferring the data from the
source into table format. At this stage the table is in Unnormalized Form and is
referred to as an unnormalized table. To transform the unnormalized table to
First Normal Form, we identify and remove repeating groups within the table. A
repeating group is an attribute, or group of attributes, within a table that occurs
with multiple values for a single occurrence of the nominated key attribute(s) for
that table.
There are two common approaches to removing repeating groups from
unnormalized tables:
(1) By entering appropriate data in the empty columns of rows containing the
repeating data. In other words, we fill in the blanks by duplicating the
nonrepeating data, where required. This approach is commonly referred to as
"flattening" the table.
(2) By placing the repeating data, along with a copy of the original key attribute(s),
in a separate relation. Sometimes the unnormalized table may contain more than
one repeating group, or repeating groups within repeating groups. In such cases,
this approach is applied repeatedly until no repeating groups remain.
A set of relations is in 1NF if it contains no repeating groups.
Approach 2 creates two or more relations with less redundancy than in the
original UNF table. In other words, approach 2 moves the original UNF table
further along the normalization process than approach 1.
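The two approaches can be compared on a small unnormalized table. The following is a sketch under stated assumptions: the table is modeled as nested Python dictionaries, the repeating group is a list of rentals per client, and all attribute names and values are made up for illustration.

```python
# An unnormalized table: each client row carries a repeating group of rentals.
unnormalized = [
    {"clientNo": "C1", "cName": "Ann", "rentals": [
        {"propertyNo": "P1", "rent": 350},
        {"propertyNo": "P2", "rent": 450},
    ]},
    {"clientNo": "C2", "cName": "Bob", "rentals": [
        {"propertyNo": "P1", "rent": 350},
    ]},
]

# Approach 1: flatten by duplicating the nonrepeating data on every row.
flat = [
    {"clientNo": c["clientNo"], "cName": c["cName"], **r}
    for c in unnormalized
    for r in c["rentals"]
]

# Approach 2: move the repeating group into its own relation,
# together with a copy of the original key attribute (clientNo).
client = [{"clientNo": c["clientNo"], "cName": c["cName"]} for c in unnormalized]
rental = [
    {"clientNo": c["clientNo"], **r}
    for c in unnormalized
    for r in c["rentals"]
]

print(len(flat), len(client), len(rental))
```

In the flattened table the name of client C1 is stored twice, whereas approach 2 stores each client name once and repeats only the key.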
Example:
Second Normal Form (2NF)
A relation that is in first normal form and every non-primary-key attribute is fully
functionally dependent on the primary key.
Second Normal Form (2NF) is based on the concept of full functional
dependency. Second normal form applies to relations with composite keys, that
is, relations with a primary key composed of two or more attributes. A relation
with a single-attribute primary key is automatically in at least 2NF. A relation that
is not in 2NF may suffer from update anomalies.
The normalization of 1NF relations to 2NF involves the removal of partial
dependencies. If a partial dependency exists, we remove the partially dependent
attribute(s) from the relation by placing them in a new relation along with a copy
of their determinant.
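The removal of partial dependencies can be sketched as a small function. This is a simplified illustration that assumes the functional dependencies are already known and that the relation has one composite primary key. The function name `to_2nf` and the attribute names cName, pAddress, and rentStart are assumptions of this example.

```python
def to_2nf(attrs, primary_key, fds):
    """Split off attributes that depend on only part of the primary key.

    Returns the reduced original relation and a list of new relations,
    each made of a determinant followed by its partially dependent attributes.
    """
    pk = set(primary_key)
    remaining = list(attrs)
    new_relations = []
    for lhs, rhs in fds:
        if set(lhs) < pk:  # determinant is a proper subset of the key
            moved = [a for a in rhs if a in remaining and a not in pk]
            if moved:
                new_relations.append(tuple(lhs) + tuple(moved))
                remaining = [a for a in remaining if a not in moved]
    return tuple(remaining), new_relations

# Illustrative 1NF relation with composite key (clientNo, propertyNo).
attrs = ("clientNo", "propertyNo", "cName", "pAddress", "rentStart")
fds = [
    (("clientNo", "propertyNo"), ("rentStart",)),  # full dependency
    (("clientNo",), ("cName",)),                   # partial dependency
    (("propertyNo",), ("pAddress",)),              # partial dependency
]

rental, others = to_2nf(attrs, ("clientNo", "propertyNo"), fds)
print(rental)
print(others)
```

The determinants clientNo and propertyNo stay in the original relation as well, so the new relations can still be joined back to it.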
Example
Although 2NF relations have less redundancy than those in 1NF, they may still
suffer from update anomalies. For example, if we want to update the name of an
owner, such as Tony Shaw (ownerNo CO93), we have to update two tuples in the
PropertyOwner relation. If we update only one tuple and not the other, the
database would be in an inconsistent state. This update anomaly is caused by a
transitive dependency. We need to remove such dependencies by progressing to
third normal form.
Third Normal Form (3NF)
Definition: A relation that is in first and second normal form and in which no non-
primary-key attribute is transitively dependent on the primary key.
The normalization of 2NF relations to 3NF involves the removal of transitive
dependencies. If a transitive dependency exists, we remove the transitively
dependent attribute(s) from the relation by placing the attribute(s) in a new
relation along with a copy of the determinant.
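The removal of transitive dependencies follows the same pattern. The sketch below is a simplification: it assumes the dependencies are known, that the relation has a single candidate key, and that any determinant containing no key attribute signals a transitive dependency. The function name `to_3nf` and the attributes pAddress and oName are assumptions of this example.

```python
def to_3nf(attrs, primary_key, fds):
    """Split off attributes that depend on the key only via a non-key attribute.

    Returns the reduced original relation and a list of new relations,
    each made of a determinant followed by its transitively dependent attributes.
    """
    pk = set(primary_key)
    remaining = list(attrs)
    new_relations = []
    for lhs, rhs in fds:
        # Determinant consists only of non-key attributes of this relation.
        if not set(lhs) & pk and set(lhs) <= set(remaining):
            moved = [a for a in rhs if a in remaining and a not in pk]
            if moved:
                new_relations.append(tuple(lhs) + tuple(moved))
                remaining = [a for a in remaining if a not in moved]
    return tuple(remaining), new_relations

# Illustrative 2NF relation: PropertyOwner with primary key propertyNo.
attrs = ("propertyNo", "pAddress", "ownerNo", "oName")
fds = [
    (("propertyNo",), ("pAddress", "ownerNo")),
    (("ownerNo",), ("oName",)),  # oName depends on the key via ownerNo
]

property_for_rent, others = to_3nf(attrs, ("propertyNo",), fds)
print(property_for_rent)
print(others)
```

The owner name moves to a new relation keyed by ownerNo, while ownerNo stays behind in the original relation as a foreign key.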
Example:
The original ClientRental relation can be recreated by joining the Client, Rental,
PropertyForRent, and Owner relations through the primary key/foreign key
mechanism. The ownerNo attribute is a primary key within the Owner relation and
is also present within the PropertyForRent relation as a foreign key. The ownerNo
attribute acting as a primary key/foreign key allows the association of the
PropertyForRent and Owner relations to identify the name of property owners.
The clientNo attribute is a primary key of the Client relation and is also present
within the Rental relation as a foreign key.
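The primary key/foreign key mechanism can be tried out with an in-memory SQLite database. The sketch below uses Python's standard sqlite3 module and only two of the four relations, reduced to a few columns. The column oName and the property numbers are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The REFERENCES clause documents the foreign key. SQLite enforces it only
# when PRAGMA foreign_keys = ON, which this sketch does not rely on.
conn.executescript("""
    CREATE TABLE Owner (ownerNo TEXT PRIMARY KEY, oName TEXT);
    CREATE TABLE PropertyForRent (
        propertyNo TEXT PRIMARY KEY,
        ownerNo TEXT REFERENCES Owner(ownerNo)
    );
    INSERT INTO Owner VALUES ('CO93', 'Tony Shaw');
    INSERT INTO PropertyForRent VALUES ('P1', 'CO93'), ('P2', 'CO93');
""")

# Join on ownerNo to recover the owner name for each property.
rows = conn.execute("""
    SELECT p.propertyNo, o.oName
    FROM PropertyForRent AS p
    JOIN Owner AS o ON o.ownerNo = p.ownerNo
    ORDER BY p.propertyNo
""").fetchall()
conn.close()

print(rows)
```

The owner name is stored once in Owner, so renaming the owner is a single-row update, yet the join still reports the name against every property.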
