Database Management Systems

Uploaded by

luvyharish
Copyright
© © All Rights Reserved

Unit 3 Normalization

What is normalization

• Database normalization is the process of removing redundant data from tables in order to
improve storage efficiency, data integrity and scalability.

• Normalization is the process of organizing a database in a way that reduces redundancy
and dependency.

• It is a crucial step in designing an efficient and effective database structure.

• Normalization generally involves splitting existing tables into multiple ones, which must
be re-joined or linked each time a query is issued.

• It is a formal process of decomposing relations with anomalies to produce smaller,
well-structured and stable relations.

• It is primarily a tool to validate and improve a logical design so that it satisfies certain
constraints that avoid unnecessary duplication of data.

• It is a bottom-up technique.

3.1 Purpose of Normalization


• Characteristics of a suitable set of relations include:

– The minimal number of attributes necessary to support the data requirements of the enterprise;
– Attributes with a close logical relationship are found in the same relation;
– Minimal redundancy with each attribute represented only once with the important exception of
attributes that form all or part of foreign keys.

• The benefits of using a database that has a suitable set of relations are that the database will:
– Be easier for the user to access and maintain;
– Take up minimal storage space on the computer.

3.2 How Normalization Supports Database Design

Approach 1 shows how normalization can be used as a bottom-up standalone database design
technique.

Approach 2 shows how normalization can be used as a validation technique to check the structure
of relations which may have been created using a top-down approach such as ER modeling.

Although the users’ requirements specification is the preferred data source, it is also possible to
design a database based on information taken directly from other data sources, such as forms and
reports.

3.3 Data Redundancy and Update Anomalies


• Major aim of relational database design is to group attributes into relations to minimize data
redundancy.

• Potential benefits for implemented database include:


– Updates to the data stored in the database are achieved with a minimal number of operations,
thus reducing the opportunities for data inconsistencies.
– Reduction in the file storage space required by the base relations, thus minimizing costs.

• Problems associated with data redundancy are illustrated by comparing the Staff and Branch
relations with the Staff-Branch relation.

Staff-Branch relation has redundant data; the details of a branch are repeated for every member of
staff.

• In contrast, the branch information appears only once for each branch in the Branch relation and
only the branch number (branchNo) is repeated in the Staff relation, to represent where each
member of staff is located.

Relations that contain redundant information may potentially suffer from update anomalies.
In contrast, a well-structured relation contains minimal data redundancy and allows users to
insert, delete, and update rows without causing data inconsistencies.

• The goal is to avoid (or minimize) three types of anomalies:

– Insertion Anomaly – adding new rows forces the user to create duplicate data
– Deletion Anomaly – deleting a row may cause loss of other data representing completely
different facts
– Modification Anomaly – changing data in a row forces changes to other rows because of
duplication

General rule of thumb: a table should not pertain to more than one entity type.
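
The anomalies above can be illustrated with a small sketch. The rows, staff numbers and addresses below are hypothetical sample data (only the relation names and branchNo come from the notes); the point is simply that a fact stored in several rows can drift out of step, while a fact stored once cannot.

```python
# Hypothetical StaffBranch rows: the branch address is repeated for every
# member of staff who works at that branch.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B1", "branchAddress": "12 High St"},
    {"staffNo": "S2", "branchNo": "B1", "branchAddress": "12 High St"},
    {"staffNo": "S3", "branchNo": "B2", "branchAddress": "5 Low Rd"},
]


def addresses_of(rows, branch_no):
    """Return the set of distinct addresses recorded for one branch."""
    return {row["branchAddress"] for row in rows if row["branchNo"] == branch_no}


# Modification anomaly: only one of the duplicated rows is updated,
# so branch B1 now has two different addresses on record.
staff_branch[0]["branchAddress"] = "99 New Ave"
conflicting = addresses_of(staff_branch, "B1")

# Deletion anomaly: deleting the only member of staff at B2 also deletes
# everything that was known about branch B2.
after_delete = [row for row in staff_branch if row["staffNo"] != "S3"]
b2_still_known = any(row["branchNo"] == "B2" for row in after_delete)

# With separate Staff and Branch relations each branch is stored once,
# so one update is enough and deleting staff does not remove the branch.
branch = {"B1": "12 High St", "B2": "5 Low Rd"}
staff = [("S1", "B1"), ("S2", "B1"), ("S3", "B2")]
branch["B1"] = "99 New Ave"
staff = [member for member in staff if member[0] != "S3"]
```

After running this, conflicting holds two addresses for B1 and b2_still_known is False, whereas the separate branch mapping still contains B2.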

3.4 Functional Dependencies


Functional dependencies are relationships between attributes in a relation. They describe how the
value of one attribute (or set of attributes) determines the value of another. A functional
dependency typically exists between the primary key and a non-key attribute within a table, and is
written as

X → Y

which means that any two rows with the same value of X also have the same value of Y. The left
side of the FD is known as the determinant, and the right side is known as the dependent.

For example:
consider the following table.

From the above table we can conclude some valid functional dependencies:
• roll_no → {name, dept_name, dept_building}: roll_no can determine the values of the fields
name, dept_name and dept_building, hence this is a valid functional dependency.

• roll_no → dept_name: since roll_no can determine the whole set {name, dept_name,
dept_building}, it can also determine its subset dept_name.

• dept_name → dept_building: dept_name can identify the dept_building accurately, since
every row with the same dept_name has the same dept_building.

• More valid functional dependencies: roll_no → name, {roll_no, name} → {dept_name,
dept_building}, etc.

Here are some invalid functional dependencies:

• name → dept_name: students with the same name can have different dept_name values, hence
this is not a valid functional dependency.

• dept_building → dept_name: there can be multiple departments in the same building. For
example, in the above table departments ME and EC are in the same building B2, hence
dept_building → dept_name is an invalid functional dependency.

• More invalid functional dependencies: name → roll_no, {name, dept_name} → roll_no,
dept_building → roll_no, etc.
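
Whether a dependency is contradicted by a given set of rows can be checked mechanically. The sketch below is one minimal way to do it; the function name holds and the sample rows are assumptions made for illustration (the rows are chosen only to agree with the statements above, for example that ME and EC share building B2).

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs.

    rows is a list of dicts; lhs and rhs are lists of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for illustration only.
columns = ("roll_no", "name", "dept_name", "dept_building")
students = [dict(zip(columns, values)) for values in [
    (42, "abc", "CO", "A4"),
    (43, "pqr", "IT", "A3"),
    (44, "xyz", "CO", "A4"),
    (45, "xyz", "IT", "A3"),
    (46, "mno", "EC", "B2"),
    (47, "jkl", "ME", "B2"),
]]

valid_1 = holds(students, ["roll_no"], ["name", "dept_name", "dept_building"])
valid_2 = holds(students, ["dept_name"], ["dept_building"])
invalid_1 = holds(students, ["name"], ["dept_name"])
invalid_2 = holds(students, ["dept_building"], ["dept_name"])
```

Note that a check like this can only refute a dependency. A dependency that happens to hold in one sample of rows is valid only if it follows from the meaning of the data, not from the rows that happen to be present.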

Functional dependencies are divided into two types:
1. Full Functional Dependency
2. Partial Functional Dependency

Full Functional Dependency: In a full functional dependency, an attribute or set of attributes Y
is determined by a set of attributes X, but not by any proper subset of X.

If a relation R has attributes X, Y, Z with the dependencies X → Y and X → Z, where X is a single
attribute, then those dependencies are fully functional, because X has no smaller part that could
determine Y or Z on its own.

Partial Functional Dependency: In a partial functional dependency, a non-key attribute depends
on a part of the composite key, rather than on the whole key. If a relation R has attributes X, Y, Z,
where {X, Y} is the composite key and Z is a non-key attribute, then X → Z is a partial
functional dependency.

Transitive Functional Dependency
In a transitive functional dependency, the dependent is indirectly dependent on the determinant,
i.e. if a → b and b → c, then according to the axiom of transitivity, a → c. This is a transitive
functional dependency.

For example,

enrol_no   name   dept   building_no
42         abc    CO     4
43         pqr    EC     2
44         xyz    IT     1
45         abc    EC     2

Here, enrol_no → dept and dept → building_no.


Hence, according to the axiom of transitivity, enrol_no → building_no is a valid functional
dependency. This is an indirect functional dependency, hence called Transitive functional
dependency.
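
The axiom of transitivity can be applied mechanically by computing which attributes are reachable from a determinant. The sketch below is a minimal version of that idea, not a full dependency-inference tool; the function name closure is an assumption, and only the two dependencies stated above are encoded.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs.

    fds is a list of (lhs, rhs) pairs, each side being a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left is already determined, so is the right.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [
    ({"enrol_no"}, {"dept"}),
    ({"dept"}, {"building_no"}),
]

reachable = closure({"enrol_no"}, fds)
```

Here reachable contains building_no even though no dependency lists it directly under enrol_no, which is exactly the transitive dependency enrol_no → building_no.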

3.5 First normal form


A database table is in first normal form if it satisfies the following conditions:
* It contains only atomic values.
* There are no repeating groups.
An atomic value is a value that cannot be divided. A repeating group means that a table contains
two or more columns that are closely related.

Example 1:
In the table [table-product], the value in the [color] column of the first row can be divided into
“yellow” and “blue”. Because the [color] column can contain multiple values, the table is not in
first normal form.

To bring this table to first normal form, we split the table into two tables, and now we have the
resulting tables.
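
As a rough sketch of such a split, the snippet below separates a multi-valued color column into a table of its own. The product ids, prices and the comma-separated storage format are hypothetical; only the [color] column and the values yellow and blue come from the example.

```python
# Hypothetical un-normalized rows: the color column holds several values at once.
product_unnormalized = [
    {"product_id": 1, "price": 15, "color": "yellow, blue"},
    {"product_id": 2, "price": 23, "color": "red"},
]

# Table 1: one row per product, atomic values only.
product = [
    {"product_id": row["product_id"], "price": row["price"]}
    for row in product_unnormalized
]

# Table 2: one row per (product, color) pair, so every color value is atomic.
product_color = [
    {"product_id": row["product_id"], "color": color.strip()}
    for row in product_unnormalized
    for color in row["color"].split(",")
]
```

Product 1 now appears twice in product_color, once with yellow and once with blue, and no column holds more than one value.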
Example 2:

3.6 Second Normal Form

Definition

A database table is in second normal form if it satisfies the following conditions:

* It is in first normal form.

* All non-key attributes are fully functionally dependent on the primary key.

In a table, if attribute B is functionally dependent on A but is not functionally dependent on any
proper subset of A, then B is considered fully functionally dependent on A. Hence, in a 2NF table,
no non-key attribute can be dependent on a proper subset of the primary key.

Note that if the primary key is not a composite key, all non-key attributes are always fully
functionally dependent on the primary key. A table that is in first normal form and has a
single-attribute primary key is automatically in second normal form.

The above table has a composite primary key [customer id, store id]. The non-key attribute is
[purchase location]. In this case, [purchase location] depends only on [store id], which is only part
of the primary key. Therefore this table does not satisfy second normal form.

To bring this table to second normal form, we break the table into two tables, and now we have the
following.

What we have done is to remove the partial functional dependency that we initially had. Now, in
the table [table-store], the column [purchase location] is fully dependent on the primary key of
that table, which is [store id].
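
A decomposition of this kind can be sketched as two projections of the original rows. The ids and city names below are hypothetical sample data, and the variable names are assumptions; the sketch also rejoins the two tables to show that no information is lost.

```python
# Hypothetical rows: (customer_id, store_id, purchase_location).
purchases = [
    (1, 1, "Los Angeles"),
    (1, 3, "San Francisco"),
    (2, 1, "Los Angeles"),
    (3, 2, "New York"),
    (4, 3, "San Francisco"),
]

# Keep the composite key on its own: which customer bought at which store.
table_purchase = sorted({(customer, store) for customer, store, _ in purchases})

# Move the partially dependent attribute to a table keyed by store_id alone.
table_store = sorted({(store, location) for _, store, location in purchases})

# Rejoining on store_id reproduces the original rows.
location_of = dict(table_store)
rejoined = sorted(
    (customer, store, location_of[store]) for customer, store in table_purchase
)
```

Each store location is now stored once in table_store instead of once per purchase, and rejoined equals the original purchases.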

3.7 Third Normal Form


Definition

A database table is in third normal form if it satisfies the following conditions:
i) It is in second normal form.
ii) There is no transitive functional dependency.

A transitive functional dependency is the following relationship in a table: B is functionally
dependent on A, and C is functionally dependent on B. In this case C is transitively dependent on
A via B.

Example:

In the above table, [book-id] determines [genre-id], and [genre-id] determines [genre-type].
Therefore [book-id] determines [genre-type] via [genre-id]; this is a transitive functional
dependency, so the structure does not satisfy third normal form.

To bring this table to third normal form, we split the table into two as follows.

Now all non-key attributes are fully functionally dependent only on the primary key. In
[table-book], both [genre-id] and [price] are dependent only on [book-id].

In [table-genre], [genre-type] is dependent only on [genre-id].
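
The effect of this split can be sketched in a few lines. The rows, prices and category names below are hypothetical, and the identifiers (genre_id, genre_type and so on) are illustrative spellings of the columns discussed above; the sketch shows that once the transitively dependent column lives in its own table, renaming a category is a single change.

```python
# Hypothetical rows: (book_id, genre_id, genre_type, price).
books = [
    (1, 1, "Gardening", 25),
    (2, 2, "Sports", 15),
    (3, 1, "Gardening", 10),
    (4, 3, "Travel", 13),
    (5, 2, "Sports", 17),
]

# table-book: attributes that depend directly on book_id.
table_book = [(book, genre, price) for book, genre, _, price in books]

# table-genre: the attribute that depends on genre_id, stored once per genre.
table_genre = dict({(genre, kind) for _, genre, kind, _ in books})


def rejoin(book_rows, genre_lookup):
    """Rebuild the original four-column rows from the two tables."""
    return [(book, genre, genre_lookup[genre], price) for book, genre, price in book_rows]


original_again = rejoin(table_book, table_genre)

# Renaming a genre is now one update instead of one update per book.
table_genre[2] = "Athletics"
renamed = [row for row in rejoin(table_book, table_genre) if row[1] == 2]
```

Both books with genre id 2 pick up the new name, although only one entry in table_genre was changed.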

3.8 Boyce-Codd Normal Form( BCNF)


A database table is set to see in BCNF if in is in 3NF and contains each and every determinant as a
candidate key. The process of converting the table into BCNF is as follows:

* Remove the nontrieval functional dependency.

* Make separate table for the determinates.
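
The BCNF condition can be tested by checking, for each non-trivial dependency, whether its determinant determines every attribute of the relation (that is, whether it is a superkey, which is how the condition is usually checked in practice). The sketch below assumes a hypothetical relation with attributes student, course and instructor; the function names are also assumptions.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(attributes)
    ]


# Hypothetical relation: each instructor teaches exactly one course.
attributes = {"student", "course", "instructor"}
fds = [
    ({"student", "course"}, {"instructor"}),
    ({"instructor"}, {"course"}),
]

violations = bcnf_violations(attributes, fds)
```

Only instructor → course is reported, because instructor alone does not determine student. Following the steps above, that dependency would be moved to a separate table keyed by instructor.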

The BCNF form of the table below is as follows:


Constraints included in a relational model

The relational data model includes several types of constraints whose purpose is to maintain
the accuracy and integrity of the data in the database. The major types of integrity constraints
are:
1. Domain Constraints
2. Entity Integrity
3. Referential Integrity
4. Operational constraints

3.9 Domain constraints


A domain is a set of values that may be assigned to an attribute. A domain definition usually
consists of the following components.
1. Domain name
2. Meaning
3. Data Type
4. Size or length
5. Allowable values or allowable range
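
The five components listed above map naturally onto a small record type. The sketch below is one possible representation, not a standard API; the class name Domain, its field names, and the example domain for branch numbers (a letter B followed by digits, at most four characters) are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Domain:
    name: str                              # 1. domain name
    meaning: str                           # 2. meaning
    data_type: type                        # 3. data type
    max_length: int                        # 4. size or length
    is_allowed: Callable[[object], bool]   # 5. allowable values or range

    def accepts(self, value) -> bool:
        """Return True if value satisfies every part of the domain definition."""
        return (
            isinstance(value, self.data_type)
            and len(str(value)) <= self.max_length
            and self.is_allowed(value)
        )


# Hypothetical domain definition used only for illustration.
branch_numbers = Domain(
    name="BranchNumbers",
    meaning="The set of all possible branch numbers",
    data_type=str,
    max_length=4,
    is_allowed=lambda v: v.startswith("B") and v[1:].isdigit(),
)
```

With this definition, branch_numbers.accepts("B005") is True, while "X005" (wrong prefix), "B00005" (too long) and the integer 5 (wrong type) are all rejected.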

3.10 Entity integrity

The entity integrity rule is designed to assure that every relation has a primary key, and that the
data values for that primary key are all valid. Entity integrity guarantees that every primary key
attribute is non-null.

3.11 Referential integrity


A referential integrity constraint is a rule that maintains consistency among the rows of two
tables (relations). The rule states that if there is a foreign key in one relation, each foreign
key value must either match a primary key value in the other relation or be null.
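
Both integrity rules can be observed with Python's built-in sqlite3 module. This is a minimal sketch: the table and column names and the sample values are hypothetical, and the details are specific to SQLite (foreign keys are enforced only after PRAGMA foreign_keys = ON, and NOT NULL is written explicitly on the primary key columns).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE branch (branchNo TEXT PRIMARY KEY NOT NULL, city TEXT)")
conn.execute(
    "CREATE TABLE staff ("
    " staffNo TEXT PRIMARY KEY NOT NULL,"
    " branchNo TEXT REFERENCES branch(branchNo))"
)

conn.execute("INSERT INTO branch VALUES ('B1', 'London')")
conn.execute("INSERT INTO staff VALUES ('S1', 'B1')")   # matches a primary key value
conn.execute("INSERT INTO staff VALUES ('S2', NULL)")   # a null foreign key is allowed


def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Referential integrity: there is no branch B9, so this row is rejected.
dangling_accepted = try_insert("INSERT INTO staff VALUES ('S3', 'B9')")

# Entity integrity: a null primary key value is rejected.
null_key_accepted = try_insert("INSERT INTO branch VALUES (NULL, 'Paris')")

staff_count = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
```

Both rejected statements leave the tables unchanged, so staff still holds only the two valid rows.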

1. What is a tuple?

In relational terminology, a tuple is a row (record) of a relation.

2. What do you mean by the degree of a relation?


The number of attributes in a relation is called the degree of the relation.

3. What is the cardinality of the relation?


The number of tuples or rows in a relation is called the cardinality of the relation.

4. What is a candidate key?

A candidate key is an attribute, or a minimal set of attributes, that can uniquely identify a row
in a table. Every relation has at least one candidate key, because at least the combination of all
its attributes has the uniqueness property.

5. What is a primary key?

A primary key is a column (or set of columns) in a table whose purpose is to uniquely identify
each record in that table. One candidate key is designated as the primary key.

6. What is an alternate key?

The candidate keys of a table that remain after the primary key has been chosen are called
alternate keys.

7. What do you mean by foreign key?

A foreign key is a column in a table whose purpose is to identify a record in a different table.
In other words, a foreign key is an attribute or combination of attributes of one relation (table)
whose values are required to match those of the primary key of another relation. The foreign key
and the primary key should also be defined on the same underlying domain.

8. What is normalization?

Normalization is a process in which we analyze and decompose complex relations and
transform them into smaller, simpler and well-structured relations in order to validate and improve
the logical design, so that the design satisfies certain constraints and avoids unnecessary
duplication of data.

9. What are the two problematic issues in the design of a relational database?

The two most problematic issues in the design of relational databases are:
1. Repetition of information (redundancy)
2. Inability to represent certain information.

10. What do you mean by determinant?

Determinant refers to the attribute or group of attributes on the left-hand side of the arrow of a
functional dependency. For example, in
Emp_no -> emp_name
the determinant is Emp_no.

11. What is a trivial functional dependency?


A dependency is trivial if and only if the right-hand side is a subset of the left-hand
side (the determinant).

12. What is transitive dependency?


A transitive dependency in a relation is a functional dependency between two or more nonkey
attributes.

13. What is first normal form (1NF)?

A relation is in 1NF when multi-valued attributes (repeating groups) have been removed,
i.e. repeating groups are eliminated.

14. What is second normal form (2NF)?

A relation is in 2NF when, in addition, partial functional dependencies have been removed,
i.e. redundant data is eliminated.

15. What is third normal form (3NF)?

A relation is in 3NF when, in addition, transitive dependencies have been removed, i.e. columns
not dependent on the key are eliminated.

16. What is Boyce-Codd normal form (BCNF)?

A relation is in BCNF when the remaining anomalies that result from functional dependencies
have been removed.

17. Mention the purpose of normalization.

i) Minimize redundancy in data.
ii) Remove insert, delete and update anomalies during database activities.
iii) Reduce the need to reorganize data when it is modified or enhanced.
