Unit 2

The document discusses database design using Entity Relationship (ER) modeling. It covers key concepts in ER modeling including entities, attributes, relationships, cardinality, and ER diagram symbols. ER diagrams provide a graphical representation of database structure and the relationships between real-world entities. They can be used to design relational databases by mapping ER diagrams to tables. The document also provides examples of different types of entities, attributes, relationships, and cardinality.

Uploaded by

ananyamarysebastian05


Unit 2: Database Design and ER Model

2.1 Overview
2.2 ER – Model
2.3 Constraints
2.4 E-R Diagrams, ERD Issues, Weak Entity Sets
2.5 Codd’s Rules
2.6 Relational database model: Logical view of data, keys, integrity rules
2.7 Relational Database design: Features of good relational database design
2.8 Atomic domain and Normalization 1NF, 2NF, 3NF, BCNF

The Entity-Relationship (ER) model is a model for identifying the entities to be represented in a
database and describing how those entities are related. The ER data model specifies an enterprise
schema that represents the overall logical structure of a database graphically.

The Entity-Relationship Diagram (ERD) shows the relationships among the entities present in the
database. ER models are used to model real-world objects such as a person, a car, or a company, and
the relationships between these objects. In short, the ER diagram is a structural blueprint of the database.

Why Use ER Diagrams In DBMS?

• ER diagrams represent the E-R model of a database in a form that is easy to convert into
relations (tables).

• ER diagrams model real-world objects directly, which makes them intuitive to use.

• ER diagrams require no technical knowledge and no hardware support.

• They are easy to understand and easy to create, even for a naive user.

• They give a standard way of visualizing the data logically.

Symbols Used in ER Model

The ER model is used to model the logical view of the system from a data perspective. It uses
these symbols:

• Rectangles: Rectangles represent Entities in the ER Model.

• Ellipses: Ellipses represent Attributes in the ER Model.

• Diamonds: Diamonds represent Relationships among Entities.

• Lines: Lines link attributes to entities, and entity sets to relationship types.

• Double Ellipse: Double Ellipses represent Multi-Valued Attributes.

• Double Rectangle: A Double Rectangle represents a Weak Entity.

Components of ER Diagram

ER Model consists of Entities, Attributes, and Relationships among Entities in a Database System.

Entity

An Entity may be an object with a physical existence – a particular person, car, house, or employee
– or it may be an object with a conceptual existence – a company, a job, or a university course.

Entity Set: An entity is an instance of an entity type, and the set of all entities of that type is
called an entity set. For example, E1 is an entity of entity type Student, and the set of all students
is the entity set. In an ER diagram, an entity type is represented by a rectangle.

1. Strong Entity

A Strong Entity is a type of entity that has a key attribute. A strong entity does not depend on any
other entity in the schema. It has a primary key, which identifies it uniquely, and it is represented
by a rectangle. Such entity types are called Strong Entity types.

2. Weak Entity

A strong entity type has a key attribute that uniquely identifies each entity in the entity set. But
some entity types exist for which a key attribute cannot be defined. These are called Weak Entity types.

For example, a company may store information about the dependents (parents, children, spouse) of
an employee. The dependents cannot exist in the database without the employee. So Dependent is a
Weak Entity type and Employee is the identifying entity type for Dependent, which means Employee is
a Strong Entity type.

A weak entity type is represented by a Double Rectangle. The participation of weak entity types is
always total. The relationship between the weak entity type and its identifying strong entity type is
called identifying relationship and it is represented by a double diamond.

Strong Entity and Weak Entity

Attributes

Attributes are the properties that define the entity type. For example, Roll_No, Name, DOB, Age,
Address, and Mobile_No are the attributes that define entity type Student. In ER diagram, the attribute
is represented by an oval.

Attribute

1. Key Attribute

The attribute which uniquely identifies each entity in the entity set is called the key attribute. For
example, Roll_No will be unique for each student. In an ER diagram, the key attribute is represented
by an oval with the attribute name underlined.

Key Attribute

2. Composite Attribute

An attribute composed of several other attributes is called a composite attribute. For example, the
Address attribute of the Student entity type consists of Street, City, State, and Country. In an ER
diagram, a composite attribute is represented by an oval connected to the ovals of its components.

Composite Attribute

3. Multivalued Attribute

An attribute that can hold more than one value for a given entity is called a multivalued attribute.
For example, Phone_No (a student can have more than one). In an ER diagram, a multivalued attribute
is represented by a double oval.

Multivalued Attribute

4. Derived Attribute

An attribute that can be derived from other attributes of the entity type is known as a derived attribute,
e.g. Age (which can be derived from DOB). In an ER diagram, a derived attribute is represented by a
dashed oval.

Derived Attribute

The Complete Entity Type Student with its Attributes can be represented as:

Entity and Attributes
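
As a rough illustration of the attribute types above, the Student entity type could be sketched as a Python data class. The field names follow the attributes listed in this section; the sample values (date of birth, street) and the `age` helper are assumptions made only for this example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Address:
    # Composite attribute: built from simpler component attributes.
    street: str
    city: str
    state: str
    country: str


@dataclass
class Student:
    roll_no: int          # key attribute: unique for each student
    name: str
    dob: date
    address: Address      # composite attribute
    mobile_nos: List[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        # Derived attribute: computed from dob instead of being stored.
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


# Hypothetical sample entity, for illustration only.
s = Student(1, "Shyam", date(2005, 6, 15),
            Address("MG Road", "Delhi", "Delhi", "India"), ["123456789"])
print(s.age(date(2024, 6, 14)))  # 18
```

Keeping Age as a method rather than a stored field mirrors the idea that a derived attribute is always recomputed from DOB.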

Relationship Type and Relationship Set

A Relationship Type represents the association between entity types. For example, ‘Enrolled in’ is a
relationship type that exists between the entity types Student and Course. In an ER diagram, a
relationship type is represented by a diamond connected to the participating entities by lines.

Entity-Relationship Set

A set of relationships of the same type is known as a relationship set. The following relationship set
depicts S1 as enrolled in C2, S2 as enrolled in C1, and S3 as enrolled in C3.

Relationship Set

Degree of a Relationship Set

The number of different entity sets participating in a relationship set is called the degree of a
relationship set.

1. Unary Relationship: When there is only ONE entity set participating in a relationship, the
relationship is called a unary relationship. For example, one person is married to only one person.

Unary Relationship

2. Binary Relationship: When there are TWO entity sets participating in a relationship, the
relationship is called a binary relationship. For example, a Student is enrolled in a Course.

Binary Relationship

3. n-ary Relationship: When there are n entity sets participating in a relationship, the relationship
is called an n-ary relationship.

Cardinality

The number of times an entity of an entity set participates in a relationship set is known as cardinality.
Cardinality can be of different types:

1. One-to-One: When each entity in each entity set can take part only once in the relationship, the
cardinality is one-to-one. Let us assume that a male can marry one female and a female can marry
one male. So the relationship will be one-to-one.

The total number of tables that can be used in this is 2.

one to one cardinality

Using Sets, it can be represented as:

Set Representation of One-to-One

2. One-to-Many: In a one-to-many mapping, each entity in one entity set can be related to more than
one entity in the other set, while each entity in the other set is related to at most one. Let us assume
that one surgery department can accommodate many doctors. So the cardinality will be 1 to M. It means
one department has many doctors.

This can be mapped with 2 tables, by placing a foreign key on the "many" side; keeping the
relationship in a separate table gives 3.

one to many cardinality

Using sets, one-to-many cardinality can be represented as:

Set Representation of One-to-Many

3. Many-to-One: When entities in one entity set can take part only once in the relationship set and
entities in other entity sets can take part more than once in the relationship set, cardinality is many to
one. Let us assume that a student can take only one course but one course can be taken by many
students. So the cardinality will be n to 1. It means that for one course there can be n students but for
one student, there will be only one course.

This can be mapped with 2 tables, by placing a foreign key on the "many" side (here, Student);
keeping the relationship in a separate table gives 3.

many to one cardinality

Using Sets, it can be represented as:

Set Representation of Many-to-One

In this case, each student is taking only 1 course but 1 course has been taken by many students.

4. Many-to-Many: When entities in all entity sets can take part more than once in the relationship,
the cardinality is many-to-many. Let us assume that a student can take more than one course and one
course can be taken by many students. So the relationship will be many-to-many.

The total number of tables that can be used in this is 3.

many to many cardinality

Using Sets, it can be represented as:

Many-to-Many Set Representation

In this example, student S1 is enrolled in C1 and C3, and course C3 has S1, S3, and S4 enrolled in it.
So it is a many-to-many relationship.
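
The set representations above can be turned into a small check. The sketch below is only illustrative: the function name `classify_cardinality` is made up for this example, and it classifies one particular relationship instance (a set of pairs), not the constraint declared in the schema.

```python
from collections import Counter


def classify_cardinality(pairs):
    """Classify a binary relationship set given as (left, right) pairs."""
    pairs = set(pairs)
    left_counts = Counter(left for left, _ in pairs)     # participation of each left entity
    right_counts = Counter(right for _, right in pairs)  # participation of each right entity
    left_many = any(c > 1 for c in left_counts.values())
    right_many = any(c > 1 for c in right_counts.values())
    if left_many and right_many:
        return "many-to-many"
    if right_many:
        return "many-to-one"   # several left entities share one right entity
    if left_many:
        return "one-to-many"   # one left entity is related to several right entities
    return "one-to-one"


# Pairs taken from the many-to-many example: S1 in C1 and C3; C3 taken by S1, S3, S4.
enrolled = [("S1", "C1"), ("S1", "C3"), ("S3", "C3"), ("S4", "C3")]
print(classify_cardinality(enrolled))  # many-to-many
```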

How to Draw ER Diagram?

• The first step is to identify all the entities, place each in a rectangle, and label them
accordingly.

• The next step is to identify the relationships between them and place each in a diamond, making
sure that relationships are not connected directly to each other.

• Attach attributes to the entities properly.

• Remove redundant entities and relationships.

Types of Keys in Relational Model (Candidate, Super, Primary, Alternate and Foreign)

Keys are one of the basic requirements of a relational database model. They are widely used to
identify the tuples (rows) of a table uniquely. We also use keys to set up relationships among the
columns and tables of a relational database.

Different Types of Keys in the Relational Model

1. Candidate Key

2. Primary Key

3. Super Key

4. Alternate Key

5. Foreign Key

6. Composite Key

1. Candidate Key: The minimal set of attributes that can uniquely identify a tuple is known as a
candidate key. For example, STUD_NO in the STUDENT relation.

• It is a minimal super key, that is, a super key with no redundant attributes.

• It must contain unique values.

• Every table must have at least one candidate key.

• A table can have multiple candidate keys but only one primary key.

• A candidate key that is not chosen as the primary key may contain NULL values in some database
systems; the primary key may not.

Example:

STUD_NO is the candidate key for relation STUDENT.

Table STUDENT

STUD_NO SNAME ADDRESS PHONE

1 Shyam Delhi 123456789
2 Rakesh Kolkata 223365796
3 Suraj Delhi 175468965

• The candidate key can be simple (having only one attribute) or composite as well.

Example:

{STUD_NO, COURSE_NO} is a composite candidate key for relation STUDENT_COURSE.

Table STUDENT_COURSE

STUD_NO TEACHER_NO COURSE_NO

1 001 C001
2 056 C005

Note: In SQL Server, a unique constraint on a nullable column allows the value NULL in that column
only once. That is why the STUD_PHONE attribute can still be a candidate key here, but a NULL value
is not allowed in a primary key attribute.

2. Primary Key: There can be more than one candidate key in a relation, out of which one is chosen
as the primary key. For example, STUD_NO and STUD_PHONE are both candidate keys for relation
STUDENT, but STUD_NO can be chosen as the primary key (only one out of many candidate keys).

• It is a unique key.

• It can identify only one tuple (a record) at a time.

• It has no duplicate values; every value is unique.

• It cannot be NULL.

• A primary key does not have to be a single column; more than one column can together form the
primary key of a table.

Example:

STUDENT table -> Student(STUD_NO, SNAME, ADDRESS, PHONE), STUD_NO is a primary key

Table STUDENT

STUD_NO SNAME ADDRESS PHONE

1 Shyam Delhi 123456789
2 Rakesh Kolkata 223365796
3 Suraj Delhi 175468965

3. Super Key: A set of attributes that can uniquely identify a tuple is known as a super key. For
example, STUD_NO, (STUD_NO, STUD_NAME), etc. A super key is a set of one or more attributes that
together identify the rows in a table.

• Adding zero or more attributes to a candidate key generates a super key.

• Every candidate key is a super key, but the converse is not true.

• A super key may include attributes that allow NULL values.

Example:

Consider the table shown above.

STUD_NO + PHONE is a super key.

4. Alternate Key: A candidate key other than the primary key is called an alternate key.

• All candidate keys that are not chosen as the primary key are called alternate keys.

• It is also called a secondary key.

• Like any candidate key, it may consist of one or more attributes, and its values are unique.

• E.g., PHONE is an alternate key of STUDENT. SNAME and ADDRESS are not, because their values can
repeat (Delhi appears twice in ADDRESS).

Example:

Consider the table shown above.

STUD_NO and PHONE are both candidate keys for relation STUDENT. If STUD_NO is chosen as the
primary key, PHONE will be an alternate key.

5. Foreign Key: If an attribute can only take values that are present as values of some other
attribute, it is a foreign key to the attribute to which it refers. The relation being referenced is
called the referenced relation and the corresponding attribute is called the referenced attribute.
The relation that refers to the referenced relation is called the referencing relation and the
corresponding attribute is called the referencing attribute. The referenced attribute of the
referenced relation should be its primary key.

• It is a key that acts as a primary key in one table and as a secondary (non-primary) attribute in
another table.

• It links two or more relations (tables) at a time.

• It acts as a cross-reference between the tables.

• For example, DNO is a primary key in the DEPT table and a non-key attribute in EMP.

Example:

Refer Table STUDENT shown above.

STUD_NO in STUDENT_COURSE is a
foreign key to STUD_NO in STUDENT relation.

Table STUDENT_COURSE

STUD_NO TEACHER_NO COURSE_NO

1 005 C001
2 056 C005

It may be worth noting that, unlike the primary key of a relation, a foreign key can be NULL and may
contain duplicate values, i.e. it need not follow the uniqueness constraint. For example, STUD_NO in
the STUDENT_COURSE relation need not be unique: the same value repeats whenever one student takes
more than one course. However, STUD_NO in the STUDENT relation is a primary key, so it must always
be unique and cannot be NULL.
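
The difference between the two constraints can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not a production schema: the tables are cut down to the columns needed, the helper `try_insert` is made up for the example, and the extra rows (a second course for student 1, a NULL student number, student 99) are illustrative data that does not appear in the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE STUDENT (STUD_NO INTEGER PRIMARY KEY, SNAME TEXT)")
conn.execute("""
    CREATE TABLE STUDENT_COURSE (
        STUD_NO   INTEGER REFERENCES STUDENT(STUD_NO),
        COURSE_NO TEXT
    )
""")
conn.executemany("INSERT INTO STUDENT VALUES (?, ?)", [(1, "Shyam"), (2, "Rakesh")])

# Repeated and NULL foreign key values are accepted.
conn.executemany("INSERT INTO STUDENT_COURSE VALUES (?, ?)",
                 [(1, "C001"), (1, "C005"), (None, "C005")])


def try_insert(sql, params):
    """Run one INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Student 99 does not exist in STUDENT, so the foreign key is violated.
dangling = try_insert("INSERT INTO STUDENT_COURSE VALUES (?, ?)", (99, "C001"))
# STUD_NO 1 already exists, so the primary key is violated.
duplicate_pk = try_insert("INSERT INTO STUDENT VALUES (?, ?)", (1, "Suraj"))
print(dangling, duplicate_pk)  # rejected rejected
```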

Relation between Primary Key and Foreign Key

6. Composite Key: Sometimes a table might not have a single column/attribute that uniquely
identifies all the records of the table. To uniquely identify rows of a table, a combination of two or
more columns/attributes can be used. A poorly chosen combination can still give duplicate values in
rare cases, so we need to find a set of attributes that is guaranteed to identify rows uniquely.

• It acts as the primary key when no single column can serve as one.

• Two or more attributes are used together to make a composite key.

• Different combinations of attributes may differ in how reliably they identify the rows
uniquely.

Example:

FULLNAME + DOB can be combined together to access the details of a student.

Different Types of Keys

Functional Dependency and Attribute Closure
A functional dependency A->B holds in a relation if any two tuples having the same value of
attribute A also have the same value for attribute B. For example, in relation STUDENT shown in
table 1, the functional dependencies

STUD_NO->STUD_NAME, STUD_NO->STUD_PHONE hold

but

STUD_NAME->STUD_STATE does not hold.

How to find functional dependencies for a relation?

Functional dependencies in a relation depend on the domain of the relation, that is, on what the
data means. Consider the STUDENT relation given in Table 1.

• We know that STUD_NO is unique for each student. So STUD_NO->STUD_NAME,
STUD_NO->STUD_PHONE, STUD_NO->STUD_STATE, STUD_NO->STUD_COUNTRY and STUD_NO->STUD_AGE
will all be true.
• Similarly, STUD_STATE->STUD_COUNTRY will be true, because if two records have the same
STUD_STATE, they will have the same STUD_COUNTRY as well.
• For relation STUDENT_COURSE, COURSE_NO->COURSE_NAME will be true, as two records with the
same COURSE_NO will have the same COURSE_NAME.

Functional Dependency Set: The functional dependency set or FD set of a relation is the set of all
FDs present in the relation. For example, the FD set for relation STUDENT shown in table 1 is:

{ STUD_NO->STUD_NAME, STUD_NO->STUD_PHONE, STUD_NO->STUD_STATE,
STUD_NO->STUD_COUNTRY, STUD_NO->STUD_AGE, STUD_STATE->STUD_COUNTRY }

Attribute Closure: The attribute closure of an attribute set is the set of all attributes that can
be functionally determined from it.

How to find attribute closure of an attribute set?

To find the attribute closure of an attribute set:

• Add the elements of the attribute set to the result set.

• Repeatedly add to the result set every attribute that can be functionally determined from
attributes already in the result set, until nothing more can be added.

Using the FD set of table 1, attribute closure can be determined as:

(STUD_NO)+ = {STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE}
(STUD_STATE)+ = {STUD_STATE, STUD_COUNTRY}
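
The two steps of the closure procedure can be sketched in Python as follows. The representation of an FD as a pair of sets and the names `closure` and `STUDENT_FDS` are choices made for this example only.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)                      # step 1: start with the attribute set itself
    changed = True
    while changed:                           # step 2: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# FD set of the STUDENT relation as listed in the text.
STUDENT_FDS = [
    ({"STUD_NO"}, {"STUD_NAME"}),
    ({"STUD_NO"}, {"STUD_PHONE"}),
    ({"STUD_NO"}, {"STUD_STATE"}),
    ({"STUD_NO"}, {"STUD_COUNTRY"}),
    ({"STUD_NO"}, {"STUD_AGE"}),
    ({"STUD_STATE"}, {"STUD_COUNTRY"}),
]

print(sorted(closure({"STUD_STATE"}, STUDENT_FDS)))  # ['STUD_COUNTRY', 'STUD_STATE']
print(len(closure({"STUD_NO"}, STUDENT_FDS)))        # 6
```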

How to find Candidate Keys and Super Keys using Attribute Closure?

• If the attribute closure of an attribute set contains all attributes of the relation, the attribute
set is a super key of the relation.
• If, in addition, no proper subset of this attribute set can functionally determine all attributes of
the relation, the set is a candidate key as well. For example, using the FD set of table 1,

(STUD_NO, STUD_NAME)+ = {STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE}

(STUD_NO)+ = {STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE}

(STUD_NO, STUD_NAME) is a super key but not a candidate key, because the closure of its proper subset
(STUD_NO) already equals all attributes of the relation. So STUD_NO is a candidate key.
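
A possible way to automate both tests is sketched below. The helper names `is_super_key` and `is_candidate_key` are made up for this example. Minimality is checked by dropping one attribute at a time, which is enough because any superset of a super key is again a super key.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_super_key(attrs, all_attrs, fds):
    # A super key determines every attribute of the relation.
    return closure(attrs, fds) == set(all_attrs)


def is_candidate_key(attrs, all_attrs, fds):
    attrs = set(attrs)
    if not is_super_key(attrs, all_attrs, fds):
        return False
    # Minimal: removing any single attribute must break the super key property.
    return all(not is_super_key(attrs - {a}, all_attrs, fds) for a in attrs)


ATTRS = {"STUD_NO", "STUD_NAME", "STUD_PHONE", "STUD_STATE", "STUD_COUNTRY", "STUD_AGE"}
FDS = [
    ({"STUD_NO"}, {"STUD_NAME", "STUD_PHONE", "STUD_STATE", "STUD_COUNTRY", "STUD_AGE"}),
    ({"STUD_STATE"}, {"STUD_COUNTRY"}),
]

print(is_super_key({"STUD_NO", "STUD_NAME"}, ATTRS, FDS))      # True
print(is_candidate_key({"STUD_NO", "STUD_NAME"}, ATTRS, FDS))  # False
print(is_candidate_key({"STUD_NO"}, ATTRS, FDS))               # True
```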

GATE Question: Consider the relation scheme R = {E, F, G, H, I, J, K, L, M, N} and the set of
functional dependencies {{E, F} -> {G}, {F} -> {I, J}, {E, H} -> {K, L}, K -> {M}, L -> {N}} on
R. What is the key for R? (GATE-CS-2014)
A. {E, F}
B. {E, F, H}
C. {E, F, H, K, L}
D. {E}

Answer: Finding the attribute closure of all given options, we get:

{E,F}+ = {EFGIJ}
{E,F,H}+ = {EFHGIJKLMN}
{E,F,H,K,L}+ = {EFHGIJKLMN}
{E}+ = {E}
{EFH}+ and {EFHKL}+ both result in the set of all attributes, but EFH is minimal. So it is the
candidate key, and the correct option is (B).

How to check whether an FD can be derived from a given FD set?

To check whether an FD A->B can be derived from an FD set F,

1. Find (A)+ using FD set F.

2. If B is subset of (A)+, then A->B is true else not true.

GATE Question: In a schema with attributes A, B, C, D and E, the following set of functional
dependencies is given:
{A -> B, A -> C, CD -> E, B -> D, E -> A}
Which of the following functional dependencies is NOT implied by the above set?
A. CD -> AC
B. BD -> CD
C. BC -> CD
D. AC -> BC

Answer: Using the FD set given in the question,

(CD)+ = {CDEAB}, which means CD -> AC holds.
(BD)+ = {BD}, which means BD -> CD cannot hold. So this FD is not implied by the FD set, and (B)
is the required option.
The others can be checked in the same way.
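
The remaining options can be checked mechanically with the two-step test above. In this sketch each attribute is a single letter, so an attribute set is written as a string such as "CD"; the function name `implies` is made up for the example.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; attribute sets are strings of single-letter names."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    # lhs -> rhs follows from fds exactly when rhs is a subset of (lhs)+.
    return set(rhs) <= closure(lhs, fds)


FDS = [("A", "B"), ("A", "C"), ("CD", "E"), ("B", "D"), ("E", "A")]
options = {"A": ("CD", "AC"), "B": ("BD", "CD"), "C": ("BC", "CD"), "D": ("AC", "BC")}

for label, (lhs, rhs) in options.items():
    print(label, lhs, "->", rhs, implies(FDS, lhs, rhs))
# Only option B prints False.
```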

Prime and non-prime attributes

Attributes that are part of any candidate key of a relation are called prime attributes; the others
are non-prime attributes. For example, STUD_NO in the STUDENT relation is a prime attribute, and the
others are non-prime attributes.

GATE Question: Consider a relation scheme R = (A, B, C, D, E, H) on which the following
functional dependencies hold: {A–>B, BC–>D, E–>C, D–>A}. What are the candidate keys of
R? [GATE 2005]
(a) AE, BE
(b) AE, BE, DE
(c) AEH, BEH, BCH
(d) AEH, BEH, DEH

Answer: (AE)+ = {ABECD}, which is not the set of all attributes (H is missing). So AE is not a
candidate key. Hence options A and B are wrong.
(AEH)+ = {ABCDEH}
(BEH)+ = {BEHCDA}
(DEH)+ = {DEHABC}
(BCH)+ = {BCHDA}, which is not the set of all attributes (E is missing). So BCH is not a candidate
key. Hence option C is wrong.
So the correct answer is D.
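
For a small relation like this one, the candidate keys can also be found by brute force: try attribute sets in order of size and keep every set whose closure is the whole relation and which does not contain a smaller key. The sketch below does that; it is exponential in the number of attributes, so it is meant only as a check for textbook-sized examples, and the name `candidate_keys` is made up here.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds; attribute sets are strings of single-letter names."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    all_attrs = set(all_attrs)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue                      # contains a smaller key, so not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return ["".join(sorted(key)) for key in keys]


FDS = [("A", "B"), ("BC", "D"), ("E", "C"), ("D", "A")]
print(candidate_keys("ABCDEH", FDS))  # ['AEH', 'BEH', 'DEH']
```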

Advantages of functional dependencies:

• They help in reducing data redundancy in a database by identifying and eliminating unnecessary
or duplicate data.

• They improve data integrity by ensuring that data is consistent and accurate across the database.

• They facilitate database maintenance by making it easier to modify, update, and delete data.

Disadvantages of functional dependencies:

• The process of identifying functional dependencies can be time-consuming and complex, especially
in large databases with many tables and relationships.

• Overly restrictive functional dependencies can result in slow query performance or data
inconsistencies, as data that should be related may not be properly linked.

• Functional dependencies do not take into account the semantic meaning of data, and may not
always reflect the true relationships between data elements.

Advantages of attribute closure:

• Attribute closures help to identify all possible attributes that can be derived from a set of given
attributes.

• They facilitate database design by identifying relationships between attributes and tables, which
can help to optimize query performance.

• They ensure data consistency by identifying all possible combinations of attributes that can exist in
the database.

Disadvantages of attribute closure:

• The process of calculating attribute closures can be computationally expensive, especially for large
datasets.

• Attribute closures can become too complex to manage, especially as the number of attributes and
tables in a database grows.

• Attribute closures do not take into account the semantic meaning of data, and may not always
accurately reflect the relationships between data elements.

Introduction of Database Normalization
Database normalization is the process of organizing the attributes of the database to reduce or
eliminate data redundancy (having the same data but at different places).

Problems because of data redundancy: Data redundancy unnecessarily increases the size of the
database as the same data is repeated in many places. Inconsistency problems also arise during
insert, delete and update operations.

Functional Dependency: A functional dependency is a constraint between two sets of attributes in
a relation of a database. A functional dependency is denoted by an arrow (→). If an attribute A
functionally determines B, then it is written as A → B.

For example, employee_id → name means employee_id functionally determines the name of the
employee. As another example in a timetable database, {student_id, time} → {lecture_room},
student ID and time determine the lecture room where the student should be.

Advantages of Functional Dependency

• It helps maintain the quality of the data in the database.

• It communicates facts about the database design.
• It helps to state the constraints and implications of the database precisely.
• It helps to recognize poor designs.
• Finding the candidate keys of a relation is the first step in the normalization procedure.
Identifying candidate keys and normalizing the database is impossible without functional
dependencies.

What does functionally dependent mean?

A functional dependency A → B means that every occurrence of a particular value of A is paired with
the same value of B. For example, in the table below A → B is true, but B → A is not true, as there
are different values of A for B = 3.

A B
------
1 3
2 3
4 0
1 3
4 0
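
Whether an FD holds in a given table instance can be checked by remembering the right-hand value seen for each left-hand value. The sketch below uses the five rows of the table above; the function name `fd_holds` is made up for this example. Note that such a check can only refute an FD on sample data; whether the FD holds in general depends on the meaning of the attributes.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"A": 1, "B": 3},
    {"A": 2, "B": 3},
    {"A": 4, "B": 0},
    {"A": 1, "B": 3},
    {"A": 4, "B": 0},
]

print(fd_holds(rows, ["A"], ["B"]))  # True
print(fd_holds(rows, ["B"], ["A"]))  # False: B = 3 occurs with A = 1 and A = 2
```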

Trivial Functional Dependency

X → Y is trivial only when Y is a subset of X.

Examples

ABC → AB
ABC → A
ABC → ABC

Non Trivial Functional Dependencies

X → Y is a non-trivial functional dependency when Y is not a subset of X.

X → Y is called completely non-trivial when the intersection of X and Y is empty.

Example:

Id → Name,
Name → DOB

Semi Non Trivial Functional Dependencies

X → Y is called semi non-trivial when Y is not a subset of X but the intersection of X and Y is
not empty.

Examples:

AB → BC,
AD → DC
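
The three cases can be told apart with plain set operations. A small sketch follows (the function name `classify_fd` is made up for this example; single-letter attribute sets are written as strings, multi-letter attribute names as lists):

```python
def classify_fd(lhs, rhs):
    """Classify X -> Y as trivial, semi non-trivial or completely non-trivial."""
    x, y = set(lhs), set(rhs)
    if y <= x:
        return "trivial"                  # Y is a subset of X
    if x & y:
        return "semi non-trivial"         # X and Y overlap, but Y is not inside X
    return "completely non-trivial"       # X and Y share no attribute


print(classify_fd("ABC", "AB"))        # trivial
print(classify_fd("AB", "BC"))         # semi non-trivial
print(classify_fd(["Id"], ["Name"]))   # completely non-trivial
```
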
The features of database normalization are as follows:

Elimination of Data Redundancy: One of the main features of normalization is to eliminate the
data redundancy that can occur in a database. Data redundancy refers to the repetition of data in
different parts of the database. Normalization helps in reducing or eliminating this redundancy,
which can improve the efficiency and consistency of the database.

Ensuring Data Consistency: Normalization helps in ensuring that the data in the database is
consistent and accurate. By eliminating redundancy, normalization helps in preventing
inconsistencies and contradictions that can arise due to different versions of the same data.

Simplification of Data Management: Normalization simplifies the process of managing data in a
database. By breaking down a complex data structure into simpler tables, normalization makes it
easier to manage the data, update it, and retrieve it.

Improved Database Design: Normalization helps in improving the overall design of the database.
By organizing the data in a structured and systematic way, normalization makes it easier to design
and maintain the database. It also makes the database more flexible and adaptable to changing
business needs.

Avoiding Update Anomalies: Normalization helps in avoiding update anomalies, which occur when the
same fact is stored in several rows, so that changing it in one place and not in the others leaves
the data inconsistent. Normalization ensures that each table describes only one kind of fact and
that the relationships between the tables are clearly defined, which helps in avoiding such anomalies.

Standardization: Normalization helps in standardizing the data in the database. By organizing the
data into tables and defining relationships between them, normalization helps in ensuring that the
data is stored in a consistent and uniform manner.

Normalization is an important process in database design that helps in improving the efficiency,
consistency, and accuracy of the database. It makes it easier to manage and maintain the data and
ensures that the database is adaptable to changing business needs.

Normal Forms in DBMS – Database Normalization

Normalization is the process of minimizing redundancy in a relation or set of relations.
Redundancy in a relation may cause insertion, deletion, and update anomalies. Normal forms are used
to eliminate or reduce redundancy in database tables.

What is Database Normalization?

In database management systems (DBMS), normal forms are a series of guidelines that help to ensure
that the design of a database is efficient, organized, and free from data anomalies. There are several
levels of normalization, each with its own set of guidelines, known as normal forms.

Important Points Regarding Normal Forms in DBMS

• First Normal Form (1NF): This is the most basic level of normalization. In 1NF, each table cell
should contain only a single value, and each column should have a unique name. The first normal
form helps to eliminate repeating groups and simplify queries.

• Second Normal Form (2NF): 2NF eliminates redundant data by requiring that each non-key
attribute be dependent on the whole of every candidate key, and not on just a part of one.

• Third Normal Form (3NF): 3NF builds on 2NF by requiring that no non-key attribute depends
transitively on a key. This means that each non-key column should depend directly on a key, and
not on other non-key columns in the same table.

• Boyce-Codd Normal Form (BCNF): BCNF is a stricter form of 3NF that ensures that each
determinant in a table is a candidate key. In other words, every non-trivial dependency in the
table has a key on its left-hand side.

• Fourth Normal Form (4NF): 4NF is a further refinement of BCNF that ensures that a table does not
contain any non-trivial multi-valued dependencies other than those determined by a super key.

• Fifth Normal Form (5NF): 5NF deals with join dependencies and involves decomposing a table into
smaller tables, as far as this is possible without losing information, to remove data redundancy
and improve data integrity.

Normal forms help to reduce data redundancy, increase data consistency, and improve database
performance. However, higher levels of normalization can lead to more complex database designs
and queries. It is important to strike a balance between normalization and practicality when designing
a database.

Advantages of Normal Form

• Reduced data redundancy: Normalization helps to eliminate duplicate data in tables, reducing the
amount of storage space needed and improving database efficiency.

• Improved data consistency: Normalization ensures that data is stored in a consistent and
organized manner, reducing the risk of data inconsistencies and errors.

• Simplified database design: Normalization provides guidelines for organizing tables and data
relationships, making it easier to design and maintain a database.

• Improved query performance: Normalized tables are typically easier to search and retrieve data
from, resulting in faster query performance.

• Easier database maintenance: Normalization reduces the complexity of a database by breaking it
down into smaller, more manageable tables, making it easier to add, modify, and delete data.

Overall, using normal forms in DBMS helps to improve data quality, increase database efficiency,
and simplify database design and maintenance.

First Normal Form

A relation is in first normal form if every attribute in that relation is a single-valued (atomic)
attribute. If a relation contains a composite or multi-valued attribute, it violates first normal
form.

• Example 1 – Relation STUDENT in table 1 is not in 1NF because of the multi-valued attribute
STUD_PHONE. Its decomposition into 1NF has been shown in table 2.

 Example 2 –

ID Name Courses
------------------
1 A c1, c2
2 E c3
3 M c2, c3

• In the above table, Courses is a multi-valued attribute, so the table is not in 1NF. The table
below is in 1NF as there is no multi-valued attribute.

ID Name Course
------------------
1 A c1
1 A c2
2 E c3
3 M c2
3 M c3
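
The conversion shown in Example 2 amounts to emitting one row per course. A minimal sketch, assuming the multi-valued Courses cell is stored as a comma-separated string as in the first table (the function name `to_1nf` is made up for this example):

```python
def to_1nf(rows):
    """Split the multi-valued Courses column into one row per (ID, Name, Course)."""
    return [
        (student_id, name, course.strip())
        for student_id, name, courses in rows
        for course in courses.split(",")
    ]


unnormalized = [
    (1, "A", "c1, c2"),
    (2, "E", "c3"),
    (3, "M", "c2, c3"),
]

for row in to_1nf(unnormalized):
    print(row)
# (1, 'A', 'c1')
# (1, 'A', 'c2')
# (2, 'E', 'c3')
# (3, 'M', 'c2')
# (3, 'M', 'c3')
```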

Second Normal Form

To be in second normal form, a relation must be in first normal form and must not contain any
partial dependency. A relation is in 2NF if it has no partial dependency, i.e., no non-prime
attribute (an attribute that is not part of any candidate key) is dependent on a proper subset of
any candidate key of the table.

Partial Dependency – If a proper subset of a candidate key determines a non-prime attribute, it is
called a partial dependency.

• Example 1 – Consider table-3 below.

STUD_NO COURSE_NO COURSE_FEE

1 C1 1000
2 C2 1500
1 C4 2000
4 C3 1000
4 C1 1000
2 C5 2000

 Note that several courses have the same course fee. Here:

COURSE_FEE alone cannot determine COURSE_NO or STUD_NO;
COURSE_FEE together with STUD_NO cannot determine COURSE_NO;
COURSE_FEE together with COURSE_NO cannot determine STUD_NO.

Hence COURSE_FEE is a non-prime attribute, as it does not belong to the only candidate key {STUD_NO, COURSE_NO}. However, COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE depends on COURSE_NO, which is a proper subset of the candidate key. A non-prime attribute that depends on a proper subset of the candidate key is a partial dependency, so this relation is not in 2NF.

To convert the relation to 2NF, we split it into two tables:
Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE

Table 1

STUD_NO COURSE_NO
1 C1
2 C2
1 C4
4 C3
4 C1
2 C5

Table 2

COURSE_NO COURSE_FEE
C1 1000
C2 1500
C3 1000
C4 2000
C5 2000
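The split can be checked with a short Python sketch. It projects the rows of Table 3 onto the two smaller tables and joins them back on COURSE_NO. The variable names are illustrative assumptions, and plain tuples and a dict stand in for real database tables.

```python
# Rows of Table 3: (STUD_NO, COURSE_NO, COURSE_FEE).
enrollment = [
    (1, "C1", 1000),
    (2, "C2", 1500),
    (1, "C4", 2000),
    (4, "C3", 1000),
    (4, "C1", 1000),
    (2, "C5", 2000),
]

# Table 1: projection on (STUD_NO, COURSE_NO).
student_course = sorted({(stud, course) for stud, course, _ in enrollment})

# Table 2: projection on (COURSE_NO, COURSE_FEE); each course is stored once.
course_fee = {}
for _, course, fee in enrollment:
    # COURSE_NO -> COURSE_FEE must hold, so a second, different fee is an error.
    if course_fee.setdefault(course, fee) != fee:
        raise ValueError(f"{course} has two different fees")

# Joining the two tables on COURSE_NO rebuilds the original rows.
rejoined = sorted((stud, course, course_fee[course]) for stud, course in student_course)
print(rejoined == sorted(enrollment))
```

Because the join returns exactly the original rows, no information is lost by the decomposition, while each fee is now recorded only once per course.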

 NOTE: 2NF reduces the amount of redundant data that is stored. For instance, if 100 students take course C1, we do not need to store its fee of 1000 in all 100 records; instead, we store it once in the second table as the fee for C1.

 Example 2 – Consider the following functional dependencies in relation R(A, B, C, D):

AB -> C [A and B together determine C]

BC -> D [B and C together determine D]

In the above relation, AB is the only candidate key and there is no partial dependency, i.e., no proper subset of AB determines any non-prime attribute. The relation is therefore in 2NF.
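The claim that no proper subset of AB determines a non-prime attribute can be verified mechanically with attribute closures. The sketch below is one possible way to do it; the helper names `closure` and `partial_dependencies`, and the encoding of each attribute as a single letter, are assumptions made for this example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, non_prime, fds):
    """List (subset, attribute) pairs where a proper subset of key
    determines a non-prime attribute."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) & set(non_prime)):
                found.append(("".join(subset), attr))
    return found


fds = [("AB", "C"), ("BC", "D")]
print(sorted(closure("AB", fds)))              # ['A', 'B', 'C', 'D']
print(partial_dependencies("AB", "CD", fds))   # []
```

The empty list confirms that neither A nor B alone determines C or D, so the relation has no partial dependency.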

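A relation is in third normal form if, for every non-trivial functional dependency X -> Y, at least one of the following holds:

X is a super key.
Y is a prime attribute (each element of Y is part of some candidate key).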

Example 1: In relation STUDENT given in Table 4, the FD set is {STUD_NO -> STUD_NAME, STUD_NO -> STUD_STATE, STUD_STATE -> STUD_COUNTRY, STUD_NO -> STUD_AGE}.

Candidate Key: {STUD_NO}

For this relation, STUD_NO -> STUD_STATE and STUD_STATE -> STUD_COUNTRY both hold, so STUD_COUNTRY is transitively dependent on STUD_NO. This violates third normal form.

To convert it to third normal form, we decompose the relation STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE) as:
STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_AGE)
STATE_COUNTRY (STATE, COUNTRY)

Consider relation R(A, B, C, D, E) with the functional dependencies A -> BC, CD -> E, B -> D, E -> A. The candidate keys of this relation are {A, E, CD, BC}. Every attribute on the right-hand side of every functional dependency is therefore prime, so the relation is in 3NF.
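The candidate keys listed above can be found by brute force: test every attribute subset, smallest first, and keep those whose closure covers the whole relation. The following is a minimal sketch under the assumption that attributes are single letters; `closure` and `candidate_keys` are illustrative helper names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return every minimal attribute set whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            # A superset of a key already found is a super key but not minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
keys = candidate_keys("ABCDE", fds)
print(["".join(sorted(k)) for k in keys])   # ['A', 'E', 'BC', 'CD']
print(sorted(set().union(*keys)))           # prime attributes: ['A', 'B', 'C', 'D', 'E']
```

The search is exponential in the number of attributes, which is acceptable for textbook-sized relations like this one.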

Example 2: Find the highest normal form of a relation R(A, B, C, D, E) with FD set {BC -> D, AC -> BE, B -> E}.

Step 1: (AC)+ = {A, C, B, E, D}, but no proper subset of AC determines all attributes of the relation, so AC is a candidate key. Neither A nor C can be derived from any other attribute of the relation, so {AC} is the only candidate key.

Step 2: The prime attributes are those that are part of a candidate key, {A, C} in this example; the remaining attributes, {B, D, E}, are non-prime.

Step 3: The relation R is in 1st normal form, as a relational DBMS does not allow multi-valued or composite attributes. It is in 2nd normal form because none of the FDs is a partial dependency: in BC -> D, BC is not a proper subset of the candidate key AC; in AC -> BE, AC is the candidate key itself; and in B -> E, B is not a proper subset of AC.

The relation is not in 3rd normal form. In BC -> D, BC is not a super key and D is not a prime attribute; in B -> E, B is not a super key and E is not a prime attribute. To satisfy 3rd normal form, either the LHS of each FD must be a super key or the RHS must be a prime attribute. So the highest normal form of the relation is 2nd normal form.
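The 3NF test used in this example can be written out as a short Python sketch. It checks each listed dependency against the two conditions (LHS is a super key, or RHS attribute is prime). The function names and the single-letter attribute encoding are assumptions for the example, and the prime attributes are passed in by hand from Step 2.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def third_nf_violations(attributes, fds, prime):
    """Return (X, A) pairs from fds where X is not a super key and A is not prime."""
    violations = []
    for lhs, rhs in fds:
        is_super_key = closure(lhs, fds) == set(attributes)
        for attr in rhs:
            trivial = attr in lhs
            if not trivial and not is_super_key and attr not in prime:
                violations.append((lhs, attr))
    return violations


fds = [("BC", "D"), ("AC", "BE"), ("B", "E")]
print(sorted(closure("AC", fds)))                      # ['A', 'B', 'C', 'D', 'E']
print(third_nf_violations("ABCDE", fds, prime="AC"))   # [('BC', 'D'), ('B', 'E')]
```

The two reported pairs are exactly the dependencies named in the text as breaking 3rd normal form.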

For example, consider relation R(A, B, C) with the functional dependencies A -> BC and B -> A. Both A and B are super keys, so the relation is in BCNF.

Third Normal Form

A relation is in third normal form (3NF) if it is in second normal form and no non-prime attribute is transitively dependent on a candidate key.

Equivalently, for every non-trivial functional dependency X -> Y, at least one of the following conditions must hold:

 X is a Super Key.

 Y is a Prime Attribute (each element of Y is part of some Candidate Key).


BCNF (Boyce-Codd Normal Form)

BCNF (Boyce-Codd Normal Form) is a stricter version of Third Normal Form: it drops the exception that 3NF makes for prime attributes. Every relation in BCNF is also in Third Normal Form.

The basic rules for BCNF are:

1. The table must be in Third Normal Form.

2. For every non-trivial functional dependency X -> Y, X must be a super key of the relation.
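The super key rule can be checked with attribute closures. The sketch below tests it on two relations used earlier in these notes: R(A, B, C) with A -> BC and B -> A, and R(A, B, C, D) with AB -> C and BC -> D. The helper names and single-letter attributes are assumptions for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial dependencies whose left side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attributes)
    ]


# Both determinants are super keys, so nothing is reported.
print(bcnf_violations("ABC", [("A", "BC"), ("B", "A")]))      # []
# BC determines only B, C and D, so BC -> D breaks the rule.
print(bcnf_violations("ABCD", [("AB", "C"), ("BC", "D")]))    # [('BC', 'D')]
```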


Fourth Normal Form

A relation is in Fourth Normal Form (4NF) if it contains no non-trivial multivalued dependency other than those whose determinant is a super key. The basic condition for Fourth Normal Form is that the relation must be in BCNF.

The basic rules are mentioned below.

1. It must be in BCNF.
2. It must not have any non-trivial multivalued dependency X ->> Y in which X is not a super key.
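A multivalued dependency X ->> Y holds in a table when, for any two rows that agree on X, the row combining the Y-values of the first with the remaining values of the second is also present. The sketch below tests that condition on a small sample. The table (STUD_NO, COURSE, HOBBY) and its rows are hypothetical data invented for this illustration, and a check on sample rows only shows what the data does, not what the schema guarantees.

```python
from itertools import product


def mvd_holds(rows, x, y):
    """Check whether the multivalued dependency x ->> y holds in rows (list of dicts)."""
    if not rows:
        return True
    rest = [attr for attr in rows[0] if attr not in x and attr not in y]

    def pick(row, attrs):
        return tuple(row[a] for a in attrs)

    present = {(pick(r, x), pick(r, y), pick(r, rest)) for r in rows}
    for t1, t2 in product(rows, repeat=2):
        if pick(t1, x) == pick(t2, x):
            # The row mixing t1's y-values with t2's remaining values must exist.
            if (pick(t1, x), pick(t1, y), pick(t2, rest)) not in present:
                return False
    return True


# Hypothetical rows: a student's courses and hobbies are independent of each other.
rows = [
    {"STUD_NO": 1, "COURSE": "C1", "HOBBY": "Chess"},
    {"STUD_NO": 1, "COURSE": "C1", "HOBBY": "Music"},
    {"STUD_NO": 1, "COURSE": "C2", "HOBBY": "Chess"},
    {"STUD_NO": 1, "COURSE": "C2", "HOBBY": "Music"},
]
print(mvd_holds(rows, ["STUD_NO"], ["COURSE"]))       # True
print(mvd_holds(rows[:3], ["STUD_NO"], ["COURSE"]))   # False
```

In the full sample, STUD_NO ->> COURSE holds although STUD_NO is not a super key (it repeats across rows), which is the situation 4NF forbids. Splitting the table into (STUD_NO, COURSE) and (STUD_NO, HOBBY) would remove it.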


Fifth Normal Form

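Fifth Normal Form is also called Project-Join Normal Form (PJNF). The basic conditions of Fifth Normal Form are mentioned below.

The relation must be in Fourth Normal Form.

The relation cannot be decomposed any further without loss, i.e., it cannot be split into smaller relations whose join reproduces the original relation exactly.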

Applications of Normal Forms in DBMS
 Data consistency: Normal forms help keep data consistent by removing redundant copies of the same fact. This prevents the update, insertion, and deletion anomalies that lead to inconsistencies and errors in the database.

 Data redundancy: Normal forms minimize data redundancy by organizing data into tables in which each fact is stored only once. This reduces the amount of storage space required for the database and makes it easier to manage.

 Query performance: Normalized tables are smaller and carry less duplicate data, which speeds up updates and some lookups. Retrieving data that is spread across several tables usually requires more joins, however, so read-heavy workloads are sometimes denormalized selectively.

 Database maintenance: Normal forms make it easier to maintain the database by reducing the
amount of redundant data that needs to be updated, deleted, or modified. This helps to improve
database management and reduce the risk of errors or inconsistencies.

 Database design: Normal forms provide guidelines for designing databases that are efficient,
flexible, and scalable. This helps to ensure that the database can be easily modified, updated, or
expanded as needed.

Some Important Points about Normal Forms

 BCNF is free from redundancy caused by functional dependencies.

 If a relation is in BCNF, then 3NF is also satisfied.

 If all attributes of a relation are prime attributes, then the relation is always in 3NF.

 A relation in a Relational Database is always at least in 1NF.

 Every Binary Relation (a Relation with only 2 attributes) is always in BCNF.

 If a Relation has only singleton candidate keys (i.e., every candidate key consists of only 1 attribute), then the Relation is always in 2NF (because no partial functional dependency is possible).

 A decomposition into BCNF may not preserve every functional dependency. In that case, go to BCNF only if the lost FD(s) are not required; otherwise normalize only up to 3NF.

 There are further Normal Forms beyond BCNF, such as 4NF and 5NF, but in real-world database systems it is generally not necessary to go beyond BCNF.
