Unit 4

The document discusses logical database design, focusing on the normalization process and the conversion of ER diagrams to relational tables. It outlines rules for mapping entities and relationships to tables, as well as the importance of normalization in reducing data anomalies such as insertion, deletion, and modification anomalies. Additionally, it explains functional dependencies and the steps involved in achieving different normal forms in database design.

Uploaded by

girum shewatatek

Fundamentals of Database Systems Lecture Note UOG

UNIT FOUR
Logical Database Design

Logical design is the process of constructing a model of the information used in an
enterprise based on a specific data model (e.g. relational, hierarchical, network or object), but
independent of a particular DBMS and other physical considerations.
Normalization process
• A collection of rules to be maintained
• New entities are discovered in the process
• Attributes are revised based on the rules and the discovered entities
The first step, before applying the rules of the relational data model, is to convert the conceptual
design into a form suitable for the relational logical model, that is, into tables.
Converting ER Diagram to Relational Tables
Three basic rules are used to convert an ER diagram into tables (relations):
Rule 1: Entity names automatically become table names.
Rule 2: Mapping of attributes: attributes become columns of the respective tables.
• Atomic, single-valued, derived or stored attributes become columns.
• Composite attributes: the parent attribute is ignored and the decomposed
attributes (child attributes) become columns of the table.
• Multi-valued attributes: each is mapped to a new table, into which the primary key of the
main table is posted for cross-referencing.
Rule 3: Relationships: a relationship is mapped by using a foreign key attribute.
1. For a relationship with One-to-One Cardinality: post the primary or candidate key of
one of the tables into the other as a foreign key. Where one entity has only partial
participation in the relationship, it is recommended to post the candidate key of the
partial participant into the total participant, which avoids wasting storage on null
values in the foreign key attribute. E.g. the relationship "Employee manages a
Department".
2. For a relationship with One-to-Many Cardinality:

• Post the primary key or candidate key from the “one” side as a foreign key attribute on
the “many” side. E.g. the relationship called “Belongs To” between Employee
(Many) and Department (One).
3. For a relationship with Many-to-Many Cardinality:
• Create a new table (which is the associative entity) and post the primary key or candidate
key of each entity as attributes in the new table, along with any additional
attributes (if applicable).
4. For a relationship having the Associative Entity property: where the relationship has
its own attributes (associative entity), create a new table for the associative entity and
post the primary key or candidate key of the participating entities as foreign key
attributes in the new table.

Example to illustrate the major rules in mapping ER to relational schema:

The following ER diagram has been designed to represent the requirements of an
organization that wants to capture Employee, Department and Project information. An
employee works for a department, and an employee might be assigned to
manage a department. Employees might participate in different projects
within the organization. An employee might also be assigned to lead a
project, in which case the starting and ending dates of his/her project leadership
and the bonus will be registered.

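[Figure: ER diagram. Entities: Employee (EID, Name {FName, LName}, Salary, Tel),
Department (DID, DName, DLoc) and Project (PID, PName, PFund). Relationships:
Manages (Employee 1 : 1 Department), WorksFor (Employee M : 1 Department),
Participate (Employee M : M Project) and Leads (Employee, Project; with attributes
StartDate, EndDate and PBonus).]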

After drawing the ER diagram, the next step is to map it into a relational
schema so that the rules of the relational data model can be tested for each
relation. The entities are mapped first, followed by the relationships,
according to the mapping rules. The mapping has been done as follows.

• Mapping the EMPLOYEE Entity:
There will be an Employee table with EID, Salary, FName and LName
as its columns. The composite attribute Name is ignored, since its
decomposed attributes (FName and LName) are columns of the
Employee table. The Tel attribute is mapped to a new table (Telephone),
as it is multi-valued.
Employee
EID  FName  LName  Salary

Telephone
EID  Tel

• Mapping the DEPARTMENT Entity:
There will be a Department table with DID, DName and DLoc as its
columns.
Department
DID  DName  DLoc

• Mapping the PROJECT Entity:
There will be a Project table with PID, PName and PFund as its
columns.
Project
PID  PName  PFund

• Mapping the MANAGES Relationship:
As the relationship has one-to-one cardinality, the PK or CK of one
of the tables can be posted into the other. Following the
recommendation, the PK or CK of the partial participant (Employee)
should be posted to the total participant (Department). This
requires adding the PK of Employee (EID) to the Department table as a
foreign key. We can give the foreign key another name, MEID,
to mean "manager's employee id". This increases the degree (number of
columns) of the Department table by one.
Department
DID  DName  DLoc  MEID

• Mapping the WORKSFOR Relationship:
As the relationship has one-to-many cardinality, the PK or CK of
the "one" side (PK or CK of the Department table) should be posted to the
"many" side (Employee table). This requires adding the PK of
Department (DID) to the Employee table as a foreign key. We can give
the foreign key another name, EDID, to mean "employee's
department id". This increases the degree of the Employee table by one.
Employee
EID  FName  LName  Salary  EDID

• Mapping the PARTICIPATES Relationship:
As the relationship has many-to-many cardinality, we need to
create a new table and post the PK or CK of the Employee and Project
tables into the new table. We can give the new table a descriptive
name such as Emp_Partc_Project, to mean "Employee participates in a
project".
Emp_Partc_Project
EID  PID

• Mapping the LEADS Relationship:
As the relationship is an associative entity, we create a
table for the associative entity, in which the PKs of the Employee and Project
tables are posted as foreign keys. The new table
also has the attributes of the associative entity as columns. We can
give the new table a descriptive name such as Emp_Lead_Project,
to mean "Employee leads a project".
Emp_Lead_Project
EID  PID  PBonus  StartDate  EndDate

At the end of the mapping we will have the following relational schema
(tables) for the logical database design phase.

Department
DID  DName  DLoc  MEID

Project
PID  PName  PFund

Telephone
EID  Tel

Employee
EID  FName  LName  Salary  EDID

Emp_Partc_Project
EID  PID

Emp_Lead_Project
EID  PID  PBonus  StartDate  EndDate
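
As a rough illustration, the six tables above could be declared as follows using Python's built-in sqlite3 module. This is only a sketch: the column types, the composite primary keys chosen for Telephone, Emp_Partc_Project and Emp_Lead_Project, and the use of SQLite itself are assumptions that the lecture note does not specify.

```python
import sqlite3

# In-memory database, used only to show the shape of the schema.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Column types and composite primary keys below are assumptions.
conn.executescript("""
CREATE TABLE Department (
    DID   INTEGER PRIMARY KEY,
    DName TEXT,
    DLoc  TEXT,
    MEID  INTEGER REFERENCES Employee(EID)    -- MANAGES (1:1): manager's EID
);
CREATE TABLE Employee (
    EID    INTEGER PRIMARY KEY,
    FName  TEXT,
    LName  TEXT,
    Salary REAL,
    EDID   INTEGER REFERENCES Department(DID) -- WORKSFOR (M:1)
);
CREATE TABLE Telephone (                      -- multi-valued attribute Tel
    EID INTEGER REFERENCES Employee(EID),
    Tel TEXT,
    PRIMARY KEY (EID, Tel)
);
CREATE TABLE Project (
    PID   INTEGER PRIMARY KEY,
    PName TEXT,
    PFund REAL
);
CREATE TABLE Emp_Partc_Project (              -- PARTICIPATES (M:M)
    EID INTEGER REFERENCES Employee(EID),
    PID INTEGER REFERENCES Project(PID),
    PRIMARY KEY (EID, PID)
);
CREATE TABLE Emp_Lead_Project (               -- LEADS (associative entity)
    EID       INTEGER REFERENCES Employee(EID),
    PID       INTEGER REFERENCES Project(PID),
    PBonus    REAL,
    StartDate TEXT,
    EndDate   TEXT,
    PRIMARY KEY (EID, PID)
);
""")

# List the tables that were created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(sorted(tables))
```

Each foreign key column corresponds to one of the "posted" keys described in the mapping rules.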

After converting the ER diagram into tables, the next phase is
applying the process of normalization, which is a collection of rules
each table should satisfy.

Normalization
A relational database is merely a collection of data, organized in a particular manner. As the
father of the relational database approach, Codd created a series of rules called normal forms
that help define that organization.

Database normalization is a series of steps followed to obtain a database design that allows for
consistent storage and efficient access of data in a relational database. These steps reduce data
redundancy and the risk of data becoming inconsistent.

Normalization is the process of identifying the logical associations between data items and
designing a database that represents such associations without suffering from the following
anomalies:
1. Insertion Anomalies
2. Deletion Anomalies
3. Modification Anomalies
Normalization may reduce system performance since data must be cross-referenced from many
tables. Thus, denormalization is sometimes used to improve performance, at the cost of reduced
consistency guarantees.

The normalization rules progressively remove the anomalies that could otherwise occur during
data manipulation after implementation.

The problems that can occur in an insufficiently normalized table are called anomalies,
and include:
(1) Insertion anomalies
An "insertion anomaly" is a failure to place information about a new database entry into all the
places in the database where information about that new entry needs to be stored. In a properly
normalized database, information about a new entry needs to be inserted into only one place in
the database; in an inadequately normalized database, information about a new entry may need to
be inserted into more than one place.
(2) Deletion anomalies
A "deletion anomaly" is a failure to remove information about an existing database entry when it
is time to remove that entry. In a properly normalized database, information about an old, to-be-
gotten-rid-of entry needs to be deleted from only one place in the database; in an inadequately
normalized database, information about that old entry may need to be deleted from more than
one place.
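(3) Modification/Updating anomalies
A modification of a database involves changing the value of some attribute of a table. A
"modification anomaly" is a failure to change every stored copy of a value when that value is
updated. In a properly normalized database, a given piece of information needs to be modified in
only one place; in an inadequately normalized database, it may need to be modified in more than
one place.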

N.B: The purpose of normalization is to reduce the chances for anomalies to occur in a
database.

Example of problems related to anomalies

EmpID  FName   LName    SkillID  Skill   SkillType    School  SchoolAdd    SkillLevel
12     Abebe   Mekuria  2        SQL     Database     AAU     Sidist_Kilo  5
16     Lemma   Alemu    5        C++     Programming  Unity   Gerji        6
28     Chane   Kebede   2        SQL     Database     AAU     Sidist_Kilo  10
25     Abera   Taye     6        VB6     Programming  Helico  Piazza       8
65     Almaz   Belay    2        SQL     Database     Helico  Piazza       9
24     Dereje  Tamiru   8        Oracle  Database     Unity   Gerji        5
51     Selam   Belay    4        Prolog  Programming  Jimma   Jimma City   8
94     Alem    Kebede   3        Cisco   Networking   AAU     Sidist_Kilo  7
18     Girma   Dereje   1        IP      Programming  Jimma   Jimma City   4
13     Yared   Gizaw    7        Java    Programming  AAU     Sidist_Kilo  6

Insertion Anomalies:
What if we have a new employee with a skill called Pascal? We can’t decide whether Pascal is
allowed as a value for Skill, and we have no clue about the skill type under which Pascal should be
categorized.
Deletion Anomalies:
If the employee with EmpID 16 is deleted, then every piece of information about the skill C++ and
its skill type is deleted from the database as well. We will then have no information about C++
and its skill type.
Modification Anomalies:
What if the address of Helico changes from Piazza to Mexico? We need to look for every
occurrence of Helico and change the value of SchoolAdd from Piazza to Mexico, which is
prone to error.
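
The deletion and modification anomalies can be reproduced on a few rows of the table. The sketch below is only an illustration: it keeps three rows and a subset of the columns, and the "careless update" that stops after the first match is a deliberately planted mistake, not something taken from the lecture note.

```python
# Three rows from the example table (only some columns are kept).
rows = [
    {"EmpID": 16, "Skill": "C++", "SkillType": "Programming",
     "School": "Unity", "SchoolAdd": "Gerji"},
    {"EmpID": 25, "Skill": "VB6", "SkillType": "Programming",
     "School": "Helico", "SchoolAdd": "Piazza"},
    {"EmpID": 65, "Skill": "SQL", "SkillType": "Database",
     "School": "Helico", "SchoolAdd": "Piazza"},
]

# Deletion anomaly: removing employee 16 also removes the only row
# that records the skill C++ and its skill type.
after_delete = [r for r in rows if r["EmpID"] != 16]
cpp_still_known = any(r["Skill"] == "C++" for r in after_delete)

# Modification anomaly: a careless update changes only the first
# Helico row, so the same school ends up with two addresses.
for r in rows:
    if r["School"] == "Helico":
        r["SchoolAdd"] = "Mexico"
        break  # planted mistake: the remaining Helico rows are skipped

helico_addresses = {r["SchoolAdd"] for r in rows if r["School"] == "Helico"}

print(cpp_still_known)           # False
print(sorted(helico_addresses))  # ['Mexico', 'Piazza']
```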
Note: Database-Management System (DBMS) can work only with the information that we put
explicitly into its tables for a given database and into its rules for working with those tables,
where such rules are appropriate and possible.
Functional Dependency (FD)
Before moving to the definition and application of normalization, it is important to have an
understanding of "functional dependency."

Data Dependency
The logical associations between data items that point the database designer in the direction of a
good database design are referred to as determinant or dependent relationships.
Two data items A and B are said to be in a determinant or dependent relationship if certain
values of data item B always appear with certain values of data item A. If data item A is the
determinant data item and B the dependent data item, then the direction of the association is from
A to B and not vice versa.
The essence of this idea is that if the existence of something, call it A, implies that B must exist
and have a certain value, then we say that "B is functionally dependent on A." We also often
express this idea by saying that "A determines B," or that "B is a function of A", or that "A
functionally governs B."

X → Y holds if, whenever two tuples have the same value for X, they also have the same value
for Y.
The notation is A → B, which is read as: B is functionally dependent on A.
Functional Dependencies (FDs) are derived from the real-world constraints on the attributes.
Example

Dinner   Type of Wine
Meat     Red
Fish     White
Cheese   Rose

Since the type of Wine served depends on the type of Dinner, we say Wine is functionally
dependent on Dinner. This can be expressed as:
Dinner → Wine

Dinner   Type of Wine   Type of Fork
Meat     Red            Meat fork
Fish     White          Fish fork
Cheese   Rose           Cheese fork

Since both Wine type and Fork type are determined by the Dinner, we say Wine is functionally
dependent on Dinner and Fork is functionally dependent on Dinner. This is expressed as
follows:
Dinner → Wine and Dinner → Fork
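
The definition of X → Y can be checked mechanically against sample rows. The helper below is a minimal sketch; the function name `fd_holds` and the extra "Meat with White wine" row are hypothetical and only serve the illustration. Note that sample data can only refute a functional dependency: whether a dependency truly holds is decided by the real-world constraints, as stated above.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value and
        # compare every later row with the same lhs value against it.
        if seen.setdefault(x, y) != y:
            return False
    return True


dinners = [
    {"Dinner": "Meat", "Wine": "Red", "Fork": "Meat fork"},
    {"Dinner": "Fish", "Wine": "White", "Fork": "Fish fork"},
    {"Dinner": "Cheese", "Wine": "Rose", "Fork": "Cheese fork"},
]

print(fd_holds(dinners, ["Dinner"], ["Wine"]))  # True
print(fd_holds(dinners, ["Dinner"], ["Fork"]))  # True

# Hypothetical extra row: Meat served with White wine breaks Dinner -> Wine.
extra = {"Dinner": "Meat", "Wine": "White", "Fork": "Meat fork"}
print(fd_holds(dinners + [extra], ["Dinner"], ["Wine"]))  # False
```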

Partial Dependency
If we have a composite primary key, and a non-key attribute is dependent on only some part of
the primary key, then that attribute is partially functionally dependent on the primary key.
Let {A, B} be the primary key and C a non-key attribute.
If {A, B} → C holds, but B → C or A → C also holds, then C is partially functionally dependent
on {A, B}.
Full Dependency
If a non-key attribute is dependent not on some part of the primary key but on the whole key
(where we have a composite primary key), then that attribute is fully
functionally dependent on the primary key.
Let {A, B} be the primary key and C a non-key attribute.
If {A, B} → C holds, and neither B → C nor A → C holds (B alone cannot determine C and A alone
cannot determine C), then C is fully functionally dependent on {A, B}.
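
The difference between partial and full dependency can be sketched in code by testing every proper subset of the key. Everything in the example is hypothetical: the helper names, the attribute D and the sample values are invented for illustration, and, as with any check on sample data, the result only shows what the given rows permit.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def dependency_kind(rows, key, attr):
    """Classify attr as 'partial', 'full' or 'none' with respect to key."""
    if not fd_holds(rows, key, [attr]):
        return "none"
    # Try every proper, non-empty subset of the composite key.
    for size in range(1, len(key)):
        for part in combinations(key, size):
            if fd_holds(rows, list(part), [attr]):
                return "partial"
    return "full"


# Hypothetical rows: C is fixed by A alone, D needs both A and B.
rows = [
    {"A": 1, "B": 1, "C": "x", "D": 10},
    {"A": 1, "B": 2, "C": "x", "D": 20},
    {"A": 2, "B": 1, "C": "y", "D": 30},
    {"A": 2, "B": 2, "C": "y", "D": 40},
]

print(dependency_kind(rows, ["A", "B"], "C"))  # partial
print(dependency_kind(rows, ["A", "B"], "D"))  # full
```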
Transitive Dependency
In mathematics and logic, a transitive relationship is a relationship of the following form: "If A
implies B, and if also B implies C, then A implies C."
Example:
If Mr. X is a Human, and if every Human is an Animal, then Mr. X must be an Animal.
A generalized way of describing transitive dependency is:
If A functionally governs B, AND
if B functionally governs C,
THEN A functionally governs C,
provided that neither C nor B determines A, i.e. B ↛ A and C ↛ A.

In the normal notation:
{(A → B) AND (B → C)} ==> A → C, provided that B ↛ A and C ↛ A

Steps of Normalization

There are various levels or steps in normalization, called Normal Forms. The level of complexity,
the strength of the rules and the degree of decomposition increase as we move from a lower normal
form to a higher one.
A table in a relational database is said to be in a certain normal form if it satisfies certain
constraints. Each normal form listed below represents a stronger condition than the one before it.

Normalization towards a logical design consists of the following steps:


Unnormalized Form: Identify all data elements.
First Normal Form: Find the key with which you can find all data.
Second Normal Form: Remove part-key dependencies. Make all data dependent on the whole
key.
Third Normal Form: Remove non-key dependencies. Make all data dependent on nothing but
the key.

Note: For most practical purposes, databases are considered normalized if they adhere to third
normal form.

First Normal Form (1NF)


Requires that all column values in a table are atomic (e.g., a number is an atomic value, while a
list or a set is not).
We have two ways of achieving this:
1. Putting each repeating group into a separate table and connecting the tables with a primary
key-foreign key relationship.
2. Moving the repeating groups to new rows by repeating the common attributes, and then
finding the key with which all data can be found.
Definition: a table (relation) is in 1NF
If
o There are no duplicated rows in the table (each row has a unique identifier).
o Each cell is single-valued (i.e., there are no repeating groups).
o Entries in a column (attribute, field) are of the same kind.
Example for First Normal Form (1NF): Consider the following unnormalized database table.

EmpID  FirstName  LastName  Skill   SkillType    School  SchoolAdd    SkillLevel
12     Abebe      Mekuria   SQL     Database     AAU     Sidist_Kilo  5
                            VB6     Programming  Helico  Piazza       8
16     Lemma      Alemu     C++     Programming  Unity   Gerji        6
                            IP      Programming  Jimma   Jimma City   4
28     Chane      Kebede    SQL     Database     AAU     Sidist_Kilo  10
65     Almaz      Belay     SQL     Database     Helico  Piazza       9
                            Prolog  Programming  Jimma   Jimma City   8
                            Java    Programming  AAU     Sidist_Kilo  6
24     Dereje     Tamiru    Oracle  Database     Unity   Gerji        5
94     Alem       Kebede    Cisco   Networking   AAU     Sidist_Kilo  7

First normal form (1NF)

Remove all repeating groups. Distribute the multi-valued attributes into different rows and
identify a unique identifier for the relation, so that it can be said to be a relation in a relational
database.

EmpID  FirstName  LastName  SkillID  Skill   SkillType    School  SchoolAdd    SkillLevel
12     Abebe      Mekuria   1        SQL     Database     AAU     Sidist_Kilo  5
12     Abebe      Mekuria   3        VB6     Programming  Helico  Piazza       8
16     Lemma      Alemu     2        C++     Programming  Unity   Gerji        6
16     Lemma      Alemu     7        IP      Programming  Jimma   Jimma City   4
28     Chane      Kebede    1        SQL     Database     AAU     Sidist_Kilo  10
65     Almaz      Belay     1        SQL     Database     Helico  Piazza       9
65     Almaz      Belay     5        Prolog  Programming  Jimma   Jimma City   8
65     Almaz      Belay     8        Java    Programming  AAU     Sidist_Kilo  6
24     Dereje     Tamiru    4        Oracle  Database     Unity   Gerji        5
94     Alem       Kebede    6        Cisco   Networking   AAU     Sidist_Kilo  7

Remember that you can add additional fields/attributes when normalizing database tables. As you
can see above, SkillID is added to organize the table effectively and efficiently; in this
table the combination {EmpID, SkillID} uniquely identifies each row.
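
The second way of reaching 1NF (one row per repeating-group entry) can be sketched as a small flattening step. Representing the unnormalized table as records with a nested "Skills" list is an assumption made for this illustration, as are the names `to_1nf` and `SKILL_IDS`; the data and SkillID values are taken from the tables above, restricted to two employees.

```python
# Unnormalized records: the repeating group is held as a nested list
# (an assumed representation of the multi-valued cells).
unnormalized = [
    {"EmpID": 12, "FirstName": "Abebe", "LastName": "Mekuria",
     "Skills": [("SQL", "Database", "AAU", "Sidist_Kilo", 5),
                ("VB6", "Programming", "Helico", "Piazza", 8)]},
    {"EmpID": 28, "FirstName": "Chane", "LastName": "Kebede",
     "Skills": [("SQL", "Database", "AAU", "Sidist_Kilo", 10)]},
]

# SkillID values as used in the 1NF table.
SKILL_IDS = {"SQL": 1, "VB6": 3}


def to_1nf(records):
    """Emit one flat row per (employee, skill) pair."""
    rows = []
    for rec in records:
        for skill, skill_type, school, school_add, level in rec["Skills"]:
            rows.append({
                "EmpID": rec["EmpID"],
                "FirstName": rec["FirstName"],
                "LastName": rec["LastName"],
                "SkillID": SKILL_IDS[skill],
                "Skill": skill,
                "SkillType": skill_type,
                "School": school,
                "SchoolAdd": school_add,
                "SkillLevel": level,
            })
    return rows


flat = to_1nf(unnormalized)

# {EmpID, SkillID} should identify every row uniquely.
keys = [(r["EmpID"], r["SkillID"]) for r in flat]
print(len(flat))                    # 3
print(len(keys) == len(set(keys)))  # True
```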
Second Normal Form (2NF)
No non-key attribute may be partially dependent on part of the primary key. Removing such
dependencies results in a set of relations in Second Normal Form (2NF).
Note: Any table that is in 1NF and has a single-attribute (i.e., a non-composite) primary key is
automatically in 2NF.
Definition: a table (relation) is in 2NF
If
o It is in 1NF and
o All non-key attributes are dependent on the entire primary key (this matters only if the
primary key is a composite key), i.e. there is no partial dependency.
Example for Second Normal Form (2NF): Consider the following database table.
EMP_PROJ
EmpID  EmpName  ProjNo  ProjName  ProjLoc  ProjFund  ProjMangID  Incentive

EMP_PROJ rearranged
EmpID  ProjNo  EmpName  ProjName  ProjLoc  ProjFund  ProjMangID  Incentive

Business rule: Whenever an employee participates in a project, he/she is entitled to an
incentive.
This schema is in 1NF since it has no repeating groups or attributes with the multi-valued
property. To convert it to 2NF we need to remove all partial dependencies of non-key
attributes on part of the primary key.
{EmpID, ProjNo} → EmpName, ProjName, ProjLoc, ProjFund, ProjMangID, Incentive
But in addition to this we have the following dependencies:
FD1: {EmpID} → EmpName
FD2: {ProjNo} → ProjName, ProjLoc, ProjFund, ProjMangID
FD3: {EmpID, ProjNo} → Incentive
As we can see, some non-key attributes are dependent on only part of the primary key.
Thus each functional dependency, together with its dependent attributes, should be moved to a new
relation/table in which the determinant is the primary key. The result of the normalization
is as follows:

EMPLOYEE
EmpID  EmpName

PROJECT
ProjNo  ProjName  ProjLoc  ProjFund  ProjMangID

EMP_PROJ
EmpID  ProjNo  Incentive
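
The decomposition can be sketched as three projections of the original table, followed by a join that rebuilds it. The sample values below are hypothetical, the ProjLoc, ProjFund and ProjMangID columns are left out to keep the example short, and the helper `project` is an illustrative name rather than part of any library.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    seen, out = set(), []
    for r in rows:
        t = tuple(r[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out


# Hypothetical EMP_PROJ rows (some columns omitted for brevity).
emp_proj = [
    {"EmpID": 1, "ProjNo": 10, "EmpName": "Abebe",
     "ProjName": "Payroll", "Incentive": 500},
    {"EmpID": 1, "ProjNo": 20, "EmpName": "Abebe",
     "ProjName": "Website", "Incentive": 300},
    {"EmpID": 2, "ProjNo": 10, "EmpName": "Lemma",
     "ProjName": "Payroll", "Incentive": 400},
]

# One table per functional dependency; the determinant is the key.
employee = project(emp_proj, ["EmpID", "EmpName"])              # FD1
project_tbl = project(emp_proj, ["ProjNo", "ProjName"])         # FD2
assignment = project(emp_proj, ["EmpID", "ProjNo", "Incentive"])  # FD3

# Join the three tables back together on their keys.
emp_by_id = {e["EmpID"]: e for e in employee}
proj_by_no = {p["ProjNo"]: p for p in project_tbl}
rejoined = [
    {**a, **emp_by_id[a["EmpID"]], **proj_by_no[a["ProjNo"]]}
    for a in assignment
]

print(len(employee), len(project_tbl), len(assignment))  # 2 2 3
print(rejoined == emp_proj)                              # True
```

Each employee name and project name is now stored once, and the join shows that no information was lost by splitting the table.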

Third Normal Form (3NF)

Eliminate columns that depend on another non-key attribute: if attributes do not contribute to a
description of the key, move them to a separate table. In other words, avoid transitive
dependency. This level avoids update and delete anomalies.
Definition: a table (relation) is in 3NF
If:
o It is in 2NF and
o No non-key attribute is transitively dependent on the primary key.
Example for Third Normal Form (3NF): Consider the following example, in which students of the
same batch (same year) live in one building or dormitory.

STUDENT
StudID  Stud_FName  Stud_LName  Dep’t   Year  Dormitary
125/97  Abebe       Mekuria     InfoSc  1     401
654/95  Lemma       Alemu       Geog    3     403
842/95  Chane       Kebede      CompSc  3     403
165/97  Alem        Kebede      InfoSc  1     401
985/95  Almaz       Belay       Geog    3     403

This schema is in 2NF since the primary key is a single attribute.
Let’s take StudID, Year and Dormitary and see the dependencies.
StudID → Year AND Year → Dormitary
Year cannot determine StudID and Dormitary cannot determine StudID. Then, transitively,
StudID → Dormitary.
To convert it to 3NF we need to remove all transitive dependencies of non-key attributes on
other non-key attributes.
The non-key attributes that depend on each other are moved to another table and
linked with the main table using a candidate key-foreign key relationship.
STUDENT
StudID  Stud_FName  Stud_LName  Dep’t   Year
125/97  Abebe       Mekuria     InfoSc  1
654/95  Lemma       Alemu       Geog    3
842/95  Chane       Kebede      CompSc  3
165/97  Alem        Kebede      InfoSc  1
985/95  Almaz       Belay       Geog    3

DORM
Year  Dormitary
1     401
3     403
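
The effect of this decomposition can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the column types, the lower-case table names, the column name Dept (standing in for Dep’t) and the new dormitory number 405 are all chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dorm (
    Year      INTEGER PRIMARY KEY,
    Dormitary INTEGER NOT NULL
);
CREATE TABLE student (
    StudID     TEXT PRIMARY KEY,
    Stud_FName TEXT,
    Stud_LName TEXT,
    Dept       TEXT,
    Year       INTEGER REFERENCES dorm(Year)
);
""")

conn.executemany("INSERT INTO dorm VALUES (?, ?)", [(1, 401), (3, 403)])
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?)", [
    ("125/97", "Abebe", "Mekuria", "InfoSc", 1),
    ("654/95", "Lemma", "Alemu", "Geog", 3),
    ("842/95", "Chane", "Kebede", "CompSc", 3),
    ("165/97", "Alem", "Kebede", "InfoSc", 1),
    ("985/95", "Almaz", "Belay", "Geog", 3),
])

# Moving all third-year students to another dormitory (405 is a
# hypothetical value) now touches exactly one row.
cur = conn.execute("UPDATE dorm SET Dormitary = 405 WHERE Year = 3")
rows_changed = cur.rowcount

# The dormitory of each student is recovered with a join on Year.
result = conn.execute("""
    SELECT s.StudID, d.Dormitary
    FROM student AS s
    JOIN dorm AS d ON d.Year = s.Year
    ORDER BY s.StudID
""").fetchall()

print(rows_changed)  # 1
print(result)
```

In the original single-table design the same change would have required updating every third-year student row, which is the modification anomaly that 3NF removes.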

Generally, even though there are four additional levels of normalization, a table is said to
be normalized if it reaches 3NF.
• A mnemonic for remembering the rationale for normalization up to 3NF could be the
following:
1. No repeating or redundancy: no repeating fields in the table.
2. The fields depend upon the key: every non-key field should depend solely on the key.
3. The whole key: no partial dependency (no dependency on part of the primary key).
4. And nothing but the key: no inter-data dependency, i.e. no transitive dependency.

Other Levels of Normalization


1. Boyce-Codd Normal Form (BCNF)
Def: A table is in BCNF if it is in 3NF and if every determinant is a candidate key.
2. Fourth Normal Form (4NF)
Isolate Independent Multiple Relationships - No table may contain two or more 1:N or N:M
relationships that are not directly related.
Def: A table is in 4NF if it is in BCNF and if it has no multi-valued dependencies.
3. Fifth Normal Form (5NF)
Isolate Semantically Related Multiple Relationships - There may be practical constraints on
information that justify separating logically related many-to-many relationships.
Def: A table is in 5NF, also called "Projection-Join Normal Form" (PJNF), if it is in 4NF and
if every join dependency in the table is a consequence of the candidate keys of the
table.
4. Domain-Key Normal Form (DKNF)
A model free from all modification anomalies.
Def: A table is in DKNF if every constraint on the table is a logical consequence of the
definition of keys and domains.
Through normalization we want to design for our relational database a set of tables that:
(1) Contain all the data necessary for the purposes that the database is to serve,
(2) Have as little redundancy as possible,
(3) Accommodate multiple values for types of data that require them,
(4) Permit efficient updates of the data in the database, and
(5) Avoid the danger of losing data unknowingly.

The following figure shows the graphical illustration of different phases of normalization.

Pitfalls of Normalization
• Requires data to see the problems
• May reduce performance of the system
• Is time consuming
• Is difficult to design and apply
• Is prone to human error
