Unit 4
Logical Database Design
Logical design is the process of constructing a model of the information used in an
enterprise based on a specific data model (e.g. relational, hierarchical, network or object), but
independent of a particular DBMS and other physical considerations.
Normalization process
Collection of rules to be maintained
Discover new entities in the process
Revise attributes based on the rules and the discovered entities
The first step before applying the rules of the relational data model is to convert the conceptual
design into a form suitable for the relational logical model, that is, into tables.
Converting ER Diagram to Relational Tables
Three basic rules to convert ER into tables or relations:
Rule 1: Entity Names will automatically be table names
Rule 2: Mapping of attributes: attributes will be columns of the respective tables.
Atomic or single-valued or derived or stored attributes will be columns
Composite attributes: the parent attribute will be ignored and the decomposed
attributes (child attributes) will be columns of the table.
Multi-valued attributes: will be mapped to a new table where the primary key of the
main table will be posted for cross referencing.
Rule 3: Relationships: relationship will be mapped by using a foreign key attribute.
1. For a relationship with One-to-One Cardinality: post the primary or candidate key of
one of the tables into the other as a foreign key. In cases where one entity has partial
participation in the relationship, it is recommended to post the candidate key of the
partial participant to the total participant, so as to avoid wasting storage on null
values in the foreign key attribute. E.g. for a relationship Employee manages a
department: not every employee manages a department, so the key of Employee is posted
into Department.
2. For a relationship with One-to-Many Cardinality:
Fundamentals of Database Systems Lecture Note UOG
Post the primary key or candidate key from the “one” side as a foreign key attribute to
the “many” side. E.g.: For a relationship called “Belongs To” between Employee
(Many) and Department (One).
3. For a relationship with Many-to-Many Cardinality:
Create a new table (which is the associative entity) and post primary key or candidate
key from each entity as attributes in the new table along with some additional
attributes (if applicable).
4. For a relationship having Associative Entity property: in cases
where the relationship has its own attributes (associative entity), one
has to create a new table for the associative entity and post primary
key or candidate key from the participating entities as foreign key
attributes in the new table.
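[Figure: ER diagram of the example. Labels recoverable from the original figure: attributes FName, LName, Tel, DName, StartDate, EndDate, PBonus, PFund, PID and PName; relationships WorksFor, Leads and Participate; entity Project; cardinality markers 1 and M.]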
After we have drawn the ER diagram, the next step is to map it into a
relational schema so that the rules of the relational data model can be tested
for each relation. The entities are mapped first, followed by the
relationships, based on the mapping rules. The mapping has been done as
follows.
Telephone
EID Tel
Mapping DEPARTMENT Entity:
There will be Department table with DID, DName, and DLoc being
the columns.
Department
DID DName DLoc
At the end of the mapping we will have the following relational schema
(tables) for the logical database design phase.
Department
DID DName DLoc MEID
Project
PID PName PFund
Telephone
EID Tel
Employee
EID FName LName Salary EDID
Emp_Partc_Project
EID PID
Emp_Lead_Project
EID PID PBonus StartDate EndDate
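The relational schema above could be declared roughly as follows. This is only a sketch using Python's built-in sqlite3 module: the column types, the choice of composite primary keys, and the reading of MEID as the manager's EID and EDID as the employee's department ID are assumptions, since the notes give only table and column names.

```python
import sqlite3

# Column types and key choices are assumptions; the notes list only names.
SCHEMA = """
CREATE TABLE Department (
    DID   INTEGER PRIMARY KEY,
    DName TEXT,
    DLoc  TEXT,
    MEID  INTEGER REFERENCES Employee(EID)    -- 1:1, read here as the manager's EID
);
CREATE TABLE Employee (
    EID    INTEGER PRIMARY KEY,
    FName  TEXT,
    LName  TEXT,
    Salary REAL,
    EDID   INTEGER REFERENCES Department(DID) -- 1:M, key of the "one" side
);
CREATE TABLE Project (
    PID   INTEGER PRIMARY KEY,
    PName TEXT,
    PFund REAL
);
CREATE TABLE Telephone (                      -- multi-valued attribute Tel
    EID INTEGER REFERENCES Employee(EID),
    Tel TEXT,
    PRIMARY KEY (EID, Tel)
);
CREATE TABLE Emp_Partc_Project (              -- many-to-many relationship
    EID INTEGER REFERENCES Employee(EID),
    PID INTEGER REFERENCES Project(PID),
    PRIMARY KEY (EID, PID)
);
CREATE TABLE Emp_Lead_Project (               -- relationship with its own attributes
    EID       INTEGER REFERENCES Employee(EID),
    PID       INTEGER REFERENCES Project(PID),
    PBonus    REAL,
    StartDate TEXT,
    EndDate   TEXT,
    PRIMARY KEY (EID, PID)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(SCHEMA)

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

With foreign keys switched on, inserting a Telephone row for an EID that does not exist in Employee is rejected, which is the cross-referencing that Rule 2 and Rule 3 describe.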
Normalization
A relational database is merely a collection of data, organized in a particular manner. As the
father of the relational database approach, Codd created a series of rules called normal forms
that help define that organization.
Database normalization is a series of steps followed to obtain a database design that allows for
consistent storage and efficient access of data in a relational database. These steps reduce data
redundancy and the risk of data becoming inconsistent.
Normalization is the process of identifying the logical associations between data items and
designing a database that represents such associations without suffering from the following
anomalies:
1. Insertion Anomalies
2. Deletion Anomalies
3. Modification Anomalies
Normalization may reduce system performance since data will be cross referenced from many
tables. Thus, denormalization is sometimes used to improve performance, at the cost of reduced
consistency guarantees.
All the normalization rules will eventually remove the anomalies that may exist during data
manipulation after the implementation.
The types of problems that can occur in an insufficiently normalized table are called anomalies,
and include:
(1) Insertion anomalies
An "insertion anomaly" is a failure to place information about a new database entry into all the
places in the database where information about that new entry needs to be stored. In a properly
normalized database, information about a new entry needs to be inserted into only one place in
the database; in an inadequately normalized database, information about a new entry may need to
be inserted into more than one place.
(2) Deletion anomalies
A "deletion anomaly" is a failure to remove information about an existing database entry when it
is time to remove that entry. In a properly normalized database, information about an old, to-be-
gotten-rid-of entry needs to be deleted from only one place in the database; in an inadequately
normalized database, information about that old entry may need to be deleted from more than
one place.
(3) Modification/Updating anomalies
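A modification of a database involves changing some value of an attribute of a table. In a
properly normalized database, each fact is stored in only one place, so a single change is
reflected everywhere the value is used; in an inadequately normalized database, the same value
may need to be changed in more than one place, and missing one occurrence leaves the data
inconsistent.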
N.B: The purpose of normalization is to reduce the chances for anomalies to occur in a
database.
Insertion Anomalies:
What if we have a new employee with a skill called Pascal? We can’t decide whether Pascal is
allowed as a value for skill and we have no clue about the type of skill that Pascal should be
categorized as.
Deletion Anomalies:
If the employee with EMPID 16 is deleted, then all information about the skill C++ and its
skill type is also deleted from the database. We will then have no information about C++ and
its skill type.
Modification Anomalies:
What if the address for Helico is changed from Piazza to Mexico? We need to look for every
occurrence of Helico and change the value of School_Add from Piazza to Mexico, which is
prone to error.
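The deletion and modification anomalies described above can be reproduced on a small un-normalized table. The sketch below is illustrative only: the table name EmpSkill, its column layout and all row values are hypothetical, except EMPID 16, C++, Helico, Piazza and Mexico, which come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmpSkill (EmpID INTEGER, Skill TEXT, SkillType TEXT,"
    " School TEXT, School_Add TEXT)"
)
# Hypothetical rows; only EMPID 16, C++, Helico, Piazza and Mexico come from the text.
conn.executemany(
    "INSERT INTO EmpSkill VALUES (?, ?, ?, ?, ?)",
    [
        (12, "SQL", "Database", "Helico", "Piazza"),
        (16, "C++", "Programming", "Helico", "Piazza"),
        (28, "SQL", "Database", "OtherSchool", "OtherAddress"),
    ],
)

# Modification anomaly: the address is changed in one row but another row is missed.
conn.execute(
    "UPDATE EmpSkill SET School_Add = 'Mexico' WHERE School = 'Helico' AND EmpID = 12"
)
helico_addresses = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT School_Add FROM EmpSkill WHERE School = 'Helico'"
    )
)

# Deletion anomaly: removing employee 16 also removes the only record of C++.
conn.execute("DELETE FROM EmpSkill WHERE EmpID = 16")
cpp_rows = conn.execute(
    "SELECT COUNT(*) FROM EmpSkill WHERE Skill = 'C++'"
).fetchone()[0]

print(helico_addresses, cpp_rows)
```

After the partial update the same school carries two different addresses, and after the delete no row mentions C++ at all.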
Note: Database-Management System (DBMS) can work only with the information that we put
explicitly into its tables for a given database and into its rules for working with those tables,
where such rules are appropriate and possible.
Functional Dependency (FD)
Before moving to the definition and application of normalization, it is important to have an
understanding of "functional dependency."
Data Dependency
The logical associations between data items that point the database designer in the direction of a
good database design are referred to as determinant or dependent relationships.
Two data items A and B are said to be in a determinant or dependent relationship if certain
values of data item B always appear with certain values of data item A. If data item A is the
determinant and B the dependent data item, then the direction of the association is from
A to B and not vice versa.
The essence of this idea is that if the existence of something, call it A, implies that B must exist
and have a certain value, then we say that "B is functionally dependent on A." We also often
express this idea by saying that "A determines B," or that "B is a function of A", or that "A
functionally governs B."
X → Y holds if whenever two tuples have the same value for X, they must have the same value
for Y.
The notation is: A → B, which is read as: B is functionally dependent on A.
Functional Dependencies (FDs) are derived from the real-world constraints on the attributes.
Example
Since the type of Wine served depends on the type of Dinner, we say Wine is functionally
dependent on Dinner. And this can be expressed as:
Dinner → Wine
Dinner    Type of Wine    Type of Fork
Meat      Red             Meat fork
Fish      White           Fish fork
Cheese    Rose            Cheese fork
Since both Wine type and Fork type are determined by the Dinner, we say Wine is functionally
dependent on Dinner and Fork is functionally dependent on Dinner. And this is expressed as
follows:
Dinner → Wine
Dinner → Fork
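The definition of a functional dependency can be checked mechanically against sample data. The following is a minimal sketch, assuming the rows are held as Python dictionaries; the function name fd_holds and the short column names Wine and Fork are illustrative choices, while the data is the Dinner table above.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The first Y value seen for an X value must match every later one.
        if seen.setdefault(x, y) != y:
            return False
    return True


dinner = [
    {"Dinner": "Meat", "Wine": "Red", "Fork": "Meat fork"},
    {"Dinner": "Fish", "Wine": "White", "Fork": "Fish fork"},
    {"Dinner": "Cheese", "Wine": "Rose", "Fork": "Cheese fork"},
]

print(fd_holds(dinner, ["Dinner"], ["Wine"]))
print(fd_holds(dinner, ["Dinner"], ["Fork"]))
```

A check like this can only show that the sample data does not contradict a dependency. Whether the dependency really holds is decided by the real-world constraints, not by the rows that happen to be stored.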
Partial Dependency
If we have a composite primary key and a non-key attribute depends on only some part of
that key, then that attribute is partially functionally dependent on the primary key.
Let {A, B} be the Primary Key and C a non-key attribute.
If {A, B} → C holds but B → C or A → C also holds, then C is partially functionally
dependent on {A, B}.
Full Dependency
If a non-key attribute depends on the whole of a composite primary key and not on any part
of it, then that attribute is fully functionally dependent on the primary key.
Let {A, B} be the Primary Key and C a non-key attribute.
If {A, B} → C holds and neither B → C nor A → C holds (B alone cannot determine C and A
alone cannot determine C), then C is fully functionally dependent on {A, B}.
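The difference between partial and full dependency can be sketched in code using the attribute-closure technique, which is a standard method but is not introduced in these notes. The function names closure and dependency_kind and the representation of dependencies as (determinant, dependents) pairs are assumptions made for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def dependency_kind(key, attr, fds):
    """Classify attr as 'partial', 'full' or 'none' with respect to a composite key."""
    # Any proper, non-empty subset of the key that determines attr makes it partial.
    for size in range(1, len(key)):
        for part in combinations(key, size):
            if attr in closure(part, fds):
                return "partial"
    return "full" if attr in closure(key, fds) else "none"


fds_partial = [(("A", "B"), ("C",)), (("B",), ("C",))]
fds_full = [(("A", "B"), ("C",))]

print(dependency_kind(("A", "B"), "C", fds_partial))
print(dependency_kind(("A", "B"), "C", fds_full))
```

In the first set B alone already determines C, so the dependency on {A, B} is partial. In the second set only the whole key determines C, so the dependency is full.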
Transitive Dependency
In mathematics and logic, a transitive relationship is a relationship of the following form: "If A
implies B, and if also B implies C, then A implies C."
Example:
If Mr. X is a Human, and if every Human is an Animal, then Mr. X must be an Animal.
Generalized way of describing transitive dependency is that:
If A functionally governs B, AND
If B functionally governs C
THEN A functionally governs C.
Provided that neither C nor B determines A, i.e. (B ↛ A and C ↛ A).
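The rule above, including its side condition, can be sketched as a small search over a set of dependencies. This is a simplified illustration that assumes single-attribute determinants; the function name transitive_dependencies and the dictionary representation are hypothetical.

```python
def transitive_dependencies(key, fds):
    """List (A, B, C) triples where A -> B and B -> C, and neither B nor C determines A.

    fds maps a single determinant attribute to the set of attributes it determines.
    """
    found = []
    for b in sorted(fds.get(key, set())):
        if key in fds.get(b, set()):
            continue  # B determines A, so the side condition fails
        for c in sorted(fds.get(b, set())):
            if c in (key, b) or key in fds.get(c, set()):
                continue  # skip trivial cases and cases where C determines A
            found.append((key, b, c))
    return found


print(transitive_dependencies("A", {"A": {"B"}, "B": {"C"}}))
print(transitive_dependencies("A", {"A": {"B"}, "B": {"A", "C"}}))
```

The first call reports that C depends on A through B. The second reports nothing, because there B determines A and the side condition is not met.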
Steps of Normalization
We have various levels or steps in normalization called Normal Forms. The complexity, the
strength of the rules and the degree of decomposition increase as we move from a lower normal
form to a higher one.
A table in a relational database is said to be in a certain normal form if it satisfies certain
constraints. Each normal form represents a stronger condition than the previous one.
Note: For most practical purposes, databases are considered normalized if they adhere to third
normal form.
Remember that you can add fields/attributes when normalizing database tables. As you can
see above, SkillID is added to organize the database table effectively and efficiently.
Second Normal form (2NF)
A table is in Second Normal Form (2NF) when no non-key attribute is partially dependent on part
of the primary key. Removing such partial dependencies results in a set of relations in 2NF.
Note: Any table that is in 1NF and has a single-attribute (i.e., a non-composite) primary key is
automatically in 2NF.
Definition: a table (relation) is in 2NF if
o It is in 1NF, and
o All non-key attributes are dependent on the entire primary key (this matters only when the
primary key is a composite key), i.e. there is no partial dependency.
Example for Second Normal Form (2NF): Consider the following database tables.
EMP_PROJ
EMP_PROJ rearranged
EmpID ProjNo EmpName ProjName ProjLoc ProjFund ProjMangID Incentive
Business rule: Whenever an employee participates in a project, he/she will be entitled to an
incentive.
This schema is in 1NF since it has no repeating groups and no multi-valued attributes. To
convert it to 2NF we need to remove all partial dependencies of non-key attributes on part of
the primary key.
{EmpID, ProjNo} → EmpName, ProjName, ProjLoc, ProjFund, ProjMangID, Incentive
But in addition to this we have the following dependencies:
FD1: {EmpID} → EmpName
FD2: {ProjNo} → ProjName, ProjLoc, ProjFund, ProjMangID
FD3: {EmpID, ProjNo} → Incentive
As we can see, some non-key attributes are partially dependent on some part of the primary key.
Thus, each functional dependency, together with its dependent attributes, should be moved to a
new relation/table in which the determinant is the Primary Key. Then the normalization
result will be as follows:
EMPLOYEE
EmpID EmpName
PROJECT
ProjNo ProjName ProjLoc ProjFund ProjMangID
EMP_PROJ
EmpID ProjNo Incentive
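The 2NF decomposition above can be tried out with Python's built-in sqlite3 module. The sketch below is illustrative: the table name EMP_PROJ_FLAT, the column types and every row value are hypothetical, and only the column names and the three resulting tables come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP_PROJ_FLAT (
    EmpID INTEGER, ProjNo INTEGER, EmpName TEXT, ProjName TEXT,
    ProjLoc TEXT, ProjFund INTEGER, ProjMangID INTEGER, Incentive INTEGER,
    PRIMARY KEY (EmpID, ProjNo)
);
-- Hypothetical sample rows
INSERT INTO EMP_PROJ_FLAT VALUES
    (1, 10, 'Abebe', 'Payroll', 'Site A', 5000, 3, 200),
    (1, 20, 'Abebe', 'Library', 'Site B', 8000, 4, 150),
    (2, 10, 'Lemma', 'Payroll', 'Site A', 5000, 3, 300);

-- One table per functional dependency (FD1, FD2, FD3)
CREATE TABLE EMPLOYEE AS
    SELECT DISTINCT EmpID, EmpName FROM EMP_PROJ_FLAT;
CREATE TABLE PROJECT AS
    SELECT DISTINCT ProjNo, ProjName, ProjLoc, ProjFund, ProjMangID FROM EMP_PROJ_FLAT;
CREATE TABLE EMP_PROJ AS
    SELECT EmpID, ProjNo, Incentive FROM EMP_PROJ_FLAT;
""")

original = conn.execute(
    "SELECT * FROM EMP_PROJ_FLAT ORDER BY EmpID, ProjNo"
).fetchall()

# Joining the three tables back together should give the original rows.
rejoined = conn.execute("""
    SELECT ep.EmpID, ep.ProjNo, e.EmpName, p.ProjName, p.ProjLoc,
           p.ProjFund, p.ProjMangID, ep.Incentive
    FROM EMP_PROJ ep
    JOIN EMPLOYEE e ON e.EmpID = ep.EmpID
    JOIN PROJECT p ON p.ProjNo = ep.ProjNo
    ORDER BY ep.EmpID, ep.ProjNo
""").fetchall()

employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
project_count = conn.execute("SELECT COUNT(*) FROM PROJECT").fetchone()[0]
print(rejoined == original, employee_count, project_count)
```

Each employee name and each project description is now stored once, yet the join still reproduces the un-normalized rows, so no information was lost by the split.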
Example for Third Normal Form (3NF): Consider the following database table.
STUDENT
StudID  Stud_FName  Stud_LName  Dep’t   Year  Dormitary
125/97  Abebe       Mekuria     InfoSc  1     401
654/95  Lemma       Alemu       Geog    3     403
842/95  Chane       Kebede      CompSc  3     403
165/97  Alem        Kebede      InfoSc  1     401
985/95  Almaz       Belay       Geog    3     403
This schema is in 2NF since it is in 1NF and its primary key is a single attribute.
Let’s take StudID, Year and Dormitary and see the dependencies.
StudID → Year AND Year → Dormitary
Year cannot determine StudID and Dormitary cannot determine StudID, so transitively
StudID → Dormitary.
To convert it to 3NF we need to remove all transitive dependencies of non-key attributes on
other non-key attributes.
The non-key attributes that depend on each other are moved to another table and linked with
the main table using a candidate key/foreign key relationship.
STUDENT
StudID  Stud_FName  Stud_LName  Dep’t   Year
125/97  Abebe       Mekuria     InfoSc  1
654/95  Lemma       Alemu       Geog    3
842/95  Chane       Kebede      CompSc  3
165/97  Alem        Kebede      InfoSc  1
985/95  Almaz       Belay       Geog    3
DORM
Year Dormitary
1 401
3 403
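The 3NF split of the STUDENT table can be reproduced in a few lines of Python. This is a sketch using the rows from the notes; the variable names and the tuple layout are illustrative assumptions.

```python
# Rows of the original STUDENT table:
# (StudID, Stud_FName, Stud_LName, Dept, Year, Dormitary)
students = [
    ("125/97", "Abebe", "Mekuria", "InfoSc", 1, 401),
    ("654/95", "Lemma", "Alemu", "Geog", 3, 403),
    ("842/95", "Chane", "Kebede", "CompSc", 3, 403),
    ("165/97", "Alem", "Kebede", "InfoSc", 1, 401),
    ("985/95", "Almaz", "Belay", "Geog", 3, 403),
]

# Build the Year -> Dormitary table, checking that the dependency really holds.
dorm = {}
for *_, year, dormitary in students:
    if dorm.setdefault(year, dormitary) != dormitary:
        raise ValueError("Year does not determine Dormitary in this data")

# STUDENT without the transitively dependent column.
student_3nf = [row[:5] for row in students]

# Looking the dormitory up by Year rebuilds the original rows.
rebuilt = [row + (dorm[row[4]],) for row in student_3nf]

print(dorm)
print(rebuilt == students)
```

The dormitory for each year is now recorded once, so a change of dormitory for a whole year is a single update.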
Generally, even though there are four additional levels of normalization, a table is said to
be normalized if it reaches 3NF.
A mnemonic for remembering the rationale for normalization up to 3NF could be the
following:
1. No repeating or redundancy: No repeating fields in the table.
2. The fields depend upon the key: Every non-key field should depend on the key.
3. The whole key: No partial dependency (no dependency on part of the primary key).
4. And nothing but the key: No inter-data dependency, i.e. no transitive dependency.
(1) Contain all the data necessary for the purposes that the database is to serve,
The following figure shows the graphical illustration of different phases of normalization.
Pitfalls of Normalization
Requires data to see the problems
May reduce performance of the system
Is time consuming,
Difficult to design and apply and
Prone to human error