bcs-092 b2
Indira Gandhi National Open University
School of Computer and Information Sciences
Introduction to Databases
Block
2
DATABASE DESIGN
UNIT 6
Entity Relationship Model
UNIT 7
Integrity Rules and Constraints
UNIT 8
Relational Design and Redundancy
UNIT 9
Functional Dependencies
UNIT 10
Introduction to Data Normalization
Course Coordinator (for the year 2021)
Mr. Akshay Kumar, Associate Professor
SOCIS, IGNOU, New Delhi-110068
In all the Units of this Block, Check Your Progress and Answers to Check
Your Progress Questions are written and added in 2021 BY
Mr. Akshay Kumar, Associate Professor
SOCIS, IGNOU, New Delhi-110068
PRODUCTION
Mr. Tilak Raj Mr. Yashpal
Asst. Registrar (Pub.) Section Officer (Pub.)
MPDD, IGNOU, New Delhi MPDD, IGNOU, New Delhi
January, 2021
Laser Typesetting : Akashdeep Printers, 20-Ansari Road, Daryaganj, New Delhi-110002
IGNOU is one of the participants for the development of courses of this
Programme
Copyright
This course has been developed as part of the collaborative advanced ICT course
development project of the Commonwealth of Learning (COL). COL is an
intergovernmental organization created by Commonwealth Heads of Government
to promote the development and sharing of open learning and distance education
knowledge, resources and technologies.
The Open University of Tanzania (OUT) is a fully fledged, autonomous and
accredited public University. It offers its certificate, diploma, degree and
postgraduate courses through the open and distance learning system which
includes various means of communication such as face-to-face, broadcasting,
telecasting, correspondence, seminars, e-learning as well as a blended mode. The
OUT’s academic programmes are quality-assured and as centrally regulated by
the Tanzania Commission for Universities (TCU).
UNIT 6 ENTITY RELATIONSHIP MODEL
Structure
6.1 Introduction
6.2 Objectives
6.3 Terminologies
6.4 Introduction to ER Model
6.5 Entities
6.5.1 Definition of an Entity
6.5.2 Classification of Entities
6.5.3 Types of Entities
6.5.4 Attributes
6.5.5 Keys
6.6 Relationships
Introduction
Types of Relationships
6.7 Video Lecture
6.8 Activity
6.9 Summary
6.10 Answers
6.11 Case Study
6.12 References and Further Reading
6.13 Attribution
6.1 INTRODUCTION
To design and implement a database correctly, you need to understand which entities
should hold data and to identify the connections that may exist between those entities.
In this unit, you are going to learn about the Entity-Relationship Model, which
provides a technique for creating a graphical view of the different elements of a database
as well as the relationships between them. In addition, you will also learn the drawing
conventions of the E-R model, beginning with those conventions used to represent a
single entity, and concluding with conventions used to represent all relations in a
database.
6.2 OBJECTIVES
Upon completion of this unit you will be able to:
Describe the significance of the ER Model in database design.
Explain the entity-relationship model and its components.
Convert user requirements into an ER Model.
6.3 TERMINOLOGIES
Entity : A thing with distinct and independent existence.
Attribute : Characteristic of an entity.
Relationship : Connection between entities/tables in a database.
6.5 ENTITIES
In this section the term entity is explained with the help of an example.
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q2: Identify the strong and weak entities in the following statement:
The information is required to be stored about the employees and their
dependents.
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q3: What is a characteristic entity?
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
6.5.4 Attributes
Each entity is described by a set of attributes, e.g. Lecturer = (PFNo, Name, Address,
Age, Salary).
Each attribute has a name, is associated with an entity, and is associated with a domain of
legal values. However, the information about the attribute domain is not presented on the
ER diagram.
i) Types of Attributes
There are a few types of attributes you need to be familiar with. Some of these are to
be left as is, but some need to be adjusted to facilitate representation in the relational
model. This first section will discuss the types of attributes. Later on we will discuss
fixing the attributes to fit correctly into the relational model.
Simple Attributes
Simple attributes are those drawn from the atomic value domains; they are also called
single-valued attributes. In the UNIVERSITY database, an example of this would be:
PFNo = {908}; Name = {Grace}, Age = {37}
Composite Attribute
Composite attributes are those that consist of a hierarchy of attributes. Using our
database example, and shown in Figure 6.4, Address may consist of Number, Street
and Suburb. So this would be written as follows:-
Address = {30 + 'Tumaini' + 'Mikocheni'}
Multivalued attributes
Multivalued attributes are attributes that have a set of values for each entity. An example
of a multivalued attribute from the UNIVERSITY database is shown in Figure 6.5.
Derived attributes
Derived attributes are attributes that contain values calculated from other attributes.
An example of this can be seen in Figure 6.6. Age can be derived from the attribute
Birthdate. In this situation, Birthdate is called a stored attribute, which is physically
saved to the database.
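The four attribute types described above can be illustrated with a short Python sketch. Only PFNo, Name, Address, Age and Birthdate come from the text; the class names, the `phones` field, the `age_on` method and the sample birthdate are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    # Composite attribute: built from Number, Street and Suburb.
    number: int
    street: str
    suburb: str


@dataclass
class Lecturer:
    pf_no: int            # simple attribute
    name: str             # simple attribute
    address: Address      # composite attribute
    birthdate: date       # stored attribute (sample value is hypothetical)
    phones: list = field(default_factory=list)  # multivalued attribute (hypothetical)

    def age_on(self, today: date) -> int:
        # Derived attribute: calculated from the stored Birthdate, never saved.
        had_birthday = (today.month, today.day) >= (self.birthdate.month, self.birthdate.day)
        return today.year - self.birthdate.year - (0 if had_birthday else 1)


grace = Lecturer(908, "Grace", Address(30, "Tumaini", "Mikocheni"), date(1984, 3, 15))
age = grace.age_on(date(2021, 6, 1))
```

Because Age is recalculated from Birthdate on request, it cannot drift out of date the way a stored Age value could.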
6.5.5 Keys
An important constraint on an entity is the key. The key is an attribute or a group of
attributes whose values can be used to uniquely identify an individual entity in an entity
set. The following are types of keys:
i) Candidate Key
A candidate key is a simple or composite key that is unique and minimal. It is unique
because no two rows in a table may have the same value at any time. It is minimal
because every column is necessary in order to attain uniqueness.
From our UNIVERSITY database example, if the entity is Lecturer (LID, FirstName,
LastName, Address, Phone, BirthDate, Salary, DepartmentID), possible candidate
keys are:
LID
First Name and Last Name - assuming there is no one else in the company
with the same name
Last Name and DepartmentID - assuming two people with the same last
name don't work in the same department
First Name and Last Name - assuming there is no one else in the company
with the same name.
Last Name and Department ID - assuming two people with the same last
name don't work in the same department.
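The "unique and minimal" test for a candidate key can be sketched in Python as follows. The sample rows and the function names are hypothetical. Note that such a check only shows that the current data does not contradict a key; whether a combination really is a key depends on the business rules, as the assumptions in the list above indicate.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """Return True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_candidate_key(rows, attrs):
    """Unique, and minimal: no proper subset of attrs is unique on its own."""
    if not is_unique(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_unique(rows, subset):
                return False
    return True


# Hypothetical sample rows for the Lecturer entity.
lecturers = [
    {"LID": 1, "FirstName": "Grace", "LastName": "Mushi", "DepartmentID": 10},
    {"LID": 2, "FirstName": "John", "LastName": "Mushi", "DepartmentID": 20},
    {"LID": 3, "FirstName": "Grace", "LastName": "Kimaro", "DepartmentID": 10},
]

lid_ok = is_candidate_key(lecturers, ("LID",))
name_ok = is_candidate_key(lecturers, ("FirstName", "LastName"))
not_minimal = is_candidate_key(lecturers, ("LID", "FirstName"))
```

Here (LID, FirstName) is unique but not minimal, because LID alone already identifies each row.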
iii) Primary Key
The primary key is a candidate key that is selected by the database designer to be
used as an identifying mechanism for the whole entity set. It must uniquely identify
tuples in a table and not be null. The primary key is indicated in the ER model by
underlining the attribute.
v) Alternate key
Alternate keys are all candidate keys not chosen as the primary key.
Department (DepartmentID,DepartmentName,DepartmentSpecialities)
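As a minimal sketch, the Department entity above could be declared with Python's built-in sqlite3 module as shown here. Treating DepartmentName as an alternate key is an assumption made for illustration, and the sample rows and the `try_insert` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Department (
        DepartmentID           INTEGER PRIMARY KEY,   -- chosen primary key
        DepartmentName         TEXT NOT NULL UNIQUE,  -- alternate key (assumed unique)
        DepartmentSpecialities TEXT
    )
    """
)
conn.execute("INSERT INTO Department VALUES (1, 'Computer Science', 'Databases')")


def try_insert(row):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO Department VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_pk = try_insert((1, "Mathematics", "Algebra"))          # same primary key
duplicate_alt = try_insert((2, "Computer Science", "Networks"))   # same alternate key
new_row = try_insert((2, "Mathematics", "Algebra"))               # both keys are new
```

The database rejects the first two rows because each repeats a key value that already exists; only the third row is stored.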
Check Your Progress 2
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q2: Which of the following is a multi-valued attribute and which is a derived attribute?
Duration of work in present post, Experience.
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q3: Given the following relation, identify primary key, composite key, and foreign
keys.
Student (Id, name, programmecode, aadharno, phone, dateofbirth)
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
6.6 RELATIONSHIPS
A relationship in the ER model connects two or more entities. Relationships are discussed
in detail in this section.
6.6.1 Introduction
Relationships are the glue that holds the tables together. They are used to connect
related information between tables.
Relationship strength is based on how the primary key of a related entity is defined. A
weak, or non-identifying, relationship exists if the primary key of the related entity
does not contain a primary key component of the parent entity. For example:
Customer(CustID, CustName)
Order(OrderID, CustID, Date)
A strong, or identifying, relationship exists when the primary key of the related entity
contains the primary key component of the parent entity. Examples include:
Course(CourseCode, DepartmentID, Description)
Class(CrsCode, Section, ClassTime…)
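The difference between the two examples can be checked mechanically: a relationship is strong when a foreign key column of the related table is also part of its primary key. The sketch below uses Python's sqlite3 module. The column types, the foreign key declarations and the `relationship_strength` helper are assumptions made for illustration, and only the columns named in the text are declared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer (CustID INTEGER PRIMARY KEY, CustName TEXT);
    -- "Order" is quoted because ORDER is an SQL keyword.
    CREATE TABLE "Order" (
        OrderID INTEGER PRIMARY KEY,
        CustID  INTEGER REFERENCES Customer(CustID),
        Date    TEXT
    );
    CREATE TABLE Course (
        CourseCode   TEXT PRIMARY KEY,
        DepartmentID INTEGER,
        Description  TEXT
    );
    CREATE TABLE Class (
        CrsCode   TEXT REFERENCES Course(CourseCode),
        Section   INTEGER,
        ClassTime TEXT,
        PRIMARY KEY (CrsCode, Section)
    );
    """
)


def relationship_strength(table):
    """Return 'strong' if a foreign key column is part of the primary key, else 'weak'."""
    # PRAGMA table_info: column 1 is the name, column 5 is the position in the PK (0 = not in PK).
    pk_cols = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")') if row[5] > 0}
    # PRAGMA foreign_key_list: column 3 is the referencing ("from") column.
    fk_cols = {row[3] for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')}
    return "strong" if pk_cols & fk_cols else "weak"


order_strength = relationship_strength("Order")
class_strength = relationship_strength("Class")
```

Order is identified by OrderID alone, so its link to Customer is weak; Class needs CrsCode from Course in its own key, so its link to Course is strong.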
A one to one (1:1) relationship is the relationship of one entity to only one other entity,
and vice versa. It should be rare in any relational database design. In fact, it could
indicate that two entities actually belong in the same table.
An example from the UNIVERSITY database is one lecturer is associated with one
spouse, and one spouse is associated with one lecturer.
tables.
The linking table contains multiple occurrences of the foreign key values.
Figure 6.8 shows another aspect of the M:N relationship where a lecturer has different
start dates for different academic programmes. Therefore, you need a JOIN table that
contains the LID, PrgrmCode and StartDate.
Figure 6.8: Illustration of where lecturer has different start dates for
different Programmes, by G.Mbwete
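The JOIN table described above, holding LID, PrgrmCode and StartDate, could look like the following sketch in Python's sqlite3 module. The table names, the extra Name and Title columns and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Lecturer  (LID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Programme (PrgrmCode TEXT PRIMARY KEY, Title TEXT);
    -- Linking table: one row per (lecturer, programme) pair.
    CREATE TABLE LecturerProgramme (
        LID       INTEGER REFERENCES Lecturer(LID),
        PrgrmCode TEXT REFERENCES Programme(PrgrmCode),
        StartDate TEXT,
        PRIMARY KEY (LID, PrgrmCode)
    );
    INSERT INTO Lecturer VALUES (1, 'Grace'), (2, 'John');
    INSERT INTO Programme VALUES ('BCS', 'Computer Science'), ('BIT', 'Information Technology');
    INSERT INTO LecturerProgramme VALUES
        (1, 'BCS', '2019-01-10'),
        (1, 'BIT', '2020-09-01'),
        (2, 'BCS', '2021-02-15');
    """
)

# Resolve the M:N relationship by joining through the linking table.
start_dates = conn.execute(
    """
    SELECT l.Name, p.PrgrmCode, lp.StartDate
    FROM LecturerProgramme AS lp
    JOIN Lecturer  AS l ON l.LID = lp.LID
    JOIN Programme AS p ON p.PrgrmCode = lp.PrgrmCode
    ORDER BY l.Name, p.PrgrmCode
    """
).fetchall()
```

StartDate belongs to neither Lecturer nor Programme alone; it describes the pairing, which is why it is stored in the linking table.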
Figure 6.8: Illustration of where a lecturer has two different roles, as a supervisor of junior
staff and as a supervisee of his/her immediate supervisor, by G. Mbwete
v) Ternary Relationship
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
6.7 VIDEO LECTURE
https://tinyurl.com/h22xxdl
https://tinyurl.com/htfkunh
https://tinyurl.com/jbbrq9g
6.8 ACTIVITY
Activity 6.0
ER Modelling Exercises
Motivation: To become conversant with basic ER Modelling techniques.
Resources: Unit 5 and Unit 6 learning materials and an ER Diagram Tool.
What to do:
Read the following scenario carefully.
A manufacturing company produces products. Product information stored
is product name, id, and quantity on hand. These products are made up of
many components. Each component can be supplied by one or more
suppliers. Component information kept is component id, name, description,
suppliers who supply them, and which products they are used in. Create
an ERD to show how you would track this information. Show entity names,
primary keys, attributes for each entity, relationships between the entities
and cardinality.
How to do it:
For the above scenario, draw an entity-relationship model diagram that
captures the given requirements. Once done, validate the model against
the requirements to make sure nothing was missed. Remember that the
four entity-relationship elements we want to pull out of these requirements
are: entities, attributes, identifiers, and relationships.
Duration: Expect to spend about 2 hours on this activity.
Feedback: The learner should submit this activity to the course instructor
for assessment and feedback.
6.9 SUMMARY
In this unit you learned about the Entity-Relationship Model, which helps in the logical design
stage of a relational database. The unit has given details on entities and their
types, attributes and their characteristics, and finally relationships, which are the basis of
connectivity between entities/tables in the database.
6.10 ANSWERS
Check Your Progress 1
Ans 1: Employee, Bank, Saving Account.
Ans 2: Strong entity - Employee
Weak entity - Dependents of an employee
Ans 3: An entity which provides more information about other relations, e.g. entities
based on multivalued attributes.
Check Your Progress 2
Ans 1: Name, if stored as first name, middle name and last name separately, then it is
composite attribute.
Salary is a simple attribute.
Ans 2: Experience is a multi-valued attribute, as a person may have multiple work
experiences.
Duration of work in present post can be computed from the date of joining; therefore,
it is a derived attribute.
Ans 3: Two candidate keys may be
– Id
– AadharNo
The id may be selected as the primary key.
One of the composite keys may be phone + dateofbirth, assuming no twins have
joined.
Programme code is the foreign key to a programme relation.
Check Your Progress 3
Ans 1: A staff member is assigned to one department only, and one department may
have many staff members.
Ans 2: At a specific time, only one employee can be the CEO of the company.
Ans 3: A student may join for many subjects and one subject has many students.
Case Study II
A car dealership sells both new and used cars, and it operates a service facility. Base
your design on the following business rules:
A salesperson may sell many cars, but each car is sold by only one salesperson.
A customer may buy many cars, but each car is sold to only one customer.
A salesperson writes a single invoice for each car he or she sells.
A customer gets an invoice for each car he or she buys.
A customer may come in just to have his or her car serviced, that is, one need
not buy a car to be classified as a customer.
When a customer takes one or more cars in for repair or service, one service
ticket is written for each car.
The car dealership maintains a service history for each of the cars serviced.
The service records are referenced by the car's serial number.
A car brought in for service can be worked on by many mechanics, and each
mechanic may work on many cars.
A car that is serviced may or may not need parts. (For example, adjusting a
carburetor or cleaning a fuel injector nozzle does not require the use of parts).
Case Study III
Secretive Sci-Fi Star Maps
This company creates and sells maps that indicate the address of select science
fiction film stars and also movie shoot locations. Several maps are needed to
cover the large Los Angeles region. The map borders don't overlap one another.
Each sci-fi star's address is shown on the appropriate map at the proper
coordinates. The film or films that each star has appeared in are recorded.
Customers of SSFSM often want to find all of the stars that appeared in a
certain movie (e.g. Star Trek II, Star Wars IV, The Adventures of Buckaroo
Banzai Across the 8th Dimension).
Special shoot locations for some sci-fi films are also displayed on the maps.
Case Study IV
For both cats and dogs, the name and gender of the animal is recorded, as
well as whether it has been neutered or not.
For cats, the mother of the cat is noted, if the mother is known to the shelter.
For any given mother, the shelter wants to be able to find all descendants
through potentially many generations.
Dogs are classified by breed. Dogs can be of one breed (purebred), two
breeds (crossbreed), or just be a mutt. The specific breeds, except in the
case of a mutt, are tracked.
For both adopters and donators, a reference to the dog or cat they adopted
or donated, respectively, is kept.
Adopters can adopt many dogs or cats, or may just be interested in adoption
Case Study V
Draw an EER diagram of the conceptual schema for another part of a University
database, described as follows:
Academic staff, general staff and students are the only persons at the university.
Each person is either an academic staff, or a general staff, or a student.
A person is uniquely identified by a PerId (person's ID), and has a Name, and
an Address. An Address is composed of HouseNo, Street , and City.
A characteristic property of a student is that she/he has at least one Major
and one NoOfPts (number of points) for each major.
An academic staff has a Position and an AcQual (academic qualification).
6.13 ATTRIBUTION
The content of this unit of the Learning Material - Introduction to Databases (including
its images, unless otherwise noted) is a derivative copy of materials from the book
Database Design by Adrienne Watt and Nelson Eng licensed under Creative
Commons Attribution 4.0 International License.
Download this book for free at http://open.bccampus.ca
The ER Case Studies are authored by David Rogers from Yukon College.
The following material was written by Grace Mbwete:
1. Introduction
2. Summary
UNIT 7 INTEGRITY RULES AND
CONSTRAINTS
Structure
7.1 Introduction
7.2 Objectives
7.3 Terminologies
7.4 Integrity Constraints
7.4.1 Entity Integrity
7.4.2 Referential Integrity
7.4.3 Enterprise Constraints
7.5 Business Rules
7.5.1 Cardinality and connectivity
7.6 Relationship Types
Optional relationships
Mandatory relationships
7.7 Video Lecture
7.8 Activity
7.9 Summary
7.10 Answers
7.11 References and Further Reading
7.12 Attribution
7.1 INTRODUCTION
One of the important functions of a DBMS is to enable the specification of integrity
constraints and to enforce them. Constraints are useful because they allow a designer
to specify the semantics of data in the database. Constraints are the rules that force
DBMSs to check that data satisfies the semantics. The concepts covered in this unit
will assist the student in understanding and defining the database constraints
during the practical sessions relating to this course.
7.2 OBJECTIVES
Upon completion of this unit you will be able to:
Describe the basic concepts of data integrity rules.
Specify integrity constraints and how to enforce them.
Identify business rules when gathering user requirements.
7.3 TERMINOLOGIES
Constraints : The rules that force DBMSs to check that data satisfies
the semantics.
Entity Integrity : Requires that every table have a primary key; neither the
primary key, nor any part of it, can contain null values.
Integrity constraints : Logical statements that state what data values are or are
not allowed and which format is suitable for an attribute.
Business rules : Obtained from users when gathering requirements and
are used to determine cardinality.
Cardinality : Expresses the minimum and maximum number of entity
occurrences associated with one occurrence of a related
entity.
Identifying relationship : Is a condition where the primary key contains the foreign
key; indicated in an ERD by a solid line.
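Entity integrity and referential integrity, as defined in the terms above, can both be enforced by the DBMS. The sketch below uses Python's sqlite3 module; the Customer and Order tables, their text key values and the `accepted` helper are assumptions made for illustration. Text keys are used deliberately, because SQLite treats an INTEGER PRIMARY KEY column specially and fills in a value when NULL is inserted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE Customer (
        CustID   TEXT NOT NULL PRIMARY KEY,
        CustName TEXT NOT NULL
    );
    CREATE TABLE "Order" (
        OrderID TEXT NOT NULL PRIMARY KEY,
        CustID  TEXT NOT NULL REFERENCES Customer(CustID)
    );
    INSERT INTO Customer VALUES ('C1', 'Asha');
    """
)


def accepted(sql, params):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Entity integrity: the primary key must be present and unique.
null_pk = accepted("INSERT INTO Customer VALUES (?, ?)", (None, "No key"))
duplicate_pk = accepted("INSERT INTO Customer VALUES (?, ?)", ("C1", "Copy"))

# Referential integrity: a foreign key value must exist in the parent table.
orphan_order = accepted('INSERT INTO "Order" VALUES (?, ?)', ("O1", "C9"))
valid_order = accepted('INSERT INTO "Order" VALUES (?, ?)', ("O2", "C1"))
```

Only the last insert succeeds: it has its own key and refers to a customer that exists.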
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q2: Identify foreign key and referential constraints.
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q3: Are there any enterprise constraints in these two tables?
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
In Figure 7.4 both inner (representing cardinality) and outer (representing connectivity)
markers are shown. The left side of this symbol is read as minimum 1 and maximum 1.
On the right side, it is read as: minimum 1 and maximum many.
7.6 RELATIONSHIP TYPES
The line that connects two tables, in an ERD, indicates the relationship type between
the tables: either identifying or non-identifying. An identifying relationship will have a
solid line (where the PK contains the FK). A non-identifying relationship is indicated
by a broken line and does not contain the FK in the PK. Please refer to the section in Unit
6 that discusses weak and strong relationships for more explanation.
For example, if you look at the Order table on the right-hand side of Figure 7.5, you'll
notice that a customer doesn't need to place an order to be a customer. In other
words, the many side is optional.
Figure 7.7: Example usage of a zero to many optional relationship symbol, by A. Watt.
The relationship symbol in Figure 7.7 above, can also be read as follows:
i. Left side: The order entity must contain a minimum of one related entity in the
Customer table and a maximum of one related entity.
ii. Right side: A customer can place a minimum of zero orders or a maximum of
many orders.
Figure 7.8 shows another type of optional relationship symbol with a zero and one,
meaning zero OR one. The one side is optional.
Figure 7.9: Example usage of a zero to one optional relationship symbol, by A. Watt.
Figure 7.11: Example of a one and only one mandatory relationship symbol, by A. Watt.
Figure 7.12 illustrates what a one to many relationship symbol looks like where the
many side is mandatory.
Refer to Figure 7.13 for an example of how the one to many symbol may be used.
The connectivity symbols show maximums. So if you think about it logically, if the
connectivity symbol on the left side shows 0 (zero), then there would be no connection
between the tables. The way to read a relationship symbol, such as the one in Figure
7.16, is as follows.
The CustID in the Order table must also be found in the Customer table a
minimum of 0 and a maximum of 1 time.
The 0 means that the CustID in the Order table may be null.
The left-most 1 (right before the 0 representing connectivity) says that if there
is a CustID in the Order table, it can only be in the Customer table once.
When you see the 0 symbol for cardinality, you can assume two things:
ii. The FK is not part of the PK since PKs must not contain null values.
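The reading of the relationship symbol given above (a CustID in the Order table matches the Customer table a minimum of 0 and a maximum of 1 time) corresponds to a nullable foreign key. A minimal sketch in Python's sqlite3 module follows; the key values and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE Customer (CustID TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE "Order" (
        OrderID TEXT NOT NULL PRIMARY KEY,          -- the FK is not part of this PK
        CustID  TEXT REFERENCES Customer(CustID)    -- nullable: minimum of 0
    );
    INSERT INTO Customer VALUES ('C1');
    INSERT INTO "Order" VALUES ('O1', 'C1');   -- matches one customer
    INSERT INTO "Order" VALUES ('O2', NULL);   -- no customer: allowed
    """
)

# A CustID that is present must exist in Customer, so this insert is rejected.
try:
    conn.execute("""INSERT INTO "Order" VALUES ('O3', 'C9')""")
    unknown_customer_accepted = True
except sqlite3.IntegrityError:
    unknown_customer_accepted = False

# Count the matching Customer rows for each order: always 0 or 1.
matches = dict(
    conn.execute(
        """
        SELECT o.OrderID, COUNT(c.CustID)
        FROM "Order" AS o
        LEFT JOIN Customer AS c ON c.CustID = o.CustID
        GROUP BY o.OrderID
        """
    ).fetchall()
)
```

Since CustID is the primary key of Customer, a non-null CustID in Order can match at most one customer row, which gives the maximum of 1.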
Figure 7.16: The relationship between a Customer table and an Order table, by A. Watt.
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q2: How can you represent that a student data must have exactly one parent name?
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
7.8 ACTIVITY
The primary keys are identified below. The following data types should be
defined in MySQL.
tblLevels
Level - Identity PK
ClassName - text 20 - nulls are not allowed
tblPool
Pool - Identity PK
PoolName - text 20 - nulls are not allowed
Location - text 30
tblStaff
StaffID - Identity PK
FirstName - text 20
MiddleInitial - text 3
LastName - text 30
Suffix - text 3
Salaried - Bit
PayAmount - money
tblClasses
LessonIndex - Identity PK
Level - Integer FK
SectionID - Integer
Semester - TinyInt
Days - text 20
Time - datetime (formatted for time)
Pool - Integer FK
Instructor - Integer FK
Limit - TinyInt
Enrolled - TinyInt
Price - money
tblEnrollment
LessonIndex - Integer FK
SID - Integer FK (LessonIndex and SID) Primary Key
Status - text 30
Charged - bit
AmountPaid - money
DateEnrolled - datetime
tblStudents
SID - Identity PK
FirstName - text 20
MiddleInitial - text 3
LastName - text 30
Suffix - text 3
Birthday - datetime
LocalStreet - text 30
LocalCity - text 20
LocalPostalCode - text 6
LocalPhone - text 10
Implement this schema in MySQL or MS-Access (you will need to pick
comparable data types). Submit a screenshot of your ERD in the database.
1. Explain the relationship rules for each relationship (e.g., tblEnrollment
and tblStudents: A student can enroll in many classes).
2. Identify cardinality for each relationship, assuming the following rules:
i. A pool may or may not ever have a class.
ii. The levels table must always be associated with at least one class.
iii. The staff table may not have ever taught a class.
iv. All students must be enrolled in at least one class.
v. The class must have students enrolled in it.
vi. The class must have a valid pool.
vii. The class may not have an instructor assigned.
viii. The class must always be associated with an existing level.
i) Which tables are weak and which tables are strong (covered in an
earlier chapter)?
ii) Which of the tables are non-identifying and which are identifying?
Duration: Expect to spend about 2 hours on this activity.
Feedback: This activity should be submitted to the course instructor for
assessment and feedback.
7.9 SUMMARY
In this unit you learned about Data Integrity and Constraints. You have learnt about
entity integrity and referential integrity. In addition, the unit content has also covered
different types of relationships. Data Integrity and Constraints concepts assist the
database designer especially in relational databases to ensure that the data is accurate
and consistent.
7.10 ANSWERS
Ans 2:
Ans 3: The Cust-ID in the order table must be found in exactly one record of the customer table.
– Customer can give 0 or more orders.
7.12 ATTRIBUTION
The content of this unit of the Learning Material - Introduction to Databases (including
its images, unless otherwise noted) is a derivative copy of materials from the book
Database Design by Adrienne Watt and Nelson Eng licensed under Creative
Commons Attribution 4.0 International License.
Download this book for free at http://open.bccampus.ca
The following material was written by Grace Mbwete:
1. Introduction
2. Summary
UNIT 8 RELATIONAL DESIGN AND
REDUNDANCY
Structure
8.1 Introduction
8.2 Objectives
8.3 Terminologies
8.4 Data Redundancy
8.5 Data Anomalies
8.5.1 Insertion Anomaly
8.5.2 Update Anomaly
8.5.3 Deletion Anomaly
8.6 How to Avoid Anomalies
8.7 Video Lecture
8.8 Activity
8.9 Summary
8.10 Answers
8.11 Review Questions by Authors
8.12 Further Reading
8.13 Attribution
8.1 INTRODUCTION
A good relational database design must capture all of the necessary attributes and
associations. The design should do this with a minimal amount of stored information
and no redundant data. In database design, redundancy is generally undesirable because
it causes problems maintaining consistency after updates. In this unit, you will be
introduced to the basic concepts of data redundancy in relation to database design and
data anomalies.
8.2 OBJECTIVES
Upon completion of this unit you should be able to:
Explain the basic concepts of data redundancy in database design.
Compare and contrast different types of data anomalies.
Apply various techniques in removing data anomalies during database design.
8.3 TERMINOLOGIES
Deletion anomaly : Occurs when you delete a record that may contain
attributes that shouldn't be deleted.
Functional Dependency (FD) : Describes how individual attributes are related.
Insertion anomaly : Occurs when you are inserting inconsistent information
into a table.
Join : Used when you need to obtain information based on
two related tables.
8.5 DATA ANOMALIES
Data anomalies are problems that can occur in poorly planned databases where all
data and/or information are stored in one file, as in a file-based system.
There are three types of data anomalies: insertion anomaly, deletion anomaly
and update anomaly.
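The three anomalies can be illustrated with a minimal Python sketch. The single flat list of rows, its column names and the sample values are hypothetical and chosen only for illustration; they are not taken from the tables of this unit.

```python
# Hypothetical flat "file": every account row repeats the branch address.
accounts = [
    {"account": "A-101", "branch": "Downtown", "branch_address": "12 Main St"},
    {"account": "A-102", "branch": "Downtown", "branch_address": "12 Main St"},
    {"account": "A-201", "branch": "Uptown", "branch_address": "9 Hill Rd"},
]


def branch_addresses(rows):
    """Map each branch to the set of addresses recorded for it."""
    found = {}
    for row in rows:
        found.setdefault(row["branch"], set()).add(row["branch_address"])
    return found


# Update anomaly: the address is changed in one row only, so the
# same branch now has two different addresses on file.
updated = [dict(row) for row in accounts]
updated[0]["branch_address"] = "40 Lake Ave"
update_conflicts = {
    branch: addresses
    for branch, addresses in branch_addresses(updated).items()
    if len(addresses) > 1
}

# Deletion anomaly: removing the only Uptown account also removes
# everything that was known about the Uptown branch.
after_delete = [row for row in accounts if row["account"] != "A-201"]
uptown_still_known = "Uptown" in branch_addresses(after_delete)

# Insertion anomaly: a new branch with no accounts can only be stored
# by inventing a placeholder (here None) for the account column.
after_insert = accounts + [
    {"account": None, "branch": "Harbour", "branch_address": "3 Pier Rd"}
]
rows_with_placeholder = [row for row in after_insert if row["account"] is None]
```

In this sketch all three problems come from the same cause: facts about a branch are stored inside rows that describe accounts.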
Table 8.3: Update anomaly - 'Update address for account A-113 at Roundhill'
Figure 8.5. Examples of bank account tables that contain one entity each, by A. Watt
Following this practice will ensure that when branch information is added or updated it
will only affect one record. So, when customer information is added or deleted, the
branch information will not be accidentally modified or incorrectly recorded.
Example: employee project table and anomalies
Table 8.6 shows an example of an employee project table. From this table, we can
assume that:
S75   32   P1   7
S75   40   P2   3
S79   32   P1   4
S79   27   P3   1
S80   40   P2   5
      17   P4
Next, let's look at some possible anomalies that might occur with this table during the
following steps:
vi. Problem: Step #5 creates two tuples with different values for project P1's
budget
vii. Solution: Create a separate table, each, for Projects and Employees, as shown
in Figure 8.2
Tables 8.7 and 8.8 show the separate Project and Employee tables with data. By
keeping data separate using individual Project and Employee tables:
i. No anomalies will be created if a budget is changed.
ii. No dummy values are needed for projects that have no employees assigned.
iii. If an employee's contribution is deleted, no important data is lost.
iv. No anomalies are created if an employee's contribution is added.
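The decomposition can be sketched in Python using the rows of Table 8.6. The column names emp_id, budget, project_id and hours are assumed for illustration, and None marks a value that is not recorded; this is a sketch under those assumptions rather than the unit's own procedure.

```python
# Rows of the combined employee project table.
# Assumed column order: emp_id, budget, project_id, hours.
combined = [
    ("S75", 32, "P1", 7),
    ("S75", 40, "P2", 3),
    ("S79", 32, "P1", 4),
    ("S79", 27, "P3", 1),
    ("S80", 40, "P2", 5),
    (None, 17, "P4", None),
]

# Project table: one entry per project, so each budget is stored once.
projects = {}
for emp_id, budget, project_id, hours in combined:
    projects[project_id] = budget

# Employee table: one row per (employee, project) contribution.
# The project with no employees simply has no row here.
employees = [
    (emp_id, project_id, hours)
    for emp_id, budget, project_id, hours in combined
    if emp_id is not None
]

# Changing a budget now touches exactly one stored value.
projects["P1"] = 35

# Deleting S79's contribution to P3 does not lose P3's budget.
employees = [row for row in employees if row[:2] != ("S79", "P3")]
```

After these two changes the Project data still lists all four projects, which corresponds to points i to iii above.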
Check Your Progress 2
Q1: What should be done to minimize anomalies in the table shown in Check Your
Progress 1?
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q2: What is the logic for decomposition of tables?
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
8.8 ACTIVITY
8.9 SUMMARY
In this unit you learned about data redundancy in relation to database design,
specifically the three major types of data anomalies. The unit concluded by
illustrating how to remove them by decomposing tables. In the next unit this process of
decomposition, called normalization, is covered in detail.
8.10 ANSWERS
Check Your Progress 1
Ans 1: The University has several other programmes, for example BCA and CMAD,
which cannot be entered in the table until data for a student of those
programmes is stored.
Ans 2: The deletion anomaly in this table is that if you delete the data of student S02,
the information about Programme-ID P2 will also be deleted.
Ans 3: If you update the first record to
S01 XYZ P1 BCA
then it is no longer clear whether P1 is the code for BCA or MCA. This may
be an update anomaly.
Check Your Progress 2
Ans 2: The logic is simply that each table should describe one entity.
8.13 ATTRIBUTION
The content of this unit of the Learning Material - Introduction to Databases (including
its images, unless otherwise noted) is a derivative copy of materials from the book
Database Design by Adrienne Watt and Nelson Eng licensed under Creative
Commons Attribution 4.0 International License.
Download this book for free at http://open.bccampus.ca
The following material was written by Grace Mbwete:
1. Introduction
2. Unit Summary
3. Review questions
UNIT 9 FUNCTIONAL DEPENDENCIES
Structure
9.1 Introduction
9.2 Objectives
9.3 Terminologies
9.4 Functional Dependency
9.5 Rules of Functional Dependencies
9.5.1 Inference Rules
9.6 Dependency Diagram
9.7 Video Lecture
9.8 Activity
9.9 Summary
9.10 Answers
9.11 Review Questions by Authors
9.12 References and Further Reading
9.13 Attribution
9.1 INTRODUCTION
In this unit, you will be introduced to the concept of Functional Dependency (FD),
which is an important part of relational database design and a prerequisite for the
next unit (normalization). An FD in a relational database defines a relationship between
two sets of attributes. You will also learn about the different categories of functional
dependencies.
9.2 OBJECTIVES
Upon completion of this unit you should be able to:
Describe the concept of functional dependency.
Compare and contrast different types of functional dependencies.
Identify various types of inference rules in relation to FD.
Use the dependency diagram tool to define FD.
9.3 TERMINOLOGIES
Armstrong's axioms : A set of inference rules used to infer all the functional
dependencies on a relational database.
Decomposition : A rule that suggests if you have a table that appears to
contain two entities that are determined by the same
PK, consider breaking them up into two tables.
Dependent : The right side of the functional dependency diagram.
Union : A rule that suggests that if two tables are separate, and
the PK is the same, consider putting them together.
ISBN → Title
A B C D E
a1 b1 c1 d1 e1
a2 b1 c2 d2 e1
a3 b2 c1 d1 e1
a4 b2 c2 d2 e1
a5 b3 c3 d1 e1
As you look at this table, ask yourself: What kind of dependencies can we observe
among the attributes in Table R? Since the values of A are unique (a1, a2, a3, etc.), it
follows from the FD definition that:
A → B, A → C, A → D, A → E
Since the values of E are always the same (all e1), which means that for any value of A, B,
C or D, E is always e1, it follows that:
A → E, B → E, C → E, D → E
Other observations:
i. Rows with the same C value always have the same D value. Therefore, C → D
Looking at actual data can help clarify which attributes are dependent and which are
determinants.
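Checking sample data against a candidate FD can be sketched in a few lines of Python. The helper name holds is hypothetical; it reports whether the rows of Table R are consistent with a dependency, which can rule an FD out but cannot prove that it holds for all future data.

```python
# Table R from the text, one tuple per row (columns A, B, C, D, E).
COLUMNS = ("A", "B", "C", "D", "E")
R = [
    ("a1", "b1", "c1", "d1", "e1"),
    ("a2", "b1", "c2", "d2", "e1"),
    ("a3", "b2", "c1", "d1", "e1"),
    ("a4", "b2", "c2", "d2", "e1"),
    ("a5", "b3", "c3", "d1", "e1"),
]


def holds(rows, lhs, rhs):
    """Return True if the rows are consistent with the FD lhs -> rhs.

    lhs and rhs are strings of column letters, for example "BC".
    """
    lhs_idx = [COLUMNS.index(name) for name in lhs]
    rhs_idx = [COLUMNS.index(name) for name in rhs]
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs_idx)
        value = tuple(row[i] for i in rhs_idx)
        # setdefault stores the first right-side value seen for this key;
        # a different value later means the FD is violated.
        if seen.setdefault(key, value) != value:
            return False
    return True
```

For example, holds(R, "C", "D") is True, while holds(R, "D", "C") is False because d1 appears with both c1 and c3.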
Let R(U) be a relation scheme over the set of attributes U. We will use the letters X, Y
and Z to represent any subset of U and, for short, write XY for the union of two sets of
attributes instead of the usual X ∪ Y.
i) Axiom of reflexivity
This axiom says that if Y is a subset of X, then X determines Y.
For example, PartNo → NT123 where X (PartNo) is composed of more than one
piece of information; i.e., Y (NT) and partID (123).
ii) Axiom of augmentation
The axiom of augmentation, also known as a partial dependency, says if X determines
Y, then XZ determines YZ for any Z (see Figure 11.2).
The axiom of augmentation says that every non-key attribute must be fully dependent
on the PK. In the example shown below, StudentName, Address, City, Prov, and
PC (postal code) are only dependent on the StudentNo, not on the StudentNo and
Course.
To fix this problem, we need to break the original table down into two as follows:
Table 1: StudentNo, Course, Grade, DateCompleted
Table 2: StudentNo, StudentName, Address, City, Prov, PC
iii) Axiom of transitivity
The axiom of transitivity says that if X determines Y, and Y determines Z, then X must
also determine Z.
The table below has information not directly related to the student; for instance,
ProgramID and ProgramName should have a table of their own. ProgramName is not
dependent on StudentNo; it's dependent on ProgramID.
StudentNo → StudentName, Address, City, Prov, PC, ProgramID, ProgramName
This situation is not desirable because a non-key attribute (ProgramName) depends
on another non-key attribute (ProgramID).
To fix this problem, we need to break this table into two: one to hold information about
the student and the other to hold information about the program.
However we still need to leave an FK in the student table so that we can identify which
program the student is enrolled in.
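The split into a student table and a program table linked by an FK can be sketched with Python's built-in sqlite3 module. The table names Student and Program, the sample rows and the omission of the address columns are assumptions made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE Program (
        ProgramID   TEXT PRIMARY KEY,
        ProgramName TEXT NOT NULL
    );
    CREATE TABLE Student (
        StudentNo   TEXT PRIMARY KEY,
        StudentName TEXT NOT NULL,
        ProgramID   TEXT REFERENCES Program(ProgramID)
    );
    INSERT INTO Program VALUES ('P1', 'Hypothetical Programme');
    INSERT INTO Student VALUES ('S01', 'Hypothetical Student', 'P1');
    """
)

# The FK lets a join recover the program name for each student.
joined = conn.execute(
    "SELECT s.StudentNo, p.ProgramName "
    "FROM Student AS s JOIN Program AS p ON s.ProgramID = p.ProgramID"
).fetchall()

# A student row that points at a program which does not exist is rejected.
try:
    conn.execute("INSERT INTO Student VALUES ('S02', 'Another Student', 'P9')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

ProgramName is now stored once per program, and the ProgramID column in Student is the only link between the two tables.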
Check Your Progress 1
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q2: Enrolment number of a student includes the year of admission. Does any FD exist
between the two?
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q3: Consider the following table and identify functional dependencies.
CUSTOMER(Cust-id, Cust-name, Account-No, Balance of Account)
Please note that an account holder can have multiple accounts.
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q4: Consider the following table and identify functional dependencies.
CUST(Cust-id, customer-name, customer-type, customer-credit-rating)
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
iv) Union
This rule suggests that if two tables are separate, and the PK is the same, you may
want to consider putting them together.
It states that if X determines Y and X determines Z then X must also determine Y and
Z (see Figure 9.4).
EmployeeID → EmpName
EmployeeID → SpouseName
You may want to join these two tables into one as follows:
EmployeeID → EmpName, SpouseName
ProjNo → ProjName
Transitive Dependency:
DeptNo → DeptName
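The inference rules can be applied mechanically by computing the closure of a set of attributes, that is, everything those attributes determine under a given list of FDs. The following Python sketch uses a hypothetical function name, closure, and represents each FD as a pair of sets. The FD EmpNo → DeptNo in the second example is an assumption added only to demonstrate transitivity.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a set of names.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already determined, the right
            # side is determined too, so add it and scan again.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Union: two FDs with the same determinant combine into one.
employee_fds = [
    ({"EmployeeID"}, {"EmpName"}),
    ({"EmployeeID"}, {"SpouseName"}),
]
employee_closure = closure({"EmployeeID"}, employee_fds)

# Transitivity: EmpNo -> DeptNo (assumed) and DeptNo -> DeptName.
dept_fds = [
    ({"EmpNo"}, {"DeptNo"}),
    ({"DeptNo"}, {"DeptName"}),
]
dept_closure = closure({"EmpNo"}, dept_fds)
```

Here employee_closure contains EmpName and SpouseName together, which is the union rule, and dept_closure contains DeptName even though no single FD states EmpNo → DeptName.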
Check Your Progress 2
Q1: Given the two FDs of two different tables:
Table 1: Student-ID → Name
Table 2: Student-ID → Programme-code
Can these two tables be combined?
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q2: Given the FD Student-ID → Name, Phone in a table, would you like to decompose
the table into two tables?
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
Q3: List the FDs for the following dependency diagram
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
9.8 ACTIVITY
LecturerId, has a name (Name), an office (Office), and a phone
extension number (ExtensionNo).
The number of students (NoOfStud) enrolled in a course, which is
identified by the value of the attribute CourseId, depends on the term
when the course is offered (Term) and the year (Year).
Each office (Office) has only one phone extension number
(ExtensionNo) and each phone extension number belongs to at most
one office.
Duration: Expect to spend about 1 hour on this activity.
Feedback: This activity should be submitted to the course instructor
for assessment and feedback.
9.9 SUMMARY
In this unit you learned about Functional Dependencies (FD). The unit also covered
different inference rules and the dependency diagram. The concepts covered in this unit
are a prerequisite for data normalization, which is discussed in detail in the
next unit.
9.10 ANSWERS
Check Your Progress 1
AB → C, AC → B
Cust-id → Cust-name
Customer-type → customer-credit-rating
Ans 2: As both Name and Phone are important fields, it is not advisable to decompose the table.
Ans 3: Student-Id → St-Name
Course-Id → Course-Name
Part I
Consider a relation R with five attributes ABCDE. You are given the following
dependencies: A → B, BC → E, and ED → A.
i. List all keys for R.
ii. Is R in 3NF?
iii. Is R in BCNF?
Part II
i. Define the term functional dependency and give an example to illustrate.
ii. Why are some functional dependencies called trivial?
Part III
Bank G wants to have a database for a bank that contains accounts (C), branches (H)
and customers (D). Detailed below are the requirements given the following constraints:
i. An account cannot be shared by multiple customers.
ii. Two different branches do not have the same account.
iii. Each customer can have at most one account in a branch (but different accounts
in different branches).
Write the functional dependencies implied by the above mentioned requirements.
Part IV
Identify the functional dependencies in the following relations and represent the
dependencies using the proper notation.
i. Course (CourseCode, CourseName, CourseUnits)
9.13 ATTRIBUTION
The content of this unit of the Learning Material - Introduction to Databases (including
its images, unless otherwise noted) is a derivative copy of materials from the book
Database Design by Adrienne Watt and Nelson Eng licensed under Creative
Commons Attribution 4.0 International License.
Download this book for free at http://open.bccampus.ca
The following material was written by Grace Mbwete:
1. Introduction
2. Unit Summary
3. Review questions
UNIT 10 INTRODUCTION TO DATA
NORMALIZATION
Structure
10.1 Introduction
10.2 Objectives
10.3 Terminologies
10.4 Overview
10.5 Normal Forms
10.5.1 First Normal Form (1NF)
10.5.2 Second normal form (2NF)
10.5.3 Third normal form (3NF)
10.5.4 Boyce-Codd Normal Form (BCNF)
10.6 Normalization and Database Design
10.7 Video Lecture
10.8 Activity
10.9 Summary
10.10 Answers
10.11 Case Study
10.12 Further Reading
10.13 Attribution
10.1 INTRODUCTION
In the previous units, you have learnt theories on data modeling, relational data modeling
and the entity relationship model, which create a logical design for relational databases. In
this unit, you will learn that any data in a database table must be stored in a normalized
way. This unit will explain the properties of a normalized table, the process of normalization
and its importance to the structure of a database.
10.2 OBJECTIVES
Upon completion of this unit you will be able to:
Describe the fundamental concepts of normalization.
Compare and contrast different types of normal forms.
Normalize a database relation to a third normal form.
10.3 TERMINOLOGIES
Normalization : The process of determining how much redundancy exists in a
database table.
1NF : First Normal Form - only single values are permitted at the
intersection of each row and column so there are no repeating
groups.
2NF : Second Normal Form - the relation must be in 1NF and either
the PK comprises a single attribute or the non-key attributes
are not functionally dependent on any part of the composite
PK.
3NF : Third Normal Form - the relation must be in 2NF and all
transitive dependencies must be removed; a non-key attribute
may not be functionally dependent on another non-key attribute.
10.4 OVERVIEW
Normalization is the branch of relational theory that provides design insights. It is the
process of determining how much redundancy exists in a table. The goals of
normalization are to:
Be able to characterize the level of redundancy in a relational schema.
Provide mechanisms for transforming schemas in order to remove redundancy.
Normalization theory draws heavily on the theory of functional dependencies.
Normalization theory defines six normal forms (NF). Each normal form involves a set
of dependency properties that a schema must satisfy and each normal form gives
guarantees about the presence and/or absence of update anomalies. This means that
higher normal forms have less redundancy, and as a result, fewer update problems.
Normalization should be part of the database design process. However, it is difficult to
separate the normalization process from the ER modeling process so the two techniques
should be used concurrently as follows:
Use an entity relationship diagram (ERD) to provide the big picture, or macro
view, of an organization's data requirements and operations. This is created
through an iterative process that involves identifying relevant entities, their
attributes and their relationships.
Normalization procedure focuses on characteristics of specific entities and
represents the micro view of entities within the ERD.
For the second normal form, the relation must first be in 1NF. The relation is
automatically in 2NF if the PK comprises a single attribute.
If the relation has a composite PK, then each non-key attribute must be fully dependent
on the entire PK and not on a subset of the PK (i.e., there must be no partial dependency
or augmentation).
When examining the Student Course table, we see that not all the attributes
are fully dependent on the PK; specifically, all course information. The only
attribute that is fully dependent is grade.
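The 2NF test described above can be sketched as a search for partial dependencies. In this Python sketch the function name partial_dependencies is hypothetical, and the column names and FDs of the Student Course table are assumed for illustration because the table itself is not reproduced here.

```python
def partial_dependencies(primary_key, fds):
    """Return the FDs that break 2NF for a composite primary key.

    An FD is reported when its left side is a proper subset of the PK
    and its right side contains at least one non-key attribute.
    """
    pk = set(primary_key)
    found = []
    for lhs, rhs in fds:
        non_key = set(rhs) - pk
        if set(lhs) < pk and non_key:
            found.append((set(lhs), non_key))
    return found


# Assumed FDs for a Student Course table with PK (StudentNo, CourseNo).
pk = {"StudentNo", "CourseNo"}
fds = [
    ({"StudentNo", "CourseNo"}, {"Grade"}),
    ({"CourseNo"}, {"CourseName"}),
]
violations = partial_dependencies(pk, fds)
```

With these assumed FDs only CourseNo → CourseName is reported, which matches the observation that the course information depends on part of the PK while the grade depends on all of it.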
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
It is assumed that a student can have multiple phone numbers. Is the relation in
1NF?
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
The semantic rules (business rules applied to the database) for this table are:
i. Each Student may major in several subjects.
ii. For each Major, a given Student has only one Advisor.
iii. Each Major has several Advisors.
iv. Each Advisor advises only one Major.
v. Each Advisor advises several Students in one Major.
The functional dependencies for this table are listed below. The first one is a candidate
key; the second is not.
i. Student_id, Major → Advisor
ii. Advisor → Major
Anomalies for this table include:
i. Delete - student deletes advisor info
ii. Insert - a new advisor needs a student
iii. Update - inconsistencies
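Whether a table meets BCNF can be checked by testing that the left side of every non-trivial FD is a superkey. The Python sketch below applies that test to the two FDs listed above; the function names closure and bcnf_violations are hypothetical.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under `fds`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(all_attributes, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    everything = set(all_attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        # rhs <= lhs would be a trivial FD, which never violates BCNF.
        if not rhs <= lhs and closure(lhs, fds) != everything
    ]


attributes = {"Student_id", "Major", "Advisor"}
fds = [
    ({"Student_id", "Major"}, {"Advisor"}),
    ({"Advisor"}, {"Major"}),
]
violations = bcnf_violations(attributes, fds)
```

Only Advisor → Major is reported, because Advisor on its own does not determine Student_id and so is not a superkey.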
Table 10.6: StaffRoom
StaffNo  InterviewDate  RoomNo
SG5      13-May-02      G101
SG37     13-May-02      G102
SG5      1-July-02      G102
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
...............................................................................................................................
https://tinyurl.com/gu49pgj
10.8 ACTIVITY
10.9 SUMMARY
In this unit you learned about the data normalization process, which is used in parallel
with ER modelling in the logical design of a database implementation. This unit also covered
insert, update and delete anomalies, which can be removed by applying normalization to the
database concerned. The last part covered the first three stages of data normalization,
namely 1NF, 2NF and 3NF.
10.10 ANSWERS
Check Your Progress 1
Ans 1: It reduces redundancy and is thus useful in minimizing the three anomalies, viz.
insert, update and delete anomalies.
Ans 2: No, the relation may be decomposed as
Student (Id, name)
Student phone (Id, phone)
The scheme as above is in 1NF.
Ans 3: No. The FDs are Cust-id → Name and Cust-id, Account-number → balance.
Therefore, the 2NF relation would be:
(Cust-id, name)
(Cust-id, Account-number, balance)
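The 1NF decomposition given in Ans 2 above can be sketched in Python. The sample records and field names are hypothetical; the point is only that a multi-valued phone field becomes one row per phone number.

```python
# Hypothetical un-normalized records: the phones field holds a list,
# so a single cell contains more than one value.
students = [
    {"id": "S01", "name": "Student One", "phones": ["111-1111", "222-2222"]},
    {"id": "S02", "name": "Student Two", "phones": ["333-3333"]},
]

# Student (Id, name): one row per student.
student_table = [(s["id"], s["name"]) for s in students]

# Student phone (Id, phone): one row per phone number, so every
# cell holds a single value.
student_phone_table = [
    (s["id"], phone) for s in students for phone in s["phones"]
]
```

Joining the two resulting tables on the Id column gives back every phone number of every student, so no information is lost by the split.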
Check Your Progress 2
Ans 1: The FDs are
student-id, course-code → taught-by, teacher-qualification
taught-by → teacher-qualification
Therefore, the relation is not in 3NF.
The 3NF relation would be
(student-id, course-code, taught-by)
(taught-by, teacher-qualification)
Ans 2: (a) The alternate keys of the relation are (staff-id, room-no) and (staff-id,
work-type); any one of these may be selected as the primary key.
(b) The BCNF of this table would be:
(staff-id, room-no) and (work-type, room-no)
Case Study II
The ABC Manufacturing company has a completely automated application system.
The system, however, resides on index files and does not allow for decision support at
all. In order to move to ad hoc queries, and "what if" queries, the company has
decided to convert the existing system to a database. Initially, the only criterion for the
application is to replace the existing system with a database system. No ad hoc screens
or reports have been anticipated. You will see the reports and screens that exist currently.
1     2        3        4  5      6                  7    8
0567  J Evans  11/4/89  M  COMP7  Computing A Level  186  H Smith
8453  R Begum  18/3/88  F  BIOL9  Biology A Level    78   D Jones
0567  J Evans  11/4/89  M  MATH5  Maths A Level      186  H Smith
Key for field names:
1 = StudentNo, 2 = StudentName, 3 = DateOfBirth, 4 = Gender, 5 = Course No, 6 =
Course Name, 7 = LecturerNo, 8 = Lecturer Name
10.12 ATTRIBUTION
The content of this unit of the Learning Material - Introduction to Databases (including
its images, unless otherwise noted) is a derivative copy of materials from the
book Database Design by Adrienne Watt and Nelson Eng licensed under Creative
Commons Attribution 4.0 International License.
Download this book for free at http://open.bccampus.ca
The following material was written by Grace Mbwete:
1. Introduction
2. Unit Summary