Unit 9
9.0 Objectives
9.1 Introduction
9.2 Data Modeling
9.2.1 Entity-Relationship Model
9.2.2 Types of Relationship
9.3 Data Models
9.3.1 Hierarchical Model
9.3.2 Network Model
9.3.3 Relational Model
9.3.4 Object-oriented Model
9.4 Relational Database Management Systems (RDBMS)
9.4.1 Characteristics of a Relation
9.4.2 Keys and their Functions
9.4.3 Criteria for a DBMS to be Relational
9.5 Normalisation of Relations
9.5.1 Dependencies
9.5.2 Normal Forms
9.6 Designing Databases
9.7 Summary
9.8 Answers to Self-Check Exercises
9.9 Keywords
9.10 References and Further Reading
9.0 OBJECTIVES
Data modeling plays an important role in effective database design. It is the
process of structuring data requirements.
After reading this Unit, you will be able to:
• understand the role of data modeling in conceptual database design;
• distinguish between various data models and know the characteristics of each;
• become conversant with relational database technology; and
• comprehend the steps involved in designing databases.
9.1 INTRODUCTION
This Unit presents data modeling concepts and the role of the Entity-Relationship
(ER) model in conceptual database design. Classical data models, viz., the hierarchical
model, network model and relational model, are discussed. Concepts of
[Figure: E-R diagram with the labels Family, Genus, Species, Plant, Common name,
Genus name, Endangered, Name and Botanical id, and the relationships "Belongs to"
(1:n) and "Serves".]
Let us illustrate these relationship types one by one. In a 1:1 relationship, one instance
of an entity of a given type is associated with only one member of another type. Let
there be a set of country names and a set of city names. Further let us assume that
each city in the set is a capital. The relationship between these two sets which can be
called “capital” is 1:1 because for each country name there is only one city name
and, conversely, each city name corresponds to only one country name.
In a 1:n relationship, one instance of a given type of entity is related to many instances
of another type. Let there be a set of departments and a set of faculty members
employed in the departments. The department-faculty relationship which can be
called ‘employed’ is of the 1:n type because each department employs several faculty
members and each faculty member works only in one department.
The many-to-one (n:1) relationship has the same semantics as 1:n. In the above
example, if we change the relationship to faculty-department (in place of department-
faculty) we will have n:1 relationship type.
Lastly, the n:m relationship is one in which many instances of an entity type are
associated with many instances of another entity type. Consider a set of faculty
members teaching a set of students. The faculty-student relationship (“teaching”) is
an example of the n:m type because a faculty member can teach ‘m’ students and a
student can be taught by ‘n’ faculty members.
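The three relationship types can be sketched with ordinary Python containers. This is only a minimal illustration: the country, department, faculty and student names are assumptions made up for the example, not data from this Unit.

```python
# Illustrative data only; every name below is an assumption.

# 1:1 : each country has one capital and each capital belongs to one country.
capital = {"India": "New Delhi", "France": "Paris"}

# 1:n : a department employs several faculty members,
# and each faculty member works in only one department.
employed = {
    "Physics": ["Rao", "Sen"],
    "Botany": ["Iyer"],
}

# n:m : stored as (faculty, student) pairs, because a faculty member teaches
# many students and a student is taught by many faculty members.
teaching = {("Rao", "Asha"), ("Rao", "Vikram"), ("Sen", "Asha")}


def is_one_to_one(mapping):
    """True if no two keys share the same value."""
    values = list(mapping.values())
    return len(values) == len(set(values))


def department_of(member):
    """Read the 1:n relationship in the n:1 direction."""
    for dept, members in employed.items():
        if member in members:
            return dept
    return None


def students_of(faculty):
    """All students taught by one faculty member."""
    return sorted(s for f, s in teaching if f == faculty)


def teachers_of(student):
    """All faculty members who teach one student."""
    return sorted(f for f, s in teaching if s == student)
```

Reading `employed` from department to faculty gives the 1:n view, while `department_of` reads the same data as n:1, which is why the two have the same semantics.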
Self Check Exercise
The hierarchical model represents a 1:n relationship between record types (see Fig.
3.2). One record type (the 1 in 1:n relationship) is designated as the “parent” record
type. In this parent-child relationship, a child record type can have only one parent
record type but a parent record may have several child record types.
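[Figure: hierarchical model with two record types. The parent record type holds
customer records (name, address, profession): Das, Rohini, Delhi, Teacher; Sinha,
Vasant Vihar, Delhi, Scientist; Roy, Saket, Delhi, Advocate; Hajela, Sarita Vihar,
Delhi, Retired; Mathur, Naraina, Delhi, Teacher. Record type #2 holds account
records (A/c No., Balance): 190, 3000; 229, 3335; 722, 917; 435, 3900; 229, 3335;
817, 1725.]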
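The parent-child rule of the hierarchical model can be sketched as a small tree structure. The `Record` class and its `add_child` method are hypothetical names introduced for this illustration, and the pairing of customers with accounts is an assumption.

```python
class Record:
    """One record in a hierarchy (hypothetical helper class)."""

    def __init__(self, **fields):
        self.fields = fields
        self.parent = None    # a child record has only one parent
        self.children = []    # a parent record may have several children

    def add_child(self, child):
        """Attach a child, refusing one that already has a parent."""
        if child.parent is not None:
            raise ValueError("a child record can have only one parent")
        child.parent = self
        self.children.append(child)
        return child


# Which customer owns which account is assumed for illustration.
sinha = Record(name="Sinha", address="Vasant Vihar, Delhi", profession="Scientist")
roy = Record(name="Roy", address="Saket, Delhi", profession="Advocate")
acct_722 = sinha.add_child(Record(number=722, balance=917))
acct_229 = sinha.add_child(Record(number=229, balance=3335))
```

Calling `roy.add_child(acct_229)` raises `ValueError`, because the account record already has a parent.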
The network model is implemented with various pointer schemes. Since a network
is an extension of hierarchy, the semantic properties of the network model are similar
to those of the hierarchical model. The network model has been illustrated in Fig. 3.5.
The example given makes use of the same record types as in the hierarchical model,
with the difference that the first record type has pointers to the A/c No. of the
second record type. A pointer may itself point to a number of pointers (called arrays).
Cullinet’s IDMS is an example of a commercial database management system based
on the network model.
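[Fig. 3.5: network model. File 1 holds customer records (Sinha, Vasant Vihar, Delhi,
Scientist; Roy, Saket, Delhi, Advocate; Mathur, Naraina, Delhi, Teacher), each with
an A/C No. pointer into File 2. File 2 holds A/C No. and Balance (722, 917; 817, 1725;
435, 3900). One of the pointers is shown as an array.]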
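A minimal sketch of the pointer idea follows, using a dictionary for File 2 and lists of account numbers as the pointer arrays in File 1. Which customer points to which account is an assumption, as is the shared account; only the names and numbers come from the figure.

```python
# File 2: account records, addressed by A/C No.
file2 = {722: 917, 817: 1725, 435: 3900}

# File 1: customer records. Each holds an array of pointers (A/C Nos.)
# into File 2. The pairings are assumed for illustration.
file1 = [
    {"name": "Sinha", "profession": "Scientist", "accounts": [722, 817]},
    {"name": "Roy", "profession": "Advocate", "accounts": [817]},
    {"name": "Mathur", "profession": "Teacher", "accounts": [435]},
]


def balances(customer):
    """Follow each pointer from a File 1 record into File 2."""
    return [file2[number] for number in customer["accounts"]]


def owners(number):
    """All customers whose pointer array contains this A/C No."""
    return [c["name"] for c in file1 if number in c["accounts"]]
```

Unlike a strict hierarchy, the same File 2 record can be reached from more than one File 1 record, so `owners(817)` lists two customers without the account being stored twice.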
[Fig. 3.6: relational model, showing two tables. Table name : Customer;
Table name : Account.]
[Figure: object-oriented model, with the labels User, DBMS, Operations, Data,
Object and Message.]
— Objects : An object is an entity, real or abstract, that has state, behaviour and
identity. The state of an object is represented by its attributes and their values.
The behaviour of an object is represented by its operations or methods.
— Messages : Objects communicate with each other through messages. A message
determines what operation is to be performed by an object. A message specifies
an operation name and a list of arguments.
— Classes : A class is a set of objects that share common attributes and behaviour.
Each object is an instance of some class.
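These three ideas can be sketched in a few lines of Python. The `Account` class, the `send` helper and the values used are assumptions chosen for illustration; they are not part of any particular object-oriented DBMS.

```python
class Account:
    """A class: objects that share the same attributes and behaviour."""

    def __init__(self, number, balance):
        # State: the attributes and their current values.
        self.number = number
        self.balance = balance

    def deposit(self, amount):
        # Behaviour: an operation (method) the object can perform.
        self.balance += amount
        return self.balance


def send(obj, operation, *arguments):
    """A message: an operation name plus a list of arguments."""
    return getattr(obj, operation)(*arguments)


a = Account(722, 917)
b = Account(722, 917)   # same state as a, but a separate identity
new_balance = send(a, "deposit", 100)
```

After the message is sent, `a` and `b` no longer have the same state, and they were never the same object, which is what identity means here.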
The object-oriented approach emphasises incremental software development. The
underlying principles of this approach are:
• Grow software, don’t build it.
• Build components rather than a whole system.
• Assemble a basic system and then enhance it.
Smalltalk, C++, Java and Object Pascal/Delphi are among the object-oriented
programming languages used in this approach.
Self Check Exercise
2) Why have RDBMS found wider application than other data models? Give a
few examples of RDBMS.
...................................................................................................................
...................................................................................................................
...................................................................................................................
...................................................................................................................
The structure of a relation, i.e., the set of attributes without any values assigned to
them, is called the relation scheme. A tuple of a relation with values assigned to its
attributes is called a relation instance.
A table is generally represented by its relation scheme which is denoted by the table
name followed by the attribute names given in brackets. The relation schemes of
tables shown in Fig. 3.6 are :
CUSTOMER (Name, Address, Profession, A/c No.)
ACCOUNT (A/c No., Balance).
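The difference between a scheme and an instance can be sketched as follows. The `make_tuple` helper is a hypothetical name, and the values bound to the attributes (including which account belongs to which customer) are assumed for illustration.

```python
# Relation schemes: attribute names only, with no values assigned.
CUSTOMER = ("Name", "Address", "Profession", "A/c No.")
ACCOUNT = ("A/c No.", "Balance")


def make_tuple(scheme, values):
    """Assign one value to each attribute of a scheme."""
    if len(values) != len(scheme):
        raise ValueError("one value is needed per attribute")
    return dict(zip(scheme, values))


# Illustrative values; the customer-account pairing is an assumption.
row = make_tuple(CUSTOMER, ("Sinha", "Vasant Vihar, Delhi", "Scientist", 722))
acct = make_tuple(ACCOUNT, (722, 917))
```

The scheme stays fixed while any number of tuples can be built from it, and a tuple with too few or too many values is rejected.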
9.5.1 Dependencies
A dependency refers to the relationship amongst attributes. These attributes may
belong to the same relation or different relations. Dependencies can be of various
types viz., functional dependencies, transitive dependencies, multivalued
dependencies, join dependencies, etc. We shall briefly examine some of these
dependencies.
Functional dependency (F.D) – Functional dependency represents semantic
Multivalued dependency – Multivalued dependency refers to m:n (many-to-many)
relationships. We say a multivalued dependency exists between two data items
when one value of the first data item gives a collection of values of the second data
item, i.e., it multidetermines the second data item.
Join dependency – If we decompose a relation into smaller relations and the join
of the smaller relations does not give us the same tuples as in the parent relation,
we say the relation has a join dependency.
Let us consider a relation SAMPLE with the functional dependencies as shown:
SAMPLE
A    B    C
a1   b1   c1
a2   b2   c3
a3   b1   c2
a4   b2   c4
F.D.: A → B
      C → B
Let us now decompose this relation into two relations SPLIT 1 (A, B) and SPLIT 2
(B, C). The functional dependencies remain the same as in the relation SAMPLE.
SPLIT 1          SPLIT 2
A    B           B    C
a1   b1          b1   c1
a2   b2          b2   c3
a3   b1          b1   c2
a4   b2          b2   c4
F.D.: A → B      F.D.: C → B
Now if we join the relations SPLIT 1 and SPLIT 2 over the common attribute B,
we get the relation SAMPLE 1.
SAMPLE 1
A B C
a1 b1 c1
a1 b1 c2*
a2 b2 c3
a2 b2 c4*
a3 b1 c2
a3 b1 c1*
a4 b2 c3*
a4 b2 c4
We can see from the relation SAMPLE 1 that it has four additional (spurious) tuples
(shown by asterisk) which were not present in the original relation SAMPLE. This
type of join is called a lossy join because the information content of the original table
is lost. This has occurred since the join attribute was not the determinant in the
original relation and hence it should not have been decomposed the way it was
done. We say a join dependency exists in this case. If we decompose a relation and
join the constituent relations over the determinant attribute, we get a lossless join.
For example, consider the relation SAMPLENEW and split it into SAMPLENEW1
and SAMPLENEW2 as given below:
SAMPLENEW
X    Y    Z
x1   y1   z1
x2   y2   z2
x3   y2   z1
x4   y1   z2
F.D.: X → Y
      X → Z
SAMPLENEW1       SAMPLENEW2
X    Y           X    Z
x1   y1          x1   z1
x2   y2          x2   z2
x3   y2          x3   z1
x4   y1          x4   z2
F.D.: X → Y      F.D.: X → Z
The join of SAMPLENEW1 and SAMPLENEW2 gives the original table
SAMPLENEW without any spurious rows because the attribute X over which the
table is decomposed is the determinant attribute.
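Both decompositions can be checked mechanically. The sketch below uses the tuples of SAMPLE and SAMPLENEW given above; the helper names `relation`, `project`, `natural_join` and `as_set` are assumptions introduced for this illustration.

```python
def relation(attrs, *tuples):
    """Build a relation as a list of attribute-to-value dicts."""
    return [dict(zip(attrs, t)) for t in tuples]


def project(rows, attrs):
    """Projection onto attrs, with duplicate tuples removed."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def natural_join(left, right):
    """Join two relations on every attribute name they share."""
    out = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in set(l) & set(r)):
                out.append({**l, **r})
    return out


def as_set(rows, attrs):
    """Tuples of a relation as a set, for order-independent comparison."""
    return {tuple(r[a] for a in attrs) for r in rows}


ABC, XYZ = ["A", "B", "C"], ["X", "Y", "Z"]
SAMPLE = relation(ABC, ("a1", "b1", "c1"), ("a2", "b2", "c3"),
                  ("a3", "b1", "c2"), ("a4", "b2", "c4"))
SAMPLENEW = relation(XYZ, ("x1", "y1", "z1"), ("x2", "y2", "z2"),
                     ("x3", "y2", "z1"), ("x4", "y1", "z2"))

# Join over B, which is not a determinant: spurious tuples appear.
sample1 = natural_join(project(SAMPLE, ["A", "B"]), project(SAMPLE, ["B", "C"]))
spurious = as_set(sample1, ABC) - as_set(SAMPLE, ABC)

# Join over X, the determinant: the original relation comes back.
rejoined = natural_join(project(SAMPLENEW, ["X", "Y"]),
                        project(SAMPLENEW, ["X", "Z"]))
lossless = as_set(rejoined, XYZ) == as_set(SAMPLENEW, XYZ)
```

Here `spurious` holds the four asterisked tuples of SAMPLE 1, while `lossless` is true for SAMPLENEW.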
The first normal form (1NF). A relation is in the first normal form if it can be
represented as a flat file, i.e., the relation contains single values at the intersection of
each row and column.
In this relation, CName and Age pertain to the attribute dependent child, which has
multiple values (name and age). To convert this relation into 1NF, it should be
decomposed into two relations as follows:
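As a hypothetical illustration of this kind of decomposition, the sketch below moves a repeating group of dependent children into its own relation. Apart from CName and Age, the relation, attribute names and values are assumptions and do not reproduce the Unit's own table.

```python
# Hypothetical unnormalised relation: the "children" cell holds a list of
# (CName, Age) pairs, so it is not a single value.
employees = [
    {"EmpId": "E1", "EName": "Rao", "children": [("Asha", 7), ("Vikram", 4)]},
    {"EmpId": "E2", "EName": "Sen", "children": [("Mira", 9)]},
]


def is_1nf(rows):
    """True if every cell holds a single (non-collection) value."""
    return all(
        not isinstance(value, (list, tuple, set, dict))
        for row in rows
        for value in row.values()
    )


def to_1nf(rows):
    """Split the repeating group into its own relation, linked by EmpId."""
    employee = [{"EmpId": r["EmpId"], "EName": r["EName"]} for r in rows]
    child = [
        {"EmpId": r["EmpId"], "CName": name, "Age": age}
        for r in rows
        for name, age in r["children"]
    ]
    return employee, child


employee, child = to_1nf(employees)
```

Each of the two resulting relations has a single value at every row and column intersection, and EmpId links a child back to its employee.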
The second normal form (2NF). A relation is in the second normal form if it is in
1NF and every non-key attribute is fully functionally dependent on the primary key.
The second normal form pertains only to relations with a composite primary key. In
case a relation is in 1NF and has a single-attribute primary key, it is automatically in
the 2NF.
To explain the second normal form let us take the example of the following relation.
In this relation Course-id, Student-id and Faculty-id form a composite primary key.
Course-id → Course-name
This means that Course-name is not fully functionally dependent on the primary key.
Therefore the relation is not in the 2NF. The dependency diagram of the relation can
be represented as follows :
To convert this relation into the 2NF, we decompose it into two relations as follows:
COURSE (Course-id, Class-number, Student-id, Faculty-id)
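A partial dependency of this kind can be tested on sample tuples. In the sketch below the rows are hypothetical, as are the helper names `fd_holds` and `project` and the relation name `course_name`. Note that sample data can only refute a functional dependency: a dependency that holds on these rows is not thereby proved in general.

```python
def fd_holds(rows, lhs, rhs):
    """True if equal values on lhs always go with equal values on rhs."""
    seen = {}
    for r in rows:
        key = tuple(r[a] for a in lhs)
        val = tuple(r[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


def project(rows, attrs):
    """Projection onto attrs, with duplicate tuples removed."""
    out = []
    for r in rows:
        t = {a: r[a] for a in attrs}
        if t not in out:
            out.append(t)
    return out


# Hypothetical tuples; only the attribute names come from the text.
rows = [
    {"Course-id": "C1", "Student-id": "S1", "Faculty-id": "F1",
     "Course-name": "Databases", "Class-number": 101},
    {"Course-id": "C1", "Student-id": "S2", "Faculty-id": "F1",
     "Course-name": "Databases", "Class-number": 101},
    {"Course-id": "C2", "Student-id": "S1", "Faculty-id": "F1",
     "Course-name": "Networks", "Class-number": 102},
]

# Course-name depends on only part of the composite key.
partial = fd_holds(rows, ["Course-id"], ["Course-name"])

# Decomposition: the partially dependent attribute moves out with its determinant.
course = project(rows, ["Course-id", "Class-number", "Student-id", "Faculty-id"])
course_name = project(rows, ["Course-id", "Course-name"])  # hypothetical name
```

After the split, each course name is stored once per course instead of once per enrolment.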
To convert this relation into the 3NF we shall have to decompose it into two smaller
relations. The new relation OFFICE-NAME has the attribute office which caused
transitive dependency and its determinant. The decomposed relations are :
FACULTY (Faculty-id, Faculty-name, Department, Gender, Salary)
OFFICE-NAME (Department, Office)
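The effect of removing the transitive dependency can be sketched with a few hypothetical tuples. The values and the helper names `decompose` and `office_of` are assumptions, and Gender and Salary are left out of the rows for brevity.

```python
# Hypothetical tuples before decomposition: Office is repeated for every
# faculty member of the same department (Faculty-id → Department → Office).
faculty_before = [
    {"Faculty-id": "F1", "Faculty-name": "Rao", "Department": "Physics", "Office": "Block A"},
    {"Faculty-id": "F2", "Faculty-name": "Sen", "Department": "Physics", "Office": "Block A"},
    {"Faculty-id": "F3", "Faculty-name": "Iyer", "Department": "Botany", "Office": "Block B"},
]


def decompose(rows):
    """Split into FACULTY (without Office) and OFFICE-NAME (Department, Office)."""
    faculty = [{k: v for k, v in r.items() if k != "Office"} for r in rows]
    office_name = {r["Department"]: r["Office"] for r in rows}
    return faculty, office_name


def office_of(faculty_id, faculty, office_name):
    """Look up an office through the faculty member's department."""
    for r in faculty:
        if r["Faculty-id"] == faculty_id:
            return office_name[r["Department"]]
    raise KeyError(faculty_id)


faculty, office_name = decompose(faculty_before)
before = office_of("F2", faculty, office_name)

# Moving a department now needs one update instead of one per faculty member.
office_name["Physics"] = "Block C"
after = office_of("F2", faculty, office_name)
```

In the undecomposed relation the same move would have to be repeated in every Physics row, and missing one would leave the data inconsistent.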
With some tuple values in these relations let us examine them for anomalies which
are used to test relations.
FACULTY (unnormalized)
[Figure: steps in normalisation. Raw Data → (ensure flat file structure) → 1 NF →
(remove elements of functional dependencies) → 2 NF → (remove elements of
transitive dependencies) → BCNF → (ensure single multivalued dependencies) →
4 NF → 5 NF]
9.7 SUMMARY
The E-R model plays an important role in conceptual database design. The
evolutionary path of data models from the hierarchical model to the relational and
then to the object-oriented model has brought about tremendous changes in database
design techniques. Relational database management systems are by far the most
popular. Normalisation of relations is an important aspect of database design, aimed
at removing anomalies in a database. It improves the integrity and consistency of
data, though it can slow retrieval.
Designing databases is a highly complex process. There are a number of basic steps
which a database designer follows while designing databases. Usually a database
stabilizes in design over a period of time with feedback from users.
9.8 ANSWERS TO SELF-CHECK EXERCISES
2) RDBMS have an advantage over systems built on other data models in that the
relational model is based on the well-developed mathematical theory of relations,
from which it derives its name. Application of mathematics imparts great strength
to the relational model. The data in relational systems is represented in the form
of tables, which users find easier to handle. Examples of RDBMS are: ORACLE,
SYBASE and INGRES.
9.9 KEYWORDS
Dependency : A dependency refers to the relationship amongst attributes
belonging to the same relation or different relations.
Foreign Key : A column in one table that is the primary key in a second
table. It does not need to be a key in the first table.
in a table.