02 RDBMS CG
Management Systems
This module covers the concepts of a Relational Database Management System (RDBMS). The module
introduces you to some basic concepts of relational models. Elementary concepts like database, relation, and
normalization are also discussed.
The module starts with the basics of a Database Management System (DBMS) and then goes on to explain
terms like table, database, primary key, foreign key and composite key.
The module then explains the basics of the Relational Model and the Entity Relationship Model. The
method of creation of E/R diagrams and their mapping to tables is also covered in the lessons. The concepts
of normalization and denormalization are also explained in this module.
1.4 Implementing a Database Design Using Microsoft SQL Server 7.0 ©NIIT
ENTRY PROFILE
A student should have knowledge of the following topics before starting with the RDBMS module:
Build Flowcharts – the student should be able to represent logic, sequence tasks for execution, and
implement conditional operations and iteration.
Design a database.
LESSON-WISE INPUTS
Lesson One
Experiences
This lesson introduces the students to database management systems. Discuss the benefits of a database approach. It
also introduces the students to relational database design and describes data models and relational operators.
Explain the object-based logical model. Also explain the concept of entities, relationships, and attributes in
an entity-relationship model.
Explain the various types of relationships. Also explain the concept of attributes, subtypes, and
supertypes.
• Highlight the guidelines for identifying an entity. Tell the students to underline all nouns in a
problem statement. Those nouns that fall within the scope of data storage will be the entities.
Do caution them that certain nouns might be attributes. Attributes are those nouns that describe an
entity. Also, tell them that entities are stored in a database in the form of tables.
• In most of the problem statements, the given attributes constitute a partial list. Ask the class to
identify the other attributes of an entity for which data can be captured.
• Restrict
• Project
• Product
• Intersect
• Difference
• Join
• Divide
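The relational operators listed above can be demonstrated to the class on tiny in-memory relations. The following is a minimal sketch in Python, assuming that a relation is represented as a list of dictionaries. The supplier, shipment, and part data and all helper names (restrict, project, and so on) are made up for illustration and do not come from the courseware.

```python
# Relations are modelled as lists of dicts (one dict per tuple).
# All data and helper names here are illustrative assumptions.

def restrict(relation, predicate):
    """Restrict (select): keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, columns):
    """Project: keep only the named columns and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result

def product(r, s):
    """Product: pair every row of r with every row of s."""
    return [{**a, **b} for a in r for b in s]

def intersect(r, s):
    """Intersect: rows that appear in both relations."""
    return [row for row in r if row in s]

def difference(r, s):
    """Difference: rows of r that do not appear in s."""
    return [row for row in r if row not in s]

def join(r, s, column):
    """Natural join on one common column."""
    return [{**a, **b} for a in r for b in s if a[column] == b[column]]

def divide(r, s, by, keep):
    """Divide: `keep` values in r that are paired with every `by` value in s."""
    required = {row[by] for row in s}
    candidates = {row[keep] for row in r}
    return sorted(
        k for k in candidates
        if required <= {row[by] for row in r if row[keep] == k}
    )

suppliers = [
    {"s_no": "S1", "city": "Delhi"},
    {"s_no": "S2", "city": "Mumbai"},
]
shipments = [
    {"s_no": "S1", "p_no": "P1"},
    {"s_no": "S1", "p_no": "P2"},
    {"s_no": "S2", "p_no": "P1"},
]
parts = [{"p_no": "P1"}, {"p_no": "P2"}]

delhi_suppliers = restrict(suppliers, lambda row: row["city"] == "Delhi")
supplier_numbers = project(shipments, ["s_no"])
supplier_shipments = join(suppliers, shipments, "s_no")
ships_every_part = divide(shipments, parts, by="p_no", keep="s_no")
```

Divide is usually the hardest operator for students: here it answers the question "which suppliers ship every part?", and only S1 qualifies in the sample data.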
This is a packed session, so please recap it before starting Lesson Two in the next class.
Additional Inputs
A database is a collection of information organized in such a way that a computer program can quickly
select the desired data. You can think of a database as an electronic filing system.
Traditional databases are organized by fields, records, and files. A field is a single unit of information; a
record is one complete set of fields; and a file is a collection of records. For example, a telephone book is
analogous to a file. It contains a list of records, each of which consists of three fields: name, address, and
telephone number.
A relationship is an association established between the common fields (columns) in two tables. A
relationship can be one-to-one, one-to-many, or many-to-many.
A field is a space allocated for a particular item of information. For example, a tax form can contain a
number of fields to store information like your name, your Social Security number, your income, and so on.
In database systems, fields are the smallest units of information that can be accessed. In spreadsheets, fields
are referred to as cells.
Fields have certain attributes associated with them. For example, some fields are numeric whereas others are
textual. In addition, every field has a name, called the field name.
In an RDBMS, a field that you use to sort or look up data is referred to as a key. It can also be called a key field,
sort key, index, or key word. For example, if you sort records by age, then the age field is a key. Most
database management systems allow you to have more than one key so that you can sort records in different
ways. One of the keys is designated as the primary key, and must hold a unique value for each record. A
field that identifies records in a different table is called a foreign key.
Consider a situation where each customer is assigned a customer number. The customer number is unique
for that customer; this might be a customer number that is used by your company or a number that is
automatically assigned by the computer, which the user may never even see. It is essential, however, that
there be only one customer at any time having that customer number. The customer number uniquely
identifies each customer "record". A "record" is a single entry in a table; for instance, in the Customers
table, each customer's information constitutes a record; in the Sales table, each sale constitutes a record.
A primary key is also required in the Sales table. In addition, the customer number of the customer should
also appear in this table as a "field". A field is a single type of information in a table; in the Customers table,
we would have a customer name field, a customer number field, etc.
The customer number in the Sales table is called a "foreign key". A foreign key is a copy of a primary key
from another table; here it identifies the customer to whom each sale belongs.
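The Customers and Sales example can be tried out as follows. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS used in the course; the column names and sample values are hypothetical, and the exact syntax differs between products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute("""
    CREATE TABLE Customers (
        customer_no   INTEGER PRIMARY KEY,   -- unique for each customer record
        customer_name TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE Sales (
        sale_no     INTEGER PRIMARY KEY,     -- the Sales table needs its own primary key
        customer_no INTEGER NOT NULL
                    REFERENCES Customers(customer_no),  -- foreign key
        amount      REAL NOT NULL
    )""")

conn.execute("INSERT INTO Customers VALUES (101, 'Asha')")
conn.execute("INSERT INTO Sales VALUES (1, 101, 250.0)")

# A second customer with the same number violates the primary key.
try:
    conn.execute("INSERT INTO Customers VALUES (101, 'Ravi')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A sale for a customer number that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO Sales VALUES (2, 999, 80.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Both rejected inserts raise an integrity error, which is how the database keeps each customer number unique and every sale tied to an existing customer.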
Answer: One type of field that makes an excellent primary key is a "counter" field. This is a field with a
whole number as its value, which automatically creates a new, unique number for each record. As a rule,
numeric keys are preferable to text keys because they take up less memory and therefore are faster and
require less data storage space. The counter field additionally removes the concern of generating a unique
value for the primary key field for each new record.
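A counter field can be shown with a short experiment. The sketch below assumes SQLite via Python's sqlite3 module, where an INTEGER PRIMARY KEY AUTOINCREMENT column plays the role of the counter; other RDBMSs provide comparable features under different names. The table and sample names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The database fills in customer_no automatically when no value is supplied.
conn.execute("""
    CREATE TABLE Customers (
        customer_no   INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_name TEXT NOT NULL
    )""")

new_keys = []
for name in ("Asha", "Ravi", "Meena"):
    cursor = conn.execute(
        "INSERT INTO Customers (customer_name) VALUES (?)", (name,)
    )
    new_keys.append(cursor.lastrowid)  # key generated by the database
```

The application never chooses the key value, which removes the concern of generating a unique value for each new record.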
Date/time stamps make tolerable primary keys in some RDBMSs, as long as the program is never being
used by more than one person at a time.
A social security number may not be the best primary key, for several reasons. First, it takes up more digits
than would generally be required. Second, the same social security number has reportedly been assigned
more than once, so uniqueness cannot be taken for granted. Third, anyone who is not a U.S. citizen may not
have a social security number. Fourth, it may not be permissible by law to require a person to
supply their social security number; tax ID numbers can be assigned and used instead in some cases.
Two or more fields, often foreign keys, can be collectively set to create a primary key, called a "compound key" (also referred to
as a composite key). Some RDBMSs will then automatically prevent the duplication of any combination
of those fields. Any number of fields may be identified as a compound key together. However, compound
keys should not be used too widely, as they are slower than single-field primary keys (since the RDBMS has
to check two or more full fields of information in each table rather than just one field). Also, compound keys
become awkward when more than two or three fields are in use, and it is more difficult to refer to specific
records through compound keys than through single keys.
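The way a compound key prevents duplicate combinations can be illustrated as follows. This is a minimal sketch using SQLite through Python's sqlite3 module; the Enrollment table and its columns are hypothetical and are not part of the courseware.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Enrollment (
        student_id  INTEGER NOT NULL,
        course_code TEXT    NOT NULL,
        enrolled_on TEXT,
        PRIMARY KEY (student_id, course_code)  -- compound (composite) key
    )""")

conn.execute("INSERT INTO Enrollment VALUES (1, 'RDBMS', '2000-01-10')")
conn.execute("INSERT INTO Enrollment VALUES (1, 'SQL',   '2000-02-01')")  # same student, other course
conn.execute("INSERT INTO Enrollment VALUES (2, 'RDBMS', '2000-01-12')")  # same course, other student

# Repeating an existing (student_id, course_code) pair is rejected.
try:
    conn.execute("INSERT INTO Enrollment VALUES (1, 'RDBMS', '2000-03-01')")
    duplicate_pair_rejected = False
except sqlite3.IntegrityError:
    duplicate_pair_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Enrollment").fetchone()[0]
```

Each field may repeat on its own; only the combination has to be unique.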
CUSTOMER is the set of all people who have an account in the bank.
CUSTOMER: customer_code, account_number
EMPLOYEE: employee_code
ACCOUNT: account_number
In a Hierarchical Database Management System (HDBMS), the data is stored conceptually in a hierarchical
format. An example of this kind of data storage is an XML data island, where there is a root tag with
child tags below it. Each of these child tags can in turn have child tags. This kind of
formation resembles a tree. For example, a Firm can have children called Factory, each of which in turn
can have children called Employee. In an HDBMS, a one-to-one or a one-to-many relationship can be very
easily implemented, but a many-to-many relationship cannot be implemented directly, because each child
has only one parent. In a Networked Database Management System (NDBMS), a child node can be
associated with multiple parent nodes. Unlike an HDBMS, an NDBMS can therefore represent many-to-many relationships.
Object-Relational Database Management Systems (ORDBMSs) are extensions of existing RDBMSs that can
be used for complex jobs such as storing and managing objects. For instance, image
objects can be stored and managed using an ORDBMS. Examples of ORDBMSs are Informix Universal Server and Adaptive
Server of Sybase Inc.
Just a Minute…
The following statement has been extracted from a case presented by a manufacturer regarding the
maintenance of their data: “A supplier ships certain parts”. Identify the entities mentioned in this statement,
and their relationship. Draw a diagram depicting the relationship.
Solution
Entities: SUPPLIER, PARTS.
Relationship: SHIP (or SHIPMENT)
Just a Minute…
Solution
Many students can work on many projects.
Many employees belong to only one department.
E/R diagram: SUPPLIER (m) -- SHIP -- (m) PARTS
Just a Minute…
E/R diagram: entity SUPPLIER with attributes ADD and CR_STATUS
Just a Minute…
There are two types of suppliers. One type of supplier allows credit, while the other type insists on payment in cash
before delivery. The manufacturer wishes to maintain separate information on these two types of suppliers. For the
credit supplier, “credit period” and “credit limit” have to be recorded. For the cash supplier, “date of payment” has
to be stored. Represent this information diagrammatically.
Solution
SUPPLIERS (supertype)
CASH (subtype): PAY_D
CREDIT (subtype): CR_LIMIT, CR_PERIOD
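One possible way to map the supertype and subtypes of this solution to tables is sketched below, assuming SQLite through Python's sqlite3 module. The supplier_code and supplier_name columns, the sample rows, and the unit of the credit period are assumptions made for illustration; only the payment date, credit limit, and credit period come from the problem statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Suppliers (            -- supertype: data common to all suppliers
        supplier_code TEXT PRIMARY KEY,
        supplier_name TEXT NOT NULL
    );
    CREATE TABLE CashSuppliers (        -- subtype: cash suppliers only
        supplier_code TEXT PRIMARY KEY REFERENCES Suppliers(supplier_code),
        pay_d         TEXT NOT NULL     -- date of payment
    );
    CREATE TABLE CreditSuppliers (      -- subtype: credit suppliers only
        supplier_code TEXT PRIMARY KEY REFERENCES Suppliers(supplier_code),
        cr_limit      REAL NOT NULL,    -- credit limit
        cr_period     INTEGER NOT NULL  -- credit period in days (assumed unit)
    );
    INSERT INTO Suppliers VALUES ('S1', 'Acme'), ('S2', 'Bolt');
    INSERT INTO CashSuppliers VALUES ('S1', '2000-04-01');
    INSERT INTO CreditSuppliers VALUES ('S2', 50000, 30);
""")

# Subtype rows are read together with the common supertype data.
credit_rows = conn.execute("""
    SELECT s.supplier_name, c.cr_limit, c.cr_period
    FROM Suppliers s
    JOIN CreditSuppliers c ON c.supplier_code = s.supplier_code
""").fetchall()
```

Each subtype table carries the primary key of the supertype plus its own distinguishing attributes, so the common supplier details are stored only once.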
Faculty Observations
Experiences
This session defines the terms entity, relationship, and attribute with respect to the entity-relationship model.
Relationships should be discussed in detail with enough examples. The different types of relationships are
very important and should be made clear to the student. The symbols to represent the different types of
relationships between entities should be explained. The structure of a table and the concept of tuples and
attributes should be discussed in detail. Primary key and foreign key should be explained. Mapping E/R
diagrams to tables is very important.
Conceptual Model
Explain the conceptual model and then explain the following:
• Regular entities
• Attributes
• Weak entities
E/R Diagrams
Stress the symbols used for depicting the entities, the attributes, and the relationship between the
entities.
Stress that there is no difference between a one-to-many and a many-to-one relationship except for the
order in which the entities are placed while drawing the E/R diagram. For example, the relationship
between a department and an employee can be represented in two ways.
Employee (m) -- Belong to -- (1) Department
Department (1) -- Has -- (m) Employee
Stress the fact that in a one-to-many relationship, a common attribute is required to relate the two tables.
FAQs
Q: What is the difference between a database structure and an ER Diagram?
A:
ER Diagram: Comprises a collection of objects or entities and the relationships among these.
Database structure: Represents data in the database as simple tables in the row-column format.
A: A Weak entity depends upon a regular entity for its existence whereas a Sub entity is part of a Regular
entity. For example, an entity called Students is used to store all student details. Now, every student is
enrolled for a course and there are some students who have taken a break from the course. In this kind of a
scenario, we can have a Sub entity called Break-Students, which stores details of all students who have
currently taken a break. Note that a Sub entity will contain all the columns of the super-entity from which it
is derived. A Weak entity on the other hand has attributes that are different from those of the regular entity
on which it is dependent.
A: A Weak entity depends upon a regular entity for its existence. For example, the entity called Children
depends on the entity Employee for its existence. If an Employee resigns, then the corresponding records for
that employee in the table Children are also removed.
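The dependence of the weak entity Children on Employee can be demonstrated with a cascading delete. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the column names and sample rows are hypothetical, and other RDBMSs may require a different mechanism to achieve the same effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE in SQLite
conn.executescript("""
    CREATE TABLE Employee (
        employee_code TEXT PRIMARY KEY,
        employee_name TEXT NOT NULL
    );
    CREATE TABLE Children (             -- weak entity: exists only for an employee
        employee_code TEXT NOT NULL
                      REFERENCES Employee(employee_code) ON DELETE CASCADE,
        child_name    TEXT NOT NULL,
        PRIMARY KEY (employee_code, child_name)
    );
    INSERT INTO Employee VALUES ('E1', 'Anita'), ('E2', 'Vikram');
    INSERT INTO Children VALUES ('E1', 'Rohan'), ('E1', 'Sara'), ('E2', 'Kabir');
""")

# Employee E1 resigns: deleting the row also removes the dependent children.
conn.execute("DELETE FROM Employee WHERE employee_code = 'E1'")
remaining_children = conn.execute(
    "SELECT employee_code, child_name FROM Children"
).fetchall()
```

Note that the primary key of Children includes the key of Employee, which is typical of a weak entity: a child row cannot be identified without its parent.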
A: The difference between a Subtype and a Supertype entity is best understood with an example. An entity
called Student has two sub-type entities, Boarder and Day-scholar. Here, the entity Student stores details about
students like name, age, course, class, etc. and has student-id as the primary key. The Boarder sub-type entity
has a distinguishing attribute Room-no, while the Day-scholar sub-type entity has a distinguishing attribute
Locker-no. Apart from the distinguishing attributes, the sub-type entities also contain the primary key of the
super-entity.
Additional Inputs
The database design created from a well-prepared E/R Diagram will result in tables that are fairly
normalized. To further reduce redundancy, you will need to normalize the tables and in certain situations,
because of system requirements, you may need to introduce some amount of redundancy by denormalizing
the tables. Tell the students that normalization will be discussed in detail later in the course.
Tips
Be comfortable with the types of relationships – one-to-one, one-to-many, many-to-one, and
many-to-many.
You should be able to look at a schema and identify the types of relationships and the data that can be
captured in it.
Just a Minute…
a. Candidate Key
b. Alternate Key
Solution
a. Any attribute (or set of attributes) that uniquely identifies a row in a table is a candidate for the
primary key. Such an attribute is called a candidate key.
b. Any attribute that is a candidate for the primary key but is not the primary key is called the
alternate key.
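The two definitions in this solution can be made concrete with a small table. The sketch below assumes SQLite through Python's sqlite3 module and a hypothetical Employee table in which both the employee code and the e-mail id identify a row uniquely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee (
        employee_code TEXT PRIMARY KEY,      -- candidate key chosen as the primary key
        email_id      TEXT NOT NULL UNIQUE,  -- candidate key not chosen: an alternate key
        employee_name TEXT NOT NULL          -- not a candidate key: names can repeat
    )""")

conn.execute("INSERT INTO Employee VALUES ('E1', 'anita@example.com', 'Anita')")
conn.execute("INSERT INTO Employee VALUES ('E2', 'anita2@example.com', 'Anita')")

# The alternate key must stay unique even though it is not the primary key.
try:
    conn.execute("INSERT INTO Employee VALUES ('E3', 'anita@example.com', 'Asha')")
    alternate_key_enforced = False
except sqlite3.IntegrityError:
    alternate_key_enforced = True
```

Declaring the alternate key as UNIQUE is one common way to have the database enforce it.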
Solutions: 2.P.1
aircraft_id (PK)     char(4)   No   Aircraft Id. For example, it will store A330 for Airbus 330 and B747 for Boeing 747-400 aircraft
first_class_seats    int       No   The total number of first class seats in a particular type of aircraft
sector_id (PK)       char(5)   No   Id for a particular sector where the flight operates
week_day2            char(3)   No
Table Flight
aircraft_id (FK)     char(4)   No   Aircraft id. Should exist in the Aircraft table.
sector_id (FK)       char(5)   No   Sector where the flight operates. Should exist in the Sector table.
* Note: The number of seats in each class for a particular flight will be initialized based on the aircraft Id. Every time a
reservation takes place, the number of seats in this table will be decremented based on the class of travel chosen by
the passenger. Similarly, in case of a cancellation, the number of available seats will be incremented. Though this is
redundant data, it is still maintained in this table for ease of use in queries.
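The seat-count bookkeeping described in this note can be sketched as follows, assuming SQLite through Python's sqlite3 module. Only aircraft_id and first_class_seats appear in the table descriptions; the flight_no, flight_date, and first_class_available columns, the helper functions, and the sample flight are hypothetical and cover a single class of travel for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Aircraft (
        aircraft_id       TEXT PRIMARY KEY,
        first_class_seats INTEGER NOT NULL
    );
    CREATE TABLE Flight (
        flight_no             TEXT NOT NULL,     -- hypothetical column
        flight_date           TEXT NOT NULL,     -- hypothetical column
        aircraft_id           TEXT NOT NULL REFERENCES Aircraft(aircraft_id),
        first_class_available INTEGER NOT NULL,  -- redundant running count (hypothetical name)
        PRIMARY KEY (flight_no, flight_date)
    );
    INSERT INTO Aircraft VALUES ('A330', 12);
""")

def schedule_flight(flight_no, flight_date, aircraft_id):
    """Initialise the available seats from the aircraft's capacity."""
    conn.execute(
        """INSERT INTO Flight
           SELECT ?, ?, aircraft_id, first_class_seats
           FROM Aircraft WHERE aircraft_id = ?""",
        (flight_no, flight_date, aircraft_id),
    )

def change_seats(flight_no, flight_date, delta):
    """Apply -1 for a reservation and +1 for a cancellation."""
    conn.execute(
        """UPDATE Flight
           SET first_class_available = first_class_available + ?
           WHERE flight_no = ? AND flight_date = ?""",
        (delta, flight_no, flight_date),
    )

def seats_left(flight_no, flight_date):
    """Read the running count straight from the Flight table."""
    return conn.execute(
        """SELECT first_class_available FROM Flight
           WHERE flight_no = ? AND flight_date = ?""",
        (flight_no, flight_date),
    ).fetchone()[0]

schedule_flight("NT101", "2000-05-01", "A330")
change_seats("NT101", "2000-05-01", -1)   # reservation
change_seats("NT101", "2000-05-01", -1)   # reservation
change_seats("NT101", "2000-05-01", +1)   # cancellation
```

The trade-off is the one the note describes: availability queries read a single column instead of counting reservations, at the cost of keeping the redundant count in step with every transaction.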
Table Passenger
travel_date (FK)     date      No   Date of travel. The flight number and the date of travel together form a foreign key that references the flight number and flight date in the Flight table.
(Vegetarian/Non-vegetarian)
cancel_flag          char(1)   No   ‘Y’ if the passenger has cancelled the ticket, ‘N’ otherwise.
Table DailyCollection
pnrno (FK)           char(8)   No   PNR number. It references the PNR number from the Passenger table.
*Note: This table is updated whenever a transaction such as a reservation, cancellation, or modification takes place.
This table is created for the purpose of generation of reports. Although this table is not depicted in the E/R diagram, it is
used to maintain a record of daily transactions.
Lesson Three
Experiences
This session is an introduction to database design and SQL. The session starts with an introduction to normalization
and the types of normal forms. The Boyce-Codd normal form and denormalization are also explained in the
session.
Normalization
Explain the need for normalization.
Denormalization
Explain the concept of denormalization.
Explain relations with respect to domains. Also, explain the properties of relations.
Additional Inputs
Fourth Normal Form: If a relation has many-to-many relationships with two or more relations, then the
attributes of all the three or more relations cannot be depicted in the same relation. For a relation to
be in the fourth normal form, it has to be in the third normal form and each of the many-to-many
relationships should be assigned to a separate relation. For example, consider an entity called
Student that has multiple skills and is enrolled in multiple courses. Here the relationship between
Student and Skill is many-to-many, as one student can have multiple skills and the same skill can be
attributed to multiple students. Likewise, one student can enroll for multiple courses and a course can have
multiple students. To be in the fourth normal form, you should have two relations: Student-Course and
Student-Skill.
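The Student example can be worked through with a few rows of data. The following is a minimal sketch in plain Python, with made-up students, courses, and skills, showing that the combined relation repeats data and that splitting it into Student-Course and Student-Skill loses nothing.

```python
# One combined relation forces every course to be repeated for every skill.
student_course_skill = [
    ("Asha", "RDBMS", "Typing"),
    ("Asha", "RDBMS", "Drawing"),
    ("Asha", "SQL",   "Typing"),
    ("Asha", "SQL",   "Drawing"),
    ("Ravi", "RDBMS", "Typing"),
]

# Fourth normal form: one relation per independent many-to-many relationship.
student_course = sorted({(s, c) for s, c, _ in student_course_skill})
student_skill = sorted({(s, k) for s, _, k in student_course_skill})

# Joining the two relations on the student gives back the combined rows,
# so nothing was lost by splitting.
rejoined = sorted(
    (s, c, k)
    for s, c in student_course
    for s2, k in student_skill
    if s == s2
)
```

Five rows in the combined relation shrink to three rows in each of the two new relations, and adding a new skill for Asha now takes one row instead of one row per course.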
Fifth Normal Form: For a relation to be in the fifth normal form, it has to be in the fourth normal form
and abide by some business rules. Let us understand the business rule involved in the fifth normal
form with an example. Suppose a student is enrolled in one or more of the three available courses,
each course has two modules, and a business rule states that a student can be associated
with only one module at any time. Then there should be three relations: Course-Module, Course-Student,
and Student-Module. If the business rule did not exist, there would be no need for the fifth normal form.
Just a Minute…
You have received a proposed table structure for the table Position. After testing the table structure with
some data, you find that there is a problem in inserting, deleting, and modifying data. You see that the table
structure could lead to inconsistency in data and is also occupying a lot of disk space.
Position
cPositionCode
vDescription
iBudgetedStrength
siYear
iCurrentStrength
vSkill
cPositionCode vDescription iBudgetedStrength iCurrentStrength vSkill
Solution
Use Normalization and break the table as follows:
Just a Minute…
In the reporting system, the total amount paid to a contract recruiter is often required. The required result
can be calculated using the two tables - Payment and ContractRecruiter.
ContractRecruiter
cContractRecruiterCode
cName
vAddress
cCity
cZip
cPhone
cFax
siPercentageCharge
Payment
cSourceCode
mAmount
cChequeNo
dDate
Denormalize the tables. Add a column called mTotalPaid to the ContractRecruiter table.
ContractRecruiter
cContractRecruiterCode
cName
vAddress
cCity
cZip
cPhone
cFax
siPercentageCharge
mTotalPaid
Note: You can communicate to the students that the value of mTotalPaid can be updated automatically
using tools, such as triggers, that different RDBMS products provide.
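One way such automatic updating could work is a trigger on the Payment table. The sketch below assumes SQLite through Python's sqlite3 module rather than the RDBMS used in the course, so the trigger syntax will differ in other products. It also assumes that cSourceCode in Payment holds the contract recruiter code; the trigger name and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ContractRecruiter (
        cContractRecruiterCode TEXT PRIMARY KEY,
        cName                  TEXT NOT NULL,
        mTotalPaid             REAL NOT NULL DEFAULT 0   -- denormalized running total
    );
    CREATE TABLE Payment (
        cSourceCode TEXT NOT NULL,   -- assumed to hold the recruiter code
        mAmount     REAL NOT NULL,
        cChequeNo   TEXT,
        dDate       TEXT
    );
    -- Keep the redundant total in step with every new payment.
    CREATE TRIGGER trgPaymentInsert AFTER INSERT ON Payment
    BEGIN
        UPDATE ContractRecruiter
        SET mTotalPaid = mTotalPaid + NEW.mAmount
        WHERE cContractRecruiterCode = NEW.cSourceCode;
    END;
    INSERT INTO ContractRecruiter (cContractRecruiterCode, cName)
        VALUES ('CR01', 'Talent Inc');
    INSERT INTO Payment VALUES ('CR01', 1500.0, 'CH001', '2000-06-01');
    INSERT INTO Payment VALUES ('CR01',  500.0, 'CH002', '2000-07-01');
""")

# The stored total can now be read without joining or summing Payment.
stored_total = conn.execute(
    "SELECT mTotalPaid FROM ContractRecruiter WHERE cContractRecruiterCode = 'CR01'"
).fetchone()[0]
computed_total = conn.execute(
    "SELECT SUM(mAmount) FROM Payment WHERE cSourceCode = 'CR01'"
).fetchone()[0]
```

This sketch handles inserts only; a complete solution would also need to adjust the total when a payment is changed or deleted.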
Just a Minute…
Each time the salary slip for an employee is generated, the referral bonus (if present) has to be calculated
and printed in the salary slip. The following three tables are used for this query- MonthlySalary,
Employee, and EmployeeReferrals. The table structures are:
cCandidateCode
vAddress
cCity
cZip
cCountryCode
cPhone
vQualification
dBirthDate
cSex
cCurrentPosition
cDesignation
cEmailId
cDepartmentCode
cRegion
imPhoto
vSkill
dJoiningDate
dResignationDate
cSocialSecurityNo
However, since the table structures are large, it is necessary to improve the performance of this query by
modifying the table structures.
Solution
Identify how to increase the performance of queries. Denormalize the tables by adding a Referralbonus
attribute to the MonthlySalary table.
cCandidateCode    Referralbonus
vAddress
cCity
cZip
cCountryCode
cPhone
vQualification
cSex
cCurrentPosition
cDesignation
cEmailId
cDepartmentCode
cRegion
imPhoto
vSkill
dJoiningDate
dResignationDate
cSocialSecurityNo
Additional Exercises
To make students comfortable with Normalization, faculty can ask students to attempt the following
questions:
Q1. The following table is in 1NF. Convert the table structure to 3NF.
Where:
• EName is the employee name.
Sol:
• Find and remove attributes that are functionally dependent on only a part of the key and not on the
whole key. Place them in a different table.
So,
Project
Pno
PName
Employee
Eno
Ename
JobProfile
JobRate
Assignment
Pno
Eno
Hours
• Remove the non-key attributes that are functionally dependent on attributes that are not the primary
key. Place them in a different table.
Project
Pno
PName
Employee
Eno
Ename
JobProfile
Assignment
Pno
Eno
Hours
Job
JobProfile
JobRate
Q2. The following table is in 1NF. Convert it into 2NF.
Rep ID*   Rep First Name   Rep Last Name   Client ID*   Client        Time With Client
TS-89     Gilroy           Gladstone       978          US Corp       14 hrs
TS-89     Gilroy           Gladstone       665          Taggarts      26 hrs
TS-89     Gilroy           Gladstone       782          Kilroy Inc.   9 hrs
RK-56     Mary             Mayhem          221          Italiana      67 hrs
RK-56     Mary             Mayhem          982          Linkers       2 hrs
RK-56     Mary             Mayhem          665          Taggarts      4 hrs
Solution:
Rep ID*   First Name   Last Name
TS-89     Gilroy       Gladstone
RK-56     Mary         Mayhem
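The decomposition for Q2 can be checked against the sample rows. The following is a minimal sketch in plain Python of one possible 2NF split, assuming that the rep names depend only on Rep ID, the client name depends only on Client ID, and the time depends on both. The variable names are illustrative.

```python
# Rows of the 1NF table: (Rep ID, first name, last name, Client ID, client, hours).
rows = [
    ("TS-89", "Gilroy", "Gladstone", 978, "US Corp", 14),
    ("TS-89", "Gilroy", "Gladstone", 665, "Taggarts", 26),
    ("TS-89", "Gilroy", "Gladstone", 782, "Kilroy Inc.", 9),
    ("RK-56", "Mary", "Mayhem", 221, "Italiana", 67),
    ("RK-56", "Mary", "Mayhem", 982, "Linkers", 2),
    ("RK-56", "Mary", "Mayhem", 665, "Taggarts", 4),
]

# Attributes that depend on only part of the key move to their own tables.
reps = {rep_id: (first, last) for rep_id, first, last, *_ in rows}
clients = {client_id: client for _, _, _, client_id, client, _ in rows}

# Only the time depends on the whole key (Rep ID, Client ID).
time_with_client = {
    (rep_id, client_id): hours
    for rep_id, _, _, client_id, _, hours in rows
}
```

Each rep name and client name is now stored once, while the six time entries stay keyed by the full Rep ID and Client ID combination.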
The advantage of having relational tables in 3NF is that it eliminates redundant data, which in turn saves
space and reduces manipulation anomalies.
• The tables are narrower and have more records in a page, which results in less I/O.
• Database is organized.
Sol. Denormalization is the process of taking a normalized database and modifying table structures to allow
controlled redundancy for increased database performance. A denormalized database is not the same as a
database that has not been normalized.
Denormalizing a database is the process of lowering the level of normalization within the database.
Normalization can slow down the performance of a database because of its frequent table join operations.
Denormalization may involve recombining separate tables or creating duplicate data within tables to reduce
the number of tables that need to be joined to retrieve the requested data, which results in less I/O and CPU
time.
Sol. Although most successful databases are normalized to some degree, there is one substantial drawback
of a normalized database: normalization can slow down database performance. A normalized database
typically requires more CPU, memory, and I/O to process transactions and database queries than does a
denormalized database, because it must locate the requested tables and then join the data from these
tables to either get the requested information or to process the desired data.
The primary disadvantage of normalization is that it increases the number of joins. A join is a mechanism in
relational databases that retrieves information from two tables by matching their key values. It
is a disadvantage to have too many joins in a query because of the way the query optimizer works and
because of the number of I/Os that are required to retrieve all the information.
Faculty Observations
Relationship: Works
2. The entity-relationship diagram depicting the relationship between an instructor and students is as
follows:
INSTRUCTOR (1) -- TEACHES -- (m) STUDENTS
Attribute shown: ROLL_NO
Lesson 2
1. The relationships between the following entities are given below:
[E/R diagram: entities Order, Shopper, and Recipient; relationship Gives (1 to m); attributes Address ID and Cart ID]
Table Structure can be:
Table Recipient
Table Order
Lesson 3
1. ORDER TABLE
OrderNo
OrderDate
Total Cost
TOY TABLE
ToyID
ToyName
RECIPIENT TABLE
OrderNo
FirstName
LastName
Address
2. The following tables are in their normal forms.
ADDITIONAL REFERENCE
C. J. Date: An Introduction to Database Systems
Michael J. Hernandez: Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design
C. J. Date, Hugh Darwen: Foundation for Future Database Systems: The Third Manifesto
C. J. Date: What Not How: The Business Rules Approach to Application Development