Unit 1 DBMS

Uploaded by

alankritabhonde
In this Chapter
• DBMS basic concepts
• Advantages of a DBMS over file-processing systems
• Data Abstraction
• Database Languages
• Data Models
• Data Independence
• Components of a DBMS
• Overall Structure of a DBMS
• DBMS architecture
• ERD
• Schema Diagram
• Relational Model
• Codd’s Rules
• Relational Integrity
1.1 Basic Concepts
What is Data?
Data is a collection of distinct small units of information. It can
take a variety of forms, such as text, numbers, media or bytes, and it
can be stored on paper, in electronic memory, etc.
The word 'data' originates from the word 'datum', which means
'a single piece of information'. 'Data' is the plural of 'datum'.
In computing, data is information that has been translated into a
form suitable for efficient movement and processing. Data is
interchangeable.
What is Database?
A database is an organized collection of data, so that it can be
easily accessed and managed.
What is Database Management System?
A Database Management System (DBMS) is software used to manage the
data in a database. Some popular DBMS products are MySQL, Oracle and
MongoDB. A DBMS provides many operations, e.g. creating a database,
storing data in it, updating existing data and deleting data from it.
In short, a DBMS is a system that enables you to store, modify and
retrieve data in an organized way. It also provides security for the
database.

Database vs. DBMS

A system intended for easily organizing, storing and retrieving
large amounts of data is called a database. In other words, a
database holds a bundle of organized data (typically in digital form)
for one or more users. Databases, often abbreviated DB, are
classified according to their content, such as document-text,
bibliographic and statistical. A DBMS (Database Management System),
by contrast, is the whole system used for managing digital databases;
it allows storage of database content, creation and maintenance of
data, search and other functionality. In practice a database is of
little use without an associated DBMS for accessing its data.
Increasingly, however, the term "database" is used as shorthand for
"Database Management System".

Notes by Dr. Nilesh Shelke SIT, Nagpur


1.2 Drawbacks of File Processing System

A file can be understood as a container for storing data in a
computer. Files are stored on the storage device of a computer
system. The contents of a file can be text, computer program code,
comma-separated values (CSV), etc. Likewise, pictures, audio/video
and web pages are also files.

A file system becomes difficult to handle as the number of files
increases and the volume of data grows. Following are some of the
limitations of a file system:

(A) Difficulty in Access

Files themselves do not provide any mechanism to retrieve data.
Data maintained in a file system is accessed through application
programs. While writing such programs, the developer may not
anticipate all the possible ways in which the data may be accessed.
So it is sometimes difficult to obtain data in the required format,
and a new application program has to be written for each new kind
of access.

(B) Data Redundancy & Inconsistency

In conventional data systems, an organization often builds a
collection of application programs, created by different
programmers and requiring different components of the
operational data of the organization. The data in conventional
data systems is often not centralized. Some applications may
require data to be combined from several systems. These systems
could well hold data that is redundant as well as inconsistent
(that is, different copies of the same data may have different
values). Data inconsistencies are often encountered in everyday
life. For example, when a new address is communicated to an
organization that we deal with (e.g. a bank, a telecom company or
a gas company), we may find that some of the communications from
that organization are received at the new address while others
continue to be mailed to the old address. Combining all the data
in a database reduces redundancy as well as inconsistency. It is
also likely to reduce the costs of collecting, storing and
updating data.

(C) Data Isolation

Files are stored in different locations and in different formats, so
the data is isolated. For example, in one location the student data
may be stored in .txt format, while in another location the same data
may be stored in .doc format. Writing a program that retrieves data
from both is therefore difficult.

(D) Data Dependence

Data is stored in a specific format or structure in a file. If the
structure or format itself is changed, all the existing
application programs accessing that file also need to be
changed. Otherwise, the programs may not work correctly. This
is data dependence. Hence, updating the structure of a data file
requires modification of all the application programs
accessing that file.
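
The dependence described above can be illustrated with a small sketch. The fixed-width record layout (4-character roll number, 10-character name) and the function name `parse_v1` are assumptions made up for this illustration; they do not come from the notes.

```python
# Hypothetical fixed-width student record: 4-char roll number, 10-char name.
def parse_v1(line: str) -> dict:
    """Parser written against the ORIGINAL file layout."""
    return {"roll": line[0:4].strip(), "name": line[4:14].strip()}

# A record in the original layout is parsed correctly.
old_record = "0042Asha      "
old_result = parse_v1(old_record)

# The file format changes: the roll number is widened to 6 characters.
# The unchanged program now silently splits the fields in the wrong place.
new_record = "000042Asha      "
new_result = parse_v1(new_record)

print(old_result)
print(new_result)
```

Because the layout is baked into the program, every program that reads the file has to be edited when the layout changes.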
(E) Controlled Data Sharing
There can be different categories of users, such as teachers, office
staff and parents (guardians). Ideally, not every user should be able
to access all the data. As an example, guardians and office staff
should be able to see the student attendance data but not to
modify or delete it. It means these users should be given limited
(read-only) access to the ATTENDANCE file. Only the teacher
should be able to update the attendance data. It is very difficult
to enforce this kind of access control in a file system when
files are accessed through application programs.
(F) Integrity Problems
Integrity problems arise when the stored data fails to satisfy certain
integrity conditions (constraints).
For example, a phone number cannot be longer than 10 digits, a bank
balance should not go below 1000, etc. In a file-processing system such
conditions are enforced inside the application programs, so the real
problem arises when new conditions have to be added to an existing
system: it is hard to make those changes.
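
By contrast, a DBMS lets such conditions be declared once, next to the data. The following is a minimal sketch using Python's built-in `sqlite3` module; the table name `account` and its columns are assumptions chosen for illustration, and the two CHECK constraints mirror the examples in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        acc_no  INTEGER PRIMARY KEY,
        phone   TEXT    CHECK (length(phone) <= 10),  -- at most 10 digits
        balance INTEGER CHECK (balance >= 1000)       -- minimum balance
    )
""")

def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

ok_row = try_insert((1, "9876543210", 5000))        # satisfies both conditions
long_phone = try_insert((2, "98765432101", 5000))   # 11-digit phone number
low_balance = try_insert((3, "9000000000", 500))    # balance below 1000
row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```

Only the first row is stored; the other two are rejected by the DBMS itself, whichever program tries to insert them.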
(G) Atomicity Problems
The database must be in a consistent state in spite of failures.
For example, suppose that you have a savings account with a balance
of 5000 and a loan account with an outstanding amount of 3000. This is
the old consistent state. Now you would like to transfer 500 from your
savings account to your loan account. If this transaction is
successful, your savings balance should be 4500 and the loan
outstanding should be 2500. This is the new consistent state. If a
failure occurs during this transaction, the database must end up in
one of the two consistent states mentioned above. It is hard to
maintain atomicity in a file processing system due to data
redundancy, data isolation, etc.
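
The transfer above can be sketched with `sqlite3` from the Python standard library. The table layout, the `transfer` function and the simulated failure are assumptions for illustration; the amounts are the ones used in the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, amount INTEGER)")
conn.execute("INSERT INTO account VALUES ('savings', 5000), ('loan', 3000)")
conn.commit()

def transfer(amount, fail_midway=False):
    """Pay `amount` from savings towards the loan as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET amount = amount - ? WHERE name = 'savings'",
                (amount,))
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE account SET amount = amount - ? WHERE name = 'loan'",
                (amount,))
    except RuntimeError:
        pass  # the partial update has already been rolled back

def balances():
    return dict(conn.execute("SELECT name, amount FROM account"))

transfer(500, fail_midway=True)
after_failure = balances()   # still the old consistent state
transfer(500)
after_success = balances()   # the new consistent state
```

After the failed attempt the balances are still 5000 and 3000; after the successful one they are 4500 and 2500. No intermediate state (4500 and 3000) is ever left behind.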
(H) Concurrent Access Anomalies
Simultaneous access to a data item must be handled carefully. For example, if only one
ticket is left and two customers try to book it simultaneously, the ticket
should be allotted to exactly one of them.
This is difficult to handle in a file processing system because of data isolation,
redundancy, etc.
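
A rough sketch of how a DBMS-side check can guarantee this is given below, using `sqlite3`. The `ticket` table and the `book` function are assumptions for illustration, and the two booking attempts run one after the other here rather than truly in parallel; the point is that the "is it still free?" test and the update happen in a single statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticket (id INTEGER PRIMARY KEY, booked_by TEXT)")
conn.execute("INSERT INTO ticket VALUES (1, NULL)")   # one unbooked ticket
conn.commit()

def book(customer):
    """Try to book ticket 1; succeed only if nobody has booked it yet."""
    with conn:
        cur = conn.execute(
            "UPDATE ticket SET booked_by = ? WHERE id = 1 AND booked_by IS NULL",
            (customer,),
        )
    return cur.rowcount == 1   # 1 row changed means this customer got the ticket

first = book("customer A")
second = book("customer B")
owner = conn.execute("SELECT booked_by FROM ticket WHERE id = 1").fetchone()[0]
```

The second attempt changes no rows, so the ticket stays with the first customer.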

File System vs. Database Approach

1. File-based system: The data and programs are interdependent.
   Database system: The data and programs are independent of each other.
2. File-based system: Causes data redundancy; the data may be duplicated
   in different files.
   Database system: Controls data redundancy; ideally each data item
   appears only once in the system.
3. File-based system: Causes data inconsistency; copies of the same data
   in different files may differ.
   Database system: Keeps data consistent, because each data item is
   ideally stored only once.
4. File-based system: The data cannot be shared easily because it is
   distributed over different files.
   Database system: Data is easily shared because it is stored in one place.
5. File-based system: Data is widely spread, so the system provides poor
   security.
   Database system: Provides many methods to maintain data security.
6. File-based system: Does not provide consistency constraints.
   Database system: Provides various consistency constraints to maintain
   data integrity.
7. File-based system: Is a less complex system.
   Database system: Is a very complex system.
8. File-based system: Generating different reports to take a crucial
   decision is very difficult.
   Database system: Reports can be generated very easily in the required
   format.
9. File-based system: Takes much space in the system, and memory is wasted.
   Database system: Stores data more efficiently; it takes less space and
   memory is not wasted.
10. File-based system: Does not provide a concurrency facility.
    Database system: Provides a concurrency facility.
11. File-based system: Does not provide data atomicity.
    Database system: Provides data atomicity.

1.3 Data Abstraction

Data abstraction is the process of hiding irrelevant or unwanted details
from the end user.

Let us understand this with an example. If you go to a shop to buy a pair
of shoes, you ask the shopkeeper to show you shoes of a certain company,
and you also tell the shopkeeper the size, color and material you want.
You then look only at those specified things in the shoes. Would you also
ask the shopkeeper questions such as: Where are these shoes made? Where
does the material come from? What is the cost of the material?

The answer is no. You will not ask these questions because they are of no
use to you. You are only concerned about a few things, such as the company,
size, color, material and how the shoes look. That is why the unimportant
details are kept hidden from the end user. This is the process we call
data abstraction.

What is Data abstraction in Database Management System?

A database system contains intricate data structures and relationships.
Developers hide this complexity from the users so that they can
comfortably access just the data they want; this is done with the help of
data abstraction.

The main purpose of data abstraction is to hide irrelevant details and
provide an abstract view of the data. Users see only the data relevant to
them and can access it without any hassle, and the system can also work
efficiently.

In a DBMS, data abstraction is performed in layers, which means there are
levels of data abstraction; these are studied next. The database
management system is designed on the basis of these levels.
Levels of Data Abstraction in DBMS

In DBMS, there are three levels of data abstraction, which are as follows:

Fig. 1 Levels of Data Abstraction in DBMS

1. Physical or Internal Level:
The physical or internal level is the lowest level of data abstraction in
the database management system. It defines how data is actually stored in
the database and the methods used to access it. It describes complex data
structures in detail, so it is hard to understand, which is why it is kept
hidden from the end user.
Database Administrators (DBAs) decide how to arrange the data and where to
store it. The DBA is the person whose role is to manage the data in the
database at the physical or internal level.
At this level, the raw data is stored in detail on storage devices such as
hard drives, for example in a data center.

2. Logical or Conceptual Level:
The logical or conceptual level is the intermediate level of data
abstraction. It describes what data is stored in the database and what
the relationships among those data are.
It describes the structure of the entire data in the form of tables. The
logical or conceptual level is less complex than the physical level.
At the logical level, Database Administrators (DBAs) abstract the data
from the raw data present at the physical level.

3. View or External Level:
The view or external level is the highest level of data abstraction. There
are different views at this level, each defining a part of the overall
data of the database. This level is for end-user interaction; at this
level, end users can access the data based on their queries.
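
As a minimal sketch of the view level, the snippet below uses `sqlite3` to define a view that exposes only part of a table. The `student` table, its columns and the view name `student_public` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table, including columns most users should not see.
conn.execute("""
    CREATE TABLE student (
        roll INTEGER PRIMARY KEY, name TEXT, address TEXT, fees_due INTEGER
    )
""")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Nagpur", 0), (2, "Ravi", "Pune", 1200)],
)

# View level: an external view that exposes only roll number and name.
conn.execute("CREATE VIEW student_public AS SELECT roll, name FROM student")

cursor = conn.execute("SELECT * FROM student_public ORDER BY roll")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
```

A user who queries `student_public` sees two columns and never learns how, or even whether, addresses and fees are stored.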

Advantages of data abstraction in DBMS

• Users can easily access the data based on their queries.
• It provides security to the data stored in the database.
• Database systems work efficiently because of data abstraction.

1.4 Data Sublanguages

A DBMS has appropriate languages and interfaces to express database
queries and updates. Database languages can be used to read, store and
update the data in the database.

Types of Database Languages

Fig. 1.2 Database Sublanguages

1.4.1 Data Definition Language (DDL)

o DDL stands for Data Definition Language. It is used to define the
database structure or pattern.
o It is used to create schemas, tables, indexes, constraints, etc. in the
database.
o Using DDL statements, you can create the skeleton of the database.
o The result of DDL statements is stored as metadata, such as the number
of tables and schemas, their names, indexes, the columns in each
table, constraints, etc.
Here are some tasks that come under DDL:
o Create: It is used to create objects in the database.
o Alter: It is used to alter the structure of the database.
o Drop: It is used to delete objects from the database.
o Truncate: It is used to remove all records from a table.
o Rename: It is used to rename an object.
o Comment: It is used to add comments to the data dictionary.

These commands are used to update the database schema; that is why
they come under Data Definition Language.
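
A short sketch of some of these DDL tasks, run through Python's `sqlite3` module, is shown below. SQLite is used only because it ships with Python; it has no TRUNCATE or COMMENT statement, so those two tasks are not shown. The table and column names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names():
    """List the tables currently defined, read from SQLite's own catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")  # Create
conn.execute("ALTER TABLE student ADD COLUMN city TEXT")                    # Alter
conn.execute("ALTER TABLE student RENAME TO learner")                       # Rename
after_rename = table_names()

conn.execute("DROP TABLE learner")                                          # Drop
after_drop = table_names()
```

None of these statements touches any row of data; they only change the definition of what can be stored.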
1.4.2 Data Manipulation Language (DML)
DML stands for Data Manipulation Language. It is used for accessing
and manipulating data in a database. It handles user requests.
Here are some tasks that come under DML:

o Select: It is used to retrieve data from a database.
o Insert: It is used to insert data into a table.
o Update: It is used to update existing data within a table.
o Delete: It is used to delete records from a table (all records, if no
condition is given).
o Merge: It performs an UPSERT operation, i.e., an insert or an update.
o Call: It is used to call a PL/SQL or Java subprogram.
o Explain Plan: It describes the access path that will be used to execute
a statement.
o Lock Table: It controls concurrency.
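
The four core DML tasks can be sketched with `sqlite3` as follows. The `student` table and its rows are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# Insert: add rows to the table.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Nagpur"), (2, "Ravi", "Pune"), (3, "Meena", "Nagpur")],
)

# Update: change existing data in selected rows.
conn.execute("UPDATE student SET city = 'Mumbai' WHERE roll = 2")

# Delete: remove only the rows matching the condition.
conn.execute("DELETE FROM student WHERE roll = 3")

# Select: retrieve data.
remaining = conn.execute(
    "SELECT roll, name, city FROM student ORDER BY roll").fetchall()
```

Note that the DELETE carries a WHERE clause; without it, every row of the table would be removed.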

1.4.3 Data Control Language (DCL)

o DCL stands for Data Control Language. It is used to control access to
the data stored in the database by granting and revoking privileges.
o In some systems DCL execution is transactional and can be rolled back.
(In Oracle Database, however, the execution of data control language
cannot be rolled back.)
Here are some tasks that come under DCL:
o Grant: It is used to give user access privileges to a database.
o Revoke: It is used to take back permissions from the user.
Privileges that can be granted or revoked typically include SELECT, INSERT,
UPDATE, DELETE and EXECUTE.

1.4.4 Transaction Control Language (TCL)
TCL is used to manage the changes made by DML statements. It allows DML
statements to be grouped into logical transactions.
Here are some tasks that come under TCL:
o Commit: It is used to save the transaction's changes permanently in the database.
o Rollback: It is used to restore the database to its state as of the last Commit.

1.5 Various Data Models
The three data models are:
1. The Relational Approach
2. The Hierarchical Approach
3. The Network Approach
1) The Relational Approach
The tables below show the sample data in relational form; that is, they
represent a relational view of the data.
The data is organized into three tables: S (Suppliers), P (Parts) and SP
(Shipments). The S table contains, for each supplier, a supplier number, name,
status code and location. The P table contains, for each part, a part number,
name, color, weight and the location where the part is stored. The SP table
contains, for each shipment, a supplier number, a part number and the quantity
shipped. It is assumed that each supplier has a unique supplier number and
exactly one name, status code and location; that each part has a unique part
number and exactly one name, color, weight and location; and that, at any given
time, no more than one shipment exists for a given supplier/part combination.

Sno   Name     Status   City
S1    Suneet   20       Qadian
S2    Ankit    10       Amritsar
S3    Amit     10       Amritsar
Supplier Records

Pno   Name    Color   Weight   City
P1    Nut     Red     12       Qadian
P2    Bolt    Green   17       Amritsar
P3    Screw   Blue    17       Jalandar
P4    Screw   Red     14       Qadian
Parts Records

2) The Hierarchical Approach
In the hierarchical data model, records are linked to the superior records
on which they depend and also to the records that depend on them. A tree
structure establishes one-to-many relationships. The figure illustrates the
structure of a family. Parents can have many children, exhibiting
one-to-many relationships. The great grandparent record is the root of the
tree. The grandparents, parents and children are the nodes or dependents of
the root. In general, a root may have any number of dependents. Each of
these dependents may have any number of lower-level dependents, and so on,
with no restriction on the number of levels.

For the supplier-part data, the tree structure has the part record superior
to the supplier record. That is, parts form the parents and suppliers form
the children. Each of the four trees in the figure consists of one part
record occurrence, together with a set of subordinate supplier record
occurrences. There is one supplier record for each supplier of a particular
part. Each supplier occurrence includes the corresponding shipment quantity.

Fig. 1.4 Hierarchical Model

For example, supplier S3 supplies 300 quantities of part P2. Note that the set
of supplier occurrences for a given part occurrence may contain any number
of members, including zero (for the case of part P4). Part P1 is supplied by two
suppliers, S1 and S2. Part P2 is supplied by three suppliers, S1, S2 and S3, and
part P3 is supplied by only supplier S1, as shown in the figure.
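
The hierarchical view of the supplier-part data can be sketched as a nested Python dictionary, with each part as a parent and its suppliers as children. Only the S3/P2 quantity of 300 is stated in the text; the other quantities are placeholder values assumed for illustration, as are the function names.

```python
# Each part record (parent) owns its supplier records (children).
# Child value = shipment quantity. Only S3/P2 = 300 comes from the text;
# every other quantity is a placeholder assumed for illustration.
parts = {
    "P1": {"S1": 100, "S2": 100},
    "P2": {"S1": 100, "S2": 100, "S3": 300},
    "P3": {"S1": 100},
    "P4": {},                      # a part with no suppliers at all
}

def suppliers_of(part):
    """Top-down question: follow one tree from its root."""
    return sorted(parts[part])

def parts_supplied_by(supplier):
    """Bottom-up question: every tree has to be searched."""
    return sorted(p for p, children in parts.items() if supplier in children)

p2_suppliers = suppliers_of("P2")
p4_suppliers = suppliers_of("P4")
s2_parts = parts_supplied_by("S2")
```

The asymmetry is the notable point: asking for the suppliers of a part follows the tree directly, while asking for the parts of a supplier means scanning all the trees.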
3) The Network Approach
Considering again the sample supplier-part database, its network view is
shown. In addition to the part and supplier record types, a third record type
is introduced, which we will call the connector. A connector occurrence
specifies the association (shipment) between one supplier and one part. It
contains data (quantity of the parts supplied) describing the association
between supplier and part records.

Fig. 1.5 Network Model

All connector occurrences for a given supplier are placed on a chain. The
chain starts from a supplier and finally returns to the supplier. Similarly, all
connector occurrences for a given part are placed on a chain starting from
the part and finally returning to the same part.
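
The connector idea can be sketched in Python as follows. The class name `Connector` and the use of per-supplier and per-part lists in place of pointer chains are simplifying assumptions; only the S3/P2 quantity of 300 comes from the text, and the other quantities are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Connector:
    """One shipment: links exactly one supplier to exactly one part."""
    supplier: str
    part: str
    quantity: int

# Only S3/P2 = 300 is from the text; other quantities are placeholders.
connectors = [
    Connector("S1", "P1", 100), Connector("S2", "P1", 100),
    Connector("S1", "P2", 100), Connector("S2", "P2", 100),
    Connector("S3", "P2", 300),
    Connector("S1", "P3", 100),
]

# Every connector sits on two chains: its supplier's and its part's.
supplier_chain = defaultdict(list)
part_chain = defaultdict(list)
for c in connectors:
    supplier_chain[c.supplier].append(c)
    part_chain[c.part].append(c)

parts_of_s1 = [c.part for c in supplier_chain["S1"]]
suppliers_of_p2 = [c.supplier for c in part_chain["P2"]]
```

Unlike the hierarchical sketch, both directions are now equally cheap: following a supplier's chain gives its parts, and following a part's chain gives its suppliers.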

1.6 Data Independence
o Data independence can be explained using the three-schema
architecture.
o Data independence refers to the ability to modify the schema at one
level of the database system without altering the schema at the next
higher level.
Logical Data Independence
o Logical data independence refers to the ability to change the
conceptual schema without having to change the external schema.
o Logical data independence is used to separate the external level from
the conceptual view.
o If we make any changes in the conceptual view of the data, the user
view of the data is not affected.
o Logical data independence occurs at the user interface level.
Physical Data Independence
o Physical data independence can be defined as the capacity to change
the internal schema without having to change the conceptual schema.
o If we change the storage structures or the storage size of the database
server, the conceptual structure of the database is not affected.
o Physical data independence is used to separate the conceptual level from
the internal level.
o Physical data independence occurs at the logical interface level.
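
Both kinds of independence can be sketched with `sqlite3`. The snippet reuses the Parts Records data from section 1.5; the view name `part_names`, the index name and the added `weight` column are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (pno TEXT PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [("P1", "Nut", "Qadian"), ("P2", "Bolt", "Amritsar"),
     ("P3", "Screw", "Jalandar"), ("P4", "Screw", "Qadian")],
)
# External schema: what the application actually queries.
conn.execute("CREATE VIEW part_names AS SELECT pno, name FROM part")

query = "SELECT pno, name FROM part_names ORDER BY pno"
before = conn.execute(query).fetchall()

# Physical change: add an index (an internal storage structure).
conn.execute("CREATE INDEX idx_part_city ON part (city)")
after_index = conn.execute(query).fetchall()

# Conceptual change: add a column to the underlying table.
conn.execute("ALTER TABLE part ADD COLUMN weight INTEGER")
after_column = conn.execute(query).fetchall()
```

The application's query text never changes and returns the same rows after both modifications, which is the practical meaning of physical and logical data independence.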

Fig. 1.6 Data Independence

1.7 Structures or Components of DBMS

A typical structure of a DBMS, with its components and the relationships
between them, is shown in Fig. 1.7. The DBMS software is partitioned into
several modules. Each module or component is assigned a specific operation to
perform. Some of the functions of the DBMS are supported by the operating
system (OS), which provides basic services; the DBMS is built on top of it.
The physical data and the system catalog are stored on a physical disk. Access
to the disk is controlled primarily by the OS, which schedules disk
input/output. Therefore, while designing a DBMS, its interface with the OS
must be taken into account.

Fig. 1.7 Structures or Components of DBMS

1. Query Processor: It translates the requests (queries) received from the
end user via an application program into instructions. It also executes the
user request received from the DML compiler.
The Query Processor contains the following components –
• DML Compiler: It translates DML statements into low-level
instructions (machine language) so that they can be executed.
• DDL Interpreter: It processes DDL statements into a set of tables
containing metadata (data about data).
• Embedded DML Pre-compiler: It converts DML statements embedded
in an application program into procedural calls.
• Query Optimizer: It chooses an efficient way to carry out the
instructions generated by the DML compiler before they are executed.
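
What the query processor decides can be observed in SQLite through its `EXPLAIN QUERY PLAN` statement, as in the sketch below. The `part` table, the index name and the helper `plan` are assumptions for illustration, and the exact wording of the plan text differs between SQLite versions, so none is quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (pno TEXT PRIMARY KEY, name TEXT, city TEXT)")

def plan(sql):
    """Return the plan chosen for `sql` as one string.

    The last column of each EXPLAIN QUERY PLAN row is a readable description.
    """
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT name FROM part WHERE city = 'Qadian'"

plan_without_index = plan(query)      # no index on city: full table scan
conn.execute("CREATE INDEX idx_part_city ON part (city)")
plan_with_index = plan(query)         # the optimizer can now use the index

print(plan_without_index)
print(plan_with_index)
```

The query is identical in both cases; only the strategy selected for it changes once a better access path exists.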
2. Storage Manager: The Storage Manager is a program that provides an
interface between the data stored in the database and the queries received.
It is also known as the Database Control System. It maintains the consistency
and integrity of the database by applying the constraints and executing
the DCL statements. It is responsible for updating, storing, deleting and
retrieving data in the database.
It contains the following components –
• Authorization Manager: It ensures role-based access control, i.e.,
it checks whether a particular person is privileged to perform the
requested operation.
• Integrity Manager: It checks the integrity constraints when the
database is modified.
• Transaction Manager: It controls concurrent access by scheduling the
operations of the transactions it receives. Thus, it ensures that the
database remains in a consistent state before and after the execution
of a transaction.

• File Manager: It manages the file space and the data structures used to
represent information in the database.
• Buffer Manager: It is responsible for cache memory and the transfer of
data between secondary storage and main memory.

3. Disk Storage: It contains the following components –
• Data Files: They store the data.
• Data Dictionary: It contains information about the structure of every
database object. It is the repository of information that governs the
metadata.
• Indices: They provide faster retrieval of data items.

Data Dictionary
A data dictionary is a repository of metadata. The data dictionary of Oracle
is stored in the SYS schema.
Each Oracle database has a data dictionary, which is a set of tables and views
that serve as a reference about the database.
For example, a data dictionary stores information about both
the logical and the physical structure of the database.
A data dictionary also stores the valid users of an Oracle database, information
about integrity constraints defined for tables in the database, the amount
of space allocated for a schema object and how much of that space is in use,
among much other information.
A data dictionary is created when a database is created. To accurately reflect
the status of the database at all times, the data dictionary is automatically
updated by Oracle Database in response to specific actions, such as when the
structure of the database is altered. Database users cannot modify the data
dictionary. Various database processes rely on the data dictionary to record,
verify and conduct ongoing work. For example, during database operation,
Oracle Database reads the data dictionary to verify that schema objects exist
and that users have proper access to them.
Examples of data dictionary views:
USER_TABLES
USER_VIEWS
USER_CONSTRAINTS
USER_INDEXES
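
The Oracle views listed above cannot be queried without an Oracle installation, so the sketch below uses the closest SQLite analogue instead: the built-in `sqlite_master` catalog table and the `table_info` pragma. This substitution, and the sample objects created, are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_student_name ON student (name)")
conn.execute("CREATE VIEW student_names AS SELECT name FROM student")

# The catalog is itself queried with ordinary SELECT statements.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name").fetchall()

# Column metadata for one table; field 1 of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
```

As in Oracle, the catalog entries were never inserted by the user; the DBMS recorded them automatically when the CREATE statements ran.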

1.8 Three-Level Schema Architecture of Database

Data is actually stored as bits, or as numbers and strings, but it is
difficult to work with data at this level.
Schema:
• A description of data at some level. Each level has its own schema.
We will be concerned with three forms of schemas:
• physical,
• conceptual, and
• external
Physical Data Level
The physical schema describes the details of how data is stored: files,
indices, etc. on the random-access disk system. It also typically describes
the record layout of files and the type of files (hash, B-tree, flat).
Early applications worked at this level and dealt explicitly with such
details, e.g. minimizing physical distances between related data and
organizing the data structures within the file (blocked records, linked lists
of blocks, etc.).
• Routines are hardcoded to deal with the physical representation.
Problem:
• Changes to data structures are difficult to make.
• Application code becomes complex since it must deal with details.
• Rapid implementation of new features is very difficult.
Conceptual Data Level
Also referred to as the logical level. It hides the details of the physical
level.
• In the relational model, the conceptual schema presents data as a set
of tables.
The DBMS maps data access from the conceptual schema to the physical schema
automatically.
• The physical schema can be changed without changing the application:
• The DBMS must change the mapping from conceptual to physical.
• This is referred to as physical data independence.

1.9 Entity Relationship Diagram

An Entity–Relationship model (ER model) describes the structure of a
database with the help of a diagram, which is known as an Entity Relationship
Diagram (ER Diagram). An ER model is a design or blueprint of a database
that can later be implemented as a database. The main components of the E-R
model are the entity set and the relationship set.

What is an Entity Relationship Diagram (ER Diagram)? An ER diagram
shows the relationships among entity sets. An entity set is a group of similar
entities, and these entities can have attributes. In terms of DBMS, an entity
is a table or an attribute of a table in the database, so by showing the
relationships among tables and their attributes, an ER diagram shows the
complete logical structure of a database. Let's have a look at a simple ER
diagram to understand this concept.
Facts about ER Diagram Model:
• The ER model allows you to draw a database design.
• It is an easy-to-use graphical tool for modeling data.
• It is widely used in database design.
• It is a graphical representation of the logical structure of a database.
• It helps you to identify the entities which exist in a system and the
relationships between those entities.
Why use ER Diagrams?
Here are the prime reasons for using ER diagrams:
• They help you to define terms related to entity relationship modeling.
• They provide a preview of how all your tables should connect and what
fields are going to be on each table.
• They help to describe entities, attributes and relationships.
• ER diagrams are translatable into relational tables, which allows you to
build databases quickly.
• ER diagrams can be used by database designers as a blueprint for
implementing data in specific software applications.
A simple ER Diagram: In the following diagram we have two entities, Student and
College, and their relationship. The relationship between Student and College is
many to one, as a college can have many students but a student cannot study
in multiple colleges at the same time. The Student entity has attributes such as
Stu_Id, Stu_Name & Stu_Addr, and the College entity has attributes such as
Col_ID & Col_Name.
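
One common way to carry this many-to-one relationship into tables is a foreign key on the "many" side, sketched below with `sqlite3`. The attribute names come from the diagram description above; the sample rows and the decision to put `Col_ID` into the Student table are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when asked

conn.execute("CREATE TABLE College (Col_ID INTEGER PRIMARY KEY, Col_Name TEXT)")
conn.execute("""
    CREATE TABLE Student (
        Stu_Id   INTEGER PRIMARY KEY,
        Stu_Name TEXT,
        Stu_Addr TEXT,
        -- exactly one college per student: the "one" side of many-to-one
        Col_ID   INTEGER NOT NULL REFERENCES College (Col_ID)
    )
""")

conn.execute("INSERT INTO College VALUES (1, 'College A')")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(101, "Asha", "Nagpur", 1), (102, "Ravi", "Pune", 1)],  # many students, one college
)
students_in_college = conn.execute(
    "SELECT COUNT(*) FROM Student WHERE Col_ID = 1").fetchone()[0]

# A student pointing at a college that does not exist is rejected.
try:
    conn.execute("INSERT INTO Student VALUES (103, 'Meena', 'Nagpur', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Because each Student row holds a single `Col_ID`, a student cannot belong to two colleges at once, while any number of students may share the same college.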
In the above example, "Trans No" is a discriminator within a group of transactions in an
ATM. Let's learn more about a weak entity by comparing it with a strong entity.

2. One to Many Relationship: When a single instance of an entity is associated with
more than one instance of another entity, it is called a one to many relationship.
For example, a customer can place many orders, but an order cannot be placed by
many customers.

EER: Enhanced Entity Relationship Diagram
Generalization
o Generalization is a bottom-up approach in which two or more
lower-level entities combine to form a higher-level entity if they have
some attributes in common.
o In generalization, a higher-level entity can also combine with
lower-level entities to form a still higher-level entity.
o Generalization resembles a subclass and superclass system; the
only difference is the approach. Generalization uses the bottom-up
approach.
o In generalization, entities are combined to form a more generalized
entity, i.e., subclasses are combined to make a superclass.
For example, the Faculty and Student entities can be generalized to create a
higher-level entity Person.
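
Since the text compares generalization to a subclass and superclass system, the Faculty/Student/Person example can be sketched with Python classes. The attributes chosen (`name`, `address`, `department`, `roll_no`) and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Person:
    """Higher-level entity: holds the attributes the lower-level entities share."""
    name: str
    address: str

@dataclass
class Faculty(Person):
    department: str      # attribute specific to Faculty

@dataclass
class Student(Person):
    roll_no: int         # attribute specific to Student

people = [
    Faculty("Dr. Rao", "Nagpur", "CSE"),
    Student("Asha", "Pune", 42),
]

# Anything written for Person works for both lower-level entities.
names = [p.name for p in people]
all_are_persons = all(isinstance(p, Person) for p in people)
```

The common attributes are defined once on the generalized entity, and each lower-level entity adds only what is specific to it.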

Specialization
o Specialization is a top-down approach, the opposite of
generalization. In specialization, one higher-level entity is broken
down into two or more lower-level entities.
o Specialization is used to identify a subset of an entity set that shares
some distinguishing characteristics.
o Normally, the superclass is defined first, the subclasses and their related
attributes are defined next, and the relationship sets are then added.
For example: in an employee management system, the EMPLOYEE entity can be
specialized as TESTER or DEVELOPER based on the role the employee plays in
the company.

Aggregation
In aggregation, the relationship between two entities is treated as a single
entity: a relationship, together with its corresponding entities, is
aggregated into a higher-level entity.
For example: the Center entity and the Course entity it offers act together
as a single entity, which is in a relationship with another entity, Visitor.
In the real world, if a visitor visits a coaching center, he will never
enquire about the course only or just about the center; instead he will
enquire about both.

ER Diagram for University database

Schema Diagram for University Database

1.10 Relational Model
Relation
A relation, also known as a table or file, is a subset of the Cartesian product
of a list of domains, characterized by a name. Within a table, each row
represents a group of related data values. A row, or record, is also known as
a tuple. A column in a table is a field and is also referred to as an attribute.
You can also think of it this way: an attribute is used to define the record,
and a record contains a set of attributes.

The steps below outline the logic between a relation and its domains.

1. Let the n domains be denoted by D1, D2, …, Dn

2. And r is a relation defined on these domains

3. Then r ⊆ D1×D2×…×Dn
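The three steps above can be checked on a small case. The sketch below is only an illustration: the two domains and the tuples in r are made-up values, not data from these notes. It builds the Cartesian product with the standard itertools.product and tests whether r is a subset of it.

```python
from itertools import product

# Hypothetical domains, chosen only for illustration.
D1 = {"S1", "S2"}            # supplier numbers
D2 = {"London", "Paris"}     # cities

# The Cartesian product D1 x D2 holds every possible (supplier, city) pair.
cartesian = set(product(D1, D2))

# A relation r is any subset of that product: the pairs that actually hold.
r = {("S1", "London"), ("S2", "Paris")}

# A tuple that uses a value outside the domains cannot belong to a relation on them.
not_r = {("S3", "Rome")}

r_is_relation = r <= cartesian          # subset test
not_r_is_relation = not_r <= cartesian  # fails, "S3" and "Rome" are outside the domains
```

Here r uses two of the four possible pairs, so it is a valid relation on D1 and D2, while not_r is not.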

Table
A database is composed of multiple tables and each table holds the data.
Figure 7.1 shows a database that contains three tables.

Domain
A domain is the set of atomic values used to model data. By atomic value, we
mean that each value in the domain is indivisible as far as the relational
model is concerned. For example:
• The domain of Marital Status has a set of possibilities: Married, Single,
Divorced.
• The domain of Shift has the set of all possible days: {Mon, Tue, Wed…}.
• The domain of Salary is the set of all floating-point numbers greater than
0 and less than 200,000.
• The domain of First Name is the set of character strings that represent
names of people.
In summary, a domain is a set of acceptable values that a column is allowed to
contain. This is based on various properties and the data type for the column.
We will discuss data types in another chapter.
Degree
The degree is the number of attributes in a table.
Example of the relational data model:

S
S#   Sname   Status   City
S1   Smith   20       London
S2   Jones   10       Paris
S3   Blake   30       Paris

P
P#   PNAME   COLOR   WEIGHT   CITY
P1   Nut     Red     12       London
P2   Bolt    Green   17       Paris
P3   Screw   Blue    17       Rome
P4   Screw   Red     14       London

SP
S#   P#   SP#
S1   P1   300
S1   P2   200
S1   P3   400
S2   P1   300
S2   P2   400
S3   P2   200

Schema
A database schema is the skeleton structure that represents the logical view
of the entire database. It defines how the data is organized and how the
relations among the data are associated. It formulates all the constraints that
are to be applied to the data.
A database schema defines its entities and the relationship among them. It
contains a descriptive detail of the database, which can be depicted by means
of schema diagrams. It’s the database designers who design the schema to
help programmers understand the database and make it useful.

A database schema can be divided broadly into two categories −

• Physical Database Schema − This schema pertains to the actual storage
of data and its form of storage, such as files, indices, etc. It defines how
the data will be stored in secondary storage.
• Logical Database Schema − This schema defines all the logical constraints
that need to be applied to the data stored. It defines tables, views, and
integrity constraints.

Instance

It is important to distinguish the terms schema and instance. The database
schema is the skeleton of the database. It is designed before the database
exists, and once the database is operational it is very difficult to change.
A database schema does not contain any data or information.

A database instance is the state of an operational database, with its data, at
a given point in time. It contains a snapshot of the database. Database
instances tend to change with time. A DBMS ensures that every instance (state)
is valid by diligently enforcing all the validations, constraints, and
conditions that the database designers have imposed.

The data stored in a database at a particular moment is called an instance of
the database. The database schema defines the variable declarations in the
tables that belong to a particular database; the values of these variables at a
given moment are called the instance of that database.
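The difference between schema and instance can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The student table and its rows are hypothetical, and SQLite stands in for whatever DBMS is actually in use. The schema is read from SQLite's catalog table sqlite_master; the instance is simply the current contents of the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: defined once, before any data exists.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def schema(conn):
    # SQLite keeps each table's CREATE statement in its catalog, sqlite_master.
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [row[0] for row in rows]

def instance(conn):
    # The instance is the data held in the table at this moment.
    return conn.execute(
        "SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()

schema_before = schema(conn)
instance_t0 = instance(conn)          # no rows yet

conn.execute("INSERT INTO student VALUES (1, 'Asha')")
instance_t1 = instance(conn)          # one row

conn.execute("INSERT INTO student VALUES (2, 'Ravi')")
instance_t2 = instance(conn)          # two rows

schema_after = schema(conn)           # unchanged by the inserts
```

The three instances differ from one another, while schema_before and schema_after are identical.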

Different types of keys that are used in RDBMS

The different keys used in a database are:

• Primary Key
• Super Key
• Candidate Key
• Foreign Key
Primary key
The primary key of a table never contains null or duplicate values. Thus, it is
used to identify each tuple of a table uniquely. For example, the roll number
attribute of a student record acts as a primary key because every student must
have a unique roll number.
Super key
A super key is a set of one or more attributes that together identify every
tuple of the table uniquely. A table may have more than one super key,
depending on the possible combinations of its attributes. Adding further
attributes to the primary key of a table always yields a super key.
Candidate key
A candidate key is a minimal super key: a super key from which no attribute can
be removed without losing the ability to identify tuples uniquely. A table may
have several candidate keys; one of them is chosen as the primary key.
Foreign Key

A foreign key is a column (or set of columns) of a table that is used to
maintain a relationship with another table of the database. The foreign key of
a table should refer to the primary key (or a unique key) of the other table.
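The relationship between super keys and candidate keys can be sketched in a few lines of Python. The STUDENT rows and the helper names is_super_key and candidate_keys are hypothetical, introduced only for this illustration. Note the caveat: keys are declared by the database designer as a property of the schema. Testing uniqueness on one sample of data, as done here, can only show which attribute sets are consistent with being a key.

```python
from itertools import combinations

# Hypothetical sample data, for illustration only.
ATTRS = ("roll_no", "email", "name", "city")
STUDENT = [
    (1, "a@x.org", "Asha", "Nagpur"),
    (2, "b@x.org", "Ravi", "Nagpur"),
    (3, "c@x.org", "Asha", "Nagpur"),
    (4, "d@x.org", "Ravi", "Pune"),
]

def is_super_key(rows, attrs, key):
    # A super key never yields the same combination of values for two rows.
    idx = [attrs.index(a) for a in key]
    projected = {tuple(row[i] for i in idx) for row in rows}
    return len(projected) == len(rows)

def candidate_keys(rows, attrs):
    # Try small attribute sets first; keep a super key only if no smaller
    # key already found is contained in it (that is, only if it is minimal).
    found = []
    for size in range(1, len(attrs) + 1):
        for key in combinations(attrs, size):
            if is_super_key(rows, attrs, key) and not any(
                    set(c) <= set(key) for c in found):
                found.append(key)
    return found

keys = candidate_keys(STUDENT, ATTRS)
```

On this sample, ("roll_no", "name") is a super key but not a candidate key, because roll_no alone already identifies every row. Either of the two candidate keys found, roll_no or email, could be chosen as the primary key.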

Integrity constraints
Integrity constraints provide a way to control the consistency and correctness
of data. The various types of integrity constraints are:
• Referential integrity
• Entity integrity
• Domain integrity
• Null integrity
1. Referential Integrity
Referential integrity ensures that the relationships between tables are
preserved when you insert, delete, and modify data. In SQL Server 2000,
referential integrity is based on relationships between foreign keys and
primary keys or between foreign keys and unique keys. Referential integrity
ensures that key values are consistent across the related tables. When
referential integrity is enforced, SQL Server 2000 prevents users from adding
records to a related table if there is no associated record in the primary table.
Users are also prevented from changing values in a primary table or deleting
records from the primary table if there are related records in the related table.
A foreign key constraint prevents actions that would violate a reference
between two database tables. A foreign key value in one table refers to the
corresponding primary key value in another table. Consider the following
example where you need to create a database table with a foreign key
constraint.
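A minimal sketch of such a table follows. It is not SQL Server 2000 code: it uses Python's built-in sqlite3 module with an in-memory SQLite database, and the department and employee tables are hypothetical. The behaviour it demonstrates is the one described above: a child row without a parent is rejected, and a parent row that still has children cannot be deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dname TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        ename   TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    )
""")

conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")   # parent row exists

def violates(sql):
    # Returns True when the DBMS rejects the statement on integrity grounds.
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Department 99 does not exist, so this employee would be an orphan.
orphan_insert_blocked = violates("INSERT INTO employee VALUES (2, 'Ravi', 99)")
# Department 10 still has an employee, so it cannot be deleted.
parent_delete_blocked = violates("DELETE FROM department WHERE dept_id = 10")
```

Both statements are refused, so the employee table still holds exactly one row and every dept_id in it matches a row in department.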
2. Entity Integrity
Entity integrity is a mechanism that ensures every row in a table can be
uniquely identified. This is done with either primary keys or unique keys,
which prevent duplicate rows. You can apply entity integrity to a table by
specifying a PRIMARY KEY constraint. A primary key helps retrieve data records
uniquely from a database table. Each table should have a primary key constraint
to uniquely identify each row in the database.
There can be only one primary key constraint for each table.
3. Domain Integrity
Domain integrity concerns the validity of entries for a given column. Selecting
the appropriate data type for a column is the first step in maintaining domain
integrity. Other steps could include setting up appropriate constraints and
rules to define the data format and/or restrict the range of possible values.
4. Null Attribute
Sometimes it is required that certain attributes cannot have null values. For
example, if every EMPLOYEE must have a valid name then the Name attribute is
constrained to be NOT NULL.
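Entity, domain, and null integrity can all be declared on a single table. The sketch below assumes an in-memory SQLite database driven from Python's built-in sqlite3 module. The employee table is hypothetical, and the salary range simply reuses the Salary domain given earlier (greater than 0 and less than 200,000).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,                          -- entity integrity
        name   TEXT NOT NULL,                                -- null constraint
        salary REAL CHECK (salary > 0 AND salary < 200000)   -- domain integrity
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 50000)")

def rejected(sql):
    # Returns True when the DBMS refuses the statement on integrity grounds.
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_key_rejected = rejected("INSERT INTO employee VALUES (1, 'Ravi', 40000)")
null_name_rejected = rejected("INSERT INTO employee VALUES (2, NULL, 40000)")
bad_salary_rejected = rejected("INSERT INTO employee VALUES (3, 'Meera', -5)")
```

Each of the three inserts breaks exactly one rule: a repeated primary key, a missing name, and a salary outside the allowed range.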

1.11 Codd's 12 rules for defining a fully relational database


A relational database management system (RDBMS) is a database
management system (DBMS) that is based on the relational model as
introduced by E. F. Codd. Most popular commercial and open source databases
currently in use are based on the relational model.

A short definition of an RDBMS is a DBMS in which data is stored in the form of
tables and the relationships among the data are also stored in the form of
tables.

E. F. Codd, the famous mathematician, introduced 12 rules for the relational
model for databases, commonly known as Codd's rules. The rules mainly define
what is required for a DBMS to be considered relational, i.e., an RDBMS. There
is also one more rule, Rule 0, which specifies that a relational system should
use its relational facilities to manage the database. The rules and their
descriptions are as follows:
Rule 0: Foundation Rule
A relational database management system should be capable of using its
relational facilities (exclusively) to manage the database.

Rule 1: Information Rule

All information in the database is to be represented in one and only one way.
This is achieved by values in column positions within rows of tables.

Rule 2: Guaranteed Access Rule

All data must be accessible without ambiguity; that is, each and every datum
(atomic value) is guaranteed to be logically accessible by resorting to a
combination of table name, primary key value and column name.

Rule 3: Systematic treatment of null values

Null values (distinct from the empty character string or a string of blank
characters, and distinct from zero or any other number) are supported in a
fully relational DBMS for representing missing information in a systematic
way, independent of data type.

Rule 4: Dynamic On-line Catalog Based on the Relational Model

The database description is represented at the logical level in the same way as
ordinary data, so authorized users can apply the same relational language to
its interrogation as they apply to regular data. Authorized users can therefore
access the database structure using the same common language, e.g. SQL.
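Rule 4 can be observed in most SQL systems. As a small sketch, SQLite exposes its catalog as an ordinary table called sqlite_master, which can be queried with a normal SELECT. The supplier table and paris_suppliers view below are hypothetical, and other DBMSs name their catalog tables differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (sno TEXT PRIMARY KEY, sname TEXT, city TEXT)")
conn.execute(
    "CREATE VIEW paris_suppliers AS "
    "SELECT sno, sname FROM supplier WHERE city = 'Paris'")

# The catalog is itself a table, so the same query language interrogates it.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master "
    "WHERE type IN ('table', 'view') ORDER BY name"
).fetchall()
```

The query returns one row for the view and one for the table, which is the database describing its own structure in relational form.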

Rule 5: Comprehensive Data Sublanguage Rule

A relational system may support several languages and various modes of
terminal use. However, there must be at least one language whose statements
are expressible, per some well-defined syntax, as character strings and whose
ability to support all of the following is comprehensive:

a. data definition
b. view definition
c. data manipulation (interactive and by program)
d. integrity constraints
e. authorization
f. transaction boundaries (begin, commit, and rollback).

Rule 6: View Updating Rule

All views that are theoretically updateable are also updateable by the system.

Rule 7: High-level Insert, Update, and Delete

The system must fully support insert, update, and delete operations. It must
also be able to perform these operations on multiple rows at once.

Rule 8: Physical Data Independence

Application programs and terminal activities remain logically unimpaired
whenever any changes are made in either storage representation or access
methods.

Rule 9: Logical Data Independence

Application programs and terminal activities remain logically unimpaired
when information-preserving changes of any kind that theoretically permit
unimpairment are made to the base tables.

Rule 10: Integrity Independence

Integrity constraints specific to a particular relational database must be
definable in the relational data sublanguage and storable in the catalog, not in
the application programs.

Rule 11: Distribution Independence

The data manipulation sublanguage of a relational DBMS must enable
application programs and terminal activities to remain logically unimpaired
whether and whenever data are physically centralized or distributed.

Rule 12: Nonsubversion Rule

If a relational system has or supports a low-level (single-record-at-a-time)
language, that low-level language cannot be used to subvert or bypass the
integrity rules or constraints expressed in the higher-level
(multiple-records-at-a-time) relational language.

Note that, judged by these rules, no fully relational database management
system is available today. In particular, rules 6, 9, 10, 11 and 12 are
difficult to satisfy.

What reasons are considered to create views in DBMS? Explain.

1.12 Views
Views provide a means to present a different representation of the data that
resides within the base tables. Views are very powerful because they allow you
to tailor the presentation of data to different types of users. Views are often
used:
* to provide an additional level of table security by restricting access to a
predetermined set of rows and/or columns of a table
* to hide data complexity
For example, a single view might be defined with a join, which is a collection
of related columns or rows in multiple tables. However, the view hides the fact
that this information actually originates from several tables.
* to simplify commands for the user
For example, views allow users to select information from multiple tables
without actually knowing how to perform a join.
* to present the data in a different perspective from that of the base table
For example, the columns of a view can be renamed without affecting the
tables on which the view is based.
* to isolate applications from changes in definitions of base tables
For example, if a view's defining query references three columns of a
four-column table and a fifth column is added to the table, the view's
definition is not affected and no application using the view is affected.
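The last point, isolation from changes to the base table, can be tried out directly. The sketch below is an assumption-laden illustration rather than Oracle code: it uses Python's built-in sqlite3 module, a cut-down four-column version of an order_master table, and a hypothetical view name pending_orders. Dates are stored as plain text for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_master (
        orderno TEXT, odate TEXT, vencode TEXT, ostatus TEXT
    )
""")
conn.executemany(
    "INSERT INTO order_master VALUES (?, ?, ?, ?)",
    [("o002", "1998-05-11", "v001", "c"),
     ("o003", "1998-04-11", "v004", "p"),
     ("o005", "1998-04-11", "v005", "p")],
)

# The view exposes three of the four columns and only the pending ('p') rows.
conn.execute("""
    CREATE VIEW pending_orders AS
    SELECT orderno, odate, vencode FROM order_master WHERE ostatus = 'p'
""")
before = conn.execute("SELECT * FROM pending_orders ORDER BY orderno").fetchall()

# Adding a fifth column to the base table does not disturb the view.
conn.execute("ALTER TABLE order_master ADD COLUMN del_date TEXT")
after = conn.execute("SELECT * FROM pending_orders ORDER BY orderno").fetchall()
```

An application that reads only pending_orders sees the same rows and columns before and after the change. It also never sees the ostatus column or the completed order, which is the security use of views mentioned in the first point.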
Order_master table
desc order_master;
Name       Null?    Type
---------- -------- ------------
ORDERNO             VARCHAR2(5)
ODATE               DATE
VENCODE             VARCHAR2(5)
OSTATUS             CHAR(1)
DEL_DATE            DATE
insert into order_master values('o002','11-may-98','v001','c','13-dec-98');
insert into order_master values('o003','11-april-98','v004','p','13-nov-98');
insert into order_master values('o005','11-april-98','v005','p','18-nov-99');
General syntax for creating views:
create [or replace] [force | noforce] view view_name as query
[with check option [constraint constraint_name] | with read only];
create view myview1 as select * from order_master;
create view view2 as select * from order_master where ostatus='p';
The following command will work, but the updated rows will no longer be
visible through the view.
update view2 set ostatus='d' where ostatus='p';
Creating the view with the check option prevents this.
create or replace view view2 as select * from order_master where ostatus='d'
with check option constraint penv;
update view2 set ostatus='n' where ostatus='d';
ORA-01402: view WITH CHECK OPTION where-clause violation
With the force option, Oracle creates the view even if the defining query refers to a non-existent table.
create force view en as select * from venmast;
Warning: View created with compilation errors.
Read only views:
create or replace
view view2 as select * from order_master with read only;
SQL> update view2 set ostatus='n' where ostatus='d';
update view2 set ostatus='n' where ostatus='d'

Questions:
Q1. What are the drawbacks of File Processing System?
Q2. What are the advantages of a DBMS?
Q3. Explain the three-level architecture proposal for a DBMS.
Q4. What is data independence? What are its types? Explain.
Q5. Explain Data Models.
Q6. Differentiate between
a. file processing system and DBMS
b. Schema & Instance.
Q7. Explain DDL, DML and DCL.
Q8. Explain DBMS System Structure.
Q9. What are views? What is their significance in DBMS?
Q10. What is an ERD? What are the different symbols used in an ERD? Explain
with an example.
Q11. What is Data Abstraction in DBMS? What are different levels of Data
abstraction?
Q12. What are Codd's rules? What is their significance?
Q13. Draw an ER diagram for the database of a departmental store. There are
various departments in the store. One department sells many items. Some
items may be sold by more than one department. A department has many
employees. An employee can belong to at most one department. A manager is
an employee who may look after more than one department. A supplier may
supply more than one item. Every item is supplied by only one supplier at a
time.
Q14. Construct an ER diagram for a car insurance company with a set of
customers, each of whom owns a number of cars. Each car has a number of
recorded accidents associated with it.

Q15. Draw an ER diagram for a company which has the following
description:
• The company has several departments.
• Each department may have several locations.
• Departments are identified by a name, D_no, Location.
• A manager controls a particular department.
• Each department is associated with a number of projects.
• Employees are identified by name, id, address, dob, date_of_joining.
• An employee works in only one department but can work on several
projects.
• We also keep track of the number of hours worked by an employee on a
single project.
• Each employee has dependents.
• A dependent has D_name, Gender and relationship.

Q16. What is a weak entity set? Differentiate between strong and weak entity
sets.

SUMMARY

• A file in a file system is a container to store data in a computer.
• A file system suffers from data redundancy, data inconsistency, data
isolation, data dependence and difficulty in controlled data sharing.
• A Database Management System (DBMS) is software to create and
manage databases. A database is a collection of tables.
• The database schema is the design of a database.
• A database constraint is a restriction on the type of data that can be
inserted into a table.
• The database schema and database constraints are stored in the database
catalog.
• The snapshot of the database at any given time is the database
instance.
• A query is a request to a database for information retrieval and data
manipulation (insertion, deletion or update). It is written in Structured
Query Language (SQL).
• A relational DBMS (RDBMS) is used to store data in related tables. Rows and
columns of a table are called tuples and attributes respectively. A table is
referred to as a relation.
• Restrictions on data stored in an RDBMS are applied by use of keys such
as Candidate Key, Primary Key, Composite Primary Key, Foreign Key.
• The primary key in a relation is used for unique identification of tuples.
• A foreign key is used to relate two tables or relations.
• Each column in a table represents a feature (attribute) of a record. A table
stores the information for an entity, whereas a row represents a record.
• Each row in a table represents a record. A tuple is a collection of attribute
values that makes a record unique.
• A tuple is a unique entity, whereas attribute values can be duplicated in the
table.
• SQL is the standard language for RDBMS systems like MySQL.
