Unit 1
1. Introduction to DBMS
• A database management system (DBMS) refers to the technology for creating and
managing databases.
• DBMS is a software tool to organize (create, retrieve, update, and manage) data in
a database.
• The main aim of a DBMS is to provide a way to store and
retrieve database information that is both convenient and efficient.
• Database Management System (DBMS) is a software for storing and retrieving users'
data while considering appropriate security measures.
• It consists of a group of programs which manipulate the database.
• The DBMS accepts the request for data from an application and instructs the operating
system to provide the specific data.
• In large systems, a DBMS helps users and other third-party software to store and retrieve
data.
• DBMS allows users to create their own databases as per their requirements. The term
“DBMS” includes the users of the database and other application programs.
• It provides an interface between the data and the software application.
*Application of DBMS:
• HR Management
File system vs. DBMS (remaining points of the comparison):
12) File system: data sharing is not possible. DBMS: offers the feature of data sharing.
13) File system: the transaction concept is not used. DBMS: the concept of a transaction is an
important aspect of DBMS.
1.4 Advantages and Disadvantages of DBMS
Advantages
5. Data is independent.
Disadvantages
➢ Database Designers
• Database Designers are responsible for designing the database objects, such as tables,
columns and their data types, forms, and reports.
• Once the database design is complete, they assist the DBA.
➢ Application Programmers
• These people write the program code according to the designs suggested by
the database designers.
• They also test, debug, document and maintain these programs.
• The person who develops the application is called an Application Developer.
➢ End Users:
• The end users are the people who interact with the database management system.
• They conduct various operations on database like retrieving, updating, deleting, etc.
• End Users of database:
❖ Casual end users: They are typically high or middle level managers.
❖ Naive or parametric end users: production supervisors or store-keepers at a
factory, reservation clerks for airlines and railways.
❖ Sophisticated end users: business analysts, consultants, scientists etc.
❖ Standalone users: These users maintain personal databases by using ready-made
program packages.
1.6 Structure of DBMS
• DML Pre-compiler: It translates DML statements in a query language into low-level
instructions that the query evaluation engine understands. It also attempts to transform the
user's request into an equivalent but more efficient form.
• DDL Interpreter: It interprets the DDL statements and records them in a set of tables
containing meta data or data dictionary.
• Query Evaluation Engine: It executes low-level instructions generated by the DML compiler.
• Authorization and Integrity Manager: It tests for the satisfaction of integrity constraints
and checks the authority of users to access data.
• Transaction Manager: It ensures that the database remains in a consistent state despite
system failures and that concurrent transaction executions proceed without conflicting.
• File Manager: It manages the allocation of space on disk storage and the data structures used
to represent information stored on disk.
• Buffer Manager: It is responsible for fetching data from disk storage into main memory and
deciding what data to cache in memory.
3. Data Storage:
The following data structures are required as part of the physical system implementation.
• Data Dictionary : It stores meta data (data about data) about the structure of the database.
• Indices : Provide fast access to data items that hold particular values.
• Statistical Data: It stores statistical information about the data in the database. This information
is used by query processor to select efficient ways to execute query.
The view level provides the “view of data” to the users and hides irrelevant details such as
data relationships, database schema, constraints, security, etc. from the user.
1. Data abstraction
2. Instance and schema
*Data Abstraction
To ease the user's interaction with the database, the developers hide internal, irrelevant details
from users. This process of hiding irrelevant details from the user is called data abstraction.
There are three levels of data abstraction:
Physical / Internal level: This is the lowest level of data abstraction. It describes how data is
actually stored in database. You can get the complex data structure details at this level.
Logical / Conceptual level: This is the middle level of 3-level data abstraction architecture. It
describes what data is stored in database.
View / External level: Highest level of data abstraction. This level describes the user interaction
with database system.
* Schema
The schema is the overall design or description of the database. Schema is of three types:
physical schema, logical schema and view schema.
❖ Physical schema: The design of a database at the physical level is called the physical schema;
how the data is stored in blocks of storage is described at this level.
❖ Logical schema: The design of a database at the logical level is called the logical schema.
Programmers and database administrators work at this level; at this level, data can be described
as certain types of data records stored in particular data structures.
❖ View schema: The design of a database at the view level is called the view schema. This
generally describes the end-user interaction with database systems.
*Instance
The data stored in the database at a particular moment of time is called an instance of the
database.
*DDL (Data Definition Language)
DDL is used for specifying the database schema. It is used for creating tables, schemas, indexes,
constraints, etc. in the database. Typical DDL commands are CREATE, ALTER, TRUNCATE, RENAME
and DROP (a short sketch follows below).
All of these commands either define or update the database schema; that is why they come under
Data Definition Language.
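As a rough illustration (not taken from the text), the sketch below shows typical DDL statements
on a hypothetical student table; the table, column names and data types are assumptions.

-- CREATE: define a new table (hypothetical "student" table)
CREATE TABLE student (
    roll_no INT PRIMARY KEY,
    name    VARCHAR(50) NOT NULL,
    dept_no INT
);

-- ALTER: change the schema of an existing table
ALTER TABLE student ADD email VARCHAR(100);

-- TRUNCATE: remove all rows but keep the table definition
TRUNCATE TABLE student;

-- DROP: remove the table definition (and its data) entirely
DROP TABLE student;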
*DML (Data Manipulation Language)
DML is used for accessing and manipulating data in a database. Operations such as SELECT,
INSERT, UPDATE and DELETE on the database come under DML (see the sketch below).
REVOKE: We can withdraw privileges given to a particular user with the help of the REVOKE
statement.
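A minimal sketch of DML statements on a hypothetical employee table, followed by a REVOKE
statement; the table, column and user names are assumptions.

-- DML: manipulate the rows stored in a (hypothetical) employee table
INSERT INTO employee (emp_no, name, dept_no) VALUES (101, 'Asha', 10);
SELECT name FROM employee WHERE dept_no = 10;
UPDATE employee SET dept_no = 20 WHERE emp_no = 101;
DELETE FROM employee WHERE emp_no = 101;

-- REVOKE: withdraw a privilege previously granted to a user
REVOKE UPDATE ON employee FROM clerk_user;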
*TCL (Transaction Control Language)
The changes that we make in the database using DML commands are either made permanent or
rolled back using TCL (see the sketch below).
The ROLLBACK command can only be used to undo changes made since the last COMMIT.
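A small sketch of TCL, assuming a hypothetical account table: COMMIT makes the changes
permanent, while ROLLBACK undoes the changes made since the last COMMIT.

UPDATE account SET balance = balance - 500 WHERE acc_no = 1001;
COMMIT;      -- the change above is now permanent

UPDATE account SET balance = balance + 500 WHERE acc_no = 1001;
ROLLBACK;    -- undoes the change made since the last COMMIT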
1.9 Index
Introduction
➢ Indexing is a data structure technique to efficiently retrieve records from the database
files based on some attributes on which the indexing has been done.
➢ A database index is a data structure that improves the speed of data retrieval operations
on a database table at the cost of additional writes and storage space to maintain
the index data structure.
➢ Some databases extend the power of indexing by letting developers create indexes on
functions or expressions.
➢ The index file should be smaller than the main file; otherwise it will take more time to find
the data. (A small sketch of creating and using an index is given below.)
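As a rough sketch (the table and column names are assumptions), an index can be created on an
attribute so that queries filtering on that attribute do not have to scan the whole file.

-- Build an index on the emp_no attribute of a hypothetical employee table
CREATE INDEX idx_emp_no ON employee (emp_no);

-- This lookup can now use the index instead of scanning every record
SELECT name FROM employee WHERE emp_no = 101;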
Types of Indexing
• Primary Indexing
• Secondary Indexing
✓ Primary Index:
A primary index is an ordered file of fixed-length records with two fields.
The first field is the same as the primary key, and the second field points to the specific data
block.
In the primary index, there is always a one-to-one relationship between the entries in the index
table and the records in the data file.
The primary Indexing in DBMS is also further divided into two types.
• Dense Index
• Sparse Index
1) Dense Index:
In a dense index, an index record is created for every search-key value in the database.
This helps you to search faster but needs more space to store index records. In this indexing
method, the index records contain the search-key value and a pointer to the actual record on
the disk.
2) Sparse Index
It is an index record that appears for only some of the values in the file. Sparse Index helps you
to resolve the issues of dense Indexing in DBMS.
In this method of indexing technique, a range of index columns stores the same data block
address, and when data needs to be retrieved, the block address will be fetched.
However, a sparse index stores index records for only some search-key values. It needs less space
and less maintenance overhead for insertions and deletions, but it is slower than the dense
index for locating records.
✓ Clustered Index:
In a clustered index, the records themselves are stored in the index rather than pointers.
Sometimes the index is created on non-primary key columns which might not be unique for each
record. In such a situation, you can group two or more columns to obtain unique values and create
an index, which is called a clustered index. This also helps you to identify records faster.
Example:
Let's assume that a company recruited many employees in various departments. In this case, a
clustering index in DBMS should be created for all employees who belong to the same department.
They are considered a single cluster, and the index points to the cluster as a whole. Here,
Department_no is a non-unique key. (A possible SQL sketch is given below.)
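A possible sketch of the example above, using SQL Server-style syntax (other systems create
clustered indexes differently); the table and column names are assumptions.

-- Cluster employee records by the non-unique department_no;
-- adding emp_no makes the combined key values unique
CREATE CLUSTERED INDEX idx_emp_dept
    ON employee (department_no, emp_no);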
✓ Secondary Index:
The secondary Index in DBMS can be generated by a field which has a unique value for each
record, and it should be a candidate key. It is also known as a non-clustering index.
This two-level database indexing technique is used to reduce the mapping size of the first level.
For the first level, a large range of numbers is selected; because of this, the mapping size
always remains small.
Example of secondary Indexing
In a bank account database, data is stored sequentially by account number; you may want to find
all the accounts of a specific branch of ABC bank.
Here, you can have a secondary index in DBMS for every search key. Each index record points to a
bucket that contains pointers to all the records with that specific search-key value. (A minimal
sketch is given below.)
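A minimal sketch for the bank example, assuming a hypothetical account table stored sequentially
by acc_no; the secondary (non-clustering) index is built on the branch column.

-- Secondary index on a non-ordering field
CREATE INDEX idx_acc_branch ON account (branch_name);

-- Find all accounts of a specific branch without scanning the whole file
SELECT acc_no, balance FROM account WHERE branch_name = 'ABC';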
1.10 File Organization
o The File is a collection of records. Using the primary key, we can access the records. The
type and frequency of access can be determined by the type of file organization which
was used for a given set of records.
o File organization is a logical relationship among various records. This method defines
how file records are mapped onto disk blocks.
o File organization is used to describe the way in which the records are stored in terms of
blocks, and the blocks are placed on the storage medium.
o The first approach to mapping the database to files is to use several files and store records
of only one fixed length in any given file. An alternative approach is to structure our files
so that they can accommodate records of multiple lengths.
o Files of fixed length records are easier to implement than the files of variable length
records.
Cluster File Organization:
• When two or more records are stored in the same file, it is known as clusters.
• These files will have two or more tables in the same data block, and the key
attributes which are used to map these tables together are stored only once.
D. Hash File Organization:
• Hash File Organization uses the computation of hash function on some fields
of the records.
• The hash function's output determines the location of the disk block where the
records are to be placed (a rough sketch using a hash index follows below).
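As a loose analogue (PostgreSQL-style syntax; the table and column names are assumptions), a hash
index applies a hash function to a field, and the result decides the bucket in which the entry is
placed, much as hash file organization decides the disk block for a record.

-- Hash index on acc_no: the hash of the key determines the bucket
CREATE INDEX idx_acc_hash ON account USING HASH (acc_no);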
*Data Models
• A Data Model is the modeling of the data description, data semantics, and consistency
constraints of the data.
• It provides the conceptual tools for describing the design of a database at each level of
data abstraction.
• There are the following four data models used for understanding the structure of
the database:
1) Relational Data Model:
▪ This type of model designs the data in the form of rows and columns within a
table.
▪ Thus, a relational model uses tables for representing data and the relationships
between them. Tables are also called relations. (A small sketch follows at the end
of this list.)
▪ The semistructured data model is different from the other three data models.
▪ The semistructured data model allows data specifications at places where
individual data items of the same type may have different sets of attributes.
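A small sketch of the relational data model, with assumed table and column names, showing two
relations and the column that represents the relationship between them.

-- Two relations (tables); employee.dept_no links each employee row
-- to a department row, representing the in-between relationship
CREATE TABLE department (
    dept_no   INT PRIMARY KEY,
    dept_name VARCHAR(40)
);

CREATE TABLE employee (
    emp_no  INT PRIMARY KEY,
    name    VARCHAR(50),
    dept_no INT REFERENCES department (dept_no)
);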