Dbms Chapter1

The document provides an overview of database management systems (DBMS). It discusses what a database is, how it organizes data, and examples of databases like a college database. It then defines DBMS as software that manages databases and allows users to create, store, update, and retrieve data from the database. It also maintains data security, consistency, and protects the database. The document outlines the basic tasks of a DBMS like data definition, updating, and retrieval. It lists characteristics of DBMS like using a digital repository, providing a clear view of data manipulation, automatic backup/recovery, and maintaining data integrity. Finally, it compares DBMS and file system approaches to data storage.

Uploaded by

Ayush Singh

Overview of DBMS

What is a Database
A database is a collection of interrelated data organized so that the data can be
retrieved, inserted, and deleted efficiently. The data is organized in the form of tables,
schemas, views, reports, and so on.

For example, a college database organizes the data about the administration, staff,
students, and faculty.

Using the database, you can easily retrieve, insert, and delete the information.

Database Management System

● A database management system (DBMS) is software used to manage a database. For
example, MySQL and Oracle are popular database management systems used in many
different applications.

● A DBMS provides an interface for operations such as creating a database, creating
tables in it, storing data, updating data, and a lot more.

● It provides protection and security for the database. When there are multiple
users, it also maintains data consistency.

A DBMS allows users to perform the following tasks:

● Data Definition: the creation, modification, and removal of the definitions
that describe the organization of data in the database.

● Data Update: the insertion, modification, and deletion of the
actual data in the database.

● Data Retrieval: retrieving data from the database so that it can be
used by applications for various purposes.

● User Administration: registering and monitoring users, maintaining
data integrity, enforcing data security, dealing with concurrency control,
monitoring performance, and recovering information corrupted by unexpected
failure.
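
The first three tasks can be sketched with Python's built-in sqlite3 module. This is only an illustration: the in-memory database, the student table, and its rows are assumptions made up for the example, not part of any particular DBMS described here.

```python
import sqlite3

# In-memory database; the table and its contents are illustrative only.
conn = sqlite3.connect(":memory:")

# Data definition: describe how the data is organized.
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, course TEXT)"
)

# Data update: insert, modify, and delete the actual data.
conn.execute("INSERT INTO student VALUES (1, 'Asha', 'BCA')")
conn.execute("INSERT INTO student VALUES (2, 'Ravi', 'BSc')")
conn.execute("UPDATE student SET course = 'MCA' WHERE roll_no = 1")
conn.execute("DELETE FROM student WHERE roll_no = 2")

# Data retrieval: read the data back for use by an application.
rows = conn.execute("SELECT roll_no, name, course FROM student").fetchall()
print(rows)
```

User administration is not shown because SQLite has no user accounts; a server-based DBMS would handle it with its own commands.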

Characteristics of DBMS

● It uses a digital repository established on a server to store and manage the
information.

● It can provide a clear and logical view of the process that manipulates data.

● A DBMS contains automatic backup and recovery procedures.

● It supports the ACID properties (atomicity, consistency, isolation, durability),
which keep the data in a consistent state in case of failure.

● It can reduce the complexity of the relationships between data.

● It supports the manipulation and processing of data.

● It provides security for data.

● It allows the database to be viewed from different viewpoints according to the
requirements of the user.

Basic DBMS terminology

● Database
A database is a named collection of tables. (see table). A database can also contain views,
indexes, sequences, data types, operators, and functions. Other relational database
products use the term catalog.
● Command
A command is a string that you send to the server in hopes of having the server do
something useful. Some people use the word statement to mean command. The two words
are very similar in meaning and, in practice, are interchangeable.
● Query
A query is a type of command that retrieves data from the server.
● Table (relation, file, class)
A table is a collection of rows. A table usually has a name, although some tables are
temporary and exist only to carry out a command. All the rows in a table have the same
shape (in other words, every row in a table contains the same set of columns). In other
database systems, you may see the terms relation, file, or even class; these are all
equivalent to a table.
● Column (field, attribute)
A column is the smallest unit of storage in a relational database. A column represents one
piece of information about an object. Every column has a name and a data type. Columns
are grouped into rows, and rows are grouped into tables. In Figure 1.1, the shaded area
depicts a single column.
Figure 1.1. A column (highlighted).

The terms field and attribute have similar meanings.


● Row (record, tuple)
A row is a collection of column values. Every row in a table has the same shape (in other
words, every row is composed of the same set of columns). If you are trying to model a
real-world application, a row represents a real-world object. For example, if you are running
an auto dealership, you might have a vehicles table. Each row in the vehicles table
represents a car (or truck, or motorcycle, and so on). The kinds of information that you
store are the same for all vehicles (that is, every car has a color, a vehicle ID, an engine, and
so on). In Figure 1.2, the shaded area depicts a row.
Figure 1.2. A row (highlighted).

You may also see the terms record or tuple; these are equivalent to a row.
● View
A view is an alternative way to present a table (or tables). You might think of a view as a
"virtual" table. A view is (usually) defined in terms of one or more tables. When you create a
view, you are not storing more data, you are instead creating a different way of looking at
existing data. A view is a useful way to give a name to a complex query that you may have
to use repeatedly.
● Client/server
PostgreSQL is built around a client/server architecture. In a client/server product, there are
at least two programs involved. One is a client and the other is a server. These programs
may exist on the same host or on different hosts that are connected by some sort of
network. The server offers a service; in the case of PostgreSQL, the server offers to store,
retrieve, and change data. The client asks a server to perform work; a PostgreSQL client
asks a PostgreSQL server to serve up relational data.
● Client
A client is an application that makes requests of the PostgreSQL server. Before a client
application can talk to a server, it must connect to a postmaster (see postmaster) and
establish its identity. Client applications provide a user interface and can be written in many
languages. Chapters 8 through 17 will show you how to write a client application.
● Server
The PostgreSQL server is a program that services commands coming from client
applications. The PostgreSQL server has no user interface; you can't talk to the server
directly, you must use a client application.
● Postmaster
Because PostgreSQL is a client/server database, something has to listen for connection
requests coming from a client application. That's what the postmaster does. When a
connection request arrives, the postmaster creates a new server process in the host
operating system.
● Transaction
A transaction is a collection of database operations that are treated as a unit. PostgreSQL
guarantees that all the operations within a transaction complete or that none of them
complete. This is an important property: it ensures that if something goes wrong in the
middle of a transaction, changes made before the point of failure will not be reflected in the
database. A transaction usually starts with a BEGIN command and ends with a COMMIT or
ROLLBACK (see the next entries).
● Commit
A commit marks the successful end of a transaction. When you perform a commit, you are
telling PostgreSQL that you have completed a unit of operation and that all the changes
that you made to the database should become permanent.
● Rollback
A rollback marks the unsuccessful end of a transaction. When you roll back a transaction,
you are telling PostgreSQL to discard any changes that you have made to the database
(since the beginning of the transaction).
● Index
An index is a data structure that a database uses to reduce the amount of time it takes to
perform certain operations. An index can also be used to ensure that duplicate values don't
appear where they aren't wanted. I'll talk about indexes in Chapter 4, "Query Optimization."
● Result set
When you issue a query to a database, you get back a result set. The result set contains all
the rows that satisfy your query. A result set may be empty.
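
Several of the terms above (table, row, index, view, query, result set) can be seen together in a short sketch. It uses Python's built-in sqlite3 module instead of PostgreSQL so that it runs without a server; the vehicles table echoes the auto dealership example, and all names and values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A table: every row has the same set of columns.
conn.execute("CREATE TABLE vehicles (vehicle_id TEXT, color TEXT)")

# A unique index speeds up lookups by vehicle_id and also rejects duplicates.
conn.execute("CREATE UNIQUE INDEX idx_vehicle_id ON vehicles (vehicle_id)")
conn.execute("INSERT INTO vehicles VALUES ('V1', 'red')")

try:
    conn.execute("INSERT INTO vehicles VALUES ('V1', 'blue')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A view gives a name to a query; it stores no additional data.
conn.execute(
    "CREATE VIEW red_vehicles AS SELECT vehicle_id FROM vehicles WHERE color = 'red'"
)
red = conn.execute("SELECT * FROM red_vehicles").fetchall()

# A query that matches no rows returns an empty result set.
green = conn.execute("SELECT * FROM vehicles WHERE color = 'green'").fetchall()

print(duplicate_rejected, red, green)
```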

Database system vs. file system

File System Approach


File-based systems were an early attempt to computerize the manual filing system. This
is also called the traditional approach: a decentralized approach in which each
department stored and controlled its own data with the help of a data processing
specialist. The main role of the data processing specialist was to create the
necessary computer file structures, manage the data within those structures, and
design application programs that create reports based on the file data.

Consider the example of a student file system. The student file contains
information about the student (i.e., roll no, student name, course, etc.). Similarly, a
subject file contains information about the subjects, and a result file contains
information about the results.

Some fields are duplicated in more than one file, which leads to data redundancy. To
overcome this problem, we need a centralized system, i.e., the DBMS approach.
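
The redundancy problem can be made concrete with a small Python sketch. The two lists below stand in for separately maintained departmental files; their field names and values are hypothetical.

```python
# Two departmental "files", each keeping its own copy of the student name.
student_file = [{"roll_no": 1, "name": "Asha", "course": "BCA"}]
result_file = [{"roll_no": 1, "name": "Asha", "marks": 82}]

# One department corrects the name in its own file only.
student_file[0]["name"] = "Asha Rao"

# The copies now disagree: redundancy has turned into inconsistency.
inconsistent = [
    r["roll_no"]
    for r in result_file
    for s in student_file
    if s["roll_no"] == r["roll_no"] and s["name"] != r["name"]
]
print(inconsistent)
```

With a single centralized copy of the name, the update would have been made in one place and no disagreement could arise.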

DBMS:
In the database approach, the data is a well-organized collection, related in a
meaningful way, that can be accessed by different users but is stored only once in the
system. The operations performed by the DBMS include insertion, deletion,
selection, sorting, etc.

In the database approach, duplication of data is reduced due to centralization of data.

The differences between the DBMS approach and the file system approach are as follows:

| Basis | DBMS Approach | File System Approach |
|---|---|---|
| Meaning | A DBMS manages a collection of data. The user is not required to write procedures for managing it. | The file system is a collection of data. The user has to write the procedures for managing the data. |
| Sharing of data | Due to the centralized approach, data sharing is easy. | Data is distributed across many files, possibly in different formats, so it isn't easy to share. |
| Data abstraction | The DBMS gives an abstract view of data that hides the details. | The file system exposes the details of data representation and storage. |
| Security and protection | The DBMS provides a good protection mechanism. | It isn't easy to protect a file under the file system. |
| Recovery mechanism | The DBMS provides a crash recovery mechanism, i.e., it protects the user from system failure. | The file system has no crash recovery mechanism: if the system crashes while data is being entered, the content of the file may be lost. |
| Manipulation techniques | The DBMS offers a wide variety of sophisticated techniques to store and retrieve data. | The file system can't store and retrieve data efficiently. |
| Concurrency problems | The DBMS handles concurrent access to data using some form of locking. | Concurrent access causes many problems, for example when one user reads a file while another is deleting or updating information in it. |
| Where to use | Used in large systems that interrelate many files. | Typically used in small systems with few interrelated files. |
| Cost | The database system is expensive to design. | The file system approach is cheaper to design. |
| Data redundancy and inconsistency | Due to the centralization of the database, the problems of data redundancy and inconsistency are controlled. | Files and application programs are created by different programmers, so there is a lot of duplicated data, which may lead to inconsistency. |
| Structure | The database structure is complex to design. | The file system approach has a simple structure. |
| Data independence | Data independence exists, and it is of two types: logical data independence and physical data independence. | There is no data independence. |
| Integrity constraints | Integrity constraints are easy to apply. | Integrity constraints are difficult to implement. |
| Data models | Three types of data models exist: hierarchical, network, and relational. | There is no concept of data models. |
| Flexibility | Changes to the content of stored data are often necessary, and they are made more easily with the database approach. | The file system is less flexible than the DBMS approach. |
| Examples | Oracle, SQL Server, Sybase, etc. | COBOL, C++, etc. |

Codd's Rules
Not every database that has tables and constraints can be called a relational database
system. Likewise, a database that merely uses the relational data model is not
automatically a Relational Database Management System (RDBMS). Rules are therefore
needed to define what counts as a true RDBMS. These rules were developed in 1985 by
Dr. Edgar F. Codd (E.F. Codd), who did extensive research on the relational model of
database systems. Codd presented 13 rules, numbered 0 to 12, for testing a DBMS
against his relational model; a database that follows them is called a true relational
database (RDBMS). Although there are 13 rules, they are popularly known as Codd's 12 rules.
Rule 0: The Foundation Rule

The database must be in relational form, so that the system can manage the database
entirely through its relational capabilities.

Rule 1: Information Rule

All information in a database must be stored as values in the cells of tables,
organized in rows and columns.

Rule 2: Guaranteed Access Rule


Every single data item (atomic value) must be logically accessible in a relational
database using a combination of table name, primary key value, and column name.

Rule 3: Systematic Treatment of Null Values

This rule defines the systematic treatment of null values in database records. A null
value can have several meanings in the database, such as missing data, no value in a
cell, inapplicable information, or unknown data. A primary key, however, must never be null.
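
As a minimal sketch of what systematic treatment means in practice, the following uses Python's built-in sqlite3 module; the student table and its phone column are assumptions for illustration. A null is distinct from an empty string, and it is tested with IS NULL, since comparing with = never matches it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, phone TEXT)")
conn.execute("INSERT INTO student VALUES (1, NULL)")  # phone is unknown
conn.execute("INSERT INTO student VALUES (2, '')")    # phone is an empty string

# Comparing with = never matches NULL, because the comparison itself is unknown.
equals_null = conn.execute(
    "SELECT roll_no FROM student WHERE phone = NULL"
).fetchall()

# IS NULL is the dedicated test for a null value.
is_null = conn.execute(
    "SELECT roll_no FROM student WHERE phone IS NULL"
).fetchall()

print(equals_null, is_null)
```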

Rule 4: Active/Dynamic Online Catalog based on the relational model

The entire logical structure (description) of the database must be stored online in
a catalog known as the data dictionary. Authorized users can query the catalog using
the same query language they use to access the database itself.

Rule 5: Comprehensive Data Sublanguage Rule

A relational database may support several languages, but at least one of them must
have a well-defined linear syntax, be expressible as character strings, and
comprehensively support data definition, view definition, data manipulation,
integrity constraints, and transaction management operations. If the database allows
access to the data without going through such a language, it is considered a
violation of the rule.

Rule 6: View Updating Rule

All views that are theoretically updatable must also be updatable in practice by the
database system.

Rule 7: Relational Level Operation (High-Level Insert, Update and Delete) Rule

A database system must support high-level relational operations, so that insert,
update, and delete can be applied to a whole set of rows and not only to a single row.
It should also support the union, intersection, and minus operations.

Rule 8: Physical Data Independence Rule

Applications that access the database must be independent of how the data is
physically stored. If the data is reorganized or the physical structure of the
database is changed, it must not have any effect on external applications that are
accessing the data from the database.

Rule 9: Logical Data Independence Rule

This is similar to physical data independence. If any change is made at the
logical level (table structures), it should not affect the user's view (application). For
example, if a table is split into two tables, or two tables are joined to create a
single table, the change should not affect the user's view of the data.

Rule 10: Integrity Independence Rule

Integrity constraints must be defined in the relational language (such as SQL) and
maintained by the database itself when data is inserted into table cells. The database
should not rely on any external factor or application to maintain integrity. This
also makes the database independent of each front-end application.

Rule 11: Distribution Independence Rule

A database must work properly even if its data is stored in different locations and
used by different end users. A user who accesses the database through an application
should not need to be aware of where the data is located or of who else is using it;
the data should always appear to be located at a single site.

Rule 12: Non Subversion Rule

The non-subversion rule applies when an RDBMS uses a relational language such as SQL
to store and manipulate the data in the database. If the system also provides a
low-level or separate language other than SQL for accessing the database, that language
must not be able to subvert or bypass the integrity rules when transforming data.

Data Independence
● Data independence can be explained using the three-schema architecture.

● Data independence refers to the ability to modify the schema at
one level of the database system without altering the schema at the next higher
level.

There are two types of data independence:

1. Logical Data Independence

● Logical data independence refers to the ability to change the
conceptual schema without having to change the external schema.

● Logical data independence is used to separate the external level from the
conceptual view.

● If we make any changes to the conceptual view of the data, the user view of
the data is not affected.

● Logical data independence occurs at the user interface level.

2. Physical Data Independence

● Physical data independence can be defined as the capacity to change the
internal schema without having to change the conceptual schema.

● If we change the storage size of the database system server, the
conceptual structure of the database is not affected.

● Physical data independence is used to separate the conceptual level from the
internal level.

● Physical data independence occurs at the logical interface level.

Fig: Data Independence
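
Logical data independence can be illustrated with a view that plays the role of the external schema. The sketch below uses Python's built-in sqlite3 module; the table names, the view name student_view, and the helper app_query are all hypothetical. The table is split in two, the view is redefined, and the application's query stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def app_query(c):
    # The application only ever knows this external view.
    return c.execute(
        "SELECT roll_no, name, city FROM student_view ORDER BY roll_no"
    ).fetchall()

# Original conceptual schema: a single table.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO student VALUES (100, 'Lucky', 'Hyderabad')")
conn.execute("CREATE VIEW student_view AS SELECT roll_no, name, city FROM student")
conn.commit()
before = app_query(conn)

# The conceptual schema changes: the table is split into two tables.
conn.execute("DROP VIEW student_view")
conn.execute("CREATE TABLE student_name (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE student_city (roll_no INTEGER PRIMARY KEY, city TEXT)")
conn.execute("INSERT INTO student_name SELECT roll_no, name FROM student")
conn.execute("INSERT INTO student_city SELECT roll_no, city FROM student")
conn.execute("DROP TABLE student")

# The view is redefined so that the external schema keeps its shape.
conn.execute(
    """CREATE VIEW student_view AS
       SELECT n.roll_no, n.name, c.city
       FROM student_name n JOIN student_city c ON n.roll_no = c.roll_no"""
)
conn.commit()
after = app_query(conn)

print(before == after)
```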

DBMS Architecture
● The design of a DBMS depends on its architecture. The basic client/server
architecture is used to deal with a large number of PCs, web servers, database
servers, and other components that are connected by networks.
● The client/server architecture consists of many PCs and a workstation which
are connected via the network.

● DBMS architecture depends upon how users are connected to the database to
get their requests served.

Types of DBMS Architecture

Database architecture can be seen as single-tier or multi-tier. Logically, multi-tier
database architecture is of two types: 2-tier architecture and 3-tier architecture.

1-Tier Architecture

● In this architecture, the database is directly available to the user. It means the
user works directly on the DBMS and uses it.
● Any changes made here are made directly on the database itself. It doesn't
provide a handy tool for end users.

● The 1-Tier architecture is used for the development of local applications, where
programmers can communicate directly with the database for a quick
response.

2-Tier Architecture

● The 2-Tier architecture is the same as the basic client-server architecture. In the
two-tier architecture, applications on the client end can communicate directly with the
database on the server side. For this interaction, APIs such as ODBC and JDBC are
used.

● The user interfaces and application programs run on the client side.

● The server side is responsible for providing functionality such as query
processing and transaction management.

● To communicate with the DBMS, the client-side application establishes a
connection with the server side.
Fig: 2-tier Architecture

3-Tier Architecture

● The 3-Tier architecture contains another layer between the client and the server. In
this architecture, the client can't communicate directly with the server.

● The application on the client end interacts with an application server, which
in turn communicates with the database system.

● The end user has no idea about the existence of the database beyond the
application server. The database also has no idea about any other user beyond
the application.

● The 3-Tier architecture is used for large web applications.


Fig: 3-tier Architecture
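
The separation of the three tiers can be sketched in a single Python file. In a real deployment each tier would usually run as a separate process or machine; here the function names app_server_get_student and client_show_student, and the student table, are hypothetical and serve only to show which tier talks to which.

```python
import sqlite3

# Data tier: the database itself (an in-memory SQLite database here).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO student VALUES (1, 'Asha')")

# Application tier: the only code that talks to the database.
def app_server_get_student(roll_no):
    row = db.execute(
        "SELECT name FROM student WHERE roll_no = ?", (roll_no,)
    ).fetchone()
    return {"found": row is not None, "name": row[0] if row else None}

# Presentation tier: the client calls the application server, never the database.
def client_show_student(roll_no):
    reply = app_server_get_student(roll_no)
    return reply["name"] if reply["found"] else "no such student"

print(client_show_student(1))
print(client_show_student(2))
```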

Three schema Architecture


● The three schema architecture is also called ANSI/SPARC architecture or
three-level architecture.

● This framework is used to describe the structure of a specific database system.

● The three schema architecture is also used to separate the user applications
from the physical database.

● The three schema architecture contains three levels. It breaks the database
down into three different categories.

The three-schema architecture is as follows:


In the above diagram:

● It shows the DBMS architecture.

● Mapping is used to transform requests and responses between the various
levels of the database architecture.

● Mapping is not well suited to a small DBMS because it takes additional time.

● In External / Conceptual mapping, the request is transformed from the
external level to the conceptual schema.

● In Conceptual / Internal mapping, the DBMS transforms the request from the
conceptual level to the internal level.
Objectives of Three schema Architecture
The main objective of the three-level architecture is to enable multiple users to access
the same data with a personalized view while storing the underlying data only once. Thus
it separates the user's view from the physical structure of the database. This
separation is desirable for the following reasons:

● Different users need different views of the same data.

● The way in which a particular user needs to see the data may change over
time.

● The users of the database should not have to worry about the physical implementation
and internal workings of the database, such as data compression and encryption
techniques, hashing, optimization of the internal structures, etc.

● All users should be able to access the same data according to their
requirements.

● The DBA should be able to change the conceptual structure of the database without
affecting the users' views.

● The internal structure of the database should be unaffected by changes to the
physical aspects of the storage.

1. Internal Level
● The internal level has an internal schema which describes the physical storage
structure of the database.

● The internal schema is also known as a physical schema.

● It uses the physical data model. It is used to define how the data will be
stored in a block.

● The physical level is used to describe complex low-level data structures in
detail.

The internal level is generally concerned with the following activities:

● Storage space allocations.
For example: B-Trees, hashing, etc.

● Access paths.
For example: specification of primary and secondary keys, indexes, pointers,
and sequencing.

● Data compression and encryption techniques.

● Optimization of internal structures.

● Representation of stored fields.

2. Conceptual Level
● The conceptual schema describes the design of a database at the conceptual
level. The conceptual level is also known as the logical level.

● The conceptual schema describes the structure of the whole database.

● The conceptual level describes what data are to be stored in the database and
what relationships exist among those data.

● At the conceptual level, internal details such as the implementation of the data
structures are hidden.

● Programmers and database administrators work at this level.

3. External Level

● At the external level, a database contains several schemas, sometimes
called subschemas. A subschema is used to describe a different view of
the database.

● An external schema is also known as a view schema.

● Each view schema describes the part of the database that a particular user group is
interested in and hides the remaining database from that user group.

● The view schema describes the end user's interaction with the database system.

Mapping between Views


The three levels of DBMS architecture don't exist independently of each other. There
must be a correspondence between the three levels. The DBMS is responsible for
maintaining this correspondence between the three types of schema. This
correspondence is called mapping.

There are basically two types of mapping in the database architecture:


● Conceptual/ Internal Mapping

● External / Conceptual Mapping

Conceptual/ Internal Mapping

The Conceptual/ Internal Mapping lies between the conceptual level and the internal
level. Its role is to define the correspondence between the records and fields of the
conceptual level and files and data structures of the internal level.

External/ Conceptual Mapping

The External/Conceptual Mapping lies between the external level and the conceptual
level. Its role is to define the correspondence between a particular external view and
the conceptual view.

What is an instance in DBMS?


The data stored in a database at a particular moment in time is called an instance of
the database. An instance is also called the current state or database state. The
database schema defines the variables (columns) of the tables that belong to a specific
database; the values of these variables at a particular moment make up the instance
of the database.

Many instances can correspond to the same database schema. Every time we insert,
modify, or delete the value of a data item in a record, one state of the data changes
into another state.

Example

Consider the table given below, which follows the Student schema:

Std ID   Name    City

100      Lucky   Hyderabad

101      Pinky   Delhi

102      Bob     Hyderabad

In the above table, the rows present at a given moment together form an instance of
the Student schema.

Finally, we can say that the content of the database at a point in time is called the
instance or database state.

A database can be in one of three states:

● Empty state: the state when a new database has just been defined and holds no data.

● Initial state: the state when data is first loaded into the database.
● Current state: the state at the present moment, after the operations applied so far.
The following is an example of a schema and an instance of a student relation.

Schema: students (studentID: string, student_name: string, login: string, age: integer);

Instance −

studentID Student_name Login Age

101 Bob bob@cse 20

105 Pinky pink@ece 18

107 Rosy rose@it 20
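
The difference between a schema and an instance can be observed directly. In this sketch, which uses Python's built-in sqlite3 module, the helper names schema and instance are hypothetical; the schema stays fixed while the instance moves from the empty state through the initial state to the current state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (studentID TEXT, student_name TEXT, login TEXT, age INTEGER)"
)

def schema(c):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [col[1] for col in c.execute("PRAGMA table_info(students)")]

def instance(c):
    # The instance is the set of rows stored at this moment.
    return c.execute("SELECT * FROM students ORDER BY studentID").fetchall()

empty_state = instance(conn)

conn.execute("INSERT INTO students VALUES ('101', 'Bob', 'bob@cse', 20)")
conn.execute("INSERT INTO students VALUES ('105', 'Pinky', 'pink@ece', 18)")
initial_state = instance(conn)
schema_before = schema(conn)

conn.execute("UPDATE students SET age = 21 WHERE studentID = '101'")
current_state = instance(conn)
schema_after = schema(conn)

print(schema_before == schema_after, initial_state != current_state)
```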

Database Language
● A DBMS has appropriate languages and interfaces to express database queries
and updates.

● Database languages can be used to read, store and update the data in the
database.

Types of Database Language

1. Data Definition Language

● DDL stands for Data Definition Language. It is used to define database structure
or pattern.

● It is used to create schema, tables, indexes, constraints, etc. in the database.

● Using the DDL statements, you can create the skeleton of the database.

● Data definition language is used to store the information of metadata like the
number of tables and schemas, their names, indexes, columns in each table,
constraints, etc.
Here are some tasks that come under DDL:

● Create: It is used to create objects in the database.

● Alter: It is used to alter the structure of the database.

● Drop: It is used to delete objects from the database.

● Truncate: It is used to remove all records from a table.

● Rename: It is used to rename an object.

● Comment: It is used to add comments to the data dictionary.

These commands change the database schema, which is why they come under the
data definition language.
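
A few of these DDL tasks can be tried with Python's built-in sqlite3 module. The course table and the helper table_names are assumptions for illustration. SQLite has no TRUNCATE or COMMENT statement, so only Create, Alter, Rename, and Drop are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(c):
    # sqlite_master is SQLite's catalog of schema objects.
    return [
        r[0]
        for r in c.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]

# Create: define a new object.
conn.execute("CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT)")

# Alter: change the structure of an existing object.
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")

# Rename: give the object a new name.
conn.execute("ALTER TABLE course RENAME TO subject")
after_rename = table_names(conn)
columns = [col[1] for col in conn.execute("PRAGMA table_info(subject)")]

# Drop: delete the object from the database.
conn.execute("DROP TABLE subject")
after_drop = table_names(conn)

print(after_rename, columns, after_drop)
```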

2. Data Manipulation Language

DML stands for Data Manipulation Language. It is used for accessing and manipulating
data in a database. It handles user requests.

Here are some tasks that come under DML:

● Select: It is used to retrieve data from a database.

● Insert: It is used to insert data into a table.

● Update: It is used to update existing data within a table.

● Delete: It is used to delete records from a table.

● Merge: It performs an UPSERT operation, i.e., an insert or an update.

● Call: It is used to call a PL/SQL or Java subprogram.

● Explain Plan: It describes the access path the DBMS will use to reach the data.

● Lock Table: It controls concurrency.

3. Data Control Language


● DCL stands for Data Control Language. It is used to control access to the data
stored in the database by granting and revoking privileges.

● In some systems, DCL execution is transactional and can be rolled back.

(In an Oracle database, however, the execution of data control language cannot be
rolled back.)

Here are some tasks that come under DCL:

● Grant: It is used to give a user access privileges to a database.

● Revoke: It is used to take back permissions from the user.

The following privileges can be granted and revoked:

CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.

4. Transaction Control Language

TCL is used to manage the changes made by DML statements. With TCL, statements can
be grouped into a logical transaction.

Here are some tasks that come under TCL:

● Commit: It is used to save the transaction to the database.

● Rollback: It is used to restore the database to its state at the last Commit.
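
Commit and rollback can be seen in a minimal sketch using Python's built-in sqlite3 module, whose connection object exposes commit() and rollback() methods. The account table and its balance are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()    # Commit: the insert becomes permanent.

conn.execute("UPDATE account SET balance = balance - 500 WHERE id = 1")
conn.rollback()  # Rollback: discard everything since the last commit.

balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)
```

The update is undone by the rollback, so the balance read back is the committed value.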

Data Models
A data model describes the data, the data semantics, and the consistency
constraints of the data. It provides the conceptual tools for describing the design of a
database at each level of data abstraction. The following four data
models are used for understanding the structure of the database:
1) Relational Data Model: This type of model organizes the data in the form of rows and
columns within a table. Thus, a relational model uses tables for representing data and
the relationships among the data. Tables are also called relations. This model was
first described by Edgar F. Codd in 1969. The relational data model is the most widely
used model and is primarily used by commercial data processing applications.

2) Entity-Relationship Data Model: An ER model is the logical representation of data as
objects and the relationships among them. These objects are known as entities, and a
relationship is an association among these entities. This model was designed by Peter
Chen and published in a 1976 paper. It has been widely used in database design. A set of
attributes describes an entity. For example, student_name and student_id describe the
'student' entity. A set of entities of the same type is known as an 'entity set', and a
set of relationships of the same type is known as a 'relationship set'.

3) Object-based Data Model: An extension of the ER model with notions of functions,
encapsulation, and object identity. This model supports a rich type system that
includes structured and collection types. In the 1980s, various database systems
following the object-oriented approach were developed. Here, the objects are
the data together with its properties.

4) Semistructured Data Model: This type of data model is different from the other
three data models explained above. The semistructured data model allows
individual data items of the same type to have different sets of attributes. The
Extensible Markup Language, also known as XML, is widely used for representing
semistructured data. Although XML was initially designed for adding markup
information to text documents, it gained importance because of its application in
the exchange of data.
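
The defining property of semistructured data, items of the same type with different attribute sets, can be shown with a small XML document parsed by Python's standard xml.etree.ElementTree module. The element names and values below are made up for the example.

```python
import xml.etree.ElementTree as ET

# Two items of the same type (student) with different sets of attributes.
doc = """
<students>
  <student><id>101</id><name>Bob</name><login>bob@cse</login></student>
  <student><id>105</id><name>Pinky</name><age>18</age></student>
</students>
"""

root = ET.fromstring(doc.strip())

# Collect the attribute (child element) names of each student.
attribute_sets = [sorted(child.tag for child in student) for student in root]
print(attribute_sets)
```

A relational table would force both students into the same set of columns, typically by storing nulls for the attributes an item lacks.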
