DBMS Unit1 Notes
Introduction to DBMS:
Overview, File system vs DBMS, advantages of DBMS, storage data, queries, transaction management,
DBMS structure.
Data Models:
Data modelling and data models, the importance of data models, data model basic building blocks, the
evolution of data models, degree of data abstraction.
Introduction to DBMS:
Data are raw facts or figures. When data are processed, interpreted, organized,
structured or presented so as to make them meaningful or useful, they are called
information. Information provides context for data.
Data usually refers to raw, unprocessed data: facts that have not yet been
analysed or processed in any manner. Once the data is analysed, it is considered information.
Information is "knowledge communicated or received concerning a particular fact or circumstance."
Information is a sequence of symbols that can be interpreted as a message. It provides knowledge or
insight about a certain matter.
Data is used as input for the computer system. Information is the output obtained by processing data.
A single item of data is one unit. A group of data that carries meaning is called information.
Database:
A database is a collection of interrelated data that is organized so the data can be retrieved, inserted
and deleted efficiently. It organizes the data in the form of tables, schemas, views, reports, etc.
For example: a college database organizes the data about the admin, staff, students, faculty, etc.
Using the database, you can easily retrieve, insert, and delete the information.
A DBMS is a collection of interrelated data and a set of programs to manipulate those data.
A database management system is software which is used to manage the database.
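The retrieve, insert and delete operations described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, its columns and the sample rows are assumptions made up for the example, not part of the notes.

```python
import sqlite3

# Hypothetical college database kept in memory; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, branch TEXT)")

# Insert a few records.
conn.executemany(
    "INSERT INTO student (roll_no, name, branch) VALUES (?, ?, ?)",
    [(1, "Asha", "CSE"), (2, "Ravi", "ECE"), (3, "Meena", "CSE")],
)

# Retrieve the names of all students in one branch.
cse_students = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE branch = ? ORDER BY roll_no", ("CSE",)
    )
]

# Delete one record and count what is left.
conn.execute("DELETE FROM student WHERE roll_no = ?", (2,))
remaining = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(cse_students, remaining)
conn.close()
```

The application only states what it wants (insert, select, delete); how the rows are laid out on disk is left to the DBMS.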
Disadvantages of DBMS:
1. The cost of DBMS software and hardware (including network installation) is high.
2. The DBMS adds processing overhead to implement security, integrity and sharing of the
data.
3. Centralized control of the database makes it a single point of failure.
4. Setting up the database system requires more knowledge, money, skills, and time.
5. The complexity of the database may result in poor performance.
Examples of DBMS:
IBM DB2.
Microsoft Access.
MongoDB.
Microsoft SQL Server.
MySQL.
Oracle RDBMS.
Functions of DBMS:
1. Defining the database schema: The DBMS must provide facilities for defining the database structure
and for specifying access rights for authorized users.
2. Manipulation of the database: The DBMS must provide functions for insertion of records into the
database, updating of data, deletion of data, and retrieval of data.
3. Sharing of the database: The DBMS must share data items among multiple users while maintaining
the consistency of the data.
4. Protection of the database: It must protect the database against unauthorized users.
5. Database recovery: If the system fails for any reason, the DBMS must facilitate database recovery.
File oriented approach:
In the traditional file-oriented approach to information processing, each application has its own
master file and its own set of personal files. In this approach the programs are dependent on the
files and the files are dependent on the programs.
Data redundancy and inconsistency: The same information may be written in several files. This
redundancy leads to higher storage and access cost. It may also lead to data inconsistency, that is, the
various copies of the same data may no longer agree. For example, a changed customer address may be
reflected in a single file but not elsewhere in the system.
Difficulty in accessing data: The conventional file processing system does not allow data to be
retrieved in a convenient and efficient manner according to the user's choice.
Data isolation: Because data are scattered in various files, and the files may be in different formats,
writing new application programs to retrieve the appropriate data is difficult.
Integrity problems: Developers enforce data validation in the system by adding appropriate code in
the various application programs. However, when new constraints are added, it is difficult to change the
programs to enforce them.
Atomicity: It is difficult to ensure atomicity in a file processing system when a transaction fails
due to power failure, networking problems, etc.
(Atomicity: either all operations of the transaction are reflected properly in the database or none are.)
Concurrent access: A file processing system generally does not let several transactions access the
same file safely at the same time.
Security problems: There is no security provided in file processing system to secure the data from
unauthorized user access.
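As a contrast to the integrity problem listed above, a DBMS lets a constraint be declared once in the schema instead of being coded into every application program. The following is a minimal sketch using Python's sqlite3 module; the account table and the rule that a balance cannot be negative are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The constraints live in the schema, so every program that writes to the
# table is checked in the same way. Table and rule are illustrative.
conn.execute(
    "CREATE TABLE account ("
    " acc_no INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 500)")


def try_insert(acc_no, balance):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", (acc_no, balance))
        return True
    except sqlite3.IntegrityError:
        return False


accepted_valid = try_insert(2, 100)       # satisfies every constraint
accepted_negative = try_insert(3, -50)    # violates the CHECK constraint
accepted_duplicate = try_insert(1, 10)    # violates the PRIMARY KEY constraint
```

Adding a new rule later means changing the table definition in one place rather than hunting through each application.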
Difference between File System and DBMS:
7. Security constraints
File system: File systems provide less security in comparison to DBMS.
DBMS: DBMS has more security mechanisms as compared to file system.
Data Independence
Data independence refers to characteristic of being able to modify the schema at one
level of the database system without altering the schema at the next higher level.
Reduced Data Redundancy
File-based data management systems contained multiple files that were stored in
many different locations in a system or even across multiple systems. Because of this,
there were sometimes multiple copies of the same file, which led to data redundancy.
A database reduces this because each item of data is stored in one place and any change
to it is reflected immediately, so there is far less chance of encountering duplicate
data.
Sharing of Data
In a database, the users of the database can share the data among themselves. There are
various levels of authorisation to access the data, and consequently the data can only be
shared based on the correct authorisation protocols being followed.
Many remote users can also access the database simultaneously and share the data
between themselves.
Data Integrity
Data integrity means that the data is accurate and consistent in the database. Data
Integrity is very important as there are multiple databases in a DBMS. All of these
databases contain data that is visible to multiple users. So it is necessary to ensure that
the data is correct and consistent in all the databases and for all the users.
Data Security
Data security is a vital concept in a database. Only authorised users should be allowed to
access the database, and their identity should be authenticated, for example using a username and
password. Unauthorised users should not be allowed to access the database under any
circumstances, as this violates the security constraints.
Privacy
The privacy rule in a database means that only authorized users can access the database,
according to its privacy constraints. There are levels of database access, and a user can
only view the data they are allowed to see. For example, on social networking sites the access
constraints differ depending on which account a user wants to view.
Backup and Recovery
A database management system automatically takes care of backup and recovery. The
users do not need to back up data periodically because this is taken care of by the DBMS.
Moreover, it restores the database to its previous condition after a crash or system
failure.
Data Consistency
Data consistency is ensured in a database because data redundancy is controlled. All data
appears consistently across the database and the data is the same for all the users viewing
the database. Moreover, any changes made to the database are immediately reflected to
all the users, so the copies do not drift apart.
QUERIES IN DBMS:
Transaction Management:
A transaction is a set of logically related operations. For example, when you transfer
money from your bank account to your friend’s account, the set of operations would be:
2. Rollback: If any of the operations fails, roll back all the changes done by the previous
operations.
Even though these operations help us avoid several issues that may arise during a
transaction, they are not sufficient when two transactions are running concurrently.
To handle those problems the database system maintains the ACID properties.
1. Atomicity: This property states that a transaction must be treated as an atomic unit, that
is, either all of its operations are executed or none. There must be no state in a database
where a transaction is left partially completed.
2. Consistency: A transaction enforces consistency in the system state by ensuring that at
the end of any transaction the system is in a valid state.
3. Isolation: When transactions run concurrently, each one should execute as if it were the
only transaction in the system, so that the result is the same as if the transactions had
run one after another.
4. Durability: Once a transaction completes successfully, the changes it has made to the
database should be permanent even if there is a system failure. The recovery-management
component of the database system ensures the durability of transactions.
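Atomicity and rollback can be sketched for the money-transfer example with Python's sqlite3 module. The individual steps of the transfer are not listed in these notes, so the two updates below (credit, then debit), the account names, the amounts and the non-negative balance rule are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("you", 100), ("friend", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True if it committed."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative, so the credit is undone too.
        return False


def balances(conn):
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM account"))


ok_small = transfer(conn, "you", "friend", 30)    # both updates succeed
ok_large = transfer(conn, "you", "friend", 500)   # debit fails, credit is rolled back
```

After the failed transfer the credit to the friend's account is not visible: either both updates take effect or neither does.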
Storage Manager
A storage manager is a program module that provides the interface between the low-level
data stored in the database and the application programs and queries submitted to
the system. The storage manager is responsible for the interaction with the file
manager. The raw data are stored on the disk using the file system, which is usually
provided by a conventional operating system. The storage manager translates the
various DML statements into low-level file-system commands. Thus, the storage
manager is responsible for storing, retrieving, and updating data in the database.
The storage manager components include:
Authorization and integrity manager, which tests for the satisfaction of integrity
constraints and checks the authority of users to access data.
Transaction manager, which ensures that the database remains in a consistent
(correct) state despite system failures, and that concurrent transaction executions
proceed without conflicting.
File manager, which manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.
Buffer manager, which is responsible for fetching data from disk storage into main
memory, and deciding what data to cache in main memory. The buffer manager is a
critical part of the database system, since it enables the database to handle data sizes
that are much larger than the size of main memory.
Transaction Manager: A transaction is a collection of operations that performs a
single logical function in a database application. Each transaction is a unit of both
atomicity and consistency. Thus, we require that transactions do not violate any
database-consistency constraints. That is, if the database was consistent when a
transaction started, the database must be consistent when the transaction successfully
terminates. The transaction manager ensures that the database remains in a consistent
(correct) state despite system failures (e.g., power failures and operating system
crashes) and transaction failures.
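The role of the buffer manager listed above (fetching pages from disk into main memory and deciding what to keep cached) can be sketched with a toy buffer pool. The notes do not say which replacement policy is used; least recently used (LRU) is assumed here. The BufferPool class and the dictionary standing in for the disk are hypothetical, and writing modified pages back to disk is left out.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer manager: caches up to `capacity` pages and evicts the least recently used."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict of page_id -> contents, standing in for disk storage
        self.capacity = capacity
        self.frames = OrderedDict()   # page_id -> contents, least recently used first
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.frames:
            self.frames.move_to_end(page_id)   # hit: mark as most recently used
            return self.frames[page_id]
        self.disk_reads += 1                   # miss: fetch the page from "disk"
        page = self.disk[page_id]
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)    # evict the least recently used page
        self.frames[page_id] = page
        return page


# Five pages on "disk" but room for only two in memory.
disk = {i: f"page-{i}" for i in range(5)}
pool = BufferPool(disk, capacity=2)
for pid in [0, 1, 0, 2, 0, 1]:
    pool.get_page(pid)
```

With six requests and two frames, only the misses touch the disk. This is how a database can work with more data than fits in main memory.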
We can categorize data models according to the types of concepts they use to describe the database
structure.
High-level or conceptual data models provide concepts that are close to the way many users perceive data,
whereas low-level or physical data models provide concepts that describe the details of how data is
stored on the computer storage media, typically magnetic disks. Concepts provided by physical data
models are generally meant for computer specialists, not for end users.
Between these two extremes is a class of representational or implementation data models, which provide
concepts that may be easily understood by end users but that are not too far removed from the way data
is organized in computer storage.
ER Model:
A conceptual or semantic data model is a more abstract, high-level data model that makes it
easier for a user to come up with a good initial description of the data in an enterprise. A database design
in terms of a semantic model serves as a useful starting point and is subsequently translated into a database
design in terms of the data model the DBMS actually supports. A widely used semantic data model called
the entity-relationship (ER) model allows us to pictorially denote entities and the relationships among
them. It uses concepts such as entities, attributes, and relationships.
An entity represents a real-world object or concept, such as an employee or a project from the miniworld
that is described in the database. An attribute represents some property of interest that further describes an
entity, such as the employee’s name or salary. A relationship among two or more entities represents an
association among the entities, for example,
Entity Relationship Model Advantages:
Visual modelling yields conceptual simplicity
Visual representation makes it an effective communication tool
Is integrated with the dominant relational model
Disadvantages:
Network Model Advantages:
Conceptual simplicity
Handles more relationship types
Data access is flexible
Data owner/member relationship promotes data integrity
Conformance to standards
Includes data definition language (DDL) and data manipulation language (DML)
Disadvantages:
software engineering domain. Uses E-R modelling as a basis, extended to
include encapsulation and inheritance.
Objects have both state and behaviour. State is defined by attributes. Behaviour is defined by methods
(functions or procedures)
Designer defines classes with attributes, methods, and relationships
Class constructor method creates object instances
Each object has a unique object ID
Classes related by class hierarchies
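The object-oriented ideas listed above (state, behaviour, constructors, object IDs and class hierarchies) can be sketched in Python. The Person and Student classes and the counter used to hand out object IDs are assumptions for illustration; an actual object-oriented DBMS assigns and manages object IDs itself.

```python
import itertools


class Person:
    """Base class: state is held in attributes, behaviour in methods."""

    _next_id = itertools.count(1)   # illustrative source of unique object IDs

    def __init__(self, name):
        # The constructor creates an object instance and gives it its own ID.
        self.oid = next(Person._next_id)
        self.name = name

    def describe(self):
        return f"{self.name} (person)"


class Student(Person):
    """Subclass in the class hierarchy: inherits oid and name, adds branch."""

    def __init__(self, name, branch):
        super().__init__(name)
        self.branch = branch

    def describe(self):
        # Overrides the inherited behaviour.
        return f"{self.name} (student, {self.branch})"


p = Person("Asha")
s = Student("Ravi", "CSE")
```

Here `s` is both a Student and a Person, and each object keeps its own identity even if two objects hold the same attribute values.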
Object-Oriented Model
Advantages
Data abstraction is the process of hiding unwanted or irrelevant details from the end user. It gives
different users different views of the data and helps in achieving data independence, which in turn can
be used to enhance the security of data.
Mainly there are three levels of abstraction for DBMS, which are as follows:
The physical (internal) level is the lowest level of abstraction for a DBMS. It defines how
the data is actually stored: the data structures used to store data and the access methods
used by the database. How to store the data in the database is decided by developers or
database application programmers.
So, overall, the entire database is described at this physical or internal level. It is
a very complex level to understand. For example, a customer's information appears as a
table at the higher levels, but at this level it is stored as blocks of bytes on the
storage device.
The logical level is the intermediate or next higher level. It describes what data is
stored in the database and what relationships exist among those data. It describes
the whole of the data, because it states which tables are to be created and
what the links among those tables are.
It is less complex than the physical level. The logical level is used by developers or
database administrators (DBA). So, overall, the logical level contains tables (fields
and attributes) and relationships among table attributes.
The view level can be used by all users (users at all levels). This level is the least complex
and the easiest to understand.
For example, a user can interact with a system through a GUI, which is the view level, and can
enter details on the screen without knowing how the data is stored or
what data is stored. This detail is hidden from the user.
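The view level can be illustrated with a database view that exposes only part of a table. This sketch uses Python's sqlite3 module; the customer table, its credit_limit column and the customer_public view are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table, including a column most users should not see.
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT, credit_limit INTEGER)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha', 'Pune', 50000)")

# View level: expose only the columns this group of users needs.
conn.execute("CREATE VIEW customer_public AS SELECT id, name, city FROM customer")

cursor = conn.execute("SELECT * FROM customer_public")
visible_columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
```

A user who queries customer_public never sees credit_limit and does not need to know how or where the underlying table is stored.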
Internal level or Physical level of abstraction is the lowest level of abstraction and
External or View level of abstraction is the highest level of abstraction. Based on
these levels of abstraction, we have two types of data independence.
1. Physical Data Independence
2. Logical Data Independence
The changes in the physical level may include changes using the following −
The database administrator is the one who decides what information is to be kept in the
database and how to use the logical level of abstraction. The logical level provides the
global view of the data. It also describes what data is to be stored in the database along
with the relationships. Logical data independence keeps the structure of the database
simple for its users. The logical level is based on application domain entities and
provides an abstraction of the system's functional requirements. The static structure for the
logical view is defined in the class object diagrams. Users cannot manipulate the
logical structure of the database. The changes in the logical level may include:
Changing the data definition.
Adding, deleting, or updating any attribute, entity or relationship in the database.
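Logical data independence can be sketched as follows: an existing program keeps working after a new attribute is added at the logical level. The example uses Python's sqlite3 module; the student table, the student_names view and the email column are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("CREATE VIEW student_names AS SELECT roll_no, name FROM student")


def application_query(conn):
    """Stands in for an existing application program that reads through the view."""
    return conn.execute(
        "SELECT roll_no, name FROM student_names ORDER BY roll_no"
    ).fetchall()


before = application_query(conn)

# Logical-level change: add a new attribute to the student entity.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
conn.execute("INSERT INTO student VALUES (2, 'Ravi', 'ravi@example.com')")

# The same, unmodified query still works and sees the new row.
after = application_query(conn)
```

The application code did not change even though the table definition did, which is the property that logical data independence describes.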