Module-1 Database System Concepts and Architecture
BCSE302L – Database
Management
Systems
Module 1: Database System Concepts
& Architecture
Dr. K.P. Vijayakumar,
VIT Chennai
Module 1
Module:1 Database System Concepts and Architecture
Need for database systems – Characteristics of Database
Approach – Advantages of using DBMS approach - Actors on
the Database Management Scene: Database Administrator -
Classification of database management systems - Data Models
– Schemas and Instances - Three-Schema Architecture - The
Database System Environment - Centralized and Client/Server
Architectures for DBMSs – Overall Architecture of Database
Management Systems.
Basic Definitions
Data
Database
Database management System (DBMS)
Data
Known facts that can be recorded and
have an implicit meaning.
Database
A collection of related data.
Need/Purpose of the Database System
Disadvantages of File Processing System
Data Redundancy and inconsistency
Difficulty in accessing data
Data Isolation – scattered data
Integrity Problems -consistency constraints
Atomicity Problems – all or none
Concurrent Access Anomalies
Security Problems
Need/Purpose of the Database System
Disadvantages of File Processing System
Data Redundancy
Data Inconsistency
Difficulty in accessing data
Name : Ram
Reg.No.: 22BCE1001
Dept : SCOPE
Year : 2
Address: ABC
Mobile: 9123456789
Courses:
…
…
Name : Vijay
Reg.No.: 22BCE1002
Dept : SCOPE
Year : 2
Address: ABC
Mobile: 9123456788
Courses:
…
Program 1 : Extract and display the list of students who live in Chennai
Program 2 : Extract and display the list of students who registered for the DBMS course
…
Program N : …
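With a DBMS, Program 1 and Program 2 above become queries rather than separate programs. The following is a minimal sketch using Python's built-in sqlite3 module; the table names, column names and city values are assumptions made for illustration (the records above only give "ABC" as the address).

```python
import sqlite3

# Hypothetical in-memory tables; names and sample cities are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, reg_no TEXT PRIMARY KEY, city TEXT)")
conn.execute("CREATE TABLE registration (reg_no TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Ram", "22BCE1001", "Chennai"), ("Vijay", "22BCE1002", "Vellore")],
)
conn.executemany(
    "INSERT INTO registration VALUES (?, ?)",
    [("22BCE1001", "DBMS"), ("22BCE1002", "DBMS")],
)

# "Program 1": students who live in Chennai
in_chennai = [row[0] for row in conn.execute(
    "SELECT name FROM student WHERE city = ? ORDER BY name", ("Chennai",))]

# "Program 2": students who registered for the DBMS course
in_dbms = [row[0] for row in conn.execute(
    "SELECT s.name FROM student s "
    "JOIN registration r ON r.reg_no = s.reg_no "
    "WHERE r.course = ? ORDER BY s.name", ("DBMS",))]

print(in_chennai, in_dbms)
```

A new information need only requires a new query string, not a new program that parses the file layout.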
Need/Purpose of the Database System
Disadvantages of File Processing System
Data Isolation – scattered data
Retrieving the appropriate data is difficult because it is scattered across several files.
Student File
Name : Vijay
Reg.No.: 22BCE1001
Dept : SCOPE
Year : 2
Address: ABC
Mobile: 9123456789
Email: [email protected]
Courses:
…
Hostel File
Name : Vijay
Reg.No.: 22BCE1001
Dept : SCOPE
Year : 2
Block : A
Room No: 101
Dues in Rs: 15000
…
Need/Purpose of the Database System
Disadvantages of File Processing System
Integrity Problems – consistency constraints
Customer File
Name : Ram
AC.No.: 221001
Branch : VIT
Address: ABC
Mobile: 9123456789
Balance : 100000
…
…
Name : Vijay
Ac. No.: 221002
Branch : VIT
Year : 2
Address: ABC
Mobile: 9123456788
Balance : 150000
…
Constraint 1 : Balance must never fall below zero
Constraint 2 : Balance must never fall below 5000
…
Constraint N : …
Every new or changed constraint requires adding or updating code in the various application programs.
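In a DBMS, a rule such as Constraint 2 can be declared once in the schema instead of being coded into every application program. A minimal sketch with Python's sqlite3 module follows; the table layout and the withdraw helper are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clause states "balance never falls below 5000" in one place.
conn.execute("""
    CREATE TABLE account (
        ac_no   INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        balance INTEGER NOT NULL CHECK (balance >= 5000)
    )
""")
conn.execute("INSERT INTO account VALUES (221001, 'Ram', 100000)")
conn.commit()

def withdraw(ac_no, amount):
    """Return True if the withdrawal is accepted, False if the constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE ac_no = ?",
                (amount, ac_no),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = withdraw(221001, 90000)        # would leave 10000, allowed
rejected = withdraw(221001, 9000)   # would leave 1000, violates the CHECK
balance = conn.execute(
    "SELECT balance FROM account WHERE ac_no = 221001").fetchone()[0]
print(ok, rejected, balance)
```

Changing the limit later means altering the schema definition, not hunting through application code.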
Need/Purpose of the Database System
Disadvantages of File Processing System
Atomicity Problems – all or none
A - 50000
B - 40000
A --(5000)--> B
Transaction to transfer $5000 from account A to account B:
1. read(A)
2. A := A – 5000
3. write(A)
4. read(B) ----- Failure
5. B := B + 5000
6. write(B)
If the transaction fails after step 3 and before step 6, money will be “lost”, leading to an inconsistent database state.
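The all-or-none behaviour can be sketched with Python's sqlite3 module, where a failure between the debit of A and the credit of B rolls the whole transfer back. The account table and the fail_midway switch are assumptions introduced only to simulate the failure at step 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 50000), ("B", 40000)])
conn.commit()

def transfer(amount, fail_midway=False):
    """Move amount from A to B as one transaction."""
    try:
        with conn:  # commit if the block finishes, roll back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,))
            if fail_midway:
                raise RuntimeError("simulated failure after write(A)")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,))
    except RuntimeError:
        pass  # the debit of A has already been undone by the rollback

def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))

transfer(5000, fail_midway=True)
after_failure = balances()   # unchanged: no money is lost
transfer(5000)
after_success = balances()   # both updates applied together
print(after_failure, after_success)
```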
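Need/Purpose of the Database System
Disadvantages of File Processing System
Concurrent Access Anomalies
Courses:
DBMS:
Total : 60
Occupied : 59
Left : 1
…
…
Student 1 and Student 2 access the file at the same time.
Student 1 : Occupied is viewed as 59 and registers for the DBMS course
Student 2 : Occupied is also viewed as 59 and registers for the DBMS course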
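One way a DBMS avoids this anomaly is to make the seat check and the seat update a single atomic statement. The sketch below uses Python's sqlite3 module with a hypothetical course table; the two calls run one after the other on one connection, so it only illustrates the idea and is not a real concurrency test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (code TEXT PRIMARY KEY, total INTEGER, occupied INTEGER)")
conn.execute("INSERT INTO course VALUES ('DBMS', 60, 59)")
conn.commit()

def register(code):
    """Take a seat only if one is still free; check and update happen in one statement."""
    with conn:
        cur = conn.execute(
            "UPDATE course SET occupied = occupied + 1 "
            "WHERE code = ? AND occupied < total",
            (code,),
        )
        return cur.rowcount == 1  # 1 row changed means the seat was granted

student1 = register("DBMS")   # gets the last seat
student2 = register("DBMS")   # no seat left, nothing is updated
occupied = conn.execute(
    "SELECT occupied FROM course WHERE code = 'DBMS'").fetchone()[0]
print(student1, student2, occupied)
```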
Need/Purpose of the Database System
Disadvantages of File Processing System
Security Problems
Enforcing security constraints, such as keeping an unauthorized user away from customer records, is difficult in a file processing system.
Customer File
Name : Ram
AC.No.: 221001
Branch : VIT
Address: ABC
Mobile: 9123456789
Balance : 100000
…
…
Name : Vijay
Ac. No.: 221002
Branch : VIT
Year : 2
Address: ABC
Mobile: 91234567
Balance : 150000
…
Unauthorized User
Database Applications
Banking: transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Online retailers: order tracking,
customized recommendations
Manufacturing: production, inventory,
orders, supply chain
Human resources: employee records,
salaries, tax deductions
Characteristics of Database Approach
1. Self-describing nature of a database system
2. Insulation between Programs and Data, and Data
Abstraction
3. Support of Multiple Views of the Data
4. Sharing of Data and Multiuser Transaction
Processing
Characteristics of Database Approach
Data Abstraction:
Conceptual representation
A conceptual representation of the
STUDENT records
Hides the details such as how the data
is stored or how the operations are
implemented.
A data model
Is a type of data abstraction
Hides storage and implementation details
Presents the users with a conceptual view of the database.
Image Source: https://www.hitechnectar.com/blogs/data-abstraction-level/
Characteristics of Database Approach
Support of multiple views of the data:
Each user may see a different view of the database, which describes only the data of interest to that user.
Customer File
Name : Ram
AC.No.: 221001
Branch : VIT
Address: ABC
Mobile: 9123456789
Balance : 100000
…
…
Name : Vijay
Ac. No.: 221002
Branch : VIT
Year : 2
Address: ABC
Mobile: 91234567
Balance : 150000
…
Manager can access all customer details
Cashier can access only Ac.No. and Balance
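The manager and cashier views can be sketched with an SQL view, shown here through Python's sqlite3 module. The table name, column names and the view name cashier_view are assumptions for illustration. SQLite itself has no user accounts, so this only shows what each view contains; restricting who may query which view would rely on the privilege mechanism of a multiuser DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        ac_no INTEGER PRIMARY KEY, name TEXT, branch TEXT,
        mobile TEXT, balance INTEGER
    )
""")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?, ?)",
    [(221001, "Ram", "VIT", "9123456789", 100000),
     (221002, "Vijay", "VIT", "91234567", 150000)],
)

# Cashier's view: only account number and balance.
conn.execute("CREATE VIEW cashier_view AS SELECT ac_no, balance FROM customer")

# Column names each kind of user would see.
manager_columns = [d[0] for d in conn.execute("SELECT * FROM customer").description]
cashier_columns = [d[0] for d in conn.execute("SELECT * FROM cashier_view").description]
cashier_rows = conn.execute("SELECT * FROM cashier_view ORDER BY ac_no").fetchall()
print(manager_columns, cashier_columns, cashier_rows)
```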
Characteristics of Database Approach
Sharing of data and multi-user
transaction processing:
Allowing a set of concurrent users to
retrieve from and to update the database.
Concurrency control within the DBMS
guarantees that each transaction is correctly
executed or aborted
Recovery subsystem ensures each
completed transaction has its effect
permanently recorded in the database
OLTP (Online Transaction Processing) is a
major part of database applications. This
allows hundreds of concurrent transactions
to execute per second.
Advantages of Database Approach
Controlling redundancy
Duplication, wastage of storage space,
inconsistency
Restricting unauthorized access to data.
Providing persistent storage for program
Objects
Providing backup and recovery services.
Actors on the Scene
Those who actually use the database content and control and monitor the database
• Database Administrators
• Database Designers
• End Users
• Casual, Naive or parametric,
• Sophisticated, Standalone users
• System Analysts and Application Programmers
Workers behind the Scene
Those who design, develop and administer the database and the operation of the DBMS software and system environment
• DBMS system designers and implementers
• Tool developers
• Operators and maintenance personnel
Database administrators:
Responsible for authorizing access to the database
coordinating and monitoring its use
acquiring software and hardware resources as needed.
Image Source: https://www.dataversity.net/so-you-want-to-be-a-database-administrator/
Actors on the Scene
Database Designers:
Responsible for defining the content, the structure, the constraints, and functions or transactions against the database. They must communicate with the end-users and understand their needs.
Image Source: https://www.cybertec-postgresql.com/en/services/postgresql-design/postgresql-database-modeling/
Actors on the Scene
Sophisticated:
Business analysts, scientists, engineers, others thoroughly familiar with the system capabilities.
Many use tools in the form of software packages that work closely with the stored database.
Image Source: https://knowyourmeme.com/photos/1698903-computer-reaction-faces
Actors on the Scene
Stand-alone:
Mostly maintain personal databases using ready-to-use packaged applications.
An example is a tax program user that creates its own internal database.
Another example is a user that maintains an address book.
Image Source: https://www.yourdictionary.com/stand-alone-pc
Classification of DBMS
Criteria to classify DBMS
The data model on which the DBMS is
based.
Relational, object, hierarchical, network and XML model
The number of users supported by the
system
Single user, multiuser
Classification of DBMS
Criteria to classify DBMS
The number of sites over which the database is distributed
Centralized, Distributed DBMS, Homogeneous DDBMS, Heterogeneous DDBMS, Federated DBMS or multidatabase system
Classification of DBMS
Criteria to classify DBMS
Cost
Open source – MySQL, PostgreSQL
Proprietary – Oracle, MS SQL Server, IBM DB2
Types of access path options for storing files
File structure
General or Special purpose
General purpose
meets the need of as many applications as possible
Example: EMail
Special purpose
Airline reservation, Railway reservation
OLTP – large no. of concurrent transactions without delay
Data Models
Student (Name, Reg.No., Dept)
Course (Name, Code, Dept, Reg.No.)
Relationship: Student – Course
Operations
Operations on the data model may
include
basic model operations (e.g. generic
insert, delete, update, retrieve)
user-defined operations (e.g.
compute_student_gpa).
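The two kinds of operations can be sketched with Python's sqlite3 module: the generic insert, update, delete and retrieve come from the data model, while compute_student_gpa is built on top of them. The grade table and the credit-weighted GPA formula are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grade (reg_no TEXT, course TEXT, credits INTEGER, points REAL)")

# Basic model operations: insert, update, delete.
conn.execute("INSERT INTO grade VALUES ('22BCE1001', 'DBMS', 4, 9.0)")
conn.execute("INSERT INTO grade VALUES ('22BCE1001', 'OS', 3, 8.0)")
conn.execute("INSERT INTO grade VALUES ('22BCE1001', 'Maths', 3, 7.0)")
conn.execute("UPDATE grade SET points = 10.0 WHERE course = 'DBMS'")
conn.execute("DELETE FROM grade WHERE course = 'Maths'")

def compute_student_gpa(reg_no):
    """User-defined operation: credit-weighted average of grade points (assumed formula)."""
    weighted, credits = conn.execute(
        "SELECT SUM(credits * points), SUM(credits) FROM grade WHERE reg_no = ?",
        (reg_no,),
    ).fetchone()
    # SUM over no rows yields NULL, so unknown students return None.
    return weighted / credits if credits else None

gpa = compute_student_gpa("22BCE1001")      # retrieve, then compute
missing = compute_student_gpa("00XXX0000")  # no rows for this student
print(gpa, missing)
```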
Categories of Data Model
Conceptual or High-level or semantic data models:
Provide concepts that are close to the way many users perceive data.
ER Model - Entity, Attribute, relationship
Implementation or representational data models:
Provide concepts that are easily understood by end users while hiding many details of data storage on disk
Relational model – records
Physical or low-level or internal data models:
Provide concepts that describe details of how data is stored in the computer storage
Record format, record ordering, access path
Schema vs Instances
Database Schema:
description of a database
Student = {Name, Student_number, Class, Major}
Course = {Course_Name, Course_number, Credit_balance, Department}
…
Schema vs Instances
Schema Diagram:
An illustrative display of (most aspects
of) a database schema.
Schema vs Instances
Schema Construct:
A component of the schema or an object within the schema
Database State:
The actual data stored in a database at a particular moment in time.
Also called database instance, occurrence or snapshot
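The difference between schema and state can be observed directly. In this sketch with Python's sqlite3 module, the schema is read from SQLite's catalog table sqlite_master and the state is the set of rows; the lower-case column names and the sample row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        name TEXT, student_number INTEGER PRIMARY KEY, class INTEGER, major TEXT
    )
""")

def schema():
    """The stored description of the table (changes rarely)."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'").fetchone()[0]

def state():
    """The rows present at this moment (changes with every update)."""
    return conn.execute("SELECT * FROM student ORDER BY student_number").fetchall()

schema_before, state_before = schema(), state()
conn.execute("INSERT INTO student VALUES ('Ram', 1, 2, 'CSE')")
schema_after, state_after = schema(), state()

print(schema_before == schema_after)  # the schema is unchanged by the insert
print(state_before, state_after)      # the state has changed
```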
Database Schema vs State
Distinction
Three Schema Architecture
View1 … ViewN
A convenient tool with which the user can visualize the schema levels in a DB system.
It is also called ANSI/SPARC architecture or 3-level architecture.
It is used to
describe the structure of a specific database system.
separate the user applications and the physical database.
External or View level:
Describes various user views (part of a DB)
uses a representational data model
Three Schema Architecture
View1 … ViewN
Conceptual or Logical schema:
Describe structure of whole database
Describe - entities, data types, relationships, user
operations, and constraints
uses a representational data model
Hides the details of physical storage structures
Internal Level or internal schema:
PHYSICAL storage structure of DB and access
paths(index).
uses a physical data model
Vijay 1 CSE … x
Ram 2 CSE … y
Sam 3 ECE … z
Data Independence
Data independence is the property of a DBMS that allows the
schema at one level of a database system to be changed
without requiring a change to the
schema at the next higher level.
Types of Data Independence
Physical Data Independence
Logical Data Independence
Physical Data Independence
Physical Data Independence : the ability to change the
internal schema without having to change the
conceptual schema, i.e. the physical storage
structures or devices can be changed without an
effect on the conceptual schema.
Example - creating additional access structures
Index
Quick access
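The index example can be tried out with Python's sqlite3 module: adding an access structure changes the internal level only, so the same query text returns the same rows before and after. The student table and the index name idx_student_dept are assumptions for this sketch; the query plan is printed only for inspection, since its wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, reg_no INTEGER, dept TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Vijay", 1, "CSE"), ("Ram", 2, "CSE"), ("Sam", 3, "ECE")],
)

# The application's query is written against the conceptual schema only.
QUERY = "SELECT name FROM student WHERE dept = ? ORDER BY name"

before = conn.execute(QUERY, ("CSE",)).fetchall()

# Internal-level change: add an access structure for quicker lookups.
conn.execute("CREATE INDEX idx_student_dept ON student (dept)")

after = conn.execute(QUERY, ("CSE",)).fetchall()
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("CSE",)).fetchall()

print(before == after)  # same query, same answer
print(plan)             # how SQLite now chooses to run it
```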
Benefits of Physical Data Independence
Due to Physical independence, any of the below
changes will not affect the conceptual layer:
Using a new storage device like Hard Drive or
Magnetic Tapes
Modifying the file organization technique in the
Database
Switching to different data structures.
Changing the access method
Changes to compression techniques or hashing
algorithms.
Change of Location of Database
Example : from C: Drive to D: Drive
Logical Data Independence
Logical Data Independence is the ability to change the
conceptual schema without changing external
schemas or application programs.
Compared to physical data independence, logical data
independence is more challenging to achieve.
Due to logical independence, any of the below changes
will not affect the external layer.
Adding, modifying or deleting an attribute, entity or
relationship is possible without a rewrite of existing
application programs.
Merging two records into one.
Breaking an existing record into two or more
records.
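Logical data independence can be sketched in the same way: an application that reads through an external view keeps working when an attribute is added to the conceptual schema. The view name student_names and the new column date_of_birth are hypothetical names used for illustration with Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, reg_no INTEGER PRIMARY KEY, dept TEXT)")
conn.execute("INSERT INTO student VALUES ('Vijay', 1, 'CSE')")

# External schema used by the application.
conn.execute("CREATE VIEW student_names AS SELECT reg_no, name FROM student")

def application_query():
    """Existing application code; it only knows about the view."""
    return conn.execute(
        "SELECT reg_no, name FROM student_names ORDER BY reg_no").fetchall()

before = application_query()

# Conceptual-level change: add a new attribute to the base table.
conn.execute("ALTER TABLE student ADD COLUMN date_of_birth TEXT")

after = application_query()
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]

print(before == after)  # the application is unaffected
print(columns)          # the conceptual schema now has the extra attribute
```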
Centralized Database Management System
Centralized DBMS
Advantages :
The Data Integrity is maximized as the whole database is
stored at a single physical location. It is easier to coordinate
the data and it is as accurate and consistent as possible.
Data redundancy is minimal in a centralized
database. All the data is stored together and not scattered
across different locations, so there is little opportunity for
redundant copies.
Since all the data is in one place, stronger security
measures can be placed around it, so it is easier to secure.
Data is easily portable because it is stored at the same
place.
It is cheaper than other types of databases as it requires
less power and maintenance.
All the information can be easily accessed from the same
location and at the same time.
Merits and De-merits of Centralized DBMS
Disadvantages :
Since all the data is at one location, it takes more time
to search and access it. If the network is slow, this
process takes even more time.
There is a lot of data access traffic for the centralized
database. This may create a bottleneck situation.
Since all the data is at the same location, if multiple
users try to access it simultaneously it creates a
problem. This may reduce the efficiency of the system.
If there are no database recovery measures in place
and a system failure occurs, all the data in the
database may be lost.
Client-Server Database Management System
A client does not share any of its resources,
but requests a server’s content or service
function.
Clients therefore initiate communication
sessions with servers which await incoming
requests.
Examples of computer applications that use
the client–server model are Email, network
printing, and the World Wide Web.
Client-Server DBMS
Advantages :
Centralization – Access, Resources, and
Data Security are controlled through server.
Scalability – Any element can be upgraded
when needed.
Flexibility – New technology can be easily
integrated into the system.
Interoperability – All components work
together.
Merits and De-merits of Client-Server DBMS
Disadvantages :
Dependability – When the server goes down,
operations cease.
Lack of Mature Tools – for administration.
Lack of Scalability – Network operating systems are not
very scalable.
Network Congestion.
DBMS Architecture
Application
Presentation Layer
Business Layer
• Get Username, Password, Captcha from presentation layer
• Get Username, Password from data layer
• Compare
Data Layer
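The login flow across the three layers can be sketched in plain Python. Every function name, the in-memory user store and the use of a bare SHA-256 hash are assumptions made to keep the example short; a real system would use a dedicated password-hashing scheme and a real database in the data layer.

```python
import hashlib
import hmac

# Data layer: hypothetical in-memory user store (username -> SHA-256 of password).
# Plain SHA-256 is used only to keep the sketch short; it is not a safe password hash.
_USERS = {"ram": hashlib.sha256(b"secret").hexdigest()}

def data_get_password_hash(username):
    """Data layer: fetch the stored credential, or None if the user is unknown."""
    return _USERS.get(username)

def business_login(username, password, captcha, expected_captcha):
    """Business layer: take input from the presentation layer, fetch from the data layer, compare."""
    if captcha != expected_captcha:
        return False
    stored = data_get_password_hash(username)
    if stored is None:
        return False
    supplied = hashlib.sha256(password.encode()).hexdigest()
    return hmac.compare_digest(stored, supplied)

def presentation_login(form, expected_captcha):
    """Presentation layer: collect username, password, captcha and show the outcome."""
    ok = business_login(
        form["username"], form["password"], form["captcha"], expected_captcha)
    return "Welcome" if ok else "Login failed"

good = presentation_login(
    {"username": "ram", "password": "secret", "captcha": "x7"}, "x7")
bad = presentation_login(
    {"username": "ram", "password": "wrong", "captcha": "x7"}, "x7")
print(good, bad)
```

Keeping the comparison in the business layer means the presentation layer never touches stored credentials and the data layer never sees the user interface.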
1-Tier Architecture
Users
DBA - Responsibilities
Defines
the Schemas
Storage structure and access method
Grant user authority
Integrity constraint
Monitor the performance and respond to changes in
requirements
Casual user
Occasionally access data via interactive query interface
Application Programmer
Parametric users
DBMS Components
DDL Compiler :
Processes schema definitions, specified in the DDL, and
stores descriptions of the schemas in the DBMS catalog
System Catalog/Data Dictionary:
It defines the structure of the database.
names and sizes of tables, names and data types of data
items, storage details of each table, mapping information
among schemas, constraints, indexes, triggers etc.
Query Compiler:
Queries are parsed and validated for correctness of the
query syntax, the names of tables and data elements
Query Optimizer:
rearrangement and possible reordering of operations,
elimination of redundancies, and use of correct algorithms
and indexes during execution
DBMS Components
Precompiler:
extracts DML commands from an application program
written in a host programming language
DML processor :
DML processor must interact with the query processor to
generate the appropriate code
Query processor :
It transforms user queries into a series of low level
instructions.
It is used to interpret the online user’s query and convert it
into an efficient series of operations in a form capable of
being sent to the run time data manager for execution.
DBMS Components