Introduction 1
(CSC 631)
Sanghita Bhattacharjee
Dept of CSE
NIT Durgapur
Course Objectives
Understand the basic concepts and appreciate the applications of database systems
Comprehend the fundamentals of design principles for logical design of relational
databases
Apply query-writing skills and subsequent query optimization
Discuss the basic issues of transaction processing and concurrency control
Course Content
A Silberschatz, H F Korth and S Sudarshan, Database System Concepts, 5th Edition, 2006
Ramez Elmasri and Shamkant B. Navathe, Fundamentals of Database Systems, 3rd Edition,
Addison Wesley, 2000
Video lectures:
(i) Database Management System by Prof. Partha Pratim Das
(ii) Introduction to database systems by Prof. P. Sreenivasa Kumar
Lecture 1
Introduction
What is Information?
Information is processed, meaningful data on which decisions and actions are based. It can
also be defined as data that has been organized and classified to provide meaningful values.
Example: "Age of Rohan is 18"
Database Management System
A DBMS is a set of programs used to access and modify the data in a database
A DBMS is a general-purpose software system that enables
Creation of large disk-resident databases
Efficient retrieval of data
Concurrent use of the system by multiple users in a consistent manner
Avoidance of data redundancy and assurance of data correctness
Availability of data in spite of system failures (disk failure, power failure,
software failure, etc.)
[Figure: file system and memory/HD]
Structured data vs Unstructured data
Structured data:
Has a predefined format, data model, or schema
Examples: Aadhaar no, address, name, credit card no, age, price
Textual only
Quantitative; easy to store and access
SQL is used to access structured data in an RDBMS
Data is stored in data warehouses
Unstructured data:
Cannot be arranged according to a predefined data model
Examples: audio, video, image, geospatial data, sensor data (traffic data, weather data),
email content, social media data (WhatsApp, FB data)
Textual and non-textual
Qualitative; difficult to manage and access
NoSQL (Not only SQL) platforms such as MongoDB are used for housing, managing, and using
unstructured data
Data is stored in data warehouses and data lakes
Drawbacks of using File Systems (cont.)
Uses unstructured data (e.g. images, audio, video, email content,
social media data)
Data redundancy and inconsistency
Multiple file formats, duplication of information in different files
Difficulty in accessing data
Need to write a new program to carry out each new task, i.e. each
different data access request (a query) is performed by a separate
program
Data isolation: multiple files and formats
Integrity problems
Integrity constraints (e.g. account balance > 0) become part of
program code
Hard to add new constraints or change existing ones
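As a rough illustration of the integrity point, the sketch below states the balance rule once in the schema instead of repeating it in every program. It assumes Python's built-in sqlite3 module with an in-memory database as a stand-in DBMS; the account table, its columns, and the helper try_insert are hypothetical names chosen for this example.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in DBMS (an assumption).
conn = sqlite3.connect(":memory:")

# The rule "balance > 0" is declared once, in the schema,
# rather than being buried in each application program.
conn.execute(
    "CREATE TABLE account ("
    " acc_no TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)


def try_insert(acc_no, balance):
    """Hypothetical helper: return True if the DBMS accepts the row."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", (acc_no, balance))
        return True
    except sqlite3.IntegrityError:
        # The DBMS itself rejects rows that violate the CHECK constraint.
        return False


accepted = try_insert("A1", 500)   # satisfies balance > 0
rejected = try_insert("A2", -50)   # violates balance > 0
print(accepted, rejected)
```

Changing the rule then means changing one constraint definition, not every program that touches the table.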
Drawbacks of using File Systems (cont.)
Atomicity problem
Failures may leave database in an inconsistent state with partial updates
carried out
Example: transfer of funds from one account to another should either
complete or not happen at all
Concurrent access by multiple users
Concurrent access is needed for performance
Uncontrolled concurrent access to the system can lead to
inconsistencies
Example: two people read a balance and try to update the account at the
same time
Security problems
Database systems offer solutions to all the above problems
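The funds-transfer example can be sketched as follows. This is a minimal illustration of atomicity only (not of concurrency control), assuming Python's built-in sqlite3 module; the account table, the starting balances, and the functions transfer and balances are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " acc_no TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst as one all-or-nothing unit."""
    try:
        # "with conn" commits if the block succeeds and rolls back
        # every change made inside it if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative; the credit that was
        # already applied is undone by the rollback.
        return False


def balances():
    """Return the current balances as a dict keyed by account number."""
    return dict(conn.execute("SELECT acc_no, balance FROM account ORDER BY acc_no"))


ok_small = transfer("A", "B", 30)    # succeeds: both updates are committed
ok_large = transfer("A", "B", 500)   # fails: neither update survives
print(ok_small, ok_large, balances())
```

The credit is deliberately issued before the debit so that the failed transfer leaves a partial update behind for the rollback to undo.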
Advantages of DBMS
Controlling redundancy
Improved data sharing
Data integrity
Security
Data consistency
Efficient data access
Data independence
Disadvantages
Somewhat complex
Needs more memory to run
A Simple Database System Environment
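[Figure: users/programmers and the database system]
[Figure: ER diagram with entities student (name, roll_no) and course (C_name, credit) joined by an m:n relationship enroll]
[Figure: student instance]
[Figure: schema levels showing the view/conceptual mapping, the logical level (Relation 1, Relation 2, ..., Relation n), and the logical/internal mapping]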
Logical Data Independence
The capacity to change the conceptual schema without affecting the view-level
schemas or application programs
We may change the conceptual schema to expand the database
Examples
merging / splitting of records
adding a new attribute to some relation
no need to change the programs or views that do not use the
new attribute
deleting an attribute
no need to change the programs or views that use only the remaining data
view definitions in the VL-LL mapping need to be changed only for views
that use the deleted attribute
changes to constraints can be applied to the conceptual schema without affecting
the view-level schema
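A small sketch of the "adding a new attribute" case, assuming Python's built-in sqlite3 module as a stand-in DBMS. The view student_view, the function app_query, and the sample row are hypothetical; the view plays the role of the view-level schema that the application sees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual (logical) schema: the base relation.
conn.execute(
    "CREATE TABLE student (student_name TEXT, roll_no TEXT PRIMARY KEY, age INTEGER)"
)
conn.execute("INSERT INTO student VALUES ('Anupam', '101', 18)")

# View-level schema: what the application program is written against.
conn.execute(
    "CREATE VIEW student_view AS SELECT roll_no, student_name FROM student"
)


def app_query():
    """Hypothetical application code; it only knows about the view."""
    return conn.execute(
        "SELECT roll_no, student_name FROM student_view ORDER BY roll_no"
    ).fetchall()


before = app_query()

# Change the conceptual schema: add a new attribute to the relation.
conn.execute("ALTER TABLE student ADD COLUMN phone_no TEXT")

after = app_query()
print(before == after)  # the application code was not touched
```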
Physical Data Independence
The ability to modify the physical/internal schema without affecting the logical or view schema
Changes to the internal schema may be needed when some physical files have to be reorganized,
e.g. by creating additional access structures to improve retrieval performance
If the same data as before remains in the database, there is no need to change the conceptual schema
Modification is localized: it is achieved by suitably modifying the PL-LL mapping
A very important feature of a modern DBMS
Examples of modification at the physical level:
creating a new index
changing the access method
switching to a different data structure
modifying the file organization
using a new storage structure
changing the location of the database, say from the C drive to the D drive
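The "creating a new index" case can be sketched as below, assuming Python's built-in sqlite3 module. The index name idx_student_age, the helper plan, and the sample rows are hypothetical. The query text and its result stay the same; only the access path chosen by the DBMS may change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(3, "A", 14), (1, "A", 10), (4, "C", 14), (2, "B", 13), (5, "V", 17)],
)

# The application's query; it is never rewritten in this example.
QUERY = "SELECT id, name, age FROM student WHERE age = 14 ORDER BY id"


def plan():
    """Return SQLite's textual description of how it would run QUERY."""
    # The last column of each EXPLAIN QUERY PLAN row is the detail text.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]


rows_before = conn.execute(QUERY).fetchall()
plan_before = plan()

# Physical-level change: add an access structure.
conn.execute("CREATE INDEX idx_student_age ON student(age)")

rows_after = conn.execute(QUERY).fetchall()
plan_after = plan()

print(rows_before == rows_after)   # same answer, same query text
print(plan_before, plan_after)     # the access plan is free to differ
```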
Database languages
Database languages are used to read, manipulate and store the data in a
database
Several languages: DDL, DML, DCL, TCL
Data Definition Language (DDL)
Specification notation for defining the database schema. It is used to create schemas,
tables, indexes, and constraints in the database
DDL commands: create, drop, alter, truncate, rename
To create the database/tables: create
To alter the structure of the database: alter
To drop the database/table: drop
To delete all records from a table, including the space allocated for the records: truncate
Example
create table student (
student_name char(30),
address varchar(30),
roll_no char(10) NOT NULL PRIMARY KEY,
age int
);
DDL
Drop table student;
Alter table student add phone_no int(10);
Alter table student drop column age;
Alter table student modify student_name varchar(30);
DDL
The DDL compiler processes DDL statements to identify the descriptions of the
schema constructs and stores the schema description in the
DBMS catalog (data dictionary)
The data dictionary is a table that contains information about database objects:
Database schema (description of entities and attributes, as well as the meaning of data
elements)
Integrity constraints (e.g. a primary key uniquely identifies records)
Synonyms, authorization and security codes
Database authorization (who can access what)
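To make the idea of a catalog concrete, the sketch below creates the student table and then reads its description back from the DBMS. It assumes SQLite (through Python's built-in sqlite3 module), where the catalog table is called sqlite_master; other systems expose their data dictionary under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statement: its description ends up in the catalog.
conn.execute(
    """
    CREATE TABLE student (
        student_name CHAR(30),
        address VARCHAR(30),
        roll_no CHAR(10) NOT NULL PRIMARY KEY,
        age INT
    )
    """
)

# sqlite_master is SQLite's catalog: one row per schema object.
catalog = conn.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE name = 'student'"
).fetchall()

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2], row[5]) for row in conn.execute("PRAGMA table_info(student)")]

for obj_type, obj_name, _sql in catalog:
    print(obj_type, obj_name)
for col_name, col_type, is_pk in columns:
    print(col_name, col_type, "PRIMARY KEY" if is_pk else "")
```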
Data Manipulation Language (DML)
Language for accessing and manipulating the data in the database
DML is also known as a query language
Two classes of languages
Procedural: the user instructs the system to perform a sequence of operations on the
database to compute the desired result (e.g. relational algebra)
Non-procedural: the user describes the desired information without giving a specific
procedure for obtaining it (e.g. tuple relational calculus, domain relational
calculus)
Pure query languages: tuple relational calculus, domain relational calculus, relational algebra
SQL is the most widely used commercial query language
The DML compiler translates DML statements in a query language into an evaluation plan
consisting of low-level instructions that the query evaluation engine understands
DML commands: insert, update, delete, select
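The difference between the two classes can be sketched as follows. The procedural half spells out a selection followed by a projection, in the spirit of relational algebra; the non-procedural half only states what is wanted, using SQL as a stand-in for a declarative language. The functions select and project, the table, and the sample rows are hypothetical, and Python's built-in sqlite3 module is assumed.

```python
import sqlite3

students = [
    {"id": 3, "name": "A", "age": 14},
    {"id": 1, "name": "A", "age": 10},
    {"id": 4, "name": "C", "age": 14},
    {"id": 2, "name": "B", "age": 13},
    {"id": 5, "name": "V", "age": 17},
]


# Procedural: the user gives the sequence of operations to perform.
def select(rows, predicate):
    """Keep only the rows that satisfy the predicate (selection)."""
    return [r for r in rows if predicate(r)]


def project(rows, attrs):
    """Keep only the listed attributes of each row (projection)."""
    return [tuple(r[a] for a in attrs) for r in rows]


procedural = project(select(students, lambda r: r["age"] == 14), ["id", "name"])

# Non-procedural: the user describes the result; the DBMS decides how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO student VALUES (:id, :name, :age)", students)
declarative = conn.execute("SELECT id, name FROM student WHERE age = 14").fetchall()

print(sorted(procedural) == sorted(declarative))
```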
DML Examples
Insert into student values ('Anupam','Kolkata','101','982345123');
Delete from student where roll_no='101';
Update student set address='Delhi' where roll_no='101';
-- Be careful with update: the where clause specifies which record(s) are
-- updated. If the where clause is omitted, all records in the table are updated.
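The warning about the where clause can be checked with a short sketch, assuming Python's built-in sqlite3 module. The table layout follows the insert statement above (name, address, roll number, phone number); the second student row and the helper addresses are hypothetical additions for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_name TEXT, address TEXT, roll_no TEXT PRIMARY KEY, phone_no TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        ("Anupam", "Kolkata", "101", "982345123"),
        ("Riya", "Mumbai", "102", "981111111"),  # hypothetical second row
    ],
)


def addresses():
    """Return {roll_no: address} for every student."""
    return dict(conn.execute("SELECT roll_no, address FROM student ORDER BY roll_no"))


# UPDATE with a WHERE clause: only the matching record changes.
conn.execute("UPDATE student SET address = 'Delhi' WHERE roll_no = '101'")
after_where = addresses()

# UPDATE without a WHERE clause: every record changes.
conn.execute("UPDATE student SET address = 'Delhi'")
after_no_where = addresses()

# DELETE with a WHERE clause removes only the matching record.
conn.execute("DELETE FROM student WHERE roll_no = '101'")
remaining = [row[0] for row in conn.execute("SELECT roll_no FROM student")]

print(after_where, after_no_where, remaining)
```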
Query
ID  Name  Age
3   A     14
1   A     10
4   C     14
2   B     13
5   V     17
Result (the rows with Age 14):
ID  Name  Age
3   A     14
4   C     14
Roll back vs Commit
student
ID  Name  Age
1   A     10
2   B     13
3   A     14
4   C     14
5   V     17
Database Design
How can we tell whether a database design is "good" or "bad"?
[Figure: Employee table; EMP and Dept tables]
Database Users and Administrators
Users are differentiated by the way they expect to interact with the system
Naive users
Unsophisticated users who interact with the system by invoking one of the application
programs that have been written previously
No deep knowledge of database required
Use the GUI provided by an application program
Examples: people accessing database over the web, bank tellers, data entry operators
Application Programmers
Computer professionals who write application programs
Embed SQL in a high-level programming language and develop programs to satisfy requirements
Interact with the system through DML calls
Should thoroughly understand the logical schema or relevant views
Testing of programs is necessary
DBA (Database Administrator)
Coordinates all the activities of the database system; the database
administrator has a good understanding of the enterprise’s information
resources and needs
Functionalities
Creation and modification of conceptual schema
Implementation of storage structure and access method
Physical organization modifications
Grant/ revoke authorization to other users for data access
Integrity constraints specification
Execute immediate recovery procedures in case of failures
Ensure physical security of the database
Overall Database Architecture
Data storage and Querying
Storage manager
Query processor
Transaction management
Storage Management
The storage manager is a program module that provides the interface
between the low-level data stored in the database and the application
programs and queries submitted to the system
Responsible for interaction with the file manager
The raw data are stored on the disk using the file system, which is
usually provided by a conventional operating system.
The storage manager translates the various DML statements into low-
level file-system commands. Thus, the storage manager is responsible
for storing, retrieving, and updating data in the database.
Query Processor
[Figure: query processing stages: query, scanner, parser and translator, intermediate query form, data]
Transaction Management