Database Systems-Lec 1
Course Details
Textbook
◦ Database Systems Concepts Design Applications by S K
Singh, latest edition 2011
Reference
◦ Database System by Catherine Ricardo
◦ Database Management Systems by Raghu Ramakrishnan and
Johannes Gehrke
◦ Database Systems: Design, Implementation, and
Management By Carlos Coronel, Steven Morris, Peter
Rob, 10th edition 2012
◦ An Introduction to Database Systems By Date, 2006
◦ Introduction To ORACLE: SQL and PL/SQL, Student
Guide, Production 1.1, Volume 1.2.
Course Outline
(subject to minor changes)
Basics: Introduction to Database Systems, The Entity Relationship Model,
The Relational Model, Relational Algebra and Calculus
SQL: Queries, Programming, Triggers, Query By Example (QBE)
Data Storage and Indexing: File Organization and Indexes, Tree Structured
Indexing, Hash Based Indexing
Query Evaluation: External Sorting, Evaluation Of Relational Operators,
Introduction to Query Optimization, A Typical Relational Query Optimizer
Database Design: Schema Refinement and Normal Forms, Physical
Database Design and Tuning, Security
Transaction Management: Transaction Management Overview,
Concurrency Control, Crash Recovery
Advanced topics: Parallel and Distributed Databases, Internet Databases,
Decision Support, Data Mining, Object Database Systems, Spatial Data
management, Deductive Databases
What are Data?
Data are often viewed as the lowest level of abstraction
from which information and knowledge are derived.
Data can exist in a variety of forms -- as numbers or text
on pieces of paper, as bits and bytes stored in electronic
memory, or as facts stored in a person's mind.
Raw data are unprocessed numbers, characters, images or
other outputs from devices that convert physical
quantities into symbols.
Usually, there are many facts to describe something of
interest to us. (For example, employee data are used to
calculate payroll checks, send company greetings, and
inform family in case of emergency.)
Data: Where can we find it?
Filing Cabinets
Folders
File Systems
ASCII file
Accounts separated by new lines
Fields separated by #’s
Different files: account types, branches etc.
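The file layout described above can be sketched in a few lines of Python. This is only an illustration: the field order (account number, branch, balance) and the sample values are assumptions, since the slides do not give the actual file contents.

```python
import io

# Hypothetical file contents: one account per line, fields separated by '#'.
# The field order (account number, branch, balance) is assumed for illustration.
ACCOUNTS_TEXT = "A-101#Downtown#500\nA-215#Mianus#700\nA-102#Perryridge#400\n"


def parse_accounts(stream):
    """Read '#'-separated account records from a text stream."""
    accounts = []
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        number, branch, balance = line.split("#")
        accounts.append({"number": number, "branch": branch, "balance": int(balance)})
    return accounts


accounts = parse_accounts(io.StringIO(ACCOUNTS_TEXT))
total_balance = sum(a["balance"] for a in accounts)
print(total_balance)
```

Note that the program itself has to know the separator and the field order, which is the kind of program-data dependence discussed later in the lecture.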
Advantages of File System
Provides a useful historical perspective on how data
were handled
Helps in understanding the design complexity
of the overall system
Understanding the problems and limitations
of file-based systems helps in avoiding the
same problems when designing a database system.
Drawbacks of File System
Data Redundancy
◦ Decentralized approach adopted
◦ Duplication of information in different files (For example,
cust_id data in CUSTOMER and SALES file)
◦ Wasteful (more storage space, extra time, more effort)
Data Inconsistency
◦ Multiple file formats and duplication of information in
different files (e.g., a name is 15 characters in one file but
10 characters in another)
◦ Various copies of the same data may be different
◦ Results in maintenance overhead and storage costs
◦ Serious degradation in the quality and accuracy of
information
Difficulty in Accessing Data
◦ Need to write a new program to carry out each new task
Data Isolation
◦ Data are scattered in various files, so retrieving them together is difficult
Program Data Dependence
◦ A change in file structure requires changing the file
description (physical structure, storage of the data files and
records) in each program to conform to the new file structure
◦ Difficult to locate all files affected by the change
◦ Time-consuming and error-prone when making changes
Poor data control
◦ Multiple names used by various departments
◦ The same data field may have different meanings in different
contexts, and different fields may have the same meaning, which
leads to poor data control and confusion
Limited Data Sharing & Excessive Programming Effort
◦ Each application has its own private files
◦ Little opportunity to share data with other applications
◦ Obtaining data from several incompatible files in separate
systems requires a large programming effort
Inadequate data manipulation capabilities
◦ No connection between data in different files, so data
manipulation capability is limited
Integrity Problems
◦ Integrity constraints (e.g. account balance > 0) become part
of program code
◦ Hard to add new constraints or change existing ones
Atomicity Problems
◦ Failures may leave database in an inconsistent state with
partial updates carried out
◦ E.g. transfer of funds from one account to another should
either complete or not happen at all
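As a hedged illustration of the integrity point above, the sketch below declares the rule "account balance > 0" once in the schema, so that it no longer has to live in program code. It uses Python's built-in sqlite3 module; the table and column names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The rule "balance must be positive" is declared once, in the schema,
# instead of being repeated in every application program.
conn.execute(
    "CREATE TABLE account ("
    " acc_no TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.execute("INSERT INTO account VALUES ('A-101', 500)")

try:
    # This update would violate the constraint, so the DBMS rejects it.
    conn.execute("UPDATE account SET balance = -50 WHERE acc_no = 'A-101'")
    violation_rejected = False
except sqlite3.IntegrityError:
    violation_rejected = True

balance = conn.execute(
    "SELECT balance FROM account WHERE acc_no = 'A-101'"
).fetchone()[0]
print(violation_rejected, balance)
```

Changing the rule then means changing one table definition rather than hunting through every program that touches the file.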
Concurrent access by multiple users
◦ Concurrent access is needed for performance
◦ Uncontrolled concurrent accesses can lead to
inconsistencies
For example, two people reading a balance and updating
it at the same time
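The balance example above can be traced step by step. The following is a deterministic simulation, not real concurrent code: the interleaving is written out by hand, and the amounts are made up for illustration.

```python
# Deterministic simulation of a "lost update": two users read the same
# balance, then each writes back a value computed from a stale copy.
balance = 100

read_by_user1 = balance          # user 1 reads 100
read_by_user2 = balance          # user 2 reads 100 before user 1 writes

balance = read_by_user1 + 50     # user 1 deposits 50 and writes 150
balance = read_by_user2 - 30     # user 2 withdraws 30 and overwrites with 70

uncontrolled_result = balance    # the deposit has been lost

# Serial execution: the second user reads only after the first has written.
balance = 100
balance = balance + 50
balance = balance - 30
serial_result = balance

print(uncontrolled_result, serial_result)
```

The uncontrolled interleaving ends at 70, while running the two updates one after the other ends at 120.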
Security Problems
◦ Access Control
Database systems offer solutions to all the above
problems
What is a Database?
A database consists of an organized collection of data
for one or more uses.
An organized body of related information.
A collection of logically related data stored together
that is designed to meet the information needs of the
organization
Database Applications
Databases play a critical role in almost all areas
◦ Banking: all transactions
◦ Airline: reservation, schedules
◦ Universities: registration, grades
◦ Sales: customers, products, purchases
◦ Manufacturing: production, inventory, orders, supply
chain
◦ Human resources: employee records, salaries, tax
deductions
What is a Database?
A database can be of any size and of varying
complexity.
◦ For example, a list of names and addresses of friends
◦ The book catalog of a large library may contain half a
million records
◦ A database of much greater size and complexity is
maintained to keep track of the tax information filed by
taxpayers.
A software system that facilitates the creation,
maintenance and use of an electronic database is called a
database management system (DBMS).
What is Database Management?
Database management is an approach that provides
simple access to information stored in databases.
What is a Database Management System?
A DBMS is a collection of software programs that
enables users to create, maintain and utilize a database.
A DBMS is a general-purpose software system that supports
◦ Defining a database (specifying the data types, structures and
constraints)
◦ Constructing a database (storing the data on storage media)
◦ Manipulating a database (querying to retrieve specific data,
updating to reflect changes and generating reports from the data)
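The three activities listed above (defining, constructing, manipulating) can be sketched with Python's built-in sqlite3 module. The employee table and its rows are hypothetical, and an in-memory database is used for brevity; passing a file path to connect() would store the data on disk instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining: specify data types, structure and constraints.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary INTEGER CHECK (salary >= 0))"
)

# Constructing: store the data.
conn.executemany(
    "INSERT INTO employee (emp_id, name, salary) VALUES (?, ?, ?)",
    [(1, "Asha", 4000), (2, "Bilal", 5500), (3, "Chen", 6200)],
)

# Manipulating: update to reflect a change, then query and summarize.
conn.execute("UPDATE employee SET salary = salary + 500 WHERE emp_id = 1")
high_earners = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM employee WHERE salary > 5000 ORDER BY name"
    )
]
total_salary = conn.execute("SELECT SUM(salary) FROM employee").fetchone()[0]
print(high_earners, total_salary)
```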
(Figure: Database Management System)
What is a DBMS?
Functions of DBMS
◦ Insert records
◦ Delete records
◦ Update records
◦ Query records
◦ Add and delete files from the database
In short, a DBMS comprises two main parts
◦ Data management in the database
◦ User management associated with the database
Commercial DBMSs
Company        Product
Oracle         Oracle 8i, 9i, 10g, 11g
IBM            DB2, Universal Server (from System R, System R*,
               Starburst) & Informix
Microsoft      Access, SQL Server
Sybase         Adaptive Server
Informix       Dynamic Server
NCR            Teradata
UC Berkeley    INGRES, PostgreSQL
Advantages of DBMS
Minimal Data Redundancy
◦ Centralized database and control of data
◦ Eliminates extra processing to trace the required data
◦ Storage requirement also reduced
◦ If duplicate data exist, the DBMS is aware of them and ensures
that multiple copies are consistent
Program Data Independence
◦ Separation of data description from the application
programs
◦ A change in the data description does not affect the
application programs that process the data
Efficient Data Access
◦ Uses sophisticated techniques to store and retrieve data
efficiently
Improved Data Sharing
◦ Centralized repository of data belonging to entire
organization (For example, university data)
◦ Can be shared by all authorized users
Improved Data Consistency
◦ When the same data item is stored in two entries, the DBMS
ensures that any change made to either entry is automatically
applied to the other one as well, known as propagating updates
Improved Data Integrity
◦ Ensures that the data is accurate and consistent
◦ For example, a month value must lie in the range 01 to 12, and
a money transfer below a specified amount is not allowed
Improved Security
◦ Protects the database from unauthorized users
◦ User names and passwords can be defined to authorize users,
and each user may be restricted to particular types of access
Increased Productivity of Application Development
◦ Provide many of the standard functions, such as forms and
report generators to automate some of the activities of the
database design
◦ Simplify the development of the database applications
Enforcement of Standards
Economy of Scale
Balance of Conflicting Requirements
Improved Data Accessibility
Improved Responsiveness
Increased Concurrency
Reduced Program Maintenance
Improved Backup and Recovery Services
Data Abstraction
Database systems provide users with an abstract
view of data, hiding certain details of how data are
stored and maintained
Physical Level
◦ Describes how data is actually stored
Logical Level
◦ Describes what data are stored in the database and what
relationships exist among those data
View Level
◦ Describes only part of the entire database hiding details of
data types.
◦ Views can also hide information (e.g., salary) for security
purposes
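A view that hides the salary column, as in the example above, might look like the following sketch. It uses Python's built-in sqlite3 module; the table, the view name employee_public and the rows are assumptions. SQLite has no user accounts, so here the view only demonstrates the hiding; a multi-user DBMS would additionally restrict who may read the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full employee table, including salary.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 4000), (2, "Bilal", "HR", 5500)],
)

# View level: expose only part of the table; salary is left out.
conn.execute(
    "CREATE VIEW employee_public AS SELECT emp_id, name, dept FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_public ORDER BY emp_id")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```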
(Figure: Levels of Abstraction, from the company database down
to files on disks)
Disadvantages of DBMS
Increased Complexity
◦ Multi-user DBMS becomes an extremely complex piece of
software
◦ Necessary to understand the whole design to take advantage
of it
◦ Failure to understand it results in bad design decisions
Requirement of New and Specialized Manpower
◦ Need to hire, train and retrain manpower on a regular basis to
design and implement databases
Large Size of DBMS
◦ Requires a large amount of memory to run efficiently
Increased Installation & Management Cost
◦ Requires trained manpower to install and operate the DBMS,
and also requires upgrades to the hardware, software and data
communication systems
◦ Substantial training is required on an ongoing basis to keep up
with new releases and upgrades
Conversion Cost
◦ From legacy system to modern DBMS environment
◦ Includes the cost of the DBMS and hardware, and the cost of
employing specialists
Need for Explicit Backup & Recovery
◦ A comprehensive procedure is required for making backup copies
of data and restoring a database when damage occurs
Data Administrator
An identified individual in the organization who
has central responsibility for controlling the data
The job of the data administrator is to
◦ Decide what data should be stored in the database and identify
the entities of interest to the organization
◦ Establish policies for maintaining and dealing with that data
Database Administrator
An individual or group of persons with an overview of
one or more databases who controls the design and use
of these databases
Provides the necessary technical support for
implementing policy decisions of databases
A DBA is the central controller of the database systems
who oversees and manages all the resources
Responsible for authorizing access to the database, for
coordinating and monitoring its use and for acquiring
software and hardware resources as required
Functions & Responsibilities of Database Administrator
Defining conceptual schema and database creation
Storage structure and access-method definition
Granting authorization to the users
Physical organization modification
Routine maintenance
Job monitoring
Database Users
Users are differentiated by the way they are expected to
interact with the system
Application programmers
◦ Develop applications that interact with DBMS through
DML calls
Sophisticated users
◦ Form requests in a database query language
End users
◦ Invoke one of the existing application programs (e.g.,
print monthly sales report)
◦ Interact with applications through GUI
◦ E.g. People accessing database over the web, bank tellers,
clerical staff
Information
Information is processed, organized or summarized
data
Data are processed to create the information, which is
meaningful to the recipient
It helps give warning signals before something
starts going wrong
It helps predict the future with a reasonable level of accuracy
and helps the organization to make the best decisions.
A database may contain either data or information or
both.
Metadata
Data about the data
It describes objects in the database and makes it easier for
those objects to be accessed or manipulated
It describes the database structure, constraints,
applications, authorization, size of data types and so on
Example:
Employee_Name: character type, 30-character field
Age: integer type, 8 bytes
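To make "data about the data" concrete, the sketch below creates a table with the two fields from the example and then asks the DBMS to describe it. It uses Python's built-in sqlite3 module; the table name employee is an assumption, and PRAGMA table_info is specific to SQLite (other systems expose column metadata through their own catalog tables).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (Employee_Name VARCHAR(30), Age INTEGER)")

# PRAGMA table_info returns one row per column:
# (cid, name, declared type, notnull flag, default value, primary-key flag)
column_metadata = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")
]
print(column_metadata)
```

No employee rows have been stored at this point; everything returned is metadata.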
System Catalog
Repository of information describing the data in the
database
System-created database that describes all database
objects, data dictionary information and user access
information
It also describes table-related data such as table names,
table creator or owners, column names, data type, data
size, authorized users and so on
It describes the structure of the primary database
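As one hedged example of a system catalog, SQLite keeps a system-created table named sqlite_master that lists every table and view in the database. The sketch below uses Python's built-in sqlite3 module; the customer, sales and big_sales objects are hypothetical. SQLite's catalog is minimal and holds no user access information, unlike the fuller catalogs described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, cust_id INTEGER, amount INTEGER)"
)
conn.execute("CREATE VIEW big_sales AS SELECT * FROM sales WHERE amount > 1000")

# The catalog is itself queried with ordinary SQL.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()
print(catalog)
```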
Data Item or Field
A data item is the smallest unit of data that has meaning to
its user
Data items are the basic building blocks of the database
A data item may be used to construct other, more complex
structures
Records
A record is a collection of logically related fields or
data items.
Data items are grouped together to form a record.
Records are retrieved or updated using programs.
Files
A file is a sequence of related records
Fixed-length records: every record in the file has exactly
the same size
Variable-length records: different records in the file have
different sizes.
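One practical consequence of fixed-length records is that the position of record i can be computed directly as i times the record size. The sketch below illustrates this with Python's struct module; the layout (a 30-byte name followed by an 8-byte integer age) borrows the field sizes from the Metadata example earlier in these slides, and the sample names are made up.

```python
import struct

# Assumed layout: 30-byte name, 8-byte signed integer age.
# "<" selects little-endian with no padding, so every record is 38 bytes.
RECORD = struct.Struct("<30sq")


def pack_record(name, age):
    """Encode one record; names shorter than 30 bytes are NUL-padded."""
    return RECORD.pack(name.encode("ascii"), age)


def read_record(data, index):
    """Jump straight to record `index`; possible because all records share one size."""
    offset = index * RECORD.size
    raw_name, age = RECORD.unpack_from(data, offset)
    return raw_name.rstrip(b"\x00").decode("ascii"), age


file_bytes = b"".join(
    pack_record(name, age) for name, age in [("Asha", 31), ("Bilal", 45), ("Chen", 28)]
)
third = read_record(file_bytes, 2)
print(RECORD.size, len(file_bytes), third)
```

With variable-length records this direct offset calculation is not available, so a file needs some other way (for example, length prefixes or separators) to find where each record starts.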
Data Dictionary
Data dictionaries are mini database management
systems that manage metadata
A data dictionary is a repository of information about a
database that documents the data elements of the database.
The data dictionary is an integral part of the database
management system (DBMS) and stores metadata, or
information about the database, such as attribute names and
definitions for each table in the database.
Data dictionaries aid the database administrator in the
management of a database and of user view definitions, as
well as their use.
(Figure: Structure of Data Dictionary)
Transaction Management
A transaction is a collection of operations that performs a
single logical function in a database application
Atomicity
◦ All-or-none (Both debit and credit or neither)
Consistency
◦ Preserve the consistency of the database (A+B before =
A+B after)
Isolation
◦ A transaction must not be influenced by changes made by
other concurrently executing transactions
Durability
◦ New values must persist despite system failures (e.g.,
power failures and operating system crashes)
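The atomicity and consistency properties above can be sketched with a funds transfer between accounts A and B. This is a minimal illustration using Python's built-in sqlite3 module; the account table, the transfer helper and the amounts are assumptions, and isolation and durability are not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def total(conn):
    """Sum of all balances (A + B), used to check consistency."""
    return conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]


def transfer(conn, source, target, amount):
    """Credit and debit as one transaction; returns True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


before = total(conn)
ok = transfer(conn, "A", "B", 30)       # succeeds: A=70, B=80
failed = transfer(conn, "A", "B", 500)  # debit would overdraw A, so the credit is undone
after = total(conn)
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, before, after, balances)
```

In the failed transfer the credit to B has already been applied when the debit is rejected; the rollback removes it, so neither half of the transfer survives and A+B is the same before and after.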
References
Chapter 1, Database Systems, S K Singh
Chapter 1, Database System Concepts, Silberschatz, Korth, Sudarshan
Chapter 1, Database Management Systems, by Ramakrishnan and
Gehrke
Course material from:
◦ Introduction to database systems – Duke University
◦ Database Systems – MCS Fall 2009