Chap1 Introduction To Database
Databases
Outline
Introduction
- Basic Definitions
- Example of a Database
Characteristics of the Database Approach
Advantages of Using the Database Approach
Introduction
Database: a collection of related data
Data: known facts that can be recorded and that have
implicit meaning
Implicit properties of a database
- It represents some aspect of the real world, sometimes
called the miniworld
- It is a logically coherent collection of data with some
inherent meaning
- It is designed, built, and populated with data for a specific
purpose
A database
- can be of any size and complexity
- may be generated and maintained manually, or it may
be computerized
Database Management System (DBMS):
A general-purpose software system that facilitates the processes
of
- defining,
- constructing,
- manipulating, and
- sharing databases among various users and applications.
Other functions
- protecting the database:
system protection against hardware or software malfunction
security protection against unauthorized or malicious access
- maintaining it over a long period of time
Fig: A Simplified Database System
Example of a Database
Constructing the UNIVERSITY database
- Store data to represent each student, course, section, grade
report, and prerequisite as a record in the appropriate file
- Records in the various files may be related
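The construction step above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the layout used in the original figure: the column names and the sample values are assumptions chosen for the example, and each "file" is modeled as one table.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Defining: one table per record type (column names are assumptions).
conn.executescript("""
CREATE TABLE STUDENT (
    Student_number INTEGER PRIMARY KEY,
    Student_name   TEXT NOT NULL
);
CREATE TABLE COURSE (
    Course_number TEXT PRIMARY KEY,
    Course_name   TEXT NOT NULL
);
CREATE TABLE SECTION (
    Section_identifier INTEGER PRIMARY KEY,
    Course_number      TEXT NOT NULL REFERENCES COURSE(Course_number)
);
CREATE TABLE GRADE_REPORT (
    Student_number     INTEGER NOT NULL REFERENCES STUDENT(Student_number),
    Section_identifier INTEGER NOT NULL REFERENCES SECTION(Section_identifier),
    Grade              TEXT,
    PRIMARY KEY (Student_number, Section_identifier)
);
CREATE TABLE PREREQUISITE (
    Course_number       TEXT NOT NULL REFERENCES COURSE(Course_number),
    Prerequisite_number TEXT NOT NULL REFERENCES COURSE(Course_number),
    PRIMARY KEY (Course_number, Prerequisite_number)
);
""")

# Constructing: store one record per entity (all values are made up).
conn.execute("INSERT INTO STUDENT VALUES (1, 'Student A')")
conn.execute("INSERT INTO COURSE VALUES ('C101', 'Course One')")
conn.execute("INSERT INTO SECTION VALUES (10, 'C101')")
conn.execute("INSERT INTO GRADE_REPORT VALUES (1, 10, 'B')")
conn.commit()

# Records in different tables are related through shared values.
row = conn.execute("""
    SELECT s.Student_name, sec.Course_number, g.Grade
    FROM GRADE_REPORT AS g
    JOIN STUDENT AS s   ON s.Student_number = g.Student_number
    JOIN SECTION AS sec ON sec.Section_identifier = g.Section_identifier
""").fetchone()
print(row)
```

The join at the end shows what "related records" means in practice: a grade report record points to a student and a section through their identifying values.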
Manipulating a UNIVERSITY database
‘Smith’
• List the prerequisites of the ‘Database’ course
Examples of updates:
structure’ section
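Manipulation covers both retrieval and update. The sketch below, using Python's sqlite3 module, answers the prerequisite query listed above and then applies one update. The table layouts, course numbers, and the particular update (changing a student's class) are hypothetical and serve only as an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and sample rows, for illustration only.
conn.executescript("""
CREATE TABLE COURSE (Course_number TEXT PRIMARY KEY, Course_name TEXT);
CREATE TABLE PREREQUISITE (Course_number TEXT, Prerequisite_number TEXT);
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class INTEGER);
INSERT INTO COURSE VALUES ('C300', 'Database');
INSERT INTO COURSE VALUES ('C200', 'Course Two');
INSERT INTO COURSE VALUES ('C100', 'Course One');
INSERT INTO PREREQUISITE VALUES ('C300', 'C200');
INSERT INTO PREREQUISITE VALUES ('C300', 'C100');
INSERT INTO STUDENT VALUES (1, 'Smith', 1);
""")


def prerequisites_of(course_name):
    """Retrieval: list the prerequisite course numbers of a named course."""
    rows = conn.execute(
        """
        SELECT p.Prerequisite_number
        FROM PREREQUISITE AS p
        JOIN COURSE AS c ON c.Course_number = p.Course_number
        WHERE c.Course_name = ?
        ORDER BY p.Prerequisite_number
        """,
        (course_name,),
    ).fetchall()
    return [r[0] for r in rows]


prereqs = prerequisites_of("Database")

# Update: change one stored fact (a made-up example of an update).
conn.execute("UPDATE STUDENT SET Class = 2 WHERE Name = ?", ("Smith",))
conn.commit()
new_class = conn.execute(
    "SELECT Class FROM STUDENT WHERE Name = ?", ("Smith",)
).fetchone()[0]

print(prereqs, new_class)
```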
Characteristics of the Database Approach
File Processing vs. DBMS
Redundancy
• In traditional file processing, each user defines and implements
the files needed for a specific software application as part of
programming the application
• For example, one user, the grade reporting office, may keep
files on students and their grades
• Programs to print a student’s transcript and to enter new
grades are implemented as part of the application
• A second user, the accounting office, may keep track of
students’ fees and their payments
• Although both users are interested in data about students, each
user maintains separate files and programs to manipulate these
files because each requires some data not available from the
other user’s files
• Redundancy in defining and storing data results in wasted
storage space and in redundant efforts to maintain common
up-to-date data.
• In the database approach, a single repository maintains data
that is defined once and then accessed by various users.
• In file systems, each application is free to name data elements
independently.
• In contrast, in a database, the names or labels of data are
defined once, and used repeatedly by queries, transactions,
and applications.
Fig 1.3: Database System vs. File System
File System: Problem Case
The main characteristics of the database approach versus the
file-processing approach are the following:
• Self-describing nature of a database system
• Insulation between programs and data, and data abstraction
• Support of multiple views of the data
• Sharing of data and multiuser transaction processing
Self-Describing Nature of a Database System
A database system contains not only the database itself but also a
complete definition or description of the database structure and
constraints.
This definition is stored in the DBMS catalog.
The information stored in the catalog is called meta-data.
The catalog is used by the DBMS software and also by database
users who need information about the database structure.
In traditional file processing, data definition is typically part of
the application programs themselves.
Hence, these programs are constrained to work with only one
specific database, whose structure is declared in the application
programs.
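To make the idea of a catalog concrete, the sketch below reads meta-data back from SQLite through Python's sqlite3 module. SQLite exposes its catalog as the `sqlite_master` table and the `PRAGMA table_info` command; other DBMSs expose theirs differently. The `describe` helper and the STUDENT columns are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT NOT NULL)"
)


def describe(conn):
    """Return {table: [(column, declared type), ...]} read from the catalog.

    The function knows nothing about STUDENT in advance; it discovers the
    structure by querying the meta-data that SQLite keeps about itself.
    """
    result = {}
    names = [
        r[0]
        for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    for table in names:
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        result[table] = [(col[1], col[2]) for col in info]
    return result


structure = describe(conn)
print(structure)
```

Because the structure is read at run time rather than declared in the program, the same helper works unchanged against any SQLite database.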
Insulation between Programs and Data, and Data
Abstraction
Program-data independence
In traditional file processing, the structure of data files is
embedded in the application programs, so any changes to the
structure of a file may require changing all programs that
access that file.
By contrast, DBMS access programs do not require such
changes in most cases. The structure of data files is stored in
the DBMS catalog separately from the access programs.
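Program-data independence can be illustrated with a short sketch using Python's sqlite3 module. The access function below names only the column it needs, so adding a column to the table does not require changing it. The `Birth_date` column and the `student_names` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("INSERT INTO STUDENT VALUES (1, 'Smith')")
conn.commit()


def student_names(conn):
    """An access program that refers to data by name, not by file layout."""
    rows = conn.execute("SELECT Name FROM STUDENT ORDER BY Student_number")
    return [r[0] for r in rows]


before = student_names(conn)

# Change the structure: add a new data item to every STUDENT record.
conn.execute("ALTER TABLE STUDENT ADD COLUMN Birth_date TEXT")

# The access program is unchanged and still works.
after = student_names(conn)
print(before, after)
```

A file-processing program that hard-codes the record layout would instead need to be edited whenever a field is added to the record.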
Program-operation independence
Users can define operations on data as part of the database
definitions.
User application programs can operate on the data by invoking
these operations through their names and arguments,
regardless of how the operations are implemented.
Data abstraction
A DBMS provides users with a conceptual representation
of data that does not include many of the details of how
the data is stored or how the operations are implemented
Support of Multiple Views of the Data
A database has many users, each of whom may require a
different perspective or view of the database.
A view may be a subset of the database, or it may contain
virtual data that is derived from the database files but is not
explicitly stored.
For example, one user of the database may be interested
only in accessing and printing the transcript of each student;
the view for this user is shown in the figure below.
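A transcript view of this kind can be sketched in SQL through Python's sqlite3 module. The table layouts, the view name `TRANSCRIPT`, and the sample rows are assumptions for illustration; the point is that the view stores no data of its own and is derived from the base tables each time it is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Student_name TEXT);
CREATE TABLE SECTION (Section_identifier INTEGER PRIMARY KEY, Course_number TEXT);
CREATE TABLE GRADE_REPORT (
    Student_number INTEGER, Section_identifier INTEGER, Grade TEXT
);
INSERT INTO STUDENT VALUES (1, 'Smith');
INSERT INTO SECTION VALUES (10, 'C100');
INSERT INTO GRADE_REPORT VALUES (1, 10, 'B');

-- The view is a stored query, not a stored copy of the data.
CREATE VIEW TRANSCRIPT AS
SELECT s.Student_name, sec.Course_number, g.Grade
FROM GRADE_REPORT AS g
JOIN STUDENT AS s   ON s.Student_number = g.Student_number
JOIN SECTION AS sec ON sec.Section_identifier = g.Section_identifier;
""")

transcript = conn.execute("SELECT * FROM TRANSCRIPT").fetchall()

# New base data shows up in the view without touching the view itself.
conn.execute("INSERT INTO SECTION VALUES (11, 'C200')")
conn.execute("INSERT INTO GRADE_REPORT VALUES (1, 11, 'A')")
conn.commit()
updated = conn.execute(
    "SELECT * FROM TRANSCRIPT ORDER BY Course_number"
).fetchall()
print(transcript, updated)
```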
Sharing of Data and Multiuser Transaction Processing
Advantages of Using the DBMS Approach
1. Controlling Redundancy
2. Restricting Unauthorized Access
3. Providing Persistent Storage for Program Objects
4. Providing Storage Structures and Search Techniques for
Efficient Query Processing
5. Providing Backup and Recovery
6. Providing Multiple User Interfaces
7. Representing Complex Relationships among Data
8. Enforcing Integrity Constraints
9. Permitting Inferencing and Actions Using Rules
10. Additional Implications of Using the Database Approach
1. Controlling Redundancy
Redundancy leads to several problems: duplicated effort to keep
copies up to date, wasted storage space, and files that may
become inconsistent
It is sometimes necessary to use controlled redundancy to
improve the performance of queries
For example, we may store Student_name and Course_number
redundantly in a GRADE_REPORT file because whenever we
retrieve a GRADE_REPORT record, we want to retrieve the
student name and course number along with the grade,
student number, and section identifier.
By placing all the data together, we do not have to search
multiple files to collect this data. This is known as
denormalization.
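The GRADE_REPORT example can be sketched with Python's sqlite3 module. The denormalized table repeats Student_name and Course_number so that a report needs no join; the price is that the copies must be kept in agreement with the source tables. The exact table layouts, the sample rows, and the consistency check are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Student_name TEXT);
CREATE TABLE SECTION (Section_identifier INTEGER PRIMARY KEY, Course_number TEXT);

-- Student_name and Course_number are stored redundantly on purpose.
CREATE TABLE GRADE_REPORT (
    Student_number     INTEGER,
    Student_name       TEXT,
    Section_identifier INTEGER,
    Course_number      TEXT,
    Grade              TEXT
);
INSERT INTO STUDENT VALUES (1, 'Smith');
INSERT INTO SECTION VALUES (10, 'C100');
INSERT INTO GRADE_REPORT VALUES (1, 'Smith', 10, 'C100', 'B');
""")

# Denormalized read: everything comes from one table.
single_table = conn.execute(
    "SELECT Student_name, Course_number, Grade FROM GRADE_REPORT"
).fetchall()

# Normalized read: the same answer, but it needs two joins.
joined = conn.execute("""
    SELECT s.Student_name, sec.Course_number, g.Grade
    FROM GRADE_REPORT AS g
    JOIN STUDENT AS s   ON s.Student_number = g.Student_number
    JOIN SECTION AS sec ON sec.Section_identifier = g.Section_identifier
""").fetchall()

# Controlling the redundancy: count copies that disagree with the source.
mismatches = conn.execute("""
    SELECT COUNT(*)
    FROM GRADE_REPORT AS g
    JOIN STUDENT AS s ON s.Student_number = g.Student_number
    WHERE g.Student_name <> s.Student_name
""").fetchone()[0]
print(single_table, joined, mismatches)
```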
2. Restricting Unauthorized Access
When multiple users share a large database, it is likely that
most users will not be authorized to access all information in
the database.
For example, financial data is often considered confidential, and
only authorized persons are allowed to access such data.
A DBMS should provide a security and authorization
subsystem, which the DBA uses to create accounts and to
specify account restrictions.
The DBMS should then enforce these restrictions automatically.
3. Providing Persistent Storage for Program
Objects
Object-oriented database systems make it easier for complex
runtime objects to be saved in secondary storage so that they
survive beyond program termination and can be retrieved at a
later time.
Object-oriented database systems are compatible with
programming languages such as C++ and Java, and the DBMS
software automatically performs any necessary conversions.
4. Providing Storage Structures and Search Techniques for
Efficient Query Processing
The DBMS maintains indexes that are utilized to improve the
execution time of queries and updates.
The DBMS has a buffering or caching module that maintains parts
of the database in main memory buffers.
The query processing and optimization module is responsible
for choosing an efficient query execution plan for each query
submitted to the system.
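The effect of an index on the chosen execution plan can be observed in SQLite through Python's sqlite3 module. This is a sketch under assumptions: the STUDENT layout, the index name `idx_student_name`, and the generated rows are made up, and the wording of `EXPLAIN QUERY PLAN` output differs between SQLite versions, so the code only checks whether the plan mentions the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class INTEGER)"
)
conn.executemany(
    "INSERT INTO STUDENT (Name, Class) VALUES (?, ?)",
    [(f"Student {i}", i % 4 + 1) for i in range(1000)],
)
conn.commit()


def plan(sql, params=()):
    """Return the optimizer's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(r[-1] for r in rows)


query = "SELECT Student_number FROM STUDENT WHERE Name = ?"

plan_before = plan(query, ("Student 42",))      # no index on Name yet
conn.execute("CREATE INDEX idx_student_name ON STUDENT(Name)")
plan_after = plan(query, ("Student 42",))       # optimizer can now use it

print(plan_before)
print(plan_after)
```

The query text is identical in both calls; only the storage structures changed, and the optimizer picked a different plan on its own.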
5. Providing Backup and Recovery
The backup and recovery subsystem of the DBMS is
responsible for recovery.
It ensures that recovery is possible in the case of a system crash
during execution of one or more transactions.
Disk backup is also necessary in case of a catastrophic disk
failure.
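The sketch below, using Python's sqlite3 module, shows the two ideas in miniature: a transaction that is interrupted leaves no partial changes behind, and a backup copy can be taken of a consistent database. A raised exception stands in for a failure here; it is not a real system crash, and the table layout, `SimulatedCrash`, and `enter_grades` are hypothetical names.

```python
import sqlite3


class SimulatedCrash(Exception):
    """Stand-in for a failure that interrupts a transaction."""


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GRADE_REPORT "
    "(Student_number INTEGER, Section_identifier INTEGER, Grade TEXT)"
)


def enter_grades(conn, grades, fail_after=None):
    """Insert all grades as one transaction; undo everything if interrupted."""
    with conn:  # commits on success, rolls back if an exception escapes
        for i, row in enumerate(grades):
            if fail_after is not None and i == fail_after:
                raise SimulatedCrash("interrupted mid-transaction")
            conn.execute("INSERT INTO GRADE_REPORT VALUES (?, ?, ?)", row)


def count_rows(c):
    return c.execute("SELECT COUNT(*) FROM GRADE_REPORT").fetchone()[0]


grades = [(1, 10, "B"), (2, 10, "A")]

# Interrupted after the first insert: the partial work is rolled back.
try:
    enter_grades(conn, grades, fail_after=1)
except SimulatedCrash:
    pass
rows_after_failure = count_rows(conn)

# The same transaction run to completion is committed as a whole.
enter_grades(conn, grades)
rows_after_success = count_rows(conn)

# Backup: copy the committed database into a second connection.
backup = sqlite3.connect(":memory:")
conn.backup(backup)
rows_in_backup = count_rows(backup)

print(rows_after_failure, rows_after_success, rows_in_backup)
```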
6. Providing Multiple User Interfaces
Because many types of users with varying levels of technical
knowledge use a database, a DBMS should provide a variety of
user interfaces.
These include programming language interfaces for application
programmers.
7. Representing Complex Relationships among Data
8. Enforcing Integrity Constraints
Most database applications are such that the semantics of the
data require that it satisfy certain restrictions in order to make
sense.
The simplest type of integrity constraint involves specifying a
data type for each data item.
- For example, in the STUDENT table we specified that the value
of Name must be a string of no more than 30 alphabetic
characters.
A more complex type of constraint, referential integrity,
involves specifying that a record in one file must be related to
records in other files.
- For example, in the UNIVERSITY database, we can specify that
every section record must be related to a course record.
Another type of constraint specifies uniqueness on data item
values, such as requiring that every course record have a unique
value for Course_number. This is known as a key or uniqueness
constraint.
It is the responsibility of the database designers to identify
integrity constraints during database design.
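The three kinds of constraint described above can be declared once in the schema and then enforced by the DBMS on every insert. The sketch below uses Python's sqlite3 module; the table layouts and the `rejected` helper are assumptions, and the length check covers only the 30-character limit, not the alphabetic-characters part of the rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces these only when enabled
conn.executescript("""
CREATE TABLE STUDENT (
    Student_number INTEGER PRIMARY KEY,
    Name TEXT NOT NULL CHECK (length(Name) <= 30)      -- data type and length
);
CREATE TABLE COURSE (
    Course_number TEXT NOT NULL UNIQUE,                -- key / uniqueness
    Course_name   TEXT
);
CREATE TABLE SECTION (
    Section_identifier INTEGER PRIMARY KEY,
    Course_number TEXT NOT NULL
        REFERENCES COURSE(Course_number)               -- referential integrity
);
INSERT INTO COURSE VALUES ('C100', 'Course One');
""")


def rejected(sql, params=()):
    """Return True if the DBMS refuses the statement as a constraint violation."""
    try:
        with conn:
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


name_too_long = rejected("INSERT INTO STUDENT (Name) VALUES (?)", ("x" * 31,))
orphan_section = rejected("INSERT INTO SECTION (Course_number) VALUES (?)", ("C999",))
duplicate_course = rejected("INSERT INTO COURSE VALUES (?, ?)", ("C100", "Again"))
valid_section = rejected("INSERT INTO SECTION (Course_number) VALUES (?)", ("C100",))

print(name_too_long, orphan_section, duplicate_course, valid_section)
```

Declaring the rules in the schema means every application that writes to the database is checked the same way, without repeating the checks in each program.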
9. Permitting Inferencing and Actions Using Rules
facilities
Flexibility
- It may be necessary to change the structure of a database as
requirements change.
- DBMSs allow changes to the structure of the database without
affecting the stored data and the existing application programs.
Availability of Up-to-Date Information
or banking databases
Economies of Scale
- The DBMS approach permits consolidation of data and
applications, reducing wasteful overlap between data-processing
activities in different projects or departments.
- This enables the whole organization to invest in more powerful
processors, storage devices, or communication gear, rather
than having each department purchase its own equipment.
- This reduces overall costs of operation and management.
Question Bank