Chapter 1 Introduction To Databases For Students
Introduction to Databases
1
The database course is about:
– How to organize data
– Supporting multiple users
– Efficient and effective data retrieval
– Secured and reliable storage of data
– Maintaining consistent data
– Making information useful for decision making
2
Chapter 1 Topics
• Introduction to databases
1.1. Traditional File-Based Systems
1.2. Database Approach
1.3. Characteristics of the Database Approach
1.4. Advantages of Using the DBMS Approach
1.5. Roles in the Database Environment
1.6. History of Database Management Systems
1.7. Advantages and Disadvantages of DBMSs
3
Basic Terms
– Produces output
6
Manual Approach
– Data storage and retrieval follow the traditional way of
handling information, in which cards and papers are used.
– Data is typed or written on paper and put in a file cabinet.
– Storage and retrieval are performed using
human labour.
– Works well if the number of items to be stored is
small.
7
Limitations of the Manual approach
– Prone to error
– Data loss: papers may be damaged or impossible to locate.
9
File-Based Approach
– An early attempt to computerize the manual filing system.
10
File-Based Approach
11
Limitations of the File Based approach
– Inconsistency: Logical mismatch of data between files, typically
caused by changes made in one file but not in the others.
– Redundancy: Repeated occurrences of the same data in
different files.
• Problems: Wasted storage, more management effort,
increased processing time and cost.
– Program-data dependence: Incorporating a new data type
requires changes to the program structure, and changes in file
structure lead to changes in the application program
structure.
12
Database Approach
and users.
(program–data independence).
14
• Database is a collection of logically related data where these
information.
15
Requirements of database systems:
• Minimal data redundancy: Redundancy is carefully controlled by
avoiding duplicate copies of the same data through an integrated
view of the data.
• Data independence: Application programs are independent of the
representation of data and of data storage (abstract view).
• Efficient data access: The DBMS uses a variety of techniques to store
and retrieve data efficiently.
• Concurrent data access: The system allows simultaneous access to
the same data by different users.
16
Requirements of database systems (cont.)
Integrity rules.
– The DBMS checks constraints on each insertion, change, and deletion of
data.
18
Components of a Database System
19
Data:
Issues
maintenance?
21
Software: Database Management System (DBMS)
– A database management system (DBMS) is a collection of
programs that enables users to create and maintain a
database.
– The DBMS is hence a general-purpose software system that
facilitates the processes of defining, constructing,
manipulating, and sharing databases among various users
and applications.
22
• Defining a database involves specifying the data types, structures,
and constraints for the data to be stored in the database.
• Constructing the database is the process of storing the data itself
on some storage medium that is controlled by the DBMS.
• Manipulating a database includes such functions as querying the
database to retrieve specific data, updating the database to
reflect changes in the miniworld, and generating reports from the
data.
• Sharing a database allows multiple users and programs to access
the database concurrently.
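The defining, constructing, and manipulating steps above can be sketched with Python's built-in sqlite3 module. This is only an illustration: SQLite stands in for a general-purpose DBMS, and the student table, its columns, and the sample rows are assumptions made up for the example. Sharing is not shown, because an in-memory SQLite database is private to one connection.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Defining: specify data types, structure, and constraints.
# (The student table and its columns are hypothetical.)
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        major      TEXT
    )
    """
)

# Constructing: store the data itself under the control of the DBMS.
conn.executemany(
    "INSERT INTO student (student_id, name, major) VALUES (?, ?, ?)",
    [(1, "Abebe", "CS"), (2, "Sara", "Math"), (3, "Kebede", "CS")],
)
conn.commit()

# Manipulating: update to reflect a change in the miniworld, then query.
conn.execute("UPDATE student SET major = 'Math' WHERE student_id = 3")
conn.commit()
cs_students = conn.execute(
    "SELECT name FROM student WHERE major = 'CS' ORDER BY name"
).fetchall()
print(cs_students)
```

After the update, only one student remains in the CS major, so the query returns a single row.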
23
What does a database system do?
• Manages Very Large Amounts of Data
• Supports efficient access to Very Large Amounts of Data
• Supports concurrent access to Very Large Amounts of Data
• Supports secure, atomic access to Very Large Amounts of
Data
24
Characteristics of the Database Approach
25
1. Self-Describing Nature of a Database System
• Database system contains not only the database itself but also
a complete definition or description of the database structure
and constraints.
• The definition contains information such as
– the structure of each file,
– the type and storage format of each data item, and
– various constraints on the data.
• This definition is stored in the DBMS catalog.
26
• The catalog is used by the DBMS software and also by database
users who need information about the database structure.
• A general-purpose DBMS software package is not written for a
specific database application.
– Therefore, it must refer to the catalog to know the structure
of the files in a specific database, such as the type and format
of data it will access.
– The DBMS software must work equally well with any number
of database applications—for example, a university database,
a banking database, or a company database—as long as the
database definition is stored in the catalog.
27
• In traditional file processing, data definition is typically part of
the application programs themselves. Hence, these programs
are constrained to work with only one specific database,
whose structure is declared in the application programs.
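The self-describing property can be observed directly in a small DBMS. The sketch below uses SQLite through Python's sqlite3 module, where the catalog is exposed as the sqlite_master table and the PRAGMA table_info command. The course table and its columns are hypothetical; other systems expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A hypothetical table; its definition is recorded in the catalog.
conn.execute(
    "CREATE TABLE course ("
    "course_number TEXT PRIMARY KEY, "
    "course_name TEXT NOT NULL, "
    "credit_hours INTEGER)"
)

# sqlite_master is SQLite's catalog: one row per table, index, or view.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk)
# for each column; keep only the column name and declared type.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]

print(tables)
print(columns)
```

A generic program can use queries like these to learn the structure of any database it is given, instead of having that structure declared in its own code.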
28
2. Insulation between Programs and Data, and Data Abstraction
• In DBMS, the structure of data files is stored in the DBMS
catalog separately from the access programs.
– We call this property program-data independence.
• The characteristic that allows program-data independence is
called data abstraction.
• A DBMS provides users with a conceptual representation of
data.
• This does not include many of the details of how the data is
stored or how the operations are implemented.
29
• Database users and application programs refer to the
conceptual representation of the files, and the DBMS extracts
the details of file storage from the catalog when these are
30
3. Data Independence
• One of the most important characteristics of the database
approach is data independence.
• This refers to the ability to change the structure of the
database without affecting the programs that access the data.
• This is achieved by separating the logical and physical aspects
of the database, which allows the database administrator to
make changes to the physical structure without affecting the
logical structure.
31
For example, imagine a database that stores information about
employees.
The logical structure of the database may include information
such as employee name, employee ID, and employee salary.
The physical structure of the database, on the other hand, may
include information such as the location of the data on disk and
the specific file format used to store the data. By separating
these two aspects of the database, the database administrator
can change the physical structure of the database, such as
moving the data to a new disk or changing the file format,
without affecting the programs that access the data.
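As a small illustration of this idea, the sketch below changes a physical aspect of a database (it adds an index) and shows that the program's query and its result stay the same. SQLite via Python's sqlite3 module is used as a stand-in DBMS; the employee table, the index name, and the sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Almaz", 52000.0), (2, "Dawit", 48000.0), (3, "Hana", 61000.0)],
)


def high_earners(connection):
    """The 'application program': it refers only to the logical structure."""
    return connection.execute(
        "SELECT name FROM employee WHERE salary > 50000 ORDER BY name"
    ).fetchall()


before = high_earners(conn)

# Physical change: add an index on salary. How the data is stored and
# searched may change, but the logical structure does not.
conn.execute("CREATE INDEX idx_employee_salary ON employee (salary)")

after = high_earners(conn)
print(before == after)
```

The function high_earners was not modified, yet it keeps working after the physical change.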
32
4. Support of Multiple Views of the Data
• A view may be a subset of the database or it may contain virtual
data that is derived from the database files but is not explicitly
stored.
• A multiuser DBMS whose users have a variety of distinct
applications must provide facilities for defining multiple views.
33
Example
• Assume we have data about
– course
– section
– grade_report
– prerequisite
• One user of the database may be interested only in
accessing and printing the transcript of each student.
34
• A second user may be interested only in checking that
students have taken all the prerequisites of each
course for which they register.
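The two users in this example could each be given their own view. The sketch below is a simplified assumption, not the full schema: it adds a student table, leaves out section, and invents the column names and sample rows. It uses SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, simplified base tables.
    CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT);
    CREATE TABLE prerequisite (course_number TEXT, prerequisite_number TEXT);
    CREATE TABLE grade_report (student_number INTEGER, course_number TEXT, grade TEXT);

    INSERT INTO student VALUES (17, 'Smith');
    INSERT INTO course VALUES ('CS1310', 'Intro to CS'), ('CS3320', 'Data Structures');
    INSERT INTO prerequisite VALUES ('CS3320', 'CS1310');
    INSERT INTO grade_report VALUES (17, 'CS1310', 'B');

    -- View for the first user: the transcript of each student.
    CREATE VIEW transcript AS
    SELECT s.name, c.course_name, g.grade
    FROM grade_report g
    JOIN student s ON s.student_number = g.student_number
    JOIN course c ON c.course_number = g.course_number;

    -- View for the second user: each course with its prerequisite.
    CREATE VIEW course_prerequisites AS
    SELECT c.course_name AS course, p.course_name AS prerequisite
    FROM prerequisite r
    JOIN course c ON c.course_number = r.course_number
    JOIN course p ON p.course_number = r.prerequisite_number;
    """
)

transcript_rows = conn.execute("SELECT * FROM transcript").fetchall()
prerequisite_rows = conn.execute("SELECT * FROM course_prerequisites").fetchall()
print(transcript_rows)
print(prerequisite_rows)
```

Neither view stores data of its own; each is derived from the base tables whenever it is queried.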
35
5. Data Sharing
37
6. Multiuser Transaction Processing
• A transaction is an executing program or process that includes
one or more database accesses, such as reading or updating of
database records.
• A multiuser DBMS must allow multiple users to access the
database at the same time.
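A transaction can be sketched as a unit of work that either completes fully or leaves the database unchanged. The example below uses SQLite via Python's sqlite3 module with a hypothetical account table and transfer function. It runs on a single connection, so it shows the all-or-nothing behaviour of one transaction, not true concurrent access by many users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "account_id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(connection, source, target, amount):
    """Move amount between accounts as one transaction; True on success."""
    try:
        # "with connection" commits if the block succeeds and rolls
        # back if an exception is raised inside it.
        with connection:
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE account_id = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE account_id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1
balances = conn.execute(
    "SELECT balance FROM account ORDER BY account_id"
).fetchall()
print(ok, failed, balances)
```

In the failed transfer, the credit to account 2 had already been executed when the debit was rejected; the rollback undoes it, so the balances reflect only the first transfer.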
38
7. Data Integrity
• Another important characteristic of the database approach is
data integrity.
• This refers to the accuracy and consistency of the data in the
database.
• The database approach uses a variety of techniques to ensure
data integrity, such as data validation, data constraints, and
data normalization.
39
Data validation is the process of checking the data entered into
the database to ensure that it is correct and consistent. For
example, if a program is designed to input employee information
into the database, the program may check that the employee ID
is a unique number and that the employee's salary is a number
between $0 and $1000000.
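The check described above could look roughly like the following Python function. It is a sketch under assumptions: the function name, the error messages, and the choice to treat the salary bounds as inclusive are all made up for illustration.

```python
def validate_employee(employee_id, salary, existing_ids):
    """Return a list of validation errors (empty if the input is valid)."""
    errors = []

    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(employee_id, bool) or not isinstance(employee_id, int):
        errors.append("employee ID must be an integer")
    elif employee_id in existing_ids:
        errors.append("employee ID must be unique")

    if isinstance(salary, bool) or not isinstance(salary, (int, float)):
        errors.append("salary must be a number")
    elif not 0 <= salary <= 1_000_000:  # bounds assumed to be inclusive
        errors.append("salary must be between 0 and 1000000")

    return errors


print(validate_employee(3, 50000, {1, 2}))
print(validate_employee(1, -5, {1, 2}))
```

The first call passes both checks; the second reuses an existing ID and gives a negative salary, so it reports two errors.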
40
• Data constraints are used to ensure that the data in the
database follows specific rules.
• For example, a constraint may be used to ensure that an
employee's salary is greater than $0 and less than
$1000000.
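The salary rule above can be declared once in the database so that the DBMS enforces it for every program. The sketch below uses a CHECK constraint in SQLite via Python's sqlite3 module; the employee table and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        salary      REAL NOT NULL CHECK (salary > 0 AND salary < 1000000)
    )
    """
)

# A row that satisfies the constraint is accepted.
conn.execute("INSERT INTO employee VALUES (1, 'Almaz', 52000)")

# A row that violates the constraint is rejected by the DBMS itself.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Dawit', 2000000)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected, row_count)
```

Unlike validation written inside one application, the constraint applies no matter which program tries to insert or update the row.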
41
Data normalization is the process of organizing the data in the
database to reduce data redundancy and increase data consistency.
For example, if the database stores information about employees
and departments, the data may be normalized by storing the
department information in a separate table and creating a
relationship between the employee and department tables.
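The employee and department example can be sketched as two tables linked by a foreign key. Table names, columns, and rows below are assumptions for illustration, using SQLite via Python's sqlite3 module. Because each department's details are stored once, a change to them is made in one place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE department (
        department_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        location      TEXT
    );
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES department (department_id)
    );
    INSERT INTO department VALUES (10, 'Research', 'Building A');
    INSERT INTO employee VALUES (1, 'Almaz', 10), (2, 'Dawit', 10);
    """
)

# The department moves: one row is updated, not one row per employee.
conn.execute("UPDATE department SET location = 'Building B' WHERE department_id = 10")
conn.commit()

rows = conn.execute(
    "SELECT e.name, d.location "
    "FROM employee e JOIN department d ON d.department_id = e.department_id "
    "ORDER BY e.name"
).fetchall()
print(rows)
```

Both employees see the new location, because neither employee row carries its own copy of the department data.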
42
8. Backup and Recovery
• Another important characteristic of the database approach is
the ability to back up and recover data. This is important in
case of system failures or other unexpected events that may
cause data loss. The database approach uses a variety of
techniques to ensure that data can be backed up and
recovered, such as database backups, transaction logs, and
replication.
• Database backups are copies of the entire database or specific
parts of the database that can be used to restore the data in
case of data loss.
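A full database backup and a later recovery can be sketched with the backup method of Python's sqlite3 connections. Everything here is kept in memory and the employee table is hypothetical; a real backup would be written to separate, durable storage, and transaction logs and replication are not shown.

```python
import sqlite3

# The "production" database.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT)")
source.execute("INSERT INTO employee VALUES (1, 'Almaz')")
source.commit()

# Take a backup: copy the entire database into another connection.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# Simulate data loss in the production database.
source.execute("DELETE FROM employee")
source.commit()
lost = source.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

# Recover: copy the backup into a fresh database.
recovered = sqlite3.connect(":memory:")
backup_copy.backup(recovered)
restored = recovered.execute("SELECT name FROM employee").fetchall()
print(lost, restored)
```

The recovered database contains the row as it was at backup time; any changes made after the backup would be lost unless they were also captured, for example in a transaction log.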
43
9. Scalability
• Another important characteristic of the database approach is
scalability. This refers to the ability of the database to handle a
large amount of data and a large number of users without
performance degradation.
• The database approach uses a variety of techniques to ensure
scalability, such as horizontal scaling and vertical scaling.
46
Authentication is the process of verifying the identity of a user who
is trying to access the database. This can include techniques such
as username and password, or biometric authentication.
47
Database Actors
48
Actors on the Scene
• People whose jobs involve the day-to-day use of a large
database; we call them the actors on the scene.
49
Actors On the Scene:
– System Analysts
– Database Administrator
– Application Programmers
– Database Designer
– End Users
50
Database Actors
1- System Analysts:
51
Database Actors
2- Database Designers:
• Identifying the data to be stored in the database.
• Choosing appropriate structures to represent and store this
data.
• These tasks are mostly undertaken before the database is
actually implemented and populated with data.
• Communicate with all prospective database users.
• Develop a view of the database that meets the data and
processing requirements for each group of users
52
Database Actors
3 - Application Programmers:
these transactions
53
Database Actors
4- Database Administrators:
• Authorizing access to the database.
• Coordinating and monitoring its use.
• Acquiring software and hardware resources as needed
• Accountable for problems such as poor security and poor system
performance.
54
Database Actors
5- End Users:
• Access to the database for querying, updating, and generating
reports.
5.1- Casual end users:
• Occasionally access the database.
• Need different information each time.
• Learn only a few facilities that they may use repeatedly.
• Typically middle-level or high-level managers or other
occasional browsers.
55
Workers behind the Scene
DBMS system designers and implementers
– design and implement the DBMS modules and interfaces
as a software package.
Tool developers
– design and implement tools: software packages that
facilitate database modeling and design, database system
design, and improved performance.
59
Database System Utilities
60
Advantages of Database Management System
Data Integrity
• Data integrity means data is consistent and accurate in
the database.
Data Security
• Data security is a vital concept in a database. Only authorized users must
be allowed to access the database, and their identity must be
authenticated, for example using a username and password. Unauthorized
users must not be allowed to access the database under any circumstances,
as this violates the integrity constraints.
62
Better data integration
• With a database management system, we have access to a
well-managed and synchronized form of the data, which makes it
easy to handle. It also gives an integrated view of how a particular
organization is working and keeps track of how one segment of
the company affects another segment.
63
Minimized Data Inconsistency
Data inconsistency occurs between files when various versions of
the same data appear in different places. In a database, redundancy
is controlled, so data consistency is easier to ensure. Besides,
any database changes are immediately visible to all users, which
removes this source of inconsistency.
65
• Advantages and disadvantages of DBMS
66
Advantages of DBMS
The advantages of the DBMS are explained below −
67
Has a very high security level.
Avoidance of inconsistency.
A DBMS controls data redundancy and thereby also controls data
consistency. Data consistency means that when data is updated, the
change is made once in the database rather than repeated separately
in every file that holds a copy.
69
Shared data
Enforcement of standards
• Because a DBMS has central control of the database, a DBA can ensure that all
applications follow certain standards, such as data formats and documentation
standards. These standards help in data migration and in interchanging
data.
70
Any unauthorized access is restricted
Data loss is a big problem for all organizations. In a file system, users
have to back up the files at regular intervals, which wastes time
and resources.
• Complexity
• The provision of the functionality that is expected of a good DBMS
makes the DBMS an extremely complex piece of software. Database
designers, developers, database administrators and end-users must
understand this functionality to take full advantage of it.
72
Size
The functionality of a DBMS makes it a large piece of software that
occupies megabytes of disk space.
Performance
A DBMS may not run as fast as desired.
73
Cost of DBMS
74