Chapter 1- Introduction to Database Systems
Compiled By: BM 1
Topics under discussion
Introduction
Data handling approaches
Roles in Database Design and Development
The ANSI/SPARC and Database Architecture
Types of Database Systems
Database Management System (DBMS)
Database Languages
Introduction
Data
Raw facts or figures.
Data represents facts or figures obtained from experiments, surveys, or
observations, used as a basis for making calculations or drawing
conclusions.
In and of itself, data has no meaning.
Example:- If I count the number of students attending this class, that's data.
It has no meaning until it is placed in a context.
It is like an event out of context, without a meaningful relation to other
things.
Given certain data, we can associate it with different things and give
it different meanings.
Information
Information is data that has been processed and organized.
It is the result of gathering, processing, manipulating and organizing
data in a way that adds to the knowledge of the receiver.
Example
Let's say I want to buy a car. I can collect a lot of data about
makes of cars, performance ratings, prices and so on.
Once I do that, I have a lot of information about cars and the car
market. Unless we think of this collection of data and put it in
context (car/car market), it has no meaning.
Information is data that has been given meaning by way of relational connection.
– This relational connection converts data into information.
Information is data with context. Therefore, information is context dependent.
Example: consider the following data
15 degrees, and
it is raining
The temperature dropped to 15 degrees and then it started raining.
It is the cause and effect relationship between the two that provides information.
Therefore,
Information = Data sets + understanding of relationship among data sets
What we perceive or understand is the relationship between pieces of data, or
between pieces of data and other information.
Knowledge
Knowledge is information that has been organized and processed to convey understanding, learning,
and expertise.
Information becomes knowledge when one is able to understand the patterns that exist
within information and their implication.
For Example
If I put Birr 100 in my savings account and the bank pays 5% interest per annum, then at the end of
one year the interest is Birr 5 and the balance is Birr 105.
Understanding this pattern represents knowledge, and it enables one to predict
the results it will produce.
Therefore,
– Knowledge = Information + understanding of the pattern.
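The savings example can be worked through in a few lines. This is a minimal sketch: the function name `balance_after` and the assumption that interest is added once per year are illustrative choices, not part of the original example.

```python
def balance_after(principal, annual_rate, years=1):
    """Return (interest, balance) after `years` years.

    Assumption: interest is added to the account once per year.
    """
    balance = principal * (1 + annual_rate) ** years
    interest = balance - principal
    return interest, balance


# The data: Birr 100 deposited at 5% per annum.
interest, balance = balance_after(100, 0.05)
print(f"Interest: Birr {interest:.2f}, balance: Birr {balance:.2f}")

# Knowing the pattern lets us predict a result we have not observed yet.
_, balance_two_years = balance_after(100, 0.05, years=2)
print(f"Balance after two years: Birr {balance_two_years:.2f}")
```

The numbers 100 and 0.05 are data, the one-year result is information, and the function itself captures the pattern, which is what the slide calls knowledge.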
There are two types of knowledge
Formal/explicit knowledge
Informal/implicit/tacit knowledge
Data handling approaches
Day-to-day business processes executed by individuals and
organizations require both present and historical data.
Therefore, data storage is essential for organizations and
individuals.
Data supports business functions and aids in business decision-
making.
Below are the three approaches that organizations use to store
organizational data.
A. Manual Data Handling Approach
B. Traditional File Based Data Handling Approach
C. Database Data Handling Approach
Manual Data Handling
In the manual approach, data storage and retrieval follow the traditional way of
information handling, in which cards and paper are used.
Data storage and retrieval are performed using human labor.
A separate file is kept for each event and object that the organization tracks.
Each file, containing various kinds of information, is labelled and stored
in one or more cabinets.
The cabinets may be kept in safe places for security purposes, depending on the
sensitivity of the information they contain.
Insertion and retrieval are done by searching first for the right cabinet, then for
the right file, and then for the information.
Limitations of the Manual approach
−Time-Consuming: Manual data handling can be slow, especially
with large datasets. It requires significant time and effort to
collect and process data.
−Prone to Errors: Human error is a common issue. Mistakes in
data entry or analysis can lead to inaccurate results.
−Limited Scalability: As data volume increases, manual methods
become less practical and efficient.
File based approach
After computers were introduced to the business community for data processing, the
need to use them for data storage and processing increased.
File-based systems were an early attempt to computerize the manual system.
This is also called the traditional file-based approach. It is decentralized: each
department stores and controls its own data with the help of a data
processing specialist.
A collection of application programs performs services for the end-users. In such systems,
every application program that provides a service to end-users defines and manages its own
data.
Such systems have a number of programs for each of the different applications in the
organization.
Since every application defines and manages its own data, the system is subject to
serious data duplication.
Limitations of File based data handling approach
⁃ Separation/Isolation of data:
⁃ When data is isolated in separate files, it is difficult to access data that should be
available, because there is no concept of a relationship between files.
⁃ Duplication of data (Redundancy):
⁃ The same information is stored in multiple files.
⁃ Incompatible file formats:
⁃ The structure of a file depends on the application program that creates it.
⁃ Incompatibility of files makes them difficult to process jointly.
⁃ Example: consider two files within the same enterprise but in different
departments or branches. If the first file is produced by a COBOL program
and the second by a C++ program, their structures may differ, and combining them will be difficult.
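The duplication problem can be illustrated with a small sketch. The two "files" below are plain Python lists, and the customer record, field names, and the `update_phone` helper are all hypothetical, chosen only to show how duplicated copies drift apart.

```python
# Hypothetical records: each department keeps its own copy of customer data.
sales_file = [{"customer_id": 1, "name": "Abebe", "phone": "0911-000000"}]
billing_file = [{"customer_id": 1, "name": "Abebe", "phone": "0911-000000"}]


def update_phone(records, customer_id, new_phone):
    """Update the phone number in ONE department's file only."""
    for record in records:
        if record["customer_id"] == customer_id:
            record["phone"] = new_phone


# The sales department learns of a new number; billing is never told.
update_phone(sales_file, 1, "0922-111111")

# The two copies now disagree, and nothing in the system detects it.
copies_agree = sales_file[0]["phone"] == billing_file[0]["phone"]
print("Copies agree:", copies_agree)
```

Because no relationship exists between the two files, keeping them consistent depends entirely on people remembering to update every copy.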
Database Approach
Database is a shared collection of logically related data designed to meet the
information needs of an organization. Since it is a shared corporate resource, the
database is integrated with minimum amount of duplication.
Database is a collection of logically related data where these logically related data
comprise entities, attributes, relationships, and business rules of an organization's
information.
In addition to the data required by an organization, a database also contains a
description of the data, which is called “metadata”, the “data dictionary”, the “system
catalogue”, or “data about data”.
Since a database contains information about the data (metadata), it is called a
self-describing collection of integrated records.
Database is designed once and used simultaneously by many users.
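The idea that a database describes itself can be seen by asking a DBMS for its own catalogue. The sketch below uses SQLite through Python's standard `sqlite3` module as a stand-in for any DBMS; the `student` table is a hypothetical example, and other systems expose their catalogues under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, gpa REAL)")

# sqlite_master is SQLite's catalogue table: it describes the stored objects.
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()
print(catalog)

# PRAGMA table_info returns one row per column: (cid, name, type, ...).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]
print(columns)
conn.close()
```

No student rows were inserted, yet the database can already report which tables and columns exist. That description is the metadata.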
Unlike the traditional file-based approach, the database approach provides
program-data independence, that is, the separation of the data definition
from the application.
Thus, the application is not affected by changes made to the data structure
and file organization.
Each database application performs some combination of creating,
reading, updating, and deleting data.
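Program-data independence can be sketched as follows, again using SQLite via Python's `sqlite3` module as an assumed stand-in for a DBMS. The `employee` table and the `employee_names` function are hypothetical; the point is only that the application code is not edited when the structure changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee (name) VALUES ('Sara')")


def employee_names(connection):
    """Application code: asks for data by column name, not by file layout."""
    rows = connection.execute("SELECT name FROM employee ORDER BY id")
    return [row[0] for row in rows]


before = employee_names(conn)

# The data structure changes: a new column and an index are added.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")

# The same application function runs unchanged and returns the same result.
after = employee_names(conn)
print(before, after)
conn.close()
```

In a file-based system, adding a field to a record would typically require changing every program that reads the file.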
Benefits of database approach
Data can be shared: two or more users can access and use same data instead of storing
data in redundant manner for each user.
Improved accessibility of data: by using a structured query language, users can
access data without programming experience.
Redundancy can be reduced: isolated data is integrated in the database to decrease the
redundant data stored by different applications.
Speed: data storage and retrieval are fast because they are performed by modern computer
systems.
Less labor: unlike the other data handling methods, data maintenance does not demand
much human effort.
Centralized information control: since relevant data in the organization will be stored at
one repository, it can be controlled and managed at the central level.
Etc.
Roles in DB design and development
Effective database design and development require collaboration among
different roles.
Each contributes to ensuring the database is efficient, secure, and aligned
with business needs.
There are four distinct types of people that participate in the DBMS
environment:
⁃ Database administrators
⁃ Database designers
⁃ Application developers and
⁃ End-users.
Database Administrator
In a database environment, the primary resource is the database itself and the
secondary resource is the DBMS and related software.
Administering these resources is the responsibility of the database
administrator (DBA).
The DBA is responsible for
Managing and maintaining existing databases.
Ensuring database security, backups, and recovery.
Monitoring database performance and optimizing for efficiency.
Managing user access and permissions.
Troubleshooting and resolving database issues.
Database Designer
Database designers are responsible for identifying the data to be stored in the
database and for choosing appropriate structures to represent and store this data.
It is the responsibility of database designers to communicate with all prospective
database users, in order to understand their requirements, and to come up with a
design that meets these requirements.
The responsibilities of a database designer also include:
Designing, programming, and implementing new databases.
Creating database schemas, tables, indexes, and stored procedures.
Optimizing database performance and ensuring data integrity.
Collaborating with application developers and stakeholders to understand data
needs.
Troubleshooting and resolving database issues.
Application Developers
Once the database has been implemented, the application programs that
provide the required functionality for the end-users must be implemented.
This is the responsibility of the application developers.
Typically, the application developers work from a specification produced by
systems analysts.
Each program contains statements that request the DBMS to perform some
operation on the database. This includes retrieving data, inserting, updating,
and deleting data.
The programs may be written in a third-generation programming language
or a fourth-generation language.
End Users
The end-users are the ‘clients’ for the database, which has been designed and implemented,
and is being maintained to serve their information needs.
End-users can be classified according to the way they use the system:-
Naïve Users
They interact with the database through existing applications, without needing to
understand the underlying database structure or SQL.
These users typically have little or no technical background in database
management.
Example
Bank tellers check account balances and post withdrawals and deposits.
Reservation clerks for airlines, hotels, and car rental companies check availability for a
given request and make reservations.
Healthcare professionals use electronic health record (EHR) systems to enter patient
data or retrieve records, unaware of the underlying database management processes.
Sophisticated Users
Sophisticated users have a good understanding of database concepts and SQL.
They query the database directly, using SQL or other tools, to retrieve, insert,
update, or delete data.
They thoroughly familiarize themselves with the facilities of the DBMS in
order to implement their application to meet their complex requirements.
Examples include engineers, scientists, business analysts, or database
developers.
The ANSI/SPARC and Database Architecture
Types of DBMS
Database management systems can be classified based on several criteria,
such as
1. Classification Based on Data Model
Relational DBMS
Object oriented DBMS
Etc.
2. Classification Based on Number of Users Supported
Single user DBMS
Multi user DBMS
3. Classification Based on number of sites over which the database is distributed
Centralized DBMS
Distributed DBMS
Relational DBMS
In this model, the data is organized into a collection of two-dimensional inter-
related tables, also known as relations.
Each relation is a collection of columns and rows, where the columns represent the
attributes of an entity and the rows (or tuples) represent the records.
A relational database uses SQL for storing, manipulating, as well as maintaining
the data.
Well-known DBMSs like Oracle, MS SQL Server, DB2 and MySQL support this
model.
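The terms relation, attribute, and tuple map directly onto what a relational DBMS returns. The sketch below assumes SQLite (through Python's `sqlite3` module) as the relational DBMS; the `student` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO student (student_id, name, department) VALUES (?, ?, ?)",
    [(1, "Hana", "Computer Science"), (2, "Dawit", "Statistics")],
)

cursor = conn.execute(
    "SELECT student_id, name, department FROM student ORDER BY student_id"
)
# cursor.description holds one entry per column; element 0 is the column name.
attributes = [description[0] for description in cursor.description]
tuples = cursor.fetchall()  # one Python tuple per row of the relation

print(attributes)
for row in tuples:
    print(row)
conn.close()
```

Each column name is an attribute of the student entity, and each fetched row is one tuple (record) of the relation.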
Single-user DBMSs
As the name indicates, a single-user DBMS supports only one user at a time.
It is mostly used on a personal computer, where the data is
accessible to a single person.
The user may design, maintain and write the database programs.
Multiuser DBMSs
Multiuser DBMSs, which include the majority of DBMSs, support multiple users concurrently.
Data can be both integrated and shared. A database is integrated when
the same information need not be recorded in two places.
Multiuser DBMSs need concurrency control and deadlock management techniques.
Classification of DBMS based on Number of Sites
Centralized Database
In a centralized database there is a single database file at one location in the network.
Multiple users can access this single database via a computer network (LAN, WAN,
etc.)
This type of database is mainly used by institutions or organizations.
Distributed Database
A distributed database consists of multiple
databases that are connected with each other and spread across different physical
locations.
Communication between databases at different physical locations is done over a
computer network.
Types of DDB system
Distributed database systems are classified into two types. These are,
1. Homogeneous distributed database system
– In a homogeneous distributed database system, all sites have identical database
management system software, are aware of one another, and agree to cooperate in
processing users’ requests.
2. Heterogeneous distributed database system
– In a heterogeneous distributed database, different sites may use different schemas,
and different database-management system software.
– The sites may not be aware of one another, and they may provide only limited
facilities for cooperation in transaction processing.
– The differences in schemas are often a major problem for query processing, while
the divergence in software becomes a hindrance for processing transactions that
access multiple sites.
Database Management System (DBMS)
A DBMS is a software system that enables users to define, create, maintain, and control
access to the database.
Typically, a DBMS provides the following facilities:
It allows users to define the database, usually through a Data Definition Language
(DDL).
It allows users to insert, update, delete, and retrieve data from the database, usually
through a Data Manipulation Language (DML).
It provides controlled access to the database. For example, it may provide
‐ a security system, which prevents unauthorized users from accessing the database;
‐ a concurrency control system, which allows shared access to the database;
‐ a recovery control system, which restores the database to a previous consistent state
following a hardware or software failure.
Components of the DBMS Environment
The DBMS environment can be divided into five major components:
⁃ Hardware
⁃ Software
⁃ Data
⁃ Procedure and
⁃ people
These components are illustrated in the following figure.
Hardware Component
The hardware is the actual computer system used for maintaining and accessing the
database.
Hardware components refer to all of the system's physical devices.
For example:- computers (PCs, workstations, servers and supercomputers), storage
devices (hard disks, RAM, ROM, ...), networking devices (switches, hubs, routers, ...),
and other devices (input and output devices).
One can’t implement or use DBMS without using Hardware components. It can range
from a single personal computer, to a single mainframe, to a network of computers.
The particular hardware depends on the organization’s requirements and the DBMS used.
Some DBMSs run only on particular hardware or operating systems, while others run on
a wide variety of hardware and operating systems.
When we run any DBMS like Oracle, MySQL, etc. on our PC, then computer parts like
mouse, keyboard, RAM, ROM, hard disks all become part of DBMS hardware
components.
Software component
Software is a set of instructions that is used to instruct the computer hardware
for the operation of the computers.
The software provides an easy-to-use interface for users to control the
hardware and to create, store, access and/or update data in the database.
All requests made by users for database management are handled and
processed by the DBMS software.
The software component of a DBMS comprises
– DBMS software:- Microsoft SQL Server, Oracle, MySQL, etc.
– Operating System:- Microsoft Windows, Linux, UNIX, etc.
– Application Programs and Utility Programs
– Network Software if the DBMS is being used over a network.
Data
Data is the resource for which the DBMS was designed. The motive behind
the creation of the DBMS was to store and utilize data.
The database contains both the operational data and the metadata.
Metadata is data about the data. It is information stored by the DBMS to
describe the data stored in it.
For example:- when we store specific data (let us say, a person's name) in
the database, the DBMS also stores additional information such as when
and where the data was stored, the size of the data, whether the data is
related to or dependent on other data, the data type, etc. All this additional information about
the actual data (i.e. the person's name) is collectively called metadata.
Procedure
The procedure is a type of general instruction or guidelines for the use of
DBMS.
These instructions include
‐ how to set up the database,
‐ how to install the database,
‐ how to log in and log out of the database,
‐ how to manage the database,
‐ how to take a backup of the database, and
‐ how to generate the report of the database
The basic purpose of procedures is to guide users through the operation and
management of database systems.
People/ User
People refers to every person who designs or accesses the database and performs any
operation, such as creating, deleting, accessing or modifying data in the database, with the DBMS.
The user (people) of the DBMS can be classified into the following types;
‐ Database designer:- database designers are responsible for identifying the data to be
stored in the database and for choosing appropriate structures to represent and store
this data.
‐ Database administrator:- responsible to oversee, control and manage the database
resources (the database itself, the DBMS and other related software).
‐ Application developer:- implements the application programs through which
data is retrieved, inserted, updated and deleted in the database.
‐ End user:-any person who directly interacts with a DBMS and performs various
database-related operations like inserting, modifying, retrieving or deleting data using
database commands or applications
Functions of DBMS
A Database Management System (DBMS) serves several critical functions that are essential
for managing and organizing data efficiently.
Here are the key functions of a DBMS:
Data Storage, Retrieval, and Update: It provides a systematic way to store, retrieve,
and update data easily.
Data Manipulation: Uses languages like SQL (Structured Query Language) for data
querying, insertion, updating, and deletion.
Data Security: Protects data from unauthorized access through user authentication and
permissions.
Multi-User Support: Allows multiple users to access and manipulate the database
simultaneously while maintaining data integrity.
Data Abstraction and Independence: Allows users to interact with data without needing
to understand the underlying complexities of how the data is stored.
Etc.
Database Languages
Database languages are specialized languages used to interact with a
database.
They allow users to perform different tasks such as defining, controlling, and
manipulating the data.
There are several types of database languages in DBMS, categorized into the
following four main types:
DDL (Data Definition Language)
DCL (Data Control Language)
DML (Data Manipulation Language)
TCL (Transaction Control Language)
DDL
DDL is used to define and modify the structure of the database itself,
including the tables, views, indexes, and other schema-related objects.
It deals with the creation and modification of database schema, but it doesn't
deal with the data itself.
Following are the main DDL commands in SQL:
CREATE: Used to create database objects like tables, indexes, or views.
ALTER: Used to modify the structure of an existing database object, such as adding a
new column to a table.
DROP: Used to delete database objects.
TRUNCATE: Used to remove all rows from a table, without affecting the structure.
RENAME: Used to change the name of a database object.
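The DDL commands above can be tried out with a short sketch. It assumes SQLite through Python's `sqlite3` module, and the `course` table and the `table_names` helper are hypothetical. Two SQLite-specific points: renaming is written as `ALTER TABLE ... RENAME TO`, and SQLite has no `TRUNCATE` statement, so that command is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names(connection):
    """List the tables recorded in SQLite's catalogue."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


# CREATE: define a new table (structure only, no data yet).
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")

# ALTER: add a new column to the existing table.
conn.execute("ALTER TABLE course ADD COLUMN credit_hours INTEGER")

# RENAME: in SQLite this is a form of ALTER TABLE.
conn.execute("ALTER TABLE course RENAME TO module")
after_rename = table_names(conn)
column_names = [row[1] for row in conn.execute("PRAGMA table_info(module)")]

# DROP: remove the table definition (and any data in it).
conn.execute("DROP TABLE module")
after_drop = table_names(conn)

print(after_rename, column_names, after_drop)
conn.close()
```

None of these statements touches individual rows; each one changes the schema, which is what separates DDL from DML.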
DML
The DML (Data Manipulation Language) is used to manage and manipulate
data within a database.
With DML, you can perform various operations such as inserting, updating,
selecting, and deleting data.
These operations allow you to work with the actual content in your database
tables.
Here are the key DML commands:
SELECT: Retrieves data from the table based on specific criteria.
INSERT: Adds new rows of data into an existing table.
UPDATE: Modifies existing data in a table.
DELETE: Removes data from a table.
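The four DML commands can be sketched against a small table. The example assumes SQLite via Python's `sqlite3` module; the `account` table, the owner names, and the amounts are hypothetical. The `?` placeholders pass values separately from the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance REAL)")

# INSERT: add new rows to the existing table.
conn.executemany(
    "INSERT INTO account (owner, balance) VALUES (?, ?)",
    [("Meron", 100.0), ("Yonas", 250.0)],
)

# UPDATE: modify existing data (deposit 50 into Meron's account).
conn.execute("UPDATE account SET balance = balance + ? WHERE owner = ?", (50.0, "Meron"))

# DELETE: remove the rows that match a condition.
conn.execute("DELETE FROM account WHERE owner = ?", ("Yonas",))

# SELECT: retrieve the rows that satisfy the criteria.
remaining = conn.execute(
    "SELECT owner, balance FROM account WHERE balance >= ?", (100.0,)
).fetchall()
print(remaining)
conn.close()
```

Unlike DDL, every statement here leaves the table structure untouched and changes or reads only the rows.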
DCL
DCL is used to control the access permissions of users to the database.
DCL commands help grant or revoke privileges to users, determining who
can perform actions like reading or modifying data.
The two main DCL commands are:
GRANT: Gives a user access privileges on the database.
REVOKE: Removes access privileges from a user.
TCL
TCL commands are used to manage and control transactions in a
database, grouping statements into logical units of work.
TCL is used to make the changes made by DML statements permanent or to undo them.
TCL commands include:
COMMIT: saves the changes made by the transaction to the database.
ROLLBACK: restores the database to its state at the last commit.
SAVEPOINT: marks a point within a transaction so
that you can roll back to that point whenever necessary.
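The interplay of the three TCL commands can be sketched as follows, assuming SQLite via Python's `sqlite3` module. The `account` table, the savepoint name `after_withdrawal`, and the amounts are hypothetical. Passing `isolation_level=None` is a choice made here so that every transaction statement is issued explicitly.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transaction handling,
# so BEGIN, SAVEPOINT, ROLLBACK and COMMIT are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (owner TEXT PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES ('Meron', 100.0)")

# Transaction 1: keep the first update, undo the second, then commit.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 30 WHERE owner = 'Meron'")
conn.execute("SAVEPOINT after_withdrawal")
conn.execute("UPDATE account SET balance = balance - 500 WHERE owner = 'Meron'")
conn.execute("ROLLBACK TO SAVEPOINT after_withdrawal")  # undo only the -500
conn.execute("COMMIT")  # make the -30 permanent
committed = conn.execute(
    "SELECT balance FROM account WHERE owner = 'Meron'"
).fetchone()[0]

# Transaction 2: a full ROLLBACK returns to the state at the last commit.
conn.execute("BEGIN")
conn.execute("DELETE FROM account")
conn.execute("ROLLBACK")
after_rollback = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]

print(committed, after_rollback)
conn.close()
```

The savepoint undoes part of a transaction without abandoning all of it, while a plain `ROLLBACK` discards everything since the last commit.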
End of Chapter One