Chapter 1- Introduction to Database Systems

This document provides an introduction to database systems, covering key concepts such as data, information, knowledge, and various data handling approaches. It discusses the roles involved in database design and development, including database administrators, designers, application developers, and end-users. Additionally, it outlines the ANSI/SPARC architecture, which defines a three-level structure for database management to ensure user-specific views while maintaining data integrity.

Chapter 1

Introduction To Database Systems

Compiled By: BM 1
Topics under discussion
 Introduction
 Data handling approaches
 Roles in Database Design and Development
 The ANSI/SPARC and Database Architecture
 Types of Database Systems
 Database Management System (DBMS)
 Database Languages

Introduction
 Data
 Raw facts or figures.
 Data represents facts or figures obtained from experiments, surveys, or
observations, used as a basis for making calculations or drawing
conclusions.
 In and of itself, data has no meaning.
 Example:- If I count the number of students attending this class, that's data.
 It has no meaning until it is placed in a context.
 It is like an event out of context, without a meaningful relation to other
things.
 Given a piece of data, we can associate it with different things and give
it different meanings.
Cont’d
 Information
 Information is data that has been processed and organized.
 It is the result of gathering, processing, manipulating and organizing
data in a way that adds to the knowledge of the receiver.
 Example
 Let's say I want to buy a car. I can collect a lot of data about
makes of cars, performance ratings, prices and so on.
 Once I do that, I have a lot of information about cars and the car
market. The collected data only has meaning once it is put in
that context (cars and the car market).

Cont’d
 Information is data that has been given a meaning by way of relational connection.
– This relational connection converts data into information.
 Information is data with context. Therefore, information is context dependent.
 Example: consider the following data
 15 degrees, and
 it is raining
 The temperature dropped to 15 degrees and then it started raining.
 It is the cause and effect relationship between the two that provides information.
 Therefore,
 Information = Data sets + understanding of relationship among data sets
 What we perceive or understand is the relationship between pieces of data, or
between pieces of data and other information.

Cont’d
 Knowledge
 Knowledge is information that has been organized and processed to convey
understanding, learning, and expertise.
 Information becomes knowledge when one is able to understand the patterns that exist
within information and their implication.
 For Example
 I put Birr 100 in my savings account and the bank pays 5% interest per annum. At the end of
one year, interest = Birr 5 and balance = Birr 105 (principal of Birr 100 plus interest).
 Understanding this pattern represents knowledge, and it enables us to predict
the results it will produce.
 Therefore
– Knowledge = Information + understanding of the pattern.
 There are two types of knowledge
 Formal /Explicit
 Informal /implicit/tacit knowledge
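The savings-account pattern described earlier on this slide can be written down as a small formula. The sketch below is only an illustration: it assumes the interest is compounded once per year, which the slide does not state, and the function name `balance_after` is hypothetical.

```python
def balance_after(principal: float, annual_rate: float, years: int) -> float:
    """Balance after `years`, assuming interest is compounded once per year."""
    return principal * (1 + annual_rate) ** years


first_year = balance_after(100, 0.05, 1)
print(round(first_year, 2))        # balance at the end of year one
print(round(first_year - 100, 2))  # interest earned in year one
```

Knowing the pattern, rather than a single figure, is what lets us work out the balance for any other principal, rate, or number of years.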
Data handling approaches
 Day-to-day business processes executed by individuals and
organizations require both present and historical data.
 Therefore, data storage is essential for organizations and
individuals.
 Data supports business functions and aids in business decision-
making.
 Below are the three approaches that organizations use to store
organizational Data.
A. Manual Data Handling Approach
B. Traditional File Based Data Handling Approach
C. Database Data Handling Approach
Manual Data Handling
 In the manual approach, data storage and retrieval follows the primitive and
traditional way of information handling where cards and paper are used for
the purpose.
 The data storage and retrieval is performed using human labor.
 The organization keeps a file for each of its events and objects in order to store
information.
 Each of the files containing various kinds of information is labelled and stored
in one or more cabinets.
 The cabinets could be kept in safe places for security purposes, based on the
sensitivity of the information they contain.
 Insertion and retrieval are done by searching first for the right cabinet, then for
the right file, and then for the information.
Manual…
 Limitations of the Manual approach
−Time-Consuming: Manual data handling can be slow, especially
with large datasets. It requires significant time and effort to
collect and process data.
−Prone to Errors: Human error is a common issue. Mistakes in
data entry or analysis can lead to inaccurate results.
−Limited Scalability: As data volume increases, manual methods
become less practical and efficient.

File based approach
 After computers were introduced to the business community for data processing, the
need to use them for data storage and processing increased.
 File-based systems were an early attempt to computerize the manual system.
 This is also called the traditional file-based approach: a decentralized approach in which
each department stored and controlled its own data with the help of a data
processing specialist.
 A collection of application programs performs services for the end-users. In such systems,
every application program that provides a service to end-users defines and manages its own
data.
 Such systems have a number of programs for each of the different applications in the
organization.
 Since every application defines and manages its own data, the system is subject to
serious data duplication problems.

File…
 Limitations of File based data handling approach
⁃ Separation/Isolation of data:
⁃ When data is isolated in separate files, it is difficult to access data that should be
available, because there is no concept of a relationship between files.
⁃ Duplication of data (Redundancy):
⁃ The same information is stored in multiple files.
⁃ Incompatible file formats
⁃ The structure of a file depends on the application program that creates it.
⁃ Incompatible files are difficult to process jointly.
⁃ Example: consider two files within the same enterprise but in different
departments or branches. If the first file is produced by a COBOL program
and the second by a C++ program, their structures may differ, which makes the files hard to combine.
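The duplication problem can be shown with a minimal sketch. The two "files", their column names, and the employee record are all hypothetical; they stand in for files kept separately by two departments.

```python
import csv
import io

# Hypothetical files kept separately by two departments.
payroll_file = "emp_id,name,address\n1,Abebe,Addis Ababa\n"
personnel_file = "emp_id,name,address\n1,Abebe,Addis Ababa\n"


def load(text):
    """Parse CSV text into a dict keyed by employee id."""
    return {row["emp_id"]: row for row in csv.DictReader(io.StringIO(text))}


payroll = load(payroll_file)
personnel = load(personnel_file)

# The employee moves; only the personnel department updates its copy.
personnel["1"]["address"] = "Adama"

# The two copies now disagree, and nothing in either file signals it.
inconsistent = [
    emp_id for emp_id in payroll
    if payroll[emp_id]["address"] != personnel[emp_id]["address"]
]
print(inconsistent)
```

Because each application owns its own copy, an update in one place silently leaves the other copy out of date.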

Database Approach
 A database is a shared collection of logically related data designed to meet the
information needs of an organization. Since it is a shared corporate resource, the
database is integrated with a minimum amount of duplication.
 A database is a collection of logically related data, where these logically related data
comprise the entities, attributes, relationships, and business rules of an organization's
information.
 In addition to the data required by an organization, a database also contains a
description of that data, known as “Metadata”, the “Data Dictionary”, the “System
Catalogue”, or “Data about Data”.
 Since a database contains information about the data (metadata), it is called a
self-describing collection of integrated records.
 A database is designed once and used simultaneously by many users.
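As a small illustration of a self-describing database, the sketch below asks a database for its own metadata. It assumes SQLite through Python's built-in `sqlite3` module, a choice made only for this example; the `student` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# sqlite_master is SQLite's catalogue table: it stores the schema itself.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(tables)
print(columns)
```

Other DBMSs expose their catalogue differently, but the idea is the same: the description of the data is stored alongside the data.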

Cont’d
 Unlike the traditional file-based approach, the database approach provides
program-data independence, that is, the separation of the data definition
from the application.
 Thus, the application is not affected by changes made to the data structure
and file organization.
 Each database application performs some combination of creating,
reading, updating, and deleting data.
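Program-data independence can be sketched as follows. The example assumes SQLite through Python's `sqlite3` module, and the `employee` table and `employee_names` function are hypothetical. The application function keeps working after the table definition changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee (name) VALUES ('Hana')")


def employee_names(connection):
    """Application code: names its columns and knows nothing about storage."""
    return [row[0] for row in connection.execute(
        "SELECT name FROM employee ORDER BY id")]


before = employee_names(conn)

# The data definition changes: a new column is added to the table.
conn.execute("ALTER TABLE employee ADD COLUMN salary REAL")

after = employee_names(conn)
print(before == after)
```

In a file-based system the record layout is embedded in each program, so a change like this would typically require every program that reads the file to be modified.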

Benefits of database approach
 Data can be shared: two or more users can access and use same data instead of storing
data in redundant manner for each user.
 Improved accessibility of data: by using structured query languages, the users can easily
access data without programming experience.
 Redundancy can be reduced: isolated data is integrated in database to decrease the
redundant data stored at different applications.
 Speed: data storage and retrieval are fast because they are carried out by
computer systems.
 Less labor: unlike the other data handling methods, data maintenance does not demand
many resources.
 Centralized information control: since relevant data in the organization will be stored at
one repository, it can be controlled and managed at the central level.
 Etc.
Roles in DB design and development
 Effective database design and development require collaboration among
different roles.
 Each contributes to ensuring the database is efficient, secure, and aligned
with business needs.
 There are four distinct types of people that participate in the DBMS
environment:
⁃ Database administrators
⁃ Database designers
⁃ Application developers and
⁃ End-users.

Database Administrator
 In a database environment, the primary resource is the database itself and the
secondary resource is the DBMS and related software.
 Administering these resources is the responsibility of the database
administrator (DBA).
 The DBA is responsible for
 Managing and maintaining existing databases.
 Ensuring database security, backups, and recovery.
 Monitoring database performance and optimizing for efficiency.
 Managing user access and permissions.
 Troubleshooting and resolving database issues

Database Designer
 Database designers are responsible for identifying the data to be stored in the
database and for choosing appropriate structures to represent and store this data.
 It is the responsibility of database designers to communicate with all prospective
database users, in order to understand their requirements, and to come up with a
design that meets these requirements.
 The responsibilities of a database designer also include:
 Designing, programming, and implementing new databases.
 Creating database schemas, tables, indexes, and stored procedures.
 Optimizing database performance and ensuring data integrity.
 Collaborating with application developers and stakeholders to understand data
needs.
 Troubleshooting and resolving database issues.
Application Developers
 Once the database has been implemented, the application programs that
provide the required functionality for the end-users must be implemented.
This is the responsibility of the application developers.
 Typically, the application developers work from a specification produced by
systems analysts.
 Each program contains statements that request the DBMS to perform some
operation on the database. This includes retrieving data, inserting, updating,
and deleting data.
 The programs may be written in a third-generation programming language
or a fourth-generation language

End Users
 The end-users are the ‘clients’ for the database, which has been designed and implemented,
and is being maintained to serve their information needs.
 End-users can be classified according to the way they use the system:-
 Naïve Users
 They interact with the database through existing applications, without needing to
understand the underlying database structure or SQL.
 These users typically have little or no technical background in database
management.
 Example
 Bank tellers check account balances and post withdrawals and deposits.
 Reservation clerks for airlines, hotels, and car rental companies check availability for a
given request and make reservations.
 Healthcare professionals using electronic health record (EHR) systems to enter patient
data or retrieve records while unaware of the database management processes.
Cont’d
 Sophisticated Users
 Sophisticated users have a good understanding of database concepts and SQL.
 They can query the database directly, using SQL or other tools, to retrieve, insert, update, or delete data.
 They thoroughly familiarize themselves with the facilities of the DBMS in
order to implement their application to meet their complex requirements.
 Examples include engineers, scientists, business analysts, or database
developers.

The ANSI/SPARC and Database Architecture

 ANSI/SPARC is an abstraction design standard for a database management
system.
 This architecture was first proposed in 1975 by ANSI/SPARC, which stands for the
American National Standards Institute, Standards Planning and
Requirements Committee. It is composed of three levels:
 External level
 Conceptual level
 Internal level (includes physical data storage)
 The three-level architecture aims to let users access the same
data, each with a personalized view of it. In other words, the three-level
architecture defines one database with multiple views.
Cont’d
 The objective of the three-level architecture is to separate each user’s view of the
database from the way the database is physically represented.
 There are several reasons why this separation is desirable:
 Each user should be able to access the same data, but have a different customized
view of the data.
 Users should not have to deal directly with physical database storage details, such as
indexing or hashing. In other words, a user’s interaction with the database should be
independent of storage considerations.
 The Database Administrator (DBA) should be able to change the database storage
structures without affecting the users’ views.
 The internal structure of the database should be unaffected by changes to the physical
aspects of storage, such as the changeover to a new storage device.
 The DBA should be able to change the conceptual structure of the database without
affecting all users.
Cont’d

Figure: The ANSI-SPARC three-level architecture.


Cont’d
 The purpose of the external/conceptual
and the conceptual/internal mappings
 External Level: Users' view of the
database. Describes that part of
database that is relevant to a particular
user. Different users have their own
customized view of the database
independent of other users.
 Conceptual Level: Community view
of the database. Describes what data is
stored in database and relationships
among the data.
 Internal Level: Physical representation
of the database on the computer.
Describes how the data is stored in the
database.

Figure: Difference Between External, Conceptual and Internal Level


External Level
 It is the highest level of abstraction that deals with the user’s view of the database and
thus, is also known as view level.
 The external level consists of a number of different external views of the database.
Each user has a view of the ‘real world’ represented in a form that is familiar for that
user.
 The external view includes only those entities, attributes, and relationships in the ‘real
world’ that the user is interested in. Other entities, attributes, or relationships that are
not of interest may be represented in the database, but the user will be unaware of
them.
 In addition, different views may have different representations of the same data. For
example, one user may view dates in the form (day, month, year), while another may
view dates as (year, month, day).
 Some views might include derived or calculated data: data not actually stored in the
database as such, but created when needed.
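External views can be sketched with SQL views. The example assumes SQLite through Python's `sqlite3` module; the `staff` table, the two view names, and the sample row are hypothetical. One view hides a column and reformats a date, the other exposes derived data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: one base table shared by everyone (hypothetical schema).
conn.execute("""CREATE TABLE staff (
    id INTEGER PRIMARY KEY, name TEXT, salary REAL, birth_date TEXT)""")
conn.execute("INSERT INTO staff VALUES (1, 'Sara', 12000.0, '1990-04-23')")

# External view 1: hides salary and shows the date as day/month/year.
conn.execute("""CREATE VIEW staff_directory AS
    SELECT name, strftime('%d/%m/%Y', birth_date) AS birth_date FROM staff""")

# External view 2: includes derived data (annual salary is computed, not stored).
conn.execute("""CREATE VIEW payroll AS
    SELECT name, salary * 12 AS annual_salary FROM staff""")

directory_row = conn.execute("SELECT * FROM staff_directory").fetchone()
payroll_row = conn.execute("SELECT * FROM payroll").fetchone()
print(directory_row)
print(payroll_row)
```

Both views draw on the same stored row, so each user sees a tailored representation without the data being duplicated.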
Conceptual Level
 The middle level in the three-level architecture is the conceptual level. This level contains
the logical structure of the entire database as seen by the DBA.
 It is a complete view of the data requirements of the organization that is independent of
any storage considerations.
 The conceptual level represents: all entities, their attributes, and their relationships; the
constraints on the data; semantic information about the data; security and integrity
information.
 The conceptual level supports each external view, in that any data available to a user must
be contained in, or derivable from, the conceptual level.
 However, this level must not contain any storage-dependent details.
 For instance, the description of an entity should contain only data types of attributes
(for example, integer, real, character) and their length (such as the maximum number
of digits or characters), but not any storage considerations, such as the number of
bytes occupied.
Internal Level
 It is the lowest level of data abstraction that deals with the physical
representation of the database on the computer.
 The internal level covers the physical implementation of the database to
achieve optimal runtime performance and storage space utilization.
 It covers the data structures and file organizations used to store data on storage
devices.
 It interfaces with the operating system access methods (file management
techniques for storing and retrieving data records) to place the data on the
storage devices, build the indexes, retrieve the data, and so on.
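A change at the internal level should not alter what users see. The sketch below assumes SQLite through Python's `sqlite3` module; the `book` table and the index name are hypothetical. It adds an index and checks that the query result is the same before and after.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.executemany("INSERT INTO book (title, year) VALUES (?, ?)",
                 [("A", 1999), ("B", 2005), ("C", 2005)])

query = "SELECT title FROM book WHERE year = 2005 ORDER BY title"
before = conn.execute(query).fetchall()

# Internal-level change: add an index on the column used for searching.
conn.execute("CREATE INDEX idx_book_year ON book (year)")

after = conn.execute(query).fetchall()
# EXPLAIN QUERY PLAN reports how SQLite intends to run the query.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

print(before == after)
print(plan)
```

Only the access path may change; the query text and its result stay the same, which is the separation the three-level architecture aims for.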

Types of DBMS
 Database management systems can be classified based on several criteria,
such as
1. Classification Based on Data Model
 Relational DBMS
 Object oriented DBMS
 Etc.
2. Classification Based on Number of Users Supported
 Single user DBMS
 Multi user DBMS
3. Classification Based on number of sites over which the database is distributed
 Centralized DBMS
 Distributed DBMS
Relational DBMS
 In this model, the data is organized into a collection of two-dimensional
inter-related tables, also known as relations.
 Each relation is a collection of columns and rows, where the columns represent the
attributes of an entity and the rows (or tuples) represent the records.
 A relational database uses SQL for storing, manipulating, as well as maintaining
the data.
 Well-known DBMSs like Oracle, MS SQL Server, DB2 and MySQL support this
model.
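A minimal sketch of two inter-related tables follows. It assumes SQLite through Python's `sqlite3` module; the `department` and `student` tables and their rows are hypothetical. The `dept_id` column in `student` is what relates one table to the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("""CREATE TABLE student (
    student_id INTEGER PRIMARY KEY, name TEXT,
    dept_id INTEGER REFERENCES department(dept_id))""")
conn.execute("INSERT INTO department VALUES (10, 'Computer Science')")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Kebede", 10), (2, "Marta", 10)])

# Each row is a tuple (record); each column is an attribute.
rows = conn.execute("""
    SELECT s.name, d.dept_name
    FROM student AS s JOIN department AS d ON s.dept_id = d.dept_id
    ORDER BY s.student_id""").fetchall()
print(rows)
```

The department name is stored once and reached through the join, rather than being repeated in every student row.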

Figure: Example of relational data model


Object-oriented DB

 This model is a database management system in which information is
represented in the form of objects as used in object-oriented
programming.
 Object-oriented databases are different from relational databases, which
are table-oriented.
 Object-oriented database management systems (OODBMS) combine
database capabilities with object-oriented programming language
capabilities.
 Examples of object-oriented DBMSs include db4o, ObjectDB, and Versant
Object Database.
Classification of DBMS based on Number of users

 Single-user DBMSs
 As the name itself indicates it can support only one user at a time.
 It is mostly used on a personal computer, where the data is
accessible to a single person.
 The user may design, maintain and write the database programs.
 Multiuser DBMSs
 These include the majority of DBMSs and support multiple users concurrently.
 Data can be both integrated and shared. A database is integrated when
the same information need not be recorded in two places.
 They need concurrency control and deadlock management techniques.
Classification of DBMS based on Number of Sites

 Centralized Database
 In a centralized database there is a single database file at one location in the network.
 Multiple users can access this single database via a computer network (LAN, WAN,
etc.)
 This type of database is mainly used by institutions or organizations.
Classification of DBMS based on Number of Sites
 Distributed Database
 A distributed database consists of multiple
databases that are connected with each other and spread across different physical
locations.
 Communication between the databases at the different physical locations is carried out over a
computer network.
Types of DDB system
 Distributed database systems are classified into two types. These are,
1. Homogeneous distributed database system
– In a homogeneous distributed database system, all sites have identical database
management system software, are aware of one another, and agree to cooperate in
processing users’ requests.
2. Heterogeneous distributed database system
– In a heterogeneous distributed database, different sites may use different schemas
and different database management system software.
– The sites may not be aware of one another, and they may provide only limited
facilities for cooperation in transaction processing.
– The differences in schemas are often a major problem for query processing, while
the divergence in software becomes a hindrance for processing transactions that
access multiple sites.
Database Management System (DBMS)
 A DBMS is a software system that enables users to define, create, maintain, and control
access to the database.
 Typically, a DBMS provides the following facilities:
 It allows users to define the database, usually through a Data Definition Language
(DDL).
 It allows users to insert, update, delete, and retrieve data from the database, usually
through a Data Manipulation Language (DML).
 It provides controlled access to the database. For example, it may provide
‐ a security system, which prevents unauthorized users from accessing the database;
‐ a concurrency control system, which allows shared access to the database;
‐ a recovery control system, which restores the database to a previous consistent state
following a hardware or software failure.

Components of the DBMS Environment
 The DBMS environment can be divided into five major components:
⁃ Hardware
⁃ Software
⁃ Data
⁃ Procedures and
⁃ People
 These components are illustrated in the following figure.

Figure: Components of DBMS Environment

Hardware Component
 The hardware is the actual computer system used for maintaining and accessing the
database.
 Hardware components refer to all of the system's physical devices.
 For example: computers (PCs, workstations, servers and supercomputers), storage
devices (hard disks, RAM, ROM, ...), networking devices (switches, hubs, routers, ...),
and other devices (input and output devices).
 One cannot implement or use a DBMS without hardware. It can range
from a single personal computer, to a single mainframe, to a network of computers.
 The particular hardware depends on the organization’s requirements and the DBMS used.
Some DBMSs run only on particular hardware or operating systems, while others run on
a wide variety of hardware and operating systems.
 When we run a DBMS such as Oracle or MySQL on our PC, computer parts such as the
mouse, keyboard, RAM, ROM, and hard disks all become part of the DBMS hardware
components.
Software component
 Software is a set of instructions that tells the computer hardware
what operations to perform.
 The software provides an easy-to-use interface for users to control the
hardware and to create, store, access and update data in the database.
 All requests made by users for database management are handled and
processed by the DBMS software.
 The software component of the DBMS environment comprises
– DBMS software: Microsoft SQL Server, Oracle, MySQL, etc.
– Operating system: Microsoft Windows, Linux, UNIX, etc.
– Application programs and utility programs
– Network software, if the DBMS is being used over a network.
Data
 Data is the resource for which the DBMS was designed. The motive behind
the creation of the DBMS was to store and utilize data.
 The database contains both the operational data and the metadata.
 Metadata is data about the data. It is information the DBMS stores in order to
describe the data it holds.
 For example, when we store specific data (say, a person's name) in
the database, the DBMS also stores additional information such as when
and where the data was stored, the size of the data, whether the data is
relative or dependent, the data type, etc. All this additional information about
the actual data (i.e. the person's name) is collectively called metadata.

Procedure
 Procedures are the general instructions and guidelines for using the
DBMS.
 These instructions include
‐ how to set up the database,
‐ how to install the database,
‐ how to log in and log out of the database,
‐ how to manage the database,
‐ how to take a backup of the database, and
‐ how to generate the report of the database
 The basic purpose of procedures is to guide users through the operation and
management of database systems.

People/ User
 People refers to everyone who designs or accesses the database and performs
operations such as creating, deleting, accessing or modifying data in the database through the DBMS.
 The users (people) of the DBMS can be classified into the following types:
‐ Database designer: responsible for identifying the data to be
stored in the database and for choosing appropriate structures to represent and store
this data.
‐ Database administrator: responsible for overseeing, controlling and managing the database
resources (the database itself, the DBMS and other related software).
‐ Application developer: writes the application programs through which
data in the database is retrieved, inserted, updated and deleted.
‐ End user: any person who interacts with a DBMS and performs various
database-related operations such as inserting, modifying, retrieving or deleting data using
database commands or applications.
Functions of DBMS
 A Database Management System (DBMS) serves several critical functions that are essential
for managing and organizing data efficiently.
 Here are the key functions of a DBMS:
 Data Storage, Retrieval, and Update: It provides a systematic way to store, retrieve
and update data easily.
 Data Manipulation: Uses languages like SQL (Structured Query Language) for data
querying, insertion, updating, and deletion.
 Data Security: Protects data from unauthorized access through user authentication and
permissions.
 Multi-User Support: Allows multiple users to access and manipulate the database
simultaneously while maintaining data integrity.
 Data Abstraction and Independence: Allows users to interact with data without needing
to understand the underlying complexities of how the data is stored.
 Etc.
Database Languages
 Database languages are specialized languages used to interact with a
database.
 They allow users to perform different tasks such as defining, controlling, and
manipulating the data.
 There are several types of database languages in DBMS, categorized into the
following four main types:
 DDL (Data Definition Language)
 DCL (Data Control Language)
 DML (Data Manipulation Language)
 TCL (Transaction Control Language)

DDL
 DDL is used to define and modify the structure of the database itself,
including the tables, views, indexes, and other schema-related objects.
 It deals with the creation and modification of database schema, but it doesn't
deal with the data itself.
 Following are the main DDL commands in SQL:
 CREATE: Used to create database objects like tables, indexes, or views.
 ALTER: Used to modify the structure of an existing database object, such as adding a
new column to a table.
 DROP: Used to delete database objects.
 TRUNCATE: Used to remove all rows from a table, without affecting the structure.
 RENAME: Used to change the name of a database object.
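The DDL commands above can be tried out in a short sketch. It assumes SQLite through Python's `sqlite3` module; the `course` table and the `table_names` helper are hypothetical. SQLite has no TRUNCATE statement, so the sketch empties the table with `DELETE` and no `WHERE` clause instead; other DBMSs such as MySQL or Oracle do provide `TRUNCATE TABLE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names(connection):
    """List the tables currently defined in the schema."""
    return [r[0] for r in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]


conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")  # CREATE
conn.execute("ALTER TABLE course ADD COLUMN credit_hours INTEGER")       # ALTER
conn.execute("ALTER TABLE course RENAME TO module")                      # RENAME
after_rename = table_names(conn)

# SQLite has no TRUNCATE; DELETE without WHERE empties the table instead.
conn.execute("DELETE FROM module")

conn.execute("DROP TABLE module")                                        # DROP
after_drop = table_names(conn)

print(after_rename)
print(after_drop)
```

Every statement here changes the schema or empties a table; none of them reads or edits individual rows, which is the job of DML.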

DML
 The DML (Data Manipulation Language) is used to manage and manipulate
data within a database.
 With DML, you can perform various operations such as inserting, updating,
selecting, and deleting data.
 These operations allow you to work with the actual content in your database
tables.
 Here are the key DML commands:
 SELECT: Retrieves data from the table based on specific criteria.
 INSERT: Adds new rows of data into an existing table.
 UPDATE: Modifies existing data in a table.
 DELETE: Removes data from a table.
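The four DML commands can be sketched in sequence. The example assumes SQLite through Python's `sqlite3` module; the `account` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance REAL)")

# INSERT: add new rows.
conn.executemany("INSERT INTO account (owner, balance) VALUES (?, ?)",
                 [("Almaz", 100.0), ("Dawit", 250.0)])
# UPDATE: modify existing data.
conn.execute("UPDATE account SET balance = balance + 50 WHERE owner = ?", ("Almaz",))
# DELETE: remove rows that match a condition.
conn.execute("DELETE FROM account WHERE owner = ?", ("Dawit",))
# SELECT: retrieve what is left.
rows = conn.execute("SELECT owner, balance FROM account").fetchall()
print(rows)
```

The `?` placeholders pass values separately from the SQL text, which is the usual way to supply data from a program.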

DCL
 DCL is used to control the access permissions of users to the database.
 DCL commands help grant or revoke privileges to users, determining who
can perform actions like reading or modifying data.
 The two main DCL commands are:
 GRANT: Gives a user privileges on the database or its objects
 REVOKE: Removes privileges from a user

TCL
 TCL commands are used to manage and control transactions in a
database; a transaction groups statements into a logical unit of work.
 TCL is used to make permanent, or to undo, the changes made by DML statements.
 TCL commands include:
 COMMIT: saves the changes made by the transaction permanently in the database
 ROLLBACK: undoes the changes made since the last commit
 SAVEPOINT: marks a point within a transaction that you can roll back to
later without undoing the whole transaction
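The three TCL commands can be sketched together. The example assumes SQLite through Python's `sqlite3` module; the `account` table, the savepoint name `before_bonus`, and the `balances` helper are hypothetical. Transactions are started explicitly with `BEGIN` so that each TCL statement is visible.

```python
import sqlite3

# isolation_level=None stops the sqlite3 module from opening transactions
# implicitly, so the BEGIN/COMMIT/ROLLBACK below are fully explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (owner TEXT PRIMARY KEY, balance REAL)")


def balances(connection):
    """Return the current table contents as {owner: balance}."""
    return dict(connection.execute("SELECT owner, balance FROM account"))


conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES ('Almaz', 100.0)")
conn.execute("COMMIT")                      # the insert is now permanent

conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 150.0 WHERE owner = 'Almaz'")
conn.execute("SAVEPOINT before_bonus")      # mark a point inside the transaction
conn.execute("UPDATE account SET balance = 999.0 WHERE owner = 'Almaz'")
conn.execute("ROLLBACK TO before_bonus")    # undo only the second update
conn.execute("COMMIT")
after_savepoint = balances(conn)

conn.execute("BEGIN")
conn.execute("DELETE FROM account")
conn.execute("ROLLBACK")                    # undo everything since the last commit
after_rollback = balances(conn)

print(after_savepoint)
print(after_rollback)
```

The first update survives because it happened before the savepoint, while the delete is undone entirely because it was never committed.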

End of Chapter One
