Dbms Boiii

This document provides an introduction to databases and database management systems. It defines key terms like data, database, and DBMS. It describes different types of databases and characteristics of the database approach, such as data abstraction and concurrent access. It also discusses database users, advantages of databases, DBMS software, and considerations for when not to use a DBMS.

Uploaded by

Arya Atharva

Unit 1: Introduction

Basic Definitions
 Data:
Known facts that can be recorded and have an implicit meaning.
 Database:
A collection of related data.
 Database Management System (DBMS):
Collection of programs that enables users to create and maintain
a database.
A software package/ system to facilitate the creation and
maintenance of a computerized database.
Types of Databases and Database Applications
 Numeric and Textual Databases
 Multimedia Databases
 Geographic Information Systems (GIS)
 Data Warehouses
 Real-time and Active Databases
File System and Drawbacks of Using a File System
 Data redundancy and inconsistency
 Multiple file formats, duplication of information in different files
 Difficulty in accessing data
 Need to write a new program to carry out each new task
 Integrity problems
 Integrity constraints - provide a way of ensuring that changes
made to the database by authorized users do not result in a
loss of data consistency.
 Hard to add new constraints or change existing ones
Drawbacks of using File system
 Atomicity of updates
 Failures may leave database in an inconsistent state with partial
updates carried out
 E.g. transfer of funds from one account to another should either
complete or not happen at all
 Concurrent access by multiple users
 Concurrent access needed for performance
 Uncontrolled concurrent accesses can lead to inconsistencies
 E.g. two people reading a balance and updating it at the same time
 Security problems
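The funds-transfer example above can be sketched with a transaction. This is a minimal illustration, assuming SQLite through Python's standard sqlite3 module; the account table, the CHECK rule and the transfer function are hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical schema: a balance may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failed debit would leave a partial update
            # behind if there were no rollback.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))


ok = transfer(conn, 1, 2, 30)
after_ok = balances(conn)
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1
after_failed = balances(conn)
```

The second call credits account 2 and then fails on the debit; the rollback undoes the credit, so the balances are the same as before the call.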
 Database System:
The DBMS software and the database.
Characteristics of the Database Approach
 Self-describing nature of a database system:
 A DBMS catalog stores the description of the database, including
the structure of each file, the type and storage format of each
data item, and various constraints on the data. This description
is called meta-data.
 This allows the DBMS software to work with different
databases.
 Insulation between programs and data:
 Called program-data independence.
 Allows changing data storage structures and operations without
having to change the programs that access the DBMS.
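The self-describing nature of a database can be seen by querying the catalog itself. A small sketch, assuming SQLite (whose catalog table is sqlite_master) through Python's sqlite3 module; the student table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# The catalog lists the tables the database contains.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (row[1], row[2], row[3])
    for row in conn.execute("PRAGMA table_info(student)")
]
```

Generic code like this works against any SQLite database because the description is stored with the data rather than hard-coded in the program.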
Characteristics of the Database Approach
 Data Abstraction:
 A data model is used to hide storage details and present the
users with a conceptual view of the database.

 Support of multiple views of the data:
 Each user may see a different view of the database, which
describes only the data of interest to that user.
 Sharing of data and multiuser transaction processing:
 Allows a set of concurrent users to retrieve and update the database.
 Concurrency control within the DBMS guarantees that each transaction is
correctly executed or completely aborted.
 OLTP (Online Transaction Processing) is a major part of database applications.
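Multiple views over the same stored data can be sketched as follows. This assumes SQLite via Python's sqlite3 module; the student table and the two view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL);
    INSERT INTO student VALUES (1, 'Asha', 'CS', 3.6), (2, 'Ravi', 'MATH', 3.1);

    -- View for a directory user: no grades visible.
    CREATE VIEW student_directory AS
        SELECT id, name, major FROM student;

    -- View for a CS advisor: only CS students, with grades.
    CREATE VIEW cs_grades AS
        SELECT name, gpa FROM student WHERE major = 'CS';
    """
)

directory = conn.execute("SELECT * FROM student_directory ORDER BY id").fetchall()
cs = conn.execute("SELECT * FROM cs_grades").fetchall()
```

Each view exposes only the data of interest to one class of user, while the data itself is stored once.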
Typical DBMS Functionality

 Define a database: in terms of data types, structures and
constraints
 Construct or load the database on a secondary storage
medium
 Manipulate the database: querying, generating reports,
insertions, deletions and modifications to its content, and
access through Web applications
 Concurrent processing and sharing by a set of users and
programs, while keeping all data valid and consistent
Typical DBMS Functionality
Other features:
 Protection or Security measures to prevent unauthorized
access
 “Active” processing to take internal actions on data
 Presentation and Visualization of data
Example of a Database
(with a Conceptual Data Model)

 Mini-world for the example:
Part of a UNIVERSITY environment.
 Some mini-world entities:
 STUDENTs
 COURSEs
 SECTIONs (of COURSEs)
 (academic) DEPARTMENTs
 INSTRUCTORs
Note: The above could be expressed in the ENTITY-RELATIONSHIP
data model.

Example of a Database
(with a Conceptual Data Model)
 Some mini-world relationships:
 SECTIONs are of specific COURSEs
 STUDENTs take SECTIONs
 COURSEs have prerequisite COURSEs
 INSTRUCTORs teach SECTIONs
 COURSEs are offered by DEPARTMENTs
 STUDENTs major in DEPARTMENTs

Note: The above could be expressed in the ENTITY-RELATIONSHIP
data model.
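One possible relational rendering of the mini-world entities and relationships is sketched below. It is an illustration only, assuming SQLite through Python's sqlite3 module; every table name, column name and key choice is an assumption and is not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE department (dept_code TEXT PRIMARY KEY, name TEXT NOT NULL);

    -- STUDENTs major in DEPARTMENTs
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        major TEXT REFERENCES department(dept_code));

    CREATE TABLE instructor (instructor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- COURSEs are offered by DEPARTMENTs
    CREATE TABLE course (
        course_no TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        dept_code TEXT NOT NULL REFERENCES department(dept_code));

    -- COURSEs have prerequisite COURSEs
    CREATE TABLE prerequisite (
        course_no TEXT REFERENCES course(course_no),
        prereq_no TEXT REFERENCES course(course_no),
        PRIMARY KEY (course_no, prereq_no));

    -- SECTIONs are of specific COURSEs; INSTRUCTORs teach SECTIONs
    CREATE TABLE section (
        section_id INTEGER PRIMARY KEY,
        course_no TEXT NOT NULL REFERENCES course(course_no),
        instructor_id INTEGER REFERENCES instructor(instructor_id));

    -- STUDENTs take SECTIONs
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student(student_id),
        section_id INTEGER REFERENCES section(section_id),
        PRIMARY KEY (student_id, section_id));
    """
)

table_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)

# A section of a course that does not exist violates the relationship.
try:
    conn.execute("INSERT INTO section (course_no) VALUES ('CS999')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```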

Database Users
 Users may be divided into
 those who actually use and control the content (called
“Actors on the Scene”) and
 those who enable the database to be developed and the
DBMS software to be designed and implemented (called
“Workers Behind the Scene”).

Database Users
Actors on the scene
 Database administrators: responsible for authorizing
access to the database, for coordinating and monitoring
its use, acquiring software and hardware resources,
controlling its use and monitoring efficiency of
operations.
 Database Designers: responsible for defining the
content, the structure, the constraints, and functions or
transactions against the database. They must
communicate with the end-users and understand their
needs.
 End-users: they use the data for queries and reports, and
some of them also update the database content.
Categories of End-users
 Casual : access database occasionally when needed
 Naive or Parametric : they make up a large section of the
end-user population. They use previously well-defined
functions in the form of “canned transactions” against the
database.
 Examples are bank-tellers or reservation clerks who do this activity
for an entire shift of operations.

Categories of End-users
 Sophisticated : these include business analysts, scientists,
engineers, others thoroughly familiar with the system
capabilities. Many use tools in the form of software packages
that work closely with the stored database.

 Stand-alone : mostly maintain personal databases using
ready-to-use packaged applications. An example is a tax
program user that creates his or her own internal database.

Workers Behind the Scene
 DBMS system Designers and Implementers
 Design and implement the DBMS modules and
interfaces as a software package.
 Tool Developers
 Design and implement tools: software packages that support
database design, monitoring and performance tuning
 Operators and Maintenance Personnel
 Responsible for the actual running and maintenance of
the hardware and software environment for the database
system.

Advantages of Using the Database Approach
 Controlling redundancy

 Sharing of data among multiple users.

 Restricting unauthorized access to data.

 Providing backup and recovery services.

 Providing multiple interfaces to different classes of users.

 Representing complex relationships among data.

 Enforcing integrity constraints on the database.


DBMS Software
1. Oracle
2. Microsoft SQL Server
3. IBM DB2
4. SAP Sybase ASE
5. PostgreSQL
6. MySQL
7. Teradata
8. Informix
9. Ingres
10. MariaDB
Largest Databases in the World
 1. The World Data Centre for Climate
 2. National Energy Research Scientific Computing Center
 3. AT&T – Telecommunication
 4. Google
 5. Sprint – Telecommunication
 6. LexisNexis
 7. YouTube
 8. Amazon
 9. Central Intelligence Agency (CIA)
 10. Library of Congress
When not to use a DBMS
 Main inhibitors (costs) of using a DBMS:
 High initial investment in hardware, software and training.
 Overhead for providing generality, security, concurrency
control, recovery, and integrity functions.
 When a DBMS may be unnecessary:
 If the database and applications are simple, well defined, and
not expected to change.
 If there are stringent real-time requirements that may not be
met because of DBMS overhead.
 If access to data by multiple users is not required.

When not to use a DBMS
 When no DBMS may suffice:
 If the database system is not able to handle the complexity of
data because of modeling limitations
 If the database users need special operations not supported by
the DBMS.

Data Models
 Data Model:
 A collection of concepts to describe the structure of a database,
the operations for manipulating these structures, and certain
constraints that the database should obey.
 Data Model Structure and Constraints:
 Constructs are used to define the database structure
 Constructs typically include elements (and their data types) as
well as groups of elements (e.g. entity, record, table), and
relationships among such groups
 Constraints specify some restrictions on valid data; these
constraints must be enforced at all times
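How a DBMS enforces the constraints of a data model can be sketched with a table whose rules are declared once in the schema. The example assumes SQLite via Python's sqlite3 module; the section table and its rules are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE section ("
    " section_id INTEGER PRIMARY KEY,"                 # key constraint
    " course_no TEXT NOT NULL,"                        # value must be present
    " capacity INTEGER NOT NULL CHECK (capacity > 0))"  # domain restriction
)


def try_insert(row):
    """Return 'accepted' or 'rejected' depending on the declared constraints."""
    try:
        with conn:
            conn.execute("INSERT INTO section VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert((1, "CS101", 30)),  # valid row
    try_insert((2, None, 30)),     # violates NOT NULL
    try_insert((3, "CS102", 0)),   # violates CHECK
    try_insert((1, "CS103", 25)),  # duplicates the primary key
]
row_count = conn.execute("SELECT COUNT(*) FROM section").fetchone()[0]
```

Because the rules live in the schema, every program that updates the table is subject to them without repeating the checks in application code.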
Categories of Data Models
 Conceptual (high-level, semantic) data models:
 Provide concepts that are close to the way many users perceive
data. (Also called entity-based or object-based data models.)
 Physical (low-level, internal) data models:
 Provide concepts that describe details of how data is stored in the
computer.
 These are usually specified through DBMS design and
administration manuals
 Implementation (representational) data models:
 Provide concepts that fall between the above two, used by many
commercial DBMS implementations (e.g. relational data models used
in many commercial systems).
Schemas versus Instances
 Database Schema:
 The description of a database.
 Includes descriptions of the database structure, data types, and
the constraints on the database.

 Database State:
 The data stored in a database at a particular moment in time,
that is, the collection of all the data in the database.
 Also called database instance (or occurrence or snapshot).
 The term instance is also applied to individual database
components, e.g. record instance, table instance, entity instance
 Initial Database State:
 Refers to the database state when it is initially loaded into the
system.
 Valid State:
 A state that satisfies the structure and constraints of the
database.
 Distinction
 The database schema changes very infrequently.
 The database state changes every time the database is updated.
 Schema is also called intension.
 State is also called extension.
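The schema/state distinction can be made concrete by reading both before and after an update. A minimal sketch, assuming SQLite via Python's sqlite3 module; the course table and the helper names schema and state are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_no TEXT PRIMARY KEY, title TEXT)")


def schema(conn):
    """The stored description of the table (intension)."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'course'"
    ).fetchone()[0]


def state(conn):
    """The rows currently in the table (extension)."""
    return conn.execute(
        "SELECT course_no, title FROM course ORDER BY course_no"
    ).fetchall()


schema_before = schema(conn)
empty_state = state(conn)  # state before any data is loaded

conn.execute("INSERT INTO course VALUES ('CS101', 'Intro to Databases')")
conn.execute("INSERT INTO course VALUES ('CS102', 'Data Structures')")
conn.commit()

schema_after = schema(conn)
current_state = state(conn)
```

The inserts change the state, while the schema text read from the catalog stays the same.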


Three schema architecture
 Important characteristics of database approach:
 Insulation of programs and data
 Support multiple user views
 Use of a catalog to store the database description (Schema)
 The three-schema architecture was proposed to help achieve these
characteristics
 Architecture goal: to separate the user applications from the
physical database
Three schema architecture
Defines DBMS schemas at three levels:
 Internal schema at the internal level to describe physical
storage structures and access paths. Typically uses a
physical data model.
 Conceptual schema at the conceptual level to describe the
structure and constraints for the whole database for a
community of users. Uses a conceptual or an
implementation data model.
 External schemas at the external level to describe the
various user views. Usually uses the same data model as
the conceptual level.
Data Independence
 Capacity to change the schema at one level of database
system without having to change the schema at the next
higher level
 Logical Data Independence: The capacity to change the
conceptual schema without having to change the external
schemas and their application programs.
 Physical Data Independence: The capacity to change the
internal schema without having to change the conceptual
schema.
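Both kinds of data independence can be approximated in a small experiment. This sketch assumes SQLite via Python's sqlite3 module, treats an index as an internal-level change and an added column as a conceptual-level change, and uses hypothetical table, view and index names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT);
    INSERT INTO student VALUES (1, 'Asha', 'CS'), (2, 'Ravi', 'MATH');
    -- External schema: the view the application is written against.
    CREATE VIEW student_names AS SELECT id, name FROM student;
    """
)

# The application's query never changes below.
QUERY = "SELECT id, name FROM student_names ORDER BY id"
before = conn.execute(QUERY).fetchall()

# Physical data independence: add an access path (internal-level change).
conn.execute("CREATE INDEX idx_student_major ON student (major)")
after_physical = conn.execute(QUERY).fetchall()

# Logical data independence: extend the conceptual schema with a new column.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_logical = conn.execute(QUERY).fetchall()
```

The same query against the same view returns the same rows after each change, which is the property the two definitions describe.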
 Once the design of a database is
completed and a DBMS is chosen to
implement the database, the first step
is to specify conceptual and
internal schemas for the database and any
mappings between the two.
 In many DBMSs where no strict separation of
levels is maintained, one language, called the
data definition language (DDL), is used by
the DBA and by database designers to define
both schemas.
 This language is used to define data structures,
especially database schemas. Its statements
create, alter, or drop data structures;
CREATE, ALTER, and DROP are examples of
DDL statements.
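The three DDL statements named above can be tried out as follows. A minimal sketch, assuming SQLite via Python's sqlite3 module; the department table and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_columns(conn, table):
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def table_names(conn):
    return [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]


# CREATE: define a new structure.
conn.execute("CREATE TABLE department (dept_code TEXT PRIMARY KEY, name TEXT NOT NULL)")
cols_after_create = table_columns(conn, "department")

# ALTER: change an existing structure.
conn.execute("ALTER TABLE department ADD COLUMN office TEXT")
cols_after_alter = table_columns(conn, "department")

# DROP: remove the structure and its description from the catalog.
conn.execute("DROP TABLE department")
tables_after_drop = table_names(conn)
```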

 The DBMS will have a DDL compiler
whose function is to process DDL
statements in order to identify
descriptions of the schema constructs and
to store the schema description in the
DBMS catalog.
 In DBMSs where a clear separation is
maintained between the conceptual and internal
levels, the DDL is used to specify the
conceptual schema only. Another language, the
storage definition language (SDL), is used to
specify the internal schema.
 This language is used to define the internal
schema. It defines the physical structure of the
database: how many bytes each field uses, the
order of fields, and how records are accessed.
 The mappings between the two schemas
may be specified in either one of these
languages. For a true three-schema
architecture, we would need a third
language, the view definition language
(VDL), to specify user views and their
mappings to the conceptual schema, but
in most DBMSs the DDL is used to
define both conceptual and external
schemas.
 Once the database schemas are compiled and
the database is populated with data, users must
have some means to manipulate the database.
Typical manipulations include retrieval,
insertion, deletion, and modification of the
data.
 The DBMS provides a data manipulation
language (DML) for these purposes.
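The four typical manipulations (retrieval, insertion, deletion and modification) look like this in SQL. A small sketch, assuming SQLite via Python's sqlite3 module; the instructor table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")

# Insertion
with conn:
    conn.executemany(
        "INSERT INTO instructor VALUES (?, ?, ?)",
        [(1, "Mehta", "CS"), (2, "Iyer", "MATH"), (3, "Khan", "CS")],
    )

# Retrieval
cs_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM instructor WHERE dept = ? ORDER BY name", ("CS",)
    )
]

# Modification and deletion, committed together
with conn:
    conn.execute("UPDATE instructor SET dept = ? WHERE id = ?", ("STAT", 2))
    conn.execute("DELETE FROM instructor WHERE id = ?", (3,))

remaining = conn.execute("SELECT id, name, dept FROM instructor ORDER BY id").fetchall()
```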
 In current DBMSs, the preceding types of languages
are usually not considered distinct
languages; rather, a comprehensive
integrated language is used that includes
constructs for conceptual schema
definition, view definition, and data
manipulation.
 Storage definition is typically kept
separate, since it is used for defining
physical storage structures to fine-tune the
performance of the database system, and
it is usually utilized by the DBA staff.
 A typical example of a comprehensive
database language is the SQL relational
database language which represents a
combination of DDL, VDL, and DML, as
well as statements for constraint
specification and schema evolution. The
SDL was a component in earlier versions
of SQL but has been removed from the
language to keep it at the conceptual and
external levels only.
 There are two main types of DMLs. A high-
level or nonprocedural DML can be used on
its own to specify complex database operations
in a concise manner. Many DBMSs allow high-
level DML statements either to be entered
interactively from a terminal (or monitor) or to
be embedded in a general-purpose
programming language.
 In the latter case, DML statements must
be identified within the program so that
they can be extracted by a pre-compiler
and processed by the DBMS. A low-level
or procedural DML must be embedded
in a general-purpose programming
language.
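 A low-level DML typically retrieves individual
records from the database and processes each one
separately, using host-language constructs such as
loops. Low-level DMLs are therefore also called
record-at-a-time DMLs.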
 High-level DMLs, such as SQL, can specify
and retrieve many records in a single DML
statement and are hence called set-at-a-time or
set-oriented DMLs.
 A query in a high-level DML often specifies
which data to retrieve rather than how to
retrieve it; hence, such languages are also
called declarative.
 Whenever DML commands, whether high-
level or low-level, are embedded in a general-
purpose programming language, that language
is called the host language and the DML is
called the data sublanguage.
 On the other hand, a high-level DML used in a
stand-alone interactive manner is called a
query language.
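The contrast between record-at-a-time and set-at-a-time processing can be imitated with Python as the host language. This is only an approximation: SQL itself is a high-level DML, so the first function mimics the record-at-a-time style with a host-language loop. It assumes SQLite via Python's sqlite3 module; the employee table and function names are hypothetical.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, dept TEXT, salary INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "CS", 1000), (2, "CS", 1200), (3, "MATH", 900)],
    )
    conn.commit()
    return conn


def salaries(conn):
    return conn.execute("SELECT id, salary FROM employee ORDER BY id").fetchall()


def raise_record_at_a_time(conn, dept, amount):
    """Procedural style: fetch the records, then handle each one in a loop."""
    rows = conn.execute(
        "SELECT id, salary FROM employee WHERE dept = ?", (dept,)
    ).fetchall()
    for emp_id, salary in rows:
        conn.execute(
            "UPDATE employee SET salary = ? WHERE id = ?", (salary + amount, emp_id)
        )
    conn.commit()


def raise_set_at_a_time(conn, dept, amount):
    """Declarative style: one statement names the whole set of records."""
    conn.execute(
        "UPDATE employee SET salary = salary + ? WHERE dept = ?", (amount, dept)
    )
    conn.commit()


db_a = make_db()
raise_record_at_a_time(db_a, "CS", 100)
record_result = salaries(db_a)

db_b = make_db()
raise_set_at_a_time(db_b, "CS", 100)
set_result = salaries(db_b)
```

Both functions produce the same final state; the set-oriented version states which rows to change and leaves the how to the DBMS.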
DBMS Interfaces
 User-friendly interfaces provided by a DBMS may include the
following:
 Menu-Based Interfaces for Web Clients or Browsing. These
interfaces present the user with lists of options (called menus)
that lead the user through the formulation of a request.
 Forms-Based Interfaces. A forms-based interface displays a
form to each user.
 Graphical User Interfaces. A GUI typically displays a schema
to the user in diagrammatic form
 Natural Language Interfaces. These interfaces accept requests
written in English or some other language and attempt to
understand them.
 Speech Input and Output. Limited use of speech as an input
query and speech as an answer to a question or result of a
request is becoming commonplace.
 Interfaces for Parametric Users. Parametric users, such as
bank tellers, often have a small set of operations that they
must perform repeatedly.
 Interfaces for the DBA. Most database systems contain
privileged commands that can be used only by the DBA staff.
These include commands for creating accounts, setting system
parameters, granting account authorization, changing a schema,
and reorganizing the storage structures of a database.
DBMS Components

 A database system, being a complex software
system, is partitioned into several software
components that handle various tasks such as
data definition and manipulation, security and
data integrity, data recovery and concurrency
control, and performance optimization.
Data definition:
 The DBMS provides functions to define the structure of the data.
These functions include defining and modifying the record structure, the
data type of fields, and the various constraints to be satisfied by the
data in each field.
 It is the responsibility of the database administrator to define the
database, and make changes to its definition (if required) using the
DDL and other privileged commands.
 The DDL compiler component of the DBMS processes these schema
definitions, and stores the schema descriptions in the DBMS catalog
(data dictionary). Other DBMS components then refer to the catalog
information as and when required.
Data manipulation
 Once the data structure is defined, data needs
to be manipulated. The manipulation of data
includes insertion, deletion, and modification
of records. The functions that perform these
operations are also part of the DBMS.
 The queries that are defined as a part of the
application programs are known as planned
queries.
 The application programs are submitted to a
precompiler, which extracts DML commands
from the application program and sends them to
the DML compiler for compilation.
 The rest of the program is sent to the host
language compiler.
 The object codes of both the DML commands
and the rest of the program are linked and sent
to the query evaluation engine for execution.
 Ad hoc queries that are executed as and when the
need arises are known as unplanned queries
(interactive queries). These queries are compiled by
the query compiler, and then optimized by the query
optimizer.
 The query optimizer consults the data dictionary for
statistical and other physical information about the
stored data. The optimized query is finally passed to
the query evaluation engine for execution.
 Naive users of the database can also
query and update the database by using
predefined application program interfaces.
The object code of these queries is also
passed to the query evaluation engine for
processing.
Data security and integrity:
 The DBMS contains functions that handle
the security and integrity of data stored in the
database. Since these functions can be easily
invoked by the application, the application
programmer need not code these functions in
the programs.
Concurrency and data recovery:
 The DBMS also contains some functions that
deal with the concurrent access of records by
multiple users and the recovery of data after a
system failure
Performance optimization:
 The DBMS has a set of functions that optimize
the performance of the queries by evaluating
the different execution plans of a query and
choosing the best among them.
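The optimizer's choice of execution plan can be observed directly. A sketch assuming SQLite via Python's sqlite3 module and its EXPLAIN QUERY PLAN statement; the exact wording of the plan text differs between SQLite versions, and the table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "CS"), (2, "Ravi", "MATH"), (3, "Noor", "CS")],
)
conn.commit()

QUERY = "SELECT name FROM student WHERE major = ?"


def plan(conn):
    """Return the optimizer's chosen plan as text (last column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("CS",)).fetchall()
    return " | ".join(row[-1] for row in rows)


# With no index on major, the only available plan is a full table scan.
plan_without_index = plan(conn)

# Once an index exists, the optimizer can choose an index search instead.
conn.execute("CREATE INDEX idx_student_major ON student (major)")
plan_with_index = plan(conn)
```

The query text is identical in both cases; only the plan selected by the optimizer changes.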