DBMS First Chapter


Introduction to Database Systems

Chapter1: Introduction (T1:3-26) (T1:29-52)


 Introduction
 An example
 Characteristics of Database approach
 Actors on the scene
 Workers behind the scene
 Advantages of using DBMS approach
 A brief history of database applications
 When not to use a DBMS
 Data models, schemas and instances
 Three-schema architecture and data independence
 Database languages and interfaces
 The database system environment
 Centralized and client-server architectures
 Classification of Database Management systems

Introduction
 Data are known facts that can be recorded and that have an implicit meaning.
 A database is a collection of logically related data.
 Properties of a database:
 It represents some aspect of the real world (the miniworld).
 It is a logically coherent collection of data with some inherent meaning.
 It is designed, built and populated with data for a specific purpose.
 A database can be of any size and complexity.
Overview
 Ex1: A university database
 Entities such as students, faculty, courses
 Relationships between entities such as students’ enrollment in courses, faculty teaching
courses
 Ex2: A Hospital database
 Entities such as doctors, patients, nurses, wards
 Relationships between entities such as doctors visiting patients, patients in rooms.
 A database may be generated and maintained manually or it may be computerized.
 A Database Management System (DBMS) is a software package (collection of programs) that
enables users to create, store and maintain a database.
 The DBMS is a general purpose software system that facilitates the process of defining,
constructing, manipulating, and sharing databases among various users and applications.
Functionalities of a DBMS
 A DBMS is a general purpose software system facilitating each of the following (with respect to
a database):
 Defining a database
o specifying data types, structures, and constraints of the data to be stored in the database.
 Constructing the database
o the process of storing the data on some storage medium (e.g., magnetic disk) that is
controlled by the DBMS
 Manipulating the database
o querying the database to retrieve specific data, updating the database to reflect changes in
the miniworld, and generating reports
 Sharing a database
o allowing multiple users and programs to access the database "simultaneously"
 Maintaining the database
o allowing the system to evolve as requirements change over time
Protection includes system protection and security protection.
 System protection
o preventing database from becoming corrupted when hardware or software failures occur
 Security protection
o preventing unauthorized or malicious access to database.
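
The defining, constructing, and manipulating functions listed above can be sketched with a small program. This is only an illustration, assuming SQLite (through Python's standard sqlite3 module) as the DBMS; the student table, its columns, and the sample rows are made up for the example.

```python
import sqlite3

# Hypothetical miniworld: a single student table (all names are illustrative).
conn = sqlite3.connect(":memory:")

# Defining: specify data types, structures, and constraints.
conn.execute(
    "CREATE TABLE student (sid TEXT PRIMARY KEY, sname TEXT NOT NULL, gpa REAL)"
)

# Constructing: store the data on a medium controlled by the DBMS.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("s1", "Asha", 3.4), ("s2", "Ravi", 2.9)],
)

# Manipulating: update to reflect a change in the miniworld, then query.
conn.execute("UPDATE student SET gpa = 3.1 WHERE sid = 's2'")
high_gpa = conn.execute(
    "SELECT sname FROM student WHERE gpa >= 3.0 ORDER BY sname"
).fetchall()
print(high_gpa)
```

After the update, both students satisfy the condition, so the query returns both names.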

An Example for Database Design

 1st Step: Requirements definition & analysis


 2nd Step: Conceptual design
 3rd Step: Logical design or data model mapping
 4th Step: Physical design
Database Design
1st Step: Requirements analysis
 Database designers interview prospective database users to understand and document their data
requirements
 Two types of requirements
 Functional requirements
 Database requirements
2nd Step: Conceptual design
 Create conceptual schema using high level conceptual data model
 The conceptual schema is a description of the users' data requirements and includes entity
types, relationships, and constraints.
 The conceptual schema does not include implementation details and can be used to communicate
with non-technical users.
 It can be used to ensure that all users' data requirements are met and that no conflicts exist.
3rd Step: Logical design or data model mapping
 Conceptual schema is transformed from the high level data model into the implementation data
model.
 Actual implementation of the database using a commercial DBMS
4th Step: Physical design
 Internal storage structures, access paths, and file organizations for the database files are specified
 In parallel with all steps, application programs are designed and implemented as database
transactions corresponding to the high level transaction specifications.

File System vs a DBMS


 In the file processing approach
o Each user defines and implements the files needed and software applications to
manipulate those files.
o Data definition is part of the application programs themselves; to change definition of
files, programs must be changed
o A program can access only its own specific files, and a special program must be written for
every query.
o This redundancy in defining and storing data results in wasted storage space and in
redundant effort to keep common data up to date.
 In the database approach
o A single repository of data is maintained that is used by various users.
o A database system is self-describing.
o The definition and description of the database are stored in the catalog (this information is
known as metadata); to change the description of a database, the database software is not changed.
Characteristics of a Database system
 Self-describing nature of a database system
 Insulation between programs and data, and data abstraction
 Support of multiple views of the data
 Sharing of data and multiuser transaction processing
1. Self describing nature of a Database system
In DBMS
 A complete description of the database structure and constraints is stored in the DBMS catalog;
this description is called metadata.
 A general purpose DBMS Software package is not written for a specific database application; it
must refer to the catalog to know the structure of the files in a specific database.
In Traditional file processing System
 The structure of the data files accessed by an application is "hard-coded" in its source code.
 To change the structure of the data, every application in which a description of that file's structure
is hard-coded must be changed.
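
As a rough illustration of a self-describing database, the sketch below asks the DBMS itself for the structure of a table instead of hard-coding it. It assumes SQLite, whose catalog is exposed as the sqlite_master table and the table_info pragma; the course table is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER)")

# sqlite_master is SQLite's catalog: it stores the schema definitions (metadata).
catalog_rows = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column:
# (position, name, declared type, notnull flag, default value, pk flag)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]
print(columns)
```

A general purpose program can use the column list obtained this way to work with any table, which is what a file processing program with a hard-coded record layout cannot do.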
2. Insulation between programs and data, and data abstraction
In DBMS
 DBMS provides an abstract view of the data that hides the details. A single repository of data is
maintained that is used by various users.
 A DBMS provides users with a conceptual representation of data that does not include the details
of how data is stored and how the operations are implemented.
o Program-data independence
o Program-operation independence
 Program-data independence:
o The definition and description of the database are stored in the catalog (as metadata); to
change the description of a database, the database software is not changed.
 Program-operation independence:
o In object-oriented and object-relational systems, an operation is specified in two parts:
o The interface of an operation includes the operation name and the data types of its arguments.
o The implementation of the operation is specified separately and can be changed without
affecting the interface.
o User application programs can operate on the data by invoking these operations through
their names and arguments, regardless of how the operations are implemented.
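
Program-operation independence can be sketched outside any particular DBMS. In the hypothetical example below, gpa_v1 and gpa_v2 are two implementations behind the same interface (one name-style argument, a list of grade points), and report stands in for an application program; none of these names come from a real system.

```python
from statistics import fmean

def gpa_v1(grade_points):
    # First implementation: an explicit loop.
    total = 0.0
    for g in grade_points:
        total += g
    return total / len(grade_points)

def gpa_v2(grade_points):
    # Replacement implementation: same interface, different internals.
    return fmean(grade_points)

def report(calculate_gpa, grade_points):
    # The application relies only on the operation's name and arguments.
    return f"GPA: {calculate_gpa(grade_points):.2f}"

print(report(gpa_v1, [4.0, 3.0, 3.5]))
print(report(gpa_v2, [4.0, 3.0, 3.5]))
```

The application code in report is identical in both calls; only the implementation behind the interface was swapped.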
Support of multiple views of the data
 A database has many users, and different users may have different requirements, i.e. each may
require a different view of the database.
 A view may be a subset of the database or it may contain virtual data that is derived from the
database files but is not explicitly stored.
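
A view containing virtual (derived) data might look like the following sketch. It assumes SQLite and a made-up enrolled table with numeric grade points; the view name student_average is likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enrolled (sid TEXT, cid TEXT, grade_points REAL);
INSERT INTO enrolled VALUES ('s1', 'c1', 4.0), ('s1', 'c2', 3.0), ('s2', 'c1', 2.0);
-- The view stores no rows of its own; avg_points is derived on each query.
CREATE VIEW student_average AS
    SELECT sid, AVG(grade_points) AS avg_points FROM enrolled GROUP BY sid;
""")

VIEW_QUERY = "SELECT sid, avg_points FROM student_average ORDER BY sid"
averages = conn.execute(VIEW_QUERY).fetchall()

# New base data is reflected in the view without any change to the view itself.
conn.execute("INSERT INTO enrolled VALUES ('s2', 'c2', 4.0)")
averages_after = conn.execute(VIEW_QUERY).fetchall()
print(averages, averages_after)
```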
Sharing of data and multiuser transaction processing
 A multiuser DBMS must allow multiple users to access the database at the same time; this is
essential if data for multiple related applications is integrated and maintained in a single database.
 The DBMS includes concurrency control software to ensure that several users updating the same
data do so in a controlled manner, so that the result of the updates is correct.
 Each transaction should possess the ACID properties (atomicity, consistency, isolation, durability).
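
The atomicity part of ACID can be illustrated with a small sketch. It assumes SQLite and Python's sqlite3 module, whose connection object can be used as a context manager that commits on success and rolls back on an exception; the account table and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))

ok = transfer(conn, "A", "B", 30)
failed = transfer(conn, "A", "B", 500)  # would make A negative, so nothing is applied
print(ok, failed, balances(conn))
```

In the failed transfer the credit to B has already been executed when the debit from A violates the CHECK constraint; the rollback undoes the credit as well, so the transfer happens either completely or not at all.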

Actors on the Scene


 Database Administrator
 Database Designers
 End Users
 Casual end users
 Naive/Parametric end users
 Sophisticated end users
 Stand-alone users
 System Analysts and Application Programmers
Database Administrator (DBA)
 This is the chief administrator, who oversees and manages the database system.
Duties include
1. Security & Authorization
 authorizing users to access the database, coordinating/monitoring its use
2. Data availability and recovery from failures
 restoring the data, maintaining the log files
3. Database tuning
 acquiring hardware/software for upgrades, etc.
Database Designers
 They are responsible for identifying the data to be stored and for choosing appropriate
structures to represent and store this data.
 They also define views for different categories of users. The final design must be able to
support the requirements of all the user sub-groups.
End-users
 These are persons who access the database for querying, updating, and report generation.
They are the main reason for the database's existence!
1. Casual end users
 They use the database occasionally, needing different information each time.
 They use a query language to specify their requests.
 They are typically middle- or high-level, software-literate managers.
2. Naive/Parametric end users:
 Typically the biggest group of users; they frequently query/update the database using standard
front-end applications that have been carefully programmed and tested in advance.
 Examples:
 bank tellers check account balances, post withdrawals/deposits
 reservation clerks for airlines, hotels, etc., check availability of seats/rooms and make
reservations.
3. Sophisticated end users
 They are engineers, scientists, business analysts who implement their own applications to
meet their complex needs.
4. Stand-alone users
 They maintain "personal" databases by using ready-made program packages.
 Example: a user of a tax package, who creates his or her own internal database with that
package.
System Analysts and Application Programmers
 They develop the packages that facilitate data access for end users, using host-language
DBMS packages and other DBMS-related software tools such as report writers.

Workers behind the Scene


Database system designers & implementers
 They design and implement the DBMS modules and interfaces as a software package.
 DBMS consists of many components e.g. for implementing the catalog, processing query
language, processing the interface, accessing and buffering data, controlling concurrency and
handling data recovery and security
Tool Developers
 They design and implement tools, the software packages that facilitate database modeling
and design.
 These tools include packages for database design, performance monitoring, natural language
or graphical interfaces, prototyping, simulation, and test data generation.
 Tools are often developed by different vendors and can be purchased separately.
Operators and maintenance personnel
 They are responsible for the actual running and maintenance of the hardware and software
environment for the database system.

Advantages of using the DBMS approach


1. Controlled Redundancy
 In the file processing approach, each user defines and implements the files needed and software
applications to manipulate those files.
 Various files are likely to have different formats and programs may be written in different
languages and same information may be duplicated in several files.
 Data redundancy leads to
o wasted storage space,
o duplication of effort (when multiple copies of a datum need to be updated),
o a higher likelihood of the introduction of inconsistency.
 Good database design stores each logical data item in one place, which ensures consistency and
saves storage.
 Sometimes, however, controlled redundancy is necessary to improve performance.
 The DBMS should then be able to control this redundancy and maintain consistency through
checks specified during database design.
2. Restricting Unauthorized Access
 A DBMS provides a security and authorization subsystem, which is used by DBA to create user
accounts and to specify restrictions on user accounts.
 A file processing system typically offers only a password mechanism and very little security,
which is not sufficient to enforce security policies the way a DBMS can.
3. Providing Persistent Storage for Program Objects
 Object oriented database systems are compatible with programming languages such as C++ and
Java.
 An object oriented DBMS automatically performs the conversions needed to store a complex
program object; such an object is said to be persistent because it survives the termination of
the program.
4. Providing Storage Structures for Efficient Query Processing
 The DBMS uses a variety of sophisticated techniques (views, indexes, etc.) to store and retrieve
data efficiently and so improve the execution time of queries and updates.
 The DBMS provides indexes and buffering for fast access to query results; the choice of indexes
is part of physical database design and tuning.
 The query processing and optimization module is responsible for choosing an efficient query
execution plan for each query submitted to the system.
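
The effect of an index on the chosen execution plan can be observed with a sketch like the one below. It assumes SQLite, whose EXPLAIN QUERY PLAN statement reports the optimizer's plan as text; the exact wording of that text varies between SQLite versions, so the code only checks whether the (hypothetical) index name appears in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sid TEXT, sname TEXT, login TEXT)")

def plan(conn, sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT sname FROM student WHERE login = 'asha@cs'"

plan_before = plan(conn, query)  # no index yet: the table has to be scanned
conn.execute("CREATE INDEX idx_student_login ON student (login)")
plan_after = plan(conn, query)   # the optimizer can now search via the index

print(plan_before)
print(plan_after)
```

The query text is the same in both calls; only the physical design changed, and the optimizer picked a different plan on its own.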
5. Providing Backup & Recovery
 After a system crash, the data should be restored to a consistent state, even if changes were
being made at the time of the crash.
 If hardware or software fails in the middle of an update program, the recovery subsystem of the
DBMS ensures that the database is restored to the state it was in before the program started, or
that the program is resumed from the point of failure.
6. Multiple user interfaces
 DBMS provides a variety of user interfaces for the users of varying level of technical knowledge.
 These include query languages for casual users, programming language interfaces for application
programmers, forms and command codes for parametric users, and menu driven interfaces and natural
language interfaces for stand-alone users.
7. Representing Complex Relationships among data
 A DBMS must have the capability to represent a variety of complex relationships among the data,
to define new relationships as they arise, and to retrieve and update the related data easily and
efficiently.
8. Enforcing Integrity Constraints
 The DBMS must enforce the integrity constraints that hold on the data.
 These constraints are derived from the meaning of the data and of the miniworld.
 Some constraints can be specified to the DBMS as part of the data definitions and are then
automatically enforced.
 The DBMS rejects any update that would violate a specified constraint.
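
Declared constraints being enforced automatically might look like this sketch. It assumes SQLite (which checks foreign keys only after PRAGMA foreign_keys = ON); the course and enrolled tables, the credit range, and the try_insert helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE course (
    cid TEXT PRIMARY KEY,
    credits INTEGER NOT NULL CHECK (credits BETWEEN 1 AND 6)
);
CREATE TABLE enrolled (
    sid TEXT NOT NULL,
    cid TEXT NOT NULL REFERENCES course (cid)
);
INSERT INTO course VALUES ('c1', 4);
""")

def try_insert(sql):
    """Run one INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

bad_credits = try_insert("INSERT INTO course VALUES ('c2', 9)")        # CHECK violated
bad_course = try_insert("INSERT INTO enrolled VALUES ('s1', 'c9')")    # no such course
good_row = try_insert("INSERT INTO enrolled VALUES ('s1', 'c1')")
print(bad_credits, bad_course, good_row)
```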
9. Permitting Inferencing and Action Using Rules
 Deductive database systems provide capabilities for defining deduction rules for inferencing new
information from the stored database facts.
 Triggers can be associated with tables.
 A trigger is a form of a rule activated by updates to the table, which results in performing some
additional operations to some other tables, sending messages and so on.
 Stored procedures can also be part of the overall database definition and are invoked
appropriately when certain conditions are met.
 Active database provides more powerful functionality by providing the active rules that can
automatically initiate actions when certain events and conditions occur.
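
A trigger of the kind described above can be sketched as follows, assuming SQLite's CREATE TRIGGER syntax. The account and audit_log tables and the trigger name are hypothetical; the trigger records every balance change in a second table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER);
CREATE TABLE audit_log (account_id TEXT, old_balance INTEGER, new_balance INTEGER);

-- Rule: whenever a balance is updated, record the change in another table.
CREATE TRIGGER log_balance_change AFTER UPDATE OF balance ON account
BEGIN
    INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
END;

INSERT INTO account VALUES ('A', 100);
""")

# The application only updates account; the DBMS fires the trigger itself.
conn.execute("UPDATE account SET balance = 70 WHERE id = 'A'")
log = conn.execute("SELECT * FROM audit_log").fetchall()
print(log)
```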

Additional Implications of using the DBMS approach


1. Potential for enforcing standards
 The database approach permits the DBA to define and enforce standards among database users in
a large organization.
 Standards can be defined for names and formats of data elements, display formats, report
structures, terminology etc.
 This facilitates communication and cooperation among various departments, projects and users
within the organization.
2. Reduced application development time
 DBMS applications are more robust because many important tasks are handled by the DBMS and
thus do not have to be written, debugged, and tested for each application.
 Programmers can concentrate on the specific functionality required by the users.
3. Flexibility
 It may be necessary to change the structure of a database as requirements change.
 A modern DBMS allows certain types of changes to the structure of the database without affecting
the stored data and the existing application programs.
4. Availability of up-to-date information
 Immediate availability of up-to-date information is essential for many transaction processing
applications.
 In a DBMS, an update applied to the database by one user can immediately be seen by other users.
5. Economies of scale
 The DBMS approach permits consolidation of data and applications, thus reducing wasteful
overlap between the activities of data processing personnel and also reducing storage space.
A brief history of database applications
Early Database Applications using Hierarchical and Network Systems
 Early database applications maintained large numbers of records of similar structure.
 Early database systems intermixed conceptual relationships with the physical storage
and placement of records on disk.
 It was difficult to reorganize the database when changes were made to the requirements of the
applications.
 Early systems provided only programming language interfaces which made it more time
consuming and expensive to implement new queries and transactions.
 Most of these database systems were implemented on large and expensive mainframe computers
during mid-1960s to 1980s.
Providing Application Flexibility with Relational Databases
 Relational databases separate the physical storage of data from its conceptual representation
and provide a mathematical foundation for data representation and querying.
 They also introduced high level query languages, which make writing new queries much faster.
 These also provide flexibility to reorganize the database for changed requirements.
 Early experimental relational systems were developed in 1970s and the commercial relational
database management systems were introduced in 1980s.
 Early relational systems were slow because they did not use physical storage pointers or record
placement to access related data records.
 Due to the improved performance with the development of new storage and indexing techniques
and better query processing and optimization, relational databases became the dominant type of
database system for traditional database applications
Object-Oriented Applications and the need for more Complex Databases
 OODBs incorporate many of the useful object-oriented paradigms, such as abstract data types,
encapsulation of the operations, inheritance, and object identity.
 They are mainly used in specialized applications such as engineering design, multimedia
publishing, and manufacturing systems.
 The overall share of OODBs in the database market has remained small (under 5%).
Interchanging Data on the Web for E-Commerce
 Users can create documents using a Web publishing language, such as Hyper Text Markup
language (HTML), and store these documents on Web servers where other users (clients) can
access them.
 In the 1990s, e-commerce emerged as a major application on the Web; parts of the
information on e-commerce Web pages are often extracted dynamically from DBMSs.
 A variety of techniques, e.g. XML, are used for interchanging data among various types of
databases and Web pages.
Extending Database Capabilities for new Applications
 Scientific applications
 Storage and retrieval of images
 Storage and retrieval of videos
 Data mining
 Spatial applications
 Time series
For all above applications special capabilities are required
 More complex data structures
 New data types in addition to basic data types
 New operations and query language constructs to manipulate new data types
 New storage and indexing structures

Database developers have added more functionalities to general purpose DBMS:


 Some of these functionalities are general purpose and incorporated in object oriented database.
 Special purpose functionalities can be bought as additional modules.
 Most organizations use a variety of software application packages that work with database back-
ends.
 Database is manipulated by these packages for supporting transactions, generating reports, and
answering ad-hoc queries.
For example:
 Enterprise Resource Planning (ERP)
 Customer Relationship Management (CRM)
Databases vs. Information Retrieval
 Traditionally, database technology applies to structured and formatted data; it is thus heavily
used in the manufacturing, retail, banking, insurance, finance, and health care industries.
 Another developed field is information retrieval (IR) that deals with books, manuscripts, and
various forms of library based articles.
 Data is indexed, cataloged, and annotated using keywords; IR is concerned with searching for
material based on these keywords.
 Data on Web pages contains images, text, and objects that are active and change dynamically.
Retrieval of information on the Web is a new problem that requires techniques from databases and
IR to be applied in a variety of novel combinations.

When not to use DBMS


 A DBMS is a complex piece of software, optimized for certain kinds of workload (e.g. answering
complex queries or handling many concurrent requests), and its performance may not be adequate for
certain specialized applications, such as those where:
 tight real time constraints call for efficient custom code;
 multiple user access is not required;
 manipulation and processing of data is simple and does not need DBMS features.
 Using a DBMS incurs overhead costs that would not be incurred in a traditional file processing
system, for the following reasons:
o High initial investment in hardware, software and training
o The generality that a DBMS provides for defining and processing data
o The overhead of providing security, concurrency control, recovery, and integrity functions
Use a DBMS when the following are important:
o persistent storage of data
o centralized control of data
o control of redundancy
o control of consistency and integrity
o multiple user support
o sharing of data
o data documentation
o data independence
o control of access and security
o backup and recovery
Do not use a DBMS when
 the initial investment in hardware, software, and training is too high
 the generality a DBMS provides is not needed
 the overhead for security, concurrency control, and recovery is too high
 data and applications are simple and stable
 real-time requirements cannot be met by DBMS
 multiple user access is not needed

Database System Concepts & Architecture


Introduction
 The architecture of DBMS packages has evolved as follows:
o Early monolithic systems: the whole DBMS package was one tightly integrated system
o Modern DBMS packages: modular in design, with a client/server system architecture
Data Models, Schemas, and Instances
 Database approach provides data abstraction.
 Data abstraction refers to the suppression of details of data organization and storage and the
highlighting of the essential features for an improved understanding of data.
 A data model is a collection of concepts that can be used to describe the structure of a database
and thus it provides the necessary means to achieve this abstraction.
 Structure of a database includes data types, relationships, and constraints that should hold on the
data.
 Most data models also include a set of basic operations for specifying retrievals and updates on
the database.
 Data models also include the dynamic aspect or behavior of a database application.
 Concepts to specify behavior are fundamental to object oriented data models but are also being
incorporated in more traditional relational data models (e.g. stored procedures)

Categories of Data Models


 A number of models for data representation have been proposed.
 High level or conceptual data models
o provide concepts that are close to the way users perceive the data, e.g. the ER model
o Object Data Management Group (ODMG) model: a conceptual model for object oriented databases
 Low level or physical data models
o provide concepts that describe the details of how data is stored in the computer e.g.
record formats, record orderings, access path, index etc.
 Representational data models
o provide concepts that fall between the two extremes above
o they hide some details of data storage but can be implemented on a computer system
directly
o most frequently used in commercial DBMSs, e.g. the relational data model (also called
record-based data models)
o other legacy data models: network and hierarchical models

Schemas, Instances, and Database State


 The description of a database is called the database schema which is specified during
database design and is not expected to change frequently.
 Schema diagram displays structure of each record type but not the actual instances of records.
 Each object in the schema is known as a schema construct e.g. student table.
 A schema diagram displays only some aspects of a schema such as names of record types and
data items, and some type of constraints; other aspects are not specified.
An Example

 The data in the database at a particular moment is called a database state or snapshot.
 This state is also the current set of occurrences or instances in the database.
 Many database states can be constructed to correspond to a particular database.
 When a new database is defined, its database state is the empty state, with no data.
 When the database is first populated, it enters its initial state.
 The DBMS is partly responsible for ensuring that every state of the database is a valid state -
that is, a state that satisfies the structure and constraints specified in the schema.
 Hence, specifying a correct schema to the DBMS is extremely important.
 DBMS stores the descriptions of the schema constructs and constraints in the catalog (meta-
data).
 The schema is called the intension, and a database state is called an extension of the schema.
 Changes in application requirements result in schema evolution.

Three-Schema Architecture
 The schema in DBMS can be described at three levels:
 Internal level has an internal schema
 Conceptual level has a conceptual schema
 External level includes a number of external schemas or user views
 The information about all three schemas is stored in the system catalog.
Three-Schema Architecture: Diagram

Internal Schema
 The internal schema specifies complete details of storage and access paths for the database.
 File organization on the disk should be decided e.g. hashing, indexing etc.
 The process of arriving at a good physical database schema is called physical database design.
Conceptual Schema
 The conceptual schema (or logical schema) describes the structure of the database.
 The conceptual schema hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations, and constraints.
 A representational data model is used to describe the conceptual schema when a database is
implemented.
 The process of arriving at a good conceptual database schema is called conceptual database
design.
Conceptual Schema (example)
Students (sid: string, sname: string, login: string, age: integer, gpa: real)
Faculty (fid: string, fname: string, sal: real)
Courses (cid: string, cname: string, credits: integer)
Rooms (rno: integer, address: string, capacity: integer)
Enrolled (sid: string, cid: string, grade: string)
Teaches (fid: string, cid: string)
Meets_In (cid: string, rno: integer, time: string)
External Schema
 The external schemas allow data access to be customized at the level of individual users and hide
rest of the details from the users.
 Any database has exactly one conceptual schema and one physical schema but may have many
external schemas, each tailored to a particular group of users.
 Each external schema consists of a collection of one or more views and tables.
 The external schema design is guided by end user requirements e.g. courseinfo (cid, fname, sid)
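
The relationship between the conceptual schema and an external schema can be sketched in SQL. The example assumes SQLite, maps the string and real types of the listing above to TEXT and REAL, and defines courseinfo (cid, fname, sid) as a view; the join used to build the view and the sample rows are assumptions, since the notes give only the view's attribute list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema (three of the relations listed above)
CREATE TABLE Faculty  (fid TEXT PRIMARY KEY, fname TEXT, sal REAL);
CREATE TABLE Teaches  (fid TEXT, cid TEXT);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- External schema: a view for users who should not see salaries or grades
CREATE VIEW courseinfo AS
    SELECT t.cid, f.fname, e.sid
    FROM Teaches t
    JOIN Faculty f ON f.fid = t.fid
    JOIN Enrolled e ON e.cid = t.cid;

INSERT INTO Faculty VALUES ('f1', 'Rao', 90000.0);
INSERT INTO Teaches VALUES ('f1', 'c1');
INSERT INTO Enrolled VALUES ('s1', 'c1', 'A'), ('s2', 'c1', 'B');
""")

rows = conn.execute("SELECT cid, fname, sid FROM courseinfo ORDER BY sid").fetchall()
print(rows)
```

Users of courseinfo see who teaches and who attends each course, while sal and grade stay hidden behind the view.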
Three-Schema Architecture (more)
 The three schemas are only descriptions of data; the stored data that actually exists is at the
physical level.
 DBMS must transform a request specified on an external schema into a request against the
conceptual schema, and then into a request on the internal schema for processing over the stored
database.
 The processes of transforming requests and results between levels are called mappings.

Data Independence
 One of the most important benefits of using a DBMS is its support for data independence.
 Applications are insulated from how data are structured and stored.
 Data independence is the capacity to change the schema at one level of a database without
changing the schema at the next higher level.
Data Independence (types)
 Logical data independence:
o Protection of user views from changes in logical structure of data.
o Logical data independence is the capacity to change the conceptual schema without having to
change external schemas or application programs.
 Physical data independence:
o Protection of logical structure from changes in physical structure of data.
o Physical data independence is the capacity to change the internal schema without having to
change conceptual schemas.
 Physical data independence exists in most databases.
 But logical data independence is hard to achieve.
 Data independence occurs because when the schema is changed at some level, the schema at
higher level remains unchanged; only the mapping between the two levels is changed.
 The two levels of mappings create an overhead during compilation or execution of a query or
program, leading to inefficiencies in the DBMS.
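
Logical data independence can be illustrated with a sketch in which the conceptual schema changes while the application's query stays the same. It assumes SQLite; the split of Faculty into Faculty_public and Faculty_private and the view name faculty_names are hypothetical choices for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Faculty (fid TEXT PRIMARY KEY, fname TEXT, sal REAL);
INSERT INTO Faculty VALUES ('f1', 'Rao', 90000.0);
CREATE VIEW faculty_names AS SELECT fid, fname FROM Faculty;
""")

# The application is written against the external schema (the view) only.
APP_QUERY = "SELECT fid, fname FROM faculty_names"
before = conn.execute(APP_QUERY).fetchall()

# Change the conceptual schema: split Faculty into two relations.
conn.executescript("""
CREATE TABLE Faculty_public  (fid TEXT PRIMARY KEY, fname TEXT);
CREATE TABLE Faculty_private (fid TEXT PRIMARY KEY, sal REAL);
INSERT INTO Faculty_public  SELECT fid, fname FROM Faculty;
INSERT INTO Faculty_private SELECT fid, sal FROM Faculty;
DROP VIEW faculty_names;
DROP TABLE Faculty;
-- Only the mapping (the view definition) is rewritten.
CREATE VIEW faculty_names AS SELECT fid, fname FROM Faculty_public;
""")

after = conn.execute(APP_QUERY).fetchall()
print(before, after)
```

APP_QUERY is never edited; the change is absorbed by redefining the mapping between the external and conceptual levels.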

Database Languages
o Data Definition Language (DDL)
o Storage Definition Language (SDL)
o View Definition Language (VDL)
o Data Manipulation Language (DML)
o A high level or nonprocedural DML
o A low level or procedural DML
Data Definition Language (DDL)
 In most DBMSs, there is no strict separation between the internal and conceptual levels, so a
single language, the DDL, is used by the DBA and database designers to define both schemas.
 The DBMS will have a DDL compiler which processes the DDL statements.
Storage Definition Language (SDL)
 If distinct languages are used, DDL is used to specify conceptual schema and SDL is used to
specify internal schema.
 The mapping between the two schemas may be specified in either of these languages.
View Definition Language (VDL)
 For a true three-schema architecture, VDL is required for external schema and its mapping with
conceptual schema.
 In relational DBMSs, SQL is used in the role of VDL to define views as results of predefined
queries.
Data Manipulation Language (DML)
 DBMS provides a set of operations like insertion, modification and deletion through DML
 A high level or nonprocedural DML
o It can be used on its own to specify complex database operations.
o These statements can be entered interactively from a display monitor or terminal.
o These can be embedded in a general purpose programming language where they can be
extracted by a precompiler and processed by the DBMS.
o It can specify and retrieve many records in a single DML statement, and is thus known as a
set-at-a-time or set-oriented DML (e.g. SQL)
o It is also known as a declarative language
 A low level or procedural DML
o It must be embedded in a general purpose programming language.
o This retrieves individual records from the database and processes each separately.
o Thus it needs programming language constructs like loops.
o This is also known as a record-at-a-time DML (e.g. DL/1)
 Whenever a DML is embedded in a general purpose language, that language is called the host
language and the DML is called the data sublanguage.
 A high level DML used in a standalone interactive manner is called a query language.
 Naïve and parametric users generally interact with database through user friendly interfaces.
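
The difference between the two DML styles can be imitated in a host language. In the sketch below, Python is the host language and SQL (through SQLite) is the data sublanguage. SQLite has no true record-at-a-time DML such as DL/1, so the second half only mimics that style by fetching every record and testing it in a loop; the student table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sid TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("s1", 3.6), ("s2", 2.8), ("s3", 3.9)],
)

# Declarative, set-at-a-time: one statement says WHAT is wanted.
set_result = [
    sid
    for (sid,) in conn.execute("SELECT sid FROM student WHERE gpa > 3.5 ORDER BY sid")
]

# Record-at-a-time style: the host program loops and tests each record itself.
record_result = []
for sid, gpa in conn.execute("SELECT sid, gpa FROM student ORDER BY sid"):
    if gpa > 3.5:
        record_result.append(sid)

print(set_result, record_result)
```

Both approaches give the same answer, but in the first the DBMS decides how to find the qualifying records, while in the second that logic lives in the host program.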

Database Interfaces
User-friendly interfaces provided by a DBMS include:
 Menu-based Interfaces for Web Clients or Browsing
 Form-based Interfaces
 Graphical Interfaces
 Natural Language Interfaces
 Speech Input and Output
 Interfaces for parametric Users
 Interfaces for DBA
Menu-based Interfaces for Web Clients or Browsing
 These interfaces present the user with lists of options (called menus) that lead the user
through the formulation of a query request.
 The query is composed step by step by picking options from a menu that is displayed by the
system.
 Pull-down menu is a popular technique in Web-based user interfaces.
 They are used in browsing interfaces, which allow a user to browse the content of a database
in an exploratory and unstructured manner.
Form-based Interfaces
 Forms are designed and programmed for naïve users.
 Users can fill out the form to insert new entries into the database.
 Users can also fill out only some of the entries, in which case the DBMS retrieves the
matching data for the remaining entries.
 Many DBMSs have specification languages, which help programmers specify such forms.
 Oracle Forms, a component of Oracle product suite, provides an extensive set of features to
design and build applications using forms.
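
A form-based interface can be approximated in a few lines. In the sketch below a form is a dictionary of field values: a completely filled form is inserted as a new entry, and a partly filled form retrieves the matching entries. The contact table, the FORM_FIELDS layout and both function names are assumptions for illustration, not the API of any forms product.

```python
import sqlite3

# Hypothetical form layout: the fixed list of fields shown to the user.
FORM_FIELDS = ("name", "city", "phone")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (name TEXT, city TEXT, phone TEXT)")


def submit_form(conn, form):
    """Insert a completely filled form as a new database entry."""
    conn.execute(
        "INSERT INTO contact (name, city, phone) VALUES (?, ?, ?)",
        [form[field] for field in FORM_FIELDS],
    )


def query_by_form(conn, form):
    """Return the entries matching every filled field; blank fields match anything."""
    filled = [(field, form[field]) for field in FORM_FIELDS if form.get(field)]
    # Column names come only from FORM_FIELDS; user values are bound as parameters.
    where = " AND ".join(f"{field} = ?" for field, _ in filled) or "1 = 1"
    sql = f"SELECT name, city, phone FROM contact WHERE {where} ORDER BY name"
    return conn.execute(sql, [value for _, value in filled]).fetchall()


submit_form(conn, {"name": "Asha", "city": "Pune", "phone": "111"})
submit_form(conn, {"name": "Ravi", "city": "Delhi", "phone": "222"})
submit_form(conn, {"name": "Meena", "city": "Pune", "phone": "333"})

# Only the city field is filled in; the rest is retrieved from the database.
pune_contacts = query_by_form(conn, {"city": "Pune"})
print(pune_contacts)
```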
Graphical Interfaces
 A GUI typically displays a schema to the user in diagrammatic form.
 Users can specify a query by manipulating the diagram.
 GUIs utilize both menus and forms.
 A pointing device, such as a mouse, is used to select parts of the displayed schema diagram.
Natural Language Interfaces
 A natural language interface has its own schema and a dictionary of important words.
 The natural language interface refers to the words in its schema and to the set of standard
words in dictionary.
 If the interpretation is successful, the interface generates a high level query corresponding to
the natural language request and submits it to the DBMS for further processing.
 If interpretation is not successful, a dialogue is started with the user to clarify the request.
Speech Input and Output
 It provides limited use of speech as an input query and speech as an answer to a question or
result of a request e.g. Telephone directory, Flight arrival/departure, and Bank account
information etc.
 The speech input is detected using a library of predetermined words and used to set up the
parameters that are supplied to the queries.
 For output, similar conversion from text or numbers into speech takes place.
Interfaces for parametric Users
 A special interface is implemented for each known class of naive users.
 Special function keys can be programmed to do repeated work fast.
Interfaces for DBA
 DBA Staff can use privileged commands for creating accounts, setting system parameters,
granting account authorization, changing a schema and reorganizing the storage structure of
the database.

The Database System Environment


 The database and DBMS catalog are stored on disk.
 A stored data manager (module of DBMS) controls access to DBMS information that is
stored on disk whether it is part of the database or catalog.
 Various users such as
 DBA staff & casual users work with interactive interfaces to formulate queries.
 Application programmers write programs using host language.
 Parametric users make data entry to database through predefined forms.
 DBA staff works on defining the database and tuning it using DDL and other privileged
commands.
 DDL compiler processes schema definitions, specified in the DDL, and stores description of
the schema in the DBMS catalog.
 Queries are parsed, analyzed for correctness of the operations and names of data elements by
query compiler that compiles them into an internal form.
 The internal query is then subjected to query optimization by the query optimizer.
 The precompiler extracts DML commands from an application program and sends them to the
DML compiler for compilation.
 The rest of the program is sent to the host language compiler.
 The object codes for the DML commands and rest of the program are linked, forming a
canned transaction whose executable code includes calls to runtime database processor.
 Parametric users supply the parameters to these canned transactions directly so they can run
transactions repeatedly.
 Runtime database processor executes
 Privileged commands
 Executable query plans
 Canned transactions with runtime parameters.
 It works with the system dictionary and with the stored data manager, which in turn uses
basic OS services for carrying out low level input/output operations between disk and main memory.
 It also handles some aspects of buffer management in main memory; some DBMSs have their
own buffer management module, while others rely on the OS.
 The concurrency control and the backup and recovery systems are integrated into the working
of the runtime database processor for purposes of transaction management.
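
Two ideas from this list can be sketched with Python's sqlite3 module: a DDL statement whose description ends up in the catalog, and a canned transaction that is run repeatedly with different runtime parameters. SQLite exposes its catalog as the table sqlite_master; the account table and the transfer function are hypothetical, and a real DBMS would precompile such a transaction rather than interpret it on every call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: the schema definition is processed and its description stored in the catalog.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

# The catalog entry for the table (name plus the stored definition text).
catalog_entry = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchone()


def transfer(conn, source_id, target_id, amount):
    """Canned transaction: fixed logic, only the parameters change per run."""
    with conn:  # commits on success, rolls back if an exception is raised
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (source_id,)
        ).fetchone()
        if balance < amount:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source_id)
        )
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target_id)
        )


# A parametric user runs the same transaction repeatedly with new parameters.
transfer(conn, 1, 2, 200)
transfer(conn, 2, 1, 50)

balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(catalog_entry[0], balances)
```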

Database System Utilities


 Loading
 Backup
 Database storage reorganization
 Performance monitoring
Few other utilities:
 Sorting files
 Handling data compression
 Monitoring access by users
 Interfacing with networks
Loading
 It is used to load existing data files (text files or sequential files) into the database.
 The formats of the source data file and of the target database file are specified to the utility.
 The utility then reformats the data and stores it in the database.
 Some vendors offer conversion tools, which generate the appropriate loading programs from
the formats of the source data file and the target database.
 Such tools are available for e.g. IMS (IBM), IDMS (Computer Associates), SUPRA (Cincom), IMAGE (HP)
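
A loading utility can be sketched as a function that is told the source format (here, comma-separated text with a header line) and the target table, reformats each record, and stores it. The member table, the sample data and the load_csv helper are assumptions for illustration only.

```python
import csv
import io
import sqlite3

# Hypothetical source data file, held in a string so the sketch is self-contained.
SOURCE_TEXT = "id,name,joined\n1,Asha,2021-07-01\n2,Ravi,2022-01-15\n"


def load_csv(conn, source, table, columns):
    """Load a CSV source into a table.

    columns is a list of (column name, converter) pairs describing the target
    format. Table and column names are trusted identifiers supplied by the
    caller; the data values themselves are bound as parameters.
    """
    reader = csv.DictReader(source)
    column_list = ", ".join(name for name, _ in columns)
    placeholders = ", ".join("?" for _ in columns)
    # Reformat every source record into the target column types.
    rows = [
        tuple(convert(record[name]) for name, convert in columns)
        for record in reader
    ]
    with conn:  # one transaction for the whole load
        conn.executemany(
            f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT, joined TEXT)")

loaded = load_csv(
    conn,
    io.StringIO(SOURCE_TEXT),
    "member",
    [("id", int), ("name", str), ("joined", str)],
)
print(loaded)
```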
Backup
 It creates a backup copy of the database, usually by dumping the entire database onto tape.
 The backup copy can be used to restore the database in case of catastrophic failure.
 Incremental backups can also be used, where only changes since the previous backup are
recorded.
 Incremental backups are more complex but save storage space.
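
A full backup and restore can be sketched with the backup method of sqlite3 connections (available from Python 3.7). An in-memory database stands in for the tape, the note table is hypothetical, and incremental backup is not shown.

```python
import sqlite3

# The live database (hypothetical content).
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, body TEXT)")
live.execute("INSERT INTO note (body) VALUES ('first')")
live.commit()

# Full backup: copy the entire database to another store.
backup_copy = sqlite3.connect(":memory:")  # stands in for the backup medium
live.backup(backup_copy)

# Simulate a catastrophic failure that destroys the live data.
live.execute("DELETE FROM note")
live.commit()

# Restore: copy the backup into a fresh database.
restored = sqlite3.connect(":memory:")
backup_copy.backup(restored)

restored_rows = restored.execute("SELECT id, body FROM note").fetchall()
print(restored_rows)
```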
Database storage reorganization
 It can be used to reorganize a set of database files into a different file organization to improve
performance.
Performance monitoring
 It monitors database usage and provides statistics to DBA in making decisions such as
whether or not to reorganize files or whether to add or drop indexes to improve performance.
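
The kind of evidence a DBA uses when deciding whether to add an index can be sketched with SQLite's EXPLAIN QUERY PLAN statement, which reports how a query would be executed. The orders table, the index name and the query_plan helper are hypothetical; the exact wording of the plan text differs between SQLite versions, so the sketch only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"cust{i % 50}", i) for i in range(1000)],
)

QUERY = "SELECT id, total FROM orders WHERE customer = ?"


def query_plan(conn):
    """Return the plan description for QUERY as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("cust7",)).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


# Without an index the whole table has to be scanned.
plan_before = query_plan(conn)

# The DBA decides to add an index on the column used in the search condition.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
plan_after = query_plan(conn)

print(plan_before)
print(plan_after)
```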

Tools, Application Environments, and Communications Facilities


Tools
 CASE tools are used in design phase of database systems.
 An expanded data dictionary (data repository) stores catalog information about schemas
and constraints, and also design decisions, usage standards, application program descriptions,
and user information; it is therefore also known as an information repository.
 This information can be accessed directly by users and DBA.
Application Environments
 These systems provide an environment for developing database applications and include
facilities that help in many facets of database systems.
 These facets include database design, GUI development, querying and updating, and
application program development.

Centralized DBMSs Architectures


 Earlier architectures used mainframe computers to provide the main processing for all system
functions, including user application programs, user interface programs, and all DBMS
functionality.
 Users accessed such systems via computer terminals that did not have processing power and
only provided display capabilities.
 These computer terminals were connected to the central computer via various types of
communication networks.
Basic Client/Server Architectures
 Specialized servers are responsible for specific functionalities e.g. print server, file server,
web server, e-mail servers etc.
 The client machines provide the user with the appropriate interfaces to utilize these services,
as well as with local processing power to run local applications.
 Some machines may be client sites only (e.g. diskless workstations), some may be dedicated
servers, and some may work as both client and server.

Two-Tier Client/Server Architectures for DBMSs


 In RDBMSs (many of which started as centralized systems)
o Client had user interface and application programs
o Server provided query and transaction functionalities related to SQL
o Server is called a query server or transaction server or SQL server.
o When DBMS access is required, the client program establishes a connection to the DBMS.
o Once the connection is established, the client program can communicate with the DBMS.
o A standard called Open Database Connectivity (ODBC) provides an application
programming interface (API), which allows client-side programs to call DBMS.
o Any query results are sent back to the client program and can be displayed or used for
further process.
o Most DBMS vendors provide ODBC drivers for their systems.
o JDBC is a standard that allows Java client programs to access the DBMS.
 In object oriented DBMSs
o Software modules of the DBMS are divided between client and server in a more
integrated way.
o Server may include the part of the DBMS software responsible for handling data
storage on disk pages, local concurrency control and recovery, buffering and caching
of disk pages etc.
o Client may handle the user interface, data dictionary functions, DBMS interactions
with programming language compilers, structuring of complex objects from the
buffers etc.
o In this approach the client and server interaction is more tightly coupled.
o The exact division of the functionality varies from system to system.
o In this case server is called data server.

Three-Tier Architectures for Web Applications


 Web applications use an architecture called the three-tier architecture, which adds an
intermediate layer between client and the database server.
 Thus, three tiers are:
o User interface
o Application rules
o Data access
 Clients contain GUI (user interface) and some additional application specific business rules.
 Database server includes database services.
 This intermediate layer or middle layer is also known as application layer or Web server
depending on the application.
 The middle layer stores the business logic (procedures, constraints) that is used to access
data from the database server.
 The middle layer also improves database security by checking a client’s details before
forwarding a request to the database server.
 The intermediate server
o accepts the requests from the client,
o processes the request and sends database commands to the database server,
o and then acts as a conduit for passing processed data from the database server to the
clients, where it may be processed to display in defined format.
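
The division of work between the three tiers can be sketched with three small Python pieces: a database server that only runs database commands, an application server that checks the client's details and applies a business rule, and a client function that formats the result for display. All class names, the token scheme, the grade table and the pass mark of 70 are assumptions for illustration; in a real deployment the tiers run on separate machines and communicate over a network.

```python
import sqlite3


class DatabaseServer:
    """Bottom tier: stores the data and executes database commands."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE grade (student TEXT, course TEXT, mark INTEGER)")
        self._conn.executemany(
            "INSERT INTO grade VALUES (?, ?, ?)",
            [("asha", "DBMS", 91), ("asha", "OS", 78), ("ravi", "DBMS", 66)],
        )

    def run(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()


class ApplicationServer:
    """Middle tier: checks the client, applies business rules, forwards commands."""

    def __init__(self, db, tokens):
        self._db = db
        self._tokens = tokens  # hypothetical mapping: access token -> student

    def grades_for(self, token):
        student = self._tokens.get(token)
        if student is None:
            # Security check happens before anything reaches the database server.
            raise PermissionError("unknown client")
        rows = self._db.run(
            "SELECT course, mark FROM grade WHERE student = ? ORDER BY course",
            (student,),
        )
        # Business rule (assumed): a mark of 70 or more is a pass.
        return [{"course": c, "mark": m, "passed": m >= 70} for c, m in rows]


def render(records):
    """Client tier: formats the processed data for display."""
    return "\n".join(
        f"{r['course']}: {r['mark']} ({'pass' if r['passed'] else 'fail'})"
        for r in records
    )


app = ApplicationServer(DatabaseServer(), {"token-asha": "asha"})
page = render(app.grades_for("token-asha"))
print(page)
```

The client never sees SQL or the database connection; it only talks to the middle tier.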

n-Tier Architectures for Web Applications


o The bottom layer in a three-tier architecture includes all data management services.
o If the bottom layer is split into two layers (a web server and a database server), then this
becomes a four-tier architecture.
o The layers between the user and the stored data can be divided into finer components, which
gives an n-tier architecture.

Classification of Database Management Systems


Based on number of users:
o Single User Systems: with PCs
o Multiuser Systems: majority of DBMS

Based on number of sites over which database is distributed:


o Centralized DBMS: Data is stored at a single computer site
o Distributed DBMS: Database and DBMS software are distributed over many sites connected
by a computer network.
 Homogeneous: Same DBMS software at multiple sites
 Heterogeneous: Different DBMS software may be used at different sites. This leads to a
federated DBMS, in which the participating DBMSs are loosely coupled and have a degree of
local autonomy.

Based on cost:
o Open Source DBMS: Products like MySQL and PostgreSQL are free and are supported by
third-party vendors who sell additional services. The main commercial RDBMS products are
available as free 30-day trial versions. Giant systems are sold in modular form according to
the configuration required.
o License based
 A site license allows unlimited use of the database system with any number of
copies running at the customer site.
 Another type of license limits the number of concurrent users at a location.
o Standalone single user versions are sold per copy or included in a desktop or laptop
configuration, e.g. Microsoft Access
o Additional features can be made available in any of the above kinds at extra cost

Based on types of access paths:


 Different File Structures
e.g. Inverted File Structure

Based on purpose of use:


o General Purpose: These DBMSs can be used for a variety of applications.
o Special Purpose: The DBMS is designed and built for a specific application, e.g. online
transaction processing (OLTP)

Based on data model:


o Hierarchical data model
o Network data model
o Object data model
o Relational data model
o Object-Relational data model
Network model
o Network data model contains many links among various items of the data. This model organizes
data using two fundamental constructs, called records and sets. Records contain fields, and sets
define one-to-many relationships between records: one owner, many members.
Example: IDMS (Cullinet – now Computer Associates), DMS 1100 (Univac – now Unisys),
IMAGE (Hewlett-Packard), VAX-DBMS (Digital – now Compaq), SUPRA (Cincom)
o Network model represents data as record types and also represents a limited type of 1:N
relationship, called a set type.
o A 1:N relationship relates one instance of a record to many record instances using some
pointer linking mechanism in this model.
o In 1969, the Conference on Data Systems Languages established the first specification of the
network database model and its language.
o Network model is also known as CODASYL DBTG model.
o The CODASYL DBTG model has an associated record-at-a-time language embedded in a host
programming language.
o The network DML was proposed in the 1971 Database Task Group (DBTG) Report as an
extension of the COBOL language.
o It provides commands for locating records directly e.g.
o FIND ANY <record-type> USING <field-list>
o FIND DUPLICATE <record-type> USING <field-list>
o It provides commands to support traversals within set types e.g.
o GET OWNER
o GET {FIRST, NEXT, LAST} MEMBER WITHIN <set-type> WHERE <condition>
o It provides commands to store new data and to make it part of a set type e.g.
o STORE <record-type>
o CONNECT <record-type> TO <set-type>
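
The owner/member structure and the navigational style of the network DML can be imitated in Python. The sketch below is a loose emulation, not CODASYL syntax: the Record and NetworkDB classes and their method names are hypothetical, each record can belong to only one set, and object references stand in for the pointer linking mechanism.

```python
class Record:
    """A record occurrence with fields, an owner pointer and member pointers."""

    def __init__(self, record_type, **fields):
        self.record_type = record_type
        self.fields = fields
        self.owner = None    # owner of the set occurrence this record belongs to
        self.members = []    # members of the set occurrence this record owns


class NetworkDB:
    def __init__(self):
        self.records = []

    def store(self, record_type, **fields):
        """Roughly STORE: create a new record occurrence."""
        record = Record(record_type, **fields)
        self.records.append(record)
        return record

    def connect(self, member, owner):
        """Roughly CONNECT: make the record a member of the owner's set."""
        member.owner = owner
        owner.members.append(member)

    def find_any(self, record_type, **using):
        """Roughly FIND ANY ... USING: locate one record by field values."""
        for record in self.records:
            if record.record_type == record_type and all(
                record.fields.get(k) == v for k, v in using.items()
            ):
                return record
        return None

    def get_owner(self, member):
        return member.owner

    def get_next_member(self, owner, current=None):
        """Return the first member, or the one after current; None at the end."""
        position = 0 if current is None else owner.members.index(current) + 1
        return owner.members[position] if position < len(owner.members) else None


db = NetworkDB()
dept = db.store("DEPARTMENT", name="Research")
asha = db.store("EMPLOYEE", name="Asha")
ravi = db.store("EMPLOYEE", name="Ravi")
db.connect(asha, dept)
db.connect(ravi, dept)

# Record-at-a-time navigation: locate the owner, then walk its members in a loop.
found = db.find_any("DEPARTMENT", name="Research")
names = []
member = db.get_next_member(found)
while member is not None:
    names.append(member.fields["name"])
    member = db.get_next_member(found, member)
print(names)
```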

Hierarchical data model


o Hierarchical data model is constructed using a tree model, with one root and several levels of
subtrees.
o Example: IMS (IBM) used at governmental and industrial installations, hospitals and banks
o A hierarchical data model is a data model in which the data is organized into a tree-like
structure.
o The structure allows repeating information using parent/child relationships
o Each parent can have many children but each child has only one parent.
o All attributes of a specific record are listed under an entity type.
o In a database, an entity type is the equivalent of a table; each individual record is represented
as a row and an attribute as a column.
o Entity types are related to each other using 1: N mapping, also known as one-to-many
relationships.
o There is no standard language for hierarchical model.
o A popular hierarchical DML is DL/1 of the IMS system, which dominated the DBMS market for
about 20 years (1965-1985).
o DL/1 has commands to locate a record e.g.
o GET {UNIQUE, NEXT} <record-type> WHERE <condition>
o It has navigation facilities to navigate within hierarchies e.g.
o GET NEXT WITHIN PARENT
o GET {FIRST, NEXT} PATH <hierarchical-path-specification> WHERE <condition>
o It has facilities to store and update records e.g.
o INSERT <record-type>, REPLACE <record-type>
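
The tree structure and the navigation commands can be imitated in Python. The sketch below is only loosely modeled on the DL/1 commands listed above: the Segment class and the three functions are hypothetical names, and the hospital data is made up.

```python
class Segment:
    """A record in the hierarchy: one parent, any number of children."""

    def __init__(self, segment_type, **fields):
        self.segment_type = segment_type
        self.fields = fields
        self.parent = None
        self.children = []

    def insert(self, child):
        """Store a new record under this parent."""
        child.parent = self
        self.children.append(child)
        return child


def preorder(root):
    """Yield records in hierarchical sequence: parent first, then each subtree."""
    yield root
    for child in root.children:
        yield from preorder(child)


def get_unique(root, segment_type, **where):
    """Locate the first record of a type that satisfies the condition."""
    for segment in preorder(root):
        if segment.segment_type == segment_type and all(
            segment.fields.get(k) == v for k, v in where.items()
        ):
            return segment
    return None


def get_next_within_parent(parent, segment_type):
    """Yield, one at a time, the records of a type that lie below the parent."""
    for segment in preorder(parent):
        if segment is not parent and segment.segment_type == segment_type:
            yield segment


hospital = Segment("HOSPITAL", name="City")
ward_a = hospital.insert(Segment("WARD", name="A"))
ward_b = hospital.insert(Segment("WARD", name="B"))
ward_a.insert(Segment("PATIENT", name="Asha"))
ward_a.insert(Segment("PATIENT", name="Ravi"))
ward_b.insert(Segment("PATIENT", name="Meena"))

# Locate a ward, then navigate to its patients one record at a time.
ward = get_unique(hospital, "WARD", name="A")
patients_in_a = [p.fields["name"] for p in get_next_within_parent(ward, "PATIENT")]
print(patients_in_a)
```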

Object-oriented databases
o Object data model is based on object oriented approach i.e. objects, classes with their
attributes and operations. ODMG object model provides standards for commercial object data
model.
o Example: Some object-oriented databases are designed to work well with object-oriented
programming languages such as Python, Java, C#, Visual Basic .NET, C++, Objective-C and
Smalltalk; others have their own programming languages. The early commercial products
were integrated with various languages: GemStone (Smalltalk), Gbase (LISP), Vbase (COP)
and VOSS (Virtual Object Storage System for Smalltalk).

Relational data model


o Relational data model represents the database as a collection of relations.
o Example: MySQL, Oracle, SQL Server etc.
o Object-Relational data model is the relational model extended with capabilities of object
databases: objects, classes and inheritance are directly supported in database schemas and in
the query language.
o Example: Informix Universal Server, Oracle (due to enhanced features), Data blades
(Informix), Cartridges (Oracle)

eXtensible Markup Language (XML) Model


o XML is a standard for data interchange over the Internet; it also uses hierarchical tree structures.
o It combines database concepts with concepts from document representation models.
o Data is represented as elements with the use of tags; elements can be nested to create
complex hierarchical structures.
o This model conceptually resembles the object model, but uses different terminology.
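
Nested elements and tags can be illustrated with Python's standard xml.etree.ElementTree module. The university document below and the enrollment helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: elements are nested to form a tree.
DOCUMENT = """
<university>
  <department name="CS">
    <course code="CS301">
      <title>Database Systems</title>
      <student>Asha</student>
      <student>Ravi</student>
    </course>
  </department>
</university>
"""

root = ET.fromstring(DOCUMENT.strip())


def enrollment(root):
    """Map each course code to the list of students nested inside that course."""
    result = {}
    for course in root.iter("course"):
        result[course.get("code")] = [s.text for s in course.findall("student")]
    return result


enrolled = enrollment(root)
print(enrolled)
```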

End of Chapter 1
