
BULE HORA UNIVERSITY

COLLEGE OF COMPUTING AND INFORMATICS
DEPARTMENT OF COMPUTER SCIENCE

Fundamentals of Database System

CHAPTER 2:
Database System Concepts and Architecture
Compiled by: Firaol B.

BHU/ComputerScience/F-DatabaseSystem 1
OUTLINE
• Data Models and Their Categories
• History of Data Models
• Schemas, Instances, and States
• Three-Schema Architecture
• Data Independence
• DBMS Languages and Interfaces
• Database System Utilities and Tools
• Centralized and Client-Server Architectures
• Classification of DBMSs

Terminology and basic concepts
• Data Abstraction
• One of the fundamental characteristics of the database approach.
• Refers to the suppression of details of data organization and storage, and the
highlighting of the essential features for an improved understanding of data.
• Helps different users perceive data at their preferred level of detail.
• A Data Model
• A collection of concepts that can be used to describe the structure (data types,
relationships, and constraints) of a database. It provides the necessary means to
achieve this abstraction.
• Most data models also include a set of basic operations for specifying retrievals and
updates on the database.

Cont’d
Data Model: A set of concepts that describe
• structure of a database,
• operations for manipulating these structures, and
• certain constraints that the database should obey.
• Data Model Structure and Constraints:
• Constructs are used to define the database structure.
• Constructs typically include elements (and their data types) as well as groups of
elements (e.g. entity, record, table), and relationships among such groups.
• Constraints specify restrictions on valid data; they must be enforced at all
times.
• Data Model Operations:
• Used for specifying database retrievals and updates by referring
to the constructs of the data model.
• Include basic model operations (e.g. generic insert, delete, update) and user-defined
operations (e.g. compute_student_gpa, update_inventory).
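
The three ingredients of a data model (structure, constraints, operations) can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in relational DBMS; the table names, column names, and sample values are hypothetical, and only the operation name compute_student_gpa comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Structure: elements with data types, grouped into tables, plus a relationship.
# Constraints: NOT NULL, the foreign key, and the CHECK on points.
# (All names here are illustrative assumptions.)
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE grade_report (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course     TEXT NOT NULL,
    points     REAL NOT NULL CHECK (points BETWEEN 0 AND 4)
);
""")

# Basic model operations: generic inserts.
conn.execute("INSERT INTO student VALUES (1, 'Smith')")
conn.executemany(
    "INSERT INTO grade_report VALUES (?, ?, ?)",
    [(1, "Databases", 4.0), (1, "Networks", 3.0)],
)


def compute_student_gpa(conn, student_id):
    """User-defined operation built on top of the basic retrieval operation."""
    row = conn.execute(
        "SELECT AVG(points) FROM grade_report WHERE student_id = ?",
        (student_id,),
    ).fetchone()
    return row[0]


gpa = compute_student_gpa(conn, 1)

# The DBMS enforces the constraint: a value outside 0..4 is rejected.
try:
    conn.execute("INSERT INTO grade_report VALUES (1, 'Algebra', 7.5)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True
```

Here the CHECK and NOT NULL clauses play the role of the constraints, while compute_student_gpa is a user-defined operation expressed through the generic retrieval operation.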

Categories of Data Models
• According to the types of concepts they use to describe the database structure,
data models can be categorized into:
• Conceptual (high-level) data models
• Provide concepts that are close to the way many users perceive data.
• Use concepts such as entities, attributes, and relationships.
• Also called entity-based or object-based data models.
• Example: the entity-relationship (ER) model, a popular high-level conceptual data model.

Cont’d
• Representational (or implementation) data models
• Provide concepts that may be easily understood by end users but that are not too far
removed from the way data is organized in computer storage.
• Hide many details of data storage on disk but can be implemented on a computer
system directly.
• Are the models used most frequently in traditional commercial DBMSs.
• These include the widely used relational data model, as well as the so-called
legacy data models—the network and hierarchical models—that have been
widely used in the past.
• Representational data models represent data by using record structures and
hence are sometimes called record-based data models.
• We can regard the object data model as an example of a new family of
higher-level implementation data models that are closer to conceptual data
models.
• A standard for object databases called the ODMG object model has been
proposed by the Object Data Management Group (ODMG).
• Object data models are also frequently utilized as high-level conceptual
models, particularly in the software engineering domain.

Cont’d
• Physical (low-level) data models
• Provide concepts that describe the details of how data is stored on the computer
storage media, typically magnetic disks.
• Concepts provided by physical data models are generally meant for computer
specialists, not for end users.
• Physical data models describe how data is stored as files in the computer by
representing information such as record formats, record orderings, and access
paths.
• An access path is a search structure, such as an index or a hash structure, that makes
the search for particular database records efficient.
• Self-describing data models
• The data storage in systems based on these models combines the description of the
data with the data values themselves.
• These models include XML as well as many of the key-value stores and NoSQL systems
that were recently created for managing big data.
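
To make the idea of self-describing data concrete, the sketch below contrasts a record whose meaning depends on a separate schema with a JSON document that carries its own field names. JSON is used here only as a convenient illustration of the same principle (the slide names XML, key-value stores, and NoSQL systems); the field names and values are invented.

```python
import json

# Record-based style: the row holds values only; what each position means
# is described separately, in the schema.
schema = ("student_id", "name", "major")
row = (17, "Smith", "CS")

# Self-describing style: every value travels together with its description.
document = json.loads('{"student_id": 17, "name": "Smith", "major": "CS"}')

# Documents in the same collection may carry different fields,
# because no separate fixed schema is required.
other = json.loads('{"student_id": 8, "name": "Brown", "email": "brown@example.org"}')

# Pairing the schema with the row recovers the same information.
same_information = dict(zip(schema, row)) == document
```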

Schemas, Instances, and Database State
• Database Schema (intension)
• The description of a database.
• Not expected to change frequently.
• Most data models have certain conventions for
displaying schemas as diagrams, called schema diagrams.
• Schema diagram: a diagrammatic display of some aspects of a
database schema, such as the names of record types and data
items, and some types of constraints.
• Schema construct: a component of the schema or an object
in the schema, such as STUDENT or COURSE.

Examples of Database Schema

Cont’d
• Database State (extension)
• The data in the database at a particular moment in time.
• Also called the current set of occurrences or instances, or a snapshot of the
database.
• Every update operation changes the database from one state
into another state.
• The DBMS is partly responsible for ensuring that every state of the database is
a valid state, that is, a state that satisfies the structure and constraints
specified in the schema.
• The DBMS stores the descriptions of the schema constructs and constraints
(also called the meta-data) in the DBMS catalog so that DBMS software can
refer to the schema whenever it needs to.
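
The difference between schema and state can be observed directly in a small sketch. It assumes SQLite (through Python's sqlite3 module) as the DBMS, where the built-in table sqlite_master plays the role of the catalog; the student table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)


def schema(conn):
    # SQLite keeps its meta-data in the catalog table sqlite_master.
    return conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()


def state(conn):
    # The state is the data currently stored under that schema.
    return conn.execute(
        "SELECT student_id, name FROM student ORDER BY student_id"
    ).fetchall()


schema_before = schema(conn)
state_before = state(conn)  # empty state right after the schema is defined

# An update operation moves the database to a new state; the schema is unchanged.
conn.execute("INSERT INTO student VALUES (1, 'Smith')")
state_after = state(conn)
schema_after = schema(conn)

# An update that would produce an invalid state (NULL name) is rejected.
try:
    conn.execute("INSERT INTO student VALUES (2, NULL)")
except sqlite3.IntegrityError:
    pass
state_after_rejected = state(conn)
```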

Examples of database state

Database Schema Architecture
• Three-Schema Architecture
• Three of the four important characteristics of the database
approach are:
• the self-describing nature of the database (the schema is stored in a catalog)
• program-data independence
• support of multiple views of the data
• The three-schema architecture was proposed to help achieve and
visualize these characteristics.
• The goal is to separate the user applications from the physical
database.
• It is also known as the ANSI/SPARC (American National
Standards Institute / Standards Planning And Requirements
Committee) architecture, after the committee that proposed it.
• It is a convenient tool with which the user can visualize the schema levels
in a database system.

Three-Schema Architecture

Three-Schema Architecture
• Schemas are defined at three levels:
• The internal level has an internal schema.
• The conceptual level has a conceptual schema.
• The external or view level has a number of external schemas.
• Internal Schema
• Describes the physical storage structure and access paths (e.g.
indexes) of the database.
• Uses a physical data model and describes the complete details of data storage and
access paths for the database.
• Conceptual Schema
• Describes the structure of the whole database.
• Hides the details of physical storage structures and concentrates on describing entities,
data types, relationships, user operations, and constraints.
• Usually uses a representational (implementation) data model.
• This implementation conceptual schema is often based on a conceptual schema
design in a high-level data model.

Cont’d
• External Schemas
• Describe a number of user views at the external level.
• Each external schema describes the part of the database that a particular user
group is interested in and hides the rest of the database from that user group.
• Each is usually implemented in a representational (implementation) data model,
possibly based on a design in a high-level conceptual data model.
• Note: the three schemas are only descriptions of data; the actual data
is stored at the physical level only.

Data Independence
• The three-schema architecture can be used to further explain the concept of
data independence.
• Data independence is the capacity to change the schema at one level of a database system without
having to change the schema at the next higher level.
• There are two types of data independence:
• Logical data independence
• The capacity to change the conceptual schema without having to change external schemas or
application programs.
• Physical data independence
• The capacity to change the internal schema without having to change the conceptual
schema.
• Changes to the internal schema may be needed because some physical files were
reorganized (for example, by creating additional access structures such as indexes) to improve the
performance of retrieval or update.
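
Both kinds of data independence can be demonstrated with a small experiment. This is a sketch that assumes SQLite via Python's sqlite3 module, with a view standing in for an external schema; the table, view, index, and column names are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT);
INSERT INTO student VALUES (1, 'Smith', 'CS'), (2, 'Brown', 'Math');
-- External schema: one user group sees only the CS students.
CREATE VIEW cs_student AS
    SELECT student_id, name FROM student WHERE major = 'CS';
""")

# The application program is written against the external schema only.
app_query = "SELECT name FROM cs_student ORDER BY name"
before = conn.execute(app_query).fetchall()

# Physical data independence: a new access path changes the internal level,
# but the conceptual schema and the application query stay the same.
conn.execute("CREATE INDEX idx_student_major ON student(major)")
after_index = conn.execute(app_query).fetchall()

# Logical data independence: the conceptual schema gains a column,
# but the external schema and the application query stay the same.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_column = conn.execute(app_query).fetchall()
```

In all three runs the application query and its result are identical, which is the practical meaning of not having to change the schema at the next higher level.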

Database Languages and Interfaces
• The first step in creating a database with a DBMS is to specify the conceptual
and internal schemas for the database and any mappings between the two.
• Data Definition Language (DDL)
• Used by the DBA and by database designers to specify conceptual schemas.
• In many DBMSs, the DDL is also used to define internal and external schemas (views).
• The DBMS has a DDL compiler whose function is to process DDL statements in
order to identify descriptions of the schema constructs and to store the schema
description in the DBMS catalog.
• In some DBMSs, a storage definition language (SDL) is used to specify the
internal schema and a view definition language (VDL) is used to specify external
schemas.

Database Languages and Interfaces
Data Definition Language (DDL):
• Common DDL commands include:
• CREATE: Used to create tables, indexes, constraints, and other database objects.
• ALTER: Modifies the structure of existing database objects.
• DROP: Deletes objects from the database.
• TRUNCATE: Removes all records from a table.
• RENAME: Renames database objects.
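• Example: creating a table (column1 to column3 are placeholder column names):
    CREATE TABLE Students (
        column1 INT,          -- an integer column
        column2 VARCHAR(50),  -- a text column of up to 50 characters
        column3 INT           -- another integer column
    );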

Database Languages and Interfaces
• Data Manipulation Language (DML)
• Used to specify typical manipulations, which include retrieval, insertion, deletion, and
modification of the data.
• DML commands (the data sublanguage) can be embedded in a general-purpose
programming language (the host language), such as COBOL, C, C++, or Java.
• A library of functions can also be provided to access the DBMS from a programming language.
• Alternatively, stand-alone DML commands can be applied directly (in which case the DML is called a query
language).
• Key DML commands include:
• SELECT: Retrieves data from tables.
• INSERT: Adds new records to tables.
• DELETE: Removes specific records from tables.
• UPDATE: Modifies existing data.
• Example: retrieving student names (assuming the Students table has a name column):
    SELECT name FROM Students;

Types of DML
• A high-level or nonprocedural DML
• Can be used on its own to specify complex database operations concisely.
• Specifies which data to retrieve rather than how to retrieve it; therefore, such
languages are also called declarative.
• Example: SQL
• A low-level or procedural DML
• Must be embedded in a general-purpose programming language.
• Retrieves individual records or objects from the database and processes each
separately. Therefore, it needs to use programming language constructs, such
as looping, to retrieve and process each record from a set of records.
• Also called a record-at-a-time DML because of this property.
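
The contrast between the two styles can be sketched with Python as the host language and SQLite as the DBMS (both are assumptions, as are the table and its data). The second half only imitates record-at-a-time processing with a cursor loop; it is not a real procedural DML such as those of the network or hierarchical systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (name TEXT, gpa REAL);
INSERT INTO student VALUES ('Smith', 3.2), ('Brown', 3.8), ('Kebede', 2.9);
""")

# Declarative (set-at-a-time): state WHICH rows are wanted; the DBMS
# decides how to find them.
declarative = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM student WHERE gpa >= 3.0 ORDER BY name"
    )
]

# Record-at-a-time style: the host-language loop fetches each record and
# decides itself what to do with it.
procedural = []
cursor = conn.execute("SELECT name, gpa FROM student")
while True:
    record = cursor.fetchone()
    if record is None:  # no more records in the set
        break
    name, gpa = record
    if gpa >= 3.0:
        procedural.append(name)
procedural.sort()
```

Both variants produce the same list of names, but only the first leaves the choice of retrieval strategy to the DBMS.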

• Reading assignment: DBMS Interfaces, DBMS Programming Language Interfaces

Database System Utilities
• Loading: loads data stored in files into a database; includes data conversion tools.
• Backup
• File reorganization
• Performance monitoring
• Other utilities
• Sorting files
• Handling data compression
• Monitoring access by users
• Interfacing with the network

Tools, Application Environments, and
Communications Facilities
Other Tools
• Data dictionary / repository:
• Used to store schema descriptions and other information such as design decisions,
application program descriptions, user information, usage standards, etc.
• Active data dictionary is accessed by DBMS software and users/DBA.
• Passive data dictionary is accessed by users/DBA only.
• Application Development Environments and CASE (computer-aided software
engineering) tools:
• Examples:
• PowerBuilder (Sybase)
• JBuilder (Borland)
• JDeveloper 10G (Oracle)

Centralized and Client/Server
Architectures for DBMSs
• Centralized DBMS Architecture
• Combines everything into a single system, including DBMS software, hardware,
application programs, and user interface processing software.
• Users can still connect through a remote terminal; however, all processing is done at
the centralized site.

Basic 2-tier Client-Server Architectures
• Specialized Servers with Specialized functions
• Print server
• File server
• DBMS server
• Web server
• Email server
• Clients can access the specialized servers as needed

Logical two-tier client server architecture

Clients
• Provide appropriate interfaces through a client software module to access
and utilize the various server resources.
• Clients may be diskless machines or PCs or Workstations with disks with only
the client software installed.
• Connected to the servers via some form of a network.
• (LAN: local area network, wireless network, etc.)
DBMS Server
• Provides database query and transaction services to the clients
• Relational DBMS servers are often called SQL servers, query servers, or transaction
servers
• Applications running on clients utilize an Application Program Interface (API) to access
server databases via a standard interface such as:
• ODBC: Open Database Connectivity standard
• JDBC: for Java programming access
• Client and server must install appropriate client module and server module software
for ODBC or JDBC

Two Tier Client-Server Architecture
• A client program may connect to several DBMSs, sometimes called the data
sources.
• In general, data sources can be files or other non-DBMS software that
manages data.
• Other variations of clients are possible: e.g., in some object DBMSs, more
functionality is transferred to clients including data dictionary functions,
optimization and recovery across multiple servers, etc.
Three-Tier Client-Server Architecture
• Common for Web applications
• Intermediate Layer called Application Server or Web Server:
• Stores the web connectivity software and the business logic part of the application
used to access the corresponding data from the database server
• Acts like a conduit for sending partially processed data between the database server
and the client.
• Three-tier Architecture Can Enhance Security:
• Database server only accessible via middle tier
• Clients cannot directly access database server

Three-tier client-server architecture

Classification of DBMS
• Based on the data model used
• Traditional: Relational, Network, Hierarchical.
• Emerging: Object-oriented, Object-relational, NoSQL, key-value.
• Based on the number of users supported by the system
• Single-user (typically used with personal computers) vs. multi-user (most
DBMSs).
• Based on the type of access path
• One well-known family of DBMSs is based on inverted file structures.
• Based on purpose
• General purpose or special purpose. Many airline reservation and telephone
directory systems developed in the past are special-purpose DBMSs.
• Based on the number of sites over which the database is distributed
• Centralized (uses a single computer with one database) vs. distributed (uses
multiple computers, multiple databases).

Cont’d
• Categories of database models include:
• Object-based
• Record-based
• Physical

Record-based Data Models
• Consist of a number of fixed-format records.
• Each record type defines a fixed number of fields; each field is typically of a
fixed length.
• The following are examples of this database model category.
• Hierarchical Database Model
• Network Database Model
• Relational Database Model

Hierarchical Data Model:
• The simplest database model.
• Employs two main data structuring concepts: records and parent-child
relationships (represented by sets).
• A hierarchical model can be represented as a tree graph, with records appearing as
nodes (also called segments) and sets as edges.
• The structure is an upside-down tree, which may be of arbitrary depth.
• Each record contains multiple fields, where each field may contain either a data value
(such as an integer, real, or text value) or a pointer to a record. Pointers are not allowed to form a
cycle.
• Some hierarchical DBMSs support null values or variable-length fields.
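
The tree structure described above can be mimicked in memory with a minimal sketch. The class Segment, the function path_to, and the sample records are hypothetical and are not taken from any real hierarchical DBMS; they only show that each record has exactly one parent and that access means navigating down from the root.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One record (node) in the hierarchy; it has at most one parent."""
    name: str
    # The parent link is excluded from repr/compare to avoid walking the cycle.
    parent: Optional["Segment"] = field(default=None, repr=False, compare=False)
    children: List["Segment"] = field(default_factory=list)

    def add_child(self, name: str) -> "Segment":
        child = Segment(name, parent=self)
        self.children.append(child)
        return child


def path_to(root: Segment, target: str) -> List[str]:
    """Navigate depth-first from the root and return the path to the target."""
    if root.name == target:
        return [root.name]
    for child in root.children:
        sub_path = path_to(child, target)
        if sub_path:
            return [root.name] + sub_path
    return []


# Illustrative hierarchy: a college with departments and a course.
college = Segment("College")
dept = college.add_child("Computer Science")
course = dept.add_child("Database course")
college.add_child("Information Systems")

found = path_to(college, "Database course")
missing = path_to(college, "Physics")
```

Because every lookup starts at the root and follows parent-child links, the program has to know the shape of the hierarchy, which is the navigational character listed among the disadvantages.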

Cont’d
• Advantages
• The hierarchical model is simple to construct and operate on.
• Conceptual simplicity – easy to understand the model layout
• Data independence (a change in a data type will be automatically cascaded
throughout the database by the DBMS, thereby eliminating the need to make
changes in the program segments that reference the changed data type)
• Database integrity – always a link between parent and child
• Efficiency – very efficient when it contains a large volume of data in 1:M
relationships and whose relationships are fixed over time
• Corresponds to a number of natural hierarchically organized domains, e.g.,
organization (“org”) chart
• Searching for any record in a hierarchical tree is very fast, since
hierarchical databases use contiguous storage for hierarchical structures.

Cont’d
• Disadvantages
• Complex implementation – detailed knowledge of the physical data storage
characteristics is required by the designers and programmers
• Difficult to manage – relocation of segments requires application changes
• Lacks structural independence
• Complex applications programming and use – programmers and end users must
know precisely how the data are physically distributed within the database
• Lack of standards – no standard DDL or DML
• Navigational and procedural nature of processing.
• A programmer still has to write the program to access the account information,
but now the program navigates through the hierarchy e.g. to find the balance of
the savings account for a given customer number.
• Offers little support for consistency and security
• Implementation limitations – difficult to support M:N relationships
• Little scope for "query optimization"

Network Data Model
• Like the hierarchical model, the network model is a collection of physically linked records.
• Collection of records in 1:M relationships
• A relationship is called a set
• Composed of at least two record types
• Owner
• Equivalent to the hierarchical model’s parent
• Member
• Equivalent to the hierarchical model’s child
• A record can appear as a member in more than one set, i.e., a member
may have multiple owners.
• Relationships are explicitly modeled by the sets, which become pointers in the
implementation.
• The records are organized as generalized graph structures, with records appearing as
nodes (also called segments) and sets as edges in the graph.
• The data is still stored in files, but now we also have to define the set types.
• Example set types:
• CustomerAccount, AccountTransaction
• The entry points are records that can be searched.
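
The owner/member idea can be sketched in a few lines. The classes Record and SetType below are hypothetical illustrations, not the API of any network DBMS; CustomerAccount is the set type named on the slide, while BranchAccount is an invented second set type used to show a member with two owners.

```python
class Record:
    """A record occurrence; owners maps a set type name to the owner record."""

    def __init__(self, record_type, key):
        self.record_type = record_type
        self.key = key
        self.owners = {}


class SetType:
    """A named 1:M relationship between an owner type and a member type."""

    def __init__(self, name):
        self.name = name
        self.members = {}  # owner key -> list of member records

    def connect(self, owner, member):
        # 1:M rule: within one set type a member has at most one owner.
        if self.name in member.owners:
            raise ValueError("member already has an owner in " + self.name)
        member.owners[self.name] = owner
        self.members.setdefault(owner.key, []).append(member)

    def find_members(self, owner):
        return list(self.members.get(owner.key, []))


customer = Record("Customer", "C1")
branch = Record("Branch", "B1")
account = Record("Account", "A1")

customer_account = SetType("CustomerAccount")
branch_account = SetType("BranchAccount")  # hypothetical second set type

# The same Account record is a member of two different sets,
# so it has two owners, which a strict hierarchy cannot express.
customer_account.connect(customer, account)
branch_account.connect(branch, account)

accounts_of_customer = customer_account.find_members(customer)
```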

Example : Registrar system

Cont’d
Advantages
• Conceptual simplicity
• Data access flexibility – no need for a preorder traversal
• Promotes database integrity – must first define the owner and then the member
record
• Data independence
• Able to model complex relationships and to represent the semantics of add or delete operations on the
relationships.
• Can handle most situations for modeling using record types and relationship types.
• Language is navigational; uses constructs like FIND, FIND member, FIND owner,
FIND NEXT within set, GET etc.
• Programmers can do optimal navigation through the database
• Conformance to standards

Cont’d
Disadvantages
• System complexity
• Lack of structural independence
• Navigational and procedural nature of processing.
• Database contains a complex array of pointers that thread through a set of
records.
• Little scope for automated "query optimization"

Relational Model
• Its terminology originates from the branch of mathematics called set
theory and from relational algebra.
• Now implemented in several commercial products (e.g. DB2, ORACLE, MS SQL Server,
SYBASE, INFORMIX).
• Several free open source implementations, e.g. MySQL, PostgreSQL
• Can define more flexible and complex relationships.
• Viewed as a collection of tables called “relations”, equivalent to a collection of
record types.
• Relation: a two-dimensional table.
• Stores information or data in the form of tables with rows and columns.
• A row of the table is called a tuple, equivalent to a record.
• The rows represent records (collections of information about separate items).
• A column of a table is called an attribute, equivalent to a field.
• The columns represent fields (particular attributes of a record).

Cont’d
• Data value is the value of the Attribute.
• Records are related by the data stored jointly in the fields of records in two
tables or files. The related tables contain information that creates the
relation.
• No physical consideration of the storage is required by the user.
• Many tables are merged together to come up with a new virtual view of the
relationship.

• Ex. Customer and Account table

Cont’d
• The RDBMS manages all of the physical details, while the user sees the
relational database as a collection of tables in which data are stored.
• The user can manipulate and query the data in a way that seems intuitive and
logical.
• Conducts searches by using data in specified columns of one table to find
additional data in another table.
• In conducting searches, a relational database model matches information from
a field in one table with information in a corresponding field of another table
to produce a third table that combines requested data from both tables.
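
The matching of a field in one table with the corresponding field in another is what SQL calls a join. The sketch below assumes SQLite through Python's sqlite3 module; the customer and account tables follow the Customer and Account example mentioned earlier in the slides, but their columns and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE account (
    account_no  INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),
    balance     REAL
);
INSERT INTO customer VALUES (1, 'Smith'), (2, 'Brown');
INSERT INTO account VALUES (100, 1, 250.0), (101, 1, 75.5), (102, 2, 10.0);
""")

# Match customer_id in account with customer_id in customer; the result is
# a third, derived table that combines columns from both.
rows = conn.execute("""
    SELECT c.name, a.account_no, a.balance
    FROM customer AS c
    JOIN account AS a ON a.customer_id = c.customer_id
    ORDER BY a.account_no
""").fetchall()
```

No pointers connect the two tables; the relationship exists only through the shared customer_id values, which is why the user needs no knowledge of the physical storage.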

Cont’d
• Advantages
• Structural independence – changes in the relational data structure do not affect the
DBMS’s data access in any way
• Improved conceptual simplicity by concentrating on the logical view
• Easier database design, implementation, management, and use
• Ad hoc query capability - SQL
• Powerful database management system
• Disadvantages
• Substantial hardware and system software overhead
• Can facilitate poor design and implementation

Object-oriented Data Models:
• Several models have been proposed for implementing in a database system.
• One set comprises models of persistent O-O Programming Languages such as C++
(e.g., in OBJECTSTORE or VERSANT), and Smalltalk (e.g., in GEMSTONE).
• Additionally, systems like O2, ORION (at MCC - then ITASCA), IRIS (at H.P.- used in
Open OODB).
• Object Database Standard: ODMG-93, ODMG-version 2.0, ODMG-version 3.0.

Cont’d
Advantages
• Inheritance
• The object-relational data model allows its users to inherit objects, tables, etc. so that they can
extend their functionality. Inherited objects contain new attributes as well as the attributes that
were inherited.
• Complex Data Types
• Complex data types can be formed using existing data types. This is useful in the object-relational data
model, as complex data types allow better manipulation of the data.
• Extensibility
• The functionality of the system can be extended in the object-relational data model. This can be
achieved using complex data types as well as advanced concepts of the object-oriented model such as
inheritance.
Disadvantages:
• The object-relational data model can get quite complicated and difficult to handle at
times, as it is a combination of the object-oriented data model and the relational data
model and utilizes the functionalities of both of them.
