0% found this document useful (0 votes)
4 views

Chapter 2 Database System Concepts and Architecture

Chapter 2 discusses database system concepts and architecture, focusing on data models, schemas, and the three-schema architecture. It outlines the different levels of data representation (external, conceptual, and internal) and the importance of data independence. Additionally, it categorizes data models into conceptual, physical, and implementation types, and explains the roles of DBMS languages in defining and manipulating databases.

Uploaded by

gesgisermias
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views

Chapter 2 Database System Concepts and Architecture

Chapter 2 discusses database system concepts and architecture, focusing on data models, schemas, and the three-schema architecture. It outlines the different levels of data representation (external, conceptual, and internal) and the importance of data independence. Additionally, it categorizes data models into conceptual, physical, and implementation types, and explains the roles of DBMS languages in defining and manipulating databases.

Uploaded by

gesgisermias
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 64

Chapter 2

DATABASE SYSTEM CONCEPTS


AND ARCHITECTURE
Outline
2

 Data Models and Their Categories


 History of Data Models
 Schemas, Instances, and States
 Three-Schema Architecture
 Data Independence
 DBMS Languages and Interfaces
 Database System Utilities and Tools
 Centralized and Client-Server Architectures
 Classification of DBMSs
The Three-Level ANSI-SPARC Architecture
1971 1975

 An early proposal for a standard terminology  The American National Standards


and general architecture for database systems Institute (ANSI) Standards Planning and
was produced in 1971 by the DBTG (Data Requirements Committee (SPARC),
Base Task Group) appointed by the ANSI/X3/SPARC, produced a similar
Conference on Data Systems and Languages terminology and architecture in 1975
(CODASYL, 1971). (ANSI, 1975).

 The DBTG recognized the need for a two-  ANSI-SPARC recognized the need for a

level approach with a system view called the three-level approach with a system
schema and user views called subschemas. catalog.

 The levels form a three-level


architecture comprising an external, a
conceptual, and an internal level.
Three schema architecture
4

The Three-Level ANSI-SPARC Architecture


External level
Slide
2- 5

 The users’ view of the database. This level describes that part of the

database that is relevant to each user.


 The external view includes only those entities, attributes, and relationships

in the ‘real world’ that the user is interested in.


 Different views may have different representations of the same data.

 For example, one user may view dates in the form (day, month, year),

while another may view dates as (year, month, day).


Conceptual level
Slide
2- 6

 The community view of the database. This level describes what data is stored

in the database and the relationships among the data.


 The conceptual level represents:

 All entities, their attributes, and their relationships;


 The constraints on the data;
 Semantic information about the data;
 Security and integrity information
Internal level
Slide
2- 7

 The physical representation of the database on the computer.

 This level describes how the data is stored in the database.

 The internal level is concerned with such things as:

 Storage space allocation for data and indexes;


 Record descriptions for storage (with stored sizes for data items);
 Record placement;
 Data compression and data encryption techniques.
Data Models
8

 Data Model:
 A set of concepts to describe the structure of a database, the
operations for manipulating these structures, and certain constraints
that the database should obey.
 Data Model Structure and Constraints:
 Constructs are used to define the database structure

 Constructs typically include elements (and their data types) as well

as groups of elements (e.g entity, record, table), and relationships


among such groups
 Constraints specify some restrictions on valid data; these constraints

must be enforced at all times


Data Models (continued)
9

Data Model Operations:

 These operations are used for specifying database


retrievals and updates by referring to the constructs
of the data model.
 Operations on the data model may include basic
model operations (e.g. generic insert, delete,
update) and user-defined operations (e.g.
compute_student_gpa, update_inventory)
Data models
Slide
2- 10

 A data model can be thought of as comprising three components:

1. A structural part, consisting of a set of rules according to which databases


can be constructed;

2. A manipulative part, defining the types of operation that are allowed on the
data (this includes the operations that are used for updating or retrieving data
from the database and for changing the structure of the database);

3. Possibly a set of integrity constraints, which ensures that the data is accurate

The purpose of a data model is to represent data and to make


the data understandable.
Con …
Slide
2- 11

 To reflect the ANSI-SPARC architecture, we can

identify three related data models:


1. an external data model, to represent each user’s view of the
organization, sometimes called the Universe of Discourse (UoD);

2. a conceptual data model, to represent the logical (or community)


view that is DBMS independent;

3. an internal data model, to represent the conceptual schema in


such a way that it can be understood by the DBMS.
Categories of Data Models
12

I. Conceptual (high-level, semantic) data models:


Provide concepts that are close to the way many users perceive data.

The way users perceive the data.

(Also called entity-based or object-based data models.)


The community view of the database.
This level describes what data is stored in the database and the
relationships among the data.
 All the constraints on the data;
 Semantic information about the data;
 security and integrity information.

 entities, their attributes, and their relationships;


Categories of Data Models
13

II. Physical (low-level, internal) data models:

The way the DBMS and the operating system perceive the data.
Provide concepts that describe details of how data is stored in the
computer.
The physical representation of the database on the computer.
The internal level is concerned with such things as:
 Storage space allocation for data and indexes;
 Record descriptions for storage (with stored sizes for data items);
 Record placement;
 Data compression and data encryption techniques.
Categories of Data Models
14

III. Implementation (representational) data models:


 Provide concepts that fall between the above two, used by many
commercial DBMS implementations (e.g. relational data models
used in many commercial systems).
 which provide concepts that may be easily
understood by end users.
 Representational data models hide many details of
data storage on disk but can be implemented on a
computer system directly.
Con…
Slide
2- 15

 There have been many data models proposed in the literature.

 They fall into three broad categories: object-based, record-based,

and physical data models.


 The first two are used to describe data at the conceptual and

external levels, the latter is used to describe data at the internal


level.
Object-Based Data Models
Slide
2- 16

 Object-based data models use concepts such as entities, attributes, and relationships.

 An entity is a distinct object (a person, place, thing, concept, event) in the

organization that is to be represented in the database.


 An attribute is a property that describes some aspect of the object that we wish to

record, and a relationship is an association between entities.


 Some of the more common types of object-based data model are:
 Entity–Relationship
 Semantic
 Functional
 Object-Oriented.
Record-Based Data Models
Slide
2- 17

 In a record-based model, the database consists of a number of fixed-


format records possibly of differing types.
 Each record type defines a fixed number of fields, each typically of a
fixed length.
 There are three principal types of record-based logical data model:
 The relational data model,

 The network data model, and

 The hierarchical data model.

 The hierarchical and network data models were developed almost a

decade before the relational data model, so their links to traditional file
processing concepts are more evident.
Relational data model
Slide
2- 18

 The relational data model is based on the concept of mathematical relations.

 In the relational model, data and relationships are represented as tables, each

of which has a number of columns with a unique name.


Hierarchical model
Slide
2- 19

 The hierarchical model organizes data into a tree-like structure, where each

record has a single parent or root.


 Sibling records are sorted in a particular order.

 That order is used as the physical order for storing the database.

 This model is good for describing many real-world relationships.


Network model
Slide
2- 20

 The network model builds on the hierarchical model by allowing many-to-many

relationships between linked records, implying multiple parent records.


 Each set consists of one owner or parent record and one or more member or

child records.
 It was most popular in the 70s after it was formally defined by the Conference

on Data Systems Languages (CODASYL).


Physical Data Models
Slide
2- 21

 Physical data models describe how data is stored in the computer,

representing information such as record structures, record


orderings, and access paths.
 There are not as many physical data models as logical data

models, the most common ones being the unifying model and the
frame memory.
Schemas, Mappings, and Instances
Slide
2- 22

Database Schemas-
 A database schema is the skeleton structure that
represents the logical view of the entire database.
 It defines how the data is organized and how the
relations among them are associated.
 It formulates all the constraints that are to be applied
on the data.
Schemas versus Instances
23

 Simply Schema is the description of a database.


 Includes descriptions of the database structure, data types, and
the constraints on the database.
 The data in the database at any particular point in time is called
a database instance.
 Schema Diagram:

 An illustrative display of (some aspects of) a database schema.


 Schema Construct:

 A component of the schema or an object within the schema,


e.g., STUDENT, COURSE.
Schemas versus Instances
24

Database State:

 The actual data stored in a database at a particular


moment in time. This includes the collection of all the
data in the database.
 Also called database instance (or occurrence or
snapshot).
 The term instance is also applied to individual database
components, e.g. record instance, table instance, entity
instance
Database Schema vs. Database State
25

Database State:
 Refers to the content of a database at a moment in time.

Initial Database State:


 Refers to the database state when it is initially loaded
into the system.

Valid State:
 A state that satisfies the structure and constraints of the
database.
Database Schema vs. Database State
26

Distinction

 The database schema changes very infrequently.


 The database state changes every time the database
is updated.

Schema is sometimes called the intension of the database

Instance is called an extension(or state) of the database.


Example of a Database Schema
27
Example of a database
28
state

STUDENT

Name Student_numb Class Major


er

Smith 17 1 CS

Brown 8 2 CS
Three-Schema Architecture
29

 Proposed to support DBMS characteristics of:


 Program-data independence.
 Support of multiple views of the data.

 Not explicitly used in commercial DBMS products,


but has been useful in explaining database system
organization
Three-Schema Architecture
30

Defines DBMS schemas at three levels:

 Internal schema at the internal level to describe physical


storage structures and access paths.
 Is a complete description of the internal model, containing the
definitions of stored records, the methods of representation, the
data fields, and the indexes and storage structures used.
 Typically uses a physical data model.
31

 Conceptual schema at the conceptual level to describe


the structure and constraints for the whole database for a
community of users.
 Describes all the entities, attributes, and relationships
together with integrity constraints.
 Uses a conceptual or an implementation data model.
 External schemas (also called subschema) at the
external level to describe the various user views.
 Correspond to different views of the data.
 Usually uses the same data model as the conceptual level.
The three-schema architecture
32
The three-schema architecture
33

Examples: Differences between the three levels.


Three-Schema Architecture
34

Mappings among schema levels are needed to

transform requests and data.


 Programs refer to an external schema, and are mapped
by the DBMS to the internal schema for execution.
 Data extracted from the internal DBMS level is
reformatted to match the user’s external view (e.g.
formatting the results of an SQL query for display in a
Web page)
Data Independence
Slide
2- 35

 Data independence is the ability to modify the schema definition

in one level without affecting the schema definition in the next


higher level.
 Two types of Data Independence:

1. Physical Data Independence

2. Logical Data Independence


Data Independence
36
 Logical Data Independence:
 Logical data is data about database, that is, it stores information about how
data is managed inside.
 The capacity to change the conceptual schema without having to change the
external schemas and their associated application programs.
 Physical Data Independence:

 The capacity to change the internal schema without having to change the

conceptual schema.
 For example, the internal schema may be changed when certain file structures
are reorganized or new indexes are created to improve database performance
Data Independence (continued)
37

 When a schema at a lower level is changed, only the mappings

between this schema and higher-level schemas need to be


changed in a DBMS that fully supports data independence.
 The higher-level schemas themselves are unchanged.

 Hence, the application programs need not be changed since they


refer to the external schemas.
DBMS Languages
38

 A data sublanguage consists of two parts:


 Data Definition Language(DDL)
 The DDL is used to specify the database schema
 Data Manipulation Language(DML).
 The DML is used to both read and update the database.
 DDL and DML not two separate languages; instead they simply

form parts of a single db language, such as SQL language.


DBMS Languages
39

 These languages are called data sublanguages because they do

not include constructs for all computing needs such as


conditional or iterative statements, which are provided by the
high-level programming languages.
 Many DBMSs have a facility for embedding the sublanguage in a

high-level programming language such as COBOL, Fortran,


Pascal, Ada, ‘C’, C++, Java, or Visual Basic.
DBMS Languages
40

 High-Level or Non-procedural Languages: These include the relational


language SQL

May be used in a standalone way or may be embedded in a programming


language
 Low Level or Procedural Languages: These must be embedded in a
programming language.
 View definition language (VDL)- Specifies user views/mappings to

conceptual schema
 Storage definition language (SDL)- Specifies the internal schema
Data Definition Language (DDL):
41

 A language that is used to define database schemas.

 The DDL statement is used to identify description of the schema construct and

store the schema description in the DBMS catalog (data dictionary).


 A data dictionary contains metadata (i.e., data about data).

 Example:

Create table Student (StNo number(14),name varchar(20),Bdate date);


 When create a new table, it also update a special set of tables (data
dictionary).
An example of a database catalog for the database.
Slide
2- 42
Data Definition Language (DDL):
43

 DDL is Used by the DBA and database designers to specify the conceptual

schema of a database.
 In many DBMSs, the DDL is also used to define internal and external schemas

(views).
 In some DBMSs, separate Storage Definition Language (SDL) And View

Definition Language (VDL) are used to define internal and external schemas.
 SDL is typically realized via DBMS commands provided to the DBA
and database designers
Data Manipulation Language (DML)
44

 DML is Used to specify database retrievals and updates

 DML commands (data sublanguage) can be embedded in a

general-purpose programming language (host language), such as


COBOL, C, C++, or Java.
 A library of functions can also be provided to access the DBMS

from a programming language


 Alternatively, stand-alone DML commands can be
applied directly (called a query language).
Data Manipulation Language (DML)
45

 Data manipulation operations usually include the following:

 Insertion of new data into the database;


 Modification of data stored in the database;
 Retrieval of data contained in the database;
 Deletion of data from the database.
Types of DML
46

 There are two classes of DMLs:


 High Level or Non-procedural Language:
 Are set-oriented and specify what data to retrieve rather
than how to retrieve it.
 Also called declarative languages.
 Example the SQL relational language
 Low Level or Procedural Language:
 A language that allows the user to tell the system what data is needed
and exactly how to retrieve the data.
 Retrieve data one record-at-a-time;
 Constructs such as looping are needed to retrieve
multiple records, along with positioning pointers.
 Example –PL/SQL
DBMS Interfaces
47

 DBMS interface- is a user interface which allows for the ability to input

queries to a database without using the query language itself.


 Stand-alone query language interfaces

 Example: Entering SQL queries at the DBMS interactive SQL


interface (e.g. SQL*Plus in ORACLE)
 Programmer interfaces for embedding DML in programming languages

 User-friendly interfaces

 Menu-based, forms-based, graphics-based, etc.


DBMS Interfaces cont…
48

 Menu-based interfaces for Web clients or browsing

 Forms-based interfaces

 Graphical user interfaces

 Natural language interfaces

 Speech input and output

 Interfaces for parametric users

 Interfaces for the DBA


Database System Utilities
49

To perform certain functions such as:


 Loading data stored in files into a database. Includes data
conversion tools.
 Backing up the database periodically on tape.
 Reorganizing database file structures.
 Report generation utilities.
 Performance monitoring utilities.
 Other functions, such as sorting, user monitoring, data compression,
etc.
Other Tools
50

Data dictionary / repository:


 Used to store schema descriptions and other
information such as design decisions, application
program descriptions, user information, usage
standards, etc.
 Active data dictionary is accessed by DBMS
software and users/DBA.
 Passive data dictionary is accessed by users/DBA
only.
Other Tools
51

Application Development Environments and CASE


(computer-aided software engineering) tools:
Examples:
 PowerBuilder (Sybase)
 JBuilder (Borland)
 JDeveloper 10G (Oracle)
Typical DBMS Component Modules
52
Centralized and Client-Server DBMS
Architectures
53

Centralized DBMS:
 Combines everything into single system including:-
 DBMS software,
 hardware,
 application programs, and
 user interface processing software.
 User can still connect through a remote terminal –
however, all processing is done at centralized site.
A Physical Centralized Architecture
54
Basic 2-tier Client-Server Architectures
55

Specialized Servers with Specialized


functions
 Print server
 File server
 DBMS server
 Web server
 Email server
Clients can access the specialized servers as
needed
Logical two-tier client server
architecture
56
Clients
57

 Provide appropriate interfaces through a client software

module to access and utilize the various server resources.


 Clients may be diskless machines or PCs or Workstations

with disks with only the client software installed.


 Connected to the servers via some form of a network.

 (LAN: local area network, wireless network, etc.)


DBMS Server
58

 Provides database query and transaction services to the clients


 Relational DBMS servers are often called SQL servers, query servers,
or transaction servers
 Applications running on clients utilize an Application Program
Interface (API) to access server databases via standard interface Like
 ODBC: Open Database Connectivity standard
 JDBC: for Java programming access
 Client and server must install appropriate client module and server
module software for ODBC or JDBC
Two Tier Client-Server Architecture
59

 A client program may connect to several DBMSs, sometimes


called the data sources.
 In general, data sources can be files or other non-DBMS
software that manages data.
 Other variations of clients are possible: e.g., in some object
DBMSs, more functionality is transferred to clients including
data dictionary functions, optimization and recovery across
multiple servers, etc.
Three Tier Client-Server Architecture
60

 Common for Web applications


 Intermediate Layer called Application Server or Web Server:
 Stores the web connectivity software and the business
logic part of the application used to access the
corresponding data from the database server
 Acts like a conduit for sending partially processed
data between the database server and the client.
 Three-tier Architecture Can Enhance Security:
 Database server only accessible via middle tier
 Clients cannot directly access database server
Three-tier client-server architecture
61
Classification of DBMSs
62

I. Based on the data model used


 Traditional: Relational, Network, Hierarchical.
 Emerging: Object-oriented, Object-relational.

II. Other classifications


 Single-user (typically used with personal computers)
vs. multi-user (most DBMSs).
 Centralized (uses a single computer with one database)
vs. distributed (uses multiple computers, multiple databases)
Summary
63

Data Models and Their Categories

History of Data Models

Schemas, Instances, and States

Three-Schema Architecture

Data Independence

DBMS Languages and Interfaces

Database System Utilities and Tools

Centralized and Client-Server Architectures

Classification of DBMSs
Questions
Slide
2- 64

1. What is Data Model?


Ans: A collection of conceptual tools for describing data, data relationships data
semantics and constraints.
2. What is DDL (Data Definition Language)?
Ans: A data base schema is specifies by a set of definitions expressed by a
special language called DDL.
3. What is VDL (View Definition Language)?
Ans: It specifies user views and their mappings to the conceptual schema.
4. What is SDL (Storage Definition Language)?
Ans: This language is to specify the internal schema. This language may specify
the mapping between two schemas.

You might also like