COMP 150 - Topic 2

This document provides an introduction to Database Management Systems (DBMS), detailing its architecture, data models, and languages. It explains the three levels of DBMS architecture: external, conceptual, and internal, along with the importance of schemas and various data models like the Entity-Relationship and Relational models. Additionally, it covers data definition and manipulation languages used to interact with databases, emphasizing the significance of data organization and user interaction.

Uploaded by

wanjikudavid403

Topic Two: Introduction to Database Design

Introduction
Welcome to topic two. This topic introduces the structure of a Database
Management System (DBMS). It also highlights the significant components of
the DBMS, so that you understand the scope of the whole process of
designing a functional DBMS.

Learning Outcomes

By the end of this topic you should be able to answer the following
questions:

i. What is a DBMS architecture?
ii. What is a data model?
iii. What does transaction management entail?
iv. What is storage management all about?
v. Who are the various users of a database?

2.1 DBMS Architecture and Data Independence

There are three levels or layers of DBMS architecture:

• External or view level,
• Conceptual or logical level, and
• Internal or storage or physical level.
The objective of the three level architecture is to separate each user's view
of the database from the way the database is physically represented. There
are several reasons why this separation is desirable:
• Each user should be able to access the same data, but have a different
customized view of the data. Each user should be able to change the
way he or she views the data, and this change should not affect other
users.
• Users should not have to deal directly with physical database storage
details, such as indexing or hashing. In other words a user's interaction
with the database should be independent of storage considerations.
• The Database Administrator (DBA) should be able to change the
database storage structures without affecting the user's views.
• The internal structure of the database should be unaffected by changes
to the physical aspects of storage, such as the changeover to a new
storage device.
• The DBA should be able to change the conceptual structure of the
database without affecting all users.

Since many database-systems users are not computer trained, developers
hide the complexity from users through several levels of abstraction, to
simplify users’ interactions with the system:
• Physical level. The lowest level of abstraction describes how the data
are actually stored. The physical level describes complex low-level data
structures in detail.
• Logical level. The next-higher level of abstraction describes what data
are stored in the database, and what relationships exist among those
data. The logical level thus describes the entire database in terms of a
small number of relatively simple structures.
• View level. The highest level of abstraction describes only part of the
entire database. Even though the logical level uses simpler structures,
complexity remains because of the variety of information stored in a
large database. Many users of the database system do not need all this
information; instead, they need to access only a part of the database.
The view level of abstraction exists to simplify their interaction with the
system. The system may provide many views for the same database.

Figure 1.0: Database System Architecture

2.1.1 External Level or View level
This is the users' view of the database, and the level closest to the end
users. It describes the part of the database that is relevant to each user
and deals with the way individual users see the data. Each user is given a
view according to his or her requirements.
A view involves only those portions of a database that concern a
particular user, so the same database can have different views for
different users. The external view insulates users from the details of the
internal and conceptual levels. In addition, different views may have
different representations of the same data. For example, one user may view
dates in the form (day, month, year), while another may view dates as
(year, month, day).
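
The date example above can be sketched with two SQL views over one stored table. This is only an illustration, assuming SQLite through Python's standard sqlite3 module; the table, view and column names and the sample row are hypothetical and do not come from the course material.

```python
import sqlite3

# Hypothetical table, views and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT, admitted TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Amina', '2024-09-02')")

# External view 1: name plus the date shown as day/month/year.
conn.execute("""
    CREATE VIEW student_dmy AS
    SELECT name, strftime('%d/%m/%Y', admitted) AS admitted FROM student
""")

# External view 2: roll number plus the date shown as year-month-day.
conn.execute("""
    CREATE VIEW student_ymd AS
    SELECT roll_no, strftime('%Y-%m-%d', admitted) AS admitted FROM student
""")

dmy_rows = conn.execute("SELECT * FROM student_dmy").fetchall()
ymd_rows = conn.execute("SELECT * FROM student_ymd").fetchall()
print(dmy_rows)
print(ymd_rows)
```

Both views read the same stored row, yet each user sees different columns and a different date format, and neither view has to know how the table is stored.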

2.1.2 Conceptual Level or Logical level


This is the community view of the database, and the middle level of the
three-level architecture. It describes what data is stored in the database
and the relationships among the data. This level contains the logical
structure of the entire database as seen by the Database Administrator
(DBA). It is a complete view of the data requirements of the organization
that is independent of any storage considerations. The conceptual level
represents:
• All entities, their attributes, and their relationships;
An entity is an object about which information is stored in the database.
For example, in a student database the entity is student. An attribute is a
characteristic of interest about an entity. For example, in a student
database RollNo, Name, Class and Address are attributes of the entity
student.

Figure 1.1: An Entity showing attributes


• The constraints on the data;
• Semantic information about the data;
• Security and integrity information.

The conceptual level supports each external view, in that any data available
to a user must be contained in, or derivable from, the conceptual level.
However, this level must not contain any storage dependent details. For
instance, the description of an entity should contain only data types of
attributes (for example, integer, real, character) and their length (such as
the maximum number of digits or characters), but not any storage
considerations, such as the number of bytes occupied. The conceptual level
is also known as the logical level.

2.1.3 Internal level or Storage level


This is the physical representation of the database on the computer. It
describes how the data is stored in the database, that is, the way the data
are physically stored on the hardware. The internal level covers the
physical implementation of the database to achieve optimal runtime
performance and storage space utilization. It covers the data structures
and file organizations used to store data on storage devices. It interfaces
with the operating system access methods to place the data on the storage
devices, build the indexes, retrieve the data, and so on.
The internal level is concerned with such things as:
• Storage space allocation for data and indexes;
• Record descriptions for storage (with stored sizes for data items);
• Record placement;
• Data compression and data encryption techniques.

There will be only one conceptual view, consisting of the abstract
representation of the database in its entirety. Similarly, there will be
only one internal or physical view, representing the total database as it
is physically stored.

2.2 Schema
It is important to note that the data in the database changes frequently,
while the plans or schemes remain the same over long periods of time. The
database plans consist of types of entities that a database deals with, the
relationship among these entities and the ways in which the entities and
relationships are expressed from one level of abstraction to the next level
for the users' view. The users' view of the data (also called logical
organization of data) should be in a form that is most convenient for the
users and they should not be concerned about the way data is physically
organized. Therefore, a DBMS should do the translation between the logical
(users' view) organization and the physical organization of the data in the
database.

The plan or scheme of the database is known as the schema. The schema gives
the names of the entities and attributes and specifies the relationships
among them. It is a framework into which the values of the data items (or
fields) are fitted. The format of the schema remains the same, but the
values fitted into this format change from instance to instance. In other
words, the schema is the overall plan of all the data item (field) types
and record types stored in a database. The schema includes the definition
of the database name, the record types and the components that make up
those records.

2.2.1 Types of Schema
There are three different types of schema in a database, one for each data
view. In other words, the data view at each of the three levels is
described by a schema.

A schema is defined as an outline or a plan that describes the records and
relationships existing at a particular level. The external view is
described by means of external schemas, which correspond to the different
views of the data. Similarly, the conceptual view is defined by the
conceptual schema, which describes all the entities, attributes, and
relationships together with integrity constraints. The internal view is
defined by the internal schema, which is a complete description of the
internal model, containing definitions of stored records, the methods of
representation, the data fields, and the indexes used.

There is only one conceptual schema and one internal schema per
database. The schema also describes the way in which data elements at
one level can be mapped to the corresponding data elements in the next
level.

Thus, we can say that the schema establishes correspondence between the
records and relationships at two adjacent levels. In a relational database,
the schema defines the tables, the fields in each table, and the
relationships between fields and tables. Schemas are generally stored in a
data dictionary.

The data in the database at any particular point in time is called a database
instance. Therefore, many database instances can correspond to the same
database schema. The schema is sometimes called the intension of the
database, while an instance is called an extension (or state) of the
database.
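
The difference between a schema and its instances can be sketched as follows. This is only an illustration, assuming SQLite through Python's sqlite3 module; the student table, its rows and the helper functions read_schema and read_instance are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT)")


def read_schema(connection):
    """Return the stored definition of the (hypothetical) student table."""
    row = connection.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'"
    ).fetchone()
    return row[0]


def read_instance(connection):
    """Return the rows held in the student table at this moment."""
    return connection.execute(
        "SELECT roll_no, name FROM student ORDER BY roll_no"
    ).fetchall()


schema_before = read_schema(conn)
instance_1 = read_instance(conn)          # empty database state

conn.execute("INSERT INTO student VALUES (1, 'Amina')")
instance_2 = read_instance(conn)          # a second state

conn.execute("INSERT INTO student VALUES (2, 'Brian')")
instance_3 = read_instance(conn)          # a third state

schema_after = read_schema(conn)
print(schema_before == schema_after)      # the schema did not change
print(instance_1, instance_2, instance_3)
```

Three different instances (extensions) were observed while the schema (intension) stayed the same throughout.
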
Figure 1.2: Three Level Architecture of a DBMS

Example: To understand the difference between the three levels, consider
the database schema that describes a College Database system. If User1 is
a library clerk, the external view would contain only the student and book
information. If User2 is an accounts office clerk, then he or she may be
interested in student details and fee details, and so on.

The external view depends on the user who is accessing the database. The
conceptual level contains the logical view of the whole database and
represents the data type of each required field. The internal view
represents the physical location of each element on the disk of the
server, as well as how many bytes of storage each element needs.

2.3 Data Models


Underlying the structure of a database is a data model. A model is a
collection of conceptual tools for describing data, data relationships,
data semantics, and consistency constraints. Several models have been
introduced over the history of databases, and new ones have been added
over time. The main models are discussed below.

2.3.1 The Entity-Relationship Model


The entity-relationship (E-R) data model is based on a perception of a real
world that consists of a collection of basic objects, called entities, and of
relationships among these objects. An entity is a "thing" or "object" in the
real world that is distinguishable from other objects. For example, each
person is an entity, and bank accounts can be considered as entities.
Entities are described in a database by a set of attributes. For example, the
attributes account-number and balance may describe one particular
account in a bank, and they form attributes of the account entity set.
Similarly, attributes customer-name, customer-street address and
customer-city may describe a customer entity.

A relationship is an association among several entities. For example, a
depositor relationship associates a customer with each account that she
has. The set of all entities of the same type and the set of all
relationships of the same type are termed an entity set and relationship
set, respectively.
The overall logical structure (schema) of a database can be expressed
graphically by an E-R diagram, which is built up from the following
components:
• Rectangles, which represent entity sets
• Ellipses, which represent attributes
• Diamonds, which represent relationships among entity sets
• Lines, which link attributes to entity sets and entity sets to relationships

Figure 1.3: Entity-relationship model


In addition to entities and relationships, the E-R model represents certain
constraints to which the contents of a database must conform. One
important constraint is mapping cardinalities, which express the number of
entities to which another entity can be associated via a relationship set. For
example, if each account must belong to only one customer, the E-R model
can express that constraint.

2.3.2 Relational Model


The relational model uses a collection of tables to represent both data and
the relationships among those data. Each table has multiple columns, and
each column has a unique name.

The relational model is at a lower level of abstraction than the E-R model.
Database designs are often carried out in the E-R model, and then
translated to the relational model.
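
As a rough sketch of such a translation, the customer and account entity sets and the depositor relationship set can each become a table. This assumes SQLite through Python's sqlite3 module; the column types, the sample rows and the choice of a UNIQUE constraint to express "each account belongs to only one customer" are illustrative assumptions, not the only possible design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Entity sets become tables; attributes become columns.
    CREATE TABLE customer (customer_id TEXT PRIMARY KEY, customer_name TEXT);
    CREATE TABLE account  (account_number TEXT PRIMARY KEY, balance INTEGER);

    -- The depositor relationship set becomes a table of its own.
    -- UNIQUE(account_number): an account can appear with one customer only.
    CREATE TABLE depositor (
        customer_id    TEXT REFERENCES customer(customer_id),
        account_number TEXT UNIQUE REFERENCES account(account_number)
    );

    -- Hypothetical sample data.
    INSERT INTO customer VALUES ('C-1', 'Amina'), ('C-2', 'Brian');
    INSERT INTO account  VALUES ('A-101', 500);
    INSERT INTO depositor VALUES ('C-1', 'A-101');
""")

# A second owner for the same account violates the cardinality constraint.
try:
    conn.execute("INSERT INTO depositor VALUES ('C-2', 'A-101')")
    second_owner_rejected = False
except sqlite3.IntegrityError:
    second_owner_rejected = True

print(second_owner_rejected)
```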

Figure 1.4: An example of a Relational Model

2.3.3 Other Data Models


The object-oriented data model is another data model that has seen
increasing attention. The object-oriented model can be seen as extending
the E-R model with notions of encapsulation, methods (functions), and
object identity. Later we shall examine the object-oriented data model. The
object-relational data model combines features of the object-oriented data
model and relational data model.

Semi-structured data models permit the specification of data where
individual data items of the same type may have different sets of
attributes. This is in contrast with the data models mentioned earlier,
where every data item of a particular type must have the same set of
attributes. The extensible markup language (XML) is widely used to
represent semi-structured data.
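
A small sketch of semi-structured data, using Python's standard xml.etree.ElementTree module. The XML document and its element names are made up for illustration: both items are of type customer, yet they carry different sets of attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: two items of the same type (customer)
# with different sets of attributes.
document = """
<customers>
  <customer><name>Amina</name><city>Nairobi</city></customer>
  <customer><name>Brian</name><phone>555-0100</phone><email>brian@example.com</email></customer>
</customers>
""".strip()

root = ET.fromstring(document)

# Collect the attribute names that each customer actually has.
attribute_sets = [
    sorted(child.tag for child in customer)
    for customer in root.findall("customer")
]
print(attribute_sets)
```

In a relational table both customers would need the same columns, with missing values left empty.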

Historically, two other data models, the network data model and the
hierarchical data model, preceded the relational data model. These models
were tied closely to the underlying implementation, and complicated the
task of modeling data. As a result they are little used now, except in old
database code that is still in service in some places.

2.4 Database Languages

A database system provides a data definition language to specify the
database schema and a data manipulation language to express database
queries and updates. In practice, the data definition and data manipulation
languages are not two separate languages; instead they simply form parts
of a single database language, such as the widely used SQL language.

2.4.1 Data-Definition Language

We specify a database schema by a set of definitions expressed in a special
language called a data-definition language (DDL). For instance, the
following statement in the SQL language defines the account table (the
attribute account-number is written account_number, since an unquoted SQL
identifier cannot contain a hyphen):

create table account
(account_number char(10),
balance integer)

Execution of the above DDL statement creates the account table. In
addition, it updates a special set of tables called the data dictionary or
data directory. A data dictionary contains metadata, that is, data about
data. The schema of a table is an example of metadata. A database system
consults the data dictionary before reading or modifying actual data.
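
The effect of a DDL statement on the data dictionary can be observed directly. The sketch below assumes SQLite through Python's sqlite3 module, where the catalog is a table named sqlite_master; other systems expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Before any DDL runs, the catalog lists no tables.
tables_before = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# The DDL statement from the text, with an underscore in the column name.
conn.execute("CREATE TABLE account (account_number CHAR(10), balance INTEGER)")

# After the DDL runs, the catalog holds metadata describing the new table.
catalog = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(tables_before)
print(catalog)
```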

We specify the storage structure and access methods used by the database
system by a set of statements in a special type of DDL called a data
storage and definition language. These statements define the
implementation details of the database schemas, which are usually hidden
from the users.

The data values stored in the database must satisfy certain consistency
constraints. For example, suppose the balance on an account should not
fall below $100. The DDL provides facilities to specify such constraints.
The database system checks these constraints every time the database is
updated.
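
The balance rule above could be declared as a CHECK constraint. This is a minimal sketch assuming SQLite through Python's sqlite3 module; the account number A-101 and the amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The constraint is declared once, in the DDL.
conn.execute("""
    CREATE TABLE account (
        account_number CHAR(10),
        balance        INTEGER CHECK (balance >= 100)
    )
""")
conn.execute("INSERT INTO account VALUES ('A-101', 500)")

# The system checks the constraint on every update and rejects violations.
try:
    conn.execute("UPDATE account SET balance = 50 WHERE account_number = 'A-101'")
    update_rejected = False
except sqlite3.IntegrityError:
    update_rejected = True

balance = conn.execute("SELECT balance FROM account").fetchone()[0]
print(update_rejected, balance)
```

The rejected update leaves the stored balance unchanged.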

2.4.2 Data-Manipulation Language

Data manipulation is:

• The retrieval of information stored in the database
• The insertion of new information into the database
• The deletion of information from the database
• The modification of information stored in the database

A data-manipulation language (DML) is a language that enables users
to access or manipulate data as organized by the appropriate data model.
There are basically two types:

a) Procedural DMLs require a user to specify what data are needed
and how to get those data.
b) Declarative DMLs (also referred to as nonprocedural DMLs)
require a user to specify what data are needed without specifying
how to get those data.

Declarative DMLs are usually easier to learn and use than are procedural
DMLs. However, since a user does not have to specify how to get the data,
the database system has to figure out an efficient means of accessing data.
The DML component of the SQL language is nonprocedural.
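
The contrast between the two styles can be sketched as follows, assuming SQLite through Python's sqlite3 module and a hypothetical account table. The Python loop stands in for a procedural style, where the user spells out how to find the rows; the SQL WHERE clause states only what is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-215", 700), ("A-305", 900)],  # hypothetical rows
)

# Procedural style: scan every row and test each one by hand.
procedural = []
for account_number, balance in conn.execute(
    "SELECT account_number, balance FROM account"
):
    if balance > 600:
        procedural.append(account_number)
procedural.sort()

# Declarative style: say what is needed; the system decides how to get it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT account_number FROM account "
        "WHERE balance > 600 ORDER BY account_number"
    )
]

print(procedural, declarative)
```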

A query is a statement requesting the retrieval of information. The portion
of a DML that involves information retrieval is called a query language.
Although technically incorrect, it is common practice to use the terms
query language and data manipulation language synonymously.

This query in the SQL language finds the name of the customer whose
customer-id is 192-83-7465 (in the SQL text, hyphenated attribute names are
written with underscores and the id is quoted as a string):

select customer.customer_name
from customer
where customer.customer_id = '192-83-7465'

The query specifies that those rows from the table customer where the
customer-id is 192-83-7465 must be retrieved, and the customer-name
attribute of these rows must be displayed.

Queries may involve information from more than one table. For instance,
the following query finds the balance of all accounts owned by the customer
with customer-id 192-83-7465 (again with underscores in the attribute names
and the id quoted as a string):

select account.balance
from depositor, account
where depositor.customer_id = '192-83-7465' and
depositor.account_number = account.account_number

There are a number of database query languages in use, either
commercially or experimentally.
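
The two queries above can be tried against a small database. This sketch assumes SQLite through Python's sqlite3 module; the customer name, account numbers and balances are hypothetical sample data, and the customer id is passed as a query parameter rather than written into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer  (customer_id TEXT, customer_name TEXT);
    CREATE TABLE account   (account_number TEXT, balance INTEGER);
    CREATE TABLE depositor (customer_id TEXT, account_number TEXT);

    -- Hypothetical sample rows.
    INSERT INTO customer  VALUES ('192-83-7465', 'Amina'), ('019-28-3746', 'Brian');
    INSERT INTO account   VALUES ('A-101', 500), ('A-201', 900), ('A-215', 700);
    INSERT INTO depositor VALUES ('192-83-7465', 'A-101'),
                                 ('192-83-7465', 'A-201'),
                                 ('019-28-3746', 'A-215');
""")

customer_id = "192-83-7465"

# Single-table query: the name of the customer with this id.
names = conn.execute(
    "SELECT customer.customer_name FROM customer "
    "WHERE customer.customer_id = ?",
    (customer_id,),
).fetchall()

# Two-table query: balances of all accounts owned by this customer.
balances = sorted(
    row[0]
    for row in conn.execute(
        "SELECT account.balance FROM depositor, account "
        "WHERE depositor.customer_id = ? "
        "AND depositor.account_number = account.account_number",
        (customer_id,),
    )
)

print(names, balances)
```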

The levels of abstraction apply not only to defining or structuring data, but
also to manipulating data. At the physical level, we must define algorithms
that allow efficient access to data. At higher levels of abstraction, we
emphasize ease of use. The goal is to allow humans to interact efficiently
with the system. The query processor component of the database system
translates DML queries into sequences of actions at the physical level of the
database system.
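
Some systems let the user inspect the physical-level actions chosen for a query. The sketch below assumes SQLite through Python's sqlite3 module and its EXPLAIN QUERY PLAN statement; the index name idx_customer_id is hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id TEXT, customer_name TEXT);
    -- A physical-level access structure; the query text never mentions it.
    CREATE INDEX idx_customer_id ON customer(customer_id);
""")

query = "SELECT customer_name FROM customer WHERE customer_id = ?"

# Ask the query processor how it would execute the declarative query.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("192-83-7465",)).fetchall()

# The last column of each plan row is a human-readable description.
plan_text = " ".join(row[3] for row in plan)
print(plan_text)
```

The user wrote only what data is wanted; the decision to search through the index was made by the system.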

2.4.3 Data Dictionary

We can define a data dictionary as a DBMS component that stores the
definition of data characteristics and relationships. You may recall that
such “data about data” were labeled metadata. The DBMS data dictionary
provides the DBMS with its self-describing characteristic. In effect, the
data dictionary resembles an X-ray of the company’s entire data set, and is
a crucial element in the data administration function.

Two main types of data dictionary exist: integrated and stand-alone.
An integrated data dictionary is included with the DBMS. For example, all
relational DBMSs include a built-in data dictionary or system catalog that
is frequently accessed and updated by the RDBMS. Other DBMSs, especially
older types, do not have a built-in data dictionary; instead, the DBA may
use third-party stand-alone data dictionary systems.

Data dictionaries can also be classified as active or passive. An active
data dictionary is automatically updated by the DBMS with every database
access, thereby keeping its access information up to date. A passive data
dictionary is not updated automatically and usually requires a batch
process to be run. Data dictionary access information is normally used by
the DBMS for query optimization purposes.

The data dictionary’s main function is to store the description of all objects
that interact with the database. Integrated data dictionaries tend to limit
their metadata to the data managed by the DBMS. Stand-alone data
dictionary systems are usually more flexible and allow the DBA to describe
and manage all the organization’s data, whether or not they are
computerized. Whatever the data dictionary’s format, its existence provides
database designers and end users with a much improved ability to
communicate. In addition, the data dictionary is the tool that helps the DBA
to resolve data conflicts.

Although there is no standard format for the information stored in the data
dictionary, several features are common. For example, the data dictionary
typically stores descriptions of all:
• Data elements that are defined in all tables of all databases.
Specifically, the data dictionary stores the names, data types, display
formats, internal storage formats, and validation rules. The data
dictionary tells where an element is used, by whom it is used, and so on.
• Tables defined in all databases. For example, the data dictionary is
likely to store the name of the table creator, the date of creation, access
authorizations, the number of columns, and so on.
• Indexes defined for each database table. For each index the DBMS stores
at least the index name, the attributes used, the location, specific index
characteristics and the creation date.
• Defined databases: who created each database, the date of creation,
where the database is located, who the DBA is, and so on.
• End users and administrators of the database.
• Programs that access the database, including screen formats, report
formats, application formats, SQL queries and so on.
• Access authorizations for all users of all databases.
• Relationships among data elements: which elements are involved,
whether the relationships are mandatory or optional, the connectivity and
cardinality, and so on.

If the data dictionary can be organized to include data external to the
DBMS itself, it becomes an especially flexible tool for more general
corporate resource management. The management of such an extensive data
dictionary thus makes it possible to manage the use and allocation of all
of the organization's information, regardless of whether it has its roots
in the database data. This is why some managers consider the data
dictionary to be the key element of the information resource management
function. This is also why the data dictionary might be described as the
information resource dictionary.

The metadata stored in the data dictionary is often the basis for
monitoring database use and for assigning access rights to database users.
The information in the data dictionary is usually stored in relational
table format, which enables the DBA to query it with SQL commands. For
example, SQL commands can be used to extract information about the users of
a specific table or about the access rights of a particular user.
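
Querying the dictionary can be sketched as follows, assuming SQLite through Python's sqlite3 module and a hypothetical account table and index. SQLite has no user accounts or access rights, so this example reads only table and index metadata, through its PRAGMA table_info and PRAGMA index_list statements; multi-user systems expose user and privilege information through their own catalog tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (account_number CHAR(10), balance INTEGER NOT NULL);
    CREATE INDEX idx_account_balance ON account(balance);
""")

# Column metadata: each row is (cid, name, type, notnull, default, pk).
columns = [
    (name, col_type, bool(notnull))
    for _cid, name, col_type, notnull, _default, _pk in conn.execute(
        "PRAGMA table_info(account)"
    )
]

# Index metadata: the second field of each row is the index name.
indexes = [row[1] for row in conn.execute("PRAGMA index_list(account)")]

print(columns)
print(indexes)
```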
