Epaphras Simango Databases Assignment

MUNHUMUTAPA SCHOOL OF COMMERCE

DEPARTMENT OF ACCOUNTING & INFORMATION SYSTEMS


NAME : EPAPHRAS SIMANGO

REG NUMBER : M101947

PROGRAMME : BCOM HONS INFORMATION SYSTEMS

LEVEL : 1.1 (BLOCK RELEASE)

MODULE TITLE : DATABASE SYSTEMS AND CONCEPTS

MODULE CODE : ISH 116

LECTURER : MR CHIBIDI I

ASSIGNMENT NO: 01

MARK : ……………………………….……………………………….….

LECTURERS’ REMARKS:…………………………………………………………………..

………….……………………………………………………………………………………….

…….…………………………………………………………………………………………….
QUESTION 1

a). With the aid of a diagram, explain the ANSI-SPARC database architecture. How
does this architecture support physical and logical data independence? [9 marks]

The ANSI-SPARC database architecture, also known as the three-schema architecture, was proposed by
the American National Standards Institute (ANSI) Standards Planning and Requirements Committee
(SPARC) to give a conceptual framework for the structure and organization of databases. In 1971, the
DBTG (Database Task Group) recognized the need for a two-level approach with views and a schema;
in 1975, ANSI-SPARC recognized the need for a three-level approach comprising an external, a
conceptual, and an internal level. The three-level architecture aims to separate each user's view of the
database from the way the database is physically represented.

The ANSI-SPARC three-level architecture

     User 1          User 2          User n
       |               |               |
     View 1          View 2          View n          External level
        \              |              /
         +-------------+-------------+
                       |
               Conceptual schema                     Conceptual level
                       |
                Internal schema                      Internal level
                       |
        Physical data organization (database)


There are three levels:

External level:
This level represents each user's perspective on the database. It contains the schemas or views that are
particular to a user and specifies how that user interprets the data, so the data most relevant to the user
is found here. There are many external views of the same database. In the external view only those
entities, attributes, and relationships that the user wants are shown. The same data can also be
represented differently in different views. For example, one user may view a name in the form
(first name, last name), while another may view it as (last name, first name).
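
As a rough illustration of external views, the sketch below uses Python's built-in sqlite3 module. The table person, the views view_first_last and view_last_first, and the sample name are hypothetical and chosen only for this example; they show one stored relation presented to two users in two different forms.

```python
import sqlite3

# In-memory database; all table, view, and data names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT, last_name TEXT)")
conn.execute("INSERT INTO person VALUES ('Tendai', 'Moyo')")

# Two external views over the same stored data.
conn.execute(
    "CREATE VIEW view_first_last AS "
    "SELECT first_name || ' ' || last_name AS name FROM person"
)
conn.execute(
    "CREATE VIEW view_last_first AS "
    "SELECT last_name || ', ' || first_name AS name FROM person"
)

first_last = conn.execute("SELECT name FROM view_first_last").fetchone()[0]
last_first = conn.execute("SELECT name FROM view_last_first").fetchone()[0]
print(first_last)
print(last_first)
```

Neither view stores any data of its own; both are derived from the single person table.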

Conceptual level:
This is the middle level of the three-level architecture. Irrespective of any application, it depicts the
overall logical organisation of the database. It is the community view of the database: it describes what
data is stored and represents the entities, their attributes, and their relationships, together with the
semantic, security, and integrity information about the data. The conceptual schema defines the logical
structure of the entire database, giving the complete view of the data that the organization requires,
independent of any storage consideration.

Internal level:
This level represents the physical data storage and access techniques used by the computer system. It
contains the internal (physical) schema, which specifies how data is stored on and retrieved from disc.
At the internal level, the database is represented physically on the computer. The emphasis is on the
physical implementation: storage space utilization, optimal runtime performance, and data encryption
techniques. It interfaces with the operating system to place data in storage files, allocate storage space,
and retrieve data.

There are two kinds of data independence: physical and logical.

In the 1970s, the Committee on Data Systems Languages (CODASYL) defined three kinds of models
as a way of addressing the problems caused by database design. The Framework for Enterprise
Architecture (John Zachman) also defines several levels of models, including Contextual, Conceptual,
Logical, and Physical, with different meanings from the CODASYL terms.
The Physical Data Model is the concept with the most stable and consistent meaning (CODASYL).
It is a detailed model of the data stored by a computer in the database management system (DBMS), and
it reflects the characteristics, constraints, and conceptual approach of a particular DBMS. Since
most DBMSs are relational, Physical Data Model diagrams usually take the form of Entity Relationship
Diagrams.
Physical data independence refers to the fact that modifications to the physical schema do not
necessitate corresponding modifications to the conceptual or external schemas. This means that
physical storage can be reorganised and optimised for performance without changing the logical view
of the data.

For instance, if a business decides to migrate the database to a new storage system, it can alter the
physical schema to benefit from the features of the new system without impacting the conceptual
schema or the way users view the data.

Hence, physical data independence means the ability to change the schema at the physical level without
affecting the schemas at the conceptual and external levels. For example, if the physical media on which
an employee table is stored is changed, the definition of the employee table and the views built on it
are not affected.
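
Physical data independence can be sketched with Python's built-in sqlite3 module. The column names, sample rows, and the index name idx_employee_dept are assumptions made for this illustration. Adding an index changes how the data is accessed internally, but the query and its result stay the same.

```python
import sqlite3

# All names and rows below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Rooney", "dispatch"), (2, "Kane", "accounts")],
)

query = "SELECT name FROM employee WHERE dept = 'dispatch'"
before = conn.execute(query).fetchall()

# Internal-level change: a new access path, no change to the table definition.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

after = conn.execute(query).fetchall()
print(before, after)
```

The application issues exactly the same SQL before and after the index is created, which is the point of separating the internal level from the conceptual level.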

Logical Data Model: This is a model of information requirements and is better called an Information
Model. An Information Model describes all the information needed to support an enterprise, an
activity, or an algorithm. It represents the net sum of the situational awareness of the practitioners
(human and computer) in a functional area.
An Information Model is a requirements document, and a good information model is broader than
just the data to be included in a computer. Part of the early phases of information system/database
design is to decide what will be supported by each available technology and what will not be
automated at all. The Information Model provides a common reference point for describing these decisions.
Logical data independence means the ability to change the schema at the conceptual level (logical level)
without affecting the schemas at the external level (view level). For example, if an attribute date_of_birth
is added to the employee table, the existing view used by an end user in the dispatch department is not
affected.
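
Logical data independence can be sketched in the same way with Python's sqlite3 module. The view name dispatch_view, the column names, and the sample rows are hypothetical. The conceptual schema gains a date_of_birth attribute, while the external view used by the dispatch department returns the same rows as before.

```python
import sqlite3

# All names and rows below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Rooney", "dispatch"), (2, "Kane", "accounts")],
)

# External schema for the dispatch department.
conn.execute(
    "CREATE VIEW dispatch_view AS "
    "SELECT emp_id, name FROM employee WHERE dept = 'dispatch'"
)
before = conn.execute("SELECT * FROM dispatch_view").fetchall()

# Conceptual-level change: add a new attribute to the base table.
conn.execute("ALTER TABLE employee ADD COLUMN date_of_birth TEXT")

after = conn.execute("SELECT * FROM dispatch_view").fetchall()
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
print(before, after, columns)
```

Because the view names its columns explicitly, the new attribute does not leak into what the dispatch user sees.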

Data Independence and the ANSI-SPARC Three-Level Model

  External schema     External schema     External schema
         \                   |                   /
          External/conceptual mapping                    <- logical data independence
                             |
                     Conceptual schema
                             |
          Conceptual/internal mapping                    <- physical data independence
                             |
                      Internal schema

b) Explain using examples the normalization process up to 3NF. [9 marks]


Database normalization is a technique for organizing the data in a database. It can be considered a
process of analysing the given relation schemas, based on their functional dependencies and primary
keys, to achieve the following properties:
i. Minimizing redundancy
ii. Minimizing insertion, deletion, and update anomalies
iii. Ensuring data is stored in the correct table
It can be viewed as a filtering process that gives the design successively better quality. It is a
multi-step process that puts data into tabular form and removes duplicated data from the relations.
Without normalization, it becomes difficult to maintain and update a database without risking
inconsistencies or data loss. The normal forms up to 3NF are described below:
I. First Normal Form (1NF):

First normal form (1NF) states that the domain of an attribute must include only atomic
values and that the value of any attribute in a tuple must be a single value from the domain of
that attribute. A relation is in 1NF if it contains no non-atomic values and each row is a unique
combination of values. 1NF disallows a set of values, a tuple of values, or a combination of both
as an attribute value for a single tuple. The only attribute values permitted by 1NF are single
atomic values.

Example: Un-normalized table

Student   Age   Subject
Rooney    15    Java, C++
Kane      16    HTML, PHP

Normalized table: no row may have a column in which more than one value is saved; instead, the data
is separated into multiple rows as shown below.

Student   Age   Subject
Rooney    15    Java
Rooney    15    C++
Kane      16    HTML
Kane      16    PHP
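
The conversion to 1NF can be sketched in a few lines of Python. The function name to_1nf and the dictionary keys are assumptions made for this illustration; the data is the student table from the example.

```python
# Un-normalized rows: the subject attribute holds a list of values in one field.
unnormalized = [
    {"student": "Rooney", "age": 15, "subject": "Java, C++"},
    {"student": "Kane", "age": 16, "subject": "HTML, PHP"},
]


def to_1nf(rows):
    """Emit one row per subject so that every attribute value is atomic."""
    result = []
    for row in rows:
        for subject in row["subject"].split(","):
            result.append((row["student"], row["age"], subject.strip()))
    return result


normalized = to_1nf(unnormalized)
for row in normalized:
    print(row)
```

Each output tuple holds exactly one subject, at the cost of repeating the student and age values, which is the redundancy that the later normal forms address.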

II. Second Normal Form (2NF):

A relation is said to be in 2NF if it is already in 1NF and every non-key attribute fully
depends on the primary key of the relation. There must not be any partial dependency of any
column on the primary key. Second normal form (2NF) is based on the concept of full
functional dependency. A functional dependency X -> Y is a full functional dependency if
removal of any attribute A from X means that the dependency no longer holds. A
functional dependency X -> Y is a partial dependency if some attribute A belonging to X can be
removed from X and the dependency still holds.
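
The definitions of full and partial dependency can be tested against sample data with a small Python sketch. The helper names fd_holds and is_partial, the extra hours attribute, and the sample rows are all hypothetical. Note that sample data can only refute a dependency; whether a dependency really holds is a matter of the meaning of the data, not of any particular set of rows.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_partial(rows, lhs, rhs):
    """Return True if dropping some single attribute from lhs still determines rhs."""
    return any(
        fd_holds(rows, [a for a in lhs if a != dropped], rhs) for dropped in lhs
    )


# Hypothetical rows; hours is an invented attribute that needs the whole key.
rows = [
    {"stud_id": 100, "proj_id": "001", "stud_name": "Rooney", "hours": 10},
    {"stud_id": 100, "proj_id": "002", "stud_name": "Rooney", "hours": 5},
    {"stud_id": 200, "proj_id": "002", "stud_name": "Kane", "hours": 8},
]
key = ["stud_id", "proj_id"]

name_is_partial = is_partial(rows, key, ["stud_name"])
hours_is_partial = is_partial(rows, key, ["hours"])
print(name_is_partial, hours_is_partial)
```

In this sample, stud_name is determined by stud_id alone (a partial dependency on the composite key), while hours needs both key attributes (a full dependency).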
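Example:

Student_Project table (primary key: Stud_ID, Proj_ID)

Stud_ID   Proj_ID   Stud_Name   Proj_Name
100       001       Rooney      Cloud
200       002       Kane        Servers

Stud_Name depends only on Stud_ID and Proj_Name depends only on Proj_ID, so both are partially
dependent on the composite key (Stud_ID, Proj_ID).

The above table can be normalized to 2NF as shown below.

Student table in 2NF

Stud_ID   Stud_Name
100       Rooney
200       Kane

Project table in 2NF

Proj_ID   Proj_Name
001       Cloud
002       Servers

Student_Project table in 2NF (records which student works on which project)

Stud_ID   Proj_ID
100       001
200       002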
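
One way to check a 2NF decomposition is to project the original rows onto the smaller relations and join them back. The Python sketch below does this for the Student_Project data; the variable names and the separate link relation assignments are assumptions of this sketch.

```python
# Original rows: (Stud_ID, Proj_ID, Stud_Name, Proj_Name)
student_project = [
    (100, "001", "Rooney", "Cloud"),
    (200, "002", "Kane", "Servers"),
]

# Project onto three relations so each non-key attribute depends on a whole key.
students = sorted({(sid, sname) for sid, _, sname, _ in student_project})
projects = sorted({(pid, pname) for _, pid, _, pname in student_project})
assignments = sorted({(sid, pid) for sid, pid, _, _ in student_project})

# Join the three relations back together.
student_names = dict(students)
project_names = dict(projects)
rejoined = [
    (sid, pid, student_names[sid], project_names[pid]) for sid, pid in assignments
]
print(rejoined)
```

The join reproduces the original rows, so no information is lost, while each student name and project name is now stored only once.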

III. Third Normal Form (3NF):


A relation is said to be in 3NF, if it is already in 2NF and there exists no transitive
dependency in that relation. If a table contains transitive dependency, then it is not in 3NF,
and the table must be split to bring it into 3NF.
What is a transitive dependency?
If A -> B [B depends on A] and B -> C [C depends on B],
then A -> C [C depends on A] can be derived, and C is transitively dependent on A through B.

Example: the table below is not in 3NF

Stud_ID   Stud_Name   City         Zip
100       Rooney      Manchester   4001
200       Kane        Stoke        4002

Stud_ID is the only prime (key) attribute. City can be identified by Stud_ID as well as by Zip.
Zip is not a super key and City is not a prime attribute.
Stud_ID -> Zip -> City, so a transitive dependency exists. The 3NF tables are shown below.

Student_Detail

Stud_ID   Stud_Name   Zip
100       Rooney      4001
200       Kane        4002

Zip_Code

Zip    City
4001   Manchester
4002   Stoke
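
The 3NF design can be sketched with Python's built-in sqlite3 module. The lower-case table and column names and the column types are assumptions made for this illustration. Joining the two tables on zip recovers the original four-column rows.

```python
import sqlite3

# Table names, column names, and types are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE zip_code (zip TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE student_detail (
        stud_id   INTEGER PRIMARY KEY,
        stud_name TEXT,
        zip       TEXT REFERENCES zip_code(zip)
    );
    INSERT INTO zip_code VALUES ('4001', 'Manchester'), ('4002', 'Stoke');
    INSERT INTO student_detail VALUES (100, 'Rooney', '4001'), (200, 'Kane', '4002');
    """
)

# Rebuild the original (Stud_ID, Stud_Name, City, Zip) rows with a join.
rows = conn.execute(
    """
    SELECT s.stud_id, s.stud_name, z.city, s.zip
    FROM student_detail AS s
    JOIN zip_code AS z ON z.zip = s.zip
    ORDER BY s.stud_id
    """
).fetchall()
print(rows)
```

Each city is stored once per zip code, so correcting a city name requires changing a single row in zip_code.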

c). Distinguish between centralized and distributed databases. [4 marks]


Centralized databases: All the data is stored and managed at a single location, typically on a single
server or mainframe. Users access the data through a central database management system (DBMS).
This architecture provides a unified view of the data but may have scalability and performance
limitations.

Distributed databases: The data is spread across multiple interconnected sites or nodes. Each site has
its own local DBMS, and data can be stored and processed locally. The distributed database
architecture offers advantages such as improved scalability, fault tolerance, and reduced network
traffic. However, it introduces challenges related to data consistency, transaction management, and
network communication.

d). Describe the problems of lost update, inconsistent read and phantom phenomenon
which arise as a result of concurrency. [3 marks]

Lost update Problem

The lost update problem arises when two different transactions update the same data item and
one update overwrites the other, so that the overwritten update is lost.

Example: Consider two transactions A and B performing read/write operations on a data item DT
in the database DB. The current value of DT is 1000. The following table shows the
read/write operations in transactions A and B.

Time   A            B
T1     READ(DT)     ------
T2     ------       READ(DT)
T3     DT=DT+500    ------
T4     WRITE(DT)    ------
T5     ------       DT=DT+300
T6     ------       WRITE(DT)

Both transactions initially read the value of DT as 1000. Transaction A changes the value of
DT from 1000 to 1500 and writes it. Transaction B, still working from the value 1000 that it read
earlier, computes 1300 and writes it over A's value. DT ends up as 1300 instead of 1800, and
therefore the update done by transaction A has been lost.
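
A lost update can be reproduced with a deterministic Python sketch of one possible interleaving, in which both transactions read DT before either writes it. The dictionary db and the variables a_local and b_local are stand-ins invented for this illustration, not a real DBMS.

```python
# A dictionary stands in for the database; each transaction has a private copy.
db = {"DT": 1000}

a_local = db["DT"]      # A reads 1000
b_local = db["DT"]      # B reads 1000, before A has written anything
a_local += 500          # A computes 1500
db["DT"] = a_local      # A writes 1500
b_local += 300          # B computes 1300 from its stale copy
db["DT"] = b_local      # B writes 1300 and overwrites A's update

final_value = db["DT"]
serial_value = 1000 + 500 + 300   # result if A and B had run one after the other
print(final_value, serial_value)
```

The final value differs from what any serial execution of the two transactions would produce, which is why a DBMS uses concurrency control such as locking to prevent this interleaving.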

Inconsistent read
When a transaction reads an object x twice and obtains different values, the problem is called an
inconsistent read. It happens because another transaction has modified the value of x between the
two reads. There are two kinds of inconsistent read:
ghost update (a): two transactions access the same objects concurrently and each sees the other's
modifications; note that all the objects are already present in the database;
ghost update (b): one of the two transactions inserts a new object into the database and the other
transaction uses that data.

Phantom Read Problem

In the phantom read problem, the same data is read by two different read operations in the same
transaction. The first read operation obtains a value, but the second read operation returns an
error saying that the data does not exist, because another transaction has deleted it in the meantime.

Example: Consider two transactions A and B performing read/write operations on a data item DT
in the database DB. The current value of DT is 1000. The following table shows the
read/write operations in transactions A and B.

Time   A            B
T1     READ(DT)     ------
T2     ------       READ(DT)
T3     DELETE(DT)   ------
T4     ------       READ(DT)

Transaction B initially reads the value of DT as 1000. Transaction A then deletes DT
from the database DB. When transaction B reads the value again, it gets an error saying
that DT does not exist in the database DB.
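
The same interleaving can be sketched in Python. The dictionary db and the helper read are assumptions made for this illustration; the helper returns None where a real DBMS would report that the item does not exist.

```python
# A dictionary stands in for the database.
db = {"DT": 1000}


def read(store, key):
    """Return the stored value, or None if the item no longer exists."""
    return store.get(key)


a_read = read(db, "DT")        # T1: A reads 1000
first_read = read(db, "DT")    # T2: B reads 1000
del db["DT"]                   # T3: A deletes DT
second_read = read(db, "DT")   # T4: B reads again and finds nothing
print(first_read, second_read)
```

Within a single transaction B, two identical reads give different answers, which is what isolation between concurrent transactions is meant to prevent.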
