Epaphras Simango Databases Assignment
LECTURER : MR CHIBIDI I
ASSIGNMENT NO: 01
MARK : ……………………………….……………………………….….
LECTURERS’ REMARKS:…………………………………………………………………..
………….……………………………………………………………………………………….
…….…………………………………………………………………………………………….
QUESTION 1
a). With the aid of a diagram, explain the ANSI-SPARC database architecture. How
does this architecture support physical and logical data independence? [9 marks]
The ANSI-SPARC database architecture, also known as the three-schema architecture, is named after the
American National Standards Institute (ANSI) Standards Planning and Requirements Committee (SPARC).
It was developed in the 1970s to give a conceptual framework for the structure and organisation of
databases. In 1971, the DBTG (Database Task Group) recognised the need for a two-level approach
with views and a schema, and in 1975 ANSI-SPARC recognised the need for a three-level approach
comprising an external, a conceptual, and an internal level. The three-level architecture aims to
separate each user's view of the database from the way the database is physically represented.
[Diagram: the three levels of the ANSI-SPARC architecture; surviving labels: Internal level, Internal Schema]
External level:
This level represents the user's perspective on the database. It specifies how each user interprets the data
and contains the schemas or views that are particular to that user. The data that matters most to the user
is found at this level. There are many external views of the database at this level, and each one includes
only the data that is relevant to its user.
Conceptual level:
Independently of any application, this level depicts the overall logical organisation of the database. It
includes the conceptual schema and defines the connections between the various data items.
It is the community view of the database: it describes what data is stored in the database and
represents the entities, their attributes, and their relationships, together with the semantic,
security, and integrity information about the data. The conceptual level is the middle level of the
three-level architecture. It contains the logical structure of the entire database and represents the
complete view of the data that the organisation requires, independent of any storage consideration.
Internal level:
This level represents the physical data storage and access techniques used by the computer system.
It includes the physical schema, which specifies how data is stored on and retrieved from disc. At the
internal level, the database is represented physically on the computer. This level is concerned with the
physical implementation of the database: making good use of storage space, achieving optimal runtime
performance, and applying data encryption techniques. It interfaces with the operating system to place
the data in storage files, allocate storage space, and retrieve the data.
The three-level architecture supports two kinds of data independence: physical and logical.
For instance, if a business decides to migrate the database to a new storage system, it can alter the
physical schema to benefit from the features of the new system without impacting the conceptual
schema or the way users view the data.
Hence, physical data independence means the ability to change the schema at the physical (internal)
level without affecting the schemas at the conceptual and external levels. For example, if you change
the physical media on which the employee table is stored, the definition of the employee table and the
applications that use it are not affected.
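The idea can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table layout (emp_id, name, dept), the sample rows, and the index name are assumptions made for illustration only. Adding an index is a change at the internal level, and the query written against the conceptual schema returns the same rows before and after it.

```python
import sqlite3

# In-memory database; the columns and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Rooney", "dispatch"), (2, "Kane", "sales")],
)

query = "SELECT name FROM employee WHERE dept = 'dispatch' ORDER BY name"
before = conn.execute(query).fetchall()

# Internal-level change: add a new access path (an index) for the dept column.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

# The same query, unchanged, still returns the same rows.
after = conn.execute(query).fetchall()
print("same result after physical change:", before == after)
```

The query text never mentions the index, which is the point: the storage structure can change without the conceptual schema or the application changing.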
Logical data model: This is a model of information requirements and is better called an information
model. An information model describes all the information needed to support an enterprise, an
activity, or an algorithm. It represents the sum of the situational awareness of the practitioners
(human and computer) in a functional area.
An Information Model is a requirements document, and a good information model is broader than
just the data to be included in a computer. Part of the early phases of information system/database
design is to decide what will be supported by each available technology and what will not be
automated at all. The Information Model provides a common reference point for describing these decisions.
Logical data independence means the ability to change the schema at the conceptual level (logical level)
without affecting the schemas at the external level (view level). For example, if you add a date of birth
attribute to the employee table, the existing view of the records used by the end users in the dispatch
department is not affected.
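A minimal sketch of this example, again using Python's built-in sqlite3 module. The column names, the view name dispatch_staff, and the sample row are assumptions for illustration. The view plays the role of the external schema for the dispatch department, and it keeps working after a date_of_birth column is added to the employee table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Rooney', 'dispatch')")

# External level: a hypothetical view used by the dispatch department.
conn.execute(
    "CREATE VIEW dispatch_staff AS "
    "SELECT emp_id, name FROM employee WHERE dept = 'dispatch'"
)
view_before = conn.execute("SELECT * FROM dispatch_staff").fetchall()

# Conceptual-level change: add a new attribute to the employee table.
conn.execute("ALTER TABLE employee ADD COLUMN date_of_birth TEXT")

# The external view is unaffected by the new column.
view_after = conn.execute("SELECT * FROM dispatch_staff").fetchall()
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
print(view_before == view_after, columns)
```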
First normal form (1NF) states that the domain of an attribute must include only atomic
values and that the value of any attribute in a tuple must be a single value from the domain of
that attribute. A relation is said to be in 1NF if it contains no non-atomic values and each row
provides a unique combination of values. 1NF disallows a set of values, a tuple of values, or a
combination of both as an attribute value for a single tuple. The only attribute values permitted
by 1NF are single atomic values.
Normalized table: no row may have a column in which more than one value is stored;
instead, the data is separated into multiple rows, as shown below.
Rooney 15 JAVA
Rooney 15 C++
Kane 16 HTML
Kane 16 PHP
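The rows above can be produced from an unnormalized table by repeating the single-valued columns once for each value in the multi-valued column. The sketch below assumes an unnormalized form in which each student row holds a list of subjects; that starting form and the column meanings (name, age, subject) are assumptions for illustration.

```python
# Hypothetical unnormalized rows: the third column holds several values.
unnormalized = [
    ("Rooney", 15, ["JAVA", "C++"]),
    ("Kane", 16, ["HTML", "PHP"]),
]


def to_1nf(rows):
    """Split each multi-valued subject list into one atomic value per row."""
    return [
        (name, age, subject)
        for name, age, subjects in rows
        for subject in subjects
    ]


normalized = to_1nf(unnormalized)
for row in normalized:
    print(*row)
```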
A relation is said to be in 2NF if it is already in 1NF and every non-key attribute fully
depends on the primary key of the relation. There must not be any partial dependency of any
column on the primary key. Second normal form (2NF) is based on the concept of full
functional dependency. A functional dependency X -> Y is a full functional dependency if
removing any attribute A from X means that the dependency no longer holds. A
functional dependency X -> Y is a partial dependency if some attribute A belonging to X can be
removed from X and the dependency still holds.
Example:
Student_Project Table
Proj_ID Proj_Name
001 001
002 Servers
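The definitions of full and partial functional dependency can be checked mechanically against sample data. The sketch below is only an illustration: the helper names fd_holds and is_full_dependency, the composite key (Stud_ID, Proj_ID), and the sample rows (including the project name "Networks") are all hypothetical. A check on sample rows can only show that a dependency is not contradicted by the data; it cannot prove that the dependency holds in general.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_full_dependency(rows, lhs, rhs):
    """Return True if lhs -> rhs holds and no proper subset of lhs determines rhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, subset, rhs):
                return False  # partial dependency found
    return True


# Hypothetical rows with the assumed composite key (Stud_ID, Proj_ID).
rows = [
    {"Stud_ID": 1, "Proj_ID": "001", "Proj_Name": "Networks"},
    {"Stud_ID": 1, "Proj_ID": "002", "Proj_Name": "Servers"},
    {"Stud_ID": 2, "Proj_ID": "002", "Proj_Name": "Servers"},
]

key = ("Stud_ID", "Proj_ID")
print(fd_holds(rows, key, ("Proj_Name",)))                   # dependency holds
print(is_full_dependency(rows, key, ("Proj_Name",)))         # but it is only partial
print(is_full_dependency(rows, ("Proj_ID",), ("Proj_Name",)))
```

In this sample, Proj_Name depends on Proj_ID alone, so it is only partially dependent on the composite key, which is why it belongs in a separate project table keyed by Proj_ID.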
Stud_ID is the only prime (key) attribute. City can be identified by Stud_ID as well as by Zip.
Zip is not a superkey, and City is not a prime attribute.
Stud_ID -> Zip -> City, so there exists a transitive dependency. Hence the 3NF tables are shown below.
Student_Detail
Zip_Code
Zip     City
4001    Manchester
4002    Stoke
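The decomposition can be sketched in a few lines of Python. The original single table, the Stud_Name column, and the student rows are assumptions for illustration; only the Zip and City values come from the table above. The sketch splits the rows into Student_Detail and Zip_Code and then rejoins them on Zip to show that no information is lost.

```python
# Hypothetical table before decomposition (Stud_ID -> Zip -> City).
student = [
    {"Stud_ID": 1, "Stud_Name": "Rooney", "Zip": 4001, "City": "Manchester"},
    {"Stud_ID": 2, "Stud_Name": "Kane", "Zip": 4002, "City": "Stoke"},
    {"Stud_ID": 3, "Stud_Name": "Owen", "Zip": 4001, "City": "Manchester"},
]

# Student_Detail keeps only attributes that depend directly on Stud_ID.
student_detail = [
    {"Stud_ID": r["Stud_ID"], "Stud_Name": r["Stud_Name"], "Zip": r["Zip"]}
    for r in student
]

# Zip_Code stores each Zip -> City fact exactly once.
zip_code = {r["Zip"]: r["City"] for r in student}

# Joining the two tables on Zip reconstructs the original rows.
rejoined = [{**r, "City": zip_code[r["Zip"]]} for r in student_detail]
print(rejoined == student, zip_code)
```

Each city name is now stored once per zip code, so a change of city name needs a single update instead of one update per student.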
Centralized database architecture: All the data is stored and managed at a single location,
typically on a single server or mainframe. Users access the data through a central database
management system (DBMS). This architecture provides a unified view of the data but may have
scalability and performance limitations.
Distributed database architecture: The data is spread across multiple interconnected sites or
nodes. Each site has its own local DBMS, and data can be stored and processed locally. The
distributed database architecture offers advantages such as improved scalability, fault tolerance,
and reduced network traffic. However, it introduces challenges related to data consistency,
transaction management, and network communication.
d). Describe the problems of lost update, inconsistent read and phantom phenomenon
which arise as a result of concurrency. [3 marks]
The lost update problem arises when two different transactions update the same data item and
one update overwrites the other.
Transaction A initially reads the value of DT as 1000. Transaction A changes the value of
DT from 1000 to 1500, and then transaction B changes the value to 1800. When transaction A
reads DT again, it finds 1800, so the update made by transaction A has been lost.
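The scenario can be simulated without real concurrency by writing out one interleaving by hand. The sketch below assumes that both transactions read DT before either writes, and that A adds 500 and B adds 800; those increments are assumptions chosen to reproduce the values 1500 and 1800 used above.

```python
# Interleaved execution: both transactions read before either writes.
db = {"DT": 1000}
a_local = db["DT"]          # A reads 1000
b_local = db["DT"]          # B reads 1000 (stale once A writes)
db["DT"] = a_local + 500    # A writes 1500
db["DT"] = b_local + 800    # B writes 1800, overwriting A's update
interleaved_result = db["DT"]

# Serial execution: B starts only after A has finished.
db = {"DT": 1000}
db["DT"] = db["DT"] + 500   # A: 1000 -> 1500
db["DT"] = db["DT"] + 800   # B: 1500 -> 2300
serial_result = db["DT"]

print(interleaved_result, serial_result)  # 1800 2300
```

The difference between the two results (500) is exactly the update made by transaction A, which the interleaved schedule loses.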
Inconsistent read
When a transaction reads the object x twice and x has different values, the problem is called an
inconsistent read. It happens because another transaction has modified the value of x between
the two reads. There are two kinds of inconsistent read:
ghost update (a): two transactions access the same object concurrently and each sees the
other's modifications; note that all the objects are already present in the database;
ghost update (b): one of the two transactions inserts a new object into the database and the
other transaction uses that data.
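A hand-written interleaving shows the basic case. The object name x comes from the text; the values 100 and 150 and the labels A and B for the two transactions are assumptions for illustration.

```python
db = {"x": 100}

first_read = db["x"]     # transaction A reads x
db["x"] = 150            # transaction B modifies x and commits
second_read = db["x"]    # transaction A reads x again

# A sees two different values for the same object within one transaction.
print(first_read, second_read, first_read != second_read)  # 100 150 True
```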
In the phantom read problem, the same data is read by two different read operations in the same
transaction. The first read operation obtains a value, but the second returns an error saying that
the data does not exist, because another transaction has deleted it in the meantime.
Time A B
T1 READ(DT) ------
T2 ------ READ(DT)
T3 DELETE(DT) ------
T4 ------ READ(DT)
Transaction B initially reads the value of DT as 1000. Transaction A then deletes DT
from the database DB. When transaction B reads DT again, it gets an error saying that
DT does not exist in the database DB.
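The schedule in the table can be replayed step by step. In this sketch the database is a plain dictionary, and a missing item is reported as None instead of an error; both choices are simplifications made for illustration.

```python
db = {"DT": 1000}

a_read = db.get("DT")          # T1: A reads DT
b_first_read = db.get("DT")    # T2: B reads DT and sees 1000
del db["DT"]                   # T3: A deletes DT
b_second_read = db.get("DT")   # T4: B reads DT again; it no longer exists

print(b_first_read, b_second_read)  # 1000 None
```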
References
DBMS database management system tutorialspoint
http://claudiofiandrino.altervista.org/
https://www.javatpoint.com/
https://www.scaler.com/