DBMS Revision Stuff-1
ii) SQL
Allocate 2 marks for a full answer as above, 1 mark for partial answer.
(2 Marks)
iii) Meta-data
• Stores data about the structure of data, relationships between data items, integrity constraints,
names and authorisation privileges (2 Marks)
Allocate 2 marks for a full answer as above, 1 mark for partial answer.
• Oracle
• E-R diagramming
CourseCode
d) Explain the differences between a logical database design and a physical database design.
(4 Marks)
• Logical database design is the process of constructing a business data model. It involves the
• Physical database design involves taking the logical design and fine-tuning it against the usage,
performance and storage requirements of some application. Certain implementation decisions in terms of a chosen
DBMS (and operating system and hardware) have to be made. It is implementation dependent.
e) In the Catering College scenario described at the beginning of the paper, a course contains many modules.
There are at least two tutors teaching each module, each taking one class. Draw an entity relationship model
showing this, resolving any many to many relationships you may find.
(5 Marks)
Course
Module
Class Tutor
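One way to read the four entities above as relational tables is sketched below, using Python's built-in sqlite3 module. All column names and sample values are assumptions made for illustration; they are not part of the scenario. The point is that the many-to-many relationship between Module and Tutor is resolved by the Class entity, which holds a foreign key to each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

conn.executescript("""
CREATE TABLE Course (courseCode TEXT PRIMARY KEY, title TEXT);
CREATE TABLE Module (moduleCode TEXT PRIMARY KEY,
                     courseCode TEXT NOT NULL REFERENCES Course(courseCode));
CREATE TABLE Tutor  (tutorId INTEGER PRIMARY KEY, name TEXT);
-- Class is the link entity: one row per (module, tutor) pairing.
CREATE TABLE Class  (classId INTEGER PRIMARY KEY,
                     moduleCode TEXT NOT NULL REFERENCES Module(moduleCode),
                     tutorId INTEGER NOT NULL REFERENCES Tutor(tutorId));
""")

# Hypothetical sample data: one course, one module, two tutors, two classes.
conn.execute("INSERT INTO Course VALUES ('C1', 'Professional Cookery')")
conn.execute("INSERT INTO Module VALUES ('M1', 'C1')")
conn.executemany("INSERT INTO Tutor VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO Class (moduleCode, tutorId) VALUES (?, ?)",
                 [("M1", 1), ("M1", 2)])

# Which tutors teach module M1? Answered through the Class link table.
tutors_on_m1 = [row[0] for row in conn.execute(
    "SELECT t.name FROM Class c JOIN Tutor t ON t.tutorId = c.tutorId "
    "WHERE c.moduleCode = 'M1' ORDER BY t.name")]

# A class for a module that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO Class (moduleCode, tutorId) VALUES ('M9', 1)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The rule that each module has at least two tutors is a minimum-cardinality constraint that plain foreign keys do not express; it would need a separate check.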
f) Explain why the word ‘Distinct’ may be included in an SQL statement, such as SELECT DISTINCT Name.
(2 Marks)
• SQL does NOT eliminate redundant duplicate rows from the result of a SELECT
unless the user explicitly requests it to do so, using the keyword DISTINCT.
• Also, updates frequently have to be performed across more than one table.
• Thus to re-introduce some controlled redundancy, retrieval performance can be improved at the
• For example, a unary relationship where the employee entity acts in two different ways – as a
Allocate 1 mark for the description and 1 mark for the valid example, maximum 2 marks.
QUESTION 2
inconsistencies between copies of the data items, and cause, at best, confusion
● In an effort to preserve integrity between different copies of the same data item,
any valid changes to one data item must be propagated to the other copies of that
data item, and this requires programs to be written to perform such reconciliation
procedures.
● Duplicated, redundant data leads to the unnecessary and wasteful use of data
storage.
● Manipulation of the data held within one or more files requires several, potentially
query languages like SQL and any GUI-based front ends to generate
● On occasions, similar data requirements may result in a data file being shared
across several applications programs (this approach is, however, more in keeping
with the database approach to data storage). However, if one of the applications
programs requires a change in data file format then this will affect all the other
applications programs using the same file, and require each of the other programs
to be updated so that they can work with the changed file format. This leads to
extra maintenance and programming effort just to enable the other applications
to work as they were before the file format change was made (this type of
Many of the benefits of the database approach are the opposite of the limitations of
approach to the support of the entire data store and its manipulation.
● Greatly enhanced security, due to the DBMS-provided protection (e.g., user ids
etc.) and other associated security features (e.g., physically centralised data
storage means that the security to the live disk media could be isolated to one
● Duplicated, redundant data can be minimised where it was not useful, and more
effectively controlled where it was useful (e.g., as in the case of data warehouse
the database are shielded from those applications that do not require the
changes. Changes to the structure of the data within the database are easier, as
● The pooling of the data and the provision of the DBMS to support and maintain
that database in a uniform manner provides the ideal environment for the provision
of standard high level (declarative) query languages like SQL to be made
available. This means that data manipulation is more intuitive and less ‘low-level’ in
● Data integrity can be considered at the database level, and therefore any
applications that use this database do not have to implement the integrity rules
within their code. This also lessens the burden of applications development, and
c) The data in the system will undergo the process of normalization. Explain the reasoning behind carrying out a
normalization process. (2 Marks)
• minimise redundancy
• promotes data integrity
2 points, 1 mark each
A relation is in first normal form if and only if every non-key attribute is functionally dependent
A relation is in second normal form if and only if it is in first normal form and every non-key
A relation is in third normal form if and only if it is in second normal form and every non-key
Consider the case study of an insurance company. A basic part of the database of the insurance
a) Create two SQL data structures (tables) for the above part of the insurance company database.
(8 Marks)
Allocate 4 marks for each data structure. Sample data structures are:
startDate DATE,
premium DECIMAL(8,2),
renewalDate DATE,
policyType CHARACTER(10))
holderName CHARACTER(20),
holderAddress CHARACTER(50),
holderTelNo CHARACTER(12))
d) Write an SQL statement to count the Type of Policies that are there. (3 Marks)
SELECT policyType, COUNT(*)
FROM Policies
GROUP BY policyType
e) Write an SQL statement to remove the Policies table from the database. (3 Marks)
DROP TABLE Policies
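The statements in this question can be tried out with a small sketch such as the one below, run through Python's built-in sqlite3 module. The table name PolicyHolders, the key columns policyNo and holderNo, and the sample rows are assumptions; only the remaining column definitions come from the sample data structures above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PolicyHolders (
    holderNo      INTEGER PRIMARY KEY,   -- assumed key, not given in the notes
    holderName    CHARACTER(20),
    holderAddress CHARACTER(50),
    holderTelNo   CHARACTER(12));
CREATE TABLE Policies (
    policyNo    INTEGER PRIMARY KEY,     -- assumed key
    holderNo    INTEGER REFERENCES PolicyHolders(holderNo),  -- assumed link
    startDate   DATE,
    premium     DECIMAL(8,2),
    renewalDate DATE,
    policyType  CHARACTER(10));
""")
conn.executemany(
    "INSERT INTO Policies (policyNo, policyType) VALUES (?, ?)",
    [(1, "Home"), (2, "Motor"), (3, "Motor")])

# d) number of policies of each type
per_type = conn.execute(
    "SELECT policyType, COUNT(*) FROM Policies "
    "GROUP BY policyType ORDER BY policyType").fetchall()

# Grouping by the key instead gives one group per policy, each of size 1.
per_policy = conn.execute(
    "SELECT COUNT(*) FROM Policies GROUP BY policyNo").fetchall()

# e) DROP TABLE removes the table itself, not just its rows.
conn.execute("DROP TABLE Policies")
remaining = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
```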
QUESTION 4
foreign key (FK) values. The FK value must match an appropriate PK value. Where
c) In terms of the DBMS interface (sub-language), expand the following terms and state what they are used for:
i) DDL (3 Marks)
DDL – data definition language, used to create, delete and amend the data
DCL – data control language, used by the DBA to define authorised users and
d) Briefly describe the difference between a database and a database management system.
(5 Marks)
A DBMS is a suite of computer software providing the interface between users and one or more databases.
QUESTION 5
a) Define the term normalisation. Explain why normalisation is performed during the design of a relational
database. (6 Marks)
Award 2 marks for definition. Indicative answer:
• Normalisation is the process of identifying the logical associations between data items and
• Looking for understanding that normalisation is a bottom-up design technique of data analysis.
• The aim is to eliminate certain update, insertion and deletion anomalies from the design of a
relational database.
• Repeating groups and non-functional dependencies are eliminated to produce logical tables
which can be mapped to relations.
Allows flexibility
d) Removing repeating groups from data items converts to First Normal Form. Describe the process of converting
data already in First Normal Form to Second Normal Form.
(4 Marks)
• The conversion to Second Normal Form is concerned with relations with more than one attribute in the key
• The question to be asked is, for each non-key attribute, ‘Is it dependent on the whole key or only
• Those non-key attributes, which are only dependent on part of the key can be removed to form a
separate relation
• Sometimes the question cannot be answered in a clear-cut manner, and the users’ advice must be sought before
proceeding
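The conversion can be illustrated with a small sketch, run here through Python's built-in sqlite3 module. The Enrolment relation, its columns and its rows are hypothetical. The key is (studentNo, moduleCode); moduleTitle depends only on moduleCode, so it is moved into a separate relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 1NF relation with composite key (studentNo, moduleCode).
-- moduleTitle depends only on moduleCode: a partial dependency.
CREATE TABLE Enrolment1NF (
    studentNo INTEGER, moduleCode TEXT, moduleTitle TEXT, grade TEXT,
    PRIMARY KEY (studentNo, moduleCode));
INSERT INTO Enrolment1NF VALUES
    (1, 'DB1', 'Databases', 'A'),
    (2, 'DB1', 'Databases', 'B'),
    (1, 'NW1', 'Networks',  'C');

-- 2NF: the partially dependent attribute moves to its own relation.
-- (CREATE TABLE ... AS copies data only; keys would be declared separately.)
CREATE TABLE Module AS
    SELECT DISTINCT moduleCode, moduleTitle FROM Enrolment1NF;
CREATE TABLE Enrolment AS
    SELECT studentNo, moduleCode, grade FROM Enrolment1NF;
""")

modules = conn.execute("SELECT * FROM Module ORDER BY moduleCode").fetchall()

# Joining the two relations reconstructs the original rows, so nothing is lost.
rejoined = conn.execute("""
    SELECT e.studentNo, e.moduleCode, m.moduleTitle, e.grade
    FROM Enrolment e JOIN Module m ON m.moduleCode = e.moduleCode
    ORDER BY e.studentNo, e.moduleCode""").fetchall()
original = conn.execute(
    "SELECT * FROM Enrolment1NF ORDER BY studentNo, moduleCode").fetchall()
```

After the split, each module title is stored once rather than once per enrolled student.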
Explain how entity relation diagrams can assist in the development of a database system.
(4 Marks)
Allocate 1 mark for the definition and 1 mark for a suitable example from the scenario, maximum
2 marks:
• An optional relationship is a relationship between two entities, the relationship may not always exist
• For example, a unary relationship where the employee entity acts in two different ways – as a manager and as a
subordinate.
Allocate 1 mark for the description and 1 mark for the valid example, maximum 2 marks.
QUESTION 1
A distributed database is a logically related set of databases physically distributed at different sites but
connected by a network.
the enterprise.
b) Describe the THREE major benefits that a data warehouse is seen to deliver for organisations. (3 Marks)
Award 1 mark for each correct benefit description, up to a maximum of 3 marks. An indicative answer
is provided below:
• A data warehouse provides a single manageable structure for decision support data.
• A data warehouse enables organisational users to run complex queries on data that traverses a
• It typically holds data for one business area rather than the whole organisation.
• Events that cause changes in a database from one consistent state to another consistent state.
iii) QBE
• Query-by-example (2 Marks)
• A forms based retrieval interface
• The process by which simultaneous access to a database is enabled for multiple users
e) Briefly describe FOUR non-computer-based measures a data administrator may use to counter threats to the
security of a database. (4 Marks)
Award 1 mark for each point up to a maximum of 4 marks:
• End-users. Most users in organisations will be unlikely to know they are using a database system. This is
because they will be using a database system indirectly through some ICT system. Some sophisticated
end-users will be able to access database systems directly by employing end-user tools such as visual
query interfaces.
maintaining databases for various applications. Hence, most DBMS provide a range of tools for DBAs,
particularly in the area of data control.
• System developers. Developers of ICT systems will need to integrate database systems with the wider
functions of the ICT system. Various tools such as application programming interfaces are available for this
purpose.
SECTION B: ANSWER ANY TWO QUESTIONS {20 MARKS EACH}.
QUESTION 2
a) The four basic properties of a transaction are often described by the acronym ‘ACID’.
List and explain what each of the letters in ACID stands for. (8 marks)
state to another.
b) Explain, with supporting examples, the problems that can occur when
executing transactions in a concurrent environment. (12 Marks)
The first transaction reads several values from a database but a second
transaction updates some of them during the execution of the first. As a result
the values, which are interdependent, are inconsistent in the eyes of the first
transaction.
QUESTION 3
A distributed database system is a database system which is fragmented or replicated on the various
configurations of hardware and software, located usually at different geographical sites within an
organisation.
advantages are:
• Greater control. Greater control over data may occur if data is devolved to the places where it is needed
• Greater reliability. Keeping replicas of data may increase the reliability of systems. Siting data where it is
needed increases a system’s availability
• Better performance. Performance can actually increase if distribution is judiciously applied. If most queries
go to a local small database, rather than a large central one, then both update and retrieval access are likely
to be improved
• Easier growth. A distributed environment may enhance the organisation’s ability to expand its data
infrastructure
c) Discuss the THREE (3) types of transparency that a distributed database system should ideally display.
(9 Marks)
Award32 marks for correct discussion of location transparency, 2 marks for correct discussion of replication
transparency and 1 mark for correct discussion of fragmentation transparency.
• Location transparency - users do not need to be aware at what sites data is located. Location independence
is desirable because it simplifies user programs and interface activities. Data can migrate from site to site
without invalidating any of those programs or activities. Data may be migrated around the network in
response to changing usage or performance requirements.
• Replication transparency - users should not need to be aware of how data is replicated.
Replication is desirable because performance is better if applications can operate on local copies and
availability is better so long as at least one copy remains available for retrieval purposes.
QUESTION 4
a) A Database Management System (DBMS) normally provides four facilities in order to deal with recovery. List
these facilities and briefly describe the purpose of each facility. (8 marks)
can be used in conjunction with the logging facilities mentioned below, whenever
● Logging facilities: keeps track of the current state of transactions and database
CPU failure.
Computer-Based:
• Implementing suitable authorisation in relation to an operating system on which the database system runs.
• Implementing authorisation strategies that grant privileges to certain users and groups to access certain
database objects via the chosen DBMS.
• Setting up views into the database and granting access to such views to users and groups.
QUESTION 5
systems:
DBMS.
Authentic users will have a user account and also know the password for their
account.
Once we have determined who the authorised users are, we need to set out
access privileges for these users. These define which data a user has access to,
and the nature of that access (i.e., does s/he just have read access or has s/he
not want every user accessing ‘sensitive’ data. Only those users who need the
confidential data to carry out their job will be given access to this data. We can
use views to make sure that users only get access to the data that they need to
carry out their jobs. A user view is a window to the database. The view definition
determines the size of the window, i.e. how much data the user can access.
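The "window" idea can be sketched as follows, using Python's built-in sqlite3 module. The Student table, the LecturerView name and the feesOwed column are hypothetical. SQLite has no user accounts, so the step of granting a user access to the view rather than to the base table is not shown; in a multi-user DBMS that step is what makes the view a security mechanism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (studentNo INTEGER PRIMARY KEY, name TEXT,
                      course TEXT, feesOwed REAL);
INSERT INTO Student VALUES (1, 'Ann', 'Catering', 250.0),
                           (2, 'Bob', 'Catering', 0.0);
-- The view is the 'window': it omits the sensitive feesOwed column.
CREATE VIEW LecturerView AS SELECT studentNo, name, course FROM Student;
""")

cur = conn.execute("SELECT * FROM LecturerView ORDER BY studentNo")
view_columns = [d[0] for d in cur.description]  # columns visible through the view
view_rows = cur.fetchall()
```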
● University Administrators
● Students
● Lecturers
this situation to ensure the security and integrity of the data held within the database
system. (8 marks)
● The DBA needs to create accounts for all of the users and set up password
● The DBA would create views for each user/user group to help determine the data
● The DBA would need to assign system and object privileges to each user. This
can be done using roles. So for example, we could define a role for the Admin
staff. Then each admin person can be granted that role rather than granting many
● The DBA can ensure integrity constraints are set up appropriately for entity
The above list is not exhaustive and other points can also be added.
c) A number of functions are usually provided to enable the DBA to specify precise usage profiles for a given
database and DBMS. Discuss THREE (3) of these functions. (6 Marks)
Allocate 2 marks for each correct discussion up to a maximum of 6. Indicative answers:
Most DBMS now enable the DBA to enrol new users onto a database. Users are normally assigned unique
user names. Some DBMS also allow the DBA to create user groups or ‘roles’ with defined names. This
facility enables the DBA to specify a common profile for a collection of similar users.
Given that users or user groups have been specified in terms of user name, a number of properties can be
attached to each user name or group.
One of the most basic properties is the definition of a password against a user or user group. This enables
some basic security to be built into a database system.
Individual user passwords may be changed by the respective user at any time. Role passwords are likely to
be changed by the DBA or changed by a delegated member of the user group at periodic intervals.
This will normally be done using the SQL grant options below:
• Granting the capability to insert data into specific tables or views to specific users or user groups
• Granting the capability to delete data from specific tables or views to specific users or user groups
• Granting the capability to update data in specific tables or views to specific users or user groups
• Granting the capability to retrieve data from specific tables or views to specific users or user groups
As well as being able to grant these functions to specific users or user groups, the DBA is able to revoke
such privileges from specific users or groups.
The commands for granting access to data are relatively well-standardised. The commands for
granting access to DBMS facilities vary among vendors. Generally the DBMS will assign one or more levels
of DBMS privileges to a specific user or user group. For instance, the DBMS may distinguish between users
in terms of:
Generally, end-users will be assigned a level which permits them to access data in existing tables. In
contrast, application developers will need the ability to create their own tables, views and indexes. At the
highest level, one or more DBAs will be able to act as superusers in terms of a given DBMS.
a) Explain the following concepts as applied to databases {5 Marks}
(i) Schema:
(ii) Data Dictionary (encyclopedia):
(iii) DDL (Data Definition Language):
(iv) DML (Data Manipulation Language):
(v) DBMS (Database Management System):
Data Dictionary (encyclopedia): holds the entire information about the database, including
the tables, relations, design comments.
DDL (Data Definition Language): Used to generate the schema. It can be created via the GUI
interfaces in Access or SQL commands.
DML (Data Manipulation Language): the SQL commands for processing data.
A trigger is a statement that the system executes automatically as a side effect of a modification
to the database.
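A minimal sketch of a trigger, run through Python's built-in sqlite3 module, is given below. The Account and AuditLog tables and the trigger name are hypothetical, and the trigger syntax shown is SQLite's; other DBMSs differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account  (accNo INTEGER PRIMARY KEY, balance REAL);
CREATE TABLE AuditLog (accNo INTEGER, oldBalance REAL, newBalance REAL);

-- Fires automatically after every update of Account.balance.
CREATE TRIGGER log_balance_change
AFTER UPDATE OF balance ON Account
BEGIN
    INSERT INTO AuditLog VALUES (OLD.accNo, OLD.balance, NEW.balance);
END;

INSERT INTO Account VALUES (1, 100.0);
""")

# The application only updates Account; the audit row is a side effect.
conn.execute("UPDATE Account SET balance = balance - 30 WHERE accNo = 1")
audit_rows = conn.execute("SELECT * FROM AuditLog").fetchall()
```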
Data mining involves the use of sophisticated data analysis tools to discover
previously unknown, valid patterns and relationships in large data sets. These tools can include
statistical models, mathematical algorithms, and machine learning methods (algorithms that
improve their performance automatically through experience, such as neural networks or
decision trees). Consequently, data mining consists of more than collecting and managing data,
it also includes analysis and prediction.
CREATE SCHEMA – used to define that portion of the database that a particular user owns. Schemas
are dependent on a catalog, and contain schema objects including tables, views, domains and
constraints etc
CREATE TABLE – Defines a new table and its columns. Tables are dependent on the schema. A table
can be a base table or a derived table. A base table is created by executing an SQL statement
that creates a TABLE (rather than a query)
CREATE VIEW – Defines a logical table from one or more tables or views.
e) (i) Describe briefly three problems that make it necessary to embrace concurrency control
during transaction processing. {6 Marks}
• The lost update problem: A second transaction writes a second value of a data-item
(datum) on top of a first value written by a first concurrent transaction, and the first
value is lost to other transactions running concurrently which need, by their
precedence, to read the first value. The transactions that have read the wrong value end
with incorrect results.
• The dirty read problem: Transactions read a value written by a transaction that has been
later aborted. This value disappears from the database upon abort, and should not have
been read by any transaction ("dirty read"). The reading transactions end with incorrect
results.
• The incorrect summary problem: While one transaction takes a summary over the
values of all the instances of a repeated data-item, a second transaction updates some
instances of that data-item. The resulting summary does not reflect a correct result for
any (usually needed for correctness) precedence order between the two transactions (if
one is executed before the other), but rather some random result, depending on the
timing of the updates, and whether certain update results have been included in the
summary or not.
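The lost update problem can be reproduced with a small deterministic simulation. The sketch below is plain Python rather than a real DBMS: the transaction names T1 and T2, the amounts and the run helper are all hypothetical. Each transaction reads the balance into a private copy and later writes back that copy plus its own change.

```python
def run(schedule, balance=100):
    """Replay read/write steps of two transactions against one data item.

    Each step is (transaction, action). A "read" copies the current balance
    into the transaction's private workspace; a "write" stores that private
    copy plus the transaction's delta back into the shared balance.
    """
    deltas = {"T1": +50, "T2": -30}
    local = {}
    for txn, action in schedule:
        if action == "read":
            local[txn] = balance
        else:  # "write"
            balance = local[txn] + deltas[txn]
    return balance

# Serial execution: T2 sees the value written by T1.
serial = run([("T1", "read"), ("T1", "write"),
              ("T2", "read"), ("T2", "write")])

# Interleaved execution: both read 100, then T2 overwrites T1's write.
interleaved = run([("T1", "read"), ("T2", "read"),
                   ("T1", "write"), ("T2", "write")])
```

In the interleaved schedule the final balance is 70 instead of 120: the +50 written by T1 has been lost.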
(ii) Discuss Jobs in Database Area {4 Marks}
A data warehouse is a place where data is stored for archival, analysis and security
purposes. Usually a data warehouse is either a single computer or many computers
(servers) tied together to create one giant computer system.
Data can consist of raw data or formatted data. It can be on various types of topics
including organization's sales, salaries, operational data, summaries of data including
reports, copies of data, human resource data, inventory data, external data to provide
simulations and analysis, etc.
Besides being a storehouse for large amounts of data, data warehouses must have systems in place
that make it easy to access the data and use it in day-to-day operations. A data warehouse
is sometimes said to be a major role player in a decision support system (DSS). DSS is a
technique used by organizations to come up with facts, trends or relationships that can
help them make effective decisions or create effective strategies to accomplish their
organizational goals.
Data warehousing is combining data from multiple and usually varied sources into one
comprehensive and easily manipulated database. Common accessing systems of data
warehousing include queries, analysis and reporting. Because data warehousing creates
one database in the end, the number of sources can be anything you want it to be,
provided that the system can handle the volume, of course. The final result, however, is
homogeneous data, which can be more easily manipulated.
(ii) Explain four types of data warehouses {8 Marks}
Offline Operational Data Warehouses are data warehouses where data is usually
copied and pasted from real time data networks into an offline system where it can be
used. It is usually the simplest and least technical type of data warehouse.
Offline Data Warehouses are data warehouses that are updated frequently, daily, weekly
or monthly and that data is then stored in an integrated structure, where others can access
it and perform reporting.
Real Time Data Warehouses are data warehouses that are updated continuously as new data
arrives. For instance, a Real Time Data Warehouse might incorporate data
from a Point of Sale system and be updated with each sale that is made.
Integrated Data Warehouses are data warehouses that other systems can access for
operational purposes. Some Integrated Data Warehouses are used by other
data warehouses, which access them to process reports as well as to look up
current data.
While data mining products can be very powerful tools, they are not self sufficient
applications. To be successful, data mining requires skilled technical and analytical specialists who can
structure the analysis and interpret the output that is created. Consequently, the limitations of data
mining are primarily data or personnel related, rather than technology-related.
Although data mining can help reveal patterns and relationships, it does not tell
the user the value or significance of these patterns. These types of determinations
must be made by the user. Similarly, the validity of the patterns discovered is
assess the validity of a data mining application designed to identify potential terrorist
suspects in a large pool of individuals, the user may test the model using data that
a particular profile, it does not necessarily mean that the application will identify a
Another limitation of data mining is that while it can identify connections between behaviors and/or
variables, it does not necessarily identify a causal relationship. For example, an application may identify
that a pattern of behavior, such as the propensity to purchase airline tickets just shortly before the flight
is scheduled to depart, is related to characteristics such as income, level of education, and Internet use.
However, that does not necessarily indicate that the ticket purchasing behavior is caused by one or
more of these variables. In fact, the individual’s behavior could be affected by some additional
variable(s) such as occupation (the need to make trips on short notice), family status (a sick relative
needing care), or a hobby (taking advantage of last minute discounts to visit new destinations).
• Users work with a subset of the whole data, which improves performance through faster
delivery of data.
• Data can be processed in several locations, which also improves performance.
• Workload is balanced between the underlying systems.
• Availability increases because a distributed environment has no single point of
failure.
The relational model restricts itself to homogeneous (only one record type) sequential sets. The virtue of
this approach is its simplicity and the ability to define operators that “distribute” over the set, applying
uniformly to each record of the set. Since much of data processing involves repetitive operations on
large volumes of data, this distributive property provides a concise language to express such algorithms.
To give an example of this, a “relational” program to find all overdue accounts in an invoice file might
be:
SELECT ACCOUNT_NUMBER
FROM INVOICE
WHERE DUE_DATE<TODAY;
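A runnable version of this query is sketched below using Python's built-in sqlite3 module. The sample invoices are made up, and TODAY is replaced by a fixed date passed as a parameter so that the result is reproducible; dates are stored as ISO-8601 text, which compares correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE INVOICE (ACCOUNT_NUMBER INTEGER, DUE_DATE TEXT)")
conn.executemany(
    "INSERT INTO INVOICE VALUES (?, ?)",
    [(101, "2024-01-15"), (102, "2024-03-01"), (103, "2024-02-10")])

today = "2024-02-15"  # stands in for TODAY in the query above

# The WHERE clause is applied uniformly to every record of the set.
overdue = [row[0] for row in conn.execute(
    "SELECT ACCOUNT_NUMBER FROM INVOICE WHERE DUE_DATE < ? "
    "ORDER BY ACCOUNT_NUMBER", (today,))]
```

No loop over records appears in the query itself, which is the "distributive" property the paragraph describes.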
Hierarchical models use parent-child sets in a stylized way to produce a forest (collection of trees) of
records. A typical application might use the three record types: LOCATIONS, ACCOUNTS, and INVOICES
and two parent-child sets to construct the following hierarchy: All the accounts at a location are
clustered together and all outstanding invoices of an account are clustered with the account. That is, a
location has its accounts as children and an account has its invoices as children. This may be depicted
schematically by:
+-----------+
| LOCATIONS |
+-----------+
      |
      |
+-----------+
| ACCOUNTS  |
+-----------+
      |
      |
+-----------+
| INVOICES  |
+-----------+
This structure has the advantage that records used together may appear clustered together in
physical storage and that information common to all the children can be factored into the parent
record. Also, one may quickly find the first record under a parent and deduce when the last has
been seen without scanning the rest of the database.
Not all problems conveniently fit a hierarchical model. If nothing else, different users may want to see
the same information in a different hierarchy. For example an application might want to see the
hierarchy “upside-down” with invoice at the top and location at the bottom. Support for logical
hierarchies (views) requires that the data management system support a general network. The efficient
implementation of certain relational operators (sort-merge or join) also require parent-child sets and so
require the full capability of the network data model.
The general statement is that if all relationships are nested one-to-many mappings then the data can be
expressed as a hierarchy. If there are many-to-many mappings then a network is required. To consider a
specific example of the need for networks, imagine that several locations may service the same account
and that each location services several accounts. Then the hierarchy introduced in the previous section
would require either that locations be subsidiary to accounts and be duplicated or that the accounts
record be duplicated in the hierarchy under the two locations. This will give rise to complexities about
the account having two balances.....
+----------+      +----------+
| LOCATION |      | LOCATION |
+----------+      +----------+
   |   |             |    |
   +---)-------------+    |
   |   |                  |
   |   +------------------+
   |                      |
   V                      V
+----------+      +----------+
| ACCOUNT  |      | ACCOUNT  |
+----------+      +----------+
   |                      |
   |                      |
   V                      V
+----------+      +----------+
| INVOICE  |      | INVOICE  |
+----------+      +----------+
A network built out of two parent-child sets.
QUESTION 4 {20 Marks}
Atomicity
All changes to data are performed as if they are a single operation. That is, all the changes are
performed, or none of them are.
For example, in an application that transfers funds from one account to another, the atomicity
property ensures that, if a debit is made successfully from one account, the corresponding credit
is made to the other account.
Consistency
For example, in an application that transfers funds from one account to another, the consistency
property ensures that the total value of funds in both the accounts is the same at the start and
end of each transaction.
Isolation
For example, in an application that transfers funds from one account to another, the isolation
property ensures that another transaction sees the transferred funds in one account or the
other, but not in both, nor in neither.
Durability
After a transaction successfully completes, changes to data persist and are not undone, even in
the event of a system failure.
For example, in an application that transfers funds from one account to another, the durability
property ensures that the changes made to each account will not be reversed.
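The funds-transfer example used for atomicity and consistency can be sketched with Python's built-in sqlite3 module. The Account table, the CHECK constraint (no negative balances) and the transfer helper are assumptions made for illustration. The credit is deliberately issued before the debit, so that a failing debit shows the earlier credit being rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (accNo INTEGER PRIMARY KEY, "
             "balance REAL CHECK (balance >= 0))")
conn.executemany("INSERT INTO Account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()

def transfer(amount, src, dst):
    """Credit dst and debit src as one unit; undo both if either fails."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute("UPDATE Account SET balance = balance + ? "
                         "WHERE accNo = ?", (amount, dst))
            conn.execute("UPDATE Account SET balance = balance - ? "
                         "WHERE accNo = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return conn.execute(
        "SELECT balance FROM Account ORDER BY accNo").fetchall()

ok = transfer(30.0, 1, 2)        # succeeds: both updates are committed
after_ok = balances()
failed = transfer(500.0, 1, 2)   # debit violates the CHECK: credit is undone too
after_failed = balances()
```

The total across both accounts is 150 before and after each call, which is the consistency property in the example above.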
Locking (e.g., Two-phase locking - 2PL) - Controlling access to data by locks assigned to the data.
Access of a transaction to a data item (database object) locked by another transaction may be
blocked (depending on lock type and access operation type) until lock release.
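The blocking behaviour described above can be sketched with a single exclusive lock in plain Python threads. This is only an analogy: it shows one lock guarding one data item, not a full two-phase locking protocol, and the account dictionary and deposit function are hypothetical.

```python
import threading

account = {"balance": 0}
balance_lock = threading.Lock()  # exclusive lock on the single data item

def deposit(times):
    """Add 1 to the balance `times` times, holding the lock for each update."""
    for _ in range(times):
        with balance_lock:                  # blocks while another thread holds it
            current = account["balance"]    # read
            account["balance"] = current + 1  # write; lock released on exit

threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_balance = account["balance"]
```

Because each read-then-write pair runs while the lock is held, no update can be overwritten by a concurrent one, and the final balance equals the number of deposits made.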
The database administrator (DBA) performs a critical role within an organization and is a key
role in database management. The major responsibility of a database
administrator is to develop and maintain the organization's database. The database
administrator is responsible for defining the internal layout of
the database and ensuring that this layout optimizes system performance.
The database administrator has full access to all types of important data in an organization.
The database administrator decides what data will be stored in the database and how to
organize it so that it can be accessed easily whenever the organization needs it.
To design the database of an organization, the database administrator must
meet with users and determine their requirements.
The database administrator is also responsible for preparing documentation, including recording
the procedures, standards, guidelines, and data descriptions necessary for the efficient and
continuing use of the database environment. Documents should include materials to help end
users, database application programmers, the operation staff, and all personnel connected with
the database management system.
The database administrator is responsible for monitoring the database environment, such as
seeing that the database is meeting performance standards, making sure the accuracy, integrity,
and security of data are maintained.
The database administrator is also responsible for managing any enhancements to the database
environment.
The database administrator (DBA) is responsible for the design, operation, and management of
the database. He or she must be technically competent, a good manager, and a skilled diplomat, and
must possess excellent communication skills. Management skills are required to plan, coordinate and
carry out a multitude of tasks during all phases of the database project and to supervise staff.
Technical skills are needed because the DBA has to be able to understand the complex
hardware and software issues involved and to work with system and application experts in
solving problems.
Diplomatic skills are used to communicate with users and determine their needs, to negotiate
agreements on data definitions and database access rights, to secure agreements on changes to
the database structure or operations that affect users, and to mediate between users with
conflicting requirements. Excellent communication skills are required for all these activities.
Communication and diplomatic skills are also needed when dealing with customers:
customers are sometimes unable to state their requirements accurately, so the DBA
has to communicate with them in a way that draws out exactly what they need.
c) Use the SQL CREATE command to create the tables for the ERD defining the relations
between CUSTOMER, ORDER, ORDER_LINE and PRODUCT.
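The ERD itself is not reproduced in these notes, so the following is only a sketch of what such CREATE statements might look like, run through Python's built-in sqlite3 module. Every column name and data type is an assumption. It assumes the usual reading of these four entities: a customer places many orders, and ORDER_LINE resolves the many-to-many relationship between ORDER and PRODUCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

conn.executescript("""
CREATE TABLE CUSTOMER (customer_id INTEGER PRIMARY KEY,
                       name TEXT NOT NULL);
CREATE TABLE PRODUCT  (product_id INTEGER PRIMARY KEY,
                       description TEXT,
                       unit_price DECIMAL(8,2));
-- ORDER is a reserved word in SQL, so the table name must be quoted.
CREATE TABLE "ORDER"  (order_id INTEGER PRIMARY KEY,
                       order_date DATE,
                       customer_id INTEGER NOT NULL
                           REFERENCES CUSTOMER(customer_id));
-- ORDER_LINE links each order to the products it contains.
CREATE TABLE ORDER_LINE (
    order_id   INTEGER NOT NULL REFERENCES "ORDER"(order_id),
    product_id INTEGER NOT NULL REFERENCES PRODUCT(product_id),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id));
""")

tables = sorted(r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
```

Parent tables are created before the tables that reference them, which is the order most DBMSs require when foreign keys are declared inline.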
i. Pessimistic Locking
i. Excessive privileges
A privilege is a right to run a particular type of SQL statement or to access another user's object.
When users (or applications) are granted database privileges that exceed the requirements of
their job function, these privileges may be used to gain access to confidential information. For
example, a university administrator whose job requires read-only access to student records may
take advantage of excessive update privileges to change grades.
The solution to this problem (besides good hiring policies) is query-level access control. Query-
level access control restricts privileges to minimum-required operations and data. Most native
database security platforms offer some of these capabilities (triggers, RLS, and so on), but the
manual design of these tools makes them impractical in all but the most limited deployments.
Users may abuse legitimate data access privileges for unauthorized purposes. For example, a
user with privileges to view individual patient records via a custom healthcare application client
may abuse that privilege to retrieve all patient records via a MS-Excel client.
The solution is access control policies that apply not only to what data is accessible, but how data
is accessed. By enforcing policies for time of day, location, application client and volume of
data retrieved, it is possible to identify users who are abusing access privileges.
Privilege elevation exploits can be defeated with a combination of query-level access control and
traditional intrusion prevention systems (IPS). Query-level access control can detect a user who
suddenly uses an unusual SQL operation, while an IPS can identify a specific documented threat
within the operation.
Weak audit policy and technology represent risks in terms of compliance, deterrence, detection,
forensics and recovery.
Unfortunately, native database management system (DBMS) audit capabilities result in
unacceptable performance degradation and are vulnerable to privilege-related attacks -- i.e.
developers or database administrators (DBAs) can turn off auditing.
Most DBMS audit solutions also lack necessary granularity. For example, DBMS products rarely
log what application was used to access the database, the source IP addresses and failed queries.
v. Denial of service
Denial of service (DoS) may be invoked through many techniques. Common DoS techniques
include buffer overflows, data corruption, network flooding and resource consumption. The latter
is unique to the database environment and frequently overlooked.
DoS prevention should occur at multiple layers including the network, applications and
databases.
c) Explain the difference between data mining and data warehousing and their importance to an
organization (4 Marks)
Data mining (sometimes called data or knowledge discovery) is the process of analyzing data
from different perspectives and summarizing it into useful information – information that can be
used to increase revenue, cut costs, or both. Data mining software is one of a number of
analytical tools for analyzing data. It allows users to analyze data from many different
dimensions or angles, categorize it, and summarize the relationships identified. Technically, data
mining is the process of finding correlations or patterns among dozens of fields in large
relational databases.
Data warehousing is defined as a process of centralized data management and retrieval. Data
warehousing, like data mining, is a relatively new term although the concept itself has been
around for years. Data warehousing represents an ideal vision of maintaining a central repository
of all organizational data. Centralization of data is needed to maximize user access and analysis.
Dramatic technological advances are making this vision a reality for many companies. And,
equally dramatic advances in data analysis software are allowing users to access this data freely.
d) Describe a distributed DBMS and explain the advantages and disadvantages of a distributed DBMS
(10 Marks)
Advantages
• Organizational Structure - fits into organizations distributed over several locations
• Shareability and Local Autonomy - Local users can control their own data while being
accessible ‘globally’
• Improved availability – if there is failure at one site, others are accessible
• Improved reliability - replication of data
• Improved performance - local data is located where demand for it is likely to be greatest
• Transparent management of distributed, fragmented, and replicated data
• Economical - centralized processing power in a single piece of hardware is not necessarily
cheaper than separate smaller units
• Modular growth – simpler to expand
Disadvantages
• Complexity – by hiding their distributed nature and trying to ensure optimum performance,
reliability and availability, DDBS are more complex
• Cost – procurement and maintenance cost
• Security – more difficult
• Integrity control – more difficult
• Lack of standards
• Lack of experience
• Database design – more complex
SECTION B: ANSWER ANY TWO QUESTIONS {20 MARKS EACH}.
QUESTION 2
b) Explain the difference between durability and Isolation properties of a transaction in DBMS
(4 Marks)
• Isolation
o Data used during the execution of a database transaction must not be used by another
database transaction until the execution is completed. Therefore, the partial results of
an incomplete transaction must not be usable for other transactions until the
transaction is successfully committed. It also means that the execution of a transaction
is not affected by the database operations of other concurrent transactions.
• Durability
o All the database modifications of a transaction will be made permanent even if a system
failure occurs after the transaction has been completed.
c) A database transaction can be in one of the four states. List and explain these four states
(6 Marks)
QUESTION 3
d) Identify the entities and the relationships then develop an E-R diagram based on the following
statement. (18 Marks)
“Patients are treated in a single ward by the doctors assigned to them. Usually each patient will be
assigned a single doctor, but in rare cases they will have two. Nurses also attend to the patients; a number
of these are associated with each ward. Each patient is required to take a variety of drugs a certain
number of times per day and for varying lengths of time. Some staff are paid part time and doctors and
care assistants work varying amounts of overtime at varying rates (subject to grade).”
Entities: (6 Marks)
Doctors
Patients
Nurses (staff)
Drugs
Wards
Salaries
Attributes:
[E-R diagram not reproduced; surviving labels: Drugs, Take, Paid, for, work, Salaries]
Award two marks for correctly related entities (6*2=12 marks)
QUESTION 4
• Performance problems associated with re-assembling simple data structures into their
more complicated real-world representations.
• Lack of support for complex base types, e.g., drawings.
• SQL is limited when accessing complex data.
• Knowledge of the database structure is required to create ad hoc queries.
• Locking mechanisms defined by RDBMSs do not allow design transactions to be
supported, e.g., the "check in" and "check out" type of feature that would allow an
engineer to modify a drawing over the course of several working days.
(a) In any database system, it is necessary to implement Security Restrictions to prevent all users
accessing all data in the database. Describe, with the aid of examples, ways in which these
security restrictions may be implemented in an SQL based Relational DBMS.
(8 Marks)
▪ Daily Maintenance: Database audit logs require daily review to make certain that there has been no
data misuse. This requires overseeing database privileges and then consistently updating user access
accounts. A database security manager also provides different types of access control for different
users and assesses new programs that interact with the database. If these tasks are
performed on a daily basis, you can avoid a lot of problems with users that may pose a threat to the
security of the database.
▪ Varied Security Methods for Applications: More often than not, application developers will vary
the methods of security for different applications that are being utilized within the database. This
can make it difficult to create policies for accessing the applications. The database must also
possess the proper access controls for regulating the varying methods of security, otherwise
sensitive data is at risk.
▪ Split the Position: Sometimes organizations fail to split the duties between the IT administrator and
the database security manager. Instead the company tries to cut costs by having the IT
administrator do everything. This action can significantly compromise the security of the data due to
the responsibilities involved with both positions. The IT administrator should manage the database
while the security manager performs all of the daily security processes.
▪ Application Spoofing: Hackers are capable of creating applications that resemble the existing
applications connected to the database. These unauthorized applications are often difficult to
identify and allow hackers access to the database via the application in disguise.
▪ Manage User Passwords: Sometimes IT database security managers will forget to remove IDs and
access privileges of former users, which leads to password vulnerabilities in the database. Password
rules and maintenance need to be strictly enforced to avoid opening up the database to
unauthorized users.
▪ Windows OS Flaws: Windows operating systems are not effective when it comes to database
security. Often theft of passwords is prevalent as well as denial of service issues. The database
security manager can take precautions through routine daily maintenance checks.
QUESTION 5
a) Discuss the use of Data Replication in a Distributed Database System, including any problems which the
introduction of Data Replication may cause. (8 Marks)
b) Requirements Analysis is one of the most important stages in Database Development Life Cycle. Describe
the objectives and the activities that are performed at this stage and problems that are likely to be faced
(12 Marks)
Activities
• Disagreements
• Lack of professionalism in the research process
QUESTION 1
a) Distinguish between the following roles in DBMS; Data Administrator and Database Administrator.
(2 marks)
The DA is responsible for the management of data resources; the DA consults with and advises senior
managers, ensuring that the direction of database development will ultimately support corporate objectives.
The DBA is responsible for physical realization of the database, and ensures satisfactory performance of the
applications for users.
c) Outline any four application areas where databases are highly used. (2 marks)
Banking, airlines, stock exchange, hospitals, manufacturing, schools
d) Below are two logical tables designed for a human resources system at Kenya Revenue Authority.
Employee
Department
(DepartmentNo, Branch)
NB: The primary keys are underlined and the foreign keys have asterisk (*).
i. Write a query for creating the Employee table. The query should enforce the rule that an
employee must belong to a department. (5 marks)
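A minimal sketch of one possible answer, wrapped in Python's sqlite3 module so it can be run as-is. The Employee column list is not reproduced in these notes, so EmployeeNo and Name are assumed; only DepartmentNo, Branch and SalaryBracket come from the paper. The rule "an employee must belong to a department" is expressed by making the foreign key NOT NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Department (
    DepartmentNo INTEGER PRIMARY KEY,
    Branch       TEXT
);
CREATE TABLE Employee (
    EmployeeNo    INTEGER PRIMARY KEY,   -- assumed column
    Name          TEXT NOT NULL,         -- assumed column
    SalaryBracket TEXT,
    -- NOT NULL + REFERENCES: every employee must belong to an existing department
    DepartmentNo  INTEGER NOT NULL REFERENCES Department(DepartmentNo)
);
""")
# Sample rows are invented for illustration.
conn.execute("INSERT INTO Department VALUES (1, 'Nairobi')")
conn.execute("INSERT INTO Employee VALUES (100, 'A. Example', '25000', 1)")

def insert_rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# An employee with no department, and one with a department that does not exist.
no_department = insert_rejected(
    "INSERT INTO Employee VALUES (101, 'B. Example', '25000', NULL)")
unknown_department = insert_rejected(
    "INSERT INTO Employee VALUES (102, 'C. Example', '25000', 99)")
print(no_department, unknown_department)
```

Both inserts are refused: the first by NOT NULL, the second by the foreign key.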
e) Write a query that will update the SalaryBracket that originally was ‘25000’ to be changed to
‘31500’. (3 marks)
UPDATE Employee
SET SalaryBracket = '31500'
WHERE SalaryBracket = '25000';
f) Discuss briefly the term referential integrity, use a valid example (3 marks)
Referential integrity is a database concept that ensures that relationships between tables remain
consistent. When one table has a foreign key to another table, the concept of referential integrity
states that you may not add a record to the table that contains the foreign key unless there is a
corresponding record in the linked table. For example, an Employee row cannot be added with
DepartmentNo 40 unless a Department row with DepartmentNo 40 already exists.
g) Explain the following terms as used in SQL
i. Union
• An operator that combines the result sets of two SELECT statements with the same number of
compatible columns into a single result, removing duplicate rows
ii. Not null
• A property in table fields that specifies that a field must contain a value for every record
entered
(2 marks)
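As an illustration of both terms, the sketch below uses Python's sqlite3 module with two hypothetical tables (CurrentStaff and FormerStaff, with invented names in them). It shows UNION removing the duplicate that UNION ALL keeps, and a NOT NULL column refusing a missing value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CurrentStaff (Name TEXT NOT NULL);  -- NOT NULL: every row needs a name
CREATE TABLE FormerStaff  (Name TEXT NOT NULL);
INSERT INTO CurrentStaff VALUES ('Achieng'), ('Kamau');
INSERT INTO FormerStaff  VALUES ('Kamau'), ('Otieno');
""")

# UNION merges the two result sets and drops duplicate rows.
union_rows = [r[0] for r in conn.execute(
    "SELECT Name FROM CurrentStaff UNION SELECT Name FROM FormerStaff ORDER BY Name")]

# UNION ALL keeps duplicates, so 'Kamau' is counted twice.
union_all_count = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT Name FROM CurrentStaff UNION ALL SELECT Name FROM FormerStaff)"
).fetchone()[0]

# A NULL in a NOT NULL column is rejected.
try:
    conn.execute("INSERT INTO CurrentStaff VALUES (NULL)")
    not_null_rejected = False
except sqlite3.IntegrityError:
    not_null_rejected = True

print(union_rows, union_all_count, not_null_rejected)
```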
h) Create a simple transaction query that if successful will insert four records as in figure Z below and
change the county of a student whose StdNumber is 106 to Kisumu.
(7 marks)
Figure Z.
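Figure Z is not reproduced in these notes, so the sketch below uses a hypothetical Student table and invented placeholder rows; only StdNumber 106 and the county Kisumu come from the question. It shows the shape of an answer: the four inserts and the update run as one transaction that is committed only if every statement succeeds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (StdNumber INTEGER PRIMARY KEY, Name TEXT, County TEXT)")

# Placeholder rows standing in for Figure Z (values are invented).
new_rows = [(105, "Student A", "Nairobi"), (106, "Student B", "Nakuru"),
            (107, "Student C", "Mombasa"), (108, "Student D", "Kiambu")]

def run_transaction(rows):
    """Insert all rows and update student 106 as one unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany("INSERT INTO Student VALUES (?, ?, ?)", rows)
            conn.execute("UPDATE Student SET County = 'Kisumu' WHERE StdNumber = 106")
        return True
    except sqlite3.Error:
        return False

first_ok = run_transaction(new_rows)
county_106 = conn.execute(
    "SELECT County FROM Student WHERE StdNumber = 106").fetchone()[0]

# Running it again violates the primary key, so nothing from the second run is kept.
second_ok = run_transaction(new_rows)
row_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(first_ok, county_106, second_ok, row_count)
```

In plain SQL the same unit of work would be bracketed by BEGIN TRANSACTION and COMMIT, with ROLLBACK on error.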
i) Use a well labeled diagram to describe the distributed database management system environment.
(4 marks)
SECTION B: ANSWER ANY TWO QUESTIONS {20 MARKS EACH}.
QUESTION 2
a) Describe the three level ANSI-SPARC Architecture for database system. Complement your answer
with a well labeled diagram (6 marks)
The architecture identifies three levels of data abstraction, i.e. three distinct levels at which data items can be
described. The levels comprise the external, conceptual and internal levels.
{Award 4 marks for correct & well labeled diagram, 2 marks for description}
b) Discuss any four shortcomings of the traditional data storage before introduction of computers.
(4 marks)
• Bulky storage
• Long response time
• Labor-intensive
• Often incomplete or inaccurate
c) Elucidate any five common security threats to a database system (10 marks)
• Excessive Privilege Abuse
• Legitimate Privilege Abuse
• Privilege Elevation
• Platform Vulnerabilities
• SQL Injection
• Weak Audit Trail
• Denial of Service
• Database Communications Protocol Vulnerabilities
• Weak Authentication
• Backup Data Exposure
QUESTION 3
a) Database Design Process comprises six fundamental stages. Briefly discuss each.
(12 marks)
1. requirement analysis
5. physical design
b) Discuss any four objectives of the three level ANSI-SPARC architecture. (6 marks)
• Each user should be able to access the same data, but have a different customized view of the
data.
• Users should not have to deal directly with physical database storage details such as indexing .
• The DBA should be able to change the database storage structures without affecting the users’
views.
• The internal structure of the database should be unaffected by changes to the physical storage,
such as change to a new storage device.
• The DBA should be able to change the conceptual structure of the database without affecting all
users.
c) Distinguish between composite attribute and Simple attribute (2 marks)
• Composite attributes have an overall significance (e.g. an address) but can be subdivided into
more basic attributes with independent meaning (city, postal code etc.)
• Simple attributes are indivisible (e.g. age)
QUESTION 4
a) Using an example in each case, discuss the five integrity constraints in databases.
(10 marks)
• Not Null
• Unique
• Primary Key
• Foreign Key
• Check
{Award 1 mark for example, and 1 mark for correct discussion}
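One way to give an example for each of the five constraints is a single table that declares all of them. The sketch below uses Python's sqlite3 module; the Department and Staff tables, their columns and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Department (DeptNo INTEGER PRIMARY KEY, DName TEXT NOT NULL);
CREATE TABLE Staff (
    StaffNo INTEGER PRIMARY KEY,                    -- PRIMARY KEY: unique row identifier
    Name    TEXT    NOT NULL,                       -- NOT NULL: value is required
    Email   TEXT    UNIQUE,                         -- UNIQUE: no duplicates allowed
    Salary  INTEGER CHECK (Salary >= 20000),        -- CHECK: condition must hold
    DeptNo  INTEGER REFERENCES Department(DeptNo)   -- FOREIGN KEY: must match a parent row
);
INSERT INTO Department VALUES (1, 'Catering');
INSERT INTO Staff VALUES (1, 'Ann', 'ann@example.org', 30000, 1);
""")

# Each statement breaks exactly one constraint.
violations = {
    "PRIMARY KEY": "INSERT INTO Staff VALUES (1, 'Ben', 'ben@example.org', 30000, 1)",
    "NOT NULL":    "INSERT INTO Staff VALUES (2, NULL, 'ben@example.org', 30000, 1)",
    "UNIQUE":      "INSERT INTO Staff VALUES (3, 'Ben', 'ann@example.org', 30000, 1)",
    "CHECK":       "INSERT INTO Staff VALUES (4, 'Ben', 'ben@example.org', 100, 1)",
    "FOREIGN KEY": "INSERT INTO Staff VALUES (5, 'Ben', 'ben@example.org', 30000, 9)",
}
rejected = {}
for name, sql in violations.items():
    try:
        conn.execute(sql)
        rejected[name] = False
    except sqlite3.IntegrityError:
        rejected[name] = True

print(rejected)
```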
b) A school has several departments. Each department has a supervisor and at least one employee.
Employees must be assigned to at least one, but possibly more departments. At least one
employee is assigned to a project, but an employee may be on vacation and not be assigned to
any projects. Employees have name & a number which is unique, a supervisor has a name & a
unique number, every project has a unique ID and a Title.
Required:
i. Draw an appropriate ER Diagram for the scenario. Include cardinality and relationship
dependency. (10 marks)
{Award 3 marks for cardinality & dependency, 2 marks for relationship name and 5 marks for
correct ERD}
QUESTION 5
d) Define a distributed database system and discuss the two approaches of distributing data in a
distributed database system (5 marks)
A distributed database system (DDB) exists where logically related data is physically distributed
between a number of separate processors linked by a communications network. {Award 1 mark for
this and 4 marks for approaches}
Approaches:
• Replication
System maintains multiple copies of data, stored in different sites, for faster retrieval and fault
tolerance.
• Fragmentation
Relation is partitioned into several fragments stored in distinct sites
QUESTION 3
a) iTechom LTD has been experiencing some security breaches on their database. As the Database manager,
discuss some of the key mitigation steps you would take to avert the situation.
(8 marks)
• Encryption
• Access control by use of an authentication mechanism
• Physical security protocols e.g. locks
• Technological – h/w, s/w
• Policies and procedures
• Education, training and awareness
b) Constraints are key to ensuring referential integrity of your data in the database, using SQL statement(s)
show how you would apply THREE kinds of constraints when creating a database object.
(6 marks)
(Examples of constraints include; NULL, NOT NULL, PRIMARY KEY, UNIQUE etc)
CREATE TABLE "tablename" ("column1" "data type" [constraint], "column2" "data type" [constraint], "column3"
"data type" [constraint]);
c) Create a database called libraryStock {data; size=2, maxsize=4, filegrowth=2}, { log; size=1, maxsize=2,
filegrowth=1}.
(6 marks)
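CREATE DATABASE libraryStock
ON
(name=libraryStock_data, -- logical file name (assumed; not given in the question)
filename='c:\libraryStock.mdf',
size=2,
maxsize=4,
filegrowth=2)
LOG ON
(name=libraryStock_log, -- logical file name (assumed)
filename='c:\libraryStock.ldf',
size=1,
maxsize=2,
filegrowth=1)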
QUESTION 4
b) Define the term Database backup giving the three types of backups that you can use in the database.
(8 Marks)
• Full database backup; full backup of the database
• Transaction log backup; copies only transaction log
• Differential backup; copies only the database pages modified after the last full database backup
i) Create a View called Pprice that contains only the Pno, Pname and Manufacturer.
(5 marks)
CREATE VIEW Pprice
AS SELECT Pno, Pname, Manufacturer
FROM Products;
COURSE TABLE
CourseId  Day         MaxHrs  CourseWeight
DC201     Mondays     50      5
DC205     Fridays     60      4
DC202     Wednesdays  80      5
CREATE TABLE course (CourseId char(5), Day char(10), MaxHrs int, CourseWeight int);
ii) Write the SQL statement that would populate the table. (6 marks)
INSERT INTO course (CourseId, Day, MaxHrs, CourseWeight) VALUES ('DC201','Mondays',50,5), ('DC205','Fridays',60,4), ('DC202','Wednesdays',80,5);
iii) Write SQL query to obtain information on courses taught on Mondays and have a weight above 4.5
(4 marks)
iv) Write SQL statement that corrects the MaxHrs for DC202 to 70. (4 marks)
v) Write SQL statement that would add the total number of hours of all the courses
(2 marks)
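No model answers are given for parts iii) to v); the sketch below shows one possible set, run against the course table through Python's sqlite3 module. CourseId is stored as text here because the sample values (DC201 and so on) are not integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (CourseId CHAR(5), Day CHAR(10), MaxHrs INT, CourseWeight INT);
INSERT INTO course VALUES ('DC201','Mondays',50,5), ('DC205','Fridays',60,4),
                          ('DC202','Wednesdays',80,5);
""")

# iii) Courses taught on Mondays with a weight above 4.5
monday_heavy = conn.execute(
    "SELECT * FROM course WHERE Day = 'Mondays' AND CourseWeight > 4.5").fetchall()

# iv) Correct MaxHrs for DC202 to 70
conn.execute("UPDATE course SET MaxHrs = 70 WHERE CourseId = 'DC202'")

# v) Total number of hours across all courses (after the correction)
total_hours = conn.execute("SELECT SUM(MaxHrs) FROM course").fetchone()[0]

print(monday_heavy, total_hours)
```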
Commit is used to end the transaction and make the changes permanent.
Rollback is used for undoing the work done in the current transaction. This command also releases any locks
held by the current transaction.
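The effect of the two commands can be seen in a small sketch using Python's sqlite3 module, where conn.commit() and conn.rollback() issue COMMIT and ROLLBACK. The account table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()  # COMMIT: the inserted row is now permanent

def balance():
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

# An uncommitted change is undone by ROLLBACK.
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.rollback()
balance_after_rollback = balance()

# A committed change survives a later ROLLBACK.
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
conn.commit()
conn.rollback()  # nothing left to undo
balance_after_commit = balance()

print(balance_after_rollback, balance_after_commit)
```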
f) Outline THREE measures you would take to avert security breaches in a database. (3 Marks)
Encryption
Access control by use of an authentication mechanism
Physical security protocols e.g. locks to server rooms
Policies and procedures
Education, training and awareness
g) Briefly describe three problems that make it necessary to embrace concurrency control during transaction
processing. (3 Marks)
• Lost Updates Problem
Lost updates occur when two or more transactions select the same row and then update the row based on the
value originally selected. Each transaction is unaware of other transactions. The last update overwrites updates
made by the other transactions, which results in lost data.
For example, two editors make an electronic copy of the same document. Each editor changes the copy
independently and then saves the changed copy, thereby overwriting the original document. The editor who
saves the changed copy last overwrites changes made by the first editor. This problem could be avoided if the
second editor could not make changes until the first editor had finished.
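• Non-repeatable Read
Non-repeatable reads occur when a transaction reads the same row twice and gets different values because
another transaction modified the row between the two reads.
For example, an editor reads the same document twice, but between each reading, the writer rewrites the
document. When the editor reads the document for the second time, it has changed. The original read was not
repeatable. This problem could be avoided if the editor could read the document only after the writer has
finished writing it.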
• Phantom Reads
Phantom reads occur when an insert or delete action is performed against a row that belongs to a range of rows
being read by a transaction. The transaction's first read of the range of rows shows a row that no longer exists
in the second or succeeding read, as a result of a deletion by a different transaction. Similarly, as the result of
an insert by a different transaction, the transaction's second or succeeding read shows a row that did not exist
in the original read.
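The lost update problem can be reproduced in a few lines. The sketch below is a simulation only: it uses a single sqlite3 connection and interleaves the reads and writes of two imaginary transactions (T1 and T2) by hand, with a hypothetical stock table. It is meant to show why the interleaving is harmful, not how a DBMS schedules real transactions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER)")
conn.execute("INSERT INTO stock VALUES ('widget', 100)")

def read_qty():
    return conn.execute("SELECT qty FROM stock WHERE item = 'widget'").fetchone()[0]

def write_qty(value):
    conn.execute("UPDATE stock SET qty = ? WHERE item = 'widget'", (value,))

# Uncontrolled interleaving: both transactions read before either writes.
t1_seen = read_qty()       # T1 reads 100
t2_seen = read_qty()       # T2 reads 100
write_qty(t1_seen + 10)    # T1 writes 110
write_qty(t2_seen + 20)    # T2 writes 120, overwriting T1's update
lost_update_result = read_qty()

# Serial execution: T2 starts only after T1 has finished.
write_qty(100)             # reset
write_qty(read_qty() + 10) # T1
write_qty(read_qty() + 20) # T2
serial_result = read_qty()

print(lost_update_result, serial_result)
```

The interleaved run ends at 120 instead of 130: T1's addition of 10 has been lost.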
d) Discuss the THREE (3) types of transparency that a distributed database system should ideally display.
(6 Marks)
• Location transparency - users do not need to be aware at what sites data is located. Location
independence is desirable because it simplifies user programs and interface activities. Data can migrate
from site to site without invalidating any of those programs or activities. Data may be migrated around the
network in response to changing usage or performance requirements.
Replication is desirable because performance is better if applications can operate on local copies and
availability is better so long as at least one copy remains available for retrieval purposes.
d) Name and briefly explain FOUR (4) different constraints that may be enforced on a database table in Microsoft
SQL Server. (4 Marks)
A NOT NULL constraint is a rule that prevents null values from being entered into one or more columns
within a table.
A unique constraint (also referred to as a unique key constraint) is a rule that forbids duplicate values in one
or more columns within a table. Unique and primary keys are the supported unique constraints. For
example, a unique constraint can be defined on the supplier identifier in the supplier table to ensure that the
same supplier identifier is not given to two suppliers.
A primary key constraint is a column or combination of columns that has the same properties as a unique
constraint. You can use a primary key and foreign key constraints to define relationships between tables.
A foreign key constraint (also referred to as a referential constraint or a referential integrity constraint) is a
logical rule about values in one or more columns in one or more tables. For example, a set of tables shares
information about a corporation's suppliers. Occasionally, a supplier's name changes. You can define a
referential constraint stating that the ID of the supplier in a table must match a supplier ID in the supplier
information. This constraint prevents insert, update, or delete operations that would otherwise result in
missing supplier information.
A (table) check constraint (simply called a check constraint) sets restrictions on data added to a specific
table. For example, a table check constraint can ensure that the salary level for an employee is at least
$20,000 whenever salary data is added or updated in a table containing
c) Normalization is a process aimed at coming up with a database design void of data redundancy and
inconsistencies. Further, data anomalies are also eliminated. Below is a table with data used by an
organization. With clear and detailed steps, convert this unNormalized table to the Third Normal Form.
(6 Marks)
Dept #  DName          Location    MgrName      Mgr ID No.  Tel Extn.  Cust #  Cust Name        Date of Complaint  Nature of Complaint
11232   Soap Division  Cincinnati  Mary Samuel  S11         7711       P10451  Robert Drumtree  1/12/1998          Poor Service
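A possible outline of the steps, under assumed dependencies (the question does not state them): Dept # determines DName, Location and the manager; Mgr ID determines MgrName and Tel Extn; Cust # determines Cust Name; and a complaint is identified by Dept #, Cust # and date. Removing the repeating complaint group gives 1NF, moving attributes that depend on only part of the key (department and customer details) gives 2NF, and moving the manager details, which depend on Mgr ID rather than on Dept # directly, gives 3NF. The sketch below builds the resulting tables with Python's sqlite3 module and joins them back to the original row; the table names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Manager    (MgrID TEXT PRIMARY KEY, MgrName TEXT, TelExtn TEXT);
CREATE TABLE Department (DeptNo TEXT PRIMARY KEY, DName TEXT, Location TEXT,
                         MgrID TEXT REFERENCES Manager(MgrID));
CREATE TABLE Customer   (CustNo TEXT PRIMARY KEY, CustName TEXT);
CREATE TABLE Complaint  (DeptNo TEXT REFERENCES Department(DeptNo),
                         CustNo TEXT REFERENCES Customer(CustNo),
                         ComplaintDate TEXT, Nature TEXT,
                         PRIMARY KEY (DeptNo, CustNo, ComplaintDate));
INSERT INTO Manager    VALUES ('S11', 'Mary Samuel', '7711');
INSERT INTO Department VALUES ('11232', 'Soap Division', 'Cincinnati', 'S11');
INSERT INTO Customer   VALUES ('P10451', 'Robert Drumtree');
INSERT INTO Complaint  VALUES ('11232', 'P10451', '1/12/1998', 'Poor Service');
""")

# Joining the 3NF tables reproduces the original unnormalized row.
original_row = conn.execute("""
SELECT d.DeptNo, d.DName, d.Location, m.MgrName, m.MgrID, m.TelExtn,
       c.CustNo, c.CustName, k.ComplaintDate, k.Nature
FROM Complaint k
JOIN Department d ON d.DeptNo = k.DeptNo
JOIN Manager m    ON m.MgrID  = d.MgrID
JOIN Customer c   ON c.CustNo = k.CustNo
""").fetchone()
print(original_row)
```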
QUESTION 4
COURSES TABLE
CourseId  Day         MaxHrs  CourseWeight
DC201     Mondays     50      5
DC205     Fridays     60      4
DC202     Wednesdays  80      5
vi) Write the SQL statement that would realize the tables. (4 marks)
CREATE TABLE courses (CourseId char(5), Day char(10), MaxHrs int, CourseWeight int);
vii) Write a query that makes the CourseID field the Primary Key (2 Marks)
viii) Write a query that returns the StudentID, CourseID, and CourseWeight from both the Courses and
Select R.StudentID,R.CourseID,C.CourseWeight
ON (C.CourseID=R.CourseID);
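The name of the second table is cut off in these notes, so the sketch below assumes a hypothetical Registration table (aliased R) holding StudentID and CourseID, with invented rows. It is run through Python's sqlite3 module, where the primary key is declared when the table is created; on many server DBMSs an existing table would instead be changed with a statement of the form ALTER TABLE courses ADD PRIMARY KEY (CourseId), which SQLite does not support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- vii) CourseId declared as the primary key
CREATE TABLE Courses (CourseId CHAR(5) PRIMARY KEY, Day CHAR(10),
                      MaxHrs INT, CourseWeight INT);
-- Hypothetical second table; its real name and columns are not shown in the notes.
CREATE TABLE Registration (StudentID INT,
                           CourseID CHAR(5) REFERENCES Courses(CourseId));
INSERT INTO Courses VALUES ('DC201','Mondays',50,5), ('DC205','Fridays',60,4),
                           ('DC202','Wednesdays',80,5);
INSERT INTO Registration VALUES (1, 'DC201'), (2, 'DC202');
""")

# viii) Columns drawn from both tables through a join on CourseID
rows = conn.execute("""
SELECT R.StudentID, R.CourseID, C.CourseWeight
FROM Registration R JOIN Courses C
ON (C.CourseId = R.CourseID)
ORDER BY R.StudentID
""").fetchall()

# The primary key now rejects a duplicate CourseId.
try:
    conn.execute("INSERT INTO Courses VALUES ('DC201','Tuesdays',10,1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(rows, duplicate_rejected)
```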
A foreign key refers to an attribute whose values match the primary key values of a related relation
A relation is said to exhibit entity integrity if each row is uniquely identified by the primary key.
(8 Marks)
b) Briefly discuss any FIVE advantages of the database approach over the file system approach.
• Control of data redundancy
• Improved data consistency
• Improved Sharing of data,
• Enhanced enforcement of standards
• Improved backup and recovery services.
• Improved data security
• Improved data accessibility and responsiveness
• Improved maintenance through data independence
• Increased concurrency
(5 Marks)
(6 Marks)
Companies are generally divided into several business units, such as sales, finance, and marketing. Each
business unit is subject to specific constraints and requirements, and each one uses a data subset of the
overall data in the organization. Therefore, end users working within those business units view their data
subsets as separate from or external to other units within the organization.
Having identified the external views, a conceptual model is used to integrate all external views into a single
view. The conceptual model represents a global view of the entire database as viewed by the entire
organization.
The internal model is the representation of the database as seen by the DBMS. The internal model requires
the designer to match the conceptual model’s characteristics and constraints to those of the selected
implementation model.
An internal schema depicts a specific representation of an internal model, using the database constructs
supported by the chosen database.
e) SQL is a DDL and a DML; using appropriate examples clearly distinguish between the two terms.
(4 Marks)
DDL: SQL includes commands to create database objects (such as tables, indexes, and views), as well as
commands to define access rights to those database objects.
Example commands:
• CREATE TABLE, CREATE VIEW, CREATE INDEX, DROP TABLE, DROP INDEX, NOT NULL,
UNIQUE
DML: SQL includes commands to insert, update, delete, and retrieve data within the database tables.
Example commands:
• INSERT, UPDATE, DELETE, SELECT
f) Discuss the importance of carrying out entity relationship modeling in database design.
(4 Marks)
Very little or no redundant data will be stored (save on space and improve performance)
The database will support both planned and unplanned (ad hoc) queries for data retrieval
A good design will document itself. A new person looking at the ERD will understand what is going on.
The ERD will help to maintain good and consistent naming standards
The application that works from the database will be easier to develop.
a) Discuss the TWO relationship participation types showing the symbol used to denote each type.
(4 Marks)
Optional participation means that one entity occurrence does not require a corresponding entity
occurrence in a particular relationship.
Mandatory participation means that one entity occurrence requires a corresponding entity occurrence in a
particular relationship.
Optionality symbols:
Zero or one
Zero or many
Mandatory symbols:
One or many
One-to-one (1:1) relationship. For any occurrence of entity A there may only be one member of B and for
any occurrence of B there is only one member of A associated with it at any time.
One to Many - For any A there may be many members of B and for any B there is only one member of A
associated with it.
Many to Many - For any A there may be many members of B and for any B there may be many members of
A associated with it.
c) Using examples differentiate between the left outer join and the right outer join.
(4 Marks)
In the left outer join, all rows of the left table are included in the results whether or not they have a match in the right
table: all displayed fields for the left table will have data, whereas fields belonging to the right table will be null where
there is no match.
In the right outer join, all rows of the right table are included: all the displayed fields for the right table will have data,
whereas fields in the left table will be null where there is no match.
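The difference can be seen with two small hypothetical tables, Customer and Orders, holding invented rows. The sketch uses Python's sqlite3 module; because older SQLite versions lack RIGHT OUTER JOIN, the right join is written as its equivalent, a LEFT OUTER JOIN with the two tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustNo INT, Name TEXT);
CREATE TABLE Orders   (OrderNo INT, CustNo INT);
INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Ben');   -- Ben has no orders
INSERT INTO Orders   VALUES (10, 1), (11, 3);         -- order 11 has no customer
""")

# Customer LEFT OUTER JOIN Orders: every customer appears, orders may be NULL.
left_join = conn.execute("""
SELECT c.Name, o.OrderNo
FROM Customer c LEFT OUTER JOIN Orders o ON c.CustNo = o.CustNo
ORDER BY c.CustNo
""").fetchall()

# Equivalent of Customer RIGHT OUTER JOIN Orders: every order appears,
# customer fields may be NULL.
right_join = conn.execute("""
SELECT c.Name, o.OrderNo
FROM Orders o LEFT OUTER JOIN Customer c ON c.CustNo = o.CustNo
ORDER BY o.OrderNo
""").fetchall()

print(left_join, right_join)
```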
d) Use the following SQL special operators to write an example query and explain what each query does.
i. Between
ii. Like
iii. In
(6 Marks)
BETWEEN Checks whether an attribute value is within a range
LIKE Checks whether an attribute value matches a given string pattern
IN Checks whether an attribute value matches any value within a value list
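Example queries for the three operators, sketched with Python's sqlite3 module against a hypothetical product table (column names follow the P_ naming used elsewhere in these notes; the rows are invented).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (P_code TEXT, P_description TEXT, P_price REAL);
INSERT INTO product VALUES ('P1','Hammer',9.5), ('P2','Hand saw',14.0),
                           ('P3','Drill',45.0);
""")

def codes(where):
    """Return the product codes selected by a WHERE condition."""
    sql = "SELECT P_code FROM product WHERE " + where + " ORDER BY P_code"
    return [r[0] for r in conn.execute(sql)]

# BETWEEN: price within the inclusive range 10 to 50
between_rows = codes("P_price BETWEEN 10 AND 50")
# LIKE: description starting with 'Ha'
like_rows = codes("P_description LIKE 'Ha%'")
# IN: code equal to any value in the list
in_rows = codes("P_code IN ('P1', 'P3')")

print(between_rows, like_rows, in_rows)
```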
QUESTION 3
a) Define the term wild card character as used in databases, state and explain any THREE wildcard characters
used in SQL.
(7 Marks)
% (percent sign) - matches any sequence of zero or more characters in that position. Example: Ja% matches
jane, james etc.
_ (underscore) means any one character may be substituted for the underscore. Example 123_ includes 1234,
1235, 1238 etc.
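The two examples above can be checked with a short sketch in Python's sqlite3 module. The sample table and its values are made up; note also that SQLite's LIKE ignores case for ASCII letters by default, which is why Ja% matches the lower-case names here, and other DBMSs may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (val TEXT);
INSERT INTO sample VALUES ('jane'), ('james'), ('john'),
                          ('1234'), ('1235'), ('12345');
""")

def matching(pattern):
    """Return the values that match a LIKE pattern, in sorted order."""
    return [r[0] for r in conn.execute(
        "SELECT val FROM sample WHERE val LIKE ? ORDER BY val", (pattern,))]

percent_matches = matching("Ja%")      # any number of characters after 'Ja'
underscore_matches = matching("123_")  # exactly one character after '123'

print(percent_matches, underscore_matches)
```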
c) Using a relevant example discuss the THREE data anomalies associated with redundant data.
(6 Marks)
Update anomaly
Delete anomaly
Insert anomaly
QUESTION 4
a) Explain any SIX clauses used together with the SELECT statement
(6 Marks)
WHERE clause (optional) specifies which data values or rows will be returned or displayed, based on the criteria
described after the keyword where.
GROUP BY clause will gather all of the rows together that contain data in the specified column(s) and will allow
aggregate functions to be performed on the one or more columns
HAVING clause allows you to specify conditions on the groups formed by GROUP BY - in other words, which groups should be
included in the result will be based on the conditions you specify
ORDER BY is an optional clause which will allow you to display the results of your query in a sorted order (either
ascending order or descending order) based on the columns that you specify to order by
DISTINCT discards duplicate rows from the result, for the columns listed after the SELECT keyword
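The clauses above can be combined in one query. The sketch below uses Python's sqlite3 module and a hypothetical sale table with invented rows; WHERE filters rows before grouping, HAVING filters the groups afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sale (region TEXT, amount INT);
INSERT INTO sale VALUES ('East',100), ('East',300), ('West',50),
                        ('West',20), ('North',500);
""")

big_regions = conn.execute("""
SELECT region, SUM(amount) AS total
FROM sale
WHERE amount >= 50           -- keep only rows of at least 50
GROUP BY region              -- one group per region
HAVING SUM(amount) > 100     -- keep only groups totalling more than 100
ORDER BY total DESC          -- largest total first
""").fetchall()

# DISTINCT: each region listed once
regions = [r[0] for r in conn.execute(
    "SELECT DISTINCT region FROM sale ORDER BY region")]

print(big_regions, regions)
```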
b) Define the term constraints as applied in databases and discuss any THREE constraints used in
relational databases.
(8 marks)
Constraints allow the DB designer to define the way to automatically enforce the integrity of a
database. Constraints define rules regarding the values allowed in columns and are the standard
mechanism for enforcing integrity
Constraints:
Primary key: it is an attribute or a combination of attributes that uniquely identify each row; it does not
accept nulls (entity integrity).
Foreign key: an attribute that points to the primary key value in a related table (referential
integrity).
The NOT NULL constraint ensures that a column does not accept nulls.
The UNIQUE constraint ensures that all values in a column are unique.
The DEFAULT constraint assigns a value to an attribute when a new row is added to a table without a value
for that attribute.
The CHECK constraint is used to validate data when an attribute value is entered; it checks that a specified
condition holds.
a) Discuss any THREE concurrency control mechanisms used in databases.
(6 Marks)
A lock guarantees exclusive use of a data item to a current transaction. In other words, transaction T2 does
not have access to a data item that is currently being used by transaction T1. A transaction acquires a lock
prior to data access; the lock is released (unlocked) when the transaction is completed so that another
transaction can lock the data item for its exclusive use.
The time stamping approach to scheduling concurrent transactions assigns a global, unique time stamp to
each transaction. The time stamp value produces an explicit order in which transactions are submitted to the
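DBMS. Time stamps must have two properties: uniqueness (no two transactions share a time stamp value) and
monotonicity (time stamp values always increase).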
Optimistic methods: this approach is based on the assumption that the majority of the database
operations do not conflict. Transactions move through three phases:
During the read phase, the transaction reads the database, executes the needed computations,
and makes the updates to a private copy of the database values.
During the validation phase, the transaction is validated to ensure that the changes made will not
affect the integrity and consistency of the database.
During the write phase, the changes are permanently applied to the database.
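The three phases can be sketched as follows. This is an illustration only: it uses a version number stored in each row as the validation test, which is one common way to implement optimistic control and an assumption of this example rather than something stated above. The item table is hypothetical, and the code uses Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, qty INTEGER, version INTEGER)")
conn.execute("INSERT INTO item VALUES (1, 10, 0)")
conn.commit()

def read_phase(item_id):
    """Read phase: take a private copy of the value and the version seen."""
    return conn.execute(
        "SELECT qty, version FROM item WHERE id = ?", (item_id,)).fetchone()

def validate_and_write(item_id, new_qty, seen_version):
    """Validation and write phases: apply the change only if the row is unchanged."""
    cur = conn.execute(
        "UPDATE item SET qty = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_qty, item_id, seen_version))
    conn.commit()
    return cur.rowcount == 1  # 0 rows updated means validation failed

# Two transactions read the same row, then both try to write.
qty_a, ver_a = read_phase(1)
qty_b, ver_b = read_phase(1)
a_ok = validate_and_write(1, qty_a - 3, ver_a)  # succeeds, version becomes 1
b_ok = validate_and_write(1, qty_b - 5, ver_b)  # fails: version 0 is stale
final_qty = read_phase(1)[0]

print(a_ok, b_ok, final_qty)
```

A transaction that fails validation, like the second one here, would normally be restarted from its read phase.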
QUESTION 5
P_code and V_code are the primary keys to the product and vendor tables respectively and V_code is the foreign
key in the product table.
Write SQL queries that will:
PRIMARY KEY(P_code)
ii. Create a view that contains the names of the products to be ordered(use a reorder level of 20).
CREATE VIEW Reorder AS SELECT P_description FROM product WHERE P_quantity < 20;
iii. Output the quantity of the product with the highest price.
SELECT P_quantity FROM product WHERE P_price = (SELECT MAX(P_price) FROM product);
iv. Output the number of products supplied by a vendor whose code is 0034.
SELECT count(P_description) FROM product WHERE V_code = 0034;
v. Outputs the names of all products alongside the names of the vendors who supply them.
SELECT product.P_description, vendor.V_name
FROM product JOIN vendor
ON product.V_code = vendor.V_code;
(20 Marks)