Database Management System Revision Guide
Attributes
Entities are represented by means of their properties, called attributes. All attributes have
values. For example, a student entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a
student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be
negative, etc.
Types of Attributes
Simple attribute − Simple attributes are atomic values, which cannot be divided further.
For example, a student's phone number is an atomic value of 10 digits.
Composite attribute − Composite attributes are made of more than one simple attribute.
For example, a student's complete name may have first_name and last_name.
Derived attribute − Derived attributes are the attributes that do not exist in the physical
database, but their values are derived from other attributes present in the database. For
example, average_salary in a department should not be saved directly in the database,
instead it can be derived. For another example, age can be derived from date_of_birth.
Multi-value attribute − Multi-value attributes may contain more than one value. For
example, a person can have more than one phone number, email_address, etc.
ACID Properties
A transaction is a very small unit of a program and it may contain several low-level tasks. A
transaction in a database system must maintain Atomicity, Consistency, Isolation,
and Durability − commonly known as ACID properties − in order to ensure accuracy,
completeness, and data integrity.
Atomicity − This property states that a transaction must be treated as an atomic unit, that
is, either all of its operations are executed or none. There must be no state in a database
where a transaction is left partially completed. States should be defined either before the
execution of the transaction or after the execution/abortion/failure of the transaction.
Consistency − The database must remain in a consistent state after any transaction. No
transaction should have any adverse effect on the data residing in the database. If the
database was in a consistent state before the execution of a transaction, it must remain
consistent after the execution of the transaction as well.
Durability − The database should be durable enough to hold all its latest updates even if
the system fails or restarts. If a transaction updates a chunk of data in a database and
commits, then the database will hold the modified data. If a transaction commits but the
system fails before the data could be written on to the disk, then that data will be
updated once the system springs back into action.
Isolation − In a database system where more than one transaction is being executed
simultaneously and in parallel, the property of isolation states that each transaction
will be carried out and executed as if it were the only transaction in the system. No
transaction will affect the existence of any other transaction.
Differentiate between parameter query and aggregate query.
Parameter query works with other types of queries to get whatever results you are after. This is
because, when using this type of query, you are able to pass a parameter to a different query,
such as an action or a select query. It can either be a value or a condition and will essentially tell
the other query specifically what you want it to do.
It is often chosen because it allows for a dialog box where the end user can enter whatever
parameter value they wish each time the query is run. The parameter query is just a modified
select query.
Aggregate Query: A special type of query is known as an aggregate query. It can work on other
queries (such as selection, action or parameter) just like the parameter query does, but instead of
passing a parameter to another query it totals up the items by selected groups.
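The difference can be sketched in Python with the standard sqlite3 module. This is only an illustration: the sales table, its columns, and the sample rows are assumptions, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("North", 100), ("North", 50), ("South", 70)],
)

def sales_for_region(region):
    # Parameter query: the value of `region` is supplied each time it runs.
    rows = conn.execute(
        "SELECT amount FROM sales WHERE region = ? ORDER BY amount", (region,)
    )
    return [amount for (amount,) in rows]

def totals_by_region():
    # Aggregate query: totals the amounts for each group of rows.
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(rows.fetchall())
```

The first function returns individual rows chosen by the parameter, while the second returns one total per group.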
Database: A collection of related tables or files
Designers − Designers are the group of people who actually work on the designing part
of the database. They keep a close watch on what data should be kept and in what
format. They identify and design the whole set of entities, relations, constraints, and
views.
End Users − End users are those who actually reap the benefits of having a DBMS. End
users can range from simple viewers who pay attention to the logs or market rates to
sophisticated users such as business analysts.
Discuss Normalization
If a database design is not perfect, it may contain anomalies, which are like a bad dream for any
database administrator. Managing a database with anomalies is next to impossible.
Update anomalies − If data items are scattered and are not linked to each other properly,
then it could lead to strange situations. For example, when we try to update one data
item having its copies scattered over several places, a few instances get updated properly
while a few others are left with old values. Such instances leave the database in an
inconsistent state.
Deletion anomalies − We tried to delete a record, but parts of it were left undeleted
because, unknown to us, the data was also saved somewhere else.
Insert anomalies − We tried to insert data in a record that does not exist at all.
Normalization is a method to remove all these anomalies and bring the database to a consistent
state.
Each attribute must contain only a single value from its pre-defined domain.
If we follow second normal form, then every non-prime attribute should be fully functionally
dependent on prime key attribute. That is, if X → A holds, then there should not be any proper
subset Y of X, for which Y → A also holds true.
We see here in Student_Project relation that the prime key attributes are Stu_ID and Proj_ID.
According to the rule, non-key attributes, i.e. Stu_Name and Proj_Name must be dependent
upon both and not on any of the prime key attribute individually. But we find that Stu_Name
can be identified by Stu_ID and Proj_Name can be identified by Proj_ID independently. This is
called partial dependency, which is not allowed in Second Normal Form.
We broke the relation in two as depicted in the above picture. So there exists no partial
dependency.
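The picture is not reproduced here. As an illustration only, the sketch below shows one possible decomposition using Python's standard sqlite3 module. It uses three tables (Student, Project, and a linking Student_Project), which may differ from the pictured split; the column types and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (
    Stu_ID   INTEGER PRIMARY KEY,
    Stu_Name TEXT NOT NULL
);
CREATE TABLE Project (
    Proj_ID   INTEGER PRIMARY KEY,
    Proj_Name TEXT NOT NULL
);
-- Only the composite key remains; no attribute depends on part of it.
CREATE TABLE Student_Project (
    Stu_ID  INTEGER REFERENCES Student(Stu_ID),
    Proj_ID INTEGER REFERENCES Project(Proj_ID),
    PRIMARY KEY (Stu_ID, Proj_ID)
);
INSERT INTO Student VALUES (1, 'Asha'), (2, 'Brian');
INSERT INTO Project VALUES (10, 'Library System');
INSERT INTO Student_Project VALUES (1, 10), (2, 10);
""")

# Joining the three relations rebuilds the original rows without loss.
rows = conn.execute("""
    SELECT s.Stu_ID, p.Proj_ID, s.Stu_Name, p.Proj_Name
    FROM Student_Project sp
    JOIN Student s ON s.Stu_ID = sp.Stu_ID
    JOIN Project p ON p.Proj_ID = sp.Proj_ID
    ORDER BY s.Stu_ID
""").fetchall()
```

Each name is now stored once, so Stu_Name depends only on Stu_ID and Proj_Name only on Proj_ID.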
We find that in the above Student_detail relation, Stu_ID is the key and only prime key
attribute. We find that City can be identified by Stu_ID as well as by Zip itself. Zip is not a
superkey, and City is not a prime attribute. Additionally, Stu_ID → Zip → City, so there
exists a transitive dependency.
To bring this relation into third normal form, we break the relation into two relations as follows
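A minimal sketch of such a split, written in Python with the standard sqlite3 module. The relation name ZipCodes, the Stu_Name column, the column types, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ZipCodes (
    Zip  TEXT PRIMARY KEY,
    City TEXT NOT NULL
);
CREATE TABLE Student_detail (
    Stu_ID   INTEGER PRIMARY KEY,
    Stu_Name TEXT NOT NULL,
    Zip      TEXT REFERENCES ZipCodes(Zip)
);
INSERT INTO ZipCodes VALUES ('00100', 'Nairobi');
INSERT INTO Student_detail VALUES (1, 'Asha', '00100'), (2, 'Brian', '00100');
""")

# City is stored once per Zip, so one UPDATE corrects it for every student.
conn.execute("UPDATE ZipCodes SET City = 'Nairobi City' WHERE Zip = '00100'")
cities = conn.execute("""
    SELECT s.Stu_ID, z.City
    FROM Student_detail s JOIN ZipCodes z ON z.Zip = s.Zip
    ORDER BY s.Stu_ID
""").fetchall()
```

Because City now depends directly on the key of its own relation, the transitive dependency through Zip no longer appears inside Student_detail.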
Boyce-Codd Normal Form
Boyce-Codd Normal Form (BCNF) is an extension of Third Normal Form on strict terms.
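BCNF states that, for any non-trivial functional dependency X → A, X must be a superkey.
In the decomposed relations above, the dependencies are Stu_ID → Zip and Zip → City, and
each determinant is the key of its own relation.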
Databases are stored in file formats, which contain records. At physical level, the actual data is
stored in electromagnetic format on some device. These storage devices can be broadly
categorized into three types −
Primary Storage − The memory storage that is directly accessible to the CPU comes
under this category. CPU's internal memory (registers), fast memory (cache), and main
memory (RAM) are directly accessible to the CPU, as they are all placed on the
motherboard or CPU chipset. This storage is typically very small, ultra-fast, and volatile.
Primary storage requires continuous power supply in order to maintain its state. In case
of a power failure, all its data is lost.
Secondary Storage − Secondary storage devices are used to store data for future use or
as backup. Secondary storage includes memory devices that are not a part of the CPU
chipset or motherboard, for example, magnetic disks, optical disks (DVD, CD, etc.),
hard disks, flash drives, and magnetic tapes.
Tertiary Storage − Tertiary storage is used to store huge volumes of data. Since such
storage devices are external to the computer system, they are the slowest in speed. These
storage devices are mostly used to back up an entire system. Optical disks
and magnetic tapes are widely used as tertiary storage.
State and explain the characteristics of a modern database
Real-world entity − A modern DBMS is more realistic and uses real-world entities to
design its architecture. It uses the behavior and attributes too. For example, a school
database may use students as an entity and their age as an attribute.
Relation-based tables − DBMS allows entities and relations among them to form tables.
A user can understand the architecture of a database just by looking at the table names.
Isolation of data and application − A database system is entirely different from its data.
A database is an active entity, whereas data is said to be passive, on which the database
works and organizes. DBMS also stores metadata, which is data about data, to ease its
own process.
Less redundancy − DBMS follows the rules of normalization, which splits a relation
when any of its attributes is having redundancy in values. Normalization is a
mathematically rich and scientific process that reduces data redundancy.
Query Language − DBMS is equipped with query language, which makes it more
efficient to retrieve and manipulate data. A user can apply as many and as different
filtering options as required to retrieve a set of data. This was not possible with
traditional file-processing systems.
Multiple views − DBMS offers multiple views for different users. A user who is in the
Sales department will have a different view of database than a person working in the
Production department. This feature enables the users to have a concentrated view of the
database according to their requirements.
Security − Features like multiple views offer security to some extent where users are
unable to access data of other users and departments. DBMS offers methods to impose
constraints while entering data into the database and retrieving the same at a later stage.
DBMS offers many different levels of security features, which enables multiple users to
have different views with different features. For example, a user in the Sales department
cannot see the data that belongs to the Purchase department. Additionally, it can also be
managed how much data of the Sales department should be displayed to the user. Since
a DBMS does not store its data on disk the way traditional file systems do, it is very hard
for miscreants to break the code.
Active − In this state, the transaction is being executed. This is the initial state of every
transaction.
Failed − A transaction is said to be in a failed state if any of the checks made by the
database recovery system fails. A failed transaction can no longer proceed further.
Aborted − If any of the checks fails and the transaction has reached a failed state, then
the recovery manager rolls back all its write operations on the database to bring the
database back to its original state where it was prior to the execution of the transaction.
Transactions in this state are called aborted. The database recovery module can select
one of the two operations after a transaction aborts −
3-tier Architecture
A 3-tier architecture separates its tiers from each other based on the complexity of the users and
how they use the data present in the database. It is the most widely used architecture to design a
DBMS.
Database (Data) Tier − At this tier, the database resides along with its query processing
languages. We also have the relations that define the data and their constraints at this
level.
Application (Middle) Tier − At this tier reside the application server and the programs
that access the database. For a user, this application tier presents an abstracted view of
the database. End-users are unaware of any existence of the database beyond the
application. At the other end, the database tier is not aware of any other user beyond the
application tier. Hence, the application layer sits in the middle and acts as a mediator
between the end-user and the database.
User (Presentation) Tier − End-users operate on this tier and they know nothing about
any existence of the database beyond this layer. At this layer, multiple views of the
database can be provided by the application. All views are generated by applications that
reside in the application tier.
Multiple-tier database architecture is highly modifiable, as almost all its components are
independent and can be changed independently.
ER Model is based on −
Entities and their attributes.
Relationships among entities.
Entity − An entity in an ER Model is a real-world entity having properties
called attributes. Every attribute is defined by its set of values called domain. For
example, in a school database, a student is considered as an entity. Student has various
attributes like name, age, class, etc.
Mapping cardinalities − one to one, one to many, many to one, many to many.
Relational Model
The most popular data model in DBMS is the Relational Model. It is a more scientific model
than the others. This model is based on first-order predicate logic and defines a table as an n-ary
relation.
Note that the "P_Id" column in the "Orders" table points to the "P_Id" column in the
"Persons" table.
(ii) Write DML statements to store the data in the two relations.
(iii) ALTER TABLE Persons
ADD PhoneNumber int;
(iv) CREATE TABLE Persons
(
    P_Id int NOT NULL,
    LastName varchar(255) NOT NULL,
    FirstName varchar(255),
    Address varchar(255),
    City varchar(255),
    PRIMARY KEY (P_Id)
);
(iii) Write an SQL statement to add another field/column “PhoneNumber” to the Persons
table.
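The DDL and DML steps in this question can be tried out together. The sketch below uses Python's standard sqlite3 module with an in-memory database. Only the Persons columns and the P_Id link come from the text; the other Orders columns (O_Id, OrderNo) and all sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Persons (
    P_Id      int NOT NULL,
    LastName  varchar(255) NOT NULL,
    FirstName varchar(255),
    Address   varchar(255),
    City      varchar(255),
    PRIMARY KEY (P_Id)
);
-- O_Id and OrderNo are assumed column names; only P_Id comes from the text.
CREATE TABLE Orders (
    O_Id    int NOT NULL,
    OrderNo int NOT NULL,
    P_Id    int,
    PRIMARY KEY (O_Id),
    FOREIGN KEY (P_Id) REFERENCES Persons(P_Id)
);
""")

# DML: store one row in each relation (sample values are invented).
conn.execute(
    "INSERT INTO Persons (P_Id, LastName, FirstName, Address, City) "
    "VALUES (?, ?, ?, ?, ?)",
    (1, "Otieno", "Mary", "12 Moi Avenue", "Nairobi"),
)
conn.execute("INSERT INTO Orders (O_Id, OrderNo, P_Id) VALUES (?, ?, ?)", (1, 77895, 1))

# DDL: add the new column to the existing table.
conn.execute("ALTER TABLE Persons ADD PhoneNumber int")
columns = [row[1] for row in conn.execute("PRAGMA table_info(Persons)")]
```

The parent row in Persons is inserted first so that the Orders row has a valid P_Id to point to.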
Reflexive rule − If alpha is a set of attributes and beta is a subset of alpha, then alpha → beta
holds.
Augmentation rule − If a → b holds and y is an attribute set, then ay → by also holds. That
is, adding the same attributes to both sides of a dependency does not change the basic dependency.
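Both rules can be checked against sample data. The Python sketch below tests whether a functional dependency holds in a small relation; the helper name holds and the sample rows are assumptions, with attribute names borrowed from the Student_detail example.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample relation.
rows = [
    {"Stu_ID": 1, "Zip": "00100", "City": "Nairobi"},
    {"Stu_ID": 2, "Zip": "00100", "City": "Nairobi"},
    {"Stu_ID": 3, "Zip": "40100", "City": "Kisumu"},
]

reflexive = holds(rows, ["Stu_ID", "Zip"], ["Zip"])             # beta is a subset of alpha
base = holds(rows, ["Zip"], ["City"])                           # a -> b
augmented = holds(rows, ["Zip", "Stu_ID"], ["City", "Stu_ID"])  # ay -> by
not_implied = holds(rows, ["City"], ["Stu_ID"])                 # Nairobi maps to 1 and 2
```

A check like this can only show that a dependency is consistent with the given rows; the rules themselves hold for every relation.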
Graphical User Interfaces. A GUI typically displays a schema to the user in diagrammatic
form. The user then can specify a query by manipulating the diagram. In many cases, GUIs
utilize both menus and forms. Most GUIs use a pointing device, such as a mouse, to select
certain parts of the displayed schema diagram.
Natural Language Interfaces. These interfaces accept requests written in English or some
other language and attempt to understand them. A natural language interface usually has its
own schema, which is similar to the database conceptual schema, as well as a dictionary of
important words. The natural language interface refers to the words in its schema, as well as
to the set of standard words in its dictionary, to interpret the request.
Speech Input and Output. Limited use of speech as an input query and speech as an answer to
a question or result of a request is becoming commonplace. Applications with limited
vocabularies such as inquiries for telephone directory, flight arrival/departure, and credit card
account information are allowing speech for input and output to enable customers to access
this information. The speech input is detected using a library of predefined words and used to
set up the parameters that are supplied to the queries. For output, a similar conversion from
text or numbers into speech takes place.
Interfaces for Parametric Users. Parametric users, such as bank tellers, often have a small set
of operations that they must perform repeatedly. For example, a teller is able to use single
function keys to invoke routine and repetitive transactions such as account deposits or
withdrawals, or balance inquiries. Systems analysts and programmers design and implement
a special interface for each known class of naive users. Usually a small set of abbreviated
commands is included, with the goal of minimizing the number of keystrokes required for
each request. For example, function keys in a terminal can be programmed to initiate various
commands. This allows the parametric user to proceed with a minimal number of keystrokes.
Interfaces for the DBA. Most database systems contain privileged commands that can be
used only by the DBA staff. These include commands for creating accounts, setting system
parameters, granting account authorization, changing a schema, and reorganizing the storage
structures of a database.
Outline the four benefits of designing a table from scratch rather than using a table
wizard.
It gives one control of the field naming and data type selection.
It provides a means of defining the fields by entering field descriptions.
It enables one to specify the fields’ properties, hence easing data entry and
management.
It doesn’t require a lot of memory to run, hence it is quick even on a slower computer.
State any three data definition and three data manipulation commands as used in SQL
DDL- CREATE; ALTER; DROP
DML- INSERT; SELECT; UPDATE; DELETE
(GROUP BY, WHERE, and HAVING are clauses of a SELECT statement rather than commands of their own.)
Company X has an employee entity with the following attributes: name, PfNo, DoB,
deptNo. Where name refers to the name of the employee, PfNo refers to the personal file
number of that employee, DoB is the date of birth of the employee and deptNo is the
number of the department where the employee belongs.
(i) Identify metadata and a primary key for the employee entity.
Pk- PfNo
(ii) Issue SQL commands for creating a database for the company and a table containing
information on the company’s employees. Use your own field sizes.
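One possible answer, sketched with Python's standard sqlite3 module. The field sizes and sample values are assumptions, as the question allows. In SQLite, connecting creates the database; a server DBMS would normally use a CREATE DATABASE statement instead, whose exact form depends on the product.

```python
import sqlite3

# Connecting creates the database (in memory here, for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        PfNo   INTEGER PRIMARY KEY,   -- personal file number
        name   VARCHAR(50) NOT NULL,
        DoB    DATE,
        deptNo INTEGER
    )
""")
conn.execute(
    "INSERT INTO employee (PfNo, name, DoB, deptNo) VALUES (?, ?, ?, ?)",
    (1001, "Jane Doe", "1990-04-12", 3),
)

def insert_duplicate():
    # Re-using PfNo 1001 should be rejected by the primary key.
    try:
        conn.execute(
            "INSERT INTO employee (PfNo, name, DoB, deptNo) VALUES (?, ?, ?, ?)",
            (1001, "John Roe", "1988-01-30", 2),
        )
    except sqlite3.IntegrityError:
        return "rejected"
    return "accepted"
```

Choosing PfNo as the primary key means no two employees can share a personal file number.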
What is Normalization? Database normalization is the process of organizing the fields and
tables of a relational database to minimize redundancy and dependency
Give an example of why it is important to consider the skill and resources available to
likely intruders when designing computer security mechanisms and policies to defend
against those intruders?
If the intruder has few skills or resources (e.g. grandma), then a simple cheap security
mechanism might be sufficient (e.g. passwords). If the adversary is the Safaricom Company,
then passwords will be of little defence.
A database system normally contains a lot of data in addition to users’ data. For example, it
stores data about data, known as metadata, to locate and retrieve data easily. It is rather difficult
to modify or update a set of metadata once it is stored in the database. But as a DBMS expands,
it needs to change over time to satisfy the requirements of the users. If the entire data is
dependent, it would become a tedious and highly complex job.
Metadata itself follows a layered architecture, so that when we change data at one layer, it does
not affect the data at another level. This data is independent but mapped to each other.
Logical data independence is a mechanism that separates the logical structure of the data from
the actual data stored on the disk. If we make changes to the table format, they should not
change the data residing on the disk.
Primary Key The primary key is a unique identifier for a record. The primary key cannot be the
same for two records. This field can never be blank.
Composite Key A composite key is a primary key that is composed of two or more fields. It
can also be called a compound or concatenated key.
Foreign Key A foreign key is a field or combination of fields that refers to the primary key
of another table.
Super Key A set of attributes (one or more) that collectively identifies an entity in an entity
set.
Candidate Key A minimal set of attributes that could be used as the primary key.
Discuss database Relationship
Relationships are represented by a diamond-shaped box. The name of the relationship is written
inside the diamond. All the entities (rectangles) participating in a relationship are
connected to it by a line.
One-to-one − When only one instance of an entity is associated with the relationship, it
is marked as '1:1'. The following image reflects that only one instance of each entity
should be associated with the relationship. It depicts one-to-one relationship.
Many-to-one − When more than one instance of entity is associated with the
relationship, it is marked as 'N:1'. The following image reflects that more than one
instance of an entity on the left and only one instance of an entity on the right can be
associated with the relationship. It depicts many-to-one relationship.
Many-to-many − The following image reflects that more than one instance of an entity
on the left and more than one instance of an entity on the right can be associated with the
relationship. It depicts many-to-many relationship.
i) Write SQL statement that will retrieve fields contactid and lastname from
customers and phonetype and number from PhoneNumbers where contactid
is common between the two relations
SELECT c.contactid, c.lastname, p.phonetype, p.number
FROM customers c, PhoneNumbers p
WHERE c.contactid = p.contactid;
ii) Write SQL statement to create table PhoneNumbers. Remember to use
appropriate data types for the fields
CREATE TABLE PhoneNumbers
(
    contactid int PRIMARY KEY,
    phonetype varchar(20),
    number int
);
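The two statements can be tried together in Python's standard sqlite3 module. This is a sketch only: the customers table definition and the sample rows are assumptions, since the question names just the contactid and lastname fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    contactid int PRIMARY KEY,
    lastname  varchar(50)
);
CREATE TABLE PhoneNumbers (
    contactid int PRIMARY KEY,
    phonetype varchar(20),
    number    int
);
INSERT INTO customers VALUES (1, 'Kamau'), (2, 'Achieng');
INSERT INTO PhoneNumbers VALUES (1, 'mobile', 5550100);
""")

# Only contacts present in both relations are returned.
rows = conn.execute("""
    SELECT c.contactid, c.lastname, p.phonetype, p.number
    FROM customers c, PhoneNumbers p
    WHERE c.contactid = p.contactid
""").fetchall()
```

The customer with contactid 2 has no matching row in PhoneNumbers, so it does not appear in the result.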
iii) What is the difference between composite and key field
Customers
A certain company coming to town has contracted you to develop its database. Describe the steps
you will follow when designing this database.
(i) Determine the purpose of your database
(ii) Determine the tables you need in the database
(iii) Determine the fields you need in the tables
(iv)Identify fields with unique values
(v) Determine the relationships between tables
(vi)Refine your design
(vii) Add data and create other database objects
(viii) Use Microsoft Access analysis tools.
With an aid of examples differentiate between specialization, generalization and
inheritance in a database.
Generalization
Generalization is the process of combining entities into a generalized entity that contains
the properties common to all of them. In generalization, a number
of entities are brought together into one generalized entity based on their similar characteristics.
For example, pigeon, house sparrow, crow and dove can all be generalized as Birds.
Specialization
Specialization is the opposite of generalization. In specialization, a group of entities is divided
into sub-groups based on their characteristics. Take a group ‘Person’ for example. A person has
name, date of birth, gender, etc. These properties are common in all persons, human beings. But
in a company, persons can be identified as employee, employer, customer, or vendor, based on
what role they play in the company.
Similarly, in a school database, persons can be specialized as teacher, student, or staff, based
on what role they play in school as entities.
Inheritance
We use all the above features of the ER Model in order to create classes of objects in
object-oriented programming. The details of entities are generally hidden from the user; this
process is known as abstraction.
For example, the attributes of a Person class such as name, age, and gender can be inherited by
lower-level entities such as Student or Teacher.
Outline five factors to consider when developing a DBMS
Purpose of the database
Entities in the user environment
Their attributes
Relationship among entities
End users
What are access control lists and capabilities, and how do they relate to the protection
matrix model of representing authority in a system?
Access control lists define what users / processes are able to perform certain
actions on a resource. For example, UNIX permissions: read, write, execute.
Capabilities define, for each user / process, what actions they can perform on
different resources. They are both ways to represent the protection matrix model
of authority in a system.
ACLs represent the protection matrix stored by column; capabilities represent the
protection matrix stored by row (assuming, as in the notes, that each row is
labelled by a process and each column is labelled by an object/resource).
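The row and column views can be illustrated with a small Python sketch. The process names, resource names, and rights below are invented for the example.

```python
# Protection matrix: rows are processes, columns are resources.
matrix = {
    "alice": {"report.txt": {"read", "write"}, "printer": {"write"}},
    "bob": {"report.txt": {"read"}},
}

def capabilities(matrix):
    """Store the matrix by row: each process keeps its own list of rights."""
    return {proc: dict(rights) for proc, rights in matrix.items()}

def access_control_lists(matrix):
    """Store the matrix by column: each resource lists who may do what."""
    acls = {}
    for proc, rights in matrix.items():
        for resource, actions in rights.items():
            acls.setdefault(resource, {})[proc] = actions
    return acls

acls = access_control_lists(matrix)
caps = capabilities(matrix)
```

Both structures hold the same information; they differ only in whether the lookup starts from the resource or from the process.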
Random numbers − Users are provided with cards having numbers printed alongside
corresponding letters. The system asks for the numbers corresponding to a few letters
chosen at random.
Secret key − Users are provided with a hardware device which can create a secret id mapped
to the user id. The system asks for this secret id, which is to be generated every time prior to
login.
Cardinality refers to the minimum and maximum number of entity occurrences associated
with the occurrence of a related entity.
Relationship type: In a relationship between two entities A and B it refers to the number of
occurrences of entity B associated with an occurrence of entity A, and the number of
occurrences of entity A associated with an occurrence of entity B.
iii. Partial dependency and transitive dependency
A partial dependency exists when a part of the primary key acts as a determinant attribute.
Example: If (A, B) → (C, D), B → C, and (A, B) is the PK, then the functional dependency B
→ C is a partial dependency because only part of the primary key (B) is needed to determine
the value of C.
A transitive dependency exists when an attribute depends on another attribute via a third
attribute. Example: X → Y, Y → Z, and X is the primary key. In that case, the dependency X
→ Z is a transitive dependency because X determines the value of Z via Y.
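Both examples can be worked through by computing an attribute closure, that is, the set of attributes a given set determines. The Python sketch below is an illustration; the function name closure and the way dependencies are encoded as pairs of tuples are assumptions.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we also know rhs.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Partial dependency example from the text: (A, B) is the key and B -> C.
partial_fds = [(("A", "B"), ("C", "D")), (("B",), ("C",))]
b_alone = closure({"B"}, partial_fds)      # C is reachable from part of the key

# Transitive dependency example from the text: X -> Y and Y -> Z.
transitive_fds = [(("X",), ("Y",)), (("Y",), ("Z",))]
from_x = closure({"X"}, transitive_fds)    # Z is reached only through Y
```

If the closure of a proper subset of the key contains a non-key attribute, as with B and C here, the relation has a partial dependency.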
Discuss the following OODBS concepts
(i) Persistence
(ii) support of transactions
(iii) concurrency control
(iv) resilience and recovery
(v) security
Discuss the TWO statements used by SQL to manage transactions.
COMMIT – permanently saves changes to the DB.
ROLLBACK – it is used to restore the DB to a previously consistent state.
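The effect of the two statements can be seen in a short Python sketch using the standard sqlite3 module, where conn.commit() and conn.rollback() issue COMMIT and ROLLBACK. The account table and its values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()      # COMMIT: the opening balance is now permanent

conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.rollback()    # ROLLBACK: the uncommitted update is discarded
after_rollback = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.commit()      # this time the update is kept
after_commit = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
```

After the rollback the balance is back at its last committed value; only the second update survives.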
Discuss the THREE data integrity and consistency problems that can be caused by poorly
managed concurrent transactions.
The lost update problem occurs when two concurrent transactions, T1 and T2, are updating the
same data element and one of the updates is lost (overwritten by the other transaction).
The phenomenon of uncommitted data occurs when two transactions, T1 and T2, are executed
concurrently and the first transaction (T1) is rolled back after the second transaction (T2) has
already accessed the uncommitted data—thus violating the isolation property of transactions.
Inconsistent retrievals occur when a transaction accesses data before and after another
transaction(s) finish working with such data. For example, an inconsistent retrieval would occur
if transaction T1 calculated some summary (aggregate) function over a set of data while another
transaction (T2) was updating the same data. The problem is that the transaction might read some
data before they are changed and other data after they are changed, thereby yielding inconsistent
results.
Discuss the FOUR properties of a transaction.
Atomicity (all or none) requires that all operations of a transaction be completed; if not, the
transaction is aborted. If a transaction T1 has four SQL requests, all four requests must be
successfully completed; otherwise, the entire transaction is aborted. In other words, a transaction
is treated as a single, indivisible, logical unit of work.
Consistency indicates the permanence of the database’s consistent state. A transaction takes a
database from one consistent state to another consistent state. When a transaction is completed,
the database must be in a consistent state; if any of the transaction parts violates an integrity
constraint, the entire transaction is aborted.
Isolation (independence) means that the data used during the execution of a transaction cannot
be used by a second transaction until the first one is completed (each transaction executes
independently). In other words, if a transaction T1 is being executed and is using the data item X,
that data item cannot be accessed by any other transaction (T2 ... Tn) until T1 ends. This
property is particularly useful in multiuser database environments because several users can
access and update the database at the same time.
Durability ensures that once transaction changes are done (committed), they cannot be undone
or lost, even in the event of a system failure.
Discuss ERD components
a) Entities:
Represented by a rectangle on the ERD
Represent data which could be people, places, objects, event, concepts, or whatever else
an organization wishes to maintain data about.
It can be distinguished from all of the other entities in the business environment by some
set of identifying characteristics or attributes.
It represents a class of things that can occur more than once within the business
environment.
Entity type
A collection of entities that all share one or more common properties or attributes.
Entity instance
Represents a single, unique occurrence of a member of an entity type.
b) Attributes
Represented by an oval or ellipse on the ERD
A characteristic of an entity or a relationship
Example: The entity STUDENT could be described by attributes such as NAME,
ADDRESS, ID NUMBER, MAJOR, etc.
c) Key Attributes
A unique identifier of an entity
Can be a single attribute, or group of attributes
Example: The entity PRODUCT might be uniquely identified by a key PRODUCT ID
d) Concatenated/Composite/Compound Key
a. A group of attributes that uniquely identifies an instance of an entity
e) Candidate Key
a. A candidate to become the primary identifier
b. Example: The entity STUDENT could be uniquely identified by SSN, STUDENT ID,
or E-MAIL ADDRESS
f) Multivalued Attributes:
Multiple values for a single entity instance
Example: the attribute TRAINING of the entity EMPLOYEE can have more than one value at
any given time
Define a constraint and discuss its types
Constraints are conditions that must hold on all valid relation instances. There are three main
types of constraints:
1. Key constraints
2. Entity integrity constraints
3. Referential integrity constraints
1. Key constraints:- Examples include primary, candidate and superkeys (explained
elsewhere in this document).
2. Entity integrity constraints
The entity integrity constraint states that no primary key value can be null. This is
because the primary key value is used to identify individual tuples in a relation. Having
a null value for the primary key implies that we cannot identify some tuples. The constraint also
requires that there are no duplicate entries in the primary key column.
3. Referential integrity
Referential integrity is a database concept that ensures that relationships between tables remain
consistent. When one table has a foreign key to another table, the concept of referential integrity
states that you may not add a record to the table that contains the foreign key unless there is a
corresponding record in the linked table. It also includes the techniques known as cascading
update and cascading delete, which ensure that changes made to the linked table are reflected in
the primary table.
Consider the situation where we have two tables: Employees and Managers. The Employees
table has a foreign key attribute entitled Managed by which points to the record for that
employee’s manager in the Managers table
Using relevant examples discuss any THREE data anomalies associated with redundant
data in a DBMS.
The student should use an example to explain:
Update anomaly
Delete anomaly
Insert anomaly
Mapping Entity
An entity is a real-world object with some attributes.
Entity's attributes should become fields of tables with their respective data types.
Declare primary key.
Mapping Relationship
A relationship is an association among entities.
Mapping Process
Create tables for all higher-level entities.
Declare primary key of higher-level table and the primary key for lower-level table.
A relation can be said to be in 1NF, 2NF or 3NF. Discuss in detail using examples for each
case.
1NF describes a tabular format in which:
• All of the key attributes are defined.
• There are no repeating groups in the table. In other words, each row/column intersection
contains one and only one value, not a set of values.
• All attributes are dependent on the primary key.
2NF: A table is in second normal form (2NF) when:
• It is in 1NF and
• It includes no partial dependencies; that is, no attribute is dependent on only a portion of
the primary key.
3NF: A table is in third normal form (3NF) when:
• It is in 2NF and
• It contains no transitive dependencies.
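One way to see a partial dependency is to test functional dependencies on sample rows. The sketch below uses a hypothetical helper, fd_holds, and an invented enrolment relation with the composite key (student_id, course_id); none of these names come from the guide. Sample data can only refute a dependency, never prove that it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every combination of lhs values maps to one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical relation with composite key (student_id, course_id).
enrolment = [
    {"student_id": 1, "course_id": "DB1", "student_name": "Amina", "grade": "A"},
    {"student_id": 1, "course_id": "OS1", "student_name": "Amina", "grade": "B"},
    {"student_id": 2, "course_id": "DB1", "student_name": "Brian", "grade": "B"},
]

# student_name depends on student_id alone: a partial dependency, so not 2NF.
partial = fd_holds(enrolment, ["student_id"], ["student_name"])
# grade needs the whole key; student_id alone does not determine it.
grade_on_part = fd_holds(enrolment, ["student_id"], ["grade"])
grade_on_key = fd_holds(enrolment, ["student_id", "course_id"], ["grade"])
print(partial, grade_on_part, grade_on_key)
```

Moving student_name into a separate table keyed by student_id would remove the partial dependency and bring this design to 2NF.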
Hierarchical database model
Discuss Codd's 12 rules
A database must support high-level insertion, update, and deletion. This must not be limited
to a single row; that is, it must also support union, intersection and minus operations to yield
sets of data records.
Modules – This is a Visual Basic programming environment that consists of a collection of
declarations and procedures used to automate other database objects.
Pages – Data access pages are used to create web-based databases.
Referential integrity enforces three rules, discuss the three rules.
1. We may not add a record to the Employees table unless the Managed by attribute points
to a valid record in the Managers table.
2. If the primary key for a record in the Managers table changes, all corresponding records
in the Employees table must be modified using a cascading update.
3. If a record in the Managers table is deleted, all corresponding records in the Employees
table must be deleted using a cascading delete.
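The three rules above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: the column names (manager_id, emp_id, managed_by) and the sample rows are assumptions, and SQLite enforces foreign keys only after the PRAGMA shown below is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE managers (manager_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE employees (
        emp_id     INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        managed_by INTEGER NOT NULL
            REFERENCES managers(manager_id)
            ON UPDATE CASCADE
            ON DELETE CASCADE
    )
    """
)
conn.execute("INSERT INTO managers VALUES (1, 'Grace')")
conn.execute("INSERT INTO employees VALUES (10, 'Otieno', 1)")

# Rule 1: an employee pointing at a non-existent manager is rejected.
try:
    conn.execute("INSERT INTO employees VALUES (11, 'Wanjiru', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Rule 2: changing the manager's key cascades to the employee rows.
conn.execute("UPDATE managers SET manager_id = 2 WHERE manager_id = 1")
managed_by_after_update = conn.execute(
    "SELECT managed_by FROM employees WHERE emp_id = 10"
).fetchone()[0]

# Rule 3: deleting the manager cascades to the employee rows.
conn.execute("DELETE FROM managers WHERE manager_id = 2")
employees_left = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(rejected, managed_by_after_update, employees_left)
```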
4. Domain Integrity
Domain integrity states that every element of a relation must respect the type and
restrictions of its corresponding attribute. A type can have a variable length, which must be
respected. Restrictions can include the range of values that the element can take, the default
value used if none is provided, and whether the element can be NULL.
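The restrictions listed above (length, range, default value, and NULL handling) can be expressed as column constraints. The sketch below uses Python's built-in sqlite3 module; the student table, its columns, and the limits chosen are assumptions for illustration. SQLite applies declared column types only loosely, so the CHECK and NOT NULL constraints carry most of the enforcement here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT    NOT NULL CHECK (length(name) <= 30),
        age        INTEGER NOT NULL CHECK (age BETWEEN 0 AND 120),
        status     TEXT    NOT NULL DEFAULT 'active'
    )
    """
)

def try_insert(name, age):
    """Return True if the row satisfies every column constraint."""
    try:
        conn.execute("INSERT INTO student (name, age) VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        return False

ok = try_insert("Amina", 15)          # valid row
negative_age = try_insert("Brian", -3)  # violates the range restriction
null_name = try_insert(None, 20)        # violates NOT NULL
default_status = conn.execute(
    "SELECT status FROM student WHERE name = 'Amina'"
).fetchone()[0]
print(ok, negative_age, null_name, default_status)
```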
State the advantages and disadvantages of database model structures:
Hierarchical Data Structure
Advantages:
• Ease with which data can be stored and retrieved in structured, routine types of transactions.
• Ease with which data can be extracted for reporting purposes.
• Structured and routine types of transaction processing are fast and efficient.
Disadvantages:
• Hierarchical one-to-many relationships must be specified in advance, and are not flexible.
• Cannot easily handle ad hoc requests for information.
• Modifying a hierarchical database structure is complex.
• Requires knowledge of a programming language.
Multidimensional Structure
Advantages:
• Compact and easy-to-understand way to visualize and manipulate data elements that have many
interrelationships.
Disadvantages:
• Not currently developed for broad business application use.
Advantages:
• ... hoc information requests.
• Easy for programmers to work with.
• End users can use this model with little effort or training.
Disadvantages:
• ... transactions as quickly and efficiently as the hierarchical and network models.
Discuss the role of joins in SQL and differentiate between the TWO main SQL joins.
SQL joins are used to retrieve data from two or more related tables, typically using an equality
comparison between the FK and the PK.
INNER JOIN - Returns only the matched records from the tables that are being joined.
OUTER JOIN - Returns the matched records plus the unmatched records from one or both tables;
the columns that have no match in the other table are left null.
(Two marks for the role and two marks for each join type)
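The difference between the two join types can be seen on a pair of small tables. This is a minimal sketch using Python's built-in sqlite3 module; the table contents are invented, and the P_ID column is left nullable here (an assumption) so that one employee has no matching project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE project (P_ID INTEGER PRIMARY KEY, P_Name TEXT NOT NULL);
    CREATE TABLE employee (
        Emp_ID INTEGER PRIMARY KEY,
        Emp_FName TEXT NOT NULL,
        P_ID INTEGER REFERENCES project(P_ID)
    );
    INSERT INTO project VALUES (20, 'Lecture rooms');
    INSERT INTO employee VALUES (1, 'Mark', 20);
    INSERT INTO employee VALUES (2, 'Jane', NULL);
    """
)

# INNER JOIN: only employees whose P_ID matches a project row.
inner_rows = conn.execute(
    """
    SELECT e.Emp_FName, p.P_Name
    FROM employee AS e
    INNER JOIN project AS p ON e.P_ID = p.P_ID
    ORDER BY e.Emp_ID
    """
).fetchall()

# LEFT OUTER JOIN: every employee; unmatched project columns come back as NULL.
left_rows = conn.execute(
    """
    SELECT e.Emp_FName, p.P_Name
    FROM employee AS e
    LEFT OUTER JOIN project AS p ON e.P_ID = p.P_ID
    ORDER BY e.Emp_ID
    """
).fetchall()
print(inner_rows)
print(left_rows)
```

The inner join drops Jane because she has no project, while the left outer join keeps her with None (NULL) in the project column.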
P_ID and Emp_ID are the primary keys of the Project and Employee tables respectively.
In the Employee table, P_ID is a foreign key referencing the Project table.
Write SQL statements that will:
i. Create the employee table.
CREATE TABLE employee(
Emp_ID int NOT NULL,
Emp_FName char(30) NOT NULL,
Emp_LName char(30) NOT NULL,
Gender char(8) NOT NULL,
P_ID int NOT NULL,
Department char(10) NOT NULL,
PRIMARY KEY(Emp_ID),
FOREIGN KEY(P_ID) REFERENCES project(P_ID));
ii. Insert one row of data into the project table.
INSERT INTO project VALUES (1, 'Maseno lecture rooms', 'Maseno University', 'Maseno');
iii. Retrieve the details of all employees working on project ID number 20
SELECT * FROM employee WHERE P_ID = 20;
iv. Update the first name of an employee whose employee ID is 0045 to Mark.
UPDATE employee SET Emp_FName = 'Mark' WHERE Emp_ID = 45;
v. Count the number of female employees in the IT department.
SELECT COUNT(Emp_ID) FROM employee WHERE Department = 'IT' AND Gender = 'Female';
vi. Output the names of all employees and the names of the projects they are working
on.
SELECT employee.Emp_FName, employee.Emp_LName, project.P_Name
FROM employee INNER JOIN project
ON employee.P_ID = project.P_ID;
vii. Output all the unique department names in the organization.
SELECT DISTINCT Department FROM employee;
Define the following terms
Tables − In the relational data model, relations are saved in the format of tables. This format
stores the relation among entities. A table has rows and columns, where rows represent records
and columns represent the attributes.
Tuple − A single row of a table, which contains a single record for that relation, is called a tuple.
Relation instance − A finite set of tuples in the relational database system represents a relation
instance. Relation instances do not have duplicate tuples.
Relation schema − A relation schema describes the relation name (table name), attributes, and
their names.
Relation key − Each row has one or more attributes, known as relation key, which can identify
the row in the relation (table) uniquely.
Attribute domain − Every attribute has some pre-defined value scope, known as attribute
domain.
Discuss different types of locks in concurrency control
Two-phase locking has two phases, one is growing, where all the locks are being acquired by
the transaction; and the second phase is shrinking, where the locks held by the transaction are
being released. To claim an exclusive (write) lock, a transaction must first acquire a shared
(read) lock and then upgrade it to an exclusive lock.
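The growing and shrinking phases can be modelled with a small class. This is a hypothetical sketch, not a real lock manager: it tracks the locks of a single transaction only, ignores conflicts between transactions, and the class and method names are invented for illustration.

```python
class TwoPhaseTransaction:
    """Tracks the locks of one transaction and enforces the two-phase rule."""

    def __init__(self, name):
        self.name = name
        self.locks = {}          # item -> "S" (shared) or "X" (exclusive)
        self.shrinking = False   # becomes True after the first release

    def acquire(self, item, mode):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot acquire locks while shrinking")
        if mode not in ("S", "X"):
            raise ValueError("mode must be 'S' or 'X'")
        # Upgrading S to X is allowed in the growing phase; never downgrade here.
        if self.locks.get(item) != "X":
            self.locks[item] = mode

    def release(self, item):
        self.shrinking = True
        self.locks.pop(item, None)


t1 = TwoPhaseTransaction("T1")
t1.acquire("A", "S")   # growing phase: shared (read) lock
t1.acquire("A", "X")   # upgrade to exclusive (write) lock
t1.acquire("B", "S")
held_before_release = dict(t1.locks)
t1.release("A")        # shrinking phase starts

try:
    t1.acquire("C", "S")  # not allowed once any lock has been released
    violation_detected = False
except RuntimeError:
    violation_detected = True
print(held_before_release, violation_detected)
```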
Timestamp-based Protocols
A commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
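A minimal sketch of the idea, assuming a logical counter as the timestamp source. The read and write checks follow the basic timestamp-ordering rules, which the text above does not spell out; the Item class and the begin, read, and write functions are hypothetical names.

```python
import itertools

_counter = itertools.count(1)  # logical counter used as the timestamp source


class Item:
    """A data item that remembers the youngest reader and writer timestamps."""

    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0


def begin():
    """A transaction receives its timestamp at creation."""
    return next(_counter)


def read(ts, item):
    # Reject a read of a value already overwritten by a younger transaction.
    if ts < item.write_ts:
        return False
    item.read_ts = max(item.read_ts, ts)
    return True


def write(ts, item):
    # Reject a write if a younger transaction has already read or written the item.
    if ts < item.read_ts or ts < item.write_ts:
        return False
    item.write_ts = ts
    return True


x = Item()
t1, t2 = begin(), begin()      # t1 is older than t2
t2_writes = write(t2, x)       # allowed: no younger access yet
t1_reads = read(t1, x)         # rejected: t2 (younger) already wrote x
t2_reads = read(t2, x)         # allowed
print(t1, t2, t2_writes, t1_reads, t2_reads)
```

In a full protocol, a rejected operation would cause the transaction to be rolled back and restarted with a new timestamp.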
Explain the factors a person wishing to acquire a DBMS should take into account before
purchasing the DBMS he needs to run his business.
-Cost
-Security features
-Recovery features
-User interface
-Memory requirements
Briefly explain the components of a DBMS environment:
- Hardware:
- Software:
- People:
- Procedures:
- Data:
Relational Algebra
Relational algebra is a procedural query language, which takes instances of relations as input
and yields instances of relations as output. It uses operators, each either unary or binary, to
perform queries. Because intermediate results are themselves relations, operators can be
applied recursively to them. The fundamental operations of relational algebra are:
Select
Project
Union
Set difference
Cartesian product
Rename
We will discuss all these operations in the following sections.
Select Operation (σ)
Notation − σ_p(r)
where σ is the selection operator, r is a relation, and p is the selection predicate: a
propositional logic formula that may use connectors such as and, or, and not, and relational
operators such as =, ≠, ≥, <, >, ≤.
For example −
σ_{subject = "database"}(Books)
Output − Selects tuples from Books where subject is 'database'.
σ_{subject = "database" and price = 450}(Books)
Output − Selects tuples from Books where subject is 'database' and price is 450.
σ_{subject = "database" and price = 450 or year > 2010}(Books)
Output − Selects tuples from Books where subject is 'database' and price is 450, or those books
published after 2010.
Project Operation (∏)
For example −
∏_{subject, author}(Books)
Output − Selects and projects the columns named subject and author from the relation Books.
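Union Operation (∪)
Notation − r ∪ s
r ∪ s = { t | t ∈ r or t ∈ s }
For a union operation to be valid, the following conditions must hold:
• r and s must have the same number of attributes.
• Attribute domains must be compatible.
• Duplicate tuples are automatically eliminated.
For example −
∏_{author}(Books) ∪ ∏_{author}(Articles)
Output − Projects the names of the authors who have written a book, an article, or both.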
Set Difference (−)
Notation − r − s
Finds all the tuples that are present in r but not in s.
For example −
∏_{author}(Books) − ∏_{author}(Articles)
Output − Provides the names of authors who have written books but not articles.
Cartesian Product (×)
Notation − r × s
r × s = { q t | q ∈ r and t ∈ s }
For example −
σ_{author = 'tutorialspoint'}(Books × Articles)
Output − Yields a relation that shows all the books and articles written by tutorialspoint.
Rename Operation (ρ)
Notation − ρ_x(E)
where the result of expression E is saved under the name x.
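The operations above map closely onto Python sets of tuples. The sketch below is an illustration, not a query engine: the Book and Article relations, their attributes, and the sample rows are all invented, and the select, project, and product helpers are hypothetical names.

```python
from collections import namedtuple

Book = namedtuple("Book", "title author subject price")
Article = namedtuple("Article", "title author")

books = {
    Book("DB Basics", "alice", "database", 450),
    Book("Networks", "bob", "networking", 300),
}
articles = {
    Article("Indexing", "alice"),
    Article("Routing", "carol"),
}

def select(relation, predicate):
    """sigma: keep the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}

def project(relation, *attrs):
    """pi: keep only the named attributes; the set removes duplicates."""
    return {tuple(getattr(t, a) for a in attrs) for t in relation}

def product(r, s):
    """Cartesian product: every tuple of r paired with every tuple of s."""
    return {(q, t) for q in r for t in s}

db_books = select(books, lambda b: b.subject == "database")
book_authors = project(books, "author")
article_authors = project(articles, "author")
either = book_authors | article_authors      # union
books_only = book_authors - article_authors  # set difference
pairs = product(books, articles)
print(sorted(either), sorted(books_only), len(pairs))
```

Because every helper returns a set of tuples, the result of one operation can be fed straight into another, which mirrors how relational algebra expressions are nested.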
Additional operations are −
Set intersection
Assignment
Natural join
Relational Calculus
In contrast to Relational Algebra, Relational Calculus is a non-procedural query language, that
is, it tells what to do but never explains how to do it.
Notation − {T | Condition}
For example −
Tuple Relational Calculus (TRC) can be quantified. We can use existential (∃) and universal (∀) quantifiers.
For example −
Domain Relational Calculus (DRC) notation −
{ a1, a2, a3, ..., an | P(a1, a2, a3, ..., an) }
where a1, a2, ..., an are attributes and P stands for a formula built from those attributes.
For example −
{ <article, page, subject> | <article, page, subject> ∈ TutorialsPoint ∧ subject = 'database' }
Output − Yields article, page, and subject from the relation TutorialsPoint, where subject is
'database'.
Just like TRC, DRC can also be written using existential and universal quantifiers. DRC also
involves relational operators.
The expressive power of Tuple Relational Calculus and Domain Relational Calculus is equivalent
to that of Relational Algebra.
Deferred database modification − All logs are written to stable storage, and the
database is updated only when a transaction commits.
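A simplified sketch of deferred modification, assuming an in-memory dictionary stands in for the database and a Python list stands in for the log on stable storage. The DeferredDatabase class and its methods are hypothetical; a real DBMS would also log commit records and handle crash recovery.

```python
class DeferredDatabase:
    """Buffers writes in a log and applies them only at commit."""

    def __init__(self):
        self.data = {}   # stands in for the database on disk
        self.log = []    # stands in for the log on stable storage

    def write(self, txn, key, value):
        # Record the intended change; the database itself is untouched.
        self.log.append((txn, key, value))

    def commit(self, txn):
        # Replay this transaction's log records against the database.
        for logged_txn, key, value in self.log:
            if logged_txn == txn:
                self.data[key] = value
        self.log = [rec for rec in self.log if rec[0] != txn]

    def abort(self, txn):
        # Nothing to undo: discard the log records.
        self.log = [rec for rec in self.log if rec[0] != txn]


db = DeferredDatabase()
db.write("T1", "balance", 100)
before_commit = dict(db.data)   # still empty: T1 has not committed
db.commit("T1")
db.write("T2", "balance", 0)
db.abort("T2")                  # T2's write never reaches the database
after = dict(db.data)
print(before_commit, after)
```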
Primary Index − Primary index is defined on an ordered data file. The data file is
ordered on a key field. The key field is generally the primary key of the relation.
Secondary Index − Secondary index may be generated from a field which is a candidate
key and has a unique value in every record, or a non-key with duplicate values.
Clustering Index − Clustering index is defined on an ordered data file. The data file is
ordered on a non-key field.
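A primary index lookup can be sketched as follows. The block layout is an assumption: the data file is modelled as a list of small blocks ordered on the key field, and the index holds one entry per block (the first key in that block). The records and the lookup function are invented for illustration.

```python
import bisect

# Data file ordered on the key field, split into blocks of two records.
blocks = [
    [(101, "Amina"), (105, "Brian")],
    [(110, "Carol"), (118, "David")],
    [(120, "Esther"), (131, "Felix")],
]

# Primary index: one entry per block, holding the block's first key.
index_keys = [block[0][0] for block in blocks]

def lookup(key):
    """Find the block via the index, then scan only that block."""
    pos = bisect.bisect_right(index_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every key in the file
    for record_key, value in blocks[pos]:
        if record_key == key:
            return value
    return None

found = lookup(118)
missing = lookup(119)
too_small = lookup(50)
print(found, missing, too_small)
```

Because the file is ordered on the indexed field, a binary search over the index narrows the search to a single block.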
Logical errors − Where a transaction cannot complete because it has some code error or
any internal error condition.
System errors − Where the database system itself terminates an active transaction
because the DBMS is not able to execute it, or it has to stop because of some system
condition. For example, in case of deadlock or resource unavailability, the system aborts
an active transaction.
System Crash
There are problems − external to the system − that may cause the system to stop abruptly and
crash. For example, an interruption in the power supply may cause the underlying hardware or
software to fail.
Disk Failure
In the early days of technology evolution, it was common for hard-disk drives or other
storage drives to fail frequently.
REVISION QUESTIONS
Define RDBMS and, with the help of a well-labeled diagram, describe RDBMS architecture.
State any three data description and two data manipulation commands as used
in SQL.
List four examples of Database Management System
Explain the four basic building blocks of all data models
With the help of a well-labeled diagram describe a client/server architecture.
Why does a database system need a life cycle? Use a diagram to explain the
characteristics of a basic information system.
Discuss the various database design processes.
Explain the following broad categories of SQL functions.
i) Data Definition Language (DDL)
ii) Data Manipulation Language (DML)
iii) Data Control Language (DCL)
Describe the process of data integration as applied to DBMS
Write an SQL statement that will produce the output shown below.
iv) Write SQL statement that will retrieve fields contactid and lastname from
customers and phonetype and number from PhoneNumbers where contactid is
common between the two relations
v) Write SQL statement to create table PhoneNumbers. Remember to use
appropriate data types for the fields
vi) What is the difference between a composite and a key field?
Databases fall under two different types; explain the two types.
Give five advantages and disadvantages of client/server architecture
Define Query functions and state two types of Integrity Constraints.
Explain the SQL syntax of:
i) CREATE TABLE ii) UPDATE record
Note that the "P_Id" column in the "Orders" table points to the "P_Id" column in the
"Persons" table.
Write DDL statements to create the two relations
Write DML statements to store the data in the two relations
Write SQL statement to add another field/ column “phoneNumber” to the persons table
Discuss various types of relationships used in DBMS.
Define normalization and identify two common situations in which database designers
use normalization.
Describe database normal forms
Explain the concept of concurrency control in database transactions.
Define schema, and describe the term semiology as applied to databases.
Formal language defines a schema as a data model. State three major principles of
a data model.