Index
Q.No Questions
Q1 Explain in detail Codd’s rules.
Q2 Define normalization. Explain three normal forms with suitable examples.
Q3 Need for normalization.
Q4 Explain an anomaly in database design. How can it be solved?
Q5 Explain 2NF and 3NF with suitable examples.
Q6 Describe the concept of transitive dependency.
Q7 How is the data integrity problem handled by a DBMS?
Q8 Define BCNF.
Q9 Describe decomposition. What are the desirable properties of decomposition?
Q10 Explain the need for a foreign key.
Q11 Differentiate between primary key and foreign key.
Q12 Write a short note on multi-valued dependency.
Q13 Differentiate between 3NF and BCNF.
Q14 Explain why 4NF is more desirable than BCNF. Rewrite the definitions of 4NF and BCNF using the notions of domain constraint and general constraint.
Unit 1
Q.1 Explain the significant differences between a file processing system and a DBMS.
1. File Processing System: Duplicate data may exist in multiple files, which leads to data redundancy.
   DBMS: The data is integrated into a single database, which avoids data redundancy.
2. File Processing System: Data inconsistency occurs when data is not updated in all the files simultaneously.
   DBMS: Data consistency is obtained by controlling data redundancy.
3. File Processing System: It is difficult to share data in a traditional file system.
   DBMS: Data can easily be shared by different applications.
4. File Processing System: Data is stored in files in a specific format. If the format of any file is changed, the program that processes the file must also be changed.
   DBMS: The data structure of the database can be completely separated from the programs or applications that access the data.
5. File Processing System: There is no centralized data control; the data is decentralized or distributed.
   DBMS: The DBMS provides centralized data storage, so keeping control of the data is much easier.
6. File Processing System: It is very difficult to enforce security checks and access rights.
   DBMS: Different users can have different levels of access to data based on their roles, which provides strong security for the data.
7. File Processing System: Concurrency problems arise: updates of the same data by multiple users may produce incorrect results.
   DBMS: The DBMS has sub-systems to control concurrency.
8. File Processing System: It is difficult to represent complex data and inter-file relationships, which results in poor data modelling.
   DBMS: The DBMS provides many functionalities to represent complex data and inter-file relationships, which helps to map the database to real-world applications.
Q.2 Describe the relational model. Explain the structure and characteristics of the relational model.
The relational model stores data in the form of tables. The concept was introduced by Dr. E. F. Codd, a researcher at IBM. The relational model is the first choice of commercial data-processing applications for storing data. It is popular because its structure is simpler than that of other database models such as the network or hierarchical model.
The relational model is now considered the primary model for databases. It is a simple model that has all the properties and capabilities required to process data.
Most modern Database Management Systems (DBMS) are relational.
Definition: A relational database is a combination of data structures, storage and retrieval operations, and integrity constraints.
The characteristics of relational database systems are as follows:
1. Dr. Codd called this model the Relational Model because the data is stored in tables (relations), which can be related to one another.
2. All the data of the system is represented as a systematic arrangement of rows and columns, called a relation or table.
3. A table is a two-dimensional structure.
4. At any given row/column position, that is, in every cell of the relation, there is one and only one value, which is known as a scalar value.
5. Each column represents an attribute and has a distinct name. All values entered in a column have the same format.
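The characteristics above can be seen in a small example. This is a minimal sketch using Python's built-in sqlite3 module; the student table and its rows are hypothetical and serve only as illustration.

```python
import sqlite3

# Hypothetical "student" relation, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Mumbai")],
)

cursor = conn.execute("SELECT * FROM student ORDER BY roll_no")
column_names = [d[0] for d in cursor.description]  # attributes (columns)
rows = cursor.fetchall()                           # tuples (rows)

print(column_names)
for row in rows:
    print(row)  # every cell holds exactly one scalar value
```

Each row has one value per column, and each column has a distinct name, as stated in points 4 and 5.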
Q.3 Explain in detail the different levels of abstraction.
1. Physical Level
The physical level is the lowest level of data abstraction. It describes how the data is actually stored in physical storage.
The physical storage may be hard disks, magnetic tapes, etc. At the physical level, methods such as hashing are used to organize the data.
At this level the developer knows the requirements, size and access frequency of the records, which makes this level easier to design.
2. Logical Level
This is the next higher level of abstraction. It describes what data the database stores and what relationships exist among the data items. The logical level thus describes the entire database in terms of a small number of relatively simple structures.
Although implementing these simple structures may involve complex physical-level structures, the user of the logical level does not need to be aware of this complexity. This is known as physical data independence. Database administrators use the logical level of abstraction to decide what information to keep in a database.
Q.4 Explain the need for views.
The view level is the highest level of data abstraction. It describes how users interact with the database system. The logical level uses simple structures, but complexity remains because a large database stores many kinds of information.
Many users are not aware of the technical details of the system, and they do not need access to all the information in the database. Hence it is necessary to provide a simple, restricted interface for such users according to their requirements. Multiple views can be created on the same database for different users.
Example: Consider an organization that stores information about all its employees in an employee table.
At the physical level, these records can be described as blocks of storage (bytes, gigabytes, terabytes, etc.) in memory. These details are usually hidden from the developer.
At the logical level, the records can be described as fields and attributes along with their data types. The relationships between these fields can be implemented logically.
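A view for the employee example can be sketched as follows. This is only an illustration using Python's sqlite3 module; the column names, the sample rows and the view name employee_directory are assumptions, not part of the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical employee table; columns are assumed for illustration.
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "HR", 50000), (2, "Ravi", "IT", 65000)],
)

# The view exposes only what a directory user needs and hides the salary.
conn.execute(
    "CREATE VIEW employee_directory AS "
    "SELECT emp_id, name, dept FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

A second view with different columns could be created on the same table for another group of users.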
Q.5 Explain the distinction among the terms Primary Key, Candidate Key, and Super Key with suitable examples.
It is important to specify how the entities in a given entity set are distinguished from one another. Entities are distinguished by the values of their attributes.
These values must differ so that each entity can be identified uniquely. In an entity set, no two entities may have the same values for all the attributes.
In a database schema the notion of a key applies directly to entity sets. A key of an entity set is an attribute or set of attributes that is used to distinguish the entities from each other.
Keys are also used to identify relationships uniquely and to distinguish relationships from each other. Several types of keys are used in a DBMS, including the following:
1. Primary Key
A primary key uniquely identifies each entity in the entity set. It must have unique values and cannot hold null values.
Let R be a relationship set involving entity sets E1, E2, …, En, and let primary-key(Ei) denote the set of attributes that forms the primary key of entity set Ei.
The composition of the primary key of a relationship set depends on the set of attributes associated with the relationship set R.
Example: In a bank database, the attribute account_number should be the primary key, because this field cannot be NULL and no account_number may be repeated.
2. Super Key
A super key is a set of one or more attributes that, taken together, uniquely identifies each entity.
Example: Consider a student database with the attributes Student_reg_id, Student_roll_no, Student_name, Address, Contact_no.
Some of the super keys are:
{Student_reg_id}
{Student_roll_no}
{Student_reg_id, Student_roll_no}
{Student_reg_id, Student_name}
{Student_roll_no, Student_name}
{Student_reg_id, Student_roll_no, Student_name}
A super key can therefore be any combination of attributes that contains a uniquely identifying attribute; adding further attributes to a super key gives another super key.
3. Candidate Key
A candidate key is a set of attributes that holds unique values and has no redundant attributes.
In other words, a super key from which no attribute can be removed without losing uniqueness is a candidate key. Candidate keys are selected from the set of super keys.
Candidate keys are therefore also known as minimal super keys. An attribute that never contains duplicate values may be a candidate key.
Example: Consider a student database with the attributes Student_reg_id, Student_roll_no, Student_name, Address, Contact_no.
The candidate keys are:
{Student_reg_id}
{Student_roll_no}
The combination {Student_reg_id, Student_roll_no} is a super key but not a candidate key, because either attribute alone already identifies a student.
The primary key is the one candidate key that the database designer chooses to identify the records in the table.
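The difference between super keys and candidate keys can be checked on sample data. The sketch below is only illustrative: the three rows are hypothetical, only three of the attributes are used, and uniqueness in a small sample does not prove that a key holds for every possible state of the table.

```python
from itertools import combinations

attributes = ("Student_reg_id", "Student_roll_no", "Student_name")
# Hypothetical sample rows; note that the name "Asha" repeats.
rows = [
    ("R001", 1, "Asha"),
    ("R002", 2, "Ravi"),
    ("R003", 3, "Asha"),
]

def is_super_key(attr_subset):
    """True if the chosen attributes have a distinct value combination in every row."""
    idx = [attributes.index(a) for a in attr_subset]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(rows)

# Every attribute combination that identifies rows uniquely is a super key.
super_keys = [
    set(combo)
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
    if is_super_key(combo)
]

# A candidate key is a super key with no smaller super key inside it.
candidate_keys = [
    key for key in super_keys
    if not any(other < key for other in super_keys)
]

print(len(super_keys), candidate_keys)
```

On this sample, six super keys are found but only {Student_reg_id} and {Student_roll_no} are minimal.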
Q.6 Explain the Selection and Projection operations of relational algebra with examples.
1. Selection (σ)
As the name suggests, selection picks the rows of a relation that satisfy a given condition.
It is a unary operation that returns the horizontal subset of the relation fulfilling the specified condition.
The condition may use comparison operators such as <, >, <=, >=, = and !=, and logical operators such as AND, OR and NOT to combine several filtering conditions.
This operation can be represented as follows:
Syntax
σ(Condition) (Relation Name)
2. Projection (π)
Like selection, projection is a unary operation. It creates a vertical subset of the relation that contains only the specified columns. It is denoted as below:
Syntax
π(Column 1, Column 2, …, Column n) (Relation Name)
where π is the symbol used for projection, Relation Name is the name of the relation, and Column 1, Column 2, … are the columns of the relation that will appear in the resulting subset.
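Both operations can be sketched in a few lines. The functions select and project and the employee relation below are hypothetical helpers written for illustration; a relation is modelled as a list of dictionaries.

```python
def select(relation, predicate):
    """Selection: keep the rows for which the condition is true."""
    return [row for row in relation if predicate(row)]

def project(relation, columns):
    """Projection: keep only the given columns and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result

# Hypothetical relation used only for illustration.
employee = [
    {"emp_id": 1, "name": "Asha", "dept": "HR", "salary": 50000},
    {"emp_id": 2, "name": "Ravi", "dept": "IT", "salary": 65000},
    {"emp_id": 3, "name": "Meera", "dept": "IT", "salary": 40000},
]

# Selection with the condition dept = 'IT' AND salary > 45000
high_paid_it = select(employee, lambda r: r["dept"] == "IT" and r["salary"] > 45000)

# Projection on the single column dept
departments = project(employee, ["dept"])

print(high_paid_it)
print(departments)
```

Selection returns whole rows (a horizontal subset), while projection returns whole columns (a vertical subset) with duplicates removed.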
Q.7 Describe mapping cardinality. Explain the different types of cardinalities for a binary relationship with examples.
In a Database Management System, the term "cardinality" can refer to the uniqueness of the data values contained in a column.
If a column contains a large percentage of unique values, it has high cardinality; if it contains many repeated values, it has low cardinality.
Cardinality also refers to the relationships between tables (mapping cardinality). It can be one-to-one, one-to-many, many-to-one or many-to-many.
A relationship in which two entity sets participate is called a binary relationship.
1. One to One
An entity in entity set A can be associated with at most one entity in entity set B, and an entity in B can be associated with at most one entity in A.
2. One to Many
An entity in A can be associated with many entities in B, but an entity in B can be associated with at most one entity in A.
3. Many to One
An entity in A can be associated with at most one entity in B, but an entity in B can be associated with any number of entities in A.
4. Many to Many
An entity in A can be associated with any number of entities in B, and an entity in B can be associated with any number of entities in A.
• Example: Many students learn many subjects.
The appropriate mapping cardinality for a particular relationship set depends on the real-world situation that the relationship set models.
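The four cases can be told apart by counting how many partners each entity has. The function below is a minimal sketch; mapping_cardinality and the sample pairs are hypothetical, and the result describes only the given sample, not the rule that the designer intends.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify a binary relationship given as (entity_in_A, entity_in_B) pairs."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    # Does some entity of A relate to several entities of B, and vice versa?
    a_has_many = any(len(partners) > 1 for partners in a_to_b.values())
    b_has_many = any(len(partners) > 1 for partners in b_to_a.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"

# Many students learn many subjects.
learns = [("Asha", "Maths"), ("Asha", "Physics"), ("Ravi", "Maths")]
print(mapping_cardinality(learns))
```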
Q.8 Discuss the weak entity set and how it is represented in E-R diagrams.
Weak Entity
A weak entity is an entity that depends on another entity (its owner) and cannot be identified uniquely by its own attributes alone. In an E-R diagram a weak entity set is represented by a double rectangle.
Example: Subject is a weak entity because it depends on Course.
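One common way to implement such a dependency in tables is sketched below with Python's sqlite3 module. The column names, the sample rows and the use of ON DELETE CASCADE are assumptions made for illustration; the notes themselves only state that Subject depends on Course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT)")
# The weak entity borrows the owner's key: its primary key is (course_id, subject_no).
conn.execute(
    """
    CREATE TABLE subject (
        course_id TEXT NOT NULL REFERENCES course(course_id) ON DELETE CASCADE,
        subject_no INTEGER NOT NULL,
        subject_name TEXT,
        PRIMARY KEY (course_id, subject_no)
    )
    """
)

conn.execute("INSERT INTO course VALUES ('BCA', 'Computer Applications')")
conn.executemany(
    "INSERT INTO subject VALUES (?, ?, ?)",
    [("BCA", 1, "DBMS"), ("BCA", 2, "Networks")],
)

subjects_before = conn.execute("SELECT COUNT(*) FROM subject").fetchone()[0]
# Removing the owner removes its dependent subjects as well.
conn.execute("DELETE FROM course WHERE course_id = 'BCA'")
subjects_after = conn.execute("SELECT COUNT(*) FROM subject").fetchone()[0]
print(subjects_before, subjects_after)
```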
Q.10 Write the syntax for the following SQL commands.
1) Create Table
CREATE TABLE table_name (
    column1 datatype,
    column2 datatype,
    …
);
2) Alter Table
ALTER TABLE table_name
ADD column_name datatype;
3) Drop Table
DROP TABLE table_name;
4) Delete
DELETE FROM table_name
WHERE condition;
5) Update
UPDATE table_name
SET column1 = value1, column2 = value2
WHERE condition;
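The five commands can be tried out with Python's built-in sqlite3 module. This is a minimal sketch; the student table and its values are hypothetical, and other database systems may differ slightly in data types and ALTER TABLE options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1) CREATE TABLE
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT)")

# 2) ALTER TABLE: add a new column
conn.execute("ALTER TABLE student ADD COLUMN city TEXT")

conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Mumbai")],
)

# 5) UPDATE: change the rows that match the condition
conn.execute("UPDATE student SET city = 'Nagpur' WHERE roll_no = 2")

# 4) DELETE: remove the rows that match the condition
conn.execute("DELETE FROM student WHERE roll_no = 1")

remaining = conn.execute("SELECT roll_no, name, city FROM student").fetchall()
print(remaining)

# 3) DROP TABLE: remove the table itself
conn.execute("DROP TABLE student")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(tables_left)
```

Note that DELETE and UPDATE without a WHERE clause affect every row of the table.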
Q.12 Write a note on Tuple Relational Calculus.
1. The tuple relational calculus is a nonprocedural language.
2. A query in the tuple relational calculus is expressed as
{t | P(t)}
that is, the set of all tuples t for which the predicate P is true.
Conditions equivalent to the selection and projection operators are used to select the required data from the table.
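The form {t | P(t)} reads much like a comprehension in a programming language: we state which tuples are wanted, not how to find them. The sketch below is an analogy only; the employee relation and its values are hypothetical.

```python
# Hypothetical relation used only for illustration.
employee = [
    {"name": "Asha", "dept": "HR", "salary": 50000},
    {"name": "Ravi", "dept": "IT", "salary": 65000},
    {"name": "Meera", "dept": "IT", "salary": 40000},
]

# { t | t in employee AND t.salary > 45000 }
well_paid = [t for t in employee if t["salary"] > 45000]

# { t.name | t in employee AND t.dept = "IT" }
it_names = {t["name"] for t in employee if t["dept"] == "IT"}

print(well_paid)
print(it_names)
```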
Unit 2
Q. Explain in detail Codd’s rules.
Codd designed these rules to define what is required of a relational database management system.
Very few commercial products follow all thirteen rules. Even Oracle is said to follow only eight and a half of the thirteen.
Codd’s rules are:
Rule 0: Foundation Rule
The foundation rule states that a relational database management system must be able to manage databases entirely through its relational capabilities. The other twelve rules are derived from this foundation rule.
The remaining twelve rules are as follows:
Rule 1: Information Representation
All the information in the database must be stored in the standard form of tables.
The table is considered the best format in which to store and manage the data.
Rule 3: Systematic treatment of null values
In a database, NULL values must be given a systematic and uniform treatment. In many cases a NULL value has to be stored in place of data, for example when the data is missing, not known, or not applicable.
A NULL value is different from an empty string, a string of blank characters, zero, or any other number.
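The difference between NULL, an empty string and zero can be observed directly. This is a minimal sketch using Python's sqlite3 module; the contact table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (name TEXT, phone TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO contact VALUES (?, ?, ?)",
    [
        ("Asha", None, 0),        # phone unknown (NULL), age zero
        ("Ravi", "", None),       # phone is an empty string, age unknown (NULL)
        ("Meera", "98765", 30),
    ],
)

def names(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]

null_phone = names("SELECT name FROM contact WHERE phone IS NULL")
empty_phone = names("SELECT name FROM contact WHERE phone = ''")
zero_age = names("SELECT name FROM contact WHERE age = 0")
# Comparing with NULL using '=' never matches; IS NULL must be used.
equals_null = names("SELECT name FROM contact WHERE phone = NULL")

print(null_phone, empty_phone, zero_age, equals_null)
```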
c) Supports data definition operations (including view definitions), data manipulation operations (update as
well as retrieval), security and integrity constraints, and transaction management operations (begin,
commit, and rollback).
Rule 6: View updating rule
Views are virtual tables created using queries to show a partial or complete view of a table.
All views that are theoretically updatable must also be updatable by the system.
The rule states that we should be able to make changes through views.
Rule 7: High-level Insert, Update, and Delete Rule
High-level means that a single query can affect multiple rows and columns. This rule states that a database must support insertion, updation and deletion at the set level. These operations must not be limited to a single row; the system must also support union, minus and intersection operations to yield sets of data records.
For example, suppose employees get a 5% hike in a year. Their salaries have to be updated to reflect the new salary. Since this is the annual hike, the increment applies to all employees.
Hence the query should not have to be written to update the salaries one by one for thousands of employees. A single query should be enough to update the salaries of all employees at once.
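The salary example can be written as one set-level UPDATE statement. The sketch below uses Python's sqlite3 module; the employee table and the salary figures are hypothetical, and salaries are stored as integers so that the arithmetic stays exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 50000), (2, "Ravi", 65000), (3, "Meera", 40000)],
)

# One statement applies the 5% hike to every row; no row-by-row loop is needed.
cursor = conn.execute("UPDATE employee SET salary = salary + salary * 5 / 100")
rows_updated = cursor.rowcount

salaries = [
    row[0] for row in conn.execute("SELECT salary FROM employee ORDER BY emp_id")
]
print(rows_updated, salaries)
```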
Rule 8: Physical data independence
Changes at the physical level (i.e. the format of data storage, or the container of data such as an array or linked list) must not require any change to an application.
Rule 9: Logical data independence
The user’s view (application) should not depend on the logical structure of the data in the database. That is, changes in the logical structure must not affect the applications that use it. For example, if two tables are merged, or one table is split into two, there should be no impact on the user application.
Rule 10: Integrity independence
The integrity constraints must be specified independently of application programs and stored in the catalog.
We should be able to change the integrity constraints without needing any change in the application.
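Rule 11: Distribution independence
The distribution of the database across various locations should not be visible to the end users; applications should work in the same way whether the data is centralized or distributed.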
Rule 12: The non-subversion rule
If a relational system has a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level relational language.
In other words, the integrity rules and constraints must always be followed; we cannot violate them through any back-door option.
Q. Define normalization. Explain three normal forms with suitable examples.
Dr. Edgar F. Codd proposed normalization as an integral part of the relational model. Normalization is a database design technique used to organize tables in such a manner that redundancy and dependency of data are reduced.
It divides larger tables into smaller tables and links the smaller tables using their relationships.
Normalization is implemented by following some formal rules, either by a process of synthesis or by decomposition.
Synthesis is used to create a database design in normalized form based on a known set of dependencies.
Decomposition is generally applied to an existing database that is not normalized. The relations of this database are further divided into multiple relations to improve the design.
Normalization is a multi-step process. It creates smaller tables from a larger table.
Types of Normalization
1) First Normal Form (1NF): Atomic values in every cell, no repeating groups.
2) Second Normal Form (2NF): 1NF, and no partial dependency.
3) Third Normal Form (3NF): 2NF, and no transitive dependency.
4) Boyce-Codd Normal Form (BCNF): A stricter version of 3NF.
5) Fourth Normal Form (4NF): BCNF, and no multi-valued dependency.
Q. Need for normalization
Without normalization, it is not easy to manage database operations such as insertion, deletion and modification without facing data-loss problems.
If a database is not normalized, insertion, updation and deletion anomalies occur frequently. To understand the need for normalization, we should first learn what anomalies are.
Q. Explain an anomaly in database design. How can it be solved?
Anomalies are inconvenient or error-prone situations that arise when we process the tables. They are solved by normalizing the tables. There are three types of anomalies:
A) Insert Anomaly
An insert anomaly occurs when it is not possible to insert certain attributes into the database without the availability of other attributes.
For example, in a College Database System, it is not possible to add an entry for a new course unless some student has enrolled for it.
Here no student has enrolled for the course Dot Net, hence there is no entry for the course Dot Net.
B) Update Anomaly
An update anomaly occurs when a change has to be made in multiple records of an entity, but not all of the records get updated.
For example, suppose the address of Kunal changes. If Kunal appears in several records, every one of them must be updated; if any is missed, the data becomes inconsistent.
C) Delete Anomaly
A delete anomaly is the opposite of an insert anomaly. It occurs when the data of some attributes is lost because the data of other attributes in the same record is deleted.
For example, if the only entry for the course Oracle, that of student Jay, is deleted, then no record related to the course Oracle remains.
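The update and delete anomalies can be reproduced on a small unnormalized table. The sketch below is only illustrative; the rows, addresses and the course Java are hypothetical values chosen to match the style of the examples above.

```python
# Hypothetical unnormalized table: student, address and course in one row.
enrollment = [
    {"student": "Kunal", "address": "Pune", "course": "DS"},
    {"student": "Kunal", "address": "Pune", "course": "Java"},
    {"student": "Jay", "address": "Mumbai", "course": "Oracle"},
]

# Update anomaly: Kunal's address is changed in only one of his rows.
enrollment[0]["address"] = "Nashik"
kunal_addresses = {r["address"] for r in enrollment if r["student"] == "Kunal"}
print(kunal_addresses)  # two different addresses for the same student

# Delete anomaly: deleting Jay's only row also deletes the only record of Oracle.
after_delete = [r for r in enrollment if r["student"] != "Jay"]
courses_after = {r["course"] for r in after_delete}
print(courses_after)
```

Storing students, courses and enrollments in separate tables avoids both problems, because each fact is then stored only once.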
Q. Explain 2NF and 3NF with suitable examples.
2NF:-
To understand the second normal form, we should first understand the concepts of prime and non-prime attributes.
Prime attribute: An attribute that is part of a candidate key is known as a prime attribute.
Non-prime attribute: An attribute that is not part of any candidate key is said to be a non-prime attribute.
As per the rule of Second Normal Form, every non-prime attribute must be fully functionally dependent on the whole key, not on just a part of it.
In the example, the key attributes are STUD_ID and PROJ_ID. The non-key attributes STUD_NAME and PROJ_NAME each depend on only one of the key attributes.
That is, STUD_NAME depends on STUD_ID and PROJ_NAME depends on PROJ_ID. This is called partial dependency.
As per the rule of Second Normal Form, the non-key attributes should depend on all the key attributes, which is not the case here. To convert the data into Second Normal Form, we have to split the table into two different tables as follows.
In the table STUDENT, the non-prime attribute STUD_NAME depends completely on the prime attribute STUD_ID.
In the table PROJECT, the non-prime attribute PROJ_NAME depends completely on the prime attribute PROJ_ID.
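The split can be sketched on sample data. The rows below are hypothetical, and the sketch additionally keeps an ASSIGNMENT relation holding the key pair (STUD_ID, PROJ_ID); that third relation is an assumption added here so that the information about who works on which project is not lost.

```python
# Hypothetical rows of the original table, keyed by (STUD_ID, PROJ_ID).
student_project = [
    {"STUD_ID": 1, "PROJ_ID": "P1", "STUD_NAME": "Asha", "PROJ_NAME": "Library"},
    {"STUD_ID": 1, "PROJ_ID": "P2", "STUD_NAME": "Asha", "PROJ_NAME": "Payroll"},
    {"STUD_ID": 2, "PROJ_ID": "P1", "STUD_NAME": "Ravi", "PROJ_NAME": "Library"},
]

def projection(rows, columns):
    """Keep the given columns and remove duplicate rows."""
    return sorted({tuple(row[c] for c in columns) for row in rows})

# STUD_NAME depends only on STUD_ID, PROJ_NAME only on PROJ_ID.
student = projection(student_project, ["STUD_ID", "STUD_NAME"])
project = projection(student_project, ["PROJ_ID", "PROJ_NAME"])
assignment = projection(student_project, ["STUD_ID", "PROJ_ID"])

print(student)     # each student name is now stored once
print(project)     # each project name is now stored once
print(assignment)
```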
3NF:-
A database design is said to be in 3NF if both of the following conditions are satisfied:
1. The design is already in 2NF.
2. No non-prime attribute is transitively dependent on the key.
Equivalently, for any non-trivial functional dependency X → A, either X is a superkey or A is a prime attribute.
In Table 2.9.1, STUD_ID is the only key attribute. The city can be retrieved through either STUD_ID or ZIP. But ZIP is not a superkey, nor is CITY a prime attribute. CITY depends on ZIP, and ZIP depends on STUD_ID: STUD_ID → ZIP → CITY. This relationship is known as transitive dependency.
We have to remove this transitive dependency to reach Third Normal Form. For this purpose we split the STUDENT_DETAILS table into two different tables, say STUDENT_DATA and CITY.
Now the database design is in Third Normal Form: it is already in Second Normal Form, and no non-prime attribute is transitively dependent on the key.
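The split into STUDENT_DATA and CITY can be sketched on sample data. The rows below are hypothetical and stand in for the table referred to above; the sketch also checks that the original rows can be rebuilt from the two smaller tables.

```python
# Hypothetical rows of STUDENT_DETAILS.
student_details = [
    {"STUD_ID": 1, "STUD_NAME": "Asha", "ZIP": "411001", "CITY": "Pune"},
    {"STUD_ID": 2, "STUD_NAME": "Ravi", "ZIP": "411001", "CITY": "Pune"},
    {"STUD_ID": 3, "STUD_NAME": "Meera", "ZIP": "400001", "CITY": "Mumbai"},
]

def projection(rows, columns):
    """Keep the given columns and remove duplicate rows."""
    unique = {tuple(row[c] for c in columns) for row in rows}
    return [dict(zip(columns, values)) for values in sorted(unique)]

# Remove the transitive dependency STUD_ID -> ZIP -> CITY.
student_data = projection(student_details, ["STUD_ID", "STUD_NAME", "ZIP"])
city = projection(student_details, ["ZIP", "CITY"])

# Joining on ZIP gives back the original rows, so nothing is lost.
city_by_zip = {row["ZIP"]: row["CITY"] for row in city}
rebuilt = [{**row, "CITY": city_by_zip[row["ZIP"]]} for row in student_data]

print(city)     # each ZIP/CITY pair is stored only once
print(rebuilt)
```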
Q. Describe the concept of transitive dependency.
A functional dependency is said to be transitive if it is formed indirectly from two functional dependencies.
For example, in a relation R(A, B, C), suppose the set of functional dependencies F includes A → B and B → C, both of which hold in the relation, and B → A does not hold.
Then we can say that A → C also holds in the relation.
This type of dependency is called a transitive dependency.
Example
If the Student Courses relation holds the following FDs:
{Course_Id} → {Student_Name}
{Student_Name} → {Student_Age}
then we can say that the relation also holds the following dependency:
{Course_Id} → {Student_Age}
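Transitive dependencies can be derived mechanically by computing the closure of an attribute set, that is, everything it determines directly or indirectly. The function closure below is a minimal sketch of this standard technique, applied to the two FDs of the example.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under the given FDs.

    `fds` is a list of (left_side, right_side) pairs of attribute sets.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the left side, we also know the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [
    ({"Course_Id"}, {"Student_Name"}),
    ({"Student_Name"}, {"Student_Age"}),
]

course_closure = closure({"Course_Id"}, fds)
print(course_closure)
```

Student_Age appears in the closure of Course_Id although no FD lists it directly, which is exactly the transitive dependency Course_Id → Student_Age.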
Q. Define BCNF
A database design is said to be in BCNF (Boyce-Codd Normal Form) if both of the following conditions are satisfied:
1. The table is already in 3NF.
2. For any non-trivial functional dependency X → A, X must be a super key.
In the above database design, Stud_id is the super key of the relation Student_Data and Zip is the super key of the relation City:
Stud_id → Stud_name, Zip
Zip → City
This confirms that both relations are in BCNF.
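The BCNF condition can be tested by checking, for each functional dependency, whether its left side determines every attribute of the relation. The sketch below is illustrative only: closure and is_bcnf are hypothetical helper names, and the FDs passed for each relation are assumed to be the ones that apply to that relation.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under the given FDs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_bcnf(relation, fds):
    """True if the left side of every non-trivial FD is a super key of `relation`."""
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        if not relation <= closure(lhs, fds):
            return False  # left side does not determine the whole relation
    return True

# Before the split: Zip -> City holds, but Zip is not a super key.
student_details = {"Stud_id", "Stud_name", "Zip", "City"}
fds_details = [({"Stud_id"}, {"Stud_name", "Zip"}), ({"Zip"}, {"City"})]

# After the split into Student_Data and City.
student_data = {"Stud_id", "Stud_name", "Zip"}
fds_student = [({"Stud_id"}, {"Stud_name", "Zip"})]
city = {"Zip", "City"}
fds_city = [({"Zip"}, {"City"})]

print(is_bcnf(student_details, fds_details))
print(is_bcnf(student_data, fds_student), is_bcnf(city, fds_city))
```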
Q. Describe decomposition. What are the desirable properties of decomposition?
One solution for eliminating redundancy, as well as update and deletion anomalies, in a database is to break a relation into two or more relations. This process is called decomposition.
Let R be a relation schema.
A set of relation schemas {R1, R2, …, Rn} is a decomposition of R if R = R1 ∪ R2 ∪ … ∪ Rn.
For example,
let us see how to break a relation into more than one relation.
Consider CLASSINFO relation having COURSE COURSE_NAME, STUD_NAME, SUB_ID,
SUB_NAME.
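Here two additional rows are displayed, which shows that a badly designed decomposition may lead to information loss. To ensure that the database has a good design after decomposition, the decomposition must satisfy certain criteria, known as the desirable properties of decomposition. The two main properties are lossless join (joining the decomposed relations gives back exactly the original relation) and dependency preservation (every functional dependency can still be checked on the decomposed relations).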
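The effect of a bad decomposition can be reproduced with a natural join. The sketch below is only illustrative: the two rows and the chosen splits are hypothetical and use a reduced set of the CLASSINFO attributes, and projection and natural_join are helper functions written for this example.

```python
def projection(rows, columns):
    """Keep the given columns and remove duplicate rows."""
    unique = {tuple(row[c] for c in columns) for row in rows}
    return [dict(zip(columns, values)) for values in sorted(unique)]

def natural_join(left, right):
    """Combine rows that agree on all attributes the two relations share."""
    joined = []
    for l_row in left:
        for r_row in right:
            common = set(l_row) & set(r_row)
            if all(l_row[c] == r_row[c] for c in common):
                joined.append({**l_row, **r_row})
    return joined

# Hypothetical sample rows.
class_info = [
    {"STUD_NAME": "Asha", "COURSE_NAME": "BCA", "SUB_NAME": "DBMS"},
    {"STUD_NAME": "Ravi", "COURSE_NAME": "BCA", "SUB_NAME": "Networks"},
]

# Bad split: the shared attribute COURSE_NAME does not identify a row.
bad_1 = projection(class_info, ["STUD_NAME", "COURSE_NAME"])
bad_2 = projection(class_info, ["COURSE_NAME", "SUB_NAME"])
lossy = natural_join(bad_1, bad_2)
spurious = [row for row in lossy if row not in class_info]

# Better split: the shared attribute STUD_NAME identifies a row in this sample.
good_1 = projection(class_info, ["STUD_NAME", "COURSE_NAME"])
good_2 = projection(class_info, ["STUD_NAME", "SUB_NAME"])
lossless = natural_join(good_1, good_2)

print(len(lossy), len(spurious), len(lossless))
```

The bad split produces two extra rows after the join, while the better split gives back exactly the original rows.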
Q. Explain the need for a foreign key.
The foreign key constraint is also known as the referential integrity constraint. In this constraint, one field is common to two tables.
A foreign key represents a relationship between tables. There is a parent-child relationship between two tables that have a common column.
The master table is referred to as the parent, while the transaction table is considered the child. The common field works as the primary key in the parent table and as a foreign key in the child table.
Example: Consider a Training Institute database having two tables, Course_details and Student. There is a condition that students may register only for courses that are currently available in the institute, and not for courses that are not offered at the moment. To enforce this rule while inserting values into the database, a foreign key constraint is used, as follows:
In both tables, the field Course_Code is common. In Course_details, Course_Code is the primary key, and in the Student table it is a foreign key.
So, after the foreign key constraint is assigned to the Student table, a record for a new student will not be accepted if its Course_Code is not available in the master table Course_details.
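The behaviour described above can be tried with Python's sqlite3 module. This is a minimal sketch: the table and column names Course_details, Student and Course_Code come from the example, while the other columns and all the values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Parent (master) table: Course_Code is the primary key.
conn.execute(
    "CREATE TABLE Course_details (Course_Code TEXT PRIMARY KEY, Course_Name TEXT)"
)
# Child table: Course_Code is a foreign key that refers to the parent.
conn.execute(
    """
    CREATE TABLE Student (
        Student_Id INTEGER PRIMARY KEY,
        Student_Name TEXT,
        Course_Code TEXT REFERENCES Course_details(Course_Code)
    )
    """
)

conn.execute("INSERT INTO Course_details VALUES ('C101', 'Oracle')")
conn.execute("INSERT INTO Student VALUES (1, 'Jay', 'C101')")  # course exists: accepted

try:
    # 'C999' is not offered, so the insert violates the foreign key constraint.
    conn.execute("INSERT INTO Student VALUES (2, 'Kunal', 'C999')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(rejected, student_count)
```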
Q. Write a short note on multi-valued dependency.
A multi-valued dependency exists when a relation R has at least three attributes (say A, B and C) and, for each value of A, there is a well-defined set of values of B and a well-defined set of values of C, where the set of values of B is independent of the set of values of C and vice versa.
For Example:
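The definition can be checked on sample data: for every value of A, the (B, C) pairs that occur must be all combinations of its B values with its C values. The function holds_mvd below is a hypothetical helper, and the rows reuse the subject, lecturer and book names from these notes.

```python
from collections import defaultdict

def holds_mvd(rows):
    """Check A ->> B (and A ->> C) for rows given as (a, b, c) tuples."""
    pairs_by_a = defaultdict(set)
    for a, b, c in rows:
        pairs_by_a[a].add((b, c))
    for pairs in pairs_by_a.values():
        b_values = {b for b, _ in pairs}
        c_values = {c for _, c in pairs}
        # B and C are independent only if every combination is present.
        if pairs != {(b, c) for b in b_values for c in c_values}:
            return False
    return True

subject_info = [
    ("English", "Prof. Jitesh", "English Book 1"),
    ("English", "Prof. Jitesh", "English Book 2"),
    ("English", "Prof. Meetali", "English Book 1"),
    ("English", "Prof. Meetali", "English Book 2"),
]

print(holds_mvd(subject_info))      # all combinations present
print(holds_mvd(subject_info[:3]))  # one combination missing
```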
Q. Explain why 4NF is more desirable than BCNF. Rewrite the definitions of 4NF and BCNF using the notions of domain constraint and general constraint.
For a design to be in the Fourth Normal Form, it must first be in BCNF.
It must also not contain any multi-valued dependencies. Consider Table 2.11.1. It contains data about Subject, Lecturer and Book_Name referred to by the lecturers.
This design satisfies the rules of 3NF, but the two attributes Lecturer and Book_Name are independent of each other.
The subject English can be taught by either Prof. Jitesh or Prof. Meetali. Students also have two choices of book for English: they can refer to either English Book 1 or English Book 2.
Hence the relationships are SUBJECT →→ LECTURER and SUBJECT →→ BOOK_NAME.
These relationships show the multi-valued dependency on the attribute SUBJECT.
• If we select both the lecturers and the books recommended for a subject, the result shows every combination of lecturer and book, which implies that a particular lecturer recommends a particular book. This is not correct.
• To remove this dependency the table should be split into two tables as below:
Now, in this design, if we want to retrieve the names of the lecturers or the books recommended, two independent queries can be executed. This eliminates the multi-valued dependency on the attribute "Subject". Hence the design is in 4NF.
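The split can be sketched on the English example. The rows below are an assumed reconstruction of the table referred to above, and the names subject_lecturer and subject_book are hypothetical.

```python
# Assumed rows: every lecturer is paired with every book of the subject.
subject_info = [
    ("English", "Prof. Jitesh", "English Book 1"),
    ("English", "Prof. Jitesh", "English Book 2"),
    ("English", "Prof. Meetali", "English Book 1"),
    ("English", "Prof. Meetali", "English Book 2"),
]

# Split into two independent relations, one per multi-valued fact.
subject_lecturer = sorted({(s, lecturer) for s, lecturer, _ in subject_info})
subject_book = sorted({(s, book) for s, _, book in subject_info})

# Joining on the subject rebuilds the original rows, so no information is lost.
rejoined = sorted(
    (s, lecturer, book)
    for s, lecturer in subject_lecturer
    for s2, book in subject_book
    if s == s2
)

print(subject_lecturer)
print(subject_book)
```

Four rows shrink to two rows in each table, and adding a new book for English now needs one new row instead of one row per lecturer.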